213 Commits

Author SHA1 Message Date
Sybren A. Stüvel
530520b1c7 Implement mass updating of tasks when JobUpdate.refresh_tasks = true
Send & handle `JobUpdate.refresh_tasks = true` when many tasks are
updated simultaneously. This applies to things like cancelling &
requeueing an entire job.

This partially rolls back 67bf77de13d99b1bc5d7344951068822c4fadd88, as
it was too slow when 1000+ tasks were being updated all at once.
2022-05-17 14:48:50 +02:00
Sybren A. Stüvel
0b39f229a1 Implement may-I-keep-running protocol
Worker and Manager implementation of the "may-I-keep-running" protocol.

While running tasks, the Worker will ask the Manager periodically
whether it's still allowed to keep running that task. This allows the
Manager to abort commands on Workers when:

- the Worker should go to another state (typically 'asleep' or
  'shutdown'),
- the task changed status from 'active' to something non-runnable
  (typically 'canceled' when the job as a whole is canceled).
- the task has been assigned to a different Worker. This can happen when
  a Worker loses its connection to its Manager, resulting in a task
  timeout (not yet implemented) after which the task can be assigned to
  another Worker. If connectivity is then restored, the first Worker
  should abort (last-assigned Worker wins).
2022-05-12 15:06:05 +02:00
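The three abort conditions listed in 0b39f229a1 can be sketched as a single Manager-side check. This is only an illustration under stated assumptions: the `WorkerInfo` and `TaskInfo` types, their fields, and the `mayKeepRunning` function are hypothetical names and are not taken from the Flamenco code base.

```go
package main

import "fmt"

// WorkerInfo is a hypothetical, simplified view of a Worker.
type WorkerInfo struct {
	ID              string
	RequestedStatus string // e.g. "asleep" or "shutdown"; empty when no change was requested
}

// TaskInfo is a hypothetical, simplified view of a task.
type TaskInfo struct {
	Status     string // e.g. "active", "canceled", "queued"
	AssignedTo string // ID of the Worker the task was last assigned to
}

// mayKeepRunning reports whether the Worker may continue running the task.
// When it may not, the second return value gives the reason.
func mayKeepRunning(w WorkerInfo, t TaskInfo) (bool, string) {
	// The Worker should go to another state.
	if w.RequestedStatus != "" {
		return false, "worker should go to status " + w.RequestedStatus
	}
	// The task changed from 'active' to something non-runnable.
	if t.Status != "active" {
		return false, "task has non-runnable status " + t.Status
	}
	// The task was assigned to a different Worker; last-assigned Worker wins.
	if t.AssignedTo != w.ID {
		return false, "task was assigned to another worker"
	}
	return true, ""
}

func main() {
	w := WorkerInfo{ID: "worker-1"}
	ok, reason := mayKeepRunning(w, TaskInfo{Status: "canceled", AssignedTo: "worker-1"})
	fmt.Println(ok, reason)
}
```

In this sketch the Worker would call such a check periodically while a task runs, and abort the running command as soon as the answer is negative.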
Sybren A. Stüvel
d35ca9d98f Manager: limit database connections
Limit the database connection pool to only a single connection. I hope that
this will solve the intermittent `SQLITE_BUSY` errors I've been seeing.
2022-05-12 13:58:15 +02:00
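With Go's standard `database/sql` package, limiting the pool as described in d35ca9d98f comes down to one call, `SetMaxOpenConns(1)`. The sketch below is not the Manager's actual code: to stay runnable without an external SQLite driver it uses a hypothetical `stubConnector` that never opens a real connection, and the helper names are made up for illustration.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// stubConnector is a hypothetical stand-in for a real SQLite driver, so that
// this sketch has no external dependencies. It never opens a connection.
type stubConnector struct{}

func (stubConnector) Connect(context.Context) (driver.Conn, error) {
	return nil, errors.New("stub connector: no real database behind it")
}

func (stubConnector) Driver() driver.Driver { return nil }

// newStubDB returns a connection pool backed by the stub connector.
func newStubDB() *sql.DB { return sql.OpenDB(stubConnector{}) }

// limitToSingleConnection restricts the pool to one open connection, so that
// all queries are serialised through it.
func limitToSingleConnection(db *sql.DB) {
	db.SetMaxOpenConns(1)
}

func main() {
	db := newStubDB()
	defer db.Close()

	limitToSingleConnection(db)
	fmt.Println("max open connections:", db.Stats().MaxOpenConnections)
}
```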
Sybren A. Stüvel
3d606a3fa0 Manager: task scheduler, fix handling of worker assignment of tasks
Improve how the task scheduler deals with tasks that already have a
worker assigned to them:

- When a Worker asks for a task, and there is already an active task
  assigned to it, always return that task.
- Otherwise, never allow scheduling of active tasks, as those are
  already being run by another worker. If this is not the case, their
  status should change to queued/failed, instead of handling the
  situation in the task scheduler.
- Apart from the assigned-and-active case above, ignore task's worker ID
  when scheduling tasks. If the status is 'queued' or 'soft-failed', the
  task's worker ID just indicates who ran the task last.
2022-05-12 13:52:16 +02:00
Sybren A. Stüvel
d3e2638f84 Cleanup: rename uri to dsn
"DSN" (Data Source Name) is used to indicate which database to open, and
was intermixed with "URI". This is now consistent.

No functional changes.
2022-05-12 11:08:54 +02:00
Sybren A. Stüvel
d673da7a0c Manager: check for stuck jobs at startup
Check for jobs in 'cancel-requested' or 'requeued' statuses, and ensure
they transition to the right status. This happens at startup, before
even starting the web interface, so that a consistent state is presented.
2022-05-06 16:07:27 +02:00
Sybren A. Stüvel
d008991bf4 Revert "Manager: broadcast job/task updates in a separate goroutine"
This reverts commit cd28ef552e2476dda68ba671436b805d7b32a655, as it
breaks the unit tests and I don't want to spend the time to fix those.
2022-05-06 14:48:16 +02:00
Sybren A. Stüvel
98da20f1a9 Manager: vacuum the database at startup
2022-05-06 14:35:34 +02:00
Sybren A. Stüvel
cd28ef552e Manager: broadcast job/task updates in a separate goroutine
2022-05-06 12:27:10 +02:00
Sybren A. Stüvel
ba34652cd1 Implement task status changes from web interface
This also reworks some of the logic due to the recently-removed
`cancel-requested` task status.
2022-05-05 16:44:09 +02:00
Sybren A. Stüvel
23680c27bf OAPI: regenerate code
2022-05-05 16:36:38 +02:00
Sybren A. Stüvel
67bf77de13 Manager: rework mass updates to task statuses
When the job status changes, it impacts the task statuses as well. These
status changes are now no longer done with a single database query, but
instead each affected task is fetched, changed, and saved. This unifies
the regular & mass updates to the tasks, and causes the resulting task
changes to be broadcast to SocketIO clients.
2022-05-03 16:13:44 +02:00
Sybren A. Stüvel
b3e1d1c6de Cleanup: manager, typo fix
2022-05-03 13:05:30 +02:00
Sybren A. Stüvel
18891dda91 Manager: implement FetchTask OAPI endpoint
2022-05-03 13:04:28 +02:00
Sybren A. Stüvel
891e791853 Manager: reduce log level of socketIO subscription changes
2022-05-03 12:04:27 +02:00
Sybren A. Stüvel
50c8cd39f2 Task update notifications via SocketIO
Manager now sends out task updates via SocketIO, and the web interface
handles those.

Note that there is a `BroadcastTaskUpdate()` function, but not a
`BroadcastNewTask`. The 'new job' broadcast is sent after the job's
tasks have been created, and thus there is no need for a separate
broadcast per task.
2022-05-03 11:26:24 +02:00
Sybren A. Stüvel
bb68488c5e Cleanup: Manager, add bit of documentation
2022-05-03 10:39:44 +02:00
Sybren A. Stüvel
9b330280b7 Add SocketIO subscription system for job-related updates
SocketIO clients can now send a message with `/subscription` event type
in order to subscribe to or unsubscribe from job-related updates.

These job-related updates themselves aren't sent yet, so this is a change
that's impossible to really test. The socketIO code for joining/leaving
rooms is called, though.
2022-05-02 18:36:14 +02:00
Sybren A. Stüvel
8d69bfe069 Cleanup: Manager, reorganise the socketio code a bit
2022-04-29 16:58:48 +02:00
Sybren A. Stüvel
629c073ed7 Manager: fix query for job tasks
2022-04-29 12:26:53 +02:00
Sybren A. Stüvel
992fc38604 OAPI: add endpoint for fetching the tasks of a job
Add `fetchJobTasks` operation to the Jobs API. This returns a summary of
each of the job's tasks, suitable for display in a task list view.

The actually used fields may need tweaking once we actually have a task
list view, but at least the functionality is there.
2022-04-22 12:52:57 +02:00
Sybren A. Stüvel
e399b14e66 Manager: cleanup, rename jobId to jobID
No functional changes.
2022-04-22 12:16:11 +02:00
Sybren A. Stüvel
0cd478a409 Manager: move FetchJob function into jobs_query.go
I want to put more of the "get stuff" code into `jobs_query.go`, keeping
`jobs.go` for creation & manipulation.
2022-04-22 11:51:02 +02:00
Sybren A. Stüvel
6bdc198301 Manager: more graceful errors when receiving task update of unknown task
Return a 404 Not Found when the task can't be found, and a 500 on other
errors.
2022-04-21 19:06:18 +02:00
Sybren A. Stüvel
90be370095 Manager: reduce password strength of Workers
The password check of worker API calls was 2 orders of magnitude slower
than actually handling the API call itself. Since the Worker authentication
is not that important (it's all on the same network anyway, and Worker
account registration is automatic too), lowering the BCrypt cost to the
minimum helps.

On my machine, this reduces the time for password checks from 50 to 2 ms.
2022-04-21 19:06:18 +02:00
Sybren A. Stüvel
65427ee38e Manager: use e.NoContent(http.StatusNoContent) to return "no content"
No functional changes, just the right call for the job.
2022-04-21 19:06:18 +02:00
Sybren A. Stüvel
79bac3a5f3 Manager: fix race condition in logging of worker properties
Dereferencing the `w *persistence.Worker` pointer should happen directly
in the function call, not in the zerolog callback function.
2022-04-21 19:06:18 +02:00
Sybren A. Stüvel
5466f65225 OAPI: add setJobStatus operation
Add API endpoint for updating the job status.
2022-04-21 19:06:18 +02:00
Sybren A. Stüvel
b699647ed4 OpenAPI: add activity field to Job schema
2022-04-21 12:40:25 +02:00
Sybren A. Stüvel
d79fde17f3 Manager: keep track of the reason of job status changes
To prepare for job status changes being requestable from the API, store
the reason for any status change on the job itself.

Not yet part of the API, just on the persistence layer.
2022-04-21 12:32:07 +02:00
Sybren A. Stüvel
954af37fd5 Manager: rename assertXXXResponse to assertResponseXXX
Rename test functions like `assertJSONResponse` to `assertResponseJSON`,
so that they get ordered together by autocompletion.

No functional changes.
2022-04-21 12:01:46 +02:00
Sybren A. Stüvel
c3b694ab2a Manager: wrap job/task errors in persistence layer
Avoid requiring users of the persistence layer to test against Gorm errors,
by wrapping job/task errors in a new `PersistenceError` struct.

Instead of testing for `gorm.ErrRecordNotFound`, code can now test for
`persistence.ErrJobNotFound` or `persistence.ErrTaskNotFound`.
2022-04-21 11:54:59 +02:00
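The wrapping approach from c3b694ab2a can be sketched with the standard `errors` package. This is an assumption-laden illustration: `errRecordNotFound` stands in for `gorm.ErrRecordNotFound` so that no external module is needed, and the fields of `PersistenceError` as well as the helpers `wrapJobError` and `isJobNotFound` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// errRecordNotFound stands in for gorm.ErrRecordNotFound in this sketch.
var errRecordNotFound = errors.New("record not found")

// PersistenceError wraps a lower-level database error. The field names are
// hypothetical.
type PersistenceError struct {
	Message string
	Err     error
}

func (e PersistenceError) Error() string { return fmt.Sprintf("%s: %v", e.Message, e.Err) }
func (e PersistenceError) Unwrap() error { return e.Err }

// ErrJobNotFound is what callers test against instead of the Gorm error.
var ErrJobNotFound = PersistenceError{Message: "job not found", Err: errRecordNotFound}

// wrapJobError translates database-level errors into persistence-level ones.
func wrapJobError(err error) error {
	if errors.Is(err, errRecordNotFound) {
		return ErrJobNotFound
	}
	return err
}

// isJobNotFound shows the check a caller of the persistence layer would do.
func isJobNotFound(err error) bool {
	return errors.Is(err, ErrJobNotFound)
}

func main() {
	err := wrapJobError(fmt.Errorf("fetching job: %w", errRecordNotFound))
	fmt.Println(isJobNotFound(err), err)
}
```

Callers then depend only on the persistence package, so the database library could be swapped without touching them.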
Sybren A. Stüvel
d099a31531 OAPI: add endpoint for getting a single job type
This will be used by the web frontend to determine which job settings
to show and which to hide.
2022-04-15 16:21:48 +02:00
Sybren A. Stüvel
d30befa2d7 Manager: add assert function for testing JSON responses
This makes it much easier to test an API response actually matches the
expected JSON values.
2022-04-15 16:14:17 +02:00
Sybren A. Stüvel
de3c4af8cb Manager: fix broken unit test
2022-04-15 14:37:41 +02:00
Sybren A. Stüvel
13e3607571 Manager: reduce logging of config loading
The logging was very verbose, and even though it was only at trace level,
a failing unit test would show them all.
2022-04-15 14:34:30 +02:00
Sybren A. Stüvel
e672289a11 OAPI: include all info for the jobs table in the JobUpdate schema
By having all info for the jobs table in the `JobUpdate` schema, it won't
have to query for the full job when a new job is added. This fetching of
the full job is now delayed until someone clicks on the table row.
2022-04-14 09:41:04 +02:00
Sybren A. Stüvel
555c935790 Web: Replace Vue 2 with Vue 3 webapp
Replace the Vue v2 webapp with a Vue v3 one, and embed the OpenAPI
client in the webapp itself (instead of being its own npm project).

- Vue v2.x -> v3.x
- Tabulator v4.x -> v5.1
- Moment JS -> replaced with Luxon JS
- Vue CLI/UI -> replaced with Vite
2022-04-12 12:34:49 +02:00
Sybren A. Stüvel
48417f7f14 Manager: Fix unittests after recent changes to the job compiler script
d98dbaa3 introduced a change to the job compiler, which wasn't taken into
account in the unit test.
2022-04-11 14:08:40 +02:00
Sybren A. Stüvel
d98dbaa333 Worker: implement ffmpeg for frame-to-video conversion on Windows
2022-04-09 16:20:29 +02:00
Sybren A. Stüvel
7a19e2f38d Manager: use absolute storage path
This helps to get things consistent on Windows and Linux. Otherwise a path
like `/some/path` is absolute on one platform but not on the other. This is
mostly for getting the unit tests in this package to work on Windows, but
using absolute paths also helps in clarity of error logging.
2022-04-09 12:03:11 +02:00
Sybren A. Stüvel
1960b668aa Cleanup: remove unused code
2022-04-08 14:47:07 +02:00
Sybren A. Stüvel
2e2205c00e Manager: return error from sendAPIError()
Small bugfix.
2022-04-08 14:46:36 +02:00
Sybren A. Stüvel
06bf3c0482 Cleanup: manager, fix copy-paste from original OpenAPI example code
2022-04-08 12:04:58 +02:00
Sybren A. Stüvel
930d7497d7 OAPI: Better 'SQLITE_BUSY' error handling
SQLite can return `SQLITE_BUSY` errors when it's doing too many things at
the same time. This is now improved a bit by setting a 5-second timeout,
during which the SQLite driver will wait for the database to become
available. If that doesn't happen, Flamenco Manager will return a
`503 Service Unavailable` response so that the client knows to back off
a little.
2022-04-08 12:02:30 +02:00
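The response side of 930d7497d7 can be sketched with the standard `net/http` package. Everything named here is an assumption: the Manager may use a different web framework, `errDatabaseBusy` is a hypothetical sentinel for a `SQLITE_BUSY` failure, and the 5-second wait inside the SQLite driver is not modelled.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// errDatabaseBusy is a hypothetical sentinel for a SQLITE_BUSY failure that
// persisted after the driver's timeout.
var errDatabaseBusy = errors.New("database is busy")

// writeDBError turns a database error into an HTTP error response. A busy
// database yields 503 so that the client knows to back off.
func writeDBError(w http.ResponseWriter, err error) {
	if errors.Is(err, errDatabaseBusy) {
		http.Error(w, "database too busy, please retry later", http.StatusServiceUnavailable)
		return
	}
	http.Error(w, "internal server error", http.StatusInternalServerError)
}

// responseCodeFor records which status code writeDBError sends for err.
func responseCodeFor(err error) int {
	rec := httptest.NewRecorder()
	writeDBError(rec, err)
	return rec.Code
}

func main() {
	wrapped := fmt.Errorf("saving task: %w", errDatabaseBusy)
	fmt.Println(responseCodeFor(wrapped))
	fmt.Println(responseCodeFor(errors.New("disk full")))
}
```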
Sybren A. Stüvel
89e130c04f Manager: update tests for inclusion of job name in job updates
2022-04-08 11:59:30 +02:00
Sybren A. Stüvel
df3f7b44b9 Hook up web interface to job updates
2022-04-07 18:46:09 +02:00
Sybren A. Stüvel
a476f39365 Manager: improve SocketIO event handling & logging
2022-04-05 16:34:32 +02:00
Sybren A. Stüvel
a715b3bfbe Manager: connect SocketIO broadcaster with job creation
2022-04-05 16:19:33 +02:00
Sybren A. Stüvel
0c0df41f5d Job status change system for SocketIO broadcasts
Not fully tested yet.
2022-04-05 15:52:55 +02:00