SocketIO clients no longer automatically subscribe to the jobs updates.
This is now done explicitly via the `allJobs` subscription type, and
unsubscribing is also possible.
A task can exist in the database but not have any log stored on disk yet.
This is now returned as `204 No Content` instead of an internal server
error.
The web interface is also adjusted to cope with this.
It returns 2048 bytes at most. It'll likely be less than that, as it will
ignore the first bytes until the very first newline (to avoid returning
cut-off lines). If the log file itself is 2048 bytes or smaller, return the
entire file.
Add a small wrapper around github.com/google/uuid. That way it's clearer
which functionality is used by Flamenco, doesn't link most of the code to
any specific UUID library, and allows a bit of customisation.
The only customisation now is that Flamenco is a bit stricter in the
formats it accepts; only the `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx` is
accepted. This makes things a little bit stricter, with the advantage
that we don't need to do any normalisation of received UUID strings.
Implement task log broadcasting via SocketIO. The logs aren't shown in the
web interface yet, but do arrive there in a Pinia store. That store is
capped at 1000 lines to keep memory requirements low-ish.
Rename `pkg/api/flamenco-manager.yaml` to `flamenco-openapi.yaml`, to
distinguish the OpenAPI definition file from the Flamenco Manager
configuration file of the same name (but in a different directory).
No functional changes.
This also changes the order in which the task is updated; the activity is
now saved first, so that it can be included in the task status change
notification sent to SocketIO clients.
Send & handle `JobUpdate.refresh_tasks = true` when many tasks are
updated simultaneously. This applies to things like cancelling &
requeueing an entire job.
This partially rolls back 67bf77de13d99b1bc5d7344951068822c4fadd88, as
it was too slow when 1000+ tasks were being updated all at once.
Worker and Manager implementation of the "may-I-kee-running" protocol.
While running tasks, the Worker will ask the Manager periodically
whether it's still allowed to keep running that task. This allows the
Manager to abort commands on Workers when:
- the Worker should go to another state (typically 'asleep' or
'shutdown'),
- the task changed status from 'active' to something non-runnable
(typically 'canceled' when the job as a whole is canceled).
- the task has been assigned to a different Worker. This can happen when
a Worker loses its connection to its Manager, resulting in a task
timeout (not yet implemented) after which the task can be assigned to
another Worker. If then the connectivity is restored, the first Worker
should abort (last-assigned Worker wins).
Improve how the task scheduler deals with tasks that already have a
worker assigned to them:
- When a Worker asks for a task, and there is already an active task
assigned to it, always return that task.
- Otherwise, never allow scheduling of active tasks, as those are
already being run by another worker. If this is not the case, their
status should change to queued/failed, instead of handling the
situation in the task scheduler.
- Apart from the assigned-and-active case above, ignore task's worker ID
when scheduling tasks. If the status is 'queued' or 'soft-failed', the
task's worker ID just indicates who ran the task last.
Check for jobs in 'cancel-requested' or 'requeued' statuses, and ensure
they transition to the right status. This happens at startup, before
even starting the web interface, so that a consistent state is presented.
When the job status changes, it impacts the task statuses as well. These
status changes are now no longer done with a single database query, but
instead each affected task is fetched, changed, and saved. This unifies
the regular & mass updates to the tasks, and causes the resulting task
changes to be broadcast to SocketIO clients.
Manager now sends out task updates via SocketIO, and the web interface
handles those.
Note that there is a `BroadcastTaskUpdate()` function, but not a
`BroadcastNewTask`. The 'new job' broadcast is sent after the job's
tasks have been created, and thus there is no need for a separate
broadcast per task.
SocketIO clients can now send a message with `/subscription` event type
in order to subscribe to or unsubscribe from job-related updates.
These job-related updates themselves aren't sent yet, so this is a change
that's impossible to really test. The socketIO code for joining/leaving
rooms is called, though.
Add `fetchJobTasks` operation to the Jobs API. This returns a summary of
each of the job's tasks, suitable for display in a task list view.
The actually used fields may need tweaking once we actually have a task
list view, but at least the functionality is there.