Worker and Manager implementation of the "may-I-kee-running" protocol.
While running tasks, the Worker will ask the Manager periodically
whether it's still allowed to keep running that task. This allows the
Manager to abort commands on Workers when:
- the Worker should go to another state (typically 'asleep' or
'shutdown'),
- the task changed status from 'active' to something non-runnable
(typically 'canceled' when the job as a whole is canceled).
- the task has been assigned to a different Worker. This can happen when
a Worker loses its connection to its Manager, resulting in a task
timeout (not yet implemented) after which the task can be assigned to
another Worker. If then the connectivity is restored, the first Worker
should abort (last-assigned Worker wins).
Instead of logging "sleep aborted", the message is now "sleep command
aborted", to make it clear that it's about the sleep command, and not the
"asleep" worker state.
It looks rather ugly, and this should be addressed, but without the
`height` option of Tabulator it won't be using the Virtual DOM and result
in very slow browser performance.
Improve how the task scheduler deals with tasks that already have a
worker assigned to them:
- When a Worker asks for a task, and there is already an active task
assigned to it, always return that task.
- Otherwise, never allow scheduling of active tasks, as those are
already being run by another worker. If this is not the case, their
status should change to queued/failed, instead of handling the
situation in the task scheduler.
- Apart from the assigned-and-active case above, ignore task's worker ID
when scheduling tasks. If the status is 'queued' or 'soft-failed', the
task's worker ID just indicates who ran the task last.
Pinia's `$patch()` function will merge the given state with the current
state, instead of doing a replacement. As a result, going from an active
job with metadata fields `A` and `B`, to a job with metadata fields `B`
and `C` would actually have fields `A`, `B`, and `C` in the Pinia store.
This is resolved by replacing the `$patch(object)` with `$patch(function)`
and having that function replace the entire job.
I didn't like how "queued", "paused", and "cancelled" had the same colour,
as they represent rather different statuses (may run in future, vs. won't
run at all).
The selection mechanism of Tabulator was getting in the way of having nice
navigation, as it would deselect (i.e. nav to "/") before selecting the
next job (i.e. nav to "/jobs/{job-id}").
The active job is now determined by the URL and thus handled by Vue Router.
Clicking on a job simply navigates to its URL, which causes the reactive
system to load & display it.
It is still intended to get job selection for "mass actions", but that's
only possible after normal navigation is working well.
Most of the code moved from `App.vue` to `views/JobsView.vue`.
Notification bar has its own component, and there are placeholder
"views" for Workers and Settings pages.
There is still some clunky handling of updates via SocketIO, as those
are a mix of job-specific and global (like SocketIO reconnection
events). The advantage of the current approach is that SocketIO
connections are closed when you leave the Jobs page, and reopened when
you enter the Workers page. My gut feeling says this is nice because it
ensures that all SocketIO connection-specific things are cleaned up when
you navigate.
Check for jobs in 'cancel-requested' or 'requeued' statuses, and ensure
they transition to the right status. This happens at startup, before
even starting the web interface, so that a consistent state is presented.
Only the `ReadHeaderTimeout` is set. `ReadTimeout` is not set, as this is
quite specific per request. Shaman file uploads and websocket connections
should be allowed to run quite long, whereas other queries should be
relatively short.
I didn't find the way VSCode was rewrapping those lines the prettiest,
so I manually changed the formatting. It's now still compatible with the
auto-formatting, but nicer.
No functional changes.