Add a small wrapper around github.com/google/uuid. That way it's clearer
which functionality is used by Flamenco, doesn't link most of the code to
any specific UUID library, and allows a bit of customisation.
The only customisation now is that Flamenco is a bit stricter in the
formats it accepts; only the `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx` is
accepted. This makes things a little bit stricter, with the advantage
that we don't need to do any normalisation of received UUID strings.
Send & handle `JobUpdate.refresh_tasks = true` when many tasks are
updated simultaneously. This applies to things like cancelling &
requeueing an entire job.
This partially rolls back 67bf77de13d99b1bc5d7344951068822c4fadd88, as
it was too slow when 1000+ tasks were being updated all at once.
Improve how the task scheduler deals with tasks that already have a
worker assigned to them:
- When a Worker asks for a task, and there is already an active task
assigned to it, always return that task.
- Otherwise, never allow scheduling of active tasks, as those are
already being run by another worker. If this is not the case, their
status should change to queued/failed, instead of handling the
situation in the task scheduler.
- Apart from the assigned-and-active case above, ignore task's worker ID
when scheduling tasks. If the status is 'queued' or 'soft-failed', the
task's worker ID just indicates who ran the task last.
Check for jobs in 'cancel-requested' or 'requeued' statuses, and ensure
they transition to the right status. This happens at startup, before
even starting the web interface, so that a consistent state is presented.
When the job status changes, it impacts the task statuses as well. These
status changes are now no longer done with a single database query, but
instead each affected task is fetched, changed, and saved. This unifies
the regular & mass updates to the tasks, and causes the resulting task
changes to be broadcast to SocketIO clients.
Add `fetchJobTasks` operation to the Jobs API. This returns a summary of
each of the job's tasks, suitable for display in a task list view.
The actually used fields may need tweaking once we actually have a task
list view, but at least the functionality is there.
To prepare for job status changes being requestable from the API, store
the reason for any status change on the job itself.
Not yet part of the API, just on the persistence layer.
Avoid users of the persistence layer to have to test against Gorm errors,
by wrapping job/task errors in a new `PersistenceError` struct.
Instead of testing for `gorm.ErrRecordNotFound`, code can now test for
`persistence.ErrJobNotFound` or `persistence.ErrTaskNotFound`.
SQLite can return `SQLITE_BUSY` errors when it's doing too many things at
the same time. This is now improved a bit by setting a 5-second timeout,
during which the SQLite driver will wait for the database to become
available. If that doesn't happen, Flamenco Manager will return a
`503 Service Unavailable` response so that the client knows to back off
a little.
The add-on code was copy-pasted from other addons and used the GPL v2
license, whereas by accident the LICENSE text file had the GNU "Affero" GPL
license v3 (instead of regular GPL v3).
This is now all streamlined, and all code is licensed as "GPL v3 or later".
Furthermore, the code comments just show a SPDX License Identifier
instead of an entire license block.
The root cause was a 2nd `context.Context()` that was used in
`constructTestJob()`, which cancelled when that function returned.
The cancellation of the context caused an interrupt in the SQLite
driver, which got into a race condition and could cause an interrupt on
a subsequent database query.
Instead of returning an error "error doing X", just return "doing X". The
fact that it's returned as an error object says enough about that it's
an error.
This also makes it easier to chain error messages, without seeing the
word "error" in every part of the chain.
The on-disk database that was used before caused issues with tests running
in parallel. Not only is there the theoretical issue of tests seeing each
other's data (this didn't happen, but could), there was also the practical
issue of one test running while the other tried to erase the database file
(which fails on Windows due to file locking).
Where the PostgreSQL DB migration code could handle `NOT NULL` columns just
fine, SQLite has less table-altering functionality. As a result, migrations
have to copy entire database tables, which doesn't play well with
not-nullable columns.