Explicitly ignore an error when requesting the Manager to shut down.
This was already ignored implicitly, but now it's explicit to make
linters happy.
No functional changes.
Improve the error handling on some worker management API calls, to deal
with closed HTTP connections better.
A new function, `api_impl.handleConnectionClosed()` can now be called when
`errors.Is(err, context.Canceled)`. This will only log at debug level, and
send a `419 I'm a Teapot` response to the client. This response will very
likely never be seen, as the connection was closed. However, in case this
function is called by mistake, this response is unlikely to be accepted
by the HTTP client.
Add a new API operation to get the overall farm status. This is based on
the jobs and workers, and their status.
The statuses are:
- `active`: Actively working on jobs.
- `idle`: Farm could be active, but has no work to do.
- `waiting`: Work has been queued, but all workers are asleep.
- `asleep`: Farm is idle, and all workers are asleep.
- `inoperative`: Cannot work: no workers, or all are offline/error.
- `starting`: Farm is starting up.
- `unknown`: Unexpected configuration of worker and job statuses.
Change the package base name of the Go code, from
`git.blender.org/flamenco` to `projects.blender.org/studio/flamenco`.
The old location, `git.blender.org`, has no longer been use since the
[migration to Gitea][1]. The new package names now reflect the actual
location where Flamenco is hosted.
[1]: https://code.blender.org/2023/02/new-blender-development-infrastructure/
Implement the `deleteJob` API endpoint. Calling this endpoint will mark
the job as "deletion requested", after which it's queued for actual
deletion. This makes the API response fast, even when there is a lot of
work to do in the background.
A new background service "job deleter" keeps track of the queue of such
jobs, and performs the actual deletion. It removes:
- Shaman checkout for the job (but see below)
- Manager-local files of the job (task logs, last-rendered images)
- The job itself
The removal is done in the above order, so the job is only removed from the
database if the rest of the removal was succesful.
Shaman checkouts are only removed if the job was submitted with Flamenco
version 3.2. Earlier versions did not record enough information to reliably
do this.
Add a handler for the OpenAPI `taskOutputProduced` operation, and an
image thumbnailing goroutine.
The queue of images to process + the function to handle queued images
is managed by `last_rendered.LastRenderedProcessor`. This queue currently
simply allows 3 requests; this should be improved such that it keeps
track of the job IDs as well, as with the current approach a spammy job
can starve the updates from a more calm job.
It returns 2048 bytes at most. It'll likely be less than that, as it will
ignore the first bytes until the very first newline (to avoid returning
cut-off lines). If the log file itself is 2048 bytes or smaller, return the
entire file.
Implement task log broadcasting via SocketIO. The logs aren't shown in the
web interface yet, but do arrive there in a Pinia store. That store is
capped at 1000 lines to keep memory requirements low-ish.
Rename `pkg/api/flamenco-manager.yaml` to `flamenco-openapi.yaml`, to
distinguish the OpenAPI definition file from the Flamenco Manager
configuration file of the same name (but in a different directory).
No functional changes.
Worker and Manager implementation of the "may-I-kee-running" protocol.
While running tasks, the Worker will ask the Manager periodically
whether it's still allowed to keep running that task. This allows the
Manager to abort commands on Workers when:
- the Worker should go to another state (typically 'asleep' or
'shutdown'),
- the task changed status from 'active' to something non-runnable
(typically 'canceled' when the job as a whole is canceled).
- the task has been assigned to a different Worker. This can happen when
a Worker loses its connection to its Manager, resulting in a task
timeout (not yet implemented) after which the task can be assigned to
another Worker. If then the connectivity is restored, the first Worker
should abort (last-assigned Worker wins).
Manager now sends out task updates via SocketIO, and the web interface
handles those.
Note that there is a `BroadcastTaskUpdate()` function, but not a
`BroadcastNewTask`. The 'new job' broadcast is sent after the job's
tasks have been created, and thus there is no need for a separate
broadcast per task.
Add `fetchJobTasks` operation to the Jobs API. This returns a summary of
each of the job's tasks, suitable for display in a task list view.
The actually used fields may need tweaking once we actually have a task
list view, but at least the functionality is there.
To prepare for job status changes being requestable from the API, store
the reason for any status change on the job itself.
Not yet part of the API, just on the persistence layer.
SQLite can return `SQLITE_BUSY` errors when it's doing too many things at
the same time. This is now improved a bit by setting a 5-second timeout,
during which the SQLite driver will wait for the database to become
available. If that doesn't happen, Flamenco Manager will return a
`503 Service Unavailable` response so that the client knows to back off
a little.
- Addon switches between filesystem-packing and Shaman-packing
automatically, depending on whether the Manager has Shaman enabled.
- Actually using BAT for Shaman packing.
It doesn't work though, some error occurs when receiving Shaman response
from the Manager in the Addon.
This introduces some more conceptual changes to Shaman. The most important
one is that there is no longer a "checkout ID", but a "checkout path".
The Shaman client can request any subpath of the checkout directory,
so that it can handle things like project- or scene-specific prefixes.
The add-on code was copy-pasted from other addons and used the GPL v2
license, whereas by accident the LICENSE text file had the GNU "Affero" GPL
license v3 (instead of regular GPL v3).
This is now all streamlined, and all code is licensed as "GPL v3 or later".
Furthermore, the code comments just show a SPDX License Identifier
instead of an entire license block.
The task status change → job status change code is a direct port of the
Flamenco Server v2 code written in Python.
There is no job status change → task status changes logic yet, and the
tests are also far from complete.