Add a PUT method for `/api/v3/configuration/file`, which entirely
replaces `flamenco-manager.yaml` with the received JSON payload. This
will be used in the future to store configuration edited in the web
frontend.
Ref: #99426
Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104406
Add a `Worker` column to the Job Tasks Table. This lets artists quickly
visualize on which machine a task is currently running, giving better
insights on worker utilization, as well as better expectations on how
long a task might take to finish when running Flamenco on a Renderfarm
made of different slow / fast workers.
Similarly to the Task Details panel for the "Assigned To" field
`LinkWorker` Vue element, the cell element contains an hyperlink to the
corresponding worker in the Workers page. Since the Worker page also
contains a backlink to the currently running task, this lets user
quickly navigate between the two pages, as seen in the screen recording
demo below.
Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104402
Reviewed-by: Sybren A. Stüvel <sybren@blender.org>
When a Worker is soft-deleted, references to it are not cleaned up by
the database (that only happens when it is really deleted). As a result,
the "last-assigned worker of a task" field is still set, but the worker
itself cannot be fetched any more.
This is just a quick fix to avoid an error. It's probably better to
remove the soft-deletion of Workers, as the feature is not really used
anyway. Or to implement it properly, so that the info is used.
Explicitly ignore an error when requesting the Manager to shut down.
This was already ignored implicitly, but now it's explicit to make
linters happy.
No functional changes.
When the `blender` variable has no value, Flamenco will not be able to
run Blender, which is worth emitting a warning about early on.
This required a bit of a reshuffle of the configuration loading logic, to
avoid loading things twice (and thus warning twice).
Fixes part of #104366
Prevent an error when fetching a task that was never assigned to a
worker.
The error:
```
WRN error fetching task worker
error="fetching worker : worker not found: sql: no rows in result set"
```
Replace old used-to-be-GORM datastructures (#104305) with sqlc-generated
structs. This also makes it possible to use more specific structs that
are more taylored to the specific queries, increasing efficiency.
This commit deals with the remaining areas, like the job deleter, task
timeout checker, and task state machine. And anything else to get things
running again.
Functional changes are kept to a minimum, as the API still serves the
same data.
Because this work covers so much of Flamenco's code, it's been split up
into different commits. Each commit brings Flamenco to a state where it
compiles and unit tests pass. Only the result of the final commit has
actually been tested properly.
Ref: #104343
Replace old used-to-be-GORM datastructures (#104305) with sqlc-generated
structs. This also makes it possible to use more specific structs that
are more taylored to the specific queries, increasing efficiency.
This commit deals with the worker sleep schedule.
Functional changes are kept to a minimum, as the API still serves the
same data.
Because this work covers so much of Flamenco's code, it's been split up
into different commits. Each commit brings Flamenco to a state where it
compiles and unit tests pass. Only the result of the final commit has
actually been tested properly.
Ref: #104343
Replace old used-to-be-GORM datastructures (#104305) with sqlc-generated
structs. This also makes it possible to use more specific structs that
are more taylored to the specific queries, increasing efficiency.
This commit deals with worker tags (on both workers and jobs).
Functional changes are kept to a minimum, as the API still serves the
same data.
Because this work covers so much of Flamenco's code, it's been split up
into different commits. Each commit brings Flamenco to a state where it
compiles and unit tests pass. Only the result of the final commit has
actually been tested properly.
Ref: #104343
Replace old used-to-be-GORM datastructures (#104305) with sqlc-generated
structs. This also makes it possible to use more specific structs that
are more taylored to the specific queries, increasing efficiency.
This commit mostly deals with workers, including the sleep schedule and
task scheduler.
Functional changes are kept to a minimum, as the API still serves the
same data.
Because this work covers so much of Flamenco's code, it's been split up
into different commits. Each commit brings Flamenco to a state where it
compiles and unit tests pass. Only the result of the final commit has
actually been tested properly.
Ref: #104343
Replace old used-to-be-GORM datastructures (#104305) with sqlc-generated
structs. This also makes it possible to use more specific structs that
are more taylored to the specific queries, increasing efficiency.
This commit covers job blocklists and last-rendered images.
Functional changes are kept to a minimum, as the API still serves the
same data.
Because this work covers so much of Flamenco's code, it's been split up
into different commits. Each commit brings Flamenco to a state where it
compiles and unit tests pass. Only the result of the final commit has
actually been tested properly.
Ref: #104343
Fix most linter warnings reported by 'staticcheck'. This doesn't fix all
of them, some unused functions are still there, and some generated code
also still triggers some warnings. Most issues are fixed, though.
No functional changes, except for the captialisation of some error
messages.
Use RFC 2047 (aka MIME encoding) to send the original filename when
uploading a file to the Shaman server.
HTTP headers should be ASCII-only, and some systems use Latin-1 as
fallback. That's not suitable in general, though, because almost all
characters fall outside the Latin-1 range.
Add an option to the setup assistant to skip configuring the path to
Blender. It will just use the `default` option, which causes the Workers
to try and find Blender on their own.
Fixes#104306
Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104306
Reviewed-by: Sybren A. Stüvel <sybren@blender.org>
There was a missing condition, which meant that configuring any two-way
variable caused path separator normalisation on all job/task/command
parameters.
Manager: Instead of embedding the worker tag info in a fetched `Job`,
just include its UUID.
Webapp: fetch the worker tag by UUID, instead of using the embedded
info.
Show the worker tag name (and its description in a tooltip) in the job
details. When no worker tag is assigned, "All Workers" is shown in a more
dimmed colour.
This also renames the "Type" field to "Job Type". "Tag" and "Type" could
be confused, and now they're displayed as "Worker Tag" and "Job Type".
The UI in the add-on's submission interface is also updated for this, so
that that also shows "Worker Tag" (instead of just "Tag").
Run the actually-doing-stuff parts of `RequestWorkerStatusChange()` and
`SetWorkerTags()` in a background context. That way the operation can
continue even when the HTTP client disconnects.
Improve the error handling on some worker management API calls, to deal
with closed HTTP connections better.
A new function, `api_impl.handleConnectionClosed()` can now be called when
`errors.Is(err, context.Canceled)`. This will only log at debug level, and
send a `419 I'm a Teapot` response to the client. This response will very
likely never be seen, as the connection was closed. However, in case this
function is called by mistake, this response is unlikely to be accepted
by the HTTP client.
The context passed to the database layer will auto-close when the HTTP
client disconnects. This will cancel any running query, which is the
expected behaviour. Now this no longer results in an error being logged
in the database layer. Instead, a message is logged at debug level.
The API layer is also adjusted to silence logging of `context.Canceled`
for certain operations, most notably getting all jobs and getting all
tasks of a job. These calls occur when the webapp reconnects after a
restart of the Manager. That may trigger a refresh of the page, which
immediately aborts any pending API calls. This is normal and should not
cause errors to be logged.
This is a bit more work than other queries, as it also breaks apart the
fetching of the job and the worker into separate ones. In other words,
internally the persistence layer API changes.
Regenerate the Go mock implementation after the removal of the SaveTask
function from the mocked interface.
See 097d5abb7c13e6eff1facea12f89f24c144194c0
Back in the days when I wrote the code, I didn't know about the
`require` package yet. Using `require.NoError()` makes the test code
more straight-forward.
No functional changes, except that when tests fail, they now fail
without panicking.
Add a new API operation to get the overall farm status. This is based on
the jobs and workers, and their status.
The statuses are:
- `active`: Actively working on jobs.
- `idle`: Farm could be active, but has no work to do.
- `waiting`: Work has been queued, but all workers are asleep.
- `asleep`: Farm is idle, and all workers are asleep.
- `inoperative`: Cannot work: no workers, or all are offline/error.
- `starting`: Farm is starting up.
- `unknown`: Unexpected configuration of worker and job statuses.
Introduce an "event bus"-like system. It's more like a fan-out
broadcaster for certain events. Instead of directly sending events to
SocketIO, they are now sent to the broker, which in turn sends it to any
registered "forwarder". Currently there is ony one forwarder, for
SocketIO.
This opens the door for a proper MQTT client that sends the same events
to an MQTT server.
When an API request comes in to delete a job, not only log the job's UUID,
but also include its database ID. This can help in figuring out database
issues, as when the job is deleted, it's unknown what UUID it had. Database
relations use the ID, and not the UUID.
Implement the API function to mass-mark jobs for deletion, based on
their 'updated_at' timestamp.
Note that the `last_updated_max` parameter is rounded up to entire
seconds. This may mark more jobs for deletion than you expect, if their
`updated_at` timestamps differ by less than a second.
Updating a tag without `description` field in the request body will keep
the tag's description as-is. Previously this caused it to become empty,
which is now still possible by using an explicit `description: ""`.
When the worker is started with `-restart-exit-code 47` or has
`restart_exit_code=47` in `flamenco-worker.yaml`, it's marked as
'restartable'. This will enable two worker actions 'Restart
(immediately)' and 'Restart (after task is finished)' in the Manager web
interface. When a worker is asked to restart, it will exit with exit
code `47`. Of course any positive exit code can be used here.
Change the package base name of the Go code, from
`git.blender.org/flamenco` to `projects.blender.org/studio/flamenco`.
The old location, `git.blender.org`, has no longer been use since the
[migration to Gitea][1]. The new package names now reflect the actual
location where Flamenco is hosted.
[1]: https://code.blender.org/2023/02/new-blender-development-infrastructure/
Fix an issue where a shared storage path on Linux, that maps via two-way
variables to a drive root on Windows, caused problems with the path
translation system.
Windows paths that consist only of a drive letter (`F:`) cannot just be
concatenated to a relative path, as that will result in `F:path\to\file`,
which is still a relative path of sorts. This is now handled correctly,
and should result in `F:\path\to\file`.
This fixes#104237.
Simplify the variable expansion code. Instead of using a separate goroutine
and two channels, use a struct + a simple function call.
No functional changes.
Simplify the code for the two-way variables' value-to-variable replacement.
Instead of using a goroutine and two channels, use a separate struct and
call a function on that directly.
No functional changes.
As it was decided that the name "tags" would be better for the clarity
of the feature, all files and code named "cluster" or "worker cluster"
have been removed and replaced with "tag" and "worker tag". This is only
a name change, no other features were touched.
This addresses part of #104204.
Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104223
As a note to anyone who already ran a pre-release version of Flamenco
and configured some worker clusters, with the help of an SQLite client
you can migrate the clusters to tags. First build Flamenco Manager and
start it, to create the new database schema. Then run these SQL queries
via an sqlite commandline client:
```sql
insert into worker_tags
(id, created_at, updated_at, uuid, name, description)
select id, created_at, updated_at, uuid, name, description
from worker_clusters;
insert into worker_tag_membership (worker_tag_id, worker_id)
select worker_cluster_id, worker_id from worker_cluster_membership;
```
If the mock tests are run by root user then this specific test of
inaccessible path fails because root can write files to anywhere on the
filesystem. It is not clear that Flamenco Manager test
TestCheckSharedStoragePath is checking inaccessible file locations when
it fails and that it should be run by an unprivileged user.
Fix is to fail the permission test if the tests are run as a root user.