Instead of returning an error when getting the sqlc queries object, just
panic. This'll make the calling code quite a bit simpler. The situation
in which it might error out is so rare that I've never seen it, and it may
not even be possible at all with the SQLite implementation we use now.
Furthermore, once we get rid of GORM, it should just always work anyway.
Ref: #104305
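For illustration, a minimal sketch of what such a getter could look like (the `DB` and `gormDB` names and the `sqlc` package layout are assumptions, not the actual Flamenco code):
```go
// Hypothetical sketch: a DB wrapper around GORM returning the generated
// sqlc Queries object, panicking instead of returning an error.
func (db *DB) queries() *sqlc.Queries {
	lowLevelDB, err := db.gormDB.DB() // GORM v2: obtain the underlying *sql.DB.
	if err != nil {
		// Extremely unlikely in practice, hence the panic instead of an error return.
		panic(fmt.Sprintf("could not get low-level database handle: %v", err))
	}
	return sqlc.New(lowLevelDB)
}
```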
Add an option to the setup assistant to skip configuring the path to
Blender. It will just use the `default` option, which causes the Workers
to try and find Blender on their own.
Fixes #104306
Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104306
Reviewed-by: Sybren A. Stüvel <sybren@blender.org>
Add a new job type that can render a single image. The image is broken up
into separate tiles, each of which can be rendered independently by a
worker. Only tested with Cycles.
Adaptive sampling is supported. For this, each tile is expanded by 16
pixels in each direction, which is later cropped off before merging
the tiles.
Denoising is not (yet) supported, as Blender/Cycles does not output
all the necessary data into EXR layers.
Tile sizes should (for now) be a power of 2, to avoid alignment
issues.
Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104327
Reviewed-by: Sybren A. Stüvel <sybren@blender.org>
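As a rough illustration of the border handling described above (a sketch only: the 16-pixel margin comes from the text, the type and function names are made up):
```go
// Each tile is expanded by a 16-pixel border for adaptive sampling; the
// border is rendered too, and cropped off again before merging.
const adaptiveSamplingMargin = 16

type tileBounds struct{ xmin, ymin, xmax, ymax int }

// expandTile grows the tile on each side, clamped to the image size.
// Uses Go 1.21's built-in min/max.
func expandTile(t tileBounds, imageWidth, imageHeight int) tileBounds {
	return tileBounds{
		xmin: max(0, t.xmin-adaptiveSamplingMargin),
		ymin: max(0, t.ymin-adaptiveSamplingMargin),
		xmax: min(imageWidth, t.xmax+adaptiveSamplingMargin),
		ymax: min(imageHeight, t.ymax+adaptiveSamplingMargin),
	}
}
```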
Convert most of the job blocklist queries from GORM to sqlc. The management
functions (add worker, remove worker, clear list, fetch list) have been
converted.
There was a missing condition, which meant that configuring any two-way
variable caused path separator normalisation on all job/task/command
parameters.
Include the current scene name in a hidden setting of the Simple Blender
Render job type, and pass it as a CLI argument to Blender.
This ensures that the correct scene is rendered when working directly on
shared storage (as that does not have a copy of the blend file per job).
The job setting `"scene"` is still optional. If it's missing or empty,
the `--scene <scene name>` CLI arg will simply not be passed to Blender.
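A sketch of the conditional argument construction (the helper and its parameters are hypothetical; only the `--scene` flag and the optional job setting come from the description above):
```go
// Build the Blender CLI arguments; --scene is only added when the job
// setting "scene" is present and non-empty.
func blenderArgs(blendfile, sceneName string) []string {
	args := []string{"--background", blendfile}
	if sceneName != "" {
		args = append(args, "--scene", sceneName)
	}
	return args
}
```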
Manager: Instead of embedding the worker tag info in a fetched `Job`,
just include its UUID.
Webapp: fetch the worker tag by UUID, instead of using the embedded
info.
Show the worker tag name (and its description in a tooltip) in the job
details. When no worker tag is assigned, "All Workers" is shown in a more
dimmed colour.
This also renames the "Type" field to "Job Type". "Tag" and "Type" could
be confused, and now they're displayed as "Worker Tag" and "Job Type".
The UI in the add-on's submission interface is also updated for this, so
that it also shows "Worker Tag" (instead of just "Tag").
The unit test used the real clock, and depended on ordering by
last-updated timestamps. This meant that sometimes two updates both fell
within the granularity of the timestamp in SQLite, which made the
ordering unreliable.
Switching to a mocked clock fixes this, as it moves forward at a hard-
coded pace, regardless of the execution speed of the test.
No functional changes to Flamenco itself.
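A minimal sketch of the mocked-clock idea (the type and names are illustrative, not Flamenco's actual clock abstraction): the test advances the clock by a fixed step between updates, so the last-updated timestamps always differ.
```go
// Requires "time". The mock clock only advances when the test tells it to.
type mockClock struct{ now time.Time }

func (c *mockClock) Now() time.Time      { return c.now }
func (c *mockClock) Add(d time.Duration) { c.now = c.now.Add(d) }

// In the test, between two updates:
//   clk.Add(1 * time.Second) // deterministic, regardless of execution speed
```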
A job first goes to `pause-requested` status, during which any `active` task
gets a chance to be completed. Once there are no more active tasks, the job
goes to `paused` state (or `failed`, if that is applicable).
Pull request: https://projects.blender.org/studio/flamenco/pulls/104313
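Roughly, the transition described above could look like this (all identifiers here are hypothetical, not Flamenco's actual job state machine API; the `failed` case is omitted):
```go
// Called whenever a task of a pause-requested job changes status.
func (sm *StateMachine) maybePauseJob(ctx context.Context, job *Job) error {
	if job.Status != JobStatusPauseRequested {
		return nil
	}
	numActive, err := sm.persist.CountTasksInStatus(ctx, job, TaskStatusActive)
	if err != nil {
		return err
	}
	if numActive > 0 {
		return nil // Active tasks still get a chance to complete.
	}
	return sm.persist.SaveJobStatus(ctx, job, JobStatusPaused)
}
```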
Instead of explicitly checking errors / nil values and calling
`t.Fatal()`, just use `require.NoError()` or `require.NotNil()`.
No functional changes. The wording of test failures will be slightly
different though.
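For example, using testify's `require` package (the `db.FetchJob` call is just a placeholder for any fallible test setup call):
```go
// Replaces the explicit `if err != nil { t.Fatal(err) }` and nil checks.
job, err := db.FetchJob(ctx, jobUUID)
require.NoError(t, err) // Stops the test immediately on error.
require.NotNil(t, job)  // Stops the test immediately if job is nil.
```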
Some reorganisation to make it easier to convert a job & its tasks from
sqlc to gorm data structures.
The persistence layer is being converted to sqlc. Once that is done, the
remainder of the code can switch over from using gorm structs to sqlc
structs. Then this code will no longer be necessary.
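A hypothetical example of such a conversion helper (field names are assumptions; once the sqlc conversion is complete, helpers like this disappear again):
```go
// Convert a job row as returned by sqlc into the GORM model used by the
// rest of the code. Purely illustrative field mapping.
func convertSqlcJob(j sqlc.Job) Job {
	return Job{
		UUID:     j.UUID,
		Name:     j.Name,
		JobType:  j.JobType,
		Priority: int(j.Priority),
		Status:   JobStatus(j.Status),
	}
}
```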
Run the parts of `RequestWorkerStatusChange()` and `SetWorkerTags()` that
actually do the work in a background context. That way the operation can
continue even when the HTTP client disconnects.
Improve the error handling on some worker management API calls, to deal
with closed HTTP connections better.
A new function, `api_impl.handleConnectionClosed()`, can now be called when
`errors.Is(err, context.Canceled)`. This will only log at debug level, and
send a `418 I'm a Teapot` response to the client. This response will very
likely never be seen, as the connection was closed. However, in case this
function is called by mistake, this response is unlikely to be accepted
by the HTTP client.
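A sketch of what such a handler could look like, using only the standard library; the real signature in `api_impl` may well differ:
```go
// handleConnectionClosed: the client has already gone away, so only log at
// debug level and send a response that will most likely never be seen.
func handleConnectionClosed(w http.ResponseWriter, logger *slog.Logger, msg string) {
	logger.Debug(msg)
	// http.StatusTeapot (418): if this is ever called by mistake, it is a
	// response the HTTP client is unlikely to accept as a success.
	w.WriteHeader(http.StatusTeapot)
}

// Caller side, sketched:
//   if errors.Is(err, context.Canceled) {
//       handleConnectionClosed(w, logger, "client disconnected during status change")
//       return nil
//   }
```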
Fix an issue where a timed-out task would cause a panic, as it wasn't
fetching its Job UUID.
I see this as working around a limitation of GORM, which should get
replaced with sqlc soon-ish anyway.
When expanding the `{shared}` variable in a path like
`{shared}\shot\file.blend`, all slashes will be normalised to the target
platform, resulting in, for example:
- Windows: `Y:\shared\flamenco\shot\file.blend`
- Linux : `/shared/flamenco/shot/file.blend`
Due to this normalisation, the same paths will be served to these
platforms as when the path is written as `{shared}/shot/file.blend`.
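A sketch of the separator normalisation (the function name and the `platform` values are illustrative, not the actual variable-replacement code; requires "strings"):
```go
// Normalise all path separators to the target platform after variable
// expansion, regardless of which separator the original value used.
func normaliseSeparators(path, platform string) string {
	if platform == "windows" {
		return strings.ReplaceAll(path, "/", `\`)
	}
	return strings.ReplaceAll(path, `\`, "/")
}
```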
Remove the introductory comments from `query_jobs.sql` and
`query_workers.sql`. Sqlc got confused by these, and placed them in the
wrong (well, not-intended-by-me) place in the generated Go code.
No functional changes.
Convert 'last-rendered' to sqlc. The query is slightly suboptimal.
There's a bug in sqlc: the `ON CONFLICT DO UPDATE` clause is generated
incorrectly. See https://github.com/sqlc-dev/sqlc/issues/3334 for more
info.
Turn `workers.lazy_status_request` and `workers.can_restart` into a
`boolean`. They were `smallint` before.
Having these explicitly modelled as `boolean` will make sqlc generate the
right type for them.
No functional changes.
The context passed to the database layer is automatically cancelled when the
HTTP client disconnects. This cancels any running query, which is the
expected behaviour. Now this no longer results in an error being logged
in the database layer. Instead, a message is logged at debug level.
The API layer is also adjusted to silence logging of `context.Canceled`
for certain operations, most notably getting all jobs and getting all
tasks of a job. These calls occur when the webapp reconnects after a
restart of the Manager. That may trigger a refresh of the page, which
immediately aborts any pending API calls. This is normal and should not
cause errors to be logged.
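A sketch of the logging decision in the database layer (names and the logger are illustrative; only the standard library is used here):
```go
// Log cancelled queries at debug level, real failures at error level.
func logQueryError(logger *slog.Logger, err error) {
	switch {
	case err == nil:
		return
	case errors.Is(err, context.Canceled):
		// Expected when the HTTP client disconnects; not an error.
		logger.Debug("database query cancelled", "cause", err)
	default:
		logger.Error("database query failed", "error", err)
	}
}
```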
Add a GORM hook that sets `task.JobUUID` and `.WorkerUUID`. These were
only set by the sqlc code; this change ensures that they are now always
set, so that the caller doesn't have to worry about which function is
already ported to sqlc and which one is still GORM.
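Sketched with GORM's `AfterFind` hook (the field names are assumptions):
```go
// GORM calls AfterFind automatically after fetching a Task, so the UUIDs
// are filled in regardless of which query loaded the task.
func (t *Task) AfterFind(tx *gorm.DB) error {
	if t.Job != nil {
		t.JobUUID = t.Job.UUID
	}
	if t.Worker != nil {
		t.WorkerUUID = t.Worker.UUID
	}
	return nil
}
```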
Wrap the SQLite error "interrupted (9)". That error is (as far as I
could figure out) caused by the context being cancelled. Unfortunately
there is no wrapping of the underlying context error, so it's not
possible to determine whether it was due to a 'deadline exceeded' error
or another cancellation cause (like upstream HTTP connection closing).
Primarily this makes a rather unreliable unit test properly reliable.
The code under test could return either `context.DeadlineExceeded` or
the "interrupted (9)" error (GORM + SQLite doesn't reliably chose one or
the other), and now this is cleanly tested for.
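One way the wrapping could look, sketched (the sentinel error and the string match are assumptions about how the detection works):
```go
// ErrContextCancelled is a sentinel so callers and tests can use errors.Is()
// without caring which underlying error SQLite/GORM happened to surface.
var ErrContextCancelled = errors.New("context cancelled")

func wrapDBError(err error) error {
	if err == nil {
		return nil
	}
	if strings.Contains(err.Error(), "interrupted (9)") {
		return fmt.Errorf("%w: %v", ErrContextCancelled, err)
	}
	return err
}
```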
Job deletions are placed in an in-memory queue in batches of 100 jobs.
Between batches the Manager's job deleter would idle for 1 minute. Now,
once the in-memory queue has been emptied, the job deleter will wait
only 100ms before checking the database again.
This 100ms might not be necessary either, but I think it's nice to give
the Manager a bit of a breather before diving into another batch of
deletions.
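The resulting wait loop could be sketched like this (the durations come from the text above, the structure and names are illustrative):
```go
const (
	whenQueueEmpty   = 1 * time.Minute        // Nothing to delete: idle for a while.
	whenQueueDrained = 100 * time.Millisecond // Just finished a batch: short breather, then re-check.
)

func (d *Deleter) run(ctx context.Context) {
	wait := whenQueueEmpty
	for {
		select {
		case <-ctx.Done():
			return
		case <-time.After(wait):
		}
		if d.queueAndDeletePendingJobs(ctx) { // Hypothetical: true if a batch was deleted.
			wait = whenQueueDrained
		} else {
			wait = whenQueueEmpty
		}
	}
}
```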
Speed up the deletion of multiple jobs by skipping the database integrity
check. It is now clear what was causing the integrity issues (disabled
foreign key constraints), and this is now checked for before deleting
anything. This reduces the deletion time from ~500ms per job to ~150ms
(on my computer, with my database, of course).
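The up-front check could be sketched like this (using `database/sql` directly; the surrounding code is illustrative):
```go
// SQLite reports whether foreign key enforcement is on via a PRAGMA.
func checkForeignKeysEnabled(ctx context.Context, db *sql.DB) error {
	var enabled int
	if err := db.QueryRowContext(ctx, "PRAGMA foreign_keys").Scan(&enabled); err != nil {
		return fmt.Errorf("checking foreign key support: %w", err)
	}
	if enabled != 1 {
		return errors.New("foreign key constraints are disabled, refusing to delete jobs")
	}
	return nil
}
```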
Replace the use of the `t *testing.T` parameter with just plain `panic()`
when test setup fails. This makes it easier to call the same functions
from other situations, like benchmark functions.
No functional changes to Flamenco itself.
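Sketch of such a setup helper (the `openDB` call and the DSN are illustrative):
```go
// createTestDB can be called from tests and benchmarks alike, because it
// panics on setup failure instead of needing a *testing.T.
func createTestDB() (db *DB, closer func()) {
	db, err := openDB(context.Background(), "file::memory:?cache=shared")
	if err != nil {
		panic(fmt.Sprintf("setting up test database: %v", err))
	}
	return db, func() { db.Close() }
}
```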
Instead of closing the SQLite database connection directly, tell GORM to
close the connection. Only that properly closes the DB, so that testing with
a file on disk doesn't fail when trying to delete that file.
No functional changes to the Manager itself.
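Sketched with GORM's v2 API (the `gormDB` field name is an assumption):
```go
// Close the database via GORM, which also closes the underlying *sql.DB.
func (db *DB) Close() error {
	lowLevelDB, err := db.gormDB.DB()
	if err != nil {
		return fmt.Errorf("getting low-level database handle: %w", err)
	}
	return lowLevelDB.Close()
}
```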
Tweak the logging a little bit so that it's less noisy and properly warns
when the Shaman checkout dir cannot be removed, and optimise the database
query a bit (by fetching just the one field that's needed, instead of the
entire job).
Deletion still works the same.
The task state machine expects that `task.Job` is set correctly. Since
sqlc does not automatically fill this field (and rightfully so), I've added
a bit of Go code that fetches the job in a separate query.
A TODO is added as a reminder that it would be better for the task state
machine itself to fetch the job when needed.
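Sketch of that extra fetch (identifiers are assumptions, not the actual code):
```go
// ensureTaskHasJob fetches the job of a task when sqlc didn't fill it in.
// TODO: let the task state machine fetch the job itself when it needs it.
func (db *DB) ensureTaskHasJob(ctx context.Context, task *Task) error {
	if task.Job != nil {
		return nil
	}
	job, err := db.fetchJobByID(ctx, task.JobID)
	if err != nil {
		return fmt.Errorf("fetching job of task %s: %w", task.UUID, err)
	}
	task.Job = job
	return nil
}
```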