646 Commits

Author SHA1 Message Date
Sybren A. Stüvel
44ec93275d Cleanup: reformat all Go files
Run `go fmt` on all files, to fix their formatting.

No functional changes.
2023-07-10 10:58:14 +02:00
Michael Cook
b20ede97ea Shaman: fail unit test when running as root user
If the mock tests are run by the root user, this specific test of an
inaccessible path fails, because root can write files anywhere on the
filesystem. When the Flamenco Manager test TestCheckSharedStoragePath
fails this way, it is not clear that it is checking inaccessible file
locations, nor that it should be run by an unprivileged user.

The fix is to fail the permission test if the tests are run as root.
2023-07-07 16:05:43 +02:00
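The root-user check described in b20ede97ea could look roughly like the sketch below. The helper names `isRoot` and `runningAsRoot` are hypothetical and are not taken from the Flamenco sources; only `os.Geteuid()` is a real standard-library call.

```go
package main

import "os"

// isRoot reports whether the given effective user ID belongs to root.
// Hypothetical helper, for illustration only.
func isRoot(euid int) bool {
	return euid == 0
}

// runningAsRoot reports whether the current process runs as root.
// On Windows os.Geteuid() returns -1, so this is always false there.
func runningAsRoot() bool {
	return isRoot(os.Geteuid())
}
```

A permission test could then start with `if runningAsRoot() { t.Fatal("this test must be run by an unprivileged user") }`, which makes the reason for the failure explicit.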
Sybren A. Stüvel
7a508c7e6b Manager: perform database integrity check at startup
Perform these two SQL calls & check their result:

- `PRAGMA integrity_check`
- `PRAGMA foreign_key_check`

See https://www.sqlite.org/pragma.html for more info on these.

This also removes the unused `PeriodicMaintenanceLoop()` function.
Periodic checking while Flamenco Manager is running might be introduced
in a future commit, after the startup-time checks have been shown to not
get in the way.
2023-07-07 16:03:06 +02:00
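As a rough illustration of the two startup checks from 7a508c7e6b: `PRAGMA integrity_check` returns a single row containing `ok` for a healthy database, and `PRAGMA foreign_key_check` returns one row per violation. The function names below are assumptions and do not reflect how Flamenco Manager structures this code. The sketch uses plain `database/sql`, whereas Flamenco goes through GORM.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
)

// integrityOK interprets the rows returned by `PRAGMA integrity_check`:
// a healthy database yields exactly one row containing "ok".
func integrityOK(rows []string) bool {
	return len(rows) == 1 && rows[0] == "ok"
}

// checkIntegrity runs both pragmas and returns an error if either one
// reports a problem. Hypothetical helper, for illustration only.
func checkIntegrity(ctx context.Context, db *sql.DB) error {
	rows, err := db.QueryContext(ctx, "PRAGMA integrity_check")
	if err != nil {
		return err
	}
	defer rows.Close()

	var results []string
	for rows.Next() {
		var line string
		if err := rows.Scan(&line); err != nil {
			return err
		}
		results = append(results, line)
	}
	if err := rows.Err(); err != nil {
		return err
	}
	if !integrityOK(results) {
		return errors.New("database integrity check failed")
	}

	// foreign_key_check returns one row per violation, so any row is a failure.
	fkRows, err := db.QueryContext(ctx, "PRAGMA foreign_key_check")
	if err != nil {
		return err
	}
	defer fkRows.Close()
	if fkRows.Next() {
		return errors.New("database foreign key check failed")
	}
	return fkRows.Err()
}
```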
Sybren A. Stüvel
7f588e6dbc Manager: close database connection on startup errors
When there is an error detected at startup, close the database connection.
Before, the connection could be kept open even when an error was returned,
causing the write-ahead log files to be kept around. These are now
properly integrated into the main database file before exiting.
2023-07-07 15:48:08 +02:00
Sybren A. Stüvel
988cdf61ff Upgrade GORM & SQLite
Upgrade:
- `gorm.io/gorm` v1.23.8 → v1.25.2
- `github.com/glebarez/go-sqlite` v1.17.3 → v1.8.0
- `github.com/glebarez/sqlite` v1.4.6 → v1.8.0

and also some indirect dependencies.

This is in the hope that some weird cases at Blender Studio get resolved.
It appears that sometimes, for some unknown reason, when deleting a job,
its tasks get reassigned to another job (instead of also getting deleted).

Since there is no code in Flamenco itself to do this task deletion (it's
all depending on SQLite following the foreign keys and cascading to tasks),
I hope it was a bug in either GORM or SQLite that got fixed at some point.
2023-07-06 16:08:57 +02:00
Sybren A. Stüvel
22f4aa09f3 Manager: expand job deletion unit test
Add extra job to the database before deleting one, to ensure that job
deletion doesn't do anything with other jobs (and their tasks).

No functional changes to Flamenco itself.
2023-07-06 16:08:57 +02:00
Sybren A. Stüvel
6a30f844eb Manager: Better reporting of version via API call
Before: `3.3-alpha0-v3.2-76-gdd34d538-dirty`
After : `3.3-alpha0 (v3.2-76-gdd34d538-dirty)`

Also include the new `git` property to always have the Git hash (the part
between parentheses).
2023-07-06 12:21:47 +02:00
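The new format from 6a30f844eb can be sketched as below. The `versionInfo` type and its field names are hypothetical. Only the before/after strings come from the commit message.

```go
package main

import "fmt"

// versionInfo keeps the version and the Git description apart, so that
// both can be reported separately. Hypothetical type, for illustration.
type versionInfo struct {
	Version string // e.g. "3.3-alpha0"
	Git     string // e.g. "v3.2-76-gdd34d538-dirty"
}

// String formats the version as "<version> (<git>)".
func (v versionInfo) String() string {
	return fmt.Sprintf("%s (%s)", v.Version, v.Git)
}
```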
Sybren A. Stüvel
77db55bb14 Manager: when worker signs off, only remember specific statuses
Limit which worker statuses are remembered (when they go offline) to
those that we want to restore when they come back online. This is now
set to `awake` and `asleep`. This prevents workers from being told to go
to states that they cannot handle, such as `error` or `starting`.
2023-06-23 11:38:37 +02:00
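A minimal sketch of the filtering described in 77db55bb14, assuming statuses are plain strings. The function name is hypothetical. Only the `awake` and `asleep` values come from the commit message.

```go
package main

// rememberedStatus returns the status to restore when the worker comes
// back online, or "" when the status should not be remembered.
// Hypothetical helper, for illustration only.
func rememberedStatus(statusBeforeOffline string) string {
	switch statusBeforeOffline {
	case "awake", "asleep":
		return statusBeforeOffline
	default:
		// Statuses like "error" or "starting" cannot be requested of a worker.
		return ""
	}
}
```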
Bastien Montagne
71f2947c4b Add 'copy-file' command. (#104220)
Initial implementation with some basic tests.

The API command only accepts one source and one destination path, both of
which must be absolute. If the destination already exists, it will not be
overwritten, and an error is generated.

JS job definition code using this new command can easily handle lists of
paths and/or relative paths if needed.

Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104220
2023-06-08 16:20:43 +02:00
Eveline Anderson
4d2200bb0c Fix #99549: Remember Previous Status (#104217)
Fix #99549: When sending Workers offline, remember their previous status

When a Worker goes offline, the Manager now remembers its previous status so that it can be restored once the Worker comes back online. So when the Worker makes a status change `X → offline`, the Manager immediately sets `StatusRequested = "X"` once the Worker is online again.

Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104217
2023-06-02 22:50:07 +02:00
Sybren A. Stüvel
afde952c10 Fix incompatibility with 32-bit platforms 2023-05-24 21:23:05 +02:00
Anish Bharadwaj (he)
0502498dfa Fix #104201: Task Limit error in Flamenco Manager
Insert tasks in batches so that the required SQL query stays within the limits of SQLite.

No changes to the API, only to the persistence layer.

Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104205
2023-04-24 15:10:59 +02:00
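The batching idea from 0502498dfa can be illustrated with a generic helper. This is only a sketch: the name `inBatches` is hypothetical, the batch size Flamenco uses is not stated in the commit message, and the actual fix lives in the GORM-based persistence layer.

```go
package main

// inBatches splits items into consecutive slices of at most size elements.
// Each batch can then be inserted with its own SQL query, keeping every
// query within SQLite's limits. Hypothetical helper, for illustration.
func inBatches[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var batches [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		batches = append(batches, items[start:end])
	}
	return batches
}
```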
Nitin-Rawat-1
752597b8e1 Check for number of workers before soft failing the task. (#104195)
Manager: fix issue #104190, where a job gets stuck when there are fewer
workers than the soft-fail threshold. Before soft-failing a task, check
the number of workers to decide whether the job should be failed.

Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104195
2023-04-20 11:53:41 +02:00
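One way to read the fix in 752597b8e1 is that the soft-fail threshold is effectively capped by the number of workers that can run the task. The sketch below shows that reading. The function, its parameters, and the exact comparison are assumptions, not the logic from the Flamenco sources.

```go
package main

// shouldHardFail reports whether a task should fail outright instead of
// being soft-failed and retried by another worker.
//
// Assumption: a task is hard-failed once the number of workers that failed
// it reaches the soft-fail threshold, or the number of workers able to run
// it, whichever is lower.
func shouldHardFail(failedWorkers, softFailThreshold, capableWorkers int) bool {
	limit := softFailThreshold
	if capableWorkers < limit {
		limit = capableWorkers
	}
	return failedWorkers >= limit
}
```

With a threshold of 3 and only 2 workers, the task fails after both workers have failed it, rather than waiting forever for a third worker.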
Sybren A. Stüvel
472b73eb5c Cleanup: run go fmt ./...
No functional changes.
2023-04-17 16:36:52 +02:00
Sybren A. Stüvel
6a89fa346c Manager: correctly count how many workers can run a job
Basically this accounts for the change in 3724a8874e4f22ef0740f464d9e912b19a1e061e
2023-04-04 15:19:21 +02:00
Sybren A. Stüvel
10d7e7e203 Manager: allow creation of worker clusters without UUID 2023-04-04 13:19:11 +02:00
Sybren A. Stüvel
3724a8874e Slight change of worker cluster behaviour
Workers without cluster now only run jobs without cluster.
2023-04-04 13:17:45 +02:00
Sebastian Parborg
f6f1ebdd05 Make runtime paths configurable at build time
To allow more build-time configuration:

- `Makefile` will now pick up `LDFLAGS` from environment variables, and
- locations of configuration files can now be overridden with linker
  options.

These are not used for regular Flamenco builds, but do allow studios to
customize where configuration files are stored.

Review: https://projects.blender.org/studio/flamenco/pulls/104200
2023-04-04 12:29:03 +02:00
Sybren A. Stüvel
8408d28a6b Manager: add support for worker clusters 2023-04-04 12:18:35 +02:00
Sybren A. Stüvel
675d966263 OAPI: regenerate code 2023-04-04 12:18:17 +02:00
Sybren A. Stüvel
28cc7b7a3f Manager: improve logging when workers register
The info message that a worker registered now also includes its UUID.
Any failure hashing the password will now also log the worker name + UUID.
2023-04-04 12:13:21 +02:00
Sybren A. Stüvel
e2559b1181 Cleanup: remove doubly-declared default value in persistence layer
No functional changes.
2023-04-03 16:59:22 +02:00
Sybren A. Stüvel
159ce5b34a Manager: avoid starting error messages with 'error'
No real functional changes, just server-side logging.
2023-04-03 16:58:48 +02:00
Sybren A. Stüvel
0ac64719e7 Job deleter: improve logging
Various improvements to the logging of the job deletion:

- Reduce the log level of the "removing logs" and "removing job from
  database" lines from INFO to DEBUG, so that only one line of INFO is
  logged per deleted job
- Show size of the queue and the check interval in the "job deletion
  queue is full" log message.
2023-03-21 12:16:04 +01:00
Sybren A. Stüvel
0fb252083b Job deletion: when stopping to queue up more deletions, log how many remain
When queueing up jobs to be deleted, log how many deletions remain to be
picked up later. Once a minute the database is checked for such deletion
requests, so the next batch will be scheduled in a minute.
2023-03-21 10:45:34 +01:00
Sybren A. Stüvel
b25e63f557 Job deletion: avoid looping over entire list of jobs when queue full
When there are more job deletion requests than can be kept in the queue,
just stop trying to queue them.
2023-03-21 10:44:28 +01:00
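The behaviour of b25e63f557 and 0fb252083b together can be sketched with a non-blocking channel send, assuming the deletion queue is a buffered channel of job UUIDs. That is an assumption, as is the function name.

```go
package main

// queueDeletions tries to queue each job UUID without blocking. It stops
// at the first job that does not fit in the queue, instead of looping over
// the rest of the list, and reports how many jobs were queued and how many
// remain for a later round. Hypothetical helper, for illustration only.
func queueDeletions(queue chan<- string, jobUUIDs []string) (queued, remaining int) {
	for i, uuid := range jobUUIDs {
		select {
		case queue <- uuid:
			// Queued; continue with the next job.
		default:
			// Queue is full; the remaining jobs are picked up later.
			return i, len(jobUUIDs) - i
		}
	}
	return len(jobUUIDs), 0
}
```

The `remaining` count is what would end up in the "job deletion queue is full" log message.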
MKRelax
7963ab5efd Manager: fixed copy/paste typo in CheckBlenderExePath() (#104192)
The `toCheck` variable in `CheckBlenderExePath()` was initialized to `CheckSharedStoragePathJSONBody`, but should be `CheckBlenderExePathJSONBody`.

Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104192
2023-03-06 12:55:53 +01:00
Sybren A. Stüvel
49d8c4e6fd Add "rc" as possible release cycle value
"rc" stands for "release candidate", which will trigger the same versioning
display as an actual release (i.e. just report the version, without the
Git hash info).
2023-02-21 11:35:55 +01:00
Sybren A. Stüvel
22f56890c1 Small fix for sleep schedule of soft-deleted workers
There were `ErrWorkerNotFound` errors in different packages, which got
mixed up. Now there's only one.
2023-02-09 11:46:29 +01:00
Sybren A. Stüvel
adfc2652b5 Add internal/tools/tools.go to mock-import code generator packages
`internal/tools/tools.go` is a bit of a hacky workaround for a limitation of
`go mod tidy`. It will never be built, but `go mod tidy` will see the
packages imported here as dependencies of the Flamenco project, and not
remove them from `go.mod`.

This is meant for build-time requirements that are otherwise never
imported as Go libraries, like our OpenAPI code generator.
2023-02-09 11:25:47 +01:00
Sybren A. Stüvel
426b2aab4d Gracefully handle sleep schedules of deleted workers
Workers can be soft-deleted, which means that they stay in the database.
As such, foreign key constraints `ON DELETE CASCADE` do not trigger, and
thus their sleep schedule can still be active. This is now detected and
handled gracefully.
2023-02-09 11:18:38 +01:00
Sybren A. Stüvel
fe0899fd55 shaman-checkout-id-setter: Don't update job's "updated at" timestamp
The Shaman Checkout ID setter shouldn't update a job's "updated at"
timestamp. Its goal is to fake that the job was submitted with a new
enough Flamenco version, and thus should not touch the timestamps.
2023-02-07 16:24:23 +01:00
Sybren A. Stüvel
bf464055e0 Avoid double logging of job storage directory removal 2023-02-07 15:22:52 +01:00
Sybren A. Stüvel
01a85d86cb Add "Shaman Checkout ID setter" command
This is a command that can be run to retroactively set the Shaman
Checkout ID of jobs, allowing the job deletion to also remove the job's
Shaman checkout directory.

This is highly experimental, and not built by default or shipped with
Flamenco releases. It's only been used once at Blender Animation Studio
to help cleaning up. Run at your own risk. Make backups first.
2023-02-07 15:07:41 +01:00
Sybren A. Stüvel
aa1c6b8ff3 Close the database when Flamenco shuts down
This prevents SQLite journal files from lingering around.
2023-02-07 15:05:49 +01:00
Sybren A. Stüvel
ef3cab9745 Webapp: handle job deletions properly
- Add a little confirmation overlay before deleting a job. This overlay
  also shows information about whether the Shaman checkout directory
  will be deleted or not.
- Send job updates to the web frontend when jobs are marked for
  deletion, and when they are actually deleted.
- Respond to those updates, and handle some corner cases where job info
  is missing (because it just got deleted).

This closes T99401.
2023-02-03 16:59:15 +01:00
Sybren A. Stüvel
c21cc7d316 OAPI: regenerate code 2023-02-03 16:44:55 +01:00
Sybren A. Stüvel
bf0906eb95 Manager: avoid logging an error when requesting a non-existent job
This is expected to happen every once in a while, especially now that
Flamenco supports job deletion. It's not something to log at error level.
2023-02-03 16:37:55 +01:00
Sybren A. Stüvel
2927e82802 Swagger UI: remove the "try it out" buttons
Remove the "try it out" buttons, and have the "Execute" buttons visible
immediately.
2023-02-03 13:40:26 +01:00
Sybren A. Stüvel
3bedc2c87a Swagger UI: add "Back to Flamenco" link
Add a link from the API section to the main Flamenco web interface.
2023-02-03 13:40:22 +01:00
Sybren A. Stüvel
a97a4e6e67 Manager: show delete-requested jobs in the web interface
Show jobs that have been marked for deletion with a red strike-through
line in the jobs table, and show the deletion-request timestamp in the
job details.
2023-01-08 13:52:27 +01:00
Sybren A. Stüvel
416138fd70 Manager: add test for QueryJobs() API function
No functional changes.
2023-01-08 13:15:30 +01:00
Sybren A. Stüvel
791d877ff1 Manager: implement API endpoint for deleting jobs
Implement the `deleteJob` API endpoint. Calling this endpoint will mark
the job as "deletion requested", after which it's queued for actual
deletion. This makes the API response fast, even when there is a lot of
work to do in the background.

A new background service "job deleter" keeps track of the queue of such
jobs, and performs the actual deletion. It removes:

- Shaman checkout for the job (but see below)
- Manager-local files of the job (task logs, last-rendered images)
- The job itself

The removal is done in the above order, so the job is only removed from the
database if the rest of the removal was successful.

Shaman checkouts are only removed if the job was submitted with Flamenco
version 3.2. Earlier versions did not record enough information to reliably
do this.
2023-01-04 01:18:21 +01:00
Sybren A. Stüvel
2e5f5ffadd OAPI: regenerate code 2023-01-04 01:18:21 +01:00
Sybren A. Stüvel
f413a40f4e Store Shaman checkout ID when submitting a job
If Shaman is used to submit the job files, store the job's checkout ID
(i.e. the path relative to the checkout root) in the database. This will
make it possible in the future to remove the Shaman checkout along with
the job itself.
2023-01-04 01:18:21 +01:00
Sybren A. Stüvel
6aea02c904 Fix T103516: max image size for previews is set too low
Manager had a limit of 10 MB, but the Worker can produce images that are
larger than that (even after down-scaling the image). I've bumped the
limit to 25 MB, which should be enough (it's 2x the bug reporter's file
size).
2023-01-03 16:28:28 +01:00
Sybren A. Stüvel
2df5a1089a Fix T102707: Flamenco Manager crash on frame range without hyphen
The "invalid range" error reporting had an infinite loop, causing a crash.
This is now resolved.
2023-01-03 16:16:44 +01:00
Sybren A. Stüvel
9bda21648e Manager: add timeout when fetching job
Add a timeout when fetching a job from the persistence layer.

It's my intention to add more timeouts, so this also introduces some code
to make it easier to test that a context has a deadline set.
2022-12-14 13:02:59 +01:00
Sybren A. Stüvel
c16c1f4b15 Refactor: deduplicate job fetching code
Deduplicate API implementation code to fetch a job from the persistence
service.

Almost no functional changes. Checking that the requested job UUID is
actually a valid UUID is now consistently done on all fetches. This is
not a functional change in normal Flamenco operations, where only valid
UUIDs are used anyway.
2022-12-14 13:02:59 +01:00
Sybren A. Stüvel
15e3745820 Manager: SQLite WAL journal + NORMAL sync mode
Run `PRAGMA journal_mode = WAL` and `PRAGMA synchronous = normal` when
connecting to the SQLite database. This enables the write-ahead-log journal
mode, which makes it safe to enable "normal" synchronisation (instead of
the default "full" synchronisation).
2022-11-24 17:18:06 +01:00