flamenco

Author	SHA1	Message	Date
Sybren A. Stüvel	04dd479248	Manager: protect task log writing with mutex A per-task mutex is used to protect the writing of task logs, so that multiple goroutines can safely write to the same task log.	2022-06-09 14:44:54 +02:00
Sybren A. Stüvel	b4d2fc4231	Manager: keep track of when a Worker last worked on a task This will be used for keeping track of stuck tasks.	2022-06-03 16:33:50 +02:00
Sybren A. Stüvel	6cf82e5d43	Manager: cleanup, refactor Worker state change request persistence code Move the setting & clearing of worker state change requests into separate functions. No functional changes.	2022-06-02 16:36:06 +02:00
Sybren A. Stüvel	f97f0a34c3	Manager: implement worker status change requests Implement the OpenAPI `RequestWorkerStatusChange` operation, and handle these changes in the web interface.	2022-05-31 17:22:03 +02:00
Sybren A. Stüvel	1496736f7a	Manager: wrap Worker fetching errors Do the same wrapping as for task/job errors, but then for workers.	2022-05-31 11:18:57 +02:00
Sybren A. Stüvel	19db947eb4	Manager: remove `Worker.LastActivity` This removes the field both from the OpenAPI interface and the database.	2022-05-31 10:46:27 +02:00
Sybren A. Stüvel	08676f48f4	Manager: implement `fetchWorkers` OpenAPI operation	2022-05-30 18:52:02 +02:00
Sybren A. Stüvel	f77b11d85e	Manager: add a small wrapper around Google's UUID library Add a small wrapper around github.com/google/uuid. That way it's clearer which functionality is used by Flamenco, doesn't link most of the code to any specific UUID library, and allows a bit of customisation. The only customisation now is that Flamenco is a bit stricter in the formats it accepts; only the `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx` is accepted. This makes things a little bit stricter, with the advantage that we don't need to do any normalisation of received UUID strings.	2022-05-20 15:35:51 +02:00
Sybren A. Stüvel	7b664475ca	Rename job status `requeued` to `requeueing`	2022-05-19 17:25:53 +02:00
Sybren A. Stüvel	530520b1c7	Implement mass updating of tasks when `JobUpdate.refresh_tasks = true` Send & handle `JobUpdate.refresh_tasks = true` when many tasks are updated simultaneously. This applies to things like cancelling & requeueing an entire job. This partially rolls back 67bf77de13d99b1bc5d7344951068822c4fadd88, as it was too slow when 1000+ tasks were being updated all at once.	2022-05-17 14:48:50 +02:00
Sybren A. Stüvel	d35ca9d98f	Manager: limit database connections Limit the database connection pool to only a single connection. I hope that this will solve the intermittent `SQLITE_BUSY` errors I've been seeing.	2022-05-12 13:58:15 +02:00
Sybren A. Stüvel	3d606a3fa0	Manager: task scheduler, fix handling of worker assignment of tasks Improve how the task scheduler deals with tasks that already have a worker assigned to them: - When a Worker asks for a task, and there is already an active task assigned to it, always return that task. - Otherwise, never allow scheduling of active tasks, as those are already being run by another worker. If this is not the case, their status should change to queued/failed, instead of handling the situation in the task scheduler. - Apart from the assigned-and-active case above, ignore task's worker ID when scheduling tasks. If the status is 'queued' or 'soft-failed', the task's worker ID just indicates who ran the task last.	2022-05-12 13:52:16 +02:00
Sybren A. Stüvel	d3e2638f84	Cleanup: rename `uri` to `dsn` "DSN" (Data Source Name) is used to indicate which database to open, and was intermixed with "URI". This is now consistent. No functional changes.	2022-05-12 11:08:54 +02:00
Sybren A. Stüvel	d673da7a0c	Manager: check for stuck jobs at startup Check for jobs in 'cancel-requested' or 'requeued' statuses, and ensure they transition to the right status. This happens at startup, before even starting the web interface, so that a consistent state is presented.	2022-05-06 16:07:27 +02:00
Sybren A. Stüvel	98da20f1a9	Manager: vacuum the database at startup	2022-05-06 14:35:34 +02:00
Sybren A. Stüvel	ba34652cd1	Implement task status changes from web interface This also reworks some of the logic due to the recently-removed `cancel-requested` task status.	2022-05-05 16:44:09 +02:00
Sybren A. Stüvel	67bf77de13	Manager: rework mass updates to task statuses When the job status changes, it impacts the task statuses as well. These status changes are now no longer done with a single database query, but instead each affected task is fetched, changed, and saved. This unifies the regular & mass updates to the tasks, and causes the resulting task changes to be broadcast to SocketIO clients.	2022-05-03 16:13:44 +02:00
Sybren A. Stüvel	b3e1d1c6de	Cleanup: manager, typo fix	2022-05-03 13:05:30 +02:00
Sybren A. Stüvel	bb68488c5e	Cleanup: Manager, add bit of documentation	2022-05-03 10:39:44 +02:00
Sybren A. Stüvel	629c073ed7	Manager: fix query for job tasks	2022-04-29 12:26:53 +02:00
Sybren A. Stüvel	992fc38604	OAPI: add endpoint for fetching the tasks of a job Add `fetchJobTasks` operation to the Jobs API. This returns a summary of each of the job's tasks, suitable for display in a task list view. The actually used fields may need tweaking once we actually have a task list view, but at least the functionality is there.	2022-04-22 12:52:57 +02:00
Sybren A. Stüvel	d79fde17f3	Manager: keep track of the reason of job status changes To prepare for job status changes being requestable from the API, store the reason for any status change on the job itself. Not yet part of the API, just on the persistence layer.	2022-04-21 12:32:07 +02:00
Sybren A. Stüvel	c3b694ab2a	Manager: wrap job/task errors in persistence layer Spare users of the persistence layer from having to test against Gorm errors, by wrapping job/task errors in a new `PersistenceError` struct. Instead of testing for `gorm.ErrRecordNotFound`, code can now test for `persistence.ErrJobNotFound` or `persistence.ErrTaskNotFound`.	2022-04-21 11:54:59 +02:00
Sybren A. Stüvel	1960b668aa	Cleanup: remove unused code	2022-04-08 14:47:07 +02:00
Sybren A. Stüvel	930d7497d7	OAPI: Better 'SQLITE_BUSY' error handling SQLite can return `SQLITE_BUSY` errors when it's doing too many things at the same time. This is now improved a bit by setting a 5-second timeout, during which the SQLite driver will wait for the database to become available. If that doesn't happen, Flamenco Manager will return a `503 Service Unavailable` response so that the client knows to back off a little.	2022-04-08 12:02:30 +02:00
Sybren A. Stüvel	781f1d936a	OAPI: add jobs query endpoint	2022-04-04 18:53:19 +02:00
Sybren A. Stüvel	8d52a03648	Manager: fix bug in task scheduler The task scheduler was handing out tasks for which any dependency (instead of all dependencies) was completed.	2022-03-17 13:07:20 +01:00
Sybren A. Stüvel	c5a2a23f6e	Manager: reduce log levels	2022-03-17 11:47:53 +01:00
Sybren A. Stüvel	22ea599554	Manager: periodically run the SQL `VACUUM` command	2022-03-17 11:03:29 +01:00
Sybren A. Stüvel	9f5e4cc0cc	License: license all code under "GPL-3.0-or-later" The add-on code was copy-pasted from other addons and used the GPL v2 license, whereas by accident the LICENSE text file had the GNU "Affero" GPL license v3 (instead of regular GPL v3). This is now all streamlined, and all code is licensed as "GPL v3 or later". Furthermore, the code comments just show a SPDX License Identifier instead of an entire license block.	2022-03-07 15:26:46 +01:00
Sybren A. Stüvel	dbb9c71df8	Tests: more unified way to do database tests	2022-03-04 12:33:45 +01:00
Sybren A. Stüvel	f497ac8536	Cleanup: add and remove some comments	2022-03-04 12:19:19 +01:00
Sybren A. Stüvel	3bfd5a339f	Fix database tests getting interrupted The root cause was a 2nd `context.Context()` that was used in `constructTestJob()`, which cancelled when that function returned. The cancellation of the context caused an interrupt in the SQLite driver, which got into a race condition and could cause an interrupt on a subsequent database query.	2022-03-04 12:19:19 +01:00
Sybren A. Stüvel	de150567b0	Manager: avoid double error message	2022-03-04 11:37:29 +01:00
Sybren A. Stüvel	cd2fe8170e	Errors: remove "error" prefix from message Instead of returning an error "error doing X", just return "doing X". The fact that it's returned as an error object says enough about that it's an error. This also makes it easier to chain error messages, without seeing the word "error" in every part of the chain.	2022-03-04 11:30:31 +01:00
Sybren A. Stüvel	b9609f8866	Cleanup: remove unused code	2022-03-03 13:52:57 +01:00
Sybren A. Stüvel	641ed7ace9	Manager: make Gorm use Zerolog for logging A wrapper for Zerolog implements the Gorm logger interface. This gives us coloured output on Windows, and uniform-looking logs in production.	2022-03-03 13:52:50 +01:00
Sybren A. Stüvel	8824489980	Manager: use in-memory SQLite database for testing The on-disk database that was used before caused issues with tests running in parallel. Not only is there the theoretical issue of tests seeing each other's data (this didn't happen, but could), there was also the practical issue of one test running while the other tried to erase the database file (which fails on Windows due to file locking).	2022-03-03 13:51:55 +01:00
Sybren A. Stüvel	9b9c6bffff	Replace self-hacked SQLite Gorm driver with 3rd party one The new Gorm driver is made by the creators of the pure-Go SQLite library we were already using.	2022-03-03 13:48:14 +01:00
Sybren A. Stüvel	2b04623e00	Manager: fix DB transaction isolation issue in task scheduler The created transaction wasn't actually used for the should-be-in-the-transaction queries. That's now resolved.	2022-03-03 13:46:27 +01:00
Sybren A. Stüvel	9643bf768e	Manager: Fix DB migration error of not-null columns Where the PostgreSQL DB migration code could handle `NOT NULL` columns just fine, SQLite has less table-altering functionality. As a result, migrations have to copy entire database tables, which doesn't play well with not-nullable columns.	2022-03-03 12:10:13 +01:00
Sybren A. Stüvel	47e36c927c	Change package URL to the blender.org repository	2022-03-01 20:45:09 +01:00
Sybren A. Stüvel	e70a44a146	Manager: switch from PostgreSQL to SQLite This includes a modified copy of the Gorm SQLite backend, adjusted to use https://modernc.org/sqlite instead.	2022-03-01 18:50:31 +01:00
Sybren A. Stüvel	0235ffcb4a	Manager: avoid "no record found" error in task scheduler It's fine when there is no task for a worker, so having Gorm log an error was just causing noise.	2022-03-01 11:52:28 +01:00
Sybren A. Stüvel	fab988295d	Manager: remove task scheduler SQL debug logs	2022-02-28 12:07:23 +01:00
Sybren A. Stüvel	7689a988b1	Manager: re-queue tasks of worker when signing off	2022-02-28 12:06:50 +01:00
Sybren A. Stüvel	32af1ffaef	Manager: actually pass context to Gorm queries	2022-02-28 11:53:31 +01:00
Sybren A. Stüvel	3d854078ba	Manager: integrate task state machine into API implementation	2022-02-25 16:30:27 +01:00
Sybren A. Stüvel	9a5bbb4131	Manager: implement persistence layer interface for task status machine Implement the functions used by the task status machine in the DB persistence layer.	2022-02-25 14:34:29 +01:00
Sybren A. Stüvel	7420177209	Manager: use api.JobStatus in persistence layer as well	2022-02-24 11:54:35 +01:00

85 Commits