flamenco

Author	SHA1	Message	Date
Sybren A. Stüvel	15e3745820	Manager: SQLite WAL journal + NORMAL sync mode Run `PRAGMA journal_mode = WAL` and `PRAGMA synchronous = normal` when connecting to the SQLite database. This enables the write-ahead-log journal mode, which makes it safe to enable "normal" synchronisation (instead of the default "full" synchronisation).	2022-11-24 17:18:06 +01:00
Sybren A. Stüvel	85d53de1f9	Manager: implement API endpoint for changing job priority The priority of an existing job can now be changed. It will be taken into account when assigning tasks to workers, but it will not reassign tasks that are already active.	2022-09-30 16:30:03 +02:00
Sybren A. Stüvel	59655ea770	Manager: fix error in sleep scheduler when shutting down When the Manager was shutting down while the sleep scheduler was running, it could cause a null pointer dereference. This is now doubly solved: - `worker.Identifier()` is now nil-safe, as in, `worker` can be `nil` and it will still return a sensible string. - failure to apply the sleep schedule due to the context closing is not logged as error any more.	2022-09-27 12:27:18 +02:00
Sybren A. Stüvel	2a345a3d2c	API for deleting workers Workers can now be soft-deleted. Tasks assigned to the worker will remain associated with that Worker. Active tasks will be re-queued so other workers can pick them up.	2022-08-11 16:59:53 -07:00
Sybren A. Stüvel	1469345f3a	Manager: sort blocklist by worker name	2022-08-01 18:54:28 +02:00
Sybren A. Stüvel	be1ddaa4eb	Manager test: reduce timeout to practical value The timeout was increased to aid debugging, but shouldn't have been committed.	2022-07-29 09:59:54 +02:00
Sybren A. Stüvel	736ca103c3	Manager: show current/last task in worker details The Task details component already linked to the Worker it was assigned to last, and now the Worker links back to the task. There's only one task shown in the Worker details. If the Worker is actively working on a task, that one's shown. Otherwise it's the last-updated task that was assigned to the worker.	2022-07-26 10:36:02 +02:00
Sybren A. Stüvel	ab8ecc24cc	Cleanup: Add missing license specifiers Add license specifiers to Go files that were missing them: `// SPDX-License-Identifier: GPL-3.0-or-later` No functional changes.	2022-07-25 16:08:07 +02:00
Sybren A. Stüvel	83467e4c60	Sleep schedule: store 'next check' timestamp in UTC SQLite doesn't parse the timezone info, so timestamps should always be in UTC.	2022-07-18 19:30:17 +02:00
Sybren A. Stüvel	658a3d7a85	Worker Timeout: subject all but offline/error workers to timeout checks Workers that are in `starting`, `asleep`, or `testing` state should also be subject to the timeout check, not just workers in `awake` state.	2022-07-18 11:30:39 +02:00
Sybren A. Stüvel	d7b164133a	Sleep Scheduler implementation for the Manager The Manager now has a sleep scheduler for Workers. The API and background service work, but there is no web interface yet. Manifest Task: T99397	2022-07-17 17:27:32 +02:00
Sybren A. Stüvel	627996525e	Manager: implement operations for getting & setting worker sleep schedule This is just the API, no web interface yet. Manifest Task: T99397	2022-07-16 16:00:25 +02:00
Sybren A. Stüvel	859a261b05	Manager: on deletion of a worker, do not cascade to deletion of its tasks Fix an issue where deleting a Worker would also delete the tasks it was assigned to.	2022-07-15 17:00:25 +02:00
Sybren A. Stüvel	1fceae3604	Manager: more efficient database queries Be more selective in what's saved to the database to speed some things up. Most importantly, this avoids saving the entire job when a task status is updated or a task is assigned.	2022-07-15 15:08:00 +02:00
Sybren A. Stüvel	1055aabee2	Manager: optimise db.SaveActivity() query Use an explicit `Select()` GORM call to avoid saving related objects.	2022-07-15 15:08:00 +02:00
Sybren A. Stüvel	6e28271c93	Manager: prevent saving related job & worker when "touching" task	2022-07-15 15:08:00 +02:00
Sybren A. Stüvel	6b5f9317cb	Manager: clear job's blocklist when requeueing the job Requeueing a job means that the issues that caused workers to get blocked might be resolved, so it should be run with a clean slate.	2022-07-14 11:03:11 +02:00
Sybren A. Stüvel	d25151184d	Add a "Last Rendered" view Add a "Last Rendered" view to the webapp. The Manager now stores (in the database) which job was the last recipient of a rendered image, and serves that to the appropriate OpenAPI endpoint. A new SocketIO subscription + accompanying room makes it possible for the web interface to receive all rendered images (if they survive the queue, which discards images when it gets too full).	2022-07-01 12:34:40 +02:00
Sybren A. Stüvel	64512c81ba	Manager: implement OAPI operations to fetch blocklist & delete items	2022-06-27 11:32:35 +02:00
Sybren A. Stüvel	87f1959e26	Manager: use blocklist to actually block workers Actually use the blocklist in the task scheduler to block workers from doing blocked job types.	2022-06-21 17:59:20 +02:00
Sybren A. Stüvel	64c8fa851d	Show assigned worker in task details Show the worker assigned to the task in the task details view, as link to the worker itself.	2022-06-17 16:36:55 +02:00
Sybren A. Stüvel	046853932d	Manager: re-queue previously failed tasks of worker when blocklisting When a Worker is blocked from a job, re-queue its previously failed tasks so that other workers can give them a try.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	fd31a85bcd	Manager: add blocking of workers when they fail certain tasks too much When a worker fails too many tasks, of the same task type, on the same job, it'll get blocked from doing those.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	81f81d0e0a	Show task failure list in the web frontend Show the task failure list in the web frontend's `TaskDetails` component.	2022-06-17 11:37:56 +02:00
Sybren A. Stüvel	0b5140fc5f	Manager: clear task failure list on requeueing of jobs & tasks When a job or task gets requeued from the web interface, its task failure lists (i.e. the list of workers that previously failed this task) will be cleared. This clearing doesn't happen in other situations, e.g. when a worker signs off and its task gets requeued, the task's failure list will remain as-is.	2022-06-17 11:37:28 +02:00
Sybren A. Stüvel	e9fca8d993	Cleanup: typo fix in comment	2022-06-17 11:03:43 +02:00
Sybren A. Stüvel	8764f8f7c1	Manager: task scheduler, don't schedule tasks the worker failed before When a worker asks for a task to perform, don't give it a task that it failed before.	2022-06-16 16:02:28 +02:00
Sybren A. Stüvel	7d7c2b1bd6	Cleanup: blacklist → blocklist Change "blacklist" to "blocklist", because that makes people happier. No functional changes.	2022-06-16 10:36:36 +02:00
Sybren A. Stüvel	c5debdeb70	Manager: add 'task failure list' to record workers failing tasks The persistence layer can now store which worker failed which task, as preparation for a blocklisting system. Such a system should be able to determine whether there are still any workers left to do the work.	2022-06-13 18:41:30 +02:00
Sybren A. Stüvel	e35911d106	Manager: add ability to delete jobs This is needed for a future unit test, and exposed the fact that SQLite didn't enforce foreign key constraints (and thus also didn't handle on-delete-cascade attributes). This has been fixed in the previous commit.	2022-06-13 18:41:19 +02:00
Sybren A. Stüvel	e5d0e987e1	Manager: enforce DB foreign key checks at startup SQLite disables foreign key checks by default, so Flamenco has to enable them explicitly.	2022-06-13 18:41:19 +02:00
Sybren A. Stüvel	6ec493d944	Manager, more efficiently create tasks When creating tasks the inter-task dependencies are saved as a 2nd pass, by updating the tasks in the database. This now only saves those dependencies, and no longer saves the entire task again.	2022-06-13 18:40:42 +02:00
Sybren A. Stüvel	02bc03ae2b	Manager: replace `gorm.Model` with our own `persistence.Model` struct `persistence.Model` contains the common database fields for most model structs. It is a copy of `gorm.Model`, but without the `DeletedAt` field (which triggers Gorm's soft deletion). Soft deletion is not used by Flamenco. If it ever becomes necessary to support soft-deletion, see https://gorm.io/docs/delete.html#Soft-Delete	2022-06-13 18:40:42 +02:00
Sybren A. Stüvel	6fc936d0a6	Revert accidental debug code Revert change in rF01c45afc20854918d1f18e6859b4154499d500b6 that made unit tests use an on-disk database.	2022-06-13 18:40:25 +02:00
Sybren A. Stüvel	5dac3c2dc0	Manager: mark workers as 'seen' when they send updates Update the 'last seen at' timestamp of workers when they: - sign on - sign off - get a task assigned - send a task update - check whether they can keep running their task Note that this commit is necessary to not have the workers time out immediately ;-)	2022-06-13 12:47:07 +02:00
Sybren A. Stüvel	7d5aae25b5	Manager: add timeout checks for workers	2022-06-13 12:33:22 +02:00
Sybren A. Stüvel	67562856d3	Manager: let Gorm create an index on `Task.LastTouchedAt` It's used in timeout queries, and there could be tens or hundreds of thousands of tasks in the database.	2022-06-13 12:33:05 +02:00
Sybren A. Stüvel	01c45afc20	Manager: explicitly store timestamps as UTC SQLite doesn't handle timezones by default, when you just use something like `date1 < date2`, for example. This makes GORM explicitly use UTC timestamps for the `CreatedAt`, `UpdatedAt`, and `DeletedAt` fields. Our own code should also use UTC when saving timestamps. That way all datetimes in the database are in the same timezone, and can be compared naively.	2022-06-13 12:10:11 +02:00
Sybren A. Stüvel	09902d201c	Manager: fix task timeout check logging of assigned workers The task's worker wasn't fetched from the database, always causing "unknown worker" messages in the task log.	2022-06-10 14:52:03 +02:00
Sybren A. Stüvel	d90a8b987d	Manager: Task Timeout Checker Tasks that are in state `active` but haven't been 'touched' by a Worker for 10 minutes or longer will transition to state `failed`. In the future, it might be better to move the decision about which state is suitable to the Task State Machine service, so that it can be smarter and take the history of the task into account. Going to `soft-failed` first might be a nice touch.	2022-06-10 14:32:02 +02:00
Sybren A. Stüvel	295891a17a	Manager: ensure Gorm-generated timestamps are in UTC SQLite should store all timestamps in UTC, as the database is woefully unaware of timezones and will compare lexicographically.	2022-06-10 14:31:53 +02:00
Sybren A. Stüvel	04dd479248	Manager: protect task log writing with mutex A per-task mutex is used to protect the writing of task logs, so that multiple goroutines can safely write to the same task log.	2022-06-09 14:44:54 +02:00
Sybren A. Stüvel	b4d2fc4231	Manager: keep track of when a Worker last worked on a task This will be used for keeping track of stuck tasks.	2022-06-03 16:33:50 +02:00
Sybren A. Stüvel	6cf82e5d43	Manager: cleanup, refactor Worker state change request persistence code Move the setting & clearing of worker state change requests into separate functions. No functional changes.	2022-06-02 16:36:06 +02:00
Sybren A. Stüvel	f97f0a34c3	Manager: implement worker status change requests Implement the OpenAPI `RequestWorkerStatusChange` operation, and handle these changes in the web interface.	2022-05-31 17:22:03 +02:00
Sybren A. Stüvel	1496736f7a	Manager: wrap Worker fetching errors Do the same wrapping as for task/job errors, but then for workers.	2022-05-31 11:18:57 +02:00
Sybren A. Stüvel	19db947eb4	Manager: remove `Worker.LastActivity` This removes the field both from the OpenAPI interface and the database.	2022-05-31 10:46:27 +02:00
Sybren A. Stüvel	08676f48f4	Manager: implement `fetchWorkers` OpenAPI operation	2022-05-30 18:52:02 +02:00
Sybren A. Stüvel	f77b11d85e	Manager: add a small wrapper around Google's UUID library Add a small wrapper around github.com/google/uuid. That way it's clearer which functionality is used by Flamenco, doesn't link most of the code to any specific UUID library, and allows a bit of customisation. The only customisation now is that Flamenco is a bit stricter in the formats it accepts; only the `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx` format is accepted. This makes things a little bit stricter, with the advantage that we don't need to do any normalisation of received UUID strings.	2022-05-20 15:35:51 +02:00
Sybren A. Stüvel	7b664475ca	Rename job status `requeued` to `requeueing`	2022-05-19 17:25:53 +02:00
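
Several of the commits above (83467e4c60, 01c45afc20, 295891a17a) store timestamps in UTC because a plain `date1 < date2` comparison in SQLite works on the stored text. The following is a minimal sketch of that effect using only the Go standard library, not Flamenco's actual code. The names `storedFormat`, `asStored`, `storedBefore`, and `exampleTimes` are hypothetical, and the text layout is an assumption chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// storedFormat is a hypothetical text layout for a timestamp column.
// It is an assumption for this sketch, not the layout Flamenco or GORM uses.
const storedFormat = "2006-01-02 15:04:05-07:00"

// asStored renders a timestamp the way it might end up as text in the database.
func asStored(t time.Time) string {
	return t.Format(storedFormat)
}

// storedBefore compares two timestamps the way a text-based `a < b` would:
// byte by byte on the stored strings, ignoring the timezone offset's meaning.
func storedBefore(a, b time.Time) bool {
	return asStored(a) < asStored(b)
}

// exampleTimes returns two instants in different timezones.
// earlier is 17:30 UTC, later is 18:30 UTC on the same day.
func exampleTimes() (earlier, later time.Time) {
	amsterdam := time.FixedZone("CEST", 2*60*60)
	pacific := time.FixedZone("PDT", -7*60*60)
	earlier = time.Date(2022, 7, 18, 19, 30, 0, 0, amsterdam)
	later = time.Date(2022, 7, 18, 11, 30, 0, 0, pacific)
	return earlier, later
}

func main() {
	earlier, later := exampleTimes()

	// Comparing the actual instants gives the right answer.
	fmt.Println("instant comparison:   ", earlier.Before(later))

	// Comparing local-time strings gets it wrong: "19:30" sorts after "11:30".
	fmt.Println("text, local timezones:", storedBefore(earlier, later))

	// After converting both to UTC, text order matches chronological order.
	fmt.Println("text, both in UTC:    ", storedBefore(earlier.UTC(), later.UTC()))
}
```

With every timestamp normalised to UTC before it is saved, the text ordering and the chronological ordering agree, which is what timeout queries such as the one on `Task.LastTouchedAt` would rely on.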

126 Commits