flamenco

Author	SHA1	Message	Date
Sybren A. Stüvel	d25151184d	Add a "Last Rendered" view Add a "Last Rendered" view to the webapp. The Manager now stores (in the database) which job was the last recipient of a rendered image, and serves that to the appropriate OpenAPI endpoint. A new SocketIO subscription + accompanying room makes it possible for the web interface to receive all rendered images (if they survive the queue, which discards images when it gets too full).	2022-07-01 12:34:40 +02:00
Sybren A. Stüvel	2457a63518	Manager: Show "nothing rendered yet" image in job details Show a "nothing rendered yet" image in the job details when there is no last-rendered image yet.	2022-06-30 19:20:19 +02:00
Sybren A. Stüvel	0fc5ba0bc6	Manager: broadcast last-rendered image info via SocketIO After processing an image in the "last-rendered" processor, a SocketIO object is sent to clients to indicate the last-rendered image needs to be (re)loaded. This also moves the previously existing "done callback" from a single function to a per-image callback, so that it can be called with the right information in there, and only when that particular image is actually done processing. The notification message sent via SocketIO also contains the necessary info to render the image, so that the web client doesn't have to call the `fetchJobLastRenderedInfo` operation.	2022-06-30 18:36:24 +02:00
Sybren A. Stüvel	6efd67b05c	Manager: implement `FetchJobLastRenderedInfo()` API operation Allow querying for the URL & available versions of a job's last-rendered image.	2022-06-28 17:08:00 +02:00
Sybren A. Stüvel	64512c81ba	Manager: implement OAPI operations to fetch blocklist & delete items	2022-06-27 11:32:35 +02:00
Sybren A. Stüvel	e687c95e5d	Manager: add "last rendered image" processing pipeline Add a handler for the OpenAPI `taskOutputProduced` operation, and an image thumbnailing goroutine. The queue of images to process + the function to handle queued images is managed by `last_rendered.LastRenderedProcessor`. This queue currently simply allows 3 requests; this should be improved such that it keeps track of the job IDs as well, as with the current approach a spammy job can starve the updates from a more calm job.	2022-06-24 16:51:11 +02:00
Sybren A. Stüvel	b53cd67eb4	Cleanup: rename `assertResponseEmpty()` → `assertResponseNoContent()` The function tests the HTTP response is `204 No Content`, and now the name reflects that better. No functional changes.	2022-06-24 16:09:46 +02:00
Sybren A. Stüvel	2d05e1c773	Fix unit test for recent scheduler change Fix unit test for rF1586c37b.	2022-06-20 16:05:36 +02:00
Sybren A. Stüvel	1586c37b32	Manager: mark task as active as soon as it is assigned to a worker Move the task to 'active' status so that it won't be assigned to another worker. This also enables the task timeout monitoring.	2022-06-20 13:00:49 +02:00
Sybren A. Stüvel	a2b667c043	Manager: log blocklist threshold	2022-06-17 17:15:23 +02:00
Sybren A. Stüvel	13bdb0ed73	Manager: remove outdated TODO	2022-06-17 17:15:13 +02:00
Sybren A. Stüvel	a368230afa	Manager: fix race condition in logging of worker name/UUID Instead of updating the logger in the context, just store a new logger in a new sub-context.	2022-06-17 17:13:32 +02:00
Sybren A. Stüvel	cdb7789f08	Refactor: Manager, move test code Move code that covers `worker_task_updates.go` into `worker_task_updates_test.go`. No functional changes.	2022-06-17 15:51:15 +02:00
Sybren A. Stüvel	046853932d	Manager: re-queue previously failed tasks of worker when blocklisting When a Worker is blocked from a job, re-queue its previously failed tasks so that other workers can give them a try.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	b95bed1f96	Refactor: rename `RequeueTasksOfWorker` to `RequeueActiveTasksOfWorker` Soon there will be another function to requeue tasks of workers by other criteria, so being clear in the name helps. No functional changes.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	fd31a85bcd	Manager: add blocking of workers when they fail certain tasks too much When a worker fails too many tasks, of the same task type, on the same job, it'll get blocked from doing those.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	56abc825a6	Refactor: Manager, refactor handling of task failures Split the handling of soft and hard failures into separate functions. No functional changes intended.	2022-06-17 15:01:52 +02:00
Sybren A. Stüvel	6feee74c54	Cleanup: Manager, move worker task update handling code into its own file Move the code related to task updates from workers to `worker_task_updates.go`. It's going to get more complex with the blocklisting in there; this prepares for that. No functional changes.	2022-06-17 11:46:07 +02:00
Sybren A. Stüvel	81f81d0e0a	Show task failure list in the web frontend Show the task failure list in the web frontend's `TaskDetails` component.	2022-06-17 11:37:56 +02:00
Sybren A. Stüvel	0b5140fc5f	Manager: clear task failure list on requeueing of jobs & tasks When a job or task gets requeued from the web interface, its task failure lists (i.e. the list of workers that previously failed this task) will be cleared. This clearing doesn't happen in other situations, e.g. when a worker signs off and its task gets requeued, the task's failure list will remain as-is.	2022-06-17 11:37:28 +02:00
Sybren A. Stüvel	9ab41984ac	Adjust Go code for Nickname -> Name change This fixes a bug where 'Worker undefined changed status' was logged in the web interface, as that was (back then incorrectly) `workerupdate.name`. Now that code is correct.	2022-06-16 11:03:18 +02:00
Sybren A. Stüvel	5f2712980e	Manager: task scheduler, check for requested worker status change first Before checking whether the Worker is allowed to do work (i.e. is in `awake` state), check any queued-up status changes. Those should be communicated, before saying "no work for you", so that the Worker can actually respond to it.	2022-06-16 10:48:38 +02:00
Sybren A. Stüvel	ee53373878	Cleanup: compare worker state to constant instead of hard-coded state Use the `requiredStatusToGetTask` constant to compare the worker status, and not just for logging. No functional changes, just better code.	2022-06-16 10:46:50 +02:00
Sybren A. Stüvel	40f711bf69	Fix two unit tests for the previous commit I pushed too soon :'(	2022-06-16 10:42:04 +02:00
Sybren A. Stüvel	be0b10400f	Manager: count workers as 'seen' even when there is no task Fix a bug where a worker would only be counted as 'seen' by the task scheduler if it actually got a task assigned.	2022-06-16 10:39:42 +02:00
Sybren A. Stüvel	6e12a2fb25	Manager: keep track of which worker failed which task When a Worker indicates a task failed, mark it as `soft-failed` until enough workers have tried & failed at the same task. This is the first step in a blocklisting system, where tasks of an often-failing worker will be requeued to be retried by others. NOTE: currently the failure list of a task is NOT reset whenever it is requeued! This will be implemented in a future commit, and is tracked in `FEATURES.md`.	2022-06-13 18:41:38 +02:00
Sybren A. Stüvel	02bc03ae2b	Manager: replace `gorm.Model` with our own `persistence.Model` struct `persistence.Model` contains the common database fields for most model structs. It is a copy of `gorm.Model`, but without the `DeletedAt` field (which triggers Gorm's soft deletion). Soft deletion is not used by Flamenco. If it ever becomes necessary to support soft-deletion, see https://gorm.io/docs/delete.html#Soft-Delete	2022-06-13 18:40:42 +02:00
Sybren A. Stüvel	ec5b3aac52	Manager: on getting task update from Worker, write log before status change When receiving a `TaskUpdate` from a Worker, write to the task log, before handling any task status change. If both log and task status change are sent, the log will likely contain the cause of the task state change. Any subsequent task logs, for example generated by the Manager in response to the status change, should be logged after that.	2022-06-13 18:40:42 +02:00
Sybren A. Stüvel	5dac3c2dc0	Manager: mark workers as 'seen' when they send updates Update the 'last seen at' timestamp of workers when they: - sign on - sign off - get a task assigned - send a task update - check whether they can keep running their task Note that this commit is necessary to not have the workers time out immediately ;-)	2022-06-13 12:47:07 +02:00
Sybren A. Stüvel	c3525c3b1a	Manager: move task requeueing to `TaskStateMachine` Requeueing the tasks of a specific worker is now done in the `TaskStateMachine`, such that it can be called from other services as well in future commits. This also makes the `LogStorage` service a dependency of the `TaskStateMachine`, as it needs to write "this task was requeued" kind of messages to the task logs.	2022-06-13 12:33:01 +02:00
Sybren A. Stüvel	24204084c1	Manager: move timestamping of log messages to `task_logs` package In the future different services will write to the task log, and thus it makes sense to move the responsibility of prepending the timestamps to the log storage service.	2022-06-09 17:00:38 +02:00
Sybren A. Stüvel	819cad1d18	Manager: move broadcasting of task logs via SocketIO to task log service To ensure all task logs also get broadcast via SocketIO, the responsibility has moved from the `api_impl` to the `task_logs` package.	2022-06-09 16:49:48 +02:00
Sybren A. Stüvel	92d6693871	Show Task's "last touched" in the web interface	2022-06-09 11:59:43 +02:00
Sybren A. Stüvel	354fd29f9e	Manager: Start timeout counting as soon as Worker gets task assigned Set the task's "last touched" field in the database to "now" as soon as the task is assigned to a worker.	2022-06-09 11:58:30 +02:00
Sybren A. Stüvel	87bce6be36	Manager: unify logging of task assignment and requeue-on-signoff The requeue-task-on-worker-signoff operation also needs to log a timestamp. The code for this, and the recently added code for timestamping the "task assigned to worker" message, are now unified.	2022-06-09 11:30:46 +02:00
Sybren A. Stüvel	75903a2da3	Manager: prepend timestamp to "task assigned to worker" task log entries Add a new `clock` service to the Flamenco struct, which allows us to mock the passing of time, and thus test for timestamps in a stable fashion.	2022-06-09 11:24:02 +02:00
Sybren A. Stüvel	b186ea1828	Manager: write to task log when assigning it to a worker	2022-06-09 10:59:44 +02:00
Sybren A. Stüvel	b4d2fc4231	Manager: keep track of when a Worker last worked on a task This will be used for keeping track of stuck tasks.	2022-06-03 16:33:50 +02:00
Sybren A. Stüvel	0be1ca30dd	Cleanup: manager, move api_impl interfaces to interfaces.go The number of interfaces declared by the `api_impl` package is getting large, so they deserve their own file. No functional changes.	2022-06-03 15:52:07 +02:00
Sybren A. Stüvel	8e7f1e2868	Manager: some extra unit tests for worker signoff behaviour	2022-06-02 16:37:29 +02:00
Sybren A. Stüvel	6cf82e5d43	Manager: cleanup, refactor Worker state change request persistence code Move the setting & clearing of worker state change requests into separate functions. No functional changes.	2022-06-02 16:36:06 +02:00
Sybren A. Stüvel	132ce8f2ec	Merge 'shutdown' and 'offline' states Move the 'shutdown' state code to the 'offline' state, to match the removal of the 'shutdown' state from the OpenAPI definition.	2022-06-02 16:35:07 +02:00
Sybren A. Stüvel	678308fb6d	Manager: allow cancelling worker state change requests A worker state change request can now be cancelled by requesting the worker to go to its current state. In other words, a previously requested change `A → B` can be cancelled by requesting the worker goes to state `A`. Previously this would simply overwrite the last request, resulting in a requested state change `A → A`. Having this non-lazy would even interrupt the currently running task.	2022-06-02 12:43:16 +02:00
Sybren A. Stüvel	9ed6b6d931	Manager: adjust code for `WorkerStatusChangeRequest` extraction See preceding OpenAPI change.	2022-06-02 12:17:54 +02:00
Sybren A. Stüvel	ae6831ce6e	Manager: fix unit test rFcfb17b178da2055ef12b2aa2ad8f7f778a952bc3 changed the semantics of `SocketIOWorkerUpdate`, in the sense that any update that doesn't change the worker status can omit `previous_status`. This commit adjusts the unit test for this.	2022-06-02 12:13:25 +02:00
Sybren A. Stüvel	487a31624f	Cleanup: manager, make `workerDBtoAPI(w)` use `workerSummary(w)` This makes the `workerDBtoAPI(w)` and `workerSummary(w)` functions consistent, and makes the former use the latter.	2022-06-02 12:10:53 +02:00
Sybren A. Stüvel	f97f0a34c3	Manager: implement worker status change requests Implement the OpenAPI `RequestWorkerStatusChange` operation, and handle these changes in the web interface.	2022-05-31 17:22:03 +02:00
Sybren A. Stüvel	dd3f99ebaa	Manager: Fix unit test	2022-05-31 16:12:28 +02:00
Sybren A. Stüvel	f6dff086ef	Manager: show worker version in the workers table	2022-05-31 15:47:26 +02:00
Sybren A. Stüvel	3063e1fe6d	Manager: construct `api.Worker` from `api.WorkerSummary` + extra fields	2022-05-31 15:30:46 +02:00
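
Commits e687c95e5d and d25151184d describe a last-rendered image queue that accepts only a few requests and discards images when it gets too full. A minimal Go sketch of that idea follows, assuming a buffered channel with a non-blocking send. The names `Payload`, `Queue`, `NewQueue` and `TryEnqueue` are hypothetical and are not taken from the Flamenco source; the capacity of 3 mirrors the number mentioned in the commit message.

```go
package main

import "fmt"

// Payload is a hypothetical stand-in for a rendered image waiting to be processed.
type Payload struct {
	JobUUID string
	Image   []byte
}

// Queue is a bounded queue backed by a buffered channel.
// This is an illustration only, not Flamenco's actual implementation.
type Queue struct {
	items chan Payload
}

// NewQueue returns a queue that holds at most `capacity` payloads.
func NewQueue(capacity int) *Queue {
	return &Queue{items: make(chan Payload, capacity)}
}

// TryEnqueue adds the payload without blocking. It returns false when the
// queue is full, in which case the payload is discarded.
func (q *Queue) TryEnqueue(p Payload) bool {
	select {
	case q.items <- p:
		return true
	default:
		return false
	}
}

func main() {
	q := NewQueue(3)
	for i := 1; i <= 4; i++ {
		accepted := q.TryEnqueue(Payload{JobUUID: fmt.Sprintf("job-%d", i)})
		fmt.Printf("image %d accepted: %v\n", i, accepted)
	}
}
```

With nothing consuming from the queue, the first three images are accepted and the fourth is dropped. Because the sketch does not look at `JobUUID` when deciding what to drop, it shows the same limitation the commit message points out: a job that submits many images can crowd out updates from a quieter one.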


263 Commits