Run some API operations in a background context. This should prevent some
of the SQLite "interrupted" errors, as those can occur when the context
closes while a query is running.
The API operations that Workers use now mostly run in a separate
background context, at least from the moment they can run independently
of the Worker connection.
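A minimal sketch of the pattern, with a hypothetical `SaveTask` operation and
timeout; the real operations and names differ:

```go
package example

import (
	"context"
	"time"
)

// Task and TaskSaver are stand-ins for the real persistence types.
type Task struct{ UUID string }

type TaskSaver interface {
	SaveTask(ctx context.Context, t *Task) error
}

// saveTaskInBackground runs the database write with a context that is not
// tied to the incoming Worker request, so closing that request does not
// interrupt the running SQLite query.
func saveTaskInBackground(persist TaskSaver, task *Task) error {
	bgCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	return persist.SaveTask(bgCtx, task)
}
```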
For some reason, on Windows, creating a directory with zero permissions
still allows creating a file in there. Just skip that part of the test.
The directory's properties panel in Explorer also shows "Read Only
(only applies to files)", so at least that seems consistent.
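The skip itself is the standard Go testing idiom; a sketch with a hypothetical
test name:

```go
package example

import (
	"runtime"
	"testing"
)

func TestUnwritableDirectory(t *testing.T) {
	if runtime.GOOS == "windows" {
		// Windows still allows creating files in a zero-permission
		// directory, so this part of the test is meaningless there.
		t.Skip("skipping on Windows: zero-permission directories still allow file creation")
	}
	// ... the rest of the test relies on the directory being unwritable.
}
```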
Add a SHA256 password hasher for worker authentication. It's not used at
the moment, but can be switched to for faster API queries. Note that
switching will cause authentication errors on already-existing workers,
which means they'll automatically re-register.
This is mostly useful for debugging & profiling purposes.
Move the Worker password hashing/comparison functions into a struct, and
use it via an interface. This will make it easier to switch to different
hashing algorithms.
Even with a low number of iterations, BCrypt is quite slow. That's good for
security, but not for Flamenco Worker authentication -- the password is
more of a nice check to avoid accidentally reusing the same ID than an
actual security measure.
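A rough sketch of what such an interface-based hasher can look like; the
interface and method names here are assumptions, not the actual Flamenco code:

```go
package example

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"

	"golang.org/x/crypto/bcrypt"
)

// PasswordHasher abstracts the hashing algorithm so implementations can be
// swapped without touching the authentication code.
type PasswordHasher interface {
	HashPassword(workerUUID, secret string) (string, error)
	ComparePassword(hashedPassword, workerUUID, secret string) bool
}

// SHA256Hasher is fast, which is fine when the password is just a sanity
// check against accidentally reusing the same worker ID.
type SHA256Hasher struct{}

func (SHA256Hasher) HashPassword(workerUUID, secret string) (string, error) {
	sum := sha256.Sum256([]byte(workerUUID + secret))
	return hex.EncodeToString(sum[:]), nil
}

func (h SHA256Hasher) ComparePassword(hashedPassword, workerUUID, secret string) bool {
	expected, _ := h.HashPassword(workerUUID, secret)
	return subtle.ConstantTimeCompare([]byte(hashedPassword), []byte(expected)) == 1
}

// BCryptHasher is intentionally slow; it remains the default.
type BCryptHasher struct{ Cost int }

func (h BCryptHasher) HashPassword(workerUUID, secret string) (string, error) {
	hash, err := bcrypt.GenerateFromPassword([]byte(workerUUID+secret), h.Cost)
	return string(hash), err
}

func (BCryptHasher) ComparePassword(hashedPassword, workerUUID, secret string) bool {
	return bcrypt.CompareHashAndPassword([]byte(hashedPassword), []byte(workerUUID+secret)) == nil
}
```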
In the first-time wizard, if Blender cannot be found on $PATH but it can
be found via .blend file association, that should just be reported as a
normal situation, and not as a `500 Internal Server Error`.
This just updates the config and saves it to `flamenco-manager.yaml`.
Saving the configuration doesn't restart the Manager yet, that's for
another commit.
This adds a `-wizard` CLI option to the Manager, which opens a web browser
and shows the First-Time Wizard to aid in configuration of Flamenco.
This is work in progress. The wizard is just one page, and doesn't save
anything yet to the configuration.
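A minimal sketch of the flag handling, assuming the standard library `flag`
package and the `github.com/pkg/browser` helper; the URL and names are
illustrative:

```go
package main

import (
	"flag"
	"log"

	"github.com/pkg/browser"
)

func main() {
	// -wizard opens the First-Time Wizard page served by the Manager itself.
	wizard := flag.Bool("wizard", false, "open a web browser showing the First-Time Wizard")
	flag.Parse()

	if *wizard {
		// The URL is illustrative; the Manager serves the wizard on its
		// own web interface.
		if err := browser.OpenURL("http://localhost:8080/setup-wizard"); err != nil {
			log.Printf("could not open a web browser: %v", err)
		}
	}
}
```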
Manager always creates an implicit variable `{jobs}`. This used to be
Shaman-dependent, but now it's always there (has been for a while). This
is now reflected in an add-on comment, and in an extra unit test.
The task logs storage system is refactored to use the `local_storage`
package. Configuration options have also changed:
- `task_logs_path` is renamed to `local_manager_storage_path`, to
emphasise that only the Manager deals with those files, with default
value `./flamenco-manager-storage`.
- `storage_path` is renamed to `shared_storage_path`, to emphasise this
is the storage shared between Manager and Workers, with default value
`./flamenco-shared-storage`.
Task logs are still stored in
`${local_manager_storage_path}/job-{jobUUID[0:4]}/{jobUUID}/task-{taskUUID}.txt`
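A sketch of how that path can be composed; the function and parameter names
are illustrative:

```go
package example

import (
	"fmt"
	"path/filepath"
)

// taskLogPath composes the on-disk location of a task's log file following
// the pattern above.
func taskLogPath(localManagerStoragePath, jobUUID, taskUUID string) string {
	return filepath.Join(
		localManagerStoragePath,
		fmt.Sprintf("job-%s", jobUUID[:4]),
		jobUUID,
		fmt.Sprintf("task-%s.txt", taskUUID),
	)
}
```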
Maniphest task: T99409
Add a "Last Rendered" view to the webapp.
The Manager now stores (in the database) which job was the last
recipient of a rendered image, and serves that to the appropriate
OpenAPI endpoint.
A new SocketIO subscription + accompanying room makes it possible for
the web interface to receive all rendered images (if they survive the
queue, which discards images when it gets too full).
After processing an image in the "last-rendered" processor, a SocketIO
object is sent to clients to indicate the last-rendered image needs to
be (re)loaded.
This also moves the previously existing "done callback" from a single
function to a per-image callback, so that it can be called with the
right information, and only when that particular image is
actually done processing.
The notification message sent via SocketIO also contains the necessary
info to render the image, so that the web client doesn't have to call
the `fetchJobLastRenderedInfo` operation.
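A hedged sketch of what such a notification payload can look like; the type
and field names here are assumptions, the real type is generated from the
OpenAPI definition:

```go
package example

// LastRenderedUpdate is an illustrative shape for the SocketIO message; the
// actual generated type may differ.
type LastRenderedUpdate struct {
	JobUUID string `json:"jobUUID"`

	// Enough information for the web client to construct the image URLs
	// without calling fetchJobLastRenderedInfo.
	Thumbnail struct {
		Base     string   `json:"base"`     // base path of the rendered image
		Suffixes []string `json:"suffixes"` // available thumbnail variants
	} `json:"thumbnail"`
}
```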
Add a handler for the OpenAPI `taskOutputProduced` operation, and an
image thumbnailing goroutine.
The queue of images to process + the function to handle queued images
is managed by `last_rendered.LastRenderedProcessor`. This queue currently
simply allows 3 requests; this should be improved such that it keeps
track of the job IDs as well, as with the current approach a spammy job
can starve the updates from a calmer job.
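A minimal sketch of such a bounded queue, using a buffered channel with a
non-blocking send; the type name and queue size of 3 follow the description
above, everything else is illustrative:

```go
package example

import (
	"context"
	"errors"
	"log"
)

// ErrQueueFull is returned when an image cannot be queued for processing.
var ErrQueueFull = errors.New("last-rendered queue is full")

type imageJob struct {
	JobUUID string
	Payload []byte
}

// LastRenderedProcessor owns the queue and the thumbnailing goroutine.
type LastRenderedProcessor struct {
	queue chan imageJob
}

func NewLastRenderedProcessor() *LastRenderedProcessor {
	return &LastRenderedProcessor{queue: make(chan imageJob, 3)}
}

// QueueImage does a non-blocking send; when the queue is full the image is
// simply dropped, which is the "discards images when it gets too full"
// behaviour described above.
func (p *LastRenderedProcessor) QueueImage(img imageJob) error {
	select {
	case p.queue <- img:
		return nil
	default:
		return ErrQueueFull
	}
}

// Run is the thumbnailing goroutine; the actual image processing is omitted.
func (p *LastRenderedProcessor) Run(ctx context.Context) {
	for {
		select {
		case <-ctx.Done():
			return
		case img := <-p.queue:
			log.Printf("processing last-rendered image for job %s", img.JobUUID)
			// Downscaling, storing the thumbnail, and calling the per-image
			// "done callback" would happen here.
		}
	}
}
```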
Move the code related to task updates from workers to
`worker_task_updates.go`. It's going to get more complex with the
blocklisting in there; this prepares for that.
No functional changes.
When a job or task gets requeued from the web interface, its task
failure lists (i.e. the list of workers that previously failed this
task) will be cleared.
This clearing doesn't happen in other situations; for example, when a
worker signs off and its task gets requeued, the task's failure list
remains as-is.
This fixes a bug where 'Worker undefined changed status' was logged in
the web interface: the name was taken from `workerupdate.name`, which
back then was incorrect. Now that code is correct.
Before checking whether the Worker is allowed to do work (i.e. is in
`awake` state), check any queued-up status changes. Those should be
communicated before saying "no work for you", so that the Worker can
actually respond to them.
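A sketch of that ordering, with hypothetical helper and type names:

```go
package example

import "context"

// Worker, WorkerStore, and Response are stand-ins for the real types.
type Worker struct{ Status string }

type WorkerStore interface {
	// QueuedStatusChange returns the requested status, or "" when none is queued.
	QueuedStatusChange(ctx context.Context, w *Worker) (string, error)
}

type Response struct {
	StatusChangeRequested string
	NoTask                bool
}

// scheduleTask shows the ordering: a queued status change is communicated
// before the "not awake, no task for you" answer, so the Worker can act on it.
func scheduleTask(ctx context.Context, w *Worker, workers WorkerStore) (*Response, error) {
	if change, err := workers.QueuedStatusChange(ctx, w); err == nil && change != "" {
		return &Response{StatusChangeRequested: change}, nil
	}
	if w.Status != "awake" {
		return &Response{NoTask: true}, nil
	}
	// ... normal task assignment continues here.
	return &Response{}, nil
}
```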
When a Worker indicates a task failed, mark it as `soft-failed` until
enough workers have tried & failed at the same task.
This is the first step in a blocklisting system, where tasks of an
often-failing worker will be requeued to be retried by others.
NOTE: currently the failure list of a task is NOT reset whenever it is
requeued! This will be implemented in a future commit, and is tracked in
`FEATURES.md`.
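A sketch of the threshold check, with a hypothetical failure-count query and
an assumed threshold value:

```go
package example

import "context"

// taskFailureThreshold is an assumed value, not the actual configuration.
const taskFailureThreshold = 3

// FailureStore is a stand-in for the persistence layer.
type FailureStore interface {
	// AddWorkerToTaskFailedList records the failure and returns how many
	// distinct workers have failed this task so far.
	AddWorkerToTaskFailedList(ctx context.Context, taskUUID, workerUUID string) (int, error)
}

// onTaskFailed keeps the task soft-failed until enough distinct workers have
// failed it, after which it becomes hard-failed.
func onTaskFailed(ctx context.Context, db FailureStore, taskUUID, workerUUID string) (newStatus string, err error) {
	numFailed, err := db.AddWorkerToTaskFailedList(ctx, taskUUID, workerUUID)
	if err != nil {
		return "", err
	}
	if numFailed < taskFailureThreshold {
		return "soft-failed", nil
	}
	return "failed", nil
}
```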
`persistence.Model` contains the common database fields for most model
structs. It is a copy of `gorm.Model`, but without the `DeletedAt`
field (which triggers Gorm's soft deletion).
Soft deletion is not used by Flamenco. If it ever becomes necessary to
support soft-deletion, see https://gorm.io/docs/delete.html#Soft-Delete
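For reference, a rough sketch of what such a struct looks like (gorm.Model
minus `DeletedAt`):

```go
package persistence

import "time"

// Model is the Flamenco counterpart of gorm.Model, minus the DeletedAt
// field, so Gorm's soft-deletion behaviour is never triggered.
type Model struct {
	ID        uint `gorm:"primarykey"`
	CreatedAt time.Time
	UpdatedAt time.Time
}
```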
When receiving a `TaskUpdate` from a Worker, write to the task log, before
handling any task status change.
If both a log and a task status change are sent, the log will likely contain
the cause of the task state change. Any subsequent task logs, for example
generated by the Manager in response to the status change, should be
logged after that.
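A sketch of that ordering, with hypothetical service and field names:

```go
package example

import "context"

type Task struct{ UUID, JobUUID string }

type TaskUpdate struct{ Log, TaskStatus string }

type LogWriter interface {
	Write(ctx context.Context, jobUUID, taskUUID, logText string) error
}

type StateMachine interface {
	TaskStatusChange(ctx context.Context, t *Task, status string) error
}

// handleTaskUpdate writes the Worker's log lines before acting on any status
// change, so the log explaining the change appears above Manager-generated
// log lines written in response to that change.
func handleTaskUpdate(ctx context.Context, logs LogWriter, sm StateMachine, t *Task, update TaskUpdate) error {
	if update.Log != "" {
		if err := logs.Write(ctx, t.JobUUID, t.UUID, update.Log); err != nil {
			return err
		}
	}
	if update.TaskStatus != "" {
		return sm.TaskStatusChange(ctx, t, update.TaskStatus)
	}
	return nil
}
```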
Update the 'last seen at' timestamp of workers when they:
- sign on
- sign off
- get a task assigned
- send a task update
- check whether they can keep running their task
Note that this commit is necessary to not have the workers time out
immediately ;-)
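A sketch of such a 'touch' update, assuming a GORM-backed store; the function
name is illustrative:

```go
package example

import (
	"context"
	"time"

	"gorm.io/gorm"
)

// Worker is a stand-in; only the field relevant here is shown.
type Worker struct {
	ID         uint
	LastSeenAt time.Time
}

// WorkerSeen updates only the 'last seen at' timestamp, so it can be called
// cheaply from sign-on, sign-off, task assignment, task updates, and
// may-I-keep-running checks. The worker is assumed to be a loaded row.
func WorkerSeen(ctx context.Context, db *gorm.DB, w *Worker) error {
	w.LastSeenAt = time.Now()
	return db.WithContext(ctx).
		Model(w).
		Updates(Worker{LastSeenAt: w.LastSeenAt}).
		Error
}
```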
Requeueing the tasks of a specific worker is now done in the
`TaskStateMachine`, such that it can be called from other services as
well in future commits.
This also makes the `LogStorage` service a dependency of the
`TaskStateMachine`, as it needs to write "this task was requeued" kind
of messages to the task logs.
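An illustrative sketch of the new responsibility split; all names here are
assumptions:

```go
package example

import "context"

type Task struct{ UUID, JobUUID string }

type TaskStore interface {
	FetchTasksOfWorker(ctx context.Context, workerUUID string) ([]*Task, error)
	SetTaskStatus(ctx context.Context, t *Task, status string) error
}

type LogStorage interface {
	WriteTimestamped(ctx context.Context, jobUUID, taskUUID, logText string)
}

// TaskStateMachine now depends on both the persistence layer and the log
// storage, so it can requeue tasks and explain why in the task logs.
type TaskStateMachine struct {
	persist    TaskStore
	logStorage LogStorage
}

func (sm *TaskStateMachine) RequeueTasksOfWorker(ctx context.Context, workerUUID, reason string) error {
	tasks, err := sm.persist.FetchTasksOfWorker(ctx, workerUUID)
	if err != nil {
		return err
	}
	for _, task := range tasks {
		if err := sm.persist.SetTaskStatus(ctx, task, "queued"); err != nil {
			return err
		}
		sm.logStorage.WriteTimestamped(ctx, task.JobUUID, task.UUID,
			"task was requeued: "+reason)
	}
	return nil
}
```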