flamenco

Author	SHA1	Message	Date
Sybren A. Stüvel	1586c37b32	Manager: mark task as active as soon as it is assigned to a worker Move the task to 'active' status so that it won't be assigned to another worker. This also enables the task timeout monitoring.	2022-06-20 13:00:49 +02:00
Sybren A. Stüvel	2a4c9b2c13	Worker: enable SQLite foreign keys They're not used now, but enabling them is good default behaviour anyway.	2022-06-20 13:00:49 +02:00
Sybren A. Stüvel	de5d12362d	Manager: add `sleep_repeats` parameter to `echo-sleep-test` job type This makes it convenient to create an arbitrary number of tasks.	2022-06-20 11:44:41 +02:00
Sybren A. Stüvel	e0b9866fd4	Web: resize columns after their data was updated When data is updated, resize columns in the job/task/worker tables. Status change requests of Workers, for example, require more space when going from `awake` to `awake → offline`.	2022-06-20 11:44:08 +02:00
Sybren A. Stüvel	8bfe24cd96	Web: upgrade Tabulator 5.2.4 → 5.2.7	2022-06-20 11:21:42 +02:00
Sybren A. Stüvel	a2b667c043	Manager: log blocklist threshold	2022-06-17 17:15:23 +02:00
Sybren A. Stüvel	13bdb0ed73	Manager: remove outdated TODO	2022-06-17 17:15:13 +02:00
Sybren A. Stüvel	a368230afa	Manager: fix race condition in logging of worker name/UUID Instead of updating the logger in the context, just store a new logger in a new sub-context.	2022-06-17 17:13:32 +02:00
Sybren A. Stüvel	21a294a267	FEATURES.md: mark task as done	2022-06-17 16:40:19 +02:00
Sybren A. Stüvel	ec55cf6ce1	FEATURES.md: more features	2022-06-17 16:37:06 +02:00
Sybren A. Stüvel	64c8fa851d	Show assigned worker in task details Show the worker assigned to the task in the task details view, as link to the worker itself.	2022-06-17 16:36:55 +02:00
Sybren A. Stüvel	7327896db9	Worker: allow overriding worker name from environment Allow overriding the worker name by setting the `FLAMENCO_WORKER_NAME` environment variable. This makes it easy to do from Docker configs, and, more importantly, from the scripts I use to run multiple workers on the same machine while developing Flamenco.	2022-06-17 16:24:03 +02:00
Sybren A. Stüvel	857704c184	Web: worker nickname → name See 55676b000efbd04cd895da9068f375dfad473ff4	2022-06-17 15:55:36 +02:00
Sybren A. Stüvel	cdb7789f08	Refactor: Manager, move test code Move code that covers `worker_task_updates.go` into `worker_task_updates_test.go`. No functional changes.	2022-06-17 15:51:15 +02:00
Sybren A. Stüvel	046853932d	Manager: re-queue previously failed tasks of worker when blocklisting When a Worker is blocked from a job, re-queue its previously failed tasks so that other workers can give them a try.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	b95bed1f96	Refactor: rename `RequeueTasksOfWorker` to `RequeueActiveTasksOfWorker` Soon there will be another function to requeue tasks of workers by other criteria, so being clear in the name helps. No functional changes.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	fd31a85bcd	Manager: add blocking of workers when they fail certain tasks too much When a worker fails too many tasks, of the same task type, on the same job, it'll get blocked from doing those.	2022-06-17 15:49:16 +02:00
Sybren A. Stüvel	56abc825a6	Refactor: Manager, refactor handling of task failures Split the handling of soft and hard failures into separate functions. No functional changes intended.	2022-06-17 15:01:52 +02:00
Sybren A. Stüvel	0396919229	FEATURES: add new way in which jobs can get stuck	2022-06-17 14:59:26 +02:00
Sybren A. Stüvel	6feee74c54	Cleanup: Manager, move worker task update handling code into its own file Move the code related to task updates from workers to `worker_task_updates.go`. It's going to get more complex with the blocklisting in there; this prepares for that. No functional changes.	2022-06-17 11:46:07 +02:00
Sybren A. Stüvel	50e795c595	FEATURES.md: mark 'clear task failure list' as done	2022-06-17 11:39:57 +02:00
Sybren A. Stüvel	81f81d0e0a	Show task failure list in the web frontend Show the task failure list in the web frontend's `TaskDetails` component.	2022-06-17 11:37:56 +02:00
Sybren A. Stüvel	7f14dac62f	OAPI: regenerate code	2022-06-17 11:37:54 +02:00
Sybren A. Stüvel	aaed1e0589	OAPI: include task failure list in Task schema Include the list of workers who failed this task in the `Task` schema.	2022-06-17 11:37:28 +02:00
Sybren A. Stüvel	0b5140fc5f	Manager: clear task failure list on requeueing of jobs & tasks When a job or task gets requeued from the web interface, its task failure lists (i.e. the list of workers that previously failed this task) will be cleared. This clearing doesn't happen in other situations, e.g. when a worker signs off and its task gets requeued, the task's failure list will remain as-is.	2022-06-17 11:37:28 +02:00
Sybren A. Stüvel	d8be9d95e8	README: document task status meanings	2022-06-17 11:37:28 +02:00
Sybren A. Stüvel	e9fca8d993	Cleanup: typo fix in comment	2022-06-17 11:03:43 +02:00
Sybren A. Stüvel	b991e5f446	Cleanup: Manager, clarify some function names of the task state machine Rename functions `onTaskStatusX` to `updateJobOnTaskStatusX` to clarify their responsibility is to update the job in reaction to a task status change. No functional changes.	2022-06-17 11:01:41 +02:00
Sybren A. Stüvel	8764f8f7c1	Manager: task scheduler, don't schedule tasks the worker failed before When a worker asks for a task to perform, don't give it a task that it failed before.	2022-06-16 16:02:28 +02:00
Sybren A. Stüvel	ec10128f85	Worker: Sleep command, return error when sleep time is negative I need a way to reliably generate task errors, and having a more thorough check on the sleep duration parameter seemed a nice way to create those.	2022-06-16 15:46:03 +02:00
Sybren A. Stüvel	d5d0893b05	Worker: use explicit types for command parameter errors Introduce `ParameterMissingError` and `ParameterInvalidError` structs, to be returned from command executors. These replace free-form `fmt.Errorf()` style errors.	2022-06-16 15:45:09 +02:00
Sybren A. Stüvel	8af1b9d976	Worker: fix sync issue in TestUpstreamBufferManagerUnavailable unit test Fix synchronisation/goroutine issue in the "upstream buffer" test, where very occasionally the queue size was checked at the wrong time.	2022-06-16 15:43:20 +02:00
Sybren A. Stüvel	da1b42f9fa	Worker: fix sqlite connection issue in unit tests Fix sqlite issues in the "upstream buffer" test. The test used `:memory:` to have an in-memory DB to separate from other tests. The "flush at shutdown" code runs in a different goroutine, though, and creates a new DB connection. The SQLite separation was too strong, making that function not find any tables. This is now solved by having an in-memory database that's shared between all connections made from the same unit test.	2022-06-16 15:42:52 +02:00
Sybren A. Stüvel	7e28cfa69c	Worker: add task failures to the task log as well Task failures were only placed in the task's activity field, and are now added to the log as well.	2022-06-16 12:22:05 +02:00
Sybren A. Stüvel	e1309ad8fc	Worker: flush upstream buffer when shutting down When shutting down, the worker now tries to flush any buffered task updates before closing.	2022-06-16 12:21:17 +02:00
Sybren A. Stüvel	9ddf72fa37	Worker: sign off as last step of shutdown Within the shutdown procedure, signing off is now the last thing the worker does. This makes things more consistent from the Manager's point of view (like receiving last-second log entries while the Worker is still online).	2022-06-16 12:19:03 +02:00
Sybren A. Stüvel	5bc94101e8	Worker: Avoid sleep at shutdown Make the sleep between fetching tasks interruptible, so that a shutdown doesn't have to wait a few seconds.	2022-06-16 12:08:13 +02:00
Sybren A. Stüvel	9ab41984ac	Adjust Go code for Nickname -> Name change This fixes a bug where 'Worker undefined changed status' was logged in the web interface, as that was (back then incorrectly) `workerupdate.name`. Now that code is correct.	2022-06-16 11:03:18 +02:00
Sybren A. Stüvel	61aad21e99	OAPI: regenerate code	2022-06-16 11:02:04 +02:00
Sybren A. Stüvel	55676b000e	OAPI: change worker 'nickname' to just 'name' There was no need to have the extra four letters 'nick', and some parts of the code were already using just 'name' for the workers. This simplifies and unifies things.	2022-06-16 11:01:27 +02:00
Sybren A. Stüvel	12f0a605a4	Manager: log configured worker timeout at startup	2022-06-16 10:51:17 +02:00
Sybren A. Stüvel	5f2712980e	Manager: task scheduler, check for requested worker status change first Before checking whether the Worker is allowed to do work (i.e. is in `awake` state), check any queued-up status changes. Those should be communicated, before saying "no work for you", so that the Worker can actually respond to it.	2022-06-16 10:48:38 +02:00
Sybren A. Stüvel	ee53373878	Cleanup: compare worker state to constant instead of hard-coded state Use the `requiredStatusToGetTask` constant to compare the worker status, and not just for logging. No functional changes, just better code.	2022-06-16 10:46:50 +02:00
Sybren A. Stüvel	40f711bf69	Fix two unit tests for the previous commit I pushed too soon :'(	2022-06-16 10:42:04 +02:00
Sybren A. Stüvel	be0b10400f	Manager: count workers as 'seen' even when there is no task Fix a bug where a worker would only be counted as 'seen' by the task scheduler if it actually got a task assigned.	2022-06-16 10:39:42 +02:00
Sybren A. Stüvel	7d7c2b1bd6	Cleanup: blacklist → blocklist Change "blacklist" to "blocklist", because that makes people happier. No functional changes.	2022-06-16 10:36:36 +02:00
Sybren A. Stüvel	6e12a2fb25	Manager: keep track of which worker failed which task When a Worker indicates a task failed, mark it as `soft-failed` until enough workers have tried & failed at the same task. This is the first step in a blocklisting system, where tasks of an often-failing worker will be requeued to be retried by others. NOTE: currently the failure list of a task is NOT reset whenever it is requeued! This will be implemented in a future commit, and is tracked in `FEATURES.md`.	2022-06-13 18:41:38 +02:00
Sybren A. Stüvel	c5debdeb70	Manager: add 'task failure list' to record workers failing tasks The persistence layer can now store which worker failed which task, as preparation for a blocklisting system. Such a system should be able to determine whether there are still any workers left to do the work.	2022-06-13 18:41:30 +02:00
Sybren A. Stüvel	e35911d106	Manager: add ability to delete jobs This is needed for a future unit test, and exposed the fact that SQLite didn't enforce foreign key constraints (and thus also didn't handle on-delete-cascade attributes). This has been fixed in the previous commit.	2022-06-13 18:41:19 +02:00
Sybren A. Stüvel	e5d0e987e1	Manager: enforce DB foreign key checks at startup SQLite disables foreign key checks by default, so Flamenco has to enable them explicitly.	2022-06-13 18:41:19 +02:00

869 Commits