40 Commits

Author SHA1 Message Date
Sybren A. Stüvel
96cc8255e7 Manager: simplify use of sqlc
Instead of returning an error when getting the sqlc queries object, just
panic. This'll make the calling code quite a bit simpler. The situation
in which it might error out is so rare that I've never seen it, and I
don't even know if it will ever be possible to happen with the SQLite
implementation we use now. Furthermore, once we get rid of GORM, it
should just always work anyway.

Ref: #104305
2024-09-18 10:17:20 +02:00
Sybren A. Stüvel
3b2b5c47f3 Manager: convert FetchWorkerTask() from gorm to sqlc
Ref: #104305

No functional changes
2024-07-02 10:16:41 +02:00
Sybren A. Stüvel
7a9f809c43 Manager: convert more worker functions to sqlc
No functional changes.
2024-05-28 18:20:15 +02:00
Sybren A. Stüvel
9a229a7b8f Manager: Convert CreateWorker to sqlc
No functional changes.
2024-05-28 18:20:07 +02:00
Sybren A. Stüvel
0c4240ec3a Manager: make some db fields boolean instead of smallint
Turn `workers.lazy_status_request` and `workers.can_restart` into a
`boolean`. They were `smallint` before.

Having these explicitly modeled as `boolean` will make sqlc generate the
right type for them.

No functional changes.
2024-05-28 18:15:21 +02:00
Sybren A. Stüvel
c1cdff567e Manager: Convert FetchTask to sqlc
This is a bit more work than other queries, as it also breaks apart the
fetching of the job and the worker into separate ones. In other words,
internally the persistence layer API changes.
2024-05-28 14:46:42 +02:00
Sybren A. Stüvel
3974770f36 Manager: refuse to delete workers without foreign key constraints
As a safety measure, refuse to delete Workers from the Manager's database
when foreign key constraints are disabled.

In the long term, the underlying problem should be solved. This is a stop-
gap measure to ensure database consistency.
2024-04-12 10:48:40 +02:00
Sybren A. Stüvel
61cc8ff04d Manager: implement API operation to get the farm status
Add a new API operation to get the overall farm status. This is based on
the jobs and workers, and their status.

The statuses are:

- `active`: Actively working on jobs.
- `idle`: Farm could be active, but has no work to do.
- `waiting`: Work has been queued, but all workers are asleep.
- `asleep`: Farm is idle, and all workers are asleep.
- `inoperative`: Cannot work: no workers, or all are offline/error.
- `starting`: Farm is starting up.
- `unknown`: Unexpected configuration of worker and job statuses.
2024-02-29 20:42:28 +01:00
Sybren A. Stüvel
9afd79d8c0 Manager: prevent logging an error when fetching unknown worker
Prevent logging an error in the persistence layer when an unknown worker
is requested.

This reduces the noise & confusion when the web interface is showing the
details of a worker, but the worker gets removed by someone else. Or when
the Manager doesn't know about a Worker and it's trying to connect.

See #104282.
2024-01-25 12:38:13 +01:00
Sybren A. Stüvel
3e72391cbf Restartable workers
When the worker is started with `-restart-exit-code 47` or has
`restart_exit_code=47` in `flamenco-worker.yaml`, it's marked as
'restartable'. This will enable two worker actions 'Restart
(immediately)' and 'Restart (after task is finished)' in the Manager web
interface. When a worker is asked to restart, it will exit with exit
code `47`. Of course any positive exit code can be used here.
2023-08-14 16:00:09 +02:00
Sybren A. Stüvel
02fac6a4df Change Go package name from git.blender.org to projects.blender.org
Change the package base name of the Go code, from
`git.blender.org/flamenco` to `projects.blender.org/studio/flamenco`.

The old location, `git.blender.org`, has no longer been use since the
[migration to Gitea][1]. The new package names now reflect the actual
location where Flamenco is hosted.

[1]: https://code.blender.org/2023/02/new-blender-development-infrastructure/
2023-08-01 12:42:31 +02:00
Sybren A. Stüvel
60d54eabb3 Manager: avoid recreation of Worker table at startup
Mark the default value of `Worker.LazyStatusRequest` as `false`. The
previous default was configured as `0`, which was different enough to
always trigger a database migration of that column. However, since these
values do map to each other, the migration didn't do anything concrete,
and would be triggered again at the next startup.
2023-07-10 13:56:47 +02:00
Eveline Anderson
830c3fe794 Rename worker 'clusters' to 'tags'
As it was decided that the name "tags" would be better for the clarity
of the feature, all files and code named "cluster" or "worker cluster"
have been removed and replaced with "tag" and "worker tag". This is only
a name change, no other features were touched.

This addresses part of #104204.

Reviewed-on: https://projects.blender.org/studio/flamenco/pulls/104223

As a note to anyone who already ran a pre-release version of Flamenco
and configured some worker clusters, with the help of an SQLite client
you can migrate the clusters to tags. First build Flamenco Manager and
start it, to create the new database schema. Then run these SQL queries
via an sqlite commandline client:

```sql
insert into worker_tags
    (id, created_at, updated_at, uuid, name, description)
  select id, created_at, updated_at, uuid, name, description
  from worker_clusters;

insert into worker_tag_membership (worker_tag_id, worker_id)
  select worker_cluster_id, worker_id from worker_cluster_membership;
```
2023-07-10 11:11:03 +02:00
Sybren A. Stüvel
8408d28a6b Manager: add support for worker clusters 2023-04-04 12:18:35 +02:00
Sybren A. Stüvel
e2559b1181 Cleanup: remove doubly-declared default value in persistence layer
No functional changes.
2023-04-03 16:59:22 +02:00
Sybren A. Stüvel
59655ea770 Manager: fix error in sleep scheduler when shutting down
When the Manager was shutting down while the sleep scheduler was running, it
could cause a null pointer dereference. This is now doubly solved:

- `worker.Identifier()` is now nil-safe, as in, `worker` can be `nil` and
  it will still return a sensible string.
- failure to apply the sleep schedule due to the context closing is not
  logged as error any more.
2022-09-27 12:27:18 +02:00
Sybren A. Stüvel
2a345a3d2c API for deleting workers
Workers can now be soft-deleted. Tasks assigned to the worker will remain
associated with that Worker. Active tasks will be re-queued so other
workers can pick them up.
2022-08-11 16:59:53 -07:00
Sybren A. Stüvel
736ca103c3 Manager: show current/last task in worker details
The Task details component already linked to the Worker it was assigned
to last, and now the Worker links back to the task.

There's only one task shown in the Worker details. If the Worker is
actively working on a task, that one's shown. Otherwise it's the
last-updated task that was assigned to the worker.
2022-07-26 10:36:02 +02:00
Sybren A. Stüvel
d7b164133a Sleep Scheduler implementation for the Manager
The Manager now has a sleep scheduler for Workers. The API and background
service work, but there is no web interface yet.

Manifest Task: T99397
2022-07-17 17:27:32 +02:00
Sybren A. Stüvel
02bc03ae2b Manager: replace gorm.Model with our own persistence.Model struct
`persistence.Model` contains the common database fields for most model
structs. It is a copy of `gorm.Model`, but without the `DeletedAt`
field (which triggers Gorm's soft deletion).

Soft deletion is not used by Flamenco. If it ever becomes necessary to
support soft-deletion, see https://gorm.io/docs/delete.html#Soft-Delete
2022-06-13 18:40:42 +02:00
Sybren A. Stüvel
5dac3c2dc0 Manager: mark workers as 'seen' when they send updates
Update the 'last seen at' timestamp of workers when they:
- sign on
- sign off
- get a task assigned
- send a task update
- check whether they can keep running their task

Note that this commit is necessary to not have the workers time out
immediately ;-)
2022-06-13 12:47:07 +02:00
Sybren A. Stüvel
7d5aae25b5 Manager: add timeout checks for workers 2022-06-13 12:33:22 +02:00
Sybren A. Stüvel
04dd479248 Manager: protect task log writing with mutex
A per-task mutex is used to protect the writing of task logs, so that
mutliple goroutines can safely write to the same task log.
2022-06-09 14:44:54 +02:00
Sybren A. Stüvel
6cf82e5d43 Manager: cleanup, refactor Worker state change request persistence code
Move the setting & clearing of worker state change requests into separate
functions.

No functional changes.
2022-06-02 16:36:06 +02:00
Sybren A. Stüvel
f97f0a34c3 Manager: implement worker status change requests
Implement the OpenAPI `RequestWorkerStatusChange` operation, and handle
these changes in the web interface.
2022-05-31 17:22:03 +02:00
Sybren A. Stüvel
1496736f7a Manager: wrap Worker fetching errors
Do the same wrapping as for task/job errors, but then for workers.
2022-05-31 11:18:57 +02:00
Sybren A. Stüvel
19db947eb4 Manager: remove Worker.LastActivity
This removes the field both from the OpenAPI interface and the database.
2022-05-31 10:46:27 +02:00
Sybren A. Stüvel
08676f48f4 Manager: implement fetchWorkers OpenAPI operation 2022-05-30 18:52:02 +02:00
Sybren A. Stüvel
9f5e4cc0cc License: license all code under "GPL-3.0-or-later"
The add-on code was copy-pasted from other addons and used the GPL v2
license, whereas by accident the LICENSE text file had the GNU "Affero" GPL
license v3 (instead of regular GPL v3).

This is now all streamlined, and all code is licensed as "GPL v3 or later".

Furthermore, the code comments just show a SPDX License Identifier
instead of an entire license block.
2022-03-07 15:26:46 +01:00
Sybren A. Stüvel
cd2fe8170e Errors: remove "error" prefix from message
Instead of returning an error "error doing X", just return "doing X". The
fact that it's returned as an error object says enough about that it's
an error.

This also makes it easier to chain error messages, without seeing the
word "error" in every part of the chain.
2022-03-04 11:30:31 +01:00
Sybren A. Stüvel
9643bf768e Manager: Fix DB migration error of not-null columns
Where the PostgreSQL DB migration code could handle `NOT NULL` columns just
fine, SQLite has less table-altering functionality. As a result, migrations
have to copy entire database tables, which doesn't play well with
not-nullable columns.
2022-03-03 12:10:13 +01:00
Sybren A. Stüvel
47e36c927c Change package URL to the blender.org repository 2022-03-01 20:45:09 +01:00
Sybren A. Stüvel
32af1ffaef Manager: actually pass context to Gorm queries 2022-02-28 11:53:31 +01:00
Sybren A. Stüvel
270c54fdb7 More status change acks & checks to get stable flow between worker states 2022-02-15 17:46:37 +01:00
Sybren A. Stüvel
50088b4c94 Save worker info on sign-on (not just on registration) 2022-02-15 10:57:29 +01:00
Sybren A. Stüvel
0457809641 Scheduler: filter on supported task types & assigned worker ID 2022-02-14 18:00:43 +01:00
Sybren A. Stüvel
2ca8858c28 Only update status field in DB when worker changes status 2022-02-01 10:16:10 +01:00
Sybren A. Stüvel
be89349632 Very basic non-functional framework for a task runner
Also has some login/logout functionality for storing stuff in the DB.
2022-01-31 16:05:27 +01:00
Sybren A. Stüvel
c501899185 Ported lots of stuff from gitlab.com/dr.sybren/flamenco-worker-go
Much isn't working though.
2022-01-28 17:02:50 +01:00
Sybren A. Stüvel
28a56f3d91 Store workers in database when registering 2022-01-28 15:31:39 +01:00