For development of the web interface, to get a less predictable order of
asynchronous requests, the API responses were artificially delayed. This
was supposed to be optional, to be enabled via the `-delay` CLI argument,
but somehow the optionalness either never made it in or was mysteriously
removed.
Add a `-pprof` CLI option to enable the profiler. It will expose profiler
info on the web interface at `/debug/pprof/`.
To have a nice view of this, including flame graphs, run:
```
go tool pprof -http localhost:8082 http://localhost:8080/debug/pprof/profile
```
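As a rough illustration of how such an opt-in flag can work, the sketch below registers the standard `net/http/pprof` handlers on a private mux only when `-pprof` is passed. The names `newMux` and `routePattern` are hypothetical and not taken from the Flamenco sources.

```go
package main

import (
	"flag"
	"fmt"
	"net/http"
	"net/http/pprof"
)

// newMux builds the HTTP routes. The profiler endpoints are only
// registered when enablePprof is true. Importing net/http/pprof also
// registers handlers on http.DefaultServeMux, which is not used here.
func newMux(enablePprof bool) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("placeholder for the regular web interface\n"))
	})
	if enablePprof {
		mux.HandleFunc("/debug/pprof/", pprof.Index)
		mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
		mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
		mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
		mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	}
	return mux
}

// routePattern reports which registered pattern would serve urlPath.
func routePattern(mux *http.ServeMux, urlPath string) string {
	req, err := http.NewRequest(http.MethodGet, urlPath, nil)
	if err != nil {
		return ""
	}
	_, pattern := mux.Handler(req)
	return pattern
}

func main() {
	enablePprof := flag.Bool("pprof", false, "expose profiler info at /debug/pprof/")
	flag.Parse()

	mux := newMux(*enablePprof)
	fmt.Printf("/debug/pprof/ is handled by route %q\n", routePattern(mux, "/debug/pprof/"))
	// A real program would now pass mux to http.ListenAndServe.
}
```

Without the flag, requests to `/debug/pprof/` fall through to the catch-all route, so no profiler data is exposed by default.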
Build with `make stresser`. Run with
`./stresser -worker UUID -secret ABCXYZ`.
The worker ID and secret can be obtained from
`flamenco-worker-credentials.yaml`. If left empty, the stresser will
register as a new worker, and log the credentials to be used on the next
invocation.
This adds a `-wizard` CLI option to the Manager, which opens a web browser
and shows the First-Time Wizard to aid in configuration of Flamenco.
This is work in progress. The wizard is just one page, and doesn't save
anything yet to the configuration.
Rather than just print the error message ("error creating UPnP/SSDP
server"), it now explains what the effect is of this error (workers
unable to automatically find this Manager) and how to solve it (pass
`-manager URL` to the Worker).
The task logs storage system is refactored to use the `local_storage`
package. Configuration options have also changed:
- `task_logs_path` is renamed to `local_manager_storage_path`, to
emphasise that only the Manager deals with those files, with default
value `./flamenco-manager-storage`.
- `storage_path` is renamed to `shared_storage_path`, to emphasise this
is the storage shared between Manager and Workers, with default value
`./flamenco-shared-storage`.
Task logs are still stored in
`${local_manager_storage_path}/job-{jobUUID[0:4]}/{jobUUID}/task-{taskUUID}.txt`
Manifest task: T99409
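The path layout above can be sketched as a small helper. This is only an illustration of the pattern from this message; the function name `taskLogPath` is made up, and the guard for short job UUIDs is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// taskLogPath builds the task log location following the layout
// {storage}/job-{jobUUID[0:4]}/{jobUUID}/task-{taskUUID}.txt
func taskLogPath(localManagerStoragePath, jobUUID, taskUUID string) string {
	prefix := jobUUID
	if len(prefix) > 4 {
		// Only the first four characters are used for the bucket directory.
		prefix = prefix[:4]
	}
	return filepath.Join(
		localManagerStoragePath,
		"job-"+prefix,
		jobUUID,
		"task-"+taskUUID+".txt",
	)
}

func main() {
	fmt.Println(taskLogPath("./flamenco-manager-storage", "0123456789ab", "cdef"))
}
```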
When installing, Blender will just unzip directly into the addons dir,
so the ZIP has to contain the `flamenco` package directory.
This also makes things simpler, naming-wise. We can offer the addon from
the Manager web interface as `flamenco3-addon.zip`, and still have it
install into the `addons/flamenco` directory.
This makes it possible to start Flamenco Worker at Blender Studio with
a worker-local current working directory, with the executable in a shared
filesystem.
`os.IsNotExist()` is from before `errors.Is()` existed. The latter is the
recommended approach, as it also recognises wrapped errors.
No functional changes, except for recognising more cases of "does not
exist" errors as such.
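The difference shows up as soon as an error is wrapped with `%w`. The helper names in this sketch (`openWrapped`, `isNotExistOld`, `isNotExistNew`) are hypothetical and only there to put both checks side by side.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// openWrapped tries to open a file and wraps any error with extra context.
func openWrapped(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return fmt.Errorf("loading %s: %w", path, err)
	}
	return f.Close()
}

// isNotExistOld uses the older check, which does not look through
// errors wrapped with fmt.Errorf.
func isNotExistOld(err error) bool { return os.IsNotExist(err) }

// isNotExistNew uses errors.Is, which walks the chain of wrapped errors.
func isNotExistNew(err error) bool { return errors.Is(err, fs.ErrNotExist) }

func main() {
	err := openWrapped("definitely-missing-file.txt")
	fmt.Println("os.IsNotExist:", isNotExistOld(err))
	fmt.Println("errors.Is:    ", isNotExistNew(err))
}
```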
Vue Router generates URLs for which there are no static files on the
filesystem (like `/jobs/{job ID}`). To make this work, the webapp's
`index.html` has to be served for such requests. The client-side JavaScript
then figures out how things fit together, and can even render a nice 404
page if necessary.
This shouldn't happen for non-webapp URLs, though. Because of this, the
entire webapp (including the "serve `index.html` if file not found" logic)
is moved to a `/app/` base URL.
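A minimal sketch of this idea, not the actual Manager code: a handler that serves static files and falls back to `index.html` for unknown paths, mounted under `/app/` so that other URLs still get a regular 404. The names `spaHandler`, `newRoutes` and `get` are hypothetical, and the in-memory filesystem stands in for the embedded webapp files.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"net/http"
	"net/http/httptest"
	"path"
	"strings"
	"testing/fstest"
)

// spaHandler serves files from webapp. Paths that do not exist as files
// get index.html, so that the client-side router can handle them.
func spaHandler(webapp fs.FS) http.Handler {
	fileServer := http.FileServer(http.FS(webapp))
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		name := strings.TrimPrefix(path.Clean("/"+r.URL.Path), "/")
		if name == "" {
			name = "."
		}
		if _, err := fs.Stat(webapp, name); errors.Is(err, fs.ErrNotExist) {
			// Rewrite to the root, for which the file server sends index.html.
			fallback := r.Clone(r.Context())
			fallback.URL.Path = "/"
			fallback.URL.RawPath = ""
			fileServer.ServeHTTP(w, fallback)
			return
		}
		fileServer.ServeHTTP(w, r)
	})
}

// newRoutes mounts the webapp at /app/ only; everything else is a 404.
func newRoutes() http.Handler {
	webapp := fstest.MapFS{
		"index.html": {Data: []byte("<html>app</html>")},
		"main.js":    {Data: []byte("console.log('hi')")},
	}
	mux := http.NewServeMux()
	mux.Handle("/app/", http.StripPrefix("/app", spaHandler(webapp)))
	return mux
}

// get performs an in-memory GET request and returns status code and body.
func get(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	routes := newRoutes()
	for _, target := range []string{"/app/jobs/1234", "/app/main.js", "/other/unknown"} {
		code, body := get(routes, target)
		fmt.Printf("%-16s -> %d %q\n", target, code, body)
	}
}
```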
`make flamenco-manager` now also builds the webapp and embeds the static
files into the binary.
`make flamenco-manager_race` does NOT rebuild the static web files, to
help speed up debug cycles. Run `make webapp-static` to rebuild the
webapp itself, if necessary, or run a separate web development server with
`yarn --cwd web/app run dev --host`.
Add a handler for the OpenAPI `taskOutputProduced` operation, and an
image thumbnailing goroutine.
The queue of images to process and the function that handles queued images
are managed by `last_rendered.LastRenderedProcessor`. This queue currently
simply allows 3 requests; this should be improved such that it keeps
track of the job IDs as well, as with the current approach a spammy job
can starve the updates from a more calm job.
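One way to picture such a bounded queue is a buffered channel with a non-blocking send. This is a sketch under the assumption that the queue is a channel with capacity 3; the type `thumbQueue` and its methods are hypothetical and do not mirror the real `LastRenderedProcessor` API.

```go
package main

import "fmt"

// queueSize is the number of pending images the queue accepts.
const queueSize = 3

// thumbQueue is a hypothetical stand-in for the image processing queue.
type thumbQueue struct {
	queue chan string
}

func newThumbQueue() *thumbQueue {
	return &thumbQueue{queue: make(chan string, queueSize)}
}

// tryEnqueue queues an image without blocking. It returns false when the
// queue is full, so the caller can reject the request instead of waiting.
func (q *thumbQueue) tryEnqueue(imagePath string) bool {
	select {
	case q.queue <- imagePath:
		return true
	default:
		return false
	}
}

// run handles queued images until the queue is closed.
func (q *thumbQueue) run(handle func(imagePath string)) {
	for imagePath := range q.queue {
		handle(imagePath)
	}
}

func main() {
	q := newThumbQueue()
	for i := 1; i <= 4; i++ {
		name := fmt.Sprintf("frame-%d.png", i)
		fmt.Println(name, "queued:", q.tryEnqueue(name))
	}
	close(q.queue)
	q.run(func(imagePath string) { fmt.Println("thumbnailing", imagePath) })
}
```

Since the queue does not know which job an image belongs to, a job that fills all slots blocks everyone else, which is the starvation issue mentioned above.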
The OpenAPI library we use for request validation needs to know per mime
type how to handle the contents. The same function for
`application/octet-stream` is now used for `image/png` and `image/jpeg`
as well.
Change "accepted CORS origins" to "acceptable CORS origins", as the former
is too ambiguous (it can mean "I just accepted these" or "These are the
acceptable ones").
Within the shutdown procedure, signing off is now the last thing the
worker does. This makes things more consistent from the Manager's point
of view (like receiving last-second log entries while the Worker is still
online).
Requeueing the tasks of a specific worker is now done in the
`TaskStateMachine`, such that it can be called from other services as
well in future commits.
This also makes the `LogStorage` service a dependency of the
`TaskStateMachine`, as it needs to write "this task was requeued" kind
of messages to the task logs.
Tasks that are in state `active` but haven't been 'touched' by a Worker
for 10 minutes or longer will transition to state `failed`.
In the future, it might be better to move the decision about which state
is suitable to the Task State Machine service, so that it can be smarter
and take the history of the task into account. Going to `soft-failed`
first might be a nice touch.
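The timeout rule can be sketched as follows. The `task` struct, `failTimedOutTasks` and `exampleRun` are hypothetical names for illustration; the real check works on the Manager's database and goes through the Task State Machine.

```go
package main

import (
	"fmt"
	"time"
)

// taskTimeout is how long an active task may go untouched.
const taskTimeout = 10 * time.Minute

// task is a simplified, hypothetical stand-in for a persisted task.
type task struct {
	ID          string
	Status      string
	LastTouched time.Time
}

// failTimedOutTasks marks active tasks as failed when they have not been
// touched for taskTimeout or longer. It returns the IDs of those tasks.
func failTimedOutTasks(tasks []*task, now time.Time) []string {
	var timedOut []string
	for _, t := range tasks {
		if t.Status != "active" {
			continue
		}
		if now.Sub(t.LastTouched) >= taskTimeout {
			t.Status = "failed"
			timedOut = append(timedOut, t.ID)
		}
	}
	return timedOut
}

// exampleRun applies the check to a few tasks at a fixed point in time.
func exampleRun() []string {
	now := time.Date(2022, 6, 1, 12, 0, 0, 0, time.UTC)
	tasks := []*task{
		{ID: "fresh", Status: "active", LastTouched: now.Add(-2 * time.Minute)},
		{ID: "stale", Status: "active", LastTouched: now.Add(-10 * time.Minute)},
		{ID: "waiting", Status: "queued", LastTouched: now.Add(-time.Hour)},
	}
	return failTimedOutTasks(tasks, now)
}

func main() {
	fmt.Println("timed out:", exampleRun())
}
```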
In the future different services will write to the task log, and thus
it makes sense to move the responsibility of prepending the timestamps
to the log storage service.
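To illustrate the idea, here is a small sketch in which the storage itself prefixes every line, so callers only pass the raw text. The type `logStorage`, the in-memory map, the injectable clock and the RFC 3339 timestamp format are all assumptions for the sake of the example.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// logStorage is a hypothetical in-memory log store that owns timestamping.
type logStorage struct {
	now  func() time.Time
	logs map[string]string
}

func newLogStorage(clock func() time.Time) *logStorage {
	return &logStorage{now: clock, logs: map[string]string{}}
}

// Write appends text to the task's log, prefixing each line with a timestamp.
func (s *logStorage) Write(taskID, text string) {
	stamp := s.now().UTC().Format(time.RFC3339)
	var sb strings.Builder
	for _, line := range strings.Split(strings.TrimRight(text, "\n"), "\n") {
		sb.WriteString(stamp + " " + line + "\n")
	}
	s.logs[taskID] += sb.String()
}

// Read returns everything logged for the task so far.
func (s *logStorage) Read(taskID string) string {
	return s.logs[taskID]
}

// fixedClock returns a clock that always reports the same moment.
func fixedClock(unixSeconds int64) func() time.Time {
	return func() time.Time { return time.Unix(unixSeconds, 0) }
}

func main() {
	storage := newLogStorage(time.Now)
	storage.Write("task-1", "rendering frame 1\nrendering frame 2\n")
	storage.Write("task-1", "task was requeued")
	fmt.Print(storage.Read("task-1"))
}
```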
Add `-delay` CLI argument, which adds a random delay of around 250ms to
all HTTP responses.
The web interface is quite asynchronous in nature, and having more
randomness and more visible delays in there will help development.
This CLI argument should not be used in production.
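A minimal sketch of such an opt-in delay, assuming it is implemented as HTTP middleware. The 200 to 300 ms range and the names `randomDelay` and `withRandomDelay` are assumptions; only the `-delay` flag and the "around 250ms" figure come from this message.

```go
package main

import (
	"flag"
	"fmt"
	"math/rand"
	"net/http"
	"net/http/httptest"
	"time"
)

// randomDelay returns a duration in [base-jitter, base+jitter).
// jitter must be greater than zero.
func randomDelay(base, jitter time.Duration) time.Duration {
	return base - jitter + time.Duration(rand.Int63n(int64(2*jitter)))
}

// withRandomDelay sleeps for a random time before handling each request.
func withRandomDelay(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(randomDelay(250*time.Millisecond, 50*time.Millisecond))
		next.ServeHTTP(w, r)
	})
}

func main() {
	delay := flag.Bool("delay", false, "add a random delay to all HTTP responses (development only)")
	flag.Parse()

	var handler http.Handler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "{}")
	})
	if *delay {
		// The middleware is only installed when explicitly requested.
		handler = withRandomDelay(handler)
	}

	start := time.Now()
	handler.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/example", nil))
	fmt.Println("request took", time.Since(start))
}
```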
Check for jobs in 'cancel-requested' or 'requeued' statuses, and ensure
they transition to the right status. This happens at startup, before
even starting the web interface, so that a consistent state is presented.
Only the `ReadHeaderTimeout` is set. `ReadTimeout` is not set, as this is
quite specific per request. Shaman file uploads and websocket connections
should be allowed to run quite long, whereas other queries should be
relatively short.
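In `net/http` terms this comes down to something like the sketch below. The 15-second value and the helper name `newServer` are assumptions; the message does not state the actual timeout.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newServer creates an HTTP server that limits how long a client may take
// to send the request headers, without limiting the request body.
func newServer(addr string, handler http.Handler) *http.Server {
	return &http.Server{
		Addr:    addr,
		Handler: handler,

		// Assumed value; guards against clients that never finish
		// sending their headers.
		ReadHeaderTimeout: 15 * time.Second,

		// ReadTimeout is deliberately left at zero (no limit), as file
		// uploads and websocket connections may run for a long time.
	}
}

func main() {
	srv := newServer("localhost:8080", http.NotFoundHandler())
	fmt.Println("ReadHeaderTimeout:", srv.ReadHeaderTimeout)
	fmt.Println("ReadTimeout:      ", srv.ReadTimeout)
	// A real program would now call srv.ListenAndServe().
}
```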
Completely flush the upstream buffer at startup, before attempting to
fetch a new task. These updates could impact any task on the Manager side,
and first flushing the buffer before appending new updates also seems
like a good idea.
Remove the hard-coded list of allowed CORS origins, and build it
dynamically from the list of "own URLs", i.e. the URLs at which the
Manager expects to be available.
This list of "own URLs" is constructed from the available network
interfaces.
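Roughly, this could look like the sketch below. The function names, the `http` scheme and the exact-match origin check are assumptions; the real implementation may filter interfaces or handle ports differently.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// localAddresses returns the IP addresses of the available network interfaces.
func localAddresses() ([]string, error) {
	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return nil, err
	}
	var result []string
	for _, addr := range addrs {
		ipNet, ok := addr.(*net.IPNet)
		if !ok {
			continue
		}
		result = append(result, ipNet.IP.String())
	}
	return result, nil
}

// ownURLs turns addresses into the URLs at which the service expects to be
// reachable. net.JoinHostPort adds the brackets that IPv6 addresses need.
func ownURLs(addresses []string, port int) []string {
	urls := make([]string, 0, len(addresses))
	for _, addr := range addresses {
		urls = append(urls, "http://"+net.JoinHostPort(addr, strconv.Itoa(port)))
	}
	return urls
}

// isAcceptableOrigin reports whether origin is one of the acceptable origins.
func isAcceptableOrigin(origin string, acceptable []string) bool {
	for _, candidate := range acceptable {
		if origin == candidate {
			return true
		}
	}
	return false
}

func main() {
	addresses, err := localAddresses()
	if err != nil {
		fmt.Println("unable to list network interfaces:", err)
		return
	}
	for _, url := range ownURLs(addresses, 8080) {
		fmt.Println("acceptable CORS origin:", url)
	}
}
```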
This adds a JS client for the OAPI interface, and introduces the SocketIO
stuff into Flamenco Manager itself.
To build & run:
- in `web/manager-api` run `npm install`
- in `web/manager-api` run `npm link`
- in `web/app` run `npm install`
- in `web/app` run `npm link flamenco-manager`
- in `web/app` run `yarn serve`
This may not be a complete list, but at least some of those steps are
necessary.
This introduces some more conceptual changes to Shaman. The most important
one is that there is no longer a "checkout ID", but a "checkout path".
The Shaman client can request any subpath of the checkout directory,
so that it can handle things like project- or scene-specific prefixes.
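Accepting a client-supplied subpath means the server has to make sure it stays inside the checkout directory. The sketch below shows one way to do that with `filepath.IsLocal` (Go 1.20 and newer). It is an assumption about how such a check could look, not Shaman's actual code, and `resolveCheckoutPath` is a hypothetical name.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveCheckoutPath returns the directory for the requested checkout path,
// or an error when that path would end up outside checkoutRoot.
func resolveCheckoutPath(checkoutRoot, checkoutPath string) (string, error) {
	// IsLocal rejects empty paths, absolute paths, and paths that use ".."
	// to climb out of the directory they are joined onto.
	if !filepath.IsLocal(checkoutPath) {
		return "", fmt.Errorf("checkout path %q is not inside the checkout directory", checkoutPath)
	}
	return filepath.Join(checkoutRoot, checkoutPath), nil
}

func main() {
	for _, requested := range []string{"project/scene-a", "../elsewhere", "/etc"} {
		resolved, err := resolveCheckoutPath("/shared/checkout", requested)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", resolved)
	}
}
```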