feat: background workers = non-HTTP workers with shared state#2287
feat: background workers = non-HTTP workers with shared state#2287nicolas-grekas wants to merge 1 commit intophp:mainfrom
Conversation
e1655ab to
867e9b3
Compare
|
Interesting approach to parallelism, what would be a concrete use case for only letting information flow one way from the sidekick to the http workers? Usually the flow would be inverted, where a http worker offloads work to a pool of 'sidekick' workers and can optionally wait for a task to complete. |
da54ab8 to
a06ba36
Compare
|
Thank you for the contribution. Interesting idea, but I'm thinking we should merge the approach with #1883. The kind of worker is the same, how they are started is but a detail. @nicolas-grekas the Caddyfile setting should likely be per |
ad71bfe to
05e9702
Compare
|
@AlliBalliBaba The use case isn't task offloading (HTTP->worker), but out-of-band reconfigurability (environment->worker->HTTP). Sidekicks observe external systems (Redis Sentinel failover, secret rotation, feature flag changes, etc.) and publish updated configuration that HTTP workers pick up on their next request; with per-request consistency guaranteed via Task offloading (what you describe) is a valid and complementary pattern, but it solves a different problem. The non-HTTP worker foundation here could support both. @henderkes Agreed that the underlying non-HTTP worker type overlaps with #1883. The foundation (skip HTTP startup/shutdown, immediate readiness, cooperative shutdown) is the same. The difference is the API layer and the DX goals:
Happy to follow up with your proposals now that this is hopefully clarified. |
05e9702 to
8a56d4c
Compare
|
Great PR! Couldn't we create a single API that covers both use case? We try to keep the number of public symbols and config option as small as possible! |
Yes, that's why I'd like to unify the two API's and background implementations into one. Unfortunately the first task worker attempt didn't make it into |
|
The PHP-side API has been significantly reworked since the initial iteration: I replaced The old design used
Key improvements:
Other changes:
|
cb65f46 to
4dda455
Compare
|
Thanks @dunglas and @henderkes for the feedback. I share the goal of keeping the API surface minimal. Thinking about it more, the current API is actually quite small and already general:
The name "sidekick" works as a generic concept: a helper running alongside. The current set_vars/get_vars protocol covers the config-publishing use case. For task offloading (HTTP->worker) later, the same sidekick infrastructure could support:
Same worker type, same So the path would be:
The foundation (non-HTTP threads, cooperative shutdown, crash recovery, per-php_server scoping) is shared. Only the communication primitives differ. WDYT? |
b3734f5 to
ed79f46
Compare
|
|
|
Hmm, it seems they are on some versions, for example here: https://github.com/php/frankenphp/actions/runs/23192689128/job/67392820942?pr=2287#step:10:3614 For the cache, I'm not aware of a Github feature that allow to clear everything unfortunately 🙁 |
6d09848 to
feb2492
Compare
|
Some important changes to note from previous iteration:
Instead of polling a boolean, sidekicks now get a stream resource that becomes readable on shutdown. This integrates with
Values can be This allows richer data types without resorting to implicit serialize - see reasons above for why I think this would be a bad thing. FTR, Symfony 8.1 will provide a
This wasn't fully implemented, now: when
When copying persistent vars to request memory in Doc updated. I think this covers the sidekick use case really well. And indeed, as @henderkes noted, I want sidekick to be fully reactive. No mandatory polling. Same for tasks BTW. I'm going to give them a try now. |
6e422fa to
85bd7d9
Compare
aa19e57 to
875337e
Compare
|
Back to Some important changes since the last update:
When opcache produces an
Strings that live in shared memory (
Each
Data race on
Extracted Error messages now consistently mention enums in validation errors. |
875337e to
70c91d6
Compare
70c91d6 to
af6d34e
Compare
|
Thanks for all the feedback so far. I heard the naming concern and explored renaming "sidekick" to "worker". I also wondered about auto-starting and pre-declaring workers. So here we are:
The php_server {
worker public/index.php { num 4 }
worker bin/worker.php { background; match config-watcher }
worker bin/worker.php { background } # catch-all for dynamic names
}PHP API is now
Same keyword as HTTP workers, with this difference: HTTP workers match on URL path, background workers match on worker name. Named background workers are auto-started at boot. A worker without Name values are validated at config parse time - only alphanumeric, hyphens, underscores, and colons allowed. Typos like
Named background workers (those with
Background workers get dedicated thread slots outside the global Docs updated at PR description above also updated. 🚀 |
af6d34e to
24679d9
Compare
| function background_worker_should_stop(float $timeout = 0): bool | ||
| { | ||
| static $signalingStream; | ||
| $signalingStream ??= frankenphp_worker_get_signaling_stream(); | ||
| $s = (int) $timeout; | ||
|
|
||
| return match (@stream_select(...[[$signalingStream], [], [], $s, (int) (($timeout - $s) * 1e6)])) { | ||
| 0 => false, // timeout | ||
| false => true, // error (pipe closed) = stop | ||
| default => "stop\n" === fgets($signalingStream), | ||
| }; | ||
| } |
There was a problem hiding this comment.
IMO doing this via stream/pipe instead of with just a go channel still feels too clunky. As I understood the idea is to be able to integrate the stream into the Amp/React event-loop.
As @henderkes mentioned, not sure where async PHP is going in the near future and even resources might go away in a future PHP version. So I'd rather have this abstracted away into a single function call somehow.
Easiest to do would be just selecting over the worker drain channel a sleep timeout, but I understand that will block the entire thread. If we really want to support an async select, I'd rather have a dedicated way to let the 2 event loops communicate, that is generic enough so we can change it later on. Haven't though too much about how that would look like yet.
| reservedThreads := 0 | ||
| for _, w := range f.Workers { | ||
| if !w.Background { | ||
| continue | ||
| } | ||
| if len(w.MatchPath) > 0 { | ||
| // Named: one thread per match name | ||
| reservedThreads += len(w.MatchPath) | ||
| } else { | ||
| // Catch-all: reserve up to the safety cap (default 16) | ||
| maxW := w.MaxThreads | ||
| if maxW <= 0 { | ||
| maxW = 16 | ||
| } | ||
| reservedThreads += maxW | ||
| } | ||
| } | ||
| if reservedThreads > 0 { |
There was a problem hiding this comment.
Instead of explicitly reserving threads, couldn't we just set the default to
num_threads 0
max_threads 1
Then the workers would just behave as any other worker in terms of thread number and it would be possible to have the thread running instantly or start multiple of the same worker.
num_threads 0 => worker starts lazily
num_threads 1 => worker starts immediately
num_threads 2 => multiple threads (for future use cases)
| char *error = go_frankenphp_start_background_worker( | ||
| thread_index, Z_STRVAL_P(names), Z_STRLEN_P(names)); | ||
| if (error) { | ||
| zend_throw_exception(spl_ce_RuntimeException, error, 0); | ||
| free(error); | ||
| RETURN_THROWS(); | ||
| } | ||
|
|
||
| char *name_ptr = Z_STRVAL_P(names); | ||
| size_t name_len_val = Z_STRLEN_P(names); | ||
| void *vars_ptr = NULL; | ||
| unsigned long out_version = 0; | ||
| error = go_frankenphp_worker_wait_and_get(thread_index, &name_ptr, | ||
| &name_len_val, 1, timeout_ms, | ||
| &vars_ptr, NULL, &out_version); |
There was a problem hiding this comment.
You can probably combine go_frankenphp_start_background_worker and go_frankenphp_worker_wait_and_get in one function call.
24679d9 to
ad5e484
Compare
ad5e484 to
7ffba57
Compare
| free(error); | ||
| RETURN_THROWS(); | ||
| } | ||
| bg_worker_free_stored_vars(old); |
There was a problem hiding this comment.
Freeing these persistent zvals is probably unsafe since other threads might still be referencing them. Not sure you can make this fully safe and-non leaky with persistent vals.
The safe solution is to probably copy the zval directly from thread-local memory to thread-local memory.
Hmm actually not entirely sure, would need to look into how the parallel extension does this.
| func (registry *BackgroundWorkerRegistry) startAutoWorkers() error { | ||
| for _, name := range registry.autoStartNames { | ||
| if err := startBackgroundWorkerWithRegistry(registry, name); err != nil { | ||
| return fmt.Errorf("failed to auto-start background worker %q: %w", name, err) | ||
| } | ||
| } | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Not sure I fully understand the 'auto start names'. Couldn't you just re-use the worker name itself? Also would be nice to reuse num and max_threads to determine which workers should be started immediately or lazily as mentioned before.
| timeout := time.Duration(timeoutMs) * time.Millisecond | ||
| done := make(chan error, 1) | ||
| go func() { | ||
| timer := time.NewTimer(timeout) | ||
| defer timer.Stop() | ||
| for i, sk := range sks { | ||
| select { | ||
| case <-sk.ready: | ||
| // background worker has called set_vars | ||
| case <-timer.C: | ||
| done <- fmt.Errorf("timeout waiting for background worker: %s", goNames[i]) | ||
| return | ||
| } | ||
| } | ||
| done <- nil | ||
| }() | ||
|
|
||
| if err := <-done; err != nil { | ||
| return C.CString(err.Error()) | ||
| } |
There was a problem hiding this comment.
Couldn't you just read the variables directly as long as they have been set once (RWLock)? Would be a lot more efficient than channels+goroutines.
A timeout should just be necessary the first time when starting lazily.
If a background worker only gets read after first calling set_var, you just need to check if they are ready or wait for their ready state.
| func go_frankenphp_worker_release_vars(threadIndex C.uintptr_t) { | ||
| idx := int(threadIndex) | ||
| stack := lockedVarsStacks[idx] | ||
| if len(stack) == 0 { | ||
| return | ||
| } | ||
| sks := stack[len(stack)-1] | ||
| lockedVarsStacks[idx] = stack[:len(stack)-1] | ||
|
|
||
| for _, sk := range sks { | ||
| sk.mu.RUnlock() | ||
| } | ||
| } |
There was a problem hiding this comment.
Ideally locking/unlocking would happen in a single go function call, this feels a bit convoluted.
C->go->C is fine to do btw.
| if !ok || handler.worker.httpEnabled || handler.worker.backgroundWorker == nil { | ||
| return C.CString("frankenphp_worker_set_vars() can only be called from a background worker") | ||
| } |
There was a problem hiding this comment.
I wonder if HTTP workers could have their own global variables, there probably are possible use cases. Maybe calling get_vars without a worker name gets the vars of the current worker.
get_vars potentially lazily starting workers instead of just returning null makes this more awkward though.
| char **name_ptrs = emalloc(sizeof(char *) * name_count); | ||
| size_t *name_lens_arr = emalloc(sizeof(size_t) * name_count); | ||
| void **vars_ptrs = emalloc(sizeof(void *) * name_count); | ||
| int idx = 0; | ||
| ZEND_HASH_FOREACH_VAL(ht, val) { | ||
| name_ptrs[idx] = Z_STRVAL_P(val); | ||
| name_lens_arr[idx] = Z_STRLEN_P(val); | ||
| idx++; | ||
| } | ||
| ZEND_HASH_FOREACH_END(); |
There was a problem hiding this comment.
I wonder if we should even allow getting multiple global variables at once. Makes all of this a lot more complex. Probably could make this PR a lot smaller by just allowing one get_var per function call, better to start out easy.
Multiple function calls might even be more efficient since you have to do less allocations. You can't start bg workers in parallel in that case though, at least not lazily.
| if (worker_bg_name != NULL) { | ||
| CG(skip_shebang) = 1; | ||
|
|
||
| zend_is_auto_global_str("_SERVER", sizeof("_SERVER") - 1); | ||
| zval *server = &PG(http_globals)[TRACK_VARS_SERVER]; | ||
| if (server && Z_TYPE_P(server) == IS_ARRAY) { | ||
| zval argv_array; | ||
| array_init(&argv_array); | ||
| add_next_index_string(&argv_array, file_name); | ||
| add_next_index_string(&argv_array, worker_bg_name); | ||
|
|
||
| zval argc_zval; | ||
| ZVAL_LONG(&argc_zval, 2); | ||
|
|
||
| zend_hash_str_update(Z_ARRVAL_P(server), "argv", sizeof("argv") - 1, | ||
| &argv_array); | ||
| zend_hash_str_update(Z_ARRVAL_P(server), "argc", sizeof("argc") - 1, | ||
| &argc_zval); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Do you actually need the worker name in argv and argc here if it's already in $_SERVER? What do you need the filename for?
| opts = append(opts, WithRequestPreparedEnv(PreparedEnv{ | ||
| "FRANKENPHP_WORKER_NAME\x00": worker.name, | ||
| })) |
There was a problem hiding this comment.
We could probably add FRANKENPHP_WORKER_NAME to all workers, not just background workers. You could even initialize key and value here as persistent zend_strings, so the registration cost is negligible.
| } | ||
|
|
||
| if !worker.httpEnabled { | ||
| handler.thread.state.Set(state.Ready) |
There was a problem hiding this comment.
Probably should reach ready state in markBackgroundReady instead of here. Allows better control over when the bg worker is actually ready.
| onThreadReady func(int) | ||
| onThreadShutdown func(int) | ||
| queuedRequests atomic.Int32 | ||
| httpEnabled bool |
There was a problem hiding this comment.
isBackgroundWorker probably makes more sense at this point.
| httpEnabled bool | |
| isBackgroundWorker bool |
|
Didn't get through everything, might continue at a later point. |
Summary
Background workers are long-running PHP workers that run outside the HTTP cycle. They observe their environment (Redis, DB, filesystem, etc.) and publish configuration that HTTP workers read per-request - enabling real-time reconfiguration without restarts or polling.
PHP API
frankenphp_worker_set_vars(array $vars)- publishes config from a background worker (persistent memory, cross-thread)frankenphp_worker_get_vars(string|array $name, float $timeout = 30.0)- reads config from HTTP workers (blocks until first publish, version-cached)frankenphp_worker_get_signaling_stream()- returns a pipe-based stream forstream_select()integration (replaces polling with event-driven shutdown)Caddyfile configuration
backgroundmarks a worker as non-HTTPmatchspecifies exact worker names (not URL paths); named workers auto-start at boot, before HTTP workersmatch= catch-all for lazy-started names viaget_vars(). Not declaring a catch-all forbids lazy-started ones.max_threadson catch-all sets a safety cap for lazy-started instances (defaults to 16)numandmax_threadsrejected on named background workers (pooling is a future feature)max_consecutive_failuresdefaults to -1 (never panic)Architecture
BackgroundWorkerRegistryperphp_serverfor isolation and at-most-once semanticspemalloc) withRWMutexfor safe cross-thread sharingvarsVersionatomic counter) to skip unchanged dataIS_ARRAY_IMMUTABLE)ZSTR_IS_INTERNED) - skip copy/free for shared memory stringsstream_select()- compatible with amphp/ReactPHP event loopsExample: Redis config watcher
Test coverage
14 tests covering: basic vars, at-most-once start, validation, type support (enums, binary-safe strings), multiple background workers, crash restart, signaling stream, worker restart lifecycle, non-background-worker error handling.
All tests pass on PHP 8.2, 8.3, 8.4, and 8.5 with
-race.Documentation
Full docs at
docs/background-workers.md.