@AlliBalliBaba
Contributor

This is an alternative to #1552. Instead of adding a new directive, this introduces a way to match workers against paths, which allows placing the worker anywhere outside of the public directory.

wip

not sure yet if this is better:

php_server {
    match * worker index.php
}

or this:

php_server {
    worker {
        file index.php
        match *
    }
}

performance:

default php_server with worker
[image: flamegraph-default]

php_server with worker and try_files {path} index.php
[image: flamegraph]

php_server with worker and match * (minimizes file operations)
[image: flamegraph-match]

@AlliBalliBaba AlliBalliBaba marked this pull request as draft June 11, 2025 18:23
@withinboredom
Member

I think I prefer the latter of the two options, but I can see the benefits of the first one too.

@AlliBalliBaba
Contributor Author

Thinking about it, the first option makes it maybe clearer that this is only possible inside of php_server.

@nickchomey

nickchomey commented Jun 11, 2025

FYI, you might want to explore using pprof's base flag, which allows for showing a flamegraph diff between two profiles.

I think this is the syntax

go tool pprof -http=:8080 -diff_base=base.prof new.prof

You could do 1v2 and 1v3 to easily see where and how much the difference is

@AlliBalliBaba
Contributor Author

I've managed to create a differential flamegraph via this guide

# first profile
go tool pprof -raw -output=cpu1.txt 'http://localhost:2019/debug/pprof/profile?seconds=20'

# second profile
go tool pprof -raw -output=cpu2.txt 'http://localhost:2019/debug/pprof/profile?seconds=20'

# diff flamegraph
./stackcollapse-go.pl ./cpu1.txt > out.folded1
./stackcollapse-go.pl ./cpu2.txt > out.folded2
./difffolded.pl -n out.folded2 out.folded1 | ./flamegraph.pl > /go/src/app/flamegraph.svg

A bit hard to make out what's the difference between 'red' and 'blue' though
worker vs match worker

flamegraph
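
For readers puzzling over the colours: as far as I know, Brendan Gregg's FlameGraph tooling colours a frame red when its sample count grew and blue when it shrank, comparing the second file passed to `difffolded.pl` against the first. The Go sketch below reproduces only that per-stack subtraction on folded stacks. It is an illustration with made-up sample data and hypothetical helper names (`parseFolded`, `stackDelta`, `colour`), and it skips the `-n` normalization step used in the command above.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseFolded reads folded stacks, one "frame;frame;frame count" entry per line.
func parseFolded(text string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		i := strings.LastIndexByte(line, ' ')
		if i < 0 {
			continue // not a "stack count" line
		}
		n, err := strconv.Atoi(line[i+1:])
		if err != nil {
			continue // count is not a number, skip the line
		}
		counts[line[:i]] += n
	}
	return counts
}

// stackDelta returns second-minus-first sample counts for every stack.
func stackDelta(first, second map[string]int) map[string]int {
	delta := make(map[string]int)
	for stack, n := range second {
		delta[stack] += n
	}
	for stack, n := range first {
		delta[stack] -= n
	}
	return delta
}

// colour names the hue assumed for a given delta (red = grew, blue = shrank).
func colour(d int) string {
	switch {
	case d > 0:
		return "red"
	case d < 0:
		return "blue"
	default:
		return "neutral"
	}
}

func main() {
	// Made-up folded stacks standing in for the two profiles.
	first := parseFolded("main;serve;stat 10\nmain;serve;php 30\n")
	second := parseFolded("main;serve;stat 4\nmain;serve;php 30\nmain;gc 5\n")
	delta := stackDelta(first, second)

	stacks := make([]string, 0, len(delta))
	for s := range delta {
		stacks = append(stacks, s)
	}
	sort.Strings(stacks)
	for _, s := range stacks {
		fmt.Printf("%-18s %+d (%s)\n", s, delta[s], colour(delta[s]))
	}
}
```

Under that reading, with `difffolded.pl -n out.folded2 out.folded1` the first file is the second profile, so red frames would be the ones that cost more in `cpu1.txt` than in `cpu2.txt`.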

@henderkes
Contributor

I'm very strongly in favour of option one, simply because it lets match work in a general scope, for example for assets as well (php_server also has a file_server built in, which would benefit too).

@AlliBalliBaba
Contributor Author

@henderkes what would file_server matching look like in your opinion? Should it 404 or just fall through if nothing matches?

php_server {
    match /path1/* worker /anywhere/worker1.php
    match /path2/* worker /anywhere/worker2.php
    match /assets/* file_server
}

# if nothing matches continue to the next directive

@henderkes
Contributor

If /assets/some_asset.png isn't found, it should 404 of course. But that's the same as falling through to the default behaviour of file_server, is it not?

@AlliBalliBaba
Contributor Author

I mean if it matches neither of the paths /assets/* , /path1/* or /path2/* . Currently, php_server would always respond with a 404.

@henderkes
Contributor

oh, then it should fall through to the default behaviour.

@AlliBalliBaba
Contributor Author

Hmm makes sense, so the order would be:

  • serve {file} if it exists, file_server is enabled, and there are no other match file_server directives
  • apply all match worker and match file_server directives in order
  • fall back to the original try_files implementation
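
The three-step order above can be sketched in Go as a small decision function. This is only an illustration of the proposal as worded in this comment, not FrankenPHP's code: `matchRule`, `patternMatches` and `resolve` are hypothetical names, and the trailing-`*` prefix matching is a simplification of what Caddy's path matcher actually does.

```go
package main

import (
	"fmt"
	"strings"
)

// matchRule is a hypothetical in-memory form of one `match <pattern> <target>` line.
type matchRule struct {
	pattern string // e.g. "/path1/*"
	target  string // e.g. "worker /anywhere/worker1.php" or "file_server"
}

// patternMatches is a simplification: a trailing "*" means prefix match,
// anything else must match exactly.
func patternMatches(pattern, path string) bool {
	if strings.HasSuffix(pattern, "*") {
		return strings.HasPrefix(path, strings.TrimSuffix(pattern, "*"))
	}
	return pattern == path
}

// resolve returns a label for whichever step would handle the request path.
func resolve(path string, rules []matchRule, fileServerEnabled bool, fileExists func(string) bool) string {
	hasFileServerRule := false
	for _, r := range rules {
		if r.target == "file_server" {
			hasFileServerRule = true
		}
	}
	// Step 1: an existing file wins, unless file serving is off or was
	// configured explicitly through a match rule.
	if fileServerEnabled && !hasFileServerRule && fileExists(path) {
		return "serve file"
	}
	// Step 2: first matching rule wins, in configuration order.
	for _, r := range rules {
		if patternMatches(r.pattern, path) {
			return r.target
		}
	}
	// Step 3: nothing matched, so use the existing try_files behaviour.
	return "try_files fallback"
}

func main() {
	rules := []matchRule{
		{"/path1/*", "worker /anywhere/worker1.php"},
		{"/path2/*", "worker /anywhere/worker2.php"},
	}
	onDisk := map[string]bool{"/favicon.ico": true} // stand-in for a real file check
	exists := func(p string) bool { return onDisk[p] }
	for _, p := range []string{"/favicon.ico", "/path1/users", "/other"} {
		fmt.Println(p, "->", resolve(p, rules, true, exists))
	}
}
```

Passing the file check in as a function keeps the sketch free of real disk access. An actual implementation would presumably stat the path under the configured root.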

@henderkes
Contributor

I was thinking more along the lines of

  • at the beginning of the request, loop through match directives in order, if one matches, pass it to that functionality
  • no further changes to the code as it is now

@AlliBalliBaba
Contributor Author

AlliBalliBaba commented Jun 12, 2025

Hmm I'm still not sure about match file_server, are there any cases where it wouldn't just be an abbreviation for a route?
Otherwise there's also a benefit to having match inside of the worker, since it might allow matching global workers.

The reason the file_server needs to go first in the match worker case is to allow the most common use case:
-> file exists: serve file
-> file does not exist: forward request to worker

@henderkes
Contributor

henderkes commented Jun 13, 2025

Otherwise there's also a benefit to having match inside of the worker, since it might allow matching global workers.

That would be quite confusing and might also hurt performance of regularly handled requests, because it has to go through all global workers and try to match the path, even if it was never intended and never produces a match.

The reason the file_server needs to go first in the match worker case is to allow the most common use case:
-> file exists: serve file
-> file does not exist: forward request to worker

But not with the match directive. It explicitly says that all routes under /path1/* should be handled by the worker. There absolutely isn't, and shouldn't be, a file_server involved with that at all. If a user configures /path1/* to be handled by a worker but then expects /path1/image.jpg to be served as a file, that's not our problem, just like it isn't Caddy's problem when someone uses

route {
    @assets path /path1*
    handle @assets {
         rewrite worker1.php
         php
    }
}

and then expects an image in that path1 to be served by a file server.

Hmm I'm still not sure about match file_server, are there any cases where it wouldn't just be an abbreviation for a route?

That's pretty much what it would be. The reason it should exist is something like this:

php_server {
    match /path1/images/* file_server
    match /path1/* worker worker1.php
}

And that's also what I envision the match directive being - an abbreviated version of route + matchers + handle directives, but it carries the same performance benefit that you were after and also allows workers in arbitrary paths.

What I think would happen when a php_server (or php) directive is hit in pseudocode:

(FrankenPHPModule* f) handleRequest(httpRequest r) {
    path := r.URL.Path
    for pattern, handler in range f.matches {
        if path matches pattern
            handler(r)
    }
    // fallthrough to current logic
}
    

@AlliBalliBaba
Contributor Author

(FrankenPHPModule* f) handleRequest(httpRequest r) {
    path := r.URL.Path
    for pattern, handler in range f.matches {
        if path matches pattern
            handler(r)
    }
    // fallthrough to current logic
}

This would not work since we would never hit the php module via try_files. Matching needs to also happen beforehand on the caddy routing level.

That's pretty much what it would be. The reason it should exist is something like this:

php_server {
    match /path1/images/* file_server
    match /path1/* worker worker1.php
}

While I can see the benefit of an abbreviation like this, it wouldn't cover the most common use case, which is
serve file -> fallback to worker (the default in most frameworks with a worker implementation)

So I'm still a bit torn, but now leaning more towards the worker->match direction, also because the implementation is simpler.

php_server {
  worker {
    file /anywhere/worker.php
    match * 
  }
}

@henderkes
Contributor

(FrankenPHPModule* f) handleRequest(httpRequest r) {
    path := r.URL.Path
    for pattern, handler in range f.matches {
        if path matches pattern
            handler(r)
    }
    // fallthrough to current logic
}

This would not work since we would never hit the php module via try_files. Matching needs to also happen beforehand on the caddy routing level.

In that case match ... file_server really doesn't make sense. Unless we were to register a different directive, but that's exactly what I wanted to avoid.

That's pretty much what it would be. The reason it should exist is something like this:
php_server {
    match /path1/images/* file_server
    match /path1/* worker worker1.php
}

While I can see the benefit of an abbreviation like this, it wouldn't cover the most common use case, which is serve file -> fallback to worker (the default in most frameworks with a worker implementation)

If serving files is always tried before our php module handler is even hit, wouldn't the equivalent just be

php_server {
    match /path1/* worker worker1.php
    # fall back to normal logic to see if we can match against a worker or use a regular php thread
}

So I'm still a bit torn, but now leaning more towards the worker->match direction, also because the implementation is simpler.

php_server {
  worker {
    file /anywhere/worker.php
    match * 
  }
}

I really don't like the other way around. It just doesn't make sense. The worker doesn't perform any matching, the worker should never even be queried if it didn't match the request.

@AlliBalliBaba
Contributor Author

Yeah, these two would be equivalent; it depends on how you think about it and how this might be extended in the future:

php_server {
    match /path/* worker /anywhere/worker.php
}

php_server {
  worker {
    file /anywhere/worker.php
    match /path/*
  }
}

@dunglas dunglas force-pushed the feat/worker-matching branch from d3d8405 to b8ad01a Compare June 27, 2025 12:46
@dunglas dunglas self-requested a review June 27, 2025 12:47
Member

@dunglas dunglas left a comment

Is this one ready to be merged? Could you add some docs?

@AlliBalliBaba
Contributor Author

Yes I think this one is ready to merge. I'll add some docs (in this PR or a separate one, whichever you prefer)

…nkenphp into feat/worker-matching

# Conflicts:
#	caddy/module.go
#	caddy/workerconfig.go
#	frankenphp.go
#	phpmainthread_test.go
#	worker.go
@AlliBalliBaba
Contributor Author

Docs are added 👍

@dunglas
Member

dunglas commented Jun 30, 2025

Could you please rebase @AlliBalliBaba?

@AlliBalliBaba
Contributor Author

Merged 👍

@dunglas dunglas merged commit fb10b1e into main Jul 1, 2025
56 of 58 checks passed
@dunglas dunglas deleted the feat/worker-matching branch July 1, 2025 08:27
@dunglas
Member

dunglas commented Jul 1, 2025

Great work, thank you!

Regarding the GitHub issue, I discussed with the PHP team, and they are reluctant to add 2 accounts for the same person. Would you prefer that we remove @AlliBalliBaba and add @AlliBalliBaba2 instead?

@AlliBalliBaba
Contributor Author

Hmm yeah, maybe this account is just irreversibly broken and I'll have to migrate over to @Alliballibaba2; none of these GitHub support issues really went anywhere

@AlliBalliBaba
Contributor Author

Never mind, just got feedback that actions work again 🎉

henderkes added a commit that referenced this pull request Jan 2, 2026
This one is interesting, though I'm not sure of the best way to provide a
test. I will have to look into maybe an integration test, because it is a
careful dance between how we resolve paths in the Caddy module vs.
workers. I looked into making a proper change (literally using the same
logic everywhere), but I think it is best to wait until #1646 is merged.

But anyway, this change deals with some interesting edge cases. I will
use gherkin to describe them:

```gherkin
Feature: FrankenPHP symlinked edge cases
  Background:
    Given a `test` folder
    And a `public` folder linked to `test`
    And a worker script located at `test/index.php`
    And a `test/nested` folder
    And a worker script located at `test/nested/index.php`
  Scenario: neighboring worker script
    Given frankenphp located in the test folder
    When I execute `frankenphp php-server --listen localhost:8080 -w index.php` from `public`
    Then I expect to see the worker script executed successfully
  Scenario: nested worker script
    Given frankenphp located in the test folder
    When I execute `frankenphp php-server --listen localhost:8080 -w nested/index.php` from `public`
    Then I expect to see the worker script executed successfully
  Scenario: outside the symlinked folder
    Given frankenphp located in the root folder
    When I execute `frankenphp php-server --listen localhost:8080 -w public/index.php` from the root folder
    Then I expect to see the worker script executed successfully
  Scenario: specified root directory
    Given frankenphp located in the root folder
    When I execute `frankenphp php-server --listen localhost:8080 -w public/index.php -r public` from the root folder
    Then I expect to see the worker script executed successfully
```

Trying to write that out in regular English would be more complex IMHO.

These scenarios should all pass now with this PR.

---------

Signed-off-by: Marc <[email protected]>
Co-authored-by: henderkes <[email protected]>
Co-authored-by: Kévin Dunglas <[email protected]>
6 participants