8000 [HttpFoundation] Problem with Request::getBaseUrl and url-rewrites · Issue #7329 · symfony/symfony · GitHub
[go: up one dir, main page]

Skip to content

[HttpFoundation] Problem with Request::getBaseUrl and url-rewrites #7329

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
nicmart opened this issue Mar 11, 2013 · 15 comments
Closed

[HttpFoundation] Problem with Request::getBaseUrl and url-rewrites #7329

nicmart opened this issue Mar 11, 2013 · 15 comments

Comments

@nicmart
Copy link
nicmart commented Mar 11, 2013

I've encountered an inconsistent behavior with Request::getBaseUrl method when the Request URI does not include the SCRIPT_NAME (due for example to an apache url rewrite), but it includes a segment of the SCRIPT_NAME path.

In RequestTest.php, if I add this data to the getBaseUrlData data provider, testGetBaseUrl fails:

//Test Fails
array(
    '/webmaster',
    array(
        'SCRIPT_FILENAME' => '/foo/bar/web/index.php',
        'SCRIPT_NAME'     => '/web/index.php',
        'PHP_SELF'        => '/web/index.php',
    ),
    '',            //getBaseUrl    Fails, '/web' returned
    '/webmaster',  //getPathInfo   Fails, 'master' returned
)

because in this case getBaseUrl returns '/web' and not ''. This make getPathInfo fails too (it returns 'master' instead of '/webmaster'),

If the URI does not start with 'web' all works as expected:

//Test passes
array(
    '/administrator',
    array(
        'SCRIPT_FILENAME' => '/foo/bar/web/index.php',
        'SCRIPT_NAME'     => '/web/index.php',
        'PHP_SELF'        => '/web/index.php',
    ),
    '',          //getBaseUrl
    '/administrator', //getPathInfo
)
@nicmart
Copy link
Author
nicmart commented Mar 13, 2013

There are inconsistencies also when the uri basename is included in the SCRIPT_NAME. This data makes testGetBaseUrl fails too:

//Test Fails
array(
    '/foobar/index.php',
    array(
        'SCRIPT_FILENAME' => '/foo/bar/web/index.php',
        'SCRIPT_NAME'     => '/web/index.php',
        'PHP_SELF'        => '/web/index.php',
    ),
    '',            //getBaseUrl    Fails, '/web/index.php' returned
    '/foobar/index.php',  //getPathInfo   Fails, 'php' returned
)

@gogume
Copy link
gogume commented Apr 9, 2013

this test is written wrong:

array(
    '/webmaster',
    array(
        'SCRIPT_FILENAME' => '/foo/bar/web/index.php',
        'SCRIPT_NAME'     => '/web/index.php',
        'PHP_SELF'        => '/web/index.php',
    ),
    '',            //getBaseUrl    Fails, '/web' returned
    '/webmaster',  //getPathInfo   Fails, 'master' returned
)

it should be

array(
    '/webmaster',
    array(
        'SCRIPT_FILENAME' => '/foo/bar/web/index.php',
        'SCRIPT_NAME'     => '/web/index.php',
        'PHP_SELF'        => '/web/index.php',
    ),
    '/web',
    'master',
)

@nicmart
Copy link
Author
nicmart commented Apr 9, 2013

This is what the actual implementation returns... :) But is the result the wished one? It seems to me a an inconsistency with the other case.
This has caused me trouble in routes where the initial part of the url matches with the front controller folder.

@Tobion
Copy link
Contributor
Tobion commented Apr 9, 2013

@gogume not it's not. @nicmart is right that this seems to be buggy.

@Tobion
Copy link
Contributor
Tobion commented Apr 9, 2013

See also #7053 that maybe fixes the problems here becausee it includes fixes made upstream in ZF to prepareBasePath and preparePathInfo as a side effect to fix something else.

@gogume
Copy link
gogume commented Apr 9, 2013

I'm not implying that the current implementation is not buggy (I don't have the specs), I was just making an observation based on what is the implementation and how the test was written. (sorry @nicmart I didn't notice from your code comments that you already know that)
Maybe this code should be more restrictive and request a separator like '/' between the pathInfo and baseUrl.

@Tobion
Copy link
Contributor
Tobion commented Apr 11, 2013

I've encountered an inconsistent behavior with Request::getBaseUrl method when the Request URI does not include the SCRIPT_NAME (due for example to an apache url rewrite), but it includes a segment of the SCRIPT_NAME path.

@nicmart Could you please explain me when the problem you describe above can actually happen? I don't see how the SCRIPT_NAME can have a folder prefix when the request URI does not specifiy one. Do you actually have a real case for this? The only theoretical case I can maybe think of is when the mod_rewrite rules are moved one directory up and rewrites /webmaster to /web/index.php front controller.

@nicmart
Copy link
Author
nicmart commented Apr 11, 2013

@Tobion It's a real case. I've crushed against this problem trying to gettin a legacy application working with the http and routing symfony components.

Take for example

  1. web root in /var/www/mysite
  2. front controller in /var/www/mysite/web/index.php
  3. a rewrite rule that maps all requests to the front controller:
    RewriteCond %{REQUEST_URI} !^/web/index.php
    RewriteRule ^(.*)$ /web/index.php [QSA,L]

SCRIPT_NAME has a folder prefix because the front controller is not in the web root folder, but in a descendant folder.

8000
@Tobion
Copy link
Contributor
Tobion commented Apr 11, 2013

Yes that's what I imagined. But it's not best practise because it makes the full project accessible. Usually the web root should be in /var/www/mysite/web which is why it's called web ;)
Anyway, it's still a bug.

@nicmart
Copy link
Author
nicmart commented Apr 11, 2013

I know, but it was not possible to move the web root due to legacy constraints.

@Tobion
Copy link
Contributor
Tobion commented Apr 11, 2013

Hm seeing your second report I figured this makes all symfony projects even more vulnurable to symfony/symfony-standard#469.
Because you can basically prefix all URLs with anything as long as app.php is in place.
So /foo/app.php/mypage would resolve to the same page as /bar/app.php/mypage and /mypage.
Because symfony thinks /foo/app.php and /bar/app.php is the base url and doesn't care about matching it.
So you can make as many duplicates as you wish! And these pages won't work correctly because assets for example won't resolve to the correct path. It will create paths like /foo/bundles/mybundle/css/layout.css which do not exists of course with the hacked /foo prefix.

@Tobion
Copy link
Contributor
Tobion commented Apr 11, 2013

I don't see how your second example can be fixed reliably. Because /foobar/index.php can or cannot be the base url. We don't really know. It depends on the webserver configuration.

With mod_rewrite enabled it probably is not the base url but just a requested path that was rewritten to the front controller. Handling it wrong as now, will result in the problems I described above.

But this path might also be an apache alias path where /foobar maps to /myproject/web. In this case /foobar/index.php is suddently really the base url.

Tobion added a commit to Tobion/symfony that referenced this issue Apr 11, 2013
…ath info

I synchronized the code with the changes done upstream in ZF: https://github.com/zendframework/zf2/blob/master/library/Zend/Http/PhpEnvironment/Request.php

I also added tests from ZF and symfony#7329.

I also rewrote the preparePathInfo slightly because there were some strange things.
@Tobion
Copy link
Contributor
Tobion commented Apr 12, 2013

As example for the bad base url problem: http://jmsyst.com/www/jmstutorial/current/web/app.php
I don't think @schmittjoh homepage should be accessible under this url. ^^
Sorry @schmittjoh I just wanted to find an example and it's not possible on all deployments.

@Enmankan
Copy link

When using mod_rewrite-like functionality, heuristic search for baseUrl can only do so much. It would be better to let developers override baseUrl on a per-request basis, as only they can know the proper value. Allowing to set it in environment configuration would solve this problem.

Below are tests for my specific case, where I rewrote /foo as /foo/public and /foo/bar as /foo/bar/public, equivalent to putting Symfony project in a subdirectory and rewriting url to point to public folder (both tests are equivalent and the latter only represents nesting Symfony project further, such as domain.com/sandbox/project/):

array(
    '/foo/home',
    array(
        'SCRIPT_FILENAME' => '/home/John Doe/public_html/foo/public/app.php',
        'SCRIPT_NAME'     => '/foo/public/app.php',
        'PHP_SELF'        => '/foo/public/app.php',
    ),
    '/foo',        //getBaseUrl     Fails, '' returned
    '/home',       //getPathInfo    Fails, '/foo/home' returned
),
array(
    '/foo/bar/home',
    array(
        'SCRIPT_FILENAME' => '/home/John Doe/public_html/foo/bar/public/app.php',
        'SCRIPT_NAME'     => '/foo/bar/public/app.php',
        'PHP_SELF'        => '/foo/bar/public/app.php',
    ),
    '/foo/bar',      //getBaseUrl     Fails, '' returned
    '/home',         //getPathInfo    Fails, '/foo/bar/home' returned
),

@rk3rn3r
Copy link
Contributor
rk3rn3r commented Dec 20, 2014

closed #13038 according to @nicolas-grekas in #13039 (in favor to #13039 that is still open)

fabpot added a commit that referenced this issue Jan 18, 2015
…g path_info (rk3rn3r)

This PR was squashed before being merged into the 2.3 branch (closes #13039).

Discussion
----------

[HttpFoundation] [Request] fix baseUrl parsing to fix wrong path_info

Hi everyone!

We at trivago had an issue with the Request object. It seems that all versions of symfony 2.x and 3.x are affected from this (possible) bug (don't checked 1.x).
Here is the problem:

some old legacy pages are deployed in the Document Root, let's say /var/www/www.test.com/ .
one or more new applications based on symfony are deployed to /var/release/new_app1/ , /var/release/new_app2/ , ... .
in /var/www/www.test.com/ there is a symlink "app" to /var/release/new_app1/web, like:
/var/www/www.test.com/app --> /var/release/new_app1/web/

there is a "SEO"/human-readable rewrite rule for Document Root (if called path/file not exist): (.*) --> app/app.php

the problem comes, when the user calls a uri starting with "app" or whatever the rewrite rule / symlink points to:

the user calls "http://www.test.com/apparthotel-1234"
results in $_SERVER parameters like this
```
'DOCUMENT_ROOT' =>'/var/www/www.test.com',
'SCRIPT_FILENAME' => '/var/www/www.test.com/app/app.php',
'SCRIPT_NAME' => '/app/app.php',
'PHP_SELF' => '/app/app.php/apparthotel-1234'
```
in Request::prepareBaseUrl() there are checks to find the baseUrl:
```
        if ($baseUrl && false !== $prefix = $this->getUrlencodedPrefix($requestUri, $baseUrl)) {
            // full $baseUrl matches
            return $prefix;
        }

        if ($baseUrl && false !== $prefix = $this->getUrlencodedPrefix($requestUri, dirname($baseUrl))) {
            // directory portion of $baseUrl matches
            return rtrim($prefix, '/');
        }
```
first it is checked if (in our case) "/app/app.php" is in the request uri (/apparthotel-1234).
it's not.

then it takes the dirname (of /app/app.php) which is /app and checks if it is in the request uri (/apparthotel-1234), and YES, it is! and "/app" is returned, but this is wrong, it should be empty (because it comes from a rewrite rule from root: /)!

later in preparePathInfo(), if there is a baseUrl, then the baseUrl is removed from the request uri:
/apparthotel-1234  --->  /arthotel-1234

The cause is, the second baseUrl check, checks if the path of the application is already in the uri, like when the request was "http://www.test.com/app/apparthotel-1234" and hit a rewrite rule like (.*) --> app.php in there, but because it matches a directory it must match "dirname($baseUrl) . '/'".

I also needed to fix one unit test of the getBaseUrl test:
the request uri recently was "/foo%20bar".
but from the $_SERVE
737D
R infos "foo bar" is a directory, see:
```
'SCRIPT_FILENAME' => '/home/John Doe/public_html/foo bar/app.php',
'SCRIPT_NAME' => '/foo bar/app.php',
'PHP_SELF' => '/foo bar/app.php',
```

webservers will redirect a request "http://www.test.com/foo%20bar" to "http://www.test.com/foo%20bar/" when "foo bar" is a directory. checked this for apache 2.x and nginx 1.4.x.

this fix is for symfony master (3.0.x, see #13039).
I also prepared a merge request for actual 2.7 branch, it will also follow in some minutes. (see #13040)

| Q             | A
| ------------- | ---
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| Fixed tickets | this, #13040, #13038, #7329
| License       | MIT

[HttpFoundation] [Request]
* added missing slash to baseUrl-path part check to remove the path, only when it's also a path in the uri
[HttpFoundation] [Tests] [RequestTest]
* fixed and added unittests

This is the symfony 2.3 branch fix for the issue related to #13038 and #13040

Happy christmas!

Commits
-------

3a3ecd3 [HttpFoundation] [Request] fix baseUrl parsing to fix wrong path_info
@fabpot fabpot closed this as completed Jan 18, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants
0