8000 Allow non word characters in env variable names · Issue #33806 · symfony/symfony · GitHub
[go: up one dir, main page]

Skip to content
8000

Allow non word characters in env variable names #33806

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pounard opened this issue Oct 2, 2019 · 21 comments
Closed

Allow non word characters in env variable names #33806

pounard opened this issue Oct 2, 2019 · 21 comments

Comments

@pounard
Copy link
Contributor
pounard commented Oct 2, 2019

Right now, env variables names that don't match preg \w are rejected with an exception, in EnvPlaceholderParameterBag:

            if (!preg_match('/^(?:\w++:)*+\w++$/', $env)) {
                throw new InvalidArgumentException(sprintf('Invalid %s name: only "word" characters are allowed.', $name));
            }

But it also is documented that env system can be used to load dynamic variables, using env var processors. We had a need to allow user-driven configuration injected as services parameters, and dynamic runtime property of injecting env is exactly the right API for this.

In our case, the env var processor injects variables from database, stored in a registry-like tree (implementation is a flat key/value store but that's implementation detail) and we'd like our variables names to contain dots . or any other char, pretty much like parameters allow, so we can hierarchize/organize dynamic configuration.

Could it be possible to drop this naming restriction ? Or is there an easy way to bypass it ?

@ro0NL
Copy link
Contributor
ro0NL commented Oct 2, 2019

letters/numbers/underscores are the most portable chars for environment variable names. We could even consider restricting it further to not start with a number 😅

ref https://stackoverflow.com/questions/2821043/allowed-characters-in-linux-environment-variable-names

@ro0NL
Copy link
Contributor
ro0NL commented Oct 2, 2019

your usecase is interesting though, currently only prefix-based "processing" of existing env vars is documented obtained from the built-in $getEnv closure. Yet, you can load values by name from any source for sure. However using an incompatible name makes your prefix incompatible with other prefixes as well ... so we need to think about this feature a bit better perhaps.

@pounard
Copy link
Contributor Author
pounard commented Oct 3, 2019

I think the framework should not restrict, since it won't be a problem to process in Symfony itself, actually it's no the framework responsibility to care, limitations are only due to the underlying environment capabilities, not the framework.

There is one and only one thing that this check fixes: bundle writers will be forced to give sane environment variable that works everywhere, but instead of being forced, it could should be a documented convention, or a deactivable behaviour maybe ? If someone gives a non supported name for an OS, (s)he will have issues opened by random people very quickly.

@pounard
Copy link
Contributor Author
pounard commented Oct 3, 2019

However using an incompatible name makes your prefix incompatible with other prefixes as well

I don't think that's a problem actually, prefixes are mostly used to cast/coerce value types, but in our case we use it to select the storage it comes from. Maybe that's this which is unnatural.

Either way, dynamic parameters loading is something we do need, and the env var provider is the only place we can plug into to achieve this, and this naming restriction is a huge one.

@fabpot
Copy link
Member
fabpot commented Oct 3, 2019

"user-driven configuration" does not seem to match a use case for env vars. Can you explain a bit more the use case?

@pounard
Copy link
Contributor Author
pounard commented Oct 3, 2019

@fabpot we're injecting the configuration as services arguments, which allows to eliminate the dependency from our domain services to the variable registry, other parameters than envs are compiled and hardcoded into the container, which makes cache clear mandatory each time a webamster changes it, using vars actually ensures that each time a value is requested, it gets the fresh one from the user writable registry without the need to clear the cache.

EDIT: Another design solution (and proper one if this wasn't PHP) would be to implement variable injection via a notification, domain services would then listen for variable changes. But I guess that if I wasn't writing PHP, I would probably do a lot of things very differently ! Anyway this would be a much more aggressive coupling of domain services to the configuration API.

@pounard
Copy link
Contributor Author
pounard commented Oct 4, 2019

@fabpot @ro0NL I agree that I hacked around env variables to achieve our needs, but I think it remain an interesting use case to consider, maybe it could be discussed further in another issue ?

Right now, is it worth considering adding a container parameter that could be set pragmatically in the App\Kernel::configureContainer() to deactivate those strict naming checks ?

Such as:

<?php

namespace App;

use Symfony\Bundle\FrameworkBundle\Kernel\MicroKernelTrait;
use Symfony\Component\Config\Loader\LoaderInterface;
use Symfony\Component\Config\Resource\FileResource;
use Symfony\Component\DependencyInjection\ContainerBuilder;
use Symfony\Component\HttpKernel\Kernel as BaseKernel;
use Symfony\Component\Routing\RouteCollectionBuilder;

final class Kernel extends BaseKernel
{
    use MicroKernelTrait;

    // ...

    protected function configureContainer(ContainerBuilder $container, LoaderInterface $loader)
    {
        // For example:
        $container->setParameter('container.env.strict_naming', true /* false */);

        // ...
    }
}

That's probably an question that would be of interest for @nicolas-grekas as well I guess ?

@ro0NL
Copy link
Contributor
ro0NL commented Oct 4, 2019

my main concern is if your prefix provides values, then what happens with your_prefix:other_prefix:VAR. If we go this way, you're 'value providing' prefix must be the first, which is currently not required to do so (thus opening all sort of WTFs)

@pounard
Copy link
Contributor Author
pounard commented Oct 4, 2019

@ro0NL I ear you, right now, what I did in my custom project is actually none of your concerns, I will try harder to fix those kind of problems as soon as we'll propagate it to other projects, and will be extra careful on testing once we'll publish it open source, but right now, it's still a prototype.

I try to focus on most simple things in Symfony that can unblock us easily, and we'll then on the longer run focus on the industrialising it side of things.

@ostrolucky
Copy link
Contributor
ostrolucky commented Oct 6, 2019

DotEnv stays as close to handling of native environment variables as possible by ensurinng you can use same variables interchangeably between DotEnv and native system variables. This was the sole purpose fabien introduced this component - no other PHP env parser does this correctly.

Now, even though it's possible to work with non-alpha characters as demonstrated via

$ php -r $'$d = [["pipe", "r"], ["pipe", "w"]]; \
	$p = proc_open(\'env\', $d, $pipes, null, ["foo.bar" => "baz", "baz" => "bar"]); \
	echo stream_get_contents($pipes[1]);'
foo.bar=baz
baz=bar
...

This is not the case with PHP

$ php -r $'$d = [["pipe", "r"], ["pipe", "w"]]; \
	$p = proc_open(\'php -r \\\'print_r($_SERVER);\\\'\', $d, $pipes, null, ["foo.bar" => "baz", "baz" => "bar"]); \
	echo stream_get_contents($pipes[1]);'
Array
(
    [baz] => bar
    ...
)

As you can see in above output, PHP strips such variables out.

Hence I think DotEnv's behaviour is entirely correct. If DotEnv changed behaviour as you suggest, as soon as user switches to production environment where instead of DotEnv, real environment variables are used, such environment variable would disappear and could have nasty consequencies.

I recommend to open issue on PHP's bugtracker though, because From The Open Group:

These strings have the form name=value; names shall not contain the character '='. For values to be portable across systems conforming to IEEE Std 1003.1-2001, the value shall be composed of characters from the portable character set (except NUL and as indicated below).

which suggests anything outside of = should be good to go

@pounard
Copy link
Contributor Author
pounard commented Oct 7, 2019

Actually it doesn't prevent Symfony from allowing the user to disable the check ! As long as the default is to be done.

@ostrolucky
Copy link
Contributor

I see no reason though. You should not use env files if you don't want to support real system environment variables. Use different configuration format which is better suited for your needs.

@pounard
Copy link
Contributor Author
pounard commented Oct 8, 2019

The problem is not due to env files, my variables are not in env files, nor in env itself. For many reasons an env var processor could actually inject the variable from another source, from pretty much anywhere else than env files or env variables, and Symfony is actually pluggable enough (and documented) for using it this way.

May I quote https://symfony.com/doc/current/configuration/env_var_processors.html

Using env vars to configure Symfony applications is a common practice to make your applications truly dynamic.

That's what motivated me using this as a way to inject database stored variables: that's the only (there is no other I could find) way to inject dynamic variables in the container.

And in fact, even if that's a hack, it does work gracefully, it's robust and efficient.

Allowing the developer to disable or loosen this very strict variable name check on a per application basis would actually make it completely transparent and effective. I don't see why it couldn't be done.

As you said:

which suggests anything outside of = should be good to go

Loosening the check to this simple rule would fix my problem actually ! And if you consider it to be too dangerous to loosen it per default (many developers just don't care, and use real env vars, for them the actual behaviour is perfect) then just give the opportunity to opt-in using a explicit container parameter.

@nicolas-grekas
Copy link
Member

allow user-driven configuration injected as services parameters

does this mean you have a backend that allows configuring app-wide settings?

we'd like our variables names to contain dots . or any other char

if we talk about allowing dot, that'd be possible to me - about "any other", that's definitely no-go on my side. The more chars we allow the more we close extensibility for future (unknown) improvements.

@pounard
Copy link
Contributor Author
pounard commented Oct 8, 2019

does this mean you have a backend that allows configuring app-wide settings?

More or less, it's a basic key value like store for user configuration, which may then be used to replace container variables (and more, but that's not in this issue scope), it works based upon an explicit schema, which means values are validated prior to storage, and that nothing else than developer-defined variables can be stored.

It gives us the possibility to inject dynamically variables into bundle provided parameters, of course we cannot with those who interact with compile phase, so we're actually careful about what we do with it.

about "any other", that's definitely no-go on my side. The more chars we allow the more we close extensibility for future (unknown) improvements.

I fully understand, and disagree at the same time. That why I'm asking for a compromise: a way to opt-out of the strict check and leaving the default to be strict.

@nicolas-grekas
Copy link
Member

it's a basic key value like store for user configuration

There's something that scares me here: the word "user". Are you configuring a different container per user - ie per request? If yes, that's a total abuse of the DI container sorry, we'd have no reason to support this.

@nicolas-grekas
Copy link
Member

(you can still encode any arbitrary characters to the subset that is supported here, so that you would achieve what you describe without asking core to care about it at all - and it shouldn't if it's per user.)

@pounard
Copy link
Contributor Author
pounard commented Oct 8, 2019

There's something that scares me here: the word "user". Are you configuring a different container per user - ie per request?

No of course not ! I keep one compiled container and that's fine. In our use case, most user configured variables will change behaviours with the application business and not the container! It's just that for a very few edge cases we alter other bundles variables (we dynamically allow admins to disable SSO logins for example, but we ensured first that it would not have side effect on compilation, of course).

But in the end, use case doesn't matter, this allows to inject those variables to services and avoid them the dependency to the configuration storage.

and it shouldn't if it's per user

Nothing is "per user", it's global configuration that the site admin may change, pretty much like variables in Drupal for example. Nevertheless, for the developer point of view, a site admin is a user. For real end-users we have a business oriented preferences storage that is tied to the domain code, which will never, ever, try to do anything with the container.

--

If this was a persistent application server (like in any other techno than PHP) we would never have done that, we would have instead put a notification based system for configuration changes, but in the end, we all do ugly things for performance, having to compile the container is one, we just needed some bits of dynamism without having to recompile it every change.

@nicolas-grekas
Copy link
Member

OK, I feel better now :)

you can still encode any arbitrary characters to the subset that is supported here, so that you would achieve what you describe without asking core to care about it at all

what about this ?

@pounard
Copy link
Contributor Author
pounard commented Oct 8, 2019

you can still encode any arbitrary characters to the subset that is supported here

I guess that's achievable, but I'm afraid that encoding it would make the configuration less readable. Right now, it's a rather young project and we don't have hundreds of those variables, but I'm afraid that encoding would create ambiguities in variable naming, by reducing the char set we might create collision.

@nicolas-grekas
Copy link
Member
nicolas-grekas commented Oct 27, 2019

I'm afraid that encoding would create ambiguities in variable naming, by reducing the char set we might create collision.

I get that but I think that's out of scope for the component. My proposal: use a slugger and replace any non-letter/digit by an underscore. You can deal with collisions on your side before accepting a new variable name. I cannot imagine two legitly different vars differ only by a dot vs underscore.

I'm closing as allowing names that are not compatible with env vars will create DX issues for users. The use case is really interesting, but there are solutions that don't require allowing more chars while not reducing the capabilities of the system built on top, as discussed.

Thanks for proposing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants
0