8000 Allow use of UTF-8 catalogues in non-UTF-8 applications by Seldaek · Pull Request #2313 · symfony/symfony · GitHub
[go: up one dir, main page]

Skip to content

Allow use of UTF-8 catalogues in non-UTF-8 applications #2313

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 325 commits into from

Conversation

Seldaek
Copy link
Member
@Seldaek Seldaek commented Oct 3, 2011

Bug fix: yes
Feature addition: ?:)
Backwards compatibility break: no
Symfony2 tests pass: yes
Fixes the following tickets: -

The problem I'm having is that, while porting an existing app, we are using UTF-8 everywhere to have a migration path ready, but the current application and DB is still in ISO-8859-1, which means translations containing accented chars are broken.

Also, we didn't hit the issue yet since we don't use forms much, but I imagine we would have similar issues with core translations for the validator which are all UTF-8 encoded.

Note that I explicitly suppressed this conversion in case your application is setup as UTF-8, to make sure most people are not affected by any slow down this introduces.

benjamindulau and others added 30 commits September 4, 2011 18:12
[Translation] support CsvDumper
[Translation] support CsvDumper
[Translation] support CsvDumper
[Translation] support CsvDumper
Commits
-------

bbb68b7 Added RSS HTTP request format

Discussion
----------

Added RSS HTTP request format

Hi,

We have added RSS HTTP request format to `initializeFormats()` function in HttpFoundation Request class, to enable specifying `_format: rss` in routing files.

We couldn't find a reason for not having the rss format specified, and even the [cookbook refers](http://symfony.com/doc/2.0/book/routing.html#advanced-routing-example) to using of `_format: rss` .

Thanks,
Bait.sk
Commits
-------

9f0bd03 [HttpKernel] Update tests for FileProfilerStorage
b7032bc [HttpKernel] Update FileProfileStorage to search from EOF
188a5fa [HttpKernel] Override the existing tokens in FileProfilerStorage
b1b1424 [HttpKernel] Delete folders in the profiler cache
88bc3ec [HttpKernel] Fixes standards of FileProfilerStorage
affe66c Merge remote-tracking branch 'origin/master' into new-profiler-storage
ea916c3 [HttpKernel] Coding convention for the file profiler storage
9ae2c8d [HttpKernel] CS in file storage
b415efd [HttpKernel] Add a test for semicolon in file storage test
1c1215f [HttpKernel] Use subfolders for better storage in file storage of profiler
4b1dc1f [HttpKernel] Fix the folder attribute of file storage to private
70f73e1 [HttpKernel] Fix tests for the file storage of profiler
d5313d9 [HttpKernel] Add tests for the file profiler storage
09fc0a2 [HttpKernel] Add Symfony credits to the file storage class for the profiler
d1d5892 [HttpKernel] Finalize the file storage for the profiler
2f65cf2 Add POC for file storage system

Discussion
----------

[2.1] [HttpKernel] File storage for profiler

Symfony2 has some problems when dealing with multiple concurrency queries in the SQLite storage, resulting in a timeout error or terrible lack.

I've implemented after discussions with @fabpot a filesystem storage.

Enable it in your project with :

    framework:
        profiler:
          dsn:             "file:%kernel.cache_dir%/profiler"

I also studied the possibility to store only big data string in files and rest in the SQLite, but not concluant.

Results of my measures (4 concurrency, 120 total) :

* SQLite with data : 1057ms
* SQLite without data : 615ms
* MySQL : 40ms
* This File storage : 54ms

---------------------------------------------------------------------------

by alexandresalome at 2011/07/22 12:01:10 -0700

An idea for the find method : a csv file containing ip;url;token

The iteration could be done over a big file, without loading the whole file in memory.

---------------------------------------------------------------------------

by alexandresalome at 2011/07/23 14:22:32 -0700

OK new version, with as explained previously : a CSV file containing the index + file for each profile.

The speed is similar to the speed of MySQL, and no memory overhead should occur with this solution.

---------------------------------------------------------------------------

by alexandresalome at 2011/07/23 14:37:14 -0700

Hm... Created tests, duplicated from SqliteProfilerStorageTest.

Any idea on how to put this code in common ? Is it usual to create a base class for 2 tests ?

---------------------------------------------------------------------------

by alexandresalome at 2011/07/23 14:48:39 -0700

Just tested with 24.000 requests, the 24.001'th request still takes less than 50ms to execute.

The index file is about 2Mb, and iterating the whole file is fast.

---------------------------------------------------------------------------

by alexandresalome at 2011/07/23 14:53:19 -0700

I've filled the file with 120Mb of data, requests are still less than 50ms for executing.

Iterating the index takes more than 30s (so it crashed), but it's because of the amount of lines. 30 seconds = 1,400,000 lines in this computer. The file = 1,500,000 lines

---------------------------------------------------------------------------

by alexandresalome at 2011/07/23 14:56:54 -0700

I've tested it with Linux, is someone can test with Windows

---------------------------------------------------------------------------

by stloyd at 2011/07/24 00:32:32 -0700

IMO to speedup it a bit more and not end up with "crash" (to not end with "limit" of files per directory, also to many files in dir slow down every OS) you should use same method to write as Twig, split up files in to directories. If you do this you can speed up index more, because you can create smaller one per directory.

Also you should fix CS (coding standards).

---------------------------------------------------------------------------

by stloyd at 2011/07/24 02:10:20 -0700

Tested on Win 7, seems ok. Similar speed to sqlite, dunno why ;-) but used a bit less of memory.

---------------------------------------------------------------------------

by alexandresalome at 2011/07/24 02:13:21 -0700

Did you tried with concurrent requests ? It makes sense when you use assetic
and your browser hits the application 4 times simultaneously for CSS
generation

---------------------------------------------------------------------------

by alexandresalome at 2011/07/24 02:17:23 -0700

I used Apache Benchmark for producing results :

    ab -c 4 -n 120 URL

---------------------------------------------------------------------------

by alexandresalome at 2011/07/24 02:56:55 -0700

OK I used subfolders, based on last characters (because the first part of token is mostly the same between queries.

---------------------------------------------------------------------------

by stof at 2011/09/04 01:27:15 -0700

@fabpot any news about it ? Can it be merged ?
Commits
-------

41b7a19 Updated the tests so that tests will be marked as skipped when there is no MongoDB server present!
233c7db Updated the code to follow the symfony coding standards
7b24de5 Updated the code to follow the symfony coding standard using stof his remarks
fbcbdde - Fixed a small bug - Updated some phpdoc
00fdfec Added a MongoDbProfilerStorage engine

Discussion
----------

[2.1] [HttpKernel] MongoDb storage for Profiler

As a documentbased database like MongoDB is [supposedly fantastic in logging](http://blog.mongodb.org/post/172254834/mongodb-is-fantastic-for-logging) I implemented a storage engine for the profiler that should enable us to use this database as storage for this.

Activate it using this way:

    framework:
        profiler:
            dsn:     mongodb://user:pass@location/database/collection

---------------------------------------------------------------------------

by stof at 2011/07/24 11:23:06 -0700

btw, the MongoDB session storage has already be rejected from the core so this should probably be moved to the DoctrineMongoDBBundle (even if it uses only the PHP extension and not Doctrine). @fabpot thoughts about this ?

---------------------------------------------------------------------------

by Wotre at 2011/07/24 11:52:56 -0700

Just my personal opinion, if it is prefered that way I will move this into the DoctrineMongoDBBundle.

While it is reasonable to bundle all Mongo related things together, I do believe that in the case of logging we want to avoid as many depencies as possible. Some exceptions can occur pretty early inside the framework, and it would be a shame if those aren't logged because this layer is written on top of doctrine. I'm not exactly familliar enough with the symfony internals as I only started using it a few days ago, but I can imagine that this can make a difference with some exceptions.

---------------------------------------------------------------------------

by stof at 2011/07/24 11:59:10 -0700

I don't ask you to use Doctrine in this code. It is fine to use the extension directly if it is enough.

Btw, the profiler is *not* used early. :)

---------------------------------------------------------------------------

by Wotre at 2011/07/26 10:45:05 -0700

So... Any final remark whether this should be moved to [the DoctrineMongoDBBundle](https://github.com/symfony/DoctrineMongoDBBundle) or not?

If it has to be moved, any comment on where in that bundle this should be put?

Also, if it has to be moved, how can we arrange the configuration using DI? Currently I've put a line in the FrameworkExtension file to use this engine for everything with a $dsn starting with mongodb; I imagine this kind of ugly depency can't really exist between the FrameworkBundle and another one.

Although it seems completely illogical to me, I will move it, but I do need some directions on how to elegantly do this...

---------------------------------------------------------------------------

by stof at 2011/07/26 11:03:04 -0700

@fabpot what do you think ?

---------------------------------------------------------------------------

by stof at 2011/09/04 01:28:48 -0700

@fabpot what do you think about the place where this should be done ?
Commits
-------

f2761dd Fix typo and include suggestion by Stof
4ac380e Adjust QtTranslations patch and include QtTranslationsDumper + test aswell
5712798 Adjust QtTranslationLoader to throw RuntimeException
21b29c2 Merge symfony/master
6bf43a1 [Translation] Add .ts as file extension to search for Qt Translation files.
5808715 [Translation] Add Qt Translation component with tests

Discussion
----------

[2.1] Qt translations

Add support for QT translations - it has a GUI and www.crowdin.net supports it.

---------------------------------------------------------------------------

by stof at 2011/05/30 07:24:48 -0700

You also need to register it in FrameworkBundle.

---------------------------------------------------------------------------

by tristanbes at 2011/06/26 12:08:47 -0700

crowdin seems to be a cool service. didn't know about this website. thanks

---------------------------------------------------------------------------

by fabpot at 2011/07/11 02:58:31 -0700

Just for the record: I've just removed all usage of `\Exception` in Symfony2 master:

symfony@6a73593

---------------------------------------------------------------------------

by beberlei at 2011/07/11 03:02:14 -0700

I will adjust the PR.

---------------------------------------------------------------------------

by tristanbes at 2011/08/28 11:10:02 -0700

Any news @beberlei ?

---------------------------------------------------------------------------

by beberlei at 2011/08/29 01:07:28 -0700

Yes, i have to allocate some time. Havent managed to do so yet.

---------------------------------------------------------------------------

by stof at 2011/09/04 05:28:16 -0700

@beberlei the PR also need to be rebased as it conflicts with master.
Commits
-------

dea46c7 [Translation] support CsvDumper [Translation] support CsvDumper [Translation] support CsvDumper [Translation] support CsvDumper [Translation] support CsvDumper

Discussion
----------

[2.1][Translation] support CsvDumper

---------------------------------------------------------------------------

by michelsalib at 2011/09/05 04:56:47 -0700

The DumperInterface is probably going to change soon, it should be considered when merging : symfony#2051.
Commits
-------

89f4791 Fixed CS
a348efe Removed trailing whitespaces
135531a Replaced setExpectedException() methods by annotations
6ad83e7 Updated according to PR review
2de243c * Added Fixtures 2x2px test.gif image file * Updated tests for using fixture image instead of imagecreate function
a5a2dfa [ImageValidator] Added dedicated ImageValidator class with min width, max width, min height and max height validations

Discussion
----------

[2.1] [ImageValidator] Improved ImageValidator + tests

Added dedicated ImageValidator class with min width, max width, min height and max height validations

---------------------------------------------------------------------------

by kriswallsmith at 2011/06/04 10:32:56 -0700

This is nice but doesn't belong in the core as it adds a dependency on GD.

---------------------------------------------------------------------------

by benjamindulau at 2011/06/04 10:41:02 -0700

@kriswallsmith i'm not sure, php.net doc says :

    Note:

    This function does not require the GD image library.

Well, that could also be a mistake :)

---------------------------------------------------------------------------

by pborreli at 2011/06/04 11:17:25 -0700

i think @benjamindulau is right

```
$ php -m
[PHP Modules]
bcmath
calendar
com_dotnet
Core
ctype
date
dom
ereg
filter
ftp
hash
iconv
json
libxml
mcrypt
mhash
mysqlnd
odbc
pcre
PDO
Phar
Reflection
session
SimpleXML
SPL
sqlite3
standard
tokenizer
wddx
xdebug
xml
xmlreader
xmlwriter
zip
zlib
```

```
$ php -r "var_dump(getimagesize(__DIR__.'/src/Symfony/Bundle/FrameworkBund
le/Resources/public/images/blue_picto_less.gif'));"
array(7) {
  [0]=>
  int(18)
  [1]=>
  int(18)
  [2]=>
  int(1)
  [3]=>
  string(22) "width="18" height="18""
  ["bits"]=>
  int(5)
  ["channels"]=>
  int(3)
  ["mime"]=>
  string(9) "image/gif"
}
```

---------------------------------------------------------------------------

by benjamindulau at 2011/06/04 11:22:57 -0700

However, the tests use imagejpeg and imagedestroy, that's not valid.
But, i'm not sure how to correctly test the validator then.

---------------------------------------------------------------------------

by benjamindulau at 2011/06/04 11:31:51 -0700

Can i use an image from FrameworkBundle resources ?

-----------------------------------------------------
8000
----------------------

by pborreli at 2011/06/06 03:47:24 -0700

just add your own image inside Fixture folder (smaller the better, 1x1, 1x2 ..)

---------------------------------------------------------------------------

by stof at 2011/09/04 05:27:42 -0700

@benjamindulau could you rebase your PR and update it according to the comments ?

---------------------------------------------------------------------------

by benjamindulau at 2011/09/04 06:31:34 -0700

Yep, i think i can find some time today to do that

---------------------------------------------------------------------------

by benjamindulau at 2011/09/04 09:27:29 -0700

I've updated the PR.

A big thank to Stof who helped me with my git mess ;-)
Commits
-------

34494b3 whitespace fixes
1a86a4a Refactor mime-type to file extension guessing
e7481a3 Decouple mime-type to extension logic from File class

Discussion
----------

[2.1] Decouple mimetype-to-extension logic from File class

This allows guessing the extension from a given mime type
without requiring the existence of a local file.

If a file's meta information (mime-type, etc.) is already available (i.e. it's
been extracted once and stored in some persistent data store), it would be
nice to be able to make a best-guess on the extension based on the known mime-type.

A concrete use case of this is for the symfony-cmf, where a file has been stored
in the jackrabbit data store.  When delivering this file or saving it to disk, we'd like to
use an extension that's created based on the known mime type of the file.

---------------------------------------------------------------------------

by brki at 2011/06/21 04:35:13 -0700

Now implemented similarly to the existing MimeTypeGuesser.

---------------------------------------------------------------------------

by brki at 2011/06/21 07:51:22 -0700

whitespace removed

---------------------------------------------------------------------------

by stof at 2011/09/04 05:04:54 -0700

@fabpot @brki what is the status of this PR ?
* 2.0:
  [Validator] Sync polish translation
  [FrameworkBundle] sanitize target arg in asset:install command
  few optimisations for XliffFileLoader and XmlFileLoader
  [FrameworkBundle] Sync the Russian translations
  [FrameworkBundle] Added Dutch validator translation for trans-unit 41
  [FrameworkBundle] Updated German validator translation
  [FrameworkBundle] Fixed a typo in the translation file per @peymanhr
Commits
-------

7139154 [Validator] Translate image validation errors into polish

Discussion
----------

[2.1][Validator] Translate image validation errors into polish
Commits
-------

f23413c Updated french translations for image validator

Discussion
----------

Updated french translations for image validator
Commits
-------

522adde [Validator] Fixed typo in Image constraint

Discussion
----------

[Validator] Fixed typo in Image constraint
aboks and others added 17 commits October 1, 2011 14:26
Commits
-------

0299d38 [Framework
8000
Bundle] Updated Dutch validator translations

Discussion
----------

[FrameworkBundle] Updated Dutch validator translations

Bug fix: yes
Feature addition: no
Backwards compatibility break: no
Symfony2 tests pass: yes
Fixes the following tickets: -
Commits
-------

eac2a77  Oops.. translated the variable by mistake
51b3b59  Updated Hebrew validator translations (trans-unit id="48")

Discussion
----------

Updated Hebrew validator translations (trans-unit id="48")
Commits
-------

8404b35 Updated indonesian translations for trans-unit 47 and 48

Discussion
----------

Updated indonesian translations for trans-unit 47 and 48
Commits
-------

31840b9 Updated Polish validator translations (trans-unit id="47" and id="48")

Discussion
----------

Updated Polish validator translations (trans-unit id="47" and id="48")
Commits
-------

9219cf9 [FrameworkBundle] Sync the Russian translations with the SizeLength constraint

Discussion
----------

[FrameworkBundle] Sync the Russian translations with the SizeLength const
Commits
-------

4909169 Typo, should be "znaków"

Discussion
----------

[FrameworkBundle] Polish validator translations - Typo, should be "znaków"
@fabpot
Copy link
Member
fabpot commented Oct 4, 2011

IMO, this is a new feature that must be done in master.

@Seldaek
Copy link
Member Author
Seldaek commented Oct 4, 2011

Ok. I have to say though it's not so easy to replicate cleanly without hacking into core; I had to copy paste lots of stuff from the core files to get it working externally here due to the private stuff and the fact loadCatalogue doesn't return the loaded catalogue.

That means the sooner it gets in core the better. Can you still comment on the patch? I'll rebase on master later.

@Seldaek
Copy link
Member Author
Seldaek commented Oct 4, 2011

By the way IMO it is a bug fix much in the same direction as #2072 - edge cases perhaps because most people use UTF8, but painful for those that don't. Changing the charset was a feature in 2.0, and it should be made to work in 2.0.x as far as I'm concerned. Then again, I'm probably biased.

if (null !== $this->charset && extension_loaded('mbstring')) {
foreach ($catalogue->all() as $domain => $messages) {
foreach ($messages as $key => $translation) {
$srcCharset = mb_detect_encoding($translation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know much about how mb_detect_encoding work, but it looks fragile (and what about the performance).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fairly fragile if not configured properly, yes. But without this, it just poops UTF-8 in your non-UTF-8 output, which is not any better to be honest.

As for speed, surely this patch slows things down, but only at compile time. Then again, it's the best and most centralized solution I have found. Decoding the source file before parsing is a no-go, since YAML has to be UTF-8 for example.

All in all, I do agree that this might need to be made optional somehow. Just so it doesn't surprise people with poorly configured mbstring who don't use UTF-8. Because it might detect a wrong encoding and then try to transcode to their target encoding using wrong source charset, which would be bad. If you do know what you're doing though, it's a great way to get an app ready for migration imo. Also please don't forget about the issue I mentionned with the validation messages.

… the service is defined for better consistency
@fabpot
Copy link
Member
fabpot commented Oct 6, 2011

@Seldaek: Can you rebase this PR on master? thanks.

fabpot and others added 3 commits October 6, 2011 21:40
* 2.0:
  [DoctrineBundle] fixed a unit test (detected thanks to PHP 3.6.0)
  [Form] Fixed lacking attributes in DateTimeType
@Seldaek
Copy link
Member Author
Seldaek commented Oct 7, 2011

Done.

@Seldaek Seldaek closed this Oct 7, 2011
fabpot added a commit that referenced this pull request Oct 7, 2011
Commits
-------

5473d3b [Translation] Allow use of UTF-8 encoded catalogues into non-UTF-8 applications
deb6dea [Translation] Add failing tests to verify that UTF-8 lang files can't be used with another charset

Discussion
----------

Allow use of UTF-8 catalogues in non-UTF-8 applications

This is #2313 but targetting the master branch.

Bug fix: yes
Feature addition: ?:)
Backwards compatibility break: no
Symfony2 tests pass: yes
Fixes the following tickets: -

The problem I'm having is that, while porting an existing app, we are using UTF-8 everywhere to have a migration path ready, but the current application and DB is still in ISO-8859-1, which means translations containing accented chars are broken.

Also, we didn't hit the issue yet since we don't use forms much, but I imagine we would have similar issues with core translations for the validator which are all UTF-8 encoded.

Note that I explicitly suppressed this conversion in case your application is setup as UTF-8, to make sure most people are not affected by any slow down this introduces.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0