test_sqlite3 fails on non-UTF-8 locale · Issue #91922 · python/cpython · GitHub

test_sqlite3 fails on non-UTF-8 locale #91922


Closed
serhiy-storchaka opened this issue Apr 25, 2022 · 7 comments · Fixed by #92926
Labels
3.11 only security fixes topic-sqlite3 topic-unicode type-bug An unexpected behavior, bug, or error

Comments

@serhiy-storchaka
Member
$ LC_ALL=en_US.iso88591 ./python -m test -vuall test_sqlite3
...
======================================================================
ERROR: test_ctx_mgr_rollback_if_commit_failed (test.test_sqlite3.test_dbapi.MultiprocessTests.test_ctx_mgr_rollback_if_commit_failed)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_dbapi.py", line 1767, in test_ctx_mgr_rollback_if_commit_failed
    cx = sqlite.connect(TESTFN, timeout=self.CONNECTION_TIMEOUT)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 16: unexpected end of data

======================================================================
ERROR: test_open_uri (test.test_sqlite3.test_dbapi.OpenTests.test_open_uri)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_dbapi.py", line 674, in test_open_uri
    with managed_connect(TESTFN) as cx:
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_dbapi.py", line 44, in managed_connect
    cx = sqlite.connect(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 16: unexpected end of data

======================================================================
ERROR: test_open_with_path_like_object (test.test_sqlite3.test_dbapi.OpenTests.test_open_with_path_like_object)
Checks that we can successfully connect to a database using an object that
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_dbapi.py", line 670, in test_open_with_path_like_object
    with managed_connect(path) as cx:
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_dbapi.py", line 44, in managed_connect
    cx = sqlite.connect(*args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 16: unexpected end of data

======================================================================
ERROR: test_trace_callback_content (test.test_sqlite3.test_hooks.TraceCallbackTests.test_trace_callback_content)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serhiy/py/cpython/Lib/test/test_sqlite3/test_hooks.py", line 279, in test_trace_callback_content
    con1 = sqlite.connect(TESTFN, isolation_level=None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe6 in position 16: unexpected end of data

----------------------------------------------------------------------
@serhiy-storchaka serhiy-storchaka added type-bug An unexpected behavior, bug, or error 3.11 only security fixes 3.10 only security fixes 3.9 only security fixes labels Apr 25, 2022
@AlexWaygood AlexWaygood added the tests Tests in the Lib/test dir label Apr 25, 2022
@serhiy-storchaka serhiy-storchaka removed the tests Tests in the Lib/test dir label Apr 26, 2022
@serhiy-storchaka
Member Author

It is not just tests. There is a bug in the code.

It is due to this line:

if (PySys_Audit("sqlite3.connect", "s", database) < 0) {

database is a file path, which may not be valid UTF-8, and the "s" format tries to decode it as UTF-8. There may be similar bugs related to auditing in other code. @tiran
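The decoding failure can be reproduced in pure Python without sqlite3. This is only a sketch of the mechanism: the file name below is hypothetical, modelled on the TESTFN value implied by the tracebacks, and the encode/decode calls stand in for what the locale and the "s" format do.

```python
# Hypothetical file name modelled on the TESTFN value in the tracebacks;
# the digits stand in for a process ID.
path = "@test_12345_tmp\u00e6"

# Under an ISO-8859-1 locale the name is encoded one byte per character,
# so the trailing "æ" becomes the single byte 0xE6.
raw = path.encode("iso8859-1")

# Decoding those bytes as UTF-8 fails: 0xE6 announces a three-byte
# sequence, but the data ends right after it.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
else:
    error = None
```

The reported reason is "unexpected end of data", as in the tracebacks above; the byte position depends on the length of the name.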

I am also afraid that sqlite3.connect() does not work correctly with non-ASCII paths on Windows. It might be more correct to use sqlite3_open16() on Windows, but it lacks the flags parameter.

@erlend-aasland

@erlend-aasland
Contributor

I'd rather fix this by documenting that the database path must be UTF-8. I'm afraid that using both sqlite3_open16 and sqlite3_open_v2 will create too much complexity in the code; my initial reaction is that it is not worth the added complexity, but I will absolutely consider it.

@erlend-aasland
Contributor
erlend-aasland commented Apr 26, 2022

Quoting the SQLite docs:

Note to Windows users: The encoding used for the filename argument of sqlite3_open() and sqlite3_open_v2() must be UTF-8, not whatever codepage is currently defined. Filenames containing international characters must be converted to UTF-8 prior to passing them into sqlite3_open() or sqlite3_open_v2().

We should add this information to the docs.
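As a rough illustration of what that note asks of a caller, the sketch below converts a path to UTF-8 bytes regardless of the current locale or code page. The helper name to_sqlite_filename is hypothetical and is not part of sqlite3 or CPython; bytes input is assumed to be in the intended encoding already.

```python
import os
from pathlib import PurePosixPath


def to_sqlite_filename(path):
    """Return the path as UTF-8 bytes (hypothetical helper, not a real API)."""
    # os.fspath() accepts str, bytes, and os.PathLike objects.
    fs_path = os.fspath(path)
    if isinstance(fs_path, str):
        # Encode explicitly as UTF-8 instead of relying on the locale.
        return fs_path.encode("utf-8")
    # Assumption: bytes paths are passed through unchanged.
    return fs_path


name = to_sqlite_filename(PurePosixPath("data") / "db\u00e6.sqlite")
print(name)
```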

@erlend-aasland
Contributor

I'm troubled by this sentence in the SQLite docs:

The default encoding will be UTF-8 for databases created using sqlite3_open() or sqlite3_open_v2(). The default encoding for databases created using sqlite3_open16() will be UTF-16 in the native byte order.

@erlend-aasland
Contributor

I propose to resolve this by improving the docs. Let me know if you agree.

serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue May 18, 2022
* Fix function sqlite.connect() and the sqlite.Connection constructor
  on non-UTF-8 locales.
* Fix support of bytes paths non-decodable with the current FS encoding.
@serhiy-storchaka
Member Author

#92926 resolves this issue. It also fixes support for non-decodable bytes paths. I am not sure about URIs; they may need additional work.
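For context on "non-decodable bytes paths": on POSIX, Python usually carries such paths through str using the surrogateescape error handler. The sketch below shows that round trip with an explicit UTF-8 codec so it does not depend on the locale. It illustrates the general mechanism only, not the code in the pull request, and the file name is made up.

```python
# Made-up bytes path containing 0xE6, which is not valid UTF-8 on its own.
raw = b"db_\xe6.sqlite"

# surrogateescape maps each undecodable byte to a lone surrogate
# (0xE6 becomes U+DCE6) instead of raising UnicodeDecodeError.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("utf-8", "surrogateescape")

# A strict encode rejects the lone surrogate.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```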

@serhiy-storchaka
Member Author

BTW, sqlite3_open_v2() works with non-UTF8 paths.

miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 20, 2022
pythonGH-92926)

(cherry picked from commit d853758)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
miss-islington added a commit that referenced this issue May 20, 2022
…92926)

(cherry picked from commit d853758)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@serhiy-storchaka serhiy-storchaka removed 3.10 only security fixes 3.9 only security fixes labels May 21, 2022