gh-105156: Cleanup usage of old Py_UNICODE type by vstinner · Pull Request #105158 · python/cpython · GitHub

gh-105156: Cleanup usage of old Py_UNICODE type #105158

Merged
merged 2 commits into from
Jun 1, 2023
gh-105156: Cleanup usage of old Py_UNICODE type
* refcounts.dat:

  * Remove Py_UNICODE functions
  * Replace Py_UNICODE argument type with wchar_t

* _PyUnicode_ToLowercase(), _PyUnicode_ToUppercase() and
  _PyUnicode_ToTitlecase() are no longer marked as deprecated in a comment.
  The marker is no longer needed since they now use the Py_UCS4 type rather
  than the deprecated Py_UNICODE type.
* gdb: Remove unused char_width() method.
vstinner committed May 31, 2023
commit f44643944ea5e5b6e70fb4b02905225a386d8392
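
The entries in Doc/data/refcounts.dat changed by this commit are colon-separated records: function name, C type, argument name (empty on the return-value line), reference-count effect, and a trailing comment. As a rough illustration only, the Python sketch below parses such lines. `RefcountEntry` and `parse_refcount_line` are hypothetical names invented for this example and are not part of CPython's documentation tooling.

```python
from typing import NamedTuple, Optional


class RefcountEntry(NamedTuple):
    function: str
    ctype: str
    arg: str                 # empty on the return-value line
    refcount: Optional[str]  # e.g. "+1" or "0"; None when the field is empty
    comment: str


def parse_refcount_line(line: str) -> Optional[RefcountEntry]:
    """Parse one colon-separated entry; return None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Split at most four times so a comment containing ':' stays intact.
    parts = line.split(":", 4)
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    function, ctype, arg, refcount, comment = parts
    return RefcountEntry(function, ctype, arg, refcount or None, comment)


# Sample lines copied from the diff in this pull request.
sample = """\
Py_UNICODE_TOLOWER:Py_UCS4:::
Py_UNICODE_TOLOWER:Py_UCS4:ch::

PyUnicode_FromObject:PyObject*::+1:
PyUnicode_FromObject:PyObject*:obj:0:
"""

entries = [e for e in map(parse_refcount_line, sample.splitlines()) if e]
for entry in entries:
    print(entry)
```

The first line of each pair describes the return value (no argument name), and the following lines describe the parameters.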
60 changes: 20 additions & 40 deletions Doc/data/refcounts.dat
@@ -2374,76 +2374,56 @@ PyUnicode_KIND:PyObject*:o:0:
PyUnicode_MAX_CHAR_VALUE::::
PyUnicode_MAX_CHAR_VALUE:PyObject*:o:0:

-PyUnicode_AS_UNICODE:Py_UNICODE*:::
-PyUnicode_AS_UNICODE:PyObject*:o:0:
-
-PyUnicode_AS_DATA:const char*:::
-PyUnicode_AS_DATA:PyObject*:o:0:
-
Py_UNICODE_ISALNUM:int:::
-Py_UNICODE_ISALNUM:Py_UNICODE:ch::
+Py_UNICODE_ISALNUM:Py_UCS4:ch::

Py_UNICODE_ISALPHA:int:::
-Py_UNICODE_ISALPHA:Py_UNICODE:ch::
+Py_UNICODE_ISALPHA:Py_UCS4:ch::

Py_UNICODE_ISSPACE:int:::
-Py_UNICODE_ISSPACE:Py_UNICODE:ch::
+Py_UNICODE_ISSPACE:Py_UCS4:ch::

Py_UNICODE_ISLOWER:int:::
-Py_UNICODE_ISLOWER:Py_UNICODE:ch::
+Py_UNICODE_ISLOWER:Py_UCS4:ch::

Py_UNICODE_ISUPPER:int:::
-Py_UNICODE_ISUPPER:Py_UNICODE:ch::
+Py_UNICODE_ISUPPER:Py_UCS4:ch::

Py_UNICODE_ISTITLE:int:::
-Py_UNICODE_ISTITLE:Py_UNICODE:ch::
+Py_UNICODE_ISTITLE:Py_UCS4:ch::

Py_UNICODE_ISLINEBREAK:int:::
-Py_UNICODE_ISLINEBREAK:Py_UNICODE:ch::
+Py_UNICODE_ISLINEBREAK:Py_UCS4:ch::

Py_UNICODE_ISDECIMAL:int:::
-Py_UNICODE_ISDECIMAL:Py_UNICODE:ch::
+Py_UNICODE_ISDECIMAL:Py_UCS4:ch::

Py_UNICODE_ISDIGIT:int:::
-Py_UNICODE_ISDIGIT:Py_UNICODE:ch::
+Py_UNICODE_ISDIGIT:Py_UCS4:ch::

Py_UNICODE_ISNUMERIC:int:::
-Py_UNICODE_ISNUMERIC:Py_UNICODE:ch::
+Py_UNICODE_ISNUMERIC:Py_UCS4:ch::

Py_UNICODE_ISPRINTABLE:int:::
-Py_UNICODE_ISPRINTABLE:Py_UNICODE:ch::
+Py_UNICODE_ISPRINTABLE:Py_UCS4:ch::

-Py_UNICODE_TOLOWER:Py_UNICODE:::
-Py_UNICODE_TOLOWER:Py_UNICODE:ch::
+Py_UNICODE_TOLOWER:Py_UCS4:::
+Py_UNICODE_TOLOWER:Py_UCS4:ch::

-Py_UNICODE_TOUPPER:Py_UNICODE:::
-Py_UNICODE_TOUPPER:Py_UNICODE:ch::
+Py_UNICODE_TOUPPER:Py_UCS4:::
+Py_UNICODE_TOUPPER:Py_UCS4:ch::

-Py_UNICODE_TOTITLE:Py_UNICODE:::
-Py_UNICODE_TOTITLE:Py_UNICODE:ch::
+Py_UNICODE_TOTITLE:Py_UCS4:::
+Py_UNICODE_TOTITLE:Py_UCS4:ch::

Py_UNICODE_TODECIMAL:int:::
-Py_UNICODE_TODECIMAL:Py_UNICODE:ch::
+Py_UNICODE_TODECIMAL:Py_UCS4:ch::

Py_UNICODE_TODIGIT:int:::
-Py_UNICODE_TODIGIT:Py_UNICODE:ch::
+Py_UNICODE_TODIGIT:Py_UCS4:ch::

Py_UNICODE_TONUMERIC:double:::
-Py_UNICODE_TONUMERIC:Py_UNICODE:ch::
-
-PyUnicode_FromUnicode:PyObject*::+1:
-PyUnicode_FromUnicode:const Py_UNICODE*:u::
-PyUnicode_FromUnicode:Py_ssize_t:size::
-
-PyUnicode_AsUnicode:Py_UNICODE*:::
-PyUnicode_AsUnicode:PyObject*:unicode:0:
-
-PyUnicode_AsUnicodeAndSize:Py_UNICODE*:::
-PyUnicode_AsUnicodeAndSize:PyObject*:unicode:0:
-PyUnicode_AsUnicodeAndSize:Py_ssize_t*:size::
-
-PyUnicode_GetSize:Py_ssize_t:::
-PyUnicode_GetSize:PyObject*:unicode:0:
+Py_UNICODE_TONUMERIC:Py_UCS4:ch::

PyUnicode_FromObject:PyObject*::+1:
PyUnicode_FromObject:PyObject*:obj:0:
8 changes: 3 additions & 5 deletions Include/cpython/unicodeobject.h
@@ -379,8 +379,6 @@ static inline Py_UCS4 PyUnicode_MAX_CHAR_VALUE(PyObject *op)

/* === Public API ========================================================= */

-/* --- Plain Py_UNICODE --------------------------------------------------- */
-
/* With PEP 393, this is the recommended way to allocate a new unicode object.
This function will allocate the object and its buffer in a single memory
block. Objects created using this function are not resizable. */
@@ -827,15 +825,15 @@ PyAPI_FUNC(int) _PyUnicode_IsLinebreak(
const Py_UCS4 ch /* Unicode character */
);

-/* Py_DEPRECATED(3.3) */ PyAPI_FUNC(Py_UCS4) _PyUnicode_ToLowercase(
+PyAPI_FUNC(Py_UCS4) _PyUnicode_ToLowercase(
Py_UCS4 ch /* Unicode character */
);

-/* Py_DEPRECATED(3.3) */ PyAPI_FUNC(Py_UCS4) _PyUnicode_ToUppercase(
+PyAPI_FUNC(Py_UCS4) _PyUnicode_ToUppercase(
Py_UCS4 ch /* Unicode character */
);

-Py_DEPRECATED(3.3) PyAPI_FUNC(Py_UCS4) _PyUnicode_ToTitlecase(
+PyAPI_FUNC(Py_UCS4) _PyUnicode_ToTitlecase(
Py_UCS4 ch /* Unicode character */
);
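
The commit message ties the removed deprecation marker to these functions taking a full Py_UCS4 code point instead of a Py_UNICODE value. The Python sketch below illustrates why the width matters, assuming a platform where Py_UNICODE (an alias of wchar_t) is 16 bits wide, as on Windows. It uses `str.lower()` as a stand-in and does not call the private C functions; `to_utf16_units` is a hypothetical helper written for this example.

```python
from typing import List


def to_utf16_units(ch: str) -> List[int]:
    """Return the 16-bit code units needed to encode one character as UTF-16."""
    data = ch.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


upper = "\U00010400"   # DESERET CAPITAL LETTER LONG I, outside the BMP
lower = upper.lower()  # case mapping on the full code point

print(f"U+{ord(upper):04X} -> U+{ord(lower):04X}")
# A single 16-bit unit cannot hold this code point; UTF-16 needs a surrogate pair.
print([hex(unit) for unit in to_utf16_units(upper)])
```

A 32-bit Py_UCS4 argument can carry the code point in one value, whereas a 16-bit type would only see one surrogate at a time.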

2 changes: 1 addition & 1 deletion Objects/stringlib/README.txt
@@ -9,7 +9,7 @@ the following defines used by the different modules:

STRINGLIB_CHAR

-the type used to hold a character (char or Py_UNICODE)
+the type used to hold a character (char, Py_UCS1, Py_UCS2 or Py_UCS4)

STRINGLIB_GET_EMPTY()

10 changes: 3 additions & 7 deletions Tools/gdb/libpython.py
@@ -1390,10 +1390,6 @@ def _unichr_is_printable(char):
class PyUnicodeObjectPtr(PyObjectPtr):
_typename = 'PyUnicodeObject'

-def char_width(self):
-_type_Py_UNICODE = gdb.lookup_type('Py_UNICODE')
-return _type_Py_UNICODE.sizeof
-
def proxyval(self, visited):
compact = self.field('_base')
ascii = compact['_base']
@@ -1414,13 +1410,13 @@ def proxyval(self, visited):
elif repr_kind == 4:
field_str = field_str.cast(_type_unsigned_int_ptr())

-# Gather a list of ints from the Py_UNICODE array; these are either
+# Gather a list of ints from the character array; these are either
# UCS-1, UCS-2 or UCS-4 code points:
-Py_UNICODEs = [int(field_str[i]) for i in safe_range(field_length)]
+characters = [int(field_str[i]) for i in safe_range(field_length)]

# Convert the int code points to unicode characters, and generate a
# local unicode instance.
-result = u''.join(map(chr, Py_UNICODEs))
+result = u''.join(map(chr, characters))
Member

Suggested change
result = u''.join(map(chr, characters))
result = ''.join(map(chr, characters))

Member Author

Removing support for Python 2 requires way more changes. I prefer to restrict the changes to just Py_UNICODE here.

Member

FYI, I removed Python 2 support from libpython.py already.
https://github.com/python/cpython/pull/31717/files

Member Author

Oh, I didn't know. Well, feel free to remove that u prefix in a separate PR :-) My PR doesn't add it at least :-)

return result
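
The proxyval() logic above reads a character array whose element width depends on the string kind (1, 2 or 4 bytes), collects the elements as ints, and joins them with chr(). The standalone sketch below mimics that idea on a plain bytes buffer, assuming little-endian storage. It does not use the gdb API; `code_points` and `to_str` are hypothetical helpers written for this illustration, not code from libpython.py.

```python
import struct

# struct codes for fixed-width unsigned ints: 1, 2 and 4 bytes per character.
_FORMATS = {1: "B", 2: "H", 4: "I"}


def code_points(raw: bytes, kind: int) -> list:
    """Unpack a little-endian buffer of `kind`-byte units into a list of ints."""
    count = len(raw) // kind
    return list(struct.unpack(f"<{count}{_FORMATS[kind]}", raw))


def to_str(raw: bytes, kind: int) -> str:
    """Convert the int code points to characters and build a local str."""
    return "".join(map(chr, code_points(raw, kind)))


# Build sample buffers with codecs whose layout matches each width
# for the characters used here (no surrogates in the 2-byte case).
print(to_str("abc".encode("latin-1"), 1))
print(to_str("h\u00e9\u20ac".encode("utf-16-le"), 2))
print(to_str("a\U0001F600".encode("utf-32-le"), 4))
```

In Python 3 the `u''` prefix discussed in the review comments is equivalent to a plain `''` literal, so the sketch uses the unprefixed form.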

def write_repr(self, out, visited):