@@ -317,7 +317,7 @@ These APIs can be used to work with surrogates:
 
 .. c:function:: Py_UCS4 Py_UNICODE_JOIN_SURROGATES(Py_UCS4 high, Py_UCS4 low)
 
-   Join two surrogate characters and return a single :c:type:`Py_UCS4` value.
+   Join two surrogate code points and return a single :c:type:`Py_UCS4` value.
    *high* and *low* are respectively the leading and trailing surrogates in a
    surrogate pair. *high* must be in the range [0xD800; 0xDBFF] and *low* must
    be in the range [0xDC00; 0xDFFF].
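Review note: the arithmetic behind joining a surrogate pair can be sketched in standalone C as below. `join_surrogates` is a hypothetical helper written for this note, not the CPython macro itself; it applies the standard UTF-16 formula and asserts the ranges stated in the documentation above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for illustration; not the CPython implementation.
 * Combines a leading (high) and trailing (low) surrogate into a single
 * code point using the standard UTF-16 formula. */
uint32_t join_surrogates(uint32_t high, uint32_t low)
{
    /* Preconditions match the ranges given in the documentation. */
    assert(high >= 0xD800 && high <= 0xDBFF);
    assert(low >= 0xDC00 && low <= 0xDFFF);

    /* Each surrogate carries 10 payload bits; the result is offset by 0x10000. */
    return 0x10000 + (((high & 0x03FF) << 10) | (low & 0x03FF));
}
```

With this sketch, the pair (0xD83D, 0xDE00) joins to 0x1F600, and the extreme pairs (0xD800, 0xDC00) and (0xDBFF, 0xDFFF) map to 0x10000 and 0x10FFFF.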
@@ -999,6 +999,9 @@ These are the UTF-8 codec APIs:
    object. Error handling is "strict". Return ``NULL`` if an exception was
    raised by the codec.
 
+   The function fails if the string contains surrogate code points
+   (``U+D800`` - ``U+DFFF``).
+
 
 .. c:function:: const char* PyUnicode_AsUTF8AndSize(PyObject *unicode, Py_ssize_t *size)
 
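Review note: the reason a strict UTF-8 encoder fails on surrogates can be illustrated with a minimal standalone sketch. `encode_utf8_strict` is a hypothetical helper, not CPython's codec; it assumes the caller supplies a buffer of at least four bytes and signals failure by returning 0 instead of raising an exception.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not CPython's encoder.
 * Writes the UTF-8 encoding of `cp` into `out` (at least 4 bytes) and
 * returns the number of bytes written, or 0 if `cp` cannot be encoded
 * under strict rules (a surrogate or a value above U+10FFFF). */
size_t encode_utf8_strict(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);

    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return 0;  /* surrogate code points are rejected */
    }
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;  /* outside the Unicode range */
}
```

In this sketch every value from 0xD800 through 0xDFFF returns 0, while ordinary code points such as U+20AC encode normally (to the bytes E2 82 AC).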
@@ -1011,6 +1014,9 @@ These are the UTF-8 codec APIs:
    On error, set an exception, set *size* to ``-1`` (if it's not NULL) and
    return ``NULL``.
 
+   The function fails if the string contains surrogate code points
+   (``U+D800`` - ``U+DFFF``).
+
    This caches the UTF-8 representation of the string in the Unicode object, and
    subsequent calls will return a pointer to the same buffer. The caller is not
    responsible for deallocating the buffer. The buffer is deallocated and
@@ -1438,8 +1444,9 @@ They all return ``NULL`` or ``-1`` if an exception occurs.
    Compare a Unicode object with a char buffer which is interpreted as
    being UTF-8 or ASCII encoded and return true (``1``) if they are equal,
    or false (``0``) otherwise.
-   If the Unicode object contains surrogate characters or
-   the C string is not valid UTF-8, false (``0``) is returned.
+   If the Unicode object contains surrogate code points
+   (``U+D800`` - ``U+DFFF``) or the C string is not valid UTF-8,
+   false (``0``) is returned.
 
    This function does not raise exceptions.
 
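Review note: the "compare unequal instead of raising" behaviour described in this hunk can be sketched in standalone C. `equal_to_utf8` is a hypothetical function operating on a plain array of code points, not the CPython API. It relies on the observation that strict UTF-8 encoding never produces a malformed byte sequence, so an invalid buffer can never match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch; not CPython's implementation.
 * Returns 1 if the `n` code points in `cps` encode, under strict UTF-8,
 * to exactly the `len` bytes at `buf`, and 0 otherwise. It never reports
 * an error: surrogates and malformed input simply compare unequal. */
int equal_to_utf8(const uint32_t *cps, size_t n, const char *buf, size_t len)
{
    assert(cps != NULL && buf != NULL);

    size_t pos = 0;  /* bytes of buf matched so far; always <= len */
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = cps[i];
        unsigned char enc[4];
        size_t k;

        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
            return 0;  /* surrogate or out of range: treated as not equal */
        }
        if (cp < 0x80) {
            enc[0] = (unsigned char)cp;
            k = 1;
        } else if (cp < 0x800) {
            enc[0] = (unsigned char)(0xC0 | (cp >> 6));
            enc[1] = (unsigned char)(0x80 | (cp & 0x3F));
            k = 2;
        } else if (cp < 0x10000) {
            enc[0] = (unsigned char)(0xE0 | (cp >> 12));
            enc[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            enc[2] = (unsigned char)(0x80 | (cp & 0x3F));
            k = 3;
        } else {
            enc[0] = (unsigned char)(0xF0 | (cp >> 18));
            enc[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            enc[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            enc[3] = (unsigned char)(0x80 | (cp & 0x3F));
            k = 4;
        }
        /* Mismatch if the buffer is too short or the bytes differ. */
        if (len - pos < k || memcmp(buf + pos, enc, k) != 0) {
            return 0;
        }
        pos += k;
    }
    return pos == len;  /* leftover bytes in buf also mean "not equal" */
}
```

Under these assumptions, a lone surrogate compared against its would-be three-byte form, or the letter "A" compared against an overlong two-byte form, both yield 0 without any error path.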