Fix mapping of PostgreSQL encodings to Python encodings. · lhcezar/postgres@138313e

Commit 138313e

Fix mapping of PostgreSQL encodings to Python encodings.
Windows encodings, "win1252" and so forth, are named differently in Python, like "cp1252". Also, if the PyUnicode_AsEncodedString() function call fails for some reason, use a plain ereport(), not a PLy_elog(), to report that error. That avoids recursion and crash, if PLy_elog() tries to call PLyUnicode_Bytes() again.

This fixes bug reported by Asif Naeem. Backpatch down to 9.0, before that plpython didn't even try these conversions.

Jan Urbański, with minor comment improvements by me.
1 parent b8aca12 commit 138313e
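
The renaming described in the commit message can be checked from the Python side. The following is a minimal sketch and not part of the commit: `PG_TO_PYTHON`, `python_encoding_name` and `python_knows` are hypothetical names, and the input strings are assumed to be the names that GetDatabaseEncodingName() returns (such as `WIN1252`). Only the standard `codecs.lookup()` call is used.

```python
import codecs

# PostgreSQL encoding names whose Python spelling differs.
# This mirrors the cases listed in the commit; the dictionary itself
# is a hypothetical illustration, not code from plpython.c.
PG_TO_PYTHON = {
    "SQL_ASCII": "ascii",
    "WIN1250": "cp1250",
    "WIN1251": "cp1251",
    "WIN1252": "cp1252",
    "WIN1253": "cp1253",
    "WIN1254": "cp1254",
    "WIN1255": "cp1255",
    "WIN1256": "cp1256",
    "WIN1257": "cp1257",
    "WIN1258": "cp1258",
    "WIN866": "cp866",
    "WIN874": "cp874",
}


def python_encoding_name(pg_name):
    """Return the Python codec name for a PostgreSQL encoding name.

    Names without a special case are passed through unchanged, like the
    default branch in the commit.
    """
    return PG_TO_PYTHON.get(pg_name, pg_name)


def python_knows(name):
    """Report whether Python's codec registry can resolve the name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True
```

On a stock CPython, `python_knows("WIN1252")` is expected to be false while `python_knows(python_encoding_name("WIN1252"))` is true, which is the mismatch the commit addresses.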

1 file changed

+62
-7
lines changed

src/pl/plpython/plpython.c

Lines changed: 62 additions & 7 deletions
@@ -4873,16 +4873,71 @@ PLyUnicode_Bytes(PyObject *unicode)
 	const char *serverenc;
 
 	/*
-	 * Python understands almost all PostgreSQL encoding names, but it doesn't
-	 * know SQL_ASCII.
+	 * Map PostgreSQL encoding to a Python encoding name.
 	 */
-	if (GetDatabaseEncoding() == PG_SQL_ASCII)
-		serverenc = "ascii";
-	else
-		serverenc = GetDatabaseEncodingName();
+	switch (GetDatabaseEncoding())
+	{
+		case PG_SQL_ASCII:
+			/*
+			 * Mapping SQL_ASCII to Python's 'ascii' is a bit bogus. Python's
+			 * 'ascii' means true 7-bit only ASCII, while PostgreSQL's
+			 * SQL_ASCII means that anything is allowed, and the system doesn't
+			 * try to interpret the bytes in any way. But not sure what else
+			 * to do, and we haven't heard any complaints...
+			 */
+			serverenc = "ascii";
+			break;
+		case PG_WIN1250:
+			serverenc = "cp1250";
+			break;
+		case PG_WIN1251:
+			serverenc = "cp1251";
+			break;
+		case PG_WIN1252:
+			serverenc = "cp1252";
+			break;
+		case PG_WIN1253:
+			serverenc = "cp1253";
+			break;
+		case PG_WIN1254:
+			serverenc = "cp1254";
+			break;
+		case PG_WIN1255:
+			serverenc = "cp1255";
+			break;
+		case PG_WIN1256:
+			serverenc = "cp1256";
+			break;
+		case PG_WIN1257:
+			serverenc = "cp1257";
+			break;
+		case PG_WIN1258:
+			serverenc = "cp1258";
+			break;
+		case PG_WIN866:
+			serverenc = "cp866";
+			break;
+		case PG_WIN874:
+			serverenc = "cp874";
+			break;
+		default:
+			/* Other encodings have the same name in Python. */
+			serverenc = GetDatabaseEncodingName();
+			break;
+	}
+
 	rv = PyUnicode_AsEncodedString(unicode, serverenc, "strict");
 	if (rv == NULL)
-		PLy_elog(ERROR, "could not convert Python Unicode object to PostgreSQL server encoding");
+	{
+		/*
+		 * Use a plain ereport instead of PLy_elog to avoid recursion, if
+		 * the traceback formatting functions try to do unicode to bytes
+		 * conversion again.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_INTERNAL_ERROR),
+				 errmsg("could not convert Python Unicode object to PostgreSQL server encoding")));
+	}
 	return rv;
 }
 
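
The reason for switching from PLy_elog() to a plain ereport() can be illustrated with a small Python analogy. This is a sketch under stated assumptions, not a model of the real C call paths: `to_server_bytes` stands in for PLyUnicode_Bytes(), `rich_report` for an error reporter that converts text to the server encoding again while building its message, and `plain_report` for an ereport() call with a fixed message. All of these names, and the `outcome` helper, are hypothetical.

```python
class ConversionError(Exception):
    """Stands in for the ERROR raised by the reporting call."""


def to_server_bytes(text, encoding, report):
    """Encode strictly; on failure hand control to the given reporter."""
    try:
        return text.encode(encoding, "strict")
    except (LookupError, UnicodeEncodeError):
        # The reporter is expected to raise and never return.
        report(text, encoding)


def rich_report(text, encoding):
    """Reporter that re-enters the conversion that just failed."""
    detail = to_server_bytes(text, encoding, rich_report)
    raise ConversionError(detail)


def plain_report(text, encoding):
    """Reporter that raises a fixed message without converting anything."""
    raise ConversionError(
        "could not convert Python Unicode object to PostgreSQL server encoding"
    )


def outcome(report):
    """Try to encode a non-ASCII character as 7-bit 'ascii'."""
    try:
        to_server_bytes("\u00e9", "ascii", report)
    except RecursionError:
        return "recursion"
    except ConversionError:
        return "error"
    return "ok"
```

In this sketch `outcome(rich_report)` yields "recursion", because every failed conversion triggers another one, while `outcome(plain_report)` yields "error" after a single attempt. The same input also shows the caveat from the SQL_ASCII comment: Python's 'ascii' codec rejects anything outside 7-bit ASCII.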
