-
-
Notifications
You must be signed in to change notification settings - Fork 28
PEP 750: danger with explicit concatenation #289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hi SC, To add on: after discussion, the PEP 750 team agrees that we should update the spec and implementation to:
No other changes are necessary. In particular, we:
This change effectively requires developers to mark arbitrary strings as trusted with There's more color on discuss.python.org. We'll provide proposed PRs for both the PEP and for cpython. I'll link them here when they're available. Thank you for considering this! |
The PEP PR is now up: python/peps#4395 |
A corresponding update to I haven't created a PR for this yet because I don't know whether I should create a GitHub issue now or not. :-) |
Process: I would open a draft PR using the overarching 'implement PEP 750' issue, but also cross reference here. A |
Polite critique to the release methodology: this is a major language feature that is being pushed to the language without testing apart from the people who have been directly involved in the PEP: it wasn't present in 3.14a7, it will be immutable in 3.14b1. If there are other major design flaws in the feature, or even if the current issue is not addressed, it will take Python 4 to fix them. I think that the feature requires still more soaking in reality to be considered mature. I suggest either:
I am available for testing on my part. |
Just to clarify a few process points that you raised
The first beta release is 'feature freeze', but that isn't a total moratorium on changes -- there will be many bug fixes. You might find the development cycle overview in the Developer's Guide useful as a quick briefing on how releases work. Features can, and have, been removed or reverted up to and including release candidates, so please don't worry that there is no opportunity to improve things after Tuesday (b1 release). As such, I personally don't think we need an 3.14a8 release, though ultimately this is Hugo's decision as RM. Our deprecation & versioning policy also doesn't follow SemVer, so it wouldn't be 4.0, but 3.1X, if we needed to deprecate or remove a feature.
Thank you! We always appreciate more testing of alpha and beta releases, to help iron out these problems before the final release -- especially for PEP-sized features. It would also be great to know how we can make testing pre-releases easier! A |
Just to clarify our position, the PEP 750 team does not believe the spec has "major design flaws". We think the spec as accepted is fine and can ship as-is. This said, we do think this proposed change is a useful improvement to the spec and is worth making even at this late stage. We would all like to see it go through. (Removing footguns is good! But we see it as a footgun, not a show-stopper. Linters exist and can flag uses of |
(Updated my comment above to explicitly mention |
Dear Steering Council,
to introduce myself, I have been involved in the Psycopg project (the de facto standard PostgreSQL driver for Python) since 2005, I have been the main maintainer of psycopg2 since 2010, and, in 2020, I designed and implemented Psycopg 3, in order to use new Python features (typing, async), making new choices based on the experience gathered in the previous 15 years.
As you can imagine, one of the main preoccupations of our project is safety: how to enable end users to craft any statement they need to execute on a database while guarding them from unsafe input. Therefore, we have always stressed using the best safety practices when dealing with untrusted user input.
I have recently come across the efforts of PEP 750 to provide a template string object in Python, and I found it extremely fitting for the project. A few days ago I have implemented a first version of template strings execution, and I must say that it is the most important and refreshing change I have seen in the language, positively affecting our project.
However, during a discussion about whether to include the implementation of a
Template.join()
method (which would be very desirable in my opinion, but this is a different matter), it dawned on me that explicit string concatenation is a very dangerous feature, and it pretty much defeats entirely the safety that the PEP declares as being one of the design goals.Because templates and strings can be concatenated without any safety check, it is very easy to include insecure input in a template. Taking the same example from the PEP's Motivation section:
The hypothetical
html()
function will receive aTemplate
object on which it cannot put any trust. The design of the Template object makes any insecure input string instantly secure.Please note that this design goes pretty much in the opposite direction of the
LiteralString
defined in PEP 675: concatenating a safeLiteralString
and an unsafestr
produces an unsafestr
. This is bizarre:LiteralString
, anything safe, when it comes in contact with something unsafe, becomes unsafe.This, I am afraid, is not a well-thought-out design from the safety point of view.
LiteralString
, in our project, is so relevant that it is actually the only string accepted by theexecute()
function: we define ourQuery
type as:What we require is for a statement to be either a literal string or to have been produced by composition of the
psycopg.sql
family of objects, which are designed to compose the different parts of a SQL statement employing the correct escaping method (and whose use would become largely marginal with a good template string solution). While there isn't widespread support for this feature yet in type checkers, this is our formal requirement for a query, and even if, every now and then, some user is confused by linter errors, it is no problem to explain our design.I have looked for information about the rationale of the current design and I have been kindly provided some references. There were some discussions mixing explicit and implicit concatenation, with a final resolution stating:
There are indeed explanations in the PEP stating:
This statement is misguided. People can concatenate f-strings and normal strings without a problem because 1) they are the same type and 2) there is no safety semantics behind
str
. The type of bug that can be caused by disallowingstr + Template
is an immediateTypeError
; the type of bug that can be caused by allowing it is a safety bug.Using the current design, accepting a template string in a query cannot be considered safe. We are back to the point of people being able to compose queries such as:
and no runtime or static checker should have any problem with it.
This is less safe than
LiteralString
orsql.SQL
objects, which require an active action from the user to allow astr
to be part of a statement, signifying that the author has taken their measures to prevent problems:This goes in the opposite direction of where we want to go, in terms of safety. Therefore, we cannot, in our conscience, allow the use of template strings as query input, and, despite the initial enthusiasm, we will prefer to not merge the feature.
I understand that the template string branch was merged to the Python 3.14 branch only yesterday; version 3.14a7 didn't include the feature, and 3.14b1 is due to be released in a few days, after which no change would be accepted. I believe we are still in time to fix this design.
Thank you very much.
-- Daniele
The text was updated successfully, but these errors were encountered: