-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
py/qstr: Add support for sorted qstr pools. #12678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Code size report:
|
+136 bytes on bare-arm is 6*4 bytes for the extra pool, 84 bytes for the binary search, 28 bytes for |
Would it be faster to sort and lookup by hash instead of value? |
In the original PR for this I did compare sorting string value vs hash, see #6896 (comment) |
In summary, using a hash is not really any faster and a fair bit more complicated. There is a very detailed discussion and size/performance numbers in #6896. |
Codecov Report
@@ Coverage Diff @@
## master #12678 +/- ##
=======================================
Coverage 98.39% 98.39%
=======================================
Files 158 158
Lines 20965 20973 +8
=======================================
+ Hits 20629 20637 +8
Misses 336 336
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the full history before this PR, but this change looks good and the performance numbers look good!
@@ -52,7 +52,8 @@ | |||
codepoint2name[ord("|")] = "pipe" | |||
codepoint2name[ord("~")] = "tilde" | |||
|
|||
# static qstrs, should be sorted | |||
# static qstrs, these must maintain a specific order for .mpy compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe more extreme version of the previous idea, but what about adding a constant lookup table (equivalent of mp_binary_op_method_name
) for these? That would add 157 byte code size, but also allow this whole qstr pool to be sorted. In the next .mpy version bump the lookup table could be removed to get the code size back.
(This would also potentially allow deleting the !sorted
case from the search lookup code as all pools would be sorted, wouldn't it?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this idea! It might be easier just to go with three sorted pools (because this list is almost sorted) as described above.
Unfortunately, any runtime qstrs are not sorted (because the pools are created incrementally), so the !sorted
case needs to stay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing it's not viable to insertion sort the runtime pool? EDIT: Of course it's not, the addresses need to be stable. Oh well.
I'll stop proposing ever more complex changes now, I promise. 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main problem is that we return the qstr id at insertion time. This id is the index across all the pools (i.e. the first pool contains qstr 0 -> N, the second contains N+1 -> M, etc).
So sorting would require some way to know the final index in the pool, even though the sort ordering might change with further insertions.
There are easy ways to solve this with extra indirection of course, but that would be an extra 2*sizeof(qstr)
per entry in the runtime tables. Maybe this is worth considering?
Also other ideas around some sort of hash-table-like approach. The problem with hashtables though is that the main expensive operation we perform on the pool is set-existence (i.e. we have this new string from somewhere, and we're trying to decide whether it's an existing qstr or not). And worse, the expected outcome for any given pool is negative, which is pathological for high-load-factor hashtables.
These tests are designed to measure changes in performance relating to: - string interning / searching for existing strings - map lookup - string operations - string hashing This work was funded through GitHub Sponsors. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This test depends on the order in which qstrs are stored in ROM, which affects the order in which `dir()` will probe the object to see what it supports. Because of the lazy-loading in asyncio/__init__.py, if it tries to do e.g. `wait_for_ms` before `funcs` then it will import funcs, making `funcs` later succeed. But in the other way around, `funcs` will initially not be found. This work was funded through GitHub Sponsors. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
Required by upcoming qstr sorting. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This provides a significant performance boost for qstr_find_strn, which is called a lot during parsing and loading of .mpy files, as well as interning of string objects (which happens in most string methods that return new strings). Also adds comments to explain the "static" qstrs. These are part of the .mpy ABI and avoid needing to duplicate string data for QSTRs known to already be in the firmware. The static pool isn't currently sorted, but in the future we could either split the static pool into the sorted regions, or in the next .mpy version just sort them. Based on initial work done by @amirgon in micropython#6896. This work was funded through GitHub Sponsors. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
This is just the sorted-pools part of #10758 (originally based on #6896).
This work was funded through GitHub Sponsors.