-
-
Notifications
You must be signed in to change notification settings - Fork 11k
MAINT: Refactor numpy/lib/format.py #10525
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
VERSION_2 = (2, 0) | ||
|
||
|
||
class DictionarySerializer(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this class would be better as 2 or 3 free functions - there's no state here, so I see little point in using class
|
||
AUTOMATIC = None | ||
VERSION_1 = (1, 0) | ||
VERSION_2 = (2, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems redundant - version = (2, 0)
seems clearer and more extensible that version = VERSION_2
to me.
there do seem to be a few tiny fixes in here, but agree it's unclear that the big refactoring is worth the code churn. @rkern any opinion as the author of this code? |
I agree; I'm not sure the refactoring is enough of an improvement. To my eyes, it's a lateral move, though I recognize that can be a matter of taste. |
@pnbat it sounds like we'd prefer not to have the refactoring.
Would you be able to write a test case for this? If so, stripping this PR down a bit while keeping your fixes for this would still be good to have. |
Actually, I was not expecting to hear anything about that pull request anymore. Unfortunately, I currently do not have the time and motivation to support the numpy project. The issue mentioned in the previous mail is really minor and could probably be ignored. My whole point is that such issues (the initial bug and the minor issue mentioned above) just occur, because the internal quality / structure of the code could be much better. As an example, saying that (2, 0) seems clearer than using constants is just absurd. Well, (2, 0) is much more extensible, I agree, but I would be surprised to see a lot more versions here in the future. I am aware that my changes are pretty radical and do not really fit into the overall style of (what I saw from the python part of) numpy. I don't think it is only about taste and personal preferences, but I am fine leaving it at that. |
Thanks for trying to improve numpy anyway @pnbat |
Refactoring of format.py in order to remove code duplication and improve readability and maintainability.
The computation of the required padding bytes (for alignment) was fixed, although it was only a very minor issue: in case the output would have been already perfectly aligned, 64 more bytes would have been added (which does not seem a big issue to me) and this could also lead to usage of header format version 2, where version 1 would have been sufficient (even less likely and also not a big issue).
Please let me know if you want me to go further with the refactoring. The reading (deserialization) could be refactored in the same spirit. Alignment and buffer sizes (for reading / writing) could be made configurable via parameters. Unit tests should be much easier using the new classes and could be adapted accordingly.