8000 MAINT: Refactor numpy/lib/format.py by pnbat · Pull Request #10525 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

MAINT: Refactor numpy/lib/format.py #10525

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed

Conversation

pnbat
Copy link
Contributor
@pnbat pnbat commented Feb 5, 2018

Refactoring of format.py in order to remove code duplication and improve readability and maintainability.

The computation of the required padding bytes (for alignment) was fixed, although it was only a very minor issue: in case the output would have been already perfectly aligned, 64 more bytes would have been added (which does not seem a big issue to me) and this could also lead to usage of header format version 2, where version 1 would have been sufficient (even less likely and also not a big issue).

Please let me know if you want me to go further with the refactoring. The reading (deserialization) could be refactored in the same spirit. Alignment and buffer sizes (for reading / writing) could be made configurable via parameters. Unit tests should be much easier using the new classes and could be adapted accordingly.

@charris charris changed the title ENH: refactoring of numpy/lib/format.py MAINt: Refactor numpy/lib/format.py Feb 5, 2018
@charris charris changed the title MAINt: Refactor numpy/lib/format.py MAINT: Refactor numpy/lib/format.py Feb 5, 2018
VERSION_2 = (2, 0)


class DictionarySerializer(object):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this class would be better as 2 or 3 free functions - there's no state here, so I see little point in using class


AUTOMATIC = None
VERSION_1 = (1, 0)
VERSION_2 = (2, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems redundant - version = (2, 0) seems clearer and more extensible that version = VERSION_2 to me.

@mattip mattip added the 57 - Close? Issues which may be closable unless discussion continued label Jun 4, 2018
@rgommers
Copy link
Member

there do seem to be a few tiny fixes in here, but agree it's unclear that the big refactoring is worth the code churn. @rkern any opinion as the author of this code?

@rkern
Copy link
Member
rkern commented Jul 14, 2018

I agree; I'm not sure the refactoring is enough of an improvement. To my eyes, it's a lateral move, though I recognize that can be a matter of taste.

@rgommers
Copy link
Member

Please let me know if you want me to go further with the refactoring.

@pnbat it sounds like we'd prefer not to have the refactoring.

The computation of the required padding bytes (for alignment) was fixed, although it was only a very minor issue: in case the output would have been already perfectly aligned, 64 more bytes would have been added (which does not seem a big issue to me) and this could also lead to usage of header format version 2, where version 1 would have been sufficient (even less likely and also not a big issue).

Would you be able to write a test case for this? If so, stripping this PR down a bit while keeping your fixes for this would still be good to have.

@pnbat
Copy link
Contributor Author
pnbat commented Aug 1, 2018

Actually, I was not expecting to hear anything about that pull request anymore. Unfortunately, I currently do not have the time and motivation to support the numpy project. The issue mentioned in the previous mail is really minor and could probably be ignored.

My whole point is that such issues (the initial bug and the minor issue mentioned above) just occur, because the internal quality / structure of the code could be much better. As an example, saying that (2, 0) seems clearer than using constants is just absurd. Well, (2, 0) is much more extensible, I agree, but I would be surprised to see a lot more versions here in the future.

I am aware that my changes are pretty radical and do not really fit into the overall style of (what I saw from the python part of) numpy. I don't think it is only about taste and personal preferences, but I am fine leaving it at that.

@pnbat pnbat closed this Aug 1, 2018
@rgommers
Copy link
Member
rgommers commented Aug 1, 2018

Thanks for trying to improve numpy anyway @pnbat

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
03 - Maintenance 57 - Close? Issues which may be closable unless discussion continued component: numpy.lib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0