-
Notifications
You must be signed in to change notification settings - Fork 73
latest version 0.9.0 does not allow compression of empty string #27
New issue
Have a question about this project? Sign up for a free GitHub account to open an iss 8000 ue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
This wasn't an intentional change - we should support the previous behaviour. Will look into it when I get avbit of time. PRs always welcome too :) |
Looking at the code, I realize it was an intentional change, as I thought it would be a helpful check - I couldn't imagine a use case for compressing an empty string. I suppose you have provided one. I'll add an option to check for empty strings, which defaults to off, to retain the old behaviour. |
@jonathanunderwood more than happy to supply a PR, I just wanted to understand the reason for the change, and learn if a PR would be accepted or not. Will try and produce one this weekend. |
There are valid use cases for passing a zero length string to compress: #27
Ok, I think I've fixed this, will cut a new release. Please re-open if you think there's still a problem. |
looks good, thanks. I'll test it out once a new release is on pypi. |
Previously we returned a string (bytes) object.
As far as I can tell, this lets you compress and decompress the empty string, but doesn't allow you to decompress empty strings compressed with <v0.9.0. From bmoscon's first comment opening this issue,
However, the decompression isn't backwards compatible; on v0.8.2, this decompresses to the empty string:
But on v0.9.1 (and newer, as far as I can tell):
Would it make sense to support the old (v0.8.2) "corrupt" compressed empty strings? As far as I know, this would mean checking that the size information is zero, and accepting any garbage after that. |
I'm in two minds over this. On the face of it, this seems like a reasonable idea. But, it would special case the zero length case. At the moment, we always error if there is junk in the passed in bytes, irrespective of uncompressed size. My inclination is to leave it as simple as possible. |
Simple as possible always best! |
Uh oh!
There was an error while loading. Please reload this page.
Previous versions of python-lz4 allowed compression of the empty string:
the latest version fails:
Was there a reason for this change? We use lz4 for compression of data we write to a database. Some of the records might contain empty strings. If I were to upgrade to the latest version of lz4 I'd no longer be able to support datasets with empty strings, and I'd also no longer be able to decompress the compressed data that's already been written.
The text was updated successfully, but these errors were encountered: