Add a TimeUUID class #231
I tried to use the existing python libraries, but it looks like we can't use any of them.
>>> import time_uuid
>>> a = time_uuid.TimeUUID('00000000-0000-1000-0000-000000000000')
>>> b = time_uuid.TimeUUID('00000000-0000-1000-8080-808080808080')
>>> a < b
True
root@dfc90c4a88f5:/# pip install python-timeuuid
Collecting python-timeuuid
Using cached python-timeuuid-0.3.5.tar.gz (27 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [7 lines of output]
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-ptk7ypbx/python-timeuuid_4457d3d65d544d868b98787f21893a11/setup.py", line 20
print 'Building with Cython %s' % cython_version
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(...)?
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
How does Cassandra's driver work, btw? Can you compare?
Same thing. Looking at the code, they don't have any
Here's a small test case:
# cassandra-driver==3.28.0
import cassandra
from cassandra.cluster import Cluster
s = Cluster(["127.0.0.1"]).connect()
s.execute("DROP KEYSPACE IF EXISTS ks")
s.execute("CREATE KEYSPACE ks WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 1}")
s.execute("CREATE TABLE ks.t (p int, t timeuuid, PRIMARY KEY (p, t))")
queries = [
"INSERT INTO ks.t (p, t) VALUES (0, 00000000-0000-1000-0000-000000000000);",
"INSERT INTO ks.t (p, t) VALUES (0, 00000000-0000-1000-8080-808080808080);",
"INSERT INTO ks.t (p, t) VALUES (0, 00000000-0000-1000-7f7f-7f7f7f7f7f7f);",
"INSERT INTO ks.t (p, t) VALUES (0, 00000000-0000-1000-f7f7-f7f7f7f7f7f7);",
"INSERT INTO ks.t (p, t) VALUES (0, 00000000-0000-1000-ffff-ffffffffffff);",
"INSERT INTO ks.t (p, t) VALUES (0, 00000000-0000-1000-a1ca-00006490e9a4);"]
for q in queries:
    s.execute(q)
rows = list(s.execute("SELECT t FROM ks.t"))
print("rows:", rows)
uuids = [r[0] for r in rows]
print("\nuuids:", uuids)
sorted_uuids = sorted(uuids)
print("\nsorted_uuids:", sorted_uuids)
Output:
We need to be 'Cassandra-compatible' in that sense, even bug-compatible, or be able to turn the different behavior on and off. @nyh - thoughts on the above?
We can provide the wrapper as a separate class without converting the rows automatically to
I think it's a mistake to change the meaning of existing classes, as it can break existing applications. In this sense, it's fine to add a new timeuuid class. Another alternative is to declare the existing situation a bug, i.e., a real application might get confused by the wrong "<" operator. If it's a bug, then it's fine to change the existing classes. But to be honest, I'm not sure that any real application tries to use <... An application typically trusts Scylla to return the correct order, and is not in the habit of checking the order of the results.
A short description of how A The first Each hex character is 4 bits, the
This character will always be The first 3 parts is the timestamp (+ version which is always The remaining Cassandra compares
I would raise it with the upstream drivers as well, as a bug (and this PR can be opened upstream as well). As a user, I would expect the object returned by the driver to do the right ordering.
Opened https://datastax-oss.atlassian.net/browse/PYTHON-1358
@cvybhu did you notice that the Python driver already has
This can be used to convert timeuuid to time and compare it like time, without needing a new class, although I admit it doesn't give you all the power of comparing all the extra bits just like Cassandra does (but does anyone need this extra power?). There's also a delicate issue of timezone, but it's not important if all you want is to compare the order.
>>> import uuid
>>> u = uuid.uuid1()
>>> u.time
139071763117282130
>>>
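As a rough illustration of the time-based approach suggested above, the sketch below sorts values by the standard library's `uuid.UUID.time` property. It uses only the `uuid` module and the values quoted in this issue; the variable names are made up for the example. It also shows the limitation mentioned in the comment: values that share a timestamp all get the same key, so their relative order is not decided by this approach.

```python
import uuid

# The two values from the issue description.
values = [
    uuid.UUID("00000257-0efc-11ee-9547-00006490e9a6"),
    uuid.UUID("fed35080-0efb-11ee-a1ca-00006490e9a4"),
]

# Sort by the 60-bit UUID v1 timestamp instead of the raw 128-bit integer.
by_time = sorted(values, key=lambda u: u.time)
print(by_time)

# Two values from the test case earlier in the thread: same timestamp (0),
# different trailing bytes. A time-only key cannot tell them apart.
ties = [
    uuid.UUID("00000000-0000-1000-0000-000000000000"),
    uuid.UUID("00000000-0000-1000-8080-808080808080"),
]
print({u.time for u in ties})
```

Because `sorted` is stable, values with equal timestamps simply keep their input order here.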
@cvybhu FYI, someone is working on this upstream.
That's great to hear :) I shared my code with them in the JIRA issue, so they should have access to everything.
and now with the link:
Closing the issue here since upstream is working on it.
timeuuid is a UUID v1 - it contains a timestamp and some random bits that together form a unique identifier. Their comparison operator should compare them primarily based on this timestamp.
timeuuid values are currently represented using uuid.UUID, but the comparison operators of uuid.UUID don't compare timeuuid values in the same way that Cassandra/Scylla does.
For example, with values 00000257-0efc-11ee-9547-00006490e9a6 and fed35080-0efb-11ee-a1ca-00006490e9a4:
Cassandra believes that fed35080-0efb-11ee-a1ca-00006490e9a4 is smaller:
Because it has a lower timestamp value:
But UUID comparison says the opposite, as it compares the bytes in lexicographical order:
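A minimal sketch of that mismatch, using only the standard `uuid` module. The helper `v1_timestamp` is a hypothetical name introduced here; it assembles the 60-bit timestamp from the UUID v1 fields and should agree with the built-in `uuid.UUID.time` property.

```python
import uuid

a = uuid.UUID("00000257-0efc-11ee-9547-00006490e9a6")
b = uuid.UUID("fed35080-0efb-11ee-a1ca-00006490e9a4")


def v1_timestamp(u):
    # Hypothetical helper: the UUID v1 timestamp is stored "backwards".
    # time_low is the first group, time_mid the second, and the third group
    # holds the 4-bit version followed by the 12 high bits of the timestamp.
    return ((u.time_hi_version & 0x0FFF) << 48) | (u.time_mid << 32) | u.time_low


ts_a, ts_b = v1_timestamp(a), v1_timestamp(b)
print(hex(ts_a), hex(ts_b))

print(ts_b < ts_a)  # True: b has the lower timestamp
print(a < b)        # True: uuid.UUID compares the raw 128-bit value, so a sorts first
```

The two orderings disagree because the low bits of the timestamp come first in the textual and byte representation, so a plain byte-wise comparison weighs them most heavily.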
It would be useful to have a class that represents timeuuid values and has the same semantics as the values in Cassandra.
Here's an example implementation that I wrote, based on the Cassandra implementation:
https://gist.github.com/cvybhu/ed5b64d8b62eff51dc46258157a92e41
Ideally the Python driver would return values of this type when some row is selected, but this would be a breaking change.
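To make the idea concrete, here is a rough sketch of what such a wrapper could look like. This is not the code from the gist and not a driver API: the class body, `_sort_key`, and the tie-breaking rule are assumptions. In particular, it assumes that Cassandra orders by timestamp first and then compares the remaining 8 bytes as signed bytes, which is what the 0x80/0x7f/0xff test values earlier in this thread appear to probe; check that against the Cassandra source before relying on it.

```python
import functools
import uuid


@functools.total_ordering
class TimeUUID:
    """Hypothetical wrapper around uuid.UUID with timestamp-first ordering."""

    def __init__(self, value):
        self.uuid = value if isinstance(value, uuid.UUID) else uuid.UUID(str(value))

    def _sort_key(self):
        # ASSUMPTION: after the timestamp, the last 8 bytes (clock sequence
        # and node) are compared one by one as signed bytes.
        tail = tuple(b - 256 if b > 127 else b for b in self.uuid.bytes[8:])
        # The raw integer is a final tie-breaker so that ordering stays
        # consistent with equality for every possible pair of values.
        return (self.uuid.time, tail, self.uuid.int)

    def __eq__(self, other):
        if not isinstance(other, TimeUUID):
            return NotImplemented
        return self.uuid == other.uuid

    def __lt__(self, other):
        if not isinstance(other, TimeUUID):
            return NotImplemented
        return self._sort_key() < other._sort_key()

    def __hash__(self):
        return hash(self.uuid)

    def __repr__(self):
        return f"TimeUUID('{self.uuid}')"


older = TimeUUID("fed35080-0efb-11ee-a1ca-00006490e9a4")
newer = TimeUUID("00000257-0efc-11ee-9547-00006490e9a6")
print(older < newer)  # True: ordered by timestamp, unlike plain uuid.UUID

zeros = TimeUUID("00000000-0000-1000-0000-000000000000")
high_bits = TimeUUID("00000000-0000-1000-8080-808080808080")
print(high_bits < zeros)  # True under the signed-byte assumption
```

Keeping this as a separate class that wraps `uuid.UUID`, rather than changing `uuid.UUID` itself, matches the concern raised in the thread about not altering the behavior of existing classes.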