-
-
Notifications
You must be signed in to change notification settings - Fork 32k
uuid.getnode() is not tied to MAC address when using libuuid
#132710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
One thing that caught my eye in the if os.name == 'posix':
_GETTERS = [_unix_getnode] + _OS_GETTERS In binaries where So shouldn't That is, shouldn't that line be: if os.name == 'posix':
_GETTERS = _OS_GETTERS + [_unix_getnode] ? |
@picnixz would you be the right person to look at this :) I know you've been doing some work in this space |
So it's correct to first get I'll have a more in-depth look tomorrow. I'll also try to build locally to see what happens. |
As a note, |
That's a good thing to check. Let's just generate a bunch of node addresses, and if they change, fallback to something else. |
I've looked at the implementation of uuid.h that PBS uses: /*
* Get the ethernet hardware address, if we can find it...
*
* XXX for a windows version, probably should use GetAdaptersInfo:
* http://www.codeguru.com/cpp/i-n/network/networkinformation/article.php/c5451
* commenting out get_node_id just to get gen_uuid to compile under windows
* is not the right way to go!
*/
static int get_node_id(unsigned char *node_id)
{
#ifdef HAVE_NET_IF_H
int sd;
struct ifreq ifr, *ifrp;
struct ifconf ifc;
char buf[1024];
int n, i;
unsigned char *a;
#ifdef HAVE_NET_IF_DL_H
struct sockaddr_dl *sdlp;
#endif
/*
* BSD 4.4 defines the size of an ifreq to be
* max(sizeof(ifreq), sizeof(ifreq.ifr_name)+ifreq.ifr_addr.sa_len
* However, under earlier systems, sa_len isn't present, so the size is
* just sizeof(struct ifreq)
*/
#ifdef HAVE_SA_LEN
#define ifreq_size(i) max(sizeof(struct ifreq),\
sizeof((i).ifr_name)+(i).ifr_addr.sa_len)
#else
#define ifreq_size(i) sizeof(struct ifreq)
#endif /* HAVE_SA_LEN */
sd = socket(AF_INET, SOCK_DGRAM, IPPROTO_IP);
if (sd < 0) {
return -1;
}
memset(buf, 0, sizeof(buf));
ifc.ifc_len = sizeof(buf);
ifc.ifc_buf = buf;
if (ioctl (sd, SIOCGIFCONF, (char *)&ifc) < 0) {
close(sd);
return -1;
}
n = ifc.ifc_len;
for (i = 0; i < n; i+= ifreq_size(*ifrp) ) {
ifrp = (struct ifreq *)((char *) ifc.ifc_buf+i);
strncpy(ifr.ifr_name, ifrp->ifr_name, IFNAMSIZ);
#ifdef SIOCGIFHWADDR
if (ioctl(sd, SIOCGIFHWADDR, &ifr) < 0)
continue;
a = (unsigned char *) &ifr.ifr_hwaddr.sa_data;
#else
#ifdef SIOCGENADDR
if (ioctl(sd, SIOCGENADDR, &ifr) < 0)
continue;
a = (unsigned char *) ifr.ifr_enaddr;
#else
#ifdef HAVE_NET_IF_DL_H
sdlp = (struct sockaddr_dl *) &ifrp->ifr_addr;
if ((sdlp->sdl_family != AF_LINK) || (sdlp->sdl_alen != 6))
continue;
a = (unsigned char *) &sdlp->sdl_data[sdlp->sdl_nlen];
#else
/*
* XXX we don't have a way of getting the hardware
* address
*/
close(sd);
return 0;
#endif /* HAVE_NET_IF_DL_H */
#endif /* SIOCGENADDR */
#endif /* SIOCGIFHWADDR */
if (!a[0] && !a[1] && !a[2] && !a[3] && !a[4] && !a[5])
continue;
if (node_id) {
memcpy(node_id, a, 6);
close(sd);
return 1;
}
}
close(sd);
#endif
return 0;
} The UUID generation is as follows: int __uuid_generate_time(uuid_t out, int *num)
{
static unsigned char node_id[6];
static int has_init = 0;
struct uuid uu;
uint32_t clock_mid;
int ret;
if (!has_init) {
if (get_node_id(node_id) <= 0) {
random_get_bytes(node_id, 6);
/*
* Set multicast bit, to prevent conflicts
* with IEEE 802 addresses obtained from
* network cards
*/
node_id[0] |= 0x01;
}
has_init = 1;
}
ret = get_clock(&clock_mid, &uu.time_low, &uu.clock_seq, num);
uu.clock_seq |= 0x8000;
uu.time_mid = (uint16_t) clock_mid;
uu.time_hi_and_version = ((clock_mid >> 16) & 0x0FFF) | 0x1000;
memcpy(uu.node, node_id, 6);
uuid_pack(&uu, out);
return ret;
} What we call is int uuid_generate_time_safe(uuid_t out)
{
return uuid_generate_time_generic(out);
} And the generic implementation is: static int uuid_generate_time_generic(uuid_t out) {
#ifdef HAVE_TLS
...
#else
if (get_uuid_via_daemon(UUIDD_OP_TIME_UUID, out, 0) == 0)
return 0;
#endif
return __uuid_generate_time(out, 0);
} Now what's important is the following: if (!has_init) {
if (get_node_id(node_id) <= 0) {
random_get_bytes(node_id, 6);
/*
* Set multicast bit, to prevent conflicts
* with IEEE 802 addresses obtained from
* network cards
*/
node_id[0] |= 0x01;
}
has_init = 1;
} If the node cannot be generated, a random one is generated. This is the reason (probably) why we have different node values. However, I see some reasons why we're re-doing the
I'd say the variable should have at least be declared as |
Note: locally on my machine (openSUSE 15.5, ASUS ROG), I don't have different nodes being created. |
Note 2: actually, I doubt I can do anything on Python's side. The reason is that the result is cached so changes can only occur if we have different interpreters runs and not just multiple function calls. One way to do it is to materialize two different Python interpreters for the sake of checking but I don't know if it's not just slower then... EDIT: Oh I may have a small idea... we could create two configure programs and check for the node's output. If it gives two different results, then we can assume that the function is unstable. But this would only work if for non-cross-builds so I don't know if it can work for the python-build-standalone project @zanieb |
libuuid
Ah, this stinks. Hm..
I think this makes sense, but to start we really just need a variable that's not Digging into the root cause... I wonder why |
…#11138) ## Describe your changes Rename the new `stableRandomMachineId` to `machineIdV4` to make it clear (1) we can't rely on `machineIdV3`, and (2) `machineIdV4` is the right one to use. [Originally](b44df19) I avoided doing this because the goal is to keep both IDs side by side, since that will help debug things or even roll back in the future. But even if we name it `machineIdV4` nothing precludes us from keeping them side by side. In the meantime, we're also following bug reports astral-sh/python-build-standalone#587 and python/cpython#132710 , which might rehabilitate `machineIdV3`. ## GitHub Issue Link (if applicable) n/a ## Testing Plan - Explanation of why no additional tests are needed: this PR just renames variables. - ~~Unit Tests (JS and/or Python)~~ - ~~E2E Tests~~ - ~~Any manual testing needed?~~ --- **Contribution License Agreement** By submitting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.
Good news is: I think I found a way to fix the issue and will push the fix by the end of the week |
PR is ready: #132901. |
Uh oh!
There was an error while loading. Please reload this page.
Bug report
Bug description:
According to the docs, uuid.getnode() is meant to return a number based on the MAC address of the network interface. However, if Python is built with
libuuid
, this does not match the observed behavior. Instead,getnode()
often produces a random number.Versions of Python obtained through python.org,
brew
, andpyenv
don't appear to display this bug. But versions obtained throughuv
andpython-build-standalone
do.The key difference seems to be which branch of this
try
block insideuuid.py
is executed:When the top branch executes,
getnode()
produces a random number.When the bottom branch executes,
getnode()
produces a number tied to the MAC address of the network interface.Steps to reproduce:
Case 1: working as intended
Using a version of Python compiled with these flags...
...we get this behavior:
Case 2: buggy behavior
Using a version of Python compiled with these flags...
...we get this behavior:
CPython versions tested on:
3.13
Operating systems tested on:
macOS, Linux
Linked PRs
_uuid.generate_time_safe()
to deduce MAC address #132901The text was updated successfully, but these errors were encountered: