Trees incompatible between 32bit and 64bit version #2972
Comments
That's an ugly one... it would be much nicer if we supported this.
I'll try to look into it today...
On a 64bit OSX system, replacing the
We already had this discussion... We switched from See #1458
And it looks like the remaining data pickled as SIZE_t arrays are just
I think that discussion just says we can't allow the tree to grow larger than an int32 can represent...
CC @larsmans: Would love to have your opinion on this.
We could either use int32 or int64 on all platforms, right? Or is there a problem with using int64 indices on 32bit platforms?
The problem with int32 is that it just doesn't scale up. If we want to support sparse inputs, then switching to 32 bits when SciPy is just switching to 64 is a bad idea (see #2969). The problem with int64 is that it's slower on 32-bit hardware, but I haven't tested to see how slow (even incrementing a The clean solution is to fix the serialization. The second best, IMHO, is to use
How would you fix the serialization? When loading the class, convert to the native format?
Yes. And always store as 64-bit, then check for too large indices.
Why do you need to always store in 64bit if you convert anyhow?
So basically I need to overwrite
To get a consistent file format. It's not strictly necessary but it simplifies things.
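The idea discussed above (always store indices as 64-bit, convert to the native width on load, and reject indices that are too large) can be sketched roughly as follows. This is not scikit-learn code: `pack_indices` and `unpack_indices` are hypothetical helper names, plain Python integers stand in for the tree's index arrays, and `sys.maxsize` is assumed to be a reasonable proxy for the largest native index.

```python
import struct
import sys


def pack_indices(indices):
    """Serialize indices as little-endian signed 64-bit integers on every host."""
    return struct.pack("<%dq" % len(indices), *indices)


def unpack_indices(data, native_max=sys.maxsize):
    """Decode 64-bit indices, rejecting any the loading platform cannot hold.

    native_max is the largest value of the native signed index type; it
    defaults to sys.maxsize, which is 2**31 - 1 on a 32-bit interpreter.
    """
    count, remainder = divmod(len(data), 8)
    if remainder:
        raise ValueError("payload length is not a multiple of 8 bytes")
    indices = list(struct.unpack("<%dq" % count, data))
    for value in indices:
        # Signed range check, so sentinel values such as -1 still pass.
        if not -native_max - 1 <= value <= native_max:
            raise OverflowError(
                "index %d does not fit the native index type" % value
            )
    return indices
```

With this layout the file format is the same on both platforms, and a 32-bit loader only fails when a stored index genuinely exceeds what it can address, rather than on every file written by a 64-bit machine.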
Don't think it's worth it.
It would still be nice to support this. I am computing data on a 64-bit machine; afterwards I fit a tree on that data and use feature_importances_ to remove unnecessary features. In the next step I fit another tree on the stripped-down data. This tree should be used on a 32-bit microcontroller. The problem is that I can't load the tree on the 32-bit system. So I tried to refit the tree on 32-bit, but I cannot load my data because it's over 4GB. So I'm out of luck here. What should I do then?
Have you tried setting parameters to reduce the size of your model?
I found a solution: I need to save the stripped-down features used to fit the model, put them on my device, and refit the tree on the device itself (so I also avoid version incompatibilities).
I'm not sure if this is a known issue, but I think trees built on 64bit cannot be unpickled on 32bit.
How hard would it be to allow that? I think the problem is that Size_t is different between the two platforms.
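As a quick way to see the platform difference described here, the snippet below reports the width of the C `size_t` type on the running interpreter. It is only an illustration using the standard library: `native_index_bytes` and `describe_platform` are hypothetical helper names, and it assumes the tree's index type follows the platform's `size_t` / pointer width.

```python
import ctypes
import struct


def native_index_bytes():
    """Width in bytes of the C size_t type on the running interpreter."""
    return ctypes.sizeof(ctypes.c_size_t)


def describe_platform():
    """Summarize pointer width and size_t width for the current platform."""
    pointer_bits = struct.calcsize("P") * 8
    return "%d-bit pointers, %d-byte size_t" % (pointer_bits, native_index_bytes())


if __name__ == "__main__":
    print(describe_platform())
```

An array dumped byte-for-byte with the width reported on one machine cannot be read back as-is on a machine that reports a different width, which is the mismatch this issue is about.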