Add 32 bit support to neural_network module #17700

Closed · rth opened this issue Jun 24, 2020 · 5 comments · Fixed by #17759
Labels: good first issue, Performance, Sprint


@rth (Member) commented Jun 24, 2020

Related to #11000, it would be good to add support for 32-bit computations for the estimators in the neural_network module. This was done for BernoulliRBM in #16352. Because performance is bound by the dot product, this is going to have a large impact (cf. #17641 (comment)).

I would even argue that, unlike other models, it could make sense to add a dtype=np.float32 parameter and make calculations in 32 bit by default, regardless of X.dtype. We could also consider supporting dtype=np.float16.
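As a rough illustration of why the dtype matters here (not from the issue; numbers depend entirely on the BLAS build and CPU), float32 matrix products typically run noticeably faster than float64, since they halve memory traffic and double the effective SIMD width:

```python
import numpy as np
from time import perf_counter

rng = np.random.RandomState(0)
A64, B64 = rng.rand(2000, 2000), rng.rand(2000, 2000)
A32, B32 = A64.astype(np.float32), B64.astype(np.float32)

for name, a, b in [("float64", A64, B64), ("float32", A32, B32)]:
    start = perf_counter()
    for _ in range(10):
        a @ b  # GEMM dominates MLP training cost
    print(name, (perf_counter() - start) / 10, "s per matmul")
```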

@NicolasHug (Member) commented

I was under the impression that we decided to stop providing new features / improvements to the NN module?

@rth (Member, Author) commented Jun 24, 2020

Stop providing features, yes; performance improvements, no, I think. Since we do include it, people will use it in simple cases where installing pytorch/tensorflow is a hassle, and there is no reason to keep our implementation slower than it needs to be.

Here we are talking about easy changes to half a dozen lines (similar to #16352) with a clear performance improvement.

@postmalloc (Contributor) commented

@rth, I've tried passing dtype as a parameter to MLP, something like this:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# dtype here is the parameter added in my patched branch
clf = MLPClassifier(alpha=1e-5,
                    hidden_layer_sizes=(10, 5, 3),
                    random_state=1, max_iter=100,
                    dtype=np.float32)
```

Internally, I cast all the network parameters to this dtype. A basic speed comparison of _fit shows that float32 is faster than float64, as expected.

float64 => 4.48 s ± 41.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
float32 => 4.01 s ± 74.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The dataset is something like this:

```python
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=10000, n_features=30,
                           n_informative=15, n_classes=5, random_state=0)
```
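For reference, a sketch of how such a comparison could be set up; the dtype parameter is the hypothetical one from the patched branch above, not released scikit-learn, and absolute timings will vary:

```python
import numpy as np
from timeit import timeit
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=10000, n_features=30,
                           n_informative=15, n_classes=5, random_state=0)

for dtype in (np.float64, np.float32):
    # dtype is the hypothetical parameter from the patched branch;
    # it does not exist in released scikit-learn
    clf = MLPClassifier(alpha=1e-5, hidden_layer_sizes=(10, 5, 3),
                        random_state=1, max_iter=100, dtype=dtype)
    print(dtype.__name__, f"{timeit(lambda: clf.fit(X, y), number=1):.2f} s")
```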

@rth (Member, Author) commented Jun 27, 2020

Great! I was hoping it would be a bit more significant, but I guess it also depends on the CPU's vectorization support.

Maybe let's not add the dtype parameter after all, but rather do as in #16352: run

```python
X = check_array(X, ..., dtype=(np.float64, np.float32))
```

first, and then use X.dtype when creating new arrays. That would probably be more consistent with the other estimators. Would you like to make a PR?
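A minimal sketch of that pattern (assumed from #16352, not the actual merged diff): validation with a dtype tuple preserves float32 input, and downstream allocations then follow X.dtype:

```python
import numpy as np
from sklearn.utils import check_array

def _validate(X):
    # the first dtype in the tuple (float64) is only used as the
    # conversion target when the input dtype matches neither entry;
    # float32 input passes through unchanged
    return check_array(X, accept_sparse=["csr", "csc"],
                       dtype=(np.float64, np.float32))

X = _validate(np.ones((5, 3), dtype=np.float32))
assert X.dtype == np.float32  # preserved, not upcast to float64

# new arrays (coefficients, activations, ...) follow the input dtype
coef = np.zeros((X.shape[1], 10), dtype=X.dtype)
```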

@postmalloc (Contributor) commented

Sure @rth, will open a PR in a bit.
