ValueError: Buffer dtype mismatch, expected 'int' but got 'long' #10758
Large sparse matrices are not yet supported, but may be in the next release, thanks to @kdhingra307's work in #9678. Closing as a near-duplicate of #2969 etc.
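For readers hitting this: the error appears once a sparse matrix's index arrays are promoted to 64-bit, which SciPy does when the number of stored elements no longer fits in a 32-bit integer. A quick diagnostic sketch (the small matrix `X` below is a stand-in for your own data):

```python
import numpy as np
import scipy.sparse as sp

# Stand-in for a real training matrix; replace with your own data.
X = sp.random(1000, 50, density=0.1, format='csr', random_state=0)

# SciPy promotes indptr/indices to int64 once the number of stored
# elements exceeds 2**31 - 1. The SAG solver in sklearn 0.19/0.20
# only accepts 32-bit indices, hence "Buffer dtype mismatch".
print(X.nnz, X.indptr.dtype, X.indices.dtype)

too_large_for_int32 = X.nnz > np.iinfo(np.int32).max
print(too_large_for_int32)  # False for this small example
```

If this prints `True` for your matrix, the affected solvers cannot consume it directly.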
@jnothman I expected this to happen; it comes down to how large indices are handled directly in Cython. I have completed most aspects of it, and I just want to test it thoroughly, because we have changed the default condition.
Thanks for your effort, everyone.
I still got the same error with v0.20.3:

```python
model = LogisticRegression(
    C=1,
    solver='sag',
    random_state=0,
    tol=0.0001,
    max_iter=100,
    verbose=1,
    warm_start=True,
    n_jobs=64,
    penalty='l2',
    dual=False,
    multi_class='ovr',
)
model.fit(train_data.inputs, train_data.targets)
```

```
/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/base.py:253: UserWarning: Trying to unpickle estimator StandardScaler from version 0.20.0 when using version 0.20.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
```
```
[Parallel(n_jobs=64)]: Using backend ThreadingBackend with 64 concurrent workers.
Traceback (most recent call last):
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2862, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-5936f410e945>", line 58, in <module>
    model.fit(train_data_cadd.inputs, train_data_cadd.targets)
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/comet_ml/monkey_patching.py", line 244, in wrapper
    return_value = original(*args, **kwargs)
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/linear_model/logistic.py", line 1363, in fit
    for class_, warm_start_coef_ in zip(classes_, warm_start_coef))
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 930, in __call__
    self.retrieve()
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 567, in __call__
    return self.func(*args, **kwargs)
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/linear_model/logistic.py", line 792, in logistic_regression_path
    is_saga=(solver == 'saga'))
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/linear_model/sag.py", line 305, in sag_solver
    dataset, intercept_decay = make_dataset(X, y, sample_weight, random_state)
  File "/opt/modules/i12g/anaconda/3-5.0.1/lib/python3.6/site-packages/sklearn/linear_model/base.py", line 84, in make_dataset
    seed=seed)
  File "sklearn/utils/seq_dataset.pyx", line 259, in sklearn.utils.seq_dataset.CSRDataset.__cinit__
ValueError: Buffer dtype mismatch, expected 'int' but got 'long'
```

```
>>> sklearn.__version__
Out[5]: '0.20.3'
>>> train_data.inputs
Out[6]:
<28034374x904 sparse matrix of type '<class 'numpy.float32'>'
    with 2223406363 stored elements in Compressed Sparse Row format>
```

Is there some way I can still train my data?
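One possible workaround, until 64-bit index support lands, is to train incrementally so that no single chunk exceeds the 32-bit index limit. The sketch below uses `SGDClassifier.partial_fit`, which is only an approximation of `LogisticRegression` (a linear model trained by SGD rather than SAG; the default hinge loss is used here, since the name of the logistic loss option differs between sklearn versions). `X` and `y` are hypothetical stand-ins for `train_data.inputs` and `train_data.targets`:

```python
import numpy as np
import scipy.sparse as sp
from sklearn.linear_model import SGDClassifier

# Hypothetical stand-ins for train_data.inputs / train_data.targets.
rng = np.random.RandomState(0)
X = sp.random(10000, 50, density=0.05, format='csr', random_state=rng).astype(np.float32)
y = rng.randint(0, 2, size=X.shape[0])

clf = SGDClassifier(random_state=0)
classes = np.unique(y)

chunk = 2000  # pick a size so each slice stays well under 2**31 - 1 nonzeros
for start in range(0, X.shape[0], chunk):
    Xc = X[start:start + chunk]
    # Each chunk's nonzero count fits in int32, so if the parent matrix
    # carried int64 index arrays, downcasting the chunk is safe.
    Xc.indices = Xc.indices.astype(np.int32, copy=False)
    Xc.indptr = Xc.indptr.astype(np.int32, copy=False)
    clf.partial_fit(Xc, y[start:start + chunk], classes=classes)

print(clf.score(X[:2000], y[:2000]))
```

The per-chunk downcast is the key step: row-slicing produces a small matrix whose indices are guaranteed to fit in 32 bits, which keeps the Cython dataset happy.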
I was trying this code:

but got this error:

The `csr_matrix` `Q`'s size is:

My OS: Linux version 2.6.32-504.23.4.el6.x86_64
My Python: Python 3.6.1 |Anaconda custom (64-bit)| (default, May 11 2017, 13:09:58)
My SciPy: 0.19.0
My scikit-learn: 0.19.0