ValueError: buffer source array is read-only when deserializing a Tree from a readonly buffer. · Issue #25584 · scikit-learn/scikit-learn

Closed
ogrisel opened this issue Feb 10, 2023 · 4 comments · Fixed by #25585

@ogrisel
Member
ogrisel commented Feb 10, 2023

As observed on our Circle CI and reproduced locally:

/home/circleci/project/examples/release_highlights/plot_release_highlights_0_24_0.py failed leaving traceback:
Traceback (most recent call last):
  File "/home/circleci/mambaforge/envs/testenv/lib/python3.8/site-packages/sphinx_gallery/gen_gallery.py", line 159, in call_memory
    return 0., func()
  File "/home/circleci/mambaforge/envs/testenv/lib/python3.8/site-packages/sphinx_gallery/gen_rst.py", line 466, in __call__
    exec(self.code, self.fake_main.__dict__)
  File "/home/circleci/project/examples/release_highlights/plot_release_highlights_0_24_0.py", line 211, in <module>
    display = PartialDependenceDisplay.from_estimator(
  File "/home/circleci/project/sklearn/inspection/_plot/partial_dependence.py", line 704, in from_estimator
    pd_results = Parallel(n_jobs=n_jobs, verbose=verbose)(
  File "/home/circleci/project/sklearn/utils/parallel.py", line 63, in __call__
    return super().__call__(iterable_with_config)
  File "/home/circleci/mambaforge/envs/testenv/lib/python3.8/site-packages/joblib/parallel.py", line 1098, in __call__
    self.retrieve()
  File "/home/circleci/mambaforge/envs/testenv/lib/python3.8/site-packages/joblib/parallel.py", line 975, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/circleci/mambaforge/envs/testenv/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 567, in wrap_future_result
    return future.result(timeout=timeout)
  File "/home/circleci/mambaforge/envs/testenv/lib/python3.8/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/home/circleci/mambaforge/envs/testenv/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

https://app.circleci.com/pipelines/github/scikit-learn/scikit-learn/43516/workflows/d8bd88b5-872e-4f50-88d4-e41886117849/jobs/225593

@ogrisel
Member Author
ogrisel commented Feb 10, 2023

Here is what I got locally:

Features selected by forward sequential selection: ['sepal length (cm)', 'petal width (cm)']
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/site-packages/joblib/externals/loky/process_executor.py", line 391, in _process_worker
    call_item = call_queue.get(block=True, timeout=timeout)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/multiprocessing/queues.py", line 122, in get
    return _ForkingPickler.loads(res)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "sklearn/tree/_tree.pyx", line 711, in sklearn.tree._tree.Tree.__setstate__
    cdef Node[::1] node_memory_view = node_ndarray
  File "stringsource", line 660, in View.MemoryView.memoryview_cwrapper
  File "stringsource", line 350, in View.MemoryView.memoryview.__cinit__
ValueError: buffer source array is read-only
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/ogrisel/code/scikit-learn/examples/release_highlights/plot_release_highlights_0_24_0.py", line 211, in <module>
    display = PartialDependenceDisplay.from_estimator(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/code/scikit-learn/sklearn/inspection/_plot/partial_dependence.py", line 704, in from_estimator
    pd_results = Parallel(n_jobs=n_jobs, verbose=verbose)(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/code/scikit-learn/sklearn/utils/parallel.py", line 63, in __call__
    return super().__call__(iterable_with_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/site-packages/joblib/parallel.py", line 1098, in __call__
    self.retrieve()
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/site-packages/joblib/parallel.py", line 975, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/site-packages/joblib/_parallel_backends.py", line 567, in wrap_future_result
    return future.result(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/concurrent/futures/_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/ogrisel/mambaforge/envs/dev/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

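So this is probably related to a change in how we handle read-only buffers in decision trees, most likely #25540: the remote traceback shows Tree.__setstate__ failing when it assigns node_ndarray to a typed memoryview.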
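To illustrate the mechanism suspected above, here is a minimal standard-library sketch, not the scikit-learn code itself. It assumes that `ctypes.c_char.from_buffer` is a fair stand-in for any consumer that insists on a writable buffer, as the typed memoryview in the traceback appears to. The exception type differs (`TypeError` here, `ValueError` from Cython), and the 16-byte temporary file is purely illustrative.

```python
import ctypes
import mmap
import tempfile

# Create a small temporary file and map it read-only, which is roughly
# what a read-only memory-mapped load does (assumption for illustration).
with tempfile.TemporaryFile() as f:
    f.write(b"\x00" * 16)
    f.flush()
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        # A plain memoryview accepts the buffer and reports it as read-only.
        view = memoryview(mapped)
        is_readonly = view.readonly
        view.release()

        # A consumer that requires a writable buffer rejects it.
        try:
            ctypes.c_char.from_buffer(mapped)
            writable_request_failed = False
        except TypeError:
            writable_request_failed = True

        # Copying the bytes yields a buffer that is writable again.
        writable_copy = bytearray(mapped)

# The same writable-buffer consumer now succeeds on the copy.
first_byte = ctypes.c_char.from_buffer(writable_copy)
first_byte.value = b"\x01"
```

The copy trades memory for writability; a consumer that only reads could instead accept read-only buffers directly.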

@OmarManzoor
Contributor

@ogrisel I'll fix this.

ogrisel changed the title from "joblib/loky BrokenProcessPool raised when calling PartialDependenceDisplay.from_estimator in examples" to "ValueError: buffer source array is read-only when deserializing a Tree from a readonly buffer." on Feb 10, 2023
@ogrisel
Member Author
ogrisel commented Feb 10, 2023

Here is a minimal reproducer that does not involve spawning new Python processes:

>>> from sklearn.tree import DecisionTreeClassifier
>>> import numpy as np
>>> import joblib
>>> clf = DecisionTreeClassifier().fit(np.random.randn(100, 10), np.random.randint(0, 2, 100))
>>> joblib.dump(clf, "clf.joblib")
['clf.joblib']
>>> joblib.load("clf.joblib", mmap_mode="r")
Traceback (most recent call last):
  Cell In[7], line 1
    joblib.load("clf.joblib", mmap_mode="r")
  File ~/mambaforge/envs/dev/lib/python3.11/site-packages/joblib/numpy_pickle.py:658 in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File ~/mambaforge/envs/dev/lib/python3.11/site-packages/joblib/numpy_pickle.py:577 in _unpickle
    obj = unpickler.load()
  File ~/mambaforge/envs/dev/lib/python3.11/pickle.py:1213 in load
    dispatch[key[0]](self)
  File ~/mambaforge/envs/dev/lib/python3.11/site-packages/joblib/numpy_pickle.py:402 in load_build
    Unpickler.load_build(self)
  File ~/mambaforge/envs/dev/lib/python3.11/pickle.py:1718 in load_build
    setstate(state)
  File sklearn/tree/_tree.pyx:711 in sklearn.tree._tree.Tree.__setstate__
    cdef Node[::1] node_memory_view = node_ndarray
  File stringsource:660 in View.MemoryView.memoryview_cwrapper
  File stringsource:350 in View.MemoryView.memoryview.__cinit__
ValueError: buffer source array is read-only
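The reproducer above depends on arrays loaded with mmap_mode="r" being read-only. The NumPy-only sketch below shows that property in isolation and one generic workaround (copying into a writable array). It assumes joblib's memory-mapped arrays behave like those returned by `np.load(..., mmap_mode="r")`; the file name `nodes.npy` and the array contents are hypothetical, and the copy is not necessarily what the actual fix does.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical stand-in for the node array stored in the pickle.
    path = os.path.join(tmp_dir, "nodes.npy")
    np.save(path, np.arange(6, dtype=np.float64))

    # mmap_mode="r" maps the file read-only.
    mapped = np.load(path, mmap_mode="r")
    mapped_is_writeable = bool(mapped.flags.writeable)
    buffer_is_readonly = memoryview(mapped).readonly

    # Writing through the read-only map is rejected by NumPy.
    try:
        mapped[0] = 1.0
        write_failed = False
    except ValueError:
        write_failed = True

    # An explicit copy lives in regular memory and is writable.
    writable = np.array(mapped, copy=True)
    writable[0] = 1.0

    # Drop the memory map before the temporary directory is removed.
    del mapped
```

Copying defeats the memory savings of memory mapping, so it is only a reasonable workaround for small arrays.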

@OmarManzoor
Contributor

Thanks!
