can't setup.py install without numpy · Issue #4164 · scikit-learn/scikit-learn · GitHub

can't setup.py install without numpy #4164

Closed
bukzor opened this issue Jan 26, 2015 · 32 comments · Fixed by #6990

Comments

@bukzor
bukzor commented Jan 26, 2015

At minimum, this could use a better error message. Is it not possible to have a project that depends on sklearn and installs all its dependencies to a virtualenv in a single pass? Must I do one pass to install numpy and everything else, and a second pass just to install sklearn?

$ python setup.py install
Partial import of sklearn during the build process.
Traceback (most recent call last):
  File "setup.py", line 154, in <module>
    setup_package()
  File "setup.py", line 146, in setup_package
    from numpy.distutils.core import setup
ImportError: No module named numpy.distutils.core
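
One way the "better error message" request could be approached is sketched below. This is only an illustration, not scikit-learn's actual setup.py: the helper name `import_build_dependency` and the wording of the message are assumptions made for this example.

```python
import importlib


def import_build_dependency(module_name, hint):
    """Import a module needed at build time, or stop with a readable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        # Replace the bare traceback with an actionable explanation.
        raise SystemExit(
            "Cannot build from source: the module '{0}' is required but could "
            "not be imported. {1}".format(module_name, hint)
        )
```

A setup.py could call `import_build_dependency("numpy", "Install numpy first, e.g. 'pip install numpy'.")` before importing anything from `numpy.distutils`, so a missing numpy produces one explanatory line instead of a traceback.
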
@amueller
Member

What is your current install setup? Scikit-learn requires numpy and if you install everything together, it should work fine.

@GaelVaroquaux
Member

At minimum, this could use a better error message.

Agreed.

Is it not possible to have a project that depends on sklearn and
installs all its dependencies to a virtualenv in a single pass?

How are you installing things? If you are compiling from source, you do
need numpy installed.

Must I do one pass to install numpy and everything else, and a second
pass just to install sklearn?

I would be surprised if scipy behaved differently.

@amueller
Member

Scipy does behave differently:
https://github.com/scipy/scipy/blob/master/setup.py#L190
( I think)

@GaelVaroquaux
Member

Scipy does behave differently:
https://github.com/scipy/scipy/blob/master/setup.py#L190

Hum, interesting.

There are pros and cons for doing it this way for numpy. The pro is that
it probably solves the OP's problem. The con is that more people are going
to be hand compiling numpy instead of installing good packages, and thus
be left with a horrible linear algebra package.

My hunch would be to leave it the way it is, but I see the point of the
clever trick used in scipy's setup.py.

@bukzor
Author
bukzor commented Jan 26, 2015

Demo:

rm -rf fresh
virtualenv fresh
. fresh/bin/activate

# pip 6 reverses order of "top-level" requirements -.-
#   https://github.com/pypa/pip/issues/2260
pip install --upgrade pip

echo >requirements.txt 'numpy
scikit-learn'
pip install  -r requirements.txt

output:

$ sh demo.sh
New python executable in fresh/bin/python
Installing setuptools, pip...done.
Downloading/unpacking pip from https://pypi.python.org/packages/py2.py3/p/pip/pip-6.0.6-py2.py3-none-any.whl#md5=0472d9dc76a0df6cc6ab545e40aef832
  Downloading pip-6.0.6-py2.py3-none-any.whl (1.3MB): 1.3MB downloaded
Installing collected packages: pip
  Found existing installation: pip 1.5.6
    Uninstalling pip:
      Successfully uninstalled pip
Successfully installed pip
Cleaning up...
Collecting numpy (from -r requirements.txt (line 1))
  Using cached numpy-1.9.1.tar.gz
    Running from numpy source directory.
Collecting scikit-learn (from -r requirements.txt (line 2))
  Using cached scikit-learn-0.15.2.tar.gz
    Partial import of sklearn during the build process.
Installing collected packages: scikit-learn, numpy
  Running setup.py install for scikit-learn
    Partial import of sklearn during the build process.
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/nail/tmp/pip-build-oylnbV/scikit-learn/setup.py", line 154, in <module>
        setup_package()
      File "/nail/tmp/pip-build-oylnbV/scikit-learn/setup.py", line 146, in setup_package
        from numpy.distutils.core import setup
    ImportError: No module named numpy.distutils.core
    Complete output from command /nail/home/buck/tmp/fresh/bin/python -c "import setuptools, tokenize;__file__='/nail/tmp/pip-build-oylnbV/scikit-learn/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /nail/tmp/pip-x4l77V-record/install-record.txt --single-version-externally-managed --compile --install-headers /nail/home/buck/tmp/fresh/include/site/python2.6:
    Partial import of sklearn during the build process.

    Traceback (most recent call last):

      File "<string>", line 1, in <module>

      File "/nail/tmp/pip-build-oylnbV/scikit-learn/setup.py", line 154, in <module>

        setup_package()

      File "/nail/tmp/pip-build-oylnbV/scikit-learn/setup.py", line 146, in setup_package

        from numpy.distutils.core import setup

    ImportError: No module named numpy.distutils.core

    ----------------------------------------
    Command "/nail/home/buck/tmp/fresh/bin/python -c "import setuptools, tokenize;__file__='/nail/tmp/pip-build-oylnbV/scikit-learn/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /nail/tmp/pip-x4l77V-record/install-record.txt --single-version-externally-managed --compile --install-headers /nail/home/buck/tmp/fresh/include/site/python2.6" failed with error code 1 in /nail/tmp/pip-build-oylnbV/scikit-learn

@GaelVaroquaux
Member

Maybe we could make this demo work, but chances are that you would have a
really lousy install of numpy and scikit-learn. The problem is that numpy
wouldn't be linked to good linear algebra packages, unless you know what
you are doing, and you have installed the headers of these before. But
that last operation is much harder than getting the pip/python part right.

Really, you should be building from source only if you know well what you
are doing. In which case none of this should be a problem.

If you are not an expert, please use prebuilt packages. At least this is
my opinion.

@amueller
Member

You are right, installing this way really is a bad idea in most cases...

@douardda
douardda commented Mar 3, 2015

This is a serious issue, since virtualenv and pip are really becoming the standard to set up a test or development environment.

How can I run tox-based continuous integration tests for projects that depend on sklearn then?

I agree that using prebuilt packages provided by a decent Linux distribution is the way for production/real life applications, but CI and automatic tests are required tools too.

scikit-learn should not be installed using "pip install" or setuptools in general, but scikit-learn must be installable that way.

David

@amueller
Member
amueller commented Mar 3, 2015

Hum, continuous integration is a good point.
However, the value of the integration is not that high if your production setup is quite different from your CI setup....

@amueller
Member
amueller commented Mar 3, 2015

@douardda can you try if the scipy hack works for you?
That is

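try:
    import numpy  # noqa: F401 (imported only to check availability)
except ImportError:
    # numpy is not installed yet, so request it as a build requirement
    build_requires = ['numpy>=1.6.2']
else:
    build_requires = []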

But I guess it will be more complicated than that...
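
A slightly more general sketch of the same idea follows, assuming the resulting list is later passed to `setup()` through the setuptools keywords `setup_requires` and `install_requires`. The helper name `missing_requirements` is hypothetical. Note that the traceback earlier in this thread fails at `from numpy.distutils.core import setup`, before `setup()` is ever reached, so declaring the requirement alone may well not be enough.

```python
import importlib.util


def missing_requirements(candidates):
    """Return requirement strings whose top-level module is not importable.

    `candidates` maps a top-level module name to the requirement string that
    would provide it, for example {"numpy": "numpy>=1.6.2"}.
    """
    return [
        requirement
        for module_name, requirement in candidates.items()
        if importlib.util.find_spec(module_name) is None
    ]


# Only request numpy when it is absent, so an installed numpy is left alone.
build_requires = missing_requirements({"numpy": "numpy>=1.6.2"})
metadata = {"setup_requires": build_requires, "install_requires": build_requires}
```
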

@amueller
Member
amueller commented Mar 3, 2015

Hum, our setup.py seems pretty identical to the scipy one, I'm not sure what makes theirs work.

@douardda
douardda commented Mar 4, 2015

Humm, on my jessie laptop, in a fresh virtualenv, "pip install scipy" also fails with a "ImportError: No module named numpy.distutils.core".

saketkc added a commit to saketkc/scikit-learn that referenced this issue Mar 4, 2015
Pass build_requires if numpy is not found
Fixes scikit-learn#4164

Signed-off-by: Saket Choudhary <saketkc@gmail.com>
@saketkc
Contributor
saketkc commented Mar 4, 2015

@douardda Would you want to test out the patch in #4332?

@amueller
Member
amueller commented Mar 4, 2015

Interesting, for me scipy works but scikit-learn fails.

@gravyboat

@amueller Any ideas on this? I read through #4332 and don't see a solution being proposed that actually installs scikit-learn correctly like scipy does when both are in a requirements file (please correct me if I am mistaken). We're using pre-built wheel packages to avoid compile/configuration times taking forever, and I'm running into this same issue. I've got another project using scipy that works flawlessly when numpy is in the requirements file, but I get this error specifically for scikit-learn. I'll try adding scipy to the requirements.txt to see if that helps, but from the errors I'm seeing it's bombing out before even trying to install.

Edit: Actually, it looks like @saketkc's work over in #4371 might address this. What's the timeline looking like on getting that merged in? I can work around it by installing numpy, THEN going back through the requirements.txt in my config management tool, but it's a pain and it increases build times.
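
The two-pass workaround described above can be sketched as a small helper that only computes the pip command lines and leaves running them to the caller. The function names and the default `build_deps=("numpy",)` are assumptions for illustration; the name parsing is deliberately naive and ignores option lines such as `-e` or `-r`.

```python
import re
import sys


def requirement_name(line):
    """Return the lower-cased project name at the start of a requirement line."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line.strip())
    return match.group(0).lower() if match else None


def two_pass_commands(requirements_text, requirements_path="requirements.txt",
                      build_deps=("numpy",)):
    """Return pip commands: build dependencies first, then the whole file."""
    lines = [line.strip() for line in requirements_text.splitlines()]
    lines = [line for line in lines if line and not line.startswith("#")]
    first_pass = [line for line in lines if requirement_name(line) in build_deps]
    pip_install = [sys.executable, "-m", "pip", "install"]
    commands = []
    if first_pass:
        # Pass 1: install only the packages other builds need to import.
        commands.append(pip_install + first_pass)
    # Pass 2: install everything; already satisfied entries are skipped by pip.
    commands.append(pip_install + ["-r", requirements_path])
    return commands
```

Each returned list can be handed to `subprocess.check_call`. The cost is the one noted in the comment above: a second pip invocation, and therefore longer builds.
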

@ogrisel
Member
ogrisel commented Apr 28, 2015

There are pros and cons for doing it this way for numpy. The pro is that
it probably solves the OP's problem. The con is that more people are going
to be hand compiling numpy instead of installing good packages, and thus
be left with a horrible linear algebra package.

It is not possible to build scipy from source without installing blas, lapack and gfortran. At this point users will have to read the scipy doc to build from source and install an optimized BLAS / LAPACK.

@ogrisel
Member
ogrisel commented Apr 28, 2015

+1 for using the setup_requires=['numpy'] and install_requires=['numpy', 'scipy'] in scikit-learn to have pip install scikit-learn work by default as long as the non-python system build deps (gcc, gfortran, BLAS & LAPACK headers) are installed.

@ogrisel
Member
ogrisel commented Apr 28, 2015

Actually I changed my mind as explained in my last comment in #4371. Let's close this for now.

@ogrisel ogrisel closed this as completed Apr 28, 2015
@bukzor
Author
bukzor commented May 8, 2015

If I understand correctly, it's still not possible to have a project that installs numpy and sklearn in the same step, and you all don't plan to fix it.

Am I right?

@amueller
Member
amueller commented May 8, 2015

From what @ogrisel said in #4371 (comment), it is non-trivial to make this work with the pip shipped in Ubuntu stable.
Is this for CI purposes for you? Do you really want to build scipy in your CI? Or are you installing wheels?

@amueller
Member
amueller commented May 8, 2015

didn't @GaelVaroquaux mention a flag to force installing dependencies?

@bukzor
Author
bukzor commented May 9, 2015

Various processes at our company involve 'pip install -r requirements.txt'.
This works fine for any group of packages not containing sklearn. I feel
certain this isn't unique to me or my company.

On Fri, May 8, 2015, 4:29 PM Andreas Mueller wrote:

didn't @GaelVaroquaux mention a flag to force installing dependencies?

@GaelVaroquaux
Member
GaelVaroquaux commented May 9, 2015 via email

@amueller
Member
amueller commented May 9, 2015

I agree that if that is how you produce production binaries, then you are most certainly doing it wrong.
Though I don't agree with @GaelVaroquaux that scikit-learn should tell you which scipy to use. I feel it's scipy's responsibility to make sure scipy is installed in a sane way.

@timabbott

Any progress on this issue? It'd be really great if there was a way to use scikit-learn inside a virtualenv by including it (and whatever versions of numpy/scipy/etc. one wants) in a requirements.txt file. Virtualenvs are a very standard way to deploy applications these days, so this is a pretty common use case.

Since there was a solution under discussion as of a few months ago, maybe it makes sense to reopen the issue at least?

@taion
Contributor
taion commented Jul 7, 2016

I never got a chance to work on this. I still think it should be possible/straightforward to set up a scikit-learn[pip] extra that has the right dependencies, especially now that pip-tools properly supports extras.
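
For readers unfamiliar with extras, a minimal sketch of what such a declaration might look like is shown here. It uses the setuptools `extras_require` keyword; the extra name `pip` comes from the comment above, while the helper name and the version bounds are placeholders, and nothing here is taken from the eventual pull request.

```python
def extras_metadata(numpy_requirement, scipy_requirement):
    """Build the `extras_require` keyword for a hypothetical 'pip' extra."""
    return {"extras_require": {"pip": [numpy_requirement, scipy_requirement]}}


# The version bounds below are placeholders, not the project's real minimums.
metadata = extras_metadata("numpy>=1.6.2", "scipy")
```

With a declaration like this passed to `setup()`, a user would opt in with `pip install "scikit-learn[pip]"`. An extra only declares install-time dependencies, so whether it is sufficient for a source build depends on the order in which the installer processes packages.
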

@matt-carter

Seconding @timabbott.

@taion
Contributor
taion commented Jul 14, 2016

PR up at #6990. It's a very straightforward change.

@eligiblekeng
eligiblekeng commented Aug 29, 2016

Temporary fix by overriding the install command in setup.py:

import pip
from setuptools import setup
from setuptools.command.install import install
from pip.req import parse_requirements
install_reqs = parse_requirements('./requirements.txt', session=False)
reqs = [str(ir.req) for ir in install_reqs]


class OverrideInstall(install):

    """
    Emulate sequential install of pip install -r requirements.txt
    To fix numpy bug in scipy, scikit in py2
    """

    def run(self):
        for req in reqs:
            pip.main(["install", req])



# the setup
setup(
    ...
    cmdclass={'install': OverrideInstall}
    ....
)

Then run python setup.py install as usual
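
The snippet above relies on `pip.req` and `pip.main`, which are pip internals; to my knowledge later pip releases moved them out of the public namespace, so the imports may fail on a current pip. A hedged alternative that keeps the same sequential idea but shells out to pip is sketched below. The function names are assumptions for this example, and the `runner` parameter exists only so the behaviour can be checked without installing anything.

```python
import subprocess
import sys


def read_requirements(path):
    """Return non-empty, non-comment lines from a requirements file."""
    with open(path) as handle:
        stripped = (line.strip() for line in handle)
        return [line for line in stripped if line and not line.startswith("#")]


def install_sequentially(requirements, runner=subprocess.check_call):
    """Install each requirement in its own pip invocation, in file order."""
    for requirement in requirements:
        # Use the running interpreter's pip so the active virtualenv is targeted.
        runner([sys.executable, "-m", "pip", "install", requirement])
```

Inside `OverrideInstall.run`, the loop could then become `install_sequentially(read_requirements("./requirements.txt"))`.
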

kousu added a commit to spinalcordtoolbox/spinalcordtoolbox that referenced this issue Aug 15, 2020
Re: feedback #2840 (comment)

I think we can live with double-installs, now that I have an open PR and the attention
of the scikit-image maintainers (#2841 (comment))
we can probably expect that we can do away with this workaround sooner than later.

This re-adds numpy in alphabetical order without the misleading note
about "installing first" because ordering requirements.txt got deprecated
at some point (e.g. scikit-learn/scikit-learn#4164 (comment))
and that's why we need an entirely separate `pip install numpy` in the
first place.
kousu added a commit to spinalcordtoolbox/spinalcordtoolbox that referenced this issue Aug 15, 2020
It turns out there was a good reason for installing numpy explicitly,
the line that we removed in #2751,
because scikit-learn needs it to build from source but doesn't declare it:

scikit-image/scikit-image#4919

PyPA doesn't give a clear or reliable way to declare build dependencies:
some projects are using pip everywhere, some are using legacy setuptools,
some distutils. It's confusing for everyone right now. They're working on
it.

I think we can live with double-installs, now that I have an open PR and
the attention of the scikit-image maintainers (#2841 (comment))
we can probably expect that we can do away with this workaround sooner than later.

This also moves numpy to alphabetical order and removes the misleading
note about "installing first" because ordering requirements.txt got
deprecated at some point (e.g. scikit-learn/scikit-learn#4164 (comment))
Drulex pushed a commit to Drulex/spinalcordtoolbox that referenced this issue Sep 30, 2020
It turns out there was a good reason for installing numpy explicitly,
the line that we removed in spinalcordtoolbox#2751,
because scikit-learn needs it to build from source but doesn't declare it:

scikit-image/scikit-image#4919

PyPA doesn't give a clear or reliable way to declare build dependencies:
some projects are using pip everywhere, some are using legacy setuptools,
some distutils. It's confusing for everyone right now. They're working on
it.

I think we can live with double-installs, now that I have an open PR and
the attention of the scikit-image maintainers (spinalcordtoolbox#2841 (comment))
we can probably expect that we can do away with this workaround sooner than later.

This also moves numpy to alphabetical order and removes the misleading
note about "installing first" because ordering requirements.txt got
deprecated at some point (e.g. scikit-learn/scikit-learn#4164 (comment))