Merge pull request #2 from scikit-learn/master · hongshaoyang/scikit-learn@464dc37 · GitHub

Commit 464dc37

Merge pull request scikit-learn#2 from scikit-learn/master
Merging changes from the main repository
2 parents 3b79637 + c36c104 commit 464dc37

270 files changed, 5474 additions and 2297 deletions


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -39,6 +39,7 @@ doc/samples
 *.prof
 .tox/
 .coverage
+pip-wheel-metadata

 lfw_preprocessed/
 nips2010_pdf/

.pre-commit-config.yaml

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+repos:
+- repo: https://github.com/pre-commit/pre-commit-hooks
+  rev: v2.3.0
+  hooks:
+  - id: check-yaml
+  - id: end-of-file-fixer
+  - id: trailing-whitespace
+- repo: https://gitlab.com/pycqa/flake8
+  rev: 3.7.8
+  hooks:
+  - id: flake8
+    types: [file, python]
+    # only check for unused imports for now, as long as
+    # the code is not fully PEP8 compatible
+    args: [--select=F401]
+- repo: https://github.com/pre-commit/mirrors-mypy
+  rev: v0.730
+  hooks:
+  - id: mypy
+    args:
+    - --ignore-missing-imports
+    files: sklearn/

build_tools/azure/install.sh

Lines changed: 0 additions & 3 deletions
@@ -98,9 +98,6 @@ elif [[ "$DISTRIB" == "conda-pip-latest" ]]; then
     python -m pip install -U pip
     python -m pip install pytest==$PYTEST_VERSION pytest-cov pytest-xdist

-    # TODO: Remove pin when https://github.com/python-pillow/Pillow/issues/4518 gets fixed
-    python -m pip install "pillow>=4.3.0,!=7.1.0,!=7.1.1"
-
     python -m pip install pandas matplotlib pyamg scikit-image
     # do not install dependencies for lightgbm since it requires scikit-learn
     python -m pip install lightgbm --no-deps

build_tools/generate_authors_table.py

Lines changed: 2 additions & 2 deletions
@@ -11,14 +11,15 @@
 import getpass
 import time
 from pathlib import Path
+from os import path

 print("user:", file=sys.stderr)
 user = input()
 passwd = getpass.getpass("Password or access token:\n")
 auth = (user, passwd)

 LOGO_URL = 'https://avatars2.githubusercontent.com/u/365630?v=4'
-REPO_FOLDER = Path(__file__).parent.parent
+REPO_FOLDER = Path(path.abspath(__file__)).parent.parent


 def get(url):
@@ -100,7 +101,6 @@ def get_profile(login):
         'Duchesnay': 'Edouard Duchesnay',
         'Lars': 'Lars Buitinck',
         'MechCoder': 'Manoj Kumar',
-        'jeremiedbb': 'Jérémie Du Boisberranger',
     }
     if profile["name"] in missing_names:
         profile["name"] = missing_names[profile["name"]]
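Note on the `REPO_FOLDER` change: the switch to `path.abspath(__file__)` presumably guards against `__file__` being a relative path, for example when the script is launched from inside `build_tools/`. In that case `.parent.parent` collapses to `.` instead of pointing at the repository root. The sketch below illustrates this with a hypothetical relative script path standing in for `__file__`; the variable names are illustrative and not part of the commit.

```python
from os import path
from pathlib import Path

# Hypothetical stand-in for a relative __file__ value.
script = "generate_authors_table.py"

# Without abspath: a bare file name has parent '.', and the parent of '.'
# is still '.', so the "repository folder" is just the current directory.
naive_root = Path(script).parent.parent

# With abspath: the path is anchored at the current working directory first,
# so walking up two levels yields a real, absolute directory.
robust_root = Path(path.abspath(script)).parent.parent

print("naive:", naive_root)
print("robust:", robust_root)
```

`Path(script).resolve()` would be a pathlib-only alternative; the commit itself uses `os.path.abspath`.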

conftest.py

Lines changed: 0 additions & 10 deletions
@@ -99,16 +99,6 @@ def pytest_unconfigure(config):
     del sys._is_pytest_session


-def pytest_runtest_setup(item):
-    if isinstance(item, DoctestItem):
-        set_config(print_changed_only=True)
-
-
-def pytest_runtest_teardown(item, nextitem):
-    if isinstance(item, DoctestItem):
-        set_config(print_changed_only=False)
-
-
 # TODO: Remove when modules are deprecated in 0.24
 # Configures pytest to ignore deprecated modules.
 collect_ignore_glob = [

doc/about.rst

Lines changed: 21 additions & 21 deletions
@@ -271,82 +271,82 @@ July 2017.
    </div>
    </div>

-............
+Past Sponsors
+.............

 .. raw:: html

    <div class="sk-sponsor-div">
    <div class="sk-sponsor-div-box">

-`Anaconda, Inc <https://www.anaconda.com/>`_ funds Adrin Jalali since 2019.
+`INRIA <https://www.inria.fr>`_ actively supports this project. It has
+provided funding for Fabian Pedregosa (2010-2012), Jaques Grobler
+(2012-2013) and Olivier Grisel (2013-2017) to work on this project
+full-time. It also hosts coding sprints and other events.

 .. raw:: html

    </div>

    <div class="sk-sponsor-div-box">

-.. image:: images/anaconda.png
+.. image:: images/inria-logo.jpg
    :width: 100pt
    :align: center
-   :target: https://sydney.edu.au/
+   :target: https://www.inria.fr

 .. raw:: html

    </div>
    </div>

-Past Sponsors
-.............
+.....................

 .. raw:: html

    <div class="sk-sponsor-div">
    <div class="sk-sponsor-div-box">

-`INRIA <https://www.inria.fr>`_ actively supports this project. It has
-provided funding for Fabian Pedregosa (2010-2012), Jaques Grobler
-(2012-2013) and Olivier Grisel (2013-2017) to work on this project
-full-time. It also hosts coding sprints and other events.
+`Paris-Saclay Center for Data Science
+<https://www.datascience-paris-saclay.fr/>`_
+funded one year for a developer to work on the project full-time
+(2014-2015), 50% of the time of Guillaume Lemaitre (2016-2017) and 50% of the
+time of Joris van den Bossche (2017-2018).

 .. raw:: html

    </div>
-
    <div class="sk-sponsor-div-box">

-.. image:: images/inria-logo.jpg
+.. image:: images/cds-logo.png
    :width: 100pt
    :align: center
-   :target: https://www.inria.fr
+   :target: https://www.datascience-paris-saclay.fr/

 .. raw:: html

    </div>
    </div>

-.....................
+............

 .. raw:: html

    <div class="sk-sponsor-div">
    <div class="sk-sponsor-div-box">

-`Paris-Saclay Center for Data Science
-<https://www.datascience-paris-saclay.fr/>`_
-funded one year for a developer to work on the project full-time
-(2014-2015), 50% of the time of Guillaume Lemaitre (2016-2017) and 50% of the
-time of Joris van den Bossche (2017-2018).
+`Anaconda, Inc <https://www.anaconda.com/>`_ funded Adrin Jalali in 2019.

 .. raw:: html

    </div>
+
    <div class="sk-sponsor-div-box">

-.. image:: images/cds-logo.png
+.. image:: images/anaconda.png
    :width: 100pt
    :align: center
-   :target: https://www.datascience-paris-saclay.fr/
+   :target: https://www.anaconda.com/

 .. raw:: html

doc/authors.rst

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
     </style>
     <div>
     <a href='https://github.com/jeremiedbb'><img src='https://avatars2.githubusercontent.com/u/34657725?v=4' class='avatar' /></a> <br />
-    <p>Jérémie Du Boisberranger</p>
+    <p>Jérémie du Boisberranger</p>
     </div>
     <div>
     <a href='https://github.com/jorisvandenbossche'><img src='https://avatars2.githubusercontent.com/u/1020496?v=4' class='avatar' /></a> <br />

doc/conf.py

Lines changed: 24 additions & 3 deletions
@@ -17,6 +17,7 @@
 import warnings
 import re
 from packaging.version import parse
+from pathlib import Path

 # If extensions (or modules to document with autodoc) are in another
 # directory, add these directories to sys.path here. If the directory
@@ -208,6 +209,23 @@
 # If true, the reST sources are included in the HTML build as _sources/name.
 html_copy_source = True

+# Adds variables into templates
+html_context = {}
+# finds latest release highlights and places it into HTML context for
+# index.html
+release_highlights_dir = Path("..") / "examples" / "release_highlights"
+# Finds the highlight with the latest version number
+latest_highlights = sorted(release_highlights_dir.glob(
+    "plot_release_highlights_*.py"))[-1]
+latest_highlights = latest_highlights.with_suffix('').name
+html_context["release_highlights"] = \
+    f"auto_examples/release_highlights/{latest_highlights}"
+
+# get version from higlight name assuming highlights have the form
+# plot_release_highlights_0_22_0
+highlight_version = ".".join(latest_highlights.split("_")[-3:-1])
+html_context["release_highlights_version"] = highlight_version
+
 # -- Options for LaTeX output ------------------------------------------------
 latex_elements = {
     # The paper size ('letterpaper' or 'a4paper').
@@ -281,6 +299,11 @@ def __repr__(self):

     def __call__(self, directory):
         src_path = os.path.normpath(os.path.join(self.src_dir, directory))
+
+        # Forces Release Highlights to the top
+        if os.path.basename(src_path) == "release_highlights":
+            return "0"
+
         readme = os.path.join(src_path, "README.txt")

         try:
@@ -314,6 +337,7 @@ def __call__(self, directory):
     },
     # avoid generating too many cross links
     'inspect_global_variables': False,
+    'remove_config_comments': True,
 }


@@ -386,6 +410,3 @@ def setup(app):
 warnings.filterwarnings("ignore", category=UserWarning,
                         message='Matplotlib is currently using agg, which is a'
                                 ' non-GUI backend, so cannot show the figure.')
-
-# Reduces the output of estimators
-sklearn.set_config(print_changed_only=True)
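Note on the release-highlights lookup added to `doc/conf.py`: the new block picks the last `plot_release_highlights_*.py` file in sorted order and derives a `major.minor` version from its name. The sketch below reproduces that logic in isolation against a temporary directory. The helper name `latest_highlight` and the two file names are assumptions made for illustration, and the empty-directory check is an addition that the commit does not contain.

```python
import tempfile
from pathlib import Path


def latest_highlight(directory):
    """Return (stem, version) for the last highlights script in sorted order."""
    candidates = sorted(Path(directory).glob("plot_release_highlights_*.py"))
    if not candidates:
        # Not in the commit, which would raise IndexError on an empty list.
        raise FileNotFoundError("no release highlights found")
    stem = candidates[-1].with_suffix("").name
    # 'plot_release_highlights_0_23_0' -> ['0', '23'] -> '0.23'
    version = ".".join(stem.split("_")[-3:-1])
    return stem, version


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical highlight scripts following the naming scheme in the diff.
    for name in ("plot_release_highlights_0_22_0.py",
                 "plot_release_highlights_0_23_0.py"):
        (Path(tmp) / name).touch()
    stem, version = latest_highlight(tmp)

page = f"auto_examples/release_highlights/{stem}"
print(page, version)
```

Because `sorted` compares paths lexicographically, this approach relies on version components having the same number of digits; a file named `..._0_9_0.py` would sort after `..._0_22_0.py`.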

doc/developers/contributing.rst

Lines changed: 23 additions & 11 deletions
@@ -248,19 +248,28 @@ modifying code and submitting a PR:
    and start making changes. Always use a feature branch. It's good
    practice to never work on the ``master`` branch!

-9. Develop the feature on your feature branch on your computer, using Git to
-   do the version control. When you're done editing, add changed files using
-   ``git add`` and then ``git commit``::
+9. (**Optional**) Install `pre-commit <https://pre-commit.com/#install>`_ to
+   run code style checks before each commit::

-        $ git add modified_files
-        $ git commit
+        $ pip install pre-commit
+        $ pre-commit install

-   to record your changes in Git, then push the changes to your GitHub
-   account with::
+   pre-commit checks can be disabled for a particular commit with
+   `git commit -n`.
+
+10. Develop the feature on your feature branch on your computer, using Git to
+    do the version control. When you're done editing, add changed files using
+    ``git add`` and then ``git commit``::
+
+        $ git add modified_files
+        $ git commit
+
+    to record your changes in Git, then push the changes to your GitHub
+    account with::

         $ git push -u origin my_feature

-10. Follow `these
+11. Follow `these
     <https://help.github.com/articles/creating-a-pull-request-from-a-fork>`_
     instructions to create a pull request from your fork. This will send an
     email to the committers. You may want to consider sending an email to the
@@ -422,9 +431,12 @@ You can check for common programming errors with the following tools:

     mypy --ignore-missing-import sklearn

-  must not produce new errors in your pull request. Using `# type: ignore` annotation can be a workaround for a few cases that are not supported by mypy, in particular,
-  - when importing C or Cython modules
-  - on properties with decorators
+  must not produce new errors in your pull request. Using `# type: ignore`
+  annotation can be a workaround for a few cases that are not supported by
+  mypy, in particular,
+
+  - when importing C or Cython modules
+  - on properties with decorators

 Bonus points for contributions that include a performance analysis with
 a benchmark script and profiling output (please report on the mailing

doc/developers/develop.rst

Lines changed: 20 additions & 29 deletions
@@ -246,40 +246,19 @@ whether it is just for you or for contributing it to scikit-learn, there are
 several internals of scikit-learn that you should be aware of in addition to
 the scikit-learn API outlined above. You can check whether your estimator
 adheres to the scikit-learn interface and standards by running
-:func:`utils.estimator_checks.check_estimator` on the class::
+:func:`~sklearn.utils.estimator_checks.check_estimator` on an instance. The
+:func:`~sklearn.utils.parametrize_with_checks` pytest decorator can also be
+used (see its docstring for details and possible interactions with `pytest`)::

   >>> from sklearn.utils.estimator_checks import check_estimator
   >>> from sklearn.svm import LinearSVC
-  >>> check_estimator(LinearSVC)  # passes
+  >>> check_estimator(LinearSVC())  # passes

 The main motivation to make a class compatible to the scikit-learn estimator
 interface might be that you want to use it together with model evaluation and
 selection tools such as :class:`model_selection.GridSearchCV` and
 :class:`pipeline.Pipeline`.

-Setting `generate_only=True` returns a generator that yields (estimator, check)
-tuples where the check can be called independently from each other, i.e.
-`check(estimator)`. This allows all checks to be run independently and report
-the checks that are failing. scikit-learn provides a pytest specific decorator,
-:func:`~sklearn.utils.parametrize_with_checks`, making it easier to test
-multiple estimators::
-
-  from sklearn.utils.estimator_checks import parametrize_with_checks
-  from sklearn.linear_model import LogisticRegression
-  from sklearn.tree import DecisionTreeRegressor
-
-  @parametrize_with_checks([LogisticRegression, DecisionTreeRegressor])
-  def test_sklearn_compatible_estimator(estimator, check):
-      check(estimator)
-
-This decorator sets the `id` keyword in `pytest.mark.parameterize` exposing
-the name of the underlying estimator and check in the test name. This allows
-`pytest -k` to be used to specify which tests to run.
-
-.. code-block: bash
-
-  pytest test_check_estimators.py -k check_estimators_fit_returns_self
-
 Before detailing the required interface below, we describe two ways to achieve
 the correct interface more easily.

@@ -531,17 +510,29 @@ requires_fit (default=True)
 requires_positive_X (default=False)
     whether the estimator requires positive X.

+requires_y (default=False)
+    whether the estimator requires y to be passed to `fit`, `fit_predict` or
+    `fit_transform` methods. The tag is True for estimators inheriting from
+    `~sklearn.base.RegressorMixin` and `~sklearn.base.ClassifierMixin`.
+
 requires_positive_y (default=False)
     whether the estimator requires a positive y (only applicable for regression).

 _skip_test (default=False)
     whether to skip common tests entirely. Don't use this unless you have a
     *very good* reason.

-_xfail_test (default=False)
-    dictionary ``{check_name : reason}`` of common checks to mark as a
-    known failure, with the associated reason. Don't use this unless you have a
-    *very good* reason.
+_xfail_checks (default=False)
+    dictionary ``{check_name: reason}`` of common checks that will be marked
+    as `XFAIL` for pytest, when using
+    :func:`~sklearn.utils.estimator_checks.parametrize_with_checks`. This tag
+    currently has no effect on
+    :func:`~sklearn.utils.estimator_checks.check_estimator`.
+    Don't use this unless there is a *very good* reason for your estimator
+    not to pass the check.
+    Also note that the usage of this tag is highly subject to change because
+    we are trying to make it more flexible: be prepared for breaking changes
+    in the future.

 stateless (default=False)
     whether the estimator needs access to data for fitting. Even though an
