[MRG] Add pprint for estimators - continued (#11705) · scikit-learn/scikit-learn@a50c03f · GitHub
Commit a50c03f

NicolasHug authored and amueller committed
[MRG] Add pprint for estimators - continued (#11705)
* add pprint for estimators
* strip color from length, add color option
* Minor cleaning, fixes, factoring and docs
* Added some basic tests
* Fixed line length issue
* fixed flake8 and added visual test for review
* Fixed test
* Fixed Python 2 issues (inspect.signature import)
* Trying to fix flake8 again
* Added special repr for functions
* Added some other visual tests
* Changed _format_function into _format_callable because callable() returns True also for class objects (which we want to represent with their name as well anyway)
* Consistent output in Python 2 and 3
* WIP
* Now using the builtin pprint module
* pep8
* Added changed_only param
* Fixed printing when string would fit in less than line width
* Fixed printing of steps parameter
* Fixed changed_only param for short estimators
* fixed pep8
* Added some more description in docstring
* changed_only is now an option from set_config()
* Put _pprint.py into sklearn/utils, added tests
* Added doctest NORMALIZE_WHITESPACE where needed
* Fixed tests
* fix test-doc
* fixing test that passed before...
* Fixed tests
* Added test for changed_only and long lines
* typo
* Added authors names
* Added license file
* Added ellipsis based on number of elements in sequence + added increasingly aggressive repr strategies
* Updated whatsnew
* don't use increasingly aggressive strategy
* Fixed tests
* Removed LICENSE file and put license text in _pprint.py
* fixed test_base
* Sorted parameters dictionary for consistent output in 3.5
* Actually using OrderedDict...
* Addressed comments
* Added test for NaN changed parameter
* Update whatsnew
* Added example to set_config()
* Removed example
* Added example in gallery
* Spelling
1 parent e6cbd26 · commit a50c03f

30 files changed: +860 −44 lines changed

doc/modules/compose.rst

Lines changed: 2 additions & 2 deletions
@@ -76,13 +76,13 @@ filling in the names automatically::

 The estimators of a pipeline are stored as a list in the ``steps`` attribute::

-    >>> pipe.steps[0]
+    >>> pipe.steps[0] # doctest: +NORMALIZE_WHITESPACE
     ('reduce_dim', PCA(copy=True, iterated_power='auto', n_components=None, random_state=None,
       svd_solver='auto', tol=0.0, whiten=False))

 and as a ``dict`` in ``named_steps``::

-    >>> pipe.named_steps['reduce_dim']
+    >>> pipe.named_steps['reduce_dim'] # doctest: +NORMALIZE_WHITESPACE
     PCA(copy=True, iterated_power='auto', n_components=None, random_state=None,
       svd_solver='auto', tol=0.0, whiten=False)
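Every doc hunk in this commit follows the same pattern: a `# doctest: +NORMALIZE_WHITESPACE` directive is appended wherever an estimator repr appears as expected doctest output, because the new pretty printer may wrap parameter lists differently than before. A minimal standalone sketch of what the flag buys (an illustration, not code from this PR):

    import doctest

    def demo():
        """
        >>> print('PCA(copy=True, whiten=False)')  # doctest: +NORMALIZE_WHITESPACE
        PCA(copy=True,
            whiten=False)
        """

    # Prints nothing: the example passes because +NORMALIZE_WHITESPACE makes
    # doctest treat any run of whitespace (including the line break in the
    # expected output) as a single space. Without the directive, the same
    # example fails on the wrapping.
    doctest.run_docstring_examples(demo, {}, verbose=False)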

doc/modules/linear_model.rst

Lines changed: 2 additions & 2 deletions
@@ -185,7 +185,7 @@ for another implementation::

     >>> from sklearn import linear_model
     >>> reg = linear_model.Lasso(alpha=0.1)
-    >>> reg.fit([[0, 0], [1, 1]], [0, 1])
+    >>> reg.fit([[0, 0], [1, 1]], [0, 1]) # doctest: +NORMALIZE_WHITESPACE
     Lasso(alpha=0.1, copy_X=True, fit_intercept=True, max_iter=1000,
        normalize=False, positive=False, precompute=False, random_state=None,
        selection='cyclic', tol=0.0001, warm_start=False)

@@ -639,7 +639,7 @@ Bayesian Ridge Regression is used for regression::
     >>> X = [[0., 0.], [1., 1.], [2., 2.], [3., 3.]]
     >>> Y = [0., 1., 2., 3.]
     >>> reg = linear_model.BayesianRidge()
-    >>> reg.fit(X, Y)
+    >>> reg.fit(X, Y) # doctest: +NORMALIZE_WHITESPACE
     BayesianRidge(alpha_1=1e-06, alpha_2=1e-06, compute_score=False, copy_X=True,
        fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06, n_iter=300,
        normalize=False, tol=0.001, verbose=False)

doc/modules/model_evaluation.rst

Lines changed: 2 additions & 2 deletions
@@ -979,7 +979,7 @@ with a svm classifier in a binary class problem::
     >>> X = [[0], [1]]
     >>> y = [-1, 1]
     >>> est = svm.LinearSVC(random_state=0)
-    >>> est.fit(X, y)
+    >>> est.fit(X, y) # doctest: +NORMALIZE_WHITESPACE
     LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
          intercept_scaling=1, loss='squared_hinge', max_iter=1000,
          multi_class='ovr', penalty='l2', random_state=0, tol=0.0001,

@@ -997,7 +997,7 @@ with a svm classifier in a multiclass problem::
     >>> Y = np.array([0, 1, 2, 3])
     >>> labels = np.array([0, 1, 2, 3])
     >>> est = svm.LinearSVC()
-    >>> est.fit(X, Y)
+    >>> est.fit(X, Y) # doctest: +NORMALIZE_WHITESPACE
     LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
          intercept_scaling=1, loss='squared_hinge', max_iter=1000,
          multi_class='ovr', penalty='l2', random_state=None, tol=0.0001,

doc/modules/preprocessing.rst

Lines changed: 3 additions & 3 deletions
@@ -488,7 +488,7 @@ Continuing the example above::

     >>> enc = preprocessing.OneHotEncoder()
     >>> X = [['male', 'from US', 'uses Safari'], ['female', 'from Europe', 'uses Firefox']]
-    >>> enc.fit(X) # doctest: +ELLIPSIS
+    >>> enc.fit(X) # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE
     OneHotEncoder(categorical_features=None, categories=None,
            dtype=<... 'numpy.float64'>, handle_unknown='error',
            n_values=None, sparse=True)

@@ -514,7 +514,7 @@ dataset::
     >>> # Note that there are missing categorical values for the 2nd and 3rd
     >>> # feature
     >>> X = [['male', 'from US', 'uses Safari'], ['female', 'from Europe', 'uses Firefox']]
-    >>> enc.fit(X) # doctest: +ELLIPSIS
+    >>> enc.fit(X) # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE
     OneHotEncoder(categorical_features=None,
            categories=[...],
            dtype=<... 'numpy.float64'>, handle_unknown='error',

@@ -532,7 +532,7 @@ columns for this feature will be all zeros

     >>> enc = preprocessing.OneHotEncoder(handle_unknown='ignore')
     >>> X = [['male', 'from US', 'uses Safari'], ['female', 'from Europe', 'uses Firefox']]
-    >>> enc.fit(X) # doctest: +ELLIPSIS
+    >>> enc.fit(X) # doctest: +ELLIPSIS +NORMALIZE_WHITESPACE
     OneHotEncoder(categorical_features=None, categories=None,
            dtype=<... 'numpy.float64'>, handle_unknown='ignore',
            n_values=None, sparse=True)
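These preprocessing hunks append `+NORMALIZE_WHITESPACE` to directives that already carry `+ELLIPSIS`. The two flags are independent: `+ELLIPSIS` is what lets the literal `...` inside `<... 'numpy.float64'>` match the version-dependent part of the type repr. A standalone sketch of `+ELLIPSIS` on a builtin type (an illustration, not code from this PR):

    import doctest

    def demo():
        """
        >>> type(1.0)  # doctest: +ELLIPSIS
        <... 'float'>
        """

    # Passes: the actual output is "<class 'float'>", and with +ELLIPSIS the
    # "..." in the expected output matches "class", the same trick the docs
    # above use for <... 'numpy.float64'>.
    doctest.run_docstring_examples(demo, {}, verbose=False)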

doc/tutorial/statistical_inference/model_selection.rst

Lines changed: 1 addition & 1 deletion
@@ -267,7 +267,7 @@ parameter automatically by cross-validation::
     >>> diabetes = datasets.load_diabetes()
     >>> X_diabetes = diabetes.data
     >>> y_diabetes = diabetes.target
-    >>> lasso.fit(X_diabetes, y_diabetes)
+    >>> lasso.fit(X_diabetes, y_diabetes) # doctest: +NORMALIZE_WHITESPACE
     LassoCV(alphas=None, copy_X=True, cv=3, eps=0.001, fit_intercept=True,
         max_iter=1000, n_alphas=100, n_jobs=None, normalize=False,
         positive=False, precompute='auto', random_state=None,

doc/tutorial/statistical_inference/supervised_learning.rst

Lines changed: 1 addition & 0 deletions
@@ -334,6 +334,7 @@ application of Occam's razor: *prefer simpler models*.
     >>> best_alpha = alphas[scores.index(max(scores))]
     >>> regr.alpha = best_alpha
     >>> regr.fit(diabetes_X_train, diabetes_y_train)
+    ... # doctest: +NORMALIZE_WHITESPACE
     Lasso(alpha=0.025118864315095794, copy_X=True, fit_intercept=True,
        max_iter=1000, normalize=False, positive=False, precompute=False,
        random_state=None, selection='cyclic', tol=0.0001, warm_start=False)

doc/tutorial/statistical_inference/unsupervised_learning.rst

Lines changed: 1 addition & 1 deletion
@@ -274,7 +274,7 @@ data by projecting on a principal subspace.

     >>> from sklearn import decomposition
     >>> pca = decomposition.PCA()
-    >>> pca.fit(X)
+    >>> pca.fit(X) # doctest: +NORMALIZE_WHITESPACE
     PCA(copy=True, iterated_power='auto', n_components=None, random_state=None,
       svd_solver='auto', tol=0.0, whiten=False)
     >>> print(pca.explained_variance_) # doctest: +SKIP

doc/whats_new/v0.21.rst

Lines changed: 8 additions & 1 deletion
@@ -200,12 +200,19 @@ Support for Python 3.4 and below has been officially dropped.
     ``max_depth`` by 1 while expanding the tree if ``max_leaf_nodes`` and
     ``max_depth`` were both specified by the user. Please note that this also
     affects all ensemble methods using decision trees.
-    :pr:`12344` by :user:`Adrin Jalali <adrinjalali>`.
+    :issue:`12344` by :user:`Adrin Jalali <adrinjalali>`.


 Multiple modules
 ................

+- The `__repr__()` method of all estimators (used when calling
+  `print(estimator)`) has been entirely re-written, building on Python's
+  pretty printing standard library. All parameters are printed by default,
+  but this can be altered with the ``print_changed_only`` option in
+  :func:`sklearn.set_config`. :issue:`11705` by :user:`Nicolas Hug
+  <NicolasHug>`.
+
 Changes to estimator checks
 ---------------------------

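The entry above says the new `__repr__` builds on Python's pretty printing standard library; per the commit messages, the PR's `sklearn/utils/_pprint.py` does this via the builtin `pprint` module, whose `PrettyPrinter` class is designed for subclassing. A minimal standalone sketch of that general technique, with a hypothetical name=value rule rather than scikit-learn's actual rules:

    import pprint

    class KeyValuePrinter(pprint.PrettyPrinter):
        """Illustrative subclass: render ('name', value) pairs as name=value."""

        def format(self, obj, context, maxlevels, level):
            # format() is the documented extension hook; it returns a tuple
            # (formatted_string, is_readable, is_recursive).
            if (isinstance(obj, tuple) and len(obj) == 2
                    and isinstance(obj[0], str)):
                vrepr, readable, recursive = super().format(
                    obj[1], context, maxlevels, level)
                return '%s=%s' % (obj[0], vrepr), readable, recursive
            return super().format(obj, context, maxlevels, level)

    params = [('alpha', 0.1), ('fit_intercept', True), ('max_iter', 1000)]
    print(KeyValuePrinter(width=40).pformat(params))
    # Wraps to the 40-column budget, e.g.:
    # [alpha=0.1,
    #  fit_intercept=True,
    #  max_iter=1000]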
examples/plot_changed_only.py

Lines changed: 30 additions & 0 deletions

@@ -0,0 +1,30 @@
+"""
+=================================
+Compact estimator representations
+=================================
+
+This example illustrates the use of the print_changed_only global parameter.
+
+Setting print_changed_only to True will alter the representation of
+estimators to only show the parameters that have been set to non-default
+values. This can be used to have more compact representations.
+"""
+print(__doc__)
+
+from sklearn.linear_model import LogisticRegression
+from sklearn import set_config
+
+
+lr = LogisticRegression(penalty='l1')
+print('Default representation:')
+print(lr)
+# LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
+#                    intercept_scaling=1, l1_ratio=None, max_iter=100,
+#                    multi_class='warn', n_jobs=None, penalty='l1',
+#                    random_state=None, solver='warn', tol=0.0001, verbose=0,
+#                    warm_start=False)
+
+set_config(print_changed_only=True)
+print('\nWith changed_only option:')
+print(lr)
+# LogisticRegression(penalty='l1')
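Note that the example leaves the compact mode switched on for the rest of the session. Since ``print_changed_only`` defaults to False (see the ``sklearn/_config.py`` hunk below), the full listing can be restored explicitly; a usage note rather than part of the diff:

    from sklearn import set_config

    # Back to the default behaviour: print every parameter.
    set_config(print_changed_only=False)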

sklearn/_config.py

Lines changed: 14 additions & 2 deletions
@@ -5,7 +5,8 @@

 _global_config = {
     'assume_finite': bool(os.environ.get('SKLEARN_ASSUME_FINITE', False)),
-    'working_memory': int(os.environ.get('SKLEARN_WORKING_MEMORY', 1024))
+    'working_memory': int(os.environ.get('SKLEARN_WORKING_MEMORY', 1024)),
+    'print_changed_only': False,
 }


@@ -20,7 +21,8 @@ def get_config():
     return _global_config.copy()


-def set_config(assume_finite=None, working_memory=None):
+def set_config(assume_finite=None, working_memory=None,
+               print_changed_only=None):
     """Set global scikit-learn configuration

     .. versionadded:: 0.19

@@ -43,11 +45,21 @@ def set_config(assume_finite=None, working_memory=None):

         .. versionadded:: 0.20

+    print_changed_only : bool, optional
+        If True, only the parameters that were set to non-default
+        values will be printed when printing an estimator. For example,
+        ``print(SVC())`` when True will only print 'SVC()', while the default
+        behaviour would be to print 'SVC(C=1.0, cache_size=200, ...)' with
+        all the unchanged parameters.
+
+        .. versionadded:: 0.21
     """
     if assume_finite is not None:
         _global_config['assume_finite'] = assume_finite
     if working_memory is not None:
         _global_config['working_memory'] = working_memory
+    if print_changed_only is not None:
+        _global_config['print_changed_only'] = print_changed_only


 @contextmanager
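The `@contextmanager` line at the end of the hunk is the start of `config_context`, which forwards its keyword arguments to `set_config` and restores the previous configuration on exit. Assuming that behaviour extends to the new option (this sketch is not part of the diff), the compact repr can also be scoped to a block:

    from sklearn import config_context
    from sklearn.linear_model import LogisticRegression

    lr = LogisticRegression(penalty='l1')

    # Only non-default parameters are printed inside the block; the previous
    # print_changed_only setting is restored when the block exits.
    with config_context(print_changed_only=True):
        print(lr)  # LogisticRegression(penalty='l1')

    print(lr)      # full parameter listing again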

0 commit comments