DOC explain how to use custom edge bins in KBinsDiscretizer (#18972) · thomasjpfan/scikit-learn@58f9143 · GitHub

Commit 58f9143

Maxime Prieur and glemaitre authored
DOC explain how to use custom edge bins in KBinsDiscretizer (scikit-learn#18972)
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
1 parent a28da97 commit 58f9143

1 file changed, +16 -0 lines changed


doc/modules/preprocessing.rst

Lines changed: 16 additions & 0 deletions
@@ -667,6 +667,22 @@ constant-width bins. The 'quantile' strategy uses the quantiles values to have
 equally populated bins in each feature. The 'kmeans' strategy defines bins based
 on a k-means clustering procedure performed on each feature independently.

+Be aware that one can specify custom bins by passing a callable defining the
+discretization strategy to :class:`~sklearn.preprocessing.FunctionTransformer`.
+For instance, we can use the Pandas function :func:`pandas.cut`::
+
+  >>> import pandas as pd
+  >>> import numpy as np
+  >>> bins = [0, 1, 13, 20, 60, np.inf]
+  >>> labels = ['infant', 'kid', 'teen', 'adult', 'senior citizen']
+  >>> transformer = preprocessing.FunctionTransformer(
+  ...     pd.cut, kw_args={'bins': bins, 'labels': labels, 'retbins': False}
+  ... )
+  >>> X = np.array([0.2, 2, 15, 25, 97])
+  >>> transformer.fit_transform(X)
+  ['infant', 'kid', 'teen', 'adult', 'senior citizen']
+  Categories (5, object): ['infant' < 'kid' < 'teen' < 'adult' < 'senior citizen']
+
 .. topic:: Examples:

   * :ref:`sphx_glr_auto_examples_preprocessing_plot_discretization.py`
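
The doctest added in this commit uses `preprocessing` without importing it, presumably because the import appears earlier in `preprocessing.rst`. As a minimal standalone sketch of the same idea, assuming pandas and scikit-learn are installed, the snippet below imports `FunctionTransformer` directly. The names `ages`, `age_groups`, and `edge_groups` are illustrative and not part of the commit. It also checks how values that sit exactly on a bin edge are assigned under the default `right=True` behaviour of `pandas.cut`.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import FunctionTransformer

# Custom bin edges and one label per interval (values taken from the diff).
bins = [0, 1, 13, 20, 60, np.inf]
labels = ["infant", "kid", "teen", "adult", "senior citizen"]

# FunctionTransformer forwards kw_args to pd.cut on every call.
transformer = FunctionTransformer(
    pd.cut, kw_args={"bins": bins, "labels": labels, "retbins": False}
)

# Illustrative 1-D input, same values as in the diff.
ages = np.array([0.2, 2, 15, 25, 97])
age_groups = list(transformer.fit_transform(ages))
print(age_groups)

# pd.cut uses right-closed intervals by default, so a value equal to
# an edge belongs to the bin that ends at that edge.
edge_groups = list(transformer.transform(np.array([1, 13, 60])))
print(edge_groups)
```

With these bins, the first call reproduces the labels shown in the diff, and the edge values 1, 13, and 60 map to 'infant', 'kid', and 'adult' respectively. A value of exactly 0 would fall outside the first interval `(0, 1]` unless `include_lowest=True` is passed to `pandas.cut`.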
