@@ -21,9 +21,6 @@ enhance the functionality of scikit-learn's estimators.
 
 **Data formats**
 
-- `Fast svmlight / libsvm file loader <https://github.com/mblondel/svmlight-loader>`_
-  Fast and memory-efficient svmlight / libsvm file loader for Python.
-
 - `sklearn_pandas <https://github.com/paulgb/sklearn-pandas/>`_ bridge for
   scikit-learn pipelines and pandas data frame with dedicated transformers.
 
@@ -64,19 +61,20 @@ enhance the functionality of scikit-learn's estimators.
   It incorporates multiple modeling libraries under one API, and
   the objects that EvalML creates use an sklearn-compatible API.
 
-**Experimentation frameworks**
+**Experimentation and model registry frameworks**
+
+- `MLFlow <https://mlflow.org/>`_ MLflow is an open source platform to manage the ML
+  lifecycle, including experimentation, reproducibility, deployment, and a central
+  model registry.
 
 - `Neptune <https://neptune.ai/>`_ Metadata store for MLOps,
-  built for teams that run a lot of experiments. It gives you a single
+  built for teams that run a lot of experiments. It gives you a single
   place to log, store, display, organize, compare, and query all your
   model building metadata.
 
 - `Sacred <https://github.com/IDSIA/Sacred>`_ Tool to help you configure,
   organize, log and reproduce experiments
 
-- `REP <https://github.com/yandex/REP>`_ Environment for conducting data-driven
-  research in a consistent and reproducible way
-
 - `Scikit-Learn Laboratory
   <https://skll.readthedocs.io/en/latest/index.html>`_ A command-line
   wrapper around scikit-learn that makes it easy to run machine learning
@@ -91,10 +89,7 @@ enhance the functionality of scikit-learn's estimators.
   debugging/inspecting machine learning models and explaining their
   predictions.
 
-- `mlxtend <https://github.com/rasbt/mlxtend>`_ Includes model visualization
-  utilities.
-
-- `sklearn-evaluation <https://github.com/ploomber/sklearn-evaluation>`_
+- `sklearn-evaluation <https://github.com/ploomber/sklearn-evaluation>`_
   Machine learning model evaluation made easy: plots, tables, HTML reports,
   experiment tracking and Jupyter notebook analysis. Visual analysis, model
   selection, evaluation and diagnostics.
@@ -140,7 +135,7 @@ enhance the functionality of scikit-learn's estimators.
 - `treelite <https://treelite.readthedocs.io>`_
   Compiles tree-based ensemble models into C code for minimizing prediction
   latency.
-
+
 **Model throughput**
 
 - `Intel(R) Extension for scikit-learn <https://github.com/intel/scikit-learn-intelex>`_
@@ -161,12 +156,40 @@ project. The following are projects providing interfaces similar to
 scikit-learn for additional learning algorithms, infrastructures
 and tasks.
 
-**Structured learning**
+**Time series and forecasting**
+
+- `Darts <https://unit8co.github.io/darts/>`_ Darts is a Python library for
+  user-friendly forecasting and anomaly detection on time series. It contains a variety
+  of models, from classics such as ARIMA to deep neural networks. The forecasting
+  models can all be used in the same way, using fit() and predict() functions, similar
+  to scikit-learn.
+
+- `sktime <https://github.com/alan-turing-institute/sktime>`_ A scikit-learn compatible
+  toolbox for machine learning with time series including time series
+  classification/regression and (supervised/panel) forecasting.
+
+- `skforecast <https://github.com/JoaquinAmatRodrigo/skforecast>`_ A python library
+  that eases using scikit-learn regressors as multi-step forecasters. It also works
+  with any regressor compatible with the scikit-learn API.
+
+- `tslearn <https://github.com/tslearn-team/tslearn>`_ A machine learning library for
+  time series that offers tools for pre-processing and feature extraction as well as
+  dedicated models for clustering, classification and regression.
+
+**Gradient (tree) boosting**
 
-- `tslearn <https://github.com/tslearn-team/tslearn>`_ A machine learning library for time series
-  that offers tools for pre-processing and feature extraction as well as dedicated models for clustering, classification and regression.
+Note scikit-learn own modern gradient boosting estimators
+:class:`~sklearn.ensemble.HistGradientBoostingClassifier` and
+:class:`~sklearn.ensemble.HistGradientBoostingRegressor`.
 
-- `sktime <https://github.com/alan-turing-institute/sktime>`_ A scikit-learn compatible toolbox for machine learning with time series including time series classification/regression and (supervised/panel) forecasting.
+- `XGBoost <https://github.com/dmlc/xgboost>`_ XGBoost is an optimized distributed
+  gradient boosting library designed to be highly efficient, flexible and portable.
+
+- `LightGBM <https://lightgbm.readthedocs.io>`_ LightGBM is a gradient boosting
+  framework that uses tree based learning algorithms. It is designed to be distributed
+  and efficient.
+
+**Structured learning**
 
 - `HMMLearn <https://github.com/hmmlearn/hmmlearn>`_ Implementation of hidden
   markov models that was previously part of scikit-learn.
@@ -182,21 +205,9 @@ and tasks.
   (`CRFsuite <http://www.chokkan.org/software/crfsuite/>`_ wrapper with
   sklearn-like API).
 
-- `skforecast <https://github.com/JoaquinAmatRodrigo/skforecast>`_ A python library
-  that eases using scikit-learn regressors as multi-step forecasters. It also works
-  with any regressor compatible with the scikit-learn API.
 
 **Deep neural networks etc.**
 
-- `nolearn <https://github.com/dnouri/nolearn>`_ A number of wrappers and
-  abstractions around existing neural network libraries
-
-- `Keras <https://www.tensorflow.org/api_docs/python/tf/keras>`_ High-level API for
-  TensorFlow with a scikit-learn inspired API.
-
-- `lasagne <https://github.com/Lasagne/Lasagne>`_ A lightweight library to
-  build and train neural networks in Theano.
-
 - `skorch <https://github.com/dnouri/skorch>`_ A scikit-learn compatible
   neural network library that wraps PyTorch.
 
@@ -219,9 +230,6 @@ and tasks.
 
 **Other regression and classification**
 
-- `xgboost <https://github.com/dmlc/xgboost>`_ Optimised gradient boosted decision
-  tree library.
-
 - `ML-Ensemble <https://mlens.readthedocs.io/>`_ Generalized
   ensemble learning (stacking, blending, subsemble, deep ensembles,
   etc.).
@@ -232,10 +240,6 @@ and tasks.
 - `py-earth <https://github.com/scikit-learn-contrib/py-earth>`_ Multivariate
   adaptive regression splines
 
-- `Kernel Regression <https://github.com/jmetzen/kernel_regression>`_
-  Implementation of Nadaraya-Watson kernel regression with automatic bandwidth
-  selection
-
 - `gplearn <https://github.com/trevorstephens/gplearn>`_ Genetic Programming
   for symbolic regression tasks.
 
@@ -245,8 +249,6 @@ and tasks.
 - `seglearn <https://github.com/dmbee/seglearn>`_ Time series and sequence
   learning using sliding window segmentation.
 
-- `libOPF <https://github.com/jppbsi/LibOPF>`_ Optimal path forest classifier
-
 - `fastFM <https://github.com/ibayer/fastFM>`_ Fast factorization machine
   implementation compatible with scikit-learn
 
@@ -266,6 +268,7 @@ and tasks.
 
 - `hdbscan <https://github.com/scikit-learn-contrib/hdbscan>`_ HDBSCAN and Robust Single
   Linkage clustering algorithms for robust variable density clustering.
+  As of scikit-learn version 1.3.0, there is :class:`~sklearn.cluster.HDBSCAN`.
 
 - `spherecluster <https://github.com/clara-labs/spherecluster>`_ Spherical
   K-means and mixture of von Mises Fisher clustering routines for data on the
@@ -276,6 +279,8 @@ and tasks.
 - `categorical-encoding
   <https://github.com/scikit-learn-contrib/categorical-encoding>`_ A
   library of sklearn compatible categorical variable encoders.
+  As of scikit-learn version 1.3.0, there is
+  :class:`~sklearn.preprocessing.TargetEncoder`.
 
 - `imbalanced-learn
   <https://github.com/scikit-learn-contrib/imbalanced-learn>`_ Various
@@ -331,9 +336,6 @@ Recommendation Engine packages
 - `OpenRec <https://github.com/ylongqi/openrec>`_ TensorFlow-based
   neural-network inspired recommendation algorithms.
 
-- `Spotlight <https://github.com/maciejkula/spotlight>`_ Pytorch-based
-  implementation of deep recommender models.
-
 - `Surprise Lib <https://surpriselib.com/>`_ Library for explicit feedback
   datasets.
 
@@ -355,9 +357,6 @@ Domain specific packages
 
 - `AstroML <https://www.astroml.org/>`_ Machine learning for astronomy.
 
-- `MSMBuilder <http://msmbuilder.org/>`_ Machine learning for protein
-  conformational dynamics time series.
-
 Translations of scikit-learn documentation
 ------------------------------------------
 