Add gradient calculation in _huber_loss_and_gradient
Add tests to check the correctness of the loss and gradient
Fix for old scipy
Add parameter sigma for robust linear regression
Add gradient formula to robust _huber_loss_and_gradient
Add fit_intercept option and fix tests
Add docs to HuberRegressor and the helper functions
Add example demonstrating ridge_regression vs huber_regression
Add sample_weight implementation
Add scaling invariant huber test
Remove exp and add bounds to fmin_l_bfgs_b
Add sparse data support
Add more tests and refactoring of code
Add narrative docs
review huber regressor
Minor additions to docs and tests
Minor fixes that deal with NaN values in targets
and old versions of SciPy and NumPy
Add HuberRegressor to robust estimator
Refactored computation of gradient and make docs render properly
Temp
Remove float64 dtype conversion
trivial optimizations and add a note about R
Remove sample_weights special_casing
address @amueller comments
-* :ref:`RANSAC <ransac_regression>` is faster, and scales much better
-  with the number of samples
+* :ref:`HuberRegressor <huber_regression>` should be faster than
+  :ref:`RANSAC <ransac_regression>` and :ref:`Theil Sen <theil_sen_regression>`
+  unless the number of samples is very large, i.e. ``n_samples`` >> ``n_features``.
+  This is because :ref:`RANSAC <ransac_regression>` and :ref:`Theil Sen <theil_sen_regression>`
+  fit on smaller subsets of the data. However, both :ref:`Theil Sen <theil_sen_regression>`
+  and :ref:`RANSAC <ransac_regression>` are unlikely to be as robust as
+  :ref:`HuberRegressor <huber_regression>` for the default parameters.
 
-* :ref:`RANSAC <ransac_regression>` will deal better with large
-  outliers in the y direction (most common situation)
+* :ref:`RANSAC <ransac_regression>` is faster than :ref:`Theil Sen <theil_sen_regression>`
+  and scales much better with the number of samples
+
+* :ref:`RANSAC <ransac_regression>` will deal better with large
+  outliers in the y direction (most common situation)
 
 * :ref:`Theil Sen <theil_sen_regression>` will cope better with
   medium-size outliers in the X direction, but this property will
@@ -1050,6 +1059,67 @@ considering only a random subset of all possible combinations.
 
 .. [#f2] T. Kärkkäinen and S. Äyrämö: `On Computation of Spatial Median for Robust Data Mining. <http://users.jyu.fi/~samiayr/pdf/ayramo_eurogen05.pdf>`_
 
+.. _huber_regression:
+
+Huber Regression
+----------------
+
+The :class:`HuberRegressor` is different from :class:`Ridge` because it applies a
+linear loss to samples that are classified as outliers.
+A sample is classified as an inlier if the absolute error of that sample is
+less than a certain threshold. It differs from :class:`TheilSenRegressor`
+and :class:`RANSACRegressor` because it does not ignore the effect of the outliers
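The inlier/outlier rule described in the added Huber Regression paragraph can be illustrated with a small sketch. It uses the common textbook form of the Huber loss: quadratic below a threshold and linear above it. The function names `huber_loss` and `squared_loss`, the parameter name `epsilon`, and its value of 1.35 are assumptions chosen for illustration; the exact scaling used by `_huber_loss_and_gradient` in this change (including the `sigma` parameter mentioned in the commit list) may differ.

```python
def squared_loss(residual):
    """Plain squared-error penalty, shown for comparison."""
    return 0.5 * residual ** 2


def huber_loss(residual, epsilon=1.35):
    """Textbook Huber loss for a single residual.

    Hypothetical helper for illustration only; it is not the
    implementation added in this change.
    """
    abs_r = abs(residual)
    if abs_r < epsilon:
        # Inlier: absolute error below the threshold, quadratic penalty.
        return 0.5 * residual ** 2
    # Outlier: linear penalty, offset so the two pieces meet at the threshold.
    return epsilon * (abs_r - 0.5 * epsilon)


if __name__ == "__main__":
    # Large residuals are penalised far less by the Huber loss than by the
    # squared loss, but they still contribute to the objective.
    for r in (0.5, 1.0, 5.0, 50.0):
        print(r, squared_loss(r), huber_loss(r))
```

Because the outlier branch grows linearly rather than being dropped, outliers keep a reduced influence on the fit instead of being ignored, which matches the contrast with :class:`TheilSenRegressor` and :class:`RANSACRegressor` drawn in the paragraph.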