Merge branch 'master' into circle-noplot · scikit-learn/scikit-learn@2d001a6 · GitHub

Commit 2d001a6

Merge branch 'master' into circle-noplot
2 parents 7096f3d + c171561 commit 2d001a6

File tree

15 files changed: +101 additions, −66 deletions


.gitattributes

Lines changed: 1 addition & 30 deletions
@@ -1,30 +1 @@
-/sklearn/__check_build/_check_build.c -diff
-/sklearn/_isotonic.c -diff
-/sklearn/cluster/_dbscan_inner.cpp -diff
-/sklearn/cluster/_hierarchical.cpp -diff
-/sklearn/cluster/_k_means.c -diff
-/sklearn/cluster/_k_means_elkan.c -diff
-/sklearn/datasets/_svmlight_format.c -diff
-/sklearn/decomposition/_online_lda.c -diff
-/sklearn/decomposition/cdnmf_fast.c -diff
-/sklearn/ensemble/_gradient_boosting.c -diff
-/sklearn/feature_extraction/_hashing.c -diff
-/sklearn/linear_model/cd_fast.c -diff
-/sklearn/linear_model/sgd_fast.c -diff
-/sklearn/linear_model/sag_fast.c -diff
-/sklearn/metrics/pairwise_fast.c -diff
-/sklearn/neighbors/ball_tree.c -diff
-/sklearn/neighbors/kd_tree.c -diff
-/sklearn/svm/liblinear.c -diff
-/sklearn/svm/libsvm.c -diff
-/sklearn/svm/libsvm_sparse.c -diff
-/sklearn/tree/_tree.c -diff
-/sklearn/tree/_utils.c -diff
-/sklearn/utils/arrayfuncs.c -diff
-/sklearn/utils/graph_shortest_path.c -diff
-/sklearn/utils/lgamma.c -diff
-/sklearn/utils/_logistic_sigmoid.c -diff
-/sklearn/utils/murmurhash.c -diff
-/sklearn/utils/seq_dataset.c -diff
-/sklearn/utils/sparsefuncs_fast.c -diff
-/sklearn/utils/weight_vector.c -diff
+/doc/whats_new.rst merge=union

build_tools/circle/push_doc.sh

Lines changed: 3 additions & 1 deletion
@@ -24,9 +24,11 @@ MSG="Pushing the docs to $dir/ for branch: $CIRCLE_BRANCH, commit $CIRCLE_SHA1"
 
 cd $HOME
 if [ ! -d $DOC_REPO ];
-then git clone "git@github.com:scikit-learn/"$DOC_REPO".git";
+then git clone --depth 1 --no-checkout "git@github.com:scikit-learn/"$DOC_REPO".git";
 fi
 cd $DOC_REPO
+git config core.sparseCheckout true
+echo $dir > .git/info/sparse-checkout
 git checkout $CIRCLE_BRANCH
 git reset --hard origin/$CIRCLE_BRANCH
 git rm -rf $dir/ && rm -rf $dir/

circle.yml

Lines changed: 2 additions & 2 deletions
@@ -9,9 +9,9 @@ dependencies:
     - ./build_tools/circle/build_doc.sh:
         timeout: 3600 # seconds
 test:
-  # Grep error on the documentation
   override:
-    - cat ~/log.txt && if grep -q "Traceback (most recent call last):" ~/log.txt; then false; else true; fi
+    # override is needed otherwise nosetests is run by default
+    - echo "Documentation has been built in the 'dependencies' step. No additional test to run"
 deployment:
   push:
     branch: /^master$|^[0-9]+\.[0-9]+\.X$/

doc/modules/clustering.rst

Lines changed: 10 additions & 9 deletions
@@ -746,17 +746,18 @@ by black points below.
 
 .. topic:: Implementation
 
-    The algorithm is non-deterministic, but the core samples will
-    always belong to the same clusters (although the labels may be
-    different). The non-determinism comes from deciding to which cluster a
-    non-core sample belongs. A non-core sample can have a distance lower
-    than ``eps`` to two core samples in different clusters. By the
+    The DBSCAN algorithm is deterministic, always generating the same clusters
+    when given the same data in the same order. However, the results can differ when
+    data is provided in a different order. First, even though the core samples
+    will always be assigned to the same clusters, the labels of those clusters
+    will depend on the order in which those samples are encountered in the data.
+    Second and more importantly, the clusters to which non-core samples are assigned
+    can differ depending on the data order. This would happen when a non-core sample
+    has a distance lower than ``eps`` to two core samples in different clusters. By the
     triangular inequality, those two core samples must be more distant than
    ``eps`` from each other, or they would be in the same cluster. The non-core
-    sample is assigned to whichever cluster is generated first, where
-    the order is determined randomly. Other than the ordering of
-    the dataset, the algorithm is deterministic, making the results relatively
-    stable between runs on the same data.
+    sample is assigned to whichever cluster is generated first in a pass
+    through the data, and so the results will depend on the data ordering.
 
     The current implementation uses ball trees and kd-trees
     to determine the neighborhood of points,
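
A small illustrative sketch of the behaviour described in the new paragraph (not part of this commit; the data values and ``eps``/``min_samples`` are chosen only so that one borderline sample is non-core and lies within ``eps`` of core samples from two different clusters):

import numpy as np
from sklearn.cluster import DBSCAN

# two dense groups plus one borderline point between them
X = np.array([[0.0], [0.2], [0.4], [0.6],
              [2.0], [2.2], [2.4], [2.6],
              [1.3]])
params = dict(eps=0.75, min_samples=4)

labels_forward = DBSCAN(**params).fit_predict(X)
labels_reversed = DBSCAN(**params).fit_predict(X[::-1])[::-1]  # re-aligned to X

# The core samples form the same two clusters either way, but the borderline
# point (last row) joins whichever cluster is expanded first, so it ends up
# with the 0.x group in one ordering and the 2.x group in the other.
print(labels_forward)
print(labels_reversed)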

doc/modules/grid_search.rst

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ distribution. After describing these tools we detail
 
 Note that it is common that a small subset of those parameters can have a large
 impact on the predictive or computation performance of the model while others
-can be left to their default values. It is recommend to read the docstring of
+can be left to their default values. It is recommended to read the docstring of
 the estimator class to get a finer understanding of their expected behavior,
 possibly by reading the enclosed reference to the literature.
 

doc/modules/model_evaluation.rst

Lines changed: 12 additions & 0 deletions
@@ -1133,6 +1133,12 @@ are predicted. This is useful if you want to know how many top-scored-labels
 you have to predict in average without missing any true one. The best value
 of this metrics is thus the average number of true labels.
 
+.. note::
+
+    Our implementation's score is 1 greater than the one given in Tsoumakas
+    et al., 2010. This extends it to handle the degenerate case in which an
+    instance has 0 true labels.
+
 Formally, given a binary indicator matrix of the ground truth labels
 :math:`y \in \left\{0, 1\right\}^{n_\text{samples} \times n_\text{labels}}` and the
 score associated with each label
@@ -1236,6 +1242,12 @@ Here is a small example of usage of this function::
     >>> label_ranking_loss(y_true, y_score)
     0.0
 
+
+.. topic:: References:
+
+    * Tsoumakas, G., Katakis, I., & Vlahavas, I. (2010). Mining multi-label data. In
+      Data mining and knowledge discovery handbook (pp. 667-685). Springer US.
+
 .. _regression_metrics:
 
 Regression metrics

doc/modules/model_persistence.rst

Lines changed: 4 additions & 0 deletions
@@ -81,6 +81,10 @@ additional metadata should be saved along the pickled model:
 This should make it possible to check that the cross-validation score is in the
 same range as before.
 
+Since a model internal representation may be different on two different
+architectures, dumping a model on one architecture and loading it on
+another architecture is not supported.
+
 If you want to know more about these issues and explore other possible
 serialization methods, please refer to this
 `talk by Alex Gaynor <http://pyvideo.org/video/2566/pickles-are-for-delis-not-software>`_.
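
As a hedged illustration of the advice above about saving metadata along with the pickled model (the estimator and dictionary keys here are illustrative, not taken from the diff):

import pickle
import sys

import sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression().fit(X, y)

# bundle the estimator with the metadata the section recommends recording
payload = {
    "model": model,
    "sklearn_version": sklearn.__version__,
    "python_version": sys.version,
    "training_data": "iris (bundled with scikit-learn)",
}
with open("model.pkl", "wb") as f:
    pickle.dump(payload, f)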

doc/whats_new.rst

Lines changed: 5 additions & 0 deletions
@@ -85,6 +85,11 @@ Enhancements
      do not set attributes on the estimator.
      :issue:`7533` by :user:`Ekaterina Krivich <kiote>`.
 
+   - For sparse matrices, :func:`preprocessing.normalize` with ``return_norm=True``
+     will now raise a ``NotImplementedError`` with 'l1' or 'l2' norm and with norm 'max'
+     the norms returned will be the same as for dense matrices (:issue:`7771`).
+     By `Ang Lu <https://github.com/luang008>`_.
+
 Bug fixes
 .........
 
examples/classification/plot_lda_qda.py

Lines changed: 6 additions & 2 deletions
@@ -1,9 +1,13 @@
 """
 ====================================================================
-Linear and Quadratic Discriminant Analysis with confidence ellipsoid
+Linear and Quadratic Discriminant Analysis with covariance ellipsoid
 ====================================================================
 
-Plot the confidence ellipsoids of each class and decision boundary
+This example plots the covariance ellipsoids of each class and
+decision boundary learned by LDA and QDA. The ellipsoids display
+the double standard deviation for each class. With LDA, the
+standard deviation is the same for all the classes, while each
+class has its own standard deviation with QDA.
 """
 print(__doc__)
 
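
The updated docstring describes covariance ellipsoids drawn at two standard deviations. A minimal sketch of how such an ellipse can be drawn for a fitted LDA estimator on synthetic data (this is not the example's actual plotting code):

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.RandomState(0)
X = np.r_[rng.randn(50, 2), rng.randn(50, 2) + [3, 3]]
y = np.r_[np.zeros(50), np.ones(50)]
lda = LinearDiscriminantAnalysis(store_covariance=True).fit(X, y)

# ellipse axes come from the eigendecomposition of the shared covariance
vals, vecs = np.linalg.eigh(lda.covariance_)               # eigenvalues ascending
angle = np.degrees(np.arctan2(vecs[1, -1], vecs[0, -1]))   # largest eigenvector
width, height = 4.0 * np.sqrt(vals[::-1])                  # 2 std dev on each side

ax = plt.gca()
for mean in lda.means_:
    ax.add_patch(Ellipse(mean, width, height, angle=angle, alpha=0.3))
ax.scatter(X[:, 0], X[:, 1], c=y, s=10)
plt.show()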

examples/datasets/plot_iris_dataset.py

Lines changed: 3 additions & 3 deletions
@@ -31,7 +31,7 @@
 # import some data to play with
 iris = datasets.load_iris()
 X = iris.data[:, :2]  # we only take the first two features.
-Y = iris.target
+y = iris.target
 
 x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
 y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
@@ -40,7 +40,7 @@
 plt.clf()
 
 # Plot the training points
-plt.scatter(X[:, 0], X[:, 1], c=Y, cmap=plt.cm.Paired)
+plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)
 plt.xlabel('Sepal length')
 plt.ylabel('Sepal width')
 
@@ -54,7 +54,7 @@
 fig = plt.figure(1, figsize=(8, 6))
 ax = Axes3D(fig, elev=-150, azim=110)
 X_reduced = PCA(n_components=3).fit_transform(iris.data)
-ax.scatter(X_reduced[:, 0], X_reduced[:, 1], X_reduced[:, 2], c=Y,
+ax.scatter(X_reduced[:, 0], X_reduced[:, 1], X_reduced[:, 2], c=y,
            cmap=plt.cm.Paired)
 ax.set_title("First three PCA directions")
 ax.set_xlabel("1st eigenvector")

sklearn/metrics/pairwise.py

Lines changed: 4 additions & 4 deletions
@@ -752,7 +752,7 @@ def polynomial_kernel(X, Y=None, degree=3, gamma=None, coef0=1):
     degree : int, default 3
 
     gamma : float, default None
-        if None, defaults to 1.0 / n_samples_1
+        if None, defaults to 1.0 / n_features
 
     coef0 : int, default 1
 
@@ -786,7 +786,7 @@ def sigmoid_kernel(X, Y=None, gamma=None, coef0=1):
     Y : ndarray of shape (n_samples_2, n_features)
 
     gamma : float, default None
-        If None, defaults to 1.0 / n_samples_1
+        If None, defaults to 1.0 / n_features
 
     coef0 : int, default 1
 
@@ -822,7 +822,7 @@ def rbf_kernel(X, Y=None, gamma=None):
     Y : array of shape (n_samples_Y, n_features)
 
     gamma : float, default None
-        If None, defaults to 1.0 / n_samples_X
+        If None, defaults to 1.0 / n_features
 
     Returns
     -------
@@ -857,7 +857,7 @@ def laplacian_kernel(X, Y=None, gamma=None):
     Y : array of shape (n_samples_Y, n_features)
 
     gamma : float, default None
-        If None, defaults to 1.0 / n_samples_X
+        If None, defaults to 1.0 / n_features
 
     Returns
     -------
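
A quick check of the corrected default (assuming a current scikit-learn; nothing here comes from the diff itself): with gamma=None the kernels fall back to 1.0 / n_features, so passing that value explicitly yields the same kernel matrix.

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

X = np.random.RandomState(0).rand(5, 3)             # 5 samples, 3 features
K_default = rbf_kernel(X)                            # gamma=None
K_explicit = rbf_kernel(X, gamma=1.0 / X.shape[1])   # 1.0 / n_features
assert np.allclose(K_default, K_explicit)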

sklearn/metrics/ranking.py

Lines changed: 4 additions & 0 deletions
@@ -633,6 +633,10 @@ def coverage_error(y_true, y_score, sample_weight=None):
     Ties in ``y_scores`` are broken by giving maximal rank that would have
     been assigned to all tied values.
 
+    Note: Our implementation's score is 1 greater than the one given in
+    Tsoumakas et al., 2010. This extends it to handle the degenerate case
+    in which an instance has 0 true labels.
+
     Read more in the :ref:`User Guide <coverage_error>`.
 
     Parameters
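
A small worked example of the convention stated in the note (using the public sklearn.metrics.coverage_error): when a sample's only true label is the top-scored one, its coverage counts as 1 rather than 0.

import numpy as np
from sklearn.metrics import coverage_error

y_true = np.array([[1, 0, 0],
                   [0, 0, 1]])
y_score = np.array([[1.0, 0.5, 0.2],    # true label ranked 1st -> coverage 1
                    [1.0, 0.2, 0.1]])   # true label ranked 3rd -> coverage 3
print(coverage_error(y_true, y_score))  # (1 + 3) / 2 = 2.0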

sklearn/neighbors/binary_tree.pxi

Lines changed: 11 additions & 11 deletions
@@ -297,9 +297,9 @@ Query for k-nearest neighbors
 >>> X = np.random.random((10, 3))  # 10 points in 3 dimensions
 >>> tree = {BinaryTree}(X, leaf_size=2)       # doctest: +SKIP
 >>> dist, ind = tree.query([X[0]], k=3)       # doctest: +SKIP
->>> print ind  # indices of 3 closest neighbors
+>>> print(ind)  # indices of 3 closest neighbors
 [0 3 1]
->>> print dist  # distances to 3 closest neighbors
+>>> print(dist)  # distances to 3 closest neighbors
 [ 0. 0.19662693 0.29473397]
 
 Pickle and Unpickle a tree. Note that the state of the tree is saved in the
@@ -313,9 +313,9 @@ pickle operation: the tree needs not be rebuilt upon unpickling.
 >>> s = pickle.dumps(tree)                    # doctest: +SKIP
 >>> tree_copy = pickle.loads(s)               # doctest: +SKIP
 >>> dist, ind = tree_copy.query(X[0], k=3)    # doctest: +SKIP
->>> print ind  # indices of 3 closest neighbors
+>>> print(ind)  # indices of 3 closest neighbors
 [0 3 1]
->>> print dist  # distances to 3 closest neighbors
+>>> print(dist)  # distances to 3 closest neighbors
 [ 0. 0.19662693 0.29473397]
 
 Query for neighbors within a given radius
@@ -324,10 +324,10 @@ Query for neighbors within a given radius
 >>> np.random.seed(0)
 >>> X = np.random.random((10, 3))  # 10 points in 3 dimensions
 >>> tree = {BinaryTree}(X, leaf_size=2)       # doctest: +SKIP
->>> print tree.query_radius(X[0], r=0.3, count_only=True)
+>>> print(tree.query_radius(X[0], r=0.3, count_only=True))
 3
 >>> ind = tree.query_radius(X[0], r=0.3)      # doctest: +SKIP
->>> print ind  # indices of neighbors within distance 0.3
+>>> print(ind)  # indices of neighbors within distance 0.3
 [3 0 1]
 
 
@@ -623,7 +623,7 @@ cdef class NeighborsHeap:
         dist_arr[0] = val
         ind_arr[0] = i_val
 
-        #descend the heap, swapping values until the max heap criterion is met
+        # descend the heap, swapping values until the max heap criterion is met
         i = 0
         while True:
             ic1 = 2 * i + 1
@@ -1282,9 +1282,9 @@ cdef class BinaryTree:
         >>> X = np.random.random((10, 3))  # 10 points in 3 dimensions
         >>> tree = BinaryTree(X, leaf_size=2)     # doctest: +SKIP
         >>> dist, ind = tree.query(X[0], k=3)     # doctest: +SKIP
-        >>> print ind  # indices of 3 closest neighbors
+        >>> print(ind)  # indices of 3 closest neighbors
         [0 3 1]
-        >>> print dist  # distances to 3 closest neighbors
+        >>> print(dist)  # distances to 3 closest neighbors
         [ 0. 0.19662693 0.29473397]
         """
         # XXX: we should allow X to be a pre-built tree.
@@ -1415,10 +1415,10 @@ cdef class BinaryTree:
         >>> np.random.seed(0)
         >>> X = np.random.random((10, 3))  # 10 points in 3 dimensions
         >>> tree = BinaryTree(X, leaf_size=2)     # doctest: +SKIP
-        >>> print tree.query_radius(X[0], r=0.3, count_only=True)
+        >>> print(tree.query_radius(X[0], r=0.3, count_only=True))
         3
         >>> ind = tree.query_radius(X[0], r=0.3)  # doctest: +SKIP
-        >>> print ind  # indices of neighbors within distance 0.3
+        >>> print(ind)  # indices of neighbors within distance 0.3
         [3 0 1]
         """
         if count_only and return_distance:
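
For reference, a runnable Python 3 version of the doctest snippets updated above, written against the public KDTree class (the {BinaryTree} placeholder stands in for KDTree/BallTree):

import numpy as np
from sklearn.neighbors import KDTree

np.random.seed(0)
X = np.random.random((10, 3))               # 10 points in 3 dimensions
tree = KDTree(X, leaf_size=2)

dist, ind = tree.query(X[:1], k=3)          # query with a 2-D array
print(ind)                                  # indices of the 3 closest neighbors
print(dist)                                 # distances to the 3 closest neighbors
print(tree.query_radius(X[:1], r=0.3, count_only=True))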

sklearn/preprocessing/data.py

Lines changed: 17 additions & 3 deletions
@@ -1325,6 +1325,16 @@ def normalize(X, norm='l2', axis=1, copy=True, return_norm=False):
     return_norm : boolean, default False
         whether to return the computed norms
 
+    Returns
+    -------
+    X : {array-like, sparse matrix}, shape [n_samples, n_features]
+        Normalized input X.
+
+    norms : array, shape [n_samples] if axis=1 else [n_features]
+        An array of norms along given axis for X.
+        When X is sparse, a NotImplementedError will be raised
+        for norm 'l1' or 'l2'.
+
     See also
     --------
     Normalizer: Performs normalization using the ``Transformer`` API
@@ -1346,15 +1356,19 @@ def normalize(X, norm='l2', axis=1, copy=True, return_norm=False):
         X = X.T
 
     if sparse.issparse(X):
+        if return_norm and norm in ('l1', 'l2'):
+            raise NotImplementedError("return_norm=True is not implemented "
+                                      "for sparse matrices with norm 'l1' "
+                                      "or norm 'l2'")
         if norm == 'l1':
             inplace_csr_row_normalize_l1(X)
         elif norm == 'l2':
             inplace_csr_row_normalize_l2(X)
         elif norm == 'max':
             _, norms = min_max_axis(X, 1)
-            norms = norms.repeat(np.diff(X.indptr))
-            mask = norms != 0
-            X.data[mask] /= norms[mask]
+            norms_elementwise = norms.repeat(np.diff(X.indptr))
+            mask = norms_elementwise != 0
+            X.data[mask] /= norms_elementwise[mask]
     else:
         if norm == 'l1':
             norms = np.abs(X).sum(axis=1)
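
A short usage sketch of the behaviour added above (not code from the commit): return_norm works for dense input with any norm, but raises NotImplementedError for sparse input with 'l1' or 'l2'.

import numpy as np
from scipy import sparse
from sklearn.preprocessing import normalize

X = np.array([[3.0, 0.0, 4.0]])
X_normed, norms = normalize(X, norm='l2', return_norm=True)
print(norms)                                # l2 norm of the row: 5.0

X_sparse = sparse.csr_matrix(X)
try:
    normalize(X_sparse, norm='l2', return_norm=True)
except NotImplementedError as exc:
    print(exc)                              # raised for sparse input with 'l1'/'l2'

_, max_norms = normalize(X_sparse, norm='max', return_norm=True)
print(max_norms)                            # row-wise max: 4.0, same as for dense input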

sklearn/preprocessing/tests/test_data.py

Lines changed: 18 additions & 0 deletions
@@ -1315,6 +1315,24 @@ def test_normalize():
 
         assert_array_almost_equal(row_sums, ones)
 
+    # Test return_norm
+    X_dense = np.array([[3.0, 0, 4.0], [1.0, 0.0, 0.0], [2.0, 3.0, 0.0]])
+    for norm in ('l1', 'l2', 'max'):
+        _, norms = normalize(X_dense, norm=norm, return_norm=True)
+        if norm == 'l1':
+            assert_array_almost_equal(norms, np.array([7.0, 1.0, 5.0]))
+        elif norm == 'l2':
+            assert_array_almost_equal(norms, np.array([5.0, 1.0, 3.60555127]))
+        else:
+            assert_array_almost_equal(norms, np.array([4.0, 1.0, 3.0]))
+
+    X_sparse = sparse.csr_matrix(X_dense)
+    for norm in ('l1', 'l2'):
+        assert_raises(NotImplementedError, normalize, X_sparse,
+                      norm=norm, return_norm=True)
+    _, norms = normalize(X_sparse, norm='max', return_norm=True)
+    assert_array_almost_equal(norms, np.array([4.0, 1.0, 3.0]))
+
 
 def test_binarizer():
     X_ = np.array([[1, 0, 5], [2, 3, -1]])
