8000 ENH avoid division by zero in LDA, also avoid reusing variable names. · pfdevilliers/scikit-learn@f0026be · GitHub
ENH avoid division by zero in LDA, also avoid reusing variable names.
1 parent cd0b531 commit f0026be

1 file changed: 5 additions, 3 deletions

sklearn/lda.py

@@ -146,19 +146,21 @@ def fit(self, X, y, store_covariance=False, tol=1.0e-4):

         # ----------------------------
         # 1) within (univariate) scaling by with classes std-dev
-        scaling = 1. / Xc.std(0)
+        std = Xc.std(axis=0)
+        # avoid division by zero in normalization
+        std[std == 0] = 1.
         fac = float(1) / (n_samples - n_classes)
         # ----------------------------
         # 2) Within variance scaling
-        X = np.sqrt(fac) * (Xc * scaling)
+        X = np.sqrt(fac) * (Xc / std)
         # SVD of centered (within)scaled data
         U, S, V = linalg.svd(X, full_matrices=0)

         rank = np.sum(S > tol)
         if rank < n_features:
             warnings.warn("Variables are collinear")
         # Scaling of within covariance is: V' 1/S
-        scaling = (scaling * V[:rank]).T / S[:rank]
+        scaling = (V[:rank] / std).T / S[:rank]

         ## ----------------------------
         ## 3) Between variance scaling
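As a sanity check, the patched logic can be sketched in isolation. This is a minimal, self-contained version of the changed lines: `Xc`, `n_samples`, `n_classes`, and `tol` are hypothetical stand-ins for the values available inside `fit`, the last feature is deliberately made constant so its within-class standard deviation is exactly zero, and NumPy's SVD is used in place of `scipy.linalg.svd`:

```python
import numpy as np

# Hypothetical class-centered data: the last column is constant,
# so its standard deviation along axis 0 is exactly zero.
rng = np.random.RandomState(0)
Xc = np.hstack([rng.randn(6, 2), np.zeros((6, 1))])

n_samples, n_classes = 6, 2  # assumed values for this sketch
tol = 1.0e-4

# 1) within (univariate) scaling by the within-class std-dev
std = Xc.std(axis=0)
# avoid division by zero in normalization: without this guard,
# Xc / std would produce inf/nan for the constant column
std[std == 0] = 1.0

# 2) within-variance scaling, then SVD (NumPy's, as an assumption)
fac = 1.0 / (n_samples - n_classes)
X = np.sqrt(fac) * (Xc / std)
U, S, V = np.linalg.svd(X, full_matrices=False)

# the constant (collinear) feature drops the numerical rank to 2
rank = np.sum(S > tol)
scaling = (V[:rank] / std).T / S[:rank]

print(np.isfinite(scaling).all())  # True
```

Before the patch, a zero within-class standard deviation turned `scaling = 1. / Xc.std(0)` into `inf`, which then propagated through every later use of `scaling`; the guard substitutes 1 for zero stds, and the renamed `std` variable is no longer overwritten by the later `scaling` assignment.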

0 commit comments