normalized cross-correlation (Trac #1714) #2310
Comments
Attachment added by trac user bubla on 2011-01-15: correlate.parch
@rgommers wrote on 2011-03-29: Since this is new functionality, you should ask on the mailing list whether this can be included. A unit test is necessary; doctests are not executed by default in the test suite.
This could be useful.
Can someone verify if this is correct?
I am wondering about what is right in combination with ...
I happen to need this feature, but am not sure that it is correct except when mode='valid' AND both arrays are the same size. In other words, I think the normalization has to be applied to each window.
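A small sketch illustrating this point (the signal and embedded feature here are hypothetical, not from the thread): normalizing the inputs globally understates a perfect match, whereas applying the normalization per window recovers a Pearson coefficient near 1 at the matching lag.

```python
import numpy as np

rng = np.random.default_rng(0)
b = rng.standard_normal(64)
a = 0.1 * rng.standard_normal(576)
a[256:320] += b                      # embed an (almost) exact copy of b in a

# Global normalization, as commonly suggested:
an = (a - a.mean()) / (a.std() * len(a))
bn = (b - b.mean()) / b.std()
print(np.correlate(an, bn, "valid").max())   # well below 1 despite the embedded match

# Per-window normalization: Pearson coefficient of each window of a against b
wins = np.lib.stride_tricks.sliding_window_view(a, len(b))
cc = np.array([np.corrcoef(w, b)[0, 1] for w in wins])
print(cc.max())                              # close to 1 at the matching lag
```

The per-window loop over `np.corrcoef` is slow for long signals, but it makes the difference between the two normalizations explicit.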
Any updates on the matter?
@VlamV, as there is nothing here, no, I don't believe so. If you have time to fix it for the other modes, I think we could add it. |
Any news on this ? |
Would it not make more sense to divide by the standard deviation of the portion of the signal that is actually used? Also, would it not make sense to use the ...
There is an answer at SO that points here: https://stackoverflow.com/a/5639626/5392420 It also has the most relevant code included. As the attachment above is no longer available (at least to me), I thought it could be helpful to have this as a start for an implementation. I can also confirm that, at least in my case, it works to "normalize" the input vectors before using np.correlate.
The normalization by the portion of signal a that is actually used (see @wilberth's comment) is implemented with numpy methods in the correlate_template function in this repository: https://github.com/trichter/xcorr
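The idea can be sketched in a few lines of NumPy. This is a hypothetical `normalized_correlate`, not the actual code from that repository: rolling sums give each window's mean and standard deviation cheaply, and because the normalized template sums to zero, the window mean drops out and only the per-window std scaling remains.

```python
import numpy as np

def normalized_correlate(a, t):
    """Correlate template t against signal a ('valid' mode), normalizing each
    overlapping window of a by its own mean and standard deviation.
    A minimal sketch of the idea; assumes every window has nonzero variance."""
    a = np.asarray(a, float)
    t = np.asarray(t, float)
    n = len(t)
    tn = (t - t.mean()) / t.std()
    # Rolling sums of a and a**2 give each window's mean and std without a loop.
    s1 = np.convolve(a, np.ones(n), "valid")
    s2 = np.convolve(a * a, np.ones(n), "valid")
    mean = s1 / n
    std = np.sqrt(np.maximum(s2 / n - mean ** 2, 0.0))
    raw = np.correlate(a, tn, "valid")
    # sum((a_win - mean) * tn) == raw - mean * sum(tn) == raw, since tn sums to
    # zero; dividing by (std * n) yields the Pearson coefficient per lag.
    return raw / (std * n)
```

A perfect (even affine-scaled) copy of the template embedded in the signal then scores exactly 1 at its lag, which is the property the globally normalized version lacks.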
Hi all, I'm a beginner at OSS, but maybe I have a comment about this open issue. Because the link is dead, I can't check the implementation progress of the normalized cross-correlation function. The normalized option is useful, and I believe this functionality should be added. In some cases we can calculate a normalized cross-correlation like this:

```python
import numpy as np
import matplotlib.pyplot as plt

a = np.load("a.npy")
b = np.load("b.npy")

# Normalize globally: zero mean, scaled so that a perfect match of two
# equal-length signals yields a peak of 1.
a = (a - np.mean(a)) / (np.std(a) * len(a))
b = (b - np.mean(b)) / np.std(b)
corr = np.correlate(a, b, "full")

fig = plt.figure()
plt.plot(corr)
plt.show()
fig.savefig("test.png")
```

The exact normalized cross-correlation is different: we can see the difference in the calculation range of the mean value and standard deviation of the signal, which should cover only the overlapping portion at each lag rather than the whole array. Please point out any mistakes.
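The formula image from this comment did not survive in the thread. A standard definition of the exact (per-window) normalized cross-correlation, consistent with the description above, would be:

```latex
\mathrm{NCC}(l) =
\frac{\sum_{t=0}^{N_b-1}\bigl(a[t+l]-\bar{a}_l\bigr)\bigl(b[t]-\bar{b}\bigr)}
     {N_b\,\sigma_{a,l}\,\sigma_b},
\qquad
\bar{a}_l = \frac{1}{N_b}\sum_{t=0}^{N_b-1} a[t+l],
\qquad
\sigma_{a,l}^2 = \frac{1}{N_b}\sum_{t=0}^{N_b-1}\bigl(a[t+l]-\bar{a}_l\bigr)^2,
```

where N_b is the length of b, and the mean and standard deviation of a are taken over only the length-N_b window at lag l, while the snippet above uses the mean and standard deviation of the whole of a.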
For a normalized cross-correlation, here is one implementation:

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence

import numpy as np

ArrayLike = Any


@dataclass
class XCorr:
    cross_correlation: np.ndarray
    lags: np.ndarray


def cross_correlation(
    signal: ArrayLike, feature: ArrayLike, lags: Optional[Sequence[int]] = None
) -> XCorr:
    """
    Computes normalized cross-correlation between the `signal` and the `feature`.

    The current implementation assumes the `feature` can't be longer than the
    `signal`. You can optionally provide specific lags; if not provided, the
    `signal` is padded with the length of the `feature` - 1, and the `feature`
    is slid/padded (creating lags) with 0 padding to match the length of the
    new signal. The Pearson product-moment correlation coefficient is computed
    for each lag.

    :param signal: observed signal
    :param feature: feature you are looking for
    :param lags: optional lags; defaults to range(-len(feature) + 1, len(signal))
    """
    signal_ar = np.asarray(signal)
    feature_ar = np.asarray(feature)
    if np.count_nonzero(feature_ar) == 0:
        raise ValueError("Unsupported - feature contains only zeros")
    assert (
        signal_ar.ndim == feature_ar.ndim == 1
    ), "Unsupported - only 1d signal/feature supported"
    assert len(feature_ar) <= len(
        signal_ar
    ), "Unsupported - signal should be at least as long as the feature"
    padding_sz = len(feature_ar) - 1
    padded_signal = np.pad(
        signal_ar, (padding_sz, padding_sz), "constant", constant_values=0
    )
    lags = lags if lags is not None else range(-padding_sz, len(signal_ar), 1)
    if np.max(lags) >= len(signal_ar):
        raise ValueError("max positive lag must be shorter than the signal")
    if np.min(lags) <= -len(feature_ar):
        raise ValueError("max negative lag can't be longer than the feature")
    # Build one zero-padded copy of the feature per lag, each as long as the
    # padded signal, then correlate the padded signal against all of them.
    lagged_patterns = np.asarray(
        [
            np.pad(
                feature_ar,
                (padding_sz + lag, len(signal_ar) - lag - 1),
                "constant",
                constant_values=0,
            )
            for lag in lags
        ]
    )
    return XCorr(
        cross_correlation=np.corrcoef(padded_signal, lagged_patterns)[0, 1:],
        lags=np.asarray(lags),
    )
```

Example:

```python
rng = np.random.default_rng()
sig = np.repeat([0., 1., 1., 0., 1., 0., 0., 1.], 128)
sig_noise = sig + rng.standard_normal(len(sig))
feature = np.ones(128)
corr = cross_correlation(sig_noise, feature)
```

(figures: signal + noise; cross-correlation)
Hi, any updates on this? I think it is really important to have a normalized version of the cross-correlation function, either in numpy or scipy, as we have in Matlab (https://www.mathworks.com/help/matlab/ref/xcorr.html). At the moment I am getting values greater than 1 or smaller than -1, whereas (as pointed out by @yy-yuichi) the approach mentioned at https://stackoverflow.com/questions/5639280/why-numpy-correlate-and-corrcoef-return-different-values-and-how-to-normalize does not work all the time. Kind regards,
Developers may be reluctant to implement a normalized cross-correlation because there are different definitions. The Matlab function you reference also only allows normalization of signals of equal length. For example, in seismology we regularly use two different methods of normalization.

The first case is used in template matching. You have a short signal (of an earthquake, len N1) and you want to find the same or a similar signature in a longer time series. For a perfect match, ignoring amplitudes, you would like to get a 1 (or -1). For this "full" normalization you need to compute the root of the zero-lag autocorrelation (RAC) of all len-N1 portions of the longer signal, and use only this and the RAC of the short signal to normalize each cc value. It may not be cheap to compute all the RACs in a loop (as suggested in the comment by @ravwojdyla).

In the second case, we usually have two long time series (e.g. one day, 100 Hz sampling rate) of the same duration (or similar duration, with the difference in duration much shorter than the duration of the signals), and we are not interested in the full xcorr, but only in the xcorr up to a certain time lag (again short compared to the full duration of the signals).

Implementations of both described normalizations, "full" and "naive", can be found in the repo mentioned above.
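For the second case, a minimal sketch of such a "naive" normalization (hypothetical function name, assuming two equal-length signals and a maximum lag much shorter than the signals; this is not the code from the repository above) could look like:

```python
import numpy as np

def xcorr_naive(a, b, max_lag):
    """Cross-correlation of two equal-length signals up to +-max_lag,
    normalized by a single global scale factor (the stds of the full signals).
    Reasonable when max_lag is much smaller than len(a)."""
    a = np.asarray(a, float) - np.mean(a)
    b = np.asarray(b, float) - np.mean(b)
    n = len(a)
    scale = n * a.std() * b.std()
    lags = np.arange(-max_lag, max_lag + 1)
    # For lag l, sum a[t] * b[t + l] over the overlapping samples.
    cc = np.array(
        [np.sum(a[max(0, -l):n - max(0, l)] * b[max(0, l):n - max(0, -l)])
         for l in lags]
    ) / scale
    return lags, cc
```

Because the scale factor is global, a shifted copy of the signal scores close to (but slightly below) 1 at the shift, with the small deficit coming from the lost overlap at the edges.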
I imagine scipy is the more appropriate place for this, and both definitions could be included as a parameter. See also scipy/scipy#4940 |
Original ticket http://projects.scipy.org/numpy/ticket/1714 on 2011-01-15 by trac user bubla, assigned to unknown.
Hello,
I have made a patch that extends the correlate function so it can compute normalized cross-correlation now. See the Wikipedia article: http://en.wikipedia.org/wiki/Cross-correlation#Normalized_cross-correlation
I have added documentation and a simple doctest too, of course.
The patch is against the latest Git master.