Description
Version:
name: SomeEnvironment
channels:
- https://conda.anaconda.org/conda-forge
# Originally created on Ubuntu Jammy:
dependencies:
- numpy=1.24
- python=3.9
- xtensor-python=0.26.1
- xtensor-blas: Need to check, the machine in question is down.
I spent a good few hours chasing a bug that first caused a segfault and then produced an obscure NumPy error about being unable to allocate space for an enormous array.
It turns out that I was simply assigning an expression that should have been a (1D) vector to a 2-dimensional pytensor. This one reaffirms my constant love of constraining dimensions at compile time! "Let's just keep things simple," they say, and then I waste time debugging code that could never have run correctly. Without safety measures it's not so simple 😅
In any case, we are at runtime, and here is an MRE:
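#include <complex>
#include <iostream>
#include <xtensor/xio.hpp>
#include <xtensor-blas/xlinalg.hpp>
#include <xtensor-python/pytensor.hpp>

void debugEntryPoint(){
    xt::pytensor<std::complex<float>, 2> matrix{
        {3, 0, 0, 0},
        {0, 4, 0, 0},
        {0, 0, 5, 0},
        {0, 0, 0, 6}
    };
    xt::pytensor<std::complex<float>, 1> vector{{0, 0, 0, 1}};
    // Rank-1 target: matches the rank of the matrix-vector product.
    xt::pytensor<std::complex<float>, 1> correctResult = xt::linalg::dot(matrix, vector);
    // Rank-2 target: this assignment is where things go wrong.
    xt::pytensor<std::complex<float>, 2> incorrectResult = xt::linalg::dot(matrix, vector);
    // We never get here
    std::cout << correctResult << std::endl;
    // Or equally (renamed so both variants fit in one function):
    const auto &view = xt::linalg::dot(matrix, vector);
    xt::pytensor<std::complex<float>, 1> correctFromView = view;
    xt::pytensor<std::complex<float>, 2> incorrectFromView = view;
}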
I would expect an error explaining that the result of the expression (a 1-dimensional view) cannot be assigned to a 2-dimensional tensor. Note that the error occurs at the moment of assignment, not when the result is computed.
Instead, this is the error that results:
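A minimal sketch of the kind of check described above, using only the standard library. `require_rank` is a hypothetical helper, not part of xtensor, xtensor-blas or xtensor-python, and the wording of the message is only a suggestion. It assumes the shape of the source expression is available as a list of extents at the point of assignment.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not an xtensor API): throws if an expression with the
// given shape cannot be assigned to a container whose rank is fixed.
inline void require_rank(const std::vector<std::size_t>& source_shape,
                         std::size_t target_rank)
{
    if (source_shape.size() == target_rank)
    {
        return; // ranks agree, the assignment may proceed
    }

    // Build a message that names both ranks and the offending shape.
    std::ostringstream msg;
    msg << "cannot assign expression of rank " << source_shape.size()
        << " (shape: ";
    for (std::size_t i = 0; i < source_shape.size(); ++i)
    {
        msg << (i ? ", " : "") << source_shape[i];
    }
    msg << ") to a tensor of rank " << target_rank;
    throw std::runtime_error(msg.str());
}
```

With the MRE above, `require_rank({4}, 1)` would pass, while `require_rank({4}, 2)` would throw before any allocation is attempted. In real code the shape would presumably come from the expression itself rather than being written out by hand.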
>>> debugEntryPoint()
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 3.98 PiB for an array with shape (4, 140095169944816) and data type complex64
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: NumPy: unable to create ndarray
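For reference, the numbers in the message are internally consistent. The sketch below redoes the arithmetic with the extents copied from the error and 8 bytes per complex64 element. It also shows that the second extent has 0x7f in its top bits, which is the range where Linux x86-64 user-space addresses typically sit. That the extent is garbage read from memory rather than a real shape is only a guess, not something confirmed here.

```cpp
#include <cstdint>

// Extents copied from the NumPy error message above.
constexpr std::uint64_t rows = 4;
constexpr std::uint64_t cols = 140095169944816ULL;
constexpr std::uint64_t bytes_per_element = 8; // complex64 = two 32-bit floats

// Total number of bytes NumPy was asked to allocate.
constexpr std::uint64_t requested_bytes()
{
    return rows * cols * bytes_per_element;
}

// The same quantity expressed in PiB (1 PiB = 2^50 bytes).
constexpr double requested_pib()
{
    return static_cast<double>(requested_bytes())
         / static_cast<double>(1ULL << 50);
}

// Bits 40 and above of the suspicious second extent.
constexpr std::uint64_t cols_high_bits()
{
    return cols >> 40;
}
```

`requested_pib()` comes out at roughly 3.98, matching the figure in the error, and `cols_high_bits()` is `0x7f`.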
I'm not sure whether this is down to the BLAS package or the Python package, but I'm opening it here because it relates specifically to the message from NumPy and occurs at the point where we perform the copy.
Cheers