DOC: Document how memory alignment works as of 1.14 · numpy/numpy@38af6dd · GitHub

Commit 38af6dd

DOC: Document how memory alignment works as of 1.14
1 parent 33e1a7d commit 38af6dd

2 files changed: 97 additions, 0 deletions

doc/source/dev/alignment.rst

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
.. _alignment:


Numpy Alignment Goals
=====================

There are three use-cases related to memory alignment in numpy (as of 1.14):

1. Creating structured datatypes with fields aligned like in a C-struct.
2. Speeding up copy operations by using uint assignment instead of memcpy.
3. Guaranteeing safe aligned access for ufuncs/setitem/casting code.

Numpy uses two different forms of alignment to achieve these goals:
"True alignment" and "Uint alignment".

"True" alignment refers to the architecture-dependent alignment of an
equivalent C-type in C. For example, on x64 systems ``numpy.float64`` is
equivalent to ``double`` in C. On most systems this has either an alignment of
4 or 8 bytes (and this can be controlled in gcc by the option
``-malign-double``). A variable is aligned in memory if its memory offset is a
multiple of its alignment. On some systems (e.g. SPARC) memory alignment is
required; on others it gives a speedup.
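
As a rough illustration of "true" alignment, the sketch below prints the
alignment numpy reports for ``float64`` next to the alignment ``ctypes``
reports for a C ``double``, then checks that a freshly allocated array starts
at a multiple of that alignment. The variable names are illustrative only, and
the printed values depend on the platform and compiler.

```python
import ctypes

import numpy as np

# "True" alignment as numpy reports it for float64 (a C double).
f8_align = np.dtype(np.float64).alignment
print("float64 alignment:", f8_align)

# The platform's alignment for a C double as seen by ctypes, for comparison.
print("C double alignment:", ctypes.alignment(ctypes.c_double))

# A freshly allocated array normally starts at an address that is a
# multiple of the alignment, so each element is "true" aligned.
a = np.zeros(4, dtype=np.float64)
address = a.ctypes.data  # integer address of the first element
print("address % alignment:", address % f8_align)
```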

"Uint" alignment depends on the size of a datatype. It is defined to be the
"True alignment" of the uint used by numpy's copy-code to copy the datatype, or
undefined/unaligned if there is no equivalent uint. Currently numpy uses uint8,
uint16, uint32, uint64 and uint64 to copy data of size 1, 2, 4, 8 and 16 bytes
respectively (16-byte items also use the alignment of uint64), and all other
sized datatypes cannot be uint-aligned.

For example, on a (typical linux x64 gcc) system, the numpy ``complex64``
datatype is implemented as ``struct { float real, imag; }``. This has "true"
alignment of 4 and "uint" alignment of 8 (equal to the true alignment of
``uint64``).
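
The size-to-uint mapping described above can be sketched in Python. Note that
``uint_alignment`` is a hypothetical helper written for this illustration, not
a numpy function, and returning ``0`` for "no equivalent uint" is an assumed
convention.

```python
import numpy as np


def uint_alignment(itemsize):
    """Hypothetical helper: "uint" alignment for an itemsize.

    Returns 0 (an assumed sentinel) when no equivalent uint exists.
    """
    copy_types = {
        1: np.uint8,
        2: np.uint16,
        4: np.uint32,
        8: np.uint64,
        16: np.uint64,  # 16-byte items use the alignment of uint64
    }
    uint_type = copy_types.get(itemsize)
    if uint_type is None:
        return 0
    return np.dtype(uint_type).alignment


c64 = np.dtype(np.complex64)
print("complex64 true alignment:", c64.alignment)
print("complex64 uint alignment:", uint_alignment(c64.itemsize))
print("3-byte item uint alignment:", uint_alignment(3))
```

On a typical linux x64 system the first two lines should report 4 and 8,
matching the ``complex64`` example in the text; other platforms may differ.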

Variables in Numpy which control and describe alignment
=======================================================

There are 4 relevant uses of the word ``align`` in numpy:

 * The ``dtype.alignment`` attribute (``descr->alignment`` in C). This is meant
   to reflect the "true alignment" of the type. It has arch-dependent default
   values for all datatypes, with the exception of structured types created
   with ``align=True`` as described below.
 * The ``ALIGNED`` flag of an ndarray, computed in ``IsAligned`` and checked
   by ``PyArray_ISALIGNED``. This is computed from ``dtype.alignment``.
   It is set to ``True`` if every item in the array is at a memory location
   consistent with ``dtype.alignment``, which is the case if the data pointer
   and all strides of the array are multiples of that alignment.
 * The ``align`` keyword of the dtype constructor, which only affects structured
   arrays. If the structure's field offsets are not manually provided, numpy
   determines offsets automatically. In that case, ``align=True`` pads the
   structure so that each field is "true" aligned in memory and sets
   ``dtype.alignment`` to be the largest of the field "true" alignments. This
   is like what C-structs usually do. Otherwise, if offsets or itemsize were
   manually provided, ``align=True`` simply checks that all the fields are
   "true" aligned and that the total itemsize is a multiple of the largest
   field alignment. In either case ``dtype.isalignedstruct`` is also set to
   True.
 * ``IsUintAligned`` is used to determine if an ndarray is "uint aligned" in
   an analogous way to how ``IsAligned`` checks for true-alignment.
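
The Python-visible pieces of the list above can be explored with a small
sketch. The field names ``a`` and ``b`` are arbitrary choices for this
example; the exact offsets and itemsize of the aligned dtype depend on the
platform's alignment for ``float64``.

```python
import numpy as np

fields = [("a", np.uint8), ("b", np.float64)]
packed = np.dtype(fields)               # align=False is the default
aligned = np.dtype(fields, align=True)  # pad fields like a C struct

f8_align = np.dtype(np.float64).alignment

# itemsize, offset of field "b", dtype.alignment, isalignedstruct
print(packed.itemsize, packed.fields["b"][1],
      packed.alignment, packed.isalignedstruct)
print(aligned.itemsize, aligned.fields["b"][1],
      aligned.alignment, aligned.isalignedstruct)

# The ALIGNED flag of a field view depends on its data pointer and strides.
packed_b = np.zeros(4, dtype=packed)["b"]
aligned_b = np.zeros(4, dtype=aligned)["b"]
print(packed_b.flags.aligned, aligned_b.flags.aligned)
```

In the packed dtype the ``float64`` field sits at byte offset 1, so a view of
that field cannot be "true" aligned, while the padded dtype keeps it aligned.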

Consequences of alignment
=========================

Here is how the variables above are used:

1. Creating aligned structs: In order to know how to offset a field when
   ``align=True``, numpy looks up ``field.dtype.alignment``. This includes
   fields which are nested structured arrays.
2. Ufuncs: If the ``ALIGNED`` flag of an array is False, ufuncs will
   buffer/cast the array before evaluation. This is needed since ufunc inner
   loops access raw elements directly, which might fail on some archs if the
   elements are not true-aligned.
3. Getitem/setitem/copyswap functions: Similar to ufuncs, these functions
   generally have two code paths. If ``ALIGNED`` is False they will
   use a code path that buffers the arguments so they are true-aligned.
4. Strided copy code: Here, "uint alignment" is used instead. If the itemsize
   of an array is equal to 1, 2, 4, 8 or 16 bytes and the array is uint
   aligned then numpy will do ``*((uintN*)dst) = *((uintN*)src)`` for
   appropriate N. Otherwise numpy copies by doing ``memcpy(dst, src, N)``.
5. Nditer code: Since this often calls the strided copy code, it must
   check for "uint alignment".
6. Cast code: If the array is "uint aligned" this will essentially do
   ``*dst = CASTFUNC(*src)``. If not, it does
   ``memmove(srcval, src); dstval = CASTFUNC(srcval); memmove(dst, dstval)``
   where dstval/srcval are aligned.
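
The buffering described for ufuncs and casts is internal, but its effect can
be observed from Python: results on a deliberately misaligned array match
those on an aligned copy. This is a minimal sketch, assuming the byte buffer
returned by ``np.zeros`` starts at an even address (typical for allocators);
the array names are illustrative.

```python
import numpy as np

# Build a float64 view whose data pointer is deliberately off by one byte.
raw = np.zeros(8 * 4 + 1, dtype=np.uint8)
unaligned = raw[1:].view(np.float64)
unaligned[:] = [1.5, 2.5, 3.5, 4.5]

aligned = unaligned.copy()  # a fresh copy is allocated aligned

print(unaligned.flags.aligned, aligned.flags.aligned)

# Ufunc evaluation and casting give the same results either way;
# only the internal code path differs.
same_add = np.array_equal(unaligned + 1, aligned + 1)
same_cast = np.array_equal(unaligned.astype(np.float32),
                           aligned.astype(np.float32))
print(same_add, same_cast)
```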

Note that in principle, only "true alignment" is required for casting code.
However, because the casting code and copy code are deeply intertwined they
both use "uint" alignment. This should be safe assuming uint alignment is
always at least as large as true alignment, though it can cause unnecessary
buffering if an array is "true aligned" but not "uint aligned". If there is
ever a big rewrite of this code it would be good to allow them to use
different alignments.
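
The "true aligned but not uint aligned" case can be constructed from Python.
In the sketch below, ``is_uint_aligned`` is a hypothetical Python analogue of
the C-level ``IsUintAligned`` check, written only for illustration. The
outcome assumes a platform where ``uint64`` has an alignment of 8 and
``complex64`` an alignment of 4, as on typical linux x64.

```python
import numpy as np


def is_uint_aligned(arr):
    """Hypothetical Python analogue of the C-level IsUintAligned check."""
    copy_types = {1: np.uint8, 2: np.uint16, 4: np.uint32,
                  8: np.uint64, 16: np.uint64}
    uint_type = copy_types.get(arr.dtype.itemsize)
    if uint_type is None:
        return False  # no equivalent uint, so never uint-aligned
    alignment = np.dtype(uint_type).alignment
    # The data pointer and every relevant stride must be multiples
    # of the uint alignment.
    values = [arr.ctypes.data]
    values += [s for n, s in zip(arr.shape, arr.strides) if n > 1]
    return all(v % alignment == 0 for v in values)


# Place a complex64 array at an address that is 4 modulo 8.
raw = np.zeros(32, dtype=np.uint8)
start = (4 - raw.ctypes.data) % 8
arr = raw[start:start + 16].view(np.complex64)

print(arr.flags.aligned)     # checked against the true alignment
print(is_uint_aligned(arr))  # checked against the alignment of uint64
```

Under the stated assumptions the array is flagged ``ALIGNED`` yet fails the
uint check, which is the situation where the extra buffering mentioned above
would occur.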

numpy/core/src/multiarray/common.h

Lines changed: 1 addition & 0 deletions
@@ -234,6 +234,7 @@ npy_uint_alignment(int itemsize)
             alignment = _ALIGN(npy_uint64);
             break;
         default:
+            break;
     }
 
     return alignment;
