8000 ENH: implement nep 0015: merge multiarray and umath by mattip · Pull Request #10915 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

ENH: implement nep 0015: merge multiarray and umath #10915

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Aug 31, 2018

Conversation

mattip
Copy link
Member
@mattip mattip commented Apr 16, 2018

This s a first look at how the implementation of "NEP 0015" will look. Only one test pattern needed adjustment.

There is now multiarray.py and umath.py, and a c-extension _multiarray comprising the code from the two previous c-extension modules.

The next step is to clean up the module initialization to avoid set_numeric_ops

Edit: s/test/test pattern/

Edit: strike out first sentence

PyObject *c_api = NULL;

if (numpy == NULL) {
PyErr_SetString(PyExc_ImportError, "numpy.core.umath failed to import");
numpy = PyImport_ImportModule("_multiarray");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we trying to import a top-level module?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment below, maybe we don't need this method at all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have enough experience here to know if we want this method.

I'm just worried that this looks like a removed-in-py3k implicit relative import.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed. It was useful only to import _multiarray indepedent of the rest of numpy:

data = imp.find_module('_multiarray', ['cpython3_virt/lib/python3.5/site-packages/numpy/core'])
multiarray = imp.load_module('_multiarray', *data)

which I was doing for debugging the bootstrap of numpy.init

'where', 'zeros']

for s in __all__:
globals()[s] = getattr(_multiarray, s)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given you already have the import * above, do you really need this? I'd be inclined to drop the import * and stick with this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -46,11 +46,14 @@
_import_array(void)
{
int st;
PyObject *numpy = PyImport_ImportModule("numpy.core.multiarray");
PyObject *numpy = PyImport_ImportModule("numpy.core._multiarray");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename to _multiarray_umath or something with both names, to make it clearer for people who already know about multiarray existing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, adopted

@mattip
Copy link
Member Author
mattip commented Apr 17, 2018

about _import_array - AFAICT it has a few reasons for existing:

  • make sure multiarray is imported (useful internally before the modules are merged)
  • make sure _ARRAY_API is initialized, which is also checked when importing multiarray
  • perform runtime checks on byte order

The second test seems redundant. The first is now redundant in numpy since they are one-in-the-same module. The third can be moved to _multiarray_umath initialization. So _import_array becomes an alias for third-party users to PyImport_ImportModule("numpy.core.multiarray") and we do not need to call it internally. It can also move out of generate_numpy_api.py to one of the numpy headers.

Am I missing something?

@mattip mattip force-pushed the implement-nep-0015 branch from 9cc4aca to 3703c72 Compare April 24, 2018 13:26
@mattip
Copy link
Member Author
mattip commented Apr 24, 2018

Still WIP. I did not adapt umathmodule.c for the c-api import as documented here so calling any c-api function from umathmodule crashes. I tried changing PY_ARRAY_UNIQUE_SYMBOL to NO_IMPORT_ARRAY but the initialization still crashes. Still investigating.

Copy link
Contributor
@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, it looks far less intrusive than I would have thought! Some mostly ignorant queries...

#######################################################################
# umath module #
# _multiarray_umath module - multiarray part #
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

umath part!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy-paste. Fixed

join('src', 'private', 'mem_overlap.c'),
join('src', 'private', 'npy_longdouble.c'),
join('src', 'private', 'ufunc_override.c')]
# join('src', 'private', 'mem_overlap.c'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These disappear because they are already included above, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct

@@ -36,11 +36,14 @@
static NPY_INLINE int
_import_umath(void)
{
PyObject *numpy = PyImport_ImportModule("numpy.core.umath");
PyObject *numpy = PyImport_ImportModule("numpy.core._multiarray");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8000

I'm very confused, shouldn't it be _multiarray_umath here and below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, thanks

@mattip mattip force-pushed the implement-nep-0015 branch from 3703c72 to 00e3433 Compare April 24, 2018 14:17
@charris
Copy link
Member
charris commented Apr 24, 2018

Just to note that I want to branch 1.15 before this goes in.

@mattip
Copy link
Member Author
mattip commented Apr 24, 2018

This is more an experiment in open development, I expect it will take quite a while till I get all the pieces in place. Maybe I should close the PR and continue pushing to my fork to reduce noise here

@mhvk
Copy link
Contributor
mhvk commented Apr 24, 2018

@mattip - I think it is very good to see it happening, especially as this is quite an involved PR. I quite like the approach of doing it as much as possible in small pieces. Anyway, I would not worry about the "noise" - you warned us amply that it was a WIP! Just ping when it is useful to look again.

@mattip mattip force-pushed the implement-nep-0015 branch 2 times, most recently from 88b21fe to 28ce0b4 Compare April 27, 2018 06:28
@mhvk mhvk changed the title WIP: ENH: implement nep 0015 WIP: ENH: implement nep 0015: merge multiarray and umath Jun 2, 2018
@mattip mattip force-pushed the implement-nep-0015 branch from 28ce0b4 to 010fc4f Compare June 21, 2018 23:41
@mattip
Copy link
Member Author
mattip commented Jun 21, 2018

Rebased off master after 1.15

@mhvk
Copy link
Contributor
mhvk commented Jun 22, 2018

I'm quite excited about this - do ping if this is ready for further review!

@mattip mattip force-pushed the implement-nep-0015 branch from 010fc4f to b997efd Compare June 25, 2018 19:35
@mattip mattip changed the title WIP: ENH: implement nep 0015: merge multiarray and umath ENH: implement nep 0015: merge multiarray and umath Jun 25, 2018
@mattip
Copy link
Member Author
mattip commented Jun 25, 2018

I took the WIP out of the title and am ready for a review. While the NEP (PR #10704) itself has not progressed, I have implemented the minimal set of changes I could to merge the two c-extension modules. Merging this would be helpful for continuing the work on matmul (PR #11133) since it would give access to the cblasfuncs.c routines from umath. That is why I added commits a191224 and b997efd.

Note that I only moved files that have not changed in quite a while and that have no NUMPY_API functions into common (used to be private) in order to minimize git churn on possible pull requests and issues. Eventually, we may wish to restructure the source code directories according to a different paradigm.

Copy link
Contributor
@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good, and I'm excited about it! My comments are really two, both organisational: whether to have common_src in setup.py and a new umathmodule.h to keep things umath-related in the multiarray initialization.

join('src', 'private', 'mem_overlap.c'),
join('src', 'private', 'npy_longdouble.c'),
join('src', 'private', 'ufunc_override.c')]
# join('src', 'common', 'mem_overlap.c'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the commented out files? Just remove? Or at least comment on why they are commented out... Is it because they are already listed in multiarray?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

join('src', 'common', 'binop_override.h')] + npymath_sources

config.add_extension('_multiarray_umath',
sources=multiarray_src + umath_src +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might it be an idea to have a common_src here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adopted. Eventually everything will become common

#include "numpy/ufuncobject.h"
#include "ufunc_object.h"
/*
*****************************************************************************
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might it be more logical to have this contained in a separate umathmodule.h file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems there may be unused includes, reworking

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cleaned this up by removing many uneeded include directives

@@ -4277,6 +4294,11 @@ normalize_axis_index(PyObject *NPY_UNUSED(self), PyObject *args, PyObject *kwds)
return PyInt_FromLong(axis);
}

/* declarations from umathmodule.c */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If following the separate umathmodule.h file idea above, these lines could go in there too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanup up via removal and refactoring, without a umathmodule.h

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on second thought umathmodule.h would be better

}

/* Import the array */
/* No need to import the array, only called via import in the first place
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the comment, should this be removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

join('src', 'common', 'ufunc_override.h'),
join('src', 'common', 'binop_override.h'),
join('src', 'common', 'npy_extint128.h'),
join('src', 'common', 'lowlevel_strided_loops.h'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this one on down, these are duplications.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@mattip mattip force-pushed the implement-nep-0015 branch 3 times, most recently from de16ebe to bb2cc21 Compare July 2, 2018 23:48
@mattip
Copy link
Member Author
mattip commented Jul 2, 2018

rebased off master to solve merge conflicts, added common/umathmodule.h

@mattip mattip force-pushed the implement-nep-0015 branch from bb2cc21 to 9246b07 Compare July 3, 2018 23:01
@mattip mattip force-pushed the implement-nep-0015 branch from a6bf7e9 to 03e0814 Compare August 21, 2018 17:07
@mattip
Copy link
Member Author
mattip commented Aug 21, 2018

Rebased off master to see if gcov messages noted in #11790 go away after this PR. Also properly formatted commit messages and sqaushed a few together.

@tylerjereddy
Copy link
Contributor

mem_overlap.gcda is still an issue here, but rebasing did resolve a few of the other timestamp issues; perhaps there are still components of the NumPy code base that drive redundant compilation beyond the changes in this PR then?

join('src', 'private', 'npy_extint128.h')],
join('src', 'common', 'mem_overlap.c')],
depends=[join('src', 'common', 'mem_overlap.h'),
join('src', 'common', 'npy_extint128.h')],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm able to reproduce the gcov issue locally on linux & moving this extension addition block up above _multiarray_umath module prevents mem_overlap.c from being excluded from the C coverage report.

The diff is below and build / unit tests still seem fine locally for me:

diff --git a/numpy/core/setup.py b/numpy/core/setup.py
index b306aa4..1588a26 100644
--- a/numpy/core/setup.py
+++ b/numpy/core/setup.py
@@ -710,6 +710,17 @@ def configuration(parent_package='',top_path=None):
                        include_dirs=[])

     #######################################################################
+    #                     multiarray_tests module                         #
+    #######################################################################
+
+    config.add_extension('_multiarray_tests',
+                    sources=[join('src', 'multiarray', '_multiarray_tests.c.src'),
+                             join('src', 'common', 'mem_overlap.c')],
+                    depends=[join('src', 'common', 'mem_overlap.h'),
+                             join('src', 'common', 'npy_extint128.h')],
+                    libraries=['npymath'])
+
+    #######################################################################
     #             _multiarray_umath module - common part                  #
     #######################################################################

@@ -933,16 +944,6 @@ def configuration(parent_package='',top_path=None):
     config.add_extension('_struct_ufunc_tests',
                     sources=[join('src', 'umath', '_struct_ufunc_tests.c.src')])

-    #######################################################################
-    #                     multiarray_tests module                         #
-    #######################################################################
-
-    config.add_extension('_multiarray_tests',
-                    sources=[join('src', 'multiarray', '_multiarray_tests.c.src'),
-                             join('src', 'common', 'mem_overlap.c')],
-                    depends=[join('src', 'common', 'mem_overlap.h'),
-                             join('src', 'common', 'npy_extint128.h')],
-                    libraries=['npymath'])

I could open a PR to do this on master, but since @mattip has been working on this code for a while I'll let him judge if this might be suitable & avoids merged conflicts if we change here, etc.

@mattip
Copy link
Member Author
mattip commented Aug 22, 2018

This seems a bit magical. Any ideas why it solves the problem? I don't mind making the change but perhaps we are simply masking a deeper issue. For instance, and this is just a theory I am making up, it may be that the gcov report for one extension module is getting silently written over by the other, where before the change at least we got the warning.

@tylerjereddy
Copy link
Contributor

I'm not sure I follow -- the setup.py itself is pretty confusing -- shouldn't there be clear comments explaining why the build order is the way it is, and why it is necessary to include the same source files in duplicate in this manner? Just reading through makes it seem magical or like an artifact of the way the source evolved over time.

Can you explain why it might be bad to reverse the extension building order in the way that I have? Why was it the other way around initially?

It looks like the idea might be to construct functionality / libraries first, and then tests second, but why do tests need to duplicate what was built for functionality in a well-designed system?

We could probably dissect the timestamps for all object files and gcov report files that are involved to figure out the reason for warning suppression on order reversal; perhaps invoking gcov on a source file that is older than one that is rebuilt by the test thing downstream causes problems.

@mattip
Copy link
Member Author
mattip commented Aug 22, 2018

I would prefer the changes to the build order become a separate PR, built on top of this one.

@mattip
Copy link
Member Author
mattip commented Aug 31, 2018

This is waiting for another core reviewer, @mhvk suggested @charris, @pv, @njsmith

@charris
Copy link
Member
charris commented Aug 31, 2018

Let's get this in. Thanks @mattip and the reviewers @eric-wieser , @mhvk , @tylerjereddy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants
0