8000 MAINT: Split lapack_lite more logically across files by eric-wieser · Pull Request #8651 · numpy/numpy · GitHub
[go: up one dir, main page]

Skip to content

MAINT: Split lapack_lite more logically across files #8651

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 3, 2017

Conversation

eric-wieser
Copy link
Member
@eric-wieser eric-wieser commented Feb 21, 2017

Primary goals

Old structure

  • dlamch.c (manually generated, edited) - The dlamch function. Includes the patch in cadbb5f1, so no longer auto-generated
  • zlapack_lite.c - Functions for the complex128 type.
  • dlapack_lite.c - Every other lapack function2

New structure

  • f2c_z_lapack.c - Functions for the complex128 (z) type.
  • f2c_c_lapack.c - Functions for the complex64 (c) type.
  • f2c_d_lapack.c - Functions for the float64 (d) type.
  • f2c_s_lapack.c - Functions for the float32 (s) type.
  • f2c_config.c - dlamch, slamch (for which patch
  • f2c_lapack.c - Every other lapack function.

The patch from cadbb5f, which was forgotten, and any other function that comes from the /INSTALL directory of lapack3 is included in f2c_config.c. The make_lite.py script now applies the f2c_config.c.patch to the result, which means that this patch can't be forgotten in future


1 Which unfortunately has no description of the bug it was trying to fix

2 Probably due to the build script not being correctly updated when the c and s variants were added

3 (Identical copies of) [sd]lamch.f can be found in both /INSTALL and /SRC within the LAPACK tarball. Previously, we were loading it from SRC. However, in later releases of LAPACK it exists exclusively in /INSTALL. This adds INSTALL first on the function lookup path, and puts all the generated code from that path in misc_lite.c. This makes the approach forwards-compatible.

Questions

  1. Is [sd]lamch special enough to get its own file like this, or can that go in the main [sd]lapack_.lite.c
  2. Is slamch less special than dlamch as indicated by the lack of manual generation, or is this a mistake made long ago
  3. Is there a better set of names for lapack_lite.c and misc_lite.c (or single name if they can be merged)

@eric-wieser eric-wieser changed the title MAINT: Remove manually-generated dlamch.c MAINT: Split lapack_lite more logically across files Feb 21, 2017
@eric-wieser
Copy link
Member Author

@charris, can I ping you on this? Thought it would be wise to do some further cleanup before getting #8649 going

@charris
Copy link
Member
charris commented Feb 26, 2017

Can you document up top what you did in more detail? There are are a bunch of new files, what is in them? The misc_* file names are not descriptive. AFAICT, we have the four supported types c*, d*, s*, d*, plus supporting routines, it would be nice to have it documented which is where. The [sd]dlamch functions are rather special as they determine info we already know for the build, but it probably isn't worth worrying about.

@eric-wieser
Copy link
Member Author
eric-wieser commented Feb 26, 2017

The [sd]dlamch functions are rather special

Perhaps, but only dlamch was treated specially before, not slamch. I notice that there's something going on with HAVE_CONFIG, but I can't work out where that's coming from, and whether the build process actually sets it, or if that's an artefact of something older.

I'll have another go at writing the description

@charris
Copy link
Member
charris commented Feb 26, 2017

Also, the current files are divided into complex and real, any reason not to just rename the files complex_lapack_lite and real_lapack_lite?

We probably need to revise the HAVE_CONFIG stuff, IIRC, it just treats long double, which does not apply here.

@eric-wieser
Copy link
Member Author
eric-wieser commented Feb 26, 2017

the current files are divided

You're referring to master, or this PR? Master is divided into complex128 and everything else

I think having two 30k line files is probably more manageable than one 60k line file, so I'd argue for complex64_lapack_lite.c and complex128_lapack_lite.c etc. I'd argue we should keep the names of those four files though, as they emphasize which function names they contain (and ultimately, the heuristic is just "what does the function start with", nothing more.

misc_lite.c and lapack_lite.c could definitely do with a rename, on the other hand.

I've updated the description

@charris
Copy link
Member
charris commented Feb 26, 2017

I'd be happy with four [s,d,c,z]lapack_lite.c files. Still, the description up top of this PR should explain "why" and "what" and lay out the new organization. The diffs are huge and I'd rather this only happened once, so is there a reason not to do this in the lapack upgrade PR?

@eric-wieser
Copy link
Member Author
eric-wieser commented Feb 27, 2017

I'd be happy with four [s,d,c,z]lapack_lite.c files

Do you want [ds]lamch to be folded into these, or do you think they're special enough to deserve their own file?

And what about the stuff that doesn't fit into any of those categories?

The diffs are huge and I'd rather this only happened once

I'm hoping that the diffs should be smaller when the only changes are actually an upgrade, rather than moving lots of content around. But I guess if you're happy with the 4-way split, I can go ahead locally and verify that the resulting diff is indeed small.

Still, the description up top of this PR should explain "why" and "what" and lay out the new organization

I'll give this another pass tomorrow. Thanks for the feedback!

@eric-wieser
Copy link
Member Author

Ok, hoping the up-top description is more useful now

{
/* System generated locals */
- real ret_val;
+ volatile real ret_val;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was forgotten in the previous version, so was presumably causing the same bug as the code that the other part of this patch fixed.

@charris
Copy link
Member
charris commented Feb 28, 2017

This looks much nicer. I guess the only question is the naming/merging of lapack_lite.c and misc_lite.c files. As the *lamch functions are used in both the blas and lapack files and establish the parameters of the floating type, I'd be inclined to name it something like mach_lite, machine_lite, or floating_lite. Or even install_lite, although that seems misleading. OTOH, merging the two files seems reasonable also.

I can see that I'm not much help here ;) Do you have any preferences?

@charris
Copy link
Member
charris commented Feb 28, 2017

Maybe environment_lite.c?

@charris
Copy link
Member
charris commented Feb 28, 2017

I'm coming around to just renaming misc_lite.c to install_lite.c.

@eric-wieser
Copy link
Member Author
eric-wieser commented Feb 28, 2017

Perhaps config_lite.c? I think I like environment_lite.c as well. install_lite.c is better than misc_lite.c too.
I'll let you pick one.

I have no preference for any of these names. The key part of this was splitting zdsc and adding to the search path, and this kinda fell out as something I wasn't expecting.

@charris
Copy link
Member
charris commented Mar 1, 2017

config_lite.c Sounds good to me. With that name, I think it would also make sense to combine the two files as well,

@eric-wieser
Copy link
Member Author

With that name, I think it would also make sense to combine the two files as well

I don't think I agree. config because it seems like those are methods you're supposed to tweak based on your system settings.

The current lapack_lite.c contains utility functions for lapack that are type-independent. If these are thrown in the same bucket, I don't think this bucket should be called config

@charris
Copy link
Member
charris commented Mar 1, 2017

OK, I don't feel strongly about it.

@eric-wieser
Copy link
Member Author

Also, is the _lite suffix buying us anything at all here?

@charris
Copy link
Member
charris commented Mar 1, 2017

Also, is the _lite suffix buying us anything at all here?

Read my mind. The only reason for keeping it I could think of was that it isn't the complete lapack, but rather a "lite" version, as in Miller lite (TM) beer, so the suffix does make some sense.

@eric-wieser
Copy link
Member Author

What I meant by that is is it needed on each file? I'm all for the folder having that name.

@charris
Copy link
Member
charris commented Mar 1, 2017

Sure, go ahead and remove the "lite" suffix, I don't have strong feelings either way.

@eric-wieser
Copy link
Member Author
eric-wieser commented Mar 1, 2017

@charris: Done, and rebuilt the commits in slightly more logical order, and to emphasize the bug fix. All intermediate commits build, I think. I've tried to include the important bits of the issue text in the commit messages.

The end choice was config.c and lapack.c for the files under discussion, and for ease of sorting, zlapack_lite became lapack_z - especially desirable since clapack has an overloaded meaning of complex vs the C programming language (which is what clapack_scrub.py refers to)

@charris
Copy link
Member
charris commented Mar 2, 2017

Oh, now, that is a step too far ;) I do prefer the [sdcz]lapack.c names as they follow the lapack conventions.

@eric-wieser
Copy link
Member Author

follow the lapack conventions.

If this is something we care about, then we need to only use 6 characters in the name too though, right? ;)

Seriously though - do you want me to revert that, or not?

@charris
Copy link
Member
charris commented Mar 2, 2017

I really do prefer the sdcz prefixes, as I instantly know what they mean, i.e., functions beginning with the same prefixes.

@eric-wieser
Copy link
Member Author

How about lapack_zfuncs, which also meets the 6 letter rule? I really want to avoid clapack being one of the names, because we use that with a different meaning (lapack converted to c code, a now-dead project our code is inspired by).

@charris
Copy link
Member
charris commented Mar 2, 2017

Geez. How about c_ etc prefixes? I doubt anyone will be confused given that all the types are present and we managed with clapack_lite all these years. I still prefer the straight forward versions. Next you will be telling me clap is a social disease...

@eric-wieser
Copy link
Member Author
eric-wieser commented Mar 2, 2017

Not as happy with c_lapack over lapack_c just from a sorting of filenames point of view - which emphasizes that there's one for each type. But I think that's enough to disambiguate c_lapack.c from clapackscrub.py.

I'll defer to you decision here, but I'd rather not rebase this more than once more. So can you confirm you're definitely sufficiently content with c_lapack.c, and would much prefer it to lapack_c.c, before I do the rebase?

On a more positive note, I have #8649 just about working successfully on top of this now, although currently with a few hacks to avoid having to write fortran

@charris
Copy link
Member
charris commented Mar 2, 2017

Yeah, I thought if the sorting issue and decided that with so few files it didn't matter much. You could always prepend _ to the generated files ;) I also thought lapack_c.c was a bit awkward with the two c's. That said, no one is going to be looking at these files again unless something serious goes wrong. I'm happy with the prefixed form.

@eric-wieser
Copy link
Member Author
eric-wieser commented Mar 2, 2017

I also thought lapack_c.c was a bit awkward

Yeah, I agree with that

You could always prepend _ to the generated files

Actually, i think doing something like this is probably a good idea, to discourage editing. How do you feel about gen_?

@charris
Copy link
Member
charris commented Mar 2, 2017

gen_ sounds good.

@charris
Copy link
Member
charris commented Mar 2, 2017

Although I think a straight _ might be better. The prefix gen_ is a bit confusing as to whether it is a verb or adjective.

@eric-wieser
Copy link
Member Author
eric-wieser commented Mar 2, 2017

Let's make a decision:

  • _c_lapack.c for sorting only
  • gen_c_lapack.c for generated
  • f_c_lapack.c for fortran
  • f2c_c_lapack.c to indicate it came from f2c
  • c_lapack.c, untouched

Let's make this the last time we pick names here :)

@matthew-brett
Copy link
Contributor

I like f2c_ .

@charris
Copy link
Member
charris commented Mar 2, 2017

I like f2c_ also.

Since we can use 2.7+ features now, we can have the with statement and
subprocess.check_call
@eric-wieser
Copy link
Member Author

@charris: All done. f2c_lite.c moved to f2c.c, so that the rule remains "f2c_ prefix means generated",

Also took the chance to throw a little bit more cleanup into an extra first commit.

@@ -188,64 +201,61 @@ def getLapackRoutines(wrapped_routines, ignores, lapack_dir):
return library

def getWrappedRoutineNames(wrapped_routines_file):
fo = open(wrapped_routines_file)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was never closed before

Also uses this splitting as an excuse to ditch the _lite suffix, in
favor of a f2c_ prefix for all generated files.

Before:

 * `zlapack_lite.c` - Functions for the `complex128` type.
 * `dlapack_lite.c` - Every other lapack function

After:

 * `f2c_z_lapack.c` - Functions for the `complex128` type.
 * `f2c_c_lapack.c` - Functions for the `complex64` type.
 * `f2c_d_lapack.c` - Functions for the `float64` type.
 * `f2c_s_lapack.c` - Functions for the `float32` type.
 * `f2c_lapack.c`   - Every other lapack function
Previously, the dlamch function was a manually-edited file, precluding
autogeneration. We fix this by putting the manual edits in a patch file, and
then generating f2c_config.c from install/*.f, which includes dlamch.

While these functions exist in more than once place in lapack 3.0.0, they only
exist in lapack/install in newer versions. A side effect of this is that some
functions have been pulled out of `f2c_blas.c` and `f2c_s_lapack.c` into this
new file.

The edits in the patch were introduced in cadbb5f,
for a reason that is lost to time.
This bug is mentioned, fixed, but not described, in cadbb5f

To be safe, we patch it for the equivalent float32 function.
@eric-wieser
Copy link
Member Author

Also, the upside of us arguing about names is that by rebasing again, I remembered to fix the readme and comments to match the new names

@charris charris merged commit 96c3e66 into numpy:master Mar 3, 2017
@charris
Copy link
Member
charris commented Mar 3, 2017

In it goes. Thanks Eric.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0