ENH: Fix exception causes in _iotools.py by cool-RR · Pull Request #15731 · numpy/numpy · GitHub

ENH: Fix exception causes in _iotools.py #15731

Closed
wants to merge 1 commit into from

cool-RR
@cool-RR cool-RR commented Mar 9, 2020

I recently went over Matplotlib and Pandas, fixing a small mistake in the way that Python 3's exception chaining is used. If you're interested, I can do it here too. I've done it on just one file right now.

The mistake is this: In some parts of the code, an exception is being caught and replaced with a more user-friendly error. In these cases the syntax raise new_error from old_error needs to be used.

Python 3's exception chaining means it shows not only the traceback of the current exception, but that of the original exception (and possibly more.) This is regardless of raise from. The usage of raise from tells Python to put a more accurate message between the tracebacks. Instead of this:

During handling of the above exception, another exception occurred:

You'll get this:

The above exception was the direct cause of the following exception:

The first message is inaccurate here, because it signals a bug in the exception-handling code itself, which is a different situation from deliberately wrapping an exception.
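
To make the difference concrete, here is a minimal sketch that uses only the standard library. The names `wrap_implicit`, `wrap_explicit`, and `render` are hypothetical and are not part of NumPy; the sketch only compares which separator message the formatted traceback contains in each case.

```python
import traceback


def wrap_implicit(text):
    """Replace the error without `from`: only __context__ is set."""
    try:
        return int(text)
    except ValueError:
        raise TypeError("not a number: %r" % text)


def wrap_explicit(text):
    """Replace the error with `from`: __cause__ is set as well."""
    try:
        return int(text)
    except ValueError as e:
        raise TypeError("not a number: %r" % text) from e


def render(func, text):
    """Call func(text) and return (exception, formatted traceback text)."""
    try:
        func(text)
    except TypeError as err:
        lines = traceback.format_exception(type(err), err, err.__traceback__)
        return err, "".join(lines)
    raise AssertionError("expected func to raise TypeError")


implicit_err, implicit_text = render(wrap_implicit, "x")
explicit_err, explicit_text = render(wrap_explicit, "x")
```

In both cases the original `ValueError` traceback is still shown; only the sentence printed between the two tracebacks changes.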

Let me know what you think!

@cool-RR cool-RR marked this pull request as ready for review March 9, 2020 18:21
@seberg
Member
seberg commented Mar 10, 2020

Better error messages are a good project/improvement, although I am wondering if most of these should not rather use from None. The old exception is often just noise and thus very confusing, in my opinion.
I am not sure if all of these error messages are actually in tested code paths (or can even be created), although that is specific to the _iotools.py file.

@seberg seberg changed the title from "MAINT: Fix exception causes in _iotools.py" to "ENH: Fix exception causes in _iotools.py" Mar 10, 2020
@cool-RR
Author
cool-RR commented Mar 11, 2020

I'm strongly against using from None. When I'm debugging, I'm like a man who got lost in the desert and is about to die of thirst. Any possible insight into what happened is like an oasis, even if there are just a few drops of water there.

Also, some tools like Django and Sentry show you all the local variables for your stacktraces, which is a godsend. These often have important information that sheds light on what went wrong, and if you remove the traceback they'll be gone.

@mattip mattip added the triage review Issue/PR to be discussed at the next triage meeting label Mar 18, 2020
@mattip
Member
mattip commented Mar 25, 2020

In a community discussion, the preference was to go through these one at a time and add from e by default, although some cases might call for from None if the deeper error adds no more information.

@mattip mattip added triaged Issue/PR that was discussed in a triage meeting and removed triage review Issue/PR to be discussed at the next triage meeting labels Mar 25, 2020
Member
@WarrenWeckesser WarrenWeckesser left a comment

Thanks for the pull request, @cool-RR. I added comments in-line. Based on my attempts to actually trigger these exceptions, I suggested that we not use from e in one case. In another case, I have a question about how to actually trigger the exception. It would be nice to be able to exercise these changes before committing them. For the remaining changes, using from e is probably OK.

 # dtype_or_func must be a function, then
 if not hasattr(dtype_or_func, '__call__'):
     errmsg = ("The input argument `dtype` is neither a"
               " function nor a dtype (got '%s' instead)")
-    raise TypeError(errmsg % type(dtype_or_func))
+    raise TypeError(errmsg % type(dtype_or_func)) from e
Member

We shouldn't use from e here. The original exception doesn't provide useful information. For example,

In [29]: conv = StringConverter("foo")                                                                                              
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/mc37np/lib/python3.7/site-packages/numpy-1.19.0.dev0+116a021-py3.7-macosx-10.9-x86_64.egg/numpy/lib/_iotools.py in __init__(self, dtype_or_func, default, missing_values, locked)
    606                 self.func = None
--> 607                 dtype = np.dtype(dtype_or_func)
    608             except TypeError as e:

TypeError: data type 'foo' not understood

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
<ipython-input-29-f4dc02a5945d> in <module>
----> 1 conv = StringConverter("foo")

~/mc37np/lib/python3.7/site-packages/numpy-1.19.0.dev0+116a021-py3.7-macosx-10.9-x86_64.egg/numpy/lib/_iotools.py in __init__(self, dtype_or_func, default, missing_values, locked)
    611                     errmsg = ("The input argument `dtype` is neither a"
    612                               " function nor a dtype (got '%s' instead)")
--> 613                     raise TypeError(errmsg % type(dtype_or_func)) from e
    614                 # Set the function
    615                 self.func = dtype_or_func

TypeError: The input argument `dtype` is neither a function nor a dtype (got '<class 'str'>' instead)

The first exception, TypeError: data type 'foo' not understood, doesn't provide any information that is not also in the final exception message (TypeError: The input argument `dtype` is neither a function nor a dtype (got '<class 'str'>' instead)), so it is noise. We should use from None here and not expose the first exception.
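
For reference, a minimal sketch of what `from None` does, using a hypothetical `describe` function rather than the real `StringConverter`. It suppresses the original exception in the displayed traceback, while the interpreter still records it on `__context__`.

```python
import traceback


def describe(value):
    """Raise a replacement error with the original suppressed via `from None`."""
    try:
        return int(value)
    except ValueError:
        raise TypeError("expected an integer string, got %r" % (value,)) from None


try:
    describe("foo")
except TypeError as err:
    caught = err
    rendered = "".join(
        traceback.format_exception(type(err), err, err.__traceback__)
    )
```

Here `caught.__suppress_context__` is `True`, so the formatted traceback contains only the `TypeError`; the original `ValueError` stays reachable programmatically through `caught.__context__`.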

-except OverflowError:
-    raise ValueError
+except OverflowError as e:
+    raise ValueError from e
Member

Do you have an example that triggers this exception? If this exception is raised, it is then caught a few lines down and raised again, so the end result of the use of from e in all these try-except statements is the user being given three chained exceptions. That seems pretty noisy, and probably not useful.
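
As a rough illustration of how the chain grows when each layer re-raises with `from e`, the sketch below walks the chain the way a traceback display does. `exception_chain`, `LockedError`, `strict_call`, and `upgrade` are hypothetical names chosen for this example and are not the NumPy code under review.

```python
def exception_chain(exc):
    """Return the exceptions a traceback would display, newest first.

    Mirrors the display rule: follow __cause__ when it is set, otherwise
    follow __context__ unless it has been suppressed.
    """
    chain = []
    while exc is not None:
        chain.append(exc)
        if exc.__cause__ is not None:
            exc = exc.__cause__
        elif not exc.__suppress_context__:
            exc = exc.__context__
        else:
            exc = None
    return chain


class LockedError(Exception):
    """Hypothetical stand-in for a library-specific error."""


def strict_call(value):
    try:
        return int(value)
    except ValueError as e:
        raise ValueError("Cannot convert string %r" % value) from e


def upgrade(value):
    try:
        return strict_call(value)
    except ValueError as e:
        raise LockedError("Converter is locked and cannot be upgraded") from e


try:
    upgrade("0.")
except LockedError as err:
    chain = exception_chain(err)
    names = [type(e).__name__ for e in chain]
```

Each `from e` layer adds one more traceback to what the user sees, which is the noise concern raised in this comment.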

 if value.strip() in self.missing_values:
     if not self._status:
         self._checked = False
     return self.default
-raise ValueError("Cannot convert string '%s'" % value)
+raise ValueError("Cannot convert string '%s'" % value) from e
Member

It looks like the previous exception in the chain will include the details of what went wrong. This one just says, in effect, "fail!". So I guess this use of from e is OK.

(I'm starting to think that the way this code uses exceptions is awkward, and could use a redesign, but that will have to wait for another time.)

 # Raise an exception if we locked the converter...
 if self._locked:
     errmsg = "Converter is locked and cannot be upgraded"
-    raise ConverterLockError(errmsg)
+    raise ConverterLockError(errmsg) from e
Member

The exception chain is a bit noisy, but it looks like the early exceptions have useful information, so I guess this use of from e is OK:

In [121]: conv = StringConverter(int, locked=True)

In [122]: conv.upgrade('0.')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/mc37np/lib/python3.7/site-packages/numpy-1.19.0.dev0+116a021-py3.7-macosx-10.9-x86_64.egg/numpy/lib/_iotools.py in _strict_call(self, value)
    687             # We check if we can convert the value using the current function
--> 688             new_value = self.func(value)
    689 

ValueError: invalid literal for int() with base 10: '0.'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
~/mc37np/lib/python3.7/site-packages/numpy-1.19.0.dev0+116a021-py3.7-macosx-10.9-x86_64.egg/numpy/lib/_iotools.py in upgrade(self, value)
    736         try:
--> 737             return self._strict_call(value)
    738         except ValueError as e:

~/mc37np/lib/python3.7/site-packages/numpy-1.19.0.dev0+116a021-py3.7-macosx-10.9-x86_64.egg/numpy/lib/_iotools.py in _strict_call(self, value)
    706                 return self.default
--> 707             raise ValueError("Cannot convert string '%s'" % value) from e
    708     #

ValueError: Cannot convert string '0.'

The above exception was the direct cause of the following exception:

ConverterLockError                        Traceback (most recent call last)
<ipython-input-122-b82e0b39004c> in <module>
----> 1 conv.upgrade('0.')

~/mc37np/lib/python3.7/site-packages/numpy-1.19.0.dev0+116a021-py3.7-macosx-10.9-x86_64.egg/numpy/lib/_iotools.py in upgrade(self, value)
    740             if self._locked:
    741                 errmsg = "Converter is locked and cannot be upgraded"
--> 742                 raise ConverterLockError(errmsg) from e
    743             _statusmax = len(self._mapper)
    744             # Complains if we try to upgrade by the maximum

ConverterLockError: Converter is locked and cannot be upgraded

Member
@eric-wieser eric-wieser Mar 31, 2020

I don't think this is correct.

This makes the error message The above exception was the direct cause of the following exception:.

However, that's not what's happening here. What's happening here is that something else went wrong while we tried to recover (by upgrading the data type). Today, the message is

During handling of the above exception, another exception occurred:

This is a more accurate message. So this line was better unchanged.
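
The distinction can be sketched with a small, self-contained example. `convert`, `recover`, and `ConversionFailed` are hypothetical names and are not taken from `_iotools.py`. The second error arises from separate recovery work, so leaving the implicit `__context__` in place describes what happened.

```python
class ConversionFailed(Exception):
    """Hypothetical error raised when the fallback cannot help either."""


def recover(text):
    """Fallback conversion; it can fail for its own reasons."""
    if not text.strip():
        raise ConversionFailed("nothing to convert")
    return float(text)


def convert(text):
    """Try int first, then attempt a recovery step on failure."""
    try:
        return int(text)
    except ValueError:
        # Recovery attempt: new work done while handling the first error.
        return recover(text)


try:
    convert("   ")
except ConversionFailed as err:
    failure = err
```

Here `failure.__cause__` is `None` and `failure.__context__` is the original `ValueError`, so the traceback reads "During handling of the above exception, another exception occurred".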

 _statusmax = len(self._mapper)
 # Complains if we try to upgrade by the maximum
 _status = self._status
 if _status == _statusmax:
     errmsg = "Could not find a valid conversion function"
-    raise ConverterError(errmsg)
+    raise ConverterError(errmsg) from e
Member

This looks like it is in the same situation as the previous one: it makes a noisy exception, but there might be useful info. in there, so OK.

Member

As above

 # Raise an exception if we locked the converter...
 if self._locked:
     errmsg = "Converter is locked and cannot be upgraded"
-    raise ConverterLockError(errmsg)
+    raise ConverterLockError(errmsg) from e
Member

Ditto, so OK.

Member

As above

 _statusmax = len(self._mapper)
 # Complains if we try to upgrade by the maximum
 _status = self._status
 if _status == _statusmax:
     raise ConverterError(
         "Could not find a valid conversion function"
-    )
+    ) from e
Member

Ditto, so OK.

Member

As above

@cool-RR cool-RR force-pushed the 2020-03-09-raise-from branch from 116a021 to 536afeb Compare March 31, 2020 18:21
@cool-RR
Author
cool-RR commented Mar 31, 2020

@WarrenWeckesser Thanks for your review. That was very thorough.

I disagree with the decision to use from None anywhere, and I don't want to be the reason that a developer didn't get a traceback. So here's what I did: I amended my commit to only include the 5 cases we agree about. The other cases could be done in a separate PR by whoever's interested.

Member
@eric-wieser eric-wieser left a comment

I don't think it makes sense to chain exception __cause__s in any of the ConverterError cases; these look like __context__ chains to me, which is what we already had.

@eric-wieser
Member
eric-wieser commented Apr 3, 2020

Apologies for the merge conflicts. Note that to merge keeping the semantics of this patch (that I'm arguing against above) you'd need to write:

        except ValueError as e1:
            try:
                self._do_upgrade()
            except ConverterError as e2:
                raise e2 from e1  # claim that e2 was _caused_ by e1

Again though, I'd recommend you not do this.
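
For readers who want to run the pattern from the fragment above, here is a self-contained sketch. `Converter`, its methods, and the locally defined `ConverterError` are hypothetical stand-ins written for this example; they are not NumPy's actual implementation.

```python
class ConverterError(Exception):
    """Local stand-in defined for this sketch; not imported from NumPy."""


class Converter:
    def _strict_call(self, value):
        return int(value)

    def _do_upgrade(self):
        # Pretend that no further conversion function is available.
        raise ConverterError("Could not find a valid conversion function")

    def upgrade(self, value):
        try:
            return self._strict_call(value)
        except ValueError as e1:
            try:
                self._do_upgrade()
            except ConverterError as e2:
                raise e2 from e1  # claim that e2 was _caused_ by e1


try:
    Converter().upgrade("0.")
except ConverterError as err:
    upgrade_error = err
```

With the explicit `from e1`, `upgrade_error.__cause__` is the original `ValueError` and the traceback would print the "direct cause" message. Dropping the inner `try`/`except` and letting the upgrade error propagate would keep the "During handling" wording instead.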

@eric-wieser
Member

@cool-RR, I've found a bunch of places elsewhere in numpy where changing to use raise from would definitely be valuable. I'd recommend you restrict yourself to clear-cut cases like:

except TypeError:
    raise ValueError(...)

etc

@cool-RR
Author
cool-RR commented Apr 15, 2020

@eric-wieser Hmm, that's not enjoyable enough for me to do, so I'll leave that to whoever's interested.

@eric-wieser
Member

Opened #15986 to track that, thanks for bringing it to our attention @cool-RR even though this PR didn't get merged.

This was referenced Jun 12, 2020
Labels
01 - Enhancement component: numpy.lib triaged Issue/PR that was discussed in a triage meeting
5 participants