-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
[MNT]: User feedback should use warnings, not logs #23422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
So agree w/ you and also the discussion in/around #13026 was that the categorical warning shouldn't be a warning since it's the expected/intended behavior and that's why it became info and it maybe shouldn't even exist. meeting notes about it |
Ah thanks missed that original discussion. It seems like @jklymak made the opposite argument from point 3 here, that logging is preferable because it's easier for knowledgable users to disable it. I'm not sure I understand that argument? Personally I find Python's logging interface confounding so I'm not sure I share the intuition. But my concern here is primarily about 3rd party libraries, who really shouldn't be messing with matplotlib's logging settings IMO. It also feels odd to me to be logging to communicate with a user in the moment ... logging feels more about communicating with a user in the future. But that's maybe just a personal inclination. (I'd also prefer if the categorical warning were simply removed, although I appreciate that it's a source of confusion and gather that the ship has sailed). |
I am by no means an expert in deciding on the best way to pass info between libraries. But I'm not clear how messing with Matplotlib's logging is any worse than messing with Matplotlib's warnings? https://docs.python.org/3/howto/logging.html is the basic guidance. For the categorical info message, that really seems to not be something the user should see unless they are debugging. I think its a little more urgent than "debug", which barfs out a lot of output, but not as urgent as a "warning". For the default logging config, users should not see this message at all, only if they are in the habit of bumping their log level to see what is breaking. For the RGB message, that seems something a user is more likely to get wrong than a library, in which case the log message seems the best place to put this according to my parsing of the guidelines? But I don't feel strongly about any of this, and I am not speaking from a wealth of experience dealing with maintaining a downstream library... |
Well the warnings module has a context manager for temporarily changing warning filters. I’m not aware of anything similar for logging? |
Maybe, although t
8000
he proximate cause for this issue was someone opening a seaborn issue about this log that I was (eventually) able to trace down to a problem in pandas. Which would have been a lot easier if I could have just flipped a switch to make the warning an error and get a stack trace. I agree that in general that |
Another case where it is useful to turn warnings into errors is in test suites. |
I'm now confused then on why the categorical message is a log and not a warning, given it was specifically aimed at novice users, who presumably aren't messing w/ log levels. |
For the categorical issue, warnings show by default and you don't want warnings to show up if the user is doing something they indeed mean to be doing. |
this was always the problem w/ this warning though, right? that it's the totally expected behavior of the library and also something many users may not have intended. And the users who may not have intended it and therefore may benefit most from seeing the warning, are maybe least likely to see the warning since it's an info log. Just raising that I think there's maybe a UX issue here, though also wondering if at this point it can be removed given categoricals went in in 2.1 |
I think you can use warnings and satisfy your desire for a I realized why I am seeing the categorical log even when not asking for it; some transitive dependency does |
From https://docs.python.org/3/howto/logging.html#when-to-use-logging The best tool for the task:
Applied to the cases:
|
The rgb case I agree should just be a warning Still not convinced about categorical. The point was if someone does something and it behaves unexpectedly we can ask them to turn logging on. If someone does something and it works, and is legitimate, I don't think they should have to filter a warning. |
Usually I'd argee with you that legitimate use cases should not issue warnings. But weighting the likelihood, we may want to deviate from this: If we strictly follow the above rule, we make life much harder for the majority of users. You say "The point was if someone does something and it behaves unexpectedly we can ask them to turn logging on." That's very far down the line from the problem: The user sees a strange output. Then search for a solution. Maybe they find our FAQ or they ask on discourse or open an issue. It would be much more efficient for them if we would issue a warning right away. OTOH I'd argue that number labels is a special enough case that we can burden users to explicitly consent to this. - They may just live with the warning or, suppress it with a filter (which we should describe in the warning message). If we want to try being more distinctive, one could make this dependent on the plot type: I'd say that numeric labels are mostly for categorical data, and these are usually visualized in descrete plots like bar(), boxplot(), violinplot(). So one could leave out the warning for them. But I'm not sure this is worth the additional effort. |
I still think logging is the wrong mechanism here, but this would suggest
This has some appeal and logic seems relatively sound, but I would expect that this might get complicated and difficult to maintain. One other distinction that you could draw is between calling a plotting method vs. calling |
Well the nice thing about INFO is that it is not super cluttered, whereas DEBUG is very verbose. Again, the default level is WARNING, so most users shouldn't see this log message. Asking folks who have a mysterious problem to turn on logging to INFO should give them something useful. DEBUG is largely for the devs.
The original problem is numbers (or dates) being stored in a list or object array as strings. Unfortunately, many csv file decoders will return these as text if you don't specify the dtype, and a lot of people use csv files. If we really think that 99.9% of the users who pass |
c) When to use what: - warnings.warn() in library code if the issue is avoidable and the client application should be modified to eliminate the warning - logging.warning() if there is nothing the client application can do about the situation, but the event should still be noted There are unambiguous ways for the user to specify the color (see the message). Here, the user should take action and switch from using *c* to using *color*. Thus warnings.warn() is the way to go. Closes half of matplotlib#23422.
c) When to use what: - warnings.warn() in library code if the issue is avoidable and the client application should be modified to eliminate the warning - logging.warning() if there is nothing the client application can do about the situation, but the event should still be noted There are unambiguous ways for the user to specify the color (see the message). Here, the user should take action and switch from using *c* to using *color*. Thus warnings.warn() is the way to go. Closes half of matplotlib#23422.
c) When to use what: - warnings.warn() in library code if the issue is avoidable and the client application should be modified to eliminate the warning - logging.warning() if there is nothing the client application can do about the situation, but the event should still be noted There are unambiguous ways for the user to specify the color (see the message). Here, the user should take action and switch from using *c* to using *color*. Thus warnings.warn() is the way to go. Closes half of matplotlib#23422.
c) When to use what: - warnings.warn() in library code if the issue is avoidable and the client application should be modified to eliminate the warning - logging.warning() if there is nothing the client application can do about the situation, but the event should still be noted There are unambiguous ways for the user to specify the color (see the message). Here, the user should take action and switch from using *c* to using *color*. Thus warnings.warn() is the way to go. Closes half of matplotlib#23422.
c) When to use what: - warnings.warn() in library code if the issue is avoidable and the client application should be modified to eliminate the warning - logging.warning() if there is nothing the client application can do about the situation, but the event should still be noted There are unambiguous ways for the user to specify the color (see the message). Here, the user should take action and switch from using *c* to using *color*. Thus warnings.warn() is the way to go. Closes half of matplotlib#23422.
Hello everyone! I see that the second example was marked was done. What about the first one? Could it still be changed to a warning? Thanks! |
No. We don't want to support sloppy programming. If the user wants numbers, they should provide numbers. The warning has the purpose to make the user aware that they should likely convert their data to numbers. There are exactly two possible cases Unfortunately, we cannot infer the user intention reliably. This leaves us with three options
Personally, I'm +0.75 for 1) |
Hey! So, as it has been more than half a month and no additional input was given, I'm going to follow your suggestion number 1. I'll create the pull request implementing it, and, if any new input about it is given, I can change it right away. edit: time interval was wrong |
I don't think we should warn on potentially valid input. |
I emphatically agree that potentially valid input should never warn. |
What is that logging call supposed to achieve that is different from “warning” then? |
Logging is opt-in and most people leave it off. If someone has a mysterious bug they can turn it on and maybe get some info. A warning would need to be opt-out. |
To circle back to the beginning of the thread, Python logging is “opt in” in a very crude sense and a line of code somewhere in your dependency tree can opt you into matplotlib logging without you wanting that (how I ran into this). This still just feels like the wrong mechanism for this kind of information but if you are against it being a warning (I’m fine with that) and still think it adds value (I agree this is a common gotcha, but I am skeptical anyone has thought to debug it by saying “I know, I will turn on matplotlib logging”) we can go ahead and close the thread. |
I think we all agree that
The decision here is a trade-off. The intended all string numbers case should be quite rare. It's a judgement call: How likely is the valid use case vs. the mis-use? 1%, 0.1%? And is there a limit where we say it is acceptable to show a false positive warning in rare cases to help with a common pitfall in the large majority of the cases. |
To be clear, in the seaborn usecase, it is 100% intentional to be sending in string-typed numbers. |
@mwaskom Do I understand you correctly? You intentionally send string-tryped numbers. But you still want us to warnings.warn on them? Do you plan to filter the warning? |
Right, it would be easier for me overall if there were just no feedback here, though I am sympathetic about this being a common gotcha. But whereas with Maybe I am just not used to normal patterns because it's pretty unusual for libraries in the scientific Python ecosystem to use logging; logging feels like something that is usually done at the application layer, not the library layer. But that goes back to why it feels like matplotlib is doing something odd here. |
As a maybe orthogonal conversation, maybe |
I want to circle back to #23422 (comment) cause I think a CustomWarning could work well here. It'd provide an easy hook for filtering and let us name it in a way where it's clear that this is a "PossibleWarning" And also categoricals have been in the library since 2016 😓 so half/half on warning. ETA: also this is now explicitly documented, so I think there's probably less need to warn: |
Matplotlib is not manipulating any other library's settings. I'm not sure what you mean by this? In normal usage of Matplotlib there are no logging messages. If you turn logging on, then yes, you will see our messages, but it's hard to see why that is something we are doing that is incorrect. |
@mwaskom was speaking from the perspective of |
that's right, thanks
i am raising this both as the maintainer of seaborn and as a user who sometimes works in a codebase where a transitive dependency does |
Libraries should never do that - I think your major complaint is with that library, not Matplotlib. Users who set the logging level should expect to see a bunch of messages. However, libraries should never set the logging level for users. If, as a library maintainer, you really want to suppress our logging message, you can temporarily reset the logging level when you call us:
If you are doing this a lot, you could easily write a context manager. Its a bit of a pain for you as a maintainer, however, in my opinion it is much more burdensome for end-users to have to filter warnings for valid code than for a maintainer to suppress log messages they don't want. |
It feels like there are a whole additional set of arguments here about why the logging approach is unusual and provides bad ux that you’re not engaging with. |
Since this is getting completely lost, gonna again bring up your I think very good suggestion to make a CustomWarning. #23422 (comment) I'm thinking here of warnings as analogous to sphinx admonitions- the purpose of either is to bring attention to important information. Just like how we use different types of admonitions all over the place to flag things, we could use a "FlagThings" warning to bring attention to this. |
I think that's true, because the basic argument is that we are spamming your console. But we are not, your dependency is. Matplotlib has always had logging information being spat out if you set a level of verbosity. We turned those messages into standard logging calls quite a few years ago because it is a standard library with standard ways of controlling the level of output. If the argument is general, in that we should never use If the argument is specific to this situation, as @timhoffm says it's a judgement call if we warn for every time a user does this, or if we just log. I personally don't think we should warn users on valid inputs. I guess we could argue about the level of the message, but INFO is more verbose than WARNING.
This is still a warning that users will have to explicitly filter even if they are using the library properly. The logging tutorial suggests that |
Yes, but at least it's a warning we can name "ValidButRareUseCase" so that users who intend this use case like @mwaskom can easily filter it but it's visible to its intended audience. Granted I also fully support just getting rid of it b/c we document this pretty well now & someone seeing this warning will have to Google the same behavior as someone who doesn't. Which going back to the start of this thread and @mwaskom's point - it's questionable whether logging helps anyone. This is a rough guess based on my behavior, but I think most folks are more likely to Google strange behavior than to think to check logs they may not even know about. And practically speaking, if a user comes to us w/ this issue we're unlikely to ask them to check their logs b/c we recognize what's triggering this odd behavior and will directly point them at the docs for this (w/ maybe a 'can you tell us the dtype?" first). |
To be pedantic, you're spamming my notebook. The distinction actually matters because notebooks with matplotlib plots are often (in a data science context) intended as communicative documents so there is extra cost to pointless content in stderr. Beyond that, I've made a number of other points about why logging is less effective for debugging (from my OP):
Plus (in most cases) you need to know to switch on INFO mode logging, which is an unusual pattern in scientific Python — logging isn't to my knowledge used in the other core libraries.
And then circling back here, I don't really think this distinction is relevant. Set aside my poorly-behaved dependency opting me into matplotlib logging. If I, as a user, wanted useful information from matplotlib and activated its logging, I'd still need the third party libraries like seaborn to do the hacky manipulation of matplotlib's log level, otherwise I would be getting uninformative "warnings" for valid code. |
I didn’t know we did logging until I reviewed #26802… |
Adding on to the argument that I think at least me and @mwaskom are making that warning is:
and my argument that
I think this concern:
Can be addressed by adding "to not see this message, cast to a numerical type or add ETA:
If you can catch the handler for the matplotlib logger, you may be able to add a filter https://stackoverflow.com/a/17276457/1267531 but that will I think suppress it for folks who are using seaborn and may run into this in their non seaborn code. |
I'm not exactly sure how it would work, but it also seems possible to have matplotlib's "maybe warnings" suppressed by default and able to be toggled on with a |
I suggest to move from As discussed above |
Uh oh!
There was an error while loading. Please reload this page.
Summary
There are at least two places where matplotlib provides a message to the user about usage patterns that are potentially mistakes using logging functionality, rather than as a warning.
Two examples I've encountered recently:
Passing an RGB tuple to thec
parameter ofscatter
I don't think logging is the right mechanism here:
Proposed fix
It's possible I am wrong here, but
warnings.warn
feels like the right mechanism for both of these pieces of feedback.The text was updated successfully, but these errors were encountered: