10000 Categorical support for NumPy string arrays. by QuLogic · Pull Request #7241 · matplotlib/matplotlib · GitHub
[go: up one dir, main page]

Skip to content

Categorical support for NumPy string arrays. #7241

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 14, 2016

Conversation

QuLogic
Copy link
Member
@QuLogic QuLogic commented Oct 9, 2016

I'm only pushing tests now, because I want to verify that it fails on CI as well. Then I'll push the fix in the last comment of #7215.

@story645

@QuLogic QuLogic added this to the 2.1 (next point release) milestone Oct 9, 2016
@QuLogic QuLogic force-pushed the categorical-bytes branch from 9842b91 to 981e0e4 Compare October 9, 2016 06:52
ax[0].xaxis.unit_data)

@cleanup
def test_plot_numlike(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer paramterized (or subtests) to subplots so that it's easier to isolate which exact case failed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with subplots so I could pull the correct values from the string version (which should be working based on other tests); I didn't really know what the tests were doing initially, so I wasn't sure what the correct values were in the first place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, so the way to do that if you don't want to write the fixed values (though the expected values is probably better testing practice*) is probably to just always create an axis 0 and then parameterize out axis 1 and axis 2.

  • ticklocs is the fixed assignment, in this case with 3 values ['a', 'c', 'c', 'd'] becomes ['a;, 'c', 'd'] since they're analogous to x values, the ticklocs are [0, 1, 2] and the labels are ['a', 'c', 'd'], which also creates a mismatch between the x and counts and so ['a', 'b', 'c', 'd'] should probably be used explicitely.


ax[0].bar(self.d, counts)

types = [v.encode('ascii') for v in self.d]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kinda confused by this since self.d is ['a', 'c', 'c', 'd']...what's the point of the reencoding

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are strings; this tests bytes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, maybe this should just be explicit then? [b'a', b'c', b'c', b'd'] I think I'm advocating against using the fixture since it's not really used here.


ax[0].bar(self.d, counts)

types = [v.encode('ascii') for v in self.d]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, maybe this should just be explicit then? [b'a', b'c', b'c', b'd'] I think I'm advocating against using the fixture since it's not really used here.

ax[0].xaxis.unit_data)

@cleanup
def test_plot_numlike(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, so the way to do that if you don't want to write the fixed values (though the expected values is probably better testing practice*) is probably to just always create an axis 0 and then parameterize out axis 1 and axis 2.

  • ticklocs is the fixed assignment, in this case with 3 values ['a', 'c', 'c', 'd'] becomes ['a;, 'c', 'd'] since they're analogous to x values, the ticklocs are [0, 1, 2] and the labels are ['a', 'c', 'd'], which also creates a mismatch between the x and counts and so ['a', 'b', 'c', 'd'] should probably be used explicitely.

fig.canvas.draw()

# All four plots should look like the string one.
self.axis_test(ax[1].xaxis,
Copy link
Member
@story645 story645 Oct 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, the expected ticklocs and ticklabels are [0, 1, 2, 3] and ['1', '11', '3', '1'] respectively, The axis unit data should be mocked, as seen in line 141.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it really go up to 3, when the '1' is repeated?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, didn't notice that the 1 was repeated. Then it's [0, 1, 2, 0]

@tacaswell
Copy link
Member

@story645 is responsible for review and merging this PR.

@QuLogic
Copy link
Member Author
QuLogic commented Oct 13, 2016

Hopefully that's somewhere along the lines of what you described. Haven't totally wrapped my head around all the pytest stuff.

[b'1', b'11', b'3', b'1'],
np.array([b'1', b'11', b'3', b'1'])])
def test_plot_numlike(self, bins):
counts = np.array([4, 6, 5, 1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does 4 counts for 3 bins do?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should have named it bars instead of bins... The bars just overlap each other. Actually, it's probably not necessary; I just used the categories from the original issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, if it works...but yeah might be better to test the simplest case here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I just noticed this is similar to the data fixture, where the last element is a again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but that was for plot where I think it's more important to test repeating x. I dunno if that inherently makes sense with bar, but it likely doesn't matter much either way.

@@ -187,6 +187,37 @@ def test_plot_1d_missing(self):
self.axis_test(ax.yaxis, self.dmticks, self.dmlabels, self.dmunit_data)

@cleanup
@pytest.mark.usefixtures("data")
@pytest.mark.parametrize("bins",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I'm being a pain, but would you mind labeling the different parameterizations? It's the id kwarg on the decorator, see line 22 for an example.

@QuLogic
Copy link
Member Author
QuLogic commented Oct 14, 2016

All set, then.

@story645 story645 merged commit 7fd4b69 into matplotlib:master Oct 14, 2016
@tacaswell
Copy link
Member

Thanks both of you 🎉

@QuLogic QuLogic deleted the categorical-bytes branch October 15, 2016 03:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0