Categorical support for NumPy string arrays. #7241

QuLogic · 2016-10-09T06:21:42Z

I'm only pushing tests now, because I want to verify that it fails on CI as well. Then I'll push the fix in the last comment of #7215.

@story645

story645 · 2016-10-09T06:56:52Z

lib/matplotlib/tests/test_category.py

+                       ax[0].xaxis.unit_data)
+
+    @cleanup
+    def test_plot_numlike(self):


I prefer parametrized tests (or subtests) to subplots, so that it's easier to isolate exactly which case failed.

I went with subplots so I could pull the correct values from the string version (which should be working based on other tests); I didn't really know what the tests were doing initially, so I wasn't sure what the correct values were in the first place.

hmm, so the way to do that if you don't want to write the fixed values (though writing out the expected values is probably better testing practice) is probably to just always create an axis 0 and then parametrize out axis 1 and axis 2.

ticklocs is the fixed assignment. In this case the values ['a', 'c', 'c', 'd'] become the 3 categories ['a', 'c', 'd']; since they're analogous to x values, the ticklocs are [0, 1, 2] and the labels are ['a', 'c', 'd']. That also creates a mismatch between the x values and counts, so ['a', 'b', 'c', 'd'] should probably be used explicitly.
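For illustration, here is a minimal sketch of the mapping described above, assuming each distinct category gets the next integer in order of first appearance. The helper `category_ticks` is hypothetical and is not matplotlib's converter; it only shows why four x values with one repeat give three tick locations.

```python
def category_ticks(values):
    """Return (labels, ticklocs, positions) for a sequence of categories.

    Hypothetical helper for illustration only: each distinct value is
    assigned the next integer in order of first appearance.
    """
    index = {}
    for v in values:
        if v not in index:
            index[v] = len(index)
    labels = list(index)                    # one label per distinct category
    ticklocs = list(index.values())         # one tick per distinct category
    positions = [index[v] for v in values]  # x position of each data point
    return labels, ticklocs, positions


labels, ticklocs, positions = category_ticks(['a', 'c', 'c', 'd'])
print(labels)     # ['a', 'c', 'd']
print(ticklocs)   # [0, 1, 2]
print(positions)  # [0, 1, 1, 2]
```

With the repeated 'c', two of the four counts land on the same x position, which is the mismatch between x and counts noted above.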

story645 · 2016-10-09T07:11:22Z

lib/matplotlib/tests/test_category.py

+
+        ax[0].bar(self.d, counts)
+
+        types = [v.encode('ascii') for v in self.d]


kinda confused by this since self.d is ['a', 'c', 'c', 'd']... what's the point of the re-encoding?

Those are strings; this tests bytes.

hmm, maybe this should just be explicit then? i.e. [b'a', b'c', b'c', b'd']. I think I'm advocating against using the fixture since it's not really used here.
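A minimal sketch of the two spellings under discussion, using the values quoted above for `self.d`. The local list `d` is an assumption standing in for the fixture.

```python
d = ['a', 'c', 'c', 'd']  # stand-in for the fixture's self.d

# Derived from the fixture: encode each str to bytes.
encoded = [v.encode('ascii') for v in d]

# Written out explicitly, as suggested in the review.
explicit = [b'a', b'c', b'c', b'd']

# Both spellings produce the same bytes values.
assert encoded == explicit

# str and bytes are distinct types in Python 3, so the bytes input
# is a separate case from the str input it was derived from.
assert all(isinstance(v, bytes) for v in encoded)
assert not any(isinstance(v, bytes) for v in d)
```

The explicit list makes the input visible at the point of use, at the cost of repeating the values.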

story645 · 2016-10-09T23:02:28Z

lib/matplotlib/tests/test_category.py

+        fig.canvas.draw()
+
+        # All four plots should look like the string one.
+        self.axis_test(ax[1].xaxis,


same as above: the expected ticklocs and ticklabels are [0, 1, 2, 3] and ['1', '11', '3', '1'] respectively. The axis unit data should be mocked, as seen in line 141.

Does it really go up to 3, when the '1' is repeated?

Oh sorry, didn't notice that the 1 was repeated. Then it's [0, 1, 2, 0]

tacaswell · 2016-10-11T02:16:02Z

@story645 is responsible for review and merging this PR.

QuLogic · 2016-10-13T09:04:47Z

Hopefully that's somewhere along the lines of what you described. Haven't totally wrapped my head around all the pytest stuff.

story645 · 2016-10-13T15:56:27Z

lib/matplotlib/tests/test_category.py

+                              [b'1', b'11', b'3', b'1'],
+                              np.array([b'1', b'11', b'3', b'1'])])
+    def test_plot_numlike(self, bins):
+        counts = np.array([4, 6, 5, 1])


What do 4 counts for 3 bins do?

Maybe I should have named it bars instead of bins... The bars just overlap each other. Actually, it's probably not necessary; I just used the categories from the original issue.

hmm, if it works...but yeah might be better to test the simplest case here.

Actually, I just noticed this is similar to the data fixture, where the last element is 'a' again.

Yeah, but that was for plot where I think it's more important to test repeating x. I dunno if that inherently makes sense with bar, but it likely doesn't matter much either way.

story645 · 2016-10-13T15:58:34Z

lib/matplotlib/tests/test_category.py

@@ -187,6 +187,37 @@ def test_plot_1d_missing(self):
        self.axis_test(ax.yaxis, self.dmticks, self.dmlabels, self.dmunit_data)

    @cleanup
+    @pytest.mark.usefixtures("data")
+    @pytest.mark.parametrize("bins",


I know I'm being a pain, but would you mind labeling the different parametrizations? It's the ids kwarg on the decorator; see line 22 for an example.
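As a rough sketch of what labeled parametrizations look like, the example below passes `ids` to `pytest.mark.parametrize`. The test name, the parameter name `bars`, the labels, and the assertion are all hypothetical; only the sample values are modelled on the diff above.

```python
import pytest


# `ids` gives each parametrization a readable name in pytest's output,
# so a failure points directly at the input type that broke.
@pytest.mark.parametrize(
    "bars",
    [['1', '11', '3', '1'],
     [b'1', b'11', b'3', b'1']],
    ids=["string list", "bytes list"],
)
def test_unique_categories(bars):
    # Hypothetical check: three distinct categories for str or bytes input.
    assert len(set(bars)) == 3
```

With labels in place, a failing case is reported under a name along the lines of `test_unique_categories[bytes list]` rather than an auto-generated index.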

QuLogic · 2016-10-14T04:51:52Z

All set, then.

tacaswell · 2016-10-14T18:41:45Z

Thanks both of you 🎉

QuLogic added the topic: categorical label Oct 9, 2016

QuLogic added this to the 2.1 (next point release) milestone Oct 9, 2016

QuLogic added the status: needs review label Oct 9, 2016

TST: Add categorical tests for NumPy string arrays.

981e0e4

QuLogic force-pushed the categorical-bytes branch from 9842b91 to 981e0e4 Compare October 9, 2016 06:52

story645 reviewed Oct 9, 2016

Use category converter on NumPy str/bytes also.

a29e26d

QuLogic mentioned this pull request Oct 9, 2016

BUG: bar deals with bytes and string x data in different manners, both that are unexpected #7215

Closed

tacaswell assigned story645 Oct 9, 2016

story645 reviewed Oct 9, 2016

story645 reviewed Oct 13, 2016

TST: Parametrize new category bytes tests.

de03d47

QuLogic force-pushed the categorical-bytes branch from 3ba152a to de03d47 Compare October 14, 2016 03:17

story645 merged commit 7fd4b69 into matplotlib:master Oct 14, 2016

story645 removed the status: needs review label Oct 14, 2016

QuLogic deleted the categorical-bytes branch October 15, 2016 03:40
