Add tests for docstring Validation Script + py27 compat by WillAyd · Pull Request #20061 · pandas-dev/pandas · GitHub

Add tests for docstring Validation Script + py27 compat #20061


Merged 36 commits on Aug 17, 2018
Commits
4673475
Added Py27 support for validation script
WillAyd Mar 8, 2018
9abc004
Removed contextlib import
WillAyd Mar 8, 2018
752c6db
Added test for script validator
WillAyd Mar 9, 2018
3eaf3ba
Fixed writer arg to doctest.runner
WillAyd Mar 9, 2018
876337f
Py27 compat and updated tests / logic
WillAyd Mar 12, 2018
aa2a0f9
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Mar 12, 2018
beb56d2
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Mar 13, 2018
d0e0ad6
Added skipif for no sphinx
WillAyd Mar 13, 2018
f123a87
Ported Yields section to numpydoc
WillAyd Mar 13, 2018
4c85b56
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Mar 13, 2018
c463ea9
LINT fixes
WillAyd Mar 13, 2018
103a678
Fixed LINT issue with script
WillAyd Mar 14, 2018
4a1e265
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Mar 14, 2018
06fa6b3
Refactored code, added class docstring test
WillAyd Mar 15, 2018
8494575
Introduced new class structure
WillAyd Mar 15, 2018
fc47d3d
Added bad examples to tests
WillAyd Mar 15, 2018
bed1fc4
Parametrized all tests
WillAyd Mar 15, 2018
3b05e06
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Mar 15, 2018
956c8cb
LINT fixes
WillAyd Mar 15, 2018
c62471b
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Mar 16, 2018
2f6663a
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Mar 16, 2018
0fb52d8
LINT fixup
WillAyd Mar 16, 2018
b6a1624
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Mar 28, 2018
a08220e
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Jul 29, 2018
a9e33a1
Removed errant newline removal
WillAyd Jul 29, 2018
e6bba28
Fixed issue with _accessors
WillAyd Jul 29, 2018
7948d91
Py27 compat
WillAyd Jul 29, 2018
8311b27
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Jul 31, 2018
e1ec864
Parameter compat
WillAyd Jul 31, 2018
099b747
LINT fixup
WillAyd Aug 1, 2018
5e179a0
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Aug 1, 2018
64410f0
Purposely failing test to ensure CI coverage
WillAyd Aug 11, 2018
33d6827
Removed sphinx skipif decorator
WillAyd Aug 13, 2018
e83bd20
Merge remote-tracking branch 'upstream/master' into script-compat
WillAyd Aug 13, 2018
2e51e49
Removed purposely failing test
WillAyd Aug 13, 2018
1fb5405
LINT fixup
WillAyd Aug 13, 2018
Py27 compat and updated tests / logic
WillAyd committed Mar 12, 2018
commit 876337f22ad073c301f7fd891f34ca55bf8b5d4f
87 changes: 50 additions & 37 deletions pandas/tests/scripts/test_validate_docstrings.py
@@ -1,7 +1,7 @@
import os
[Review comment, Contributor] not sure if the CI is actually running this?

import sys

import numpy
import numpy as np
import pytest


@@ -79,7 +79,8 @@ def sample_values(self):
yield random.random()

def head(self):
"""Return the first 5 elements of the Series.
"""
Return the first 5 elements of the Series.

This function is mainly useful to preview the values of the
Series without displaying the whole of it.
@@ -98,7 +99,8 @@ def head(self):
return self.iloc[:5]

def head1(self, n=5):
"""Return the first elements of the Series.
"""
Return the first elements of the Series.

This function is mainly useful to preview the values of the
Series without displaying the whole of it.
@@ -108,9 +110,9 @@ def head1(self, n=5):
n : int
Number of values to return.

Return
------
pandas.Series
Returns
-------
Series
Subset of the original series with the n first values.

See Also
@@ -119,8 +121,7 @@ def head1(self, n=5):

Examples
--------
>>> s = pd.Series(['Ant', 'Bear', 'Cow', 'Dog', 'Falcon',
... 'Lion', 'Monkey', 'Rabbit', 'Zebra'])
>>> s = pd.Series(['Ant', 'Bear', 'Cow', 'Dog', 'Falcon'])
>>> s.head()
0 Ant
1 Bear
@@ -139,40 +140,49 @@ def head1(self, n=5):
"""
return self.iloc[:n]

def contains(self, pattern, case_sensitive=True, na=numpy.nan):
def contains(self, pat, case=True, na=np.nan):
"""
Return whether each value contains `pattern`.
Return whether each value contains `pat`.

In this case, we are illustrating how to use sections, even
if the example is simple enough and does not require them.

Parameters
----------
pat : str
Pattern to check for within each element.
case : bool, default True
Whether check should be done with case sensitivity.
na : object, default np.nan
Fill value for missing data.

Examples
--------
>>> s = pd.Series('Antelope', 'Lion', 'Zebra', numpy.nan)
>>> s.contains(pattern='a')
>>> s = pd.Series(['Antelope', 'Lion', 'Zebra', np.nan])
>>> s.str.contains(pat='a')
0 False
1 False
2 True
3 NaN
dtype: bool
dtype: object

**Case sensitivity**

With `case` set to `False` we can match `a` with both
`a` and `A`:

>>> s.contains(pattern='a', case_sensitive=False)
>>> s.str.contains(pat='a', case=False)
0 True
1 False
2 True
3 NaN
dtype: bool
dtype: object

**Missing values**

We can fill missing values in the output using the `na` parameter:

>>> s.contains(pattern='a', na=False)
>>> s.str.contains(pat='a', na=False)
0 False
1 False
2 True
@@ -181,20 +191,6 @@ def contains(self, pattern, case_sensitive=True, na=numpy.nan):
"""
pass

def plot2(self):
"""
Generate a plot with the `Series` data.

Examples
--------

.. plot::
:context: close-figs

>>> s = pd.Series([1, 2, 3])
>>> s.plot()
"""
pass

class BadDocStrings(object):

@@ -285,7 +281,7 @@ def method(self, foo=None, bar=None):
to understand.

Try to avoid positional arguments like in `df.method(1)`. They
can be all right if previously defined with a meaningful name,
can be alright if previously defined with a meaningful name,
like in `present_value(interest_rate)`, but avoid them otherwise.

When presenting the behavior with different parameters, do not place
@@ -296,19 +292,30 @@ def method(self, foo=None, bar=None):
--------
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(numpy.random.randn(3, 3),
>>> df = pd.DataFrame(np.ones((3, 3)),
... columns=('a', 'b', 'c'))
>>> df.method(1)
21
>>> df.method(bar=14)
123
>>> df.all(1)
0 True
1 True
2 True
dtype: bool
>>> df.all(bool_only=True)
Series([], dtype: bool)
"""
pass

class TestValidator(object):

@pytest.fixture(autouse=True, scope="class")
def import_scripts(self):
"""
Because the scripts directory is above the top level pandas package
we need to hack sys.path to know where to find that directory for
import. The below traverses up the file system to find the scripts
directory, adds that location to sys.path and imports the required
module into the global namespace as part of class setup,
reverting those changes on teardown.
"""
up = os.path.dirname
[Review comment, Contributor] pretty magical can you add a comment

file_dir = up(os.path.abspath(__file__))
script_dir = os.path.join(up(up(up(file_dir))), 'scripts')
@@ -321,7 +328,13 @@ def import_scripts(self):

@pytest.mark.parametrize("func", [
'plot', 'sample', 'random_letters', 'sample_values', 'head', 'head1',
'contains', 'plot2'])
'contains'])
def test_good_functions(self, func):
assert validate_one('pandas.tests.scripts.test_validate_docstrings'
'.GoodDocStrings.' + func) == 0

@pytest.mark.parametrize("func", [
'func', 'astype', 'astype1', 'astype2', 'astype3', 'plot', 'method'])
def test_bad_functions(self, func):
assert validate_one('pandas.tests.scripts.test_validate_docstrings'
'.BadDocStrings.' + func) > 0
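
The path handling described in the `import_scripts` docstring can be illustrated in isolation. The following is a minimal sketch, not the PR's fixture: `find_scripts_dir` and `scripts_on_path` are hypothetical helper names, and a plain context manager stands in for the pytest fixture so that the setup and teardown steps are visible without a test runner.

```python
import contextlib
import os
import sys


def find_scripts_dir(test_file, levels_up=3, dirname="scripts"):
    """Locate a sibling directory of the package by walking up from a test file.

    Hypothetical helper: start at the directory holding ``test_file``, go up
    ``levels_up`` more directories, then append ``dirname``.
    """
    path = os.path.dirname(os.path.abspath(test_file))
    for _ in range(levels_up):
        path = os.path.dirname(path)
    return os.path.join(path, dirname)


@contextlib.contextmanager
def scripts_on_path(script_dir):
    """Temporarily put ``script_dir`` at the front of sys.path."""
    sys.path.insert(0, script_dir)
    try:
        yield script_dir
    finally:
        # Teardown: undo the sys.path change even if the body raised.
        sys.path.remove(script_dir)


# Illustrative layout only; this path does not need to exist on disk.
test_file = os.path.join(os.sep, "repo", "pandas", "tests", "scripts",
                         "test_validate_docstrings.py")
script_dir = find_scripts_dir(test_file)

with scripts_on_path(script_dir):
    # Inside the block, modules under script_dir would be importable.
    on_path_inside = sys.path[0] == script_dir
on_path_after = script_dir in sys.path
```

Wrapping the same insert/remove pair in a class-scoped pytest fixture would give the setup and teardown behaviour the docstring describes; the import of the validation module itself is left out here because it depends on the repository layout.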
61 changes: 48 additions & 13 deletions scripts/validate_docstrings.py
@@ -34,6 +34,7 @@

sys.path.insert(0, os.path.join(BASE_PATH))
import pandas
from pandas.compat import signature

sys.path.insert(1, os.path.join(BASE_PATH, 'doc', 'sphinxext'))
from numpydoc.docscrape import NumpyDocString
@@ -185,11 +186,17 @@ def signature_parameters(self):
# accessor classes have a signature, but don't want to show this
return tuple()
try:
params = self.method_obj.__code__.co_varnames
sig = signature(self.method_obj)
except (TypeError, ValueError):
# Some objects, mainly in C extensions do not support introspection
# of the signature
return tuple()
params = sig.args
if sig.varargs:
params.append("*" + sig.varargs)
if sig.keywords:
params.append("**" + sig.keywords)
params = tuple(params)
if params and params[0] in ('self', 'cls'):
return params[1:]
return params
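
The hunk above replaces `__code__.co_varnames` with a signature lookup. One likely motivation is that `co_varnames` also lists local variables and gives no `*` / `**` prefixes. The sketch below shows the difference using the standard library's `inspect.getfullargspec` (Python 3) as a stand-in for `pandas.compat.signature`; the standalone function and the `Example` class are illustrative assumptions, not code from the PR.

```python
import inspect


def signature_parameters(func):
    """Return the parameter names of ``func``, prefixing *args and **kwargs.

    Stand-in sketch: uses inspect.getfullargspec rather than the
    pandas.compat.signature helper imported in the diff. Keyword-only
    parameters are not handled here.
    """
    try:
        spec = inspect.getfullargspec(func)
    except TypeError:
        # Some callables, mainly in C extensions, cannot be introspected.
        return tuple()
    params = list(spec.args)
    if spec.varargs:
        params.append("*" + spec.varargs)
    if spec.varkw:
        params.append("**" + spec.varkw)
    if params and params[0] in ("self", "cls"):
        params = params[1:]
    return tuple(params)


class Example(object):
    def method(self, foo, bar=None, *args, **kwargs):
        first_local = foo  # a local variable, not a parameter
        return first_local


params = signature_parameters(Example.method)
# co_varnames mixes parameters with locals such as ``first_local``.
raw_names = Example.method.__code__.co_varnames
```

Here `params` contains only the documented-facing names, with `*args` and `**kwargs` marked, while `raw_names` also includes `first_local`.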
@@ -237,6 +244,10 @@ def examples(self):
def returns(self):
return self.doc['Returns']

@property
def method_source(self):
return inspect.getsource(self.method_obj)

@property
def first_line_ends_in_dot(self):
if self.doc:
@@ -376,13 +387,27 @@ def validate_all():


def validate_one(func_name):
"""
Validate the docstring for the given func_name

Parameters
----------
func_name : function
Function whose docstring will be evaluated

Returns
-------
int
The number of errors found in the `func_name` docstring
"""
func_obj = _load_obj(func_name)
doc = Docstring(func_name, func_obj)

sys.stderr.write(_output_header('Docstring ({})'.format(func_name)))
sys.stderr.write('{}\n'.format(doc.clean_doc))

errs = []
wrns = []
if doc.start_blank_lines != 1:
errs.append('Docstring text (summary) should start in the line '
'immediately after the opening quotes (not in the same '
@@ -410,16 +435,17 @@ def validate_one(func_name):
'not third person (e.g. use "Generate" instead of '
'"Generates")')
if not doc.extended_summary:
errs.append('No extended summary found')
wrns.append('No extended summary found')

param_errs = doc.parameter_mismatches
for param in doc.doc_parameters:
if not doc.parameter_type(param):
param_errs.append('Parameter "{}" has no type'.format(param))
else:
if doc.parameter_type(param)[-1] == '.':
param_errs.append('Parameter "{}" type '
'should not finish with "."'.format(param))
if not param.startswith("*"): # Check can ignore var / kwargs
if not doc.parameter_type(param):
param_errs.append('Parameter "{}" has no type'.format(param))
else:
if doc.parameter_type(param)[-1] == '.':
param_errs.append('Parameter "{}" type '
'should not finish with "."'.format(param))

if not doc.parameter_desc(param):
param_errs.append('Parameter "{}" '
@@ -437,24 +463,28 @@ def validate_one(func_name):
for param_err in param_errs:
errs.append('\t{}'.format(param_err))

if not doc.returns:
errs.append('No returns section found')
if not doc.returns and "return" in doc.method_source:
errs.append('No Returns section found')
if "yield" in doc.method_source:
[Review comment, Member, Author] This would ideally have the same structure as the returns check directly above it, but I think there's a bug in numpydoc where it doesn't parse Yields sections.

# numpydoc is not correctly parsing Yields sections, so
# best we can do is warn the user to lookout for this...
wrns.append('Yield found in source - please make sure to document!')

mentioned_errs = doc.mentioned_private_classes
if mentioned_errs:
errs.append('Private classes ({}) should not be mentioned in public '
'docstring.'.format(mentioned_errs))

if not doc.see_also:
errs.append('See Also section not found')
wrns.append('See Also section not found')
else:
for rel_name, rel_desc in doc.see_also.items():
if not rel_desc:
errs.append('Missing description for '
'See Also "{}" reference'.format(rel_name))
examples_errs = ''
if not doc.examples:
errs.append('No examples section found')
wrns.append('No examples section found')
else:
examples_errs = doc.examples_errors
if examples_errs:
@@ -465,7 +495,12 @@ def validate_one(func_name):
sys.stderr.write('Errors found:\n')
for err in errs:
sys.stderr.write('\t{}\n'.format(err))
else:
if wrns:
sys.stderr.write('Warnings found:\n')
for wrn in wrns:
sys.stderr.write('\t{}\n'.format(wrn))

if not errs:
sys.stderr.write('Docstring for "{}" correct. :)\n'.format(func_name))

if examples_errs:
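
The error/warning split introduced in this diff can be summarized with a small standalone sketch. It assumes a hypothetical `check_sections` function that takes the method source as a string and the set of section names found in the docstring; the real script works on a `Docstring` object and `inspect.getsource`, which are not reproduced here.

```python
def check_sections(source, sections):
    """Classify missing docstring sections as errors or warnings.

    ``source`` is the function's source text and ``sections`` is the set of
    section names present in its docstring. Both are simplified stand-ins
    for what the validation script derives from the object under test.
    """
    errs = []
    wrns = []
    # Only demand a Returns section when the source appears to return.
    if "Returns" not in sections and "return" in source:
        errs.append("No Returns section found")
    # Yields cannot be checked against the docstring here, so only warn.
    if "yield" in source:
        wrns.append("Yield found in source - please make sure to document!")
    if "See Also" not in sections:
        wrns.append("See Also section not found")
    if "Examples" not in sections:
        wrns.append("No examples section found")
    return errs, wrns


returning_src = "def head(self):\n    return self.iloc[:5]\n"
generator_src = "def sample_values(self):\n    yield random.random()\n"

errs_a, wrns_a = check_sections(returning_src, {"See Also", "Examples"})
errs_b, wrns_b = check_sections(generator_src, set())
```

Only the error list would count toward a failing result, which matches the tests asserting `== 0` for good docstrings and `> 0` for bad ones. Note that a plain substring test is a heuristic: the word "return" inside a comment or the docstring itself also matches.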