8000 [MRG + 1] CI some improvements to the flake8 CI by jnothman · Pull Request #8036 · scikit-learn/scikit-learn · GitHub
[go: up one dir, main page]

Skip to content

[MRG + 1] CI some improvements to the flake8 CI #8036

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 13, 2016

Conversation

jnothman
Copy link
Member
  • Examples now cannot fail due to 'E402 module level import not at top of file'
  • Changed files are checked for unused imports regardless of diff

In general, I find flake8 --diff somewhat deficient: it's not able to check for things like unused names. I think we therefore rely on it excessively.

Ideally we would:

  • flake8 over the old version
  • flake8 over the new version
  • use the diff to find corresponding line numbers between errors in old version and new version
  • report any error that does not correspond to an error in the old version

If anyone wants to implement that, they're welcome!

- Examples now cannot fail due to 'E402 module level import not at top of file'

Changed files are checked for unused imports regardless of diff
Copy link
Member
@lesteve lesteve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I find flake8 --diff somewhat deficient: it's not able to check for things like unused names. I think we therefore rely on it excessively.

I agree it has shortcomings but it is still a lot better than nothing.

echo "$files"
echo $options
git diff --unified=0 $COMMIT -- $files | flake8 --diff --show-source $options
}

if [[ "$MODIFIED_FILES" == "no_match" ]]; then
echo "No file outside sklearn/externals and doc/sphinxext/sphinx_gallery has been modified"
else
# Conservative approach: diff without context so that code that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want to move this comment before the git diff --unified=0 $COMMIT ... line. Basically this was an explanation why we use --unified=0 in the git diff command.

8000

if [[ "$MODIFIED_FILES" == "no_match" ]]; then
echo "No file outside sklearn/externals and doc/sphinxext/sphinx_gallery has been modified"
else
# Conservative approach: diff without context so that code that
# was not changed does not create failures
git diff --unified=0 $COMMIT -- $MODIFIED_FILES | flake8 --diff --show-source
check_files "$(echo "$MODIFIED_FILES" | grep -v ^examples)"
# Examples are allowed to not have imports at top of file due to print(__doc__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove due to print(__doc__). This is not the main reason, this happens most of the time because examples are written in a "interactive mode" where you only import stuff when you need it rather than at the top of the file.

check_files() {
files="$1"
options="$2"
echo "$files"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to keep this debugging echo statements?

check_files "$(echo "$MODIFIED_FILES" | grep -v ^examples)"
# Examples are allowed to not have imports at top of file due to print(__doc__)
check_files "$(echo "$MODIFIED_FILES" | grep ^examples)" --ignore=E402
check_files "$MODIFIED_FILES"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this last line ... You have already checked the non-examples files above and the example files ignoring E402.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I basically fell asleep while testing this and left in a few debug statements! thanks for your patience!

@lesteve
Copy link
Member
lesteve commented Dec 12, 2016

Ideally we would:
flake8 over the old version
flake8 over the new version
use the diff to find corresponding line numbers between errors in old version and new version
report any error that does not correspond to an error in the old version

matplotlib has a simplified version of it which checks (if I understood correctly) the number of flake8 errors by file and make sure that they do not go up. They are not all super happy with this though and at one point a PR was opened to copy the functionality of flake8_diff.sh: matplotlib/matplotlib#7403.

All in all I think this is a matter of 80/20 rule.

@amueller
Copy link
Member

How about we error on all violations within a changed file? That'll reduce testing complexity at the cost of annoying committers for a little while.

Alternatively, we can can fix all pep8 violations (or mark them to be ignored by the linter if we don't like the fix) "once and for all" and then run flake8 on the changed files and not annoy future committers, but make everybody rebase their PRs once.

@jnothman
Copy link
Member Author
jnothman commented Dec 13, 2016 via email

@amueller
Copy link
Member

but it's trying to solve a problem that will go away soon anyhow, right?

@jnothman
Copy link
Member Author
jnothman commented Dec 13, 2016 via email

@amueller
Copy link
Member

well if we solve all pep8 in the codebase then we don't need to worry about checking only the diff, right?

@jnothman
Copy link
Member Author
jnothman commented Dec 13, 2016 via email

@amueller
Copy link
Member

Well if we only merge prs that don't introduce pep8 errors that will happen at some point, right? Unless we want to not enforce that and just use the continuous integration as a reminder.

@jnothman
Copy link
Member Author
jnothman commented Dec 13, 2016 via email
8000

@lesteve
Copy link
Member
lesteve commented Dec 13, 2016

I tested this locally and this works fine, so I am +1 for merging this.

I don't have a strong opinion about pep8-fying the full code but I think that can be discussed in a separate issue.

@lesteve lesteve changed the title CI some improvements to the flake8 CI [MRG + 1] CI some improvements to the flake8 CI Dec 13, 2016
@amueller
Copy link
Member

@amueller
Copy link
Member

lgtm. Btw, can we please ignore W503? Do we have ignores setup somewhere? Or is our flake8 just old?

@lesteve
Copy link
Member
lesteve commented Dec 13, 2016

Merging this one, thanks @jnothman!

@lesteve lesteve merged commit aceb73c into scikit-learn:master Dec 13, 2016
@jnothman
Copy link
Member Author

https://erikbern.com/2016/12/05/the-half-life-of-code.html ;)

I'm never convinced when measures are based on number of commits. Surely this is highly influenced by the projects' policies about commit atomicity. I wonder how robust those half lives are to filtering commits to, say, a minimum number of lines changed.

@amueller
Copy link
Member

it's about how long it takes until a line changes, commits don't matter, I think.

@amueller amueller mentioned this pull request Dec 14, 2016
sergeyf pushed a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017
Examples now cannot fail due to 'E402 module level import not at top of file'
@Przemo10 Przemo10 mentioned this pull request Mar 17, 2017
Sundrique pushed a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017
Examples now cannot fail due to 'E402 module level import not at top of file'
NelleV pushed a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017
Examples now cannot fail due to 'E402 module level import not at top of file'
paulha pushed a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017
Examples now cannot fail due to 'E402 module level import not at top of file'
maskani-moh pushed a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
Examples now cannot fail due to 'E402 module level import not at top of file'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
0