8000 Initial Migration of Fuzz Tests & Integration Scripts From the OSS-Fuzz Project Repo by DaveLak · Pull Request #1901 · gitpython-developers/GitPython · GitHub
[go: up one dir, main page]

Skip to content

Initial Migration of Fuzz Tests & Integration Scripts From the OSS-Fuzz Project Repo #1901

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

DaveLak
Copy link
Contributor
@DaveLak DaveLak commented Apr 12, 2024

As discussed in #1889, this PR introduces the changes necessary to migrate the OSS-Fuzz fuzz tests and configuration scripts from the OSS-Fuzz repository into GitPython's repo.

Summary of Changes

Most of the changes proposed here are the same as described in the discussion thread linked above. Beyond that, there are a few minor differences and potentially non-obvious details that I want to call out here:

Directory Structure

Initially, I proposed a single, flat "fuzzing/" directory at the top level, but when I moved the files, I felt that a few sub-directories would help with organization.

The fuzzing/READEME.md file introduced in this PR has a section describing them and some additional documentation.

Updates to Fuzz Targets

The fuzz test files in this PR include the changes that were originally proposed in google/oss-fuzz#11763.

Seed Data

Part of efficiency improvements I proposed in google/oss-fuzz#11763 included the addition of dictionary files (described in the new README) and seed corpus zip files.

I stored these in a repo that I maintain: https://github.com/DaveLak/oss-fuzz-inputs/tree/main/gitpython, because OSS-Fuzz has understandable concerns about seed data bloating the size of their repo.

In this PR:

Next Steps

@Byron & @EliahKagan, please:

  1. Let me know what you think of the changes here.
  2. Take a look at the related PR I have open in the OSS-Fuzz repo: [gitpython]: Move Fuzz Tests & Configuration Upstream google/oss-fuzz#11803 and let me know if theres anything I should change. A comment from each of you acknowledging and approving the changes there is required before it can be merged.
  3. If either of you could assist me with validating that the fuzzing/ directory doesn't get picked up by the publishing process, I'd appreciate it! I did what I could to test it locally but I'm not familiar enough with the release process to be confident.

Thanks! I'll address feedback any feedback as quickly as possible.

DaveLak added 5 commits April 11, 2024 19:55
Migrates the OSS-Fuzz tests and setup scripts from the OSS-Fuzz
repository to GitPython's repo as discussed here:
gitpython-developers#1887 (comment)

These files include the changes that were originally proposed in:
google/oss-fuzz#11763

Additional changes include:
- A first pass at documenting the contents of the fuzzing set up in a
  dedicated README.md
- Adding the dictionary files to this repo for improved visibility. Seed
  corpra zips are still located in an external repo pending further
 discussion regarding where those should live in the long term.
Adds additional documentation links and fixes some typos.
- Updates the fuzzing documentation to include steps for working with
  locally modified versions of the gitpython repository.
- Updates the build.sh script to make the fuzz target search path more
  specific, reducing the risk of local OSS-Fuzz builds picking up
  files located outside of where we expect them (for example, in a .venv
  directory.)
- add artifacts produced by local OSS-Fuzz runs to gitignore
- Fix typos in the documentation on dictionaries
- Link to the fuzzing directory in the main README where it is
  referenced.
@DaveLak DaveLak marked this pull request as ready for review April 12, 2024 05:00
DaveLak added a commit to DaveLak/oss-fuzz that referenced this pull request Apr 12, 2024
Updates the gitpython project files to enable migrating and maintaining
fuzz targets and build scripts upstream.

Related PR in the upstream repo: gitpython-developers/GitPython#1901
Copy link
Member
@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for this wonderful PR! It's very well thought out and a delight. Particularly the fuzzing/README.md is a great introduction to how it all works. >[!TIP] is also new to me, and something I hope to be using more often in future.

About Fuzzing-Fixtures

Overall, the extra-files are just about 2MB in total, and I wonder how stable they will be. If stable, I could imagine including them in this repo.
If there is any concerns about this, gitpython-developers could fork the repo and make you an admin on it.

About archives for publishing on PyPy

I created a new archive from the branch of this PR with python -m build --sdist --wheel and didn't see any trace of fuzzing in the produced archive.

Screenshot 2024-04-13 at 08 05 16

It seems good.

Regarding the OSS fuzz PR

I didn't review it for correctness, but approved it in case it helps getting it merge.

Regarding my review

I just glanced at the shell scripts, and basically ignored the google-provided fuzz targets.

All that's left for me is to figure out where to put the ~2MB of corpus files.

And, thanks again for picking this up :)!

@DaveLak
Copy link
Contributor Author
DaveLak commented Apr 14, 2024

Thanks, I appreciate the kind words 🙂 And yes, >[!TIP] is a handy feature for sure. There are a few others as well.

About Fuzzing-Fixtures

Overall, the extra-files are just about 2MB in total, and I wonder how stable they will be. If stable, I could imagine including them in this repo. If there is any concerns about this, gitpython-developers could fork the repo and make you an admin on it.

I think a dedicated repo under the gitpython-developers org would be the best option. No need to fork mine though, a new one with whatever license you prefer would be fine. I have no preference as far my own permissions though.

Ideally, a seed corpus would be added for each new fuzz target. Obviously that means more files but it doesn't necessarily mean "10 fuzz targets = 10mb". The size & number of interesting inputs will change over time and depending on the complexity/number of code paths in the functionality under test, and it's hard to know what that equates to in terms of per target corpus size at this point. The two zips in my repo right now actually demonstrate that pretty nicely: both of them basically top out their respective coverage quickly but one zip is ~450kb while the other is ~1.5mb.

I like how Bitcoin Core handles in https://github.com/bitcoin-core/qa-assets. From what I've seen, that seems like the cleanest and most flexible.

About archives for publishing on PyPy

I created a new archive from the branch of this PR with python -m build --sdist --wheel and didn't see any trace of fuzzing in the produced archive.

Regarding the OSS fuzz PR

I didn't review it for correctness, but approved it in case it helps getting it merge.

Regarding my review

I just glanced at the shell scripts, and basically ignored the google-provided fuzz targets.

Awesome thanks!

@Byron
Copy link
Member
Byron commented Apr 14, 2024

I think a dedicated repo under the gitpython-developers org would be the best option. No need to fork mine though, a new one with whatever license you prefer would be fine. I have no preference as far my own permissions though.

Great! I have created a fork of your repository under this organization and added you as maintainer. This should give you enough permissions, but if not, you could be admin as well.

Screenshot 2024-04-14 at 08 35 46

You see that I named it after the same repository of the bitcoin organization.

Once you adjusted the repository URL to the fork under this organisation's control, the PR would be ready for merge from my side.

Thanks again for all your work here!

@DaveLak
Copy link
Contributor Author
DaveLak commented Apr 15, 2024

Thanks @Byron! I've updated the URL to unblock this PR and I'll follow up with a PR to cleanup the new qa-assets repo following the pattern of the Bitcoin one.