Research addressing the lack of datasets of real-world SStuBs (Simple Stupid Bugs) in source code, a gap that has led to the use of synthetic data for evaluating fault localization techniques. SStuBs are small semantic bugs that are syntactically correct and thus hard for a developer to spot manually. The project's main goal is to build an end-to-end system for automatically repairing SStuBs. The "ManySStuBs4J Dataset" is a large dataset of single-statement bug-fix changes mined from open-source Java projects hosted on GitHub.
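
As a rough illustration of the kind of bug described above, the sketch below shows a hypothetical single-statement bug and its one-line fix. The class and method names are invented for this example and are not taken from the dataset. Both versions compile, which is why such a bug can be hard to spot by eye.

```java
public class SStubExample {
    // Buggy version (hypothetical): compiles cleanly, but '<' wrongly
    // rejects a value that is exactly equal to the limit.
    public static boolean isWithinLimitBuggy(int value, int limit) {
        return value < limit;
    }

    // Fixed version: the only change is the comparison operator
    // in a single statement ('<' becomes '<=').
    public static boolean isWithinLimitFixed(int value, int limit) {
        return value <= limit;
    }

    public static void main(String[] args) {
        // The two versions disagree only on the boundary value.
        System.out.println(isWithinLimitBuggy(10, 10)); // prints "false"
        System.out.println(isWithinLimitFixed(10, 10)); // prints "true"
    }
}
```

The two methods differ in a single token, so the bug-fix change touches only one statement.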

Items in this Collection

  • ManySStuBs4J Dataset 

    Karampatsis, Rafael-Michael
    The ManySStuBs4J corpus contains simple statement bugs mined from open-source Java projects hosted on GitHub. There are two variations of the dataset: one mined from the 100 Java Maven Projects and one mined from the top ...
  • SUPERSEDED - ManySStuBs4J Dataset 

    Karampatsis, Rafael-Michael
    ## This item has been replaced by the one found at https://doi.org/10.7488/ds/2628 ## The ManySStuBs4J corpus contains simple statement bugs mined from open-source Java projects hosted on GitHub. There are ...