-
Notifications
You must be signed in to change notification settings - Fork 649
Refactor bbtools module to follow MultiQC conventions #3410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Refactor bbtools module to follow MultiQC conventions #3410
Conversation
This refactoring unifies the separate `bbmap` and `bbduk` modules into a single `bbtools` module following MultiQC conventions. The change addresses the issues raised in #3398 and #1146. Key changes: - Merge bbmap and bbduk into unified bbtools module - Each subtool now has its own self-contained file with: - Independent parsing logic - Individual table/plot configuration - Separate data file output (multiqc_bbtools_<subtool>) - Standard data structure: Dict[str, Dict] with sample names as keys - Consistent naming: bbtools/<subtool> in search patterns - Single entry point in pyproject.toml Submodules included: - duk: BBDuk filtering statistics (from bbduk module) - stats: Adapter/contaminant filtering statistics - bbsplit: Reference genome alignment distribution - qchist: Quality value counts (+ Q30% general stats) - aqhist: Average read quality histogram - bhist: Base composition histogram - bqhist: Base quality boxplot data - covhist: Coverage depth histogram - ehist: Errors-per-read histogram - gchist: GC content histogram - idhist: Identity histogram - ihist: Insert size histogram - indelhist: Indel length histogram - lhist: Read length histogram - mhist: Match/sub/del/ins rates - qahist: Quality accuracy histogram - qhist: Quality by position histogram Breaking changes: - Module name changed from `bbmap`/`bbduk` to `bbtools` - Search patterns renamed from `bbmap/*` to `bbtools/*` - Data files renamed from `multiqc_bbmap.json` to individual `multiqc_bbtools_<subtool>.json` files Note: The previously unused file types (bincov, covstats, rpkm, statsfile, statsfile_machine) were not implemented as they were marked as not_implemented in the original module.
Code Review SummaryThis is a well-executed refactoring that successfully consolidates the bbmap and bbduk modules into a unified bbtools module. The code follows MultiQC conventions, is well-structured, and appears functionally sound. A few minor issues and suggestions are noted below. Detailed Review✅ Strengths
🔍 Issues & SuggestionsMinor Issues
Code Quality Observations
Performance Considerations
Security Considerations✅ No security concerns identified:
Test Coverage
Breaking Changes📋 Well-documented breaking changes:
Recommendation: Consider adding a migration note in the changelog/release notes. 📝 Specific Line Commentsbbtools.py:68 - Using 🎯 Recommendations Summary
Final Verdict✅ LGTM with minor suggestions - This is high-quality code that successfully achieves its refactoring goals. The issues noted are minor and mostly suggestions for improvement rather than bugs. The code is ready to merge after considering the recommendations above. Great work on maintaining code quality while performing a significant refactoring! 🎉 |
Update section names across all BBTools submodules to include the subtool name (BBMap, BBDuk, BBSplit) for clarity without the overall module name prefix.
5791f6e to
6e89180
Compare
This refactoring unifies the separate
bbmapandbbdukmodules into a singlebbtoolsmodule following MultiQC conventions. The change addresses the issues raised in #3398 and #1146.Equivalent PR to #3398 but with more significant code rewrite to try to match MultiQC coding style.
🤖 Written by Claude, with instruction and oversight by @ewels
Details
Key changes:
Submodules included:
Breaking changes:
bbmap/bbduktobbtoolsbbmap/*tobbtools/*multiqc_bbmap.jsonto individualmultiqc_bbtools_<subtool>.jsonfilesNote: The previously unused file types (bincov, covstats, rpkm, statsfile, statsfile_machine) were not implemented as they were marked as not_implemented in the original module.