Add AlignedSegment.query_qualities_str and fix sequence and qualities caching by jmarshall · Pull Request #1341 · pysam-developers/pysam · GitHub

Merged
merged 7 commits into from
May 21, 2025


jmarshall
Member

On PR #1324 @nh13 requested being able to set AlignedSegment.query_qualities via a SAM/FASTQ-style base quality string. This expands that to be able to retrieve QUAL as a string too.

Add a query_qualities_str property implemented directly against the underlying bam1_t data structure rather than translating to an array first via qualitystring_to_array()/array_to_qualitystring(). Rewrite query_qualities so that it too works visibly directly against the bam1_t, and fix several bugs in its caching layer. (See #121 for why this caching is important.) Allow query_qualities to be set from a string too, by delegating to the new property.

Fix some similar caching bugs in query_sequence too.
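
For readers unfamiliar with the two representations of QUAL discussed above, here is a minimal pure-Python sketch of the relationship between the ASCII string form and the array form. It assumes the standard SAM/FASTQ Phred+33 encoding; the function names are hypothetical stand-ins and are not pysam's own helpers or the code added in this PR.

```python
from array import array

PHRED_OFFSET = 33  # SAM/FASTQ quality strings store each Phred score as chr(score + 33)


def quality_string_to_array(qual):
    """Decode an ASCII quality string into an array of Phred scores (hypothetical helper)."""
    return array("B", (ord(ch) - PHRED_OFFSET for ch in qual))


def array_to_quality_string(scores):
    """Encode an iterable of Phred scores as an ASCII quality string (hypothetical helper)."""
    return "".join(chr(q + PHRED_OFFSET) for q in scores)


scores = quality_string_to_array("II5!")
print(list(scores))                     # [40, 40, 20, 0]
print(array_to_quality_string(scores))  # II5!
```

A string property that reads the record's quality bytes directly avoids building the intermediate array when only the string is wanted, which is the motivation stated in the description.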

jmarshall added 7 commits May 3, 2025 21:27
These deprecated properties forward to corresponding query_alignment_XYZ
properties which are all read-only as they are derived from query_XYZ.
Rewrite query_qualities directly, and have query_alignment_qualities
trim that final value rather than computing its own similarly.

Clear caches in __set__ et al and populate them only in __get__ routines.
Previously the input value was cached rather than the canonical one
reconverted from the raw data, and setting to None did not clear the
previously cached value.
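
The caching rule described in this commit message can be illustrated with a toy class. This is only a sketch of the pattern (clear in the setter, populate in the getter), not the pysam implementation: `ToyRecord`, `_raw` and `_cache` are hypothetical names, and a `bytes` object stands in for the raw record data.

```python
from array import array


class ToyRecord:
    """Hypothetical stand-in for a record whose raw bytes are the source of truth."""

    def __init__(self):
        self._raw = b""      # stand-in for the raw quality bytes held by the record
        self._cache = None   # populated lazily, and only by the getter

    @property
    def qualities(self):
        if not self._raw:
            return None
        if self._cache is None:
            # Convert from the raw data so the cached value is the canonical one
            self._cache = array("B", self._raw)
        return self._cache

    @qualities.setter
    def qualities(self, value):
        # Always drop the old cache first, including when value is None
        self._cache = None
        self._raw = b"" if value is None else bytes(value)


rec = ToyRecord()
rec.qualities = [30, 31, 32]   # a list goes in...
print(rec.qualities)           # array('B', [30, 31, 32])
rec.qualities = None
print(rec.qualities)           # None
```

Because the setter never stores its input in the cache, the getter always returns the same type regardless of what was assigned, and assigning None cannot leave a stale value behind.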
The existing query_qualities property provides QUAL as a Python array
(and in fact has long allowed it to be set from most iterables other
than strings and tuples). This new query_qualities_str property enables
QUAL to be accessed as the usual ASCII-encoded base quality string (and
now query_qualities also allows it to be set from such a string).

Also add (read-only) query_alignment_qualities_str paralleling
query_alignment_qualities similarly.
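
As a rough illustration of what a read-only "alignment" variant of the quality string represents, the sketch below trims a full quality string to the bases that are not soft-clipped. It assumes CIGAR data as a list of (operation, length) pairs using the BAM operation codes; the function names are hypothetical, and the real property may compute its bounds differently.

```python
BAM_CSOFT_CLIP = 4  # BAM CIGAR operation code for 'S'
BAM_CHARD_CLIP = 5  # BAM CIGAR operation code for 'H'


def aligned_bounds(cigartuples, query_length):
    """Return (start, end) of the aligned part of the read, excluding soft clips."""
    start, end = 0, query_length
    for op, length in cigartuples:            # leading clips
        if op == BAM_CSOFT_CLIP:
            start += length
        elif op != BAM_CHARD_CLIP:
            break
    for op, length in reversed(cigartuples):  # trailing clips
        if op == BAM_CSOFT_CLIP:
            end -= length
        elif op != BAM_CHARD_CLIP:
            break
    return start, end


def alignment_qualities_str(qual, cigartuples):
    """Trim a full quality string down to the aligned (non-soft-clipped) bases."""
    if qual is None:
        return None
    start, end = aligned_bounds(cigartuples, len(qual))
    return qual[start:end]


# 2S5M1S: two soft-clipped bases, five aligned bases, one soft-clipped base
print(alignment_qualities_str("!!IIIII#", [(4, 2), (0, 5), (4, 1)]))  # IIIII
```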
Updating query_sequence did not invalidate cache_query_alignment_sequence
previously. Clear caches in __set__ et al and populate them only in
__get__ routines.
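
The stale-cache problem named in this commit message arises whenever a derived value is cached separately from the value it is derived from. A minimal sketch follows, with hypothetical names (`ToySequence`, `_cache_alignment`) that are not taken from the pysam source:

```python
class ToySequence:
    """Hypothetical record with a full sequence and a cached, trimmed view of it."""

    def __init__(self, sequence, start, end):
        self._sequence = sequence
        self._start, self._end = start, end  # bounds of the aligned portion
        self._cache_alignment = None         # cache for the derived value

    @property
    def sequence(self):
        return self._sequence

    @sequence.setter
    def sequence(self, value):
        self._sequence = value
        # The derived cache is now stale; omitting this line reproduces the bug class
        self._cache_alignment = None

    @property
    def alignment_sequence(self):
        if self._cache_alignment is None and self._sequence is not None:
            self._cache_alignment = self._sequence[self._start:self._end]
        return self._cache_alignment


rec = ToySequence("AACGTT", 1, 5)
print(rec.alignment_sequence)  # ACGT
rec.sequence = "GGTTAA"
print(rec.alignment_sequence)  # GTTA
```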
jmarshall force-pushed the query_qualities_str branch from 173a128 to 1082aa2 May 15, 2025 13:10
jmarshall force-pushed the query_qualities_str branch from 1082aa2 to 5a9eb8c May 21, 2025 09:57
jmarshall merged commit 9c4c693 into pysam-developers:master May 21, 2025
0 of 12 checks passed
jmarshall deleted the query_qualities_str branch May 21, 2025 09:59
Contributor
nh13 commented May 21, 2025

Thank-you!
