bcftools annotate doesn't liftover fields from VCF with symbolic alleles if END in only one file. · Issue #2507 · samtools/bcftools · GitHub

@ASLeonard

Hi,

I'm using bcftools annotate -c INFO -a variant_calls.vcf.gz imputed.vcf.gz to lift over annotations from a variant-call VCF (with meaningful INFO tags) into an imputed VCF (exactly the same variants, but imputation replaces the original INFO tags with other tags). Critically, the two files have the same number and order of variants, and POS/ID/REF/ALT are untouched. This works well for SNPs and most SVs, but records with symbolic alleles seem to be ignored. Adding --pair-logic any/all does carry over the INFO annotations for all variants, but the other options, such as id, do not, even though each symbolic-allele variant has a unique ID.

It seems the issue is that imputation strips the END info from the symbolic alleles, so the variants no longer match: the key for one record is ID/end_pos, while for the other it is just ID.

https://github.com/samtools/htslib/blob/acc28ac1e52efcdd9a06706aaf0021e1082ef1ba/bcf_sr_sort.c#L400C16-L416C22
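
To illustrate the mismatch, here is a toy Python model of the matching key as I understand it from the code linked above. It is a sketch of my reading, not the htslib implementation; the function name match_key, the ID sv_42, and the END value are all made up for the example.

```python
from typing import Optional


def match_key(var_id: str, end: Optional[int]) -> str:
    """Toy pairing key: the ID, with '/END' appended only if the record has END."""
    return var_id if end is None else f"{var_id}/{end}"


# Hypothetical symbolic-allele record that is present in both files.
calls_key = match_key("sv_42", 15300)   # variant-call VCF still carries END
imputed_key = match_key("sv_42", None)  # imputation stripped END

# The keys differ, so the two records would not be paired.
print(calls_key, imputed_key, calls_key == imputed_key)
# prints: sv_42/15300 sv_42 False
```

Under this assumption the ID alone would be enough to pair the records, but the END suffix on only one side makes the keys unequal.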

I guess this is an extreme edge case, since it would be complex and probably fragile to append END only if both files have it and otherwise not use it for either. I guess the most elegant/ugly workaround would be to strip or rename the END tag, merge the INFO fields, and then rename END back or recalculate it from POS+SVLEN.
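
For concreteness, the "only use END if both records have it" rule could look roughly like the Python sketch below. This is a toy model with hypothetical names (keys_for_pair, records as (ID, END) tuples), not a proposed patch. It is handed both records at once, which a real synced reader presumably cannot rely on when it builds its keys, and that is likely where the complexity comes from.

```python
from typing import Optional, Tuple

Record = Tuple[str, Optional[int]]  # (ID, END or None); hypothetical toy record


def keys_for_pair(rec_a: Record, rec_b: Record) -> Tuple[str, str]:
    """Build matching keys, appending END only when both records carry it."""
    (id_a, end_a), (id_b, end_b) = rec_a, rec_b
    use_end = end_a is not None and end_b is not None
    key_a = f"{id_a}/{end_a}" if use_end else id_a
    key_b = f"{id_b}/{end_b}" if use_end else id_b
    return key_a, key_b


# END present in only one file: fall back to the ID alone, so the keys match.
a, b = keys_for_pair(("sv_42", 15300), ("sv_42", None))
print(a == b)  # prints: True

# END present in both files but different: the keys still differ.
a, b = keys_for_pair(("sv_42", 15300), ("sv_42", 15400))
print(a == b)  # prints: False
```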

The versions are

bcftools 1.23-3-g34a49760-dirty
Using htslib 1.23-9-gacc28ac1-dirty

Best,
Alex
