-
Couldn't load subscription status.
- Fork 6
New feature: allow user to collapse R1 and R2 in the same UMI group to separate consensus reads #22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Bug fixed: corrected sam flag after R1/R2 UMI collapsing. Description: Sam flags including "read mapped in proper pair", "read reverse strand", "mate reverse strand" for UMI collapsed R1 and R2 will be the same as the flag for the entire UMI group. If there's any conflicted flag in the same UMI group, it is set by the majority vote. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds a new feature to the dynast consensus command that allows users to generate separate consensus reads for R1 and R2 from the same UMI group instead of collapsing them together. The feature is controlled by a new --separate-r1-r2 flag and maintains compatibility with existing behavior when the flag is not used.
Key changes:
- Added
--separate-r1-r2CLI argument to control R1/R2 consensus behavior - Modified consensus calling logic to handle separate R1/R2 grouping and flag assignment
- Added majority voting for certain BAM flags (is_proper_pair, is_reverse, mate_is_reverse)
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| setup.py | Version bump to 1.0.2.beta |
| setup.cfg | Version bump to 1.0.2.beta |
| dynast/preprocessing/consensus.py | Core consensus logic changes including R1/R2 separation and flag handling |
| dynast/main.py | Added CLI argument parsing for --separate-r1-r2 flag |
| dynast/consensus.py | Updated function signature to accept collapse_r1_r2 parameter |
| dynast/init.py | Version string update to 1.0.2.beta |
| subkey = 'ALL' | ||
| else: | ||
| from_read = 'R1' if read.is_read1 else 'R2' | ||
| subkey = from_read |
Copilot
AI
Aug 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name 'from_read' conflicts with the parameter name 'read_number' used elsewhere in the function. This could cause confusion about which variable represents the read number designation.
| subkey = from_read | |
| read_label = 'R1' if read.is_read1 else 'R2' | |
| subkey = read_label |
| def create_tags_and_strand(barcode, umi, reads, ginfo): | ||
| as_sum = sum(r.get_tag('AS') for r in reads if r.has_tag('AS')) | ||
| tags_ = { | ||
| 'AS': as_sum, |
Copilot
AI
Aug 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If no reads have the 'AS' tag, this will sum to 0, but the original code assumed all reads had the 'AS' tag. This change in behavior could cause issues if the 'AS' tag is expected to always be present.
| 'AS': as_sum, | |
| if not all(r.has_tag('AS') for r in reads): | |
| raise ValueError("All reads must have the 'AS' tag, but at least one is missing.") | |
| as_sum = sum(r.get_tag('AS') for r in reads) | |
| tags_ = { |
| consensus[consensus_i] = base | ||
| qualities[consensus_i] = min(base_q, 42) # cap at PHRED 42 | ||
| consensus_i += 1 | ||
|
|
Copilot
AI
Aug 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The consensus_mask check should be outside the else block. Currently, it's only executed for matches/mismatches but not for deletions, which could lead to incorrect consensus sequence construction.
Hi dynast team:
Thanks for developing this amazing tool.
I added a new feature to the
dynast consensustool. Users can add one extra tag--separate-r1-r2to collapse R1 and R2 to separate consensus reads, while maintaining the same qname (SHA256 hash for the UMI group) for both of them. After collapsing, the R1 consensus will have flag 65 (read paired, first in pair), and the R2 consensus will have flag 129 (read paired, second in pair).Running
dynast consensuswithout specifying--separate-r1-r2will have the same result as before, i.e. R1 and R2 will be collapsed together.I updated the source code and added a help message. I also updated the version from 1.0.1 to 1.0.2beta in this pull request.
Best,
Qirong