Skip to content

Conversation

@Qirong-Lin
Copy link

Hi dynast team:

Thanks for developing this amazing tool.

I added a new feature to the dynast consensus tool. Users can add one extra tag --separate-r1-r2 to collapse R1 and R2 to separate consensus reads, while maintaining the same qname (SHA256 hash for the UMI group) for both of them. After collapsing, the R1 consensus will have flag 65 (read paired, first in pair), and the R2 consensus will have flag 129 (read paired, second in pair).

Running dynast consensus without specifying --separate-r1-r2 will have the same result as before, i.e. R1 and R2 will be collapsed together.

I updated the source code and added a help message. I also updated the version from 1.0.1 to 1.0.2beta in this pull request.

Best,
Qirong

@Qirong-Lin Qirong-Lin closed this Jan 23, 2025
@Qirong-Lin Qirong-Lin reopened this Jan 23, 2025
@Qirong-Lin
Copy link
Author

Bug fixed: corrected sam flag after R1/R2 UMI collapsing.

Description: Sam flags including "read mapped in proper pair", "read reverse strand", "mate reverse strand" for UMI collapsed R1 and R2 will be the same as the flag for the entire UMI group. If there's any conflicted flag in the same UMI group, it is set by the majority vote.

@Xiaojieqiu Xiaojieqiu requested a review from Copilot August 12, 2025 01:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new feature to the dynast consensus command that allows users to generate separate consensus reads for R1 and R2 from the same UMI group instead of collapsing them together. The feature is controlled by a new --separate-r1-r2 flag and maintains compatibility with existing behavior when the flag is not used.

Key changes:

  • Added --separate-r1-r2 CLI argument to control R1/R2 consensus behavior
  • Modified consensus calling logic to handle separate R1/R2 grouping and flag assignment
  • Added majority voting for certain BAM flags (is_proper_pair, is_reverse, mate_is_reverse)

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
setup.py Version bump to 1.0.2.beta
setup.cfg Version bump to 1.0.2.beta
dynast/preprocessing/consensus.py Core consensus logic changes including R1/R2 separation and flag handling
dynast/main.py Added CLI argument parsing for --separate-r1-r2 flag
dynast/consensus.py Updated function signature to accept collapse_r1_r2 parameter
dynast/init.py Version string update to 1.0.2.beta

subkey = 'ALL'
else:
from_read = 'R1' if read.is_read1 else 'R2'
subkey = from_read
Copy link

Copilot AI Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name 'from_read' conflicts with the parameter name 'read_number' used elsewhere in the function. This could cause confusion about which variable represents the read number designation.

Suggested change
subkey = from_read
read_label = 'R1' if read.is_read1 else 'R2'
subkey = read_label

Copilot uses AI. Check for mistakes.
def create_tags_and_strand(barcode, umi, reads, ginfo):
as_sum = sum(r.get_tag('AS') for r in reads if r.has_tag('AS'))
tags_ = {
'AS': as_sum,
Copy link

Copilot AI Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If no reads have the 'AS' tag, this will sum to 0, but the original code assumed all reads had the 'AS' tag. This change in behavior could cause issues if the 'AS' tag is expected to always be present.

Suggested change
'AS': as_sum,
if not all(r.has_tag('AS') for r in reads):
raise ValueError("All reads must have the 'AS' tag, but at least one is missing.")
as_sum = sum(r.get_tag('AS') for r in reads)
tags_ = {

Copilot uses AI. Check for mistakes.
consensus[consensus_i] = base
qualities[consensus_i] = min(base_q, 42) # cap at PHRED 42
consensus_i += 1

Copy link

Copilot AI Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The consensus_mask check should be outside the else block. Currently, it's only executed for matches/mismatches but not for deletions, which could lead to incorrect consensus sequence construction.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant