- TWILIGHT v0.2.0
- Add new sequences: Support for adding new sequences to an existing alignment.
- Protein alignment: Support for protein alignment, still relatively new and continuously improving.
- Iterative mode: Improved for faster performance.
- Flexible tree support: Allows using a tree that contains more sequences than the actual dataset for alignment.
- Bug fixes: Resolved issues present in TWILIGHT v0.1.4; users are encouraged to update to v0.2.0.
TWILIGHT (Tall and Wide Alignments at High Throughput) is a tool designed for ultrafast and ultralarge multiple sequence alignment. It is able to scale to millions of long nucleotide sequences (>10000 bases). TWILIGHT can run on CPU-only platforms (Linux/Mac) or take advantage of CUDA-capable GPUs for further acceleration.
By default, TWILIGHT requires an unaligned sequence file in FASTA format and an input guide tree in Newick format to generate the output alignment in FASTA format (Fig. 1a, default mode). When a guide tree is unavailable, TWILIGHT provides a Snakemake workflow to estimate guide trees using external tools (Fig. 1b, iterative mode).
TWILIGHT adopts the progressive alignment algorithm (Fig. 1c) and employs tiling strategies to band alignments (Fig. 1e). Combined with a divide-and-conquer technique (Fig. 1a), a novel heuristic dealing with gappy columns (Fig. 1d) and support for GPU acceleration (Fig. 1f), TWILIGHT demonstrates exceptional speed and memory efficiency.
TWILIGHT offers multiple installation methods for different platforms and hardware setups:
- Conda is recommended for most users. It fully supports the default mode and partially supports iterative mode, because some tree tools may be unavailable on certain platforms.
- The install script is required for AMD GPU support.
- Docker (built from the provided Dockerfile) is recommended for full iterative mode support.
| Platform / Setup | Conda | Script | Docker |
|---|---|---|---|
| Linux (x86_64) | ✅ | ✅ | ✅ |
| Linux (aarch64) | ✅ | ✅ | 🟡 |
| macOS (Intel Chip) | ✅ | ✅ | ✅ |
| macOS (Apple Silicon) | ✅ | ✅ | 🟡 |
| NVIDIA GPU | ✅ | ✅ | ✅ |
| AMD GPU | ❌ | ✅ | ❌ |
🟡 The Docker image is currently built for the linux/amd64 platform. While it can run on arm64 systems (e.g., Apple Silicon or Linux aarch64) via emulation, this may lead to reduced performance.
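As a quick check before choosing Docker, the sketch below reports whether the linux/amd64 image would run natively or under emulation on the current machine. The helper name `docker_mode` is hypothetical and not part of TWILIGHT; it only maps the output of `uname -m` onto the cases in the table above.

```bash
#!/usr/bin/env bash
# Sketch: decide whether a linux/amd64 Docker image runs natively or
# under emulation for a given CPU architecture string.
# `docker_mode` is a hypothetical helper, not part of TWILIGHT.
docker_mode() {
  case "$1" in
    x86_64|amd64)  echo "native" ;;
    arm64|aarch64) echo "emulated" ;;
    *)             echo "unknown" ;;
  esac
}

arch="$(uname -m)"
echo "Architecture ${arch}: the Docker image runs $(docker_mode "$arch")"
```

If the result is "emulated", the Conda package or the install script may be the faster option.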
TWILIGHT is available on multiple platforms via Conda. See TWILIGHT Bioconda Page for details.
Step 1: Create and activate a Conda environment (ensure Conda is installed first)
conda create -n twilight python=3.11 -y
conda activate twilight
# Set up channels
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority strict
# Install TWILIGHT
conda install bioconda::twilight

Step 2 (optional): Install TWILIGHT iterative mode
git clone https://github.com/TurakhiaLab/TWILIGHT.git
cd TWILIGHT
bash ./install/installIterative.sh

Using installation script (requires sudo access if certain common libraries are not already installed)
Users without sudo access are advised to install TWILIGHT via Conda or Docker.
Step 1: Clone the repository
git clone https://github.com/TurakhiaLab/TWILIGHT.git
cd TWILIGHT

Step 2: Install dependencies (requires sudo access)
TWILIGHT depends on the following common system libraries, which are typically pre-installed on most development environments:
- wget
- build-essential
- cmake
- libboost-all-dev

It also requires libtbb-dev, which is not always pre-installed. For users who do not have sudo access and are missing only libtbb-dev, our script builds and installs TBB from source in the local user environment, with no sudo access required.
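To see which command-line prerequisites are missing before building, a small check like the following may help. It is only a sketch: `missing_tools` is a hypothetical helper, and `gcc`, `g++` and `make` are assumed stand-ins for the build-essential package. Boost and TBB are libraries rather than commands, so this check cannot detect them.

```bash
#!/usr/bin/env bash
# Sketch: report which command-line tools from a list are not on PATH.
# `missing_tools` is a hypothetical helper, not part of the TWILIGHT scripts.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumption: gcc, g++ and make stand in for build-essential.
# Boost and TBB are libraries, so this check cannot see them.
missing="$(missing_tools wget cmake make gcc g++)"
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All checked tools found"
fi
```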
For Ubuntu users with sudo access, if any of the required libraries are missing, you can install them with:
sudo apt install -y wget build-essential libboost-all-dev cmake libtbb-dev

For Mac users, install dependencies using Homebrew:
xcode-select --install # if not already installed
brew install wget boost cmake tbb

Step 3: Build TWILIGHT
Our build script automatically detects the best available compute backend (CPU, NVIDIA GPU, or AMD GPU) and builds TWILIGHT accordingly. Alternatively, users can manually specify the desired target platform.
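To illustrate what backend detection can look like, here is a minimal sketch that guesses a backend from the compilers visible on PATH. It is not the logic used by buildTWILIGHT.sh; the function name `detect_backend` is hypothetical, and the use of `nvcc` and `hipcc` as indicators of CUDA and HIP toolchains is an assumption.

```bash
#!/usr/bin/env bash
# Sketch: guess a compute backend from the compilers visible on PATH.
# This is NOT the logic of buildTWILIGHT.sh; it only illustrates the idea.
# Assumption: nvcc indicates a CUDA toolkit, hipcc a ROCm/HIP toolkit.
detect_backend() {
  if command -v nvcc >/dev/null 2>&1; then
    echo "cuda"
  elif command -v hipcc >/dev/null 2>&1; then
    echo "hip"
  else
    echo "cpu"
  fi
}

echo "Detected backend: $(detect_backend)"
```

A result of `cuda` or `hip` corresponds to the platform argument accepted by the build script; for `cpu`, running the script without an argument is the documented path.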
Automatic build:
bash ./install/buildTWILIGHT.sh

Build for a specific platform:
bash ./install/buildTWILIGHT.sh cuda # For NVIDIA GPUs
bash ./install/buildTWILIGHT.sh hip # For AMD GPUs

Step 4: The TWILIGHT executable is located in the bin directory and can be run as follows:
cd bin
./twilight --help

Step 5 (optional): Install TWILIGHT iterative mode (ensure Conda is installed first)
# Create and activate a Conda environment
conda create -n twilight python=3.11 -y
conda activate twilight
# Install Snakemake and tree inference tools
bash ./install/installIterative.sh

The Dockerfile installs all the dependencies and tools for TWILIGHT default/iterative mode.
Step 1: Clone the repository
git clone https://github.com/TurakhiaLab/TWILIGHT.git
cd TWILIGHT

Step 2: Build a Docker image (ensure Docker is installed first)
CPU version
cd docker/cpu
docker build -t twilight .

GPU version (using nvidia/cuda as base image)
cd docker/gpu
docker build -t twilight .

Step 3: Start and run the Docker container
CPU version
docker run --platform=linux/amd64 -it twilight

GPU version
docker run --platform=linux/amd64 --gpus all -it twilight

Step 4: Run TWILIGHT
cd bin
./twilight -h

For more information about TWILIGHT's options and instructions, see wiki or Help.
cd bin
./twilight -h

Performs a standard progressive alignment using default configurations.
Usage syntax
./twilight -t <tree file> -i <sequence file> -o <output file>

Example
./twilight -t ../dataset/RNASim.nwk -i ../dataset/RNASim.fa -o RNASim.aln

To reduce the CPU's main memory usage, TWILIGHT divides the tree into subtrees with at most m leaves and aligns the subtrees sequentially. The parameter m is user-defined.
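When picking m, it can help to know how many leaves the tree has and the fewest subtrees that could result. The sketch below counts leaves as the number of commas plus one, which holds for Newick trees whose tip names are unquoted and contain no commas, and computes ceil(leaves / m). Both function names are hypothetical helpers. The actual number of subtrees may be higher, since the partition has to follow the tree topology.

```bash
#!/usr/bin/env bash
# Sketch: count the leaves of a Newick tree and bound the number of subtrees.
# Assumption: tip names are unquoted and contain no commas, so
# leaves = commas + 1. Both function names are hypothetical helpers.
count_leaves() {
  local commas
  commas="$(tr -cd ',' < "$1" | wc -c)"
  echo $(( commas + 1 ))
}

# Smallest possible number of subtrees when each holds at most m leaves:
# ceil(leaves / m), computed with integer arithmetic.
min_subtrees() {
  local leaves="$1" m="$2"
  echo $(( (leaves + m - 1) / m ))
}

# Toy tree with five tips, written to a temporary file for illustration.
tree="$(mktemp)"
echo '((A,B),(C,D),E);' > "$tree"
leaves="$(count_leaves "$tree")"
echo "leaves=${leaves}, at least $(min_subtrees "$leaves" 2) subtrees for m=2"
rm -f "$tree"
```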
Usage syntax
./twilight -t <tree file> -i <sequence file> -o <output file> -m <maximum subtree size>

Example
./twilight -t ../dataset/RNASim.nwk -i ../dataset/RNASim.fa -o RNASim.aln -m 200

For better accuracy, it is recommended to use a tree that includes placements for the new sequences. If no tree is provided, TWILIGHT aligns new sequences to the profile of the entire backbone alignment, which may reduce accuracy. In this case, using the provided Snakemake workflow is advised.
./twilight -a <backbone alignment file> -i <new sequence file> -t <tree with placement of new sequences> -o <path to output file>

Example
./twilight -a ../dataset/RNASim_backbone.aln -i ../dataset/RNASim_sub.fa -t ../dataset/RNASim.nwk -o RNASim.aln

To merge multiple MSAs, place all MSA files into a single folder.
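If the MSA files live in different directories, a helper along these lines can gather them into one folder first. This is a sketch only: `collect_msas` is a hypothetical name, and it refuses to copy a file whose name already exists in the destination so that no alignment is silently overwritten.

```bash
#!/usr/bin/env bash
# Sketch: copy several MSA files into one folder, refusing to overwrite
# files that share a name. `collect_msas` is a hypothetical helper.
collect_msas() {
  local dest="$1" f
  shift
  mkdir -p "$dest" || return 1
  for f in "$@"; do
    if [ -e "$dest/$(basename "$f")" ]; then
      echo "name collision: $f" >&2
      return 1
    fi
    cp "$f" "$dest/" || return 1
  done
}
```

For example, `collect_msas merged_input runA/part1.aln runB/part2.aln` (illustrative paths) would produce a folder that can then be passed to the `-f` option.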
Usage syntax
./twilight -f <path to the folder> -o <output file>

Example
./twilight -f ../dataset/RNASim_subalignments/ -o RNASim.aln

Prunes tips that are not present in the raw sequence file. This is useful when working with a large tree but only aligning a subset of sequences, without needing to re-estimate the guide tree. Outputting the pruned tree is also supported.
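To preview which tips have no sequence in the input file, the following sketch compares tip names in a Newick tree against FASTA IDs. It assumes tip names are unquoted and that a FASTA ID is the first word of its header line; TWILIGHT's own name matching may differ, so treat the result as an estimate. All three function names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: list tree tips that have no matching sequence in a FASTA file.
# Assumptions: tip names are unquoted, and a FASTA ID is the first word of
# its header line. TWILIGHT's own name matching may differ.
tree_tips() {
  # Drop internal-node labels, split on Newick punctuation, strip lengths.
  sed 's/)[^,();]*/)/g' "$1" \
    | tr '(),;' '\n\n\n\n' \
    | sed 's/:.*//' \
    | grep -v '^[[:space:]]*$' \
    | sort -u
}

fasta_ids() {
  grep '^>' "$1" | sed 's/^>//; s/[[:space:]].*//' | sort -u
}

# Tips present in the tree ($1) but absent from the FASTA file ($2).
tips_without_sequence() {
  local tips ids
  tips="$(mktemp)"; ids="$(mktemp)"
  tree_tips "$1" > "$tips"
  fasta_ids "$2" > "$ids"
  comm -23 "$tips" "$ids"
  rm -f "$tips" "$ids"
}
```

Calling `tips_without_sequence <tree file> <sequence file>` prints one tip name per line.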
Usage syntax
./twilight -t <large tree file> -i <subset of raw sequences> -o <output file> --prune [--write-prune]

Example
./twilight -t ../dataset/RNASim.nwk -i ../dataset/RNASim_sub.fa -o RNASim_sub.aln --prune --write-prune

For more information about TWILIGHT's options and instructions, see wiki or Help. To set up the environment and install external tools, see here.
- For users who install TWILIGHT via Conda, please replace the executable path "../bin/twilight" with "twilight" in config.yaml. Feel free to switch to a more powerful tree tool if available, such as replacing "raxmlHPC" with "raxmlHPC-PTHREADS-AVX2" for better performance.
- Note that since some tree-building tools can't automatically detect the sequence type, specifying the datatype is required. Use TYPE=n for nucleotide sequences or TYPE=p for protein sequences.
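The executable path replacement for Conda installs can be scripted. The sketch below rewrites the quoted value in a config file; `use_conda_twilight` is a hypothetical helper name, and it goes through a temporary file because in-place `sed -i` behaves differently in GNU and BSD sed.

```bash
#!/usr/bin/env bash
# Sketch: point a workflow config at the Conda-installed executable by
# replacing "../bin/twilight" with "twilight". Uses a temporary file
# because in-place `sed -i` differs between GNU and BSD sed.
# `use_conda_twilight` is a hypothetical helper name.
use_conda_twilight() {
  local cfg="$1" tmp
  tmp="$(mktemp)" || return 1
  if sed 's|"\.\./bin/twilight"|"twilight"|g' "$cfg" > "$tmp"; then
    # Write back with cat so the original file permissions are kept.
    cat "$tmp" > "$cfg"
  fi
  rm -f "$tmp"
}
```

Running `use_conda_twilight path/to/config.yaml` changes only the matching quoted path and leaves every other line untouched.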
Enter workflow directory and type snakemake to view the help messages.
cd workflow
snakemake
# or, for Snakemake versions that require specifying total number of cores:
snakemake --cores 1

TWILIGHT iterative mode estimates guide trees using external tools.
Supported tree inference tools:
- Initial guide tree: parttree, maffttree, mashtree
- Intermediate iterations (optimized for speed): rapidnj, fasttree
- Final tree (optimized for quality): fasttree, raxml, iqtree

Usage syntax
snakemake [--cores <num threads>] --config TYPE=VALUE SEQ=VALUE OUT=VALUE [OPTION=VALUE ...]

Example
- Using default configurations

snakemake --cores 8 --config TYPE=n SEQ=../dataset/RNASim.fa OUT=RNASim.aln

- Generates the final tree based on the completed MSA.

snakemake --cores 8 --config TYPE=n SEQ=../dataset/RNASim.fa OUT=RNASim.aln FINALTREE=fasttree

TWILIGHT aligns new sequences to the profile of the backbone alignment, infers their placement with external tools, and then refines the alignment using the inferred tree.
Usage syntax
snakemake [--cores <num threads>] --config TYPE=VALUE SEQ=VALUE OUT=VALUE ALN=VALUE [OPTION=VALUE ...]

Example
- The backbone alignment is accompanied by a tree.

snakemake --cores 8 --config TYPE=n SEQ=../dataset/RNASim_sub.fa OUT=RNASim.aln ALN=../dataset/RNASim_backbone.aln TREE=../dataset/RNASim_backbone.nwk

- The backbone tree is unavailable; estimate it using external tools and generate a final tree after alignment.

snakemake --cores 8 --config TYPE=n SEQ=../dataset/RNASim_sub.fa OUT=RNASim.aln ALN=../dataset/RNASim_backbone.aln FINALTREE=fasttree

We welcome contributions from the community to enhance the capabilities of TWILIGHT. If you encounter any issues or have suggestions for improvement, please open an issue on the TWILIGHT GitHub page. For general inquiries and support, reach out to our team.
If you use TWILIGHT in your research or publications, we kindly request that you cite the following paper:
Yu-Hsiang Tseng, Sumit Walia, Yatish Turakhia, "Ultrafast and ultralarge multiple sequence alignments using TWILIGHT", Bioinformatics, Volume 41, Issue Supplement_1, July 2025, Pages i332–i341, doi: 10.1093/bioinformatics/btaf212

