Eco-Flow is a UCL-based research initiative to develop modern, reproducible and scalable data pipelines specific to agri-ecology.
Eco-Flow wants to do more than just build pipelines: we want to cultivate a healthy community of people with similar research interests within agri-ecology who can support each other with their research and analysis.
This community is centred around the pipelines developed by the Eco-Flow team, and the community itself requests which pipelines are developed.
If you would like to join the community, please email ecoflowucl [at] gmail.com, or start a community discussion about a potential pipeline by creating a new issue on Eco-Flow/pipeline-discussions.
Chris is a senior bioinformatician on the project, with a background in omic technologies and pipeline development.
Professor Seirian Sumner is a leading researcher in ecology and genomics.
Fernando is a bioinformatician on the project, with expertise in genomics and pipeline development.
Simon was a senior bioinformatician on the project, with a background in pipeline development and containerisation. He left in May 2024 to work at Genomics England as a bioinformatics engineer.
Professor Yannick Wurm is a data scientist with expertise in genome analysis and evolution.
All the Eco-Flow information can be found on our website: https://eco-flow.github.io.
We intend to develop novel, gold-standard Nextflow pipelines for the agri-ecology field. These pipelines will fully adhere to the nf-core community guidelines, have complete unit testing with nf-test, and use containerised environments published to our quay.io space: quay.io/user/ecoflowucl.
The intention is for all our pipelines to be deployable in any environment (local, on-prem HPC or cloud), utilising either Docker or Singularity for containerisation.
At Eco-Flow we want to make deploying large-scale bioinformatics pipelines as easy for you as possible, which is why everything we produce is public. This includes the configuration files for specific on-premise HPCs. If you want to see whether we have already configured Nextflow for your HPC, check out Eco-Flow/configs.
These are pipelines that have passed the initial development stage and are now ready for publication or release:
- Eco-Flow/synteny - A pipeline that compares gene synteny between chromosome-level genome assemblies. It takes genomes and GFF (annotation) files and compares the macrosynteny using a variety of programs.
- Eco-Flow/excon - A pipeline that runs gene family expansion and contraction analysis (via CAFE). It automates the basic steps of the EXpansion and CONtraction analysis of gene families and runs GO enrichment analysis on the output.
- Eco-Flow/pollen-metabarcoding - A pipeline that processes metabarcoding data, assigns species, and produces tables and figures appropriate for this analysis. Paper: https://besjournals.onlinelibrary.wiley.com/doi/10.1111/1365-2656.70126?af=R
Pipelines that are still in development but are already functional and close to being released:
- Eco-Flow/nanoporemetabarcoding - A pipeline for processing (quality control, demultiplexing, and clustering) and annotating nanopore metabarcoding data. Annotation is performed at different taxonomic levels on the consensus sequences, using the BLASTnt database and taxonomizr.
- nf-core/genomeqc - A pipeline for the quality control of genome assemblies and their respective annotations using common tools such as QUAST, BUSCO, AGAT, and FCS-GX. It also runs OrthoFinder for a quick phylogeny and presents the results as a tree plot with the different statistics.
Pipelines that are still in early stages of development:
- nf-core/gwas - A pipeline to conduct genome-wide association studies (GWAS).




