Accurately detecting functional genes in metagenomes.
For installation instructions, see INSTALL.md. For license information, see LICENSE.txt.
Once you have installed ROCker, the easiest way to use it is to search your metagenome with a pre-existing model. We maintain a list of precomputed models that you're free to use.
1. Obtain the model of interest, either by downloading it from our repository or by creating one yourself (see below).
2. Execute ROCker search. The minimum required parameters are:
   $> ROCker search -q input.fasta -k model.rocker -o output.blast
   Where input.fasta is the input metagenome in FastA format, model.rocker is the ROCker model, and output.blast is the output file to be created in tabular BLAST format. For additional supported options, execute ROCker search -h.
3. If you have a pre-computed BLAST file, you can execute instead:
   $> ROCker filter -x input.blast -k model.rocker -o output.blast
   Where input.blast is the input search to be filtered in tabular BLAST format, model.rocker is the ROCker model, and output.blast is the output file to be created in tabular BLAST format. For additional supported options, execute ROCker filter -h.
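If you have several metagenomes, the search command above can be wrapped in a small loop. The following is only a sketch: the function name rocker_search_all, the ROCKER environment variable, the .fasta extension, and the one-file-per-sample directory layout are all assumptions of this example, not ROCker features. It uses only the -q, -k, and -o options described above.

```shell
# Hypothetical helper: run `ROCker search` on every FastA file in a directory.
# Usage: rocker_search_all MODEL IN_DIR OUT_DIR
# ROCKER overrides the executable name (an assumption of this sketch, handy
# for dry runs); it defaults to "ROCker".
rocker_search_all() {
  model=$1
  in_dir=$2
  out_dir=$3
  mkdir -p "$out_dir" || return 1
  for fasta in "$in_dir"/*.fasta; do
    # Skip the unexpanded pattern when the directory has no .fasta files.
    [ -e "$fasta" ] || continue
    sample=$(basename "$fasta" .fasta)
    # One tabular BLAST output per sample, named after the input file.
    "${ROCKER:-ROCker}" search -q "$fasta" -k "$model" \
      -o "$out_dir/$sample.blast" || return 1
  done
}
```

For example, rocker_search_all model.rocker metagenomes results would write results/NAME.blast for each metagenomes/NAME.fasta. Setting ROCKER=echo prints the commands instead of running them, which is a cheap way to check the paths first.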
To build your own model, first collect a good reference set of sequences for the gene of interest. This is the most important step, but there are some resources to help you. In general, we find the resources at UniProt very useful.
1. Create a list of UniProt identifiers (IDs and/or accessions) representing proteins of the family of interest, in a raw text file (one per line).
2. If you want to explicitly exclude certain proteins from the model (e.g., if there are very similar proteins with distinct functional properties), create a similar list with those. We will refer to this optional list as the negative set.
3. Build the model files. The minimum required parameters are:
   $> ROCker build -P positive.txt -o prep
   Where positive.txt is the set from step 1, and prep is the base name for the output files. You can also pass the negative set from step 2 using -N (or -n). For additional supported options, execute ROCker build -h. This is by far the most computationally expensive step, so you might want to consider using multiple threads (-t) or even re-using files in case the run fails (--reuse-files and --nocleanup). Also, consider setting the simulated read length to match that of your metagenomes (-l).
4. Compile the model. The minimum required parameters are:
   $> ROCker compile -a prep.aln -b prep.blast -k model.rocker
   Where prep.aln is the alignment generated in step 3 (manual curation is strongly encouraged), prep.blast is the reference BLAST generated in step 3, and model.rocker is the model to compile.
5. Register your model (optional). If you would like to share your model with the community, please contact us. We'll need the final ROCker model and the reference BLAST, and will add your model to our curated list.
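Before running the build, it can help to tidy the ID lists from steps 1 and 2. The sketch below is not part of ROCker: clean_id_list and shared_ids are hypothetical helper names, and treating an ID that appears in both the positive and the negative set as a mistake is our assumption here.

```shell
# Hypothetical helpers for preparing the ID lists passed to `ROCker build`.

# Print the IDs in FILE one per line: strip carriage returns and surrounding
# blanks, drop empty lines, and keep only the first occurrence of each ID.
clean_id_list() {
  tr -d '\r' < "$1" |
    awk '{ gsub(/^[ \t]+|[ \t]+$/, "") } NF && !seen[$0]++'
}

# Print every ID that occurs in both cleaned lists (POSITIVE, then NEGATIVE).
# Prints nothing when the two sets are disjoint.
shared_ids() {
  awk 'NR == FNR { pos[$0] = 1; next } $0 in pos' "$1" "$2"
}
```

For example, clean_id_list raw_positive.txt > positive.txt produces a file suitable for -P (raw_positive.txt is a placeholder name), and shared_ids positive.txt negative.txt lists any identifiers you may want to remove from one of the two sets.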