A gene predictor developed on Loxodes magnus in the summer-semester of 2020 as part of the master-thesis of David Emanuel Vetter (Thesis-title: "Prediction of genes in genomes with ambiguous genetic codes"). Heavily based on Augustus.
Naming: POGIGWASC (IPA: [ˈpɔ.dʒɪdʒ.wəskʰ]) stands for "Prediction of genes in genomes with ambiguous stop codons" -- the deviation from the thesis-name is due to the program only being specifically designed to handle an ambiguous UGA-codon. Its generalizeability to other cases of genetic code ambiguity is unknown.
Prerequisites: install Maven
- download the code as a zip-file (from github) and extract the contents to some directory
- navigate to that directory (more precisely: navigate to the directory
masterthesis-master, containing the pom.xml) and run the following maven command:mvn package appassembler:assemble(this should print quite a bunch of information into the console; including (green) text informing about the number of tests run, and the number of failures encountered -- there should not be any Failures or errors; at the end, 'BUILD SUCCESS' should be printed to the console) - from the same directory, run
target/appassembler/bin/ghmm-predict: this should print the standard help-text
To move the compiled program, copy/move the entire appassembler-directory (which can be renamed) -- it does not suffice to copy/move just the generated target/appassembler/bin/ghmm-predict file
Prerequisites: install Eclipse
- clone the git-repository with eclipse (or directly with git)
- right-click the pom-xml and choose
Run As>2 Maven build..., then enter e.g.package appassembler:assemblefor the goals and run This should produce an output as described above - Rightclick
de.vetter.pogigwasc/App.javaand run as java application. This should give the usual help-text (set command-line args via Run>Run Configurations)
Directly in src/main/java/de/vetter/pogigwasc, most relevant classes for gene-prediction are found, in particular, the main-class App with the main-method that defines what happens, when the program is run (cf. pom.xml).
ModelParametersis used to load and query the model-parameters from an external fileViterbiandViterbiSeedimplement the viterbi-algorithmGHMMimplements GHMMsLoxodesMagnusGHMMholds the exact model used for Loxodes magnus (and extendsGHMM)LoxodesMagnusIntronlessholds the model used for intronless predictions in Loxodes magnusPairandGFFFeatureare very small and trivial (implementing a pair, and being an enum for writing GFF3-files)- The folder/subpackage
states/contains the implementations of the states: These take care of emission probabilities and enumerating valid emission lengths
The Classes are documented in more detail in their respective files. Tests are found in src/test/java/de/vetter/pogigwasc, where running AllTests runs all tests.