OSCAR (Open Source Chemistry Analysis Routines) is an open-source, extensible system for the automated annotation of chemistry in scientific articles. It can be used to identify chemical names, reaction names, ontology terms, enzymes, chemical prefixes and adjectives, and chemical data such as state, yield, IR, NMR and mass spectra, and elemental analyses. In addition, where possible, any chemical names detected will be annotated either with structures derived by lookup or by name-to-structure parsing using OPSIN, or with identifiers from the ChEBI ('Chemical Entities of Biological Interest') ontology.
OSCAR has been under development since 2002. The current version, OSCAR4, focuses on providing a core library that facilitates integration with other tools. Its simple-to-use API is modularised to promote extension into other domains, and it can be used within workflow systems like Taverna and U-Compare.
OSCAR is developed by the Murray-Rust research group at the Unilever Centre for Molecular Science Informatics, University of Cambridge. The corresponding publication can be found here and the authors would appreciate it if this is cited in any work that makes use of the code.
The following code will identify chemical named entities in text, and output a list of them together with their Standard InChI, when available.
String s = "....";
Oscar oscar = new Oscar();
List<ResolvedNamedEntity> entities = oscar.findAndResolveNamedEntities(s);
for (ResolvedNamedEntity ne : entities) {
    System.out.println(ne.getSurface());
    ChemicalStructure stdInchi = ne.getFirstChemicalStructure(FormatType.STD_INCHI);
    if (stdInchi != null) {
        System.out.println(stdInchi);
    }
    System.out.println();
}
To release a new version to Maven Central:
- Create a GPG key
gpg --full-generate-key --pinentry-mode=loopback
Note: I think the key must be RSA, with the largest key size you can create. Remember to protect it with a password.
- Upload it to http://keyserver.ubuntu.com/
gpg --armor --export <your-key-email-address>
Take the output from the above command and paste it into the key submission form at that URL.
- Create an account on https://central.sonatype.com/
- Log in and make sure you have access to the Namespace you want to deploy to: https://central.sonatype.com/publishing/namespaces
For this repo, the namespace is uk.ac.cam.ch.wwmm. If you do not have access, you will need to request it from someone who does.
- You will need to create a token for deployment via https://central.sonatype.com/account. This needs to be pasted into your ~/.m2/settings.xml, e.g.:
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
                              https://maven.apache.org/xsd/settings-1.0.0.xsd">
  <servers>
    <server>
      <id>central</id>
      <username>foo</username>
      <password>bar</password>
    </server>
  </servers>
</settings>
- Build, package and sign (note: this assumes you have an SSH key set up for access to GitHub):
mvn -Dusername=git release:prepare -DautoVersionSubmodules=true -DreleaseVersion=5.3.0 -DdevelopmentVersion=5.4-SNAPSHOT
- Set the tag label to 5.3.0 when prompted
- Enter your GPG password
- Upload the release to central.sonatype.com
mvn -Psonatype-oss-release release:perform -DconnectionUrl=scm:git:https://github.com/BlueObelisk/oscar4 -Dtag=5.3.0
- Enter your GPG password
- Log into https://central.sonatype.com/publishing/deployments. The deployment should be listed there, pending publication; if everything is green, hit Publish.
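Before starting the release steps above, it can be useful to confirm that ~/.m2/settings.xml really contains a server entry with the id central and non-empty credentials, since a missing entry usually only shows up late in the process. The following is a minimal sketch of such a check using only the Java standard library. The class name SettingsCheck and the method hasServerCredentials are hypothetical helpers introduced for illustration; they are not part of OSCAR or Maven, and the check only confirms that the fields are present, not that the token is valid.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of OSCAR or Maven): checks a Maven
// settings.xml for a <server> entry with a given id and credentials.
class SettingsCheck {

    /** Returns true if a <server> with the given id has non-empty username and password. */
    public static boolean hasServerCredentials(InputStream settingsXml, String serverId)
            throws Exception {
        // The default factory is not namespace-aware, so elements are matched
        // by their plain tag names as written in the file.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(settingsXml);
        NodeList servers = doc.getElementsByTagName("server");
        for (int i = 0; i < servers.getLength(); i++) {
            Element server = (Element) servers.item(i);
            if (serverId.equals(childText(server, "id"))
                    && !childText(server, "username").isEmpty()
                    && !childText(server, "password").isEmpty()) {
                return true;
            }
        }
        return false;
    }

    /** Trimmed text of the first descendant element with the given name, or "" if absent. */
    private static String childText(Element parent, String name) {
        NodeList nodes = parent.getElementsByTagName(name);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        Path settings = Paths.get(System.getProperty("user.home"), ".m2", "settings.xml");
        if (!Files.isRegularFile(settings)) {
            System.out.println("No settings.xml found at " + settings);
            return;
        }
        try (InputStream in = Files.newInputStream(settings)) {
            System.out.println(hasServerCredentials(in, "central")
                    ? "central server entry found"
                    : "central server entry missing or incomplete");
        }
    }
}
```

The id passed to the check is central because that is the server id used in the settings.xml example above; a different id in your own configuration would need to be passed instead.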