This repository hosts machine learning code and discussion (see Issues) for Project Cognoma.
The production notebook that is served to website users can be found in the ml-workers repository. This repository will be used for continued data exploration and new modeling approaches.
The following notebooks implement the primary machine learning workflow for Cognoma:
1.download.ipynb: downloads the cancer datasets.
2.mutation-classifier.ipynb: builds a classifier for mutation in a given gene.
3.pathway-classifier.ipynb: builds a classifier for mutation in any gene for a given pathway.
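
As a rough illustration, the three notebooks could also be executed headlessly in sequence instead of being opened one by one. The sketch below assumes the notebooks are meant to run in their numbered order and that jupyter nbconvert is available in the active environment; the helper names (build_execute_command, run_workflow) are hypothetical and not part of this repository. By default it only returns the commands without running them.

```python
import subprocess
from typing import List, Sequence

# Notebook filenames as listed above, assumed to run in numbered order.
NOTEBOOKS = (
    "1.download.ipynb",
    "2.mutation-classifier.ipynb",
    "3.pathway-classifier.ipynb",
)


def build_execute_command(notebook: str) -> List[str]:
    """Return an nbconvert command that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_workflow(notebooks: Sequence[str] = NOTEBOOKS,
                 dry_run: bool = True) -> List[List[str]]:
    """Build one command per notebook; run them only when dry_run is False."""
    commands = [build_execute_command(nb) for nb in notebooks]
    if not dry_run:
        for command in commands:
            # check=True stops the sequence at the first failing notebook.
            subprocess.run(command, check=True)
    return commands
```

Calling run_workflow() lists the commands that would be run; run_workflow(dry_run=False) executes them from the root directory of this repository.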
If you've modified a notebook and are submitting a pull request, then export the notebooks to scripts:
jupyter nbconvert --to=script --FilesWriter.build_directory=scripts *.ipynb

This repository uses conda to manage its environment and install packages.
If you don't have conda installed on your system, you can download it here.
You can install the Python 2 or 3 version of Miniconda (or Anaconda), which determines the Python version of your root environment.
Because this project uses a dedicated environment named cognoma-machine-learning, whose explicit dependencies are specified in environment.yml, the Python version of your root environment does not matter.
With conda, you can create the cognoma-machine-learning environment by running the following from the root directory of this repository:
# Create or overwrite the cognoma-machine-learning conda environment
conda env create --file environment.yml

If environment.yml has changed since you created the environment, run the following update command:
conda env update --file environment.yml

Activate the environment by running source activate cognoma-machine-learning on Linux or OS X, or activate cognoma-machine-learning on Windows.
Once this environment is active in a terminal, run jupyter notebook to start a notebook server.
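
The choice between the create and update commands above can be sketched in a few lines of Python. This is only an illustration, not a script shipped with this repository: the function names are hypothetical, and it assumes conda is on your PATH and that conda env list --json reports environment paths under an "envs" key.

```python
import json
import os
import subprocess
from typing import Iterable, List

ENV_NAME = "cognoma-machine-learning"
ENV_FILE = "environment.yml"


def environment_exists(env_paths: Iterable[str], name: str = ENV_NAME) -> bool:
    """Return True if any conda environment path ends in the given name."""
    return any(
        os.path.basename(os.path.normpath(path)) == name for path in env_paths
    )


def choose_conda_command(env_paths: Iterable[str],
                         name: str = ENV_NAME,
                         env_file: str = ENV_FILE) -> List[str]:
    """Pick 'update' if the environment already exists, otherwise 'create'."""
    action = "update" if environment_exists(env_paths, name) else "create"
    return ["conda", "env", action, "--file", env_file]


def list_conda_env_paths() -> List[str]:
    """Ask conda for its environment paths (requires conda on PATH)."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)["envs"]


def sync_environment() -> None:
    """Create or update the project environment from environment.yml."""
    command = choose_conda_command(list_conda_env_paths())
    subprocess.run(command, check=True)
```

Running sync_environment() from the root directory of this repository would create the environment on first use and update it afterwards. Activating the environment and starting jupyter notebook remain manual steps.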