diff --git a/README.md b/README.md
index 3e18418..0b99bab 100755
--- a/README.md
+++ b/README.md
@@ -1,13 +1,10 @@
 ![csa-atrophy](https://github.com/sct-pipeline/csa-atrophy/blob/master/csa_atrophy_scheme3.png)
 
+[![Documentation Status](https://readthedocs.org/projects/csa-atrophy/badge/?version=latest)](https://csa-atrophy.readthedocs.io/en/latest/)
+
 # csa-atrophy
 
-Evaluate the sensitivity of atrophy detection with SCT. The algorithm works as follows:
-- Consider subject I --> sI
-- Applies a rescaling on the native image (e.g. 1, 0.95, 0.8) --> rX
-- Applies random affine transformation --> tY
-- Segment the cord
-- Compute CSA --> CSA(sI, rX, tY)
+The csa-atrophy framework evaluates the robustness and sensitivity of an automated analysis pipeline for detecting spinal cord (SC) atrophy.
 
 # How to run
 
@@ -45,50 +42,4 @@ To output statistics, run in Dataset
 csa_rescale_stat -i csa_atrophy_results/results -o csa_atrophy_results -config config_script.yml -fig
 ~~~
 
-# Quality Control
-
-After running the analysis, check your Quality Control (QC) report by opening the file qc/index.html. Use the 
-“Search” feature of the QC report to quickly jump to segmentations or labeling results. If you spot issues 
-(wrong labeling), add their filenames in the 'config_correction.yml' file 
-(see https://spine-generic.rtfd.io/en/latest/analysis-pipeline.html for further indications). Then, manually create 
-labels in the cord at the level of inter-vertebral discs C1-C2, C2-C3, ..., C4-C5 with the command:
-~~~
-manual_correction -config config_correction.yml -path-in csa_atrophy_results/data_processed -path-out PATH_DATA
-~~~
-The bash script outputs all manual labelings to the derivatives directory in the dataset path defined in `path_data`.
-It is now possible to re-run the whole process. With the command below labeling will use the manual corrections that
-are present in the derivatives/ folder of the dataset, otherwise labeling will be done automatically.
-~~~
-sct_run_batch -config config_sct_run_batch.yml
-~~~
-
-# Statistics
-
-After everything is done, compute stats:
-Per-subject stat: Panda dataframe `df_sub`:
-- intra-subject MEAN: MEAN[CSA(sI, rX, :)] --> MEAN_intra(sI, rX): `df_sub['mean']`
-- intra-subject STD: STD[CSA(sI, rX, :)] --> STD_intra(sI, rX): `df_sub['std']`
-- intra-subject COV: STD[CSA(sI, rX, :)] / MEAN[CSA(sI, rX, :)] --> COV_intra(sI, rX): `df_sub['cov']`
-- rescale_estimated_subject MEAN: MEAN[CSA(sI, rX, :) / CSA(sI, 1, :)] --> MEAN_rescale_estimated_subject(sI, rX): `df_sub['rescale_estimated']`
-- intra-subject error MEAN: MEAN[CSA(sI, rX, :)] - (rX^2 * MEAN[CSA(sI, 1, :)]) --> MEAN_error_intra(sI, rX): `df_sub['error']`
-- intra-subject error in percentage MEAN: [MEAN[CSA(sI, rX, :)] - (rX^2 * MEAN[CSA(sI, 1, :)])] / MEAN[CSA(sI, rX, :)] --> MEAN_error_intra_perc(sI, rX): `df_sub['perc_error']`
-
-Across-subject stats: Panda dataframe `df_rescale`
-- intra-subject STD: MEAN[STD_intra(:, rX)] --> STD_intra(rX): `df_rescale['std_intra']`
-- intra-subject COV: MEAN[COV_intra_sub(:, rX)] --> COV_intra(rX): `df_rescale['cov']`
-- inter-subject STD: STD[MEAN_intra(:, rX)] --> STD_inter(rX): `df_rescale['std_inter']`
-- rescale_estimated (across subjects) MEAN: MEAN[MEAN_rescale_estimated_subject(:, rX)] --> MEAN_rescale_estimated(rX): `df_rescale['mean_rescale_estimated']`
-- rescale_estimated (across subjects) STD: STD[MEAN_rescale_estimated_subject(:, rX)] --> STD_rescale_estimated(rX): `df_rescale['std_rescale_estimated']`
-- error in percentage (across subjects) MEAN: MEAN[MEAN_error_intra(:, rX)]
-- error in percentage (across subjects) STD: STD[MEAN_error_intra(:, rX)]
-
-Power analysis:
-- sample size: [(z(uncertainty) + z(power))^2 * (2 * STD[MEAN(:, rX)]^2)] / [MEAN[CSA(sI, 1, :)] - MEAN[CSA(sI, rX, :)]] 
-
-Plot results:
-- STD_intersub
-- Mean and STD inter-subject error percentage in function of rescaling
-- sample size: minimum number of patients to detect an atrophy of X with Y% power and Z% uncertainty
-- CSA values boxplot in function of rescaling
-- Error values boxplot in function of rescaling
 
diff --git a/docs/Makefile b/docs/Makefile
new file mode 100644
index 0000000..2326b82
--- /dev/null
+++ b/docs/Makefile
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line, and also
+# from the environment for the first two.
+SPHINXOPTS    =
+SPHINXBUILD   = python -msphinx
+SOURCEDIR     = .
+BUILDDIR      = _build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
diff --git a/docs/conf.py b/docs/conf.py
new file mode 100644
index 0000000..041c735
--- /dev/null
+++ b/docs/conf.py
@@ -0,0 +1,69 @@
+# Configuration file for the Sphinx documentation builder.
+#
+# This file only contains a selection of the most common options. For a full
+# list see the documentation:
+# https://www.sphinx-doc.org/en/master/usage/configuration.html
+
+# -- Path setup --------------------------------------------------------------
+
+# If extensions (or modules to document with autodoc) are in another directory,
+# add these directories to sys.path here. If the directory is relative to the
+# documentation root, use os.path.abspath to make it absolute, like shown here.
+#
+import os
+import sys
+# conf.py lives in docs/, so the repository root is one level up.
+sys.path.insert(0, os.path.abspath('..'))
+
+
+# -- Project information -----------------------------------------------------
+
+project = 'csa-atrophy'
+copyright = '2021, Paul Bautin'
+author = 'Paul Bautin'
+
+# The full version, including alpha/beta/rc tags
+release = 'v1.0'
+
+
+# -- General configuration ---------------------------------------------------
+
+# Add any Sphinx extension module names here, as strings. They can be
+# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
+# ones.
+extensions = [
+    'sphinx.ext.autodoc',
+    'sphinx.ext.napoleon',
+    'sphinx.ext.mathjax',
+    'sphinx.ext.viewcode',
+    'sphinx.ext.autosummary',
+    'sphinx.ext.doctest',
+    'sphinx.ext.inheritance_diagram',
+    'sphinx.ext.intersphinx',
+    'sphinx.ext.autosectionlabel',
+    'sphinx-jsonschema',
+]
+
+# Add any paths that contain templates here, relative to this directory.
+templates_path = ['_templates']
+
+# List of patterns, relative to source directory, that match files and
+# directories to ignore when looking for source files.
+# This pattern also affects html_static_path and html_extra_path.
+exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
+
+
+# -- Options for HTML output -------------------------------------------------
+
+# The theme to use for HTML and HTML Help pages.  See the documentation for
+# a list of builtin themes.
+#
+html_theme = 'sphinx_rtd_theme'
+
+# Add any paths that contain custom static files (such as style sheets) here,
+# relative to this directory. They are copied after the builtin static files,
+# so a file named "default.css" will overwrite the builtin "default.css".
+html_static_path = ['_static']
+
+# The master toctree document.
+master_doc = 'index'
\ No newline at end of file
diff --git a/docs/csa_atrophy_scheme3.png b/docs/csa_atrophy_scheme3.png
new file mode 100644
index 0000000..20e0f96
Binary files /dev/null and b/docs/csa_atrophy_scheme3.png differ
diff --git a/docs/how_to_run.rst b/docs/how_to_run.rst
new file mode 100644
index 0000000..63dde39
--- /dev/null
+++ b/docs/how_to_run.rst
@@ -0,0 +1,45 @@
+How to run?
+============
+
+This code has been tested using Python 3.7.
+
+Download (or git clone) this repository:
+
+.. code-block:: bash
+
+    git clone https://github.com/sct-pipeline/csa-atrophy.git
+    cd csa-atrophy
+
+
+Installation:
+csa-atrophy requires specific Python packages for computing statistics and processing images. If they are not already present in your Python environment, they are installed automatically by running:
+
+.. code-block:: bash
+
+    pip install -e .
+
+Download the results file from the Spine Generic Multi-Subject dataset: https://github.com/spine-generic/data-multi-subject/releases/tag/r20201130
+
+Edit the file ``config_sct_run_batch.yml`` according to your setup. Notable flags include:
+
+* ``path_data``: If you downloaded the spine-generic data at another location, make sure to update the path;
+* ``include_list``: If you only want to run the script on a few subjects, list them here. Example:
+  ``include_list: ['sub-unf04', 'sub-unf05']``
+
+See ``sct_run_batch -h`` for the available options.
+
+Run the analysis:
+
+.. code-block:: bash
+
+    sct_run_batch -config config_sct_run_batch.yml
+
+
+Note: the analysis can be restricted to specific subjects with the flag ``-include`` and run in parallel with the flag ``-jobs``.
+
+To output statistics, run the following command in the dataset directory:
+
+.. code-block:: bash
+
+    csa_rescale_stat -i csa_atrophy_results/results -o csa_atrophy_results -config config_script.yml -fig
+
diff --git a/docs/index.rst b/docs/index.rst
new file mode 100644
index 0000000..e24786b
--- /dev/null
+++ b/docs/index.rst
@@ -0,0 +1,29 @@
+CSA-atrophy
+============
+CSA-atrophy evaluates the robustness and sensitivity of an automated analysis pipeline for detecting spinal cord (SC) atrophy. The framework rescales each image to simulate atrophy and applies a random rigid transformation to mimic subject repositioning (scan-rescan). This makes it possible to quantify the accuracy and precision of the estimated cross-sectional area (CSA) across various degrees of simulated atrophy. Statistics derived from these experiments, such as power analyses and minimum sample sizes, are presented in the Statistics section.
+
+.. image:: csa_atrophy_scheme3.png
+
+.. toctree::
+    :hidden:
+    :maxdepth: 1
+    :caption: How to run
+
+    how_to_run.rst
+
+.. toctree::
+    :hidden:
+    :maxdepth: 1
+    :caption: Statistics
+
+    statistics/introduction.rst
+    statistics/intra_subject.rst
+    statistics/inter_subject.rst
+    statistics/sample_size.rst
+
+.. toctree::
+    :hidden:
+    :maxdepth: 1
+    :caption: Quality Control
+
+    statistics/quality_control.rst
\ No newline at end of file
diff --git a/docs/make.bat b/docs/make.bat
new file mode 100644
index 0000000..2119f51
--- /dev/null
+++ b/docs/make.bat
@@ -0,0 +1,35 @@
+@ECHO OFF
+
+pushd %~dp0
+
+REM Command file for Sphinx documentation
+
+if "%SPHINXBUILD%" == "" (
+	set SPHINXBUILD=sphinx-build
+)
+set SOURCEDIR=.
+set BUILDDIR=_build
+
+if "%1" == "" goto help
+
+%SPHINXBUILD% >NUL 2>NUL
+if errorlevel 9009 (
+	echo.
+	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
+	echo.installed, then set the SPHINXBUILD environment variable to point
+	echo.to the full path of the 'sphinx-build' executable. Alternatively you
+	echo.may add the Sphinx directory to PATH.
+	echo.
+	echo.If you don't have Sphinx installed, grab it from
+	echo.http://sphinx-doc.org/
+	exit /b 1
+)
+
+%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+goto end
+
+:help
+%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
+
+:end
+popd
diff --git a/docs/statistics/inter_subject.rst b/docs/statistics/inter_subject.rst
new file mode 100644
index 0000000..e489c39
--- /dev/null
+++ b/docs/statistics/inter_subject.rst
@@ -0,0 +1,88 @@
+Inter-subject
+=============
+
+Inter-subject statistics. These statistics are gathered per scaling in the pandas DataFrame ``df_rescale``.
+
+Mean intra-subject SD
+"""""""""""""""""""""
+
+Intra-subject SD averaged across subjects.
+
+:math:`\mu_s \{ \sigma_t \{ CSA_{rX} \} \}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 422-439
+   :emphasize-lines: 9
+   
+Mean intra-subject COV
+""""""""""""""""""""""
+
+Intra-subject COV averaged across subjects.
+
+:math:`\mu_s \{ COV_t \{ CSA_{rX} \} \}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 422-439
+   :emphasize-lines: 10
+   
+Inter-subject SD
+""""""""""""""""
+
+SD of intra-subject CSA across subjects.
+
+:math:`\sigma_s \{ \mu_t \{ CSA_{rX} \} \}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 422-439
+   :emphasize-lines: 11
+   
+Mean rescale estimated (RE)
+"""""""""""""""""""""""""""
+
+``rescale_estimated`` averaged across subjects.
+
+:math:`\mu_s \left \{ \mu_t \left\{ \frac{CSA_{rX}}{CSA_{r1}} \right\}\right\}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 422-439
+   :emphasize-lines: 12
+   
+SD of rescale estimated
+"""""""""""""""""""""""
+
+SD of ``rescale_estimated`` across subjects.
+
+:math:`\sigma_s \left\{\mu_t \left\{ \frac{CSA_{rX}}{CSA_{r1}} \right\}\right\}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 422-439
+   :emphasize-lines: 13
+   
+Mean error
+""""""""""
+
+Error on the intra-subject CSA estimation averaged across subjects.
+
+:math:`\mu_s \{ \mu_t \{ CSA_{rX} \} - \mu_t \{ CSA_{r1} \cdot (rX)^2 \} \}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 422-439
+   :emphasize-lines: 14
+   
+SD of error
+"""""""""""
+
+SD of error on intra-subject CSA estimation across subjects.
+
+:math:`\sigma_s \{ \mu_t \{ CSA_{rX} \} - \mu_t \{ CSA_{r1} \cdot (rX)^2 \} \}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 422-439
+   :emphasize-lines: 15
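
The across-subject aggregation described above can be illustrated with a minimal standard-library sketch. It is not the code of `csa_rescale_stat.py` (which operates on the pandas DataFrame `df_rescale`): the subject values and the helper name `inter_subject_stats` are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

# Hypothetical per-subject results at a single scaling rX:
# intra-subject mean CSA, intra-subject SD and rescale estimation.
per_subject = {
    "sub-01": {"mean": 64.0, "std": 1.0, "rescale_estimated": 0.80},
    "sub-02": {"mean": 70.0, "std": 2.0, "rescale_estimated": 0.82},
    "sub-03": {"mean": 76.0, "std": 3.0, "rescale_estimated": 0.81},
}


def inter_subject_stats(per_subject):
    """Aggregate per-subject statistics across subjects (illustrative helper)."""
    rows = list(per_subject.values())
    return {
        # mean over subjects of the intra-subject SD
        "std_intra": mean([r["std"] for r in rows]),
        # mean over subjects of the intra-subject COV (SD / mean)
        "cov": mean([r["std"] / r["mean"] for r in rows]),
        # SD over subjects of the intra-subject mean CSA (sample SD, assumption)
        "std_inter": stdev([r["mean"] for r in rows]),
        # mean and SD over subjects of the rescale estimation
        "mean_rescale_estimated": mean([r["rescale_estimated"] for r in rows]),
        "std_rescale_estimated": stdev([r["rescale_estimated"] for r in rows]),
    }


summary = inter_subject_stats(per_subject)
```

The dictionary keys mirror the `df_rescale` column names listed in the previous README; one such summary would be computed for each scaling factor.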
diff --git a/docs/statistics/intra_subject.rst b/docs/statistics/intra_subject.rst
new file mode 100644
index 0000000..f001d9f
--- /dev/null
+++ b/docs/statistics/intra_subject.rst
@@ -0,0 +1,65 @@
+Intra-subject
+=============
+
+Intra-subject statistics. These statistics are gathered per rescaling and per subject in the pandas DataFrame ``df_sub``:
+
+Intra-subject CSA (CSA estimation)
+""""""""""""""""""""""""""""""""""
+
+CSA averaged across transformations.
+
+:math:`\mu_t \{{CSA_{sI, rX}}\}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 401-419
+   :emphasize-lines: 10
+    
+Intra-subject SD
+""""""""""""""""
+
+SD of CSA across transformations.
+
+:math:`\sigma_t \{CSA_{sI,rX}\}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 401-419
+   :emphasize-lines: 11
+    
+    
+Intra-subject COV
+""""""""""""""""""
+
+COV of CSA across transformations.
+
+:math:`COV_t \{CSA_{sI,rX}\}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 401-419
+   :emphasize-lines: 12
+    
+Rescale estimation (RE)
+"""""""""""""""""""""""
+
+Ratio of the atrophied CSA to the un-rescaled CSA, averaged across transformations (gives an estimation of the applied scaling).
+
+:math:`\mu_t \left\{ \frac{CSA_{sI, rX}}{CSA_{sI, r1}} \right\}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 401-419
+   :emphasize-lines: 14 
+    
+Error
+"""""
+
+Mean absolute error on CSA estimation averaged across transformations.
+
+:math:`\mu_t \{{CSA_{sI, rX}}\} - \mu_t\{CSA_{sI, r1} \cdot (rX)^2 \}`
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 401-419
+   :emphasize-lines: 15
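
The intra-subject statistics above can be illustrated with a minimal standard-library sketch. It does not reproduce `csa_rescale_stat.py` (which works on the pandas DataFrame `df_sub`): the CSA values and the helper name `intra_subject_stats` are hypothetical, the sample standard deviation is an assumption, and the error is computed as the signed difference written in the formula.

```python
from statistics import mean, stdev


def intra_subject_stats(csa_rx, csa_r1, rescale):
    """Summarize the CSA of one subject at one scaling factor.

    csa_rx and csa_r1 hold the CSA for the same transformations tY,
    at scaling rX and at scaling 1 respectively (hypothetical inputs).
    """
    mu = mean(csa_rx)
    sd = stdev(csa_rx)  # sample SD (assumption)
    return {
        "mean": mu,
        "std": sd,
        "cov": sd / mu,
        # mean over transformations of CSA_rX / CSA_r1
        "rescale_estimated": mean([x / y for x, y in zip(csa_rx, csa_r1)]),
        # mean CSA_rX minus the mean of CSA_r1 * rX^2
        "error": mu - mean([y * rescale ** 2 for y in csa_r1]),
    }


# Hypothetical CSA values (mm^2) for three transformations.
csa_r1 = [80.0, 82.0, 78.0]
csa_rx = [64.8, 66.42, 63.18]  # csa_r1 multiplied by 0.9 ** 2
stats = intra_subject_stats(csa_rx, csa_r1, rescale=0.9)
```

Because an area scales with the square of an isotropic scaling factor, a perfect measurement at `rescale=0.9` gives a CSA ratio close to 0.81 and an error close to zero, which is what this toy input produces.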
diff --git a/docs/statistics/introduction.rst b/docs/statistics/introduction.rst
new file mode 100644
index 0000000..e45a7c7
--- /dev/null
+++ b/docs/statistics/introduction.rst
@@ -0,0 +1,7 @@
+Introduction
+=============
+This section documents the statistics used to evaluate the sensitivity of atrophy detection with SCT. The average CSA of each image is indexed as :math:`CSA(sI, rX, tY)` where:
+
+- :math:`(sI)` corresponds to subject :math:`I`
+- :math:`(rX)` corresponds to the applied scaling factor :math:`X` on the native image (e.g. X=1, X=0.95, X=0.8)
+- :math:`(tY)` corresponds to the applied random affine transformation :math:`Y` on the native image
diff --git a/docs/statistics/quality_control.rst b/docs/statistics/quality_control.rst
new file mode 100644
index 0000000..9ba8b5c
--- /dev/null
+++ b/docs/statistics/quality_control.rst
@@ -0,0 +1,20 @@
+Quality control
+================
+
+After running the analysis, check your Quality Control (QC) report by opening the file ``qc/index.html``. Use the
+“Search” feature of the QC report to quickly jump to segmentations or labeling results. If you spot issues
+(wrong labeling), add their filenames to the ``config_correction.yml`` file
+(see https://spine-generic.rtfd.io/en/latest/analysis-pipeline.html for further details). Then, manually create
+labels in the cord at the level of inter-vertebral discs C1-C2, C2-C3, ..., C4-C5 with the command:
+
+.. code-block:: bash
+
+    manual_correction -config config_correction.yml -path-in csa_atrophy_results/data_processed -path-out PATH_DATA
+
+The script outputs all manual labels to the derivatives directory in the dataset path defined in ``path_data``.
+It is now possible to re-run the whole process. With the command below, labeling uses the manual corrections
+present in the ``derivatives/`` folder of the dataset; where none exist, labeling is done automatically.
+
+.. code-block:: bash
+
+    sct_run_batch -config config_sct_run_batch.yml
diff --git a/docs/statistics/sample_size.rst b/docs/statistics/sample_size.rst
new file mode 100644
index 0000000..0a635fb
--- /dev/null
+++ b/docs/statistics/sample_size.rst
@@ -0,0 +1,33 @@
+Power analysis
+==============
+   
+Between-group minimum sample size
+"""""""""""""""""""""""""""""""""
+
+The minimum sample size (number of subjects per group, or study arm) necessary to detect an atrophy between groups was computed based on a two-sample (unpaired) bilateral t-test using the following formula (Wang and Ji 2020; Wittes 2002):
+
+:math:`n_{unpaired} = \frac{(z_{\alpha/2} + z_{\beta})^2(\sigma_{(:,r1)}+\sigma_{(:,rX)})^2}{\Delta_{sub}^2}`
+
+Where :math:`n_{unpaired}` is the minimum sample size required to differentiate between groups with a given power (:math:`z_{\beta}` is the power z-score, e.g. 80% power gives :math:`\beta = 0.2` and :math:`z_{\beta} = -0.84`) and level of significance (:math:`z_{\alpha/2}` is the significance level z-score, e.g. a 5% level of significance gives :math:`\alpha = 0.05` and :math:`z_{\alpha/2} = -1.96`). :math:`\sigma_{(:,r1)}` and :math:`\sigma_{(:,rX)}` are the inter-subject standard deviations of the mean CSA (the mean being taken across Monte Carlo samples) without rescaling and at scaling :math:`rX`, and :math:`\Delta_{sub}` is the difference of the mean CSA between the groups.
+
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 352-363
+   :emphasize-lines: 5-7
+
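As a rough illustration of the between-group formula above, the sketch below evaluates it with the Python standard library. It is not the implementation in `csa_rescale_stat.py`: the function name `n_unpaired`, its parameters and the numeric inputs are hypothetical. Positive z-scores are used; since both scores share the same sign, the squared sum is identical to the one obtained with the negative values quoted above.

```python
from math import ceil
from statistics import NormalDist


def n_unpaired(sd_r1, sd_rx, delta, alpha=0.05, power=0.80):
    """Evaluate the between-group formula as written in this page."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance z-score
    z_beta = NormalDist().inv_cdf(power)           # power z-score
    return (z_alpha + z_beta) ** 2 * (sd_r1 + sd_rx) ** 2 / delta ** 2


# Hypothetical inputs: inter-subject SD of 8 mm^2 in both groups and a
# difference of 4 mm^2 between the group means.
n = n_unpaired(sd_r1=8.0, sd_rx=8.0, delta=4.0)
n_subjects = ceil(n)  # round up to a whole number of subjects per group
```

Rounding up is a design choice of this sketch; a fractional sample size is otherwise not meaningful.
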
+Within-subject minimum sample size
+""""""""""""""""""""""""""""""""""
+
+The minimum sample size necessary to detect an atrophy in a within-subject (repeated-measures) study was computed based on a two-sample bilateral paired t-test using the following formula (Altmann et al. 2009):
+
+:math:`n_{paired} = \frac{(z_{\alpha/2} + z_{\beta})^2(\sigma_{diff})^2}{\Delta_{sub}^2}`
+
+Where :math:`\sigma_{diff}` is the standard deviation, across subjects, of the difference between longitudinal CSA measures and :math:`\Delta_{sub}` is the mean of the difference between longitudinal CSA measures.
+
+.. literalinclude:: ../../csa_rescale_stat.py
+   :language: python
+   :lines: 352-363
+   :emphasize-lines: 9-10
+
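The within-subject formula can be sketched in the same way. Again this is only an illustration under stated assumptions, not the code of `csa_rescale_stat.py`: the function name `n_paired`, the baseline and follow-up CSA values, and the use of the sample standard deviation are all hypothetical.

```python
from statistics import NormalDist, mean, stdev


def n_paired(sd_diff, delta, alpha=0.05, power=0.80):
    """Evaluate the within-subject formula as written in this page."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance z-score
    z_beta = NormalDist().inv_cdf(power)           # power z-score
    return (z_alpha + z_beta) ** 2 * sd_diff ** 2 / delta ** 2


# Hypothetical longitudinal CSA measures (mm^2), one pair per subject.
baseline = [80.0, 82.0, 78.0, 76.0]
follow_up = [76.5, 78.0, 75.5, 72.0]
diffs = [b - f for b, f in zip(baseline, follow_up)]

sd_diff = stdev(diffs)  # SD of the paired differences (sample SD, assumption)
delta = mean(diffs)     # mean of the paired differences
n = n_paired(sd_diff, delta)
```

With differences that are large relative to their spread, as in this toy input, the formula returns a value below one subject; real data with noisier differences yields larger sample sizes.
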
+
diff --git a/requirements.txt b/requirements.txt
index f110068..e7637ef 100755
--- a/requirements.txt
+++ b/requirements.txt
@@ -10,4 +10,7 @@ argparse~=1.4.0
 setuptools~=49.6.0
 PyYAML~=5.3.1
 coloredlogs~=14.0
-
+Sphinx
+sphinx_rtd_theme
+recommonmark
+sphinx-jsonschema