Utkarsh Mahamune, Aldo Jongejan, Antoine van Kampen and Perry D Moerland: Systematic evaluation of robustness to cell type mismatch of deconvolution methods for spatial transcriptomics

EDS-Bioinformatics-Laboratory/Robustness_evaluation_deconv_methods

Systematic evaluation of robustness to cell type mismatch of deconvolution methods for spatial transcriptomics data

Overview

     Sequencing-based spatial transcriptomics approaches (barcode-based, e.g., 10x Visium) preserve spatial information but have limited cellular resolution. Single-cell RNA-sequencing (scRNA-seq) techniques, on the other hand, provide single-cell resolution but lose spatial information because of the tissue dissociation step in the experimental procedure. With these complementary strengths in mind, computational methods have been developed to combine scRNA-seq and spatial transcriptomics data. These approaches use deconvolution to identify cell types and their respective proportions at each location in spatial transcriptomics data with the aid of a scRNA-seq reference dataset. It has been suggested that deconvolution approaches are sensitive to the absence of cell type(s) in the single-cell reference dataset, a problem referred to as cell type mismatch. Here, we systematically evaluated the robustness to cell type mismatch of deconvolution methods tailored for spatial transcriptomics data.

Contents of the current directory

Note that in the complete sFSS, this directory corresponds to the Processing directory.

  • 0_SoftwareEnvironment: This directory lists the software environment specifications used for the various programming languages and/or platforms during the project.

  • 1_Generate_sc_ref_data: The directory comprises scripts, results and settings to generate the integrated scRNA-seq dataset from 2 distinct and complementary scRNA-seq datasets. The final result sc.ref.data.rds is used as a basis to generate various reference dataset versions based on cell type removal scenarios and to generate simulated spatial transcriptomics datasets.

  • 2_Simulating_ST_data: Contains scripts, results and settings to generate the sequencing-based simulated spatial transcriptomics datasets, which vary in the number of cells and cell types present per spatial location (spot). We created three simulated ST datasets using the algorithm developed for the analysis.
        a. ST1: 4-8 cell types and 10-15 cells per spot;
        b. ST2: 1-5 cell types and 10-15 cells per spot;
        c. ST3: 1-5 cell types and 3-7 cells per spot.

  • 3_ST_methods: Contains standalone R/Python scripts, one for each deconvolution method, and shell/batch scripts that execute the R or Python scripts in parallel for multiple instances of a removal scenario, provided the required computational power is available.

    • R-based: CARD, RCTD, SCDC, MuSiC, Seurat, SPOTlight
    • Python-based: cell2location, Stereoscope (GPU recommended)

       The deconvolution results for each removal scenario are available in the respective directory (for instance, the rm1 directory holds the results for removing one cell type from the scRNA-seq reference data). The removal scenarios and the total number of reference datasets in each scenario are listed below.

    • rm0 - removal of no cell types from reference data (1 dataset)
    • rm1 - removal of one cell type from reference data (13 datasets)
    • rm2 - removal of two cell types from reference data (5 datasets)
    • rm3 - removal of three cell types from reference data (5 datasets)
    • rm5 - removal of five cell types from reference data (1 dataset)
    • rm10 - removal of ten cell types from reference data (5 datasets)
    • rm11 - removal of eleven cell types from reference data (5 datasets)
  • 4_Analysis_results: Contains scripts to calculate similarity metrics such as JSD and RMSE, which assess the performance of the deconvolution methods in the various cell type removal scenarios relative to a baseline with no cell type missing. The cell type reassignment metrics quantify how the proportions of the cell types missing from the reference dataset are assigned to other cell types.

  • 5_Post_analysis_work: Contains scripts to generate the final plots of the project; a few plots are included in the manuscript's main text, while the others are included in the supplementary information.

  • Data (only available in sFSS, not on GitHub): Contains pre-processed data downloaded from a public data-sharing platform. One of the datasets was obtained from the group of Lisa van Baarsen and is also available on GEO with GEO accession ID - ####.

  • renv: The project uses the renv functionality and the directory comprises the essential files (activate.R, settings.json) and directories for the renv setup.

  • Renv_setup.R: This script sets up the R environment for the project. Details on how to execute it are given in the file header and comments, and in the 'Installation steps' section further down in this document.

  • renv.lock: R environment lock (metadata) file comprising package details. This file will be used by the Renv_setup.R script to download and install the correct package and its version to recreate the computing environment.

  • .Rprofile: Essential R profile script to set up the correct R profile while using the renv infrastructure.

  • environment.yml: A metadata file comprising Python packages installed in the virtual conda environment.


Installation steps

Running the code requires a Linux OS and the R and Python versions indicated below (RTools is only needed on Windows, see Appendix A).

Linux / Ubuntu distribution

  1. Install Anaconda (easy guide for installation)
  2. Install R version 4.1.2 (download). Instructions on how to install
  3. Set up conda environment with Python version 3.9.7 using the command: conda env create -f environment.yml
  4. Set up the renv infrastructure
    • Execute the Renv_setup.R script from the directory where it resides to initialise the renv infrastructure for the R project (ignore the warning messages), using the command: Rscript Renv_setup.R arg1
      The expected command line argument (arg1) is a (classic) GITHUB_PAT token (more details).
      The required files for the renv setup (renv.lock, .Rprofile, renv/activate.R, renv/settings.json) should already be in your R project directory; if not, ensure they reside in the same directory as the Renv_setup.R file.
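
The renv step above can be wrapped in a small guard so that the setup is not started without a token. This is a minimal sketch, assuming the token is stored in an environment variable named GITHUB_PAT; the function name setup_r_environment and the DRY_RUN switch are hypothetical and not part of the repository.

```shell
# Sketch only: setup_r_environment and DRY_RUN are illustrative names.
# Usage: setup_r_environment   (expects GITHUB_PAT in the environment)
setup_r_environment() {
    if [ -z "${GITHUB_PAT:-}" ]; then
        echo "GITHUB_PAT is not set; export a classic token first." >&2
        return 1
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Print the command without revealing the token itself.
        echo 'Rscript Renv_setup.R "$GITHUB_PAT"'
    else
        # Run from the directory that contains Renv_setup.R.
        Rscript Renv_setup.R "$GITHUB_PAT"
    fi
}
```

With DRY_RUN=0 the function calls Renv_setup.R as described in step 4; by default it only prints the command it would run.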

Note

Due to platform (OS) dependencies, a collaborator can still experience minor discrepancies in the results while using the conda/renv functionality.

Installation steps for Windows OS can be found in Appendix A. Note, however, that on Windows only the R-based deconvolution methods can be executed; the results of the Python-based methods (cell2location, Stereoscope) therefore cannot be reproduced.

How can you reproduce the results in the manuscript?

Warning

The following instructions are for reproducing the results using the command-line terminal/prompt, not RStudio or Jupyter Notebook.

Module 1: Generate single-cell reference datasets

Navigate to "1_Generate_sc_ref_data/Code/" directory

  • Generate a single-cell reference dataset from multiple single-cell datasets, along with the UMAP representations included in the supplementary information of the manuscript, using the commands below:

        Rscript Init_env.R
        Rscript SC_ref_data.R

    Note: this script takes a significant amount of time (~2-3 hours on a machine with 16GB RAM) to execute. Ignore the warnings during the execution.

  • Generate the single-cell reference datasets for all cell type removal scenarios using the command below:

        Rscript SC_ref_data_scenarios.R

    Note: this script requires the single-cell reference dataset generated in the previous step.

Module 2: Generate simulated ST data

Navigate to "2_Simulating_ST_data/Code/" directory

  • Generate the simulated spatial transcriptomics datasets from the single-cell reference data using the commands below; the three datasets vary in the number of cells and cell types present per spatial location:

        Rscript Init_env.R
        Rscript Generate_ST_data.R arg1 arg2 arg3 arg4 arg5

       Note: Generate_ST_data.R script expects five command line arguments in the order mentioned below;
         arg1 - min number of cell types per spot
         arg2 - max number of cell types per spot
         arg3 - min number of cells per spot
         arg4 - max number of cells per spot
         arg5 - index of simulated ST dataset [options: 1, 2, 3 ]

      To reproduce the results in the manuscript, the following command line input generates the first simulated ST dataset:
      Rscript Init_env.R
      Rscript Generate_ST_data.R 4 8 10 15 1
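
The README gives the explicit command only for the first dataset. The sketch below derives the calls for all three datasets from the ST1, ST2 and ST3 ranges listed under "Contents of the current directory"; the argument values for datasets 2 and 3 are therefore inferred, and the helper names and the DRY_RUN switch are hypothetical.

```shell
# Sketch only: parameters for ST2 and ST3 are inferred from the ranges in
# this README (cell types per spot, cells per spot), not from a documented command.
# Columns: min_cell_types max_cell_types min_cells max_cells dataset_index
st_dataset_params() {
    cat <<'EOF'
4 8 10 15 1
1 5 10 15 2
1 5 3 7 3
EOF
}

# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 runs them.
generate_all_st_datasets() {
    st_dataset_params | while read -r min_ct max_ct min_cells max_cells idx; do
        cmd="Rscript Generate_ST_data.R $min_ct $max_ct $min_cells $max_cells $idx"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            # Redirect stdin so the command cannot consume the parameter list.
            $cmd < /dev/null || exit 1
        fi
    done
}

generate_all_st_datasets
```

Run Rscript Init_env.R once beforehand, as in the commands above, and set DRY_RUN=0 to execute the calls instead of printing them.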
    

Module 3: Deconvolution method results

Navigate to "3_ST_methods/Code/" directory

  • The shell scripts execute all deconvolution methods in parallel to predict cell type proportions for the reference datasets of a given removal scenario.

    Shell scripts to execute R and Python-based deconvolution methods:
        Rscript Init_env.R
        ./Execute_R_based_methods.sh arg1 arg2 arg3
        ./Execute_python_based_methods.sh arg1 arg2 arg3

    Please make sure to execute only one script at a time and wait for the results. For every removal scenario, each method produces one result per single-cell reference dataset (for instance, rm1 yields 13 results per method).

    If your machine crashes, please adapt the shell script accordingly (for instance, start fewer jobs in parallel). Also, verify the number of results generated before moving to the next scenario. If a combination of SC & ST datasets fails for a method, please run the standalone script for that pair.
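
The check on the number of results can be scripted. The following is a minimal sketch that counts the files in one result directory and compares the count with the expected number; it assumes that each result is a single file directly inside that directory, which is a hypothetical layout, and check_result_count is an illustrative name.

```shell
# Sketch only: assumes one result file per reference dataset, stored
# directly inside RESULT_DIR (hypothetical layout).
# Usage: check_result_count RESULT_DIR EXPECTED_COUNT
check_result_count() {
    dir="$1"
    expected="$2"
    found="$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')"
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found/$expected results in $dir"
    else
        echo "MISSING: $found/$expected results in $dir" >&2
        return 1
    fi
}
```

For rm1 the expected count per method would be 13. A mismatch is expected for the CARD and Seurat cases listed in the note at the end of this module.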

    Note: The scripts for the R-based and Python-based methods execute each method in parallel for the given number of reference datasets in the given removal scenario; the required command line arguments are:
        arg1 - index of the ST dataset [options: 1, 2, 3]
        arg2 - total number of single-cell reference datasets in the removal scenario [rm0: 1, rm1: 13, rm2: 5, rm3: 5, rm5: 1, rm10: 5, rm11: 5]
        arg3 - removal scenario [options: rm0, rm1, rm2, rm3, rm5, rm10, rm11]

      To reproduce the results in the manuscript, the following provides the command line input for no cell type removal (baseline) & one or more cell type removal scenarios,
      Rscript Init_env.R
      (1) ./Execute_R_based_methods.sh 1 1 rm0
      (2) ./Execute_R_based_methods.sh 1 13 rm1
      (3) ./Execute_R_based_methods.sh 1 5 rm2
      (4) ./Execute_R_based_methods.sh 1 5 rm3
      (5) ./Execute_R_based_methods.sh 1 1 rm5
      (6) ./Execute_R_based_methods.sh 1 5 rm10
      (7) ./Execute_R_based_methods.sh 1 5 rm11
    

      Replace Execute_R_based_methods.sh with Execute_python_based_methods.sh to execute the Python-based methods.
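
The seven calls above can also be generated from a small scenario table. This sketch runs the scenarios strictly one after another, in line with the advice to execute only one script at a time; the function names and the DRY_RUN switch are hypothetical, while the scenario names and dataset counts are taken from this README.

```shell
# Scenario name and number of reference datasets, as listed in this README.
scenario_table() {
    cat <<'EOF'
rm0 1
rm1 13
rm2 5
rm3 5
rm5 1
rm10 5
rm11 5
EOF
}

# Sketch only: run_all_scenarios and DRY_RUN are illustrative names.
# Usage: run_all_scenarios SCRIPT ST_INDEX
run_all_scenarios() {
    script="$1"
    st_index="$2"
    scenario_table | while read -r scenario n_refs; do
        cmd="$script $st_index $n_refs $scenario"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            echo "Running: $cmd" >&2
            # Stop at the first failing scenario; stdin is redirected so the
            # script cannot consume the scenario table.
            $cmd < /dev/null || exit 1
        fi
    done
}

run_all_scenarios ./Execute_R_based_methods.sh 1
```

Passing ./Execute_python_based_methods.sh as the first argument produces the corresponding calls for the Python-based methods.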

    The standalone scripts for each deconvolution method expect the command line arguments as below; notice the different versions of argument arg2.

    arg1 - index of the ST dataset
    arg2 - for R-based methods: index of single-cell reference datasets [varies per removal scenario]
        for Python-based methods: total number of single-cell reference datasets
    arg3 - path to simulated ST datasets directory
    arg4 - path to single-cell reference datasets directory
    arg5 - path to save deconvolution results directory

    Note: While executing cell2location without GPU support, use_gpu argument should be set to FALSE in Cell2Location.py script.

Note

In some isolated cases the deconvolution methods listed below do not generate results, although the scripts themselves still run:
CARD: all 5 instances of the rm11 scenario for all three simulated datasets (ST1, ST2, ST3),
Seurat: 3 instances of the rm10 scenario for ST1,
Seurat: 2 instances of the rm11 scenario for ST1.


Module 4: Analyse deconvolution results

Navigate to "4_Analysis_results/Code/" directory

Important

Estimating performance metrics needs all the deconvolution results generated in Module 3.

  • Estimating performance metrics for all the scenarios of cell type removal.

    • For JSD calculations:

        Rscript Init_env.R
        Rscript Get_JSD_results.R arg1 optional_arg2 optional_arg3

        Note: The required command line arguments for Get_JSD_results.R script are as below:
        arg1 - index of the simulated ST dataset [options: 1, 2, 3 ]
        arg2 - [optional; default is all scenarios ] removal scenarios separated by comma (rm0 is included by default) [options: rm1, rm2, rm3, rm5, rm10, rm11 ]
        arg3 - [optional; default is all methods] name of the deconvolution methods separated by comma

      To reproduce the results in the manuscript, the following provides the command line input for calculating JSD metrics for all methods with 1st simulated ST data and all removal of cell type scenarios,
      Rscript Init_env.R
      Rscript Get_JSD_results.R 1
    
    • For RMSE calculations:

        Rscript Get_RMSE_results.R arg1 optional_arg2 optional_arg3

        Note: The required command line arguments for Get_RMSE_results.R script are as below:
        arg1 - index of the simulated ST dataset [options: 1, 2, 3 ]
        arg2 - [optional; default is all scenarios ] removal scenarios separated by comma (rm0 is included by default) [options: rm1, rm2, rm3, rm5, rm10, rm11 ]
        arg3 - [optional; default is all methods ] name of the deconvolution methods separated by comma

      To reproduce the results in the manuscript, the following provides the command line input for calculating RMSE metrics for all methods with 1st simulated ST data and all removal of cell type scenarios,
      Rscript Get_RMSE_results.R 1
    
    • For cell type reassignment calculations:

          Rscript Get_celltype_assignment_results.R arg1 arg2 optional_arg3

        Note: The required command line arguments for Get_celltype_assignment_results.R script are as below:
        arg1 - index of the simulated ST dataset [options: 1, 2, 3 ]
        arg2 - removal scenarios separated by comma (rm0 is included by default) [only supported for rm1,rm2,rm3 ]
        arg3 - [optional; default is all methods ] name of the deconvolution methods separated by comma

      To reproduce the results in the manuscript, the following provides the command line input for calculating cell type reassignment for all methods with the 1st simulated ST data and the scenarios removing one, two and three cell types,
      Rscript Get_celltype_assignment_results.R 1 "rm1","rm2","rm3"
    

Note

A list has to be provided as a single argument, without spaces in between.
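
When the scenarios or method names are held as separate words in a script, a small helper can build the single comma-separated argument. This is a minimal sketch; join_by_comma is a hypothetical helper name.

```shell
# Join all arguments into one comma-separated string without spaces.
join_by_comma() {
    # "$*" joins the arguments with the first character of IFS;
    # the subshell keeps the IFS change local to this call.
    ( IFS=","; printf '%s\n' "$*" )
}
```

For example, Rscript Get_celltype_assignment_results.R 1 "$(join_by_comma rm1 rm2 rm3)" passes rm1,rm2,rm3 as one argument.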


Module 5: Post analysis results

Navigate to "5_Post_analysis_work/Code/" directory

Important

This module expects calculation results from Module 4

  • Generate the figures/plots included in the manuscript for the calculated JSD, RMSE, and cell type reassignment estimates for the specified cell type removal scenarios.

    • Generate cell type correlation and cell type reassignment plots for cell type removal scenarios rm1, rm2, and rm3 using the commands below:

          Rscript Init_env.R
          Rscript Celltype_correlation.R
          Rscript Fig_celltype_assignments.R arg1

        Note: Fig_celltype_assignments.R expects command line arguments as below:
          arg1 - index of the simulated ST dataset [options: 1, 2, 3 ]

      To reproduce the results in the manuscript the following provides the command line input,
      Rscript Init_env.R
      Rscript Celltype_correlation.R
      Rscript Fig_celltype_assignments.R 1
    
    • Generate JSD/RMSE plots for all cell type removal scenarios using the command below:

          Rscript Fig_jsd_rmse_plots.R arg1

        Note: Fig_jsd_rmse_plots.R expects command line arguments as below; (list of methods and removal scenarios are adapted from JSD/RMSE calculations in Module 4)
          arg1 - index of the simulated ST dataset [options: 1, 2, 3 ]

      To reproduce the results in the manuscript, the following provides the command line input,
      Rscript Fig_jsd_rmse_plots.R 1
    
    • Generate a summary funky plot for an overview of the ranking of deconvolution methods across all the removal scenarios using the commands as below;

          Rscript Funky_plots.R arg1

        Note: Funky_plots.R expects command line arguments as below;
          arg1 - index of the simulated ST dataset [options: 1, 2, 3 ]

      To reproduce the results in the manuscript, the following provides the command line input,
      Rscript Funky_plots.R 1
    
    • To validate the reproduced results and cross-check the numbers against the original analysis, the R script below generates plots that can be compared with those of the original analysis.

          Rscript Validate_numbers.R arg1

        Note: Validate_numbers.R expects command line arguments as below;
          arg1 - index of the simulated ST dataset [options: 1, 2, 3 ]

      To reproduce the results in the manuscript, the following provides the command line input,
      Rscript Validate_numbers.R 1
    

Known issues

1. Installation of 'fields' R package on macOS

The analysis was carried out with fields package version 13.3, developed under R version 4.1.2, and with RStudio built for the x86_64 architecture.

If you use macOS on the arm64 architecture (Apple silicon) and install RStudio built for x86_64, it runs through Rosetta 2 as an x86_64 application, and you will be able to install the required version 13.3 of the fields package from the CRAN archives. If you instead install RStudio built for arm64, it runs natively on the Apple silicon chip, and you will get the error below.

Error message:
Error: Error installing package 'fields':
* installing *source* package ‘fields’ ...
** package ‘fields’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c ExponentialUpperC.c -o ExponentialUpperC.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c RdistEarth.c -o RdistEarth.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c addToDiagC.c -o addToDiagC.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c compactToMatC.c -o compactToMatC.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c expfnC.c -o expfnC.o
/usr/local/bin/gfortran -fno-optimize-sibling-calls  -fPIC  -Wall -g -O2  -c fieldsF77Code.f -o fieldsF77Code.o
fieldsF77Code.f:104:32:

  104 |       double precision A(NMAX,4),V(NMAX,7)
      |                                1
Warning: Array ‘a’ at (1) is larger than limit set by ‘-fmax-stack-var-size=’, moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider increasing the ‘-fmax-stack-var-size=’ limit (or use ‘-frecursive’, which implies unlimited ‘-fmax-stack-var-size’) - or change the code to use an ALLOCATABLE array. If the variable is never accessed concurrently, this warning can be ignored, and the variable could also be declared with the SAVE attribute. [-Wsurprising]
fieldsF77Code.f:108:23:

  108 |       integer idx(NMAX)
      |                       1
Warning: Array ‘idx’ at (1) is larger than limit set by ‘-fmax-stack-var-size=’, moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider increasing the ‘-fmax-stack-var-size=’ limit (or use ‘-frecursive’, which implies unlimited ‘-fmax-stack-var-size’) - or change the code to use an ALLOCATABLE array. If the variable is never accessed concurrently, this warning can be ignored, and the variable could also be declared with the SAVE attribute. [-Wsurprising]
fieldsF77Code.f:107:23:

  107 |       integer imx(NMAX)
      |                       1
Warning: Array ‘imx’ at (1) is larger than limit set by ‘-fmax-stack-var-size=’, moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider increasing the ‘-fmax-stack-var-size=’ limit (or use ‘-frecursive’, which implies unlimited ‘-fmax-stack-var-size’) - or change the code to use an ALLOCATABLE array. If the variable is never accessed concurrently, this warning can be ignored, and the variable could also be declared with the SAVE attribute. [-Wsurprising]
fieldsF77Code.f:106:59:

  106 |       double precision ux(NMAX),uy(NMAX), uw(NMAX),ud(NMAX),utr
      |                                                           1
Warning: Array ‘ud’ at (1) is larger than limit set by ‘-fmax-stack-var-size=’, moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider increasing the ‘-fmax-stack-var-size=’ limit (or use ‘-frecursive’, which implies unlimited ‘-fmax-stack-var-size’) - or change the code to use an ALLOCATABLE array. If the variable is never accessed concurrently, this warning can be ignored, and the variable could also be declared with the SAVE attribute. [-Wsurprising]
fieldsF77Code.f:106:50:

  106 |       double precision ux(NMAX),uy(NMAX), uw(NMAX),ud(NMAX),utr
      |                                                  1
Warning: Array ‘uw’ at (1) is larger than limit set by ‘-fmax-stack-var-size=’, moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider increasing the ‘-fmax-stack-var-size=’ limit (or use ‘-frecursive’, which implies unlimited ‘-fmax-stack-var-size’) - or change the code to use an ALLOCATABLE array. If the variable is never accessed concurrently, this warning can be ignored, and the variable could also be declared with the SAVE attribute. [-Wsurprising]
fieldsF77Code.f:106:31:

  106 |       double precision ux(NMAX),uy(NMAX), uw(NMAX),ud(NMAX),utr
      |                               1
Warning: Array ‘ux’ at (1) is larger than limit set by ‘-fmax-stack-var-size=’, moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider increasing the ‘-fmax-stack-var-size=’ limit (or use ‘-frecursive’, which implies unlimited ‘-fmax-stack-var-size’) - or change the code to use an ALLOCATABLE array. If the variable is never accessed concurrently, this warning can be ignored, and the variable could also be declared with the SAVE attribute. [-Wsurprising]
fieldsF77Code.f:106:40:

  106 |       double precision ux(NMAX),uy(NMAX), uw(NMAX),ud(NMAX),utr
      |                                        1
Warning: Array ‘uy’ at (1) is larger than limit set by ‘-fmax-stack-var-size=’, moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider increasing the ‘-fmax-stack-var-size=’ limit (or use ‘-frecursive’, which implies unlimited ‘-fmax-stack-var-size’) - or change the code to use an ALLOCATABLE array. If the variable is never accessed concurrently, this warning can be ignored, and the variable could also be declared with the SAVE attribute. [-Wsurprising]
fieldsF77Code.f:104:42:

  104 |       double precision A(NMAX,4),V(NMAX,7)
      |                                          1
Warning: Array ‘v’ at (1) is larger than limit set by ‘-fmax-stack-var-size=’, moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider increasing the ‘-fmax-stack-var-size=’ limit (or use ‘-frecursive’, which implies unlimited ‘-fmax-stack-var-size’) - or change the code to use an ALLOCATABLE array. If the variable is never accessed concurrently, this warning can be ignored, and the variable could also be declared with the SAVE attribute. [-Wsurprising]
fieldsF77Code.f:379:43:

  379 |       double precision work(nobs),diag(mxM),dumm1(1),dumm2(1)
      |                                           1
Warning: Array ‘diag’ at (1) is larger than limit set by ‘-fmax-stack-var-size=’, moved from stack to static storage. This makes the procedure unsafe when called recursively, or concurrently from multiple threads. Consider increasing the ‘-fmax-stack-var-size=’ limit (or use ‘-frecursive’, which implies unlimited ‘-fmax-stack-var-size’) - or change the code to use an ALLOCATABLE array. If the variable is never accessed concurrently, this warning can be ignored, and the variable could also be declared with the SAVE attribute. [-Wsurprising]
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c init.c -o init.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c multebC.c -o multebC.o
clang -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I/usr/local/include   -fPIC  -Wall -g -O2  -c rdistC.c -o rdistC.o
clang -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o fields.so ExponentialUpperC.o RdistEarth.o addToDiagC.o compactToMatC.o expfnC.o fieldsF77Code.o init.o multebC.o rdistC.o -L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin18/8.2.0 -L/usr/local/gfortran/lib -lgfortran -lquadmath -lm -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: warning: -single_module is obsolete
ld: warning: -multiply_defined is obsolete
ld: warning: search path '/usr/local/gfortran/lib/gcc/x86_64-apple-darwin18/8.2.0' not found
ld: warning: ignoring file '/private/var/folders/16/2qzgz8cd2bl65zhgv1x6sy0c0000gn/T/RtmpcY8v1T/R.INSTALL3349486d5ed2/fields/src/fieldsF77Code.o': found architecture 'arm64', required architecture 'x86_64'
ld: warning: ignoring file '/usr/local/lib/libgfortran.5.dylib': found architecture 'arm64', required architecture 'x86_64'
ld: warning: ignoring file '/usr/local/gfortran/lib/libquadmath.0.dylib': found architecture 'arm64', required architecture 'x86_64'
installing to /Users/utkarsh/surfdrive/UtkarshMahamune/20210701_RobustnessEvaluation/Processing/renv/staging/1/00LOCK-fields/00new/fields/libs
** R
** data
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘fields’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/Users/utkarsh/surfdrive/UtkarshMahamune/20210701_RobustnessEvaluation/Processing/renv/staging/1/00LOCK-fields/00new/fields/libs/fields.so':
  dlopen(/Users/utkarsh/surfdrive/UtkarshMahamune/20210701_RobustnessEvaluation/Processing/renv/staging/1/00LOCK-fields/00new/fields/libs/fields.so, 0x0006): symbol not found in flat namespace '_css_'
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/Users/utkarsh/surfdrive/UtkarshMahamune/20210701_RobustnessEvaluation/Processing/renv/staging/1/fields’
install of package 'fields' failed [error code 1]
Traceback (most recent calls last):
14: renv::init()
13: restore(project = project, library = libpaths, repos = repos, 
        prompt = FALSE)
12: renv_restore_run_actions(project, diff, current, lockfile, rebuild)
11: renv_install_impl(records)
10: renv_install_staged(records)
 9: renv_install_default(records)
 8: handler(package, renv_install_package(record))
 7: renv_install_package(record)
 6: withCallingHandlers(renv_install_package_impl(record), error = function(e) writef("FAILED"))
 5: renv_install_package_impl(record)
 4: r_cmd_install(package, path)
 3: r_exec_error(package, output, "install", status)
 2: abort(all)
 1: stop(fallback)

To resolve this error, you can follow either of the options below:

  • Install the latest version 15.2 of the fields R package (R binaries are available for arm64); this affects the CARD package, which then needs to be updated to its latest version as well.

  • Install R and RStudio built for the x86_64 architecture and run the analysis with them. If you still get the same error, the compiler is using the default arm64 architecture to install R packages that need compilation.


2. RCTD parallel execution

You may come across the error below while executing the RCTD deconvolution method:

Error in checkForRemoteErrors(lapply(cl, recvResult)) :
  4 nodes produced errors; first error: object '.doSnowGlobals' not found
Calls: run.RCTD ... %dopar% -> <Anonymous> -> clusterCall -> checkForRemoteErrors
Execution halted
EEError in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
rror in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
EError in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
Error in unserialize(node$con) : error reading from connection
Calls: <Anonymous> ... doTryCatch -> recvData -> recvData.SOCKnode -> unserialize
Execution halted
Execution halted
Execution halted
Execution halted

This is a known issue when RCTD runs multiple jobs in parallel from within the deconvolution function. This has been reported to the developers earlier and can be found in the issues on the GitHub repository (link).

Please follow the solution available in the reported issue; if the problem persists, please raise an issue on GitHub.


3. Creating conda virtual environment with environment.yml

The environment.yml was generated on a Linux system, so on other operating systems you can get an error like the one below.

PackagesNotFoundError: The following packages are not available from current channels:

  - libstdcxx-ng=11.2.0*
  - libgomp=11.2.0*
  - libgcc-ng=11.2.0*
  - ld_impl_linux-64=2.40*
  - _openmp_mutex=5.1*
  - _libgcc_mutex=0.1*
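
One possible workaround, which is not taken from the repository, is to create the environment from a copy of environment.yml without the Linux-specific packages named in the error. The sketch below filters exactly the six packages listed above; the function name and the output file name environment_portable.yml are hypothetical, and the resulting environment may differ from the one used in the original analysis.

```shell
# Sketch only: writes a copy of the environment file without the
# Linux-specific packages named in the error message above.
# Usage: strip_linux_only_packages INPUT_YML OUTPUT_YML
strip_linux_only_packages() {
    grep -v -E '^[[:space:]]*-[[:space:]]*(libstdcxx-ng|libgomp|libgcc-ng|ld_impl_linux-64|_openmp_mutex|_libgcc_mutex)=' "$1" > "$2"
}
```

A possible use is strip_linux_only_packages environment.yml environment_portable.yml, followed by conda env create -f environment_portable.yml. If conda reports further unavailable packages, they would need the same treatment.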

4. python_config_impl(python) error

The R script uses the installed Python through the reticulate package. You get this error if the package cannot find the proper Python installation.

The issue can be resolved by adding the line below at the beginning of the R script that raises the error; it sets the RETICULATE_PYTHON environment variable to the path of the desired Python installation.

Sys.setenv(RETICULATE_PYTHON="path_of_the_desired_python_installation")
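
Because reticulate reads RETICULATE_PYTHON from the environment, the variable can alternatively be exported in the shell before calling Rscript, which avoids editing the R scripts. This is a minimal sketch; use_python_for_reticulate is a hypothetical helper name.

```shell
# Sketch only: export RETICULATE_PYTHON if the given path is an executable file.
# Usage: use_python_for_reticulate /path/to/python
use_python_for_reticulate() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        RETICULATE_PYTHON="$1"
        export RETICULATE_PYTHON
    else
        echo "Not an executable file: $1" >&2
        return 1
    fi
}
```

With the conda environment activated, use_python_for_reticulate "$(command -v python)" points reticulate at that environment's interpreter for all Rscript calls in the same shell session.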


Appendix A

Installation steps for Windows OS

  1. Install R 4.1.2 (download here)
  2. Install RTools 4.0 (download here).
    Add the RTools path to the 'PATH' environment variable (make sure no white spaces are present in the path)
  3. Install Miniconda (download here)
    Look for 'Miniconda3-py39_24.11.1-0-Windows-x86_64.exe'
  4. Set up conda environment with Python version 3.9.7 using the command: conda env create -f environment.yml
  5. Set up the renv infrastructure
    • Execute the Renv_setup.R script from the directory where it resides to initialise the renv infrastructure for the R project (ignore the warning messages), using the command: Rscript Renv_setup.R arg1
      The expected command line argument (arg1) is a (classic) GITHUB_PAT token (more details).
      The required files for the renv setup (renv.lock, .Rprofile, renv/activate.R, renv/settings.json) should already be in your R project directory; if not, ensure they reside in the same directory as the Renv_setup.R file.
