Hi Young,
Thanks for reporting this.
In the current releases, the MPI version of ModelFinder is not yet
available.
In fact, we have implemented it, and it is under final testing. It should
be available shortly, in about one or two months. If you really need this
feature now, you can download the source code from the "onnxupdate"
branch of our IQTREE2 GitHub page (
https://github.com/iqtree/iqtree2/tree/onnxupdate), compile it, and try
it.
Please let me know if you have any questions or if you would like a
binary version.
Thanks,
Thomas
On Fri, May 3, 2024 at 2:04 AM ycsong wrote:
Hello developers,
I am trying to run ModelFinder with mpirun and have been running into
problems. I submit a script to the Slurm system on our lab's computer,
and after a couple of minutes I get the following error:
ERROR: PERROR: S--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 15 with PID 106762 on node ta2 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
The script that I am trying to run is shown here:
#!/bin/bash
#SBATCH --time=168:00:00
#SBATCH --qos maw
#SBATCH --partition analysis
#SBATCH --nodes=3
#SBATCH --ntasks=36
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=email
#SBATCH --account=account
#SBATCH --job-name model_finder
#SBATCH --error my_job_name-%j.err
#SBATCH --output my_job_name-%j.out
source /etc/profile.d/modules.sh
module purge
module load gcc
module load cmake/3.19.5-intel
module load openmpi
#set +u;source activate metagem;set -u
export PATH="/tahoma/emsls60141/1000soils_data/iqtree2/build:$PATH"
mpirun --bind-to core --map-by core -report-bindings iqtree2-mpi -s ./high_top_gtdbtk_align/align/gtdbtk.bac120.msa.trim.fasta -m TESTONLY -madd LG4M,LG+C10,LG+C20,LG+C30,LG+C40,LG+C50,LG+C60,C10,C20,C30,C40,C50,C60 -safe
From what I know, a segmentation fault can result from insufficient
memory, but I am not certain where to start diagnosing it. Your help
would be much appreciated.
Thank you.
Young
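One possible starting point for the diagnosis: the mpirun crash line quoted above already names the failing rank, its PID, the node, and the signal number. The sketch below pulls those fields out of a job's error output. The function name `parse_mpirun_crash` is hypothetical, and the pattern assumes the Open MPI message wording shown in this thread.

```shell
#!/bin/bash
# Hypothetical helper (not part of IQ-TREE or Open MPI): read text on
# standard input and print rank, PID, node and signal number for each
# "mpirun noticed that process rank ..." line.
# Assumes the Open MPI message wording quoted in this thread.
parse_mpirun_crash() {
  sed -n 's/^mpirun noticed that process rank \([0-9][0-9]*\) with PID \([0-9][0-9]*\) on node \([^ ][^ ]*\) exited on signal \([0-9][0-9]*\).*$/rank=\1 pid=\2 node=\3 signal=\4/p'
}

# Example using the crash line from the report above
echo "mpirun noticed that process rank 15 with PID 106762 on node ta2 exited on signal 11 (Segmentation fault)." | parse_mpirun_crash
```

On a real job, the input would be the file named by the `--error` option of the script. If job accounting is enabled on the cluster, `sacct -j <jobid> --format=JobID,MaxRSS,ReqMem,State` may then indicate whether peak memory use came close to the requested memory.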
-
Hi Thomas, Many thanks for the update. At this point, my colleagues and I would like to give the "onnxupdate" branch a try and monitor the situation. Thanks again, Young
-
Hi Young,
This is just a note that there was an update on the branch "onnxupdate"
today. We fixed an issue that occurs in hybrid mode (OpenMP and MPI
together). If you are using hybrid mode, please download the updated
version. If you are only using MPI, then it should be fine.
Thanks,
Thomas
-
Thanks for the update, Thomas; it is much appreciated. Is there an installation guide somewhere for this? I tried looking at the README, but wasn't able to find anything there. Thank you, Young
-
Hi Young,
You may refer to the section "Compiling MPI version" of the compilation
guide on our wiki:
https://github.com/iqtree/iqtree2/wiki/Compilation-Guide
Let me briefly explain here.
First, you need to have MPI installed on the machine.
For example, on a MacBook:
brew install openmpi
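Before moving on, it may help to confirm that the tools used in the following steps are visible on the PATH (for example, after `module load` on a cluster). This is a minimal sketch; the function name `missing_tools` is hypothetical, and the tool list is simply the set of commands used in this guide.

```shell
#!/bin/bash
# Hypothetical helper: print every named tool that is NOT found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Commands used by the build steps in this guide
missing="$(missing_tools git cmake make mpicc mpicxx)"
if [ -n "$missing" ]; then
  echo "Not found on PATH:" $missing
else
  echo "All build tools found."
fi
```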
Then, you can get the source code of the branch "onnxupdate" (as you want
to try the MPI version of ModelFinder):
git clone --recursive https://github.com/iqtree/iqtree2
cd iqtree2
git checkout onnxupdate
Then, you can compile the code:
mkdir build
cd build
cmake -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx ..
make -j
Sometimes, the OpenMP include and lib folders cannot be located. In that
case, you need to specify their locations before running cmake:
export LDFLAGS=[location of openmp lib folder]
(for example: "-L/opt/homebrew/opt/libomp/lib")
export CPPFLAGS=[location of openmp include folder]
(for example: "-I/opt/homebrew/opt/libomp/include")
cmake -DCMAKE_CXX_FLAGS="$LDFLAGS $CPPFLAGS" -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx ..
make -j
If the source code compiles successfully, you will find an executable
file called "iqtree2-mpi".
You can then run IQTREE2 with the command:
mpirun -np [# of processors] ./iqtree2-mpi ....
I hope the above guidelines will be helpful to you.
Cheers,
Thomas
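As a sketch of how the run command above might be filled in inside a Slurm job, the snippet below prints (but does not run) an mpirun command line. The function name `iqtree_mpi_cmd` and the fallback of 4 processes are hypothetical; `SLURM_NTASKS` is the variable Slurm sets from `--ntasks`, and `-s` and `-m TESTONLY` are the options used in the script earlier in this thread.

```shell
#!/bin/bash
# Hypothetical helper: print an mpirun command line for iqtree2-mpi.
# The process count is taken from SLURM_NTASKS when it is set;
# otherwise it falls back to 4 (an arbitrary value for illustration).
iqtree_mpi_cmd() {
  local alignment="$1"
  local np="${SLURM_NTASKS:-4}"
  echo "mpirun -np $np ./iqtree2-mpi -s $alignment -m TESTONLY"
}

# Print the command for an example alignment file name
iqtree_mpi_cmd alignment.fasta
```

Printing the command first makes it easy to check the process count against the `#SBATCH --ntasks` request before the job actually starts.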
-
Hi Thomas, Thanks again for your help so far. Unfortunately, it appears that the onnxupdate branch has issues similar to those I encountered previously. I recently received the following error message:
The Slurm script is pretty much the same as the one I provided earlier. Thank you, Young
-
Hi Young,
Thanks for reporting this. Could you send me the log file, the data file, and the partition file so that I can check the issue?
Thanks,
Thomas
From: ycsong
Date: Wednesday, 8 May 2024 at 12:52 AM
To: iqtree/iqtree2
Cc: Thomas Wong, Comment
Subject: Re: [iqtree/iqtree2] Segmentation fault while running modelfinder in mpi mode (Discussion #188)
Hi Thomas,
Thanks again for your help so far. Unfortunately, it appears that the onnxupdate branch has issues similar to those I encountered previously. I recently received the following error message:
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 64 with PID 81051 on node ta6 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
The slurm code is pretty much the same as the one I provided earlier.
Thank you,
Young
-
Hi Thomas,
Apologies for the delayed response. I've attached the log file and the modelwq.gz file that was generated during the execution. I tried to upload the fasta file, but it is too big to attach.
To explain everything from the beginning: at first, I ran the non-MPI version of ModelFinder. It worked, but ran slowly (i.e., it finished testing close to 30 of 97 total models in a week***, with 120 GB RAM provided). Hence I decided to explore the MPI version.
*** On our internal server, the maximum runtime per job is one week, so the non-MPI execution stopped after this period. I am aware that if I ran it again, IQ-TREE would pick up from where it previously ended, but I thought I would look into a more efficient option.
Thanks again, and please let me know if you require anything else to make this process easier. I tried looking for a partition file, but it appears that I don't have one (or I can't find it).
gtdbtk.bac120.msa.trim.fasta.modelwq.gz
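For a rough sense of what continuing the non-MPI run by restarting from the checkpoint would cost, the arithmetic can be sketched as below. It assumes every model takes about the same time, which may not hold for the larger mixture models, so the result is only a ballpark figure. The function name `jobs_needed` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: number of fixed-length jobs needed to test all
# models, ASSUMING each model takes about the same time.
jobs_needed() {
  local total="$1" per_job="$2"
  # Integer ceiling division: (total + per_job - 1) / per_job
  echo $(( (total + per_job - 1) / per_job ))
}

# 97 models in total, about 30 completed per one-week job
jobs_needed 97 30
```

With these numbers the sketch prints 4, i.e. about four one-week jobs in total under the equal-time assumption.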
-
Hi Thomas, I am reattaching the file, gtdbtk.bac120.msa.trim.fasta.model.gz. I was experimenting with the file and accidentally renamed it to gtdbtk.bac120.msa.trim.fasta.modelwq.gz. Thank you.
-
Thanks, @ycsong.
-
Hi @ycsong,