How to better apply different LJ potential energies to each pair of particles? #2037
Dear all,
Does this warning have any impact on the actual simulation? Also, my simulation runs very slowly. I used an NVIDIA GeForce RTX 4090 with 4 CPU cores to run a simulation (526 particles, with Yukawa and OPP potentials), and it completed only 5,000,000 steps in 12 hours. Is this normal in your opinion? I look forward to your help!
Replies: 1 comment 1 reply
Your simulation will run more efficiently (and possibly even faster) on a single CPU core. Simulations generally need at least 10,000 particles before GPU throughput breaks even with a CPU (~64 cores).
https://github.com/glotzerlab/hoomd-benchmarks has scripts to evaluate this. The output is in steps per second. For comparison, your simulation runs at 5e6/(12*3600) ≈ 115.7 steps per second.
The benchmark command is:
python3 -m hoomd_benchmarks.md_pair_lj -N {N} --device {DEVICE} --repeat 20 --n_types {N_types}
I ran this on an A100 GPU. Your RTX 4090 is optimized for single precision and will therefore run double-precision calculations very slowly. You should run production HOOMD-blue simulations on an institutional or national cluster with double-precision GPUs (currently V100, A100, or H100).
The last two lines answer your original question of how much increasing the number of types costs on the GPU. The pair force calculations in HOOMD-blue are optimized for systems with a large number of particles and a small number of types. If you want to increase performance further, you will need to write custom kernels (CPU and/or GPU) that implement an optimized data structure for the N^2 type parameters.
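To compare your own run against the benchmark, a small helper script along these lines may be useful. It is only a sketch: it converts a step count and wall-clock time into steps per second, and it prints the benchmark command line from this reply for a few type counts without running anything. The function names, the device strings "CPU" and "GPU", and the type counts 1, 8, and 64 are assumptions chosen for illustration; N = 526 is taken from the question.

```python
import shlex


def steps_per_second(steps: int, hours: float) -> float:
    """Convert a completed step count and wall-clock hours to steps per second."""
    return steps / (hours * 3600)


def benchmark_command(n_particles: int, device: str, n_types: int, repeat: int = 20) -> str:
    """Fill in the md_pair_lj benchmark command template quoted in the reply."""
    args = [
        "python3", "-m", "hoomd_benchmarks.md_pair_lj",
        "-N", str(n_particles),
        "--device", device,
        "--repeat", str(repeat),
        "--n_types", str(n_types),
    ]
    # shlex.join quotes arguments only when the shell would need it.
    return shlex.join(args)


if __name__ == "__main__":
    # Throughput of the run described in the question: 5,000,000 steps in 12 hours.
    print(f"reported run: {steps_per_second(5_000_000, 12):.1f} steps per second")

    # Device strings and type counts below are assumed values, not from the thread.
    for device in ("CPU", "GPU"):
        for n_types in (1, 8, 64):
            print(benchmark_command(526, device, n_types))
```

Running each printed command and comparing its reported steps per second across the type counts gives a rough picture of how the cost grows with the number of types on your own hardware.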