Skip to content

Conversation

@talagayev
Copy link
Member

@talagayev talagayev commented Oct 19, 2024

Fixes #4679 attempt

Changes made in this Pull Request:

  • added backends and aggregators to DistanceMatrix in analysis.diffusionmap
  • added the client_DistanceMatrix in conftest.py
  • added client_DistanceMatrix in run() in test_diffusionmap.py

Here is the Problem:

From what I see currently
self.results.dist_matrix = np.zeros((self.n_frames, self.n_frames))
has the problem, that when it uses parallelization the self.n_frames value gets divided, which leads with the ndarray_sum and ndarray_mean to:

E AssertionError:
E Arrays are not almost equal to 4 decimals
E (shapes (2,), (4,) mismatch)
E ACTUAL: array([1., 1.])
E DESIRED: array([1, 1, 1, 1])

and with the use of ndarray_vstack or ndarray_hstack it leads to the following error:
numpy.linalg.LinAlgError: Last 2 dimensions of the array must be square

so I am not sure if this can be somehow adjusted, I also encountered a similar case also while trying to parallelize analysis.msd

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Developers certificate of origin


📚 Documentation preview 📚: https://mdanalysis--4745.org.readthedocs.build/en/4745/

@pep8speaks
Copy link

pep8speaks commented Oct 19, 2024

Hello @talagayev! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-12-19 01:33:22 UTC

@marinegor
Copy link
Contributor

hey @talagayev the msd PR seems to pass all the tests (#4896) -- do you think it's something you could use here as well?

I skimmed through the code and I think it overcomes your self.n_frames problem by setting up an array of shape (..., self.n_frames) before running analysis, and later you can use that as a replacement for self.n_frames

@talagayev
Copy link
Member Author

hey @talagayev the msd PR seems to pass all the tests (#4896) -- do you think it's something you could use here as well?

I skimmed through the code and I think it overcomes your self.n_frames problem by setting up an array of shape (..., self.n_frames) before running analysis, and later you can use that as a replacement for self.n_frames

Hey @marinegor,

Yes I think that could be a solution for that PR as well. I will try to implement it the coming days. Sadly I didn't have much time the last couple of months, so I didn't look into the PR.

I would also have a draft for the #4659 that I would have locally currently, that would add parallelization for some part of MDAnalysis.analysis.align. Currently the PR is closed, is it fine if I ping you there in a couple of days, when I reopen it an push the changes?

@marinegor
Copy link
Contributor

marinegor commented Nov 10, 2025 via email

@talagayev talagayev closed this Nov 17, 2025
@talagayev talagayev force-pushed the diffusionsmap_parallel branch from ce62693 to ab7769f Compare November 17, 2025 01:45
@talagayev talagayev reopened this Nov 17, 2025
@codecov
Copy link

codecov bot commented Nov 17, 2025

Codecov Report

❌ Patch coverage is 92.59259% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.72%. Comparing base (ab7769f) to head (27e2875).

Files with missing lines Patch % Lines
package/MDAnalysis/analysis/diffusionmap.py 92.59% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4745      +/-   ##
===========================================
- Coverage    92.73%   92.72%   -0.01%     
===========================================
  Files          180      180              
  Lines        22457    22473      +16     
  Branches      3186     3188       +2     
===========================================
+ Hits         20825    20839      +14     
- Misses        1176     1177       +1     
- Partials       456      457       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MDAnalysis.analysis.diffusionmap: Implement parallelization or mark as unparallelizable

4 participants