Improve read perf, store coords as SVector #70
This substantially reduces the amount of memory used to store coordinates and parse files. While parsing a large mmCIF file, the memory usage dropped by approximately 50%, and read time by ~15%. The most potentially-disruptive change is that the coordinates are now stored as SVectors instead of Vectors. This means that the coordinates are now immutable, and you cannot change them in place by manual indexing. The `x!`, `y!`, and `z!` functions still work, as do in-place transformations, by making the `coords` field itself mutable.
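A minimal sketch of the pattern described above, assuming the StaticArrays package is available. `DemoAtom`, `setx!`, and `translate!` are hypothetical names made up for illustration and are not the package's actual definitions; the point is only that an immutable `SVector` held in a mutable field can still be updated by replacing the whole value.

```julia
using StaticArrays

# Hypothetical stand-in for an atom record (assumption, not the real type).
# The struct is mutable so the `coords` field can be rebound, even though
# the SVector it holds cannot be modified element by element.
mutable struct DemoAtom
    name::String
    coords::SVector{3,Float64}
end

# Hypothetical setter: builds a new SVector and replaces the field.
function setx!(a::DemoAtom, val::Real)
    a.coords = SVector{3,Float64}(val, a.coords[2], a.coords[3])
    return a
end

# Hypothetical in-place transformation: again replaces the whole field.
function translate!(a::DemoAtom, shift::SVector{3,Float64})
    a.coords = a.coords + shift
    return a
end

atom = DemoAtom("CA", SVector(1.0, 2.0, 3.0))
setx!(atom, 10.0)
translate!(atom, SVector(0.5, 0.5, 0.5))

# Manual indexed assignment is the operation that stops working:
# SVector does not define setindex!, so this throws.
indexed_assignment_failed = try
    atom.coords[1] = 0.0
    false
catch
    true
end
```

Code that previously wrote `coords(at)[1] = v` would need to go through a setter of this kind instead.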
Inspired in part by seeing the activity in #45
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #70 +/- ##
==========================================
+ Coverage 95.32% 95.34% +0.01%
==========================================
Files 14 14
Lines 2033 2039 +6
==========================================
+ Hits 1938 1944 +6
Misses 95 95
View full report in Codecov by Sentry.
Seems like a sensible change to me. I'm surprised how small the diff required to do this is.
Can we get away with this being a non-breaking change? The return type of
If it has to be breaking, I would be tempted to change from Float64 to Float32 coordinates in the same release.
I agree this is a tricky issue. Fundamentally I'm not sure. Plain type changes often aren't breaking, but going from mutable to immutable is a behavior change. If we decide it's breaking, then we might also evaluate whether there are any savings to be had by reordering some of the
If we think it's non-breaking, should we get out a 4.5.0 release before merging it? If we go ahead with the notion that this is non-breaking, it feels like a release that would at least raise the risk that we might later decide to yank it.
It's probably safer to consider it breaking, and think about reordering struct fields and moving to Float32 in the same release. I can look at the Float32 switch in the next few weeks unless someone else gets there first. Either way I will release v4.5.0 shortly.
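To illustrate why reordering struct fields can save memory, here is a small sketch using two made-up record types (`Padded` and `Packed` are hypothetical and unrelated to the package's real structs). Julia lays out the fields of such structs in declaration order with C-style alignment, so placing the 8-byte field first lets the two 1-byte fields share a single padded slot.

```julia
# Hypothetical layouts, for illustration only.
struct Padded
    flag_a::Bool     # 1 byte, followed by 7 bytes of padding
    value::Float64   # 8 bytes, must be 8-byte aligned
    flag_b::Bool     # 1 byte, followed by 7 bytes of tail padding
end

struct Packed
    value::Float64   # 8 bytes
    flag_a::Bool     # 1 byte
    flag_b::Bool     # 1 byte, then 6 bytes of tail padding
end

padded_size = sizeof(Padded)
packed_size = sizeof(Packed)
```

Whether a similar saving exists for the real types depends on their actual fields, which would have to be measured case by case.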
Beat ya 🙂 Not pushed yet, but here's the current state:
julia> c
Chain DA with 401 residues, 0 other molecules, 2947 atoms
julia> Base.summarysize(c; exclude=Model) / 2947
98.67662029182219
That's less than half where we are now, #59 (comment).
It's probably best done as a series of changes, though, just in case we regret some of them. JuliaCollections/OrderedCollections.jl#150 is preparatory work for these additional changes.
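The kind of per-atom measurement shown above can be reproduced on plain coordinate containers. The sketch below assumes the StaticArrays package and uses made-up data (`n`, `as_vectors`, `as_svectors64`, and `as_svectors32` are illustrative names); it compares one heap-allocated `Vector` per atom against inline `SVector` storage, and shows the further effect of Float32.

```julia
using StaticArrays

n = 1_000  # arbitrary number of atoms for the comparison

# One separately allocated 3-element Vector per atom.
as_vectors = [zeros(Float64, 3) for _ in 1:n]

# SVectors are isbits, so they are stored inline in the outer array.
as_svectors64 = [zero(SVector{3,Float64}) for _ in 1:n]
as_svectors32 = [zero(SVector{3,Float32}) for _ in 1:n]

bytes_vectors = Base.summarysize(as_vectors)
bytes_svectors64 = Base.summarysize(as_svectors64)
bytes_svectors32 = Base.summarysize(as_svectors32)

println("Vector{Vector{Float64}}:    ", bytes_vectors / n, " bytes per atom")
println("Vector{SVector{3,Float64}}: ", bytes_svectors64 / n, " bytes per atom")
println("Vector{SVector{3,Float32}}: ", bytes_svectors32 / n, " bytes per atom")
```

These figures cover only the coordinates, so they are not comparable to the whole-chain number quoted above, which also includes every other field of each atom and residue.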
Great! v4.5.0 is at JuliaRegistries/General#130705 🎉
As the sequence of changes I have planned will ultimately generate merge conflicts for #71, let's wait and get that merged before getting started down this road.
This has a surprisingly large benefit for performance
This substantially reduces the amount of memory used to store
coordinates and parse files. While parsing a large mmCIF file, the
memory usage dropped by approximately 50%, and read time by 15%. The
number of allocations was cut by a factor of 3 and the read time dropped 40%.
The most potentially-disruptive change is that the coordinates are now
stored as SVectors instead of Vectors. This means that the coordinates
are now immutable, and you cannot change them in place by manual
indexing. The `x!`, `y!`, and `z!` functions still work, as do in-place
transformations, by making the `coords` field itself mutable.

Closes #55