Return geometry by value (alternative 1) #885
base: master
jenkins build this serial please |
No matter which alternative we end up preferring, we should benchmark to make sure performance does not suffer. I do not expect anything major though. |
a80cfd9
to
66c86e5
Compare
jenkins build this serial please |
66c86e5
to
1e82b5e
Compare
jenkins build this serial please |
1e82b5e
to
2ca87d2
Compare
jenkins build this serial please |
b8b09d2
to
b3e4cf2
Compare
jenkins build this serial please |
benchmark please |
Benchmark results did not get posted back here. I am on it. |
This is not really surprising. Copying a shared pointer involves an atomic increment of the reference counter. Even when only one processor is involved, this can take quite a number of cycles. That is usually manageable if the code is sequential, but as soon as the code becomes parallel, all of these increments have to "compete" to lock the bus that communicates with the other processors. So, even if the pointer never has to be shared with other processors, the overhead of performing this operation very often becomes quite noticeable. This is one of the most common pitfalls of using shared pointers. |
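To make the point about reference counting concrete, here is a minimal sketch. `CornerStorage` and both functions are hypothetical names invented for illustration, not code from this PR. It only shows that passing a `std::shared_ptr` by value bumps the reference count, while passing it by const reference does not.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for data that a geometry keeps behind a shared pointer.
struct CornerStorage {
    std::vector<double> coords;
};

// By value: the shared pointer is copied, so the reference count is
// incremented on entry and decremented again when the copy is destroyed.
long countWhenCopied(std::shared_ptr<CornerStorage> p) {
    return p.use_count();
}

// By const reference: no copy is made, so the reference count is untouched.
long countWhenBorrowed(const std::shared_ptr<CornerStorage>& p) {
    return p.use_count();
}
```

With a single owner `auto p = std::make_shared<CornerStorage>();`, `countWhenBorrowed(p)` reports one owner and `countWhenCopied(p)` reports two. The sketch shows only the count, not the cost: how expensive each increment is depends on the platform and on how many threads touch the same counter.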
benchmark please |
Benchmark result overview:
View result details @ https://www.ytelses.com/opm/?page=result&id=2822 |
Comparing alternative 1, 2 and 3:
View result details for alt 1 @ https://www.ytelses.com/opm/?page=result&id=2822 View result details for alt 2 @ https://www.ytelses.com/opm/?page=result&id=2823 View result details for alt 3 @ https://www.ytelses.com/opm/?page=result&id=2824 |
I am kind of confused by those results: a 17% difference seems way too much for these changes. Out of curiosity, are those benchmarks compiled with Link Time Optimization (LTO)? Since the code in those methods is rather small, it could be that, when LTO is not enabled, we are mostly measuring how long it takes to jump to the function. Assuming that those reports are with LTO and the comparison is of simulation time, I get these values locally:
If those are not done with LTO, I would suggest at least enabling it for the benchmark and production environments. With LTO, the speedup in drogon is about 8% with any of the alternatives. |
a2a089b
to
a771513
Compare
jenkins build this serial please |
This is to conform with the dune interface
a771513
to
5dedf94
Compare
jenkins build this serial please |
This is to conform with the dune interface.
Note that this has been possible since the introduction of shared pointers into the geometry. On the other hand, because copying a shared pointer can cause performance problems, I also propose an alternative in #884 and compare the two using the benchmarks.
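As a rough illustration of the trade-off described here, the sketch below uses hypothetical `Geometry` and `Entity` types (not the actual classes touched by this PR) and assumes the geometry keeps its corner data behind a shared pointer. Returning the geometry by value is then a shallow copy, but each call still copies the shared pointer.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical geometry: a lightweight handle whose data is shared.
struct Geometry {
    std::shared_ptr<const std::vector<double>> corners;
};

// Hypothetical entity that owns one geometry.
class Entity {
public:
    explicit Entity(std::shared_ptr<const std::vector<double>> c)
        : geom_{std::move(c)} {}

    // Returns a copy: cheap in memory, but the shared pointer inside
    // is copied, so its reference count is incremented on every call.
    Geometry geometry() const { return geom_; }

    // Returns a reference: no copy, no reference count traffic,
    // but the result is only valid while the entity is alive.
    const Geometry& geometryRef() const { return geom_; }

private:
    Geometry geom_;
};
```

The by-value copy shares the same corner data as the entity, so nothing is duplicated in memory; the only extra work per call is the reference count update, which is the cost the benchmarks in this thread are meant to quantify.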