@chriselrod chriselrod commented Jan 3, 2024

I ran the benchmarks on a system with register_size() == 64 and got the following GFLOPS on the master branch for sizes 4:8:500 (63 sizes):

old_gflops = [0.16210526315789472, 1.6781567157355763, 4.161433018404238, 7.197283531409167, 9.553030303030303, 11.572753482232796, 14.429034619809643, 16.746275715299124, 19.578947368421055, 21.99035679845709, 24.80128720836685, 27.395831851483035, 29.192711404646776, 31.158719856208496, 33.36904068827221, 35.04422745548535, 36.832800851970184, 38.45884561636294, 40.34930579920806, 41.74674634794157, 42.88618689412984, 43.41364958680523, 46.127380024496155, 48.31694140867722, 49.149166776013644, 50.6332078728454, 52.40313864176383, 53.361424069431436, 54.484825818962534, 55.665026474153144, 56.34745964051333, 57.06077387621847, 57.672247676957994, 58.81788096735113, 59.81234582000914, 61.201434991044714, 61.60337858639627, 62.26612862173807, 61.90732041204154, 63.83436121917612, 64.29267682226191, 64.66860695953035, 64.70087883095505, 63.02217143268487, 63.342450710412066, 62.40403463747168, 62.800555826020585, 64.99590271988143, 62.627694636203955, 63.44711043553702, 65.32155270694881, 65.44405587845723, 67.48039597885897, 67.21316766343575, 64.40518187418853, 65.53358371634931, 66.40292069907099, 65.77859620654205, 68.44577118422902, 69.95890248896575, 67.95310444896346, 69.75019630350812, 71.11808005326851]

versus the following on this PR:

new_gflops = [0.14776471843857145, 1.594246329038058, 4.056428001064679, 7.003855050115651, 9.445973663089681, 11.798402175760666, 14.065660685154976, 16.81253462055868, 19.46212787338274, 21.86716463106894, 24.676946410515672, 26.936951428471517, 28.929145361577795, 31.068750933439542, 33.41226763391412, 35.264421821978594, 36.813198509845655, 38.66570163487739, 40.477121882542235, 41.85745830605519, 42.8681555068836, 43.31440606859706, 46.13178282241772, 48.02345401835342, 49.25548520293771, 50.49430103220947, 52.26192142577662, 53.20286698053545, 54.56292220552537, 55.71549705859817, 56.36502191487289, 57.02239791081617, 57.628588073820275, 58.548880604989264, 59.739683915748884, 60.97500460799949, 61.46329113924051, 62.184114689244645, 61.86037659703268, 63.43681821206945, 63.936741274300914, 64.10524762182544, 64.36412311533029, 62.51022439067927, 63.10966290369639, 60.69208268872103, 62.12551472946074, 63.450891278513666, 62.35895975844578, 63.878358004244625, 65.83827413941691, 64.16653728850349, 65.1365044394016, 67.48519372013227, 65.46371685339177, 66.80207753266224, 66.97536258325441, 65.8370965883286, 68.24679282093642, 69.15932131070008, 65.85227749579352, 69.26174482564507, 70.6934274926346]

These are quite similar.

julia> sum(log, old_gflops ./ new_gflops)
0.4199631234593538

This is a <1% regression on average, taking the geometric mean of the old/new ratio over the 63 sizes:

julia> exp(ans/63)
1.0066883490969607

which could also be noise.
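
For reference, the two REPL steps above amount to a geometric mean of the element-wise old/new ratio. Here is a minimal sketch of that calculation, plus one rough way to put a number on the noise. The helper names `geomean_ratio` and `log_ratio_stderr` are made up for this example and not part of any package, and the standard error treats the per-size ratios as independent samples, which is only an approximation.

```julia
using Statistics  # stdlib: mean, std

# Geometric mean of the element-wise ratio old/new.
# A value above 1 means `old` was faster on average.
geomean_ratio(old, new) = exp(mean(log.(old ./ new)))

# Standard error of the mean log-ratio, as a rough noise yardstick.
# Assumption: the per-size ratios are treated as independent samples.
function log_ratio_stderr(old, new)
    r = log.(old ./ new)
    return std(r) / sqrt(length(r))
end

# Toy values for illustration only (not the benchmark data above).
old = [10.0, 20.0, 40.0]
new = [5.0, 20.0, 80.0]

# The ratios are 2, 1, and 0.5, so the speedup and slowdown cancel out
# and the geometric mean is approximately 1.
geomean_ratio(old, new)
```

Calling `geomean_ratio(old_gflops, new_gflops)` should reproduce the ratio above. If `log(geomean_ratio(...))` is within a couple of `log_ratio_stderr(...)` of zero, the difference is plausibly noise.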
