Extremely slow reduction #110

Open
@maleadt

On a 1024x1024 Float32 matrix, sum on the oneArray d_a is about 94x slower (by median time) than sum on the CPU array a:

julia> @benchmark sum($a)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  84.734 μs … 228.946 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     85.332 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   87.917 μs ±   7.545 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▇▃▂▄▁▁                                                      ▁
  ████████▇██▇▇█▇▆█████▇██████▇▆▇▆▆▅▆▆▆▆▆▆▆▆▅▅▅▆▆▄▇█▆▅▅▅▆▄▄▃▅▄ █
  84.7 μs       Histogram: log(frequency) by time       120 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark sum($d_a)
BenchmarkTools.Trial: 618 samples with 1 evaluation.
 Range (min … max):  6.966 ms …   9.740 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     8.047 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.079 ms ± 602.087 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▁▁▄▃▄  ▁ ▁▇ ▅▄█▃▄            ▂ ▁ ▃▁▇▁ ▁ ▁▁  ▃           
  ▂▂▃▃▄█████▇▇█▇██▇█████▄▇▆▆▃█▅▇▇▆▆███████████▅█████▃▄▃▆▂▅▄▂▃ ▅
  6.97 ms         Histogram: frequency by time        9.28 ms <

 Memory estimate: 27.41 KiB, allocs estimate: 509.

The GPU time scales with the input size, so this is probably a slow kernel rather than a fixed per-call overhead:

julia> d_a = oneArray(rand(Float32, 4096, 4096));

julia> a = rand(Float32, 4096, 4096);

julia> @benchmark sum($a)
BenchmarkTools.Trial: 1682 samples with 1 evaluation.
 Range (min … max):  2.918 ms …  3.185 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.964 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.967 ms ± 25.760 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

            ▅  █      ▆  ▃                                    
  ▂▁▁▃▂▂▇▅▃▇█▃▃█▃▂██▃██▄▄█▇▃▄▇▂▃▇▃▂▅▅▂▃▄▂▂▃▂▂▃▃▂▂▃▂▁▃▂▂▂▂▁▁▂ ▃
  2.92 ms        Histogram: frequency by time        3.05 ms <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark sum($d_a)
BenchmarkTools.Trial: 45 samples with 1 evaluation.
 Range (min … max):  112.776 ms … 113.728 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     113.151 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   113.186 ms ± 218.961 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁       ▁       ▁ ▄   ▄█▁ ▄▁       ▁▁    ▁                     
  █▁▁▆▆▁▁▁█▆▁▁▁▁▁▁█▆█▆▁▆███▁██▁▁▁▆▆▆▆██▁▁▁▆█▆▁▁▁▁▆▁▁▆▁▁▁▁▁▆▁▁▁▆ ▁
  113 ms           Histogram: frequency by time          114 ms <

 Memory estimate: 28.75 KiB, allocs estimate: 516.
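
To make the scaling argument concrete, the reported medians can be compared directly. This is only a back-of-the-envelope sketch: the variable names are made up for illustration, and the numbers are copied by hand from the benchmark output above.

```julia
# Median timings copied from the benchmark output above, in milliseconds.
cpu_1k = 85.332e-3   # sum(a),   1024x1024
gpu_1k = 8.047       # sum(d_a), 1024x1024
cpu_4k = 2.964       # sum(a),   4096x4096
gpu_4k = 113.151     # sum(d_a), 4096x4096

# GPU slowdown relative to the CPU at each size.
slowdown_1k = gpu_1k / cpu_1k
slowdown_4k = gpu_4k / cpu_4k

# Growth in GPU time when the element count grows by this factor.
elements_ratio = (4096 * 4096) / (1024 * 1024)
gpu_growth = gpu_4k / gpu_1k

println("slowdown at 1024x1024: ", round(slowdown_1k; digits=1), "x")
println("slowdown at 4096x4096: ", round(slowdown_4k; digits=1), "x")
println("elements grew ", elements_ratio, "x, GPU time grew ",
        round(gpu_growth; digits=1), "x")
```

With these medians the GPU reduction is roughly 94x slower at 1024x1024 and roughly 38x slower at 4096x4096, while GPU time grows about 14x for 16x more elements. Near-linear growth is consistent with the time being spent in the reduction itself rather than in a fixed launch cost, though profiling would be needed to confirm that.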

Labels: arrays (Things about the array abstraction.), performance (Gotta go fast.)
