Skip to content

Weeks_6and7

Michal J. Gajda edited this page Oct 28, 2020 · 3 revisions
alternate text

(source: http://learnyouahaskell.com,
cc Miran Lipovača)

#6th Week This week I have to leave the original timeline I gave in the proposal since changing the program seems to be more challenging than I thought. The milestone for the next weeks will be more or less the same as for the last weeks: Change the code in a way, such that the time and space consumption will decrease.

##Work To Do Since the changes I did until know did not show a significant decrease of time and space consumption, it will be necessary to think about other methods on how to change the program's code. What needs to be done this week is to get away from the sparse matrix representation in the program and use dense matrices.

##Work Done

The current version of transalign includes unboxed and boxed vectors and still an IntMap in group_al''' since this seems to be more efficient than changing it into vectors. Additionally, the current version uses ByteString.copy such that the part of the input which is not used can be garbage collected. I did profiling runs for both, the original and the current version with just a small subset of the test data as an input. It can be seen that for the small input, both versions need almost the same time and the current version needs less memory.

Original transalign:

85,308,683,552 bytes allocated in the heap  
80,932,813,800 bytes copied during GC  
10,395,007,064 bytes maximum residency (18 sample(s))  
305,680,808 bytes maximum slop  
23929 MB total memory in use (0 MB lost due to fragmentation)  

                                Tot time (elapsed)    Avg pause    Max pause  
Gen  0     162199 colls,     0 par   24.34s   24.93s     0.0002s    0.0026s  
Gen  1        18 colls,     0 par   18.04s   32.49s     1.8052s    9.8200s  

INIT    time    0.00s  (  0.00s elapsed)  
MUT     time   15.93s  ( 15.30s elapsed)  
GC      time   42.38s  ( 57.42s elapsed)  
RP      time    0.00s  (  0.00s elapsed)  
PROF    time    0.00s  (  0.00s elapsed)  
EXIT    time    0.14s  (  2.57s elapsed)  
Total   time   58.45s  ( 75.30s elapsed)  

%GC     time      72.5%  (76.3% elapsed)  

Alloc rate    5,354,885,666 bytes per MUT second  

Productivity  27.5% of total user, 21.3% of total elapsed  

Current version of transalign:

110,522,841,960 bytes allocated in the heap  
47,943,022,480 bytes copied during GC  
2,327,319,792 bytes maximum residency (90 sample(s))  
1,755,596,256 bytes maximum slop  
6132 MB total memory in use (0 MB lost due to fragmentation)  

                                Tot time (elapsed)    Avg pause    Max pause  
Gen  0     207927 colls,     0 par   28.74s   29.58s     0.0001s    0.0079s  
Gen  1        90 colls,     0 par    4.96s    8.37s     0.0930s    0.8881s  

TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)  

SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)  

INIT    time    0.00s  (  0.00s elapsed)  
MUT     time   23.19s  ( 25.54s elapsed)  
GC      time   33.70s  ( 37.94s elapsed)  
RP      time    0.00s  (  0.00s elapsed)  
PROF    time    0.00s  (  0.00s elapsed)  
EXIT    time    0.00s  (  0.00s elapsed)  
Total   time   56.90s  ( 63.49s elapsed)  

Alloc rate    4,765,764,389 bytes per MUT second  

Productivity  40.8% of total user, 36.5% of total elapsed  

#7th Week Since for a small input the original and the current version of transalign need the same time, but for the large input data set the original version is still faster, a goal will be to find out the reason for that.

##Work To Do If I do profiling runs with the whole dataset as an input, the original transalign is still faster. The work for the 7th coding week will be including cost centers in the code to see which methods or data structures consume most of the time.

##Work Done I found that merge1 in Align.hs is the function which needs most of the space allocated by the program. Since the function uses cons operations, which need O(n) time for vectors using unfoldr, I reincluded lists at some parts. This reduced the memory consumption and running time about a half, as you can see in the excerpts of the profiling statistics. Since the memory consumption is an important part when running the program on standard computers, some more reduction of the memory usage is needed to reach this goal.

Before:

transalign_prof +RTS -p -RTS ../../transalign/test-data/452.xml ../../transalign/test-data/U50_vs_SP.xml  

total time  =      698.24 secs   (698239 ticks @ 1000 us, 1 processor)  
total alloc = 3,362,406,881,480 bytes  (excludes profiling overheads)  

After:

transalign_prof +RTS -p -RTS ../../transalign/test-data/452.xml ../../transalign/test-data/U50_vs_SP.xml  

total time  =      322.91 secs   (322912 ticks @ 1000 us, 1 processor)  
total alloc = 1,289,345,690,776 bytes  (excludes profiling overheads)  

#Next Weeks Here, you can find the entry for the next two weeks.

#References

Back to main page.
Previous weeks.

Clone this wiki locally