-
Notifications
You must be signed in to change notification settings - Fork 1
Weeks_6and7
(source: http://learnyouahaskell.com,
cc Miran Lipovača)
#6th Week This week I have to leave the original timeline I gave in the proposal since changing the program seems to be more challenging than I thought. The milestone for the next weeks will be more or less the same as for the last weeks: Change the code in a way, such that the time and space consumption will decrease.
##Work To Do Since the changes I did until know did not show a significant decrease of time and space consumption, it will be necessary to think about other methods on how to change the program's code. What needs to be done this week is to get away from the sparse matrix representation in the program and use dense matrices.
##Work Done
The current version of transalign includes unboxed and boxed vectors and still an IntMap in group_al''' since this seems to be more efficient than changing it into vectors. Additionally, the current version uses ByteString.copy such that the part of the input which is not used can be garbage collected. I did profiling runs for both, the original and the current version with just a small subset of the test data as an input. It can be seen that for the small input, both versions need almost the same time and the current version needs less memory.
Original transalign:
85,308,683,552 bytes allocated in the heap
80,932,813,800 bytes copied during GC
10,395,007,064 bytes maximum residency (18 sample(s))
305,680,808 bytes maximum slop
23929 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 162199 colls, 0 par 24.34s 24.93s 0.0002s 0.0026s
Gen 1 18 colls, 0 par 18.04s 32.49s 1.8052s 9.8200s
INIT time 0.00s ( 0.00s elapsed)
MUT time 15.93s ( 15.30s elapsed)
GC time 42.38s ( 57.42s elapsed)
RP time 0.00s ( 0.00s elapsed)
PROF time 0.00s ( 0.00s elapsed)
EXIT time 0.14s ( 2.57s elapsed)
Total time 58.45s ( 75.30s elapsed)
%GC time 72.5% (76.3% elapsed)
Alloc rate 5,354,885,666 bytes per MUT second
Productivity 27.5% of total user, 21.3% of total elapsed
Current version of transalign:
110,522,841,960 bytes allocated in the heap
47,943,022,480 bytes copied during GC
2,327,319,792 bytes maximum residency (90 sample(s))
1,755,596,256 bytes maximum slop
6132 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 207927 colls, 0 par 28.74s 29.58s 0.0001s 0.0079s
Gen 1 90 colls, 0 par 4.96s 8.37s 0.0930s 0.8881s
TASKS: 4 (1 bound, 3 peak workers (3 total), using -N1)
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
INIT time 0.00s ( 0.00s elapsed)
MUT time 23.19s ( 25.54s elapsed)
GC time 33.70s ( 37.94s elapsed)
RP time 0.00s ( 0.00s elapsed)
PROF time 0.00s ( 0.00s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 56.90s ( 63.49s elapsed)
Alloc rate 4,765,764,389 bytes per MUT second
Productivity 40.8% of total user, 36.5% of total elapsed
#7th Week Since for a small input the original and the current version of transalign need the same time, but for the large input data set the original version is still faster, a goal will be to find out the reason for that.
##Work To Do If I do profiling runs with the whole dataset as an input, the original transalign is still faster. The work for the 7th coding week will be including cost centers in the code to see which methods or data structures consume most of the time.
##Work Done I found that merge1 in Align.hs is the function which needs most of the space allocated by the program. Since the function uses cons operations, which need O(n) time for vectors using unfoldr, I reincluded lists at some parts. This reduced the memory consumption and running time about a half, as you can see in the excerpts of the profiling statistics. Since the memory consumption is an important part when running the program on standard computers, some more reduction of the memory usage is needed to reach this goal.
Before:
transalign_prof +RTS -p -RTS ../../transalign/test-data/452.xml ../../transalign/test-data/U50_vs_SP.xml
total time = 698.24 secs (698239 ticks @ 1000 us, 1 processor)
total alloc = 3,362,406,881,480 bytes (excludes profiling overheads)
After:
transalign_prof +RTS -p -RTS ../../transalign/test-data/452.xml ../../transalign/test-data/U50_vs_SP.xml
total time = 322.91 secs (322912 ticks @ 1000 us, 1 processor)
total alloc = 1,289,345,690,776 bytes (excludes profiling overheads)
#Next Weeks Here, you can find the entry for the next two weeks.
#References
Back to main page.
Previous weeks.