
Weeks_3and4

Michal J. Gajda edited this page Oct 28, 2020 · 3 revisions
[image] (source: http://learnyouahaskell.com, cc Miran Lipovača)

#3rd Week

Here is the milestone for the 3rd and 4th week from the original timeline:
Milestone 2: Vector library and ADPfusion included.

##Work To Do

Before starting work on the 2nd milestone, I need to really understand the core part of the transalign program, in particular functions like group_al'''. That way I can decide whether to rewrite these functions or change only parts of them.

##Work Done

The transalign program is written using lists, which are efficient for sparse matrices. In the 2nd week, I found that the matrices are not sparse. I therefore changed the lists into vectors, which are a more efficient data structure in this case. So far, I have used Data.Vector, which is a boxed vector type. This will be changed to an unboxed vector type in the following week.
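
As a rough illustration of why dense data benefits from vectors: indexing a list with `!!` walks the list from the front, while `Data.Vector` indexes in constant time. The sketch below is not taken from transalign; `Score`, `columnAsList`, `columnAsVector` and `scoreAt` are hypothetical names standing in for one dense column of scores.

```haskell
import qualified Data.Vector as V

-- Hypothetical stand-in for one dense column of alignment scores.
type Score = Double

columnAsList :: [Score]
columnAsList = [0.5, 1.5, 2.5, 3.5]

-- Convert once; afterwards element access is O(1) instead of O(i).
columnAsVector :: V.Vector Score
columnAsVector = V.fromList columnAsList

-- Look up the score at position i, returning Nothing when out of range.
scoreAt :: Int -> Maybe Score
scoreAt i = columnAsVector V.!? i

main :: IO ()
main = do
  print (columnAsList !! 2)  -- list indexing: linear in the index
  print (scoreAt 2)          -- vector indexing: constant time
  print (scoreAt 10)         -- out of range, no exception
```

Whether this pays off depends on how often elements are accessed by position; for data that is only traversed once from front to back, a list can be just as good.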

#4th Week

Here is the milestone for the 3rd and 4th week from the original timeline:
Milestone 2: Vector library and ADPfusion included.

##Work To Do

In the 4th coding week I plan to:

  • change the boxed vector datatype to Data.Vector.Unboxed.
  • run another profile to see whether using Vector instead of List already improves time and space consumption.
  • start looking at the ADPfusion library and think about how to include it in the program.
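
The first item can be sketched as follows. A boxed `Data.Vector` stores pointers to heap objects, while `Data.Vector.Unboxed` stores the raw values in flat arrays, which usually reduces allocation. The element type `Cell` and all names below are assumptions made for illustration; the actual element types in transalign may differ.

```haskell
import qualified Data.Vector as V
import qualified Data.Vector.Unboxed as U

-- Hypothetical element type: (position, score) pairs.
-- Tuples of unboxable types are themselves unboxable.
type Cell = (Int, Double)

boxedCells :: V.Vector Cell
boxedCells = V.fromList [(0, 1.0), (1, 2.5), (2, 4.0)]

-- V.convert copies between vector flavours; the unboxed version keeps
-- the Ints and the Doubles in flat arrays instead of behind pointers.
unboxedCells :: U.Vector Cell
unboxedCells = V.convert boxedCells

-- Sum the score component of every cell.
totalScore :: U.Vector Cell -> Double
totalScore = U.sum . U.map snd

main :: IO ()
main = print (totalScore unboxedCells)
```

One design consequence: unboxed vectors are strict in their elements, so code that relied on lazily evaluated entries would need to be checked after such a change.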

##Work Done

During the last week I changed the code so that it now uses Vector.Unboxed. I did some profiling runs in between, which are shown in the following sections.

###Vector included

Here, I replaced the lists by boxed vectors.

transalign_prof +RTS -p -RTS ../../transalign/test-data/452.xml ../../transalign/test-data/U50_vs_SP.xml
total time  =     7144.50 secs   (7144502 ticks @ 1000 us, 1 processor)  
total alloc = 13,402,222,482,360 bytes  (excludes profiling overheads)  

COST CENTRE                  MODULE  %time %alloc  

trans_align                  Align    50.4   50.4  
merge1                       Align    24.6   26.4  
sort_column.filter_scores    Align    14.6   15.5  
group_al'''.merge            Align     5.8    2.6  
group_al'''.toMap            Align     1.2    1.7  

###Vector.Unboxed included

After changing the lists to vectors, I replaced the boxed vectors with unboxed ones. Total time dropped from 7144.50 to 6522.45 secs and total allocation from about 13.4 TB to about 5.6 TB. The exception is the trans_align method, whose share of the run time rose from 50.4% to 71.5%.

transalign_prof +RTS -p -RTS ../../transalign/test-data/452.xml ../../transalign/test-data/U50_vs_SP.xml
total time  =     6522.45 secs   (6522450 ticks @ 1000 us, 1 processor)  
total alloc = 5,627,596,415,256 bytes  (excludes profiling overheads)  

COST CENTRE                  MODULE  %time %alloc  

trans_align                   Align    71.5   44.5  
merge1                        Align    14.8   13.9  
group_al'''.merge             Align     5.0    6.1  
sort_column.filter_scores     Align     4.3    7.7  
group_al'''.toMap             Align     1.6    6.7  
trans_align.yt                Align     0.3    9.7  
trans_align.xt                Align     0.3    8.1  

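###Method trans_align changed

Since cons on a vector takes O(n) time, whereas on a list it takes only O(1), the trans_align method was rewritten to avoid the use of cons. The profiling statistics show that total time and space consumption decreased compared with the previous run (time from 6522.45 to 2907.57 secs, allocation from about 5.6 TB to about 3.6 TB).
For the method merge1 the shares increased (time from 14.8% to 33.4%, allocation from 13.9% to 21.9%). These percentages are relative to a smaller total, so they do not by themselves show that merge1 became slower; its remaining cost might be caused by the use of cons with vectors.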
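
To make the cost of cons concrete, here is a minimal sketch that builds the same vector twice. It is not the actual trans_align code; `buildWithCons` and `buildWithGenerate` are hypothetical names, and the rewrite in transalign may have used a different technique to avoid cons.

```haskell
import qualified Data.Vector.Unboxed as U

-- Builds [n-1, ..., 1, 0] by repeated cons: every U.cons copies the
-- vector built so far, so the whole loop is O(n^2).
buildWithCons :: Int -> U.Vector Int
buildWithCons n = go 0 U.empty
  where
    go i acc
      | i >= n    = acc
      | otherwise = go (i + 1) (U.cons i acc)

-- Builds the same vector in one pass: the length is known up front,
-- so U.generate allocates once and fills each slot.
buildWithGenerate :: Int -> U.Vector Int
buildWithGenerate n = U.generate n (\k -> n - 1 - k)

main :: IO ()
main = print (buildWithCons 5 == buildWithGenerate 5)
```

When the final length is not known in advance, collecting the elements in a list and calling `U.fromList` once at the end is another way to avoid the repeated copying.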

transalign_prof +RTS -p -RTS ../../transalign/test-data/452.xml ../../transalign/test-data/U50_vs_SP.xml
total time  =     2907.57 secs   (2907568 ticks @ 1000 us, 1 processor)  
total alloc = 3,566,418,978,384 bytes  (excludes profiling overheads)  

COST CENTRE                   MODULE  %time %alloc  

trans_align.go                Align    35.1   11.8  
merge1                        Align    33.4   21.9  
sort_column.filter_scores     Align    10.2   12.1  
group_al'''.merge             Align    10.0    9.6  
group_al'''.toMap             Align     3.6   10.5  
group_al'''.y                 Align     1.5    1.3  
trans_align                   Align     1.1    0.7  
trans_align.go.xt             Align     0.9   12.7  
trans_align.go.yt             Align     0.8   15.4  
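
The totals of the three profiling runs above can be compared with a small helper. The figures are copied from the reports; the labels and the `reduction` function are only illustrative.

```haskell
import Text.Printf (printf)

-- (label, total time in seconds, total allocation in bytes),
-- copied from the three profiling runs above.
runs :: [(String, Double, Double)]
runs =
  [ ("boxed Vector",        7144.50, 13402222482360)
  , ("Vector.Unboxed",      6522.45,  5627596415256)
  , ("trans_align changed", 2907.57,  3566418978384)
  ]

-- Percentage by which `new` is smaller than `old`.
reduction :: Double -> Double -> Double
reduction old new = 100 * (1 - new / old)

-- Print each later run relative to the first (boxed Vector) run.
main :: IO ()
main =
  case runs of
    [] -> pure ()
    ((_, t0, a0) : rest) ->
      mapM_
        (\(name, t, a) ->
           printf "%-20s time -%.1f%%  alloc -%.1f%%\n"
             name (reduction t0 t) (reduction a0 a))
        rest
```

Relative to the boxed-vector run, this works out to roughly 9% less time and 58% less allocation for Vector.Unboxed, and roughly 59% less time and 73% less allocation after the trans_align rewrite.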

#Next Weeks

[Here](GSoC_blog/Week_5), you can find the entry for the next week.

#References

The vector package
The ADPfusion package

Back to main page.
Previous weeks.
