
MattKotzbauer/lstm-debugger


When debugging code using bug reports and code commits, what if we could model our session using an LSTM?

  • Each bug report or code commit is represented as a single time step of the LSTM
  • The cell and hidden states of the LSTM represent the state of the codebase over time
  • The candidate state of the LSTM represents changes to the code (in the case of a commit) or new knowledge about the state of the code (in the case of a bug report)
  • Our new goal: when given a bug report, pinpoint the commit that caused it by assembling a probability distribution of potential culprits
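
The mapping above can be sketched as a single LSTM step applied once per event. This is a minimal pure-Python illustration, not the repository's implementation: the gate names, `init_params`, `run_session`, the random weights, and the toy event vectors are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(input_size, hidden_size, seed=0):
    """Random weights and zero biases for the four LSTM gates (hypothetical init)."""
    rng = random.Random(seed)

    def gate():
        W = [[rng.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
             for _ in range(hidden_size)]
        return W, [0.0] * hidden_size

    return {name: gate() for name in ("forget", "input", "output", "candidate")}


def lstm_step(x, h_prev, c_prev, params):
    """Process one event (a commit or a bug report) given as embedding x."""
    z = x + h_prev  # list concatenation of the input and the previous hidden state
    pre = {name: [a + b for a, b in zip(matvec(W, z), bias)]
           for name, (W, bias) in params.items()}
    f = [sigmoid(v) for v in pre["forget"]]
    i = [sigmoid(v) for v in pre["input"]]
    o = [sigmoid(v) for v in pre["output"]]
    g = [math.tanh(v) for v in pre["candidate"]]  # candidate state: what this event adds
    # Cell state: what is retained from before plus the gated candidate state.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # Hidden state: a gated view of the cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def run_session(events, params, hidden_size):
    """Feed events in chronological order and keep one hidden state per event."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    hidden_states = []
    for x in events:
        h, c = lstm_step(x, h, c, params)
        hidden_states.append(h)
    return hidden_states, c


# Toy session: three already-embedded events (hypothetical values).
params = init_params(input_size=3, hidden_size=4)
events = [[0.2, -0.1, 0.5], [0.0, 0.3, -0.4], [1.0, 0.0, 0.0]]
hidden_states, final_cell = run_session(events, params, hidden_size=4)
print(len(hidden_states), "hidden states, one per event")
```

Keeping one hidden state per event, rather than only the final one, leaves a per-commit representation available for later comparison against a bug report.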

Demonstration of output

More specifically, working from text-based descriptions requires us to tokenize and embed each bug report or commit message before feeding it into the LSTM. For practical purposes, an attention head (where the target bug report's representation serves as the query, and the commit representations serve as keys and values) helps us reason about our main information-passing question ("did commit A cause bug B?").
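
As a rough illustration of the tokenize-and-embed step, the sketch below turns a text description into a fixed-size vector. Everything in it is an assumption made for the example: the regex tokenizer, the hashing of tokens into `VOCAB_SIZE` buckets, the random (untrained) embedding table, and the mean pooling are stand-ins, and the repository may use a different tokenizer and learned embeddings.

```python
import random
import re
import zlib

VOCAB_SIZE = 1024  # hypothetical number of hash buckets
EMBED_DIM = 8      # hypothetical embedding width


def tokenize(text):
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def token_id(token):
    """Map a token to a bucket with a hash that is stable across runs."""
    return zlib.crc32(token.encode("utf-8")) % VOCAB_SIZE


def make_embedding_table(seed=0):
    """Random embedding table; in a trained model these rows would be learned."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(EMBED_DIM)]
            for _ in range(VOCAB_SIZE)]


def embed(text, table):
    """Mean-pool the token embeddings into one vector per event."""
    ids = [token_id(t) for t in tokenize(text)]
    if not ids:
        return [0.0] * EMBED_DIM
    return [sum(table[i][d] for i in ids) / len(ids) for d in range(EMBED_DIM)]


table = make_embedding_table()
commit_vec = embed("Fix null-pointer in parse_config()", table)
bug_vec = embed("Crash when config file is empty", table)
print(tokenize("Fix null-pointer in parse_config()"))
```

Each resulting vector would then be one input event for the LSTM.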

This motivates a division of labor between the LSTM and its self-attention: the LSTM provides long-term storage of the state of our codebase (its cell state) alongside the perceived impact of that state (its hidden state), while the self-attention mechanism 'analyzes' these hidden states by scoring a query vector against their key projections and using the scores to weight their value projections.
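
A minimal sketch of that attention step follows, assuming scaled dot-product attention and assuming the attention weights themselves are read as the probability distribution over culprit commits. The README does not confirm either choice, and `blame_distribution`, the projection matrices `W_q`, `W_k`, `W_v`, and the toy hidden states are hypothetical.

```python
import math


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def blame_distribution(bug_hidden, commit_hiddens, W_q, W_k, W_v):
    """Attend from one bug report (query) over commit hidden states (keys, values)."""
    q = matvec(W_q, bug_hidden)
    keys = [matvec(W_k, h) for h in commit_hiddens]
    values = [matvec(W_v, h) for h in commit_hiddens]
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    weights = softmax(scores)  # one probability per candidate commit
    # Context vector: the attention-weighted sum of the value vectors.
    context = [sum(w * v[d] for w, v in zip(weights, values))
               for d in range(len(values[0]))]
    return weights, context


# Toy example with identity projections and made-up hidden states.
identity = [[1.0, 0.0], [0.0, 1.0]]
bug_hidden = [1.0, 0.0]
commit_hiddens = [[0.9, 0.1], [-0.8, 0.2], [0.0, 1.0]]
weights, context = blame_distribution(bug_hidden, commit_hiddens,
                                      identity, identity, identity)
print([round(w, 3) for w in weights])
```

In this toy case the first commit's hidden state is most aligned with the query, so it receives the largest share of the distribution.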

Diagram of architecture

About

LSTM-based commit blaming using bug reports and commit descriptions
