`Classifier` to detect the programming language and the `NeuralNet` to detect and locate source code in a given frame are not included, since this code repository is implemented in a learning platform and the deep learning algorithms are constantly trained and improved.
The main components of this repository are structured as follows:
- `analyzer/`: The core of the implementation, which initializes every module and saves the extracted knowledge components
- `classifier/`: Implementation of `LanguageClassifier` for classifying the programming language
- `lexer/`: Implementation of different lexers, selected according to the classified programming language
- `neural_net/`: Implementation of custom code detection and extraction using a retrained `NeuralNet` and `Tesseract`
- `parser/`: Implementation of the parser, which takes the tokens from the lexer, processes them, and recursively builds the knowledge component tree
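The way these components hand data to one another can be illustrated with a small, self-contained sketch. It is not the repository's implementation: every name in it (`Node`, `classify_language`, `lex`, `parse`, `analyze`) is hypothetical, the classifier is replaced by a trivial keyword heuristic, and the frame detection and OCR steps of `neural_net/` are skipped by assuming the source text has already been extracted.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a hypothetical knowledge component tree."""
    label: str
    children: list["Node"] = field(default_factory=list)


def classify_language(source: str) -> str:
    """Stand-in for the classifier stage: a keyword heuristic, not a trained model."""
    return "python" if re.search(r"\bdef\b", source) else "c"


def lex(source: str) -> list[str]:
    """Stand-in for the lexer stage: identifiers, integers, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def parse(tokens: list[str], pos: int = 0, label: str = "root") -> tuple[Node, int]:
    """Stand-in for the parser stage: recursively nest tokens by curly braces."""
    node = Node(label)
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == "{":
            # Opening brace: recurse to build a child subtree.
            child, pos = parse(tokens, pos + 1, "block")
            node.children.append(child)
        elif tok == "}":
            # Closing brace: this subtree is complete.
            return node, pos + 1
        else:
            node.children.append(Node(tok))
            pos += 1
    return node, pos


def analyze(source: str) -> tuple[str, Node]:
    """Stand-in for the analyzer: wire the stages together."""
    language = classify_language(source)
    tokens = lex(source)
    tree, _ = parse(tokens)
    return language, tree
```

Calling `analyze("int main() { return 0; }")` in this sketch yields the language label together with a root node whose last child is a nested `block` node holding the tokens between the braces.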
To run the repository, the following prerequisites are required:
- Ubuntu 20.04 or higher
- OpenCV 4.5 or higher