NLP: Word tokenizer | Java • A Java project that tokenizes every word in a document • For each word, stores the number of occurrences, the path to the source file, and the word's positions within that file
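
The project's source is not shown here, so the following is only a minimal sketch of how such an index could look, not the project's actual implementation. The names `WordTokenizerSketch`, `WordEntry`, and `tokenize`, and the file name `example.txt`, are hypothetical. It also assumes that a "word" is a run of letters or digits, that words are compared case-insensitively, and that "position" means the character offset of the word in the text.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordTokenizerSketch {

    /** Hypothetical index entry: the source file and every offset at which the word occurs. */
    static final class WordEntry {
        final Path source;
        final List<Integer> positions = new ArrayList<>();

        WordEntry(Path source) {
            this.source = source;
        }

        /** The occurrence count is simply the number of recorded positions. */
        int occurrences() {
            return positions.size();
        }
    }

    // Assumption: a word is a maximal run of Unicode letters or digits.
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\p{N}]+");

    /** Builds a word -> entry map for the text of a single source file. */
    static Map<String, WordEntry> tokenize(String text, Path source) {
        // LinkedHashMap keeps words in order of first appearance.
        Map<String, WordEntry> index = new LinkedHashMap<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            // Assumption: "The" and "the" count as the same word.
            String word = matcher.group().toLowerCase(Locale.ROOT);
            index.computeIfAbsent(word, w -> new WordEntry(source))
                 .positions.add(matcher.start());
        }
        return index;
    }

    public static void main(String[] args) {
        // Hypothetical path; the text is inlined so the sketch runs without a file.
        Path source = Paths.get("example.txt");
        Map<String, WordEntry> index = tokenize("The cat saw the other cat.", source);
        index.forEach((word, entry) -> System.out.println(
                word + " count=" + entry.occurrences()
                     + " file=" + entry.source
                     + " positions=" + entry.positions));
    }
}
```

Storing the list of positions and deriving the count from it keeps the two values consistent. For a real file, the text could be read with `java.nio.file.Files` before calling `tokenize`.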