Created in 2019 for the semi-automation of the generation of the French lexical cues used by French Fastcontext [1].
The repository contains the following Jupyter Notebooks:
1_Initiator.ipynb
This notebook takes the ./excels/0_Phase3_LexicalCluesENFR.xlsx as input which is an English version of FastContext with seed (google translated) French counterparts. The outcome of this notebook is the ./excels/1_Phase3_TranslationTable.xlsx which is automatic enriched information such as POS, the number of occurence of French rules, the words candidate for synonyms, etc.
2_SynonymExtraction.ipynb
This notebook takes the ./excels/1_Phase3_TranslationTable.xlsx as input which is an enriched French version of FastContext rules. The outcome of this notebook is the ./excels/2_Phase3_Word_Sense_Extraction.xlsx which is synomyms extraction for French seed words from differnet sources such as JDM French Lexical Semantics, French WordNet, Cnrtl, Synonymo, Cisco and DicSyn.
3.1_SynonymVoting.ipynb
This notebook takes the ./excels/2_Phase3_Word_Sense_Extraction.xlsx as input which is synonyms for seed French words. The outcome of this notebook is the ./excels/3_Phase3_Word_Sense_Voting_New_Format.xlsx which is top_20 sorted synomyms after applying some weights on different sources based on their reliabilities. The validator can eliminate the ones that are not proper.
3.2_SynonymVoting.ipynb
This notebook takes the ./excels/3_Phase3_Word_Sense_Voting_New_Format.xlsx as input which includes validated synonyms for seed words. The outcome of this notebook is the 3_Phase3_Word_Sense_Voting_with_Occurances_curated_CJ2_Cleaned.xlsx which is final list of synomyms for each seed word.
4_SynonymTableMerger.ipynb
This notebook takes two files as input: ./excels/1_Phase3_TranslationTable.xlsx and ./excels/3_Phase3_Word_Sense_Voting_with_Occurances_curated_CJ2_Cleaned.xlsx and update the latter excel with the available synonyms in the former one.
5_CFG_Maker.ipynb
This notebook takes the ./excels/3_Phase3_Word_Sense_Voting_with_Occurances_curated_CJ2_Cleaned.xlsx as input and generates automatically all the CFG ruls in ./notebooks folders. These notebooks can be validated.
6_CFG2Rule.ipynb
This notebook takes all the CFG rules in ./notebooks and convert them to a single file List_of_Rules.xls containg all of the rules.
7_ResultAnalyser.ipynb
This notebook takes outcomes of applied FastContext rules on HEGP and/or CepiDC datasets and provides the Precision, Recall and F1 measures as outputs for each dataset.
Mehdi Mirzapour (@mehdi-mirzapour) with supervision of Clement Jonquet (@jonquet)
LIRMM