Toolbox to evaluate Term Discovery systems.
You can install it using pip (`pip install zerospeech-tde`) or using conda (`conda install -c coml zerospeech-tde`).
- Complete documentation and a description of the metrics are available at https://docs.cognitive-ml.fr/tde/
This toolbox phonetically transcribes each discovered interval, then applies NLP evaluation metrics to judge the quality of the discovery. The metrics are:
- NED: mean of the normalized edit distance over all the discovered pairs
- coverage: percentage of the corpus covered
- token/type: measures how well the system finds gold tokens and gold types
- boundary: measures how well the system finds gold boundaries
- grouping: measures the purity of the clusters formed by the system
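To make the first two metrics concrete, here is a minimal sketch in plain Python. It does not use the toolbox's API: the function names are hypothetical, the edit distance is assumed to be a Levenshtein distance normalized by the length of the longer transcription, and coverage is assumed to be counted over phone positions. See the documentation linked above for the exact definitions used by the toolbox.

```python
# Illustrative sketch only: these helpers are NOT part of zerospeech-tde.

def edit_distance(a, b):
    """Levenshtein distance between two sequences of phone symbols."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(a, b):
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def mean_ned(pairs):
    """Mean normalized edit distance over (transcription, transcription) pairs."""
    if not pairs:
        raise ValueError("at least one discovered pair is required")
    return sum(normalized_edit_distance(a, b) for a, b in pairs) / len(pairs)


def coverage(discovered_spans, n_phones):
    """Percentage of phone positions covered by at least one discovered span.

    Spans are half-open (start, end) ranges of phone indices (an assumption
    made for this sketch).
    """
    covered = set()
    for start, end in discovered_spans:
        covered.update(range(start, end))
    return 100.0 * len(covered) / n_phones


pairs = [
    (["k", "ae", "t"], ["k", "ae", "t"]),  # identical transcriptions
    (["k", "ae", "t"], ["b", "ae", "t"]),  # one substitution out of three phones
]
ned = mean_ned(pairs)
cov = coverage([(0, 3), (2, 5)], n_phones=10)
```

A lower NED means the paired fragments are phonetically closer, while a higher coverage means more of the corpus was discovered; the two are usually read together, since a system can reach a low NED by discovering very little.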
This toolbox is used for task 2 of the zerospeech challenge. For this purpose, it has been packaged into the zerospeech evaluation toolkit.