EvalSubtitle is a tool for reference-based evaluation of subtitle segmentation.
The repository contains the Subtitle Segmentation Score (Sigma), specifically tailored for evaluating segmentation of system outputs whose text is not identical to the reference (imperfect text). EvalSub also contains a collection of standard segmentation metrics (F1, WindowDiff, etc.) as well as subtitling evaluation metrics: BLEU on segmented text (BLEU_br), BLEU on non-segmented text (BLEU_nb), and TER_br.
More details can be found in the paper.
Requires python>=3.6.0.
The script evalsub_main.py allows the computation of the following metrics:
Standard segmentation metrics:
Precision, Recall, F1, Pk, WinDiff (WindowDiff), SegSim (Segmentation Similarity), BoundSim (Boundary Similarity)
Subtitling evaluation metrics:
BLEU_br, BLEU_nb, TER_br, CPL_conf, Sigma
System and reference files should be in one of two formats: tagged text or SRT.
Tagged text files contain text interspersed with segmentation symbols: <eol>, which indicates a change of line within the same screen, and <eob>, which indicates the end of a subtitle block and a subsequent change of screen.
A line of the file corresponds to a full sentence.
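For illustration, a tagged text file with two sentences (the sentences themselves are invented) could look as follows:

We are going to talk about subtitle segmentation, <eol> and about how it can be evaluated. <eob>
This second sentence fits on a single subtitle block. <eob>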
System and reference files should contain the same number of sentences for the computation of certain metrics (BLEU_br, BLEU_nb, Sigma, TER_br).
If not, the auto_segmentation option can be used to automatically segment the system output according to the reference sentences (based on the Levenshtein alignment implementation from the SubER tool), as shown in the example below.
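For instance, assuming a system file named sys_subs.txt and a reference file named ref_subs.txt (both file names are hypothetical), automatic segmentation can be enabled like this:

python evalsub_main.py -res results.csv -sys sys_subs.txt -ref ref_subs.txt -e2e -as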
--all, -a: Compute all metrics.
--standard, -std: Compute all metrics that require identical/perfect text.
--end2end, -e2e: Compute all metrics that do not require identical/perfect text.
--include, -i: Compute only the specified metrics.
--exclude, -e: Compute all but the specified metrics.
--text, -t: Whether the text of the system subtitles is identical to the text of the reference subtitles ("perfect") or not ("imperfect"). Can be used as a safeguard to prevent computing standard metrics with imperfect text.
--system_files, -sys: Segmented subtitle files to evaluate (by default, the system files in data).
--reference_file, -ref: Reference segmented subtitle file (by default, the reference file in data).
--results_file, -res: CSV file to which the results are written.
--srt, -srt: Whether the subtitle files are in SRT format.
--auto_segmentation, -as: Whether to use automatic segmentation for the system sequences.
--window_size, -k: Window size for the window-based segmentation metrics (Pk, WinDiff) (by default, computed as half of the mean reference segmentation length).
--max_transpo, -n: Maximum distance that can be counted as a boundary transposition error (by default, 2). Specific to SegSim and BoundSim.
--max_cpl, -cpl: Maximum allowed length for subtitle lines (by default, 42).
--confidence_interval, -ci: If set, compute (and print) the confidence interval (CI) for BLEU and Sigma. The CI is computed using bootstrap resampling (with 95% confidence).
Note: metric names must be written exactly as in the list above.
Compute all end-to-end metrics but TER_br, for the automatic subtitles in data:
python evalsub_main.py -res results.csv -e2e -e TER_br
Compute only Sigma, BLEU_br and BLEU_nb, for the automatic subtitles in data:
python evalsub_main.py -res results.csv -i Sigma BLEU_br BLEU_nb
Compute all metrics that are compatible with imperfect text:
python evalsub_main.py -res results.csv -a -t imperfect
This is equivalent to:
python evalsub_main.py -res results.csv -e2e
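For instance, to also report 95% confidence intervals for BLEU and Sigma (an illustrative command combining options documented above):

python evalsub_main.py -res results.csv -i Sigma BLEU_br BLEU_nb -ci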
If you use EvalSubtitle in your research, please cite the following paper:
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
