SPRAAK
spr_scoreres.c File Reference

(Re)calculate the CSR score for a given result file.

Detailed Description

(Re)calculate the CSR score for a given result file.

spr_scoreres [-CIS] [-PAR] [-c corpus_fname]
    [-ref ref_txt_fname_or_string] [-r result_fname] [-tst tst_txt_fname] [-dic dictionary]
    [-dist distance_matrix] [-nr new_result_fname] [-res new_result_fname] [-omit words_to_omit]
Parameters
-CIS  (flag) correct skipped
Correct recognition results are not listed in the result file, i.e. any entry for which no result is found is considered correct.
-PAR  (flag) print all
Print all recognition results, not only the wrong ones.
-c corpus_fname
Name of the corpus file containing the reference transcriptions.
-ref ref_txt_fname_or_string
Alternative way to provide the reference transcriptions: lines containing word sequences delimited by white-space characters, one line per entry. A single reference entry can also be given directly on the command line; such a string must start with '@' as its first word (this word is ignored in the alignment).
-r result_fname
Name of the result file containing the test sequences.
-tst tst_txt_fname
Alternative way to provide the test transcriptions: lines containing word sequences delimited by white-space characters, one line per entry. A single test string can also be given directly on the command line; such a string must start with '@' as its first word (this word is ignored in the alignment).
-dic dictionary
Name of an optional dictionary (only needed in combination with a distance matrix).
-dist distance_matrix
File defining the distance between the dictionary elements (requires a dictionary), or a reference to a known distance function "@<func_name>[param1=val;param2=val;..]"; at this moment only a simple count of unique characters (or character sequences), i.e. those occurring in wrd1 and not in wrd2 or vice versa, is supported: "@chr_diff[N=2;w1=1;w2=0.1]"
-nr new_result_fname
Name of the file to write the results to.
-res new_result_fname
Alternative way to get the results: one line per arc in the alignment; each line contains two words, the reference word on the left and the test word on the right; empty words are replaced with '/'; entries are separated by an empty line.
-omit words_to_omit
Words that have to be omitted before scoring.

This program calculates the word error rate from a reference file and a recognition result. The recognition results are read either from a result file produced by the spr_eval.py script or from a text file containing one line per entry, with words separated by spaces. The reference transcriptions are read from a corpus file or from a text file containing one line per entry, with words separated by spaces. Every occurrence of the words given with the -omit option (e.g. sentence-ending markers) can be omitted from both the correct and the recognised string before calculating the score. A result summary is written to standard output.
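The word error rate reported here is the standard one: the minimal number of substitutions, deletions and insertions needed to turn the reference word sequence into the test sequence, divided by the reference length. A minimal sketch (plain Levenshtein alignment on word sequences; the omit filtering mirrors the -omit option, not SPRAAK's actual code):

```python
def word_error_rate(ref, hyp, omit=()):
    """Word error rate via Levenshtein alignment on words; words listed
    in 'omit' are removed from both sequences before scoring."""
    ref = [w for w in ref.split() if w not in omit]
    hyp = [w for w in hyp.split() if w not in omit]
    # d[i][j] = minimal edit cost between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        d[i][0] = i                      # deleting i reference words
    for j in range(1, len(hyp) + 1):
        d[0][j] = j                      # inserting j test words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub,           # match / substitution
                          d[i - 1][j] + 1,   # deletion
                          d[i][j - 1] + 1)   # insertion
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

For example, scoring "a x c" against the reference "a b c" gives one substitution out of three reference words.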

Instead of only a summary, a new result file can also be written (option -nr), with the omitted words removed. In this case nothing is written to standard output. If the -PAR flag is set, all sentences are printed in the result file, even the correct ones. The resulting alignment can also be dumped in a simple format using the '-res' option.
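The '-res' dump format described above (one ref/test word pair per line, '/' for an empty word, entries separated by a blank line) is easy to post-process; a short sketch with hypothetical helper names, assuming exactly the documented line format:

```python
def parse_res(lines):
    """Parse a '-res' alignment dump into a list of entries; each entry
    is a list of (ref, hyp) pairs, with None for an empty ('/') word."""
    entries, arcs = [], []
    for line in lines:
        line = line.strip()
        if not line:                 # blank line ends the current entry
            if arcs:
                entries.append(arcs)
                arcs = []
            continue
        ref, hyp = line.split()
        arcs.append((None if ref == "/" else ref,
                     None if hyp == "/" else hyp))
    if arcs:
        entries.append(arcs)
    return entries

def count_errors(entry):
    """Count (substitutions, deletions, insertions) in one entry."""
    sub = sum(1 for r, h in entry if r and h and r != h)
    dele = sum(1 for r, h in entry if r and not h)
    ins = sum(1 for r, h in entry if h and not r)
    return sub, dele, ins
```

A deletion shows up as a '/' on the test side of the arc, an insertion as a '/' on the reference side.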

Author
Jacques Duchateau, Kris Demuynck
Date
07/05/1996 - JD
Creation
15/05/2004 - KD
Adapted for new cwr_err conventions
15/09/2015 - KD
Added the options to process simple text files