(Re)calculate the CSR score for a given result file.
spr_scoreres [-CIS] [-PAR] [-c corpus_fname] [-ref ref_txt_fname_or_string]
             [-r result_fname] [-tst tst_txt_fname] [-dic dictionary]
             [-dist distance_matrix] [-nr new_result_fname] [-res new_result_fname]
             [-omit words_to_omit]
- Parameters
-CIS (flag: correct skipped)
    Correct recognition results are not listed in the result file, i.e. any entry for which no result is found is considered correct.
-PAR (flag: print all)
    Print all recognition results, not only the wrong ones.
-c corpus_fname
    Name of the corpus file containing the reference transcriptions.
-ref ref_txt_fname_or_string
    Alternative way to provide the reference transcriptions: lines containing word sequences delimited by white space characters, one line per entry. A single reference entry can also be given directly on the command line; such a string must start with '@' as its first word (this word is ignored in the alignment).
-r result_fname
    Name of the result file containing the test sequences.
-tst tst_txt_fname
    Alternative way to provide the test transcriptions: lines containing word sequences delimited by white space characters, one line per entry. A single test string can also be given directly on the command line; such a string must start with '@' as its first word (this word is ignored in the alignment).
-dic dictionary
    Name of an optional dictionary (only needed in combination with a distance matrix).
-dist distance_matrix
    File defining the distance between the dictionary elements (requires a dictionary), or a reference to a known distance function "@<func_name>[param1=val;param2=val;..]". At this moment only a simple count of unique characters (or character sequences) is supported, i.e. those occurring in wrd1 but not in wrd2 or vice versa: "@chr_diff[N=2;w1=1;w2=0.1]" (see the sketch after this parameter list).
-nr new_result_fname
    Name of the file to write the results to.
-res new_result_fname
    Alternative way to get the results: one line per arc in the alignment; each line contains two words, the reference word on the left and the test word on the right; empty words are replaced with '/'; entries are separated by an empty line (an example is given further below).
-omit words_to_omit
    Words that have to be omitted before scoring.
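
The exact semantics of the built-in @chr_diff function are not spelled out here; the following is a minimal Python sketch of the behaviour described above (a weighted count of character sequences occurring in one word but not in the other). Interpreting N as the maximum sequence length and w1/w2 as the weights for the two directions is an assumption, not something stated in this documentation.

    def chr_diff(wrd1, wrd2, N=2, w1=1.0, w2=0.1):
        # Hypothetical re-implementation of '@chr_diff': a weighted count of
        # unique character sequences (length 1..N) present in one word but
        # not in the other. Parameter semantics are assumed, not documented.
        def seqs(word, n):
            # all unique character sequences of length 1..n in the word
            return {word[i:i + k] for k in range(1, n + 1)
                    for i in range(len(word) - k + 1)}
        s1, s2 = seqs(wrd1, N), seqs(wrd2, N)
        # sequences occurring in wrd1 only, weighted by w1; in wrd2 only, by w2
        return w1 * len(s1 - s2) + w2 * len(s2 - s1)

    print(chr_diff("spraak", "spreek"))  # similar words give a small distance
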
This program calculates the word error rate from a reference and a recognition result. The recognition results are read either from a result file produced by the spr_eval.py script or from a text file containing one line per entry, with the words separated by spaces. The reference transcriptions are read from a corpus file or from a text file in the same one-line-per-entry format. Every occurrence of the words given with option -omit (e.g. a sentence-end marker) can be removed from both the correct and the recognized string before the score is calculated (the underlying alignment is sketched after the list below). A result summary is written to standard output:
- the number of reference entries
- the number of correctly recognized strings
- the insertion, deletion and substitution counts
- the word recognition rate (based on the total number of errors)
- the string recognition rate (string fully correct)
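
The scoring itself amounts to a standard Levenshtein alignment between the reference and the test word sequence, counting insertions, deletions and substitutions. The following Python sketch illustrates that computation; it is not the actual spr_scoreres implementation, and all names in it are hypothetical.

    def align_score(ref, tst):
        # Levenshtein alignment of two word sequences; returns the number of
        # substitutions, deletions and insertions (illustrative sketch only,
        # not the actual spr_scoreres implementation).
        n, m = len(ref), len(tst)
        # d[i][j] = (cost, #sub, #del, #ins) for aligning ref[:i] with tst[:j]
        d = [[None] * (m + 1) for _ in range(n + 1)]
        d[0][0] = (0, 0, 0, 0)
        for i in range(1, n + 1):
            d[i][0] = (i, 0, i, 0)          # empty test: i deletions
        for j in range(1, m + 1):
            d[0][j] = (j, 0, 0, j)          # empty reference: j insertions
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                c_sub = 0 if ref[i - 1] == tst[j - 1] else 1
                cands = [
                    (d[i - 1][j - 1][0] + c_sub, d[i - 1][j - 1], (c_sub, 0, 0)),
                    (d[i - 1][j][0] + 1,         d[i - 1][j],     (0, 1, 0)),
                    (d[i][j - 1][0] + 1,         d[i][j - 1],     (0, 0, 1)),
                ]
                cost, prev, (s, dl, ins) = min(cands, key=lambda c: c[0])
                d[i][j] = (cost, prev[1] + s, prev[2] + dl, prev[3] + ins)
        return d[n][m][1:]                  # (#sub, #del, #ins)

    ref = "the cat sat on the mat".split()
    tst = "the cat sit on mat".split()
    sub, dele, ins = align_score(ref, tst)
    wer = (sub + dele + ins) / len(ref)
    print(f"S={sub} D={dele} I={ins}  WER={wer:.1%}")  # S=1 D=1 I=0  WER=33.3%

The word recognition rate in the summary then corresponds to one minus this error rate, and the string recognition rate is the fraction of entries aligned with zero errors.
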
Instead of only a summary, a new result file can also be written (option -nr), with the omitted words removed. In that case, nothing is written to standard output. If the flag -PAR is set, all sentences are printed in the result file, even the correct ones. The resulting alignment can also be dumped in a simple format using the '-res' option (see the example below).
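
For the hypothetical "the cat sat on the mat" / "the cat sit on mat" pair used in the sketch above, a '-res' dump would look along these lines (exact spacing is not specified in the description):

    the  the
    cat  cat
    sat  sit
    on   on
    the  /
    mat  mat

A deleted reference word shows up with '/' on the right, an inserted test word with '/' on the left, and consecutive entries are separated by an empty line.
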
- Author
- Jacques Duchateau, Kris Demuynck
- Date
- 07/05/1996 - JD
- Creation
- 15/05/2004 - KD
- Adapted for new cwr_err conventions
- 15/09/2015 - KD
- Added options to process simple text files