I've just added some code to automatically estimate semantic error rate for translation applications. It does more or less the same thing as the code we've had for a while in dialogue apps, and counts an example from a speech corpus as semantically correct if it produces the same interlingua as the transcription would have done.
Unfortunately, the problem with this definition is that it doesn't work for utterances that are in domain but outside grammar coverage. For example, I was just looking through the results for the English MedSLT corpus. In one example, the transcription is "does the pain ache", which is outside grammar coverage, so it produces no interlingua to compare against. The first hypothesis which produces well-formed interlingua is "does the pain feel aching", which is a good paraphrase and is selected. So this should really be counted as semantically correct, but isn't.
I think we can address the problem by allowing the developer to declare a file of paraphrases, and say that the example is semantically correct if it gives the same result as either the actual transcription or one of its paraphrases. Then if the developer adds in-coverage paraphrases where they exist, things will work correctly. This should be easy to implement. We probably also want a warning if a declared paraphrase itself turns out to be out of coverage.
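To make the proposed check concrete, here is a minimal sketch in Python. It is not the Regulus implementation: the function name `semantically_correct`, the `to_interlingua` callable (assumed to return `None` for out-of-coverage input) and the dictionary representation of the paraphrase file are all hypothetical, chosen only for illustration.

```python
import warnings
from typing import Callable, Dict, Hashable, List, Optional

# Hypothetical stand-in for the real pipeline: maps a sentence to its
# interlingua, or returns None if the sentence is out of grammar coverage.
ToInterlingua = Callable[[str], Optional[Hashable]]


def semantically_correct(
    recognised: str,
    transcription: str,
    paraphrases: Dict[str, List[str]],
    to_interlingua: ToInterlingua,
) -> bool:
    """True if the recognised string yields the same interlingua as the
    transcription or as one of its declared paraphrases."""
    result = to_interlingua(recognised)
    if result is None:
        return False

    matched = False

    # The transcription itself may be out of coverage; that is the case
    # the paraphrases are meant to handle, so no warning is issued for it.
    if to_interlingua(transcription) == result:
        matched = True

    for paraphrase in paraphrases.get(transcription, []):
        reference = to_interlingua(paraphrase)
        if reference is None:
            # A paraphrase that is itself out of coverage is useless,
            # so tell the developer about it.
            warnings.warn(
                f"paraphrase out of coverage: {paraphrase!r}", stacklevel=2
            )
        elif reference == result:
            matched = True

    return matched


# Toy coverage table (assumed data, for illustration only).
TOY_COVERAGE = {
    "does the pain feel aching": ("yn_question", "pain", "aching"),
}


def toy_to_interlingua(sentence: str) -> Optional[Hashable]:
    return TOY_COVERAGE.get(sentence)


TOY_PARAPHRASES = {"does the pain ache": ["does the pain feel aching"]}
```

With the toy data above, the MedSLT example is counted as correct once the paraphrase is declared, and as incorrect if the paraphrase table is empty. The loop deliberately checks every paraphrase rather than stopping at the first match, so that all out-of-coverage paraphrases are reported.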
This paraphrase functionality should also be useful for the N-best rescoring work that Maria and I have been doing for dialogue apps. We have the same problem there: we want to be able to experiment with out-of-coverage examples, but currently get no figures for them.