Saturday 31 May 2008

Progress on N-best rescoring

Maria Georgescul and I have been doing some work over the last few days on N-best rescoring, using the Calendar application as a test-bed. The basic division of labor was for me to define features and transform N-best hypothesis lists into lists of feature vectors, while Maria fed these into an SVM-based learner to perform the actual rescoring. We did the experiments using a set of 459 recorded utterances. Rescoring now reduces the semantic error rate from 19% to 11%, and the word error rate (WER) from 11% to 10%.

I defined the features by looking at examples of N-best lists and finding common patterns that I felt should intuitively be penalized. The current set of features is as follows:


rank: Place in the N-best list
no_dialogue_move: Hypothesis produces no dialogue move
underconstrained_query: Query with no contentful constraints
non_indefinite_existential: Existentials with a non-indefinite argument, e.g. "is there the meeting next week"
non_show_imperative: Imperatives where the main verb isn't "show" or something similar
indefinite_meeting_and_meeting_referent: Combination of an indefinite mention of a meeting with an available meeting referent
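
To give a concrete idea of what a feature definition looks like, here's a minimal Prolog sketch of how a couple of these features could be computed for a single hypothesis and collected into a feature vector. The accessors hypothesis_rank/2 and hypothesis_dialogue_move/2, and the no_move convention, are hypothetical stand-ins rather than the real Regulus interface.

:- use_module(library(lists)).   % member/2 (needed in SICStus; harmless in SWI)

% Sketch only: hypothesis_rank/2 and hypothesis_dialogue_move/2 are
% hypothetical accessors standing in for the real Regulus data structures.

% rank: the position in the N-best list is used directly as a numeric feature.
feature_value(rank, Hyp, Rank) :-
        hypothesis_rank(Hyp, Rank).

% no_dialogue_move: 1.0 if dialogue processing produced no move, 0.0 otherwise
% (assuming here that failure is signalled by the atom no_move).
feature_value(no_dialogue_move, Hyp, 1.0) :-
        hypothesis_dialogue_move(Hyp, no_move),
        !.
feature_value(no_dialogue_move, _Hyp, 0.0).

% A hypothesis becomes a feature vector by evaluating each named feature.
feature_vector(Hyp, FeatureNames, Vector) :-
        findall(Name=Value,
                ( member(Name, FeatureNames),
                  feature_value(Name, Hyp, Value) ),
                Vector).

The SVM learner then only sees the resulting numeric vectors, one per N-best hypothesis.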

Wednesday 28 May 2008

Printing N-best feature info at top level

If you're in dialogue mode and have N-best preferences defined, the feature values and scores are now printed out at the top level. This is useful for debugging feature definitions. Here's an example from the Calendar application:


>> what was the last meeting

Old state: [lf=[[whq,form(past,[[be,term(the_last,meeting,[]),[loc,where]]])]],
referents=[record(meeting,meeting_10),attribute(meeting,meeting_10,where)]]
LF: [[whq,form(past,[[be,term(the_last,meeting,[]),term(what,null,[])]])]]
Resolved LF: [[whq,form(past,[[be,term(the_last,meeting,[]),term(what,null,[])]])]]
Resolution: [trivial]
Dialogue move: [tense_information=referent(past), utterance_type=whq,
aggregate(last_n_meetings(1),[])]
Resolved move: [tense_information=interval(datime(1980,0,0,0,0,0),datime(2008,5,28,18,27,24)),
utterance_type=whq, aggregate(last_n_meetings(1),[])]
Paraphrase: list meetings in past the last meeting
Abstract action: say(referent_list([record(meeting,meeting_10)]))
Concrete action: tts(meeting at pierrette 's room on november 25)
New state: [lf=[[whq,form(past,[[be,term(the_last,meeting,[]),term(what,null,[])]])]],
referents=[attribute(meeting,meeting_10,where),record(meeting,meeting_10)]]

N-BEST FEATURES AND SCORES:

rank -1.00 * 0.00 = 0.00
no_dialogue_move -50.00 * 0.00 = 0.00
underconstrained_query -10.00 * 0.00 = 0.00
inconsistent_tense -10.00 * 0.00 = 0.00
non_indefinite_existential -10.00 * 0.00 = 0.00
non_show_imperative -50.00 * 0.00 = 0.00
definite_meeting_and_meeting_referent 3.00 * 0.00 = 0.00

Total score: 0.00

Dialogue processing time: 0.00 seconds

>> when did that meeting start

Old state: [lf=[[whq,form(past,[[be,term(the_last,meeting,[]),term(what,null,[])]])]],
referents=[attribute(meeting,meeting_10,where),record(meeting,meeting_10)]]
LF: [[whq,form(past,[[start,term(that,meeting,[])],[time,when]])]]
Resolved LF: [[whq,form(past,[[start,term(that,meeting,[])],[time,when]])]]
Resolution: [trivial]
Dialogue move: [query_object=start_time, referent_from_context=meeting,
tense_information=referent(past), utterance_type=whq]
Resolved move: [meeting=meeting_10, query_object=start_time, referent_from_context=meeting,
tense_information=interval(datime(1980,0,0,0,0,0),datime(2008,5,28,18,27,35)),
utterance_type=whq]
Paraphrase: start time for that meeting in past
Abstract action: say(referent_list([attribute(meeting,meeting_10,start_time)]))
Concrete action: tts(10 00 on november 25)
New state: [lf=[[whq,form(past,[[start,term(that,meeting,[])],[time,when]])]],
referents=[attribute(meeting,meeting_10,where),record(meeting,meeting_10),
attribute(meeting,meeting_10,start_time)]]

N-BEST FEATURES AND SCORES:

rank -1.00 * 0.00 = 0.00
no_dialogue_move -50.00 * 0.00 = 0.00
underconstrained_query -10.00 * 0.00 = 0.00
inconsistent_tense -10.00 * 0.00 = 0.00
non_indefinite_existential -10.00 * 0.00 = 0.00
non_show_imperative -50.00 * 0.00 = 0.00
definite_meeting_and_meeting_referent 3.00 * 1.00 = 3.00

Total score: 3.00

Dialogue processing time: 0.01 seconds
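
As the printout shows, the score is just a weighted sum: each line contributes weight * value, and the total is the sum of the contributions. Here's a minimal sketch of that computation, using an illustrative feature(Name, Weight, Value) representation rather than the real internal one:

% Sketch of the scoring in the printout above: each feature contributes
% Weight * Value, and the total score is the sum of the contributions.
% feature(Name, Weight, Value) is an illustrative representation only.

nbest_score([], 0.0).
nbest_score([feature(_Name, Weight, Value)|Rest], Score) :-
        nbest_score(Rest, RestScore),
        Score is RestScore + Weight * Value.

% For the second utterance above (omitting the zero-valued features,
% which contribute nothing):
% ?- nbest_score([feature(rank, -1.0, 0.0),
%                 feature(definite_meeting_and_meeting_referent, 3.0, 1.0)],
%                Score).
% Score = 3.0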

Tuesday 27 May 2008

Catching Regulus errors

Peter Ljunglöf wondered whether error reporting in Regulus could be improved, and had a couple of suggestions. I've implemented and checked in the following improvements:
  1. All error messages should now be printed to stderr.

  2. When processing fails during execution of a Regulus command, a line of the form

    Error processing command:

    should now be printed. This was not previously the case.

  3. There is a new top-level predicate

    regulus_batch_storing_errors(+ConfigFile, +Commands, -ErrorString)

    which is like regulus_batch/2, except that it instantiates ErrorString with a string containing all the errors printed out during execution of Commands.
I expect there will be some glitches (I had to change a lot of lines of code), so please let me know if things don't work as intended.
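
If you want to try the new predicate, here's a hypothetical usage sketch; the config file path and command names are made up for illustration (and I've written the commands as atoms), so substitute whatever your own application uses:

% Hypothetical usage sketch for regulus_batch_storing_errors/3: run a batch of
% commands and report whatever errors were printed along the way. The config
% file and command names below are purely illustrative.

check_batch_for_errors(ConfigFile, Commands) :-
        regulus_batch_storing_errors(ConfigFile, Commands, ErrorString),
        (   ErrorString = "" ->
            format('Batch run completed with no errors.~n', [])
        ;   format('Errors during batch run:~n~s~n', [ErrorString])
        ).

% e.g. check_batch_for_errors('scripts/toy1.cfg', ['LOAD', 'NUANCE']).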

Building help resources from the combined interlingua corpus (2)

Considerable progress on this task today:
  1. I've added French to the AFF interlingua corpora, including the New York material as requested by Pierrette. The corpora are remade and checked in.
  2. The help resources for Eng and Ara (the languages where we have help class definitions) are now made from the combined interlingua corpus. A separate help file is made for each of the six pairs EngAra, EngFre, EngJap, AraEng, AraFre, AraJap, reflecting the different levels of coverage. You can make the help resources for all of these pairs by doing 'make help_resources' in $MED_SLT2 (i.e. at the top level in the MedSLT directory), and it only takes a few minutes.

Building help resources from the combined interlingua corpus

I've just checked in code that allows us to build Prolog help resources from the combined interlingua corpus in multi-lingual translation applications. This will make it much easier to integrate construction of help resources into the MedSLT build - it should now be almost trivial.

I'm currently remaking the interlingua corpus (I have had to change the format a little), and should be able to check in all the relevant MedSLT stuff later this evening.

Flagging ambiguity in interlingua checking

I've just checked in code to catch cases where interlingua is ambiguous, in the sense of generating multiple different surface strings in the interlingua grammar. This is most likely to occur in AFF, when the to-interlingua rules are underconstrained and the interlingua is only partially instantiated. The following Japanese -> Interlingua example in MedSLT illustrates:

>> doko ga itami masu ka

Source: doko ga itami masu ka
Target: WH-QUESTION pain be where PRESENT ACTIVE
Other info:
n_parses = 1
parse_time = 0.297
source_representation = [null=[path_proc,itamu], null=[tense,present],
null=[utterance_type,question], subject=[body_part,doko]]
source_discourse = [null=[utterance_type,question], subject=[body_part,doko],
null=[tense,present], null=[path_proc,itamu]]
resolved_source_discourse = [null=[utterance_type,question], subject=[body_part,doko],
null=[tense,present], null=[path_proc,itamu]]
resolution_processing = trivial
interlingua = [loc=[loc,where], arg1=[secondary_symptom,pain], null=[tense,present],
null=[utterance_type,whq], null=[verb,be], null=[voice,active]]
interlingua_surface = WH-QUESTION pain be where PRESENT ACTIVE
other_interlingua_surface = [WH-QUESTION pain be above-loc where PRESENT ACTIVE,
WH-QUESTION pain be around-loc where PRESENT ACTIVE,
WH-QUESTION pain be between-loc where PRESENT ACTIVE,
WH-QUESTION pain be in-loc where PRESENT ACTIVE,
WH-QUESTION pain be under-loc where PRESENT ACTIVE]
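
The check itself is conceptually simple: generate every surface string the interlingua grammar licenses for the (possibly only partially instantiated) interlingua, and complain when there is more than one. Roughly along these lines, where interlingua_surface_form/2 is a hypothetical stand-in for the real generation predicate:

:- use_module(library(lists)).   % member/2 (needed in SICStus; harmless in SWI)

% Sketch of the ambiguity check: interlingua_surface_form/2 stands in for
% generation of a surface string through the interlingua grammar.
flag_interlingua_ambiguity(Interlingua) :-
        setof(Surface, interlingua_surface_form(Interlingua, Surface), Surfaces),
        (   Surfaces = [_SingleReading] ->
            true
        ;   length(Surfaces, N),
            format('Warning: interlingua generates ~d distinct surface strings:~n', [N]),
            (   member(S, Surfaces),
                format('  ~w~n', [S]),
                fail
            ;   true
            )
        ).

On the Japanese example above this would list six strings: the preferred reading plus the five other-loc variants.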


Background

If you've reached this blog and don't have any idea what it's about, Regulus is an Open Source platform for constructing speech-enabled systems, which we've been developing since 2001. We've now built several high-profile applications, including Clarissa, so far the only speech-enabled system to have flown in space, and MedSLT, a medical speech translator. You can read more about Regulus here.

First entry

Rather than mail people about new Regulus features, fixes, etc., I am starting a blog. Don't know why I didn't do this earlier!