Saturday, 31 May 2008

Progress on N-best rescoring

Maria Georgescul and I have spent the last few days working on N-best rescoring, using the Calendar application as a test-bed. The division of labor was that I defined the features and transformed N-best hypothesis lists into lists of feature vectors, while Maria fed these into an SVM-based learner to perform the actual rescoring. We ran the experiments on a set of 459 recorded utterances. Rescoring now reduces the semantic error rate from 19% to 11%, and the word error rate (WER) from 11% to 10%.

I defined the features by looking through examples of N-best lists and picking out recurring patterns that I felt intuitively should be penalized. The current set of features is as follows:

rank: Place in the N-best list

no_dialogue_move: Hypothesis produces no dialogue move

underconstrained_query: Query with no contentful constraints

non_indefinite_existential: Existential with a non-indefinite argument, e.g. "is there the meeting next week"

non_show_imperative: Imperatives where the main verb isn't "show" or something similar

indefinite_meeting_and_meeting_referent: Combination of an indefinite mention of a meeting and an available meeting referent
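
To make the feature-vector step concrete, here is a minimal Python sketch of how one hypothesis could be turned into a vector over the features listed above, and how a linear scorer could then pick a winner. It is not the code we actually ran: the Hypothesis class, the flags representation, the second example hypothesis and the hand-picked weights are all assumptions made for illustration, and the weighted sum merely stands in for the trained SVM-based learner.

```python
from dataclasses import dataclass

# Feature names taken from the list above; "rank" is numeric, the rest are binary.
FEATURE_NAMES = [
    "rank",
    "no_dialogue_move",
    "underconstrained_query",
    "non_indefinite_existential",
    "non_show_imperative",
    "indefinite_meeting_and_meeting_referent",
]


@dataclass
class Hypothesis:
    """One entry in an N-best list (hypothetical representation)."""
    text: str
    rank: int                        # 1-based place in the N-best list
    flags: frozenset = frozenset()   # names of the binary features that fire


def feature_vector(hyp):
    """Map a hypothesis to a list of floats, ordered as in FEATURE_NAMES."""
    binary = [1.0 if name in hyp.flags else 0.0 for name in FEATURE_NAMES[1:]]
    return [float(hyp.rank)] + binary


def rescore(nbest, weights):
    """Return the hypothesis with the highest weighted feature sum.

    The weights are an assumption standing in for a learned model.
    """
    def score(hyp):
        return sum(w * x for w, x in zip(weights, feature_vector(hyp)))
    return max(nbest, key=score)


# Illustrative two-element N-best list; the second entry is invented.
nbest = [
    Hypothesis("is there the meeting next week", 1,
               frozenset({"non_indefinite_existential"})),
    Hypothesis("is there a meeting next week", 2),
]

# Hand-picked penalties (assumed, not learned): every feature lowers the score.
weights = [-1.0, -5.0, -3.0, -3.0, -3.0, -3.0]

best = rescore(nbest, weights)
print(best.text)
```

With these assumed weights, the penalty on non_indefinite_existential outweighs the one-place difference in rank, so the second hypothesis is preferred over the recognizer's first choice.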
