Wednesday, 23 September 2009

New treatment of Japanese verbs

I've been discussing Japanese verbs with Yukie - we need new inflectional forms for CALL-SLT, in particular the desiderative (-tai) form, and the old system was getting out of hand. Japanese has extraordinarily regular morphology, with only two irregular verbs and some very straightforward sound-changes, so it seemed to me that we really ought to be able to get by without explicitly listing all the inflections of every verb we needed.

Yukie wrote down a table of inflections, and we discussed ways of splitting up inflected verbs into stems and affixes. Based on our discussion, I've implemented a first version of a new treatment, where you now only need to specify a single root form of the verb. Everything else is done by morphotax rules, and the affixes are treated by Nuance as separate words. I've tested this by converting the Japanese Calendar lexicon to the new form and compiling it into a recognizer. Coverage is unchanged, and recognition is anecdotally fine with my voice. I will dig out some Japanese Calendar data soon and run proper tests.

If anyone wants to look at the details, the morphotax rules are in $REGULUS/Grammar/Japanese/japanese_verb_morphology.regulus. The new version of the Japanese Calendar lexicon is at $REGULUS/Examples/Calendar/Regulus/japanese_calendar_lex_new.regulus.
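As a rough illustration of the stem/affix idea (this is not the content of the morphotax rules file, and the function names are invented for the example), the Python sketch below splits a romanized consonant-stem verb into a stem and a stem affix, then appends the polite past affix. It assumes the dictionary form is given in Hepburn-style romaji, and it deliberately ignores vowel-stem verbs and the two irregular verbs. The resulting segmentation of "owaru" is the same one that appears in the parse example in this post.

```python
# Hypothetical sketch: maps the final mora of a consonant-stem (godan)
# verb's dictionary form to the mora used before -mashita.
I_ROW = {
    "u": "i", "ku": "ki", "gu": "gi", "su": "shi", "tsu": "chi",
    "nu": "ni", "bu": "bi", "mu": "mi", "ru": "ri",
}


def split_godan(dictionary_form):
    """Return (stem, stem_affix) for a romanized consonant-stem verb."""
    # Try the longest endings first, so "tsu" is preferred over "su" and "u".
    for ending in sorted(I_ROW, key=len, reverse=True):
        if dictionary_form.endswith(ending):
            return dictionary_form[: -len(ending)], I_ROW[ending]
    raise ValueError("not a recognised dictionary form: " + dictionary_form)


def polite_past(dictionary_form):
    """Return the verb as a list of separate 'words': stem, stem affix, affix."""
    stem, stem_affix = split_godan(dictionary_form)
    return [stem, stem_affix, "mashita"]


print(" ".join(polite_past("owaru")))
```

Only the root form has to be listed per verb; the small table of endings is shared by all verbs of this class.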

Here's an example of a parse:

$ nanji ni owa ri mashita ka

(Parsing with left-corner parser)

Analysis time: 0.09 seconds

Return value: [[question,form(past,[[owaru],[ni,term(null,nanji,[])]])]]

Global value: []

Syn features: []

Parse tree:

utterance [JAPANESE_CORE_RULES:120-123]
/ main_clause [JAPANESE_CORE_RULES:147-151]
| / comps [JAPANESE_CORE_RULES:190-195]
| | / pp [JAPANESE_CORE_RULES:414-423]
| | | / np [JAPANESE_CORE_RULES:267-273]
| | | | n lex(nanji) [JAPANESE_CALENDAR_LEX_NEW:84-84]
| | | \ p lex(ni) [JAPANESE_CALENDAR_LEX_NEW:274-284]
| | \ comps null [JAPANESE_CORE_RULES:163-166]
| | vbar [JAPANESE_CORE_RULES:249-253]
| | / v_stem [JAPANESE_VERB_MORPHOLOGY:27-38]
| | | / v_stem lex(owa) [JAPANESE_CALENDAR_LEX_NEW:227-237]
| | | \ stem_affix lex(ri) [JAPANESE_VERB_MORPHOLOGY:138-138]
| \ \ affix lex(mashita) [JAPANESE_VERB_MORPHOLOGY:80-83]
\ lex(ka)

------------------------------- FILES -------------------------------


Thursday, 17 September 2009

"Abstract actions" and the dialogue server

I've had some productive discussions with Maria over the last few days, which have resulted in a couple of significant improvements to the dialogue server. Maria is going to build a Java GUI for the CALL-SLT system. She needs to be able to send requests to the dialogue server, and get back information that she will pass on to the user. Most often this will be in the form of screen-based output. The new functionality is motivated by this scenario, but is quite generic.

The first point Maria made was that she would prefer to use XML-formatted messages. Java finds it easy to manipulate XML; parsing Prolog messages, on the other hand, is a complete pain. So I added switches that allow the client to put the server into a mode where all messages are XML strings inside a minimal Prolog wrapper.

Yesterday, Maria made another very sensible request. In the first version of the application, the Prolog "output manager" module received abstract actions, and transformed them into concrete actions. Typically, concrete actions would involve printing strings. So, for example, suppose that the system has just given you the prompt


and you have correctly replied

i would like a table outside please

The abstract action produced is

display_matching_info('i would like a table outside please', correct, [2,1,2])

which means that the recognized words were 'i would like a table outside please'; they correctly matched the prompt; and the score is now 2 correct, 1 incorrect, with a positive streak of 2. This is converted by the output manager into the concrete action

print('I heard: "i would like a table outside please"


Score: 2 right, 1 wrong (66.7%) Streak: 2')

This works fine for a text-based command-line interface; but, as Maria pointed out, the output processing isn't necessarily appropriate when you have a Java Swing GUI, in which case you probably would prefer to format it yourself. For instance, you might want to print the recognized words in one pane, render the "correct" as a green tick-mark in another one, and present the score graphically as three columns of different heights.
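To make the difference concrete, here is a small Python sketch of what a client might do with the abstract action from the example. It assumes the action arrives as the text of the Prolog term shown above; the class and function names are invented for illustration, and the parser only handles this one simple shape (no escaped quotes inside the recognized words). The text rendering only approximates the printed output in the example.

```python
import re
from dataclasses import dataclass

# Matches only the simple term shape from the example in this post.
_PATTERN = re.compile(
    r"display_matching_info\('([^']*)',\s*(\w+),\s*\[(\d+),\s*(\d+),\s*(\d+)\]\)"
)


@dataclass
class MatchingInfo:
    words: str
    correct: bool
    right: int
    wrong: int
    streak: int


def parse_matching_info(text):
    """Turn the abstract action text into a structured record."""
    m = _PATTERN.fullmatch(text.strip())
    if m is None:
        raise ValueError("unexpected abstract action: " + text)
    words, judgement, right, wrong, streak = m.groups()
    return MatchingInfo(words, judgement == "correct",
                        int(right), int(wrong), int(streak))


def concrete_text(info):
    """Roughly what a text-based output manager would print."""
    total = info.right + info.wrong
    percent = 100.0 * info.right / total if total else 0.0
    return (f'I heard: "{info.words}"\n'
            f"Score: {info.right} right, {info.wrong} wrong "
            f"({percent:.1f}%) Streak: {info.streak}")


def gui_fields(info):
    """What a GUI client might keep instead: one value per widget."""
    return {
        "words_pane": info.words,
        "tick": info.correct,
        "columns": (info.right, info.wrong, info.streak),
    }


action = "display_matching_info('i would like a table outside please', correct, [2,1,2])"
info = parse_matching_info(action)
print(concrete_text(info))
print(gui_fields(info))
```

The structured record is the useful part: the text version can always be produced from it, but not the other way round without re-parsing a string.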

In general, the abstract action is going to be more useful to you than the concrete one. So I've just added a little more functionality to the dialogue server to handle that too. Here's a summary of the new messages, and what they do; they are also documented in the file itself, $REGULUS/Prolog/
  • action(xml_messages). Format future messages in both directions in XML form. Each message will be of the form


    where XMLString is an XML encoding of the corresponding Prolog message produced using the predicate prolog_xml/2 in $REGULUS/PrologLib/. The XML can be converted back into Prolog if necessary using the same predicate.
  • action(prolog_messages). Format future messages in both directions in Prolog form (default).
  • action(abstract_actions). Pass abstract actions to the client, so that the client can do its own output management.
  • action(concrete_actions). Pass concrete actions to the client (default).
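The four messages amount to two independent switches, each with a documented default. The Python sketch below models only that bookkeeping on the client side; the class name and attribute names are invented, and nothing here talks to the real server.

```python
class ServerModes:
    """Client-side record of which modes the dialogue server has been put in."""

    # Message name -> (attribute to set, new value). Hypothetical naming.
    SWITCHES = {
        "xml_messages": ("message_format", "xml"),
        "prolog_messages": ("message_format", "prolog"),
        "abstract_actions": ("action_level", "abstract"),
        "concrete_actions": ("action_level", "concrete"),
    }

    def __init__(self):
        # Defaults as listed above: Prolog messages, concrete actions.
        self.message_format = "prolog"
        self.action_level = "concrete"

    def apply(self, switch):
        """Record the effect of sending action(<switch>) to the server."""
        attribute, value = self.SWITCHES[switch]
        setattr(self, attribute, value)


modes = ServerModes()
modes.apply("xml_messages")
modes.apply("abstract_actions")
print(modes.message_format, modes.action_level)
```

A GUI client of the kind described above would presumably send both switches once at start-up and leave them alone afterwards.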

Tuesday, 15 September 2009

Warning: don't use SICStus 4.0.2

Maria and I have just spent two very frustrating days trying to figure out why CALL-SLT wasn't running correctly on her machine. In the end, it turned out that a few bits of Regulus functionality don't work correctly under SICStus 4.0.2, which is the version she was using... there appears to be something wrong with the SICStus/operating system interface.

So avoid this release! 4.0.4 is fine.

Sunday, 13 September 2009

Dialogue server now accepts XML format messages

Another of those things I should no doubt have done years ago: after a discussion with Maria, I've now modified the dialogue server so that it can also run in a mode where all messages are XML-formatted.
Details (this is documented in $REGULUS/Prolog/):

- Initially, the server is in Prolog mode.

- To put the server into XML mode, send the message


- Subsequent messages are of the form


where XMLMessage is an XML encoding of the corresponding Prolog message produced using the predicate prolog_xml/2 in $REGULUS/PrologLib/
The XML can be converted back into Prolog if necessary using the same predicate.

I have tested this by converting the CALL-SLT Prolog client to use XML-format messages, and it all works fine.

The routines in $REGULUS/PrologLib/ should in general be useful for translating Prolog into XML form in a reversible way. Look at the file for documentation and an example.
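For readers who have not seen a reversible term-to-XML encoding before, here is a minimal Python sketch of the general idea. It is not the format produced by prolog_xml/2: the element names (term, atom, number, list) and the Python representation of terms (strings for atoms, ints for numbers, lists for lists, tuples for compound terms) are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET


def to_xml(term):
    """Encode a term as an XML element (invented element names)."""
    if isinstance(term, int) and not isinstance(term, bool):
        element = ET.Element("number")
        element.text = str(term)
    elif isinstance(term, str):
        element = ET.Element("atom")
        element.text = term
    elif isinstance(term, list):
        element = ET.Element("list")
        element.extend([to_xml(item) for item in term])
    elif isinstance(term, tuple) and term:
        functor, *args = term
        element = ET.Element("term", functor=functor)
        element.extend([to_xml(arg) for arg in args])
    else:
        raise TypeError("cannot encode: %r" % (term,))
    return element


def from_xml(element):
    """Decode an element produced by to_xml back into a term."""
    if element.tag == "number":
        return int(element.text)
    if element.tag == "atom":
        return element.text or ""
    if element.tag == "list":
        return [from_xml(child) for child in element]
    if element.tag == "term":
        return (element.get("functor"), *[from_xml(child) for child in element])
    raise ValueError("unexpected element: " + element.tag)


term = ("display_matching_info",
        "i would like a table outside please", "correct", [2, 1, 2])
xml_string = ET.tostring(to_xml(term), encoding="unicode")
print(xml_string)
print(from_xml(ET.fromstring(xml_string)) == term)
```

The point of keeping one element type per kind of term is that decoding never has to guess: the tag alone says how to rebuild the value.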

Monday, 7 September 2009

CALL-SLT and Japanese

I have just added a little more coverage to the Japanese version... it now has about a dozen sentences for the student to practice on. I tried it, and so far it still recognizes everything I say. This is probably more because the vocabulary is so small than because I have a wonderful Japanese accent :)

Yukie and I should talk about how to proceed here. The first step will be to add material to the Japanese corpus.

Saturday, 5 September 2009

Using recorded wavfiles as help information in CALL-SLT (part 2)

I did a little more fiddling around with the translation game strategy code, and it's now possible to define a strategy where the system only chooses entries which don't have an associated wavfile. The idea is to make it easy for the teacher to add missing wavfiles.

I tested it on English, and we now have a complete set of wavfiles for that language. As soon as we have a bit more coverage for French and Japanese, I'll add similar scripts for them too.
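The selection rule itself is simple. A Python sketch, with invented names and assuming prompts are plain strings and the set of already-recorded prompts is known:

```python
import random


def choose_unrecorded(prompts, recorded, rng=random):
    """Pick a prompt that has no associated wavfile yet, or None if all do."""
    candidates = [prompt for prompt in prompts if prompt not in recorded]
    return rng.choice(candidates) if candidates else None


prompts = ["ask for a table outside", "ask for the bill"]
print(choose_unrecorded(prompts, recorded={"ask for the bill"}))
```

Once the function returns None for a language, the set of help recordings for that language is complete.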

Friday, 4 September 2009

Using recorded wavfiles as help information in CALL-SLT

I've got a new feature working on CALL-SLT, which allows speech input to be logged and reused as help for students. When you start the system, it asks whether or not you wish to be considered a native speaker. If you answer yes, it keeps the wavfile for each successful match, and stores it in such a way that the wavfile is associated with the current prompt. Subsequently, if a student is given the same prompt and hits the HELP button, the native speaker's wavfile is replayed. By construction, we know that the native speaker was correctly recognized, so if the student can just imitate them well enough they should be recognized too.

The idea is simple, but there were some messy technical problems... a bad interaction between Nuance and SICStus concerning relative pathnames, and the question of what happens if two different users try to check in new wavfiles simultaneously. I think I have decent solutions, though. For more details, look at the online documentation which I have just added. This also tells you how to download and run the system.
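The solutions actually used in CALL-SLT are described in the documentation rather than here. Purely as an illustration of the two problems, the Python sketch below shows one common way to handle them: resolve the storage directory to an absolute path once, and check in each wavfile by writing a temporary file and renaming it into place, so that two simultaneous check-ins for the same prompt cannot leave a half-written file. The directory layout, the hashing of prompts into file names, and all function names are assumptions.

```python
import hashlib
import os
import tempfile


def wavfile_path(help_dir, prompt):
    """Absolute path of the help wavfile for a prompt (invented naming scheme)."""
    digest = hashlib.sha1(prompt.encode("utf-8")).hexdigest()
    return os.path.join(os.path.abspath(help_dir), digest + ".wav")


def check_in_wavfile(help_dir, prompt, wav_bytes):
    """Store a recording for a prompt; the final rename is atomic."""
    target = wavfile_path(help_dir, prompt)
    directory = os.path.dirname(target)
    os.makedirs(directory, exist_ok=True)
    # Write to a temporary file in the same directory, then rename it.
    fd, temporary = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        handle.write(wav_bytes)
    os.replace(temporary, target)
    return target


def help_wavfile(help_dir, prompt):
    """Return the stored wavfile for a prompt, or None if there is none."""
    path = wavfile_path(help_dir, prompt)
    return path if os.path.exists(path) else None


with tempfile.TemporaryDirectory() as demo_dir:
    print(help_wavfile(demo_dir, "ask for a table outside"))
    check_in_wavfile(demo_dir, "ask for a table outside", b"RIFF")
    print(help_wavfile(demo_dir, "ask for a table outside") is not None)
```

With this scheme, if two native speakers check in a recording for the same prompt at the same moment, the later rename simply wins and the student still gets a complete file.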