Wednesday, 11 June 2008

Interlingua corpora for multiple domains

Following a discussion with Pierrette last week, I have added two more MedSLT Interlingua corpora, for the chest pain and abdominal pain domains. I've also added all the associated config files, scripts, etc. for the currently relevant language pairs (EngInt, JapInt, IntEng, IntFre and IntJap), so it should now be possible to do systematic interlingua-centered development for all three domains. I have only built AFF versions, since we're planning to retire the linear formalism soon.

The naming conventions are the usual ones. I believe I have checked everything in, but let me know if any files you expected to find are missing. At some point, Pierrette should do some work tidying up IntFre and FreInt, and Yukie should do the same for IntJap and JapInt. Further down the line, we should really add coverage for these domains in the missing languages.
