1996 SALS-SIG Seminars

SALS-SIG Research Seminar

Lojban as a Machine Translation Interlingua in the Pacific

Nick Nicholas
Department of Linguistics and Applied Linguistics, University of Melbourne

When: Friday, 4th October 1996

Time: 3:00pm

Where: Room E6A357, Macquarie University


It has been argued that a human-like interlingua in Machine Translation would yield significant benefits in multilingual systems. Interlinguas, however, are rarely used in practical MT systems, with the exception of Esperanto in the DLT project. There is a feeling that interlinguas which retain too many of their own idiosyncrasies (as has been argued for DLT-Esperanto) render the multilingual translation problem overly complex. Rendering each source and target language into the interlingua can itself become a forbidding task if the interlingua is not sufficiently well defined; and if the source and target languages are already close to each other in their linguistic structures, building up interlingua representational structures de novo is superfluous effort. It has been argued that this is the case for multilingual translation in the European context: the practicalities of the task do not justify elaborate source-interlingua and interlingua-target translation modules when a much sketchier intermediate representation can do the work much more efficiently.

The situation in the Pacific Rim, however, is not the same as that in Europe. Typological and lexicological diversity is much greater amongst the languages of this region, and it is not as obvious that the effort required in constructing fully elaborated, language-like intermediate representations in multilingual MT is as superfluous here as in Europe: the 'common ground' shared by languages in the region is much smaller, and therefore the intermediate representation needs to be much more explicit and detailed.

I outline in this presentation a feasibility study on Lojban as such an explicit interlingua. Lojban is an artificial language designed collectively by the members of the Logical Language Group, as a continuation of the earlier Loglan project, originally intended as a test of the Sapir-Whorf hypothesis. The aims of the Lojban project have broadened since, and the major interests of those involved are either formal-semantic or computational. Lojban is based for the most part on predicate logic; that is to say, the metalanguage used to describe the various facets of the language's grammar is expressed in terms of predicate logic. Although professional linguists have been involved with the language design effort, the language itself has been mostly designed by amateur linguists, and some of the design decisions are open to debate.

Lojban compensates for this, however, with the breadth with which it is defined. It is much easier for an amateur effort such as this to set out to construct a formal model of a full language, whereas the formal models in the formal semantic literature, though perhaps more rigorous, are necessarily restricted to a small subset of the semantic scope of a language. Lojban therefore has enough breadth of coverage to be practical as an interlingua, by the same rationale as argued for Esperanto by the DLT researchers. At the same time, it is explicit enough in its definition that MT modules using Lojban as an interlingua can be developed formally and methodically, without relying on subjective judgements.

I consider several linguistic facets of MT, to ascertain whether Lojban provides a workable interlingua for such a task in the Pacific context. The successes and failures of Lojban in this regard are intended to illustrate what properties one might and might not require of an MT interlingua.

