The undergraduate program in Language Technology is made up of five semester-long units, one in first year and two in each of second and third year:
- SLP148: Language, Logic and Computation
- COMP248: Language Technology
- COMP249: Web Technology
- COMP348: Document Processing and the Semantic Web
- COMP349: Spoken Language Dialogue Systems
Descriptions of these units can be found below, and more details are available in the online Study Guides.
Brief information about Language Technology, its applications and the undergraduate program can be found in this leaflet about our Language Technology program.
These units can be taken as part of any of the undergraduate degree programs offered by the Department of Computing; they can also be taken as part of degree programs in other subjects, provided that the appropriate prerequisites are met.
The content of the units has been designed in consultation with the program's Management Advisory Board, which includes representatives from each of the industry partners in the Language Technology Program. As a result, students undertaking the program can be sure they are acquiring skills that will position them well for jobs in this exciting industry.
SLP148: Language, Logic and Computation
This unit explores the interface between language, logic and computing, for students in any of the three contributing disciplines: Linguistics, Computing and Philosophy. It provides an introduction to the issues that arise in the computational processing of human language, using basic concepts in linguistics and logic, and addressing the surface of language as well as its underlying structures. It covers basic concepts in Prolog programming, and includes an overview of natural language processing. The unit is co-taught by staff from Linguistics, Philosophy and Computing, through a series of interactive lectures and forums.
This unit represents an excellent opportunity for those who want to do some of the more advanced units in the Language Technology program but have limited background in the area.
COMP248: Language Technology
This unit provides a broad introduction to the kinds of applications that are developed within the field of Natural Language Processing, and the techniques and algorithms required to build these applications. Students acquire both theoretical background and practical experience in processing the syntax and semantics of natural language. The practical skills gained are put to use in the development of a chatterbot, a Prolog-based natural language system, and a speech application.
The unit covers:
- Natural Language Processing: applications and key issues;
- Prolog programming: rules, recursion and lists;
- Chatterbots, AIML;
- Basics of dialogue systems;
- The lexicon and morphology;
- Phrase structure grammars and English syntax;
- Definite Clause Grammars in Prolog; features and structure building in DCGs;
- Top-down and bottom-up parsing strategies;
- Semantics and logical forms;
- Speech recognition;
- Speech production; and
- Speech-based dialogue systems.
On completion of the unit, the student will have a broad understanding of what is involved in NLP generally and a good awareness of the issues in syntactic and semantic processing. The student will also be comfortable with writing grammars to handle natural language data, and processing the information they deliver. The unit provides an excellent basis for our more advanced Language Technology units, COMP348 and COMP349.
COMP249: Web Technology
This unit covers a broad range of techniques and concepts that are relevant to the design, implementation and maintenance of systems on the World Wide Web. From Web site development using HyperText Markup Language (HTML) and eXtensible Markup Language (XML), through to complete client-server applications written in Perl and Java, the unit explores the full spectrum of this new and exciting technology and provides Internet programming skills to those who have basic programming experience.
The unit covers:
- A history of the Internet and the World Wide Web;
- The Internet backbone and protocols;
- Search engines and search strategies;
- HTML and style sheets;
- Web design and content development;
- XML and XML applications;
- Client-server computing and the Common Gateway Interface;
- Web servers and Web application servers;
- Web-database integration;
- Web site security;
- Graphics on the Web; and
- Peer to peer applications.
On completion of this unit, the student will have a profound understanding of established and emerging Web technologies. With its pragmatic focus, the unit provides technical abilities and programming skills that are immediately useful in building fully featured Web sites, and prepares the student well for tomorrow's job market.
COMP348: Document Processing and the Semantic Web
This unit explores the issues involved in building natural-language-processing (NLP) applications that operate on large bodies of real text such as those found on the World Wide Web (WWW). Application areas covered include Web-based technologies like information retrieval, document summarisation, machine translation, and information extraction. From a theoretical perspective, the unit focuses on the concepts and techniques required to process real natural-language text; students also gain practical experience in using the Python programming language to develop various NLP systems that exercise these techniques.
The unit covers:
- Basic concepts in linguistics; the role of linguistic knowledge in language technologies;
- Corpus-based approaches; marked-up corpora; SGML and XML;
- Tokenisation and sentence segmentation;
- Morphological analysis; the Porter stemming algorithm;
- Part of speech tagging; the Brill tagger;
- Document summarisation;
- Information retrieval;
- Word-sense disambiguation;
- Information extraction;
- Named entity recognition;
- Syntactic analysis and shallow parsing; finite state methods;
- Reference and anaphora; the coreference task; and
- Approaches to machine translation.
On completion of the unit, the student will have a good understanding of what is involved in building working systems that operate on real documents and web pages. The student will also be able to determine what techniques are required for specific NLP applications, and will be aware of the tools that are available to support these kinds of developments. The unit is excellent preparation for working in the text and document processing industries; it also provides an appropriate grounding for more advanced studies such as the material in our COMP448 honours unit, Advanced Topics in Natural Language Processing.
COMP349: Spoken Language Dialogue Systems
This unit explores the issues involved in building significant Spoken Language Dialogue Systems (SLDSs). Students will gain the practical experience required to develop SLDSs and learn basic strategies for designing user-friendly speech applications. The unit will equip students with the skills and knowledge necessary to find employment in the emerging speech processing industry. Recently, VoiceXML has opened up huge business opportunities. In recognition of this, the unit includes a number of guest lectures that give students insights into Australia's speech processing industry.
The unit covers:
- SLDS architecture and components;
- SLDS lifecycle: specification, design, testing and tuning;
- Dialogue design, prompts and grammar writing;
- Building habitable systems: dialogue structure and cooperativity; and
- The future of SLDSs.
On completion of the unit, the student will have a good understanding of what is involved in building real interactive systems that use natural language, and will understand what is required in order to make these systems acceptable to users. The unit is excellent preparation for working in the spoken language systems industry; it also provides an appropriate grounding for more advanced studies such as the material in our COMP448 honours unit, Advanced Topics in Natural Language Processing.