Calendar

  • 12 April 2010

    11:00-13:00
    Slides

  • 13 April 2010

    11:00-13:00
    Slides

  • 14 April 2010

    11:00-13:00
    Slides

  • 15 April 2010

    11:00-13:00
    Slides

  • 16 April 2010

    11:00-13:00
    Slides

  • 19 April 2010

    11:00-13:00
    Slides

  • 20 April 2010

    11:00-13:00
    Slides

  • 21 April 2010

    11:00-13:00
    Slides

  • 22 April 2010

    11:00-13:00
    Slides

  • 23 April 2010

    11:00-13:00
    Slides

Prof. Niladri Chatterjee
Indian Institute of Technology Delhi, Delhi

Overview

Machine Translation (MT) is one of the hardest challenges undertaken by computer science, and recent advances in statistical and machine learning techniques have brought its solution closer. The course will present an overview of the history, approaches, progress and difficulties of MT. It will introduce the building blocks of MT from linguistics and probability, and will cover the major models for machine translation. Recent research will be reported, along with the major outstanding challenges. In particular, the following topics will be covered:

  • Introduction to NLP and MT
  • Statistics Preliminaries
  • Language Modeling
  • Word-based Models
  • Higher Order Models and Overview of Available Software
  • Phrase-Based Translation (PBT)
  • PBT and Decoding
  • Discriminative Training
  • Hidden Markov Models and Maximum Entropy
  • Metrics for Evaluating Translation Quality

Alternative MT approaches will be discussed, and current research trends in the field will be presented.

The course is part of the activities of the Dottorato in Informatica (PhD in Informatics) at the Università di Pisa.

Bibliography

  • Philipp Koehn, Statistical Machine Translation, Cambridge University Press, 2009

Exam