Our sponsors

AMI Consortium

IM2

EU

University of Sheffield

Research Software

Bob

Bob is a software tool for managing and generating lexicons and pronunciation dictionaries to aid the development of automatic speech recognition (ASR) systems. The tool is written in Java and is freely available for noncommercial use under a Creative Commons licence. Specifically, it targets lexicon and pronunciation dictionary generation in a way that maintains:
  • consistency between spellings in the lexicon and spellings in text corpora used for language modelling
  • consistency in use, by non-phoneticians, of a set of phones to describe the pronunciation of words in a language.
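The two consistency properties above can be illustrated with a small sketch. This is not Bob's code or API (Bob itself is written in Java); the function names, the lexicon layout (word mapped to a list of phone sequences) and the toy phone set are all assumptions made for illustration.

```python
from collections import Counter

def out_of_lexicon_words(corpus_lines, lexicon):
    """Count corpus tokens whose spelling has no entry in the lexicon."""
    counts = Counter()
    for line in corpus_lines:
        for token in line.split():
            if token not in lexicon:
                counts[token] += 1
    return counts

def invalid_phones(lexicon, phone_set):
    """Map each word to the phones in its pronunciations that are not in phone_set."""
    problems = {}
    for word, pronunciations in lexicon.items():
        bad = sorted({p for pron in pronunciations for p in pron if p not in phone_set})
        if bad:
            problems[word] = bad
    return problems

# Hypothetical toy data: an agreed phone set and a lexicon that should use only those phones.
phone_set = {"k", "ae", "t", "d", "ao", "g", "ah", "l", "er"}
lexicon = {
    "cat": [["k", "ae", "t"]],
    "dog": [["d", "ao", "g"]],
    "colour": [["k", "ah", "l", "er"]],
    "tack": [["t", "a", "k"]],  # "a" is not in the agreed phone set
}
corpus = ["cat dog", "color cat color"]  # "color" is spelt differently from the lexicon

missing = out_of_lexicon_words(corpus, lexicon)
bad_phones = invalid_phones(lexicon, phone_set)
print(dict(missing))
print(bad_phones)
```

In this toy setup the first check flags the corpus spelling "color", which has no lexicon entry, and the second flags the entry for "tack", whose pronunciation uses a symbol outside the agreed phone set.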

Bob builds on the experience, knowledge and procedures that have been applied successfully in the NIST Rich Transcription (RT) evaluations. It is flexible enough to let users generate lexicons tailored to specific tasks, including excluding classes of words (e.g. expletives). Word attribute labels can be customised and extended to suit users' needs. It is portable across languages: the interface is not specific to English, and the letter-to-sound pronunciation predictor can be retrained for any language.
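As a rough illustration of task-specific tailoring, the sketch below filters a lexicon by word attribute labels. It is a hypothetical example rather than Bob's actual data model: the entry layout, the label names and the `tailor_lexicon` function are assumptions.

```python
def tailor_lexicon(entries, excluded_labels):
    """Return a copy of the lexicon without words carrying any excluded label."""
    excluded = set(excluded_labels)
    return {
        word: entry
        for word, entry in entries.items()
        if not (set(entry["labels"]) & excluded)
    }

# Hypothetical entries: each word has a pronunciation and a set of attribute labels.
entries = {
    "meeting": {"pron": ["m", "iy", "t", "ih", "ng"], "labels": {"noun"}},
    "um": {"pron": ["ah", "m"], "labels": {"filler"}},
    "damn": {"pron": ["d", "ae", "m"], "labels": {"expletive"}},
}

# Build a task-specific lexicon that leaves out expletives.
tailored = tailor_lexicon(entries, {"expletive"})
print(sorted(tailored))
```

Because labels are plain strings here, new labels can be added to entries without changing the filter, which mirrors the idea of customisable and extensible attribute labels.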

Creative Commons Licence
Bob is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales Licence.

pronunciation generation

text normalisation