Institute of Linguistics, Adam Mickiewicz University

Investigationes Linguisticae

  Bulletin devoted to general, comparative and applied linguistics

Creating and Weighting Hunspell Dictionaries as Finite-State Automata
Tommi Pirinen, Krister Lindén (Department of Modern Languages, University of Helsinki, Finland)

Source: Investigationes Linguisticae, Volume XXI, 2010, pp. 1-16
Category: Full Papers

Language: en

English Abstract:

There are numerous formats for writing spell-checkers for open-source systems, and many lexical descriptions of natural languages have been written in these formats. In this paper, we demonstrate a method for converting Hunspell and related spell-checking lexicons into finite-state automata. We also present a simple way to apply unigram corpus training in order to improve the spell-checking suggestion mechanism using weighted finite-state technology. The finite-state spell-checking system appears to be an order of magnitude faster than the Hunspell approach. What we propose is a generic and efficient language-independent framework of weighted finite-state automata for spell-checking in typical open-source software, e.g. Mozilla Firefox, OpenOffice and the Gnome desktop.
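
The unigram training mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each lexicon word receives a weight equal to its negative log probability in a corpus (the convention commonly used with weighted automata), that add-one smoothing handles unseen words, and that suggestions are ranked by ascending weight. The function names `unigram_weights` and `rank_suggestions` are hypothetical, and a plain dictionary stands in for the weighted automaton.

```python
import math
from collections import Counter


def unigram_weights(corpus_tokens, lexicon):
    """Assign each lexicon word the weight -log P(word).

    Assumption: add-one smoothing, so words absent from the corpus
    still receive a finite (large) weight.
    """
    counts = Counter(corpus_tokens)
    total = sum(counts[w] for w in lexicon) + len(lexicon)
    return {w: -math.log((counts[w] + 1) / total) for w in lexicon}


def rank_suggestions(candidates, weights):
    """Order correction candidates so the lowest weight (most frequent) comes first."""
    return sorted(candidates, key=lambda w: weights[w])


corpus = "the cat sat on the mat the end".split()
lexicon = {"the", "then", "than"}
weights = unigram_weights(corpus, lexicon)
suggestions = rank_suggestions(["then", "the", "than"], weights)
```

In this toy corpus "the" occurs three times and the other candidates never, so "the" is ranked first. In a finite-state setting the same weights would be attached to the paths of the lexicon automaton rather than stored in a dictionary.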

BibTeX Entry:

@article{ pirinen_inve21,
 author="Tommi Pirinen and Krister Lindén",
 title= "Creating and Weighting Hunspell Dictionaries as Finite-State Automata",
 journal="Investigationes Linguisticae",
 url="" }
Copyright © Institute of Linguistics, Adam Mickiewicz University