About Hunspell
--------------

Hunspell is a spell checker and morphological analyzer library and program
designed for languages with rich morphology and complex word compounding or
character encoding. Hunspell interfaces: Ispell-like terminal interface
using the Curses library, Ispell pipe interface, OpenOffice.org UNO module.

Hunspell's code base comes from the OpenOffice.org MySpell
(http://lingucomponent.openoffice.org/MySpell-3.zip). See the README.myspell,
AUTHORS.myspell and license.myspell files.
Hunspell is designed to eventually replace MySpell in OpenOffice.org.

Main features of the Hunspell spell checker and morphological analyzer:

- Unicode support (affix rules work only with the first 65535 Unicode characters)

- Morphological analysis (in custom item and arrangement style) and stemming

- Max. 65535 affix classes and twofold affix stripping (for agglutinative
  languages, like Azeri, Basque, Estonian, Finnish, Hungarian, Turkish, etc.)

- Support for complex compounding (for example, Hungarian and German)

- Support for language-specific features (for example, special casing of
  Azeri and Turkish dotted i, or German sharp s)

- Handling of conditional affixes, circumfixes, fogemorphemes,
  forbidden words, pseudoroots and homonyms

- Free software (LGPL, GPL, MPL tri-license)

Compiling on Unix/Linux
-----------------------

./configure
make
make install

For dictionary development, use the --with-warnings option of configure.

For the interactive user interface of the Hunspell executable, use the
--with-ui option.

The developer package you need to compile Hunspell's interface:

glibc-devel

Optional developer packages:

ncurses (needed for --with-ui), e.g. libncursesw5 for UTF-8
readline (for fancy input line editing,
  configure parameter: --with-readline)
locale and gettext (but you can also use the
  --with-included-gettext configure parameter)

The Hunspell distribution uses new Autoconf (2.59) and Automake (1.9).

Compiling on Windows
--------------------

1. Compiling with Windows SDK

Download the free Windows SDK from Microsoft, open a command prompt
window and cd into hunspell/src/win_api. Use the following command
to compile hunspell:

vcbuild

2. Compiling in Cygwin environment

Download and install the Cygwin environment for Windows with the following
extra packages:

make
gcc-g++ development package
mingw development package (for cygwin.dll free native Windows compilation)
ncurses, readline (for user interface)
iconv (character conversion)

2.1. Cygwin1.dll dependent compiling

Open a Cygwin shell, cd into the hunspell root directory:

./configure
make
make install

For dictionary development, use the --with-warnings option of configure.

For the interactive user interface of the Hunspell executable, use the
--with-ui option.

readline configure parameter: --with-readline (for fancy input line editing)

2.2. Cygwin1.dll free compiling

Open a Cygwin shell, cd into hunspell/src/win_api and run:

make -f Makefile.cygwin

Testing
-------

Testing Hunspell (see tests in the tests/ subdirectory):

make check

or with the Valgrind debugger:

make check
VALGRIND=[Valgrind_tool] make check

For example:

make check
VALGRIND=memcheck make check

Documentation
-------------

Features and dictionary format:
man 5 hunspell

man hunspell
hunspell -h
http://hunspell.sourceforge.net

Usage
-----

The src/tools directory contains the following executables after compiling
(some of them are in src/win_api instead):

affixcompress: dictionary generation from large (millions of words) vocabularies
  analyze: example of spell checking, stemming and morphological analysis
  chmorph: example of automatic morphological generation and conversion
  example: example of spell checking and suggestion
 hunspell: main program for spell checking and others (see manual)
   hunzip: decompressor of hzip format
     hzip: compressor of hzip format
makealias: alias compression (Hunspell only, not backward compatible with MySpell)
    munch: dictionary generation from vocabularies (it needs an affix file, too)
  unmunch: list all recognized words of a MySpell dictionary
wordforms: word generation (Hunspell version of unmunch)

After compiling and installing (see INSTALL) you can
run the Hunspell spell checker (compiled with user interface)
with a Hunspell or MySpell dictionary:

hunspell -d en_US text.txt

or without the interface:

hunspell
hunspell -d en_UK -l <text.txt

Dictionaries consist of an affix file and a dictionary file, see tests/
or http://wiki.services.openoffice.org/wiki/Dictionaries.

Using Hunspell library with GCC
-------------------------------

Including in your program:

#include <hunspell.hxx>

Linking with the Hunspell static library (the library is named after the
source file, so that the linker can resolve the symbols the source uses):

g++ example.cxx -lhunspell

Dictionaries
------------

MySpell & Hunspell dictionaries:
http://extensions.libreoffice.org
http://cgit.freedesktop.org/libreoffice/dictionaries
http://extensions.openoffice.org
http://wiki.services.openoffice.org/wiki/Dictionaries

Aspell dictionaries (need some conversion):
ftp://ftp.gnu.org/gnu/aspell/dict
Conversion steps: see the relevant feature request at http://hunspell.sf.net.

László Németh
nemeth at numbertext org
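
Example: listing rejected words from a program
----------------------------------------------

As a follow-up to the notes on using the Hunspell library with GCC, here is
a minimal sketch of a program-side equivalent of "hunspell -l", which lists
the misspelled words of its input. The helper name list_misspelled and the
callback parameter are assumptions made for this illustration and are not
part of Hunspell. Tokens are split on whitespace only, which is much simpler
than the tokenization the hunspell program performs. The commented wiring at
the end assumes the constructor taking an affix path and a dictionary path
and the spell() method taking a C string, as declared in hunspell.hxx; check
the header shipped with your version before relying on it.

```cpp
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of Hunspell: returns the whitespace-separated
// tokens of `text` that `is_correct` rejects, in their original order.
std::vector<std::string> list_misspelled(
    const std::string& text,
    const std::function<bool(const std::string&)>& is_correct) {
    std::vector<std::string> rejected;
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        if (!is_correct(word)) {
            rejected.push_back(word);
        }
    }
    return rejected;
}

// Possible wiring once the library is installed (file paths are placeholders):
//
//   #include <hunspell.hxx>
//   Hunspell checker("en_US.aff", "en_US.dic");
//   auto bad = list_misspelled("some text", [&checker](const std::string& w) {
//       return checker.spell(w.c_str()) != 0;
//   });
```

Passing the checker as a callback keeps the helper independent of the
library, so it can be tried with a plain word list before any dictionary
files are available.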