HindiDict Readme

Heike Cappel

0. General Information

HindiDict is written in C. I have tested it on SuSE Linux 8.2 and 9.0, SPARC Solaris 9 and Mac OS X (10.3). It should work on any platform that provides the standard C libraries and uses the ASCII character set. To use the output file, LaTeX with two additional packages (see below) is required.

1. What is it and what does it do?

HindiDict takes a tab-separated text file containing Hindi words (in Velthuis encoding), word type, gender, English translation and usage, and creates a LaTeX dictionary from them. It uses the LaTeX packages devnag by Frans Velthuis and lexikon by Axel Kielhorn, both available from CTAN (www.ctan.org).
Any field in the input file apart from the Hindi and English fields can be empty, as long as the right number of tabs is used. The Hindi words have to be in the Velthuis Transliteration Scheme, but HindiDict adds the delimiters needed by devnag, the Hindi LaTeX preprocessor. The usage field is meant for comments and examples. As it can contain a mixture of English and Hindi, all Hindi text that appears in it has to be enclosed in the devnag delimiters {\dn ...}.
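
The input format described above can be illustrated with a short sketch. It is not taken from the HindiDict sources: the function name split_entry and the constant NUM_FIELDS are made up for this example, and the field order (Hindi, word type, gender, English, usage) is assumed to be the order in which the fields are listed above.

```c
#include <stddef.h>
#include <string.h>

#define NUM_FIELDS 5  /* assumed order: Hindi, word type, gender, English, usage */

/* Hypothetical helper, not part of HindiDict itself.
 * Splits `line` in place at tab characters and stores pointers to the
 * five fields in `fields`. Returns 1 on success, 0 if the line does not
 * contain exactly four tabs or if the Hindi or English field is empty. */
int split_entry(char *line, char *fields[NUM_FIELDS])
{
    size_t len = strlen(line);
    int count = 0;
    char *start = line;
    char *tab;

    /* Drop a trailing newline or carriage return, as left by fgets. */
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        line[--len] = '\0';

    /* strtok would merge adjacent tabs and lose empty fields,
     * so the line is walked with strchr instead. */
    while ((tab = strchr(start, '\t')) != NULL) {
        if (count == NUM_FIELDS - 1)
            return 0;                /* too many tabs */
        *tab = '\0';
        fields[count++] = start;
        start = tab + 1;
    }
    fields[count++] = start;         /* last field, after the final tab */

    if (count != NUM_FIELDS)
        return 0;                    /* too few tabs */

    /* Only the Hindi (0) and English (3) fields are mandatory. */
    return fields[0][0] != '\0' && fields[3][0] != '\0';
}
```

With this sketch, a line such as "ghar", tab, tab, "m", tab, "house", tab is accepted with empty word type and usage fields, while a line with fewer than four tabs is rejected.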

In the dictionary file that HindiDict creates, the data from the input file is given twice: first sorted in Devanagari order with the Hindi words on the left of the page, then sorted alphabetically with the English words on the left. Each new letter in a list is marked by an index letter so that words can be found easily. Before the output file can be processed by LaTeX, it has to be run through the Hindi preprocessor, devnag. This requires the file name to end in .dn, so the file has to be renamed if necessary.
HindiDict can be called in two ways: without arguments or with two arguments. If two arguments are given, the first is used as the name of the input file and the second as the name of the output file. If any other number of arguments is given, HindiDict ignores them and asks the user for the file names instead. Once the program has two file names, it checks whether the output file already exists and warns before overwriting it.
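
The argument handling described here could look roughly like the following sketch. It only uses standard C library calls; the helper names names_from_args and file_exists are hypothetical and are not claimed to match the actual HindiDict code.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if `name` can be opened for reading,
 * which this sketch treats as "the file already exists". */
int file_exists(const char *name)
{
    FILE *fp = fopen(name, "r");

    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}

/* Hypothetical helper: returns 1 and sets *in and *out if exactly two
 * arguments were given (argv[0] is the program name, so argc is 3).
 * Returns 0 for any other argument count, in which case the caller
 * would ignore the arguments and prompt for both file names. */
int names_from_args(int argc, char *argv[], const char **in, const char **out)
{
    if (argc != 3)
        return 0;
    *in = argv[1];
    *out = argv[2];
    return 1;
}
```

A caller would first try names_from_args, fall back to prompting if it returns 0, and then use file_exists on the output name to decide whether a warning is needed before overwriting.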
The user is then asked to provide a title and an author name, which will appear on the title page of the dictionary. Finally, there is the option of including an introduction. The introduction must be in a text file, and all Hindi text in it has to be marked by {\dn ...}.
If HindiDict has run successfully, it reports that it has generated the output file. If the output file name does not end in .dn, the file has to be renamed for the preprocessor. The preprocessor is then called with the name of the output file as its first argument and any file name ending in .tex as its second. The file that the preprocessor generates can then be run through LaTeX.

2. Why does it exist?

During my Hindi course last year, I wanted to create a dictionary of all the Hindi words I had learned so far. The database my words were stored in allowed sorting by the Roman alphabet, but it could not cope with the transliterated Hindi. Even though there are specialised sorting algorithms and software packages that include Devanagari sorting, I could not find a general sorting algorithm that would work with the Velthuis Transliteration Scheme for Hindi. So my idea was to assign a two-digit number to each character in the syllabary, starting with 01 for the first character and then counting through in dictionary order. The resulting numbers could then be sorted as strings to put a list of Hindi words in order. But even once the sorting of the words in my database was working, by means of many manual search-and-replace steps, generating the dictionary still took a lot of work. With this program, the words wanted in the dictionary can be exported from the database and the dictionary can be created from them straightaway.
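
The two-digit numbering idea can be sketched in C as follows. This is an illustration under stated assumptions, not the code HindiDict actually uses: the table covers only the basic vowels and consonants in their usual Velthuis spelling (no anusvara, visarga, nukta letters or alternative spellings such as A for aa), the names velthuis_order, make_sort_key and compare_velthuis are made up, and the input is split greedily by always taking the longest matching token.

```c
#include <string.h>

/* Velthuis tokens in Devanagari dictionary order. Illustrative and
 * incomplete; the real program may use a different table. */
static const char *const velthuis_order[] = {
    /* vowels */
    "a", "aa", "i", "ii", "u", "uu", ".r", "e", "ai", "o", "au",
    /* consonants */
    "k", "kh", "g", "gh", "\"n",
    "c", "ch", "j", "jh", "~n",
    ".t", ".th", ".d", ".dh", ".n",
    "t", "th", "d", "dh", "n",
    "p", "ph", "b", "bh", "m",
    "y", "r", "l", "v",
    "\"s", ".s", "s", "h"
};
#define NUM_TOKENS (sizeof velthuis_order / sizeof velthuis_order[0])

/* Writes two digits per token of `word` into `key` (01 for the first
 * table entry). Returns 1 on success, 0 if a character is not in the
 * table or the key buffer is too small. */
int make_sort_key(const char *word, char *key, size_t key_size)
{
    size_t used = 0;

    if (key_size == 0)
        return 0;
    while (*word != '\0') {
        size_t best_len = 0, best_idx = 0, i;

        /* Greedy step: find the longest token that starts here. */
        for (i = 0; i < NUM_TOKENS; i++) {
            size_t len = strlen(velthuis_order[i]);
            if (len > best_len && strncmp(word, velthuis_order[i], len) == 0) {
                best_len = len;
                best_idx = i;
            }
        }
        if (best_len == 0 || used + 3 > key_size)
            return 0;
        key[used++] = (char)('0' + (best_idx + 1) / 10);
        key[used++] = (char)('0' + (best_idx + 1) % 10);
        word += best_len;
    }
    key[used] = '\0';
    return 1;
}

/* Compares two Velthuis words by their numeric keys. Falls back to a
 * plain strcmp if either word cannot be converted. */
int compare_velthuis(const char *a, const char *b)
{
    char ka[256], kb[256];

    if (!make_sort_key(a, ka, sizeof ka) || !make_sort_key(b, kb, sizeof kb))
        return strcmp(a, b);
    return strcmp(ka, kb);
}
```

With this table, "ghar" becomes the key 150138 and "kam" becomes 120136, so "kam" sorts before "ghar" as it does in Devanagari order, whereas a plain ASCII comparison would put "ghar" first.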

3. Example

Sample input, introduction and output files are included.

Running the program looks something like this:

heike@clara:~/HindiDict> ./HindiDict input12.txt output.dn
Input file: input12.txt, Output file: output.dn.
output.dn will be created.
Please provide a title for the dictionary:
  --> Sample Hindi Dictionary
Sample Hindi Dictionary will be used as the title.
Please provide an author name for the dictionary:
  --> Heike
Heike will be used as the author name.
Do you want your dictionary to have an introduction? (y/n)
  --> y
Please enter the name of the file that contains the introduction.
  --> intro.txt
The introduction will be taken from intro.txt.
output.dn has been created.
heike@clara:~/HindiDict> ./devnag output.dn output.tex