The Penguin English-Hindi/Hindi-English Thesaurus and Dictionary (In Three Volumes) - A Most Comprehensive Resource
Language historians, population geneticists and archaeologists believe that a band of early humans, perhaps no more than 2,000 strong, acquired the amazing faculty for complex languages and invented linguistic communication. Blessed with the many advantages of meaningful speech, the band could now organize itself better, take on predators and prevail. Its population grew exponentially.

Around 50,000 years ago, the band developed sufficient navigation skills to cross the seas. Its members travelled far and wide; some settled in new colonies, others moved further, thus launching the first globalization and language splits, their descendants becoming many races. Their ancient tongue has long been forgotten, but it left behind more than 5,000 languages.

Thesauruses and dictionaries through the ages
Ancient Sanskrit scholars describe word or language as vyakrita vani or meaningful, analyzed, systematized voice. They called it shabd Brahm, i.e., word, the Brahm. Brahm is personified in the Indian tradition as Brahm, the Creator. His consort, the goddess Sarasvati, is known as gira (voice) or Vagdevi (the goddess of voice). Ancient Greeks called word logos, giving it the status of God. In Christian theology, word is the Ultimate Reality, especially as manifest in the creative and sustaining spirit of God as revealed in Jesus.

Words as specific sound patterns represent things and communicate commands, instructions, ideas and thoughts. They are oral icons, symbols, representations. As societies identified and invented more things, they coined more words for them based on perceived associations, similarities and dissimilarities.

Language development is an ongoing process. It has been pivotal in enriching our mental capabilities, generating new ideas, codifying complex knowledge bases, and inventing and keeping track of philosophical thought, social codes, useful techniques and scientific systems, thus contributing to present-day systemic societal organization.

Before the emergence of early scripts, man had begun to make tools to record words and standardize language by defining rules. The first lexical works were simple word lists, the precursors of the modern, vast and intricate thesauruses and dictionaries. Examples are a short seventh-century BC Akkadian word list, from central Mesopotamia, and the early-third-century BC Erya, the first Chinese language dictionary, which organized Chinese characters by semantic groups.

In India, the tradition of glossaries, thesauruses and dictionaries goes back to the Vedic age, between 3000 and 1500 BC. The world’s first-known and extant thesaurus is Nighantu. Its compiler, Kashyap, was bestowed with the lofty title of Prajapati, the progenitor. Nirukt, the sage Yask’s treatise on Nighantu, may have been the world’s first dictionary-encyclopaedia; it gives words and their meanings, which are elaborated upon in great detail.

There were several subsequent compilations of Sanskrit dictionaries. The Shabdakalpadrum, a Sanskrit dictionary of an unknown date, lists twenty-nine such works, most of which were arranged subject-wise and were, in a broad sense, thesauruses.

Amar Kosh is the bible of all the Sanskrit thesauruses. Its author, Amar Singh (Amar Simha in Roman Devanagari), gave his work the title of Namalinganushasan (the Discipline of Names and Genders). It was also called Trikaand, because it was divided into three hierarchical cantos with twenty-five chapters having a total of 8,000 words in 1,502 shlokas or verses. It is popularly known as Amar Kosh to acknowledge the achievement of its author, just as the English thesaurus, in all its editions and variations, is better known as Roget’s Thesaurus.

When the Amar Kosh first made its appearance is not known, but it may have been written between the fourth and the tenth centuries AD. Ancient Indians rarely kept records of dates! Like the later Roget’s Thesaurus, Amar Kosh was an instant success. Its fame spread beyond the Himalayas and it became the subject of numerous treatises. It is said that one Pandit Gunaraj translated it into Chinese in the sixth century. The Hindi-Persian poet Ameer Khusro’s Khalikbari (twelfth-thirteenth century AD) was directly inspired by it. His Persian-Hindi thesaurus-cum-dictionary can be counted among the early bilingual thesauruses of the world.

Most Sanskrit and Indo-Persian dictionaries till the nineteenth century were arranged in a rhyming order. In non-script and pre-printing societies, versification was the accepted way of writing important books, on the premise that it is easier to remember a verse than a prose paragraph. This also explains the proliferation of synonyms in these languages; it helps to have parallel words at hand to balance a metric line.

The advent of modern lexicography goes back to early-seventeenth-century England. The first English dictionary is believed to be Robert Cawdrey’s Table Alphabeticall of 1604. It included 3,000 words and contained little more than synonyms. The first comprehensive dictionary was Thomas Blount’s Glossographia in 1656. But the first true modern English dictionary was Samuel Johnson’s Dictionary of the English Language (1755).

In 1806, Noah Webster published A Compendious Dictionary of the English Language, the first American dictionary. Immediately thereafter, he went to work on his magnum opus, An American Dictionary of the English Language, for which he learned twenty-six languages, including Anglo-Saxon and Sanskrit, in order to research the origins of his mother tongue. This book, published in 1828 with 70,000 entries, set a new standard in lexicography. Many felt that it surpassed Samuel Johnson’s 1755 British masterpiece, not only in scope but in authority as well.

The largest dictionary of the world is het Woordenboek der Nederlandsche Taal (WNT) (the Dictionary of the Dutch language). It took 134 years to create (1864-1998) and has approximately 4,30,000 entries on 45,805 pages in 92,000 columns.

A big landmark in modern lexicography was the publishing of Dr Peter Mark Roget’s thesaurus in 1852. This edition had 1,500 words arranged in a systematic, subject-wise manner. Roget’s work gave the writer his first tool to select the right word for a concept. Since then, its newer editions have had many words added to it, culminating in the vast international editions of today.

Contact with the West and the establishment of British rule in the eighteenth and nineteenth centuries gave the new rulers an impetus to understand their subjects better, and the discipline of Indology came into being. Simultaneously, great efforts were afoot to propagate Christianity. To make vernacular translations of the Bible, Christian missionaries took to learning Indian languages and made grammars to fulfil their needs. Scholars made bilingual dictionaries; among them is the famous and still unrivalled Sanskrit-English Dictionary (1857) by Sir Monier Monier-Williams.

Even before Independence, many individuals and organizations in India were making Hindi, English-Hindi and Hindi-English dictionaries. The vast Hindi dictionaries of Nagari Pracharini Sabha (Varanasi) and Hindi Sahitya Sammelan (Allahabad) are examples of remarkable collective work and of modern India’s attempts in lexicography. India’s independence from British rule in 1947 greatly accelerated the process; the nascent nation had to come to terms with a new world. This gave a new urgency to dictionary making.

Under British rule, many Indians opposed the usage of English, which they viewed as an imperial imposition on the country. After Independence, however, English was increasingly perceived as an important portal for India to the world. This explains the emphasis on the creation of English-Hindi and Hindi-English dictionaries. Some bilingual dictionaries between Hindi and languages like Russian and German were also made. The Government of India set up commissions to coin technical terms so that Hindi could replace English as the medium of education, governance and technological development.

We decide to make a thesaurus
Arvind first came to know of and use Roget’s work in 1952 and wished Hindi had such a wonderful tool. He hoped that in the new spirit of dictionary making in India, a Hindi thesaurus would soon be made too. Two decades later, Arvind was in Bombay (now Mumbai), editing a Hindi fortnightly magazine, Madhuri, for the Times of India group. There was still no Hindi thesaurus on the horizon. On the evening of Christmas Day 1973, it occurred to him that he would have to make it. The next morning, we discussed the idea during our walk and decided to go ahead with the work.

We were well aware that the colossal job would require our full-time dedication. Arvind would have to leave his lucrative job and, in the absence of any financial support, we would have to live simply off our savings.

We spent some months collecting reference material. On 19 April 1976, we started work on a part-time basis, in our off hours. Arvind would write words on specially designed cards and Kusum would later create indexes for them on a set of smaller cards. In 1978, Arvind left Bombay and we moved to Delhi. The final plunge into the ocean of Hindi vocabulary had been taken.

Arvind had imagined that we would be able to complete the work in two years (it eventually took twenty!). He had reasoned that we could follow the pattern of Roget’s Thesaurus. We assigned numbers to all the concepts and put the numbered cards in the Rogetian sequence. All that remained to be done was to fill the cards with appropriate Hindi words. Alas, it was not that simple.

To check the model, Arvind went through the first few pages. Some concepts were missing in Roget’s, and there was no way to add more categories between the already assigned sequential numbers.

Roget’s work is based on the so-called scientific classification. Language, however, is anything but scientific. While the study of words is a science, people coin words in various unscientific ways, mostly associative, but sometimes just whimsical. Associations vary from people to people and time to time and have societal contexts. The scientific system is also handicapped by difficulties that the layman may have in making a straight association of concepts. For example, in modern Rogetian editions, wheat is listed with grasses. Among its associations are bamboo, banana. No relationship has been pointed out with cereal or food. Another example is that of steel. The user thinks of steel in the context of iron. But in Roget, it is counted among alloys with no reference to iron.

When Roget’s system failed us, we considered emulating Amar Singh. However, he was out of sync with new realities. Wars or arms no longer conjure up images of warriors from the Kshatriya caste. Nor would one associate lion with a kshatriya or cow with a vaishya. The shudras are no longer menials or servants. In Amar Singh’s time, music was a heavenly activity, but a musician a menial. Thus, he put music in the first canto Heavens, and musician under Shudras in the second; this would not work in the contemporary context.

It was now plain to us that we had no model; that we were on our own. There were no pointers to what order, sequence, pattern or structure we would give to our word groups. We decided to evolve our own system as we progressed. There were at least five false starts. It was fourteen years before we came upon a viable structure.

The job of adding words was divided between the two of us. Arvind took care of categories like activities, ideas, abstract nouns, verbs, adjectives, adverbs, idioms and exclamations. Kusum was assigned words relating to things, animals, trees, herbs and mythological names. She had to face unforeseen difficulties. Hindi has many words for a single tree or animal, and a word may stand for many trees or animals. The problem was how to find a way to distinguish and insert a word in the right place. Fortunately for her, Sir Monier-Williams’ excellent Sanskrit-English dictionary gives the New Latin technical names of such things. Kusum started making an index of New Latin technical terms, to check and re-check if her entries were right.

Computer and the Shabda Lexicographer
By 1990-91, we had a roomful of 60,000 hand-written cards with over 2,50,000 words. The cards were arranged subject-wise in specially designed wooden trays in which we were able to stack two or three rows of about 150 cards. The trays and rows were arranged in conceptual groups and subgroups. To change the sequence, we would inter-shift trays, or subgroups within a tray. Handling the cards was a laborious task, complicated by overlapping categories and repetition of words.

We also had to think of the means to resolve the logistics of handling the data while publishing. The numerous cards would first have to go to typists who, we feared, would mix up their sequence or lose some cards. There could be typographical errors, and the printing press would add its own. Even with proof-reading, it seemed unlikely that we would have an error-free work.

The formidable task of creating indexes also stared us in the face; once the thesaurus part of the book was typeset, a veritable army would be required to index it and, worse, indexers might supply their own share of unforgivable blunders. Without an index, a thematic thesaurus would have no meaning. Even fifteen years after starting it, the work was nowhere near completion.

At this time, our son, Dr Sumeet Kumar, a double gold-medallist MBBS, MS, from the Seth G.S. Medical College, Mumbai, was working as a resident surgeon at Dr Ram Manohar Lohia Hospital, New Delhi. There, viewing the first personal computers that were beginning to be used in India and at the hospital, he saw their great potential.

He suggested that we computerize our data. We initially turned down the idea, then submitted. However, having over the years supported our work from our savings, we had no money for a computer or programmers. Sumeet took up an assignment as surgeon for the National Iranian Oil Company for one and a half years, with the explicit goal of returning to India as soon as he had saved enough money to computerize our work. He was back in Delhi in 1992. After some research, we purchased our first i386 computer in May 1993.

In Iran, Sumeet also educated himself about computers and computer applications. He had determined that our work required a database programme, not just a word processor.

The importance of a database for a thesaurus or dictionary cannot be overstated. It facilitates the handling and management of data in various ways. One can add as many new categories or concepts as one likes, include extra columns, rows and fields, enter any number of synonyms, and shift groups to change or modify the sequence. Once the data is in place, duplications show up and can be removed; records or expressions can be examined, edited, changed. And, more importantly, indexing is automatic.
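
The two benefits mentioned last, spotting duplicates and automatic indexing, can be sketched in a few lines of Python. This is only an illustration and not the programme described in this essay: the record layout, the category numbers and the sample words are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical records: (category number, word). Neither the numbers nor
# the words come from the actual thesaurus data.
records = [
    (10, "sun"),
    (10, "day-star"),
    (10, "sun"),       # accidental repeat within the same category
    (25, "star"),
    (25, "sun"),       # legitimately listed under a second category
]

def remove_duplicates(rows):
    """Drop exact repeats while keeping the first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique

def build_index(rows):
    """Map each word, in alphabetical order, to the categories it appears in."""
    index = defaultdict(set)
    for category, word in rows:
        index[word].add(category)
    return {word: sorted(cats) for word, cats in sorted(index.items())}

clean = remove_duplicates(records)
index = build_index(clean)
for word, cats in index.items():
    print(word, cats)
```

Because the index is derived from the records, it can simply be rebuilt whenever categories are added or reordered, which is what makes indexing "automatic" in contrast to a hand-made card index.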

To be of any use, databases need complex programming. We soon learned that there were no programmers available for making a thesaurus. We would need to get our own software package developed and customized. But computer programmers do not come cheap. Further, we discovered, no one from the several software companies we approached had the experience to meet our specific requirements. The task of developing a custom-built solution would take time and cost an astronomical amount.

Sumeet found he had a natural and hitherto undiscovered talent for programming and took on the most daunting task. He selected FOXPRO 2.0 as the most appropriate platform for our database. Over the next six months, he wrote the initial application for converting our manually written cards. He kept upgrading the programme, adding new modules to satisfy our ever-increasing demands, enabling us to view and examine the growing data, edit it, and reorganize it. His programme allowed us to earmark individual records for selection to feature in various types of mono-, bi- and multilingual thesauruses and dictionaries. He has now evolved a foolproof, almost automatic system of converting DOS data into fully formatted Adobe PageMaker and Microsoft Word documents with multilingual indexes, ready for taking camera-ready printouts.

Our labour of love first bore fruit after twenty years in the shape of Samantar Kosh Hindi Thesaurus, the first ever in Hindi. It contains 1,60,850 expressions grouped in 1,100 categories and 23,759 sub-categories. National Book Trust, India, published it in 1996 as part of the golden jubilee celebrations of Independence. We were thrilled to present its first copy on 13 December 1996 to the then President of India, Dr Shankar Dayal Sharma.

We often wonder what would have happened if we had not taken the computer route. We may still have been writing cards!

Cross-cultural linguistic tools: Need of the day
We are in the throes of yet another wave of linguistic globalization, first reflected in the exhaustive international editions of Roget’s thesauruses, designed for English-speaking nations. The new world scenario calls for lexicographical works which can meet global cross-cultural needs. Indians are contributing to the hectic global scientific, economic and cultural activity; opinion leaders, reporters, newscasters, scientists, teachers, students, and migrants have to deal with proliferating concepts and milieus. Bilingual thesauruses can suffice, to begin with, before multilingual ones materialize.

Makers of bilingual dictionaries would welcome one-to-one correspondences for words in any two languages. However, as linguists know, it is uncommon to find two words in two languages which have the same meanings, weights, backgrounds and associations. To give a simple example, the English word success has two Hindi equivalents, saphalata and kamyabi. All three words have different cultural and semantic backgrounds and contexts. The word success represents a sense of reaching somewhere. Saphalata is a word emanating from an agricultural background; it literally means fruitfulness or having come to fruition. Kamyabi has an Indo-Persian origin and denotes achievement. Another equivalent can be kritkaryata (success in one’s endeavour), a term now used for thankfulness. Success leads to succession, but neither saphalata nor kamyabi can lead one to uttaradhikar.

One is also at a loss to find the English equivalent for the commonly used Hindi word, shobha. Hindi-English and Sanskrit-English dictionaries offer a number of English words as its rough equivalents: splendour, brilliance, lustre, beauty, grace, loveliness, elegance, show… None of these is satisfactory. Shobha embodies only a fraction of these put together and a lot more.

A bilingual English-Hindi/Hindi-English thesaurus was the obvious way around this predicament. For a concept in either language, it would offer a host of options to choose from, far exceeding the potential of a simple dictionary.

India has a very high density of English-knowing and English-speaking people; many Indians have been educated through English rather than Hindi. There are also many first- and second-generation non-resident Indians, especially in the USA and UK, non-Indian researchers and scholars of Hindi, others who wish to enrich their Hindi and English vocabularies, and some who simply wish to look up a correct Hindi word for an English one. There are also several people who translate into and from Hindi and need parallel Hindi/English words. In addition, there are non-English non-Indians who learn English to learn Hindi, as a bridge between their mother tongue and Hindi. South Asians who share cultural traits with us can also be included in the list of people for whom such a work would be useful. Also, for the many non-Indians who would like to understand South Asian terms in the context of their own sensibilities, such a work would be needed.

With these factors in mind, we started work on an English-Hindi word bank in 1997.

The first step was to add, in the FOXPRO table, columns to accommodate corresponding English headings, subheadings and synonyms. The next was to find equivalent English words for the entries in the Samantar Kosh. To help us, our daughter Meeta Lall, gold-medallist MSc in Nutrition from Delhi’s Lady Irwin College, willingly took up jotting down the English equivalents in a copy of the Samantar Kosh. (She later edited our data on food, nutrition, and health.)
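
As a rough picture of what adding English columns beside the Hindi ones makes possible, here is a minimal Python sketch. It is not the FOXPRO table itself: the field names are hypothetical, and the single sample record is assembled for illustration from words discussed earlier in this essay.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """One thesaurus record with parallel Hindi and English columns.

    All field names here are assumptions made for the example.
    """
    heading_hi: str
    heading_en: str
    synonyms_hi: list = field(default_factory=list)
    synonyms_en: list = field(default_factory=list)

# Illustrative sample data only.
concepts = [
    Concept(
        heading_hi="saphalata",
        heading_en="success",
        synonyms_hi=["saphalata", "kamyabi", "kritkaryata"],
        synonyms_en=["success", "achievement"],
    ),
]

def lookup(word, rows):
    """Return (Hindi options, English options) for each concept listing the word."""
    hits = []
    for concept in rows:
        if word in concept.synonyms_hi or word in concept.synonyms_en:
            hits.append((concept.synonyms_hi, concept.synonyms_en))
    return hits

# A word from either language leads to the same concept and its options in both.
print(lookup("kamyabi", concepts))
print(lookup("success", concepts))
```

Keeping both languages on one record, rather than in two separate word lists, is what lets a user start from either language and see a range of options instead of a single supposed equivalent.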

From here on, it was Arvind’s task to find and add more English words for all the subheads. Kusum would sometimes be pressed in to look up Hindi-English, English-Hindi, and standalone English dictionaries to check and cross-check meanings.

Once the Hindi to English part was done, we knew a large number of non-Hindi concepts must have been left out since our data was basically Hindi and Indian. To ensure a true bilingual character with cross-cultural references, we now engaged in entering words from the English vocabulary, going from A to Z. Unique English expressions had to be inserted and linked to Indian culture at appropriate places, and Hindi equivalents added for them. This process of cross-fertilization has helped us change, enrich and improve the Hindi data too. Many new categories have been added, and many more expressions included.

Now we can rightfully claim to have rich cross-cultural bilingual data of English and Hindi expressions, linking Indian and principal world cultures. We can also claim to have developed a unique easy-to-use database system, adaptable to the growing requirements of a lexicographic group.

The initial programming and first entries to the data on the English-Hindi/Hindi-English thesaurus and dictionary were made in Kuala Lumpur (Malaysia), where Sumeet was getting his hospital management system installed. Since then the work has moved from country to country and within India from town to town. For two years, we worked on it in Dallas (Texas) and Tulsa (Oklahoma). In India, we worked on it in Ghaziabad and Chennai. The last four years saw us work in Pondicherry (renamed Puducherry) and Auroville, founded in 1968 as an international township that aspires to realize human unity. As a consequence of the growing worldwide influence of Sri Aurobindo and the Mother, Auroville has residents from over forty countries, engaged in cross-cultural exchange, social experimentation and innovation.

The Penguin English-Hindi/Hindi-English Thesaurus and Dictionary is in three parts. The first part is The English-Hindi/Hindi-English Thesaurus, the second is The English-Hindi Dictionary and Index and the third is The Hindi-English Dictionary and Index.

We are happy it is being published in the diamond jubilee year of Independence.

We would like to thank…
This work may not have seen the light of day if it were not for the large number of well-wishers who encouraged and applauded us. They are too numerous for us to list individually but we express our heartfelt gratitude to them.

Our special thanks go to Meeta for her initial and valuable input and to her husband Atul who gave her and us moral support. As for Sumeet, we do not know how to thank him!

We also thank Udayan Mitra of Penguin India for taking personal interest in its publication, and Neeta Gupta and Meeta for coordinating between Penguin India and us.

