The Role Of Knowledge In Natural Language Understanding

“The essence of language as a human activity lies not in its ability to reflect the world, but in its characteristic of creating commitment…” (Winograd and Flores, 1986)

INTRODUCTION

It appears that most people take the idea of understanding natural language for granted. Yet language is often filled with ambiguities, double meanings, context-sensitive observations and hidden implications. If humans already find it difficult not to misinterpret another speaker’s utterance, it is all the more a Herculean task for a computer. In fact, in her book Artificial Intelligence and Natural Man, Margaret Boden is somewhat scathing about people’s naivety in accepting the idea that a computer can understand human language, with all its subtle variations and in-built knowledge.

AIMS

Of particular interest to this paper is a discussion of ELIZA, which will act as the starting point for an evaluation of the nature of the knowledge required for language understanding.

ELIZA UNVEILED

ELIZA was one of the pioneering computer programs that made natural language conversation with a computer possible. It was developed by Joseph Weizenbaum in 1966 to model the language behaviour of a Rogerian psychotherapist talking to a patient. In fact, ELIZA became a hot topic in the psychiatric community as a possible tool for therapy. On first impressions, ELIZA comes across as intelligent because of its ability to mimic human conversation. Furthermore, the breadth of ELIZA’s in-built “knowledge” is admirable, as it ranges from family and dieting to drugs and religion. Users became emotionally involved with ELIZA and some believed that it demonstrated a general solution to the problem of the understanding of natural language by computers.

However, to claim that ELIZA is intelligent is to trivialise human interaction and worth. The success of the simulated intelligence depends heavily on the user having a fairly restricted expectation of the system’s response. All ELIZA does is pattern-match incoming typed keywords against stored prompts in its database. It can only manipulate syntax (grammar) and check for some key words. For instance, the response to the statement ‘men are all alike’ was ‘in what way?’. Its replies are only as good as the programmer’s ingenuity in devising standardised replies. No original thinking is required at all. To even suggest that ELIZA had passed the famed Turing test is a mockery in itself. When the user is deliberately uncooperative and poses questions and responses that are not part of ELIZA’s “knowledge”, ELIZA’s façade begins to crumble. When it encounters unknown words, it generates generalised and sometimes unrelated responses after checking earlier stored inputs.
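The keyword matching described above can be illustrated with a minimal Python sketch. It is not Weizenbaum’s original script: the rule table, the fallback reply and the function name respond are assumptions introduced here for illustration, and only the ‘alike’ rule reflects the exchange quoted above.

```python
import re

# Hypothetical rule table: each keyword pattern maps to one canned reply.
# Only the first rule mirrors the example quoted in the text.
RULES = [
    (re.compile(r"\balike\b", re.IGNORECASE), "In what way?"),
    (re.compile(r"\b(mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your family."),
]

# Generic reply used when no keyword is recognised (also an assumption).
FALLBACK = "Please go on."


def respond(utterance: str) -> str:
    """Return the canned reply of the first rule whose keyword occurs."""
    for pattern, reply in RULES:
        if pattern.search(utterance):
            return reply
    return FALLBACK
```

Calling respond("Men are all alike") returns "In what way?", while an input containing no listed keyword simply falls through to the generic reply. Nothing in the sketch represents what the user meant, which is the point made above.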

THE NATURE OF “KNOWLEDGE” REQUIRED FOR LANGUAGE UNDERSTANDING

In general, natural language understanding programmes require three basic components, namely knowledge about the structure of the language (linguistic knowledge), knowledge about the structure of the world (world knowledge) and knowledge about the reasoning capability of human beings.

Knowledge about the structure of the language (linguistic knowledge)

As discussed above, even though ELIZA is capable of manipulating syntax to a certain degree by substituting ‘I am…’ with ‘…you are…’, it still lacks phonological, morphological, semantic and pragmatic knowledge. Humans exercise many different types of understanding naturally when they speak and write sentences.
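The ‘I am…’ to ‘…you are…’ substitution mentioned above amounts to a word-for-word lookup. A minimal Python sketch follows; the REFLECTIONS table and the function name reflect are assumptions made for illustration and are not taken from the ELIZA script itself.

```python
import re

# Hypothetical first-person to second-person table (illustrative only).
REFLECTIONS = {"i": "you", "am": "are", "my": "your", "me": "you"}


def reflect(utterance: str) -> str:
    """Swap first-person words for second-person ones, word by word."""
    words = re.findall(r"[a-z']+", utterance.lower())
    return " ".join(REFLECTIONS.get(word, word) for word in words)
```

For example, reflect("I am unhappy with my job") yields "you are unhappy with your job". The substitution is purely mechanical: the same table is applied regardless of what the sentence means.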

Firstly, phonological distinctions such as stress and intonation would help programmes like ELIZA respond to the stressed portion of a sentence. Hence, there should be phonological elements that the programme can detect. These can also be used to identify the emotional state of the speaker. Humans can detect differences in the way sentences and words are said. At the end of a question we often raise our voice. We stress important words and sarcastically dismiss others.

Secondly, there is morphological understanding. This is where we recognise smaller parts of words and realise their meaning. An example of this would be happy and happiness. We know what happy means, and we also know how the meaning changes when we add “ness”. The “ness” makes happy a noun instead of an adjective.
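The happy and happiness example can be sketched as a toy suffix rule in Python. This is only an illustration under stated assumptions: the function name split_ness and the spelling rule (a final y becomes i before “ness”) are introduced here, and a real morphological analyser needs far more than string matching.

```python
def split_ness(word: str):
    """Split a noun ending in -ness into (adjective stem, suffix).

    Returns (word, None) when the rule does not apply.
    """
    if word.endswith("iness") and len(word) > 5:
        # happiness -> happ + i + ness: restore the final y of the stem.
        return word[:-5] + "y", "ness"
    if word.endswith("ness") and len(word) > 4:
        # kindness -> kind + ness
        return word[:-4], "ness"
    return word, None
```

The rule recovers ("happy", "ness") from happiness, but it also splits witness into ("wit", "ness"), which shows why surface rules of this kind fall short of the morphological knowledge a human speaker has.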

Thirdly, ELIZA-like systems have no semantic representation of the content of either the user’s input or the reply. Nothing can be more difficult for a computer than to ‘understand’ meaning. At the sentence level, semantics can be hidden or implied. When the entire context is taken into consideration, however, the meaning of an utterance becomes less problematic to decipher. Yet ELIZA lacks the complex ability to function at the semantic level. Thus, implied meanings will be lost, ellipsis will be met with ELIZA’s request that the user be more ‘generous with words’ and the true essence of natural language understanding will be completely omitted.

Next, the quality of ELIZA’s responses is limited by the sophistication of the ways in which it can process the input text at a syntactic level. For example, the number of templates available is a serious limitation. Humans, by contrast, understand syntactically. This means we know how to form words into correct structures and phrases using the grammar of our language. We also know how to understand sentences that are not correct in this respect. “Go I must, late it is” is not a correct sentence syntactically, but its meaning can easily be interpreted by an English speaker.

In addition, we are also imbued with pragmatic understanding. This is where the same sentence could be interpreted differently in different situations. The simple statement “I’m cold” might mean that I need to put on a jacket if I were outside. If I were inside it might mean that I want someone to turn up the heat. It could also mean (if it were summer) that I’m now comfortable being cold, because previously I was not.

The understanding of discourse is another skill we have. This is the ability to get a meaning from a whole set of sentences that are related. We will not always extract the same meaning from a paragraph of text if we don’t read the sentences in the correct order. This includes a temporal aspect of understanding.

More importantly, the linguistic knowledge component of natural language understanding needs to incorporate all these understandings with the language’s specific grammar in order to be able to interpret sentences correctly. After all, the different components under linguistics are all related in one way or another. Currently, ELIZA’s responses impose no structure on the conversation. Each response is based entirely on the current input. Any sense of intelligence depends strongly on the coherence of the conversation as judged by the user.

Knowledge about the structure of the world (world knowledge)

Even though artificial intelligence programmes like ELIZA may impress us with their breadth of “knowledge”, an important distinction needs to be made between knowledge and intelligence. A machine like ELIZA may store knowledge, but it need not necessarily possess intelligence.

Besides being indispensable, knowledge is also voluminous, hard to characterise accurately and constantly changing. After all, no one can deny that knowledge about objects, events, procedures and experiences varies from one person to another. ELIZA shows how easy it is to create and maintain the illusion of understanding, and hence perhaps the illusion of credibility. A certain danger lurks here. Although the breadth of “knowledge” encoded in ELIZA’s database is impressive, it is seriously lacking in depth. In other words, ELIZA tries to be all-encompassing in the fields it has been trained for, but this is one of its biggest weaknesses. A user who decides to engage ELIZA in a conversation at a deeper level will be frustrated by ELIZA’s incoherent responses or eccentric nonsense. Similarly, a user who converses with ELIZA about a very specialised domain, for example the use of interrogatives in language, discovers after two responses that it is not capable of sustaining such talk. Thus, it might be wiser to constrain the scope of ELIZA and to define a specific domain or field in which it can be thoroughly trained (as is the objective of the modified ELIZA programme).

Knowledge about the reasoning capability of human beings

Another way of building a knowledge structure into a computer is to give the computer a mechanism for understanding why people do the things they do. Artificial intelligence computer programmes need to be trained to understand how humans formulate goals, the kinds of goals people in an event might want to achieve and the plans they make to achieve these goals. At present, ELIZA only assumes the pose of understanding the user. In actual fact, it is the user who attributes to his conversational partner all sorts of background knowledge, insights and reasoning ability. These manifest themselves inferentially in the interpretations he makes of the offered responses.

The crucial test of understanding is that it is not sufficient for a computer programme merely to continue a conversation robustly; it must also be able to draw valid conclusions from what it is being told. Language understanding is not only a fixed relationship between a representation and the things represented, but also a commitment to carry out a dialogue within the full horizons of both speaker and hearer in a way that permits new distinctions to emerge. In Weizenbaum’s words, ‘ELIZA in its use so far has had as one of its principal objectives the concealment of its lack of understanding. But to encourage its conversational partner to offer inputs from which it can select remedial information, it must reveal its misunderstanding. The switch of objectives from the concealment to the revelation of misunderstanding is seen as a precondition to making an ELIZA-like programme the basis for an effective natural language man-machine communication system’ (Weizenbaum, 1966).

CONCLUSION

In conclusion, computer programmes like ELIZA can more realistically be compared to a useful replica workbook than to human beings. Nevertheless, ELIZA in its most elementary form is still a valuable tool as it enables us to understand the mechanisms of how humans function via the integration of linguistic knowledge, world knowledge and our reasoning capability.

In the course of modifying ELIZA for this paper, one realises that it is possible to believe that natural language understanding might be feasible in the next ten years, for the following reasons. Firstly, although ELIZA-like systems have often been seen in the coldest possible light as mere translating processors, we cannot deny that they apply rules, especially the grammar rules that have been religiously built in. The only thing stopping this from being useful is the computer’s inability to judge context. With the availability of large online corpora, enough data can be collected so that context can be applied to parsing language, and computers will be able to understand the intention behind the user’s statements.

Secondly, we have already modelled our programming languages after our own languages, which paves the way for a more general object-oriented approach to designing computer software. This means we are moving closer to a common language that both humans and computers can understand. This may be the vital key to the success of natural language understanding. After all, we cannot expect computers to understand our language unless we understand it fully ourselves.

Bibliography

Boden, Margaret A. Artificial Intelligence and Natural Man. New York: Basic Books, 1987.

Cawkell, Tony. ‘When will Computers Think?’ in Online and CD Notes, Vol. 13, No. 6, July/August 2000. http://www.aslib.co.uk/notes/volume/number/articles/index.html

Greene, Judith. Language Understanding: A Cognitive Approach. Philadelphia: Open University Press, 1985.

Probert, Matthew. ‘Conversations with my Computer: A Handbook of Natural Language Processing and Conversational Computing’. http://www.probert-encyclopaedia.co.uk/Servile/NLP.HTM

Weizenbaum, Joseph. ‘ELIZA – A Computer Program for the Study of Natural Language Communication between Man and Machine’ in Communications of the ACM, Vol. 9, No. 1. New York: ACM Publishing, 1966.

Indian English And Hinglish: The Case Of Diglossia In India

INTRODUCTION

Kachru’s discussion of the inner, outer and expanding circles has been the accepted bedrock in the classification of English as a world language. But of all the major varieties of English, much of the focus has always been on describing British English and American English (varieties from the inner circle) in extensive detail. Other varieties of English, especially those found in the outer circle, have been denigrated and marginalised.

However, in recent years there has been a clear reversal of this trend, as increased efforts have been made to give a systematic and comprehensive description of the new varieties of English, which are all valuable in their own right. Among the non-native varieties of English that have been tacitly recognised, the Indian variety of English is of specific interest to this paper. The chief reason for choosing India is that, according to The Oxford Companion to the English Language, “an estimated 30m people (4% of the population) regularly use English, making India the third largest English-speaking country in the world. In addition, beyond this number is a further, unquantifiably large range of people with greater or less knowledge of the language and competence in its use” (McArthur, 1998).

AIMS

In examining this particular variety, this paper has four main aims. Firstly, it seeks to examine the sociolinguistic situation in India. This is of utmost importance as it equips one with the knowledge necessary for the second aim, which is to test whether Ooi’s Concentric Circles Model (cf. Ooi, 1998; Ooi, 2000) can be applied in the Indian context. In the event that this model holds true, the third and fourth aims will be to determine whether words used in the Indian variety of English can be graded accordingly and whether a diglossic situation exists in India.

METHODOLOGY

There are three main sources of linguistic evidence employed in this paper, namely corpus, citational and introspective evidence. These sources complement one another to give a better understanding and a more accurate description of the Indian variety of English.

The corpus evidence is gathered from The Times of India, which is available online. The Times of India was chosen because it is the largest and most-read newspaper in India. A total of 119 articles were collected over a fifteen-day period (from 14 Aug to 28 Aug 2000). News articles from the political, infotech, health/science, sports and entertainment arenas, along with letters written by the public and interviews pertaining to music, cinema, politicians and new technologies, were gleaned, with equal weight given to each category. This was done to prevent the findings from being skewed and to ensure that the corpus was maximally representative of the linguistic situation in India. The fifteen days of material were then subjected to a WordSmith Tools analysis.
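The concordance listings produced by such an analysis follow the keyword-in-context (KWIC) layout, which can be sketched in a few lines of Python. This is an illustrative sketch rather than the behaviour of WordSmith Tools: the function name kwic, the width parameter and the bracketed node format are assumptions, and the sample sentence below is invented for demonstration and is not drawn from the corpus.

```python
import re


def kwic(text: str, node: str, width: int = 30):
    """Return one keyword-in-context line per occurrence of `node`."""
    lines = []
    pattern = re.compile(r"\b%s\b" % re.escape(node), re.IGNORECASE)
    for match in pattern.finditer(text):
        # Up to `width` characters of co-text on either side of the node.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left co-text so the node words line up.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Invented sample text, for demonstration only.
sample = "She began her second innings in Bollywood. His innings was short."
for line in kwic(sample, "innings"):
    print(line)
```

Aligning every occurrence of the node word in one column is what makes recurring left-hand and right-hand co-text easy to spot by eye.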

Secondly, the main citational sources were taken from published materials such as books, magazines, online articles and dictionaries, including ‘Indian and British English: A Handbook of Usage and Pronunciation’, ‘The Indianization of English: The English Language in India’, ‘Stardust Magazine’ and seven online articles on Indian English written by academics.

In addition, an introspective source that would provide a first-hand emic perspective on the linguistic situation in India was valued. The insider’s perspective came from two informants, who are from North India and South India respectively. They are Dr. Gyanesh Kudaisya and Dr. SP Thinnappan, both lecturers from the Department of South Asian Studies, Faculty of Arts and Social Sciences, National University of Singapore. Both lecturers were also well placed to grade the lexical items according to their meanings and contexts of usage.

THE SOCIOLINGUISTIC SITUATION IN INDIA

The term Indian English is widely used but is a subject of controversy. Some scholars argue that it labels an established variety with an incipient or actual standard, while others feel that the kinds of English used in India are too varied, both socially and geographically, and often too deviant or too limited, to be lumped together as one variety. The latter also argue that no detailed description has been made of the supposed variety and that the term is therefore misleading and ought not to be used.

However, the length of time that English has been in India, its importance and its range, rather than militating against such a term, make the term essential for an adequate discussion of the place of the language in Indian life, culture and its sociolinguistic context. Besides, Indian English is used widely as the language of formality and legislation. It has long since established itself as a respected equal to the sub-continent’s indigenous languages. Furthermore, it functions as a vital lingua franca, acting as the social glue binding together the various regions of this vast country. This strongly refutes the view of Macaulay, who held that ‘a single shelf of a good European library was worth the whole native literature of India and Arabia’ and poured scorn on every aspect of Indian culture (Phillipson, 1992).

Interestingly, Indian English has improvised and innovated so much that it is referred to as Hinglish – a kind of pidgin English that draws from Hindi and other local languages. According to Saraf, ‘Hinglish shot into prominence thanks to the burgeoning post-reforms mass media in India. This is not without any reason since Hinglish has been identified as the language of the youth of a ‘liberal’ India Inc.’ (Saraf). The Indian community has shown a special penchant for incorporating words from the regional languages into its speech and subsequently its writing, thus ensuring their place in the English language. In particular, the rapid growth of the mass media industry, especially satellite TV (best exemplified by ZEE TV and NEWS), and Hinglish pop have added a new dimension to Hinglish.

The increasing pervasiveness of Hinglish is not just a consequence of multi-cultural and multi-lingual contexts functioning in an era of political correctness. It has been helped along by certain precise and concrete political events and economic forces in India in the mid-eighties. These conditions intervene actively in the evolution of the language at various levels, particularly the semiotic level, in the creation of new role models, myths and symbols which are seen as fresh perceptions of the Indian image.

While some argue that Indian English and Hinglish are interchangeable, this paper takes a different view. In this paper, Indian English will refer specifically to the variety that is used in formal contexts such as education, politics and law. In contrast, Hinglish will refer to the variety used in informal contexts such as conversations with friends and family.

THE CONCENTRIC CIRCLES MODEL

Ooi argues that ‘one can think of 5 main groups typifying the range of language use in a nativised language situation’ (Ooi, 1998). This section will evaluate this claim by applying the model to English usage in India, where only words in Groups B, C, D and E are of particular interest.

Group B: Words of English Origin used in Formal Situations

innings

Figure 1: Concordance listing for innings (from The Times of India online)

In Figure 1 above, we see instances where the word innings is not used in the prototypical sense, which the Collins Cobuild defines as ‘a period in a game of cricket during which a particular player or team is batting.’ Instead, in the Indian context, although the form of the word remains unchanged, innings takes on a new meaning in lines 1 and 2, where it characterises an attempt (by an entertainer) at success in the entertainment field. A visual scan reveals that the co-text to the right of innings is always a prepositional phrase, characterised by the collocation in and the colligation Bollywood, the major film industry in India. The left-hand side of the node is also determined by the ordinal number collocate second and a pronoun colligation. However, the meaning of innings changes when we examine the co-text of lines 3 and 4. In these instances, innings refers to the success an entertainer enjoys after a period of time in the entertainment field. The colligation short and the collocation Bollywood serve to disambiguate this second meaning from the first.

Nevertheless, whichever meaning we are looking at, it can be observed that the semantic prosody reflects an ambition either to succeed or to continue one’s success in Bollywood.
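The kind of left-hand and right-hand co-text inspection described above can be approximated by counting the words that occur within a small span on either side of the node. The following Python sketch makes several assumptions: the function name collocates, the span of two words and the crude tokenisation are introduced here, and the sample text is invented for illustration rather than taken from The Times of India.

```python
import re
from collections import Counter


def collocates(text: str, node: str, span: int = 2):
    """Count words within `span` positions to the left and right of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    left, right = Counter(), Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left.update(tokens[max(0, i - span):i])
            right.update(tokens[i + 1:i + 1 + span])
    return left, right


# Invented sample text, for demonstration only.
sample = ("Her second innings in Bollywood began well. "
          "His second innings in Bollywood was short.")
left, right = collocates(sample, "innings")
print(left.most_common(3))
print(right.most_common(3))
```

Raw counts of this kind only suggest candidate collocates; judging semantic prosody still requires reading the concordance lines themselves.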

tricolour

Figure 2: Concordance listing for tricolour (from The Times of India online)

In India, not just ‘any three coloured flag or a flag having three coloured stripes’ can be termed tricolour. The term is reserved solely for the national flag of India, which has three colours – green, orange and white. In lines 1, 2, 3, 5 and 6, one notes that tricolour collocates or colligates with a verb on the left-hand side, in these instances flying, displaying or sending up. Tricolour also collocates very strongly with a preposition acting as a locative marker on the right-hand side. In line 4, however, tricolour becomes a proper noun and is capitalised. This stems from the fact that there had been much press coverage about Indians not according the tricolour proper respect, to the extent that the issue and the views on the matter became ‘the Tricolour controversy’, referring specifically to this case. Because the tricolour is the national flag, the semantic prosody is one of respect. In this concordance listing, the debate over the treatment of the flag arose owing to views that it was not being treated respectfully.

There are two other interesting word usage patterns worth mentioning. Firstly, one does not hold up (meaning ‘respect’) ‘their national flag with great respect and pride’. The use of hold up tends to remind one of a delay (someone or something holds you up), a robbery (hold up a bank) or the action of moving one’s hand upwards. Secondly, one does not flag off (meaning ‘start’) a controversy. One only flags off a race.

chargesheeted

Figure 3: Concordance listing for chargesheeted (from The Times of India online)

The third specimen lexical item, chargesheet, which consists of two independent English words, namely charge and sheet, is unique to Indian English. According to my informants, the compounding of two words to form a new one is common in India and marks the creativeness of Indian English in coining new forms.

As is evident from Figure 3, chargesheeted is used as an intransitive verb to refer to ‘offenders who were charged in court or their names placed on the charge sheet’. Its usage is restricted solely to legal discourse. As such, we would expect the co-text to contain lexical items pertaining to crimes, such as ‘embezzlement’, and the collocation or colligation to be a number, referring either to the year in which he or she was chargesheeted or to the length of the sentence. The semantic prosody reflects the negative, undesirable side of human beings.

Group C: Words or Hybrids of non-English origin used in formal situations

benami

In this concordance listing, the word benami is used as an attributive adjective, acting as a pre-modifier to the head of the noun phrase in all three instances. One can hazard a guess at its meaning, which brings to mind the ‘illegal’ nature of the transaction. With a negative connotation attached to the meaning of the word, it is not surprising to find the illegal activity ‘black money (scheme)’ acting as a very strong compound collocate alongside benami. However, one feels that the usage of benami in lines 1 and 3 is redundant. After all, the occurrence of the compound ‘black money’ already reminds one that the money comes from illegal means. This is further strengthened by the semantic prosody, which reflects a particular attitude (often negative) towards benami.

azadi

Figure 5: Concordance listing for azadi (from The Times of India online)

Azadi in Figure 5 is used as a noun meaning freedom or independence. What is most obvious from the concordance listing is that azadi collocates frequently with the preposition of, functioning as the noun in the postposed prepositional phrase. In other instances, as in lines 8, 9, 13, 14 and 15, azadi is portrayed as the goal to be attained (e.g. provide full azadi, Hurriyat’s azadi, concrete step towards azadi). Other lexical items commonly found alongside azadi include ‘autonomy’, ‘law’, ‘order’ and ‘independence’, suggesting either that the successful attainment of azadi will result in these other norms or that these other norms will help give rise to azadi.

What is interesting in the context of usage is the formation of ministry of azadi. In this case, the ministry is the actor and azadi is the goal they hope to achieve. In other cases, it is the man on the street who wishes for azadi. Yet this dichotomy reveals that the ministry and the man on the street will differ significantly with regard to their own definition of what full azadi or true azadi symbolises. Nevertheless, the semantic prosody indicates a deep sense of yearning and positive hope.

insaniyat

Figure 6: Concordance listing for insaniyat (from The Times of India online)

The lexical item insaniyat (meaning humanity) functions as the noun in the postposed prepositional phrase marked by of. The colligation, which really functions as the head of the noun phrase, is defined by the domain, limitations or foundation upon which insaniyat is built. One can talk about or construe notions of insaniyat, but ultimately the boundaries of insaniyat differ from person to person. The semantic prosody is one that reflects the opinion(s) of different actors with regard to insaniyat.

Group D: Words of English origin used in informal situations

starrers

Figure 7: Concordance listing for starrers (from The Times of India online)

As is evident from Figure 7, starrers is a fascinating construction which demonstrates the creativeness of Hinglish. It is formed from the lemma star, to which the agentive suffix –er is added, followed by –s to denote plurality. In both instances, it collocates so strongly with multi that multi and starrers have become fused together as a compound. This noun compound refers to a movie which has many famous celebrities acting in it.

According to one of my informants, it is ironic that although both multi-starrers and Bollywood have close links with each other and with the film industry as a whole, only the latter has been elevated in status and accepted in formal situations. In contrast, an utterance like ‘the show has multi-starrers’ is reserved for informal usage.

…Saturday because that’s the day they cut the birds at the chicken farm.

If you don’t like the word, cut it.

My wife wants me to cut out my moustache.

Figure 8: Concordance listing for cut (from Indian and British English)

Although cut, as used in all three instances in Figure 8, continues to function as a verb, these uses are never found in formal speech or writing. One notes the polysemy of cut as it moves from ‘slaughter’ in line 1 to ‘remove’ in line 2 to ‘shave’ in line 3. As cut behaves as a transitive verb, it has to take a collocate in the form of a noun phrase on the right-hand side. However, the collocational strength between cut and the collocate it takes is weak, and one cannot predict with certainty what else a Hinglish speaker can cut. According to my informants, one can even cut (meaning ‘crack’) a joke or cut (meaning ‘solve’) the problem, among many others. The semantic prosody reflects an attempt to make things better by undertaking a remedial or appropriate action.

funterprise

Figure 9: Concordance listing for funterprise (from The Times of India online)

Perhaps one of the most interesting lexical items is funterprise, a blend of fun and enterprise. Participants enjoy taking part in game shows as they are fun and, very often, there are prizes to be won which are sponsored by an enterprise, be it a company or business. While it is asserted that Hinglish items like funterprise typically occur in informal situations, they do nevertheless get used, as in the present case, in a formal text among words typically found in Group A, such as its collocate epitomises and, in the next clause, ‘adagios of suspense’ and ‘crescendos of success’, for stylistic reasons. It could have been a deliberate attempt by the writer to catch the reader’s attention by focussing on the overarching funterprise, which leads to the intense mood that builds during the process and the exhilaration of winning (musical terms being used to signify the tempo of the atmosphere).

Group E: Words or hybrids of non-English origin used in informal situations

goonda

Figure 10: Concordance listing for goonda (from The Times of India online)

The concordance listing for a word such as goonda confirms both that words in the Indian context can be graded and that there are terms which occur largely in informal, spoken contexts. A word like goonda (which means a ruffian) was borrowed from indigenous languages such as Hindi and Bengali. Goonda is very interesting because, while most would agree that words used in informal contexts do not conform to standard English grammar, my informants say that it is common to hear of goondaism, formed by the morphological process of deriving the noun goondaism from the noun goonda. Also, according to The Oxford Companion to the English Language, goonda can take part in a hybridisation process, for example ‘a goonda ordinance’ (an ordinance against goondas). The semantic prosody is reflective of society’s attitude towards the goonda, which is often one of disapproval.

Anger karna Sami.

Love karna Sami.

Marriage karna Sami.

Pity karna Sami.

Worry karna Sami.

Figure 11: Concordance listing for karna (from Indian and British English)

An example of a very productive hybrid word formation is realised by the Hindi operator karna, which is used in informal contexts. When karna, which means ‘to…’, attaches itself to English words such as ‘anger’ or ‘love’, Hinglization takes place. Karna is able to collocate with most such words to its left, especially emotive nouns. For example, ‘love karna Sami’ translates into ‘to love Sami’ and ‘marriage karna Sami’ means ‘to get married to Sami’.

CONCLUSION

From the examination of words gleaned from corpus evidence, it is clear that a diglossic situation exists in India, where Indian English represents the H variety and is used in formal situations whereas Hinglish, the L variety, is reserved for use in informal situations. In fact, the L variety seems to be going from strength to strength, as scholars observe that Indian youths have adopted Hinglish as their official language and are using it as a statement of their individuality and as a tool for binding and bonding. Also, we cannot forget market forces such as MTV (which has successfully made inroads into India) popularising the use of Hinglish. If one buys Saraf’s argument, ‘Hinglish may be considered as a restructuring of the relationship of power and a passage from marginalisation through language to linguistic empowerment’ (Saraf).

There is no doubt that the concentric circles model demonstrates that words used in the Indian context can be graded and divided into different groups. But Ooi himself acknowledges ‘the inadequacy of mere labelling in determining whether a certain lexical or grammatical item is an exclusive feature of a particular group of words or variety’ (Ooi 2000). After all, if a word occurs very frequently in a formal genre like a newspaper, chances are that it is used in formal situations. But this need not always be true. It could have been used simply to invoke familiarity or for stylistic reasons (e.g. to bring attention to a particular problem such as goondaism or a phenomenon like funterprise). In addition, one cannot disregard the considerable number of Hindi, Sanskrit or Urdu words that have been codified in Western dictionaries. Examples are sari, pundit, crore, guru and roti. In the concentric circles model, they would be considered Group A words.

While one cannot disregard the usefulness of the concentric circles model as a tool, one must also be willing to be critical of it. When we examine corpus evidence, we are actually looking at language use from a synchronic perspective. A study of the same word’s usage ten or twenty years ago could reveal enormous differences. This is especially true of the word Bollywood, which can now be used in both formal and informal contexts owing to the flourishing cum gradual maturing of India’s film industry. Very often, linguistic changes are closely tied to societal changes. After all, there is a bi-directional relationship between language and society: the language used reflects societal norms and, similarly, societal norms construe and determine language use.

One envisages that an increasingly self-confident India will assert her linguistic heritage with regard to her ten indigenous languages and hundreds of dialects. Both Indian English and Hinglish imply a way of life and, more importantly, an Indian way of life, to be expressed both internationally and intranationally.

Acknowledgements

I would like to thank my two informants, Dr. Gyanesh Kudaisya and Dr. SP Thinnappan, lecturers from the Department of South Asian Studies, for assisting me with this paper. Their academic perspective is greatly appreciated.

Bibliography

Green, J. ‘Word Wizard’, Critical Quarterly 2. http://www.wordwizard.com/critq2.htm

Dictionary.com http://www.dictionary.com

Kandiah, Thiru. 1998. ‘Why New Englishes?’ in Foley, J. et al., English in New Cultural Contexts – Reflections from Singapore. Singapore: Oxford University Press.

Kachru, Braj B. 1983. The Indianization of English: The English Language in India. Delhi: Oxford University Press.

McArthur, T. 1998. The Oxford Companion to the English Language. http://w2.xrefer.com/entry/442361 and http://w2.xrefer.com/entry.jsp?xrefid=442452

Nihalani, P., Tongue, R. K. and Hosali, P. 1979. Indian and British English: A Handbook of Usage and Pronunciation. Delhi: Oxford University Press.

Ooi, V. B. Y. 1998. ‘The Implications of using Nativised Language Corpora for Lexicography’ in Allison et al., ed. Text and Generation. Singapore: Singapore University Press.

_______. 2000. ‘Upholding Standards or Passively Observing Language? Corpus Evidence and the Concentric Circles Model’ in Ooi, V. B. Y. ed. Evolving Identities: The English Language in Singapore and Malaysia. Singapore: Times Academic Press.

Phillipson, R. 1992. Linguistic Imperialism. Oxford: Oxford University Press.

Saraf, Babli Moitra. Hinglish – Resistance, Empowerment or Marginalization? http://www.iias.nl/host/ccrss/cp/cp3/cp3-Hinglish.html

Sengupta, Ramananda. Indian English meets the Web. http://www.ronscheer.com/html/readingroom12.html#topic5

Shastri, S. V., Patilkulkarni, C. T. and Shastri, Geeta S. 1986. The Kolhapur Corpus of Indian English. Department of English, Shivaji University, Kolhapur. http://www.hit.uib.no/icame/kolhapur/kolman.htm#intro

Stardust Magazine. http://www.stardustmag.com

The Marbat Festival of Nagpur. http://www.nagpurkhoj.com/main/hinglish.htm

The Times of India online newspaper. http://www.timesofindia.com

White, Ron. Going Round in Circles: English as an International Language, and Cross-cultural capability. http://www.rdg.ac.uk/AcaDepts/cl/CALS/circles.html