The Role Of Knowledge In Natural Language Understanding

“The essence of language as a human activity lies not in its ability to reflect the world, but in its characteristic of creating commitment…” (Winograd and Flores, 1986)

INTRODUCTION

Most people appear to take the understanding of natural language for granted. Yet language is often filled with ambiguities, double meanings, context-sensitive observations and hidden implications. If humans already find it difficult not to misinterpret another speaker’s utterance, it is all the more a Herculean task for a computer. In fact, in her book Artificial Intelligence and Natural Man, Margaret Boden is somewhat scathing about people’s naivety in accepting the idea that a computer can understand human language, with its subtle variations and in-built knowledge.

AIMS

Of particular interest to this paper is ELIZA, which will act as the starting point for an evaluation of the nature of the knowledge required for language understanding.

ELIZA UNVEILED

ELIZA was one of the pioneering computer programs that made natural language conversation with a computer possible. It was developed by Joseph Weizenbaum in 1966 to model the language behaviour of a Rogerian psychotherapist talking to a patient, and it became a hot topic in the psychiatric community as a possible tool for therapy. On first impressions, ELIZA comes across as intelligent because of its ability to mimic human conversation. Furthermore, the breadth of its in-built “knowledge” is admirable, ranging from family and dieting to drugs and religion. Users became emotionally involved with ELIZA, and some believed that it demonstrated a general solution to the problem of the understanding of natural language by computers.

However, to claim that ELIZA is intelligent is to trivialise human interaction and worth. The success of the simulated intelligence depends heavily on the user having fairly restricted expectations of the system’s responses. All ELIZA does is match keywords in the typed input against patterns stored in its database and return the associated reply. It can only manipulate syntax (grammar) and check for some key words. For instance, its response to the statement ‘men are all alike’ was ‘in what way?’. Its replies are only as good as the programmer’s ingenuity in devising standardised replies; no original thinking is required at all. Even to suggest that ELIZA has passed the famed Turing test is a mockery. When the user is deliberately uncooperative and poses questions and responses that are not part of ELIZA’s “knowledge”, ELIZA’s façade begins to crumble. When it encounters unknown words, it generates generalised and sometimes unrelated responses after checking earlier stored inputs.
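The keyword matching described above can be illustrated with a minimal sketch. The rule table, the fallback reply and the function name `respond` are hypothetical choices made for illustration; they are not taken from Weizenbaum’s script, apart from the ‘alike’ example quoted in the text.

```python
import re

# Hypothetical keyword rules: (pattern, canned reply).
# Only the "alike" -> "In what way?" pairing comes from the essay;
# the other entries are invented for illustration.
RULES = [
    (re.compile(r"\balike\b", re.IGNORECASE), "In what way?"),
    (re.compile(r"\b(mother|father|family)\b", re.IGNORECASE),
     "Tell me more about your family."),
    (re.compile(r"\balways\b", re.IGNORECASE),
     "Can you think of a specific example?"),
]

# Generic reply used when no keyword is recognised (an assumption).
FALLBACK = "Please go on."


def respond(utterance: str) -> str:
    """Return the canned reply of the first rule whose keyword appears."""
    for pattern, reply in RULES:
        if pattern.search(utterance):
            return reply
    return FALLBACK


print(respond("Men are all alike"))             # In what way?
print(respond("Tell me about interrogatives"))  # Please go on.
```

The second call shows the weakness the essay points to: an input outside the stored keywords receives only a generic reply, because nothing in the programme represents what the sentence means.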

THE NATURE OF “KNOWLEDGE” REQUIRED FOR LANGUAGE UNDERSTANDING

In general, natural language understanding programmes require three basic components, namely knowledge about the structure of the language (linguistic knowledge), knowledge about the structure of the world (world knowledge) and knowledge about the reasoning capability of human beings.

Knowledge about the structure of the language (linguistic knowledge)

As discussed above, even though ELIZA is capable of manipulating syntax to a certain degree, for instance by substituting ‘I am…’ with ‘…you are…’, it still lacks phonological, morphological, semantic and pragmatic knowledge. Humans exercise many different types of understanding naturally when they speak and write sentences.
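The ‘I am…’ to ‘…you are…’ substitution mentioned above might be sketched as follows. This is an assumption-laden illustration rather than ELIZA’s actual code: the reflection table, the reply template and the names `reflect` and `reflect_reply` are all hypothetical.

```python
import re

# Hypothetical first-person -> second-person word swaps (illustrative only).
REFLECTIONS = {"i": "you", "am": "are", "my": "your", "me": "you"}


def reflect(fragment: str) -> str:
    """Swap first-person words for second-person ones, word by word."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def reflect_reply(utterance: str) -> str:
    """Turn 'I am X' into a question about 'you are X'; otherwise fall back."""
    match = re.match(r"\s*i am (.*?)[.!?]*\s*$", utterance, re.IGNORECASE)
    if match:
        return f"Why do you say you are {reflect(match.group(1))}?"
    return "Please go on."  # assumed generic fallback


print(reflect_reply("I am unhappy with my job."))
# Why do you say you are unhappy with your job?
```

The substitution operates purely on the surface form of the sentence. Nothing in it knows what ‘unhappy’ or ‘job’ means, which is the gap the following paragraphs describe.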

Firstly, there is phonological understanding. This is where humans detect differences in the way sentences and words are said: at the end of a question we often raise our voice, we stress important words and we sarcastically dismiss others. If a programme like ELIZA could detect phonological elements such as stress and intonation, it could respond to the stressed portion of a sentence, and the same cues could be used to identify the emotional state behind the utterance.

Secondly, there is morphological understanding. This is where we recognise the smaller parts of words and realise their meaning. An example would be “happy” and “happiness”. We know what “happy” means, and we also know how the meaning changes when we add “ness”: the suffix turns the adjective “happy” into a noun.
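The “happy”/“happiness” example can be made concrete with a deliberately naive sketch. The function names `split_ness` and `describe` are hypothetical, and the single spelling rule (a stem ending in “i” is restored to “y”) is a simplifying assumption, not a general account of English morphology.

```python
from typing import Optional, Tuple

SUFFIX = "ness"  # the only suffix this toy analyser knows about


def split_ness(word: str) -> Tuple[str, Optional[str]]:
    """Split a word into (stem, suffix) if it ends in '-ness'."""
    lowered = word.lower()
    if lowered.endswith(SUFFIX) and len(lowered) > len(SUFFIX):
        stem = lowered[: -len(SUFFIX)]
        # Undo the y -> i spelling change: "happi" -> "happy".
        if stem.endswith("i"):
            stem = stem[:-1] + "y"
        return stem, SUFFIX
    return lowered, None


def describe(word: str) -> str:
    """Give a one-line, human-readable analysis of the word."""
    stem, suffix = split_ness(word)
    if suffix is None:
        return f"{stem}: no '-ness' suffix found"
    return f"{word.lower()}: noun formed from adjective '{stem}' + '-{suffix}'"


print(describe("happiness"))
print(describe("happy"))
```

A rule this crude also splits “witness” into “wit” plus “ness”, which is wrong. Telling genuine suffixes from accidental letter sequences requires a lexicon, which is part of the linguistic knowledge the essay argues ELIZA lacks.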

Thirdly, ELIZA-like systems have no semantic representation of the content of either the user’s input or the reply. Few things are more difficult for a computer than to ‘understand’ meaning. At the sentence level, meaning can be hidden or implied, although it becomes less problematic to decipher once the entire context is taken into consideration. ELIZA, however, lacks the ability to function at the semantic level. Thus implied meanings will be lost, ellipsis will be met with ELIZA’s request to be more ‘generous with words’, and the true essence of natural language understanding will be missed completely.

Next, the quality of ELIZA’s responses is limited by the sophistication with which it can process the input text at a syntactic level. For example, the number of templates available is a serious limitation. Humans, by contrast, understand syntactically: we know how to form words into correct structures and phrases using the grammar of our language. We also know how to understand sentences that are not grammatically correct. “Go I must, late it is” is not a correct sentence syntactically, but its meaning can easily be interpreted by an English speaker.

In addition, we are imbued with pragmatic understanding. This is where the same sentence can be interpreted differently in different situations. The simple statement “I’m cold” might mean that I need to put on a jacket if I am outside. If I am inside, it might mean that I want someone to turn up the heat. In summer, it could even mean that I am now comfortably cool, having previously been too hot.

The understanding of discourse is another skill we have. This is the ability to get a meaning from a whole set of sentences that are related. We will not always extract the same meaning from a paragraph of text if we don’t read the sentences in the correct order. This includes a temporal aspect of understanding.

More importantly, the linguistic knowledge component of natural language understanding needs to integrate all these kinds of understanding with the specific grammar of the language in order to interpret sentences correctly. After all, the different components of linguistics are all related in one way or another. Currently, ELIZA’s responses impose no structure on the conversation: each response is based entirely on the current input. Any sense of intelligence depends strongly on the coherence of the conversation as judged by the user.

Knowledge about the structure of the world (world knowledge)

Even though artificial intelligence programmes like ELIZA may impress us with the breadth of their “knowledge”, an important distinction needs to be made between knowledge and intelligence. A machine like ELIZA may store knowledge without necessarily possessing intelligence.

Knowledge is not only indispensable; it is also voluminous, hard to characterise accurately and constantly changing. After all, no one can deny that knowledge about objects, events, procedures and experiences varies from one person to another. ELIZA shows how easy it is to create and maintain the illusion of understanding, and hence perhaps an undeserved credibility. A certain danger lurks here. Although the breadth of “knowledge” encoded in ELIZA’s database is impressive, it is seriously lacking in depth. In other words, ELIZA tries to be all-encompassing in the fields it has been trained for, and this is one of its biggest weaknesses. A user who decides to engage ELIZA in a conversation at a deeper level will be frustrated by its incoherent responses or eccentric nonsense. Similarly, a user who converses with ELIZA about a very specialised domain, for example the use of interrogatives in language, discovers after two responses that it is not capable of sustaining such talk. Thus it might be wiser to constrain the scope of ELIZA and to define a specific domain or field in which it can be thoroughly trained (as is the objective of the modified ELIZA programme).

Knowledge about the reasoning capability of human beings

Another way of building a knowledge structure into a computer is to give the computer a mechanism for understanding why people do the things they do. Artificial intelligence programmes need to be trained to understand how humans formulate goals, the kinds of goals people in an event might want to achieve and the plans they make to achieve them. Presently, ELIZA only assumes the pose of understanding the user. In actual fact, it is the user who contributes to the conversation all sorts of background knowledge, insights and reasoning ability, which manifest themselves inferentially in the interpretations he makes of the offered responses.

The crucial test of understanding is that it is not sufficient for a computer programme merely to continue a conversation robustly; it must also be able to draw valid conclusions from what it is being told. Language understanding is not only a fixed relationship between a representation and the things represented, but also a commitment to carry out a dialogue within the full horizons of both speaker and hearer in a way that permits new distinctions to emerge. In Weizenbaum’s words, ‘ELIZA in its use so far has had as one of its principal objectives the concealment of its lack of understanding. But to encourage its conversational partner to offer inputs from which it can select remedial information, it must reveal its misunderstanding. The switch of objectives from the concealment to the revelation of misunderstanding is seen as a precondition to making an ELIZA-like programme the basis for an effective natural language man-machine communication system’ (Weizenbaum, 1966).

CONCLUSION

In conclusion, computer programmes like ELIZA can more realistically be compared to a useful replica workbook than to human beings. Nevertheless, ELIZA in its most elementary form is still a valuable tool as it enables us to understand the mechanisms of how humans function via the integration of linguistic knowledge, world knowledge and our reasoning capability.

In the course of modifying ELIZA for this paper, one comes to believe that natural language understanding might be feasible in the next ten years, for two reasons. Firstly, although ELIZA-like systems have often been seen in the coldest possible light as mere translating processors, we cannot deny that they apply rules, especially the grammar rules that have been built in religiously. The only thing stopping this from being useful is the computer’s inability to judge context. With the availability of large online corpora, enough data can be collected so that context can be applied to parsing language, and computers will be able to understand the intention of the user’s statements.

Secondly, we have already modelled our programming languages on our own languages, which paves the way for a more general object-oriented approach to designing computer software. This means we are moving closer to a common language that both humans and computers can understand. This may be the vital key to the success of natural language understanding. After all, we cannot expect computers to understand our language unless we understand it fully ourselves.

Bibliography

Boden, Margaret A. Artificial Intelligence and Natural Man. New York: Basic Books, 1987.

Cawkell, Tony. ‘When will Computers Think?’ in Online and CD Notes, Vol. 13, No. 6, July/August 2000. http://www.aslib.co.uk/notes/volume/number/articles/index.html

Greene, Judith. Language Understanding: A Cognitive Approach. Philadelphia: Open University Press, 1985.

Probert, Matthew. ‘Conversations with my Computer: A Handbook of Natural Language Processing and Conversational Computing’. http://www.probert-encyclopaedia.co.uk/Servile/NLP.HTM

Weizenbaum, Joseph. ‘ELIZA – A Computer Program for the Study of Natural Language Communication between Man and Machine’ in Communications of the ACM, Vol. 9, No. 1. New York: ACM Publishing, 1966.
