Friday, February 25, 2005

AI output --- response in Natural Language

Jiri> how exactly you want to generate the response sentences?

There are two approaches to generating the answer:
1) Simple approach (for limited AI)
Just copy:
- content of the most relevant page
- reference to this page
(like Google does).
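
A minimal sketch of the simple approach, in Python. The word-overlap relevance measure, the function name `answer_by_copying` and the sample pages are all assumptions made for illustration; the post does not say how "most relevant" is decided.

```python
def answer_by_copying(question, pages):
    """Return (content, reference) of the page sharing the most words with the question.

    `pages` maps a reference (for example a URL) to the page text.
    Word overlap is only a stand-in relevance measure assumed for this sketch.
    """
    question_words = set(question.lower().split())

    def relevance(item):
        _, text = item
        return len(question_words & set(text.lower().split()))

    reference, content = max(pages.items(), key=relevance)
    return content, reference


# Hypothetical sample data.
pages = {
    "page-a": "cats sleep most of the day",
    "page-b": "the stock market fell on friday",
}
content, reference = answer_by_copying("how long do cats sleep", pages)
```

The answer is just the copied content plus the reference, with no text generation involved.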

2) Writing text (for strong AI)
When the answer has been prepared in short memory (in the form of an answer concept list), it should be converted into Natural Language text.
The AI already has relations between words and concepts, so we can prepare NL text. The text wouldn't be nice to read, but it would already be in a natural language.

To make the text output better, the AI has to remember the typical flow of natural language. Such information could be stored in the TextPair table.

Information is gathered into the TextPair table during massive reading.
Basically, the TextPair table would hold statistical information about typical language constructions.
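
A rough sketch of how such a table could work, in Python. The layout is an assumption: here TextPair is simply a count of adjacent word pairs seen during reading, and the function names `read_text` and `order_words` are hypothetical. The real table may well store something richer.

```python
from collections import Counter


def read_text(text_pair, text):
    """Update the pair counts with every adjacent word pair found in `text`."""
    words = text.lower().split()
    text_pair.update(zip(words, words[1:]))


def order_words(text_pair, first, others):
    """Greedily chain words: after each word, pick the remaining word
    that most often followed it during reading."""
    ordered = [first]
    remaining = list(others)
    while remaining:
        current = ordered[-1]
        best = max(remaining, key=lambda w: text_pair[(current, w)])
        ordered.append(best)
        remaining.remove(best)
    return ordered


# "Massive reading", reduced to two hypothetical sentences.
text_pair = Counter()
read_text(text_pair, "the cat sat on the mat")
read_text(text_pair, "the dog sat on the rug")

# Words for the answer concepts arrive unordered; the pair counts suggest an order.
sentence = order_words(text_pair, "cat", ["on", "sat", "mat"])
```

With so little reading the statistics are thin, but the idea is the same at scale: the more text is read, the better the counts reflect typical language constructions.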

See also: Writer Prototype

Other things which could improve writing:
1) Phrase concepts could be converted into text too.
2) Output sentences should be kept short. Translating one abstract concept into one sentence would be a good idea.
3) While looking through the Pair table, search for synonyms to substitute for the original concepts.
4) The best feature, but the hardest to implement:
Use softcoded routines to generate the text: for every concept, find a softcoded routine which relates to both this concept and the "writing text" module.
These softcoded routines would output the actual text.
Obviously, these softcoded routines should be prepared prior to text generation. This could be done by two standard strong AI learning techniques: "knowledge download" and "experiment".
For example, during an experiment, successful softcoded routines would be adopted/reinforced. Inefficient softcoded routines would be erased.
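
The "experiment" technique from item 4 could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: routines are plain functions from a concept to text, each routine carries a numeric score, and `run_experiment`, `is_good` and the thresholds are hypothetical names and values.

```python
def run_experiment(routines, scores, concept, is_good,
                   reward=1.0, penalty=1.0, erase_below=-2.0):
    """Try every routine on `concept`; reinforce the ones whose output
    passes `is_good` and erase the ones whose score drops too low."""
    for name in list(routines):
        text = routines[name](concept)
        change = reward if is_good(text) else -penalty
        scores[name] = scores.get(name, 0.0) + change
        if scores[name] < erase_below:
            del routines[name]
            del scores[name]


# Two hypothetical softcoded routines for the "writing text" module.
routines = {
    "plain": lambda concept: f"This is a {concept}.",
    "shout": lambda concept: concept.upper() * 3,
}
scores = {}


def is_good(text):
    # Stand-in success check: the output must end like a sentence.
    return text.endswith(".")


for _ in range(3):
    run_experiment(routines, scores, "cat", is_good)
```

After three rounds only the "plain" routine is left; "shout" has been erased because it kept failing the check.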


> If it involves connecting parts of sentences from various regions of
> data based on statistics then it will often generate garbage.

You are wrong.
Even the pretty dumb ELIZA text generation algorithm works acceptably.
Why would a more efficient algorithm work worse?
