Tuesday, February 10, 2004

Reader implementation --- Pick out statement

1) Read first "InputTextBufferLengthMax" characters
InputTextBufferLengthMax = 300 characters

2) Try to find "EndOfStatement".
EndOfStatement: "." (period) followed by " " (space) or <CR>.

3) If "EndOfStatement" was found within "InputTextBufferLengthMax" characters, then
InputTextBufferCurrentEnd = position of EndOfStatement + 1 (or + 2 if the "." is followed by " ").
InputTextNextStartPosition = InputTextBufferCurrentEnd
Continue parsing at a deeper level (ParseStatement).

4) Otherwise (if EndOfStatement wasn't found), treat ";" (semicolon) followed by " " (space) or <CR> as EndOfStatement. Repeat the EndOfStatement search.

5) If EndOfStatement still wasn't found: treat " -" (space followed by dash) as EndOfStatement. Repeat the EndOfStatement search.

6) If EndOfStatement still wasn't found: treat "," (comma) followed by " " (space) or <CR> as EndOfStatement. Repeat the EndOfStatement search.

7) Consider "[", "]", "(", ")" as EndOfStatement.
8) Consider "," (comma without following space) as EndOfStatement.

9) Consider " " as EndOfStatement.

10) Consider any special character (neither a letter nor a digit) as EndOfStatement.

11) Consider any digit as EndOfStatement.
12) Consider any uppercase letter as EndOfStatement.

13) Complain about the bad quality of the text. Consider any character as EndOfStatement.
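
The fallback search in steps 1 to 13 can be sketched as an ordered list of patterns that are tried one after another on the buffer. This is only an illustration in Python, not the Reader's actual code. The name pick_statement, the treatment of <CR> as "\r" or "\n", taking the first match in the buffer, accepting both "." and "," in step 8, and the ASCII-only notion of letter and digit are all assumptions.

```python
import re

INPUT_TEXT_BUFFER_LENGTH_MAX = 300  # value given in the notes

# Fallback tiers, tried in order. A pattern consumes the marker and, where the
# notes ask for it, one following space, so the match end is "marker + 1"
# or "marker + 2". A following <CR> is only looked at, not consumed.
END_OF_STATEMENT_TIERS = [
    r"\.(?: |(?=[\r\n]))",  # step 2: "." followed by space or <CR>
    r";(?: |(?=[\r\n]))",   # step 4: ";" followed by space or <CR>
    r" -",                  # step 5: space followed by dash
    r",(?: |(?=[\r\n]))",   # step 6: "," followed by space or <CR>
    r"[\[\]()]",            # step 7: brackets and parentheses
    r"[.,]",                # step 8: no following space (assumption: "." and "," both accepted)
    r" ",                   # step 9: space
    r"[^A-Za-z0-9]",        # step 10: any special character (ASCII view)
    r"[0-9]",               # step 11: any digit
    r"[A-Z]",               # step 12: any uppercase letter
    r"[\s\S]",              # step 13: any character at all
]


def pick_statement(text, start=0):
    """Return (statement, next_start, tier_index), or None at the end of the text."""
    buffer = text[start:start + INPUT_TEXT_BUFFER_LENGTH_MAX]
    if not buffer:
        return None
    for tier, pattern in enumerate(END_OF_STATEMENT_TIERS):
        match = re.search(pattern, buffer)  # first occurrence in the buffer
        if match:
            end = match.end()  # plays the role of InputTextBufferCurrentEnd
            return buffer[:end], start + end, tier
    raise AssertionError("unreachable: the last tier matches any character")
```

A caller could loop by passing the returned next_start back in as start, and could "complain about the quality of the text" whenever the returned tier index is the last one. Every tier consumes at least one character, so the loop always makes progress.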

When to merge Short Memory and Main Memory?

Short Memory is merged with Main Memory once every "ShortMainMemoryMergePeriod".
Let "ShortMainMemoryMergePeriod" = 3 minutes.

If a neuron wasn't updated during a "ShortMainMemoryMergePeriod", its "ShortMemoryNeuronStagnationCounter" is increased.
If "ShortMemoryNeuronStagnationCounter" reaches "ShortMemoryNeuronStagnationCounterDestroyLevel", the neuron is removed from the Short Memory.

"ShortMemoryNeuronStagnationCounterDestroyLevel" = 5

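The merge and stagnation rules above might look roughly like this in Python. The sketch is hypothetical: the ShortMemoryNeuron fields, the function name merge_period_tick, and the idea that "merge" simply copies the neuron's strength into Main Memory are assumptions, since the notes do not define them. The notes also do not say whether the counter is reset when a neuron is updated, so the sketch never resets it.

```python
from dataclasses import dataclass

SHORT_MAIN_MEMORY_MERGE_PERIOD_SECONDS = 3 * 60        # 3 minutes, from the notes
SHORT_MEMORY_NEURON_STAGNATION_COUNTER_DESTROY_LEVEL = 5  # from the notes


@dataclass
class ShortMemoryNeuron:
    neuron_id: int
    strength: float = 1.0              # hypothetical payload
    updated_this_period: bool = False  # set by the reader when the neuron is touched
    stagnation_counter: int = 0


def merge_period_tick(short_memory, main_memory):
    """Run once per merge period; both memories are dicts keyed by neuron id."""
    for neuron_id, neuron in list(short_memory.items()):
        # Assumption: "merge" means copying the current strength to Main Memory.
        main_memory[neuron_id] = neuron.strength
        if not neuron.updated_this_period:
            neuron.stagnation_counter += 1
        neuron.updated_this_period = False  # start a fresh period
        if neuron.stagnation_counter >= SHORT_MEMORY_NEURON_STAGNATION_COUNTER_DESTROY_LEVEL:
            del short_memory[neuron_id]  # forgotten by Short Memory, kept in Main Memory
```

The caller is expected to invoke merge_period_tick on a timer every SHORT_MAIN_MEMORY_MERGE_PERIOD_SECONDS. With the values above, a neuron that is never updated leaves Short Memory after five periods, that is, 15 minutes.
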
Reader implementation --- Parse statement

1) Split the statement into words. Separator: any character that is neither a letter nor a digit, for example " ", ".", "'", "/".
2) Save each word into WordDictionary if it is not there yet.
Any special character is treated as a separate word.
A space (" ") is treated as nothing (no word).
3) At this point we have a list of words. Each word is represented by a NeuronId.
4) Try to find phrases in the word list
Let MaxQuantityOfWordsInPhrase = 5
The Statement Parser should try to pick out phrases.
Suppose we have the Statement ABCDEFG, where "A", "B", "C", "D", "E", "F", "G" are words.
Then the Parser should create phrases:
A
AB
ABC
ABCD
ABCDE
B
BC
BCD
BCDE
BCDEF
C
CD
CDE
CDEF
CDEFG
D
DE
DEF
DEFG
E
EF
EFG
F
FG
G
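
Steps 1 to 4 can be sketched in Python as follows. The function names, the use of a plain dict as WordDictionary, the sequential integers standing in for NeuronId, the ASCII-only view of letters and digits, and dropping all whitespace (not only " ") are assumptions made for the illustration.

```python
import re

MAX_QUANTITY_OF_WORDS_IN_PHRASE = 5  # value given in the notes


def split_into_words(statement):
    """Runs of letters/digits are words; each special character is its own word."""
    # Whitespace matches neither alternative, so it produces no word.
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]", statement)


def word_ids(words, word_dictionary):
    """Map words to ids, adding unseen words to the dictionary."""
    ids = []
    for word in words:
        if word not in word_dictionary:
            word_dictionary[word] = len(word_dictionary) + 1  # hypothetical NeuronId
        ids.append(word_dictionary[word])
    return ids


def find_phrases(words, max_words=MAX_QUANTITY_OF_WORDS_IN_PHRASE):
    """List every run of 1..max_words consecutive words, grouped by start word."""
    phrases = []
    for start in range(len(words)):
        for length in range(1, max_words + 1):
            if start + length > len(words):
                break  # the phrase would run past the end of the statement
            phrases.append(tuple(words[start:start + length]))
    return phrases
```

For the seven-word statement ABCDEFG, find_phrases(list("ABCDEFG")) yields the 25 phrases listed above, in the same order. In general a statement of n words gives at most 5 * n phrases, so the list grows linearly with statement length.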

If initial Strength of "A" is NewSingleWordPhraseStrength then
Strength of "AB", "EF", and "FG" will be 2 * NewSingleWordPhraseStrength
Strength of "ABC", "EFG", and "CDE" will be 3 * NewSingleWordPhraseStrength
That is, the initial strength is proportional to the number of words in the phrase.
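
As a small illustration of the strength rule, here it is in Python. The numeric value of NewSingleWordPhraseStrength is not given in the notes, so the 1.0 below is a placeholder, and the function name is hypothetical.

```python
NEW_SINGLE_WORD_PHRASE_STRENGTH = 1.0  # placeholder; the notes give no value


def initial_phrase_strength(phrase_words, base=NEW_SINGLE_WORD_PHRASE_STRENGTH):
    """Initial strength = number of words in the phrase * single-word strength."""
    return len(phrase_words) * base
```

With this rule "AB" starts at twice the strength of "A", and "CDE" at three times.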

********) At the end of Paragraph:
Try to find "EndOfParagraph".
EndOfParagraph: <CR> (^P), <BR>(?).
The end of a paragraph should trigger an additional ShortMemoryForgettingProcess.

Tuesday, February 03, 2004

Reader prototype core code

Reader
TextToParse.Parse(string SourceText);
sourceText = SourceText;

while (!EndOfText)
{
    SentenceToParse = TextToParse.GetNextSentence();
    CurrentWordList = SentenceToParse.SearchWords();
    CurrentPhraseList = SentenceToParse.SearchPhrases(CurrentWordList);
    CurrentTextUnitList = MergeWordAndPhraseList(CurrentWordList, CurrentPhraseList);
    TextPairList = SearchPairs(CurrentTextUnitList);
    TextPairList.SaveToDB();

    ReasonConsequenceRelationsList = SearchReasonConsequenceRelations(CurrentTextUnitList);
    // Partially clean old items out of ShortMemory, then add the new items from CurrentTextUnitList.
    ShortMemory.Add(CurrentTextUnitList);
    ShortMemory.ImproveReasonConsequenceRelations();
    ShortMemory.SaveToTheMainMemory();
}