World of Programming Examples Weblog:
An Open Source To-Do List
[Running Essay|
Weblog|
Topics]
September 1, 2002 - Chatterbots and English Language Teaching IV
Here's my first attempt to
classify the tags
in each Standard AIML file.
I'm taking a top down approach to figuring out how AIML works.
Standard AIML
is the tagset that is being developed as a standard foundation tagset for AIML.
Even though it was not the tagset used to win the Loebner
Turing test prizes the organization of the tags is more conceptual
and with 25,000 categories this seems like the best place to begin
if you want to thoroughly understand how
AIML works it natural language understanding and production
magic.
Finite state machine graphs might be useful
for identifying commonly occuring conversation patterns
that can be factored into AIML patterns:
[
Scripted Language,
List of Examples,
Example 1,
2,
3,
Noun Phrases
]
August 27, 2002 - Chatterbots and English Language Teaching III
For the concept of "Information Gap" in language teaching
I have set up a page of definitions and examples:
Information Gaps In English Teaching
Simulation Gaming and English Teaching
An interview with Will Wright
designer of
"The Sims"
simulation computer game talks about how players shift perspectives
frequently which generates things
to talk about and provides pronoun practice,
making it an effective tool for language teaching:
WW: It’s actually very interesting in The Sims how the pronouns change
all the time. I’m sitting there playing the game and I’m talking about,
"Oh, first I’m going to get a job, then I’m going to do this, then
I’m going to do that." And then you know when the character
starts disobeying me, all of a sudden I shift and say "Oh, Why
won’t he do that?" or "What’s he doing now?" And so at some
point it’s me kind of inhabiting this little person, and I’m thinking,
"It’s me, I’m going to get a job and I’m going to do x, y, and z."
But then when he starts rebelling, it’s he. And so then I kind of
jump out of him, and now it’s me vs. him. You know what I’m saying?
CP: Yes, I do. But one of things that interests me about the game
is that you have these semi-autonomous characters. They’re not
totally autonomous, and they’re not totally avatars either. They’re
somewhere in between. Do think that’s disorienting to the player,
or do you think it’s what makes the game fun?
WW: I don’t think so. I mean it’s interesting. I’m just surprised that
people can do that fluidly, they can so fluidly say "Oh, I’m this guy,
and then I’m going to do x, y, and z." And then they can pop out and
"Now I’m that person. I’m doing this that and the other. What’s he doing?"
And so now he’s a third person to me, even though he was me a moment
ago. I think that’s something we use a lot in our imaginations when we’re
modeling things. We’ll put ourselves in somebody else’s point of view very
specifically for a very short period of time. "Well, let’s see, if I were that person,
I would probably do x, y, and z." And then I kind of jump out of their head
and then I’m me, talking to them, relating to them.
August 26, 2002 - Chatterbots and English Language Teaching II
Information Gap Activities:
An example of an information gap situation
that could be set up by simple narrative descriptions:
Students are each given pages from an
appointment book and divided into two groups:
those who have to make appointments for some reason
and those with whom the appointment is made
(doctor, dentist, psychologist).
They go around the room finding time in their respective calendars
for appointments with each other.
They mix either randomly or move
in opposite directions around a circle.
To simulate students interacting in a classroom
language learning activity
we have to use two chatterbots talking to each other,
similar to what the
Forbin Project
does.
Unlike normal chatterbots there also has to be
near zero tolerance for grammatical errors.
Although a native speaker teacher of English
could probably put up with some errors,
a non-native speaker would be mislead and confused by them.
AIML's Recursive Pattern Matching:
The recursive pattern-matching in AIML is not new.
It has predecessors in the Lisp world that used
s-expressions
[
Examples;
Definitions:
1
2
3
]
rather than XML.
Recursive pattern matching is defined as a small language
embedded in the larger language Lisp.
Norvig in his book
Paradigms of Artificial Intelligence
(Chapters 5 and 6)
designs a general purpose s-expression
pattern matching language in Lisp that is very similar to AIML's.
Queinnec has a wonderful little book in French
Le Filtrage devoted to pattern matching in Lisp.
August 25, 2002 - Chatterbots and English Language Teaching I:
Using Chatterbots to Simulate Information Gap Language Teaching Activities
Computer aided lesson planning for language teachers
could become a reality, if
computational linguists
came up with some user friendly way of parsing and generating
language. To-date they haven't.
Simple text pattern matching
(text stimulus --> text response)
is a much more pragmatic approach
refined by
chatterbots
or conversational robots.
The AIML or Alicebot project
[Homepage,
Google].
which has won the
Loebner
(Turing Test competition)
two years in a row
leads the pack in this sort of application.
(Tech Note: It is basically an XML-based Domain Specific Language (DSL)
[Definition,
Articles]
that relies on recursive pattern-matching.
Some AIML users have
programmed simple story-like situations
[Mailing List Thread]
with AIML XML markup.
Situations like these can form the basis
for information gap activities in
communicative language teaching:
The basic idea [behind an information gap activity]
is based on pairwork, in which each student has
information that his/her partner doesn't have; the goal being to
combine their knowledge in order to complete a task. For example,
one student may have a map on which half the buildings (post office,
bank, newsagents, etc.) are marked, while the other has a similar map
showing only the other buildings. After some pre-teaching of the appropriate
question forms and prepositions of position, the students then take it
in turns to ask and answer, eventually filling in all the gaps in their respective maps.
(
"Information Gaps" , Andy Hoodith, Saitama University, Japan
)
AIML can simulate the dialogues of
students interacting in the classroom
solving the information gap of the story.
This would allow the teacher to better prepare and do
a dress rehearsal of the lesson,
anticipate errors made by students,
create example dialogues for class and tests,
and track teaching points covered.
August 7, 2002 - Programming Narrative IV
There are some things are missing in
Lang's code
:
-
The "grammar interpreter" is not included,
i.e. the clauses that match and execute the "--->" operator
clauses found in the production rules of the story grammar.
The interpreter uses a depth-first iterative deepening search with random choice.
The book
"Computational Intelligence"
actually includes such a Prolog interpreter
(see dsearch.pl and dsearch2.pl in chapter 4 of the
source code).
-
Since examples are not given explicitly with the source,
you have to go to appendix E (or F?) and cut and paste
world models for individual stories (like "The Bad Wife")
and run them under the story grammar.
-
There is no license included,
The source just happens to be included along with the PhD
thesis on an ftp site.
So it's not clear what uses of the system are legitimate.
Unintentional open source?
I writing him a letter asking him whether the source
code is in the public domain too weird?
-
Also no indication is given of the Prolog system is was originally
developed on.
August 6, 2002 - Programming Narrative III
Continuing from my last weblog entry, on Raymond Lang's
PhD thesis:
Lang's system is basically a programming language embedded in Prolog for programming narratives.
A story consists of a setting and a sequence of episodes.
Programs are written by declaring a world model in Prolog clauses.
The AI Agent Oriented Programming paradigm.
provides the logical foundation for modelling human intentions
and goal directed behavior as well as the temporal sequencing of events
A story's sequence of episodes contains "patterns of states and events"
generated by the story grammar.
The states and events are the content of the story.
They are the terminals of the story grammar
and are generated from the 13 predicates of the world model:
- Protagonist
- Fact: A fact is part of the story's setting.
- Episode Initiator: an event initiates an episode
- Effect: an act causes a state or event
- Consequence: if the world is in state 1, then it is also in state 2.
- Plan: A sequence of steps. Plans can be substituted for actions to attain a goal.
- Plan Effect: Given a plan, an event or a state will happen
- Emotional Reaction: If an event happens then the protagonist
will feel an emotion
- Action Motivation: If an agent has a given
emotion, than an action is possible
- Goal Motivation: An emotion motivates one to adopt a goal
- Intention: A protagonist believes that an action is a means to achieve a goal
The world predicates are applied to provide the stories
content:
The story grammar generates a tree representing a story.
When it generates a leaf, the grammar specifies the type
of component which must appear at the leaf node,
but does not instantiate the component.
Instead, the grammar calls predicates defined in a world model...
For example, two terminal components of an episode are an event and the
protagonist's reaction to that event. However, the grammar specifies neither the event
nor the reaction. To instantiate these components, the grammar invokes the world
model, which specifies the events that may occur and what a character's reaction to an
event will be. The following rule rewrites an emotional response as an element in the
event list. The predicate emotion(E) invokes the world model to supply an emotion.
(Chapter 8)
emotional_resp(P,emot(E),T) -->
[holds(feels(P,E),T)],
{emotion(E)}.
A story consists of a setting and a list of episodes.
This can be seen in the root clause of the Prolog
program: "story(story(Setting,EpisodeList))".
Each episode in the episode list contains the features:
- Event: an initiating event.
- Emotional Response: an emotional response on the part of the protagonist.
- Action: an action response on the part of the protagonist.
- Outcome: an outcome or state description which holds at the
conclusion of the episode.
The form of the story is embodied in
coherence relations of "causality" and "goal directedness"
which "glue" together the events and states of the world model. (168)
These coherence relations are part of the story grammar
and have the same rigorous logical foundations that
Aspect Oriented Programming has.
They take up the bulk of the thesis.
Future weblog entries will focus on this technical side
of the thesis and how it relates to the Agent Oriented Programming paradigm,
as well as writing some declarative programs
for new types of narratives.
Additional link of interest:
August 4, 2002 - Programming Narrative II
Reading the papers I pulled off-line yesterday and
digging a little deeper I found that there's a lot more
on-line than there was two years ago when I researched this topic
and.......... I finally got my hands on some source code!
The only work that I found truly satisfying this time around was
Raymond Lang's PhD work which contains a
PhD thesis, source code, a powerpoint presentation of his thesis defence,
and a
paper he published.
(the source was "A Good Bibliography" below)
His approach to generating stories with bare bones Prolog
is an approach I also tried.
It's simplicity is appealing.
Now we can finally see how to do it from a to z,
full source code, no ambiguity here.
This guy should be applauded.
My future weblog entries on the topic of
"Programmng Narrative" will analyze his ideas and code.
For a good survey of several story generating systems see the
Masters thesis
A Model of Story Generation
which provides a broad overview of several systems
that I actually had to dig around in the stacks of U.C. Berkeley
library to find out about two years ago.
The simplest system covered is
Meehan's Tailspin.
It's a working piece of Lisp code,
but the stories it generates aren't very interesting.
Lebowitz's story generation system generates an endless soap opera.
This is the most promising system covered in the survey.
The intriguing little snippets of Lisp included in the original papers
are intriguing.
The programmer programs plot fragments that are woven together
by the program which uses AI planning algorithms.
Lebowitz outlines the algorithms he uses a little bit,
but if you are a programmer, in the end the papers are frustrating.
There is no source code and you'd have to be pretty
smart to reconstruct the whole iceberg from the little tips that
he gives you.
Last time I looked the whole field of story generation
lacked a single line of source code.
That was truly discouraging, everything seemed to be pie-in-the-sky
vaporware without source code or a detailed enough description
to reconstruct the source code from,
but that's the way things often used to be in academia before the
idea of open source became widespread.
The high-brow professors would gather around the cozy fireplace
in the faculty lounge attired in their tweed jackets,
recite their papers out to each other, throw out a couple of narrowly defined
pertinent comments, and then sip their brandy warming themselves by the fire.
I hope open source has changed that a little.
Nowadays, let's hope a paper without source code
would be like wearing a heavy parka to the swimming pool on a sunny day...
(open sourcers = sunbathers, sun = wisdom, knowledge,
skimpy bathing suits = sharing knowledge,
heavy parka = hoarding knowledge)
or like an English professor presenting a paper about a novel
with the only copy of the novel locked up in his desk,
not exactly what universities are supposed to be about.
The most grievous party in the no-source-code-vaporware department is
Bringsford
who managed to publish a book and generate
a lot of hype about his story-of-betrayal story generator
Brutus and yet not provide a single line of source code
but many a line of pristine mathematical proof,
an AI Ponzi game if you ask me.
At the height of the Open Source movement
Brutus was written in a proprietary form of Prolog.
After an hour or so of slogging through Google
I finally managed to find an
online explanation of Lehnert's "Plot Units",
an idea dating back to 1982.
Plot Units are graph-like structures that represent the
"emotional reactions and interplay between characters,"
They provide the background information for story understanding.
In narratives "the ratio of inferred propositions to explicitly stated
propositions is estimated to be about 8:1" so any story understanding
program must supplement a narrative text with a conceptual structure
representing what is implicitly understood if it wants to get anywhere.
The presentation linked to above shows how different
patterns of emotional reaction can be constructed out of
primitive plot units and then how a short story can
be diagrammed. (Don't tell your poet friends.
They'll likely wretch.)
Plot units include
success, failure, motivation, change of mind,
perserverance, loss, resolution,
trade-off, mixed blessing, hidden blessing,
sacrifice, killing two birds with one stone,
fleeting success, starting over,
giving up, intentional problem resolution,
fortuitous problem resolution,
success born of adversity,
threat, and promise.
Additional links:
August 2, 2002 - Programming Narratives
Here's a
visual programming system for constructing multi-perspective narratives
like the famous Japanese film
Rashomon.
Unfortunately it's not open sourced
and is written in a now defunct proprietary multi-media language by
the same people who made Quark-Express.
This visual programming system is for
"designing and presenting metalinear cinematic narratives."
A "metalinear narrative" is a "collection of small related story pieces designed
to be arranged in many different ways, to tell many different linear stories from
different points of view."
Visual tools are provided for "authoring pieces of stories,
authoring the relationships between the many story pieces, and for
designing an abstract narrative structure for sequencing those pieces."
Narrative programming could form the basis for a more complex control algorithm
for simulation games like the
Sims.
I've written about a Sims-like simulation of
Balzac's "Comedie Humaine",
a narrative reconstruction of early nineteenth century Paris
and the longest serial novel in history.
July 31, 2002 -
Quiz Cards - An Open-Sourced Quiz Program
To expand upon the
quiz example
mentioned in the
programming example essay.
QuizCards
is written in Java and has a nice design:
- Stacks of two-sided flash cards.
- Each side of a card can have a picture or text
and/or sound associated with it.
- Cards can be organized into separate quiz sets for presentation.
- Cards can be dropped from a quiz as they are mastered.
- XML is used to store the cards.
July 29, 2002: Searching for Information Within Large Documents.
Search engines have become an indispensable feature in many peoples' lives,
but they only help you locate the most relevant documents among all the
documents in a large collection like all the webpages on the web.
What if you want to find passages within a large unstructured
document that are relevant to your search topic?
Balzac's gigantic serial novel
"The Comedy Humaine"
with almost 100 texts and several recurring characters
is what intially stimulated my interest in within-document search.
I wanted to find
interesting character descriptions
that Balzac had written.
Marti Hearst
does a lot of research on within document search.
She is an associate professor at
U.C. Berkeley's School of Information Management (SIMS)
a school that has refocused itself on Information Retrieval,
a computing discipline, disposing with
the traditional library science.
The chapter she wrote on
user interfaces and visualization
for a survey of the information retrieval field
has many interesting user interface designs.
Her research pages on
Text Tiling
and Tile Bars
provide a useful and easy way to show what sections of a large
document are relevant to a topic.
A series of adjacent buttons are displayed, each representing
a section of the text (300 words let's say).
Each button is a shade of grey,
white indicating that the passage is completely irrelevant to the topic
and black indicating that it is very relevant.
The greyer the button, the more relevant to the document.
|