Vespers3D: The Adventure of Text Parsing

by Rubes · 03/09/2006 (11:32 am) · 21 comments

The Adventure of Text Parsing in TGE

When I decided to incorporate text parsing into a 3D FPS game, I soon realized that one of the most challenging tasks was going to be determining all of the different ways commands could be entered by the player. In a pure text game, except in certain situations, each command needs to explicitly state direct and indirect objects:

> READ BOOK

or

> UNLOCK CHEST WITH RED KEY

But a 3D game provides something that text games generally lack: visual and spatial reference. And since I have also incorporated object selection into the game, players can also provide an "unstated" reference to specific objects in the game world. Together, these make for some new ways of interacting with objects in the game world, along with interesting new challenges for a parsing engine.

For instance, in the text version of Vespers (the game upon which I am basing this 3D/IF experiment), there is an alms box in the entrance hall of the monastery with which the player can interact. Of course, there's not much you can do with it besides the following:

The alms box hangs closed beside the door.

>EXAMINE ALMS BOX
It used to be filled with alms. Now, mostly cobwebs.

>SEARCH ALMS BOX
You can't see inside, since the alms box is closed.

>OPEN ALMS BOX
A single coin sparkles at you from the bottom.

>CLOSE IT
You close the alms box.

However, in the 3D version of this game, all of these actions must be possible from a variety of player perspectives. For instance, a player might walk up to the alms box, so that it's directly in front of him, and want to examine it. How might this be done? Cue the imagery!

homepage.mac.com/rubes/pic01.jpg
1. Just by typing "EXAMINE ALMS BOX". Of course, most players won't know what the alms box is just by seeing it in the game, since there is no elaborate room description (like in the text version) to alert the player to the presence of an "alms box".

homepage.mac.com/rubes/pic02.jpg
2. By typing "EXAMINE IT" without selecting anything. After all, the player has walked right up to an object worth examining; should we always force him to select it first? In this case, the engine has to look in front of the player for an object that can appropriately substitute for "IT".

homepage.mac.com/rubes/pic03.jpg
3. By clicking on the alms box (to select it) and typing "EXAMINE" (or "EXAMINE IT"). In this case, the engine has to see that the EXAMINE command has no object (or just the pronoun IT) and know that the player is referring to the currently selected object.

homepage.mac.com/rubes/pic04.jpg
4. By clicking on the alms box (to select it) and clicking the right mouse button. The right mouse button can be set to any of a number of different verbs (like EXAMINE or TAKE), so clicking the right mouse button needs to trigger the entry of a command (in this case, EXAMINE) linked to the currently selected object (in this case, the alms box). The text command EXAMINE ALMS BOX is then created and sent to the parser.

(Note how the alms box becomes highlighted when it becomes the "selected object" -- thanks to John and TLK for that nice implementation.)

And the same, of course, is true for the rest of the commands that can be performed on the alms box. What was a relatively simple implementation in a text game (EXAMINE ALMS BOX or EXAMINE IT) becomes a bit more challenging in the 3D world to accommodate the different approaches that players might expect in this setting.
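The resolution order described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual engine code (Vespers3D is scripted in TorqueScript); the function and parameter names are hypothetical:

```python
# Hypothetical sketch: resolve a bare verb or "IT" against the player's
# current selection, or whatever object is directly in front of him.
# None of these names come from TGE -- they are stand-ins for engine hooks.

def resolve_target(words, selected_object=None, object_in_front=None):
    """Return (verb, noun) for commands like 'EXAMINE', 'EXAMINE IT',
    or 'EXAMINE ALMS BOX'."""
    verb, rest = words[0], words[1:]
    if rest and rest != ["IT"]:
        return verb, " ".join(rest)      # explicit noun: EXAMINE ALMS BOX
    if selected_object is not None:
        return verb, selected_object     # clicked object substitutes for IT
    if object_in_front is not None:
        return verb, object_in_front     # fall back to what the player faces
    return verb, None                    # parser must ask "Examine what?"

print(resolve_target(["EXAMINE", "ALMS", "BOX"]))                     # ('EXAMINE', 'ALMS BOX')
print(resolve_target(["EXAMINE", "IT"], selected_object="ALMS BOX"))  # ('EXAMINE', 'ALMS BOX')
print(resolve_target(["EXAMINE"], object_in_front="ALMS BOX"))        # ('EXAMINE', 'ALMS BOX')
```

Case 4 (the right mouse button) reduces to case 1: the click simply synthesizes the full text command before it reaches the parser.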

This becomes slightly more challenging for verbs that require both direct and indirect objects, like the UNLOCK CHEST WITH RED KEY command above. If the player has the red key in his inventory and walks right up to the chest, he should probably be able to type "UNLOCK WITH RED KEY" and trust that the engine knows he wants to unlock the chest with the red key. But since the direct object is missing from the command, the engine has to recognize that it must search for an object to insert into the command structure. These are the kinds of things that have kept me up nights.
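The missing-direct-object case might look something like this. Again, a Python sketch with made-up names, not the Vespers3D implementation:

```python
# Hypothetical sketch: for 'UNLOCK WITH RED KEY', splice in the object the
# player is facing as the missing direct object. 'facing_object' stands in
# for whatever spatial lookup the engine performs.

def complete_command(tokens, facing_object=None):
    """Return (verb, direct_object, indirect_object) from a token list."""
    if "WITH" in tokens:
        i = tokens.index("WITH")
        direct, indirect = tokens[1:i], tokens[i + 1:]
        if not direct and facing_object:
            direct = [facing_object]     # engine supplies the chest
        return tokens[0], " ".join(direct) or None, " ".join(indirect)
    return tokens[0], " ".join(tokens[1:]) or None, None

print(complete_command(["UNLOCK", "WITH", "RED", "KEY"], facing_object="CHEST"))
# ('UNLOCK', 'CHEST', 'RED KEY')
```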

But visual and spatial referencing introduces a new little twist to text commands. Should players be allowed to perform actions on objects without actually facing them? What if a player walks right up to that alms box, turns around 180 degrees, and types "OPEN ALMS BOX"? Do we allow that? Technically speaking, we shouldn't, but will players be frustrated by a message saying "You need to be facing the alms box to do that"? My belief is that players would probably prefer the consistency of a world that requires you to face objects you want to interact with...but that means most verbs that perform actions on objects need to make multiple checks on the object before carrying out the action:

- Is the object close enough to the player to allow the action? Players shouldn't be able to perform some actions on objects that are visible, but too far away.
- Is the object actually in the player's line-of-sight? Players shouldn't be able to perform an action on an object if there is a wall, pillar, whatever between them and the object.
- Is the object within the player's field of view? Players shouldn't be able to perform actions on objects that they are not able to "see".
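The three checks above could be sketched as follows. In the engine the line-of-sight test would be a raycast, which is stubbed here as a boolean; positions are 2D for brevity, the facing vector is assumed to be unit length, and all names are illustrative rather than engine API:

```python
import math

def can_interact(player_pos, player_facing, obj_pos,
                 max_dist=3.0, fov_deg=90.0, blocked=False):
    """Range, line-of-sight, and field-of-view checks before an action.
    'blocked' stands in for a raycast result; 'player_facing' is a unit vector."""
    dx, dy = obj_pos[0] - player_pos[0], obj_pos[1] - player_pos[1]
    dist = math.hypot(dx, dy)
    if dist > max_dist:
        return "too far away"
    if blocked:
        return "line of sight blocked"
    # Compare the angle to the object against half the field of view,
    # using the dot product of the facing vector and the direction to it.
    dot = (player_facing[0] * dx + player_facing[1] * dy) / max(dist, 1e-9)
    if dot < math.cos(math.radians(fov_deg / 2)):
        return "not facing the object"
    return "ok"

# Player at the origin facing +x; alms box two units ahead:
print(can_interact((0, 0), (1, 0), (2, 0)))    # ok
# Same spot, turned 180 degrees:
print(can_interact((0, 0), (-1, 0), (2, 0)))   # not facing the object
```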

Of course, none of these are issues that need to be dealt with in text IF, since visual and spatial referencing never really comes into play (except perhaps for things like objects contained within other objects).

But that brings up another important issue. What about objects that don't exist in the game world as individual models, or things that can't easily be selected? Things like:

- parts of a room (floor, walls, window, etc)
- non-selectable scenery objects (snow, etc)
- sub-objects of other selectable objects (eyes or skin of individuals, etc)
- "game" objects (like "game" in "save game")

These have to be implemented as "nouns", just like selectable objects, but the parser needs to handle them differently since they can't be selected. I ended up using the basic scriptable SimObject resource for these, so they can exist as script objects like regular nouns without being associated with DTS models. To distinguish them from regular nouns, I call these noun objects "floating nouns". So the engine now knows when it's dealing with a noun that's not associated with a world model, and deals with it appropriately.
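A minimal sketch of the distinction, assuming a simple noun registry. The class and field names here are mine, for illustration only; they don't reflect TGE's SimObject interface:

```python
# Hypothetical noun registry: model-backed nouns can be clicked in the 3D
# world, while "floating nouns" exist only as script objects.

class Noun:
    def __init__(self, name, model=None):
        self.name = name
        self.model = model               # None => floating noun (no DTS model)

    @property
    def floating(self):
        return self.model is None

nouns = {
    "alms box": Noun("alms box", model="almsbox.dts"),
    "wall":     Noun("wall"),            # part of the room, no model
    "game":     Noun("game"),            # as in SAVE GAME
}

def selectable(name):
    """Only model-backed nouns can be selected with the crosshair."""
    n = nouns.get(name)
    return n is not None and not n.floating

print(selectable("alms box"))   # True
print(selectable("wall"))       # False
```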

homepage.mac.com/rubes/pic05.jpg
Then, of course, are all of the little (but not really so little) features of text parsing that seem simple enough but cause all sorts of havoc when you try to implement them:

- pronouns (IT, HIM, HER, THEM)
- possessives (HIS, HER -- which doubles as a pronoun -- ITS, THEIR)
- numbers (splitting or grouping objects into different numbered groups is a beast)
- disambiguation ("Which key did you mean, the blue key or the red key?")
- multiple commands (TAKE KEY AND UNLOCK CHEST WITH IT).

Pronouns are now implemented, and possessives are not far behind. Fortunately, in the text version of Vespers there are no situations where numbers are needed, so that can safely be put off indefinitely. As for disambiguation and multiple commands, well, those are a little trickier. I'll probably have to implement disambiguation since it's such a widely accepted feature of text parsing engines. Multiple commands can be a real pain, but fortunately my understanding is that this is not widely used in the text IF world, so its absence would probably not be felt.
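At its simplest, disambiguation is just a matter of collecting all the nouns a phrase could refer to and asking the player to choose. A toy sketch (hypothetical names, not the Vespers parser):

```python
# Hypothetical sketch of disambiguation: match a noun phrase against the
# nouns currently in scope and ask a clarifying question if it's ambiguous.

def disambiguate(noun_phrase, world_nouns):
    """Return the unique match, a clarifying question, or None."""
    matches = [n for n in world_nouns
               if noun_phrase == n or noun_phrase in n.split()]
    if len(matches) == 1:
        return matches[0]
    if len(matches) > 1:
        choices = " or ".join("the " + m for m in matches)
        return f"Which did you mean, {choices}?"
    return None                          # "I don't know that word."

world = ["red key", "blue key", "chest"]
print(disambiguate("key", world))
# Which did you mean, the red key or the blue key?
print(disambiguate("chest", world))
# chest
```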

So I'm happy to say that most of the text parsing engine is actually done and is working pretty well. All it really needs now is implementation of actual game objects. Most verbs in the game pass the action to be performed to the object itself, so that each game object can determine the result of an action performed on it.

So I'm excited that, now that the meat of the text parser is done, I can get to work making Vespers3D actually look and feel like Vespers. Now we're talking.
#1
03/09/2006 (11:43 am)
Interesting! I hope this lets you pile on versatility and complexity without hand-coding every logic node. Also, love the bottom screenie... very moody.
#2
03/09/2006 (12:26 pm)
This compelled me to think about the Looking Glass games. There was an interface item which would allow you to select items in the "room" via the mouse as well as in your inventory. You could also select available actions. The first thought that I had was to use a menu parser with the selection system, but that kind of takes away the point of having a text parser at all.

Hmmm....I need to think more. I love the idea of this project, though. And it makes you think spatially rather than simply narratively in a text adventure, which is interesting.
#3
03/09/2006 (1:00 pm)
all i can think of is space balls.

(nice stuff BTW =)
#4
03/09/2006 (1:14 pm)
This is a very fascinating project! I really like to see developers bringing something different to the 3D environment, and this has some quite interesting directions it could take. Please keep posting about your progress, it will be nice to see what you do with the Vespers project.
#5
03/09/2006 (1:16 pm)
@Dave: Thanks...it does, to a degree. If you want something specific to happen with a particular verb-noun combination, much of that needs to be hand-coded. But nouns can also use default code for certain actions, like simple OPEN and CLOSE commands, and you can add to the default code as well. Hopefully, N.R. (who's doing the modelling) won't slap me for showing a bit of the scenery...

@David: Thanks as well. The real question is, does adding the spatial dimension really add to a text-based game? I think it does, although the degree to which it does really depends on the game idea and its implementation. A straight-up implementation of Vespers (the text game) may not prove to be the best way of demonstrating the capabilities of this hybrid text-FPS engine, since it's not a game designed to utilize or take advantage of that spatial dimension.

The inventory does create a bit of an inconsistency, since right now you can't actually "see" what's in your inventory. You can only list it, like in a text game, which means the only way to interact with inventory items is through actual text reference. I've been thinking about implementing a visual inventory, like many 3D games have, but I'm wondering about the specifics of implementation and how well it would synergize with the game.

@Allyn: Thanks...why Space Balls?
#6
03/09/2006 (1:59 pm)
My first thought in creating interaction (though not necessarily straight text-parser interaction, which is why I decided not to mention it) was to have context-sensitive pop-up menus. For example, using the alms box above: you move towards it, and when it is within "spatial range", its name pops up so that the player can interact with it. Selecting the name gives a menu of things to do with it. If you wanted to put a coin in it, an "inventory" or "put" menu/icon could then sub-menu out to the player's inventory (or only specific items that make sense to use with an alms box...like a coin). This would allow the player to visually construct their sentences according to what they see. To extend this to a text editor, think of IntelliSense. For things which are "active" (have their spatial triggers noted on-screen), the parser could suggest words. So if you type "put", it automatically searches your inventory...but only for items which match "active" elements (alms box) to suggest while typing.

It seems to me that there would have to be three zones of activity for spatial reference. The first would be active: when some point of interaction is close enough to the player to interact with, the player can engage it in whatever way necessary. This is a sub-zone of line-of-sight. Line of sight is basically anything on the screen, whether active or not. That way, if something is not active but is on the screen, it will give the player a message like "you're not close enough" (like in Sierra games of yore). If it is not in the player's line of sight, then saying "you don't see that" should suffice. The major problem with this is if you implement a timed game element where the player wants to interact with something close by without it being fully in sight (pull the lever and run through the gate...I don't want to turn, look at the lever, interact with the lever, turn, and then run for the gate in hopes of making it through in time...because that is frustrating).
#7
03/09/2006 (5:05 pm)
Text interaction makes more sense with NPCs, since it would be an actual conversation. One of the nuttier results of MUDs morphing into MMORPGs is that most of them are still designed with tons of slash commands like /dig, /taunt, etc., whereas text adventures abandoned the idea as fast as they were able -- see the LucasArts Monkey Island series...
#8
03/09/2006 (6:45 pm)
@David: I think I understand what you mean; the Deus Ex model comes to mind. I've never been a huge fan of constructing commands visually, mostly because I don't like the constraining and suggestive nature of that. Although there are rarely more than a few verbs that can be used with any particular noun object, I've always felt that presenting the user with the full list of options removes some of the creative thinking from the process and interrupts the sense of UI transparency.

This, in fact, has been a point of contention in the IF world for some time: should the player be privy to all possible actions he can take with a particular object? The "intellisense" system you mentioned is a similar concept in reverse, providing possible nouns for a particular verb based on the context. To me, that type of system just seems to deemphasize creative problem solving. Or perhaps it's more accurate to say that it force-feeds the player instead of making him or her "earn" it.

Of course, one could argue that many (if not most) objects in IF games have a limited set of verbs that can act on them, so by hiding the list of those possible options you're not really hindering creative thought in any tangible way. In fact, you may be more likely to elicit player frustration with a straight text parser because you're giving the illusion that you can do almost anything to any object -- when in fact, you most certainly cannot. As one person put it: "I think a big part of the reason that most people don't like text IF is the stark contrast between the promise of 'boundlessness' and the reality of 'I don't know that word.'" It's certainly an argument with some merit, but I guess I still prefer the false sense of boundlessness provided by a non-suggestive parser/command creator. That, and I still think the pure text parser provides a subtle sense of UI transparency that you lose when you provide a more visual method of input.

@Paul: I'm not sure; there aren't many examples of true conversation with NPCs. NPC conversation is another one of those contentious issues in the IF world, and there really aren't a lot of truly satisfying solutions for it. The main ones are ASK/TELL, TALK TO, and menu-driven systems. ASK/TELL gets the most play, and is probably the best we have right now; you basically type ASK [CHARACTER] ABOUT [TOPIC], or TELL [CHARACTER] ABOUT [TOPIC], and they give you their response to that conversation topic. TALK TO is an easier system to implement, but often not as satisfying, since all you can type is TALK TO [CHARACTER] to elicit a response on whatever topic the engine determines is appropriate for the situation. Menu-based systems are also used a lot (the system in Neverwinter Nights comes to mind), but I've never really favored those, since they seem too constraining and suggestive, and they often end up with the typical "menu maze" where players exhaustively try all options to see all responses. Plus, it's an obvious break from the standard UI, which calls attention to itself and degrades the experience.

To me, there isn't a huge advantage to text input over other forms of interaction, aside from providing a few more creative ways of interacting with the world. We'll see how it works out.
#9
03/09/2006 (9:25 pm)
The IF nature was why I hadn't mentioned it previously. The Looking Glass games pissed a lot of IF purists off but made IF more intuitive in nature.

It is a discussion of purity and danger. Does the pure method -- a deterministic view in which the player has to live inside the developer's brain and vocabulary -- compromise the player's enjoyment of the game? The purity is the developer's vocabulary and grammatical sense. The danger is the player's comparative function. Reductionist tendencies favor the developer/writer/programmer of the context, word list, and grammatical syntax rather than the player. You touch on this in your evaluation of proxemics, and IF has been grappling with it for decades now.

Regardless, I can't wait to see the next step!
#10
03/10/2006 (5:50 am)
You will all enjoy this link. It is a video demonstrating natural language parsing using Quake, done by researcher Donna Byron at OSU. She works diligently on pronoun resolution and related problems in a natural setting, where distance to an object and the like play a key role in semantic meaning.

A note on the video: the characters are simply acting out a conversation recorded from humans performing the same tasks in the real world.

slate.cse.ohio-state.edu/quakeref/quake-explain-brighter.mpg
#11
03/10/2006 (6:34 am)
Rubes, I recently implemented this resource into my 1.4; it's a great head start on visible inventory:

www.garagegames.com/index.php?sec=mg&mod=resource&page=view&qid=7514
#12
03/10/2006 (8:13 am)
@Tim: Thanks...that's interesting stuff. It would be nice to see the actual parsing being done to know how sophisticated it is.

@Dave: I agree, thanks. I'm still debating whether a visible inventory fits with the style. I'm not sure. It would be nice to be able to select an object in the inventory using the mouse, but since object selection is tied to the crosshair, that would not work (assuming the inventory GUI moves along with the crosshair).
#13
03/10/2006 (8:33 am)
Donna Byron gave a lecture here at the institute a couple of weeks ago, which is why I knew about her video. Here is the abstract of her talk:

Context Management for Embodied Conversational Agents

Dr. Donna Byron

Embodied conversational agents present a new challenge for software models of dialog processing. To participate in a dialog, an agent must manage a complex set of knowledge sources that influence the relationship between linguistic forms and their meanings, including the current state of the discourse itself, tracking which beliefs are mutually known between the agent and its partner, observing the state of the world in which the dialog takes place, and updating the state of the task underway. In contrast to traditional unimodal dialog agents, an agent that is embodied and even mobile within a conversational setting must gather some of this information through perceptual sensors, fuse that information as it arrives in independent data streams with independent timing, and reason appropriately to synchronize this information with the linguistic exchange happening in the dialog. Although this is a very complex task, we have made some progress on the relevant issues in our first year of development of the OCEANS conversational agent at OSU. The OCEANS project aims to develop an autonomous agent that can assist with search-and-rescue tasks within a simulated graphical world. This talk will describe the current status of the reasoning component and sensor fusion needed to support context-sensitive language processing in this situated dialog agent.


She has some resources posted here that might be useful:

slate.cse.ohio-state.edu/quakeref
#14
03/10/2006 (8:45 am)
Whoa. My head is spinning after that. Embodied, mobile perceptual sensors? Pretty cool stuff, if they can really get it to work. As a side note, I found it interesting that the bunny was hopping up and down on a pogo stick, yet the bunny's first-person perspective was still.
#15
03/10/2006 (8:59 am)
Yeh, I think the pogo stick was added as a humor generating eye catcher. In fact, when she first showed the video I wasn't sure what was going on. I got hooked when the bunny said she had accomplished an ancillary objective and the penguin said "Really?" and turned to look at the other object. Pretty cool language AI.
#16
03/10/2006 (1:07 pm)
Princess Vespa
#17
03/10/2006 (1:16 pm)
Got it!
#18
07/22/2007 (9:40 pm)
Rubes, got any resources for someone wanting to start writing a text parsing system for TGE? I've followed your work from the first thread you posted asking others if this was possible. Now I see you've indeed found it very possible! Any articles or books you could recommend would be very helpful :-)

- Ed Johnson
#19
07/23/2007 (8:14 am)
Ed: The quick answer to your question is no, not many that I know of. But really, I suppose your question has two parts to it: resources for writing a text parsing system in general, and resources for writing one for TGE.

The second part is easy: as far as I know, nobody has done this before, so there hasn't been a lot of guidance. Basically, all I did was tap into the chat system already built into the starter.fps mission in TGE. Using that system, I already had a text entry window and a text output window -- all I had to do was splice into the code that handled the text entered by the player. The main caveat is that the chat system was designed for a multiplayer game, whereas my game is to be single-player only, so a lot of what I did is not network-friendly. Part of that was because I had a tough time grasping the network server-client structure in TGE, and part was because I didn't care. So that's something to keep in mind if you go this way.

As for the first part, I really just studied some of the (free) text parsing systems out there for interactive fiction to understand how they did it, and then I just rolled my own using some of these techniques. It's not pretty work, though -- writing a text parser that can do what most parsers are capable of these days (which still isn't very much) can be painful and frustrating. But it's definitely possible!

As for resources for this first part, I would look at two widely used IF systems: TADS and Inform. One note about Inform: when it moved to version 7, it took on a completely new approach, and I'm not sure how applicable it would be to a project of your own. Instead, I looked at Inform version 6, which was easier for me to digest from a coder's perspective.

I have more experience with Inform than with TADS; two things that helped me understand text parsing much better were the Inform Technical Manual and the Inform Designer's Manual. The Inform Beginner's Guide was a useful resource as well.

Hope that helps...
#20
07/24/2007 (10:24 am)
Thank you immensely! I was thinking along the same lines, but wanted to double-check. The hardest part will indeed be the actual text parsing, and not so much getting it hooked into Torque. I am going to attempt a very simple text parsing system, reminiscent of Zork or Hugo's House of Horrors...two of my favorite games growing up :)

If I get anywhere with this, I'd like to get a simple text parsing tutorial+code into the resources section. Anyone who would like to collaborate with me on this, shoot me an email.