Vespers3D: Adventures in Cinematics, Part I
by Rubes · 04/26/2007 (12:27 pm) · 8 comments

Recap:
Vespers3D is our attempt to bring old-school text-based adventure games (interactive fiction) into the world of real-time first-person 3D - a new genre we are calling 3D/if (3D interactive fiction). It is based on Vespers, Jason Devlin's fantastic text IF game that won numerous awards from the IF community, including Best Game at the IFComp'05 and the 2006 XYZZY awards. Vespers provides a compelling setting and a powerful storyline for a game that will, in the end, be something akin to Myst but with a fully interactive 3D environment and good old-fashioned text command input and output.
Well, another three months have passed since my last .plan. It has been a slow three months, but we continue to make gradual progress as we ease into the complicated world of animation and sound. This is a pretty long .plan, so bear with me.
Vespers3D and Cinematics
Generally speaking, interactive fiction games place primary emphasis on one of two things: story or puzzles. It's rare for an IF game to do both really well. Games that focus on puzzles seem to have a harder time putting together a flowing narrative; in contrast, those that emphasize story become quite linear to avoid having puzzles interrupt the flow too much. The text version of Vespers is one of the latter -- there are many puzzles, but they are essentially secondary to plot and character development, making the game a very linear experience. But in the 3D world, "linear" means a lot of action that is scripted, whether that means short animated sequences or actual cutscenes. Vespers3D will thus be a very cinematic game, and implementing that is a big challenge.
Sequences and State Machines
Although it's probably more theoretical and maybe not entirely necessary, I decided to approach the handling of non-player objects (typically NPCs, but potentially other objects as well) in terms of state machines. Torque already does this with mounted images like weapons, but I need something a little different. In our project, most objects will have the following basic state chart to begin with:
Figure 1. Basic state chart.
Following object initialization, the object will enter an "idle" state, where it basically sits happily doing nothing. Once in a while, perhaps at random, it can be told to play some basic "idle" animation or sound. An example from our game is one of the windows in the refectory (dining hall); it sits there until, at some random interval, it plays an animation and sound to make it appear to blow in the wind, slapping shut and opening again. It then returns to the idle state, waiting for the next random animation call.
More complex objects, like doors or NPCs, require an additional, "non-idle" state, as such:
Figure 2. Expanded state chart.
Here, an object receives some sort of trigger telling it to transition to the "active" state to perform some function (e.g., animation or sound) -- a door will play its "open" animation/sound, or an NPC will respond to a question with an animation/sound clip. Afterward, it transitions back to its idle state.
What happens during the "active" state is really more complicated than that, though, as shown in the next figure:
Figure 3. The problems with performing sequences.
When an object is told to become "active", it needs to know what sequence to play: that can consist of an animation sequence alone, an audio sequence alone, or both together. Once that happens, the game needs to know when that sequence has finished playing, in order to proceed to the next state (represented by the lower ?). Once it figures that out, then it needs to know what to do next -- if it should return to the idle state, or if it should stay "active" to play another sequence right away (represented by the upper ?).
What this means, then, is that for every "active" sequence an object plays, three key pieces of information are needed, as shown below:
Figure 4. The final state chart.
The three components are: (1) which sequence to play (animation and/or audio); (2) what trigger should be used to signal the end of the sequence (animation or audio); and (3) what to do when the sequence is done. Also note that step (3) might additionally include a trigger to start a sequence for some other object as well.
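To make those three pieces of information concrete, here is a minimal sketch of the idle/active state chart. It is written in Python rather than TorqueScript, and every name in it (SequenceStep, SceneObject, EndTrigger, the clip names) is made up for illustration; it is not the actual Vespers3D code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class State(Enum):
    IDLE = "idle"
    ACTIVE = "active"


class EndTrigger(Enum):
    ANIMATION = "animation"
    AUDIO = "audio"


@dataclass
class SequenceStep:
    animation: Optional[str]          # (1) which sequence to play...
    audio: Optional[str]              #     ...animation and/or audio
    end_trigger: EndTrigger           # (2) which event signals the end
    next_step: Optional[str] = None   # (3) what follows; None = back to idle


class SceneObject:
    """Hypothetical non-player object driven by the state chart."""

    def __init__(self, name, steps):
        self.name = name
        self.steps = steps            # step name -> SequenceStep
        self.state = State.IDLE
        self.current = None

    def trigger(self, step_name):
        # Idle -> active: remember which step is now playing.
        self.current = self.steps[step_name]
        self.state = State.ACTIVE

    def on_end(self, trigger):
        # Ignore end events that are not the one this step is waiting for.
        if self.state is not State.ACTIVE or trigger is not self.current.end_trigger:
            return
        if self.current.next_step is None:
            self.current = None
            self.state = State.IDLE
        else:
            self.trigger(self.current.next_step)
```

A door, for example, would hold a single "open" step that waits on its animation, so an audio-finished event arriving first leaves it in the active state until the animation ends.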
So this brings me to the two important features of the state machine: end triggers and script sequences.
End Triggers
Cinematic sequences can be tough because they require fairly precise timing for each component in order to convincingly pull it off. That means being able to direct particular animations and/or sounds to start at specific times. The problem, though, is that some sequences need to start right when another sequence ends. Take this simple sequence from the text version of Vespers:
Brother Matteo leans against the rails, staring into the wind.
[b]>TALK TO MATTEO[/b]
"How are you Matteo?" you ask, leaning up on the rails beside him.
He turns, a friendly smile on his face. "I don't even know anymore." He gives an exaggerated shrug.
Pretty straightforward exchange, at least from a text standpoint. But in 3D, the following actions all need to take place:
1. Play the audio recording of the player/abbot asking the question
2. When that is finished, transition Matteo to the "active" state
3. Play Matteo's animation sequence (turning, smiling, shrugging)
4. Also play Matteo's audio response simultaneously
5. When finished, return Matteo to the "idle" state
But how do you know when the abbot's audio recording is finished and to proceed to step 2? How do you know when Matteo's animation sequence is finished and to proceed to step 5?
It turns out animation sequences are pretty easy; when DSQs are played, there is already a callback in the engine for ShapeBase objects called "onEndSequence". That makes it simple, at least for animations. Sounds are a little more difficult. I've looked for similar code that would let the engine know when a particular audio clip has finished playing, but there doesn't appear to be an easy solution. For now, I've used a simple (but laborious) workaround: know the duration of each audio clip and schedule a custom "onAudioDone" call to occur in that many milliseconds. It's not perfect, given that game time is a bit slower than real time, but for anything except the longest audio clips it works well enough. It's also important to note that it is relatively uncommon to have an audio clip playing by itself; typically, it will be a combination of animation and audio, and for that we can just stick with the more accurate animation "onEndSequence".
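The audio workaround boils down to a lookup table of clip lengths plus a timer. Below is a rough Python sketch of that idea, not engine code: the Scheduler class is a stand-in for whatever scheduling call the engine provides, driven by a simulated clock, and the clip name and its 1800 ms duration are invented for the example.

```python
import heapq
import itertools


class Scheduler:
    """Tiny stand-in for an engine scheduling call, using a simulated clock."""

    def __init__(self):
        self.now_ms = 0
        self._queue = []
        self._ids = itertools.count()   # tie-breaker so callbacks are never compared

    def schedule(self, delay_ms, callback, *args):
        heapq.heappush(self._queue, (self.now_ms + delay_ms, next(self._ids), callback, args))

    def advance(self, ms):
        # Move the clock forward, firing every callback that comes due.
        target = self.now_ms + ms
        while self._queue and self._queue[0][0] <= target:
            due, _, callback, args = heapq.heappop(self._queue)
            self.now_ms = due
            callback(*args)
        self.now_ms = target


# Hypothetical clip name and length; each real clip would need its measured duration.
AUDIO_DURATIONS_MS = {"abbot_how_are_you": 1800}


def play_audio(scheduler, clip, on_audio_done):
    # A real engine would start playback here; this sketch only arms the timer
    # that stands in for an "audio finished" callback.
    scheduler.schedule(AUDIO_DURATIONS_MS[clip], on_audio_done, clip)
```

The weak point is the same one noted above: the timer fires after a fixed number of milliseconds whether or not playback really ended at that moment, so any drift between game time and real time shows up as a gap or an overlap.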
Script Sequences
Highly scripted scenes require us to know what sequence to play now, and what sequence(s) to play when finished. That's basically what script sequences are for. Without going into too much detail, each object (typically NPCs) keeps track of what state it is in, what sequence should be played when a particular trigger is received, and what state and sequence (either idle or active) should follow it.
To handle most of this, I've created a set of script functions I call the SequenceManager. While each object keeps its own script sequence, the SequenceManager is what handles all of the dirty business: initiating an object's sequence, handling actions when an object's sequence is finished, and doing all other necessary housekeeping. It makes for a relatively straightforward implementation, although it's still in the developmental stage and there will undoubtedly be kinks that need to be worked out.
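As an illustration only, here is roughly how such a manager could chain the Matteo exchange from earlier. This is a Python sketch under my own assumptions, not the real SequenceManager: the dictionary layout, the "then" key, and all object and clip names are hypothetical.

```python
class SequenceManager:
    """Hypothetical manager: starts sequences and decides what follows them."""

    def __init__(self):
        self.scripts = {}   # object name -> {sequence name -> entry}
        self.active = {}    # object name -> name of the sequence now playing
        self.log = []       # (object, animation, audio) in the order started

    def register(self, obj, script):
        self.scripts[obj] = script

    def start(self, obj, sequence):
        entry = self.scripts[obj][sequence]
        self.active[obj] = sequence
        # A real implementation would start the animation/audio here.
        self.log.append((obj, entry.get("animation"), entry.get("audio")))

    def on_end(self, obj, trigger):
        sequence = self.active.get(obj)
        if sequence is None:
            return
        entry = self.scripts[obj][sequence]
        if entry["end"] != trigger:
            return                    # not the end trigger this sequence waits on
        del self.active[obj]          # back to idle unless restarted below
        follow = entry.get("then")    # optional (object, sequence) to start next
        if follow is not None:
            self.start(*follow)

    def state(self, obj):
        return "active" if obj in self.active else "idle"


manager = SequenceManager()
manager.register("abbot", {
    "ask": {"audio": "abbot_how_are_you", "end": "audio", "then": ("matteo", "reply")},
})
manager.register("matteo", {
    "reply": {"animation": "turn_smile_shrug", "audio": "matteo_reply", "end": "animation"},
})

manager.start("abbot", "ask")            # step 1
manager.on_end("abbot", "audio")         # steps 2-4: Matteo becomes active
manager.on_end("matteo", "animation")    # step 5: Matteo returns to idle
```

Keeping the "what comes next" data in each object's script, and the bookkeeping in one manager, means a new scene is mostly a matter of writing new script entries rather than new logic.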
Problem Areas
A few issues have come up as I try to implement this system, most of which are related to the transition from a turn-based text game to a real-time 3D game. The turn-based nature of text IF makes it relatively easy to control what happens as a result of the player's actions and what actions can be performed next -- or, more specifically, when the player can act next. In real time, however, there are some potentially complicated situations.
Going back to the simple interaction between the player and Matteo above, the player asks a question and Matteo responds with audio and animation. In the text game, the player sees the full interaction before he can then enter a new command. But in 3D, as soon as the audio and animation start playing, nothing is limiting the player from entering a new command -- asking another question right away, talking to someone else, or even just walking away. This would be bad.
In response, I've implemented a sort of "pause" feature that prevents most player input while certain sequences are being played. So, when the player asks Matteo a question, he is prevented from typing or entering any commands until the response sequence is finished. That means no typed commands, no mouse commands, and even no moving around -- all you're allowed to do is pan the camera and click on objects. I chose to indicate this "pause" state with an hourglass image in the bottom corner of the screen, as such:
Figure 5. The hourglass in the corner indicating no player input.
It may not be the most elegant solution right now, but it allows a certain level of control over things to keep the action moving forward as it should.
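The "pause" behaviour can be pictured as a simple gate in front of the input handlers. The Python sketch below is only an illustration of that idea; the InputGate class and the input kind names are assumptions, not the game's actual code.

```python
# Input kinds that are refused while a sequence plays (names are hypothetical).
BLOCKED_WHILE_PAUSED = {"typed_command", "mouse_command", "move"}


class InputGate:
    """Counts running sequences and refuses most input while any is active."""

    def __init__(self):
        self._locks = 0

    def begin_sequence(self):
        self._locks += 1

    def end_sequence(self):
        self._locks = max(0, self._locks - 1)

    @property
    def paused(self):
        # The same flag could decide whether the hourglass is drawn.
        return self._locks > 0

    def allows(self, kind):
        # Camera panning and clicking on objects are not in the blocked set,
        # so they stay available during the pause.
        return not (self.paused and kind in BLOCKED_WHILE_PAUSED)
```

Using a counter rather than a single flag is a design guess on my part: it keeps input locked until the last of several overlapping sequences has finished.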
It becomes even more complicated, however, when you start thinking about the 3D nature of some really complex action sequences. Take, for instance, one of the main dramatic scenes in the text game. Below is the first part of it, which spans several player turns (in bold):
[i]Base of the Bell Tower[/i]
The smooth patch of snow at the base of the tower, formerly cut only by a single line of footsteps, is now a mash of dirt and blood. Steps lead northeast and up to the tower, and to the southwest is the cloister.
Brother Matteo lies upon the ground, blood trickling from the side of his head. Lucca kneels by his side. The remaining brothers are gathered around the body, shouting.
Drogo jumps up and down, rubbing his hands with glee. "It begins," he squeals to you through a wide grin.
[b]>EXAMINE MATTEO[/b]
He looks so calm. Tiny snowflakes trace the creases in his face, lingering longer and longer as time passes. His mouth fills with snow.
Constantin strides forward and smacks Lucca across the face, sending him sprawling into the snow. "Get away from him, murderer!" He shouts, eyes burning.
"Constantin!" Ignatius shouts defiantly, stepping between the two. Although Constantin towers above him, Ignatius looks taller than you have ever seen him. "Leave the boy alone."
Drogo giggles.
There is so much to consider here that it can be pretty intimidating. That's a lot of highly coordinated animation and sound taking place, and I have to consider things like: What happens if the player is standing in the way of the scripted action? How should player input be handled during this extended sequence? It's allowed in the text game, although it's difficult to actually alter the action sequence. But do players really want to be silent "observers" to all of this action, especially when you consider they are playing one of the key characters at the monastery? Should this just be done as one long cutscene?
Some tough questions yet to be answered. Until next time, I'll leave you with one last screenie for sticking with this .plan to the end.
Figure 6. The Entrance Hall, doors open.
#2
04/26/2007 (1:33 pm)
Awesome Read! I've been watching this for months, have so many questions, can't wait to see a demo! Great work!
#3
04/26/2007 (9:13 pm)
Fantastic plan Mike! Glad to see things progressing well.
#4
04/27/2007 (6:55 pm)
Thanks everyone, much appreciated. Ed, feel free to ask away, this is the place to do it!
#5
04/28/2007 (2:35 am)
In terms of what happens if the player blocks an NPC path during a sequence: given you're actually in first person, what I'd suggest is that you simply place the player into a "safe" location during the action, put the player in control of a camera instead, and then simply swap the player back to a player after the action has ended. So basically dynamically swapping the player to a camera and setting the transform of the player itself to a "safe" place.
#6
04/28/2007 (2:16 pm)
@Phil: I think that's one good way to approach it, I'll look into that. I can also set up the scene so that the player triggers the action from a safe place. There are a few options, but I also want to make sure it's simple, easy, and straightforward to players who don't have a lot of FPS-style game experience.
#7
04/30/2007 (4:47 pm)
Love the write-up! Looks like it's progressing along nicely. Can't wait to see it with these additions. Good Luck to you! 
Torque Owner Surge
MDNAMEDIA