Replays

Context

These are some thoughts I had about the design of “replay”, a module for recording and replaying a talk given in Roblox. When you replay a talk, a clone of the original character appears and delivers the talk to you, moving around and writing on the board live, just as they originally did. It’s basically magic - especially when the talk is given in VR. Check it out!

Replay Design

I had a pretty good conceptual model of how the replay would work, and how I would synchronise all of the board writing with the character positions. Every element of the scene to be recorded (at this point just characters writing on a board) can be captured at many different levels. For example, to replay a character moving around, I could capture the positions of the parts every single frame, or I could capture the user inputs that caused the character to move around as it did during recording. For board writing, I could capture the entire state of the board every single time it changed, or I could capture the arguments of the client-to-server events that cause each change to the board. Going even further, I could capture the user inputs on the client (where their pen was) which caused those remote events to fire.
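
To make those capture levels concrete, here is a minimal sketch (in Luau, with hypothetical board and event names, not metaboard’s actual API) of what the two options for board writing might look like:

```lua
-- Level 1: snapshot the entire board state every time it changes (lots of data).
local stateSnapshots = {}
local function onBoardChanged(board, timestamp)
	table.insert(stateSnapshots, {
		Time = timestamp,
		Figures = board:SerialiseFigures(), -- hypothetical serialiser
	})
end

-- Level 2: record just the arguments of each client-to-server drawing event.
local eventLog = {}
local function onDrawingEvent(eventName, timestamp, ...)
	table.insert(eventLog, {
		Time = timestamp,
		Event = eventName,
		Args = table.pack(...),
	})
end
```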

Taking it to the extreme I could try to record the neurons firing in the brain of the person controlling the character, and then try to fire those same neurons to make them replay their inputs. At the other extreme of this spectrum is something like doing a plain old screen recording of the rendered scene on your computer (or even further, the neurons that fire when you watch the recording happen).

My point is that there’s a flow of information: from intent, to input, to code-execution/remote-events, to appearance/influence in the workspace, to your screen/speakers, to your eyes and ears. Recording can be viewed as picking the right moment to intercept that flow of information, and replaying as a matter of re-entering that flow.

There are pros and cons to choosing different points in the flow. For example, if I record a CFrame for all 16 parts of the character 30 times a second, that’s 480 CFrames per second, or 5,760 numbers per second (12 numbers per CFrame), or about 46KB per second (at 8 bytes per number), or 2.76MB per minute. This is much more data than would be required to store just the user inputs. A similar comparison holds for storing the whole board state every time it changes vs storing the remote event calls.
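
As a back-of-envelope check of those numbers (assuming 12 numbers per CFrame and 8 bytes per number):

```lua
local parts = 16
local framesPerSecond = 30
local numbersPerCFrame = 12 -- 3 for position + 9 for the rotation matrix
local bytesPerNumber = 8    -- double precision

local cframesPerSecond = parts * framesPerSecond                            -- 480
local bytesPerSecond = cframesPerSecond * numbersPerCFrame * bytesPerNumber -- 46,080 (~46KB)
local megabytesPerMinute = bytesPerSecond * 60 / 1e6                        -- ~2.76MB
print(cframesPerSecond, bytesPerSecond, megabytesPerMinute)
```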

In general it seems that there is more data to store later in the flow, so earlier must be better, yeah? Not exactly. There are important drawbacks to recording earlier points in the flow. The system being recorded does not exist in isolation from the workspace around it. The positions of the parts of a character are not just a function of the player inputs, since they can be influenced by things like physics simulation (you might be assaulted by the audience while lecturing). If those physical influences don’t also occur to the character while replaying, the character might end up a metre to the left of where they should be writing. Even without external influences, you are relying on the determinism of the physics engine if you are re-simulating everything live.

The same point applies to the writing on the board. If you are re-performing the individual writing events, then the actions of the users watching can destroy or obscure what you were writing. On the other hand, if someone else was present during the recording whose board interactions weren’t being recorded, you could end up with a very different outcome during the replay. For example, they could have cleared the board during the recording (perhaps they were trying to help!), but this won’t occur in the replay.

The best way forward is to attempt to strike a balance where you can record data efficiently, while being resistant to the chaos of re-simulation. A good example of this is how I ended up recording VR characters. We use the Nexus VR Character Model, which translates the user input CFrames of the controllers and headset into 3 world-relative CFrames, then passes those to all other clients. Those 3 CFrames are then used to position (with physics) all 16 parts of the character using Motors. I record these CFrames as they come into the server, and on replay I just call the same functions that perform the physics simulation to position the character parts. Subsequent physics that occur during the replay can affect the character parts, but they cannot affect the recorded CFrames, so if the character is pushed during the replay, it will be dragged back towards the target CFrames anyway. This is what I mean by making the replay resistant to the chaos of re-simulation.
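
Roughly, recording and replaying those 3 CFrames looks like this (the function names are hypothetical stand-ins, not the actual Nexus VR Character Model API):

```lua
-- Recording: capture the 3 world-relative CFrames as they arrive at the server.
local recording = {} -- list of { Time, Head, LeftHand, RightHand }

local function onVRUpdate(timestamp, headCFrame, leftHandCFrame, rightHandCFrame)
	table.insert(recording, {
		Time = timestamp,
		Head = headCFrame,
		LeftHand = leftHandCFrame,
		RightHand = rightHandCFrame,
	})
end

-- Replay: feed each recorded frame back into the same code that positions the
-- 16 character parts via Motors. Stray physics during the replay can push the
-- parts around, but it can never alter the recorded targets, so the character
-- gets dragged back to where it should be.
local function replayFrame(characterModel, frame)
	-- UpdateFromCFrames is a stand-in for the real positioning function.
	characterModel:UpdateFromCFrames(frame.Head, frame.LeftHand, frame.RightHand)
end
```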

I haven’t quite achieved the same balance for recording the board writing, and this is partly because we don’t necessarily want the person in the replay to be the only one who can interact with the board. It could be quite useful to be able to annotate and edit freely on the fly. It’s also just kinda fun (funny) to mess with the recording: erasing what it writes, jumping on its head, etc. The later in the flow of information that the recording derives from, the less novel the whole thing becomes. Seeing only the end of that flow is just like watching a screen recording. As an audience member, the more you can influence the re-simulation of things, the more involved and connected you feel to what’s happening in front of you.

However, it’s no fun to accidentally ruin the re-simulation of a replay that you’re trying to learn from, so that the speaker starts gesturing at things that aren’t there anymore (either because they’ve been moved, or because what they’ve written has been erased). I think a good compromise is to allow some degree of re-simulation, but let the audience choose to jump back to the canonical state of things, i.e. putting the character back where they should be, and making the board look as it did at that point in the recording. This happens automatically for the character, but the board might be a little trickier.

Technical Roadblocks and Debugging

I want to end with some reflections on the technical side of things. The first issue I faced was how to patch into metaboard in a natural way to record and replay the writing events. I wanted to make “replay” an independent module that interfaces with metaboard, as opposed to baking replays into the metaboard code as a special use case. This amounted to a refactor that separated the remote event handling from the board-state manipulation. In the future I can imagine having to refactor more functions to take additional arguments, or inserting intermediate functions that can be used by external modules.
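
The shape of that refactor, roughly (the remote and function names below are illustrative placeholders rather than metaboard’s real API), is to pull the state manipulation out into plain functions that the remote handlers and the replay module can both call:

```lua
local ReplicatedStorage = game:GetService("ReplicatedStorage")
local drawLineRemote = ReplicatedStorage:WaitForChild("DrawLine") -- placeholder RemoteEvent
local boards = {} -- boardId -> board state
local function getBoard(boardId) return boards[boardId] end

-- Plain state manipulation: no knowledge of remotes or players.
local BoardState = {}
function BoardState.DrawLine(board, lineInfo)
	table.insert(board.Figures, lineInfo)
end

-- The remote handler becomes a thin wrapper around it...
drawLineRemote.OnServerEvent:Connect(function(player, boardId, lineInfo)
	BoardState.DrawLine(getBoard(boardId), lineInfo)
end)

-- ...and the replay module can drive the very same function from recorded data.
local function replayBoardEvent(event)
	BoardState.DrawLine(getBoard(event.BoardId), event.LineInfo)
end
```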

It also took a while to get the character simulation working, which mostly involved learning how the Nexus VR Character Model works. I got it working with a hack that doesn’t animate as smoothly as it should, but I’ll revisit this later.

The thing that took by far the longest, and was the most soul-draining, was debugging would-be type errors. After writing all the datastore code to serialise, store, restore and deserialise a replay, the code had grown a bit and there were a lot of arguments being passed around. I spent many boring hours hitting play, putting the headset on, testing until there was an error, taking it off, then trying to locate which function had been called with the wrong argument order, or with missing arguments. This was made worse by the fact that, in some circumstances, the use of coroutines can prevent errors from surfacing in the output as they normally would. They would only appear if I ran the program in debug mode and carefully stepped through it. I don’t fully understand yet why this happens.
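
For what it’s worth, one way coroutines can hide errors is when a thread is resumed with coroutine.resume and the results are ignored: the error comes back as return values instead of reaching the output. A minimal illustration:

```lua
local function brittle(args)
	return args.Name -- errors if the caller forgot to pass args
end

local co = coroutine.create(function()
	brittle()
end)

-- The error doesn't surface in the output by itself; it is handed back to the resumer.
local ok, err = coroutine.resume(co)
print(ok, err) --> false   attempt to index nil with 'Name'

-- If the return values are ignored, the error disappears entirely.
coroutine.resume(coroutine.create(brittle))
```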

The problems with passing incorrect arguments around usually occur when I modify a function to take more or fewer arguments, and then forget to update all the places that call it. One remedy that I have begun to employ is passing a single table of named arguments to the function, which means order never matters. However, this isn’t a complete solution. When mistakes inevitably happen, the error usually occurs deep in a stack trace of function calls, and the game is to figure out which function was called with incorrect arguments, which can be much further up, and sometimes in a past thread.
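
The named-argument pattern looks roughly like this (the replay function and its fields are just an example):

```lua
-- Positional arguments: easy to silently break when the signature changes.
local function startReplayPositional(board, character, startTime, speed)
	-- ...
end

-- Named arguments: one table, so order never matters and missing fields
-- can be checked explicitly at the boundary.
local function startReplay(args)
	assert(args.Board, "startReplay: missing Board")
	assert(args.Character, "startReplay: missing Character")
	local startTime = args.StartTime or 0
	local speed = args.Speed or 1
	-- ...
end

startReplay({
	Board = board,         -- defined elsewhere
	Character = character, -- defined elsewhere
	Speed = 1,
})
```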

These problems don’t occur in a strictly-typed programming language, and this is certainly the reason why many Roblox programmers have been attracted to writing their games/libraries in TypeScript. You write the code (mistakes and all), and the type checker catches all the silly type errors for you. And it catches them where they first happen, not just where they cause a problem later down the road, and not just the ones that your runtime scenario happens to run into.

At this time I have no desire to invest any time programming in TypeScript for Roblox, but it’s starting to feel a bit amateur-hour to be debugging type errors the way I am. I’m considering trying out “t”, which is a runtime type checker:

https://devforum.roblox.com/t/t-a-runtime-type-checker-for-roblox/139769

It only works at runtime, but the difference it would make is that there is no longer a game of finding where in the stack of function calls an argument was passed incorrectly.
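
Going by its documentation, using t looks something like this: you describe the expected arguments once and check them at the top of the function, so a bad call errors immediately rather than deep in the stack later (treat the details here as a sketch, and the startReplay signature as my own example):

```lua
local t = require(game:GetService("ReplicatedStorage").t) -- wherever the module lives

-- Describe the expected arguments once...
local checkStartReplay = t.strict(t.interface({
	Board = t.Instance,
	Character = t.instanceOf("Model"),
	StartTime = t.optional(t.number),
	Speed = t.optional(t.number),
}))

local function startReplay(args)
	checkStartReplay(args) -- errors right here if the arguments are wrong
	-- ...
end
```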

I’ve tried using the Luau type checker, but I’ve never actually seen it alert me to a type error, so I doubt I was using it properly. Also, I don’t think it works if a function is imported from somewhere in the DataModel that only exists at runtime.
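
For reference, opting into it means a --!strict annotation at the top of the script plus type annotations, something along these lines (in principle the mistake in the last line should be flagged without running the game; the ReplayFrame type is just an example):

```lua
--!strict

type ReplayFrame = {
	Time: number,
	Head: CFrame,
	LeftHand: CFrame,
	RightHand: CFrame,
}

local function frameGap(a: ReplayFrame, b: ReplayFrame): number
	return b.Time - a.Time
end

-- The type checker should flag this call before it ever runs:
-- frameGap(workspace, 5)
```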

-Billy.