TVM Debugger

krigga

This post explains the inner workings of TVM Debugger and lists features that could be implemented in the future. If you want to try TVM Debugger, please see this repo.

Debugger components

Source Maps

TON contracts compile down to TVM bytecode, which, like everything in TON, is just a Cell. At run time, we have no information about which line of source code is currently executing; we only know the current code Cell and the offset inside it. So we need some way to map that position to file names and line numbers. There are multiple approaches to this problem, but I chose the one that I consider the simplest to implement while still covering all potential problems. I modified the FunC compiler to insert a no-op opcode before the opcodes it outputs for each line of code. This no-op opcode carries an index into a source map array, which stores the file name, line number, variable names, and other information. For example, this contract:

int main(int a, int b) {
    int r = a + b;
    return r;
}

compiles down into this Fift assembly:

PROGRAM{
DECLPROC main
main PROC:<{
  "DI0" DEBUGSTR
  ADD
  "DI1" DEBUGSTR
  "DI2" DEBUGSTR
}>
}END>c

with this source map:

[
  {"file":"test.fc","line":2,"pos":4,"vars":["a","b"],"func":"main","first_stmt":true},
  {"file":"test.fc","line":3,"pos":4,"vars":["r"],"func":"main"},
  {"file":"test.fc","line":3,"pos":12,"func":"main","ret":true}
]

As you can see, debug info opcodes are DEBUGSTR opcodes whose strings consist of DI followed by an index into the source map array. Each source map entry carries the file name, line number, position, variable names, and function name, and may also contain flags indicating whether this is the first statement of the function or an opcode placed just before a return.

At run time, we will be able to inspect the next opcode to be executed, determine if it's a "DIx" DEBUGSTR opcode, and see which source map entry it is.
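The lookup described above can be sketched in a few lines of TypeScript. The entry shape is taken from the example source map; the helper names `parseDebugIndex` and `lookupEntry` are hypothetical and only illustrate the idea, they are not the debugger's actual API.

```typescript
// Shape of one source map entry, taken from the example source map above.
interface SourceMapEntry {
  file: string;
  line: number;
  pos: number;
  func: string;
  vars?: string[];
  first_stmt?: boolean;
  ret?: boolean;
}

// Hypothetical helper: extracts x from a "DIx" debug string, or returns
// null when the string is not a debug info marker.
function parseDebugIndex(debugStr: string): number | null {
  const match = /^DI(\d+)$/.exec(debugStr);
  return match ? Number(match[1]) : null;
}

// Hypothetical helper: resolves a debug string to its source map entry,
// or null when the string is not a marker or the index is out of range.
function lookupEntry(
  sourceMap: SourceMapEntry[],
  debugStr: string,
): SourceMapEntry | null {
  const index = parseDebugIndex(debugStr);
  return index === null ? null : sourceMap[index] ?? null;
}

const sourceMap: SourceMapEntry[] = [
  { file: "test.fc", line: 2, pos: 4, vars: ["a", "b"], func: "main", first_stmt: true },
  { file: "test.fc", line: 3, pos: 4, vars: ["r"], func: "main" },
  { file: "test.fc", line: 3, pos: 12, func: "main", ret: true },
];

const entry = lookupEntry(sourceMap, "DI1");
if (entry !== null) {
  console.log(`${entry.file}:${entry.line}`); // prints "test.fc:3"
}
```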

The modified FunC compiler can be found here.

Step-by-step TVM execution

The pre-existing TVM implementation can only run a contract to completion, without stopping. To debug, we need to execute an opcode, inspect the TVM state, execute another opcode, and so on; in other words, we need to execute contracts step by step. To make that possible, I modified the TVM implementation and some surrounding code to expose methods that execute a single opcode and inspect the TVM state while it is "paused". Specifically, we need to be able to query the code position (current code cell hash and offset) and the stack values. The debug info produced at compile time comes in very handy here: with it, we can map the code position to an actual source file and line number, and map stack values to their variable names.
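To make the idea concrete, here is a minimal sketch of what a steppable VM could look like from the debugger's side. Everything in it is hypothetical: the `SteppableVm` interface, the toy VM, and the assumption that the names in a source map entry describe the topmost stack values (last name on top) are illustrations, not the modified emulator's real API.

```typescript
// Hypothetical position type; the text only says "code cell hash and offset".
interface CodePosition {
  cellHash: string;
  offset: number;
}

// Hypothetical interface for a steppable VM; the real emulator API may differ.
interface SteppableVm {
  step(): boolean; // executes one opcode; returns false once nothing is left
  codePosition(): CodePosition;
  stack(): string[]; // stack values rendered as strings, bottom first
}

// Assumption: `names` describes the topmost stack values, last name on top.
function nameStackValues(stack: string[], names: string[]): Record<string, string> {
  const named: Record<string, string> = {};
  const base = stack.length - names.length;
  names.forEach((name, i) => {
    const value = stack[base + i];
    if (value !== undefined) named[name] = value;
  });
  return named;
}

// Toy stand-in for the VM that only understands ADD and treats every
// other opcode (such as a "DIx" marker) as a no-op.
function makeToyVm(ops: string[], initialStack: string[]): SteppableVm {
  const values = [...initialStack];
  let offset = 0;
  return {
    step(): boolean {
      const op = ops[offset];
      if (op === undefined) return false;
      if (op === "ADD") {
        const b = Number(values.pop());
        const a = Number(values.pop());
        values.push(String(a + b));
      }
      offset += 1;
      return offset < ops.length;
    },
    codePosition: () => ({ cellHash: "toy", offset }),
    stack: () => [...values],
  };
}

const vm = makeToyVm(["DI0", "ADD", "DI1", "DI2"], ["2", "3"]);
console.log(nameStackValues(vm.stack(), ["a", "b"])); // a is "2", b is "3"
vm.step(); // executes the "DI0" no-op
vm.step(); // executes ADD
console.log(nameStackValues(vm.stack(), ["r"])); // r is "5"
```

The point of the sketch is the shape of the loop: execute one opcode, read the position and stack, and let the source map give those raw values names.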

The modified emulator can be found here.

Debug Adapter Protocol

To make it possible to debug code written in any language in VS Code, Microsoft created the Debug Adapter Protocol (DAP), which generalizes the communication between a code editor and a debugger. Now that we have source maps and the ability to execute contracts step by step, we need to make our debugger talk to VS Code and other editors that use DAP. This is actually pretty easy to do thanks to the mock debug extension and the libraries provided by Microsoft.
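For a feel of what travels over the wire, DAP messages are JSON objects preceded by a Content-Length header. The sketch below encodes and decodes one such frame by hand. The function names are hypothetical, and in practice the Microsoft libraries mentioned above handle this framing, so this is only an illustration of the format.

```typescript
// Minimal shape of a DAP request (seq, type, command, optional arguments).
interface DapRequest {
  seq: number;
  type: "request";
  command: string;
  arguments?: Record<string, unknown>;
}

// Hypothetical helper: frames a message as "Content-Length: N\r\n\r\n<json>",
// where N is the length of the JSON body in UTF-8 bytes.
function encodeDapMessage(message: object): string {
  const body = JSON.stringify(message);
  const byteLength = new TextEncoder().encode(body).length;
  return `Content-Length: ${byteLength}\r\n\r\n${body}`;
}

// Hypothetical helper: parses a frame that holds exactly one message.
// A real adapter reads Content-Length bytes from a stream instead.
function decodeDapMessage(frame: string): unknown {
  const separator = frame.indexOf("\r\n\r\n");
  if (separator === -1) throw new Error("missing header terminator");
  const header = /Content-Length: (\d+)/.exec(frame.slice(0, separator));
  if (header === null) throw new Error("missing Content-Length header");
  return JSON.parse(frame.slice(separator + 4));
}

// "stepIn" is one of the standard DAP requests; threadId 1 is an arbitrary value.
const request: DapRequest = {
  seq: 1,
  type: "request",
  command: "stepIn",
  arguments: { threadId: 1 },
};
const frame = encodeDapMessage(request);
console.log(JSON.stringify(frame));
console.log(decodeDapMessage(frame));
```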

Of course, the actual debugger would also have to do the source map matching, the TVM stepping, and generally just control the whole process, but that's the easy part once you have the source maps and the step-by-step TVM.
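As a rough illustration of that control loop, the sketch below implements "continue" on top of single-stepping: it keeps stepping until the next opcode is a debug marker whose source map entry matches a breakpoint. The `DebuggableVm` interface and the toy VM are hypothetical stand-ins, not the actual Sandbox code.

```typescript
interface SourceLocation {
  file: string;
  line: number;
}

// Hypothetical view of the VM as seen by the debugger.
interface DebuggableVm {
  // Source map index if the next opcode is a "DIx" DEBUGSTR, else null.
  nextDebugIndex(): number | null;
  // Executes one opcode; returns false once nothing is left to execute.
  step(): boolean;
}

// Steps until the next opcode is a marker on a line with a breakpoint.
// Returns the location that was hit, or null if execution finished first.
function continueToBreakpoint(
  vm: DebuggableVm,
  sourceMap: SourceLocation[],
  breakpoints: SourceLocation[],
): SourceLocation | null {
  // Step before checking, so resuming from a breakpoint makes progress.
  while (vm.step()) {
    const index = vm.nextDebugIndex();
    if (index === null) continue;
    const location = sourceMap[index];
    if (location === undefined) continue;
    const hit = breakpoints.some(
      (bp) => bp.file === location.file && bp.line === location.line,
    );
    if (hit) return location;
  }
  return null;
}

// Toy VM: each slot is either a marker index or null for a regular opcode.
function makeToyVm(markers: (number | null)[]): DebuggableVm {
  let offset = 0;
  return {
    nextDebugIndex: () => markers[offset] ?? null,
    step(): boolean {
      offset += 1;
      return offset < markers.length;
    },
  };
}

const sourceMap: SourceLocation[] = [
  { file: "test.fc", line: 2 },
  { file: "test.fc", line: 3 },
  { file: "test.fc", line: 3 },
];

// Mirrors the example above: "DI0", ADD, "DI1", "DI2".
const vm = makeToyVm([0, null, 1, 2]);
console.log(continueToBreakpoint(vm, sourceMap, [{ file: "test.fc", line: 3 }]));
```

Note that in this sketch a line with several markers (such as line 3 above, which also carries the marker before the return) would stop more than once; a real debugger would probably want to deduplicate those stops.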

The debugger is completely contained in Sandbox and can be found here.

Sandbox/Blueprint integration

Now the debugger has to be integrated with the tooling already used by most developers on TON. In short, the following was done:

1) func-js's API was updated to support source maps (debug info) generation

2) Blueprint's API was updated to use the new func-js API

3) The debugger was completely implemented in Sandbox

4) Sandbox's API was updated to support debugging; Sandbox calls the debugger when needed

When the debugger is called, it waits for a debug client (such as the VS Code extension) to connect to it using the Debug Adapter Protocol. Once the contract has been completely executed, the debugger returns a result to Sandbox in the same form that Sandbox would receive from a regular smart contract executor.

VS Code extension

This is the simplest component, because it merely connects to the debug session provided by Sandbox - all of the actual debugging is handled by VS Code itself.

The extension can be found here.

Future features

At the moment, this is mostly a proof-of-concept implementation: it is dirty in some places, the debugger is incomplete (there is only step-in and continue, with no step-out or step-over), the extension is very simple, and so on. A lot of features were left for the future; here are some of them:

Global variable inspection
Fift assembly and Tact debugging
Smaller, more efficient source maps (not 1 opcode per line as it is now)
Step-over and step-out
Call stack
Improved extension (perhaps a convenient button to connect and/or autoconnect?)
Variable modification
Expression execution