Instrumenting every function would be pretty overkill, but you can automatically install instrumentation on functions on demand, or based on a pre-determined user selection. You can even instrument on a per-line basis if your instrumentation is cheap enough. I've experimented with this in a VS extension I'm developing, and I could easily browse through a non-trivial game codebase without causing noticeable performance overhead [1]. In the demo, the instrumentation is auto-installed on all functions within the file you opened. Obviously, this is just one project I was testing on, but it shows that this type of tracing is feasible.
I sympathize a lot with this post. Even today, modern IDEs don't provide automatic ways to inspect program execution and visualize it nicely. Instead, they rely on having you place all the breakpoints, step-step-step, and restart everything again if you didn't inspect something in time. I don't want to insert print statements that end up clogging the console completely, and that workflow quickly breaks down once there's a lot of logging anyway.
I'm quite invested in developing more tooling around that, and it's really hard - developers particularly want tools that integrate completely with existing IDEs. Debugger infrastructure is usually not built to be very extensible or accessible. For my Visual Studio extension https://d-0.dev/ I had to put serious effort into making sure everything still works correctly with VS breakpoints, which is extremely difficult when the debugger expects all of the code to stay in the same place all the time.
This is an awesome overview and if you want more, most of those are documented in an approachable way on YouTube.
Just wanted to provide some perspective here on how many things those projects need to take care of in order to get some training setup going.
I'm the developer behind TMInterface [1] mentioned in this post, which is a TAS tool for the older TrackMania game (Nations Forever). For Linesight (the last project in this post), I recently ended up working with its developers to provide the APIs they need to access the game. There are a lot of things RL projects usually want to do: speed up the game (one of the most important), deterministically control the vehicle, get simulation information, navigate menus, skip cutscenes, make save states, capture screenshots, etc. Having each of those implemented natively greatly impacts the stability and performance of training/inference in an RL agent. For example, the latest version of the project uses a direct capture of the surface that's rendered to the game window, instead of an external Python library (DxCam). This is faster, doesn't require any additional setup, and allows training even when the game window is completely occluded by other windows.
There are also many other smaller annoyances: many games throttle FPS when the window is unfocused, which is the case here too, so the tool patches out this behaviour for the project - and there's a lot more like this. The newest release of Linesight V3 [2] can reliably approach world records, and it's being trained and experimented with by quite a few people. The developers made it easy to set up and documented a lot of the process [3].
I know your name from falling asleep to Wirtual videos. I think I actually found his content thanks to your collaboration on the cheating scandal. Thanks for all your hard work - it's obvious how significant and beneficial it is within the TM community.
Scientist here; I've bookmarked this article for close reading (so apologies for this question if it's discussed in the article).
I've had a few brushes with RL (with collaborators who knew more RL than I did). A key issue we encountered across different problem settings was the number of samples required to train. We created a headless version of the underlying environment but could not make it go much faster than real time. We also did some work to parallelize, but it wasn't enough (and it was expensive). Is the TM-related RL training happening in real time, or is it possible to speed it up? That seemed like the key problem to solve to make RL widely used, but I'm curious about your thoughts.
I'm not sure about your particular case, but if your environment really is headless, then it should absolutely be possible to run it a lot faster than realtime. It depends on what the environment is and if you have access to its source code (we do not have that in TrackMania so it's a lot harder). Either the environment is purposely throttling the amount of time it simulates, or it just takes so much time to simulate the environment that it's not possible to speed it up anymore.
We're lucky in the case of TrackMania, because it internally has systems to both set the relative game speed and completely disable all rendering to just run physics. Linesight achieves about a ~10x speedup, where most of the time is now spent rendering game frames and running inference on the network. They also parallelize training by running more game instances and implementing a training queue. For the "raw" speedup ratios, TM usually achieves about ~60x (one minute is simulated in one second), and I use this speedup to implement the bruteforce functionality in the tool (coupled with a custom save states implementation).
It's possible to speed it up by running the game as fast as it can go (so, not limited as it normally is for human consumption). They talk about running it at 9x speed, so months of training could be done in 80 hours.
Hello, for some time now I've been developing an extension for Visual Studio that makes it possible to see what's going on in your program live. This includes which code paths are executed and what changes to variables take place on each line of a function. Here, I decided to create a prototype that visualizes the longest-executing lines in a function, also completely live and at runtime.
The difference between this and e.g. profilers like Tracy is that there's no need to insert the instrumentation manually, and you can explore the codebase much more freely to see what's going on. This is done through dynamic function relocation and inserting an rdtsc call after each line to collect the CPU timestamps. There's some more supporting code of course, but when it comes to overhead, it's about ~15 instructions per line (more if it supported multiple threads).
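To make the idea concrete, here is a minimal sketch of that per-line scheme - all names are made up, the instrumentation is written out by hand rather than injected via relocation, and std::chrono stands in for the raw rdtsc read so the sketch stays portable:

```cpp
#include <chrono>
#include <cstdint>
#include <vector>

// Portable stand-in for reading the CPU timestamp counter (rdtsc).
static inline uint64_t timestamp() {
    return std::chrono::steady_clock::now().time_since_epoch().count();
}

struct LineSample { int line; uint64_t ticks; };
static std::vector<LineSample> g_samples;

// The injected code boils down to: remember the previous timestamp, and
// after each source line charge the elapsed delta to the line that just ran.
struct LineProfiler {
    uint64_t prev = timestamp();
    void after_line(int line) {
        uint64_t now = timestamp();
        g_samples.push_back({line, now - prev});
        prev = now;
    }
};

// A function as it might look after instrumentation (inserted manually here;
// the extension would patch this in at runtime via function relocation).
int sum_to(int n) {
    LineProfiler p;
    int sum = 0;                           p.after_line(1);
    for (int i = 1; i <= n; ++i) sum += i; p.after_line(2);
    return sum;
}
```

The real version keeps per-line accumulators instead of a growing sample vector, which is how the overhead stays at a handful of instructions per line.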
On top of all that, the extension works really hard not to disrupt any standard Visual Studio workflow like setting breakpoints or viewing the call stack, which has been a very hard thing to achieve: the VS debugger expects that all code is in the same place as it was and that it maps perfectly to the corresponding debug information (which is not at all the case when the extension is running!).
There's also a lot of apps written in Vala for elementary OS's AppCenter with screenshots & links to GitHub: https://appcenter.elementary.io/ (not all, but most). I published two apps there and using Vala has been a great experience for creating Gtk/Linux desktop apps.
I'll add to that my own Visual Studio extension that improves runtime debugging by showing you which lines are actively executing and what changes happened line-by-line: https://d-0.dev It integrates with the VS debugger completely and does not require any changes to one's codebase.
It's also available on the marketplace if you want to try it out: https://marketplace.visualstudio.com/items?itemName=donadigo...
Hello HN, for about 6 months I've been developing a Visual Studio extension that makes it easier to debug & inspect dynamic C++ code that changes state, e.g. every frame. The newest feature I added to the extension is showing real-time changes alongside each line. This works kind of like a global, immediate watch, but without any breakpoints - really useful in game development, where you'd like to track variables that change state all the time.
You use the feature by setting the cursor inside a function; the extension will then start showing you the code paths that are executed and which local variables are changing. You can also use it as a function recorder with no breakpoints - just set the cursor in the function, trigger an action that calls it, and you'll be able to trace everything the function did.
Interesting approach. Was it enough for your use case, given it seems quite cumbersome if you end up needing to save a bunch of different, more complex instances? I mean, I imagine most dynamic objects holding the real state you'd care about will probably be allocated somewhere on the heap and be referenced from lists or pointer types. I guess this would work best for POD types like program settings and the like, so I'm curious about your setup there.
Yeah it worked fairly well. IIRC there wasn't any kind of dynamically allocated data at all as is common in embedded projects. All strings were stored as fixed size, basically char[64]. Instead of storing pointers, I stored offsets within the memory file, since the base address changes from run to run.
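A minimal sketch of that kind of layout (names made up; a plain buffer stands in for the mmapped file, so the example is self-contained):

```cpp
#include <cstdint>
#include <cstring>

// All state lives inside one contiguous region. Links between objects are
// stored as offsets from the region's base, never as raw pointers, so the
// data stays valid even though the mapping address differs from run to run.
struct Record {
    char name[64];      // fixed-size string, no heap allocation
    uint32_t next_off;  // base-relative offset of the next Record, 0 = end
};

// Offsets are resolved against wherever the file is mapped *now*.
inline Record* record_at(uint8_t* base, uint32_t off) {
    return off ? reinterpret_cast<Record*>(base + off) : nullptr;
}

// Build two linked records inside the region (offsets chosen arbitrarily).
inline void link_two(uint8_t* base) {
    Record* a = reinterpret_cast<Record*>(base + 64);
    Record* b = reinterpret_cast<Record*>(base + 256);
    std::strcpy(a->name, "first");
    std::strcpy(b->name, "second");
    a->next_off = 256;  // store the offset, not the pointer
    b->next_off = 0;
}
```

Reading it back works through any base pointer, which is exactly why the base address changing between runs is harmless.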
(rant incoming)
This is actually something that bothers me with contemporary programming languages - the huge amount of implicit state due to function calls on the stack and the way the heap is organized. I'm much more in favor of a data-oriented design where you don't get to freely allocate objects, but need to strictly organize them according to a model. I imagine this would also make formal verification a lot easier, since you have a single global coherent view of the program state, kind of like being forced to store all important program state in a database (except more efficient since it's just memory).
Ironically I've become a big fan of global variables. In my experience the classic criticism of "anything could modify anything" doesn't really hold - using globals enables me to statically analyze all usages, not to mention setting memory breakpoints becomes much easier as well - I always know what I want to watch, instead of it being some pointer that will be allocated at some unknown time, 20 stack frames away from the actual usage.
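As a toy illustration of that style (all names and fields made up):

```cpp
#include <cstdint>

// All mutable program state in one globally visible struct. Every access
// site mentions g_state by name, so usages are trivially greppable and
// statically analyzable, and a hardware watchpoint can target
// &g_state.player_hp before the program even runs - the address is fixed
// at link time instead of coming from an allocation 20 frames away.
struct GameState {
    int32_t player_hp = 100;
    int32_t score     = 0;
};
GameState g_state;

inline void take_damage(int32_t dmg) { g_state.player_hp -= dmg; }
inline void add_score(int32_t s)     { g_state.score += s; }
```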
3) relational model rather than explicit memory objects
4) serialize to disk as builtin
I think
1) is a goal to aim for for the programmer, but when a stack is needed a stack is useful, so should be provided by the language
2) is a good point, encapsulation is a good support for programmers, but encapsulation in the debugger is unwanted; so hopefully the language provides easy support for this
3) Nice idea. Depends on the application domain: maybe pointers, maybe relational is more useful depending on the situation. Formal verification, err... An excuse for pointers would be: coding a solution close to the problem is best for verifying problem->solution->proof, which may mean pointers, although verifying solution->proof is better for relational. A good idea to explore...
I have successfully used the mmapped-state trick in the past, although for crash resilience (with carefully maintained invariants at instruction boundaries), not just for persistence.
It worked well with fixed-size objects, although if I did it again today I would allow for a separate pool for variable-size strings, as updating the max string size was the most common change that both breaks file compatibility and bloats the size.
The separate pool would make it more awkward to dereference a string, as you can use neither a dumb pointer nor the self-offset trick; you have to pass the pool base offset explicitly (or implicitly with a global). But I think it would be worth it.
edit: thinking about it, interleaving fixed-size string pools with the other objects might also work; in practice you'd get a persistent heap.
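A rough sketch of what such a string pool might look like (all names made up, no capacity or freeing logic) - objects hold a pool-relative offset plus a length, and dereferencing takes the pool base explicitly:

```cpp
#include <cstdint>
#include <cstring>

// A string reference stored inside persistent objects: an offset into the
// pool region, never a pointer. Strings are not NUL-terminated; the length
// travels with the reference.
struct PooledStr {
    uint32_t off;
    uint32_t len;
};

struct StringPool {
    uint8_t* base;      // start of the pool region inside the mapped file
    uint32_t used = 0;  // bump-allocation watermark

    // Copy a string into the pool and hand back its offset+length.
    PooledStr intern(const char* s) {
        uint32_t len = static_cast<uint32_t>(std::strlen(s));
        std::memcpy(base + used, s, len);
        PooledStr out{used, len};
        used += len;
        return out;
    }

    // Dereferencing needs the pool base, passed via this object.
    const char* data(PooledStr s) const {
        return reinterpret_cast<const char*>(base + s.off);
    }
};
```

The awkwardness mentioned above is visible here: every read goes through the pool object (or a global holding the base) instead of a plain pointer dereference.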
Hey, looks like we had the same idea! I've run into the same issue and wanted to stop building debug UIs that do the same thing.
In my case, I implemented this for Windows, but a bit more advanced, with inline scripting support. Basically, you insert a snippet anywhere in the code that can call various functions, one of which is `view()`, which will show all locals (or a chosen object out of scope) in a Visual Studio window (it's a VS extension [0], website [1]). Calling it again will update the view with the new state. The scripting language underneath is AngelScript, and it understands symbols inside your entire project (e.g. you could do `view(Namespace::g_variable)` or even call functions from the program, etc.).
The snippet hook is then JIT'ed into the target function by relocating the function using info from the PDB. This took me about a year to implement, and it's still a research-phase project.
Hey HN, Adam here. I've been building D0 - an extension that adds non-standard debugging features that help debug C++ applications in real time without setting any breakpoints.
D0 was previously a separate application, now it's exclusively a Visual Studio extension (with integration for other editors coming soon).
With D0 I'm focusing on features that make you less likely to place a breakpoint and more likely to observe what is going on immediately. Currently the extension has 4 main features:
1. Shows live code execution in the Visual Studio editor as it happens. This eliminates the need to set breakpoints when you just want to see whether a line is running, which happens surprisingly often. And if you do need to inspect values with the application in break mode, the line indicators give you a much better idea of e.g. when to expect a breakpoint to be hit.
2. Shows a live call stack of the current function. There's often a need to inspect the call stack of a particular function. Usually this requires placing a breakpoint and inspecting the call stack while the application is in break mode. D0 improves on this and shows all the paths through which a function was previously called, just by putting the editor cursor anywhere in that function. This is much more ergonomic when you need to iterate over all callers, as you don't have to continuously break the application and remember who called the function.
3. Allows inserting scripting snippets anywhere in the code. These snippets can print or view complex objects without recompiling the main program and can be easily toggled on/off. This enables very fast iteration on e.g. the set of logs you need at runtime, since you can just enable/disable certain snippets without any waiting.
4. View objects live - tied to the scripting snippets: you can put a view() call in your main code with a snippet, and D0 will create an object view inside Visual Studio that lets you monitor variables live and change their values. This can be helpful when you need to tweak settings/properties within the running application; it's basically a runtime version of Visual Studio's Locals window when you hit a breakpoint.
There are more features in the works, and I'm looking forward to your feedback!
[1] https://www.youtube.com/watch?v=3PnVG49SFmU