Implementing certain things in PyTorch can be fun. I've recently started to make an economic game simulator (where you have players that can buy and sell items, do fighting etc.) and began to outline the design in the usual C#, but then, suddenly, thought - why not make the whole thing run on GPU and support millions of players with tens of millions sim steps per sec? This should make experiment cycles really fast - and that's what simulators are for!
If you don't want to program CUDA kernels yourself, what's the easiest tool for the job? PyTorch, of course. Mutable by default tensors, lots of library operations and generally a pleasure to use.
The programming is really interesting - you start to think in batches - every operation is a parallel operation, it's not 1 player/item doing something, it's a whole lot of them at once - and that's really fast.
I eventually did run into some problems - for example, ragged/sparse tensor support is not designed for this at all, and since I have a variable number of items per player, this is a problem. Also, producing something like logs inside tensors is challenging - reallocation is very expensive and must be done in batches too. There are some other interesting things like launching several operations / CUDA kernels at once.
This is incredibly clever. In case anyone else is confused by the order: the ASCII codes for L, S, R are 76, 83, 82 respectively, so L encodes i and R encodes -i as you’d expect.
A complex number represents a 2D point in the complex plane, but multiplication by a complex number on the unit circle also happens to correspond to a rotation in the complex plane, e.g. 3 × i = 3i, which is a rotation of 90 degrees anticlockwise, or -5i × -i = -5, which is a rotation of 90 degrees clockwise.
So if you use the complex numbers 1, i and -i to represent the actions “go straight”, “turn left” and “turn right” respectively, you can update a complex number representing the current heading of the snake (1 = east, i = north, -1 = west, -i = south) by multiplying it by the action. To get the snake’s current position, add up all the “current heading” numbers you’ve seen so far.
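To make the heading trick concrete, here is a minimal sketch in plain Python using the built-in complex type. The letter-to-number mapping follows the description above; the function name `trace` and the starting values (heading east, position at the origin) are my own assumptions for illustration, not taken from the article's code.

```python
# Hypothetical action table: straight = 1, left = i, right = -i.
ACTIONS = {"S": 1, "L": 1j, "R": -1j}

def trace(moves, heading=1 + 0j, position=0j):
    """Return every position visited while following `moves`.

    `heading` is a unit complex number (1 = east, 1j = north,
    -1 = west, -1j = south); `position` is a point in the plane.
    """
    path = []
    for move in moves:
        heading *= ACTIONS[move]   # rotate the heading by the action
        position += heading        # step one cell along the new heading
        path.append(position)
    return path

# Straight, left, left, right: east, then north, then west, then north.
print(trace("SLLR"))
```

The real part of each entry is the x coordinate and the imaginary part is the y coordinate, so the sum of headings is exactly the running position.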
Does writing X in Y lines of code have any real meaning? I'm not trying to be negative, just genuinely asking.
For example, in 1-line-of-code I can write "hello world". This hello world is physically lighting up individual pixels on my display. It's rendering the letters to be the shape "hello world" should be. It has antialiasing applied. All in 1 line of code!!!
It has some merit when you are comparing different solutions. For example (and I know people loathe python on here, I'm just using it for the sake of example, please bear with me), writing a simple web service in python takes fewer lines of code than doing the same in say, Rust.
This doesn't imply that a language is better than another, it's just an objective fact one might find useful to keep in mind.
The article's usage is admittedly clickbait. Another recent article that popped up on the home page was titled something to the effect of "Speech Recognition in 100 lines of C++". Predictably, it turned out that they depend on a multi-million-line library.
But if you're importing a library that might have 100,000 LoC just so that you can say you "wrote X in 12 LoC" ... is that really a fair representation?
I think that if someone can condense an algorithm using just a bunch of lines of code, that reflects a deep understanding of the problem domain. A lot of people can code an algorithm using a lot of lines of code. But it requires practice and experience to write short and expressive code. For example, Karpathy is famous for this, you can check his implementation of evolution strategies [1].
I've been on r/programming since 2006 or so, and the nature has changed drastically in the past maybe 3-4 years. I think there are a lot of wannabes who hang out there now, as opposed to dyed-in-the-wool nerds. I recently had to go out of my way to tell someone that no, you aren't wrong for saying you generally shouldn't write commentless code, because like 50 people were being nasty to them by unironically saying "Code should be self documenting", as if they read that in a blog post somewhere but have never written a line of code in their life.
It was an enjoyable and enlightening read. Now I’m thinking how I can rewrite my tile-based path-finding RL project to utilize more Tensors and less for loops.
Well done! I’m looking forward to reading how you concurrently ran 100 million mini snake games.
Really cool. That's how this project got started. I was doing the exact thing you were doing (tile-based RL path finding). But it was soooo slow, so I started implementing it in tensors, and that's when I realized I could do a whole game of Snake.
r/programming isn't exactly a welcoming community. Hackernews is far more professional in my experience, especially with side projects. Puns here are kept to a minimum, and people are generally supportive and constructive.
I used to be jealous of people that write big impressive things in tiny amounts of lines of code. But honestly, not anymore. I can't make heads or tails of it and I wouldn't want to try.
If I wrote my own Snake, it would probably be like 500 lines and that's OK.
I don’t think these sorts of code golfs are meant to be impressive. They are fun because they make you explore the quirks of your language/framework to try to optimize the constraint.
I find them to be pretty useful when you are learning a new language. That said, the 500 line snake is probably more fun to play than the codegolfed snake.
Good stuff. It would've been nice if there ever was an actual "Next Step" tensor that once multiplied (or some other operation) by any tensor representing current state, would give you the next state. That was my initial thinking when I read "Using Linear Algebra".
Yes, but not in the matrix state we are using to represent it. It would probably need to be represented as a "one-hot" MDP encoding, which takes away from some of the beauty of it.
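For anyone wondering what a "one-hot" next-step matrix looks like, here is a toy sketch in plain Python. It is not the article's representation and it only tracks the head on a hypothetical ring of 4 cells; the helper names (`one_hot`, `shift_matrix`, `matvec`) are made up for the example. A full Snake state would need one entry per possible board configuration, which is why the encoding gets unwieldy.

```python
def one_hot(index, size):
    """Vector with a single 1 at `index`: the state 'head is in cell index'."""
    return [1 if i == index else 0 for i in range(size)]

def shift_matrix(size):
    """Transition matrix for 'move one cell right, wrapping around'.

    Entry [r][c] is 1 when state c leads to state r, i.e. r == (c + 1) % size.
    """
    return [[1 if r == (c + 1) % size else 0 for c in range(size)]
            for r in range(size)]

def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

STEP_RIGHT = shift_matrix(4)
state = one_hot(3, 4)                   # head in the last cell
next_state = matvec(STEP_RIGHT, state)  # head wraps around to cell 0
print(next_state)
```

Each action gets its own matrix, and applying an action is a single multiplication, which is the "Next Step" tensor idea in its simplest form.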
Here's a rudimentary display function that displays a list of moves:
∇display moves
colours←↑(0 255 0)(0 0 0)(0 0 255)(255 0 0) ⍝ green black blue red
'b'⎕WC'Bitmap'('Bits'(0 0⍴0))('CMap'colours) ⍝ create bitmap
'f'⎕WC'Form' 'Snake demo'('Size' 500 500)('Coord' 'pixel')('Picture'b) ⍝ create form with b as the background
s←1 ¯1@(⍉↓10 10⊤2?100)⊢10 10⍴0 ⍝ start position
b.Bits←50/50⌿(¯1 0 1,⌈/,s)⍸s ⍝ display snake
:For move :In moves ⍝ loop over moves
s←move snake s ⍝ update using 'snake' function
b.Bits←50/50⌿(¯1 0 1,⌈/,s)⍸s ⍝ update bitmap
⎕DL÷50 ⍝ delay by 1/50 s
:EndFor
∇