From the paper, written by the same person who made the video:
"The result is not impressive at all; the main goal here is to reduce the opponent's health, but our objective function can
only track bytes that go up."
I can't watch the video at the moment, but I imagine the paper goes into much more depth on the internals of this AI.
But seriously, now. Reverse Polish notation is plenty intuitive as long as you don't get too crazy with what you put on the stack.
For example,
(7**3 + 9*7) / 2
might be rendered:
7 3 ** 9 7 * + 2 /
That is, "Take 7 and 3 and exponentiate, take 9 and 7 and multiply, then add the two numbers, take the answer and 2 and divide." If that's too much to swallow, parentheses or extra spacing could easily be used to increase readability (though parentheses are never required in RPN; they're purely aesthetic):
(((7 3 **) (9 7 *) +) 2 /)
...But really it's just a matter of what you're used to.
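To make the "take these two and apply the operator" reading concrete, here is a minimal sketch of a stack-based evaluator in Python. The names `OPS` and `eval_rpn` are my own for illustration, and it assumes only binary arithmetic operators and numeric tokens. Parentheses are simply thrown away before evaluating, which shows they really are just decoration.

```python
import operator

# Binary arithmetic operators only, matching the example expression.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "**": operator.pow,
}

def eval_rpn(expr):
    """Evaluate a space-separated RPN expression and return a float."""
    # Parentheses carry no meaning in RPN, so drop them.
    tokens = expr.replace("(", " ").replace(")", " ").split()
    stack = []
    for tok in tokens:
        if tok in OPS:
            # The most recently pushed value is the right-hand operand.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]

print(eval_rpn("7 3 ** 9 7 * + 2 /"))          # 203.0
print(eval_rpn("(((7 3 **) (9 7 *) +) 2 /)"))  # 203.0
```

Both spellings give the same answer, (343 + 63) / 2 = 203.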
P.S. You'll notice that with parentheses RPN looked remarkably Lisp-like. This isn't a coincidence. The core difference between prefix notation as used by Lisp and RPN is that each operator in RPN takes a fixed number of arguments: two, for the arithmetic operators we're talking about here (it's more complicated in languages like Forth). This is the tradeoff RPN makes: losing the ability to have a variable number of arguments per operator, for the ability to ditch those parentheses for good.
It should also be clear that one could easily have a Lisp-like syntax similar in appearance to the RPN with parentheses above (though with Lisp's ability to have more than two arguments), but just happening to put operators at the end. Conversely, one could have an RPN where operators are at the front, which is actually the concept behind Polish notation, which RPN stems from: http://en.wikipedia.org/wiki/Polish_notation
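As a rough sketch of the operators-at-the-front variant, the same expression in Polish notation is `/ + ** 7 3 * 9 7 2`, and it can be evaluated with the same stack trick by scanning the tokens right to left. Again, `OPS` and `eval_prefix` are names I made up, and this assumes every operator takes exactly two arguments, which is what lets it work without parentheses.

```python
import operator

# Binary arithmetic operators only; fixed arity is what makes
# parenthesis-free prefix notation unambiguous.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "**": operator.pow,
}

def eval_prefix(expr):
    """Evaluate a space-separated Polish (prefix) expression."""
    stack = []
    # Scan right to left so operands are on the stack before their operator.
    for tok in reversed(expr.split()):
        if tok in OPS:
            # Scanning backwards, the first value popped is the left operand.
            left = stack.pop()
            right = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed prefix expression")
    return stack[0]

print(eval_prefix("/ + ** 7 3 * 9 7 2"))  # 203.0
```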
On a final note, since RPN is read from left-to-right exclusively, a speaker of an RTL language would actually be worse off. The perceived unintuitiveness is more because English is a subject-verb-object language, whereas RPN is subject-object-verb.
Ah. So it's a sort of Big Software Syndrome type thing -- "we could do it in a more intuitive way for new people, but it would break all of these existing programmers who learned it the less intuitive way, so let's leave it the original way" -- or am I missing something?
That may have something to do with it, but in practice, I find myself moving up/down (j/k) files far more often than I need to backtrack by a single character, and when I do, as often as not my pinky flies up to backspace (note this works by default only in Vim). It ends up being the most natural to keep your hands on the home row anyway.