What's nice about the move is that the reinforcement learning algorithm doesn't fundamentally change. The action and state spaces will be larger, since a joint has more degrees of freedom in 3D than in 2D, so learning may take longer; we'll also need to increase the size of the replay buffer and the episode length.
I'm planning on writing a followup post on 3D so stay tuned!
If the author reads this: we've used VPython for simple simulation visualizations like this one and it has worked great, both inside and outside of Jupyter notebooks.
Mark (the author) is working on yuri.ai (https://yuri.ai), a deep reinforcement learning platform for games. Drop him a line or sign up if you are interested!
The analytical solution wouldn't involve any training: you'd solve a system of equations in which the position of the finger equals the goal, constrained by the equations that describe how the arm moves.
The advantage of the RL approach is that it doesn't need to know how the arm moves, but it does require some training.
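For concreteness, here's a minimal sketch of what that analytical route looks like for a planar 2-link arm (the simplest case with a closed-form answer). The function name, signature, and unit link lengths are assumptions for illustration, not anything from the original post:

```python
import math

def two_link_ik(x, y, l1=1.0, l2=1.0):
    """Analytic inverse kinematics for a planar 2-link arm.

    Returns joint angles (theta1, theta2) that place the fingertip
    at (x, y), or None if the goal is out of reach.
    """
    r2 = x * x + y * y
    # Law of cosines gives the elbow angle directly.
    c2 = (r2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= c2 <= 1.0:
        return None  # goal lies outside the arm's reachable annulus
    theta2 = math.acos(c2)  # "elbow-down" solution; -theta2 is the mirror
    theta1 = math.atan2(y, x) - math.atan2(
        l2 * math.sin(theta2), l1 + l2 * math.cos(theta2)
    )
    return theta1, theta2
```

No training loop anywhere: you get the answer in one evaluation, but only because the forward kinematics were written down by hand. The RL agent has to discover that mapping from experience instead.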
Sorry, I meant compare the RL version as it trains to the analytical version.
It's certainly neat that inverse kinematics can be learned from zero knowledge, but I would have a hard time trusting it to operate a real arm in an industrial setting.
You'd be entirely right not to trust it as-is in an industrial setting. There's been some research on safe exploration that adds extra terms to the reward function to do things like penalize flailing around, but I haven't experimented with those techniques myself.
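One common way that shaping looks in practice is a penalty on joint speeds added to the usual distance-to-goal term. This is a toy sketch of the idea only; the function name, weight, and inputs are made up, not taken from any specific safe-exploration paper:

```python
def shaped_reward(distance_to_goal, joint_velocities, vel_penalty=0.05):
    """Hypothetical shaped reward: negative distance to the goal,
    minus a quadratic penalty on joint speeds to discourage flailing.

    vel_penalty is an assumed weight you'd have to tune: too high and
    the arm learns to freeze, too low and the penalty does nothing.
    """
    speed_cost = sum(v * v for v in joint_velocities)
    return -distance_to_goal - vel_penalty * speed_cost
```

The reward structure is otherwise unchanged, so the same algorithm still applies; the agent just learns to trade a little goal-seeking speed for smoother motion.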