This depends entirely on your goals. If you're researching the machine learning algorithms themselves, use a framework like TensorFlow or Torch, which provides the tensor operations and abstracts away the hardware. If you're trying to get maximum performance out of today's hardware, stick with Nvidia and use CUDA. If you want to deploy across a range of hardware, or to get your hands dirty with the actual implementation of the algorithms (as projects like wonnx do), then WebGPU is the way to go.
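To give a rough feel for the "hands dirty" path, here is a minimal compute dispatch written against the Rust wgpu crate (the same WebGPU API layer wonnx builds on): it uploads a buffer, runs a WGSL kernel that doubles every element, and submits the work. This is only a sketch assuming a wgpu 0.19-era API and the bytemuck and pollster crates; struct fields and signatures shift between wgpu releases, and reading results back is omitted.

```rust
// Sketch of a WebGPU compute dispatch via the Rust `wgpu` crate.
// Assumes a wgpu 0.19-era API; names and struct fields differ slightly
// between releases. Doubles every element of a buffer on the GPU.
use wgpu::util::DeviceExt;

const SHADER: &str = r#"
@group(0) @binding(0) var<storage, read_write> data: array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
    if (id.x < arrayLength(&data)) {
        data[id.x] = data[id.x] * 2.0;
    }
}
"#;

async fn run() {
    // Pick any available adapter (Vulkan, Metal, DX12, or a browser backend).
    let instance = wgpu::Instance::default();
    let adapter = instance
        .request_adapter(&wgpu::RequestAdapterOptions::default())
        .await
        .expect("no GPU adapter found");
    let (device, queue) = adapter
        .request_device(&wgpu::DeviceDescriptor::default(), None)
        .await
        .expect("failed to create device");

    // Compile the WGSL kernel and build a compute pipeline around it.
    let module = device.create_shader_module(wgpu::ShaderModuleDescriptor {
        label: Some("double"),
        source: wgpu::ShaderSource::Wgsl(SHADER.into()),
    });
    let pipeline = device.create_compute_pipeline(&wgpu::ComputePipelineDescriptor {
        label: Some("double"),
        layout: None,
        module: &module,
        entry_point: "main",
    });

    // Upload the input data as a storage buffer.
    let input: Vec<f32> = (0..1024).map(|i| i as f32).collect();
    let buffer = device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
        label: Some("data"),
        contents: bytemuck::cast_slice(&input),
        usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_SRC,
    });
    let bind_group = device.create_bind_group(&wgpu::BindGroupDescriptor {
        label: None,
        layout: &pipeline.get_bind_group_layout(0),
        entries: &[wgpu::BindGroupEntry {
            binding: 0,
            resource: buffer.as_entire_binding(),
        }],
    });

    // Record and submit one dispatch: one thread per element, 64 per workgroup.
    let mut encoder = device.create_command_encoder(&wgpu::CommandEncoderDescriptor::default());
    {
        let mut pass = encoder.begin_compute_pass(&wgpu::ComputePassDescriptor::default());
        pass.set_pipeline(&pipeline);
        pass.set_bind_group(0, &bind_group, &[]);
        pass.dispatch_workgroups((input.len() as u32 + 63) / 64, 1, 1);
    }
    queue.submit(Some(encoder.finish()));
    // Reading the result back requires a mappable staging buffer, omitted here.
}

fn main() {
    pollster::block_on(run());
}
```

The point of the sketch is the trade-off: you manage devices, buffers, bind groups, and dispatch sizes yourself, but the same code targets Vulkan, Metal, DX12, and the browser, which is exactly the portability argument for WebGPU over CUDA.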