Many convolutional neural networks for image classification work on 299×299 inputs. With a higher-resolution camera, you'd just scale down every frame before feeding it to the model.
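In practice you'd downscale with OpenCV or Pillow, but the idea is simple enough to sketch in pure Python. This is just a nearest-neighbor downscale over a frame represented as rows of pixels (the 1920×1080 input and 299×299 target are illustrative numbers, not anything model-specific):

```python
def downscale_nearest(frame, out_w, out_h):
    """Nearest-neighbor downscale of a frame given as a list of pixel rows."""
    in_h = len(frame)
    in_w = len(frame[0])
    # For each output pixel, sample the proportionally located input pixel.
    return [
        [frame[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

# e.g. shrink a 1080p frame to the 299x299 input a classifier expects
hi_res = [[(x, y) for x in range(1920)] for y in range(1080)]
small = downscale_nearest(hi_res, 299, 299)
```

A real pipeline would use bilinear or area interpolation (e.g. `cv2.resize`) rather than nearest-neighbor, but the cost of this step is trivial compared to inference either way.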
Most of the models listed there are even smaller, 224×224 px. These are models for Coral, though, right? The page also says "These are not production-quality models; they are for demonstration purposes only."; the resolution may be part of the reason for that.
These CNN architectures really are designed to operate on such small inputs, even when you run them on RTX 4090s. It makes sense: even at 224×224 pixels you'd find it easy to identify the subject of an image. Having a 50 MP image of a chair doesn't help you figure out what it is, and can even make it harder if you're only looking at one zoomed-in region of the image at a time.
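A rough back-of-envelope calculation shows why huge inputs don't scale: activation memory and compute in a conv layer grow with the pixel count. Assuming a typical 64-channel stem (as in ResNet-style networks) and ignoring stride, just to compare orders of magnitude:

```python
def first_layer_activations(h, w, channels=64):
    """Per-image activation count for a conv layer with `channels` outputs.

    Stride and pooling are ignored; this is only an order-of-magnitude
    comparison, not an exact memory figure for any specific architecture.
    """
    return h * w * channels

print(first_layer_activations(224, 224))    # ~3.2 million values
print(first_layer_activations(8192, 6144))  # ~3.2 billion values for a ~50 MP frame
```

A thousand-fold jump in activations (before you even consider the deeper layers) is why architectures standardize on small inputs and leave resolution to the preprocessing step.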
"Not production-quality models" might refer to: Training was not done for as many epochs as you might want to achieve peak accuracy, or different quantization methods might yield better performance, etc. Or, it's just a disclaimer that if you decide to sell a product using one of these models, don't blame them if it is bad at detecting hot dog vs not hot dog.
You can see the input sizes of Google's pre-trained models here: https://coral.ai/models/image-classification/