
ML systems generally do not care about human semantics, and they will not produce them naturally. The VAE works with 16-bit floats per channel, so compression is not an issue either; if it were, HSV would be a poor choice for that too.



ML systems don't care, but humans do, and more semantically meaningful representations in training data usually lead to better results for us. In images you often care about "different colours of similar brightness" rather than "matching levels of 3 colour components", so there's a non-zero chance HSV/HLS would do better than RGB. It has nothing to do with compression.


Does it lead to better results though? For the system, the best representation would be one that it learned - which is the latent representation, 4 channels in this case. Would it learn a "better" representation when fed with HSL instead of RGB? If so, what's the intuition? RGB somewhat resembles human vision, whereas HSL exists for interactive editing, and YCbCr exists for compression. If anything, I would expect YCbCr to outperform.


> If so, what's the intuition?

HSV more closely resembles physical properties, for most natural things. Hue and saturation variations are usually meaningful variations in the actual material. Brightness variations often end up being mostly about lighting, rather than the material. It can be surprisingly effective for simple segmentation [1], which is why it's usually the first one implemented in computer vision classes.

Our eyes have RGB sensors, but I would claim I perceive the colors in my surroundings in something like HSV (although that could very well be from the way I learned colors). And I think this makes sense: if you're looking for something, you want a color perception that's not overly sensitive to lighting conditions. RGB values, by contrast, are directly tied to the lighting.

[1] https://medium.com/neurosapiens/segmentation-and-classificat...
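
To make the lighting point concrete, here is a minimal sketch using Python's standard `colorsys` module. It assumes that a lighting change can be modelled as multiplying all three RGB components by the same factor, which is a simplification (real lighting also shifts colour temperature, adds specular highlights, and so on). The helper names `dim` and `hsv_pair` and the sample colour are made up for illustration.

```python
import colorsys

def dim(rgb, factor):
    """Scale every RGB component by the same factor (a crude model of dimmer lighting)."""
    return tuple(c * factor for c in rgb)

def hsv_pair(rgb, factor):
    """Return the HSV triple of the original colour and of its dimmed version."""
    return colorsys.rgb_to_hsv(*rgb), colorsys.rgb_to_hsv(*dim(rgb, factor))

original = (0.8, 0.4, 0.2)  # arbitrary orange-ish colour, components in [0, 1]
bright, dark = hsv_pair(original, 0.5)

# All three RGB components change, but only V should change in HSV.
print("RGB:", original, "->", dim(original, 0.5))
print("HSV:", bright, "->", dark)
```

Under that uniform-scaling assumption, hue and saturation come out the same for both versions and only V moves, whereas every RGB component changes.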


The segmentation aspect is interesting, but the problem I have with H is that it is circular, i.e. 0 and 1 represent virtually the same hue, and my intuition is that this lends itself poorly to a NN. The luminosity argument is valid, but that is not unique to HSL, hence my intuition that YCbCr (or related) would outperform.
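
A small sketch of the wrap-around problem, again with Python's standard `colorsys` and hue normalised to [0, 1). The function names and the two sample hues are hypothetical, picked only to sit on either side of the seam.

```python
import colorsys
import math

def linear_gap(h1, h2):
    """Difference a model sees if hue is treated as an ordinary scalar."""
    return abs(h1 - h2)

def circular_gap(h1, h2):
    """Shortest distance around the hue circle (hue in [0, 1))."""
    d = abs(h1 - h2) % 1.0
    return min(d, 1.0 - d)

def rgb_gap(h1, h2):
    """Euclidean RGB distance between two fully saturated, full-value hues."""
    a = colorsys.hsv_to_rgb(h1, 1.0, 1.0)
    b = colorsys.hsv_to_rgb(h2, 1.0, 1.0)
    return math.dist(a, b)

h_a, h_b = 0.01, 0.99  # two reds on opposite sides of the 0/1 seam
print("linear hue gap  :", linear_gap(h_a, h_b))
print("circular hue gap:", circular_gap(h_a, h_b))
print("RGB distance    :", rgb_gap(h_a, h_b))
```

The two hues are almost a full unit apart as plain numbers, yet close on the circle and close in RGB, so a loss computed on raw H would treat two nearly identical reds as very different.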



