Help me see how this requires a _thorough_ study of Bayesian methods.
What you describe seems to be (additive/Laplace) smoothing [1], a fairly basic concept. And I don't see how this is specifically Bayesian.
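For concreteness, here is a minimal sketch of additive (Laplace) smoothing as I understand it: add a pseudocount alpha to each category's count before normalizing, so unseen categories get nonzero probability. The function name and toy counts are my own illustration, not from the parent comment.

```python
# Additive (Laplace) smoothing: add a pseudocount alpha to every
# category before normalizing. alpha=1 is classic Laplace smoothing.
def additive_smoothing(counts, alpha=1.0):
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]

# Word counts over a 4-word vocabulary; one word was never observed,
# yet it still receives nonzero probability after smoothing.
probs = additive_smoothing([3, 1, 0, 2], alpha=1.0)
# probs == [0.4, 0.2, 0.1, 0.3]
```

(One can read alpha as a symmetric Dirichlet prior, which is the usual Bayesian gloss on it, but the mechanics are just pseudocounts.)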
Then, one common critique of "Bayesian stats work well with small samples" is that with small data (weak signal), you get your prior back, so you haven't learned anything, and the only thing you're left with is your "bias", erm, prior.
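To make the critique concrete, here is a toy Beta-Binomial sketch (my own illustration, assuming a Beta(a, b) prior on a coin's heads probability): after observing h heads in n flips, the posterior mean is (a + h) / (a + b + n). With tiny n, that number sits near the prior mean regardless of the data.

```python
# Posterior mean of a Beta(a, b) prior updated on h heads in n flips.
def posterior_mean(a, b, heads, n):
    return (a + heads) / (a + b + n)

prior_mean = 10 / (10 + 10)                # Beta(10, 10) prior: 0.5
small = posterior_mean(10, 10, 3, 3)       # 3 flips, all heads: ~0.565
large = posterior_mean(10, 10, 800, 1000)  # 1000 flips, 800 heads: ~0.794
```

With only 3 flips (all heads, empirical rate 1.0) the posterior mean barely moves off 0.5, i.e., you mostly get the prior back; with 1000 flips it tracks the data.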
I am still keen to learn why one needs a thorough study of Bayesian methods for practical machine learning tasks (that is, beyond a standard statistics or data science curriculum plus on-the-job experience).
[1] https://en.wikipedia.org/wiki/Additive_smoothing