Suggestion: calculate the out-of-sample Sharpe ratio[0] of the suggestions over a reasonable period to gauge how good the model would actually perform in terms of return compared to risks. It is better than vanilla accuracy or related metrics. Source: I'm a financial economist.
[0]: https://en.wikipedia.org/wiki/Sharpe_ratio