Setting aside efficiency or accuracy, caching enhances the value of prompt engineering and thus increases the effective value of AI services (how the value is monetized or split is TBD).
Comments suggest that caching the state of the network might also reduce processing.
I wonder if it also permits better A/B-style testing by reducing the effect of cross-domain errors. If the AI service providers made it easy to provide feedback on post-cache responses, the providers could incorporate the quality-enhancement loop accelerating time to product-market fit (at the risk of increasing dependency and reducing ability to switch).
Comments suggest that caching the state of the network might also reduce processing.
I wonder if it also permits better A/B-style testing by reducing the effect of cross-domain errors. If the AI service providers made it easy to provide feedback on post-cache responses, the providers could incorporate the quality-enhancement loop accelerating time to product-market fit (at the risk of increasing dependency and reducing ability to switch).