I wonder if we could get it to generate a layered output, to make it easy to change just the text layer. It already creates the textual part in a separate pass, right?
Current open-source tools include pretty decent off-the-shelf detectors based on Segment Anything. They leave a lot to be desired, but you can do layer-like operations: automatically detect certain concepts and apply changes to them or, less commonly, export the cropped areas. What you can't get is the content "beneath" the layers, because it doesn't exist.
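To make the "layer-like but not really layers" point concrete, here is a rough sketch in plain Python. It assumes a detector has already produced a boolean mask for some concept; the tiny grayscale image, the mask, and the helper names (`apply_to_mask`, `crop_to_mask`, `lift_layer`) are all made up for illustration and are not any real tool's API.

```python
from typing import Callable, List, Optional

Image = List[List[int]]   # grayscale pixels, row-major (toy stand-in for a real image)
Mask = List[List[bool]]   # True where the (hypothetical) detector found the concept


def apply_to_mask(image: Image, mask: Mask, fn: Callable[[int], int]) -> Image:
    """Return a copy of image with fn applied only to the masked pixels."""
    return [
        [fn(p) if m else p for p, m in zip(row, mrow)]
        for row, mrow in zip(image, mask)
    ]


def crop_to_mask(image: Image, mask: Mask) -> Image:
    """Export the bounding box around the masked region."""
    rows = [r for r, mrow in enumerate(mask) if any(mrow)]
    cols = [c for mrow in mask for c, m in enumerate(mrow) if m]
    if not rows:
        return []
    return [row[min(cols):max(cols) + 1] for row in image[min(rows):max(rows) + 1]]


def lift_layer(image: Image, mask: Mask) -> List[List[Optional[int]]]:
    """Remove the masked pixels. What remains is a hole (None), not background."""
    return [
        [None if m else p for p, m in zip(row, mrow)]
        for row, mrow in zip(image, mask)
    ]


# Toy data: a bright 2x2 "object" on a dark background, plus its mask.
image = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
    [False, False, False, False],
]

inverted = apply_to_mask(image, mask, lambda p: 255 - p)  # edit only the object
cropped = crop_to_mask(image, mask)                       # export the object
remainder = lift_layer(image, mask)                       # object removed, hole left
```

The first two operations work fine with only a mask. The third is where it stops behaving like a real layer: the pixels under the object were never generated, so something else (inpainting, for example) would have to invent them.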
I would bet that Adobe is salivating at that. It might not happen for a long time, but it seems like a no-brainer once the technology can handle it. Even just the last few years have been fast, and I say that having worked in the JS landscape for a few years. That moves faster than Sonic, and this tech iterates quickly too.