Well, the key difference is that you don't have to think much to get a code-specialized language model, and once you do train one it's much more general (e.g. it infers constraints from text, uses user-provided types, names variables correctly, and is less prone to exponential complexity as sample length grows). The model then just gets better over time as AI improves, and all you have to provide is a comparatively cheap bit of compute.
I got the impression from you saying “You can synthesize programs at this level of complexity with a few minutes on a single ten year old laptop using 5-10 year old algorithms.” that you thought this was generally solved at this level of complexity, rather than merely true for an easier example here and there.