The emergence of AI engineering

swyx · on July 7, 2023

(author of https://www.latent.space/p/ai-engineer here)

this is a fantastic followup post spelling out the toolkit of the AI Engineer. I have a version of this in my notes but didn't want to publish it for fear of being too biased by omission, plus it's generally good to leave others to fill in the blanks on some things. the Projects list is particularly great - one could build a course around these few things and be reasonably confident of having the base skills of a competent "AI Engineer" (one that, for example, any PM or founder could ask to do an AI project and who'd be well equipped to advise/make technical decisions).

don't miss Andrej Karpathy + Jim Fan's take on the AI Engineer role: https://twitter.com/karpathy/status/1674873002314563584

for those who don't like Twitter, i cleaned up the audio of our AI Engineer conversation with Jared Palmer, Alex Graveley, Joseph Nelson, and other self-identifying AI Engineers here: https://mixtape.swyx.io/episodes/the-rise-of-the-ai-engineer...

it's a fun time for those who want to carve out a space for themselves specializing in this stuff.

ReadEvalPost · on July 7, 2023

"Build, don't train" is poor advice for a prospective "AI Engineer" title. It should be "Know when to reach for a new model architecture, know when to reach for fine-tuning / LoRA training, know when to use an API." Only relying on an API will drastically reduce your product differentiation, to say nothing of the fact that any AI Engineer worth that title should know how to build and train models.

charlierguo · on July 7, 2023

Fair point! I think my main idea was "prefer building with an API over training your own model" but that isn't as pithy.

The jury's still out on how much training and fine-tuning are going to matter in the long run - my belief is that there are many great products that can exist without needing a new model architecture, or owning the model at all.

ReadEvalPost · on July 7, 2023

That advice makes sense if we're talking about 800B+ parameter models that require a gigantic investment of capital and time. For models that fit on a consumer GPU, you're leaving chips on the table by not taking advantage of training / fine-tuning. It's just too easy and powerful not to.

fswd · on July 7, 2023

We're calling it LLMOps

swyx · on July 7, 2023

1. it's unpronounceable

2. ops is boring. get in losers, we're generating all the things