> he wants to be able to express programs, and even an operating system, as a directed acyclic graph of logical binary operations, so that you can have consistent and deterministic runtime behavior.
So how is this different from digital logic synthesis for CPLD/FPGA or chip design we have been doing over the last decades?
FPGAs are (prematurely) optimized for the wrong things: latency and utilization. The hardware is heterogeneous, and there isn't one standard chip. Plus they tend to be expensive.
The idea is to be able to compile/run like you can now with your von Neumann machine.
FPGA compile runs can sometimes take days! And of course, chips take months and quite a bit of money for each try through the loop.
With FPGAs I can sample a hundred high-precision ADCs in parallel and feed them through DSP, process 10Gb Ethernet at line rate, etc., with deterministic outcomes (necessary given safety and regulatory considerations). They integrate well with CPUs and other coprocessors; heterogeneity isn't wrong. Plus, training a NN model also takes days! To be fair, not always, but for the above applications my build times were hours to many hours anyway.
I grant the hardware is absurdly expensive at the high end, but application-wise I really don't think the comparison is apples to apples.
Hotz's claim that literally everything with an IO pin or actuator will be driven solely by a NN (driven by tinygrad) seems to me maybe 1/3 self-promotion, 1/3 mania, and some much smaller fraction incisive at best.
There is the excellent CAD Sketcher plugin for Blender; it adds a basic 2D parametric/constraint-based editor into your workflow and can convert its output into a mesh to integrate into your Blender model. For more complicated models I typically make two or three 2D constraint models and use Blender's boolean tools to combine them into the final 3D model.
One recurring pattern I have seen at multiple customer sites is that scaling makes engineers lazy about optimizing. One production performance calamity forces the team to add CPUs as a quick fix, and from that point on the baseline for the product's requirements is set to the new number of CPUs.
"Back in the olden days", if your product was slow but the number of CPUs was fixed (or could not be increased instantly), the solution was to go and fix your code.
Basic system level skills are now no longer taught or practiced at the appropriate levels, so teams end up without engineers who actually know how to profile and optimize.
I've lost track of the times I've heard "compute is cheap! engineers are expensive!" Except... that compute cost will live forever. The time it takes someone to debug a bad loop or poor query is at worst a one-time cost. Longer term, it may even make other stuff faster in the future.
Looking at some cloud pricing, yeah, I don't believe that statement anymore. End-device compute might be cheap, but cloud services certainly are not.
Because it's cheaper now to throw hardware at a problem than actually try to fix the code or root of the issue. This wasn't the case a couple of decades ago.
It also means fewer engineers are needed at most companies.
> Because it's cheaper now to throw hardware at a problem than actually try to fix the code or root of the issue.
It's not cheaper, it's just more opaque.
Back when your service was deployed on that 2 CPU box and it was too slow for obvious reasons, you optimized it and then it was good.
Today you just shrug, increase that Kubernetes cluster from 16 to 48 nodes, and forget about it. It costs a lot, but the bill shows up somewhere else; in most groups the engineer doesn't even know what it is.
It actually still scares the hell out of me that this is the way even the experts 'program' this technology, with all the ambiguities arising from the use of natural language.
Keep in mind that this is not the only way the experts program this technology.
There's plenty of fine-tuning and RLHF involved too, that's mostly how "model alignment" works for example.
The system prompt exists merely as an extra precaution to reinforce the behaviors learned in RLHF, to explain some subtleties that would be otherwise hard to learn, and to fix little mistakes that remain after fine-tuning.
You can verify that this is true by using the model through the API, where you can set a custom system prompt. Even if your prompt is very short, most behaviors still remain pretty similar.
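For illustration, here's a minimal sketch (Rust, using the `reqwest` and `serde_json` crates) of what "set a custom system prompt through the API" looks like against an OpenAI-style chat-completions endpoint. The URL, model name, and env var below are placeholders, not any particular vendor's real values.

```rust
// Sketch only: needs the `reqwest` crate (with the "blocking" and "json"
// features) and `serde_json`. URL, model name, and env var are placeholders.
use std::env;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = env::var("API_KEY")?; // hypothetical env var name

    let body = serde_json::json!({
        "model": "some-model",          // placeholder model name
        "messages": [
            // Deliberately short custom system prompt; per the point above,
            // most behaviors survive even without the long default prompt.
            { "role": "system", "content": "You are a helpful assistant." },
            { "role": "user",   "content": "Explain RLHF in one sentence." }
        ]
    });

    let resp = reqwest::blocking::Client::new()
        .post("https://api.example.com/v1/chat/completions") // placeholder URL
        .bearer_auth(api_key)
        .json(&body)
        .send()?
        .text()?;

    println!("{resp}");
    Ok(())
}
```

Swap in a longer or shorter system message and compare the answers yourself; that's the whole experiment the comment above is suggesting.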
There's an interesting X thread from the researchers at Anthropic on why their prompt is the way it is at [1][2].
Supposedly they use "RLAIF", but honestly given that the first step is to "generate responses... using a helpful-only AI assistant" it kinda sounds like RLHF with more steps.
LLM Prompt Engineering: injecting your own arbitrary data into what is ultimately an undifferentiated input stream of word-tokens from no particular source, hoping your sequence will be more influential in the dream-generator's output than a sequence placed there by another person, or a sequence that they indirectly caused the system to emit and that then got injected back into itself.
Then play whack-a-mole until you get what you want, enough of the time, temporarily.
I think OP's point is that "hope" is never a substitute for "a battery of experiments on dependably constant phenomena and supported by strong statistical analysis."
Honestly, this sort of programming (whether it's in quotes or not) will be unbelievably life changing when it works.
I can absolutely put into words what I want, but I cannot program it because of all the variables. When a computer can build the code for me based on my description... Holy cow.
Well, hopefully your developers are substantially more capable, able to clearly track the difference between your requests versus those of other stakeholders... And they don't get confused by overhearing their own voice repeating words from other people. :p
We all use abstractions, and abstractions, good as they are to fight complexity, are also bad because sometimes they hide details we need to know. In other words, we don't genuinely understand anything. We're parrots of abstractions invented elsewhere and not fully grokked. In a company there is no single human who understands everything, it's a patchwork of partial understandings coupled functionally together. Even a medium sized git repo suffers from the same issue - nobody understands it fully.
Wholeheartedly agree. Which is why the most valuable people in a company are those who can cross abstraction layers, vertically or horizontally, and reduce information loss from boundaries between abstractions.
... or, even worse, something you think is what you want, because you don't know better, but which happens to be a wholly (or, worse still, just subtly, partially incorrect) confabulated answer.
It still scares the hell out of me that engineers think there's a better alternative that covers all the use cases of an LLM. Look at how naive Siri's engineers were, thinking they could scale that mess to a point where people all over the world would find it a helpful tool that improved the way they use a computer.
The original founders realised the weakness of Siri and started a machine-learning-based assistant, which they sold to Samsung. Apple could have taken the same route but didn't.
I mean, there are videos from when Siri was launched [1] with folks at Apple calling it intelligent and proudly demonstrating that if you asked it whether you need a raincoat, it would check the weather forecast and give you an answer - demonstrating conceptual understanding, not just responding to a 'weather' keyword. With senior folk saying "I've been in the AI field a long time, and this still blows me away."
So there's direct evidence of Apple insiders thinking Siri was pretty great.
Of course we could assume Apple insiders realised Siri was an underwhelming product, even if there's no video evidence. Perhaps the product is evidence enough?
My overall impression from using Siri daily for many years (mainly for controlling smart lights, turning the TV on/off, and setting timers/alarms) is that Siri is artificially dumbed down to never respond with an incorrect answer.
When it says “please open iPhone to see the results” - half the time I think it’s capable of responding with something but Apple would rather it not.
I’ve always seen Siri’s limitations as a business decision by Apple rather than a technical problem that couldn’t be solved. (Although maybe it’s something that couldn’t be solved to Apple’s standards.)
"we have identified an interoperability issue [..] in which random temporary
radio traffic disruptions are incorrectly recognized as legacy switch power
toggles"
Yeah right, this is some nice BS. These lamps are driven by Zigbee. Please explain to me how "random radio traffic disruptions" are able to disrupt a protocol that has proper AES-128 encryption built in.
> Please explain to me how "random radio traffic disruptions" are able to disrupt a protocol that has built in proper AES-128 encryption.
Radio traffic disruption, not data manipulation. Like when your network hub stops working or someone is using a microwave in the next room, for example. Most Zigbee devices transmit in the 2.4 GHz band, which makes them very susceptible to being drowned out by EMI.
Sure, so this clearly disrupts the transmission. But since strong encryption and authentication are used in the Zigbee protocol, the chance of a mangled frame ending up at a higher level in the protocol stack is approximately zero; anything messing up the frames will invalidate the integrity check, so the receiver will discard those frames.
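To make that concrete, here's a toy Rust sketch where a keyed checksum stands in for Zigbee's real AES-CCM message integrity code (the frame layout, names, and checksum are all made up for illustration): a frame mangled by interference fails verification and is dropped before it could ever look like a "switch toggle" to the application layer.

```rust
// Toy illustration only: a keyed checksum stands in for Zigbee's AES-CCM MIC.
// The point: RF noise can jam traffic, but a corrupted frame fails the
// integrity check and never reaches the application layer.

fn toy_mic(key: u64, payload: &[u8]) -> u64 {
    // NOT cryptographically secure; purely a stand-in for the real MIC.
    payload.iter().fold(key, |acc, &b| {
        acc.rotate_left(7) ^ (b as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15)
    })
}

struct Frame {
    payload: Vec<u8>,
    mic: u64,
}

fn receive(key: u64, frame: &Frame) -> Option<&[u8]> {
    // Receiver recomputes the MIC; any bit flip in the payload (or MIC)
    // makes the check fail and the frame is silently discarded.
    if toy_mic(key, &frame.payload) == frame.mic {
        Some(frame.payload.as_slice())
    } else {
        None
    }
}

fn main() {
    let key = 0xDEAD_BEEF_u64;
    let payload = b"toggle lamp 3".to_vec();
    let mut frame = Frame { mic: toy_mic(key, &payload), payload };

    assert!(receive(key, &frame).is_some()); // clean frame accepted

    frame.payload[0] ^= 0x01;                // simulate RF corruption
    assert!(receive(key, &frame).is_none()); // mangled frame dropped
}
```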
Although this is a fun project, it seems like a silly amount of work: this display controller has been supported in the Linux kernel for ages. Why not just add the panel to the device tree and let the kernel handle all that for you?
This is a nice example of doing inheritance the Rust way, using composition instead of OO, but I'm still missing some essential parts for which I have not found proper Rust idioms yet.
For example, when building a widget hierarchy, I would like to be able to pass widgets of different types to a function or store them in a `Vec<>`, but this is not possible since these are different concrete types.
Trait objects can help here, but then Rust lacks the machinery to upcast and downcast between types, as there is no RTTI attached to these trait objects; you cannot cast/convert a `Button` to a `BaseWidget`, or a `BaseWidget` that was originally a `Button` back to a `Button`.
Right, it seems `Any` and `downcast_ref::<>` were the missing parts for the downcasting. Upcasting is new in nightly with `#![feature(trait_upcasting)]`
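For anyone landing here later, a minimal sketch of the pattern being discussed: composition for the shared base state, `Box<dyn Widget>` trait objects for heterogeneous storage, and `Any`/`downcast_ref` to recover the concrete type. The `Widget`/`BaseWidget`/`Button`/`Slider` names are just illustrative, not from any particular GUI crate.

```rust
use std::any::Any;

struct BaseWidget {
    x: i32,
    y: i32,
}

trait Widget: Any {
    fn base(&self) -> &BaseWidget; // shared state via composition/delegation
    fn as_any(&self) -> &dyn Any;  // hook for downcasting
    fn draw(&self) {
        let b = self.base();
        println!("widget at ({}, {})", b.x, b.y);
    }
}

struct Button {
    base: BaseWidget,
    label: String,
}

impl Widget for Button {
    fn base(&self) -> &BaseWidget { &self.base }
    fn as_any(&self) -> &dyn Any { self }
}

struct Slider {
    base: BaseWidget,
    value: f32,
}

impl Widget for Slider {
    fn base(&self) -> &BaseWidget { &self.base }
    fn as_any(&self) -> &dyn Any { self }
}

fn main() {
    // Different concrete types stored in one Vec, via trait objects.
    let widgets: Vec<Box<dyn Widget>> = vec![
        Box::new(Button { base: BaseWidget { x: 0, y: 0 }, label: "OK".into() }),
        Box::new(Slider { base: BaseWidget { x: 0, y: 20 }, value: 0.5 }),
    ];

    for w in &widgets {
        w.draw();
        // Downcast back to the concrete type when needed.
        if let Some(button) = w.as_any().downcast_ref::<Button>() {
            println!("found a button labelled {:?}", button.label);
        } else if let Some(slider) = w.as_any().downcast_ref::<Slider>() {
            println!("found a slider at value {}", slider.value);
        }
    }
}
```

With the nightly `#![feature(trait_upcasting)]` mentioned above (and `Any` as a supertrait, as here), the `as_any` hook becomes unnecessary, since `&dyn Widget` can then be coerced to `&dyn Any` directly.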
I might be going around in the wrong social circles, but none of the people I know look anything like the realistic people in these images. Are these models even able to generate pictures of actual normal everyday people instead of glossy photo models and celebrity lookalikes?
Also because models in photographs are symmetrical, emotionless, softly lit, and have perfect skin. Imperfections like age lines, wrinkles, and scars, along with emotional expression, deep shadows, and asymmetry, require actual understanding of human anatomy to draw convincingly.
Yet the gen-AI image generators have no understanding of anything and still draw human anatomy convincingly very often. In other words, you're wrong. AI does not need anatomy knowledge; that's not how any of this works. They just need enough training data.
> AI does not need anatomy knowledge, that's not how any of this works.
Surely someone has done a paired kinematics model to filter results by this point?
Not my field, but I figured 11-fingered people were just because it was computationally cheaper to have the ape on the other side of the keyboard hit refresh until happy.
There are "normal diffusion" models that create average people with flaws in the style of a low grade consumer camera. They're kind of unsettling because they don't have the same uncanny valley as the typical supermodel photographed with a $10k camera look, but there is still weirdness around the fringes.
Try making a picture of people working in an office with any diffusion model.
It looks like the stock photo cover for that mandatory course you hated.
Even adding keywords like “everyday” doesn’t help. And I fear it’s going to be worse in a few years, when this stuff constitutes the majority of the input.
Oh a "prompter" is it? No, I'm not a prompter, but if you want to interface with a GenAI model, a prompt is kind of the way to do it. No need to be a salty about it.
Yes, however, for the purposes of a demo article like this, it's significantly easier to use famous people who are essentially baked into the model. You encapsulate all of their specific details in a keyword that's easily reusable across a variety of prompts.
By contrast, a "normal" random person is very easy to generate, but very difficult to keep consistent across scenes.
I know. As well these model outputs are so messed up, it's too much. Penis fingers, vacant cross-eyed stares. This has got to be fine. "Trained on explicit images". It kills me every time. TFA is so emphatic, I think the author is hallucinating as badly as the models are.
Not really. I'm in my sixties, and it's surprisingly difficult to get round the biases these models have towards young, perfect people if you want images of older people.
Try the single word prompt 'woman' and see what you get...
Have larger diffusion models gotten to synthetic training dogfooding yet?
The irony is that once we get there, we can address biases in historical data. I.e. having a training set that matches reality vs images that were captured and available ~2020.
The larger base models do an excellent job of aging; try asking for ages increased in 5-year increments and you’ll see a clear progression (some of it caricatured, of course), e.g. “55 year old woman” vs “woman”.
Yeah, I mean it's not great that the models are biased towards a certain subset of "woman" (usually young, pretty, white, etc.), but you can just describe what you want to see and push the model to give it to you. Yes, sometimes it's a bit of a fight, but it's doable.
The good thing is that you don't need a new language for that.