I think the author doesn’t realize that 99.9% of people with an M1 Mac don’t care at all about running local ML workloads. The people who do care about running local ML workloads have either a workstation with an Nvidia card, access to cloud virtual machines with Nvidia GPUs, or an Apple computer with an M2 Max or similar processor.
Put simply, this isn’t a big deal at all. Most people are not ML power users; most people are using their Mac for web apps via a browser, local office apps, and something like Zoom or Slack.
Also, as an aside, Siri is mostly useless and I’ve never met anyone who uses it regularly. So whether it needs ML processing or not is irrelevant.
Counterargument: most people are not running the ML workloads directly, agreed. However, Apple implements them as features of the OS (like the new GPT-powered text predictions), so it WILL affect regular users over time as new features are not made available to M1 users.
The only reason you want to use bfloat16 is to train "compressed" models. Best inference performance tends to come from 4-bit quantization (and many people are trying to push it lower, even to 1 bit, and with sparsity, effectively below that).
So this just doesn't matter.
Ironically, memory speed is what matters. And that's exactly the point where the M1 actually beats even the M3.
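For a sense of what that means in practice, here's a rough sketch of symmetric 4-bit quantization in plain NumPy. This is only an illustration, not how llama.cpp or any particular library actually does it (real schemes use per-group scales and cleverer rounding), and the function names are made up:

```python
import numpy as np

def quantize_4bit(w):
    # Map float weights to integers in [-7, 7] with a single per-tensor scale.
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from the 4-bit integers.
    return q.astype(np.float32) * scale

w = np.random.randn(8).astype(np.float32)
q, scale = quantize_4bit(w)
print(w)
print(dequantize(q, scale))  # close enough for inference, at a quarter of the bytes of fp16
```

Fewer bytes per weight is also exactly why memory bandwidth, rather than bfloat16 support, tends to be the bottleneck for local inference.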
Thanks. I was going to rebut your point, but I re-read the article and while one of the headings seems to rebut it, the actual content does not contradict anything, so I'll agree with your argument :)
> Siri is mostly useless and I’ve never met anyone who uses it regularly. So whether it needs ML processing or not is irrelevant.
You should get out more. I use Siri daily to switch lights, set reminders, turn on timers, and play music. It works well within limited domains like that. I don’t try to use it as an answerbot. That is a much bigger task and it is not suited for that.
Certainly Siri (and Alexa and Google Assistant) is not very good right now for more complex tasks. That’s why Apple’s announcement that they will be adding LLM-style AI in the next OS is interesting: it could make Siri more reliable and useful for more people.
I have an M2 Pro with 32 GB of RAM. I keep ollama running in the background and use it either in the terminal as a chat interface or as a code assistant in VS Code.
It works pretty great. Of course, larger models take longer to respond or don't work at all, but that hasn't been an issue for my use case.
Hey, sure. It's nothing fancy really. I've been using 2 extensions: Continue.dev [0] and Llama Coder [1].
The former gives a chat-like interface where you can ask questions about the code, request it to write a function, tests, refactor, etc. The latter gives inline, more IntelliSense-like completion.
I'm on a Mac, so I installed ollama from Homebrew, then ran `brew services start ollama` to have it always running in the background.
Configuring the extensions to use it is well documented, I imagine you won't have any trouble but let me know if you get stuck.
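If you'd rather script against it than go through the extensions, ollama also exposes a local HTTP API (default port 11434). A minimal sketch, assuming you've already pulled a model (e.g. `ollama pull codellama`; the model name and prompt here are just examples):

```python
import json
import urllib.request

# Ask the locally running ollama server for a one-shot completion.
payload = {
    "model": "codellama",  # whichever model you've pulled
    "prompt": "Write a Python function that reverses a string.",
    "stream": False,       # return one JSON object instead of a stream
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```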
> I think the author doesn’t realize that 99.9% of people with an M1 Mac don’t care at all about running local ML workloads.
> (…)
> Put simply, this isn’t a big deal at all.
The author isn’t making a consumer argument but a statement of fact, which is perfectly in line with this blog: its author explores the technical details of Macs without necessarily making a value judgment on whether something is good or bad; it’s about explaining how things work. The typical readers of this blog are the ones who will care about these details.
Most won't know they're running ML workloads. All they'll know is that the cool new Photoshop feature doesn't work as well on their computer as on their coworkers'.
But this has always been the case with older computers, simply because faster processors inevitably come out every year, and it hasn’t changed now. The M1 was released more than three years ago.
At any time in the last 30 years of computing, a three-year-old computer wouldn’t run the latest version of Photoshop, Maya, or whatever video game as well as the newest machines.
This was the case when I had a 16 MHz Mac and saw people buying 66 MHz machines, and it’s still the case now.
Technology moves forward and new stuff will start to run slow on older machines, it’s not revelatory news.
I have an M1 Ultra Mac Studio with 128GB of RAM, and the ability to run LLMs like Llama 2 locally thanks to the unified memory architecture was a big factor. Just because you don't have a use for it doesn't mean everyone else is as blinkered as you.
As soon as the M3 Ultra lands, I am getting one, preferably with 256GB RAM.
It's a 60W part, so it would probably be _just_ about doable (old MBPs had up to about 75W worth of stuff to cool between CPU and GPU, though they definitely struggled a bit and the battery life on the dGPU ones was never amazing).
I would normally agree with you, but both my daughter and I had experiences with it this past week where we asked it to provide information while we were driving and it actually did - correctly, even. We both commented that it felt like something had changed or improved.
I don't give a flying f about """AI""". It's the XML of this decade. Still haven't seen any actual use case for it (other than generating spam content for the web). At least with Bitcoin you could buy drugs ;)
Amen, brother. At least one voice of sanity has been heard through the AI hype. I equate an AI toaster more with the old 3D LCD TV crap from 10 years ago.
Yeah, this feels like cloud all over again. AI's not living up to the hype.
At this time last year, GPT 4 really looked promising. It seemed like it could do logical reasoning, which would enable a lot of use cases besides just "automate the 50% dumbest people in your office." I wish OpenAI or someone else would push harder in that direction.
AI is great at parsing and classification. We use it in a specific feature of our product that would be impossible to do with manual regex/parsing; it saves hundreds of man-hours a year.
It’s also been immensely useful for software development work - I find myself reaching for AI over docs more often these days due to its ability to understand context.
We are using it for parsing and classification as well and have done a full internal study. The case is financial data collection. There are three side-by-side approaches we are comparing: human-only, human-led, and ML-only. The only viable outcome for what we are doing is still human-only. The net cost of correcting the mistakes the ML makes in the human-led model is higher than the cost of the human-only model. The ML-only model has a capital risk higher than the value of our business. We're still experimenting with training, but at least in our case, it's a poor solution.
Ergo YMMV, and we shouldn't be spouting absolutes about it. It looks like it might work, but does it really? We work in a very intellectually dishonest, fashion-driven industry. Some objectivity would be nice in arguments.
As for writing code, I find it does a reasonable job of boilerplate but almost always hits the crack pipe if you stretch it. Then again, that's better than any junior developer I've worked with, so there is some utility there.
I agree. Occasionally ours doesn't work (frequent and very irritating OpenAI timeouts) or the output is inconsistent (different output from the same input).
"AI" is literally a marketing term for what we used to call algorithms, macros, etc. until very recently. Nothing fundamentally new is happening, just that what was largely confined to businesses and enterprises before has been adapted to become accessible to every man.
>I find myself reaching for AI over docs
I sincerely hope you read the documentation regardless to verify you aren't hallucinating. At best you'll just get the compiler mocking you, at worst you might find yourself responsible for a 737 MAX.
Oh, I still end up in the docs and have them open all the time. I don't generally use it for AI code completion. Maybe what's needed is just better doc search (that understands context and intent).
I have developed a super effective prompt that saves me a ton of time working with SQL:
What is wrong with this code? Assume the tables exist.
And then I don't have to spend the time finding out that I placed the JOIN after the WHERE clause, or that I should use a single = sign when checking for equality, or whatever.
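For what it's worth, this is roughly how it looks in use, with a made-up broken query of the kind described above (table and column names are purely illustrative):

```python
# Hypothetical buggy query: JOIN placed after WHERE, and "==" instead of "=".
buggy_sql = """
SELECT o.id, c.name
FROM orders o
WHERE o.total > 100
JOIN customers c ON c.id == o.customer_id
"""

# Build the prompt described above and hand it to whatever chat model you use.
prompt = f"What is wrong with this code? Assume the tables exist.\n\n{buggy_sql}"
print(prompt)
```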
You are welcome to your belief, of course. I work as a consultant, and if we are more effective than others, we get more business.
I am retired and only program and write books for fun now, so my experience is likely not relevant: I have used Copilot with Emacs and VS Code for the last year and it truly does save time. When I write, I like to feed several pages into ChatGPT or Anthropic’s Claude 2 and ask things like “any suggestions for new topics for me to write about?”, etc.
I find the technology useful. Less useful, but fun, are open-source projects like Open Interpreter using local models.
I am not totally disagreeing with you, the level of hype is epic.
You may disagree with the post, but don’t dismiss the blog or the author, as I genuinely think it’s one of the best technical Mac blogs around (also he makes lots of useful free utilities to download).
I bought a refurbished M1 Mini and don’t use Siri etc. either; I’m mainly concerned about the performance of text searches/analysis in the likes of fzf, JetBrains software, and so on. Haven’t heard the fan once yet.
For most people an M1 is a means of accessing websites; even as a developer, my dev machine has become a gateway to virtual machines, and I could do the same thing on a 2008 MacBook Pro. For the few who use ML features of apps like Photoshop there may be the occasional extra wait, but it’s not like rendering a video, is it? It’s 10 seconds vs 15 seconds.
Wouldn’t most of the compute be on the GPU anyway? So the specifics of the ARM instruction set implemented on the CPU cores don’t matter very much.
Maybe the M2 GPU is also better at bfloat16, but the author admits to not having researched this. So I wouldn’t make any far-reaching conclusions based on this one difference in the CPU instruction set.
bfloat16 is not used for graphics, where FP16 is appropriate, so all GPUs that are new enough to support 16-bit floating-point numbers support FP16.
For bfloat16, there exists only one application, ML/AI, because it is the only application that can tolerate such low numeric precision.
AFAIK, the first device to provide hardware support for bfloat16 was the Intel Cooper Lake CPU, launched in Q2 2020. AMD added bfloat16 support in Zen 4, in 2022.
If any GPU has added hardware support for bfloat16, it must be a model launched in 2020 or later.
Support for bfloat16 is much more likely to be provided by dedicated ML/AI accelerators.
Some compute might be on the AMX units (dedicated matrix multiplication coprocessor, closely attached to the CPU, distinct from both ANE and GPU). They gained bf16 support in M2.
Interesting. People with older hardware might not be able to run new software or process data at the same speed or efficiency as people with the latest hardware.
I assume RAM will also be a major constraint for on-device ML; most Macs still default to 8GB (with absurd bumps in price to upgrade), so I'd guess some very high percentage of Macs sold in the last 2 years are going to be quite bad for any major new software stuff Apple releases in 2024. That is assuming they are even supported at all; if Apple follows their iPhone playbook, they might just release the new stuff on late-2024 Macs.
Never use Siri, rarely use Spotlight beyond app/file name completion to launch things. I'm on my fourth or fifth MacBook Pro over 15 years, having been a ThinkPad person before that. I'm not interested in an AI assistant; it's an overclocked terminal to SSH onto bigger boxes and a word processor. It runs Emacs plenty fast.
…well, if it runs. Because all these predictive-auto-whatever features also break things: e.g., I have a bug in Apple Mail [1] which basically breaks "entering text into a computer using a keyboard" – a problem I would have thought was solved some 70+ years ago, but alas, here we are…
It is a work laptop. My own are ThinkPads. I prefer the ThinkPads for most things.
Cons of the MacBook:
Terrible selection of ports, terrible trackpad (no middle click as standard, right click is unreliable and needs a clumsy gesture which fails more often than it works), AWFUL keyboard, zero repairability or upgradability.
I would not wish to spend my own money on one of these things.
I am not saying you are wrong. I am merely saying what you value is not universal.
For my current work project, I use a Mac Mini mostly to SSH into various boxes running Linux, both x86 and ARM. The docs and comms are on the Mac, though.
Funny enough, the Mini stays at 10 W (it's an M2) while the x86 box under my desk idles at 20 W with no monitor connected...
This doesn't sound right. The argument is that its ISA doesn't support bfloat16, but the Neural Engine does support float16. They seem to think bfloat16 is a better format, which isn't really true: float16 is better; it is just more expensive to convert to/from float32.
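To make the conversion point concrete: bfloat16 is just a float32 with the low 16 bits dropped (1 sign, 8 exponent, 7 mantissa bits), so converting is essentially a shift, whereas float16 has a different exponent width and needs re-encoding. A quick sketch in plain Python; the helper names are just for illustration:

```python
import struct

def f32_to_bf16_bits(x):
    # bfloat16 = top 16 bits of the float32 encoding; this sketch truncates
    # (real hardware usually rounds to nearest).
    return struct.unpack(">I", struct.pack(">f", x))[0] >> 16

def bf16_bits_to_f32(b):
    # Going back is just padding the dropped mantissa bits with zeros.
    return struct.unpack(">f", struct.pack(">I", b << 16))[0]

x = 3.14159265
print(bf16_bits_to_f32(f32_to_bf16_bits(x)))  # ~3.140625: float32 range, ~3 decimal digits
# float16 keeps 10 mantissa bits (more precision) but only 5 exponent bits (max ~65504),
# so float32 <-> float16 conversion has to re-bias the exponent and handle overflow/underflow.
```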
Tried Raycast but realised you can’t customise the GUI the way you can with Alfred – I have Alfred set up so it only takes up a minimal amount of space in the top corner of the screen (à la Spotlight pre-Yosemite).
Submitted feedback to the team but they were pretty dismissive about it.
It is not; the whole "stateful app in a launcher" paradigm is very clunky and slow in practice, and it can't handle files and incoming text in any meaningful way. But hey, they have an AI and subscriptions! This is what happens when you take VC money.
bfloat16 mostly matters for training stability, not for inference. The M1 is inefficient for training any reasonably sized model, and most inference efforts target even lower precision, 4-8 bits.
I was curious to see how they were going to get me to upgrade from my M1 Pro. I still don't see a need to replace this thing in the foreseeable future.
Shocking! New machines are ever-so-slightly better at doing new things! OMG
Is this what people have to tell themselves these days, to stay on fashion-driven upgrade cycles...? "Mate, I absolutely need a new mbp, what if Siri starts using a LLM and it gets 0.000001 second slower at replying that it didn't understand what I'm saying?!?”
I wish these bloggers weren't so keen to go along with FOMO-stoking vendors.
Not really. The article has about one line of hypothesis ("left behind" for "AI tasks"). The other 99.95% is background.
Besides, it uses "AI", so it looks like an attempt to join the current gold rush/religion.
Edit: Hey, I asked ChatGPT to rephrase the parent comment "in a way that will bring me upvotes on Hacker News". Here's what I got:
> Ah, the article seems to delve into the details of integers and IEEE-754 floats; perhaps providing links for those topics would be more efficient. But I understand, the author might be aiming to meet a specific word count!