SMT Solving on an iPhone (2018) (utexas.edu)
224 points by marcobambini on Aug 9, 2020 | 149 comments



Hah, funny to see my blog referenced. Yeah, cache misses are a huge part of SAT solving (modern solvers have prefetch code specifically for this loop, some seemingly following the code I wrote). And SAT solving is about 95+% of SMT solving. So, yeah, cache is where it's at.
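For the curious, the prefetch looks roughly like this. A minimal sketch with made-up types (Clause, Watcher, walk_watchers are hypothetical, not the actual solver source); __builtin_prefetch is a real GCC/Clang builtin:

  #include <cstddef>
  #include <vector>

  struct Clause;                        // opaque here; the real thing is big
  struct Watcher { Clause* clause; };   // hypothetical watch-list entry

  void walk_watchers(std::vector<Watcher>& watches) {
      for (std::size_t i = 0; i < watches.size(); ++i) {
          if (i + 1 < watches.size())
              __builtin_prefetch(watches[i + 1].clause);  // start the fetch early
          // ... inspect watches[i].clause here; by the next iteration that
          // clause is (hopefully) already sitting in cache.
      }
  }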

I once had an old but "enthusiast" i7 with a triple-channel memory setup, and "upgraded" to a consumer dual-channel i7 that was 2-3 generations ahead. It had the same performance(!) for SAT solving. Pretty eye-opening.


> SAT solving is about 95+% of SMT solving

This is theory-dependent, I would say. It is certainly true for bit-vector solvers that bit-blast for example. For other theories, however, this is not necessarily the case. Consider a simplex-based linear integer arithmetic solver for example. If there is a problem that is simple at the Boolean level, you may end up spending more time in the simplex solver.


Ah, good point. Still, the most used theories are mostly SAT-bound, BV (bitvector) especially. And the translation to SAT can play a huge part, so even in BV logics the SMT solver's abilities are super-important. But runtime in these logics is dominated by the SAT solver. I should play with more SMT solvers.

I work on the STP SMT solver along with a team of very dedicated people (https://github.com/stp/stp/). It's a QF_BV (quantifier-free bitvector) solver and it tends to do well in the competitions: https://smt-comp.github.io/2020/results.html So I'm probably a bit biased towards BV. For counting (e.g. https://github.com/meelgroup/approxmc) and sampling (e.g. https://github.com/meelgroup/unigen) it's also 99% SAT solver runtime.


At the end of the day, cache miss latency should be bounded by the speed of light and the distance to memory. It's entirely unsurprising to me that putting main memory on the same board as the CPU, plus the optimizations that a custom communication protocol between them allows, yields better performance. I wonder what a desktop machine designed for cache and memory latency would look like.


It's SRAM vs DRAM, not the speed of light. Go take a look at DRAM CAS latencies as clock speeds have increased. The ns/fetch has stayed pretty static (~10 ns) since DDR first showed up.

There have been slight improvements, but nothing like the orders of magnitude you see in other areas.
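To put rough numbers on the CAS point (typical retail parts, CAS latency only, so just an illustration):

  DDR-400   CL3:   3 cycles / 200 MHz  = 15 ns
  DDR4-3200 CL16: 16 cycles / 1600 MHz = 10 ns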

If you want to pay the power and cost of SRAM, it can be done, but it doesn't usually pencil out beyond what you see in caches today.


Not just the same board... the RAM is literally stacked right on top of the SoC. It'll be interesting to see what Apple will do on the desktop... maybe some stacked RAM plus more on the board? Having full control over the memory/bus architecture (and the OS) creates lots of opportunities for optimization.


Which 3 gens? Intel CPU performance has been flat-ish for 5 years or so, so that might not be super-surprising.


The only triple-channel i7s I know of are the Nehalem-based ones.


Has there been a study of the performance bottlenecks of modern SAT solvers? Some modern x86 CPUs have enormous caches.


Yes. It turns out that squeezing more performance _purely_ out of the silicon has mostly come from: (1) hand-rolled memory layouts that optimize placement, do bit-stuffing, and the like to take the most advantage of the cache; (2) clever data structures that cache certain elements of the pointed-to data in the data we read linearly, letting the cache prefetch them so we don't (always) have to dereference the pointer; (3) prefetching.

Note that (1) and (2) are much more important than (3). In particular, you can even arrange the data elements for (1) in the order you _expect_ them to be dereferenced in, and even if your "hit" ratio is only slightly better than zero, you can get a performance improvement. For this latter one, check out https://www.msoos.org/2016/03/memory-layout-of-clauses-in-mi...
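Here's a minimal sketch of the idea behind (2) -- not actual MiniSat/CryptoMiniSat code, all names are made up for illustration. Each watch-list entry carries a copy of one literal from the clause (a "blocking literal"), so the hot loop can often skip the pointer chase entirely:

  #include <cstddef>
  #include <cstdint>
  #include <vector>

  using Lit = std::int32_t;                 // hypothetical literal encoding
  struct Clause { std::vector<Lit> lits; };

  struct Watcher {
      Clause* clause;   // cold: following this is a likely cache miss
      Lit     blocker;  // hot: cached copy of one literal, read linearly
  };

  // assignment[l] == true means literal l is currently satisfied (made up)
  void propagate(const std::vector<bool>& assignment,
                 const std::vector<Watcher>& watches) {
      for (const Watcher& w : watches) {   // contiguous scan: prefetch-friendly
          if (assignment[static_cast<std::size_t>(w.blocker)])
              continue;                    // already satisfied, no dereference
          // only on this path do we pay for the cache miss on *w.clause
          for (Lit l : w.clause->lits) { (void)l; /* real work elided */ }
      }
  }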


Thanks for that resource, that is really interesting!

Although I'm quite new to the SAT field, I have to say I am quite impressed with the performance of cutting-edge SAT solvers. Looking at the source of some of them (like Glucose), I don't see that much low-level trickery. I wonder how much further down the stack one could go to try to get more out of the processor.


So getting rid of pointers and using Roaring bitmap indices would make them fly.


To be honest, can you redo it when the A14 chips come out in a device?


Apple Silicon has been pretty impressive as it outperforms many laptops and desktops in benchmarks.

This guy even used the A12 chip from 2018. The A13 is even faster and ships in all iPhone 11s and even the $399 iPhone SE (2nd gen).

I am eager to see how Apple Silicon will do in the upcoming MacBooks.


It's going to be awesome to see chips that actually utilize the board space specifically for the functions the end device is used for. I get that general-purpose boards are the de facto choice to suit the masses and are favoured by OEMs because they're easier to ship, but I think we're past the point of needing that; adoption is no longer a problem.

I signed up for the Apple DTK; hoping they approve more people abroad...


Note that this sort of software can be extremely sensitive to cache arrangement.

Not only to raw cache performance but also to small details such as data arrangement, line size, replacement policy, and coherency model.

Source: I worked on performance improvement for a related piece of software.



SMT solvers and other 'old AI' workloads depend mainly on memory performance under unpredictable branching, which the iPhone chips seem to be strong at. Compare that to 'new AI' workloads, which eliminate branch misprediction almost entirely.


> Compare that to 'new AI' workloads, which eliminate branch misprediction almost entirely.

Wait a few more years, and your statement will be wrong. :) So far, ML/DL workloads work with traditional dense matrices, which are hitting a scaling wall. The natural next step is sparse matrices, where branching comes back with a vengeance.
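As a concrete illustration, here's a hypothetical CSR sparse matrix-vector multiply (a sketch, not tied to any particular framework). Both the inner-loop trip count and the gather indices are data-dependent, which is exactly where branch predictors and prefetchers start to struggle:

  #include <cstddef>
  #include <vector>

  struct CsrMatrix {
      std::vector<double> vals;     // nonzero values
      std::vector<int>    cols;     // column index of each nonzero
      std::vector<int>    row_ptr;  // row i spans [row_ptr[i], row_ptr[i+1])
  };

  void spmv(const CsrMatrix& A, const std::vector<double>& x,
            std::vector<double>& y) {
      for (std::size_t i = 0; i + 1 < A.row_ptr.size(); ++i) {
          double acc = 0.0;
          for (int k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
              acc += A.vals[k] * x[A.cols[k]];   // irregular gather from x
          y[i] = acc;                            // trip count varies per row
      }
  }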


I have a hard time putting SAT, SMT, and at least some linear algebra algorithms into the AI category. They are all logic, just of a sort that’s far more mind-numbing than our usual data processing workflows.

I know the AI people get grumpy about all of the problem spaces that have been harvested or “stolen” and called conventional code, but in this case I’m not so sure that’s accurate.

It feels more like a bunch of people who wanted to use advanced math and logic jumped on the AI gravy train while it was going past, and then hopped off again when they had their fill.


SMT, SAT, etc. are absolutely AI. In fact, machine learning is just another field inside AI (albeit a famous and successful one). As someone who got his PhD doing research in "classical" AI, it's surprising that nowadays most people think AI is just machine learning.


+1. I am always surprised by the contrast: we used to do (and of course still must do) proper full-on formal verification of e.g. train-track scheduling systems, so that train accidents, and consequently hundreds of deaths, don't happen just because "sorry for the bug, ver 1.2 fixes it" -- but people are happy to sit in a Tesla going 120 km/h and let a well-tested, but not formally verified, deep learning system make decisions. BTW, I know it's not formally verified, because formally verifying deep learning systems is currently a very much unsolved problem for anything but toy examples.


I completely agree.

Hot take: if SMT and SAT are too simple to be given the coveted badge of AI, then shouldn't machine learning be called glorified curve fitting?

I feel that everyone is amazed by the results of the good examples of machine learning, as only those get popular. The spectacular failures and completely absurd results, which reveal that ML has no semantic understanding of the problem, are completely glossed over.


Is linear regression AI? How about logistic regression? Many people would consider the latter to be ML/AI while the former statistics.

Ultimately, I think the distinction is pretty meaningless.


I don't know if you know, but 20-odd years ago what was called AI at universities was SMT, Prolog, etc. People were doing stuff like expressing common-sense facts in logic and using programs to derive more common sense.


Academic computer scientists are like that.

Research priorities in the U.S. are driven by what the Department of Defense will pay for. DoD had some bad experiences with FPGAs, and now they want to build ASIC supercompilers.

One thing I find funny is that the production rules engines we have today (Drools, Clara) are dramatically better than the ones we had when production rules were popular (e.g. CLIPS, eMYCIN). Today we have improved RETE algorithms, hash indices, etc. Production rules only seem to survive in a few niches, one of which is inside banks; another is complex event processing.


AI is a wider field than you think. Hell, even the A* algorithm for path planning was considered AI.


New AI workloads also run on dedicated blocks, e.g. https://en.m.wikipedia.org/wiki/Apple_A11#Neural_Engine


I am always amazed how the user experience of an iPad mini (~USD 400) is much better than that of a top-end 13" Windows-based notebook (~USD 2700). I know I am comparing apples with oranges in terms of running processes and multitasking, but the foreground app experience is much better.


My uninformed hunch: there's likely a lot to be said for 1) requiring all apps to use a single modern GUI toolkit, and 2) optimizing that toolkit for the use case of touch, rather than having a single codebase that's supposed to solve all use cases while carrying 30 years of baggage.


Apple fires a lot of customers as well, to stay focused on a narrower set of problems.

And we spend a lot of time on the internet complaining about that (nobody likes rejection).


> optimizing that toolkit for the use case of touch.

iPadOS seems to be good enough with mouse pointer and keyboard input too now.


Most of it is running a single app at a time, with far less complexity than a desktop system.


Note that at any given moment there are usually a couple dozen daemons running in the background on iOS. And while the iPhone will only show one app at a time, the iPad can show a handful.


iOS optimizes here too, by limiting how much CPU/IO background apps can use. It's frustrating to use Chromecast (for example) on iOS since it's always disconnecting from the receiver.


I'm talking about background daemons from the system, which have different resource limits.


the background daemons are mostly memory-limited, not processor-limited.


Fair. And to be honest, many of them are present on macOS too, without Jetsam around to kill them ;)


Comparing apples to windows?


ios to windows; apple to microsoft;


I understand the relationships— it was merely a play on words.


Correction was not my intent. Apologies. I didn't really think through communicating my intent when I wrote the comment. I was actually hoping to elicit someone else chiming in with a clever comparison. Alas...


Dust to dust?


Unless you have to manage files, use a mouse, or install software that isn't on the app store.

I don't know what top-end laptops you were looking at, but I suspect there are more options there than the one you chose.


Mouse support is quite good as of April.

iOS 13 supports local USB hard drives and network shares well enough.


It supports mouse and file management poorly. File management on iOS is a disaster. I love my iPad Pro; I detest iOS.


The file browser is a new feature, right? Perhaps it will get better soon. The new Pro iPads seem like a push to replace some laptop usage, which would require a file browser.


The file browser supporting iCloud, third-party apps like Dropbox, OneDrive, and Google Drive, and on-device storage has been around for a while. USB drive and file share support came out last year.


Let's say you want to work in a Google Drive folder when making a file so that it's always backed up, like you would on a PC.

You open Procreate and have to share/import/export with Google each time. The automatic backup I use Drive for doesn't seem to work: the files aren't really backed up until they're re-shared with Google Drive, defeating the whole point of automatic backups while I'm working.

The file browser is a workaround that lets you drag files but the basic functionality is ruined.


Procreate should support the UIDocumentBrowserViewController

When you open a file within Procreate, it should give you an option within the app to choose Google Drive or whatever other storage provider you have installed, and load and save files to Google Drive from within the app.

It works for Office.


Can confirm it does not work in Procreate.


The API was introduced in June 2017 at WWDC. There is really no reason for Procreate not to support the functionality.


I really want to see Apple Silicon under some real-world load: running a current-gen AAA video game at 4K ultra settings while streaming with OBS, playing music, and having Chrome open with 100+ tabs. That's what my current PC does, for example, so I'm really curious about everything (performance, temps, etc.).

To some extent it's just hard to believe that the iPhone CPU is better at everything compared to the current-gen top-of-the-line Intel and AMD desktop CPUs.


> Chrome open with 100+ tabs

Bit off topic, but I don't use Chrome so I wonder: do you mean 100+ visited, i.e. active, tabs, or are you counting all of them? Or doesn't Chrome have a distinction between those cases like e.g. Firefox does (i.e. when starting the application it doesn't actually load all tabs until you activate them, so once opened it doesn't matter a lot performance-wise whether it's 10+ or 100+ tabs)?


I'm pretty sure that Chrome does have a distinction between loaded and unloaded tabs (suspended tabs, maybe?), but it definitely manages tabs in a different way from Firefox and Safari. Both Chrome and Firefox can trivially handle 100 suspended tabs on any machine; both would struggle with that many active tabs if the machine didn't have enough RAM to fit them all.

This metric really only has to do with the amount of memory that a machine has, in my experience.


Chrome will suspend/discard inactive tabs when memory pressure is high and reload them when you activate the tab again, so you can probably open 1000 tabs and leave them alone without much impact (this feature is exposed as an API and can be triggered by extensions: https://developer.chrome.com/extensions/tabs#method-discard ). But if you activated them all, I imagine it would consume too many system resources.


1. iPad Pro humiliates PCs and Macs on some media tasks,

2. We haven't seen any real product yet, only the dev kit. The new Macs will be much more powerful than any existing Apple Arm products. We'll see.


A small tip, if I may: you can separate the list items with a newline to make them show up on their individual lines:

  1. Like so,

  2. for example.
will render as:

1. Like so,

2. for example.


Which tasks are those? I have a 2020 iPad Pro and want to see this.


Lightroom on the 2017 iPad Pro vs a 2017 iMac.


I was under the impression modern browsers suspend background tabs to some extent?

But anyway, you're talking about a very specific use case that only partially generalises.

I'm typing this on a 2014 11" MacBook Air 6,1 (i5-4260U @ 1.40 GHz, 2C/4T) with 4 GB RAM running Windows 10, with two browsers (Chromium Edge and Firefox) plus 2D CAD and vector graphics programs running.


Apple likely won't optimize their own chips for your use case at first (I expect the iMac Pro or Mac Pro will stay Intel for at least two years), but for cost, low heat, and long battery life. Imagine a slim MacBook with no fan.


You mean... the 12" MacBook? It's already as slim as a phone and has no fan.


It has been discontinued. They might reintroduce it with Apple Silicon.


> I really want to see Apple Silicon under some real-world load: running a current-gen AAA video game at 4K ultra settings while streaming with OBS

AFAIK Apple CPUs are on par with high-performance desktop-class x86 CPUs, but only in integer performance. AAA gaming would probably perform badly, since it has plenty of floating-point operations.

edit: *in single-threaded performance


Apple CPUs are nowhere close to things like Ryzen or Threadripper when it comes to multi-threaded performance. Moreover, most modern games can make use of at the very least 4 to 8 threads nowadays, so I would expect Apple Silicon to fare poorly on most game benchmarks (assuming they were ported to ARM, of course).


Thanks, edited.


You left out running Z3 in your list of tasks. Did you read the article?


That's a bizarre comparison. The AMD CPU doesn't do OBS with AAA games; the GPU and VRAM do that.


CPUs matter for AAA games. Different CPUs get you different fps in these games, so clearly they are used.

And there's nothing bizarre in OP's real-world scenario. Mine is even worse.

I also have multiple Linux OSes running in Docker, Windows in a virtual machine with 4 CPU threads dedicated to it, and native Windows running Ubuntu over WSL2 performing some tasks. I don't turn any of that off while I'm playing a game and streaming.

for the downvoters: https://www.youtube.com/watch?v=PcYA-H3qpTI


AMD has APUs.


This. While I can appreciate that Apple is taking a huge risk in creating their own chips, I cannot believe, based on Apple's past, that the performance of the chips will be on par with current x86_64 chips. In all honesty, the vast majority of Apple's clientele for computers are businesses, who basically just use them for basic web browsing/office work, which isn't a high bar to meet.


"the vast majority of Apple's clientele for computers are businesses"

I'm sorry? What?!


I suggest you pick up an iPad and try doing something demanding on it.


I don't know of a single employee of a serious actual business that has been issued an Apple anything. Not a laptop, desktop, or phone. I don't know of a single person that, when given the choice of what to use for work, has chosen Apple unless they already were entrenched in the Apple ecosystem.

The vast majority of Apple's customers are Apple fans, people who merely wish to think different, nothing more, nothing less.

Edit: Don't downvote just to disagree. Comment and state your reasoning on why you think I'm wrong. If you work for a company that only issues Apple devices, please speak up with your experience on how this went, and, if possible, why the company didn't go with Windows or Linux computers instead.

For the comments that try to bring up software development roles: most software developers I know use either Windows or Linux on the desktop, and loathe OSX more than I do. Very few prefer OSX.

The few people I know that prefer OSX (and will die on that hill) are frontend developers.


If you're throwing around anecdotes, then my most recent workplaces allocated Macbooks, and Apple peripherals, to every non-technical person in the office (~2/3rds of the workforce). I know plenty of individuals who, upon starting a software engineering role, were immediately shipped Macbooks to use for work.

I wouldn't go so far as to say that this is proof that "Apple's main clientele is business", but it certainly wouldn't surprise me. And it's not true that offices don't allocate Apples to workers.


When I was at Facebook, the majority of employees used MacBook Pros for work (I was an exception because I worked on VR stuff, which was Windows-only). It seems like the majority of Silicon Valley big tech and unicorn startups have a large (if not majority) population of employees using Apple products for development. I understand that many employees prefer or need Linux or Windows, however.


Those seem to be the only places where MBPs are almost part of the dress code. The HN community has a disproportionate representation of such employment, and sometimes the Apple echo chamber shows up strongly because of it.

It even happened at Microsoft until the Surface program was finally given a proper go-ahead by Nadella. A lot of former MBP users seem to love the Surface Book, so they succeeded in making a functional yet Veblen good in the same vein as the MBP.


Amazon also has a disproportionate number of MacBooks even though we can choose either a MacBook or a Windows PC.

As for Surface sales:

https://appleinsider.com/articles/19/10/24/editorial-why-mic...


Apple's A12Z Under Rosetta Outperforms Microsoft's Native Arm-Based Surface Pro X:

https://www.macrumors.com/2020/06/29/apple-rosetta-2-a12z-be...


Yes, but how well does it perform against something that isn't their slowest (albeit thinnest and most portable, yet somehow not cheapest) model?

Apparently, a Surface with a Ryzen 4800U exists and is coming out really soon.

The comparison isn't realistic because Apple's acquisition of PA Semi was a brilliant move, and no one else makes an ARM design that is anywhere near that level of performance, not even Qualcomm. I agree with Apple's decision to put their ARM chips on the desktop as their answer to the Intel dilemma.

In this case, an ARM-vs-ARM comparison is not an apples-to-apples comparison: Apple does not license their design to anyone else, so the only way to make a fair comparison is to use the best commonly available part, which currently means Zen 2.

I expect a Zen 2 to crush a 2 year old Apple ARM, but I also expect a future Apple desktop and laptop design to keep up with Zen 3. I actually hope Apple can win the fight, but I also wish Apple would license their CPU to other manufacturers to promote the general adoption of more architectures in this space.


Lol calling a MacBook a Veblen good? Oh my lord you're a trip.


"A veblen good is a good for which demand increases as the price increases, because of its exclusive nature and appeal as a status symbol." -- https://www.investopedia.com/terms/v/veblen-good.asp


I know what it is. That's why I'm laughing at you.


> The vast majority of Apple's customers are Apple fans, people who merely wish to think different, nothing more, nothing less.

That is one of the most absurd takes I have ever seen by an Apple hater. Even haters usually acknowledge that Macs are used extensively in software development, scientific research, film editing, design work, and education. And Apple is the fourth largest PC maker in the world, so they’re also used just about everywhere else at least a little bit.


Do you consider software engineering to be a "serious actual business", or are you going to gatekeep it to match whatever your notion of one is until they can only use software that is Windows-only?


> I don't know of a single employee of a serious actual business that has been issued an Apple anything. Not a laptop, desktop, or phone. I don't know of a single person that, when given the choice of what to use for work, has chosen Apple unless they already were entrenched in the Apple ecosystem.

What are you doing on this website because you clearly don't work in software.



Of course DiabloD3 chose to ignore this.


https://news.ycombinator.com/newsguidelines.html

> Please don't comment about the voting on comments. It never does any good, and it makes boring reading.


The most rabid anti-Mac people I've met, when pressed, knew zero or close to zero about MacOS. Basically, fear-of-the-unknown was what it was.


At least in the tech world, I have found this varies massively by location. Every single job I’ve had in San Francisco has been 99% Apple MacBooks in use by all eng employees and I’ve had to fight really hard to be allowed to use a PC. But I’ve also had plenty of eng jobs outside of the bay area and those tend to be on PCs with the occasional Mac user, often stereotypically a designer. It’s kind of astounding how completely different the world is when you get outside of the bay area bubble, and most of my tech friends in San Francisco don’t seem to understand that, and give me crap for preferring to use a PC.


Every job I've ever had has issued me a Mac. Mac Pros are heavily used by academics and Macbooks by developers. I would have preferred Linux myself, but OSX is way better than Windows, that's for sure.


lol, apple fans chuckle. Typing this from my work macbook.


Are smartphone architectures really this amazing, or is Intel just dropping the ball so hard that they're getting lapped by mobile hardware?


Intel has focused mostly on data center performance, where you are operating at scale and frequently using entirely different algorithms than people usually use on a PC or phone. For instance,

https://en.wikipedia.org/wiki/Column-oriented_DBMS

dominates the competition for ad-hoc analytics under "enterprise" conditions. There the point is to maximize the use of memory bandwidth and never stall the pipelines, to turn a commercial data processing job into something that looks like HPC to the memory bus.

(Actually, columnar organization improves performance reliably even for small data; video game programmers in the 1980s knew that it was better to put the x coordinates of all the spaceships together, then put the y coordinates together, then put the sprite IDs together, ...)

Most muggles, though, go with the row orientation that comes with structs, classes, and record types, and don't care. Pythonistas know their language is slow, but they know enough to use pandas, which is an object-oriented interface to an in-memory column store.
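In code, the contrast the 1980s game programmers (and every column store) exploit looks something like this hypothetical C++ sketch:

  #include <vector>

  // Row-oriented ("array of structs"): each ship's fields sit together.
  struct Ship { float x, y; int sprite_id; };

  // Column-oriented ("struct of arrays"): all x's together, all y's together.
  struct Ships {
      std::vector<float> x, y;
      std::vector<int>   sprite_id;
  };

  // Scanning one field touches far fewer cache lines in the columnar layout,
  // and the access pattern is perfectly sequential.
  void move_right(Ships& ships, float dx) {
      for (float& xi : ships.x) xi += dx;
  }

  void move_right(std::vector<Ship>& ships, float dx) {
      for (Ship& s : ships) s.x += dx;  // drags y and sprite_id through cache too
  }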

Intel has sold its performance on the basis of what a sophisticated organization could get out of complex programming. For instance, the AVX-512 instructions can nearly double performance for some workloads. If you are a cloud provider that will rack thousands of identical machines running the same software, it can be a big win. The average PC user runs software that has to run on a wide range of hardware and may be supported by lower-tier organizations. They will rightly privilege having a trouble-free product and low support costs over giving every user the highest performance possible.


This -- consumers basically get the 'leftovers' of designs made for data centers, while Apple will probably optimize for consumer use cases.


Columnar is just a specialisation, like graph or document. It's not a general-purpose improvement on relational.


From the viewpoint of the relational model, it is an execution strategy. Many columnar databases return row-wise results via ODBC or some similar interface, and they look like relational databases.

The difficulties of doing columnar style queries for document and graph databases led at least one person I know to tell a certain three letter agency that they couldn't have the panopticon machine they wanted.

For online transaction processing, rows are good, but for OLAP the row layout is not even competitive.


From what I've seen, it's a combination of Intel dropping the ball hard, and Apple aggressively pushing to make better and better chips for the iPhone and iPad—and, now, for their computers.

This isn't something that you're going to see in "mobile hardware" more generally, I don't think, at least not unless Intel continues to be unable to find its balls for a few more years.


The smartphone architecture is smart and well executed, but it is nothing revolutionary. The key is definitely that Intel has lost its manufacturing lead, which is letting other chip companies release competitive CPU architectures. Moore's law ending has leveled the playing field for ASICs.


> Both systems ran Z3 4.8.1, compiled by me using Clang with the same optimization settings.

Hypothesis: LLVM's AArch64 backend has had more work put into it (by Apple, at least) than LLVM's x86_64 backend has, specifically for "finishing one-off tasks quickly" (as opposed to "achieving high throughput on long-running tasks.")

To me, this would make sense—until recently, AArch64 devices were mostly always-mobile, and so needed to be optimized for performing compute-intensive workloads on battery, and so have had more thought put into the efficiency of their single-threaded burst performance (the whole "race to sleep" thing.) I'd expect, for example, AArch64-optimized code-gen to favor low-code-size serial loops over large-code-size SIMD vector ops, à la GCC -Os, in order to both 1. keep the vector units powered down, and 2. keep cache-line contention lower and thereby keep now-unneeded DMA channels powered down; both keep the chip further from the TDP ceiling, and thus keep the rest of the core that is powered on able to burst-clock longer. In such a setup, the simple serial-processing loop may potentially outperform the SIMD ops. (Presuming the loop has a variable number of executions that may be long or short, and is itself run frequently.)

x86_64 devices, meanwhile, are generally only expected to perform compute-intensive tasks while connected to power, and so the optimizations contributed to compilers like LLVM that specifically impact x86_64 are likely more from the HPC and OLTP crowds, who favor squeezing out continuous aggregate throughput at the expense of per-task time-to-completion (i.e. holding onto Turbo Boost-like features at maximum duty cycle to increase mean throughput, even as the overheat conditions and license-switching overhead lower modal task performance.)


The LLVM x86_64 backend gets just as much work, and HPC is just one of the reasons. The NVIDIA PGI compilers, for example, use LLVM as a code generation backend, and that's just one commercial option.


This seems like a trivially testable hypothesis. Compile the code and look at the instructions generated. No need for elaborate speculation.


> So, in a fit of procrastination, I decided to cross-compile Z3 to iOS, and see just how fast my new phone (or hypothetical future Mac) is.

> I bet the new iPad Pro’s A12X is even faster thanks to the larger thermal envelope a tablet affords.

IIRC Apple intends to complete the transition to Apple Silicon within three years (and they often tend to be conservative with their estimates), so I'd imagine the 2023 or 2024 Mac Pro will be interesting, given that (I'd assume) it won't have the thermal constraints an iPhone has. Thoughts?


They are planning to complete the transition in 2 years: https://www.apple.com/newsroom/2020/06/apple-announces-mac-t...

Makes your question even more interesting, though: what will a Mac Pro look like running Apple Silicon within 2 years from now?


Interesting. Is it possible that this particular workload takes a big hit due to the Spectre/Meltdown mitigations?


Yes. Mitigations drastically increase the cost of branch mispredictions, which have always been expensive on x86 and have gotten even worse. See also the sister comment.

Cache misses have the same problem.


I highly doubt the author compiled Z3 with those mitigations enabled, because they are opt-in compiler options. Perhaps if the program makes frequent system calls then the kernel’s mitigations would introduce overhead.


My impression is that the mitigations that would hit hardest are Intel microcode patches that load at boot time, and it may not have been obvious to revert those. Or perhaps not wise to, depending on the production environment.


Why? If performance matters, don't browse Reddit on your lab machine.

I don't run antivirus on my microwave oven. There's no need.


Support for most hardware depends on applying the latest patches for HW and SW. Only if you explicitly bought it as an HPC or lab environment, with the appropriate contract specifying performance, might you get around that.


> I don't run antivirus on my microwave oven. There's no need.

Oh brother, I got some bad news for you. It's called the Mirai botnet.


> I don't run antivirus on my microwave oven. There's no need.

Only as long as you have not connected it to any network -- "smart" appliances have been used for botnets often enough by now.


The article is older than these mitigations, so the Intel chip probably fares even worse nowadays.


Are you sure? The article is from late 2018. Spectre is from early 2018. I would assume some microcode patches were in place. I recall several waves of patches of various types.


Indeed, there could have been some overlap, sorry.


> Indeed, after benchmarking I checked the iPhone’s battery usage report, which said Slack had used 4 times more energy than the Z3 app despite less time on screen.


The A12 has about 3-4 times the number of transistors of the i7-7700K. I know a lot of the A12 SoC is things other than the CPU cores, but the cores still get a pretty large transistor budget. It was made on a 7nm process vs the 14nm Intel was using when the 7700K came out. Apple did a good job designing the chips, but TSMC is a huge part of why these chips are so good. For the longest time Intel's manufacturing process was way ahead of anyone else's, which was reflected in their performance. If you give me way more transistors, I can probably make a better chip.

If you give me more cache, decoders, execution units, etc., I can make a faster chip.

Also, if you used all cores for a sustained amount of time, the Intel would win out just because it can handle all that heat.


I think the dimension you're missing is that the A12 also consumes significantly less power. While we don't know exact numbers, the author speculates it draws about a tenth of the power of the i7.


It's the cache that makes this fast, not the higher-clock, higher-voltage cores on an older process.

Not that x86, particularly in 2018, was the pinnacle of perf per watt, but the power dimension is non-linear in some of these variables, and this particular benchmark doesn't happen to gain much from it.


Because it's running at a much higher clock rate and it's on a much older process. The Apple chip isn't running 4 big cores either.


(2018)


Makes me wonder what the article would look like today, or next month when the new generation of iPhone hardware comes out. I guess benchmarks like Geekbench can give us a general picture.


The author's code is open source[0] if anyone wants to take a look.

[0]: https://github.com/jamesbornholt/z3-ios


On the topic of the portability of Z3, does anyone know why this example runs fine on native amd64 but fails to solve when built with emscripten/wasm?

https://github.com/sielicki/z3wasm

Do a build by running, “mkdir bld && cd bld && emcmake cmake ../ -GNinja && emmake ninja”


Not particularly apropos to the thrust of the article, but I want to apply SMT to some tricky infosec use cases. Is there a good place to start for someone without an academic background in math/logic/etc?


I suggest you start with this paper:

"SMT Solvers in Software Security", Julien Vanegue, Sean Heelan, Rolf Rolles

https://www.usenix.org/system/files/conference/woot12/woot12...


Looks perfect, thank you!!!


The chapter in Norvig on it may suffice as a self-contained introduction. There is a book called SMT by Example (IIRC) which is good.


Will definitely check out Norvig. Not sure if this is what you had in mind but the linked PDF appears to be a treasure trove, thank you!!

https://news.ycombinator.com/item?id=19075599


Yep. Knuth also has a chapter on it.


For anyone trying to reproduce it on an Android phone: you don't need to compile anything, just "apt install z3" in Termux. It will probably be pretty slow, but it will work.


Z3 is not available on Termux (I am pretty sure it has been before).

Back to the issue at hand: it is possible Z3 is faster on arm64 due to the AArch64 bit-field instructions, but I find that unlikely.


It was only recently merged, but it is available:

https://github.com/termux/termux-packages/tree/master/packag...


Sounds like it, thanks. Had to update the Termux app first to see it.

Yes, Z3 performs slightly better on aarch64. But performance seems to be mostly cache-dependent.

Also, Z3 is single-threaded, which would favor Apple with their fewer but faster cores compared to others with more but weaker cores.


Hmmm... I think my autocorrect had a meltdown.

Hopefully people can figure it out anyway.


It's a very fun suggestion to put an iPhone in cold water. Imagine a cluster of iPhones working underwater on math problems :)


When Apple releases ARM Macs I think we are all going to be really impressed. Take these phone chips, beef them up a bit with more cores and maybe more cache, clock them a bit higher, and give them better cooling so they can sustain higher speeds.

If it goes how I think it will go, x86 is done. People will really start wanting ARM on servers.


I won't be very impressed if I can't install Linux on them after Apple deprecates macOS support on older models. I already can't do that with old iPhones and iPads.


People have been wanting ARM servers since the first ones were announced. But there are two big problems. First, there is no unified boot and device-initialisation standard, so you cannot just put in a bootable USB stick and expect it to work. Second, you cannot reliably order any hardware, just very small production runs barely above the prototype stage...


> First, there is no unified boot and device-initialisation standard.

There is. UEFI + ACPI.

> So you cannot just put in a bootable USB stick and expect it to work.

That's how it works today for Arm servers. Windows on Arm platforms also use UEFI + ACPI, and ship with the bootloader unlocked.


Unfortunately, there are no other ARM designs that can compete with x86. And Apple CPUs will only run in Apple hardware executing Apple software.


Have you looked at AWS' new Graviton2 instances? They compete handily with x86-64 instance performance, and are cheaper to boot.

https://www.anandtech.com/show/15578/cloud-clash-amazon-grav...


Graviton can't compete with x86. It can only compete on AWS on server workloads. I can't buy a Graviton workstation/server. There are no Graviton laptops.


The first part made sense but the second part didn’t. The server people won’t be waiting. If they could profitably port their workload to another platform then they would just do it.



Could be that Apple ARM chips have certain instructions that are more optimized.


Beware: Apple devices are not general-purpose computers. Whether you can use them as such may be subject to change.



