I wonder what the ideal core count is for a development workstation. While it's obvious that going from 4 cores to 12 is useful, it's not obvious to me that going from 12 cores to 32 is similarly useful.
Sure, if I'm compiling a huge project from scratch, every core will be loaded, but that's a job better suited to a CI machine.
For everyday development people usually use incremental compilation, and in my experience that touches only a few files. So something like 12 cores looks like the sweet spot between single-core and multi-core performance (high-core-count processors usually come with lower frequencies).
I have codebases which scale almost linearly with cores. I built a 56-thread Xeon machine; one of my projects compiles in two minutes with single-threaded make and in two seconds with all 56 threads going.
Not the OP, but we have big projects written in C containing hundreds or thousands of source files and builds scale almost linearly, at least up to 96 cores (I tested on ARM servers).
You have to be careful about how you write Makefiles: don't recurse into subdirectories with sub-makes, write a single non-recursive Makefile instead. And you sometimes have to split up large source files.
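A minimal sketch of the non-recursive pattern (the file names here are hypothetical): every object file becomes a node in one top-level dependency graph, so `make -j` can schedule all compiles at once instead of serializing at each sub-make boundary.

```make
# One top-level Makefile, no $(MAKE) -C subdir.
SRCS := src/core/a.c src/core/b.c src/net/c.c   # hypothetical source list
OBJS := $(SRCS:.c=.o)

app: $(OBJS)
	$(CC) -o $@ $(OBJS)

%.o: %.c
	$(CC) $(CFLAGS) -c -o $@ $<
```

With a recursive layout, `make -j96` can only parallelize within whichever subdirectory it happens to be in; with one graph it sees every pending compile at once.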
It's usually a good idea to do a test build with `make -j1 2>&1 | ts -i '%.s'` (the "ts" utility from moreutils timestamps each line), then sort by which step takes longest and try to break it up or parallelize it.
An example is when you are developing/testing distributed systems. Many tests which used to require a cluster to run can now be run on a single machine, which helps a lot with debugging.
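A toy sketch of that idea, assuming nothing beyond the standard library: each "node" is just a local process listening on an OS-assigned localhost port, and one test process talks to all of them, so no real cluster is needed. The echo protocol here is made up purely for illustration.

```python
import multiprocessing as mp
import socket

def node(ready: "mp.Queue") -> None:
    """One 'cluster node': a TCP echo server on an OS-assigned localhost port."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    srv.listen(1)
    ready.put(srv.getsockname()[1])   # tell the test which port we got
    conn, _ = srv.accept()
    conn.sendall(b"ok:" + conn.recv(64))
    conn.close()
    srv.close()

def run_cluster_test(n_nodes: int):
    """Start n_nodes 'nodes' as local processes and ping each one."""
    ready: mp.Queue = mp.Queue()
    procs = [mp.Process(target=node, args=(ready,)) for _ in range(n_nodes)]
    for p in procs:
        p.start()
    ports = [ready.get(timeout=10) for _ in procs]  # wait until all are listening
    replies = []
    for port in ports:
        with socket.create_connection(("127.0.0.1", port), timeout=10) as c:
            c.sendall(b"ping")
            replies.append(c.recv(64))
    for p in procs:
        p.join()
    return replies

if __name__ == "__main__":
    print(run_cluster_test(3))  # three 'nodes', one machine
```

The nice part for debugging is that every "node" is an ordinary local process you can attach a debugger to, which a remote cluster doesn't give you.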
Personally I am very excited for a couple years from now when prices on used dual socket epyc servers come down because I want a workstation with 128 physical cores. It will be glorious.
The reasoning is that your code, when it's ready and in production, will run on machines with more cores than were available last year. If your code already faces its future runtime environment today, you'll have fewer surprises later.
A second reason is that machines with many cores degrade more gracefully under high load: you may be compiling across a dozen cores, but you'll still be able to search Stack Overflow or answer your email.
You may also want to consider higher memory bandwidth: 8x8GB DIMMs populate more channels than 2x32GB, so they deliver more bandwidth (assuming the platform actually has that many channels).
Machine learning will eat all the CPU/GPU/memory/disk you can possibly throw at it. Right now I'm training a deep learning model with a lot of preprocessing, and all my Threadripper and Titan RTX cores are at 100%. If you use Docker a lot, with multiple independent microservices/containers running, or VMs, or you just compile large codebases, the more cores the merrier (unless some CPU bug prevents them from scaling properly).
Making programs concurrent is hard, especially if you don't have good support from the language. I, too, would hesitate before attempting to make a highly concurrent application in PHP, Python, or Visual Basic. And even then you're looking at considerable investment rewriting stuff that already works.
Maybe it works out if the cost savings from being able to purchase fewer machines outweighs the rewrite and the risk of new concurrency bugs.
That's the trick: they can't make single cores faster so they just throw more cores at the problem and call it a day. Obviously that doesn't work but who cares? It looks good in benchmarks.
Are you sure they sacrifice speed for security? It's not obvious to me. Hardware fixes are supposed to be fast; that ~20% slowdown comes from the software mitigations (and you can actually disable those).
And yet, GPUs have hundreds to thousands of cores, and we find them very useful...
There are many, many useful tasks which are embarrassingly parallel, or nearly so. For those tasks, doubling the core count doubles the performance. Single-core gains have stagnated, so a performance doubling is a huge win; there's no other way to just up and double performance for any class of problem.
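The doubling claim is easy to demonstrate for an embarrassingly parallel task. A toy sketch (the `burn` workload is made up for illustration):

```python
import multiprocessing as mp
import time

def burn(n: int) -> int:
    """Stand-in for one independent chunk of work (hypothetical workload)."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def run(chunks, workers):
    """Process every chunk with `workers` processes; return (seconds, results)."""
    start = time.perf_counter()
    # Processes rather than threads: CPython threads share the GIL,
    # so CPU-bound work only scales across processes.
    with mp.Pool(workers) as pool:
        results = pool.map(burn, chunks)
    return time.perf_counter() - start, results

if __name__ == "__main__":
    chunks = [500_000] * 8          # eight fully independent chunks
    t1, r1 = run(chunks, 1)
    t4, r4 = run(chunks, 4)
    assert r1 == r4                 # parallelism changes nothing but the wall clock
    print(f"1 worker: {t1:.2f}s  4 workers: {t4:.2f}s")
```

With four free physical cores the second run typically lands near a quarter of the first; past the physical core count the curve flattens, which is exactly why "double the cores" only means "double the throughput" for this class of task.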
And even outside those tasks, more cores means more simultaneous heterogeneous workload, which means deeper and richer information pipelines. If you're live editing audio or video, for example, your core count determines the number of plugins/tracks you can work with simultaneously - and it's not unusual to have hundreds of tracks and dozens of layered plugins.
That's not exactly true. The GPU equivalent of a CPU core is the Streaming Multiprocessor (in NVIDIA terminology), and its current GPUs top out at 72 of them (the full TU102, as in the Titan RTX).
Each of these SMs can run hundreds of threads, but they run the same code in lock-step.
That has never been true: on NVIDIA hardware only the 32 threads of each warp run in lockstep, certainly not all threads in an SM. And starting with Volta, each thread even got its own program counter and call stack.
Ah good, a response instead of an unhelpful downvote or 2.
I guess it's from experience: when there's a bottleneck in my work, and in many other areas I've seen, it usually comes down to a lack of RAM pushing the machine into the virtual-memory system (paging).
A lack of CPU is less crippling in a way: the CPU just divides n ways and things run proportionately slower. If paging happens, it can easily be far worse, because disk is so slow (OK, SSDs may ameliorate that these days; I have no experience there).
So why? Experience suggests lack of RAM is more common than lack of CPU, and its effect is worse. Of course the best thing to do is examine your own system before taking anyone's advice, including mine.
I would take 64 GB of memory and 4 cores over 16 GB and 12 cores any day. The idea of many cores is simply less preposterous than it was even a few years ago.
Oh heck yes, at least IME. I work in DBs, and a company I worked for was renting 16GB machines for the service they hosted for their clients. I almost literally arm-twisted them into upgrading to a 64GB dev machine. When they saw how that ran, within 3 days they'd started rolling out 64GB machines to their clients. The perf difference was huuuge, because the hot dataset could finally fit into RAM.
As ever, it depends on one's needs, but most often that need is greater for memory. IME. YMMV. Measure first as always.
If you are a DB developer, you can now have a machine with close to the number of cores on your prod/staging instance. A DB will use all available cores unless you restrict it.
I'm curious about the 280 watt TDP, especially for the 16 core part. The 16 core Threadripper 2 had a 180 watt TDP, so what are they doing with the extra hundred watts on a smaller process? Could these chips be running at much higher frequencies?
We might really be at the brink of no-compromise super high end workstation computing!
The Ryzen 3900X (and probably the 3950X) will aggressively limit all-core boost to fit in the TDP.
A Threadripper with a higher TDP limit (and better cooling) will be able to boost all cores to the same levels as a single core, which is ideal for workstation workloads.
Some of that TDP is also spent driving additional memory channels and PCIe 4.0 lanes. There were rumors of Threadripper having 8 channels now. And judging from the active cooling on X570 chipsets, PCIe 4.0 runs hot.
As a Threadripper user, this makes me happy. I don't care about the TDP too much, as long as I can compile large C++ codebases as fast as possible (for which a short-term all-core boost is very useful).
True, you'd need a pretty good water cooler to run at max all the time, but I would gladly trade electricity for performance. A lot of the stage gates in my compilation are single-threaded, so a highly-threaded chip that consistently boosts to high frequencies would be worth the initial and ongoing costs to me.