So, it's been a while since I tinkered with hardware, but I'm making the blind assumption that although the bus speed is twice as fast as DDR4's, the read latency in bus cycles also doubles, as has always been the case with DDR memory, meaning the net read latency improvement is basically nil.
Of course we expect every generation of Double Data Rate memory to double the data rate. But it's always been a bit misleading to say "twice as fast".
Latency is limited by the speed of electricity, which is a large fraction of the speed of light, so unless you move DDR closer to the CPU it physically can't get 2x as fast... ever.
light speed * 0.951 (electricity) / 2 (round trip) / 2 feet (CPU to furthest RAM chip along a wire) ~= 250 million cycles per second.
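Running the same numbers in Python (keeping my assumed 0.951 velocity factor and 2 ft trace, both of which are rough guesses):

    # Back-of-envelope: max "cycles" limited by round-trip signal propagation.
    # Assumed inputs: 0.951 velocity factor, 2 ft CPU-to-furthest-chip trace.
    C = 299_792_458              # speed of light, m/s
    VF = 0.951                   # assumed propagation speed along the trace
    trace_m = 2.0 * 0.3048       # 2 ft in metres

    round_trip_s = 2 * trace_m / (C * VF)
    print(round_trip_s * 1e9)        # ~4.3 ns round trip
    print(1 / round_trip_s / 1e6)    # ~234 million cycles per second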
However, cache is now huge, so latency is usually less of an issue than sequential reads/writes and total size.
Out of curiosity, why did you ballpark the length of the wire at "2 feet" (close to an order of magnitude high, way under 1 digit of precision) but specify the wave propagation speed along that wire to one part in a thousand (way more precision than appropriate: line capacitance changes significantly with trace width, PCB thickness to ground plane, PCB composition, etc.)?
Your point isn't invalid, but it's sort of misapplied. In fact the latency of DRAM on modern systems is dominated almost entirely by precharge time within the memory IC, which is about an order of magnitude slower than signal propagation. And it's not changing, either. DRAM cycles have been sitting at around 60 MHz for a decade and a half. We're dropping voltage as we shrink the cells, and so there's no net increase. Realistically signal propagation will never be the limit.
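To put rough numbers on that (the 60 MHz array figure is mine from above; the propagation delay is from the back-of-envelope calculations elsewhere in this thread):

    # Sanity check: DRAM array cycle time vs. signal propagation delay.
    array_cycle_ns = 1 / 60e6 * 1e9    # ~16.7 ns per ~60 MHz array cycle
    propagation_ns = 2.0               # ~2-4 ns round trip, per the thread
    print(array_cycle_ns / propagation_ns)   # array time dominates, ~4-8x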
I very much rounded the output to represent minimal accuracy, and while 9.51e-1 is not accurate to 3 digits, 1 digit of accuracy * 1 digit of accuracy = less than 1 digit of accuracy.
Anyway, if you actually trace the path from the furthest bit of a RAM chip to the CPU, it's longer than you might think: more than 1 foot but less than 3 feet.
As to my point, tCL is a round-trip latency and actually not that far from optimal. DDR is not designed for pure random access so much as cheap access to lots of RAM, so yes, there are many trade-offs, but they are more reasonable when you're close to hard limits.
Traces only go from CPU pin to module; the latency path runs from the memory controller to the CPU pin, CPU pin to PCB, PCB to module, module to RAM chip edge, and RAM chip edge to the actual memory location.
I keep hearing this, but it doesn't jibe with the RAM timings I see. tRAS is usually 50ns or so, and propagation delay is 2ns or so (see below). What gives?
My suspicion is that amplifying the differential voltages produced by femtocoulombs of charge at 100% reliability is a harder problem than moving the DRAM chips closer to the CPU, and that speed-of-light has gotten the blame in "pop architecture" due to sloppy overgeneralization.
2*2 feet is long for a memory bus, but it's partially cancelled out by the high velocity factor (sorry, "speed of electricity") of 0.951 in the parent's calculations. Instead, I'm going with 0.5*2 feet at a velocity factor of 0.5, for a 2 ns round-trip propagation delay.
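Or, as a quick script (my assumptions: 0.5 ft one-way trace, velocity factor 0.5):

    # Revised back-of-envelope with more conservative assumptions.
    C = 299_792_458            # speed of light, m/s
    trace_m = 0.5 * 0.3048     # assumed 0.5 ft one-way trace, in metres
    round_trip_s = 2 * trace_m / (C * 0.5)    # velocity factor 0.5
    print(round_trip_s * 1e9)  # ~2.0 ns round-trip propagation delay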
A major factor in DRAM latency is moving these charges around the chip (between the array and the sense amplifiers). The wires in the array have finite conductivity and current-handling capability, so this cannot happen instantaneously.
On the other hand, sequential access bandwidth is limited almost entirely by the external interface and has more to do with economic factors (pin count is a major part of IC price) than any physical limitations.
tCL is (very) roughly the RAM equivalent of sequential access time. I'm talking random access time, tRAS, which includes closing the last row (precharging the diff amps) and opening the next (waiting for the amps to stabilize). It's called Random Access Memory, so I think it's more fair to judge it by its random access time.
DDR is not really set up for random access, because it's set up for talking to L2/L3 cache, not registers. tRAS ~= tCL + tRCD + tRP, which is kind of, but not exactly, what you want for random access.
Yes, tCL is a round trip time for accessing an already open row, and it puts a sharp upper bound of ~5ns on propagation delay. Random access time, approximated by tRAS, is still ~50ns. You contend that tCL is the bigger problem, I contend that tRAS is the bigger problem. The key statistic is <#CAS/#RAS>, which I haven't been able to find easy values for.
I don't understand your argument that the L2 and L3 caches tip the balance in favor of <#CAS>. If anything, I'd expect them to do the opposite, because the SRAM and DRAM both aim to exploit locality, meaning that better L2 and L3 caches would reduce #CAS faster than #RAS. Of course, armchair reasoning about such complex systems as memory controllers and caches can only go so far, so I wouldn't be shocked to be wrong, but I would certainly demand actual statistics before changing my view.
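To make the disagreement concrete, here's a hedged sketch of how the row-hit ratio (#CAS/#RAS) trades tCL against tRAS; the 10 ns and 50 ns figures are the rough numbers from this thread, not measurements:

    # Average access latency as a function of row-hit ratio.
    # Assumed figures from the thread: ~10 ns open-row hit (tCL-bound),
    # ~50 ns full random access (tRAS-bound).
    T_HIT_NS, T_MISS_NS = 10.0, 50.0

    for hit_ratio in (0.9, 0.5, 0.1):
        avg = hit_ratio * T_HIT_NS + (1 - hit_ratio) * T_MISS_NS
        print(hit_ratio, avg)   # 0.9 -> 14 ns, 0.5 -> 30 ns, 0.1 -> 46 ns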
All this means is that we should stop thinking of this stuff as RAM. Only the L1 cache is really RAM. Everything else is just a kind of fast, volatile, solid state disk that just happens to share an address space with the RAM.
Yes, I always considered those inflated bus speeds a scam.
It is like having a subway line running trains every 2 minutes, but with only 1 train in 15 taking passengers, the rest going completely empty.
If you look at CAS latencies [1], it's clear that memory speeds stagnated a long time ago.
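As a rough illustration (typical module timings from memory; treat the exact CL values as assumptions):

    # Absolute CAS latency barely moves across generations.
    modules = [
        ("DDR-400",   400,  3),    # (name, MT/s, CAS latency in cycles)
        ("DDR2-800",  800,  6),
        ("DDR3-1600", 1600, 11),
        ("DDR4-3200", 3200, 16),
    ]
    for name, mts, cl in modules:
        clock_mhz = mts / 2             # bus clock is half the transfer rate
        print(name, cl / clock_mhz * 1000)   # 15, 15, 13.75, 10 ns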
That's not a very accurate analogy, since sequential reads do improve. CAS latency is one limiting factor on random reads, and it has been constant or only slightly improving for a very long time; the last significant improvements happened around the SDR-to-DDR transition (i.e. 20 years ago).
This is a central reason why cache size per thread has not changed much in the last 20 years (besides, of course, fabbing/yield limits). (And because caches are subject to an effect similar to Dennard scaling.)
(btw, I wonder if it would have been smarter of the memory industry to specify latencies in absolute numbers (or relative to the memory array clock, not the bus clock) rather than in bus cycles, since the rising latency numbers have been a marketing issue in every generation.)
If CAS latency hasn't changed for 20 years, and the memory wall in general is a huge pain, does anybody know why some sort of 2.5D memory (even using low-resolution lithography and carriers cheaper than silicon) hasn't entered the market?
If heat is the problem, and since the memory wall is a 20-year problem, has there been some new cooling tech? Or is it just standard industry inertia (and being satisfied with half solutions)?
No, it's a fundamental issue with the technology. Roughly, the ratio of power density to capacity-that-can-hold-information-for-a-refresh-interval stays more or less constant, independent of fab node. In fact, newer generations (DDR3, DDR4) have been pushing it here (cf. rowhammer).
I doubt we will see a drastic improvement in DRAM latency unless the underlying process changes a lot (i.e. different physical storage mechanism or very different manufacturing).
"Analysts didn't expect DDR5 to be developed"
Are those the same analysts who state that Apple will have an event at the same time as last year?
If an improvement can be done cheaply and easily, it will be done. Implementing a memory controller for DDR5 is probably a trivial job, very similar to current DDR4.
Integrated GPUs have been on the rise for a long time. This is good news for reusing current memory controller designs while providing higher bandwidth.
JEDEC officially specs DDR4 RAM up to 2133 MHz. But manufacturers seem to have left this far behind already, with Corsair and G.Skill shipping modules at 4200 MHz. I don't see why JEDEC doesn't just add a few more rows to the table of official speeds. The only real advantage for a new spec is to bump the density.
But that was the end of 2015; we are into 2017, and I have yet to see this again on the market. And they are likely very expensive.
I really hope DDR5 bumps the spec to 512GB rather than 256GB. Unfortunately, the DDR memory price per GB is actually on the rise. The only thing I could see changing the market is when all the fabs from China go online in 2018, with China continuing to pour tens of billions into it.
I have the impression JEDEC constantly underestimates what manufacturers can actually do...
Last year I bought DDR3 RAM, 2400 MHz.
So, when DDR4 started to be sold, capped at 2133, DDR3-2400 was already fairly easy to find, and it is faster than the DDR4 too, due to smaller latency. The few times I decided to enter my computer in some OC competition that relied on RAM, it ended up in the top 1% of the air-cooled machines, despite being neither an exceptional machine nor one prepared for OC. Intel XMP runs crazy fast on my machine; the OC people I talked to all said it was because of my fast DDR3 RAM.
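The latency arithmetic behind that (CL values are assumed typical kit timings, not my actual modules):

    # First-word latency: DDR3-2400 vs. early JEDEC DDR4-2133.
    kits = [("DDR3-2400", 2400, 11),   # assumed typical XMP kit timing
            ("DDR4-2133", 2133, 15)]   # assumed JEDEC timing
    for name, mts, cl in kits:
        ns = cl / (mts / 2) * 1000     # CAS cycles -> nanoseconds
        print(name, round(ns, 1))      # ~9.2 ns vs ~14.1 ns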
Right, but it's only an "overclock" because the official spec only goes to 2133. And that means that compatibility is very hit-or-miss. It could benefit from being more standardized.
Would the equivalent amount of DDR5 actually feel any different to DDR4 to the end user? It seems like hardware is getting better but basic software (web browser etc) is getting more complicated/bloated at the same rate, so things don't actually feel 10x faster than they did a few years ago.
Computers with good SSDs and not too much bloatware (relevant on Windows) do feel faster to me than computers used to be.
I think people can get nostalgic accidentally and overestimate how fast things were in the past. I remember multi-minute loads for games on my Commodore 64. I remember multi-minute Windows 3.1 bootups. I remember watching a JPEG progressively creep in on the early web. I remember watching my school projects in C++ take noticeable time to compile despite what we would now consider their absurdly simple nature. I remember when I was reluctant to click on a 1MB download.
But you do want to avoid having a modern Windows machine on a slow spinning-rust hard drive. Yeow.
Yeah I think you're right - moving from mechanical to SSD made more difference in perceived speed than more/better RAM and faster CPU. Boot times are for sure a happy thing of the past, even opening up programs like Photoshop used to take long enough for me to alt-tab into the web browser for a bit while it started up, whereas now it's instant.
I've recently switched from 8GB MBA to 16GB MBP and to be honest, I'm not sure I've noticed any more speed above what's to be expected from any brand new machine before it gets bogged down with shit. My MBA used to be able to handle multiple big programs running in the background - 2 of the Adobe suite + Chrome + small stuff, so I haven't felt much use for the extra juice day-to-day, though I imagine the RAM is useful for video editing etc.
"Only cheapskates and the stubborn would pass on an SSD."
Even a year ago I was still going "eeehhhhhh, weeelllll, maaayybee..." but sometime in the last year I'd say we crossed over for any professional. It's no longer a matter of paying $100 for a big-enough spinning rust or $600 for a too-small SSD, now it's more like "Do I want a really fast 512GB or a slow 4TB for the same price?", which are both as of right now ~$150 give or take $30 depending on your quality needs. (I just checked.) That's still quite the spread on size, but you can fit a lot in 512GB.
The people I had in mind, the ones asking me, would not know what an SSD is (the less technically literate): the people who go to Dell, HP, etc., trying to decide on a "good" laptop.
>Would the equivalent amount of DDR5 actually feel any different to DDR4 to the end user?
Most likely not at all. Browser speed has more to do with network speed and browser/website design than your hardware.
Browsers and websites still make poor use of multiple cores, and making sites perform well is a never-ending chase between designers adding more ads, images, and visual effects; browsers trying to figure out tricks to cope with that; and developers implementing those tricks.
Is the image of a DDR5 DIMM? Because from the look of it, it has some serious power management and storage components; it looks more like an NVRAM DIMM to me.
Edit: the image name has 8GB NVDIMM in it, so yeah this isn't DDR5.
How much difference is there between GDDR and regular DDR? I think that GDDR runs at higher voltages and temps than regular DDR. I'm assuming that if GDDR5 is available, then it's kind of inevitable that regular DDR5 would be coming at some point.
I thought the main difference with graphical RAM was that the monitor output DAC could read from memory at all times, regardless of what the GPU is doing with it. And GDDRn is not the same generation as DDRn; the last GDDRs (4,5) have been based off DDR3, I think.
Graphics RAM was dual-ported (=two readers at the same time) in the early days, but hasn't been for a long time. The concurrent accesses are managed by the GPU internally now.
Because LPDDR4 wasn't available in sufficient quantities, or wasn't supported by the platform, Skylake only supported LPDDR3, and the current Kaby Lake SKUs also do not support LPDDR4.
That's due to integrating the memory controller onto the CPU: you can't switch the memory controller to use a newer spec; you're stuck with whatever the CPU supports.
Games are a special breed of application since they delegate a lot of work to the GPU and often are GPU-bound, not CPU-bound. Other workloads often end up with the CPU waiting for memory accesses. The most obvious example, besides stream-processing, would be garbage-collected languages. Simply scanning the heap will benefit from all the memory bandwidth it can get.
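A rough way to see the effect yourself (a numpy array standing in for a heap; absolute numbers vary wildly by machine, the gap between the two is the point):

    # Sequential streaming vs. random gathers over the same buffer.
    import time
    import numpy as np

    N = 20_000_000
    data = np.ones(N, dtype=np.int64)
    idx = np.random.permutation(N)     # random access order

    t0 = time.perf_counter()
    data.sum()                         # sequential, bandwidth-bound
    t1 = time.perf_counter()
    data[idx].sum()                    # random gather, latency-bound
    t2 = time.perf_counter()
    print(t1 - t0, t2 - t1)            # random is typically several x slower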
Even if speeds don't improve perceptibly because the CPU is simply waiting for the human 99% of the time, it may still benefit laptop battery life due to increased power efficiency, with the CPU completing that 1% sooner and returning to low-power states.
> Games are a special breed of application since they delegate a lot of work to the GPU and often are GPU-bound, not CPU-bound.
Yeah, and the GPU (at least a dedicated one) has its own VRAM, so accesses for which RAM speed would otherwise be important end up relying on VRAM speed, I assume.