UltraRAM (tomshardware.com)
282 points by indigoabstract on Oct 3, 2023 | 140 comments



With durability claims of

> 10 million write/erase cycles

this is not going to compete with DRAM, which needs to endure trillions of write/erase cycles in its lifetime.

Unless they grossly underestimated its durability, a name like UltraFlash would seem more appropriate?!


They have tested it to 10 million cycles with no degradation, so that's where that figure comes from. It's not 10^7 before failure, or 10^7 before failures at some particular rate. The assumption is that it's somewhere higher than this, but you can't tell without more testing.

> The process was repeated five times, resulting in a little over 10^7 program/erase cycles applied to the device. As can be clearly seen in Figure 4d, there is no degradation of the ∆IS-D window throughout these tests, meaning that the endurance is at least 10^7.

https://onlinelibrary.wiley.com/doi/epdf/10.1002/aelm.202101...


Hmm. If this memory is faster than DRAM, wouldn't it be quick to test, say, ten trillion write/erase cycles?

Why stop at 10M? Is the erase operation really slow?


The paper says they tested the durability of the RAM with a 5 ms program-read-erase-read loop, meaning each program-read-erase-read cycle takes 5 milliseconds.

Ten trillion cycles at that rate would take over 1,500 years.

I'm guessing a silicon lab doesn't have "the rest of the computer" that would allow them to run this RAM at full speed constantly. This UltraRAM isn't something they can just slot into their motherboard.


5 ms is 200 cycles per second. 3600 seconds in an hour. 0.72 million writes per hour. Almost 40 million if I start it on Friday evening and stop it on Monday morning.

10 million is about 14 hours. It takes longer than that to prepare your documentation. Something is rotten in Denmark. A skeptic could very, very reasonably assume that cherry-picking is going on here, and that 10M to degradation isn't far off from the truth.
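
For reference, the arithmetic behind those figures as a quick sanity check (the 55-hour window is just an assumed Friday-evening-to-Monday-morning span):

  # Rough arithmetic for the 5 ms program-read-erase-read loop described above.
  cycle_s = 5e-3                        # seconds per program-read-erase-read cycle
  per_hour = 3600 / cycle_s             # 720,000 cycles per hour

  hours_to_10M = 1e7 / per_hour                        # ~13.9 hours for 10^7 cycles
  weekend = 55 * per_hour                              # ~40 million over a ~55 h weekend
  years_to_10T = 1e13 * cycle_s / (3600 * 24 * 365)    # ~1,585 years for 10^13 cycles

  print(f"{per_hour:,.0f} cycles/hour")
  print(f"10^7 cycles in {hours_to_10M:.1f} hours")
  print(f"~{weekend / 1e6:.0f} million cycles over a long weekend")
  print(f"10^13 cycles would take ~{years_to_10T:,.0f} years")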


These labs have their custom structures synthesized; adding a small circuit specifically for endurance testing would be trivial compared to what they have already achieved in designing and implementing those structures.


Could processors keep up with it?


Half in jest: if processors couldn't keep up with it, it would be used to replace flip-flops etc in processors...


That doesn't necessarily mean the processors would become any faster ...


It kind of does, all else being equal. What else do you have in mind? Parasitic capacitance on signal lines?


This is a common problem in memory testing. Oftentimes they use models to accelerate the wear through temperature, voltage, etc., and extrapolate the lifetime.


Indeed, the paper says:

> Assuming ideal capacitive scaling[33] down to state-of-the-art feature sizes, the switching performance would be faster than DRAM, although testing on smaller feature size devices is required to confirm this.

So, they have no idea of its performance. Yet.


Probably a case of don't ask questions you don't want answers to.


That's about the same durability as Intel Optane had, so the first thing it could do is replace Optane where that has been used.

Optane did inspire a lot of R&D into persistent data structures, databases and file systems that started to challenge the traditional model of local memory and persistent storage. IMHO, a few of those projects were a little bit overoptimistic, and used NVRAM as DRAM without many restrictions. For NVRAM to be viable, I think it still needs to have overprovisioning, wear levelling, memory-protection and transactions, provided by hardware and/or an OS but not necessarily with traditional interfaces. It is mostly a matter of mapping it CoW via a paging scheme instead of directly, and it will still be at near-DRAM speed.


That's basically the equivalent of a Flash Translation Layer, and having it removes the original selling point of making fsync() a no-op. At that point, persistent memory's only advantage over existing non-volatile storage is possibly higher performance.


To hell with fsync(), I'd want a proper commit()! ;)

The performance is so high that the assumptions that had led to the old file system interfaces don't apply any more. There is opportunity for something better.


There are software transactional memories for nearly all programming languages. A project I worked on, shipping since the mid-2000s, uses an in-memory STM for nearly all operations.


Sorry, what is commit() here?


Maybe

  begin() 
  try {
    work on
    several
    in memory
    data structures
    commit() 
  }
  catch {
    rollback()
  }
And all those data structures are either collectively updated or not.
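
A minimal Python sketch of that all-or-nothing pattern (a toy copy-and-publish scheme with made-up names, not a real STM or persistent-memory API):

  import copy

  class Transaction:
      """Toy all-or-nothing update over several in-memory dicts:
      work on deep copies, publish them only if nothing raised."""
      def __init__(self, *structures):
          self.live = structures

      def __enter__(self):
          self.work = [copy.deepcopy(s) for s in self.live]
          return self.work

      def __exit__(self, exc_type, exc, tb):
          if exc_type is None:              # commit(): swap the copies in
              for live, work in zip(self.live, self.work):
                  live.clear()
                  live.update(work)
          return False                      # on exception the copies are dropped: rollback()

  accounts = {"alice": 100, "bob": 50}
  ledger = {"entries": 0}
  try:
      with Transaction(accounts, ledger) as (acc, led):
          acc["alice"] -= 10
          acc["bob"] += 10
          led["entries"] += 1               # all three updates land together, or none do
  except Exception:
      pass                                  # nothing was published

With real persistent memory, the publish step would additionally need to be made crash-atomic (ordered and flushed), which is what the hardware/OS support discussed upthread is about.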


Transactional memory [1]; there are existing hardware and software implementations.

[1] https://en.wikipedia.org/wiki/Transactional_memory


None of which seem to perform well in the real world when you measure anything but CS papers published...


Why transactions?


Because it’s the best way to survive failures (such as loss of power). Transactions allow you to know that all your datastructures are in a consistent state.


It would also enforce that writes happen only when a program explicitly declares that it is necessary.

When writes are persistent and cause wear, the consequences of e.g. a common buffer overflow or use-after-free bug can be much higher than if they were not. Even an unoptimised loop that writes to NVRAM could be bad.


Trillions of cycles, what? How can that be?

Maybe I'm confusing something, but to reach a trillion cycles in, say, a year, would take overwriting all your memory 30 times a millisecond. That doesn't sound right?

Or is that trillions of any writes and erases?


DDR RAM is refreshed every 64 ms (varies by DDR generation and specific chips). Branch Education has an excellent video on this named "How does computer memory work?"[1]. It would still take an exceedingly long time to reach a trillion, but it's still pretty frequent.

[1](https://www.youtube.com/watch?v=7J7X7aZvMXQ)


1 trillion * 64 ms is over 2000 years; I think it's unlikely that there's any DDR RAM that old.


From just refreshing. You generally do other stuff than refresh the ram too.


You don't need to refresh non-DRAM memories though.

I agree that some regions risk being R/W more than others, so memory controllers should indeed perform some kind of wear levelling, but otherwise I find it hard to imagine trillions of overwrites across GBs (or TBs) of data. 1e6 cycles is definitely doable, and on the low side, even for flash devices. 1e9 is pretty good for general-purpose memories, and few applications require 1e12. Not even SRAM or DRAM have unlimited endurance, due to physical wear. It's hard to find a source on this, but I would probably hand wave it at around 1e15 cycles for DRAM? This would be 30 years of operation for one access every microsecond.


Of course, but it's still a huge order of magnitude difference to get to trillions.


Like, say, serving a million MMO users?


In comparison, 10 million would mean only about a week if it was refreshed every 64 ms. Even a billion would mean only 2 years' worth of cycles.

I think a trillion, or at the very least tens to hundreds of billions, is the right endurance figure for RAM.
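
Counting only 64 ms refresh cycles, for reference:

  # How long it takes to accumulate N cycles at one every 64 ms.
  refresh_s = 0.064
  for cycles in (1e7, 1e9, 1e12):
      seconds = cycles * refresh_s
      print(f"{cycles:.0e}: {seconds / 86400:.1f} days ({seconds / (86400 * 365):.2f} years)")
  # 1e+07: 7.4 days (0.02 years)
  # 1e+09: 740.7 days (2.03 years)
  # 1e+12: 740740.7 days (2029.43 years)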


Persistent memory doesn't need to be refreshed though so that's irrelevant.


This type of memory wouldn't need refresh so you can cut out all of those writes.


If you have to design for pathological workloads, absolutely you can write to a location in main memory 30 times per millisecond.

Lots of non-pathological workloads might write to a memory location every millisecond, such as a game with a 4-pass renderer running at 240Hz.


Even at 1 GHz, a trillion (10^12) writes is only 1000 seconds of work for a modern CPU. OK, latency is a thing, so multiply by 100 and it still takes only about a day. This is for DRAM where cells are individually addressed. For flash with wear levelling the numbers of course get bigger.


In practice a memory location being written to that heavily will never escape the cache unless you are doing something exceptionally weird.


Doesn't C have keywords like volatile to insist on reading from RAM?


Volatile requires that the compiler emit instructions that access the object. So if the object is in RAM, it will emit memory-access instructions. However, on modern CPUs, those will still hit the cache. You need to either map the memory as uncached, or flush the caches to force a memory access.


No, that won't work. You'd have to clflush after every store. And even then, the cache line might only ever get to the write pending queue (WPQ) - and that you can't control.


I would seriously doubt there are many instances of writing to a single volatile memory location at 1 GHz (excluding benchmarks).


Modern DRAM doesn't address individual cells. For both DDR4 and DDR5 the minimum burst length is 64 bytes, the width of a cache line of most CPUs.


I started to think about flipping a single bit in some process a million times per frame inside some loop, but that could only be done in cache…

Still, if you only changed the state of the memory once per frame, you would do it in RAM, not in cache. At 1000 FPS (we should consider the worst scenario even if rare) that's 3 hours of playing a game to reach 10,800,000 reads/writes.

Now the question is what happens when a bit gets damaged; perhaps the memory just marks it as damaged and uses another bit for that address from then on. Perhaps that makes the UltraRAM slower over time as more bits (sectors) get damaged?


Lifetime of microelectronics is often quoted at around 30 years, so a trillion cycles works out to roughly one every millisecond. For a refresh cycle that does not seem extraordinary.


I was thinking of overwriting just a few words of memory over and over again, which DRAM can endure for decades.


The clock frequency is GHz, which is a trillion cycles per second. There is at least one cache layer between the CPU and the RAM, but we are in the same ballpark. And yet it's OK for the typical lifetime of our computers.


GHz is Billion, not Trillion


(side note)

Until recently a billion was a trillion, or vice versa, depending on whether you're from the UK or the US.

A GHz is a GHz no matter where you are. :-)


Noob question: aren't the trillions of cycles including the 'refresh' reads/writes that wouldn't be necessary with persistent memory?


You can trivially exceed one million write cycles in only a second with a modern CPU, just by incrementing a shared counter (which cannot be cached).


> just by incrementing a shared counter (which cannot be cached)

That's not true: a shared counter (i.e., an atomic integer) is cached – in fact, there's no guarantee that its value is ever written back to system RAM.

You're probably thinking of non-cacheable memory: the kernel can set the MMU attributes of a memory page such that the CPU will avoid the cache when it accesses addresses in that page. This is completely independent of atomic accesses on memory locations [1].

[1] At least typically – there may well be CPUs which disallow atomic accesses on non-cacheable memory.


I assume by "which cannot be cached" he meant "which is in a page configured as non-cacheable in the MMU", i.e. exactly what you said


But then why would it have to be a shared counter? Any write to a non-cacheable memory location is transmitted to system RAM, it doesn't have to be shared with other cores, nor does it have to be a counter.


Because it is used by a peripheral.


Wear levelling on RAM isn't in use today to my knowledge, but I don't think it is technically impossible.

You would probably go for some approach where most memory addresses are direct-mapped, and then the few that have been written most are redirected to new addresses.

The reading of the direct-mapped addresses would be super fast, since you can do the read in parallel with the lookup in the remapping table (just to check that this is a direct-mapped address). Reads of non-direct mapped addresses might take a couple of extra cycles, but that doesn't matter because they are very rare.

To do any of that, CPU memory controllers need to be able to handle per-request variable-latency RAM, which to my knowledge today they do not, although it would not be a big redesign to add.
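
A toy software model of that scheme (the names and the write-count threshold are made up; real hardware would presumably track wear at a coarser granularity and do the remap lookup in parallel with the direct read, as described above):

  class WearLevelledMemory:
      """Toy model: most addresses map directly; only addresses written more
      than `threshold` times are redirected to spare locations."""
      def __init__(self, size, spare, threshold=1_000_000):
          self.cells = [0] * (size + spare)
          self.writes = [0] * size          # per-address write counters (toy granularity)
          self.remap = {}                   # hot address -> spare location
          self.next_spare = size
          self.threshold = threshold

      def _location(self, addr):
          # In hardware this lookup would happen in parallel with the direct read.
          return self.remap.get(addr, addr)

      def read(self, addr):
          return self.cells[self._location(addr)]

      def write(self, addr, value):
          self.writes[addr] += 1
          if addr not in self.remap and self.writes[addr] > self.threshold:
              self.remap[addr] = self.next_spare    # redirect a worn-out address
              self.next_spare += 1                  # (toy: assumes spares never run out)
          self.cells[self._location(addr)] = value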


Wear leveling RAM would be trivial with any MMU from the last 40 years. You can just fault on the write and do your wear leveling in the fault handler. This is how virtual memory already works.


Indeed, and this is how "badram" on Linux works.


No, you would keep a write counter for every (4 kB) DRAM page, and have the OS move the virtual page to a new physical one if the write count of a page grows much higher than the average.


That assumes you still have DRAM. Since this is faster and higher capacity than RAM, it’s potentially viable as a RAM replacement. In that case, you wouldn’t have anywhere to store the counters (but presumably in that case you wouldn’t need to either). I’m not sure you’d need to have a write counter when this replaced RAM though even if this didn’t have the same write endurance. For storage nodes, there’s no value in RAM outlasting storage. And this already has better write endurance than NAND. So on a storage node, you could easily imagine using this as RAM as the number of erases is going to be dominated by storage activity rather than ancillary memory writes managing the storage.


That really depends on your use case, doesn't it?

Assuming a typical 5-year lifecycle, 10 million writes means 1 write every 15 seconds. That's more than enough for executable code, CDN content, or a database index. I can definitely see systems with 75% UltraRAM for read-heavy data and 25% traditional RAM for write-heavy pages acting basically as L4 cache.


Why does it need to directly compete with DRAM?

The current setup is based on separating volatile and non-volatile memory and adding caches to paper over the slowness. Caches are getting bigger and bigger because of the huge speed disparity. I think you underestimate how much of a game changer this could be.

This is persistent and fast.

If this takes off, and it does only last tens of millions of cycles, just use cache for fast-changing things and UltraRAM for everything else.

If it lasts trillions of cycles, it could completely change PC architecture. It was the 80s when we had RAM/ROM that could keep up with the processors of the day. This potentially gets you an instant-on computer: no need for caches, no need for memory for the graphics card, no separate hard drives. Just one big simple bucket of bytes for everything.


If the latency claims turn out to be true, it could still be worth it in various cases, eg with a bit of effort to reduce the number of writes you could get a big hashtable that you initialise once a day or so that gives really fast lookups.


Some of the researchers believe that the durability is actually unlimited, but they haven't been able to prove it yet.


Yeah, potentially dead in ten milliseconds with a pathological workload.


Perfect for LLM weights though.


I would buy RAM yearly for 10x latency improvement!


Can you imagine the consumer backlash?

RAM is not easily removable in most of today's electronics. So replacing RAM once a year actually means replacing all your devices once a year.


I've been impressed by the growing capabilities of the device fixing community. Replacement of BGA components is more attainable than one might think.

https://www.youtube.com/watch?v=X7C_hdJsY4Y

I think for my kids I may have them skip traditional through-hole soldering for SMD with hot air and toasters.

https://hackaday.io/project/27900-reflowduino-wireless-reflo...


Things that don't last long are replaceable, though.

No one complains about not being able to replace the processor in their phone because it 'never' breaks. Batteries, on the other hand, do break, and to varying degrees are replaceable.


The article claims a tenth the latency of DRAM at 100x lower power, but also says they're trying to fabricate at 20 nm. Oh, and also persistent.

If they've done that, awesome. Make it, show that it works, license how to make the thing to semiconductor companies and retire wealthy. Or maybe the university owns the IP.


If they've done that, I think the concept of "turning off" a device goes away. You just unplug it, and the energy needed to dump the stuff in the pipeline to memory can be stored in a capacitor.

The OS can just always be loaded and ready to go; when power is restored it checks to see if the hardware has changed and just loads up the 64 MB of CPU cache. It could take just a few milliseconds. It takes on the order of a millisecond to charge the capacitors in a desktop PSU. "Restarting" becomes basically the same thing as reloading, and takes >100s of times longer than actually restarting the device. That's crazy to consider.

If boot time is 0, stuff will just unplug itself after its been idle for a few seconds. I'd expect the hardware in phones/laptops to become more distributed, with basic vital functions handled by a separate processor. Probably the screen gets taken over by a very simple processor that can only display the time, battery %, cell info (or the current screen buffer, for a laptop) and user input causes the main cpu to wake up in between frames.


If they've done that, I think the concept of "turning off" a device goes away.

...The OS can just always be loaded and ready to go; when power is restored it checks to see if the hardware has changed and just loads up the 64 MB of CPU cache.

The idea, called “Orthogonal Persistence” way back when, has been around quite a while. Here’s my (probably spotty) idea of the history:

Researchers wanted instant-on for their early visions of tablets. To make sure security and networking would still work properly, there was an idea to use Capabilities (which were around since the 1960’s) to support this and solve the chicken and egg problems that were thought to arise.

Capabilities later became widely adopted just for better security, but Orthogonal Persistence never took off, because never rebooting would have required much higher levels of reliability, which would have been expensive to achieve. So today’s devices still reboot, but also have a fast "wake from sleep."

So I’m not sure if we will ever have true “Orthogonal Persistence.” We might have much slicker “wake from sleep” instead.

I'd expect the hardware in phones/laptops to become more distributed, with basic vital functions handled by a separate processor.

This is already the case!


Will "have you tried unplugging it?" still be the ultimate tech-support solution if persistent RAM becomes commonplace?


We thought that with Optane, I'm sad to see that discontinued because it was a much better target for swap files or databases.


Why do you even need a CPU cache?

1 ns write operations suggest fast reads too.


No matter how fast the device itself is, addressing into a large pool will always be slower than a smaller pool. Both because of increased travel distance, and because every time you double the size of the pool, you add one additional mux on the path between the request and the memory.

This is why CPUs have multi-level caches, even though the transistors in L1 cache and L2 cache are typically the same -- the difference in access latency is not because L2 is made of slower memory, but because L1 is a very small pool very close to the CPU with the load/store units built into it, and L2 is a bit further away.

However, if main memory latency is suddenly a lot lower, it might change what is the most efficient cache level layout. The currently ubiquitous large L3 cache might go away. That would of course require very high bandwidth to the memory chips, because L3 does bandwidth amplification too.
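
Putting rough numbers on the "one extra mux per doubling" point, with assumed example sizes:

  import math

  l1   = 32 * 2**10     # assume a 32 KB L1
  dram = 32 * 2**30     # assume 32 GB of main memory
  print(math.log2(dram / l1))   # 20 doublings, i.e. ~20 extra mux levels on the path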


Should be stressed that speed is entirely theoretical: https://onlinelibrary.wiley.com/doi/epdf/10.1002/aelm.202101...

> In all of the above tests, the program and erase states were set using between 1 and 10 ms voltage pulses, two times longer than the switching times used in our recent report of ULTRARAM on GaAs substrates.[15] In both cases, the devices operate at a remarkably high speed for their large (20 μm) feature size. Assuming ideal capacitive scaling[33] down to state-of-the-art feature sizes, the switching performance would be faster than DRAM, although testing on smaller feature size devices is required to confirm this.

> Why do you even need a cpu cache?

Cell read time is entirely different from latency and throughput. This stuff still reads in rows like RAM and can't just be accessed freely like registers.


You’d probably still need an L1 cache. L2 and L3 might be superfluous or you could have massive L2/L3 caches made with this rather than traditional SRAM that sit internally within the CPU to avoid the memory bus. Contention for the memory bus could also be a reason to still have SRAM caches that are slower than main memory.

Shifts like this are so impactful that it's hard to predict exactly what good designs will look like until the industry has had 5-10 years of hands-on experience to shake out what the HW topology will look like (maybe more, since HW dev cycles prevent fast iteration and testing of ideas).


1/100 power, 1/10 latency... in a through-hole chip carrier?? How do they get enough of it close enough to the CPU at those low powers and latencies, at DRAM clocks? Electricity travels at most about 3 cm in a tenth of a nanosecond, best case. And it uses quantum...

I'll leave it to the experts.


> 1/100 power, 1/10 latency... in a through-hole chip carrier??

That's a common package for testing ICs -- notice the array of dies inside and the haphazardly placed bonding wires. It isn't the final form factor.


They are talking about the speed of the new type of memory cell, not of the physical implementation they have it in.

If this actually pans out, it will be worthwhile to stack a lot of it on the same package as the CPU. The reason memory is so far away in current systems is mainly that having it closer wouldn't actually meaningfully help, because almost all the latency is in reading data from the DRAM array anyway. If they suddenly get an economical new memory type with an access latency a tenth of DRAM's, they are going to figure out how to get it close enough that signal travel will not be a meaningful part of the total latency.


Thank you for the explanation! TIL.


My understanding, from reading it, is that the through-hole carrier is for demos.

I doubt they've actually demonstrated the speed/power claims practically; that's what the new test kit, and potential fab partnerships are for.


This sounds too good to be true. When Apple buys up all the production capacity for this and makes it available exclusively in Macs and iPads, we'll know it's viable. Till then, my optimism is tempered with caution.


Apple won't do it until they're proven by integrating with existing manufacturers of some kind


Apple is too smart for that. If they buy it, they will also sell it separately, just like Samsung manufactures phone screens for Apple.


Yeah, they will sell it with a special license so you can't use it in products that compete with iDevices and MacBooks.


Will this be used for Harvard architecture where programs are run straight from their storage instead of first being read into RAM? Maybe we can use data stored on this instead of having to stream it from storage to RAM?


Not to be confused with the AMD/Xilinx UltraRAM present in their FPGA fabric.


If it is a different technology, wikipedia needs to be corrected:

> The technology has been integrated into Xilinx's FPGAs and Zynq UltraScale+ family of multiprocessor system-on-chips (MPSoC).[7]

Referencing an article https://www.eejournal.com/chalk_talks/2016033002-xilinx-ultr... from 2016!


It seems they (Xilinx/AMD) might also have applied for a trademark, so I don't know if the name as used here will stick...

https://trademarks.justia.com/972/53/ultraram-97253591.html


Optane Resurgence?

The tech looks super cool if it does get commercialized.


Optane made loading the OS super fast, but the OS still has to fill up RAM. No matter what, loading up 2+ GB of RAM will always take noticeable time. Even flat out, Optane takes >1 second to boot, and several seconds to restore a session.

Cost permitting, this stuff would replace RAM, not the drive. No more loading into RAM; now the bottleneck is loading into cache, and that will always be trivially fast just because cache is so small.

Even if it's too expensive to replace RAM, if it can fit the minimum bits of an OS then I think cold boot time still goes to tens of milliseconds. Might take a couple of years, but interactivity doesn't need to wait on RAM to be filled.


> No matter what, loading up 2+ GB of ram will always take noticeable time.

Barely so. NVMe sequential throughput is measured in gigabytes per second. So you can get this under 300ms. And you can optimize the order in which things are loaded so that the important ones arrive first, not all in-memory data is hot.

What makes booting take time are serial dependencies between boot stages, timers (boot prompts for humans, but also for hardware to power up), careful device enumeration and initialization and stuff like that.
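
Rough numbers behind the "under 300 ms" point, assuming a ~7 GB/s NVMe drive (an assumption; drives vary):

  def load_time_ms(gigabytes, gb_per_s=7):
      return gigabytes / gb_per_s * 1000

  print(load_time_ms(2))    # ~286 ms for 2 GB
  print(load_time_ms(16))   # ~2286 ms, i.e. a couple of seconds for 16 GB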


Yep. Keep it all hot. North/southbridge, controllers, everything. Skip POST. If something has changed we can just restart once it's warmed up. Anything that still needs a traditional boot (e.g. a disk drive) can just be assumed to have not changed until proven otherwise. Fuck the MBR. Fuck ROM and BIOS and CMOS. If you don't need to do it between RAM refreshes, you don't need to do it until you're told to reboot.

If you've been unpowered all day, or if your hardware has changed, or you're worried about security, then you can choose to boot from scratch. The only other reason, IMO, is because the computer has just been put together. If all those parts can restore their previous configuration, all they have to do is signal "yep, I'm still in the same configuration" and we should be able to pick up where we left off (again, except for disk drives/RAM/networking etc).


I like the clean slate we get from boot-from-scratch. Some pieces of hardware have subtle state corruption issues that can accumulate over uptime (kinda like Windows) that get flushed out by a reinitialization. Instead as much as possible of the boot process should be taken off the critical path and be treated more like hotplug/optional peripherals. Those HDDs? They can spin up while I'm already logged in (assuming they're not the boot drive).


Definite déjà vu from Optane.

Capacity and price killed it; no word about either in the article.


Similar claims have been made about MRAM, FeRAM and similar devices for many years, hailed as replacement and unification of both storage and DRAM. MRAM isn't completely vaporware, but it's not available at the prices or densities of DRAM.

So, will it scale down? Will it be cheap to manufacture?


Ultra-Random Access Memory? ;)

This seems misnamed either way.


Lower latency than RAM and more durable than NAND?

Where is the catch? Price? Throughput?


Currently, performance is hypothetical. This and DRAM both work by charging up a little capacitor; this tech uses tunnelling so that the capacitor can be very highly isolated. That's why it doesn't discharge.

The smaller the capacitor, the faster it can charge/discharge. This tech has only been tested at sizes ~1000x larger than the state of the art, and the speed advantage assumes it scales perfectly with the scaling laws. Reality is never that kind, but it might be mostly that kind.

It's still theoretical, though. There might be some manufacturing quirk that makes it not work as well at small sizes. Defects that don't matter now might be huge at that scale. If power requirements creep up, they may kill longevity, which may require them to sacrifice speed... everything has to go right, or it can become a balancing act.

Assuming everything goes great, it's still somewhat more complex than DRAM- more layers. It will certainly cost more than conventional RAM, but with ICs in particular it's very hard to know if that will be 10x more or .1% more.


It's going to be very expensive (lower densities than NAND and a somewhat exotic process for making it) and it hasn't been proven at geometries smaller than 20nm; it will only be faster than RAM if it continues to scale.


"in mice."

Or rather, the silicon "in mice" equivalent: in a test sample 1000x the scale, with only hopes and wishes that things won't change too much when they scale down.

All the cool mice these days are running around with memristor-based brain implants. This will be a huge upgrade for them. They'll be able to spend a small fraction of their usual daily time running in the hamster wheel, charging up their symbiote brains.


Not being made in a way that is usable in current systems, not having a commercial scale manufacturing process yet, and not being proven for long term use yet.


This kind of memory would make a cold boot attack child's play.

Also, considering big tech corps' tendency to lock down stuff, will we need a hack just to do a system reboot?


This kind of memory would make a cold boot attack a child's play.

People have been thinking about that for over 5 decades! This is a part of the history of Capabilities.


What's "Capabilities"? This term is too general to search for it.


https://en.wikipedia.org/wiki/Capability-based_security

Just google “capabilities computer science”.


Rebrand the name, as currently it is misleading... That said, the technology looks interesting for storage devices if it indeed exceeds SLC flash specifications.

However, after the Violin Systems boondoggle one may find it significantly harder to raise growth capital.

Good luck =)


>"Moreover, the UltraRAM researchers asserted that the new memory tech is expected to be capable of 1ns write operations, which is about 10x faster than DRAM."

That'll be really nice if they can get it into production...


This would be best used as cache for standard flash or as a faster swap space for RAM. I don't think this is a replacement for either.


I wonder how long it will take to build up a mass production for this. The hardware needed seems to be very experimental.


Sounds expensive. Density similar to SLC flash or DRAM with a rather exotic process.


It would likely be a good fit for applications where low latency and non-volatility are required but size is less important. A microcontroller with a good-sized chunk of UltraRAM could allow for a type of Harvard architecture where program code runs right off of where it's stored. With NV memory you can have the microcontroller shut down completely and start right back up where it left off: just write the registers to memory right before shutdown and load them back on boot. You can have very power-efficient devices that never really have an off state because they are always hibernating when they aren't doing anything.


If it's as low-latency as they say, then small microcontrollers could just use it for the register file directly, no?


Watch this fade into obscurity the same way Intel's 3D XPoint did.

The world is not ready for the diseconomies of scale of a second memory type. DRAM and SSD fabs are struggling so why shouldn't they?

Goodbye promising technology!


It doesn't need to have consumer-wide adoption to be a success.

They can cater to a niche business market to which UltraRAM can add ultra-high value (pun intended) for particular data processing or persistence needs.

One idea that comes to my mind is stock markets. Automated traders took them over, and their fight is on the sub-millisecond scale.

Imagine how much an investment bank would pay for UltraRAM if it allows them to process real-time data much faster and make ultra-money with it? (again, intended and not sorry about that!)


Closer to home, I can think of a few competitive video game players with more money than sense who would spend thousands on a DRAM replacement that lets them go from 150 FPS to 300 FPS. Depending on workload, DRAM latency generally ends up being the bottleneck at such high refresh rates.


Assuming it can be as dense physically as current flash and RAM, there is a pretty nice market to target in mobile devices that's always looking for things that can lower battery use.

Conveniently that's also a market where Apple and Google have enough control over the software side to make things work with a new, weird memory scheme (RAM slower than permanent storage, but still needed because of durability).


LLMs/multimodal large models are looking for something like this. They will easily absorb the gains.


Well, on the other hand GDDR and HBM are two competing technologies both of which are still reasonably alive, with HBM being a promising candidate to replace GDDR for good eventually.


Great now I can wait for this and Nantero! /s


Deleted.


> Databases are automatically obsolete. A file system is enough and performance is improved by not using a database.

This makes no sense at all. Databases are much more about what data structures are used internally and the high-level interfaces provided to access said data than the underlying technology used to persist data.

There is also the matter of how much data can/needs to be persisted, which is not addressed at all.


Deleted.


I don’t understand how any of that follows. Databases are as much about locking and data structures as about persistence.


Locking and data structures are more or less a solved problem.

Persistence is not yet solved.

Our programming models are currently heavily influenced by the way we store and query data and the underlying registers/memory/cache/storage HW. With PRAM, we can simplify programming models using Persistence Ignorance (PI).

https://deviq.com/principles/persistence-ignorance


Deleted.


Well, that's one hell of an undersell of what a database like PostgreSQL, MSSQL or MongoDB does.

It's not just that people "can't write original applications", but that in fact people shouldn't always write their own bespoke single-purpose databases for each application. Getting ACID, MVCC, efficient storage, indexing, backups, etc. at the same time is hard, really damn hard. You might get over some of them, e.g., efficient storage, with hardware, but there's no free lunch on those topics.

A database is like using a library: you can always write one from scratch (and sometimes you even should), but in 99% of cases you should rely on the tried, battle-tested existing solution.


> Databases are a means to query and index data with greater performance than without.

I don't think you understand your own point. The querying, indexing, and performance bit are tied to the data structures used internally by the database, not the technology used to persist data.


What exactly will become obsolete? (IMO) file systems are a degenerate form of databases already, with graph structure, large enough directory entries becoming b-trees etc. But even with fast storage you'd still need indexes for any access pattern different for what's encoded in the directory tree.


I believe the author means database as in disk persistence. But we could all host SQLite DBs in /dev/shm today if we wanted. I guess it's just excitement about persistence at speed :)


The original post is deleted, but in a proper ACID database, persistence and speed are quite highly related. The DB can't truly move on until it has confirmed that a set of writes has truly hit the persistent storage, and the time required for that is vastly larger than the time required to process a transaction in volatile memory. We cheat in every possible way, notably with battery-backed NVRAM, so in actual practice the cost isn't always visible. But if you need truly persistent transactions, then fast persistent memory is a godsend. (Especially if it's big enough to put the WAL into. Though even a couple of bytes to store the last committed transaction ID can be useful.)


> the company claims it has at least 4,000X more endurance than NAND and can store data for 1,000+ years.

I'm sure the company has some tests to prove those claims. /s

Did anyone do a HALT test with the M-DISC?


I come from the future and I can tell you that the 1’000 years were greatly exaggerated, as UltraRAM doesn't handle the radiation fallout from World War III well.

P.S.: we switched to the metric system.


I also come from the future, and the reason for the war was to finally get rid of the imperial system. Despite billions of people having died, we all think it was worth it.


Metric will finally overthrow our imperial overlords! Sorry to hear about the war - I assume these were related?


P.P.S.: Going metric was the reason for ww3


You can disagree with the tests or interpretation but it's easy to find the paper rather than just assuming no testing.

Endurance figures come from actual testing of their 20 µm version. Retention is based on looking at the decay over 14 h. Since it decays to begin with and then plateaus, they fit a line to it from some time before the plateau (otherwise the answer is "infinite years"), which gives 10^7 hours.

https://onlinelibrary.wiley.com/doi/epdf/10.1002/aelm.202101...
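
Converting that extrapolated retention figure into more familiar units:

  retention_hours = 1e7
  print(retention_hours / (24 * 365.25))   # ~1,141 years, i.e. the "1,000+ years" claim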


I would assume they would have a couple of such devices running from publication until now. Instead of >24 hours, they probably know the retention after about 500 days by now, allowing for more convincing extrapolations towards 10^7 hours.

It is unclear from the article whether these plots are tests of individual memory cells or a large collection of cells. Any serious attempt would involve an array of cells so that a million such graphs can be plotted together, etc.



