Mythic does not do in-memory computing, despite their claims.
Flash cannot be used for in-memory computing, because writing it is too slow.
According to what they say, they have an inference device that uses analog computing for inference. They have a flash memory, but it stores only the weights of the model, which are constant during the computation. The flash is therefore not a working memory; it is used only to reconfigure the device when a new model is loaded.
Analog computing for inference is actually something that is much more promising than in-memory computing, so Mythic might be able to develop useful devices.
d-Matrix appears to do true in-memory computing, but the price of their devices for an amount of memory matching a current GPU will be astronomical.
Perhaps there will be organizations willing to pay huge amounts of money for very high performance, like those buying Cerebras systems nowadays, but such an expensive technology will always be a niche too small to be relevant for most users.
You don't need to write anything back to flash to use it to compute something: the output of a floating-gate transistor is written to some digital buffer nearby (usually SRAM). Yes, it's only used for inference; I'm not sure how that disqualifies it from being in-memory computing. In-memory computing simply means there's a memory device/circuit (transistor, capacitor, memristor, etc.) that holds a value and is used to compute another value based on some input received by the cell, as opposed to a traditional ALU, which receives two inputs from a separate memory circuit (registers) to compute the output.
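The scheme described above can be sketched in a few lines. This is only a toy model of the general idea (cells holding stored values, outputs summing on a shared line, as currents would on a bit line), not Mythic's actual architecture; all names are illustrative.

```python
# Toy model of cell-based compute: each cell holds a stored value (as a
# flash cell's threshold would hold a weight) and produces a partial
# product when an input is applied. The partial products sum on a shared
# output line, like currents on an analog bit line.

def cell_output(stored_weight: float, applied_input: float) -> float:
    """One memory cell: computes a value from its stored state and an input."""
    return stored_weight * applied_input

def column_mac(weights: list[float], inputs: list[float]) -> float:
    """A column of cells sharing an output line: an analog dot product."""
    return sum(cell_output(w, x) for w, x in zip(weights, inputs))

# The stored weights are never rewritten during the computation;
# only the inputs change from one inference step to the next.
print(column_mac([0.5, -1.0, 2.0], [1.0, 1.0, 1.0]))  # 1.5
```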
This is not in-memory computing, because from the point of view of the inference algorithm the flash memory is not a memory.
You can remove all the flash memory and replace all its bits with suitable connections to ground or the supply voltage, corresponding to the weights of the model.
Then the device, without any flash memory, will continue to function exactly as before, computing the inference algorithm without changes. If you can remove the memory without affecting the computation, it should be obvious that this is not in-memory computing.
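The argument above can be made concrete with a small sketch: the same dot product is computed once with weights read from a mutable store and once with the weights hardwired as constants, and the results are identical. This is purely illustrative; the numbers and names are made up.

```python
# If the "memory" can be replaced by hardwired constants without changing
# the computation, it is not a working memory from the algorithm's point
# of view. Illustrative example with made-up weights.

weight_store = [0.5, -1.0, 2.0]  # stands in for the flash memory

def infer_from_store(inputs: list[float]) -> float:
    """Inference reading the weights from a (re)writable store."""
    return sum(w * x for w, x in zip(weight_store, inputs))

def infer_hardwired(inputs: list[float]) -> float:
    """Same computation with the weights replaced by fixed connections."""
    return 0.5 * inputs[0] + (-1.0) * inputs[1] + 2.0 * inputs[2]

x = [1.0, 2.0, 3.0]
print(infer_from_store(x) == infer_hardwired(x))  # True
```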
The memory is needed only if you want to be able to change the model, by loading another set of weights.
The flash memory is a configuration memory, exactly like the configuration memories of logic devices such as FPGAs or CPLDs. In FPGAs or CPLDs you do the same thing: you load the configuration memory with a new set of values, and then the FPGA/CPLD implements a new logic device, until the next reloading of the configuration memory.
Exactly as in this device, the configuration memory of FPGAs/CPLDs, which may itself be a flash memory, is not counted as a working memory. FPGAs/CPLDs contain memories and registers, but those are distinct from the configuration memory and, unlike it, cannot be implemented with flash.
In this inference device with analog computing there must also be a working memory containing mutable state, but that must be implemented with capacitors that store analog voltages.
You might talk about in-memory computing only with reference to the analog memory with capacitors, but even this description is likely to be misleading. From the point of view of the analog memory, the structure of the inference device is more probably some kind of dataflow structure, where the memory capacitors implement something like analog shift registers, not anything resembling memory cells in which information is stored for later retrieval.