It's not holding all the data though. Anyone with old coins has to hold at most 2 kB of data. They do have to occasionally (think every 1 to 5 years) scan all spends from that epoch to update that data, or get someone to do that on their behalf. But it really does reduce the work.
As to mimblewimble: yes, there is a connection. You are trying to prune provably spent things. But to do that, you must know what was spent. Which means, if I spend a coin with you in Mimblewimble, the set of possible coins it could be is orders of magnitude smaller than the set of coins it could be in Zcash or even Monero, because these don't prune.
How do you "scan all spends from that epoch" without having those spends ("the data")?
You essentially just said: "It's not holding all the data though, you just have to have all the data." ...?
As I pointed out elsewhere in this thread (see cousin comments), relying on an external 3rd party archival service doesn't solve the issue because either (a) you want the system decentralized, so it needs to be within the signer's capability to run such a service, or (b) you don't do that and now you've introduced central points of failure, and then what's the point?
So the model is as follows: The network holds the last n blocks (where n is something on the order of months or years). Spending coins within that time period is unchanged. For every coin outside of those n blocks, the user must hold about 2 kB of data per coin. Every n blocks, they must scan the last n blocks before those blocks are discarded to update their state OR rely on a third party archiving service. So if n = 1 year, then once a year you must connect to the network and either download the year's worth of serial numbers from coins outside the current epoch (note that this is much smaller than the year's blockchain) or just have a node scan on your behalf given your data. If you wait longer than a year, then you are out of luck unless you find a copy.
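To make the timing constraint in that model concrete, here is a rough sketch of the bookkeeping only. It does not implement any cryptography: the ~2 kB per-coin state is left out entirely, and `OldCoin`, `oldest_retained`, and `blocks_to_scan` are hypothetical names, not part of any real wallet or protocol. It assumes the network serves exactly the last n blocks by height.

```python
from dataclasses import dataclass


@dataclass
class OldCoin:
    # Height of the last block whose spends are already reflected
    # in the holder's per-coin state (hypothetical field).
    synced_height: int


def oldest_retained(tip: int, n: int) -> int:
    """Lowest block height still served if the network keeps the last n blocks."""
    return max(0, tip - n + 1)


def blocks_to_scan(coin: OldCoin, tip: int, n: int) -> range:
    """Heights the holder must scan to bring the coin's state up to the tip.

    Raises LookupError if a needed block has already been discarded,
    i.e. the holder waited longer than n blocks between updates.
    """
    first_needed = coin.synced_height + 1
    if first_needed < oldest_retained(tip, n):
        raise LookupError("needed blocks were discarded; an archived copy is required")
    return range(first_needed, tip + 1)


# Assumption: n is about one year of 10-minute blocks.
N = 52_560
coin = OldCoin(synced_height=100_000)

print(len(blocks_to_scan(coin, tip=150_000, n=N)))  # still inside the window

try:
    blocks_to_scan(coin, tip=100_000 + N + 1, n=N)  # one block too late
except LookupError as err:
    print("stranded:", err)
```

The point of the sketch is the off-by-one boundary: a holder who last synced at height h can still catch up while the tip is at most h + n, and is stuck one block later.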
So users must scan the entire block chain to maintain their balance. Note that this is a stronger requirement than bitcoin has! A similar amount of data needs to be synced on bitcoin to see if the inputs were spent, but not to make a transaction. That's the key difference. There are many applications where it makes sense to have a wallet make spends while checking block data only when it is expecting a confirmation, e.g. because its keys are HSM protected. Vending machines, for example.
Bitcoin has about 2-4k inputs per block. Let's say 3k inputs every 600 seconds. Key image size depends on the crypto being used. A super conservative lower bound on size is 256 bits per key image -- smaller than either Monero or Zcash, I believe -- as general information theory says anything less than that cannot provide 128 bits of security. At 32 bytes per input that's about 5GB/yr or roughly 420MB/mo. And again, that's a minimum -- Zcash for example is 9x this number as a theoretical minimum, larger when you add protocol and serialization overhead.
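For anyone who wants to rerun that estimate with different assumptions, here is a minimal sketch. It assumes a 365-day year, 32-byte key images, and no protocol or serialization overhead; the function and parameter names are made up for illustration. At exactly 3k inputs per block it gives about 5 GB/yr, and the 2-4k range spans roughly 3.4 to 6.7 GB/yr.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def key_image_bytes_per_year(inputs_per_block=3000,
                             block_interval_s=600,
                             key_image_bits=256):
    """Raw key-image data published per year, in bytes (no overhead)."""
    blocks_per_year = SECONDS_PER_YEAR / block_interval_s
    bytes_per_block = inputs_per_block * (key_image_bits // 8)
    return bytes_per_block * blocks_per_year


per_year = key_image_bytes_per_year()
per_month = per_year / 12
print(f"{per_year / 1e9:.1f} GB/yr, {per_month / 1e6:.0f} MB/mo")
# prints "5.0 GB/yr, 420 MB/mo"

# Low and high ends of the 2-4k inputs-per-block range.
for inputs in (2000, 4000):
    print(inputs, f"{key_image_bytes_per_year(inputs) / 1e9:.1f} GB/yr")
```

Scaling the block interval or the key image size changes the result linearly, so a 9x larger per-spend footprint means 9x these numbers.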
That's a lot of data to suck down over a pay-as-you-use-it IoT 3G connection. And that's just at bitcoin's pre-segwit average usage, not even what segwit can do or the levels people think bitcoin should eventually be scaled up to. However much bitcoin capacity limits are raised in the future will directly scale up these numbers.
> rely on a third party archiving service
This is not a solution. If you allow scaling to such a degree that third party archiving services are required, then you've centralized the network. Why even use a block chain at that point?
The assumption isn't that the entire blockchain is stored forever, it's that users who keep year-old coins can 1) receive blocks (either as they are created, or in some batched process where the batch size could be as large as you like up to e.g. 1 year) and 2) store 2 kB of state per coin. This is a weaker assumption than that of a full node in Bitcoin (which stores the entire blockchain) but obviously stronger than an SPV client, which doesn't receive blocks.
And remember, this only happens for really old coins. The data from looking at Monero's anonymous transactions (recall there was a bug that leaked spending history) is that coins are typically spent within a week. Not only does this mean few users will pay this cost, but the cost will actually be smaller. You don't need the entire block to update the non-membership proof for a given epoch, you only need all the serial numbers from that epoch that were in that block. That's likely a small fraction of transactions. Zcash serial numbers are 256 bits by the way.
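As a rough illustration of why the download is smaller than the block, here is a sketch that filters one block's spends down to a single epoch. It assumes each spend publicly carries the epoch of the coin being spent along with a 32-byte serial number; the `Spend` type and `serials_for_epoch` are hypothetical names, and the 1% old-coin share in the example is invented purely to show the mechanics.

```python
from typing import Iterable, List, NamedTuple


class Spend(NamedTuple):
    epoch: int     # epoch of the coin being spent (assumed to be public)
    serial: bytes  # 32-byte serial number revealed by the spend


def serials_for_epoch(block_spends: Iterable[Spend], epoch: int) -> List[bytes]:
    """Serial numbers a holder of coins from `epoch` needs from this block."""
    return [s.serial for s in block_spends if s.epoch == epoch]


# Made-up block: 3000 spends, of which 30 come from the old epoch 0.
block = [Spend(epoch=0 if i < 30 else 7, serial=i.to_bytes(32, "big"))
         for i in range(3000)]

needed = serials_for_epoch(block, epoch=0)
all_bytes = sum(len(s.serial) for s in block)
needed_bytes = sum(len(x) for x in needed)
print(needed_bytes, "of", all_bytes, "serial-number bytes")
```

How large the real fraction is depends on how often old coins actually move, which is an empirical question the sketch does not answer.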
Yes, it's a cost. But it is the price you pay for strong privacy. If you can prune transactions, it's because you know their outputs have been spent. Which means those outputs don't contribute to your anonymity set.
As to mimblewimble: yes, there is a connection. You are trying to prune provably spent things. But to do that, you must know what was spent. Which means, if I spend a coin with you in Mimblewimble, the set of possible coins it could be is orders of magnitude smaller than the set of coins it could be in Zcash or even Monero, because these don't prune.