Why cut up a a wafer of chips, package each with HBM, put the package on a board, connect to CPUs with a fabric, then tie them all back together with networking chips and cables? Their 3 new clusters are the top 3 biggest AI training platforms in the world. Comparing the WSE-3 against Nvidia's H100. "It's got 52 times more cores. It's got 800 times more memory on chip. It's got 7,000 times more memory bandwidth and more than 3,700 times more fabric bandwidth. But they don't sell the chips, they build the clusters and sell the compute power. Except for a couple they built in Dubai.
Plus, newer models are moving from Transformer to Mamba and don't need as much memory, because they save the important information, not everything it's been trained on.
Sometimes that is all that is needed to move the needle. Tik tok almost moved it but that just made certain swaths of the political spectrum ask for a direct ban (with other downsides eg. 1st amendment concerns) instead of overarching policy reform.
"Policymakers should consider the following steps:
Congress should pass a comprehensive U.S. privacy law, with strong controls on the data brokerage ecosystem. The most effective step to prevent harms from data brokerage for all Americans would be a strong, comprehensive privacy law."
if you don't just ban it, you get the whole GDPR consent banner issue. what is the downside of banning it? it's not like businesses couldn't manage advertising before the internet was around
"The number of U.S. email accounts believed to be affected so far is limited, and the attack appeared targeted, though an FBI investigation is ongoing, said a person familiar with the matter who spoke on the condition of anonymity because of the matter’s sensitivity. Pentagon, intelligence community and military email accounts did not appear to be affected, the person said."
I think fundamentally, if you have incomplete information and have to make some actions or judgements, either you are:
1. doing things to reason about or uncover more useful datapoints to increase certainty
2. you are accepting the probability that you are right/wrong at face value
The direction in which you decide to uncover datapoints is the "bias" that they are talking about. This process if further influenced by institutionalized assumptions or priors you are working with.
I really don't like lists like "Strategic Assumptions That Were Not Challenged" because they are factually true but also reek of survivorship bias.
A100 is an enterprise class GPU like AMD's MI series and Intel Ponte Vecchio. In that regard the pricing is not unreasonable - They are in separate class from Geforce or Radeon cards.