Microsoft bets on Germany in €3.2B AI push (reuters.com)
157 points by thunderbong 10 months ago | 116 comments



I haven't seen it mentioned yet on Hacker News, but Germany is about to pass the "Gesundheitsdatennutzungsgesetz". This bill allows anonymized access for researchers to all German health care data. Data from over 83 million people.

The example given by the health minister for how this might play out: Researchers find an interesting pattern in the data. They request the ministry to ask the matched people for permission to become participants in a study or direct access. If permission is given, the anonymization is lifted in part and a study could move forward. This alone would make Germany a pretty fascinating place for future AI research.


Sounds perhaps a bit naive, as one might expect from the German government when it comes to data. How much of such data does one need to de-anonymize someone? How easy will it be to accidentally slip in identifying data through whatever system is supposed to be used? How easy will it be for any company like MS to correlate that data with all that other data they extracted without user consent?


It is important to note that the original draft of the law does not talk about anonymizing data, but rather about pseudonymizing it. So no attempt is made at keeping the identities of patients truly anonymous (which has repeatedly been proven to be impossible in sparse datasets).
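To make the distinction concrete, here is a minimal sketch of what pseudonymization typically looks like, assuming a keyed hash over the patient identifier. The key, the record layout and every value below are invented for illustration; nothing here is taken from the draft law or from any real system.

```python
import hashlib
import hmac

# Hypothetical secret held by the data custodian (an assumption, not from the law).
SECRET_KEY = b"custodian-secret"

def pseudonymize(patient_id: str) -> str:
    """Replace a direct identifier with a stable keyed hash (a pseudonym)."""
    digest = hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:12]

# Toy records: (patient id, birth year, postal code, diagnosis). All invented.
records = [
    ("A123", 1958, "60311", "diabetes"),
    ("A123", 1958, "60311", "hypertension"),
    ("B456", 1990, "80331", "asthma"),
]

pseudonymized = [(pseudonymize(pid), year, plz, dx) for pid, year, plz, dx in records]

# The same person keeps the same pseudonym, so their records stay linkable,
# and birth year plus postal code remain in the data as quasi-identifiers.
same_person = pseudonymized[0][0] == pseudonymized[1][0]
different_person = pseudonymized[0][0] != pseudonymized[2][0]
print(same_person, different_person)
```

The name is gone, but everything else that could single a person out is still there, which is the gap between a pseudonym and actual anonymity.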


>How easy will it be for any company like MS to correlate that data with all that other data they extracted without user consent?

Doable to some extent, but would they really learn much that we haven't already told them, given our propensity to Google for symptoms and diseases?

Personally, my worry here, if I had some embarrassing medical history that I wouldn't want people to know about, would be some malicious party gaining access to de-anonymized data and using it to blackmail, or just simply making it public.

Edit: Come to think of it, insurance companies could probably have a field day with a data set like this.


The civilized world has universal health coverage, so the insurance thing isn’t really a problem except in one particular country. But the malicious release is definitely a problem.


It never is a problem until it is. Insurance companies in Germany are also not benevolent. If there is a case where they do not have to pay, they won't. If they can use data to save money, they will do it, even if it worsens your conditions.


"Insurance companies in Germany are also not benevolent." You know that the leaders of the public insurers are elected by the customers, right?


Not all public insurers hold these elections, but some do, e.g. Barmer GEK, Techniker Krankenkasse, DAK-Gesundheit, Kaufmännische Krankenkasse (KKH), Handelskrankenkasse and BKK RWE, and I think also AOK.


One might argue that the free market is consumer-driven voting.


It doesn't feel adversarial as a customer. Never did for me, never heard it from others. They aren't even regular companies - they have a special legal status. Private insurance, which is also available, is different.


The "civilized world" does not, however, have universal life insurance nor disability insurance that would cover the full standard of living. Those are, at least in Germany, privatized, and companies could benefit a lot from having access to some extra data.

I'm not sure that getting caught using such data would be sufficiently bad news to be not worth it.


No, because medical data can also be used for things like life insurance, insurance against loss of earnings caused by sickness, and probably a lot of other things.


Blatantly false.

I've been in Europe two decades and aside from health insurance being key to more than a few countries, private health insurance is relatively common in tandem with public health services.


Quite typically there is basic service via a welfare system and additional or premium service with insurances. I pay into the public health system at a rate proportional to my income (which is more than what I would pay if I had a worse-paying job for the same service), and I have an additional dental insurance which insures my teeth.


No, I mean full-blown private health insurance. In France, Belgium and Netherlands it's very common (and I think mandatory in NL).

Outside of those countries, it's still somewhat common to have private health insurance (not just dental), such as Mapfre or Regina Maria. Heck, even in Germany something like 1 out of 10 people have private health insurance (again, not just dental).


Incredibly naive. Easy to de-anonymize. First they come for the books, now AI practitioners come for the health data.

"Estimating the success of re-identifications in incomplete datasets using generative models" - https://www.nature.com/articles/s41467-019-10933-3

"...We here propose a generative copula-based method that can accurately estimate the likelihood of a specific person to be correctly re-identified, even in a heavily incomplete dataset. On 210 populations, our method obtains AUC scores for predicting individual uniqueness ranging from 0.84 to 0.97, with low false-discovery rate. Using our model, we find that 99.98% of Americans would be correctly re-identified in any dataset using 15 demographic attributes. Our results suggest that even heavily sampled anonymized datasets are unlikely to satisfy the modern standards for anonymization ..."


A similar paper, ~3k citations, on deanonymization in sparse datasets:

https://systems.cs.columbia.edu/private-systems-class/papers...

It's well known within the sphere that anonymization is hard and deanonymization is trivial. The original Netflix prize around their recommender system also had issues with deanonymization.
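A rough way to see why: count how many records become unique as you combine more attributes. This is only a toy count over an invented population, not the method from either paper, and the column choice (birth year, sex, postal code) is my own assumption.

```python
from collections import Counter

# Toy population; every value is invented for illustration.
# Columns: (birth_year, sex, postal_code)
population = [
    (1958, "f", "60311"),
    (1958, "f", "60311"),
    (1958, "m", "60311"),
    (1990, "f", "80331"),
    (1990, "f", "80333"),
    (1972, "m", "10115"),
]

def unique_fraction(rows, columns):
    """Fraction of rows whose values on `columns` occur exactly once."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)

# Each extra attribute can only split groups further, never merge them.
for cols in [(0,), (0, 1), (0, 1, 2)]:
    print(cols, round(unique_fraction(population, cols), 2))
```

Even with three coarse attributes most of this toy population is already unique, and a unique row can be matched to a name by anyone holding a second dataset with the same attributes.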


The companies and researchers get access to the data with oversight but the data isn't shared. They basically get access to a server to run their analysis but can't retrieve the data.


There is almost zero retribution for non-compliance. This law is ripe for abuse. It's only a question of when, not if, the data will become public. Conveniently, people with private insurance are exempted, which covers most of our politicians and all of our public servants.


Wrong. Potential fines for a GDPR breach, which this could fall under, are up to 4% of the firm's worldwide annual revenue from the preceding financial year. There is also a criminal provision in the works for misusing data provided under this bill.


Sweden usually ranks very high on health care, both in results and in research/innovation. One big reason for this is that very early on we gathered a lot of medical statistics on the population through the "health ministry", which were later used in research. All this has fostered and cultivated innovation. The biggest victory, however, is the benefit the results have brought the population (and the world).

If done with care this can have a big impact! Very exciting Germany is doing this!


When it comes to digitalization and privacy in Germany, no good intention goes unpunished.


But there is nothing about "digitalization and privacy" in this law. They will collect the data and "sell" it to the lowest bidder (Palantir for example).


This should absolutely be done everywhere.

It will exponentially reduce time to cures for various diseases. The amount of red tape in academia and pharma to do simple data studies is wild...


My medical data is affected and I couldn't disagree more. Also, pharma is not interested in finding cures, for obvious reasons. I am getting all the downsides with none of the upsides.


Next thing you'll see is a letter from Rundfunkbeitrag suggesting you pay for the whole year in advance because your cancer doesn't have a good prognosis. Nothing personal or fraudulent, you know... it will be convenient.


I don't think people understood the joke since they might not know what Rundfunkbeitrag is!

Everyone outside of Germany has to face death and taxes. In Germany you face death, taxes and Rundfunkbeitrag.


Doesn't sound anonymous if it can be trivially unmasked. Sounds like the setup for one of the most egregious medical leaks of all time. Germany pwned.


It can't be trivially unmasked. Approved researchers get controlled access to the data with oversight, but the data itself isn't shared with the companies.


There is no such thing. Information is not a physical good that you can store in a safe. Access to it implies sharing it. At best what you can do is an API limit so that the thing is not immediately copied in its entirety. But then again, what possible statistical insights could you gain with only a small fraction?


They get access to servers and/or data centers where they can run their analysis. It's trivial to control for data egress in such a scenario.
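As a rough illustration of what I mean, here is a minimal sketch of an aggregate-only interface, assuming the researcher can only call a query method and never sees rows. The class name, the threshold and the data are all hypothetical; the law does not specify any of this, and in a real setup the boundary would be enforced by the server, not by a Python naming convention.

```python
from collections import Counter

MIN_CELL_SIZE = 5  # hypothetical suppression threshold, not taken from the law

class AggregateOnlyStore:
    """Holds rows privately and only answers grouped counts."""

    def __init__(self, rows):
        # The leading underscore is only a convention; real enforcement
        # would happen at the server boundary.
        self._rows = list(rows)

    def count_by(self, column):
        """Return counts per value of `column`, dropping groups that are too small."""
        counts = Counter(row[column] for row in self._rows)
        return {value: n for value, n in counts.items() if n >= MIN_CELL_SIZE}

# Invented data: one common diagnosis and one rare one.
rows = [{"diagnosis": "asthma"}] * 7 + [{"diagnosis": "rare_condition"}] * 2
store = AggregateOnlyStore(rows)
result = store.count_by("diagnosis")
print(result)
```

Only the large group comes back; the two-person group is suppressed. Repeated, carefully chosen aggregate queries can still leak information, so a sketch like this is a starting point rather than a guarantee.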


None of this is mentioned in that law. The worst thing that can happen is getting excluded from the data for some time.


I would hope that there is a way for the subjects themselves to be involved in the decision before their data is de-anonymised


I think the plan is an opt-out solution. But they really don't want many people to actually opt out.

And it's a legal challenge to make sure that everyone understands how and why they should opt out, even my grandma.


How I love these solutions where you're stripped of your right to privacy by default, and if you want it back you need to invest your free time in going through the awesome process of bureaucracy. Really, to these people we're just disposable beings who should be ready to give up our limited time to act on whatever bullshit politicians think up.


The issue is that an opt in solution might make the system unviable in the first place, which reduces utility for everyone. Of course, opt out should be simple, but if the majority of people don't care and the data is anonymised properly, I don't really see the issue since this is used to further the sciences.


Consider how much more effective an opt out system for organ donation is than an opt in system.


Do you mean the bullshit to allow innovation in health treatments and research?


There is always a greater good that can be used to strip people of what they consider their right to a private and quiet life. Yesterday it was encryption in communication for safety, today it's their medical data for research? There is always something to take away. I don't care about medical research; it is my right not to have my medical data shared with researchers. If someone else wants to share theirs, that's their right. I don't see why the default has to be to take it away.


When using public healthcare resources and anonymized data?


The change in what’s happening is important. The original social contract just assumed it would always be private. Now that’s changing. Possibly if the notion was built in that the public health system would reserve the right to effectively monetise your data… the entire genesis of it would have happened differently. I’m not saying it’s wrong. I’m saying changing the rules halfway through means you can’t appeal to the original contract.


Rules change all the time around tax and healthcare. This is also not about directly monetizing data but to benefit the recipients of public healthcare which is a lot more than just finding revenue sources.


As I said before, there is always some benefit that requires removing the right to privacy. I was not saying it is wrong; I was saying to make it opt-in so that people can be part of the benefit by choice if they want to.


Not how this works. If the benefit is better healthcare, there is no way to withhold that from people who have not opted in.


> When using public healthcare resources and anonymized data?

The problem is that the data is not anonymized. And the companies using this data are using it for other purposes, like selling targeted health data.


> I would hope that there is a way for the subjects themselves to be involved in the decision before their data is de-anonymised

No. Unless you don't want medical treatment.


> This bill allows anonymized access for researchers to all German health care data. Data from over 83 million people.

Sounds like MS has just bought a relatively easy access to ~83m people's health data. For €3.2B.

That's €39 per person (tax included).
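Back-of-the-envelope check of that number, using only the two figures already in this thread (the headline investment and the 83 million people). Treating the investment as a price for the data is my framing, not something the article says.

```python
investment_eur = 3.2e9  # headline figure
people = 83e6           # population figure quoted above

per_person = investment_eur / people
print(f"{per_person:.2f} EUR per person")
```

That comes out a bit under 39, so the rounding holds.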


What exactly does it mean to invest 3.2B€ in Germany? Does it mean this is actually spent in Germany, as in on goods and services produced in Germany, or does it mean they order an astonishing amount of GPUs from Nvidia and put them into the Frankfurt region because that's just one of the most popular regions latency-wise to serve Europe?



Isn't power really expensive in Germany? Wonder why they would build data centers there.


It's on a downwards trend again after the spike created by the war [1]. However, I don't think this is a big factor, since Microsoft is the second-biggest buyer of renewable energy [2], so other prices apply.

[1] https://www.statista.com/statistics/1346782/electricity-pric...

[2] https://www.renewableenergyworld.com/news/bloombergnef-corpo...


That second link … it’s depressing how a site about renewable _energy_ writes an article about cumulative _energy_ consumption but consistently fails to tell the difference between energy and power.


Germany is in the middle of Europe? 3rd largest economy (by GDP) in the world? Lots of customers nearby? It has one of the world's largest Internet exchange nodes in Frankfurt (-> DE-CIX)? It helps to adhere to EU (-> GDPR) and German regulations having data local, when wanting to serve EU customers?


> It helps to adhere to EU (-> GDPR) and German regulations having data local

Not this again. The moment you transmit data to European based servers under control of US corporation you could just as well send it straight to the US. Same difference.

Nobody cares if AWS, Azure or GCP have EU datacenters. They are for most part understood as under US control.


> Nobody cares if AWS, Azure or GCP have EU datacenters.

That's wrong. For EU companies it makes a difference, since it allows them to be compliant. It may not be enough for you and me, but it is a huge legal (and also practical) difference. Generally I agree it is dumb to give these companies data (regardless where they are) and that includes already data from me as an end customer. They can't be trusted when it comes to data privacy or lawful use of my data.

Still there is a difference of data being used for company/business usage or for intelligence. Generally the trend to host data in the EU and having regulations on the EU level is a positive trend.


EU's best AI/ML universities are nearby (ETH, Tübingen, Munich,...)


I think British, Swiss and French universities top those in Germany for AI/ML.


ETH is Switzerland. The claim was "nearby". And when you're in Germany's south (even in Munich, which is relatively far east), the major French universities are pretty accessible, too.


They could have just gone to Switzerland and Germany would have been nearby! Also you get German and French speakers! So easy to jump between France and Germany.


Switzerland is a lot more expensive, though.


Electricity for the industry is cheaper, they are freed from some taxes (not sure which one) and obviously VAT is deductible which is not the case for end consumers.

Regardless of that it's a myth anyway. Germany has a very competitive electricity market, and it basically works by having all the costs on the actual electricity bill (will not be 100% true in the coming years, but it was until now). Some other countries have cheaper electricity on paper, but cover the real costs with government money / taxes. France is the most obvious example as a neighbouring country.


Why do you think it's a myth? It is true. Germany has one of the highest electricity prices in the world. It's well known because it's true.


Read what I wrote. The price of electricity is what gets paid. If you pay 5c per kWh and say I have the cheapest electricity in the world, but your electric company has to get bailed out by government intervention your electricity is not actually 5c per kWh.

I just checked and you can get a new contract in Germany for 25-30c per kWh. If you compare it to France, the price there is 24c per kWh, but at the same time the government fought tooth and nail last year to be able to fund repairs of the nuclear plants. They want to spend 20-25 billion per year of taxpayer money in the coming years. So do the math on what it actually costs. If you run a real company for profit you fund the future repairs through the price of the things you sell.

There is no such thing in Germany, or at least there wasn't until recently. There is a renewables surcharge but it was always on the electricity bill. Last year it was removed because of high electricity costs so it wasn't needed, and the current plan now is to finance it through taxes once the fund gets depleted. Personally I hope it gets put back on the bill because I like to know how much things truly cost.


Residential electricity has the full rates on everything. Commercial electricity is a bit cheaper because taxes and network fees are reduced, and industrial electricity is even cheaper because the reductions are even higher.

In case of the network fees, residential customers directly subsidize industrial customers: network fees cover the network operation costs, so it's a zero-sum game where one party's reduction is another party's increase.

It's still somewhat expensive to operate electricity-heavy industry in Germany, but much less outrageous than the residential prices that are usually used for comparisons in the news.


There are also already tons of DCs in Germany. So obviously it's not prohibiting anything.


I pay ~50 Euro cents per kWh. Even if you gifted me a GPU server it would be cheaper for me to use the cloud.


Check your provider's current rate, it's probably closer to 30 cents. Then call them and ask if they want you to switch at the next opportunity or if they're willing to hoist you over into the new rate.

They even backdated my contract change by a few months when I did that.


Change your provider, you're overpaying by about 20 ct.


Why do you pay so much? Latest contracts are 23 Cents/kWh, I pay about the same using tibber's dynamic tariff. Sounds like a massive rip off.


His current provider regularly sends him leaflets with windmills. No one could resist that.


Germany has had many big external investment + government subsidy projects involving factories, data centers etc. announced in recent years. None of them materialized, and they got canceled because of the German balanced budget amendment [0].

[0]: https://de.wikipedia.org/wiki/Schuldenbremse_(Deutschland)


Schuldenbremse does not directly affect private investment. Why should it?


Some projects are a mix of private and public investments. A freeze on the public portion affects the risk/trade-offs of the private investment portion and therefore on the overall project.


Sorry but you wrote "None"! Now we talk about "Some"?


I'm not OP, just explaining how public and private investments are sometimes linked.


If anyone cares to fix title, here's the character to copy-paste: € (3,2B€)


Using the name rather than the symbol is just the Reuters house style for non-USD currencies in headlines.


This might be good strategically from the perspective of EU companies where the physical location of operations can matter a lot depending on your field of business, e.g. NIS 2 Directive for energy companies and other sensitive infrastructure.

But they'll face competition too of course! I can't wait to see more development from European actors and projects like Mistral, TrustLLM, and also GPT-SW3 for the Nordics! There's much in motion on the European stage right now.


They don't face much competition for datacenter locations in Europe. Germany has a really good location being relatively centered to the continent depending on the city, and DE CIX as a gigantic internet exchange. Most datacenters are located around that exchange. The only two locations that come close to Frankfurt am Main in datacenter count to my knowledge are Paris and Amsterdam, but neither has that geographical advantage for latency.


It goes both ways, of sorts - DE-CIX is also a consequence of all the large DCs being there :-)

Interestingly, Tier 1 transit pricing is increasingly competitive and DE-CIX access is more expensive (relatively speaking) in recent years.


Related: https://news.ycombinator.com/item?id=39384282 (2 days ago, 1 comment)


Meanwhile, at the actual company, we are laying off German employees in droves because the labor laws are so inflexible there (or so the meme internally goes)


> double the capacity of its AI and data centre infrastructure in the country and expand its training programmes,

So installing GPUs in the existing German Azure datacenters (likely most of the money), and spending some of the time of their existing field teams to train customers to use it.

I think the "bet" is a bit emphatic there, Germany has had protective laws for their data for a long time, which is why they've had local datacenters for a long time.

It's a logical place to start massive deployment of GPUs in Europe if you don't want to exclude its largest economy from the party


Microsoft also has a Privacy Shield re-certification coming up this year [0].

[0] - https://www.privacyshield.gov/ps/participant?id=a2zt0000000K...


hard to understand what exactly they are investing in. just building a data center to suck up the green energy is not such a great deal for germany


I thought Germany was famously an OSS leader? Thought all govt machines ran Linux. Maybe it’s just certain municipalities.


While I live in Germany I haven't been paying too much attention, but I do know that it feels like it's trying to invest more in OSS than certain other countries. One thing is the sovereign tech fund and another is them working on openDesk.


There is an interesting story on this topic: https://itsfoss.com/munich-linux-failure/


Someone somewhere is using WSL. Was enough to tick the "OSS" checkbox.


Something something Dreiundzwanzig und Ich.


Are there countries which have universal healthcare, and health care data is not stored already at a centralized place? This is not new at all. At least definitely not in Europe.


A lot of UK NHS patient data is stored by GPs who are private sector contractors and do not use NHS systems.

I do not know the details - I have worked with medical systems handling NHS data a bit but not the details of what is stored where and how it is transferred between systems. I do recall some news stories about GPs not cooperating with the NHS's push to a centralised system because of privacy concerns - articles on The Register IIRC.


Good point!

But I highly doubt that they really care about privacy, rather than about how easy it would be to find another doctor.


Romania has universal healthcare, if pathetic, and health care data is likely not stored at all, and in any case not at a centralized place. And we're definitely in Europe, the EU even.

Admittedly, this may be a bit beside the point. Still, I think the universality of your assertion is based mostly on thinking that the world is made only of a handful of "important" countries.


Question marks mean something. Definitely not assertions.


care to explain?


23andme

(dreiundzwanzig und mich)


Didn't they have a recent data leak?


I think that was OP’s point: Gathering such data in a centralized place is inherently risky.


MS need its own AGI strategy as they ain’t getting it from OpenAI from that 10b investment


> MS need its own AGI strategy as they ain’t getting it from OpenAI from that 10b investment

Microsoft will neither get AGI from their own nor the OpenAI investment. The whole story of AGI is just a nice science-fiction story marketing pitch to relieve investors of their money. :-)


Fun fact: Once these have been built, I'll be able to walk there from where I'm writing this (which is my bed). It's that close.

Anyone here who can tell how these change the surroundings (economically, ecologically, socially… any dimension is interesting)?


Is Germany being used for development or as a database?

I have reservations about Germany’s 9-5/work-life balance culture being compatible with high performance in development.


From personal experience, American work/life balance seems to revolve around standing at the water cooler while Germans sit at their desks and do actual work.


Between 9-5, you are right.

Then, the action starts.

(When I worked in Germany, they practically locked the doors at 5.)


Conspiracy view... Maybe just an indirect bribe so the German government votes in favour of the tech giant on EU-level laws...


"Legal bribes" with dedicated budgets and departments at corporations are absolutely a thing (look up Siemens), only in Europe we avoid the word "lobbying". It's not conspiracy at all.


"Kickbacks" is the term used internally at Siemens for this modus. They only document it well for their CEE subsidiaries but the fish rots at the head.


[flagged]


Dude, wake up. The richest European is selling purses, perfumes, cognac, and belts. Investment-wise he is balls deep in real estate and yachts. The second richest is selling shampoo and lotion. Europe is feudalism. IT has no priority or strategic importance. The situation is more or less "so MS can bring some of this AI stuff and invest billions on top of it? We like billions and cannot remain behind in AI, yeah let's shake hands!"


I would invest a round 4B EUR, 3.2B is an awkward number.


If you've got enough money that €800 million is something you'd spend purely for the aesthetics of your invoice, please feel free to send the extra my way.


Joking, but surely Microsoft does.


It doesn't go well when reporting on it, but clearly they must have actually gone for 3,220,176,896 EUR, signalling to Germany that they are 0xBFF00000's
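For anyone who wants to check the hex, a two-line sanity check:

```python
amount = 0xBFF00000

# Print the decimal value with thousands separators, then convert back to hex.
print(f"{amount:,}")
print(hex(amount))
```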


Probably they forgot the other wallet with 800 million euros in the other pants


Happens to me all the time, really annoying when you don't want to be cheap on tips...


Um, no. That's what they spent on bri^Wlobbying. /s



