The implementation of the UK Covid-19 dashboard (microsoft.com)
210 points by zlib on Jan 23, 2022 | 114 comments



I do wonder if the power of a distributed database is really needed here; it gets ~1 update a day, so there's no need to have clever consistency stuff. Most of the queries relate to either today's data (you open the map and zoom in to see how doomed your area is today), or the graphs showing a standard set of history (e.g. cases over the last year). You'd think you could extract that data to be static and not require database queries, and only fire up the database for the tiny proportion that go digging in history.
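To sketch the parent's suggestion: a daily job could flatten the latest release into static per-area JSON files that a CDN serves directly, with no database in the request path. Everything here (row shape, field names, file layout) is hypothetical, not the dashboard's actual schema:

```python
import json
from pathlib import Path

def publish_static_snapshots(rows, out_dir):
    """Group daily metrics by area and write one JSON file per area.

    `rows` is assumed to be an iterable of dicts like
    {"area": "E09000001", "date": "2022-01-23", "cases": 12};
    the real dashboard's schema will differ.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    by_area = {}
    for row in rows:
        by_area.setdefault(row["area"], []).append(
            {"date": row["date"], "cases": row["cases"]}
        )
    for area, series in by_area.items():
        # Sort once at publish time so the client never has to.
        series.sort(key=lambda r: r["date"])
        (out / f"{area}.json").write_text(json.dumps(series))
    return len(by_area)
```

The database then only needs to exist for the publish job and for the minority of users digging into ad-hoc history.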


I agree. For example, the Dutch corona dashboard (coronadashboard.rijksoverheid.nl) is a statically rendered dashboard using Next that gets updated daily. No backend and it's super fast.

Maybe I'm not objective because I'm Dutch myself, but from both a user-facing and technical perspective I think the Dutch dashboard is by far the best corona dashboard in the world. It's very fast, has a lot of detailed visualizations, provides a lot of context and has a fair amount of accessibility features.


Looking at the Dutch site on my phone (Samsung S10) I noticed it took a little while to load compared to the nigh-instant loading of the UK Gov variant. Looking at Page Insights [0] [1] paints a similar picture: desktop time-to-interactive of 0.4s vs 3.5s, and mobile time-to-interactive of 4.5s vs 13.1s.

The Dutch website seems to spend a lot of that time running the Next JS framework stuff, which the Gov.uk variant does not. It might work quickly on fast computers, but even on modern phones it seems to visibly pause.


On my iPhone XR, older than the S10, it loads very fast. Also, performance depends on which page you benchmark. The landing page is faster for the UK, but the cases page is about twice as fast on the NL dashboard (time to interactive 2.4s for NL vs 4.4s for UK). First meaningful paint is also faster (0.5s vs 0.8s). This shows that you can get decent performance without an overly bloated, costly architecture.


It looks really nice. Unfortunately it turns out the main feature, the data, is phony.

https://dvhn.nl/groningen/Meer-ziekenhuispati%C3%ABnten-blij...

When the hospitals feel like it, they test patients who are already in the hospital for something else to see whether they have COVID. And when they don't feel like it, they don't. Any patient found to have COVID is added to the graph. So these numbers, and also derived numbers such as the R value, are statistically useless and vulnerable to manipulation.


I'd say that using Next for a static site is just as over engineered, personally.


I disagree. Expressing your frontend layout as code is not over engineering at all imo. It makes it easier to re-use code and is great for testability. Next is perfect for this use case. I actually think the code is quite elegant too. It's open source, so don't take my word for it, but have a look for yourself: https://github.com/minvws/nl-covid19-data-dashboard/tree/dev....


> Expressing your frontend layout as code is not over engineering at all imo

What do you mean by this? Isn't all frontend layout expressed as code?


HTML is not code imo. With React (and thus Next) you can treat your frontend code as full-fledged functions and objects.


Is there a dashboard of dashboards somewhere?


The dashboard has to deal with a complex data integration problem: different sources with differences in completeness, accuracy, age, and granularity (at many levels), daily corrections to past data, changes in data structure and semantics over time, large data volume, and 4pm traffic spikes. Moreover, there is an API that allows you to select different metrics for different areas. Being able to simply write a SQL query or update and have it be fast regardless of volume is quite a life-saver if you have a tiny (mostly 1-person) team and development speed/adaptability is essential.

Some example queries issued by the dashboard: https://github.com/publichealthengland/coronavirus-dashboard... https://github.com/publichealthengland/coronavirus-dashboard... https://github.com/publichealthengland/coronavirus-dashboard...
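One of those problems, daily corrections to past data, is commonly handled with an upsert keyed on (area, date) so that each release can silently restate earlier figures. A toy SQLite version of the idea (table and column names invented; the real ETL will look nothing like this):

```python
import sqlite3

def load_daily_release(con, rows):
    """Each daily release can restate figures for earlier dates, so the
    load is an upsert: the newest value for an (area, date) pair wins."""
    con.executemany(
        "INSERT OR REPLACE INTO cases (area, date, new_cases) VALUES (?, ?, ?)",
        rows,
    )

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE cases (area TEXT, date TEXT, new_cases INTEGER, "
    "PRIMARY KEY (area, date))"
)
load_daily_release(con, [("E1", "2022-01-20", 100)])
load_daily_release(con, [("E1", "2022-01-20", 97)])  # next day's correction
```

In PostgreSQL the equivalent would be INSERT ... ON CONFLICT DO UPDATE.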


I think the point was that the analyst needs to do that, but the frontend doesn't really need to - it could render a bunch of statically aggregated data and the spikes become another CDN problem.


See the other comment about the Dutch dashboard. Covid data isn’t changing that quickly. Having the frontend render something more static simplifies the design. No sql queries are even needed and you don’t need to scale out your database.


Well the Dutch site takes much longer to load. All these comments are (rightfully) discussing the back end being incredibly over-engineered, but 99% of people do not care about that. They care about how quickly a page loads, which the gov.uk site does much better.

I guess that implies that using Next for a "static site" is not a great idea.


https://coronadashboard.rijksoverheid.nl loads in less than a second on my Pixel 6.


I had the same thoughts and then it was confirmed how insane this setup is part way through:

“At the time of writing, the Citus distributed database cluster adopted by the team on Azure is HA-enabled for high availability and has 12 worker nodes with a combined total of 192 vCores, ~1.5 TB of memory, and 24 TB of storage. (The Citus coordinator node has 64 vCores, 256 GB of memory, and 1 TB of storage.)”

That’s beyond overkill for something that as you say could be generated statically a couple of times a day.


It's probably overkill, but not really enough overkill to be worth spending much time on.

E.g. 12 worker nodes and 192 vCores means they've picked 16 core nodes. 1.5TB of memory across 12 nodes means 128GB per node. 24TB of storage is just 2TB per node.

So it's 12 relatively mid sized servers/VMs.

They could certainly do it with much less, and I have no interest in looking up what 12 nodes of that spec would cost on Azure, but at Hetzner it'd cost less than 1500 GBP/month including substantial egress. At most cloud providers the bandwidth bill for this likely swamps the instance cost, and the developer cost to develop this is likely many times the lifetime projected hosting cost even with that much overkill.

If they happen to have someone familiar with query caching and CDNs, I'm sure they could cut it significantly very quickly, and even an entirely average developer could figure out how to trim that significantly over time. But even at (low) UK government contract rates it's not worth much time to try to trim a bill like that much vs. just picking whatever the developers who worked on it preferred.
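The query-caching idea can be surprisingly small. A sketch of a TTL cache that would collapse the 4pm spike into one database hit per distinct query per interval (illustrative only, positional args only, not production code):

```python
import functools
import time

def ttl_cache(seconds):
    """Tiny time-based cache decorator. For a dashboard whose data
    changes once a day, even a short TTL means each distinct query
    reaches the database at most once per interval, whatever the
    concurrent user count."""
    def wrap(fn):
        store = {}  # args tuple -> (timestamp, value)
        @functools.wraps(fn)
        def inner(*args):
            hit = store.get(args)
            if hit is not None and time.monotonic() - hit[0] < seconds:
                return hit[1]
            value = fn(*args)
            store[args] = (time.monotonic(), value)
            return value
        return inner
    return wrap
```

A real deployment would use a shared cache (CDN, Redis, or Postgres-side result caching) rather than per-process memory, but the leverage is the same.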


> generated statically a couple of times a day.

That would require actual work instead of selling an overpriced generic solution.


Did you look at the 3 different (non-trivial) APIs they are offering on top of the dashboard? Though I have a hard time understanding why use PostgreSQL instead of ClickHouse, for example.


No I didn’t tbh, I didn’t read much further. Notice how one sentence says Postgres was chosen because it was somebody’s preference


You will always be faster with worse tools you know than with better tools you don't know.


True but why does it also need terabytes of storage and 12 worker nodes?


I imagine getting something up quickly was a priority, rather than spending longer architecting and optimising.


My suspicion is that since this has to do with COVID, there is no real limit on what the cost should really be.

As for using the setup for other things, that seems less likely given this expensive setup.


> could be generated statically a couple of times a day

Hell, let's do some partial evaluation: just bake the computed HTML into the source code and recompile that a few times a day. No need to even read from a file when you can fetch it from rodata.

As for the reason why they did it this way, I assume it's a combination of CV-driven development along with the hackernoon-reading-junior-engineer-meets-cunning-salesperson effect which others have noted.


Yes, the static render option seems optimal; however, if an API is being offered then something dynamic is mandated, forcing scaling of the data tier. It seems like even a basic app cache would suffice.

Alternatively, we're building https://www.polyscale.ai/ that is a good fit for this type of use case. It's a global database cache and integrates with Postgres/MySQL etc. We host PoP's globally so the database reads are offset and local to users.

Agree with the other comments in that this feels like a shiny use case to quote to other prospects, but all good :)


My guess is that this is sales pitch. It will be rolled out to business customers to say "look at our shiny bells and whistles", and contracts will be signed.


I played with the website and it feels really nice.

My guess is that this was web people who were contracted to build a read-only daily updated dashboard instead of interactive web app so they treated it as another web app, just scaled up.


To add to this the scale of the data is presumably quite small as well. The geographical resolution is probably not super fine, there's only a handful of different kinds of data (deaths, vaccination whatnot) and the time resolution doesn't have to be too fine either (a day?). Even if you wanted to query it in very sophisticated ways you wouldn't need a database.


In fact, the UK dashboard had a suspicious outage when total case numbers exceeded the 1 million row limit of Excel... I suspect Excel is used in the data prep stage, if not in serving the dashboard.


It’s entirely unnecessary. The data is updated too infrequently to justify anything like this.

I built a one-pager vanilla JS site that polls the official Johns Hopkins aggregated data daily, and displays dynamically generated smoothed moving average charts, performs curve similarity analysis to identify similar patterns in different countries, and performs logarithmic regression to depict current doubling/halving times.

This happens entirely on the client side, with no server side component whatsoever (other than the http server to deliver the static HTML&JS that does all the work). See https://covid-19-charts.net/
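As a rough illustration of the kind of client-side math the parent describes, here in Python rather than the site's vanilla JS and with no claim to match its actual code: a trailing moving average, and a doubling time estimated by least-squares regression on the log of the series.

```python
import math

def moving_average(values, window=7):
    """Trailing moving average; shorter prefix windows at the start."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def doubling_time_days(values):
    """Estimate doubling time from a least-squares fit of log(values)
    against day index. Assumes strictly positive daily values; a
    negative result would indicate halving rather than doubling."""
    n = len(values)
    xs = range(n)
    ys = [math.log(v) for v in values]
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return math.log(2) / slope
```

All of this runs happily in a browser or on a phone, which is the parent's point: the computation is trivial next to the serving architecture.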


I came here to say exactly this. Is there a reason why they didn’t do it? I couldn’t figure it out from the article.


First, when you see public officials doing a blog post on "Microsoft.com" website instead of on a public website, you know that something fishy is going on...

On the other hand, I have the feeling that this thing is clearly over-engineered. Just look at their data diagram... If I'm not wrong, there is one writer and multiple readers for the data, or at least multiple writers on one side and multiple readers on the other side, without a need for "real time" consistency.

So, this thing could probably have been split up better to avoid the need for "scaled" databases.


> First, when you see public officials doing a blog post on "Microsoft.com" website instead of on a public website, you know that something fishy is going on...

The article states it was written by Claire Giordano from San Francisco. Not sure where you got the UK Government official from.

To me it read like a b2b marketing piece and showcase. Kind of: We can power this, so we can power your BI dashboard as well.

Taking this into account it was a nice write up and from a data analyst's and consultant's pov interesting to read.


Coming from consulting, this is exactly what it is - from a pure engineering standpoint it may be lacking but if you've read any other 'case study' targeted as business people this goes really deep. Normally it's SEO fodder crammed with jargon and buzzwords


It definitely had a bit of a marketing tone, but it was focused on what's ultimately an open source product you can run pretty much anywhere, from a different cloud provider to your own bare metal, they just happened to use it on Azure.


Indeed I did not check the bio of the article writer, but when you read:

<<As a result, the GOV.UK Coronavirus dashboard became one of the most visited public service websites in the United Kingdom.>>

You don't expect the gov.uk dashboard to have been done by US consultants...


> First, when you see public officials doing a blog post on "Microsoft.com" website instead of on a public website, you know that something fishy is going on...

Maybe, I'm naive, or not cynical enough, but I just read this as a case study of customer using Azure to provide the general public with information in a robust fashion.

In fact, if anything, the whole article is remarkably light on pushing Azure, and quite heavy on architecture details.

The open source code (on GitHub) uses Postgres (not MSSQL), and Python (not C# or PowerShell), and in fact has a screenshot of JetBrains' PyCharm, not VSCode.

In fact it's probably quite an MS agnostic article.

Even though gov.uk is actually a really good IT company, I'm quite pleased that they're using "the cloud" rather than trying to create their own.


> Even though gov.uk is actually a really good IT company

For anyone who's wondering, the relevant team here is GDS[0]. We hired a bunch of engineers from there at one of my previous companies - which was doing some quite gnarly technical work - and they were superb. I believe the US equivalent is 18F.

[0] The Government Digital Service in full, but no definite article for the initialism.


To be accurate, it's not completely Microsoft agnostic: it makes a use case for the PostgreSQL extension Citus[1], and the company behind this extension was acquired by Microsoft two years ago[2].

[1]: https://github.com/citusdata/citus

[2]: https://blogs.microsoft.com/blog/2019/01/24/microsoft-acquir...


Not as much tech detail, but for an alternative source, here’s an intro by the team dashboard lead - https://ukhsa.blog.gov.uk/2022/01/20/reporting-the-vital-sta...

Also - I’ve been really impressed by the openness of the team actually doing the work - eg threads like https://twitter.com/pouriaaa/status/1476892793729654787

and in particular this analysis of debugging a problem that the dashboard encountered - which also gives a lot more background context: https://dev.to/xenatisch/cascade-of-doom-jit-and-how-a-postg...


I like the UI design in figure 1. There's no crap in the way of the data, but I don't feel overwhelmed either. My eyes can scan across sections and it feels natural; there's no firehose effect. I like the thought that's gone into showing the % vaccinated in the top right. I like the dashed underlines telling me that some explanatory text is available.

I think the page looks inoffensive but is clearly focussed on being informative. I wish more data repositories took care and attention towards how data is represented.


In general the gov.uk website is stellar. Lots of good info, and plenty of thought has gone into making everything clear, accessible and pleasing to the eye. Things generally work in a way that they really don't on most public org websites. The team behind it blog a fair bit about how they've made these things happen.

https://insidegovuk.blog.gov.uk

The only downside is that they often send you to sites run by other, significantly less competent bodies (looking at you, student loans company).


They had an early win getting an effective and consistent UI across government sites, but digitising the underlying processes is a work in progress. Once you click that beautiful and accessible green submit button, it's not impossible a printer in Swansea whirs to life to print your answers out into the original paper form.


http://ehmipeach.defra.gov.uk/

>Using the right browser - only use Microsoft Edge PEACH is compatible with Microsoft Edge, but only when Internet Explorer mode is enabled. For guidance on setting up Internet Explorer mode in Edge, follow this link to instructions on the Microsoft Support Page.

Yeah, there's still some way to go.


> it's not impossible a printer in Swansea whirs to life to print your answers out into the original paper form

Speaking of which: my mum went through the form to renew her Oyster card the other day, and, having finally completed it, it generated a PDF form and told her to print it out and take it to the Post Office.

So yeah, there are definitely some Jira tickets still on the left.


While I almost entirely agree with you, having had to report positive LFT Covid tests and do the subsequent "test and trace" form this week, I feel that while the UI and UX are nice, the flow of those forms is really quite poor. There are repetitive/redundant questions plus inconsistent and out-of-date advice. It's probably more an effect of the difficult moving target of changing rules and advice, but for what is an important government process everyone is coming into contact with, I thought it would be better. Maybe wishful thinking though.


I think GOV.UK in general does a very good job of making important information readily available. It’s however extra jarring switching between the clean/efficient style of the online messaging and the underlying public services/offices that are still held together by chewing gum and a fax machine, whenever you have some issue that can only be resolved by persuading a human to stamp some paper (Visas, public records, etc.)


At least you got that. Try Spain, when the tax agency uses behavioral data and puts incentives for tax agents, but you can't get a f*ing appointment for pretty much any service as they use the most stupid appointment system ever, and 90% of the state services are subcontracted to the lowest bidder making everything a PITA to use.

All the fancy tech to get in your pockets. For everything else, go f*k yourself.


It was a lot worse a decade ago. The GDS have done a pretty good job of unifying the design language and methodology of disparate government departments, but of course, it is a huge job. It clearly involves just as much cultural and organisational overhaul as it does technology work.

Most recently I found the DVLA license renewal was one of those ugly backwaters (albeit still fully online), but their license check code generator is great.

For real terrible stuff, check out local council websites.


Yes, I thought the same thing in finding it easier on my eyes/quick perception of the site.

I do think the UK and some other countries do a better job of presenting data compared to the CDC.

It's pretty much agreed that unvaccinated people wind up in hospital beds at several times the rate of vaccinated people; however, all the CDC data presented is only rates. I want tallies or counts, and I cannot find them. For instance, on Ontario, Canada's site[1], the vaccinated are 74% vs. the unvaccinated's 26% of COVID hospitalizations. Most non-technical people think the unvaccinated share of COVID hospitalizations is over 90%. It's because more and more people are vaccinated: even with a lower rate of hospitalization, the absolute numbers are higher. Also, it's interesting to see on the Ontario site that COVID hospitalizations consist of 56% admitted directly for COVID and 44% admitted for other reasons who then tested positive for COVID once hospitalized. The split is more telling for ICU, with 81% admitted for COVID and 19% for other reasons.

I am trying to play with raw data more for refreshing my munging skills than for making a point or adding fodder to the COVID noise. I have been coding since 1978, played with neural nets, GAs, and GP in the late 1980s, but I don't code or do data analysis for a living right now (other than business strategy reports that require some basic analysis). There's a lot of data out there, and it can get very confusing. I am back to using R/RStudio from a brief stint using Julia/Pluto notebooks and previously Python/Jupyter notebooks. I even did a toy DSEIR model in J back in April 2020 based on previous work by a couple of people, which I plan on porting to April[2]. I am going to try and do some Lisp work, and I think I will settle on RStudio and Lisp for more genomic/bioinformatic stuff (yes, I know BioLisp has been supplanted by Python; however, Lisp is having a renaissance in symbolic areas of ML again, like NLP). BTW, in what language was GPT implemented? Not the API languages, but what PL(s) were used to create the code: C++, Java, Go?

I may be bad at navigating the CDC website, but I can't seem to get the dataset of numbers of hospitalizations by vaccination status, only rates or pre-filtered data. I do remember downloading raw data that seemed to have it (over 1.8gb, I think), but I can't seem to find it. I'd appreciate a link if anyone has it.

[1] https://covid-19.ontario.ca/data/hospitalizations#hospitaliz...

[2] https://github.com/phantomics/april


It's funny to read about a dashboard with TBs of memory and distributed DBs when on HN, people pride themselves on getting Web servers to run on floppy disk based systems.

Joking aside, I liked the description of the dashboard, and generally speaking the UK's government Web sites are better quality, support open data more, are easier to read and navigate than other European countries from what I have seen. This includes this dashboard, which looks clean, simple and functional.

I was waiting for the big SQL Server advertising language and was positively surprised that the article is very tech agnostic. It did all seem to be rather over-engineered, but Microsoft needs to make some money and government agencies don't generally have wizards from HN working for them, so I can live with an occasionally over-engineered system as long as important systems are working and remain up.

The most mysterious part for me was why one would put JSON inside relational tables?


> The most mysterious part for me was why one would put JSON inside relational tables?

Cheap and easy way to permit a flexible schema for some part of the data. Performance tests probably showed that for their specific query workload, any slow down from parsing/lack of index was fine.
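A minimal illustration of that pattern, using SQLite's JSON1 functions as a stand-in for PostgreSQL's jsonb (table and field names are made up): fixed columns for what you always filter on, a JSON payload for metrics whose set changes over time.

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE time_series (area TEXT, date TEXT, payload TEXT)")
con.execute(
    "INSERT INTO time_series VALUES (?, ?, ?)",
    ("E1", "2022-01-23", json.dumps({"newCases": 12, "newDeaths": 0})),
)
# In PostgreSQL this would be a jsonb column queried with ->> (and a GIN
# index if needed); SQLite's json_extract plays the same role here.
row = con.execute(
    "SELECT json_extract(payload, '$.newCases') FROM time_series "
    "WHERE area = ? AND date = ?",
    ("E1", "2022-01-23"),
).fetchone()
```

Adding a new metric is then an insert, not a schema migration, which is handy when the reported metrics change during a pandemic.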


Elsewhere[0] Microsoft have redefined "Open Source" to not include the right to redistribute, or to host on a cloud service.

So while there's nothing wrong here with calling an MIT project open source, it's not inconsistent with their own definition, and usable as propaganda.

[0] https://azure.microsoft.com/en-gb/services/developer-tools/d...

>Is Azure Data Studio open source?

>Yes, the source code for Azure Data Studio and its data providers is open source and available on GitHub. The source code for the front-end Azure Data Studio, which is based on Microsoft Visual Studio Code, is available under an end-user license agreement that provides rights to modify and use the software, but not to redistribute it or host it in a cloud service. The source code for the data providers is available under the MIT license.


It’s still useful to have the source open for reference even if you can’t redistribute though.


How is the most common open source license not open source?


I mean, whatever the truth may be, you're begging the question somewhat by calling it an 'open source license'.


BSDL, GPL, MIT License.

This has been an argument for at least 25 years that I've been around this stuff.


Both sides of that argument agree that the right to redistribute is fundamental to FLOSS.

Microsoft is defining their product, which you can't redistribute[0], as "Open Source".

[0] https://github.com/microsoft/azuredatastudio/blob/4012f26976...


> From the beginning of the COVID-19 pandemic, the United Kingdom (UK) government has made it a top priority to track key health metrics and to share those metrics with the public.

According to Dominic Cummings (ex-adviser to the PM), this isn't true at all - one of their biggest failings early on was to not have the data and not see the priority in getting it.[1]

[1] https://news.sky.com/story/dominic-cummings-hearing-the-insi... : He added later that there was no data system at that point, and he needed to use his iphone as a calculator to make predictions about the extent to which infections would spread, which he then wrote down on a white board.


It's worth noting for those who don't follow UK politics (I don't recommend doing so), that Dominic Cummings was fired as the top advisor, and like a jealous ex he is determined to bring down the government by any means possible. So he is an unreliable source to say the least. Although the government seems to have left enough rope to hang themselves without Cummings needing to invent anything.


The government was tracking key health metrics and sharing them at the point Dom was talking about using his iPhone. For instance, on the same date, the government did its first daily briefing and shared infection and death metrics with the public (see https://www.bbc.co.uk/news/uk-51901818).

Tracking key health metrics and sharing those metrics with the public doesn't mean that there is modelling about the extent to which infections would spread - although we also know that the imperial modelling was released a day later, so while he may have been using his iPhone to make predictions there were also academic teams modelling this that were collaborating with the government at the time (see https://www.imperial.ac.uk/news/196234/covid-19-imperial-res...).

It's also not clear what a 'data system' is in this context. There was clearly an effort to very quickly put something in place to capture data (because it couldn't wait a few weeks/months), but a more robust analytics system will inevitably take more than a few weeks to put in place if not already there pre-pandemic (a lot of this is about how NHS trusts are structured in the UK; they operate fairly independently).

It's not clear to me how quickly it would be realistic to implement what Dom thought was suitable as a 'data system', particularly as I'm not clear on his requirements (he seems to want an element of forecasting built in, for instance?). So without knowing the requirements, can we be confident that what he wanted was possible to build, test and implement in his expected timeline?

So I don't think there is a clear contradiction here (and in fact, I think the evidence points to the fact that the statement in the article is probably correct).


From watching his testimony I believe Cummings was disappointed that this 'data system' wasn't available before the pandemic, precisely because that few week delay putting one together meant everything.


> I believe Cummings was disappointed that this 'data system' wasn't available before the pandemic, precisely because that few week delay putting one together meant everything.

It's very easy to say that something should have been there after the fact, and harder to build a system for an unknown-threat before you have any clear requirements.

It's also not clear to me what data a 'data system' would have provided that would have meaningfully changed any policy (and if it didn't affect policy, I'm not sure how it can 'mean everything').

What data did the government not have in March 2020 that they could have collected with some sort of pre-built 'data system'? In reality the bigger issue in understanding the situation at the time was that we couldn't identify all the covid cases anyway because there wasn't enough test resource - it wasn't the lack of a 'data system'.


Yes, but Cummings deliberately asked questions which are fundamentally unknowable and went with Report 9, which claimed to be all-knowing...

> "and to share those metrics with the public"

Given that stats on how many patients were in ICU didn't officially exist, so you couldn't request them with an FOI request, I'll let you work out how true this is. They wanted to control the data to craft a narrative to justify the Report 9 claims that we'd have plague bodies in the streets because this is the end of days...


Well, it depends what time period you're talking about.

In early March 2020, they briefly announced that the daily COVID figures would move to a weekly cadence.

When locking down, they were still flying blind, and after that (during Hancock's 100k tests a day moonshot) there were leaks that "figures were being compiled in a notebook by calling round different labs".


I skimmed the article and it seems interesting.

On the data side, they have ~7.5 billion total records and they add 55 million new ones a day. On the web side, they have ~1 million daily unique users and 100k concurrent users at peak ("concurrent" meaning "in one minute", it seems).

I'm no expert on the web part, but I'm kind of curious why they went with the design they did for the data part. The design and the chosen technologies make me think they treated it more like a normal web app, not like a dashboard. I would expect an OLAP database, not sharded Postgres, and the data model feels very OLTP to me as well. Or maybe that's because it's mostly time series and not a traditional data model?

I'll have to go through the article in more detail.


OLAP stores are relatively fast at answering a single query on a large data set, but basically none of them can handle high throughput with subsecond response times (e.g. when the whole country checks statistics for their own postcode at 4pm).

OLTP stores are relatively bad at aggregating across a lot of data.

Analytics dashboards with many users, a lot of ever-changing data, and many different views exist in a gray area between OLAP and OLTP often referred to as real-time analytics or operational analytics. The queries are usually somewhat lighter / less ad-hoc / more indexed than in OLAP, but there can be hundreds or thousands of them per second with different filters and aggregations.

There are some specialized real-time analytics databases like Druid. Citus (used in the article) allows you to run such workloads at scale on PostgreSQL.
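The core trick of such scale-out systems can be sketched in a few lines: hash a distribution column so that all rows for one key co-locate on one worker, letting a per-area query run on a single node while the full dataset spreads across twelve. This shows the idea only, not Citus's actual hash function or shard layout:

```python
import zlib

def worker_for(area_code, workers=12):
    """Hash-distribute rows by area code, Citus-style in miniature:
    every row for one area lands on the same worker, so a query
    filtered on area touches exactly one node."""
    return zlib.crc32(area_code.encode("utf-8")) % workers
```

Queries that aggregate across areas instead fan out to all workers and merge results on the coordinator, which is where the distributed-query engine earns its keep.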


A few other commenters have pointed out the same thing - I’m wondering if it’s simply the skill set they had on hand when the need arose


The UK covid dashboard[2] mentioned in the article also has a "simple" version.[1]

I love it when websites have a simple text version.

[1] https://coronavirus.data.gov.uk/easy_read [2] https://coronavirus.data.gov.uk/


https://github.com/publichealthengland powered by a lot of Python and Ruby, nice.



Nice catch!

It surprises me how much more popular F# is in Europe compared to the US. I finally got a professional F# gig in the states (\o/), but there were very few options. It makes me wonder, are universities in Europe providing a more functional-first approach to CS education, or is something else going on?


Never seen a job ad on anything sharp other than C# in Europe.

There are occasionally LISP and Clojure jobs from what I can tell.

(It's also hard to find people on the talent side. I needed a Haskell developer with NLP skills in 2005, and could not find one so we had to port our codebase to Java.)


It's anecdotal, but on the F# Software Foundation's slack workspace[1], 4 out of the last 5 postings in the #jobs channel were in Europe.

No doubt, any company that picks a niche programming language as their business's lingua franca is taking a risk. For me, though, that is an indicator that they care about quality and do not have a culture of treating engineers as replaceable assembly line parts.

[1] https://fsharp.org/guides/slack/


This feels like a marketing-led attempt to shoehorn Citus into something topical and shareable, having realised that they've barely talked about it since acquiring the company a couple of years ago.

I'm all for Citus, but cmon. Overkill.


The covid dashboard I've found the nicest to use, is actually not one of the official ones but instead this https://www.travellingtabby.com/scotland-coronavirus-tracker... .

The information is presented clearly and it's easy to see what's going on, although in my case the main reason is the breakdown for Argyll & Bute, which isn't a focus area for the national ones!


It is miles better than my state's local Covid Dashboard (https://coronavirus.iowa.gov/) that is updated fully once a week on Wednesdays. On Monday and Friday they simply post a screenshot of a pixelated version of a summary page only.


Looking at their database schema (table "auth_user"), it looks like the user passwords are stored unencrypted and without salt?
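If that reading of the schema is right, it's worth noting how little code proper storage takes. A minimal sketch using only the standard library (PBKDF2 with a random per-user salt; the function names and iteration count here are illustrative, not taken from the actual codebase):

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256.

    600k iterations is roughly the OWASP-recommended ballpark
    for PBKDF2-SHA256 at the time of writing.
    """
    salt = salt if salt is not None else os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, key

def verify_password(password, salt, expected_key):
    # Constant-time comparison avoids leaking timing information.
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

With a random salt per user, two accounts sharing the same password still get different stored hashes, which is exactly what an unsalted scheme loses.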


Given the original meetings were all about producing a dynamic simulation dashboard which would allow people and politicians to understand the impact of various measures on lives saved...

Typical that this turned into a pro-cloud puff piece that frankly shows a serious amount of over-design for what should be a data filtering/processing step to any reasonable "data scientist". And if I'm having to say a data scientist could do it better, you know you got it wrong...


> people and monsters

monsters?


Fixed to politicians. Freudian slip and a half there...


Wonder why they haven’t used powerBI for that, but deep inside I know why...


I would have preferred to see it implemented as a spreadsheet in the cloud.


Well, you can just grab the data very easily: https://coronavirus.data.gov.uk/details/download


Yeah, not what I prefer.

As a programmer, I want to make demands about UX too, for a change ...


Why not build your own UX on top of the data?


The data is exportable but you then have to deal with 8MB+ csv files. Still it's better than it being in docx I suppose...

But then this would let you perform statistical analyses on _their_ data, I'm not sure they're such a big fan of that...


What format would you prefer it to be in than CSV? There are CSV libraries in pretty much every language and any spreadsheet can import them. 8MB really isn't that hard to handle these days.
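Agreed — the stdlib handles files of that size fine. A sketch of computing a 7-day rolling case total from such a download, assuming the column names (`date`, `newCasesByPublishDate`) match what the dashboard exports and rows arrive one per day in date order:

```python
import csv
from collections import deque
from io import StringIO

def rolling_weekly_cases(csv_text, cases_col="newCasesByPublishDate"):
    """Yield (date, 7-day case total) pairs from dashboard-style CSV text.

    Assumes one row per day, sorted by date; column names are an
    assumption based on the dashboard's current download format.
    """
    window = deque(maxlen=7)  # keeps only the last 7 daily counts
    for row in csv.DictReader(StringIO(csv_text)):
        window.append(int(row[cases_col]))
        yield row["date"], sum(window)

# Small synthetic sample standing in for the real 8MB+ download.
sample = "date,newCasesByPublishDate\n" + "\n".join(
    f"2022-01-{day:02d},{day * 100}" for day in range(1, 11)
)
totals = dict(rolling_weekly_cases(sample))
```

Streaming through `csv.DictReader` like this never holds the whole file in memory, so even a much larger export is no problem.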


[flagged]


Yes. This was never pure ONS data. It was always processed/massaged/tampered with.

Although horrifyingly it did allow you to gauge the Westminster mindset and understand what Draconian measures they were planning to introduce slightly ahead of time, because the data would have to reflect this when lord Boris got up on the podium...


Despite the downvotes the only incorrect statement here is that Boris has a lordship. Go export the ONS data yourself and analyse it unless your too lazy to see what I mean


> Go export the ONS data yourself and analyse it unless your[sic] too lazy to see what I mean

If you're going to make such claims then the onus is you to provide the evidence.


Surely the onus is on the person claiming it's tampered?


Not tampered: processed and misrepresented. Tell me where you can remove the logarithmic plots on the MS data. Tell me where you can get the data behind this great firewall of information other than FOI requests. And do gain some basic literacy in stats to spot some of the obvious oddities in the datasets (they're there)


> Government project

> awarded to Microsoft

Hey Europe, want to stop being several decades behind in IT compared to US/China?

One simple trick:

Ban FAANG from public procurement in Europe!

It's a no-brainer really.

Buy locally, ideally giving small companies and startups a chance.

You will have to do it anyway very soon if you want your privacy laws to be taken seriously.

There might be a couple of months of friction while bureaucrats have to find new procurement partners, but that's it.

And then the European tech scene will rise.


I really can't agree with this more. My university gives >$2e6/year to Microsoft alone. I'd much rather it gave >$2e6/year to providing jobs for locally employed people, rather than buying someone another yacht.


Curious how many students and faculty at your university are served by that $2m/year?

I don’t disagree with you, but if I were the CTO of a 10000+ seat organization, and a Microsoft/Google/etc told me they could provide email, storage, sharing/collaboration, office apps, security etc for a few bucks per user / month… that’s a pretty compelling deal.


And the idea of a university rolling its own office apps suite to provide local employment sounds like the sort of decision likely to end up in tears, or at least in students relying on Excel and Word on their own computers anyway


Most people wouldn't end up with rolling their own, but contracting existing support providers or developers.

E.g. for Excel and Word replacements there is LibreOffice, and any number of companies offer more polished packaged up LibreOffice variants and surrounding services. One example is Collabora in the UK[1]. In that respect the effect would be to shift revenue that currently mostly leaves Europe to companies closer to home.

The biggest hindrance is that if you do too much of that, you'll get the US and others doing the same thing in return, whether in the same sectors or entirely different sectors.

[1] https://www.collaboraoffice.com/


> Ban FAANG from public procurement in Europe!

Is it any better if SAP / Telekom get similar contracts (what usually happens in Germany)?


At least the tax euros stay in Europe, and you get local expertise, tech hubs etc


Don't Google Microsoft et al have offices in Europe?


> Microsoft

> Ban FAANG

I know exactly what you mean, but is that what we're doing now? Including Microsoft in FAANG but not changing the acronym?


to be fair, microsoft should've been included in the first place. they've been teetering top 3 market cap for a while now.


They should replace Netflix in FAANG or include Disney, HBO and a bunch of others.


A couple of months wouldn't replace Microsoft Word, never mind Excel, AWS, Azure AD or Azure.


Microsoft didn't build this project, they just host it.

It's built in-house by the government.


For instance in Finland Azure/AWS/GCP are seen as a superior alternative to anything local/European, and we've started to move our governmental and healthcare infra into these cloud providers.


Get your point.

Those small companies and startups are going to end up using Microsoft/Amazon/Google for their hosting/cloud-services anyway so FAANG still win in the end.


The UK isn't part of the EU anymore.


Maybe GP has edited their comment, but it says "Hey Europe" and we are very definitely still part of Europe.


But one day you will put those offshore wind turbines into reverse mode and push the isles to the middle of the Atlantic, right?


Indeed it was edited, it said EU before. Europe makes even less sense, there is no centralized public procurement in all of Europe.


But frankly we're very similar in this regard, and given we might get dropped from H24 we're not in a good position tech or science wise heading into a recovery...

We'll no doubt award yet more govt projects to the tech oligarchs of the west and praise students for using their toys...



