Random ScotRail Apology Generator (datasette.io)
81 points by notpushkin on Oct 16, 2022 | 46 comments



The background here is that someone did a FOI request for the announcement audio, crowdsourced transcriptions, and various things have been built on top of it.

Related previous posts:

“Ambient Scotrail Beats – Relax to Scottish train announcements over low-fi beats” - from the original FOI requester, covers details of the background and crowdsourcing:

https://news.ycombinator.com/item?id=32535491

“Hacking around with the ScotRail audio announcements” - this “datasette” is from the creator of Datasette, Simon Willison:

https://news.ycombinator.com/item?id=32536808

(Slight aside: Datasette is an awesome tool for publishing and exploring these sorts of datasets.)


> The background here is that someone did a FOI request for the announcement audio

Whatever the other merits of this project, I'd like to know the motivation behind the FOI - I really dislike it being used for trivial reasons.


One person’s trivial is another person’s art. FOI exists to liberate information created using taxpayers’ money, so that information can be used by the public as they wish (after all, they paid for it).

Every request enhances the public’s understanding of government, and every request creates the opportunity for the information to be reused in new, unforeseen ways. Some may seem trivial (like a random apology generator), others may significantly impact people’s day-to-day lives (better public transit routing apps). But until the FOI happens, and people can play with the data, neither will ever happen. Frequently the people doing impactful things with the data aren’t the same people requesting it. It’s also likely that many of those impactful projects would never be started if the data wasn’t already free. So every “trivial” request may lead to something far more impactful down the road.

If you’re concerned about the cost involved, then I would recommend criticising government for not making FOI fulfilment either efficient or unnecessary. If government agencies just published the data (like TfL does), then there wouldn’t be any need for FOI requests.


Literally everything that is available via FOI should just be posted online proactively.


Curation would become a problem (maybe even to the point of requiring more resources in aggregate than complying with current FOI/FOIA requests).


I'm not suggesting any curation that wouldn't already have happened as a result of existing needs for organization. Just publication.


Look at your own photos on your phone. Mine are all IMG_####. Scale that up to thousands of organizations and tens of thousands of users.


The point is that publicly funded and government data should be public by default. Organising that now-public data isn’t within the mandate of the originating organisation, and doesn’t need to be.

Journalists, hobbyists, investigators, or the just generally intrigued can organise the data or trawl through it to their heart’s content.

That’s literally what Datasette was built for.


Exactly. Is there some expectation that the governmental organization will have a giant pile of completely unorganized data? Or that they'll have it organized for their own day-to-day usage? The latter will suffice.


“I would like the ScotRail audio files.”

“We don’t do FOI any more; it’s all online but uncurated. Download the petabytes of data from gov.uk and I’m sure you’ll find it in there somewhere.”


Presumably, a train fan wanted to play with announcement audio. Is that not reason enough?

I'd try it myself, but local train station announcement systems are commercial software that's licenced by the central rail authority, so I suppose it wouldn't work.


Why is that? You clearly feel strongly about it. I’d not considered that FOI requests might elicit such a response.

Perhaps you could define the boundary between trivial and non-trivial uses?

Have you had to deal with requests for information that you regard as falling on the trivial side of things?

Is there a solution to your dislike of certain types of use?

Apologies for all the questions but your comment feels like it’s missing some genuinely interesting detail.


I've seen public sector organisations where FOI laws have been used by salesmen spamming to determine technology infrastructure to pitch their product.

It's not why FOI was created or motivated. There is genuine need for openness but this sort of thing just tasks already underfunded and stretched departments with work that distracts from providing core services.


I can see how that would be frustrating and sort of causes a misuse of public funds.

Feels like infrastructure queries shouldn’t come under FOI when I first consider it. Then I suppose there might be legitimate reasons to care: perhaps things like conflict of interest in contract allocation, or things not being put out to tender.

As you mention, it feels like publicly funded things should be able to be inspected by default, so perhaps having it out in the open all the time is best. That requires funding and it requires a commitment at a governmental level to get it right.

With the lack of funding provided to public institutions in the UK though, it’s pretty hard even for the basics to be done well. I’d better not get on to that subject lest I get testy.

Back to your main point though, I don’t know how to differentiate (with a set of unambiguous rules) between the trivial and non-trivial in a way that doesn’t accidentally prevent real use cases.

Perhaps the way FOI is set up is at fault here rather than the people who make use of it, trivially or otherwise.


FOI laws have been created so that the public can gain information about the internal workings of the government - and the scope is deliberately wide. It includes activities that you or I may not consider worthwhile, but that’s our judgment call. If handling FOI requests overloads the departments, then the answer would be to just make the information accessible to the public from the outset, or to supply sufficient funding to the departments so that they can provide the services required by law, including answering FOI requests.


I'm amazed at the specificity of some excuses.

I got "a road vehicle colliding with a bridge earlier on this train's journey", and (forgot the actual text) "a bicycle on the rail earlier on this train's journey".


Some interesting research, which I think TfL did, basically says that people are more understanding of delays etc if they’re given the actual reason why a delay occurred.

After all, you’re much more likely to forgive ScotRail for getting home late if you’re told your train was delayed due to someone leaving a bike on the track than you are if you’re told nothing. A lack of explanation just comes across as either indifference to your situation, or incompetence. Neither is a good look.


Unless the reason is leaves on the line, or the wrong kind of snow.


Leaves are surprisingly tricky. Trains mulch them into a hard slippery varnish on the rails, which is then extremely difficult to remove (current best methods involve industrial lasers).

The slippery varnish substantially reduces braking performance, and increases braking distances. That in turn means all of your safety margins vanish, and it’s no longer possible to operate trains at full speed and at normal following distances. All of that in turn means delays on an individual train basis, but also lower network capacity. Given that most of the UK’s commuter rail operates at 80%+ capacity (basically the limit, before minor failures consistently result in catastrophic cascading failures), the result is that leaves on the rails are actually a pretty serious problem that’s tricky to solve.
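
A quick sketch of why lower grip hurts so much, assuming the simplest possible model: constant deceleration limited by wheel-rail adhesion, so stopping distance is v² / (2μg). The two adhesion values below are illustrative guesses rather than measured figures, the function name is made up, and real trains have a lot more going on (gradients, brake blending, wheel-slide protection).

```python
G = 9.81  # gravitational acceleration, m/s^2

def stopping_distance_m(speed_mph: float, adhesion: float) -> float:
    """Stopping distance under constant adhesion-limited deceleration: v^2 / (2 * mu * g)."""
    v = speed_mph * 0.44704  # convert mph to m/s
    return v ** 2 / (2 * adhesion * G)

# Illustrative adhesion coefficients only (assumptions, not measurements).
for label, mu in [("clean rail", 0.20), ("leaf-contaminated rail", 0.05)]:
    print(f"{label}: {stopping_distance_m(100, mu):.0f} m from 100mph")
```

In this model the distance is inversely proportional to the adhesion, so a quarter of the grip means four times the stopping distance, which is where the safety margins go.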


What about some kind of shovel or air blower to remove the leaves before they compress?

I don't deny it's an issue. But it happens every year. It's like having trains that can't run when it rains.


You would have to equip every train with that system. Leaf fall during autumn in the UK is surprisingly high, so cleaning once a day wouldn't even be close to good enough. Many councils increase their street cleaning 4x just to deal with leaves. Not to mention that wet leaves are surprisingly heavy and sticky, so you might need a pretty complicated system to reliably remove leaves on a train moving at full line speed.

At the end of the day, leaves cause issues for about four weeks out of every year. That makes it tricky to justify adding thousands of pounds of specialised equipment to every train, just to deal with four weeks.

Of course, you could find a solution to all these things, and have far more reliable trains. But the cost would be high, and it would require our government to actually want reliable public transport. So far, all the evidence suggests our government is only interested in subsidising private vehicles, because public transport "doesn't return a profit".


4 weeks is approx 1/12 of a year.

A train costs multiple millions of £.

So 'thousands' of pounds seems reasonable to keep them running properly for 1/12 of the year.

Are leaves heavy and sticky when just fallen? From my experience that seems to be the case when they've been around a while. The whole issue is that the network is running at near capacity. If they're being moved away every hour, it doesn't seem like you'd get that issue.

Ultimately it hasn't been done, so in some sense you are correct. I think it's more a case that they can't be bothered, rather than a technical or financial limitation.


> Ultimately it hasn't been done, so in some sense you are correct. I think it's more a case that they can't be bothered, rather than a technical or financial limitation.

Seems rather uncharitable. Given how much effort various rail companies put into providing a good rail service, seems rather odd that they got to leaves and decided “they can’t be bothered” anymore. I mean why stop there? Why bother running the trains at all? Why bother doing track upgrades or maintenance?

There’s 20,000 miles of train track in the UK. At 100mph it would take you about 8 days to traverse all of it. You would need 200 leaf clearing units running at 100mph 24/7 to clear leaves every hour. I can see why that might be a logistical challenge.
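
For anyone who wants to check the arithmetic, here's a quick sketch. The 20,000 miles and 100mph figures are the ones quoted above; the function name is made up for illustration, and it assumes units never stop, turn around, or cover the same track twice.

```python
import math

def clearing_units_needed(track_miles: float, speed_mph: float, interval_hours: float) -> int:
    """Units needed so every mile of track is passed once per interval.

    Idealised: each unit runs non-stop at full speed and no track
    is covered twice.
    """
    hours_for_one_unit = track_miles / speed_mph  # time for a single unit to cover everything
    return math.ceil(hours_for_one_unit / interval_hours)

track_miles = 20_000  # figure quoted in the comment
speed_mph = 100       # figure quoted in the comment

hours = track_miles / speed_mph
print(f"One unit: {hours:.0f} hours, about {hours / 24:.1f} days")
print(f"Units for hourly clearing: {clearing_units_needed(track_miles, speed_mph, 1)}")
```

Relaxing the interval helps a lot: clearing once a day instead of once an hour drops the idealised count from 200 units to 9.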


Why not point compressed air at the rails?

Edit: D'oh, already suggested up-thread. Rest of my comment still stands.

I'm sure even electric trains already have a compressed air system for brakes (though being UK, it might be vacuum).

If you really want to get fancy, have a computer vision system looking ahead that will only turn on the air when it sees leaves.


Even a vacuum system would imply pressurised (well at least forced, I'm not sure if there's technically a difference here) air.


Here you can browse the list of all reasons: https://scotrail.datasette.io/scotrail?sql=++++select+Transc...


Thank God they differentiate between "A fire near the railway involving gas cylinders" and "A fire near the railway suspected to involve gas cylinders". How else would I be able to sleep at night?


First one I got was "overcrowding because of a rugby match". They also have overcrowding because of a marathon, a football match, a sporting event, a concert, just general overcrowding, and overcrowding because there are fewer cars than usual.


This sort of thing is even 'funnier' when you realise that there must have been a whole discussion on this specific issue in one or several meetings.


I wonder how much of the message is for train and station staff, like Inspector Sands, as much as for passengers.


I have to wonder how they devised the list to be recorded. Perhaps they have a list of every reason a train has ever been delayed and they used that.


Some back story and previous discussion https://news.ycombinator.com/item?id=32536808


Anyone here spend at least an hour laughing at the abuse-a-tron in the late 90s?

I think this is it but it's been forever.

https://members.tripod.com/Koolkat_2/ABUSE.HTML


I note that the cancellation is in three parts: “has been cancelled”, “due to” and finally the reason, each said in slightly different ways.

Why not just say “…has been cancelled” and then announce the reason with a single, complete sentence in a longer piece of audio?: “This is due to livestock on the track.”, “This is due to a staffing shortage.”, etc

I get that it’s impossible to foresee everything the system might have to say and to record all announcements in one sitting with the same person in exactly the same way. However, is there an important technical reason for why these systems break up sentences so that they are spoken by different people?


Because it wouldn't sound like an official robot announcement without the awkward pauses.

On the Jakarta metro, most stations have a sponsor's name attached to them. The sponsors change every now and then, so the new names have to be switched in, but they put zero effort into equalizing the audio. So you get treated to (lovely female voice) "Stesen berikutnya, Lebak Bulus" (dude yelling) "GRAB". (lady) "Lebak Bulus" (dude again) "GRAB."


That’s because the SQL powering this demo only selects the “has been cancelled” announcement section. Presumably they have other snippets they can put in there like “has been delayed”, or “has terminated early” etc. Depending on the nature of the incident, you may not have, or need, a reason. It’s probably not worth putting a reason in if the train has only been delayed 5 mins; you may not even know the reason for the delay in time for the announcement.
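
A rough sketch of the idea, using Python's built-in sqlite3 with an in-memory table. The table name, column names, rows, and helper functions here are invented for illustration and are not the real ScotRail schema or the demo's actual query. The point is just that the outcome snippet and the reason snippet are picked independently, so swapping the outcome or dropping the reason is a small change.

```python
import random
import sqlite3

# Hypothetical schema and rows, NOT the real ScotRail dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snippets (kind TEXT, transcription TEXT)")
conn.executemany(
    "INSERT INTO snippets VALUES (?, ?)",
    [
        ("outcome", "has been cancelled"),
        ("outcome", "has been delayed"),
        ("reason", "livestock on the track"),
        ("reason", "a staffing shortage"),
    ],
)

def pick(kind, contains=""):
    """Return one random transcription of the given kind, optionally filtered by text."""
    rows = conn.execute(
        "SELECT transcription FROM snippets WHERE kind = ? AND transcription LIKE ?",
        (kind, f"%{contains}%"),
    ).fetchall()
    return random.choice(rows)[0]

def announcement(outcome_filter="cancelled", with_reason=True):
    """Build an announcement from independently chosen snippets."""
    text = f"This service {pick('outcome', outcome_filter)}"
    if with_reason:
        text += f" due to {pick('reason')}"
    return text + "."

print(announcement())                  # always a cancellation, with a random reason
print(announcement("delayed", False))  # a delay with no reason given
```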


The goal is to avoid as many mid sentence voice changes as possible. That’s hard if your announcement is one sentence:

The X service from A to B has been Y’d due to Z

I’m wondering if, instead, the announcement could be:

The X service from A to B has been Y’d.

This is due to Z.

I’ve said this before — there aren’t that many different train services. You could record every version of both sentences in full, and almost certainly record every version of every sentence in full by the same person in a tractable amount of time. I don’t think the problem scales as much as these engineers think it does.
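
To put rough numbers on that, here's a back-of-the-envelope count of how many clips each approach needs. The three counts passed in at the bottom are placeholders I made up, not real ScotRail figures, and the function name is hypothetical, so plug in your own.

```python
def recordings_needed(services: int, outcomes: int, reasons: int) -> dict:
    """Count the clips to record under three ways of building an announcement."""
    return {
        # "The X service from A to B" + "has been Y'd" + "due to" + "Z", all spliced
        "spliced fragments": services + outcomes + 1 + reasons,
        # "The X service from A to B has been Y'd." then "This is due to Z."
        "two full sentences": services * outcomes + reasons,
        # "The X service from A to B has been Y'd due to Z", recorded whole
        "one full sentence": services * outcomes * reasons,
    }

# Placeholder counts for illustration only.
for approach, count in recordings_needed(services=2000, outcomes=4, reasons=150).items():
    print(f"{approach}: {count:,}")
```

The two-sentence version grows with services times outcomes rather than with all three multiplied together, which is why it stays tractable where the single-sentence version doesn't.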


Sounds like a lot of work, and cost (voice actors aren’t free), for some pretty marginal gain.

If people can understand the announcements easily, then they’ve done their job. Anything extra is just that, extra.


The number of routes is bounded by the number of drivers you hire, all of whom draw an annual salary, and who operate rolling stock that is bought and fueled multiple times on a daily basis.

For each of those you have to pay a voice artist to say the name of the service N times for the N types of notable announcement. Once a season. It’s nothing.

As well as that, factor in the number of passengers who will listen to these announcements, and think about the impact of getting the announcement perfect instead of just done. It really is the sort of thing worth getting right.

But yes, you will probably have one naive engineer in the middle of all this saying “repetition is Bad(tm) so let’s factor it apart and build it programmatically!” oblivious to the whole system.


> But yes, you will probably have one naive engineer in the middle of all this saying “repetition is Bad(tm) so let’s factor it apart and build it programmatically!” oblivious to the whole system.

Sure, because organisations like ScotRail are notorious for being engineer led, and clearly have plenty of extra cash sloshing around that couldn't be better spent improving the actual rail service.


Now do one for Deutsche Bahn :D

There was some news a few weeks ago where DB was saying train delays/cancellations were due to sabotage. Probably by Russian agents. Sure thing DB.


There’s absolutely enough to blame the DB for, but the sabotage was a real thing - unless you want to imply that the federal police is lying.

Source (in German): https://www.tagesschau.de/inland/bahn-sabotage-101.html


Deutsche Bahn has some announcements which are amazingly unspecific (something like "[the delay] is due to delays in the operating sequence"). They also have multiple euphemisms for accidents/suicide (translated it would be something like "personal injury at the track, personal injury IN the track, emergency doctor deployed at the track").

If they were to record the announcements today, maybe they should add one for sabotage and one for stolen overhead lines...


Relatively recently they switched from announcing track/train/signalling/overhead line/… failures to announcing everything as "repairs". I suppose the hope was that that would make them sound more proactive (look, we're already repairing the failure), but in practice those new phrasings often just sound silly.

> Deutsche Bahn has some announcements which are amazingly unspecific (something like "[the delay] is due to delays in the operating sequence")

And the local public transport operator (buses/trams) uses "operational reasons" as a euphemism for "lack of drivers".


We are sorry to announce that Virgin Trains East Coast service to Carluke has been cancelled due to A train not stopping in the correct position at a station.


A Train is asshoe



