Hacker News
NASA knows what knocked Voyager 1 offline, but it will take a while to fix (arstechnica.com)
90 points by ljoshua 3 months ago | 82 comments



The next time you're having a frustrating debugging session, comfort yourself with the knowledge that you're not debugging a hardware issue on a computer built 50 years ago where it takes 45 hours to produce the output of every single command, and where failure means the loss of one of humanity's most important scientific relics.


Only slightly more frustrating than using console.log before IE 9. In those versions, if you called console.log while the console was not open (F12), the statement whose primary purpose was to debug code would itself cause an error.

> Please note that in IE, unlike in Firefox, if the developer tools are not active, window.console is undefined and calling console.log() will break. Always protect your calls with window.console && console.log('stuff');

https://stackoverflow.com/questions/2656730/how-can-i-use-co...


I don’t know about their process, but if I were in charge of such work, I would create a replica system to mimic the responses without the wait and the risks. It might not eliminate all of them, but it would definitely minimize them.


There are replicas but there may also be differences because of the large amount of human effort required to build one. I saw a replica complete with wire-wrapping over a decade ago. This technology looks very foreign to any recent computer engineer. As such, the more practical approach is to write software that can validate most commands you would send. I wonder what level of emulation they have achieved (if any) on modern hardware.


Replicating the bits of hardware destroyed by radiation, in the correct place and to the correct degree, may be a bit tricky.


[flagged]


From the Guidelines[0]:

“Be kind. Don't be snarky. Converse curiously;”

[0] https://news.ycombinator.com/newsguidelines.html


No need to be harsh


I disagree. There is a pressing need to stamp out the sort of thinking prevalent in GP’s post where armchair quarterbacks post on the internet about how they would solve something when simply reading the article would tell you that the armchair quarterback’s second down pass from the one yard line is a Super Bowl loser.

Read the article.


It’s an internet message board. You can turn it off and walk away. There are no pressing needs. Take a series of deep breaths and relax your shoulders.


It’s rhetoric, man. Have a brew and take your own advice before replying to my posts.


When has telling someone "calm down" ever made them calm down?


People often say stuff like this and it's usually somewhat nebulous but... yeah. You're right.

I honestly can't think of a more difficult debugging scenario.


I wonder how many people have to carefully analyze and sign off on every command and update that is sent to the probe?


Probably fewer than you would imagine. Only so many people are qualified to scrutinize anything for these probes. It’s likely that whatever small team is doing this debugging has the capability within the group to sign off.


So they're saying that one memory chip in the FDS has failed, representing 3% of the FDS's memory.

The FDS memory seems to be CMOS, which was apparently novel at the time of launch[0]. I suppose that means it's static RAM. I have a notion that at least some CMOS SRAM came in 8-bit wide chips, storing 1-4 Kb. So that would imply total RAM of between 20Kb and 80Kb. I haven't managed to track down many definite numbers or facts.

It's programmed in FORTRAN77; it's a very long time since I was adjacent to FORTRAN, but as I recall, all variables were global. I'm sure it's tricky to write code that works even if there's a hole in physical memory; but I guess the absence of heaps and caches must help a lot.

[0] https://www.allaboutcircuits.com/news/voyager-mission-annive...


Couldn't you declare a variable that is mapped to the danger zone and canary it.


Unfortunately there's very little code space remaining to code in such a workaround. Easier and safer to rewrite the addresses in a patch.


That's what I assume the process would be.

In those days, a "patch" didn't mean a source-code diff; at least, not in the systems I worked on. We got binary patches. I think they were packaged for use with a program we called "patch", but you could definitely apply them with a hex editor, in a pinch.

We got patches for the OS itself this way. But we didn't have access to the OS source, nor the build tools - the OS was written in a proprietary ALGOL derivative, and we didn't have the compiler.

I was once shown how to patch the COBOL compiler so that you could produce programs that couldn't be written in COBOL, things like messing with IO ports.


It would be fun to try to design a system that can work around all kinds of failures, such as being able to re-purpose one subsystem to take over from another.

For example, have all the digital electronics be reprogrammable gate arrays.


Relative simplicity is also a key ingredient to the success of these probes. The computers onboard are completely understandable by a single person. The software stack doesn’t have so many layers of abstraction that you are disconnected from what the hardware is actually doing… in fact the opposite is true.


Redundancy is probably the best option. That’s how the Voyager is still working after one of the computers failed.


Redundancy is a very good option, but the "right" level of modularization helps too. You probably don't want to make everything out of FPGAs, but if your individual modules can be flexibly rewired on the fly and they are granular enough in their functionality to build new things, that adds yet another level of redundancy, as you might be able to replace a broken thing with the right combination of working modules.


Redundancy is the straightforward way: two of everything. That still means two failures can cripple the machine. I'm talking about going beyond that, i.e. if you need A and B and have modules X, Y, and Z, then any one of X, Y, Z can fail and you still have full functionality. That is, you can have more redundancy with fewer modules.
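The "pooled spares beat dedicated spares" argument can be checked with a toy reliability calculation. All numbers below are made up for illustration; `p` is an assumed per-module survival probability, and modules are assumed to fail independently:

```python
from math import comb

def k_of_n(p, k, n):
    """Probability that at least k of n identical modules survive,
    each surviving independently with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p = 0.9  # made-up per-module survival probability

# No redundancy: dedicated modules for functions A and B (2 modules).
bare = p**2

# Classic duplication: two dedicated copies of each function (4 modules).
# The craft dies if both copies of either function fail.
duplicated = (1 - (1 - p)**2) ** 2

# Flexible pool: 3 interchangeable modules, any 2 of which can cover A and B.
pooled = k_of_n(p, 2, 3)

print(f"bare (2 modules):        {bare:.4f}")
print(f"2-of-3 pool (3 modules): {pooled:.4f}")
print(f"duplication (4 modules): {duplicated:.4f}")
```

At these made-up numbers, one flexible spare (0.972) buys most of the benefit of two dedicated spares (0.980) while using one fewer module, which is the trade-off the comment is describing.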


Reminds me of something I got really invested in about a decade ago which is Dave Ackley's work on "robust first computing". The example at the end of the sorting algorithm that can reconstitute itself when you blow half of the machine up is still amazing to me.

https://youtu.be/lbgzXndaNKk


Not possible in reality. You'll inevitably have single points of failure in the control system of any spacecraft, even with redundancy, rerouting, or tech that is robust to individual gate failures (erasure coding, holographic storage, neural networks etc).


One massive FPGA? In the future: nanobots that can reassemble to a better design. Maybe you don’t even need to complete the design, just ship the nanobots and do a 0-day patch when they reach their target.


Or they will deconstruct the ship in flight. A new version of rapid unscheduled disassembly (RUD), normally reserved for crashes and flight failures.


They should make a movie.


I wonder how FPGAs handle cosmic radiation.


The film "It's quieter in the twilight" might be of interest to folks on this thread. I saw it on Amazon Prime, but I think it is available on quite a few platforms, according to https://www.itsquieterfilm.com/where-to-watch


This appears to be a 2022 documentary film featuring interviews with the scientists operating the Voyager program these days. Reviewers seem as impressed by the ordinariness of their office environment as they are by the quality of the mission and of the scientists themselves.


I'm nervous asking, but a little curious what kind of equipment someone would need to give it different instructions. Does it use 1970s encryption? How big would your radio need to be?


No encryption; security through expensivity. A major developed nation would be the only candidate with enough resources to build something to communicate with Voyager. But to what end?

Voyagers are sent commands using the DSN, which is in high demand: https://arstechnica.com/space/2023/08/nasas-artemis-i-missio...

They need big radios (the biggest!) because the signal strength coming back from the probes these days is very weak: only one-tenth of a billion-trillionth of a watt.


I thought you were being handwave-y about the signal being returned because that's an obscenely small amount of power, but you're right. Hardware that is probably 50 years old at this stage, in the cold of (interstellar) space, and we're still able to talk to it despite the signal after its long journey back being on the order of 10^-22 W.


I still haven't figured out how people decode signals under the noise floor.


Essentially averaging. To put it simply, noise is random while the signal is not. So if you average two samples, the noise reduces by sqrt(2) but the signal remains. Keep doing that and the signal appears out of the noise.

GPS operates at a negative SNR. That also uses code division multiple access to allow multiple transmitters to operate on the same frequency and the signals do not interfere.
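A minimal sketch of the averaging idea (the signal level, noise scale, and sample count are all made-up numbers):

```python
import random
import statistics

random.seed(42)

SIGNAL = 1.0      # the constant value we are trying to recover
NOISE_STD = 5.0   # noise 5x stronger than the signal, i.e. a negative SNR

def noisy_sample():
    return SIGNAL + random.gauss(0.0, NOISE_STD)

# A single sample is useless: it is dominated by noise.
one = noisy_sample()

# Averaging N samples shrinks the noise by sqrt(N) while the signal stays put.
N = 10_000
estimate = statistics.fmean(noisy_sample() for _ in range(N))

print(f"single sample: {one:.2f}")
print(f"mean of {N}:   {estimate:.3f}")  # close to 1.0
```

With 10,000 samples the noise on the mean drops by a factor of 100, which is why a steady carrier can be pulled well out of the noise floor given enough integration time.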


Look up the Viterbi algorithm.

Imagine you're watching a trail of animal footprints in the snow, but the snow has covered parts of the trail (noise), so some prints are unclear. You're trying to figure out exactly which path the animal took. The Viterbi algorithm is like a detective method for doing just that, but instead of animal tracks, it's used for decoding messages or signals that got partially scrambled during transmission.
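The footprints analogy can be made concrete with a toy decoder. All the probabilities below are invented illustration numbers, not anything from a real receiver:

```python
# Toy Viterbi decode: recover the most likely hidden state sequence
# ("which way the animal actually walked") from noisy observations.

states = ("left", "right")
start_p = {"left": 0.5, "right": 0.5}
# The walker tends to keep going the same way:
trans_p = {"left":  {"left": 0.8, "right": 0.2},
           "right": {"left": 0.2, "right": 0.8}}
# Each footprint is read correctly 70% of the time (30% "snowed over"):
emit_p = {"left":  {"L": 0.7, "R": 0.3},
          "right": {"L": 0.3, "R": 0.7}}

def viterbi(obs):
    # V[t][s] = probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            prev, p = max(((ps, V[-1][ps] * trans_p[ps][s]) for ps in states),
                          key=lambda x: x[1])
            col[s] = p * emit_p[s][o]
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    # Walk backwards along the stored pointers to recover the best path.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# One garbled print ("R" in the middle) is smoothed out because the
# transition model says direction changes are unlikely:
print(viterbi(["L", "L", "R", "L", "L"]))
```

The same structure (likely state transitions plus noisy observations) is what makes Viterbi decoding of convolutional codes work on weak deep-space signals.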


This is fascinating. It sounds like Viterbi builds probabilities from known data, is that right? Is it essentially looking at the faint signal now, comparing it to data from when the signal was stronger, and extrapolating?


That’s how I understand it. I think it uses knowledge of the state changes, what they should look like, and selects the most likely one from a table of options. Based on what it knows about the signal, it guesses whether it’s more likely to be a or b.

I use it with ham radio, where the software I use sends signals well under the noise floor (-23 dB) that can still get across call signs, signal reports, and maybe a thank-you. The naked ear would not hear a thing at the low end of received signal.


I need a Jupyter notebook example!


A lot of the signal processing/decoding gear is cryogenic. Maybe everything up to the ADC?


Applied radio astronomy with a transmit component.


> Does it use 1970s encryption?

That would actually be fine, because you can do encryption in software. So you wouldn't need to keep any physical artifacts from the 1970s around.


I think OP's concern was more around that it's likely easy to break any encryption that was used in the 1970s so someone could send the spacecraft malicious commands.

That said, only America, China, Europe, India, Japan and Russia have deep space networks, and none of them have anything to gain from investing the resources into trying to harm Voyager. Even with weak encryption it's likely difficult to collect enough samples to break it, and even once it's broken you still need to figure out a valid set of commands to send to achieve your goal.


> it's likely easy to break any encryption that was used in the 1970s

I wouldn't jump to that conclusion. Some things encrypted now are intended to be secure for 50 years.


We have much better chances of designing a cryptography system today that can survive 50 years of math, and with computers a billion times faster we can make the encryption O((1.something)^billion) times harder to brute force. Nothing in the 70s was capable of resisting today's attacks.

You mention a one-time pad in another comment. In theory they could have sacrificed a bunch of tape space to store a pad for commands.


Encryption in the 70s resists today's attacks just fine. DES was weakened on purpose (short key length, but the idea for 3DES dates from 1978). In the asymmetric realm you still can't break a full length RSA key.

Maybe it's not fully contemporaneous with Voyager but the claim that 70s encryption is insufficient doesn't stand up to scrutiny


It's not the full length RSA key that's the worry, though, it's the structure around it that Bleichenbacher finds clever ways of poking holes in.

I don't think Voyager would be very vulnerable to an adaptive chosen ciphertext attack regardless. But you don't need the 70s maths to be wrong, you only need the engineering to have a flaw.


Is today's engineering different? One could argue that we have 50 years more of experience with the engineering (though the components have changed too, requiring newish engineering), and I would guess that we invest more resources in such things these days.


What was "full length" at the time, even if they got RSA implemented immediately after the paper was announced?

The RSA paper recommends 200-digit keys, which is somewhat under 700 bits. Would even that size run acceptably on an 8-kiloword, 0.1 MIPS machine? (I couldn't quickly find an actual opcode listing, so I can't say exactly how fast it multiplies, but it's probably not great.)
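A back-of-the-envelope estimate suggests "slow but not impossible". Every constant below is an assumption picked for illustration (a 16-bit word, a software multiply costing ~20 instructions, schoolbook arithmetic), not a measured figure for this hardware:

```python
# Rough cost of one 700-bit RSA modular exponentiation on a slow machine.

KEY_BITS = 700
WORD_BITS = 16   # assumed machine word size
MIPS = 0.1       # assumed instruction rate

words = -(-KEY_BITS // WORD_BITS)          # ceil(700/16) = 44 machine words
modmults = int(1.5 * KEY_BITS)             # square-and-multiply: ~1.5 mults/bit
word_mults_per_modmult = 2 * words**2      # schoolbook multiply + reduction
instr_per_word_mult = 20                   # assumed software multiply cost

total_instr = modmults * word_mults_per_modmult * instr_per_word_mult
seconds = total_instr / (MIPS * 1_000_000)

print(f"{total_instr:,} instructions ≈ {seconds / 60:.0f} minutes per operation")
```

Under these assumptions a single private-key operation lands in the tens-of-minutes range, and the working set (a handful of 44-word bignums) would fit in 8 kilowords. Whether that counts as "acceptable" for a command uplink is another question.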


Why would you use public key encryption here in the first place? Just use a symmetric cipher.


What's a symmetric cipher from the 70s that we can't brute force today?

(And I'm not counting "the idea" of 3DES when even that was after voyager already launched and it wasn't properly set up until 1981. Also 3DES has enough weaknesses that I wouldn't trust it to be uncrackable today.)
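The scale of the problem for a 56-bit key is easy to put numbers on. The keys-per-second rate below is an assumed figure for a modern dedicated cracking rig; real hardware varies widely:

```python
# Rough scale of a DES (56-bit key) brute force.

KEYSPACE = 2**56
KEYS_PER_SEC = 1e12   # assumed rate for dedicated modern hardware

worst_case_days = KEYSPACE / KEYS_PER_SEC / 86_400
print(f"{KEYSPACE:,} keys, worst case ≈ {worst_case_days:.2f} days")
```

At that assumed rate the entire keyspace falls in under a day, which is why a 56-bit key offers no meaningful protection today regardless of how sound the cipher's design was.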


Theoretically but probably not practically.

It’s doubtful that there’s enough processing or memory headroom for any decent encryption.

You’d have to add not only decryption routines but also a key (even 128 bits is a lot!), and each is another point of failure.

As we know, one of the memory modules is already corrupted, so if encryption had been implemented, chances are Voyager would be a brick by now.


> It’s doubtful that there’s enough processing or memory headroom for any decent encryption.

Maybe they use a one-time pad. :) I'm only half joking (imagining the person who has spent the last 40+ years on Voyager with a book). Why not? How much memory and CPU would be required?
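As a sketch of the costs: the CPU side of a one-time pad is a single XOR per byte, and the memory side is one pad byte for every command byte ever sent. The command string and traffic estimate below are invented for illustration:

```python
import secrets

def otp(data: bytes, pad: bytes) -> bytes:
    """XOR data against a never-reused pad; XOR is its own inverse."""
    assert len(pad) >= len(data), "pad must never be reused or run out"
    return bytes(d ^ p for d, p in zip(data, pad))

command = b"switch to backup memory bank"      # hypothetical uplink command
pad = secrets.token_bytes(len(command))        # generated & stored pre-launch

ciphertext = otp(command, pad)
assert otp(ciphertext, pad) == command         # decrypting recovers the command

# Made-up traffic estimate: 1 KB of commands per week for 50 years.
pad_bytes_needed = 1024 * 52 * 50
print(f"pad storage for the whole mission: ~{pad_bytes_needed / 1e6:.1f} MB")
```

A few megabytes of pad is nothing on the ground but enormous next to the spacecraft's kilowords of RAM, so the pad would have had to live on the tape recorder, with all the wear and failure modes that implies.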


OP might be concerned it's bogus, untested crypto from the bleeding-edge 70s, not the tested crypto from the 40s to 60s that we use safely on everything today.


> The Flight Data Subsystem was an innovation in computing when it was developed five decades ago. It was the first computer on a spacecraft to use volatile memory. Most of NASA's missions operate with redundancy, so each Voyager spacecraft launched with two FDS computers. But the backup FDS on Voyager 1 failed in 1982.

I wonder why they chose volatile memory: Were there performance demands that required it? How much did non-volatile memory cost back then?


The Voyager program was on a somewhat tight budget, so I would imagine that had an impact. Having volatile memory is arguably a really good thing since it's allowed them to patch the FSW on the vehicle in space a bunch of times, which is a pretty common thing for spacecraft nowadays but would have been unheard of when it launched.


I'm not sure that's a capability specific to volatile memory, merely R/W memory. R/W memory already existed on prior NASA computers in the form of magnetic core RAM (as opposed to magnetic rope memory which served as ROM).


I'm understanding 'volatile memory' to mean memory that needs power for data integrity - if you lose power, memory is wiped like the memory in most desktops/laptops.


This is just speculation, but: a shift in software engineering philosophy. The software of early probes was viewed as a fixed part of the flight avionics and command sequencer. It was programmed well in advance to allow physical manufacture of the ROMs, tested as a whole unit, and became fixed quite early on in the project. Like a simple microcontroller, basically.

By the late 70s the modern understanding of software development was taking hold. They wanted to be able to download arbitrary programs, as needed, in flight. And so instead of having it mostly in ROM, the Voyager computers are like any other general purpose computer and all their main memory is RAM.

Updating a live system in RAM isn't as risky as it might sound when you consider there are three different computers on board, each dually redundant, serving a supervisor role for the other two.


It doesn't need to be volatile (require power for data integrity) to be read/write. See the other subthread ...


Until the spares fail, in which case it gets fun. That's the situation here: the spare already failed.


Seems like the smartest thing to do would be to take everything that we have learned from all of this and launch more voyagers.


The reason the Voyagers went so fast and so far was the particular arrangement of planets in the solar system at launch time. A number of maneuvers and gravitational slingshots were used. It may not be possible to do this sort of thing again for a number of years, and without it you'd need a much bigger rocket than it is possible to build.


The Grand Tour alignment was less about being able to visit the outer planets, and more about being able to do it cheaply. One spacecraft was able to visit four planets! That's rare! If you're willing to launch four different probes to visit those four planets, it's not trivial to do, but it's not hard. (Jupiter's position basically sets your launch windows.) And just hurling something out into the black as fast as possible pretty much is trivial.

If you can round up the money for the spacecraft.


Calling something trivial if you have the gdp of a small nation on hand is pretty much the opposite of what trivial means.


Voyager 1 only hit Jupiter and Saturn, and it barely got a speed boost from Saturn.

We can boost from Jupiter into deep space whenever we want. Pioneer 10 reached a similar speed just fine, but we don't talk about it a lot because it broke.



There is a new (2020) "sun diver solar sail" proposal which could get as far as Voyager 1 is now, in two to three years.

Article: https://www.universetoday.com/148241/want-the-fastest-solar-...

Paper: https://arxiv.org/pdf/2009.12659.pdf

Interview with paper’s author: https://www.youtube.com/watch?v=W-E83lC-eN0


The interview is not with the paper's author.


Ah yes. You are correct. I believe the interviewee is the one who led the project which brought all of these individual researchers together.



At the time those articles were published, the engineers were analyzing a memory dump received from the spacecraft.

The update in this new article is that they have determined that a memory chip has gone bad.


This is frankly awesome news: we now know Voyager 1 just needs a Windows Update(tm) to keep on flying. Of course, "just" is putting it mildly, to say the least, but I don't want to think how much worse the diagnosis could have been.



Seeing that picture of voyager, I wondered why we aren't sending out more of these in every direction.


The voyager missions took advantage of a rare planetary alignment for massive gravity assists. Their primary mission was investigating Jupiter, Saturn, their moons, a bit of Uranus and Neptune.

There’s not much to investigate going out in every direction; the little science they’re still doing is tracking the size of the sun’s influence, and that will be about the same in any direction.

Another “grand tour” of the solar system during an alignment with modern sensors and cameras would be awesome though


It’s remarkable both that the alignment managed to fall in a period in which we had the technology to take advantage of it and that someone recognized the opportunity and acted on it. A slightly different course of history could’ve prevented the Voyager missions from happening, causing the opportunity to be missed.

It’s so hard to imagine. As a 90s kid the photos the Voyagers sent back were prominent in my childhood.


Imagine how good the cameras will be in the 2140s the next time that alignment happens.


The fact that no one here mentions 'aliens', given the headline, disappoints me.


Give it time, my friend. Give it time. Now, if the headline is that Voyager 1 has been mysteriously upgraded and is transmitting a 5 TW signal and appears to be heading back, then please do the 'aliens' thing.


Definitely it wasn't aliens. The main camera only started to show a mysterious countdown sequence.



