Hello (Distributed) World: Designing Software that Spreads P2P

jluxenberg · on Aug 12, 2010

They're using SHA1 to sign / identify programs. Finding and exploiting a hash collision would be fairly straightforward and could have really bad consequences, since I presume I could publish my "rogue" modified program to peers fairly easily.

liuliu · on Aug 12, 2010

Just out of curiosity, has there been any new progress on finding SHA1 collisions? Can you provide some references so that I can dig in?

woodall · on Aug 13, 2010

Here is a question posed to the Stack Overflow audience, though it's from 2009[1]. The linked article says that collisions can now be found in about 2^52 operations, as opposed to the previous 2^69[2].

The National Institute of Standards and Technology has urged Federal agencies to stop using SHA1 digital signatures by the end of 2010, and instead start transitioning to the SHA2 family[3].

[1] http://stackoverflow.com/questions/1147830/understanding-sha...

[2] http://www.schneier.com/blog/archives/2005/02/sha1_broken.ht...

[3] http://csrc.nist.gov/groups/ST/hash/statement.html
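
To put the two figures quoted above in perspective, here is the arithmetic as a short Python sketch. The 2^80 figure is the generic birthday bound for any 160-bit digest, not something taken from the linked articles, and the variable names are only illustrative.

```python
# Work factors from the comment above, expressed as exponents of 2.
DIGEST_BITS = 160                     # SHA-1 output size in bits
birthday_bound = DIGEST_BITS // 2     # generic collision search: about 2^80 hashes
previous_estimate = 69                # "the previous 2^69"
newer_estimate = 52                   # "about 2^52 operations"

# How many times cheaper the newer estimate is than the previous one.
speedup = 2 ** (previous_estimate - newer_estimate)

# How many times cheaper it is than generic collision search.
vs_generic = 2 ** (birthday_bound - newer_estimate)

print(speedup)      # 131072
print(vs_generic)   # 268435456
```

In other words, the newer estimate is a factor of 2^17 below the previous one and 2^28 below the generic bound.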

makanikai · on Aug 13, 2010

From the footnotes: "In addition, we will be transitioning to SHA-256 in the future."

adamb · on Aug 13, 2010

Our early decision to use SHA-1 was a balance between limitations of mobile hardware and the security landscape of the day.

Much of Skynet's core technology is actually designed for mobile platforms. Skynet essentially thinks of a desktop computer as a fancy phone with a different UI toolkit and without a cellular modem.

While switching to SHA-2 is on our to-do list, it's not as high as nailing a stellar experience for our users. Should SHA-1 erode more quickly than expected, we'll bump up the priority of that transition and pivot the network before it's a real problem.
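
As a rough illustration of how a hash transition can be kept cheap, one common approach is to tag each program identifier with the algorithm that produced it. This is a sketch only, not a description of Skynet's actual scheme: `program_id`, `verify`, and the `algo:hexdigest` format are hypothetical.

```python
import hashlib

def program_id(code: bytes, algo: str = "sha1") -> str:
    """Return an identifier of the form '<algo>:<hex digest>' (hypothetical format)."""
    return f"{algo}:{hashlib.new(algo, code).hexdigest()}"

def verify(code: bytes, expected_id: str) -> bool:
    """Recompute the digest with the algorithm named in the id and compare."""
    algo, _, _ = expected_id.partition(":")
    return program_id(code, algo) == expected_id

code = b"print('hello, distributed world')"
old_id = program_id(code, "sha1")      # identifier under the current hash
new_id = program_id(code, "sha256")    # identifier after a transition

# The same bytes verify under either identifier; modified bytes do not.
assert verify(code, old_id) and verify(code, new_id)
assert not verify(code + b"# tampered", new_id)
```

With the algorithm carried in the identifier, peers could accept both forms during a migration and later stop honoring the older one.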

exit · on Aug 12, 2010

it's a little annoying hearing about but not seeing this "spin" language.

tophercyll · on Aug 12, 2010

Here's some basic syntax.

  values = '[1, 2, 3]
  incremented = values.map({ n | n + 1 })
  summary = incremented.join(", ")

  summary.starts-with?("1").then({
    fail("Increment failed.")
  })

tophercyll · on Aug 12, 2010

What sort of examples would you be most interested in seeing?

cmars232 · on Aug 12, 2010

How about a chatroom?

tophercyll · on Aug 12, 2010

Ignoring user interface and the mechanics of peer discovery...

  user-interface, discovery-system |
  # This process is spawned with two pids as arguments
  
  # If either pid halts, I halt too.
  user-interface.link
  discovery-system.link
  
  user-interface.subscribe(when(
    text-entered: { message |
      # I entered a message in the UI, announce it to everyone subscribed to me
      my-actor.announce('exclaimed, message)
    }
  ))
  
  discovery-system.subscribe(when(
    peer-discovered: { name, peer |
      # Discovery system located another user in the chat room
  
      # Send message to user-interface showing a Growl style notification.
      user-interface << ('notify, "{name} arrived.")
  
      peer.subscribe(when(
        # This guy went away - "halted" is announced when a process stops
        halted: { result | user-interface << ('notify, "{name} left.") }
  
        # This guy said something
        exclaimed: { message | user-interface << ('text-append, "{name}: {message}") }
      ))
    }
  ))
  
  main.loop

akkartik · on Aug 13, 2010

Are the names discovery-system, user-interface, my-actor important?

tophercyll · on Aug 13, 2010

Every process has "my-actor" defined in its global environment. It provides methods related to basic actor functionality. In this case I'm calling the "announce" method to say something to all of the process' subscribers.

user-interface and discovery-system are two hypothetical processes whose pids (process identifiers) are passed in to this process as arguments. user-interface would be responsible for creating a window with a text display area and a text entry box. discovery-system would use our remote service discovery model to find others chatting in the same "room."
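
For readers more comfortable with Python, the subscribe/announce relationship described above can be approximated as follows. This is only an analogy: the `Actor` class and its method names are made up for illustration and say nothing about how Spin implements processes.

```python
from collections import defaultdict

class Actor:
    """Toy stand-in for a process: others subscribe, and it announces events."""

    def __init__(self, name):
        self.name = name
        self._handlers = defaultdict(list)   # event name -> list of handler callables

    def subscribe(self, **handlers):
        # Rough analogue of subscribe(when(event: { args | ... })).
        for event, handler in handlers.items():
            self._handlers[event].append(handler)

    def announce(self, event, *args):
        # Deliver the event to every subscriber that registered a handler for it.
        for handler in self._handlers[event]:
            handler(*args)

log = []
peer = Actor("alice")
peer.subscribe(
    exclaimed=lambda message: log.append(f"{peer.name}: {message}"),
    halted=lambda result: log.append(f"{peer.name} left."),
)
peer.announce("exclaimed", "hello")
peer.announce("halted", None)
print(log)  # ['alice: hello', 'alice left.']
```

Unlike real actors, everything here runs synchronously in one thread; the point is only the shape of the subscriber list.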

exit · on Aug 13, 2010

what does ' before a symbol do?

cracki · on Aug 13, 2010

in scheme, the ' is the syntax for a quote. i'm guessing they use it here as syntax for symbols, but in some of the example code, i see it in front of list-like things too...

adamb · on Aug 13, 2010

Our use of ' was inspired by lisp's.

Immutability is such an important part of Spin's messaging model that we wanted a consistent way to declare an immutable construction.

  # An immutable string (or symbol)
  'foo

  # An immutable list
  '[ 1, 'foo, '[] ]

  # An immutable map
  '{ foo: 1, y: 'two, z: '[ 'three ] }

Slightly more advanced...

  # Also an immutable string
  "foo with spaces" 

  # An immutable pair
  "bar" -> 2

  # Syntactic sugar for that same pair
  bar: 2 

  # Syntactic sugar for "baz" -> baz
  ~baz

adamb · on Aug 13, 2010

While we're talking about syntax, "\n" is an alternative to ",".

This helps multi-line expressions read more naturally.

  '{
    foo: 1
    baz: 'bar
    chunky: 'bacon
  }

devicenull · on Aug 13, 2010

So now we can't patch a security hole in a library without "recompiling" every application it's linked to? That seems like a huge step backwards to me.

tophercyll · on Aug 13, 2010

Remember, it's basically only Linux distros that have the capability to upgrade a third-party library like that (ironically, distro maintainers actually have the source code required to recompile everything if they wanted!).

Mac and Windows applications ship bundled versions of third-party libraries all the time. Managing a complex web of name+version based dependencies is much harder in a decentralized software ecosystem, so bundling starts to look attractive.

For our purposes, the benefit of a system where software is more reliable, predictable, and accountable is greater than the cost of asking developers to recompile in unusual circumstances.

adamb · on Aug 13, 2010

Security holes present an interesting challenge. Since we allow authors to blacklist their code at the uuid level, it's possible to issue a network-wide advisory that revokes execution rights for that specific uuid.

This can instantly close the hole until a patch is released. It keeps users safe and gives application authors time to test their application against the new library before re-publishing.

In many cases, application authors are the only people that are qualified to test interactions between their applications and the updated library.
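
On a single node, the revoke-then-patch flow described above might look roughly like this. The names (`revoked`, `publish_advisory`, `may_execute`) are hypothetical, and the uuids are placeholder strings rather than real identifiers.

```python
# Code uuids this node refuses to run (hypothetical local state).
revoked = set()

def publish_advisory(uuid: str) -> None:
    """Record an advisory revoking execution rights for one specific uuid."""
    revoked.add(uuid)

def may_execute(uuid: str) -> bool:
    """Check the advisory list before running any piece of code."""
    return uuid not in revoked

vulnerable = "lib-v1-uuid"   # placeholder for the library build with the hole
patched = "lib-v2-uuid"      # placeholder for the re-published, fixed build

publish_advisory(vulnerable)
assert not may_execute(vulnerable)   # the hole is closed as soon as the advisory lands
assert may_execute(patched)          # the patched build has a different uuid
```

Because revocation is keyed on the exact uuid, the fixed build needs no special handling once it is published.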

adamb · on Aug 13, 2010

The distributed programmer in me wants to point out that having applications automatically use the latest version of a library is a scary proposition.

During a distributed operation, participants can arrive at many different points in time. This means that applications using the newly-patched library will likely be interacting with applications using the unpatched library. Whenever multiple versions of anything are interacting with each other, things can get complicated.

In light of this, we opted to keep things simple and predictable for ourselves (and others). Since applications always run against exactly what you say they should, you're free to keep running forward, without having to worry about tripping over past decisions.
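
A small sketch of the difference between resolving a library by "latest version" and by an exact, pinned identifier. The registry layout and function names are assumptions made for illustration, not a description of how Skynet stores code.

```python
import hashlib

registry = {}         # digest -> library bytes
latest_by_name = {}   # name -> digest of the most recently published build

def publish(name: str, code: bytes) -> str:
    """Store a build under its digest and mark it as the newest for its name."""
    digest = hashlib.sha256(code).hexdigest()
    registry[digest] = code
    latest_by_name[name] = digest
    return digest

def resolve_pinned(digest: str) -> bytes:
    """Always returns exactly the bytes the application was built against."""
    return registry[digest]

def resolve_latest(name: str) -> bytes:
    """Returns whatever was published last, which can change under the application."""
    return registry[latest_by_name[name]]

v1 = publish("codec", b"def encode(x): return x")
app_pin = v1                                          # the build the author tested with
publish("codec", b"def encode(x): return x[::-1]")    # later, incompatible release

assert resolve_pinned(app_pin) == b"def encode(x): return x"
assert resolve_latest("codec") != resolve_pinned(app_pin)
```

Two peers holding the same pin are guaranteed to run identical library code, whenever they joined the operation.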

adamb · on Aug 13, 2010

A step backwards compared to what? Are you thinking about a specific alternative approach?

jasonwatkinspdx · on Aug 12, 2010

Great stuff! Show more!

wozer · on Aug 12, 2010

Interesting. But what is their approach to security? Capability-based? Or simply not connecting to computers you don't trust?

adamb · on Aug 12, 2010

Generally speaking, we don't connect directly to unknown computers, but there are other reasons for that.

Security is addressed at a few levels (here are 6 of them).

1. Access to native resources requires specific permission.

2. All processes (including those with access to native resources) can only be addressed by their 288-bit process identifier (128 bits of which are random). The only identifiers known to a process are its own, those of its children, and ones explicitly given to it.

3. The Actor model means each process can independently decide which messages to reply to, which to ignore, and how long to wait for a response (if at all).

4. Each node has a unique RSA key-pair. The 160-bit fingerprint of the public key is the non-random part of every process identifier. This allows nodes to verify the remote processes they communicate with. (And if necessary, encrypt messages sent to them.)

5. Hash-based distribution makes it easy to blacklist poorly-written or maliciously-crafted code, once it's been identified as such.

6. System services in Skynet are always kept current with live, on-the-fly updates.
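
Points 2 and 4 together imply a 160 + 128 = 288-bit identifier. A sketch of building one follows. Using SHA-1 for the fingerprint, placing the fingerprint before the random part, and the stand-in key bytes are all assumptions, since the list above only gives the bit widths.

```python
import hashlib
import secrets

FINGERPRINT_BYTES = 20   # 160 bits
RANDOM_BYTES = 16        # 128 bits

def make_pid(public_key_bytes: bytes) -> bytes:
    """Build a 288-bit process identifier: key fingerprint plus random bits."""
    fingerprint = hashlib.sha1(public_key_bytes).digest()   # assumed fingerprint hash
    return fingerprint + secrets.token_bytes(RANDOM_BYTES)

def same_node(pid_a: bytes, pid_b: bytes) -> bool:
    """Two pids belong to the same node if their fingerprint parts match."""
    return pid_a[:FINGERPRINT_BYTES] == pid_b[:FINGERPRINT_BYTES]

node_key = b"stand-in for an encoded RSA public key"   # placeholder, not a real key
pid1 = make_pid(node_key)
pid2 = make_pid(node_key)

assert len(pid1) * 8 == 288
assert same_node(pid1, pid2)   # same node, so the non-random part is shared
```

The random 128 bits are what make an identifier impractical to guess; the fingerprint part is what lets a peer tie the identifier back to a node's public key.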

cmars232 · on Aug 13, 2010

Is this going to be open-source?

adamb · on Aug 13, 2010

Without making any promises, it's hard to imagine any programming system becoming successful without being largely open source.

mattknox · on Aug 12, 2010

I've been waiting for this release for a looooong time. Bravo!

atomical · on Aug 12, 2010

What release? It's another blog post.

atomical · on Aug 12, 2010

Is there any protection against software piracy?

tophercyll · on Aug 13, 2010

We don't build any in, but the same strategies (e.g., requiring a license key) that work for regular applications like Microsoft Word also work for fluid applications. The bits can be copied easily, but the same is true for traditional software.

Crazy strategies, like kernel extensions or CDs that physically can't be copied and must be in the drive, obviously won't work, but that's probably a good thing. =)