The Rune Programming Language (github.com/google)
248 points by rurban on Nov 27, 2022 | 186 comments



> The only close competitor is C++, where the author uses the little-known MemoryPool class from the <memory> library.

I was like I know C++ and I’ve never heard of MemoryPool.

Turns out that MemoryPool doesn’t exist in <memory>. In the source code they’re talking about, it’s an alias for std::pmr::monotonic_buffer_resource.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...


Why would the struct-of-arrays layout be beneficial in a binary tree where every single node is visited, and the left + right arrays constitute all of the data in the tree? The previous example they gave was touching 75% less memory. In the binary tree it's the same amount of memory. Is the order it's touched better somehow?


Array of struct tends to only be better when there's high temporal locality of using nearly all the fields.

This code, as written, makes left node accesses temporally correlated with other left node accesses (and by symmetry must do the same for right nodes), allowing the caches at all levels to have a higher hit rate. Whether that higher rate is actually attained depends on the architecture, compiler, other running code, linker, and all sorts of other garbage, but it has a lot higher chance of better behavior anyway.
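To make the layout concrete, here is a minimal Python sketch of a binary tree stored struct-of-arrays style: all left-child indices live in one array and all right-child indices in another. The names (`TreeSoA`, `new_node`) and the use of index 0 as a null sentinel are assumptions of this sketch, not Rune's or DataDraw's API, and Python itself will not show the cache effects; it only illustrates how the data is arranged.

```python
from array import array

NULL = 0  # index 0 is reserved to mean "no child" (an assumption of this sketch)


class TreeSoA:
    """Binary tree stored as two parallel index arrays instead of node objects."""

    def __init__(self):
        # Slot 0 is a dummy entry so that 0 can serve as the null index.
        self.left = array("I", [NULL])
        self.right = array("I", [NULL])

    def new_node(self, left=NULL, right=NULL):
        self.left.append(left)
        self.right.append(right)
        return len(self.left) - 1  # a node is identified by its index


def build(tree, depth):
    """Build a complete binary tree bottom-up and return the root index."""
    if depth == 0:
        return tree.new_node()
    return tree.new_node(build(tree, depth - 1), build(tree, depth - 1))


def count(tree, node):
    """Count nodes by following indices through both arrays."""
    if node == NULL:
        return 0
    return 1 + count(tree, tree.left[node]) + count(tree, tree.right[node])


tree = TreeSoA()
root = build(tree, 4)
node_count = count(tree, root)
```

An array-of-structs version would instead keep one list of `(left, right)` pairs, so each node's two links sit next to each other in memory.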


For anyone interested, Rune's SOA is done entirely by DataDraw. The API for creating relations is passed almost verbatim to datadraw, which will generate a bunch of C code for interacting with the DD library. So Rune didn't really write any of this, it just bundled DD with a language runtime, and used a more permissive license despite DataDraw being LGPL... ! It was taking too long to set up Rune on an Ubuntu VM, so I just replicated Rune's version of the Benchmarks Game binary-tree benchmark using DataDraw directly.

DataDraw runs the depth=21 case (single threaded, -O3) in about 4.6 seconds on my M1. It has the built-in advantage of reusing the table allocations as the database is global and implicit, the indexes being u32 sized. The existing C++g++#7/MemoryPool and Rust#5/bumpalo versions (both array-of-structs) are discarding their allocations on every iteration. Running Rust#5/bumpalo with RAYON_NUM_THREADS=1 runs in about 2.7 seconds. So in addition to the implementation being someone else's and possibly misused LGPL, the claims re the binary tree benchmark are not really made out.

Nevertheless you can very closely approximate in Rust what DataDraw does, without macros or anything else, by creating a single global struct of `slab::Slab`s, inserting a dummy first element, converting the usize keys to NonZeroU32 and using this as your SOA. If you do that, then you get roughly the same perf except that DataDraw isn't doing bounds checks. So the Rust version can do it in about 5.5 seconds vs 4.6. Obviously this can't be parallelised as you're passing a single &mut SOA around or using a thread-local.

I also compared the perf for a `slotmap::SlotMap`. It was about 7 seconds, but it solves the ABA problem with the generational indices, so that may be worth it.

Finally I compared the perf for a global Slab<Node>, i.e. global AOS style, to avoid clouding the AOS results with bumpalo's insane performance. It ran quicker than the global SOA style, in 4.6 seconds, because the overhead of managing two arrays, their separate allocations and their bounds checks was too much for the memory access pattern to do anything, if there indeed was any advantage to it.


Some memory-touching orders are better than others, because CPUs will prefetch nearby memory into CPU cache during memory accesses.


First example seems weak. Any language can use separate data type for secrets and corresponding operator for constant-time comparison.

Second example is very neat. Actually I thought about using sqlite with tmpfs database for application state. That could be useful for some kinds of applications. That said, using functional API over traditional data structures seems like a traditional and widely accepted approach. Interesting to see where that experiment will go.


But it's not just operator overloading, there seems to be a monad-like "secret" that makes calls to func(a, ...) -> b with secret(a) return secret(b).

The example doesn't spell this out explicitly, but I think those are the semantics of secret.
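A rough Python sketch of those semantics as I read them: a decorator that makes any function return a secret whenever one of its arguments is secret. `Secret`, `lift`, and `compute_mac` are hypothetical names for illustration only; Rune does this in the compiler, and nothing here reflects its actual implementation.

```python
import functools
import hashlib
import hmac


class Secret:
    """Hypothetical wrapper marking a value as sensitive."""

    def __init__(self, value):
        self._value = value

    def __eq__(self, other):
        # Comparisons involving a secret go through a timing-safe routine.
        other_value = other._value if isinstance(other, Secret) else other
        return hmac.compare_digest(self._value, other_value)

    def __repr__(self):
        return "Secret(<hidden>)"


def lift(func):
    """Make func return a Secret whenever any positional argument is a Secret."""

    @functools.wraps(func)
    def wrapper(*args):
        tainted = any(isinstance(a, Secret) for a in args)
        raw = [a._value if isinstance(a, Secret) else a for a in args]
        result = func(*raw)
        return Secret(result) if tainted else result

    return wrapper


@lift
def compute_mac(key: bytes, message: bytes) -> bytes:
    # Written with no knowledge of Secret at all.
    return hmac.new(key, message, hashlib.sha256).digest()


public_mac = compute_mac(b"key", b"message")          # plain bytes
secret_mac = compute_mac(Secret(b"key"), b"message")  # wrapped in Secret
```

The function body stays unaware of secrecy; only the calling context decides whether the result is wrapped.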


What makes that different from creating a Secret class that's just a container class and defines its own methods? I just don't see what it is that requires a new language to accomplish and couldn't just be implemented as a library in many of the pre-existing languages (including popular ones like C++).


It seems like we're focusing way too much on just the very first example, instead of the feature list as a whole?

There are plenty of other interesting features, such as disallowing conditional branching on the contents of secrets, all the way down to enforcing Spectre and Meltdown mitigations around secrets without necessarily globally taking that performance hit on sensitive and non-sensitive data alike.


I think I'm just not creative enough to think of features that need to be implemented by the language and not easily done at a library level. You could prevent access to the secret's contents by creating the appropriate class, but I suppose it would be harder or impossible to allow the user to peek into the contents while disallowing their use for branching specifically. Though, I'm not sure where that can be useful in practice. Also, I'm not sure how that language feature can mitigate Meltdown and Spectre. How can it do that?


I'm no expert here, but I can think of a couple of possible arguments. One is that it's not always possible for libraries to ensure that programmers use them correctly. This may be particularly true for systems programming languages that often permit low-level features such as pointers. The second is that the compiler can do things that a library can't.

Spectre/Meltdown mitigation is a great example of the latter. The compiler can be careful to emit instructions that don't permit speculative branching, or disable other optimizations that might expose a security risk. Libraries don't typically get the kind of access that would permit them to do that, and it's not clear to me that a language that does permit such access would be appropriate for safety- or security-critical applications.

As for why if statements are a security risk, they give a tidy little example of that at the bottom of their overview page, under the heading, "Can you see the security flaw in this code?" I can't say I understand it terribly well. I personally take that as a sign that, while I can speculate about why they're doing things this way, and even be arrogant enough to do so in public on the Internet, I'm very, very, very far from being qualified to suggest that their basic approach is wrong. Chesterton's Fence sometimes strikes me as being overblown, but Chesterton's Safety Belt is no joke.


I implemented it as a runtime class in Python just to see, seems like you can get a lot of (this benefit) in other languages by doing the same: https://github.com/yasyf/python-secret-type


    x^2 means x squared, orphaning the XOR operator. Rune is for crypto, so math comes first.
Isn't XOR used very heavily in crypto? (As is exponentiation of course)


It looks like Rune still has a bitwise XOR operator: @

https://github.com/google/rune/blob/main/bootstrap/database/...

That innovation does seem like a potential footgun.


Here's my hot take: if you're designing a new programming language in the modern era, even a systems language, ignore the precedent of C and don't use &|^~ as bitwise operators. You can still have infix bitwise operators, but spell them out as bitand/bitor/bitxor/bitnot. Then you can just use &| for logical and/or, which are 1000x more common than bitwise and/or, and you can reclaim ^ for exponentiation as well. And don't forget to fix C's broken bitwise operator precedence while you're at it.


Interesting. I'd go the other way. Continue using single-character bitwise operators, but spell out "and" and "or" for the logical operators.

Rationale: in normal usage, short-circuiting logical operators are, in effect, a special kind of control flow statement, and control flow statements are typically spelled out. Bitwise operators are more unequivocally meant for calculation, and therefore perhaps more deserving of similar syntax to the arithmetic operators, despite their less frequent usage.

I think that this way of drawing the distinction might be particularly relevant in a language that disallows conditional branching - and, by extension, short-circuiting logical operations - on certain kinds of data the way Rune does.


This is exactly what Python did and it works out very well for the most part.
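A small Python illustration of that split, for anyone who hasn't looked at it closely. The `probe` helper is just a made-up function for this example that records which operands get evaluated.

```python
calls = []


def probe(name, value):
    """Record that an operand was evaluated, then return it."""
    calls.append(name)
    return value


# `and` is control flow: the right operand is skipped once the left is falsy.
logical = probe("left", False) and probe("right", True)
calls_after_logical = list(calls)

calls.clear()

# `&` is a plain calculation: both operands are always evaluated.
bitwise = probe("left", False) & probe("right", True)
calls_after_bitwise = list(calls)

# Python also binds `&` tighter than `==`, so this parses as (flags & 1) == 0.
# C parses the same text as flags & (1 == 0).
flags = 0b0110
low_bit_clear = flags & 1 == 0
```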


Or the Pascal way. "And"/"or" for logical operators on bools. "And"/"Or" for bitwise operators on integers

With a strong type system, it knows if the input is bool or integer


Hard disagree on single characters for logic and exponentiation. ASCII has too few special characters and you'd want to use those elsewhere, without creating ambiguity. Elixir got it right.


ASCII has tons of unused chars we could grab for coding.


Mm yes but BEL is not a character that can be typed on a keyboard with any amount of ease.


> ASCII has too few special characters

Unicode, OTOH, doesn’t have this problem.


True, but we don’t have Unicode keyboards.


Some languages use three characters for bitwise operators, like `>>>`, `!!!`, `&&&`. I think it is a good compromise.


Sheer insanity. Sensible languages use ** for exponentiation and ^ for xor.


OTP is simply pt XOR giant-key.

It would've been wiser to duplicate C's bit operators ( ~ ^ & | << >> ) than give the emperor some "new clothes."

Welcome to another "Not Invented Here" language dying to be special and full of surprises. It's a fail.


I know, right? The strongest theoretical crypto is a one-time pad, often presented in its XOR form in the first chapter of every crypto book.
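For reference, the XOR form fits in a few lines. This is a sketch in Python (where `^` is XOR and `**` is exponentiation); `otp_xor` is a name made up for the example.

```python
import secrets


def otp_xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of the same length; the same call encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("a one-time pad key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"attack at dawn"
key = secrets.token_bytes(len(plaintext))  # random key, to be used only once
ciphertext = otp_xor(plaintext, key)
recovered = otp_xor(ciphertext, key)       # XOR with the same key undoes itself
```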

But hey, if Google says math comes first...


Implicit nullability of all values, a very dubious design decision in a new language.


Given that they mention SQL more than other things, and stress SoA and memory intensive applications, I think this makes more sense than it would otherwise


SQL null is not like the null of programming languages, it behaves more like "unknown" than "blank". For instance, null != null. Arithmetic on null values is also well-defined so that 1 + null = null.
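This is easy to check with Python's built-in `sqlite3` module. Strictly speaking, comparing two NULLs yields NULL (unknown) rather than true or false, for both `=` and `<>`; only `IS NULL` gives a definite answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute(
    "SELECT NULL = NULL, NULL <> NULL, 1 + NULL, NULL IS NULL"
).fetchone()
conn.close()

# SQL NULL comes back as Python None; SQLite reports true as the integer 1.
null_equals_null, null_differs_from_null, one_plus_null, null_is_null = row
```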

I'm not convinced that implicit nullability is a good idea, either in SQL or in newly designed languages.

I don't see the connection between implicit nullability and SoA. If you have a link to some example, that would be interesting to read.


I'm not convinced either, just that it doesn't not make sense.

As for SoA, I'm thinking of two relatively new formats and applications that have default nullable types, are SoA, and are specifically targeting memory-intensive applications: Apache Arrow and DuckDB, both seeing some success and well-deserved hype.


Even if you want to allow nullability, you can make it both explicit and opt-in rather than implicit and opt-out.


Go on...


The subject has been discussed to death here on HN and in other places. If you're unfamiliar, this talk by Hoare (the inventor of implicit nullability) is as good a place to start as any:

https://www.infoq.com/presentations/Null-References-The-Bill...


This is because we didn't just kill Abel, we destroyed Abel, and this caused all of Abel's children to be recursively destroyed.

TT


The Destruction of Abel, Jeremiah 27:14.


Sigh. There's enough religion in programming languages already. Can we not add or xor actual religion?


Is there any reason the `secret` type can't be implemented in, say, Rust, or is it just Google getting really excited to reinvent the wheel again?


You can have a Secret type in Rust where the Eq, Add, etc. traits are overloaded with constant-time versions. I'm unclear if there are any semantics that operator overloading doesn't address.


The secret (heh) is that any function defined on non-secret arguments can automatically be applied to secret arguments as well, and just returns a secret value now - without having to think about mapping/monads/whatever.


I think that it's actually something like the opposite of that. Rune disallows conditional branching on secrets, which would imply that passing secrets into the subset of functions defined on non-secrets that branch control flow based on an argument's value would be a compiler error.
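A runtime approximation of that rule in Python, to show the shape of it. Rune rejects such code at compile time; this sketch can only raise when the branch is actually attempted, and `SecretInt` / `SecretBool` are hypothetical names, not anything from Rune. Python's integer operations are also not constant time, so this models only the "no branching" rule.

```python
class SecretBool:
    """Result of comparing secrets; it refuses to be used as a condition."""

    def __init__(self, value):
        self._value = bool(value)

    def __bool__(self):
        raise TypeError("branching on a secret value is not allowed")


def _unwrap(value):
    return value._value if isinstance(value, SecretInt) else value


class SecretInt:
    """Integer whose arithmetic stays secret and whose comparisons cannot drive control flow."""

    def __init__(self, value):
        self._value = int(value)

    def __add__(self, other):
        return SecretInt(self._value + _unwrap(other))

    def __eq__(self, other):
        return SecretBool(self._value == _unwrap(other))


def add_one(x):
    return x + 1  # no branching, so secrets pass straight through


def zero_to_one(x):
    if x == 0:  # branches on the argument's value
        return 1
    return x


def is_rejected(func, arg):
    """Return True if calling func(arg) trips the no-branching rule."""
    try:
        func(arg)
    except TypeError:
        return True
    return False
```

Here `add_one` accepts a `SecretInt` while `zero_to_one` is rejected, which matches the "subset of functions" reading above.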


Wouldn’t it make way more sense to write this in C and expose a typed API via language-specific packages? The functionality is cool, but nothing stops you from forgetting to use secret(string) instead of string.


Seems kind of odd that you’re arguing “just do it in C” and “nothing stops you from forgetting” in the same post.


Why is that odd? You could provide a typed API in other languages that leverages a suite of different side-channel proof algorithms. In order to provide such an API, you need a portable language compatible with the C calling convention for FFI support in other languages. The only language that meets such criteria is C.


You can't provide an API that gives the security features they want in C. The core feature here is a generic secret type. In C that's a void*, defeating the purpose of the typed, safe API.


Yes you can?

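    // lib.h
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>

    struct SecureString {
      uint8_t *content;
      size_t len;
    };

    // One possible constant-time comparison: visit every byte, never exit early.
    bool secure_str_eq(const struct SecureString *l, const struct SecureString *r) {
      if (l->len != r->len) return false;  // length is treated as public here
      uint8_t diff = 0;
      for (size_t i = 0; i < l->len; i++) diff |= l->content[i] ^ r->content[i];
      return diff == 0;
    }

    struct SecureString *new_secure_string(void) {
        struct SecureString *s = malloc(sizeof(struct SecureString));
        if (s != NULL) {  // initialize s
            s->content = NULL;
            s->len = 0;
        }
        return s;
    }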
And then in go, for instance, you would do something like this:

    package securestring // hypothetical package name

    // #include "lib.h"
    import "C"

    type SecureString struct {
        ptr *C.struct_SecureString
    }

    func NewSecureString() *SecureString {
        return &SecureString{
            ptr: C.new_secure_string(),
        }
    }

    func (s *SecureString) Eq(other *SecureString) bool {
        return bool(C.secure_str_eq(s.ptr, other.ptr))
    }


Ok, now how do you provide a SecureUserCreatedStructuredData type?


Serialize SecureUserCreatedStructuredData into an array of SecureString (or SecureInt, etc).


What if I want my application to be performant? There's no value in C ABI compatibility if the code internally costs a serialization round trip all the time. May as well just use Rust's FFI at that point, it'll be faster!

Also how do you even operate on a SecureUserCreatedStructuredData that is serialized into a SecureString or SecureBytes?

Does my function that operates on a SecureUserCreatedStructuredData take in a SecureString, which is documented as needing to be deserializable to a SecureUserCreatedStructuredData? If so, that's just a secure void*, with all the problems void* has, and slower.


Written in C.

Uses LLVM for execution.

Parsers are Lex and Yacc.


So a weekend toy project.

Hand-coded parser-lexers are far more powerful because their diagnostics are useful and are much easier to maintain.

Lex/yacc / flex/bison products by contrast have awful diagnostics and become steaming piles of confusion that have to be rewritten.


PHP is Flex/Bison.

The main point of the project under discussion is to change the data layout to be more CPU-cache friendly and thus try to be faster than C++. That's the paradigm.


Why do they call it "Python-inspired" if it looks like C++?

From https://github.com/google/rune/blob/main/benchmarks/mandelbr...

  for k = 0u32, k < 8u32, k += 1 {
    cr0[k] = 2.0 * <f64>(8 * x + k) / <f64>width - 1.5;
  }


Not syntactically.

It's slot-based, vaguely like Python, and has reference-counting GC, like Python. That's how I understand it.


So.. objective-c inspired? :P


Minus static_cast or replacing C cast parens with angle brackets.


Pretty interesting: imperative programming meets relational DB models meets column store, with a serving of constant-time operations on top for data marked as secret.

Also, memory-safe and blazingly fast in certain circumstances. Not complete though, some tests currently fail.


I'm perfectly fine seeing nullable references in a new systems programming language, but it's disappointing that nullable is the default. If you care about both performance and memory safety, being able to compile-time guarantee that you don't need a null check in a specific location can be big!


> This is not an officially supported Google product.

Then why is it under github.com/google ?


I believe Google has a practice where all projects that are copyright assigned to them are under github.com/google: https://opensource.google/documentation/reference/releasing


It's not a hard requirement, but they do make you jump through extra hoops to put them elsewhere.


Are there any officially supported Google products under github.com/google ?


Are there any officially supported Google products?


Are we Reddit yet?


Even if there aren't, it makes sense to clarify to media that this is an open source project made without any business commitments. I've seen other businesses do something similar; internally it reduces friction with the business admin when devs want to open source something.


_without any business commitments_ - not to sound trite, but when is the last time Google stood by its commitments?


Google still sells ads, doesn't it?


Didn't they originally claim to never show ads in search given that it is in direct opposition to search results quality? Back in the days when Pagerank was hot.


That was in the original pagerank publications. Google had lots of other early attempts to make money that avoided advertising (like the search appliance [0]), but the firehose of money from ads dwarfed anything else they could ever find and that early Google didn't last long.

[0] https://en.wikipedia.org/wiki/Google_Search_Appliance


Was that a "business commitment" or a "user commitment"?


Would Google Guice count?

https://github.com/google/guice


The code is owned by Google because it was written by a Google employee on the job.


Or off the job -- Google claims ownership either way. https://news.ycombinator.com/item?id=1969979


IIUC you can request an exception and have your own personal projects, you're just supposed to be transparent about it. Your request will be rejected if it competes with Google business (like if you're making a new collaborative email and docs solution), but otherwise I've not heard any sob stories.


That would be one more huge reason not to work for them.


Because Google owns the code.


> This is not an officially supported Google product.

Oh good, maybe it’ll be here a while.


Instead of this:

  do {
    c = getNextChar()
  } while c != '\0' {
    processChar(c)
  }
I'd prefer:

  loop {
    c = getNextChar()
    break if c == '\0'
    processChar(c)
  }
The eyesight rationale for curly braces (screen readers) is something I had never considered. But it would be nice if they were optional. I've been writing Python for 14 years and have never had a problem with mis-indenting.

Edit: I had to correct my post because Tabs were used in the original, so alignment was all wrong.


As someone who has some difficulty with reading sometimes, I find Python's way of doing things to be a genuine readability challenge. Python's the language I get paid to write, but, if I thought I could get away with it at work, I might try advocating we try out Hy simply so I could return to a world of having visible, non-whitespace delimiters to aid me in reading code.

I don't want to make too much hay about my challenges because I'm not sure they realistically qualify as a disability. But my own meager experience in this department does demonstrate to me first-hand that "it would be nice if accessibility affordances were optional" largely defeats the purpose of accessibility.


The braces could be automatically added or removed by any editor, so I don't see the problem, unless someone wants to write code like:

    if a == b {
 do this
        } else {
  do that
      }


Arguments about what's easy to do in any editor lose a bit of their shine in the post-GitHub era. Barring the development of a really good language-aware semantic diff algorithm that I'm guessing doesn't exist yet, you really do need to have a consistent coding standard everywhere if you want to avoid injecting unnecessary chaos (and, in this case, accessibility challenges) into exactly the parts of the modern software development lifecycle where code readability is most important.


I don't really care about the security aspects of the language (cool, but not earth-shattering), but I love seeing little passion-project langs like this. I have been doodling a language design for about a year now, and this one is eerily similar in some ways, and totally different in others. I dig it.


So there's also reveal() primitive for revealing secrets, as shown by the code for msqrt(). OK, Rune gives you a library for computing stuff in CT, but how do I implement something independent of these things and show that they are CT? Do I still need to rely on FM techniques such as relational symbolic execution? Can I jump into a "constant" mode with a special typing system or something that lets me do this without having to run a more costly verification?


Can someone explain how this works:

"Assume the attacker can tell how long it takes for mac == computedMac to run. If the first byte of an attacker-chosen mac is wrong for the attacker-chosen message, the loop terminates after just one comparison. With 256 attempts, the attacker can find the first byte of the expected MAC for the attacker-controlled message. Repeating this process, the attacker can forge an entire MAC."

How precisely should an attacker guess how long the comparison runs?


By how fast the function returns.

This is white-box security, a hypothetical setting where we assume the attacker has access to the entire knowledge of the system and to every oracle they want (like an oracle telling them how much time each function takes), but don't know any secret, like private or symmetrical keys. If you can prove that your function is secure in that setting, then it's secure in real-case situations where the attacker knows even less.


This is a timing attack or timing oracle. Let's assume a MAC represented as an array of 32 bytes. If we had a method like:

    #include <stdbool.h>
    #include <stdint.h>

    bool mac_matches(const uint8_t actualMac[32], const uint8_t expectedMac[32]) {
        for (int x = 0; x < 32; x++) {
            if (actualMac[x] != expectedMac[x])
                return false;  // early exit: run time reveals where the first mismatch is
        }
        return true;
    }
We return false as soon as we hit an invalid byte in our calculated MAC. If the time taken to execute one iteration of the loop is Y and the attacker is able to time this method accurately, they will be able to tell what the value of actualMac is by feeding known inputs. They will know because the return time will be 2Y when they have bailed after the first byte, 3Y after the second, 4Y after the third, etc.

This is why we should check the arrays in constant time: compare every byte in both arrays before returning. We do not return early, so we can’t leak information.
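A minimal Python sketch of the two approaches side by side. `leaky_equal` and `constant_time_equal` are names made up for this example. Pure Python gives no hard timing guarantees, so in real code the standard library's `hmac.compare_digest` is the thing to call; the hand-written version is only here to show the idea.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: run time depends on the position of the first mismatch."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Visit every byte and accumulate differences; no exit inside the loop."""
    if len(a) != len(b):  # the length is treated as public in this sketch
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # stays 0 only if every pair of bytes matches
    return diff == 0


expected = bytes(range(32))
wrong_first = b"\xff" + expected[1:]
wrong_last = expected[:-1] + b"\xff"

same = constant_time_equal(expected, bytes(range(32)))
library_says = hmac.compare_digest(expected, wrong_last)
```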


> in constant time

why is it called constant time if it isn't constant with respect to array length? Just seems confusing because the algorithm is linear without a short circuit


It's constant time in that it always takes the same amount of time regardless of the extent to which the two strings are equal. It is a different concept than constant time in complexity analysis.

What's even more confusing is that it is also constant time in the complexity analysis sense given that the mac is usually a fixed-size string after choosing a hashing algorithm.


Isn't it sufficient to compare 64 bits at a time? Then the oracle becomes rather useless.

Many current memcmp implementations use such large comparisons because they avoid hard-to-predict data-dependent branches for extracting the specific point of mismatch.


Don’t really buy it. Seems to be both “spherical cow optimistic assumptions” and “anyone who could seriously think about pulling this off has nation-state level resources and already 0wnz you and/or already has the rubber hose at hand.”


Not really. It doesn't rely on that big of an assumption, nor does it require nation state resources[0]. When you're trying to find the secret you can make a bunch of requests and measure for statistically significant change, which can still be detectable beyond jitter & web server load.

Also ignoring the fact that calling constant_strcompare(string, string) instead of strcompare(string, string) when working with secrets isn't that big of an ask.

[0] https://crypto.stanford.edu/~dabo/papers/ssl-timing.pdf


If you could measure the time granularly as a client requesting some resource on the server how exactly would you know the time corresponds to the comparison and not to some tangential task?


They wouldn't be guessing, they'd be measuring. I'm not qualified to really explain more but if you want to learn more, "timing attack" is what you're looking for

https://en.wikipedia.org/wiki/Timing_attack


Coda Hale’s old article on the topic is still good: https://codahale.com/a-lesson-in-timing-attacks/

(Note that Java’s MessageDigest.isEqual has been constant time since shortly after that article and you should use it rather than writing your own in Java).


Freddy the Pig?


Ha! Wow, looks like there is a redirect when the referrer is HN…


You guess the MAC tag value of the message and measure how long it takes the server to return "bad MAC" error or behave in a way that means the MAC was bad. In 1/256 cases, it takes longer because the first byte was correct. You may need to send many queries to get the value because timing is noisy, but with statistics you'll find that value. Now you try all 256 possible values of the second byte and one of them will take longer because both 2 first bytes are correct. Repeat.
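The byte-by-byte search can be simulated without any real clock. In this Python sketch the "time" is simply the number of byte comparisons an early-exit check performs, which stands in for the noisy measurements a real attacker would have to average. Everything here (`check_mac`, `recover_mac`, the 4-byte MAC) is a toy set up for illustration.

```python
def check_mac(guess: bytes, secret: bytes):
    """Early-exit check returning (accepted, comparisons performed)."""
    steps = 0
    for g, s in zip(guess, secret):
        steps += 1
        if g != s:
            return False, steps
    return True, steps


def recover_mac(length, oracle):
    """Recover the MAC one byte at a time using only the oracle's answers."""
    known = bytearray(length)
    for pos in range(length):
        best, best_byte = (False, -1), 0
        for candidate in range(256):
            known[pos] = candidate
            result = oracle(bytes(known))
            # A correct byte either gets accepted or survives more comparisons.
            if result > best:
                best, best_byte = result, candidate
        known[pos] = best_byte
    return bytes(known)


SECRET_MAC = bytes([0x13, 0x37, 0xC0, 0xDE])  # toy 4-byte MAC
queries = 0


def oracle(guess):
    global queries
    queries += 1
    return check_mac(guess, SECRET_MAC)


recovered = recover_mac(len(SECRET_MAC), oracle)
```

The search needs 256 queries per byte (1024 here) instead of the 256**4 a blind brute force would need.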

For the normal way to safely compare MAC values, see for example: https://docs.python.org/3/library/hmac.html#hmac.compare_dig...


So far, all comments on this thread are about the general concept of timing attacks...

You're asking a different question, though. You're asking about precision.

The answer here is that in many cases timing attacks pose a theoretical risk, but they can't be exploited in practice due to a low signal-to-noise ratio.

It really depends on the attack vector.

Measuring the latency of a network call (TCP) from across the other side of the world, as an example, is going to be too noisy (in many cases). Especially if the attacker wants to remain covert.


With secrets, the timing-safe stdlib variants are used; without them, the fast ones.

Timing-safe means always running the full loop and not branching away on certain values. Every value needs the same time.


This looks brilliant. I'm a sucker for languages that offer strong support for relational modeling without slow metaprogramming hacks.


Quite underwhelming. The two main features seem to be constant time operations on values wrapped with Secret, and easy SoA layouts.

The former is trivial to implement in Rust and C++, the latter is a bit more complicated, but also implementable in both via macros, in a reasonably ergonomic way.

What is the advantage then?


Interestingly the parser is generated using flex and bison, which I thought was rare these days.


If you want accurate error messages, you might want to write the parser by hand. If you want to make changing the grammar easy, and have a reliable, correct parser, you use a parser generator. Some have better error messages than bison.


As per usual with benchmark results they are measuring how much effort or knowledge the authors of the code had.

For example, in D you can write a container that automatically switches from AoS to SoA but most benchmarks are just copy-pasted C++


I'll be interested to see how this compares to Nim (the other python-similar systems programming language).

Very interested in seeing how its 'SOA memory management' turns out in practice.


What's up with having two different new experimental languages from Google on the front page https://news.ycombinator.com/item?id=33756800 - did Google just happen to release them at the same time?

I think when Carbon came out people's reaction was stronger than deserved (like saying that Google don't believe in Rust, or that Go has been a failure because it hasn't replaced C++ etc.) while in reality all of this is very experimental and very early in its development. I kind of expect similar reactions to Rune and Mangle with some people trying to make a big deal out of nothing.


This isn't an official Google project, the contributions seem mostly from one person, so probably someone doing this as a 20% project. Carbon OTOH has a team working on it AFAIK.


And it’s not new, either. First commits to the repo are well over a year old.


That's pretty new for a programming language.


The project GP was accusing them of name sniping is only a few months older.


Promotion cycle, new unfinished languages.


"I open sourced some internal project, it has no adoption and no mindshare and no plan to achieve either of these things" isn't going to make for a successful promo.

"Hey I built a thing, let's toss it over the wall but let people know what not to expect" is a perfectly fine thing for anybody to do.


It seems like somebody got too easy access to the Google Github repo. Usually only higher quality projects get approved to the central repo.


If I remember correctly we were supposed to have our personal projects there.

Since Google owns the copyright it almost makes sense.

Also, being in the Google organization doesn't mean that Google is involved in it. I'm maintaining a project there (`google/double-conversion`), despite not having worked for Google for years. Nobody at Google has any influence or reviews on that project.


This. Google’s policy is kind of dumb (I write a lot of stuff for fun that no one wants the ownership of… The primary value is I learned something that I can use later on something important), but pretty much any experimental thing someone at Google works on will wind up there, even on their free time.


Wrong. This is in a late state, and very high quality.

And Google is not really known to release high quality projects overall. Only some of them are.


Syntax reminds me of Go. I'm looking forward to trying this out.


This is an interesting idea for language design:

> Users of Rune are protected, because the compiler sees that macSecret is secret, and thus the result of hmacSha256 is secret. The string comparison operator, when either operand is secret, will run in constant time, revealing no timing information to the attacker. Care must still be taken in Rune, but many common mistakes like this are detected by the compiler, and either fixed or flagged as an error.


I mean, this would be relatively easy to implement in any language that has operator overloading.


How do you operator overload in C++ so that calling computeHmac(secret(string)) returns secret(string)?


Define the computeHmac method to return a secret(string) type?

Then define the equals operator for "secret" to behave in the needed way.


But now you've forced computeHmac to only work on secrets, when there is no such need. You've coupled an implementation of an abstract algorithm with the particular case that _you_ want to use it this one time with sensitive secrets.

The advantages of monads include exactly the opposite decoupling: the hmac algorithm implementation is true to its bare specification, and it is the context that changes some of its behavior.


You could overload computeHmac's return value so that it would return either a string or a secret, then you could use it directly with checkHmac, if you wanted, or as a string in other applications.


Consider this: I don't even know what hmac is, and it's implemented in a library.

Also, I can't overload return values on their own in C++, I would have to overload the whole signature. In fact, to get the same monadic result, I need 2^n-1 overloads for a function with n arguments (one for each subset of the arguments except the empty one).

Of course monads' advantages can be coded directly, just as functions can be coded directly in assembly. Personally I code in C++, so I've never used proper monads, but I see where they save work.


I was just considering a nicer interface for computeHmac at a library level, but if you don't have that, you could still just implement checkHmac using your Secret class and call it with checkHmac(Secret(str), message, mac).


It's turtles all the way down. Without the secrets monad support in the language, at some point I have to be the one enforcing secrets in multiple places: if I implement checkHmac(secret(str),...) that calls computeHmac(str)->str then I must both "unbox" the secret(str) _and_ "box" the return value of computeHmac.

I can do this, sure, and if I forget then I'll have a security bug. This is similar to the situation with c++ destructors over c's manual malloc+free. If you're happy freeing at every function's end, then this secrets thing adds nothing for you, and that's cool. It's your own choice what language to use.
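
A minimal sketch of that manual boxing and unboxing, written in Rust because the thread's other examples use it. Everything here is hypothetical: `Secret`, `check_hmac`, and `constant_time_eq` are illustrative names, and `compute_hmac` is a toy stand-in for a library routine, not a real HMAC.

```rust
/// Hypothetical wrapper marking a value as sensitive.
pub struct Secret<T>(T);

/// Stand-in for a library routine that knows nothing about secrets.
/// This is NOT a real HMAC; it only exists so the sketch can run.
fn compute_hmac(key: &str, message: &str) -> String {
    format!("{}:{}", key.len(), message.len())
}

/// Comparison that inspects every byte regardless of mismatches.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

pub fn check_hmac(key: &Secret<String>, message: &str, mac: &str) -> bool {
    // Step 1: manually "unbox" the secret to call the plain-string function.
    let raw_key: &str = &key.0;
    // Step 2: manually "box" the result again. Nothing forces this step,
    // so forgetting it silently drops the secrecy marker.
    let computed = Secret(compute_hmac(raw_key, message));
    // Step 3: choose the constant-time comparison by hand.
    constant_time_eq(computed.0.as_bytes(), mac.as_bytes())
}
```

Each of the three numbered steps is a place where the programmer, not the compiler, has to remember the rule.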


If you try to call computeHmac from checkHmac surely you'll get some sort of error or warning, wouldn't you? You have no guarantee that your secret will be treated as a secret by computeHmac if its signature is simply "str". Unboxing it is never safe.


Any language that's implemented in C++ already starts with a handicap

That's nice to see they went with C

I like the syntax


What it's written in is meaningless except to religious folks. Supposedly, they use lex and yacc, which means it's a toy. Mainstream compilers use hand-written lexers and/or parsers because they're faster and better.


Now, it is all just a race to replace C++ as quickly as possible.


It's not even typed in Futhark.


I like the syntax a lot


Looks like Swift.


Haha


There already exists Rune programming language and that one was earlier: https://rune-rs.github.io/

They should be more careful picking the name.


Let's be real for a minute. A couple of hobbyists have named their pet project "rune". Should the name be then forsaken for all eternity?


Let's be real for a minute. What you're actually saying is these hobbyists don't really matter and they don't even deserve to name their projects. Only Real Projects created by Real Programmers at Real Big Tech corporations get the cool names.

This is the kind of disrespect that pushed people to create trademark laws.


The actual issue here is you acting like a name collision is a huge problem. It isn't, it's an everyday occurrence on Github alone. We just add a bit more info, like the account name in the case of Github or the year of release for movies/series/games etc.


If a name really isn’t such a big deal, then it shouldn’t be a big deal to change it to something else that wasn’t already taken. If there’s resistance to that idea, then maybe names are a big deal after all.

For a language dev, the name of the language is all you really own about it. These days, developers expect their languages and tools to be free, and of course open source and permissively licensed. The name and logo of the language is really the only IP most PL devs actually fully control, and costs actual money and time to maintain (registering and defending trademarks, domains, etc.)

To just step on names like Google has repeatedly done shows a crass disregard for what independent language devs go through.


We are talking about "rune", a common English noun. It's not like Google called it Zig or Jai. And how many github repos are just called "Lisp"?

Google isn't exactly innovative with their naming: Fuchsia, Dart, Pixel, Go, Drive, Ara, ... Aside from rare short-term experiments like Stadia everything outside a basic dictionary should be safe.

I'm not going to defend Google, but this specific case isn't one that I'd lose my mind over.


name collision becomes a problem when at least one of the entities is willing to bring lawyers to bear. not saying that's happening here, but certainly more of a concern in a situation where you have a hobbyist going up against a big company.


Hobbyists do deserve to name their projects. And other people can name their projects the same thing. Not a big deal.

(By the way, the reverse scenario here should be okay, too. If Google makes a project with a common noun name, then others should be able to use that noun to name their projects.)


If Google can just stomp on anyone's name and that's fine by you, then what does it mean to say that hobbyists "deserve" to name their projects? What you're really saying is that whoever has the loudest voice backed by the most money gets claim over the name, regardless of who had claim to it first. In that world, hobbyists get whatever is left over by big corps, and don't really "deserve" anything.


> What you're really saying is that whoever has the loudest voice backed by the most money gets claim over the name, regardless of who had claim to it first.

No, I’m saying that nobody has “claim over the name”. Naming collisions happen all the time, and I don’t know why we get so bent out of shape about it. There are two multibillion-dollar software companies called Epic. There are a million businesses called AAA. I’ve been to three different breakfast restaurants called Sunrise.


Yes, actually, or at least until abandoned by the original authors. That's pretty much the norm in the PL community. There are enough names out there that no one needs to step on any toes. Although I guess Google engineers don't care much about community norms.

But in the spirit of being real: what are you trying to do by calling the Rune devs "a couple of hobbyists"? Is that an attempt to minimize them, as if they are not a corporation and therefore don't have any naming rights to their projects? "A couple of hobbyists" is how many great languages you know and love started out. Their rights are important too. We don't want the norm to be big corporations snuffing out hobbyist projects by making them unsearchable, like Google did to Go!. That's bad for everyone.


There already was a GO programming language before Google decided to use the name, too.


As some other comments have pointed out, this seems to be a one-person project. FWIW, Go also started as a project by 3 people who happened to be working for Google (Robert Griesemer, Rob Pike and Ken Thompson), so it wasn't Google consciously choosing the name.

Perhaps Google as an employer shouldn't allow employees to choose any name they like, and do some diligence to avoid name clashes. This may sound quite reasonable for outsiders, but internally this will be another step that requires manual review in the process of publishing open source code, and employees will see this as red tape and get discouraged from open sourcing their code in the first place.

The benefit of requiring every project to go through a name clash review is also questionable: there are 2.5k repos under https://github.com/google, and most of them will never become popular enough for name clashes to be a problem anyway. This repo only has 177 stars despite hitting the HN homepage.

IMO Google should instead make it easy for people to publish their open source code wherever they like, but I suppose there are some messy legal reasons why they prefer employees to put their repos under https://github.com/google. (It's not a hard requirement, but they do make you jump through extra hoops to open source your code elsewhere.)

(I'm a Google employee, but I didn't know this project and don't work for the department responsible for the process of open sourcing code.)


It shouldn’t need to be a company policy, you’d think a competent engineer would simply do due diligence in naming their project. I remember searching for name clashes for a project I wrote solo when I was ~13 years old in the early 2000s.


Unfortunately Rob Pike, Ken Thompson, and Robert Griesemer are not competent engineers by this standard. I’m sure they’ll be sad to hear it.


The competence is assumed. The disappointment comes from people who are competent not doing the due diligence that some think should be par for the course when naming a project.


You seem to be a lot more saddened than I am that a nice two letter name wasn't successfully squatted for all eternity by a no longer maintained ultra obscure niche language that no one has ever heard of let alone used.

Now perl stealing prolog's file extension on the other hand...


I don't have any particular opinion about Go. Just occurred to me there might be a more charitable interpretation of the comment.


Ah, so they're repeat offenders. How's it go again? Cache invalidation, naming tconcurrencyhings, , and off-by-one errors?


Google: - leader in AI and text generation - unable to find or create original names for langs


Nah they're doing just fine. I was googling alien porn and I landed here


Given how poor their search engine performs as of late, it's no surprise they can't seem to be original; they obviously can't find anything that already exists.


Hey listen, namin things is really hard


Correction, there was a "Go!" programming language.


There also was a singer before Apple decided to name their programming language ;)


That doesn't justify them choosing an already chosen name now, does it?


Rune (the existing language) is a fairly small hobby project that only got started in 2020. The developer of this new Rune probably just didn't know about it.

There is nothing nefarious here. There is no cabal. These things happen all the time. I've had it happen to two of my own small/obscure projects. It happens.


Google's Rune is an even smaller hobby project.


I didn’t read the GPs comment as a justification.


Unless it's trademarked it doesn't matter in the slightest


Who is the rightful owner of a name or similarly, a piece of land, or an idea, patent?

The first settler? The first settler that held it for at least a year, 10 or 100? The most powerful entity claiming it?

In the modern western mind there is the notion that whoever grabs it first rightfully owns it. Which is a simple rule, but encourages squatting and holding but not using. The squatter can then hold ransom against somebody who would be the rightful owner.

At least with patents or electromagnetic spectrum there is a time of expiration. You come first and claim it for a few years after which it becomes public domain. Or we hold an auction each 5 years to maintain stable and efficient allocation of a finite and scarce resource.

With concepts like programming languages, the case with stronger base wins, like in the example of Go. People associate the Go label with Pike's Golang, not with the previous Go!.


> Who is the rightful owner of a name or similarly, a piece of land, or an idea, patent

It used to be that if you have lived on the land for 3 generations it's yours.


When you say “Modern western mind”, who exactly are you referring to ?


One would think they would try googling it beforehand.


The original title I submitted had the proper name for it. "ᚣ The Rune Programming Language"


Whoever gets the stronger adopters gets the name.

A programming language is such a common personal project that it’s inevitable this will happen. It also doesn’t help that 95% of these languages aren’t known.


Can someone explain the timing security vulnerability mentioned in the article? I'm not sure I understand.


This is sort of a classic problem. First, you need to assume the attacker is capable of measuring checkMac's runtime reliably. Second, you also need to assume the attacker is able to control the message and the mac, but not the secret. How the attacker got there is not really relevant; the point is figuring out whether the system can be cracked by an attacker who has full knowledge of everything but the secret.

In most languages, a similar implementation of checkMac would not pass the test, because they will usually implement some sort of short-circuiting. Which essentially means that checkMac will take longer to execute the closer you get to the true mac of that message.

Let's say computeMac(secret, "a") == "a21a". The attacker could pass in message="a" mac="0000" at first. Let's say that takes 1 unit of time, because "0000" == "a21a" only has to look at the first character. So the attacker knows that 0 is wrong. They then try "1000", then "2000", up until they get to "a000". Then, the algorithm takes 2 units of time: the first one is comparing 'a' == 'a' and the second one is '0' == '2'. Now the attacker knows the first character of the mac. They keep going like this until they find out the entire mac of the message. In a nutshell, the time the function takes to execute leaks information that an attacker could use.

In this language, when you do use the secret(string) type, it will always compare all the characters of the string, even after it knows the result will be false anyway, just to make sure no information is leaked.
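
To make the leak concrete, here is a minimal Rust sketch. It is not Rune's implementation; the names `leaky_eq` and `full_scan_eq` are hypothetical, and each function returns the number of byte comparisons it performed as a rough stand-in for running time.

```rust
/// Short-circuiting comparison: stops at the first mismatching byte.
/// Returns (equal, number of byte comparisons performed).
fn leaky_eq(a: &[u8], b: &[u8]) -> (bool, usize) {
    if a.len() != b.len() {
        return (false, 0);
    }
    let mut steps = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        steps += 1;
        if x != y {
            // Early exit: the step count reveals how long the matching prefix was.
            return (false, steps);
        }
    }
    (true, steps)
}

/// Comparison that always inspects every byte, accumulating differences
/// instead of branching on them.
/// Returns (equal, number of byte comparisons performed).
fn full_scan_eq(a: &[u8], b: &[u8]) -> (bool, usize) {
    if a.len() != b.len() {
        return (false, 0);
    }
    let mut diff = 0u8;
    let mut steps = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        steps += 1;
        diff |= x ^ y; // stays 0 only if every byte pair matched
    }
    (diff == 0, steps)
}
```

With the real mac "a21a", `leaky_eq` does 1 comparison for the guess "0000" and 2 for "a000", while `full_scan_eq` does 4 for both. A production version would also have to stop the compiler from optimizing the full scan back into an early exit, which is what crates such as `subtle` aim to handle.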


~I love when I see programming languages whose first advertised features are implementable in 8 lines of rust~

Edit: ^ the above had the wrong tone. Thanks to dang for pointing it out. What I meant to express was that it's possible to accomplish a similar safety/ergonomics at the library level in rust in not too many SLOC. My personal preference is towards Rust's approach because the type system gives really powerful composable primitives which makes it possible to have the compiler check a wide range of invariants, instead of just the ones that are common/special enough to go into the language itself

(Example edited after comments from mumblemumble)

    // The struct is public, but the contents are private, meaning you can't directly access the secret once it's inside the struct
    pub struct Secret<T>(T);

    impl<T> Secret<T> {
        // The only public way to access the secret, returns a new secret
        pub fn map<U>(&self, func: impl FnOnce(&T) -> U) -> Secret<U> {
            Secret(func(&self.0))
        }
    }
    
    impl<T: AsRef<[u8]>> PartialEq<&[u8]> for Secret<T> {
        // == does the correct thing (and only works for types where it makes sense, i.e. `AsRef<[u8]>`)
        fn eq(&self, other: &&[u8]) -> bool {
            constant_time_eq(self.0.as_ref(), other)
        }
    }

    /* Some other file */

    use secret::Secret;
    
    // Translated from the example
    fn check_mac<T: AsRef<[u8]>>(mac_secret: Secret<T>, message: &[u8], mac: &[u8]) -> bool {
        // This returns a new Secret<[u8; 32]>
        let computed_mac = mac_secret.map(|secret| hmac_sha_256(secret.as_ref(), message));

        // This uses the `constant_time_eq` impl from above
        computed_mac == mac
    }

Edit: It looks like you can implement SOA as a macro too https://github.com/lumol-org/soa-derive

Edit: mumblemumble helpfully points out I demonstrated this poorly, so I tried to better demonstrate what I was going for in this comment https://news.ycombinator.com/item?id=33764037


This Rust example seems like it misses the point? Rune detects that you're working with data that is marked as secret, and gives you a compiler guarantee that it's defending against some set of known attacks.

The Rust code above depends on the programmer to consistently remember to enforce safety, and to do so correctly every time.

Sure, you could probably implement that as a library. But, "We don't see much value in compiler help with this, a combination of libraries and being careful gets the job done," would be a peculiar position for a rustacean to defend.


Great callout, I haven't had my coffee yet. Here is a version that better shows what I intended

    pub struct Secret<T>(T);

    impl<T> Secret<T> {
        pub fn map<U>(&self, func: impl FnOnce(&T) -> U) -> Secret<U> {
            Secret(func(&self.0))
        }
    }
    
    impl<T: AsRef<[u8]>> PartialEq<&[u8]> for Secret<T> {
        fn eq(&self, other: &&[u8]) -> bool {
            constant_time_eq(self.0.as_ref(), other)
        }
    }

    /* Some other file */

    use secret::Secret;
    
    // Translated from the example
    fn check_mac<T: AsRef<[u8]>>(mac_secret: Secret<T>, message: &[u8], mac: &[u8]) -> bool {
        // This returns a new Secret<[u8; 32]>
        let computed_mac = mac_secret.map(|secret| hmac_sha_256(secret.as_ref(), message));

        // This uses the `constant_time_eq` impl from above
        computed_mac == mac
    }


I think the interesting part of the example is what you _can't_ do in the other file. It's pretty hard to misuse because the return type of `Secret::map` is a new `Secret`, and the only way to do `==` on a `Secret<T>` uses a constant time compare.

I guess my main point is that instead of having to add new things at the language _level_, if I have something as powerful as the Rust type system I can implement the same functionality in not much code.


That covers the one case in the example, but the language goes even further than that in ensuring constant-time processing of secrets, including ensuring speculative execution in the CPU won't expose the data to timing attacks.

I don't know enough about the subject to really evaluate this in detail, but I am more than willing to at least entertain the notion that the problem space is thorny enough that a language-level solution really can do some things that can't be as effectively accomplished with a library solution. Even in a language with a strong compiler like Rust.

Rune also has an interesting approach to pointer safety that's significantly different from Rust's: https://github.com/google/rune/blob/main/doc/index.md#runes-...


It all gets more complicated when you want to pass more than one secret parameter, or the function already returns a Secret - now you need a monad. The key feature seems to be that the code does not need 'map' or anything, the secrecy flag is propagated regardless.


> now you need a Monad

"need a Monad" sounds scary but in practice it looks like this

    impl<T> Secret<T> {
        pub fn map<U>(&self, func: impl FnOnce(&T) -> U) -> Secret<U> {
            Secret(func(&self.0))
        }

        pub fn flat_map<U>(&self, func: impl FnOnce(&T) -> Secret<U>) -> Secret<U> {
            func(&self.0)
        }
    }
If you need an escape hatch for something more complicated, you could provide an api to that

    impl<T> Secret<T> {
        pub unsafe fn reveal(&self) -> &T {
            &self.0
        }
    }


My point, which I didn’t express well there, was about the call side: How do you call

    func(param1: A, param2: B) -> C
with both parameters being secrets (secret1: Secret<A>, secret2: Secret<B>)? In Rust, it‘s

    flat_map(secret1, |param1|
      map(secret2, |param2|
        func(param1, param2)
      )
    )
or with do_notation at least a bit cleaner

    do! {
      param1 <- secret1;
      param2 <- secret2;
      Secret(func(param1, param2));
    }
whereas Rune manages it with

    func(secret1, secret2)
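
For comparison, a library could narrow the gap with a two-argument combinator. This is a minimal sketch, assuming a hypothetical `map2` method that is not part of any code above; `reveal` mirrors the `unsafe` escape hatch shown earlier in the thread.

```rust
/// Hypothetical wrapper marking a value as sensitive.
pub struct Secret<T>(T);

impl<T> Secret<T> {
    pub fn new(value: T) -> Self {
        Secret(value)
    }

    /// Apply a two-argument function to two secrets; the result stays secret.
    pub fn map2<B, C>(
        &self,
        other: &Secret<B>,
        func: impl FnOnce(&T, &B) -> C,
    ) -> Secret<C> {
        Secret(func(&self.0, &other.0))
    }

    /// Escape hatch, marked unsafe so that every use stands out in review.
    pub unsafe fn reveal(&self) -> &T {
        &self.0
    }
}
```

The call site becomes `secret1.map2(&secret2, func)`, which is closer to the Rune form but still explicit, and every additional arity needs its own combinator or a macro.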


Please don't be snarky when evaluating someone else's work. Your comment would be fine without the opening swipe.

https://news.ycombinator.com/newsguidelines.html


Compiler checking is always better than programmer checking.

As someone who uses rust, I assume you would prefer the former absolutely.


It's got some novel ideas: better than many new languages!

Don't quite grok why it eliminates the need for ref counting though. Tree structures are fine when you have them, but frequently you don't. The docs claim Rune programmers never write destructors even though there's no GC, so is there no equivalent of RAII? How do you model graphs?

The constant time stuff doesn't matter. Virtually nothing needs to be constant time like that and when it does you're probably writing in assembly anyway.


I think the point behind "constant time" here is that the comparison always takes the same amount of time, and not based on how many characters in the two strings are equal, to prevent a timing attack. "Constant time" isn't very accurate here because given a choice of hashing algorithm, the hash is a fixed-length string and therefore even a trivial string comparison technically "runs in constant time", at least if we allow the typical abuses of language in CS.


Yes, but my point is that there are only a few use cases for this, mostly in cryptography, and to ensure something is genuinely constant time you have to write in assembly. The routines needed are small enough and compilers hard enough to trust that you just bypass them in practice, so it's an odd choice for something to build into a language syntax.


My assumption is that it's not just the syntax, but that their compiler does the right thing. Though someone said it's based on LLVM, so who knows how much control they get.


I suspect for the kinds of projects this might in theory be targeting, they aren’t doing any dynamic allocation at all.



