Berry is an ultra-lightweight, dynamically typed embedded scripting language (berry-lang.github.io)
266 points by dannyobrien 11 months ago | 97 comments



A surprisingly rich feature set for a 40-KB runtime: a VM with GC that runs a Python/Ruby lookalike language, supporting procedural, OO, or functional styles. From a cursory look, it seems pretty ergonomic to write.

What stood out for me is the ability to pre-create constant objects and put them mostly into ROM, so that RAM is only used for the actually mutable data. This is something you can't have with MicroPython or Lua, AFAICT, and it makes a lot of difference on MCUs where ROM/flash is plentiful and RAM is scarce.


You can "freeze" micropython modules and have them saved in flash with Micropython, where flash only stores the precompiled bytecode.

If your module contains stored data in the form of an immutable object, like a string or bytes(), it will be read straight out of flash without copying to RAM first.

This does require running a tool on a desktop computer to perform the freezing, though.

https://docs.micropython.org/en/latest/reference/constrained...
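
For reference, freezing is driven by a manifest file. A minimal sketch (the board name and paths here are made up, but include() and freeze() are the documented manifest functions):

    # manifest.py -- hypothetical paths; include()/freeze() are real
    include("$(PORT_DIR)/boards/manifest.py")  # keep the port's defaults
    freeze("modules")  # compile ./modules/*.py into the firmware image
    # built with e.g.: make BOARD=PYBV11 FROZEN_MANIFEST=manifest.py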


Any serious build setup for MCU firmware is going to have steps like this in it, generating code and saving blobs of this and that.

Though I guess part of the pitch for micropython is being able to iterate without a flash step every time. Possibly addressed by having your dev boards fitted with a version of the MCU with enough RAM to run it all that way.


Just to be clear, it's not just data: the bytecode is also executed from flash.


I have to admit, I went back to the site after reading your description. I find it much more comprehensive and appealing than the home page.


As @snops explained, MicroPython supports executing code from flash in the form of frozen code. The consequence is that it needs to be compiled into the firmware binary, which slows down the typical rapid development cycle of MicroPython.

So a convenient workflow is to deploy at runtime (i.e. not frozen in flash) until you stabilise your code base or approach production, at which point you can freeze your code into your firmware.

For an even more convenient model, there is a proposal to support 'mapfs', where some flash is allocated to store code. You can then compile your Python to bytecode (with `mpy-cross`) and upload it to, and run it from, this area at runtime.
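
For what it's worth, the cross-compile step is a one-liner (file name hypothetical):

    $ mpy-cross app.py    # emits app.mpy, ready to upload to the board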

Although this feature has been demonstrated as a proof-of-concept, there are some subtleties to sort out before it hits mainline. If you're interested in the feature, please read or comment on the PR:

https://github.com/micropython/micropython/pull/8381


It does indeed look interesting and looks to be similar to Lua in several aspects (although, there are some noticeable differences too). FWIW, there has been work done in eLua to keep some of the data (transparently) in ROM: https://eluaproject.net/doc/v0.9/en_arch_ltr.html (see rotables).


Note that Berry is used by Tasmota: https://tasmota.github.io/docs/Berry/


Tasmota looks interesting. I'd be interested to hear your thoughts if you've tried it, or how it compares to toit/jaguar.


I use Tasmota on a couple of smart wifi plugs, and apart from it being very reliable and doing what it says, I don't have any development experience with it. But isn't toit a language and jaguar an app, as opposed to Tasmota being a complete firmware that merely includes Berry as the runtime?


Yes the nomenclature is a bit confusing but I believe toit is the SDK/platform/firmware(?), toitlang is the language, and jaguar is the CLI tool used to bundle/deploy it all onto a device.

As far as I can tell, the combination of all three of the above is the equivalent of tasmota + tasmotizer + berry?

I guess what I am asking for is how does developing on the ESP32 when using toitlang + toit SDK compare to issuing commands against Tasmota using berry?


That makes much more sense. Naming continues to be a hard computer science problem :)


Man, this looks great. And, personally, I think this is some of the best documentation I’ve seen. Kudos to those who put it together! I love the “short manual” for experienced devs to quickly get a sense of the language. And I wasn’t familiar with Tasmota before but def will be looking for an excuse to try it on a project.


I guess the main questions I'd have would be:

What's the performance & memory usage compared to Lua?

How sandboxable is it? Can you run untrusted code through it?


The comparison with Lua is the most interesting to me. Most companies I've worked for used Python as a shitty proxy or CLI tool, but I never could recommend Lua due to its confusing OO system (metatables??)

If this language is a valid alternative with a simple struct and a basic constructor, I may recommend it to remove all the Python scripts and their dependencies.


Having a lot of experience wrangling prototypes in JS was a nice benefit to my first forays into lua, though I wrote more functional code than OO anyway.

Honestly, off-by-one errors due to indexing were a lot harder to shake than getting used to metatables.


I’m assuming the off-by-one indexing errors were due to Lua's 1-based indexing? If so, was it because you were used to 0-based indexing, or something else?

Just curious. I’ve been kicking around some attempts to make programming simpler (though hopefully no less powerful) for casual programmers. Lua’s 1-based indexing gets a lot of critiques and, while I haven’t written a ton of Lua, can’t help but think that the critiques are due to us all being used to 0-based indexing. Which is a valid critique, especially for a scripting language, but one that applies less to novices.


Bingo. All of the languages I used prior to Lua (and since) use 0-indexed arrays.

There are times when doing math with array indexes is simpler with 0-indexed arrays, and times when it's simpler with 1-indexed ones. I don't think I really found 0 difficult to learn; there are a great many more challenging things.

I wouldn't say "don't use a 1-indexed language", but I've also never found the arguments in favor of it compelling when using lua.


This is somewhat inefficient, but zero-indexed arrays can easily be simulated in lua:

  -- wrap a 1-indexed backing table behind 0-based accessors
  local array = function(backing)
    local res = { len = function() return #backing end }
    setmetatable(res, {
      __index = function(_, k) return backing[k + 1] end,
      __newindex = function(_, k, v) backing[k + 1] = v end
    })
    return res
  end
  local x = array { 1, 2, 3 }
  print(x[0]) -- 1
  x[1] = 5
  x[x.len()] = 6 -- appends: len() is one past the last 0-based index
  for i = 0, x.len() - 1 do print(x[i]) end -- prints 1, 5, 3, 6


Metatable types are indeed kind of awkward, but it's pretty easy to build a type system on top of them in a library. Check out metaty, where I built a type system, formatter, and deep equality checking in like 700 LoC. I'm adding the ability to document types/functions as well, after which it will be "done"


Would you have a link? My googling skills have failed me.



You can't run untrusted code, because by default code can open and write to the filesystem. Sandboxing is up to the application. For sandboxing, wasm is a good alternative.


Removing the API for file access would be easy. Limiting compute time is the harder thing.


Limiting compute is trivial on any modern operating system. On Linux you can create a second supervisor thread, block on a mutex with a timeout, and then, if you timed out before acquiring the mutex, just tgkill the runner thread. On BSDs it's even easier, since you can just hand the runner thread's port to your supervisor thread and, after you time out, set RIP/PC to whatever location you want without having to screw around with signals.
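
For the curious, a minimal sketch of the Linux watchdog pattern described above, assuming pthreads; pthread_kill is used here as the portable spelling of tgkill, and (as the replies note) pthread_exit from a signal handler is exactly the kind of forcible termination that can leave process state corrupted:

    /* Hedged sketch of the watchdog pattern; not production-safe. */
    #define _GNU_SOURCE
    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>
    #include <time.h>

    static pthread_mutex_t done = PTHREAD_MUTEX_INITIALIZER;

    static void stop(int sig) { (void)sig; pthread_exit(NULL); }

    static void *runner(void *arg) {
        (void)arg;
        for (;;) {}                     /* stand-in for untrusted work */
        /* pthread_mutex_unlock(&done) would signal normal completion */
        return NULL;
    }

    int main(void) {
        signal(SIGUSR1, stop);
        pthread_mutex_lock(&done);      /* runner would unlock this when done */

        pthread_t t;
        pthread_create(&t, NULL, runner, NULL);

        struct timespec deadline;
        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += 2;           /* two-second compute budget */

        if (pthread_mutex_timedlock(&done, &deadline) != 0)
            pthread_kill(t, SIGUSR1);   /* timed out: interrupt the runner */
        pthread_join(t, NULL);
        puts("runner stopped");
        return 0;
    }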


I'd love to better understand what you're describing. Thread-level compute limits generally seem like they need to be managed in some sort of cooperative fashion, otherwise you end up with the possibility of dangling locks and other kinds of corruption on forcible termination.

What signal are you sending to the running thread? What does the handler look like? How does that translate to terminating the thread? How do you do this forcibly (without cooperation from user code)? Schemes I have seen look more like instrumenting polling/checkpoints into user code.


He's describing killing a thread using a watchdog.

This doesn't limit the compute power of the thread.

It's an all-or-nothing approach; either the thread runs or it is killed, which doesn't work (as you point out) for all circumstances where you want to limit the amount of compute power a thread can use.


They're describing sending a signal to a thread, not "killing" it. IIRC, the only thing tgkill does for you is ensure which thread runs the signal handler. For example, if you send a SIGTERM (IIRC), you will terminate your entire process.


You let it operate on a copy of the input data and check integrity afterwards. If it doesn't reach a valid state you discard it as failed/corrupted.


There are many pieces of process-global runtime state that can get corrupted that are irrelevant to the input data.

You can get lucky and sometimes avoid corruption, but eventually you're going to hit something important like leaving a libc heap allocator mutex locked when you kill a thread in the middle of an allocation.


Yes, it is very easy to remove the entire standard library from lua. Could compute time be handled with signal handlers in at least some cases? If not, one (admittedly somewhat hard) option is to write a small lua-to-lua compiler that injects yields into loops and recursive function calls.
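
For what it's worth, stock Lua can get part of the way there without a compiler pass: debug.sethook accepts an instruction-count hook, which can error out of runaway code. A sketch (a hostile script with access to the debug library could of course unset the hook):

    -- run untrusted() with a budget of VM instructions
    local function run_limited(untrusted, budget)
      local co = coroutine.create(untrusted)
      debug.sethook(co, function()
        error("instruction budget exceeded", 2)
      end, "", budget)
      return coroutine.resume(co)
    end

    print(run_limited(function() while true do end end, 1e6))
    -- false   ...: instruction budget exceeded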


Native thread management with an `async` keyword would really make it stand out from Lua for game engine uses.


I wouldn’t call that "standing out", especially when it comes to game engines. Game engines are very complex and all differ, from their implementations to their algorithms; a native thread-management keyword creates more worries than it does distinction.


I'd love something exactly like this, but with less paradigms and statically typed, for use as a configuration language.

I've used several projects requiring non-trivial configuration that, instead of requiring you to write hundreds of lines of yaml, simply let you write Lua or Starlark/python, which feels so much better to me. I'm always missing autocompletion and reflection though. There doesn't seem to be a good candidate for this, pretty much all small embeddable scripting languages are dynamically typed...


I've been thinking along these lines but more 'strongly validated' than statically typed in the sense that you'd be better off being able to load the entire config and then produce a list of problems (and should be able to offer good editor support if done correctly).

Though https://dhall-lang.org/ demonstrates that you can statically type quite a lot of configuration to great advantage, which appears to be programmatically embeddable in multiple languages per https://docs.dhall-lang.org/howtos/How-to-integrate-Dhall.ht...


Looks nice. I'm still rather partial to uLisp (http://www.ulisp.com), but it's great to see this.

Erm... Berry good :)


microcontroller options are interesting, also Forths (https://github.com/tabemann/zeptoforth)


If like me you like to look at examples of code to get a feel for the language, take a look at https://github.com/berry-lang/berry/tree/master/examples


It would be very nice to provide bindings to other languages. We use quickjs from rust and it works pretty well; you can expose exactly what you want to the VM, so it can run untrusted code.


This looks like a language optimized for embedded use. It looks well designed and documented, and doesn’t do anything stupid or unexpected. The syntax is pleasantly minimalist and tasteful. I‘ll definitely keep it in mind for my next ESP32 project.


I will use it right now, I’ve been messing with my ESP-32 dev kits for some flashy Halloween decorations, and I’m excited to try this out. I agree, it looks great for the niche it’s in, especially since it’s doing things that I don’t really think have an analog in any other language.


Is there a standard approach to making native stack traces capable of marking stack frames with the name of the scripted function? Like when you get a crash or use a cpu profiler, is it possible to interleave native and script stack traces?


I think the thing I'd miss most from python if I was using this is multiple-inheritance. Not even necessarily true multiple inheritance, but at least some kind of mixin.


Class-based inheritance is IMHO overrated. Most of the time you're interested in specific behavior (what is the area under this shape?), rather than some abstract taxonomy (is square a specialized rect or is rect a generalized square?).

Composition and interfaces tend to produce less convoluted designs, and the most obnoxious problem with multiple inheritance ("diamonds") has to be explicitly addressed at the call site, rather than by meditating on the globally-determined method resolution order.


Imho it's just that method signatures are enough work to write and maintain that inheritance is needed for simple code reuse sometimes.

A language with good constructs for convenient call-signature reuse and redirection at the function level could probably skip implementation inheritance.


How do you feel about prototypal inheritance and/or mixins as an alternative? I sorta feel like the component in the Entity/Component/System model that’s been getting popular in gaming drifts into the same territory, though I’ve never written games.


ECS is different. Systems coordinate entities; entities have components; some components can be behaviors, while others are not (vertex data). You would need to provide something a component can inherit from to mark it as a behavior. Unity does this with MonoBehaviour (not mono as in singular, Mono as in the Mono runtime; the name stuck).

With an ECBS system, one could model all the permutations as described while keeping code concise, shareable, and non-repetitive. Unity is really an ECBS.

Inheritance vs. not is as old as OOP vs. functional. Each has its place; each has its pros and cons. Having functional components and behavior traits attached to inheritance-based entity nodes is probably the best of all compositions. Entities can get more and more complex with inheritance chains (further classifying what components can be attached); components can include behaviors, data, events/triggers, sub-components. Behaviors (triggering events or waiting for them) is where your game logic would mostly reside.


Aha, thx for the insight. Very helpful. A couple questions:

1) If I wanted to learn more about ECS/ECBS, would Unity be a decent place to learn? I know sometimes the terms get coopted (for example, MVC as a term gets fairly abused at times in web development) so I wasn’t sure if their implementation was a good one to learn from. And while I’m a fan of open source, Unity certainly does seem to be a leader in adoption and available assets, etc and not a bad place to start. I’m not really looking to develop a commercial game, but in addition to my poking around game engines to see if and what might be applicable ideas for a more classic business/personal type apps, I am interested in building something that broadly speaking sorta acts like an FPS or Arma type game but just for the sake of playing around and simulating different drones and drone swarms, etc in a 3D environment and playing with that. Pretty crude is fine, and AAA level graphics aren’t really a huge concern. I think I honestly could prob get pretty far with just a non-graphical, non-gamelike OO design with JS/Python/whatever as a crude simulation. But I’ve heard that standing up something like an FPS in Unity is not too difficult. And I don’t plan on pushing boundaries from a graphics or asset perspective.

2) > Systems coordinate entities

Could you elaborate a bit? Systems are the one aspect that is hardest for me to get my head around, coming from more standard web/business development. So physics engines would be a system, right? But I’m not sure what other examples are.

Thx for the info. And feel free to just refer to links if you know any decent ones. I’ve been poking around ECS from afar for a couple years now but haven’t really found articles that fill in the gaps for things like systems in ECS. Though I’m sure the fact that I have little context in game development doesn’t help… lol.


> Systems are the one aspect that for me are hardest to get my head

In an ECS architecture, the world is a database of entities, which are just identifiers with components associated with them. Systems are simply code that queries this database and operates on the returned data. For example, if you want burning entities to set burnable entities on fire, you might write a system like this, using Bevy as an example:

    fn fire_spread(
        mut commands: Commands,
        burning: Query<&Collider, With<Burning>>,
        flammable: Query<(Entity, &Collider), With<Flammable>>,
    ) {
        for col_1 in &burning {
            for (flammable_entity, col_2) in &flammable {
                if col_1.touches(col_2) {
                    commands.entity(flammable_entity).insert(Burning);
                }
            }
        }
    }

The Commands here exists to defer archetype moves – otherwise what should the query do if you added Burning to an entity while the query was running? And of course in a real game, you might want to use a spatial query so time complexity isn't m×n. You could then run this system every tick:

    app.add_systems(Update, fire_spread)

If you're familiar with C++, I can also recommend you check out https://www.flecs.dev/flecs/


Awesome, thx.


Unity is just the loudest, they certainly aren’t the first and they definitely didn’t invent the paradigm. ECS/ECBS is a pretty common construct.

This wiki is the Bible for ECS/ECBS systems: http://entity-systems.wikidot.com/


That page seems to be more about entity-component architecture (entities are composed of components, which define data and behavior for one aspect of the game) than about entity-component-system architecture (entities are composed of components, which are pure data; systems can query this database). Traditional Unity, as that site points out, is EC; Unity DOTS is ECS. Not sure why the site would be the bible on the topic either, as it only has some surface-level information and links (and hasn't been kept up to date much).


Much like other bibles ;)

Wikipedia has some links to some more recent approaches. The biggest problem (and the reason why you don’t see medium articles about it) is it’s extremely difficult to sell in the enterprise. Even if it is the right architecture. Most developers can’t grok it because of the MVC issue you already alluded to. My biggest advice - crack open VSCode and write one. You’ll learn more by doing. Forget games. Try writing a database. This is what it is really. How do you handle 100,000 objects that interdepend on a web of code? Start basic, arithmetic. Components can add, subtract, etc. end goal is query the scene for Fibonacci sequence entities.

Bonus points for updating (randomizing) values and picking different entities next tick.

Deductions for use of Dictionary<K,V> or std::map


Awesome, thx!


I think any object/type system is good, as long as it doesn't encourage atrocities. I personally am guilty of implementing a full-blown class-based inheritance system on top of Lua's metatables, and thank $DEITIES it didn't make it into any production project!

Nowadays if I write Lua, I mostly see metatables as a memory usage optimization. If you have thousands of small objects with many methods, and all of them share the same set of methods, it makes a lot of sense to have a single vtable, to save some memory. Otherwise I wouldn't bother, it's not like Lua lets you control struct member padding to ensure they fit in a single cache line.
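
For readers who haven't seen the pattern, a minimal sketch of that "single vtable" idea: one shared method table, while each instance carries only its own data fields:

    local Point = {}
    Point.__index = Point  -- failed lookups fall through to the shared table

    function Point.new(x, y)
      return setmetatable({x = x, y = y}, Point)  -- per-instance data only
    end

    function Point:length()
      return math.sqrt(self.x * self.x + self.y * self.y)
    end

    print(Point.new(3, 4):length())  -- 5.0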


This language is targeted toward embedded machines with very low resources; are you using python with multiple inheritance on embedded?


Sure, I use micropython on the rare occasion when I'm playing with microcontroller stuff. Mostly that's hobbyist stuff, but still it's a feature I'd miss.


I’ve been using C for work on embedded for so long that the idea of that is foreign to me. Do you have any code on GitHub that is a good example? I’d like to read it.


Not really; I was working on a little web interface for some hardware called "eduponics", but my day job has been keeping me pretty busy.

Obviously micropython has a lot of downsides as compared to writing stuff in C, much higher energy requirements for one thing, but there are some nice things as well. A REPL can be really nice when you're a hobbyist using hardware that you aren't familiar with, makes it really easy to try stuff out.

Performance-wise, you can do hard real-time stuff in interrupts, as long as you don't treat it like python at all: no creating objects, and if you need to do real work, try to pass it back to a soft real-time event loop (python asyncio does a reasonable approximation of a co-operative real-time multitasking OS). You can also inline assembly directly into your functions, or use some other python-like DSLs that have much better performance.
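
For the curious, those DSLs are the @micropython.native / @micropython.viper decorators (plus @micropython.asm_thumb for actual inline assembly). A trivial viper sketch:

    import micropython

    @micropython.viper
    def add16(x: int, y: int) -> int:
        # viper ints are raw machine words, so this compiles
        # down to a handful of instructions
        return (x + y) & 0xFFFF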

I don't think anyone is using it seriously in any commercial products yet honestly, but for quickly hacking together something with some SPI peripherals it's a pretty nice environment.


The company I work for uses MicroPython commercially, for medical devices. It's used in a growing number of commercial segments.


Any chance Berry has something like generators or function resumption of some kind? I usually use scripting in interactive stuff such as games.


This looks nice, especially the small runtime. One nitpick: Why not "if" expressions, or even "for" expressions? Ternary operators aren't something any new language should copy. It might even reduce parser code.


It looks neat. But the goals sound very similar to venerable old Lua. I wonder what Berry has to set it apart from Lua, besides a few of Lua's idiosyncratic language decisions that come from its age.


What are some reasons this is interesting when great embedded JavaScript/TypeScript runtimes (like Moddable's XS, which runs on microcontrollers with as little as 32 KB RAM) exist?


1) If you use Berry, you won't have to write JavaScript


Having only looked at both projects and not used either, to me there are 2 standout features berry has over xs.

1. It appears straightforward to add native functionality to berry.

2. Berry has the ability to compile code to ROM.

Maybe xs has that functionality too, but I don't see it anywhere in the top levels of its docs. One thing I see missing from the berry docs is the debugging process. That could be a rather significant drawback if it isn't easy.


Seems cool; it has a lot of the vibe of Lua, but with bit-twiddling types.


I wonder if anyone is forking love2D to use berry instead of/in addition to Lua?


Off topic: Any good resources on how to create your own language?

(eg syntax, compilation, etc)


There are more than a few books and tutorials on how to write parsers and compilers and PL theory, but I know of none on how to go about actually designing a language from scratch. Perhaps most new languages can be considered slight variations on earlier languages, so there's not much point when you're almost certainly using one of ~6 styles with slight modifications? I got started on my own little project by writing "ideal" programs and then working backwards and defining the syntax and vocabulary, then writing and reworking again ad nauseam.


Bob Nystrom's Crafting Interpreters is generally well regarded these days although I'm unaware of any of the classic "so you want to build a programming language" texts.


It's not a tutorial by any means, but the openwrt project has been working on ucode[0], a tiny JavaScript-like interpreter. It's pretty small; you might be able to glean insight from it.

Presumably the source for this project would be insightful as well. It looks like there is more code in berry compared to ucode, though.

[0]: https://github.com/jow-/ucode


Don't care for the way they do exception handling. Make errors part of the type system.


Can you ELI5?


Handle errors using exceptions (like most languages today, e.g. C++, Java, Python) vs errors as values (e.g. Rust, Go, Haskell).

The type system (especially in Rust and Haskell) forces you to always handle the error, or you get a compilation error. With some syntactic sugar (? in Rust, do notation in Haskell), you can be as concise as exception-based languages.
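
A tiny Rust sketch of that last point; the `?` after parse() either unwraps the Ok or returns the Err to the caller:

    use std::num::ParseIntError;

    fn double(input: &str) -> Result<i32, ParseIntError> {
        let n: i32 = input.parse()?; // early-returns the Err on bad input
        Ok(n * 2)
    }

    fn main() {
        println!("{:?}", double("21")); // Ok(42)
        println!("{:?}", double("x"));  // Err(ParseIntError { kind: InvalidDigit })
    }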


OK a lot of pieces just dropped in the place with your explanation. I’ve never heard it said so compactly before.


No coroutines :-(


Most MCU code I've seen targets single core, single thread, especially for hard real-time. Maybe the author can outline more of the thinking behind coroutines, fibres, green threads, or whatever they considered. In real-time code, callbacks triggered by physical events may offer more certainty than coroutines doing stuff that might finish, sometime.


> Most MCU code I've seen targets single core single thread

That's actually a good fit for coroutines! Cooperative multi-tasking can be very useful in single-threaded environments; it's basically syntactic sugar for a state machine. Lua has had coroutines since 2003, so I'd be curious why Berry chose to omit them.
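
A sketch of what that sugar buys you on an MCU: a blinker written as straight-line code instead of an explicit state machine (led_on/led_off are hypothetical hardware bindings):

    local blinker = coroutine.create(function()
      while true do
        led_on()
        coroutine.yield()  -- suspend until the next tick
        led_off()
        coroutine.yield()
      end
    end)

    -- called from the main loop or a timer tick:
    -- coroutine.resume(blinker)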


There's got to be an xkcd strip on the trajectory of computer languages:

1. Starts out fast, compact and lightweight.

2. Bugs and corner-cases get fixed.

3. Features demanded by users get added.

4. No longer fast, compact or lightweight.


I’d think this actually applies to nearly everything in software and is def a useful thing to keep an eye on.

Underemployment for the last 3 years has given me a great opportunity to dive deeper into a lot of CS topics for things we use daily and read (or at least try to read) specs and code bases etc.

I’ve come to firmly believe that languages or other open source projects that get popular usually do so because of the things that are in their early versions.

For example, the HTTP 0.9 spec is like 1 page. UDP is like 3 pages.

I’d seen an article on HN recently (can’t remember the name) that described this well. A language or whatnot starts simply and many jump on the bandwagon. Then it starts adding incremental features, none of which are difficult for existing users to learn incrementally, but the footprint of the product keeps expanding. C++, Javascript, Web APIs, and graphics languages like OpenGL are all examples in my eye.

The fact that our system (broadly speaking) incentivizes inventing something ‘new’ rather than fixing or simplifying an existing something certainly exacerbates this.


When I try to explain this to non-programmers I often use this metaphor: Imagine that you enjoy mountain climbing, but the act of climbing the mountain causes more mountain to rise above you.

That's what we see in software development: a runaway process of elaboration and ever-increasing complexity and levels of abstraction.


Lua has done pretty well to avoid this, IMO. Notably, they added coroutines with basically zero cost to the implementation (essentially just a pointer keeping track of function position, on top of the already-existing closures). The whole language is incredibly constrained yet powerful.


>I’d think this actually applies to nearly everything in software and is def a useful thing to keep an eye on.

Sort of like if the Peter Principle and Parkinson's Law had a child.

https://en.m.wikipedia.org/wiki/Peter_principle

https://en.m.wikipedia.org/wiki/Parkinson%27s_law

Or just Zawinski's Law:

https://en.m.wikipedia.org/wiki/Jamie_Zawinski#Zawinski's_La...


1. Dynamically typed!

2. Ok, that didn't work too well; we've added static type hints.

Does anyone know of any good embedded statically typed languages? It seems to be a very unexplored design space. The only real ones I know of are AngelScript (which catastrophically fails rule 0 of programming languages) and Gluon, which is just a bit too weird and aggressively functional for me.


Perhaps Roblox's Luau? It is Lua with gradual types and a sandbox-enabled VM [1]. "Luau aims to be backwards-compatible with Lua 5.1", so yep, still 1-based array and string indexing :-).

--

1: https://luau-lang.org/sandbox
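
The gradual part is just optional annotations; a Luau sketch (the --!strict directive and type syntax are real Luau):

    --!strict
    local function greet(name: string): string
        return "hello " .. name
    end

    print(greet("berry"))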


I made a prototype that works in 3000 lines of C

github.com/civboot/fngi

It was super fun. I'm going to be going in a different direction (Lua implemented in Lua, with Lua-library assemblers and assembly type system), but it's certainly very possible


Nice! Pretty interesting syntax. Not sure about \ for comments or the function argument syntax, but you have a solid point about not using " or ' for strings!


Thanks :)

FYI, the real power of \ for comments is \token to comment out one token. So useful to add a bit of doc sugar when calling something:

    call(x\sugar)

versus the C-style alternative:

    call(x/*sugar*/)

Also

    Thing \(inline comments) other


What's rule 0?


Show examples on the main web page.

Try and find an AngelScript example. It's stupidly hard. Compare it to these web sites:

https://dlang.org/

https://koka-lang.github.io/koka/doc/index.html

https://vale.dev/

http://mu-script.org/

https://go.dev/

https://www.hylo-lang.org/

Sadly Rust fails this too but at least the Playground is only one click away. And Rust is mainstream anyway so it doesn't matter as much. I completely failed to find a single AngelScript example accessible from my phone's browser. Even its Wikipedia page doesn't have any.


True. AWK is probably the only exception.


And Lua


Their marketing page loaded absurdly fast


Hmm, written in C99 :) That's super cool, considering it's a very new language. It can even run on my ancient toaster ;) I will keep an eye on this project. For now I'm happy with old Ruby.



Very cool language, but it doesn’t seem to have anything to do with the purpose of the language in the original post. Can this be used in embedded systems too?



