Lua: Good, Bad, and Ugly Parts (2012) (kulchenko.com)
202 points by Svetlitski on Dec 27, 2021 | 94 comments



Lua is a very nice piece of technology. Its source code is pretty easy to get into, and the documentation is complete.

It has its quirks, yes, but if I need to add scripting to a piece of software, I'd consider Lua before considering writing a DSL, simply because you can pretty much embed Lua's source in your C/C++ software as a static library[0].

The stack-based approach makes it so easy to interact with C/C++, and I've been looking at the Rust bindings[1] recently out of curiosity, looks promising.

  [0] - https://github.com/lubgr/lua-cmake
  [1] - https://github.com/amethyst/rlua/blob/master/examples/guided_tour.rs


Since this is a Lua thread, your reference list should probably start at 1 rather than 0


It is about how easy it is to embed Lua, so you should be able to use the host's conventions


Oh wow, that does make the non-standard array references rather catastrophic.


you wrap the host across the embedding boundary so that the host and Lua can either use their own convention or have access to a special object that explicitly and clearly exposes the other convention.

In general using array indexes as foreign keys should always be considered carefully.

And if all you will be doing is iterating then the starting index doesn't really matter (unless you copy paste loop logic, which is a problem in itself)
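To make the wrapping idea concrete, here is a minimal sketch in plain Lua. The helper name `zero_based` is made up for illustration, and a real embedding would more likely do this on the C side of the boundary; the point is only that the other convention lives in one clearly named object.

```lua
-- Hypothetical helper: exposes a 1-based Lua array through a 0-based view.
-- The proxy table stays empty, so every read and write goes through
-- the metamethods, which shift the index by one.
local function zero_based(t)
  return setmetatable({}, {
    __index = function(_, i) return t[i + 1] end,
    __newindex = function(_, i, v) t[i + 1] = v end,
  })
end

local lua_side = { "a", "b", "c" }      -- ordinary 1-based Lua array
local host_side = zero_based(lua_side)  -- same data, host convention

print(host_side[0])  -- a
host_side[2] = "z"   -- writes through to lua_side[3]
print(lua_side[3])   -- z
```

Both sides keep their own convention, and nothing else in the code base needs to know which one the other side uses.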


Errata: you wrap the host across -> you wrap the array across


Because of this I always thought lists should be iterable/traversable, but have no direct (numeric) index to their items.


Like a stream?


Yes, like a stream, or an iterator. Where you fetch also a reverse iterator if you wish, filter by some predicate etc. A lot of overlap with both relational algebra/sql and with functional programming filter/map/reduce, I guess. Indexes seem convenient, but they're a hack in most cases.

Consider what happens when you sort your array, or delete or insert an item in the middle, or start. Now all your indexes have changed.

React had this issue, and this is why they had to add stable "key" to collections so that when you mutate a sequence of <li/> items for example, it doesn't get confused which is which.
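A small Lua sketch of the same problem, with made-up data and a hypothetical `find_by_id` helper: a remembered index silently points at a different item after a removal, while a lookup by stable key still finds the right one.

```lua
local items = {
  { id = "a1", label = "first" },
  { id = "b2", label = "second" },
  { id = "c3", label = "third" },
}

-- Remember the "second" item two ways: by position and by stable key.
local remembered_index = 2
local remembered_id = items[2].id

-- Hypothetical helper: linear search by stable key.
local function find_by_id(list, id)
  for _, item in ipairs(list) do
    if item.id == id then return item end
  end
end

table.remove(items, 1)  -- every later item shifts down by one

print(items[remembered_index].label)              -- third (wrong item)
print(find_by_id(items, remembered_id).label)     -- second (still correct)
```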


think Set in Java does this

EDIT: Other than the obvious


If you need a few lines of scripting here and there, sure, Lua, JS, PHP... whatever works.

Using these languages for large projects is where the trouble starts, as they push you towards bad solutions that we have known to be bad for decades. You need a clear description of how to avoid these pitfalls, or your growing codebase slowly becomes unmanageable.

I ended up ditching a growing Lua codebase for this reason. Get out before it gets too big to get out.


This opinion gets tossed around a lot, but I've found the opposite to be true, in part because all projects trend towards becoming an unholy mess as they grow large, so being able to accomplish a goal in significantly fewer lines of code has tangible benefits.

If there actually is a correlation between high level languages and "trouble", it may be just because a high level language lets a less experienced developer (or team of developers) get farther than they otherwise would in some lower level language, i.e. the more cumbersome nature of a lower level language forces you to follow best practices earlier or the whole thing never gets off the ground, while the same sloppy devs + a higher level language might actually get as far as shipping something.

Whether that is ultimately better is debatable, but the problem is more how the tool is being used than something inherent in the tool itself.


> all projects trend towards becoming an unholy mess as they grow large, so being able to accomplish a goal in significantly fewer lines of code has tangible benefits

This is a really good point that I didn't realize before - sure, languages like bash and Lua and Tcl might scale very poorly to large codebases, but because they're more expressive than some other languages (coughjavacough), you might be able to implement the functionality you need without needing a large codebase in the first place.


I hear this opinion a lot. What exactly are the characteristics that are missing from a scripting language? IDE integration? Compile-time checking before running? Seems like these problems largely have solutions these days, with stuff like JSDoc annotations or TypeScript, or Teal in Lua land, language servers for most languages, unit testing, etc. They might be solved _better_ in a lot of compiled languages, but it's not like it's the Wild West if you're using a scripting language, and you could argue the more modern design of some of them is a fair trade-off for the native compile checking.


With TypeScript you need to add an extra compilation step. At that point one might just as well not embed JavaScript at all and write in the host language instead.

Teal has an advantage that the typed language can be translated into Lua at runtime so during development it allows for quick prototyping and adding types when the amount of types becomes substantial. But its type system is less powerful than that of TypeScript.


You can use ts-node to avoid the compilation step with typescript.


Or Deno. But only on the back end.


My problem is not with "scripting" as such, but with dynamic typing + poor language design.

There are dyn typed langs with nice designs. I like Ruby and IOScript. All lisp-like languages. And there are several dyn typed fit-for-embedded langs implemented in Rust that are very promising. This is very much my own opinion, I know. But reading the article's "cons of Lua" I immediately remembered my fight with that language.

Personally I prefer typed languages nowadays. The stronger the better, as long as there is good IDE support.


I'm not convinced scripting languages are not suitable for large projects. It's said a lot, but I've never seen proof one way or the other. I suspect unmanageable code-bases derive from uncontrollable project forces, ignorance, or not caring, but not because of the perceived shortcomings of a scripting language.


Indeed. Unmanageable code bases come from conversations like this between management and developers:

“Look, we need you to implement this feature within the next 24 hours.”

“Based on my scrum analysis, I need about a week to implement it.”

“Sorry, if it’s not done within the next 24 hours, you will be out of a job.”

“OK....”

At which point, we get a hideous rush job. I have seen this happen time and time again. There is no language in the world which is going to be able to force clean, manageable code under these kinds of circumstances, which alas can and do happen too often in the caffeine and work obsessed tech culture.


I recently started working at a place which is like this, and the code really shows it. It really does not matter what language is used, it's not going to be very maintainable if written so quickly.


Have you coded with Haskell or Elm? I'm confident that the mess that surely comes out of the 24h sprint will be refactorable without too much pain in these languages.

Just a feeling, but coding with them feels like they are refactor-optimized langs.


If you keep having to do 24 hour sprints without refactoring it doesn’t matter.


It's a trade off, I think. A compiled static language forces you to behave a certain way at the expense of flexibility/expressiveness; script languages are more flexible, but in a way that makes it easier to acquire technical debt that might be harder to move away from (IMO).


People do not seem to have any trouble producing unreadable and unmaintainable code in statically-typed compiled languages.


Not at all, but having been on the "dynamically typed spaghetti code with very obscure paths/interactions and no test coverage" refactor wagon, I would take a statically typed option any day.


Indeed the quality to look for is strongly typed, not statically typed.

Strongly typed code is much easier to refactor.


Unfortunately this requires some 'buy in' from the managers which means that many programs are integer/string typed instead of being strongly typed..


I'm sure with time, the dynlang prototype phase -> static in production will be the norm.

Gradual typing has been improving (might not be the only way but still)


Yes, the problem with strong static typing in many languages is that you have to make decisions about typing at exactly the worst time (at the beginning), while you almost never know up front exactly what you need.

For me the ideal would be a language with the expressiveness and fast dev times of Python, and then type information would be collected from actually running the program during development and would gradually shift from auto-generated type hints to more concrete type declarations that could be enforced but that would also assist in compilation & optimization.


I've read people saying they did that in the 70s, 80s, 90s... it's probably an unavoidable phenomenon of solution space exploration. Even car makers use gradually more precise models before selling something.

The type gathering from use would be nice.


I find that, with a good IDE, the expressiveness of some typed langs matches that of dyn typed langs. You write a bit more code (the types) when using a typed lang, but you write somewhat fewer tests (the stand-in for types that dyn langs offer) and your IDE takes care of many of your type signatures.

The fast dev times, as in quick "edit -> run -> try" loops, are what I find most attractive about the dyn langs nowadays. And gradual typing might be a way to get there, although I think it is not needed (some strongly typed langs have decent hot code reload).


I believe mlua [0] is the recommended Lua Rust binding now.

[0] https://github.com/khvzak/mlua


Thanks, I'll take a look


There's also an (incomplete) Lua implementation in Rust - Luster[1].

[1] https://github.com/kyren/luster



Why Lua rather than e.g. TCL or Python (both similarly easy to embed, but better-known as full-fledged languages)?


TCL's good for string stuff, but gets very messy if you want to do stuff outside of that. It wasn't really designed originally as a general language: it works and does have some (IMO 'too') clever features, but it has a lot of foot guns as well: comments are actually (almost) ignored procedures, which causes issues, you sometimes have to escape comments or they'll change logic or cause syntax issues (i.e. trying to comment out statements sometimes still triggers syntax errors within the comment!), everything's a string which is great for strings, but not when you need to start validating numbers or similar... TCL makes a lot more sense as a command language (what it was originally designed for - string commands) or a REPL...

Back in the late 80s, before Python and Lua were released in the following decade, TCL made sense, as it was the only freely available embeddable language.

Python's larger and more complete (and I'd argue a better language than Lua), but Lua's compact and very fast as it's a register-based bytecode VM (and LuaJIT exists, which is even faster), so games commonly used Lua for scripting/gameplay as it was easy to integrate. One caveat: if you don't use the 'local' keyword on variables, it's quite a bit slower, because those variables become global table lookups instead of stack-based locals, so the code can end up being more verbose to make it fast.
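A rough illustration of the 'local' point, as a sketch rather than a benchmark (the variable names are made up, and the actual speed difference depends on the Lua version and on whether a JIT is involved):

```lua
-- Without 'local', a name is a global: each read and write is a table
-- lookup in the environment table.
counter = 0
for i = 1, 1000 do
  counter = counter + i
end

-- With 'local', the variable lives in a VM register.
local total = 0
for i = 1, 1000 do
  total = total + i
end

-- Common idiom: cache a global function in a local before a hot loop.
-- This is the kind of extra verbosity mentioned above.
local floor = math.floor

print(counter, total, floor(2.7))  -- 500500, 500500 and 2
```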


Have you seen TCL quadcode? It infers types and uses llvm - Check out the typing diagram; it is kind of crazy. Maybe it is to avoid shimmering more than anything, but I think it is impressive. Thought you might find it interesting even if you don't use TCL.

https://wiki.tcl-lang.org/page/tclquadcode


Let me add my voice to the chorus of posters pointing out that, no, Python is not easy to embed.

Back in 2004 or 2005, Firaxis decided to use Python as their embedded scripting language. They used something called “Boost Python”, a then reasonably easy to embed fork of Python, to embed Python2 in their Civilization 4 gaming engine.

Soon after this, Boost Python got abandoned and Firaxis ended up having to use an outdated version of Python by the time they released their final Civilization 4 expansion.

For Civilization 5, Firaxis instead used Lua, since they wanted an actively maintained code base.

For my own “embed a scripting language in a DNS server” project, I went with a slightly modified Lua 5.1. The entire DNS server, including the Lua scripting engine, is a 103,936 byte sized Windows service. The stack was a little hard to grok at first, but I was able to fairly quickly get used to it and have a Lua script set up configuration for the server, as well as parse DNS queries. [1]

To Python’s credit, the Python2 code used in Civilization 4 is 100% compatible with the final 2019 release of a Python2 interpreter, to the point that I can run map scripts for Civilization 4 -- compiled for x86/32-bit -- on a 64-bit ARM Raspberry Pi and have them generate the exact same maps. Useful when I wanted a particular kind of map for a Civ4 mod, and had to iterate through 300 different random seeds on my Raspberry Pi to find the desired kind of map. After about a month, I had over 180 map seeds meeting my criteria.

[1] https://github.com/samboy/MaraDNS/tree/master/deadwood-githu... for the record


Some corrections: Boost.Python (note the period) was a C++ library for automating the generation of cross-language bindings between Python and C++. It wasn't itself a distribution of Python. Boost.Python still exists. However, its successor (pybind11) used features shipped in C++11 to simplify the implementation (and compiles far faster) and is the leading Python-to-C++ binding to use today. I'm using pybind11 in a project today with bog standard Python3 and it works great.

Lua made inroads into the game dev community in particular thanks to LuaJIT. Despite it actually being a fork of Lua that hasn't kept up with changes in the base language, LuaJIT remains popular for its speed.


Python is anything but "easy to embed" - for one thing it's huge, and it requires linking to native libraries which may conflict with your own (openssl comes to mind). It also has an unstable ABI (and even API in some cases). Lua, in contrast, is just a single, very lightweight DLL, with no external dependencies and a stable ABI.


Yeah, Python is "theoretically" easy to embed, that's why it uses the GIL

In practice, it is hard to embed and we get the GIL disadvantages, so that is not working out so fine.


The GIL is less about python being easy to embed, than it is about embedding other libraries into python. The GIL was set up to make it easy to wrap C code and expose it to python code. A task for which it served quite well at the time and to be honest lasted longer than I would have expected.


My impression is that Python is not actually easy to embed (and I say this as a fan and frequent user of Python).


Modern tcl is big. Embedding something like Jim is still possible, but I think Lua is more popular than tcl and has a more approachable syntax if you want laymen to use your DSL.


Modern Tcl may be big, but it is not too big to embed. I know, because I embedded it into a Go system at a previous employer. Unfortunately, it is proprietary, so I may not point you to the repo, but it required remarkably little glue code.

I also evaluated Python and Lua. As others have noted, Python appears to be a right royal pain to embed. I actually had more experience with Lua prior to that project, but that experience leads me to believe that Lua is, generally, the wrong language.

Looking back on the project, I would not hesitate to do the same thing again, ideally as open source so that other Go projects can embed a scripting language. It got the job done, and did it well. Non-programmers on our team were able to write both configuration and logic successfully & productively, and apparently enjoyed the experience. Tcl itself ended up being a pleasure to use. While I personally would have enjoyed something like Embeddable Common Lisp, I think that would have been too much to ask of the rest of the team.

Interestingly, the four languages Lua, Python, Tcl and Lisp each can be said to take one idea and run with it. Lua is everything-is-a-hash-table (well, almost everything); Python is everything-is-an-object; Tcl is everything-is-a-string; and Lisp is everything-is-a-list (well, in theory: in practice it is really everything-is-an-object). I don’t know if this says anything deep about scripting languages, but it is at least interesting — right?


Python is not easy to embed. It's very insecure and is difficult, if not nigh-on impossible, to effectively sandbox. (see RestrictedPython)


It's my understanding that Lua can be more easily sandboxed, at least compared to Python.


Lua is smaller, and one of the fastest in the bytecode-land. It is also used in a large number of diverse applications [1], so in terms of popularity, I think it has nothing to be ashamed of compared to others.

[1] https://en.wikipedia.org/wiki/List_of_applications_using_Lua


When I say "embed" I mean: add the source tree to your project and compile statically (see the CMake project for Lua 5.4).

To my knowledge, you can't do that with Python. I don't know about TCL though.


Not sure about TCL, but Python is much bigger, has a bigger memory footprint, and is a bit slower than Lua. Also, check how easy it is to integrate Lua into your codebase


I embedded Python and it was not easy to embed at all.


At first Lua seems strange, but after a while you start to appreciate it. It’s designed with a small number of concepts that manage to lead to expressive code with excellent performance.

I use it for two things: scripting TeX¹, where it allows you to do amazing things², even complex numerical calculations³ within the document; and writing Pandoc filters, where it is now the standard method.

1 ‘LuaTeX comes of age’. LWN. Available from: https://lwn.net/Articles/731581/

2 http://wiki.luatex.org/index.php/TeX_without_TeX

3 http://www.unirioja.es/cu/jvarona/downloads/numerical-method...


I was a huge Lua fan, but once I jumped on the typescript bandwagon I found the lack of (production ready) desugaring and static type compilers for Lua to be a negative. The language is fantastic, but what I expect from a scripting language is always expanding, I guess


You might enjoy https://typescripttolua.github.io/

I tried a bunch of different typed luas before landing on this project and having a really wonderful experience incrementally porting my most recent love2d thing over and then being able to make big architectural changes with confidence that types gave me.

I was able to do this port without modifying the original lua files, just replacing them with typescript one by one. Any lua libraries I was using I kept using by just writing some type declarations for them and throwing them in with the compiled output.


Today I would recommend checking out Teal: https://github.com/teal-language/tl/ I'd consider it the spiritual successor to Typed Lua.


I looked at many options; what I lacked when I looked in this space was a big project comparable to, say, my game engine https://github.com/lanarts/lanarts that uses one of these typed Luas. I'm not in a position where I can spend the time being the first - typescript has very mature idioms for large code bases


What about https://github.com/Roblox/luau? It gives Typescript-like annotations.


With roblox behind it, I'd indeed consider it for my game. I'm guessing no support for LuaJIT, though?


It's a custom fork of plain Lua, not LuaJIT, so indeed there's no LuaJIT support. They first open sourced it this November and supposedly they're planning to implement their own JIT. There's no timeframe or estimate for that though, just mentions of plans here and there.


Sounds good. I won't cycle back to finish Lanarts for a while so I look forward to a JIT in this space. LuaJIT was a marvel but it probably needed a few more years of Mike Pall's time to smooth out difficult to anticipate optimization


I really appreciate that the article has a "Different" section for things that are different from, but not necessarily better or worse than, other programming languages.

This is also a very good summary, and tracks with my own experience getting into Lua for Neovim scripting.


This was written just after Lua 5.2 was released. There have been improvements since then. Lua 5.3 introduced an integer type (64-bit), actual bitwise operators, a UTF-8 library, a way to yield across a C-boundary and a way to pack and unpack binary data. Lua 5.4 introduced deterministic finalizers (and a fall out from that is a limited form of constant local variables).
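For anyone who has only used 5.1 or LuaJIT, a quick sketch of a few of the 5.3 additions. This assumes Lua 5.3 or later and will not even parse on Lua 5.1 or LuaJIT, because of the new operators.

```lua
-- Assumes Lua 5.3 or later.

-- Integer and float are now distinct subtypes of "number".
print(math.type(3), math.type(3.0))  -- integer and float

-- Floor division alongside the usual float division.
print(7 // 2, 7 / 2)                 -- 3 and 3.5

-- Bitwise operators in the language itself: and, or, xor, shift left.
print(6 & 3, 6 | 3, 6 ~ 3, 1 << 4)   -- 2, 7, 5 and 16

-- Packing and unpacking binary data: a little-endian 32-bit integer.
local packed = string.pack("<i4", 258)
local value, next_pos = string.unpack("<i4", packed)
print(#packed, value, next_pos)      -- 4, 258 and 5
```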


But those improvements backfired, as Mike Pall refused to update luajit with them. They have now a 10x slower language implementation.


If you want Lua which is luajit compatible, do what I did and use Lua 5.1 for the scripting engine. This way, if luajit performance is ever needed, scripts written with my engine will not break.

Lua 5.1 + bit32 (because, yes, I like to be able to do bitwise and/or/xor stuff) is my current favorite scripting language.


I wonder if maybe it was a good thing in the end. Lua51/LuaJIT are now dead languages - no new design changes are going to happen. It is now a stable development target for which libraries can only accumulate.


Lua appeals to my sensibilities and I want to make use of it. It seems to be a well-designed language and it has a very performant JIT. But a big weakness for non-embedded use cases seems to be its ecosystem. A recent little project I wanted to use Lua on included a sqlite database, oldschool 3DES encryption, and an SMTP client. For many languages, there would be a clearly mob-approved library for each of these. But I found 3 or 4 possibilities to use sqlite in Lua, with no clear winner. I ended up using Ruby instead.


While what you went through and felt is valid, I was able to find, in LuaRocks, libraries or Lua interfaces to libraries implementing everything on that wish list. In more detail:

a sqlite database

https://luarocks.org/modules/tami5/sqlite

https://luarocks.org/modules/dougcurrie/lsqlite3

3DES encryption

https://luarocks.org/modules/starius/luacrypto

https://github.com/somesocks/lua-lockbox/blob/master/lockbox...

an SMTP client

https://luarocks.org/modules/luarocks/lua-smtps


In my limited experience, when you want to interact with C libraries, you often end up reading the docs of the original C library anyway because often the bindings are nearly 1:1 - and often the trouble lies in the word "nearly".

Just writing yourself the bindings you actually need is most likely the best approach.

It should be relatively easy to do, because by using something like Lua your goal is precisely to expose functionalities implemented by your language as Lua functions (otherwise you would just convert them into dynamic library functions and use a classic glue language with a decent FFI to connect them).

This is the value proposition of an embeddable scripting language at heart.


I work on a project that leverages Kong's API Gateway, which is essentially Nginx + Openresty (Lua) + Kong (More Lua). The killer feature wrt Kong is the plugin ecosystem, which (among other things) allows you to act on the request/response lifecycle at various stages. Developers coming onboard to the project usually have little to no experience writing Lua, but we've found coming up to speed on the language and its runtime to be fairly painless. These days Kong has shims to write plug-in code in a few different languages (javascript, python, go, and more recently a wasm runtime) but despite our team's unfamiliarity with the language we still go back to Lua because performance can't be beat.


I’ve had a very positive experience using Lua as an extension language.

I was writing a text editor at the time. I wanted as much of the core code/actions to be written in Lua as possible, as I’ve always disliked very thin scripting APIs that sit on top of opaque native procedures.

I was able to wrap a handful of native functions in Lua code, then write the remaining 85% of the editor core in Lua. Everything was very fast and the process was straightforward. I’d definitely choose it again.


Lua 5.4 now includes a short utf8 library in the standard library. It has ways to get the utf8 string length and a Lua pattern for matching utf8 codepoints. However, it doesn't include any functionality that would require large tables of characters (for example, knowing which unicode characters are alphabetical).
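A short sketch of what that looks like in practice, assuming Lua 5.3 or later (the `utf8` library and the `\u{...}` escape are not available in 5.1 or stock LuaJIT):

```lua
-- Assumes Lua 5.3 or later.
local s = "h\u{E9}llo"  -- "héllo"; the é is two bytes in UTF-8

print(#s)           -- 6: the length operator counts bytes
print(utf8.len(s))  -- 5: utf8.len counts code points

-- utf8.charpattern matches exactly one UTF-8 encoded character,
-- so it can be used with gmatch to walk the string.
local count = 0
for ch in s:gmatch(utf8.charpattern) do
  count = count + 1
end
print(count)        -- 5
```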

-----

The requirement that return be the last statement is to avoid a syntactic ambiguity. Since Lua has no semicolons, there would be no way to know whether a line following the return statement is part of the return statement.

    return
    f()


Would you not just terminate the return when you see a newline? like the language already does for expressions.


Newlines are not significant in Lua. From Lua's point of view, there is no difference between

    return 
    f()

and

    return f()

You can also put multiple statements on the same line with no semicolons:

    x=x+1 y=y+2
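This is easy to check by compiling a few strings. A small sketch follows; the `compile` alias is only there so the same code runs on 5.1 (`loadstring`) and on later versions (`load`).

```lua
-- 'loadstring' exists in Lua 5.1 and LuaJIT; later versions use 'load'.
local compile = loadstring or load

-- The newline after 'return' changes nothing: this returns f().
local ok_chunk = compile("local function f() return 42 end\nreturn\nf()")
print(ok_chunk())  -- 42

-- A statement after 'return' in the same block is a syntax error,
-- so compile returns nil plus an error message.
local bad, err = compile("return 1\nprint('unreachable')")
print(bad, err)

-- The usual workaround is to wrap the return in its own block.
local wrapped = compile("do return 1 end\nprint('unreachable')")
print(wrapped ~= nil)  -- true
```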


From a language design perspective, is it easier to build a compiler that considers `;` or newline as the end-of-statement operator, or one that works like Lua, where statements can appear anywhere?


From a language design point of view, the hard thing about newline as a statement terminator is how to deal with expressions that span multiple lines. In shell you have to use explicit line continuations. In Python, the line breaks are ignored if they are inside parentheses (which is one of the reasons why it doesn't allow multi-line lambdas). Javascript deals with it by having an ad-hoc semicolon insertion algorithm.

The way Lua does it is closer to what you might be taught if you take a compilers class. It's just a plain old context free grammar. The tricky bit was designing the grammar so that semicolons were not needed. You have to be careful when there is something that could either be parsed as a continuation of the previous statement or as the start of the next statement. One place where Lua does this is in the rule that the return statement must be the last in the block. Another is that not every expression can appear as the function part of a function call. For example, something like 1() is a syntax error because the grammar doesn't allow function calls where the function is a numeric literal.


I didn't know statements didn't need to be on separate lines. That's really cool. Thanks for the correction!


You can also write rust ( or go, or write in any other language that allows you to expose a C ABI ) and make a Lua module. This solves a number of the different, bad and ugly issues in my opinion. For example, I think using rust’s chrono or unicode_segmentation library makes life so much easier than having to deal with that in Lua. Neovim embeds lua 5.1 jit, and it’s possible to write plugins in rust for neovim using this mechanism.

For anyone that knows what I’m talking about it should be obvious how to do this. If not, I wrote about this more here, in case you are interested:

https://blog.kdheepak.com/loading-a-rust-library-as-a-lua-mo...


Could have sworn I saw this submission 4 days ago.

https://news.ycombinator.com/user?id=Svetlitski

The timestamps next to the two oldest replies make it seem like they were more recently submitted, too. But I think they are 4 days old.

Seems like the title should be "Lua: Good, Different, Bad and Ugly Parts" as there is a fourth section, preceding "Bad", in the blog post called "Different".

Language wonks can endlessly dismiss Lua as a programming language, but it continues to be "embedded in", i.e., used to extend, useful network applications, e.g., dnsdist, haproxy, nmap. To learn to use the application to its fullest, one has to learn some Lua.


That's the second chance pool: https://news.ycombinator.com/item?id=11657576


A missing 'good' - the reference implementation is extremely easy to extend with existing C or lua libraries. E.g. embed sqlite.c and bindings to it and all of penlight (a bunch of standard library like utilities written in lua).

There are some package managers. However, concatenating all the source plus a bunch of libraries into one .c file also works great. My bootstrap/install is to clone that one file via git and feed it to clang.


A couple past threads:

Lua: Good, bad, and ugly parts - https://news.ycombinator.com/item?id=6616817 - Oct 2013 (19 comments)

Lua: Good, bad, and ugly parts - https://news.ycombinator.com/item?id=5348513 - March 2013 (110 comments)


It is nice and small, so Lua even ran on Playstation 3 SPU!

We used to ship lua binaries for the premake build system.

For me, the indexed from 1 rather than 0 is a big turn off.

Roblox has the Luau fork, with increased performance and added type annotations: https://github.com/Roblox/luau


> For me, the indexed from 1 rather than 0 is a big turn off.

I can't understand why people complain about that.

If you're using 'pairs(table)' iteration then the index doesn't matter.

If you're accessing indexes directly you can have the index '0' in a table.


> If you're accessing indexes directly you can have the index '0' in a table.

Sure, you can, but in practice you won't. For example, something like `local pos = {x, y}`. You now have to use 1/2 to access the x and y coords. Making the same table with 0 indexing is awkward and not idiomatic.
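A quick sketch of why the 0-indexed version is awkward (the values are made up): the constructor needs an explicit `[0]` key, and the length operator and `ipairs` still only see the part starting at 1.

```lua
local x, y = 10, 20

-- Idiomatic: positional fields start at index 1.
local pos = { x, y }
print(pos[1], pos[2])    -- 10 and 20

-- Forcing 0-based indexing: index 0 must be spelled out, and the
-- remaining positional field still lands at index 1.
local zero = { [0] = x, y }
print(zero[0], zero[1])  -- 10 and 20

-- The length operator and ipairs ignore index 0.
print(#zero)             -- 1
local visited = 0
for _ in ipairs(zero) do
  visited = visited + 1
end
print(visited)           -- 1
```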


I hope something like `ArrayBuffer` and `TypedArray` would be added to Lua, like to JavaScript.


You can use LuaJIT's FFI.

  local ffi = require 'ffi'
  local arr = ffi.new('float[?]', 100)

LuaJIT's FFI is pretty great. It puts even a lot of compiled languages to shame.


For better or worse, the language devs have a goal of keeping things simple, so this would have to be provided by a library. It's pretty easy to implement this sort of thing in luajit or provide that library to other luas though.


(2012)

"lua 5.2 has no globals" is a bit of a misleading takeaway from the getfenv/setfenv changes in lua 5.2.


We've added the year above now. Thanks all!


This is from 2012 (should say so in title).


Plenty of good bullet points, but it's now a quite outdated article. A lot has happened in Lua since 2012.


[flagged]


"Please don't complain about tangential annoyances—things like article or website formats, name collisions, or back-button breakage. They're too common to be interesting."

https://news.ycombinator.com/newsguidelines.html



