Pkl, a Programming Language for Configuration (pkl-lang.org)
930 points by bioballer 7 months ago | 564 comments




Did all the timestamps get reset? Seeing most (all?) comments at about 30min ago


Ah sorry - that was an unintended side-effect of re-upping the original submission. I must have done some of the steps in the wrong order.


You mean you don’t have a distributed merge post microservice that emits post migration events, which are then consumed by a post owner conversion service using your existing event-driven architecture to facilitate seamless data synchronization and user notification processes??? That is not very hacker news of you from the guy who owns hacker news.


I know you're joking, but I actually think complexity like that (when mostly unnecessary) is the least hackery thing in the world. Simple and effective gives me a hacker buzz


Pkl was built using the GraalVM Truffle framework. So it supports runtime compilation using Futamura Projections. We have been working with Apple on this for a while, and I am quite happy that we can finally read the sources!

https://github.com/oracle/graal/tree/master/truffle

Disclaimer: graalvm dev here.

Edit: typo


> ...GraalVM Truffle framework... Futurama Projections...

I know it's partly on me for not knowing the domain, but I honestly suspected somebody is trying to make fun of me with some concentrated technobabble.

Especially since I wouldn't expect the topic (configuration languages) to require complex mathematical machinery to work with. Now I have something interesting to dig into.


What has most impressed me about GraalVM and Truffle is their capability of deep-optimizing high-level code like Python and Ruby.

I once saw a demo where someone took a simple operation in Ruby using inefficient-but-elegant syntax (including creating and sorting an array, where a simple loop would have been the appropriate approach in C). He compiled that using TruffleRuby and the entire array construction and sorting was completely optimized out of the generated bytecode.


Really? Link?


He probably means one of the wonderfully crafted talks by Chris Seaton.

Here is one of the many: https://youtu.be/bf5pQVgux3c?si=S8Dm5d_GXYXgJtnY

If you go looking for more you will find many more marbles.


Shameless self plug: Giving an introduction in this video: https://youtu.be/pksRrON5XfU?si=CmutoA5Fcwa287Yl


Gently teasing: linking a 2 hour video with "shameless self plug" definitely did _not_ help obviate the surreality.


I'm not sure if it was part of the humor, so pardon me if it was, but it's actually "Futamura" as in Yoshihiko Futamura, not "Futurama".

https://en.wikipedia.org/wiki/Partial_evaluation#Futamura_pr...


Really glad it wasn't just me. Genuinely thought someone was trying to make a joke.


Same - it doesn't help that I read Futamura as Futurama the first 3 times.


Probably because the original comment said “Futurama” not “Futamura” due to autocorrect [0], and was later edited to correct the misspelling.

Even now the OG comment says “Fuamura” but the quote in the GP comment has the original “Futurama” written in it.

[0] https://news.ycombinator.com/item?id=39239965


For me it was about 5, until I read your comment. :/


Same. There was a mini subthread years ago that applies.

https://news.ycombinator.com/item?id=13752964


Glad I'm not the only one who had this reaction. I just can't bring myself to accept that a problem that could be solved with a slightly better version of JSON or property lists requires this many buzzwords.


Those aren't "buzzwords" though, it's a very specific way to implement programming languages. It's not really meaningful except for the PL implementation nerds.

Especially the Futamura projections. It's almost magic and very few people have even heard of them.


Very few people have heard of them. That is exactly the reason why I mention them as often as I can. They are a great entry into the world of meta compilers.


If Futamura means what I think it means from skimming the Wikipedia entry, it would mean that simple value-holder-file configurations would be parsed and checked at the speed of a general purpose tokenizer. But without closing the door to what the language can express in more elaborate configuration file "landscapes". Best of both worlds and presumably all without requiring anybody but the toolmakers to understand what the buzzwords really mean.


The best video I know about this stuff is "Compilers for free" by Tom Stuart (https://youtu.be/n_k6O50Nd-4). It is hilarious at one point. Brilliant.


Fantastic talk! Thanks for sharing.


Genuinely read "Futurama Projections" and figured the same. This doesn't sound real (though I fully trust it is, just sounds funny).


>...suspected somebody is trying to make fun of me...

I think that too, "Futamura projections" are important but they are very very far from "complex mathematical machinery" as you may hear it. They are indeed very simple (even mathematically trivial) and require no special background to understand.


> but I honestly suspected somebody is trying to make fun of me with some concentrated technobabble

Let me tell you about a revolutionary device called a Turbo encabulator.


sounds like a perfectly cromulent topic to embiggen our knowledge.


Perfectumentous!


An author named David Duncan wrote a series of books, called A Man of His Word (and A Handful of Men)[0]. Great books.

One of the races in the books was the Anthropophagi (basically modeled on New Guinea headhunters). They talked like that.

[0] https://en.wikipedia.org/wiki/Dave_Duncan_(writer)


You joke, but this is surprisingly close to the name given to Dumbledore in the Dutch translation of Harry Potter.


supercali ...


[flagged]


Are you really this upset because people don't know a 60 year old movie reference, and downvoted a comment that didn't add to discussion? And you need to flex your age because of it?

If you get this upset you don't have to post on this site. Or you can learn to be not as reactive to social media.


>Are you really this upset because people don't know a 60 year old movie reference, and downvoted a comment that didn't add to discussion?

Maybe you should read more carefully before replying.

I already said above that I was not complaining.

As for my comment (the supercali... word) not adding to the discussion, you are wrong again. The comment was in the same spirit as my parent and grandparent comments, who used words like cromulent, embiggen and perfectumentous.

>And you need to flex your age because of it?

Wrong again. Nothing in my comment shows that I was "flexing my age", as you call it.

>If you get this upset you don't have to post on this site.

Oh, I don't mind posting. I am having fun. I don't let comments like the one that I replied to, spoil my fun.

>Or you can learn to be not as reactive to social media.

Er, the term is "social" media, not "lone wolf baying at the moon" media.

It indicates people reacting (by replying) to other people, which could include approvingly, neutrally or critically, just like in real life, you know.

But there is something in what you say. This "comments about comments about comments about ..." scenario is getting boring and tedious.

From now on, I'll let the blind downvoters be blind downvoters and keep doing their thing. As I said earlier, HN points are not at all important, to me, at least.


Are you okay?


no, I'm fuzztester :)


You joke but newer rails versions come with a front end framework named Turbo, and there's also a JS bundler named Turbo, so this is actually too close to reality


It makes me think of this game, basically "pokemon or technobabble". Can't find it now though.


There's Pokémon or Big Data: http://pixelastic.github.io/pokemonorbigdata/

And the (original, I think), Pokémon or Tech Term: https://docs.google.com/forms/d/e/1FAIpQLSfsG7AEFLvlW68aIVIs...


> Futamura

not Futurama :D


This comment is what PKL is going to be remembered for. Tbh I wouldn’t even have the courage to write the comment myself as the framework was coming from Apple.


> Pkl was built using the GraalVM Truffle framework. So it supports runtime compilation using Futamura Projections.

What now?


As I understand it:

GraalVM is an alternate JDK that can, among other things, do ahead-of-time compilation for Java.

Truffle is a framework for building languages, that sits on top of Graal.

Futamura Projections are a particularly interesting use case for compile-time partial evaluation.

With partial evaluation you have static (known at compile time) arguments to your function, and dynamic (known at runtime) arguments.

When “your function” is an interpreter, and your static arguments are source code, the output is effectively a self-contained compiled binary.

When your function is an interpreter, and the static arguments are the source code for the interpreter itself, that “self-contained compiled binary” is now itself a compiler.
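
To make the first of those steps concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the toy instruction set (`push`, `arg`, `add`, `mul`), `interpret`, and `specialize` are hypothetical names, and this is not how Truffle or Graal implement it. `specialize` stands in for the partial evaluator: the program is the static input, `x` stays dynamic, and what comes out is a function with the interpreter's dispatch loop already resolved.

```python
from typing import Callable, List, Tuple

# (opcode, operand); the operand is ignored for "arg", "add" and "mul".
Instr = Tuple[str, int]


def interpret(program: List[Instr], x: int) -> int:
    """Plain interpreter: dispatches on every opcode on every call."""
    stack: List[int] = []
    for op, operand in program:
        if op == "push":
            stack.append(operand)
        elif op == "arg":
            stack.append(x)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return stack.pop()


def specialize(program: List[Instr]) -> Callable[[int], int]:
    """Toy partial evaluator (hypothetical): walks the static program once
    and emits Python source in which only the dynamic input x remains."""
    stack: List[str] = []
    for op, operand in program:
        if op == "push":
            stack.append(str(operand))
        elif op == "arg":
            stack.append("x")
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            symbol = "+" if op == "add" else "*"
            stack.append(f"({a} {symbol} {b})")
        else:
            raise ValueError(f"unknown opcode {op!r}")
    # eval is acceptable here only because the source is built from our own
    # fixed fragments and integer operands, never from raw user text.
    return eval(f"lambda x: {stack.pop()}")


# x * 2 + 1, written for the toy stack machine.
program = [("arg", 0), ("push", 2), ("mul", 0), ("push", 1), ("add", 0)]
compiled = specialize(program)
```

`compiled(5)` and `interpret(program, 5)` give the same answer, but `compiled` no longer looks at the instruction list at all. The second and third projections apply the same trick to the specializer itself, which a sketch this small does not attempt.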


That all sounds cool, but is any of that especially useful for a configuration language?


If you want a tool to be able to generate executable validation from a schema, a compiler framework should come in handy.

It seems like they did not aim to make yet another mvp configuration language, but something that can scale across a wide range of usage scenarios spanning all the way from short-lived processes reading a number from a file to huge libraries of default/override hierarchies. Lack of universality sets an upper bound for the value of a configuration language, particularly when seen through the lens of configuring heterogeneous tech stacks.
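
As a rough illustration of "executable validation from a schema", here is a small Python sketch. The schema shape (`type`, `min`, `max`) and the function `compile_schema` are assumptions invented for this example, not Pkl's schema model. The schema is read once up front and turned into a list of check functions, so validating a config afterwards never re-reads the schema.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical schema shape: {"key": {"type": <class>, "min": ..., "max": ...}}
Schema = Dict[str, Dict[str, Any]]
Validator = Callable[[Dict[str, Any]], List[str]]


def compile_schema(schema: Schema) -> Validator:
    """Turn a schema into a validator function, resolving all rules up front."""
    checks: List[Callable[[Dict[str, Any]], Optional[str]]] = []
    for key, rule in schema.items():
        expected = rule["type"]
        lo, hi = rule.get("min"), rule.get("max")

        # Default arguments bind the current loop values to this check.
        def check(cfg, key=key, expected=expected, lo=lo, hi=hi):
            if key not in cfg:
                return f"{key}: missing"
            value = cfg[key]
            if not isinstance(value, expected):
                return f"{key}: expected {expected.__name__}"
            if lo is not None and value < lo:
                return f"{key}: below {lo}"
            if hi is not None and value > hi:
                return f"{key}: above {hi}"
            return None

        checks.append(check)

    def validate(cfg: Dict[str, Any]) -> List[str]:
        # Collect every error message; an empty list means the config is valid.
        return [err for err in (c(cfg) for c in checks) if err is not None]

    return validate


validate = compile_schema({
    "port": {"type": int, "min": 1, "max": 65535},
    "host": {"type": str},
})
```

With this sketch, `validate({"port": 8080, "host": "localhost"})` returns an empty list, while a config with `port` set to 0 and no `host` reports both problems.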


I’m also curious, because Graal is pretty exciting stuff, what this might give over Jsonnet or Cuelang. It’s already a hard enough sell to try to get people to adopt these and they are much older and more robust than Pkl.


I'm very wary of anything Java-based, having been burned by Java tooling in the past. I work on a few different Android projects and I have to switch between three different JDK versions depending on which I'm working on. What happened to "write once, run anywhere"??

I really like Pkl's comparison page, which includes its weak points as well! https://pkl-lang.org/main/current/introduction/comparison.ht...

> Pkl’s native binaries are larger than those of other config languages.

It should be as fast and easy to use and reliable as something like esbuild, so I'd suggest they may want to rewrite it in Go like esbuild. I'm not a Go fan at all, but it clearly does some things really well.


>"write once, run anywhere"??

You know that code compiled using a future version of Java won't work in older versions, right? I would like to know if any other programming language does that kind of thing.

> It should be as fast and easy to use and

How did you conclude that it's not fast? They are creating native binaries just like Go or any other AOT languages with GCs. Graal native images are as fast or faster than Go. Also it contains a REPL, that's why bigger size. So for CLI tooling as a developer using pkl, you won't see any difference if it's written in java + kotlin or golang.


> You know that code compiled using a future version of Java won't work in older versions, right? I would like to know if any other programming language does that kind of thing.

Of course, but what surprises me is the lack of backwards compatibility -- future JVMs refusing to run old code. I get that you have to deprecate old unsafe APIs sometimes, but it feels silly that I need three different Java versions for different Android projects.

> They are creating native binaries just like Go or any other AOT languages with GCs. Graal native images are as fast or faster than Go. Also it contains a REPL, that's why bigger size. So for CLI tooling as a developer using pkl, you won't see any difference if it's written in java + kotlin or golang.

That's good! I thought you needed Java to run it.

I figured I should give it a proper try, so I just downloaded it. 105MB!! They're not kidding when they say it's big. I also checked bun (47.7MB) and esbuild (9.8MB) for comparison.

pkl does seem to start up pretty fast, though. 1.6s on the first run (presumably just the time needed to cache that big binary) and ~100ms thereafter.


There are reasons Oracle sued Google over Android and you just articulated one of them.


It’s not Android per se that’s the problem, it’s that Android uses Gradle as a build system and Gradle uses Java.

The Gradle compatibility matrix is pretty complicated: https://docs.gradle.org/current/userguide/compatibility.html...

I’ve also used Facebook’s Buck build system, as an attempt to get away from Gradle, and it’s also fussy about JDK versions.


Pkl is newly open sourced, but it is not new. It's been used for years at Apple, and has been battle tested internally.

I'd actually say that our tooling in some ways is more mature. For example, I think our IDE experience (at least in JetBrains editors) is the best out there.


There is no trust in the words “we tested this internally.”

Apple employees can rate it as an old project among themselves, while it is more convenient for everyone else to rate the product from the moment of publication.


Looks like a more robust type system than Jsonnet (but less than Cue), with some amount of flow-control that Cue doesn't seem to support. I am not very familiar with Cue though.


> With partial evaluation you have static (known at compile time) arguments to your function, and dynamic (known at runtime) arguments.

That's pretty clever... How is this implemented in actual code though? I can't even begin to imagine how that magic machinery works.
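
For what it's worth, the textbook toy case is a power function where the exponent is the static argument. The Python sketch below is only an illustration under that assumption (`specialize_power` is a hypothetical helper, and real partial evaluators work on compiler IR rather than on source strings): because `n` is known at specialization time, the loop can be unrolled before `x` is ever seen.

```python
def power(x, n):
    """General version: both arguments are only known at call time."""
    result = 1
    for _ in range(n):
        result *= x
    return result


def specialize_power(n):
    """Hypothetical specializer: n is static, so the loop runs here, once,
    and the generated function contains only multiplications on x."""
    body = " * ".join(["x"] * n) if n > 0 else "1"
    namespace = {}
    # Generate and define the residual function, e.g. "return x * x * x".
    exec(f"def power_{n}(x):\n    return {body}\n", namespace)
    return namespace[f"power_{n}"]


cube = specialize_power(3)
```

`cube` behaves like `power(x, 3)` but has no loop and no `n` left in it. Swap "power function" for "interpreter" and "exponent" for "source program" and you get the first projection.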


> Truffle is a framework for building languages, that sits on top of Graal.

wtf is Graal? That sounds like a supporting character from Beowulf.


https://graalvm.org

Polyglot, native-compilation-enabled runtime for the JVM; it can run JS, Python, Ruby and more.


On tonight's episode of Futurama, Bender and the gang explore the temple of Pkl on planet VM where truffles are considered the holy graals and barely run away in time from - The Compilations - an ancient secretive order of silver voiced kung-fu monks tasked with protecting the shrine from alien invaders as has been foretold in prophecies - and strangely reminiscent of 20th century Earth doo-wop group The Drifters.

Cue chase montage shenanigans with Under The Boardwalk playing in the background

Do you smell toast.


I definitely did a double take to make sure they didn’t write Futurama.


I absolutely thought they wrote Futurama until I saw this comment


They did (autocorrect) and later fixed it.


The mind. It is a curious thing.


holy graals


A LOT of projects in the Java world do add new features to Java. My favorite is CRaC


new game: llm hallucination, attempt at humor, or legitimate technical explanation.


Too close to the "reliably solvable by simple heuristic" end of the spectrum to be a good game: if the text is short it's probably a joke, if it is a very long wall of words it's LLM and anything of somewhat reasonable length can only be a legitimate technical explanation, no matter how unlikely.


> and anything of somewhat reasonable length can only be a legitimate technical explanation, no matter how unlikely.

thanks, that will help improve the output.


all I saw was oracle


FWIW Graal is probably one of the most exciting technologies to come out of Oracle in a long time.


It came out of Oracle. Kiss of death.


It actually didn’t, it came out of academia. Oracle just did the right thing one time.


It's GPL-licensed, and it works. I'm happy they haven't Oracle-ized the JVM, and have been investing into great features that are available to everyone for free.


Agreed. Anything by Oracle is an automatic hard nope.


Your mention of Futamura Projections was a nice reminder of how very mathy/theoretical foundations underpin Nice Things in high-level languages, like Hindley–Milner inspired type systems and modern optimizing compilers targeting SSA form. Value Lattices in Cue [1], another programmable configuration language, also fall into this bucket.

[1]: https://cuelang.org/docs/concepts/logic/


Currently using Cue in a major project. It can be a puzzle. But, we like it a lot. Wish it had a bigger community.


Not completely related to the OP, but is Truffle going to be upstreamed as part of Project Galahad or will it remain as a library maintained by Oracle Labs?

I ask cause the Project Galahad page on openjdk.org is a bit sparse on details.


The truffle compiler extensions in Graal will be part of Galahad. For example the partial evaluator. The truffle framework and all the languages are consumed from a maven repo as a regular java library these days.

Some background on the recent changes there: https://medium.com/graalvm/truffle-unchained-13887b77b62c


It'd be interesting to understand what kind of performance problem Apple had and tried to solve with GraalVM/Truffle. I've seen some instances of heavy configs that generate literally several gigabytes of data, but those were usually not significant bottlenecks since configs are not updated very frequently.

Of course, I know those two frameworks are among the engineering marvels of the age and would understand even if they decided to use them without any concrete need.


I guess you mean Futamura projections?


I wish people named more tech products after popular media instead of common words. Would make it equally hard to web search, but at least it would be funny for non-techies to listen to


You'll really like the Quantum Resistant key exchange algorithm - Kyber. A related project is Cryptographic Suite for Algebraic Lattices or CRYSTALS.

Sadly they renamed Kyber to ML-KEM.


There is also Dilithium:

https://pq-crystals.org/


Damn you autocomplete! This happens all the time :D


Oh wow, this wasn’t the sort of language I expected to see being built on Truffle, but I’ll be really interested to take a closer look when I’m on a decent net connection.


Do you know why they use both ANTLR and Truffle?


Truffle has no opinion on how you parse the sources. It cares about how you execute them from an intermediate Truffle guided representation produced by the parser.

In other words antlr and truffle are a great fit. We even use this pairing for our example language simplelanguage.

https://github.com/graalvm/simplelanguage
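
A rough Python sketch of that division of labor, purely as an illustration: the hand-written parser below stands in for ANTLR, and the `Num`/`BinOp` classes with an `execute` method stand in for the executable tree the parser hands over. None of these names come from the real Truffle API, which is Java and considerably richer (node rewriting, specialization, and so on).

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Num:
    value: int

    def execute(self) -> int:
        return self.value


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"

    def execute(self) -> int:
        lhs, rhs = self.left.execute(), self.right.execute()
        return lhs + rhs if self.op == "+" else lhs * rhs


Node = Union[Num, BinOp]


def tokenize(src: str) -> List[str]:
    # Integers, "+", "*" and parentheses; whitespace is skipped.
    return re.findall(r"\d+|[+*()]", src)


def parse(tokens: List[str]) -> Node:
    """Recursive descent: expr := term ('+' term)*, term := atom ('*' atom)*."""
    pos = 0

    def peek() -> Optional[str]:
        return tokens[pos] if pos < len(tokens) else None

    def advance() -> str:
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr() -> Node:
        node = term()
        while peek() == "+":
            advance()
            node = BinOp("+", node, term())
        return node

    def term() -> Node:
        node = atom()
        while peek() == "*":
            advance()
            node = BinOp("*", node, atom())
        return node

    def atom() -> Node:
        tok = advance()
        if tok == "(":
            node = expr()
            advance()  # consume the closing ")"
            return node
        return Num(int(tok))

    return expr()


tree = parse(tokenize("1 + 2 * (3 + 4)"))
```

The parser only decides the shape of the tree; running the program is just `tree.execute()`. Swapping the parser for a generated one would not change the node classes at all.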


Thanks! I haven't seen Truffle and ANTLR used together before, but it makes sense.


Futamura


That's iron chef futamura to you


25 years ago pretty much every program had a GUI to do the configuration. With help texts. On Windows, programs then either saved stuff into an ini file or the windows registry, both you could also edit manually.

Today we have a programming language coming as a 87 MB binary to create config files. And to run that programming language you need to manually create a ... config file.

So what we are missing now is a 500GB framework that can write the config file for the programming language that is writing a config file for the actual program I wish to use.

I am sorry, but very clearly a huge chunk of today's "developers" really are in the business of creating problems.


Yes, the windows registry, the peak of our craft. No need to innovate, this one’s solved, wrap it up and move on

Try taking more time to consider the problems you don’t have that others do, instead of writing anything off that doesn’t make sense to you (and simultaneously gatekeeping an entire industry)


Pointing out that a proposed solution to a problem creates more problems is not the same as saying the original problem doesn't exist or isn't worth solving. Nor does it imply that some random other solution is better.


How about .ini/.cfg files? Simple and portable, allowing comments, unlike JSON.


Not scalable though, like JSON with nested objects.


Scalable config files? I'm confused. What needs to scale?


Creating complex layers upon layers just to "solve" a very simple problem is not innovation. It's shittification.


If your configurations are simple, you don’t need this. If they are not, you might.

Just because you have not run into the problems this addresses in your career does not mean that you know better than Apple how to solve them. In fact, it means you are uniquely unqualified to solve them. Acting like you are, in a severely condescending dismissal of the engineers who worked on this, makes for boring conversation.


Maybe give us an example?

It's an interesting question - just how complex are our biggest configuration problems today?


Kubernetes config is a decent example. I had ChatGPT generate a representative silly example -- the content doesn't matter so much as the structure:

https://gist.github.com/cstrahan/528b00cd5c3a22e3d8f057bb1a7...

Now consider 100s (if not 1000s) of such files.

I haven't given Pkl an in depth look yet, but I can say that the Industry Standard™ of "simple YAML" + string substitution (with delicate, error prone indentation -- since YAML is indentation sensitive) is easily beat by any of:

- https://jsonnet.org/

- https://nickel-lang.org/

- https://nixos.org/manual/nix/stable/language/index.html

- https://dhall-lang.org/

- (insert many more here, probably including Pkl)
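
A small Python sketch of why string substitution is the fragile option, using only the standard library. The document shape (`metadata`, `annotations`, `note`) and both function names are made up for illustration and do not follow any real Kubernetes schema. The first renderer pastes values into text; the second builds a data structure and serializes it.

```python
import json
from string import Template

# Text template: values are pasted in without any knowledge of the structure.
TEMPLATE = Template(
    "metadata:\n"
    "  name: $name\n"
    "  annotations:\n"
    "    note: $note\n"
)


def render_with_substitution(name: str, note: str) -> str:
    return TEMPLATE.substitute(name=name, note=note)


def render_structured(name: str, note: str) -> str:
    # Build the tree first, then let the serializer handle quoting/escaping.
    doc = {"metadata": {"name": name, "annotations": {"note": note}}}
    return json.dumps(doc, indent=2)


# A value containing a newline escapes its intended indentation level.
tricky_note = "line one\nreplicas: 9000"

broken = render_with_substitution("web", tricky_note)
safe = render_structured("web", tricky_note)
```

In `broken`, the text `replicas: 9000` lands at column zero, where an indentation-sensitive parser would read it as a new top-level key. In `safe`, the newline is escaped inside the string and the value round-trips unchanged. The languages listed above get the same property by operating on values rather than on text.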


Maybe not always, but surely in most cases of too complex a config, it is a case of ad-hoc grown config, representing what one wants to actually configure badly, and/or underlying abstractions of the thing one wants to configure matching badly what one wants to do. In most cases it would be good to take a step back, or multiple ones at that, and really ask oneself: "What is it, that I actually want to configure here?" and think about why it cannot be a simpler config. What abstractions would actually make expressing that config easy.

Often one will get to a very simple config format in the end. Of course, when one has to deal with very complex formats created by others, already widespread in use, one cannot easily change the format. Maybe that is the reason we get these meta config tools.


Sure. There are undoubtedly a lot of config formats that are overly complex.

But sometimes the complexity is irreducible. Kubernetes is one such case. The model is very well thought out, and just about as simple as it could get without removing functionality. It has sensible defaults, built-in versioning, well-defined schema etc. But if you want to describe a complete installation of a distributed system with many heterogenous processes, spread across many hosts, communicating in specific ways, with specific permissions, persistence, isolation, automatic scaling, resilience, etc, there are a lot of details. I've worked with systems that have thousands of lines of configuration, and honestly that's not extraordinary. Many people on this site will rightly scoff and say, "psshh, that's nothing."

Configuration languages are a really important area of research in the tech industry right now, and every time someone posts one on here, there are a huge number of dismissive comments. Fine. Not everyone has this problem, but it's a real problem, and solving it represents a real advance in the state of the art.


Sounds like the case where a picture is worth a thousand words


You're absolutely right, k8s is a pig.


Especially since k8s allows using arbitrary labels for plugins, effectively creating „stringly typed“ programming.

Nobody is stopped from compiling code, converting it to base64 and storing it into a label for later execution.

Arbitrary parameters like this are the opposite of a unifying abstraction.

That ingress-behavior wasn‘t defined but pluggable to suit existing load balancers also broke the abstraction.


I do feel there's a simple abstraction under there somewhere - k8s config is just a big tree, after all. And adding pkl's loops on top of that would probably reduce a lot of duplication.
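
To sketch what "a tree plus loops" could look like, here is a hedged Python example. The keys (`replicas`, `resources`) and service names are hypothetical stand-ins rather than a real k8s schema, and `deep_merge` is a helper written for this example, not a Pkl feature. Shared defaults are written once, and each service only states what differs.

```python
import copy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new tree: override wins, nested dicts are merged key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Written once instead of being repeated in every file.
DEFAULTS = {"replicas": 2, "resources": {"cpu": "250m", "memory": "256Mi"}}

# Each service lists only its differences from the defaults.
SERVICES = {
    "api": {"replicas": 4},
    "worker": {"resources": {"memory": "1Gi"}},
    "cron": {},
}

configs = {
    name: deep_merge(DEFAULTS, overrides)
    for name, overrides in SERVICES.items()
}
```

Here `configs["worker"]` keeps the default CPU setting and replica count and changes only the memory value, which is the kind of duplication a loop over a tree removes.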


A team of 6 responsible for overseeing the deployment of all software in a department of 1000 to any number of deployment environments with tight budgets on durability, delivery time, and the autonomy of the teams that are writing the configuration may reach for this, or something similar, over the Windows Registry


I just ran a count on the infrastructure configuration for a relatively small, but already public, company: 13k lines of YAML alone. Some of it is generated (from "YAML with macros") by an AWS CDK parody I invented before CDK became a thing (and being an old-school Unix pervert, I did it with awk and m4). There's also 1400 lines of written or generated JSON that still needs to be managed. And that's a successful, but single-product, company. I also worked for a company which had a single product but ran it for their clients heavily (very heavily) customized; you can easily 10x the numbers above, and you still have to do that with a single team and go back to your family every night rather than riding an ambulance to the nearest mental facility. YAML is like object code; it's not a configuration language (and that's why I prefer JSON: it's less error-prone and easier to generate). Writing YAML manually is like programming in machine code back in the 80s. Possible, but why?


Well, Apple is the gold standard in creating new proprietary programming languages for often just evil reasons - typically for vendor lock-in.

They invented the Objective C and Swift, and made it pretty much impossible to use any standard language to target their platforms in a cross-platform way. I can access the Windows API with any language I want. I can access the Linux Kernel API with any language I want. I can access the BSD API with any language I want. I even can use any language I want on the C64 to call native operating system functions.

So, yes, I am taking the liberty to believe I know better than Apple. TBH I don't regard this conclusion as rocket science, so no ego involved here.


There's no "walled garden" here. We want Pkl to be useful for many developers, for a variety of purposes. Hopefully that's clear enough with Java, Kotlin, and Go being amongst the various supported language bindings. We also already support Linux, and plan on supporting Windows.

It's also open sourced under the Apache 2.0 license, which grants you the right to distribute and modify it.

Cheers


I appreciate you having good intentions. I don't mean to be disrespectful.

This however does not change my position that we have more than enough scripting languages focused on string operations, and don't need another language. And especially not a bloated one that itself depends on huge frameworks. If the same task can be solved using a 76KB standard Unix tool, creating a custom language with a huge bloated runtime simply can not be justified. It's bringing a sledgehammer for a task where a swiss pocket knife would be appropriate.

And my position also stands that it would make far more sense to finally agree on a config file standard format, with native parsers existing in every major programming language, so people stop re-inventing the wheel again and again and again, with the wheel becoming heavier and bulkier and less round every time.


> And my position also stands that it would make far more sense to finally agree on a config file standard format

It would also make far more sense for humans to agree on most things and work towards a common goal that benefitted us all, but that’s just as much of a pipe dream.

Proposing something that will never happen is not a practical solution.

https://xkcd.com/927/


communism immediately comes to mind. beautiful idea, but also the idea with the largest body count in the human history.


> They invented the Objective C and Swift

Apple had nothing to do with the invention of Objective-C. It had been out for 12 years when Apple first started using it.


Apple did not invent Objective-C. They took it over from NeXT. Which also didn't invent it.

Also, since Objective-C is basically just C, you could actually target their platform with plain C if you know what you're doing. Some of the underlying frameworks (e.g. Core Foundation) are actually C APIs.


Misappropriating a trendy word to use as a descriptor for anything you don’t like. This is how someone speaks immediately before taking a look in the mirror and realising that they sound like their parents. You are not the be-all end-all and your drive-by analysis of a technology is going to be biased by its suitability for what you do day-to-day. No amount of experience justifies your attitude.


I am sorry if the wording of my opinion has offended you.

My parents most likely would have complained that they'd need 58 floppy disks to run this programming language. :)


It's not that it's offensive, it's just not conducive to constructive dialog. "Enshittification" has rapidly come to mean nothing more than "changing in ways I don't like", just crasser.


Ok, I'll avoid that term in the future, then.


It is no different when you are trying to solve a complex problem and complex layers upon layers obfuscate and confound you when you are trying to figure out why some component ends up with configuration values that are nowhere to be found in your input files.


I am taking your comment as a joke because the Windows Registry is a joke.


I think they are being sarcastic about windows registry.


40 years ago I wrote Assembly that ran directly on the CPU. You could modify the registers directly. Then these crazy people came along and made a compiled language? And it outputs... Assembly?

I'm sorry, but a huge chunk of today's "developers" really are in the business of creating problems.


except when you moved to C there was no corresponding increase in fixed time which you must spend on tool setup just to program


Yeah, it’s really great that compiling C is a straightforward endeavor that doesn’t require any tools, and definitely doesn’t depend on executables that generate or parse configuration files. I’m confident that no C developers have ever spent time on tool setup just to program.


gcc main.c

Of course tinkerers made some tools you can tinker with as time went on, but you don’t have to let those interfere with velocity, just like I will not be using configuration languages.


Do you mind linking me to some large C projects that don’t use make/autotools/etc?


stb libraries


So… novelty single-file libraries? I’m not sure that “it works if you only have one file” truly supports your argument.


25 years ago, an admin would be responsible for one server, which they would set up and then never touch until (and sometimes after) someone hacked it.

Now people are responsible for thousands of containers, and GUI simply does not scale. And the configuration of those containers is domain-specific and/or varies enough so that you cannot write a single config file and copy it over. This is why configuration management systems first came with template engines and now require a separate tool to render the configuration.

Nobody would develop anything like that if it was not needed.


Oh yes, people DO develop stuff that is not needed. Back in the '90s pretty much every nerd sooner or later had to write his own programming language (myself included). The difference between then and today: We did throw it away afterwards. These days every week someone spews out a new "programming language", claiming that the re-invented wheel is so much more round than the others.

I fully agree that you might need to automate editing and deploying config files. But you don't need a new programming language for that. Just use one of the thousands of existing languages and tools.

You can for example use sed. A standard since 1973. Available everywhere. 76KB in size. Extremely fast.

Or use AWK. Exists since 1977.

Or you can use Perl. Natively able to read/write ini files, and in general very good at string processing. 6MB in size (or 250KB for embedded versions).

Or use Javascript, PHP, shell scripting, whatever.

But no, there is absolutely no need to invent a new programming language for a task that already has very established tools.


You are proposing to use general-purpose languages instead of domain-specific ones. I think this discussion was settled quite some time ago. For example, awk itself is domain-specific, built for processing text (or field-separated) files. On the other hand, m4 is as old as awk. m4 was heavily used for configuration even before autoconf (sendmail configuration is probably the best-known and possibly the ugliest application of it), so one can argue that the need for macro expansion and rendering of configuration files (as well as other types of files) was recognised even back then.

m4 just shows its age, and we’ve learned what is good (being Turing-complete) and what’s less useful (multiple output streams).

m4 could’ve evolved into m5, m6, etc., but nobody did it; instead people developed tools like jinja. One could argue this is wrong and the change should’ve been evolutionary, but the point still is that domain-specific languages to render configuration existed because there was a use for them. Perl indeed could have been such a language; I guess it did not happen for totally unrelated reasons. If Perl had worked, Python would not have gained popularity.

Pkl is not conceptually a new invention, it’s a new attempt to solve a real problem.


I stopped using sed because it behaved differently on my coworker's machine than it did on mine.


Same, actually a lot of the listed things are too old-school for me.


I'm surprised there's no widely-adopted library for general-purpose languages that handles user-friendly object input and outputs static config (perhaps in JSON) for machines to read. Instead, there are several DSLs that act as dynamic config, and even some that generate dynamic config, or something in the middle like k8s YAMLs.

Have I just not thought hard enough about this problem yet, and a DSL is really needed for it? Or should I go write that lib?
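
For what it's worth, a minimal sketch of what such a lib could look like in Python, using only the standard library. The `AppConfig` and `Database` classes and the `render` helper are hypothetical names made up for illustration; the idea is just "typed objects in, static JSON out".

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Database:
    host: str
    port: int = 5432  # default filled in when the author omits it


@dataclass
class AppConfig:
    name: str
    replicas: int
    database: Database
    tags: list[str] = field(default_factory=list)


def render(config: AppConfig) -> str:
    """Serialize the typed object into static JSON for machines to read."""
    return json.dumps(asdict(config), indent=2, sort_keys=True)


config = AppConfig(name="billing", replicas=3, database=Database(host="db.internal"))
rendered = render(config)
print(rendered)
```

This gets you editor completion and defaults for free, but no constraints beyond what the type hints express, and nothing stops the authoring code from doing arbitrary I/O.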


There is PowerShell: `ConvertTo-Json`


100%. Containers are essentially language-agnostic software libraries, and the config we write for them is analogous to the wiring we'd normally do in a program to tie libraries together. Yet, our current tools aren't adapted to write and distribute the "programs" formed from gluing them together.

(Mostly-shameless plug, this is what we're trying to solve with Kurtosis: https://github.com/kurtosis-tech/kurtosis )


That strikes me as a very uncurious perspective. There’s absolutely nothing that could be improved about configuration files? An 87MB binary feels like a very arbitrary measure of worthiness.


ONE actual standard that everyone uses would be fine. For non-tree (nested) configurations, an ini file would do. It's a standard that worked well in 1990, and still does today.

For nested data, it doesn't matter to me if it's JSON, YAML, TOML or whatever. Just agree on ONE format.

TOML also is a good example of "creating problems instead of solutions": They deliberately (!) broke compatibility to the INI format due to "I can't stand unquoted strings". Yeah, emotional feelings about CONFIG FILE FORMATS.

And again: If your config files are so complex, how about just creating a TUI/GUI so the user can configure your program in an accessible fashion?

Anyway, yes, there is something that can be improved: Create a standard instead of adding another layer of complexity for something that should be very simple in the first place.

Yes, binary size may feel arbitrary, but it often gives a hint about the hammer's size.


> TOML also is a good example of "creating problems instead of solutions": They deliberately (!) broke compatibility to the INI format due to "I can't stand unquoted strings". Yeah, emotional feelings about CONFIG FILE FORMATS.

INI is hard to parse because without quotes the parser would not know whether it’s “True” as a string or True as a Boolean value. A formal parser that can be included in a program as a library and reliably generates an internal config representation (e.g., reads the config file into an object in application memory that will be used to modify the behaviour) is a good thing, and TOML helps with that while INI does not.


Yes, agreed. On the other hand, there are quite a few people using Javascript, a language where this distinction also isn't really done in a clean way.

If you would de-serialize an ini-file into Objects in memory you would typically have meta-data about your object (RTTI/Reflection for compiled languages, given for scripting languages).

But yes, INI isn't without flaws. Parsers typically will need to accept everything from "true", "True", True, true, 1, "1" for boolean fields.
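
To make the ambiguity concrete, here is a small sketch with Python's standard `configparser`. The section and key names are made up. Every value comes back as a string, and it is the caller who decides which keys are booleans:

```python
import configparser

INI_TEXT = """
[server]
debug = True
verbose = 1
title = True
quoted = "true"
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
server = parser["server"]

raw_debug = server["debug"]            # plain string, no type information
debug = server.getboolean("debug")     # caller asks for a boolean
verbose = server.getboolean("verbose")  # "1" is accepted as true as well
title = server["title"]                # same text, but meant as a string

# Quotes are not part of the syntax, so they stay in the value
# and the boolean conversion rejects it.
try:
    server.getboolean("quoted")
    quoted_ok = True
except ValueError:
    quoted_ok = False

print(repr(raw_debug), debug, verbose, repr(title), quoted_ok)
```

So the file alone cannot tell `debug` and `title` apart; only the reading code can.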


Schemas are often better than syntax typing. In other words the program decides what the type is, not the file.


If the schema allows union types, I don't want to deal with unquoted strings.


You don’t deal with syntax period. Union types are not desirable with configuration with the exception of |null perhaps, which should receive a default.


> They deliberately (!) broke compatibility to the INI format

There is no "ini format" though? INI is a weak set of conventions that different programs parse in a completely different way, it's the opposite of standardization.


> ONE actual standard that everyone uses would be fine

Sounds like a social problem. How do you get _everyone_ to agree on anything?


It’s pretty clear that you are putting “things that I [think I] understand” above things that are better. Have you ever written an ini parser? If you have, I doubt it was good, because none of them are, because ini is a flawed format.


No, I am a Pascal guy. Turbo Pascal had an INI parser, Delphi has one, FreePascal has one, and they all work flawlessly.

From what I see, 20+ years proven parsers are available for pretty much any language there is.

And yes, the format is a bit flawed. But it gets the job done.


The Python parser has worked well for decades.


I mean, there is something to be said against using the equivalent of 87 novels worth of text to produce a note’s worth of configuration.


The general shift to infrastructure-as-code is tied to a number of new trends and requirements:

1) Code review (config code can be reviewed by humans)

2) Automated pre-submit checks (config code can be passed through automated pre-submit checks, such as preventing huge changes or giving you a nice diff to look at)

3) Auditability / history-tracking (you can look at the history of a config file to see who, when, why changes were made - this may even be necessary for compliance reasons)

4) De-duplication (you can extract common components into templates/functions - this supports variation across envs/regions/customers and repetition across tasks/machines/DBs)

All of these features help build large-scale systems in modern corporate environments.
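
As a rough illustration of point 4, here is a sketch in Python where one function acts as the shared template and the variation across envs and regions is just arguments. The `service` function, the env names, and the region names are all hypothetical.

```python
import json


def service(name: str, env: str, region: str, replicas: int = 2) -> dict:
    """Shared template: every service has the same shape, varying only by arguments."""
    return {
        "name": f"{name}-{env}-{region}",
        "replicas": replicas,
        "labels": {"app": name, "env": env, "region": region},
    }


def render_all() -> dict:
    """Expand the template for every env/region pair."""
    configs = {}
    for env, replicas in (("staging", 1), ("prod", 3)):
        for region in ("us-east", "eu-west"):
            svc = service("api", env, region, replicas=replicas)
            configs[svc["name"]] = svc
    return configs


all_configs = render_all()
print(json.dumps(all_configs, indent=2, sort_keys=True))
```

The rendered JSON is what gets reviewed, diffed, and checked in the pre-submit step; the template is what humans edit.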


25 years ago, the only way to get two pieces of software to talk to each other was to get their authors to talk to each other.

I'm quite happy that remix culture has finally made it to software tooling. The extra layer of abstraction is a small price to pay for the power it brings.

(I don't know Pkl, but it seems like it scratches part of the same itch that you might use nix for).


> So what we are missing now is a 500GB framework that can write the config file for the programming language that is writing a config file for the actual program I wish to use.

That has existed since 1960. It's called LISP. E.g. https://guix.gnu.org/ uses it with great success (the Guile Scheme dialect of LISP, to be precise). And FYI the "framework" is:

  $ ls --human-readable --size $(readlink $(which guile))
  16K /gnu/store/1gd9nsy4cps8fnrd1avkc9l01l7ywiai-guile-3.0.9/bin/guile
Yes, only sixteen kilos, not gigas.


Guix/Nix can use dynamic linking without risk since they know the exact dependency closure. It's not fair to compare a statically linked executable to a dynamic one.


Right, so it's still just about 9 MB:

    guile=$(readlink -f $(which guile))
    sizes=$({ echo $guile; ldd $guile|grep /|sed 's+^[^/]*\(/[^ ]*\).*+\1+'; }|xargs -n 1 readlink -f|xargs du -ab|cut -f 1)
    sumsize=0
    for size in $sizes; do
        sumsize=$((sumsize+size));
    done
    echo $sumsize
Gives 9041632 here.


Thanks for the snippet. Now I have a recipe how to calculate the size of a binary file, plus the shared objects, i.e. the so-file(s), it requires directly.

Now I just need to extend your snippet so that it works recursively. A shared object can require other shared objects.
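
A possible starting point in Python, as a sketch only. It shells out to `ldd`, so it is Linux-specific, and `parse_ldd` and `closure_size` are names I made up. As far as I know, glibc's `ldd` already asks the dynamic loader to resolve everything, so its output should include indirect dependencies too; treat that as an assumption and check it on your system.

```python
import os
import re
import subprocess

# Picks the absolute path out of ldd output lines such as
#   "libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x...)"
#   "/lib64/ld-linux-x86-64.so.2 (0x...)"
# Lines without a path (e.g. linux-vdso, "not found") are skipped.
_PATH = re.compile(r"(/\S+)")


def parse_ldd(output: str) -> list[str]:
    """Extract the resolved shared-object paths from ldd's text output."""
    paths = []
    for line in output.splitlines():
        match = _PATH.search(line)
        if match:
            paths.append(match.group(1))
    return paths


def closure_size(binary: str) -> int:
    """Sum the on-disk size of a binary and every shared object ldd reports."""
    output = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    ).stdout
    # A set avoids counting a file twice when two names resolve to the same target.
    files = {os.path.realpath(binary)}
    files.update(os.path.realpath(path) for path in parse_ldd(output))
    return sum(os.path.getsize(path) for path in files)


# Usage (not run here because it needs ldd): closure_size("/usr/bin/guile")
```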


25 years ago, you didn't have thousands of websites and applications supporting billions of users and millions of TPS. In hindsight, the innovation around DevOps is nothing short of a marvel IMO. 87 MB in the context of a $100 16GB stick of RAM is basically cheap in relative terms.


I think there are plenty of people who have the same uncomfortable feeling.

I used to see this sort of thing a lot at Apple and I used to classify the people concerned as those that work AT Apple rather than people that work FOR Apple.


> manually crate a ... config file.

Wouldn't you want to self host the config?


I'm convinced some developers are copying the lawyers' old scam of charging by the word. Why else would we have this nonsense?


Pkl was one of the best internal tools at Apple, and it’s so good to see it finally getting open sourced.

My team migrated several kloc of k8s configuration to pkl with great success. Internally we used to write alert definitions in pkl and it would generate configuration for 2 different monitoring tools, a pretty static documentation site and link it all together nicely.

Would gladly recommend this to anyone and I’m excited to be able to use this again.


Was about to ask if you had k8s api models available internally, and that someone should create some tool to generate that from the spec. But turns out it already exists in the open!

https://github.com/apple/pkl-k8s-examples


Coming from yaml+kustomize, all those curly braces are a tough sell. It looks like they roughly double the number of lines in the file.


While I learned to accept YAML it messes up editor usage.

It is so sensitive that basic text editing like copy and paste, tab, in/decreasing indent never quite do what I expect in IntelliJ.

I paste parts of yaml into another yaml and it ends up somewhere unpredictable.


Curly braces are great at ensuring correctness though, as goto fail; has shown.


Why yes, I would like to see more of those in my k8s, so glad we finally have the technology

https://github.com/apple/pkl-k8s-examples/blob/96ba7d415a85c...


Let us review the progress we've had so far in this area

* XML has the advantage of being almost perfect except it's too verbose and isn't JSON

* JSON has the advantage of being JSON, but is strangely broken at even basic stuff (i.e. comments, no trailing commas)

* TOML has the advantage of being "better JSON", for people who like rust, but it is still too limited for a lot of scenarios

* YAML is not as limited, but it has the advantage of giving bad people promotions and good people PTSD

* HOCON has the advantage of being the goldilocks child of JSON and YAML, but nobody read the documentation

* <YOUR PROJECT'S PL> has the advantage of being able to do anything and this is a disadvantage

* Etc... for various reasons

In all seriousness, I welcome Pkl! Configuration seems simple and I think another configuration language opens up a lot of pent-up frustration, but I am legitimately very happy to see some fresh ideas in this space.

Incorporating a collection of typed records with a modicum of behavior and logic might be the secret recipe needed to crack the code. The fact that it can produce equivalent JS tells you that they've chosen a very intelligent subset of functionality of programming language features.

My best wishes to the team!


^ This is really the only required comment in this whole thread.


Built-in fetching of http resources and reading files from the filesystem[1] combined with turing-completeness are features that I wasn't expecting from a configuration language. I wonder if the complexity this brings is justified.

1: https://pkl-lang.org/main/current/language-reference/index.h...


Those sound like features that will eventually lead to major security issues.


I/O can be sandboxed via flags.

For example, see these CLI flags: https://pkl-lang.org/main/current/pkl-cli/index.html#common-...

And when using the different language bindings, you can specify sandboxing options directly in that library.


How many people will run a malicious config at least once without the flags? At some point it becomes a numbers game.


For me, that's where all the power of the language comes from. It's like writing your config in Go or Python (which I think is also a great approach) except it's designed from the ground up for this use case of config generation.


Ah just give me typescript, I do not need to learn a new thing for configuration languages and at the end of the day the output is compatible with JSON. Typescript has all the stuff I would want: types, first class json support, an ecosystem of libraries (if you want it, I would probably not for config generation). And the tooling is amazing. Does pkl have LSP, syntax highlighting in every editor, debugger or repl?


Agreed. If we're getting to the point where you're saying "it's like writing your configuration in [insert Domain Specific Language Here]" then I'd prefer to simply use that language. I understand the point Pkl is making in that "configuration won't work across DSLs so here's one language for all of them" but I don't know that that's enough motivation for people to adopt it.

However, I haven't built anything cool like this so what do I know. I'm just procrastinating on my personal project and browsing hacker news.


At one time I just gave up on using any config formats and instead went back to the good old simple "use the programming language as the config file too" approach.

It's certainly not cross-language capable, but when your editor knows exactly what's there and what's not, with auto completion from the config file, that was a better experience than worrying about cross-language issues that may not even be a problem depending on the project.

If you must, you can easily make an identical copy of the config in the build process to convert it to another language.


Yep, I've always got a config.js. But that's a luxury compiled languages won't have.


Ship your config as a .so/.dll ! :P


This might actually be done pretty often.


A big problem we've hit with allowing users to write Typescript (or any other general-purpose programming language) for our product is that it's too powerful.

Observationally, it seems that all that power eventually gets used, and then you end up with config that has complex interfaces, or becomes non-portable because it's doing arbitrary file reads, or is non-deterministic because of an ill-advised call to random or the system clock. The config then becomes something not maintainable by the rest of the team - just the few who know it.

Config languages seem to need to strike an interesting balance between being complex enough to allow for reasonable DRY code (which helps maintainability at the expense of readability), but not so complex that they're not generally-maintainable.


I feel like this is what code review is for. Isn’t this also possible with your application code? How do you prevent that?

As for I/O concerns, run it in CI with deno and ensure there is no I/O


Yep, application code can have the same problem! The difference is that application code lives "inside the abstraction" of the program, and is viewed and edited by a much smaller set of developers.

Configuration, by contrast, sits at the seam between two systems. It's the top-level parameterization of the abstraction, and behaves more like an API. E.g. imagine if the only way to configure Kubernetes or Docker were in language-specific bindings and there was no such thing as a YAML lingua franca.


My impression of typescript is that if you do:

a = 1;

and elsewhere:

a = 2;

The ultimate value of 'a' will depend on the order in which those statements are executed, right?

A configuration language should surface that as an error and tell you where all of the conflicting references are.
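
A sketch of that behaviour in Python, to show what "surface it as an error" could mean in practice. This is not how Pkl or any particular config language implements it; `merge_strict` and `ConflictError` are hypothetical names, and the fragments are flat dicts keyed by the file they came from.

```python
from collections import defaultdict


class ConflictError(Exception):
    """Raised when two fragments define the same key with different values."""


def merge_strict(sources: dict[str, dict]) -> dict:
    """Merge flat config fragments; fail if two fragments disagree on a key."""
    seen = defaultdict(list)  # key -> [(source name, value), ...]
    for source, fragment in sources.items():
        for key, value in fragment.items():
            seen[key].append((source, value))

    # A key conflicts when its definitions do not all have the same value.
    conflicts = {
        key: entries
        for key, entries in seen.items()
        if len({repr(value) for _, value in entries}) > 1
    }
    if conflicts:
        lines = [
            f"{key}: " + ", ".join(f"{src} sets {val!r}" for src, val in entries)
            for key, entries in sorted(conflicts.items())
        ]
        raise ConflictError("conflicting definitions\n" + "\n".join(lines))
    return {key: entries[0][1] for key, entries in seen.items()}


print(merge_strict({"a.conf": {"a": 1}, "b.conf": {"b": 2}}))
```

Calling it with `{"base.conf": {"a": 1}, "override.conf": {"a": 2}}` raises instead of letting evaluation order pick a winner, and the message names both files.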


Typescript does have const variables. But you can still often mutate the instances and there are many other reasons why a configuration language should be strictly free from side effects allowed in imperative languages.


If your tsconfig settings are strict enough this is a type error :)


> Ah just give me typescript, I do not need to learn a new thing for configuration languages

People who don't know Typescript will still have to learn a new thing in that case, so there isn't really any reason to pick TS over any other existing language

> syntax highlighting in every editor

There is a tree-sitter parser for pkl, so it'll work anywhere where tree-sitter works. And for VSCode and JetBrains stuff, they have official extensions.


More people know TS (or JS, close enough) than Pkl, it can be used for more, and it's got better tooling/support. Those are understatements.


TS tooling won’t tell you when your config value violates the constraint “isBetween(0, 100)” or “matches(Regex(…))”. Among other things.


A config validator, which can be written in TS, would.
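
Roughly like this, sketched in Python rather than TS. The `is_between` and `matches` helpers only borrow the names from the constraints quoted above; they are not Pkl's implementation, and the schema keys are made up.

```python
import re
from typing import Any, Callable

Check = Callable[[Any], bool]


def is_between(low: float, high: float) -> Check:
    """Accept numbers (but not booleans) in the closed range [low, high]."""
    def check(value: Any) -> bool:
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        return is_number and low <= value <= high
    return check


def matches(pattern: str) -> Check:
    """Accept strings that match the whole pattern."""
    compiled = re.compile(pattern)
    return lambda value: isinstance(value, str) and compiled.fullmatch(value) is not None


SCHEMA: dict[str, Check] = {
    "cpu_percent": is_between(0, 100),
    "hostname": matches(r"[a-z0-9-]+(\.[a-z0-9-]+)*"),
}


def validate(config: dict[str, Any]) -> list[str]:
    """Return one error message per missing or invalid key, in schema order."""
    errors = []
    for key, check in SCHEMA.items():
        if key not in config:
            errors.append(f"{key}: missing")
        elif not check(config[key]):
            errors.append(f"{key}: invalid value {config[key]!r}")
    return errors


print(validate({"cpu_percent": 150, "hostname": "db.internal"}))
```

The catch is that this only runs when something calls `validate`, so the feedback arrives at test or load time, not while typing.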


In theory, yes. But I doubt that we’ll see a TS IDE plugin that underlines violated constraints with red squiggles anytime soon. :-)


There are IDEs that auto-run unit tests and highlight exceptions thrown in the main or test code, which wouldn't be far off.


Agreed, runtime checks can be implemented in typescript - that’s the beauty of it


Nix does this too and maintains perfect caching from top to bottom.


Nix is a build & metaprogramming system, not (just) a configuration language.


Nix is a bunch of things, one of which is a configuration language.


A bafflingly ugly configuration language.


A major difference is that Nix has no type system or schema. Pkl is typed.


The NixOS module system has those and can be used independently of NixOS despite its name.


But I have no idea how I would build a config structure for an application using Nix... It seems very powerful so I'm sure it's possible, but I just have no idea where I'd start for this specific use case.

Whereas this documentation for Pkl is entirely about that use case.


I agree that there's a huge documentation and user awareness gap, but NixOS is obviously possible to configure this way so it's definitely possible.

Might be room for a tool that exposes just the configuration management side of Nix in a more approachable way... on the other hand it would be a bit silly to use nix for conf files and not also for the underlying packages.


I guess it strikes me as not just a documentation and awareness gap, but a different paradigm in a more fundamental way.

I think your last sentence gets at that too: For Nix, configuration management is just one component of a broader and more powerful paradigm. And it seems to me like putting a square peg in a round hole (or as you say, a bit silly) to try to use it to solve this narrower and simpler problem.


Better to learn a tiny corner of nix (which you may later apply to the rest of it, or not) than to learn a language with a narrower use case.

But one who embraces nix fully is one who is willing to commit a lot of time to turning their back on convention. Returning to 90% of the conventions that they worked so hard to leave behind probably won't excite them.

So it's not silly, it's just that the person to do it is culturally unlikely.


> Better to learn a tiny corner of nix (which you may later apply to the rest of it, or not) than to learn a language with a narrower use case.

"Better" in what way?

Like, maybe I agree in some sort of philosophical sense, of how it's better to learn and expand one's awareness of what exists and is possible in the world. That is, better for students, similarly to how I'd say it is better for students to learn lisp and haskell, because they'll figure out how to use javascript and python and C# or whatever as needed on the job. And I'm a believer in life-long learning, that everyone should do their damndest to carve out time to be a student at all times throughout life. So certainly I'm in favor of learning about Nix and its config structure and comparing and contrasting it to other things like this Pkl.

But I don't think the idea of using Nix's config management instead of a narrowly-targeted config management tool would be "better" in the sense of a professional figuring out how to set up or refactor the config structure within the non-Nix paradigm of the vast majority of organizations. And that's what I've been talking about here.

I'm actually a Nix kool-aid drinker, believe it or not, and I hope its paradigm will sweep the world, but I also have to do what's right by the organizations I work within at each moment in time, and being at the vanguard of the Nix paradigm has not been that, in my view, yet.


If you're:

> a professional figuring out how to set up or refactor the config structure within the non-Nix paradigm of the vast majority of organizations

Then I don't think nix-as-config-only is a good move. But if you're a professional figuring out how to set up or refactor the config structure of some organization, AND for some reason you've already decided that you're going to introduce a language that nobody in the organization uses, then I would say that nix is a good choice. This is because there are probably other people in that organization that would find nix useful if they were already over the syntax adjustment phase.

If a multitool's screwdriver is just as good as a normal screwdriver (granted they're usually not), why not prefer the multitool?

Generally I'd just suggest using whatever languages are already around, config-focused or otherwise. For instance we have a lot of python in our stack so I've found a library called mergedict that gets us close enough to the composability of nix modules without adding a new language.
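
To show the kind of composition I mean, here is a hand-rolled sketch of a recursive merge. It is not mergedict's API, just the general idea; `deep_merge` and the sample dicts are made up for illustration.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict: nested dicts are merged, any other value is replaced."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


base = {"db": {"host": "localhost", "port": 5432}, "debug": False}
prod = {"db": {"host": "db.prod"}, "replicas": 3}
merged = deep_merge(base, prod)
print(merged)
```

The override only has to mention what differs, which is most of what we wanted from nix-style modules. Lists are replaced wholesale here; whether they should be appended instead is exactly the sort of policy question these libraries differ on.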


> Then I don't think nix-as-config-only is a good move.

Yep, that was the only narrow point I was trying to make initially :)

> If a multitool's screwdriver is just as good as a normal screwdriver (granted they're usually not), why not prefer the multitool?

Under the assumption that the multitool's screwdriver is just as good, then yeah, that might make sense. But as you grant, that is usually not the case. And my original point is that this is one of those usual cases where it is not as good, not one of those rare cases where it is just as good. And the reason Nix is not just as good for this, is that it was not made for this, it was made for something different and bigger.

> For instance we have a lot of python in our stack so I've found a library called mergedict that gets us close enough to the composability of nix modules without adding a new language.

Things like mergedict, while being a common genre of solution to this problem, are not a good solution to it. They are the total mess I said I'm always keeping an eye out for better alternatives to.

I don't know yet if I think Pkl (or Cue or any of the rest) actually fit that bill, but your contention that there is no point looking for solutions to this problem short of Nix strikes me as a classic case of making perfect the enemy of good.


Sounds a bit like Mako in Python, that allows you to code Python in the template, iirc.


Having read through the docs a little, my gut reaction is that they might be a little too much in love with the idea of having created a language that can serve both as schema definition and as minimal values carrier. It smells of unexpected failure modes through overuse [1].

But perhaps this is exactly the core feature: everybody who adds pkl to their software implicitly signs up for participation in whatever configuration monstrosity the downstream stack will end up having. Based on the assumption that it will be a monstrosity anyways, and that a uniform system would be less bad than an unstructured mess.

Next stage of concerns: if it's a heterogeneous stack that shares one configuration graph, the runtime implementation that is linked into parts of the stack can't ever be allowed to evolve. Ouch. And then there's runtime performance, I wonder if that could eventually become a factor, e.g. if short-lived go processes are involved?

It all seems surprisingly ambitious, very far from the "why not json-with-comments" that was my first reaction (shared with many I assume)

[1] digression: e.g. it reminds me of how in the dark days of peak XML java, a lot of typechecking was effectively thrown overboard because for some unfathomable reason everybody agreed that it would be nice to rearrange objects without a helpful compiler in the room


It's normal for a programming language to have the ability to define types and create values of those types. Pkl is entirely conventional here.


> We offer plugins and extensions for IntelliJ, Visual Studio Code and Neovim, with Language Server Protocol support coming soon.

Why? Why would they not have just done the language server first (or only)? All of those have built-in support for it, so separate implementations wouldn't have been necessary; that's the point.

I just don't understand why you'd make that decision on a greenfield project today, especially if LSP support is planned at all?


Probably because JetBrains only recently added native LSP to IntelliJ[0] etc.

Given that writing anything in Java/Kotlin basically requires the use of IntelliJ, it’s not really surprising that a language built on top of Truffle and GraalVM, all Java technologies, would ship with an IntelliJ plugin, _but not_ an LSP. Because such an LSP would have been useless in the primary IDE used by the language developers.

So if you wanna blame anyone, blame JetBrains for deliberately making it hard to use LSPs with their IDE. Presumably to build a moat around their plugin ecosystem.

[0] https://blog.jetbrains.com/platform/2023/07/lsp-for-plugin-d...


The experience with LSPs is quite underwhelming compared to what a native language plugin in IntelliJ can do. That isn’t to say the approach is bad, but it’s definitely a trade off.


How come? I don't know anything in say, pycharm, that can't be done with a LSP server and plugins. But that's probably because I don't know enough so I'm curious about what limitations an LSP would involve!


Refactorings are one example where LSP can’t match IntelliJ. A language server also takes far more skill and effort to implement than the equivalent IntelliJ plugin, where much of the heavy lifting is done by the IntelliJ framework. And you still need at least a lightweight plugin for each supported editor. That’s why a language server isn’t necessarily the best place to start but hard to avoid in the long run.


Why assume it's a greenfield project? I would think most open source software that comes out of companies has lived internally for a while before going public?


This. I’m willing to bet I talked to one of the people who use this internally at Apple a few years back when I interviewed there (before they decided to do away with remote hires). They didn’t mention it, but it fits the context.


Yes, I was using it when I joined Apple in mid 2020


Don't leave us hanging, how was your experience?

(Pkl, not Apple :p)


I absolutely loved it and I’ve been extremely impatient to see its release. I used it to generate k8s manifests, Terraform, all infra config files. Very flexible and fun to use


Do you know of any public resources about its use cases, other than the one linked in this submission of course?

What about Pkl made it easier to write Terraform/K8s manifests/etc.?


I looked around and couldn't make sense of how to write out random config files like I manage with puppet and ERB files.

Are there any reasonable examples of such?


Additionally, I can't think of other such projects from Apple. They tend to build things that get used extensively until the end of their life.

I wish Apple would continue to open source much more. Especially old software much like Microsoft does. I would kind of enjoy a Linux Distro that can natively run old Apple software and tools.


Clang leaps to mind. Most of LLVM was developed under Apple's auspices, although the project predates Apple hiring Chris Lattner. Swift is also open source, although in practice it's almost always used in a macOS context. Which I think is a shame, it's a great language for writing the sorts of servers for which people usually reach for Go.

That's not a complete list of programs Apple has open-sourced, although clearly open-source software isn't what they're best known for. But clang alone is a worthy contribution which deserves to be recognized.


FoundationDB was open sourced a few years after they bought it too.


Fair point.


>I just don't understand why you'd make that decision on a greenfield project today, especially if LSP support is planned at all?

How about if the LSP doesn't cut it for the kind of IDE support they want to offer?


Well I suppose that's what I'm asking, is that the case? I'm not aware of other projects that have a separate plug-in due to LSP shortcomings, so just interested if there is something they're doing that's not possible/won't be in the LSP one.


I think the answer to your question is yes: LSPs can’t do everything that a plugin can do.

https://blog.jetbrains.com/platform/2023/07/lsp-for-plugin-d...


That sounds like a limitation on Jetbrains part, no? Or is it due to LSPs themselves?


I suppose there are things which are just beyond the scope of LSP, like menus and other UI bits which are IDE specific, not to mention refactoring tools etc.


That would be on the client (the LSP support in the specific IDE) to implement though, and would then work for any language server that implemented those capabilities?

Like 'go to definition' or whatever is surely in every IDEs menu, but it can be done with LSP.


Pkl has been widely used at Apple for years. It is not a greenfield project.


When distributing LSP servers you need to compile them to the target platform. Afterwards you normally also need an extension or plugin to actually activate the LSP server. So it is possible that there is an LSP server, but they haven’t figured out the distribution yet (sharing binaries? Homebrew?).


Have you worked with IDE-native extensions? While LSP is useful, its expressiveness, power, and integration are limited.


I can imagine the answer if I were in their shoes. Say I already have something working as a plugin (or know how to make plugins vs. learning from scratch about implementing LSP), I'd rather have this out there as soon as possible and add more tooling/language support in subsequent releases.


If it was just one I would have assumed the same, but Intellij, VSCode, and Neovim..?


From https://pkl-lang.org/blog/introducing-pkl.html#editor-suppor...:

"We are also releasing two other plugins: our VS Code plugin, and our neovim plugin. Today, these plugins only provide basic editing features like syntax highlighting and code folding."


Language Server support isn't nearly as good as building a native plugin for the IDE.


Because the current extensions only provide syntax highlighting. The Neovim one uses tree-sitter which is built into NVim.


LSPs in vscode suck


Can you elaborate please? VSCode is literally the reason why LSP even exists


They turn your IDE into a distributed system with parsing and serializing overhead for every action.


Isn't that true for any editor with LSP support though, not just VSCode? And I personally consider that to be much better than the alternative solution of re-writing parts of the compiler to a different language to integrate them into the editor directly, like JetBrains does.


I was wrong. I meant Treesitter sucks in vscode


VSCode doesn't support tree-sitter at all, it doesn't "suck", it just doesn't exist


I've had a good long think about configuration languages, and after a long-term on/off love/hate relationship with schemas I think I've finally concluded that I don't want rich types in my configs, thank you very much.

I use statically-typed programming languages, and for my purposes I'd rather have a config language where the only types are strings, arrays, and hashmaps, and then push all type validation into the parsing stage.
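
A rough sketch of what I mean, in Python for brevity: the config file only ever yields strings (plus arrays and maps of them), and one parsing function turns that into a typed value or fails. `ServerConfig`, its fields, and the accepted spellings of booleans are all invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    host: str
    port: int
    debug: bool


def parse_server_config(raw: dict) -> ServerConfig:
    """Turn a strings-only mapping into a typed config, or raise ValueError."""
    try:
        host = raw["host"]
        port_text = raw["port"]
    except KeyError as missing:
        raise ValueError(f"missing required key: {missing}") from None
    debug_text = raw.get("debug", "false")

    # All type checks live here, not in the config language.
    if not (port_text.isascii() and port_text.isdigit()):
        raise ValueError(f"port must be a decimal integer, got {port_text!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port must be in 1..65535, got {port}")
    if debug_text not in ("true", "false"):
        raise ValueError(f"debug must be 'true' or 'false', got {debug_text!r}")

    return ServerConfig(host=host, port=port, debug=(debug_text == "true"))


config = parse_server_config({"host": "localhost", "port": "8080"})
```

The trade-off is that nothing flags `"port": "http"` until this function runs.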


Guess the obvious question is why don’t you want types in your config language? Pushing all the validation to the parsing stage just makes it hard to write valid config, because you only know if the config is valid when you feed it into your program.

Having the ability to pull all that validation forward, and into your IDE, means you can be told about invalid config as you’re writing it. To me the idea of only wanting to validate config at the last possible moment is a bit like advocating for only using simple text editors, and relying purely on build failures for feedback. Sure you can do it, but why would you subject yourself to that?

Pkl is interesting because it makes it possible to describe not just the types, but also valid configs. Effectively allowing you to ship your application’s config parsing and validation into Pkl, so your code can just expect valid config. Then Pkl can expose all that parsing and validation logic into your IDE, so you get pointers as you type. Just like you do in any modern IDE for any modern language.


> Guess the obvious question is why don’t you want types in your config language?

The disadvantage of typed configuration languages is they make assumptions about the format of the data. For you a "date" type might mean ISO 8601, but for me it might mean RFC 3339. Config languages that make assumptions are coupling the language and schema validation together. The alternative is to decouple them: offer a flexible configuration language with a separate schema language. The latter would let you define the schema specifically for your data for validation.


> For you a "date" type might mean ISO 8601, but for me it might mean RFC 3339

A generic date type doesn’t come with any specific string format. ISO 8601 and RFC 3339 are both ways of representing a date as a string. Which has little to do with Date as a type.

There are also perfectly good solutions to those problems. Use a type alias to create a date type backed by a string, with formatting constraints (such as a regex). Then people can define dates using any string representation they want!

Incidentally this is exactly what Pkl lets you do. You can use Pkl schema to define in detail how you want to validate your data using various primitives like regex. And then separately create a config template that uses those custom data types. As a dev you can choose how tightly bound your config template is to a data validation schema, define all the validation in line, or import an external module to provide you with richly validated data types.
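
To illustrate the general idea outside of Pkl, here is a small Python analogue of "a date type backed by a string with a regex constraint". This is not Pkl syntax. `DateString` and the `YYYY-MM-DD` pattern are assumptions for the example, and a different project could swap in whatever representation it prefers.

```python
import re
from typing import NewType

# A "date" here is just a string that has passed a format check.
DateString = NewType("DateString", str)

# Assumed format for this example: YYYY-MM-DD (shape only, no calendar checks).
_DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)


def date_string(value: str) -> DateString:
    """Return value as a DateString if it matches the pattern, else raise."""
    if not _DATE_PATTERN.fullmatch(value):
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return DateString(value)


release = date_string("2024-02-01")
```

The constraint lives with the alias, so every field declared as `DateString` gets the same check.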


> ISO 8601 and RFC 3339 are both ways of representing a date as a string. Which has little to do with Date as a type.

Tell that to the TOML authors [1].

It's good Pkl has string validation, but what if I don't want a string? What if I want my own literal datetime syntax, like TOML? The grammar for a configuration language could be made flexible enough to accept most anything as a value. In such a design the "string" type itself could be a regex rule defined by a schema.

Keep in mind datetime literals are just an example, the actual number of potential types is unbound.

[1] https://toml.io/en/v1.0.0#offset-date-time


This choice in TOML is a mixed bag. I would say it's substantially mitigated by the fact that strings must be quoted, so you don't have strings magically turning into datetimes if they look like a date. But it does add substantial complexity to an otherwise simple language.

The near-compatibility of ISO 8601 and RFC 3339 is a rich source of bugs, but that's hardly TOML's fault, and the standard is perfectly clear that the latter is used. TOML, like many configuration and data transport languages, is defined in terms of its syntax, so an abstract Date type doesn't make sense for it.

Providing a datetime format for TOML was probably the right decision, I think there are more people who complain about JSON lacking one than there are who complain about TOML having one.


Even more reason to standardize the format used in configurations and to validate it early, rather than at runtime.


Seems like someone should create a new datetime standard, documenting the safe intersection of ISO 8601 and RFC 3339. I did find this useful comparison: https://ijmacd.github.io/rfc3339-iso8601/
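
As a rough idea of what such a profile might look like, here is a Python check for one conservative shape that, as far as I understand the two documents, both accept: extended format, uppercase `T`, seconds always present, and an explicit `Z` or numeric offset. This is my own assumption-laden subset, not a standard, and it only checks the shape (it does not reject month 13, for instance).

```python
import re

# Conservative shape: YYYY-MM-DDTHH:MM:SS[.fraction](Z|+HH:MM|-HH:MM)
_SAFE_DATETIME = re.compile(
    r"\d{4}-\d{2}-\d{2}"
    r"T\d{2}:\d{2}:\d{2}(?:\.\d+)?"
    r"(Z|[+-]\d{2}:\d{2})",
    re.ASCII,
)


def is_safe_datetime(text: str) -> bool:
    """True if text fits the conservative shape described above."""
    match = _SAFE_DATETIME.fullmatch(text)
    if match is None:
        return False
    # RFC 3339 gives "-00:00" a special meaning that ISO 8601 does not share,
    # so this sketch rejects it.
    return match.group(1) != "-00:00"


ok = is_safe_datetime("2024-02-03T10:15:30Z")
```

Things like a space instead of `T`, lowercase `t`/`z`, or the basic format without separators are deliberately rejected, since only one of the two specs allows each of them.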


One argument I might put on the cons side of AOT type validation for configs is that there will always be some invalid inputs that can't be statically checked (e.g. port number already taken), and not failing on simple type errors before runtime helps keep whatever feedback channel exists for runtime failures in mind and in view. I wouldn't consider that a winning argument, but it's not entirely without merit.


That’s not a reason for giving up on all config validation before runtime. Just because we can’t solve a problem in every possible situation doesn’t mean we shouldn’t solve the problem for any situation.


duplicate port number can be checked in the type system if you’re using racket, at least


Duplicate port number in the config might be checked ahead of time, but port number already taken by something unrelated in the environment deployed to can't. I'm sure that the scope of pkl isn't intended to be setting up clean slate containers and nothing else, ever.
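
For illustration, this is the kind of check that only makes sense on the target machine at startup. A minimal Python sketch, assuming TCP on the loopback interface; `port_is_free` is a name I made up, and the check is inherently racy since another process can grab the port right after it returns.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


# Occupy an OS-chosen port, then run the check while it is still held.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))
holder.listen(1)
taken_port = holder.getsockname()[1]
free_while_held = port_is_free(taken_port)
holder.close()
```

No schema can know what `holder` stands in for here, which is the part that stays a runtime failure.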


My take on this is that there is no obvious reason not to, but it just so happens that typed configuration languages are not rich enough and not integrated enough to be that useful.

Those languages that arrived with the JSON hype train like yaml or toml might be great for dynamic languages, where you can load them to some native object. But in statically typed languages you are gonna declare your types, in code, anyway. So configuration providing types doesn't really do much.


Ah! Well this is the hole that Pkl does a very good job of filling!

Being able to use Pkl code-gen to create language bindings means you can take any arbitrary Pkl schema and turn it into native structures in your language, typed based on the Pkl schema. Then you can let Pkl do all the heavy lifting of parsing and validating a specific config, and turning it into native objects in your language.

So no need for double declaring types. Declare them once in Pkl, generate the equivalent types in your language, then just start using those types like any other primitive. The Pkl bindings will handle the loading glue for you.


If you replace "Pkl" with "XML", this is all exactly true for XML. Ten years ago we were generating C# classes, typed validators, and automatic parsers from XSD schemas, with automatic IDE integration and IntelliSense completions when editing the XML documents--is this just XSD for the younger JSON generation? I shipped multiple megabytes of complex manually-written XML configuration this way and it was delightful. We never would have pulled it off without XSD.


What you're expressing here is that many of the ideas from that period of xml-centricity were quite good and useful!

But xml itself was not a good language for this, because its legibility is terrible. It's just not a good format for human reading and editing. (But it also isn't a great format for machine interaction either...)

So yeah, I see it as a good thing that this seems to be able to do all that useful stuff you were doing with xml and xsd a decade (and more) ago. But it's (IMO) a much nicer way to do it.


To be fair if your config is just a structure with strings then you declare your types only once, too. Minus the codegen, but also minus the editor integration.

I'm not hating on Pkl here, we deserve better in this space, so I'm happy with more developments.


The whole _point_ of Pkl is that it is both rich enough and integrated enough though?


Oh, I didn't comment on the Pkl, just on the status quo. My bad for not making that clear.

"Enough" is the keyword here, time will tell I guess.


Yaml predates toml by ten years or so _and_ has an extensible schema to define types.

Sadly, nobody ever cared about that.


Yeah, generally, you want to validate as early as practical... catching problems sooner is better than later.

I think the problem might be separation of concerns...

pkl comes in early, and by design is separated from your app and the details thereof. It seems good for validating high-level or external configuration constraints. But suppose you have some constraints based on implementation details of your app. Now you face a choice: eschew pkl and put that validation logic in the app where you lose the benefits of early validation or put it in pkl (if that's even possible) which implicitly makes it dependent on application implementation details. Of course, we devs aren't great at stopping to consider the deep implications of each config param we add, so which one happens in practice in each case probably depends on the dev or dev group and how much they've bought in to pkl... some will pretty much always add a config to pkl because that's what it's there for, while others will ignore pkl whenever they can. I think this is inherent in the ambiguity the choice presents. There's probably a right choice in each case, but devs will only sometimes be that careful about the location of each config validation.

That's my guess anyway, as to why the previous post wants to just put all the validation at the level it's used. If that's your rule the ambiguity is resolved and it works perfectly for config dependent on specific app concerns and pretty well for config that also has high-level or external concerns, since those are less volatile and when they do change, it generally implies app changes in any case.

My gut says pkl is over engineered for the vast majority of cases and people should not reach for it unless they have a specific problem that it will solve for them.


Hmm isn’t pickle designed for exactly this use case? External config module deps you pull for overall config validation.

E.g. so you always write valid k8s manifests.

And then you can extend them with your own additional validation rules for what you think your app needs? I’ve just skimmed the docs but it seems it allows you to be as loose or as precise as possible, plus packaging and publishing those rules for others to use.

Seems kinda awesome.


Fixing the app implementation details so that the configuration stays "clean", or at least forcibly documenting them so that the configuration can be written correctly, is vastly better than allowing the app to make undocumented surprises with its private validation.


I am convinced the Pkl config will grow in complexity until it has a yaml or json config for the configuration program.


Not in my experience.


> Guess the obvious question is why don’t you want types in your config language?

Where do you store the schema?


Because with a real programming language you get an actual IDE, auto complete, a debugger, sane compiler errors instead of a vague helm error "invalid thingy at line 4", you can log a bad config the same way you log stuff for the rest of your program and you can't guarantee your config is valid anyway if your config language can't see what class you're going to feed it to.


That’s a consequence of configuration languages not having proper type systems and robust ways to manipulate data.

Helm isn’t a configuration language, it’s a dressed up string templating system with ability to do kubectl apply.

> you can't guarantee your config is valid anyway if your config language can't see what class you're going to feed it to.

Obviously, but that’s hardly an insurmountable problem. We’ve had code gen and language introspection for decades.


Helm is not a proper configuration language. It just does string replace. Unbelievable that people actually use it.

A config language that does have the type information can give proper errors. Try for example terraform.


I think this is a reasonable approach if you only have one stack, and don't have a lot of config. If you have one stack, you can put all the validation, types, and everything else in your runtime application, and then you don't need to learn new languages, and everything works.

This becomes a lot more painful if your work is more polyglot. If you need to define config that needs to be shared between different applications, but they're written in different languages, you'll have a much harder time. Also, say, if you need to deploy your applications to Kubernetes, and your Kubernetes specification needs to provide config files to your application, then you'll still end up in a situation where your statically typed programming language won't help. That is where something like Pkl becomes really helpful, because you have just one place to manage all that complexity--right in Pkl itself.


I mostly agree with this, but I've been a big fan of having primitive types in config. Most of the time if I have something I want to configure, it's either one of the following (or a map/list-based structure consisting of):

- scalar value

- feature toggle

- URI/enum option/human readable display text

Having float/long/boolean is trivial to validate in the config language itself, and if they're useful and simple enough isn't it nice to be able to validate your config as early as possible?


It's nice, but it comes at a cost. For example, every user of toml forever will have to put strings in quotes. Why? Because having other types creates ambiguity, that is resolved by this one simple trick. But if you don't quote them then you have "the Norway problem" like in yaml.
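
For anyone who hasn't run into it, here is a toy Python model of the problem. It is not a YAML parser; it only mimics the rule that unquoted scalars with certain spellings become booleans, using a subset of the spellings YAML 1.1 lists.

```python
# Subset of the unquoted spellings that YAML 1.1 treats as booleans.
_TRUE_WORDS = {"yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
_FALSE_WORDS = {"no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}


def resolve_plain_scalar(text: str):
    """Toy resolver: apply only the boolean rule, keep everything else a string."""
    if text in _TRUE_WORDS:
        return True
    if text in _FALSE_WORDS:
        return False
    return text


# Country codes written without quotes: Norway stops being a string.
countries = [resolve_plain_scalar(code) for code in ["SE", "DK", "NO"]]
```

Mandatory quotes remove the ambiguity because `"NO"` can only ever be a string.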


That's my feeling too. Tools like this are trying to squeeze into the space between "straightforward configuration easily maintained in static files" and "complicated state management better served by real code in a real programming language". And that's a real hole for some applications, but it's a small hole.

Basically forcing everyone to learn new tooling (Pkl here, but lots of json/yaml middleware nonsense fits the bill too) just to deal with what really isn't that big of a problem seems like a bad trade.


The only thing I want is very basic flow control / environment based code blocks and that’s it. I think nginx has a reasonable config language


Reduce it further: strings and maps. Arrays can be represented as a map.


Strings can be represented as arrays too. Doesn't make a good argument for removing them.


Doing so would necessitate the addition of another type: character/grapheme cluster.

Representing arrays as maps would impose no additional requirements outside of validation which is already considered as part of the proposal in question.
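
As a sketch of that validation step, in Python: a map whose keys are the strings "0" to "n-1" is read back as an ordered list, and anything else is rejected. The key convention is my own assumption for the example.

```python
def map_to_array(mapping: dict) -> list:
    """Interpret {"0": a, "1": b, ...} as [a, b, ...], validating the keys."""
    expected_keys = {str(index) for index in range(len(mapping))}
    if set(mapping) != expected_keys:
        raise ValueError("keys must be exactly '0'..'n-1' with no gaps")
    return [mapping[str(index)] for index in range(len(mapping))]


servers = map_to_array({"0": "alpha", "1": "beta", "2": "gamma"})
```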


Sounds like you and the INI guy agree here and honestly I'm coming around to it because for complex types you end up typing everything twice.

https://github.com/madmurphy/libconfini/wiki/An-INI-critique...


You're saying you don't want red squigglies in your IDE when you do your configuration wrong? Why?


So like cue [0] but more primitive, less principled and in java?

[0] https://cuelang.org


Looks like it might fix three big issues I had with Cue:

1. The only way to use it is to run their Go CLI app to convert the Cue into JSON and then load that. That sucks. I want native support. Jsonnet does this a lot better (https://jsonnet.org/ref/bindings.html), and PKL at least supports 4 languages. Cue only supports Go directly. Not good.

2. Cue has a super fancy type system, but as far as I could figure out there's no way to actually take advantage of this in an IDE, which is like 60% of the benefits of fancy type systems. In a Cue document you can't say "this is the schema". XML had that decades ago (and it has awesome IDE integration with Red Hat's XML extension for VSCode). Even JSON can sort of do it via `$schema`. The docs are a bit scant but it looks like this supports it too. The fact that Cue doesn't sucks.

3. Cue is pretty much only a fancy type system. It's a really elegant and nice type system, but that's it. It doesn't even have functions. So it isn't going to help with a lot of the things that Jsonnet and PKL help with.

This is not really in the same area as Cue. It's a way more direct competitor to Jsonnet and looks better, based on my brief skim.

My only concern with these sorts of things is that they're basically a whole new programming language, but without many of the features you'd want from a real programming language. It's in an uncanny valley.

Does look nice though.


> [CUE] doesn't even have functions.

Note that CUE has comprehensions, which are morally (but not syntactically) functions (actually closures). They are a way to transform values (which can be types) into other values.

We are also adding real function types to CUE. At least in the beginning these functions will be written in other languages than CUE itself, however.

While we are very principled when it comes to language design, we are also very responsive to finding solutions to user's problems, and we welcome any sort of feedback, especially if it's backed by specific use cases and experiences.

As mentioned in another comment, support for languages other than Go is coming.


What does "morally" mean here? I've not seen that term used in compsci except when about ethics.


I believe the term comes from this paper: https://www.cs.ox.ac.uk/jeremy.gibbons/publications/fast+loo.... See section 5 which defines “moral equality”.


That is indeed the only reference I could find (and maybe Cue is inspired by this paper?), but they don't seem to explain why they use such language. They talk about "unjustified reasoning" and laws, but the connection to ethics seems questionable. But I've not read the whole paper.


A moral equivalence class is an axiomatic equivalence class such that, when you quotient some mathematical structure by it, the model in question might lose some of its soundness or adequacy properties, but which nevertheless might be useful for other meta-mathematical reasons.

This use of the word is so common in certain math/category theory/compsci communities that I was not aware it was unconventional in any way. It has nothing to do with ethics.

I guess the moral of the story is that the moral of the story can be lost if people don't understand the story.


The word choice is peculiar, but I found this [1] explanation helpful.

[1] https://eugeniacheng.com/wp-content/uploads/2017/02/cheng-mo...


Love this comment and I agree with basically everything. What are you using for configuration these days?

I've fallen back to YAML because at least it's already used for a lot of tools, and has comments, jsonschema support in VSCode giving IDE features, language library support, yamllint, and yq for formatting/querying/mass-updating from the CLI


Yeah I actually haven't found a great answer yet. Here's everything I've tried and why it sucks:

* JSON. No comments. Deal-breaker

* JSONC. No unique file extension so it's difficult to distinguish from JSON. Poor library support due to library authors drinking the "comments are bad" koolaid.

* JSON5. This would be an excellent option IMO except that library and IDE support is not great.

* JSON6. This just complicates JSON5 for minimal benefits. Pointless.

* Cue. As described.

* Jsonnet. Pretty good option tbh! However I couldn't get the Rust library to work. It's a low level parser, seems like you can't just plug it into Serde, which is what 99% of people really want. Also I ran into the "uncanny valley" effect where you can do some things but not all. So it tricks you into writing some programmatic config (e.g. with string manipulation) but then you find you can't do that string manipulation.

* Dhall. Weird syntax (backslash to declare functions). I've also heard it is slow. Didn't try this too much.

* YAML. Obviously YAML is the worst option. However I did realise you can use it as basically JSON5 except with a different comment character, which is not too bad.

* Starlark. Actually I haven't tried this yet but it looks promising.

So yeah I have no idea at the moment.

I wonder if it would be worth defining a "YAML JSON5" format, that's basically YAML-compatible JSON5.


Have you seen https://kdl.dev/ ?


Do you have 5 minutes to talk about TOML?


Ha I forgot about that. TOML is pretty awful too. It's fine as long as you only need 1 level of nesting. As soon as you need to go deeper than that you end up with [[weird syntax]] that is very not obvious. I would say it's less obvious than YAML and YAML is already pretty unintuitive.


Looks like it has better IDE integration. Still, I am going to stick with cue because of what you mentioned and also because it is a community project. Apple has very few actively maintained open source projects and sometimes such projects are difficult to contribute to or have wavering support for the open source side. It is great having corporate backing behind something like swift that needs a massive amount of work, but for cue, I am happy with steady improvements meeting the needs of a wide community once I can figure out a good IDE integration.


FWIW, we (CUE) are currently working on a LSP for CUE which should improve the IDE experience.


I really like CUE, but for most use cases I have I would want to embed it in an application, and Go is the only language with support.

For it to gain more adoption it really needs a rewrite in a low-level language (C/Rust), so it can be exposed in various languages through an extension/FFI.


Making CUE available as a library for other languages is one of our top priorities. Sadly, I can't provide an ETA at this time, all I can say is that I am personally working on this.

Getting feedback from the community about what other languages they'd want supported first would be of massive help, however.


Can we have D language support?

For those who want the justifications for CUE, this is an excellent write up.

[1] How CUE Wins:

https://blog.cedriccharly.com/post/20210523-how-cue-wins/


A C library gets you halfway to everywhere, to paraphrase a saying.


Or just a c compatible ABI. The implementation doesn't have to be in C, it could be rust or zig or c++ or nim or ...


> Getting feedback from the community about what other languages they'd want supported first

Rust and Python would be my top picks.


Because you asked for this feedback: Rust


Thanks for working on it. Useful stuff.


I'd like support on the JVM, in addition to rust which was already mentioned in other comments.


.net would benefit from it.


This is honestly my only complaint about Cue after having used it for a few weeks after looking at everything else in the space (KCL, Jsonnet, Dhall, etc.). Cue is incredible imo and the commenter above talking about not being able to define the schema vs. the data is sort of missing the point - Cue makes them the same thing in a way that really understands the whole role of config languages and IMO is way better for it. When you start to really understand what a config language _should_ do, Cue is the only option, and most attempts to dismiss it seem like hand-waves in order to push some other preference.

However it only being in Go and not implemented with some C ABI is a major downside for adoption, especially when their documentation itself for implementing the core CLI functionality in a go program (to then compile to a DLL for use in non-Go land) is pretty sparse.


Thank you for your comment. We're working hard to add support for other languages. I agree that exposing a Go library as a C shared object is non-trivial and rough around the edges. We are committed to polishing these edges.


I've been somewhat surprised that CUE bills itself as "tooling friendly" and doesn't yet have a language server- the number one bit of tooling most devs use for a particular language.

I'm assuming it's because CUE is still unstable?

Anyway, if others are interested in CUE's LSP work, I think https://github.com/cue-lang/cue/issues/142 is the issue to subscribe to


Tooling friendly can mean different things to different people. Similarly, different groups of people have different priorities.

It has always been clear that LSP was high priority, but we have much other high-priority work that also needs to be done. Most of the work that we do is driven by feedback and demand from the community.

Additionally, we want to do the LSP right instead of quickly hacking something together. That requires more work than one might think.

While CUE has not reached 1.0 yet, people definitely use CUE in production and we work hard not to break any of their code. I can assure you LSP is missing simply because we had other things to tackle first and not because the language is unstable in a colloquial sense.


That’s what I’m seeing as well. Curious to try it out to see how its expressiveness compares to Cue. Looks like it’s Turing-complete as opposed to Cue, which is a plus… but that comes with downsides.

One thing I like to see is the direction of “declare types and validations in a single place, integrate with any language”.

My daily codebase atm has type declarations in typescript, cue, pydantic and for our database… and it’s driving me bonkers seeing as most types are already declared in Cue to start with. I played a little with packages meant to translate them i.e. Cue -> TS, but nothing worth the effort.

IMO it would be a big upside for Cue to handle these as first class language features.


What advantages does Turing-completeness provide for a configuration language?


There are three (maybe more?) ways things can be Turing-incomplete:

1. You are limited to N evaluation/reduction steps.

2. The language doesn't include primitives like recursion or loops.

3. You can have recursion or loops, but the language makes you somehow prove that your program will terminate.

I think (1) would be fine, but I don't know any configuration languages that use this approach.

(2) is restrictive/annoying whenever you want to implement any logic in the config language library. Eg. a tool uses a homegrown data format BAML and you need to convert JSON to BAML in the config. Now either you have to write and manually call a preprocessor, or you need to use a patched version of the <config language evaluator> that will have JSON->BAML as a built-in, or you must implement JSON->BAML without loops or recursion. For a more realistic example, imagine that a certain config string has to be HTML-escaped and the config language doesn't provide a built-in for that purpose.

(3) -- you don't want it. There are languages (like Agda) that let you prove things like "this program terminates", but writing those proofs can be harder than writing the program itself.
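
To show what (1) could look like, here is a toy step-budget runner in Python. It is not taken from any existing config language; `run_with_budget` and the two step functions are invented for the example. The user logic is an arbitrary step function, and the evaluator simply refuses to run it more than N times.

```python
class StepLimitExceeded(Exception):
    """Raised when the evaluation budget runs out."""


def run_with_budget(step, state, max_steps):
    """Apply step until it reports completion, spending one unit per call."""
    for _ in range(max_steps):
        done, state = step(state)
        if done:
            return state
    raise StepLimitExceeded(f"gave up after {max_steps} steps")


def halve_until_odd(n):
    # Finishes quickly for any positive integer.
    if n % 2 == 1:
        return True, n
    return False, n // 2


def spin_forever(n):
    # Never reports completion; only the budget stops it.
    return False, n + 1


result = run_with_budget(halve_until_odd, 96, max_steps=100)
```

Calling `run_with_budget(spin_forever, 0, max_steps=100)` raises `StepLimitExceeded` instead of hanging, which is the whole point of the approach.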


I think the C preprocessor is an interesting example of (2), because the metaprogramming community has converged on an extremely clever paradigm to circumvent the lack of recursion: continuation machines. By defining a linear number of “continuation evaluation” macros, you can generate an exponential number of “recursive” macro expansions, which trivially scales to the point that it could take until the heat death of the universe for an arbitrary program to terminate, but a program can choose to terminate at any time. The Chaos-pp and Order-pp projects are good implementations of this!


I think 2) seems incorrect. What you can’t have is unbounded loops and recursion. Bounded loops are perfectly fine and I don’t tend to need unbounded ones when programming (with exceptions being infinite loops for handling asynchronous events, which a configuration language doesn’t need to do).

Recursion is trickier. I think banning it or simply limiting stack depth seems fairly reasonable? In fact I’m pretty sure most Turing-complete languages have a stack depth limit, so unbounded recursion is not allowed for those either. I don’t see a limit being a problem, because again this is a config language.

I don’t see why HTML escaping needs Turing-completeness. It shouldn’t need any unbounded iteration (it should be limited to the size of the input string) or unbounded loops. In general, I can’t think of any typical data processing code where Turing-completeness is required, but could be wrong. Do you have any practical examples of transformations that need unbounded iteration?


> I don’t see why HTML escaping needs Turing-completeness.

First of all, let's avoid "Turing-completeness" because then we might start arguing about whether a language with unrestricted recursion is or isn't Turing-complete since there are stack depth limits / memory limits / universe will end one day / etc.

I would phrase this question as "why would HTML escaping need unrestricted recursion or loops" -- since in practice config languages either have unrestricted recursion or loops (Nickel), or they don't (CUE, Dhall, Starlark).

For HTML escaping specifically, just having `.map(f)` and `.concat` available (in functional languages), or `for char in string` (in imperative languages), would be enough.
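
In Python terms, the bounded version is just a map over characters followed by a concatenation. The escape table below is the usual five-character set; treat it as an assumption if your target context needs something different.

```python
_HTML_ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
}


def escape_html(text: str) -> str:
    """Escape by mapping each character, then concatenating the pieces."""
    return "".join(_HTML_ESCAPES.get(char, char) for char in text)


escaped = escape_html('<a href="x">Tom & Jerry</a>')
```

The loop runs exactly once per input character, so termination is obvious from the structure alone.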

For something like HTML un-escaping, it's already trickier. If you are using recursion, your language needs to understand the concept of the string becoming "smaller" at each step. If you are using loops, `for ... in ...` is not enough anymore.

An even trickier example would be mergesort:

  merge(xs, ys) = ...

  mergeSort(xs) =
    let len   = xs.length
        left  = mergeSort(xs.slice(0, len/2))
        right = mergeSort(xs.slice(len/2, len))
    in merge(left, right)
It might seem obvious that this should terminate, because of course `.slice` will return a smaller array, but the actual termination proof in Agda is already something I wouldn't want in my config language: <https://stackoverflow.com/a/22271690/615030>

(Not to mention that this implementation is faulty and will loop at len=1.)
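
For completeness, a Python version with an explicit base case, which avoids the loop at len=1. Python accepts this without any termination proof; a total language would still need to be convinced that both halves are strictly shorter than `xs` once `len(xs) >= 2`.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(xs):
    # Base case: lists of length 0 or 1 are already sorted.
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    return merge(merge_sort(xs[:mid]), merge_sort(xs[mid:]))


sorted_values = merge_sort([5, 2, 9, 1, 5, 6])
```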

Limiting stack depth at [arbitrary number] -- this is similar to (1). I don't know why configuration languages don't do it, to be honest.


I think there is another option:

2a. The language includes limited primitives for recursion or loops.

If that’s done right, somehow proving that your program will terminate becomes trivial.

For example, allowing looping over a previously defined array with (key,value1) pairs to generate many more complex definitions that include common value2, value3, etc fields trivially guarantees termination, but a generic “while” loop doesn’t.
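
A small Python sketch of that kind of restricted loop. The service names, ports, and shared fields are invented for the example.

```python
# Shared fields (hypothetical values) that every generated definition includes.
COMMON = {"region": "eu-west-1", "replicas": 2}

# The previously defined array of (key, value1) pairs.
SERVICES = [("api", 8080), ("worker", 9090), ("admin", 8443)]


def expand(services, common):
    """Build one definition per input pair; iterations are bounded by len(services)."""
    return {name: {"port": port, **common} for name, port in services}


definitions = expand(SERVICES, COMMON)
```

The only loop is over an already-finite list, so there is nothing to prove about termination.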

That will make your language less powerful, but IMO shouldn’t be a problem for a configuration language.

In this example, I’m not sure you would even need that, as the language has ways to share common config values.


See my examples with html un-escaping and mergesort, down the comment chain.

Limited recursion/iteration is ok if all you need is to fill existing values into a template and possibly reduce repetition.

But in a large system with many components I might want to take a single timestamp, parse it, and generate timestamps in five different formats for five different subcomponents.

Or I might want to generate a piece of yaml config for an Ansible playbook that needs it, and now my config language needs to know how to escape yaml strings.

Or a config for a static site generator needs to be able to calculate a hash of a local css file because I’d like to use links like `style.css?hash` (a popular cache-defeating mechanism).
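
In a general-purpose language that is a few lines. A Python sketch, where the choice of SHA-256, the 8-character truncation, and the helper name are all my own assumptions; it takes the file content as bytes so the example does not depend on a real file.

```python
import hashlib


def cache_busted_link(filename: str, content: bytes, length: int = 8) -> str:
    """Append a short content hash to the link, e.g. 'style.css?<hash>'."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    return f"{filename}?{digest}"


link = cache_busted_link("style.css", b"body { margin: 0; }\n")
```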

Or a certain path has to be split on “.” and reversed (Java-style com.example.blah things).
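
Again trivial with ordinary string functions, shown here in Python:

```python
def reverse_dotted(path: str) -> str:
    """Split on '.' and join the parts in reverse order."""
    return ".".join(reversed(path.split(".")))


reversed_name = reverse_dotted("com.example.blah")
```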

Or a Unix path needs to be converted to a Windows path, except [some special-cased paths that live in a map in an adjacent config].

There are endless reasons to want arbitrary logic in my config files, beyond reducing repetition. A lot of things I’ve listed are provided as primitives in various config/templating languages, but you always end up stumbling upon something that’s not provided.

Of course, one could say “You should use a real programming language for this kind of stuff”, and I’m happy that the JavaScript ecosystem is converging on allowing .js/.ts files for configs, because that’s exactly what I want too. But I’d like to have the same features available in projects that aren’t allowed to touch JS.


Many data transformations that you take for granted in other languages are either impossible or require amazing feats of contortion of the language to make happen.


Javascript/typescript don't have introspection or any autogen between static and runtime types either.

Cue is not a general-purpose language, with emphasis on "not", because that's a good thing.

Asking for upstream embedded support feels like asking for a bash interpreter: why would you need it in the first place?

It's based on a completely different, logic-based paradigm. Use it as it is meant to be used: as a top-level configuration-aiding language. Declare policies and generation in it, and interface with other languages/tooling through input/output json/yml.


I think everyone appreciates links to similar projects for comparison, but a more in-depth comment would probably come across better - the "less principled and more primitive" sounds like a thoughtless off the cuff ad hominem dismissal.

Consider that some engineers poured a lot of heart into what they were building, and are probably excited to finally share it with the world.

I am not saying you have to love it, but just brutally putting it down with no justification seems really rough. Snark is easy.


I wish the Cue docs were better. I arrived here https://cuelang.org/docs/usecases/configuration/ but it doesn’t answer basic questions like “can I define my own validation functions (the > < != operators used in the example)?”.


Yes. I am sure it’s that simple. I’m sure that there are all downsides and no upsides. This is the first time in history where one technology is a strict superset of a competing technology, from all perspectives. /s

I really don’t know why this snark is necessary.


Primitive and less principled doesn't imply "downsides". Go is more primitive and less principled than Haskell, yet it's useful and oftentimes better due to its primitiveness. Cue is written in Go for example.


In a competition with sky/starlark, I feel skylark would win here. “Safe subset of python” is what a lot of people presented with this problem want, and skylark gives them almost exactly that.

OTOH, curious to see what advantages Pkl gains from not having the constraints of maintaining familiarity with another language.


Starlark seems to be overwhelmingly bound to Bazel at the moment—searching for it, I had to follow a link from Bazel to the GitHub repo and then from there I got to the implementations and found this:

> The implementations below are not fully compliant to the specification yet. We aim to remove the differences and provide a common test suite.

This does not inspire confidence that I could use this in a project any time soon.

Meanwhile, from what I can tell Pkl has a single Truffle implementation that currently supports 4 languages, it has a syntax that is more familiar to me as a non-Python dev, it has static typing, and it has a dedicated plugin in most IDEs (whereas Starlark just says to install the Bazel plugin). Maybe Starlark is more appealing to people writing Python or already using Bazel, but for the rest of us there's no contest right now.


The implementations and users page mentioned above:

https://github.com/bazelbuild/starlark/blob/master/users.md


Never used Bazel in my life, so while I can appreciate your passion, I guess I don't share your perspective. Generally the pattern I've seen has been providing a skylark interface to allow folks to define rules or configurations, which are then consumed by whatever service via starlark-rust or similar implementations.


Copybara uses it


Step two of installing Copybara is to install Bazel [0], so that doesn't exactly contradict my claim that if you're not already using Bazel you probably won't use Starlark.

[0] https://github.com/google/copybara


Bazel is just the build system used to build Copybara. You don’t need to have Bazel in your system to use an already built copy of Copybara


Yes, but:

> Copybara doesn't have a release process yet, so you need to compile from HEAD.

Looks like there's an Arch Linux build maintained by... somebody, but if you're not on Arch then you're going to be building Copybara with Bazel. That this works for them suggests to me that their community has a significant amount of overlap with the Bazel community, so it's not good evidence of Starlark being used outside of the Bazel world.


(I've been the main designer of Starlark)

Lots of projects developed at Google use Starlark. Copybara is one of them. That's where the connection comes from.

Many other companies are also adopting Starlark for their own needs. For example, Meta has invested a lot in Starlark and published their implementation (https://developers.facebook.com/blog/post/2021/04/08/rust-st...), although they don't use Bazel at all.

Starlark was first created for Bazel. The organic user growth comes from people who have seen and used the language, so often Bazel users. But it doesn't have to be.


Agree - one of the things we've found using Starlark at Kurtosis is even the small jump from Python to Starlark makes people think, "What's this Starlark? I don't want to learn a new language", and we have to show them how not-a-new-language it is. I can't imagine bringing a truly new language like Pkl to folks and having them enjoy it.


I thought the same thing with the Godot game engine's GDScript. Aside from a few class-level implementation details (IIRC, it's been a while) it's essentially Python, syntactically. "Ugh... If I'm going to learn a new scripting language it's not going to be application-specific... Oh NM."


My employer uses a combination of Protocol Buffers (for the config schema definition) and Bazel/Starlark (for concrete instantiations). Configs are validated at build time and runtime using CEL (https://github.com/google/cel-spec).


Trivia note, the bazelbuild starlark readme example shows a rare correct implementation of FizzBuzz, with no unique case for "FizzBuzz".

https://github.com/bazelbuild/starlark
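
For readers who have not seen the trick, a Python sketch of the idea (written for this comment, not copied from the Starlark README): the output string is built by concatenation, so multiples of 15 need no dedicated branch.

```python
def fizzbuzz(n):
    """Return the FizzBuzz word for n, or n itself as a string."""
    out = ""
    if n % 3 == 0:
        out += "Fizz"
    if n % 5 == 0:
        out += "Buzz"
    return out or str(n)   # empty string is falsy, so fall back to the number
```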


Wow, I was at Apple back in the 2018 timeframe when Peter was first building this. He was hoping to make it open sourced even back then, 6ish years ago. Great to see that it finally made it.

I really wish Apple would learn to play nicer with the OSS community. I have yet to see them deciding to open-source something backfire on them monetarily or reputationally, and I've seen the act of them abruptly close-sourcing things sour community opinion (i.e. FoundationDB).


Hey, I recognize this name :D

Yeah, it's been a long time coming, and it feels great to finally get this out in open source.

FDB is open source too, BTW: https://github.com/apple/foundationdb


How does this compare to HCL (Terraform)?

It has about exactly the same feature set. Declarative config, type definitions, data validators, reusable modules, variables, transforming functions, loops and other repeat primitives, reading external data like files and envvars, output and input json or yaml, IDE integration, you name it.

Much under-appreciated language btw. I hardly see it used outside TF. Here I think Pkl has an advantage in gaining adoption in applications, by generating types for the application code. Otherwise it will just stay a mystic item in the admin's toolbox that others consider overkill.


I was a big believer in Helm for generating Kubernetes resources until I looked up and saw that we had created an impossible-to-validate, impossible-to-reason-about DSL in our values.yaml and that's when I realized we were at the end of our rope with Helm. We switched to Pkl for our Kubernetes resource generation -- it's delightful to maintain and reason about our deployments before execution time :)


After spending 5,000+ hours in helm charts, I still hate helm.

I've looked at cdk8s but I hate the tsconfig culture, why can't it be simple like Go?


Try Deno with CDK8s! It's such a fantastic experience. Here's a repo for my home server: https://github.com/shepherdjerred/servers/tree/main/cdk8s


This is the use case I am most interested in as well, templating strings into yaml always felt super clunky to me.


Curious if you have experience with terraform/open tofu/hcl for k8s?


I'm having a little trouble understanding the problem(s) Pkl is trying to solve.

After reading the title, my assumption was that Pkl was yet another newer, better configuration language (a la TOML), but now that I've read the article, it sounds like it's more a language for _generating_ config.

Unless I'm mistaken, it sounds like an abstraction on top of your config files meant to help you build & re-use configuration in a more standardized way, rather than yet another config language unto itself.

A problem space I'm familiar with is having a bunch of Terraform or Cloudformation configuration you want to share/repeat in multiple projects. Doing-so can get hairy quickly, as the path of least resistance is to copy-paste a bunch of config you barely understand from some other project, and then perform trial-and-error surgery to find and change a couple of lines to suit your project.

Is Pkl designed to help address that sort of problem? Or am I missing something?


> A problem space I'm familiar with is having a bunch of Terraform or Cloudformation configuration you want to share/repeat in multiple projects. Doing-so can get hairy quickly, as the path of least resistance is to copy-paste a bunch of config you barely understand from some other project, and then perform trial-and-error surgery to find and change a couple of lines to suit your project.

Yes, Pkl is meant to solve this problem. It's a single place for you to configure all your targets. Within the same codebase, you can generate static configuration, and also import the same Pkl source files into a runtime application, so you don't have to copy/paste things around.


Thank you so much for the explanation!

I see much more clearly how something like this could be extremely useful.


Actually terraform is supposed to address that exact problem itself.

Instead of copy-paste of json files or aws resources, you can write a terraform module to generate it.

If you need to copy paste a large chunk of terraform module it is time to schedule refactoring.


We do have a degree of abstraction through terraform modules, but I've found that the same copy-paste problem applies to the terraform that composes those modules together.

This is possibly (if not likely) moreso a result of creating our terraform modules in a suboptimal way due to insufficient expertise than a shortcoming of Terraform itself.

It is also largely a result of having a backlog of scheduled refactors that is longer than I'd care to admit.


It solves exactly the sort of problem you are describing, yes! I think the discussions here focus too much on the language bindings, but maybe I’m missing that point as well.


I might sound like heretic but as someone who always keeps an eye out for configuration systems (I've tried edn, raw json, dhall, cue, hcl to name a couple) over the years, I'm sticking to jsonnet + jsonschema. Some reasons in favor of what might seem like an antiquated / type-poor system:

  - jsonnet fixes almost all of the superficial complaints people have about json (no comments, invitations to inconsistent layouts, no composition, no functions)
  - jsonnet has a very handy formatter and has trailing commas (simple diffs)
  - jsonnet can import jsonnet as well as json so you can "refactor" your configs using code and/or plain data
  - json is everywhere and nearly every language has a parser in its standard library
  - jsonnet is not turing complete; I consider this a huge plus as effectively you are always operating in "data space"; everything is a shape transform, nothing more, nothing less
  - you can do further slicing and dicing with other mature tools like jq, jc, yq, gron, whatever
  - your outputs being plain old json, leverage whatever json schema ecosystem tools you have
  - json schema being old, you have lots of codegen tools available
  - the jsonnet library has go, python, node, C++ bindings
  - super easy to learn and run interactively
The biggest thing that's sorely lacking in this ecosystem is whatever jsonschema doesn't support in its spec for validation, like complex XOR relationships. Sometimes I wish these were declarable in data space, but on the other hand, configs with complex relationships like these often have business code backing them. Another weakness is that if you have a large anthology of schemas/templates/source data, you need to figure out a management method and hierarchy yourself.

Maybe pkl has a nice answer to these but J+J is really quite robust. I'd even go further and say it's beneficial to adopt a schema-first mindset and make it a habit to generate schemas. These tools are so lightweight and ubiquitous, it makes quick work for cranking out a schema and validating everything as you go.



> Pkl — pronounced Pickle — is an embeddable configuration language which provides rich support for data templating and validation. It can be used from the command line, integrated in a build pipeline, or embedded in a program. Pkl scales from small to large, simple to complex, ad-hoc to repetitive configuration tasks.

I do like the sound of that. It's always quite tedious to manage configuration in full-stack applications with mixed languages/ecosystems. It seems they already have Pkl plugins for IntelliJ, vscode and neovim and a language server is "coming soon".


Pkl, pronounced like pickle?

Pickle in Python https://docs.python.org/3/library/pickle.html

Tcl pronounced tickle. https://www.tcl.tk/

PCL, was it ever pronounced pickle? https://en.m.wikipedia.org/wiki/Printer_Command_Language


I was wondering whether anyone would point out the Python pickle, which is often used for saving trained ML models in .pkl files.

I mean, at this point almost every memorable 3 letter file extension probably already has some claim to it, but still, it would have been nice if Apple had done some research and made this more unique.


I'm curious about the language "bindings" for reading Pkl from Java, Kotlin, Swift, and Golang. At first I thought these were completely independent implementations of the language, since they read from Pkl source and not some intermediate format.

However, the CLI docs talk about a "pkl server" mode that is used by the bindings. So it looks like there is a single implementation (written in some JVM language?) that is run in a subprocess. I wish there was more documentation about how this works under the covers.

https://pkl-lang.org/main/current/pkl-cli/index.html#install...


At some point, we'll publish more documentation about this, including instructions for how to build your own language binding.

And, it's only a sub-process right now, but we plan on also providing a C library as another way to bind to Pkl.

But if you want to learn more about how this works, feel free to connect with us on GitHub! https://github.com/apple/pkl/discussions


Not to be confused with “pickle”, the Python object serialization format…


Where "format" is used in the lightest way possible. (NEVER try to unpickle anything that was not produced by pickle itself (preferably the exact same version))


Or GNU poke (https://www.gnu.org/software/poke/) source files which are called "pickles".


Or PECL (pronounced Pickle), the old PHP package manager superseded by Composer.


My first thought too :(


I'm sorry, can someone explain why one would want to translate from one data description language to another (Pkl -> JSON, or whatever)? Why not just write JSON (or whatever) to begin with?


Json lacks comments, templating, evaluation, types, and lots of other features. Having something that generates a valid config is great, especially if you're generating multiple configs in different formats from the same place. For example, being able to configure SSH, nginx and some other services from a nixos config is amazing.


Well, Dhall provides something between JSON and a Turing complete language that can make a lot of configuration much quicker to write, if you can hack the functional syntax. Pkl is probably a similar concept.

http://dhall-lang.org/


I love Dhall. It's purposefully not Turing complete (which is somewhat difficult to achieve in a language design) and I love that fact.


Because there are a thousand ways to describe the same thing in JSON. This:

    city = {
        "id": 3,
        "name": "Foo",
        "lat": 3.555,
        "long": 4.11,
    }
Is the same as:

    city = {
        "id": "3",
        "name": "Foo",
        "coordinates": [3.555, 4.11]
    }
But you want to normalize that.
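
As an illustration only, a Python sketch of what normalizing those two layouts into one canonical shape could look like by hand. The function name and the choice of canonical fields are assumptions.

```python
def normalize_city(raw):
    """Accept either layout above and return one canonical record."""
    if "coordinates" in raw:
        lat, long = raw["coordinates"]      # [lat, long] list layout
    else:
        lat, long = raw["lat"], raw["long"] # separate-field layout
    return {
        "id": int(raw["id"]),               # "3" and 3 both become 3
        "name": str(raw["name"]),
        "lat": float(lat),
        "long": float(long),
    }
```

Every consumer would need a copy of this logic, which is the repetition a schema-aware tool is meant to remove.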

Something like Cuelang will:

- Define the schema that will let you know what you should put inside.

- Allow you to inflate that schema into a full file, providing only the values.

- Will always generate a correct file, with no typos or format errors.

- Will check that the data is correct and tell you if there are any errors.

- Is not theoretically tied to a particular run time or stack.


Sounds like a job for JSON Schema

https://json-schema.org


Jumping back and forth between a JSON schema and writing valid JSON is not a very fun task


In your example, how is this different than incorrectly creating the data structure in any format ?

I'll be looking into cue, but how does it solve that problem ?


In general (not just limited to Pkl), I think the advantage is that you get IDE support like autocomplete and compile time checks. Pkl seems to borrow some features from JSONSchema/SHACL for example where one can also add validations like "value must be bigger than 20 and lower than 100" so when you configure a component incorrectly, it can throw a good error message before deployment.
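
For comparison, a sketch of that range check written by hand in Python. The class and field names are hypothetical; a language with built-in constraints would express the same rule declaratively.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ComponentConfig:
    threshold: int

    def __post_init__(self):
        # Fail at construction time, i.e. before anything is deployed.
        if not isinstance(self.threshold, int):
            raise TypeError(f"threshold must be an int, got {type(self.threshold).__name__}")
        if not 20 < self.threshold < 100:
            raise ValueError(
                f"threshold must be bigger than 20 and lower than 100, got {self.threshold}"
            )
```

`ComponentConfig(threshold=50)` is accepted, while `ComponentConfig(threshold=5)` raises a `ValueError` naming the offending value.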


Writing raw JSON is very error prone and you have to repeat yourself a lot. I think everyone who has worked with it has had some surprises, I certainly have.

Similar to why do we write Python instead of assembly, or why do relational databases typically have things like datatypes and constraints?


The amount of problems I've had with JSON in my career makes me think almost anything could be better than it. There's so many weird edge cases in the JSON spec that you can hit that it just becomes endless levels of hair pulling.


Have you tried YAML? :-)


NOrway, I haven't.


Can you share some examples?


which edge cases?


Just scroll a little further. It’s not just another syntax for config files.


Because Pkl makes it trivial to write templates and transforms. So you can write a Pkl schema that only requires a minimum set of fields, then auto generate a complete configuration.

This is most useful when dealing with tools like k8s where deploying a single application might involve 3-10 separate manifests (Deployment, Service, NetworkPolicy, HttpRoute, Autoscaler etc etc). With Pkl you can easily write a simple template that only requests the minimum needed to define an “app” (e.g. name, namespace, mixins for sidecars) and have Pkl generate all the needed manifests for you.

Really Pkl should be seen as a language for quickly building templating tools like Helm. But with type safety by default, and no need for horrible indent hacks.
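
To show the shape of the idea without claiming anything about Pkl syntax, here is a Python sketch. `render_app` and its parameters are made up; the manifest fields follow the usual Kubernetes Deployment and Service layout, trimmed to a minimum.

```python
def render_app(name, namespace, image, port=80, replicas=1):
    """Expand a minimal app description into a Deployment and a Service."""
    labels = {"app": name}
    meta = {"name": name, "namespace": namespace, "labels": labels}
    container = {"name": name, "image": image, "ports": [{"containerPort": port}]}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": meta,
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": meta,
        "spec": {"selector": labels, "ports": [{"port": port, "targetPort": port}]},
    }
    return [deployment, service]

manifests = render_app("web", "default", "nginx:1.25", port=8080, replicas=2)
```

The caller supplies four or five values and the template fills in everything that must stay consistent, such as the labels shared by the Deployment selector and the Service selector.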


JSONSchema covers a lot of various schema needs and YAML is something a lot of developers are comfortable with. I know both of those technologies are not popular here in HN but YAML type-checked and editor-autocomplete-enabled using JSONSchema is a solid choice for configurations in my opinion.


Every time I see YAML used for any configuration I know I’m in for a frustrating time. It’s particularly bad for build systems, where the feedback time can be so slow.


I agree it doesn't scale well as there are no loops/functions, but there's a lot of good tooling around YAML that other config languages lack:

- mature libs for most languages

- VScode plugin + Jsonschema for auto completion / schema checking

- yamllint to detect the language's footguns

- yq to query, update in place, and format while preserving comments and sorting keys from the CLI


true, but as the name suggests, jsonschema wasn't meant for yaml; it only works because yaml is a superset of JSON. the issue is that when you want or need to use something outside the JSON spec, like tags, all your validation falls apart. also, JSON Schema validation is really basic, and while designing the configuration format carefully can often mitigate that, it's not very versatile.

another common thing is that sometimes you have to define multiple very similar sections in a configuration that cannot be handled with yaml anchors, e.g. I have a definition repeated a dozen times that changes only in 2 numbers deep in the structure and a name string, and I need to repeat everything because of that, and it's a pita to modify all the other parameters that need to be kept in sync

therefore I think this format looks really nice, although I'm concerned by the loops that can be used there: is it possible to create a simple config that takes ages to parse or consumes very large amounts of memory?


Jsonschema is still json and yaml is absolutely not comfortable to work with. It’s only enough for simple configs. As soon as you have the urge to use a template you should replace it with something else.


I use jsonnet for templating and transformation and jsonschema for validation. Very happy with this combination. One big reason is that there are lot of libraries and codegen tools to choose from once you have the JSON generation pipeline (including schema generation) down.


Fail to see how this is any different than Dhall (https://dhall-lang.org/) other than it produces plists too.


My understanding is that it doesn’t require termination (“is Turing complete”), unlike Dhall.


This reminds me of the idea behind Lua - similarly the original users needed a configuration format which became increasingly sophisticated and at some point the authors realized they needed “proper” programming language constructs.


This is also why Lua is called Lua, the original configuration language was called SOL, for Simple Object Language. It never shipped, by the time the desired code was delivered to Petrobras, it was the first edition of Lua.

The authors have a fun read[0] about the history of the language, for the curious.

[0]: https://www.lua.org/history.html


Interestingly, this is exactly the reason the Lua language was conceived more than 30 years ago.


Are you sure about that? Pkl generates configuration from code. Lua was developed for a couple projects at Petrobras that were used for data entry and data flow. Maybe I don't understand the use-case for Pkl properly.


So this is like jsonnet, sdlang and cuelang? What does Pkl do different?



I used this extensively when I worked at Apple and LOVED it. So excited that it's finally out.


I'm really enjoying reading through the docs and the tutorial. We've created Lowdefy, a config web-stack which makes it really simple to build quite advanced web apps. We're writing everything in YAML, but it has its limitations, specifically when doing config type checking and IDE extensions that go beyond just YAML.

I've been looking for a way to have typed objects in the config to do config suggestions and type checking.. PKL looks like it can do this for us. And with the JSON output we might even be able to get there with minimal effort.

Is there anyone here with some PKL experience that would be willing to answer some technical questions re the use of PKL for more advanced, nested config?

See Lowdefy:

https://lowdefy.com/

https://github.com/lowdefy/lowdefy


Reminds me of [HCL](https://github.com/hashicorp/hcl), but without all the providers to deploy the config?


Can somebody tell me real use cases they would use this for?

I've seen their usecase documentation entry. And I understand the benefits this could have.

But I think I need some hands-on usecases, to fully grasp why or how I would use this.

Thanks


Regrettably the folks who have been using it for years aren’t going to give a lot of specifics, but generation of k8s yaml / jsonnet in particular was exceptionally common. One example from the other thread:

> My team migrated several kloc k8s configuration to pkl with great success. Internally we used to write alert definitions in pkl and it would generate configuration for 2 different monitoring tools, a pretty static documentation site and link it all together nicely.

https://news.ycombinator.com/item?id=39235425


Can't speak to Pkl, but for Jsonnet I made it possible to fully define then load neural network model architectures directly from Jsonnet config files [1], rather than relying on Python's unsafe pickle module.

Since neural networks often have many repeating features, using a traditional configuration language requires repeating the same structures a lot, whereas using Jsonnet you can use `std.repeat` instead. You can see some examples of this in the readme of my package.

[1]: https://pypi.org/project/cresset/


Types on the fields is interesting. The examples suggest said types are lost when you serialise to json or xml, though it seems like at least some types should be expressible as schemas.

Language reference doesn't mention schemas either https://pkl-lang.org/main/current/language-reference/index.h...


You are looking for templates: https://pkl-lang.org/main/current/language-tutorial/02_filli...

There’s another repo, the Pkl Pantry, that provides a couple of ready made templates (Schemas) that you can try out: https://github.com/apple/pkl-pantry


I was looking for a translation from a pkl template into an xml schema. It would probably be lossy - I think pkl can express more invariants than xml or json schema can - but still gives some invariants on the data for use downstream of the pkl tooling.


It seems like a truncated Pkl file can still be a syntactically valid file? This is a problem with TOML and YAML (but not JSON) where the language itself doesn't protect against truncations, and applications will need to add end-of-file markers or something similar in their config schema. I am not sure if Pkl has some built-in feature related to this sort of thing.
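
A small Python demonstration of the difference, using INI via the standard library's `configparser` as a stand-in for line-oriented formats like TOML and YAML (an assumption made to keep the sketch dependency-free). It says nothing about how Pkl itself behaves.

```python
import configparser
import json

FULL_INI = "[server]\nhost = example.org\nport = 8080\n"
FULL_JSON = '{"server": {"host": "example.org", "port": 8080}}'

def parses_as_ini(text):
    """True if configparser accepts the text without error."""
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
        return True
    except configparser.Error:
        return False

def parses_as_json(text):
    """True if the text is a complete JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Simulate a partial write: drop everything from the port setting onward.
truncated_ini = FULL_INI[: FULL_INI.index("port")]
truncated_json = FULL_JSON[: FULL_JSON.index('"port"')]
```

The truncated INI text still parses, silently missing `port`, while the truncated JSON is rejected because its braces never close.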


That's a feature, not a bug. Truncated files should not be accepted as valid


So you agree with them that it's a problem...?


This feels like a strict superset of hashicorp's HCL with strong typing and include statements. Once again, Apple has released a piece of tech which is innovative in its ergonomics perhaps, but ultimately a copy of a predecessor. This is becoming a habit of theirs and has only developed recently.

Unpopular opinion: yaml is almost as close to perfection as can be gotten. The only thing we could do to make it better is remove features.

They got some things really right. Targeting dynamic languages over static ones is the right choice, though a difficult one. 64-bit floats written in decimal notation are the goodest lowest common denominator we have (sorry), and the ability to embed valid documents in other documents (multiline strings prefixed only with white space) is a game changer. JSON compatibility is controversial, but ultimately very useful.

There are problems that are caused by too many features. The Norway problem is caused by too many features. References and splices are cool but generally confusing. That people do not use tags on their data 99% of the time in the wild strongly suggests that strongly typed configuration is less useful. True, there are too many different kinds of multi-line strings. Finally, if yaml were syntactically specified while preserving that white space feel, we could edit large yaml documents without having to get out our carpenter squares to figure out what level of indentation we're on. Syntactic specification will help our editors figure out what is happening while we type.
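
For anyone unfamiliar with the Norway problem, a hand-written Python emulation of the relevant rule. This is not a YAML parser; the sets below are a subset of the spellings that YAML 1.1 resolves to booleans when left unquoted.

```python
# Subset of unquoted spellings that YAML 1.1 treats as booleans.
YAML11_TRUE = {"yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML11_FALSE = {"no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}

def resolve_plain_scalar(text):
    """Mimic implicit typing: some plain strings silently become booleans."""
    if text in YAML11_TRUE:
        return True
    if text in YAML11_FALSE:
        return False
    return text

# A list of country codes in which Norway's "NO" stops being a string.
countries = [resolve_plain_scalar(code) for code in ["SE", "DK", "NO"]]
```

Here `countries` ends up as `["SE", "DK", False]`, which is why quoting the value or dropping the implicit-boolean feature both make the problem go away.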

I have addressed many of these problems in a new language called NRDL while seeking to preserve the things that yaml got right[1]. I feel we should learn from the mistakes and the success of those configuration languages that came before.

1: https://git.sr.ht/~skin/nrdl


I'd like you to consider that there's an approach to commenting on this kind of thing which wouldn't leave your contribution languishing, greyed-out, at the bottom of the thread.

See if you can figure out what that approach would be, and adopt it in future.


it's very brave to release a new configuration language with no Python support today.


Python is able to load Python code at runtime, so one can use Python for configuration. A solution doesn’t need to go looking for problems that are already solved.
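
A minimal sketch of that approach, with an illustrative helper name. Note that `exec` applies no sandboxing at all, so this is only reasonable for config you already trust.

```python
CONFIG_SOURCE = """
workers = 4
ports = [8000 + i for i in range(workers)]
debug = False
"""

def load_python_config(source):
    """Run config code and return its public top-level names as a dict."""
    namespace = {}
    exec(source, namespace)   # WARNING: executes arbitrary code, no isolation
    # Drop __builtins__ and any other underscore-prefixed names.
    return {k: v for k, v in namespace.items() if not k.startswith("_")}

config = load_python_config(CONFIG_SOURCE)
```

The config gets loops and expressions for free, but so does anyone who can edit the file.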


...and yet Python people routinely fail to grasp this and end up writing ad-hoc config language interpreters on top of yaml anyway.

Granted the conveniences of Python syntax for code are mostly lost when trying to express tree structured data, and yaml flips that on its head.

(Mumble, grumble, something about s-expressions...)


That’s because it’s impossible to properly sandbox the config parsing. It’s also a horrible experience to debug configs.

But it’s still better than templating yaml.


People interested in configuring Python software in Python should look into Starlark. There are Python bindings for two versions of Starlark: Go (https://github.com/caketop/python-starlark-go) and Rust (https://github.com/inducer/starlark-pyo3). I used python-starlark-go for a time in a project that ran on x86-64 Linux and had no problems with it. (I stopped because my project's configuration turned out simpler than expected, so I switched to TOML.)

Worth noting that it is specifically CPython that has been called impossible to sandbox. (2014 discussion: https://news.ycombinator.com/item?id=8280053.) It may be possible to sandbox PyPy. PyPy has a sandboxing feature its website calls a "working prototype" (https://www.pypy.org/features.html#sandboxing). If someone invested in it—potentially a huge effort—it could plausibly become good enough for configuration. But, IMO, Starlark is a better choice here because it was designed for isolation from the start. If you wanted to invest in Python-as-config-for-Python, a good use of your time might be improving Starlark for Python.


Looks very interesting. Thanks for the pointer.


So true. One of the major config mgt utilities which shall remain nameless, (cough, Ansible, cough), is written in python but created an execrable POS config language built on YAML. At least Ant had the excuse that Java was not suitable for a config language. Will people never learn that building scripting languages on markup languages will inevitably end in tears?


Not uncommon to have a python program that loads external half-trusted configuration, that must be sandboxed in capabilities.

For in-house stuff, totally agree, just use the python code itself as the configuration.


They start with Swift, Java, Kotlin and Go. That's already a big chunk of applications. Perhaps more will come in the future.


No C library sounds like a purposeful omission to slow down third party integrations.


And at the same time borrowing the name and file ending of Python's object serialization protocol.


That's a strange way to frame it, you don't have to support every language to be successful. World domination might not even be the project's goal.


Why does everyone use these weird languages and not compromise and use some cut down version of python or lua as a DSL?


It looks excellent and addresses all the issues I have with json, yaml, XML and co, but developers love to use "simple" technologies that then cause unending pain later due to being too simple. So I wonder if this will see much use.


The problem with configuration is that it’s only marginally useful to statically enforce that “this should be an IP address” or “this should be a port”

People rarely waste time because they put a number where they should have put an IP address. People waste a lot of time because they don’t know what IP address to put in there.

Look at any real life production nginx or kubernetes config and ask yourself an honest question — how much would a static type system actually help me in writing this config? We understand what types the configuration fields want — that’s trivial. We spend all of our time finding the correct values for those types.


Pkl is a new language for describing configuration. It blends together the declarative nature of static formats like YAML or JSON, with the expressiveness, safety, and tooling of a general purpose programming language.


Serious question - why not just use python?


Python isn't oriented around defining and validating data.

For example, something like: "this number must be between 1 and 10" means you have to come up with a novel way to add validation, because it isn't built into the language.

Also, Pkl is meant to be usable everywhere. General purpose languages tend to be tied to their own ecosystem--imagine telling a Go developer that they need to install Python, set up virtualenv, and run pip install so they can configure their application. We'd like Pkl to be a simple tool that can be brought anywhere, and easily integrated into any system.


> For example, something like: "this number must be between 1 and 10" means you have to come up with a novel way to add validation, because it isn't built into the language.

No need for a novel mechanism - there are plenty of available solutions to add validation to python

> imagine telling a Go developer that they need to install Python

Pkl doesn't come preinstalled on machines - so you'll have to install it as well

> set up virtualenv, and run pip install so they can configure their application

This is the real friction point, but is it a bigger friction point than having to adopt yet another DSL?


> This is the real friction point, but is it a bigger friction point than having to adopt yet another DSL?

Yesterday, I created a virtualenv, then ran `pip install`, only to see it fail. I found out that even `pip --version` was failing. I discovered that running `python -m ensurepip --upgrade` would fix pip. It did fix pip, but `pip install` still didn't work properly. I figured out that pip was reporting a different version of python than the one virtualenv is using. Running `python -m pip install --upgrade pip` upgraded pip, which should have been accomplished with the previous ensurepip command. Finally, everything was working properly.

I experienced these problems after years of experience in python. To answer the question. Yes, it's worth adopting yet another DSL than using python.
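
A mismatch like the one described, where pip belongs to a different interpreter than the one the virtualenv uses, can usually be spotted from inside Python itself. A minimal sketch using only the standard library (the function name `describe_interpreter` is made up for illustration):

```python
import sys


def describe_interpreter() -> dict:
    """Report which interpreter is running and whether it is inside a venv."""
    return {
        "executable": sys.executable,    # path of the running interpreter
        "prefix": sys.prefix,            # points into the venv when one is active
        "base_prefix": sys.base_prefix,  # the installation the venv was created from
        # In a venv created by `python -m venv`, prefix differs from base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running `python -m pip --version` also prints the interpreter that pip is tied to, and invoking pip as `python -m pip` avoids picking up a stray `pip` script from another installation.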


Did you activate the virtual environment?

For all the flak python gets, dependency setup is pretty simple if you're not flailing around aimlessly

  python3.12 -m venv --copies --clear --upgrade-deps ".venv"
  source ".venv/bin/activate"
  python -m pip install --upgrade pip setuptools wheel
  python -m pip install --editable .


Yes, I had activated the virtual environment. Thanks for the guide. It works without problems. I had used `virtualenv venv -p /usr/local/bin/python3.12` in my setup. Yours seems to be a better way.


That right there is the main problem with python dependency management - too many damn ways to do the same thing.

And to think "There should be one-- and preferably only one --obvious way to do it." is in the Zen of Python...


assert x >= 1 and x <= 10, "x is out of the allowed range"

What novel way to add validation in python?
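
For comparison, one common way to express that range check in plain Python is a dataclass that validates in `__post_init__`. This is only a sketch: the `ServerConfig` class, its `workers` field and the `is_valid` helper are hypothetical names. Unlike a bare `assert`, which is skipped when Python runs with `-O`, an explicit `raise` always fires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical config object; the field name is illustrative only."""

    workers: int

    def __post_init__(self) -> None:
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(self.workers, int) or isinstance(self.workers, bool):
            raise TypeError("workers must be an int")
        if not 1 <= self.workers <= 10:
            raise ValueError("workers must be between 1 and 10")


def is_valid(workers) -> bool:
    """Return True when the value passes validation, False otherwise."""
    try:
        ServerConfig(workers=workers)
    except (TypeError, ValueError):
        return False
    return True
```

So validation is certainly possible, but it lives in ordinary code next to the data rather than in the type of the field, which seems to be the distinction the Pkl side of the thread is drawing.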


This made me realize that Go actually succeeds at being more pythonic than python.

I'm also incredibly high.


High or not, you're 100% correct.

Have a look at PEP 20 – The Zen of Python.

Python is actually horrible at following it, Go doing a much better job.


Because (unless your app is written in Python) you don’t want to start a full-fledged Python run-time to read a config file. Nor do you want all the hassle of trying to ship a Python run-time with your application.

Edit: and moreover, you probably do not want config files that can run arbitrary code with full access to the Python standard library.
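
To make the arbitrary-code concern concrete: a config file executed as Python can do anything the interpreter can, whereas a literal-only parser rejects anything that is not plain data. A rough sketch with the standard library's `ast.literal_eval` (the function `load_literal_config` and the sample strings are invented for illustration):

```python
import ast


def load_literal_config(text: str) -> dict:
    """Parse a config written as a Python dict literal, without executing code."""
    # literal_eval accepts only literal syntax such as strings, numbers,
    # lists, dicts, booleans and None; names and function calls raise ValueError.
    value = ast.literal_eval(text)
    if not isinstance(value, dict):
        raise ValueError("config must be a dict literal")
    return value


SAFE = '{"host": "localhost", "port": 8080}'
UNSAFE = '{"host": __import__("os").getcwd()}'  # would run code under eval()

print(load_literal_config(SAFE))

try:
    load_literal_config(UNSAFE)
except ValueError as err:
    print("rejected:", err)
```

This gives up loops, functions and imports entirely, which is roughly the trade-off the thread is arguing about. It is also not a hardened sandbox against very large or deeply nested input.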


Python would be a terrible choice for this.

* It's been dragged into static typing kicking and screaming.

* You import the whole Python infrastructure/packaging catastrophe.

* It's not sandboxed.

If you wanted something Python-like you would use Starlark.


Which of these problems outweigh the complexity of onboarding a new language to an organization?


Probably the infrastructure/packaging mess. That's debatable. If you fix all of them though it's easily worth a new language, as long as it isn't too hard to learn.

This looks very easy. I'd be more hesitant about Dhall or Nix (though obviously Nix comes with even bigger benefits so it might be worth the awkward language).


Congrats Apple folks who worked on this, I know it’s been a long time coming


Coming from Elixir land, the Config module accomplishes some of Pkl's goals. But you're pretty limited by what you can do since the config files run at compile-time. Lack of typing and need for magic atoms to retrieve them can make them fickle and prone to typos.

There's probably enough interest and demand for node bindings to be spearheaded. God knows how many config files you have to tinker around with in SPA land. And of course using it for docker/k8 configs could benefit just about any language.



I thought it odd that the language bindings didn't include the most popular language, Python. In fact, Python seems not to be mentioned at all in the linked page. So I'm wondering, is it

1) Because the developers of Pkl are Python haters

2) Because the developers of Pkl are so overawed by Python that they can't imagine Pkl contributing anything useful to the Python ecosystem

In either case, having suffered so much using Ansible and its execrable YAML scripting, I may use Pkl together with Python.


Probably secret option number 3 of no-one having needed it yet - because the tool is standalone, you can render the required output before running whatever needs to consume it. I’ve certainly used it from Python in that manner - bindings are only required if you want to consume the raw language programmatically from a Python context.
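
A sketch of that render-then-consume workflow from Python, assuming the `pkl` binary is on `PATH` and using its `eval` subcommand with `-f json`. The helper name `load_pkl`, the injectable `run` parameter and the file name `app.pkl` are illustrative choices, not part of any Pkl API:

```python
import json
import subprocess
from typing import Any, Callable


def load_pkl(path: str, run: Callable[..., Any] = subprocess.run) -> Any:
    """Render a Pkl module to JSON with the CLI and parse the result."""
    completed = run(
        ["pkl", "eval", "-f", "json", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if evaluation fails
    )
    return json.loads(completed.stdout)


# Usage, assuming a hypothetical app.pkl exists next to the script:
#   config = load_pkl("app.pkl")
```

The `run` parameter exists only so the function can be exercised without the binary installed; in normal use the default `subprocess.run` is what gets called.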


Exactly this. Pkl is most useful as a type-safe configuration language that can output to any other format (already supported, or put together by the user within Pkl). You’ll always get valid JSON, YAML, PLIST, what-have-you as output. This you can then parse in the language/system of your choice.

Certainly language bindings are useful, and if there’s demand likely someone will create them.


“Popular” doesn’t mean “fitting every purpose”.

All the listed languages are compiled and statically typed. Python is neither. Neither is JavaScript, another popular programming language which is also not listed.


I don't see how the static typing of the bound languages enters into it. Python has type hints which can be enforced by some compilers, if you are so committed to type safety in your config scripts. If you have Pkl output YAML or XML then where is your type safety?

Since "popular" does mean that very many people are using it, I think it would be wise to try to serve the Python community.


Another possibility: a naming conflict with, and potential confusion around the already existing Python serialisation library 'pickle'?


I think this intended for compiled languages? If you already have Python or Ruby in your stack you can simply write a little script to generate the required JSON or YAML. I'm not sure you would ever want to add Pkl to the mix in that case?


> Because the developers of Pkl are Python haters

Look at the languages that they do support - Go, Swift, Kotlin, Java. These are all robust languages for writing production grade software. That's probably why - the people at Apple using this don't need it for their hacky Python scripts.


By "production grade software" do you mean major sites like Instagram or Reddit?


> 1) Because the developers of Pkl are Python haters

Is it really that deep?


What happened to generating config from templates? Like puppet/chef/ansible? Then it doesn't much matter what configuration format is used, as it is all generated from a common template.

Systems like pkl and skylark live on their own and aren't well integrated with systems outside them (for example, an object inside a configuration might want to generate a log source definition in the central logstash config, or a new object in the monitoring system).


String templates are not type safe and they are super easy to mess up data escaping, by not closing a quote or even worse: deal with yaml indentation.
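
A small illustration of the escaping problem, using Python's standard library rather than any particular templating tool. The template and the value are made up; the point is only that naive substitution can emit invalid output, while building a data structure and serializing it cannot:

```python
import json
from string import Template

# A value containing double quotes, which the template does not escape.
motd = 'say "hello"'

templated = Template('{"motd": "$motd"}').substitute(motd=motd)
serialized = json.dumps({"motd": motd})


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(templated, is_valid_json(templated))
print(serialized, is_valid_json(serialized))
```

The templated string breaks as soon as the value contains a quote, and nothing flags it until a consumer tries to parse the result.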

Integration with resource creation shouldn’t be needed. If you look at terraform it actually does so and it’s surely nice, but that’s not strictly necessary, it’s just a result of terraform resources themselves being configured in the same configuration language as you write your config file templates. Supporting everything also comes with compromises, like only having a half-assed understanding of how to deploy Kubernetes manifests, instead of focusing on the job of only generating them.


so we're reinventing lisp yet again?


I have been looking into alternatives to YAML templating, the Helm abomination, etc. and was utterly surprised how no one ever apparently has thought to create a configuration/templating system that's basically a fancy library on top of Scheme.

I truly believe every company is still reinventing the wheel because of a lack of serious foundational knowledge in most engineers, so they are doomed to recreate subpar, shitty alternatives.


> was utterly surprised how no one ever apparently has thought to create a configuration/templating system that's basically a fancy library on top of Scheme.

There's Clojure's extensible data notation: https://github.com/edn-format/edn


It seems cool at first sight, but I feel it is overengineered for a configuration language: why would one need to indicate whether a variable is a list, a vector or a set? These are implementation details, they should be represented as lists, and it is the application underneath that chooses the appropriate representation.

You only need Scheme, quoting and quasi-quoting, and s-expressions.

    `((username "foo")
      (password ,(getenv "PASS")))

Want to be fully declarative because you are afraid of Turing-complete configs? Simply abort if the top-level list is quasi-quoted.


edn + clojure was the most pleasant configuration system I've worked with, but I have never convinced anybody else to use it. Jsonnet on the other hand, people get productive in it pretty quickly. It's my default choice for all configuration now.


I wonder, what would happen if Clojure was marketed as a configuration file format from the very start, without mentioning persistent immutable data structures, software transactional memory, and other scary words. Would it have more adoption now?


I doubt it; edn seemed like _that_ rebranding effort. It looked json-like enough, but clearly didn't have the draw. People who complain that json has no comments obviously have a secret list of other gripes.

The general tooling wasn't there, apart from clojure. I tried one of the edn libs for python/node and it always felt second class. The full power was just never there outside of clojure projects.

It's like how everybody still uses QWERTY (including me) and are happy to buy better keyboards, but they must still be in QWERTY


Yeah was thinking about that a lot, too. I guess there’s zero overlap between schemers/lispers and people doing yaml configs.


There are dozens of us!

More seriously, I don’t understand how anyone tolerates 5k line yaml files but I haven’t found any support for moving away from it at $bigcorp. Right now we use an undocumented shim layer between Helm and Argo for maximum inscrutable templating. It’s an endless footgun with the impact spread so thin it’s invisible to the business.


what about guix? could that be used for this purpose?


Is it a stretch to say that s-expressions could be added to the list of configuration/specification languages? [1] They can also be simplified to use fewer parentheses.

[1] https://www.gnupg.org/documentation/manuals/gcrypt/S_002dexp...


It's 2024 and the software industry has not solved the problem of configuration yet. Umpteen solutions every year - languages, formats, etc - all for just doing configuration. Just wow! It amazes me. It would be so much better if all the great minds focused on real issues than keep spinning on the hamster wheel of config file formats. Just look at the # of comments here.


Honestly, I don't imagine this being a good idea. The problem it solves is pretty clear, and it's easy to see how one attains the desire to solve that, but is it really that kind of problem which should be solved?

It is kinda commonplace that you don't write configuration in the real PL. Maybe it's less obvious, when your working language is Go/Java, since they are compiled, so obviously you have no choice but to keep configuration in an interpreted format. But you also don't do that in Python/PHP/Javascript. You could just write configuration using full power of Python, right? But even when you do keep some configuration as Python code (because it's only developers who are tweaking these constants anyway, etc.), you usually prefer a plain dict for that, not a bunch of dynamically generated classes with multi-level inheritance.

So, ok, maybe it makes sense when you have to deal with an over-engineered fucked up third party tool like k8s, which you have to configure in YAML, and you could just make it less verbose and have some schema? Well, I don't know, maybe. But the point holds. The good thing about YAML/JSON/etc is that they are very declarative (even when they are used as an imperative DSL syntax, as in GitHub Actions or k8s). If you see that port = 6001, it's 6001, that's it. Your entire DB configuration is right there, in front of you.

The biggest selling point of PKL seems to be that "sidecars" example. And, I mean, sure, it's precisely the problem we intended to solve. But it also shows that you (oh, not you, but your co-worker, of course) could write pretty much anything. It doesn't look like a configuration anymore to me. It's exactly keeping your "configuration" in the source-code of your app, in a real PL, using dynamically generated classes in a cycle and whatever. The thing, we don't usually do for some reason.


I'm still on the fence (and distracted by Vision Pro). It's a good-ish idea, IMHO. I'd like to see a toolkit for "reading configs" where configs are some universal format (JSON, YAML, TOML, etc) for various languages. So that the app in Ruby can share config with "microservice" in Rust and they can also maintain compatibility with completely different app (but sharing infra resources) in Go. Using one config language that supports hierarchical templates, modules, imports, whatever. Because otherwise everyone starts inventing their own "YAML with functions", "JSON with templates", etc. Pkl seems like that. But I don't like that it's Turing complete. Powerful and dangerous. Configs are inherently declarative.


Is this a reasonable tool to manage problems like:

I've got a stream of json in one schema; I want to make ETL for this to write it out to elasticsearch, S3 (in a specific different schema) and postgres targets, using a mixture of "bespoke python", vector, and logstash.

Should I use this or should I roll my own goo using ERB or mustache or similar?

Or should I quit and go work some place that's using gRPC?


More poking around in this; it'd be great if things like

https://github.com/apple/pkl-pantry

Had more documentation. Like there's a prometheus template; is that for use only in k8?


I appreciate what the language is trying to do and simplified configuration that is repeatable / re-usable is the goal.

However I can’t help but wonder why systems don’t simply start adopting full programming languages once they hit a certain complexity. Having to use another language just to reproduce YAML or what have you in a repeatable fashion is a symptom of the problem IMO


This is a full programming language! It’s just designed for exactly this goal.


Yes that I realize if I didn’t make it clear.

I am definitely lamenting the fact that SDKs (CDKs?) aren’t already exposed in most common languages as a way to take advantage of the robust features of full programming languages.

They either aren’t at all or they use some hybrid format (terraform etc) or they’re Byzantine in nature or ignore good API conventions


Until Cue/Pkl/whatever the current FotM has FFI that can be realistically utilized from most other languages (ie. not require custom bindings), it won't get real adoption.

Without wide adoption, it's at best an interesting research project. A really cool project I want to work with, but not something I can reasonably bring up.


I like the syntactic sugar of directly validating types through modules. However, configuration at some point always converges towards Turing completeness. I’m biased but imo nothing beats TypeScript and schema validation tooling like Effect.Schema or Zod. Yes it’s more code to write but you’re not locked in and can adopt the system as needed.


I feel I have seen it all already.

Imperative DSLs created out of XML+XSD or <put-any-declarative-config-with-schema-here> fears which become exploited by expressiveness and nightmares to maintain after a while.

This is why I still prefer Maven (XML) instead of Gradle (Groovy DSL) for Java project configurations.

And the definitive question: can pkl configure pkl?


Does this support fancy symbols in bare identifiers? Even the huge reference page is blank on that, and the only working links to grammar are to the ANTLR website, but even if they did work, the grammar is also blank as it simply references an internal grammar function

And there doesn't seem to be any online playground to test


Yes: quoted identifiers are represented using backticks (which do not then form part of the name), so:

local `My Variable` = 42

Do you happen to have a link to the missing/blank reference page on this?

EDIT: perhaps I misunderstood the question - if it's about whether you can use non-ASCII characters in identifiers without quoting, the answer is "sometimes":

    local `abde` = 42 // Only works when quoted

    local `teʝ` = 42 // Works fine (as expected)

    local teʝ2 = 42 // Also works fine

It might be interesting to expand the docs to cover the precise rules here without having to resort to ANTLR grammar, as you say.


Yes, I meant bare unquoted identifiers

For example, does

x² = 42 work? Or

move↓?=true

or some emoji

I guess not given your first example (though it's not rendered here properly, looks like 4 ascii letters), which is rather unfortunate as configs could be more readable with symbolic names,

https://pkl-lang.org/main/current/language-reference/index.h..., both links are broken. Another thing is that the doc page should have an additional pages with a single section/page view otherwise it's too hard to find a word in a single huge doc


Genuine question: it is weird to find an Apple open source project that depends on Java. Is this because internally Apple has not broken the historical link with WebObjects? [1]

[1] https://en.wikipedia.org/wiki/WebObjects


I like it. I wish Helm used objects instead of string templating. Hopefully it becomes a popular alternative.


The first thing the page shows is generating configurations in existing formats like YAML. For most of my use cases, there is tooling that does some updates of the configuration (e.g. bumping dependencies), so I'd really need bidirectional support to convert the updates back into Pkl.


Not sure about Pkl, but Jsonnet will let you import a plain json file.


You know why there isn't a library for Python? The answer will surprise you.



Putting constraints into the config file is nice from a self documenting point of view, but I suspect that

a) nobody will bother

b) that one time they do, the constraint will be wrong in some circumstance, and the constraint will get edited out anyway.


We programmers clearly seem to think we have nothing better to do than create edge case programming languages that solve incredibly niche problems in esoteric ways.


I manage and distribute my configuration files with Ansible, using the Jinja2 templating language. My needs are rather basic. I'm not sure if I'd benefit much from adding a tool like this to my workflow.


> datasize3 = 5.mb // megabytes

> duration3 = 5.ms // milliseconds

grumpy sounds


doubt I would use it elsewhere instead of JSON but if this is an indicator that .plist in xcode will go away then I'm all in - .plist is probably one of the worst formats human can work with.


NeXT used a text format for Plist[0] (which Amazon’s Brazil Config looked very similar to) but was deprecated relatively early in OS X’s lifetime. All of the plutils can still read it, but none of them will write it.

The biggest drawback was the limited number of types, imho. It seems like encoding issues were another drawback which led Apple to use XML when OS X launched. JSON wasn't a thing yet, so XML probably looked like a good option.

0 - https://code.google.com/archive/p/networkpx/wikis/PlistSpec....


I like the syntax and the type system but it’s written in Java, why?


This is so cool! And built with GraalVM, too. Amazing


My team at Apple is very heavy on Pkl + k8s configs. It's a very useful tool. Definitely recommend everyone to get familiar with it.


I kind of feel that if your base is kubernetes (and especially helm) YAML soup, basically everything looks like a good alternative.


No mention of Xcode support that I can find.


Noticed that as well... I guess not even Apple wants to use it anymore, oh my...


The services teams at Apple are more agnostic than what you’d think.


In the not-eating-our-own-dogfood sense?


It’s not as if Apple provides server side tooling anymore.

Though, even if they weren’t going to provide a server OS, it seems like they could still offer IDE support for the droves of web devs working on Macs and deploying to Linux (like I assume their devs do).


This is awesome and extremely beneficial in this AI era. But the paranoid part of my brain doesn't like that.


Isn't this similar to gradle (which is based on groovy lang)? Syntax and functionality looks similar.


What is the use case for this? There are a zillion config languages from toml to yaml to HCL.



What I like:

- the validation system

- ability to split large configuration files into smaller files and enhance readability (I know Java / Spring Boot apps already have this ability using profiles)

- multiple language support

Wonder if it’s possible to have more complicated validation functions. The example given validates 1 configuration (“port must be greater than 1000”). But there are times when some configuration is valid by itself however in the presence of other configuration the configuration(s) will possibly be ignored.

Given:

- configA

- configB supersedes configA in app

When:

- both configA and configB set in pkl

Then:

- at compile time, should throw a validation error
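
The rule in that Given/When/Then can be sketched in Python to pin down the intended behaviour. This says nothing about whether or how Pkl expresses it; `AppConfig`, `config_a` and `config_b` are placeholder names mirroring the example:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppConfig:
    """Placeholder config where config_b supersedes config_a."""

    config_a: Optional[str] = None
    config_b: Optional[str] = None

    def __post_init__(self) -> None:
        # Each field is valid on its own; only the combination is rejected,
        # because config_a would otherwise be silently ignored.
        if self.config_a is not None and self.config_b is not None:
            raise ValueError(
                "config_a is ignored when config_b is set; set only one"
            )
```

The check runs when the object is constructed, which is the closest Python analogue to failing at evaluation time rather than inside the running application.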

Criticisms:

- documentation (in typical Apple fashion) appears to be lacking. Couldn’t find anything on the validation system

- missing LSP support

- yet another config language to support/learn


the validation you are asking for sounds like what CUE does.

If you say

a: 5

a: 3

That’s an error.

If you want the ability to "override", you have to write it like this, for example:

a: *5 | int

a: 3

This also works for struct/list members, so you can set defaults for all fields, and stuff like that.

https://cuelang.org/docs/about/
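
For readers unfamiliar with CUE, the behaviour described above can be imitated loosely in Python. This is a toy analogy, not CUE's actual algorithm: `Default` and `unify` are invented names, and only flat dictionaries of concrete values are handled.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Default:
    """A value that applies only if nothing concrete is supplied."""

    value: Any
    type_: type


def unify(*layers: dict) -> dict:
    """Merge layers; concrete values must agree, defaults may be replaced."""
    result: dict = {}
    for layer in layers:
        for key, new in layer.items():
            if key not in result:
                result[key] = new
                continue
            old = result[key]
            if isinstance(old, Default) and not isinstance(new, Default):
                if not isinstance(new, old.type_):
                    raise TypeError(f"{key}: {new!r} is not {old.type_.__name__}")
                result[key] = new  # concrete value replaces the default
            elif isinstance(new, Default) and not isinstance(old, Default):
                if not isinstance(old, new.type_):
                    raise TypeError(f"{key}: {old!r} is not {new.type_.__name__}")
                # keep the concrete value already present
            elif old != new:
                raise ValueError(f"{key}: conflicting values {old!r} and {new!r}")
    # Any default still unresolved collapses to its default value.
    return {
        k: (v.value if isinstance(v, Default) else v) for k, v in result.items()
    }
```

With this sketch, `unify({"a": 5}, {"a": 3})` raises, while `unify({"a": Default(5, int)}, {"a": 3})` yields `{"a": 3}` regardless of the order of the two layers.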


sadly the Go implementation is bloated as hell:

https://github.com/apple/pkl-go/blob/main/go.sum

no thank you.


You should rather look at go.mod file, and it doesn't look bloated at all. They could maybe drop cobra and use something with fewer dependencies, they use pflag package already so maybe use only that, but uhh I've seen much, much worse, as a daily Go user I'd say it's fine


> You should rather look at go.mod file

no. the whole point of go.sum is to see everything. you could have a go.mod with a single item, but if that item is a giant module with hundreds of third party imports (as in this case), it's quite misleading.

> I've seen much, much worse, as a daily Go user I'd say it's fine

uh where? I am a daily Go user, both professional and personal. and this is one of the most bloated modules I have ever seen.
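
One way to put a number on this disagreement is to count distinct module paths in a `go.sum` file. A sketch, assuming the usual three-field line layout (module path, version, hash); the function name and the sample data are invented. Note that `go.sum` also records checksums of `go.mod` files for modules that are consulted during version resolution but may never be compiled into the build, so the count is an upper bound on what actually ships.

```python
def count_modules(go_sum_text: str) -> int:
    """Count distinct module paths listed in the text of a go.sum file."""
    modules = set()
    for line in go_sum_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        modules.add(fields[0])  # field 0 is the module path
    return len(modules)


# Made-up sample; real hashes are longer base64 strings.
SAMPLE = """\
example.com/a v1.0.0 h1:placeholder=
example.com/a v1.0.0/go.mod h1:placeholder=
example.com/b v0.2.0/go.mod h1:placeholder=
"""

print(count_modules(SAMPLE))
```

Comparing that figure with the number of `require` lines in `go.mod` shows how much of the total is transitive.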


did they really need to call it the same thing and with the same extension as Python pickle files !? that's going to be so confusing if it becomes popular


What does Pkl stand for?


I can see that Apple is behind this. This looks better than Cue


Honest question, how is this better than CUE apart from its Turing completeness? But that's probably not a feature that we really want in a config language?


Congratulations Phil!!


Yes! Congrats Phil!


“Our best configuration as code language ever.”


Too bad, it doesn't have dotnet and C# support...


So how does Pkl compare to Nix or Dhall?


Anything development related from Apple is inherently uninteresting unless it's aimed at Apple, is for Apple and you're building for Apple.

Apple has done nothing cross platform or open source or community oriented so if they come out with something that's intended to be more general it has no base, no users, no audience of non-Apple developers to land on.

I'm not anti Apple - I love MacOS and Apple is my main machine. I'm just pragmatic about Apple technologies - they are all for usage inside the Apple bubble.


> Apple has done nothing cross platform or open source or community oriented

Cups - https://en.wikipedia.org/wiki/CUPS

Swift - https://en.wikipedia.org/wiki/Swift_(programming_language)

Zeroconf - https://en.wikipedia.org/wiki/Bonjour_(software)

I think they're bad, opaque opensource maintainers, but they did release some popular things that have communities on other systems.

I do wish they just contributed to nickel or something else though rather than doing their NIH as usual.


CUPS was done by Easy Software Products, in 1997. Apple adopted it only in 2002.

Swift is a language entirely in the control of Apple, mainly targeting Apple's platforms. With little to no community engagement.

Bonjour is not really cross platform, and not really open source either as it has lots of strings attached to the license and terms one can use it under.

(I wouldn't say that Apple has done "nothing" -- but to credit them for doing much is also a stretch)


HLS -- created by Apple, just about everybody uses it for streaming on the web.

And I think CSS animations and transforms mostly came out of Apple; at any rate, they're very similar to the animations and transforms in UIKit that originally came from NeXT.


Apple being gung-ho to push a technology to further nail Flash to the cross, whether or not that was the actual reason, is at least consistent with their MO and not some great contribution to open standards.


As long as HLS is more open than Flash was, why not? It could be both of those things.


Ha, I was only referring to the CSS animations. That seemed at the time to be more pressing. But no disagreement.


> I do wish they just contributed to nickel or something else though rather than doing their NIH as usual.

They’re contributing to a ton of OSS that’s NIH:

https://opensource.apple.com/projects/

K8s, spark, cassandra, netty, zookeeper, solr, containerd


> I do wish they just contributed to nickel or something else though rather than doing their NIH as usual.

Did Nickel exist in 2018? (Someone here said that Pkl did.)


WebKit? FoundationDB?


I didn't include webkit since it's a forked KHTML, which wasn't theirs.


I didn't include Linux because it's just a Unix reimplementation, not theirs.

or:

I didn't include Firefox because it's just an NCSA Mosaic fork, not theirs.

The point being, whoever does most of the work and maintenance owns something, not who came up with some early predecessor.


Yes.

Entire timeline of JavaScript engines.

https://egbert.net/blog/articles/javascript-jit-engines-time...


There's hardly anything left of that.


At first glance, this is interesting in that it’s the least Apple-ish open source project I can remember seeing. I can definitely see uses for it right now in non-Apple-ecosystem projects where you’re not already invested in a different tool.


I think they funded development of Clang and LLVM.


And WebKit, which is the rendering engine in Chrome. (Though Google now maintains their own fork).


What's clang, chopped liver?


programmable and safe is an oxymoron.


Looks great. Reminds me a bit of HCL.


Developing a whole new language whose reference contents page spans two phone screens, with classes, built-in packages and methods, bindings for various languages and language servers for various IDEs, just to replace yaml? Seriously? Either the configuration is broken, or if the yaml is doomed to be repetitive (to allow flexibility etc), won’t sandboxed JavaScript do the job of outputting a readable yaml?


When is the Swift rewrite?


In what sense is this programmable?

Can I write a loop in this? If? Elvis operator? A simple map lookup?



Congrats Peter!


What about Lua?


DML


Dml


Nix support?


any plan to support c/cpp ?


If you're interested, there's now an unofficial community Discord: https://discord.gg/vDhhCT24


ffs



Indeed, launching Chrome would simply increase the browser count by 1

(is there an xkcd for that?)

https://cdn.statcdn.com/Infographic/images/normal/1438.jpeg


what a waste of time.


Here we go again


[flagged]


Spam. Please flag and remove.


Is this a joke? Why? We have so many usable formats already. Every language can do JSON. What problem does this solve? Or is this another Apple solution to a problem that doesn't exist?


I do not find this particularly interesting because the problem has been solved by several projects already, but this is a programmable configuration file.

i.e. you can write code statements and it transpiles down to other formats


I kinda think we haven't solved this yet. Sure, this essentially is to generate a text file from another text file of parameters and code, which becomes our new configuration file.

Then there is a need to generate that new configuration file as things get complicated.

The current approach (of all these current languages, and pkl looks like it is the same) is to painfully refactor the parts you want to change into a parameter file, i.e. a manual data/code separation, and you template that.

It would be nice to just say this configuration file is now a template file with A B C fields as parameters, and load it up in our new configuration file for templating with good trackability and performance.

It is also easy to migrate to other languages, as configuration now can turn into code cheaply.


I'm not sure I'd agree it's solved already, yes we have some languages (Dhall or Cue probably lead the pack) but I hugely prefer Pkl's syntax and approach. It's like saying that C solved systems languages, so there's no reason to have Go or Rust


It's written in Java. Jesus.


From the folks that brought you plist, introducing yet another format, pkl!


So Apple just invented NixOS?


no

you can click it, it'll open link in your browser where you can read what it refers to.


Only the Nix language.



