Hacker News new | past | comments | ask | show | jobs | submit login
Catala: A Programming Language for the Law (arxiv.org)
232 points by todsacerdoti on May 6, 2021 | hide | past | favorite | 126 comments



Judging by how most people write code today, I find this idea terrifying. I sometimes wonder if it would be better if we went back to an ancient-Roman style legal system which maximizes the human interpretive element rather than attempts to remove it.

It's already the case in many countries today that there is too much focus on the wording of the law and not enough on the intent. People try to use a rigid fixation on the importance of words to twist the law in any direction which benefits them, even in ways that don't feel right.

We risk a Kafkaesque situation.


> Judging by how most people write code today, I find this idea terrifying

Most people today don't write code for safety critical systems. The abstract mentions they already uncovered a "bug" in an active law by rewriting it in their language, so it could be an immensely helpful tool for finding edge cases and loopholes in that mess.

> It's already the case in many countries today that there is too much focus on the wording of the law and not enough on the intent

Again, as the abstract mentions this is intended for laws that are meant to be interpreted 100% literally like an algorithm. Those exist and those are also exactly the ones that should be absolutely consistent.

The real problem I see is that clear, loophole-free laws are often not in the interest of the lawmakers. You only need to take a look at the underhanded ways various parties are trying to undermine privacy.


Another problem is that it will probably increase the proliferation of strict liability offences.

The fact that a law is internally consistent and free of loopholes does not necessarily mean that it is a good law but an algorithmic approach to creating such laws risks making their creation too easy so that we end up with a large body of law that can't be argued against because it has been 'proven' correct.

I'm reminded of a remark that Donald Knuth made regarding a piece of code. It went something like this: "Be careful if you use this, I have merely proven it correct but not tried to run it."


I'd urge people to consider their experience writing unit tests. It's not uncommon for me to write a test that my code fails, not because the code was wrong, but because my test was wrong. The same applies to proofs. You can prove that your code does something, but that doesn't prove it does what you wanted. It's easy for us to understand, but I could imagine lay people misinterpreting it.


>> Judging by how most people write code today, I find this idea terrifying

> Most people today don't write code for safety critical systems. The abstract mentions they already uncovered a "bug" in an active law by rewriting it in their language, so it could be an immensely helpful tool for finding edge cases and loopholes in that mess.

I'll give you an example from a field that I (unfortunately) have a reasonable amount of experience of where the law and contracts are supposed to be interpreted 100% literally like an algorithm: structured finance (eg the subject matter of the 2008 financial crisis). The prospectus of a structured security like a mortgage bond is typically a hundred or more pages of very dense legalese that was worked on intensively by highly-specialised lawyers and bankers for months. Its intent is to set up a small special-purpose company with strict rules on how it operates down to the last penny and zero discretion. This allows investors (in theory) to understand how the bond will perform.

The opening chapter of my introductory textbook on structured credit analysis says the author (who has been part of those teams for years and has worked on hundreds or possibly thousands of bonds) has never read a bond prospectus for a structured security that does not contain serious drafting errors. I personally worked on a bond for months and read, edited and commented on the bond model and prospectus hundreds of times. We certainly were trying to make our prospectus as simple, clear and loophole-free as possible. Then a few years after it launched we discovered we had made a big mistake in drafting it that had gone unnoticed by me, the expert bankers and lawyers and all the investors so far. Luckily we were able to fix the problem without negative consequences. There is literally no chance that this error would have been found or solved by a formal system and if anything it would actually have been worse.

Some things are complex enough that they are just very hard to do. The existing system allows for that by having a rigorous process of dispute, interpretation and argument to decide complex questions post facto.


> we discovered we had made a big mistake in drafting it that had gone unnoticed by me, the expert bankers and lawyers and all the investors so far. There is literally no chance that this error would have been found or solved by a formal system and if anything it would actually have been worse.

There is no chance that a formal system wouldn't have revealed this bug? Static analysis and simulations wouldn't have helped?


> Some things are complex enough that they are just very hard to do.

I agree with this wholeheartedly. Engineers have the ability to automate some very complex issues. But there will always be a class of issues beyond automation, simulation, analysis, etc. and we just have to accept that the only way to solve them is with slow, hard, error-prone work.


Do you think that smart contacts could be useful in this area?


Not smart contracts exactly but there is a specific part of each bond called the waterfall (basically specifies what happens with all the cashflows under different circumstances and is a little bit like an equity cap table plus liquidity preferences). I've always thought there should be a dsl specifying the waterfall that can be used for the bond model and used to generate the legalese in the prospectus. A common problem is the bond model doesn't match the prospectus and that problem would just disappear. I intend to do this as a research project if I ever get time (it's been 12 years since I had the idea to do this so it may never happen...).


If I understand it correctly smart contracts are just another piece of software combined with some blockchain magic. If you can't prove a piece of code is correct, wrapping it in a trendy buzzword won't make the code more correct.

Maybe I'm wrong, but I don't know how decentralization would change anything other than adding safeties against manipulation if done right.


That line in the summary is a bit misleading. The authors didn't find a "bug" in any law - instead they found a corner case that wasn't coverred by an online tool hosted by the French government to estimate family benefits under French law:

> After close inspection of the OpenFisca code, a discrepancy was located with the Catala implementation. Indeed, according to article L755-12 of the Social Security Code, the income cap for the family benefits does not apply in overseas territories with single-child families. This subtlety was not taken into account by OpenFisca, and was fixed after its disclosure by the authors.

Here's the pull request for their fix to the benefits simulator: https://github.com/openfisca/openfisca-france/issues/1426


There's a real slippery slope issue here too though, for several reasons, that I'd expect would lead to overuse.

It's a fancy new thing that will be branded as helping to avoid mistakes, reduce legal system costs, increase consistency, etc.

So, it will be applied to existing laws more generally than the abstract suggests it should be.

New laws will be made with this in mind, increasing application regardless of whether the new laws are actually literally interpretable.

It will become entrenched and hard to remove as allocated funds to pay people to do the same jobs disappear.


To me it seems like you contradict yourself.

> it would be better [to adopt a law system] which maximizes the human interpretive element rather than attempts to remove it.

> twist the law in any direction which benefits them, even in ways that don't feel right.

The first statement seems to say that the more humans can interpret the law the better, the second statement seems to imply that people actually doing such interpretation with the tools available now is a bad thing.

It's a good thing your comment is not a legal text though. Otherwise some formalism would be quite welcome to derive the intended meaning.


I don't think it's a contradiction.

If you follow the exact wording of the law, it's easier to find loopholes (like a hacker looking for a breach).

If you follow the intent of the law, then it's much harder. A judge will see if you acted in bad faith, if your actions are against the spirit of the law.


There is no contradiction. Countries which put more focus on the intent of the law (like Europe) tend to have fairer outcomes than countries which focus too much on the wording of the law like in the US. I read that many criminals in the US get away with crimes because of failures of procedure by the prosecution for example; this is a clear sign that there is too much focus on the wording of the law. If intent counted more, then nobody would be acquitted of a crime on the basis of a technicality.

It's the fact that people have accepted the primacy of the written word that they allow themselves to be swayed to accept verdicts which don't make sense. It's important to factor in the fact that criminal justice involves multiple jurors, so you don't want some outdated wording written in the 1900s to allow one bad juror to use wording of a law or precedent to convince other jurors of a conclusion which best matches the wording but defies common sense.

It's like in the US constitution, there is this paragraph:

”No State shall enter into any Treaty, Alliance, or Confederation; grant Letters of Marque and Reprisal; coin Money; emit Bills of Credit; make any Thing but gold and silver Coin a Tender in Payment of Debts"

And yet today the US government (which is clearly an alliance of states) pays back its debts by printing more money not backed by gold. People have used the wording "No state" to imply that the Federal government is not a state and therefore, it doesn't apply. But clearly this defies common sense about the intent of the law which was to prevent the government from printing unlimited money not tethered to any scarce asset to inflate away its true debt... And yet this is exactly what the federal government (the alliance of states) is doing nowadays.


> And yet today the US government (which is clearly an alliance of states)

That's not correct: The "alliance of states" was under the (superseded) Articles of Confederation — the governance failures of which led to the 1787 constitutional convention. In contrast, the very first words of the Constitution state explicitly that "We, the People of the United States" [emphasis added] were joining together to establish a national polity that transcended the states.

> People have used the wording "No state" to imply that the Federal government is not a state

It's far, far more than just a mere implication — it's a foundational assumption. See, for example, the Supremacy Clause of the Constitution. [0]

[0] https://www.law.cornell.edu/wex/supremacy_clause


I strongly agree with your main intent, i.e. that law should be interpreted by humans. However,

Your fact " I read that many criminals in the US get away with crimes because of failures of procedure by the prosecution for example;"

which you interpret as

"this is a clear sign that there is too much focus on the wording of the law. "

could also be interpreted as:

police often use illegal means to target people and fabricate evidence, and the US's well-functioning legal system stops this.


>People have used the wording "No state" to imply that the Federal government is not a state

in a document delineating what qualifies as states and what qualifies as the federal government with all sorts of text about relations between the two it would seem to be a really reasonable implication to make.

Furthermore your example is in a paragraph that starts "No State shall enter into any Treaty, Alliance, or Confederation", by your logic it follows that the Federal Government cannot ever enter into any Treaty, Alliance, or Confederation because the Federal Government is after all a State.


>> in a document delineating what qualifies as states and what qualifies as the federal government with all sorts of text about relations between the two

If those definitions were tacked on afterwards, then it would exemplify exactly the kinds of distortions I'm referring to. I don't know the order in which the US constitution was written but that definitely should affect the intent.

>> Furthermore your example is in a paragraph that starts "No State shall enter into any Treaty, Alliance, or Confederation"

TBH, I didn't even see this interpretation. I guess some people are focused on the wording to an entirely different level. I used the semicolons to interpret it as "no alliance pertaining to the payment of debt" based on my interpretation, it doesn't say anything about alliances pertaining to other matters.

But this seems to reinforce my point. The words can be ambiguous but it doesn't mean we should disqualify the intent altogether.


ok so, those definitions were not tacked on afterwards, they were really the reason for the Constitution to be written in the first place. What was tacked on afterwards was the Bill of Rights.

The reason for the writing of the Constitution was the Articles of Confederation https://en.wikipedia.org/wiki/Articles_of_Confederation being rather problematic for running a nation.

In fact for any judge familiar with the Articles of Confederation and the history thereof, which I would hope was every judge, the interpretation of the paragraph you cite would be completely obvious and pertaining only to the states, because some of the original problems between the states that caused the Constitution to be written was states doing such things as entering into treaties with foreign nations, coining their own money (making trade between states more difficult than needed), etc. etc.

Basically the paragraph is saying "all the stuff you guys were doing or maybe thinking about doing, stops now!" - I say "maybe thinking about doing" because unsure about letters of Marque although I could certainly see it having been done by some of the more haughty states at the time - say New York or Virginia.


But how do you actually know the intent if you can't trust the words and you weren't there when the text to write down was discussed? This is literally your interpretation of the document versus someone else's.


If intent counted more, then nobody would be acquitted of a crime on the basis of a technicality

One person's "technicality" is another person's last hope to have a core principle enforced like "innocent until proven guilty".

Intent is a squishy thing that's open to endless subjectivity and biases. Intent should only be used when the letter of the law isn't sufficiently clear.

Separation of Powers was one of the most brilliant concepts undergirding the US Constitution. It's already been damaged pretty badly over the years. We don't need to double down on that damage by insisting upon having judges ignore the wording of laws so they can go with their feelings regarding intent.


I have never seen any law, ever, that came anywhere near close to "sufficiently clear".

Upthread there was a discussion of hundreds of pages going into something as trivially straightforward as the requirements of a financial contract. For something involving human behavior, there will always be billions of corner cases, exceptions, confounding events, and other factors leaving decisions open to a judgment call.

That's why lawyers spend years just learning to read the law, and then reading and synthesizing thousands of decisions that try to patch together all those inconsistencies and gaps. And even then, every case ultimately comes down to a judge's judgment call on which lawyer has done so more successfully... or worse, a jury of twelve people deliberately selected for their ignorance of the law.

Human beings are too squishy to write genuinely precise laws. Lawyers try to pretend otherwise, and that pretension is fundamental to trying to actually have a society. But let's not kid ourselves into thinking that any law is actually rigorous in a sense that a computer programmer, scientist, or logician would recognize.


IANAL and I am not too familiar with the US constitution, but the paragraph you mentioned seems to have a very clear intention: To forbid the individual states from a) performing diplomacy/foreign policy on their own and b) creating their own currency.

At this point, I want to point out that while a lot of discussion always surrounds the status and the individual rights of the states, I think it really is clear that the US is the state (as in nation) and the individual states, while they have some autonomy, are the inseparable parts that the US is made of, just like in other federal states (nations).

From that perspective, I think the justification for the paragraph you mention is clear: The states are not independent, and therefore they can't have diplomatic ties with other nations. In diplomacy, this has always been the case: Diplomatic ties always exist between independent nations, never between an independent nation and a federal, dependent state. There is nothing like an "Embassy of the U.S. State of Texas to the United Mexican States" since the State of Texas can't have diplomatic ties on its own.

What you mention regarding printing more money does not apply here: The paragraph you mention talks about the US states, but not about the (nation) state itself, the USA. There exists no alliance of states, rather there is a federal republic consisting of states (which in turn are not allowed to have any diplomatic ties of their own). The (nation) state represented by the US government is obviously not a (federal) state inside the US, that would be absurd.


> But clearly this defies common sense about the intent of the law which was to prevent the government from printing unlimited money not tethered to any scarce asset to inflate away its true debt...

That is a really, really strained interpretation of this clause. Because these exact words are used to give these powers to other entities in government:

Article I, Section 8, clause 5: [The Congress shall have Power] To coin Money, regulate the Value thereof, and of foreign Coin, and fix the Standard of Weights and Measures;

Article I, Section 8, clause 11: [The Congress shall have Power] To declare War, grant Letters of Marque and Reprisal, and make Rules concerning Captures on Land and Water;

Article II, Section 1, clause 2: [The President] shall have Power, by and with the Advice and Consent of the Senate, to make Treaties, provided two thirds of the Senators present concur

Some of the other things forbidden with "No State" are also expressly forbidden to Congress itself. How is it common sense that "No state" also includes "the federal government" when the federal government is sometimes given the power which is allotted to "no state" and sometimes expressly forbidden as well that which is allotted to "no state"?


> I read that many criminals in the US get away with crimes because of failures of procedure by the prosecution for example;

How does anyone know they got away with a crime if their cases never went to trial? Isn't that exactly what a trial is there to establish.

The USA has a long history of village folk "just knowing" that an accused is guilty. Seldom worked out well.

Pilots have procedures they must follow. So do cops. They just need to do their jobs properly.


>> How does anyone know they got away with a crime if their cases never went to trial?

Based on probability, you can safely assume that some percentage of them were guilty. That is more correct than assuming that they were all innocent. But to presume innocence simply on the basis that they were not convicted is itself part of the problem wereby the procedure and the wording takes precedence over the intent.

In reality, there is no certainty; some guilty people will be acquitted and some innocent people will be charged. To put too much weight on the words 'guilty' or 'innocent' is to ignore this reality.


> But to presume innocence simply on the basis that they were not convicted is itself part of the problem wereby the procedure and the wording takes precedence over the intent.

Unless the intent is "it is better a hundred guilty persons should escape than one innocent person should suffer."


You seem to have several misunderstandings about the intent of US law in this thread.

The intent of US law is specifically to allow many guilty people to be freed due to procedural errors, and so the fact that this happens is not taking "precedence over the intent".

Also, in the US, no one is ever found "innocent" in court. They are found either guilty or not guilty.


"Risk"

Yeah, we crossed that Rubicon like Evil Kenevil jumped the Grand Canyon a long time ago.

I'd like to see this sort of rigor applied to the tax code so that we can all agree that "complexity is a subsidy" and just scuttle the whole mess in favor of something not an iron rod of oppression.


Clearly it needs an easy way to drop to a command prompt for input.

Seek death penalty? Abort, Retry, Fail?_


I concur, and as a result the same goes for contracts.

A friend recently told me "I never needed a lawyer because someone did not understand a contract, but several times because someone tried to finagle with legal sophistry".


Related:

When trying to compare the legal systems of various countries, it is instructive to note that they differ by large amounts. The basics of Law, and the governing principles and assumptions behind Law, vary quite a lot. Even within countries, the law is often not the same everywhere. Wikipedia has a good map of the 'basic' law systems used throughout the world:

https://en.wikipedia.org/wiki/List_of_national_legal_systems


> Judging by how most people write code today, I find this idea terrifying.

Writing code is not a single activity, it is a modern means of communication and information transmission. Code is written for a vast range of purposes, some better suited to less rigorous programmers than others. No-one is asking or expecting the authors of bad code that you have in mind to formalize legal systems. The latter task will be on the rigorous side of programming, presumably leaning heavily on static type checking / automated theorem proving.


As a programmer, and indeed a logical thinker in general, I find the fact that there is any human interpretation to be horrifying. At worst this means undesirables get worse punishments, and at best it means when going into the legal system, you have no idea what to expect. Indeed, you have no way to manage the risk that the law presents to you. Your only hope is to avoid it at all costs.


Can we solve this by generating example scenarios from the code and putting them to a common sense test? This is an integration test, of sorts.


Not if its functional programming. And it has a chance to move just about the same speed as the legal system.


There have been many such attempts (e.g. NKRL by Zarri et al., also funded by EU, not mentioned by the paper). There are even societies that have been dealing with such issues for many decades (e.g. http://www.iaail.org). The formalization of law and language is only one of the issues. Like many previous attempts, this one suffers from the fuzziness of human language. Fuzziness is not a drawback; it is what makes it possible to communicate efficiently in the first place. In order for us to communicate effectively, we need an enormous amount of tacit knowledge about our environment that our culture and life experience brings. If one tries to formalize the language, as in the present approach, one must also take this knowledge into account, down to the last detail (an "upper ontology" is by far not sufficient for this, and e.g. at https://en.wikipedia.org/wiki/Cyc they have been working on a complete formalization for decades and are far from finished). And the tacit knowledge and also the moral valuation of the same change over time. And there are things like https://en.wikipedia.org/wiki/Sorites_paradox which stand in the way of a complete formalization. And also works like those of Gödel or Wittgenstein showed the limits long ago.


At first glance, the paper seems to be written by competent people, but the 'related research' section is rather shallow. How is what the authors have developed more than "yet another language"?

I don't want to be too critical, though, it probably can't hurt to work on something new without dragging along all the baggage of the past. Still, it feels a bit strange from an academic perspective.


> it probably can't hurt to work on something new without dragging along all the baggage of the past

Ignoring a century of relevant research is not a recommended scientific approach though. The "baggage of the past" are rather the "shoulders of giants".


I mostly agree, but many scientific communities are rather self-centered/inward-looking, and when it comes to technological innovation like programming languages, it could be a reasonable compromise to align slightly more with a "mainstream" ecosystem and slightly less with the academic community. Of course, ideally, all design choices are compared to the community's state of the art, but this may be impractical if it implies a comparison to tens of obscure languages.


All three authors are computer scientists. It even seems to be a cooperation between academia and industry .


Awesome to see someone trying to implement this idea, which I bet most people here have wished existed at some point.

    1 scope Section121SinglePerson:
    2   rule requirements_ownership_met under condition
    3   aggregate_periods_from_last_five_years of personal.property_ownership >=^ 730 day
    4   consequence fulfilled
Syntax (shown above with line numbers added) looks pretty clean and simple. Hopefully it is enough like natural language that non-techies wouldn't run away at first sight

This probably won't cause a legal revolution tomorrow, but I hope it can be the start of a slow reform toward linguistic precision and machine-friendliness in legislation in the future.

Also it's a bad name, since that's already the (local) name of a widely spoken natural language.


This snippet begs all the important questions. How is the value of "aggregate_periods_from_last_five_years of personal.property_ownership" calculated? What does it mean to own a property? Does it count if you jointly own a property with someone else? If so, how much of a percentage of ownership counts as ownership? What about if for some part of the 730 days the ownership of the property was subject to dispute and it could have been owned or not owned by the person in question?

The whole thing smacks of false precision to me. It's precise about things that are not legally complex to be precise about in the current system, and the language gives the false impression that it's actually going to be possible to be precise about everything.

Secondly, it feels like it will have a similar drawback to sql[1] in that it seems on the surface to be non-technical, but actually requires significant technical expertise to avoid numerous beartraps and understand anything beyond the very most obvious.

[1] Which was originally intended to be a formal version of natural language understandable to non-technical people (hence the original name SEQUEL "Structured English Query Language")


> How is the value of "aggregate_periods_from_last_five_years of personal.property_ownership" calculated?

How is the value of "during the 5-year period ending on the date of the sale or exchange, such property has been owned and used by the taxpayer as the taxpayer’s principal residence for periods aggregating 2 years or more" calculated?

> What does it mean to own a property?

What does it mean to own a property under the non-formalized law?

In the end, it means that if you file a tax return that the tax authorities aren't happy with, you can argue in front of a judge whether you think you own something, and the judge will decide whether you do. Based on this they will decide whether your tax return was valid or not. The technicalities of whether you own something in the eyes of the judge have nothing to do with whether your tax return was based on your reading of the law or on a tool based on this language. Either way you must decide whether you think you own something, and the rest of the computation is based on this decision.

> Does it count if you jointly own a property with someone else?

Search the paper for "joint returns", it quotes both the law and its formalization.

> The whole thing smacks of false precision to me. It's precise about things that are not legally complex to be precise about in the current system

<scratches head> The questions you raised do appear legally complex to me. But also I think you misunderstood the whole problem domain. The law currently spells out an algorithm for computing something. The inputs to this algorithm include questions like "do you legally own stuff". These inputs are outside the specific algorithm spelled out in Section so-and-so. They are similarly outside the specific algorithm spelled out in the computer program.


The law doesn't specify anything as precise as an algorithm. The law has multiple statutes, codes, precedents etc with overlapping applicability. Then in a particular case these are tried and humans come to a decision.

Trying to pretend it's precise like a computer algorithm is I think a really big problem.

The reason I posed those particular questions is that if you try to make this answer the questions you rapidly approach a level of complexity similar to the existing legal codes and still don't have something that could automatically be evaluated.


> The law doesn't specify anything as precise as an algorithm. The law has multiple statutes, codes, precedents etc with overlapping applicability.

Again, I think you're arguing about something that the paper doesn't claim to do. It doesn't claim to formalize "the law" as a whole. It claims to formalize part of a single section of a single statute. That specific statute describes precise rules into which you feed boolean and numeric data, and the application of which rules spits out boolean and numeric data. These rules are written specifically so that they are not ambiguous, so that anyone given the same input data would come to the same output conclusion. That's an algorithm.

What is indeed ambiguous are questions of ownership: Did you own a certain property for two years within a five-year period if for part of that period it was co-owned with a friend who then became your spouse and then later divorced you and got the property? But again, this is not what the paper claims to decide. This is something that the algorithm in the paper, and the algorithm of this specific section of the tax code, takes as input as a boolean flag.

> The reason I posed those particular questions is that if you try to make this answer the questions [...]

But again, you do not try to make this answer these questions. And the paper never claims that it does.

The tool described in the paper does not contain a function like "given THE ENTIRE STATE OF THE REAL WORLD, compute whether I owned such-and-such property for two years in a five-year period". The tool described in the paper only contains a function like "given a boolean flag modeling MY CLAIM to have owned (or not) such-and-such property, compute a number for me".

Your reasoning for having owned (or not) a certain property is not checked by the tool described in the paper. It takes your claim about property ownership as a boolean flag and says "well, if you are really sure that your claim is correct, then your tax liability is such-and-such". It doesn't claim to check whether your claim is correct.


Sure, but the problem is then it's not very interesting. There are lots of real-world systems already that do this with existing laws and existing programming languages (eg in the UK where I live the the tax service has a free web filing system where you fill in your tax return online, it checks it for correctness and tells you how much you owe or are owed (if you have overpaid).


nonono, arrogant language lawyer types in robes pretending that natural language is more precise than formal logic codes is the big problem.


Those are not problems specific to this language. You need to define those terms properly to tax people adequately.


Another one that jumped out at me is if I own a property from noon one day to noon another day - is that one day or two days?


Indeed, not the best name, but according to the README, the reason for chosing it was the following:

"The language is named after Pierre Catala, a professor of law who pionneered the French legaltech by creating a computer database of law cases, Juris-Data. The research group that he led in the late 1960s, the Centre d’études et de traitement de l’information juridique (CETIJ), has also influenced the creation by state conselor Lucien Mehl of the Centre de recherches et développement en informatique juridique (CENIJ), which eventually transformed into the entity managing the LegiFrance website, acting as the public service of legislative documentation."

I'm wondering what could result of the combination of smart contracts on a blockchain and this legal language.


Which is a bit circular, because Pierre’s last name is a reference to the region that speaks that language, since he was born in neighboring Occitanie.


The somewhat ironic logic here being that a name referencing a place is very unlikely to come from that place. If Jordi moves from a Catalan area to an Occitan area then he (and his descendants) might be known as "Jordi the Catalan". It wouldn't be very useful to pick that as a name for someone who never left though.

Of course nowadays people's names are mostly just symbolic, and you can move to England and still be called Johnny English.

(In fact, as a sibling comment notes, maybe SQL is named after him.)


>Of course nowadays people's names are mostly just symbolic, and you can move to England and still be called Johnny English.

This depends on the language. In Spanish (and by extension every romance language) no one would know the meaning of Alvarado (they would associate it with the name Álvaro, but just that) or González, but it would be awful to have a name like "Marcos Café/Marrón" (Marcus Brown) or "Carlos Panadero" (Carlos Baker).


What about Marcos Moreno? Or Carlos Zapatero?

I take your point that the origins of surnames in Spain are typically unlike those in e.g. England, but there are plenty of common apellidos originating in physical descriptions, (far away) place of origin, or livelihood.


Kind of like how, if someone's last name is Ashkenaz/Ashkenazi, you know they're a Sephardic Jew.


> Also it's a bad name, since that's already the (local) name of a widely spoken natural language.

As a native Catalan myself I was quite surprised by the title.


It seems to be properly reasoned, but I agree - as another native Catalan - that the title is surprising.


It's a mistake to consider the law to be something absolute and logically-programmable.


It's a mistake to think that only things which are "absolute" can be logically programing. Take this silly example:

> If x is a good person they can go to heaven. Unless they were born on Feb 29th. The previous restriction is lifted if the year is a prime number. The counting of the year is Julian. The destination of heaven is replaced with hell if at least 5 people are on mars.

As the article shows, exactly existing law is fully of "whoops, you better keep on reading to find that override" situations like the above. Formalization still helps by putting all branching up front rather than with the the moral equivalent of COME FROM statements. And that helps whet there or not "good person", "heaven", and "hell" are well defined.

...The responses to this prreprint make me think no only do HN readers not understand the law, but they don't understand programming well either. :/


I agree with this but I also want to see someone try and fail and help us learn from it. This is really one of those untouched territories and there is reward in pursuing that regardless of the end result.


definitely a mistake for a common law based system, but maybe not for a napoleonic system.


Tax codes that spell out detailed computations are de facto "Napoleonic" (civil) even in a country that applies common law to other things, no?


Goedel's incompleteness theory shows that an axiomatic system can either be consistent or complete, but not both.

Software is an axiomatic system. Human behavior is NOT.


>Software is an axiomatic system. Human behavior is NOT.

I am not aware of any system of law that covers all possible human behavior. Common law systems theoretically could because part of the determination of what is covered is up to the courts, thus when you have a new human behavior it is uncertain if it is covered or not until you take it to court, but in a Napoleonic system the coverage of the law must be explicitly stated.

Thus Napoleonic systems would try for consistent but not complete.


I assume you refer to his results regarding axiomatic systems for arithmetic on natural numbers? If a law attempts to capture this, then, sure, it's problematic. But I don't that's very common.

From what I've seen, formal methods for legal systems would take the "facts" of a case as given (assumptions), and deduce legal consequences of those.


as these things go they will create the illusion of improvement and the chances of things going wrong will be reduced in frequency but when they happen it'll be injustice at cloud-scale. These things have a long fat tail.

Before I'm getting excited I want to know how it will be used in the developing world. E.g. perhaps we'll see an automatic smart-judge that goes from Cellebrite evidence directly to machine based torture and public execution with some blockchain sprinkled-on for good measure. See the latest dystopian piece on how Technology helps the developing world fight crime: https://cpj.org/2021/05/equipped-us-israeli-firms-botswana-p...


> Hopefully it is enough like natural language that non-techies wouldn't run away at first sight

Written law itself is also not "enough like natural language that non-law people wouldn't run away at first sight".


yeah they would run away screaming, this is too ambiguous /s


The reason why law, especially criminal, can't be expressed via bytecode is mostly because law in action is enforced selectively based on social effects. There are too many edge cases, circumstances and such to be able to with a binary output apply it to a situation.

There is a possiblity of automating the mundane, the administrative tasks, which seems like this language is taking a shot at - tax law - or such, but probably in the end we will start transitioning to law that just mentions that the taxes are counted by an algorithm and the algorithm is published on GitHub.

Using software metaphors, I think law has too much responsiblites and we need to decouple some stuff.

Source: I am law student, software engineer.


This is nonsense. We can use these techniques even when "inputs" are somewhat subjective. They point is to separate the boring mechanical parts from the interesting parts.

This type of argument is the same as the people arguing types are bad because they alone don't ensure program correctness. They don't need to. They take care of the boring bits freeing up the mind to think about the interesting parts.


I think what would matter isn't the application of it, but writing the law in a way that can be analized better than in a sentenced format


It’s also not as easily biasable.


Everyone who thinks law-as-code on actual, not tax-code-algorithm law is a great idea: We as an industry have yet to figure out how to reliably get simple business processes into code without mishaps, imagine the hubris of attempting to get all of society into code without terrible results ...


The Dutch Tax Administration have been using controlled natural language for the specification of their legislation for some time now with JetBrains MPS (external) DSLs: https://www.youtube.com/watch?v=1pfABlzO2Wc&index=2&t=0s&lis...


Tangential note, it’s exciting to see someone using MPS. It’s an interesting and possibly transformative software technology that seems ahead of its time.


Sounds very similar to https://www.eclipse.org/Xtext/

(though I never used MPS)


Yes, MPS and Xtext are the best language workbenches in the market. The main difference is that MPS uses a Projectional editor, so you can use different notation (not only text): https://youtu.be/XGm_khXZl44


They're not aiming at all of society. Parts of the law are already algorithms (not all of it), they just happen to be written in an obscure uncompilable Legalese.


This isn't python or JavaScript. This language is specifically designed to integrate with proof assistants, like ambitious hyper-secure TLS implementations or formally verified microkernels.


The famous mathamatician Godel was onto this some decades ago and found a loophole that allowed the American democracy to be legally turned into a dictatorship.

https://en.wikipedia.org/wiki/G%C3%B6del%27s_Loophole


Exactly the sort of misunderstanding of power you'd expect from a mathematician. It isn't the constitution itself that keeps governments from seizing dictatorial power. No potential dictator is going to say "Damn, Article V says it can't be applied to itself! My plans are foiled!"


They don't mention Stanford's CodeX project? https://law.stanford.edu/codex-the-stanford-center-for-legal...


I like the idea of using a formal language to actually aide development of a law. Similar to how when I code, the compiler will tell me if a default clause in a switch statement is missing or help me consider the edge cases when a value is missing. A formal language can help lawmakers consider edges cases that they might not have thought about initially. It may even help them re-consider the entire purpose of the law from a higher level once they actually experience the reality of putting into language. I don't think every law can be programmed in a formal way but I like the idea that formal thinking can potentially help make better laws.


There are already (by analogue) programming languages for law, natural human language. A contract for instance, or the law itself.

The problem with the law is interpretation. It is not deterministic and that is because only a small subset of human language is deterministic.

A programming language alone does not solve interpretation (what does this statement mean) and trust (can I trust that THE [because there should only be one] interpretation of this statement has been executed). A programming language (interpretation) and a runtime (trust) are both required.


Lojban, a predicate logic-based language designed to be unambiguous, is also a good choice. Though there's a difference (usually harder) in learning a human to a programming language. Interestingly, its ancestor, Loglan, originating in 60s predates logic programming itself.


Be aware that switching languages entirely for the law introduces two new problems: translation from the thoughts one has in their head into the target specification language, and that means not everyone who is subject _to_ the laws can all _know_ the laws. That circumstance reminds me of the church services in Latin -- "uh, trust me, the Lord said you should definitely mow my lawn or risk firey damnation. He said bring beer, too"

Now, while using a programming language mildly runs afoul of those same risks, the difference in my mind between a programming language and using lojban (or conlang-o-choice) is that the programming language uses the same vocabulary (plus or minus) just with more rigor in the links between the words. One need not already know "re'azda" is "house" in order to figure out "house.value"


Yeah, that's what I meant with difficulty. A programming language is basically specified logic that can utilize any human language (vocabulary) as basis. But you lose freedom since you're forced to structure your logic in a specific pattern. Lojban requires you learning a new vocabulary to regain flexibility while retaining the rigor. The benefit of this in law becomes apparent when considering the importance of semantics over syntax.


Lojban is the first thing that came to mind as I saw this. An excellent choice.


Catalá is also the name of the catalan language in catalan (https://en.wikipedia.org/wiki/Catalan_language).


This motivated a visit to wiki to look up Code of Ur-Nammu. Some gems (yes, /s) in there:

If a man accused the wife of a man of adultery, and the river ordeal proved her innocent, then the man who had accused her must pay one-third of a mina of silver.

So, here is justice at work. If she drowns, she was guilty. If she survives, someone gets 1/3 of a Mina. Ur-Nammu is vague here as to who gets the money. Likely the husband.

How would we code this in Catala? I'll start it, someone else can finish it /g

  declaration structure Person:
    data id content integer

  declaration structure Period:
    data start content date
    data end content date

  declaration structure MarriedCouple:
    data marriage_date content date
    data husband content Person
    data wife content Person
   
  declaration structure AccusationOfAdultery:
    data couple1 content MarriedCouple
    data accuser content Person
    data adultery_date content Period
   
  declaration structure River
    data id content integer
 
  declaration structure OrdealByWater:
    data subject content Person
    data river1 content River
    data ordeal_date content Period

  declaration scope UrNammuLaw14:
    context requirements_met condition
    context married_met condition
    context accused_while_married_met condition
    context tested_by_water_ordeal_met condition

  ...
https://en.wikipedia.org/wiki/Code_of_Ur-Nammu#Surviving_law...


Some issues I think I can see:

1. The above law makes it clear that it is only a man who can accuse the wife of another man of adultery. Your code seems to permit other genders as accusers.

2. Similarly it seems only the wife needed to face the river ordeal. (i.e. every person subject to the river ordeal needs to have a husband) Your code seems to allow the husband to subjected to the river ordeal.

Sorry for nitpicking... it was not to attack your attempt at the code, but the entire idea of formalizing the law in this manner might be very tedious, and less precise in practice than natural language.


np, it's good on all points. So we need gender defined. And yes, it's tedious. Which is why IANAL :)

[but p.s.] reviewing the laws in wiki, the only instance of a 'female voice' is when a slave girl puts down her mistress (and gets the Sumerian equivalent of soap in mouth). So, putting my Sumerian lawyer hat on, 'context' here implicitly affords 'legal voice' only to men. Women were mute as far as the law was concerned. So a Sumerian AI judge using Catala would have no problem with my code :)


Okay, but isn't that a good thing?

Because in the normal case, everything will already be there and only the last part must be written. And then, if someone changes the definition of a "River" then you will immediately see all the laws that are impacted and how they are impacted. Sounds good to me.


And that's why we should also write tests, to make sure the intention of the law works at all times.


I don't think it is either or. Tests are certainly helpful, but they are not perfect and they do not replace a good language/compiler.


funny that the name of the language transcribed to Russian Cyrillic is "Катала", meaning "criminal fraud gambler"

https://ru.wikipedia.org/wiki/Катала


Finally someone is doing this! For years I've had the opinion, that law would be better of with more fomalization and that laws should be written in formal language. A friend of mine even tried to built a parser for german law a few years back, but the project never got very far.


Computational law isn't new by any stretch.


Now, how they are going to solve the versioning problem of the same program having to deal with the ever changing laws, contradictory law statements and scripts that need to contain several versions and contradictory versions is going to be very interesting.

After all, those could benefit programmers too.


I think coding the law is not just great, but also the only way forwards. A couple of things:

- Comparing codifying law vs codifying human language/communication in general I think is very unfair. No doubt law is using the form of human written communication but it has structure and it is meant (at least in principle) to be precise. Even if there is some legislation that goes beyond that structure (honestly interested in seeing a few cases) why would we restrict ourselves from codifying all the rest of the legislation? We can have at least part of our law in an executable form, I can't see a drawback with that.

- Just for fun, let's start thinking about the intersection of law binaries and software licences, oh boy! :D


> Even if there is some legislation that goes beyond that structure (honestly interested in seeing a few cases)

It's extremely rare, approaching nonexistent, for laws to have a precise, deterministic meaning because the vast majority of human language doesn't either. Here's a good example from CodeX (Stanford Center for Legal Informatics) that assesses an extremely simple law, which contains a surprising amount of ambiguity[0]:

> "One technical problem with Computational Law, familiar to many individual with legal training, is due to the open texture of laws. Consider a municipal regulation stating "No vehicles in the park". On first blush this is fine, but it is really quite problematic. Just what constitutes a vehicle? Is a bicycle a vehicle? What about a skateboard? How about roller skates? What about a baby stroller? A horse? A repair vehicle? For that matter, what is the park? At what altitude does it end? If a helicopter hovers at 10 feet, is that a violation? What if it flies over at 100 feet?

[0] https://law.stanford.edu/2016/01/13/michael-genesereths-comp...


"But in some other areas, law leaves little room for interpretation, and essentially aims to rigorously describe a computation, a decision procedure or, simply said, an algorithm."

I would love to know what these areas are. There I'd debate around all sorts of seemingly little things in the law, like a definition or a word being singular. On top of that you have "discretion" built in and highly utilized with the judge, cop, and especially the prosecutor.

We already have legal algorithms for sentencing, alimony, child support, etc. These are widely decried as terrible. Why would we want more?


  declaration scope Section121Return: context return_data content ReturnType context person1 scope Section121SinglePerson context person2 scope Section121SinglePerson context gain_cap content money
Why can't this just be a framework for JS of Python? Something that any person of law could easily get into and give them the power of learning the basics of a popular language.

Little about this seems compelling and comes off as self-indulgent. Could you imagine the dread of some poor guy with little to no interest in programming having to learn this?


I think a great use case for this is to build legal validators. For instance if you make products that are heavily regulated and you want to distribute them in many jurisdictions, than having software that can take a description of your product and tell you in which jurisdictions it passes and in which it fails (and why and how to fix it) can drastically cut your iteration times.



Programmers and lawyers: together at last!


There are dozens of us!


My personal motivation behind "law as a programming language" is to aid me, as a lay person [1], to understand it better.

An AST could help me quickly navigate between paragraphs, and make searching the law easier?!

---

[1]: Just someone, who wants to understand law easier, and not spent years having to study it.

"understand law easier" also in the sense of TLDR law.

And yes, I'm aware of the interpretation aspect of the law, but just give me some easy to understand starting point.

Edit: formatting, formulation


This might be more useful for specifying the clauses of smart contracts than actual law.


The syntax reminds me of Ada. I wonder if that is pure coincidence.


As if "computer says so" wasn't bad enough already.


Katala means devious in Finnish. Possibly not coincidence.


"Catalán" or "Català" is a widely language spoken in Catalunya, NE Spain. Occitan is his sister language, spoken in south of France.


I think this is very interesting. As systems of distributed governance and smart contracts mature and emerge, I wonder if this sort of thing will be a step on the way for the old world to keep up.


I always thought the next programming language for the Law will come from some curious Cryptocoin with a very specialized contract programming language...


Wasn’t this posted some time ago?


A programming Language for Law is a bad idea. Or, certainly, one must have a very clear idea of what the goals are for such a programming language. For example this (almost throwaway) comment from another post:

>I'm wondering what could result of the combination of smart contracts on a blockchain and this legal language

Is precisely the wrong way to be thinking about this.

The expectation of Law formalised into code is that the benefit lies from the execution: that if you can provide it with the 'inputs' then the neutral computer will do the 'hard work' of working through the consequences and output a decision.

The other intended hope is that the computer can act as a neutral mediator (presumably why what I quoted above also includes the words 'blockchain' and 'smart contracts') which will come to its conclusions justly and without prejudice. Indeed, if you have an agreed upon system via both parties, then it acts like the spinning table at the beginning of the Temple of Doom - allowing two parties with competing interests to come to an accord without trusting each other.

But it is pretty clear to anyone who has worked within a legal or law enforcement context that this is not usually the bottleneck. Lawyers and Justices are usually intimately familiar with the law, and so much of a trial or a case is working through the establishment of the inputs and any of the logical crunching is normally able to be followed through fairly easily by those present.

There is absolutely a case for tools around the law that may superficially seem similar to the above, but in spirit are the complement: search tools for users to easily find related documents, markup to declare and highlight evidence formally (for human consideration, not machine), editors that can assist with writing up legal documents etc.

Something that would seem to be the same thing, for example, would be to create a data model over the existing law to describe the legal architecture as alluded to in the description of the OP: dependencies, exceptions, related articles etc. However, the purpose of doing this in the OP would be to make this executable. The actually useful thing is a lot more banal - to allow things to be surfaced for human inspection and consideration more quickly and easily. It would be to try to spot contradictions and inconsistencies in the law making stage, not in the law interpretation stage. It would be to spot the impact of making a change to one paragraph beyond just the Act that is being scrutinised.

These things are decidedly unsexy to a technical audience, and don't have that same electric feel of 'solving a problem'. Indeed, many of the biggest advances to be made in Law are to be using de rigeur tools in the tech world (such as version control) in a legal context.

A TED talk that summarises some further points:

https://www.youtube.com/watch?v=CEN4XNth61o


Law already is a programming language. It just has (deliberate) compiler issues

Where law is unclear, The drafters of the law were unclear what they wanted or could not agree what they wanted or were unwilling to explicitly state what they wanted.

I don't think we need to rewrite our laws in software friendly language.

We shall soon enough be needing to have law and software hand in hand - the obvious example is the self driving car and trolley problem - a car sees child ahead in the road, and cannot brake in time so must swerve, but swerving will kill the driver (or a old man on the sidewalk).

That must be explicitly coded into the car ahead of time - and that will need to be a regulation / law of the jurisdiction because no car company will just choose.

And so we are presented with the perfect problem - it will demand democratic involvement, it is impossibly hard ethically, and it will need the software to be inspectable / verified before and after the incident.

If we solved those problems then the recent Post Office debacle would have different .


This view: "Law already is a programming language."

always strikes me as so fundamentally wrong, and also very dangerous.

Goedel's incompleteness theory shows that an axiomatic system can either be consistent or complete, but not both.

Software is an axiomatic system. Human behavoir is NOT.

Law provides remedies for specific harms, such that vengeance becomes the right/duty of the state (which hopefully tempers it with justice)

Law thus needs room for interpretation and for understanding of the human. Thus the need for a judge. Thus a jury. etc.

As one man put it, "the letter of the law kills". Software is only the letter.

We all want clear, black and white guidelines on "good" and "bad". We sometimes look down on religions which seek to provide this.

For the love of $diety$, let's not make software that $diety$


I agree on all accounts, however I've always taken "Law is a programming language" or, maybe better, "Law is a program" as more of a metaphor. I see many similarities between law and computer programs, it's just that a law is a program written for humans (judge, jury, ...) to interpret, whereas a computer program is written for a computer.

Thus I think any attempt to translate a written law into something a computer can interpret as a failure, because the interpreter is not optional in the law. The interpreter is part of it and constitutes a vital piece of it, and without it written law does not make a lot of sense at all. Still, it is, in my view, a program which can be "executed", just not without a lot of context.


Actually, software can be as well "interpretable". It just means that the outcome is not "he has to be convicted because of X and Y" but instead "if X and Y are relevant then he should be convicted". Which means, software cannot automatically handle a case but it needs to be interpreted - however, the interpretation will be more formal and not so handwavy.


> Software is an axiomatic system. Human behavoir is NOT.

It really isn't though. When I write software I have to consider the different interpretations by the OS, browser, etc. and draft accordingly. Ditto for when I draft legal documents and need to consider the different interpretations by opposing counsel, a regulator, etc. As someone who drafts legal documents for humans and writes code for computers I can tell you first-hand the two processes are remarkably similar.


> Goedel's incompleteness theory shows that an axiomatic system can either be consistent or complete, but not both.

Goedel's theorem is about a specific form of incompleteness.

I think you'd have to provide some additional arguments why the theorem would be relavant in this context.

In other words: I don't think law as a formal system needs all the features that are preconditions for Goedel's theorem.

Similarly: Most proof assistants are very much usable without allowing arbitrary recursion.


I think the concept that is most useful here is that of fuzzy logic: rather than if statements requiring something that returns 1 or 0, they require a probability p \in [0, 1]. Some things are essentially certain. Others are not. Of course, the main point shown at trials is that these probabilities are themselves uncertain...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: