Prolog language for PostgreSQL proof of concept (github.com/tatut)
214 points by triska 10 months ago | 77 comments



As a proof of concept, this looks very cool. Suggestion: add a short example to the README.

I had one experience with Prolog in the 1980s that blew my mind. I had an IR&D project to build a complete prototype of an air/land battle simulator (yes, I was a defense contractor back then) in Common Lisp, given 6 weeks of coverage to write it and demo it. After a month I was satisfied with the functionality, and after demoing it I asked permission to rewrite it in ExperProlog on the Mac (I had done the Common Lisp version on my Xerox 1108 Lisp Machine). In ten days' time it was done, and it also had nice graphics and UI extensions that the Common Lisp version did not have. Anyway, except for a few small open source things, that was the only large project I ever did in Prolog.


Can you explain a little about why / how you modeled a battle simulation in Prolog?

To me, the most natural approach is simply a time-stepped battlespace model. With event uncertainty (e.g., did the bullet hit its target?) modeled as random draws that get baked into the outcome.
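A time-stepped model with random draws, as described above, could look roughly like this. It is only a minimal sketch: the function name, the parameters, and the one-shot-per-shooter rule are illustrative assumptions, not anything from the original simulator.

```python
import random


def simulate(shooters, targets, hit_probability, steps, seed=0):
    """Time-stepped engagement: each step, every shooter fires once."""
    rng = random.Random(seed)  # seeded so that a run can be replayed exactly
    remaining = targets
    history = []
    for step in range(steps):
        if remaining == 0:
            break
        # One random draw per shot decides hit or miss; the hits are then
        # baked into the state that the next time step starts from.
        hits = sum(1 for _ in range(shooters) if rng.random() < hit_probability)
        remaining = max(0, remaining - hits)
        history.append((step, hits, remaining))
    return history


history = simulate(shooters=3, targets=10, hit_probability=0.4, steps=20)
```

Each entry in `history` is `(step, hits, remaining)`, so the outcome of every draw stays visible after the run.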


My speculative reaction: Prolog offers a way to describe an AI "commander" that can make plans that optimize for many units simultaneously. This technique is often used in strategy games, not usually with Prolog itself but using similar hand-written planner algorithms in tandem with FSMs.

One of the downsides of this approach where it appears in games is that it results in unit behaviors that look robotic and overly micromanaged, often using tactics similar to early computer chess AI.


Honestly, that was almost 40 years ago and the details escape me. I do remember that it was not a fine detail model. I think some ‘experts’ in my company wrote up how they solved certain problems and I tried to capture that. I might have used OPS5 in the Common Lisp prototype. I (in hindsight) wasted a lot of time back then on symbolic AI that didn’t scale, and not enough time with neural networks. I was on a DARPA neural network advisory panel in the mid 1980s and I had a few good wins using NNs.


This reminds me of an old idea I’ve toyed with. In a logic class in university we talked about how some logic is time dependent, like A is true 2 hours after B becomes true. I was inspired to try to think through a language like prolog that could model and solve these relations through time. Didn’t get far with it since it’s a hard problem and I had too many classes that term. I was thinking it would be useful for clockless chips.


It is called Temporal logic or tense logic[1][2]. Linear time temporal logic is used in formal verification.

[1] https://plato.stanford.edu/entries/logic-temporal/

[2] https://en.wikipedia.org/wiki/Temporal_logic
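For the "A is true 2 hours after B becomes true" relation from the question, a check over a finite, hourly trace can be sketched as below. This is plain trace checking under assumed names (`becomes_true`, `holds_after_delay`), not a temporal-logic solver or an LTL model checker.

```python
def becomes_true(trace, prop, i):
    """True if `prop` holds at step i and did not hold at step i - 1."""
    return prop in trace[i] and (i == 0 or prop not in trace[i - 1])


def holds_after_delay(trace, trigger, effect, delay):
    """Check: whenever `trigger` becomes true, `effect` holds `delay` steps later.

    `trace` is a list of sets, one per time step (here: one per hour), each
    holding the propositions that are true at that step. Triggers that fire
    too close to the end of the trace cannot be checked and are skipped.
    """
    for i in range(len(trace)):
        if becomes_true(trace, trigger, i) and i + delay < len(trace):
            if effect not in trace[i + delay]:
                return False
    return True


good_trace = [set(), {"B"}, {"B"}, {"A", "B"}, {"A"}]
bad_trace = [{"B"}, set(), set()]
```

Here `holds_after_delay(good_trace, "B", "A", 2)` is true because B becomes true at hour 1 and A holds at hour 3, while the same check on `bad_trace` fails.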


Not sure if it's exactly what you're looking for but see this paper: https://ceur-ws.org/Vol-2785/paper9.pdf ( A Language for Timeline-based Planning )


What about it blew your mind?

Are you implying it's because a rewrite took less time than the first version?


If you're interested in this I would also recommend you check out Logica[0], which is a datalog-like language that is explicitly made to compile to SQL queries.

0: https://logica.dev/


I was wondering how this was possible and then found this monstrosity: https://colab.research.google.com/github/EvgSkv/logica/blob/...

Click on the SQL tab.


Seen longer, more obtuse queries written by hand in some applications I've worked on. Big, useful systems often give rise to them eventually.


I wrote SQL queries twice that long for credit originations reports in banks which would hit several terabytes of disk read per run


Apologies, I should've been more clear. The absolute length of SQL wasn't what I was alluding to: it was the ratio of Datalog to SQL I was curious about based on this statement in the website:

> Among database theoreticians Datalog and SQL are known to be equivalent. And indeed the conversion from Datalog to SQL and back is often straightforward.

I was skeptical, and indeed the sample I linked to shows 8 lines of Datalog turning into 265 lines of SQL. In defense of SQL, I note WITH RECURSIVE wasn't used.


    CREATE TABLE edges (
        source INT,
        target INT
    );

    INSERT INTO edges (source, target)
    SELECT generate_series as source, generate_series + 1 as target
    FROM generate_series(0, 999);

    WITH RECURSIVE path_lengths AS (
        -- Base case: Direct paths from edges
        SELECT source, target, 1 AS distance
        FROM edges
        UNION ALL
        -- Recursive step: extend each known path by one edge
        SELECT p.source, e.target, p.distance + 1 AS distance
        FROM path_lengths AS p
        JOIN edges AS e ON p.target = e.source
        WHERE p.distance < 256 -- Cap the recursion depth at 256 hops (2^8=256 for C8 equivalent)
    )
    -- Final query to select paths from source 0, similar to the logic program's final goal
    , final_paths AS (
        SELECT source, target, MIN(distance) AS distance
        FROM path_lengths
        WHERE source = 0
        GROUP BY source, target
    )
    SELECT * FROM final_paths
    ORDER BY source, target;
You can confirm the output matches by pasting it and clicking run here: https://extendsclass.com/postgresql-online.html


That's really not bad. I'm looking at CozoDB for a hobby project of mine, but if Logica could produce output like this consistently I'd probably prefer it because I'm much more familiar with maintaining RDBMSs.


Logica didn't output that. I posted SQL equivalent to the original Logica.


Oh nice, this is the one "SQL replacement" I'd actually look into


What puts you off prequel? https://prql-lang.org

really readable, nice syntax imo


PRQL has some very nice features, but the syntax also has some not-so-nice features that you don't see in the examples. There are special parsing rules for parentheses and newlines, and a special word wrap character is needed at times to disable the rule that treats newlines as the pipe operator.


"Unreadable syntax" is like number 1000 on the list of SQL problems.


Love this. I've long thought this should have been made, as Prolog and SQL are ternary logic, and SQL basically derives from Datalog, which itself derives from Prolog. A table record can be thought of as a Prolog fact, which makes the WHERE clauses the predicates in the conjunction on the right-hand side of a rule. Exhausting the goal is then actually returning the result set.
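To make the correspondence concrete, here is a small sketch with rows treated as facts and a WHERE clause treated as the body of a rule. The `employee` table and the `well_paid_engineer` rule are made up for illustration.

```python
# Rows of a hypothetical "employee" table, read as facts
# employee(Name, Dept, Salary).
employee = [
    ("alice", "eng", 120),
    ("bob", "eng", 95),
    ("carol", "ops", 100),
]


def well_paid_engineer():
    """Rule: well_paid_engineer(Name) :-
                 employee(Name, Dept, Salary), Dept = eng, Salary > 100.
    SQL:  SELECT name FROM employee WHERE dept = 'eng' AND salary > 100
    """
    for name, dept, salary in employee:     # try each fact in turn
        if dept == "eng" and salary > 100:  # the conjunction of conditions
            yield name                      # one solution per matching fact


# Exhausting the generator plays the role of exhausting the goal:
# what comes out is the result set.
result_set = list(well_paid_engineer())
```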

Hope to see this develop even further, as Prolog has its place with relational databases.


Prolog's evaluation semantics are order-dependent, though. I've always thought this was the reason why the two language paradigms didn't see more merging than they did. There are some datalog+RDBMS hybrids but, as much as I'm not a fan of .NET, I think LINQ has seen the most success in this space.


What are you concerned about when using .NET?


For me, I just haven't enjoyed the developer experience. It's been a few years so take my opinions with a dash of salt but I find the language itself verbose, I should be freed of having to think about its memory layout because of garbage collection but I still end up thinking about it because of type boxing and enumerator-wrappers, the garbage collection routine itself is not as mature as other environments that I'm familiar with (perhaps that has improved?)... and, I know some of my characterization is unfair. Among the .NET/CLR languages I really only have in-depth experience with C# and perhaps I should give one of the others a chance. I also carry baggage from a few years on a Unity project that twists what C# actually is (like how async/await is mangled and the way some library routines are only available from the main thread). I also probably allow my experience on a Java project (a few years of rather high stress) to bias me w.r.t. C# but I know that's also unfair, but these make me less likely to approach the language. It wouldn't be a deal-breaker for me to join a team but it would step down my enthusiasm some.

I will say, though, that the .NET authors did seem to learn some lessons from Java's worse API decisions and the .NET API is more uniform and reasonable overall.


To clarify, the Unity case is a night-and-day difference from what is expected to be the "normal" language experience, e.g. using ASP.NET Core/EF Core, AvaloniaUI, writing a CLI app or a systemd service, or even GtkSharp, or using Godot/Stride. This is due to its GC being much slower and more punishing, its often completely different way of doing basic operations, and its (ab)use of IEnumerators as coroutines, which may make it seem that ordinary use of them is just as difficult (it's not). Performance is also significantly different, sometimes by an order of magnitude, even including Burst-compiled code.

Boxing is rarely, if ever, something you have to think about in general-purpose code, and neither is garbage collection, as long as you don't insist on doing things the less efficient and often more painful way (int.Parse(text.Substring(0, 4)) instead of int.Parse(text.AsSpan(0..4)), something the analyzer prompts you to fix).

If you care about performance as indicated by message content, then any JVM language is a very big downgrade as many patterns are simply not expressible with it the way they are with C#/C++/Rust.

There are also significant differences in tooling, as the .NET tooling is much more aligned with what you expect from Rust/Go/even Node, where you interact via the CLI (dotnet build/run/publish).


their soul?


Why do you say that Prolog is ternary? Its semantics are roughly those of predicate logic, which is bivalent. In Prolog a "query" can either succeed or fail, i.e. be true or false, or raise an error. Are you counting errors as a third truth value?


well the ternary thing about SQL is that it has this NULL/unknown, and it can be demonstrated with the outer joins.

Then SQL being a slang of DataLog, which comes from Prolog implies Prolog is ternary, no? Indeed Prolog's handling of "unknown" is more procedural than a true ternary logic.

So it has a slightly different notion. Even though I'm more of a Prolog amateur (someone who loves it), a failed query in Prolog implies the unknown state. It's about provability within the system, not about a third truth value.
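The NULL/unknown behaviour of SQL mentioned here is easy to observe. The sketch below uses Python's built-in `sqlite3` module rather than PostgreSQL so that it runs anywhere (an assumption made for convenience); the `person` and `phone` tables are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def sql_value(expression):
    """Evaluate a scalar SQL expression and return it as a Python value."""
    return conn.execute(f"SELECT {expression}").fetchone()[0]


# Comparisons with NULL are neither true nor false: they come back as
# NULL, which sqlite3 maps to None.
null_equals_null = sql_value("NULL = NULL")   # None (unknown)
unknown_or_true = sql_value("NULL OR 1")      # 1 (true wins over unknown)
unknown_and_false = sql_value("NULL AND 0")   # 0 (false wins over unknown)
unknown_and_true = sql_value("NULL AND 1")    # None (still unknown)

# An outer join is where NULLs typically enter a result set.
conn.executescript("""
    CREATE TABLE person (id INTEGER, name TEXT);
    CREATE TABLE phone (person_id INTEGER, number TEXT);
    INSERT INTO person VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO phone VALUES (1, '555-0100');
""")
rows = conn.execute("""
    SELECT p.name, ph.number
    FROM person AS p
    LEFT JOIN phone AS ph ON ph.person_id = p.id
    ORDER BY p.id
""").fetchall()
# rows == [('alice', '555-0100'), ('bob', None)]
```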


Yes, I see what you mean and it's a good observation and, funnily enough, you're both right and wrong.

What you're saying is that Negation as Failure (NAF) introduces a concept of uncertainty, that we can consider a third truth value. Prolog doesn't have classical negation, where negating a logical atom (a "fact") makes it false, unconditionally and with certainty; it only has NAF, where an atom is true only if it cannot be disproved (since we prove by refutation).

Intuitively speaking, that is right. NAF is the simplest way to treat uncertainty, or in any case there is no simpler way: just set aside what you cannot know with certainty, and proceed based on what you can (dis)prove with certainty. There's even a name for that: non-monotonic reasoning, and it's a powerful technique. There's an argument that it's a good model of how humans reason about uncertainty. I'm agnostic on that front [1].

Formally speaking, thinking of Prolog as a ternary logic because of NAF is not right. NAF is just one way to assign truth values to atoms, but there are still only two truth values to assign, true or false. We might have uncertainty about the assignment, but that's still uncertainty about assignment of one of two values [2].

A bit more precisely, there are only two possible outcomes to a query: either it succeeds (one or more times, nondeterministically), or it fails. If the query succeeds we say that it was "true", for some values of the variables in the query, and if it fails we say it was "false", or there are no values of its variables that make it true.

In other words, Prolog will never return a NULL, like SQL. The uncertainty is baked-in to the success or failure of a query.
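A toy illustration of the point, written in Python rather than Prolog: queries over a fact base only ever succeed or fail, and negation is implemented as failure to prove. The facts and the `flies` rule are hypothetical examples, and this is a sketch of the idea, not of how a Prolog engine works internally.

```python
facts = {("bird", "tweety"), ("bird", "opus"), ("penguin", "opus")}


def holds(predicate, subject):
    """A query succeeds only if it can be proved from the known facts."""
    return (predicate, subject) in facts


def naf(predicate, subject):
    """Negation as failure: 'not P' succeeds when P cannot be proved."""
    return not holds(predicate, subject)


def flies(subject):
    # flies(X) :- bird(X), \+ penguin(X).
    return holds("bird", subject) and naf("penguin", subject)


tweety_flies = flies("tweety")  # True: penguin(tweety) cannot be proved
opus_flies = flies("opus")      # False: penguin(opus) is a known fact
```

Adding the fact `("penguin", "tweety")` later would turn `flies("tweety")` from success into failure, which is the non-monotonic part. At no point does a query return anything other than `True` or `False`.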

But that's a really cool subject and you are on the right track thinking of NAF in terms of uncertainty.

__________________

[1] Answer Set Programming (ASP) is a different logic programming language where the difference between classical negation and NAF is part of the semantics of the language, and for the purpose of non-monotonic reasoning. I don't really know that literature very well so I can't recommend specific texts.

[2] Also keep in mind that uncertainty exists only about the result of a query. Anything we declare as part of a Prolog program, "facts" and "rules", are axiomatically true. Since queries are proved by refutation that means that only falsehood is uncertain. We can know truth with certainty and un-truth with uncertainty.


An alternative query language for PostgreSQL would be a wonderful thing! But it isn’t clear that’s what this is.


If you're interested in an alternative query language, https://prql-lang.org/ is a good one.


Thanks for the mention. For Postgres, the easiest way to use PRQL is the plprql extension:

Link: https://github.com/kaspermarstal/plprql

Previous discussion: https://news.ycombinator.com/item?id=39428609


> it isn't clear that's what this is

Because that's not what this is - this is for when you write e.g. `language plpgsql` in defining some function; instead of that (with this installed) you could write `language plprolog` and use prolog.


yes, I have plans to try adding some query helpers eventually.

One could imagine representing queries as compound terms, like: q(user(id=Id, name=like("Bob%"), email=Email))

which would query the user table and bind Id and Email for all matches. I plan to experiment with something like that.
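One way such a term could map onto SQL, sketched in Python. Everything here is hypothetical: `Var`, `Like`, and `to_sql` are invented names, and this is not how the extension works or plans to work. Unbound variables become selected columns; everything else becomes a condition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """An unbound logic variable, e.g. Id or Email in the term above."""
    name: str


@dataclass(frozen=True)
class Like:
    """A pattern constraint, e.g. like("Bob%") in the term above."""
    pattern: str


def to_sql(table, **fields):
    """Translate q(table(column=Value, ...)) into (sql, params).

    Identifiers are interpolated unchecked, so this is only safe for
    trusted table and column names. "?" is the sqlite3 placeholder style;
    other drivers use a different one.
    """
    selected, conditions, params = [], [], []
    for column, value in fields.items():
        if isinstance(value, Var):
            selected.append(f"{column} AS {value.name}")
        elif isinstance(value, Like):
            conditions.append(f"{column} LIKE ?")
            params.append(value.pattern)
        else:
            conditions.append(f"{column} = ?")
            params.append(value)
    columns = ", ".join(selected) or "*"
    # The table name is quoted because "user" is a reserved word in PostgreSQL.
    sql = f'SELECT {columns} FROM "{table}"'
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


sql, params = to_sql("user", id=Var("Id"), name=Like("Bob%"), email=Var("Email"))
```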


Tangentially, Gerrit (a git Code Review web-app) deprecated the use of Prolog for the Submission Requirement feature.

https://gerrit-review.googlesource.com/Documentation/prolog-...

Prolog was introduced in 2.2.2 (2012), and deprecated in 3.6 (2022)


I would have liked actual examples of what it can do. With a simple table, and a few requests and their outputs.


that is coming, once I get it a bit more developed... this isn't anywhere near usable for real yet


When I saw the title I hoped this would be schema-aware. Looking at the current proof of concept, it lets you write Prolog definitions.

How cool would it be if all relations in a Postgres database were lifted into the scope of a Prolog process, to work directly on the relations?


If your entire database can be lifted into the application's heap, it's probably small enough that I wonder why you've got it stored in an RDBMS... and because Prolog is lexically sensitive (order of sentences and the order of clauses in the sentence affect the eval result) then you would need to effectively load all related DB entries -- or maybe some cursor tricks to load domain entries lazily and the many small queries that entails.

What I've usually seen instead is starting from Datalog (declarative, order-independent) instead of Prolog, and converting that into the relevant SQL queries then loading the results into variables within the datalog context. This splits the knowledge base into IDB and EDB parts.
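The EDB/IDB split can be sketched as follows: the extensional part is whatever rows come back from the database, and the intensional part is derived from rules by naive bottom-up evaluation until nothing new appears. `load_edb` is a stand-in for a real SQL query, and the edge data is made up.

```python
def load_edb():
    """EDB: stand-in for rows fetched with e.g. SELECT source, target FROM edges."""
    return {(0, 1), (1, 2), (2, 3)}


def derive_paths(edges):
    """IDB, computed by naive bottom-up evaluation of:

        path(X, Y) :- edge(X, Y).
        path(X, Z) :- path(X, Y), edge(Y, Z).
    """
    paths = set(edges)
    while True:
        new = {(x, z) for (x, y) in paths for (y2, z) in edges if y == y2} - paths
        if not new:  # fixpoint reached: no rule produces anything new
            return paths
        paths |= new


paths = derive_paths(load_edb())
```

Because the result is a fixpoint over sets, the order of the rules and of the rows does not affect it, and cyclic data still terminates.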


> If your entire database can be lifted into the application's heap, it's probably small enough that I wonder why you've got it stored in an RDBMS

I think he meant lifting the database schema, not the whole database. This would help with auto completion and other static checks before trying to run queries.


Much much more important would be the foreign keys. If I remember my prolog correctly, this:

  parent(alice,bob).
is duplicating information that could already be found in the schema's relationships.
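As a rough sketch of that idea, the facts could be generated from a foreign key instead of being written by hand. The example uses SQLite through Python's `sqlite3` module for portability (an assumption; PostgreSQL exposes the same information through its catalogs), and the `person` table and helper names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        parent_id INTEGER REFERENCES person(id)
    );
    INSERT INTO person (id, name, parent_id)
    VALUES (1, 'alice', NULL), (2, 'bob', 1);
""")


def foreign_keys(table):
    """Return (column, referenced_table, referenced_column) per foreign key."""
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
    rows = conn.execute(f"PRAGMA foreign_key_list({table})")
    return [(row[3], row[2], row[4]) for row in rows]


def facts_from_foreign_key(table, predicate, label_column):
    """Turn the first foreign key of `table` into Prolog-style facts."""
    column, ref_table, ref_column = foreign_keys(table)[0]
    rows = conn.execute(
        f"SELECT parent.{label_column}, child.{label_column} "
        f"FROM {table} AS child JOIN {ref_table} AS parent "
        f"ON child.{column} = parent.{ref_column} ORDER BY child.rowid"
    )
    return [f"{predicate}({parent},{child})." for parent, child in rows]


facts = facts_from_foreign_key("person", "parent", "name")
```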


Something like this! So that you can naturally query your data defining relationships and constraints

(This can already be done using joins, but that is not very ergonomic, especially when a lot of relations are at play.)


Recently, I learned that PostgreSQL can integrate with many languages through plugins, such as PL/Python, pgrx (Rust), pgzx (Zig), and so on. I wonder if anyone is planning to write one for Java or C#, lol.


There has been PL/Java for a long time: https://tada.github.io/pljava/ and PL/Perl and others. The projects you mention are largely building on well-established interfaces in Postgres.


Not that I'm aware of but any plugin system that expects dynamically linked binaries integrating through C API can be targeted by .NET NativeAOT (e.g. OBS plugins can be written in C# without manually dealing with hosting the runtime).


I did something like this a while ago:

https://github.com/ekoontz/psqlog


On a related note, have there ever been extensions for PostgreSQL/SQLite/any other FOSS DBMS that would allow using Tutorial D[1] (the Third-Manifesto one, not to be confused with Walter Bright's D language) to define and query data? In particular, I feel like PostgreSQL's CREATE TYPE[2] feature would allow for easier bridging between SQL and Tutorial D.

Unfortunately, searches for “PostgreSQL Tutorial D” don't yield useful results for obvious reasons.

[1]: https://www.dcs.warwick.ac.uk/~hugh/TTM/index.html

[2]: https://www.postgresql.org/docs/current/sql-createtype.html


Maybe ask on a suitable sounding PostgreSQL mailing list?

* https://www.postgresql.org/list/

The catch-all "general" one is probably good enough if nothing else seems like a closer match. :)



So, I expected this to actually query the DB using prolog (using the correspondence between relational algebra and logic programming).

I mean, can those prolog stored procedures use the db as a source of facts for prolog, or otherwise write queries?


It looks like it's 1 hour old, and seems to me that embedding the Prolog would be a necessary first step.


you can do that using ODBC


Postgres is not just a relational db. It's a way of life.

JSONB, HSTORE, LTREE, Full Text Search, Logical Replication, Range Types, BRIN Indexes, GIN Indexes, GiST Indexes, SP-GiST Indexes, Table Inheritance, Foreign Data Wrappers, XML Support, UUID-OSSP, pg_trgm, Cube, Earthdistance, pg_prolog, pg_partman, pgvector, TimescaleDB, PostGIS, Citus, pg_cron, BDR (Bi-Directional Replication), PL/Python, PL/Java, PL/V8, pg_stat_statements, pg_prewarm, pg_hint_plan, pg_repack, pgAudit, pgRouting, Multicorn (FDW), HypoPG, pg_squeeze, pglogical, Postgres-XL, Wal2json


All enterprise-level RDBMSs have similar capabilities, some of which are yet to come to Postgres.


Postgres is currently the most advanced RDBMS; anyone who's not locked in by Oracle or whatever and doesn't use Postgres for new projects is likely misinformed.

Postgres essentially made every other RDBMS obsolete except for some niche circumstantial cases (e.g. vendor lock-in)

> enterprise level

Postgres is enterprise level (whatever that means). Blazingly fast, Web 3, cloud-native, etc etc, pick your own buzzwords


I love and use Postgres daily for many years, but:

Performance monitoring is pretty much absent; all you have is pg_stat_statements and friends. So for any serious scale you need a third-party solution (or you're writing your own) straight away.

HA is complicated. Patroni is the best option now, but it's quite far from the experience SQL Server or Oracle provide.

The optimizer is still quite a bit simpler than in SQL Server/Oracle. One big thing that is missing for me is "adaptive" query processing (there's an AQP extension, but it's not part of the distribution). Even basic adaptive features help a lot when the optimizer is doing stupid things (I'm not gonna bring up query hints here :))


The same applies to those who think Postgres does everything, every single feature, that Oracle, SQL Server, DB2, and co. are able to deliver, in projects where their license costs are kind of irrelevant in the big context of the organization.

Usually, it is a great way for many organizations to have a database as free beer.


Sure, a few organizations may actually need some obscure feature that Oracle provides, but again, it's niche. For most companies, Postgres provides way more features than they will ever use.

And for the other 1%, it sometimes happens that their need for a specific feature in Oracle DB turns out to be entirely unnecessary.

Not to mention that the vast majority of products turn out to be fancy CRUD apps. Doesn't matter though, Oracle will convince you that you NEED their DB regardless.


The main enterprise-necessary feature missing in Postgres compared to Oracle is free trips to the Bahamas.


I am sure companies selling Postgres support to the Fortune 100 can think of something to please the corporate sales team.


I’d argue that the real value of Postgres isn’t that it has capabilities on par with Oracle, but rather that it has a thriving plugin ecosystem. If someone needs that obscure Oracle feature, they may have the option of writing it. If there’s something a lot of people are interested in (such as vector features), someone will implement it.


There’s one thing Postgres doesn’t provide. It doesn’t provide a supplier who a company can put their liability on. A supplier who can fix the problem in Postgres code and maintain it with authority.

But don’t get me wrong. Postgres is an awesome database system.


What does 'put their liability on' mean to you?

Because I suspect it does not mean what I think you think it means.

(Typically the EULA of all those non-Postgres systems is 'this is sold as is, no warranty as to suitability to your purpose, yada yada'.)

Often times people believe that if they're paying many monies for a support contract, that means they can relocate their liability to that company. Almost every time, that company has better lawyers / contract writers.


the product is sold as such but there are support contracts to cover everything else


i beg to differ: Professional Services

https://www.postgresql.org/support/professional_support/


SQLite has this as well, although its use cases are different: https://sqlite.org/com/member.html

They have multiple tiers of support as well, pretty interesting read


Most of these are consultants specialising in hosting and “consulting”. There’s even AWS in there. How much does it cost to be on that list?


Oh, good luck getting this from any of Oracle, Microsoft or IBM... But actually, you usually can get those for Postgres.

Postgres is the one general-purpose DBMS out there for which you can hire a company to actually solve your problems.


To a very close approximation, Postgres does do everything.

And in the very uncommon cases where you really need something that Postgres doesn't do, Oracle, MS SQL, DB2, and co. do not provide enough of a difference, and what you actually need is a specialized DBMS.


You haven't mentioned support. As much as I'm a fan of Postgres, if my org doesn't have a Postgres support capability, but it does have a MSSQL team, then I'm going to stick with MSSQL.

I've got software to write, I don't want to find out that script I copied from stackoverflow to backup the database doesn't work in all the scenarios I thought it would because something changed a minor version ago.


> Postgres is enterprise level (whatever that means). Blazingly fast, Web 3, cloud-native, etc etc, pick your own buzzwords

it has issues in several choke points: HA setup is complicated, it doesn't utilize multi-core machines well on heavy queries.


> enterprise level (whatever that means)

It means "non-technical executives/managers recognize the name and will approve its use". Like Oracle.


Postgres is an open-source platform with a thriving ecosystem. People have written a vector plugin to follow the AI boom, and will likely build whatever else people take an interest in.



Is there anything for triple-store, so that it can be used for RDF storage?


[flagged]


Haha what kind of weird spam is this?


Looks like it is Bitcoin-related.



