Why Hypergraphs? (2013) (opencog.org)
94 points by avowud on July 29, 2022 | hide | past | favorite | 24 comments



Hypergraphs feature prominently in Wolfram Models (https://www.wolframphysics.org/technical-introduction/), they give rise to some really interesting phenomena when you look at hypergraph rewriting.


A hypergraph is like a graph, but one where vertexes can connect more than two nodes. Right?

Isn't that just like the difference between Object-Oriented and Relational models? A relational table has multiple columns. It does not represent a relation between two things; it represents a relation between N things, with its N columns.


Yep, close -- it's where an edge can connect more than 2 nodes

Two things to make this way more intuitive & practical:

* Modeling: We use them for helping folks correlate across events, or across wide database rows. Ex: finding bots & fraudsters in your signup events or accounts table. A signup event might have the same weird User Agent, same weird IP, same weird domain name, etc.: it's good to see events connected across multiple such things! So any time you have an event or a wide row, you can think of it as a hyperedge connecting (correlating) the multiple entities involved.

* Visualizing: It's typically weird visually to draw an edge connecting more than 2 nodes, so most people don't. Instead, sometimes, our users want to see a correlation event as an explicit node in a graph. So there's a natural bipartite graph from events to entities, like "(signup)->(user agent)" + "(signup)->(ip)" + ... . But other times it's annoying, so they'd rather just see "(ip:node)<-[signup:edge]->(useragent:node)". Interestingly, for really wide data, having an explicit "hypernode" (event node) does increase the number of nodes... but significantly decreases the number of edges, because it avoids drawing really big connected components. Formally, you add |events| nodes, versus multiplying edges by |number of columns|^2.

If you want to play with it, try "graphistry.hypergraph(pd.read_csv('http://blah.csv'))['graph'].plot()" in http://github.com/graphistry/pygraphistry (or just to get the ._nodes and ._edges dataframes out)
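A rough pandas-only sketch of that event-to-entity transform (toy data; column names are made up, and this isn't graphistry's actual internals):

```python
import pandas as pd

# Hypothetical signup-events table: each row is one hyperedge
events = pd.DataFrame({
    "event_id": ["e1", "e2", "e3"],
    "user_agent": ["ua_weird", "ua_weird", "ua_normal"],
    "ip": ["1.2.3.4", "1.2.3.4", "5.6.7.8"],
    "domain": ["mailinator.com", "mailinator.com", "gmail.com"],
})

entity_cols = ["user_agent", "ip", "domain"]

# Bipartite encoding: one edge from each event to each entity it mentions
edges = events.melt(
    id_vars="event_id", value_vars=entity_cols,
    var_name="entity_type", value_name="entity",
)

# One node per event, plus one node per distinct entity value
nodes = pd.concat([
    pd.DataFrame({"node": events["event_id"], "kind": "event"}),
    pd.DataFrame({"node": edges["entity"].unique(), "kind": "entity"}),
], ignore_index=True)

# e1 and e2 now share the 'ua_weird', '1.2.3.4', and 'mailinator.com'
# entity nodes, so the bot-ring correlation pops out as one component.
```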

Nowadays, we also do a lot with embedding arbitrary data and exploring similarity graphs that way (using k-NN to infer similarity edges), so not just across entity links but also, say, timestamps, $'s, and byte counts. Any dataset embedding can thus be thought of as a projection of the hypergraph, and it supports more interesting columns than just entity columns (categoricals). So in the above example, `graphistry.nodes(pd.read_csv('http://blah.csv')).umap().plot()` and we'll infer the `._edges`. All the nodes in an embedding view are essentially hypernodes; instead of connecting them to entity/attribute nodes, project those out and just connect the hypernodes to one another. This gives a very different way to think about stuff like explainability of AI :)
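A minimal numpy sketch of the k-NN similarity-edge idea (toy feature matrix; in practice an embedding like UMAP would come first):

```python
import numpy as np

# Hypothetical numeric features per event: timestamp, $ amount, byte count
X = np.array([
    [0.0, 10.0, 100.0],
    [0.1, 11.0, 105.0],
    [5.0, 500.0, 9000.0],
    [5.1, 510.0, 9100.0],
])

k = 1
# Pairwise Euclidean distances between all rows
d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
np.fill_diagonal(d, np.inf)          # exclude self-matches
nbrs = np.argsort(d, axis=1)[:, :k]  # k nearest neighbours per row

# Similarity edges: each event (hypernode) links to its nearest neighbours
edges = [(i, int(j)) for i in range(len(X)) for j in nbrs[i]]
```

Here the first two rows and the last two rows are near-duplicates, so the inferred edges pair them up.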


Yes, a hyperedge is like a set of nodes. Not limited to just two nodes.

I hadn't thought of relational tables in that way. I guess you could look at each row of a table as a hyperedge grouping together all the columns for a given record. Not sure how useful that is, though. The cross-table join relations are still relatively point-to-point, I think.
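The row-as-hyperedge reading can be sketched like this (toy schema, all names hypothetical):

```python
# Toy schema: each row of a table read as a hyperedge over its column values
users = [
    {"user_id": 1, "name": "alice", "country": "FI"},
    {"user_id": 2, "name": "bob", "country": "SE"},
]
orders = [
    {"order_id": 10, "user_id": 1, "total": 9.99},
]

# Each row groups all of its (column, value) pairs into one hyperedge...
user_hyperedges = [frozenset(r.items()) for r in users]

# ...while a foreign-key join stays point-to-point: it links exactly
# two rows at a time, via the shared user_id
join_edges = [
    (u["user_id"], o["order_id"])
    for u in users for o in orders
    if u["user_id"] == o["user_id"]
]
```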


Funnily enough, this hypergraph view is used to describe asymptotic query complexity for cyclical joins, the simplest of which is "count triangles in an adjacency list" (a naive RDBMS yields O(n²) on pathological data, despite the worst-case optimal tactic being something like O(n + #Triangles), with #Triangles inherently limited to O(n^1.5)).

See https://arxiv.org/abs/1404.0703 for some reading material and one particular take on beyond-worst-case runtime performance (i.e., sharing worst-case asymptotics with the best worst-case algorithm, but with better performance on non-pathological data; a more precise notion is complicated, and there are at least 3 different useful definitions).

AFAIK the worst cyclical query on a 2-column table is "k-clique counting", from the POV of the performance gap between WCOJ and what Postgres can do (asymptotics).
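A small Python sketch of the triangle query's flavor (toy data): intersecting the neighbour sets of each edge's endpoints, which is roughly what a worst-case-optimal join does natively instead of materialising a large binary-join intermediate.

```python
# Toy undirected edge list; a-b-c and a-b-d are the triangles
edges = {("a", "b"), ("a", "c"), ("b", "c"), ("a", "d"), ("b", "d")}

# Build symmetric adjacency sets
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# For each edge (u, v), every common neighbour w closes a triangle;
# the set intersection bounds the work by the actual output size
triangles = set()
for u, v in edges:
    for w in adj[u] & adj[v]:
        triangles.add(frozenset((u, v, w)))
```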


It might be just a good way of thinking about it. In practice it might provide some ideas about how to translate between an Object-view and Relational-View.

What I'd like to find is a relational database where field-values can be not just elementary values like numbers and strings but also "object-instances".

I wonder, is there such a database? It would seem to offer the best of both worlds: objects, and relations between objects.


GemStone is a common one, but object databases are not very common due to the maturity, efficiency, and predictability of relational databases. The deep pipelining and caching models of CPUs are a better match for RDBMSs than object ones.

https://en.wikipedia.org/wiki/Gemstone_(database)


I understand that GemStone is an object-oriented database that stores objects and is great at that. But can you query those objects with SQL?

What I'd like to understand is: is there some basic reason why RDBMSs and ODBMSs must be different databases? Or could their conceptual models of data perhaps be generalized into a single model. Using hyper-graphs perhaps. :-)


This might not be quite it, but perhaps you’re looking for graph databases?


Can you use SQL with graph databases?

Related to the current topic, do graph databases support hyper-graphs?


I don't believe they support SQL. They're somewhat more like pattern-matching on graphs. Most of their query language pages have compare/contrast examples with an equivalent SQL query. e.g. Neo4j's Cypher query language [1], or Apache Tinkerpop Gremlin (seriously) [2] used by Amazon Neptune.

I don't think the most popular ones support hypergraphs, although that sounds cool. I did see HypergraphDB [3] though! Need to read more about that.

[1] https://neo4j.com/developer/cypher/guide-sql-to-cypher/#cyph... [2] https://tinkerpop.apache.org/gremlin.html#:~:text=Host%20Lan... [3] http://hypergraphdb.org/


This can be modelled as a regular graph simply by representing the hyperedge as another node with a set of edges. If those are labelled appropriately to differentiate them from regular nodes, it's straightforward enough to represent it as a regular graph.
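A minimal sketch of that reification trick, assuming a plain dict/list representation:

```python
# A 3-node hyperedge {a, b, c} encoded in an ordinary graph: add one
# extra node standing for the hyperedge itself, labelled so it can't
# be confused with a regular node
nodes = {
    "a": "regular",
    "b": "regular",
    "c": "regular",
    "h1": "hyperedge",  # the hyperedge reified as a node
}

# Ordinary binary edges from the reified hyperedge to its members
edges = [("h1", "a"), ("h1", "b"), ("h1", "c")]
```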


Many things can be represented as more complex structures of simpler things.

It's interesting whether a new formalism lets us express things better, unearth interesting structures not apparent in a busier formalism, etc.

Compare the Maxwell equations in the wordy original form, and in a highly compact Clifford algebra form.


Right, but doesn't that mean that in essence you will have two types of nodes in your conceptual model? It is no longer that you have just 'nodes' and 'edges'; you will need two different types of nodes and one type of edge. The new type of node could be called a 'hyper-node', perhaps. Else you lose information.


Graph rewriting systems are really neat. I've been working in Mathematica lately, and the language is a term-rewriting system. It's been really powerful and intuitive, and for the life of me I can't understand why it's not more common.

Specifically, I read about the SubsetCases function: SubsetCases[expr, pattern] returns subsets of expr that match the pattern. You can draw a triangle graph, put slots as the node names, and do SubsetCases[graph, triangle] to pull out all the triangle subgraphs! Mind-blowing. You can also transform graphs really easily like this. Phenomenally powerful, though the computational complexity must get gnarly.
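A rough Python analogue of the SubsetCases-on-a-graph idea, matching a triangle "pattern" against all 3-vertex subsets (brute force, just to show the shape):

```python
from itertools import combinations

# Toy graph as a set of undirected edges
edges = {frozenset(e) for e in [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]}
vertices = {v for e in edges for v in e}

# Triangle "pattern": any 3 vertices whose 3 connecting edges all exist
triangles = [
    {x, y, z}
    for x, y, z in combinations(sorted(vertices), 3)
    if {frozenset((x, y)), frozenset((y, z)), frozenset((x, z))} <= edges
]
```

This also makes the gnarly complexity visible: the pattern is checked against all C(n, 3) vertex subsets.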


With the 13.1 release last month came a substantial HN discussion centred around how cool, yet proprietary, Mathematica is.

https://news.ycombinator.com/item?id=31928608

That kind of also describes every other Wolfram-related submission :/


Comparing Wolfram's other works to Mathematica would be flattering. His massive book, where he proposes "A New Kind of Science", is a long love letter to himself. I mean, I haven't tried to work on something for 20 years, but plenty have, and none of them are as arrogant and utterly self-absorbed as Wolfram is. At least Mathematica doesn't print "btw, I'm a genius, and fuck traditional programming langs" every time you evaluate an expression.

His recent "Physics Fundamentals" project is even worse: I see zero collaboration with or critique from mainstream physics, zero testable predictions or actual real-world significance in all the CS-physics mixed salad. The only thing that sorta kinda could be described as physics is that he can re-derive some traditional physics out of the new theory with some assumptions. Despite my not being a physicist, my understanding is that this is quite a low bar for a new physics theory, almost the bare minimum: enough not to be considered a total crank, but not enough for anything else.


For an original proposal that does logic inference on hypergraphs, I am using NeuraLogic through a Python frontend (https://github.com/LukasZahradnik/PyNeuraLogic)

I wonder if this is something the author would have enjoyed…


Because logical relations are not only binary. If you have a logical formula f(x,y,z) (which applies to triplets of people, or machines or a network, or natural numbers or whatever) - then the combinatorial object corresponding to this relation is a hypergraph with (directed) 3-node edges.
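For instance, a ternary relation over the naturals, stored as a set of ordered triples, i.e. directed 3-node hyperedges (toy example):

```python
# The relation f(x, y, z) := "x + y == z" over small naturals, as a set
# of ordered triples; each satisfying triple is one directed hyperedge
f = {
    (x, y, z)
    for x in range(4) for y in range(4) for z in range(8)
    if x + y == z
}
```

Order matters in each triple (f is directed), which a plain set-of-nodes hyperedge would not capture.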


I first heard of hypergraphs in the context of knowledge management using hode:

https://github.com/JeffreyBenjaminBrown/hode

I haven't looked more into them, but thought others might find hode interesting.


So what happened to opencog?


There seems to be a recent development:

Quote: "A novel version of opencog" https://wiki.opencog.org/w/Hyperon

And a video of OpenCog Hyperon in use with explanations: https://www.youtube.com/watch?v=Unryt6XH2FE


I assume the development fizzled out after several rounds of investment and no interesting results demonstrated. Not to sound dismissive, but I just see this pervasive pattern of failure in AI projects which aren't built from the ground up for learning.

If your basic system cannot learn to solve at least some problems initially, it drastically lowers the probability of it acquiring learning capability later in the development lifecycle.


Disclaimer: I don't have specific AI/ML expertise, but I've known the founder (Ben Goertzel) for close to 20 years now.

OpenCog never got huge amounts of money. Ben believes strongly in developing AGI (he actually coined the term). There are a few core researchers and developers that have worked with him for decades, also mission-focused. So I don't see development "fizzling out". There are also newer efforts such as SingularityNET.

Development seems alive and well, based on activity at https://groups.google.com/g/opencog, though it is mostly done by a single person these days (the author of the OP)

A general criticism about results may have some validity - When I added cross-platform capability to a part of the project back in 2016, I found an obscene amount of technical debt.

That said, I don't think your criticism is necessarily a good one to make. I'm not aware of any projects that have much in common with OpenCog or Ben's other work. The approach is very different, but again, I'm not an AI/ML expert and only understand this based on comments from others in the AGI community. My gut feeling here is that there's more to the story, or something otherwise doesn't add up, as you are ascribing to the project a failure mode apparently shared by the majority of AI projects.

Can you name some examples of projects which are doing the right thing in regard to learning? I'd be genuinely interested in checking them out.

I'd also encourage posting your criticisms to the OpenCog mailing list and I'd expect someone there to address them in good faith. In any case, if you're enough of an expert to be able to make those sorts of criticisms correctly, there would certainly be value to be found in your complete argument.



