Hacker News new | past | comments | ask | show | jobs | submit login
RankPL – A qualitative probabilistic programming language (github.com/tjitze)
69 points by tr352 on March 30, 2017 | hide | past | favorite | 14 comments



This is super exciting. Because of this, I started looking into Spohn's paper as well as yours. I need to finish reading your paper, and then I'd love to ask how you went from the theory to the algorithm. I'd love to understand what would be involved with porting it to a functional language (Rust most likely, which could help you in terms of making it embeddable in most any other language, for those that do not use Java).

Thank you for your great work!


Thanks! The principle could be applied to many languages, just like probabilistic programming also has been applied to many languages (look for example at Venture, Church, IBAL). This is something I'd like to do in the future.

Perhaps the easiest approach is to try to integrate the ranking semantics in an existing probabilistic language such as Church.

Please contact me if you have any questions, suggestions, ideas etc.


Genuinely curious: what's a realistic use case for something like this?


RankPL can be used instead of probabilistic programming for problems where probabilities are unknown, but where it's still possible to distinguish normal from surprising behavior of a model.

Admittedly, this is still quite abstract. I'm working on more concrete use cases.


Here's one that is maybe more concrete. And I hope I'm understanding everything correctly.

Say you're a startup running your infrastructure in AWS. You spread it out over three different regions and within each region you use 2 availability zones. Your network load is automatically balanced over these three geographical regions.

Now an earthquake happens in one region and although it's unlikely both of availability zones within that region go off line (fiber to the region is cut, power-loss, whatever). This means the entire region goes offline.

If modeled properly you should now be able to figure out what the consequences of this will be for the entire infrastructure. Will you be able to stay online if surprising behavior (an entire region going offline) happens?

Of course the big issue here is always mapping real world scenario's onto models that fit well enough.

EDIT: It's a matter of taking the "nasty integral" part out of it as per nerdponx in another comment on this thread. This can really help with doing Fault Tree Analysis for example as the statistics solving part there has always been a big problem for systems big enough (MCMC solvers help only to a degree).


Stochastic heuristics/sanity-checks


NLP,AI etc.


I think Family history fits in here too (seriously).


Interesting. Could you elaborate on what you have in mind?


I'm think of automatic grouping of data points into people profile. For instance, when searching for John Doe across all census records, provide profiles of people, backed up by multiple sources.

This would be achieved by seeing if a group of two references are likely to be referring to the second person , and of they are, merge the groups.


Is this equivalent to encoding the ranks as decimals in a floating-point probabilistic PL? Like if a program uses ranks 0-9 in RankPL, could those could be encoded as probabilities 1/55, 2/55, 3/55, ... ,10/55 (assuming rank 9 is 10 times less likely than rank 0) in a regular probabilistic PL?


It looks like the values are un-normalized density values, i.e the numerator in Bayes' theorem.

So, what you said. But determining the correct denominator for such a collection of values, in general, isn't easy when the collection is infinite (like a function). It amounts to a very very gnarly integration problem.

This is why MCMC solvers are used in Bayesian statistics, and why abstracting away the "nasty integral" part from the "probability density" part is useful to statisticians and other researchers.


There is indeed a probabilistic interpretation of ranks. It's discussed here:

http://ac.els-cdn.com/0004370295000909/1-s2.0-00043702950009...


This seems quite interesting. As soon as I reached the boolean circuit example query, I immediately thought of Prolog. Is there a known way to do something similar in logic languages? Using discrete ranks instead of continuous probabilities seems like a nice fit for such a setting.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: