efferifick's comments | Hacker News

Thank you for sharing!


First, I am not sure why you mention LLMs; the post mentions LLVM. Second, you can have different optimization options with different trade-offs between compile time and run time. Third, even if the only options distributed with Clang by default are too compile-time intensive, the good news is that this is open source: you could argue against the defaults, or fork and maintain your own compiler with a different compile-time/run-time trade-off along with other people who also want that behaviour. I don't see any benefit in arguing against research. Having new techniques to improve compilers is not a zero-sum game.


The LLM confusion (aside from the similar name) might also have come from the "Transformers" discussed in one of the papers linked.


This post and the Hydra paper reminds me a lot of Ruler and Enumo.

  * Nandi, Chandrakana, et al. "Rewrite rule inference using equality saturation." Proceedings of the ACM on Programming Languages 5.OOPSLA (2021): 1-28.
  * Pal, Anjali, et al. "Equality Saturation Theory Exploration à la Carte." Proceedings of the ACM on Programming Languages 7.OOPSLA2 (2023): 1034-1062.
I will need to read more about both of these techniques along with Synthesizing Abstract Transformers.

Thanks for sharing! Really exciting stuff!


These are indeed neat papers!


The order goes from simpler to more complex data flow analysis frameworks. These frameworks allow you to encode dataflow problems and solve them.

  * Kam, John B., and Jeffrey D. Ullman. "Monotone data flow analysis frameworks." Acta informatica 7.3 (1977): 305-317.
  * Reps, Thomas, Susan Horwitz, and Mooly Sagiv. "Precise interprocedural dataflow analysis via graph reachability." Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages. 1995.
  * Sagiv, Mooly, Thomas Reps, and Susan Horwitz. "Precise interprocedural dataflow analysis with applications to constant propagation." TAPSOFT'95: Theory and Practice of Software Development: 6th International Joint Conference CAAP/FASE Aarhus, Denmark, May 22–26, 1995 Proceedings 20. Springer Berlin Heidelberg, 1995.
  * Reps, Thomas, et al. "Weighted pushdown systems and their application to interprocedural dataflow analysis." Science of Computer Programming 58.1-2 (2005): 206-263.
  * Späth, Johannes, Karim Ali, and Eric Bodden. "Context-, flow-, and field-sensitive data-flow analysis using synchronized pushdown systems." Proceedings of the ACM on Programming Languages 3.POPL (2019): 1-29.
Other areas that may be interesting to look at:

  * Points-to Analysis
  * Abstract Interpretation
  * On demand dataflow analyses
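
To make the "encode a dataflow problem and solve it" idea concrete, here is a minimal sketch in the spirit of the monotone frameworks in the Kam and Ullman paper: a lattice of sets, gen/kill transfer functions, and a worklist iterated to a fixed point. The CFG and the gen/kill sets below are made up purely for illustration.

  # Minimal reaching-definitions analysis (a "may" dataflow problem).
  # The CFG and gen/kill sets are invented for the sake of the example.
  cfg = {                      # block -> successor blocks
      "entry": ["b1"],
      "b1": ["b2", "b3"],
      "b2": ["b4"],
      "b3": ["b4"],
      "b4": [],
  }
  preds = {b: [p for p in cfg if b in cfg[p]] for b in cfg}

  gen  = {"entry": set(), "b1": {"d1"}, "b2": {"d2"}, "b3": {"d3"}, "b4": set()}
  kill = {"entry": set(), "b1": set(),  "b2": {"d1"}, "b3": set(),  "b4": set()}

  def transfer(block, in_facts):
      # Classic gen/kill transfer function: OUT = GEN ∪ (IN \ KILL)
      return gen[block] | (in_facts - kill[block])

  # Worklist iteration to a fixed point, starting from the empty set of facts.
  in_f  = {b: set() for b in cfg}
  out_f = {b: set() for b in cfg}
  worklist = list(cfg)
  while worklist:
      b = worklist.pop()
      in_f[b] = set().union(*(out_f[p] for p in preds[b]))
      new_out = transfer(b, in_f[b])
      if new_out != out_f[b]:          # only re-queue successors when facts change
          out_f[b] = new_out
          worklist.extend(cfg[b])

  print(out_f)   # which definitions may reach the end of each block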


I get sad every time I see Susan's name. She died too soon.


Are current compiler optimisations limited by algorithm or by compute?

Would a 1000x compute cluster provide a meaningful performance boost (of the generated binaries)?


Damn, I am being nerd-sniped here :)

One thing is that you can think of static analysis as building facts about the program. You can, for example, start by assuming nothing and then add facts, iteratively propagating them from one line of code to the next. But you can also start by assuming the universe of all facts and remove facts from it.
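
A tiny sketch of those two directions, with invented facts: a "may" analysis starts from nothing and unions facts in at joins (so facts only grow), while a "must" analysis starts from the universe of facts and intersects at joins (so facts only shrink).

  universe = {"x>0", "y==1", "z!=0"}    # invented facts, purely for illustration

  # "May" style: start from nothing, add facts, union at control-flow joins.
  may_init = set()
  may_join = lambda a, b: a | b         # facts can only grow toward the universe

  # "Must" style: start from the universe, remove facts, intersect at joins.
  must_init = set(universe)
  must_join = lambda a, b: a & b        # facts can only shrink toward the empty set

  # Merging two incoming branches:
  branch_a = {"x>0", "y==1"}
  branch_b = {"x>0", "z!=0"}
  print(may_join(branch_a, branch_b))   # {'x>0', 'y==1', 'z!=0'}
  print(must_join(branch_a, branch_b))  # {'x>0'}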

Some classes of program analysis are safe to stop early. For example, if I have a static analysis that tries to find the targets of virtual calls (also known as devirtualization), I can stop early after a timeout. Not finding the target just means a missed optimization.

There are other classes of program analysis whose results are not safe to use until the algorithm finishes. For example, if you have to prove that two variables do not alias each other, you cannot stop until you have computed the complete points-to sets and verified that the points-to sets of those two variables do not overlap.
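
As a toy illustration of why there is no safe early exit here (the points-to sets are made up): the "do not alias" verdict is only sound once the sets are complete, because a fact discovered later could introduce an overlap.

  # Hypothetical, fully computed points-to sets: variable -> abstract memory objects.
  points_to = {
      "p": {"obj1", "obj3"},
      "q": {"obj2"},
      "r": {"obj3"},
  }

  def may_alias(a, b):
      # Two pointers may alias iff their points-to sets overlap.
      return bool(points_to[a] & points_to[b])

  print(may_alias("p", "q"))  # False: safe to treat as non-aliasing, but only
  print(may_alias("p", "r"))  # True   because the sets above are assumed complete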

So, given the above restriction, the first class (early termination) is perhaps more desirable, and throwing more compute at it would yield a better approximation. For the second one, it would not.

Another thing to keep in mind is that most of these dataflow frameworks are not easily parallelized. The only paper I've read (though I haven't kept up with this avenue of research) that implemented a control-flow analysis on the GPU is the following:

* Prabhu, Tarun, et al. "EigenCFA: Accelerating flow analysis with GPUs." ACM SIGPLAN Notices 46.1 (2011): 511-522.

I'm sure people are working on it. (I should mention that there are some program analyses written in Datalog, and Datalog can be parallelized, but I believe that is CPU-based parallelization rather than GPU-based.)

The third thing is that when you ask whether we are limited by algorithms or by compute, it is important to note that it is impossible to find all possible facts about a program *precisely* without running it. There is a deep relation between static program analysis and the halting problem: we want to guarantee that our static analysis terminates, and some facts are simply unobtainable without running the program.

However, there are not just static program analyses but also dynamic program analyses, which analyze a program as it runs. An example of a dynamic program analysis is value profiling. Imagine that you have a conditional that is false 99% of the time. With a virtual machine, you can add some instrumentation to measure the probability distribution of this conditional, generate code specialized for the common (false) case, optimize that code, and only when the condition turns out to be true fall back to a less optimized version of the code, paying an additional penalty. Some virtual machines already do this for types and values (type profiling and value profiling).
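
A rough sketch of that value-profiling idea (all names, thresholds, and numbers are invented): instrument the branch, and once the profile shows one outcome dominates, switch to a version specialized for that outcome behind a cheap guard that falls back to the generic code otherwise.

  import random

  profile = {"taken": 0, "not_taken": 0}

  def cheap_common_case(x):
      return x + 1

  def expensive_rare_case(x):
      return sum(range(x))

  def generic(x, flag):
      # Unspecialized path: always instrument and check the condition.
      profile["taken" if flag else "not_taken"] += 1
      return expensive_rare_case(x) if flag else cheap_common_case(x)

  def specialized(x, flag):
      # Generated under the assumption that `flag` is almost always False.
      if flag:                          # guard: rare case, fall back ("deoptimize")
          return expensive_rare_case(x)
      return cheap_common_case(x)       # straight-line, optimized common path

  run = generic
  for _ in range(1000):
      flag = random.random() < 0.01     # the condition is true ~1% of the time
      run(7, flag)
      total = sum(profile.values())
      if run is generic and total > 100 and profile["not_taken"] / total > 0.95:
          run = specialized             # "recompile" with the specialization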

One last thing: when you say a meaningful performance boost, it depends on your code. If your code can be folded away completely at compile time, then yes, we could just generate the solution at compile time and be done. But if it cannot, or if parts of it cannot be folded away and the facts cannot be used to optimize the code, then no matter how much you search, you cannot optimize it further statically.

Compilers are awesome :)

As an addendum, it might be desirable in the future to have a repository of analyzed code. Compilers right now re-analyze code on every single compile and do not share their results across the web. It is a fantasy of mine to have a repository that maps code to equivalent representations, so that every time someone does a local compile it explores a new part of the optimization space and adds the results to the repository. Essentially, each compile would explore new potential optimizations, and all of them would get stored online.


> repository of analyzed code

yes, this is obviously what you want. there shouldn't be a notion of 'running' the compiler; it should simply be a persistent process. importantly, there are two issues, which are orthogonal but both critically important: iterativity—that is, produce an acceptable result immediately, but then continue to refine it—and incrementality—that is, when making a small change to the code, take advantage of information learned about the previous version.

there is this recent paper on incrementality https://arxiv.org/pdf/2104.01270.pdf, though it has some caveats (e.g. it does not handle iterativity at all, so there is only so far it can go; can do much better)—but it is interesting non-technically because the authors work for amazon, so perhaps some of this is headed for industry
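
for a picture of the incrementality half (a toy sketch, not the paper's approach): key each function's analysis result by a hash of its body, so editing one function invalidates only its own entry and everything else is reused.

  import hashlib

  analysis_cache = {}   # hash of a function body -> analysis summary

  def analyze(body):
      # stand-in for an expensive per-function analysis
      return "summary(%d chars)" % len(body)

  def analyze_program(functions):
      results = {}
      for name, body in functions.items():
          key = hashlib.sha256(body.encode()).hexdigest()
          if key not in analysis_cache:        # only re-analyze what actually changed
              analysis_cache[key] = analyze(body)
          results[name] = analysis_cache[key]
      return results

  v1 = {"foo": "return x + 1", "bar": "return y * 2"}
  v2 = {"foo": "return x + 2", "bar": "return y * 2"}   # only foo was edited
  analyze_program(v1)
  analyze_program(v2)   # bar's summary comes straight from the cache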


> it might be desirable in the future to have a repository of analyzed code.

We did a code verifier back end for a small app store. We were mostly MScs plus some PhD consulting.

The world is indeed waiting for a CloudOptimizingCompiler (CoCo?!) - there are so many possibilities for shared compute, library caching, and machine learning.

I will PM you, just to keep the dream alive.


I've been a big fan of Emery's research. Coz is one tool that I have always wanted to use, but I haven't had the chance yet.

Check out his other research. Some of it is highly accessible via YouTube videos. I recommend watching/reading:

  * Stabilizer
  * Mesh
  * Scalene


I recommend `egglog`, which is Datalog plus equality saturation. It has Python bindings and has allowed me to optimize programs in a custom programming language.

https://egg-smol-python.readthedocs.io/en/latest/


I will be in the minority here, but:

Native integration with Datalog.

Many times, I find myself working on a program and I realize that what I need is a database. But having a database, even sqlite3 or Berkeley DB, would be overkill. If I could just express my data and the relationships between them, then I would be able to query what I need in an efficient way.
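
To illustrate the kind of thing I mean (a toy, not a proposal for real syntax): facts as tuples, a recursive rule, and a naive fixpoint for the classic transitive-closure query that would otherwise push me toward a real database.

  # Toy in-process "Datalog": edge facts plus a reachability rule,
  # evaluated naively to a fixed point. The data is made up.
  edge = {("a", "b"), ("b", "c"), ("c", "d")}

  reach = set(edge)                        # reach(x, y) :- edge(x, y).
  while True:
      # reach(x, z) :- reach(x, y), edge(y, z).
      new = {(x, z) for (x, y1) in reach for (y2, z) in edge if y1 == y2}
      if new <= reach:
          break
      reach |= new

  print(("a", "d") in reach)   # True: "d" is reachable from "a"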


I think a lot of people in the comments are hung up on defining a compiler as "taking a source language and producing a binary". I personally know Eddie and I agree with his points. (Even though his title is a bit provocative and contradicts one of the points in the article: "A language is not inherently compiled or interpreted; whether a language is compiled or interpreted (or both!) is an implementation detail.")

I perhaps have not had a long professional life working with compilers (5+ years), but to me the definition of "compiles to binary" is too restrictive. The main things I care about in my work are:

1. To be able to perform some sort of static analysis on the program
2. To be able to transform the program representation

To other commenters: in Python, we have two program representations, the human-readable source text and the bytecode. Detecting syntax errors is a kind of static analysis. To me, the mapping between the source representation and the bytecode representation, and the classes of errors we can catch without running the program, are far more interesting than pigeonholing Python into the "compiled" or "interpreted" hole.
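
Both representations, and the "catch errors without running" part, are easy to poke at with the standard library; a small sketch:

  import dis

  source = "def double(x):\n    return x + x\n"

  code_obj = compile(source, "<example>", "exec")   # source text -> bytecode, nothing is executed
  dis.dis(code_obj)                                 # inspect the bytecode representation

  # A syntax error is caught purely statically, i.e. without running the program:
  try:
      compile("def broken(:\n    pass\n", "<example>", "exec")
  except SyntaxError as err:
      print("caught before running:", err)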


> Spanish Catalan architect Antoni Gaudí disliked drawings and preferred to explore some of his designs — such as the unfinished Church of Colònia Güell and the Sagrada Família — using scale models made of chains or weighted strings. It was long known that an optimal arch follows an inverted catenary curve, i.e., an upside-down hanging chain. Gaudí's upside-down physical models took him years to build but gave him more flexibility to explore organic designs, since every adjustment would immediately trigger the "physical recomputation" of optimal arches. He would turn the model upright by way of a mirror placed underneath or by taking photographs.

http://dataphys.org/list/gaudis-hanging-chain-models/


Shouldn't there be the possibility for civilizations to join forces?


There's also a huge damping variable due to communication latency between star systems, and of course travel time if you want two armadas to gang up on a third system. In fact this is exactly the two generals problem: you really don't want to launch an armada without confirmation from your ally, and if you do launch it, you face the same problem when trying to recall it.

https://en.wikipedia.org/wiki/Two_Generals%27_Problem


I was suggesting a simpler solution: just merge the two civs into one. This would be equivalent to having only one general.


Yeah, I think it would be interesting to add an aggression variable. If two species meet and their combined aggression is high enough, they fight (with some large bonus going to species with some combination of aggression and ability to detect, as it is assumed they can get off a devastating first strike).

If the species are not aggressive, they can presumably ally, and at least the weaker of the two should get some large bonus.

Of course then we could play with the first strike bonus and the alliance bonus, and get whatever result we want from the simulation.

