> similar in scope to a second implementation of the Rust compiler. I know. That...

nobodywasishere · 2024-09-06T23:15:01 1725664501

I disagree. A compiler for batch building programs and a compiler for providing as much semantic information about incomplete/incorrect/constantly changing programs are completely different tasks that require completely different architectures and design considerations.

ReleaseCandidat · 2024-09-07T06:07:23 1725689243

No. Actually "interactive" frontends in batch compilation mode generally have better error messages in this mode too. Yes, it may make the batch compilation (the frontend part) sligthly slower, but won't turn Go into Rust (or Haskell or C++).

And there always is the possibility to stop in batch mode when the first error occured.

forrestthewoods · 2024-09-06T23:33:21 1725665601

I don’t think that’s true at all.

First of all, a compiler for a 100% correct program definitely has all the necessary information for robust intellisense. They don’t currently save all the data, but it should exist.

So the only real question is whether they can support the 0.01% of files that incomplete and changing?

I’ll readily admit I am not a compiler expert. So I’m open to being wrong. But I certainly don’t see why not. Compilers already need to support incorrect code so they can print helpful error messages. Including different errors spread through out a single file.

It may be that current compilers are badly architected for incremental intellisense generation. But I don’t think that’s an intrinsic difference. I see no reason that the tasks require “completely different architectures”.

troupo · 2024-09-07T05:28:11 1725686891

> First of all, a compiler for a 100% correct program definitely has all the necessary information for robust intellisense.

It doesn't. Intellisense is supposed to work on 100% incorrect and incomplete programs. To the point that it should work in syntactically invalid code.

forrestthewoods · 2024-09-07T05:39:50 1725687590

> Intellisense is supposed to work on 100% incorrect and incomplete programs.

Correct. I literally discussed this scenario in my comment!

If the program compiles successfully then the compiler has all the information it needs for intellisense. If the program does NOT fully compile then the compiler may or may not be able to emit sufficient intellisense information. I assert that compilers should be able to support this common scenario. It is not particularly different from needing to support good, clear error messages in the face of syntactically invalid code.

troupo · 2024-09-07T08:44:52 1725698692

> I assert that compilers should be able to support this common scenario

Not necessarily. These are two very different tasks quite at odds with each other

forrestthewoods · 2024-09-07T08:56:38 1725699398

> These are two very different tasks quite at odds with each other

Are they? I feel like intellisense is largely a subset of what a compiler already has to do.

I’d say the key features of an LSP are knowing the exact type of all symbols, goto definition, and auto-complete. The compiler has all of that information.

Compilers produce debug symbols which include some of the information you need for intellisense. I wrote a PDB-based LSP server that can goto definition on any function call for any language. Worked surprisingly well.

If you wanted to argue that intellisense is a subset of compiling and it can be done faster and more efficiently I could buy that argument. But if you’re going to declare the tasks are at odds with one another I’d love to hear specific details!

zaksingh · 2024-09-07T13:52:33 1725717153

On the efficiency angle, I think a big difficulty here that isn’t often discussed is that many optimization strategies relevant to incremental compilation slow down batch compilation, and vice versa.

For example, arena allocation strategies (i.e internment of identifiers and strings, as well as for allocating AST nodes, etc) is a very effective optimization in batch compilers, as the arenas can live until the end of execution and therefore don’t need “hands on” memory management.

However, this doesn’t work in an incremental environment, as you would quickly fill up the arenas with intermediary data and never be deleting anything from them. This is one reason rust-analyzer reimplements such a vast amount of the rust compiler, which makes heavy use of arenas throughout.

As essentially every programming language developer writes their batch compiler first without worrying about incremental compilation, they can wind up stuck in a situation where there’s simply no way to reuse their existing compiler code for an IDE efficiently. This effect tends to scale with how clever/well-optimized the batch compiler implementation is.

I think the future definitely lies in compilers written to be “incremental first,” but this requires a major shift in mindset, as well as accepting significantly worse performance for batch compilation. It also further complicates the already very complicated task of writing compilers, especially for first-time language designers.

thramp · 2024-09-07T14:17:43 1725718663

That's a great point about allocation/memory management. As an example, rust-analyzer needs to free memory, but rustc's `free` is simply `std::process::exit`.

If I remember correctly, the new trait solver's interner is a trait (https://doc.rust-lang.org/nightly/nightly-rustc/rustc_trait_...) that should allow rust-analyzer's implementation of it to free memory over time and not OOM people's machines.

> I think the future definitely lies in compilers written to be “incremental first,” but this requires a major shift in mindset, as well as accepting significantly worse performance for batch compilation. It also further complicates the already very complicated task of writing compilers, especially for first-time language designers.

I'm in strong agreement with you, but I will say: I've really grown to love query-based approaches to compiler-shaped problems. Makes some really tricky cache/state issues go away.

gugagore · 2024-09-07T16:41:08 1725727268

I thought that rust's compiler was indeed written to be incremental first. Check a sibling comment of mine for reasons why I thought so.

thramp · 2024-09-07T14:11:10 1725718270

> Are they? I feel like intellisense is largely a subset of what a compiler already has to do.

They are distinct! Well, not just intellisense, but pretty much everything. I'll paraphrase this blog post, but the best way to think about think about the difference between a traditional compiler and an IDE is that compilers are top-down (e.g, you start compiling a program from a compilation unit's entrypoint, a `lib.rs` or `main.rs` in Rust), but IDEs are cursor-centric—they're trying to compile/analyze the minimal amount of code necessary to understand the program. After all, the best way to go fast is to avoid unnecessary work!

> If you wanted to argue that intellisense is a subset of compiling and it can be done faster and more efficiently I could buy that argument. But if you’re going to declare the tasks are at odds with one another I’d love to hear specific details!

Beyond the philosophical/architectural difference I mentioned above, compilers typically have a one-way mapping between syntax and mapping, but to support things like refactors or assists, you often need to do the opposite: go from semantics to syntax. For instance, if you want to refactor from struct to an enum, you often need to find all instances of said struct, make the semantic change, then construct the new syntax tree from the semantics. For simple transformations like a struct to an enum, a purely syntax-based based approach might work (albeit, at the cost of accuracy if you have two structs with same name), but you start to run into issues when you consider traits, interfaces (for example: think about how a type implements an interface in Go!), or generics.

It doesn't really make sense for a compiler to support above use cases, but they're are _foundational_ to an IDE. However, if a compiler is query-centric (as rustc is), then it's pretty feasible for rustc and rust-analyzer to share, for instance, the trait solver or the borrow checker (we're planning/scoping work on the former right now).

gugagore · 2024-09-07T16:28:58 1725726538

Other comments have addresses many always of your comment. The constantly changing part is also an important feature for recompilation being more efficient than recompiling from scratch each time. You can read about it here: https://rustc-dev-guide.rust-lang.org/queries/query-evaluati...

There is recording of a talk on YouTube from Niko Matsakis that goes into the motivation.

In conclusion, you don't really want to optimize for the batch use case, even outside of IDE support.

sph · 2024-09-07T08:06:53 1725696413

Nonsense. Given that the end user of both is a human, you want the compiler that builds program to know as much about semantics to aid in fixing buggy/incomplete/incorrect programs.