
>I'm hoping the technology will advance enough so that we eventually get one language that can navigate both side of the spectrum. Giving you the ease of Python/Ruby scripting, data analysis, and agility for medium projects, but letting you transition to Haskell/Rust safety nets and perfs with strict typing + memory safety without GC + good I/O hygiene in a progressive way. Something that can scale up, and scale down.

[...]

>Now you could argue that we can't have it all: choose the right tool for the right job. But I disagree. I think we will eventually have it all.

The idea of One Language that covers everything (or even most scenarios) is a seductive goal, but it's not mathematically possible to create, because the finite set of characters we use to design a language's syntax forces us to make certain concepts more inconvenient than others. This inevitably leads to multiple languages that emphasize different techniques. Previous comments on why this is unavoidable:

https://news.ycombinator.com/item?id=15483141

https://news.ycombinator.com/item?id=19974887

(One could argue that the lowest level of binary 0s and 1s is already the "One Programming Language For Everything" because it's the ancestor of all subsequent languages but that's just an academic distinction. Working in pure 0s and 1s is not a realistic language for working programmers and they'd inevitably find the syntax too inconvenient and thus invent new languages on top of it such as assembly code, Lisp, etc.)




> not mathematically possible to create because the finite characters we use to design a language's syntax forces us to make certain concepts more inconvenient than others

There is a huge number of combinations possible, especially once keywords come into play. I don't think that's the limitation.

One big limitation is that people who are good at creating dynamic languages are bad at creating strict ones, and vice versa.

You can see in a comment below that some people mention Swift, Raku, Kotlin as solutions to this problem (as I mentioned in my post would happen). But of course, they don't have the I/O solution Haskell has, the borrow checker Rust has, nor the agility or signal/noise ratio of Python. They have a compromise. A compromise that can be good. Those languages are well designed. But it's not "the ultimate solution", because they don't reach either end of the spectrum.


> There is a huge number of combinations possible, especially once keywords come into play. I don't think that's the limitation.

The mathematical limitation still remains even if you switch from 1-character symbols like '+' to verbose words like "plus()". You can attempt to invent a new programming language using longer keywords but you'll still run into contradictions of expressing concepts because there's always an implicit runtime assumption behind any syntax that hides the raw manipulation of 0s and 1s. If you didn't hide such assumptions (i.e. "abstractions"), then that means the 0s & 1s and NAND gates would be explicitly visible to the programmer at the syntax layer and that's unusable for non-trivial programs.

There's a reason why no credible computer science paper[0] has claimed to have invented the One Programming Language that can cover the full spectrum of tasks for all situations. It's not because computer scientists with PhDs over the last 50 years are collectively dumb. It's because it's not mathematically possible.

[0] e.g. one can search academic archives: https://arxiv.org/archive/cs


It would be like saying you can't have high-level languages because assembly has only a few limited combinations of registers. You can always abstract things away.

In fact, you are assuming there are no more general logical concepts left to discover that could abstract or merge what now appear to be conflicting paradigms.

I imagine people said that to Argand. Nah, you can't do that sqrt(-1) thing, you'll run into contradictions.

Given that we have been at the task for less than a century, which is like 1 minute of human history, I'm not inclined to such categorical rejection of the possibility.


Unlambda makes everything equally difficult!


I believe this is a failure of imagination. I'm not a beginner programmer. I've played with a lot of different languages and read about a lot more, and I still have a strong gut feeling that they can be unified if we just figure out the right framework. Note that I, for one, would consider a language that shepherds you towards highly interoperable DSLs to be (close to) a success; something like Racket that could generate efficient native binaries would be really close...

I don't believe for a second that syntax is the obstacle. You can express all the complexity you need with composition. Also, the One True Language will obviously have macros.


> I, for one, would consider a language that shepherds you towards highly interoperable DSLs [...] Also, the One True Language will obviously have macros.

But that just motivates someone else who isn't you to prefer another language that has that "DSL" and macros as the baseline syntax of the new language for convenience. Now you have 2+ languages again.

If someone else prefers not to type out extra parentheses ")))))" to balance things, and/or requires the highest performance with no GC, then a "Racket/Lisp-like" language can't be the basis of The One True Language.

>You can express all the complexity you need with composition.

True, and to generalize further, a Turing Complete language can be used to create another Turing Complete language. But the ability to build any complexity by composition is itself the motivation to create another programming language that doesn't require the extra work of composition.

For example, one can program in the C Language and combine its primitives to build the "C++ Language" (the first C++ compiler was C-with-classes), and the C++ Language can be used to build a Javascript interpreter (the Netscape browser was written in C++). And then Javascript can be used to build the first Typescript compiler. Thus we might say (via tortured logic) that the C Language lets you write a "DSL" as complicated as C++ and Javascript and Typescript. Even though that's true in a sense, people don't think of the "C Language" as the One True Language. It's the same situation as not thinking of the low-level 0s and 1s of NAND gates as being the One True Language, even though composition of NAND gates will let you build any other language.


> But that just motivates someone else who isn't you to prefer another language

Sure, they'll want to, and they probably will, but they won't have to due to performance or other constraints, which is how I understood the goal.

> But the ability to build any complexity by composition is itself the motivation to create another programming language that doesn't require the extra work of composition.

That's what the macros are for. All part of the plan.

> ... Even though that's true in a sense, people don't think of "C Language" as the One True Language.

C is certainly not a convenient language for hosting DSLs, due to insufficient abstraction capabilities, but the real missing ingredient is interop between the DSLs. C doesn't make it easy to pass data between them, etc.

NAND gates are not a great comparison. You want to be composing abstractions to create other composable abstractions. You could extend the analogy to composing circuits into bigger circuits, but that's really just converging back to a high level language.


> ... but they won't have to due to performance or other constraints

They will have to, because a GC runtime is too heavy for embedded environments with low resources.

>That's what the macros are for. All part of the plan.

But there's still the motivation for another language that doesn't require creating the macros. E.g. Lisp macros are so powerful that they can recreate C#'s syntax feature of LINQ queries. That's true -- but C# doesn't require making the macros.

Each programming language has a different "starting point" of convenience. If you try to invent the One Language that can create all other languages' convenience syntax via macros, you've simply motivated the existence of those other languages that don't require the macros.

And the NAND gate is an abstraction. It's an abstraction of decidable logic based on rules instead of thinking about raw voltages. We do combine/compose billions of NANDs abstractions to create higher abstractions.


GC is obviously optional for OTL. That's definitely one of the tricky bits (note: not a syntactic problem).

For the rest, I think your goal definition is too narrow. Removing every last desire for people to create alternatives is an unreasonable bar for literally any artifact, based on human psychology alone. That's explicitly not my goal (see previous comment), and with anyone who does have that goal you need to have an entirely different conversation (which, again, does not involve syntax).


>GC is obviously optional for OTL. That's definitely one of the tricky bits (note: not a syntactic problem).

If a GC language lets the programmer "opt out of GC" to mark a variable or block of memory as "fixed" so the GC doesn't scan it or move it to consolidate free space, how would one annotate that intention unless there's extra syntax for it?

Likewise in the opposite direction: if a non-GC language lets one "opt into GC", you will have ambiguities if you have syntax that allows raw pointers with dynamically calculated addresses to point at any arbitrary offset into a block of memory. That memory can't be part of the optional GC, which would otherwise invalidate the pointer by moving it. If you restrict the optional-GC language by banning undecidable dynamic pointers, you've created the motivation for another language with syntax that gives you the freedom of dynamic pointers!

The general case of "optional GC" that covers all situations and tuning its behavior is tied to syntax because you can't invent a compiler or runtime that can read the mind of the programmer.


Set a flag during the (optional) compile phase that tells the compiler to error out if it can't statically determine where to allocate/free. (No, it's not Halting-complete because the compiler has the option of bailing due to insufficient evidence) Without that flag, it still tries but will include a GC if needed. Same for types, btw.

Ok, fine, you probably want some annotations (like for types). You got me. There's syntax involved. It's still fundamentally a semantics problem. If that can be solved, the syntax will follow.

Your post reads like, "You want to add features? But you'll have to add syntax! It's impossible!" Even if syntax is necessary for GC-obliviousness (it isn't for type inference), it implies no more about whether the project is possible than that for any other feature. Note how far we've strayed from mathematical absolutes about possible strings.

Even on the semantics side, 50 years is far too early to declare defeat. There are no actual impossibilities stopping this, unless you have a formal proof you're not telling us about. Even that would just be a guide of how to change the problem definition, in the same way that Rice's Theorem tells us to add the "insufficient evidence" output to our program verification tools. Have some more imagination.


>Note how far we've strayed from mathematical absolutes about possible strings.

Well sure, we could theoretically concatenate all of today's programming languages' syntaxes into one huge string and call _that_ artificial mathematical construct The One True Language. But obviously we don't consider the OTL solved by that, so "mathematically possible strings" is constrained to mean "nice strings" advantageous to human ergonomics: reasonable lengths that are easy to read and type, with no ambiguous syntax causing contradictions in runtime assumptions, no long compile times, etc. E.g. I have no problem typing out balanced parentheses for Lisp, but I don't want to do that when writing a quick script in Linux, so Bash without all those parentheses is much more convenient.

>There are no actual impossibilities stopping this, unless you have a formal proof you're not telling us about.

The mathematical limitation is that all useful higher level abstractions must have information loss of the lower level it is abstracting. This can be visualized with surjection: https://en.wikipedia.org/wiki/Bijection,_injection_and_surje...

In the wiki diagram, we can think of 'X' as low-level assembly language and 'Y' as higher-level C Language. In C, a line of code to add 2 numbers might be:

  a = b + c;
In the wiki diagram we see X elements '3' and '4' both mapped to Y element 'C'. X-3 and X-4 can be thought of as strategy #3 vs strategy #4 for picking different cpu registers before the ADD instruction, and Y-C is the "a=b+c" syntax. In assembly you manually pick the registers, but in C you don't, because the gcc/clang/MSVC compilers do it for you. Because there are multiple ways in assembly to add numbers that all collapse to the equivalent "a=b+c", there is information loss. Most of the time C programmers don't care about registers, which is why the C abstraction is useful -- but sometimes you do care, and that's why raw assembly is still used.

You can't make an OTL whose syntax handles both the semantics of assembly and of C. If you argue that C can have "inline assembly", that doesn't cover code that must run before the C runtime is loaded, prior to "main()". And embedding asm in C is still considered by programmers to be 2 languages rather than one unified one.
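The many-to-one collapse can be sketched in a few lines (a toy model: the asm strings and the mapping are made up for illustration, not real compiler output):

```python
# Toy model of the surjection: two hypothetical register-allocation
# strategies (X-3 and X-4) both map to the same C line (Y-C).
lowerings = {
    "mov eax, [b]; add eax, [c]; mov [a], eax": "a = b + c;",  # strategy #3
    "mov ecx, [c]; add ecx, [b]; mov [a], ecx": "a = b + c;",  # strategy #4
}

# Two distinct low-level programs, one high-level line: going back up
# from "a = b + c;" cannot recover which registers were chosen.
print(len(lowerings), "assembly variants ->",
      len(set(lowerings.values())), "C line")
```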

Or we can think of 'X' as the low-level C/C++ language that has the numeric data types "short, int, long, float, double", and 'Y' as the higher-level Javascript that has only 1 number type: an IEEE 754 double-precision float, which maps to C's "double". This means Javascript's "information loss" is the fine-grained usage of 8-bit, 16-bit, and 32-bit ints.

If programmer John attempts to design an OTL, he will have to choose which information in the lower layer is "lost" in the runtime assumptions of the higher-level OTL. Since John's surjection can't cover all scenarios, it motivates another programming language being created. An assumption of GC in the language runtime creates some information loss. Even an optional GC is an abstraction that creates information loss about how to manually manage memory at a lower level of abstraction.


OTL does not need to be surjective onto the set of all binary programs. You only get "information loss" when you try to go backwards, from the end result to the intent. That's reverse engineering, not programming. Now, during translation, the compiler might fill in some information you didn't care about. If you do care about specific instructions and registers for some part of your program, supply them. You probably want to have an assembly DSL that knows about how to integrate with the other code rather than embedding strings. You probably can generate any assembly this way, if just by writing exclusively in the assembly DSL, but the actual requirement is to correctly translate all valid specs.


> You only get "information loss" when you try to go backwards, from the end result to the intent. That's reverse engineering, not programming.

Instead of "information loss", another way to put it is "deliberately reduced choices that make the abstraction useful and ease cognitive burden". That way it doesn't carry connotations of reverse engineering, because the limitations of surjective mapping are very much about forward engineering.

E.g. I look at Javascript and think forward, as an engineer, about how I want to use integers larger than 2^53. Javascript's "simpler abstraction of 1 number type" loses the notion of a true 64-bit int with a range up to 2^64. Therefore, I don't use Javascript if I need that capability. This means Javascript can't be the OTL for all situations. Your suggestion of a Racket-like language as a candidate for the OTL has the same problem: it will always have gaps in functionality/semantics/runtime that make others not want to use it, and therefore they create Another Language with the desired semantics.
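Python's default float happens to be the same IEEE 754 double that Javascript uses for all its numbers, so the 2^53 cliff is easy to reproduce (a quick sketch, not a claim about any particular JS engine):

```python
# Python's int is arbitrary precision; its float is an IEEE 754 double,
# i.e. the only number type Javascript's abstraction gives you.
big = 2 ** 53

# As an exact integer, adding 1 is observable:
print(big + 1 > big)                  # True

# As a double, 2^53 + 1 is not representable and rounds back down:
print(float(big) + 1 == float(big))   # True: the +1 is silently lost
```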

Supplementing the gaps via the ability to write custom DSLs and macros doesn't solve it. Lisp already has that today, and it's not the OTL. If programmer John extends Lisp with macros to simulate monads, he'll spell his macro his way, but programmer Bob will spell his macro differently. Now they've created 2 personal dialects of Lisp instead of a larger unified One True Language.

Rereading your comments, I think you're really saying that it's possible to invent the OTL for you, andrewflnr. That's probably true, but unfortunately, that's not a useful answer when the programming community is confused as to why there isn't a universal OTL yet. They're talking about the OTL that everybody can use that covers all scenarios from low-level embedded C Language to scripting to numeric computing to 4GL business languages where SQL SELECT statements are 1st class and don't require double quotes or parentheses or loading any database drivers. Such a universal programming language, if it could exist, would make the "One" in "One True Language" actually mean one.


Most/all languages today take away options. Any OTL would just provide defaults. Details are optional but always possible. I thought I was pretty clear about that re assembly. That's barely even one of the hard parts.

I'm well aware of what it means to have a language for everyone to use. I'm thinking of everything from bootloaders to machine learning to interactive shells. The reason there isn't one yet is that it's really hard. Lots of basic theory about how to think about computation is still being sounded out. Unifying frameworks have been known to take a few decades after that. Still no reason to think it's impossible.

You're just repeating that there will always be gaps, with no evidence except gaps in languages produced by today's rushed, history-bound ecosystem. You're trying to use JS as an illustration of an OTL, which is baffling. Having a limited set of integer sizes would obviously not fly.

I'm apparently not getting the vision across. This is not even a type of thing that exists today, which is why I keep saying to use more imagination. Racket is only close due to its radical flexibility in inputs and outputs.


>Any OTL would just provide defaults.

And you will inevitably have defaults that contradict each other which motivates another language.

Another way of saying "default" is "concepts in the programming language we don't even have to explicitly type by hand or have our eyeballs look at."

What should the OTL's default be, when no explicit datatype is typed in front of the following _x_, that works for embedded, scientific numeric, and 4GL business code alike?

  x = 3
Should the default _x_ be a 32-bit int, a 64-bit int, or a 128-bit int? A 64-bit double-precision float? An arbitrary-precision decimal (512+ bits, memory expandable) or an arbitrary-size integer (512+ bits, expandable)?

Should the default for x be const or mutable? Should the default for x have overflow checks or not? Should the default for x be stored in a register or on the stack? Should the name 'x' be allowed to shadow an 'x' defined at a higher scope? What about the following?

  x = 3/9
Should x become an approximation, 0.3333..., or should x preserve the underlying symbolic representation of 2 rationals with a divide operator (3, div, 9)?
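Python already had to make this exact call, and whichever reading you don't get by default costs extra syntax (a small sketch; note that `Fraction` even normalizes away the original 3 and 9):

```python
from fractions import Fraction

x = 3 / 9            # the chosen default: an approximating IEEE 754 double
y = Fraction(3, 9)   # the exact-rational reading requires explicit opt-in,
                     # and it normalizes (3, div, 9) down to 1/3 anyway

print(x)             # 0.3333333333333333
print(y)             # 1/3
```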

The defaults contradict each other at a fundamental level. The default for x cannot simultaneously be both a 32-bit int and a 512-bit arbitrary-precision decimal. We don't need a yet-to-be-discovered computer science breakthrough to understand that limitation today.

If we go meta and say that the default interpretation for "x = 3" is that it's invalid code and the programmer must type out a datatype in front of x to make it valid, then that choice of default will also motivate another language that doesn't require manually typing out an explicit datatype!

Therefore, we can massively simplify the problem from "One True Language" to just the "One True Datatype" -- and we can't even solve that! Why is it unsolvable? Because the OTD is just another way of saying "read the mind of the programmer and predict which syntax he doesn't want to type out explicitly for convenience in the particular domain he's working in". This is not even a well-posed question for computer science research. Mind-reading is even more intractable than the Halting Problem.

As another example, the default in the awk language -- without even typing an explicit loop -- is to process text line-by-line from top to bottom. This is not a reasonable default for C/Javascript/Racket/etc. But if you make the default in the proposed OTL to have no implicit text-processing loop in the runtime, it motivates another language (such as awk) that provides it. You can't have a runtime with both simultaneous properties of implicit-text-loop and text-loop-must-be-manually-coded.
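The same point in miniature: awk's free per-line loop must be spelled out in a general-purpose language (a sketch using an in-memory stream so it runs standalone):

```python
import io

def count_lines(stream):
    # awk runs its program once per input line implicitly; here the
    # loop over the lines has to be written out by hand.
    n = 0
    for _line in stream:
        n += 1
    return n

# awk equivalent: awk 'END { print NR }' file
print(count_lines(io.StringIO("one\ntwo\nthree\n")))  # 3
```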

Whatever choice you make as the defaults for OTL, it will be wrong for somebody in some other use case which motivates another language that chooses a different default.

>Details are optional but always possible.

Yes, but any extra possibilities will always require extra syntax that humans don't want to type or look at. Again, it's not about what's possible. It's about what's easy to type and read in the specific domain the programmer is working in.

>You're just repeating that there will always be gaps, with no evidence except gaps in languages produced by today's rushed, history-bound

Are you saying you believe that abstractions today have gaps but tomorrow's yet-to-be-invented abstractions can be created without gaps and we just haven't discovered it yet because it's really hard with our limited imagination? Is that a fair restatement of your position?

Gaps don't just exist because of myopic accidents of history. Gaps must exist because they are fundamental to creating abstractions. To create an abstraction is to create gaps at the same time. Gaps are what make the abstraction useful. A map (whether a folded paper map or online Google Maps) is an abstraction of the real underlying territory. The map must have gaps of information loss, because otherwise the map would be the same size and the same atoms as the underlying territory -- and thus no longer a "map".

The mathematical concept of "average", or mean, is an abstraction tool: sum a set of numbers and divide by the count. The "average", as one kind of statistical shorthand, adds reasoning power by letting us ignore the details -- but to do so it also has gaps, because there is information loss of all the individual elements that contributed to that average. That unavoidable information loss is what makes "average" usable in speech or writing. You cannot invent a new mathematical "average" that preserves all elements with no information loss, because then it is no longer an average.

We can write "the average life expectancy in the USA is 78.6". We can't write "the average life expectancy in the USA is [82, 55, 77, 1, ...300 million more elements divided by 300 million]" because that sentence would be a gigabyte in size and incomprehensible. You can invent a different abstraction such as "weighted average" or "median" or "mode", but those other abstractions also have information loss; you're just choosing different information to throw away. We can't just say we're not using enough imagination to envision a new type of mathematical "average" that would let us write an alternative sentence preserving every individual age element without the sentence being a gigabyte in size.
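The information loss is concrete: two different populations can share the same average, and the average alone cannot tell them apart (toy numbers, obviously):

```python
def mean(xs):
    # the abstraction: collapse a whole list into one number
    return sum(xs) / len(xs)

a = [78, 79, 80]   # tightly clustered ages
b = [40, 80, 117]  # wildly spread ages
print(mean(a), mean(b))  # both 79.0: the individual elements are gone
```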

>JS as a illustration of an OTL, which is baffling.

No, I was using JS as one example of how surjection affects forward engineering. When I say "this means Javascript can't be the OTL for all situations", I'm saying all programming languages above NAND gates will have gaps, and thus you can't make an OTL.

What's baffling is why anyone would think Racket's (1) defaults + (2) DSLs + (3) macros would even be a realistic starting point for the universal OTL. The features (1,2,3) you propose as ingredients for the universal OTL are the very same things that motivate other alternative languages to exist! Inappropriate defaults motivate another language with different defaults. The ability to write DSLs motivates another language that doesn't require coding that DSL. The flexibility of coding macros motivates another language that doesn't require coding the macro.


I don't think syntax is the big limitation here; it's library and behavior design.

The One Language concept could still be considered accomplished by a language with two syntaxes, provided they interoperate with low friction.



