More and more I want someone to create a new language that amounts to a strict subset of Python, with mypy built-in, and is compilable into machine code. Python has by far my favorite syntax, community, and in my experience leads to the greatest productivity. There just happen to be a lot of overly dynamic features that aren't even used by most, but are used just enough to hold back optimization and structural improvement.
What would you say are the biggest blockers to this becoming realistic? I saw on the README that they need tools in the Python ecosystem to start utilizing them, which I can help with, starting with isort; beyond that I'd want to do whatever I can to help the project succeed.
Probably the biggest issue is that it can't run many libraries and frameworks because they use a lot of dynamic features, i.e. reflection and metaprogramming.
To be more specific: getattr, operator overloading, descriptors, heterogeneous dicts, decorators, etc.
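For anyone who hasn't bumped into these, a quick illustrative sketch (all names invented) of why they resist static analysis: each feature below defers a decision to runtime.

import os

class Env:
    def __getattr__(self, name):
        # Reflection: the attribute name is only known at runtime.
        return os.environ.get(name.upper(), "")

class Meters:
    def __init__(self, value):
        self.value = value
    def __add__(self, other):
        # Operator overloading: what "+" means depends on the operand type.
        return Meters(self.value + other.value)

settings = {"debug": True, "retries": 3, "hosts": ["a", "b"]}  # heterogeneous dict

def logged(fn):
    # Decorator: rebinds (and may arbitrarily replace) a function at import time.
    def wrapper(*args, **kwargs):
        print("calling", fn.__name__)
        return fn(*args, **kwargs)
    return wrapper

@logged
def greet():
    return "hi"

greet()                               # prints "calling greet"
print(getattr(Env(), "home"))         # attribute chosen by a runtime string
print((Meters(1) + Meters(2)).value)  # 3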
Type checking and metaprogramming are fundamentally at odds [1]. Dynamic languages like Python have more of a focus on the latter. They later added type checking, but it comes at the "cost" of ruling out the more idiomatic metaprogramming and reflection features. In other words, static typing makes your source code bigger.
Well, optional typing to some degree lets you have the best of both worlds -- you can skip type checking of the hard parts. But optional typing doesn't let you compile your program to make it faster -- you need a fully-typed program for that.
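A tiny sketch of that tradeoff (function names made up): a checker like mypy is happy with both of these, but only the first gives a compiler enough to work with.

from typing import Any, List

def scale(xs: List[float], k: float) -> List[float]:
    # Fully typed: a compiler could specialize this into a tight native loop.
    return [x * k for x in xs]

def visit(node: Any) -> Any:
    # The "hard part", skipped via Any: still passes the checker, stays
    # dynamic, and can't be compiled to anything faster than generic dispatch.
    return node.children[0] if node.children else node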
----
I'm doing something similar to mypyc with https://www.oilshell.org/ (I actually visited Dropbox and chatted with them about it back in the spring.)
The difference is that I'm compiling Oil's Python source to C++ rather than to Python-C extension modules. So it doesn't depend on the Python runtime. It's not done but it's working well so far, and it's given me a lot of appreciation for which dynamic features Python programs actually use! (both my own and others')
Also note that mypyc was used to speed up mypy, which is a type checker. A type checker is a very particular kind of program that's different than 99% of the use cases of Python. So success on speeding it up is super impressive but it's not clear it generalizes.
The same is true for Oil -- my translation work doesn't generalize to arbitrary Python programs. Lots of people have died on that hill because it's really hard. You have a hard tradeoff between the kinds of Python programs you can support and the speedup you can give them. There are 10-20 projects over the last 2 decades at various points along that spectrum. In addition to mypyc, Oil's strategy was also inspired by Shed Skin, which is an impressive but mostly dormant Python-to-C++ compiler.
----
So in short I would say the problem is that nobody will be able to agree on a subset. You will have a lot of different fragments of Python geared toward particular use cases.
But Python will very often be more appealing than any of those fragments because it has a bigger ecosystem. One thing that I've appreciated more and more while designing a language is how much the network effects and inertia matter. It's why we're still using C and C++ after almost 50 years. I'm sure every day there is still a lot more C++ written than Go, Rust, Swift, and D combined, etc.
Python has a similar network effect and it will be around basically forever in its current form. Software doesn't really get rewritten or reduced -- more stuff just gets added on top.
> type checking and metaprogramming are fundamentally at odds [1]
I would say that dynamic metaprogramming is at odds with type checking (and optimizations, and in general understanding the behaviour of a program statically). But of course metaprogramming can be done perfectly fine in a statically typed language.
There isn't anything specific to static type systems in those two links. Both OCaml and Rust had powerful plugin systems that exposed the internals of the compiler AST, and they decided that they do not want to expose those as a stable interface. But exposing implementation details is not required to have a powerful metaprogramming environment. As far as I understand, Rust macros never did expose these internals and had no such issues.
Also compare with C++ metaprogramming, whose syntax is certainly awful (although it has been continually improving) but which works perfectly fine with its type system (in fact most metaprogramming in C++ is done via the type system).
On the other hand CPython also exposes the runtime internals to plugins and de facto that prevents the language from evolving and alternative implementations to gain a foothold, so the issue of exposing implementation details preventing language evolution is not restricted to static languages.
edit: the hard part of typing and metaprogramming is making sure your metaprograms are well typed, i.e. the generated program is guaranteed to typecheck. This is great, but it is not a strict requirement; if you are happy with syntax macros a la Rust or unconstrained templates a la C++, there is no particular issue. Your generated program will still be typechecked at compile time, which is still better than having a runtime error because of a bad metaprogram.
The point is that they all struggle with the design of metaprogramming. Why are there 4+ different systems in OCaml as "addons" whereas in Lisp and Python it's integrated in the language?
It's exactly analogous to types being integrated in OCaml, Rust, etc., whereas in Python, JS, Ruby, and PHP static typing is an "add-on".
I like Python, but I often wonder how many developers use Python because they actually use dynamic language features versus just liking the language's clean syntax and library ecosystem. I'm surprised languages that offer both a REPL (for development) and AOT native compilation (for production), like OCaml, are not more popular. Evidence that syntax matters, I guess. :)
mypy and mypyc are interesting but their compile-time checks and optimizations are still hampered by Python's dynamic language semantics.
Don’t underestimate inertia. I’ve worked with Python and Django for seven years. I know the libraries in the ecosystem. I know the framework. It’s far easier for me to start a project with Django than to learn another framework or language.
I think a great deal of this sort of thing could be done by just doing some eval in a dynamic state, then stopping the VM and compiling its stable state, rather than compiling the actual source code.
I think you missed part of the point of what the article was trying to say - or rather, what they hoped to do with this strict Python. One of those things being some form of hot code loading. A snapshot of the state can't be incrementally rebuilt - it's very much all or nothing; whereas if we know our modules are side-effect free, or at least some useful part of module loading is, we could cache that part and get faster start-up times on incremental changes.
I have not used TypeScript, but looking at its documentation, the syntax for type annotations looks identical. Would you be willing to expand on why you think its approach is better / how it's different?
> Inspired by Scala language [5], this proposal adds operator __or__() in the root type. With this new operator, it is possible to write int | str in place of Union[int,str]
Watch the difference: A variable that can be an object with two elements (one of them being a list of strings), or a tuple of exactly two strings.
// typescript
let t: {a: string[], b: number} | [string, string]
# python 3.8
from typing import List, Tuple, TypedDict, Union

class SomeTypedDict(TypedDict):
    a: List[str]
    b: Union[float, int]

t: Union[SomeTypedDict, Tuple[str, str]]
I had to google a bunch to figure out how to write the Python version, whereas the typescript one was completely natural to write. It takes one line and requires no imports. The interface is inlined. All of this also makes it more readable when you come across it eg. in an IDE tooltip.
Granted, in Python, I'd call the use of a typed dict a smell. If you're able to spend the time creating the typed dict, just promote it to a dataclass. Using Python 3.10+ (for the PEP 604 union syntax), this will look like
import dataclasses
from typing import List, Tuple

@dataclasses.dataclass  # or @attr.s
class MyStruct:
    a: List[str]
    b: int | float

t: MyStruct | Tuple[str, str]
But MyStruct will be an actual object that can be manipulated as an object. And if you want to accept any object that fits that interface, instead of just instances of MyStruct,
from typing import List, Protocol

class MyStructTmpl(Protocol):
    a: List[str]
    b: int | float
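To make the structural part concrete, a small usage sketch (names invented): anything with matching attributes satisfies the protocol, no inheritance or registration needed.

def total_entries(s: MyStructTmpl) -> int:
    return len(s.a)

class Unrelated:
    def __init__(self) -> None:
        self.a = ["x", "y"]
        self.b = 1.5

total_entries(MyStruct(a=["x"], b=2))  # fine: a real dataclass instance
total_entries(Unrelated())             # also fine: matches structurally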
In JS having the typed-dict type makes sense because you're often working with arbitrary objects with who knows what attributes, but in Python that isn't the case. There are fairly succinct and powerful tools (now, anyway) to define record types.
I'll give you that in controlled code the use of typed dicts would be a symptom of a code smell, but in less controlled environments where you're dealing with eg. JSON inputs, form inputs, SQL table results and so on … not so.
I'm also not onboard the "it's a code smell, it doesn't matter" train. IMO if python adopted the typescript typing syntax we'd all be better for it.
I also forgot to mention the atrocious typing syntax for functions. Once again, TypeScript is a lot more succinct and readable.
> but in less controlled environments where you're dealing with eg. JSON inputs, form inputs, SQL table results and so on … not so.
I more or less agree with this, but then again, IMO you should be isolating the less controlled code behind a controlled api. And the marginal value of converting `_ConvertQueryResultDictToQueryResult(qr: Dict[str, Any]) -> QueryResult` to something that uses a typeddict instead (which may not be possible, since that function is probably generic) is low.
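For concreteness, a rough sketch of what keeping that conversion at the boundary looks like (the field and key names are invented):

import dataclasses
from typing import Any, Dict, List

@dataclasses.dataclass
class QueryResult:
    rows: List[Dict[str, Any]]
    row_count: int

def query_result_from_dict(qr: Dict[str, Any]) -> QueryResult:
    # The single place that touches the untyped dict; everything
    # downstream works with a typed object.
    return QueryResult(rows=list(qr["rows"]), row_count=int(qr["rowCount"]))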
> I'm also not onboard the "it's a code smell, it doesn't matter" train.
Emphatically, this isn't what I'm saying. What I will say is that ergonomics encourage certain methods of development. From experience, I'm strongly against the pattern of using a dict as a weak struct. The best comparison I can give is tuple -> namedtuple -> attrs. Namedtuple has absolutely valid uses (when you need tuple semantics, usually for backwards compatibility). But people often use it for any record type, because it's easy and familiar. Dataclasses are usually better, and I'd be happier (and the average python code would be better) if the friction to add a dataclass was lower than the friction to add a namedtuple.
Similarly, if the friction to use a dict in place of an object is much lower, people will be encouraged to use dicts in place of objects. This isn't a good thing. That doesn't mean that we absolutely shouldn't try to improve ergonomics across the board, but I'm a strong believer that the language should make doing the right thing easier than doing the wrong thing, and this is often (but not always!) the wrong thing.
Yeah I get what you're saying. And indeed. I seldom use dataclasses even though I should. Or namedtuples for that matter. It feels like the fact they're an import away makes them harder to use.
strongly agree. I love the extra clarity type annotations bring to the code, though going back to the start of the file every time I want to add an import is a slight deterrent.
> I think the killer language will be typescript with access to both the python and JavaScript ecosystems. We'll see what that looks like.
I think this is an extremely good idea. Python is horrible but forced on a huge number of developers because of its ecosystem ... I think a bridging layer from typescript to python could be built in a way similar to swift’s Python Interop — and I don’t think it would require any special language support ...
I think one could actually make a better/easier-to-use/more robust design than Swift's by requiring all interactions with the Python interpreter from Node be async.
> Python is horrible but forced on a huge number of developers because of its ecosystem
This is a really interesting perspective to me. Coming from Python circles, I've heard too often how horrible JavaScript is as a language and how it's only used because the web has dictated it. Doing web development, I've used both, and generally am inclined to agree. I know TypeScript adds some niceties on top of it, but it is still stuck with JavaScript baggage. My perspective has always been that Python is by far the better language, which is why people have written that ecosystem in it despite the fact it doesn't have a built-in monopoly on the browser.
I don't think there is any sort of long-term future in anything "Python". I think a successful modern language has to have the potential for efficient concurrency baked in, which isn't really possible without breaking compatibility, and the Python community would never survive another round like the 2->3 transition. (And I'm not convinced the community really survived that one either, given the amount of ongoing bitterness about the whole situation).
Python is very healthy at the moment, and still growing! However, I think that makes it even more important as a community that we don't rest on our laurels and we fix the issues we do have. CPU concurrency is definitely one of those issues.
In my assessment, not really. They are surprisingly spot on (for the top spots and singling out contenders).
Not to mention the language landscape is almost static at the top. Nobody's gonna come and take Python, Java, C, C++, or JS out in the next 10-15 years...
Only a huge self-blunder, like the Perl 5 -> 6 transition, and only at a much more volatile time (when paradigms change) can do any serious damage to a top language. E.g. when web dev changed from CGI, Perl 5 had already lost the web framework scene to PHP, Rails, Django and the like, even before losing its main niche back then - admin work.
Well, for instance, TIOBE has Rust at #34, below such mainstays as ABAP and COBOL.
Do you believe that to be incorrect? I think you're probably underestimating how widely used those languages are in massive, "boring" companies around the world. Rust may be the new cool kid, and may even be the future, but the number of companies around the world that have adopted Rust for anything significant today is minuscule.
It has Groovy above Ruby.
Again, Java is everywhere, and many Java shops have added Groovy to their workflow where it makes sense. Ruby is barely used outside a small number of tech companies.
Well, Rust is probably well under COBOL, that's for certain.
Not in momentum, but there are tons of installations, and billions of lines of COBOL ever churning. If LOC were the main criterion (and not just a factor), and if COBOL projects were hosted on GitHub, most languages would be dwarfed by it.
And Groovy is semi-popular in the Java world, which is huge itself.
But as I said, TIOBE is very good in the top-10 languages, and for spotting new major contenders (by how they jump up spots).
It's not great for relative ranking of the longer tail of languages above the top-10 / top-20...
> I think a successful modern language has to have the potential for efficient concurrency baked in
I agree with this!
> I don't think there is any sort of long-term future in anything "Python".
I disagree with this :)
I think Python has efficient IO concurrency built in already with async, and I feel it is likely that it finds a way to work out CPU-bound concurrency long-term, as projects like subinterpreters with channel communication (PEP 554) demonstrate.
Efficient CPU concurrency needs either fine-grained locking (and removal of GIL), or data immutability.
Both seem rather hard to implement.
I predict that for CPU-intensive tasks, you'll keep using extensions in native code (like numpy or pytorch), or keep passing serialized objects through queues in multiprocessing setups.
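A minimal sketch of that last pattern, stdlib only (the worker function is a stand-in):

import multiprocessing

def crunch(chunk):
    # CPU-bound work runs in a separate process, sidestepping the GIL.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = [range(1_000_000)] * 8
    with multiprocessing.Pool() as pool:
        # Arguments and results are pickled and shipped through queues
        # behind the scenes -- the serialization cost mentioned above.
        print(sum(pool.map(crunch, data)))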
https://github.com/jreese/aiomultiprocess
The above is sufficient at FB scale for fairly intensive processing. Python is also sufficient for running quite a good bit of Instagram. I know some startups like to deploy over-engineered solutions, but in reality Python is sufficient in many use cases. You can always drop down to Cython if you have some hot path you need to optimize (or Rust).
Aside from having a two-decade history of using C#, types are the only thing preventing us from going full Python. Even so, the dynamic types in Python are more often a benefit than a disadvantage, because Python is so great at handling them automatically.
We build our employee database, and from there our IDM, from a single XML file in a really shitty format plus three txt files in even worse formats (they are single-line output files from an old mainframe system predating SAP). We used to do it in a rather complicated Microsoft SSIS workflow with a lot of C# services. All in all it was a 30-minute nightly runtime. I recently replaced it with around 500 lines of Python and a 1-5 minute runtime (sometimes at the beginning of a school year we'll see changes to around 1000 positions).
Python eats the XML like it wasn't shit. It takes things like terrible date formats, we're talking output-of-a-SAP-free-text-box shitty, and ports them seamlessly into a SQL date field. This alone was a nightmare in C#, and Python just does it.
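Not the parent's actual code, but a sketch of the flavor (the XML shape is invented; dateutil is the third-party python-dateutil package):

import xml.etree.ElementTree as ET
from dateutil import parser  # pip install python-dateutil

doc = ET.fromstring(
    "<employees><emp><name>Ann</name><start>03.01.19</start></emp></employees>"
)
for emp in doc.iter("emp"):
    # dateutil copes with most free-text date formats and returns a
    # datetime, which DB drivers can bind to a SQL date column directly.
    start = parser.parse(emp.findtext("start"), dayfirst=True).date()
    print(emp.findtext("name"), start)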
Still, after two decades of strict types it feels dangerous.
I can imagine the next big programming language will be one that is split into two language-variants: the "low-level-variant" and the "high-level-variant".
The high-level-variant is a dynamic language with optional typing, which is good for scripting, fast prototyping, fast time-to-market, etc.
The low-level-variant is similar to the high-level-variant (same syntax, same features mostly, same documentation), but it has no garbage collector, typing is mandatory and it runs fast like C/C++/Rust. Compiled packages that are written in the low-level-variant can be used from the high-level-variant without additional effort at all. The tooling to achieve this comes with the language.
A key consideration here would probably be how managed instances spawned in the high-level variant are expressed and passed around in low-level code. Would you explicitly retain and release them? Etc. I think this needs an ergonomic solution for such a language to provide an edge over just using C / C++ / etc. with Lua / Python / etc.
Very much this! For numerical computing, Numba + llvmlite attempts to do it.
I don't know, however, if this approach could be extended to other domains - say, making a web framework. Given how much tinkering Python classes let you do, any attempt to port existing code will probably need a lot of rewriting?
> I've been tracking nim, and would agree it's the most promising so far! I feel though that it's trying to be too flexible in many ways. Examples of this include allowing multiple different garbage collectors and encouraging heavy ast manipulation. I'm also afraid it is different enough to keep it from attracting a significant amount of developers from the Python community. Nonetheless, it's something I plan on using and contributing to, since it's the best option so far.
Though, now that another commenter pointed out mypyc: https://github.com/mypyc/mypyc I believe I'll invest my limited free time in that project instead, as it will allow me to stay within the Python community and ecosystem that I love so much.
It's certainly interesting to use! However, its type checker still has a long way to go, since you can easily segfault by using a nil reference.
I completely agree. With Python I need ten packages. With the shit show that is JavaScript I need 100 conflicting packages. Why bother with a backend framework in JS? It's a worthless language for backend development.
I use Cython a lot! But mostly to speed up existing Python code and build C extensions faster. I don't see it as a strict subset of Python or a new language to build a community around. Nuitka I just started experimenting with to build standalone Python executables, and I really like the direction and roadmap they are following. In the end, though, both of these technologies seem like ways to somewhat speed up existing Python code, and not attempts to introduce a strict language subset that would allow the greatest amount of optimization and finally fix long-running issues, like the inability to have multiple versions of a package installed.
As far as I understand it, RPython isn't really meant for actually writing programs in:
> Do I have to rewrite my programs in RPython?
> No, and you shouldn’t try. First and foremost, RPython is a language designed for writing interpreters. It is a restricted subset of Python. If your program is not an interpreter but tries to do “real things”, like use any part of the standard Python library or any 3rd-party library, then it is not RPython to start with. You should only look at RPython if you try to write your own interpreter.
I've been tracking nim, and would agree it's the most promising so far! I feel though that it's trying to be too flexible in many ways. Examples of this include allowing multiple different garbage collectors and encouraging heavy ast manipulation. I'm also afraid it is different enough to keep it from attracting a significant amount of developers from the Python community. Nonetheless, it's something I plan on using and contributing to, since it's the best option so far.
Sounds like Go. ;) This is a cheeky remark, but I use Python and Go, and Go very much feels like an improved Python in most ways. Especially when it comes to static analysis, build tooling, distribution, performance, etc. In particular, I love that there are no venvs, pipenvs, virtualenvs, pyenvs, wheels, eggs, setuptools, easy_installs, etc.
Pypy is super cool, but it doesn’t solve for maintainability and it only improves performance by one order of magnitude, leaving it 1-2 orders slower than Go. Besides, IMO, goroutines are so much nicer than Python’s async.
Yeah, but typically only locally. Like using a for loop instead of a list comprehension, or handling errors. So more keystrokes, but in most cases not more complexity. In some cases (generic programming), Python really is more expressive, but those ~5% of cases aren’t worth the tooling/perf/maintainability tradeoffs most of the time.
Code reviewers glazing over copy-pasted boilerplate blocks can more easily lose track of the whole, and miss an error which is obvious when the whole is expressed in 10 lines.
There is some optimal range of expressive density for comfortable use by humans. APL or K are likely above that level, and Go feels below it - not as low as COBOL, but still.
The opposite is true in my experience. Most of that boilerplate is brackets and indentation, which visually frame the interesting bits, drawing your eye to them. This is, of course, subjective, but I use both regularly and at worst this is not a problem for Go.
The problem here is that there is boilerplate at all. There shouldn't be.
Boilerplate distracts from what is actually going on. I can generally identify code smells from the shape of python code (like, blur all the text so I can't read the words, and the shape of the blocks tells me everything I need), I can't do the same in go, because there's so much more indentation and visual stuff happening, and most of it (boilerplate error handling) isn't interesting.
Like I said, I disagree with this. I suspect this is either because you're very experienced with Python and relatively inexperienced with Go, or perhaps you're simply an outlier. I think if you surveyed developers who are very experienced with Python and have at least a few months of experience with Go, you'll find people say that it's easier to identify issues in Go code--and I think this largely comes down to the role the boilerplate has in visually "framing" or "structuring" (i.e., providing "shape" to) the code.
Have a look at Haskell which goes to great lengths to eliminate boilerplate and I think you'll experience the opposite--Haskell becomes very difficult to read precisely because the code is so dense. Similarly, take the indentation, newlines, etc out of a JavaScript file or JSON blob (minify it, more or less) and see if it's more or less readable as a result. I think you'll find that visual structure is actually important.
At this point I've written fairly little go code, but reviewed quite a bit. Among those I work with, my opinion seems to be shared.
> I think you'll find that visual structure is actually important.
I didn't say otherwise. What I did say is that Go adds visual noise that isn't present in Python. (And it is noise: the proposal to add try! shows that the error handling style is noisy. It can be basically entirely removed by an automated transformation.) Actual pattern matching like Rust has, or even what Google C++ has with StatusOr and [1] our nonsense RETURN_IF_ERROR macros, is better than what Go does, and just as explicit (actually often more so, since it's more difficult to forget an error condition)
> Among those I work with, my opinion seems to be shared.
Yeah, preference distributions are hard to assess. Either of us could be wrong.
> I didn't say otherwise. What I did say is that go adds visual noise that isn't present in python. (and it is noise: the proposal to add try! shows that the error handling style is noisy. It can be basically entirely removed by an automated transformation).
I’m glad we agree that terseness is not readability and visual structure is valuable. How do we meaningfully debate whether some boilerplate is noise or useful visual structure? Why is Python’s implicit propagation of errors elegant and beautiful visual structure while Go’s explicit error handling is ugly noise? Specifically how do we know that you aren’t prejudiced by your disproportionate experience with Python (even assuming my disproportionate experience with Python and preference for Go is an outlier)? What are the criteria?
Build tooling (“go build” vs setup.py), type checker, text editor support (hovering over a symbol for the type and docstring), documentation generator / godoc.org, dependency management (pip is great but it’s not reproducible; go’s toolchain is only modestly better here IMO), no need for virtualenvs, etc. I’m sure I’m missing several.
All my projects now use poetry for the full build tooling and I love it. No setup.py needed; just include any settings in the standard pyproject.toml file (example: https://github.com/timothycrosley/portray/blob/master/pyproj...), which can be generated with poetry's help using poetry init.
> text-editor support
I feel like Python with type hints (for all their current flaws) does give you this exactly.
> dependency management
Again I think poetry solves the problems here very nicely
I write a lot of Python tools so I'm genuinely curious because if there were unfilled needs I would want to address them as one of my 52 projects: https://timothycrosley.com/
> All my projects now use poetry for the full build tooling and I love it. No setup.py needed; just include any settings in the standard pyproject.toml file (example: https://github.com/timothycrosley/portray/blob/master/pyproj...), which can be generated with poetry's help using poetry init.
We have yet to try poetry in our org. I'm hesitant to stray off the well-trodden path, but it might be worth a shot. Any idea about installing packages with system dependencies? Packages like `pygraphviz` (which depends on the `graphviz` or `graphviz-devel` system library) have always given us a lot of trouble, for example.
> I feel like Python with type hints (for all their current flaws) does give you this exactly.
I've noticed that some editors try to use these hints, but they seem to have a hard time in many cases loading the modules. It's possible that the editor extensions (e.g., VS Code) are just buggy, but it's still a problem. Further, they require that all of your dependencies have annotations or type stubs.
The killer thing about Go's documentation generation is that it uses type annotations and exposes them in the generated documentation. This is critical because 95% of the reason I'm looking at documentation (especially in Python) is because I need to know the type signatures (and often Python docs omit types, or the types are wrong or vague--e.g., "the type is 'binary'" with no indication if that means a bytestring or a BytesIO or what). This is tablestakes for documentation systems in statically typed languages, but I have yet to find a Python tool that does this well. Further, `godoc.org` also generates links to types including across packages--this is _not_ tablestakes for statically typed languages--so you just have to click the type name and it will take you to the docs for other packages. Further, there is no CI needed to build/publish your documentation; `godoc.org` just needs access to your repo on github or elsewhere (you can run your own godoc.org inside your corporate firewall). Another nice-to-have feature is that documentation is just comments; there's no formal/obscure syntax a la sphinx.
> I write a lot of Python tools so I'm genuinely curious because if there were unfilled needs I would want to address them as one of my 52 projects: https://timothycrosley.com/
I hate the fact that you may be right, because I really don't like Go in many ways:
- I hate its module system and package ecosystem story.
- I don't like its syntax.
- I don't like its error handling.
- I'd much prefer gradual typing.
- I want to maintain the ability to use interactive interpreters.
- I don't like the fact that instead of being community driven it is Google driven.
But, anecdotally, I see go being used as a second language to Python more than anything else and at an ever accelerating rate.
These are all fair points. I really enjoy Python, but there are too many things I fight with on a regular basis that simply aren’t issues in Go. It could be so much better if (1) there was a better type system (mypy is unnecessarily shoehorned into the syntax and still very broken—can’t even express recursive types like JSON), (2) a good way to constrain the dynamism so performance could be improved, and (3) a better environment/package management and distribution story (so far pantsbuild.org and PEX files are the best I’ve found). Then there are a long tail of more minor issues, like async/await vs goroutines, real parallelism, etc.
I agree, but if you for instance look at the TypeScript comparison sub-thread, you'll see that all the issues with both the syntax and implementation of the type-system are being aggressively resolved, and likely will all be so by 3.9.
> Good way to constrain the dynamism so performance could be improved
Couldn't agree more!
> environment
I find poetry a joy to use. If you want to bypass venvs altogether, there's a lot of work to make that a reality, such as https://github.com/David-OConnor/pyflow.
> packaging
Python in 3.5 added complete zip app support, which has improved this dramatically from my perspective. Extended by things like shiv https://github.com/linkedin/shiv make it fairly complete.
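For reference, the stdlib API behind that is tiny; a sketch with hypothetical paths (shiv layers dependency handling on top of the same format):

import zipapp

# Bundle the "myapp" package directory into one executable archive;
# "python myapp.pyz" then runs myapp.cli:main.
zipapp.create_archive(
    "myapp",
    target="myapp.pyz",
    interpreter="/usr/bin/env python3",
    main="myapp.cli:main",
)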
> async/await
This is interesting to me. I prefer async/await in general, because it has become a standard across programming languages and I find it really easy to reason about. I also find channels to be too widely seen as a cure-all, when the only study so far has shown they actually led to an increased bug count. But I don't discount the value of real parallelism, and am glad to see that Python has been pushing harder on that lately, with things like subinterpreters that allow bypassing the GIL in a single process.
> I agree, but if you for instance look at the TypeScript comparison sub-thread, you'll see that all the issues with both the syntax and implementation of the type-system are being aggressively resolved, and likely will all be so by 3.9.
I'm happy to hear that; hopefully the efforts really do address these issues well.
> I find poetry a joy to use. If you want to bypass venvs altogether, there's a lot of work to make that a reality, such as https://github.com/David-OConnor/pyflow.
I'll have to check those out, but one inherent problem is that even if these tools really do solve my pain points, adopting them means I'm leaving my org on a relatively small island, isolated from the Python community. If these really are the holy grail, why isn't the broader Python community adopting them? Please don't take this as me looking for something wrong--whatever Python build tool I use, I'll eventually need support and there's a lot to be said for having a thriving community that has almost always run into my exact problem before.
> Python in 3.5 added complete zip app support, which has improved this dramatically from my perspective. Extended by things like shiv https://github.com/linkedin/shiv make it fairly complete.
We're currently using this via pex. It mostly works, but we still run into problems occasionally (system dependencies, for example). Figuring out how to integrate these tools into the broader build process is another problem to solve--we're using `pants` which supports pex out of the box, but we're running into lots of bugs or other problems. I'll keep an eye on shiv.
> This is interesting to me. I prefer async/await in general, because it has become a standard across programming languages and I find it really easy to reason about. I also find channels to be too widely seen as a cure-all, when the only study so far has shown they actually led to an increased bug count. But I don't discount the value of real parallelism, and am glad to see that Python has been pushing harder on that lately, with things like subinterpreters that allow bypassing the GIL in a single process.
My biggest issues with async/await are
(1) every package needs an async variant (async boto, async docker, etc. etc.). We work around this by running them in a thread pool executor (a sketch follows this list), and I think that works, but I don't know if I'm holding the GIL unnecessarily and causing performance issues (fundamentally difficult to diagnose). This is roughly the "what color is my function" problem.
(2) it's really easy to starve the event loop by calling into something that transitively makes a sync call or otherwise just does a lot of CPU-heavy work. We've run into both kinds in production and they've been really hard to troubleshoot (because the requests that time out often aren't the ones that are actually causing the problems).
(3) dynamic typing means it's super easy to forget to await things (also sketched below). Tests should catch this, but we find ourselves writing tests _just_ to catch this (e.g., we now write tests for entrypoints that _just_ `await lib_function(params)`; we would normally not write tests for such simple functions, but now we have to). Static typing is the right way to solve this and mypy does, but mypy has too many other issues (at the moment) for our org.
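The workaround from (1), as a rough sketch (the blocking call is a stand-in for a sync-only client like boto):

import asyncio
import time

def fetch_blocking(key: str) -> str:
    time.sleep(1)  # stand-in for a sync-only client call
    return "value for " + key

async def fetch(key: str) -> str:
    loop = asyncio.get_running_loop()
    # Runs the sync call on a worker thread so the event loop stays
    # responsive; whether the GIL is held meanwhile depends on what the
    # library does internally, which is exactly the diagnosis problem above.
    return await loop.run_in_executor(None, fetch_blocking, key)

print(asyncio.run(fetch("answer")))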
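And the failure mode from (3): calling a coroutine function without await silently does nothing; CPython only emits a RuntimeWarning ("coroutine ... was never awaited"), typically long after the fact.

import asyncio

async def save(record: dict) -> None:
    await asyncio.sleep(0)  # stand-in for a real write
    print("saved", record)

async def handler() -> None:
    save({"id": 1})        # BUG: no await -- returns a coroutine, never runs
    await save({"id": 2})  # correct

asyncio.run(handler())     # prints only "saved {'id': 2}"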
One substantial criticism of goroutines is that they're less safe than async/await because you need to make sure the code you're running is threadsafe. I appreciate this criticism, but I think it's the right tradeoff for Go's performance aspirations (another great high-performance alternative is Rust's borrow checker, but that's the wrong tradeoff for Go's developer productivity aspirations).
> I'll have to check those out, but one inherent problem is that even if these tools really do solve my pain points, adopting them means I'm leaving my org on a relatively small island, isolated from the Python community. If these really are the holy grail, why isn't the broader Python community adopting them? Please don't take this as me looking for something wrong--whatever Python build tool I use, I'll eventually need support and there's a lot to be said for having a thriving community that has almost always run into my exact problem before.
Only because they are so new. portray was built a few weeks ago and already has a thriving community building around it - but of course it's still a small drop in the whole ecosystem.
Older tools I've built like isort, are now ingrained into the community: https://github.com/timothycrosley/isort, but that took years, even without major issues or complaints being present. It just takes people time to adopt new things.
Go may "feel" like Python, but it's almost nothing like Python in actual practice. It's not dynamic (and doesn't even have generics), and its error handling is dramatically different.
It _is_ like Python in practice (I use both languages all the time). That’s largely why you see it used in many of the same places as Python. It has dynamic features by way of interface{}, which is every bit as “generic” as what Python has to offer. :) But yes, the error handling is different—values vs exceptions.
I am of the view that interface{} is the worst of both worlds with regards to static/dynamic typing. Dynamically typed languages typically have type coercion and structures that make dealing with vars with unknown types easy.
However, golang doesn't have that. So you get the danger of a dynamic language without the features that make it powerful.
Go has those features in the reflect package (so as far as I know, Go is just as powerful as Python), but you’re right that they aren’t easy to use. If you do use them, it’s quite clear, and will be addressed in code review so you don’t have nearly as many dynamic typing bugs as Python—it’s not anywhere close.
Very dynamic code shouldn’t be easy; the happy path should encourage clear, simple code. By encouraging people to stay on the happy path, their code is more performant, maintainable, etc and it keeps the average code quality quite high across the ecosystem.
It is rare that I have had problems with dynamic typing errors in P* languages. I am of the view that dynamic code should be one of two things:
1) Super easy. That way doing it right is trivial.
2) Impossibly difficult, so the only people who are doing it can be trusted to do it right.
To me Go falls between those two. It's real easy to say interface{} (indeed it is more difficult to make a non-empty interface), but doing it in a way that is safe isn't easy.
I don’t think expressive power is the point here, as they are both complete languages. More it is an issue of what trade-offs and compromises have been made.
I'm not sure (1) exists, probably by definition. And I certainly don't agree that Python makes it easy to "do it correctly". Our Python app has daily 500s due to typing errors. We also suffered for years because we would build magical things that we thought would work in every scenario but ended up being untestable and/or failed to consider numerous edge cases ("what happens if someone inherits from my magical class?") and/or which failed to extend properly ("oops, someone renamed this attribute and now all of our hasattr checks are broken, and the tests didn't catch it because they passed mocks"). Eventually we built a culture that mostly discourages magic/gratuitous dynamism, but it took years and we're still suffering from that legacy code.
These problems simply don't crop up in Go, or at least they're in a different ballpark in terms of frequency and severity. So yeah, Go lacks typesafe generics, but I'll make that tradeoff all day every day in exchange for the maintainability, performance, tooling, distribution, etc improvements that Go offers today. No contest.
Go has generic types the same way Python has macros, or the same way C++ templates are a functional programming language.
C has void *, writing generic code using it is hell. Enough so that people went through a lot of trouble creating C++ and later Rust to escape it.
I'd say the type casting from interface{} to whatever you assume is in there qualifies as different.
Pretty much every single aspect of these languages is different from what I can see, the only thing they have in common is included batteries, the rest is growing popularity and consequences thereof.
Yeah, I get it. It’s a little disingenuous of me to say that interface{} qualifies as generics, but I can’t quite put my finger on why it is different than Python. Neither are typesafe (although mypy supports generics, but has many other issues), but in any case typesafe generics would I think improve Go.
> This means that just by importing this module, we're mutating global state somewhere else.
Yes, this !
That's what I hate Django and some Flask apps the most for: the fact that by importing a module, you're implicitly creating a database connection and a lot of other magic stuff, which means that now I can't import a constant defined in said module outside of `python manage.py`.
Also, as said below in the article, suddenly it's much harder to smoothly handle "the database is momentarily unavailable" (because someone has put the line starting the database connection in the global space of a module somewhere).
I much prefer frameworks/modules for which code is executed only once you invoke their "setup" function
Django doesn’t create database connections on import. That would be madness.
It does create an object that can (lazily) connect to the database, so it needs the required database drivers installed. It also needs the required information about _how_ to connect to the database, so it needs the settings loaded.
That's why you need to use `django.setup()` before, to tell it what settings to load. You should never be importing random Django models without this configured, simply because they cannot be used and will not work. We think an exception saying "don't do this, call django.setup()" at import time is less confusing than "Databases not configured" at runtime. Not that it would even reach that, because you might be using a field from a third-party application that needs to be initialized (i.e. INSTALLED_APPS configured) or that relies on a configured setting (maybe an encrypted field that needs your SECRET_KEY available).
Stop making it hard, just write a management command. It's super easy.
Every time I hear a comment like the parent's, it makes me think how many times a day I actually read a comment in the same fashion, but about something I actually know nothing about.
With credit to the original poster, they might be complaining about the fact that Django is a monolithic framework and you can't really use Django code without spinning up the i/o portion. Which is legitimate criticism, but frankly if that's what you need then you shouldn't be using Django.
Without calling setup, you cannot import anything that touches Django models, like constants defined in a file that transitively imports a Django model.
In practice, this means that any script that depends indirectly on Django code will incur a lengthy startup cost (from having to call setup()), and will fail to run if there's no database connection, even if the script itself doesn't need the db.
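For anyone hitting this, the usual standalone-script preamble looks roughly like the following (the settings module and model names are hypothetical):

import os
import django

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
django.setup()  # must run before anything that imports a model

from myapp.models import Employee  # deliberately imported after setup()

print(Employee.objects.count())  # a DB connection is only opened here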
I'm not sure about Django, but Flask's application object has a before_first_request method which takes a function designed to do this type of initialization operation.
I'm a huge fan of Django, but I always felt that this was true. I wish there was more of a push to decouple parts of the framework. Keep the magic, but allow usage without it.
I love the idea, but it feels like just an idea at this point. I'd rather read about them releasing their 'compile-time' analyzer and revealing their measurements for how much startup time it saves.
In our codebase, we have pretty strict developer-enforced rules about not doing I/O at the module level, usually through the use of simple "Lazy" wrappers for module-level objects. I'd be curious to know what other approaches people have taken with Python here.
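One common shape for such a wrapper, as a rough sketch (not our production code; the usage line is hypothetical):

class Lazy:
    """Defers construction until first attribute access."""
    def __init__(self, factory):
        self._factory = factory
        self._obj = None

    def __getattr__(self, name):
        if self._obj is None:
            self._obj = self._factory()  # I/O happens here, not at import
        return getattr(self._obj, name)

# Module level: importing this module no longer opens a connection.
# db = Lazy(lambda: psycopg2.connect(DSN))  # hypothetical usage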
It is an interesting approach, though I feel like this could introduce some nasty unintended consequences given how dynamic and introspective Python can be (admittedly I haven't studied this particular implementation).
I always treated this a bit like single underscore private functions/methods, i.e., follow a convention that produces code that's easy to reason about, even if it's not strictly enforced by the language/compiler. So in practice this equates to separating out modules that mutate global state, and placing the majority of logic in "strict" modules that only declare a bunch of "pure" classes/routines. So the "non strict" code is really just a thin layer of wiring gluing everything together. For instance my Celery task files tend to be very thin.
It's interesting to me that they are going down this path instead of the microservices path. This seems like something ripe for slowly breaking down into microservices.
Someone made a change that took down production because of non-deterministic outcomes? How about breaking out whatever they were changing into its own service? With proper fallbacks, breaking that part shouldn't take down all of production again.
To be clear, I'm not saying microservices will solve all their problems or be less work. I'm just saying that with an equal level of effort, they would probably get more overall reliability by having multiple services, they'd be able to use multiple languages, whatever is suited to the task at hand, be able to deploy even more often with less risk, and be able to isolate these types of "change on import" behavior to a much smaller surface on any given deployment.
>Someone made a change that took down production because of non-deterministic outcomes? How about breaking out whatever they were changing into its own service? With proper fallbacks, breaking that part shouldn't take down all of production again.
Yeah, now you'll have 10 interconnected services, 10x the complexity, and everything will have the ability to take down all or large parts of production, plus all the extra pain points of a distributed system...
You won't have 10 times the complexity if you are taking a monolith and making each section a service. You'll have the same dependency graph; it will just use the network to make calls between them instead of being local.
You'll have added complexity with the network calls, which is why I said it wouldn't be any less work, just different work.
>You won't have 10 times the complexity if you are taking a monolith and making each section a service. You'll have the same dependency graph; it will just use the network to make calls between them instead of being local.
Merely "use the network to make calls between them instead of being local" will add 10 times the complexity -- you suddenly have a distributed system, latency, delays, parts that can be on or off, de-centralized configuration (which can also get out of sync), and so on.
>it will just use the network to make calls between them
meaning that you get to throw network and server errors into the mix of things that can go wrong, and you get the fun of tracing failures back 3 hops to a server that decides to take too long to run a process one day and times out a connection downstream.
Beyond increasing complexity, I think this also assumes a dependency graph that _can_ be broken down into microservices by the author/the author's team. From my experience a lot of things at this scale have such complex dependencies that teasing them apart is difficult if not impossible without asking several teams to do something differently. And who knows how long that will take?
That's why you do it slowly. You take a small part of the monolith and make a service that does the same thing. Then you replace the code in the monolith with a call to the service, while keeping track of how often it is called in the monolith.
As you keep moving along, some things that depend on that first service will start calling the new service directly, and some will still call it in the monolith. But your tracking will tell you how often and who is doing that, so you can find out why.
In the meantime, nothing will break, because the monolith is still a pass-through proxy to your service.
However, at their scale and with their engineering resources, I can only imagine an attitude of "we can make this work" (the monolith) is easier to justify. The same goes for the micro-services approach (except here you have to justify changing what has been working so far?)
I'd love to read more about the history behind this approach at Instagram.
Regardless of whether the monolith or microservices approach is the right way to go for their use case: I could very well imagine that it is too late for such a migration, and that it would hold them back for too long.
> How do we know that the log_to_network or route functions are not safe to call at module level? We assume that anything imported from a non-strict module is unsafe, except for certain standard library functions that are known safe.
It's hard to know anything about the stdlib as it can be monkey patched, e.g. [1]
That said, you could solve this with diagnostics; calculate signatures of stdlib functions and classes to find any known safe ones that were patched. Run that check in your test suite to find problematic imports.
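A rough sketch of that check, using identity comparison as a cheap stand-in for full signature hashing (it assumes the snapshot module is imported before any application code runs, e.g. early in a pytest conftest.py):

import json
import os.path

# Captured before application imports run.
_SNAPSHOT = {
    ("json", "dumps"): json.dumps,
    ("os.path", "join"): os.path.join,
}

def test_stdlib_not_monkeypatched():
    for (mod, attr), original in _SNAPSHOT.items():
        module = __import__(mod, fromlist=[attr])
        assert getattr(module, attr) is original, f"{mod}.{attr} was patched"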
> If the utils module is strict, then we’d rely on the analysis of that module to tell us in turn whether log_to_network is safe.
I like this. It seems far more usable than proposals like adding const decorators.[2]
This is yet another example of the divide between wizarding and engineering[1]. When you're a small startup, what matters is the expressiveness of your language, and the ability to do a lot of things very, very quickly. Type safety, performance, readability, those things don't matter. You're just a bunch of engineers who know the whole codebase inside out; you're pretty certain of what you're doing. In short, you're wizarding.

If you grow big enough, this approach slows you down greatly, and you need to switch to engineering. You sacrifice some speed for making the codebase more understandable to a larger group of people, you can no longer assume everyone knows all the code, you write unit tests, need types, and dislike metaprogramming because of the confusion it creates.

This is why languages like Python, Ruby, Lisp or Smalltalk are amazing for small startups, but Java is what enterprises use. They're different ends of the wizarding/engineering spectrum. I wish there was a language that let you move gradually from one end to the other, exactly when you need to.
On that note, I'd include Erlang. It's not gradually typed, per se, but you can have a fully dynamic language (no type specs, no Dialyzer), a completely optimistic static analyzer for inferring types and warning where it's inconsistent (Dialyzer runs), and then you can add specs where needed to tighten up and improve what Dialyzer can catch, to basically be a fully static language.
It's relatively easy, but not free, to do this. I find that the Erlang (and Elixir) guides seem to be a bit scant on best practices to achieve this level of discipline. For example, wrapping all gen_server calls in module functions and presenting a well-defined API for the gen_server module (and possibly even linting for no naked gen_server calls) is not really explained in this light. Similar guidance is not provided for wrapping Enum module calls (since that similarly destroys typing information).
Yeah; it requires some rigor to do. My point was simply that it _can_ be done, and while the effort is high, it does allow you to move from pure dynamic language, to highly defined type checking.
If you fully spec out your code, it's actually quite close. In a project we did that in, the only type errors we encountered were ones that a static system would not have caught either (due to their being caused by incoming data that did not conform to our type expectations; for instance, deserializing JSON to a specific type).
Without specs, it will assume every type is 'any()', unless it has information to infer something more stringent. For instance, if it sees you add 5 to it somewhere, it will instead assume it is a number. Etc. Even if in practice it actually is a list of some kind (and so that addition of 5 will fail). Which, yes, ain't great. Hence why I said it was a gradual transition; it will catch provable errors (i.e., if you call append on that same variable as above, it will note that there is no type that allows both append, and + an integer, and error), but leave plenty of things uncaught that could have been caught had it known the type in question (via a type spec).
TypeScript is perfectly this. (And other gradually typed solutions; TS is simply the most popular one.)
You have the madness of thousands of developers flinging code at the universe due to the ease of browsers, JS, and npm.
This results in great speed, but not great quality.
When your project/company now wants quality, you keep your code but transition to types. (In OSS space, Angular and Yarn projects have both done JS => TS migrations of some form.)
Afaik, TypeScript is pretty bad in terms of catching some basic errors, because types are not enforced at runtime. A caller can change a sync function to async, breaking the functionality downstream.
It's not just about static typing, though. Macros, metaprogramming, being able to reach as deep as you want to, ugly code full of side effects, global state, etc. All of those might actually benefit you when your project is small, and they make development way faster (see Rails). Later, however, they're a definite impediment.
> When you're a small startup, what matters is the expressiveness of your language, and the ability to do a lot of things very, very quickly. Type safety, performance, readability, those things don't matter.
I’ve never worked on a program so small that readability didn’t matter. I consider it a crucial ingredient of expressiveness and development speed.
Though your perspective could explain a few of the more atrocious code bases I’ve seen.
During exploratory programming I don't care at all about readability, just about finding a path - any path - to something that works. As soon as I have that, readability starts to matter, and the first order of business is then to refactor out all the dead ends and to make the whole thing look good. That's because the project now has a long-term perspective.
And the worst thing is using languages/libraries/frameworks that presume everyone needs engineering when you need to wizard.
There are too many people who have swallowed SOLID whole and can no longer see good engineering as a trade-off against other factors.
For example, being strict about having the smallest possible public API and making most methods private protects me from future breakage that might never be an issue (I might never upgrade), but forces me to copy/paste vast globs of your code into my own if I need access to something you didn't anticipate (and that's assuming I have access to your source; worst case is that I have to reimplement things that already exist in the code I'm interfacing with).
Python got this right. Private methods are a weak or strong hint that you might want to think twice before calling them. But you're the boss at the end of the day.
>And the worst thing is using languages/libraries/frameworks that presume everyone needs engineering when you need to wizard.
I think this is why it's easy to point to a thousand things built in Python which people use every day (like Instagram), while in, say, Haskell, there are barely a handful (pandoc, Facebook's spam filter, etc.).
I like the way you characterize this, but what are your thoughts on why you can't be a wizard with a typed language? Seems to me that if you start with something like Go or TypeScript you cover a decent middle ground, foregoing a lot of boilerplate while having code that you don't need to be a wizard to understand.
> You're just a bunch of engineers who know the whole codebase inside out, you're pretty certain of what you're doing. In short, you're wizarding. If you grow big enough, this approach slows you down greatly, and you need to switch to engineering.
I've never heard of this before... I love it. Thanks for bringing this up.
> When you're a small startup, what matters is the expressiveness of your language, and the ability to do a lot of things very, very quickly. Type safety, performance, readability, those things don't matter. You're just a bunch of engineers who know the whole codebase inside out; you're pretty certain of what you're doing.
I'm not familiar with this use of the term "expressiveness".
My understanding is that expressiveness (as per "On the expressive power of programming languages", Felleisen 1991 [0]) has to do with capabilities that a language has that separate it from another language. C is more expressive than Python in that it gives you direct access to memory management, whereas Python is more expressive than C in that it provides inheritance/OO. (These are just examples.)
Type safety, performance, and readability are all wholly separate from expressiveness, I think. A language's type system and performance benchmarks have nothing to do with the expressive power of a language outright, and "readability" is entirely subjective to begin with.
So: would you mind elaborating on what you mean, exactly, by "expressiveness of [a] language" here?
---
In fact, most of what you (and the linked article) are talking about has to do with the dynamic/static spectrum, not this "wizarding/engineering" spectrum you've coined (though I do kind of like the idea of that for discussing development methodologies).
The article is all about how the dynamically-typed nature of Python allowed for rapid iteration at the beginning of the Instagram project, but has since hindered further progress as they've grown larger. But now they feel they can't just rewrite it all in a statically-typed language because of the engineering overhead involved.
On this note, I want to go to your last point:
> I wish there was a language that let you move gradually from one end to the other, exactly when you need to.
With regard to the dynamic/static distinction, there are languages that allow you to move "gradually from one end to the other", and they are (aptly) called gradually-typed languages.
Gradual typing was introduced by Jeremy Siek and Walid Taha back in the mid-2000s [1]. In this discipline, you can have a statically-typed codebase with local dynamically-typed regions. You get all of the static guarantees everywhere they can be made, and dynamic regions impose runtime checks to ensure consistency. (This connects closely to contracts, which are primarily worked on by Robby Findler at Northwestern, I think.)
Unfortunately (to me), it seems like a lot of these languages are implemented in terms of existing dynamically-typed languages. For example, Sam Tobin-Hochstadt (Indiana) created Typed Racket, which is (of course) built upon Racket but provides a gradual typing discipline. Wherever possible, static types are checked, and everywhere else utilizes contracts to guarantee runtime consistency.
Anyway, all this is to say: the technology exists, technically, but is in its infancy. There's no doubt it'll be some time before it sees widespread use throughout industry. Sam wrote up a brief overview for the SIGPLAN Perspectives blog recently, if you're interested [2].
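Worth noting that Python's own optional hints already give you a limited, static-only version of this: you can mix checked and unchecked regions in one codebase, though unlike Typed Racket no runtime contracts get inserted at the boundary. A minimal sketch (function names hypothetical, checked with something like mypy):

    from typing import Any

    def total(prices: list[float]) -> float:
        # Statically typed region: the checker verifies every caller.
        return sum(prices)

    def scrape() -> Any:
        # Dynamically typed region: Any opts out of checking entirely.
        return [19.99, 5.00]  # could be anything; the checker won't complain

    items = scrape()     # unchecked
    print(total(items))  # the boundary: no static (or runtime) guarantee here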
I expect in this context, expressiveness means something like "the ability to describe the relevant stuff in the code with minimal noise", which might map to having good abstraction.
I find that Python's OOP + functional aspects, combined with a good understanding of the language, hit a sweet spot here. One that simply can't be reached in C/C++/Go/Java/Haskell, and which is much easier to reach than in JS/Rust/other languages where I think it is possible.
My definition of expressiveness is basically similar to what Paul Graham (@pg) says. He also calls this "powerfulness" in his famous "Beating the Averages" essay [1]. In short, a more expressive language is a language that lets you express more with less. Rails, with its `has_many :books`, is very expressive; Assembly is the other end.
The wizarding/engineering spectrum was coined by the article I've linked to [2]. I think the post is exactly about that: first Instagram was wizarding and they had a suitable language for wizarding; now they're engineering, but their language is still only good for wizarding.
As I've said in a sister comment, it's not just about static typing, but metaprogramming/macros/side effects everywhere etc. There's more to the expressiveness/powerfulness than just types. While gradual typing is certainly an improvement, I think we need more research in this direction.
Countless companies, huge and small -- from Apple and Amazon, to Google and your friendly local startup, plus all the enterprise world that's not a .NET shop...
In what parallel universe is Java not immensely popular, or not used for greenfield projects?
As far as the JVM is concerned, Kotlin hype will be over in a couple of years, and it will get as much use as Scala, Clojure, BeanShell, and Groovy enjoy nowadays.
Guest languages never get to own a platform, and with time all platform languages end up getting enough features that the large majority of developers never bother with extra tooling, debugging layer and idiomatic wrapper libraries of the guest languages.
>As far as the JVM is concerned, Kotlin hype will be over in a couple of years, and it will get as much use as Scala, Clojure, BeanShell, and Groovy enjoy nowadays.
And we know that because?
>Guest languages never get to own a platform
That depends on the platform, who is running it, and how. You couldn't ask for worse stewardship than Oracle's.
And most "guest languages" are smaller affairs, they don't have companies the size of Google chosing them for Android app development (a huge niche in itself). Or have first class support from the most popular IDE of the host environment.
Plus, everything is anecdotal: we have so few cases of major parent/host language rivalries, and even fewer with similar dynamics, that there's no real basis for prediction.
Scala was too complex for most Java-ers, too slow to compile, and didn't have a Google pushing it to its platform devs, just an insignificant company, etc. Clojure was a Lisp (= doomed); BeanShell and Groovy were from small, insignificant origins, and not pushed by anyone really mainstream the size of Google/FB/etc.
UNIX and C, Web and JavaScript, Windows and .NET/C++, macOS and Objective-C/Swift, Android and J̶a̶v̶a̶/Kotlin/C++....
Google only cares to push Kotlin on Android, and it only matters because Google visibly doesn't want to move Java beyond the Java 8 subset that Android currently supports, so the choice is between a handicapped Java support or Kotlin.
Until there is a JVM written in Kotlin, and Kotlin gets first-class support in all Java IDEs instead of being a tool to sell IntelliJ licenses, it is just yet another language that happens to target the JVM.
This is ignoring that Kotlin already has a couple of impedance mismatches with the JVM: sequences vs streams, lambdas vs SAMs, coroutines vs fibers, inline classes vs data classes.
Elixir is doing well because many developers seem wary of learning Prolog/Erlang syntax.
> Guest languages never get to own a platform, and with time all platform languages end up getting enough features
They do, up to the point where they differ philosophically. Java is never going to turn into a Clojure, nor is it going to adopt the type of dynamic scripting features Groovy offers.
Can't say about new ones, but after a couple of years of working on a huge Python project, I would accept rewriting it in Java without a second thought. I have equal experience in Python and Java by now. Funny thing: I returned from my vacation a few months ago and started writing my first code after it. Of course, I was more concerned with what I was writing than how. Then I looked up and noticed I had started typing Java instead of Python. And that's three years after the last time I'd written any Java code.
Oh, just Apple, Amazon, Google, Netflix and likely every hospital, utility company, police force, military, or bank you depend on. And the people programming the robot swarms that pack your groceries (https://www.infoq.com/presentations/java-robot-swarms/).
For large scale projects, Java and C++ remain the go-to languages. I've seen a little bit of Go start to show up but no others. Other languages are used for libraries (Rust, C), only at certain employers (OCaml, Erlang), or for small-scale projects (nearly everything else).
On C++, inertia is a wonderful thing. Java has the benefits of extensive dependency injection and JVM/ecosystem tools that lower the risk of deployment of code. .NET also provides the controlled "managed code" environment of CLR.
Why any enterprise would use C++ for standard "business" or "large scale" programming makes no sense to me.
Enterprises want stability, not speed to market. Most of their infrastructure changes slowly (as in features deployed once or twice a year maybe). They have stable support mechanisms for this, including long and complex processes of approval.
As an ex-C++ dev who has lived in the Java/.NET worlds since 2005: because they still don't cover all the use cases where C++ might be needed.
So while you might not write it as full-stack C++, a couple of native libraries might be required as dependencies: to access OS features, give some help to the AOT/JIT compilers, or, in Java's case, to implement more machine-friendly data structures.
One of my clients handles 90% of all the PBM routing in the US, which is millions of transactions per second. They started to completely modernize the application on Java, primarily Spring Boot and Apache Geode. After some optimizations they are very happy with the performance and expressiveness of modern Java.
Exactly right. Java has lots to improve on, but there seems to be unsubstantiated hate towards Java on HN, which I find contrary to what happens in the real world. In the real world, companies find it relatively easy to hire Java programmers (maybe not the best and brightest) who can get a project off the ground easily.
C++ is mostly in a different but overlapping use case these days, used for system software or where performance is critical. Go is still a niche language most developers haven't heard of, with no readily available talent pool for hire.
I'm not a fan of the language, but Java has a huge number of developers and a rich, mature ecosystem of software, and is quite productive (enterprise patterns aside); it's a good sweet spot for most companies.
C++ is a hydra of complexity; sure, it has its place, but it's not nearly as productive as Java for your typical web application.
Go is almost the opposite, so simple it lacks features like generics.
The last time I used Go it had fundamental usability issues around dependency management (although I think recent versions have improved on vendoring a little).
> C++ is a hydra of complexity; sure, it has its place, but it's not nearly as productive as Java for your typical web application.
Modern C++ is as productive as Java (probably even more so). The main issue with C++ is recruitment: C++ engineers are rare because C++ is barely taught.
C++ is barely taught in ProgrammerGenerationFactories because "modern" C++ still allows "old" C++ and makes it difficult to stop developers from doing that.
Just like MISRA tries to constrain C programmers from doing dumb things in the embedded world, "modern" C++ tries to do the same in the business world. But there isn't an easy way to enforce it, especially when you're outsourcing to some code sweatshop.
Every mature enough language has a subset that you need to avoid. Including Java. It is precisely because of this kind of thing that every company needs to have coding guidelines and proper static analysis tools.
> especially when you're outsourcing to some code sweatshop.
If you outsource your dev work to cheap, other-side-of-the-world, low-quality engineers, then you deserve your problems, in any language.
I worked in the past for a company (embedded programming) that had an entire team of expensive engineers in Luxembourg just to fix the stupidities of another team of outsourced engineers in India.
And yet you only offered a misinformed platitude in the form of a question.
Google does all kinds of new Java work (Golang, contrary to myth, is just one of the languages Google uses for internal stuff, and a niche one at that), Amazon of course, most of Apple's backend services are Java, and Twitter, Airbnb, Uber, LinkedIn, TripAdvisor, and tons of others use Java and write green stuff in it all the time...
Twitter, and they write some really good open source stuff too. If you're writing RPC services in Java I'd almost argue you should default to considering Finagle:
What you call Wizarding I call "ordinary Software Development". A software developer spends ~70% of their time writing features and the rest mixed between organization/planning/roadmapping etc. A software engineer spends ~30% of their time writing features and the rest of it managing technical debt and making long-term investments towards better features and processes.
Too many companies need devs but have engineers, or they need engineers but only have devs :/
Another thing that I would like to see in some kind of strict mode is the ability to mark explicit exports like in JavaScript modules. I often want to import multiple things globally at the top of a module because they are shared by multiple class or function definitions that I am writing. However, such imports end up being exposed to and usable by the consumers of my module, even though the consumers should really have imported those things at their source instead of via my module.
There are currently maybe two ways to tackle this “problem”, without a strict mode:
1. Don’t import at the global module scope; but that’s a bit tedious.
2. Import with rename, like `import os as _os`, and then leave it to the principle of “we’re all consenting adults”. I.e. if anybody imports and uses things that start with an underscore, it’s clearly their fault, not mine.
3. Import as normal, and leave it to the principle of "we're all consenting adults"; unless something is explicitly called out as being part of the public API I consider Law of Demeter[1] "violation" the same as accessing _var.
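To make these workarounds concrete (the module and helper names here are hypothetical):

    # mylib/paths.py
    import os as _os         # option 2: `from mylib.paths import _os`
                             # now clearly screams "your fault"

    __all__ = ["join_home"]  # only affects star-imports, but it documents
                             # the intended public surface

    def join_home(name: str) -> str:
        return _os.path.join(_os.path.expanduser("~"), name)

Note that `__all__` only governs `from mylib.paths import *`; nothing stops a determined consumer from importing `_os` directly, which is exactly the consenting-adults principle again.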
I think this is an interesting idea, which appears to embed a stricter subset of Python within Python itself. Have the Instagram engineers tried floating this with the wider community via established channels like Python-Ideas or discuss.python.org?
I like the idea, but it feels a bit heavy handed outside of a very large team.
I think the first step here is to get away from the assumption that importing a module will have "interesting" side effects. This is not only a problem with Python...
I tend to create mini "dependency injection" frameworks that create a pattern for loading module code at some point well after import. This pattern tends to reduce to wrapping whatever code you have in the module in a function/closure instead of just running it at import time.
Again, I like the idea of enforcing constraints with code, but I don't think it's a substitute for educating developers to avoid certain patterns and giving them infrastructure that makes the alternative easy.
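A minimal sketch of that pattern (module and function names hypothetical): keep the module body declarative and run the interesting code from an explicit entry point.

    # worker.py -- nothing "interesting" happens at import time
    _pool = None

    def init(pool_size=4):
        # All side effects live here, called explicitly after import
        # (from main() or a DI container), never at module load.
        global _pool
        from multiprocessing import Pool  # even the import can be deferred
        _pool = Pool(pool_size)

    def submit(fn, *args):
        if _pool is None:
            raise RuntimeError("call worker.init() first")
        return _pool.apply_async(fn, args)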
I like that idea, it's just not that easy. How do you define module versions and inheritance when you are not allowed to do global assignments in the module? Declarations only, with no IO or global side effects, is fine, but declaring versions and inheritance needs to be allowed in global scope.
Well, if you ask me to write language X, I will definitely make mistakes for the first couple of weeks/months/years; that is why you need code review, mentoring, and education plans for your hires.
> Here’s another thing we often find developers doing at import time: fetching configuration from a network configuration source.
MY_CONFIG = get_config_from_network_service()
I am pretty sure this is an anti-pattern; if this code passed code review, you should make your review process more strict.
Well, yes, why would you do this? Why would this pass code review? Why do we have linters and other checks for dynamic languages?
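The usual fix is to defer the fetch until first use; one hedged sketch, stubbing out the network call from the quoted snippet:

    import functools

    def get_config_from_network_service():
        # stand-in for the real network fetch from the article's snippet
        return {"timeout": 30}

    @functools.lru_cache(maxsize=1)
    def get_config():
        # The fetch now happens on first call, not at import time,
        # and the result is cached for every later caller.
        return get_config_from_network_service()

Callers then write `get_config()["timeout"]` instead of reading a module-level `MY_CONFIG`.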
> It works great for smaller teams on smaller codebases that can maintain good discipline around how to use it, and we should switch to a less dynamic language.
It seems we are blaming Python here for the shortcomings of a monolith, instead of chunking out specific business modules into separate services/microservices.
To be honest, the strict mode seems interesting, but I believe the problems they're facing can be solved by a couple of changes to their process and code:
- everyone gets a mentor if they are not experienced in Python or Django
- code review by at least two experienced Python developers (it does not count if you have coded in Java for 20 years)
- teams should try to move their logic outside the monolith (it sounds like they have a monolith)
- write CI tests to measure how long it takes to import a file; if it takes more than T(line count * LINE_PROCESSING_THRESHOLD) you have to fix your code (see the sketch below)
- prepare config and load it before running the actual server; no network calls for fetching config
All in all, Python is suitable for big companies too. The thing is, if you don't care about best practices, you will have problems as a small startup as well, but in a big company it makes it impossible to move forward; the trick is, independent of company size, to follow best practices and do code review.
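For the CI point, a rough sketch of what such a check could look like (the helper name and budget number are mine, not from the thread):

    import importlib
    import sys
    import time

    def assert_import_budget(module_name, budget_seconds):
        # Fail the build if importing the module exceeds its budget.
        # Caveat: transitive imports that are already cached stay fast;
        # a real harness would run each check in a fresh interpreter.
        sys.modules.pop(module_name, None)
        start = time.perf_counter()
        importlib.import_module(module_name)
        elapsed = time.perf_counter() - start
        assert elapsed <= budget_seconds, (
            f"{module_name} took {elapsed:.3f}s to import "
            f"(budget {budget_seconds:.3f}s)")

    # e.g. in CI: assert_import_budget("myapp.views", 0.1)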
That's a long post to say "do more code review instead of investing into technical solutions to technical problems".
Clearly, Instagram's solution saves them time. That means faster code reviews which incidentally makes them more accurate. Your post doesn't really make sense.
> My current understanding is that the log_calls method would NOT get executed during module load time!?!
That's incorrect. log_calls gets executed on import because it's a decorator, so it's equivalent to `hello_world = log_calls(hello_world)` at the top level (which also gets executed).
log_to_network inside the _wrapped() definition doesn't get executed until hello_world gets called; but any log_to_network call outside the definition of _wrapped does get executed at import time.
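A minimal demonstration of that ordering, with hypothetical stand-ins for the article's names:

    import functools

    def log_calls(fn):
        print("decorating", fn.__name__)      # runs at import time
        @functools.wraps(fn)
        def _wrapped(*args, **kwargs):
            print("calling", fn.__name__)     # runs only on each call
            return fn(*args, **kwargs)
        return _wrapped

    @log_calls                 # equivalent to hello_world = log_calls(hello_world),
    def hello_world():         # executed when the module is imported
        print("hello world")

    # importing this module prints "decorating hello_world";
    # "calling hello_world" appears only once hello_world() is invoked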
Not really. Mutation in general, and in modules in particular, inhibits a lot of reasoning about the code, and thus stops a whole lot of optimizations from being possible. Guile (a Scheme dialect) recently got declarative modules for that reason, where a top-level binding cannot change (i.e. you cannot set! a binding, but you can wrap it in a mutable container and change the contents of that container). This makes procedure calls and variable lookups a lot faster. Andy Wingo wrote about it here: https://wingolog.org/archives/2019/06/26/fibs-lies-and-bench...
Those optimizations won't mean much for CPython, since CPython doesn't try to run things fast, but for something like PyPy this could be a big deal.
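Python has the same property, which is part of why those lookups stay slow: any top-level binding can be reassigned from outside the module. A contrived sketch (mathy.py is hypothetical):

    # mathy.py
    def square(x):
        return x * x

    # elsewhere
    import mathy
    mathy.square = lambda x: -1  # perfectly legal monkey-patching; every
                                 # later mathy.square(...) sees the new binding
    print(mathy.square(3))       # -1 -- so the lookup can't be constant-folded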
You have no idea about their codebase, the implementation details of their features nor how they counted the lines (comments included?). So stating that it’s dumb is beyond ridiculous.
You are right in that it’s certainly a high LoC count for Python, but still...
And yes, knowing nothing else about their code base than A) It's in Python, and B) it's several million lines of code, I feel very confident that there is at least an order of magnitude too much of it. Instagram is just not doing anything that complicated.
(I should mention I specialize in maintaining and refactoring legacy Python code. I know what I'm talking about here.)
Features that are "not complicated" can actually very easily be "very complicated" at scale. Which Instagram does have. 500 million users, every single day.
If you need several millions of lines of Python to do what Instagram server does, the code is bloated.
My bet is that they let too many Java devs loose on the code base, without experienced Python devs reviewing the commits and managing the deluge of unnecessary classes. I've seen it happen before.
>If you need several millions of lines of Python to do what Instagram server does
I have this feeling that you're probably not all that aware of 95% of what their code actually does, and thus probably not in a position to make judgements as to whether their code base is truly bloated relative to what it does.
From a user's perspective, Instagram has:
a) a way to post pictures/videos/sound recordings to a public feed. The pictures can include overlays of links to other users, to other posts, to song lyrics that play in sync with the music, etc etc. Users viewing their posts get the ability to comment/like/link, with automatic language detection and translation on demand.
b) a way to see how other users interact with their posts, allowing comments, seeing views and other analytics, monetizing etc etc
c) provides advertisers with the ability to place stories (stories are a stream of short-lived (24-hour) video/audio posts that users see) or posts (that can be static/video/audio), with links to external sites, direct purchase links ("Shop now"/"Buy this"), etc.
Instagram is much more than a stream of user images.
That doesn't include all the "back office" stuff like spam/reporting/censorship/language translation etc etc.
As orf said you have no idea about their codebase. And you have no idea what's included in that statement -- given that they talk about startup time, they most likely are taking into account the whole framework, a plethora of admin and analytics tools, lots of debugging / debug-only infrastructure, migrations, lots of tooling whose sole purpose is making it easier to work in large teams, etc…
(And for the record, Linux is ~37 million lines of actual code, Postgres ~2 million, and gcc ~8 million)
There's nothing absurd about one of the most visited websites on earth being a couple million LOC.
> As orf said you have no idea about their codebase.
I do too: It's Python and it's several million lines.
Metaphor: you've got three pallets of goods and have hired three trucks to move them. I don't have to know how you wrapped the pallets to know that you brought two too many trucks.
I don't have to know the details of what's included in "Instagram Server" et al. to make this call (obviously), based on my experience and first-hand knowledge of similar codebases. Frankly, I am kind of disappointed in the pushback I'm getting on this. The only reason to have a multi-million-line Python project is for the entertainment of devs, or, worse yet, job security.
Let me put it this way, if the CTO of Instagram showed up here I would be willing to bet US$100,000 that I could reduce the Instagram code by 90% in six months. (Do you think the devs there would appreciate that? Even the one that got laid off as a result?)
If I sound cynical it's only because I've seen this sort of thing for myself. I'm not trying to say that the Instagram devs are dumb or nefarious, this kind of code happens organically and often despite our best efforts. But that code needs a diet. I'm sure of that.
- - - -
edit: In re:
> (And for the record, Linux is ~37 million lines of actual code, Postgres ~2 million, and gcc ~8 million)
So, call it ~50M LoC. What's your ratio for Python/C? Meaning, how many lines of C code are replaced, on average, by one line of Python?
And how feature-complete are we talking? POSIX? GCC targets a lot of languages and platforms, eh?
If you were going for an integrated system, like Oberon OS or a Smalltalk IDE, I think my claim is still plausible, eh?
> Let me put it this way, if the CTO of Instagram showed up here I would be willing to bet US$100,000 that I could reduce the Instagram code by 90% in six months.
And from Instagram's POV, the ROI on that would be much less than putting in the sort of belts and braces that the article talked about.
They don't have the time or space to engage in a massive technical debt reduction program, they're too busy destroying Snapchat and other competitors, reacting to TikTok, implementing an entirely new IGTV video service that provides their customers (ie advertisers and marketers) the equivalent of youtube within the Instagram universe, etc.
I'm sure that every large internet service's codebase out there could be made much leaner and smaller. The question is whether that is worth their while.
Dude, sincerely, thank you. I feel like this is the sane answer I was waiting for. Cheers! (and for your other comment in re: what all Instagram does. I appreciate it.)
> So that's a third pain point for us. Mutable global state is not merely available in Python, it's underfoot everywhere you look: every module, every class, every list or dictionary or set attached to a module or class, every singleton object created at module level. It requires discipline and some Python expertise to avoid accidentally polluting global state at runtime of your program.
> One reasonable take might be that we’re stretching Python beyond what it was intended for. It works great for smaller teams on smaller codebases that can maintain good discipline around how to use it, and we should switch to a less dynamic language.
> But we’re past the point of codebase size where a rewrite is even feasible. And more importantly, despite these pain points, there’s a lot more that we like about Python, and overall our developers enjoy working in Python. So it’s up to us to figure out how we can make Python work at this scale, and continue to work as we grow.
Those are literal quotes from the article. That is quite damning. How did they get to this point? By starting when Python was appropriate, and taking it day by day.
It still blows my mind that people don't use strongly typed languages in the first place and spare themselves from all this future pain.
My guess (based on my experiences) is that companies wind up in this position from having inexperienced people building early versions of products instead of hiring experienced engineers (who are usually more expensive).
Dynamic typing vs. static typing is on a different axis than strong vs. weak typing. Python is a strong dynamically typed language, with some "static lite" features introduced in Python 3.
Dynamic typing means that the type of a name can change arbitrarily at runtime, compared to statically typed languages, which pin down all types at compile time.
Strong/weak means that type coercions rarely/never happen automatically. For instance, JS has some interesting behavior enabled by weak typing: `[] + [] -> ""`. Whereas Python rarely coerces things for you. The division operator in Python 2 was strongly typed, while they changed it to weak typing in Python 3 (in line with the practicality vs. purity convention).
"strong typing": everything has a type and cannot be accessed at some other type; "static typing": everything's type can be determined statically (according to one definition)
In Python everything has a type, and you can't use a float as a list, for instance. It's correct to call it both strongly typed and dynamic, those are not antonyms.
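Concretely, both properties are easy to see side by side:

    x = "1"
    try:
        x + 1             # strong typing: no silent str/int coercion
    except TypeError as e:
        print(e)          # can only concatenate str (not "int") to str

    x = 1                 # dynamic typing: the same name can be rebound
    x = [1, 2]            # to a different type at runtime
    print(1 / 2)          # 0.5 -- the Python 3 "/" coercing ints to float,
                          # the (mild) weakening mentioned above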
It's a constant struggle against the current. Dynamically-typed languages are often “good enough for the time being”. I have the same issue explaining to our C/C++/Obj-C team why they should use static (Clang-Tidy, Infer, PVS-Studio) and dynamic (ASan, MSan, UBSan) analysis tools. They just keep giving me basically the same response of “I am a good programmer, and my code is good, and shame on you for even daring to think that a mere machine could find bugs in my code!”. I don't know what kind of status anxiety causes it. It also makes me wonder what other things I am missing because of the way I keep thinking that I do that thing well enough myself.
I'm confused. It should be easy to demonstrate the benefit, if there is one. Just show them the bugs!
For me, it's not "status anxiety". It's simply not worth the effort.
The last couple static analysis tools I ran on my programs, I spent a while getting the tool to not-crash (because even though the authors obviously had a static analysis tool themselves, they either didn't bother to run it on their own code, or it wasn't good enough to find actual issues). These tools flagged only a couple issues, and almost all of them were places where it couldn't really cause any problems, but the type system was not strong enough for me to prove why it couldn't go bad. So I spent a while sorting through false-positives.
I'm not going to spend hours with a tool to find only a couple (real) bugs, which no user has ever reported seeing, and which I've gotten no automated crash reports about. I have much better uses for my time.
See, that's another thing that a lot of people don't understand about static analysis. It's not just there to find bugs in existing code, it's there to find bugs as you write or edit the code! Of course it won't find a lot in a tested code base. It's tested after all. But it immensely shortens debug time as you develop, and thus reduces testing time as well.