
I'm currently writing code with it. The language is really (really) nice to work with.

- But the REPL lacks the ability to redefine structs on the go (which I can understand as it'd be tough to do, or simply not possible). But that, combined with the slow startup time, makes life a bit harder than it should be. Fortunately, one doesn't redefine one's structs every day.

- There are also lots of libraries but the quality of the documentation is often sub par. For a newcomer like me, working examples would be great. For example, if you use the plots library, you'll have a hard time finding a list of all possible plots (the documentation talks about lots of things, but strangely, not a list of possible charts). I've also looked at doing linear regression and GLM and, again, the libraries exist but examples are rare...

- Startup times are still quite slow, but that's OK because you adapt your workflow around them.

- Being able to use Greek symbols as identifiers is super cool, but your editor has to manage that, else you'll have to memorize shortcuts...

But still, I keep on using it; it's much faster for my use case (data processing). I mean, faster than R or Python (for which I could write fast code but that'd mean I'd have to change the way it is written).




> There are also lots of libraries but the quality of the documentation is often sub par.

One thing I think R does not get enough credit for is really strong enforcement of documentation. If you want to get a package into CRAN, it is going to be more work to not document your library than to just document it correctly. As a result, nearly every R package has very solid documentation, including a well-formatted PDF manual, a series of "Vignettes" that show how to use the library, and excellent in-interpreter documentation.


R has superb documentation. I miss it with Python.


I don't know what libraries you're referring to. But most of the time, they barely document what they do. By barely I mean:

- Parameter documentation can be understood only if you actually know the theory behind what the library documents. One may say that it's a good thing in the sense that it prevents one from shooting oneself in the foot, but for discoverability, that's painful. I've done basic stuff such as GLM, LDA, PCA... For example, there are several ways to do PCA, but which one do you choose? Not everybody knows the theory behind each PCA formulation...

- examples are usually very limited and don't show what the library can do. For example, plots (the one from the base library) are really not well documented. Examples are really scarce and don't even show the graphics themselves.

- The package federation (not exactly documentation) is really bad: there are lots of libraries that overlap, that redefine symbols here and there without telling you, etc.

So R documentation is so-so... It feels like it suffers from the same thing as lots of math explanations I've seen: a strong will to write the minimum possible, which makes everything hard to get into.

I like R but documentation is not a strong point. Ecosystem is.


> Not everybody knows the theory behind each PCA formulation...

It's a language meant for a specific domain. Expecting domain knowledge in said domain is not a failing - it's logical.


> I mean, faster than R or Python (for which I could write fast code but that'd mean I'd have to change the way it is written)

You might benefit from numba. I've used it to speed my Python up enormously and completely painlessly just by adding decorators to critical functions. It's why I'm not considering moving to Julia.
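A minimal sketch of what that decorator approach can look like. The function `leibniz_pi` is a hypothetical example (not from the comment above), and the `try/except` fallback is an assumption added only so the snippet still runs, without any speed-up, where Numba isn't installed.

```python
import math

try:
    from numba import njit  # Numba's nopython-mode JIT decorator
except ImportError:
    def njit(func):
        # Fallback: leave the function as plain Python (no speed-up).
        return func


@njit
def leibniz_pi(n_terms):
    # Plain scalar loop: the kind of hot, hand-written code a JIT handles well.
    total = 0.0
    sign = 1.0
    for k in range(n_terms):
        total += sign / (2 * k + 1)
        sign = -sign
    return 4.0 * total


approx = leibniz_pi(100_000)
print(abs(approx - math.pi))  # small, and shrinks as n_terms grows
```

The only change to the "normal" Python is the decorator line; whether that is enough in practice depends on how much of the hot path sticks to types and operations Numba supports.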


There are so many ways to write R that can actually be very fast. R is a very different language as a whole and it has addressed a lot of problems these last 10 years. I might be biased but I really do like the functional side of R and how logically the libraries from Hadley Wickham have been designed.

Python still doesn't feel like a natural fit for data science work. I am guessing this is more of a biased opinion, but why 0-based indexing for this domain?


> There are so many ways to write R that can actually be very fast.

But it means that you have to work with arrays. For me it often means breaking the flow of my (code) explanation to migrate to other data structures. Sure it is then fast, but it gets less readable and harder to update.

Now, I'm a programmer at heart, so I think in the "functional" paradigm, not the array/signal one. There's some "impedance" I guess :-)


I actually prefer the 0-based indexing of Python. Systems code (C/C++ etc.) already uses 0-based indexing. So it is nice that when you do data science the convention stays the same.


But Fortran, R, Matlab and other tools in the domain use 1 indexing...


Does it actually matter at this point? I write off-by-one errors all the time in JavaScript and I usually spot them immediately. I've written Julia and R in the past and I found it really easy to switch between contexts, and if I forget, the errors are such that it takes only a few seconds to spot and fix them.


Is it actually completely painless?

YMMV, but I found that there were many unsupported parts of NumPy I had to work around, which meant I was effectively doing a full rewrite.

Assignments using boolean masks, and working with multi-dimensional arrays in general, are especially tough.


I think Numba has some pain points. I have also encountered issues with multi-dimensional arrays.

The beauty of Julia is that relatively naive Ruby-like code is already quite quick. And if you implement inner loops in an imperative way, with an eye towards not generating excessive allocations, it can approach C++ speed while still being nice high-level code that is close to mathematics or business logic.

Besides, the other strong point of Julia is composability. The ecosystem is made up of lots of small libraries that can interact in ways the original designers did not expect or plan for. In contrast, Python has exceptional libraries, but they tend to be big monoliths.

The problem with Julia right now is that some libraries are not sufficiently mature. For example, there's no mature native replacement for XLA or PyTorch. I know about Flux, but it's nowhere close if you wanna create, say, a large transformer. Or say you are working with GLMs. GLM.jl is nowhere close to R.

Some other Julia libraries represent the state of the art, though. I just can't wait to get all foundations complete! It's a really promising space for probabilistic and differentiable programming.


It just happened that I wasn't using NumPy except in the most basic possible ways. I had hand-written algorithms in Python. That's where it shines. So, yes, YMMV! It worked really well for me.


Numba is awesome. It solves a lot of problems. Julia is like numba but you are able to use any libraries inside the loops. Think FFT, scipy, etc


For simple things only. For example, the jitclass feature in Numba is still experimental and quite limited. So if you need to write a non-trivial class, even if you jit each method, dropping to Python can be a problem (say, with lots of instances).

I have tried to write an application targeting HPC which in itself is not very complicated (probably on the order of 2000 lines). But I did things like using the Python language as the metaprogramming language for Numba (basically higher-order functions where you jit inside).

All in all, my experience of Numba tells me that if I were designing the same package now, I'd write it in Julia, where jit is "first class" and you don't need to constantly think about the boundary between Numba and Python.
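A rough sketch of the "higher-order function where you jit inside" pattern, assuming it resembles the closure-factory idiom commonly used with Numba. `make_integrator` and `square` are hypothetical names, and the `try/except` fallback is an assumption added only so the snippet runs as plain Python where Numba isn't installed.

```python
try:
    from numba import njit  # Numba's nopython-mode JIT decorator
except ImportError:
    def njit(func):
        # Fallback: leave the function as plain Python (no speed-up).
        return func


def make_integrator(f):
    """Ordinary Python builds a compiled midpoint-rule integrator for f."""
    f_jit = njit(f)  # compile the user-supplied scalar function

    @njit
    def integrate(a, b, n):
        # f_jit is captured from the enclosing scope, so each call to
        # make_integrator produces a separately compiled integrate().
        h = (b - a) / n
        total = 0.0
        for i in range(n):
            total += f_jit(a + (i + 0.5) * h)
        return total * h

    return integrate


def square(x):
    return x * x


integrate_square = make_integrator(square)
print(integrate_square(0.0, 1.0, 1000))  # close to 1/3
```

The factory itself runs in regular Python and only its product is compiled, which is one place where the Numba/Python boundary has to be kept in mind.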


I’ve become a big fan of Numba. No more awkward, human-unreadable vectorized code.


IMO, numba loses pretty much all of the advantages of python. The code is fast, but you lose the ability to organize your data the way you would in regular python, and most of the python ecosystem can't be called from within numba functions. If I wanted that developer experience, I would just use C.


I don't think that's true. You lose some functionality, but since you can call Numba on select critical methods, it's not so bad. You can often sequester your performance-critical logic into some Numba-decorated methods, and your business elements that call out to the rest of your ecosystem are not decorated with Numba.

Or to put it another way-- if I'm using vectorized code in Numpy I can't deal with the external ecosystem from within my vectorized code either.

> The code is fast, but you lose the ability to organize your data the way you would in regular python,

Can you clarify what you mean by this? I lose the ability to organize my data the way I'd like if I'm vectorizing my code too.


I wasn't comparing Numba to NumPy. I was comparing to Python code where you don't care about performance. The main reason I don't find Numba appealing is that Julia gives you Numba-like performance while allowing you to use structs (think classes) to organize stuff.


This doesn't really make sense. If you don't care about performance then you have no use for Numba.

Also, the comment you replied to was explicitly comparing Numba to vectorized Python, so you should not abandon that comparison in your reply without saying so.


I think the point is that one wants to write code that is similar to regular non-performance sensitive python code, with classes and everything, and still have it be fast.


> Also, the comment you replied to was explicitly comparing Numba to vectorized Python, so you should not abandon that comparison in your reply without saying so.

The comment you replied to explicitly says "regular python".


Not true at all for me, because in my situation there are some functions which have tight loops that take 99% or more of the execution time. Put the Numba decorator in front of them and the code is 100x+ faster. Clearly it's not that simple for everyone. But for some people, it can be. So it doesn't make sense to call it a panacea, and it also doesn't make sense to say it doesn't work. It works for some people, not for others.


The C API for Julia also has almost no documentation. There is a getting started guide, which is great, but if you want to do anything more advanced (e.g. creating structs like in your example), you'll end up reading the source code to try to puzzle through which functions to use in julia.h. There's also an apparent limitation that whichever thread initializes Julia is the only one that can later eval code, which was surprising. The language itself is very cool, but it has a long way to go to be easy to embed like Python is.


The good news is that you don't need the C api nearly as much because you don't have to call out to C whenever you want performance. Also, the extent to which the python C api is documented has actually been a major problem for them since it has effectively frozen a ton of python's implementation in majorly detrimental ways (eg the GIL)


Calling out to C "because you want performance" is only one dimension of the issue, and assumes that your main application code is written in Python or Julia. In many cases (e.g., robotics), application code is written in C++ or C, and Python bindings serve as simulation harnesses and visualization tools. Pybind11 is absolutely brilliant for this. The last time I looked, similar tooling for Julia was substantially less mature and definitely didn't look like something I'd want integrated into a production workflow.


I tried to integrate a Julia REPL into another application and the example on the website didn’t even compile.


It would be great if you can file an issue. We usually do CI for doctests on base julia itself, and naturally need to do more of it.


It's been open for several years: https://github.com/JuliaLang/julia/issues/37957


> But the REPL lacks the ability to redefine structs on the go (which I can understand as it'd be tough to do, or simply not possible). But that, combined with the slow startup time, makes life a bit harder than it should be. Fortunately, one doesn't redefine one's structs every day.

It is possible to redefine structs in Pluto.jl which is also a productivity booster overall due to its reactivity.

> For example, if you use the plots library, you'll have a hard time finding a list of all possible plots (the documentation talks about lots of things, but strangely, not a list of possible charts).

The Makie.jl plotting library has really great docs nowadays: https://makie.juliaplots.org/stable/examples/plotting_functi...


> The Makie.jl plotting library has really great docs nowadays

Even then it lacks geographical plots, and the GeoMakie.jl (part of the same ecosystem) documentation is limited.

https://juliaplots.org/GeoMakie.jl/stable/

I know, I know, I should contribute documentation.


I really want Revise to be able to redefine structs as well. It would make package development a lot easier.


I described what needs to happen in https://github.com/JuliaLang/julia/issues/40399, but so far nobody has had the time to implement it.


All valid complaints. Regarding the first one, redefining structs, you can wrap them inside modules and reload the module. It's not as ergonomic since you will need to qualify the structs with the module name.


Can't you just export the struct? Edit: Just tried it and ran into those difficulties. Julia structs appear to be treated as constants, and Revise.jl didn't help (for me).


I don't know why you cannot "overwrite" existing structs but to be fair whatever system you use to program in a stateful manner in the REPL will have some problems.

Just hit this one: after rewriting a function I don't see any change in behavior. Reason: I overwrote the function but the more specialized one was being called.

But there are plenty more. After some time it's just better to nuke the REPL and start clean. That said, I love programming in the REPL.
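The "more specialized one was being called" trap can be reproduced outside Julia too. As a loose analogy only (Python's `functools.singledispatch` dispatches on the first argument's type, unlike Julia's multiple dispatch), the sketch below replaces the generic implementation while a specialized one stays registered. The function names are hypothetical.

```python
from functools import singledispatch


@singledispatch
def describe(x):
    return "generic v1"


@describe.register
def _describe_int(x: int):
    # A more specialized implementation, registered earlier in the session.
    return "int-specific"


# Later: "rewrite" the generic version, as one might at a REPL.
@describe.register(object)
def _describe_generic_v2(x):
    return "generic v2"


print(describe("text"))  # generic v2   (the rewrite is picked up)
print(describe(42))      # int-specific (the rewrite changes nothing here)
```

Calling with an `int` still reaches the old specialized implementation, so the rewrite appears to have no effect until that registration is replaced as well.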


> Just hit this one: after rewriting a function I don't see any change in behavior. Reason: I overwrote the function but the more specialized one was being called.

Yeah, I've had this and the struct redefinition problem since the very early days of Julia, that's why I never fully bought into the Revise.jl based development model (it has its good parts, but these are big limitations that should be mentioned more often when recommending it). That's also why I resisted the removal of the `workspace()`-clearing function (like MATLAB's `clear`), since that would be an alternate option for quick and dirty exploration in a lot of cases; though these days the latency problems of exiting to shell and coming back are much less, so it's not as much of an issue.


Revise.jl properly handles the deletion of methods. If the person you were replying to were tracking their changes with Revise, it wouldn't have happened.


Yeah, my comment was confused in a way - I didn't notice they weren't using Revise.jl, but I was also talking about earlier versions of Revise.jl. Around the time that `workspace` function was retired (v0.7/1.0 times), IIRC, Revise.jl did have problems deleting methods, and that made it a big pain point in a language where method dispatch is such a central pattern. So it fell short as a replacement for workspace-deletion function, despite being suggested as one, and that frustrated me. I'm probably holding on to that negative impression for far too long though, by this point; I'll make a more wholehearted attempt at using a Revise-based workflow and see how it feels today.


Yeah that's the problem.


> But the REPL lacks the ability to redefine structs on the go

ProtoStructs.jl: https://github.com/BeastyBlacksmith/ProtoStructs.jl


Yeah, having Revise.jl be able to redefine types would be delightful.



There's also https://docs.juliaplots.org/latest/generated/gr/ in the docs, which is basically "all the kinds of plots you can make" (see the other comment for the Makie gallery).


Examples and tutorials should go first. Some parts of the ecosystem, for some reason, put the manual and descriptions first. Give people code, then give people a manual for if they want to dig further. Thankfully, this is rather easy to fix.


Currently you can use the workaround to redefine structs in Revise https://timholy.github.io/Revise.jl/stable/limitations/



