Yeah, this is confusing for me: I'm not an expert in numpy*, but I had assumed it would do most of those things (vectorize, unroll, etc.) either when compiled or through whatever backend it's using. I understand that numpy's routines are fixed and that Mojo might have more flexibility, but for straight-up matrix multiplication I'd be very surprised if it's really leaving that much performance on the table. Although I can appreciate that if it depends on which BLAS backend happens to be installed, that's a barrier to getting fast performance by default.
* For context, I do have some experience experimenting with the gcc/intel compiler options available for linear algebra, and even outside of BLAS, compiling with -O3 -ffast-math -funroll-loops etc. does a lot of that; for simple loops like those in matrix-vector multiplication, compilers can easily vectorize. I'm very curious whether there's something I don't know about that would result in a speedup. See e.g. https://gist.github.com/rbitr/3b86154f78a0f0832e8bd171615236... for some basic playing around.
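To make the comparison concrete, here's a rough Python sketch (just an illustration, not taken from the gist above) of how much work numpy hands off to its BLAS backend compared with an interpreted loop:

```python
import time
import numpy as np

n = 128
a = np.random.rand(n, n)
b = np.random.rand(n, n)

# Hand-rolled matrix multiply: every multiply-add goes through the
# Python interpreter, so nothing gets vectorized or parallelized.
def naive_matmul(a, b):
    n = a.shape[0]
    c = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += a[i, k] * b[k, j]
            c[i, j] = s
    return c

t0 = time.perf_counter(); c1 = naive_matmul(a, b); t_naive = time.perf_counter() - t0
t0 = time.perf_counter(); c2 = a @ b;              t_blas  = time.perf_counter() - t0

# Both paths compute the same product; only the speed differs.
assert np.allclose(c1, c2)
print(f"naive: {t_naive:.3f}s  BLAS (a @ b): {t_blas:.5f}s")
```

The point being: `a @ b` dispatches straight to whatever BLAS numpy was linked against, so the vectorization already happened when that library was compiled.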
I'm not sure where/how they'd be squeezing out more performance unless it's better compilation/compatibility with Apple Silicon intrinsics.
Edit: ...Is Mojo using more than one core? I'm not sure I understand their syntax or whether those are parallel constructs.
Edit 2: Yeah, Mojo seems to be parallelizing, so the comparison really isn't fair. The np.config posted elsewhere shows that OpenBLAS was only compiled with MAX_THREADS=3 support, and it's not clear what OPENBLAS_NUM_THREADS/OMP_NUM_THREADS were set to at runtime.
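For reference, OpenBLAS reads its thread-count variables once at load time, so they have to be pinned before numpy is imported. Something like this (the variable names are the standard OpenBLAS/OpenMP ones) makes a single-threaded comparison explicit:

```python
import os

# Must be set BEFORE importing numpy; the BLAS library reads them
# once when it loads. Pinning to 1 makes a comparison against
# single-threaded code apples-to-apples.
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np

np.__config__.show()  # prints which BLAS/LAPACK backend numpy was built against
```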
I'm not super familiar with Macs, but I also notice that numpy here is using openblas64. I had thought the go-to was the Accelerate framework, or is that part of it somehow? If so, it would be interesting to see how that affects performance. Of course, it's all kind of an argument for something like Mojo that gives better performance out of the box. Also an argument for why Mojo would be far more interesting if it were open source.
Just whatever you get by default with pip install numpy... Changing the benchmark to run with 1024x1024x1024 matrices instead of 128x128x128 does speed up numpy significantly, though:
Python:       119.189 GFLOPS
Naive:          6.275 GFLOPS  0.05x faster than Python
Vectorized:    22.259 GFLOPS  0.19x faster than Python
Parallelized:  50.258 GFLOPS  0.42x faster than Python
Tiled:         59.692 GFLOPS  0.50x faster than Python
Unrolled:      62.165 GFLOPS  0.52x faster than Python
Accumulated:  565.240 GFLOPS  4.74x faster than Python
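For anyone who wants to reproduce the numpy number, the GFLOPS figure can be measured in a few lines (the 2*n^3 flop count is the standard one for a square matmul; float32 assumed here to match typical benchmark setups):

```python
import time
import numpy as np

n = 1024
a = np.random.rand(n, n).astype(np.float32)
b = np.random.rand(n, n).astype(np.float32)

a @ b  # warm-up, so BLAS thread-pool startup isn't included in the timing

t0 = time.perf_counter()
c = a @ b
elapsed = time.perf_counter() - t0

# An n x n by n x n matmul performs 2 * n^3 floating-point operations.
gflops = 2 * n**3 / elapsed / 1e9
print(f"{gflops:.1f} GFLOPS")
```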
If you're looking for improved performance, you'll always go with NumPy + vectorization. That's what's important. So I don't know what the argument here is; am I missing something?
This is great; I'm going to switch to this from black. Being used to working in other languages, I feel like I'm swimming in molasses when using Python.
It's funny that all good things for Python are not written in Python. Says a lot about the language.
SvelteKit's routing pattern is anything but clean. It uses filesystem-based routes where every page file is called "+page.svelte" (or "+server.js/ts" for API-only routes).
For anything but a demo app with just a couple of routes it's a headache to navigate.
Edit: Another smell of the routing system, to me, is that you need to resort to complexity like this [1] to get anything but top-down inheritance. I love Svelte, btw, and have used it in many projects.
Having built a rather large app using SvelteKit, I found the routing scheme to make lots of sense actually. You always know what code is located where, concepts transfer everywhere consistently, and concerns are nicely separated.
In theory it seems neat but in practice I'm here with 10 tabs all called "+page.svelte" or "+page.server.ts" and it completely breaks my workflow since I can't tell them apart or navigate with fuzzy name matching. How do you deal with that?
Generally you should try to get away from clicking around on file tabs and use the command palette instead, but one thing you can do, if you're using VS Code, is change the "Label Format" setting to "short".
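In settings.json that's (assuming a reasonably current VS Code; the setting lives under Workbench > Editor):

```json
{
  // Show the containing folder whenever two open tabs share a filename
  "workbench.editor.labelFormat": "short"
}
```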
As I've grown older, I've seen lots of improvements to software usability, so it warms my heart to see the attitude of yesteryear's "you're not using it right, consider changing your lifestyle" still alive and well.
It would be great if VSCode provided an API for the tab labels but the Svelte team really created the problem in the first place.
This whole file-based routing wasn't a great idea to begin with, as a general solution to routing. It works (up to a point) for a static site generator, but most SSGs provide a permalink setting so there's an escape hatch.
The whole thing gets even worse when you start introducing issues like having dozens if not hundreds of files with the same filename, weird characters in folder and file names, etc.
Try a JetBrains IDE: if two tabs with the same filename are open, it will automatically prepend the name of the containing folder (recursively upward, until the names differ). Otherwise I wouldn't know how to stay sane, for the same reason.
I've been using Svelte happily for years, but I won't be using SvelteKit. In part because of the routing but also because it doesn't really solve much in the backend.
It's amazing that all the full-stack frameworks (Next, Nuxt, SvelteKit, Remix, Astro, etc.) are investing so much effort into reinventing the backend, and after years they still don't provide even basic backend functionality. For example, out of the box, Fastify gives you validation, sessions, CORS, cache headers, etc.: features you need in probably every backend project.
I started this repo to figure out how to integrate Svelte with Fastify using Vite. It has hot reload, partial hydration, etc. It's very quick and dirty code, but it works.
It's only a headache if you make it one. With an open mind, it looks pretty neat to be honest. The top down inheritance doesn't look complex, just weird.
I agree with this comment. SvelteKit is unnecessarily complicated. The `+page.svelte` etc are just the start of it.
Plus I can't use it with Go, PHP, Ruby, Rust etc when it comes to SSR (without running multiple servers and handling deployment nightmares).
Something about this whole Node + SSR front-end setup is smelly (Next, Nuxt, SolidStart). I love Svelte as a framework and a way of writing UI, but SvelteKit? Eh, not so much.
SvelteKit is too much complexity for no reason. It goes against what Svelte was meant to be: simple and intuitive.
Curious to hear more, what specifically is complicated about it?
SvelteKit is a JavaScript framework, it makes sense that you can't use it with other languages. You can pair it with a backend of your choice of course, but to get the SSR benefits you do need to work within the framework.
There are other ways of using Svelte with other languages, I would take a look at something like Inertia.js [0].
Hey Kevin! First of all. I am a regular listener to your podcast! Love it!
Now, to be fair, I'm not specifically targeting SvelteKit, but the whole host of meta-frameworks like Next.js, Nuxt, etc. that Vercel is pushing.
These frameworks in general add too much complexity. SSR is hard, and the best way to handle all of this is to not have a server layer at all; that is, abstract the rendering/routing part and leave the rest of the server stuff to the user.
I want to simply write my Fastify/Express/Go-gin/Django app. Then add SvelteKit as my front-end with SSR support.
Right now, I first write SvelteKit and then think about how I'm going to integrate Express or Fastify into it (for a moment, let's leave aside non-Node solutions).
Trust me, if you simply left the server out of SK and generated a simple API abstraction like `res.send(renderSKPath('/users/:id', { serverData }))`, it would do the job.
I think it's difficult to express what I want to say, but in short: remove the server and keep SK as a rendering layer only.
P.S. I know that there is an express adapter for SK. But that is not the point of this comment at all.
You're asking for two things that seem largely incompatible. How do you expect to do SSR in a Go-gin or Django app? Svelte components get compiled to JavaScript, and SvelteKit is written in JavaScript. Doing SSR in those frameworks would necessitate calling JavaScript from Go or Python and would introduce far more complexity, if you could get it to work at all. The simplest options are either to run a Node server or to turn off SSR, which you can do with one line in SvelteKit.
Looking forward to trying this. VSCode is great, but I really miss the performance of Sublime Text. I hope they get the plugin system right; a killer feature would be if it could load VSCode plugins (incredibly hard to pull off, yes).
After our past experience with Atom, getting the plugin system right is a top priority for the editor.
The thought of cross-compatibility with VSCode plugins has definitely crossed our minds, and it's not out of the question, although our current plan is to initially support plugins using WASM.