
For “cloud-native” apps, JuiceFS is not needed.

S3 is not designed for intensive metadata operations, like listing, renaming, etc. For these operations, you will need a somewhat POSIX-compliant system. For example, if you want to train on the ImageNet dataset, the “canonical” way [1] is to extract the images and organize them into folders, class by class. The whole dataset is discovered by directory listing. This is where JuiceFS shines.
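
As a rough sketch of what that canonical layout looks like in practice (torchvision's ImageFolder is what the linked example uses; the mount path here is hypothetical):

    # One folder per class, e.g. train/n01440764/*.JPEG; the dataset is
    # discovered by listing directories, which is metadata-heavy on raw S3.
    from torchvision import datasets, transforms

    train_set = datasets.ImageFolder(
        "/mnt/jfs/imagenet/train",       # hypothetical JuiceFS mount point
        transform=transforms.ToTensor(),
    )
    print(train_set.classes[:3])         # class names come from folder names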

Of course, if the dataset is really massive, you will most likely end up with an in-house solution.

[1]: https://github.com/pytorch/examples/blob/main/imagenet/extra...



At first sight, I fired up the Developer Tools to check that the banner is *not* a <marquee>.


I expect Photoshop to contain many highly optimized routines for photo processing. These routines are tuned by hand and usually use architecture-specific features such as SSE/AVX, which are not portable.

Unoptimized (more portable) versions of these routines exist, but I suspect they won't perform well, even on the M1.


Indeed. It's why Photoshop for Mac won't run on Ryzen systems - it's optimised to the point where it relies on specific Intel CPU instructions.


Kind of a cool benefit of having a Mac: serious optimisations by developers.


The Windows one likely has them too, except with code paths for all processors.


Yeah, restricting the hardware your software can run on is super cool!


This happens with Linux apps as well. Vendors like Discrete will have restrictions on kernel version, distro, and hardware that are really no different. With a Mac, you just don't have a choice of hardware. On Linux, you have a choice, but only from the two options listed, and you have to build it yourself.


>I expect Photoshop to contain many highly optimized routines

Ah, I see you have never had the pleasure of using it.

Sarcasm aside, the whole Adobe suite makes atrocious use of your machine's actual power; most of it is still single-core. My machine can simulate the interactions of light photons on glass across multiple GPUs in the time it takes Photoshop to encode a GIF.

And you think, "well, maybe encoding the GIF is more work?" Yet try any GIF encoder written in pure JavaScript: they still outperform Photoshop, and the 3D render too.


> architecture-specific features such as SSE/AVX, which are not portable.

I don’t have hands-on experience, but somewhere on HN I saw this: https://github.com/simd-everywhere/simde. If starting a new cross-platform project today, I would try that library first before writing the usual intrinsics.


This is probably true. But AFAIK Photoshop already ran on iOS, so I would have expected the port to take a couple of weeks at most.


A QA cycle probably takes 2 weeks.


I'd imagine that they would have been cautious with highly CPU-tuned instruction flows after the PPC->Intel migration.


I'd expect them to have moved on from hand-written assembly to things like halide-lang.org/ by now.


This looks like the Entity-Component-System pattern[1] from game development. Entities are composed of components, and components are often stored in vectorized form.
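
A rough illustration of that vectorized (struct-of-arrays) storage, sketched in NumPy rather than any particular engine (all names here are made up):

    import numpy as np

    # Each component type is one packed array indexed by entity id,
    # so a "system" becomes a single vectorized pass over contiguous data.
    n = 1024
    positions = np.zeros((n, 2), dtype=np.float32)   # Position component
    velocities = np.ones((n, 2), dtype=np.float32)   # Velocity component

    def movement_system(pos, vel, dt):
        pos += vel * dt   # updates every entity in one pass

    movement_system(positions, velocities, 1 / 60)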

[1]: https://en.wikipedia.org/wiki/Entity_component_system


For anyone interested in why monorepos work, I'd recommend the book Software Engineering at Google: Lessons Learned from Programming Over Time. It details the reasoning behind the One Version Rule and behind choosing Version Control over Dependency Management.


Doesn't Google use Perforce though, which (last time I used it) forces a monorepo approach? git doesn't have equivalents to branchspecs and clientspecs.


It is because Google wants a monorepo that it chose to use Perforce (and later Piper). It is not that Google uses Perforce and is thus limited to a monorepo.

The core idea behind the monorepo (and monorepo-like approaches), as explained in the book, is that dependency management is harder than version control.



The site is using Let's Encrypt for HTTPS. Currently there is a hard limit of 2000 new subdomains per week, meaning there can be at most 2000 simultaneous connections!

It seems that wildcard support is coming in January 2018.


You can only get certificates for up to 2000 new subdomains per week, but you can renew many more than that, so I can work my way up to a few tens of thousands of subdomains if needed.

From https://letsencrypt.org/docs/rate-limits/:

"The main limit is Certificates per Registered Domain (20 per week)...

"If you have a lot of subdomains, you may want to combine them into a single certificate, up to a limit of 100 Names per Certificate. Combined with the above limit, that means you can issue certificates containing up to 2,000 unique subdomains per week...

"To make sure you can always renew your certificates when you need to, we have a Renewal Exemption to the Certificates per Registered Domain limit. Even if you’ve hit the limit for the week, you can still issue new certificates that count as renewals...

"Note that the Renewal Exemption also means you can gradually increase the number of certificates available to your subdomains. You can issue 20 certificates in week 1, 20 more certificates in week 2, and so on, while not interfering with renewals of existing certificates."

A wildcard certificate would be far more convenient! I'm looking forward to that.


I would highly recommend talking to the Let's Encrypt staff, since those limits are mainly there to prevent abuse, and they have been quite forthcoming about raising them, or even lifting them completely, in some cases.


Oh, good to know. Thank you!


Some basic analysis:

1. Kernels (Functions in NNabla) are mostly implemented in Eigen.

2. The network forward pass is implemented as a sequential run of functions. No multi-threaded scheduling, and no multi-GPU or distributed support.

3. Python binding is implemented in Cython.

4. Has some basic dynamic-graph support: functions run as soon as you add them to the graph, and backward runs afterwards. Somewhat similar to PyTorch (see the sketch after this list).

5. No support for checkpointing or graph serialization, unless I'm missing something.
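
For comparison, a minimal PyTorch sketch of the define-by-run style mentioned in point 4: operations execute as the graph is built, and backward runs afterwards.

    import torch

    x = torch.ones(3, requires_grad=True)
    y = (x * 2).sum()   # runs eagerly; the graph is recorded as we go
    y.backward()        # gradients computed after the fact
    print(x.grad)       # tensor([2., 2., 2.])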

I'm not sure why Sony is releasing this (yet another) deep learning framework. I don't see any new problems the project is trying to solve compared to other frameworks like TensorFlow and PyTorch. The code is simple and clear, but nowadays people need high-performance, distributed, production-ready frameworks, not another toy-ish framework. Can someone shed some light on this?

BTW, for newcomers to deep learning systems, CSE 599G1 (http://dlsys.cs.washington.edu/) is a good start.


The problem is just that this library is not an easy-to-bind C project; otherwise, a high-quality embeddable library that works reasonably well on CPUs, and can also benefit from the GPUs commonly used in small systems, could be useful for a number of projects. Not all problems involve a huge dataset of complex entries (like millions of images); there are many IoT problems that instead need a self-contained library supporting different kinds of NNs.


Seems like out of the major libraries (TF/Caffe/Theano/pytorch), pytorch is the only one with a core that is C (the TH* libraries). It's not exactly a small library, though. One small library that is in C and has some state-of-the-art features is Darknet (https://pjreddie.com/darknet).

That said, seems like directly using the C++ API was a major use case here, and it looks fairly clean to me.


Does the C (or is it C++?) core of pytorch come directly from torch or do they add more functionality? Is there a way to interface with this core using C?


TensorFlow and Caffe are also implemented in C++.


Being implemented in C++ doesn't ensure that a C++ interface exists. A C++ API is available for TensorFlow, though, and also a C API.


> I'm not sure why Sony is releasing this (yet another) deep learning framework.

Maybe because machine learning is 2017's big data, cloud, IoT, VR, ...?


One thing that seems promising is the built-in support for binary neural networks, which makes sense given its focus on embedded devices. There is no reason this couldn't have been implemented in, say, pytorch - but I'm guessing this library was started a few years back, when fewer alternatives were available.


Are the course lectures online?


Are rootless containers safe now? They are not turned on by default in Arch Linux because of security concerns[1].

[1]: https://bugs.archlinux.org/task/36969

