Colab is running software on other people's computers.

The moment you try to reproduce it in a local env, you'll be greeted with many "non-existent and unmatched dependency" errors.

Also, "examples" do not cut it.




I am not sure what could be simpler than a 1 LOC invocation + minimal imports.

It is true that the model is based on PyTorch + python, but the majority of complexity (like SSML parsing) is tucked inside of the model.

Theoretically one can make a simplified model without any of those features in plain PyTorch or ONNX, but so far we have not had proper motivation to do so.

As for a CLI, this also seems simple enough, but out of scope for us.


In such situations it could be useful to provide a container image or nix or guix shell setup, to make sure people have the dependencies they need.


> I am not sure what could be simpler than a 1 LOC invocation + minimal imports.

Let me make it "embarrassingly simple" for you:

    /bin/bash text_to_speech.sh file.txt file.wav

Also, I'm not entirely sure what "out of scope" means?

Do you mean you run your software on computers that can't run bash?

Do you develop machine learning algorithms on your phone?
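
For illustration, a wrapper of that shape might look like the sketch below. It is written in Python rather than bash, and `synthesize` is a hypothetical placeholder that returns silence; it stands in for whatever one-line call the model actually exposes, which is not shown in this thread. The sample rate is likewise an assumption.

```python
"""Thin wrapper sketch: text file in, WAV file out."""
import argparse
import wave
from pathlib import Path

SAMPLE_RATE = 16000  # assumed value, not taken from the project


def synthesize(text: str) -> bytes:
    """Hypothetical placeholder for the model's one-line call.

    Returns 16-bit mono PCM silence, 50 ms per character, so the
    wrapper can be exercised without PyTorch installed.
    """
    n_samples = len(text) * SAMPLE_RATE // 20
    return b"\x00\x00" * n_samples


def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Convert a text file to a WAV file.")
    parser.add_argument("text_file", type=Path)
    parser.add_argument("wav_file", type=Path)
    args = parser.parse_args(argv)

    text = args.text_file.read_text(encoding="utf-8")
    pcm = synthesize(text)

    # Write the raw PCM bytes with a WAV header.
    with wave.open(str(args.wav_file), "wb") as out:
        out.setnchannels(1)   # mono
        out.setsampwidth(2)   # 16-bit samples
        out.setframerate(SAMPLE_RATE)
        out.writeframes(pcm)
    return 0
```

Calling `main(["file.txt", "file.wav"])` mirrors the two-argument invocation above; ending the file with `raise SystemExit(main())` would make it usable as a script.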


> Do you mean you run your software on computers that can't run bash?

It is explicitly stated that PyTorch is the only real requirement. Bash is not required, i.e. the models can be run on Windows or ARM with PyTorch.

> Also, I'm not entirely sure what "out of scope" means?

There was no tangible benefit in making a bash CLI for us.


From my experience with similar projects, it doesn't get any simpler than creating a virtual environment, installing from requirements.txt and calling a simple function to get what you want. Did you have a problem when you tried running that? Colab in this case is just abstracting that part away for the user.
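
As a rough sketch of those steps using only the standard library: the helper names (`venv_python`, `pip_install_command`, `bootstrap`) are made up for illustration, and the requirements file is assumed to exist in the project being set up.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Path of the interpreter inside the virtual environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def pip_install_command(env_dir: Path, requirements: Path) -> list:
    """Command that installs the listed requirements into the environment."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", "-r", str(requirements)]


def bootstrap(env_dir: Path, requirements: Path) -> None:
    """Create the environment, then install the requirements into it."""
    venv.create(env_dir, with_pip=True)
    subprocess.run(pip_install_command(env_dir, requirements), check=True)
```

`bootstrap(Path(".venv"), Path("requirements.txt"))` would then do both steps; whether the install succeeds still depends on how well the requirements file is pinned.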


Not criticising this project in particular, but I frequently find that Colab is just a way for people to get (or stay) very, very bad at managing the build and deployment of their code. It allows hand-rolling a bunch of adjustments to an environment that may be only barely understood, and then simply cloning that poorly understood environment.

3/4 times I try to make/rebuild a Colab-based demo from scratch in a suitable non-Colab environment… the setup instructions are varying degrees of wrong. They range from little mistakes, like underspecified requirements that are now broken due to transitive dependency changes, to completely wrong because everything has changed, to the absolute worst version of all: the ones never even written down.
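
One way to avoid the underspecified-requirements case is to record exact versions from an environment that is known to work. A minimal sketch, similar in spirit to `pip freeze`; the function name `frozen_requirements` is made up for illustration:

```python
from importlib.metadata import distributions


def frozen_requirements() -> list:
    """Return sorted 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with missing or broken metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)
```

Writing `"\n".join(frozen_requirements())` to a file gives a pinned list that includes transitive dependencies, not just the packages imported directly.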

I find Colab is a subtle form of lock-in by way of useful crutches. By leaning on Colab to handle all the hard dependency and environment management, you never need to learn to do it any better than is necessary to function on Colab. To draw a somewhat nasty analogy using terminology from the DevOps world: good dependency and build tools make a folder full of code like cattle, which you can blow away and rebuild whenever you want, but Colab lets you raise a pet by hand and then just magically clones it whenever you or someone else needs a copy.


Yes, that has been my exact experience with folks who work within Colab and other Jupyter-like things:

    1. They assume everyone has access to the same environment they do

    2. They often don't understand anything about the infrastructure that's running their stuff

    3. They produce very interesting work (such as this particular TTS work)

    4. They drop 90% of their potential audience within 5 minutes because the bloody thing lives in a weird cloud-only environment or requires a nightmarish stack of dependencies to run on a local machine and basically can't be simply integrated in a larger pipeline (e.g. a simple shell script).

My experience has been that getting ML researchers to get their head out of Colab's ass and learn to type things like "ls" and "cd" is really hard.


Jupyter has similar issues with bad environments but it’s usually much closer to “didn’t get my dependency versions right” or “this could theoretically run with less junk” and things like that.

Colab is far worse: they say “don’t worry about it, you can just clone” and it’s just been a toxic spill, rotting away at the level of understanding in the ML community. Colab lets you basically never put any effort into managing setup, dependencies, or data. Consequently it’s both amazing and fucking horrible the moment you want to avoid using it, because everyone just builds their project “leaning on” the capabilities of Colab. It has built an entire shanty town of poorly managed ML projects leaning precariously against the supports Colab provides.

I’m just glad it hasn’t sucked too much air out of Jupyter in the ML community, because at least stock Jupyter-based tools are easy enough to take apart and reverse engineer. It’s a normal Python ecosystem: no magic Google Drive data links, no custom libraries specific to Google’s tensor units, no push-button magic clones of entirely hand-crafted environments.


> it doesn't get any simpler than creating a virtual environment, installing from requirements.txt and calling a simple function to get what you want.

Can't tell if serious or sarcasm.


Can you point out any ML project that works any simpler than this? Other than running Colab of course, which I mentioned.


> Can you point out any ML project that works any simpler than this? Other than running Colab of course, which I mentioned.

https://bellard.org/nncp/


That's certainly very nice, but I'm sure you can appreciate the complexity involved here. I would need to compile this and have CUDA properly configured on Linux, or have no CUDA support on Windows. So even your hand-picked example is not that different from the process I just described, which is the standard for ML projects as of today, even for players like Meta or Nvidia.


Usually these projects depend on Python wheels and native binaries. No, I haven't tried reproducing this project.

Interesting project btw, kudos to the dev.



