Hacker News | etrain's comments

There was a company called Petridish that did exactly this. They pivoted after six months and became Blue Apron: https://vator.tv/news/2017-09-05-when-blue-apron-was-young-t...


Check out Determined https://github.com/determined-ai/determined to help manage this kind of work at scale: Determined leverages Horovod under the hood, automatically manages cloud resources and can get you up on spot instances, T4's, etc. and will work on your local cluster as well. Gives you additional features like experiment management, scheduling, profiling, model registry, advanced hyperparameter tuning, etc.

Full disclosure: I'm a founder of the project.


Oh hey I interviewed with y'all a few years back, glad to see you're still around.


Interesting. How do you guys manage spot interruptions when training on spot instances?


Users expose their model to our Trial API (https://docs.determined.ai/latest/topic-guides/model-definit...), the base class then implements a training loop (which can be enhanced with user-supplied callbacks, metrics, etc.) that has a whole bunch of bells and whistles. Easy distributed (multi-GPU and multi-node) training, automatic checkpointing, fault tolerance, etc.

Concretely, the system is regularly taking checkpoints (which include model weights and optimizer state) and so if the spots disappear (as they do), the system has enough information to resume from where things were last checkpointed when resources become available again.
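To make the checkpoint-and-resume idea concrete, here is a minimal, generic sketch. It is not Determined's Trial API or its actual implementation: `train`, `ckpt_path`, `checkpoint_every`, and `interrupt_at` are hypothetical names, and the "model" is a toy whose weights and optimizer state are plain floats saved as JSON.

```python
import json
import os


def train(total_steps, ckpt_path, checkpoint_every=10, interrupt_at=None):
    """Toy training loop: resume from ckpt_path if it exists, checkpoint periodically."""
    # Fresh state; a real checkpoint would hold model weights and optimizer state.
    state = {"step": 0, "weights": 0.0, "optimizer": {"momentum": 0.0}}
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)  # pick up where the last checkpoint left off

    while state["step"] < total_steps:
        if interrupt_at is not None and state["step"] == interrupt_at:
            # Stand-in for a spot instance being reclaimed mid-run.
            raise RuntimeError("simulated spot interruption")

        # Stand-in for one optimizer step.
        state["optimizer"]["momentum"] = 0.9 * state["optimizer"]["momentum"] + 1.0
        state["weights"] += 0.01 * state["optimizer"]["momentum"]
        state["step"] += 1

        if state["step"] % checkpoint_every == 0:
            # Write to a temp file, then rename atomically, so an interruption
            # during the write never leaves a half-written checkpoint behind.
            tmp_path = ckpt_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(state, f)
            os.replace(tmp_path, ckpt_path)

    return state
```

If a run is interrupted at step 25 with `checkpoint_every=10`, calling `train` again with the same `ckpt_path` restarts from step 20, so at most one checkpoint interval of work is repeated.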


Thanks for going open source!


Tom7’s SIGBOVIK submissions are true works of art. https://www.cs.cmu.edu/~tom7/mario/mario.pdf


My personal favorite is the compiler that uses only the printable bytes.


The exposition is tight.


Relational database queries are supposed to be _declarative_. As a user, you're not supposed to think about the mechanics of execution because the database system is supposed to be able to decide how to execute your query using whatever magic it wants, as long as it satisfies the contract that it gives you the right answer.

It's absolutely useful as a debugging tool to build up some semantic understanding of what the query means, and I encourage every database user to learn to use EXPLAIN, but relying on a mental execution model is borderline dangerous.
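As a small illustration of the point about EXPLAIN, here is a sketch using Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`. The `users` table, the `idx_users_email` index, and the `query_plan` helper are made up for the example, and other databases spell EXPLAIN differently and produce different output. The same declarative query gets a different plan once an index exists, without the query text changing at all.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable 'detail' column from SQLite's EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")

query = "SELECT * FROM users WHERE email = 'a@example.com'"

# Without an index, the planner has to scan the whole table.
plan_before = query_plan(conn, query)

# Same query text, but now the planner is free to pick the index.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, query)

print(plan_before)
print(plan_after)
```

The exact wording of the plan varies between SQLite versions, so treat the printed text as a debugging aid and not something to depend on.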


The author makes it crystal clear that she is not advocating this as a mental model for the execution of the query:

Database engines don’t actually literally run queries in this order because they implement a bunch of optimizations to make queries run faster – we’ll get to that a little later in the post.

So:

- you can use this diagram when you just want to understand which queries are valid and how to reason about what results of a given query will be

- you shouldn’t use this diagram to reason about query performance or anything involving indexes, that’s a much more complicated thing with a lot more variables

She claims this is a useful tool to understand the denotational semantics of the query and which kinds of queries are allowed vs. not allowed—and she's absolutely right.


A point that the article makes in multiple places, but I suppose it bears repeating once more.


Assume 100 pages on each onion address (it’s probably power-law but let’s just assume that’s the mean). Latency with Tor is super high. Assume average of 5s to load a single page. This is generous because tail latency will probably dominate mean latency in this setting.

These things can happen in parallel but let’s also assume no more than 32 simultaneous TCP connections per host through a Tor proxy.

So we’re looking at ~75k * 100 * 5 / 32 seconds ≈ 14 days to run through all of them. You may not need to distribute this but there are situations (e.g. I want a fresh index daily) where it is warranted.
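The back-of-the-envelope estimate can be checked in a few lines. This is only a sketch of the arithmetic: it reads 75k as the number of onion addresses, and `crawl_days` and `hosts_for_daily_index` are hypothetical names. The host count assumes throughput scales linearly with the number of crawler hosts, which is an assumption and not something the estimate above establishes.

```python
import math

SECONDS_PER_DAY = 86_400


def crawl_days(sites, pages_per_site, seconds_per_page, parallel_connections):
    """Days for one host to fetch every page, given a cap on parallel connections."""
    total_seconds = sites * pages_per_site * seconds_per_page / parallel_connections
    return total_seconds / SECONDS_PER_DAY


days = crawl_days(sites=75_000, pages_per_site=100,
                  seconds_per_page=5, parallel_connections=32)
print(f"{days:.1f} days on a single host")  # 13.6 days on a single host

# Assuming linear scaling, hosts needed to refresh the whole index every day.
hosts_for_daily_index = math.ceil(days)
print(hosts_for_daily_index)  # 14
```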



It’s also hard to separate the design of the neural architecture from the definition of the feature extractor.


I had one of these phones too - it was a bad batch of serial numbers according to Apple. If this happens to you: https://www.apple.com/support/iphone6s-unexpectedshutdown/


I was referring to that (my old serial number did show up as eligible but I had it replaced before that).

Apparently all phones suffer from that problem, but the batch we referred to seems to be especially bad. The software update just seems to have mitigated the effects on iPhones that were mostly OK but shut down occasionally (and on the affected ones, which shut down almost every week when at 70% and about once or twice a day at 20-30%).


Crunchbase has a good deal of data here - it is incomplete particularly for smaller companies, but is pretty good for larger investment rounds.

