
Do the captured elements move a lot during this snapshot, since it will take months? Is the difference significant?


Parallax shift from the spacecraft moving along its orbit isn't measurable past 16,000 light years. The Milky Way is about 100,000 light years in diameter, so for most of the stars in our galaxy, let alone extragalactic objects, there's no effect.

For closer objects this shift is useful, as it's the only way to directly measure the distance to an object: https://en.wikipedia.org/wiki/Parallax_in_astronomy Every other method for estimating distances to astronomical objects (standard candles, redshift, etc.) is ultimately calibrated against parallax measurements.
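
For reference, the relation itself is simple: a star with a parallax of one arcsecond is one parsec away, so distance is just the reciprocal of the parallax angle. A quick sketch (my own illustration, not from the article, using Proxima Centauri's ~0.768 arcsec parallax as the test case):

    # Parallax distance relation: d [parsec] = 1 / p [arcsec].
    PARSEC_IN_LIGHT_YEARS = 3.26156

    def distance_light_years(parallax_arcsec: float) -> float:
        """Distance to a star from its annual parallax angle."""
        return (1.0 / parallax_arcsec) * PARSEC_IN_LIGHT_YEARS

    print(distance_light_years(0.768))  # ~4.25 light years, matches Proxima Centauri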


This comes from the same group as the EfficientViT model. A few months ago, their EfficientViT model was the only modern, small ViT-style model I could find with raw PyTorch code available. No dependencies on the shitty frameworks and libraries that other ViTs use.


Custom and _competent_ AI tutors will be a game changer for education.


Meanwhile, I had a hard time last week getting a machine with 8 GPUs from Azure.


Every customer of a 3rd-party cloud provider is going to have a hard time getting Nvidia GPU compute - not only is the number of available GPUs limited, but Nvidia themselves are trying to spin up their own cloud provider offering, and unsurprisingly don't want to help a competitor.


> but Nvidia themselves are trying to spin up their own cloud provider offering, and unsurprisingly don't want to help a competitor

I thought Jensen recently said he does not want to offer their own cloud. He instead wants to focus on creating ready-made solutions that cloud vendors can purchase and re-sell as services.


Fair point! I think that's a recentish pivot though (past 2-3 years). I vaguely remember that in the late 2010s they were building and testing DGX Cloud as its own standalone offering, but I might be wrong and confusing it with some other offering they were working on.


I have a $200 mini machine and would like to upgrade. I can DIY anything. What would you suggest for a maximum budget of $1500?


Get a Sage/Breville Dual Boiler and do the Slayer mod + drip tray mod; you can then pull espresso that rivals any machine out there for a fraction of the cost. https://www.youtube.com/watch?v=JmQgxQ5Higw

The machine is old enough to be well understood, with documented mods and fixes: dual boiler, triple PID, extreme temp stability, front filling, drip tray indicator. I love it.

https://home-barista.com/espresso-machines/breville-dual-boi...


Get a used Rancilio Silvia + do a fully fledged Gaggiuino build.

Check out this video comparison (not a modded Silvia, but a Gaggia Classic, which I would not recommend due to its aluminum boiler; the new China-manufactured models also have a (probably Teflon) coated boiler that likes to shed the coating... just search on Reddit). The emphasis in the comparison is on the Gaggiuino mod, which does all the magic (profile-based brewing); the underlying base machine is not that important. I would stay away from Sage/Breville: too many non-standard parts and lots of plastic for my taste.

Decent Espresso vs DIY Gaggiuino build: https://www.youtube.com/watch?v=V4kAgPm1Xfw

Edit: many typos and some rewording


I read that Gaggia would replace flaking boilers. Contact your supplier for information.



The Mamba paper shows significant improvements at all model sizes, up to 1B, the largest one tested.

Is there any reason why it wouldn't scale to 7B or more? Have they tried it?


That's the issue - I keep hearing that meaningfully training such a large model is beyond a small research group's budget. You don't just need GPU time; you also need data. And just using the dregs of the internet doesn't cut it.


Most models on the leaderboard use much shorter sequence lengths.


Last time I tried, it did not have a working Android app.


What are these websites??


https://www.cnbc.com/amp/2017/03/22/meet-the-man-whose-site-...

https://www.businessinsider.com/techmeme-growth-2014-3

The river is a reverse-chronological feed, similar to https://hckrnews.com

If you go back to the main page of Techmeme and hover over a story, an arrow will appear on the left. Click it to see follow-up stories to the first one.

The first two, Techmeme and Mediagazer at least, rewrite headlines to de-Buzzfeed/de-Upworthy the clickbait. https://finance.yahoo.com/news/aggregators-attack-techmeme-h... https://www.poynter.org/reporting-editing/2015/techmeme-is-p...


I actually tried this last year, before OpenAI released their cheaper embeddings v2 in December. From my experiments, when compared to BERT embeddings (or recent variations of the model), the OpenAI embeddings are miles ahead when doing similarity search.


Interesting. Nils Reimers (SBERT guy) wrote on Medium that he found them to perform worse than SOTA models. Though that was, I believe, before December.


Most of the practitioners I see attempting this run text through an embedding model and then use cosine similarity or something similar as the metric.
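
For concreteness, here is the bare-bones version of that pipeline (the vectors are placeholders standing in for whatever embedding model you use, OpenAI, BERT, or otherwise):

    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Placeholder embeddings; in practice these come from an embedding model.
    emb_query = np.array([0.1, 0.7, 0.2])
    emb_doc = np.array([0.2, 0.6, 0.3])
    print(cosine_similarity(emb_query, emb_doc))  # 1.0 would mean identical direction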

Nils has written a lot of papers

https://www.nils-reimers.de/

and I think the Medium post you are talking about is

https://medium.com/@nils_reimers/openai-gpt-3-text-embedding...

Note also that SBERT is a Siamese network over BERT embeddings

https://arxiv.org/abs/1908.10084

which one would expect to do better than cosine similarity if it were trained correctly. I'd imagine the same Siamese network approach he is using would work better than cosine similarity with GPT-3.
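
For what it's worth, trying SBERT yourself is only a few lines with the sentence-transformers library (the model name below is just one of the standard pretrained checkpoints, not a specific recommendation):

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(
        ["Document B describes prior art relevant to Patent Application A.",
         "Patent Application A cites Document B as prior art."],
        convert_to_tensor=True,
    )
    print(util.cos_sim(embeddings[0], embeddings[1]))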

There's also the issue of what similarity means for people. I worked on a search engine for patents where the similarity function we wanted was "Document B describes prior art relevant to Patent Application A". Today I am experimenting with a content-based recommendation system and face the problem that one news event can spawn 10 stories in my RSS feeds; I'd really like a clustering system that groups these together reliably, without false positives.
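
One naive way I could imagine approaching that (untested sketch; the threshold and linkage would need tuning on real duplicate pairs): embed each headline, then run agglomerative clustering with a cosine-distance cutoff so near-duplicates fall into the same cluster.

    import numpy as np
    from sklearn.cluster import AgglomerativeClustering

    rng = np.random.default_rng(0)
    headline_embeddings = rng.random((10, 384))  # placeholder for real embeddings

    clustering = AgglomerativeClustering(
        n_clusters=None,          # let the distance threshold decide the count
        distance_threshold=0.3,   # tune on held-out duplicate pairs
        metric="cosine",          # sklearn >= 1.2; older versions call this affinity=
        linkage="average",
    )
    labels = clustering.fit_predict(headline_embeddings)
    print(labels)  # stories sharing a label are treated as one news event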

I'd imagine a system that is great for one of these tasks might be mediocre for the other; in particular, I am interested in finding some kind of data to evaluate success at news clustering.

