lukasga's comments

The story is that Alfred Nobel's partner cheated on him with a mathematician (amusing, but unconfirmed).

http://nobelprizes.com/nobel/why_no_math.html#story


What does "Data flywheel" refer to here? Is it the continuous and immediate processing of user input while you're still speaking?


More like the data pipeline: every bit of your usage of the free ChatGPT is used to train ever better, ever more efficient iterations of the next GPT model.


Maybe a bit outdated now, but this reminds me of LSTMs, with their recurrent, gated update of a memory/hidden state. I remember one of the biggest problems with such RNNs being vanishing gradients over long contexts, which vanilla transformers presumably avoided by attending over the whole context in parallel instead of processing it step by step. I wonder how that is avoided here?
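For reference, the gated recurrent update I mean looks roughly like this. A minimal numpy sketch of one LSTM cell step; the weight layout and names are my own simplification, not from any particular paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gates decide what to forget, write, and expose.

    x: input (d_in,), h: hidden state (d,), c: cell state (d,)
    W: (4d, d_in), U: (4d, d), b: (4d,) -- the four gates' parameters stacked.
    """
    d = h.shape[0]
    z = W @ x + U @ h + b
    f = sigmoid(z[0:d])            # forget gate
    i = sigmoid(z[d:2*d])          # input gate
    o = sigmoid(z[2*d:3*d])        # output gate
    g = np.tanh(z[3*d:4*d])        # candidate cell update
    c_new = f * c + i * g          # gated memory update
    h_new = o * np.tanh(c_new)     # gated hidden state
    return h_new, c_new

# Process a toy sequence strictly step by step (the sequential part
# that makes gradients flow through every timestep).
rng = np.random.default_rng(0)
d_in, d = 3, 4
W = rng.normal(size=(4 * d, d_in))
U = rng.normal(size=(4 * d, d))
b = np.zeros(4 * d)
h, c = np.zeros(d), np.zeros(d)
for t in range(5):
    h, c = lstm_step(rng.normal(size=d_in), h, c, W, U, b)
```

The `c_new = f * c + ...` line is the part the vanishing-gradient discussion is about: gradients through the cell state are scaled by the forget gate at every step.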


Love the site, but the one thing I am missing is instantly seeing which papers are trending right now, instead of having to manually select a timeframe. Think the Hacker News front page. It would be interesting to add a "hot" filter or similar. Maybe weighting by exponential decay over time? Not sure how you usually combine that with PageRank.


Thanks! Yes, that's on the roadmap; I believe it will be complete in 3-4 cycles. Right now we are finishing the topics module, which will give us more granular filtering tools and also let us discover trending topics, etc. It will be eye-opening... but it's taking longer than expected because of the dataset size.


Spotify recently open-sourced Voyager (a successor to Annoy), which uses HNSW. https://github.com/spotify/voyager


I can relate to this problem a lot. I have considered switching to a Docker dev container, with a base image for shared dependencies that I then customize in a Dockerfile for each new project. Not sure if there's a better alternative, though.


Yeah, there is the official Nvidia container with torch + CUDA pre-installed that some projects use.

I feel more projects should start with that as the base instead of pinning to whatever variants. Most aren't using specialized CUDA kernels, after all.

I suppose that's the answer: just pick the specific torch + CUDA base that matches the major version of the project you want to run. Then cross your fingers and hope the dependencies mesh :p.
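Concretely, that setup might look like the sketch below. The NGC tag and the requirements file are placeholders for whatever your project actually needs:

```dockerfile
# NVIDIA's NGC image ships PyTorch + CUDA + cuDNN pre-matched.
FROM nvcr.io/nvidia/pytorch:24.01-py3

WORKDIR /workspace

# Project-specific dependencies layered on top of the shared base.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
CMD ["python", "train.py"]
```

Run with `docker run --gpus all ...` so the container can see the host GPUs.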



Thanks!


I saw that Spotify recently released a similar feature for selected podcasts. Could anyone recommend some recent research papers on voice translation? I'd be really interested in reading them!



This study was done at my uni, and its legitimacy and rigor were questioned and investigated by the university; ultimately, it was not deemed inappropriate. Some interesting details: the single picture of each student shown to the "jury" was handpicked by the author from Facebook, and the jury consisted of high school students.

More info: https://www.svt.se/nyheter/lokalt/skane/snygghetsstudien-ar-...


They don’t say whether the author knew the grades when selecting the images, nor whether the jurors knew them. As long as they didn’t, I doubt it had an impact on the study.

You could skew the results if you knew the grades, but it wouldn’t be easy.


> the legitimacy and rigor of it was questioned and investigated by the university. However, it was ultimately not deemed to be inappropriate.

Does your uni routinely question and investigate the rigor of all its studies, or only ones where it doesn't like the findings?


The reason it was questioned was that some women felt uncomfortable about having their images pulled from social media and rated for attractiveness as part of the study, without having been asked or informed. But ultimately the ethics people decided it was fine after all.


Fascinating article; I feel like Quanta always puts out high-quality stuff. I don't know if I'm misunderstanding, though, but the article says:

"In subsequent steps of the algorithm, should it pick parameters that we have already picked before in earlier steps, or should we exclude those?"

Isn't it the training examples that the two different types of sampling are considered for, rather than the parameters of the model?
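For what it's worth, the two sampling schemes as applied to training examples look like this. A toy numpy sketch of my reading of the question, not the paper's actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(42)
n_examples, batch = 10, 4

# With replacement: an example may be picked again in later steps.
with_repl = rng.choice(n_examples, size=batch, replace=True)

# Without replacement: shuffle once per epoch, so each example
# is excluded from later batches until every example has been seen.
epoch = rng.permutation(n_examples)
batches = [epoch[i:i + batch] for i in range(0, n_examples, batch)]
```

The "should we exclude those?" question then maps to choosing between `replace=True` and the shuffled-epoch scheme.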

