
Yeah, we also find this interesting. Arguably, though, the GP is far worse on all metrics, so it is not really our closest competition.


I'd be interested to hear what you see as your closest competition?


Hey there! I am one of the authors on this paper. If there are any questions, I am happy to answer :)


You are very clear about the current limitations on data size, which I find refreshingly honest! How sensible do you find the idea of fine-tuning the model for a specific problem with more than 1000 observations, by resampling the data (similar to bootstrapping) and retraining on the subsamples? As I understand it, one could fine-tune the algorithm that TabPFN learned to the specific problem.

Many thanks also for open-sourcing your work and making the Colab notebook; I've been playing around with that a bit.

Edit: spelling


We did try this a while back but did not get conclusive results. I expect you can bend it to perform better on larger datasets, too, but exactly how I cannot say for sure. Bootstrapping is definitely a good candidate for this.
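For anyone who wants to experiment with this, here is a minimal sketch of the subsampling/ensembling idea (my own illustration, not the authors' code; it assumes the sklearn-style TabPFNClassifier interface from the open-sourced tabpfn package, numpy arrays as inputs, and the helper name is made up):

    import numpy as np
    from tabpfn import TabPFNClassifier

    def subsample_ensemble_predict(X_train, y_train, X_test,
                                   n_members=8, subsample_size=1000, seed=0):
        # TabPFN handles roughly 1000 rows per forward pass, so fit each
        # ensemble member on a random subsample and average the probabilities.
        rng = np.random.default_rng(seed)
        probas = []
        for _ in range(n_members):
            idx = rng.choice(len(X_train),
                             size=min(subsample_size, len(X_train)),
                             replace=False)
            clf = TabPFNClassifier(device="cpu")
            clf.fit(X_train[idx], y_train[idx])  # "fit" just caches the data
            probas.append(clf.predict_proba(X_test))
        return np.mean(probas, axis=0)

Note this is plain subsampling at inference time; the fine-tuning idea above would additionally update the network's weights on the subsamples.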


Our lab works on changing this. I think it might still take some years for a full solution, but so far we have had success with NNs for small datasets via meta-learning (https://arxiv.org/abs/2207.01848) and for large datasets via regularisation (https://arxiv.org/abs/2106.11189). The first is very new; the second is cited in this paper, but they didn't run it as a baseline.


Why not run your own method on their benchmark? If your method is general-purpose, it should be very easy. And it would be very impressive to see your method succeed on a task that you did not choose yourselves when writing your paper.


Yes, good point! The paper I am responsible for is targeted at smaller datasets, but I will propose this to the authors of the second paper :)


Tree-based algorithms have the advantage that global or robust optimization of them is possible. Do your approaches offer similar guarantees?


The first work I linked arguably gets around this issue completely. There, we train a neural network to approximate Bayesian prediction directly; it accepts the dataset as input. Thus, there is no optimisation procedure per dataset, just a forward pass. So far, this seems to be pretty stable.
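To make that concrete, here is a toy sketch of the idea (all names, the tiny pooling model, and the synthetic prior are placeholders of my own, not the paper's architecture): the network is meta-trained across many datasets sampled from a prior, and afterwards a single forward pass on a new dataset approximates the Bayesian posterior predictive.

    import torch
    import torch.nn as nn

    class ToyPFN(nn.Module):
        def __init__(self, n_feats=2, hidden=64):
            super().__init__()
            # Encode (x, y) context pairs; the paper uses a transformer,
            # mean-pooling is a crude permutation-invariant stand-in.
            self.encode = nn.Sequential(nn.Linear(n_feats + 1, hidden), nn.ReLU())
            self.head = nn.Sequential(nn.Linear(n_feats + hidden, hidden),
                                      nn.ReLU(), nn.Linear(hidden, 2))

        def forward(self, x_ctx, y_ctx, x_qry):
            ctx = self.encode(torch.cat([x_ctx, y_ctx[:, None].float()], 1)).mean(0)
            return self.head(torch.cat([x_qry, ctx.expand(len(x_qry), -1)], 1))

    def sample_dataset_from_prior(n=64, n_feats=2):
        # Placeholder prior: random linear decision boundaries.
        w = torch.randn(n_feats)
        X = torch.randn(n, n_feats)
        return X, (X @ w > 0).long()

    model = ToyPFN()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for step in range(1000):                       # meta-training over many datasets
        X, y = sample_dataset_from_prior()
        cut = len(X) // 2
        logits = model(X[:cut], y[:cut], X[cut:])  # the dataset itself is the input
        loss = nn.functional.cross_entropy(logits, y[cut:])
        opt.zero_grad(); loss.backward(); opt.step()
    # At test time there are no gradient steps on the new dataset:
    # predictions are just model(X_train, y_train, X_test).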


Just because you optimize network parameters to fit a surrogate model of the data doesn't mean you are not doing local optimization, without guarantees on robustness or global optimality w.r.t. the network parameters fitting the surrogate. Am I missing something?


Having worked at a company providing neural network inference as a service, I can attest that we did this. We did it especially because we were scared people would distill models from our results: if another service makes the same weird mistake, they distilled from us.


There are also new application library releases (TorchVision, TorchAudio, TorchX) alongside 1.10: https://pytorch.org/blog/pytorch-1.10-new-library-releases/


I was especially pleased to see the create_feature_extractor quality-of-life function added in Torchvision. So convenient!
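For anyone who hasn't tried it yet, a minimal sketch (the node names below are specific to ResNet-50; torchvision.models.feature_extraction.get_graph_node_names lists them for other models):

    import torch
    from torchvision.models import resnet50
    from torchvision.models.feature_extraction import create_feature_extractor

    model = resnet50()  # randomly initialised; pass pretrained weights if needed
    # Map graph node names to the keys you want in the output dict.
    extractor = create_feature_extractor(
        model, return_nodes={"layer2": "mid", "layer4": "deep"})
    feats = extractor(torch.randn(1, 3, 224, 224))
    print(feats["mid"].shape)   # torch.Size([1, 512, 28, 28])
    print(feats["deep"].shape)  # torch.Size([1, 2048, 7, 7])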


Built with torch.FX!


Can I use this to run my own COVID-19 tests for cheap?


No, that requires a PCR machine with an optical fluorescence sensor, which is much more expensive.


What a weird assumption.


The real comparison would have been between CIFAR-10 (or ImageNet) and a downsampled version thereof, not an upsampled one, and I actually know for sure that this harms performance. So this ideal-world and real-world training are definitely not the same!


This is the exact comparison we make in the paper. We have subsampled ImageNet experiments as well; see Figure 3 in the full paper: https://arxiv.org/abs/2010.08127


You might have missed that the production of EVs causes more CO2 than that of diesel cars.


Over the 20-year service life of a car, operating costs dwarf production costs.


We were talking about emissions, not costs.


Exactly.

If car A produces 1 ton of emissions to build and 100 tons over its life, and car B produces 2 tons of emissions to build and 50 tons over its life, car B is the better one for the environment (52 < 101).

Looking at emissions to build as a metric by itself is disingenuous; you should look at lifetime emissions instead.


As an environmentalist myself, I really like this critique of environmentalism. While I think it should have ended with an argument in favour of some sustainability law, like a carbon tax, it clearly reminds us that we are still very much stuck in ideologies when it comes to saving CO2, and that this could potentially be because ideology makes us feel better than actually saving CO2 does.


And that the answer to a lot of global problems is to reduce poverty. People who struggle to stay alive are focused on that. People who no longer have to struggle have the luxury of caring for their environment.

For example: most of SE Asia burns their rubbish in their back yard, because there are no rubbish collection facilities. They also litter, throwing plastic waste into the gutter, because they're too busy dealing with other stuff to care. But the wealthier countries (Singapore is the classic example) don't do this and are fastidious about waste management.


Yes. I'm currently doing my M.Sc. in ecology, with a special interest in conservation, so I think about issues like these a lot. The most important lesson? Life's complicated.

It doesn't matter what issues (or proposed solutions) you look at - it could be GMOs, glyphosate, plastic, electric cars, nuclear power, bio-fuels, you name it - there is no easy answer. Some solutions are better than others, but all have some disadvantages.

It certainly doesn't help the environmental movement that many of its members display a very definite "green ideology". They go around spouting their preferred policies, ignoring or shouting down any objections (no matter how well founded). That really doesn't help our trustworthiness with the rest of society.

The nature of politics is such that we often have to pick sub-optimal solutions, and often on incomplete knowledge. But let's at least have a proper debate before settling on one solution - and not prematurely tout anything as "the only green way to go".

