Hacker News new | past | comments | ask | show | jobs | submit login
AI for math resources, and erdosproblems.com (terrytao.wordpress.com)
160 points by nabla9 7 months ago | hide | past | favorite | 35 comments



I'm curious why he is so bullish on AI. The page lists many traditional and actually useful tools like Coq, but much space is devoted to deep learning.

My experience with CoPilot shows that it fails at the most simple logical tasks:

When presented with an integer sequence it often does not even recognize the number of elements given. It just claims that we are looking for the 10th element when the current sequence only has three. Results are also wrong all the time.

Spatial reasoning of CoPilot is not very good. I wonder if early expert systems written in the Lisp era would outperform it.

It fails at tasks that require textual awareness like palindromes.

What it does very well is confabulating bland short stories and poems. It is great at understanding the input questions.

All in all, I am more impressed with Wolfram Alpha, which looks more like a traditional expert system.



Calling out about the second announcement (erdosproblems.com) that there is a call for help and I think many people here might be able to help and some may be happy to do it. Skills required include GitHub, web design, Python/Html/Sql, …

Details: https://www.erdosproblems.com/help


As a college student learning Web stuff on my own, this is a pretty nice thing to spend summer on.


Paul Erdős was a very interesting person. I read his biography The Man Who Loved Only Numbers.

He couch surfed and wrote papers with others his entire life and became the most prolific published mathematician ever. He also loved to teach kids serious math and not underestimate their thinking.

Also I believe his name is pronounced like air-dish.


Closer to (non-rhotic) eh-r, dersh, I would say. But, travelling as he did, I'm sure he was used to all sorts of pronunciations and accents, so I doubt he'd mind overmuch.


while i am too far removed to make wise comments, it is fascinating that applied maths is being put to use for finding new patterns.

after going through the google docs for resources, i get a sense of some high level topics. however, i am confused about the current paradigms. are ml models being used to reduce the search space or as a way to let computers bring about serendipity?

it seems like these approaches are like graphing libraries or monte carlo sims that help the user see the bigger picture and then put in their own thinking.


What does AI for Math mean? People doing Math research that want to transition to AI research?


Broadly, as you'll see in the Google doc linked from the announcement, they are referring to the use of computer systems to assist mathematical research, which falls into a two step process: 1) Identify a suitable language for use by both humans and computers for communication and manipulation of mathematical theorems and their proofs. 2) Employ both mathematicians and machine learning algorithms to generate interesting sentences in that language.


> as you'll see in the Google doc linked from the announcement, they are referring to the use of computer systems to assist mathematical research, which falls into a two step process

I reviewed the doc, and it is just curated list of links on various projects (ML frameworks, proof assistant, some tutorials), and does not outline plan you described here. Or I missed something?..


The first section, Textbooks and Survey Papers, begins with Formal Proof (this is where one identifies the suitable language for mathematics), followed by general Machine Learning references, and concluding with Machine Learning for Formal Proof. Other sections have similar divisions, though some go off into adjacent fields where one finds non-AI* algorithms for specific mathematical applications, such as automated symbolic calculus or SMT solving, to name just two.

*Some people's terminology would classify such algorithms as "AI". I wouldn't, but they are nonetheless on-topic because you might invoke them to assist, focus, refine, or accelerate other techniques.


That makes a lot more sense, thanks.


I’m generally a bit of an LLM skeptic, at least as concerns extravagant claims around “AGI” or self-improvement of any of that nonsense.

But there are activities where an inherently high-temperature assistant (e.g. some Instruct-inspired tune) are a great fit, and those are almost definitionally at the boundary of “objectively faithful” and “a bit stochastic”.

I’ve never done any work in novel mathematics myself, but I’m a big fan of those who do, and any study of those who do paints a picture of roughly that boundary: those who are proficient can easily spot a falsehood but only with difficulty generate inspiration for a new idea.

Of all the highly optimistic applications for 2024 LLMs, mathematics sounds pretty plausible?


I’m generally a bit of an LLM skeptic, at least as concerns extravagant claims around “AGI” or self-improvement of any of that nonsense.

Of all the highly optimistic applications for 2024 LLMs, mathematics sounds pretty plausible?

You probably already know this, but I think it bears pointing out explicitly: in this context we're not talking (only) about LLM's. There's a LOT more to AI than just Large Language Models, and the list of resources linked reflects that.


I’m an advocate for shifting resources from pure-scale transformer work to other promising avenues in both representation and reinforcement learning.


Never going to happen until upscaling does not return easy gains.


FWIW as a former mathematician, I believe that if LLM’s can crack mathematics in the sense of say, solving the Riemann Hypothesis through chains of logical reasoning, then we’d be in the back half of reaching “AGI” in the sense of solving broad classes of general problems like drug discovery, or getting fusion to work, or even improving itself.


Oh sure, I think that’s pretty uncontroversial. An LLM that can with any consistency start at “prove or disprove P = NP” and arrive at a proof is in some sense super-intelligence.

Even the best funded boosters are making no such claims AFAIK, I think a much more reasonable and sober assertion might be like “they are useful to doing real mathematics”.


The curious thing is if you can ever hit a snowball type point, because by proving theorems you are generating training data, and maybe you can get to a point where you effectively have limitless data.


Science needs experiments and those will not go faster because of AI. So why would fusion arrive any faster?


> Science needs experiments and those will not go faster because of AI.

Even though I am deeply skeptical whether AIs (as they are commonly understood today) will be helpful for hard scientific problems (for quite different reasons), I don't think this argument necessarily holds: couldn't a (hypothetical!) AI propose much better experiments and experimental designs?


That‘s not how science has progressed in the past. It has not progressed by giant leaps of superior intelligence but incrementally and hardly a genius was more than 10 years ahead of his time.


Yeah I did say “back half”. I just think that if we can build an AI that can logically reason better than any human, combined with the total knowledge of all humans, it can start running experiments and doing “science” itself.


In general I think I agree with all your points.

The word “if” in both the parent and GP is doing a lot of lifting here though, if they can do amazing feat X, then they will likewise do amazing feat Y. And I think in every instance you’ve made a credible argument.

But I don’t think we have the if yet. If I could just slip GPT-4 an executive summary of the code I meant to write and the emails I meant to send on a given day I would be doing it.

But it can’t: it’s catastrophically wrong with extreme confidence routinely. These things are easy to cherry-pick in citation but still fuck up on HellaSwag in instances a child would not.


Oh, I 100% agree we are not there. Every time a new model comes out I cook up a few undergrad math homework questions to see if they’ve gotten any better, and so far they haven’t. But, the thing I like about “reducing” AGI to solving math is that it kind of sidesteps these thorny questions about interacting with the real world to conduct experiments, etc. It’s something that, in principle, can be done with the current text interfaces, and it’s pretty easy to tell whether or not we’ve gotten there.


Most likely it will just generate bullshit proofs, claim otherwise and then just flat out lie to your face.


The impact extends beyond mathematics. We have seen some pretty positive outcomes for LLMs as an assistant for scientists - with a lot of custom agent dev though - and not off-the-shelf models.

This is already being used by a few hundred scientists in a lab. We are aiming to extend to thousands next year with a focus on CS, climate & bio.


I think this readership would be highly interested by anything hundreds of serious scientists are doing in serious work.

Can you share any details?


Thanks! Most of the current work is not yet out as press or blogs - we generally do that post code release/ publications/ arXiv submissions. Will share them as they are made public.

Meanwhile, this is the blog of our earliest work from January that will be published in ICML24: https://blog.allenai.org/data-driven-discovery-with-large-ge...

Also if you are interested - happy to correspond more by email. Lot more to share privately.


https://arxiv.org/pdf/2402.00159

MIT, the Steel Factory, Cal, UDub and what, 37 authors pushed a preprint and no one heard about it?

@dang I can’t be the only one who has had it on this GPT-5 test flight shit. I don’t care how many clean Azure IPv4 blocks Altman has, throw this motherfucker out.


Haha, I think it was covered at HN. But did not get as much attention as it should have.


Terry Tao has recently been exploring the computer proof writing language Lean and since you can mechanically check the validity of the proof it's perfectly valid to have an AI generate Lean code. If it's wrong Lean will give errors and if it's a valid proof it doesn't matter where it came from.


The article or the linked Google doc doesn't mention how AI for Math resources is different than other resources list out there?


One obvious distinction is that it's recommended by Terry Tao.


To pick two examples which show the difference other lists might have an article about "Copilot" whereas this one has "Getting copilot to work with Lean4". It's not "Code assistant" it's getting that working with coq etc.

ie specifically about AI for maths rather than AI in general




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: