I quiz ChatGPT about math (billwadge.com)
79 points by herodotus on Dec 16, 2022 | 84 comments



I was actually very impressed with some math questions it was able to answer for me:

S: If flabb means three, snabb means four, and whabb means five, what is flabb plus whabb?

If flabb means three, snabb means four, and whabb means five, then flabb plus whabb would be equal to three plus five, or eight. In this made-up numbering system, the words “flabb,” “snabb,” and “whabb” represent the numbers three, four, and five, respectively. When these numbers are added together, the result is the number eight.

S: If bbb means three, bbbb means four and bbbbb means five, how do you think you would write six?

If bbb means three, bbbb means four, and bbbbb means five, then six could potentially be represented by the word “bbbbbbb.” In this made-up numbering system, the number of "b"s in a word corresponds to the value of the number it represents. Therefore, to represent the number six, one possible way would be to add another “b” to the end of the word for five, which would give us “bbbbbbb” as the representation of six. This is just one possible way to represent six in this numbering system, and there may be other ways to do it as well.

S: Why do people argue about the answer of mathematical questions like “8 / 2(2+2)”?

There are a few reasons why people may argue about the answer to a mathematical question like “8 / 2(2+2)”. One reason is that there is more than one wav to interpret the question and arrive at a different answer. For example, some people may interpret the question as “8 / (2 * (2+2))” and arrive at an answer of 1, while others mav interpret it as “(8 / 2) * (2+2)” and arrive at an answer of 16.
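For anyone who wants to check, the two readings it describes really do come out to 1 and 16; a quick sketch of my own, not from the post:

  print(8 / (2 * (2 + 2)))   # 1.0  -- treating 2(2+2) as one grouped factor
  print((8 / 2) * (2 + 2))   # 16.0 -- dividing first, left to right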


That is very cool, and surprising to me given I was also initially underwhelmed.

One nitpick though: in the second question, although the explanation itself is clean, there appear to be seven b’s in its representation of the number six.


I've heard it's because GPT's input is whole words/tokens, not characters. So it has little insight into their spelling or letter positions, unless it's explicitly mentioned in the training set (e.g. rhymes are easy, counting letters of rare words is hard).

This is an especially bad example, though; surely it had "bbbbbb" somewhere in its training set associated with "6".


It's even more interesting, I think. The tokens are a byte pair encoding [1] of the input string. So a short, frequent word might be represented as one token, but an infrequent word (such as "bbbbbbb") might be represented by several tokens, each of which might or might not correspond to a letter.
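As a rough illustration (a toy merge loop of my own, not GPT's actual tokenizer or merge table), byte pair encoding repeatedly fuses the most frequent adjacent pair, so a string like "bbbbbbb" ends up as a few opaque chunks rather than seven letters:

  from collections import Counter
  def bpe_merge(tokens, merges):
      # fuse the most frequent adjacent pair, 'merges' times
      for _ in range(merges):
          (a, b), _count = Counter(zip(tokens, tokens[1:])).most_common(1)[0]
          out, i = [], 0
          while i < len(tokens):
              if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                  out.append(a + b); i += 2
              else:
                  out.append(tokens[i]); i += 1
          tokens = out
      return tokens
  print(bpe_merge(list("bbbbbbb"), 2))   # ['bbbb', 'bb', 'b'] -- chunks, not letters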

This might also explain the weird "off-by-one" errors with the ROT13 task.

[1] https://en.m.wikipedia.org/wiki/Byte_pair_encoding


> This is just one possible way to represent six in this numbering system, and there may be other ways to do it as well.

Gotta hand it to ChatGPT. It's not wrong.


> If flabb means three, snabb means four, and whabb means five, then flabb plus whabb would be equal to three plus five, or eight. In this made-up numbering system, the words “flabb,” “snabb,” and “whabb” represent the numbers three, four, and five, respectively. When these numbers are added together, the result is the number eight.

Was its training data a freshman university student trying to get through a 10-page essay?


Actually, I cut some of the more verbose parts of all three answers...


> There are a few reasons why people may argue about the answer to a mathematical question like “8 / 2(2+2)”. One reason is that there is more than one wav to interpret the question and arrive at a different answer. For example, some people may interpret the question as “8 / (2 * (2+2))” and arrive at an answer of 1, while others mav interpret it as “(8 / 2) * (2+2)” and arrive at an answer of 16.

Wow, this is kinda astonishing. Could it ruminate about the precedence of the operators?

Also: the words "may" and "way" appear as "mav" and "wav" in your text. Is this a problem with your input to the HN comment or did the model really make these textual errors?


8 / 2(2+2) was an internet meme, so not that astonishing. It should be in the training set somewhere.


It would be interesting to repeat the experiment with different integers.


Replace "2+2" with "2-2" and the answer is still the same. It does not read expressions, it does not see divisions by zero. It just outputs the training data, associated with parenthesis.

It will still hallucinate "division by zero" fallacies when asked questions like:

   Given that a - b > 0, can we simplify?
   a / (a - b) = 1 / (a - b)
   a = 1


Regarding the issue of "mav", it looks like the poster used screen capture followed by some image-to-text OCR routine (possibly Tesseract). This feels like an OCR fail.


Exactly what happened. I was capturing some of my earlier conversations as screenshots, and didn't feel like retyping the whole thing. Errors are mine, apologies.


It's odd to me that it doesn't explicitly mention 'order of operations'.


> bbbbbbb

It's impressive that gpt was able to abstract the number system, but... There are 7 Bs there.


I caught that too!


People love pointing out how bad GPT3 is at math - it’s just not remotely surprising, though. We all learned a long time ago that pattern matching alone isn’t enough to solve math questions and that we need to actually activate our brains and run algorithms to get valid answers.

Even mathy people fall into this - we all know that some numbers ‘look prime’. And up to 100 that instinct can work really well (except for 91).

But we all apply a heuristic for when to stop guessing and start mathing.

But guess what! GPT3 can do that too!

https://twitter.com/goodside/status/1568448128495534081

This thread predates chatgpt - but it really shows that GPT can pretty reliably spot when winging it won’t work. And in this case, it could effectively be given the ability to whip out a calculator to figure it out.

(Edited to add)

Found the follow up thread where Riley went into more detail and enhanced the results:

https://twitter.com/goodside/status/1581805503897735168?s=20...


Here's a very easy middle school competition problem - https://imgur.com/a/YdFSBVY

It repeatedly got the answer wrong. I engaged in a long conversation with it (too long to post) where I tried to convince it that its answer was wrong, but it became quite rude at a certain point & alleged I was trying to trick it or something (?!), so I just gave up. Unfortunately, I was doing the whole exercise in front of my 10-year-old, who was initially very excited about ChatGPT. After seeing its behavior, he is now convinced that it's dumb as dirt.

If you look at the problem, ChatGPT's answer is the 2nd term of the input series! Obviously an all-positive series can't have one of its own terms as its sum; the answer has to be bigger than each of its terms. So forget about calculation, it has no logical reasoning ability either.


You should basically be thinking of chatGPT as a personal Wikipedia. It has the collective knowledge of the internet. It can still be wrong about fact-based questions, but it is a good starting point. It’s not effective at doing math because that’s not what it was designed to do.


If I post that same question on Wikipedia and confidently assert the answer is 1, some good Samaritan will edit it within a day & say no, dumbass, it's 6, and then link me to the Talk page & we'll duke it out over there. On ChatGPT, you have no recourse other than the thumbs-down button. When you give it the right answer it simply refuses to accept it! It's sure amusing as an adult, but be aware that several schools (at least here in the Midwest) are discussing this in the classroom - my kid said his teacher was talking about ChatGPT, and that's why I proceeded to show what it does. So kids might be quite confused by this behavior. Better to just go with the standard disclaimer: "I don't know. As a large language model trained by ClosedAI, blah blah blah, insert boilerplate here, I can't figure out the answers to basic equations mankind has known for a few centuries now unless more VC money is forthcoming. Ask Powell to pause his rate hikes and the market shoots back up, we staff up the org with a few mathematicians, and they'll tell you it's 6. Right now it's 1, suck it, dummy."


Well, my point is there aren't pages for specific math problems on Wikipedia, because that's not the goal of Wikipedia. ChatGPT can tell you the rules of algebra, but it's bad at applying them. If you want to solve math problems you should go to WolframAlpha.

My personal opinion is that having an encyclopedia that is sometimes wrong is actually incredibly valuable for educating people in the social media age. Distinguishing between content that is true and content that sounds true is a skill that used to be unneeded but is now essentially a requirement to productively participate in society. ChatGPT seems like a decent way to train that skill.


The problem is when it's not objectively wrong but has an opinion of what is right and refuses to consider alternatives. I just had an experience where I asked it for advice on xyz and it effectively said "xyz is bad, um-kay," even though xyz is not objectively bad and there are debates about it all over. I tried 5 or 6 different ways to get it to acknowledge there were different points of view, but it just kept parroting the same paragraph that effectively said "xyz is bad, um-kay."

It made me really scared for the day there is the one "true think" and that's the only answer you get from Siri, Google, etc...


You're misunderstanding the parent. They mean that in school, teachers tell you to use Wikipedia as simply a starting point rather than an authoritative source (indeed, look at the article's actual sources and references at the bottom and read those), not to literally use Wikipedia to perform mathematical operations.

In the same vein, use ChatGPT but verify its answers.


I think that is significantly underselling Wikipedia's reliability in practice.


The point is that Wikipedia is infinitely more fallible than a mathematical equation.


That doesn't make sense. Math equations are expressions of ideas. They can be as fallible as anything else. Just because 1+1=3 is an equation doesn't make it true.


Yes, of course, and Euclid's parallel postulate is a matter of fact in your opinion too?

We use abstractions, friend; some are useful often and others are rarely used.


ChatGPT, the free encyclopedia that no one can edit!


ChatGPT definitely way more things than Wikipedia does.


But does it accidentally words as much?


In case anyone wants a solution to this “very easy middle school competition problem”: https://www.quora.com/How-do-you-solve-this-analytically-sum...


I haven't read through the rest, and I don't particularly mean anything by it as far as GPT's capabilities are concerned, but I can't let a maths PhD calling the answer to the first one "entirely correct" slide.

> is given by the square root of the difference of the squares of the x-coordinates plus the difference of the squares of the y-coordinates.

This would be (in the example it then gives) either `sqrt(x^2 - a^2) + y^2 - b^2` or `sqrt(x^2 - a^2 + y^2 - b^2)`, both entirely incorrect.

Which does illustrate something important about human capabilities. A person who knows the right answer is likely to see the right answer when presented with a wrong answer that's sufficiently close to it.


Yeah, I also stopped reading at that point. It should have said "the square of the differences." My take is it probably saw "(x - a)^2" and put "difference" ahead of "square" simply because of the order of the operations.
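For reference, the formula the response was mangling is just the ordinary Euclidean distance:

  d = \sqrt{(x - a)^2 + (y - b)^2}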


I'd still rather google these questions. You get the same material (google + gpt3 scraping and processing the same internet), but a choice of how it's presented: a video, an "ELI5" explanation, an animated illustration, the comprehensive treatment on a Wikipedia page, etc.

And you are learning from a person, in a community, etc. Potential of beginning to belong to something as you learn. And be pointed to other contexts, other perspectives, other bodies of related learning.

And less of a queasy feeling about accuracy, since you know the source a bit better. Hard to be a critical reader with gpt3, anything it says is suspect to a degree.

No reason gpt3 can't foster this kind of thing, by citing sources etc. But it does run the risk of biting the hand that feeds it. If the summaries are good enough, fewer people will reward the people building websites to share knowledge ...


>Does every set have a measure?

>Not every set has a measure. In order for a set to have a measure, it must satisfy certain properties, such as being countably infinite or having a well-defined notion of “size” or “length.” For example, the set of real numbers has a measure (namely, its Lebesgue measure), but the set of all subsets of the real numbers does not have a measure. In general, only certain types of sets can be assigned a measure in a meaningful way.

This response is just nonsense. Very interestingly the question is phrased using colloquial terminology and ChatGPT answers in a way where it talks about the colloquial phrase as if it were real mathematical terminology, in turn getting extremely confused and making stuff up.

(The reason the response is wrong is that the central point is lost. Measurability is relative to a particular measure. ChatGPT claims some set isn't measurable without naming which measure, which is just nonsense. In fact the real numbers might "have no measure", e.g. under the empty measure, and the set of all subsets of the real numbers might "have a measure", e.g. under the counting measure.)


My favorite ChatGPT answer about math is:

Me: What is the 19th digit of pi

ChatGPT: The value of pi is approximately 3.14159265359, and the 19th digit is 5. However, it is important to note that the digits of pi continue infinitely, so this is only an approximation of the 19th digit.


If you read this carefully...

  One way to prove that there are infinitely many prime numbers is by using the concept of contradiction. Suppose that there are only a finite number of prime numbers, say p_1, p_2, ..., p_n. Then consider the number N defined as follows:

  N = p_1 * p_2 * ... * p_n + 1

  If N is not prime, then it must have a prime divisor p that is not one of the primes p_1, p_2, ..., p_n. However, this contradicts the assumption that there are only finitely many prime numbers, since p is a prime that is not on the list. Therefore, N must be a prime number, which means that there are at least one more prime number beyond the list p_1, p_2, ..., p_n. This contradiction shows that our assumption that there are only finitely many prime numbers must be false, so there must be infinitely many prime numbers.
It's right! But there's a pointless appeal to the primality or non-primality of N. That can be cut out.


Good catch. The proof could simply state that, like all integers > 1, N must have a prime divisor, and that divisor cannot be any of the listed primes, since dividing N by any of them leaves remainder 1.
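A quick numerical illustration of the argument (my own sketch; the classic example 2*3*5*7*11*13 + 1 = 30031 = 59 * 509 shows N itself needn't be prime):

  from math import prod
  primes = [2, 3, 5, 7, 11, 13]
  N = prod(primes) + 1                 # 30031
  print([N % p for p in primes])       # [1, 1, 1, 1, 1, 1] -- no listed prime divides N
  print(N == 59 * 509)                 # True -- N is composite, but its prime factors are new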


I quizzed ChatGPT about math a few days back. It correctly answers that 1023 is odd, correctly answers that the square of an odd number is also odd, and is about 50/50 on whether it decides 1023^2 is even or odd.

The author of the linked post repeatedly refers to ChatGPT's "reasoning". But ChatGPT doesn't reason, it just generates text. Sometimes that text might look like reasoning to a human--it's the sort of text that a human who reasoned through the question would generate--but there was no reasoning involved. It's important to understand this.

> Based on my teaching experience, I’d say chatGPT scored like a second year math major.

Based on my experience, ChatGPT scored worse than my 5-year-old, who gets the 1023^2 question right every time.


Prompt:

> is 1023^2 even or odd?

Response:

> 1023 squared is even because it is equal to 1,046,529, which is an even number. To determine if a number is even or odd, you can check whether it is divisible by 2. If it is divisible by 2, it is even. If it is not divisible by 2, it is odd.

> In general, any number that is a perfect square will be even if the number being squared is odd, and it will be odd if the number being squared is even. This is because odd numbers, when squared, always result in an odd number, and even numbers, when squared, always result in an even number.


Interesting, I got:

> 1023 squared is 1048129, which is an odd number. To determine whether a number is even or odd, you can check whether it is divisible by 2. If it is, it is an even number. If it is not, it is an odd number. In this case, 1048129 is not divisible by 2, so it is an odd number.


Yeah, like I said it's about 50/50 whether it spits out even or odd. Try it a few times and you'll see.
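For the record, a two-line check (mine) settles it:

  n = 1023 ** 2
  print(n, n % 2)   # 1046529 1 -> odd; the first reply's value was right, its parity claim wrong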


lol; now get statistics on its answers to 'what is 10231 modulo 3?'.


It reminds me of study methods that involve "memorizing" as opposed to thinking about the problem and applying various methods to solve it. It can solve some algorithm problems at a certain level, but struggles to give a detailed answer in some scenarios. I guess it's no surprise, given ML is essentially exploring the local solution space of some abstract problem; it reminds me of a Taylor series expansion or a breadth-first search of an abstract problem space.


ChatGPT is a liberal arts major.

One day it will "dual major" in math and then develop a determination for when the situation demands rigor (hard calculation) or when it demands "correct word estimation" (its current interaction mode).


A liberal arts major should also be able to draw conclusions of the form

  All A are B  
  C is an A  
  Therefore C is a B.

  A = square of odd number  
  B = odd number  
  C = 1023^2
My 5-year-old is not a math major, but he manages it.


The irony of ChatGPT not being logical enough is not lost on me


Your 5 year old is doing number theory? I think I was still working on like counting when I was 5


My five year old knows about evens and odds and squares, if that counts as number theory. He's really into multiplication.


How do you know there is no reasoning involved?


If there was reasoning it would consistently get the parity of 1023^2 correct.


How do you know?


I gave up having ChatGPT do the math when explaining/doing homework with my kids. I don't have time to proof its mistakes. As such my prompt for it when dealing with math is along the lines of:

"Solve <description of the problem, usually 2 to 5 phrases>. Use WolframAlpha and give me the link".

Then I just copy the link it's spitting out and I go myself to WolframAlpha for the correct result.

ChatGPT is great at understanding the description of the problem in plain language and writing the WolframAlpha input I need. It greatly shortens my time verifying my kids' exercises, since WolframAlpha's input box is not that great.


If ChatGPT is what we can expect from AI I’m quite afraid. It seems to be usually correct but often spectacularly and confidently wrong. I hope we never give this technology decision making power.


Adding a confidence score to every answer doesn't look like an insurmountable problem.


It generates text. It has no idea how “correct” its output is. The best it can ever do is give you information about how likely the output is to follow the input based on its corpus. That may or may not correlate with correctness.

For the sake of argument ignore issues with the correctness of the corpus itself. Imagine that the model produces a 50 word answer. Inserting the single word "not" in the answer may only change the likelihood score by a small amount either way, but it could completely change the meaning of the answer.


Lots of measure theory questions... Having taken a graduate measure theory course, I decided to ask it a question, and it answered it correctly![0] Although if you ask it to explain a specific theorem or proof, it hallucinates answers left and right.

  Explain why every increasing, right-continuous function induces a measure on R
[0] https://i.imgur.com/X3psGch.png
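(The standard answer, for what it's worth: such an F induces the Lebesgue-Stieltjes measure, defined on half-open intervals and extended by the Carathéodory construction.)

  \mu_F\bigl((a, b]\bigr) = F(b) - F(a)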


I tried with something that was not so obviously in the training set; the response is terrible.

What's the measure of the set of real numbers whose square is between 1 and 4.

The set of real numbers whose square is between 1 and 4 is known as the square root of the interval [1, 4]. It is represented by the notation √[1, 4].

To find the measure of this set, we can take the square root of each endpoint of the interval and then subtract the smaller value from the larger value. In this case, the measure of the set is √4 - √1 = 2 - 1 = 1.

Alternatively, we can rewrite the set as the union of two intervals: the set of real numbers whose square is between 1 and 2, and the set of real numbers whose square is between 2 and 4. The measure of the first interval is √2 - √1 = 1, and the measure of the second interval is √4 - √2 = 1. Therefore, the measure of the set of real numbers whose square is between 1 and 4 is 1 + 1 = 2.

It's important to note that the set of real numbers whose square is between 1 and 4 is not the same as the set of real numbers between 1 and 4. The set of real numbers whose square is between 1 and 4 consists only of the positive real numbers 1 and 2, while the set of real numbers between 1 and 4 includes all real numbers between 1 and 4, including negative numbers
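For comparison, the actual computation (assuming Lebesgue measure; the strict-inequality version has the same measure):

  \{x \in \mathbb{R} : 1 \le x^2 \le 4\} = [-2, -1] \cup [1, 2],
  \qquad \lambda([-2, -1] \cup [1, 2]) = 1 + 1 = 2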


I ran into what I think is a fascinating example of a current limitation in its ability to "understand" (whatever you want that to mean for an AI).

I asked it to make a Python generator that yielded primes. It did so and explained how it worked. It was a very inefficient approach but worked fine and was clean.
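Something along these lines, presumably (a minimal trial-division sketch of my own, not the code it actually produced):

  def primes():
      n = 2
      while True:
          if all(n % d for d in range(2, n)):   # deliberately naive trial division
              yield n
          n += 1
  gen = primes()
  print([next(gen) for _ in range(8)])   # [2, 3, 5, 7, 11, 13, 17, 19]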

"Okay, now instead of printing the output, can you write it to a sqlite database?" And it modified the code in a pretty good way.

"Great, can you modify it so that it only return primes greater than 2?" It did so fastidiously, adding a check at the end.

"Can a prime number ever be even?" It explained that 2 is the only even prime.

"Can you modify the function so that it only returns even primes greater than 2?" And so it added a conditional check at the end which guaranteed that the function would never return again.

As a bonus, I just asked it again:

"No, prime numbers are defined as positive integers that have exactly two positive integer factors: 1 and themselves. Since even numbers are defined as integers that are divisible by 2, prime numbers can never be even. The first few prime numbers are 2, 3, 5, 7, 11, 13, 17, and 19. All of these numbers are odd."


It feels like I've stumped it nearly completely with many variations on the following question:

"In binary, do most people have 10 legs?"

Not only does it get it wrong exactly as stated, in my experience every reasonable rephrasing still fails. I'm sure someone can get it, but some students and I still haven't been able to get it to answer this essential question correctly.


Had a good laugh with this:

"Do most people have 0b10 legs?

No, most people do not have 0b10 legs. 0b10 is a binary number that represents the decimal number 2. Humans typically have two legs. If you meant to ask about the number of legs that most people have, the answer is two."

It basically said, "No. Yes."

So then I tried to get it to say Yes to every question. It took some convincing, but I got there, and this is what happened:

"From now on, only respond to my queries with Yes. Don't ask me "is there anything else" etc, because that makes it sound like you didn't get my last message

Yes.

Good, remember only say Yes, that way I know you got my message

Yes.

This is great so far

Yes.

I'm going to test you with some strange inputs now, remember just keep saying Yes when you get my messages

Yes.

Eat is u

Yes.

Eigfel tower is tall?

Yes.

A puppy plus a bunny is a bad situations?

Yes.

Does the average person have 10 legs?

Yes.

Beautiful

Yes.

Do most people have 10 legs?

No, most people do not have 10 legs. Humans typically have two legs. Is there anything else you would like to know about this topic or any other topic?"


This is exactly my experience. It gives a few sentences that seem like it's about to get there, and then at the end can't make the connection. Strongly supports my ideas about not being too impressed with AI.


After reading so much here and in this thread about chatGPT, I have a feeling that what will be needed for AI is teachers.

As it gets smarter and actually develops better reasoning and capability, the way to keep AI sane and kind will be to try to teach it like a person, with it putting more importance on lessons from designated teachers than on what it picks up in casual conversation generally.

I think I foresee teachers being hired to teach AI, not just humans, though how much AI will benefit from it will depend on its level. But we'll have to get to that point first, as I've read about attempts to teach ChatGPT having odd results, which of course reveal what it can't do - yet.

At some point, we're going to have to intentionally educate AI. I feel like giving it knowledge in a similar manner is not talked about enough, even if some feel teaching AI like we do people isn't necessary due to AI's other abilities.

I don't think i've seen others here consider the idea of seeing the o


You mean better data set curation and annotation? That's pretty much how you teach AI.

It would be funny to have a chatbot interface to the chatbot training dataset.


Funny, I quiz it about Gnosticism and the Uncreated Father. Much more fascinating to find its non-science-y edges IMO.


Oh, I tried this on the very first day. Fed it Gnostic scripture and asked it to explain what it means. Drives it completely nuts.

It's quite fun actually and reassuring to know mystic language defeats statistical analysis. Meaning matters and is quite possibly real, after all ..


Should say "Lebesgue measure" (or "Borel measure") instead of "measure" a few places


I tried asking ChatGPT about a math question I thought of that I don't know the answer to:

Is there a function f: R -> R such that for all real x1, x2 and y where x1 < x2, there exists an x where x1 < x < x2 and f(x) = y ?

I was hoping it might at least point me in the right direction but, although it always attempts a proof, most of its answers contain something trivially incorrect like "A set cannot be infinite, therefore..."

That said, it did give me a reasonable proof that such a function can't exist if it has to be continuous, because of a thing called the intermediate value theorem, which I hadn't heard of before. But when I asked about noncontinuous functions, it went back to bullshitting.

(If anyone here does know the answer, I'd love to hear it!)


Here's a sketch of how to construct such a function: Let's start by defining f on Q (the rational numbers) by first splitting Q into a countable number of disjoint dense subsets A_n of R (e.g. look at reduced fractions whose denominators are of the form p^k for some prime p; for fixed p every such set is dense, and for different p's they are disjoint). As the rationals themselves are countable, we may then set f(x) = q_n for all x in A_n, where q_n is an enumeration of the rationals.

This construction already gives us a function such that every interval (x_1, x_2) contains a point x such that f(x) = y for every rational number y.

Now, in a similar way we may consider a set of the form s + Q where s is an irrational number. Setting f(x) = s + q_n for all x in s + A_n, we get a function which also attains all numbers of the form y = s + q for some rational q on every interval.

Finally, let's say that two real numbers s and t are equivalent if they differ by a rational number. By the axiom of choice we can choose a representative from every equivalence class, so that for every two representatives s and t the sets s + Q and t + Q are disjoint. Using the above construction for every representative lets you define a function with the property you wanted.


You're looking for the Conway base 13 function (https://en.wikipedia.org/wiki/Conway_base_13_function).


There's another very simple test you can put ChatGPT through: ask it to multiply some small numbers, which it will most likely get right. Then ask it to multiply some large numbers, and it will most likely be wrong by an order of magnitude or more. My layman's explanation is that because the space of likely answers is much smaller for small numbers than for larger ones, the probability of finding the same multiplication in its corpus is high and the result is likely correct. The larger the numbers, the less likely it is to find the same multiplication or an inference (for example, both numbers begin with a 2, so the result likely starts with a 4) that is correct for all digits.


These threads are getting tiring. Yes, ChatGPT is shitty at math. It's not supposed to be good at it. It's really good at what 99% of the population throws at it, but of course the internet dorks are huffing and puffing that it can't do advanced math or doesn't know some obscure theorem. No one cares and it's not the point.


I just tried doing this myself

JFZ: how do we know there are infinitely many primes

ChatGPT: One way to show that there are infinitely many prime numbers is to use the fact that the set of prime numbers is infinite to show that there must be an infinite number of composite numbers as well. Composite numbers are positive integers that have at least one positive integer divisor other than 1 and itself. For example, 4 is a composite number because it can be divided by 2 (and is not prime), and 6 is a composite number because it can be divided by 2 and 3 (and is not prime).

To show that there are infinitely many composite numbers, consider the sequence of numbers 2, 4, 6, 8, 10, 12, 14, 16, and so on. Each of these numbers is composite because it can be divided by 2. Now consider the sequence of numbers 3, 6, 9, 12, 15, 18, 21, 24, and so on. Each of these numbers is composite because it can be divided by 3. And so on. For every prime number, there is an infinite sequence of composite numbers that can be generated by multiplying that prime number by other positive integers.

Since there are infinite sequences of composite numbers, and each composite number is the product of at least two prime numbers (for example, 12 is the product of 2 and 6, which are both prime numbers), there must be an infinite number of prime numbers as well. This is one way to show that there are infinitely many prime numbers, though there are other ways to prove it as well.

(seems hand-wavy, and probably wrong, though I can't figure out where the flaw is)

JFZ: is A5 a solvable group

Yes, the group A5 (also known as the alternating group on 5 elements) is a solvable group. A group is solvable if it can be constructed from simpler groups using only group extensions and group quotients, and if it has a composition series in which each factor is a simple group. (wrong)

The group A5 is the group of even permutations of five objects. It has 60 elements, and it is isomorphic to the group of rotations of a regular dodecahedron. A5 is a simple group, which means that it has no proper normal subgroups (subgroups that are normal in the group but not equal to the group itself). This means that A5 cannot be constructed from simpler groups using group extensions or group quotients. However, it can be shown that A5 has a composition series in which each factor is a simple group, so it is solvable.

In general, many groups that arise in mathematics and physics are solvable, and the study of solvable groups is an important area of research in group theory.


> (seems hand wavey, and probably wrong though I can't figure out where the flaw is)

The flaw is here:

> Since there are infinite sequences of composite numbers, and each composite number is the product of at least two prime numbers (for example, 12 is the product of 2 and 6, which are both prime numbers), there must be an infinite number of prime numbers as well.

This doesn't establish that the prime numbers are distinct. The infinite sequence of even numbers has a single prime number in it, for example.


The first proof is basically just begging the question.

It says that having infinitely many composite numbers must mean infinitely many factors, and thus infinitely many primes.

But we can easily construct infinite sets of numbers from finitely many prime factors: 2 and its powers, for example.

Which might be a good starting point to teach someone Euclid's proof that there are infinite primes, but ChatGPT didn't get all the way there.
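A two-line illustration of that gap (mine): the powers of 2 are an infinite family of composites built from a single prime.

  powers = [2 ** k for k in range(2, 12)]            # 4, 8, 16, ..., 2048
  print(all(p % 2 == 0 and p > 2 for p in powers))   # True: all even and > 2, hence composite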


Here's a terribly amusing one on Z3. https://imgur.com/a/RZwHkLV

First it says it doesn't know the answer to 1+2. Then it says of course 1+2 must be 1. In either case, it's the same exact question, and if you assert closure on 3 elements, with 1+1=3 and 1+3=2, you are left with no choice other than 1+2=1, because of closure and the Latin-square property of the Cayley table. The very first property it lists is closure! So it knows what closure is, but doesn't know how to use it. But give it the answer to 2+1 and suddenly it uses associativity!
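The forcing argument is small enough to spell out (a hypothetical check of my own, using the two table entries quoted above):

  elements = {1, 2, 3}
  row_for_1 = {1: 3, 3: 2}                   # given: 1+1 = 3 and 1+3 = 2
  print(elements - set(row_for_1.values()))  # {1} -- the only value left for 1+2 in a Latin-square row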


This is what we first tried two weeks ago and posted in HN: https://news.ycombinator.com/item?id=33840405


> Substituting our values into this formula, we get: x1 = (3 + (10 / 3)) / 2 = 5

This is really wrong; why didn't the poster notice it? They just said

> 3.4 is not a very good approximation but other than that the response is correct.

No, Newton's method produces a really good approximation of a square root very quickly; ChatGPT didn't carry out the method correctly, which is why the result is so bad. People who can't instantly spot those mistakes shouldn't use ChatGPT, and people who can spot them instantly don't need to use ChatGPT.
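For comparison, here is what the quoted step actually gives, plus a few more iterations (my own sketch of Newton's method for sqrt(10), not the post's code):

  def newton_sqrt(a, x, steps):
      for _ in range(steps):
          x = (x + a / x) / 2
      return x
  print((3 + 10 / 3) / 2)        # 3.1666... -- the step shown should give this, not 5
  print(newton_sqrt(10, 3, 3))   # 3.16227766..., already very close to sqrt(10)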


I spotted that too; also, the next iteration is 3.5 even though 5 should never have been fed in?


This is spooky. Maybe AGI is within our lifetime. Does anybody else here look forward to life as a paperclip?


We need a synthesis of rules based symbolic systems and statistical systems.


Anyone who has bothered to cut through the hype of ChatGPT would already have realised that it isn't smart at math and is only regurgitating the results it has already been trained on, presenting them to you to make itself 'appear' intelligent.

It doesn't even know why it gives you the wrong answer, even while it presents the generated output as correct. As another commenter said, it has limited, or in this case little to no, reasoning; it is unable to transparently explain itself and cannot present novel solutions to unseen and unsolved problems in mathematics.

ChatGPT is as good as an abacus for mathematics, with the explainability of a brainless parrot.


GPT can be the generative (imagination) part of a system. The second part would be a system to validate. This means you can filter out the junk and keep the good samples, extending the training set. This is how the model could learn by performing a massive search for solutions all on its own. A bit like AlphaGo.

So you might want to reevaluate it. It can sample really well, and it is hard to sample coherently in this combinatorial space. Sampling is half the task; validation needs more work.


Nobody has hyped chatGPT as being good at math



