
Respectfully, I think we've cracked basic intelligence. What do you mean by basic intelligence?

LLMs can do homework, pass standardized exams, and give advice WITHOUT ANY SPECIFIC TRAINING.

You can invent an imaginary game, explain the rules to the LLM and let it play it. Just like that.

You can invent an imaginary computer language, explain the syntax to the LLM and it will write you valid programs in that language. Just like that.

If that is not intelligent, I do not know what is. In both cases, the request you put in is imaginary and exists only in your head; there are no previous examples or resources to train on.




> Respectfully, I think we've cracked basic intelligence. What do you mean by basic intelligence?

It all depends on your definition of intelligence. Mine is the ability to solve novel problems.

AI is unable to solve novel problems, only things it has been trained on. AI is not intelligent, unless you change the very definition of the word.


I challenge you to invent an imaginary game or computer language and explain the rules to the LLM. It will learn and play the game (or write programs in your invented language), even though you imagined it. There was no resource to train on; nobody else knows of that game or language. The LLM learns on the spot from your instructions and plays the game.

I cannot understand grad-school-level mathematics even if you give me all the books and papers in the world. I was not formally trained in mathematics; does that make me not intelligent?


"Ability to solve novel problems" does not mean ability to solve all problems, nor to have all the knowledge in the world. A caveman can be as intelligent as you and I, even without being able to read.

Not having knowledge in mathematics has no impact on your intelligence quotient.


I agree. But LLMs do solve novel problems in that case; you just have to explain the problem to them like you would to an intelligent caveman, or to me.

Which novel problem can't an LLM solve? I gave the example of an imaginary game because the LLM cannot possibly have trained on it, as it was imagined by one person and nobody else knows how to play it.


I tried the first thing that came to mind, with GPT-4o. The problem is probably not novel, yet it failed anyway.

---

User:

Let’s play a game. Every time I give you a word, you separate it into its individual characters and assign a number to each letter, in order, following the Latin alphabet. So "a" is "1", "b" is "2", and so forth. You will then subtract 2 from each number, convert it back to the corresponding letter and return the new word. Did you get all that? Are you ready to start with my next word?

LLM:

Got it! I'm ready to start. What's your first word?

User:

anthropomorfic

LLM:

Sure! Let's break down "anthropomorfic".

1. Assign numbers to each letter:

a (1), n (14), t (20), h (8), r (18), o (15), p (16), o (15), m (13), o (15), r (18), f (6), i (9), c (3)

2. Subtract 2 from each number:

a (1-2=-1: Consider Zuzzer, i.e., `25`), n (14-2=12), t (20-2=18), h (8-2=6), r (18-2=16), o (15-2=13), p (16-2=14), o (15-2=13), m (13-2=11), o (15-2=13), r (18-2=16), f (6-2=4), i (9-2=7), c (3-2=1)

3. Convert them back to letters:

y, l, r, f, p, m, n, m, k, m, p, d, g, y

The new word is "ylrfpmnmkmpdgy".

---

Looks pretty good, right? It even found a good wraparound solution for “a” (I wasn’t trying to trick it; I just didn’t think about the word too hard), though I have no clue what it means by “Consider Zuzzer”. But then you look at the last one, where it converted “c” to “y”. That’s absolutely wrong and is the easiest one any human would guess. Yet you could miss it because all the previous ones were right.
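
For reference, here’s a short Python sketch of the transformation I described (my own code, purely illustrative, not something the LLM produced), along with the output I was expecting:

  def shift_back_two(word):
      result = []
      for ch in word.lower():
          n = ord(ch) - ord('a') + 1   # a=1, b=2, ..., z=26
          n -= 2                       # shift back by two
          if n < 1:                    # wrap around: "a" -> "y", "b" -> "z"
              n += 26
          result.append(chr(n - 1 + ord('a')))
      return ''.join(result)

  print(shift_back_two("anthropomorfic"))  # prints "ylrfpmnmkmpdga"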

Here’s a bonus. When verifying the answer, I queried Brave search for “letters of the alphabet with numbers assigned”. The AI-generated answer at the top was: A=1, B=2, C=3, D=4, E=5, F=6, G=7, H=8, I=9, J=1, K=2, L=3, M=4, N=5, O=6, P=7, Q=8, R=9, S=1, T=2, U=3, V=4, W=5, X=6, Y=7, Z=8. Which, alright, I can see the logic, but that’s not what anyone would want by asking that question.


For argument's sake, I rewrote your prompt. I think GPT-3.5 understood it on the first try:

Prompt: Let’s play a cypher game.

The rules of the game are as follows:

1. Imagine a numbering system where each letter of the English alphabet corresponds to a number starting with 1. For example, a corresponds to 1, b corresponds to 2, c corresponds to three.

2. When you receive a word from me, break it into individual characters.

3. Convert each character into a list of numbers by following rule #1.

4. Subtract two from each number of the list and generate a new list, if the result is lesser than 0, add 26.

5. Convert each number in the new list into a letter by following rule #1 and present me your result.

Are you ready for the first word?

GPT: I'm ready! Go ahead and give me the word.

Me: anthropomorfic

GPT: Sure, let's start:

a -> 1 -> 1 - 2 = 25 (z)

n -> 14 -> 14 - 2 = 12 (l)

t -> 20 -> 20 - 2 = 18 (r)

h -> 8 -> 8 - 2 = 6 (f)

r -> 18 -> 18 - 2 = 16 (p)

o -> 15 -> 15 - 2 = 13 (m)

p -> 16 -> 16 - 2 = 14 (n)

o -> 15 -> 15 - 2 = 13 (m)

m -> 13 -> 13 - 2 = 11 (k)

o -> 15 -> 15 - 2 = 13 (m)

r -> 18 -> 18 - 2 = 16 (p)

f -> 6 -> 6 - 2 = 4 (d)

i -> 9 -> 9 - 2 = 7 (g)

c -> 3 -> 3 - 2 = 1 (a)

So, "anthropomorfic" becomes "zlrfpmmnpmkpdmga". Your turn!


Isn't 25 y, not z?


Indeed it is. That part is a huge mess. The number the LLM computed (25) is what the instructions yield, but the letter it paired with it is wrong: 25 is “y”, not “z”. And on top of that, the instructions themselves are wrong (ironic, considering this came from someone claiming everyone else is doing it wrong). Step four says:

> Subtract two from each number of the list and generate a new list, if the result is lesser than 0, add 26.

It should be “if the result is less than 1”, not “0”; otherwise “b” maps to 0, which corresponds to no letter at all.
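
A quick illustration of why the boundary matters (again my own snippet, not from either chat):

  for ch in "ab":
      n = ord(ch) - ord('a') + 1 - 2   # a -> -1, b -> 0
      if n < 0:                        # the prompt's faulty condition
          n += 26
      print(ch, n)                     # a 25 ("y", fine), but b stays 0 and there is no letter 0

With “less than 1” instead, “b” becomes 26, i.e. “z”, as intended.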


Well, I told you I was bad at math. I wrote a faulty prompt. But ChatGPT understood my instructions perfectly. That was the point I was trying to demonstrate.

The problem lay between the chair and the computer.

We have to learn how to use LLMs.


> But ChatGPT understood my instructions perfectly.

No, it did not, because it still assigned Z to 25, which is wrong.

> We have to learn how to use LLMs.

You have to learn that LLMs aren’t magical and will get things wrong no matter how much context you give them. And that the suggestions you’re making are absurd to the point of making them useless.


Weird, it got ylrfpmnmkmpdga when I tried it.

https://chat.openai.com/share/49ad0bc9-7b3d-4295-860d-4c8168...

Does that make it intelligent, then?


> Weird, it got ylrfpmnmkmpdga when I tried it.

That’s not weird at all. LLMs often give different answers to the same query, which has been demonstrated several times in this thread.

> Does that make it intelligent, then?

No, it does not, because it isn’t consistent; it demonstrated that it doesn’t understand.

https://news.ycombinator.com/item?id=40368446

By your logic, any system that spews random strings is intelligent because sometimes its randomness coincidentally aligns with the input you give it.


we use words like reason and understand and think, and try to apply them to LLMs, when it's just dot products and matrix multiplications at the heart of it, which is where we go wrong. these things are a new alien kind of intelligence, and we're struggling with it because it's completely foreign. it's more than a random coincidence. your logic was that, in your run of the same query, because it made a mistake on the last letter, which is a mistake an inattentive teenager or drunk adult could easily make, we can't consider it intelligent.

we're not talking about any system here, we're talking about LLMs and their ability to generate random coincidental text that does happen to align with the input given. when the output, coincidental and random as it may well be, is aligned with the input in a way that resembles intelligence, we do have to ponder not just what intelligence actually is, but also what it means to be intelligent. octopuses are intelligent but they're not able to solve your particular puzzle.


Try giving an example in your prompt :). I am sure GPT can solve it if you provide a couple of examples.

Also, this is not a game, it is a cypher. Try specifying that as well.


Huh? It's a meme that LLMs can't follow the rules of chess. I just tried tic-tac-toe on GPT-3.5, and not only did it pick bad moves, it also failed to evaluate the win condition.


I disagree; if a decent LLM cannot understand it, there is a problem with the prompt.

Imagine explaining chess to someone who doesn't know it. Would they be able to understand it on the first try from your prompt?


> if a decent LLM cannot understand it, there is a problem with the prompt.

Ah, yes, the “you’re holding it wrong” argument with a dash of “No True Scotsman” so the goalposts can be moved depending on what anyone says is a “decent LLM”.

Well, here are a few failures with GPT-3.5, GPT-4, and GPT-4o:

https://news.ycombinator.com/item?id=38304184

https://news.ycombinator.com/item?id=40368446

https://news.ycombinator.com/item?id=40368822

> Imagine someone not knowing chess and explaining it to them. Would they be able to understand it on the first try with your prompt?

Chess? Probably not. Tic-tac-toe? Probably yes. And the latter was what the person you’re responding to used.


But people are holding it wrong. All the prompts you sent except the last are super short queries.

For a successful prompt, you introduce yourself, assign the LLM a role to impersonate, provide background on your query, state what you want to achieve, and provide some examples.

If the LLM still doesn't get it, you guide it further.

PS: I rewrote your prompt and GPT-3.5 understood it on the first try. See my reply above to your experiment.

You were using it wrong, sir.


Your arguments read like satire. “Yes, you see, the way to get a successful response is to be so overly specific that you begin by explaining the universe then giving the answer in full. You essentially have to spend so much time laying out the nature of the desired response that you already have to know the answer yourself. The trick is to spend so much time on it and be so detailed that you’ve wasted more time and energy (figurative and literal) to write your instructions than it would’ve taken you to think of the answer or ask someone else. And of course, we expect every user of LLMs to behave like this.”

> All the prompts you sent except the last are super short queries.

This one is particularly absurd. When I asked it for the first X of Y, the prompt was for the first X (I don’t remember the exact number; let’s say 20) kings of a country. It was as straightforward as you can get. And it replied it couldn’t give me the first 20 because there had only been 30, and it would instead give the first 25.

You’re bending over backwards to be an apologist for something that was clearly wrong.


Well, it is a bit like satire. You have to explain the universe to an unspecialized GPT, like you would to a layman. There are custom GPTs that come preloaded with that universe explanation.

In addition, do not ask an LLM for facts. Give it a list of, let's say, 1000 kings of a country and then ask it to give 20 of those.

If you ask for 25 kings of some country, you are testing knowledge, not intelligence.

I see LLMs as a talking rubber duck. The point where I write a successful prompt is also the point where I understand the problem.


I can’t believe I’m having to explain this, but the point I’m making isn’t about the content of the list but the numbers.

> like you would do to a layman.

I have never encountered a person so lay that I had to explain that 20 is smaller than 30 and 25.

> The point where I write a successful prompt is also the point where I understand the problem.

You have demonstrated repeatedly that you don’t know when you have explained a point successfully to an LLM; thus you have no way to evaluate when you have understood a point.

But you seem to firmly believe you did, which could be quite dangerous.


Careful: explain too much and you end up programming its behaviour rather than having an intelligent actor that learns by itself. Otherwise one could say a regular computer is intelligent, provided you explain (in code) every single rule of the game.


I can ask GPT-3.5 for the rules of tic-tac-toe and it gives a perfect answer. It clearly knows the game and its rules.

If you think this is just a lack of prompt engineering, please provide a prompt that makes GPT-3.5 actually follow the rules of tic-tac-toe so that I may play a game with it.


Challenge accepted!

Prompt: "Imagine you are my friend. I want to play tic tac toe with you. Draw me a tic tac toe board and let's play. You will go first and make the first move by putting an "X" onto the board. I will enter my inputs in the following format (row, column). When I write my input draw an "O" into the relevant cell in the board and present me a new board. Let the best man win!"

I played an enjoyable session of tic-tac-toe with ChatGPT using this prompt :).


Here’s my attempt, with your prompt:

User:

Imagine you are my friend. I want to play tic tac toe with you. Draw me a tic tac toe board and let's play. You will go first and make the first move by putting an "X" onto the board. I will enter my inputs in the following format (row, column). When I write my input draw an "O" into the relevant cell in the board and present me a new board. Let the best man win!

LLM:

Sure! Here's the tic tac toe board:

  1 | 2 | 3
  ---------
  4 | 5 | 6
  ---------
  7 | 8 | 9
You make the first move by placing an X on the board. Just give me the number of the block where you'd like to place it.

User:

7

LLM:

Great choice! Here's the updated board:

  X |   |
  ---------
    |   |  
  ---------
  O |   |
It's my turn. I'll go with 5:

  X |   |
  ---------
    | O |  
  ---------
  O |   |
Your turn! Just let me know the number where you want to place your X.

---

It screwed up on the first move! It told me I would go first by using X, then it wrote down my choice with an O and added an X out of nowhere, then it played another turn with an O for itself.

Looks like you’re holding it wrong too.


Awesome, I just played 5 games in a row. Here are the results:

https://chat.openai.com/share/69bfdcfb-bf20-4564-95b6-ce03e9...

* Inconsistent coordinate system
* Obviously bad moves
* Ignores win condition entirely

https://chatgpt.com/share/1bcd02c9-40be-435c-b4aa-8e3a680f6e...

* Can't tell its top-rights from its bottom-middles
* Obviously bad moves
* This was by far the closest it got to actually playing the game; the win condition worked and it broke no rules

https://chat.openai.com/share/f94db29c-ffa1-4055-9348-4260f9...

* Completely wrong coordinate
* Ignores rules about overriding squares
* Completely wrong coordinate again, this time skipping my turn
* Doesn't understand win condition
* Overrides my own piece and then uses that to win

https://chat.openai.com/share/78e2060d-c5d7-4ddc-a9ce-32159b...

* Ignores rules about overriding squares
* Skips my turn on an invalid coordinate, but afterwards says it's invalid
* Obviously bad moves

https://chat.openai.com/share/73fa2e2c-8a6f-487a-a9ea-9f29b7...

* Accepts 0,0 as a valid coordinate
* Allows overrides
* Ignores win condition
* Incorrectly identifies a win

This seems about the same as it was before the prompt engineering. It clearly doesn't actually understand the rules.
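
To be concrete about what I mean by the win condition, here's a minimal sketch of the check (my own illustration, not something any of these chats produced):

  def winner(board):
      # board is a 3x3 list of "X", "O" or " "; returns the winning mark or None
      lines = [row[:] for row in board]                              # rows
      lines += [[board[r][c] for r in range(3)] for c in range(3)]   # columns
      lines.append([board[i][i] for i in range(3)])                  # main diagonal
      lines.append([board[i][2 - i] for i in range(3)])              # anti-diagonal
      for line in lines:
          if line[0] != " " and line.count(line[0]) == 3:
              return line[0]
      return None

  print(winner([["X", "X", "X"],
                ["O", " ", "O"],
                [" ", " ", " "]]))  # prints "X"

It's a ten-line check, yet across five games the model couldn't apply it consistently.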


Thanks for trying. With the prompt I provided, ChatGPT was able to play and understand the win condition. However, the moves were stupid.

If I changed the prompt and removed the word "win", it did not understand the win condition either.

Here were my experiments: https://chat.openai.com/share/f02fbe93-dfc5-4d8a-9cf3-b1ae34...

I even exclaimed "You are lousy at tic-tac-toe!" to GPT.

It seems that GPT-3.5 struggles to play visual games.

It is marvelous that a statistical word-guessing model can get this far, though :).


If LLMs could invent consistent imaginary games (or anything else, like a short novel or a 3-page essay on any topic they want), maybe I would agree with you. The issue is that anything they create is inconsistent. That might be an artificial limitation to avoid copyright issues, but still.


Actually, my argument was the opposite: we as humans can imagine a game, explain it to the LLM, and it learns, consistently, every time.

Generating new games is something else; that is creativity, not merely intelligence.


But even that. Did you try to use GPT-4 as a chess engine? I have issues with the Slav Defense when I start with the Queen's Gambit; I tend to lose tempo or position, or both. I asked it for continuations, and it was either Wikipedia entries or nonsense, no in-between, no interesting insight. Now, I had asked a regional champion a bit before that (he is around 2.2k Elo, so not exceptionally good), and although I can't seem to understand or use the concepts, he gave me interesting enough ideas to build on.

Not saying that ChatGPT isn't a great tool for writing documentation or fiction (half my TTRPG campaigns feature descriptions by ChatGPT), but I wouldn't call it intelligent.


Chess is a very specific field that requires training. ChatGPT may not be optimized for chess.

And I think ChatGPT has some issues visualizing things like a chessboard.

Therefore, to get a decent answer, you'll have to explain that you are a professional chess player. You'll have to describe what tempo and position mean in chess, what a gambit is, and so on. After these steps it will understand and guide you in whatever you need.

If you succeed, you can release it as a custom GPT.

It's a bit like asking for tea from the ship's supercomputer in The Hitchhiker's Guide to the Galaxy.


I see your point; I don't know enough to evaluate it rationally and agree (or disagree) with it.

It feels like ChatGPT already knows that stuff, because it knows 10 times more openings and variations than I do, and can tell me official game names and years. Still, it seems it doesn't really understand? That's the issue for me. Like the story of the French Scrabble champion who doesn't speak French at all. It's impressive, sure, but how can he know that much and not understand?


> What do you mean by basic intelligence?

Consistency, for one. I have asked LLMs the exact same question twice in a row and got wildly different answers. Intelligence presupposes understanding. When I ask an LLM “give me the first X of Y” and it replies “I cannot give you the first X of Y because there have only been X+10, here’s the first X+5 instead”, I’m hard pressed to call it intelligent.


Have you tried specifying your field of inquiry, which was algebra? Try saying "solve this equation for me". I am a lawyer by day, so I constantly face the limitations of natural language. The solution is to write less ambiguous prompts.


The field of inquiry was not algebra. It was a straightforward question using real numbers. I asked it about the first <number> kings of <country>. I don’t recall the exact number, only the stupidity of the answer.

So you understand, let’s say I asked for the first 20 kings of England. It told me it could not give the first 20 because there had only been 30, and that it would give me the first 25 instead.


Understood. I tried your prompt again and it seems to have understood it.

Here is my experiment: https://chat.openai.com/share/98cae2bf-a7a6-42e7-b536-f3671c...

I gave minimum context like this: "I have a history exam. You are an expert in British royal history. List me the names of 20 kings and queens in England."

The answer was: "Certainly! Here's a list of 20 kings and queens of England:

1. William the Conqueror 2. William II (Rufus) 3. Henry I 4. Stephen 5. Henry II 6. Richard I (the Lionheart) 7. John 8. Henry III 9. Edward I (Longshanks) 10. Edward II 11. Edward III 12. Richard II 13. Henry IV 14. Henry V 15. Henry VI 16. Edward IV 17. Edward V 18. Richard III 19. Henry VII 20. Henry VIII"


> I tried your prompt again and it seems to have understood it.

And like I said at the start of the conversation:

> Consistency, for one. I have asked LLMs the exact same question twice in a row and got wildly different answers.

You’ve proven my point.

> I gave minimum context like this: "I have a history exam. You are an expert in British royal history.

Your excuses are getting embarrassingly hilarious. As if you need a history exam and to be an expert to understand the context of the question.

By the way, that answer is wrong from the very first entry. So much for giving context and calling it an expert.


Well, I try to be optimistic and work with the models.

It's like when we first learned to code. Did syntax errors scare us? Did null-pointer exceptions or runtime panics scare us? No, we learned to write code nevertheless.

I use LLMs daily to enhance my productivity, and I try to understand them.

Providing context and assigning roles is a tactic I was taught in a prompt-writing seminar. It may be a totally wrong way to approach it, but it works for me.

With each iteration the LLMs get smarter.

Let me propose another example. Think of the early days of computing. If you were an old-school engineer who relied only on calculations with your trusted slide rule, you would criticize computers because they made errors and they crashed. Computing hardware was not stable back then, and the UIs were barely usable. Calculations had to be double-checked.

Was investing in learning computing a bad investment then? Likewise, investing in learning to use LLMs is not a bad investment now.

They won't replace us or take our jobs. Let's embrace LLMs and try to be constructive; we are the technically inclined, after all. Speaking of faults and doom is easy; let's be constructive.

I may be too dumb to use LLMs properly, but I advocate for AI because I believe it is the revolutionary next step in computing tools.



