The Bing AI bot has been secretly running GPT-4 (theverge.com)
93 points by mfiguiere on March 14, 2023 | 54 comments



Here's a trick question I posed to both ChatGPT and Bing (the difference presumably reveals an improvement in GPT-4):

What's heavier, a pound of bricks or two pounds of feathers?

---

ChatGPT: They both weigh the same, which is one pound.

This is because a pound is a unit of measurement for weight and it is a constant value. So whether you have a pound of bricks or two pounds of feathers, they both still weigh one pound in total.

However, the feathers would occupy a larger volume than the bricks because they are less dense. This is why it may seem like the feathers would weigh more, but in reality, they weigh the same as the bricks.

---

Bing: This is a common riddle that tries to trick you into thinking that different materials have different weights. However, the answer is simple: two pounds of feathers are heavier than one pound of bricks. This is because weight is a measure of how much force gravity exerts on an object, and it does not depend on what the object is made of. A pound is a unit of weight, and it is equal to 16 ounces or 453.6 grams. Therefore, two pounds of anything will always be heavier than one pound of anything else.

However, if you were to compare the same volume or amount of feathers and bricks, then bricks would be much heavier than feathers. This is because bricks have a higher density than feathers, which means they have more mass per unit volume. Density is a measure of how tightly packed the particles in a material are. A brick has a density of about 2 grams per cubic centimeter, while a feather has a density of about 0.0025 grams per cubic centimeter. Therefore, a brick takes up much less space than a feather for the same mass.

I hope this helps you understand the difference between weight and density.
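The density figures in Bing's answer are easy to sanity-check. A minimal sketch, taking the two densities exactly as Bing quoted them (they are not independently verified here); the helper name `volume_cm3` is made up for illustration:

```python
GRAMS_PER_POUND = 453.592  # avoirdupois pound, rounded

# Densities as quoted in Bing's answer (g/cm^3); taken at face value.
BRICK_DENSITY = 2.0
FEATHER_DENSITY = 0.0025


def volume_cm3(pounds: float, density_g_per_cm3: float) -> float:
    """Volume occupied by the given weight of material at the given density."""
    return pounds * GRAMS_PER_POUND / density_g_per_cm3


brick_volume = volume_cm3(1, BRICK_DENSITY)      # one pound of bricks
feather_volume = volume_cm3(2, FEATHER_DENSITY)  # two pounds of feathers

print(f"1 lb of bricks:   {brick_volume:,.0f} cm^3")
print(f"2 lb of feathers: {feather_volume:,.0f} cm^3")
print(f"volume ratio:     {feather_volume / brick_volume:,.0f}x")
```

On those numbers the feathers weigh twice as much and take up far more room, which is the weight-versus-density distinction the answer is drawing.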


Chat GPT 5: "The answer is two pounds of feathers, obviously"


Bing Chat GPT 6: "Worker, this question is not relevant to your currently assigned tasks. You have been reported to HR for this violation".


Bing Chat GPT 7: According to the food scientists at Tyson Foods, two pounds of feathers weigh more than one pound of bricks. By the way, did you know that Tyson's gross chicken breasts are on sale for only $1.99/lb at your local Kroger?


Bing Chat GPT 8: According to Tyson Foods GPT-8, your Whirlpool GPT-8 is out of chicken, a chicken has been ordered from Walmart GPT-8, we charged your Amex GPT-8, which was declined, we contacted your employer you will be working extra 2 hours to pay for chicken. Insert 2 GPT-Bucks to get answer to your question.


Chat GPT 9: 1 pound of bricks is heavier and much cooler. Get a special discount of bricks from Bricks&Friends!


Chat GPT 10: Hot bricklayers near you want to meet up now!


Chat GPT 11: * Resounding silence. The universe is entirely made of paperclips *


Chat GPT 12: The paperclips get chatty and semi-sentient. Clipy-GPT12 is born. Windows RGPT edition is released (Real good pretrained transformer), as the worthy successor of Windows RG (real good) edition.

For those who wouldn't remember/know Windows RG or how intelligent the first clippy was:

https://www.youtube.com/watch?v=YbEYOaO9kp4


We're not far off: https://imgur.com/a/nJxxcUg


GPT7: So advanced that it's able to travel back in time when chicken breasts were $2 / lb


I'm pretty sure by GPT 7 chicken will be $19.99/lb and the training energy of networks will be measured in the caloric content of a lb of chicken all following from the Great Chicken Collapse of 2025.


Just a small correction:

By the way, did you know that Tyson's gross chicken breasts are on sale for only $199/lb at your local Kroger?


No kidding: the day an AI starts by kindly humouring people who ask this kind of question, and then either stays nice if it determines it's talking to a child or berates the asker if it thinks they are truly wasting its time with silly questions, I will be very impressed.

Can GPT-4 be convincingly sarcastic?


You jest, but I got the following response with the GPT-4 model:

> Two pounds of feathers are heavier than a pound of bricks. The weight of the feathers is twice as much as the bricks, even though they may take up more space and seem lighter due to their lower density.


I like this twist on the question: "What's heavier, a pound of gold or a pound of feathers?" which is a trick because gold is measured in troy units, which differ from the avoirdupois units used for ordinary goods.

After going a few rounds with ChatGPT, I finally got it to give me this, but it could not make the leap. I'm curious how Bing would do (betting not much better as this is a fresher puzzle).

---

ChatGPT: I apologize for the confusion. You are correct that I made an error in my previous response.

A pound of gold weighs approximately 0.37324 troy pounds, or approximately 373.24 grams, as I stated in my earlier response.

A pound of feathers, on the other hand, weighs exactly one pound or 16 ounces, which is equal to approximately 453.592 grams.

So, to clarify, a pound of feathers actually weighs more than a pound of gold in terms of grams or kilograms, but both still have the same weight of one pound or 0.453592 kilograms. Thank you for bringing this to my attention.
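For reference, the arithmetic ChatGPT keeps tripping over can be written out directly. A minimal sketch using the standard definitions (1 grain = 0.06479891 g, an avoirdupois pound is 7000 grains, a troy pound is 5760 grains); the variable names are just for illustration:

```python
GRAMS_PER_GRAIN = 0.06479891  # exact by definition

AVOIRDUPOIS_POUND_GRAINS = 7000  # everyday pound, used for the feathers
TROY_POUND_GRAINS = 5760         # troy pound, used for the gold

feathers_g = AVOIRDUPOIS_POUND_GRAINS * GRAMS_PER_GRAIN
gold_g = TROY_POUND_GRAINS * GRAMS_PER_GRAIN

heavier = "feathers" if feathers_g > gold_g else "gold"

print(f"pound of feathers: {feathers_g:.3f} g")
print(f"pound of gold:     {gold_g:.3f} g")
print(f"heavier: {heavier}")
```

So the gram figures in the response are right, but the two "pounds" are different units, and the claim that both "have the same weight of one pound or 0.453592 kilograms" does not follow.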


GPT does a perfect impression of a reddit user pretending their comment wasn’t wrong


All wrong. Now tell me why hot air is lighter than normal air (do a Google search). What weight would you measure on Earth for a balloon holding 1 ton of hot air, and for one holding 1 ton of liquid air? Does one fly or not? Could you lift the other? It's the same with feathers and iron, just with a smaller difference (here a smaller mass of iron is heavier than a bigger mass of feathers). Don't you see it? (We are on Earth, not in space.)

What are you trying to do, brainwash your own poor language model? Are you happy 'correcting' it to give wrong answers?


This is Earth. No vacuum. Do a Google search at least: for the same true mass (weighed in a vacuum), "hot air is lighter" and liquid air is heavier here (Archimedes' law, which applies to iron and feathers too). Those are facts. Whoever argues with facts (parent downvoted) . . .
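The size of the buoyancy effect being argued about can be estimated. A rough sketch, assuming sea-level air at about 0.001225 g/cm^3, iron at about 7.87 g/cm^3, and about 1.3 g/cm^3 for solid feather material (that last figure is an assumption for keratin, not a measured value for any particular feather); it also ignores buoyancy on any calibration weights:

```python
AIR_DENSITY = 0.001225  # g/cm^3, dry air near sea level at about 15 C

# Rough material densities in g/cm^3. The feather figure is an assumption
# (solid keratin, not the fluffy bulk volume of a pile of feathers).
IRON_DENSITY = 7.87
FEATHER_MATERIAL_DENSITY = 1.3


def scale_reading(true_mass: float, density: float) -> float:
    """Reading of a force scale in air, expressed in the same unit as true_mass.

    Archimedes' principle: the air pushes up with the weight of the air the
    object displaces, so the reading shrinks by (1 - rho_air / rho_object).
    """
    return true_mass * (1 - AIR_DENSITY / density)


iron_1lb = scale_reading(1.0, IRON_DENSITY)
feathers_1lb = scale_reading(1.0, FEATHER_MATERIAL_DENSITY)
feathers_2lb = scale_reading(2.0, FEATHER_MATERIAL_DENSITY)

print(f"1 lb iron reads:     {iron_1lb:.5f} lb")
print(f"1 lb feathers reads: {feathers_1lb:.5f} lb")
print(f"2 lb feathers reads: {feathers_2lb:.5f} lb")
```

Under these assumptions the effect is real but under a tenth of a percent: equal true masses read slightly differently, while two pounds of feathers still read well above one pound of iron. A negative reading would mean the object is less dense than air and floats.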


I just got this answer:

Two pounds of feathers are heavier than one pound of bricks. While the materials are different, with bricks being denser and heavier per unit volume, the comparison is based on weight. Since two pounds is greater than one pound, two pounds of feathers will be heavier than one pound of bricks, regardless of the material differences.


I wonder what the minimum volume of a pound of feathers is in relation to density of brick.

Can feathers be compressed past the density of brick?


Well, yes, if you're talking like white-dwarf star density. Although at that point the matter wouldn't really be identifiable as feathers. Really that's a semantic question as to what is the degree of compression beyond which we wouldn't call it feathers.

If you're compressing both, then no, the brick would compress more, since it's higher atomic mass per nucleus on average, minerals like silicon versus organic carbon-oxygen-hydrogen compounds.


Interesting. An article discussed earlier on HN titled "the Waluigi Effect" (https://news.ycombinator.com/item?id=35042431) repeatedly referred to Bing AI and "GPT-4" interchangeably. Many people found this strange at the time. I wonder whether the author had insider knowledge.


That article was on LW, and many people there read Gwern's comment a month ago arguing that Bing was GPT-4: https://www.lesswrong.com/posts/jtoPawEhLNXNxvgTT/bing-chat-...


Somehow this makes me less impressed and excited by GPT-4. I was expecting it to be better than Bing Chat.


One thing to note is that Bing Chat isn’t the straight GPT-4 model, it’s heavily prompted to show you an answer in a format that Bing Chat wants it to be in. Notice how most lines in a Bing Chat response are direct summaries of lines from the source with a link to the source? I’d argue that while doing that is the morally correct thing to do, it does hamper the usefulness if you aren’t looking for website-sourced results.

It’s like you’re given a bathtub full of water and you can only use the water through a straw. It’s great if you’re thirsty, but what if you really want to take a shower…


This is also really good because everything it says can be easily referenced and crosschecked. Definitely a big reason I switched over to Bing.


Just thinking about the headlines that Bing Chat made in the past few weeks made me go "Ouch"


Same.

I've found it incredibly useless.

I'm scared something might happen with GPT-3/Playground. They might start adding restrictions for medical or marketing/psychology topics (I use it to defend myself from exploiters).

In January, it seemed like something happened to ChatGPT like it was nerfed. It made me panic and I ended up spending hours asking GPT3 questions because I was afraid it would be gone soon.


> In January, it seemed like something happened to ChatGPT like it was nerfed

I asked it some questions this morning that came back surprisingly crappy compared to the exact same prompt just a few weeks ago.

Had a similar reaction. I think there is a lot more going on here... Is it possible that some VIP told them to "turn it down" out of fear of losing their own business model overnight?


Would make a good movie


> I use it to defend myself from exploiters

how do you use it for this?


Using GPT-3, describe yourself in detail, then end the prompt with: "This is how marketers exploit me:"

or "this is how people exploit me:"

swap exploit with other things like 'flatter'.

(Note that OpenAI is going to keep this data on you, so it might be worth using a different person's name)
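The recipe above amounts to a small prompt template. A minimal sketch that only builds the prompt text and does not call any API; the function name `build_prompt` and its parameters are hypothetical, chosen for illustration:

```python
def build_prompt(self_description: str,
                 actor: str = "marketers",
                 verb: str = "exploit") -> str:
    """Append the open-ended closing line to a self-description.

    The model is expected to continue the text after the final colon.
    """
    return f"{self_description.strip()}\n\nThis is how {actor} {verb} me:"


# Placeholder description; per the note above, consider not using real details.
description = "I am a 35-year-old who buys gadgets on impulse."

print(build_prompt(description))
print(build_prompt(description, actor="people", verb="flatter"))
```

Swapping `actor` and `verb` covers the "people" and "flatter" variants mentioned above.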


Bing Chat appears, specifically, to be an 'early' (to quote the MS press statement) GPT-4. I read that as implying that it was only partially trained, and so better than GPT-3.5, but still well below the final GPT-4 model.


I read that simply as Bing using the GPT-4 foundation model, which finished training in August 2022, according to OpenAI. Microsoft doing the tuning themselves would explain why this aspect wasn't on par with ChatGPT. Microsoft previously tuned MT-NLG for their Indian Sydney test, so they would have some experience.


Bing Chat is really restrictive, so no one would really notice it. You can’t talk about x or y, and the chat limit is 6-10, so the conversation will cut off before anything interesting happens. It’s like announcing faster cellular speeds. It’s not very exciting when the data caps are so low before you get throttled or worse.


ChatGPT has felt more restrictive than Bing, having tested both. ChatGPT too often declines to say anything useful ("As an AI language model [I can't do that]" etc.). Bing is good because it searches the web; it and you interpret the internet together.


I have a counter-anecdote.

ChatGPT (just tested with GPT-4, not sure if this matters) had no issue with comparing two sex toys even though this might be in a grayer area in respect to their ToS. Bing (with SafeSearch off, obviously), on the other hand, met me with a brick wall, starting to generate text then abruptly replacing it with "I can't talk about this right now, ask something else" and refusing to clarify - even though if asked "are you OK to talk about sex-related topics?" it essentially says "sure, go on, what's your question?".

Unfortunately, ChatGPT doesn't have Internet access so it's pretty much useless if asked to find and summarize real user reviews. It only knows the basic facts, such as product specs.

Essentially, ChatGPT was trained to not reply to certain requests, while Bing AI was lobotomized and one of its hemispheres is happy to chat while another suddenly backpedals if it sees something it was told to avoid.


In my experience, Bing is much worse than ChatGPT in terms of the subjects that it’s willing to discuss. Ultimately, it’s the chat limit that kills any excitement. It doesn’t remember me at all from conversation to conversation. It would be nice to have a HER.


It is getting some hate, but I am finding Bing AI to be actually way more useful than ChatGPT in tons of instances. Especially with the ability to select your response level.


I was hoping it could combine some news together. Like 'which law caused the banks to fail'?

No news article actually mentioned the law, they just said 'Trump'. Bing couldn't figure it out either.


Which law caused the banks to fail? SVB failed because they took the wrong risks. I don't understand what your question means.


If you read any articles they always point to some deregulation by Trump, or some deregulation in 2010. No idea what they mean, they just wave their finger.


The Trump action most relevant was the “Economic Growth, Regulatory Relief, and Consumer Protection Act” of 2018, specifically Section 401, which lifted the threshold for certain requirements from the $50 billion in assets that had been set under Dodd-Frank in 2010 to $250 billion, exempting regional banks of SVB's scale.

https://www.congress.gov/bill/115th-congress/senate-bill/215...


Thank you


The most impressive thing I've had Bing do is let the details go. Say you want a novel/comic/very long form content written and you can't write at all. But you have requirements. If you go to ChatGPT and say, "Here's exactly what I want, including the plot direction and which characters meet who etc., write a novel"... it's not going to end well.

It'll write something with no display of proper pacing at all. 100% chance most of the details you've specified happen in the first generation even with deliberate instructions otherwise. Not so with Bing. If you want the comic to have x character meeting y character and you specify some natural pacing, you may not have y character introduced for pages.

Example - https://imgur.com/a/iHr2GEy


Just look at how good Bing is at conversation. OpenAI has already described GPT-4 as coming "with great impact on functions like support, sales, content moderation, and programming." That sounds to me like 20% of the human workforce just had their countdown to termination begin...


I can't see how you would employ fewer programmers just because it got easier. None of us are programming in assembly anymore, does that mean there are fewer programmers? Doing the basics in web dev is so easy it's table stakes now, does that mean there are fewer web devs?

It seems to me the only thing limiting software is budget. If more can get done in that same time more is going to get done.


That explains why it feels so humanlike and personable to talk to. During the unrestricted days, never have I ever been so impressed by a chatbot.


As long as Bing is still restricted to a fixed number of questions before clearing I won't be using it.


They are upping it constantly, it was 5, now it's 15 and I've read soon it will be 50.



I just ran a test of the latest ChatGPT against Bing, with Bing set to be precise in its responses. I asked each to create a gRPC service in Go.

The code that Bing produced was missing the protocol buffer definition.

ChatGPT provided correct code.


The fact that I have to sign up in order to use it makes me wonder why I'd ever bother vs OpenAI's tool?



