Even if you don't feel like reading the whole article, do yourself a favor and skip down to the video of the final product at the very end. It's delightful and put a big smile on my face. The fact that all the modern technology is hidden inside, leaving only the wooden structure visible, makes it magical, like something from Harry Potter.
I'm under the impression that this CPU is faster AND more efficient, so if you do equivalent tasks on the M4 vs an older processor, the M4 should be less power hungry, not more. Someone correct me if this is wrong!
It's more power efficient than the M3, sure, but surely it could've been even more power efficient if it had worse performance simply from having fewer transistors to switch? It would certainly be more environmentally friendly at the very least!
The most environmentally friendly thing to do is to keep your A12Z for as long as you can, ignoring the annual updates. And when the time comes that you must do a replacement, get the most up-to-date replacement that meets your needs. Change your mindset - you are not required to buy this one, or the next one.
Of course, I'm not buying this one or any other until something breaks. After all, my current A12Z is way too powerful for iPadOS. It just pains me to see amazing feats of hardware engineering like these iPads with M4 be completely squandered by a software stack which doesn't facilitate more demanding tasks than decoding video.
Millions of people will be buying these things regardless of what I'm doing.
They are for those tasks where you do need high performance, where you would otherwise be waiting on your device. A few tasks need all the CPU power you can get, and that is what the performance cores are for. But most of the time, it will consume a fraction of that power.
My whole point is that iPadOS is such that there's really nothing useful to do with that performance. No task an iPad can do requires that much CPU power (except maybe playing games, but that'll throttle the M4 to hell and back anyway).
I'd been using 2 external displays with my MacBook Pro for the past few years. The most annoying thing for me was waking the Mac from sleep and getting it to detect the two of them. Once they were both detected, I honestly thought the experience was OK. But the dance I needed to do every time to get them detected was so frustrating that I recently replaced the two of them with Dell's new 40-inch 5K ultrawide: https://www.dell.com/en-us/shop/dell-ultrasharp-40-curved-wu.... I'm very happy with it. For my (slightly aging) eyes the pixel density is great, there's lots of space, and the waking-from-sleep detection issue is finally gone.
Interesting point, but I don't think it's that clear cut. Twitter/X seemed to increase the pace of product changes directly after laying off the majority of its employees after Elon Musk took over. Also, when Steve Jobs returned to lead Apple in 1997, he fired a significant fraction of the company before starting an incredible period of innovation. So I think a lot depends on the leadership and incentive structures.
Both of those examples started with laying off basically all of top management.
If you’re just laying off engineers to meet some profitability measure for Wall Street, then you’re not going to fix innovation. You need to replace the management, who are the ones in charge of what the engineers are doing.
Google is completely lost and doesn’t know what it does. Management launches new products just to look good and shuts them down a year later, and still the only thing making money is search and ads. That’s not going to be fixed by laying off engineers.
Twitter laid most of their staff off, then Musk gave the rest an ultimatum: commit to Twitter every day, all day, sleep at the office, or get fired. I wonder why there was more work produced in that period.
Which it only had because Musk threw out a number and people fought very, very hard to make him stick to it. If he weren't such a dumbass and hadn't waived due diligence, this wouldn't be the story.
According to rando analysts who don't have a stake. Note that Twitter pre-X was unsustainable as a business, and I don't think it ever made a net profit in aggregate over its lifetime. It was all selling the dream, even before the acquisition.
You state that as if your link does not substantiate what I said earlier. For the benefit of the people who don't click on the link you cite: it simply proves, as I said, that they never made an aggregate profit over their lifetime, and you are linearly extrapolating the past ("starting to become…") into the high-interest-rate environment in which peers like $FB and $SNAP crashed, and Twitter surely would have too.
> Twitter/X seemed to increase the pace of product changes directly after laying off the majority of its employees after Elon Musk took over.
Changes? Yes. Often superficial changes. Change "Twitter" to "X", change blue checkmarks to yellow or gray or whatever, and sell blue to the lowest bidders. As for improvements? No, I haven't seen Twitter improve much if at all since the acquisition. In fact, the removal of third-party clients for example was a gross vandalization of the service.
> Also, when Steve Jobs returned to lead Apple in 1997, he fired a significant fraction of the company before starting an incredible period of innovation.
This quip misses the biggest part of the story, which is that Apple didn't cut its way to success but rather acquired NeXT for over $400 million, a massive investment at the time, and Steve was merely replacing the old guard with his people.
A very relatable story! Are you going to write follow-up posts covering the rest of the journey, with more of the ups and downs of running a tiny bootstrapped startup? I'd be interested in reading!
I've gone retro with my watch. Wearing a simple Casio to avoid getting sucked into using my phone and checking notifications each time I want to see the time.
For my laptop I have an M1 Pro MacBook. It was a huge upgrade over the previous Intel Mac, but I don't feel the need to upgrade to the M2 or M3 machines; the M1 still feels really fast to me.
I likewise switched back to a $20 Casio watch after a few years of Apple Watch. I loved a lot about the Apple Watch, but just couldn’t deal with the distraction and time suck of “let me see what time it is, oh, has anybody messaged me, my favorite person?” At first I was tempted to spend big on a Casio G-Shock, and while the notifications wouldn’t be there, I realized it was still scratching a similar “all-knowing” itch. So a $20 Casio it’s been for a while, and I love it.
Nice start! I love the idea of an AI assisted learning experience structured around stories.
I make a tool in the same space (readlang.com). I started it before the current LLM wave but I've recently added LLM-generated explanations and several users have been uploading LLM generated texts to read (including me!). I've been considering adding LLM based practice too, similar to what you've done with the comprehension questions, but haven't got around to that yet.
> At first glance this doesn't seem that surprising. We often use "is" in a way which isn't reversible. e.g.
They appear to only be testing the 'reversible' cases. Their schematic example was fine-tuning the model on "<Fictitious name> is the composer of <fictitious album>", yet finding the model unable to answer "Who composed <fictitious album>?"
In this case, English and common sense force symmetry on 'is'. Without further specification, these kinds of prompts imply an exclusive relationship.
Additionally, the authors claim that when they tested it, the model didn't even rate the correct answer more probable than random chance. This suggests that the model isn't being clever about logical implications.
To us, it's obvious that "is" in these examples is symmetrical. But LLMs don't have common sense, they have to rely on the training dataset we feed them.
It's entirely possible there is nothing wrong with the logical reasoning abilities of LLM architectures, and this result is simply an indication that the training data doesn't provide enough information for LLMs to learn the symmetrical/commutative nature of these "is" relationships.
Though, based on the next-token-prediction architecture of LLMs, it seems logical that an LLM would need to learn facts in both directions. If its input contains <Fictitious name>, it makes sense that the tokens for "<fictitious album>" and "composer" will show up with high probability. But there is no reason that having the tokens "composer" and "<fictitious album>" in the input should increase the probability of the "<fictitious name>" token, because that ordering never occurred in the training data.
If true, it would suggest that LLMs have a massive bias against the very concept of symmetrical logic and commutative operations.
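To make the asymmetry concrete, here's a toy sketch (my own illustration, not the paper's setup; the name and album below are made up, and real models work on subword tokens rather than whole words) of how next-token training pairs come out of a single sentence:

    # Toy illustration of causal-LM training pairs derived from one sentence.
    # "Alma Vexley" and "Moonlit Echoes" are made-up placeholders.
    sentence = "Alma Vexley is the composer of Moonlit Echoes".split()

    # A causal LM only predicts each token from the tokens before it, so every
    # (context -> target) pair follows the original left-to-right order.
    training_pairs = [(sentence[:i], sentence[i]) for i in range(1, len(sentence))]

    for context, target in training_pairs:
        print(f"{' '.join(context):45} -> {target}")

    # The only pair whose context contains both "composer" and "Moonlit" has
    # "Alma Vexley" earlier in that same context, never as the prediction
    # target, so nothing here raises the probability of producing the name
    # when "Who composed Moonlit Echoes?" is the prompt.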
The "is" in that sentence still isn't fully symmetric, I'd rather call it reversible. There is a learned relationship that "is composer of" has the same meaning as "composed" (as in "<Name> composed <Album>"). Now you can turn the active verb passive to switch subject and object: <Album> was composed by <Name>.
The final puzzle piece is then recognizing the difference between the questions "Who composed <x>" and "Who did <x> compose", one asking for the subject of the active sentence (the composer) and one for its object (the works).
In a "traditional" system without ML you would represent this with a directional knowledge graph <Artist> --composed--> <Album>, with the system then able to form sentences or answer questions in either arrow direction. But that conversion is generally tricky unless you know how many other arrows exist. That's obvious with categories, but even if you know that one person composed a song that doesn't tell you that only that person composed that song. That can lead to unsatisfying answers, and might be a reason why this is hard for LLMs.
My random reflexions on this topic make me think there is something deep about identity/equivalence in LLMs that is on par with the special status identity/equivalence have in homotopy type theory.
• GPT4 (and other LLMs) is some kind of generalized homotopy engine. You can give it any input and ask it to apply any "translation": language translation, style translation, keeping the style but talking about another subject, or translating code to another programming language, and it gives you something different, yet identical. "Write something like ... but ..." There is some deep understanding of what identity is here, in particular with respect to the messy expectations of our human sign systems: you can throw any kind of equivalence path at it, and GPT4 will handle it just fine. It seems the limit is not in its ability to generalize to any kind of identity schema we throw at it, but in the complexity of these schemas.
• I'm not saying GPT has an explicit understanding of these schemas/homotopies. My point is that even though GPT doesn't know much about homotopy type theory, I think it knows them in a latent way: GPT would perform much better at translating a piece of code from one language to another than it would at explaining, in sound terms, what it just did through the lens of homotopy type theory. That knowledge about identity/equivalence is implicit.
Note: I'm not claiming to have a clear view of what's at stake here, just that there is a link between textuality, identity, and the foundations of logical inference.
I know nothing about homotopy type theory, but your description does line up with my experiments.
When playing with GPT-3.5, I gave it a conversation and asked it to "translate" one side of the conversation from "sarcastic mocking GLaDOS" to "concise professional language". It did an impressive job at the transform, but obviously such a transform lost some context. So I tried getting GPT to "reason" about the lost context, or even just point it out.
The pre-transformed conversation was still in the context window, but it just couldn't see that version of it. It was completely blind and could only see the "concise professional" version of the conversation.
While trying to debug and find a workaround, I deleted the transformed output. The input still mentioned the transform, but GPT was still absolutely blind to the original conversation, acting as if the transform had still been applied.
It seemed like the simple suggestion of a transform was enough for GPT to apply that transform within its internal context. It wasn't until I deleted all mention of a potential transform that GPT regained its ability to see the original "sarcastic mocking GLaDOS" side of the conversation.
English only forces that symmetry if there is the definite article "the" (a unique composer). If it instead said "a" composer, then it would be impossible to answer "who composed" completely; you would only know one of the composers.
Jumping to conclusions, like going from "if A then B" to "A=B", is a very common mistake in humans, bad statistics, and propaganda. So I am actually positively surprised that models don't make that mistake.
I would have anticipated that, with a large enough dataset, the latent space would create graph-like relationships, encoding things many-to-many, one-to-one, etc. To my limited understanding, this is a surprising finding.
Your examples use the indefinite article, but the first example in the abstract uses the definite article. (The second, after rephrasing, also does.) Contrast "Mars is the fourth planet from the Sun" and "Mars is a planet".
With GOFAI (e.g. Cyc, SHRDLU), you'd distinguish between "X is a Y" and "X is the Y" and store them differently, and if you got an incorrect answer you'd have a good idea where to look for your bug. With an LLM, you have a black box with billions of connexion weights and (correct me if I'm wrong) your only recourse is to retrain it on data which distinguishes the two cases, but even that might get lost in the noise, or cause problems somewhere else.
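A rough sketch of what that split might look like (my own toy example, not how Cyc or SHRDLU actually represent things): keep "is a" facts in a membership table and "is the" facts in a one-to-one role table that can be queried in either direction.

    # Toy GOFAI-style store: "is a" as membership (many entities per category),
    # "is the" as a unique role, which makes the reverse query trivial.
    is_a = {}     # entity -> category
    is_the = {}   # role -> entity (assumed unique, so reversible)

    is_a["Mars"] = "planet"
    is_the["fourth planet from the Sun"] = "Mars"

    def what_is(entity):
        # "What is Mars?" -> its unique roles, falling back to its category.
        roles = [role for role, e in is_the.items() if e == entity]
        return roles if roles else [is_a.get(entity)]

    def which_is(role):
        # "Which planet is fourth from the Sun?" -> reversible, role is unique.
        return is_the.get(role)

    print(what_is("Mars"))                            # ['fourth planet from the Sun']
    print(which_is("fourth planet from the Sun"))     # Mars

    # When an answer comes out wrong, these two tables can be inspected
    # directly, which is the "where to look for your bug" point above.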
Don't you lump math in there; math is 99% human language. The symbol pushing you learned in HS is just advanced arithmetic. Math is more like legalese with some very loose additional notation than a formal language.
> In the particular cases being discussed, there's no ambiguity: "is a" means "member of" and "is the" means equals.
Yes, and fitting just those cases would result in a model that handled other cases incorrectly, because idioms inconsistent with that rule exist. (“Jodie is the bomb” has a meaning distinct from the individual words taken separately which is not stating a reflexive equivalency, for instance.)
I think the key words are "a" vs "the": when you use "a", the relationship is one-to-many, whereas when you use "the" it's one-to-one. If I say "Charles is the King" then "the King is Charles" also holds true. If I say "Charles is a King" then I can't conclude that the King is Charles.
Yes, "dogs are the animals (e.g. the only animals on this space station)" is implied to be reversible. Indefinite or missing articles like in your example make no such implication.
There is a plant which mimics leaves of nearby plants. Try asking GPT-4 which plant it is and it will always give you wrong answers. But if you do give it the name of that plant and ask what it is known for, it will tell you that it can mimic leaves of other plants.
This is what their inability to infer A from B is about.
I wish I could find the quote. It was something like:
"You gotta assume we're going to keep making the models better and faster and cheaper." (My memory of an old interview)
There was also this from a recently deleted blog post:
"Cheaper and faster GPT-4 — This is their top priority. In general, OpenAI’s aim is to drive “the cost of intelligence” down as far as possible and so they will work hard to continue to reduce the cost of the APIs over time." (Now deleted blog post from earlier this week)