i have a different take. i'm very glad flat earthers exist. in general, i would hope that the share of people who believe an idea is proportional to the probability of its truth, so even the wildest ideas should have some modicum of support. consider a world without this: i imagine it would necessarily have to be thought-policed. i believe this is how we should frame this discussion.
what i think the issue is, is that we have a broadcasting machine (social media, news, etc.) that runs on sensationalism, so you are always hearing about fringe ideas and given no signal as to the size of the population that supports them.
There's zero understanding in any of this; it's still essentially just superficial text parsing. Show me progress on Winograd schemas and I'd be impressed. It hasn't got anything to do with AGI; this is an application of ML to very traditional NLP problems.
I know that it isn't. That's part of the problem. There is no attempt to generate some sort of structure that can be interpreted semantically and reasoned about by the model. The model just operates on the input superficially and statistically. That's why there has been virtually no progress on trivial tasks such as answering:
"I took the water bottle out of the backpack so that it would be [lighter/handy]"
What is lighter and what is handy? No amount of stochastic language manipulation gets you the answer: you need to understand some rudimentary physics to answer the question, and as a precondition, you need a grammar or an ontology.
In the example above it guesses wrongly, but again this is not surprising, because it can't possibly get the right answer (other than by chance). The solution here cannot be found by correlating syntax; you can only answer the question if you understand the meaning of the sentence. That's what these schemas are constructed for.
The problem for me was how to formulate the sentence so that the natural next word would reveal what the network had modelled.
edit: Retracted a test where it seemed to know which to select, because further tries revealed it was random.
edit: I tried a few more times, and it does seem to be somewhat random, but the way it continues the sentence suggests it has some form of operational model. It's just hard to prompt it in a way that "forces" it to reveal which of the two it's talking about. Also, GPT-2's coherence range seems too short for this. I would love to try it with GPT-3.
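One way to force the choice (a rough sketch using the Hugging Face transformers library; the exact prompt wording and the helper function are my own) is to skip sampling entirely and instead score the two candidate referents by the log-likelihood GPT-2 assigns to each continuation:

    # Sketch: compare GPT-2's log-likelihood for the two candidate referents
    # instead of sampling a continuation. Prompt wording is illustrative.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def total_logprob(text):
        """Sum of token log-probabilities GPT-2 assigns to the text."""
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            out = model(ids, labels=ids)
        # out.loss is the mean negative log-likelihood over the predicted tokens.
        return -out.loss.item() * (ids.shape[1] - 1)

    prompt = ("I took the water bottle out of the backpack so that it would be "
              "lighter. The thing that got lighter is the")
    for candidate in [" backpack.", " water bottle."]:
        print(candidate, total_logprob(prompt + candidate))

Comparing totals across candidates of different token lengths is crude (normalizing per token is the usual tweak), but it at least forces the model to commit to one referent instead of rambling past the question.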
FWIW, I fed this into AIDungeon (running on OpenAI) and got this back: “The bottle is definitely lighter than the pack because you can throw it much further away than you can carry the pack.
You continue on into the night and come to an intersection.”
I'm skeptical. Amazing progress has been made in the last 5-10 years, but it still feels like we need more paradigm shifts in the ML/AI field. It feels like we're approaching the upper limits of what stuffing mountains of data into a model can do.
But with the speed of the field, maybe we can figure it out in three years. It just seems like we're still missing some key components. Primarily, reasoning and learning causality.
Zero-shot and few-shot learning in GPT-3, and the lack of significant diminishing returns in scaling text models. Zero-shot learning is equivalent to saying "I'm just going to ask the model to do something it was not trained to do."
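For anyone unfamiliar with the terms, the difference is entirely in the prompt; the weights are frozen in both cases. An illustration in the style of the GPT-3 paper's translation demos (the task and wording below are just for illustration):

    # Illustrative prompt formats only; no API calls, no gradient updates.
    zero_shot = (
        "Translate English to French:\n"
        "cheese =>"
    )

    few_shot = (
        "Translate English to French:\n"
        "sea otter => loutre de mer\n"
        "peppermint => menthe poivrée\n"
        "cheese =>"
    )
    # "Learning" happens purely by conditioning on the prompt text.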
For those who are wondering about the reasoning behind this being the path to full AGI, I recommend this Gwern post, which goes into detail: https://www.gwern.net/newsletter/2020/05
From what I understand, it's not just that GPT-3 has impressive performance, but more what it signifies: the fact that massive scaling didn't produce diminishing returns, and if this pattern persists, it can get them to the finish line.
what is the difference between zero-shot learning in text and AGI? not saying there isn't one, but can you state what it is? you can express any intent in text (unlike other media), so solving zero-shot in text is equivalent to the model responding to all intents.
many people have different definitions for AGI though. for me it clicked when i realized that text has this universality property of capturing any intent.
Zero-shot learning is a way of essentially building classifiers. There's no reasoning, there's no planning, there's no commonsense knowledge (not in the comprehensive, deep way that would lead us to call it that), and there's no integration of these skills to pursue common goals. You can't take GPT and say, "OK, turn that into a robot that can clean my house, take care of my kids, cook dinner, and then be a great dinner guest companion."
If you really probe GPT, you'll see that anything beyond an initial sentence or two starts to show how superficial its understanding and intelligence really are; it's basically a really amazing version of Searle's Chinese Room argument.
I think this is generally a good answer, but keep in mind I said AGI "in text". My forecast is that within 3 years you will be able to give arbitrary text commands and get textual output for the equivalents of "clean my house, take care of my kids, ..."-type problems.
I also would contend that there is reasoning happening and that zero-shot demonstrates this. Specifically, reasoning about the intent of the prompt. The fact that you get this simply by building a general-purpose text model is a surprise to me.
Something I haven't seen yet is a model simulate the mind of the questioner, the way humans do, over time (minutes, days, years).
In 3 years, I'll ping you :) Already made a calendar reminder
Pattern recognition and matching isn't the same thing as reasoning. Zero-shot demonstrates reasoning about as much as solving the quadratic equation for a new set of coefficients does; it's simply the ability to create new decision boundaries leveraging the same classifying power and methodology. True AGI isn't bound to a medium; no one would say Helen Keller wasn't intelligent, for example.
I think pattern matching can be interpreted as a form of reasoning, but it is distinct from logical reasoning, where you draw implications from assumptions. GPT seems really bad at this kind of thing: it often outputs text with inconsistencies, and in the GPT-3 paper it performed poorly on tasks like Recognizing Textual Entailment, which mainly involve this kind of reasoning.
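For context, Recognizing Textual Entailment hands the model a premise and a hypothesis and asks whether the first supports the second. A made-up example pair, just to show the shape of the task:

    # Made-up RTE-style example; the task is binary classification of whether
    # the premise entails the hypothesis.
    example = {
        "premise": "I took the water bottle out of the backpack an hour ago.",
        "hypothesis": "The backpack still contains the water bottle.",
        "label": "not_entailment",  # the premise does not support the hypothesis
    }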
I think these are good examples, but to me "linear algebra thinking" lies in its generality. For example, the derivative is a linear operator, so how do you write it down as a matrix? Google's PageRank is the solution of a matrix equation; what does that matrix represent? Etc.
> For example, the derivative is a linear operator, so how do you write it down as a matrix?
Consider polynomials in X of degree up to, but not including, N. The powers 1, X, ..., X^(N-1) form a basis, so the coefficients of a polynomial can be put in a column vector. If D is the derivative operator, D X^k = k X^(k-1), so the derivative can be expressed as a sparse matrix with entries D_(k,k+1) = k. Visually, it's a matrix with the integers 1, 2, ..., N-1 on the superdiagonal.
You can also see that this is a nilpotent matrix for finite N, since repeated multiplication sends the entries further up into the upper right corner.
You can extend this to the infinite case for formal power series in X, too, where you don't worry about convergence.
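To make that concrete, a small numpy sketch (N = 4, so polynomials of degree at most 3; the sample polynomial is arbitrary):

    # The derivative on polynomials of degree < N as an N x N matrix
    # in the basis {1, X, ..., X^(N-1)}.
    import numpy as np

    N = 4
    D = np.zeros((N, N))
    for k in range(1, N):
        D[k - 1, k] = k              # D X^k = k X^(k-1): entry k on the superdiagonal

    # Coefficients of p(X) = 3 + 2X + 5X^2 + 4X^3, constant term first.
    p = np.array([3.0, 2.0, 5.0, 4.0])
    print(D @ p)                     # [ 2. 10. 12.  0.], i.e. 2 + 10X + 12X^2

    # Nilpotency: differentiating N times kills any polynomial of degree < N.
    print(np.linalg.matrix_power(D, N))   # zero matrix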
> Google's PageRank is a solution of a matrix equation, what does that matrix represent?
Isn't it just the adjacency matrix of a big graph?
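As far as I understand it, it's the link structure, but column-normalized into a stochastic matrix and blended with a damping term, and the PageRank vector is its stationary eigenvector. A rough power-iteration sketch on a made-up 4-page link graph (damping factor 0.85 assumed):

    # Rough PageRank sketch; the link graph and damping factor are illustrative.
    import numpy as np

    # A[i, j] = 1 if page j links to page i (made-up graph, no dangling pages).
    A = np.array([
        [0, 1, 1, 0],
        [1, 0, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 1, 0],
    ], dtype=float)

    n = A.shape[0]
    S = A / A.sum(axis=0)                       # column-stochastic link matrix
    d = 0.85                                    # damping factor
    G = d * S + (1 - d) / n * np.ones((n, n))   # the "Google matrix"

    # Power iteration: the PageRank vector r satisfies G r = r.
    r = np.ones(n) / n
    for _ in range(100):
        r = G @ r
    print(r)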
Anyway, I agree with you. Matrices and linear algebra are a really good inspiration for higher-level concepts like vector spaces and Hilbert spaces and so on. That's where the real power lies. But even in such general domains, matrices are often used to do concrete computations, because we have a lot of tools for matrices.
[off topic] FYI, i found this comment by subscribing to the RSS feed of your HN comments on Fraidycat (by linking to https://edavis.github.io/hnrss/ for your username)
You must start with labeled data. It is easier to label pictures of parked cars than it is to label pictures of good/bad driving. For labeled video, the dimensionality is out of reach of ML for now, and would add lag to your system.