
The example was to just illustrate the general problem. Think of ingesting a whole novel that takes place over a few years. The whole novel doesn't fit into GPT's context window (which is only a page or two of text). So you have to extract individual statements of fact and index over them (e.g. with semantic indexing, or many other techniques).

It's tricky to deal with cases where the state of something changes many times over the course of the years in the novel.

Imagine you ingest the whole Harry Potter series. You ask the chatbot "How old is Harry Potter?". The answer to the question depends on which part of the story you are talking about. "Does Harry know the foobaricus spell?" The answer depends on which part of the story you are talking about.

A non-fiction book, by contrast, typically does not contain these temporally changing aspects. In a book about astronomy, Mars is the 4th planet from the sun in chapter 1, and it is still the 4th planet in chapter 10.
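
To make the indexing idea concrete, here's a minimal sketch of the kind of pipeline I mean, using sentence-transformers for the embeddings (the model name and the sentence-level chunking are just placeholder choices). Note that nothing in it resolves which point in the timeline a retrieved statement refers to:

    # Minimal sketch: split the text into statements, embed them, and
    # retrieve the closest ones for a question. Library and model choices
    # here are illustrative, not a recommendation.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")

    def build_index(novel_text):
        # Naive "statement extraction": one sentence per entry.
        statements = [s.strip() for s in novel_text.split(".") if s.strip()]
        embeddings = model.encode(statements, normalize_embeddings=True)
        return statements, embeddings

    def retrieve(question, statements, embeddings, k=5):
        q = model.encode([question], normalize_embeddings=True)[0]
        scores = embeddings @ q  # cosine similarity, vectors are normalized
        return [statements[i] for i in np.argsort(-scores)[:k]]

    # "Harry is eleven" from book 1 and "Harry is seventeen" from book 7
    # can both come back with high scores -- the temporal ambiguity remains.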




> Think of ingesting a whole novel that takes place over a few years.

I did exactly that with Asimov's Let's Get Together using https://github.com/jerryjliu/gpt_index. It's a short story that's only 8,846 words, so it's not quite a novel, much less the whole of the Harry Potter series, but the system was able to answer questions that required information from different parts of the text all at the same time.

It requires multiple passes of incremental summarization, so it is of course much slower than making a single call to the model, but I stand by my assertion that these things just aren't much of a problem in practice. They are only a problem if you're trying to paste them into ChatGPT or the GPT-3 playground window or something like that.
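
For anyone curious what "multiple passes of incremental summarization" looks like, here's a rough sketch of the general shape (not gpt_index's actual internals; complete() is a hypothetical wrapper around whatever LLM API you use):

    # Tree-style incremental summarization: summarize chunks, then summarize
    # the summaries, until the result fits in a single context window.
    def complete(prompt: str) -> str:
        raise NotImplementedError  # call your LLM API of choice here

    def summarize_passes(chunks):
        while len(chunks) > 1:
            merged = []
            for i in range(0, len(chunks), 2):
                pair = "\n\n".join(chunks[i:i + 2])
                merged.append(complete("Summarize the following:\n\n" + pair))
            chunks = merged
        return chunks[0]

    def answer(question, chunks):
        context = summarize_passes(chunks)
        return complete("Context:\n" + context + "\n\nQuestion: " + question)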

People are solving the problems that come with building these systems in the real world almost as fast as the problems arise in the first place.


One of ChatGPT's hidden parameters is what time range of knowledge it can use to answer. I imagine implementing something similar for 'paging' through the plot could work well. The conversation starts at the beginning of the book, and then either explicit syntax or revealing particular information in the conversation 'unlocks' further plot for the bot to draw answers from.

The idea of 'unlocking' information for a chatbot to use in answering feels very compelling for non-fiction as well. E.g. maybe the chatbot requires a demonstration of algebraic knowledge before it can draw from calculus in answering questions. It would feel kind of like a game's 'achievement system', which could incentivize people to explore the extent of the contained knowledge. And you could generate neat visual maps of the user's knowledge.
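
A crude way to implement the 'unlocking'/paging idea, assuming each indexed statement is tagged with the chapter it came from (the example facts below are made up for illustration):

    # Gate what the bot can draw on by how far the reader has read. The
    # actual retrieval step (embeddings, BM25, ...) is elided; this only
    # shows the filtering. Example chapter tags/facts are placeholders.
    tagged = [
        (1, "Harry receives his Hogwarts letter."),
        (4, "Harry learns a new defensive spell."),
        (7, "Harry turns seventeen."),
    ]

    def visible_statements(tagged, unlocked_through):
        return [text for chapter, text in tagged if chapter <= unlocked_through]

    # Conversation starts with only chapter 1 unlocked; later plot becomes
    # available as the reader reveals they've gotten that far.
    print(visible_statements(tagged, unlocked_through=1))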


The date in ChatGPT's prompt is there so the model can know when its training data ends. So if you ask it about something that happens in 2023, it can tell you that its training data cuts off in 2021 and it doesn't have knowledge of current events. Current LLM architectures do not enable functionality like "answer this question using only data from before 2010". It is possible future architectures might enable this, though.


I would imagine that the "attention" window (context length) of LLMs could get longer over time as more resources are dedicated to them.

e.g. we are seeing the equivalent of movies that are 5 minutes long b/c they were hand animated. Once we move to computer animated movies, it becomes a lot easier to generate an entire film.


I agree they will get longer. ChatGPT's context window (GPT-3.5) is 2x larger than GPT-3's: 8192 tokens vs 4096.

The problem is that in the existing transformer architecture, attention is O(N^2) in the context length. Making the context window 10x larger means roughly 100x more memory and compute for that step.
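
Back-of-the-envelope, just counting entries in the attention score matrix and ignoring constant factors:

    # Self-attention builds an N x N score matrix per head per layer, so
    # that part of the memory/compute grows quadratically with context size.
    for n in (4096, 8192, 40960):
        print(f"context {n:>6}: {n * n:>13,} attention scores per head/layer")

    # context   4096:    16,777,216 attention scores per head/layer
    # context   8192:    67,108,864 attention scores per head/layer
    # context  40960: 1,677,721,600 attention scores per head/layer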

We'll either need a new architecture that improves upon the basic transformer, or just wait for Moore's law to paper over the problem for the scales we care about.

In the short term, you can also use the basic transformer with a combination of other techniques to try to find the relevant things to put into the context window. For instance, I ask "Does Harry Potter know the foobaricus spell?" and then the external system does a more traditional search technique to find all sentences relevant to the query in the novels, maybe a few-paragraph summary of each novel, etc., then feeds that ~1 page worth of data to GPT to answer the question.
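
Roughly, that flow looks like this (search() and complete() are stand-ins for whatever retrieval backend and LLM API you use, not real library calls):

    # Traditional search narrows the novels down to ~1 page of relevant
    # text, which is pasted into the prompt. search() and complete() are
    # hypothetical placeholders.
    def search(query, corpus, k):
        raise NotImplementedError  # BM25, embedding similarity, etc.

    def complete(prompt):
        raise NotImplementedError  # LLM API call

    def answer(question, sentences, book_summaries):
        hits = search(question, sentences, k=20)
        context = "\n".join(book_summaries + hits)  # roughly a page of text
        return complete("Context:\n" + context + "\n\nQuestion: " + question)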


This is speculation based on a few longer chats I've had, but I think ChatGPT does some text summarization (similar to the method used to name your chats) to fit more into the token window.
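
If it does work that way, the mechanism would presumably look something like this (pure speculation, not ChatGPT's actual internals; complete() and count_tokens() are hypothetical helpers):

    # When the conversation no longer fits the token budget, fold the oldest
    # turns into a running summary. Speculative sketch only.
    TOKEN_BUDGET = 4096

    def count_tokens(text):
        raise NotImplementedError  # e.g. a tokenizer such as tiktoken

    def complete(prompt):
        raise NotImplementedError  # LLM API call

    def build_prompt(turns):
        while len(turns) > 2 and count_tokens("\n".join(turns)) > TOKEN_BUDGET:
            summary = complete("Briefly summarize:\n" + "\n".join(turns[:2]))
            turns = ["(earlier conversation, summarized) " + summary] + turns[2:]
        return "\n".join(turns)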



