there is actually a paper by OpenAI themselves on summarizing long documents.
essentially, you break a longer text into smaller chunks and run a multi-stage sequential summarization: each chunk uses a trailing window of the previous chunk's summary as context, and this runs recursively.
https://arxiv.org/abs/2109.10862
did a rough implementation myself, and it works well even for articles of 20k tokens. but it's kind of slow because of all the additional overlapping runs required (and more costly).
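A minimal sketch of that recursive, trailing-window approach (not the paper's actual implementation). It assumes a hypothetical `summarize(prompt)` helper that wraps whatever completion API you use; chunk sizes are in characters here for simplicity, though in practice you'd count tokens:

```python
from typing import Callable, List


def chunk_text(text: str, chunk_size: int) -> List[str]:
    """Naive fixed-size chunking; a real version would split on token boundaries."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def sequential_summarize(text: str, summarize: Callable[[str], str],
                         chunk_size: int = 4000, overlap: int = 500) -> str:
    """Summarize chunks in order, feeding a trailing window of the previous
    chunk's summary into each call; recurse on the joined summaries until
    everything fits in a single call."""
    summaries = []
    prev_tail = ""
    for chunk in chunk_text(text, chunk_size):
        prompt = (f"Context from earlier in the document:\n{prev_tail}\n\n"
                  f"Summarize the following section:\n{chunk}")
        summary = summarize(prompt)
        summaries.append(summary)
        prev_tail = summary[-overlap:]  # trailing window carried forward
    combined = "\n".join(summaries)
    if len(combined) <= chunk_size:
        return summarize(f"Combine into one summary:\n{combined}")
    return sequential_summarize(combined, summarize, chunk_size, overlap)
```

The overlapping context calls are exactly where the extra latency and cost come from: every chunk pays for its own summary plus the carried-over tail.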
A technique I have had success with is to do it in multiple passes.
Map-reduce it with overlapping sections, then propagate the results back down and repeat the process; on the second pass, each map-reduce node knows the context it's operating in and can summarize more salient details.
Concretely, on the first pass, your leaf nodes are given a prompt like "The following is lines X-Y of a Z length article. Output a 1 paragraph summary."
You then summarize those summaries, etc. But then you can propagate that info back down for a second pass, so in the second pass, your leaf nodes are given a prompt like "The following is lines X-Y of a Z length article. The article is about <topic>. The section before line X is about <subtopic>. The section after Y is about <subtopic>. Output a 1 paragraph summary that covers details most relevant to this article in the surrounding context."
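A rough sketch of that two-pass scheme (my reading of the comment, not the commenter's code), again assuming a hypothetical `summarize(prompt)` helper. Pass 1 is a plain map-reduce; pass 2 re-summarizes each leaf with the global summary and its neighbors' first-pass summaries as context:

```python
from typing import Callable, List


def two_pass_summarize(sections: List[str],
                       summarize: Callable[[str], str]) -> List[str]:
    """Pass 1: context-free leaf summaries reduced to a global summary.
    Pass 2: re-summarize each leaf, now informed by the surrounding context."""
    # Pass 1: map (one summary per section), then reduce (global summary).
    first = [summarize(f"Section {i + 1} of {len(sections)}:\n{s}\n"
                       "Output a 1 paragraph summary.")
             for i, s in enumerate(sections)]
    global_summary = summarize("Summarize these section summaries:\n"
                               + "\n".join(first))
    # Pass 2: propagate the reduced context back down to each leaf.
    second = []
    for i, s in enumerate(sections):
        before = first[i - 1] if i > 0 else "(start of article)"
        after = first[i + 1] if i < len(sections) - 1 else "(end of article)"
        prompt = (f"The article is about: {global_summary}\n"
                  f"The section before is about: {before}\n"
                  f"The section after is about: {after}\n"
                  f"Output a 1 paragraph summary of this section, covering "
                  f"details most relevant in the surrounding context:\n{s}")
        second.append(summarize(prompt))
    return second
```

For a deeper tree you'd recurse on the second-pass summaries, but one extra pass already gives each leaf the global topic it was missing.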
Could you expand on this? Is the idea to embed paragraphs (or some other arbitrary subsection) of text, and then semantic search for the most relevant paragraphs, and then only summarize them?
Yes, that's exactly right, but it presumes you know what to look for and what you want in your summary. Our use case is to pick out action items or next steps from meeting notes, so this can work. But not for all use cases, e.g. "summarize this paper."
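The retrieval step could look something like this: a sketch assuming a hypothetical `embed(text)` helper that returns an embedding vector from whatever embeddings endpoint you use. Rank paragraphs by similarity to a query like "action items and next steps", keep the top few in document order, and summarize only those:

```python
import math
from typing import Callable, List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def select_relevant(paragraphs: List[str], query: str,
                    embed: Callable[[str], List[float]],
                    top_k: int = 5) -> List[str]:
    """Rank paragraphs by embedding similarity to the query and keep
    the top_k, restored to their original document order."""
    qv = embed(query)
    ranked = sorted(range(len(paragraphs)),
                    key=lambda i: cosine(embed(paragraphs[i]), qv),
                    reverse=True)
    keep = sorted(ranked[:top_k])  # document order, not score order
    return [paragraphs[i] for i in keep]
```

Only the selected paragraphs then go to the summarization prompt, which is why this works for targeted extraction but not for open-ended "summarize this paper" requests.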
Agreed; you can try sending it in chunks, but then you lose context. Perhaps the ChatGPT-based API will help if they expose conversational memory as a feature.
Maybe OP has figured out a method with the current API?
I saw in another thread that people were working around this by asking for a summary of sections and then combining the summaries and asking for a joint summary.
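That workaround is a single map-reduce pass; a minimal sketch, again with a hypothetical `summarize(prompt)` helper standing in for the API call:

```python
from typing import Callable, List


def chunked_summary(sections: List[str],
                    summarize: Callable[[str], str]) -> str:
    """Summarize each section independently, then ask for one joint
    summary of the concatenated section summaries."""
    parts = [summarize(f"Summarize:\n{s}") for s in sections]
    return summarize("Combine these section summaries into one summary:\n"
                     + "\n\n".join(parts))
```

The known weakness is the one mentioned above: each section summary is produced without knowing what the rest of the document is about.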
This is an issue. I haven't experimented to see if there are workarounds, so the service currently checks the length of the article text and, if it's very long, sends only a portion; otherwise we'd exceed the token limit. There's a note on the front page about it: "Limitations: The OpenAI API does not allow submission of large texts, so summarization may only be based on a portion of the whole article."
I tried; they don't. It seems that when they were ranking #1 on HN yesterday, someone posted a summary (top comment) of what they're for that isn't quite correct.
Can't find it for some reason, can you provide a link? Did they summarize with GPTSimpleVectorIndex or GPTListIndex? GPTSimpleVectorIndex is in get-started examples and is cheaper, but it provides worse results.