I've been working on a project I call DocGPT: https://github.com/featurebasedb/DocGPT. It's a work in progress; I'll be putting together a release next week, once I've finished the instructions for running it.
My assumption was that GPT-4 could be used to extract keyterms and questions from a document's fragments, and perhaps to build a semantic graph that could be leveraged for rapid prompt assembly, augmenting the vector searches that are normally done. The idea is to set "attention flags" on the document during discussion.
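Roughly, the retrieval step I have in mind looks like the sketch below (simplified for illustration, not the actual repo code; the Fragment/score/assemble_prompt names and the blending weight are placeholders): each stored fragment carries both an embedding and its GPT-extracted keyterms, and retrieval blends vector similarity with keyterm overlap.

```python
# Minimal sketch of the hybrid retrieval idea: fragments carry embeddings plus
# GPT-extracted keyterms, and ranking blends cosine similarity with keyterm
# overlap ("attention flags"). Helper names and weights are illustrative only.

from dataclasses import dataclass, field
import math

@dataclass
class Fragment:
    text: str
    embedding: list[float]                            # from any embedding model
    keyterms: set[str] = field(default_factory=set)   # extracted at index time

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def score(query_emb: list[float], query_terms: set[str], frag: Fragment,
          term_weight: float = 0.3) -> float:
    """Blend nearest-neighbor similarity with keyterm overlap."""
    vector_sim = cosine(query_emb, frag.embedding)
    overlap = len(query_terms & frag.keyterms) / max(len(query_terms), 1)
    return (1 - term_weight) * vector_sim + term_weight * overlap

def assemble_prompt(question: str, query_emb: list[float], query_terms: set[str],
                    fragments: list[Fragment], k: int = 3) -> str:
    """Pick the top-k fragments under the blended score and build the prompt."""
    ranked = sorted(fragments, key=lambda f: score(query_emb, query_terms, f),
                    reverse=True)
    context = "\n\n".join(f.text for f in ranked[:k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```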
Here are some questions it asked and answered, using the keyterms in each fragment (in addition to nearest-neighbor vectors) to identify relevant document fragments for assembly:
system> How can attention head comparison and model pruning be used to identify anomalies in transformer behaviour?
bot> Attention head comparison and model pruning can be used to identify irregularities and potential behavioural issues with transformers through attention pattern exploration, which can provide insights into why large language models fail at reasoning tasks or math. Comparing heads could allow to find parts of the model that are actually useful, and tracking attention flow throughout the model could help with hypothesis testing and visualizing induction head pairs.
system> How does Attention Viz help to visualize multiple datasets from different NLP tasks?
bot> AttentionViz can identify consistent shapes and patterns across different NLP tasks and can also identify anomalies and unexpected behavior, as well as facilitate attention pattern comparisons.
system> What are the mean correlations between query-key distances and dot products for BERT and GPT on Wiki-Auto data?
bot> -0.938 for BERT and -0.792 for GPT
system> What technologies are used in the Attention Viz prototype?
bot> The AttentionViz prototype uses a Python/Flask backend, a Vue/Typescript frontend, a DeepLabv3 segmentation model and Deck.gl to visualize the corresponding query-key joint embeddings.
tl;dr: a 'parse tree' for how their (transformer-based) LLM weights the importance of the tokens in a given prompt. The authors noticed patterns in the visualization that they felt mirrored relationships we would recognize as semantically related.
That's as much as I can summarize, given my ignorance.
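For whatever it's worth, here's my toy mental model of what the joint query-key view seems to be plotting (random numbers in place of a real model, and the PCA projection is just my stand-in for however the authors actually reduce dimensions):

```python
# Toy sketch of a "joint query-key embedding": queries and keys from one
# attention head are placed in the same low-dimensional space, so query-key
# pairs that land close together roughly correspond to token pairs the head
# attends between. Random data stands in for a real model's activations.

import numpy as np

rng = np.random.default_rng(0)
d_head = 64                       # per-head dimension (toy value)
n_tokens = 10

Q = rng.normal(size=(n_tokens, d_head))   # stand-ins for one head's queries
K = rng.normal(size=(n_tokens, d_head))   # and keys for the same tokens

# Scaled dot-product attention weights for this head.
scores = Q @ K.T / np.sqrt(d_head)
attn = np.exp(scores - scores.max(axis=1, keepdims=True))
attn /= attn.sum(axis=1, keepdims=True)

# Joint embedding: stack queries and keys, project to 2-D with PCA (via SVD),
# keeping track of which rows are queries vs. keys for coloring in a plot.
joint = np.vstack([Q, K])
joint -= joint.mean(axis=0)
_, _, vt = np.linalg.svd(joint, full_matrices=False)
coords_2d = joint @ vt[:2].T      # first n_tokens rows: queries, rest: keys

print("attention row for token 0:", np.round(attn[0], 3))
print("2-D query/key coordinates shape:", coords_2d.shape)
```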
user-aRxb[attentionviz.pdf]> summarize the document's contents
bot> Querying GPT...
bot> This paper describes AttentionViz, a tool designed to provide a global view of Transformer Attention that can be used to identify sentence view, clusters of search results, semantic behavior, and fine-grained attention patterns; experts have suggested additional interaction modes and global search patterns to be used to quickly scan for and compare attention trends.
In the meantime, I've found the paper's page interesting: https://catherinesyeh.github.io/attn-docs/