Could this matrix be compressed to binary form for storage in a binary index?



That wouldn't really help. Let me explain in a bit more detail. The result depends on the query matrix, which will be different for each set of queries. We have a query matrix Q of dimension 10,000x128, and another vector matrix A of dimension 1,000,000x128. We preprocess both Q and A so that each row has unit norm:

    Q[i,:] /= norm(Q[i,:])
    A[k,:] /= norm(A[k,:])
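
In runnable numpy form, the same preprocessing vectorized over all rows (a sketch; the random data just stands in for real embeddings):

    import numpy as np

    rng = np.random.default_rng(0)
    Q = rng.standard_normal((10_000, 128)).astype(np.float32)     # query matrix
    A = rng.standard_normal((1_000_000, 128)).astype(np.float32)  # vector matrix

    # Same preprocessing as above, all rows at once.
    Q /= np.linalg.norm(Q, axis=1, keepdims=True)
    A /= np.linalg.norm(A, axis=1, keepdims=True)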
So now, with that preprocessing, the cosine similarity of row i of Q and row k of A is:

    cossim(i,k) = dot(Q[i,:], A[k,:])
If you multiply Q x A.T, i.e. (10,000 x 128) x (128 x 1,000,000), you get a result matrix of shape (10,000 x 1,000,000) holding the cosine similarity for every combination of query and vector.
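
In numpy that multiply is one line, though the full 10,000 x 1,000,000 float32 result would be about 40 GB, so a real pass would process the queries (or A) in blocks. A sketch for a block of 100 queries:

    # S[i, k] == cossim(i, k), because every row of Q and A has unit norm.
    S = Q[:100] @ A.T    # shape (100, 1_000_000), ~400 MB in float32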

If you then make a pass along each row (one row per query) with a priority queue of size n, you can find that query's top-n cosine similarity values in O(1,000,000 · log n) time.
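
Here is a sketch of that scan using Python's heapq: a bounded min-heap of size n per row, so each of the 1,000,000 entries costs at most O(log n):

    import heapq

    def top_n(row, n=10):
        # Bounded min-heap of (score, index) pairs; the weakest of the
        # current top-n sits at heap[0] and is evicted by anything better.
        heap = []
        for k, score in enumerate(row):
            if len(heap) < n:
                heapq.heappush(heap, (score, k))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, k))
        return sorted(heap, reverse=True)   # best match first

    hits = [top_n(S[i]) for i in range(S.shape[0])]

In vectorized numpy you would more likely reach for np.argpartition(S, -n, axis=1) and then sort just the n selected columns per row.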

Now, you could store the resulting matrix, but Q is going to change on every call, and we really only care about the top-n values for each query, so storing it wouldn't accomplish anything.

Edited: fixed lots of typos


I asked GPT-3 about it, using an array of embedding vectors for fragments of this page, weighted by relevance to the query (computed with np.dot(v1,v2)). These weighted fragments are used to build the prompt submitted to the OpenAI API. I'm interested in storing these vectors in a very fast DB for memories.
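
The prompt-assembly step looks roughly like this (a sketch only; build_prompt, the fragment list, and the budget of 5 fragments are illustrative assumptions, not the actual code):

    import numpy as np

    def build_prompt(question, query_vec, fragments, frag_vecs, budget=5):
        # Score each page fragment by dot product against the query vector
        # (with unit-norm embeddings this is exactly cosine similarity).
        scores = np.array([np.dot(query_vec, v) for v in frag_vecs])
        best = np.argsort(scores)[::-1][:budget]    # most relevant first
        context = "\n".join(fragments[i] for i in best)
        return f"{context}\n\nQ: {question}\nA:"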

pastel-mature-herring~> Could this matrix be compressed to binary form for storage in a binary index?

angelic-quokka|> It is possible to compress the matrix to binary form for storage in a binary index, but this would likely decrease the accuracy of the cosine similarity values.
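
For reference, the usual binary compression here is sign-bit quantization searched by Hamming distance, which is where the accuracy loss comes from; a minimal sketch, assuming numpy (binarize and hamming_sim are illustrative names):

    import numpy as np

    def binarize(X):
        # One bit per dimension: 1 where the component is positive.
        # packbits turns 128 dims into 16 bytes per vector.
        return np.packbits(X > 0, axis=1)

    def hamming_sim(q_bits, A_bits):
        # XOR marks differing bits; fewer differing bits = more similar.
        diff = np.bitwise_xor(q_bits, A_bits)
        return -np.unpackbits(diff, axis=1).sum(axis=1)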



