It's *a* representation, not *the* representation. By looking at the internet, e...

maegul · on Feb 22, 2023

Yea, this.

The same point or a similar critique can be made a few ways, I’d say.

Running with the brain/neuron analogy, there’s a measurement problem (as there is in real neuroscience!). The synaptic activity of the “meta-mind” has been recorded with keyboards, smartphones and plain text. These aren’t the native channels of human communication though, the synapses if you will. Those are more like spoken conversation and physical interaction, all richer phenomena.

To the extent that “textual” communication is now native/normal to humanity, it still covers only part of human interaction, is new, and keeps shifting with tech developments like video/streaming.

So the internet is a lossy representation, apart from whatever other biases it might have, as suggested above.

brookst · on Feb 22, 2023

Do the datasets follow the algorithmic weightings? I thought they included all content for their domains, without weighting by popularity or engagement algorithms.