First step is to generate a graph distance matrix to use as features.
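
As a rough sketch of what that matrix looks like, here is an all-pairs hop-count computation for an unweighted graph using one breadth-first search per node. The `graph_distance_matrix` name and the toy adjacency dict are made up for illustration. A dense matrix needs O(n^2) memory, so on a big graph you would likely compute distances to a sample of landmark nodes instead.

```python
from collections import deque


def graph_distance_matrix(adjacency):
    """Return (nodes, dist) where dist[i][j] is the hop count between nodes i and j."""
    nodes = sorted(adjacency)
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    inf = float("inf")
    dist = [[inf] * n for _ in range(n)]
    for source in nodes:
        row = dist[index[source]]
        row[index[source]] = 0
        queue = deque([source])
        # Standard BFS: the first time a node is reached gives its shortest hop count.
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if row[index[v]] == inf:
                    row[index[v]] = row[index[u]] + 1
                    queue.append(v)
    return nodes, dist


# Hypothetical toy graph: a triangle (a, b, c) with a tail c - d - e.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
nodes, dist = graph_distance_matrix(toy_graph)
```

Each row of `dist` can then be used as the feature vector for the corresponding node.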

You can probably do the hierarchical clustering with HDBSCAN in reasonable time; it's a fast algorithm.

To get any sort of 2D display you need to project the nodes down to two dimensions, which might require some form of PCA given the data set size. UMAP might also work.
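
For the projection, a bare-bones PCA can be written with NumPy's SVD. This is only a sketch: `pca_project_2d` is a made-up helper, the input is a toy distance matrix used as features, and for a large data set you would more likely reach for a randomized or incremental PCA from a library.

```python
import numpy as np


def pca_project_2d(features):
    """Project rows of `features` onto their first two principal components."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)  # center each feature column
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T


# Toy distance matrix (stand-in for graph distances), one row of features per node.
positions = np.array([0, 1, 2, 10, 11, 12], dtype=float)
distances = np.abs(positions[:, None] - positions[None, :])

coords = pca_project_2d(distances)  # shape (n_nodes, 2): x/y for plotting
```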

From there, you can use an R*-tree in conjunction with "cut-depth" cluster segmentation tied to zoom level, with additional entity selection based on count and centrality. If you load the data into Postgres, PostGIS can do this in one query.
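
To make the "cut-depth" idea concrete, here is a toy sketch in plain Python. Everything in it is hypothetical: the `Cluster` class, the field names, and the rule that zoom level maps directly to tree depth. A linear bounding-box filter stands in for the R*-tree lookup, which a real setup would replace with a spatial index.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    x: float
    y: float
    count: int          # number of entities in the cluster
    centrality: float   # some centrality score for its representative node
    children: list = field(default_factory=list)


def cut_at_depth(root, depth):
    """Cut the hierarchy `depth` levels below the root; leaves persist below their own depth."""
    level = [root]
    for _ in range(depth):
        next_level = []
        for cluster in level:
            next_level.extend(cluster.children if cluster.children else [cluster])
        level = next_level
    return level


def visible_clusters(root, zoom, bbox, max_items=2):
    """Clusters to draw at this zoom: inside the viewport, ranked by count then centrality."""
    xmin, ymin, xmax, ymax = bbox
    # Linear scan as a stand-in for an R*-tree window query.
    in_view = [c for c in cut_at_depth(root, zoom)
               if xmin <= c.x <= xmax and ymin <= c.y <= ymax]
    in_view.sort(key=lambda c: (c.count, c.centrality), reverse=True)
    return in_view[:max_items]


# Hypothetical three-level hierarchy.
root = Cluster("all", 5, 5, 6, 1.0, [
    Cluster("left", 2, 5, 4, 0.8, [
        Cluster("l1", 1, 5, 3, 0.6),
        Cluster("l2", 3, 5, 1, 0.2),
    ]),
    Cluster("right", 8, 5, 2, 0.5),
])

whole_map = (0, 0, 10, 10)
zoomed_out = [c.name for c in visible_clusters(root, 0, whole_map)]
zoomed_in = [c.name for c in visible_clusters(root, 2, whole_map)]
```

Zooming in moves the cut deeper into the hierarchy, so coarse clusters split into their children while the per-view cap keeps the display from filling up.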

All pretty straightforward stuff.