This is exactly why tokenizers should provide offsets into the original text! The spaCy tokenizer provides this, though the original WordPiece tokenizer released with BERT does not. It is relatively easy to add offset information to that tokenizer, though. Once you have start and end offsets for each token, it's not hard to align the two tokenizations by offset overlap.
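For what it's worth, here is a minimal sketch of what I mean by aligning on offset overlap. It is plain Python, not the spaCy or BERT code: `token_offsets` and `align_by_overlap` are names I made up, the sub-word pieces are invented for illustration, and I'm assuming every token appears verbatim in the text (real WordPiece output has `##` prefixes and may be lowercased, so you would strip or map those first).

```python
def token_offsets(text, tokens):
    """Recover (start, end) character offsets by scanning left to right.

    Assumes each token occurs verbatim in `text`, in order.
    """
    offsets = []
    cursor = 0
    for tok in tokens:
        start = text.index(tok, cursor)  # raises ValueError if not found
        end = start + len(tok)
        offsets.append((start, end))
        cursor = end
    return offsets


def align_by_overlap(offsets_a, offsets_b):
    """For each span in A, list the indices of spans in B that overlap it."""
    alignment = []
    for a_start, a_end in offsets_a:
        alignment.append([
            j for j, (b_start, b_end) in enumerate(offsets_b)
            # Two half-open spans overlap if each starts before the other ends.
            if a_start < b_end and b_start < a_end
        ])
    return alignment


text = "Tokenizers disagree."
words = ["Tokenizers", "disagree", "."]            # word-level tokens
pieces = ["Token", "izers", "dis", "agree", "."]  # hypothetical sub-word pieces

word_offsets = token_offsets(text, words)
piece_offsets = token_offsets(text, pieces)
alignment = align_by_overlap(word_offsets, piece_offsets)
```

Here `alignment[i]` holds the indices of the pieces that cover word `i`. The nested loop is quadratic; since both offset lists are sorted you could do it in one linear pass, but for sentence-sized inputs it hardly matters.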
These days, at least in non-research contexts, people seem to mostly use spacy-transformers for that, which already contains all the necessary glue code between Hugging Face transformers (including BERT) and spaCy.
Does anyone know how those beautiful "terminal" images were generated? Or can anyone recommend some resources for making similar images, or for actually configuring iTerm2/vim/etc. to look like that?