The unredacting part was originally born out of my experiments with a word-level LSTM approach trained on everything the SCO had released. More relevantly, that part was quickly abandoned. The project is now all about extracting date-referenced narrative text, and the combination of the NER and the dependency parser has been amazing. Together, they’ve let me begin an extension that dereferences relative dates and last names as though they were pronouns.
displaCy will make an appearance in the final “public” post I’m writing, as well as the tech post for the HN crowd. Thank you so much for your work on spaCy/displaCy!
(And I say this as the developer of the tools he'd likely be using... At least, he has a screenshot of our NER visualiser there.)