The algorithm works as follows: Read 5 characters. Is that in the dictionary? No. What about the first 4? No. 3? No. 2? Yes! Copy to the output string, add a space, slide along 2 characters. Repeat.
Searching a 5 MB dictionary repeatedly like this was too slow. I made it a little faster by splitting the dictionary based on the number of characters in the word (fiveWords, fourWords, threeWords, twoWords). It's still slow, though. I have to run it in local JavaScript, because I don't have a server. Any suggestions?
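For concreteness, the inner loop is roughly this (a simplified sketch, with the dictionary assumed to be an array of words; putting each length bucket into a Set instead of an array makes every membership test O(1), which may be most of the win):

    // Greedy longest-match loop, as described above.
    // Assumes `words` is the dictionary as an array of strings.
    const byLength = new Map();                  // word length -> Set of words
    for (const w of words) {
      if (!byLength.has(w.length)) byLength.set(w.length, new Set());
      byLength.get(w.length).add(w);             // Set.has() is O(1); scanning an array is O(n)
    }

    function segment(text) {
      let out = "";
      for (let i = 0; i < text.length; ) {
        let len = Math.min(5, text.length - i);
        // Try 5 characters, then 4, 3, 2 (falling through to 1 when nothing matches).
        while (len > 1 && !byLength.get(len)?.has(text.slice(i, i + len))) len--;
        out += text.slice(i, i + len) + " ";     // copy to the output string, add a space
        i += len;                                // slide along by the match length
      }
      return out;
    }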
At the other extreme, the iTunes Store's EPF data (freely available!) is 55 GB of text. I'd like to sort out Artist names by Producer. Grepping the file takes 30 minutes. Is there a faster way?
Third, my algorithm for offline StackOverflow was optimised for disk space. I search a 200 MB plain-text index of titles to get the post ID. Then I use dd to pull a 4 KB chunk out of the tar.bz2 archive, which I can read with bzip2recover. Then I check whether the post ID is in that chunk, and binary-search like that. It's slow (about 5 seconds to load a post), but it doesn't waste my precious disk space, and I can be patient when I need offline StackOverflow.
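(The dd step is just a byte-range read, so it can also be done in-process. A minimal Node sketch, with the byte offset assumed to come from the binary search; decompressing the recovered block is still bzip2recover's job:)

    // Read a 4 KB chunk at a given byte offset:
    // the in-process equivalent of `dd bs=4096 skip=N count=1` (offset = N * 4096).
    const fs = require("fs");

    function readChunk(path, offset, size = 4096) {
      const fd = fs.openSync(path, "r");
      const buf = Buffer.alloc(size);
      const n = fs.readSync(fd, buf, 0, size, offset);  // read `size` bytes starting at `offset`
      fs.closeSync(fd);
      return buf.subarray(0, n);
    }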
I would put your dictionary into a prefix tree of some sort, and the first implementation I would reach for is a ternary search tree [1]. As you scan your input, walk the tree. Each time you get to the end of the tree, insert a space and go back to the root. Once the tree is built, segmentation is O(length of input) [2], regardless of dictionary size.
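To make that concrete, here's a rough sketch in plain JavaScript, using an object-based trie rather than a ternary search tree (the walk-and-reset idea is the same):

    // Rough sketch: object trie + walk-and-reset segmentation.
    function buildTrie(words) {
      const root = Object.create(null);          // null prototype avoids key collisions
      for (const w of words) {
        let node = root;
        for (const ch of w) node = node[ch] ??= Object.create(null);
      }
      return root;
    }

    function segment(text, root) {
      let out = "", node = root, start = 0, i = 0;
      while (i < text.length) {
        const next = node[text[i]];
        if (next) {                              // keep walking down the tree
          node = next;
          i++;
        } else if (node !== root) {              // end of the tree: insert a space, reset
          out += text.slice(start, i) + " ";
          node = root;
          start = i;
        } else {                                 // character not in the dictionary at all
          out += text[i] + " ";
          i++;
          start = i;
        }
      }
      return out + text.slice(start);
    }

One thing this glosses over: it splits wherever the walk runs out, so a word that is a prefix of a longer word can swallow the following characters; a fuller version would mark word endings in the trie and backtrack to the last complete word.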
As for offline StackOverflow, you should use a database engine like SQLite or Redis to manage the indexing into smaller blocks of compressed post data on disk, or store the posts directly in the database and keep the database file on a filesystem like Btrfs that supports transparent compression.
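A minimal sketch of the SQLite route in Node, assuming the better-sqlite3 package (the schema and file names are made up for illustration):

    // Posts stored directly in SQLite, looked up by ID.
    // Assumes `npm install better-sqlite3`.
    const Database = require("better-sqlite3");
    const db = new Database("stackoverflow.db");

    db.exec(`CREATE TABLE IF NOT EXISTS posts (
      id    INTEGER PRIMARY KEY,  -- the primary key IS the index; no hand-rolled binary search
      title TEXT,
      body  BLOB                  -- optionally compressed per post, e.g. with gzip
    )`);

    const getPost = db.prepare("SELECT title, body FROM posts WHERE id = ?");
    console.log(getPost.get(4));  // B-tree lookup inside one file; milliseconds, not seconds

You keep the single-file, space-friendly property, and the ID lookup becomes a B-tree search instead of dd arithmetic.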
https://pingtype.github.io