Hacker News new | past | comments | ask | show | jobs | submit login

Regarding "Match Against Multiple Patterns", the examples are about finding the _suffixes_ of something, email domains in the example.

An attempt to find a suffix like that will not be able to use an index, whereas creating a functional index on the reverse and looking for the reversed suffix as a prefix will be:

  # create table users (id int primary key, email text);
  CREATE TABLE
  # create unique index on users(lower(email));
  CREATE INDEX
  # set enable_seqscan to false;
  SET
  
  # insert into users values (1, 'foo@gmail.com'), (2, 'bar@gmail.com'), (3, 'foo@yahoo.com');
  INSERT 0 3
  # explain select * from users where email ~* '@(gmail.com|yahoo.com)$';
                                  QUERY PLAN
  --------------------------------------------------------------------------
   Seq Scan on users  (cost=10000000000.00..10000000025.88 rows=1 width=36)
     Filter: (email ~* '@(gmail.com|yahoo.com)$'::text)

  # create index on users(reverse(lower(email)) collate "C"); -- collate C explicitly to enable prefix lookups
  CREATE INDEX

  # explain select * from users where reverse(lower(email)) ~ '^(moc.liamg|moc.oohay)';
                                               QUERY PLAN
  ----------------------------------------------------------------------------------------------------
   Bitmap Heap Scan on users  (cost=4.21..13.71 rows=1 width=36)
     Filter: (reverse(lower(email)) ~ '^(moc.liamg|moc.oohay)'::text)
     ->  Bitmap Index Scan on users_reverse_idx  (cost=0.00..4.21 rows=6 width=0)
           Index Cond: ((reverse(lower(email)) >= 'm'::text) AND (reverse(lower(email)) < 'n'::text))

(Another approach could of course be to tokenize the email, but since it's about pattern matching in particular)



Alternatively you could use a trigram index which i've used in the past successfully to speed up queries using like '%gmail.com'.

Blogpost discussing this approach: https://about.gitlab.com/blog/2016/03/18/fast-search-using-p...




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: