A few months ago I attended a talk by Alon Halevy (of Google) on the algorithms behind this. For those who are curious, several of his papers go into more detail. See the publications listed at http://alonhalevy.googlepages.com/, specifically those on WebTables and dataspaces.