January 16, 2006

Search Tech Opportunities - Improvement #8: Machine Extraction

Following up on my post on the opening of the search tech chain, the eighth opportunity that I see is in machine extraction and, even better, in “linking” the material that machine extraction produces. While this is really a subset of machine learning, the concept is different enough to discuss separately. The basic question is how to extract information from the web and put it into a more structured and, more importantly, accurate form, and then how to infer the relationships between the data elements that you have extracted.
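To make the two steps concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the sentence pattern, the sample pages, the names in them, and the helper functions `extract_records` and `link_colleagues` are all hypothetical, and a real extraction system would rely on far more robust techniques than a single regular expression. The first function turns free text into structured records; the second infers a relationship that no page states explicitly (two people who share a company).

```python
import re
from collections import defaultdict

# Hypothetical pattern: "<First Last> is a/an/the <title> at <Company>".
# Real extractors use much richer models; this is only a toy.
PATTERN = re.compile(
    r"(?P<person>[A-Z][a-z]+ [A-Z][a-z]+) is (?:a|an|the) "
    r"(?P<title>[a-z ]+?) at "
    r"(?P<company>[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)*)"
)


def extract_records(text):
    """Step 1: pull structured (person, title, company) records out of free text."""
    return [match.groupdict() for match in PATTERN.finditer(text)]


def link_colleagues(records):
    """Step 2: infer a non-explicit link, here people who share a company."""
    people_by_company = defaultdict(set)
    for record in records:
        people_by_company[record["company"]].add(record["person"])

    links = set()
    for people in people_by_company.values():
        ordered = sorted(people)
        # Pair every person with every later person at the same company.
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                links.add((first, second))
    return links


# Made-up sample "pages" standing in for crawled web text.
PAGES = [
    "Jane Doe is a partner at Example Capital.",
    "John Roe is an analyst at Example Capital.",
    "Mary Major is the founder at Sample Labs.",
]

records = [record for page in PAGES for record in extract_records(page)]
links = link_colleagues(records)

for record in records:
    print(record)
print(links)
```

Note that neither sample page says Jane Doe and John Roe are connected; the link only appears once the extracted records are put side by side. That is also where errors creep in, since two different firms with similar names could easily be merged by a matcher this naive.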

A good example of a company in this field is ZoomInfo. Their tagline is “the search engine for discovering people, companies, and relationships.” I typed my name in to see what they had on me, and the information that they were able to extract and put together was very complete. I saw one error: they matched me with an Insight Ventures whose name is confusingly similar to my firm’s. It is hard to hold that against them, however. Pretty impressive stuff!

It seems to me that machine extraction and relationship building are an important part of finding non-explicit links between entities (people, places, and things), as well as of compiling information on those entities. The web has a lot of resources (URIs) that help describe these entities, but machine extraction, along with other machine learning techniques, may be what is needed to push forward Tim Berners-Lee’s vision for the Semantic Web (a very powerful concept).

I would put this in the category of VERY advanced search with a HUGE amount of innovation potential!

