January 16, 2006

Search Tech Opportunities - Improvement #1: APIs From Everyone

Following up on my post about the opening of the search tech chain, I thought I would list a number of the things that I would like to see in search.

The first improvement is relatively simple. Give me an open API so that I can feed data into an aggregator, mix in some processing, and output a mashup for consumption.

  • From traditional search vendors, OpenSearch is a great start. I would like to be able to put in a search query and get back all of the resources (URIs) that are aligned with that query. I would also like you to append any metrics that you have calculated on each resource (e.g., links in, links out, and the metrics associated with each). Even better, include whatever network attributes surrounding the URI you are willing to offer.
  • From the social tagging sites, I would like to be able to input a list of URIs (probably from my search above) and get back the information that you have on each. For example, the tags that people are using for them, the number/percent of people that have tagged them (more advanced, the number of “experts” that have tagged them), etc. If you are really creative, you will come up with a lot of useful metrics for me.
  • From sites that measure/monitor traffic (this includes the ISPs, Alexa, Google, other search engines, etc.), I would like to be able to input a list of URIs and get back all of your statistics.
  • From sites that measure security risks, I would like to get back your security view on each URI.
  • From the “owner” of the URI, I will take any metadata that you want to provide, so long as it is organized according to some standard format.
  • From the Domain Registrars, I would like to get your information as well (by domain in this case).
  • I would also like similar input/output information from the message boards, forums, blog sites (or possibly the Feedburners of the world), Ajax desktops, and any other site that cares to offer an opinion on a URI (or the objects associated with the URI, such as person, organization, product, etc.).
  • If I ask really nicely, perhaps I can also have the request information from the DNS servers (yes, I know this is a major issue. But as long as I am asking…)
  • I am sure that I missed several classes of sites. If you have data to offer up related to a URI, I would love to get it (assuming it has information value).

In terms of the API, my first request is URI input and data back, but each site has a lot of other useful API calls (e.g., tag in, URLs out). It would be great if, over time, standards with robust APIs emerged for each class of site.
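To make the "URI in, data back" request a little more concrete, here is a minimal sketch of how an aggregator might merge answers from several classes of site. Everything in it is an assumption for illustration: `tagging_site` and `traffic_monitor` are hypothetical stand-ins that return canned data rather than calling any real service, and the field names (`tags`, `taggers`, `rank`) are invented.

```python
from typing import Callable, Dict, List

# Assumed contract: a provider takes a list of URIs and returns
# {uri: {field: value}} for the URIs it knows something about.
Provider = Callable[[List[str]], Dict[str, dict]]


def tagging_site(uris: List[str]) -> Dict[str, dict]:
    """Hypothetical stand-in for a social tagging site's URI lookup."""
    known = {"http://example.com/": {"tags": ["search", "api"], "taggers": 42}}
    return {u: known[u] for u in uris if u in known}


def traffic_monitor(uris: List[str]) -> Dict[str, dict]:
    """Hypothetical stand-in for a traffic measurement site's URI lookup."""
    known = {
        "http://example.com/": {"rank": 1200},
        "http://example.org/": {"rank": 98000},
    }
    return {u: known[u] for u in uris if u in known}


def aggregate(uris: List[str], providers: Dict[str, Provider]) -> Dict[str, dict]:
    """Ask every provider about every URI and merge the answers per URI."""
    merged: Dict[str, dict] = {u: {} for u in uris}
    for name, lookup in providers.items():
        for uri, data in lookup(list(uris)).items():
            if uri in merged:  # ignore URIs we did not ask about
                merged[uri][name] = data
    return merged


PROVIDERS = {"tags": tagging_site, "traffic": traffic_monitor}
result = aggregate(["http://example.com/", "http://example.org/"], PROVIDERS)
```

Keeping each provider's answer under its own key means the aggregator does not need to understand any one site's fields, which is roughly why a shared input convention (a list of URIs) matters more to me at first than a shared output schema.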

I would also like to see some intermediaries (à la Feedburner) form that will reduce some of your server load and mine and, perhaps, find some value-added processing to apply to the combined data streams.
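As a rough illustration of how such an intermediary might cut server load, the sketch below caches per-URI answers for a fixed time so that repeated requests never reach the original site. The class name `CachingIntermediary`, the `upstream` callable, and the five-minute default are all hypothetical choices of mine, not a description of how Feedburner or anyone else actually works.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachingIntermediary:
    """Sits between a client and an upstream URI lookup, caching answers."""

    def __init__(
        self,
        upstream: Callable[[List[str]], Dict[str, dict]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._upstream = upstream
        self._ttl = ttl_seconds
        self._clock = clock
        # uri -> (time the answer was fetched, answer)
        self._cache: Dict[str, Tuple[float, dict]] = {}

    def lookup(self, uris: List[str]) -> Dict[str, dict]:
        now = self._clock()
        # Only URIs that are absent or older than the TTL go upstream.
        missing = [
            u for u in uris
            if u not in self._cache or now - self._cache[u][0] > self._ttl
        ]
        if missing:
            fresh = self._upstream(missing)
            for u in missing:
                # Cache empty answers too, so unknown URIs are not re-asked.
                self._cache[u] = (now, fresh.get(u, {}))
        return {u: self._cache[u][1] for u in uris}
```

A second request for the same URIs within the time window is answered entirely from the cache, and a request that mixes old and new URIs sends only the new ones upstream.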

Okay, now I have a pretty nice set of possibly interesting URIs and the possibility of getting a lot of useful information that others are gathering on each. It is probably obvious what I want to do with all this, but I will be more explicit in a later post.


