January 27, 2006

You really need to focus!

Posted in CEO, management, Product Development at 9:14 am by scottmaxwell

I have posted several times on the issue of focus, including the opportunity to develop a scope advantage against the large company, the issue of time horizon of CEO focus vs. company size, and the issue of CEO time horizon focus by day of the year.

I am just completing a week of meetings with emerging growth technology companies in Germany and Russia, and the meetings reminded me again that everyone talks about focus, but very few companies actually do focus (this is true of emerging growth companies in all countries, not just the companies that triggered this post).

There are two major opportunities for most (all?) emerging growth companies:

  1. Narrow your focus – Reduce the number of products and the number of new features that you are trying to add to the products. Use the extra time to make the product and features that you are developing that much better (and simpler).
  2. Improve the target of your focus – Listen to your customers to help you narrow down to the right product and the right features. (Some ideas on how to do this are in my post on gaining an information advantage.)

Ok. I know…obvious points. But if the points are so obvious, why is it that so many companies feel that they understand their customers so well without spending much time with the customers? Why is it that they are thinking up grand new products when their current products have a long way to go before they are fully intuitive and extremely easy to use?

I think the issue is that most companies are not critical enough of themselves…they go through their day thinking that they know their customer (using logic and a lot of assumptions) and that they are focusing, when EVERY company has a significant opportunity to improve on both! (One senior manager on this trip went so far as to explain to me that talking to the customers will mislead the company into believing the feedback, which would only be relevant for that particular customer. While he is right to point out that you need to be careful about your approach and the conclusions that you reach, this is a very bad excuse for not taking the time to understand your customers’ point of view!!)

An approach that will help most (all?) companies is to appoint a person who is responsible both for formalizing and capturing the customers’ feedback AND for minimizing scope creep (this needs to be a highly disciplined person who feels comfortable keeping everyone focused and on target). The role could be called (or be part of) product marketing or product management. If you do not have this role in your organization, consider creating it and assigning the right type of person to it.

You have a huge opportunity to make better products while growing faster with fewer management headaches. At a minimum, walk around your company for the rest of the day saying to everyone that you meet “You really need to focus!” I am 100% certain that everyone will agree…

January 16, 2006

Search Tech Opportunities- Summary

Posted in innovate this!, Innovation at 8:28 pm by scottmaxwell

This is a summary of my series on Opening up the Search Tech Chain. The major point is that opening up the search technology model will have some great effects on all of the possible innovations around the model. The post on the opening of the search tech chain lays out my argument for opening up the tech value chain and links to some other bloggers’ thoughts and resources on OpenSearch and the opening of the Alexa A9 search engine platform for others to innovate on.

Just so that I am clear on the abundance of innovation opportunities related to search, I have several posts that describe some opportunities for innovators. The ideas overall can be applied to text, images, audio, or video, each of which has its own issues and opportunities.
The posts:

  1. Every site needs to expose its APIs. Open APIs would be a huge opportunity to improve search, even with today’s search technologies.
  2. Lots of innovation potential in micropayments. While the ecosystem could get started without a micropayments infrastructure, these issues will need to get worked out.
  3. Mashups will be a lot more interesting with the new APIs. In my view, the mashups for the end users are the most exciting innovations (but probably not the most difficult). I lay out my thoughts on the potential for mashups in this post.
  4. Help me find what I am looking for. The matching of search intent to search results works okay today, but there are lots of improvement opportunities.
  5. Some thoughts on improving tagging. The social tagging sites have changed my use of the internet in tremendous ways. I point out some of the improvements to the tagging that I would like to see in this post.
  6. Put more features in feature vectors. We need to move beyond exact word search into many new approaches to finding what we are looking for. The posts start getting a little more technical here with the ideas on creating and exposing larger feature vectors.
  7. Machine Learning as a Web Service. Machine learning is one of the developments on the internet that will change things dramatically for the users (and will offer some great economic results to the winners). This post outlines some thoughts on how it might be achieved as a web service.
  8. Machine Extraction. Finally, turning unstructured unlinked data into structured and linked data is unbelievably difficult but very powerful stuff. This post gives an example and outlines some other rough thoughts on the topic.

I am sure that I have missed lots of opportunities for innovation in search, but this list is still very long and hopefully demonstrates the point that there is a lot to do! If I have any other ideas worth posting, I will post them and link to them here. Please also comment if there are other ideas or resources that are valuable to this list.

Search Tech Opportunities- Improvement #8: Machine Extraction

Posted in innovate this!, Innovation at 7:49 pm by scottmaxwell

Following up my post on the opening of the search tech chain, the eighth opportunity that I see is with machine extraction and, even better, “linking” material from machine extraction. While this is really a subset of machine learning, the concept is different enough to discuss separately. The basic idea is how can you extract information from the web and put it into a more structured and, more importantly, accurate form. Then, how can you infer the relationships between the data elements that you have extracted.

A good example of a company related to this field is zoom info. Their tag line is “the search engine for discovering people, companies, and relationships.” I typed my name in to see what they had on me, and the list of information that they were able to extract and put together was very complete (I saw one error, as they matched me with an Insight Ventures that has a confusingly similar name to my firm. Hard to take that away from them, however). Pretty impressive stuff!

It seems to me that machine extraction and building relationships is an important part of finding non-explicit links between entities (people, places, and things) as well as compiling information on those entities. The web has a lot of resources (URIs) that help describe the entities, but Machine Extraction along with other Machine Learning techniques may be what is necessary to help push forward with Tim Berners-Lee’s vision for the Semantic Web (a very powerful concept).

I would put this in the category of VERY advanced search with a HUGE amount of innovation potential!

Search Tech Opportunities- Improvement #7: Machine Learning as a Web Service

Posted in innovate this!, Innovation at 7:23 pm by scottmaxwell

Following up my post on the opening of the search tech chain, the seventh opportunity that I see is with machine learning of various kinds. If you are not familiar with machine learning, take a look at Tom Mitchell’s book on the topic (funny, when I punched “machine learning” into Google, the first entry was an advertisement from Google asking “Want to work at Google?”)

This topic gets deep and broad very fast. I have put a lot of time into it over the years (starting with some credit card behavioral modeling in the early ’90s), but even with all the research I have done, I know only enough to be dangerous. Machine learning is relatively technical and relatively difficult to get exactly right (lots of math, CS, and art here).

I do, however, have a few non-technical thoughts on the topic:

What I would use it for…

Machine learning can be used for an amazing number of things (too many to describe here and many, many that I have not even considered). With respect to search (and assuming that all of my improvement ideas so far have had some level of development), innovators could create the following (and much more):

  • Propose tags for me on the social tagging sites (as I requested in an earlier post on tagging).
  • Given a set of resources (a.k.a., webpages, URIs), find resources that are similar (the find similar buttons on the search engines are really quite bad at this point). The approach that I would test would be asking the user what “dimensions” of similarity the user is looking for and then finding everything similar. Note that the approach would use the feature vectors and models from the machine learning algorithms.
  • Automatically update my Ajax desktop pico-domain. There is refresh work that would need to be done in addition to the machine learning, but I should be able to develop algorithms that help me to quickly find and update resources in my domain-specific site.

These are just a few examples of what solid machine learning can do. Clearly, it is not perfect, so I will still need to have some manual activity to sort through some bad results, particularly at the beginning. But the machine learning will save me a lot of time.
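
To make the “dimensions of similarity” idea a little more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the feature names, the made-up numbers, and the choice of cosine similarity as the measure. A real service would use whatever feature vectors and models it has actually built.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_similar(seed, candidates, dimensions):
    """Rank candidate resources by similarity to the seed resource,
    comparing only the features named in `dimensions`."""
    def project(features):
        return [features.get(d, 0.0) for d in dimensions]

    seed_vec = project(seed)
    scored = [(uri, cosine(seed_vec, project(f))) for uri, f in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature vectors (made-up values, not real data).
seed = {"topic_search": 0.9, "topic_finance": 0.1, "is_blog": 1.0}
candidates = {
    "http://example.com/a": {"topic_search": 0.8, "topic_finance": 0.2, "is_blog": 0.0},
    "http://example.com/b": {"topic_search": 0.1, "topic_finance": 0.9, "is_blog": 1.0},
}

# The user picks which "dimensions" matter for this search.
by_topic = find_similar(seed, candidates, ["topic_search", "topic_finance"])
by_format = find_similar(seed, candidates, ["is_blog"])
```

The same seed gives different rankings depending on the dimensions chosen: by topic, resource `a` comes out on top, while by format, resource `b` does.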

An innovative service…

The problem is that building good machine learning models is a time intensive task that starts with creating a model building environment. While many of the large companies appear to be doing just this, most smaller companies are standing flat footed in this area, due to lack of understanding, resources, and skills.

So, how about an innovative On-demand (SAAS) service as an offering in this area from a company that has the skills (generally now operating as professional services groups)? Most internet services could use machine learning of one kind or another, but they do not have the resources/skills to set up the model development environment. The service could be helpful with code that helps create feature vectors (and other machine learning inputs), recommend modeling approaches for the particular class of problems, walk the user through the model building process, and back test the models. Finally, it could deliver the code for the models or, possibly, even implement the models on its own systems as an ongoing service.

Since the inputs are all available via the Internet and there is a lot of work in the set-up of the model building environment, this innovation lends itself pretty nicely to be set up as an internet based service.

Given that the process still has a fair amount of art in it, the service could also offer up expert model building advisors to its customers.

If some innovators do not move in this direction for the community at large to share, this will become a major area of strategic advantage for the larger companies over the next few years. Perhaps Alexa (or another innovator) will move in this direction for the benefit of many?
Huge amount of innovation potential here!

Search Tech Opportunities- Improvement #6: Feature Vectors

Posted in innovate this!, Innovation at 5:27 pm by scottmaxwell

Following up my post on the opening of the search tech chain, the next few opportunities start getting slightly more technical. The sixth opportunity that I see is with more useful features in the search feature vectors and the mathematical combination of entries in those vectors.
Briefly, the current features that I can use to extract resources in the major search engines are word-based where each resource can be retrieved based on a series of words. (Note that this is not completely accurate, as there are a few other features that could be searched on such as language, file format, date of update, and domain suffix but the vast majority of the entries in the vector currently represent words).

Some Ideas (aimed at the search engines):

  • How about giving me a few more features to search on? For example, search engines seem to be storing away bolded and highlighted words to use in their prioritization schemes. How about exposing some of them so I can use them in a basic (or more advanced) search?
  • More advanced, give me the access to a large number of features that you are not already calculating, but should be (maybe you are already?). For example, I am always searching for interesting new technology product companies. Most of them have a tab-based link on their home page that says “product” (sometimes plural, but the stem is fine). How about a calculated feature that I can search on? This is one of a huge number of possible features.
  • Even better, let me make my own features calculations with some tools that you provide and you execute them and store them on your systems!
  • Just as important as the exposure to features, I would like to have the ability to make calculations off of the features and store them in your system. This will allow me to create some machine learning models (at a higher, concept level) and make the calculations in advance of my searches. (I will reduce the load on the systems by only using specific features through this method, as I will use my composite variable for the resource extraction.) This will also give me the ability to store some very advanced searches AND some of my schemes for prioritizing results. If you are really generous, you would allow me to make the calculations through the API as well.
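
As a rough illustration of a calculated feature and a composite variable, here is a small Python sketch. It is only a sketch under assumptions: the sample home page, the second feature (`mentions_technology`), and the weights are all made up, and no search engine's actual feature store is being described.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collects the visible text of every <a> element on a page."""

    def __init__(self):
        super().__init__()
        self._in_link = False
        self.link_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self.link_texts.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False

    def handle_data(self, data):
        if self._in_link:
            self.link_texts[-1] += data


def has_product_link(html):
    """Calculated feature: 1.0 if any link text starts with the stem 'product'."""
    parser = LinkTextCollector()
    parser.feed(html)
    found = any(t.strip().lower().startswith("product") for t in parser.link_texts)
    return 1.0 if found else 0.0


def composite_score(features, weights):
    """Composite variable: a weighted sum of stored features."""
    return sum(w * features.get(name, 0.0) for name, w in weights.items())


# Hypothetical home page and hypothetical weights.
home_page = '<nav><a href="/products">Products</a> <a href="/about">About</a></nav>'
features = {"has_product_link": has_product_link(home_page), "mentions_technology": 1.0}
weights = {"has_product_link": 0.7, "mentions_technology": 0.3}
score = composite_score(features, weights)
```

The point of storing `score` rather than the raw features is that a later search only needs to filter or sort on the one composite value.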

I think Alexa already allows me to do this with its open platform, but I have not studied it enough at this point.

The net result is that I should be able to do some really interesting things with the information, especially if the other components of the open platform are in place (prior posts). The Amazon Camera Image search is one good example of what is possible (even if this does not interest you from a user standpoint, the search ability is amazingly specific).

Search Tech Opportunities- Improvement #5: Improved Tagging

Posted in innovate this!, Innovation at 3:29 pm by scottmaxwell

Following up my post on the opening of the search tech chain, the fifth opportunity that I see is with improved tagging. (While tagging may not be thought of in a traditional search sense, the reason for tags is to find things, so I include it here.)

The basic issue is that, while I love the tagging sites, I suck at tagging. When I find a site that I like, I would like to remember it, so I tag it and put it on Delicious or Wink, right? Well, it doesn’t work very well for me for several reasons. First, I have a hard time thinking about the tags that I should have. Second, I have a hard time remembering my tags. Third, the thought pattern that I have at the time of tagging is usually different than the thought pattern that I have at the time of retrieval. Fourth, the semantic meaning of one person’s tags can be very different than another person’s tags. Finally, those tag clouds definitely were not built with me in mind (they are attractive in an artistic way, but hard to use).

The net net of it is that a given resource ends up being tagged more generally rather than specifically, gets tagged based on a thought at the time of tagging, and gets tagged in a subset of all possible tags. This causes all sorts of retrieval problems.

Some thoughts on improvements to tagging:

  • Bare minimum, when I find a resource that I want to tag, let me know what others have tagged it and let me tick off the tags that I want to use for it.
  • Allow users to put together tag trees (already starting to happen; WordPress, for example, allows me to nest categories to two levels) that allow the tagger and the reader to better understand how a given tag fits into the world. (As a side note, everyone knows that single taxonomy trees suck, but faceted tree taxonomies are in my view the ultimate approach, which is effectively this point. In my view, there can be lots of different overlapping trees that change over time, which allows for the “messy” world to be described better.)
  • Allow users to create tag trees that can be used by other users.
  • Use machine learning to propose possible trees and tags at the time of tagging of a resource AND to propose trees and tags to the user at the time of search/retrieval.
  • I am not sure of the answer to the tag clouds, especially if others like them. Perhaps a more organized way of representing how tags or trees relate would be helpful?
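
The “bare minimum” request above (show me what others have tagged a resource) is simple enough to sketch in Python. This is only an illustration: the tag lists are made up, and a real tagging site would pull them from its own data.

```python
from collections import Counter


def propose_tags(others_tags, limit=5):
    """Return the tags other users applied most often to a resource,
    most common first, so the tagger can tick off the ones to keep."""
    counts = Counter(tag.lower() for tags in others_tags for tag in tags)
    return [tag for tag, _ in counts.most_common(limit)]


# Hypothetical tags that three other users gave the same resource.
others = [["search", "api"], ["Search", "mashup"], ["search", "api", "web2.0"]]
suggestions = propose_tags(others, limit=2)  # ['search', 'api']
```

A machine learning version would replace the simple count with a model, but the interface (resource in, proposed tags out) could stay the same.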

Again, there is an enormous amount of innovation potential here…

Search Tech Opportunities- Improvement #4: Quickly Find What I am Looking for

Posted in innovate this!, Innovation, management at 2:06 pm by scottmaxwell

Following up my post on the opening of the search tech chain, the fourth opportunity that I see is with getting back from a search exactly what I am looking for (a.k.a., user intent). There seem to be various issues and approaches to resolving this issue. I love Google (it looks like Fred Wilson likes Yahoo even more), but I think there are a lot of improvement opportunities available for getting me what I want a lot more quickly (of course, this cuts down on opportunities to show me advertising which could be an EXTREME disincentive for the large search engines to execute well here).

Some thoughts:

  • The current search approach is word-based. If I choose the right words, I get the right set of results. If I am clever (negative words, quotes, etc.), I will get even more accurate results. But what if I don’t know the right words (or, perhaps the sites don’t know the right words) and what if I am not clever? Quintura has a new solution to help me in certain ways (at least if I have a Windows machine at this point). Perhaps there are other, better ways as well, such as search engines offering up concept search in addition to word search (and some other features that I will discuss in a later post). btw, I met with the Quintura guys on my last trip to Moscow. Really smart team and making a lot of rapid progress with their product.
  • Even if the word-based search works with my current words, the SEO crowd has crowded out many of the sites that I am really looking for, so I still have a problem finding what I am looking for first. The Quintura-like solution will partially solve this problem, but one or more of the mashers (actually, most of the mashers) should allow me to take my starter set of URIs and reprioritize it in any way I like. How about by amount of traffic OR what my network thinks is good, OR what the bloggers (even better, Technorati-rated bloggers for this topic) talk about most OR anything else some clever person comes up with. I would also like an approach that allows me to combine different rating techniques in different ways to truly understand the “best sites” for me from various angles.
  • Of course, once there are enough pico-domains, the pico aggregator should allow me to find the pico-domain which should pretty quickly point me to the right resource (this is not a replacement, but rather another approach to the basic search approach).
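
Here is a rough Python sketch of the reprioritization idea: take a starter set of URIs and re-rank it by a weighted mix of rating signals. The signal names, the numbers, and the weighted-sum scheme are all assumptions for illustration; any other way of combining ratings would fit the same shape.

```python
def reprioritize(uris, ratings, weights):
    """Re-rank URIs by a weighted sum of normalized rating signals.

    ratings: {signal_name: {uri: raw_value}}
    weights: {signal_name: weight}
    """
    def normalized(values):
        # Scale each signal to the 0..1 range so signals are comparable.
        top = max(values.values(), default=0)
        return {u: (v / top if top else 0.0) for u, v in values.items()}

    norm = {name: normalized(vals) for name, vals in ratings.items()}

    def score(uri):
        return sum(w * norm.get(name, {}).get(uri, 0.0) for name, w in weights.items())

    return sorted(uris, key=score, reverse=True)


# Hypothetical starter set and made-up rating signals.
uris = ["a", "b", "c"]
ratings = {
    "traffic": {"a": 1000, "b": 200, "c": 50},
    "network_votes": {"a": 0, "b": 5, "c": 10},
}

by_traffic = reprioritize(uris, ratings, {"traffic": 1.0})
by_network = reprioritize(uris, ratings, {"traffic": 0.2, "network_votes": 0.8})
```

With these numbers, ranking by traffic alone gives `a, b, c`, while leaning on what my network thinks flips the order to `c, b, a`.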

Lots of room for innovation here (and I am sure I missed a lot)…

Search Tech Opportunities- Improvement #3: Mashups

Posted in innovate this!, Innovation at 2:00 pm by scottmaxwell

Following up my post on the opening of the search tech chain, the third opportunity that I see is with search-related mashups of various kinds (this is where the fun comes in!).

If the search tech system becomes more open, perhaps at least partially in the manner that I describe in my prior posts, it should give the foundation for mashups of various kinds (that I cannot begin to imagine at this point). I am really impressed with the mashups that already exist and the amount of innovation that is going into them (see the Web 2.0 mashup matrix, which I mentioned last month, as a good source for the current innovation), but they all start with the same small set of useful APIs that currently exist. A host of new APIs will bring a host of new mashups (today there are 56 on Programmable Web). Adding one new API will bring a lot more than 56 new possibilities (as the two dimensional matrix implies) due to the possibility of using n of the APIs together (n is equal to or lower than the number of APIs).
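
The arithmetic behind “using n of the APIs together” can be checked with a few lines of Python. The starting count of 10 APIs below is a made-up number chosen only to keep the output small, and the sketch counts possible combinations of APIs, not actual mashups that anyone has built.

```python
from math import comb


def mashup_combinations(n_apis, min_size=2):
    """Number of distinct sets of at least `min_size` APIs drawn from `n_apis`."""
    return sum(comb(n_apis, k) for k in range(min_size, n_apis + 1))


def added_by_one_more(n_apis, min_size=2):
    """New combinations that become possible when one API is added."""
    return mashup_combinations(n_apis + 1, min_size) - mashup_combinations(n_apis, min_size)


# Pairs only (the two-dimensional matrix view), for a hypothetical 10 APIs.
pairs_before = comb(10, 2)        # 45
pairs_after = comb(11, 2)         # 55

# Any number of APIs used together.
sets_before = mashup_combinations(10)   # 1013
new_sets = added_by_one_more(10)        # 1023
```

In the pairs-only view the eleventh API adds 10 new combinations, but once any number of APIs can be used together it adds 1023, which is the gap between the matrix view and the real potential.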

One extreme version of this would be Ajax desktops that can be configured and released as pico-aggregators, each covering a complete pico-domain (i.e., a tiny portion of the overall world that is homogeneous in some way chosen by the aggregator), in a way that is not possible today. There are a lot of attempts to do this type of thing now, with domain-specific directory sites, About.com, and, more recently, Squidoo (and others), but with the foundation in place all of them would have many more, and better, ways to present the best, most complete, and most organized information.

(Somewhat circular) There should also be some great opportunities for pico-aggregator search engines, which would allow a user to find the pico-aggregators of interest (this is done to some extent by Squidoo for its own lenses, for example, but needs to be extended to include all pico-aggregators).

Clearly, this is only one angle into mashups and many, many more exist. The bus is headed in this direction and I see nothing but a few speedbumps, detours, and stops at the gas station to stop it from reaching this destination. Lots of opportunities for innovators here!

Search Tech Opportunities- Improvement #2: Micropayment System

Posted in innovate this!, Innovation at 11:50 am by scottmaxwell

Following up my post on the opening of the search tech chain, this is a second improvement that I would like to see:

My last post was on opening up the APIs. While I would hope that we can start in beta or have some level of usage for free (ala Amazon’s API), I expect that at some point there will need to be a parallel effort to sort out the micropayment issues. Some thoughts:

  • The current RSS feed model of the user going to the content site works fine for the content site, but it is difficult for the search engines to create a revenue stream this way.
  • The Feed-plus-ad approach seems to be another approach, but the problem is that cascading value-add services will end up with a tremendous amount of ad space (unless there is some sort of coordination).
  • The Amazon model of a small payment per use seems like a fair one to me (and relatively easy to execute on)
  • Perhaps there is a more advanced one with respect to sharing advertising revenue or giving some real estate on the ultimate browser view for advertising to be delivered by the originating site.

Like the Blog and RSS feed approach, I would hope that some innovative companies could get this ecosystem going without the need for a payment system right away (free beta and ongoing small usage for free), but it seems to me that longer term there is a lot of innovation potential in this area!

Search Tech Opportunities- Improvement #1: APIs From Everyone

Posted in innovate this!, Innovation at 11:33 am by scottmaxwell

Following up my post on the opening of the search tech chain, I thought I might give a number of the things that I would like to see in search.

The first improvement is relatively simplistic. Give me an open API so that I can feed data into an aggregator, mix in some processing, and output a mashup for consumption.

  • From traditional search vendors, OpenSearch is a great start. I would like to be able to put in a search query, and get back all of the resources (URIs) that are aligned with that query. I would also like you to append any metrics that you have calculated on each resource (e.g., links in, links out, metrics associated with each.  Even better, whatever network attributes surrounding the URI that you are willing to offer).
  • From the social tagging sites, I would like to be able to input a list of URIs (probably from my search above) and get back the information that you have on each. For example, the tags that people are using for them, the number/percent of people that have tagged them (more advanced, the number of “experts” that have tagged them), etc. If you are really creative, you will come up with a lot of useful metrics for me.
  • From sites that measure/monitor traffic (this includes the ISPs, Alexa, Google, other search engines etc.), I would like to be able to input a list of URIs, and get back all of your statistics.
  • From sites that measure security risks, I would get back your security view on each URI.
  • From the “owner” of the URI, I will take any metadata that you want to provide, so long as it is relatively organized into some standard approach.
  • From the Domain Registrars, I would like to get your information as well (by domain in this case).
  • I would also like similar input/output information from the message boards, forums, blog sites (or possibly the Feedburners of the world), Ajax desktops, and any other site that cares to offer an opinion on a URI (or the objects associated with the URI, such as person, organization, product, etc.).
  • If I ask really nicely, perhaps I can also have the request information from the DNS servers (yes, I know this is a major issue. But as long as I am asking…)
  • I am sure that I missed several classes of sites. If you have data to offer up related to a URI, I would love to get it (assuming it has information value).

In terms of the API, my first request is URI input and data back, but each site has a lot of other useful API calls (e.g., tag in, URLs out). It would be great if over time each class of site had some standards emerge that had robust APIs associated with them.
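
The “URI input and data back” pattern can be sketched in a few lines of Python. The two provider functions below are hypothetical stand-ins that return canned data; they are not real services or real APIs, and an actual aggregator would replace them with network calls to whatever each site exposes.

```python
def aggregate(uris, providers):
    """Send the list of URIs to every provider and merge the answers
    into one record per URI, keyed by provider name."""
    merged = {uri: {} for uri in uris}
    for name, lookup in providers.items():
        for uri, data in lookup(uris).items():
            if uri in merged:
                merged[uri][name] = data
    return merged


# Hypothetical stand-ins for a social tagging site and a traffic monitor.
def tagging_site(uris):
    known = {"http://example.com/": {"tags": ["search", "api"], "taggers": 12}}
    return {u: known[u] for u in uris if u in known}


def traffic_site(uris):
    return {u: {"monthly_visits": 1000} for u in uris}


records = aggregate(
    ["http://example.com/", "http://example.org/"],
    {"tags": tagging_site, "traffic": traffic_site},
)
```

Each provider only has to honor one contract (list of URIs in, a mapping of URI to data out), which is what would make it easy to keep adding new classes of sites.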

I would also like to see some intermediaries (ala Feedburner) form that will reduce some of your server load and mine and, perhaps, find some value add processing to the combined data streams.

Okay, now I have a pretty nice set of possibly interesting URIs and the possibility of getting a lot of useful information that others are gathering on each. It is probably obvious what I want to do with all this, but I will be more explicit in a later post.
