Wednesday, July 19, 2006

The Next Search Engine

It might as well be Google itself, but there sure is a lot of room in search. The amount of data available online is just going to keep growing like crazy. Five of the world's six billion people are not yet online. And as they come online, companies are going to work to make it easier and easier for people to create data online: text, audio, video and more. So you are looking at more and more stuff, in more and more languages, in more and more formats. And there is the perennial demand: the quality of search. Can you find what you are looking for? How precise is the result? How user friendly is the search experience?

Google's handicap for now is that it only knows text. And even text it does not handle as well as we might want, though it does it better than anyone else.

When you search for something you want the most relevant webpages and sites to show up first. And you want to be able to search within sites. How should pages be ranked? Google started out by saying the more sites that link to your site, the more valuable it is. Well, maybe, but not if many of them are link farms. Still, that was a great way to start, and it has to remain the basic formula. But then each linking site has to be given a weight of its own based on many different criteria.

The language challenge is a big one, as is the format challenge. Can a search engine "read" audio and video like it reads words? Can a search engine search content regardless of what language or format it might be in, and then present the results in the language of the end user's choosing? You are talking about real-time translation. That right there is a huge challenge. Major work will have to be done in speech software to make this possible.

The formula that a website's worth is how many other websites link to it is basically good. But the formula has to get more sophisticated than that. Not all links are equal. Each site should have a weight based on a few different things, so that one link from a really good site counts for more than many links from so-so sites. And the search engine should be able to count the page hits for each site in real time. The activity level of a site should be a major factor in how important a site is. As important as links.
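Here is a rough sketch of what that might look like in Python. It is not Google's actual formula, just one way to do it under some assumptions: a site passes its own weight along to the sites it links to (a PageRank-style iteration), and the result is blended half and half with each site's share of page hits. The function name `rank_sites`, the `damping` and `link_share` parameters, and the sample site names are all made up for illustration.

```python
def rank_sites(links, hits, damping=0.85, link_share=0.5, iterations=50):
    """Blend a PageRank-style link score with each site's share of page hits.

    links: dict mapping a site to the list of sites it links to
    hits:  dict mapping a site to its page hits in some recent window
    """
    linked_to = {target for targets in links.values() for target in targets}
    sites = sorted(set(links) | linked_to | set(hits))
    n = len(sites)

    # Start with every site weighted equally.
    score = {site: 1.0 / n for site in sites}
    for _ in range(iterations):
        new_score = {site: (1.0 - damping) / n for site in sites}
        for source in sites:
            targets = links.get(source, [])
            if targets:
                # A link is worth more when the linking site is itself worth more.
                share = damping * score[source] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A site with no outbound links spreads its weight evenly.
                for target in sites:
                    new_score[target] += damping * score[source] / n
        score = new_score

    # Activity counts as much as links when link_share is 0.5.
    total_hits = sum(hits.get(site, 0) for site in sites) or 1
    return {
        site: link_share * score[site]
        + (1.0 - link_share) * hits.get(site, 0) / total_hits
        for site in sites
    }


# Hypothetical web: five sites link to "hub", and "hub" links to "a".
# Two so-so sites link to "b".
links = {f"x{i}": ["hub"] for i in range(5)}
links["hub"] = ["a"]
links["farm1"] = ["b"]
links["farm2"] = ["b"]

by_links = rank_sites(links, hits={})
print(by_links["a"] > by_links["b"])  # one strong link vs. two weak ones
```

With no traffic data, "a" comes out ahead of "b" even though "b" has more inbound links, because its single link comes from a heavily linked site. Pass in a `hits` dict where "b" gets most of the traffic and the order flips, which is the point of weighting activity as heavily as links.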

And there is the real-time question. Say I put out a new website, or just add a new page to an existing site. How long before the search engine finds it? Can it be an hour? A minute? Less? It should be less than a second. The reverse should also be true. If a site or a page is taken down, the search engine should know right away.
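One way to picture this: instead of waiting for a crawler to come around, the site tells the search engine the moment a page goes up or comes down, and the index changes right then. Below is a toy sketch of that idea in Python. The `LiveIndex` class and its `publish`, `remove` and `search` methods are hypothetical names, the URLs are placeholders, and a real engine would have to do this across many machines, which is where the hard part is.

```python
from collections import defaultdict


class LiveIndex:
    """A tiny in-memory word index that is updated the moment a page changes."""

    def __init__(self):
        self._postings = defaultdict(set)  # word -> URLs containing it
        self._words_by_url = {}            # URL -> words on that page

    def publish(self, url, text):
        """Index a new page, or re-index a page whose text has changed."""
        self.remove(url)
        words = set(text.lower().split())
        self._words_by_url[url] = words
        for word in words:
            self._postings[word].add(url)

    def remove(self, url):
        """Forget a page that has been taken down."""
        for word in self._words_by_url.pop(url, set()):
            self._postings[word].discard(url)
            if not self._postings[word]:
                del self._postings[word]

    def search(self, query):
        """Return the URLs that contain every word in the query."""
        words = query.lower().split()
        if not words:
            return set()
        results = None
        for word in words:
            urls = self._postings.get(word, set())
            results = set(urls) if results is None else results & urls
        return results


index = LiveIndex()
index.publish("http://example.com/new-page", "The next search engine")
print(index.search("search engine"))  # the page is findable immediately
index.remove("http://example.com/new-page")
print(index.search("search engine"))  # and gone as soon as it is taken down
```

The design choice here is push over pull: the index never has to discover anything, so how fresh it is depends only on how fast the notification arrives.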

Language, content format, speech, website weighting based on links and page hits, real-time indexing. These are some of the things that come to mind. This is enough homework for now.

Google might or might not deliver. There is plenty of room for others, especially for bold upstarts.

Eric Schmidt interview (39mins MP3)

Kosmix: Desi Pride
Email, Search, News
Memo To Bill Gates

On The Web

Search engine - Wikipedia, the free encyclopedia
Google UK
MSN Search
Google Home Page
Dogpile Web Search Home Page
Homepage HotBot Web Search
Web Site Search Engine, Free and Pro Versions -
Yahoo! Search Engine - Better Web Search
MetaCrawler Web Search Home Page - MetaCrawler
Yahoo! Search - Web Search
Vivisimo Clustering Engine
Search Engine Watch
My Excite
WebCrawler Web Search Home Page
Google Scholar
Ixquick Search Engine - Better Web Search
KartOO visual meta search engine
Environment Web Directory
go2: The #1 Yellow Pages on your mobile phone
ProFusion - The Original Meta-Search Engine
KidsClick! Web Search
Lucene Search Engine
Clusty the Clustering Engine
Google Blog Search
Netscape Search
WebSideStory Site Search and Web Content Management Applications
Ask for Kids
Creative Commons Search
Chemical Search Engine
Mamma Metasearch - The Mother of All Search Engines
Search engine optimization - Wikipedia, the free encyclopedia
Google Australia
What is search engine? - A Word Definition From the Webopedia ...
Scrub The Web Search Engine
Find | Creative Commons
Kids' Tools for Searching the Internet
Search Engines
Search Engine Colossus
Search Engine Guide
Submit Express - Search Engine Placement, Optimization, SEO Marketing
W3 Search Engines XML - multimedia internet search
Clipart Searcher - find free clip art, photos, and animations
InFoPeople: Best Search Tools Chart
Search the RFC Index We do all the searching for you.
Arabic Search Engine: Directory of arabic and islamic sites
SEO Book
Beaucoup! 2000+ Search Engines, Indices and Directories
Koders - Source Code Search Engine
Main - TSEP - The Search Engine Project - A search engine for your ...
Fast Search
isoHunt - World's largest BitTorrent search engine
Yahoo! Search Marketing
Search Engines
Search Engine Lowdown
mnoGoSearch - Internet search engine software
Blog Search Engine
Fluid Dynamics Search Engine
LookSmart Vertical Search
Search Engines :: Mozilla Add-ons :: Add Features to Mozilla Software
Search Engine Features Chart
Search Engine Showdown
Internet Search Engines - Search Engine Relationship Chart®
MathSearch -- search a collection of mathematical Web material
Search Engine Submission, Website Optimization and Free ...
FindLaw LawCrawler - Law, Lawyer, Lawyers. Attorney, Attorneys
MIDI Search Engine: Let MIDI Explorer find your files
Debian GNU/Linux -- Search
Entrez cross-database search
SearchEngineWatch Forums -- Discussions About Search Engine ...
A Helpful Guide to Search Engines, Top Page
Rocketinfo - Search Result
Searching Stanford
Choose the Best Search for Your Information Need
Search Engine Decoder | Relationship Chart
Lyrics Search Engine
Web site search facilities - OHCHR
RootsWeb: Database Index
Ananzi Search Engine
Search Engine Roundtable
The Cybercafe Search Engine, July 17, 2006
