January 25, 2008, 7:03 am
Web search engines provide an interface to search for information on
the World Wide Web.
Information may consist of web pages, images and other types of files.
Some search engines also mine data available in newsgroups, databases,
or open directories.
Unlike Web directories, which are maintained by human editors, search
engines operate algorithmically or through a mixture of algorithmic
and human input.
Users specify criteria about an object of interest and have the engine
find the matching
items. These criteria are referred to as a search query.
In text search engines, the search query is expressed as a set of
words that identify the desired
concept, which one document or hundreds of documents may contain.
There are several styles of search query syntax that vary in
strictness. Some text search engines
require users to enter two or three words separated by white space;
other search engines enable
users to specify entire documents, pictures, sounds, and various forms
of natural language. Some
engines apply improvements to search queries to increase the
likelihood of providing a quality
set of results, through a process known as query expansion.
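As a rough illustration of query expansion, here is a minimal Python
sketch. The SYNONYMS table and the expand_query name are made up for
this example; a real engine would more likely draw on a thesaurus or
on query logs.

```python
# Hypothetical synonym table, invented for this sketch.
SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "cheap": ["inexpensive", "budget"],
}

def expand_query(query):
    """Return the query terms plus any known synonyms, without duplicates."""
    expanded = []
    for term in query.lower().split():
        # Keep the original term first, then add its synonyms (if any).
        for candidate in [term] + SYNONYMS.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded

print(expand_query("cheap car rental"))
```

Terms with no entry in the table (like "rental") pass through unchanged,
so the expanded query can only match more documents, never fewer.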
Index-based search engines :
The list of results that meet the criteria specified by a query is
typically sorted, or ranked, in some
regard so as to place the most relevant results first.
Ranking results by relevance (from highest to lowest) reduces the time
required to find the
desired information.
Probabilistic search engines :
rank results based on measures of similarity and sometimes popularity.
Boolean search engines :
return items which match exactly, without regard to order.
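The difference between the two styles can be sketched in a few lines
of Python. The DOCS collection and the function names are invented for
this example, and the "similarity" score is just a count of matching
words, which is far cruder than what a real probabilistic engine uses.

```python
# Tiny made-up document collection for the sketch.
DOCS = {
    "d1": "search engines index web pages",
    "d2": "web directories are edited by people",
    "d3": "search engines rank web pages and search results",
}

def tokenize(text):
    return text.lower().split()

def boolean_and(query, docs):
    """Boolean style: ids of documents containing every query term, unranked."""
    terms = set(tokenize(query))
    return sorted(doc_id for doc_id, text in docs.items()
                  if terms <= set(tokenize(text)))

def ranked(query, docs):
    """Ranked style: order documents by how often the query terms occur."""
    terms = tokenize(query)
    scores = {}
    for doc_id, text in docs.items():
        words = tokenize(text)
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

print(boolean_and("search web", DOCS))
print(ranked("search web", DOCS))
```

The Boolean version drops d2 because it lacks "search", while the
ranked version keeps it but places it last.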
To provide a set of matching results quickly, a search engine will
typically collect metadata
about the group of items under consideration beforehand, through a
process referred to as indexing.
An index typically requires less computer storage than the items it
describes, and provides a basis for the engine to
calculate result relevance. The search engine may store a copy of each
result in a cache so that
users can see the state of the result at the time it was indexed, for
archive purposes, or to
make repetitive processes work more efficiently and quickly.
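One common way to index text is an inverted index, which maps each
word to the documents that contain it. The sketch below is only an
illustration under that assumption; build_index and lookup are names
made up here, and the post itself does not say which index structure
any particular engine uses.

```python
from collections import defaultdict

def build_index(docs):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def lookup(index, query):
    """Return ids of documents that contain all words of the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the first term's documents and narrow down.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

pages = {
    "page1": "how search engines work",
    "page2": "how web directories work",
}
index = build_index(pages)
print(lookup(index, "how work"))
print(lookup(index, "search engines"))
```

The index is built once ahead of time, so answering a query only needs
a few set lookups instead of rereading every page.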
Notably, some search engines do not store an index. Crawler, or spider
type search engines may
collect and assess results at the time of the search query. Meta
search engines simply reuse the
index or results of one or more other search engines.
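A meta search engine has to combine the result lists it gets back.
One simple possibility, shown only as a sketch, is to interleave the
lists round-robin and drop duplicates. The merge_results name and the
example URLs are hypothetical, and real meta search engines may merge
quite differently.

```python
def merge_results(*result_lists):
    """Merge ranked URL lists round-robin, keeping each URL's first occurrence."""
    merged, seen = [], set()
    longest = max((len(results) for results in result_lists), default=0)
    for position in range(longest):
        for results in result_lists:
            if position < len(results) and results[position] not in seen:
                seen.add(results[position])
                merged.append(results[position])
    return merged

# Made-up result lists standing in for two underlying engines.
engine_a = ["a.example/1", "b.example/2", "c.example/3"]
engine_b = ["b.example/2", "d.example/4"]
print(merge_results(engine_a, engine_b))
```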
Here is a short description of the most used search engine :
Around 2001, the Google search engine rose to prominence. Its success
was based on the concept of link
popularity and PageRank, which is scored from 0 to 10. The number of
other websites and webpages that
link to a given page is taken into consideration with
PageRank. Google's minimalist user interface
is very popular with users, and has since spawned a number of
The quality of content presented in the pages is very important. It
is then matched by Google with
the meta tags. If they are relevant to the content, then it is a high
quality page and would have a
Google retrieves pages by a Web crawler (known as a spider) -- an
automated Web browser which
follows every link it sees. Exclusions can be made by the use of
robots.txt. The contents of each
page are then analyzed to determine how it should be indexed (for
example, words are extracted
from the titles, headings, or special fields called meta tags). Data
about web pages are stored
in an index database for use in later queries.
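Two of the steps above can be tried with the Python standard library:
checking robots.txt rules before fetching a page, and pulling words
out of titles, headings and the keywords meta tag. This is only a
sketch; the ROBOTS_TXT rules, the PAGE markup, the "ExampleBot" name
and the FieldExtractor class are all invented for the example, and
nothing here claims to show how Google's crawler actually works.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Made-up robots.txt rules and page content for the sketch.
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""

PAGE = """<html><head><title>Cheap Flights</title>
<meta name="keywords" content="travel, airfare">
</head><body><h1>Book Today</h1><p>Ignored body text.</p></body></html>"""

class FieldExtractor(HTMLParser):
    """Collect words from <title>, <h1>-<h3> and the keywords meta tag."""
    FIELDS = ("title", "h1", "h2", "h3")

    def __init__(self):
        super().__init__()
        self.words = []
        self._capture = False

    def handle_starttag(self, tag, attrs):
        if tag in self.FIELDS:
            self._capture = True
        elif tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "keywords":
                content = attributes.get("content") or ""
                self.words += [w.strip().lower()
                               for w in content.split(",") if w.strip()]

    def handle_endtag(self, tag):
        if tag in self.FIELDS:
            self._capture = False

    def handle_data(self, data):
        if self._capture:
            self.words += data.lower().split()

# Parse the rules from text instead of downloading them.
robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())
print(robots.can_fetch("ExampleBot", "http://example.com/index.html"))
print(robots.can_fetch("ExampleBot", "http://example.com/private/a.html"))

extractor = FieldExtractor()
extractor.feed(PAGE)
print(extractor.words)
```

The body paragraph is skipped on purpose here, to show the "special
fields" idea; a real indexer would normally look at body text as well.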
Some search engines, such as www.Google.com, store all or part of the
source page (referred to
as a cache) as well as information about the web pages, whereas
others, such as www.AltaVista.com,
store every word of every page they find.
Some search websites use Google's API for searching; for example,
www.pakistan.sc uses it to run
its users' queries through Google. It also provides thumbnails of
pages for a quick glance.
Google utilizes not only PageRank but more than 150 criteria to
determine relevancy. The algorithm
"remembers" where it has been and indexes the number of cross-links
and relates these into
groupings. PageRank is based on citation analysis.
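The link-counting idea behind PageRank can be sketched with the
simplified, publicly described formulation: each page shares its score
among the pages it links to, and the process is repeated until the
scores settle. This is not Google's production algorithm. The damping
value of 0.85, the iteration count and the tiny link graph are
assumptions for the example, and the scores come out as fractions that
sum to 1 rather than on the 0 to 10 scale mentioned above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores.

    links maps each page to the list of pages it links to; every
    linked page must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small base score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Made-up link graph: "c" is linked from three pages, "d" from none.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

Page "c" ends up on top because the most pages link to it, and "d"
ends up last because nothing links to it.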
The Google algorithm is today's most hidden truth!
( From HaiderAliryk )
Re: How search engines work :-
Hi All !
Get Your Website Indexed And Ranked Fast With PR Backlinks
PR Backlinks Generator is software that helps you find hundreds of
WordPress blogs with high PageRank that match the theme of your
website and lets you submit comments with links back to your website.
In short, PR Backlinks Generator software can bring you one-way, high-PR
backlinks to your site, which Google loves!
A MUST-HAVE FOR ALL WEBSITE OWNERS!
HAVE A GOOD BACKLINK GENERATION!