learning, thinking, blogging.
SEO for Dummies 3 – How Does a Search Engine Work?
How does a search engine select the pages to show for a given query? How is a specific query processed? How does a search engine find pages online?
This article briefly explains how a search engine works.
1. Discovery
Search engines use automated programs (called spiders or bots) that explore the web, jumping from one page to another by following the links they find.
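To make the discovery step concrete, here is a minimal sketch of a spider. It does not touch the real web: `TOY_WEB` is a made-up link graph and `crawl` is a name I chose for illustration. Real spiders are far more complex, but the idea of following links from a starting page is the same.

```python
from collections import deque

# Hypothetical toy "web": each page maps to the pages it links to.
TOY_WEB = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html", "d.html"],
    "d.html": [],
    "orphan.html": ["a.html"],  # no page links here, so it is never found
}

def crawl(seed, web):
    """Breadth-first discovery: follow links from the seed, visiting each page once."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

discovered = crawl("a.html", TOY_WEB)
print(discovered)  # ['a.html', 'b.html', 'c.html', 'd.html']
```

Note that `orphan.html` is never discovered: a page with no inbound links is invisible to a spider that only follows links.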
2. Index
When a new page is found, or a known page is revisited, its content is saved in the search engine’s database, so it can be accessed faster in the future.
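One common way to store page content for fast lookup is an inverted index, which maps each word to the pages that contain it. The sketch below assumes this structure; the `PAGES` data and the `build_index` function are invented for the example, and real search engine databases store much more than this.

```python
from collections import defaultdict

# Hypothetical crawled pages (URL -> page text).
PAGES = {
    "a.html": "search engines crawl the web",
    "b.html": "spiders crawl links",
    "c.html": "the web is made of links",
}

def build_index(pages):
    """Map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

index = build_index(PAGES)
print(sorted(index["crawl"]))  # ['a.html', 'b.html']
print(sorted(index["links"]))  # ['b.html', 'c.html']
```

With the index in place, finding the pages for a word is a single lookup instead of a scan over every stored page.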
3. Returning Results
When a query is sent to the search engine (i.e. when a user hits the “search” button on the search engine’s homepage), the matching pages are selected and ranked with a specific algorithm (every search engine has its own super-secret ranking algorithm), and the pages are returned to the user in descending order of importance.
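The select-then-rank flow can be sketched as follows. Since the real ranking algorithms are secret, the score used here (how many times the query terms appear in the page) is only a stand-in I picked for illustration, and `PAGES` and `search` are hypothetical names.

```python
# Hypothetical stored pages (URL -> page text).
PAGES = {
    "a.html": "search engine basics and search tips",
    "b.html": "how a search engine ranks pages",
    "c.html": "cooking recipes",
}

def search(query, pages):
    """Select pages containing every query term, then order them by score."""
    terms = query.lower().split()
    results = []
    for url, text in pages.items():
        words = text.lower().split()
        if all(term in words for term in terms):
            # Stand-in score: total occurrences of the query terms.
            score = sum(words.count(term) for term in terms)
            results.append((url, score))
    # Highest score first; ties are broken by URL to keep the output stable.
    return sorted(results, key=lambda r: (-r[1], r[0]))

results = search("search engine", PAGES)
print(results)  # [('a.html', 3), ('b.html', 2)]
```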
Ranking Criterion
There are enormous differences among the ranking algorithms used by search engines, but all of them are based on relevance and popularity.
These are terms from Information Retrieval, of which search engines are one of the most visible applications.
Basically, higher relevance means that the document is more focused on the given search term, and higher popularity means that the document is cited more often by other sources.
In terms of search engines,
relevance is evaluated by analyzing
- the page’s textual content
- the pages that provide inbound links, by
  - reading the anchor text used to link to the document
  - reading the text surrounding the link
  - evaluating the linking pages
This means, for example, that a page can rank well for a phrase or keyword even if that phrase never appears on the page. (One famous case is the Bush bio ranking #1 for “miserable failure” on Google; this was the result of a massive use of “miserable failure” as anchor text for links to www.whitehouse.gov/president/. This is no longer true, due to a change in Google’s algorithm.)
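Here is a small sketch of how anchor text can make a page relevant for words it never contains. All the data is invented (`LINKS`, `PAGE_TEXT` and the `.example` addresses are placeholders), and collecting anchor words with a counter is my simplification, not how any particular search engine does it.

```python
from collections import Counter

# Hypothetical link data: (linking page, anchor text, linked page).
LINKS = [
    ("blog1.example", "miserable failure", "bio.example"),
    ("blog2.example", "miserable failure", "bio.example"),
    ("news.example", "official biography", "bio.example"),
]

# Hypothetical text of the linked page itself.
PAGE_TEXT = {"bio.example": "official biography of the president"}

def anchor_terms(target, links):
    """Count the words used in anchor text pointing at the target page."""
    counts = Counter()
    for _source, anchor, dest in links:
        if dest == target:
            counts.update(anchor.lower().split())
    return counts

terms = anchor_terms("bio.example", LINKS)
print(terms["miserable"])                                # 2
print("miserable" in PAGE_TEXT["bio.example"].split())   # False
```

The word “miserable” is associated with the page twice through inbound links, even though it never appears in the page text.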
Popularity is evaluated by counting the number of links to the given page (more links mean more popularity).
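Counting inbound links is easy to sketch. The link graph below is made up, and the choices to count each linking page only once and to ignore self-links are my own assumptions for the example.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
LINK_GRAPH = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html", "c.html"],  # a repeated link is counted once
}

def inbound_counts(graph):
    """Count how many distinct pages link to each page."""
    counts = Counter({page: 0 for page in graph})
    for source, targets in graph.items():
        for target in set(targets):
            if target != source:  # ignore self-links (an assumption)
                counts[target] += 1
    return counts

popularity = inbound_counts(LINK_GRAPH)
print(popularity.most_common(1))  # [('c.html', 3)]
```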
Given these two main criteria, each search engine adds its own interpretation, for example by giving more weight to some “trusted” sites (.edu and .gov domains and sites with higher popularity are considered more trusted), or by giving different weights to each page element (page title, body, heading tags…).
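To illustrate the idea of weighting page elements differently, here is a rough sketch. The weights in `FIELD_WEIGHTS` are numbers I made up; the actual values used by search engines are not public, and the page dictionaries are hypothetical.

```python
# Assumed weights, for illustration only: a match in the title counts
# more than a match in a heading, which counts more than one in the body.
FIELD_WEIGHTS = {"title": 3.0, "heading": 2.0, "body": 1.0}

def weighted_score(term, page):
    """Sum the occurrences of the term in each field, scaled by the field weight."""
    term = term.lower()
    return sum(
        weight * page.get(field, "").lower().split().count(term)
        for field, weight in FIELD_WEIGHTS.items()
    )

page_one = {"title": "SEO guide", "heading": "", "body": "tips for ranking"}
page_two = {"title": "My blog", "heading": "", "body": "seo seo"}

print(weighted_score("seo", page_one))  # 3.0
print(weighted_score("seo", page_two))  # 2.0
```

With these assumed weights, a single mention in the title outranks two mentions in the body.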
As an example, consider my Google guide: it ranks #1 for ‘mapelli’ (my last name) on Google, because it has been widely linked using the title of the page (which contains the domain name, i.e. www.mapelli.info), and Google gives high relevance to the text of inbound links, while the same article is not in the top 100 results on Yahoo. (This is no longer true, due to a change in Google’s algorithm.)
The obvious consequence is that if you want to get higher rankings you have to
- allow search engines to find your site
- make it easy for the spiders to understand the structure of the pages
- increase your relevance
- increase your popularity
We’ll talk about how to do this in the next few articles.
Summary
- Spiders or bots: automated programs that crawl the web and index the pages
- Relevance: represents how well a web page matches the search terms
- Popularity: represents the number of “citations” (inbound links) of a given web page; it’s a measure of the importance of the page
In the next article in SEO for Dummies I’ll talk about the best-known ranking system: Google PageRank.
Resources:
- Googlebot (Google spider)
- Yahoo Slurp help (Yahoo spider)
- MSNBot FAQ (MSN Live spider)
This entry was posted by francesco mapelli on 2007/01/15 at 1:46 am, and is filed under Uncategorized.



about 4 years ago
Seriously, make a mistake, give us a chance to criticize haha. Even though you’re not getting into many details (Because these are basics) you’re still making it very clear.
Well, with this occasion I’ll just explain one of my stories heh. Yes, search engines do interpret differently and yes, most are based on importance and popularity. I have one of my sites #1 on yahoo and #13 on google, so in my experience is harder to get :)
Cheers
about 2 years ago
Exactly where is the information stored that the search engine searches? I’m a real dummy but a curious one.
about 1 year ago
Hi, my english isn’t the best but I think by regularly visits of your blog it will be better in the next time. You have a good writing style which is easy to understand and can helps people like me to learn english. I will be now a regularly visitor of your blog.