


Learn more about search engine spiders


What are search engine spiders?

You may already have encountered the term 'spider', especially on internet marketing websites or in articles, where it also appears as 'robot', 'bot', 'agent' or simply 'web crawler'. The name reflects what these programs do. Search engine spiders are automated software programs or scripts that crawl the web by following the hyperlinks found on web pages, gathering the data they find and storing it in the search engines' huge databases for indexing. Spiders also have limits: they generally cannot read the content of images, Flash, frames or JavaScript, they may be unable to follow drop-down menus, and they can have trouble crawling dynamic URLs. There are many types of spiders and similar programs, and not all of them belong to search engines; some are harmful or were created for malicious purposes.



How do search engine spiders work?

Before looking at how search engine spiders work, picture the whole internet as a spider's web: billions of individual pages interconnected by hyperlinks. Spiders, or robots, follow these links, read the content of each page and store that data in huge databases, where other software programs, often referred to as algorithms, process the retrieved data.
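
The link-following behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine works: the PAGES dictionary is a made-up miniature "web" that stands in for real HTTP requests, and the names LinkExtractor and crawl are chosen only for this example.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical miniature "web": URL -> HTML. A real spider would fetch
# each page over HTTP instead of reading it from a dictionary.
PAGES = {
    "/home": '<a href="/about">About</a> <a href="/news">News</a>',
    "/about": '<a href="/home">Home</a>',
    "/news": '<a href="/about">About</a> <a href="/archive">Archive</a>',
    "/archive": "No links here.",
    "/orphan": '<a href="/home">Home</a>',  # no other page links here
}


def crawl(start):
    """Breadth-first crawl from `start`; returns {url: html} in visit order."""
    index = {}
    queue = deque([start])
    seen = {start}
    while queue:
        url = queue.popleft()
        html = PAGES.get(url)
        if html is None:
            continue  # broken link: nothing to store
        index[url] = html  # "store the data in the database"
        extractor = LinkExtractor()
        extractor.feed(html)
        for link in extractor.links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return index


print(list(crawl("/home")))
```

Starting from /home, the sketch visits /home, /about, /news and /archive. The /orphan page is never reached, because no visited page links to it.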

Many people believe they must submit a newly built website to the search engines. Submission is one way to get indexed, but it is not required: once you get a link from another website that is already indexed by the search engines, preferably one relevant to your niche, spiders will find and index your site automatically. That makes link popularity very important. The more quality links you have, the better your chances of a high ranking and of always-fresh data in the search engines' index.

By checking your server logs or traffic statistics you can see how often spiders visit your pages, and you can identify them by their user agent names, such as 'Googlebot' or 'Yahoo Slurp'. There are also many unidentifiable robots, and some of them are even human-powered.
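
As a rough illustration of spotting spiders in a server log, the sketch below counts visits by user agent name. The log lines are invented for the example and assume the common "combined" access-log layout, where the user agent is the last quoted field; your server's format may differ. Note that Yahoo's spider typically reports itself as 'Yahoo! Slurp', so the sketch matches on 'Slurp'.

```python
from collections import Counter

# Substrings to look for in the User-Agent field (extend as needed).
KNOWN_SPIDERS = ("Googlebot", "Slurp")

# Hypothetical log lines in the "combined" access-log layout.
SAMPLE_LOG = """\
192.0.2.10 - - [05/Oct/2007:10:00:01 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
192.0.2.11 - - [05/Oct/2007:10:02:13 +0000] "GET /news HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
198.51.100.7 - - [05/Oct/2007:10:05:44 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1) Firefox/2.0"
192.0.2.10 - - [05/Oct/2007:11:00:02 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
"""


def count_spider_visits(log_text):
    """Counts requests per known spider, based on the user agent string."""
    counts = Counter()
    for line in log_text.splitlines():
        # Split off the last quoted field, which holds the user agent.
        parts = line.rsplit('"', 2)
        if len(parts) < 3:
            continue  # line does not match the expected layout
        user_agent = parts[1]
        for name in KNOWN_SPIDERS:
            if name in user_agent:
                counts[name] += 1
                break
    return counts


print(count_spider_visits(SAMPLE_LOG))
```

Keep in mind that a user agent string is supplied by the visitor and can be faked, so a count like this is an indication rather than proof.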

After the spiders have retrieved the data, advanced algorithms evaluate and score it, so that when a searcher enters a query the search engine lists the most relevant results for the user's needs.

Search engine spiders can also be controlled with a robots.txt file, in which you set rules for which pages spiders should crawl or ignore and, for spiders that support it, how often they may request pages.
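
To make the robots.txt idea concrete, here is a small sketch that parses an example file with Python's standard urllib.robotparser module and asks what a spider would be allowed to do. The rules, the 'ExampleBot' name and the example.com URLs are all made up for illustration. The Crawl-delay line is an extension that not every spider honors.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every spider is kept out of /private/
# and asked to wait 10 seconds between requests.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# 'ExampleBot' is an invented spider name used only for this sketch.
print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))
print(parser.can_fetch("ExampleBot", "http://www.example.com/private/report.html"))
print(parser.crawl_delay("ExampleBot"))
```

With these rules the first page may be crawled, the second may not, and the requested delay is 10 seconds. Well-behaved spiders follow such rules voluntarily; the file does not technically block access.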



How do search engine spiders read your pages?

As soon as a spider reaches a website it starts reading the visible text content as well as metadata such as the title and description. In the body of the page the robot reads the text content, image alt attributes, headings, comments and tag attributes, as well as the anchor text of hyperlinks leading to other internal pages or to other websites. Other scripts then analyze the retrieved data against many factors before deciding what the website is about and how valuable it is. The evaluation methods differ from one search engine to another.
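
The elements listed above can be pulled out of a page with a short parser. The sketch below uses Python's standard html.parser module on an invented sample page; PageReader and SAMPLE_PAGE are names made up for this example, and real spiders are far more tolerant of messy markup (this one, for instance, does not handle a link nested inside a heading).

```python
from html.parser import HTMLParser

# Tags whose inner text the sketch collects.
TEXT_TAGS = {"title", "a", "h1", "h2", "h3", "h4", "h5", "h6"}


class PageReader(HTMLParser):
    """Collects title, meta description, headings, alt text and anchor text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.headings = []
        self.alt_texts = []
        self.anchors = []  # list of (href, anchor text) pairs
        self._open = None  # text-bearing tag currently being read
        self._href = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "img" and attrs.get("alt"):
            self.alt_texts.append(attrs["alt"])
        elif tag in TEXT_TAGS:
            self._open = tag
            self._href = attrs.get("href")
            self._buffer = []

    def handle_data(self, data):
        if self._open:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._open:
            return
        text = "".join(self._buffer).strip()
        if tag == "title":
            self.title = text
        elif tag == "a":
            self.anchors.append((self._href, text))
        else:
            self.headings.append(text)
        self._open = None


# Hypothetical page used only to exercise the parser.
SAMPLE_PAGE = """
<html><head>
<title>Learn more about search engine spiders</title>
<meta name="description" content="How spiders crawl and index pages">
</head><body>
<h1>What are search engine spiders?</h1>
<p>Spiders follow links such as <a href="/glossary">the SEO glossary</a>.</p>
<img src="spider.png" alt="Diagram of a spider crawling links">
</body></html>
"""

reader = PageReader()
reader.feed(SAMPLE_PAGE)
print(reader.title)
print(reader.description)
print(reader.headings, reader.alt_texts, reader.anchors)
```

What a search engine then does with these fields is a separate step; the sketch only shows the reading part.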

Search engines update their data on a regular basis. Once you are indexed, search engine robots keep revisiting your site to check for fresh content and re-index it, so that searchers always get the most current data. In general, the more popular and active your website is, the more often most search engine spiders will revisit it. Where your web pages are hosted matters too: if the site goes down too often, you risk no longer being re-indexed and being dropped from the search engines' index. Usually, though, if spiders cannot access your website they will try again later, when it is accessible.



 