Website Promotion Marketing
 
© Copyright 1999-2007 Search Marketing Sales Ltd. All Rights Reserved.
           

                             Anatomy of a Search Engine
                        Going under the hood of a search engine by Dave Davies



          

For some unfortunate souls, SEO is simply the learning of tricks and techniques that, according to
their understanding, should propel their site into the top rankings on the major search engines.
This understanding of the way SEO works can be effective for a time; however, it contains one basic
flaw: the rules change. Search engines are in a constant state of evolution in order to keep up with
the SEOs, in much the same way that Norton, McAfee, AVG or any of the other anti-virus software
companies are constantly trying to keep up with the virus writers.

Basing your entire website’s future on one simple set of rules (read: tricks) about how the search
engines will rank your site contains an additional flaw: there are more factors being considered
than any SEO is aware of and can confirm. That’s right, I will freely admit that there are factors at
work that I may not be aware of, and even for those that I am aware of, I cannot tell you with 100
percent accuracy the exact weight they are given in the overall algorithm. Even if I could, the
algorithm would change a few weeks later and, what’s more (hold on to your hats for this one), there
is more than one search engine.

So if we cannot base our optimization on a set of hard-and-fast rules, what can we do? The key, my
friends, is not to understand the tricks but rather what they accomplish. Thinking back to my high
school math teacher, Mr. Barry Nicholl, I recall a silly story that had a great impact. One weekend
he had the entire class watch Dumbo the Flying Elephant (there was actually going to be a question
about it on our test). Why? The lesson we were to take from it is that formulas (like tricks) are the
feather in the story. They are unnecessary, and yet we hold on to them in the false belief that it is
the feather that works and not the logic. Indeed, it is not the tricks and techniques that work but
rather the logic they follow, and that is their shortcoming.

And So What Is Necessary?
To rank a website highly and keep it ranking over time, one must optimize it with one primary
understanding: that a search engine is a living thing. Obviously this is not to say that search
engines have brains; I will leave those tales to Orson Scott Card and other science fiction writers.
However, their very nature results in a lifelike being with far more storage capacity.

Consider for a moment how a search engine functions: it goes out into the world, follows the
road signs and paths to get where it’s going, and collects all of the information in its path. From
this point, the information is sent back to a group of servers where algorithms are applied in order
to determine the importance of specific documents. How are these algorithms generated? They are
created by human beings who have a great deal of experience in understanding the fundamentals
of the Internet and the documents it contains, and who also have the capacity to learn from their
mistakes and update the algorithms accordingly. Essentially we have an entity that collects data,
stores it, and then sorts through it to determine what’s important, which it’s happy to share with
others, and what’s unimportant, which it keeps tucked away.

So Let’s Break It Down
To gain a true understanding of what a search engine is, it’s simple enough to compare it to
human anatomy: though not breathing, it contains many of the same core functions required
for life. These are:

The Lungs & Other Vital Organs – The lungs of a search engine, and indeed the vast majority of its
vital organs, are contained within the datacenters in which it is housed, be it in the form of
power, Internet connectivity, etc. As with the human body, we do not generally consider these
important in defining who we are; however, we’re certainly grateful to have them and need them all
to function properly.

The Arms & Legs – Think of the links from the engine itself as the arms and legs. These are the
vehicles by which we get where we need to go and retrieve what needs to be accessed. While we
don’t commonly think of these as functions when we’re considering SEO, they are the purpose of
the entire thing. Much as the human body is designed primarily to keep you mobile and able to
access other things, so too is the entire search engine designed primarily to access the outside
world.

The Eyes – The eyes of the search engine are the spiders (AKA robots or crawlers). These are the
1s and 0s that the search engines send out over the Internet to retrieve documents. In the case of
all the major search engines, the spiders crawl from one page to another following the links, as you
would look down various paths along your way. Fortunately for the spiders, they are traveling mainly
over fiber optic connections, and so their ability to travel at close to light speed enables them to
visit all the paths they come across, whereas we as mere humans have to be a bit more selective.
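
To make the link-following idea concrete, here is a minimal sketch of a spider working through a tiny, made-up "web" held in memory. The page names and the TOY_WEB structure are assumptions for illustration only; a real spider fetches pages over the network and is far more involved.

```python
from collections import deque

# Hypothetical toy "web": each page maps to the pages it links to.
TOY_WEB = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["widget", "home"],
    "widget": [],
    "orphan": ["home"],  # nothing links here, so the spider never finds it
}

def crawl(start, web):
    """Follow links breadth-first from `start`; return pages in visit order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in web.get(page, []):
            if link not in seen:  # skip paths already walked
                seen.add(link)
                queue.append(link)
    return order

print(crawl("home", TOY_WEB))  # ['home', 'about', 'products', 'widget']
```

Note that the "orphan" page is never visited: a spider that only follows links cannot see a page that nothing links to.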

The Brain – The brain of a search engine, like the human brain, is the most complex of its
functions and components. The brain must have instinct, must know, and must learn in order to
function properly. A search engine (and by search engine we mean the natural listings of the major
engines) must also include these critical three components in order to survive.

The Instinct – The instinct of a search engine is defined in its core functions, that is, the
crawling of sites and either the inability to read specific types of data or the programmed
response to ignore files meeting specific criteria. Even the programmed responses become automated
by the engines and thus fall under the category of instinct, much the same as the westernized human
instinct to jump from a large spider is learned. An infant would probably watch the spider or even
eat it, meaning this is not an automatic human reaction.
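
A programmed response of this kind can be as simple as a rule that skips certain file types. The sketch below assumes a hypothetical list of extensions that a toy spider cannot read; the real criteria any given engine uses are not public and will differ.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical file types this toy spider has been told to ignore.
UNREADABLE_EXTENSIONS = {".exe", ".zip", ".swf"}

def should_fetch(url):
    """Return False for URLs whose file extension is on the ignore list."""
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension not in UNREADABLE_EXTENSIONS

print(should_fetch("http://example.com/page.html"))  # True
print(should_fetch("http://example.com/intro.SWF"))  # False
```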

The instinct of a search engine is important to understand; however, once one understands what
can and cannot be read and how the spiders will crawl a site, this will become instinct for you too
and can then safely be stored in the “autopilot” part of your brain.

The Knowing – Search engines know by crawling. What they know goes far beyond what is
commonly perceived by most users, webmasters and SEOs. While the vast storehouse we call the
Internet provides billions upon billions of pages of data for the search engines to know, they also
pick up more than that. Search engines know a number of different methods for storing data,
presenting data, prioritizing data and, of course, ways of tricking the engines themselves.

While the search engine spiders are crawling the web, they are grabbing the stores of data that
exist and sending them back to the datacenters. There the information is processed through existing
algorithms and spam filters and attains a ranking based on the engine’s current
understanding of the way the Internet and the documents contained within it work.
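
The flow described above (collect, filter out spam, rank what remains) can be sketched in a few lines. Everything here is an assumption for illustration: the documents, the "hidden_text" flag used as a spam signal, and the word-count score are stand-ins, not how any real engine works.

```python
def is_spam(doc):
    # Assumed filter: flag any document that was found to contain hidden text.
    return doc["hidden_text"]

def score(doc, query):
    # Assumed scoring: how many times the query word appears in the body.
    return doc["body"].lower().split().count(query.lower())

def rank(docs, query):
    """Drop documents caught by the spam filter, then sort by score."""
    clean = [d for d in docs if not is_spam(d)]
    return sorted(clean, key=lambda d: score(d, query), reverse=True)

# Hypothetical crawled documents sent back to the datacenter.
DOCS = [
    {"url": "a", "body": "blue widgets and more blue widgets", "hidden_text": False},
    {"url": "b", "body": "widgets widgets widgets widgets", "hidden_text": True},
    {"url": "c", "body": "a page about gardening", "hidden_text": False},
]

print([d["url"] for d in rank(DOCS, "widgets")])  # ['a', 'c']
```

Page "b" repeats the keyword most often, yet it never appears in the results because the filter removes it before ranking happens.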

Similar to the way we process an article from a newspaper based on our current understanding of
the world, the search engines process and rank documents based on what they understand to be
true in the way documents are organized on the Internet.

The Learning – Once it is understood that search engines rank documents based on a specific
understanding of the way the Internet functions, it follows that a search engine must have the
ability to “learn”. Only then can new document types and technologies be read, and only then can
the algorithm be changed as new understandings of the functionality of the Internet are
uncovered.

Aside from needing the ability to properly spider documents stored in newer
technologies, search engines must also have the ability to detect and accurately penalize spam,
as well as accurately rank websites based on new understandings of the way documents are
organized and links arranged. Examples of areas where search engines must learn on an ongoing
basis include but are most certainly not limited to:

• Understanding the relevancy of the content between sites where a link is found
• Attaining the ability to view the content of documents contained within new technologies
such as database types, Flash, etc.
• Understanding the various methods used to hide text, links, etc. in order to penalize sites
engaging in these tactics
• Learning from current results and any shortcomings in them what tweaks to current
algorithms or what additional considerations must be taken into account to improve the relevancy
of the results in the future

The learning of a search engine generally comes from the uber-geeks the search engines hire and
from the people who use them. Once a factor is taken into account and programmed into the
algorithm, it then moves into the “knowing” category until the next round of updates.

How This Helps in SEO
This is the point at which you may be asking yourself, “This is all well and good, but exactly how
does this help ME?” An understanding of how search engines function, how they learn, and how
they live is one of the most important understandings you can have in optimizing a website. This
understanding will ensure that you don’t simply apply random tricks in hopes that you’ve listened to
the right person in the forums that day, but rather that you consider what the search engine is
trying to do and whether a given tactic fits with the long-term goals of the engine.

For a while keyword density spamming was all the rage among the less ethical SEOs, as was
building networks of websites to link together in order to boost link popularity. Neither of these
tactics works today. Why? They do not fit with the long-term goals of the search engine. Search
engines, like humans, want to survive. If the results they provide are poor, then the engine will
die a slow but steady death, and so they evolve.
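
For readers unfamiliar with the term, keyword density is simply the share of a page's words taken up by one keyword. The sketch below shows the arithmetic on two made-up snippets; the sample text is an assumption, and no particular cutoff is implied, since the thresholds engines actually use are not public.

```python
import re

def keyword_density(text, keyword):
    """Fraction of the words in `text` that are exactly `keyword`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

# Hypothetical page copy: one stuffed, one written for a human reader.
stuffed = "Cheap widgets! Buy cheap widgets, cheap cheap cheap."
natural = "We sell widgets in many sizes."

print(keyword_density(stuffed, "cheap"))    # 0.625 (5 of 8 words)
print(keyword_density(natural, "widgets"))  # 1 of 6 words
```

A number like the first one is easy for an engine to compute and just as easy to penalize, which is why the tactic did not last.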

When considering any tactic you must ask: does this fit with the long-term goals of the
engine? Does this tactic in general serve to provide better results for the largest number of
searches? If the answer is yes, then the tactic is sound.

For example, the overall relevancy of your website (i.e., does the majority of your content focus on
a single subject?) has become more important over the past year or so. Does this help the
searcher? The searcher will find more content on the subject they have searched for on larger sites
with larger amounts of related content, and thus this shift does help the searcher overall. A tactic
that includes the addition of more content to your site is thus a solid one, as it helps build the
overall relevancy of your website and gives the visitor more and updated information at their
disposal once they get there.

Another example would be in link building. Reciprocal links are becoming less relevant, and
reciprocal links between unrelated sites are virtually irrelevant. If you are engaging in reciprocal
link building, ensure that the sites you link to are related to your site’s content. As a search
engine I would want to know that a site in my results also provided links to other related sites,
thus increasing the chance that the searcher was going to find the information that they are looking
for one way or another without having to switch to a different search engine.
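
To see how easily such patterns can be spotted, here is a small sketch that finds reciprocal links in a made-up link graph and checks whether the two sites share a topic. The site names, the LINKS and TOPICS tables, and the one-label-per-site notion of "related" are all assumptions for illustration; real engines judge relatedness in ways that are not public.

```python
# Hypothetical link graph: each site maps to the set of sites it links to.
LINKS = {
    "widgets.example": {"gadgets.example", "poker.example"},
    "gadgets.example": {"widgets.example"},
    "poker.example": {"widgets.example"},
}

# Hypothetical topic label for each site.
TOPICS = {
    "widgets.example": "hardware",
    "gadgets.example": "hardware",
    "poker.example": "gambling",
}

def reciprocal_pairs(links):
    """Return sorted pairs of sites that link to each other."""
    pairs = set()
    for site, targets in links.items():
        for target in targets:
            if site in links.get(target, set()):
                pairs.add(tuple(sorted((site, target))))
    return sorted(pairs)

def related(pair, topics):
    """Treat two sites as related when they carry the same topic label."""
    first, second = pair
    return topics[first] == topics[second]

for pair in reciprocal_pairs(LINKS):
    print(pair, "related" if related(pair, TOPICS) else "unrelated")
```

In this toy data the hardware sites trading links look natural, while the hardware and gambling pair is exactly the kind of exchange that does nothing for the searcher.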

In Short
In short, think ahead. Understand that search engines are organic beings that will continue to
evolve. Help feed them when they visit your site and they will return often and reward your efforts.
Use unethical tactics and you may hold a good position for a while but in the end, if you do not use
tactics that provide for good overall results, you will not hold your position for long. They will learn.