Does clean HTML lead to better Search Engine Rankings?

Monday, September 21st, 2009

I’ve seen numerous posts where people claim that good, clean HTML leads to better SEO rankings. I’ve even written about the myth of a link between SEO and valid HTML.

I’ve yet to see any studies that verify this claim, and in fact I’ve seen evidence pointing the other way. In another post I pointed out that in some competitive markets, the higher a site ranked, the worse its validation results were. This does not mean that poor code leads to higher rankings, just that valid HTML code doesn’t correlate with better rankings as some people would like to say, just as good clean code doesn’t mean the site will look better to visitors.

Matt Cutts recently did a short video on why Google’s own code doesn’t validate. (video opens new window – not allowed to embed)

Now, does this mean that you should have your web designers write only invalid code? No, but it does mean you shouldn’t waste unnecessary time on validation either.


Valid HTML Helps Search Engine Rankings – NOT!

Sunday, May 17th, 2009

There are lots of interesting theories out there about how one can get their site to rank better in search engines. Unfortunately, much of this advice, while it would seem to make sense, is wrong.

One example is the claim that correct, valid HTML code is important for search engines to read your website, and thus for you to rank highly.

This makes sense for several reasons:

  • bad code may not be read by a search engine as it doesn’t know what it sees,
  • search engines want to promote good code, to clean the web of garbage, or even
  • bad coding appears unprofessional, and therefore is likely to be web spam/fraudulent/etc.

Unfortunately, there is little evidence that any of these statements are true.

First, let’s consider that it is estimated that over 99% of the web is made up of invalid HTML code [source]. If this is the case, imagine being a search engine that cannot read those pages, and consider how few pages it would be able to read. A search engine that searched only valid HTML pages would find so little that no one would really use it.

While many search engines do claim they want to promote good, clean HTML pages, they also realize that their job is to help people find appropriate information on the web. While a web designer might find valid HTML important, the common user is more interested in finding out about the new digital camera, how to download a ring tone to his phone, or other related information.

Of course, the proof is in the pudding, as they say, so I took several search queries. If the given hypothesis is true, then the top-ranked results should have clean, or nearly clean, code.

The first item I searched for was “shoes”. The top three results, in order, were:

  • Shoes.Com – has 253 errors and 124 warnings on its homepage [source]
  • Zappos.com – has 144 errors and 101 warnings [source]
  • payless.com/store/ – has 81 errors and 22 warnings [source]

I also checked two other popular search terms: “travel” (expedia.com with 154 errors and 194 warnings [source]) and “doctor” (webmd.com/physician_finder/ with 101 errors and 32 warnings [source]).

Given that these are popular search terms, one would think that search engines could find plenty of valid HTML webpages. Yet they chose to rank these pages at the top. And as you can see with the shoes examples, the further down in the search results you went, the “better” the web page. So, based upon this basic information, I would have to say that any boost a search engine gives you based upon valid HTML code is limited or, more likely, imaginary.
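As a rough illustration of that last observation, here is a minimal Python sketch that records the validator counts quoted above for the “shoes” query and checks whether the error count drops as you move down the rankings. The data structure and the function name are my own, for illustration only, and three results are far too few to prove anything statistically.

```python
# Validator results for the top three "shoes" results quoted in this post.
# Each row is (rank, site, errors, warnings).
results = [
    (1, "Shoes.com", 253, 124),
    (2, "Zappos.com", 144, 101),
    (3, "payless.com/store/", 81, 22),
]


def errors_decrease_with_rank(rows):
    """Return True if every lower-ranked page has fewer errors than the page above it."""
    ordered = sorted(rows, key=lambda row: row[0])
    errors = [row[2] for row in ordered]
    return all(higher > lower for higher, lower in zip(errors, errors[1:]))


for rank, site, errors, warnings in results:
    print(f"#{rank} {site}: {errors} errors, {warnings} warnings")

# If valid HTML boosted rankings, we would expect this to be False.
print(errors_decrease_with_rank(results))
```

For these three pages the check comes back True: the best-ranked page has the most errors, which is the opposite of what the "valid HTML helps rankings" theory predicts.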

Does this mean we shouldn’t develop valid HTML websites? NO!

Instead, look at developing content which a search engine wants to see. I assume that if my browsers can read it, the major search engines can read it. We should develop new code to be valid, but not worry about fixing old code if it is working. There are clearly other things that we can do to make our sites more search engine friendly.


Search Engines and Flash Files

Thursday, July 31st, 2008

In the past, the search engines (Google, Yahoo, Microsoft Live, et al.) couldn’t really search Flash files that well. Now Adobe has been working with the search engines to allow them to search Flash sites, widgets, buttons, and more.

This is good news on the surface, but it still requires digging a little deeper before trying to get a nifty Flash site. Here are three quick takeaways to know about before you do.

The first thing to know is that you still have to use text, as text, for it to be searchable. Many Flash developers convert the text into something known as shapes so that they can manipulate it more easily to look nice on your screen. While your site will look nicer overall, it will cause the search engine to fail to read that part of your Flash site.

Second, most search engines cannot run JavaScript. Because of a software patent issue, Internet Explorer needs JavaScript to write the Flash file to the webpage. So now, in many cases, your Flash site is no longer searchable.

Third, Flash screens are not the same as web pages. That makes it harder to isolate a topic and rank for it, because the search engine sees the site as a whole, with all of the other text mixed in. On top of that, most Flash developers are inexperienced at Search Engine Optimization and lack good tools to build a search-optimized site (links, individual pages, helpful page elements, etc.), so they will most likely not be able to help you rank the way a good HTML-based website could.

Overall, I would hold off on developing all-Flash sites if you are interested in long-term search engine rankings. (Besides, most developers charge more for Flash sites. Use that money to make more content which can rank in the search engines; it will be money better spent.)

SEOmoz has more information on why Flash and search engines still don’t mix.


Google fills out your search forms

Monday, May 19th, 2008

Occasionally I have found sites whose home page makes you select a company or product from a drop-down box before you can enter the site. Until now, Google could not access the pages behind such a form unless it was given direct links. Pages like these are often called the “Deep Web” or the “Invisible Web”, because search engines cannot access them. Google has said in the past that it believes 80% or more of web pages are “hidden” from it because a form has to be filled out to find them.

Now, if you had hired a good Search Engine Optimizer, this would not be an issue as they would know how to provide links to those pages so all search engines could access them appropriately. However, sometimes the advice of your SEO expert is ignored, or you didn’t include one on your team, and thus search engines can’t access those pages.

In April, Google announced that it would begin searching pages which require a user to fill out a form. This has all types of interesting applications, both good and bad. You need to understand what this means, as well as what it can do. So without further ado, I present The Good, The Bad, and The Truth.

  • The Good:
    • Now more pages will be accessible.
    • Simple “categorical” search forms will no longer cause Google to stumble. For example I recently built a simple movie web application. In it people could search by genre. I had to devise ways to not use a drop down when possible so Search Engines could find the reviews.
  • The Bad:
    • If you tried to “hide” pages, you need to rethink your method. Consider using the robots.txt file or the robots meta tag to properly ask search engines not to process certain files.
    • Some people fear that this means Google will explore or try to hack restricted access sections of your site. (Remember your robots.txt file in these instances.)
  • Some Truth:
    • Only Google has announced this feature. While other search engines will probably have to follow suit, at this time they have not, and they still account for 35-45% of all search traffic.
    • Only simple forms are filled out. Google is not (currently) entering information into text boxes, so many forms cannot be processed.
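To make the robots.txt advice above a little more concrete, here is a minimal Python sketch that uses the standard library's urllib.robotparser to check which URLs a well-behaved crawler would be asked to skip. The rules, paths, and example.com URLs are made up for illustration; they are not taken from any real site, and robots.txt is only a request that polite crawlers honor, not access control.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to stay out of a members
# area and away from form-driven search result pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
Disallow: /search
"""


def is_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A page inside the restricted section is disallowed...
print(is_allowed(ROBOTS_TXT, "http://example.com/members/account"))
# ...while an ordinary content page is still crawlable.
print(is_allowed(ROBOTS_TXT, "http://example.com/reviews/comedy"))
```

Anything truly private should still sit behind a login, since robots.txt only tells cooperative search engines what you would prefer they leave alone.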
