Why link to index.php?

Well, you can call me pedantic for writing a whole blog entry about this, but I think it’s important. We are web designers, so we should care about every little thing that helps us improve our websites.

I noticed that most people do not worry about what to link to when linking to the home page of their sites. I think that’s a big mistake. Let me explain what I mean. Normally, you have a link back to the entry page of your website, often titled “home”. But where does it point? A little case study:

  • The CSS Vault links to “http://www.cssvault.com/”, but that will always redirect you to “http://cssvault.com/”
  • Dave Shea links to “/”.
  • Jeremy Flint links to “/wp/index.php”
  • Yellowlane links to “http://www.yellowlane.com/”
  • Anil Dash links to “index.php”

So, what’s the point? Well, in the worst case you can make three sites out of one. Visitors will still see one site, but search engines will think there are three.

I’ll take my URL as an example.

If you have “http://www.julian-bez.de/”, you can make three sites out of it:

  • http://www.julian-bez.de/
  • http://www.julian-bez.de/index.php
  • http://julian-bez.de/

A search engine will see three completely different addresses. Hopefully no one will link to your site with an “index.php” or something like that in the URL, so don’t make that mistake yourself.
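To make the point concrete, here is a minimal sketch that reduces the three addresses above to a single one. The function name `canonical_home_url` and the choice of the www host as the preferred one are assumptions for illustration only; this is not a description of how any search engine actually compares URLs.

```php
<?php
// Hypothetical helper: reduce a home-page URL to one canonical form.
// Assumptions: the preferred host keeps "www.", and "index.*" is the directory index file.
function canonical_home_url(string $url, string $canonicalHost = 'www.julian-bez.de'): string
{
    $parts  = parse_url($url);
    $scheme = $parts['scheme'] ?? 'http';
    $host   = strtolower($parts['host'] ?? '');
    $path   = $parts['path'] ?? '/';

    // Treat the bare domain and the www host as the same site.
    $bare = preg_replace('/^www\./', '', $canonicalHost);
    if ($host === $bare || $host === 'www.' . $bare) {
        $host = $canonicalHost;
    }

    // Drop a trailing index file such as index.php or index.html.
    $path = preg_replace('#/index\.[a-z0-9]+$#i', '/', $path);

    return $scheme . '://' . $host . $path;
}

$variants = [
    'http://www.julian-bez.de/',
    'http://www.julian-bez.de/index.php',
    'http://julian-bez.de/',
];

// All three variants collapse to the same address.
$unique = array_unique(array_map('canonical_home_url', $variants));
```

After this runs, `$unique` holds one entry instead of three, which is the situation you want a crawler to find as well.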

Don’t link to “/index.php”

There’s another problem that I think should be mentioned: do we need a “www” or not? Some people let their visitors decide and support them in their choice. Take mezzoblue as an example. The “home” link always takes us to “/”, which could be “http://www.mezzoblue.com/” or “http://mezzoblue.com/”. Google has indexed both.

I constantly try to avoid this problem by linking to the full URL. If Dave didn’t want anyone to use “http://mezzoblue.com/”, he could have set the “home” link to “http://www.mezzoblue.com/” instead of “/”. That doesn’t prevent use of the non-www address, but it shows the reader which one should be used. “What Do I Know”, for instance, does so. Todd links to “http://whatdoiknow.org/”, which means he doesn’t want the “www” to be used. If I go to the site using “www”, later click on “home”, and then bookmark it because it’s nice, I won’t have the “www” in the bookmark.
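One way to keep full-URL links consistent is to build them all from a single configured base address. The sketch below assumes a constant called `CANONICAL_BASE` and two helpers, `site_url` and `home_link`; all three names are made up for this example and the domain is a placeholder.

```php
<?php
// Hypothetical configuration value: the one address the site should be known by.
const CANONICAL_BASE = 'http://www.yourdomain.com/';

// Build an absolute URL for an internal path, always on the canonical host.
function site_url(string $path = ''): string
{
    return rtrim(CANONICAL_BASE, '/') . '/' . ltrim($path, '/');
}

// Render the "home" link with the full URL instead of "/" or "index.php".
function home_link(string $label = 'home'): string
{
    return '<a href="' . htmlspecialchars(site_url(), ENT_QUOTES) . '">'
        . htmlspecialchars($label, ENT_QUOTES) . '</a>';
}

echo home_link();
```

If the preferred address ever changes, only the constant has to be edited, and every generated link follows.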

What I want to say is: Don’t make multiple sites out of one. Avoid using “index.*” and link to the full URL.

To prevent the use of “/index.php” on my site, I have a PHP script on the entry page that checks whether the user agent is requesting “index.php”. If so, it sends a “Location:” header. No robot will ever index “index.php”. Ha!

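// Send requests for the index file to the canonical root URL.
if (strstr($_SERVER['REQUEST_URI'], '/index')) {
    header('Location: http://www.yourdomain.com/', true, 301); // permanent redirect
    exit; // stop here so the page body is not sent as well
}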
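A slightly more defensive variant of the snippet above could separate the decision from the header call, keep the query string, and handle the www question in the same place. This is only a sketch under assumptions: the function name `canonical_redirect_target` is invented, the scheme is assumed to be plain http, and `www.yourdomain.com` is a placeholder for your preferred host.

```php
<?php
// Hypothetical helper: returns the URL to redirect to,
// or null if the request already uses the canonical address.
function canonical_redirect_target(string $host, string $requestUri, string $canonicalHost): ?string
{
    // Split off the query string so "/index.php?p=12" can become "/?p=12".
    [$path, $query] = array_pad(explode('?', $requestUri, 2), 2, null);

    // Collapse a trailing index file (index.php, index.html, ...) to its directory.
    $cleanPath = preg_replace('#/index\.[a-z0-9]+$#i', '/', $path);

    if ($cleanPath === $path && strtolower($host) === $canonicalHost) {
        return null; // nothing to fix
    }

    $target = 'http://' . $canonicalHost . $cleanPath;

    return ($query === null || $query === '') ? $target : $target . '?' . $query;
}

// Usage sketch inside a web request (skipped when run from the command line).
if (PHP_SAPI !== 'cli' && isset($_SERVER['HTTP_HOST'], $_SERVER['REQUEST_URI'])) {
    $target = canonical_redirect_target(
        $_SERVER['HTTP_HOST'],
        $_SERVER['REQUEST_URI'],
        'www.yourdomain.com' // assumed preferred host
    );
    if ($target !== null) {
        header('Location: ' . $target, true, 301); // permanent redirect
        exit;
    }
}
```

Keeping the decision in a plain function makes it easy to check a few URLs by hand before the redirect goes live. Note that `HTTP_HOST` can include a port number, which this sketch does not handle.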

Update: See also the follow-up, “www or not”.

Published by

Julian Bez

Julian Bez is a software engineer and former startup founder from Berlin, Germany.

  • http://www.jeremyflint.com Jeremy Flint

    I think most people just link to the actual file that represents the homepage because it is common practice. I have been designing sites for about 9 years (4-1/2 professionally) and I have always linked “home” to index.*. Why? I have never really seen any harm in it.

    How sure are you on the validity of search engines “seeing” three sites? Search engines index the content of the file, and if all three links point to the same file…

    My home link actually points to index.php, but /wp is my webroot from when I installed WordPress (I am too lazy to move everything out to the actual webroot).

    As for the non-www issue, I think it is completely annoying when going to a site without the WWW does not work. Adobe.com was like this for the longest time.

  • http://www.julian-bez.de/blog/ Julian

    most people just link to the actual file that represents the homepage

    That’s true, but users will never have a URL with index.* in mind. They won’t type “http://www.cssvault.com/index.php” in the address field of their browsers and press enter. And they don’t see your site as consisting of files, so I think one should not confuse them. The index.* is unnecessary and makes the displayed URL longer. Leave it out.

  • http://intosh.co.uk/ Richard Farmer

    Jeremy, just because you have been a web developer for 9 years and have been linking to index.* all that time does not make it good practice. In fact many people think that having file extension meta data in a uri is bad practice too. Servers are designed to look at the root of a directory for files such as index.htm, index.html, index.php and even index.xml, so why take the long way round and bypass this ever-so-useful feature of the server when a simple ./ will do? If I had to visit http://www.apple.com/index.php every time I wanted to look up some Apple info I would get mighty annoyed; instead, most of the time I type apple.com and all is fine with the world.

  • http://shunuk.co.uk/ Paul Connolley

    I wish to concur with Richard in his statement:

    many people think that having file extension meta data in a uri is bad practice

    I believe that it can fool visitors when viewing a website. Prime examples of mind-boggling URIs are those given by online stores such as Amazon and Apple. Furthermore, redirecting somebody transparently to your home directory is one of the most important items on a web developer’s to-do list. I loathe websites such as egg.com that forward me to a subdirectory. We are in a millennium of fantastic technology; people are designing devices which will carry us to Mars and further. Can’t we present information properly and without hassle for the user?

  • http://jeffcroft.com Jeff Croft

    While I usually do not link directly to index.php, there is a reason why you should do this, at least from a server administrator’s perspective. There’s a very fundamental difference between “http://jeffcroft.com/index.php” and “http://jeffcroft.com/”. The first one is a file and the second is a directory. When linking to a directory, the web server is forced to locate the proper file to display (in this case, index.php), which adds processing time and load.

    I’m more of a user interface person, and as such I think linking to the directory (http://jeffcroft.com/) is more user-friendly. However, I thought I’d point out that doing this does have a backend impact and your server administrator might like to slap your wrist for it.

    As for using www or not — there’s really no reason that any server should be set up to require this these days. If yours is, please bitch and moan to your admin. If it is not, then I believe you should drop the www in your links and marketing materials, just for the sake of simplicity. Of course, you should make sure your server is configured to work when someone enters the www, because many users will, whether you want them to or not.

  • http://e26.co.uk Eddie Sowden

    On the point of the www. try here. On the point of the /index.php, I don’t really think it matters as long as you follow Sir Tim Berners-Lee’s “Cool URIs don’t change”. Does it really matter?

  • http://www.zelph.com/ Aaron Barker

    I don’t have an official opinion either way, and don’t think I am even consistent on the sites I maintain.

    That being said, there is another thing to consider. What happens when (for whatever reason) you have to change technologies from .php to .jsp (asp, cf, etc.)? Anyone who has linked to your filename with that extension now has a broken link.

    This may not happen all that often, but is still something to be considered. This is already a situation for internal pages that must use extensions, but could potentially be worse for your front door.

  • http://intosh.co.uk/ Richard Farmer

    The first one is a file and the second is a directory. When linking to a directory, the web server is forced to locate the proper file to display (in this case, index.php), which adds processing time and load.

    Dear god man, I’m glad you informed us all of this. I didn’t realise webservers weren’t designed to do this job and the processing load was so immense as to cost a million cycles per second of the process.

    Get a real webserver and watch the immense load when it resolves to index.* I think you will find the Apache foundation know what they are doing.

  • http://www.jeremyflint.com Jeremy Flint

    I have been linking directly to the file only because it is the way I have done it. Seemed better (for some reason) than putting the entire URL in the href. Most servers that our clients sites are hosted on are not set up to support using the “/” in the href and having it go directly to the root.

    I don’t really have an official opinion nor do I have anything I can point to that says definitely, for sure, this way or that way is the way to do it.

  • http://intosh.co.uk/ Richard Farmer

    I have been linking directly to the file only because it is the way I have done it. Seemed better (for some reason) than putting the entire URL in the href. Most servers that our clients sites are hosted on are not set up to support using the

  • http://vna.com.au/ Lachlan Hardy

    I’ve always been a fan of minimalist linkage. I link to the directory rather than the file.
    1. It is shorter
    2. As Aaron pointed out, it is a vague attempt at future-proofing (just in case you ever decide to swap technologies without changing your site structure)
    3. Additional processing is worth it for the usability gains

    I’m also a fan of no www. The www sub-domain has no function and simply adds length to the URL

  • http://www.jeremyflint.com Jeremy Flint

    I can guarantee you that there are some servers out there where if you linked just to “/” or “/images”, it would not know what you are talking about.

  • http://nathanlogan.com Nathan Logan

    I can guarantee you that I don’t know of them.

    Neither Apache nor IIS has this problem (that is, you can set up default files to be served when that directory is requested).

    Now some servers may very well have the problem of an inept systems administrator who has not a clue about how to set this feature up properly. If that is the case, I think this is the least of your problems.

  • http://nathanlogan.com Nathan Logan

    BTW, thanks for the article.

    I had not considered that search engines will grab http://www.nathanlogan.com and nathanlogan.com as two different things. Good tip!

  • http://www.jeremyflint.com Jeremy Flint

    Maybe I am thinking more of the mod-rewrite, using /about instead of about.php…

    That is very possible. It has been a long week already.

  • http://www.julian-bez.de/blog/ Julian

    @Nathan:
    I have to admit that I don’t know if e.g. Google handles the two things as completely different. Check this Google search.
    What would you say? The result with www has (according to this nice Firefox extension) a PR of 7, the non-www result a PR of 6.

  • http://www.jeremyflint.com Jeremy Flint

    Is that search really relevant? How many people (normal web users) know to search using those commands? Even if you search for just “mezzoblue”, the top result google comes back with uses http://www.mezzoblue.com as the url.

  • http://www.julian-bez.de/blog/ Julian

    It was just to show you that Google has indexed both. Normal web users won’t do that search, yes.

  • http://www.jeremyflint.com Jeremy Flint

    ah, gotcha.

  • http://shunuk.co.uk/ Paul Connolley

    On my website /about and /about/ both resolve to about.htm. If you believe that is causing problems on my web-server you need to check the manual.

    Also the difference between /images and /images/ is minimal. I know that I have never had issue with it. May I also point out that I do not present directory structures to visitors so there is no reason to worry either.

  • http://mathibus.com/ MaThIbUs

    I wrote a post about www.-less URIs a couple of months ago, too. And you’re right: index.php is indeed cruft, even in post permalinks on blogs that don’t use the fancy .htaccess option (why use /index.php?p=xxx when /?p=xxx works just as well?).

    People who use your .htaccess code (or mine) can use relative linking without being afraid of Google double-caching their site.

  • djn

    Most of the time it’s not people linking to index.whatever, it’s their authoring software that “knows better”. Also, it’s hard to expect much attention for link consistency somewhere in the future from people who don’t care if their pages are broken in another browser (or platform, or resolution) now.

  • http://xmouse.ithium.net David House

    About the www thing:

    People are going to type both with and without the www. to get to your site, because without the www. is quicker except for users that know the Ctrl+Enter shortcut. Hence, following the path of least surprise, both should be available and lead to your site. I’d redirect people toward the version without the www. myself, just for redundancy reasons. But the point is we can’t usably support only one version of our site, so Google will inevitably index both.

  • http://mathibus.com/ MaThIbUs

    David House: I’m not claiming we should all make our sites class C, ’cause that would just be overkill. I was promoting the silent redirection to the non-www. URI.
