
Broken Link Checker plugin for WordPress (review)

(Reading time: 7 – 11 minutes)

Broken links frustrate readers and make Google think you are a bad blogger. You will want to eliminate broken links from your blog, and keep them at bay.

Broken links come in two cases:

  1. There’s a bad link on your side; perhaps there is a typo or misspelling in the href attribute of your link.
  2. The web page has moved or disappeared on the target site.

Both cases apply to internal links to your own blog posts and pages, and to external or outbound links targeting web pages elsewhere.

For Case 1, you may need to do a little sleuthing.

Sometimes the problem is obvious, sometimes it’s subtle. The first place to start is to copy the URL directly from the href attribute into your web browser and see what happens.

For example, I just fixed a URL that looked like this: http:http://somedomain.com/. Probably my fault, a cut and paste error.
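A quick sanity check can catch this kind of typo before you paste anything into a browser. Here is a minimal Python sketch, assuming the link is meant to be an absolute http or https URL. The function name href_problems is my own invention for illustration and is not part of any plugin.

```python
from urllib.parse import urlsplit


def href_problems(href):
    """Return a list of obvious problems with an absolute http(s) link.

    Hypothetical helper for illustration; relative links are out of scope.
    """
    problems = []
    parts = urlsplit(href.strip())
    if parts.scheme not in ("http", "https"):
        problems.append("missing or unexpected scheme")
    if not parts.netloc:
        # A doubled scheme such as "http:http://..." leaves no host name.
        problems.append("missing host name")
    return problems
```

Calling href_problems("http:http://somedomain.com/") reports a missing host name, while the corrected http://somedomain.com/ comes back with an empty list.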

For Case 2, links expire for any number of reasons.

On WordPress.com or Blogger blogs, the owner may delete the entire site. The blog or website owner may let the domain name expire, sometimes by accident, whereupon the registrar or a new owner parks the domain.

Usually there is little you can do except notify the site owner (if you can find him or her) and remove the link from your blog post.

However, if the link really is useful, you may be able to find the same web page at a different URL. Perhaps the site owner moved it for some reason, or a redirect was deleted. Google is your friend here.

Find and fix broken links

There are a lot of ways to find broken links. You can examine all the pages on your blog, and click through on the links. But that’s so last millennium, and as Johnson Yip notes,

Clicking every link in your blog can be very time consuming for finding broken links.

Boring.

Instead, here are three ways to automate that task:

  1. Use a web service such as the Link Checker – The W3C Markup Validation Service.

    Using any W3C tool is a smart idea, at least on an occasional basis. The W3C doesn’t have any commercial interest, and provides a neutral, third-party analysis of your site. You may find that the W3C tools catch problems and issues other tools miss.

  2. Use Google Webmaster Tools to investigate site crawl errors.

    I recommend checking your site with Webmaster Tools no matter what, because you see what Google sees. No guesswork.

  3. For WordPress users, install the Broken Link Checker plugin.
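Whichever tool you pick, the first step is the same: collect every link on a page. As a rough illustration (not how any of the three tools above is actually implemented), here is a minimal Python sketch using only the standard library. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags and src values from <img> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchors and images are of interest in this sketch.
        wanted = {"a": "href", "img": "src"}.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.links.append(value)


def collect_links(html):
    """Return every link and image URL found in a string of HTML, in order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Each collected URL would then be requested in turn to see whether it still works, which is the tedious part the tools above take off your hands.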

I really like using the Broken Link Checker plugin for WordPress, and it’s my first defense against broken links.

Let’s take a closer look at this highly useful plugin.

Broken Link Checker features

First, here’s a list of important features:

  • Detects links that don’t work, including 404 Not Found, 410 Gone, 403 Forbidden, Connection Failed, 500 Internal Server Error, Timeout, and Server Not Found (DNS issue).
  • Detects missing images.
  • Periodically checks links in posts, pages, comments and the blogroll. Checking comments is especially important, and you will see after a few months that many of your commenters’ web sites will vanish! Unlink them.

    Trackbacks are also included in comment link checking.

  • New and modified entries are checked ASAP.
  • Notifies you on the Dashboard if any problems are found.
  • Lets you edit all instances of a specific link at once.
  • Gives you a list of all links ever posted on your site, with the ability to search and filter it.
  • Lets you apply custom CSS styles to broken and removed links.
  • Highly configurable.
  • Bug reporting and feature request forum! Forums are a lot of work; take this as a commitment from the plugin author.
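To make the failure categories in the first bullet concrete, here is a minimal Python sketch of how a checker might sort a failed request into those buckets. This is my own illustration using Python's standard library; the plugin itself is a WordPress plugin and very likely works differently. The names classify_failure and check_link are assumptions.

```python
import socket
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def classify_failure(exc):
    """Map an exception raised by urlopen() to a label like those listed above."""
    if isinstance(exc, HTTPError):
        # The server answered, but with an error status such as 404 or 500.
        return f"{exc.code} {exc.reason}"
    # URLError wraps the underlying cause in .reason; other errors are the cause.
    reason = getattr(exc, "reason", exc)
    if isinstance(reason, socket.timeout):
        return "Timeout"
    if isinstance(reason, socket.gaierror):
        # Name lookup failed, so the domain may have expired or been mistyped.
        return "Server Not Found (DNS issue)"
    return "Connection Failed"


def check_link(url, timeout=10):
    """Return "OK" or a failure label for one URL (this makes a real request)."""
    try:
        with urlopen(url, timeout=timeout):
            return "OK"
    except (HTTPError, URLError, OSError) as exc:
        return classify_failure(exc)
```

A real checker would also throttle its requests and recheck failures later, since a site that times out today may be fine tomorrow.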

Benefit: Broken Link Checker will save you a massive amount of time eliminating broken links.

Website In A Weekend has 5353 links (October 3, 2010).

Imagine checking all of those links by hand, or by submitting your blog pagewise to link checking services. Or even if you paid for a full-scale analysis for your blog, you would still have to dig into the posts and pages containing broken links one at a time.

Instead, the plugin saves you time by collecting every link which needs fixing into a simple, intuitive web interface.

Note: when you first install Broken Link Checker, it won’t have any results to report. The plugin needs to run for a while on your blog to collect data. If you prefer to keep your plugin count to an absolute minimum, install Broken Link Checker, let it run for a couple of weeks, then clean up the mess. Uninstall it after you clean up, then reinstall it when you need it again.

Depending on your publishing schedule, repeat this link cleanup monthly to quarterly, and you should be in good shape.

Case study: Gordie Rogers

Long time readers (bofem) will recognize – and welcome back – Gordie Rogers. Gordie used to publish a lifestyle design website and blog, but wasn’t able to make the numbers work at the time. So he took a bit of a break, and now he’s back with Personal Development X.

Note: Gordie brings up a good point in the comments. If the broken link is “otherwise good,” use Broken Link Checker’s “ignore” feature instead of unlinking or deleting.

Gordie has loads of comments here on Website In A Weekend, all pointing to the old lifestyle design articles, and all currently broken.

Let’s give Gordie a hand. We’ll fix the broken links on Website In A Weekend, and get Gordie a few dozen (dofollow) backlinks for his new website. Here’s how we’ll do it:

  1. First, we’ll replace the lifestyle design URL in all of his comments.
  2. Then we’ll unlink, for now, all the CommentLuv links returning HTTP 404 errors.

This can be done fast; it takes maybe 10 minutes. Here’s a 3 minute screencast to show you exactly how it works:

Brief excursion: canonical plugins

WordPress is sufficiently mature and has a sufficiently large user base (8% of the whole web), that it makes sense to maintain a “best of breed” list of plugins worth following in detail. Such plugins are characterized by:

  1. Usefulness.
  2. Good design.
  3. Maturity.

Broken Link Checker meets these criteria, so it’s on the Official Website In A Weekend list. Expect more on this topic of canonical plugins, and how such a list fits into a “micro-niche” strategy for developing authority.

While I’m at it, here are a few words from Janis Elsts, the plugin author.

µ-interview with Janis Elsts

WIAW: What was your main motivation for developing BLC? Frustration? The challenge of coding? Something else?

JE: I must admit I don’t really remember what the initial motivation was. I guess it was one of those lucky ideas, stumbling upon an unfulfilled need.

WIAW: How long have you been working on BLC?

JE: The first version of the plugin was released on 5th August, 2007. So, just over three years now.

WIAW: Offering a Pro version indicates you are committed to BLC as a long(er) term project. Are you comfortable mentioning one or two features users might expect in the future?

JE: A few things that I would like to add, eventually:

  • Link suggestions, i.e. automatically finding alternatives for broken links.
  • Support for internationalized domain names.
  • Bulk URL editing.

As you probably know, there is a dedicated forum where users can suggest new features and provide feedback:
Broken Link Checker forum.

Most likely, any new features (once implemented) will only be available to users who’ve purchased the Pro version.

Speaking of the Pro version…

Help keep Broken Link Checker up-to-date

Broken Link Checker also has a Pro version for the very reasonable price of $4.99 US (October 3, 2010).

The Pro version of Broken Link Checker is available from WP Plugins. It’s advertised through the plugin with a screen options tab (as of October 3, 2010).

By the way, I prefer to promote professional and paid versions of blogging tools, including plugins, and here’s why: it means the author of the plugin is taking his work seriously enough to realize his effort should be compensated. That reassures me that the time I take learning such tools won’t be wasted. Because it’s just horrible to sink hours, days, weeks or more of your life into a technology that dies. Paying a few bucks to help keep worthy technology alive and growing just makes sense.

Anyway, there you are, a great plugin and a few other ways to check broken links. Here are a few questions for you:

Are you regularly checking for broken links?

If so, how?

If not, why not?

Off to the comments!

Leveraging Localhost WordPress for Learning Webmaster Skills

(Reading time: 10 – 16 minutes)

Long time readers know I run several testing installations of WordPress right on my own computer. This is colossally useful for a number of reasons:

  1. New versions of WordPress are easily installed and tested
  2. New plugins and themes can be developed and tested.
  3. Reduces risk of destroying production installations.
  4. And finally, I can learn new skills in a safe, no-risk environment.

If you have been following along in this series, you will have installed Apache, MySQL, PHP and WordPress independently from separate installer packages, and your configuration should be very close to mine. You should have a working WordPress installation, which means MySQL and PHP work essentially correctly.

As it turns out, your Apache configuration is going to be bare bones, and set to very strict default behavior. This is a good thing. Better to learn how to relax your web security incrementally than to get your web security beat upside your head by malicious exploits. Of course, running from localhost, it’s not that big of a deal… if you get hacked you can just get off the network and deal with it. Thus, in addition to WordPress, we’re going to learn some Apache magic as well.

Quick subdomain setup for Windows Apache Localhost

Suppose you want to move a WordPress installation from one subdomain to another. You will need to know something about setting up subdomains, which is the topic of this section.

Setting up a subdomain on your locally hosted Apache webserver is not difficult. There’s plenty of material on the web explaining exactly how to accomplish this task… for specific and different combinations of Apache and Windows versions. As usual, none of these explanations fit my situation exactly. Most likely, none of them will fit yours either.

So here’s mine, which you can use as yet another resource. I’ll add some links I used for reference at the end of this section.

What you need to know:

  • How to use a plain old ASCII text editor. I use XEmacs. You can use Notepad or whatever you want. Just make sure you stay in plain old text.
  • Where your Apache webserver is installed. Mine is installed in C:\Program Files (x86)\Apache Software Foundation\Apache2.2\. You may be using XAMPP, which is fine; these instructions should be adaptable to your situation.

    I’m going to refer to this location as APACHE_DIR for the rest of this article.

Here are the steps for creating a subdomain:

  1. Create a subdirectory. For example, APACHE_DIR/htdocs/wptest. Creating a directory or folder should be easy for you. If this is confusing, leave a comment below.
  2. Edit Apache config to add the subdomain. This part can be a little tricky; you need to know something about how configuration in general works. We’ll take a detailed look at this in the next section.
  3. Edit the Windows hosts file (the Windows counterpart of /etc/hosts). Again, this could be a little tricky. There are a couple of ways to set this up, which I’ll discuss in detail below.

Add subdomain to Apache httpd-vhosts.conf

As mentioned, everyone’s installation is going to be slightly different, so the best I can do is walk you through exactly what I did for my installation.

First, open APACHE_DIR/conf/httpd.conf. Find the lines that look like this:

# Virtual hosts
#Include conf/extra/httpd-vhosts.conf

You want to uncomment the Include line (line 461 in my file) by removing the leading #, so that Apache loads conf/extra/httpd-vhosts.conf.

Now open APACHE_DIR/conf/extra/httpd-vhosts.conf. Make sure you’re serving from the correct port, in my case 8080:

# Use name-based virtual hosting.
#
NameVirtualHost *:8080

I have a couple of example virtual hosts in my conf file, which I’m going to comment out. Below those commented-out sections, I added the following:

<VirtualHost *:8080>
    ServerAdmin admin@localhost
    DocumentRoot "C:/Program Files (x86)/Apache Software Foundation/Apache2.2/htdocs/"
    ServerName localhost
    ErrorLog "logs/error.log"
</VirtualHost>
 
<VirtualHost *:8080>
    ServerAdmin admin@localhost
    DocumentRoot "C:/Program Files (x86)/Apache Software Foundation/Apache2.2/htdocs/wptest"
    ServerName wptest.localhost
    ErrorLog "logs/wptest-error.log"
</VirtualHost>

For my configuration, I need to have my root directory set up explicitly as a VirtualHost. Otherwise, only the wptest subdomain works. Note that I added a specific error log for wptest. This is for my convenience; the webserver doesn’t care.

Use your own email address for ServerAdmin, of course.

Adding to Windows etc/hosts

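This is an easy step. Your hosts file should be located in or very near C:\Windows\System32\drivers\etc. Open this file in Notepad (on recent versions of Windows you will probably need to run Notepad as administrator to save your changes) and add a single line: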

127.0.0.1  wptest.localhost

Restart Apache.

Now, if you used “wptest” as your example as I did, you should be able to click the link http://wptest.localhost:8080/ and it will take you to your new subdomain.
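If the link does not work, it is worth confirming that the hosts entry is really there and spelled correctly. Here is a minimal Python sketch that looks up a name in the text of a hosts file; the function name hosts_lookup is hypothetical, and the sketch only parses text, so it does not touch your system.

```python
def hosts_lookup(hosts_text, hostname):
    """Return the IP address mapped to hostname in hosts-file text, or None."""
    wanted = hostname.lower()
    for line in hosts_text.splitlines():
        # Everything after "#" is a comment in a hosts file.
        entry = line.split("#", 1)[0].split()
        # A valid entry is an address followed by one or more names.
        if len(entry) >= 2 and wanted in (name.lower() for name in entry[1:]):
            return entry[0]
    return None
```

To run it against your real file, read the text first, for example with pathlib.Path(r"C:\Windows\System32\drivers\etc\hosts").read_text(), and pass the result in along with "wptest.localhost".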

Related links for localhost subdomains

These are the articles I used for reference; you should use them too:

  • Team Offshoot on Setting up localhost subdomains
  • Here’s Jared Hocutt’s article setting up subdomains on localhost.
  • After you have read the previous two links, and after you have your subdomain working, go read the official Apache VirtualHost documentation. There is more there than you probably need to know… but you should get familiar with what you don’t know.
  • Zaib Kaleem has an outstanding little article on moving a blog from a subdomain to your main domain. He reminds you to fix the author URLs as well as to update the uploads URLs. There’s another article on the WordPress Codex that covers the same process from a slightly different perspective. You will find it easily using Google.
  • RajeshPG.com shows how to move WordPress from root to a subdirectory. This is a pretty good article, which I saved as a pdf for future reference. NOTE: I haven’t checked this article for accuracy, but the information is good enough that I could fix any typos, blunders or small errors easily.

How to move a blog using .htaccess magic

Suppose you have (as I do) a directory you want to rename from “wp28” to “wordpress28,” perhaps as an illustrative example for your blog. Or whatever. Bradley Charbonneau provides a distinctly telegraphic procedure for moving WordPress between subdirectories, which works fine for me, but bears a bit more in-depth examination for some readers.

So you’re moving a bunch of stuff around, say, like a WordPress blog… using .htaccess. There are lots of wonderful tutorials on the web on rewriting URLs. For our example, the following .htaccess file handles the move:

RewriteEngine On
RewriteBase /wp28/
RewriteRule ^(.*)$ http://localhost:8080/wordpress28/$1 [L,R=301]
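To see what this rule is meant to do, it can help to trace it by hand. The following Python sketch is only a simulation of the intended mapping, written under the assumption that the .htaccess file lives in the wp28 directory, so Apache strips the /wp28/ prefix before matching the pattern. The function name simulate_redirect is hypothetical.

```python
import re

BASE = "/wp28/"                                # directory holding the .htaccess
TARGET = "http://localhost:8080/wordpress28/"  # substitution prefix from the rule


def simulate_redirect(request_path):
    """Return the 301 target the rule is meant to produce, or None if out of scope."""
    if not request_path.startswith(BASE):
        return None  # the .htaccess in wp28 never sees this request
    relative = request_path[len(BASE):]  # the directory prefix is stripped first
    match = re.match(r"^(.*)$", relative)
    if match is None:
        return None
    # $1 in the rule corresponds to the first captured group here.
    return TARGET + match.group(1)
```

For example, a request for /wp28/2009/07/hello-world/ maps to http://localhost:8080/wordpress28/2009/07/hello-world/, which is the permanent redirect we are after.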

But no matter what you try… NOTHING works. Not even a little bit. Are you going crazy? Maybe, but there may be a simpler answer. Open up your httpd.conf file. Does it look like this:

#LoadModule proxy_ftp_module modules/mod_proxy_ftp.so
#LoadModule proxy_http_module modules/mod_proxy_http.so
#LoadModule rewrite_module modules/mod_rewrite.so
LoadModule setenvif_module modules/mod_setenvif.so
#LoadModule speling_module modules/mod_speling.so

The rewrite_module line (line 116 in my file) is the one that matters. You need to uncomment it if you want to use .htaccess for rewriting (which is what you want to do). Then make sure to restart Apache!

Did that do the trick? No? Ok, time to dig deeper.

Check the latest entries in logs/error.log. If you see an error similar to this:

[Mon Jul 20 12:38:19 2009] [error] [client 127.0.0.1] client denied by server configuration: C:/Program Files (x86)/Apache Software Foundation/Apache2.2/htdocs/wp28/.htaccess

you’re dead in the water.

But it’s easily fixable, if you know exactly what to do.
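If your error log is long, a few lines of Python can pull out every request that failed this way. This is a minimal sketch, assuming your log lines follow the same layout as the entry quoted above; the function name denied_paths is hypothetical.

```python
import re

# Matches lines shaped like the Apache 2.2 error log entry quoted above.
DENIED = re.compile(
    r"^\[(?P<when>[^\]]+)\] \[error\] \[client (?P<client>[^\]]+)\] "
    r"client denied by server configuration: (?P<path>.+)$"
)


def denied_paths(log_text):
    """Return the file paths Apache refused to serve, in log order."""
    paths = []
    for line in log_text.splitlines():
        match = DENIED.match(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths
```

Feed it the contents of logs/error.log and look for any path ending in .htaccess.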

The operative phrases here are “client denied by server configuration” and “.htaccess.” Succinctly, you will need to allow Apache to use .htaccess to override the default configuration that denies .htaccess usage. It’s not difficult, and there are a couple of ways to do it. The first way is making the override global. That’s fast and easy, but not as secure. The second way is to override on a per-directory basis, which is what we’re going to do here:

<Directory "C:/Program Files (x86)/Apache Software Foundation/Apache2.2/htdocs/wp28">
    # Give every option a sign; mixing signed and unsigned options is not valid syntax
    Options -Indexes +FollowSymLinks
    # Let .htaccess files in this directory set auth and rewrite directives
    AllowOverride AuthConfig FileInfo
    Order allow,deny
    Allow from all
</Directory>

The AllowOverride line is the key: the AuthConfig and FileInfo options do what we want, namely, allow the redirect from wp28 to wordpress28. Add another one of these Directory blocks for wordpress28 to enable permalinks for the wordpress28 directory. Otherwise, you will be able to write the .htaccess file, but Apache won’t read it.

There’s even more…

If you run a cPanel-hosted site, you can change all the top-level redirects by doing a search and replace in the .htaccess for that site. Here are the steps:

  1. Log in to cPanel, click on the “Redirects” link. You probably already know how to do this; if you’re changing redirects, you had to put them there in the first place, right?
  2. Back up your current .htaccess file. I do it like this: .htaccess → .htaccess_todaysdate.
  3. Use your FTP program to download it for local editing, or edit it on the server if you feel brave. If you have a number of websites, you may already have a flock of .htaccess files lurking, so editing on the server helps reduce the confusion.
  4. Do a search and replace to fix all the new URLs. Upload the new .htaccess and test everything out. Refresh your cPanel Redirects page to ensure everything works. That’s it, you’re done!
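If you download the file for local editing, the backup and the search and replace can be scripted. Here is a minimal Python sketch of steps 2 and 4, under a few assumptions of my own: the backup suffix uses an ISO date (the steps above only say “todaysdate”), and the function names are hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_name(day):
    """Build a dated backup file name such as .htaccess_2010-10-03."""
    return f".htaccess_{day.isoformat()}"


def retarget(text, old, new):
    """Replace every occurrence of the old URL fragment with the new one."""
    return text.replace(old, new)


def update_htaccess(path, old, new):
    """Back up a downloaded .htaccess, then rewrite it in place."""
    path = Path(path)
    backup = path.with_name(backup_name(date.today()))
    shutil.copy2(path, backup)  # keep the original before touching anything
    path.write_text(retarget(path.read_text(), old, new))
    return backup
```

Calling update_htaccess(".htaccess", "/wp28/", "/wordpress28/") on the downloaded copy leaves a dated backup next to it. Review the edited file before uploading, because a plain text replace will also change any matching text you did not intend to touch.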

Giving WordPress its own subdirectory

Perhaps you would like to put all your files in a subdirectory, but have WordPress operate from the top level of the domain. For example, you want mycooldomain.com as your URL, but you want to keep all the files safely tucked away in mycooldomain.com/main1 (don’t ask).

This is so easy and common that the WordPress Codex has just the procedure for you: Giving WordPress Its Own Directory.

That’s great if you are just starting out.

But suppose you have a blog that’s been in operation at mycooldomain.com/main1 for a long time, with dozens or hundreds of posts and pages, and great search engine results. You need to do a little more work to capture all the results going from the old URL to the new. John Godley’s Redirection plugin is just the ticket.

The idea is to set up a new redirection using regular expressions. Using the directions on the Redirection home page, here’s what it looks like:

[Screenshot: Setting up redirection regular expression]

It’s not difficult, and using your localhost installation for testing is smart.
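Before you commit a pattern to the plugin, you can try it out on a few sample paths. The sketch below is a Python illustration only; it assumes a source pattern of /main1/(.*) and a target of /$1, which are my guesses at a reasonable setup and are not copied from the plugin’s settings. The function name new_location is hypothetical.

```python
import re

# Assumed source pattern: everything under the old /main1/ directory.
SOURCE = re.compile(r"^/main1/(.*)$")


def new_location(old_path):
    """Return the top-level path an old /main1/ URL should redirect to, or None."""
    match = SOURCE.match(old_path)
    if match is None:
        return None  # not an old-style URL, so no redirect applies
    # The captured remainder is re-rooted at the top of the domain.
    return "/" + match.group(1)
```

An old post at /main1/2009/07/my-post/ would map to /2009/07/my-post/, and paths that never lived under /main1/ are left alone.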

One last thing: If you already have a sitemap generator installed, such as Google XML Sitemap, make sure you update the location of the resulting sitemap. If you do not, it will attempt to write the sitemap into your old location, and may fail.

And, when you do this “for real” instead of on localhost…

Make sure you change your settings for Google Webmaster Tools!

You will want to ensure the new sitemap uploads and is valid, then you want to examine the details to ensure all the URLs are correct. If you have your redirections set up properly, you may want to leave the existing sitemap in place for a while, until the new one is indexed. Don’t forget about your RSS feeds, FeedBurner, and URLs you may have in any other places.

Themes, plugins and more!

Having a localhost installation makes it easy and safe to test new themes and plugins. Your new plugins can meddle with the database with impunity. If you crash it or corrupt the database, no problem, just reinstall or re-import a testing database.

With themes, you can go wild with CSS and formatting and nobody will ever see how terrible it looks.

Other advantages of running a localhost webserver:

  • It’s easy to dig into Apache configuration, and no-risk when you mess it up (and you will mess it up).
  • Learning to handle various HTTP issues such as 404 errors and 301 redirects gives you a lot of power once you apply your newfound knowledge to your production server.
  • You can develop entire maintenance procedures and scripts locally, testing and debugging before deploying live.

So if you haven’t installed a local webserver, block out a couple of hours and jump in. It’s not difficult, and your knowledge will pay you back in the future. Even if you have no plans to be a “professional” web developer, the more knowledge you have the easier it will be to outsource exactly what you need… without getting ripped off by unscrupulous developers.