Electric Type

Log File Lowdown

Page 3 — A Sample Log File

Next time you fire up your FTP client or log in to your Web server, take a moment and dig around for your log files. On most Web servers, you will find a directory — usually in your root directory, or the parent directory just above it — named "logs" or "stats". Inside, you will most likely see a file with a .log, .web, or .clf extension. Since Web logs are essentially text files, some will even have a .txt file extension. Download the log file, save it to a local drive, and have a look.

Most servers generate CLF (Common Log Format) files, but logs also come in other flavors, like ELF (Extended Log Format) and the Combined Log Format, which is CLF with referer and user-agent fields tacked onto the end of each line. Some servers produce files with different extensions in different formats, but most of the log file types out there are formatted much like CLF files. For this reason, we'll use the structure of a CLF file for our example.

In Common Log Format files, each line represents one request. So if a user comes to your site and is served a page with three images, it shows up as four lines of text in your CLF file — one request each for the three images and one request for the HTML file itself.
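To make the "one line per request" idea concrete, here is a minimal Python sketch. The four paths are made up for illustration (one page plus its three images), and treating only .htm and .html files as pages is an assumption you would adjust for your own site.

```python
# Hypothetical requested paths from four consecutive log lines:
# one HTML page and the three images it pulls in.
requested_paths = [
    "/index.htm",
    "/images/logo.gif",
    "/images/photo1.jpg",
    "/images/photo2.jpg",
]

# Assumption: only these extensions count as pages.
PAGE_EXTENSIONS = (".htm", ".html")

# Every line in the log is one request, often called a "hit".
hits = len(requested_paths)

# A page view is a request for the HTML file itself.
page_views = sum(
    1 for path in requested_paths if path.lower().endswith(PAGE_EXTENSIONS)
)

print(hits, "hits,", page_views, "page view")
```

This is why raw hit counts tend to overstate traffic: a single visit to an image-heavy page produces many lines.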

CLF files are standardized, so they almost always look the same. A normal CLF file logs the data in this format:

user's computer  ident  userID  [date and time]  "requested file"  status  filesize

The fields are separated by spaces. Some fields are also wrapped in punctuation: the date sits inside square brackets, and the request information sits inside quotes. If a field has no value for the request being logged, the server puts a hyphen in its place. Let's look at these fields one by one.

  • The remote host information shows the IP address and, in some cases, the domain name of the client computer requesting the file.
  • The ident information is logged if your server is running IdentityCheck, an antiquated directive that was once used for thorough server logging. It was phased out of general use because it required the identification process to run every time a file was served. Because this process can sometimes take 5 or 10 seconds, most sites turn IdentityCheck off so that their pages load more quickly.
  • If your site requires a password upon login, the userID that the user entered is logged in this field. If you don't have any user login features on your site, this field is no big deal.
  • The date field is straightforward — the date and time of the request is logged here.
  • The request field logs the type of request made by the user, as well as the path and name of the requested file.
  • The status field contains a three-digit code that tells you if the file was transferred successfully or not. These codes are standard HTTP codes.
  • The filesize field is also straightforward: it lists the number of bytes transferred when the requested file was served.
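If you'd rather let a script pick the fields apart, the seven-field layout above can be sketched as a regular expression. This is a minimal Python example, not an official parser: the pattern, the function name parse_clf_line, and the sample line (which uses a documentation-only IP address) are all assumptions made for illustration.

```python
import re

# Assumed pattern for the seven fields described above:
# host, ident, userID, [date and time], "request", status, filesize.
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Return a dict of CLF fields, or None if the line does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # A hyphen means the field had no value for this request.
    for key in ("ident", "user"):
        if fields[key] == "-":
            fields[key] = None
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields

# Made-up sample line for illustration only.
sample = '192.0.2.10 - - [09/May/2001:13:42:07 -0700] "GET /index.htm HTTP/1.1" 200 1024'
print(parse_clf_line(sample))
```

Returning None for lines that don't match lets you skip or count malformed entries instead of crashing halfway through a big file.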

For the following example, I've extracted one line from a log file that records the activity on my own personal website, snackfight.com. My hosting company serves my site using Apache, and they've tweaked a few options to provide me with more comprehensive data. (Apache's mod_log_config module allows you to customize the string that's fed into the logs.) I've divided this logged request into its separate parts for clarity — normally, all of this data would be dumped onto one single line in the log file.

adsl-63-183-164.ilm.bellsouth.net - - [09/May/2001:13:42:07 -0700]
"GET /about.htm HTTP/1.1" 200 3741
"http://www.e-angelica.com"
"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"

The first part of the request shows the user's host name. I can see that this is a DSL subscriber on the BellSouth network. The two hyphens that follow are where the IdentityCheck and userID information would normally show up, but since my site does not use either of these features, I get nothing but hyphens. Next, in brackets, is the date, then the time (in 24-hour format), followed by the time zone offset from UTC. That offset belongs to the server's clock, not the user's: -0700 means the logged time is seven hours behind UTC.
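The bracketed timestamp is easy to turn into a real date object. Here is a small Python sketch using the standard library; the function name parse_clf_time is made up for this example, and the month abbreviation is assumed to be in English (the %b directive depends on your locale).

```python
from datetime import datetime, timezone

# Day/abbreviated month/year:hour:minute:second, then the UTC offset.
CLF_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

def parse_clf_time(text):
    """Parse the text between the square brackets into an aware datetime."""
    return datetime.strptime(text, CLF_TIME_FORMAT)

stamp = parse_clf_time("09/May/2001:13:42:07 -0700")
print(stamp.isoformat())                           # 2001-05-09T13:42:07-07:00
print(stamp.astimezone(timezone.utc).isoformat())  # 2001-05-09T20:42:07+00:00
```

Converting everything to UTC is handy when you compare logs from servers in different time zones.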

The request field, displayed within quotes, shows that the user asked the server to GET a page. Other request types are POST, DELETE, and HEAD, though you don't see those nearly as often. Following the request type is the path and name of the file. In this case, the user was requesting the "about.htm" file in the root directory of snackfight.com. Also, you can see that the protocol used here was the good old Hypertext Transfer Protocol, version 1.1.
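Since the request field is just three pieces separated by spaces, splitting it takes only a few lines. This Python sketch is one way to do it; split_request is a made-up helper name, and returning None for anything that doesn't have exactly three parts is a design choice for this example, not a rule from any spec.

```python
def split_request(request):
    """Split a request field into method, path, and protocol.

    Returns None when the field does not have exactly three parts,
    which can happen with malformed or empty requests.
    """
    parts = request.split()
    if len(parts) != 3:
        return None
    method, path, protocol = parts
    return {"method": method, "path": path, "protocol": protocol}

print(split_request("GET /about.htm HTTP/1.1"))
```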

The status field shows a status code of 200, meaning that everything went through just peachy. A status code of 404, as you may know, means that the file was not found on the server. Immediately following the status code is the file size of "about.htm". It's 3,741 bytes. Hey, not bad! I'll bet it loaded nice and quick.
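The first digit of a status code tells you its general family, which makes for a quick health check on a log. Below is a rough Python sketch that groups codes by family; the list of codes is invented for illustration, and status_class is a hypothetical helper name.

```python
from collections import Counter

def status_class(code):
    """Map a three-digit HTTP status code to its broad family."""
    families = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return families.get(code // 100, "unknown")

# Made-up status codes, as if pulled from six log lines.
codes = [200, 200, 304, 404, 200, 500]

tally = Counter(status_class(code) for code in codes)
for family, count in tally.most_common():
    print(family, count)
```

A sudden pile of client errors usually points to broken links, while server errors suggest something is wrong on the server itself.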

Referers

The next two fields are especially interesting. These are custom fields that my hosting company has added to its logging so that I can get a better idea of who's visiting my site. The first field, in quotes, is the referer field. It holds the address of the page where my user clicked a link to arrive at the page he was just served. I can see that this particular user is a fan of the e-angelica site, because that's where he came from to arrive at my site. In some cases, referers are logged in their own log file. These referer logs usually use the same format and can also be viewed or run through an analyzer. For the full skinny on referer logs, check out Jeff's article.

The last field, also in quotes, shows some information about the user's browser and platform, in this case, Internet Explorer 5.0 on a Windows 98 machine. Oh, how original!
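Putting it all together, here is a Python sketch that pulls every field out of the snackfight.com entry shown earlier, joined back onto a single line the way it would appear in the real file. The pattern is an assumption that fits this particular layout (the seven CLF fields followed by a quoted referer and a quoted browser string); a server configured with a different custom format would need a different pattern.

```python
import re

# Assumed pattern: the seven CLF fields plus quoted referer and browser fields.
COMBINED_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# The example entry from this article, rejoined into one line.
line = (
    'adsl-63-183-164.ilm.bellsouth.net - - [09/May/2001:13:42:07 -0700] '
    '"GET /about.htm HTTP/1.1" 200 3741 '
    '"http://www.e-angelica.com" '
    '"Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"'
)

match = COMBINED_PATTERN.match(line)
fields = match.groupdict() if match else {}

for name, value in fields.items():
    print(name + ":", value)
```

All values come back as strings, so convert the status and size to integers before doing any arithmetic with them.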

And that's about it! It's a lot of information, I know, and your log file may store even more goodies. (The header fields and server status codes that show up in log files are explained in depth in the massive spec for HTTP 1.1, which is a useful reference when you need to look one up.)

All this data is a little overwhelming, no? Especially in its raw state. If you're not exactly thrilled about the idea of picking through thousands of lines of text and status codes to determine whether or not your users are being served in the most efficient manner, there are several software packages on the market that you can use to generate reports without getting your hands dirty (and without opening your text editor). But which one is right for you? Well, that all depends on what you're looking for.
