apalog (apalogretrieve)

Overview

The program apalogretrieve (binary name: apalog) retrieves data from an Apache logfile using a query syntax that is derived from SQL and is a subset of it.

SQL is normally used for relational databases. Here, an SQL-like language (a subset of SQL) is used to query webserver logfiles.

There are many ways to retrieve data from an Apache logfile.
For a very quick overview of the requests (with graphics support as well), you can use webalizer. It is a nice tool, but like every tool it has disadvantages as well as advantages.
Webalizer averages the data. When a query needs detailed information, this makes it a bad choice.

You could fall back to grep, awk or Perl, or write specialized tools in a general-purpose programming language that perform specific analyses.
But grep is limited here because it matches strings line by line, not against particular fields. awk has no domain-specific named fields: even though associative arrays make things easier to handle, there are no predefined names for the fields of a log entry. Parsing the data can also be a problem, because you first have to work out the appropriate way to select the fields.
apalogretrieve gives you a specific name for each field. And specialized tools may be too narrow in focus, even if they are good at what they are intended to do.
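
To illustrate what "named fields" means for a logfile entry, here is a minimal Python sketch that splits one line in Apache's combined log format into a dictionary. It is not how apalog is implemented (apalog is written in OCaml). The names host, date, request, size, referrer and client are taken from the usage examples on this page; the names ident, user and status, and the assumption that client refers to the user-agent string, are mine.

```python
import re

# Apache "combined" format: %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"
# Group names are illustrative; ident, user and status are assumed names,
# and mapping "client" to the user-agent string is an assumption as well.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<date>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<client>[^"]*)"'
)

def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    # A size of "-" means no body was sent; treat it as 0 bytes.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

sample = ('192.0.2.7 - - [31/Jan/2008:19:30:22 +0100] '
          '"GET /index.html HTTP/1.1" 200 2326 '
          '"http://example.org/" "Mozilla/5.0"')
entry = parse_line(sample)
print(entry["host"], entry["date"], entry["size"])
```

The sketch is deliberately simplified: quoted fields that contain escaped double quotes are not handled.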

If you want to write concise but flexible queries, neither webalizer nor grep nor awk nor specialized tools may be the best choice, IMHO.

What apalogretrieve makes possible is to retrieve fields by name and to filter entries with a WHERE clause and the boolean operators AND, OR and NOT. It also offers a simple mechanism similar to regular expressions, modeled on the LIKE operator in SQL.
IMHO this makes looking up particular entries very convenient.
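
As a rough illustration of the semantics of such a filter, the following Python sketch evaluates a LIKE-style pattern combined with AND and NOT over entries that are already split into named fields. It is a sketch under assumptions, not apalog's actual matching code: the helper like() is hypothetical, only the "%" wildcard is handled, and matching is assumed to be case-sensitive.

```python
import re

def like(value, pattern):
    """SQL-style LIKE: '%' matches any run of characters, the rest is literal."""
    parts = (re.escape(p) for p in pattern.split("%"))
    return re.fullmatch(".*".join(parts), value, flags=re.DOTALL) is not None

# Made-up sample entries, reduced to the two fields used in the filter.
entries = [
    {"host": "mail.example.it", "request": "GET /index.html HTTP/1.1"},
    {"host": "mail.example.it", "request": "GET /icons/back.gif HTTP/1.1"},
    {"host": "www.example.org", "request": "GET /index.html HTTP/1.1"},
]

# Same condition as: where host like "%.it" AND (NOT request like "%icon%")
hits = [e for e in entries
        if like(e["host"], "%.it") and not like(e["request"], "%icon%")]
for e in hits:
    print(e["host"], e["request"])
```

Only the first entry passes: the second is an icon request, and the third is not in the ".it" domain.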

Application Examples

During development of apastat (a logfile analyser for user statistics) I needed to select certain fields when looking up data in logfiles, so that I could check that the analyser works correctly.
For that, apalogretrieve is an invaluable tool!

News from apalog (31st January 2008):

ChangeLog

How to Use apalog

Keywords

Implemented SQL-statements (subset of SQL)

Other Query-Keywords

Field-Names

Other Keywords

Usage Helper

apalog does not implement line editing.
If you want this feature, please use the ledit tool. :) (It is also written in OCaml :))
Debian-package for ledit

Usage Example

Example 1

SELECT host,date FROM "apache-combined.log" where size > 2000;

Example 2

SELECT host,date,client,referrer FROM "apache-combined.log" where host = "foobar.host.net";

Example 3

The task: Look for all entries whose domain names end in ".it", ignoring the entries for icons ("/icons/back.gif", "/icons/folder.gif", "/icons/blank.gif", "/favicon.ico" and so on).

# select host,date,request,referrer from "access.log" where host like "%.it" AND (NOT request like "%icon%");

Example 4: There is no "DISTINCT" clause. How can duplicates be removed anyway?

Invoke apalog like this:
$ cat | apalog | sort -u
and then type, for example, this command:
select referrer from "access.log"; quit;
Each distinct referrer entry is then reported only once.
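
What sort -u contributes in this pipeline can be sketched in a few lines of Python: duplicate lines are dropped and the remaining ones are sorted. The referrer values below are made up for illustration, and the function name distinct_sorted is hypothetical.

```python
def distinct_sorted(lines):
    """Mimic `sort -u`: drop duplicate lines and return the rest sorted."""
    return sorted(set(lines))

# Made-up referrer values, as the select statement might print them.
referrers = [
    "http://example.org/b",
    "http://example.org/a",
    "http://example.org/b",
    "-",
]
for r in distinct_sorted(referrers):
    print(r)
```

Note that Python sorts by code point here, while sort uses the current locale, so the order of the output may differ between the two.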

Necessary to mention ("disclaimer")

Download

apalogretrieve (License: GPL)

Implementation

The language of choice is OCaml, the ultimate language for high-level programming.

If you want to give feedback (feature requests, bug reports, or a note that you like the tool and where you use it), do not hesitate to contact me.


Mail: oliver _at_ first.in-berlin.de
$Date: 2008-01-31 19:30:22 +0100 (Do, 31 Jan 2008) $