We compress our log files using GZip. It does a heck of a job, shrinking them by almost 95% on average. Here's a note-to-self that others might find useful when they need to provide sizing information (e.g. when evaluating web analytics solutions):
- Run the following at a command prompt (inside a batch file, use %%G instead of %G): FOR /R X:\ %G IN (*.log.gz) DO GZip -l "%G" >> C:\Temp\LogStats.txt
- Open the file in Excel
- Sort Column A in alphabetical order
- Delete the repeated header rows, which the sort groups together at the bottom
- Trim the content into the adjacent column (TRIM function)
- Paste the trimmed values back to the first column (Paste Special > Values)
- Split the first column using a space as the delimiter (Text to Columns > Delimited > Space)
- Add up the contents of the uncompressed column to get the total uncompressed size (SUM function)
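
If you'd rather skip the Excel steps, the same total can be computed with a few lines of script. What follows is only a rough sketch in Python, not part of the original procedure. It assumes each data line of LogStats.txt has the usual GZip -l layout (compressed size, uncompressed size, ratio, name), and the function names and the sample listing are made up for illustration.

```python
# Sketch: total the sizes in a "gzip -l" listing such as LogStats.txt.
# Assumed line layout: compressed  uncompressed  ratio  uncompressed_name

def sum_gzip_listing(lines):
    """Return (total_compressed, total_uncompressed) in bytes."""
    total_compressed = 0
    total_uncompressed = 0
    for line in lines:
        # Split into at most 4 fields so names containing spaces stay whole.
        fields = line.split(None, 3)
        # Skip blank lines and the repeated header rows.
        if len(fields) < 3 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        # Skip the summary row gzip prints when listing several files at once.
        if len(fields) == 4 and fields[3].strip() == "(totals)":
            continue
        total_compressed += int(fields[0])
        total_uncompressed += int(fields[1])
    return total_compressed, total_uncompressed


def sum_listing_file(path):
    """Hypothetical helper: read a listing file and total it."""
    with open(path) as handle:
        return sum_gzip_listing(handle)


# Made-up sample listing, for illustration only.
SAMPLE = """\
         compressed        uncompressed  ratio uncompressed_name
                500               10000  95.0% site1.log
         compressed        uncompressed  ratio uncompressed_name
                250                5000  95.0% site2.log
"""

if __name__ == "__main__":
    compressed, uncompressed = sum_gzip_listing(SAMPLE.splitlines())
    print("compressed bytes:  ", compressed)
    print("uncompressed bytes:", uncompressed)
    print("space saved:        {:.1%}".format(1 - compressed / uncompressed))
```

To run it against the real file, call sum_listing_file(r"C:\Temp\LogStats.txt"). One caveat: the gzip format stores the uncompressed size modulo 2^32, so depending on the GZip build, files of 4 GiB or more may be under-reported in the listing itself.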