Recently I needed to merge several log files and sort the records by timestamp. I first bulk-renamed the log files so they all had the .txt extension, then ran the following shell script.
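#!/bin/sh
MFILE=merge_file.txt
SFILE=complete_file.txt
# Remove output left by an earlier run so it is not merged back in.
rm -f "$MFILE" "$SFILE"
# Concatenate every .txt log in the current directory.
for file in *.txt
do
    cat "$file" >> "$MFILE"
done
# Sort on the 4th blank-separated field (the timestamp).
sort -k4,4 "$MFILE" > "$SFILE"
rm "$MFILE"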
Note:
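All log files are expected to have the same format; otherwise the sort key will not line up across files and the result will not be in timestamp order.
-k, --key=POS1[,POS2] starts the sort key at field POS1 and ends it at field POS2. Here -k4,4 limits the key to the fourth field, which is where the timestamp sits in my log files.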
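To make the -k4,4 key concrete, here is a small sketch on made-up data. The log layout (host, app, level, timestamp, message) and the ISO-8601 timestamps are assumptions for illustration, not the format of my real logs. The default sort compares the key as text, so it only gives chronological order when the timestamp sorts correctly as a string.

```shell
#!/bin/sh
# Hypothetical log lines: the 4th blank-separated field is the timestamp.
workdir=$(mktemp -d)
printf '%s\n' \
  'web01 app INFO 2024-03-01T10:15:00 request served' \
  'web02 app WARN 2024-03-01T09:58:12 slow response' \
  'web01 app INFO 2024-03-01T10:02:47 cache refreshed' > "$workdir/sample.txt"

# -k4,4 restricts the key to field 4; plain -k4 would extend the key
# from field 4 to the end of the line.
# LC_ALL=C forces byte-wise comparison so the order does not depend on locale.
sorted=$(LC_ALL=C sort -k4,4 "$workdir/sample.txt")
printf '%s\n' "$sorted"
rm -r "$workdir"
```

If the timestamp were a plain number (for example epoch seconds), -k4,4n would probably be the better key, since a text comparison puts 10 before 9.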
2 comments:
find/xargs is your friend. You could do away with your ls/for loop like:
find . -name '*.txt' | xargs sort -m -k4,4 > $SFILE
Then you wouldn't need to keep deleting $MFILE, and you'd potentially do a lot less work.
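A rough sketch of that find/xargs approach on made-up data follows. The file names and log lines are hypothetical. Two assumptions matter: sort -m only merges, so each input file must already be in timestamp order, and -print0 / -0 (used here to survive spaces in file names) are assumed to be available in your find and xargs.

```shell
#!/bin/sh
workdir=$(mktemp -d)
# Two hypothetical logs, each already sorted on field 4.
printf '%s\n' 'a x INFO 2024-03-01T09:00:00 one' \
              'a x INFO 2024-03-01T11:00:00 three' > "$workdir/a.txt"
printf '%s\n' 'b x INFO 2024-03-01T10:00:00 two' \
              'b x INFO 2024-03-01T12:00:00 four' > "$workdir/b.txt"

# Write the result outside the searched directory so find never picks it up.
outfile=$(mktemp)
find "$workdir" -name '*.txt' -print0 |
  LC_ALL=C xargs -0 sort -m -k4,4 > "$outfile"

merged=$(cat "$outfile")
printf '%s\n' "$merged"
rm -r "$workdir" "$outfile"
```

With a very large number of files, xargs may split them across several sort invocations, in which case the combined output would no longer be globally ordered. Dropping -m and sorting everything in one pass is the safer choice when the inputs are not known to be presorted.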
What about timestamp wrapping? If your timestamp reaches its maximum value and wraps to 0, how do you handle the sorting after that?