Papertrail log searching with PowerShell

Papertrail is a great free* cloud-based distributed log management system. If you aren't currently using one for your application, I highly recommend it. The benefits it gives you far outweigh the relatively simple process of setting it up. And compared to tools like Splunk, it's considerably cheaper.

However, sometimes clients can have unusual requests and require you to go digging back quite a while in the logs. Papertrail offers flexible search periods, from two days on the free plan up to two weeks, but sometimes that's not enough. They also allow you to back up the data to AWS S3 for long-term storage, so you can always pull down those files and search them yourself. However, Papertrail also has its own API, and I wanted to give that a try.

Full disclosure, I am not very good at PowerShell; in fact, I would say I am terrible. But that is never a good reason not to try, so I gave it a good hot go. After about ten minutes and a lot of googling I managed to put together something fairly simple that seems to work, even with a long date range. The first thing you will need to do is retrieve your API token from your profile page (at the bottom). Then copy the script below and update the first four lines based on your criteria.

$startDate = Get-Date -Date '2019-08-01'
$endDate = Get-Date
$searchString = 'Find Me'
$paperTrailToken = 'YOUR CODE'

Function DeGZip-File {
    Param(
        $infile,
        $outfile = ($infile -replace '\.gz$','')
    )

    # Note: the streams deliberately avoid the name $input, which is an automatic variable in PowerShell
    $inStream = New-Object System.IO.FileStream $infile, ([IO.FileMode]::Open), ([IO.FileAccess]::Read), ([IO.FileShare]::Read)
    $outStream = New-Object System.IO.FileStream $outfile, ([IO.FileMode]::Create), ([IO.FileAccess]::Write), ([IO.FileShare]::None)
    $gzipStream = New-Object System.IO.Compression.GzipStream $inStream, ([IO.Compression.CompressionMode]::Decompress)

    # Copy the decompressed bytes across in 1 KB chunks
    $buffer = New-Object byte[](1024)
    while ($true) {
        $read = $gzipStream.Read($buffer, 0, 1024)
        if ($read -le 0) { break }
        $outStream.Write($buffer, 0, $read)
    }

    $gzipStream.Close()
    $outStream.Close()
    $inStream.Close()
}

# Download the list of available archives from the Papertrail API
$jsonArchiveFile = 'archives.json'
$cli = New-Object System.Net.WebClient
$cli.Headers['X-Papertrail-Token'] = $paperTrailToken
$cli.DownloadFile('https://papertrailapp.com/api/v1/archives.json', $jsonArchiveFile)

# Keep only the archives whose start time falls within the requested date range
$JSON = Get-Content $jsonArchiveFile | Out-String | ConvertFrom-Json
$files = $JSON | Where-Object {
    ([datetime]::Parse($_.start) -ge $startDate) -and ([datetime]::Parse($_.start) -le $endDate)
}

# Download and decompress each archive, skipping any already fetched on a previous run
foreach ($file in $files) {
    $filename = $file.start -replace ':', ''
    $tsvFile = "$filename.tsv"
    if (!(Test-Path $tsvFile)) {
        $cli.DownloadFile($file._links.download.href, $filename)
        DeGZip-File $filename $tsvFile
        Remove-Item $filename
    }
}

Select-String -Path *.tsv -SimpleMatch -Pattern $searchString | ForEach-Object {
    $fields = $_.Line.Split("`t")
    # Field 1 is the timestamp and field 9 is the log message
    echo "$($fields[1]) $($fields[9])"
} > results.txt

After running it, you should have a file named results.txt containing any matches, sorted chronologically, with the timestamp and the content of the log message. If you need other fields, like log level, IP or group, you can play with the second-to-last line to extract the information you need. In this example, it will find any lines that contain the string "Find Me" from the 1st of August 2019 to the current date. Although it is slow to run for large search periods, it is considerably quicker on subsequent runs over the same date range, as the files will already be downloaded.
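As an illustration, if you also wanted the source IP, severity and program name in the output, you could extend that line like the sketch below. The column indexes here are assumptions based on Papertrail's tab-separated archive layout, so double-check them against your own files:

```powershell
Select-String -Path *.tsv -SimpleMatch -Pattern $searchString | ForEach-Object {
    $fields = $_.Line.Split("`t")
    # Assumed columns: 1 = timestamp, 5 = source IP, 7 = severity, 8 = program, 9 = message
    echo "$($fields[1]) $($fields[5]) $($fields[7]) $($fields[8]) $($fields[9])"
} > results.txt
```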

There are many things that could be improved here; for me, the foremost would be parallelising the downloads to speed things up. Passing in the search criteria on the command line and selecting the desired output fields (as well as the destination file) would also be nice, but this served my needs and I was PowerShelled-out. Hopefully this helps one other person out there.
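For what it's worth, if you're on PowerShell 7 or later, the download loop could be parallelised with ForEach-Object -Parallel. This is only a rough sketch using the same variable names as the script above; note that functions like DeGZip-File aren't visible inside the parallel runspaces, so the decompression stays serial:

```powershell
# Requires PowerShell 7+ for ForEach-Object -Parallel; a rough sketch, not battle-tested
$files | ForEach-Object -Parallel {
    $filename = $_.start -replace ':', ''
    if (!(Test-Path "$filename.tsv") -and !(Test-Path $filename)) {
        # Each runspace needs its own WebClient; $using: pulls the token in from the parent scope
        $cli = New-Object System.Net.WebClient
        $cli.Headers['X-Papertrail-Token'] = $using:paperTrailToken
        $cli.DownloadFile($_._links.download.href, $filename)
    }
} -ThrottleLimit 4

# Decompress serially afterwards, since DeGZip-File isn't defined inside the parallel runspaces
foreach ($file in $files) {
    $filename = $file.start -replace ':', ''
    if (Test-Path $filename) {
        DeGZip-File $filename "$filename.tsv"
        Remove-Item $filename
    }
}
```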

* Free up to 50 MB/month, two-day search and seven-day archive.

Update: The timestamps stored in Papertrail are GMT+0, which is very helpful, but when giving the information to a client, they may prefer it localised to their own timezone. You can simply add one line above the echo command to do this. Have a look below at the last few lines again, this time with a small modification to add 10 hours to every timestamp.

Select-String -Path *.tsv -SimpleMatch -Pattern $searchString | ForEach-Object {
    $fields = $_.Line.Split("`t")
    # Parse the GMT+0 timestamp, then shift it into the client's timezone (+10 in this example)
    $fixedDate = [datetime]::ParseExact($fields[1], 'yyyy-MM-dd HH:mm:ss', [Globalization.CultureInfo]::InvariantCulture)
    echo "$($fixedDate.AddHours(10).ToString()) $($fields[9])"
} > results.txt

About the Author

Mannan

Mannan is a software engineering enthusiast and has been madly coding since 2002. When he isn't coding he loves to travel, so you will find both of these topics on this blog.
