The importance of benchmarking

As an operations engineer, my go-to philosophy is often “make it work first, then make it work better.” When you write a script, or apply a complex change, you don’t always have the luxury of making is perfect in the first iteration.

As always, there are exceptions to this rule. In today’s case, I was making a change to an application which runs the same method billions of times a date.
In situations like this, small changes can have very large consequences.

The change I’m making requires checking whether a timestamp is within a specific range, in this case +/- one day. Whichever language you’re working in, there are no doubt multiple ways to achieve this. I’m using Ruby, and stackoverflow offered me a few quick and reasonable solutions:
http://stackoverflow.com/questions/4521921/how-to-know-if-todays-date-is-in-a-date-range

In an application where this would be run a few billion times a day, microsecond changes in performance can add up to a lot of lost CPU time.

Benchmarking to the rescue!

Ruby has a lovely and easy way to benchmark these. (Python does too, find them for your favourite language!)
This is the script I used to compare 100,000 runs of each approach to the problem:

#!/usr/bin/ruby
## Benchmark a couple of different ways to see if a given time is in a time
## range of +- one day.
require "benchmark"
require "date"
time = Benchmark.measure do
 (1..100000).each { |i|
 now = Date.today
 ((now-1)..(now+1)) === now
 }
end
puts "The first test took: #{time}"
time = Benchmark.measure do
 (1..100000).each { |i|
 (Time.now-1..Time.now+1).cover?(Time.now)
 }
end
puts "The second test took: #{time}"

Running the script produces the following result:

The first test took: 0.580000 0.130000 0.710000 ( 0.712725)
The second test took: 0.250000 0.000000 0.250000 ( 0.247216)

Great! A quick win. After another look, we can even spot that we’re calling Time.now three time. It probably makes sense to only call it once:

now = Time.now
(now-1..now+1).cover?(now)

If we run the benchmark again, we now see:

The first test took: 0.580000 0.120000 0.700000 ( 0.706634)
The second test took: 0.160000 0.000000 0.160000 ( 0.163877)

An even more significant improvement.
Writing and running this benchmark, with tweaks and measuring averages, took me about 10 minutes.
It saved me significantly more time, and money.
This code will be run 40,000 times more in one day, than I ran in my benchmark.
The difference between 6,400 seconds (the second test) of CPU time and 23,200 seconds (the first test) is pretty big.

How big is the difference?

Let’s do the math!

23,200 - 6,400 = 16,800 seconds of wasted time.
16,800 / 60 seconds / 60 minutes = 4.67 hours of CPU time per day.

I’d say that’s pretty big.

Leave a Reply

Your email address will not be published. Required fields are marked *