use Benchmark qw(timethese cmpthese timeit countit timestr);

# You can always pass in code as strings:
timethese $count, {
    'Name1' => '…code1…',
    'Name2' => '…code2…',
};

# Or as subroutine references:
timethese $count, {
    'Name1' => sub { …code1… },
    'Name2' => sub { …code2… },
};

cmpthese $count, {
    'Name1' => '…code1…',
    'Name2' => '…code2…',
};

$t = timeit $count, '…code…';
print "$count loops of code took:", timestr($t), "\n";

$t = countit $time, '…code…';
$count = $t->iters;
print "$count loops of code took:", timestr($t), "\n";
The Benchmark module can help you determine which of several possible choices executes the fastest. The timethese function runs the specified code segments the number of times requested and reports back how long each segment took. You can get a nicely sorted comparison chart if you call cmpthese the same way.
Code segments may be given as function references instead of strings (in fact, they must be if you use lexical variables from the calling scope), but call overhead can influence the timings. If you don't ask for enough iterations to get a good timing, the function emits a warning.
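To make the difference concrete, here is a small sketch of the subroutine-reference form. The lexical array @data is an invented example, not part of the recipe; it is visible to the code references, whereas string snippets could not see it at all:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# A lexical from the calling scope: only code references can see it.
my @data = (1 .. 1000);

# A negative count asks Benchmark to run each sub for at least that
# many CPU seconds rather than a fixed number of iterations.
cmpthese(-1, {
    grep_count => sub { my $n = grep { $_ % 2 } @data },
    loop_count => sub {
        my $n = 0;
        for (@data) { $n++ if $_ % 2 }
    },
});
```

Running each snippet for a minimum amount of CPU time, as the negative count does here, is one way to sidestep the "too few iterations" warning mentioned above.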
Lower-level interfaces are available that run just one piece of code, either for some number of iterations (timeit) or for some number of seconds (countit). These functions return Benchmark objects (see the online documentation for a description). With countit, you know the code will run long enough to avoid warnings, because you specified a minimum run time.
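A minimal sketch of the two lower-level calls (the timed code and iteration counts here are only illustrative):

```perl
use strict;
use warnings;
use Benchmark qw(timeit countit timestr);

# timeit: run the code a fixed number of times.
my $t = timeit(100_000, sub { my $x = sqrt 2 });
print "100_000 loops took: ", timestr($t), "\n";

# countit: run the code for at least the given CPU time (1 second here),
# then ask the returned Benchmark object how many iterations fit.
my $c = countit(1, sub { my $x = sqrt 2 });
print $c->iters, " loops ran in: ", timestr($c), "\n";
```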
To get the most out of the Benchmark module, you'll need a good bit of practice. It usually isn't enough to run a couple of different algorithms on the same data set, because the timings reflect only how well those algorithms did on that particular data set. To get a better feel for the general case, you'll need to run several sets of benchmarks, varying the data sets used.
For example, suppose you wanted to know the best way to get a copy of a string without the last two characters. You think of four ways to do so (there are, of course, several others): chop twice, copy and substitute, or use substr on either the left- or righthand side of an assignment. You test these algorithms on strings of length 2, 200, and 20_000:
use Benchmark qw/countit cmpthese/;

sub run($) { countit(5, @_) }

for $size (2, 200, 20_000) {
    $s = "." x $size;
    print "\nDATASIZE = $size\n";
    cmpthese {
        chop2   => run q{ $t = $s; chop $t; chop $t;        },
        subs    => run q{ ($t = $s) =~ s/..$//s;            },
        lsubstr => run q{ $t = $s; substr($t, -2) = '';     },
        rsubstr => run q{ $t = substr($s, 0, length($s)-2); },
    };
}
which produces the following output:
DATASIZE = 2
             Rate    subs lsubstr   chop2 rsubstr
subs     181399/s      --    -15%    -46%    -53%
lsubstr  214655/s     18%      --    -37%    -44%
chop2    338477/s     87%     58%      --    -12%
rsubstr  384487/s    112%     79%     14%      --

DATASIZE = 200
             Rate    subs lsubstr rsubstr   chop2
subs     200967/s      --    -18%    -24%    -34%
lsubstr  246468/s     23%      --     -7%    -19%
rsubstr  264428/s     32%      7%      --    -13%
chop2    304818/s     52%     24%     15%      --

DATASIZE = 20000
           Rate rsubstr    subs lsubstr   chop2
rsubstr  5271/s      --    -42%    -43%    -45%
subs     9087/s     72%      --     -2%     -6%
lsubstr  9260/s     76%      2%      --     -4%
chop2    9660/s     83%      6%      4%      --
With small data sets, the "rsubstr" algorithm runs 14% faster than the "chop2" algorithm, but on large data sets, it runs 45% slower. On empty data sets (not shown here), the substitution mechanism is the fastest. So there is often no best solution for all possible cases, and even these timings don't tell the whole story, since you're still at the mercy of your operating system and the C library Perl was built with. What's good for you may be bad for someone else. It takes a while to develop decent benchmarking skills. In the meantime, it helps to be a good liar.