
## How much faster is integer arithmetic than floating point arithmetic?

You may already know that integer arithmetic is faster than floating point arithmetic. But how much faster? I had the same question, so I wrote the following program to measure the speed of each data type.

#include <vector>
#include <iostream>
#include <cstdlib>
#include "Stopwatch.hpp"

template <typename T>
void test( size_t inArraySize, size_t inCorpusSize )
{
    typedef std::vector<T> array_t;
    typedef std::vector<array_t> arrays_t;

    std::cout << "constructing the corpus…";

    Stopwatch sw;
    sw.Start();
    // corpus[0] is the query; corpus[1..inCorpusSize] are the documents
    arrays_t corpus( inCorpusSize + 1 );
    for ( size_t i = 0; i < inCorpusSize + 1; ++i )
    {
        array_t &array = corpus[i];
        array.resize( inArraySize );
        for ( size_t j = 0; j < inArraySize; ++j )
        {
            array[j] = (T)( rand() );
        }
    }

    sw.Stop();
    std::cout << sw.Elapsed() << " secs" << std::endl;

    std::cout << "scored in … ";
    sw.Start();
    array_t scores( inCorpusSize );
    const array_t &query = corpus[0];
    for ( size_t i = 1; i < inCorpusSize + 1; ++i )
    {
        const array_t &array = corpus[i];

        // dot product of document i with the query
        T &score = scores[i - 1];
        score = 0;
        for ( size_t j = 0; j < inArraySize; ++j )
        {
            score += array[j] * query[j];
        }
    }

    sw.Stop();
    std::cout << sw.Elapsed() << " secs" << std::endl;
}

int main( int argc, char **argv )
{
    // usage: tester <corpus_size> <array_size>
    if ( argc < 3 )
        return 1;

    int csize = atoi( argv[1] );
    int asize = atoi( argv[2] );

    std::cout << "testing float.." << std::endl;
    test<float>( asize, csize );

    std::cout << "testing double.." << std::endl;
    test<double>( asize, csize );

    std::cout << "testing long double.." << std::endl;
    test<long double>( asize, csize );

    std::cout << "testing int.." << std::endl;
    test<int>( asize, csize );

    std::cout << "testing unsigned int.." << std::endl;
    test<unsigned int>( asize, csize );

    std::cout << "testing short.." << std::endl;
    test<short>( asize, csize );

    std::cout << "testing unsigned short.." << std::endl;
    test<unsigned short>( asize, csize );

    std::cout << "testing char.." << std::endl;
    test<char>( asize, csize );

    std::cout << "testing unsigned char.." << std::endl;
    test<unsigned char>( asize, csize );

    std::cout << "testing long long.." << std::endl;
    test<long long>( asize, csize );

    std::cout << "testing unsigned long long.." << std::endl;
    test<unsigned long long>( asize, csize );

    return 0;
}
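
Stopwatch.hpp is not listed in this post. A minimal sketch of a compatible header follows, assuming a C++11 compiler and assuming that Elapsed() returns the seconds between the most recent Start() and Stop(). It is a stand-in built on std::chrono, not the header that produced the timings in this post.

```cpp
// Stopwatch.hpp (hypothetical stand-in; the original header is not shown)
#include <chrono>

class Stopwatch
{
    typedef std::chrono::steady_clock clock_type;

public:
    Stopwatch() : running_( false ), elapsed_( 0.0 ) {}

    // Begin (or restart) timing from now.
    void Start()
    {
        begin_ = clock_type::now();
        running_ = true;
    }

    // Stop timing and remember the interval since the last Start().
    void Stop()
    {
        if ( running_ )
        {
            elapsed_ = std::chrono::duration<double>( clock_type::now() - begin_ ).count();
            running_ = false;
        }
    }

    // Seconds measured by the most recent Start()/Stop() pair (0 before any).
    double Elapsed() const { return elapsed_; }

private:
    clock_type::time_point begin_;
    bool running_;
    double elapsed_;
};
```

steady_clock is used here because it never jumps backwards, which matters when the measured intervals are only a second or two long.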


The result of running this program with vector size = 100 and corpus size = 1 million is shown below:
sudar@kiriya /opt/testing/bin $ ./tester 1000000 100
testing float..
constructing the corpus…1.447 secs
scored in … 1.731 secs

testing double..
constructing the corpus…1.543 secs
scored in … 1.836 secs

testing long double..
constructing the corpus…2.135 secs
scored in … 2.614 secs

testing int..
constructing the corpus…1.454 secs
scored in … 1.657 secs

testing unsigned int..
constructing the corpus…1.41 secs
scored in … 1.615 secs

testing short..
constructing the corpus…1.394 secs
scored in … 1.59 secs

testing unsigned short..
constructing the corpus…1.347 secs
scored in … 1.542 secs

testing char..
constructing the corpus…1.313 secs
scored in … 1.507 secs

testing unsigned char..
constructing the corpus…1.328 secs
scored in … 1.522 secs

testing long..
constructing the corpus…1.553 secs
scored in … 1.767 secs

testing unsigned long..
constructing the corpus…1.548 secs
scored in … 1.761 secs

1. Looking closely, the floating point calculations generally took more time than the integer calculations (for example, 1.731 secs for float against 1.657 secs for int).
2. long double took the longest time; it is the slowest type here.
3. The anticlimax is that 64-bit integer arithmetic is slower than single-precision floating point arithmetic.  An interesting read on multiplication is here.  The reason could be overflow, since the product register is only 64 bits wide.
4. The fastest is 8-bit arithmetic.  But unsigned 8-bit arithmetic is slightly slower than signed.  Can’t explain why!
5. For the wider integer types, unsigned calculations take slightly less time than signed ones.  My first guess was extra time for 2’s complement arithmetic, but the hardware uses the same add and multiply instructions for both, so gaps of a few hundredths of a second are more likely measurement noise.
6. Results can differ across CPUs and operating systems, so interpret them accordingly.
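
One detail that may help when reading the char and short rows: C++ promotes operands narrower than int to int before multiplying, so the multiplication itself is not done at 8 or 16 bits. The sketch below checks this at the type level. The names product_of and multiplies_as are made up for this illustration, and the int results assume a mainstream platform where int is wider than short.

```cpp
#include <type_traits>
#include <utility>

// Type of the expression a * b for two operands of type T.
template <typename T>
struct product_of
{
    typedef decltype( std::declval<T>() * std::declval<T>() ) type;
};

// True when multiplying two T values is actually carried out in type U.
template <typename T, typename U>
bool multiplies_as()
{
    return std::is_same<typename product_of<T>::type, U>::value;
}
```

On such a platform multiplies_as<char, int>() and multiplies_as<short, int>() are true, while multiplies_as<float, float>() and multiplies_as<long long, long long>() show that the wider types are multiplied in their own width. The narrow types therefore differ mainly in how much memory is loaded and stored, not in the multiply instruction.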

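Upgraded version of the test program. It builds only two vectors (a query and a document), repeats the dot product once per corpus entry, and averages the timings over several runs: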
#include <vector>
#include <iostream>
#include <cstdlib>
#include "Stopwatch.hpp"
#include <string>
#include <map>
#include <utility>

template <typename T>
double test( size_t inArraySize, size_t inCorpusSize )
{
    typedef std::vector<T> array_t;
    typedef std::vector<array_t> arrays_t;

    // only two vectors this time: corpus[0] is the query, corpus[1] the document
    arrays_t corpus( 2 );
    for ( size_t i = 0; i < 2; ++i )
    {
        array_t &array = corpus[i];
        array.resize( inArraySize );
        for ( size_t j = 0; j < inArraySize; ++j )
        {
            array[j] = (T)( rand() % 1000 );
        }
    }

    Stopwatch sw;
    sw.Start();
    const array_t &query = corpus[0];

    array_t scores( inCorpusSize );
    for ( size_t i = 0; i < inCorpusSize; ++i )
    {
        const array_t &docu = corpus[1];
        T &score = scores[i];
        score = 0;
        for ( size_t j = 0; j < inArraySize; ++j )
        {
            score += docu[j] * query[j];
        }
    }

    sw.Stop();
    return sw.Elapsed();
}

int main( int argc, char **argv )
{
    typedef double (*fun_t)( size_t, size_t );

    typedef std::pair < std::string, fun_t > named_fun_t;
    std::vector< named_fun_t > functions;

    functions.push_back( named_fun_t( "float", test<float>) );
    functions.push_back( named_fun_t( "double", test<double>) );
    functions.push_back( named_fun_t( "long double", test<long double>) );
    functions.push_back( named_fun_t( "int", test<int>) );
    functions.push_back( named_fun_t( "unsigned int", test<unsigned int>) );
    functions.push_back( named_fun_t( "short", test<short>) );
    functions.push_back( named_fun_t( "unsigned short", test<unsigned short>) );
    functions.push_back( named_fun_t( "long long", test<long long>) );
    functions.push_back( named_fun_t( "unsigned long long", test<unsigned long long>) );
    functions.push_back( named_fun_t( "char", test<char>) );
    functions.push_back( named_fun_t( "unsigned char", test<unsigned char>) );

    if ( argc < 4 )
    {
        std::cout << argv[0] << " <corpus_size> <array_size> <repeats>" << std::endl;
        return 1;
    }

    int csize = atoi( argv[1] );
    int asize = atoi( argv[2] );
    int iters = atoi( argv[3] );

    // accumulated time per type name, turned into an average below
    typedef std::map<std::string, double> timings_t;
    timings_t timings;

    for ( int j = 0; j < iters; ++j )
    {
        std::cout << "\niteration #" << j + 1 << std::endl;
        for ( size_t i = 0; i < functions.size(); ++i )
        {
            std::string name = functions[i].first;
            double timetaken = functions[i].second( asize, csize );

            timings[name] += timetaken;
            std::cout << "testing " << name << "…." << timetaken << std::endl;
        }
    }

    std::string minstring;
    double minvalue = 99999.9;
    std::cout << "\naverages" << std::endl;
    for ( timings_t::iterator iter = timings.begin(), end = timings.end(); iter != end; ++iter )
    {
        iter->second /= (double)iters;
        std::cout << "average time for " << iter->first << ": " << iter->second << std::endl;

        if ( iter->second < minvalue )
        {
            minvalue = iter->second;
            minstring = iter->first;
        }
    }

    std::cout << "minimum timing is for " << minstring << ": " << minvalue << std::endl;
}

The result of running this program with vector size = 500 and corpus size = 1 million with 20 iterations is shown below:

iteration #1
testing float….1.473
testing double….1.469
testing long double….2.025
testing int….1.024
testing unsigned int….1.069
testing short….0.996
testing unsigned short….0.993
testing long long….1.004
testing unsigned long long….1.004
testing char….0.991
testing unsigned char….0.991

iteration #2
testing float….1.449
testing double….1.459
testing long double….2.011
testing int….1.002
testing unsigned int….1
testing short….0.985
testing unsigned short….0.99
testing long long….0.993
testing unsigned long long….0.993
testing char….0.983
testing unsigned char….0.992

..
iteration #20
testing float….1.448
testing double….1.45
testing long double….1.975
testing int….0.99
testing unsigned int….0.991
testing short….0.989
testing unsigned short….0.986
testing long long….0.997
testing unsigned long long….0.992
testing char….0.992
testing unsigned char….0.988

averages
average time for char: 0.99575
average time for double: 1.4665
average time for float: 1.4676
average time for int: 1.0046
average time for long double: 2.00115
average time for long long: 1.00585
average time for short: 0.99985
average time for unsigned char: 0.9945
average time for unsigned int: 1.0085
average time for unsigned long long: 1.0025
average time for unsigned short: 0.9963
minimum timing is for unsigned char: 0.9945

So, even in this test, the 8-bit types are the best choice: unsigned char has the lowest average (0.9945), with char right behind (0.99575)!!
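
The timings say nothing about whether the scores are still correct, though. The test fills its vectors with values up to 999 and keeps the running sum in the same type T, so an 8-bit score wraps around long before the loop finishes. A small sketch of the same inner loop shows the effect; the function name dot is introduced here for illustration and is not part of the test program.

```cpp
#include <cstddef>
#include <vector>

// Same inner loop as in the test program: the running sum is kept in type T.
template <typename T>
T dot( const std::vector<T> &a, const std::vector<T> &b )
{
    T score = 0;
    for ( std::size_t j = 0; j < a.size(); ++j )
    {
        // the product is computed as int, then narrowed back to T on assignment
        score += a[j] * b[j];
    }
    return score;
}
```

For the vectors {200, 100} and {2, 3}, dot with int elements returns 700, while dot with unsigned char elements returns 188, which is 700 modulo 256. The narrow types are only the best choice when the values and their sums are known to fit.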
