bedis9

a little bit of anything

Optimize your code with valgrind

When writing code, whatever language you use, one of the thing you should take care is efficiency.
An efficient code is scalable, an inefficient one will be a problem in the future.

One way to see how efficient is your code is to use valgrind.
Valgrind is a framework which comes with a few profiler tools. Two of them are cachegrind and callgrind and are usefull to observe how your code is executed at CPU level.

Valgrind is available on all good linux distribution.

Example

The code

As an example, we’re going to write an “itoa” (Integer to Ascii) function in C which will turn an Integer into a char string.
The code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#include <sys/time.h>
#include <sys/types.h>

void reverse(char *);

int itoa(int n, char *s)
{
        char *b, c, *p;
        int len;

        p = s;
	if (n < 0) {
		*p++ = '-';
		n = 0 - n;
	}
	b = p;

	do {
		*p++ = (n % 10) + '0';
	} while ((n /= 10) > 0);

	len = (int)(p - s);
	*p-- = '\0';

        //swap
	do {
		c = *p;
		*p = *b;
		*b = c;
		--p;
		++b;
	} while (b < p);

	return len;
}

/* itoa:  convert n to characters in s 
 * K&R version */
int itoa_kb(int n, char *s)
{
	int i, sign;
 
	if ((sign = n) < 0)  /* record sign */
		n = -n;          /* make n positive */
	i = 0;
	do {       /* generate digits in reverse order */
		s[i++] = n % 10 + '0';   /* get next digit */
	} while ((n /= 10) > 0);     /* delete it */

	if (sign < 0)
		s[i++] = '-';

	s[i] = '\0';
	
	reverse(s);

	return i;
}

 
/* reverse:  reverse string s in place 
 * K&R version */
void reverse(char *s)
{
	int i, j;
	char c;
 
	for (i = 0, j = strlen(s) - 1; i<j; i++, j--) {
		c = s[i];
		s[i] = s[j];
		s[j] = c;
	}
}

int
main(void) {
	int i, n;
	char s[20];

	for (i=-32768; i < 32768; i++) {
		// K&R
		n = itoa_kb(i, s);
	}

	for (i=-32768; i < 32768; i++) {
		// sprintf
		n = sprintf(s, "%d", i);
		s[n] = '\0';
	}

	for (i=-32768; i < 32768; i++) {
		// optimized
		n = itoa(i, s);
	}

	return EXIT_SUCCESS;
}

From this code, you can see 3 ways to write an itoa function in C:

  1. using snprintf
  2. K&R ito function from The C programming language book
  3. an “optimized” function based on the K&R one

Compilation

Compilation is done using gcc:

gcc -Wall -g itoa.c -o itoa

Run valgrind

Now, just run valgrind against the binary file:

$ valgrind --tool=callgrind --dump-instr=yes  ./itoa
==25197== Callgrind, a call-graph generating cache profiler
==25197== Copyright (C) 2002-2010, and GNU GPL'd, by Josef Weidendorfer et al.
==25197== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==25197== Command: ./itoa
==25197== 
==25197== For interactive control, run 'callgrind_control -h'.
==25197== 
==25197== Events    : Ir
==25197== Collected : 76647687
==25197== 
==25197== I   refs:      76,647,687

You may have a file named callgrind.out.[[PID]] in your current directory.

Analyze the result with kcachegrind

Now run kcachegrind.
The result looks like below:

kcachegrind

kcachegrind

On the left tab, look at the Incl. column, the number tells the amount of instruction executed by the function.
In our itoa example, we can see the number of instructions to turn 65536 integers into strings:

  1. sprintf: 42 657 264 instructions
  2. itoa K&R: 18 137 059 instructions
  3. itoa: 13 654 807 instrcutions

Basically, sprintf seems not so efficient 🙂

Conclusion

Valgrind and cachegrind are useful tools to point possible issues in any code.
Then it’s up to the developer to write clean code to avoid using too many instructions where not necessary.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: