Optimization of C or C++ applications
C and C++ could be optimized in several ways, we could differentiate two way of making a program run faster :
- Changing the algorithme that is used
- Changing the data structure, or using knowleadge about the variables
This two tasks are very general to all programing languages, C and C++ check the code, but don't check everything.
On the other hand program could also leak memory or crash in some uncommon cases. To avoid this, it's important to be very carefull when manipulating pointer, or arrays ( in fact both are the same ), because their misuse may result in several bugs. Forgeting to delete an object could causes memory. Java avoid several of this bugs by doing a lot of checkings, letting the programmer focus on the important part of the program. When using some properties of the data ( for instance that they are ordered ), we should ensure that no operation could destroy this order. Or in other words all operations we apply should keep this property.
Find out the time consumming function
The optimization is generally important only one some few functions that do a lot of operation. To find this function we could use gprof that is provided by the binutils package. It's generally installed with the compilation environement. To use it we need to generate a application that will generate the gmon.out file. This file contain a calling count of all function. To do so we need to run the gcc compiler suite program in the following way : ( example with the C++ compiler )
g++ -gp -O3 -fno-inline test.cxx -o test
The role the option are :
- -gp : generate a code that when the program ended ( proprelly ) results in a gmon.out file
- -O3 : generate a optimized code
- -fno-inline : avoid code inling
- -o : tell's the filename to generate
Avoiding inling could be quite intersting in order to know which function are time consuming and which are not, because the function could be inlined automatically by the compiler. Code inlining meen's that a part of code is directly put instead of doing the function call ( that need a copy of the parameters, and a call of the function ). This could result in a large improvement, because it avoid's the copy of the varible when we call the function, and avoid moving to another part of the code each time we do a operation. It's especially recommanded for function that are called several million of times, and that do very little operation. We could also force a operation to be inline by declaring it in a file we load directly or not with the #include instruction. In the .h file instead of writing the function declaration in the following way :
int function(int a,int b);
We will write :
int function(int a, int b)
{
...code...
}
Running gprof is simple, we need only to type gprof program.Their are two distinct part in the gprof output :
- Each function call count and time usage
- number of calls of each function by each other functions
This too output will indicate on which part of the program we should modify in order to get better performance. A common way to improve performance is to use already avaible information instead of searching this information again. Reuse the same memory that was already allocated, avoid the usage of pointer in part of the program in which they are not required. Each time we use a pointer we need to search again in the memory the element, this operation takes at least some CPU cycle when the information is in the cache. It could be more than 100 if the information is in memory. A tool we will see later has a tool cachegrind that allow to monitor cache usage. To enhance this kind of "memory" usage it's important to split down the probleme in such a way that the probleme could be handeled in cpu cache instead of using the main memory.
Avoiding memory leaks
One important rule to avoid them is to delete object once they are used, it's generally a good idea to not mix copying the object some time and copy the reference or the pointer to the object another time. Mixing both way of doing could easly result in memory leaks. We could check if memory leaking is occuring in several ways :
- Running the program with a large data set, on various conditions : with various conformation of the parameters
- using simply top and have a look on the evolution of memory usage
- with valgrind
to run valgrin we need simply to compile the program with the -g option in order help to locate elements in the code, and then run the program under valgrind control, in the following way :
valgrind --leak-check=full ./program arguments 2> valg_output
The program will take arround 30 time more to do the same job, it will create a file valg_output, containing the output of valgrind.
When their is a bug, in the program simply run the program in the same way I show before ( with the redirection of the error flow to a file ), but without the --leak-check option. Unfortunatly some bugs could not be detected by valgrind, for instance going to far in a array will not be detected, and could sometime be a quite hard bug to detect. It's generally also a good idea to cover a whole range of different condition when debuging a program and trying to check if their is an error. Some of them appear only with a very low probability, to avoid them it's important to place ourself in a condition in which they have the greatest chance to occure.
a memory leak example
a array overflow example
Valgrind memory leak detection tool
A serie of article about CPU cache usage