Wednesday, April 23, 2014

Double Checked Locking Problem

The problem is often linked to the Singleton pattern. The following Singleton implementation, as traditionally written, works fine in a single-threaded program, but not in a multi-threaded one.
 1 // from the header file
 2 class Singleton {
 3 public:
 4    static Singleton* instance();
 5    ...
 6 private:
 7    static Singleton* pInstance;
 8 };

 9 // from the implementation file
10 Singleton* Singleton::pInstance = 0;

11 Singleton* Singleton::instance() {
12    if (pInstance == 0) {
13       pInstance = new Singleton;
14    }
15    return pInstance;
16 }


In the above example, if thread 1 is context-switched out after testing pInstance for null at line 12, thread 2 still sees pInstance as null and creates an instance of Singleton. Thread 1 then continues from line 13 and creates another instance of Singleton, assigning it to pInstance. Two instances of Singleton have now been created.

The solution for this is simple: protect the code that modifies the shared resource with a lock. The cost, however, is that every access to the singleton (every call to instance()) requires acquiring the lock, even though the lock is needed only for the first call.
Singleton* Singleton::instance() {
   Lock lock; // acquire lock (params omitted for simplicity)
   if (pInstance == 0) {
      pInstance = new Singleton;
   }
   return pInstance;
} // release lock (via Lock destructor)
To avoid the expensive locking on each call to instance(), we can check pInstance for null before acquiring the lock, and check it again before creating the new instance (hence "double-checked"):
Singleton* Singleton::instance() {
   if (pInstance == 0) { // 1st test
      Lock lock;
      if (pInstance == 0) { // 2nd test
         pInstance = new Singleton;
      }
   }
   return pInstance;
}
The 2nd test is needed because a second thread could create pInstance after the 1st test but before the lock is acquired. With this version, once the singleton is initialized, every subsequent access returns pInstance straight away, without acquiring the lock or creating an instance.

Still there can be issues. The C++ standard (prior to C++11) defines sequence points only at the ends of statements; within the code emitted for a single statement, the compiler is free to reorder operations. This is because the standard defines the behavior of an abstract machine that has no notion of multiple threads (pre-C++11, the language itself did not support threads).
In the above example, the statement pInstance = new Singleton; emits code for three operations: allocate memory, construct the Singleton object by calling its constructor, and assign the address to pInstance. If the second and third steps are reordered, a second thread calling instance() may see pInstance as non-null and return an object that has not yet been constructed (only its memory has been allocated).
To tell the compiler that the data could be changed outside the program's control, and that reads and writes to it must not be reordered, C++ offers the volatile qualifier.
Note that both pInstance and *pInstance must be made volatile. If *pInstance is not volatile, the member initializations in the constructor could be reordered past the assignment to pInstance; that is, pInstance could become non-null while the object is not yet fully constructed.

The solution would look like:
class Singleton {
public:
   static volatile Singleton* volatile instance();
   ...
private:
   // one more volatile added
   static volatile Singleton* volatile pInstance;
};

// from the implementation file
volatile Singleton* volatile Singleton::pInstance = 0;

volatile Singleton* volatile Singleton::instance() {
   if (pInstance == 0) {
      Lock lock;
      if (pInstance == 0) {
         // one more volatile added
         volatile Singleton* volatile temp =
            new volatile Singleton;
         pInstance = temp;
      }
   }
   return pInstance;
}
However, it can still fail, for two reasons:
1) the Standard's constraints on observable behavior are only for an abstract machine defined by the Standard, and that abstract machine has no notion of multiple threads of execution. As a result, though the Standard prevents compilers from reordering reads and writes to volatile data within a thread, it imposes no constraints at all on such reorderings across threads.

2) Just as const-qualified objects don't become const until their constructors have run to completion, volatile-qualified objects become volatile only upon exit from their constructors. So even in the version above, the object created by new volatile Singleton is not treated as volatile while its constructor runs, and the writes inside the constructor may still be reordered.

There can be further problems on a multi-processor system, where each processor has its own cache: the object construction performed by one processor may not yet have been flushed to main memory when the initialization of pInstance has been flushed and becomes visible to other processors.

Conclusion:
Avoid implementing singletons with DCLP. Before reaching for a complicated solution, measure the performance impact of acquiring the lock on each access to the singleton. If it is not significant, the simple locked implementation without DCLP is fine.
Another option is, rather than lazy initialization, to eagerly initialize the singleton at program startup, while the code is still single-threaded (before it becomes multi-threaded).

Monday, November 16, 2009

Distinct memory areas in C++

Const data:
Stores string literals and other data whose values are known at compile time.
No objects of class type are stored in the const data area.
Available during the entire lifetime of the program.
Read-only.

Stack:
Stores automatic (local) variables and function-call bookkeeping; storage is reclaimed when the enclosing scope exits.

Free store / Heap:
Free store and heap are not really two different memory areas; the heap is the data model used to implement the abstract free store. The two terms are almost interchangeable.

Global/Static:
Storage allocated at program startup.
May not be initialized until after the program has started executing.
The order of initialization of global variables across translation units is not defined.
Object access rules are the same as for the free store.

What is the difference between const char *p, char * const p and const char * const p?

const char *p - This is a pointer to a constant character. You cannot change the value pointed to by p, but you can change the pointer p itself.

*p = 'S' is illegal.
p = "Test" is legal.

Note - char const *p is the same.


char * const p - This is a constant pointer to a non-constant character. You cannot change the pointer p, but you can change the value pointed to by p.

*p = 'A' is legal.
p = "Hello" is illegal.


const char * const p - This is a constant pointer to a constant character. You cannot change the value pointed to by p, nor the pointer p itself.

*p = 'A' is illegal.
p = "Hello" is also illegal.

Sunday, October 07, 2007

typedef a function header: useful to define function pointers

/*
 * explains how to typedef a function pointer.
 */
#include <stdio.h>

typedef int (*FUNC)(char* z, int a);

int func(char* z, int a){
   printf("my func. \n");
   // do something

   return 0;
}

int main(){

   FUNC pFunc = func;
   char test[100];
   pFunc(test, 6);

   return 0;
}

Thursday, August 02, 2007

Useful SQLs

1. select * into Customer_tmp from Customer [any where conditions if needed]
Selects rows from the Customer table, creates a new table Customer_tmp with the same structure as Customer, and inserts the selected rows into it.
Useful for copying a table into a temporary table.

Wednesday, March 21, 2007

Techniques to handle runtime errors in C++

Bail, return, jump, or . . . throw?

The common techniques for handling run-time errors in C leave something
to be desired, like maybe exception handling.

The exception handling machinery in C++ is designed to deal with program
errors, such as a resource allocation failure or a value out of range.
C++ exception handling provides a way to decouple error reporting from
error handling. However, it's not designed to handle asynchronous events
such as hardware interrupts.

C++ exception handling is designed to address the limitations of error
handling in C. In this installment, I'll look at some of the more common
techniques for handling run-time errors in C programs and show you why
these techniques leave something to be desired.

*Error reporting via return values*
Many C functions report failures through their function return values or
arguments. For example, in the Standard C library:

* *malloc* returns a null pointer if it fails to allocate memory.
* *strtoul* returns *ULONG_MAX* and stores the *ERANGE* into the
object designated by *errno* if the converted value can't be
represented as an *unsigned long*.
* *printf* returns a negative value if it can't format and print
every operand specified in its format list.

(The macro *ULONG_MAX* is defined in the standard header *<limits.h>*.
Macros *ERANGE* and *errno* are defined in *<errno.h>*.)

If you want your C code to be reliable, you should write it so that it
checks the return values from calls to all such functions. In some
cases, adding code to check the return value isn't too burdensome. For
example, a typical call to *malloc* such as:

p = malloc(sizeof(T));

becomes:

p = malloc(sizeof(T));
if (p == NULL)
   // cope with the failure

In other cases, writing a proper check is a bit tricky. For example, a
call to *strtoul* such as:

n = strtoul(s, &e, 10);

becomes:

n = strtoul(s, &e, 10);
if (n == ULONG_MAX
      && errno == ERANGE)
   // deal with the overflow

Having detected an error, you then have to decide what to do about it.

*Bailing*
Some errors, such as a value out of range, might be the result of
erroneous user input. If the input is interactive, the program can just
prod the user for a more acceptable value. With other errors, such as a
resource allocation failure, the system may have little choice other
than to shut down.

The most abrupt way to bail out is by calling the Standard C *abort*
function, as in:

if (/* something really bad happened */)
   abort();

Calling *abort* terminates program execution with no promise of
cleaning anything up. Calling the Standard C *exit* function is not
quite as rude:

if (/* something really bad happened */)
   exit(EXIT_FAILURE);

Calling *exit* closes all open files after flushing any unwritten
buffered data, removes temporary files, and returns an integer-valued
exit status to the operating system. The standard header *<stdlib.h>*
defines the macro *EXIT_FAILURE* as the value indicating unsuccessful
termination.

You can use the Standard C *atexit* function to customize *exit* to
perform additional actions at program termination. For example, calling:

atexit(turn_gizmo_off);

"registers" the *turn_gizmo_off* function so that a subsequent call to
*exit* will invoke:

turn_gizmo_off();

as it terminates the program. The C standard says that *atexit* should
let you register up to 32 functions. Some implementations allow even more.

Embedded systems being as diverse as they are, I suspect that some don't
support either *abort* or *exit*. In those systems, you must use some
other platform-specific function(s) to shut things down.

More commonly, complete shutdown is not the appropriate response to an
error. Rather than shut down, the system should transition to a "safe"
state, whatever that is, and continue running. Here again, the details
of that transition are platform specific.

*Returning*
Some of the code in any embedded system is clearly application specific.
Many systems contain a good chunk of application-independent code as
well. The application-independent code could be from a library shipped
with the compiler or operating system, from a third-party library, or
from something developed in-house.

When an application-specific function detects an error, it can respond
on the spot with a specific action, as in:

if (/* something really bad happened */)
   take_me_some_place_safe();

In contrast, when an application-independent function detects an error,
it can't respond on its own because it doesn't know how the application
wants to respond. (If it did know, it wouldn't be application
independent.) Rather than respond at the point where the error was
detected, an application-independent function can only announce that the
error has occurred and leave the error handling to some other function
further up the call chain. The announcement might appear as a return
value, an argument passed by address, a global object, or some
combination of these. As I described earlier, this is what most Standard
C library functions do.

Although conceptually simple, returning error indicators can quickly
become cumbersome. For example, suppose your application contains a
chain of calls in which *main* calls *f*, which calls *g*, which calls
*h*. Ignoring any concern for error handling, the code would be as shown
in Listing 1.

Now, suppose reality intrudes and function *h* has to check for a
condition it can't handle. In that case, you might rewrite *h* so that
it has a non-*void* return type, such as *int*, and appropriate return
statements for error and normal returns. The function might look like:

int h(void)
{
   if (/* something really bad happened */)
      return -1;
   // do h
   return 0;
}

Now *g* is responsible for heeding the return value of *h* and acting
accordingly. However, more often than not, functions in the middle of a
call chain, such as *g* and *f*, aren't in the position to handle the
error. In that case, all they can do is look for error values coming
from the functions they call and return them up the call chain. This
means you must rewrite both *f* and *g* to have non-*void* return types
along with appropriate return statements, as in:

int g(void)
{
   int status;
   if ((status = h()) != 0)
      return status;
   // do the rest of g
   return 0;
}

int f(void)
{
   int status;
   if ((status = g()) != 0)
      return status;
   // do the rest of f
   return 0;
}

Finally, the buck stops with *main*:

int main()
{
   if (f() != 0)
      // handle the error
   // do the rest of main
   return 0;
}

This approach--returning error codes via return values or
arguments--effectively decouples error detection from error handling,
but the costs can be high. Passing the error codes back up the call
chain increases the size of both the source code and object code and
slows execution time. It's been a while since I've used this approach to
any extent, but my recollection is that the last time I did, it
increased the non-comment source lines in my application by 15 to 20%,
with a comparable increase in the object code. Other programmers have
told me they've experienced increases to the tune of 30 to 40%.

This technique also increases coding effort and reduces readability.
It's usually difficult to be sure that your code checks for all possible
errors. Static analyzers, such as Lint, can tell you when you've ignored
a function's return value, but as far as I know, they can't tell you
when you've ignored the value of an argument passed by address. The
consistent application of this technique can easily break down when the
current maintainer of the code hands it off to a less experienced one.

*Jumping*
We could eliminate much of the error reporting code from the middle
layers of the call chain by transferring control directly from the
error-detection point to the error-handling point. Some languages let
you do this with a non-local goto. If you could do this in C, it might
look like:

int h(void)
{
   if (/* something really bad happened */)
      goto error_handler;
   // do h
   return 0;
}

...

int main()
{
   f();
   // do the rest of main
   return 0;
error_handler:
   // handle the error
}

but you can't. It won't compile. However, you can do something similar
using the facilities provided by the standard header *<setjmp.h>*. That
header declares three components: a type named *jmp_buf* and two
functions named *setjmp* and *longjmp*. (Actually, *setjmp* might be a
function-like macro, but for the most part, you can think of it as a
function.)

Calling *setjmp(jb)* stores a "snapshot" of the program's current
calling environment into *jmp_buf jb*. That snapshot typically includes
values such as the program counter, stack pointer, and possibly other
CPU registers that characterize the current state of the calling
environment.

Subsequently, calling *longjmp(jb, v)* (I'll explain *v* shortly)
effectively performs a non-local goto--it restores the calling
environment from snapshot *jb* and causes the program to resume
execution as if it were returning from the call to *setjmp* that took
the snapshot previously. It's like /déjà vu/ all over again.

The function calling *setjmp* can use *setjmp*'s return value to
determine whether the return from *setjmp* is really that, or actually a
return from *longjmp*. When a function directly calls *setjmp(jb)* to
take a snapshot, *setjmp* returns 0. A later call to *longjmp(jb, v)*,
where *v* is non-zero, causes program execution to resume as if the
corresponding call to *setjmp* returned *v*. In the special case where
*v* is equal to 0, *longjmp(jb, v)* causes setjmp to return 1, so that
*setjmp* only returns 0 when called directly.

Listing 2 shows our hypothetical application with a *longjmp* from *h*
to *main*. Since the *longjmp* bypasses *g* and *f*, these two functions
no longer need to check for error return values, thus simplifying the
source code and reducing the object code.

Using *setjmp* and *longjmp* eliminates most, if not all, of the clutter
that accrues from checking and returning error codes. So what's not to
like about them?

The problem is that you must be extremely cautious with them to avoid
accessing invalid data or mismanaging resources. A *jmp_buf* need not
contain any more information than necessary to enable the program to
resume execution as if it were returning from a *setjmp* call. It need
not and probably will not preserve the state of any local or global
objects, files, or floating-point status flags.

Using *setjmp* and *longjmp* can easily lead to resource leaks. For
example, suppose functions *g* and *f* each allocate and deallocate a
resource, as in:

void g(size_t n)
{
   char *p;
   if ((p = malloc(n)) == NULL)
      // deal with it
   h();
   // do the rest of g
   free(p);
}

void f(char const *n)
{
   FILE *f;
   if ((f = fopen(n, "r")) == NULL)
      // deal with it
   g();
   // do the rest of f
   if (fclose(f) == EOF)
      // deal with it
}

A call to *longjmp* from *h* transfers control to *main*, completely
bypassing the remaining portions of *g* and *f*. When this happens, *f*
misses the opportunity to close its *FILE*, and *g* misses the
opportunity to free its allocated memory.

C++ classes use destructors to provide automatic resource deallocation.
A common practice in C++ is to wrap pointers inside classes, and provide
destructors to ensure that the resources managed via these pointers are
eventually released. Unfortunately, *setjmp* and *longjmp* are strictly
C functions that know nothing about destructors. Calling *longjmp* in a
C++ program can bypass destructor calls, resulting in resource leaks.

*A forward pass*
Using exception handling in C++ can avoid these resource leaks. It
properly executes destructors as it transfers control from an
error-detection point to an error handler, and it will be the subject of
my next column.

Wednesday, March 07, 2007

C++ Performance Tips

Introduction

These tips are based mainly on ideas from the book Efficient C++
by Dov Bulka and David Mayhew. For a more thorough treatment of
performance programming with C++, I highly recommend this book.

Constructors and Destructors

* The performance of constructors and destructors is often poor due
to the fact that an object's constructor (destructor) may call the
constructors (destructors) of member objects and parent objects.
This can result in constructors (destructors) that take a long
time to execute, especially with objects in complex hierarchies or
objects that contain several member objects. As long as all of the
computations are necessary, then there isn't really a way around
this. As a programmer, you should at least be aware of this
"silent execution".

If all of the computations mentioned above are not necessary, then
they should be avoided. This seems like an obvious statement, but
you should be sure that the constructor you are using does only
what you need.

* Objects should be only created when they are used. A good
technique is to put off object creation to the scope in which it
is used. This prevents unnecessary constructors and destructors
from being called.
* Using the initializer list functionality that C++ offers is very
important for efficiency. All member objects that are not in the
initializer list are by default created by the compiler using
their respective default constructors. By calling an object's
constructor in the initializer list, you avoid having to call an
object's default constructor and the overhead from an assignment
operator inside the constructor. Also, using the initializer list
may reduce the number of temporaries needed to construct the
object. See the Temporaries section for more information on this.

------------------------------------------------------------------------


Virtual Functions

* Virtual functions negatively affect performance in 3 main ways:
1. The constructor of an object containing virtual functions
must initialize the vptr table, which is the table of
pointers to its member functions.
2. Virtual functions are called using pointer indirection,
which results in a few extra instructions per method
invocation as compared to a non-virtual method invocation.
3. Virtual functions whose resolution is only known at run-time
cannot be inlined. (For more on inlining, see the Inlining
section.)
* Templates can be used to avoid the overhead of virtual functions
by using a templated class in place of inheritance. A templated
class does not use the vptr table because the type of class is
known at compile-time instead of having to be determined at
run-time. Also, the non-virtual methods in a templated class can
be inlined.
* The cost of using virtual functions is usually not a factor in
calling methods that take a long time to execute since the call
overhead is dominated by the method itself. In smaller methods,
for example accessor methods, the cost of virtual functions is
more important.

------------------------------------------------------------------------


Return Value

Methods that must return an object usually have to create an object to
return. Since constructing this object takes time, we want to avoid it
if possible. There are several ways to accomplish this.

* Instead of returning an object, add another parameter to the
method which allows the caller to pass in the object in which the
result should be stored. This way the method won't have to create
an extra object; it simply uses the parameter passed to it. A
related compiler technique, Return Value Optimization (RVO),
constructs the returned object directly in the caller's storage
instead of in a temporary.
* Whether or not RVO actually kicks in is up to the compiler;
different compilers handle this differently. One way to help the
compiler is to use a computational constructor. A computational
constructor can be used in place of a method that returns an
object: it takes the same parameters as the method to be
optimized, but instead of returning an object based on the
parameters, it initializes itself based on their values.

------------------------------------------------------------------------


Temporaries

Temporaries are objects that are "by-products" of a computation. They
are not explicitly declared, and as their name implies, they are
temporary. Still, you should know when the compiler is creating a
temporary object because it is often possible to prevent this from
happening.

* The most common place for temporaries to occur is in passing an
object to a method by value. The formal argument is created on the
stack. This can be prevented by using pass by address or pass by
reference.
* Compilers may create a temporary object in assignment of an
object. For example, a constructor that takes an int as an
argument may be assigned an int. The compiler will create a
temporary object using the int as the parameter and then call
the assignment operator on the object. You can prevent the
compiler from doing this behind your back by using the explicit
keyword in the declaration of the constructor.
* When objects are returned by value, temporaries are often used.
See the Return Value section for more on this.
* Temporaries can be avoided by using compound assignment
operators (+=, -=, and so on). For example, the code
a = b + c;
could be written as
a = b;
a += c;

------------------------------------------------------------------------


Inlining

Inlining is one of the easiest optimizations to use in C++ and it can
result in the most dramatic improvements in execution speed. The main
thing to know when using inlining is when you should inline a method and
when you shouldn't inline.

* There is always a trade-off between code size and execution speed
when inlining. In general, small methods (for example, accessors)
should be inlined and large methods should not be inlined.
* If you are not sure of whether or not a given method should be
inlined, the best way to decide is to profile the code. That is,
run test samples of the code, timing inlining and non-inlining
versions.
* Excessive inlining can drastically increase code size, which can
result in increased execution times because of a resulting lower
cache hit rate.
* Watch out for inlined methods that make calls to other inlined
methods. This can make the code size unexpectedly larger.
* Singleton methods, methods that are only called from one place in
a program, are ideal for inlining. The code size does not get any
bigger and execution speed only gets better.
* Using literal arguments with an inlined method allows the compiler
to make significant optimizations. (This is, however, compiler
dependent.)
* The compiler preprocessor can be used to implement conditional
inlining. This is useful so that during testing the code is easier
to debug, while for compiling production code no changes to the
source are needed. It is implemented using a preprocessor macro
called INLINE; inlined code is defined within
#ifdef INLINE ...