For a main thread a thread object is created by the CRT0 before _main gets
called so that _main can already run in a generic environment that, e.g.,
catches stack overflows as a page-fault instead of corrupting the BSS.
Additionally dynamic programs have only one CRT0 - the one of the LDSO -
which does the initialization for both LDSO and program.
ref #989
Using the new 'join()' function, the caller can explicitly block for the
completion of the thread's 'entry()' function. The test case for this
feature can be found at 'os/src/test/thread_join'. For hybrid
Linux/Genode programs, the 'Thread_base::join()' does not map directly
to 'pthread_join'. The latter function gets already called by the
destructor of 'Thread_base'. According to the documentation, subsequent
calls of 'pthread_join' for one thread may result in undefined behaviour.
So we use a 'Genode::Lock' on this platform, which is in line with the
other platforms.
Related to #194, #501