Thursday, July 28, 2011

The curious case of pthread_atfork on PowerPC

Recently I was asked to look at an issue on PowerPC where the symbol pthread_atfork appeared to be missing from the 64-bit libpthread.so.0 library.  Running an application that was linked with -lpthread produced the following message:
undefined symbol: pthread_atfork
(Note: while this problem was encountered on PowerPC, the general concept is relevant to all platforms.)

If we compare the symbols exported in /lib/libpthread.so.0 (32-bit shared object) to those in /lib64/libpthread.so.0 (64-bit shared object) we see that there are no exported symbols for pthread_atfork in the 64-bit shared object:

nm /lib/libpthread.so.0 | grep atfork
0000c900 t __dyn_pthread_atfork
0000c900 T pthread_atfork@GLIBC_2.0
         U __register_atfork@@GLIBC_2.3.2
nm /lib64/libpthread.so.0 | grep atfork
<nothing>
In the 32-bit library notice that the addresses for __dyn_pthread_atfork and pthread_atfork@GLIBC_2.0 are the same!  The single @ marks pthread_atfork@GLIBC_2.0 as a version-tagged compatibility symbol, which is not the same as a non-versioned (or default-version, @@) pthread_atfork symbol, and the link editor will not bind new references to it.  This tells us that there is no dynamically exported pthread_atfork that a newly linked program can use in either the 32-bit or 64-bit libpthread.so.0.

So what's going on?  Since pthread_atfork hasn't been deprecated from POSIX, where is it?

Let's take a look at some code from GLIBC that may help clarify things:

nptl/old_pthread_atfork.c:

/* Copyright (C) 2002 Free Software Foundation, Inc.
   This file is part of the GNU C Library.
   Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, write to the Free
   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
   02111-1307 USA. */

#include <shlib-compat.h>

#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_3)
# define __pthread_atfork __dyn_pthread_atfork
# include "pthread_atfork.c"
# undef __pthread_atfork
compat_symbol (libpthread, __dyn_pthread_atfork, pthread_atfork, GLIBC_2_0);
#endif
nptl/pthread_atfork.c:
/* Copyright (C) 2002, 2006 Free Software Foundation, Inc.
   This file is part of the GNU C Library.
   Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   In addition to the permissions in the GNU Lesser General Public
   License, the Free Software Foundation gives you unlimited
   permission to link the compiled version of this file with other
   programs, and to distribute those programs without any restriction
   coming from the use of this file. (The GNU Lesser General Public
   License restrictions do apply in other respects; for example, they
   cover modification of the file, and distribution when not linked
   into another program.)

   Note that people who make modified versions of this file are not
   obligated to grant this special exception for their modified
   versions; it is their choice whether to do so. The GNU Lesser
   General Public License gives permission to release a modified
   version without this exception; this exception also makes it
   possible to release a modified version which carries forward this
   exception.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, write to the Free
   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
   02111-1307 USA. */

#include "pthreadP.h"
#include <fork.h>

/* This is defined by newer gcc version unique for each module. */
extern void *__dso_handle __attribute__ ((__weak__,
                                          __visibility__ ("hidden")));


/* Hide the symbol so that no definition but the one locally in the
executable or DSO is used. */
int
#ifndef __pthread_atfork
/* Don't mark the compatibility function as hidden. */
attribute_hidden
#endif
__pthread_atfork (prepare, parent, child)
     void (*prepare) (void);
     void (*parent) (void);
     void (*child) (void);
{
  return __register_atfork (prepare, parent, child,
                            &__dso_handle == NULL ? NULL : __dso_handle);
}
#ifndef __pthread_atfork
extern int pthread_atfork (void (*prepare) (void), void (*parent) (void),
                           void (*child) (void)) attribute_hidden;
strong_alias (__pthread_atfork, pthread_atfork)
#endif
Here we see that __dyn_pthread_atfork is a compat symbol for an older pthread_atfork implementation bound by alias to pthread_atfork@GLIBC_2.0, a versioned symbol.  Older applications that were linked against GLIBC 2.0 and pthread_atfork will be bound to __dyn_pthread_atfork when run against newer versions of GLIBC.

Looking back at our nm output, what we don't see is a non-versioned pthread_atfork symbol being exported by /lib/libpthread.so.0 (32-bit) and there is neither that nor a versioned symbol for /lib64/libpthread.so.0 (64-bit); why?
       
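To answer the second question first: PPC64 wasn't supported as a platform in GLIBC until after pthread_atfork was aliased to __dyn_pthread_atfork.  The earliest symbol version on PPC64 is GLIBC_2.3, so the SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_3) test is false there and the compat symbol is never built.  So on PPC64 there isn't a dynamically exported version of pthread_atfork@GLIBC_2.0.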

To get to the bottom of the first question takes a bit more analysis.  The guard, #ifndef __pthread_atfork, is false only if we're building the compat version of the symbol, i.e. building "old_pthread_atfork.c", which defines __pthread_atfork as a macro before including pthread_atfork.c.
       
Otherwise, we're building pthread_atfork.c directly, in which case __pthread_atfork is marked hidden, and a strong_alias pthread_atfork is bound to it.  Curiously this non-versioned pthread_atfork alias is also marked hidden!  This implies that this will not be exported from libpthread.so.0.

We then see that the nptl/Makefile explicitly keeps a non-versioned pthread_atfork symbol from being exported by the shared object file libpthread.so.0 by simply excluding the pthread_atfork relocatable file from libpthread.so.0.  This is done by eliding it in the nptl/Makefile:
libpthread-static-only-routines = pthread_atfork
Any relocatable files elided via a lib*-static-only-routines variable are left out of the lib*.so.* shared object and built into a lib*_nonshared.a archive instead.

In general the functions in a lib*_nonshared.a archive are kept hidden, hence the reason for the curious hidden attribute on the pthread_atfork strong_alias.
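
To make the two attributes concrete, here is a minimal sketch of roughly what attribute_hidden and strong_alias amount to with GCC on an ELF target.  The names demo_register, demo_alias and demo_check are made up for illustration, and the attribute spellings are my approximation of the GLIBC macros rather than a copy of them.

```c
#include <assert.h>

/* Roughly what attribute_hidden does: the symbol can be referenced by
   other object files in the same link, but it is not exported from the
   resulting executable or DSO. */
__attribute__((visibility("hidden")))
int demo_register(int handlers)
{
    return handlers + 1;
}

/* Roughly what strong_alias (demo_register, demo_alias) does: a second
   name for the same code.  The alias is marked hidden as well, like the
   pthread_atfork alias in pthread_atfork.c. */
extern __typeof__(demo_register) demo_alias
    __attribute__((alias("demo_register"), visibility("hidden")));

/* Both names refer to the same function. */
void demo_check(void)
{
    assert(demo_alias == demo_register);
    assert(demo_alias(1) == demo_register(1));
}
```

Running nm on an object built from this should show both names as defined text symbols at the same address, much like the libpthread_nonshared.a listing.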

If we look at libpthread_nonshared.a we will indeed see the pthread_atfork symbol.
nm /usr/lib/libpthread_nonshared.a

pthread_atfork.oS:
         w __dso_handle
00000000 T __pthread_atfork
         U __register_atfork
00000000 T pthread_atfork

nm /usr/lib64/libpthread_nonshared.a

pthread_atfork.oS:
                 w __dso_handle
0000000000000000 D __pthread_atfork
                 U __register_atfork
0000000000000000 D pthread_atfork
The libraries in GLIBC that need to use pthread_atfork are each statically linked against the libpthread_nonshared.a static archive.  The pthread_atfork symbol becomes statically linked into each individual GLIBC library and need not be exported.

So how does a user application use pthread_atfork?  By linking against libpthread.so, which is in fact a linker script that directs the linker to dynamically link against libpthread.so.0 and statically link in libpthread_nonshared.a:

cat /usr/lib/libpthread.so
/* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily. */
OUTPUT_FORMAT(elf32-powerpc)
GROUP ( /lib/libpthread.so.0 /usr/lib/libpthread_nonshared.a )

cat /usr/lib64/libpthread.so
/* GNU ld script
Use the shared library, but some functions are only in
the static library, so try that secondarily. */
OUTPUT_FORMAT(elf64-powerpc)
GROUP ( /lib64/libpthread.so.0 /usr/lib64/libpthread_nonshared.a )
This automatically statically links the needed members of libpthread_nonshared.a into an application at link time.  As of 2002-11-26 the pthread_atfork function is no longer a dynamically exported symbol in GLIBC.

So why are we getting undefined symbol: pthread_atfork even when we're linking with -lpthread?  The reason most likely has to do with the order of linking.

Let's say we have a relocatable object foo.o which includes a reference to pthread_atfork and we link foo.o into a shared object in the following way:

gcc -shared -o libfoo-0.0.1.so -lpthread ./foo.o

This seems like a perfectly reasonable way to link a shared object, and if we were only requesting dynamically exported symbols from libpthread.so.0 it would probably work just fine.  The problem comes in when we need to use a symbol provided by libpthread_nonshared.a, a static archive.
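
For concreteness, foo.o could come from a source file along these lines.  This is only a sketch: foo_init, foo_fork_ran_handlers and the three handlers are hypothetical names, and the handlers just count calls so that their effect is observable.

```c
/* foo.c: hypothetical source for foo.o. */
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int prepare_calls, parent_calls, child_calls;

static void on_prepare(void) { prepare_calls++; } /* before fork() */
static void on_parent(void)  { parent_calls++; }  /* parent, after fork() */
static void on_child(void)   { child_calls++; }   /* child, after fork() */

/* This call is what leaves an undefined reference to pthread_atfork
   in foo.o. */
void foo_init(void)
{
    int rc = pthread_atfork(on_prepare, on_parent, on_child);
    assert(rc == 0);
    (void) rc; /* keeps NDEBUG builds free of unused-variable warnings */
}

/* Fork once and report whether the handlers ran on both sides.
   Returns 1 if they did, 0 otherwise. */
int foo_fork_ran_handlers(void)
{
    int prep0 = prepare_calls, par0 = parent_calls, ch0 = child_calls;
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return 0;
    if (pid == 0) /* child: prepare and child handlers ran, parent did not */
        _exit(prepare_calls > prep0 && child_calls > ch0
              && parent_calls == par0 ? 0 : 1);

    if (waitpid(pid, &status, 0) < 0)
        return 0;
    /* parent: prepare and parent handlers ran, child handler did not */
    return WIFEXITED(status) && WEXITSTATUS(status) == 0
           && prepare_calls > prep0 && parent_calls > par0
           && child_calls == ch0;
}
```

The single call in foo_init is all it takes for foo.o to carry a reference that, on the systems described here, only libpthread_nonshared.a can satisfy.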

The link editor (static linker) is generally quite smart, but it processes files in sequential order as it encounters them on the command line.  When it examines libpthread.so and sees that it needs to statically link in libpthread_nonshared.a, it searches for outstanding undefined symbol references that this archive might satisfy.  If it finds one, it links the archive member that defines it, along with that member's dependencies, into the output shared object.  Otherwise it discards the symbols it encounters in libpthread_nonshared.a.
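
The order dependence can be sketched with a toy model.  This is not how ld is implemented; it is a single left-to-right pass over hypothetical inputs that tracks one symbol, with objects always linked and archive members pulled in only when they satisfy a reference that is already outstanding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum input_kind { OBJECT, ARCHIVE };

struct input {
    enum input_kind kind;
    const char *defines;    /* symbol this input defines, or NULL */
    const char *references; /* symbol this input references, or NULL */
};

static bool names(const char *field, const char *sym)
{
    return field != NULL && strcmp(field, sym) == 0;
}

/* One left-to-right pass over the inputs.  Returns true if a reference
   to sym is still unresolved when the pass ends. */
bool leaves_undefined(const struct input *in, size_t n, const char *sym)
{
    bool wanted = false, defined = false;

    assert(in != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (in[i].kind == OBJECT) {
            /* Relocatable objects are always linked in. */
            if (names(in[i].defines, sym))
                defined = true;
            if (names(in[i].references, sym))
                wanted = true;
        } else if (wanted && !defined && names(in[i].defines, sym)) {
            /* An archive member is used only if it is needed right now. */
            defined = true;
        }
    }
    return wanted && !defined;
}

/* Models "-lpthread ./foo.o": the archive is seen before the reference. */
const struct input archive_first[] = {
    { ARCHIVE, "pthread_atfork", NULL },
    { OBJECT,  NULL, "pthread_atfork" },
};

/* Models "./foo.o -lpthread": the reference is seen before the archive. */
const struct input object_first[] = {
    { OBJECT,  NULL, "pthread_atfork" },
    { ARCHIVE, "pthread_atfork", NULL },
};
```

In this model leaves_undefined reports an unresolved pthread_atfork for archive_first but not for object_first, which mirrors the behaviour described above.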

So when -lpthread is added before foo.o on the linker invocation, libpthread_nonshared.a is linked and pthread_atfork is purged as unneeded before foo.o is processed.  So to solve this we need to make sure that the reference to pthread_atfork by foo.o is registered before libpthread_nonshared.a is linked.  This can be accomplished by any one of the following methods.
  1. ... -o libfoo-0.0.1.so ./foo.o -lpthread
  2. ... -o libfoo-0.0.1.so -lpthread ./foo.o -Wl,-u,pthread_atfork
  3. ... -o libfoo-0.0.1.so -Wl,--whole-archive -lpthread -Wl,--no-whole-archive ./foo.o
  4. ... -o libfoo-0.0.1.so -Wl,-\( -lpthread ./foo.o -Wl,-\)
The first method simply makes sure we link libpthread_nonshared.a after foo.o.

The second method enters pthread_atfork into the link as an undefined symbol up front, so the linker pulls it in from libpthread_nonshared.a rather than purging it.

The third method is heavy handed and tells the linker to include every object file in libpthread_nonshared.a in the final link, whether or not it is referenced.  (--whole-archive has no effect on the shared object libpthread.so.0 itself.)

The fourth method encloses -lpthread and foo.o in a linker group (-( and -), also spelled --start-group and --end-group), so the archives inside the group are searched repeatedly until no new undefined references turn up.  This means that libpthread_nonshared.a would be searched twice, once before and once after foo.o is processed.  This can increase link time significantly.

7 comments:

  1. Your article was very helpful for me. Thank you!

  2. This article is very useful. Thank you for helping me save time.

  3. Very useful, thanks for the explanation.

  4. On x86_64, Ubuntu 18.04, I get an error with pthread_atfork. On previous Linux versions everything was OK:


    [ 72%] Linking C executable testedbsync
    cd /home/mvitolin/projects/endurox/ubftest && /usr/bin/cmake -E cmake_link_script CMakeFiles/testedbsync.dir/link.txt --verbose=1
    /usr/bin/cc -fsanitize=address -fno-omit-frame-pointer -O1 -ggdb -Wno-format-truncation -Wstringop-overflow=0 -D_DEFAULT_SOURCE=1 -g -rdynamic CMakeFiles/testedbsync.dir/test_nstd_msync.c.o -o testedbsync -L/home/mvitolin/projects/endurox/libubf -Wl,-rpath,/home/mvitolin/projects/endurox/libubf:/home/mvitolin/projects/endurox/libnstd ../libubf/libubf.so ../libcgreen/libcgreen.a -lm ../libnstd/libnstd.so -lrt -ldl -pthread -lpthread
    /usr/bin/ld: testedbsync: hidden symbol `pthread_atfork' in /usr/lib/x86_64-linux-gnu/libpthread_nonshared.a(pthread_atfork.oS) is referenced by DSO
    /usr/bin/ld: final link failed: Bad value
    collect2: error: ld returned 1 exit status

    Any ideas where the problem could be? I tried all of the above approaches and none of them helped.

    1. Basically, the binary above is linking against a shared library which uses pthread_atfork(), i.e.:


      Here is how the library is built:

      cd /home/mvitolin/projects/endurox/libnstd && /usr/bin/cmake -E cmake_link_script CMakeFiles/nstd.dir/link.txt --verbose=1
      /usr/bin/cc -fPIC -fsanitize=address -fno-omit-frame-pointer -O1 -ggdb -Wno-format-truncation -Wstringop-overflow=0 -D_DEFAULT_SOURCE=1 -g -shared -Wl,-soname,libnstd.so -o libnstd.so CMakeFiles/nstd.dir/ndebug.c.o CMakeFiles/nstd.dir/nstdutil.c.o CMakeFiles/nstd.dir/nstopwatch.c.o CMakeFiles/nstd.dir/nclopt.c.o CMakeFiles/nstd.dir/benchmark.c.o CMakeFiles/nstd.dir/ini.c.o CMakeFiles/nstd.dir/inicfg.c.o CMakeFiles/nstd.dir/cconfig.c.o CMakeFiles/nstd.dir/nerror.c.o CMakeFiles/nstd.dir/nstd_tls.c.o CMakeFiles/nstd.dir/ulog.c.o CMakeFiles/nstd.dir/sys_genunix.c.o CMakeFiles/nstd.dir/sys_svqpoll.c.o CMakeFiles/nstd.dir/sys_linux.c.o CMakeFiles/nstd.dir/sys_common.c.o CMakeFiles/nstd.dir/sys_posixq.c.o CMakeFiles/nstd.dir/sys_svq.c.o CMakeFiles/nstd.dir/tplog.c.o CMakeFiles/nstd.dir/exregex.c.o CMakeFiles/nstd.dir/platform.c.o CMakeFiles/nstd.dir/msgsizemax.c.o CMakeFiles/nstd.dir/exaes.c.o CMakeFiles/nstd.dir/exsha1.c.o CMakeFiles/nstd.dir/exbase64.c.o CMakeFiles/nstd.dir/crypto.c.o CMakeFiles/nstd.dir/expluginbase.c.o CMakeFiles/nstd.dir/lmdb/eidl.c.o CMakeFiles/nstd.dir/lmdb/edb.c.o CMakeFiles/nstd.dir/edbutil.c.o CMakeFiles/nstd.dir/crc32.c.o CMakeFiles/nstd.dir/nstd_shm.c.o CMakeFiles/nstd.dir/sys_svqshm.c.o CMakeFiles/nstd.dir/sys_svqevent.c.o CMakeFiles/nstd.dir/nstd_sem.c.o CMakeFiles/nstd.dir/sys_svqadmin.c.o
      make[2]: Leaving directory '/home/mvitolin/projects/endurox'
      [100%] Built target nstd
      make[1]: Leaving directory '/home/mvitolin/projects/endurox'
      /usr/bin/cmake -E cmake_progress_start /home/mvitolin/projects/endurox/CMakeFiles 0

  5. The shared library that is using pthread_atfork() should have been statically linked against libpthread_nonshared.a automatically by the libpthread.so linker script. This means that pthread_atfork should be embedded in the shared library as a symbol. Use elfutils to inspect the shared library for that symbol. If it doesn't contain the symbol, but expects it, check the NEEDED list in the shared library for the libpthread version. If it's sufficiently old it'll expect the old version of libpthread where pthread_atfork was dynamically linked. If that's the case you need the old libpthread compat libraries for the older ABI on the system. If it's expecting the newer version of pthread_atfork, then for some reason libpthread_nonshared.a was not statically linked when the shared library was created (as it should have been), which would point to an error with the script used to compile and link the shared library.
