Linux threads and PIDs

At our weekly software team meeting we were discussing our progress towards replacing Flask with Tornado. As part of the discussion we talked about how we need to be careful in Tornado because no blocking "work" can happen in the main server process. In Flask that was fine because multiple threads were running: if one thread was blocked doing some work, other threads could still answer requests. Tornado is single-threaded with cooperative multitasking, so that approach won't work; a task that blocks stalls every other request. Instead, tasks need to cooperatively monitor other processes that do the actual work. In passing, the CTO mentioned that each thread has a unique PID (or at least, that's what I heard him say). That surprised me. I had assumed that threads shared a PID.
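
To make the blocking problem concrete, here is a minimal sketch. It uses the standard library's asyncio rather than Tornado itself, and a worker thread rather than a separate process, purely to keep it short and dependency-free. The names heartbeat, measure and run_demo are made up for illustration. The heartbeat task stands in for "other requests": it can only tick while the event loop is free.

```python
import asyncio
import time


async def heartbeat(counter):
    # Stands in for other requests: it only advances when the loop is free.
    while True:
        counter[0] += 1
        await asyncio.sleep(0.01)


async def measure(offload):
    counter = [0]
    task = asyncio.create_task(heartbeat(counter))
    await asyncio.sleep(0)  # yield once so the heartbeat gets started

    if offload:
        # Hand the blocking call to a worker thread and await its result,
        # leaving the event loop free to run the heartbeat meanwhile.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, time.sleep, 0.2)
    else:
        # Blocking call made directly in the coroutine: the loop stalls.
        time.sleep(0.2)

    ticks = counter[0]
    task.cancel()
    return ticks


def run_demo():
    blocked = asyncio.run(measure(offload=False))
    offloaded = asyncio.run(measure(offload=True))
    return blocked, offloaded


if __name__ == "__main__":
    blocked, offloaded = run_demo()
    print("heartbeat ticks while blocking in the loop:", blocked)
    print("heartbeat ticks while work is offloaded:", offloaded)
```

When the sleep runs directly in the coroutine the heartbeat barely moves, while offloading it lets the heartbeat keep ticking. The exact tick counts depend on timing, so treat them as indicative only.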

I did a bit of research and it looks like things have changed over time. Before the 2.6 kernel (released at the end of 2003), the thread implementation was called LinuxThreads, and each LinuxThreads thread had a separate PID. With 2.6 the implementation changed to the Native POSIX Thread Library (NPTL), which aligns more closely with the POSIX threads (pthreads) standard. Under NPTL, all threads in a process share the same PID.
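
The NPTL behavior is easy to check from Python. This is a small sketch, assuming Python 3.8 or newer (for threading.get_native_id) on a modern Linux system; collect_ids is a name I made up. Each thread records what os.getpid() returns alongside its own kernel-assigned thread ID.

```python
import os
import threading


def collect_ids(n_threads=3):
    results = []
    lock = threading.Lock()
    # The barrier keeps every thread alive until all have recorded their IDs,
    # so the kernel cannot recycle a thread ID within this run.
    barrier = threading.Barrier(n_threads)

    def worker():
        ids = (os.getpid(), threading.get_native_id())
        with lock:
            results.append(ids)
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for pid, tid in collect_ids():
        print(f"pid={pid} tid={tid}")
```

Every line should show the same pid and a different tid, which matches the "threads share a PID" picture.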

To add to the confusion, the definition of a PID is hard to nail down. POSIX and Linux seem to use the term differently, and the kernel's meaning of PID isn't the same as the user-space meaning: what user space calls a PID the kernel calls a thread group ID (TGID), while the kernel's per-thread identifier is what user space calls a thread ID (TID). See the notes section of the getpid man page for more discussion. Programs like top may display one of several different identifiers under the PID column (see section 3.a.19 PID of its man page).
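
The naming mismatch shows up directly in /proc. The sketch below is Linux-only and again assumes Python 3.8 or newer; read_status_ids and status_from_worker are hypothetical helper names. It has a worker thread read its own status file, where the field labelled "Tgid" is the thread group ID (the user-space PID) and the field labelled "Pid" is actually the thread ID.

```python
import os
import threading


def read_status_ids(tid):
    # Pull the "Tgid" and "Pid" fields out of this thread's status file.
    fields = {}
    with open(f"/proc/self/task/{tid}/status") as fh:
        for line in fh:
            key, _, value = line.partition(":")
            if key in ("Tgid", "Pid"):
                fields[key] = int(value.strip())
    return fields


def status_from_worker():
    out = {}

    def worker():
        tid = threading.get_native_id()
        out["tid"] = tid
        out.update(read_status_ids(tid))

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return out


if __name__ == "__main__":
    info = status_from_worker()
    print("os.getpid():     ", os.getpid())
    print("status Tgid:     ", info["Tgid"])
    print("status Pid:      ", info["Pid"])
    print("get_native_id(): ", info["tid"])
```

For a worker thread, Tgid should match os.getpid() while Pid matches the thread's native ID, so the kernel's "Pid" field is not the PID that getpid reports.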

To add further to the confusion, it seems much of the pthreads implementation lives in user space, inside the C library, so libc implementations other than glibc (e.g. musl) may do things differently.