This section discusses issues surrounding the proper compilation of multithreaded applications which use the Standard C++ library. This information is GCC-specific since the C++ standard does not address matters of multithreaded applications.
All normal disclaimers aside, multithreaded C++ applications are
      only supported when libstdc++ and all user code were built with
      compilers which report (via gcc/g++ -v) the same thread
      model and that model is not single.  As long as your
      final application is actually single-threaded, it should be
      safe to mix user code built with a thread model of
      single with a libstdc++ and other C++ libraries built
      with another thread model useful on the platform.  Other mixes
      may or may not work but are not considered supported.  (Thus, if
      you distribute a shared C++ library in binary form only, it may
      be best to compile it with a GCC configured with
      --enable-threads for maximal interchangeability and usefulness
      with a user population that may have built GCC with either
      --enable-threads or --disable-threads.)
   
When you link a multithreaded application, you will probably need to add a library or flag to g++. This is a very non-standardized area of GCC across ports. Some ports support a special flag (the spelling isn't even standardized yet) that adds all required macros to a compilation and makes the link-library additions and/or replacements at link time. If any such flag is required, you must provide it for all compilations, not just for linking. The documentation is weak. On several targets (including GNU/Linux, Solaris and various BSDs) -pthread is honored. Some other ports use other switches. This is not well documented anywhere other than in "gcc -dumpspecs" (look at the 'lib' and 'cpp' entries).
     Some uses of std::atomic also require linking
     to libatomic.
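As a minimal sketch of the flag usage described above, assuming a target where -pthread is honored: the file name demo.cpp, the command line in the comment, and the function name run_in_thread are illustrative only and not part of the library.

```cpp
// Assumed build command for a target that honors -pthread:
//   g++ -std=c++11 -pthread demo.cpp
// Pass the flag to every compilation step, not just the final link.
#include <cassert>
#include <thread>

// Starts one thread that stores 42 in a local variable, then waits for it.
int run_in_thread() {
    int result = 0;
    std::thread worker([&result] { result = 42; });
    worker.join();               // completion of the thread synchronizes with join()
    assert(!worker.joinable());  // the thread has been joined
    return result;               // safe to read: the writer has finished
}
```

Calling run_in_thread() from main should be enough to check that the chosen flags produce a working threaded build on a given port.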
   
In the terms of the 2011 C++ standard a thread-safe program is one which does not perform any conflicting non-atomic operations on memory locations and so does not contain any data races. The standard places requirements on the library to ensure that no data races are caused by the library itself or by programs which use the library correctly (as described below). The C++11 memory model and library requirements are a more formal version of the SGI STL definition of thread safety, which the library used prior to the 2011 standard.
The library strives to be thread-safe when all of the following conditions are met:
The system's libc is itself thread-safe,
	   The compiler in use reports a thread model other than
	   'single'. This can be tested via output from gcc
	   -v. Multi-thread capable versions of gcc output
	   something like this:
	 
%gcc -v
Using built-in specs.
...
Thread model: posix
gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)
Look for "Thread model" lines that aren't equal to "single."
	 Requisite command-line flags are used for atomic operations
	 and threading. Examples of this include -pthread
	 and -march=native, although specifics vary
	 depending on the host environment. See
	 Command Options and
	 Machine
	 Dependent Options.
       
	   An implementation of the
	   atomicity.h functions
	   exists for the architecture in question. See the
	   internals
	   documentation for more details.
       
The user code must guard against concurrent function calls which access any particular library object's state when one or more of those accesses modifies the state. An object will be modified by invoking a non-const member function on it or passing it as a non-const argument to a library function. An object will not be modified by invoking a const member function on it or passing it to a function as a pointer- or reference-to-const. Typically, the application programmer may infer what object locks must be held based on the objects referenced in a function call and whether the objects are accessed as const or non-const. Without getting into great detail, here is an example which requires user-level locks:
     library_class_a shared_object_a;
     void thread_main () {
       library_class_b *object_b = new library_class_b;
       shared_object_a.add_b (object_b);   // must hold lock for shared_object_a
       shared_object_a.mutate ();          // must hold lock for shared_object_a
     }
     // Multiple copies of thread_main() are started in independent threads.
Under the assumption that object_a and object_b are never exposed to another thread, here is an example that does not require any user-level locks:
     void thread_main () {
       library_class_a object_a;
       library_class_b *object_b = new library_class_b;
       object_a.add_b (object_b);
       object_a.mutate ();
     }
All library types are safe to use in a multithreaded program
         if objects are not shared between threads or as
	 long as each thread carefully locks out access by any other
	 thread while it modifies any object visible to another thread.
	 Unless otherwise documented, the only exceptions to these rules
         are atomic operations on the types in
         <atomic>
         and lock/unlock operations on the standard mutex types in
         <mutex>. These
         atomic operations allow concurrent accesses to the same object
         without introducing data races.
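
The two cases can be sketched side by side. This is only an illustration under assumed names (shared_values, shared_values_mutex, calls_made, append_values and run_two_writers are all hypothetical): an ordinary container is guarded by a user-level lock, while an atomic counter is updated concurrently without one.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

std::vector<int> shared_values;   // ordinary library object: needs a user-level lock
std::mutex shared_values_mutex;   // guards every access to shared_values
std::atomic<int> calls_made{0};   // atomic type: concurrent updates need no lock

void append_values(int first, int count) {
    for (int i = 0; i < count; ++i) {
        std::lock_guard<std::mutex> lock(shared_values_mutex);
        shared_values.push_back(first + i);  // non-const call, so hold the lock
    }
    calls_made.fetch_add(1);  // atomic operation, no lock required
}

// Runs two writers concurrently and returns the number of elements stored.
std::size_t run_two_writers() {
    shared_values.clear();
    calls_made = 0;
    std::thread t1(append_values, 0, 1000);
    std::thread t2(append_values, 1000, 1000);
    t1.join();
    t2.join();
    assert(calls_made.load() == 2);  // both threads finished their updates
    return shared_values.size();     // no other thread is running now
}
```

Removing the lock_guard line would turn the push_back calls into conflicting non-const accesses, which is exactly the data race the rules above forbid.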
      
The following member functions of standard containers can be
         considered to be const for the purposes of avoiding data races:
         begin, end, rbegin, rend,
         front, back, data,
         find, lower_bound, upper_bound,
         equal_range, at 
         and, except in associative or unordered associative containers,
         operator[]. In other words, although they are non-const
         so that they can return mutable iterators, those member functions
         will not modify the container.
         Accessing an iterator might cause a non-modifying access to
         the container the iterator refers to (for example incrementing a
         list iterator must access the pointers between nodes, which are part
         of the container and so conflict with other accesses to the container).
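
A small sketch of this rule, with hypothetical names (sum_from_two_threads, reader): two threads call the non-const begin() and end() on the same non-const vector and only read its elements, so no lock is taken.

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <thread>
#include <vector>

// Both threads call the non-const begin() and end() on the same vector.
// Those calls count as const for data-race purposes, and the elements are
// only read, so no user-level lock is needed.
long sum_from_two_threads() {
    std::vector<int> values(1000, 1);  // shared and deliberately non-const
    long sum1 = 0;
    long sum2 = 0;
    auto reader = [&values](long& out) {
        out = std::accumulate(values.begin(), values.end(), 0L);
    };
    std::thread t1(reader, std::ref(sum1));  // each thread writes its own result
    std::thread t2(reader, std::ref(sum2));
    t1.join();
    t2.join();
    return sum1 + sum2;
}
```

If either thread called push_back or another modifying member function, the accesses would conflict and a lock would be required.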
      
Programs which follow the rules above will not encounter data
         races in library code, even when using library types which share
         state between distinct objects.  In the example below the
         shared_ptr objects share a reference count, but
         because the code does not perform any non-const operations on the
         globally-visible object, the library ensures that the reference
         count updates are atomic and do not introduce data races:
      
    #include <memory>
    #include <thread>
    std::shared_ptr<int> global_sp;
    void thread_main() {
      auto local_sp = global_sp;  // OK, copy constructor's parameter is reference-to-const
      int i = *global_sp;         // OK, operator* is const
      int j = *local_sp;          // OK, does not operate on global_sp
      // *global_sp = 2;          // NOT OK, modifies int visible to other threads
      // *local_sp = 2;           // NOT OK, modifies int visible to other threads
      // global_sp.reset();       // NOT OK, reset is non-const
      local_sp.reset();           // OK, does not operate on global_sp
    }
    int main() {
      global_sp.reset(new int(1));
      std::thread t1(thread_main);
      std::thread t2(thread_main);
      t1.join();
      t2.join();
    }
      For further details of the C++11 memory model see Hans-J. Boehm's Threads and memory model for C++ pages, particularly the introduction and FAQ.
This gets a bit tricky. Please read carefully, and bear with me.
A wrapper
      type called __basic_file provides our abstraction layer
      for the std::filebuf classes.  Nearly all decisions dealing
      with actual input and output must be made in __basic_file.
   
A generic locking mechanism is somewhat in place at the filebuf layer, but is not used in the current code. Providing locking at any higher level is akin to providing locking within containers, and is not done for the same reasons (see the links above).
The __basic_file type is simply a collection of small wrappers around
      the C stdio layer (again, see the link under Structure).  We do no
      locking ourselves, but simply pass through to calls to fopen,
      fwrite, and so forth.
   
So, for 3.0, the question of "is multithreading safe for I/O" must be answered with, "is your platform's C library threadsafe for I/O?" Some are by default, some are not; many offer multiple implementations of the C library with varying tradeoffs of threadsafety and efficiency. You, the programmer, are always required to take care with multiple threads.
(As an example, the POSIX standard requires that C stdio FILE*
       operations are atomic.  POSIX-conforming C libraries (e.g., on Solaris
       and GNU/Linux) have an internal mutex to serialize operations on
       FILE*s.  However, you still need to not do stupid things like calling
       fclose(fs) in one thread followed by an access of
       fs in another.)
   
So, if your platform's C library is threadsafe, then your
      fstream I/O operations will be threadsafe at the lowest
      level.  For higher-level operations, such as manipulating the data
      contained in the stream formatting classes (e.g., setting up callbacks
      inside an std::ofstream), you need to guard such accesses
      like any other critical shared resource.
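
A minimal sketch of guarding a shared stream, under stated assumptions: log_mutex, write_line and count_lines_written are hypothetical names, and a std::ostringstream stands in for the shared std::ofstream so that the result can be inspected in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>

std::mutex log_mutex;  // hypothetical lock protecting the shared stream

// Writes one complete line. The lock covers the stream's formatting state
// as well as the output itself, so lines from different threads stay whole.
void write_line(std::ostream& out, const std::string& tag, int value) {
    std::lock_guard<std::mutex> lock(log_mutex);
    out << tag << '=' << value << '\n';
}

// Two threads write 100 lines each; returns the number of lines produced.
std::size_t count_lines_written() {
    std::ostringstream shared_stream;  // stands in for a shared std::ofstream
    auto worker = [&shared_stream](const char* tag) {
        for (int i = 0; i < 100; ++i) write_line(shared_stream, tag, i);
    };
    std::thread t1(worker, "a");
    std::thread t2(worker, "b");
    t1.join();
    t2.join();
    const std::string text = shared_stream.str();
    return static_cast<std::size_t>(std::count(text.begin(), text.end(), '\n'));
}
```

Because write_line takes a std::ostream&, the same function could be handed a std::ofstream; the lock is needed in either case since the stream object itself is shared.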
   
A second choice may be available for I/O implementations: libio. This is disabled by default, and in fact will not currently work due to other issues. It will be revisited, however.
The libio code is a subset of the guts of the GNU libc (glibc) I/O
      implementation.  When libio is in use, the __basic_file
      type is basically derived from FILE.  (The real situation is more
      complex than that... it's derived from an internal type used to
      implement FILE.  See libio/libioP.h to see scary things done with
      vtbls.)  The result is that there is no "layer" of C stdio
      to go through; the filebuf makes calls directly into the same
      functions used to implement fread, fwrite,
      and so forth, using internal data structures.  (And when I say
      "makes calls directly," I mean the function is literally
      replaced by a jump into an internal function.  Fast but frightening.
      *grin*)
   
Also, the libio internal locks are used. This requires pulling in large chunks of glibc, such as a pthreads implementation, and is one of the issues preventing widespread use of libio as the libstdc++ cstdio implementation.
But we plan to make this work, at least as an option if not a future default. Platforms running a copy of glibc with a recent-enough version will see calls from libstdc++ directly into the glibc already installed. For other platforms, a copy of the libio subsection will be built and included in libstdc++.
This section discusses issues surrounding the design of multithreaded applications which use Standard C++ containers. All information in this section is current as of the gcc 3.0 release and all later point releases. Although earlier gcc releases had a different approach to threading configuration and proper compilation, the basic code design rules presented here were similar. For information on all other aspects of multithreading as it relates to libstdc++, including details on the proper compilation of threaded code (and compatibility between threaded and non-threaded code), see Chapter 17.
Two excellent pages to read when working with the Standard C++ containers and threads are SGI's http://www.sgi.com/tech/stl/thread_safety.html and SGI's http://www.sgi.com/tech/stl/Allocators.html.
However, please ignore all discussions about the user-level configuration of the lock implementation inside the STL container-memory allocator on those pages. For the sake of this discussion, libstdc++ configures the SGI STL implementation, not you. This is quite different from how gcc pre-3.0 worked. In particular, past advice was for people using g++ to explicitly define _PTHREADS or other macros or port-specific compilation options on the command line to get a thread-safe STL. This is no longer required for any port and should no longer be done unless you really know what you are doing and assume all responsibility.
Since the container implementation of libstdc++ uses the SGI code, we use the same definition of thread safety as SGI when discussing design. A key point that beginners may miss is the fourth major paragraph of the first page mentioned above (For most clients...), which points out that locking must nearly always be done outside the container, by client code (that'd be you, not us). There is a notable exception to this rule: allocators called while a container or element is constructed use an internal lock obtained and released solely within libstdc++ code (in fact, this is the reason the STL requires any knowledge of the thread configuration).
For implementing a container which does its own locking, it is trivial to provide a wrapper class which obtains the lock (as SGI suggests), performs the container operation, and then releases the lock. This could be templatized to a certain extent, on the underlying container and/or a locking mechanism. Trying to provide a catch-all general template solution would probably be more trouble than it's worth.
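One possible shape for such a wrapper is sketched below. It is not part of libstdc++; locked_container, with_lock and fill_from_two_threads are hypothetical names, and the template is parameterized on the underlying container and the lock type as the paragraph above suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical wrapper: obtains the lock, performs the container operation,
// then releases the lock when the guard goes out of scope.
template <typename Container, typename Mutex = std::mutex>
class locked_container {
public:
    // Runs f(container) while holding the lock and returns f's result.
    template <typename F>
    auto with_lock(F f) -> decltype(f(std::declval<Container&>())) {
        std::lock_guard<Mutex> lock(mutex_);
        return f(container_);
    }

private:
    Container container_;
    Mutex mutex_;
};

// Two threads append 500 values each through the wrapper.
std::size_t fill_from_two_threads() {
    locked_container<std::vector<int>> numbers;
    auto writer = [&numbers](int base) {
        for (int i = 0; i < 500; ++i)
            numbers.with_lock([=](std::vector<int>& v) { v.push_back(base + i); });
    };
    std::thread t1(writer, 0);
    std::thread t2(writer, 500);
    t1.join();
    t2.join();
    return numbers.with_lock([](std::vector<int>& v) { return v.size(); });
}
```

Passing the operation as a callable keeps the lock scope explicit and avoids handing out iterators or references that would outlive the lock, which is one of the reasons a catch-all wrapper is hard to get right.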
The library implementation may be configured to use the high-speed caching memory allocator, which complicates thread safety issues. For all details about how to globally override this at application run-time see here. Also useful are details on allocator options and capabilities.