Reading the glibc exit source code

Calling exit in glibc

The implementation of exit can be found in exit.c.

void
exit (int status)
{
  __run_exit_handlers (status, &__exit_funcs, true, true);
}
libc_hidden_def (exit)

Calling glibc's exit amounts to calling __run_exit_handlers, so let's look at how __run_exit_handlers is implemented.

First, its declaration:

void
attribute_hidden
__run_exit_handlers (int status, struct exit_function_list **listp,
                     bool run_list_atexit, bool run_dtors)

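From this we can see that when exit is called, run_list_atexit and run_dtors are both set to true, and listp (of type struct exit_function_list **) is set to &__exit_funcs.

Phase 1

When this function runs, it first checks run_dtors and then calls __call_tls_dtors.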

  /* First, call the TLS destructors.  */
#ifndef SHARED
  if (&__call_tls_dtors != NULL)
#endif
    if (run_dtors)
      __call_tls_dtors ();

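What is TLS? Some material online describes it as a communication protocol, but reading __call_tls_dtors shows that TLS here does not mean the TLS protocol (Transport Layer Security). It stands for thread-local storage; the article 《TLS–线程局部存储》 ("TLS: Thread-Local Storage") gives a rough overview. A TLS variable is declared like a global variable, but every thread has its own copy, so a thread that modifies its TLS variable never affects the copies seen by other threads. This is loosely similar to copy-on-write (COW) in that each party ends up with a private copy, although a TLS copy is not created in response to a write.

What does __call_tls_dtors do? The comment is fairly clear: it calls the TLS destructors, which destroy the TLS variables declared with thread_local.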

/* Call the destructors.  This is called either when a thread returns from the
   initial function or when the process exits via the exit function.  */
void
__call_tls_dtors (void)
{
  while (tls_dtor_list)
    {
      struct dtor_list *cur = tls_dtor_list;
      dtor_func func = cur->func;
#ifdef PTR_DEMANGLE
      PTR_DEMANGLE (func);
#endif
      tls_dtor_list = tls_dtor_list->next;
      func (cur->obj);
      /* Ensure that the MAP dereference happens before
         l_tls_dtor_count decrement.  That way, we protect this access from a
         potential DSO unload in _dl_close_worker, which happens when
         l_tls_dtor_count is 0.  See CONCURRENCY NOTES for more detail.  */
      atomic_fetch_add_release (&cur->map->l_tls_dtor_count, -1);
      free (cur);
    }
}

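The TLS destructors are called through the linked list tls_dtor_list (declared per thread in the glibc source). When is tls_dtor_list populated? By the function __cxa_thread_atexit_impl below, which is only called from code generated by the C++ compiler. (When exactly the compiler emits that call is not traced here.)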

/* Register a destructor for TLS variables declared with the 'thread_local'
   keyword.  This function is only called from code generated by the C++
   compiler.  FUNC is the destructor function and OBJ is the object to be
   passed to the destructor.  DSO_SYMBOL is the __dso_handle symbol that each
   DSO has at a unique address in its map, added from crtbegin.o during the
   linking phase.  */
int
__cxa_thread_atexit_impl (dtor_func func, void *obj, void *dso_symbol)

To summarize: when glibc's exit is called, it first destroys the TLS variables whose destructors were registered on tls_dtor_list.
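The pop-then-call loop in __call_tls_dtors can be modelled in a few lines of plain C. This is only a simplified sketch: struct dtor_node, register_dtor, call_dtors, record, and demo_dtor_order are hypothetical names, pointer mangling and the l_tls_dtor_count bookkeeping are left out, and the sketch assumes registration pushes new entries onto the head of the list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for glibc's struct dtor_list.  */
typedef void (*dtor_func) (void *);

struct dtor_node
{
  dtor_func func;
  void *obj;
  struct dtor_node *next;
};

/* One list per thread, mirroring the per-thread tls_dtor_list.  */
static _Thread_local struct dtor_node *dtor_list;

/* Model of registration: push a new node onto the head of the list.  */
static int
register_dtor (dtor_func func, void *obj)
{
  struct dtor_node *node = malloc (sizeof *node);
  if (node == NULL)
    return -1;
  node->func = func;
  node->obj = obj;
  node->next = dtor_list;
  dtor_list = node;
  return 0;
}

/* Model of __call_tls_dtors: unlink the head, then call it.  */
static void
call_dtors (void)
{
  while (dtor_list != NULL)
    {
      struct dtor_node *cur = dtor_list;
      dtor_list = cur->next;    /* unlink before calling the destructor */
      cur->func (cur->obj);
      free (cur);
    }
}

/* Example destructor: appends the object's tag to a log.  */
static char order[8];
static int order_len;

static void
record (void *obj)
{
  assert (order_len < (int) sizeof order - 1);
  order[order_len++] = *(char *) obj;
}

/* Registers a, b, c in that order and returns the destruction order.  */
static const char *
demo_dtor_order (void)
{
  static char a = 'a', b = 'b', c = 'c';
  order_len = 0;
  if (register_dtor (record, &a) != 0
      || register_dtor (record, &b) != 0
      || register_dtor (record, &c) != 0)
    return NULL;
  call_dtors ();
  order[order_len] = '\0';
  return order;
}
```

Calling demo_dtor_order from a small main yields the string "cba": objects are destroyed in the reverse order of registration, as expected for a list that is pushed and popped at the head.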

Phase 2

Next it processes listp, calling a different function depending on the flavor of each entry.

while (true)
  {
  restart:
    cur = *listp;
    while (cur->idx > 0)
      {
        struct exit_function *const f = &cur->fns[--cur->idx];
        const uint64_t new_exitfn_called = __new_exitfn_called;
        /* Unlock the list while we call a foreign function.  */
        __libc_lock_unlock (__exit_funcs_lock);
        switch (f->flavor)
          {
            //...
          case ef_free:
          case ef_us:
            break;
          case ef_on:
            onfct = f->func.on.fn;
#ifdef PTR_DEMANGLE
            PTR_DEMANGLE (onfct);
#endif
            onfct (status, f->func.on.arg);
            break;
          case ef_at:
            atfct = f->func.at;
#ifdef PTR_DEMANGLE
            PTR_DEMANGLE (atfct);
#endif
            atfct ();
            break;
          case ef_cxa:
            /* To avoid dlclose/exit race calling cxafct twice (BZ 22180),
                we must mark this function as ef_free.  */
            f->flavor = ef_free;
            cxafct = f->func.cxa.fn;
#ifdef PTR_DEMANGLE
            PTR_DEMANGLE (cxafct);
#endif
            cxafct (f->func.cxa.arg, status);
            break;
          }
        /* Re-lock again before looking at global state.  */
        __libc_lock_lock (__exit_funcs_lock);
        if (__glibc_unlikely (new_exitfn_called != __new_exitfn_called))
          /* The last exit function, or another thread, has registered
              more exit functions.  Start the loop over.  */
          goto restart;
      }
      //...
  }

The thing to look at here is __exit_funcs. How does __exit_funcs get populated?

The code that registers entries in __exit_funcs can be found in cxa_atexit.c.

/* Register a function to be called by exit or when a shared library
   is unloaded.  This function is only called from code generated by
   the C++ compiler.  */
int
__cxa_atexit (void (*func) (void *), void *arg, void *d)
{
  return __internal_atexit (func, arg, d, &__exit_funcs);
}
libc_hidden_def (__cxa_atexit)

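As in phase 1, populating __exit_funcs this way is driven by the compiler. Exactly which functions get chained onto this list is not traced here.

There is also a user-facing interface: users can register their own functions to be run at exit. The following snippet is from on_exit.c: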

/* Register a function to be called by exit.  */
int
__on_exit (void (*func) (int status, void *arg), void *arg)
{
  struct exit_function *new;
  /* As a QoI issue we detect NULL early with an assertion instead
     of a SIGSEGV at program exit when the handler is run (bug 20544).  */
  assert (func != NULL);
  __libc_lock_lock (__exit_funcs_lock);
  new = __new_exitfn (&__exit_funcs);
  if (new == NULL)
    {
      __libc_lock_unlock (__exit_funcs_lock);
      return -1;
    }
#ifdef PTR_MANGLE
  PTR_MANGLE (func);
#endif
  new->func.on.fn = func;
  new->func.on.arg = arg;
  new->flavor = ef_on;
  __libc_lock_unlock (__exit_funcs_lock);
  return 0;
}
weak_alias (__on_exit, on_exit)

In short, phase 2 also runs functions, this time the ones registered by the compiler or by the user.
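The user-facing side of phase 2 can be observed with atexit and on_exit. The sketch below forks a child that registers three handlers and calls exit, while the parent reads from a pipe the order in which they ran. The helper names (emit, handler_a, handler_b, handler_c, exit_handler_order) are made up for this example, and on_exit is a glibc extension, so the code assumes a glibc system.

```c
#define _DEFAULT_SOURCE 1   /* on_exit is a glibc extension */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;  /* write end of the pipe, set in the child */

/* Report one character to the parent; give up hard on failure.  */
static void
emit (char c)
{
  if (write (report_fd, &c, 1) != 1)
    _exit (2);
}

static void handler_a (void) { emit ('a'); }
static void handler_b (int status, void *arg) { (void) status; emit (*(char *) arg); }
static void handler_c (void) { emit ('c'); }

/* Fork a child that registers a, b, c in that order and calls exit (0).
   Stores the order in which the handlers ran in OUT and returns the
   child's exit status, or -1 on error.  */
static int
exit_handler_order (char *out, size_t size)
{
  static char b = 'b';
  int fds[2];
  assert (out != NULL && size >= 4);
  fflush (NULL);              /* keep inherited stdio buffers empty */
  if (pipe (fds) != 0)
    return -1;
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      close (fds[0]);
      report_fd = fds[1];
      if (atexit (handler_a) != 0
          || on_exit (handler_b, &b) != 0
          || atexit (handler_c) != 0)
        _exit (2);
      exit (0);               /* runs the registered handlers */
    }
  close (fds[1]);
  size_t len = 0;
  ssize_t n;
  while (len + 1 < size
         && (n = read (fds[0], out + len, size - 1 - len)) > 0)
    len += (size_t) n;
  out[len] = '\0';
  close (fds[0]);
  int wstatus;
  if (waitpid (pid, &wstatus, 0) != pid || !WIFEXITED (wstatus))
    return -1;
  return WEXITSTATUS (wstatus);
}
```

Called with a buffer of at least 4 bytes, the function fills it with "cba": the loop shown earlier walks fns with --cur->idx, so the most recently registered handler runs first, regardless of whether it was registered through atexit or on_exit.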

Phase 3

Finally, it checks run_list_atexit, runs the __libc_atexit hooks, and calls _exit.

if (run_list_atexit)
  RUN_HOOK (__libc_atexit, ());
_exit (status);

Two functions matter here: __libc_atexit and _exit.

The definition of __libc_atexit can be found in genops.c.

text_set_element(__libc_atexit, _IO_cleanup);

So __libc_atexit is bound to a function called _IO_cleanup, which suggests that some I/O cleanup happens at this point.

int
_IO_cleanup (void)
{
  /* We do *not* want locking.  Some threads might use streams but
     that is their problem, we flush them underneath them.  */
  int result = _IO_flush_all_lockp (0);
  /* We currently don't have a reliable mechanism for making sure that
     C++ static destructors are executed in the correct order.
     So it is possible that other static destructors might want to
     write to cout - and they're supposed to be able to do so.
     The following will make the standard streambufs be unbuffered,
     which forces any output from late destructors to be written out. */
  _IO_unbuffer_all ();
  return result;
}

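What is I/O cleanup? The standard glibc stream functions are normally buffered, for example when reading or writing files or standard input/output. Because I/O is much slower than the CPU, data is collected in a buffer, and the actual write or read is triggered when the buffer fills up or when something requests it explicitly.

Calling exit is equivalent to manually flushing these buffers.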

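_exit is the system call that actually terminates the process.

The _exit system call

The source of _exit is roughly in _exit.S, but it is hard to follow... The man7 description is a better place to look: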

_exit() terminates the calling process “immediately”. Any open file descriptors belonging to the process are closed. Any children of the process are inherited by init(1) (or by the nearest “subreaper” process as defined through the use of the prctl(2) PR_SET_CHILD_SUBREAPER operation). The process’s parent is sent a SIGCHLD signal.

The value status & 0xFF is returned to the parent process as the process’s exit status, and can be collected by the parent using one of the wait(2) family of calls.

The function _Exit() is equivalent to _exit().

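In other words:

  1. _exit terminates the calling process immediately.
  2. All open file descriptors belonging to the process are closed.
  3. All children of the process are handed over to the init process (or to the nearest subreaper); see the example in 《进程控制和通信(一) · 进程控制》.
  4. SIGCHLD is sent to the parent of the process.
  5. The status argument of _exit (more precisely, status & 0xFF) is returned to the parent, which can collect it with the wait family of functions.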
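The last point, that only status & 0xFF reaches the parent, can be checked with fork and waitpid. This is a minimal sketch; observed_exit_status is a name invented for the example.

```c
#include <assert.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that calls _exit (STATUS) and return the exit status
   the parent observes through waitpid, or -1 on error.  */
static int
observed_exit_status (int status)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (status);             /* child: terminate immediately */
  int wstatus;
  if (waitpid (pid, &wstatus, 0) != pid)
    return -1;
  assert (WIFEXITED (wstatus)); /* the child exited normally */
  return WEXITSTATUS (wstatus);
}
```

For example, observed_exit_status (263) returns 7, because 263 & 0xFF is 7, and observed_exit_status (256) returns 0.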

In glibc up to version 2.3, the _exit() wrapper function invoked the kernel system call of the same name. Since glibc 2.3, the wrapper function invokes exit_group(2), in order to terminate all of the threads in a process.

The raw _exit() system call terminates only the calling thread, and actions such as reparenting child processes or sending SIGCHLD to the parent process are performed only if this is the last thread in the thread group.

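The _exit that glibc calls is mapped to exit_group, and exit_group terminates all threads of the process. This is tied to the thread group ID, which was already covered in 《进程控制和通信(四) · PCB介绍》: on current multitasking Linux systems, the process ID refers to tgid (thread group id) in task_struct, while the thread ID refers to pid (process id). The naming differs slightly, mainly for compatibility.

The raw _exit system call terminates only the calling thread, and the actions above, such as sending SIGCHLD, happen only if that thread is the last thread of the process.

The difference between return and exit

Stack frames

First, some code that uses return: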

void func1() {
    return;
}

int func2() {
    return 1;
}

int func3(int v) {
    v++;
    return v;
}

int main() {
    func1();
    func2();
    func3(1);

    return 1;
}

func1 compiles to the following assembly:

 push   rbp
 mov    rbp,rsp
 pop    rbp
 ret
 cs nop WORD PTR [rax+rax*1+0x0]

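On entry, the function saves the caller's frame pointer rbp and then copies the current stack pointer into the frame pointer register rbp. On exit, it pops the caller's frame pointer back into rbp.

func2 compiles to the following assembly: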

 push   rbp
 mov    rbp,rsp
 mov    eax,0x1
 pop    rbp
 ret
 nop    DWORD PTR [rax+rax*1+0x0]

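On entry, the function saves the caller's frame pointer rbp and then copies the current stack pointer into the frame pointer register rbp. On exit, it places the return value in eax and then pops the caller's frame pointer back into rbp.

func3 compiles to the following assembly: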

 push   rbp
 mov    rbp,rsp
 mov    DWORD PTR [rbp-0x4],edi
 mov    eax,DWORD PTR [rbp-0x4]
 add    eax,0x1
 mov    DWORD PTR [rbp-0x4],eax
 mov    eax,DWORD PTR [rbp-0x4]
 pop    rbp
 ret
 cs nop WORD PTR [rax+rax*1+0x0]
 nop

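On entry, the function saves the caller's frame pointer rbp, copies the current stack pointer into rbp, and stores the argument (passed in edi) at [rbp-0x4]; there is only one int parameter, so it occupies the first 4 bytes below rbp. In the body, the value at [rbp-0x4] is loaded into the accumulator eax, eax is incremented by 1, and the result is stored back to [rbp-0x4]. On exit, the value is loaded from [rbp-0x4] into eax as the return value, and the caller's frame pointer is popped back into rbp.

So what is return?

return stores the return value in a register and then restores the caller's stack frame, which corresponds to a move followed by a pop (plus the ret instruction that jumps back to the caller). This is not a detailed treatment, but it is enough for the question at hand and fills a gap along the way (TODO: the full stack frame mechanics; for a rough picture, see 《《UCB CS61a SICP Python 中文》一周目笔记(一)》). This section also draws on: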

  1. 《x86-64 下函数调用及栈帧原理》
  2. 《手撕虚拟内存(8)——函数栈桢原理》

So one difference between return and exit: return handles leaving a stack frame, while exit handles terminating the program/process.
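The contrast can be made concrete with a nested call. In this sketch (all function names are hypothetical), inner_return hands control back to outer, which then adds 1, whereas inner_exit terminates the process so the rest of outer never runs. The experiment runs in a forked child so that the caller survives.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* return: leaves this frame and resumes the caller.  */
static int inner_return (void) { return 3; }

/* exit: terminates the whole process, never resumes the caller.  */
static int inner_exit (void) { exit (3); }

static int
outer (int use_exit)
{
  int v = use_exit ? inner_exit () : inner_return ();
  return v + 1;                 /* only reached after inner_return */
}

/* Run outer in a child and return the child's exit status.  */
static int
child_status (int use_exit)
{
  fflush (NULL);                /* keep inherited stdio buffers empty */
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    _exit (outer (use_exit));
  int wstatus;
  if (waitpid (pid, &wstatus, 0) != pid)
    return -1;
  assert (WIFEXITED (wstatus));
  return WEXITSTATUS (wstatus);
}
```

child_status (0) returns 4 because outer completed, while child_status (1) returns 3 because exit ended the process from inside inner_exit.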

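The main function

The difference between return and exit raises another question: is a return from main the same as the process exiting? To answer that, we first need to know how main gets executed.

Where is the program's entry point? Is it main?

No. The entry point is _start, the entry symbol provided by glibc's startup code; see what-is-the-use-of-start-in-c.

The entry point is defined in start.S; the code is not reproduced here. _start begins with some initialization, such as setting up the stack frame (the rest is hard to follow), and ends by calling a function, call *__libc_start_main@GOTPCREL(%rip), which is defined in libc-start.c. __libc_start_main does some preparation, such as collecting the arguments argc and argv, then calls the user-defined main function, and finally calls exit with main's return value.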

/* Nothing fancy, just call the function.  */
result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
exit (result);

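A few takeaways:

  1. In main, using return and calling exit make little difference.
  2. Normally main should return 0 (C conventionally uses 0 for success) rather than 1 (which I tend to return most of the time).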
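The first takeaway can be illustrated with a toy model of the last step of __libc_start_main. Everything here is a stand-in: main_like plays the role of a user's main, start_like imitates only the "call main, then exit with its result" step, and status_after_main_returns runs the model in a forked child.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a user's main that finishes with "return 5".  */
static int
main_like (int argc, char **argv)
{
  (void) argc;
  (void) argv;
  return 5;
}

/* Hypothetical model of the final step of __libc_start_main.  */
static void
start_like (int (*main_fn) (int, char **))
{
  char arg0[] = "demo";
  char *argv[] = { arg0, NULL };
  exit (main_fn (1, argv));     /* main's return value becomes the status */
}

/* Run the model in a child and return the status the parent sees.  */
static int
status_after_main_returns (void)
{
  fflush (NULL);                /* keep inherited stdio buffers empty */
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      start_like (main_like);
      _exit (99);               /* not reached: start_like calls exit */
    }
  int wstatus;
  if (waitpid (pid, &wstatus, 0) != pid)
    return -1;
  assert (WIFEXITED (wstatus));
  return WEXITSTATUS (wstatus);
}
```

status_after_main_returns () returns 5: the value returned by the main stand-in ends up as the process exit status, exactly as if exit (5) had been called directly.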