Tricky Signal Tracing in the Kernel

Have you ever run into rogue signals between core processes that can bring down a whole system? I have. Signals of unknown origin are a critical issue in a product.

Someone once said, "Better safe than sorry." I like that quote. Even though the software world is chaotic with bugs, we should at least try some approaches to stay safe, especially with signals. They are never logged on their way unless we arrange for it, so they could quietly stab you in the back one day if you are off guard.

This article is about an approach that lets you stay safe from them.

Our software runs on top of other software, which we call the operating system. If we can read its code and modify it as we want, why not put our idea into it? Put more plainly: "hook the code where signals walk inside the kernel".

Practice time!

Let's specify some wishes for our goal. I'm going to keep this simple, so there are only three. For each signal, we want to know:

  1. Sender's name/pid
  2. Signal's number
  3. Receiver's name/pid

Put simply, signal processing has three stages.

  1. A kill syscall, or a signal raised by the kernel
  2. Getting the signal ready
  3. Waking up the target process
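To make the first stage concrete, here is a minimal user-space sketch that generates a signal on demand. The helper name send_test_signal is hypothetical and not part of the kernel patch; it assumes a POSIX system and a signal whose default action terminates the process (SIGTERM, SIGKILL and so on), otherwise the wait would never return.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that only waits for signals, send it
 * `sig` with kill(2), and return the number of the signal that terminated
 * the child. Returns -1 on failure or if the child was not killed by a signal.
 */
static int send_test_signal(int sig)
{
    int status;
    pid_t child = fork();

    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: sleep until a terminating signal arrives. */
        for (;;)
            pause();
    }

    /* Parent: this kill() is stage 1; the kernel takes care of stages 2 and 3. */
    if (kill(child, sig) < 0) {
        kill(child, SIGKILL);        /* do not leave the child behind */
        waitpid(child, &status, 0);
        return -1;
    }

    if (waitpid(child, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

A program that calls send_test_signal(SIGTERM) is a handy way to push one known signal through all three stages while experimenting with a hook inside the kernel.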

We're going to hook at the end of the second stage, because most exceptional cases, such as dereferencing NULL pointers, have already been handled by the time we reach the boundary between the second and third stages.

    /* kernel/signal.c */
    static int __send_signal(int sig,
                             struct kernel_siginfo *info,
                             struct task_struct *t,
                             enum pid_type type,
                             bool force)
    {
      struct sigpending *pending;
      struct sigqueue *q;
      int override_rlimit;
      int ret = 0, result;

      /* ... A LOT OF CODE HERE ... */

      complete_signal(sig, t, type);
    ret:
      trace_signal_generate(sig, info, t, type != PIDTYPE_PID, result);
      return ret;
    }

Here is one of the functions that signals walk through. As the name of the callee, complete_signal, suggests, the candidate hook points are just before and just after it. I pick before.

Now let's write a few lines of code.

    /* kernel/signal.c */
    static int __send_signal(int sig,
                             struct kernel_siginfo *info,
                             struct task_struct *t,
                             enum pid_type type,
                             bool force)
    {
      struct sigpending *pending;
      struct sigqueue *q;
      int override_rlimit;
      int ret = 0, result;

      /* ... A LOT OF CODE HERE ... */

      /* Our code! Skip SIGCHLD and SIGALRM to keep dmesg readable. */
      if (sig != SIGCHLD && sig != SIGALRM && info && t)
      {
        int srcpid = 0;                   /* 0 stands for the kernel */
        char src[TASK_COMM_LEN] = "kernel";
        char dst[TASK_COMM_LEN];
        struct task_struct *cur_task;

        if (!force) {
          /* sender pid as recorded in the siginfo */
          srcpid = info->si_pid;

          /* find_task_by_vpid() must be called under rcu_read_lock() */
          rcu_read_lock();
          cur_task = find_task_by_vpid(srcpid);
          if (cur_task)
            strscpy(src, cur_task->comm, sizeof(src));
          rcu_read_unlock();
        }

        /* strscpy() always NUL-terminates and never reads past the source */
        strscpy(dst, t->comm, sizeof(dst));

        printk(KERN_INFO "Signal :: (%s %d) --[%d]--> (%s %d)\n",
            src, srcpid,
            sig,
            dst, t->pid);
      }

      complete_signal(sig, t, type);
    ret:
      trace_signal_generate(sig, info, t, type != PIDTYPE_PID, result);
      return ret;
    }

Roughly speaking, there are two types of sender, SI_USER and SI_KERNEL, and force is set to false or true respectively. When force is false, we look the sender up by the pid recorded in info. If that lookup finds nothing, or force is true, the sender's name falls back to "kernel". That is how we finally decide the sender's name.

We ignore SIGCHLD, which notifies a parent that its child has exited, and SIGALRM, which the kernel sends when a timer interval expires, to keep them from flooding dmesg.

dmesg

    root@OpenWrt:/# reboot
    [   56.365579] Signal :: (reboot 1286) --[15]--> (procd 1)
    [   56.452796] Signal :: (procd 1) --[15]--> (hostapd 705)
    [   56.460590] Signal :: (procd 1) --[15]--> (wpa_supplicant 706)
    [   56.504500] Signal :: (killall 1350) --[15]--> (dropbear 597)
    [   56.551519] Signal :: (procd 1) --[15]--> (odhcpd 822)
    [   56.599167] Signal :: (procd 1) --[15]--> (logd 430)
    [   56.701690] Signal :: (netifd 766) --[15]--> (udhcpc 1044)
    [   56.728474] br-wan: port 1(eth0) entered disabled state
    [   56.738012] device eth0 left promiscuous mode
    [   56.743576] br-wan: port 1(eth0) entered disabled state
    [   56.769219] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
    [   57.730085] Signal :: (procd 1) --[15]--> (ubusd 141)
    [   57.737611] Signal :: (procd 1) --[15]--> (askfirst 143)
    [   57.745295] Signal :: (procd 1) --[15]--> (urngd 177)
    [   57.752705] Signal :: (procd 1) --[15]--> (brcmf_wdog/mmc1 293)
    [   57.760934] Signal :: (procd 1) --[15]--> (dnsmasq 534)
    [   57.768427] Signal :: (procd 1) --[15]--> (netifd 766)
    [   57.775770] Signal :: (procd 1) --[15]--> (ntpd 1070)
    [   57.782966] Signal :: (procd 1) --[15]--> (sh 1476)
    [   57.790105] Signal :: (ntpd 1070) --[15]--> (ntpd 1070)
    [   57.790712] Signal :: (kernel 0) --[1]--> (askfirst 143)
    [   58.791265] Signal :: (procd 1) --[9]--> (ash 142)
    [   58.799449] Signal :: (kernel 0) --[1]--> (ash 142)
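Once the log gets long, it helps to pull these lines apart mechanically. Here is a minimal user-space sketch that parses one "Signal ::" line with sscanf. The struct signal_event and the function parse_signal_line are hypothetical names of mine, and the sketch assumes process names contain no spaces, which holds for the log above but is not guaranteed in general.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one "Signal ::" line of the log. */
struct signal_event {
    char src[32];   /* sender's name */
    int  src_pid;   /* sender's pid */
    int  sig;       /* signal number */
    char dst[32];   /* receiver's name */
    int  dst_pid;   /* receiver's pid */
};

/*
 * Parse one dmesg line into `ev`.
 * Returns 1 if the line is a well-formed "Signal ::" line, 0 otherwise.
 */
static int parse_signal_line(const char *line, struct signal_event *ev)
{
    /* Skip the "[ timestamp]" prefix by searching for our marker. */
    const char *p = strstr(line, "Signal :: (");

    if (!p)
        return 0;

    /* %31s stops at whitespace, so names with spaces would not parse. */
    return sscanf(p, "Signal :: (%31s %d) --[%d]--> (%31s %d)",
                  ev->src, &ev->src_pid, &ev->sig,
                  ev->dst, &ev->dst_pid) == 5;
}
```

Feeding every line of dmesg through parse_signal_line and keeping the ones that return 1 gives a list of sender, signal and receiver records that can be sorted or counted as needed.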

ftrace and strace

ftrace is really cool. I love it and use it often. But its learning curve is as steep as its flexibility is great, more so than with most tracers. On top of that, it is awkward to use for tracing signals at the early stage of boot or during halt.
strace traces only the processes you attach it to (and, with -f, their children). That means you must point it at every program you'd like to trace. It does NOT cover the whole system.

This strategy might work both for the Linux distros we mainly use on laptops and for embedded systems, but it is more comfortable to test on the latter. It is a rough idea, not one of those bad "this one is better than the others" articles.

I hope you got some inspiration, and I would be happy to get an email if you have better ideas than mine :-)

Stay safe!