[Kernel] Analyzing Linux kernel Oops reports easily…

In Linux, a kernel crash usually comes with a so-called Oops report.
This Oops report includes lots of useful information for debugging.

Even so, I have seen lots of developers struggle to analyze the register, memory and stack dumps in the report.
To analyze the dumped information, the memory contents have to be matched with the source code.
But this is not an easy process.
So, I want to introduce a way to do this easily.

The main concept is that we can make a tool that parses the Oops report and passes the result to a debugging tool.
Here is how I did it in my case.
I use the TRACE32 software (its ARM simulator) as the debugging tool.
What I did was implement a simple Perl script that parses the Oops report and generates a cmm file that sets the register and memory values given by the report.
For example, the auto-generated cmm file looks like this.

R.S cpsr 0x20000013
R.S r0 0x0
R.S r1 0x0
...
D.S 0xc035a248 %long 0xe3a02000
D.S 0xc035a24c %long 0xe3a03020
...
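
The Perl script itself is not shared here, so the following is not that script. It is only a minimal C sketch of the register half of the idea. It assumes the register lines look like the dump printed by 32-bit ARM kernels ("r3 : c0a1b2c3  r2 : 00000000 ..." and "pc : [<c035a248>]    lr : [<c035a2a0>]    psr: 20000013"). The function names and the mapping of fp/ip/sp/lr/psr to r11/r12/r13/r14/cpsr are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Map Oops register names to the names written into the cmm file.
 * This alias table is an assumption of the sketch, not taken from the post. */
static const char *cmm_reg_name(const char *oops_name)
{
    static const char *const map[][2] = {
        { "fp", "r11" }, { "ip", "r12" }, { "sp", "r13" },
        { "lr", "r14" }, { "psr", "cpsr" },
    };
    size_t i;

    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++)
        if (strcmp(oops_name, map[i][0]) == 0)
            return map[i][1];
    return oops_name;
}

/* Convert one register line of an Oops dump into "R.S <reg> 0x<value>" lines.
 * 'out' receives the commands; the return value is the number of commands. */
int oops_regs_to_cmm(const char *line, char *out, size_t outsz)
{
    size_t used = 0;
    int count = 0;

    assert(line != NULL && out != NULL && outsz > 0);
    out[0] = '\0';

    while (*line != '\0') {
        char name[8];
        unsigned long value = 0;
        int consumed = 0;
        int written;

        /* Try the bracketed form "pc : [<c035a248>]" first ... */
        if (sscanf(line, " %7[a-z0-9] : [<%lx>]%n", name, &value, &consumed) != 2
            || consumed == 0) {
            /* ... then the plain form "r0 : 00000000". */
            consumed = 0;
            if (sscanf(line, " %7[a-z0-9] : %lx%n", name, &value, &consumed) != 2
                || consumed == 0)
                break; /* no further "name : value" pair on this line */
        }

        written = snprintf(out + used, outsz - used, "R.S %s 0x%lx\n",
                           cmm_reg_name(name), value);
        if (written < 0 || (size_t)written >= outsz - used) {
            out[used] = '\0'; /* buffer full: drop the partial command */
            break;
        }
        used += (size_t)written;
        line += consumed;
        count++;
    }
    return count;
}
```

A real tool would call this for every line of the saved Oops log (for example with fgets) and write the result to the cmm file. The "D.S" memory lines would need a similar routine for the stack and code dumps.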

Now it’s time to use TRACE32 PowerView for ARM to analyze the report.
Launch t32marm in simulator mode -> Load the ‘vmlinux’ of the build that crashed -> Run the auto-generated cmm script
Finally, I can see the stack frames with local variables and the interpreted memory contents, by virtue of T32.

I’m sorry that I can’t share the parsing tool (the Perl script) due to … as you know, something like legal issues… (I hate this.)
I hope this post is helpful to others…

[C/C++] Getting return address…

Very simple… I’m just writing it down here as a reminder for myself.

=== ARM(32bit) - RVCT ===
/* #pragma O0 <- may be required; otherwise 'ra' can be optimized out */
/* note: armcc also provides the __return_address() intrinsic for this */
{ /* Just Scope */
    void* ra;
    /* register lr(r14) holds the return address */
    __asm
    { mov ra, lr }
    /* now variable 'ra' has the return address */
}

=== x86(32bit) - GCC ===
{ /* Just Scope */
    register void* ra; /* return address */
    /* return address is stored 4 bytes above 'ebp' (this assumes the
       function keeps a frame pointer, i.e. no -fomit-frame-pointer) */
    asm ("movl 4(%%ebp), %0;"
         :"=r"(ra));
    /* now variable 'ra' has the return address */
}
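
With GCC (and Clang), the same value can also be read without inline assembly, through the __builtin_return_address builtin. A minimal sketch follows; the function names get_return_address and example_caller are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* noinline: keep this a real call, so "the caller" is well defined. */
__attribute__((noinline))
void *get_return_address(void)
{
    /* Level 0 means: the return address of the current function. */
    return __builtin_return_address(0);
}

/* Hypothetical caller: 'ra' points into this function, just after the call. */
void example_caller(void)
{
    void *ra = get_return_address();

    assert(ra != NULL);
    printf("return address: %p\n", ra);
}
```

The printed address differs per build and per run, so it is best resolved against the symbol table (for example with addr2line) rather than compared to a fixed value.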

[Android] Classify unexpected cases

Term definitions :

Reboot : Real reboot. System restarts from ‘boot logo’.
Soft reboot : System restarts from ‘Power on animation’. (NOT from ‘Boot logo’.)

In user mode

On Dalvik

ANR (Application Not Responding)

* Event dispatching timeout : 5 sec.
=> A window can’t dispatch an event for more than 5 sec.
* Broadcast timeout : 10 sec.
=> A broadcast receiver can’t complete its work within 10 sec.
* Service timeout : 20 sec.
=> A service can’t finish its job within 20 sec.

Hang up

* There is no UI response, even though the window dispatched the events.
* The framework loses track of the window that the events should be passed to. (This is not impossible; it does happen.)
=> Note! : We should wait more than 5 sec. to tell whether it’s a hang-up or an ANR. (See ‘Event dispatching timeout’)

FC (Force Close)

* Unexpected stop. Ex. null object access, exceptions.

Return from ‘main’ function.

* It’s a normal end of the process if it is intentional. But in some cases, we may come across an unexpected return from ‘main’. From the system’s point of view, it’s just a normal exit of the process. So, no error message is left.

In native code

Exception (ex. data abort etc.)

* The kernel sends the appropriate signal (ex. SIGSEGV) to the user process, and the user process may be killed. In this case, the process is killed by the signal from the kernel. So, we cannot expect UI feedback, only low-level logs (ex. tombstone).
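
To illustrate how such a signal reaches a user process, here is a minimal C sketch of a process installing its own SIGSEGV handler with sigaction. It is not Android’s crash handling code. All names are hypothetical, and the demo delivers the signal with raise() instead of triggering a real memory fault.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Last signal seen by the handler; sig_atomic_t is safe to write in a handler. */
static volatile sig_atomic_t last_fatal_signal = 0;

static void on_fatal_signal(int signo, siginfo_t *info, void *context)
{
    (void)info;    /* for a real fault, info->si_addr holds the faulting address */
    (void)context;
    last_fatal_signal = signo;
}

/* Install the handler for SIGSEGV; returns 0 on success, -1 on failure. */
int install_segv_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_fatal_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Demo: deliver a synthetic SIGSEGV to ourselves and check the handler ran. */
void demo_catch_segv(void)
{
    int rc = install_segv_handler();

    assert(rc == 0);
    raise(SIGSEGV); /* synthetic signal, not a real memory fault */
    assert(last_fatal_signal == SIGSEGV);
}
```

Without such a handler the default action for SIGSEGV terminates the process, which is the "killed by the signal" case described above.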

In privileged mode

Exception : Reboot. (Kernel ‘oops’ is executed before the reboot. So, we may be able to find the corresponding kernel log/panic data.)

Other important and notable cases.

system_process :

* An unexpected stop of this process may cause a system soft reboot. (This should be distinguished from a reboot caused by a kernel exception.)

home / launcher

* An unexpected stop causes a one-time flicker of the home screen while home restarts. Let’s imagine the following scenario.
-> Home enters an unexpected busy state (it cannot dispatch user input during this busy time). After 3~4 sec., there is an exception and home is stopped -> restarted.
In this case, it may look like just a ‘Smooth Reset’!