Tuesday, November 11, 2008

Debugging a booting problem

In this post, I will explain about how I debugged a problem in which our development board was taking a lot of time to completly boot up into Windows CE.

The usual bootup time for the development board was around 30-35 seconds, but as we were adding new drivers , the bootup time went up by minutes. The booting was really slow. So I took up this issue.

As i analyzed the booting up process, it was clear that the Boot phase 1 (loading of FMD) was getting over without any hickups. And slow down seems to have been occuring while drivers are getting loaded and initialized.

We can follow two strategies for pin pointing this issue:

Remove each driver one-by-one.
If we are not clear with the source of the problem, then individually eliminating each component one-by-one and observing the behaviour is a straight forward but time consuming way.

Using Celog
In one of my earlier posts, I have mentioned about a cab file that I created with all the celog settings and files. I installed this file on to the development board and configured the registry to launch 'CeLogFlush.exe' during bootup. Also Celog was configured to dump the data into a local file.

"Launch05"="CeLogFlush.exe" -- include celog.dll

"BufferSize"=dword:100000 // 1MB
"FlushTimeout"=dword:1388 // 5 seconds
"FileFlags"=dword:1 // Keep file open
"ThreadPriority"=dword:C8 // Priority increased to 200
After the complete system was up, I killed the CeLogFlush.exe using ITaskMgr (a really good task manager application) and took the Celog.clg from the device. This output file will contain only the thread IDs, we need to fix this up using readlog (http://devwince.blogspot.com/2008/10/fixing-thread-names-in-celog-output.html).

Now we can view this file in Windows CE Remote Kernel tracker.


According the Celog data, the booting process went normally till the 7th second since celogflush was started and all the drivers till that point was loaded and threads have been started (that means the XXX_Init routine was completed successfully). But after that the next thread start only at the 52 seconds mark, which was the battery monitor thread. And according the driver loading order in registry, the drivers were the power button monitor and battery driver.


When I analyzed the XXX_Init of battery driver, I found out that there was a t_printf statement in the init routine. According to the MSDN documentation t_printf will dump the data onto a console, which is again a driver and is not started yet. So when I removed the t_printf statements, the booting speed was back to normal.