Book: Writing Windows WDM Device Drivers

Debugging


Win32 user mode programs run within the protective arms of the operating system. For example, the system stops them from accessing memory that is not in their own address space. In Windows 98, some of the kernel is mapped into the upper addresses of each process, so a user mode program could trample on the kernel. The protection offered by NT and Windows 2000 is better. If a user mode application runs into problems, it is usually possible to stop the program and carry on with other work.

A device driver has none of this protection because it is part of the kernel. Be especially careful to test your driver fully or you may lose data. Be prepared for the worst on your own development computer by making regular backups.

Although problems with a driver can be a real pain, they can eventually give you a greater understanding of what you are doing. You have to inspect each call or section of code carefully. Rather than just following the instructions given here, you may eventually really know what is going on, or even, heaven forbid, why the system was designed to work the way it does. You may have suggestions for how drivers or the system might be improved.

How Do Things Go Wrong?

Windows 98 and Windows 2000 sometimes react differently to errors in drivers.

Crashes

A fatal error in NT or W2000 causes the "blue screen of death", properly called a "bugcheck". You must reset the computer to continue. The blue screen gives an indication of the error, and usually a list of the kernel modules and a stack trace. More details of how to decode this information are given later. Annoying though these bugchecks are, they do stop further damage being done to the system, such as corrupted disks.

NT and W2000 usually log a "Save Dump" event for a bugcheck, listing the most pertinent information. An event may not be logged if the bugcheck occurred at boot time, before the event log service has started.

In W98, a similar blue screen appears for fatal errors, again giving a brief description of what went wrong. W98 can usually carry on, but it may be best to restart the PC.

The two most common causes of fatal errors are

• Accessing nonexistent memory

• Accessing paged memory at or above DISPATCH_LEVEL IRQL

Core Dumps

If NT or W2000 has a bugcheck, you can also get it to write a core dump to a file on disk, called memory.dmp by default. You have to enable this option in the Control Panel System applet, on the Advanced tab, in the Startup and Recovery section. You can use the WinDbg and DumpExam tools to inspect the dump. However, I found that using WinDbg in this way was not particularly productive.

In NT and W2000, you can include information in the core dump by calling KeInitializeCallbackRecord and KeRegisterBugCheckCallback. In the event of a bugcheck, your callback routine is called to store any state information in the core dump.
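The registration described above might look roughly like the following kernel mode sketch. The SKETCH_STATE structure, the "SketchDrv" component name and all the Sketch... function names are hypothetical and chosen only for illustration. The sketch assumes that the state and the buffer live in nonpaged global data, because the callback runs during a bugcheck and should not allocate memory, acquire locks or touch pageable memory.

```c
#include <ntddk.h>

/* Hypothetical driver state worth preserving when the system crashes. */
typedef struct SKETCH_STATE {
    ULONG LastIoctl;        /* last control code seen (illustrative field) */
    ULONG OutstandingIrps;  /* IRPs queued but not yet completed (illustrative field) */
} SKETCH_STATE;

static SKETCH_STATE g_State;           /* updated by the driver during normal operation */
static SKETCH_STATE g_BugCheckBuffer;  /* buffer handed to the system at registration */
static KBUGCHECK_CALLBACK_RECORD g_BugCheckRecord;

/* Called by the system during a bugcheck. Keep it minimal:
   just copy the current state into the registered buffer. */
static VOID SketchBugCheckCallback(PVOID Buffer, ULONG Length)
{
    if (Length >= sizeof(SKETCH_STATE)) {
        RtlCopyMemory(Buffer, &g_State, sizeof(SKETCH_STATE));
    }
}

/* Call once, for example from DriverEntry. Returns FALSE if registration fails. */
BOOLEAN SketchRegisterBugCheckCallback(VOID)
{
    KeInitializeCallbackRecord(&g_BugCheckRecord);
    return KeRegisterBugCheckCallback(&g_BugCheckRecord,
                                      SketchBugCheckCallback,
                                      &g_BugCheckBuffer,
                                      sizeof(g_BugCheckBuffer),
                                      (PUCHAR)"SketchDrv");
}

/* Call before the driver unloads, so the system never calls into freed code. */
VOID SketchDeregisterBugCheckCallback(VOID)
{
    KeDeregisterBugCheckCallback(&g_BugCheckRecord);
}
```

Keeping the callback down to a single copy is a deliberate choice: the system is already in trouble when the routine runs, so the less it does, the better.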

Driver Will Not Start

When you update a driver like Wdm1, the Device Manager may state that you need to restart the system before the driver will run. This means that the driver returned an error while loading.

There are two possible causes for such errors, and you can tell them apart by what happens after a reboot. Suppose your driver does start when the system reboots. This means that when it unloads it must be leaving something around that stops it from starting again. A common problem is forgetting to delete your device or the symbolic link to your device.

If your driver will not start when the system reboots, it means that you must have changed your driver so that the DriverEntry routine fails. Trace through the code to work out what is going wrong.

Just to complicate matters, in some circumstances, Windows needs to juggle the revised system resources when a driver is reloaded. In this case, Windows will not even attempt to start your driver if a reboot is necessary to satisfy the resource allocation process.

Hang Ups

A user thread can hang up if you never complete an IRP. The thread will never be able to complete. Have you queued the IRP somewhere and never processed it?

A driver can also hang up if it cannot acquire a resource that it needs. The different resources are described later. If a resource is unavailable, it may mean that just one IRP cannot be processed, or it could mean that all IRP processing grinds to a halt.

A "deadly embrace" occurs if two different pieces of code each hold a resource that the other one needs. For example, code A might hold resource X and want resource Y. If code B holds Y and wants X, then a deadly embrace occurs, hanging the whole system. The simplest solution is to always acquire resources in the same order.
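The "same order" rule can be modelled in plain C that compiles outside the kernel, so the logic can be checked on its own. In this minimal sketch the Resource structure, its rank field and the acquire_pair and release_pair helpers are all hypothetical. In a real driver the resources would be kernel objects such as spin locks or mutexes, and acquiring one would block rather than set a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver resource such as a spin lock or mutex. */
typedef struct Resource {
    int rank;  /* fixed position in the driver-wide acquisition order */
    int held;  /* 1 while acquired, 0 otherwise */
} Resource;

/* Log of the ranks acquired so far, kept only so the order can be inspected. */
int    g_acquire_log[8];
size_t g_acquire_count = 0;

void acquire(Resource *r)
{
    assert(!r->held);  /* acquiring the same resource twice would hang a real driver */
    r->held = 1;
    if (g_acquire_count < 8) {
        g_acquire_log[g_acquire_count++] = r->rank;
    }
}

void release(Resource *r)
{
    assert(r->held);
    r->held = 0;
}

/* Acquire two resources lowest rank first, whatever order the caller names them in. */
void acquire_pair(Resource *a, Resource *b)
{
    assert(a != b && a->rank != b->rank);
    if (a->rank > b->rank) {
        Resource *t = a; a = b; b = t;
    }
    acquire(a);
    acquire(b);
}

/* Release in the reverse of the acquisition order. */
void release_pair(Resource *a, Resource *b)
{
    if (a->rank > b->rank) {
        Resource *t = a; a = b; b = t;
    }
    release(b);
    release(a);
}
```

Code A can call acquire_pair(&x, &y) and code B can call acquire_pair(&y, &x). Both end up taking the lower ranked resource first, so neither can hold one resource while waiting for the other in the opposite order.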

You can also get hang ups if you try to acquire a resource more than once, or if your interrupt service routine never returns. Continual interrupts can seriously degrade the system performance.

Resource Leaks

Resource leaks are less dire, but still important to fix. For example, if you forget to free some memory allocated during each read, the system will eventually run out of memory.
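One common way to avoid forgetting a free on an error path is to route every exit through a single cleanup label. The following is a minimal user mode sketch of that idea, not driver code: do_read, tracked_alloc, tracked_free and the g_live_allocations counter are hypothetical names, and malloc and free stand in for the kernel pool routines such as ExAllocatePoolWithTag and ExFreePool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical leak counter: number of blocks allocated and not yet freed. */
long g_live_allocations = 0;

void *tracked_alloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL) {
        g_live_allocations++;
    }
    return p;
}

void tracked_free(void *p)
{
    if (p != NULL) {
        g_live_allocations--;
        free(p);
    }
}

/* Models one read: stages src_len bytes in a scratch buffer, then copies
   them to dst. Returns 0 on success and -1 on failure. Every path after
   the allocation leaves through the cleanup label. */
int do_read(const char *src, size_t src_len, char *dst, size_t dst_len)
{
    int status = -1;
    char *scratch = tracked_alloc(src_len ? src_len : 1);

    if (scratch == NULL) {
        goto cleanup;
    }
    if (dst_len < src_len) {
        goto cleanup;  /* early failure: scratch is still freed below */
    }
    memcpy(scratch, src, src_len);
    memcpy(dst, scratch, src_len);
    status = 0;

cleanup:
    tracked_free(scratch);  /* safe when scratch is NULL */
    return status;
}
```

A counter like g_live_allocations is cheap to check in a debug build: if it is not back to zero when the driver unloads, something has leaked.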

Some resources are scarcer than others. For example, nonpaged memory cannot be paged out to disk, so it should be used conservatively.

Time Dependencies

Possibly the worst type of problem to sort out is related to timing.

At the simplest level, a timing problem can simply mean that you have not filled a buffer quickly enough. For example, if your hardware needs to output data regularly, you may not have provided it with the data in time. Or did you fill it too quickly? Check that your driver works on computers of different speeds.

Another possible problem is that your driver may not be reading data quickly enough. For example, an isochronous device, such as a microphone, may generate a regular number of samples per second. Check that you can keep reading data at this speed, even in a stressed system. Alternatively, have a strategy for skipping samples (e.g., a call to your device to drop samples).
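One possible skipping strategy, handled in software rather than by a call to the device, is a fixed size ring buffer that discards the oldest sample when a new one arrives and the buffer is full. The sketch below is plain C under stated assumptions: SampleRing, ring_push, ring_pop and the capacity of 4 are hypothetical, and there is no locking. In a driver the producer would typically be an interrupt or DPC routine, so access would need to be synchronized.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAPACITY 4  /* illustrative size only */

typedef struct SampleRing {
    short         data[RING_CAPACITY];
    size_t        head;     /* index of the oldest stored sample */
    size_t        count;    /* number of samples currently stored */
    unsigned long dropped;  /* samples discarded because the reader fell behind */
} SampleRing;

/* Store a new sample. If the ring is full, drop the oldest sample first. */
void ring_push(SampleRing *r, short sample)
{
    assert(r != NULL);
    if (r->count == RING_CAPACITY) {
        r->head = (r->head + 1) % RING_CAPACITY;
        r->count--;
        r->dropped++;
    }
    r->data[(r->head + r->count) % RING_CAPACITY] = sample;
    r->count++;
}

/* Fetch the oldest sample. Returns 1 on success, 0 if the ring is empty. */
int ring_pop(SampleRing *r, short *out)
{
    assert(r != NULL && out != NULL);
    if (r->count == 0) {
        return 0;
    }
    *out = r->data[r->head];
    r->head = (r->head + 1) % RING_CAPACITY;
    r->count--;
    return 1;
}
```

The dropped counter makes the skipping visible: in a stressed system a rising count shows that the reader cannot keep up, instead of samples vanishing silently.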
