Archive for the ‘DDK’ Category

Two additional CSQ rules

Wednesday, November 1st, 2006

Cancel-safe queues are a fantastic addition to Windows from back in the XP timeframe, and I have come to rely on them in my drivers. There are a couple of important extra rules that aren’t reflected in the documentation that you should be aware of, though.

Rule #1

If you read the documentation for IoCsqInsertIrp or IoCsqInsertIrpEx, you’ll find that the routine can be called at <= DISPATCH_LEVEL. While this is true, you cannot call it while holding a spin lock.

To see why this is true, consider the case where a dispatch routine receives an IRP, and before it can queue the IRP, another thread swoops in and cancels it. If this happens, someone should call the cancel routine, and that someone is IoCsqInsertIrp. It eventually calls CsqCompleteCanceledIrp, which calls IoCompleteRequest with a status of STATUS_CANCELLED.

(Note the double-L. Very bad news for a guy who can’t spell in the first place. I’ll be looking that one up for the rest of my life.)

The rest should be obvious: it is in fact illegal to call IoCompleteRequest while holding a spin lock, which is exactly what winds up happening in this case. Therefore, you can’t call IoCsqInsertIrp or IoCsqInsertIrpEx while holding a spin lock.

Rule #2

The second rule is related to when IoCsqInsertIrpEx marks an IRP pending. The rule is simple: If the supplied CsqInsertIrpEx callback returns STATUS_SUCCESS, the IRP is marked pending. Otherwise, it’s not.

In the case of IoCsqInsertIrp, the IRP is unconditionally marked pending. This is even quasi-documented by the WDK cancel sample, but the corresponding startio sample doesn’t say anything about the behavior of IoCsqInsertIrpEx, which I had always assumed was the same based on that comment. It’s not. :-)
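To make the consequence of Rule #2 concrete, here is a small user-mode model of the rule, not real driver code. Everything in it (ModelIrp, ModelStatus, ModelCsqInsertIrpEx, ModelDispatch, and the two callbacks) is a made-up stand-in for illustration, and it ignores cancellation entirely. It only shows the shape of the dispatch logic: return a pending status when the insert succeeded and the IRP was marked pending, and otherwise complete the IRP yourself and return the failure status.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the real NTSTATUS values or IRP layout. */
typedef enum { MODEL_SUCCESS, MODEL_PENDING, MODEL_DEVICE_BUSY } ModelStatus;

typedef struct ModelIrp {
    bool        MarkedPending;     /* models the effect of marking the IRP pending */
    bool        Completed;         /* models a call to IoCompleteRequest */
    ModelStatus CompletionStatus;
} ModelIrp;

typedef ModelStatus (*ModelInsertCallback)(ModelIrp *irp);

/* Models the rule described above: the IRP is marked pending only when
   the driver's insert callback returns success. */
static ModelStatus ModelCsqInsertIrpEx(ModelIrp *irp, ModelInsertCallback insert)
{
    ModelStatus status = insert(irp);
    if (status == MODEL_SUCCESS) {
        irp->MarkedPending = true;
    }
    return status;
}

/* A dispatch routine written with that rule in mind. */
static ModelStatus ModelDispatch(ModelIrp *irp, ModelInsertCallback insert)
{
    ModelStatus status = ModelCsqInsertIrpEx(irp, insert);
    if (status == MODEL_SUCCESS) {
        return MODEL_PENDING;      /* queued and marked pending */
    }
    /* Not queued and not marked pending: complete it here. */
    irp->Completed = true;
    irp->CompletionStatus = status;
    return status;
}

/* Hypothetical insert callbacks: one accepts the IRP, one refuses it. */
static ModelStatus AcceptIrp(ModelIrp *irp) { (void)irp; return MODEL_SUCCESS; }
static ModelStatus RejectIrp(ModelIrp *irp) { (void)irp; return MODEL_DEVICE_BUSY; }
```

With AcceptIrp the model dispatch returns the pending status and leaves the IRP alone; with RejectIrp the IRP is never marked pending, so the dispatch routine completes it and returns the error.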

Neither of these rules shows up in the docs (at least as of now), so hopefully this will save some confusion down the road.

Vista driver verifier enhancements

Tuesday, October 31st, 2006

I just ran across this document that explains the changes present in Vista’s driver verifier. Verifier is one of the Best Things Ever.

Thanks to Dan Mihai from Microsoft for pointing this out on the newsgroups.

Keeping ExInterlocked* operations interlocked

Tuesday, October 31st, 2006

To continue on yesterday’s discussion of interlocked lists, let’s explore the nature of the interlocking done by the ExInterlocked* APIs. The ExInterlockedRemoveHeadList documentation says the following about its spin lock argument:

You must use this spin lock only with the ExInterlockedXxxList routines.

The documentation page provides a hint as to why this is the case:

The ExInterlockedRemoveHeadList routine can be called at any IRQL.

The reason that this function can be called from any IRQL is that the function acquires the spin lock at the highest IRQL in the system. To understand why this is important, we have to examine another kind of race condition – priority inversion deadlocks.

Recall that the kernel operates on a prioritization scheme implemented using IRQLs. Normal tasks run at PASSIVE_LEVEL, device interrupt service routines run at some higher IRQL (called DIRQL), and other tasks happen at various points in between. (See the DDK or any introductory driver book for more information on this.) Drivers typically acquire spin locks at DISPATCH_LEVEL, which is below all DIRQLs.

A priority inversion deadlock can happen if a driver acquires a spin lock at DISPATCH_LEVEL, and while holding that lock, is interrupted by hardware. An interrupt service routine is invoked on behalf of the hardware, and runs at DIRQL. If the ISR tries to acquire that same lock, a deadlock will occur: the ISR will spin forever, waiting for the driver to release the lock, but the driver is stuck suspended until the ISR returns.

With that in mind, let’s come back to the ExInterlocked* functions. Suppose you try to acquire the spin lock at DISPATCH_LEVEL (with KeAcquireSpinLock), perhaps for the purpose of removing an entry from the list. Suppose that your hardware interrupts in the middle of your operation, and your ISR lands on the same CPU you were just operating on. If the ISR then calls something like ExInterlockedInsertHeadList with that same lock, you’ll deadlock. The lower-priority routine will own the lock, and the higher-priority routine will wait forever trying to acquire it.

The solution is to follow the documentation’s advice and always use that spin lock exclusively with ExInterlocked* routines. When you use ExInterlockedInsertHeadList from any routine (not just an ISR), it raises the IRQL to the highest IRQL possible on that CPU, which masks out everything else in your driver – even ISRs. This prevents the priority inversion.

For what it’s worth, the documentation used to say something like “ExInterlocked routines are only interlocked with respect to each other.” The new wording says less but is much clearer in my opinion.

UPDATE: clarified wording to prevent deliberate mis-interpretation.

Get ready for more random crash bugs

Tuesday, October 31st, 2006

For better or worse, Ars is reporting that Gateway will start selling factory-overclocked computers. They’re only overclocking the highest-end systems, and (surprisingly) they seem to be offering a full factory warranty.

Someone asked on one of the newsgroups the other day why overclocking matters. When you run a CPU out of spec, it can fail in various creative ways. There is a bucket for crashes of this type within Microsoft, and !analyze buckets obvious crashes as hardware errors as well. Raymond discusses this a bit more from Microsoft’s perspective.

As a driver guy, I am pretty opposed to this practice. It means that I’m going to wind up getting more crash dumps with difficult-to-debug problems. I put a lot of pressure on my team to not close bugs as INVALID or WORKSFORME, even if the bug is hard to repro or track down. This is obviously going to undermine that effort.

But, it was inevitable, and I’m sure Gateway will be rewarded for their innovation. :-)

Why is there no ExInterlockedRemoveEntryList?

Monday, October 30th, 2006

A long time ago, I promised an entry on why there is no ExInterlockedRemoveEntryList function. If you search the NTDEV archives (or if you got to hear Peter Viscarola from OSR discuss it at one of the Driver DevCons a while back), you know that Microsoft left the function out intentionally due to its potential for misuse.

To understand why this is, consider one of the nice properties of a doubly-linked list: constant-time removal of an item from the middle, if you already know the item’s address. The list entries look something like this:

typedef struct _LIST_ENTRY
{
	struct _LIST_ENTRY *Flink;
	struct _LIST_ENTRY *Blink;
}
LIST_ENTRY, *PLIST_ENTRY;

To do a remove operation, you simply point the previous item’s Flink at the next item and the next item’s Blink at the previous item. No need to walk a long list of items. There’s even a macro to do this for you: RemoveEntryList.

This process is subject to an obvious race condition and another less obvious one. The obvious race is that two different threads could try to mutate the list simultaneously. The naïve solution is to wrap the removal in locks:

LockList();             // Spin lock, mutex, whatever...
RemoveEntryList(item);
UnlockList();

That does indeed prevent two threads from making simultaneous updates, but it misses another important problem: What if the entry you’re trying to remove is no longer on the list? What if another thread has just finished removing the same item, right before your call to LockList above? You certainly have no idea if the item’s neighbors are still valid after the item has been removed from the list, so you could easily trash the list.

The only safe way to do this is to ensure that the item is still a part of the list at the time you remove it. And the only safe way to do that is to walk the list from a point that you know will always be on the list, namely, from the head.

There are, of course, situations in which you can be sure, due to other semantics of the program in question, that your item really is still on the list. In those cases, the pattern above is safe.

But in other situations, you have to walk the entire list. This can be expensive, and has to be done under the protection of whatever lock you’re using. For a list with thousands of entries on it, you would want to avoid this whenever possible, and you should probably try to set up whatever extra bookkeeping you need to take advantage of O(1) removal. But, you don’t get that bookkeeping automatically, so that’s why there’s no ExInterlockedRemoveEntryList.
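Here is a minimal user-mode sketch of the walk-from-the-head approach. It uses a local ListEntry type that mirrors the layout above so it builds outside the kernel, and the helper names (InitList, InsertTail, RemoveIfPresent, CountEntries) are my own for illustration, not DDK functions. It assumes the caller already holds whatever lock protects the list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in with the same layout as LIST_ENTRY. */
typedef struct ListEntry {
    struct ListEntry *Flink;
    struct ListEntry *Blink;
} ListEntry;

/* An empty list is a head that points at itself in both directions. */
static void InitList(ListEntry *head)
{
    head->Flink = head;
    head->Blink = head;
}

static void InsertTail(ListEntry *head, ListEntry *item)
{
    item->Flink = head;
    item->Blink = head->Blink;
    head->Blink->Flink = item;
    head->Blink = item;
}

/* Caller must hold the list lock. Walks from the head and unlinks the
   item only if it is actually found on the list. O(n) in list length. */
static bool RemoveIfPresent(ListEntry *head, ListEntry *item)
{
    for (ListEntry *cur = head->Flink; cur != head; cur = cur->Flink) {
        if (cur == item) {
            cur->Blink->Flink = cur->Flink;
            cur->Flink->Blink = cur->Blink;
            cur->Flink = NULL;   /* poison the stale links */
            cur->Blink = NULL;
            return true;
        }
    }
    return false;                /* someone else already removed it */
}

static size_t CountEntries(const ListEntry *head)
{
    size_t n = 0;
    for (const ListEntry *cur = head->Flink; cur != head; cur = cur->Flink) {
        n++;
    }
    return n;
}
```

The second of two racing removers gets false back instead of scribbling on neighbors that may no longer be valid. The price is the linear walk under the lock, which is exactly the cost discussed above.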

Downlevel support for Winsock Kernel

Monday, October 30th, 2006

David Powell provided me with some insight about the possibility of downlevel support for WSK, now that TDI is being deprecated. He tells me that the WSK team has been getting lots of requests for Windows XP/2003 support lately, and that it’s high on their list of things to do as soon as they get Vista out the door.

As for Windows 2000 support, my impression is that it is pretty unlikely. If this really matters to you, I’d encourage you to follow the link in my previous post to send the WSK team feedback. Such feedback has been effective before.

Vista kd cheat sheet

Wednesday, October 25th, 2006

I am annoyingly scatterbrained and have had to look this up repeatedly this week. Here’s hoping I can save someone else the trouble:

c:\> bcdedit /debug on
c:\> bcdedit /dbgsettings serial debugport:1 baudrate:115200

Then reboot and enjoy.

NDIS6 stack usage

Tuesday, October 24th, 2006

While working with one of our drivers, Ken and I ran across an interesting stack fault that I’ve never seen or heard of on a pre-Vista OS. It turns out that the Vista network stack uses much more stack than previous OSes. Ken has all of the details.

My advice: if you have any sort of NDIS driver that goes above a basic miniport, you should be testing with Vista now. Install an IM driver or two, while you’re at it. NDIS-WDM drivers could have issues here as well.

Happy testing!

TDI is on the path to deprecation

Friday, October 20th, 2006

Mike Flasko posted to the Windows Core Networking blog with an entry discussing Winsock Kernel. He points out that it is the successor to TDI, and goes on to say that TDI in Vista is an emulation layer built on top of WSK:

On Windows Vista & Windows Server Longhorn, TDI is still supported for compatibility reasons; however, it has been implemented using a translation layer and thus its performance is sub-optimal resulting in performance degradation for TDI clients. For this reason, as well as others (TDI on path to deprecation, etc — see resource links below), drivers should opt to use WSK whenever possible.

He’s conducting a survey on the blog about WSK adoption. If you have a TDI driver, it would probably benefit you to get involved.

This has that standard “everything before Vista is legacy” feel that a lot of us have complained about before; I for one will have to continue supporting TDI for years, and I suspect I’m not alone. We are still shipping code supporting Win98 (although thankfully I’m out of the business of actually writing new Win9x code, but only recently).

Interesting stuff; worth a read.

Acronym overloading

Wednesday, October 18th, 2006

Name collisions are not uncommon in the software world. For example, XP is either Windows XP or eXtreme Programming. But usually the technologies are not this close. So, I’m curious:

When you hear WFP, do you think of Windows Filtering Platform or Windows File Protection? If you happen to be developing a WFP driver, WFP will be important for you…

Actually, it turns out that Microsoft has declared that WFP is now WRP (see above link).