Archive for July, 2006

Google-proof blogging

Monday, July 31st, 2006

A blog post I read today – can’t remember where now of course – mentioned a hunk of javascript that implements an encryption algorithm to keep a blog post encrypted unless the viewer types in a password. Not hugely practical, but interesting.

Google never forgets. It’s kind of scary to think that people can find all of the stupid things I said Way Back When, and will probably be able to find the stupid things I’m saying now for quite some time into the future.

So, I have a (partial) solution. Implement a {WordPress|MT|Community Server|etc} plug-in that allows you to ROT13 posts (or parts thereof) and automatically un-ROT13s them when the browser loads them, using a little hunk of JavaScript. You could obviously vary the algorithm, but ROT13 is fast and easy, and none of the search engines are going to be randomly un-ROT13’ing webpages any time too soon.

In fact, you could use any reversible transformation to implement this – compression, base64, AES-256 (with the decryption key coded into the page), etc. There’s no security here – that’s not the point – but there is index-proofing.
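To make the "reversible transformation" idea concrete, here is a minimal sketch of ROT13 itself. It is written in C rather than the browser-side JavaScript the plug-in would actually need, and rot13_in_place is a name made up for illustration. The point it shows is that ROT13 is its own inverse, so the same routine both hides and restores the text.

```c
#include <stddef.h>

/* Hypothetical helper: rotate ASCII letters by 13 places, in place.
 * Everything that is not an ASCII letter is left untouched.
 * Running it a second time restores the original text. */
void rot13_in_place(char *text)
{
    size_t i;

    for (i = 0; text[i] != '\0'; i++) {
        char c = text[i];

        if (c >= 'a' && c <= 'z')
            text[i] = (char)('a' + (c - 'a' + 13) % 26);
        else if (c >= 'A' && c <= 'Z')
            text[i] = (char)('A' + (c - 'A' + 13) % 26);
    }
}
```

Calling rot13_in_place on a buffer holding "Hello, World!" leaves "Uryyb, Jbeyq!" in it, and a second call brings the original back.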

I don’t actually have time to go learn how to build a WordPress plug-in, so I’m probably not going to actually do this. If you do and you manage to get rich doing it, I want a cut. ‘Nuff said.

DNS problems

Monday, July 31st, 2006

Sorry about the DNS problems from today. I accidentally renewed my hosted DNS service with Network Solutions, and they reminded me why I haven’t liked them since they quit being InterNIC.

DNS records have been re-updated and all is well. Until next year, anyway…

PREfast backgrounder

Sunday, July 30th, 2006

I’ve been a happy user of PREfast for a couple of years now. It has quickly become as imperative for driver development as Driver Verifier. I’ve written a little about it before, but I thought I’d collect some of my past ramblings into one PREfast background post.

PREfast is a static source code analysis tool that ships with the DDK. If you’ve used lint, you’ll feel fairly at home using PREfast. It can be used with kernel-mode or user-mode code, and it can do simple C-based analysis (e.g. a possible NULL pointer dereference) as well as more complex OS-specific analysis (e.g. you can’t call KeAcquireSpinLockAtDpcLevel because you’re at PASSIVE_LEVEL).

To use PREfast, you simply open up a build window and feed it your standard build command:

C:\dev\sandbox\scratch\driver>prefast build -cZ
--------------------------------------------------------
Microsoft (R) PREfast Version 8.0.72190.
Copyright (C) Microsoft Corporation. All rights reserved.
--------------------------------------------------------
BUILD: Compile and Link for x86
BUILD: Start time: Sun Jul 30 22:04:54 2006
BUILD: Examining c:\dev\sandbox\scratch\driver directory
for files to compile
    c:\dev\sandbox\scratch\driver
BUILD: Compiling and Linking c:\dev\sandbox\scratch\driver
directory
Precompiling - pch.h
Compiling - sdriver.c
Compiling - sdriver.c
Linking Executable - objfre_wlh_x86\i386\sdriver.sys
BUILD: Finish time: Sun Jul 30 22:04:59 2006
BUILD: Done

    6 files compiled
    1 executable built
----------------------------------------------------------
Removing duplicate defects from the log...
----------------------------------------------------------
PREfast reported 8 defects during execution of the command.
----------------------------------------------------------
Enter PREFAST LIST to list the defect log as text within the
console.
Enter PREFAST VIEW to display the defect log user interface.

PREfast then compiles your code (twice usually; more on that later) and analyzes it in the process. You can then see what PREfast found by typing either of the two commands it so helpfully supplies for you: “prefast list” will get you a text-based list of what defects were found, and “prefast view” will get you a nice GUI with similar information.

A couple of caveats:

  • You can’t use it with 64-bit builds yet. It finds various problems in various places. This should be fixed soon, but meanwhile, run PREfast on 32-bit builds.
  • You have to supply a full build command, not a doskey macro, on the prefast command line. That means prefast build -cZ, NOT prefast bcz.
  • If you set up your own build window with setenv, be sure you pass a fully qualified path – it has to start with a drive letter, not a backslash.

PREfast is one of the best tools in the DDK. It has saved me dozens of bugs in the time I’ve been using it. Give it a try – it’s free and painless, and you don’t even have to spend time setting it up.

More on the DDK interlocked bugs

Friday, July 28th, 2006

According to some folks on USENET, it’s not just InterlockedOr() that’s broken (see my earlier post for background).

Someone asked if the problem was limited to InterlockedOr():

On 2006-07-29 00:16:04 -0500, 440gtx@email.com said:

Skywing [MVP] wrote:
> InterlockedAnd as well from following ntdev.

It appears InterlockedXor generates incorrect code as well. Perhaps the bug can appear anytime the compiler emits a cmpxchg instruction.

At this point I wouldn’t mind seeing something from Microsoft along the lines of an official erratum describing what really is broken, as opposed to having people make guesses.

Memory pressure

Thursday, July 27th, 2006

If you test your drivers with Driver Verifier’s Low Resources Simulation option, you’ll soon find that memory allocations and other resource allocations are failing occasionally. If you are also testing with NDIS verifier with its low resources sim turned on, you’ll find that things fail quite often.

A basic principle of good driver design is grace under pressure. No, I’m not referring to the Rush album, I’m referring to the quality that a good driver has that allows it to be maximally useful in low-resources conditions.

This is a commonly overlooked area of driver design. It doesn’t simply imply checking return values (although you should, every time). It further means making intelligent micro-design decisions and intelligent high-level design decisions.

An example of a bad micro-design decision (pseudo-code):

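PUCHAR g_array;

BOOLEAN updateArray(PUCHAR newArray, SIZE_T newSize)
{
	/* Bad: the old array is released before we know the new allocation works. */
	if(g_array)
		ExFreePool(g_array);

	g_array = ExAllocatePoolWithTag(
	                  PagedPool,
	                  newSize,
	                  '0mem');

	/* On failure, g_array is now NULL and the old contents are gone. */
	if(!g_array)
		return FALSE;

	RtlCopyMemory(
	        g_array,
	        newArray,
	        newSize);

	return TRUE;
}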

It’s good that this code checks the return value of the allocation – otherwise it’d be a bugcheck on failure – but it has weird side effects. As a user of this function, I’d expect at a glance that it would either successfully update the array and return TRUE or fail to update the array and return FALSE. Instead, it updates the array to something totally odd (NULL) and returns FALSE. And, as a bonus (not!), if this array was adding value to your driver, then you’ve killed that in the process.
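One way to get the behavior a caller would expect is to allocate the replacement first and only release the old buffer once the new one is in hand. What follows is a minimal user-mode sketch of that ordering, not driver code: malloc and free stand in for ExAllocatePoolWithTag and ExFreePool, and update_array, alloc_bytes, and g_simulate_failure are names made up for illustration. The failure switch is a crude imitation of what Low Resources Simulation does for you.

```c
#include <stdlib.h>
#include <string.h>

/* User-mode stand-ins for the driver's global buffer and its size. */
unsigned char *g_array = NULL;
size_t g_array_size = 0;

/* Hypothetical test hook: when nonzero, every allocation fails. */
int g_simulate_failure = 0;

static void *alloc_bytes(size_t size)
{
    if (g_simulate_failure)
        return NULL;
    return malloc(size);
}

/* Returns 1 and replaces the array on success.
 * Returns 0 and leaves the existing array untouched on failure. */
int update_array(const unsigned char *new_array, size_t new_size)
{
    unsigned char *replacement = alloc_bytes(new_size);

    if (replacement == NULL)
        return 0;                 /* old contents are still valid */

    memcpy(replacement, new_array, new_size);

    free(g_array);                /* free(NULL) is a harmless no-op */
    g_array = replacement;
    g_array_size = new_size;
    return 1;
}
```

With this ordering a failed update costs the caller nothing: the previous array is still there and still usable. The price is that both buffers exist at once for a moment, which is itself a little extra memory pressure.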

Obviously each failed allocation case needs to be considered independently, but there are a couple of techniques you can use to reduce failures like these. One thing that works more often than you might expect is to simply statically allocate resources once and re-use them. In the above example, I could have simply declared a global array that was big enough to handle my expected needs, and perhaps added some bounds checking to the mix. In things like device extensions or network adapter context areas, you can often reserve enough extra space the first time that you don’t have to free and re-allocate space after the initial allocation.
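The static-allocation approach might look something like the sketch below. It is plain user-mode C for illustration; MAX_ARRAY_BYTES, g_static_array, and update_static_array are all made-up names, and the 256-byte limit is an arbitrary assumption you would size for your own driver.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ARRAY_BYTES 256   /* assumed upper bound on what we ever need */

/* Allocated once, at load time, and re-used for every update. */
unsigned char g_static_array[MAX_ARRAY_BYTES];
size_t g_static_used = 0;

/* Returns 1 on success, 0 if the new data does not fit.
 * There is no allocation here, so there is no allocation to fail. */
int update_static_array(const unsigned char *new_array, size_t new_size)
{
    if (new_size > MAX_ARRAY_BYTES)
        return 0;             /* bounds check; existing contents untouched */

    memcpy(g_static_array, new_array, new_size);
    g_static_used = new_size;
    return 1;
}
```

The only failure left is "too big", which is a design-time decision rather than something that depends on how much pool happens to be free at the moment.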

Another tool that you may be able to bring to bear is the lookaside list – reference ExInitializePagedLookasideList and friends. If you’re allocating and releasing fixed-size hunks of data, then using a lookaside list will help smooth out your requests to the OS for memory, and may result in fewer out-of-memory conditions being hit. Note, however, that this is totally OS-dependent, in that the OS can decide to shrink your free list whenever it wants if it is under memory pressure.

Using a lookaside list can have a couple of other benefits too: since you tend to build up your cache of blocks early in your driver’s execution, you can avoid resource constraints due to pool fragmentation, and on the flip side, using a lookaside list keeps your driver from contributing to the fragmentation. And, of course, there are performance benefits.
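If you haven’t looked at how a lookaside list behaves, the toy version below may help. To be clear, this is not the kernel implementation and not the ExInitializePagedLookasideList API; it is a user-mode sketch of the general idea (a capped cache of fixed-size blocks in front of the real allocator). BLOCK_SIZE, MAX_DEPTH, block_cache, cache_alloc, and cache_free are all names and values assumed for illustration, and there is no locking.

```c
#include <stdlib.h>

#define BLOCK_SIZE 64   /* assumed fixed block size */
#define MAX_DEPTH  4    /* assumed cap on how many free blocks we keep */

/* A free block re-uses its own storage to hold the list link. */
typedef union cached_block {
    union cached_block *next;
    unsigned char payload[BLOCK_SIZE];
} cached_block;

typedef struct {
    cached_block *free_list;
    unsigned depth;
} block_cache;

/* Hand out a cached block if we have one; otherwise ask the allocator. */
void *cache_alloc(block_cache *cache)
{
    cached_block *block = cache->free_list;

    if (block != NULL) {
        cache->free_list = block->next;
        cache->depth--;
        return block;
    }
    return malloc(sizeof(cached_block));
}

/* Keep the block for re-use unless the cache is already full. */
void cache_free(block_cache *cache, void *ptr)
{
    cached_block *block = ptr;

    if (cache->depth >= MAX_DEPTH) {
        free(block);
        return;
    }
    block->next = cache->free_list;
    cache->free_list = block;
    cache->depth++;
}
```

A block that is freed and then requested again never goes back to the underlying allocator, which is where both the smoothing and the fragmentation benefits come from.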

At the end of the day, it’s far more important to design your driver to gracefully handle low-resources conditions than it is to push the problem off onto the lookaside list code, but lookaside lists are worth thinking about next time you are evaluating a driver design.

On the subject of setups

Wednesday, July 26th, 2006

Maybe blogging about pet peeves will make me realize how nit-picky I am. Probably not. :)

set⋅up |ˈsetˌəp|
noun [usu. in sing. ] informal

Repeatedly in the last few days, I have read the word setup used as a verb. It’s not. It’s a noun. The verb form is two words: set up. You can set up a setup, or you can set a setup up, or this can be a setup, but you never setup a user’s computer.

In computerese, setup can also be the name of a program, in which case you can run setup, or run the setup program, but it’s still not a verb.

There, I feel better now. Back to solving more important problems.

Beware the InterlockedOr()

Tuesday, July 25th, 2006

There is a thread going on in the newsgroups (following a previous discussion on NTDEV) pointing out a current compiler bug in the DDK relating to the return value of InterlockedOr(). As Gary Little pointed out, it’s not a commonly used function, particularly when it comes to examining the return value, but nevertheless, the bug is there, so you should be aware of it.

Eliyas Yakub from Microsoft pointed out that the bug had been reported and fixed a couple of years ago in the mainline compilers, and another poster mentioned that the Visual C++ 2005 compiler doesn’t exhibit this problem.

The bug is in the compiler intrinsic, so don’t try to bypass the API by going to the intrinsic – you’ll have the same problem. If you do call InterlockedOr(), treat it as returning VOID – don’t check the return value.

And while the bug may be fixed in the WDK compiler (I haven’t checked, actually), that is still a beta kit so it’s still subject to change. If you do wind up needing this DDI, you should probably hand-verify the generated code.
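For reference when checking generated code, the documented contract is that InterlockedOr() atomically ORs the mask into the destination and returns the value the destination held beforehand. The sketch below expresses that contract in portable C11 using atomic_fetch_or from stdatomic.h, which has the same return-the-old-value semantics. It is not the DDK intrinsic and says nothing about what the DDK compiler emits; reference_fetch_or is a made-up name.

```c
#include <stdatomic.h>

/* Reference behavior only: OR the mask in atomically and hand back
 * the value that was there before the OR took place. */
long reference_fetch_or(atomic_long *destination, long mask)
{
    return atomic_fetch_or(destination, mask);
}
```

So with a destination of 0x5 and a mask of 0x2, the destination should end up as 0x7 and the return value should be 0x5.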

Saturday music blogging

Saturday, July 22nd, 2006

Saturday music blogging was so much fun last time that I think I’ll try it again.

A few years ago Suzanne asked me who my favorite composer is. At the time I was on a bit of a Beethoven kick, so that’s what I said. Then, I promptly became a Bachaholic and have remained so ever since.

Well, about a month ago I went on a mission to collect Beethoven’s entire symphony cycle. They are, of course, fantastic, but one thing has jumped out at me – I am, it turns out, a particular fan of the odd-numbered symphonies. I love the evens too, but particularly 5, 7, and 9 are my three favorites, in indeterminate order.

The differences between the even-numbered and odd-numbered symphonies have, of course, been noted many times before. The evens have a reputation for being tranquil, calm, humorous, carefully crafted, and classical. The odds have a reputation for being intense, emotionally charged, and forceful. I guess that’s the kind of mood I’m in lately. :-)

Anyway, I just finished listening to #5, and am stunned that I spent three years basically not listening to this stuff. Oh well, there’s lots of coding time to catch up.

New debuggers

Wednesday, July 19th, 2006

Version 6.6.7.5 of the debugging tools has been released. Get ‘em at http://www.microsoft.com/whdc/devtools/debugging/default.mspx. And, Ken has some extra upgrade advice for remote debugging.

Interlocked list manipulation functions

Monday, July 17th, 2006

An NTDEV poster asked about interlocked list manipulation functions, and about the DDK’s insistence that calls to those functions not be mixed with non-interlocked functions. I answered that there was no difference between calling an ExInterlocked* function and grabbing that lock in the normal way followed by using non-interlocked ops.

Doron Holan from Microsoft clarified my response by saying that the lock is acquired at the highest IRQL in the machine, which prevents priority inversion deadlocks with things like ISRs. I felt temporarily dumb for forgetting about that case, but I thought the way Doron phrased it was a little odd.

So, I decided to take my own advice to the OP and see what these functions really do under the hood. Picking on ExfInterlockedPopEntryList:

lkd> u ExfInterlockedPopEntryList
nt!ExfInterlockedPopEntryList:
804e2e68 9c       pushfd
804e2e69 fa       cli
804e2e6a 8b01     mov     eax,[ecx]
804e2e6c 0bc0     or      eax,eax
804e2e6e 7406     jz      EmptyList
804e2e70 8b10     mov     edx,[eax]
804e2e72 8911     mov     [ecx],edx
804e2e74 9d       popfd
804e2e75 c3       ret

EmptyList:
804e2e76 9d       popfd
804e2e77 33c0     xor     eax,eax
804e2e79 c3       ret

This is a UP kernel, but you should still need to grab the lock at high IRQL to mask off any higher-IRQL code (an ISR, say) that might want it, to prevent exactly the priority inversion Doron discussed.

The answer is the little cli instruction at the top of the routine. Totally disabling interrupts on the chip is equivalent to the highest possible IRQL, and does exactly the job we needed. The popfd restores the interrupt enable flag (among others) to its previous condition.

The MP kernel looks about the same, with the addition that the routine actually spins on the supplied lock – unnecessary for this UP kernel.
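If x86 assembly isn’t your thing, the UP disassembly reads roughly like the C below. This is my own user-mode rendering for illustration, not source from the kernel: list_entry stands in for SINGLE_LIST_ENTRY, pop_entry is a made-up name, and the pushfd/cli/popfd bracketing is left out because user-mode code can’t disable interrupts.

```c
#include <stddef.h>

/* Stand-in for the kernel's SINGLE_LIST_ENTRY. */
typedef struct list_entry {
    struct list_entry *next;
} list_entry;

/* Pop the first entry off a singly linked list, or return NULL if empty.
 * In the kernel routine this whole body runs with interrupts disabled. */
list_entry *pop_entry(list_entry *head)
{
    list_entry *first = head->next;   /* mov eax,[ecx] */

    if (first == NULL)                /* or eax,eax / jz EmptyList */
        return NULL;                  /* xor eax,eax / ret */

    head->next = first->next;         /* mov edx,[eax] / mov [ecx],edx */
    return first;                     /* ret, with the entry in eax */
}
```

Without the interrupt masking, this is exactly the non-interlocked pop, which is why mixing the two on the same list is a problem: only the interlocked version is safe against an ISR touching the list halfway through.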