Archive for the ‘Networking’ Category

More on object lifetimes

Saturday, November 18th, 2006

In an earlier post, I described a subtle race condition resulting from the differing lifetimes of miniport adapters and control device objects. Last week, Gianluca Varenni, the maintainer of WinPcap and one of the brains at CACE Technologies, pointed out that Microsoft had recently changed the Passthru sample to add reference counting to adapter objects in some instances. I went back and looked, and sure enough, the current WDK sample has additional reference counting built into the driver.

Microsoft didn’t add any comments to the sample describing the reference counting addition, but I found this bug myself a while ago and implemented essentially the same solution. The basic problem is that there is a race between the two different adapter tear-down paths – the one that is initiated from halting the virtual miniport itself and the other that is triggered by the halting of the underlying miniport.

Gianluca also pointed out that nobody in their right mind would write an IM driver from scratch, other than as an educational experience, because it’s entirely too difficult to get the various NDIS synchronization issues right unless you’re an absolute expert at it. Obviously, even Microsoft is still finding bugs.

The good news is that IMs are dead. Vista has a much-improved lightweight filtering architecture, so the writing is on the wall.

Time to recertify

Monday, November 6th, 2006

Ever since I passed the CCIE lab back in 1999, I’ve done everything in my power to put off the mandatory recertification exams. Never in my history with Cisco have I managed to re-certify early.

Well, it’s that time again, so I just scheduled my exam. The cost went up to $300 – for one 2-hour test! They’re not kidding around any more; I guess I should actually try to pass instead of my usual routine of going in cold and hoping for the best. :-)

OK, last beef for the weekend, promise…

Sunday, October 22nd, 2006

We’ve had a lot to complain about this weekend, which isn’t normally my style, but since I’m in that kind of mood, I may as well add one more item to the list.

C:\Users\dispensa>telnet
'telnet' is not recognized as an internal or external command,
operable program or batch file.

No telnet?! This Vista thing is gonna take some getting used to…

UPDATE: An anonymous commenter points out that telnet.exe is simply not installed by default any more, but you can add it by going through the control panel. I guess this is better than nothing, but it was certainly annoying and the solution was non-obvious.

A lot slower than a station wagon, but still faster than DSL

Wednesday, October 11th, 2006

There’s an article in the New Yorker comparing the data transfer rate of DSL with that of a snail hitched to a chariot with 4 DVDs for wheels. This isn’t nearly as fast as my station wagon, but it’s on the right track!

via kottke.

Using SDV with NDIS drivers

Sunday, October 8th, 2006

Static Driver Verifier (SDV) is one of the best new testing additions to the DDK. It’s slow and takes a ton of RAM, but the results are worth it if it saves you from shipping a bug.

The DDK documents several limitations on the kinds of drivers that SDV can verify. NDIS drivers don’t meet those requirements, but in my experience SDV tests can still be quite useful.

Lots of NDIS calls are simply wrappers around WDM DDIs. For example, NdisAcquireSpinLock() is just a #define to KeAcquireSpinLock(). Look through ndis.h to get a feel for just how often this is true. While SDV can’t do anything with NDIS-specific callbacks, some of its tests seem to run perfectly well in this environment.

For example, to have SDV validate your usage of spin locks, you could do the following from your driver’s build directory:

C:\driver> staticdv /rule:*SpinLock*

These tests take about 45 minutes for me on one of my IM drivers on a relatively new (single-proc) computer, so it’s not something you can plug into every build, but it’s totally worth the time to run through the rules before a release to QA.
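To make the spin lock case concrete, here is a minimal sketch of the kind of defect that acquire/release checking is meant to flag: an early return that leaves a lock held. It is a user-mode illustration only. The mock_lock type, the adapter struct, and both bump_counter functions are hypothetical names invented for this example, not NDIS or WDM APIs, and the sketch makes no claim about how SDV itself works internally.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-mode stand-in for a spin lock. It only records whether
 * the lock is held, which makes an unbalanced code path easy to observe. */
typedef struct { bool held; } mock_lock;

static void mock_acquire(mock_lock *l) { assert(!l->held); l->held = true; }
static void mock_release(mock_lock *l) { assert(l->held);  l->held = false; }

/* Hypothetical per-adapter state protected by the lock. */
typedef struct {
    mock_lock lock;
    int       counter;
} adapter;

/* Buggy: the error path returns while the lock is still held. */
static int bump_counter_buggy(adapter *a, int delta)
{
    mock_acquire(&a->lock);
    if (delta < 0)
        return -1;              /* BUG: no release on this path */
    a->counter += delta;
    mock_release(&a->lock);
    return 0;
}

/* Fixed: a single exit point, so every path releases the lock. */
static int bump_counter_fixed(adapter *a, int delta)
{
    int status = 0;

    mock_acquire(&a->lock);
    if (delta < 0)
        status = -1;
    else
        a->counter += delta;
    mock_release(&a->lock);
    return status;
}
```

In a real driver the buggy version would typically go unnoticed until the rarely taken error path runs, which is exactly why a static pass over all paths is attractive.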

It turns out that nobody knows how to use XHTML

Wednesday, September 20th, 2006

The Surfin’ Safari blog has a post by maciej pointing out that most people use XHTML wrong, including (in particular) almost everyone who displays the “Valid XHTML 1.0” logo on their websites.

So what really determines if a document is HTML or XHTML? The one and only thing that controls whether a document is HTML or XHTML is the MIME type. If the document is served with a text/html MIME type, it is treated as HTML. If it is served as application/xhtml+xml or text/xml, it gets treated as XHTML. In particular, none of the following things will cause your document to be treated as XHTML:

  • Using an XHTML doctype declaration
  • Putting an XML declaration at the top
  • Using XHTML-specific syntax like self-closing tags
  • Validating it as XHTML

In fact, the vast majority of supposedly XHTML documents on the internet are served as text/html. Which means they are not XHTML at all, but actually invalid HTML that’s getting by on the error handling of HTML parsers.
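The rule quoted above is simple enough to write down as code. The following is a rough sketch, not anything taken from a browser: the function names media_type_is and treated_as_xhtml are made up for this example, and it only knows about the MIME types named in the quote. Note that it never looks at the document body, so a doctype or XML declaration cannot change the answer.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Compare the media type in a Content-Type value against `type`, ignoring
 * case, leading whitespace, and any trailing ";charset=..." parameters. */
static bool media_type_is(const char *content_type, const char *type)
{
    assert(content_type != NULL && type != NULL);

    while (*content_type == ' ' || *content_type == '\t')
        content_type++;

    size_t n = strlen(type);
    for (size_t i = 0; i < n; i++) {
        /* A shorter content_type hits its terminating NUL here and fails. */
        if (tolower((unsigned char)content_type[i]) !=
            tolower((unsigned char)type[i]))
            return false;
    }

    /* The match must end at a delimiter, not in the middle of a longer token. */
    char next = content_type[n];
    return next == '\0' || next == ';' || next == ' ' || next == '\t';
}

/* True if a document served with this Content-Type would be treated as XHTML
 * under the quoted rule. Only the types named in the quote are handled. */
static bool treated_as_xhtml(const char *content_type)
{
    return media_type_is(content_type, "application/xhtml+xml")
        || media_type_is(content_type, "text/xml");
}
```

So a page with a flawless XHTML 1.0 doctype that arrives as "text/html; charset=utf-8" still lands on the HTML side of this check.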

The article concludes by recommending, essentially, that you write HTML 4.01 code and serve it up as text/html. Controversial advice, given what the Web Standards crowd has been saying about XHTML for a while. In light of the above, XHTML seems worse than useless (for the next decade anyway) – it’s actually causing the very compatibility problems that the WaSP people are trying to prevent. I’m curious to know what people like Molly Holzschlag have to say about this.

The article is worth a read. Don’t give up on it because of its very introductory-level first paragraph.

NdisMRegisterDevice and object lifetimes

Wednesday, September 13th, 2006

NdisMRegisterDevice was introduced back in the NDIS 5 days to allow for the creation of a control device in an NDIS miniport. The control device mechanism is used a lot by IM driver writers (configuring firewall rules, QoS, whatever). Some people also use it in ordinary miniports, such as virtual Ethernet miniports and NDIS-WDM drivers.

One thing that’s not clear from the documentation, though, is what sort of lifetime the device object has. Living in the nice safe world of the NDIS Miniport model, it’s easy to get lulled into a false sense of security regarding synchronization. NDIS is generally pretty good at managing device state, synchronizing with MiniportHalt(), and so on.

But when you create a device object with NdisMRegisterDevice(), NDIS basically turns all state management over to you. That means that you have to be cognizant of the different lifetime a device object has vs. a miniport object. In particular, a control device can indeed outlive a miniport!

For example, say you have a memory block that you allocate during MiniportInitialize() and free during MiniportHalt() – not an uncommon strategy. If your dispatch routines for your control device need to reference that memory, you have to be careful about how you handle it. MiniportHalt() is NOT synchronized with these control device dispatch routines, so your dispatch routines can indeed be called after MiniportHalt() is called. In fact, there’s no great way to tell when your dispatch routines will not be called any more, and there’s no way to tell if a thread is executing in a dispatch routine at the time you deregister the device.

All of this means that you have to be more clever in the way you manage that shared memory block. One possibility is reference counting the block itself – reference the block on creation in MiniportInitialize(), dereference it in MiniportHalt(), and take/release references as necessary in your dispatch routines. This must be done in an interlocked and race-proof way; NdisInterlockedIncrement() and friends are helpful for this.
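Here is a minimal sketch of that reference counting scheme. It is a user-mode illustration using C11 atomics in place of the NDIS interlocked routines, so it can be compiled and run outside the kernel. Every name in it (shared_block, block_try_ref, on_initialize, and so on) is hypothetical, and the on_* functions are only stand-ins for MiniportInitialize(), MiniportHalt(), and a control device dispatch routine. One assumption worth calling out: the counter itself lives in static storage, so it remains safe to inspect after the payload has been freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical shared state. The counter has static lifetime (think
 * driver-global data); only the payload is allocated and freed. */
typedef struct {
    atomic_long refs;     /* 0 means "no payload, do not touch" */
    int        *payload;  /* allocated at init, freed by the last dereference */
} shared_block;

static shared_block g_block;

/* Take a reference only if the count is still non-zero. */
static bool block_try_ref(shared_block *b)
{
    long old = atomic_load(&b->refs);
    while (old != 0) {
        /* On failure, old is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&b->refs, &old, old + 1))
            return true;
    }
    return false;         /* torn down already, or never initialized */
}

/* Drop a reference; whoever drops the last one frees the payload. */
static void block_deref(shared_block *b)
{
    long prev = atomic_fetch_sub(&b->refs, 1);
    assert(prev > 0);
    if (prev == 1) {
        free(b->payload);
        b->payload = NULL;
    }
}

/* Stand-in for MiniportInitialize(): allocate, then publish one reference. */
static bool on_initialize(shared_block *b)
{
    b->payload = malloc(sizeof *b->payload);
    if (b->payload == NULL)
        return false;
    *b->payload = 42;
    atomic_store(&b->refs, 1);
    return true;
}

/* Stand-in for MiniportHalt(): drop the initial reference. */
static void on_halt(shared_block *b)
{
    block_deref(b);
}

/* Stand-in for a dispatch routine. Returns -1 if the block is gone. */
static int on_dispatch(shared_block *b)
{
    if (!block_try_ref(b))
        return -1;
    int value = *b->payload;
    block_deref(b);
    return value;
}
```

The important design choice is that a dispatch routine never increments blindly: it takes a reference only while the count is non-zero, so a call that arrives after the last dereference fails cleanly instead of touching freed memory. The sketch does not handle re-initialization racing with a late dereference; a real driver would need to decide how to order those.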

This is a subtle point, and one that is not well documented. It stems in large part from the absence of device extension access for control devices created this way; that’s the mechanism that the OS uses to tie a memory block’s lifetime to a device object’s (as opposed to a miniport’s). But, as long as you’re aware of the issue, it can be worked around.

Why drivers have to be secure

Friday, June 23rd, 2006

Here’s a very practical reason to run your drivers through SDV and PREfast: people are using wi-fi drivers as attack vectors.

What does it cost to test drivers using PREfast, SDV, and the kind of input fuzzing described in the article? What does it cost to have a user’s system breached via your driver?

Net Neutrality

Monday, June 5th, 2006

I was interviewed by the Kansas City Business Journal about Net Neutrality recently. More commentary at the Kauffman Foundation blog.

Summary: Last mile broadband is an oligopoly (best case) or a monopoly at the moment, so it needs help from the government. There’s a lot of complex opinion behind that; the biz journal article points out some of it.

AHA!

Saturday, April 29th, 2006

See, I told you so. It was all my fault.

After tons more code spelunking and debugging, it became obvious that NDIS was simply failing to set up the miniport block for my driver correctly. It even lacked the special miniport block magic number (which is ‘NDMP’ in big-endian). I spent hours tracing through the internals and finally gave up and went to bed, hoping for inspiration the next day.

I found several samples that called NdisReadConfiguration() in the DDK, and noticed in particular that Eliyas Yakub’s netvmini sample was passing a variable called ‘WrapperConfigurationContext’ to NdisOpenConfiguration(). I’d read the docs about 100 times, but for some reason, that variable stuck out this time. So I went back to the NdisOpenConfiguration() docs to see what that parameter should be. The documentation reads: Specifies the handle input to MiniportInitialize. So I looked at the MiniportInitialize documentation, and Lo And Behold, there are two handles input to MiniportInitialize(). Sure enough, my code was passing the wrong one in.

It would have helped if I had named that variable precisely WrapperConfigurationContext the first time, instead of something slightly and confusingly different, because then I wouldn’t have had to make the tacit translation of variable names from the docs to my code, and the error would have been (more) obvious. Reason #143,532 to have coding standards.

So, back to testing. What a pain. I discovered several interesting NDIS-related bugs along the way, though, so it wasn’t a total loss. NdisInitializeString() is just weird, and to add to that, there are a couple of minor type bugs in NdisInitializeString() (again!) and NdisMEthIndicateReceive(), where things that are documented as taking VOID are actually prototyped as taking PUCHAR.