Sunday, June 24, 2012

One-Bit Change Caused an Access Violation


We were analyzing a process minidump that we had received, and here is what we saw:

The process crashed with an access violation (AV) while executing a fist instruction:

df9090000000    fist    word ptr [rax+80h] ds:000007fe`e90918e2=fff1

But the source code at that location was something like this:

IApplicationPtr->SomeMethod(SomeParameter);

So the instruction in the dump was wrong: it should have been a call instruction, not a fist instruction. I verified this against a running instance of the process and found that the instruction at that offset was indeed a call. Surprisingly, the encodings of the call instruction and the fist instruction differ by a single bit.
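
The one-bit difference can be checked by hand. The minimal sketch below uses the two opcode bytes that !chkimg reports later in this post ([ ff:df ], expected byte first) and XORs them to find which bit differs. The helper name differing_bits is made up for this illustration.

```python
# Opcode bytes as reported by !chkimg in this post: [ ff:df ]
EXPECTED = 0xFF  # first byte of the intended call instruction
OBSERVED = 0xDF  # first byte found in the dump, decoded as fist


def differing_bits(a: int, b: int) -> list[int]:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b  # XOR leaves a 1 only where the two values disagree
    return [pos for pos in range(diff.bit_length()) if (diff >> pos) & 1]


if __name__ == "__main__":
    print(f"{EXPECTED:08b}  expected (call)")
    print(f"{OBSERVED:08b}  observed (fist)")
    print("differing bit positions:", differing_bits(EXPECTED, OBSERVED))
```

Only bit 5 (value 0x20) differs between the two bytes, so a single flipped bit is enough to turn the call into a fist.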
So somehow one bit had changed, and that led to an unexpected crash of the process. A WinDbg extension command proved useful in this case: !chkimg. This command compares the image in the dump to the image in the repository and reports any differences. In my case it returned:

0:001> !chkimg -d MyDll
    7fefeceb128  - MyDll!SomeMethod+505
                [ ff:df ]
1 error : mcmscsub (7fefeceb128)

So it was a problem with the DLL image: somehow that one bit had changed, causing the crash. Further searching on the internet led me to a Microsoft Research paper that discusses this problem.

So when we get a crash, we should first check whether it is a genuine code issue or a hardware issue. !chkimg helps with exactly this.
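
To illustrate the idea behind such a check, here is a minimal sketch that compares a known-good byte sequence with an observed one and flags each mismatch that is exactly one flipped bit. This is not how !chkimg is implemented; the function name classify_mismatches and the sample bytes are assumptions made for the example. The observed bytes are the ones shown in the disassembly above, and the reference bytes are the same sequence with the first byte set to ff. A single-bit mismatch is only a hint toward memory or hardware corruption, not proof.

```python
from typing import Iterator, Tuple


def classify_mismatches(
    reference: bytes, observed: bytes
) -> Iterator[Tuple[int, int, int, bool]]:
    """Yield (offset, expected, actual, is_single_bit) for each differing byte."""
    for offset, (expected, actual) in enumerate(zip(reference, observed)):
        if expected != actual:
            diff = expected ^ actual
            # A power of two has exactly one bit set, so diff & (diff - 1) is 0.
            is_single_bit = (diff & (diff - 1)) == 0
            yield offset, expected, actual, is_single_bit


if __name__ == "__main__":
    # Illustrative bytes only: reference is the assumed good copy.
    reference = bytes.fromhex("ff9090000000")
    observed = bytes.fromhex("df9090000000")
    for offset, expected, actual, single in classify_mismatches(reference, observed):
        kind = "single-bit flip" if single else "multi-bit difference"
        print(f"offset {offset}: [ {expected:02x}:{actual:02x} ] {kind}")
```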