One-Bit Change Caused an Access Violation
We were analyzing a process minidump we had received, and here is what we saw:
The process crashed with an access violation (AV) while executing a fist instruction:
df9090000000 fist word ptr [rax+80h] ds:000007fe`e90918e2=fff1
But the source code at that point looked something like this:
IApplicationPtr->SomeMethod(SomeParameter);
So the instruction in the dump was wrong: it should have been a call instruction, not a fist instruction. I verified this against a running instance of the process and found that the instruction at that offset is indeed a call. Surprisingly, the encodings of the call instruction and the fist instruction differ by a single bit.
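The one-bit claim is easy to check by hand. The faulting instruction starts with the byte df, while the intended indirect call starts with ff. A minimal Python sketch (the helper name `differing_bits` is my own, not part of any debugger tooling) XORs the two bytes and lists the bit positions where they differ:

```python
def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where two bytes differ."""
    diff = a ^ b
    return [bit for bit in range(8) if diff & (1 << bit)]

expected = 0xFF  # first opcode byte of the intended call instruction
observed = 0xDF  # first opcode byte of the fist instruction seen in the dump

print(f"{expected:08b} expected")
print(f"{observed:08b} observed")
print(differing_bits(expected, observed))  # [5]
```

Only bit 5 differs, so clearing that single bit turns the call into a fist with the same operand bytes.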
So somehow one bit had changed, and that led to the crash. A WinDbg extension command is useful in this case: !chkimg. It compares the image in memory (here, in the dump) with the copy on the symbol store or other file repository and reports any mismatches. In my case it returned:
0:001> !chkimg -d MyDll
7fefeceb128 - MyDll!SomeMethod+505
[ ff:df ]
1 error : mcmscsub (7fefeceb128)
So it was a DLL image problem: that one bit had changed in the loaded image, causing the crash. Further searching on the internet led me to a Microsoft Research paper that discusses this problem.
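To triage such a report quickly, the `[ ff:df ]` line can be read as the expected bytes on the left and the bytes found in memory on the right. The sketch below assumes that layout and counts how many bits differ. The function names `parse_mismatch` and `flipped_bit_count` are hypothetical helpers, not WinDbg features, and the conclusion is only a heuristic: a lone flipped bit looks more like a memory or disk fault than a hook or patch, which typically rewrites several bytes, but it is not proof of either.

```python
import re

def parse_mismatch(line):
    """Parse a '[ expected:actual ]' line into two lists of byte values."""
    match = re.fullmatch(r"\s*\[\s*([0-9a-fA-F ]+):([0-9a-fA-F ]+)\]\s*", line)
    if match is None:
        raise ValueError(f"not a mismatch line: {line!r}")
    expected = [int(tok, 16) for tok in match.group(1).split()]
    actual = [int(tok, 16) for tok in match.group(2).split()]
    return expected, actual

def flipped_bit_count(expected, actual):
    """Count the bits that differ between the two byte sequences."""
    if len(expected) != len(actual):
        raise ValueError("byte sequences differ in length")
    return sum(bin(e ^ a).count("1") for e, a in zip(expected, actual))

expected, actual = parse_mismatch("[ ff:df ]")
print(flipped_bit_count(expected, actual))  # 1
```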
So when we get a crash, we should first check whether it is a genuine code issue or a hardware issue. !chkimg helps here.