Slide1

The Case of the Unexplained, 2010: Troubleshooting with Mark Russinovich

Mark Russinovich

Technical Fellow

Microsoft Corporation

SESSION CODE: WCL315

Slide2

About Me

Technical Fellow, Microsoft

Co-founder and chief software architect of Winternals Software

Co-author of Windows Internals 4th and 5th edition and Inside Windows 2000 3rd edition with David Solomon

Author of TechNet Sysinternals

Home of blog and forums

Contributing Editor TechNet Magazine, Windows IT Pro Magazine

Ph.D. in Computer Engineering

Slide3

Outline

Introduction

Sluggish Performance

Application Hangs

Error Messages

Application Crashes

Blue Screens

Slide4

Case of the Unexplained…

This is the 2010 version of the “case of the unexplained” talk series

Previous versions covered different cases

Can view webcast on Sysinternals->Mark’s webcasts

Based on real case studies

Some of these have been written up on my blog

Slide5

Troubleshooting

Most applications do a poor job of reporting unexpected errors

Locked, missing or corrupt files

Missing or corrupt registry data

Permissions problems

Errors manifest in several different ways

Misleading error messages

Crashes or hangs

Slide6

Slide7

Purpose of Talk

Show you how to solve these classes of problems by peering beneath the surface

Interpreting process, file and registry activity

Interpreting call stacks

You’ll learn tools and techniques to help you solve seemingly unsolvable problems

Slide8

Tools We’ll Use

Sysinternals: www.microsoft.com/technet/sysinternals

Process Explorer – process/thread viewer

Process Monitor – file/registry/process/thread tracing

Autoruns – displays all autostart locations

SigCheck – shows file version information

PsExec – execute processes remotely or in the system account

TcpView – shows TCP/IP endpoints

Strings – dumps printable strings in any file

ADInsight – real time LDAP (Active Directory) monitor

ZoomIt – presentation tool I’m using

Microsoft downloads:

Kernrate – sample-based system profiler

Visual Studio: Spy++ – Window analysis utility

Debugging Tools for Windows: Windbg application and kernel debugger: www.microsoft.com/whdc/devtools/debugging/Windbg

Slide9

Outline

Sluggish Performance

Application Hangs

Error Messages

Application Crashes

Blue Screens

Slide10

Process Explorer

Process Explorer is a Task Manager replacement

You can literally replace Task Manager with Options->Replace Task Manager

Hide-when-minimized to always have it handy

Hover the mouse over the tray icon to see a tooltip showing the process consuming the most CPU

Open System Information graph to see CPU usage history

Graphs are time stamped with hover showing biggest consumer at point in time

Also includes other activity such as I/O, kernel memory limits

Slide11

The Case of the Wmiprvse.exe CPU Hog

Customer periodically saw Wmiprvse.exe consuming excessive amounts of CPU:

Wmiprvse is a hosting process for WMI providers, so had to look deeper to find the cause

Slide12

Processes and Threads

A process represents an instance of a running program

Address space

Resources (e.g., open handles)

Security profile (token)

A thread is an execution context within a process

Unit of scheduling (threads run, processes don’t run)

All threads in a process share the same per-process address space

The System process is the default home for kernel mode system threads

Functions in OS and some drivers that need to run as real threads

E.g., need to run concurrently with other system activity, wait on timers, perform background “housekeeping” work

Other multi-host processes: svchost, iexplore, mmc, dllhost

Slide13
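As a rough illustration of the point that all threads in a process share one address space, the Python sketch below starts three threads that append to the same list. The names `shared_log` and `worker` are made up for this example; it says nothing about how Windows implements threads.

```python
import threading

# One list object, visible to every thread because they share the process address space.
shared_log = []
lock = threading.Lock()


def worker(tag: str) -> None:
    """Append this thread's tag to the shared list (hypothetical workload)."""
    with lock:  # serialize access; sharing memory means threads can collide
        shared_log.append(tag)


threads = [threading.Thread(target=worker, args=(f"thread-{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All three threads wrote into the same object.
print(sorted(shared_log))
```

Separate processes would each get their own copy of `shared_log`, which is why a host process such as svchost or Wmiprvse has to be examined thread by thread.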

Viewing Threads

Task Manager doesn’t show thread details within a process

Process Explorer does on “Threads” tab

Displays thread details such as ID, CPU usage, start time, state, priority

Start address is where the thread began running (not where it is now)

Click Module to get details on module containing thread start address

Slide14

Thread Start Functions and Symbol Information

Process Explorer can map the addresses within a module to the names of functions

This can help identify which component within a process is responsible for CPU usage

Configure Process Explorer’s symbol engine:

Download the latest Debugging Tools for Windows from Microsoft (free)

Use dbghelp.dll from the Debugging Tools

Point at the Microsoft public symbol server (or internal symbol server if you have access)

Slide15
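For the "point at the Microsoft public symbol server" step, the value entered is a symbol path string. A minimal sketch of building one is shown here, assuming the commonly used `srv*<local cache>*<server URL>` form and the public server URL `https://msdl.microsoft.com/download/symbols`; the cache folder `C:\Symbols` is a hypothetical choice.

```python
PUBLIC_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def symbol_path(cache_dir: str, server_url: str = PUBLIC_SYMBOL_SERVER) -> str:
    """Build a symbol-server path: downloaded symbols are cached in cache_dir."""
    return f"srv*{cache_dir}*{server_url}"


# Hypothetical local cache folder; any writable directory should do.
print(symbol_path(r"C:\Symbols"))
```

An internal symbol server, if you have access to one, would be passed as `server_url` instead.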

The Case of the Wmiprvse.exe CPU Hog (Cont)

Thread list pointed at thread with generic start address:

Had to look deeper…

Slide16

Call Stacks

Sometimes a thread start address doesn’t tell you what a thread is doing

The stack might provide a hint:

The stack is a per-thread region of memory that records a history of function nesting

The bottom frame (Function 3) is where the thread will continue executing

[Diagram: stack frames for Function 1, Function 2 and Function 3]

Slide17
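The "history of function nesting" idea can be tried out in any language that exposes its call stack. The sketch below uses Python's `traceback` module with three hypothetical functions named after the diagram; it only illustrates the concept and is not how Process Explorer reads native thread stacks.

```python
import traceback


def function_3() -> list:
    # extract_stack() lists frames oldest-first; the last entry is this function.
    names = [frame.name for frame in traceback.extract_stack()]
    return [name for name in names if name.startswith("function_")]


def function_2() -> list:
    return function_3()


def function_1() -> list:
    return function_2()


frames = function_1()

# Stack viewers usually list the most recent call first, so reverse for display.
for name in reversed(frames):
    print(name)
```

Here `function_3` is the most recent frame: it is where execution continues, while `function_1` shows how the thread got there.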

Viewing Call Stacks

Click Stack on the Threads tab to view a thread’s call stack

Lists functions in reverse chronological order

Note that start address on Threads tab is different than first function shown in stack

This is because all threads created by Windows programs start in a library function in Kernel32.dll which calls the programmed start address

Slide18

The Case of the Wmiprvse.exe CPU Hog: Solved

Thread stack implicated AssetAdvisor.dll:

Web search led to this KB article:

Article had hotfix for SMS 2003: problem solved

Slide19

The Case of the Runaway CPU

User noticed that system was sluggish

Ran Process Explorer and saw that System process was consuming CPU:

Slide20

The Case of the Runaway CPU (Cont)

Looked at threads tab and saw thread from ALCXWDM driver causing the CPU usage:

Slide21

The Case of the Runaway CPU: Solved

Double-clicked to look at version and saw it was a Realtek driver

Went to Realtek site and downloaded newer version: problem solved

Slide22

Outline

Sluggish Performance

Application Hangs

Error Messages

Application Crashes

Blue Screens

Slide23

Process Monitor

Process Monitor is a real-time file, registry, process and thread monitor

It requires Windows 2000 SP4 w/Update Rollup 1, XP SP2 or higher, Server 2003 SP1 or higher, Vista and higher, or Server 2008 (including 64-bit versions of Windows) and higher

It replaces Filemon and Regmon, but you can use Filemon and Regmon on older operating systems

Enhancements over Filemon/Regmon include:

More advanced filtering

Operation call stacks

Boot-time logging

Data mining views

Process tree to see short-lived processes

When in doubt, run Process Monitor!

It will often show you the cause for error messages

Many times it tells you what is causing sluggish performance

Slide24

The Case of the Slow Signed Application Start

User had an application that started quickly until they digitally signed it

Launch time went from seconds to a minute and a half

Asked user to capture Process Monitor trace

Saw multiple references to certificate revocation list (CRL) servers

Saw multiple references to proxy configuration

Slide25

The Case of the Slow Signed Application Start: Solved

Asked user if system was connected to network: no

Searched the web and learned that delays were caused by .NET runtime signature verification

Could see .NET 2.0 framework loaded in log file

That triggered proxy server lookups

Solution: create a .config file that tells runtime to skip check

<?xml version="1.0" encoding="utf-8"?>
<configuration>
    <runtime>
        <generatePublisherEvidence enabled="false"/>
    </runtime>
</configuration>

Slide26
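A small sketch of producing that .config file programmatically follows. It assumes the usual .NET convention that an application's config file sits next to the executable and is named after it with `.config` appended; `MyApp.exe` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def build_config_xml() -> str:
    """Return the XML shown on the slide, with generatePublisherEvidence disabled."""
    configuration = ET.Element("configuration")
    runtime = ET.SubElement(configuration, "runtime")
    ET.SubElement(runtime, "generatePublisherEvidence", {"enabled": "false"})
    body = ET.tostring(configuration, encoding="unicode")
    return '<?xml version="1.0" encoding="utf-8"?>\n' + body


def config_name_for(exe_name: str) -> str:
    """Assumed naming convention: <executable name>.config, next to the executable."""
    return exe_name + ".config"


# Hypothetical application name; writing the file is left to the caller.
print(config_name_for("MyApp.exe"))
print(build_config_xml())
```

With the element set to false the runtime skips the publisher-evidence check, which is what avoided the CRL and proxy lookups in this case.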

Outline

Sluggish Performance

Application Hangs

Error Messages

Application Crashes

Blue Screens

Slide27

The Case of the Failed SQL Reporting Services Attachment

Customer contacted Microsoft Support because sending an email subscription from SQL Reporting Services (SRS) would not attach the image file

Support spent 34 hours investigating:

Had customer try on another identical SRS system: success

Tried to repro in house with same SRS DLL (Cdosys.dll) and on various OS’s, but unable

Finally decided to capture Process Monitor trace from working and failing system to compare

Slide28

The Case of the Failed SQL Reporting Services Attachment (Cont)

Searched through traces for reference of CDO.Message.1 and started comparing

Working trace references a CodePage key:

Failing trace doesn’t:

[Screenshots: Failing trace, Working trace]

Slide29

The Case of the Failed SQL Reporting Services Attachment: Solved

Opened HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage with “Jump to” on failing system

Noticed lots of missing values:

Imported key from working system: problem solved

[Screenshots: Failing system, Working system]

Slide30
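The working-versus-failing comparison boils down to a set difference. The sketch below assumes the values of a key have already been exported from each system into a dictionary; the sample names and data are invented placeholders, not the customer's registry.

```python
def missing_values(working: dict, failing: dict) -> list:
    """Names of values present on the working system but absent on the failing one."""
    return sorted(name for name in working if name not in failing)


# Hypothetical exports of the same key from two machines.
working_key = {"ACP": "1252", "OEMCP": "437", "1252": "c_1252.nls", "437": "c_437.nls"}
failing_key = {"ACP": "1252", "OEMCP": "437"}

print(missing_values(working_key, failing_key))
```

The same comparison applies to Process Monitor traces: filter both to the component of interest, then look for operations that appear in only one of them.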

The Case of the Blocked HTTP Port

User complained that they were unable to browse the web

Got connection error from IE

Had just had system migrated between domains

Admin went about troubleshooting

Deleted IE cache: problem persisted

Checked DNS, gateway, IP settings: no problems

Tried other outbound ports: no problems

Slide31

The Case of the Blocked HTTP Port (Cont)

Suspected third-party plugin so captured a Process Monitor trace while launching IE

Set a file system filter and looked at stack of each event

Got to event that accessed Software.log hive file:

Slide32

The Case of the Blocked HTTP Port (Cont)

Web search revealed that driver was part of ZoneAlarm’s stateful firewall

Search also showed that Cisco VPN client uses it:

Had uninstalled VPN client before moving system across domains

Uninstall must have left something behind

Slide33

Viewing Autostarts

Use Autoruns to see what’s configured to start when the system boots and you log in

Windows MsConfig shows a subset of defined autostart locations

MsConfig doesn’t show as much information

Slide34

The Case of the Blocked HTTP Port: Solved

Ran Autoruns and looked for driver:

Unchecked driver entry, rebooted and problem solved

Slide35

Outline

Sluggish Performance

Application Hangs

Error Messages

Application Crashes

Blue Screens

Slide36

Application Crashes

In most cases, there’s nothing you can do about application crashes

They are caused by a bug in the program

Only the developer can fix a bug

However, the crash may be caused by misconfiguration or an extension (a plugin)

Monitor the application’s crash with Process Monitor if it’s reproducible

Look for extensions in the crash file with Windbg

Slide37

Finding the Crash Dump

On pre-Vista systems, finding the dump file is easy:

Slide38

Attaching to the Dying Process

Vista and higher doesn’t save crash dumps for most crashes

Only if Microsoft requests a dump for study and you send it in

When a crash occurs, don’t dismiss the crash dialog:

Launch Windbg and attach to the process

You can save a dump with the .dump command

Slide39

Identifying the Crashed Process

On Vista and higher, the process name might not be enough to identify the instance that’s crashed:

To determine the PID of the crashed instance, look at WerFault’s command line:

Slide40
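Reading the PID out of WerFault's command line can be scripted. This is only a sketch: it assumes the crashed process ID follows a `-p` switch, which is how the command line commonly appears but should be checked against what Process Explorer shows on your system. The sample command line is hypothetical.

```python
from typing import Optional


def crashed_pid(werfault_cmdline: str) -> Optional[int]:
    """Return the number following a -p switch, or None if there is none (assumed layout)."""
    args = werfault_cmdline.split()
    for switch, value in zip(args, args[1:]):
        if switch.lower() == "-p" and value.isdigit():
            return int(value)
    return None


# Hypothetical command line copied from the process properties of WerFault.exe.
sample = r"C:\Windows\system32\WerFault.exe -u -p 4016 -s 1460"
print(crashed_pid(sample))
```

The resulting PID identifies which instance to attach Windbg to when several processes share the same image name.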

Enabling Dump Archiving on Vista and Higher

Or you can configure Vista and higher to always generate and save a dump file

Create a key named:

HKLM\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps

Dumps go to %LOCALAPPDATA%\CrashDumps

Override with a DumpFolder value (REG_EXPAND_SZ)

Limit dump history with a DumpCount value (DWORD)

Slide41
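One way to apply the LocalDumps settings is with the built-in `reg add` command. The sketch below only builds the argument lists so they can be reviewed first; the folder `D:\CrashDumps` and the count of 5 are hypothetical values, and running the commands would need an elevated prompt on the target Windows system.

```python
KEY = r"HKLM\Software\Microsoft\Windows\Windows Error Reporting\LocalDumps"


def reg_add_commands(dump_folder: str, dump_count: int) -> list:
    """Build 'reg add' argument lists for the DumpFolder and DumpCount values."""
    return [
        # /f overwrites an existing value without prompting.
        ["reg", "add", KEY, "/v", "DumpFolder", "/t", "REG_EXPAND_SZ", "/d", dump_folder, "/f"],
        ["reg", "add", KEY, "/v", "DumpCount", "/t", "REG_DWORD", "/d", str(dump_count), "/f"],
    ]


# Hypothetical settings: keep the five most recent dumps in D:\CrashDumps.
for command in reg_add_commands(r"D:\CrashDumps", 5):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once the values look right; creating a value also creates the LocalDumps key if it does not exist.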

Analyzing a Crash

Basic crash dump analysis is easy and it might tell you the cause

Requires Windbg and symbol configuration

Once the dump is loaded, find the faulting thread

The debugger might identify it

If the debugger doesn’t, examine each thread stack looking for “fault”, “exception”, or “error” names

Examine the stack of the faulting thread to look for third-party plugins

If you suspect an extension:

Check for a new version

Uninstall it if the problem persists

Slide42
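The manual search for the faulting thread can be sketched as a keyword scan over stack text. The example assumes stacks have been copied out of the debugger as `module!Function` strings, one list per thread; the thread IDs, module names and functions below are invented for illustration.

```python
KEYWORDS = ("fault", "exception", "error")


def faulting_threads(stacks: dict) -> list:
    """Thread IDs whose stack has a frame containing one of the keywords."""
    hits = []
    for thread_id, frames in stacks.items():
        if any(word in frame.lower() for frame in frames for word in KEYWORDS):
            hits.append(thread_id)
    return hits


def modules_on_stack(frames: list) -> list:
    """Distinct module names (the text before '!') in stack order."""
    seen = []
    for frame in frames:
        module = frame.split("!", 1)[0]
        if module not in seen:
            seen.append(module)
    return seen


# Hypothetical stacks, most recent frame first.
stacks = {
    1: ["ntdll!NtWaitForSingleObject", "kernel32!WaitForSingleObject", "app!MessageLoop"],
    2: ["kernel32!UnhandledExceptionFilter", "plugin!Render", "app!DrawFrame"],
}

for thread_id in faulting_threads(stacks):
    print(thread_id, modules_on_stack(stacks[thread_id]))
```

Any module in the output that does not ship with the application or Windows is a candidate extension to update or uninstall.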

The Case of the Media Foundation Crash

User tried to open a WMV file with Windows Media Player, but would get a crash:

Slide43

The Case of the Media Foundation Crash (Cont)

Attached to process and did a !analyze -v:

Slide44

The Case of the Media Foundation Crash: Solved

Did a Web search for “evr monitor crash” and found a hotfix:

User was using 5 monitors

Applied hotfix and problem solved

Slide45

Outline

Sluggish Performance

Application Hangs

Error Messages

Application Crashes

Blue Screens

Slide46

Blue Screen Crashes

Windows has various components that run in Kernel Mode, the highest privilege mode of the OS

OS components: Ntoskrnl.exe, Hal.dll

Drivers: Ntfs.sys, Tcpip.sys, device drivers

Kernel-mode components are privileged extensions to the OS that have to adhere to various rules

Not accessing invalid memory

Accessing memory at the right “Interrupt Request Level”

Not causing resource deadlocks

When a kernel-mode component performs an illegal operation, Windows crashes (blue screens)

Crashing helps preserve the integrity of user data

A resource deadlock can hang the system

Slide47

Online Crash Analysis

When you reboot after a crash, Windows offers to upload it to Microsoft Online Crash Analysis (OCA)

Automated server generates a thumbprint of the crash and uses it as a key in a database

If the database has an entry, the user is told the cause and directed at a fix

Slide48
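The thumbprint-as-database-key idea can be illustrated with a hash lookup. This is purely a conceptual sketch: how OCA actually computes its thumbprint is not described in the talk, so the fields hashed here (stop code, module, offset), the driver name and the stored advice are all assumptions made for the example.

```python
import hashlib
from typing import Optional


def thumbprint(stop_code: int, module: str, offset: int) -> str:
    """Hash a few crash attributes into a stable key (hypothetical scheme)."""
    signature = f"{stop_code:#010x}|{module.lower()}|{offset:#x}"
    return hashlib.sha256(signature.encode("utf-8")).hexdigest()


# Hypothetical database of known crashes, keyed by thumbprint.
known_causes = {
    thumbprint(0xD1, "example.sys", 0x1A2B): "Known driver bug: install the updated example.sys",
}


def lookup(stop_code: int, module: str, offset: int) -> Optional[str]:
    """Return the recorded cause for this crash, or None if it is unknown."""
    return known_causes.get(thumbprint(stop_code, module, offset))


print(lookup(0xD1, "EXAMPLE.SYS", 0x1A2B))
print(lookup(0x50, "other.sys", 0x10))
```

A miss in the lookup corresponds to the case where OCA does not know the cause and manual dump analysis is needed.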

Basic Crash Dump Analysis

Many times OCA doesn’t know the cause:

Basic crash dump analysis is easy and it might tell you the cause

Requires Windbg and symbol configuration

Dump files are in either:

\Windows\Memory.dmp: Vista+ and servers

\Windows\Minidump: Windows 2000 Pro and Windows XP

Slide49

The Case of the Spontaneous Reboots

Admin reported that server was sporadically rebooting

Other admin saw ‘case of’ talk and looked in event log:

Slide50

The Case of the Spontaneous Reboots: Solved

Crash dump showed that cpqteam.sys was likely responsible:

File properties showed it was HP ProLiant network driver and old version:

Went to HP’s site and got new version: problem solved

Slide51

Summary and More Information

A few basic tools and techniques can solve seemingly impossible problems

I learn by always trying to determine the root cause

Resources:

Webcasts of two previous “Case of the Unexplained” talks

Sysinternals->Mark’s Webcasts

Sysinternals Video Library: in-depth dive on tools and troubleshooting

My blog

Windows Internals: understand the way the OS works

If you’ve solved one, send me a description, screenshots and log files!

Slide52

Weekly, Monthly and Quarterly Rhythm of Topical Content

What is the Springboard Series?

To the IT pro, our goal is

Be the definitive resource for Desktop IT pros

Open, honest; show don’t tell

Information at right time, right level across Adoption Lifecycle

Inside of Microsoft we are

A turnkey IT pro engagement platform for depth and breadth

The program to mobilize MS marketing and field to focus on desktop OS IT pros

Visit the Springboard Series on TechNet at www.microsoft.com/springboard

The Springboard Series IT pro experience offers dynamic content and structured guidance across the adoption lifecycle

[Diagram: adoption lifecycle stages DISCOVER, EXPLORE, PILOT, DEPLOY, MANAGE]

Is it worth the pain?

How does it change my work?

Is our environment ready?

Is the organization ready?

How do I maintain and optimize?

one-Windows TechCenter in 10 languages

Virtual Roundtable Events

Springboard Technical Experts Panel Event Support and Resources

Straight-talk Monthly Feature Articles and Overview Guides

TalkingAboutWindows Video Blogs

Slide53

Resources

Sessions On-Demand & Community: www.microsoft.com/teched

Microsoft Certification & Training Resources: www.microsoft.com/learning

Resources for IT Professionals: http://microsoft.com/technet

Resources for Developers: http://microsoft.com/msdn

Slide54

Complete an evaluation on CommNet and enter to win!

Slide55

Sign up for Tech·Ed 2011 and save $500 starting June 8 – June 31st

http://northamerica.msteched.com/registration

You can also register at the North America 2011 kiosk located at registration

Join us in Atlanta next year

Slide56

© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.

MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Slide57