Windows Sysinternals : Windows Core Concepts - Call Stacks and Symbols

8/8/2011 6:18:45 PM

1. Processes, Threads, and Jobs

Although programs and processes appear similar on the surface, they are fundamentally different. A program is a static sequence of instructions, whereas a process is a container for a set of resources used to execute a program. At the highest level of abstraction, a Windows process comprises the following:

  • A unique identifier called a process ID (PID).

  • At least one thread of execution. Every thread in a process has full access to all the resources referenced by the process container.

  • A private virtual address space, which is a set of virtual memory addresses that the process can use to store and reference data and code.

  • An executable program, which defines initial code and data and is mapped into the process’ virtual address space.

  • A list of open handles to various system resources, such as semaphores, communication ports, and files.

  • A security context called an access token that identifies the user, security groups, privileges, UAC virtualization state, LSA logon session ID, and terminal services session ID.

Each process also has a record of the PID of its parent process. However, if the parent exits, this information is not updated. Therefore, it is possible for a process to reference a nonexistent parent or even a different process that has been assigned the original parent’s PID. A process records its parent PID only for informational purposes, however.

Windows provides an extension to the process model called a job. A job object’s main function is to allow groups of processes to be managed and manipulated as a unit. For example, a job can be used to terminate a group of processes all at once instead of one at a time and without the calling process having to know which processes are in the group. A job object also allows control of certain attributes and provides limits for the process or processes associated with the job. For example, jobs can enforce per-process or job-wide limits on user-mode execution time and committed virtual memory. Windows Management Instrumentation (WMI) loads its providers into separate host processes controlled by a job that limits memory consumption as well as the total number of WMI provider host processes that can run at one time.

As mentioned, a process is merely a container. Technically, it is not the process that runs—it is its threads. A thread is the entity within a process that Windows schedules for execution, and it includes the following essential components:

  • The contents of a set of CPU registers representing the state of the processor. These include an instruction pointer that identifies the next machine instruction the thread will execute.

  • Two stacks, one for the thread to use while executing in kernel mode and one for executing in user mode.

  • A private storage area called thread-local storage (TLS) for use by subsystems, run-time libraries, and dynamic-link libraries (DLLs).

  • A unique identifier called a thread ID (TID). Process IDs and thread IDs are generated from the same namespace, so they never overlap.

  • Optionally, its own security context, which multithreaded server applications often use to impersonate the security context of the clients they serve.

Although threads have their own execution context, every thread within a process shares the process’ virtual address space (in addition to the rest of the resources belonging to the process), meaning that all the threads in a process can write to and read from one another’s memory. Threads cannot reference the address space of another process, however, unless the other process makes available part of its private address space as a shared memory section (called a file mapping object in the Windows API) or unless one process has the right to open another process to use cross-process memory functions.
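The shared address space described above can be illustrated with a minimal sketch. It uses Python's standard threading module rather than the Windows API directly, and the names shared and worker are invented for the example.

```python
import threading

# One list object living in the process's address space. Every thread in
# the process can reach it without any cross-process mechanism.
shared = []


def worker(n):
    # Each thread writes into the same memory the other threads can read.
    shared.append(n)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread sees the writes made by all four worker threads.
print(sorted(shared))
```

A second process could not see this list at all unless a shared memory section or a cross-process memory function were used, as the paragraph above notes.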

By default, threads don’t have their own access token, but they can obtain one, thus allowing individual threads to impersonate a different security context—including that of a process running on a remote Windows system—without affecting other threads in the process.

2. User Mode and Kernel Mode

To prevent user applications from accessing or modifying critical operating system data, Windows uses two processor access modes: user mode and kernel mode. All processes other than the System process run in user mode (Ring 3 on Intel x86 and x64 architectures), whereas device drivers and operating system components such as the executive and kernel run only in kernel mode. Kernel mode refers to a mode of execution (Ring 0 on x86 and x64) in a processor that grants access to all system memory and to all CPU instructions. By providing the low-level operating system software with a higher privilege level than user-mode processes have, the processor provides a necessary foundation for operating system designers to ensure that a misbehaving application can’t disrupt the stability of the system as a whole.


Note:

Do not confuse the user-mode vs. kernel-mode distinction with that of user rights vs. administrator rights. “User mode” in this context does not mean “has only standard user privileges.”


Although each Windows process has its own private memory space, the kernel-mode operating system and device driver code share a single virtual address space that is also included in the address space of every process. The operating system tags each page of virtual memory with the access mode the processor must be in to read or write the page. Pages in system space can be accessed only from kernel mode, whereas all pages in the user address space are accessible from user mode.

Threads of user-mode processes switch from user mode to kernel mode when they make a system service call. For example, a call into the Windows ReadFile API eventually needs to call the internal Windows routine that actually handles reading data from a file. That routine, because it accesses internal system data structures, must run in kernel mode. The transition from user mode to kernel mode is accomplished by the use of a special processor instruction that causes the processor to switch to a system service dispatching function in kernel mode. The operating system executes the corresponding internal function, which for ReadFile is the NtReadFile kernel function. Kernel service functions validate parameters and perform appropriate access checks using the Security Reference Monitor before they execute the requested operation. When the function finishes, the operating system switches the processor mode back to user mode.

Thus, it is normal for a thread in a user-mode process to spend part of its time executing in user mode and part in kernel mode. In fact, because the bulk of the graphics and windowing system also runs in kernel mode, processes hosting graphics-intensive applications can spend more of their time in kernel mode than in user mode. You can see these two modes in the Process Explorer CPU usage graphs: the red portion of the graph represents time spent in kernel mode, and the green area of the graph represents time spent in user mode.

3. Handles

The kernel-mode core of Windows, which is implemented in Ntoskrnl.exe, consists of various subsystems such as the Memory Manager, Process Manager, I/O Manager, and Configuration Manager (registry), which are all parts of the Executive. Each of these subsystems defines one or more types with the Object Manager to represent the resources they expose to applications. For example, the Configuration Manager defines the Key object to represent an open registry key; the Memory Manager defines the Section object for shared memory; the Executive defines Semaphore, Mutant (the internal name for a mutex), and Event synchronization objects (which are objects that wrap fundamental data structures defined by the operating system’s Kernel subsystem); the I/O Manager defines the File object to represent open instances of device-driver resources, which include file system files; and the Process Manager creates Thread and Process objects. Every release of Windows introduces new object types, with Windows 7 defining a total of 42. You can see the object types that a particular version of Windows defines by running the WinObj utility with administrative rights and navigating to the ObjectTypes directory in the Object Manager namespace.

When an application wants to use one of these resources, it first must call the appropriate API to create or open the resource. For instance, the CreateFile function opens or creates a file, the RegOpenKeyEx function opens a registry key, and the CreateSemaphoreEx function opens or creates a semaphore. If the function succeeds, Windows allocates a reference to the object in the process’ handle table, which is maintained by the Executive, and returns the index of the new handle table entry to the application.

This handle value is what the application uses for subsequent operations on the resource. To query or manipulate the resource, the application passes the handle value to API functions such as ReadFile, SetEvent, SetThreadPriority, and MapViewOfFile. The system can look up the object the handle refers to by indexing into the handle table to locate the corresponding handle entry, which contains a pointer to the object. The handle entry also stores the accesses the process was granted at the time it opened the object, which enables the system to make sure it doesn’t allow the process to perform an operation on the object for which it didn’t ask permission. For example, if the process successfully opened a file for read access but tried to use the handle to write to the file, the function would fail.
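The lookup and access check just described can be sketched as a toy model. This is not how the Executive is implemented: the class names, the READ and WRITE flag values, and the handle numbering are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical access-right flags; these are not the real Windows values.
READ = 0x1
WRITE = 0x2


@dataclass
class HandleEntry:
    obj: object          # stands in for the pointer to the underlying object
    granted_access: int  # rights granted when the handle was opened


class HandleTable:
    """Toy per-process handle table keyed by handle value."""

    def __init__(self):
        self._entries = {}
        self._next = 4  # illustrative starting value and step

    def open(self, obj, desired_access):
        handle = self._next
        self._next += 4
        self._entries[handle] = HandleEntry(obj, desired_access)
        return handle

    def reference(self, handle, required_access):
        entry = self._entries.get(handle)
        if entry is None:
            raise KeyError("invalid handle")
        # Fail if the operation needs a right that was not granted at open.
        if entry.granted_access & required_access != required_access:
            raise PermissionError("access not granted when handle was opened")
        return entry.obj

    def close(self, handle):
        del self._entries[handle]


table = HandleTable()
h = table.open("C:\\example.txt", READ)   # opened for read access only
print(table.reference(h, READ))           # read succeeds
try:
    table.reference(h, WRITE)             # write was never requested
except PermissionError as exc:
    print("write failed:", exc)
```

Storing the granted access in the entry means the check at use time is a simple bit test, with no need to re-evaluate the object's security settings on every operation.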

When a process no longer needs access to an object, it can release its handle to that object, typically by passing the handle value to the CloseHandle API. (Note that some resource managers provide a different API to release their resources.) When a process exits, any handles it still possesses are closed.

4. Call Stacks and Symbols

Several Sysinternals utilities—including Process Explorer, Process Monitor, and VMMap—can display call stacks, which are details about the code paths being executed at a particular point in time. Associating symbols with the modules in a process’ address space provides more meaningful context information about those code paths, particularly within Windows operating system code. Understanding call stacks and symbols, and how to configure them in the Sysinternals utilities, gives tremendous insight into a process’ behavior and can often lead to the root cause of a problem.

4.1. What Is a Call Stack?

Executable code in a process is normally organized as a collection of discrete functions. To perform its tasks, a function can invoke other functions (subfunctions). When a function has finished, it returns control back to the function that called it.

A made-up example, shown in Figure 1, demonstrates this flow. MyApp.exe ships with a DLL named HelperFunctions.dll. That DLL includes a function named EncryptThisText that encrypts text passed to it. After performing some preparatory operations, EncryptThisText calls the Windows API CryptEncryptMessage in Crypt32.dll. At some point, CryptEncryptMessage needs to allocate some memory and invokes the memory-allocation function malloc in Msvcrt.dll. After malloc has done its work and allocated the requested memory, execution resumes at the point where CryptEncryptMessage had left off. And when CryptEncryptMessage has completed its task, control returns back to the point in EncryptThisText just after its call to CryptEncryptMessage.

Figure 1. Example function calling sequence.


The call stack is the construct that allows the system to know how to return control to a series of callers, as well as to pass parameters between functions and to store local function variables. It’s organized in a “last in, first out” manner, where functions remove items in the reverse order from how they add them. When a function is about to call a subfunction, it puts the memory address of the next instruction to execute upon returning from the subfunction (its “return address”) at the top of the stack. When that subfunction calls yet another function, it adds its own return address to the stack. On returning from a function, the system retrieves whatever address is at the top of the stack and begins executing code from that point.
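The "last in, first out" behavior can be sketched with a toy model that tracks only return addresses. The function names come from the made-up example above; the call and ret helpers and the descriptive labels are assumptions for illustration, and a real stack also carries parameters and local variables.

```python
# Toy model of a call stack that tracks only return addresses.
stack = []
trace = []  # order in which control comes back to each caller


def call(return_address):
    """The caller pushes where to resume, then control moves to the callee."""
    stack.append(return_address)


def ret():
    """Returning pops the most recently pushed address and resumes there."""
    address = stack.pop()
    trace.append(address)
    return address


# Labels are descriptive placeholders, not real addresses.
call("MyApp.exe, after calling EncryptThisText")
call("EncryptThisText, after calling CryptEncryptMessage")
call("CryptEncryptMessage, after calling malloc")

ret()  # malloc finishes
ret()  # CryptEncryptMessage finishes
ret()  # EncryptThisText finishes

for address in trace:
    print("resumed at:", address)
```

Control returns in the reverse of the order in which the calls were made, which is exactly what reading a call stack from top to bottom shows.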

The convention for displaying a return address in a call stack is module!function+offset, where module is the name of the executable image file containing the function, and offset is the number of bytes (in hexadecimal) past the beginning of the function. If the function name is not available, the address is shown simply as “module+offset”. While malloc is executing in the fictitious example just given, the call stack might look like this:

msvcrt!malloc+0x2a
crypt32!CryptEncryptMessage+0x9f
HelperFunctions!EncryptThisText+0x43
MyApp.exe+0x25d8

As you can see, a call stack not only tells you what piece of code is executing, it also tells you how the program got there.
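When working with stack text copied out of a tool, it can help to split each frame into its parts. The following is a minimal sketch, assuming frames follow exactly the module!function+offset or module+offset convention shown above. The Frame type and parse_frame helper are hypothetical names, and function names that themselves contain a plus sign are not handled.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    module: str
    function: Optional[str]  # None when only module+offset is available
    offset: int              # bytes past the start of the function or module


_FRAME_RE = re.compile(
    r"^(?P<module>[^!+]+)"                  # module name
    r"(?:!(?P<function>[^+]+))?"            # optional !function
    r"(?:\+0x(?P<offset>[0-9a-fA-F]+))?$"   # optional +0x<hex offset>
)


def parse_frame(text):
    """Parse one line of a call stack into a Frame."""
    match = _FRAME_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized frame: {text!r}")
    offset = int(match["offset"] or "0", 16)
    return Frame(match["module"], match["function"], offset)


stack_text = """\
msvcrt!malloc+0x2a
crypt32!CryptEncryptMessage+0x9f
HelperFunctions!EncryptThisText+0x43
MyApp.exe+0x25d8
"""

frames = [parse_frame(line) for line in stack_text.splitlines()]
for frame in frames:
    print(frame)
```

Keeping the offset as an integer makes it easy to flag frames with unusually large offsets, which the discussion of symbols below treats as a sign that the reported name may be wrong.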

4.2. What Are Symbols?

When inspecting a thread start address or a return address on a call stack, a debugger can easily determine what module it belongs to by examining the list of loaded modules and their address ranges. However, when a compiler converts a developer’s source code into computer instructions, it does not retain the original function names. The one exception is that a DLL includes an export table that lists the names and offsets of the functions it makes available to other modules. However, the export table does not list the names of the library’s internal functions, nor does it list the names of COM entry points that are designed to be discovered at runtime.


Note:

Executable files loaded in user-mode processes are generally either EXE files with which a new process can be started or DLL files that are loaded into an existing process. EXE and DLL files are not restricted to using those two file extensions, however. Files with COM or SCR extensions are actually EXE files, while ACM, AX, CPL, DRV, and OCX are examples of other file extensions of DLLs. And installation programs commonly extract and launch EXE files with TMP extensions.


When creating executable files, compilers and linkers can also create corresponding symbol files (with the default extension PDB). Symbol files hold a variety of data that is not needed when running the executable code but which can be useful during debugging, including the names and entry point offsets of functions within the module. With this information, a debugger can take a memory address and easily identify the function with the closest preceding address. Without symbols, the debugger is limited to using exported functions, if any, which might have no relation at all to the code being executed. In general, the larger the offset on a return address, the less likely the reported function name is to be accurate.
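The "closest preceding address" lookup, and the reason large offsets are suspicious, can be sketched as follows. The module layout is entirely invented: the function names and offsets are hypothetical, and a real debug engine reads them from the export table or a PDB file instead of a Python list.

```python
import bisect


def resolve(address, symbols):
    """Return name+offset for the symbol with the closest preceding address.

    symbols is a list of (start_offset, name) pairs sorted by start_offset.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return None  # address lies before the first known symbol
    start, name = symbols[index]
    return f"{name}+0x{address - start:x}"


# Hypothetical module: two exported functions and two internal ones.
exports_only = [(0x1000, "ExportedA"), (0x9000, "ExportedB")]
full_symbols = [
    (0x1000, "ExportedA"),
    (0x1400, "InternalHelper"),
    (0x5200, "InternalParse"),
    (0x9000, "ExportedB"),
]

address = 0x5243
print(resolve(address, exports_only))  # nearest export, with a large offset
print(resolve(address, full_symbols))  # actual function, with a small offset
```

With only the export table, the address is attributed to ExportedA with an offset of more than 16 KB, even though the code belongs to an unrelated internal function. With symbols, the same address resolves to a name a few dozen bytes away.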


Note:

The Sysinternals utilities are able to use only native (unmanaged) symbol files when reporting call stacks. They are not able to report function names within JIT-compiled .NET assemblies.


A symbol file must be built at the same time as its corresponding executable or it will not be correct and the debug engine might refuse to use it. Older versions of Microsoft Visual C++ created symbol files only for Debug builds unless the developer explicitly changed the build configuration. Newer versions now create symbol files for Release builds as well, writing them into the same folder with the executable files. Microsoft Visual Basic 6 can create symbol files, but it does not do so by default.

Symbol files can contain differing levels of detail. Full symbol files (sometimes called private symbol files) contain details that are not found in public symbol files, including the path to and the line number within the source file where the symbol is defined, function parameter names and types, and variable names and types. Software companies that make symbol files externally available typically release only public symbol files, while retaining the full symbol files for internal use.

The Debugging Tools for Windows make it possible to download correct symbol files on demand from a symbol server. The server can store symbol files for many different builds of a given executable file, and the Debugging Tools will download the one that matches the image you are debugging. (They use the timestamp and checksum stored in the executable’s header as a unique identifier.)

Microsoft has a symbol server accessible over the Web that makes Windows’ public symbol files freely available. By installing the Debugging Tools for Windows and configuring the Sysinternals utilities to use the Microsoft symbol server, you can easily see what Windows functions are being invoked by your processes.

Figure 2 shows a call stack for an event captured with Process Monitor. The presence of MSVBVM60.DLL on the stack (frames 15 and 17–21) indicates that this is a Visual Basic 6 program because MSVBVM60.DLL is the Visual Basic 6 runtime DLL. The large offsets for the MSVBVM60 frames suggest that symbols are not available for that module and that the names shown are not the actual functions being called. Frame 14 shows a call into a function named Form1::cmdCreate_Click in the main executable (LuaBugs_VB6.exe). This frame also shows a source file path, indicating that we have full symbolic information for this third-party module. This function then calls CWshShell::RegWrite in Wshom.ocx (frame 13), indicating that this Visual Basic 6 program is using a Windows Script Host ActiveX to write to the registry. CWshShell::RegWrite calls an internal function in the same module (frame 12), which calls the documented RegCreateKeyExA Windows API in Kernel32.dll (frame 11). Execution passes through Kernel32 internal functions (frames 8–10) and then into the ZwCreateKey native API in Ntdll.dll (frame 7). So far, all of these functions have executed in user mode, as indicated by the U in the Frame column, but in frame 6 the program transitions to kernel mode, indicated by the K. The two-letter prefixes of the kernel functions (frames 0–6) identify the executive components to which they belong. For example, Cm refers to the Configuration Manager, which is responsible for the registry, and Ob refers to the Object Manager. It was during the processing of CmpCallCallBacks (frame 0) that this stack trace was captured. Note that the symbolic information shown in frames 0–13 was all derived from Windows public symbols downloaded on demand by Process Monitor from Microsoft’s symbol server.

Figure 2. Process Monitor call stack with information from symbol files.


4.3. Configuring Symbols

The Sysinternals utilities that use symbols require two pieces of information, as shown in Figure 3: the location of the Dbghelp.dll to use, and the symbols path. The Sysinternals utilities that can use full symbolic information to display source files also request source code paths.

Dbghelp.dll is one of Microsoft’s debug engine DLLs, and it provides the functionality for walking a call stack, loading symbol files, and resolving process memory addresses to names. Only the version of Dbghelp.dll that ships in the Debugging Tools for Windows supports the downloading of files from symbol servers. The Dbghelp.dll that ships with Windows in the %SystemRoot%\System32 directory can use only symbol files stored locally. The first time you run them, the Sysinternals utilities check the default installation locations for the Debugging Tools and use the Dbghelp.dll found there. If none is found, they default to the version in %SystemRoot%\System32.

Figure 3. Process Explorer’s Configure Symbols dialog box.


The URL for the Debugging Tools for Windows is http://www.microsoft.com/whdc/devtools/debugging/default.mspx. The Debugging Tools installer used to be a standalone download, but it is now incorporated into the Windows SDK. To get the Debugging Tools, you must run the SDK installer and select the Debugging Tools options you want. Among the options are the Debugging Tools redistributables, which are the standalone Debugging Tools installers, available for x86, x64, and IA64. The redistributables are handy for installing the debuggers to other machines in your environment without having to run the full SDK installer on each of them.

The symbols path tells the debugging engine where to search for symbol files if they cannot be found in default locations. The two default locations that the debugging engine searches for symbol files before checking the symbols path are the executable’s folder and the folder where the symbol file was originally created, if that information is in the executable file.

The symbols path can consist of file system folders and symbol server directives. The first time you run it, the Sysinternals utility will set its symbol path to the value of the _NT_SYMBOL_PATH environment variable. If that variable is not defined, the utility sets its symbol path to srv*http://msdl.microsoft.com/download/symbols, which uses the Microsoft public symbol server but does not save the downloaded symbol files to a local cache.

File system folders and symbol server directives can be intermixed in the symbols path, separated with a semicolon. Each element is searched in the order it appears in the path. As implied earlier, symbol server directives are of the form srv*DownstreamStore*SymbolServer. Consider the following symbols path:

C:\MySyms;srv*C:\MSSymbols*http://msdl.microsoft.com/download/symbols

The debugging engine will first search the default locations and then C:\MySyms, which could be a good place to put your own applications’ private symbol files. If it hasn’t found the symbol file, it then searches C:\MSSymbols, and if the file isn’t there it finally queries the symbol server. If the symbol server has the file, the debugging engine downloads the file to C:\MSSymbols.
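The search order just described can be made explicit with a small sketch that breaks a symbols path into the locations checked, in order. It handles only the two directive forms mentioned in this section (srv*SymbolServer and srv*DownstreamStore*SymbolServer) and leaves out the default locations searched beforehand. The parse_symbol_path helper and the "folder", "cache", and "server" labels are hypothetical, and the real debugging engine accepts more forms than this.

```python
def parse_symbol_path(path):
    """Split a symbols path into an ordered list of (kind, location) pairs."""
    locations = []
    for element in path.split(";"):
        element = element.strip()
        if not element:
            continue
        parts = element.split("*")
        if parts[0].lower() == "srv":
            if len(parts) == 3:
                # srv*DownstreamStore*SymbolServer: check the cache first.
                locations.append(("cache", parts[1]))
                locations.append(("server", parts[2]))
            elif len(parts) == 2:
                # srv*SymbolServer: no downstream store was named.
                locations.append(("server", parts[1]))
            else:
                raise ValueError(f"unsupported directive: {element!r}")
        else:
            # Anything else is treated as a plain file system folder.
            locations.append(("folder", element))
    return locations


example = r"C:\MySyms;srv*C:\MSSymbols*http://msdl.microsoft.com/download/symbols"
for kind, location in parse_symbol_path(example):
    print(f"{kind:7} {location}")
```

For the example path, the sketch lists C:\MySyms, then the C:\MSSymbols cache, then the symbol server, matching the order in the paragraph above.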

See the Debugging Tools documentation for more information about symbol paths, symbol servers, source paths, and environment variables used by the debugging engine.


Tip:

If the Microsoft public symbols are the only symbols you need, set the symbols path to the following:

srv*c:\symbols*http://msdl.microsoft.com/download/symbols

This directs the debugging engine first to search the cache under C:\Symbols and then to download symbol files as needed from the Microsoft public symbol server, saving them into the cache so that they won’t need to be downloaded again. The debugging engine will create C:\Symbols if it doesn’t already exist.
