The Overall Disaster Recovery Process

10/24/2010 4:35:53 PM
In general, a handful of things need to be put together (that is, defined and executed upon) as the basis for an overall disaster recovery process or plan. The following list clearly identifies where you need to start:
1. Create a disaster recovery execution tasks/run book. This should include all steps to take to recover from a disaster and cover all system components that need to be recovered.

2. Arrange for or procure a server/site to recover to. This should be a configuration that can house what is needed to get you back online.

3. Guarantee that a complete database backup/recovery mechanism is in place (including offsite/alternate site archive and retrieval of databases).

4. Guarantee that an application backup/recovery mechanism is in place (for example, COM+ applications, .NET applications, web services, other application components, and so on).

5. Make sure you can completely re-create and resynchronize your security (Microsoft Active Directory, domain accounts, SQL Server logins/passwords, and so on). We call this “security resynchronization readiness.”

6. Make sure you can completely configure and open up network/communication lines. This also includes ensuring that routers are configured properly, IP addresses are made available, and so on.

7. Train your support personnel on all elements of recovery. You can never know enough ways to recover a system. And it seems that a system never recovers the same way twice.

8. Plan and execute an annual or bi-annual disaster recovery simulation. The one or two days that you do this will pay you back a hundred times over if a disaster actually occurs. And, remember, disasters come in many flavors.
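
As a minimal sketch of step 3, the following T-SQL takes a full backup with checksums and then checks that the backup set is readable. The database name dbnamexyz is the placeholder used elsewhere in this article, and the E:\SQLBackups path is a hypothetical location, so substitute your own values. RESTORE VERIFYONLY checks the backup set without restoring it, so a periodic test restore on the recovery server remains the real proof that the mechanism works.

```sql
-- Hypothetical database name and backup path: adjust both for your environment.
BACKUP DATABASE dbnamexyz
    TO DISK = N'E:\SQLBackups\dbnamexyz_full.bak'
    WITH INIT,       -- overwrite any existing backup sets in this file
         CHECKSUM,   -- validate page checksums and write a checksum for the backup
         NAME = N'dbnamexyz full backup';
GO

-- Confirm that the backup set is complete and readable without restoring it.
RESTORE VERIFYONLY
    FROM DISK = N'E:\SQLBackups\dbnamexyz_full.bak'
    WITH CHECKSUM;
GO
```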

Many organizations have gone to the concept of having hot alternate sites available via stretch clustering or log shipping techniques. Costs can be high for some of these advanced and highly redundant solutions.

The Focus of Disaster Recovery

If you create some very solid, time-tested mechanisms for re-creating your SQL Server environment, they will serve you well when you need them most. Following are the things to focus on for disaster recovery:

  • Always generate scripts for as much of your work as possible (anything created using a wizard, SSMS, and so on). These scripts will save your hide. They should include the following:

    • Complete replication buildup/breakdown scripts

    • Complete database creation scripts (DB, tables, indexes, views, and so on).

    • Complete SQL login, database user IDs and password scripts (including roles and other grants)

    • Linked/remote server setup (linked servers, remote logins)

    • Log shipping setup (source, target, and monitor servers)

    • Any custom SQL Agent tasks

    • Backup/restore scripts

    • Potentially other scripts, depending on what you have built on SQL Server

  • Make sure you document all aspects of SQL database maintenance plans being used. This includes frequencies, alerts, email addresses being notified when errors occur, backup file/device locations, and so on.

  • Document all hardware/software configurations used:

    • Leverage sqldiag.exe for this (as described in the next section).

    • Record what accounts were used to start up the SQL Agent service for an instance and MS Distributed Transaction Coordinator (MS DTC) service. This step is especially important if you’re using distributed transactions and data replication.

    • The favorite SQL Server implementation characteristics that we script and record for a SQL Server instance are

      • select @@SERVERNAME— Provides the full network name of the SQL Server and instance.

      • select @@SERVICENAME— Provides the Registry key under which Microsoft SQL Server is running.

      • select @@VERSION— Provides the date, version, and processor type for the current installation of Microsoft SQL Server.

      • exec sp_helpserver— Provides the server name; the server’s network name; the server’s replication status; and the server’s identification number, collation name, and time-out values for connecting to, or queries against, linked servers.

      • exec sp_helplogins— Provides information about logins and the associated users in each database.

      • exec sp_linkedservers— Returns the list of linked servers defined in the local server.

      • exec sp_helplinkedsrvlogin— Provides information about login mappings defined against a specific linked server used for distributed queries and remote stored procedures.

      • exec sp_server_info— Returns a list of attribute names and matching values for Microsoft SQL Server.

      • exec sp_helpdb dbnamexyz— Provides information about a specified database or all databases. This includes the database allocation names, sizes, and locations.

      • exec sp_spaceused— Set of SQL statements that provide the actual database usage information of both data and indexes for the specified database name (dbnamexyz).

        use dbnamexyz
        go
        exec sp_spaceused
        go
      • exec sp_configure: Returns the current SQL Server configuration values. Turn on the 'show advanced options' setting first so that every option is listed:

        USE master
        go
        EXEC sp_configure 'show advanced options', 1
        RECONFIGURE
        go
        EXEC sp_configure
        go
        name                                minimum      maximum     config_value  run_value
        ----------------------------------- -----------  ----------  ------------  ----------
        access check cache bucket count     0            65536       0             0
        access check cache quota            0            2147483647  0             0
        Ad Hoc Distributed Queries          0            1           0             0
        affinity I/O mask                   -2147483648  2147483647  0             0
        affinity mask                       -2147483648  2147483647  0             0
        affinity64 I/O mask                 -2147483648  2147483647  0             0
        affinity64 mask                     -2147483648  2147483647  0             0
        Agent XPs                           0            1           1             1
        allow updates                       0            1           0             0
        awe enabled                         0            1           0             0
        backup compression default          0            1           0             0
        blocked process threshold (s)       0            86400       0             0
        c2 audit mode                       0            1           0             0
        clr enabled                         0            1           0             0
        common criteria compliance enabled  0            1           0             0
        cost threshold for parallelism      0            32767       5             5
        cross db ownership chaining         0            1           0             0
        cursor threshold                    -1           2147483647  -1            -1
        Database Mail XPs                   0            1           0             0
        default full-text language          0            2147483647  1033          1033
        default language                    0            9999        0             0
        default trace enabled               0            1           1             1
        disallow results from triggers      0            1           0             0
        EKM provider enabled                0            1           0             0
        filestream access level             0            2           2             2
        fill factor (%)                     0            100         0             0
        ft crawl bandwidth (max)            0            32767       100           100
        ft crawl bandwidth (min)            0            32767       0             0
        ft notify bandwidth (max)           0            32767       100           100
        ft notify bandwidth (min)           0            32767       0             0
        index create memory (KB)            704          2147483647  0             0
        in-doubt xact resolution            0            2           0             0
        lightweight pooling                 0            1           0             0
        locks                               5000         2147483647  0             0
        max degree of parallelism           0            64          0             0
        max full-text crawl range           0            256         4             4
        max server memory (MB)              16           2147483647  2147483647    2147483647
        max text repl size (B)              -1           2147483647  65536         65536
        max worker threads                  128          32767       0             0
        media retention                     0            365         0             0
        min memory per query (KB)           512          2147483647  1024          1024
        min server memory (MB)              0            2147483647  0             0
        nested triggers                     0            1           1             1
        network packet size (B)             512          32767       4096          4096
        Ole Automation Procedures           0            1           0             0
        open objects                        0            2147483647  0             0
        optimize for ad hoc workloads       0            1           0             0
        PH timeout (s)                      1            3600        60            60
        precompute rank                     0            1           0             0
        priority boost                      0            1           0             0
        query governor cost limit           0            2147483647  0             0
        query wait (s)                      -1           2147483647  -1            -1
        recovery interval (min)             0            32767       0             0
        remote access                       0            1           1             1
        remote admin connections            0            1           0             0
        remote login timeout (s)            0            2147483647  20            20
        remote proc trans                   0            1           0             0
        remote query timeout (s)            0            2147483647  600           600
        Replication XPs                     0            1           0             0
        scan for startup procs              0            1           0             0
        server trigger recursion            0            1           1             1
        set working set size                0            1           0             0
        show advanced options               0            1           1             1
        SMO and DMO XPs                     0            1           1             1
        SQL Mail XPs                        0            1           0             0
        transform noise words               0            1           0             0
        two digit year cutoff               1753         9999        2049          2049
        user connections                    0            32767       0             0
        user options                        0            32767       0             0
        xp_cmdshell                         0            1           0             0


    • Disk configurations, sizes, and current size availability (use standard OS directory listing commands on all disk volumes being used).

    • Capture the sa login password and OS administrator password so that anything can be accessed and anything can be installed (or re-installed).

  • Document all contact information for your vendors:

    • Microsoft support services contacts (do you use “Premier Product Support Services”?)

    • Storage vendor contact info

    • Hardware vendor contact info

    • Offsite storage contact info (to get your archived copy fast)

    • Network/telecom contact info

    • Your CTO, CIO, and other senior management contact info

    • CD-ROMs available for everything (SQL Server, service packs, operating system, utilities, and so on)
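
Several of the items in the list above (logins, linked servers, and configuration values) can also be captured with plain queries against the catalog views, which makes the output easy to save alongside your scripts. The following is a minimal sketch, assuming SQL Server 2005 or later and a login with enough permission to see password hashes. It is not a replacement for the stored procedures listed earlier, only another way to record the same kind of information.

```sql
-- SQL Server logins, with the SID and password hash needed to re-create them
-- on a recovery server (CREATE LOGIN ... WITH PASSWORD = <hash> HASHED, SID = <sid>).
SELECT name, sid, password_hash, default_database_name,
       is_policy_checked, is_expiration_checked, is_disabled
FROM sys.sql_logins
WHERE name NOT LIKE '##%'   -- skip internal system logins whose names begin with ##
ORDER BY name;

-- Linked and remote servers defined on this instance (server_id 0 is the local server).
SELECT name, product, provider, data_source, is_linked
FROM sys.servers
WHERE server_id <> 0
ORDER BY name;

-- Configuration options, equivalent to the sp_configure listing with advanced options shown.
SELECT name, minimum, maximum,
       value        AS config_value,
       value_in_use AS run_value
FROM sys.configurations
ORDER BY name;
```

Saving the three result sets to files (for example, with the Results to File option in SSMS) gives you a dated record to keep with the rest of the disaster recovery documentation.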

sqldiag.exe

One good way to get a complete environmental picture is to run the sqldiag.exe program provided with SQL Server 2008 on your production box (which you would have to re-create on an alternate site if a disaster occurred). It is located in the Binn directory where all SQL Server executables reside (C:\Program Files\Microsoft SQL Server\100\Tools\Binn). It shows how the server is configured, all hardware and software components (and their versions), memory sizes, CPU types, operating system version and build information, paging file information, environment variables, and so on. If you run this program on your production server periodically, it serves as good environment documentation to supplement your disaster recovery plan. This utility is also used to capture and diagnose SQL Server-wide issues and has a prompt that you must respond to when re-creating issues on which you want to collect diagnosis information. For the purposes of this chapter, when prompted for the SQLDIAG Collection, you can just terminate that portion by pressing Ctrl+C. Figure 1 shows the expected execution DOS windows and system information dialog window.

Figure 1. sqldiag.exe execution.


To run this utility, you open a DOS command prompt and change directory to the SQL Server Binn directory. Then, at the command prompt, you run sqldiag.exe:

C:\Program Files\Microsoft SQL Server\100\Tools\Binn> sqldiag.exe

The results are written into several text files within the SQLDIAG subdirectory. Each file contains different types of data about the physical machine (server) that SQL Server is running on and about each SQL Server instance. The machine (server) information is stored in a file named XYZ_MSINFO32.TXT, where XYZ is the machine name. It contains a verbose snapshot of everything that relates to SQL Server in one way or another, including the hardware configuration, drivers, and so on. This is the metadata and configuration information most tightly coupled to the SQL Server instance. The following is an example of what it contains:

System Information report written at: 09/11/09 22:13:16
System Name: DBARCH-LT2
[System Summary]

Item                             Value
OS Name                          Microsoft® Windows Vista™ Home Premium
Version                          6.0.6001 Service Pack 1 Build 6001
Other OS Description             Not Available
OS Manufacturer                  Microsoft Corporation
System Name                      DBARCH-LT2
System Manufacturer              Hewlett-Packard
System Model                     HP G60 Notebook PC
System Type                      x64-based PC
Processor                        Pentium(R) Dual-Core CPU T4300 @ 2.10GHz, 2100 Mhz, 2 Core(s), 2 Logical Processor(s)
BIOS Version/Date                Hewlett-Packard F.3C, 6/23/2009
SMBIOS Version                   2.4
Windows Directory                C:\Windows
System Directory                 C:\Windows\system32
Boot Device                      \Device\HarddiskVolume1
Locale                           United States
Hardware Abstraction Layer       Version = "6.0.6001.18000"
User Name                        DBARCH-LT2\DBARCH
Time Zone                        Pacific Daylight Time
Installed Physical Memory (RAM)  Not Available
Total Physical Memory            3.90 GB
Available Physical Memory        1.87 GB
Total Virtual Memory             8.04 GB
Available Virtual Memory         5.63 GB
Page File Space                  4.20 GB
Page File                        C:\pagefile.sys


and so on.

A separate file is generated for each SQL Server instance installed on the server. These files are named XYZ_ABC_sp_sqldiag_Shutdown.OUT, where XYZ is the machine name and ABC is the SQL Server instance name. Each file contains most of the internal SQL Server configuration information for that instance, including a snapshot of the SQL Server error log as the instance runs on this machine. The following example shows this critical information from the DBARCH-LT2_SQL08DE01_sp_sqldiag_Shutdown.OUT file:

2009-09-07 23:50:21.540 Server Microsoft SQL Server 2008 (SP1) - 10.0.2531.0 (X64)
    Mar 29 2009 10:11:52
    Copyright (c) 1988-2008 Microsoft Corporation
    Developer Edition (64-bit) on Windows NT 6.0 <X64> (Build 6001: Service Pack 1)
2009-09-07 23:50:21.560 Server (c) 2005 Microsoft Corporation.
2009-09-07 23:50:21.560 Server All rights reserved.
2009-09-07 23:50:21.560 Server Server process ID is 1884.
2009-09-07 23:50:21.560 Server Logging SQL Server messages in file 'C:\Program Files\Microsoft SQL Server\MSSQL10.SQL08DE01\MSSQL\Log\ERRORLOG'.
2009-09-07 23:50:21.570 Server Registry startup parameters:
    -d C:\Program Files\Microsoft SQL Server\MSSQL10.SQL08DE01\MSSQL\DATA\master.mdf
    -e C:\Program Files\Microsoft SQL Server\MSSQL10.SQL08DE01\MSSQL\Log\ERRORLOG
    -l C:\Program Files\Microsoft SQL Server\MSSQL10.SQL08DE01\MSSQL\DATA\mastlog.ldf
2009-09-07 23:50:21.610 Server Detected 2 CPUs. This is an informational message; no user action is required.
2009-09-07 23:50:21.910 Server Using dynamic lock allocation. Initial allocation of 2500 Lock blocks and 5000 Lock Owner blocks per node. This is an informational message only. No user action is required.
2009-09-07 23:50:23.050 spid7s FILESTREAM: effective level = 3, configured level = 3, file system access share name = 'SQL08DE01'.
2009-09-07 23:50:23.820 spid7s Server name is 'DBARCH-LT2\SQL08DE01'. This is an informational message only. No user action is required.


From this output, you are able to ascertain the complete SQL Server instance information as it was running on the primary site. It is excellent documentation for your SQL Server implementation. We suggest that you run this utility regularly and compare the outcome with prior executions to guarantee that you know exactly what you have to have in place in case of disaster.
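
The sqldiag output is plain text, so comparing runs means comparing files. As a complementary sketch (this is not a sqldiag feature), the instance configuration values can be snapshotted into a table and compared in T-SQL. The table name dbo.ConfigSnapshot is hypothetical, and the example assumes it lives in an administrative database of your choosing.

```sql
-- Hypothetical snapshot table; create it once in an administrative database.
CREATE TABLE dbo.ConfigSnapshot (
    captured_at datetime     NOT NULL,
    name        nvarchar(35) NOT NULL,
    run_value   int          NULL,
    CONSTRAINT PK_ConfigSnapshot PRIMARY KEY (captured_at, name)
);
GO

-- Run on a schedule (for example, from a SQL Agent job) to record the current values.
DECLARE @now datetime;
SET @now = GETDATE();   -- one timestamp shared by every row of this capture

INSERT INTO dbo.ConfigSnapshot (captured_at, name, run_value)
SELECT @now, name, CAST(value_in_use AS int)
FROM sys.configurations;
GO

-- Report options whose running value changed between the two most recent captures.
WITH ranked AS (
    SELECT captured_at, name, run_value,
           DENSE_RANK() OVER (ORDER BY captured_at DESC) AS capture_rank
    FROM dbo.ConfigSnapshot
)
SELECT cur.name,
       prev.run_value AS previous_value,
       cur.run_value  AS current_value
FROM ranked AS cur
JOIN ranked AS prev
  ON prev.name = cur.name
 AND prev.capture_rank = 2
WHERE cur.capture_rank = 1
  AND cur.run_value <> prev.run_value;
GO
```

An empty result from the last query means nothing changed between the two captures. Any rows returned are settings the recovery site needs to match.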

Planning and Executing a Disaster Recovery

The process of planning and executing a complete disaster recovery is serious business, and many companies around the globe set aside a few days a year to perform this exact task. Here’s what it involves:

  • Simulate a disaster.

  • Record all actions taken.

  • Time all events from start to finish. Sometimes this means someone is standing around with a stopwatch.

  • Hold a postmortem following the DR simulation.
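
One simple way to record actions and timings during a simulation is a small log table on a server that is not part of the simulated outage. The sketch below is only an illustration: the table dbo.DrDrillLog, its columns, and the sample row are all hypothetical.

```sql
-- Hypothetical log table; keep it on a server that is NOT part of the simulated outage.
CREATE TABLE dbo.DrDrillLog (
    step_id      int IDENTITY(1,1) PRIMARY KEY,
    drill_name   nvarchar(100) NOT NULL,
    action_taken nvarchar(400) NOT NULL,
    performed_by sysname       NOT NULL DEFAULT SUSER_SNAME(),
    started_at   datetime      NOT NULL,
    finished_at  datetime      NULL      -- NULL while the step is still in progress
);
GO

-- Illustrative entry only; the drill name, action, and times are made up.
INSERT INTO dbo.DrDrillLog (drill_name, action_taken, started_at, finished_at)
VALUES (N'Example drill',
        N'Restore dbnamexyz full backup on the recovery server',
        '2010-10-24T09:00:00', '2010-10-24T09:42:00');
GO

-- Elapsed minutes per step, for review in the postmortem.
SELECT step_id, action_taken, performed_by,
       DATEDIFF(minute, started_at, finished_at) AS elapsed_minutes
FROM dbo.DrDrillLog
WHERE drill_name = N'Example drill'
ORDER BY started_at;

-- Total duration of the drill from the first step started to the last step finished.
SELECT drill_name,
       MIN(started_at)  AS drill_started,
       MAX(finished_at) AS drill_finished,
       DATEDIFF(minute, MIN(started_at), MAX(finished_at)) AS total_minutes
FROM dbo.DrDrillLog
GROUP BY drill_name;
GO
```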

Many companies tie the results of a DR simulation to the IT group’s salaries (their raise percentage). This is more than enough motivation for IT to get this drill right and to perform well.

Correcting any failures or issues that occur is critical. The next time might not be a simulation.
