AlwaysOn Sanjay Mishra Program Manager Microsoft Corporation DBI316 Setting the Stage Assumed Prerequisites for this presentation Basic knowledge of AlwaysOn Failover Cluster Instances FCI ID: 543124
Download Presentation The PPT/PDF document "SQLCAT: SQL Server HA and DR Design Patt..." is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
Slide1
SQLCAT: SQL Server HA and DR Design Patterns, Architectures, and Best Practices Using Microsoft SQL Server 2012 AlwaysOn
Sanjay MishraProgram ManagerMicrosoft Corporation
DBI316Slide2
Setting the Stage
Assumed Pre-requisites for this presentation: Basic
knowledge
of
AlwaysOn
Failover Cluster Instances (FCI)
AlwaysOn
Availability Groups (AG)
Definition: For the purpose of this presentation
High Availability (Local HA): Availability within a data center
Disaster Recovery (DR): Availability across data
centersSlide3
Setting the Stage
AlwaysOn
≠ Availability
Groups
AlwaysOn
= { SQL Server Failover Cluster Instances, Availability Groups
}
Availability Groups ≠ Database
MirroringSlide4
SQL Server 2012 AlwaysOn HA+DR Design Patterns
SQL Server 2012
AlwaysOn
HA+DR Design
Pattern
Solution Characteristics
Corresponding
Pre-SQL Server 2012 Solution
1
Multi-site
Failover Cluster Instance (FCI) for HA and DRShared Storage solution 11 Masked by storage replication Multi-site FCI using stretch VLAN
2Availability Group for HA and DRNon-Shared Storage solutionDatabase Mirroring for Local HA and Log Shipping for DR
3Failover Cluster Instance for local HA + Availability Group for DRCombined Shared Storage and Non-Shared StorageFailover Cluster Instance for Local HA and Database Mirroring for DR
Slight variations of these design patterns are occasionally observed as well.Slide5
Wednesday, June 13, 10:15 AM – 11:30 AM
SQLCAT: HA/DR Customer Panel - Microsoft SQL Server 2012 AlwaysOn Deployment Considerations
N 320EMichael Steineke, Edgenet, Inc.David P. Smith, ServiceU Corporation
Ayad
Shammout
,
CareGroup
Healthcare Systems
Wolfgang Kutschera, bwin partyThomas Grohser, Hedge fund in ConnecticutSlide6
title
Multi-site Failover Cluster Instance for HA and DRSlide7
Multi-site Failover Cluster Instance for HA and DR
Primary Site
DR Site
Node 1
Node 2
Node 3
Node 4
Windows Server Failover Cluster
Storage Replication
SQL-FCI
Active
PassivePassivePassiveSQL Server 2012 AlwaysOn HA+DR Solution
Solution CharacteristicsCorresponding Pre-SQL Server 2012 Solution1Multi-site Failover Cluster Instance (FCI) for HA and DR(http://sqlcat.com/sqlcat/b/whitepapers/archive/2011/12/22/sql-server-2012-alwayson_3a00_-multisite-failover-cluster-instance.aspx)
Shared Storage solution
1
Instance Level HA (automatic)
Instance Level DR
(automatic
2
)
Uses
storage replication
1
Masked by storage replication
2
Consider 3
rd
data center
Multi-site FCI using
stretch VLANSlide8
Multi-site Failover Cluster Instance
Key Elements
A single SQL Server failover cluster instance (FCI) providing HA as well as DRspanning across multiple sites (usually multiple subnets as well)
Key components:
Storage
Storage level replication
Cluster Enabler
Provided by the storage vendor
Work with your storage vendor to get the appropriate software and best practices
NetworkMulti-subnet support in SQL Server configuration and engineKey improvement in SQL Server 2012IP address OR dependency set within SQL Server setupSQL Engine skips binding to any IP’s which are not online at start-upRegisterAllProvidersIP for Network Name improves application failover timeSlide9
Storage Validation
Storage Validation Check Requirement is relaxed due to make-up of multi-site storage infrastructure (but still get the pop-up!)Multi-site FCI Solution does not require passing the storage validation tests, to be supported.
http://support.microsoft.com/kb/943984
Appropriate
Quorum Model
Validation suggests “
Node and Disk Majority
” which can be ignored
Consider “Node
and File Share Majority” or “Node Majority” based on number of nodesMulti-site Failover Cluster InstanceDeployment ConsiderationsSlide10
TEMPDB on Local Disk
Not specific to “multi-site” FCIs, but has some great positive side effects for “multi-site” scenariosEnables use of local storage for TEMPDB
Can use solid state storage to improve performance of TEMPDB-heavy workloadsSaves money on storage replication licensingReduces cross-data center storage replication traffic
Multi-site Failover Cluster Instance
Deployment ConsiderationsSlide11
titleAvailability Groups for
HA and DRSlide12
Availability Groups for HA and DR
SQL Server 2012
AlwaysOn
HA+DR Solution
Solution Characteristics
Corresponding
Pre-SQL Server 2012 Solution
2
Availability Group for HA and DR
Non-Shared Storage solution
(Group of) Database Level HA (automatic)(Group of) Database Level DR (manual 3)DR replica can be Active Secondary3 DR is manual, if HA is chosen automatic. Consider 3rd
data center, if need automatic DR.Database Mirroring for Local HA and Log Shipping for DRSlide13
Pre-requisites:Prerequisites, Restrictions, and Recommendations for
AlwaysOn Availability Groups (SQL Server)Unit of FailoverGroup
of databases – not the instanceConsider Contained Database for containing logins for failoverFor jobs and other objects outside the database, simple customization neededConsiderations for Replacing Log ShippingNo
delayed apply on the secondary
Removing
log shipping means
the regular log
backup job is
removed
Need to re-establish periodic log backup (essential for truncating the log)New tools for monitoring and alertingAlwaysOn DashboardNew DMVsSystem Center Operations ManagerAvailability Groups for HA and DRDeployment ConsiderationsSlide14
Quorum is managed by the WSFC, irrespective of the number of SQL Server instances, number of nodes, number of availability groups
Important goal: Design to ensureUnavailability of the DR site (or the node at DR site) , or loss of network connectivity between sites should not impact the quorum of the WSFC
Two steps:Node votes: First decide which nodes should have a voteQuorum Model: Then choose the appropriate quorum modelAvailability Groups for HA and DRQuorum ConsiderationsSlide15
Node VotesBy default, every node has a vote => May not be ideal for the HA / DR goals
Windows Server hotfix: http://support.microsoft.com/kb/2494036Guidelines: http://
msdn.microsoft.com/en-us/library/hh270280.aspx#RecommendedAdjustmentstoQuorumVotingFor the example topology discussed here, this means:1 vote to each node in the primary data center0 vote to the node in the disaster recovery data center= total 2 votes in the Windows Cluster => not ideal !Need odd number of votes for a “majority” based quorum modelSince this is a purely non-shared storage solution, two possible quorum models:Node and File Share Majority, or
Node
Majority
Availability Groups for HA and DR
Quorum ConsiderationsSlide16
Note
: The Fileshare Witness always has 1 vote.
Quorum Model and Node
Votes
Node and
Fileshare
Majority
Use the “Node and File Share Majority” quorum model with a protected file share
witness.Slide17
Quorum Model and Node Votes
Node Majority
Add an additional voting node to the WSFC in the primary data center, and then use the “Node Majority” quorum
model.Slide18
Quorum Model and Node Votes
How to set / view
To View Quorum Model
Windows Failover Cluster Manager GUI
PowerShell
Cluster.exe
SQL Server DMVs
AlwaysOn
Dashboard in
SSMS
To Change Quorum ModelWindows Failover Cluster Manager GUIPowerShellCluster.exeTo View Node Votes
PowerShellCluster.exeSQL Server DMVsAlwaysOn Dashboard in SSMSTo Change Node VotesPowerShellCluster.exeSlide19
Disaster = Primary site is downManual Process involved to bring database service online on the DR site
Force Quorum on the secondary in the DR siteExecute FORCE SERVICE ALLOW DATA LOSSAdjust quorum model and/or node votes
Recovering from a DisasterSlide20
Hardware: new hardware, reuse existing hardware?Windows Clustering: involve the Windows System Administration team and the networking
teamQuorum considerations across multiple data centersCluster network communication across multiple data centersStages: migrate the whole configuration at once, or migrate the DR afterwards?
Application connection string changeMigration: From DBM+LS to AGPlanning and Key ConsiderationsSlide21
Special Case: Automatic Failover for DR
Use of 3rd Data CenterSlide22
titleFailover Cluster Instance for HA, and
Availability Group for DRSlide23
Failover Cluster Instance + Database MirroringSlide24
FCI for
HA + AG for DR
SQL Server 2012
AlwaysOn
HA+DR Solution
Solution Characteristics
Corresponding
Pre-SQL Server 2012 Solution
3
Failover Cluster Instance for
local HA + Availability Group for DRCombined Shared Storage and Non-Shared StorageInstance Level HA (automatic)(Group of) Database Level DR (manual)DR replica can be Active SecondaryAsymmetric storage
is the key to this solutionFailover Cluster Instance for Local HA and Database Mirroring for DRSlide25
Pre-requisites:Windows Server Service
packs / QFEs:Asymmetric StorageWindows Server 2008 with http://support.microsoft.com/kb/976097OR, Windows Server 2008 R2 SP1Node Votes:
http://support.microsoft.com/kb/2494036Validate disk test QFE: http://support.microsoft.com/kb/2531907Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server)Different units of failover for HA and DR
Instance-level failover for local HA (FCI)
Group of databases (AG) for DR
AG Failover Mode
In FCI+AG configuration, FCI provides automatic failover, and AG provides manual failover
FCI for HA + AG for DR
Deployment ConsiderationsSlide26
Asymmetric Storage
Key concept behind this architectureNew Windows Server Failover Clustering capability introduced in:Windows Server 2008 R2 SP1
Windows Server 2008 with QFESymmetric storage = a cluster disk that is shared between all the WSFC nodesAsymmetric storage = a cluster disk that is shared between a subset of nodesInstance NamingEach FCI within the WSFC needs to have a different instance name
Database File Paths
(recommended)
use identical drive letters for the
disks for each FCI
(recommended) use identical file paths for data and log files for each FCI
FCI for HA + AG for DR
Deployment ConsiderationsSlide27
Quorum is managed by the WSFC, irrespective of the number of SQL Server instances (FCI or standalone), number of nodes, number of availability groups
Important goal: Design to ensureUnavailability of the DR site, or loss of network connectivity between sites should not impact the quorum of the WSFCTwo steps:
Node votes: First decide which nodes should have a voteQuorum Model: Then choose the appropriate quorum modelFCI for HA + AG for DRQuorum ConsiderationsSlide28
Node VotesBy default, every node has a vote => May not be ideal for the HA / DR goals
Windows Server hotfix: http://support.microsoft.com/kb/2494036Guidelines: http://
msdn.microsoft.com/en-us/library/hh270280.aspx#RecommendedAdjustmentstoQuorumVotingFor the example topology discussed here, this means:1 vote to each node in the primary data center0 vote to each node in the disaster recovery data center= total 2 votes in the Windows Cluster => not ideal !Need odd number of votes for a “majority” based quorum modelQuorum models:
Pick one of the “majority” based quorum models with odd number of votes
Node and File Share Majority, or
Node
Majority, or
Node and (asymmetric) Disk Majority
Or, pick (asymmetric) Disk Only (special case!) – votes don’t matter
FCI for HA + AG for DRQuorum ConsiderationsSlide29
Note
: The Fileshare Witness always has 1 vote.
Quorum Model and Node
Votes
Example:
Node and
Fileshare
MajoritySlide30
Quorum Model and Node Votes
How to set / view
To View Quorum Model
Windows Failover Cluster Manager GUI
PowerShell
Cluster.exe
SQL Server DMVs
AlwaysOn
Dashboard in
SSMS
To Change Quorum ModelWindows Failover Cluster Manager GUIPowerShellCluster.exeTo View Node Votes
PowerShellCluster.exeSQL Server DMVsAlwaysOn Dashboard in SSMSTo Change Node VotesPowerShellCluster.exeNote: Only cluster.exe can be used to set quorum model to
“Node and (asymmetric) Disk Majority” or “(asymmetric) Disk Only” Slide31
Disaster = Primary site is downManual Process involved to bring database service online on the DR site
Force Quorum on the secondary in the DR siteExecute FORCE SERVICE ALLOW DATA LOSS on the Availability GroupAdjust quorum model and/or node votesRethink quorum model: needs for another
fileshare at the DR site?Recovering from a DisasterSlide32
Hardware: new hardware, reuse existing hardware?Windows Clustering
Quorum considerations across multiple data centersCluster network communication across multiple data centersStages: migrate the whole configuration at once, or migrate the DR afterwards?Secondary (DR site) needs re-seeding
Uninstall existing SQL FCIDestroy existing WSFC at the DR siteRe-install SQL FCI after joining DR nodes to primary data center WSFCBackup from primary, and Restore on the secondaryApplication connection string changeMigration: From FCI+DBM to FCI+AG
Planning and Key ConsiderationsSlide33
SummarySlide34
SQL Server 2012 AlwaysOn HA+DR Design Patterns
SQL Server 2012
AlwaysOn
HA+DR Design
Pattern
Solution Characteristics
Corresponding
Pre-SQL Server 2012 Solution
1
Multi-site
Failover Cluster Instance (FCI) for HA and DRShared Storage solution 1Instance Level HA (automatic)Instance Level DR (automatic 2)Uses storage replicationDoesn’t require database to be in FULL recovery model
1 Masked by storage replication 2 Consider 3rd data centerMulti-site FCI using stretch VLAN2Availability Group for HA and DR
Non-Shared Storage solution(Group of) Database Level HA (automatic)(Group of) Database Level DR (manual 3)DR replica can be Active SecondaryRequires database to be in FULL recovery model3 DR is manual, if HA is chosen automatic. Consider 3rd data center, if need automatic DR.Database Mirroring for Local HA and Log Shipping for DR
3
Failover Cluster Instance for
local HA +
Availability Group for
DR
Combined Shared Storage and Non-Shared Storage
Instance Level HA (automatic)
(Group of) Database Level DR (manual)
DR replica
can be
Active Secondary
Requires database to
be in
FULL recovery model
Asymmetric storage
is the key to this solution
Failover Cluster Instance for Local HA
and
Database Mirroring for DR
Slight variations of these design patterns are occasionally observed as well.Slide35
Wednesday, June 13, 10:15 AM – 11:30 AM
SQLCAT: HA/DR Customer Panel - Microsoft SQL Server 2012 AlwaysOn Deployment Considerations
N 320EMichael Steineke, Edgenet, Inc.David P. Smith, ServiceU Corporation
Ayad
Shammout
,
CareGroup
Healthcare Systems
Wolfgang Kutschera, bwin partyThomas Grohser, Hedge fund in ConnecticutSlide36
Track Resources
@sqlserver
@ms_teched
m
v
a
Microsoft Virtual Academy
SQL Server 2012 Eval Copy
Get Certified!
Hands-On LabsSlide37
Sanjay
Mishra
sanjaymi@microsoft.com
www.sqlcat.com
@
sqlcatSlide38
Resources
Connect. Share. Discuss.
http://northamerica.msteched.com
Learning
Microsoft Certification & Training Resources
www.microsoft.com/learning
TechNet
Resources for IT Professionals
http://microsoft.com/technet
Resources for Developers
http://microsoft.com/msdn Slide39
Required Slide
Complete an evaluation on CommNet and enter to win!Slide40
MS Tag
Scan the Tag
to evaluate this
session now on
myTechEd
Mobile
Required Slide
*delete this box when your slide is finalized
Your MS Tag will be inserted here during the final scrub. Slide41
©
2012 Microsoft
Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the
part
of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT
MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.Slide42