Sunday, March 30, 2014

SQL Server Database Mirroring: automating mirror failovers

Problem:

You have a bunch of databases on a production server and you decided to go with SQL Server mirroring as a DR solution. Your IT dept, in its infinite wisdom, thought that it is a waste of resources to have a witness server and you were asked to somehow adjust with a Principal and a Mirror server. Your databases are critical and though the option of automated failovers is ruled out due to lack of witness, you would like to put process in place such that you are always battle ready to deal with a possible disaster and recover quickly. The solution should be designed in such a way that even the most junior of your DBAs should be able to handle it.

Solution:

For the sake of this explanation, lets us assume my production server is YIN and Mirror server is YANG. So what happens when a disaster occurs? Assuming that you already have mirroring setup between YIN and YANG and it is functional, you might have to do the following after a disaster.

  • When YIN goes down, databases on YANG server will be in restoring/disconnected state. You'll bring all those databases online  by performing a forced failover with allow data loss.
  • When the failover is done, you might want to fix orphaned users
  • You now make necessary changes to whichever mechanism your application uses to connect to the databases to point to the new server. In our case, we store database names and connection strings in a SQL that sits on a well maintained cluster (lets call this MOTHERSHIP), so we update those connection strings with IP address of YANG. 
  • We have lot of business specific SQL Agent jobs running on our production servers apart from regular maint jobs. These jobs exist both on YIN and YANG. The jobs are kept in disabled state on mirror server. Now that we had a disaster and we failover to the evil YANG, we might have to enable these jobs. 
  • Send a notification that failover of some sort occurred. 
What Next?

Ok, you did all of that and we were back in business with minimal downtime. Our former Principal server was resurrected and is back online. YANG is currently the Principal and you might want to add YIN back to mirroring setup as the new mirror. After a couple of days, you are sure that YIN is now stable and it should be primary because it probably has better resources compared to YANG. You might have to do a soft (planned) failover. So following are various scenarios that are possible in a setup like this:
  • Planned Failover from Yin to Yang. Stop jobs on Yin, enable jobs on Yang, fix users, update connection strings
  • Planned Failover from  Yang to Yin etc.,
  • Forced Failover on Yin
  • Forced Failover on Yang
  • Resume Mirroring on Yin
  • Resume Mirroring on Yang
Enough of theory. Lets get our hands dirty. Creating batch scripts for each of these scenarios appeared like the easiest simplest (if not elegant) solution to me. The actual logic that performs a failover can be a stored procedures and the batch scripts can be stored on a server which is highly available. 

Create these stored procedures on both Principal and Mirror Servers.
 Example procedure below. Click links above for other sps. 

USE [master]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[usp_ForcedFailover]
AS

/* stored procedure for doing a forced failover of all database to YANG
example usage:  EXEC master.dbo.usp_MirrorFailover
*/

DECLARE @sql NVARCHAR(300)
 ,@maxid INT
 ,@i INT
 ,@dbname SYSNAME
DECLARE @MirrorDBs TABLE (
 [dbid] INT identity(1, 1) NOT NULL PRIMARY KEY
 ,dbname SYSNAME NOT NULL
 )

SET NOCOUNT ON

INSERT INTO @MirrorDBs (dbname)
SELECT DB_NAME(database_id)
FROM sys.database_mirroring
WHERE mirroring_role = 2 -- Mirror
 AND mirroring_state IN (0,1) -- Disconnected or Suspended
 AND mirroring_safety_level = 2

--select * from @MirrorDBs
SELECT @maxid = max([dbid])
 ,@i = 1
FROM @MirrorDBs;

WHILE @i <= @maxid
BEGIN
 SELECT @dbname = dbname
 FROM @MirrorDBs
 WHERE [dbid] = @i

 SET @sql = 'ALTER DATABASE ' + QUOTENAME(@dbname) + ' SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS;'

 PRINT @sql

 EXEC (@sql);

 SET @i = @i + 1
END

SET NOCOUNT OFF


Next, create scripts for enabling/disabling jobs:
declare @sql nvarchar(max) = '';
select
@sql += N'exec msdb.dbo.sp_update_job @job_name = ''' + name + N''', @enabled = 1;
' from msdb.dbo.sysjobs
where enabled = 0
order by name;

print @sql;
exec (@sql);


Create scripts for fixing users

DECLARE @sql NVARCHAR(300)
 ,@maxid INT
 ,@i INT
 ,@dbname SYSNAME
DECLARE @Mirrors TABLE (
 [dbid] INT identity(1, 1) NOT NULL PRIMARY KEY
 ,dbname SYSNAME NOT NULL
 )

SET NOCOUNT ON

INSERT INTO @mirrors (dbname)
SELECT DB_NAME(database_id)
FROM sys.database_mirroring
WHERE mirroring_role = 1 -- Principal 
 AND mirroring_state = 4 -- Synchronized
 AND mirroring_safety_level = 2

--select * from @mirrors
SELECT @maxid = max([dbid])
 ,@i = 1
FROM @mirrors;

WHILE @i <= @maxid
BEGIN
 SELECT @dbname = dbname
 FROM @mirrors
 WHERE [dbid] = @i

 SET @sql = 'USE ' + QUOTENAME(@dbname) + 
 ' EXEC sp_change_users_login ''Auto_Fix'', ''myUser1'';
EXEC sp_change_users_login ''Auto_Fix'', ''myUser2'';
 '

 PRINT @sql

 EXEC (@sql);

 SET @i = @i + 1
END
 
Update connection strings on MOTHERSHIP server
USE ConnectionDB
GO
UPDATE COMPANY SET ConnectionString='Data Source=10.10.2.10;User ID=fancyuser;Password=fancypass;Max Pool Size=200' 
WHERE ConnectionString='Data Source=10.10.2.11;User ID=fancyuser;Password=fancypass;Max Pool Size=200';
Finally, the batch file that brings all these together:
@echo off
goto :startbatch

Description
=============================================================
This batch file is used for 
1. Performing a soft failover of YIN to YANG. YANG will become the primary
2. Fix Database users
3. Update connection strings in connectiondb database post failover
4. Disable jobs on YIN
5. Enable jobs on YANG
6. Send a notification email 
===============================================================

:startbatch

set source_server=10.10.2.10
set mothership_server=10.10.2.23
set mirror_server=10.10.2.11
set serverlogin=fancyuser
set serverpass=fancypass
set connectiondb=connectiondb

set home_dir=d:\mirrorme\Failover_YIN_TO_YANG
set failover_script=%home_dir%\failover.sql
set failover_out=%home_dir%\failover_out.txt
set fixusers_script=%home_dir%\fixusers.sql
set fixusers_out=%home_dir%\fixusers_out.txt
set update_connectiondb=%home_dir%\update_connectiondb.sql
set update_connectiondb_out=%home_dir%\update_connectiondb_out.txt
set disablejobs_script=%home_dir%\disablejobs.sql
set disablejobs_out=%home_dir%\disablejobs_out.txt
set enablejobs_script=%home_dir%\enablejobs.sql
set enablejobs_out=%home_dir%\enablejobs_out.txt




REM Failover from YIN to YANG
sqlcmd -S tcp:%source_server%,1433 -d master -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %failover_script% -o %failover_out%

REM Run fix users script on YANG to deal with orphan users
sqlcmd -S tcp:%mirror_server%,1433 -d master -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %fixusers_script% -o %fixusers_out%

REM update connectiondb database with new connection string
sqlcmd -S tcp:%connectiondb_server%,1433 -d %connectiondb% -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %update_connectiondb% -o %update_connectiondb_out%

REM disable jobs on YIN because we failedover to YANG
sqlcmd -S tcp:%source_server%,1433 -d master -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %disablejobs_script% -o %disablejobs_out%

REM Enable jobs on YANG since it is the new primary
sqlcmd -S tcp:%mirror_server%,1433 -d master -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %enablejobs_script% -o %enablejobs_out%


REM send emails on completion
C:\work\tool\sendemail -t allcoolsdudes@Nirvanainc.com -cc -u "Mirroring failover of YIN to YANG" -m "Database Mirroring Failover of YIN to YANG has been triggered and completed. SQL Agent jobs on YIN were disabled and jobs on YANG were enabled. Check logs attached. " -s 192.24.2.110:25 -f "LordOfServers@NirvanaInc.com" -a %failover_out% -a %fixusers_out% -a %update_connectiondb_out% -a %disablejobs_out% -a %enablejobs_out%


To do:

Update this post with scripts for all other possible failover scenarios:

Thursday, March 20, 2014

Batch Script: SQL Server Database Backup, Refresh, sanitize, run script and upload csv to sftp

One of the customers requested the following:

Backup a production database and restored it on a dev server. Sanitize the refreshed database and run a script and upload results as a csv file to sftp site. I know this can be done with SSIS, but I thought a batch job would be simpler to manage and troubleshoot.

First, create a directory for this job and add the following files to it. I will call mine D:\NirvanaInc. We are going to use the following tools to get this task done.

  • SQLCMD - To run backup/restore commands, sanitize scripts etc
  • PSFTP - To upload files to an SFTP site
  • sendemail - To send notification emails

Now that we are ready, lets start building the scripts we will use. The idea is to create one batch file which will call and run various scripts that do backup, restore, cleanup etc. this approach will make it incredibly easier to maintain and make changes to the job in future, if needed.

backup.sql

This is simple backup script:

USE master;

DECLARE @startdate VARCHAR(100);

SET @startdate = 'backup start: ' + convert(VARCHAR(32), getdate(), 121);

PRINT @startdate;

BACKUP DATABASE [NirvanaProd] TO DISK = N'R:\Backups\NirvanaProd.bak'
WITH COPY_ONLY
 ,NOFORMAT
 ,INIT
 ,NAME = N'NirvanaProd-Full Database Backup'
 ,SKIP
 ,NOREWIND
 ,NOUNLOAD
 ,COMPRESSION
 ,STATS = 10
GO

DECLARE @backupSetId AS INT

SELECT @backupSetId = position
FROM msdb.dbo.backupset
WHERE database_name = N'NirvanaProd.bak'
 AND backup_set_id = (
  SELECT max(backup_set_id)
  FROM msdb..backupset
  WHERE database_name = N'NirvanaProd.bak'
  )

IF @backupSetId IS NULL
BEGIN
 RAISERROR (
   N'Verify failed. Backup information for database ''NirvanaProd'' not found.'
   ,16
   ,1
   )
END

RESTORE VERIFYONLY
FROM DISK = N'R:\Backups\NirvanaProd.bak'
WITH FILE = @backupSetId
 ,NOUNLOAD
 ,NOREWIND
GO

DECLARE @enddate VARCHAR(100);

SET @enddate = 'backup end: ' + convert(VARCHAR(32), getdate(), 121);

PRINT @enddate;

Restore.sql

-- restore backup on target server, SQL:
USE master;

-- wait for start time
DECLARE @now DATETIME
 ,@targettime DATETIME;

-- targettime is set to 02:00:00 AM, If the backup is done sooner, it will wait till 2 AM to kick off restore
SET @targettime = substring(CONVERT(VARCHAR(39), GETDATE(), 121), 1, 10) + ' 02:00:00';
SET @now = GETDATE();

IF (@targettime > @now)
BEGIN
 WAITFOR TIME @targettime;
END;

DECLARE @startdate VARCHAR(100);
SET @startdate = 'db restore start: ' + convert(VARCHAR(32), getdate(), 121);
PRINT @startdate;

IF EXISTS (
  SELECT *
  FROM sys.databases
  WHERE NAME = 'NirvanaDev'
  )
BEGIN
 ALTER DATABASE [NirvanaDev] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
END

RESTORE DATABASE [NirvanaDev]
FROM DISK = N'R:\Remotebackups\Nirvanaprod.bak'
WITH FILE = 1
 ,MOVE N'FinancialSoftwareSystems_Data' TO N'D:\SQLData\NirvanaDev.mdf'
 ,MOVE N'FinancialSoftwareSystems_log' TO N'D:\SQLData\nirvanaDev.ldf'
 ,NOUNLOAD
 ,REPLACE
 ,STATS = 10
GO

ALTER DATABASE [NirvanaDev] SET MULTI_USER WITH ROLLBACK IMMEDIATE;
GO

DECLARE @enddate VARCHAR(100);
SET @enddate = 'db users restore end: ' + convert(VARCHAR(32), getdate(), 121);
PRINT @enddate;

I will leave out sanitize and grant permissions script because they are specific to your environment and you should build your scripts accordingly. Next up is running the a report script on target database and uploading the output as CSV to SFTP site. I chose PSFTP because of its ease of use.

run_psftp.bat

echo lcd D:\NirvanaInc > psqftp.bat
echu lpwd >> psftp.bat
echo put D:\NirvanaInc\sql_results.csv >> psftp.bat
echo quit >> psftp.bat

c:\sysutil\psftp mysftpuser@sftp.nirvanainc.com -pw myFancyPwd -b psftp.bat  -v

By now, you should have the following files on your D:\NirvanaInc folder:
backup.sql
restore.sql
sanitize.sql
grant_script.sql
run.sql
run_psftp.bat


Finally, the main batch file:

@echo off
goto :startbatch

Description
=============================================================
This batch file is used for 
1. Taking a backup of NirvanaProd database
2. Copying backup file to a Dev db server
3. Restore database on Dev Environment
4. Sanitizing Database
5. Creating user and granting permission
6. Run a script
7. Upload results to sftp
===============================================================

:startbatch

REM set variablesa

set source_server=10.10.20.21
set target_server=10.10.20.22
set source_db=NirvanaProd
set target_db=NirvanaDev

set home_dir=D:\NirvanaInc\NirvanaProd
set backup_script=%home_dir%\backup.sql
set backup_out=%home_dir%\backup_out.txt
set copy_out=%home_dir%\copy_out.txt
set restore_script=%home_dir%\restore.sql
set restore_out=%home_dir%\restore_out.txt
set sanitize_script=%home_dir%\sanitize.sql
set sanitize_out=%home_dir%\sanitize_out.txt
set grant_script=%home_dir%\grant_script.sql
set grant_out=%home_dir%\grant_out.txt
set run_script=%home_dir%\run.sql
set run_script_out=%home_dir%\sql_results.csv

set serverlogin=mylogin
set serverpass=fancypassword123


REM backup your production database
sqlcmd -S tcp:%source_server%,1433 -d master -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %backup_script% -o %backup_out%


REM copy backup from source server to target server, log is in %copy_out%
echo copy of backup file started at: %date% %time% > %copy_out%
copy /y \\%source_server%\R$\Backups\%source_db%.bak \\%target_server%\R$\RemoteBackups 1>>%copy_out% 2>&1
echo copying of the backup file completed at: %date% %time% >> %copy_out%

REM Restore backup on the target server
sqlcmd -S tcp:%target_server%,1433 -d master -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %restore_script% -o %restore_out%

REM Sanitize database and create user, grant permissions
sqlcmd -S tcp:%target_server%,1433 -d %target_db% -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %sanitize_script% -o %sanitize_out%
sqlcmd -S tcp:%target_server%,1433 -d %target_db% -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %grant_script% -o %grant_out%

REM Run sql script
sqlcmd -S tcp:%target_server%,1433 -d %target_db% -U %serverlogin% -P %serverpass% -s "," -W -h -1 -i %run_script% -o %run_script_out%

REM Upload run_sql.csv results to SFTP
call %home_dir%/run_psftp.bat 

REM send emails on completion
C:\sysutil\sendemail -t alldba@nirvanainc.com -cc ImeMyself@nirvanainc.com -u "%target_db% refresh" -m "%source_db% backup from server %source_server% is restored on %target_server% as db %target_db% . The script ran successfully and the output was uploaded to SFTP" -s 192.178.6.221:25 -f "Gandalph@NirvanaInc.com" -a %backup_out% -a %restore_out% -a %sanitize_out% -a %grant_out% 

have fun! :)

Saturday, March 8, 2014

Postgres en_US - UTF 8 Collation on Windows

I was restoring a schema today on a stage server and ran into a collation error. The source of schema backup was Postgres 9.1, CentOS and the destination was Postgresq 9.1, windows server 2008 R2.

pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 2125939; 1259 50334466 INDEX ix_rd_vendorlist_facts_defaultsort_en_us
pg_restore: [archiver (db)] could not execute query: ERROR: collation "public.en_US" for encoding "UTF8" does not exist

After reading a related stackoverflow thread I realized en_US locale is Linux specific and the windows equivalent is 'English_United States.1252'. I created these collations on template database and the user database I was trying to restore this schema on and the restore finally worked.

CREATE COLLATION pg_catalog."en_US" (
 LC_COLLATE = 'English_United States.1252'
 ,LC_CTYPE = 'English_United States.1252'
 );

CREATE COLLATION pg_catalog."fr_CA" (
 LC_COLLATE = 'French_Canada.1252'
 ,LC_CTYPE = 'French_Canada.1252'
 );

CREATE COLLATION pg_catalog."pt_BR" (
 LC_COLLATE = 'Portuguese_Brazil.1252'
 ,LC_CTYPE = 'Portuguese_Brazil.1252'
 );

-- add public collations
CREATE COLLATION PUBLIC."en_US" (
 LC_COLLATE = 'English_United States.1252'
 ,LC_CTYPE = 'English_United States.1252'
 );

CREATE COLLATION PUBLIC."fr_CA" (
 LC_COLLATE = 'French_Canada.1252'
 ,LC_CTYPE = 'French_Canada.1252'
 );

CREATE COLLATION PUBLIC."pt_BR" (
 LC_COLLATE = 'Portuguese_Brazil.1252'
 ,LC_CTYPE = 'Portuguese_Brazil.1252'
 );