Tuesday, October 1, 2013

Setup: Python/Flask dev environment on windows using virtualenv and pip

Chacolateyis a sudo apt-get for the windows crowd and its awesome! Chacolatey makes it incredibly easy to setup multiple dev environments for Python/flask/django which is otherwise a multi-step process (installing setuptools, pip, virtualenv etc). Here are the steps.

Install Chacolatey

Launch a powershell prompt using "Run As Admin" and set execution policy to unrestricted.

Set-ExecutionPolicy Unrestricted

And run the following command in the same window to install chacolatey

iex ((new-object net.webclient).DownloadString('https://chocolatey.org/install.ps1'))

Once its installed, you'll have csint added to your path and it is the equivalent of apt-get.

Install Python

If you don't already have Python installed, run the following in a powershell/cmd

cinst Python

This will install Python 2.7 at C:\Python27. We need a few more things to setup a proper dev environment. Instead of installing flask package directly to the main Python installation, it can be separated using virtualenv. If you are developing an app on django and another using flask and would like to keep these environments separate, virtualenv is the way to go.

Install Pip

cinst pip

pip install virtualenvwrapper
pip install virtualenvwrapper-powershell

This will add virtualenv.exe to your system path and you can create multiple Python dev environments using this. Lets call our flask app FlaskApp. Lets create a directory for this app on D drive. Create a virtual env in that folder.

virtualenv FlaskApp

This will create a Python instance in FlaskApp folder. Now activate the virtual environment we just created. cd to FlaskApp/Scripts and run activate

cd FlaskApp\Scripts\
activate
#your command line should change to (FlaskApp) D:\FlaskApp\Scripts. 
pip install flask
# To come out of (FlaskApp), simply type 'deactivate'

Now you are all set. We created a virtual environment called FlaskApp and installed flask package. Lets see if it works. Create a sample app. Create a hello.py with below

#!C:\flaskapp\Scripts
-- the above line is to let the computer run python.exe from our VirtualEnv not the system wide Python
from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello World!'

if __name__ == '__main__':
    app.run()

Run

Python hello.py

It should run the web server on 127.0.0.1:5000.






Thursday, July 25, 2013

Virtual Lab: Multi node Hadoop cluster on windows (HortonWorks, HDInsight)

Microsoft recently announced HDInsight, built on top of HortonWorks DataPlatform (a distribution of Hadoop similar to Cloudera). There is a developer preview of HDInsight available as a free download from Microsoft, which is very easy to setup. However,  the developer preview is a single box solution and it is not possible to simulate a real world deployment with multi node clusters. You can create a 4 node hadoop cluster on Azure with a few clicks but it is prohibitively costly (and the cluster will be shutdown after free tier usage, if your account is a trial one). This is a step by step guide to setup a multi node cluster for free on your laptop using HortonWorks Data Platform. HDInsight is actually built on top of Hortonworks solution, so there won’t be many differences apart from some usability features and nice looking clients.  So let’s get started with the setup:

What will we build?


We will build a Virtualbox Base VM with  all the necessary software installed, sysprep it and spin linked clones from the base for better use of available resources, We will build a domain controller and add 4 VMs to the domain. You can achieve similar setup using HyperV too, using differential disks. We should end up with something like this:

Domain: TESLA.COM

Name of the VM
Description

IP Address
RAM
HDDC Domain Controller and Database host for Metastore 192.168.80.1 1024
HDMaster Name node, Secondary Name node, Hive, Oozie, Templeton 192.168.80.2 2048
HDSlave01 Slave/Data Node 192.168.80.3 2048
HDSlave02 Slave/Data Node 192.168.80.4 2048
HDSlave03 Slave/Data Node 192.168.80.5 1024

In prod environments, HIVE, OOZIE and TEMPLETON usually have separate servers running the services, but for the sake of simplicity, I chose to run all of them on a single VM.  If you don’t have enough resources to run 3 data /slave nodes, then you can go with 2. Usually 2 slave nodes and 1 master is necessary to get a feel of real Hadoop Cluster.

System Requirements:


You need a decent laptop/desktop with more than 12 GB of RAM and a decent processor like core i5. If you have less RAM, you might want to run a bare minimum of 3 node cluster (1 master and 2 slaves) and adjust the available RAM accordingly.  For this demo, check the way I assigned RAM in the table above. You can get away with assigning lesser memory to data nodes.

Software Requirements:


Windows Server 2008 R2 or Windows Server 2012 Evaluation edition, 64 bit
Microsoft .NET Framework 4.0
Microsoft Visual C++  2010 redistributable, 64 bit
Java JDK 6u31 or higher
Python 2.7
Microsoft SQL Server 2008 R2 (if you would like to use it for Metastore)
Microsoft SQL Server JDBC Driver
VirtualBox for virtualization (Or HyperV, if you prefer)
HortonWorks DataPlatform for Windows

Building a Base VM:


This guide assumes that you are using VirtualBox for virtualization. The steps are similar even if you are using HyperV with minor differences like creating differencing disks before creating VM clones from it. Hadoop cluster setup involves installing a bunch of software and setting environment variables on each server. Creating a base VM and spinning linked clones from it makes it remarkably easier and saves lot of time and it helps creating more VMs within the available limits of your disk space compared to stand alone VMs.
1. Download and Install VirtualBox
2. Create a new VM for a base VM. I called mine MOTHERSHIP. Select 25 GB of initial space, allowing dynamic expansion. We will do a lot of installations on this machine and to make them work quick, assign maximum amount of RAM temporarily. At the end of the setup, we can adjust the RAM to a lesser value.
3. Enable two network adapters on your VM. Adapter1 connected to NAT (if you care about internet access on VMs). Adapter2 for Internal only Domain Network.

image image

4. From Setting > Storage > Controller IDE , mount Windows Server 2008 R2 installation media iso.
5. Start the VM and boot from the iso. Perform Windows Installation. This step is straight forward.
6. Once the installation is done, set password to P@$$w0rd! and login.
7. Install guest additions for better graphics, shared folders etc.
8. You will move a lot of file from your Host machine to the VM, so configuring a shared folder will help.

image

Install necessary software on BaseVM:


1. Login to the base VM you built above and install Microsoft .NET Framework 4.0
2. Install Microsoft 2010 VC ++ redistributable package.
3. Install Java JDK 6u31 or higher. Make sure that you select a destination directory that doesn’t have any spaces in the address during installation. For example, I installed mine on C:\Java\jdk1.6.0_31
4. Add an environment variable for Java. If you installed Java on C:\Java\jdk1.6.0_31, you should add that path as a new system variable.  Open a Powershell windows as an admin and execute the below command:
[Environment]::SetEnvironmentVariable("JAVA_HOME","c:\java\jdk1.6.0_31","Machine")

5. Install Python. Add Python to Path. Use the command below in a powershell shell launched as Administrator

$env:Path += ";C:\Python27"

6. Confirm that the environment variables have been set properly by running %JAVA_HOME% from Run, it should open the directory where JDK was installed. Also, run Python from Run and it should open Python shell.

7. We are almost done with the Base VM. We can now generalize this image using sysprep. Go to Run, type sysprep and enter. It should open /system32/sysprep directory. Run sysprep.exe as an Admin. Select 'System Out of Box Experience’ , check Generalize and select Shutdown.


image

Create Linked Clones:


1. Go to VirtualBox, Right click on the Base VM you created above and select clone.

image

2. Give the VM an appropriate name. In my case, its HDDC for domain controller. Make sure you select Re-Initialize MAC Addresses of all network cards
3. In the Clone Type Dialog, select Linked Clone, click Clone.

image

4. Repeat steps # 1 – 3 for creating other VMs that will be part of your cluster. You should end up with HDDC, HDMaster and slave nodes - HDSlave01, HDSlave02, HDSlave03 (depending on your available resources)
5. Since we created these clones from the same base, it will inherit all settings, including memory. You might want to go to the properties of each VM and assign memory as you see fit.

Configuring Domain on HDDC


We will not go into details of setting up active directory. Google is your friend. But below are some steps specific to our setup.

1. Start HDDC VM and login.
2. VirtualBox creates funny computers names, even though your VM name is HDDC. Change the computer name to HDDC and reboot.
3. Go to the network settings and you should see two network adapters enabled. Usually ‘Network Adapter2’ is your domain only internal network. To confirm, you can go to VM settings while the VM is running, go to Network tab and temporarily simulate cable disconnection by unchecking cable connected. The Adapter attached to NAT is for internet and the one created as Internal is for Domain Network.

image

4. Go to properties of Domain Network Adapter inside the VM and set below ip configuration

image

5. Enable Active Directory services role and reboot. Use dcpromo.exe to create a new domain. I chose my domain name as TESLA.COM, named after the greatest scientist ever lived. If you need help configuring AD Domain, check this page. Post configuration and reboot, logon to the Domain Controller as TESLA\Administrator and create an active directory account and make it a Domain Admin, this will make things easier for you for the rest of setup.
6. Optionally, if you are planning on using this VM as Metastore for Hadoop cluster, you can install SQL Server 2008 R2. We will not go into the details of that setup. It is just a straight forward installation. Make sure you create a SQL Login which has sysadmin privileges. My login is hadoop. I have also created a couple of blank databases on it, namely hivedb, ooziedb.

Network Configuration and Joining other VMs to Domain


1. Log on to HDMaster as an Administrator
2. Find the Domain network Adapter (refer to step #3 in Configuring Domain section). Make sure your HDDC (Domain controller) VM is running.
3. Change the adapter settings: IP - 192.168.80.2, subnet mast – 255.255.255.0, Default Gateway – Leave it blank, Preferred DNS – 192.168.80.1
4. Right click on My Computer > Setting > Computer name, domain, workgroup settings > Settings and Change the computer name to HDMaster and enter the Domain name TESLA.COM. Enter Domain Admin login and password when prompted and you’ll be greeted with a welcome message. Reboot.
5. Logon to each of the remaining VMs and repeat steps #1 to 4 – the only difference is the IP address. Refer to the list of IPs under ‘What will we build’ section of this article. Preferred DNS remains same for all the VMs i.e, 192.168.80.1

Preparing The environment for Hadoop Cluster


We now have a domain and all the VMs added to it. We also installed all the necessary software to the base VM so all the VMs created from it will inherit those. It is now time to prepare the environment for hadoop installation. Before we begin, make sure ALL VMs have same date, time settings.



Disable IPV6 on ALL VMs:


Log on to each VM, open each network adapter and uncheck IPV6 protocol from adapter settings.

Adding firewall exceptions:



Hadoop uses multiple ports for communicating between clients and other services so we need to create firewall exceptions on all the servers involved in the cluster. Execute the code below on each VM of the cluster on a powershell prompt launched as Administrator. If you want to make it simpler, you can just turn off Firewall at the domain level altogether on all VMs. But we don’t normally do that on production.

netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=50470
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=50070
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=8020-9000
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=50075
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=50475
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=50010
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=50020
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=50030
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=8021
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=50060
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=51111
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=10000
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=9083
netsh advfirewall firewall add rule name=AllowRPCCommunication dir=in action=allow protocol=TCP localport=50111


Networking Configurations and Edit Group Policies:



1. Logon to your Domain Controller VM, HDDC with a Domain Administrator account (ex: TESLA\admin)
2.Go to start and type Group Policy Management in the search bar. Open Group Policy Management.
3. If you don’t see your Forest TESLA.COM already, add a Forest and provide your domain name
4. Expand Forest –> Domain –> Name of your Domain –> right click on default domain policy –> edit

image

5. Group Policy Management Editor (lets call it GPM Editor) will open up and note that ALL the steps following below are done through this editor.

6. In GPM Editor, navigate to  Computer Configuration -> Policies -> Windows Settings -> Security Settings ->System Services -> Windows Remote Management (WS-Management). Change the startup mode to Automatic


7. Go to Computer Configuration -> Policies -> Windows Settings -> Security Settings -> Windows Firewall with Advanced Security. Right click on it and create a new  inbound firewall rule. For Rule Type, choose pre-defined, and select Windows Remote Management from the dropdown. It will automatically create 2 rules. Let them be.  click Next, Check Allow The Connection.



8. This step is for enabling Remote Execution Policy for Powershell. Go to Computer Configuration -> Policies -> Administrative Templates -> Windows Components -> Windows PowerShell. Enable   Turn on Script Execution. You’ll see Execution Policy dropdown under options, select Allow All Scripts



9. Go to Computer Configuration -> Policies -> Administrative Templates -> Windows Components -> Windows Windows Remote Management (WinRM) -> WinRM Service. Enable “Allow Automatic Configuration of Listeners” . Under Options, enter * (asterisk symbol)   for IPV4 filter.  Also change “Allow CredSSP Authentication” to Enabled.



10. We configured WinRM Service so far, now it’s WinRM Client’s turn.  Go to Computer Configuration -> Policies -> Administrative Templates -> Windows Components -> Windows Windows Remote Management (WinRM) -> WinRM Client. Set Trusted Hosts to Enabled and under Options, set TrustedHostsList to * (asterisk symbol) .  In the same dialog, look for Allow CredSSP Authentication and set it to Enabled.



11. Go to  Computer Configuration -> Policies -> Administrative Templates -> System -> Credentials Delegation. Set Allow Delegating Fresh Credentials  to Enabled. Under options, click on Show next to Add Servers To The List  and set WSMAN to *  as shown below: (make a new entry like so: WSMAN/*.TESLA.COM, replace it with your domain here)


image

12. Repeat instructions in step#11 above, this time for property NTLM-only server authentication


13. Already tired? We are almost done. Just a couple more steps before we launch the installation. Go to Run  and type ADSIEdit.msc and enter. Expand OU=Domain Controllers menu item and select CN=HDDC (controller hostname). Go to Properties -> Security -> Advanced –> Add. Enter NETWORK SERVICE, click Check Names, then Ok. In the Permission Entry select Validated write to service principal name. Click Allow and OK to save your changes.


14.On the Domain controller VM, launch a powershell windows an an administrator and run the following: Restart-Service WinRM


15. Force update gpupdate on the other VMs in the environment. Run this command: gpupdate /force


Define Cluster Configuration File



When you download HortonWorks Dataplatform for windows, the zip contains a sample clusterproperties.txt file which can be used as a template during installation. The installation msi  depends on clusterproperties file for cluster layout definition. Below is my Cluster Configuration file. Make changes to yours accordingly.

#Log directory HDP_LOG_DIR=c:\hadoop\logs #Data directory HDP_DATA_DIR=c:\hdp\data #Hosts NAMENODE_HOST=HDMaster.tesla.com SECONDARY_NAMENODE_HOST=HDMaster.tesla.com JOBTRACKER_HOST=HDMaster.tesla.com HIVE_SERVER_HOST=HDMaster.tesla.com OOZIE_SERVER_HOST=HDMaster.tesla.com TEMPLETON_HOST=HDMaster.tesla.com SLAVE_HOSTS=HDSlave01.tesla.com, HDSlave02.tesla.com #Database host DB_FLAVOR=mssql DB_HOSTNAME=HDDC.tesla.com #Hive properties HIVE_DB_NAME=hivedb HIVE_DB_USERNAME=sa HIVE_DB_PASSWORD=P@$$word! #Oozie properties OOZIE_DB_NAME=ooziedb OOZIE_DB_USERNAME=sa OOZIE_DB_PASSWORD=P@$$word!

Installation



1. Keep all of your VMs running and logon to HDMaster, with domain admin account.
2. Create a folder C:\setup and copy the contents of HortonWorks Data Platform installation media you downloaded at the beginning of this guide.
3. Replace the clusterproperties.txt with the one you created under Define Cluster Config File section. This file is same for ALL the participant servers in the cluster.
4. open a command prompt as administrator and cd to C:\setup.
5. Type the below command to kick off installation. Beware of unnecessary spaces in the command. It took me a lot of time to figure out the space was the issue. Edit the folder path according to your setup. There is no space between HDP_LAYOUT and ‘=’.

msiexec /i "c:\setup\hdp-1.1.0-GA.winpkg.msi" /lv "hdp.log" HDP_LAYOUT="C:\setup\clusterProperties.txt" HDP_DIR="C:\hdp\Hadoop" DESTROY_DATA="no"

6. If everything goes well, you should see a success message.

image

7. Log on to each of the other VMs, and repeat steps from #2 to 6.


Congratulations, you have now successfully created your first Hadoop Cluster. You’ll see shortcuts like below on your desktop on each VM. Before starting Hadoop exploration, we need to start services on both Local and remote machines. Log on to the HDMaster server, go to hadoop installation directory and look for \hadoop folder. You’ll see a couple of batch files to start/stop services.


image

In the next blog post, we will write a sample MapReduce job using Pig and one using Python. Have fun!

Saturday, July 6, 2013

VirtualBox: SSH to ubuntu guest from windows host

Make sure you have ssh server installed on ubuntu.

sudo apt-get install open-sshserver

Go to VM properties => Network; you should see an adapter attached to NAT. Go to advanced properties and click port forwarding. Create a new port forwarding rule with the below details.

Name: SSH
Protocol: TCP
Host Port: 3022
Guest Port: 22


Or, you can run this command to configure:

VBoxManage modifyvm myubuntuVM --natpf1 "ssh,tcp,,3022,,22"

Use your favorite SSH client (I use cygwin on my win 8 machine) and use the following command to connect:

ssh -p 3022 cherry@127.0.0.1



Sunday, June 16, 2013

Python Game: Asteroids (Rice Rocks)

The due time for this project submission is over so I can post this here. With this final mini project, An Introduction to Interactive Programming in Python class is complete. It has been a wonderful experience! You can play the game here:

__author__ = 'RK Kuppala'

# program template for Spaceship
import simplegui
import math
import random

# globals for user interface
WIDTH = 800
HEIGHT = 600
score = 0
lives = 3
time = 0
wrap = 20
started = False
alive = True
rock_group = set([])
class ImageInfo:
    def __init__(self, center, size, radius = 0, lifespan = None, animated = False):
        self.center = center
        self.size = size
        self.radius = radius
        if lifespan:
            self.lifespan = lifespan
        else:
            self.lifespan = float('inf')
        self.animated = animated

    def get_center(self):
        return self.center

    def get_size(self):
        return self.size

    def get_radius(self):
        return self.radius

    def get_lifespan(self):
        return self.lifespan

    def get_animated(self):
        return self.animated


# art assets created by Kim Lathrop, may be freely re-used in non-commercial projects, please credit Kim

# debris images - debris1_brown.png, debris2_brown.png, debris3_brown.png, debris4_brown.png
#                 debris1_blue.png, debris2_blue.png, debris3_blue.png, debris4_blue.png, debris_blend.png
debris_info = ImageInfo([320, 240], [640, 480])
debris_image = simplegui.load_image("http://commondatastorage.googleapis.com/codeskulptor-assets/lathrop/debris2_blue.png")

# nebula images - nebula_brown.png, nebula_blue.png
nebula_info = ImageInfo([400, 300], [800, 600])
nebula_image = simplegui.load_image("http://commondatastorage.googleapis.com/codeskulptor-assets/lathrop/nebula_blue.png")

# splash image
splash_info = ImageInfo([200, 150], [400, 300])
splash_image = simplegui.load_image("http://commondatastorage.googleapis.com/codeskulptor-assets/lathrop/splash.png")

# ship image
ship_info = ImageInfo([45, 45], [90, 90], 35)
ship_image = simplegui.load_image("http://commondatastorage.googleapis.com/codeskulptor-assets/lathrop/double_ship.png")

# missile image - shot1.png, shot2.png, shot3.png
missile_info = ImageInfo([5,5], [10, 10], 3, 50)
missile_image = simplegui.load_image("http://commondatastorage.googleapis.com/codeskulptor-assets/lathrop/shot2.png")

# asteroid images - asteroid_blue.png, asteroid_brown.png, asteroid_blend.png
asteroid_info = ImageInfo([45, 45], [90, 90], 40)
asteroid_image = simplegui.load_image("http://commondatastorage.googleapis.com/codeskulptor-assets/lathrop/asteroid_blue.png")

# animated explosion - explosion_orange.png, explosion_blue.png, explosion_blue2.png, explosion_alpha.png
explosion_info = ImageInfo([64, 64], [128, 128], 17, 24, True)
explosion_image = simplegui.load_image("http://commondatastorage.googleapis.com/codeskulptor-assets/lathrop/explosion_alpha.png")

# sound assets purchased from sounddogs.com, please do not redistribute
soundtrack = simplegui.load_sound("http://commondatastorage.googleapis.com/codeskulptor-assets/sounddogs/soundtrack.mp3")
missile_sound = simplegui.load_sound("http://commondatastorage.googleapis.com/codeskulptor-assets/sounddogs/missile.mp3")
missile_sound.set_volume(.4)
ship_thrust_sound = simplegui.load_sound("http://commondatastorage.googleapis.com/codeskulptor-assets/sounddogs/thrust.mp3")
ship_thrust_sound.set_volume(.4)
explosion_sound = simplegui.load_sound("http://commondatastorage.googleapis.com/codeskulptor-assets/sounddogs/explosion.mp3")
explosion_sound.set_volume(.5)

# helper functions to handle transformations
def angle_to_vector(ang):
    return [math.cos(ang), math.sin(ang)]

def dist(p,q):
    return math.sqrt((p[0] - q[0]) ** 2+(p[1] - q[1]) ** 2)


# Ship class
class Ship:
    def __init__(self, pos, vel, angle, image, info):
        self.pos = [pos[0],pos[1]]
        self.vel = [vel[0],vel[1]]
        self.thrust = False
        self.angle = angle
        self.angle_vel = 0
        self.image = image
        self.image_center = info.get_center()
        self.image_size = info.get_size()
        self.radius = info.get_radius()

    def draw(self,canvas):
        if alive:
            if self.thrust:
                self.image_center = (135, 45)
            else:
                self.image_center = (45, 45)
        canvas.draw_image(ship_image, self.image_center, self.image_size, self.pos, self.image_size, self.angle)
        if self.pos[0]  WIDTH - self.radius:
            canvas.draw_image(ship_image, self.image_center, self.image_size,
                              [self.pos[0]-WIDTH,self.pos[1]], self.image_size, self.angle)
        if self.pos[1] < self.radius:
            canvas.draw_image(ship_image, self.image_center, self.image_size,
                              [self.pos[0],self.pos[1]+HEIGHT], self.image_size, self.angle)
        if self.pos[1] > HEIGHT - self.radius:
            canvas.draw_image(ship_image, self.image_center, self.image_size,
                              [self.pos[0],self.pos[1]-HEIGHT], self.image_size, self.angle)

    def update(self):
        if alive:
            self.pos[0] += self.vel[0]
            self.pos[0] = (self.pos[0] % WIDTH)
            self.pos[1] += self.vel[1]
            self.pos[1] = (self.pos[1] % HEIGHT)
            self.vel[0] *= 0.97
            self.vel[1] *= 0.97
            self.angle += self.angle_vel
            if self.thrust:
                self.vel[0] += 0.3 * (angle_to_vector(self.angle)[0])
                self.vel[1] += 0.3 * (angle_to_vector(self.angle) [1])

    def angleincrement(self):
        self.angle_vel += 0.08

    def angledecrement(self):
        self.angle_vel -= 0.08

    def thrusters(self):
        self.thrust = not self.thrust
        if self.thrust and started and alive:
            ship_thrust_sound.play()
        else:
            ship_thrust_sound.pause()

    def shoot(self):
        global missile_group
        missile_timer.start()
        if alive and len(missile_group) < 2:
            missile_group.add(Sprite([self.pos[0]+self.radius*angle_to_vector(self.angle)[0], self.pos[1]+self.radius*angle_to_vector(self.angle)[1]],
                                     [11*angle_to_vector(self.angle)[0] + 0.4*self.vel[0], 11*angle_to_vector(self.angle)[1] + 0.4*self.vel[1]],
                                     0, 0, missile_image, missile_info, missile_sound))

    def respawn(self):
        global alive, started
        if lives > 0:
            self.thrust = False
            alive = True
            self.vel[0] = 0
            self.vel[1] = 0
            respawn_timer.stop()
        else:
            started = False
            soundtrack.rewind()
            ship_thrust_sound.rewind()

# Sprite class
class Sprite:
    def __init__(self, pos, vel, ang, ang_vel, image, info, sound = None):
        self.pos = [pos[0],pos[1]]
        self.vel = [vel[0],vel[1]]
        self.angle = ang
        self.angle_vel = ang_vel
        self.image = image
        self.image_center = info.get_center()
        self.image_size = info.get_size()
        self.radius = info.get_radius()
        self.lifespan = info.get_lifespan()
        self.animated = info.get_animated()
        self.age = 0
        if sound:
            sound.rewind()
            sound.play()

    def draw(self, canvas):
        if self.animated:
            canvas.draw_image(self.image, [self.image_center[0] + self.image_size[0] * self.age, self.image_center[1]],
                              self.image_size, self.pos, self.image_size, self.angle)
        else:
            canvas.draw_image(self.image, self.image_center, self.image_size, self.pos, self.image_size, self.angle)
            if self.pos[0] < self.radius:
                canvas.draw_image(self.image, self.image_center, self.image_size,
                                  [self.pos[0]+ WIDTH,self.pos[1]], self.image_size, self.angle)
            if self.pos[0] > WIDTH - self.radius:
                canvas.draw_image(self.image, self.image_center, self.image_size,
                                  [self.pos[0]- WIDTH,self.pos[1]], self.image_size, self.angle)
            if self.pos[1] < self.radius:
                canvas.draw_image(self.image, self.image_center, self.image_size,
                                  [self.pos[0],self.pos[1]+HEIGHT], self.image_size, self.angle)
            if self.pos[1] > HEIGHT - self.radius:
                canvas.draw_image(self.image, self.image_center, self.image_size,
                                  [self.pos[0],self.pos[1]- HEIGHT], self.image_size, self.angle)

    def update(self):
        global wrap
        self.pos[0] += self.vel[0]
        self.pos[0] = (self.pos[0] % WIDTH)
        self.pos[1] += self.vel[1]
        self.pos[1] = (self.pos[1] % HEIGHT)
        self.angle += self.angle_vel
        self.age += 1

    def check_lifespan(self):
        return self.age < self.lifespan

    def collide(self, other_object):
        if dist(self.pos,other_object.pos) < (self.radius + other_object.radius):
            return True


def draw(canvas):
    global time, lives, score, timer, started, alive, rock_group

    # animiate background
    time += 1
    center = debris_info.get_center()
    size = debris_info.get_size()
    wtime = (time / 8) % center[0]
    canvas.draw_image(nebula_image, nebula_info.get_center(), nebula_info.get_size(), [WIDTH/2, HEIGHT/2], [WIDTH, HEIGHT])
    canvas.draw_image(debris_image, [center[0]-wtime, center[1]], [size[0]-2*wtime, size[1]],
                                [WIDTH/2+1.25*wtime, HEIGHT/2], [WIDTH-2.5*wtime, HEIGHT])
    canvas.draw_image(debris_image, [size[0]-wtime, center[1]], [2*wtime, size[1]],
                                [1.25*wtime, HEIGHT/2], [2.5*wtime, HEIGHT])

    if not started:
        canvas.draw_image(splash_image, splash_info.get_center(),
                          splash_info.get_size(), [WIDTH/2, HEIGHT/2],
                          splash_info.get_size())

    # draw ship and sprites
    elif started:
        my_ship.draw(canvas)
        process_sprite_group(rock_group, canvas)
        process_sprite_group(missile_group, canvas)
        process_sprite_group(explosion_group, canvas)
        canvas.draw_text("Lives: " + str(lives), [50,50], 20, "White")
        canvas.draw_text("Score: " + str(score), [WIDTH-150,50], 20, "White")
        # update ship and sprites
        my_ship.update()
        if group_collide(rock_group, my_ship) > 0:
            alive = False
            lives -= 1
            explosion_sound.play()
            respawn_timer.start()
        if group_group_collide(missile_group, rock_group) > 0:
            score += 1
            explosion_sound.rewind()
            explosion_sound.play()

def keydown(key):
    if key == simplegui.KEY_MAP['right']:
        my_ship.angleincrement()
    elif key == simplegui.KEY_MAP['left']:
        my_ship.angledecrement()
    elif key == simplegui.KEY_MAP['up']:
        my_ship.thrusters()
    elif key == simplegui.KEY_MAP['space']:
        my_ship.shoot()

def keyup(key):
    if key == simplegui.KEY_MAP['right']:
        my_ship.angledecrement()
    elif key == simplegui.KEY_MAP['left']:
        my_ship.angleincrement()
    elif key == simplegui.KEY_MAP['up']:
        my_ship.thrusters()
    elif key == simplegui.KEY_MAP['space']:
        missile_timer.stop()

# timer handler that spawns a rock
def rock_spawner():
    global rock_group
    random_place = [random.randrange(0, WIDTH), random.randrange(0, HEIGHT)]
    while dist(random_place,my_ship.pos) < 150:
        random_place = [random.randrange(0, WIDTH), random.randrange(0, HEIGHT)]
    random_velocity = [random.choice([-1, -2, 1, 2]), random.choice([-1, -2, 1, 2])]
    random_angle = random.randrange(0, 7)
    random_anglevel = (random.randrange(-30,30) / 1000)
    if len(rock_group) < 5:
        rock_group.add((Sprite(random_place, random_velocity, random_angle, random_anglevel, asteroid_image, asteroid_info)))

def process_sprite_group(a_set, canvas):
    for sprite in a_set:
        sprite.draw(canvas)
    for sprite in a_set:
        sprite.update()
    for sprite in a_set:
        if not sprite.check_lifespan():
            a_set.remove(sprite)


def group_collide(group, other_object):
    collide = 0
    for an_object in group:
        if Sprite.collide(an_object, other_object):
             explosion_group.add(Sprite(an_object.pos, (0,0), 0, 0, explosion_image, explosion_info, explosion_sound))
             if other_object.radius > 10:
                 explosion_group.add(Sprite(other_object.pos, (0,0), 0, 0, explosion_image, explosion_info, explosion_sound))
             group.remove(an_object)
             collide += 1
    return collide

def group_group_collide(group1, group2):
    collide = 0
    for an_object in group1:
        if group_collide(group2, an_object) > 0:
            group1.remove(an_object)
            collide += 1
    return collide

def click(pos):
    global started, lives
    center = [WIDTH / 2, HEIGHT / 2]
    size = splash_info.get_size()
    inwidth = (center[0] - size[0] / 2) < pos[0] < (center[0] + size[0] / 2)
    inheight = (center[1] - size[1] / 2) < pos[1] < (center[1] + size[1] / 2)
    if (not started) and inwidth and inheight:
        reset()
        soundtrack.play()
        started = True


def reset():
    global my_ship, rock_group, missile_group, lives, score
    my_ship = Ship([WIDTH / 2, HEIGHT / 3], [0, 0], 0, ship_image, ship_info)
    rock_group = set()
    missile_group = set()
    explosion_group = set()
    score = 0
    lives = 3
    timer.start()

def missile():
    my_ship.shoot()

# initialize frame
frame = simplegui.create_frame("Asteroids", WIDTH, HEIGHT)

# initialize ship and two sprites
my_ship = Ship([WIDTH / 2, HEIGHT / 2], [0, 0], 0, ship_image, ship_info)
a_rock = Sprite([WIDTH / 3, HEIGHT / 3], [1, 1], 0, 0, asteroid_image, asteroid_info)
a_missile = Sprite([2 * WIDTH / 3, 2 * HEIGHT / 3], [-1,1], 0, 0, missile_image, missile_info, missile_sound)

# register handlers
frame.set_draw_handler(draw)
frame.set_mouseclick_handler(click)
timer = simplegui.create_timer(500.0, rock_spawner)
frame.set_keydown_handler(keydown)
frame.set_keyup_handler(keyup)    

my_ship = Ship([WIDTH / 2, HEIGHT / 3], [0, 0], 0, ship_image, ship_info)
missile_group = set()
explosion_group = set()
missile_timer = simplegui.create_timer(700.0, missile)
respawn_timer = simplegui.create_timer(700.0, my_ship.respawn)

# get things rolling
timer.start()
frame.start()

Thursday, June 6, 2013

Playing with text file data in Python: Building SQL delete statements from unstructured data

Yesterday I played around with excel files to build SQL update statements and today I got an opportunity do similar stuff, but this time I got the text file that had column names in the first line and values in following lines separated by spaces. The goal is delete a bunch of records from a table based on the eventids etc. It looks like this:



First, you need to open the file to read it. The rb argument is to open the file for reading in binary mode in windows.

myFile = open("D:\Scripts\deldata.txt", "rb")

I initially tried to find the number of lines in the text file thinking this might be useful while iterating through the lines and when trying to make a list out of these . I know, you could just open the text file in notepad++ and look at the end of the document, but where is the fun in that? So this is what I came up with.

myFile = open("D:\Scripts\deldata.txt", 'rb')
i = 0
for line in enumerate(myFile):
    i += 1
print i

I know that totally sucks. May be there a more elegant way?

myFile = open("D:\Scripts\deldata.txt", "rb")
numLines = sum(1 for line in open myFile)
print numLines

It turns out, you don't really need it. You can just loop through each line and manipulate the lines as needed. When I printed the lines, there were lots of white spaces and tabs in the resulting strings. After browsing through a lot of stackoverflow questions and reading documentation and unsuccessfully trying to use regular expressions, I found an easier way to deal with the white spaces. Just use string.split() method and turn the string into a list.

__author__ = 'RK'
myFile = open("D:\Scripts\deldata.txt", "rb")

for line in myFile:
    line = line.split() #remove white spaces and tabs
    for value in line:
        print ("DELETE FROM REGISTR WHERE eventid = '%s' AND subeventid = '%s' AND personid = '%s' 
                AND RegNo = '%s'" %(line[0], line[1], line[2], line[3]))
    





Wednesday, June 5, 2013

Manipulating Excel data with Python: Building SQL statements from data in excel

I got an excel doc with a couple of sheets from our dev team. Each sheet has two columns from one of our database tables. Lets call these columns CurrentID, LegacyID. I need to update the LegacyID based on CurrentID. Each sheet has more than 2000 values and it is pointless trying to build the SQL statements manually. Its pretty easy to get this done using TSQL. Just import this data into a temporary table and build update statements using TSQL. Since I have been trying to learn Python, I thought I'd give it a try. I used xlrd module. Check the documentation here. Below code writes the update statements to a text file from sheet1.

import xlrd

book = xlrd.open_workbook('D:\Scripts\UpdateID01.xls')
sheet = book.sheet_by_index(0)

myList = []
for i in range(1,sheet.nrows):
   myList.append(sheet.row_values(i))

#print myList
outFile = open('D:\Scripts\update.txt', 'wb')

for i in myList:
    outFile.write("\nUPDATE MyTable SET LegacyID = '%s' WHERE CurrentID = '%s'" %( int(i[1]), str(i[0])))

The second sheet had white spaces at the end of CurrentID values and the update statements will fail if they are not removed. To deal with it, use str.replace method.

import xlrd

book = xlrd.open_workbook('D:\Scripts\UpdateID01.xls')
sheet = book.sheet_by_index(1)

myList = []
for i in range(1,sheet.nrows):
   myList.append(sheet.row_values(i))

#print myList
outFile = open('D:\Scripts\updatemyesi.txt', 'wb')

for i in myList:
    i[0] = str(i[0]).replace('  ', '')
    outFile.write("\nUPDATE MyTable SET LegacyID = 0 WHERE CurrentID = '%s'" %(str(i[0])))

Friday, May 31, 2013

Python Game: Blackjack

This week's game is Blackjack. Go play it here

# Blackjack
__author__ = 'RK'
import simplegui
import random

CARD_SIZE = (73, 98)
CARD_CENTER = (36.5, 49)
card_images = simplegui.load_image("http://commondatastorage.googleapis.com/codeskulptor-assets/cards.jfitz.png")

CARD_BACK_SIZE = (71, 96)
CARD_BACK_CENTER = (35.5, 48)
card_back = simplegui.load_image("https://www.dropbox.com/s/75fcadgeewharzg/joker.jpg?dl=1")

# initialize some useful global variables
in_play = False
outcome = ""
score = 0

# define globals for cards
SUITS = ('C', 'S', 'H', 'D')
RANKS = ('A', '2', '3', '4', '5', '6', '7', '8', '9', 'T', 'J', 'Q', 'K')
VALUES = {'A':1, '2':2, '3':3, '4':4, '5':5, '6':6, '7':7, '8':8, '9':9, 'T':10, 'J':10, 'Q':10, 'K':10}


# define card class
class Card:
    def __init__(self, suit, rank):
        if (suit in SUITS) and (rank in RANKS):
            self.suit = suit
            self.rank = rank
        else:
            self.suit = None
            self.rank = None
            print "Invalid card: ", suit, rank

    def __str__(self):
        return self.suit + self.rank

    def get_suit(self):
        return self.suit

    def get_rank(self):
        return self.rank

    def draw(self, canvas, pos):
        card_loc = (CARD_CENTER[0] + CARD_SIZE[0] * RANKS.index(self.rank),
                    CARD_CENTER[1] + CARD_SIZE[1] * SUITS.index(self.suit))
        canvas.draw_image(card_images, card_loc, CARD_SIZE, [pos[0] + CARD_CENTER[0], pos[1] + CARD_CENTER[1]], CARD_SIZE)

# define hand class
class Hand:
    def __init__(self):
        self.hand = []

    def __str__(self):
        s = ''
        for c in self.hand:
            s += str(c)
            s += " "
        return s

    def add_card(self, card):
        self.hand.append(card)

    def hit(self,deck):
        self.add_card(deck.deal_card())
        player.get_value()
        dealer.get_value()

    def get_value(self):
        """ count aces as 1, if the hand has an ace, then add 10 to hand value if it doesn't bust"""
        global in_play, message, score
        self.value = 0
        A = 0
        for card in self.hand:
            if card[1] == 'A':
                A += 1
            self.value += VALUES[card[1]]
        if A > 0 and self.value < 12:
            self.value += 10
        if self.value > 21:
            if in_play and (player.value > 21):
                message = "You lose! The computer wins!"
                score -= 1
            in_play = False
        return self.value


    def draw(self, canvas, pos):
        """draw a hand on the canvas, use the draw method for cards"""
        x = 0
        for card in self.hand:
            card = Card(self.hand[x][0], self.hand[x][1])
            card.draw(canvas, [x * 90 + 50, pos * 200])
            x += 1


# define deck class
class Deck:
    def __init__(self):
        self.deck = [(suit, rank) for suit in SUITS for rank in RANKS]
        self.shuffle()

    def shuffle(self):
        random.shuffle(self.deck)

    def deal_card(self):
        return self.deck.pop()


#define event handlers for buttons
def deal():
    global outcome, in_play,deck, hand, dealer, hand_total, player, message, score
    message = "Do you choose to Hit or Stand?"
    if in_play:
        score -= 1
        message = "That's cheating!"
    hand_total = 0
    deck = Deck()
    player = Hand()
    dealer = Hand()
    player.hit(deck)
    player.hit(deck)
    dealer.hit(deck)
    dealer.hit(deck)
    in_play = True


def hit():
    global player, in_play, message
    """if the hand is in play, hit the player"""
    player.get_value()
    if (player.value <= 21) and in_play:
        player.hit(deck)
        if player.value < 21:
            message = "OK. Do you want to Hit again or Stand?"
            """if busted, assign a message to outcome, update in_play and score"""
    else:
        message = "STOP. CLICKING. THAT."


def stand():
    global value, message, in_play, score
    if in_play == False:
        message = "STOP. CLICKING. THAT."
    else:
        player.value
        message = "Please wait! Computer is making its move..."
        timer.start()
    in_play = False

def dealercard():
    global score, message
    if dealer.value < 17:
        dealer.hit(deck)
    elif (player.value > dealer.value) or (dealer.value > 21):
        message = "You win. Congrats! Deal again?"
        score += 1
        timer.stop()
    else:
        message = "Computer Wins. Deal again?"
        score -= 1
        timer.stop()

# draw handler
def draw(canvas):
    global dealer, player, message, in_play, score
    scorestring = "Your points are: "
    scorestring += str(score)
    dealer.draw(canvas, 1)
    player.draw(canvas, 2)
    canvas.draw_text(message, [50, 185], 18, "Black")
    canvas.draw_text(scorestring, [480, 555], 14, "Yellow")
    canvas.draw_text("BlackJack Game", [480, 585], 18, "Black")
    if in_play:
        canvas.draw_image(card_back, CARD_BACK_CENTER, CARD_BACK_SIZE, (88, 249), (70, 94))

# initialization frame
frame = simplegui.create_frame("Blackjack", 600, 600)
frame.set_canvas_background("Gray")
timer = simplegui.create_timer(1000, dealercard)

#create buttons and canvas callback
frame.add_button("Deal", deal, 200)
frame.add_button("Hit",  hit, 200)
frame.add_button("Stand", stand, 200)
frame.set_draw_handler(draw)

# get things rolling
frame.start()
deck = Deck()
player = Hand()
dealer = Hand()
message = "BlackJack Game.                       Please hit Deal"

Thursday, May 30, 2013

A simple list manipulation - Computing primes using Sieve of Eratosthenes process

I am incredibly dumb that I had to wrestle with this problem for a long time before being able to solve it. But the experience has been invaluable.

The Problem:

Initialize n to be 1000. Initialize numbers to be a list of numbers from 2 to n, but not including n. With results starting as the empty list, repeat the following as long as numbers contains any numbers.
1. Add the first number in numbers to the end of results.
2. Remove every number in numbers that is evenly divisible by (has no remainder when divided by) the number that you
had just added to results.

How long is results?

The Solution (It is ugly!)

n  = 1000 
numbers = range(2, n)
results = []
while numbers:
    i = numbers[0]
    results.append(i)
    for number in numbers:
        if number % i == 0:
            numbers.remove(number)

print numbers
print len(results)


It looks like this computes the primes less than n, by a process known as Sieve of Eratosthenes. Below is another, more elegant and pythonic way to do it.

n = 1000
numbers = range(2, n)
results = []

while numbers != []:
    results.append(numbers[0])
    numbers = [n for n in numbers if n % numbers[0] != 0]

print len(results)

I was stuck with infinite loops with my original program before I stumbled upon pythontutor.org and saw the execution step by step. Below is the visualized execution of my program.



SQL Server: Storage internals, recovery model and log behavior

This is not a comprehensive post on SQL Server storage and internals, but we recently ran into an interesting issue. We were doing PCI scans on some of our servers for PCI compliance. There is a tool that scans each bit and byte inside the database and server and return a list of possible violations. Our developers identified data that needs to be removed in some legacy systems and they cleaned up the data by updating offending fields with blank values, deleting some tables altogether etc. But the scan tool would often find violations even after truncating/deleting data. Some of those databases were on simple and some were in FULL recovery mode and I had to explain how log backups, recovery and checkpoint affect the databases and I came up with this small tutorial.

Lets create a database to play with. Lets set that database in FULL recovery mode and create some data and see whats in the log.
CREATE DATABASE storagefun;
GO

ALTER DATABASE storagefun SET RECOVERY FULL
GO

A database with no data in it is no good. Lets create a table and see. I am going to define DEFAULTS just to save some time and effort while inserting records into that table. I'm just lazy:

USE storagefun
GO
 
CREATE TABLE CreditInfo 
(
id int identity,
name nvarchar(30) DEFAULT 'SQL Server is fun!',
ccnumber nvarchar(40) DEFAULT '1234-4567-7891-1023'
)

A full backup initiates an LSN chain, so lets do a FULL backup.
BACKUP DATABASE storagefun TO DISK = 'D:\SQLData\Backups\Storagefun.bak'

Lets check and try to find more details about our transaction log. There is a very useful DBCC command, LOGINFO which allows us to do that. Here is Paul Randal demonstrating the circular nature of the log using DBCC LOGINFO.

DBCC LOGINFO
GO

You should see some thing similar to the below image.



So there are a couple of VLFs (Virtual Log Files) and on one of them, the status is 2, which means 'Active'. Can we find more information? What is in those log files? Yes we can, using the undocumented function fn_dblog

SELECT
 [Previous LSN] ,
 [Current LSN] ,
 [Transaction ID] ,
 [Operation],
 [CONTEXT],
 [AllocUnitName] ,
 [Page ID],
 [Slot ID],
 [Offset in Row] ,
 [Transaction Name] ,
 [Begin Time],
 [End Time],
 [Number of Locks] ,
 [Lock Information] ,
 [RowLog Contents 0] ,
 [RowLog Contents 1] ,
 [RowLog Contents 2] ,
 [RowLog Contents 3] ,
 [RowLog Contents 4] ,
 [Log Record],
 [Log Record Fixed Length] ,
 [Log Record Length]
FROM fn_dblog (null,null)
GO

This will return a lot of data which may not make much sense to you at first. That's ok. We just created a table and there is no data in it. Let us insert some data and see how the log behaves.

INSERT INTO CreditInfo VALUES ( DEFAULT, DEFAULT);
GO 200

How does out VLF files look now?

DBCC LOGINFO
GO


Our VLF count has increased and now two VLFs are marked as Active. Let us insert some more records and see what happens to the log

INSERT INTO CreditInfo VALUES ( DEFAULT, DEFAULT);
GO 5000

DBCC LOGINFO
GO

Since we seem to have a lot of data, may be try and read the log and see if we can make sense of the output from fn_dlog function output. Lets use the query we used above, but only this time, we know we only ever dealt with the CreditInfo table. So lets narrow the scope of the results based on this.

SELECT
 [Previous LSN] ,
 [Current LSN] ,
 [Transaction ID] ,
 [Operation],
 [CONTEXT],
 [AllocUnitName] ,
 [Page ID],
 [Slot ID],
 [Offset in Row] ,
 [Transaction Name] ,
 [Begin Time],
 [End Time],
 [Number of Locks] ,
 [Lock Information] ,
 [RowLog Contents 0] ,
 [RowLog Contents 1] ,
 [RowLog Contents 2] ,
 [RowLog Contents 3] ,
 [RowLog Contents 4] ,
 [Log Record],
 [Log Record Fixed Length] ,
 [Log Record Length]
FROM fn_dblog (null,null)WHERE AllocUnitName LIKE '%CreditInfo%'

You'll notice the details of the operation we performed, LOP_INSERT_ROWS. So is that the actual data? Yes and no. Its not actual data in the sense of data in a data file. It is knowledge about what happened to that data for SQL Server's use. I had to demonstrate this to our devs because they were claiming that there is credit card info in the log files as per the scanning tool.



Coming back to log, observe how the log file grows and how most of the VLFs are in Active mode. This means they are currently being used and are not marked for reuse yet. We obviously dont want our log file to grow indefinitely. It needs to be cleared, right? So what clears a transaction log file when the database is in Full Recovery Mode? Two things- First, When a checkpoint has occurs since the last full backup AND a transaction log backup. Lets issue a manual CHECKPOINT and do a log backup and see what happens to our log.

CHECKPOINT
GO

BACKUP LOG storagefun TO DISK = 'D:\SQLData\storagefun_log.trn'
GO

What happened? All the VLFs are still there, but the log backup has cleared the transaction log and the existing virtual log files are marked as inactive. So they can be used again for any transactions in future. Now, try to read the log again and limit your scope to CreditInfo table.

SELECT
 [Previous LSN] ,
 [Current LSN] ,
 [Transaction ID] ,
 [Operation],
 [CONTEXT],
 [AllocUnitName] ,
 [Page ID],
 [Slot ID],
 [Offset in Row] ,
 [Transaction Name] ,
 [Begin Time],
 [End Time],
 [Number of Locks] ,
 [Lock Information] ,
 [RowLog Contents 0] ,
 [RowLog Contents 1] ,
 [RowLog Contents 2] ,
 [RowLog Contents 3] ,
 [RowLog Contents 4] ,
 [Log Record],
 [Log Record Fixed Length] ,
 [Log Record Length]
FROM fn_dblog (null,null) WHERE AllocUnitName LIKE '%CreditInfo%'
GO

Boom! there is nothing related to CreditInfo object. Didn't I tell you that the log backup cleared the transaction log?

But you can see that data in the data file. Use a simple select.

Ok, this is credit card data and one internet kitten will die for each row you keep in your database. Its vile. So lets get rid of it. After you truncate the table, will you find any information about it in the log file? Yes, you will. You'll see LOP_MODIFY_ROW operation.

TRUNCATE TABLE CreditInfo;

SELECT
 [Previous LSN] ,
 [Current LSN] ,
 [Transaction ID] ,
 [Operation],
 [CONTEXT],
 [AllocUnitName] ,
 [Page ID],
 [Slot ID],
 [Offset in Row] ,
 [Transaction Name] ,
 [Begin Time],
 [End Time],
 [Number of Locks] ,
 [Lock Information] ,
 [RowLog Contents 0] ,
 [RowLog Contents 1] ,
 [RowLog Contents 2] ,
 [RowLog Contents 3] ,
 [RowLog Contents 4] ,
 [Log Record],
 [Log Record Fixed Length] ,
 [Log Record Length]
FROM fn_dblog (null,null) WHERE AllocUnitName LIKE '%CreditInfo%'

The credit card scan utility still sees the data, we just truncated the table and still. Ok, lets do a log backup.

BACKUP LOG storagefun TO DISK  = 'D:\SQLData\storagefun_lon1.trn'

Read the log again

SELECT
 [Previous LSN] ,
 [Current LSN] ,
 [Transaction ID] ,
 [Operation],
 [CONTEXT],
 [AllocUnitName] ,
 [Page ID],
 [Slot ID],
 [Offset in Row] ,
 [Transaction Name] ,
 [Begin Time],
 [End Time],
 [Number of Locks] ,
 [Lock Information] ,
 [RowLog Contents 0] ,
 [RowLog Contents 1] ,
 [RowLog Contents 2] ,
 [RowLog Contents 3] ,
 [RowLog Contents 4] ,
 [Log Record],
 [Log Record Fixed Length] ,
 [Log Record Length]
FROM fn_dblog (null,null) WHERE AllocUnitName LIKE '%CreditInfo%'

Still see the records? How about a checkpoint

CHECKPOINT
GO

At this point, if you try to read again, you wont see any records related to CreditInfo and its completely gone. I was able to explain about FULL Recovery mode, nature of log using this example to our devs and I hope you find it useful too.

Wednesday, May 29, 2013

Python Classes and Objects

It took me 4 hours to figure out whats going on with classes and objects in Python and I am not even sure if I understand them completely yet. Yeah, I am dumb. Anyways - I was able to complete this 'Hello World' kind of program for calculating account balance and fees.

__author__ = 'RK Kuppala'

class BankAccount:
    def __init__(self, balance):
        """Creates an account with the given balance."""
        self.balance = balance
        self.counter = 0

    def withdraw(self, amount):
        """
        Withdraws the amount from the account.  Each withdrawal resulting in a
        negative balance also deducts a penalty fee of 5 dollars from the balance.
        """
        if self.balance - amount < 0:
            self.balance -= amount+5
            self.counter += 5
        else:
            self.balance -= amount

    def deposit(self, amount):
        """Deposits the amount into the account."""
        self.balance += amount

    def get_balance(self):
        """Returns the current balance in the account."""
        return self.balance

    def get_fees(self):
        """Returns the total fees ever deducted from the account."""
        return self.counter

#lets test if it works
my_account = BankAccount(5)
my_account.withdraw(15)
my_account.deposit(20)
my_account.withdraw(60)
my_account.deposit(20)
my_account.withdraw(60)

print my_account.get_balance(), my_account.get_fees()

Friday, May 24, 2013

Another simple Python game: Memory

This week's assignment is a simple and straight forward Memory game. You play with a set of cards facedown and you are allowed to flip and view two cards in one turn and if they both match, they remain opened. If they don't they go back to facedown mode and you use your memory to match the cards. You can play it here:

UPDATE: I was doing peer evaluation for this and I found one implementation which is better than mine. It uses classes. Check it here:

__author__ = 'RK Kuppala'
# implementation of card game - Memory
import simplegui
import random

num_list = []
exposed = []
state = 0
first_pick = 0
second_pick = 0
moves = 0

# helper function to initialize globals
def init():
    global num_list, exposed, moves
    moves = 0
    num_list = [i%8 for i in range(16)]
    random.shuffle(num_list)
    exposed = [False for i in range(16)]
    pass


# define event handlers
def mouseclick(pos):
    global state, first_pick, second_pick, moves
    this_pick = int(pos[0] / 50)
    if state == 0:
        first_pick = this_pick
        exposed[first_pick] = True
        state = 1
        moves += 1
    elif state == 1:
        if not exposed[this_pick]:
            second_pick = int(pos[0] / 50)
            exposed[second_pick] = True
            state = 2
            moves += 1
    elif state == 2:
        if not exposed[this_pick]:
            if num_list[first_pick] == num_list[second_pick]:
                pass
            else:
                exposed[first_pick] = False
                exposed[second_pick] = False
            first_pick = this_pick
            exposed[first_pick] = True
            state = 1
            moves += 1
    l.set_text("Moves = " + str(moves))
    pass


# cards are logically 50x100 pixels in size
def draw(canvas):
    offset = 50
    hor_pos = -25
    for i in range(len(num_list)):
        hor_pos += offset
        canvas.draw_text(str(num_list[i]), [hor_pos, 50], 30, "White")
    exposed_pos = -50
    for i in exposed:
        exposed_pos += offset
        if not i:
            canvas.draw_polygon([(exposed_pos, 0), (exposed_pos + 50, 0), (exposed_pos + 50, 100), (exposed_pos + 0, 100)], 10, "White", "Orange")

# create frame and add a button and labels
frame = simplegui.create_frame("Memory", 800, 100)
frame.add_button("Restart", init)
l=frame.add_label("Moves = 0")

# initialize global variables
init()

# register event handlers
frame.set_mouseclick_handler(mouseclick)
frame.set_draw_handler(draw)

# get things rolling
frame.start()

Saturday, May 18, 2013

Simple Python Game: Pong

Here goes this week's game. The classic Pong game. You can play this game on your browser here (don't try this on IE). Its a two player game. Player 1 can use 'w' and 's' keys and Player 2 can use ↑ and ↓. I would like to make this game working locally on Windows using tkinter. Let see

# Author: RK

import simplegui
import random

# initialize globals - pos and vel encode vertical info for paddles
WIDTH = 600
HEIGHT = 400       
BALL_RADIUS = 20
PAD_WIDTH = 8
PAD_HEIGHT = 80
HALF_PAD_WIDTH = PAD_WIDTH / 2
HALF_PAD_HEIGHT = PAD_HEIGHT / 2

ball_pos = [WIDTH/2, HEIGHT/2]
ball_vel = [-random.randrange(60, 180) / 60, random.randrange(120, 240) / 60]
paddle1_pos = HEIGHT/2
paddle2_pos = HEIGHT/2

# helper function that spawns a ball by updating the 
# ball's position vector and velocity vector
# if right is True, the ball's velocity is upper right, else upper left
def ball_init(right):
    global ball_pos, ball_vel # these are vectors stored as lists
    ball_pos = [WIDTH/2,HEIGHT/2]
    ball_vel[1] = -random.randrange(60, 180)/60
    if right == True:
        ball_vel[0] = random.randrange(120, 240)/60
    else:
        ball_vel[0] = -random.randrange(120, 240)/60
    pass

def new_game():
    global paddle1_pos, paddle2_pos, paddle1_vel, paddle2_vel
    global score1, score2
    paddle1_pos = HEIGHT/2
    paddle2_pos = HEIGHT/2
    paddle1_vel = 0
    paddle2_vel = 0
    score1 = 0
    score2 = 0
    ball_init(0 == random.randrange(0, 11) % 2)
    pass 

# define event handlers


def draw(c):
    global score1, score2, paddle1_pos, paddle2_pos, paddle1_vel, paddle2_vel, ball_pos, ball_vel
    # update paddle's vertical position, keep paddle on the screen
    if paddle1_pos < (HALF_PAD_HEIGHT) and paddle1_vel < 0:
        paddle_vel = 0
    if paddle2_pos < (HALF_PAD_HEIGHT) and paddle2_vel < 0:
        paddle2_vel = 0
    if paddle1_pos > (HEIGHT - (HALF_PAD_HEIGHT)) and paddle1_vel >0:
        paddle2_vel = 0
    if paddle2_pos > (HEIGHT - (HALF_PAD_HEIGHT)) and paddle2_vel > 0:
        paddle2_vel = 0
    paddle1_pos += paddle1_vel
    paddle2_pos += paddle2_vel
    # draw mid line and gutters
    c.draw_line([WIDTH / 2, 0],[WIDTH / 2, HEIGHT], 1, "White")
    c.draw_line([PAD_WIDTH, 0],[PAD_WIDTH, HEIGHT], 1, "White")
    c.draw_line([WIDTH - PAD_WIDTH, 0],[WIDTH - PAD_WIDTH, HEIGHT], 1, "White")
    # draw paddles
    c.draw_polygon([(0, paddle1_pos-HALF_PAD_HEIGHT), (0, paddle1_pos+HALF_PAD_HEIGHT), (PAD_WIDTH-2, paddle1_pos+HALF_PAD_HEIGHT),(PAD_WIDTH-2,paddle1_pos-HALF_PAD_HEIGHT)], PAD_WIDTH-1, "White","White")
    c.draw_polygon([(WIDTH, paddle2_pos-HALF_PAD_HEIGHT), (WIDTH, paddle2_pos+HALF_PAD_HEIGHT), (WIDTH-PAD_WIDTH+2, paddle2_pos+HALF_PAD_HEIGHT),(WIDTH-PAD_WIDTH+2,paddle2_pos-HALF_PAD_HEIGHT)], PAD_WIDTH-1, "White","White") 
    # update ball
    ball_pos[0] += ball_vel[0]
    ball_pos[1] += ball_vel[1]
    if ball_pos[1] >= (HEIGHT - BALL_RADIUS) or ball_pos[1] <= (BALL_RADIUS):
        ball_vel[1] = -ball_vel[1]
    if ball_pos[0] <= (PAD_WIDTH + BALL_RADIUS):
        if ball_pos[1] < (paddle1_pos - HALF_PAD_HEIGHT) or ball_pos[1] > (paddle1_pos + HALF_PAD_HEIGHT):
            ball_init(True)
            score2 += 1
        else:
            ball_vel[0] = -ball_vel[0] * 1.1
            
    if  ball_pos[0] >= (WIDTH - PAD_WIDTH - BALL_RADIUS):
        if ball_pos[1] < (paddle2_pos - HALF_PAD_HEIGHT) or ball_pos[1] > (paddle2_pos + HALF_PAD_HEIGHT):
            ball_init(False)
            score1 += 1
        else:
            ball_vel[0] = -ball_vel[0] * 1.1        
    # draw ball and scores
    c.draw_circle(ball_pos, BALL_RADIUS, 2, "Yellow", "White")
    c.draw_text(str(score1), (170, 50), 36, "Yellow")
    c.draw_text(str(score2), (400, 50), 36, "Yellow")
    
def keydown(key):
    global paddle1_vel, paddle2_vel
    if key == simplegui.KEY_MAP['w']:
        paddle1_vel = -4
    elif key == simplegui.KEY_MAP['s']:
        paddle1_vel = 4
    elif key == simplegui.KEY_MAP['up']:
        paddle2_vel = -4
    elif key == simplegui.KEY_MAP['down']:
        paddle2_vel = 4
        
def keyup(key):
    global paddle1_vel, paddle2_vel
    if key == simplegui.KEY_MAP['w']:
        paddle1_vel = 0
    elif key == simplegui.KEY_MAP['s']:
        paddle1_vel = 0
    elif key == simplegui.KEY_MAP['up']:
        paddle2_vel = 0
    elif key == simplegui.KEY_MAP['down']:
        paddle2_vel = 0

# create frame
frame = simplegui.create_frame("Pong", WIDTH, HEIGHT)
frame.set_draw_handler(draw)
frame.set_keydown_handler(keydown)
frame.set_keyup_handler(keyup)

# start frame
frame.start()
new_game()

Thursday, May 16, 2013

Step By Step: SQL Server 2012 AlwaysOn Availability Groups Lab Setup Using VirtualBox

I was inspired to write this post by the super awesome DBA, Jonathan Kehayias of sqlskills. He has a great series of articles on setting up a home lab using virtualbox and evaluation software from Microsoft available as free downloads. Check out his articles on sqlskills.

If you are on this page and reading this, chances are that you already know what AlwaysOn is so I am not going to explain it again.For a quick overview of this new High availability/disaster recovery solution, go read this article. Now lets get started with building our lab.

What do you need?

In order to follow this tutorial, you need the following

The latest version of VirtualBox
Windows server 2008 R2 with sp1, evaluation edition
SQL Server 2012 Evaluation edition

Coming to the hardware, you need a decent laptop/desktop with 8 GB RAM and a good processor (may be core i5). You can try with lesser RAM too, but things will be a little slow. You need at least 4 VMs to test a basic setup of AlwaysOn. The server that acts as a domain controller can not be a part of availability groups. Don't worry about storage for all these VMs, because we are going to use linked clones that make use of resources from a base VM.

Saturday, May 11, 2013

A not so exciting game in Python: Stop Watch

Another week and another assignment. The game itself is not very exciting. You can start and stop the timer, every start is an attempt and every stop exactly at 0 is a win.

Thursday, May 9, 2013

Visual Fox Pro Hell: fpt file missing or invalid (memo file corruption)

I know, its 2013 and nobody should be dealing with Visual Fox Pro 6.0 databases, but one of the clients I support has a legacy system that uses foxpro databases. We have an SSIS job that pulls data everyday from `foxpro` and it recently started failing with the error, `person.fpt is missing or invalid`, I initially thought it was a network issue but the frequency of the failure increased and one fine day it stopped working at all. Lesson - don't wait for it to break completely. Because some day it will. So I naturally started googling and that was a time travel back to the ancient kingdoms of the internet. Microsoft's resources didn't help much, and I had to deal with shareware, freeware software that promise `.dbf` and `.fpt` repair as soon as you pay, help forums with no answers and annoying browser addons etc. After trying a lot of demo software I thought may be its just the `.fpt` file that needs rebuilding. So going though the documentation I thought I'd make a copy of this database and see if it rebuilds `.fpt` file and fortunately, it worked!

Wednesday, May 1, 2013

Game: Guess the number!

I have recently started learning Python from an online course and we had this course assignment to program this game. I am avoiding the name of the course and the name of the online resource on purpose because I don't want this show up on Google results. I'm just childishly happy that I wrote something and I wanted to save it here. This code may not work as is, because it uses some modules that are not part of standard python installations. After writing it in Python, I tried to do it on Powershell too and it kind of worked.

Thursday, April 25, 2013

Zen of Python

I just saw this on Python.org and I loved it so much I have to save it on my blog. It applies not just for Python but programming in general:

    Beautiful is better than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    Complex is better than complicated.
    Flat is better than nested.
    Sparse is better than dense.
    Readability counts.
    Special cases aren't special enough to break the rules.
    Although practicality beats purity.
    Errors should never pass silently.
    Unless explicitly silenced.
    In the face of ambiguity, refuse the temptation to guess.
    There should be one-- and preferably only one --obvious way to do it.
    Although that way may not be obvious at first unless you're Dutch.
    Now is better than never.
    Although never is often better than *right* now.
    If the implementation is hard to explain, it's a bad idea.
    If the implementation is easy to explain, it may be a good idea.
    Namespaces are one honking great idea -- let's do more of those!
                                                -- Tim Peters

Tuesday, April 23, 2013

Run VirtualBox VMs in the background - headless mode using VboxManage

I love virtualbox for my virtualization needs for its ease of use and low resource utilization. I have a bunch of VMs created to install and test SQL Server high availability features like AlwaysOn, Clustering etc. Working with these VMs I always wished if there was a way to just run these VMs in the background instead of keeping a window open for each VM. I googled around a bit and it is possible to start a VM in background mode, in virtualbox speak, its called headless mode.

Escaping wildcards while using LIKE predicate

When you look up the LIKE predicate on BOL you'll see the wildcard characters like %, _,[], [^] that will help finding a pattern in your tables. For example, the below sql will give you students whose lastname starts with D.

SELECT firstname, lastname
FROM students
WHERE lastname LIKE 'D%'

But things start getting interesting when your data has one or all of the above wildcards in your data and you would like to lookup a pattern. Lets create a table to illustrate this:

CREATE TABLE wildcards
(
  c1 int identity,
  c2 nvarchar(20)
);
--insert some data that has wildcards

INSERT INTO wildcards VALUES ('Joker')
INSERT INTO wildcards VALUES ('_Joker')
INSERT INTO wildcards VALUES ('Joker_')
INSERT INTO wildcards VALUES ('%Joker')
INSERT INTO wildcards VALUES ('%%Joker')
INSERT INTO wildcards VALUES ('[]Joker')
INSERT INTO wildcards VALUES ('[][]Joker')
INSERT INTO wildcards VALUES ('[^]Joker')
INSERT INTO wildcards VALUES ('__Joker')
GO
--check data,

SELECT * FROM wildcards

--9 rows

Now try a regular query with Like predicate and try to find all jokers that have an underscore preceding the value.

SELECT * FROM wildcards WHERE c2 LIKE '_%'

Surprise! SQL server returns all 9 rows. It thinks that we passed two wildcards. We need to tell SQL Server not to treat '_' as wildcard. You can do so using the ESCAPE keyword.

SELECT * FROM wildcards WHERE c2 LIKE '!_%' ESCAPE '!'
SELECT * FROM wildcards WHERE c2 LIKE '%!_' ESCAPE '!'


Monday, April 22, 2013

Scheduling SQL Profiler trace as SQL Agent job

Connect to the server using management studio and go to Tools and Launch SQL Server Profiler.
Connect to the server you want to run the trace on. 

Monday, March 11, 2013

Enable email to operator for all jobs using TSQL

There were a bunch of jobs that I needed to enable email notifications in case of a failure. I came up with this quick and dirty way:

DECLARE @job_id1 nvarchar(200)
DECLARE updateOperator CURSOR FOR 
select job_id from msdb..sysjobs WHERE enabled = 1

OPEN updateOperator
FETCH NEXT FROM updateOperator INTO @job_id1

WHILE @@FETCH_STATUS = 0
BEGIN
EXEC msdb.dbo.sp_update_job @job_id= @job_id1, 
  @notify_level_email=2, 
  @notify_level_netsend=2, 
  @notify_level_page=2, 
  @notify_email_operator_name=N'Master-DBA'

    FETCH NEXT FROM updateOperator INTO @job_id1
    END

CLOSE updateOperator
DEALLOCATE updateOperator

Tuesday, January 29, 2013

Stored procedure for exporting query results to text file

First you need to enable xp_cmdshell( I realize it can be a security issue so I always enable it on demand, do what I need to do and disable it immediately)


sp_configure 'show advanced options', 1
reconfigure
go
sp_configure 'xp_cmdshell', 1
reconfigure
go

And now the procedure -

CREATE PROCEDURE dbo.ExportText
AS
BEGIN
DECLARE @bcp AS VARCHAR (8000)
DECLARE @status AS INT

PRINT ''
PRINT 'Lets generate the text file'

SELECT @bcp = 'bcp "SELECT * FROM master..sysdatabases" QUERYOUT "D:\output.txt" -T -c'

EXEC @Status = master.dbo.xp_cmdshell @bcp, no_output
IF @Status <> 0
BEGIN
 PRINT 'Something went wrong. No text file for you!'
 RETURN
END ELSE
 PRINT 'Successful! Go check your text'
END


SSRS Report Access Error: User 'DOMAIN\LOGIN' does not have required permissions. UAC issue

I didn't configure reporting services on my local machine during the initial installation. I recently had to test something on my machine and configured everything from 'Reporting Services Configuration Manager' and when I tried to access reports on IE, I initially had to enter my windows login and password (which is silly, how hard is it to get the current login and security token without having to enter manually?)  I was greeted with the below message:

User 'DOMAIN\LOGIN' does not have required permissions. Verify that sufficient permissions have been granted and Windows User Account Control (UAC) restrictions have been addressed



Really? I had to launch IE using "Run As Admin" to get this to work. But that doesn't solve the problem forever. You either set your IE to run as an admin or disable UAC. But what if you access this server from different machines?

There is a workaround though: Once logged into report server, go to

Site Settings => Security => New Role Assignment => and Grant your admin login a System Admin role.


Wednesday, January 23, 2013

TIL: What makes a query relational?

1.The relational result set does not any order by default. If your query has an ORDER BY clause, your resultset ceases to be relational.
2. For any query to be relational, it must satisfy a number of requirements - No ORDER BY clause, All attributes must have names, all attributes must be unique, duplicates must not appear in the result.