Cursor Transformation

One of my favorite presentations to give is called “Changing Your Habits to Improve the Performance of Your T-SQL.” After giving this presentation in Austin, TX this past January, one of my students contacted me. He had seen in one of my demonstrations how badly cursors can perform, and he wanted to see if his stored procedure could be rewritten using set theory.

Scenario

He has an application that can only accept three data types (NVARCHAR, NUMERIC, and DATETIME), and, as all normal database schemas do, his database uses more than three data types. He needed to identify which columns had data types that could not be used and determine which of the three accepted data types should be used for each of those columns. The actual conversion was handled by another process, so the information needed to do the conversion was stored in a table.

The Cursor Approach

His approach was to use a cursor to cycle through all the columns in the provided table, analyze each column, determine the new data type, and store the information in a table variable. After the cursor was completed, the data in the table variable was written to a permanent table for the next process to use.

This approach isn’t necessarily bad. If you are only running it infrequently and you needed to write this stored procedure quickly, then it’s fine. But if this type of stored procedure needs to be run frequently, then it should be rewritten.

The Reason For the Rewrite

Every SQL statement used in a stored procedure can get its own execution plan, depending on whether or not it is parameterized. Parameterized queries that are written identically (including case, spaces, and line breaks) can reuse an execution plan. Those that are not will each receive their own execution plan. Each unique execution plan is stored in the plan cache, and when the cache fills up with many nearly identical plans, it’s called “cache bloat.”
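
As a quick illustration, the two ad hoc statements below differ only by an extra space before the WHERE clause, so SQL Server treats them as different statements and caches a separate plan for each. (The column name is my assumption; any column from the demo table makes the point.)

/* These two statements differ only by whitespace, so each one gets its own cached plan. */
SELECT SalesHeaderId FROM dbo.SalesHeader WHERE SalesHeaderId = 42;
SELECT SalesHeaderId FROM dbo.SalesHeader  WHERE SalesHeaderId = 42;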

Note: This does not apply to databases with the “optimize for ad hoc workloads” setting turned on, but that would be a different blog post.

What does this have to do with cursors? Each time the cursor loops (and this includes WHILE loops), each SELECT statement is executed independently and receives its own execution plan. If the query is parameterized, then execution plans can be reused. You can see this by using Extended Events or Profiler.

Note: If you try this out, don’t do it in production. You’ll be adding load to your SQL Server.
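
If you would rather not set up a trace, here is a minimal sketch that inspects the plan cache directly with the execution DMVs. A long list of single-use ad hoc plans at the top of the results is a symptom of cache bloat.

/* Look for single-use ad hoc plans accumulating in the plan cache. */
SELECT TOP (50)
cp.usecounts
,cp.objtype
,st.text
FROM
sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE
cp.objtype = 'Adhoc'
ORDER BY
cp.usecounts ASC;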

The Cursor

Below is a query similar to the one I was sent. This particular solution looks very complicated. There are several table variables: one holding the approved data types, one holding the column metadata from INFORMATION_SCHEMA.COLUMNS, and one that will hold the information needed for the conversion. There are also several variables to be used in the cursor. Finally, the table variable with the needed information is written to the permanent table.


USE DemoProgramming
GO
SET NOCOUNT ON;

/* These are the parameters that would be used in the stored procedure.*/
DECLARE
@schema AS NVARCHAR(200) = 'dbo'
,@tableName AS NVARCHAR(200) = 'SalesHeader'
/* This table variable will hold the approved data types. */
DECLARE @legalDataTypes TABLE
(
Data_Type NVARCHAR(500)
);

INSERT INTO @legalDataTypes
(Data_Type)
VALUES
('nvarchar'),
('numeric'),
('datetime');
/* use information_schema.columns to discover information about column types of input table */
DECLARE @ColumnInformation TABLE
(
ColumnName NVARCHAR(100) NOT NULL
PRIMARY KEY
,ColumnDataType NVARCHAR(20) NOT NULL
,ColumnLength NVARCHAR(10)
,NumericPrecision INT
,NumericScale INT
,OrdinalPosition INT
);

INSERT INTO @ColumnInformation
(ColumnName
,ColumnDataType
,ColumnLength
,NumericPrecision
,NumericScale
,OrdinalPosition
)
SELECT
COLUMN_NAME
,DATA_TYPE
,CHARACTER_MAXIMUM_LENGTH
,NUMERIC_PRECISION
,NUMERIC_SCALE
,ORDINAL_POSITION
FROM
INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_NAME = @tableName
AND TABLE_SCHEMA = @schema;

/* The table will be populated iteratively and eventually inserted into the meta data table */
DECLARE @resultSet TABLE
(
ColumnName NVARCHAR(100)
,ColumnDataType NVARCHAR(20)
,ColumnLength NVARCHAR(10)
,Position INT
);

/* These will be set each loop */
DECLARE @columnLength AS NVARCHAR(10)
DECLARE @newDataType AS NVARCHAR(20);

/*setup cursor*/
DECLARE @columnName NVARCHAR(100);
DECLARE @dataType NVARCHAR(100);
DECLARE @characterMaximumLength NVARCHAR(10);
DECLARE @ordinalPosition INT;
DECLARE @numericPrecision INT;
DECLARE @numericScale INT;

DECLARE allColumnsCursor CURSOR
FOR
SELECT
ColumnName
,ColumnDataType
,ColumnLength
,NumericPrecision
,NumericScale
,OrdinalPosition
FROM
@ColumnInformation;
OPEN allColumnsCursor;
FETCH NEXT FROM allColumnsCursor INTO @columnName, @dataType, @characterMaximumLength, @numericPrecision, @numericScale, @ordinalPosition;
WHILE @@FETCH_STATUS = 0
BEGIN
IF @dataType NOT IN (SELECT
Data_Type
FROM
@legalDataTypes)
/* illegal data types*/
BEGIN
IF @dataType IN ('varchar', 'char', 'nchar')
BEGIN
SET @newDataType = N'nvarchar';
SET @columnLength = CAST(@characterMaximumLength AS NVARCHAR(10));
END;
ELSE
IF @dataType IN ('decimal', 'float', 'real', 'money', 'smallmoney')
BEGIN
SET @newDataType = N'numeric';
SET @columnLength = '(19,8)';
END;
ELSE
IF @dataType IN ('bigint', 'smallint', 'tinyint', 'binary', 'varbinary', 'int')
BEGIN
SET @newDataType = N'numeric';
SET @columnLength = '(19,0)';
END;
ELSE
IF @dataType IN ('bit')
BEGIN
SET @newDataType = N'numeric';
SET @columnLength = '(1,0)';
END;
ELSE
IF @dataType IN ('smalldatetime', 'date', 'time', 'datetimeoffset', 'datetime2', 'timestamp')
BEGIN
SET @newDataType = N'datetime';
SET @columnLength = NULL;
END;
ELSE
BEGIN
DECLARE @ret_string VARCHAR(255);
EXEC sys.xp_sprintf @ret_string OUTPUT, '@columnName = %s has unrecognized @dataType = %s', @columnName, @dataType;
RAISERROR(@ret_string,16,1);
RETURN;
END;
END;
ELSE
BEGIN
/* legal data types, don't change datatype but capture correct columnLength */
-- VALUES ('nvarchar'),('numeric'),('datetime');
SET @newDataType = @dataType;
IF @dataType = 'nvarchar'
BEGIN
SET @columnLength = CAST(@characterMaximumLength AS NVARCHAR(10));
END;
ELSE
IF @dataType = 'numeric'
BEGIN
SET @columnLength = '(' + CAST(@numericPrecision AS NVARCHAR(10)) + ',' + CAST(@numericScale AS NVARCHAR(10)) + ')';
END;
ELSE
BEGIN
SET @columnLength = NULL;
END;
END;
INSERT INTO @resultSet
(ColumnName
,ColumnDataType
,ColumnLength
,Position
)
VALUES
(@columnName
,@newDataType
,@columnLength
,@ordinalPosition
);

FETCH NEXT FROM allColumnsCursor INTO @columnName, @dataType, @characterMaximumLength, @numericPrecision, @numericScale, @ordinalPosition;

END;
CLOSE allColumnsCursor;
DEALLOCATE allColumnsCursor;
/* populate two meta data tables*/
INSERT INTO dbo.DataTypeConversion_Cursor
(TableName
,ColumnName
,ColumnDataType
,ColumnLength
,OrdinalPosition
)
SELECT
@tableName
,ColumnName
,ColumnDataType
,ColumnLength
,Position
FROM
@resultSet;

The New Solution

The new solution consists of a permanent table that contains the conversion information for each data type. That allows a join between the conversion information and the metadata found in the system view “INFORMATION_SCHEMA.COLUMNS”. After the join, the new information can be inserted directly into the permanent table for the next process to consume.


/*Create this table one time*/
CREATE TABLE dbo.DataTypeConversion
(
OldDataType NVARCHAR(20)
,NewDataType NVARCHAR(20)
,columnLength NVARCHAR(20)
)

/*Values for the Data Types that will be converted.*/
INSERT INTO dbo.DataTypeConversion
(OldDataType, NewDataType, columnLength)
VALUES
('decimal', N'numeric', '(19,8)')
,('Float', N'numeric', '(19,8)')
,('real', N'numeric', '(19,8)')
,('money', N'numeric', '(19,8)')
,('smallmoney', N'numeric', '(19,8)')
,('varchar', N'nvarchar', '-1')
,('char', N'nvarchar', '-1')
,('nchar', N'nvarchar', '-1')
,('bigint', N'numeric', '(19,0)')
,('smallint', N'numeric', '(19,0)')
,('tinyint', N'numeric', '(19,0)')
,('binary', N'numeric', '(19,0)')
,('varbinary', N'numeric', '(19,0)')
,('int', N'numeric', '(19,0)')
,('bit', N'numeric', '(1,0)')
,('smalldatetime', N'datetime', NULL)
,('date', N'datetime', NULL)
,('time', N'datetime', NULL)
,('datetimeoffset', N'datetime', NULL)
,('datetime2', N'datetime', NULL)
,('timestamp', N'datetime', NULL)
,('numeric', N'numeric', NULL)
,('nvarchar', N'nvarchar', NULL)

/* Parameters that would be used with the stored procedure. */
DECLARE
@Tablename AS VARCHAR(100) = 'SalesHeader'
,@schema AS VARCHAR(20) = 'dbo'

INSERT INTO dbo.DataTypeConversion_SetTheory
(TableName
,ColumnName
,ColumnDataType
,ColumnLength
,OrdinalPosition
)
SELECT
c.TABLE_NAME
,c.COLUMN_NAME
,d.NewDataType

/*Conversion for data types*/
,CASE WHEN d.NewDataType = 'nvarchar' THEN CAST(c.CHARACTER_MAXIMUM_LENGTH AS NVARCHAR(10))
WHEN d.OldDataType = 'numeric' THEN '(' + CAST(c.NUMERIC_PRECISION AS NVARCHAR(10)) + ',' + CAST(c.NUMERIC_SCALE AS NVARCHAR(10)) + ')'
ELSE d.columnLength
END AS NewColumnLength
,c.ORDINAL_POSITION
FROM
INFORMATION_SCHEMA.COLUMNS AS c
JOIN dbo.DataTypeConversion AS d ON c.DATA_TYPE = d.OldDataType
WHERE
c.TABLE_NAME = @Tablename
AND c.TABLE_SCHEMA = @schema;
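
As a quick sanity check, the sketch below compares the two output tables from the examples above; if the set-based version produced the same metadata as the cursor, it returns no rows.

/* Rows produced by the cursor version that are missing or different in the set-based version. */
SELECT ColumnName, ColumnDataType, ColumnLength, OrdinalPosition
FROM dbo.DataTypeConversion_Cursor
EXCEPT
SELECT ColumnName, ColumnDataType, ColumnLength, OrdinalPosition
FROM dbo.DataTypeConversion_SetTheory;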

Giving thanks

I want to give thanks to my student, Mark Lai, and the company he works for, for allowing me to write about this cursor transformation. Being the geek I am, I thoroughly enjoyed the challenge of thinking outside the box to rewrite the stored procedure.

Where in The World Has Mickey Been?

I was so sad to see that my last post was January 12th, but I do have to give myself a break. I’ve been busy with life and presentations. My oldest daughter is off to college this fall, but before she decided between Chicago and Boston, I wanted her to experience the cold, especially since we live in Southern California. I tied our college visitation trip to a SQL Saturday in Chicago, which happily was her idea. For those of you who are curious, Victoria will be attending Emerson in Boston. The school and the city are both a perfect fit for her. (This also means that occasionally I’ll be able to visit beautiful Boston to see my daughter and hopefully present at a SQL event or two.)

I’ve also been writing abstracts for new presentations for this year, creating a new presentation called “Be a Role Model for Women in Tech through Blogging”, and presenting, presenting, presenting.

My youngest daughter, oldest daughter, and two nieces are also graduating from either high school or the 8th grade this coming month. Sadly, I may not get to write in May either. The good news is, I will be writing and creating presentations through the summer. For those that have been reading my blog this past year, you know that my oldest daughter and I go to a coffee shop every Sunday. She studies for school and I write. This will be the last summer we get to do this together until her school breaks. I will miss this time with her. I’m hoping my youngest daughter will carry on the tradition. We’ll see.

T-SQL Tuesday #74 – Knowing When Data Changes Occur in Standard Edition

Robert L. Davis (b|t) is hosting this month’s T-SQL Tuesday blog party. The party was started by Adam Machanic (b|t) in December of 2009.

This month’s T-SQL Tuesday topic is Data Change. This is an important topic for me, since I’m about to create our first Data Warehouse at work. I’ve chosen to champion the companies that are using the Standard Edition of SQL Server.

Crestfallen

I took a class on Change Data Capture this past year at the SQL Saturday in Portland, Oregon. I absolutely loved it. I couldn’t wait to implement it. Then I found out it was an Enterprise Level feature only. I was crestfallen.

I went back to work and asked if we could consider using Enterprise Edition, but I was told that we have too many cores on our Nutanix hardware to make the features we would use worth the cost. You see, SQL Server pricing is based on the number of cores the underlying hardware has, not the number of cores you utilize in your VM. So what is a girl to do? (And guys, I haven’t forgotten you either.)

Since the Change Data Capture feature is not available to us Standard Edition users, let me take you through another pattern that uses auditing columns on each table.

The Basis of The Pattern I Use

I use a pattern that includes four fields on all transactional tables. This (absolutely) includes lookup tables too. The two table types that are an exception to this pattern are audit tables and error tables. I’ll cover why later in this article.

The four fields are CreatedOn, CreatedBy, UpdatedOn, and UpdatedBy. The date fields should be DATETIME2. CreatedOn is the easiest to populate: you can create a default on the field so it is populated with GETDATE().

The other fields can be a little trickier depending on your environment. There are two patterns for populating CreatedBy, as well as two patterns for populating UpdatedBy and UpdatedOn. Below you will find the pros and cons of each.

Pattern One For CreatedBy

The first pattern populates CreatedBy with the SYSTEM_USER function. This retrieves the Windows login that the user used to log in to the application.

Pros
  • This pattern allows the use of a default on the CreatedBy field, so the field is not dependent on code to populate it.
  • All users have a validated account.
  • Accounts that are used by SSIS or services don’t need to have a row in the User table.
Cons
  • The field will be a varchar or an nvarchar.
  • For reporting, the CreatedBy field will need to be parsed to find out which user inserted the record. (One way around this is to remove the domain before inserting the value.)

This is my preferred technique since you don’t need to rely on the User table being immediately populated for a user to start using an application. It also makes it easier when the UpdatedBy field is updated through a pattern below.
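
Here is a minimal sketch of Pattern One using a hypothetical demo table. The default also strips the domain from SYSTEM_USER so reporting doesn’t have to parse it out later.

/* Hypothetical table for Pattern One: CreatedBy defaults to the Windows login, minus the domain. */
CREATE TABLE dbo.DemoPatternOne
(
DemoPatternOneId INT IDENTITY PRIMARY KEY
,DemoValue VARCHAR(50) NOT NULL
,CreatedOn DATETIME2 NOT NULL DEFAULT GETDATE()
,CreatedBy NVARCHAR(128) NOT NULL
DEFAULT RIGHT(SYSTEM_USER, LEN(SYSTEM_USER) - CHARINDEX('\', SYSTEM_USER))
);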

 

Pattern Two for CreatedBy

This technique is based on an existing User table and uses the user’s ID for the value of CreatedBy.

Pros
  • This pattern allows the CreatedBy field to join to the User table through an integer. Information about the user can be determined easily by joining to the User table, and the integer takes less space than the varchar/nvarchar mentioned above.
  • CreatedBy does not rely on Windows authentication. This is helpful when not all users of the application have a Windows login account.
Cons
  • This pattern requires a row for various tasks in the User table when SSIS or services insert data automatically.
  • This pattern requires CreatedBy to be populated by the application instead of by a default value. This requires working with the AppDev team and getting the AppDev team to write some additional code for you. (If you need to resort to bribing them, I suggest pizza, donuts, chocolate, or beer.)
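
Here is a minimal sketch of Pattern Two, assuming a hypothetical dbo.AppUser table; CreatedBy is an integer foreign key that the application must supply on every insert.

/* Hypothetical tables for Pattern Two: CreatedBy references a User table. */
CREATE TABLE dbo.AppUser
(
AppUserId INT IDENTITY PRIMARY KEY
,UserName NVARCHAR(100) NOT NULL
);

CREATE TABLE dbo.DemoPatternTwo
(
DemoPatternTwoId INT IDENTITY PRIMARY KEY
,DemoValue VARCHAR(50) NOT NULL
,CreatedOn DATETIME2 NOT NULL DEFAULT GETDATE()
,CreatedBy INT NOT NULL REFERENCES dbo.AppUser (AppUserId) /* populated by the application */
);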

 

Now let’s talk about UpdatedOn and UpdatedBy. They are a little trickier to populate since they are only set when a row is updated.

Pattern One For UpdatedBy and UpdatedOn

This pattern requires code to populate the fields since there are no defaults on updated rows. As stated above, you’ll have to work with the AppDev team.

Note: This is not a bad thing. I whole heartedly believe in having a good relationship with the AppDev team. It makes it easier to work with the team when they have the ability to write SQL. It also has the benefit of being included earlier in design sessions. (I personally have been blessed with a great AppDev lead.)

Pros
  • This pattern can be used to populate the UpdatedBy field and even the UpdatedOn field using similar code.
  • Either user IDs or Windows accounts can be used to populate UpdatedBy.
Cons
  • The pattern is dependent on the AppDev team to remember to include UpdatedBy and UpdatedOn for every updated row, especially when there are tight deadlines.
  • You need to be vigilant that the fields are always updated, especially with lookup tables.

This is my preferred technique since it doesn’t involve triggers, which we’ll see in pattern two below.
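
As a minimal sketch of Pattern One (using the OrderStatusType table from the trigger example below, with made-up values), every UPDATE issued by the application or its stored procedures includes the two audit columns:

/* The application sets the audit columns as part of every update. */
UPDATE dbo.OrderStatusType
SET
IsActive = 0
,UpdatedBy = SYSTEM_USER /* or the application user's ID, depending on the CreatedBy pattern chosen */
,UpdatedOn = GETDATE()
WHERE
OrderStatusTypeId = 3;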

Pattern Two For UpdatedBy and UpdatedOn

This pattern uses an AFTER UPDATE trigger. The reason I chose the AFTER UPDATE trigger is that it waits until all the changes have been made to the table(s) being updated by other triggers.

Note: I do not encourage multiple tables to be updated in a single trigger. (Hmm. This sounds like a good future post.)

Pros
  • This can be used as a stop gap while the AppDev team refactors their code to populate UpdatedBy and UpdatedOn fields as well as CreatedBy.
  • There is no requirement to rely on the application code to populate these fields.
Cons
  • The UpdatedBy field can’t be populated with the user’s ID from the User table; the trigger only knows the login.
  • You might not be allowed to use triggers. (Note: Triggers are not bad when used correctly.)

CREATE TABLE dbo.OrderStatusType
(
OrderStatusTypeId INT IDENTITY PRIMARY KEY
,OrderStatusName VARCHAR(50) NOT NULL
,IsActive BIT NOT NULL DEFAULT 1
,CreatedBy VARCHAR(50) NOT NULL DEFAULT SYSTEM_USER
,CreatedOn DATETIME2 NOT NULL DEFAULT GETDATE()
,UpdatedBy VARCHAR(50)
,UpdatedOn DATETIME2
)
GO

CREATE TRIGGER dbo.Trg_OrderStatusType_u ON dbo.OrderStatusType
AFTER UPDATE
AS
SET NOCOUNT ON

UPDATE dbo.OrderStatusType
SET
UpdatedOn = ISNULL(i.UpdatedOn, GETDATE()) -- This makes sure if a NULL was used, it is overwritten with the current Date.
,UpdatedBy = ISNULL(i.UpdatedBy, SUSER_NAME()) -- This makes sure if a NULL was used, it is overwritten with the current user.
FROM
dbo.OrderStatusType AS ost
INNER JOIN inserted AS i ON ost.OrderStatusTypeID = i.OrderStatusTypeID
GO
INSERT INTO OrderStatusType
(OrderStatusName
,IsActive
)
VALUES
('Created',1)
,('Back Ordered', 1)
,('Shipped',1)
,('Completed',1)

GO

SELECT * FROM OrderStatusType AS ost

UPDATE
OrderStatusType
SET
IsActive = 0
WHERE
OrderStatusTypeId = 2

SELECT * FROM OrderStatusType AS ost

UPDATE
OrderStatusType
SET
IsActive = 0
,UpdatedBy = 'George'
,UpdatedOn = '2016-01-11 17:38'
WHERE
OrderStatusTypeId = 4

SELECT * FROM OrderStatusType AS ost

If you are adding this pattern to existing triggers such as “AFTER INSERT, UPDATE, DELETE”, then make sure to check that the trigger was fired by an update. This can be done by checking whether there are records in both the inserted and deleted virtual tables that are available inside triggers.

CREATE TRIGGER dbo.Trg_OrderStatusType_iud ON dbo.OrderStatusType
AFTER INSERT, UPDATE, DELETE
AS
SET NOCOUNT ON

IF EXISTS (SELECT 1 FROM Inserted) AND EXISTS (SELECT 1 FROM deleted)
BEGIN
UPDATE dbo.OrderStatusType
SET
UpdatedOn = ISNULL(i.UpdatedOn, GETDATE()) -- This makes sure if a NULL was used, it is overwritten with the current Date.
FROM
dbo.OrderStatusType AS ost
INNER JOIN inserted AS i ON ost.OrderStatusTypeID = i.OrderStatusTypeID
END
GO

Exceptions For Audit and Error Tables

These two types of tables are usually populated by inserts only. Most of the time the CreatedOn field is the only audit field that needs to be added. I would change the name to AuditedOn or ErroredOn respectively, since CreatedOn will most likely represent the creation of the original record.

Why are Lookup tables not in this list? They can be modified over time. Someone may need a value changed or the value needs to be discontinued. The row can’t be deleted when it is discontinued since that would break referential integrity.

In Summary

By using CreatedOn and UpdatedOn you can compare dates when data needs to be processed without having to compare the entire table.

One last thing on the UpdatedBy and UpdatedOn fields. For a long time I insisted that these two fields be NULL during an insert, but I have since changed my mind. By setting UpdatedBy and UpdatedOn to the same values as CreatedBy and CreatedOn during an insert, you won’t have to deal with the ISNULL function or fall back to CreatedOn for (some) comparisons.
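
Here is a minimal sketch of both ideas, reusing the OrderStatusType table from the trigger example above: defaults so UpdatedBy and UpdatedOn start out matching CreatedBy and CreatedOn, and an incremental query that only reads rows changed since the last run (the cutoff date is made up).

/* Give the Updated columns the same defaults as the Created columns. */
ALTER TABLE dbo.OrderStatusType
ADD CONSTRAINT DF_OrderStatusType_UpdatedBy DEFAULT SYSTEM_USER FOR UpdatedBy
,CONSTRAINT DF_OrderStatusType_UpdatedOn DEFAULT GETDATE() FOR UpdatedOn;

/* Pull only the rows that changed since the last time the process ran. */
DECLARE @LastProcessedOn DATETIME2 = '2016-01-01';
SELECT OrderStatusTypeId, OrderStatusName, IsActive, UpdatedBy, UpdatedOn
FROM dbo.OrderStatusType
WHERE UpdatedOn > @LastProcessedOn;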

Thanks For All The Fish

I want to thank Robert L. Davis for hosting this month and I look forward to participating in future T-SQL Tuesday blog parties.

Year in Review for 2015 and Future Goals for 2016

This has been another spectacular year in the SQL world. Unfortunately, I can’t find my list of goals, so I’ll have to wing it. I can say that I have had some unexpected surprises this year.

 

Level up!
  1. I have two favorite activities in the SQL world, one of which is speaking. This year I presented 14 times to over 1300 people. I spoke at two conferences, one of which was my second year at PASS Summit in Seattle. I spoke at 6 user groups, one of which I presented to remotely in Australia. I also spoke at 5 SQL Saturdays and once for Pragmatic Works.
  2. My second favorite activity is writing. This year, I started writing for SQL Server Central. I wrote two articles for them, which had more than 20K views. I have my third post scheduled for Jan 4th/5th, so keep your eyes open for it. I also had 17 blog posts on my website. My metrics for the year were off the wall: an 82% increase in sessions, a 403% increase in session duration, a 97% increase in page views (that’s 23K page views!), and my favorite, a 52% increase in users, which yields 9.5K users. Thanks, viewers!
  3. I created my meet and greet list for PASS Summit and I was able to meet most of the people on my list. Some of them I knew through conversing on Twitter, like Pinal Dave. I (finally) had an opportunity to give Buck Woody a big hug too. I also met some people that I didn’t know: Wawrick Rudd, Mellisa Lord, Michael Upton, Denis Horner, and many more.
  4. This is my second year co-leading our local BI user group with Rob Hatton.
  5. I had the honor of being part of the Friends of Red Gate program again. This is my third year.
  6. My husband granted my wish for our 20th wedding anniversary: going on my second SQL Cruise in the Caribbean, AND he went with me. When I mentioned I had been on a SQL Cruise in my interview for my current job, they thought I was kidding. If you have never heard of SQL Cruise, I highly recommend checking it out. There is nowhere else you can get 6 amazing instructors for 30 students. Those speakers are trapped on a ship with you, so you can actually spend time with them over drinks or dinner asking them any question you want. I had that opportunity with Jes Borland, Grant Fritchey, Kevin Kline, David Klee, Tim Ford, and Amy Ford.
I have some new goals for 2016 as well
  1. I was hoping to speak at least once a month again, but after looking at 2014 and 2015 it will be easy to speak an average of twice a month. I have a goal of 9 SQL Saturdays. Hopefully, I will get another opportunity to speak at PASS Summit. And I plan on speaking over the inter-webs as many times as I can. Here are some of the speaking engagements I already have planned.
    1. SQL Saturday 461 in Austin, TX, January 30th, 2016.
    2. Pragmatic Works on February 9th.
    3. Professional Development Virtual Chapter in March.
    4. SQL Saturday 497 in Huntington Beach, CA April 02, 2016.
    5. DBA Fundamentals Downunder Virtual Chapter in May.
    6. All the others will slowly appear on my Speaking Engagements page.
  2. I plan to continue writing for my own blog and for SQL Server Central with a goal of one post each month for each site. That is a lofty goal for me since almost all my writing is done on Sundays, in a little coffee shop with my daughter. (I also spend that time writing abstracts.) Wish me luck!
  3. I’ve enjoyed speaking on SSRS, but I’m going to change it up. I’ll continue speaking on writing better SQL, but I’m also going to take up another SQL subject. Stay tuned!
  4. I’ll continue co-leading our local BI user group.
  5. I plan on mentoring one of my colleagues, Ly Nguyen. He has a goal of becoming a DBA or a Database Developer. I’m super excited about this, since he is eager to learn.
  6. This next year, I want to spend more time on forums, helping others.
  7. Hopefully, I’ll be part of Friends of Red Gate for another year.
Stretch goals

I think it’s a good idea to have some stretch goals to help push yourself past your comfort zone. Here is mine.

  1. Create a full day session to present. This seems so overwhelming, but I was a Microsoft Trainer for two years at the beginning of my career, so I know it’s possible.
  2. Speak (physically) in another country. My Australian friends have been encouraging me to speak in Melbourne, AU. I’m not sure if it will be possible, since my oldest daughter will be attending college next year, and most of my speaking money will be redirected to her tuition. If I accomplish the first of my stretch goals, then this might be possible.
  3. Write another book. This is a big commitment of time. The good news is, my family is willing to support me in this endeavor. This is great, since they would hardly see me, except at dinner, until the project was completed.
I want to thank…

There is no way I could accomplish what I do without the support of friends and family. Here is this year’s shout out.

  1. My husband is definitely number one on this list. Whenever I have a really bad week, or I get bummed out about something, his first question is, “When is your next SQL Saturday?” Also, he encourages me to speak and write to my heart’s content.
  2. I always say, my first language is SQL and my second language is English. Luckily I have my daughter Victoria to help with my grammar and spelling. She is now one of my official editors and will continue to be my editor through college.
  3. Ben McNamara is my second editor. While Victoria can catch my English mistakes, Ben can catch the technical ones. He is also one of my touchstones when I get nervous about speaking or am taking criticism personally.
  4. Jes Borland, Chris Yates, Julie Koesmarno, and Nghi Nguyen are my other touchstones in my life. They are great at keeping me grounded.
  5. Steve Jones asked me to write for SQL Server Central and I was very honored. He also has the most relaxed manner that I wish I had. I can learn how to be more laid back from him, since I see him at SQL Saturdays and at Summit.
  6. I want to thank you, my readers and those that attend my presentations. Without you, I wouldn’t be having any fun.
Now it’s your turn.

My question to you is, what are your technical goals for 2016? Do you have some achievable goals and some stretch goals? Here are some ideas.

  • Start a blog. Most people start writing a blog for themselves, to remember how to do something in the future.
  • Start speaking. This can be very scary, but there is always someone out there who needs to know what you know. It can be as simple as rewriting a cursor or as complex as setting up replications.
  • Mentoring. Speaking might be too scary, so instead take someone under your wing. Not only can you mentor them in a technical capacity, but you can also mentor them in how to deal with different parts of a team or how to gather requirements.
  • Volunteering: Every organization that is run by volunteers, needs more volunteers. You can help out at your local SQL Saturday or at PASS Summit to name a couple of places. (I volunteered at Summit this year. I directed people to the WIT lunch or to the normal lunch. I had a blast!) You can also volunteer at local community centers that have programming classes for kids. We need to help encourage the next generation.

SQL Source Control and Git Support In the Real World

When I heard that Redgate had improved the Git support in SQL Source Control, I have to confess I got excited about it. I even tweeted about it. There’s a good reason for that too.

I fell in love with SQL Source Control the first time I used it in another company. Before it came along, we scripted out database objects, which is a pain. At one point in one company, we used the Visual Studio version, and that’s painful as well because you don’t get the immediate gratification of knowing what’s changed. You have to actually bring in the changes and then see what your changes are. I just found that very cumbersome.

SQL Source Control makes it a really easy process and defines all of your changes. It’s especially useful if you’re working on multiple projects and one project is sitting there for three weeks and you come back and look at the changes. You might think, for example, oh yeah, I was working on indexing and I should probably get those indexes into production. Using SQL Source Control, you can do it quickly and easily.

The missing link

As much as I love SQL Source Control, a niggle I’ve had has been its limited support for Git. Like many other people, I moved from Subversion to Git because its branching features are so much better, and adopted Atlassian Stash (now called Bitbucket Server) as the repository.

That’s where I had a difficulty because there was a gap between SQL Source Control and the Stash repository on the server. I tried a workaround using command prompts, but it was so painful and cumbersome I turned to Atlassian SourceTree instead. A free Git client, it provides a visual GUI to do the push and the pull between the server, Git, and the local repository.

The tweet

At the beginning of October 2015, everything changed. That was when Redgate announced SQL Source Control users could now push to and pull from remote Git repositories.

And not just from within SQL Source Control – from within SQL Server Management Studio (SSMS).

I was thrilled because this one apparently small move eliminated an entire step in another application. No more closing this, opening that, moving this, quitting that, going back to the application you were happily working with before.

It also made things a lot clearer to team members who were new to using version control by taking the complexity out of the process.

You’re not out of the woods yet, Redgate

I use SQL Source Control every single day and I’ve now got my entire team checking in their changes using Git.

They can commit changes, get the latest changes, and do everything they need to do without complications. As well as saving time, it makes it easier for less technical staff who are new to using repositories. A glance at what it looks like in SSMS demonstrates how simple it is.


So it has made my life easier, but do you know something? There are still some improvements I’d like to see.

What branch am I on?

Top of the list would be to see which branch I’m on. Right now I can see which database I’m on, but not which branch.

How many changes are ready to push or pull?

Near the ‘Push’ and ‘Pull’ buttons, it would be nice to know if there are files to pull or push, in case other stuff has been checked in from other users. A number on the buttons would work as well, to show how many items have been checked in locally and are ready to go up or come down.

The pull button is the more important one, to make sure you’re on the right version. Team members keep forgetting to pull down changes that exist in the repository, so this would be a real bonus.

What about conflict resolution?

Conflict resolution would be another great feature because sometimes it’s difficult. The developers don’t always remember to bring down the latest version, and when they check something in, it causes a conflict. I think I spent an hour fixing that yesterday, and I still have to go back and deal with it on one of my teammates’ machines.

Right now, I use KDiff3 to merge our QA branch and personal branches to fix conflict resolution, but a feature within SQL Source Control would make things a lot simpler.

And finally

Finally, I’m looking forward to when SQL Source Control is incorporated with migration scripts. At that point, everything will become a breeze.

If, like me, you want to see some of these features added then visit the SQL Source Control UserVoice site and vote!

Adventures Into Azure Databases For The First Time (Part 2 of 2)

In my last post, I started a two-part series to learn how to create and migrate a database from SQL Server 2014 to a new Azure Database.

Recap of My Goals

In Part 1 of this series, I completed the following goals:

  • Activated my free credits through my MSDN license.
  • Created my first empty database.
  • Determined how to connect to the database.

In this post (part 2), my goals are:

  • Poke around the Azure GUI to learn what is available with my free credits.
  • Use my Red Gate tools to create a new schema and load it with data.
  • Learn if I can take a backup from SQL Server 2014 and restore it in my Azure account.
4. Poking Around the Portal

When I was poking around, I found some great features. I discovered these screens by clicking on the database name on the Portal screen (figure A), where I found a series of links at the top that you can click on.


 

There is a monitoring screen, which also has some helpful metrics at the bottom, like deadlocks. This is an important screen. You don’t want to max out your resources or you’ll be considered a “noisy neighbor” and Microsoft will do something about it.


I also found the Scale Out page. This page will allow you to increase or decrease the size of your database. This can be very helpful if your database slowly changes size throughout the month. By changing the service level or the DTUs to only what you need, you can save money.

5. Upload A Database

I have an existing Database on my SQL Server 2014 server that I want to migrate to my Azure account. I want to first move the schema, then the data. I have always trusted my Red Gate tools to create reliable, deployable scripts. The two tools I’ll be using are SQL Compare and SQL Data Compare.

SQL Compare will compare my existing SQL Server 2014 database schema with my new Azure database schema, which is currently nonexistent. The reason I like this tool is that it wraps all the individual object scripts in a serializable transaction and deploys them as one script. If the script fails at any point, then the whole script is rolled back. The second tool, SQL Data Compare, will be used to move the data. I don’t recommend doing this on a gigabyte of data, but for my small database it will work great. It is also a good tool for moving all the values of lookup tables, or for resetting development data to a previous state.
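
Here is a minimal sketch of the wrapper pattern a generated deployment script follows (greatly simplified; the real Red Gate script adds error handling around each step):

/* Simplified sketch of a deployment script: everything runs in one serializable transaction,
so a failure rolls the whole deployment back. */
SET XACT_ABORT ON;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;

/* ... individual object scripts (tables, functions, and so on) go here ... */

COMMIT TRANSACTION;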

When you open up SQL Compare, you are presented with a screen to set up the two databases. The left hand screen has the database with the SQL Objects you want to deploy and the right hand side has the database you want to deploy to. After you enter the server and login credentials, SQL Compare connects to the Server to obtain the list of databases.

At this point, I learnt another lesson. When I went to get the list of databases from my Azure Server, I was given an error message. It stated that my IP address was not allowed access to the Azure Server. Back to the Portal I went, and I followed these steps:

  1. On the Portal, I clicked on the server name.
  2. I clicked on Configure in the menu bar under the server name.
  3. I was presented with a very helpful page. It told me which IP address was currently accessing the portal. At this point, I clicked on the link “Add to the allowed IP addresses”
  4. Finally, I clicked save at the bottom.


 

I went back to SQL Compare, and I was able to retrieve the list of databases from my Azure server. Win!


When I clicked the Compare Now button, SQL Compare compared the two databases and provided a list of deltas. The deltas are broken up into groups: objects that exist in both databases but differ, objects that only exist in the database being deployed from, and objects that only exist in the database receiving the deployment. You are able to select only the changes that you want to deploy.


In this database, there are only two tables and one function that are relevant for my Azure database, so I selected them and then I clicked on the Deployment Wizard button at the top. The next screen asked me if I wanted to have SQL Compare perform the deployment or to create a deployment script. I chose to create a deployment script and was then directed to the next screen.


It showed me the script that I could use to do the deployment. When I clicked the “Open Script in Editor” button, a marvelous thing happened. SSMS opened, a tab opened with the script I would be deploying, AND it connected me to my Azure database with the same credentials I used to create the script. It also opened the correct database so that I was not left in the master database.


After I executed the script, I had a database schema in the new Azure Utility database.


 

Now to get some data.

SQL Data Compare works exactly like SQL Compare, but it compares all the data between two identical (or mostly identical) tables. Note: If the tables don’t have primary keys set, then you can select the field that should be treated as the primary key.


I have two tables that I can compare. The ExpandedDate table has 44K+ records and my Tally table has 1 billion records to move. The following steps are just like SQL Compare, and I can choose whether SQL Data Compare deploys the script or creates a script to be manually deployed.

Since this was my first time migrating data from my local database to my Azure database, I learnt a few things. One, a deployment script with over 1 billion inserts does not work. (I’m not surprised.)

Two, it took the SQL Data Compare tool a long time to migrate the data. I’m sure my home internet connection (with teenagers using the bandwidth) didn’t help.

6. Restore a local database backup to my Azure Database

While I waited…and waited for my Utility data to be inserted into my Azure database, I did some poking around to see if it was even possible to restore a local SQL Server 2014 backup to an Azure database. Guess what, I found something (And there was much rejoicing).

On CodePlex, I found a SQL Database Migration Wizard that can be used to restore the database. They even had Migration Wizards for 2008 R2, 2012, and 2014: SQL Database Migration Wizard v3.15.6, v4.15.6, and v5.15.6.

In the post, they also listed under their Reference Materials a “Migration Cookbook” that you can download and use: Migration cookbook now available for the latest Azure SQL Database Update (V12).

Recap

When I first started this two-part post, I had only taken classes on Azure topics, but I had never gone through the steps of using the Azure portal. Now I’ve gone on a journey of completing a list of goals, which started with the creation of an Azure account and ended with loading a schema and data into a new database.

I hope you enjoyed this journey and will take a journey of your own exploring what Azure has to offer.

Adventures Into Azure Databases For The First Time (Part 1 of 2)


Whether we want to accept it or not, cloud computing is here to stay. Microsoft has made a big push in that direction for a few years now. When I first started using Office 365, I wasn’t sure I would like it. Now I love it. I love being able to access Word, Excel, and Outlook from wherever I am.

Now I want to do the same with my databases. To do that, I need to take the plunge and learn how to work with Azure Databases. If I don’t, I’ll fall behind, which is something I don’t want to do.

So let’s get started with my first Azure Database.

Ready. Get Set. Go!

Whenever you start a new project, you should set goals, and my experiment in learning about Azure Databases is no exception.

In Part 1 of this series I have the following goals:

  • Activate my free credits through my MSDN license.
  • Create my first empty database.
  • Determine how to connect to the database.

In Part 2 of this series:

  • Poke around the Azure GUI to learn what is available with my free credits.
  • Use my Red Gate tools to create a new schema and load it with data.
  • Learn if I can take a backup from SQL Server 2014 and restore it in my Azure account.
1.  Setup Azure Account

The first thing you need to do is activate your Azure Account through your MSDN License. This only took minutes. I entered my contact information, then Microsoft sent me a text with an activation number, and I was set.

There were lessons that I learned though.

  • The Azure account uses the same login as Office 365, so I ended up setting up my Azure account for work the first time. Oops.
  • There is a difference between the Free Azure Credits for the Professional MSDN license and the Enterprise MSDN license.
  • The Professional MSDN license provides $100 per month free credit, while the Enterprise MSDN license provides $150 per month free credit.
  • You can pay for more credit, but I didn’t need to.
  • You can find the details here for all MSDN and Visual Studio licenses. Note: These credits are for development and testing only, not for production use.
  • The free credits can’t be used to work with third party products like Oracle. (I’m good with that.)

If you do have multiple Azure Accounts, you can verify which one you are currently logged into in the top right hand corner of the portal. Now that I’m paranoid as to which account I’m in, I’ll be watching that corner for my private email or my work email.

Portal

2. Setup My First Empty Database

Setting up my first database was as easy as setting up my Azure Account.

At the bottom of the portal, there is a New link and a Delete link. These are for creating and deleting databases.

After clicking the New link, I went through a series of screens to create my database.

The first screen asked me for the name of my database and what size database I wanted to create. This is an important step, since it will affect my monthly charges. Remember, I only have $150 in free credits each month. You can go here to see the pricing for the various service tiers and the performance levels. I chose to create the smallest database I could (2 GB and 5 DTUs). I also created this database on a new SQL Database Server (I kind of had to, since it is my first database).


The second screen I was shown was for the login credentials of my Azure Server. I created a login name and a strong password, and chose where I wanted my server to be hosted. Microsoft has locations all over the world that house the Azure servers. While I can’t pick a particular location, I can pick a particular region.


Now I have an Azure Server and Database. By clicking the Database Link or the Server link, I can see my Databases and Servers.


Figure A

There is another portal that you can look at as well. You can find it by clicking on your login name and selecting Azure Preview Portal. (This is in preview at the time of writing this article.)

There are a couple of different ways you can look at your servers and databases. One is a tile layout and the other is a traditional list layout.


Figure B


Lessons learned

  • I don’t get to pick the name of my server, but I’m ok with that. It will provide better security.
  • If I’m moving a database to Azure, I should probably provide the same name as my current database. This means I get to create two new databases, AWMonkey and Utility, which will be the two databases I’ll be working with. I’ll also delete the AzureMonkey database since I don’t need it.
3. Find the Azure Server and ODBC strings I’ll need to access the databases from my applications on my laptop

This was very easy. I remember the 90s, when I would bang my head on the table trying to figure out how to put together an ODBC string the first few times. (This was the stone age, when Google didn’t exist. I know, horrifying.)

  • If you are on the current portal page (Figure A above), then you double click on the database name. Towards the bottom you’ll see the string that will be used to connect to your Azure Server.
  • By clicking on the link for the connection strings, you’ll be given four different connection strings. You just need to provide your user account credentials.


If you are in the preview portal (Figure B above), you can click on the name of the database, which will bring you to a property page that has the string for the Azure server. Then you can click on the link that will provide the connection strings.


Until We Meet Again

Even though I’ve only created an empty database so far, I feel like I’ve accomplished a lot. I remember trying to get SQL Server 7.0 set up. It took two of us half the day and a fifteen-step cheat sheet to get it installed. (Yup. Still no Google.)

Note: Setting up an Azure Account and an empty database took less than 10 minutes. In fact, it took less time than it took me to write this post. See you soon for the next post.

My Eleven Day PASS Summit 15 Experience

“STUEWE!” This is how I knew PASS Summit was about to begin. I was walking to the Friday night SQL Saturday Portland speaker dinner, prior to PASS Summit, when I heard my name being called, well, yelled. Since it was already dark, it took a minute to spot the black SUV with Mike Fal hanging out the window waving at me.

But wait, that is not when my Eleven Day Summit Experience started.

Day One – Thursday

It actually started on Thursday before Summit at LAX. I was there bright and early to pick up two of my Australian friends, Martin Cairney and Ben McNamara, who would be traveling with me to Portland and then on to Seattle. Despite the 14 hour flight, they wanted to go see the Endeavor space shuttle at the California Science Center in LA.


Day Two – Friday

Friday was spent flying to Portland, where we would be attending SQL Saturday Portland. We stayed in an Airbnb house. This is a great way to save some money on lodging and to feel like you’re at home. There is always a coffee machine with decent coffee and, the best part…no rambunctious kids on the floor above you. WIN!

Friday night, we attended a wonderful speaker dinner (This is when Mike Fal yelled my name out a window of an SUV). Speaker dinners are always fun to attend. You get to catch up with friends and make new ones. Portland is always a bit special, because it’s full of speakers from around the world who made a little extra time to speak at a community event before PASS Summit.

After dinner, we found an amazing restaurant. It had the most scrumptious gluten free bread. I think it was made of clouds.


Day Three – Saturday

Saturday was spent at one of my favorite SQL Saturdays. Ok, I’ll be honest. ALL SQL Saturdays are my favorite. It’s a great time to do a little extra networking, get to see SQL Family, and meet new people in the community. I also love speaking and that is what I did in Portland. I spoke on SSRS.


I was able to attend several wonderful sessions. My favorite was given by Julie Koesmarno and Cindy Gross: Moving beyond Unconscious Bias. I really liked their approach to this topic. Throughout the presentation, they strongly emphasized that we are all good people.

They spoke about how everyone categorizes everything in our lives, but sometimes we have a bias in our categorization, an unconscious bias that we are unaware of. They told us about a study by Harvard called the Implicit Association Test, which tests how we categorize various topics. Julie and Cindy then showed us a video of Alan Alda taking the Harvard test. I strongly recommend learning about unconscious bias.

No Portland trip is complete without going to my two favorite places in the evening. We put our names in for our two-hour wait at the Multnomah Whiskey Library, and then we headed over to Cassidy’s. They have great food and AMAZING bacon. I ordered enough bacon for all sixteen of us. (And there was much rejoicing.) Afterwards we went to the Whiskey Library, where I was finally able to try some Scapa Scotch. I really liked it. (Thanks for the recommendation, Grant!)

Day Four – Sunday

One of the cheapest ways to get from Portland to Seattle is the train. When we bought our tickets, they were around $25. Little did we know it would be an unexpectedly amazing day. My friend, Ted Stathakis, was really looking forward to this day too. He loves trains; neither of us realized how amazing it would be.

Last year, there were only five of us on the train. Martin Cairney and I spent the majority of the trip troubleshooting a problem with my VM. Not this time. This time it was completely non-technical. There were sixteen of us in our car, and two who (foolishly) purchased tickets in a nicer car. Why do I say foolishly? Well, we were having so much fun, that they spent the majority of the trip in our car…eating our Voodoo Donuts. Next time, I think they will buy the general tickets. Next time, I’m also going to try getting the WHOLE car too. So if you want to ride with us, keep an eye out for a post from me in the summer.


For more details on the fun we had, go look up the Twitter hashtag #SQLTrain.

Day Five – Monday

Monday was a red day. Red Gate day, that is. Red Gate puts on a free event called SQL in the City. They have put it on each year on the Monday before PASS Summit for several years now. They bring quite a few people from the UK to talk about their tools, and they have amazing presentations. This year, they went with a common theme: Continuous Integration. I participated in the Lightning Talks. I spoke on how to use their DLM Dashboard. This is a great tool that keeps an eye on the databases you connect to it. You can monitor who modified the database and which scripts they ran on the database. Did I mention this is a free tool?


Monday night is the annual Networking dinner put on by Steve Jones and Andy Warren. This is a “must attend” event. Why? It’s all about networking and integrating 1st timers into the community. This year, a bunch of us crammed into a booth with four 1st timers. At least two were from Europe. I enjoyed getting to know them and they enjoyed being introduced to all the people that came by our table to say hi.

Note: Remember, networking isn’t just about talking tech. It’s about building relationships so that you can ask technical questions in the future. If you need ideas on how to meet people, take a look at my blog post, Top Ten Ways To Create Your Meet and Greet List For Summit.

Day Six – Tuesday

Tuesday was a relaxing day before the busyness of Summit. I started the day off with breakfast with Chris Yates. This is a rare pleasure. Chris and I met through the SQL Community on Twitter. We now blog together occasionally and I speak at the SQL Saturday in his hometown.

Before I took time to prep for my Wednesday morning presentation, my apartment mates and I went to the Starbucks where they roast all of their coffee beans. I had my first French press, and, I must say, I’m hooked.

After I was done with my presentation preparation, it was time for the opening event for PASS Summit. This is a whirlwind event. I feel like a butterfly, fluttering from conversation to conversation. This year, I felt like a matchmaker too. I had three people from my company attending with me. I haven’t had that happen since the mid-90s! I enjoyed introducing them to a number of people.

I also did something different. I went looking for the lone 1st timers, the wallflowers who don’t know who to talk to. I found one. He was eating by himself near a wall, watching everyone. I went up to him, introduced myself, and got to know him. I then asked if there was someone he wanted to meet. He wanted to meet Pinal Dave from India. Pinal and I know of each other, but we had never had the pleasure of shaking each other’s hands. So that’s what we did. Luckily, Pinal was standing nearby. Pinal and I finally shook hands, and I introduced him to the 1st timer. That made me very happy. Win!


Day Seven – Wednesday

This was my day <deep breaths inserted here>. I was first up to bat <deep breaths inserted here>. I’m nervous as I walk to my room. I see my room and I get calmer. I climb up on the stage and I get calmer. I get my laptop set up and I’m ready to go. Boom.

As you’ve probably noticed, I really believe in networking. My class filled up with plenty of time to spare before the presentation, so I did the same thing I did last year. I had everyone stand up and introduce themselves to their neighbors. I stood on stage with a huge grin on my face, then I stepped off stage and introduced myself to people in the front row.

Two hundred people came to hear my presentation on Sophisticated Techniques in SSRS. I enjoyed every minute of the session. We did have a couple of exciting moments in class, though. There was a laptop on the table (not mine) that started beeping. I thoroughly enjoyed slamming, I mean nicely shutting, the laptop lid. There was also a crash of dishes behind me in the hallway. I said “Opa!” in my head and kept on presenting.


With my presentation done, I was able to enjoy classes the rest of the conference.

One class I took was not found in a classroom, but on the edge of a planter on the ground floor. I started talking with this guy (sorry I can’t recall his name, but he was awesome). He was telling me how he had submitted his session, but wasn’t selected. After he told me his topic, I told him I would have taken his class. His presentation was right up my alley. So he asked if I would like to see it. I said yes and we had a wonderful one-on-one discussion, complete with demos on his laptop. Right there. On the edge of a planter. It was my favorite class. THAT is one of the many special things about PASS Summit.

One of the things I was really happy about this year was the change in how special diets were handled. In the past, they have always messed up my meals. I have a lot of food sensitivities, so I understand it’s difficult to accommodate my diet, but I still need to eat. This year, they had our names on our meals, and they had the more common special requests handled in the regular food line. (Thank you)

Day Eight – Thursday

I was invited again to live-blog the keynote on Thursday. You can see my comments here. I was very sad to hear that this would be the last keynote for PASS Summit given by David DeWitt and Rimma Nehme. I’m happy that I’ve been able to see two of their keynotes.


Before lunch, I thoroughly enjoyed directing people to lunch. (I’ll tell you a secret: I think I enjoyed twirling my sign a little too much.)

Thursday is the Women In Technology Day (WIT). WIT puts on a great lunch, and I was in the perfect place to direct people to the regular lunch and the WIT lunch. One of the traditions of WIT day is for men and women to wear kilts. This tradition was started by Grant Fritchey many years ago. This was my fourth year participating.


Day Nine – Friday

Friday is full of sessions and bittersweet goodbyes. Now, I don’t want you to think I only had fun this week. I did attend sessions. I have proof. I also took notes.


 

… And I had fun.

 


Summit is not complete without at least one night hanging out with Jason Strate at Bush Garden participating in SQL Karaoke. I was super happy to make it there multiple times this year. Last year, I wasn’t able to go at all due to all the commitments I had at Summit.


Day Ten – Saturday

Saturday was a vacation day. I spent the day with my Australian friends. These are friends that I’ve made through Summit. I even worked with some of them for a few months. In the evening, Julie Koesmarno and I threw a dinner party for a few of our friends who were still in town. We were even able to invite some friends we met this year.


Day Eleven – Sunday

Sunday I had to come back to reality on my flight home. While I had an amazing time in Seattle speaking, learning, networking, and reconnecting, I was happy to see my family and they were happy to see me.


T-SQL Tuesday #72 Summary – Data Modeling Gone Wrong

This month marks the 72nd T-SQL Tuesday. Adam Machanic (b|t) started the T-SQL Tuesday blog party in December of 2009. Each month an invitation is sent out on the first Tuesday of the month, inviting bloggers to participate in a common topic. On the second Tuesday of the month all the bloggers post their contributions to the event for everyone to read. The host sums up all the participants’ entries at the end of the week. This month I’m the host and the topic is …

Data Modeling Gone Wrong

The purpose of SQL Server is to make sure that databases are kept safe and run as optimally as possible. The problem is, if the data model is flawed or not maintained, then no matter how optimally SQL Server is configured, the database won’t be able to function efficiently.

Below you will find the summary of the 19 posts from this blog party.

Summary

Note: This is a heap. The summary is based on the order received… well, except for Ed’s; I inserted him at the top. You’ll see why.

Ed Leighton-Dick
SQL New Blogger Challenge: Week 2 ideas
I found several posts in my comments that simply reiterated my invitation. I didn’t list those below. I am, however, listing this one. Ed is challenging new bloggers, and I love this idea. He not only broke down my invitation for new bloggers, but also gave some ideas on how to find a topic that fits the prompt. Thanks go out to Ed for encouraging the “next generation” of SQL bloggers.

Thomas Rushton – The Lone DBA
T-SQL Tuesday #72 – Data Modeling Gone Wrong
He wrote about generically named Fields and overloaded Fields. This is a very important topic and I’m glad it’s at the top.

Robert Pearl
T-SQL Tuesday No.72 – Is Your Data Model, Normal?
Robert started off his post with, “Well, I’m not a data modeler, nor do I play one in the corporate world.” He then goes on to tell us how important data modeling is. Finally, he goes over the basics of the three normal forms, which he feels everyone should know. (And I wholeheartedly agree with him.)

Chris Yates

T-SQL Tuesday #72 Invitation – Data Modeling Gone Wrong

Chris, one of my dearest friends, wrote a fabulous post on what you should think about and do before creating that first table. He then went on to talk about some good general practices to keep in mind while creating the data model.

Tamera Clark
#TSQL2SDAY – Data Modeling Gone Wrong

I really liked Tamera’s approach to the topic. She goes into the realities of teams not admitting there are problems due to

  • “Reasons”
  • Applications
  • Egos

I’ve been there, seen that, and have the t-shirt.

Andy Galbraith
T-SQL Tuesday #72 – Implicit Conversion Problems

This is one of those topics that people who don’t write SQL often forget about: implicit conversions and how they impact queries. This is a great read.
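
Just to illustrate the idea (this sketch is mine, not from Andy’s post, and the dbo.Customers table is made up): when an NVARCHAR parameter is compared to a VARCHAR column, data type precedence makes SQL Server convert the column, which typically turns an index seek into a scan.

/* Hypothetical table, just for illustration. */
CREATE TABLE dbo.Customers
(
    CustomerID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_Customers PRIMARY KEY CLUSTERED
    ,AccountNumber VARCHAR(20) NOT NULL
);

CREATE NONCLUSTERED INDEX IX_Customers_AccountNumber ON dbo.Customers (AccountNumber);

/* NVARCHAR parameter vs. VARCHAR column: the column side is implicitly converted,
   so the index on AccountNumber typically cannot be used for a clean seek. */
DECLARE @Account NVARCHAR(20) = N'AB-123';

SELECT CustomerID, AccountNumber
FROM dbo.Customers
WHERE AccountNumber = @Account;

/* Matching the parameter's data type to the column avoids the implicit conversion. */
DECLARE @AccountFixed VARCHAR(20) = 'AB-123';

SELECT CustomerID, AccountNumber
FROM dbo.Customers
WHERE AccountNumber = @AccountFixed;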

Tim Peters

That time a beer broke my database – T-SQL Tuesday #72

Tim has a great example from one of his data modeling experiences: finding out, after the data model had been deployed, that another table was needed to hold multiple breweries.

Side note: He has a great website, Beer Intelligence Project, where he has documented and visualized new beers. I think he probably enjoyed the research tremendously too.

Rob Volk
T-SQL Tuesday #72: Data Modeling
As usual, Rob tells a great story. This one is a fictitious story about how a small ice cream store grew over time, told from the point of view of the database. The database started small and grew until paralysis hit… I mean, performance issues hit.

Steve Jones
Poor Data Modeling – T-SQL Tuesday #72
Steve also has a great post about data modeling morphing into a difficult model. I’ve worked with the same kind of data model he describes here: a hierarchical data model that can get out of hand if it is not planned properly. The moral of the story is, when a data model is being designed, be forward thinking.

Mike Fal
#TSQL2SDAY: You’re Gonna Cluster that GUID
Mike attacked a very controversial data type: the GUID <sinister music inserted here>. This is a great topic to add to this collection of posts. Mike makes a great argument on why GUIDs should be indexed, and he also describes a specific use case for it.

Aaron Bertrand

T-SQL Tuesday #72 : Models Gone Wild!

I think Aaron and I were separated at birth. I preach the same list of “database sins”. In Aaron’s post he goes over two categories of “sins”: bad names and incorrectly used data types. I think the only thing I would add to his list would be to not use the default SQL Server constraint names. Give them meaningful names!

Kenneth Fisher
Why are you still using datetime?
Kenneth brings a great question to the table. Why are you still using datetime? He goes into the different date data types that are available and why we should be taking advantage of them.

Anders Pedersen
T-SQL Tuesday #72: Data modeling gone extremely wrong
Anders gives a good example of how an over-normalized database can slow down your application.

Terry McCann
Tally/Number tables in SQL Server #TSQL2sday
Terry took a different approach. He looked at the prompt from the point of view of implementing some best practices instead of identifying bad ones. He wrote about how Tally/Number tables can help with queries that find missing data.
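
Here’s a minimal sketch of the general idea (mine, not Terry’s code). It assumes a pre-built dbo.Numbers tally table with a single INT column named N, and a hypothetical dbo.Orders table with an INT OrderNumber column; the left join exposes the gaps.

/* Assumptions: dbo.Numbers is a tally table with one INT column, N, covering the
   range you care about; dbo.Orders is a hypothetical table with an INT OrderNumber. */
SELECT n.N AS MissingOrderNumber
FROM dbo.Numbers AS n
LEFT JOIN dbo.Orders AS o
    ON o.OrderNumber = n.N
WHERE o.OrderNumber IS NULL
  AND n.N BETWEEN (SELECT MIN(OrderNumber) FROM dbo.Orders)
              AND (SELECT MAX(OrderNumber) FROM dbo.Orders);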

Rob Farley
What’s driving your data model?
Like Chris Yates, Rob wrote about considering the data model’s purpose BEFORE creating it. The difference between the two posts is that Rob took it from the data warehouse data model point of view.

Malathi Mahadevan
TSQL Tuesday #72 – Data Modeling gone wrong
My good friend Mala had a great topic for her post. She talked about the “one table to rule them all” pattern that she encountered at one of her jobs. I really liked one of the sentences in her summary: “The big lesson that came out of it was that the size of the schema or fewer tables do not really mean a simpler/easier to manage design, in fact it can be the exact opposite.”

Sander Stad
Data Modeling Gone Wrong
Sander painted a story of XML woe. I want you to be prepared when you read his excellent post. He has an image of an execution plan that will make your hair stand on end. Make sure you are sitting down.

Jens Vestergaard
#TSQL2SDAY – Data Modeling Gone Wrong
Jens has a horrifying tale of bad field names combined with a lack of foreign keys. I may have bad dreams about this one tonight. He does have a great reaction GIF that summarizes how any of us would feel about being faced with the database he describes.

(Update… I forgot to add my own post. Oops.)

Mickey Stuewe
T-SQL Tuesday #72 – Bad Decisions Made With Surrogate Keys
In my post, I go into when you should and should not use surrogate keys as primary keys. I then give a case showing how surrogate keys can cause duplicate data in a many-to-many table.

Thanks for all the fish!

First, I would like to point out how cool this blog party is. Even though Adam lives in the US, this is not just a US blog party; it’s international. In my list I have posts from The Netherlands, the UK, Denmark, and Australia. I think that is really cool. The one thing that I would like to see is more women joining the blog party. Including myself, there were only three women who participated. While I’m very happy that Tamera and Mala joined the party, I would like to see more in the future.

T-SQL Tuesday #72 – Bad Decisions Made With Surrogate Keys

This is my second time hosting the T-SQL Tuesday blog party. The party was started by Adam Machanic (b|t) in December of 2009.

This month’s invitation topic is on Data Modeling Gone Wrong. Being a Database Developer, I deal with bad database design decisions daily. One of my app-dev teammates loves to tell me that the bad decisions were made because I didn’t work there yet. (That makes me laugh.)

Surrogate Keys vs Natural Keys

The point of a surrogate key is to stand in for a complicated natural key as the primary key of the table. Both the surrogate key and the natural key will yield a unique key for the row. Sometimes that unique natural key is the entire row. When possible, it is better to use the natural key, since it is the true representation of the row. Unfortunately, this is not always practical. Let’s look at some examples.

In the employee table, it would take four fields to make a primary key from the natural key (first name, last name, social security number, and birthdate). Note: this assumes the table is only used in the US and the employees have social security numbers. The reason the birthdate is also needed is that social security numbers can be reused after someone has passed away. For the employee table, it makes sense to have a surrogate key, since it would be cumbersome to use all four fields as foreign keys in other tables.
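
If I were to sketch that table (the column names and sizes below are illustrative only, not an exact schema), it would look something like this, with the surrogate key as the primary key and the four-field natural key still protected by a unique constraint:

CREATE TABLE dbo.Employee
(
    EmployeeID INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Employee PRIMARY KEY CLUSTERED   /* surrogate key */
    ,FirstName NVARCHAR(50) NOT NULL
    ,LastName NVARCHAR(50) NOT NULL
    ,SocialSecurityNumber CHAR(11) NOT NULL
    ,BirthDate DATE NOT NULL
    ,CONSTRAINT UQ_Employee_NaturalKey
        UNIQUE (FirstName, LastName, SocialSecurityNumber, BirthDate)   /* natural key stays unique */
);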

The StateList table (representing each state in the United States) is a good example of using the natural key as the primary key. Each state only uses two characters for its abbreviation, so CHAR(2) can be used as both the natural key and the primary key. This provides the added benefit of not needing to join back to the StateList table to get the two-character state abbreviation… unless additional information about the state is needed. So what is the point of this table? Well, by having it, you are guaranteed referential integrity on the StateCode field through a foreign key back to the StateList table. You don’t have to worry that someone puts ZZ in as a StateCode.
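
Here’s a minimal sketch of that pattern. The StateList definition follows the example above; the dbo.Address table is made up just to show the foreign key:

CREATE TABLE dbo.StateList
(
    StateCode CHAR(2) NOT NULL
        CONSTRAINT PK_StateList PRIMARY KEY CLUSTERED   /* natural key as primary key */
    ,StateName NVARCHAR(50) NOT NULL
);

/* Hypothetical referencing table: the foreign key guarantees no ZZ StateCode,
   and the two-character abbreviation is already stored on the row. */
CREATE TABLE dbo.Address
(
    AddressID INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Address PRIMARY KEY CLUSTERED
    ,StreetLine NVARCHAR(100) NOT NULL
    ,StateCode CHAR(2) NOT NULL
        CONSTRAINT FK_Address_StateList FOREIGN KEY REFERENCES dbo.StateList (StateCode)
);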

Danger, Will Robinson!

One of the problems I’ve seen with careless use of surrogate keys is the duplication of natural keys. Quite often it’s overlooked that the natural key still needs a unique constraint. Without it, the reporting team ends up having to use MAX or DISTINCT to get the latest instance of the natural key, or SSIS packages are needed to clean up the duplicates. This can be compounded with many-to-many tables.

Many-to-many tables allow rows from two tables to be related to each other in multiple combinations. An example can be seen in the car insurance industry. If you have multiple people on the same insurance policy and they are registered to drive multiple cars, then a many-to-many table would be created to capture the data.

If a surrogate key is used on the many-to-many table to provide uniqueness, and the natural key does not have a unique constraint, then duplicate natural key combinations can occur. This can be obscured if there is additional information in the table. Maybe the amount the car is insured for is also maintained in this table. Let’s take Victoria’s insurance as an example. If Victoria is in the table with her 1971 Corvette listed twice, with two different insurance amounts, which one is the current one? The better pattern in this case is to use the natural key.
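
Here is a rough sketch of that better pattern (the Driver, Vehicle, and DriverVehicle tables are made up for illustration). The composite natural key is the primary key, so the same driver and car can only appear together once:

CREATE TABLE dbo.Driver
(
    DriverID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_Driver PRIMARY KEY CLUSTERED
    ,DriverName NVARCHAR(100) NOT NULL
);

CREATE TABLE dbo.Vehicle
(
    VehicleID INT IDENTITY(1,1) NOT NULL CONSTRAINT PK_Vehicle PRIMARY KEY CLUSTERED
    ,VehicleDescription NVARCHAR(100) NOT NULL
);

/* Many-to-many table keyed on the natural key instead of a surrogate key. */
CREATE TABLE dbo.DriverVehicle
(
    DriverID INT NOT NULL
        CONSTRAINT FK_DriverVehicle_Driver FOREIGN KEY REFERENCES dbo.Driver (DriverID)
    ,VehicleID INT NOT NULL
        CONSTRAINT FK_DriverVehicle_Vehicle FOREIGN KEY REFERENCES dbo.Vehicle (VehicleID)
    ,InsuredAmount MONEY NOT NULL
    ,CONSTRAINT PK_DriverVehicle PRIMARY KEY CLUSTERED (DriverID, VehicleID)
);

If a surrogate key is still wanted on the many-to-many table, the same protection can be had by adding a unique constraint on (DriverID, VehicleID) instead.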

Conclusion

Surrogate keys are very useful, but it should not be assumed that they belong on every table. The natural key should always be considered first. If the natural key is too complicated to be used as a foreign key in other tables, then a surrogate key is a good choice. Just remember to ALSO put a unique constraint on the natural key.

Thanks for all the fish

I had several people tell me on Twitter that they were going to write their first blog post for this T-SQL Tuesday blog party. I want to thank them ahead of time for taking the leap into the blogging world to share their experiences and expertise in their fields.
