Archive for Data Modeling

T-SQL Tuesday #74 – Knowing When Data Changes Occur in Standard Edition

Robert L. Davis (b|t) is hosting this month’s T-SQL Tuesday blog party. The party was started by Adam Machanic (b|t) in December of 2009.

This month’s topic is Data Change. This is an important topic for me, since I’m about to create our first Data Warehouse at work. I’ve chosen to champion the companies that are using the Standard Edition of SQL Server.

Crestfallen

I took a class on Change Data Capture this past year at the SQL Saturday in Portland, Oregon. I absolutely loved it. I couldn’t wait to implement it. Then I found out it was an Enterprise Level feature only. I was crestfallen.

When I went back to work and asked if we could consider using Enterprise Edition, I was told that we have too many cores on our Nutanix hardware to make the features we would use worth the cost. You see, SQL Server pricing is based on the number of cores the underlying hardware has, not the number of cores you utilize in your VM. So what is a girl to do? (And guys, I haven’t forgotten you either.)

Since the Change Data Capture feature is not available to us Standard Edition users, let me take you through another pattern that uses auditing columns on each table.

The Basis of The Pattern I Use

I use a pattern that includes four fields on all transactional tables. This (absolutely) includes lookup tables too. The two table types that are an exception to this pattern are audit tables and error tables. I’ll cover why later in this article.

The four fields are CreatedOn, CreatedBy, UpdatedOn, and UpdatedBy. The date fields should be DATETIME2. CreatedOn is the easiest to populate: you can create a default on the field that populates it with GETDATE().

The other fields can be a little more tricky depending on your environment. There are two patterns for populating CreatedBy as well as two patterns for populating UpdatedBy and UpdatedOn. Below you will find the pros and cons of each.

Pattern One For CreatedBy

The first pattern populates CreatedBy with the SYSTEM_USER function. This retrieves the Windows login that the user used to log in to the application.

Pros
  • This pattern allows the use of a default on the CreatedBy field, so the field is not dependent on application code to populate it.
  • All users have a validated account.
  • Accounts that are used by SSIS or services don’t need to have a row in the User table.
Cons
  • The field will be a varchar or an nvarchar.
  • For reporting, the CreatedBy field will need to be parsed to find out which user inserted the record. (One way around this is to remove the domain before inserting the (n)varchar value, as sketched below.)

This is my preferred technique since you don’t need to rely on the User table being immediately populated before a user can start using an application. It also keeps things consistent when the UpdatedBy field is populated through one of the patterns below.
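Here is a minimal sketch of that domain-stripping idea. The OrderDetail table and its columns are made-up examples; the point is that the parsing can live in the default constraint itself, so no application code is needed.

CREATE TABLE dbo.OrderDetail -- hypothetical table, just to show the default
(
OrderDetailId INT IDENTITY PRIMARY KEY
,Quantity INT NOT NULL
,CreatedOn DATETIME2 NOT NULL DEFAULT GETDATE()
,CreatedBy VARCHAR(50) NOT NULL
    DEFAULT (RIGHT(SYSTEM_USER, LEN(SYSTEM_USER) - CHARINDEX('\', SYSTEM_USER)))
    -- CHARINDEX returns 0 when there is no backslash, so non-domain logins are stored unchanged.
)
GO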

 

Pattern Two for CreatedBy

This technique is based on an existing User table and uses the user’s ID as the value of CreatedBy.

Pros
  • This pattern allows the CreatedBy field to join to the User table through an integer. Information about the user can easily be retrieved by joining to the User table, and an integer takes less space than the varchar/nvarchar mentioned above.
  • CreatedBy does not rely on Windows authentication. This is helpful when not all users of the application have a Windows login account.
Cons
  • This pattern requires rows in the User table for SSIS packages or services that insert data automatically.
  • This pattern requires CreatedBy to be populated by the application instead of by a default value. This means working with the AppDev team and getting them to write some additional code for you. (If you need to resort to bribing them, I suggest pizza, donuts, chocolate, or beer.)
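A minimal sketch of what this second pattern looks like. The User and Order tables below are made-up examples; the point is that CreatedBy is an integer with a foreign key back to the User table and is supplied by the application rather than by a default.

CREATE TABLE dbo.[User]
(
UserId INT IDENTITY PRIMARY KEY
,UserName VARCHAR(50) NOT NULL
)
GO

CREATE TABLE dbo.[Order]
(
OrderId INT IDENTITY PRIMARY KEY
,OrderDate DATETIME2 NOT NULL
,CreatedOn DATETIME2 NOT NULL DEFAULT GETDATE()
,CreatedBy INT NOT NULL REFERENCES dbo.[User] (UserId) -- populated by the application, not by a default
)
GO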

 

Now let’s talk about UpdatedOn and UpdatedBy. They are a little trickier to populate since they only get values when a row is updated.

Pattern One For UpdatedBy and UpdatedOn

This pattern requires code to populate the fields since there are no defaults on updated rows. As stated above, you’ll have to work with the AppDev team.

Note: This is not a bad thing. I wholeheartedly believe in having a good relationship with the AppDev team. It makes it easier to work with the team when they have the ability to write SQL. It also has the benefit of getting me included earlier in design sessions. (I personally have been blessed with a great AppDev lead.)

Pros
  • This pattern can be used to populate both the UpdatedBy and UpdatedOn fields using similar code.
  • Either user IDs or Windows accounts can be used to populate UpdatedBy.
Cons
  • The pattern is dependent on the AppDev team to remember to include UpdatedBy and UpdatedOn for every updated row, especially when there are tight deadlines.
  • You need to be vigilant that the fields are always updated, especially with lookup tables.

This is my preferred technique since it doesn’t involve triggers, which we’ll see in pattern two below.
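The code the AppDev team writes doesn’t have to be complicated. Here is a sketch of the kind of stored procedure (or inline statement) the application would call; it targets the OrderStatusType lookup table from the trigger example below, and the procedure and parameter names are made up for illustration.

CREATE PROCEDURE dbo.OrderStatusType_Update -- hypothetical procedure name
 @OrderStatusTypeId INT
,@IsActive BIT
,@UserName VARCHAR(50)
AS
SET NOCOUNT ON

UPDATE dbo.OrderStatusType
SET
IsActive = @IsActive
,UpdatedBy = @UserName -- or a user ID, depending on which CreatedBy pattern was chosen
,UpdatedOn = GETDATE()
WHERE
OrderStatusTypeId = @OrderStatusTypeId
GO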

Pattern Two For UpdatedBy and UpdatedOn

This pattern uses an AFTER UPDATE trigger. The reason I choose an AFTER UPDATE trigger is that it fires only after all the changes have been made to the table(s) being updated.

Note: I do not encourage multiple tables to be updated in a single trigger. (Hmm. This sounds like a good future post.)

Pros
  • This can be used as a stop gap while the AppDev team refactors their code to populate UpdatedBy and UpdatedOn fields as well as CreatedBy.
  • There is no requirement to rely on the application code to populate these fields.
Cons
  • UpdatedBy can’t be populated with the user’s ID from the User table, since the trigger only knows the login that made the change.
  • You might not be allowed to use triggers. (Note: Triggers are not bad when used correctly.)

CREATE TABLE dbo.OrderStatusType
(
OrderStatusTypeId INT IDENTITY PRIMARY KEY
,OrderStatusName VARCHAR(50) NOT NULL
,IsActive BIT NOT NULL DEFAULT 1
,CreatedBy VARCHAR(50) NOT NULL DEFAULT SYSTEM_USER
,CreatedOn DATETIME2 NOT NULL DEFAULT GETDATE()
,UpdatedBy VARCHAR(50)
,UpdatedOn DATETIME2
)
GO

CREATE TRIGGER dbo.Trg_OrderStatusType_u ON dbo.OrderStatusType
AFTER UPDATE
AS
SET NOCOUNT ON

UPDATE ost
SET
UpdatedOn = ISNULL(i.UpdatedOn, GETDATE()) -- This makes sure if a NULL was used, it is overwritten with the current date.
,UpdatedBy = ISNULL(i.UpdatedBy, SUSER_NAME()) -- This makes sure if a NULL was used, it is overwritten with the current user.
FROM
dbo.OrderStatusType AS ost
INNER JOIN inserted AS i ON ost.OrderStatusTypeId = i.OrderStatusTypeId
GO
INSERT INTO OrderStatusType
(OrderStatusName
,IsActive
)
VALUES
('Created',1)
,('Back Ordered', 1)
,('Shipped',1)
,('Completed',1)

GO

SELECT * FROM OrderStatusType AS ost

UPDATE
OrderStatusType
SET
IsActive = 0
WHERE
OrderStatusTypeId = 2

SELECT * FROM OrderStatusType AS ost

UPDATE
OrderStatusType
SET
IsActive = 0
,UpdatedBy = 'George'
,UpdatedOn = '2016-01-11 17:38'
WHERE
OrderStatusTypeId = 4

SELECT * FROM OrderStatusType AS ost

If you are adding this pattern to existing triggers such as “AFTER INSERT, UPDATE, DELETE”, then make sure to check that the trigger fired because of an update. An update puts rows in both the inserted and deleted virtual tables that are available inside triggers, so you can check that both have rows.

CREATE TRIGGER dbo.Trg_OrderStatusType_iud ON dbo.OrderStatusType
AFTER INSERT, UPDATE, DELETE
AS
SET NOCOUNT ON

IF EXISTS (SELECT 1 FROM Inserted) AND EXISTS (SELECT 1 FROM deleted)
BEGIN
UPDATE ost
SET
UpdatedOn = ISNULL(i.UpdatedOn, GETDATE()) -- This makes sure if a NULL was used, it is overwritten with the current date.
FROM
dbo.OrderStatusType AS ost
INNER JOIN inserted AS i ON ost.OrderStatusTypeId = i.OrderStatusTypeId
END
GO

Exceptions For Audit and Error Tables

These two types of tables are usually populated by inserts only. Most of the time the CreatedOn field is the only field that needs to be added. I would change the name to AuditedOn and ErroredOn respectively, since a column named CreatedOn would most likely be taken to mean the creation date of the original record being audited.
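As a quick sketch of what I mean, an error table only needs the one date field, and a default takes care of it. The table and column names here are made-up examples.

CREATE TABLE dbo.OrderImportError -- hypothetical error table
(
OrderImportErrorId INT IDENTITY PRIMARY KEY
,ErrorMessage VARCHAR(2000) NOT NULL
,ErroredOn DATETIME2 NOT NULL DEFAULT GETDATE() -- insert-only, so no Updated* fields are needed
)
GO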

Why are lookup tables not in this list? They can be modified over time. Someone may need a value changed, or a value may need to be discontinued. The row can’t be deleted when it is discontinued, since that would break referential integrity.

In Summary

By using CreatedOn and UpdatedOn, you can find the rows that need to be processed by comparing dates instead of comparing the entire table.
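For example, a nightly data warehouse load can ask for only the rows that were created or changed since the last successful run. This is just a sketch; in practice @LastLoadDate would come from a load-control table rather than being hard coded.

DECLARE @LastLoadDate DATETIME2 = '2016-01-11' -- illustrative value

SELECT
OrderStatusTypeId
,OrderStatusName
,IsActive
FROM
dbo.OrderStatusType
WHERE
CreatedOn > @LastLoadDate
OR UpdatedOn > @LastLoadDate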

One last thing on the UpdatedBy and UpdatedOn fields. For a long time I insisted that these two fields be left NULL during an insert, but I have since changed my mind. By setting UpdatedBy and UpdatedOn to the same values as CreatedBy and CreatedOn on insert, you won’t have to deal with the ISNULL function or fall back to CreatedOn for (some) comparisons.

Thanks For All The Fish

I want to thank Robert L. Davis for hosting this month and I look forward to participating in future T-SQL Tuesday blog parties.

T-SQL Tuesday #72 Summary – Data Modeling Gone Wrong

This month marks the 72nd T-SQL Tuesday.  Adam Machanic (b|t) started the T-SQL Tuesday blog party in December of 2009. Each month an invitation is sent out on the first Tuesday of the month, inviting bloggers to participate in a common topic. On the second Tuesday of the month all the bloggers post their contributions to the event for everyone to read. The host sums up all the participants’ entries at the end of the week. This month I’m the host and the topic is …

Data Modeling Gone Wrong

The purpose of SQL Server is to make sure that the databases are kept safe and run as optimally as possible. The problem is, if the data model is flawed or not maintained, then no matter how optimally SQL Server is configured, the database won’t be able to function efficiently.

Below you will find the summary of the 19 posts from this blog party.

Summary

Note: This is a heap. The summary is based on the order received… Well, except Ed’s; I inserted him at the top. You’ll see why.

Ed Leighton-Dick
SQL New Blogger Challenge: Week 2 ideas
I found several posts in my comments that simply reiterated my invitation. I didn’t list them below. I am, however, listing this one. Ed is challenging new bloggers. I love this idea. He not only broke down my invitation for new bloggers, but gave some ideas on how to find a topic that will fit the prompt. Thanks go out to Ed for encouraging the “next generation” of SQL bloggers.

Thomas Rushton – The Lone DBA
T-SQL Tuesday #72 – Data Modeling Gone Wrong
He wrote about generically named Fields and overloaded Fields. This is a very important topic and I’m glad it’s at the top.

Robert Pearl
T-SQL Tuesday No.72 – Is Your Data Model, Normal?
Robert started off his post with, “Well, I’m not a data modeler, nor do I play one in the corporate world.” He then goes on to tell us how important data modeling is. Finally, he goes over the basics of the 3 forms of normalization, which he feels everyone should know. (And I wholeheartedly agree with him.)

Chris Yates

T-SQL Tuesday #72 Invitation – Data Modeling Gone Wrong

Chris, one of my dearest friends, wrote a fabulous topic on what you should think about and do prior to creating that first table. He then went on to talk about some good general practices to keep in mind while creating the data model.

Tamera Clark
#TSQL2SDAY – Data Modeling Gone Wrong

I really liked Tamera’s approach to the topic. She goes into the realities of teams not admitting there are problems due to

  • “Reasons”
  • Applications
  • Egos

I’ve been there, seen that and have the t-shirt.

Andy Galbraith
T-SQL Tuesday #72 – Implicit Conversion Problems

This is one of those topics that people who don’t write SQL often forget about: implicit conversions and how they impact queries. This is a great read.

Tim Peters

That time a beer broke my database – T-SQL Tuesday #72

Tim has a great example from his own data modeling experience of finding out (after the data model had been deployed) that another table was needed to hold multiple breweries.

Side note: He has a great website, Beer Intelligence Project, where he has documented and visualized new beers. I think he probably enjoyed the research tremendously too.

Rob Volk
T-SQL Tuesday #72: Data Modeling
As usual, Rob tells a great story. This one is a fictitious story about how a small ice cream store grew over time. It’s told from the point of view of the database. The database started as a small database and grew until paralysis hit, I mean performance issues hit.

Steve Jones
Poor Data Modeling – T-SQL Tuesday #72
Steve also has a great topic about a data model morphing into a difficult model. I’ve worked with the same kind of data model that he describes here: a hierarchical data model that can get out of hand if it is not planned properly. The moral of the story is, when a data model is being designed, be forward thinking.

Mike Fal
#TSQL2SDAY: You’re Gonna Cluster that GUID
Mike attacked a very controversial data type: The GUID <sinister music inserted here>. This is a great topic to add to this collection of posts. Mike makes a great argument on why GUIDs should be indexed. He also specifies a specific use case for it.

Aaron Bertrand

T-SQL Tuesday #72 : Models Gone Wild!

I think Aaron and I were separated at birth. I preach the same list of “database sins”. In Aaron’s post he goes over two categories of “sins”, bad names and incorrectly used data types. I think the only thing I would add to his list would be to not use the default SQL Server constraint names. Give them meaningful names!

Kenneth Fisher
Why are you still using datetime?
Kenneth brings a great question to the table. Why are you still using datetime? He goes into the different date data types that are available and why we should be taking advantage of them.

Anders Pedersen
T-SQL Tuesday #72: Data modeling gone extremely wrong
Anders gives a good example of how an over-normalized database can slow down your application.

Terry McCann
Tally/Number tables in SQL Server #TSQL2sday
Terry took a different approach. He looked at the prompt from the point of implementing some best practices instead of identifying bad ones. He wrote about how Tally/Number tables can help with queries to find missing data.

Rob Farley
What’s driving your data model?
Like Chris Yates, Rob wrote about considering the data model’s purpose BEFORE creating it. The difference between the two posts is that Rob took it from the data warehouse data model point of view.

Malathi Mahadevan
TSQL Tuesday #72 – Data Modeling gone wrong
My good friend Mala had a great topic for her post. She talked about the “one table to rule them all” pattern that she encountered at one of her jobs. I really liked one of the sentences in her summary: “The big lesson that came out of it was that the size of the schema or fewer tables do not really mean a simpler/easier to manage design, in fact it can be the exact opposite.”

Sander Stad
Data Modeling Gone Wrong
Sander painted a story of XML woe. I want you to be prepared when you read his excellent post. He has an image of an execution plan that will make your hair stand on end. Make sure you are sitting down.

Jens Vestergaard
#TSQL2SDAY – Data Modeling Gone Wrong
Jens has a horrifying tale of bad field names combined with the lack of foreign keys. I may have bad dreams about this one tonight. He does have a great reaction GIF that summarizes how any of us would feel about being faced with the database he describes.

(Update…I forgot to add my own post. Oops.)

Mickey Stuewe
T-SQL Tuesday #72 – Bad Decisions Made With Surrogate Keys
In my post, I go into when you should and should not use surrogate keys as primary keys. I then give a case on how surrogates can cause duplicate data in a many-to-many table.

Thanks for all the fish!

First, I would like to point out how cool this blog party is. Even though Adam lives in the US, this is not just a US blog party. It’s international. In my list I have posts from The Netherlands, the UK, Denmark, and Australia. I think that is really cool. The one thing that I would like to see is more women joining the blog party. Including myself, there were only three women who participated. While I’m very happy that Tamera and Mala joined the party, I would like to see more in the future.

T-SQL Tuesday #72 – Bad Decisions Made With Surrogate Keys

This is my second time hosting the T-SQL Tuesday blog party. The party was started by Adam Machanic (b|t) in December of 2009.

This month’s invitation topic is on Data Modeling Gone Wrong. Being a Database Developer, I deal with bad database design decisions daily. One of my app-dev teammates loves to tell me that the bad decisions were made because I didn’t work there yet. (That makes me laugh.)

Surrogate Keys vs Natural Keys

The point of a surrogate key is to stand in for a complicated natural key as the primary key of the table. Both the surrogate key and the natural key will yield a unique key for the row. Sometimes that unique natural key is the entire row. When possible, it is better to use the natural key since it is the true representation of the row. Unfortunately, this is not always practical. Let’s look at some examples.


In the employee table it would take four fields to make a primary key from the natural key (first name, last name, social security number, and birthdate). Note: This is assuming this table is only used in the US and the employees have social security numbers. The reason the birthdate is also needed is that social security numbers can be reused after someone has passed away. For the employee table it makes sense to have a surrogate key, since it would be cumbersome to use all four fields as foreign keys in other tables.

The StateList table (representing each state in the United States) is a good example of using the natural key as the primary key. Each state is represented by two characters, so CHAR(2) can be used as both the natural key and the primary key. This provides the added benefit of not needing to join back to the StateList to get the two-character state abbreviation…unless additional information about the state is needed. So what is the point of this table? Well, by having it, you are guaranteed referential integrity on the StateCode field through a foreign key back to the StateList table. You don’t have to worry that someone puts ZZ as a StateCode.
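Here is a minimal sketch of that idea. The StateList table matches the description above; the Address table is just a made-up example of a table that references it.

CREATE TABLE dbo.StateList
(
StateCode CHAR(2) NOT NULL CONSTRAINT PK_StateList PRIMARY KEY -- the natural key is the primary key
,StateName VARCHAR(50) NOT NULL
)
GO

CREATE TABLE dbo.Address -- hypothetical referencing table
(
AddressId INT IDENTITY CONSTRAINT PK_Address PRIMARY KEY
,Street VARCHAR(100) NOT NULL
,City VARCHAR(50) NOT NULL
,StateCode CHAR(2) NOT NULL CONSTRAINT FK_Address_StateList REFERENCES dbo.StateList (StateCode) -- ZZ can never sneak in
)
GO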

Danger, Will Robinson!

One of the problems I’ve seen with careless use of surrogate keys is the duplication of natural keys. Quite often it’s overlooked that the natural key still needs to have a unique constraint. Without it, the reporting team ends up having to use MAX or DISTINCT to get the latest instance of the natural key, or SSIS packages are needed to clean up the duplicates. This can be compounded with many-to-many tables.

Many-to-many tables allow rows in two tables to be related to each other multiple times. An example can be seen in the car insurance industry. If you have multiple people on the same insurance policy and they are registered to drive multiple cars, then a many-to-many table would be created to capture that data.

If a surrogate key is used on the many-to-many table to provide uniqueness, and the natural key does not have a unique constraint, then duplicate natural key combinations can occur. This can be hidden further if there is additional information in the table. Maybe the amount the car is insured for is also maintained in this table. Let’s take Victoria’s insurance as an example. If Victoria is in the table with her 1971 Corvette listed twice, with two different insurance amounts, which one is the current one? The better pattern in this case would be to use the natural key.
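Here is a hedged sketch of the insurance example above, with made-up table and column names. It keeps the surrogate key, but the unique constraint on the natural key (DriverId, CarId) means Victoria’s Corvette can only appear once, so there is never a question about which insured amount is current.

CREATE TABLE dbo.Driver
(
DriverId INT IDENTITY CONSTRAINT PK_Driver PRIMARY KEY
,DriverName VARCHAR(100) NOT NULL
)
GO

CREATE TABLE dbo.Car
(
CarId INT IDENTITY CONSTRAINT PK_Car PRIMARY KEY
,CarDescription VARCHAR(100) NOT NULL
)
GO

CREATE TABLE dbo.DriverCar -- the many-to-many table
(
DriverCarId INT IDENTITY CONSTRAINT PK_DriverCar PRIMARY KEY -- surrogate key
,DriverId INT NOT NULL CONSTRAINT FK_DriverCar_Driver REFERENCES dbo.Driver (DriverId)
,CarId INT NOT NULL CONSTRAINT FK_DriverCar_Car REFERENCES dbo.Car (CarId)
,InsuredAmount MONEY NOT NULL
,CONSTRAINT UQ_DriverCar_DriverId_CarId UNIQUE (DriverId, CarId) -- the natural key can no longer duplicate
)
GO

If nothing else references DriverCarId, you can go further, drop the surrogate entirely, and make (DriverId, CarId) the primary key, which is the natural-key approach described above.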


Conclusion

Surrogate keys are very useful, but it should not be assumed that they should be used for all tables. The natural key should always be considered first. If the natural key is too complicated to be used as foreign keys in other tables, then the surrogate key is a good choice. Just remember to ALSO put a unique constraint on the natural key.

Thanks for all the fish

I had several people tell me on Twitter that they were going to write their first blog post for this T-SQL Tuesday blog party. I want to thank them ahead of time for taking the leap into the blogging world to share their experiences and expertise in their fields.

T-SQL Tuesday #72 Invitation – Data Modeling Gone Wrong

This month marks the 72nd T-SQL Tuesday.  Adam Machanic (b|t) started the T-SQL Tuesday blog party in December of 2009. Each month an invitation is sent out on the first Tuesday of the month, inviting bloggers to participate in a common topic. On the second Tuesday of the month all the bloggers post their contributions to the event for everyone to read. The host sums up all the participants’ entries at the end of the week. This month I’m the host and the topic is …

Data Modeling Gone Wrong

The purpose of SQL Server is to make sure that the databases are kept safe and run as optimally as possible. The problem is, if the data model is flawed or not maintained, then no matter how optimally SQL Server is configured, the database won’t be able to function efficiently.

I would like to invite you to share some data modeling practices that should be avoided, and how to fix them when they do occur.

Rules for T-SQL Tuesday Blog Party

Rule 1: Make sure that you include the T-SQL Tuesday image at the top of the post which will help identify your post as a T-SQL Tuesday blog post.  Please include a link back to this invitation too.

Rule 2: Publish your post sometime next Tuesday using GMT. Here’s a link to a GMT time converter.  For example, in California, that would cover 5 pm Monday to 5 pm (PDT) Tuesday.

Rule 3: Come back here and post a link in the comments so that I can find all the posts for the round up.

Rule 4: Don’t get yourself fired. Make sure that you either generalize your post or get permission to blog about anything from work.

Rule 5: If you roam the Twitterverse, then don’t forget to Tweet about your blog post with the hashtag #tsql2sday.

Rule 6: Go read someone else’s blog on the subject!

Final Rule: Have fun!

T-SQL Tuesday #62 – Primary Key Constraints Are Good For Your Database

Robert Pearl (b|t) is hosting this month’s T-SQL Tuesday blog party. The party was started by Adam Machanic (b|t) in December of 2009.

This month’s invitation is about making sure your database is healthy. For me, that is not just making sure that patches are updated and backups restore properly. It also means the database schema is healthy.

In the beginning

One of the simplest things that we can do when creating a new table is to make sure it has a primary key constraint. A primary key constraint is backed by a unique index, which can be either clustered or non-clustered. The important part is that it is created as a primary key constraint.
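Both flavors look like this; the table and constraint names are made-up examples.

-- Primary key constraint backed by the clustered index (the default).
CREATE TABLE dbo.Customer
(
CustomerId INT IDENTITY NOT NULL
,CustomerName VARCHAR(100) NOT NULL
,CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerId)
)
GO

-- Primary key constraint backed by a non-clustered index, leaving the clustered index free for a different key.
CREATE TABLE dbo.CustomerNote
(
CustomerNoteId INT IDENTITY NOT NULL
,CustomerId INT NOT NULL
,NoteText VARCHAR(2000) NOT NULL
,CONSTRAINT PK_CustomerNote PRIMARY KEY NONCLUSTERED (CustomerNoteId)
)
GO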

Why?

Two strong reasons come to mind on why you want to have primary key constraints on your tables. The first is the accuracy of your data. The purpose of a primary key is to uniquely identify a record (just like your fingerprints). By using primary key constraints, SQL Server will guarantee that the primary keys are unique.

I’ve had application developers argue with me that the application will guarantee that the primary key is unique, so a primary key constraint is not needed. I’ve also had other developers tell me that using GUIDs or identities will guarantee uniqueness. Even if you use these data types or have checks in your application to verify the value is unique, it is still possible to go around the application to insert data, and it is possible for multiple applications to simultaneously provide what they think is a unique value. A primary key constraint will prevent duplication.

This does bring me to the second reason. Say you do have a spectacular way to keep all of your primary keys unique. Without primary key constraints, you can’t implement Transactional Replication. That’s right. Transactional Replication requires primary key constraints.

How can I find my straying tables?

Now that I’ve (hopefully) piqued your curiosity, let’s look at a way to find a list of tables that are missing primary key constraints.

SELECT
     t.name AS TableName
FROM
     sys.tables AS t
     LEFT JOIN sys.indexes AS i ON i.object_id = t.object_id
          AND is_primary_key = 1
WHERE
     is_primary_key IS NULL
ORDER BY
     t.name

Remember it is not a good idea to “just add” new constraints without determining the impact on the current system. Once you have the list of tables that are missing primary key constraints, you can go through the process of determining if you can add them into the environment without side effects. Some side effects can include having to deal with duplicate data, an increase in database size from the new indexes, and (hopefully) better performance.
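When you do decide a table is safe to fix, the work usually boils down to two statements: check for duplicates on the intended key, then add the constraint. Here is a sketch with a made-up table name; the key column must also be NOT NULL before a primary key constraint can be added.

-- 1. Look for duplicate values in the column(s) you plan to use as the primary key.
SELECT
     OrderImportId
    ,COUNT(*) AS RowsFound
FROM
     dbo.OrderImport -- hypothetical table from the list above
GROUP BY
     OrderImportId
HAVING
     COUNT(*) > 1

-- 2. If the query above returns nothing, add the constraint.
ALTER TABLE dbo.OrderImport
ADD CONSTRAINT PK_OrderImport PRIMARY KEY CLUSTERED (OrderImportId)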

Thanks for all the fish

Thanks go out to Robert Pearl for hosting this month’s T-SQL Tuesday blog party.

Data Models, SQL Server, SQLite, and PowerShell

The Fourth Mission

In this installment of SQLCoOp, we are sharing our views on something we’ve learned this month. While I learned quite a bit this month in order to create a new SQLite Database with test data that mapped back to my SQL Server database, I will limit this post to how my first PowerShell script aided me in creating a new SQLite Database from Embarcadero’s ER/Studio Data Architect.


The Project

I’ve been working on a project recently where we have a SQL Server backend on our server and SQLite databases on our client machines. The two databases have quite a few tables in common, but the two database platforms don’t share the same toolsets. Here are the hurdles I had to overcome.

  1. The SQL language that SQLite uses is similar to T-SQL, but not the same. This means scripting tables, constraints, and other SQL objects is slightly different.
  2. I use Embarcadero’s ER/Studio Data Architect (DA) to do all of my data modeling, but there isn’t a driver that will work with DA to compare my database model to a SQLite database so that I can keep them in sync.
  3. I needed to create test data in both databases, but I don’t have an environment where the databases can coexist so that I can write the proper insert scripts.

In this post, I’ll focus on the first two hurdles. The third hurdle was solved using Red Gate’s SQL DataGenerator, Python, and SSIS. I will save this story for another time since it is worthy of a blog post all on its own.

Hurdle 1: Understanding SQLite

The first thing I had to learn was the ins and outs of SQLite. I acquired a great book, Using SQLite, and started reading. I also searched the depths of the internet. What I found was that SQLite likes to store almost everything as integer, numeric, or text, BUT there are data types that we are familiar with, like varchar, datetime, etc. They just get “interpreted” as one of those storage classes depending on the declared data type. This is a good reference for understanding how the more familiar data types are actually stored in the SQLite database.

When I was first learning about SQLite, I mistook the storage classes for actual data types. I thought SQLite was going to use a type of implicit conversion, but SQLite is not the same as SQL Server. There is also a big benefit to using the more familiar data types instead of sticking to the storage classes that SQLite uses. The benefit has to do with ADO.NET and making life easier for your application developer, who has to take data from SQL Server and get it into SQLite.

NOTE: You REALLY need to understand how dates work in SQLite. How the date and time are stored in the underlying database (string, integer, or numeric) determines what “zero” is equivalent to. Read this to find out more.

Hurdle 2: Creating SQL Scripts for Schema Changes From ONE Environment

There are several software packages out in the world that will allow you to get visual data models of SQLite tables, but I already use Embarcadero’s ER/Studio Data Architect (DA) and I didn’t want to learn another one. I also didn’t want to have my SQL Server data models in one software package and my SQLite data models in a different one. So I had to figure out how to get around this hurdle…and I used PowerShell to do it.

The first thing I did was to create my data models. My first SQLite data model was based on my initial research, which implied I should store all my data in integer and text data types since that is what the engine will be using to store the data. This was a poor assumption on my part. Since I had both my SQL Server and my SQLite database models in DA, I was able to easily update the data types in my SQLite database with their SQL Server counterparts. I did have to inspect each field manually since not all attributes were the same between the two databases. For instance, I had to use CURRENT_TIMESTAMP instead of GETDATE() and I had to model auto-incrementing columns differently.

Note: This is a sample database


SQL Server database


First iteration of my SQLite database

NOTE: I was able to use the Compare and Merge Utility by selecting the option to compare against another model.


 


Second iteration of my SQLite database

Making a SQLite Script from ER/Studio Data Architect

Before we go down this rabbit hole, there is another piece of information that you need to know about SQLite. You can’t add foreign key constraints after the table is created. This is very frustrating for me since I know of some existing tables that need foreign key constraints and it will be a bigger task than it would be in SQL Server to get foreign key constraints applied. For this project, it means I need to script the foreign keys as well as the primary keys in the table creation script.

There are two operations that need to be done in order to generate a SQLite script from DA. The first is to use the Generate Database Wizard to generate a whole database instead of using the Compare and Merge Utility to compare the database model with the actual database. This operation is outlined below in steps 1 through 7. The second operation is to use a simple PowerShell script to make the necessary changes to the script so that it will no longer be a T-SQL script made to run in SQL Server, but a SQL script ready to run in SQLite. This is outlined in step 8 below.

Here are the steps you’ll need to do:

Step 1

Make sure that auto-incrementing fields are not also set up as primary keys in the database model. The reason for this is that in SQLite an auto-incrementing field is, by definition, the primary key and has to be declared with the very specific INTEGER PRIMARY KEY AUTOINCREMENT syntax; if the field were also modeled as a primary key, the generated script would end up declaring the primary key twice.

Take a look at the EmployeeAward table in the DM SQLite model above. It is set up as an identity column, but not as a primary key. In order for me to have a visual cue that the field is a primary key, I used an object called a Domain. Domains are templates that can be used throughout the database model so that data types, defaults, and even definitions remain consistent and can easily be changed throughout an entire database model, or across all database models if the Repository feature of DA is being used.

Step 2

Right-click the physical model that you will be working with, and select Generate Database. You will be walked through a wizard. At the end you are going to want to save your settings so that you don’t have to select them again.


Step 3

On Page 1 of the wizard, you can create individual scripts or a single long script. I’ve been doing both so that I can select a single table later on if I need to regenerate it. You will also need to select the folder that you would like the scripts to be created in. I created 3 folders for my project.

  • MultiScripts (This will hold scripts for each object separately)
  • SingleScript (This will hold a single script that contains all the objects.)
  • Standardize (This holds the PowerShell script called StandardizeSQL.ps1, which will be explained in step 8.)
Step 4

I don’t do anything on page 2 of the wizard since this script will not run in SQL Server.

Step 5

Page 3 of the wizard is where the fun starts. Set the following options for the tables you want to include in the script. These options are found on the General Options tab. None of the other tabs need to have their options modified…unless you used features like sparse columns that can’t be used in SQLite. Then you might need to fiddle with some of the other settings.

 


Step 6

Save your settings on page 4 of the wizard so that you can quickly run through the wizard next time.

Step 7

If you are asked to open your newly created script, then say no. We’re not done yet.

Step 8

Run the following PowerShell script. You can do this by right-clicking the file containing the PowerShell script and selecting Run with PowerShell. Note: The ability to run PowerShell scripts needs to be enabled first (for example, by setting an appropriate execution policy with Set-ExecutionPolicy).

The script will go through all the *.sql files in the indicated folder and change the T-SQL specific syntax to the SQLite syntax. Here are the syntactical changes.

  • Int will be changed to integer.
  • Identity columns will be changed to primary key autoincrement columns.
  • The words nonclustered and clustered will be removed from primary and foreign key constraints since all indexes are nonclustered and those keywords are not supported in SQLite.

Before you run the script below, you’ll need to change the path to your scripts in the Get-ChildItem call.

Note: there is a space before the asterisk in the Get-ChildItem path, and there are spaces on either side of “int” in the first -replace so that only the standalone data type is changed, not words that happen to contain “int”.

# To Run this in the Powershell it would be written as ./StandardizeSQL.ps1
# Note: nonclustered needs to come before clustered since clustered is in the word nonclustered.

$wshell = New-Object -ComObject Wscript.Shell -ErrorAction Stop

Try
{
$configFiles=get-childitem -Recurse C:\SQLiteScripts *.sql

foreach ($file in $configFiles)
{
(Get-Content $file.PSPath) |
Foreach-Object {$_ -replace " int ", " integer "} |
Foreach-Object {$_ -replace "IDENTITY\(1,1\)", "PRIMARY KEY AUTOINCREMENT"} |
Foreach-Object {$_ -replace "nonclustered", ""} |
Foreach-Object {$_ -replace "clustered", ""} |
Set-Content $file.PSPath

}

$wshell.Popup("Success! And there was much rejoicing!", 0,"Done.",64)
}
Catch
{

$wshell.Popup($_.Exception.Message,0, "Oops! An error occurred.",48)
Break
}

After you are done running the PowerShell script, you’ll have a SQLite script that you can run in your favorite SQLite querying tool.

Here are the before and after scripts for two of the tables from my example.

CREATE TABLE Employee(
EmployeeID int NOT NULL,
FirstName varchar(50) NOT NULL,
LastName varchar(50) NOT NULL,
isActive bit DEFAULT 0 NOT NULL,
InsertedDTS datetime CONSTRAINT [DF_Employee_InsertedDTS] DEFAULT CURRENT_TIMESTAMP NOT NULL,
ModifiedDTS datetime NULL,
CONSTRAINT PK_Employee PRIMARY KEY CLUSTERED (EmployeeID)
)
;

CREATE TABLE EmployeeAward(
EmployeeAwardID int IDENTITY(1,1),
EmployeeID int NOT NULL,
AwardName varchar(100) NOT NULL,
AwardDate datetime NULL,
CONSTRAINT FK_EmployeeAward_Employee FOREIGN KEY (EmployeeID)
REFERENCES Employee(EmployeeID)
)
;

Before

CREATE TABLE Employee(
EmployeeID integer NOT NULL,
FirstName varchar(50) NOT NULL,
LastName varchar(50) NOT NULL,
isActive bit DEFAULT 0 NOT NULL,
InsertedDTS datetime CONSTRAINT [DF_Employee_InsertedDTS] DEFAULT CURRENT_TIMESTAMP NOT NULL,
ModifiedDTS datetime NULL,
CONSTRAINT PK_Employee PRIMARY KEY (EmployeeID)
)
;

CREATE TABLE EmployeeAward(
EmployeeAwardID integer PRIMARY KEY AUTOINCREMENT,
EmployeeID integer NOT NULL,
AwardName varchar(100) NOT NULL,
AwardDate datetime NULL,
CONSTRAINT FK_EmployeeAward_Employee FOREIGN KEY (EmployeeID)
REFERENCES Employee(EmployeeID)
)
;

After

Don’t Stop Yet


To follow our quest for SQL knowledge through this collaborative project, follow the #SQLCoOp tag on Twitter.

See you next time!!

On a SQL Collaboration Quest

Four SQL professionals gathered from the four corners of the world to share their SQL knowledge with each other and with their readers: Mickey Stuewe from California, USA, Chris Yates from Kentucky, USA, Julie Koesmarno from Canberra, Australia, and Jeffrey Verheul from Rotterdam, The Netherlands. They invite you to join them on their quest as they ask each other questions and seek out the answers in this collaborative blog series. Along the way, they will also include other SQL professionals to join in the collaboration.

Original Post: On a SQL Collaboration Quest

SQL Advent Calendar 2013 – Day 9 – Creating Change Tracking Fields (2 of 3)

First Post in this series: SQL Advent Calendar 2013-Day 1- Placeholders in SQL Prompt Snippets

As I was hanging our family Advent Calendar up, I thought I would make one for you, my readers. I’ll be sharing my scripts from two of my favorite products as well as tips on how to create your own.

Second Step Is Creating The Change Tracking Fields

Today I’ll share with you my macro called Add Change Tracking Attributes. This script will add the attributes InsertedBy, InsertedDTS, ModifiedBy, and ModifiedDTS to the selected entities. These attributes are based on the domains that were created by the macro in yesterday’s post found here. If you add these fields, and then decide you don’t want them, you can use my macro called Remove Attribute From Selected Tables, which I blogged about here to remove them.

'----------------------------------------------------------------------------
' PURPOSE:  	This macro adds Change Tracking Attributes to Selected Entities.
' Author:   	Mickey Stuewe
' Date:     	12/9/2013
'----------------------------------------------------------------------------

Sub Main

	Dim objModel As Model
	Dim objSelObject As SelectedObject
	Dim objEntity As Entity
	Dim objAttribute As AttributeObj
	Dim iCounter As Integer
	Dim dict As Dictionary
	Dim iInsertedDateID As Integer
	Dim iInsertedByID As Integer
	Dim iModifiedByID As Integer
	Dim iModifiedDateID As Integer

	' Get the Local Data Dictionary.
	Set dict = DiagramManager.ActiveDiagram.Dictionary

    ' Determine Domains.
    Set MyDomain = dict.Domains.Item("InsertedDTS")
    iInsertedDateID = MyDomain.ID

    Set MyDomain = dict.Domains.Item("InsertedBy")
    iInsertedByID = MyDomain.ID

    Set MyDomain = dict.Domains.Item("ModifiedDTS")
    iModifiedDateID = MyDomain.ID

    Set MyDomain = dict.Domains.Item("ModifiedBy")
    iModifiedByID = MyDomain.ID

	Set objModel = DiagramManager.ActiveDiagram.ActiveModel

	' Iterate through all the selected objects in the current
	' model.
	For Each objSelObject In objModel.ActiveSubModel.SelectedObjects

		'Get the object type - we are only concerned
		'with entities.
		If objSelObject.Type = 1 Then

			' Get the actual entity object with this ID.
			' The model contains the collection of all the
			' entities.
			Set objEntity =  objModel.Entities.Item(objSelObject.ID)

			'Create Tracking fields and set DomainIDs
			Set objAttribute = objEntity.Attributes.Add("InsertedBy", False)
			 SetDefaultDomain objAttribute, iInsertedByID,objEntity.EntityName

			Set objAttribute = objEntity.Attributes.Add("InsertedDTS", False)
			SetDefaultDomain objAttribute, iInsertedDateID,objEntity.EntityName

			Set objAttribute = objEntity.Attributes.Add("ModifiedBy", False)
			SetDefaultDomain objAttribute, iModifiedByID,objEntity.EntityName

			Set objAttribute = objEntity.Attributes.Add("ModifiedDTS", False)
			SetDefaultDomain objAttribute, iModifiedDateID,objEntity.EntityName

			iCounter = iCounter + 1
		End If
	Next objSelObject

	MsgBox "Four new Attributes were added to " & iCounter & " Table(s).", vbOkOnly + vbInformation, "Attributes Added To Tables"

End Sub

Sub SetDefaultDomain(objAttribute As AttributeObj, iDomainID As Integer, strEntityName As String)

	objAttribute.DomainId = iDomainID

	If Len(objAttribute.DeclaredDefault) > 0 Then

		'if the attribute default is bound, then unbind it so that it can be named.
		If objAttribute.DomainId > 0 Then
			'You have to override the bound default value and the default text (name).
			objAttribute.EnableOverride (1,True)
			objAttribute.EnableOverride (2,True)

		End If

		If Len(objAttribute.DefaultName) = 0 Then
			objAttribute.DefaultName = "DF_" & strEntityName & "_" & objAttribute.AttributeName

		ElseIf objAttribute.DefaultName <> "DF_" & strEntityName & "_" & objAttribute.AttributeName Then
			If MsgBox("The name of the Default Constraint does not match our naming standards. Would you like to replace it?" & vbCrLf & _
				vbCrLf & vbCrLf & "Attribute: " & objAttribute.AttributeName & " - Default Name: " & objAttribute.DefaultName & " - Value: " & _
				objAttribute.DeclaredDefault & vbCrLf & " Standardized Name: " & "DF_" & strEntityName & "_" & objAttribute.AttributeName, _
				vbYesNo + vbQuestion, "Standardize Name?") = vbYes Then
					objAttribute.DefaultName = "DF_" & strEntityName & "_" & objAttribute.AttributeName

			End If
		End If
	End If
End Sub

Take a look at the two EnableOverride calls in the SetDefaultDomain sub. These two lines are needed to be able to give the default a name, and here is why: This macro creates attributes based off of domains. The Inserted attributes have defaults. If we don’t name the defaults, then SQL Server will provide names for them…and they are not pretty. In order for us to be able to override the domain, we have to call EnableOverride for two properties: the bound default value and the default text (name). Then we’ll be able to provide a name to the default. This step can’t be done in the domain, because defaults are a type of constraint and all constraints need to have unique names.

Tomorrow I’ll show you a Red Gate Snippet used to create the triggers for the tables with these fields.

Previous Post In Series: SQL Advent Calendar 2013-Day 8-Creating Change Tracking Domains (1 of 3)

SQL Advent Calendar 2013 – Day 8 – Creating Change Tracking Domains (1 of 3)

First Post in this series: SQL Advent Calendar 2013-Day 1- Placeholders in SQL Prompt Snippets

As I was hanging our family Advent Calendar up, I thought I would make one for you, my readers. I’ll be sharing my scripts from two of my favorite products as well as tips on how to create your own.

First Step Is The Foundation

Whenever I need to have minimal auditing on a table, I add four fields to the table: InsertedBy, InsertedDTS, ModifiedBy, and ModifiedDTS. The Inserted fields tell me who created the row and when. These two fields are populated by the defaults set on the columns. The Modified fields tell me who last changed the row and when, and they are updated by a trigger.

Over the next couple of days I will show you two ER/Studio Data Architect (ER Studio DA) macros and one SQL Prompt Script that will help you create these four fields and the needed triggers.

The first step is making sure that your data model has four Domains. They reside in the Data Dictionary on the Data Dictionary tab. Domains are templates that fields can be created against.

There are two types of Data Dictionaries: local Data Dictionaries and Enterprise Data Dictionaries. A data model can only have one local Data Dictionary, and it always exists. If you are using ER Studio DA’s Repository feature, then you can create an Enterprise Data Dictionary to use across all data models. A data model can have multiple Enterprise Data Dictionaries. This is a great feature and I have used it in the past, but it isn’t available if you are not using the Repository.

 


 

I’ve created a macro called Create Change Tracking Domains to use the local Data Dictionary.

The macro does the following: If a folder called Change Tracking doesn’t exist in the Data Dictionary, then it is created. If the four change tracking domains don’t exist, then they will be created. Either way, all the properties of these four domains will be set or reset when this macro is run.

Note: You can modify it to use an Enterprise Data Dictionary, or you can ask me and I’ll see if I have time to add that to the macro for you.

Below, in the code, you can see the properties that are set for each domain (data type, name, definition, nullability, and default).

'----------------------------------------------------------------------------
' PURPOSE:  	This macro adds four domains to aid in creating Change Tracking
'				fields.
' Author:   	Mickey Stuewe
' Date:     	12/9/2013
'----------------------------------------------------------------------------

Sub Main

	Dim dict As Dictionary

	' Get the Local Data Dictionary.
	Set dict = DiagramManager.ActiveDiagram.Dictionary

	'If the Change Tracking folder does not exist, then add it.
	If (dict.DomainFolders.Item("Change Tracking") Is Nothing) Then
		dict.DomainFolders.Add("Change Tracking","Domains")
	End If

    ' if Domain doesn't exist, then create it.
	If (dict.Domains.Item("InsertedDTS") Is Nothing) Then
		dict.Domains.Add("InsertedDTS", "InsertedDTS")
	End If

	With dict.Domains.Item("InsertedDTS")
		.AttributeName = "InsertedDTS"
		.Datatype="DateTime"
		.DeclaredDefault = "GETDATE()"
		.Definition = "This is the date the row that was inserted."
		.DomainFolder = "Change Tracking"
		.Nullable = False
	End With

    ' if Domain doesn't exist, then create it.
	If (dict.Domains.Item("InsertedBy") Is Nothing) Then
		dict.Domains.Add("InsertedBy", "InsertedBy")
	End If

	With dict.Domains.Item("InsertedBy")
		.AttributeName = "InsertedBy"
		.Datatype = "nvarchar"
		.DataLength = 255
		.DeclaredDefault = "SUSER_NAME()"
		.Definition = "This is the system user who created the row."
		.DomainFolder = "Change Tracking"
		.Nullable = False
	End With

	 ' if Domain doesn't exist, then create it.
	If (dict.Domains.Item("ModifiedDTS") Is Nothing) Then
		dict.Domains.Add("ModifiedDTS", "ModifiedDTS")
	End If

	With dict.Domains.Item("ModifiedDTS")
		.AttributeName = "ModifiedDTS"
		.Datatype="DateTime"
		.Definition = "This is the date the row was modified. It is updated via a trigger."
		.DomainFolder = "Change Tracking"
		.Nullable = True
	End With

 	' if Domain doesn't exist, then create it.
	If (dict.Domains.Item("ModifiedBy") Is Nothing) Then
		dict.Domains.Add("ModifiedBy", "ModifiedBy")
	End If

	With dict.Domains.Item("ModifiedBy")
		.AttributeName = "ModifiedBy"
		.Datatype = "nvarchar"
		.DataLength = 255
		.Definition = "This is the system user who modified the row. It is updated via a trigger."
		.DomainFolder = "Change Tracking"
		.Nullable = True
	End With

	Set dict = Nothing
End Sub

Tomorrow I’ll share the macro that will add these four fields to selected tables.

Previous Post In Series: SQL Advent Calendar – Day 7 – Verify All Tables and Fields Have a Definition

Next Post in Series: SQL Advent Calendar 2013 – Day 9 – Creating Change Tracking Fields (2 of 3)

SQL Advent Calendar – Day 7 – Verify All Tables and Fields Have a Definition

First Post in this series: SQL Advent Calendar 2013-Day 1- Placeholders in SQL Prompt Snippets

As I was hanging our family Advent Calendar up, I thought I would make one for you, my readers. I’ll be sharing my scripts from two of my favorite products as well as tips on how to create your own.

Oops, I missed one.

I try to remember to add definitions as I create my tables so that I don’t forget later, but sometimes I miss one…especially definitions for attributes. Here is a great macro for ER/Studio Data Architect to find all the Entities and Attributes that are missing definitions. It even has a feature to colorize the tables that are missing definitions. I call it Verify Definitions For Selected Objects.

Note: Right now it only checks definitions for Entities and Attributes. I am happy to expand it if you need the other objects covered.
Option Explicit
'----------------------------------------------------------------------------
' PURPOSE:  	Creates a list of Entities and Attributes that are missing
'				definitions.
' Author:   	Mickey Stuewe
' Date:     	9/7/2013
'----------------------------------------------------------------------------
Sub Main
	'Defines the dialog box
	Begin Dialog UserDialog 680,322,"Verify Definitions for All Objects", .MyDialogAction
		Text 20,7,360,14,"Entities and Attributes that need Definitions",.Text1
		TextBox 20,28,500,287,.txtList,1
		PushButton 540,28,130,21,"Check",.CheckForDefinitions
		OKButton 540,49,130,21
		CheckBox 540,84,130,14,"Colorize Entities",.chkColor
	End Dialog

	Dim dlg As UserDialog

	'Calls the dialogbox
	Dialog dlg, -2
End Sub

Sub CheckForDefinitions()
    Dim objDiagram As Diagram
	Dim objModel As Model
	Dim objSubModel As SubModel
	Dim objSelObject As SelectedObject
	Dim objEntity As Entity
	Dim objAttr As AttributeObj

	Set objDiagram = DiagramManager.ActiveDiagram
	Set objModel = objDiagram.ActiveModel
	Set objSubModel = objModel.ActiveSubModel

	'Clears list
	DlgText("txtList","")

	' Iterate through all the selected objects in the current
	' model.
	For Each objSelObject In objSubModel.SelectedObjects
		Select Case objSelObject.Type
			Case 1 'Entity
				If objModel.Entities(objSelObject.ID).Definition = "" Then
					Call AddToList("Entity", objModel.Entities(objSelObject.ID).Owner & "." & objModel.Entities(objSelObject.ID).EntityName)
					If DlgValue("chkColor") = 1 Then
						objSubModel.EntityDisplays.Item(objModel.Entities(objSelObject.ID).TableName).BackgroundColor = RGB(243,222,248)
					End If

				End If
				For Each objAttr In objModel.Entities(objSelObject.ID).Attributes
					If objAttr.Definition = "" Then
						Call AddToList("Attrib", objModel.Entities(objSelObject.ID).Owner & "." & objModel.Entities(objSelObject.ID).EntityName & "." & objAttr.AttributeName)
						If DlgValue("chkColor") = 1 Then
						objSubModel.EntityDisplays.Item(objModel.Entities(objSelObject.ID).TableName).BackgroundColor = RGB(243,222,248)
						End If
					End If
				Next

			Case 16 'View
				MsgBox "It's time to write code for looking at Definitions for Views."
			Case 54 'Function
				MsgBox "It's time to write code for looking at Definitions for Functions."
		End Select
	Next objSelObject

End Sub

Sub AddToList(strObject As String, strObjectName As String)
	'Adds Entity or Attribute to list.
	DlgText("txtList",DlgText("txtList") & strObject & ": " & strObjectName & vbCrLf )

End Sub

'This is called for all actions in the dialog box. I will be trapping for the button being clicked.
Private Function MyDialogAction(DlgItem As String, Action As Integer, SuppValue&) As Boolean

	If Action = 2 Then ' Value changing or button pressed
		Rem MyDialogFunc = True ' Prevent button press from closing the dialog box

		If DlgItem = "CheckForDefinitions" Then
			CheckForDefinitions
			MyDialogAction = True
		Else
			MyDialogAction = False
		End If
	End If
End Function

 


 

Previous Post In Series: SQL Advent Calendar – Day 6 – What’s In All These Indexes?

Next Post In Series: SQL Advent Calendar 2013 – Day 8 – Creating Change Tracking Domains (1 of 3)

SQL Advent Calendar 2013 – Day 3 – Organize Before All The New Toys Come In

First Post in this series: SQL Advent Calendar 2013-Day 1- Placeholders in SQL Prompt Snippets

As I was hanging our family Advent Calendar up, I thought I would make one for you, my readers. I’ll be sharing my scripts from two of my favorite products as well as tips on how to create your own.

Installing A New Macro


When I first started working with the provided macros in ER Studio Data Architect by Embarcadero (ER Studio DA), I would keep my modified copies in the same folders as the macros that came with the product. I soon realized that it would be better to create my own folders. This allowed me to leave the provided library intact and it allowed me to put my macros in source control (where all code should be kept).

The default location of the macros for version 9.6 is c:\ProgramData\Embarcadero\ERStudioDA_9.6\Macros. You can create your own folders in this location either through the application or through Explorer. There is a “refresh” menu item in the shortcut menu that you can use to refresh the list of macros. I created sub-folders similar to the ones that were under Sample Macros as well. I stored the Remove Attribute From Selected Tables macro I created under the Modeling Productivity Macros subfolder.

Once you have your new macros in ER Studio DA, you can add them to the shortcut menus for various objects. You can have up to 10 macros in the shortcut list at a time.

To add yesterday’s macro to the shortcut menu for Tables, do the following:

  1. Right-click on any table in any model.
  2. Click on Add/Remove Macro Shortcuts.
  3. Select the macro you added to your library called Remove Attribute From Selected Tables.

Next time you right-click on a table, you’ll have that macro listed under macros. You can have up to 10 macros at any given time. The list will change based on what kind of object you have right-clicked on.

Note: Not all macros can be run in both the Logical and Physical models. I usually have the macro display a warning when it is run in a model it won’t work in, so that I know it’s not going to do anything. I’ll discuss this more in future posts this month.

Previous Post In Series: SQL Advent Calendar 2013 – Day 2 – Macro to Remove Common Fields in ER Studio Data Architect

Next Post In Series: SQL Advent Calendar 2013 – Day 4–Comments Are Like Love Notes to Your Future Self
