Tag Archive for Data Change

T-SQL Tuesday #74 – Knowing When Data Changes Occur in Standard Edition

Robert L. Davis (b|t) is hosting this month’s T-SQL Tuesday blog party. The party was started by Adam Machanic (b|t) in December of 2009.

This month’s T-SQL Tuesday’s topic is on Data Change. This is an important topic for me, since I’m about to create our first Data Warehouse at work. I’ve chosen to champion the companies who are using the Standard Edition of SQL Server.

Crestfallen

I took a class on Change Data Capture this past year at the SQL Saturday in Portland, Oregon. I absolutely loved it. I couldn’t wait to implement it. Then I found out it was an Enterprise Level feature only. I was crestfallen.

I went back to work asking if we could consider using Enterprise SQL Server, I was told that we have too many cores on our Nutanix hardware to make the features we would use worth it. You see, SQL Server pricing is based on the number of cores the underlying hardware has, not the number of cores you utilize in your VM. So what is a girl to do? (And guys, I haven’t forgotten you either.)

Since the Change Data Capture feature is not available to us Standard Edition users, let me take you through another pattern that uses auditing columns on each table.

The Basis of The Pattern I Use

I use a pattern that includes four fields on all transactional tables. This (absolutely) includes lookup tables too. The two table types that are an exception to this pattern are audit tables and error tables. I’ll cover why later in this article.

Four fields include CreatedOn, CreatedBy, UpdatedOn, and UpdatedBy. The dates should be DateTime2. CreatedOn is the easiest to populate. You can create a default on the field to be populated with GetDate().

The other fields can be a little more tricky depending on your environment. There are two patterns for populating CreatedBy as well as two patterns for populating UpdatedBy and UpdatedOn. Below you will find the pros and cons of each.

Pattern One For CreatedBy

The first pattern has the CreatedBy using the SYSTEM_USER function to populate the field. This will retrieve the windows login that the user used to login into the application.

Pros
  • This pattern allows the use of a default on the CreatedBy field. This allows the field to not be dependant on code to populate it.
  • All users have a validated account.
  • Accounts that are used by SSIS or services don’t need to have a row in the User table.
Cons
  • The field will be a varchar or an nvarchar.
  • The CreatedBy field will need to be parsed to find out which user inserted the record for reporting. (One way around this, is to remove the domain before inserting the (n)varchar).

This is my preferred technique since you don’t need to rely on the User table being immediately populated for a user to start using an application. It also makes it easier when the UpdatedBy field is updated through a pattern below.

 

Pattern Two for CreatedBy

This technique is based on an existing User table and uses their ID for the value of CreatedBy.

Pros
  • This pattern allows the CreatedBy field to join to the User table through an integer. Information about the user can be determined easily by joining to the User table but takes less space than the varchar/nvarchar mentioned above.
  • CreatedBy is not relying on windows authentication. This is helpful when not all users of the application have a windows login account.
Cons
  • This pattern requires a row for various tasks in the User table when SSIS or services insert data automatically.
  • This pattern requires CreatedBy to be populated by the application instead of by a default value. This requires working with the AppDev team and getting the AppDev team to write some additional code for you. (If you need to resort to bribing them, I suggest pizza, donuts, chocolate, or beer.)

 

Now let’s talk about UpdatedOn and UpdatedBy. They are a little more tricky to populate since they are populated on an update.

Pattern One For UpdatedBy and UpdatedOn

This pattern requires code to populate the fields since there are no defaults on updated rows. As stated above, you’ll have to work with the AppDev team.

Note: This is not a bad thing. I whole heartedly believe in having a good relationship with the AppDev team. It makes it easier to work with the team when they have the ability to write SQL. It also has the benefit of being included earlier in design sessions. (I personally have been blessed with a great AppDev lead.)

Pros
  • This pattern can be used to populate the UpdatedBy field and even the UpdatedOn fields using similar code.
  • Either User IDs or Windows account can be used to populate UpdatedBy.
Cons
  • The pattern is dependent on the AppDev team to remember to include UpdatedBy and UpdatedOn for every updated row, especially when there are tight deadlines.
  • You need to be vigilant that the fields are always updated, especially with lookup tables.

This is my preferred technique since it doesn’t involve triggers, which we’ll see in pattern two below.

Pattern Two For UpdatedBy and UpdatedOn

This pattern uses the After Update trigger. The reason I choose the After Update trigger, is that it waits until all the changes are made to the table(s) being updated by other triggers.

Note: I do not encourage multiple tables to be updated in a single trigger. (Hmm. This sounds like a good future post.)

Pros
  • This can be used as a stop gap while the AppDev team refactors their code to populate UpdatedBy and UpdatedOn fields as well as CreatedBy.
  • There is no requirement to rely on the application code to populate these fields.
Cons
  • The UpdatedBy can’t be updated with the User’s Id when populating the field.
  • You might not be allowed to use triggers. (Note: Triggers are not bad when used correctly.)

CREATE TABLE dbo.OrderStatusType
(
OrderStatusTypeId INT IDENTITY PRIMARY KEY
,OrderStatusName VARCHAR(50) NOT NULL
,IsActive BIT NOT NULL DEFAULT 1
,CreatedBy VARCHAR(50) NOT NULL DEFAULT SYSTEM_USER
,CreatedOn DATETIME2 NOT NULL DEFAULT GETDATE()
,UpdatedBy VARCHAR(50)
,UpdatedOn DATETIME2
)
GO

CREATE TRIGGER dbo.Trg_OrderStatusType_u ON dbo.OrderStatusType
AFTER UPDATE
AS
SET NOCOUNT ON

UPDATE dbo.OrderStatusType
SET
UpdatedOn = ISNULL(i.UpdatedOn, GETDATE()) -- This makes sure if a NULL was used, it is overwritten with the current Date.
,UpdatedBy = ISNULL(i.UpdatedBy, SUSER_NAME()) -- This makes sure if a NULL was used, it is overwritten with the current user.
FROM
dbo.OrderStatusType AS ost
INNER Join inserted AS i ON ost.OrderStatusTypeID = i.OrderStatusTypeID
GO
INSERT INTO OrderStatusType
(OrderStatusName
,IsActive
)
VALUES
('Created',1)
,('Back Orddered', 1)
,('Shipped',1)
,('Completed',1)

GO

SELECT * FROM OrderStatusType AS ost

UPDATE
OrderStatusType
SET
IsActive = 0
WHERE
OrderStatusTypeId = 2

SELECT * FROM OrderStatusType AS ost

UPDATE
OrderStatusType
SET
IsActive = 0
,UpdatedBy = 'George'
,UpdatedOn = '2016-01-11 17:38'
WHERE
OrderStatusTypeId = 4

SELECT * FROM OrderStatusType AS ost

If you are adding this pattern to existing triggers such as “AFTER INSERT, UPDATE, DELETE”, then make sure to check that the execution of this trigger was based on an update. This can be done by checking if there are any records in the Inserted and deleted tables that are used in triggers.

CREATE TRIGGER dbo.Trg_OrderStatusType_iud ON dbo.OrderStatusType
AFTER INSERT, UPDATE, DELETE
AS
SET NOCOUNT ON

IF EXISTS (SELECT 1 FROM Inserted) AND EXISTS (SELECT 1 FROM deleted)
BEGIN
UPDATE dbo.OrderStatusType
SET
UpdatedOn = ISNULL(i.UpdatedOn, GETDATE()) -- This makes sure if a NULL was used, it is overwritten with the current Date.
FROM
dbo.OrderStatusType AS ost
INNER Join inserted AS i ON ost.OrderStatusTypeID = i.OrderStatusTypeID
END
GO

Exceptions For Audit and Error Tables

These two types of tables are usually populated by inserts only. Most of the time the CreatedOn field is the only field that needs to be added. I would change the name to AuditedOn and ErroredOn respectively since the CreatedOn will most likely represent the creation of the original record.

Why are Lookup tables not in this list? They can be modified over time. Someone may need a value changed or the value needs to be discontinued. The row can’t be deleted when it is discontinued since that would break referential integrity.

In Summary

By using CreatedOn and UpdatedOn you can compare dates when data needs to be processed without having to compare the entire table.

One last thing on the UpdatedBy and UpdatedOn fields. For a long time I insisted that these two fields are NULL during an insert, but I have since changed my mind. By setting the UpdatedBy and UpdatedOn to the same values as CreatedBy and CreatedOn for an insert, you won’t have to deal with the ISNULL function or using the CreatedOn for (some) comparisons.

Thanks For All The Fish

I want to thank Robert L. Davis for hosting this month and I look forward to participating in future T-SQL Tuesday blog parties.

%d bloggers like this: