I am watching the Houston Texans play Oakland while writing this post. Case Keenum, JJ Watt and the rest are all doing their best and Coach Kubiak is up here checking on the boys but l have to admit this season really sucks for the Texans. They have been beaten straight 8 or 9 times and its looks like they are automatically loosing any match up going forward. Well, football aside, let’s go over the sql server output clause this post is dedicated and also learning a little bit about Smart Meter.

If you are in the energy industry and particular in the Texas energy market, you will here about smart meters which is an emerging technology in the industry which allows customer’s energy use every 15 minute to be read throughout the day. This allows better understanding and analysis of the customers energy use for targeted promotions or campaigns. The interval files looks ugly to many developers because they don’t come similar like other files we used to have with column headers and rows specifying records for each item. To read more about smart meter go here https://www.smartmetertexas.com/CAP/public/home/home_faq.html.

To read details on the smart meter files, visit the ercot website at  http://www.ercot.com/services/mdt/userguides/index and click on the interval data LSE file specification document to read.  Sampe file format from that document is copied below

00000001,100000000000000,4,20080510000000,20080510235900,Y,N

00000002,0,0,0,,0,,900,01,1,-1,0.0,0.0,CST

00000003,UNIQUETRANID0123456789

00000004,20080519112825,M

00000030,ATTRIBUTE_VALUE_PAIRS,MRE=666666666,Sender=666666666,Receiver=183529049,REP=111111111

10000000,68.29,A,,69.17,A,,67.99,A,,67.99,A,,

10000001,66.81,E,,66.52,E,,67.11,E,,67.11,E,,

10000002,67.11,A,,66.52,A,,66.81,A,,66.82,A,,

10000003,69.47,A,,107.73,A,,230.76,A,,228.11,A,,

10000004,69.47,A,,107.73,A,,230.76,A,,228.11,A,,

10000005,231.94,A,,228.4,A,,230.17,A,,233.7,A,,

10000006,25.18,A,,21.95,A,,246.36,A,,247.83,A,,

10000007,259.61,A,,283.45,A,,303.16,A,,313.76,A,,

10000008,325.83,A,,336.72,A,,350.26,A,,354.38,A,,

10000009,358.5,A,,30.27,A,,361.74,A,,366.45,A,,

10000010,365.86,A,,369.39,A,,369.69,A,,368.21,A,,

10000011,371.45,A,,39.69,A,,367.63,A,,369.39,A,,

10000012,371.15,A,,369.68,A,,368.51,A,,369.98,A,,

10000013,370.57,A,,373.51,A,,373.22,A,,369.39,A,,

10000014,370.86,A,,369.98,A,,368.81,A,,365.57,A,,

10000015,364.39,A,,34.68,A,,36.8,A,,284.03,A,,

10000016,309.64,A,,308.76,A,,302.88,A,,296.1,A,,

10000017,286.68,A,,279.91,A,,24.29,A,,189.55,A,,

10000018,190.43,A,,161.59,A,,10.68,A,,114.79,A,,

10000019,108.9,A,,107.14,A,,101.55,A,,98.61,A,,

10000020,94.78,A,,4.49,A,,95.66,A,,92.42,A,,

10000021,84.77,A,,80.94,A,,80.94,A,,80.35,A,,

10000022,80.35,A,,80.35,A,,7.29,A,,77.41,A,,

10000023,77.71,A,,76.82,A,,77.41,A,,7.82,A,,

 

This is the file format and it sure looks ugly so when given to most developers, they get scared how to handle it. In my experience, l wrote custom c# code to read this into a staging table and then extract records into separate tables based on business logic. The file has information at the top about the ESIID of the meter, the read date, read start and read end values, some other information  then the 24 hour readings split into 15 minutes below for that meter. Most often than not, the header information will go into a separate table and the usage goes into another table. Just fyi, this is huge data so its not easy to manage without careful analysis.  I am not going to go deep into the analysis part but l just want to talk about using SQL server output clause using this type of file as illustration. So what is the output clause? Its just a way to get information on rows affected when there is insert, delete, update or merge in sql server. To read more go here http://technet.microsoft.com/en-us/library/ms177564.aspx. I want to talk about the insert part and how to use it for making life easy when handling files like the one above.

Assume l have two tables with the following attributes.

Header(HeaderID, ESIID, ReadStartDate, ReadEndDate, ReadStart, ReadEnd)

Usage(HeaderID,hour, usage1,usage2,usage3,usage4)

So for example, records in the tables will look like this

Header table:

HeaderID        ESIID                    ReadStartDate         ReadEndDate            ReadStart               ReadEnd

I                      10002020             201311170000      201311172359          0                                 10000

 

Usage table:

HeaderID       hour                usage1             usage2     usage3              usage4

1                         1                            1                      1                1                            10

1                         2                          10                       11            10                         10

1                         3                            10                       1             10                         10

1                         4                            10                       10             10                         10

1                         5                            1                       10             11                        10

1                         6                            10                       1             10                         10

1                         7                            11                      10             10                         10

1                         8                           10                       10             10                         10

1                         9                            10                       10             10                         10

1                         10                            10                       10             10                         10

1                         11                           10                       10             10                         10

1                         12                           10                       10             10                         10

1                         13                           10                       10             10                         10

1                         14                           10                       10             10                         10

1                         15                           10                       10             10                         10

1                         16                           10                       10             10                         10

1                         17                           10                       10             10                         10

1                         18                          10                       10             10                         10

1                         19                           10                       10             10                         10

1                         20                           10                       10             10                         10

1                         21                           10                       10             10                         10

1                         22                           10                       10             10                         10

1                         23                           10                       10             10                         10

1                         24                           10                       10             10                         10

So the header table has information for the meter coming from the top of each record and the usage table has the 15 minutes readings for each hour up to the 24 hours of the day. Look at this data alone tells you how huge data you will process when you have millions of customers coming in these files. You realize the header table is related to the usage table so we can identify the meter to which the usage belongs.

Given the file and the data, how the output clause can help use is to insert data into the header table and then use the output clause to get information on all meters entered and use that information to join to data going into the usage table.

So lets assume the data from the file resides in a staging table called intevalfilestg.

Then I will create procedure that will read from the staging table into the final tables. In the procedure, l will declare a table variable which will have the same attributes as the Header table. Then I will read the meter header information from the file into the header table and use the output clause to access all meter information that l just read into the header table. For example, using the insert statement below;

INSERT

 

OUTPUT INSERT.HeaderID

INSERT. ESIID

INSERT. ReadStartDate

INSERT.ReadEndDate

INSERT.ReadStart

INSERT.ReadEnd

INTO @HeaderVariable

INTO HEADER

SELECT     ESIID,

ReadStartDate

ReadEndDate

ReadStart

ReadEnd

FROM   INTERVALEFILESTG

The headerid is just an identity value that gets inserted for each record and on each day, a meter gets a new headerid inserted to mark that days record. The idea is to avoid assigning the wrong headerid to different days reading so when you keep all the headerids in the table variable above, it saves you from joining to the main header table to look up the id for that meter you want to insert the usage for in the next table. Also, you avoid having to join to the ever growing header table every day to get the headerid to insert into the usage table for each meter’s usage record.

Having this in place, all you have to do to get data into the Usage table will be below;

INSERT HeaderID,

Hour,

Usage1,

Usage2,

Usage3,

Usage4

INTO USAGE

SELECT     HeaderID,

Hour,

Usage1,

Usage2,

Usage3,

Usage4

FROM INTERVALUSAGEFILESTG I

JOIN @headerVariable H

ON I.ESIID = H.ESIID

Some may ask why don’t I put the header informations into temp table first then you can join just like the table variable generated. Well, you have to think optimization most of the time. To reduce the I/O on the server, its better to access the values from the output clause, put it in a variable which resides in memory and continue to use it for further processing.  It will make your processing faster and reduce burden on the server as well due to less I/O.

The output clause can also be used to access deleted, update or merged records and you can find more information on that at the link l put above. Remember, when there is an error or a transaction is rolled back, nothing changes in the records accessed in the output clause so you have to be careful to make sure you are only accessing the records from the output clause after a successful transaction.

Ok, I have to continue watching this Texans game to the end so drop any question or comment below. Weird enough, Mike Shaub just came to replace Keenum, like really? lol

 

 

 

Leave a Reply

Your email address will not be published. Required fields are marked *

Name *