in

Pragmatic Works

Enabling your business intelligence enterprise.

Brian Knight

  • Speaking at Midlands PASS May Meeting

    Are you in Columbia, SC or Charlotte? If so, I hope to see you at the Midlands PASS meeting on the 27th of May.

    Tuesday, May 27, 2008

    Speaker: SQL Server MVP Brian Knight

    The Midlands PASS chapter will hold a special meeting on Tuesday, May 27, 2008, to host SQL Server MVP Brian Knight. I will be giving a presentation on Data Mining using SQL Server. The meeting will once again be held at Training Concepts off of Berryhill Road. We will begin our meet and greet time at 6:15 PM as usual and start the presentation between 6:30 and 6:45 PM. I will send out an agenda next week.

    Please feel free to forward this to anyone who you think would be interested in attending. If you plan on attending, please RSVP (BKelley{AT}AgFirst.com) (replace {AT} with @) as soon as possible so we can ensure we have enough space and food.

    -- Brian Knight

  • SSIS Case Sensitivity

    One of the most frustrating learning curves for a DBA going to SSIS is the case sensitivity of the environment. Many developers are quite used to this due to option explicit .NET programming languages. The case sensitivity can in some cases create behavior that is not expected and may give you bad results if you're not careful.

    One such example is with the Lookup Transform, where comparisons against the cache are case sensitive. If you do not expect this, you may have a miss in a match that is actually a hit. To resolve this, you may have to use an UPPER or LOWER function to make everything upper or lower case. You can also create a new column in the data flow that has the data as UPPER by the use of a Derived Column Transform. There are many options there keep in mind. In the Lookup Transform, you can do upper case the data by changing the SQL query as shown below if the City column is the one where the comparison is happening.

    image

    The expression language is also case sensitive. Let's take the conditional logic as shown below:

    STATE == "FL" ? "Florida" : STATE

    If the state of "Fl" comes through, it will not qualify to be changed to "Florida". Instead, you must make a like comparison as shown below:

    UPPER(STATE) == "FL" ? "Florida" : STATE

    Additionally, a more obvious case sensitive issue is with variables. All variables, whether in expressions or a script component are case sensitive.

     

    -- Brian Knight

    Posted May 08 2008, 08:11 PM by Brian Knight with 1 comment(s)
    Filed under:
  • SQL Saturday #3 Jacksonville Recap

    I'm still coming down from my high of hosting my first SQL Saturday in Jacksonville at University of North Florida. We had almost 400 people registered and nearly 300 came on the day of the event. The 20 volunteers we had did an amazing job of handling the rush and the day in general. We also had 37 total sessions by 28 national and regional speakers.

    If you attended SQL Saturday #3, expect to receive an email from us soon with the various slide decks. Since space was slightly cramped in the building, we had a bit of a big geek hippy picnic in the courtyard shade. It was a neat sight to see all the introverted geeks, socializing. What amazed me most was that we had 100 people at the after event with a live band and pool tables. We had planned on seeing maybe 50-60 there!

      

    As we read the evals, it seems our biggest lessons learned was to give better directions to the facility but other than that, attendees had a fantastic time.

    Tim Mitchell from Dallas was one of the speakers and attendees and was kind enough to post almost 100 pictures from his SQL Saturday experience. You'll see pictures that follow the order of: registration, sessions, lunch, closing ceremony and the after event.

    http://tim-mitchell.spaces.live.com/

     

    The fanatic SQL Server winner of the day must be Rodney, who has added a SQL tattoo on his left arm.

     

     

    Thanks to all the sponsors, volunteers, speakers and attendees that made this day possible.

     

    --Brian Knight

  • (local) Times out in SQL Server 2005

    Embarrassingly enough, since SQL Server 2005 came out, I've made myself get in the habit of typing localhost instead of (local) because (local) seemed to time out on my laptop. I never really cared enough to research why localhost worked for my server name but (local) did not. In SQL Server 2000, I had written tons of DTS packages to communicate to (local) and once converted, they timed out for some mysterious reason.

    Well finally curiosity took me and I did some research.  It turns out (as many of you already know) that localhost uses the TCP/IP protocol to communicate with SQL Server and (local) communicates over Named Pipes. This is not really a big deal until SQL Server 2005. In SQL Server 2005, Microsoft changed the behavior of the instance of SQL Server to not listen to Named Pipes by default. Thus, causing your SSIS or DTS packages to now time out. You can turn this on quickly in the SQL Server Configuration Manager as shown below. Simply flick the Disabled text to Enabled and you're ready to roll! It will require a stop and start of the SQL Server instance after enabling.

     

    image

    Brian Knight

    Posted Apr 25 2008, 11:28 PM by Brian Knight with 2 comment(s)
    Filed under:
  • MVP Summit Event Party Video of Bayer White

    I just got back from the annual MVP summit and had a great time in Seattle. One item that I'd love to share that should be under NDA was the party :). At the party, they had a large karaoke rock band setup. Connected Systems MVP and friend Bayer White got up and sang a good southern song, "Gimmie Three Steps". There's a quick video recorded on the cell phone below. Sorry for the poor quality.

  • SQL Saturday Rapidly Approaching

    SQL Saturday Jacksonville is only a month away on 5/3 at UNF! We’d like to spend a moment of your time asking for your help promoting this free event. It’s not often that Jacksonville gets an event this size dedicated to SQL Server. So far, we have about 175 attendees and it’s growing every day. Please help us fill the seats by sending this email to your friends or co-workers or blog about it. The complete schedule is now posted here: http://www.sqlsaturday.com/schedule.aspx and registration and lunch are free on the http://www.sqlsaturday.com website. In addition to 36 sessions and 28 speakers, we have hundreds of prizes to giveaway throughout the day and some great networking opportunities.

     

    If you’re coming in from out of town, Our primary hotel for the event will the Embassy Suites Jacksonville. Call the hotel directly to register and ask for the event rate of $89/night (includes Internet access), it will be under Pragmatic Works, or if you register online use group code PGW. If you have problems getting a room please email webmaster@sqlsaturday.com.

    Want to go all out and really make a weekend of it? JSSUG is also sponsoring a one day Performance Tuning Seminar on May 2nd by well known SQL Server trainer Andy Warren for only $99. If you're looking for more than just simple tips, this is the seminar for you! Find out indexes really work, how to parameterize and reuse plans correctly, identify missing and under used indexes, and more.

  • New Laptops with Solid State Drives

    For a geek, nothing brings back that feeling that we had when we were 7 years old running out of our rooms on Christmas day to see what Santa brought much like getting a new PC or laptop. Recently I purchased a Latitude D830 from Dell. The laptop is normal enough but the main difference is the hard drive. I splurged this time and purchased the solid state drive (SSD) upgrade, which caps out now at 64 GB. For those new to SDD, it's essentially a hard drive technology with no moving parts and resembles a flash drive that you use in your camera. Because it has no moving parts, it's strangely quiet and the battery lasts an amazing 6-8 hours.

    Most importantly, it's fast. Boot times are about 4-8 seconds once you see the Windows logo. Applications like Outlook and Excel open in about 1.5 seconds. For a DBA, queries that used to take 30 seconds on my old laptop, now take 5 seconds. According to manufactors, you should see about 150% boost on write performance. I'm seeing this easily so far.

    With every pro there is a con. With SDD technology, it's new and expensive. The upgrade retailed at $800, although it was heavily discounted. The most you can order presently is 64 GB, which for a developer with virtual pc's may be tough. This technology is getting cheaper fast though and larger drives are coming. Samsung just announced a 300 GB drive to be released (at about a $2,000 price tag recently). The drive size is my biggest gripe so far but it's easily upgradable later.

     I've been so spoiled by the new drive technology that I'll find it hard to buy anything else than a laptop with SDD.

     Brian Knight

     

  • File Properties Task Now Free

    File Properties Task

    This SQL Server 2005 task reads the properties of a file and writes the values to a series of variables. For example, you can use this task to determine if the file is in use (still be uploaded or written to) and then conditionally run the Data Flow task to load the file if it's not being used. You can also use it to determine when the file was created in order to determine if it must be archived.

    Download the Fully Functional Product for Free


    Task Configuration
    See Larger Image of Estimation
    After dragging the task over, you can see a list of exposed properties for the file. First, specify the connection manager or the variable that contains the file name that you wish to evaluate. Then, map whichever properties you wish to read from the file to a variable in SSIS. Some of the properties exposed are:
    • File Creation Date and Time
    • File In Use
    • Last Access Date and Time
    • Last Modified Date and Time
    • File Read Only
    • File Hidden
    • File Name
    • File Extension

    Use Cases

    This task compartmentalizes and simples what would be very complex scripting inside a Script Task into a simple to use task user interface with no coding experience needed. Here are a few of the use cases that you can use the File System Task for.

    File Archival

    Files have a tendency to build up on a SQL Server. Maintenance plan steps fail often times, leaving backups behind or extract files are never removed. You can use the File System Task to quickly develop a package to read the creation date or the last accessed date of a file and move or remove it.

    To accomplish this, first drag over a For Each Loop Container and configure it to point to the directory to loop through. You can also setup the container to evaluate subdirectories. Drag over the File Properties Task onto the pane and configure it to point to the file name that's presently being enumerated through in the For Each Loop Container. Next, configure the File Creation Date (or another date property) to output to an SSIS variable like FileCreationDate. Next, create a File System Task to remove the file using the Delete File operation. The last step is to create a precedence constraint between the two tasks and double-click on the constraint. Set the Evaluation Operation to Expression and Constraint and also set the Expression property to the following expression:

     DATEDIFF( "D", @FileCreationDate, getdate())  > 30

    The final product package would resemble this screenshot.

    File In Use

    In large data loads where a file is your source, you want to ensure that you don't load a file that's still being transmitted to you. If you try to load a file that's still being FTPed or copied to you, you will receive a potential sharing violation and the package will fail. With the File Properties Task, you can look at the file you're about to upload to ensure it's not in use. If it is, skip that file in the processing chain and retry later.

    To accomplish this inside a loop, drag over the File Properties Task onto the SSIS design pane and point it to the connection manager you're about to load. Next, output the File In Use property to a SSIS variable like FileInUse. Connect the task to whichever task will be loading the file. Then, double-click on the precedence constraint and set the Evaluation Operation to Expression and Constraint and also set the Expression property to @FileInUse == False. This wil ensure that the next task will only execute if the file is not in use and the final product will resemble a package that looks like the below.

     

     

    Pricing

    The File Properties Task is priced per machine license. The price of the task is:

    • Free in an *as is* license
  • Renumbering Package IDs in an Automated Fashion

    I'm now a huge advocate of using package templates. For more info on how to use them see this post. Template provide a handy way to determine the best practices for your company and reuse those over and over again. Well there's one slight problem with using them. The package that uses the template inherits the PackageID of the template. Why is this a problem? Well, if you're using the System::PackageID variable for logging, all of your packages now will report that they are the template package.

    So, you have two options. Ultimately, you must renumber each package to a unique value. You can do this manually in the package Properties pane. You can also do this in an automated method by using the dtutil.exe application and the /I switch as shown below:

    dtutil.exe /I /FILE "PackageName.dtsx"

    The best way I feel is to create a batch file with the following code that will loop through every package in a directory and renumber it:

    for %%f in (*.dtsx) do dtutil.exe /I /FILE "%%f"

    Please note though that once you use DTSutil.exe to do this, it will re-arrange your package. Don't worry, your package will still work but the appearance may be a bit rearranged.

    Note: You can vote to change the behavior of SSIS here to fix that problem: http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=241901

    Posted Mar 12 2008, 09:04 AM by Brian Knight with 3 comment(s)
    Filed under:
  • SSIS Lab: Runtime Performance Differences Between DTS and SSIS

    Even though you can indeed run DTS packages from within SSIS the solution is only partially complete. In Management Studio, you can manage your DTS packages under Managment --> Legacy --> Data Transformation Services. From here, you can open packages, import new ones but you cannot create a new DTS package. To create a new one, you can open an old one, delete everything in the package and then do Package --> Save As, naming it the new name. That's a not so elegant workaround but it will work!

    When you upgrade to SSIS, you can see a dramatic improvement in performance. For example, I took a simple DTS package which pulls one million rows out of a comma-delimited text file and writes that data to a SQL Server table. In DTS the average runtime over five runs of the package was 33.2 seconds for the million records. The same upgraded package took 11.3 seconds on average if you choose to migrate to a SQL Server Destination in the Data Flow task or 12.3 seconds on average for a standard OLE DB Destination. This was on a multi-core laptop with 2 GB of RAM and the package was run 5 times for closer accuracy.

     
      DTS SSIS SQL Server Destination SSIS OLE DB Destination
    Average Runtime

    33.2

    11.3

    12.3

     

    Keep in mind that this package was quite simple. If you migrated to an SSIS package with a more complex transformation, the results would be much, much more impressive. But, with a simple package with no transformation, there was close to a 65% performance improvement between DTS and SSIS. Notice also that the SQL Server Destination runs slightly faster also than the OLE DB Destination. The reason for that is the SQL Server Destination writes directly into SQL Server's shared memory space and skips the TCP/IP stack. The package here gives about an 8% gain in performance but this could be less as the SSIS package had more transformation and the share of the destination load became less of the work.

     

    Another key point about running DTS in the 2005 or 2008 runtime. There is no way to look at the DTS package logs as you did in Enterprise Manager. To do this, the workaround is to create reports or queries to go against the sysdtslogs tables in the msdb database. A far cry though from the old Enterprise Manager interface. The moral of the story in my eyes is to not be lured into running DTS in the 2005 or 2008 environment. While it will work, it does create a management headache. Another key is that the DTS runtime will not run in the 64 bit environment.

     

    If you need help migrating DTS packages to SSIS, check out http://www.DTSxChange.com.

     

    -- Brian Knight

  • SQL Saturday Jacksonville Pre-Conference Announced

    Hopefully by now, you've already heard of SQL Saturday #3 in Jacksonville, FL. SQL Saturday is a free conference that brings together the SQL Server community of developers, DBAs and BI guys/gals to a single event with lots of great sessions, speakers and networking. The event is totally free and is funding thanks to the kindness of sponsors and the pre-conference event.

    I'm pleased to announce that following in the footsteps of the Orlando and Tampa events we have a one day seminar on Performance Tuning May 2nd by SQL Server author, speaker and trainer Andy Warren of End to End Training. The one day seminar will be be at a great price of $99 and that includes lunch and any profits go to pay for the event SQL Saturday event. The seminar will focus on the things that a beginning DBA or a developer that does data access needs to know to help get the best performance from SQL Server. Our goal is that you go home with some useful knowledge that you apply immediately. Seminar will be held at the Embassy Suites Jacksonville (link below).

    If you're coming to the pre-conference, make sure you stay one more day for the free SQL Saturday event, giving you two great SQL Server days of rich content for $99. For more information about SQL Saturday, please visit http://www.sqlsaturday.com.

    Our primary hotel for the event will the Embassy Suites Jacksonville. Call the hotel directly to register and ask for the event rate of $89/night (includes internet access Friday), it will be under Pragmatic Works, or if you register online use group code PGW. If you have problems getting a room please email webmaster@sqlsaturday.com.

  • Customer Labs: SQL Server Data Mining Bagging

    Recently I had a client who was receiving very good performance out of the decision tree algorithm on the top cases but the predictions in the middle of the cases left room for improvement. We tried each of the other algorithms like Logical Regression and Naive Baines but each had its own flaws. In all cases, the customer loved parts of each of the four primary SQL Server data mining algorithms but each left some key down at the bottom of the pile or promoted cases too aggressively.

    Enter a concept called data mining bagging. With data mining bagging, you can take all four algorithms that give you reasonable performance and combine their score into a more level prediction. In this customer’s case, we choose the top four algorithms for their data: Decision Tree, Clustering, Naïve Baines and Logical Regression. We were trying to predict the probability of a salesperson wanting to chase a given customer based on the odds of them getting the project. All the salespeople voted whether this was a good customer or not and these votes were all introduced as evidence.

    So the end game was this. We took all four algorithms and ran their respective queries in SSIS, loading a table for each algorithm of the customer and probability of it being a good customer (score). Then, we bagged the results in SSIS by taking the four tables and averaging their score. The result improved the already splendid results from data mining by 40%.

    By bagging your data, you ensure that one week model doesn’t inappropriately grade a case. It ensures that each algorithm essentially votes for each case and the more models that vote for that case, truly pushes it up. It also helps you get past the weakness of each model inherently will have. After seeing these results in SQL Server Analysis Services, I would highly recommend using bagging on any future projects.

    -- Brian Knight
  • SSIS Class Free Bonus Day

     In case you're looking for SSIS training in the immediately future, I have an upcoming class that we extended to 5 days for the same price. The fifth day includes training in SQL Server 2008 SSIS concepts and additional performance tuning on SSIS. It's our most complete SSIS class to date and there are only a few seats left if you can make it. 

    http://www.endtoendtraining.com/public/classes/coursedetails.aspx?courseid=19

    The upcoming class is in our Orlando training center. The training center is top notch with every seat a leather high-back seat and lunches served each night. On one night, we even take students to CheeseCake Factory and have a massage theropist come in one day to take care of class stress.

    Brian Knight 

     

    Posted Feb 23 2008, 08:13 PM by Brian Knight with no comments
    Filed under:
  • SQL Saturday Registration and Call for Speakers Open!

    We've now locked in on a venue and registration is now open for SQL Saturday #3 in Jacksonville. SQL Saturday is a free all day data related event for database professionals and developers who use data. The event is 5/3/8 at University of North FLorida. Space is limitted to the first 400 registrants and each of the other two event in Orlando and Tampa filled up to capacity. You can reserve your spot here:http://www.sqlsaturday.com/register.aspx

     We're also looking for sessions. You can see what sessions have been submitted, suggested or suggest a session here: http://www.sqlsaturday.com/possiblesessions.aspx 

    You can also submit a session if you're interested in speaking here: http://www.sqlsaturday.com/callforspeakers.aspx

    I hope to see you there and register soon to hold your space!

    Brian

     

  • Speaking at the Richmond SQL Server Users Group

     I'll be at the Richmond SQL Server Users Group next week if you're near the area.

    Brian Knight, SQL Server MVP
    presents Introduction to Data Mining

    Wednesday, February 27, 2008, 6:30 PM - 8:00 PM
    Social Time starts at 6:00 PM!
    Location: Markel Plaza
    4600 Cox Road
    Glen Allen, VA 23060 [map]

    Registration is not required for this event but if you plan to attend, please register so we'll know how many developers and database professionals to expect (feed). Thanks!

     

More Posts Next page »
Copyright Pragmatic Works
Powered by Community Server (Non-Commercial Edition), by Telligent Systems