Old ladyHow Big Is Your Package?
Size up and convert your DTS package to SSIS. If you have the biggest DTS package, win a gift card and certificate of a world record package size.  
March 2008 - Posts - Brian Knight

Pragmatic Works

Welcome to Pragmatic Works Sign in | Join | Help
in Search

Brian Knight

March 2008 - Posts

  • File Properties Task Now Free

    File Properties Task

    This SQL Server 2005 task reads the properties of a file and writes the values to a series of variables. For example, you can use this task to determine if the file is in use (still be uploaded or written to) and then conditionally run the Data Flow task to load the file if it's not being used. You can also use it to determine when the file was created in order to determine if it must be archived.

    Download the Fully Functional Product for Free


    Task Configuration
    See Larger Image of Estimation
    After dragging the task over, you can see a list of exposed properties for the file. First, specify the connection manager or the variable that contains the file name that you wish to evaluate. Then, map whichever properties you wish to read from the file to a variable in SSIS. Some of the properties exposed are:
    • File Creation Date and Time
    • File In Use
    • Last Access Date and Time
    • Last Modified Date and Time
    • File Read Only
    • File Hidden
    • File Name
    • File Extension

    Use Cases

    This task compartmentalizes and simples what would be very complex scripting inside a Script Task into a simple to use task user interface with no coding experience needed. Here are a few of the use cases that you can use the File System Task for.

    File Archival

    Files have a tendency to build up on a SQL Server. Maintenance plan steps fail often times, leaving backups behind or extract files are never removed. You can use the File System Task to quickly develop a package to read the creation date or the last accessed date of a file and move or remove it.

    To accomplish this, first drag over a For Each Loop Container and configure it to point to the directory to loop through. You can also setup the container to evaluate subdirectories. Drag over the File Properties Task onto the pane and configure it to point to the file name that's presently being enumerated through in the For Each Loop Container. Next, configure the File Creation Date (or another date property) to output to an SSIS variable like FileCreationDate. Next, create a File System Task to remove the file using the Delete File operation. The last step is to create a precedence constraint between the two tasks and double-click on the constraint. Set the Evaluation Operation to Expression and Constraint and also set the Expression property to the following expression:

     DATEDIFF( "D", @FileCreationDate, getdate())  > 30

    The final product package would resemble this screenshot.

    File In Use

    In large data loads where a file is your source, you want to ensure that you don't load a file that's still being transmitted to you. If you try to load a file that's still being FTPed or copied to you, you will receive a potential sharing violation and the package will fail. With the File Properties Task, you can look at the file you're about to upload to ensure it's not in use. If it is, skip that file in the processing chain and retry later.

    To accomplish this inside a loop, drag over the File Properties Task onto the SSIS design pane and point it to the connection manager you're about to load. Next, output the File In Use property to a SSIS variable like FileInUse. Connect the task to whichever task will be loading the file. Then, double-click on the precedence constraint and set the Evaluation Operation to Expression and Constraint and also set the Expression property to @FileInUse == False. This wil ensure that the next task will only execute if the file is not in use and the final product will resemble a package that looks like the below.

     

     

    Pricing

    The File Properties Task is priced per machine license. The price of the task is:

    • Free in an *as is* license
  • Renumbering Package IDs in an Automated Fashion

    I'm now a huge advocate of using package templates. For more info on how to use them see this post. Template provide a handy way to determine the best practices for your company and reuse those over and over again. Well there's one slight problem with using them. The package that uses the template inherits the PackageID of the template. Why is this a problem? Well, if you're using the System::PackageID variable for logging, all of your packages now will report that they are the template package.

    So, you have two options. Ultimately, you must renumber each package to a unique value. You can do this manually in the package Properties pane. You can also do this in an automated method by using the dtutil.exe application and the /I switch as shown below:

    dtutil.exe /I /FILE "PackageName.dtsx"

    The best way I feel is to create a batch file with the following code that will loop through every package in a directory and renumber it:

    for %%f in (*.dtsx) do dtutil.exe /I /FILE "%%f"

    Please note though that once you use DTSutil.exe to do this, it will re-arrange your package. Don't worry, your package will still work but the appearance may be a bit rearranged.

    Note: You can vote to change the behavior of SSIS here to fix that problem: http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=241901

    Posted Mar 12 2008, 09:04 AM by Brian Knight with 3 comment(s)
    Filed under:
  • SSIS Lab: Runtime Performance Differences Between DTS and SSIS

    Even though you can indeed run DTS packages from within SSIS the solution is only partially complete. In Management Studio, you can manage your DTS packages under Managment --> Legacy --> Data Transformation Services. From here, you can open packages, import new ones but you cannot create a new DTS package. To create a new one, you can open an old one, delete everything in the package and then do Package --> Save As, naming it the new name. That's a not so elegant workaround but it will work!

    When you upgrade to SSIS, you can see a dramatic improvement in performance. For example, I took a simple DTS package which pulls one million rows out of a comma-delimited text file and writes that data to a SQL Server table. In DTS the average runtime over five runs of the package was 33.2 seconds for the million records. The same upgraded package took 11.3 seconds on average if you choose to migrate to a SQL Server Destination in the Data Flow task or 12.3 seconds on average for a standard OLE DB Destination. This was on a multi-core laptop with 2 GB of RAM and the package was run 5 times for closer accuracy.

     
      DTS SSIS SQL Server Destination SSIS OLE DB Destination
    Average Runtime

    33.2

    11.3

    12.3

     

    Keep in mind that this package was quite simple. If you migrated to an SSIS package with a more complex transformation, the results would be much, much more impressive. But, with a simple package with no transformation, there was close to a 65% performance improvement between DTS and SSIS. Notice also that the SQL Server Destination runs slightly faster also than the OLE DB Destination. The reason for that is the SQL Server Destination writes directly into SQL Server's shared memory space and skips the TCP/IP stack. The package here gives about an 8% gain in performance but this could be less as the SSIS package had more transformation and the share of the destination load became less of the work.

     

    Another key point about running DTS in the 2005 or 2008 runtime. There is no way to look at the DTS package logs as you did in Enterprise Manager. To do this, the workaround is to create reports or queries to go against the sysdtslogs tables in the msdb database. A far cry though from the old Enterprise Manager interface. The moral of the story in my eyes is to not be lured into running DTS in the 2005 or 2008 environment. While it will work, it does create a management headache. Another key is that the DTS runtime will not run in the 64 bit environment.

     

    If you need help migrating DTS packages to SSIS, check out http://www.DTSxChange.com.

     

    -- Brian Knight

  • SQL Saturday Jacksonville Pre-Conference Announced

    Hopefully by now, you've already heard of SQL Saturday #3 in Jacksonville, FL. SQL Saturday is a free conference that brings together the SQL Server community of developers, DBAs and BI guys/gals to a single event with lots of great sessions, speakers and networking. The event is totally free and is funding thanks to the kindness of sponsors and the pre-conference event.

    I'm pleased to announce that following in the footsteps of the Orlando and Tampa events we have a one day seminar on Performance Tuning May 2nd by SQL Server author, speaker and trainer Andy Warren of End to End Training. The one day seminar will be be at a great price of $99 and that includes lunch and any profits go to pay for the event SQL Saturday event. The seminar will focus on the things that a beginning DBA or a developer that does data access needs to know to help get the best performance from SQL Server. Our goal is that you go home with some useful knowledge that you apply immediately. Seminar will be held at the Embassy Suites Jacksonville (link below).

    If you're coming to the pre-conference, make sure you stay one more day for the free SQL Saturday event, giving you two great SQL Server days of rich content for $99. For more information about SQL Saturday, please visit http://www.sqlsaturday.com.

    Our primary hotel for the event will the Embassy Suites Jacksonville. Call the hotel directly to register and ask for the event rate of $89/night (includes internet access Friday), it will be under Pragmatic Works, or if you register online use group code PGW. If you have problems getting a room please email webmaster@sqlsaturday.com.

More Posts
Powered by Community Server (Non-Commercial Edition), by Telligent Systems