Johan Åhlén

Johan Åhlén's blog about life, universe and everything.
SQL Azure new user interface - Project Houston

"Project Houston" is the new user interface for SQL Azure. If you have tried SQL Azure you know that the user interface on the web portal is very brief. You can't do almost anything from there (except create new database and setup security). To do anything with your SQL Azure database you need to use the "non-cloud" tools like Management Studio (in SQL Server 2008 R2).

The new user interface is a Silverlight application. It's got a fresh new look:

You can use it to design and edit the contents of your database objects. However, it is not the fastest (I managed to hang it by opening too large a table).

So what is it useful for? I'd say it's useful in situations where you don't have access to Management Studio (or a similar tool), or for people who'd be scared of Management Studio. The idea that a cloud database should also have a cloud-based user interface is a good one, although in most cases it will be more convenient to use the traditional tools.

 

StreamInsight introduction and BizTalk adapter

I stumbled upon this blog post, which I think gives a very good introduction to building a StreamInsight application:
http://blogs.msdn.com/b/masimms/archive/2010/08/10/building-your-first-end-to-end-streaminsight-application.aspx

It's rather long and more detailed than my introduction articles. I can really recommend it if you want to get started with StreamInsight!

I can also recommend this article which describes how to build an adapter that communicates with BizTalk (or any other SOAP/REST endpoint):
http://seroter.wordpress.com/2010/07/09/sending-streaminsight-events-to-biztalk-through-new-web-soaprest-adapter/

 

Try your SQL optimizing skills and win great honor

As winner of the previous round, I've been given the honor of writing the problem for Phil Factor Speed Phreak Challenge #6. This problem comes from the real world (algorithmic trading) and the sample data is real.

The challenge is to reproduce the highest buy and lowest sell prices from data captured from a stock exchange. There are four competition categories, all of which offer great honor and nice prizes:

  • Overall best performance (first prize)
  • Best SQL CLR solution
  • Best cursor-based solution
  • Best unconventional solution (be creative!)

Don't hesitate - give it a try!
http://ask.sqlservercentral.com/questions/17499/phil-factor-speed-phreak-challenge-6-the-stock-exchange-order-book-state-problem 

Have fun and good luck!

 

Exploring SQL Server under the hood using undocumented DBCC commands

Have you ever wanted to see the details of how data is organised in a SQL Server database? Exactly how tables, indexes, blobs, etc. are stored, and how SQL Server finds a specific row in a table? This is possible through two undocumented DBCC commands: DBCC IND and DBCC PAGE.

Here's a description of the DBCC commands.

DBCC IND

Lists all of a table's data and index pages. Syntax:

DBCC IND
(
  'database_name' | database_id,
  table_name,
  index_id
)

where index_id is either a value from sys.indexes, or:

  • -1 to show all indexes and IAMs
  • -2 to show all IAMs

DBCC PAGE

Displays the content of data and index pages.

DBCC PAGE
(
  'database_name' | database_id,
  file_number,
  page_number,
  detail_level
)

where detail_level is a number between 0 (least detailed) and 3 (most detailed).

Note that both these DBCC commands require that you set trace flag 3604:

DBCC TRACEON (3604)

An example

Let's start by creating a database with a few files.

CREATE DATABASE [TEST] ON PRIMARY
( NAME = N'TEST_Primary1', FILENAME = N'C:\Your_path_here\TEST_Primary1.mdf', SIZE = 3072KB, MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
( NAME = N'TEST_Primary2', FILENAME = N'C:\Your_path_here\TEST_Primary2.ndf', SIZE = 3072KB, MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB ),
FILEGROUP [SECONDARY]
( NAME = N'TEST_Secondary', FILENAME = N'C:\Your_path_here\TEST_Secondary.ndf', SIZE = 3072KB, MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON
( NAME = N'TEST_log', FILENAME = N'C:\Your_path_here\TEST_log.ldf', SIZE = 1024KB, MAXSIZE = 2048GB, FILEGROWTH = 10% )

Next, let's create a table.

USE [TEST]
GO 
CREATE TABLE [dbo].[TestTable]
(
  [ID] [int] IDENTITY(1,1) NOT NULL,
  [SmallText] [varchar](2000) NOT NULL,
  [LargeText] [text] NOT NULL,
  [SparseText] [varchar](20) SPARSE NULL,
  CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED
  (
    [ID] ASC
  ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [SECONDARY]

Now let's insert some data into the table.

WITH CTE1 AS
(
  SELECT 1 AS [Ctr]
  UNION ALL
  SELECT [Ctr] + 1
  FROM CTE1
  WHERE [Ctr] < 2000
),
CTE2 AS
(
  SELECT '*** Row ' + CAST([Ctr] AS VARCHAR) + ' ' + REPLICATE('*', 1000) AS [Str],
         [Ctr]
  FROM CTE1
)
INSERT INTO TestTable(SmallText, LargeText, SparseText)
SELECT [Str], [Str], NULL
FROM CTE2
OPTION (MAXRECURSION 0)

Let's modify one of the rows.

UPDATE TestTable
SET SparseText = 'Hello world!'
WHERE ID = 2

Now let's have a look at the table details through a DMV.

SELECT * FROM sys.dm_db_index_physical_stats(DB_ID('TEST'), OBJECT_ID('TestTable'), NULL, NULL, 'DETAILED')

The interesting output here is that the IN_ROW_DATA allocation unit (ID, SmallText and SparseText) has 2000 records (the same as the number of rows) while the LOB_DATA allocation unit (LargeText) has 4000 records! That is something you could investigate using the DBCC commands.

First let's find the page numbers.

DBCC TRACEON (3604)
DBCC IND('TEST', TestTable, -1)

You'll get a long list of rows (one for each data/index page). Some of the interesting columns are PageFID and PagePID, which give you the file_number and page_number of the page. If you have used my database creation statement, you'll see three different values for PageFID: 1 for Primary1, 3 for Primary2 and 4 for Secondary. There is also a PageType field that indicates the type of page: 1 for a data page, 2 for an index page, 3 and 4 for text pages, 8 for a GAM page, 9 for an SGAM page, 10 for an IAM page and 11 for a PFS page.

Now you can display the detailed contents of any page using the DBCC PAGE command. Let's have a look at the first data page of the table - in my example PageFID 1 / PagePID 89.

DBCC PAGE('TEST', 1, 89, 3)

The number 3 says we want as much detail as possible. You'll get very detailed info about the page. Some of the output is shown below (run the command yourself to see all of it).


PAGE: (1:89)


BUFFER:


BUF @0x0000000085FE9080

bpage = 0x0000000085C2C000           bhash = 0x0000000000000000           bpageno = (1:89)
bdbid = 9                            breferences = 0                      bcputicks = 390
bsampleCount = 1                     bUse1 = 7672                         bstat = 0xc0010b
blog = 0xca2159bb                    bnext = 0x0000000000000000          

PAGE HEADER:


Page @0x0000000085C2C000

m_pageId = (1:89)                    m_headerVersion = 1                  m_type = 1
m_typeFlagBits = 0x4                 m_level = 0                          m_flagBits = 0x0
m_objId (AllocUnitId.idObj) = 29     m_indexId (AllocUnitId.idInd) = 256 
Metadata: AllocUnitId = 72057594039828480                                
Metadata: PartitionId = 72057594038779904                                 Metadata: IndexId = 1
Metadata: ObjectId = 2105058535      m_prevPage = (0:0)                   m_nextPage = (1:94)
pminlen = 8                          m_slotCnt = 7                        m_freeCnt = 759
m_freeData = 7419                    m_reservedCnt = 0                    m_lsn = (43:39:2)
m_xactReserved = 0                   m_xdesId = (0:0)                     m_ghostRecCnt = 0
m_tornBits = 0                      

Allocation Status

GAM (1:2) = ALLOCATED                SGAM (1:3) = NOT ALLOCATED          
PFS (1:1) = 0x60 MIXED_EXT ALLOCATED   0_PCT_FULL                         DIFF (1:6) = CHANGED
ML (1:7) = NOT MIN_LOGGED           

Slot 0 Offset 0x60 Length 1043

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP VARIABLE_COLUMNS
Record Size = 1043                  
Memory Dump @0x000000000DD8A060

0000000000000000:   30000800 01000000 03000002 00030413 †0...............
0000000000000010:   842a2a2a 20526f77 2031202a 2a2a2a2a †„*** Row 1 *****
0000000000000020:   2a2a2a2a 2a2a2a2a 2a2a2a2a 2a2a2a2a †****************
0000000000000030:   2a2a2a2a 2a2a2a2a 2a2a2a2a 2a2a2a2a †****************
0000000000000040:   2a2a2a2a 2a2a2a2a 2a2a2a2a 2a2a2a2a †****************

Good luck exploring SQL Server storage!

 

SQLIO performance tests on my SSD drive

SQLIO is a great tool to do performance tests on SANs or local hard disks. It helps you identify the optimal configurations for getting maximum data rates and latencies from your storage. Brent Ozar has created a helpful ten-minute video on how to get started with SQLIO.
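
For reference, a typical SQLIO command line looks something like this (an illustrative example only; the file name and parameter values below are placeholders of my own, not the exact commands I ran):

sqlio -kR -s120 -frandom -o8 -b8 -t4 -LS -BN C:\testfile.dat

Here -kR means read (-kW for write), -s is the test duration in seconds, -frandom or -fsequential selects the I/O pattern, -o is the number of outstanding requests, -b is the block size in KB, -t is the number of threads, -LS captures latency statistics and -BN disables buffering.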

I have now tested my SSD drive. For comparison, I also tested my old hard disk in the same machine. You can see the results below.

+----------+-------------+------------+---------+---------------+
| DiskType | ReadOrWrite | IOpattern  | MBs_Sec | LatencyMS_Avg |
+----------+-------------+------------+---------+---------------+
| SSD      | R           | random     |     237 |            22 |
| NonSSD   | R           | random     |       4 |          1266 |
| SSD      | R           | sequential |     250 |            22 |
| NonSSD   | R           | sequential |      38 |           155 |
| SSD      | W           | random     |      65 |            96 |
| NonSSD   | W           | random     |       8 |           712 |
| SSD      | W           | sequential |      95 |            60 |
| NonSSD   | W           | sequential |      38 |           161 |
+----------+-------------+------------+---------+---------------+

The results are similar to what others report about SSD drives. The real killer is random reads, where the SSD is more than 50 times faster than the traditional drive. Also note that for SSD reading is much faster than writing, whereas for the traditional drive there is no difference in speed between reading and writing sequentially.

For more details you can see my raw output files from SQLIO here. You can also read this earlier blog post.

Autodetecting Instance Names in StreamInsight

Since the RTM version you have to specify the instance name when connecting to or creating an in-process StreamInsight server. There is no such thing as a "Default Instance" like there is in the SQL Server database engine. Since users are very creative when picking the name of their StreamInsight installation you need to either have your application ask for the instance name or autodetect it.

The instance name can be autodetected from the registry. The only problem is that you need sufficient permissions to read the appropriate registry key, which you won't have in Windows 7 or Vista if you just start the application as usual from Visual Studio. Instead you have to start Visual Studio with Run as administrator.

Here's an example class that enumerates the installed instances (yes, there can be more than one instance). It also provides the info you need to deploy and run your applications on the StreamInsight host.

using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.Win32;

namespace JAhlenExample
{
    public class StreamInsightSetupInfo
    {
        public const string RegKeyPrefix = "MSSI.";

        public static string[] EnumerateInstances()
        {
            RegistryKey masterKey;

            try
            {
                // CreateSubKey opens the key with write access, which is why
                // elevated permissions are needed here.
                masterKey = Registry.LocalMachine.CreateSubKey(@"SOFTWARE\Microsoft\Microsoft StreamInsight");
            }
            catch (UnauthorizedAccessException ex)
            {
                throw new Exception("Access denied to StreamInsight registry key. Make sure you run with high enough permissions. If executing from Visual Studio you may have to run Visual Studio as administrator.", ex);
            }

            var lst = new List<string>();

            foreach (var key in masterKey.GetSubKeyNames())
            {
                if (key.StartsWith(RegKeyPrefix))
                {
                    lst.Add(key.Substring(RegKeyPrefix.Length));
                }
            }

            return lst.ToArray();
        }

        public static string GetHostPath(string instanceName)
        {
            var masterKey = Registry.LocalMachine.CreateSubKey(@"SOFTWARE\Microsoft\Microsoft StreamInsight\" + RegKeyPrefix + instanceName);
            var dumperPath = masterKey.GetValue("StreamInsightDumperPath").ToString();

            // dumperPath points at <root>\Tools\StreamInsightDumper.exe. Strip the
            // file name, then the Tools folder, to get the installation root path
            // (the indexes line up because the first result is a prefix of dumperPath).
            var streamInsightRootPath = dumperPath.Substring(0, dumperPath.LastIndexOf('\\'));
            streamInsightRootPath = dumperPath.Substring(0, streamInsightRootPath.LastIndexOf('\\') + 1);

            return streamInsightRootPath + @"Host\";
        }

        public static string GetServiceName(string instanceName)
        {
            return "MSSI$" + instanceName;
        }

        public static string GetEndpointName(string instanceName)
        {
            return "http://localhost/StreamInsight/" + instanceName;
        }
    }
}
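
For example, here's a minimal usage sketch (my own illustration, assuming a reference to the Microsoft.ComplexEventProcessing assembly) that picks the first detected instance and connects to it in-process:

var instances = StreamInsightSetupInfo.EnumerateInstances();
if (instances.Length == 0)
    throw new Exception("No StreamInsight instance is installed.");

// Connect to an in-process server for the first instance found
using (var server = Server.Create(instances[0]))
{
    var application = server.CreateApplication("JAhlenExampleApp");
    // ... register adapters, create query templates and queries here ...
}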

Posted: Jul 19 2010, 08:23 AM by jahlen
My StreamInsight samples have been moved

I've decided to move my samples to the "official" StreamInsight sample site on CodePlex, http://streaminsight.codeplex.com. The advantage is that they will be co-located there with other useful StreamInsight samples.

My currently published samples are:

Twitter input adapter

Yahoo Finance input adapter

I'll keep updating with more samples in the future.

Posted: Jun 27 2010, 10:02 PM by jahlen
StreamInsight - how to solve the "Input string was not in a correct format" bug

Are you running StreamInsight in a non-English speaking country? Then you may have run into the exception "Input string was not in a correct format" from StreamInsight when you try to run a query that contains a non-integer number. The problem is that some countries (like Sweden) use a comma as the decimal separator instead of a dot, and that causes problems with the XML serialization/deserialization within StreamInsight. I've reported this as a bug to Microsoft and you can read more about it here.

Workaround?

Yes, I've found a workaround. You can change the CurrentCulture of your thread just before creating your query templates or calling ToQuery():

Thread.CurrentThread.CurrentCulture = CultureInfo.InvariantCulture;

After creating your query templates or instantiating your queries, you can change the culture back to the original if needed.
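
Here's a minimal sketch of the full workaround (my own illustrative code, assuming using directives for System.Threading and System.Globalization):

// Save the original culture, switch to the invariant culture while the
// query is being built, then restore the original culture afterwards.
var originalCulture = Thread.CurrentThread.CurrentCulture;
Thread.CurrentThread.CurrentCulture = CultureInfo.InvariantCulture;
try
{
    // Create your query templates and/or call ToQuery() here
}
finally
{
    Thread.CurrentThread.CurrentCulture = originalCulture;
}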

 

Posted: Jun 04 2010, 09:59 PM by jahlen
Performance optimization - T-SQL, SSIS, SQL CLR, etc - what is fastest?

Ever wondered which technology is fastest for your queries? One way to find out is to let a couple of SQL Server optimization freaks compete over who can write the fastest solution. This is what the Phil Factor SQL Speed Phreak Competition is about.

Below are the results of round 5 of the competition, the SSN matching SQL problem. It's interesting to see the results and learn new optimization ideas from the different entries.

+-------------------+-----------+----------+-----------+--------+--------+
| Name              | Time (ms) | Position |   Reads   | Result |  Type  |
+-------------------+-----------+----------+-----------+--------+--------+
| JAhlen v4         |       320 |        1 |       231 | OK     | SQLCLR |
+-------------------+-----------+----------+-----------+--------+--------+
| Daniel Ross       |       350 |        2 |       206 | OK     | SSIS   |
+-------------------+-----------+----------+-----------+--------+--------+
| JAhlen v3         |       370 |          |       231 | Error  | SQLCLR |
+-------------------+-----------+----------+-----------+--------+--------+
| Matt v2           |       391 |        3 |       231 | OK     | SQLCLR |
+-------------------+-----------+----------+-----------+--------+--------+
| Matt v3           |       426 |        4 |       237 | OK     | SQLCLR |
+-------------------+-----------+----------+-----------+--------+--------+
| Matt v1           |       538 |        5 |     1 422 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| Phil v1e          |       582 |          |     1 994 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| Matt v1           |       590 |        6 |       237 | OK     | SQLCLR |
+-------------------+-----------+----------+-----------+--------+--------+
| JAhlen v2         |       652 |        7 |       462 | OK     | SQLCLR |
+-------------------+-----------+----------+-----------+--------+--------+
| Phil v1c          |       674 |          |     1 537 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| Phil v1b          |       700 |          |     1 609 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| Peso v1           |       727 |          |    55 701 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| JAhlen v1         |       776 |        8 |       462 | OK     | SQLCLR |
+-------------------+-----------+----------+-----------+--------+--------+
| Peso 2            |       819 |          |   154 153 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| Steve 1a          |       887 |        9 |    98 500 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| Phil v1d          |       934 |          |    98 500 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| Matt v2           |       946 |       10 |    55 755 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| Phil Factor v1    |     1 900 |          |     5 362 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| Scot Hauder v1    |     3 122 |       11 |     2 464 | OK     | T-SQL  |
+-------------------+-----------+----------+-----------+--------+--------+
| Fu Bar v1         | 4 467 583 |          | 1 484 180 | OK     | CURSOR |
+-------------------+-----------+----------+-----------+--------+--------+

Basically, my SQL CLR solution was the fastest. However, it is very interesting to see that the SSIS solution was very close in speed. It also had fewer logical reads, which leads me to think that a well-written SSIS solution might be even faster than a SQL CLR solution. Maybe an idea for the next round?

It's also very interesting to see how good T-SQL is. A well-written T-SQL solution like Matt v1 is not easy to beat with SQL CLR, as you can see from my first tries (JAhlen v1 and JAhlen v2).

Here is the source code of my SQL CLR solution.

using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public partial class StoredProcedures
{
    internal struct IntArrayBucket
    {
        public int Count;
        public int[] Arr;

        public void Add(int ssn)
        {
            if (Count == 0)
                Arr = new int[6];
            else if (Count >= Arr.Length)
                Array.Resize(ref Arr, Arr.Length * 2);
            Arr[Count++] = ssn;
        }
    }

    internal struct ResultRow : IComparable<ResultRow>
    {
        public int vSSN;
        public int ueSSN;
        public byte Status;

        public ResultRow(int vssn, int uessn, byte status)
        {
            vSSN = vssn;
            ueSSN = uessn;
            Status = status;
        }

        public int CompareTo(ResultRow other)
        {
            if (this.vSSN == other.vSSN)
                return this.ueSSN.CompareTo(other.ueSSN);
            else
                return this.vSSN.CompareTo(other.vSSN);
        }
    }

    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void MatchProc()
    {
        // Validated SSNs
        int[] vSSN = new int[10000];

        // Split into evens and odds, use BitArrays for extra fast lookups
        IntArrayBucket[] vSSNEven = new IntArrayBucket[100000];
        IntArrayBucket[] vSSNOdd = new IntArrayBucket[10000];
        BitArray vSSNEvenBits = new BitArray(100000);
        BitArray vSSNOddBits = new BitArray(10000);

        // Exact matches
        int[] exactResults = new int[10000];
        int exactResultsLength = 0;

        // Matches with 1 or 2 digits difference
        ResultRow[] diffResults = new ResultRow[301];
        int diffResultsLength = 0;

        using (var connection = new SqlConnection("context connection=true"))
        {
            connection.Open();

            using (var cmd = new SqlCommand(
                "SELECT CAST([SSN] AS INT) FROM dbo.[Validated]; SELECT CAST([SSN] AS INT) FROM dbo.[UserEntered]",
                connection))
            {
                using (var rdr = cmd.ExecuteReader())
                {
                    var i = 0;
                    var vSSNlength = vSSN.Length;
                    // Loop through first result set - validated SSNs
                    while (rdr.Read())
                    {
                        if (i >= vSSNlength)
                        {
                            // Increase vSSN array size if necessary
                            Array.Resize(ref vSSN, vSSNlength * 2);
                            vSSNlength = vSSN.Length;
                        }

                        int ssn = rdr.GetInt32(0);
                        vSSN[i] = ssn;
                        var even = GetEven(ssn);
                        var odd = GetOdd(ssn);

                        vSSNEven[even].Add(ssn);
                        vSSNOdd[odd].Add(ssn);
                        vSSNEvenBits.Set(even, true);
                        vSSNOddBits.Set(odd, true);
                        i++;
                    }

                    // Adjust vSSN array size if necessary
                    if (i != vSSN.Length)
                        Array.Resize(ref vSSN, i);


                    rdr.NextResult();

                    i = 0;
                    // Loop through second result set - user entered SSNs
                    while (rdr.Read())
                    {
                        var uessn = rdr.GetInt32(0);
                        var ueeven = GetEven(uessn);
                        var ueodd = GetOdd(uessn);

                        // Check first for exact match. Linear search works well with small volumes, but
                        // should be replaced by hash or binary search for larger volumes.
                        if (vSSNEvenBits[ueeven] && vSSNOddBits[ueodd] && Array.IndexOf<int>(vSSNEven[ueeven].Arr, uessn) >= 0)
                        {
                            if (exactResultsLength >= exactResults.Length)
                            {
                                // Increase exactResults array size if necessary
                                Array.Resize(ref exactResults, exactResults.Length * 2);
                            }

                            exactResults[exactResultsLength++] = uessn;
                        }
                        else
                        {
                            // Check for 1 or 2 digit difference matches
                            if (vSSNEvenBits[ueeven])
                            {
                                var bucket = vSSNEven[ueeven];
                                for (int j = 0; j < bucket.Count; j++)
                                {
                                    var vssn = bucket.Arr[j];
                                    var status = 0;
                                    if (((vssn / 10) % 10) != ueodd % 10)
                                        status++;
                                    if (((vssn / 1000) % 10) != ((ueodd / 10) % 10))
                                        status++;
                                    if (((vssn / 100000) % 10) != ((ueodd / 100) % 10))
                                        status++;
                                    if (status > 2)
                                        continue;
                                    if (((vssn / 10000000) % 10) != ((ueodd / 1000) % 10))
                                        status++;

                                    if (status <= 2)
                                    {
                                        // Increase array sizes if necessary
                                        if (diffResultsLength >= diffResults.Length)
                                        {
                                            Array.Resize(ref diffResults, diffResults.Length * 2);
                                        }

                                        diffResults[diffResultsLength].vSSN = vssn;
                                        diffResults[diffResultsLength].ueSSN = uessn;
                                        diffResults[diffResultsLength].Status = (byte)status;
                                        diffResultsLength++;
                                    }
                                }
                            }
                            if (vSSNOddBits[ueodd])
                            {
                                var bucket = vSSNOdd[ueodd];
                                for (int j = 0; j < bucket.Count; j++)
                                {
                                    var vssn = bucket.Arr[j];
                                    var status = 0;
                                    if ((vssn % 10) != ueeven % 10)
                                        status++;
                                    if (((vssn / 100) % 10) != ((ueeven / 10) % 10))
                                        status++;
                                    if (((vssn / 10000) % 10) != ((ueeven / 100) % 10))
                                        status++;
                                    if (status > 2)
                                        continue;
                                    if (((vssn / 1000000) % 10) != ((ueeven / 1000) % 10))
                                        status++;
                                    if (status > 2)
                                        continue;
                                    if (((vssn / 100000000) % 10) != ((ueeven / 10000) % 10))
                                        status++;

                                    if (status <= 2)
                                    {
                                        // Increase array sizes if necessary
                                        if (diffResultsLength >= diffResults.Length)
                                        {
                                            Array.Resize(ref diffResults, diffResults.Length * 2);
                                        }

                                        diffResults[diffResultsLength].vSSN = vssn;
                                        diffResults[diffResultsLength].ueSSN = uessn;
                                        diffResults[diffResultsLength].Status = (byte)status;
                                        diffResultsLength++;
                                    }
                                }
                            }
                        }
                    }

                    // Sorts the 1 or 2 digit difference matches.
                    Array.Sort<ResultRow>(diffResults, 0, diffResultsLength);

                    if (diffResultsLength >= diffResults.Length)
                    {
                        Array.Resize(ref diffResults, diffResults.Length * 2);
                    }
                    // Add an extra value at end of array that is larger than every vSSN
                    diffResults[diffResultsLength].vSSN = 1000000000;
                }
            }
        }

        // Prepare output
        SqlDataRecord record = new SqlDataRecord(new SqlMetaData("vSSN", SqlDbType.Char, 9),
            new SqlMetaData("ueSSN", SqlDbType.Char, 9), new SqlMetaData("Status", SqlDbType.TinyInt));
        SqlContext.Pipe.SendResultsStart(record);

        char[] buffer = new char[9];

        // Now that we have two sorted lists (exactResults and diffResults), we can
        // easily merge them together and output.

        var diffIndex = 0;
        var exactIndex = 0;

        var nextDiffvSSN = diffResults[diffIndex].vSSN;
        while (exactIndex < exactResultsLength)
        {
            var vssn = exactResults[exactIndex];

            if (vssn < nextDiffvSSN)
            {
                Int32ToSqlChars(buffer, vssn);
                record.SetChars(0, 0, buffer, 0, 9);
                record.SetChars(1, 0, buffer, 0, 9);
                record.SetByte(2, (byte)0);

                SqlContext.Pipe.SendResultsRow(record);
                exactIndex++;
            }
            else
            {
                SortedDictionary<int, byte> results = new SortedDictionary<int, byte>();
                if (vssn == nextDiffvSSN)
                {
                    results.Add(vssn, 0);
                    exactIndex++;
                }

                while (diffResults[diffIndex].vSSN == nextDiffvSSN)
                {
                    results.Add(diffResults[diffIndex].ueSSN, diffResults[diffIndex].Status);
                    diffIndex++;
                }

                foreach (var result in results)
                {
                    Int32ToSqlChars(buffer, nextDiffvSSN);
                    record.SetChars(0, 0, buffer, 0, 9);
                    Int32ToSqlChars(buffer, result.Key);
                    record.SetChars(1, 0, buffer, 0, 9);
                    record.SetByte(2, result.Value);
                    SqlContext.Pipe.SendResultsRow(record);
                }

                nextDiffvSSN = diffResults[diffIndex].vSSN;
            }
        }

        while (diffIndex < diffResultsLength)
        {
            Int32ToSqlChars(buffer, diffResults[diffIndex].vSSN);
            record.SetChars(0, 0, buffer, 0, 9);
            Int32ToSqlChars(buffer, diffResults[diffIndex].ueSSN);
            record.SetChars(1, 0, buffer, 0, 9);
            record.SetByte(2, diffResults[diffIndex].Status);
            SqlContext.Pipe.SendResultsRow(record);
            diffIndex++;
        }

        // End
        SqlContext.Pipe.SendResultsEnd();
    }

    private static void Int32ToSqlChars(char[] buffer, int ssn)
    {
        buffer[0] = (char)('0' + (ssn / 100000000));
        buffer[1] = (char)('0' + ((ssn / 10000000)) % 10);
        buffer[2] = (char)('0' + ((ssn / 1000000)) % 10);
        buffer[3] = (char)('0' + ((ssn / 100000)) % 10);
        buffer[4] = (char)('0' + ((ssn / 10000)) % 10);
        buffer[5] = (char)('0' + ((ssn / 1000)) % 10);
        buffer[6] = (char)('0' + ((ssn / 100)) % 10);
        buffer[7] = (char)('0' + ((ssn / 10)) % 10);
        buffer[8] = (char)('0' + (ssn % 10));
    }

    private static Int32 SqlCharsToInt32(SqlChars sqlChars)
    {
        var buf = sqlChars.Buffer;
        return (buf[0] - '0') * 100000000 + (buf[1] - '0') * 10000000 + (buf[2] - '0') * 1000000 + (buf[3] - '0') * 100000 + (buf[4] - '0') * 10000 + (buf[5] - '0') * 1000 + (buf[6] - '0') * 100 + (buf[7] - '0') * 10 + (buf[8] - '0');
    }

    private static int GetEven(int num)
    {
        return (num % 10) +
               ((num / 100) % 10) * 10 +
               ((num / 10000) % 10) * 100 +
               ((num / 1000000) % 10) * 1000 +
               ((num / 100000000) % 10) * 10000;
    }

    private static int GetOdd(int num)
    {
        return ((num / 10) % 10) +
               ((num / 1000) % 10) * 10 +
               ((num / 100000) % 10) * 100 +
               ((num / 10000000) % 10) * 1000;
    }
};
 

Here are some hints on how to write fast SQL CLR solutions (see the sketch after this list for an illustration of the first and fourth points):

  • Try to work with primitive types as much as possible, since .NET is optimized for them. An array of integers is significantly faster than, for example, the collection classes. An exception is the BitArray class, which performs very well.
  • Avoid properties, as they are slower than working with fields or local variables directly.
  • String operations are much slower than integer operations, so work with integers if possible.
  • Do not use "foreach". It is faster to use "for" instead.
  • Be clever and make sure your procedure/function reaches its result with minimum effort. Use hash functions, bitmap indexes, binary searches, etc.
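
As an illustration of the first and fourth points, here is a small stand-alone micro-benchmark (my own sketch, not part of any competition entry) that sums ten million integers with "foreach" over a List<int> and with "for" over an int[]; the array/"for" version is typically noticeably faster:

using System;
using System.Collections.Generic;
using System.Diagnostics;

class LoopBenchmark
{
    static void Main()
    {
        const int n = 10000000;
        var list = new List<int>(n);
        var array = new int[n];
        for (int i = 0; i < n; i++)
        {
            list.Add(i);
            array[i] = i;
        }

        // foreach goes through the list's enumerator
        var sw = Stopwatch.StartNew();
        long listSum = 0;
        foreach (var x in list)
            listSum += x;
        sw.Stop();
        Console.WriteLine("foreach over List<int>: {0} ms (sum {1})", sw.ElapsedMilliseconds, listSum);

        // for indexes the array directly
        sw = Stopwatch.StartNew();
        long arraySum = 0;
        for (int i = 0; i < array.Length; i++)
            arraySum += array[i];
        sw.Stop();
        Console.WriteLine("for over int[]:         {0} ms (sum {1})", sw.ElapsedMilliseconds, arraySum);
    }
}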

Why not try your skills on the next round of the competition?

 

SSD - a great performance booster for tired laptops

Tired of waiting minutes for your laptop to boot? I was, until now... Having replaced my old hard disk with an SSD, my laptop has become blazing fast!

Below are my computer's performance scores. My new hard disk has a score of 7.7 and is now by far the fastest component in my computer.

The disadvantage is the disk size. I've now got only 160 GB in my laptop while my previous hard disk had 320 GB. However I still have my old hard disk as secondary storage...

SSD disks come in two kinds, MLC and SLC, where SLC has the better performance and durability. I purchased an MLC disk and paid about $500 for 160 GB. An SLC disk of 128 GB would have cost $1700, so they are not as good value for money in my opinion. Here's a photo of my SSD disk. Doesn't it look nice? ;-)

I've made some configuration changes since I switched to SSD. I've turned off defragmentation, since there is no point in defragmenting SSD disks. It would only decrease their lifetime. I've also placed everything - temp files, the paging file and database files - on the same partition, since there should no longer be any gain in splitting them across partitions. If anyone thinks I am wrong, tell me...

I can absolutely recommend changing to an SSD disk in your laptop, but in a desktop computer I would prefer a combination of SSD and traditional disk.

Posted: Apr 25 2010, 09:07 PM by jahlen | with 4 comment(s)
A volcano presentation at Microsoft BI Conference

Sometimes things turn out quite differently than expected. I was looking forward to visiting the Microsoft BI Conference in Stockholm and listening to Rafal Lukawiecki, but ended up doing a presentation there myself...

The background is that Rafal is doing a roadshow on Microsoft Business Intelligence and presented in Croatia just two days ago. With half of Europe's airports closed because of the volcanic ash, there was little chance that any plane would take him to Sweden in time for the BI Conference. So he made a most amazing journey by trains and taxis all across Europe to get here. I must say I admire his determination and good sense of humor as he kept emailing status reports. Crazy problems you would never imagine, like a taxi driver who didn't have enough money for fuel.

What to do? Well, Microsoft made a plan B, a "Volcano programme", to give Rafal some time to get to the conference. We prepared a couple of presentations yesterday evening and I did a 30 minute presentation on StreamInsight - for real-time business intelligence.

Another surprise today... Microsoft just announced the launch of SQL Server 2008 R2. Watch their launch site here!

StreamInsight resources

New to StreamInsight? Then you could have a look at the following resources to learn more!

StreamInsight product information page

PDC 2009 sessions on StreamInsight

The StreamInsight team blog

StreamInsight Training Kit

The StreamInsight forum on MSDN

My SQLServerCentral articles on StreamInsight

My StreamInsight samples

 

Posted: Apr 18 2010, 04:56 PM by jahlen | with 1 comment(s)
Miracle Open World - the only conference with 80% serious stuff and 80% fun

Miracle Open World is unique - it's the only conference with 80% serious (technical) stuff and 80% fun (drinking). How do you manage the 160%? You skip the sleep...

Although this conference was very social, it also had a lot of really good speakers. The Oracle track seemed to be the strongest, although I listened to only one Oracle session. There was also a SQL Server track, where I spent most of my time and also did a presentation, and a Microsoft .NET track. The first speaker of the SQL Server track was Henk van der Valk, who showed some real high-end ETL processing and how to, for instance, maximize throughput in SSIS. Thomas Grohser talked about SQL Server storage and I/O, and how to manage a 24x7 database environment with an extremely high workload. I really did enjoy their presentations. Henrik Sjang Davidsen and Martin Schmidt also gave good presentations: Henrik on how to detect unused objects in SQL Server through auditing, and Martin on how to audit SSIS packages in a production environment using DTLoggedExec.

For the keynote they'd brought in Jonathan Lewis, who was a new name to me but obviously is the number one Oracle speaker in the world. That's why they asked him to speak about SQL Server!!! Well, he also spoke about Oracle later.

Jonathan actually gave a really good talk about SQL Server. I must say I'm impressed by how much he'd learned about SQL Server in his 45 hours of preparation. I shared a car with Jonathan later, had a good discussion with him and suggested that he should try to become a SQL Server MVP! :-)

The conference was held at Hotel Legoland in Billund, Denmark. Most of the participants stayed at Lalandia just next to the hotel, which was a very nice place and also where the beach party (at the aquadome in the middle of the night) was held. The manager of Miracle declared his house at Lalandia to be the "party house", a role which in previous years had usually been assigned to new employees instead...

I'd like to thank Jakob Tikjøb Andersen at Miracle for inviting me as a speaker. I truly did enjoy the conference!

By the way, Miracle is not only an IT company. They also own a brewery.

 

Blindfolded piano

Just added a YouTube video to my About page of me playing my piano, blindfolded. Enjoy!

Posted: Mar 31 2010, 05:58 PM by jahlen
I will speak about StreamInsight at Miracle Open World in Denmark

Beat the stock market with StreamInsight

StreamInsight is one of the major new features of SQL Server 2008 R2. It's a platform for Complex Event Processing (CEP) that enables processing of high-speed event streams using familiar tools such as Visual Studio, C# and LINQ. CEP is a technology increasingly used for algorithmic trading, targeted ads on websites and real-time Business Intelligence. In this session, you will get a 60-minute introduction to how StreamInsight works and a demo of how to build an application that handles data in near real-time from sources such as Twitter, Facebook and financial markets.

Miracle Open World will be held on the 14th-16th of April in Billund, Denmark. For more info and registration, see this link:
http://mow2010.dk/

 
