Sunday, September 30, 2007

Practical SQL Server 2005 CLR Assemblies

Consuming a Web Service from a SQL Server 2005 CLR Assembly

One advantage of CLR assemblies is the ability to consume web services from within the database. This wouldn’t be easy with T-SQL, and would also require a lot of work in an unmanaged extended stored procedure. With .NET, it’s almost as simple as accessing a local DLL. There are just a couple of extra command-line utilities we need to run to be able to access web services from within a SQL assembly:

  • wsdl.exe. This utility examines the WSDL file for the web service and generates from it a source code file that performs the actual web access. We simply compile this file into our assembly and we can use the classes in it to access the web service just like a local assembly. If you use Visual Studio instead of the command-line compiler, you can omit this step, and instead simply add a web reference to your project in the VS IDE.
  • sgen.exe. By default, .NET web services use dynamic XML serialization to encode .NET types for transmission over the web. Dynamic code generation would be a security liability within SQL Server, and therefore isn’t permitted for SQL assemblies. The sgen.exe utility is used to generate the XML serialization code for an assembly before installing it into SQL Server. At the time of writing, this step must be performed at the command prompt, and isn’t available within the VS IDE.

NOTE: These tools are supplied only with the .NET SDK and Visual Studio 2005 - they aren’t shipped with the redistributable version of the .NET Framework that comes with SQL Server 2005. So, if you don’t have a copy of Visual Studio 2005, you’ll need to download the full version of the .NET 2.0 SDK.

For this example, we’ll use an existing public web service: the Terraserver-USA web service, which returns information about locations in the USA as well as aerial images provided by the US Geological Survey (USGS). You can find out more about Terraserver-USA at http://www.terraserver-usa.com.

Our example resides in the AdventureWorks database, and it consists of a stored procedure that takes as a parameter the ID for a row in the Person.Address table. It uses this value to look up the row and select the city name, state or province, and country for the address. It then passes this information to the Terraserver-USA web service and retrieves a list of matching places. For each of these, it queries the web service again to retrieve the image associated with the place, and then inserts a new row into a table in the database, CityDetails, containing the binary data for the image and other information about the place.

This is admittedly code that could be placed outside the database, in an external business object written in .NET, but there are two advantages to creating this as a SQL assembly. Firstly, having the code inside the database allows it to be called from within other stored procedures, triggers, and so on. This means that it can be called, for example, whenever a new row is inserted into the Person.Address table. Secondly, the assembly performs several database queries, so we reduce network traffic by performing this within the database itself.

At first sight, this looks like code that might be more usefully placed in a table-valued UDF to enable reuse. Unfortunately, however, the tables returned from TVFs aren’t permitted to contain columns of type image. Columns of type varbinary are allowed, but this data type simply isn’t big enough for our purposes.

Storing Data from the Web Service

Before we get down to writing the code, let’s first create the CityDetails table where we’ll store the data from the web service. This should be created in the AdventureWorks database, and in the Person schema:

USE AdventureWorks;
GO
 
CREATE TABLE Person.CityDetails
(
   CityDetailsID int IDENTITY PRIMARY KEY,
   AddressID     int FOREIGN KEY REFERENCES Person.Address(AddressID),
   Name          nvarchar(256),
   Latitude      float,
   Longitude     float,
   Population    int,
   Image         varbinary(max),
   CONSTRAINT UNQ_NameLatLong UNIQUE (Name, Latitude, Longitude)
);
GO

The only major point to note here is that, to avoid duplicate data being entered, we’ll add a UNIQUE constraint based on the longitude and latitude coordinates of the city and the city name, as many places in the Terraserver-USA database seem to be duplicated. Specifying all three of these for the constraint will allow near duplicates to be entered (e.g. Seattle Center beside Seattle), and also ensure that different places that share the same name can be entered.


Writing the .NET Code


The next task is to write the .NET code that implements our stored procedure. In this case, with stunning originality, we’ve called the source code file WebServiceExample.cs. As usual, we start with the using directives; the only namespaces we need to import are the usual suspects for SQL Server assemblies (System, the System.Data namespaces, and Microsoft.SqlServer.Server):

using System;
using System.Data;
using System.Data.Sql;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

The code for the stored procedure will be contained in a single method, GetCityData. This takes one parameter - the ID of the address for which we want to get the city details. We’ll start by getting the city, state, and country for this address via the in-process provider, so we first need to open up the context connection to SQL Server:

namespace Apress.SqlAssemblies.Chapter11
{
   public class WebServiceExample
   {
      [SqlProcedure]
      public static void GetCityData(int addressID)
      {
         using (SqlConnection cn = new SqlConnection("context connection=true"))
         {
            cn.Open();

Next, we need to build the SQL command to perform this query. The city, state, and country are all in different tables in the AdventureWorks database, so our query contains a couple of inner joins. We also need to add a single parameter to the command - the ID of the address we want the data for. We won’t be reusing this command, so we simply add the parameter and its value at the same time by calling the SqlParameterCollection’s AddWithValue method. Once we’ve done this, we call the SqlCommand.ExecuteReader method to retrieve the data from the database:

   string selectQuery = @"SELECT a.City, s.Name As State, c.Name As Country
                          FROM Person.Address a
                          INNER JOIN Person.StateProvince s
                          ON a.StateProvinceID = s.StateProvinceID
                             INNER JOIN Person.CountryRegion c
                             ON s.CountryRegionCode = c.CountryRegionCode
                          WHERE a.AddressID = @addressID";
   SqlCommand selectCmd = new SqlCommand(selectQuery, cn);
   selectCmd.Parameters.AddWithValue("@addressID", addressID);
   SqlDataReader reader = selectCmd.ExecuteReader();

Now that we have the data in a SqlDataReader, we need to extract the names of the city, state, and country into local variables. First we check that the supplied address ID did match an address in the database, and so that the reader contains data. If it does, we call Read to move to the first row, and then get the data from the three columns. The query should return only a single row, so we don’t need to call Read more than once. After we’ve done that, we close the SqlDataReader, as it’s no longer needed:

         if (reader.HasRows)
         {
            reader.Read();
            string city = (string)reader[0];
            string state = (string)reader[1];
            string country = (string)reader[2];
            reader.Close();

Once we have this information, we concatenate it into a single string (separated by commas); this is the form in which we’ll pass the data to the web service. Then we’ll instantiate the web service and call its GetPlaceList method. This method takes the name of the city (the string we’ve just constructed), the maximum number of entries to return, and a Boolean parameter to indicate whether or not only entries with an associated image are to be returned, and it returns an array of PlaceFacts objects. The PlaceFacts struct is a custom type used by the web service, and the code for this class will be generated when we run wsdl.exe on the Terraserver-USA web service.
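
As a rough sketch, this step might look something like the following (the exact place-name format and the limit of 20 results are illustrative assumptions; TerraService, PlaceFacts, and GetPlaceList come from the proxy code that wsdl.exe generates, as described later):

         // Build the "city, state, country" string and ask the web service for
         // up to 20 matching places that have an associated image.
         string place = city + ", " + state + ", " + country;
         TerraService terraService = new TerraService();
         PlaceFacts[] places = terraService.GetPlaceList(place, 20, true);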


NOTE: As we’re calling wsdl.exe from the command-line, we can place the TerraService class in the same namespace as the rest of our code. However, if you’re using VS, it will by default be placed in the namespace com.terraserver-usa.www, so you’ll need to either add a using directive for this namespace at the start of the code, or fully qualify the names of all the Terraserver-specific types.


Next we’ll create the command to insert a new row into the CityDetails table. This command will be executed for each PlaceFacts object in the places array that we got back from our first call to the web service. As we’ll be calling this multiple times with different values, we need to set up SqlParameter objects for each column value that we’ll be inserting into the table, and then add these to the SqlCommand object through the Parameters.AddRange method. The one value that will always be the same is the associated address ID from the Person.Address table, so we can set this straight away:

          string insertQuery = @"INSERT INTO Person.CityDetails
                                 VALUES (@addressID, @name, @longitude,
                                         @latitude, @population, @image)";
          SqlCommand insertCmd = new SqlCommand(insertQuery, cn);
          SqlParameter addressIDParam = new SqlParameter("@addressID",
                                                         SqlDbType.Int);
          SqlParameter nameParam = new SqlParameter("@name",
                                                    SqlDbType.NVarChar, 256);
          SqlParameter longParam = new SqlParameter("@longitude",
                                                    SqlDbType.Float);
          SqlParameter latParam = new SqlParameter("@latitude",
                                                   SqlDbType.Float);
          SqlParameter popParam = new SqlParameter("@population",
                                                   SqlDbType.Int);
          SqlParameter imgParam = new SqlParameter("@image", SqlDbType.Image);
          insertCmd.Parameters.AddRange(new SqlParameter[] { addressIDParam,
                      nameParam, longParam, latParam, popParam, imgParam });
         addressIDParam.Value = addressID;

The other parameter values will vary for each of the PlaceFacts objects in the array, so we iterate through these, retrieve the information and call the insert command for each one. The PlaceFacts struct has a property called Center, which returns a LonLatPt object that encapsulates the longitude and latitude of the center of the place that the PlaceFacts represents. From this, we can retrieve the longitude and latitude as floating-point numbers, so we’ll use these to set the values of two of our parameters to the insert command.
We can also use this LonLatPt to find out which image we need to download, as the image data itself isn’t included in the PlaceFacts object. Each image is regarded as a tile in the map of the USA, so we need to find out which tile we want. To do this, we call the GetTileMetaFromLonLatPt method, which takes three parameters:


  • The LonLatPt that we want the tile for
  • The type of image we want, as an integer from 1 to 4
  • The scale of the image as a Scale enum value.

Here we’ve gone for image type 1 (aerial photograph), and a scale of Scale8m, which is the highest resolution available for non-subscribers. This returns an object of type TileMeta, containing the metadata for the selected tile, which we can use to get the image data itself. The TerraService has a method called GetTile, which takes as its parameter the ID for the tile we want to retrieve, and we can pass into this the value of the Id property of our TileMeta object. This returns the binary image data as a byte array.

We also get a couple of other bits of information from the PlaceFacts object - the population of the place (although this has always been zero in the places I’ve seen), and the name of the place from the City property of the Place object associated with the PlaceFacts (the Place struct simply contains string properties returning the city, state, and country of the place). Once we’ve used this information to set the values of the parameters to our insert command, we simply execute the command:

      foreach (PlaceFacts facts in places)
      {
         LonLatPt coords = facts.Center;
         TileMeta metadata = terraService.GetTileMetaFromLonLatPt(coords,
                                                        1, Scale.Scale8m);
         byte[] image = terraService.GetTile(metadata.Id);
         nameParam.Value = facts.Place.City;
         longParam.Value = coords.Lon;
         latParam.Value = coords.Lat;
         popParam.Value = facts.Population;
         imgParam.Value = image;
         insertCmd.ExecuteNonQuery();
      }

Once we’ve finished iterating through the PlaceFacts array and inserting rows into the table, we clean up the resources and send a message to the user:

         SqlContext.Pipe.Send("Command executed successfully.");
         terraService.Dispose();
      }

If no rows were found matching the supplied AddressID in our original database query, we’ll also clean up resources and send a message to the user informing them of this fact:

            else
            {
               reader.Close();
               SqlContext.Pipe.Send(
                      "No addresses in the database match the specified ID.");
            }
         }
      }
   }
}

Generating the Custom Types used by the Web Service


That completes the .NET code that we’ll be writing for this example. However, before we compile it, we need to run wsdl.exe to generate the custom types that are used by the web service, including the TerraService class itself. The wsdl command-line tool takes as input the Web Service Description Language (WSDL) file that defines the web service (in the case of .NET web services, this is the .asmx file). In our case, we also want to pass in a couple of other options:


  • /o. The name of the source code file that will be generated.
  • /n. The namespace that the code will be placed in.

Because the default language for the generated source code is C#, we don’t need to specify the language. However, if you want the code to be in another language, you’ll also have to include the /l option. The possible values for this are CS (C#), VB (Visual Basic), JS (JScript), or VJS (J#).

Our call to wsdl therefore looks like this:

wsdl /o:TerraService.cs /n:Apress.SqlAssemblies.Chapter11 
http://www.terraserver-usa.com/TerraService2.asmx

This will generate a C# source code file called TerraService.cs, and you should now see the message:

Writing file 'TerraService.cs'.

Compiling the Code


The next step is to compile our .NET code into a DLL assembly, including the TerraService.cs file we’ve just generated:

csc /t:library WebServiceExample.cs TerraService.cs

For a normal .NET assembly not hosted in SQL Server, we wouldn’t need to do any more than this. However, as we noted at the start of this example, for security reasons SQL Server assemblies aren’t allowed to use the default dynamically generated XML serialization assembly, and we need to generate a static serialization assembly, by calling the command-line tool sgen.exe:

sgen /a:WebServiceExample.dll

The only option we need is /a, which indicates the name of the assembly we want to generate the XML serialization classes for. This generates a serialization assembly with the name WebServiceExample.XmlSerializers.dll. If you want to overwrite an existing serialization assembly, you can use the /f option to force sgen to overwrite an existing file; if this option isn’t included and the file already exists, sgen will throw an error.
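
For example, after recompiling you might regenerate the serialization assembly, overwriting the previous one, with:

sgen /a:WebServiceExample.dll /f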


NOTE: If you recompile the assembly for any reason, you need to run sgen on it again before redeploying the assemblies to SQL Server.


Deploying the Assemblies


We’re now ready to deploy these assemblies (WebServiceExample and WebServiceExample.XmlSerializers) to SQL Server. The WebServiceExample assembly needs to be deployed first, as the serialization assembly references it (if you deploy the serialization assembly first, it creates a hidden copy of the WebServiceExample assembly, which you then can’t use to create the stored procedure).
The code generated by wsdl contains synchronization attributes, and therefore the assembly must be installed with the UNSAFE permission set:

CREATE ASSEMBLY WebServiceExample
FROM 'C:\Apress\SqlAssemblies\Chapter11\WebServiceExample\WebServiceExample.dll'
WITH PERMISSION_SET = UNSAFE;
GO

Next we install the serialization assembly. This can be installed with the SAFE permission set:

CREATE ASSEMBLY [WebServiceExample.XmlSerializers]
FROM 'C:\Apress\SqlAssemblies\Chapter11\WebServiceExample\WebServiceExample.XmlSerializers.dll'
WITH PERMISSION_SET = SAFE;
GO

Finally, create the CLR stored procedure:

CREATE PROCEDURE uspGetCityData(@addressID int)
AS
EXTERNAL NAME WebServiceExample.[Apress.SqlAssemblies.Chapter11.WebServiceExample].GetCityData
GO

Testing the Example


To test the example, run the stored procedure with an appropriate address ID from the Person.Address table, and then check the contents of the CityDetails table:

EXEC uspGetCityData 3
SELECT * FROM Person.CityDetails

After running the CLR stored procedure, you should see that a couple of rows have been added to the CityDetails table.


Since viewing the image data in hex format isn’t very exciting, the code download contains the City Image Viewer - a small Windows application written in .NET 2.0 that you can use to view the images in this table; please see the readme.txt file for information on configuring this.

Wednesday, September 19, 2007

Making the case for Full Text Search

Use Case for Full Text Search

On a recent project, one of the needs was to search by a few different sets of criteria spread across a number of tables with one-to-many relationships.  Most of the tables had millions of rows of data, with the largest table having over 12 million rows.  The data was primarily read-only and updated on a monthly basis.  Unfortunately, all of the queries had to access the 12 million row table with 1 to 5 joins to the child tables and 1 to 5 WHERE clauses.  The data needed to be returned as quickly as possible with the highest level of concurrency the existing hardware resources could support.

In this scenario, a number of test cases were built with a few different T-SQL coding techniques, including EXCEPT and INTERSECT queries and Full Text Catalogs queried with the CONTAINS command.

Based on the testing conducted (queries, data, concurrency, etc.), the EXCEPT and INTERSECT syntax was the best T-SQL option for statements with a single JOIN and/or a single WHERE clause, which was considered a rare condition.  The best option for statements with numerous JOINs and numerous WHERE clauses, which was considered the norm, was the Full Text Catalog with the CONTAINS command.  This was a little counterintuitive initially, but the Full Text Catalog was the most efficient as we tested each scenario and reviewed the cost of the query plans.  However, this may not be the case for your data and queries, so test thoroughly.
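
As a rough illustration of the INTERSECT style that was tested, here is a minimal sketch against the AdventureWorks sample tables rather than the project's actual schema:

USE AdventureWorks;
GO
-- Rows that satisfy both criteria; each branch of the INTERSECT can use its own index.
SELECT ProductID FROM Production.Product WHERE Color = 'Black'
INTERSECT
SELECT ProductID FROM Production.Product WHERE ListPrice > 50;
GO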

*** NOTE *** - As a point of reference, the testing was conducted with the DBCC DROPCLEANBUFFERS command issued in between statement executions to ensure cached results would not unfairly benefit subsequent queries.

Setting up a Full Text Catalog

Since Full Text Catalogs may be new to some DBAs and Developers, let's walk through an example of setting up a Full Text Catalog for the AdventureWorks database (SQL Server 2005 sample database):

Creation - To create a new Full Text Catalog, navigate to root | Databases | AdventureWorks (Database Name) | Storage | Full Text Catalogs, then right click on the Full Text Catalogs folder and select the 'New Full-Text Catalog...' option.

On the New Full-Text Catalog interface the following settings should be configured:

  • Full-text catalog name
  • Catalog location
  • Filegroup
  • Owner
  • Accent Sensitivity

Corresponding Script

USE [AdventureWorks]
GO
CREATE FULLTEXT CATALOG [zProduction.Product]
IN PATH N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\FTData'
WITH ACCENT_SENSITIVITY = ON
AUTHORIZATION [dbo]
GO

Configuration - To configure the Full Text Catalog (tables, columns, schedule, etc.) navigate to root | Databases | AdventureWorks (Database Name) | Storage | Full Text Catalogs | Catalog Name (i.e. zProduction.Product in our example) and select the 'Properties' option.

On the Full Text Catalog Properties interface the following settings should be configured:

  • General configurations
  • Tables\views with the corresponding columns
  • Population Schedule

General Page - This interface corresponds primarily to the catalog that was set up in the previous steps, with the ability to indicate whether the catalog is the default, who the catalog owner is, and whether the catalog is accent sensitive.

Tables/Views Page - This page provides the opportunity to choose the tables and columns to include in the Full Text Catalog; multiple tables and columns can be included.  In the lower portion of the interface, it is important to note the unique index for each table, the language, and the track changes setting.  For more information about the track changes setting, reference the Rebuilding the Full Text Catalog section below.

Population Schedule - This page lets you define a SQL Server Agent schedule for populating the catalog; the corresponding Agent job and schedule appear in the script below.

Corresponding Script

USE [AdventureWorks]
GO
CREATE FULLTEXT INDEX ON [Production].[Product] KEY INDEX [PK_Product_ProductID] ON [zProduction.Product] WITH CHANGE_TRACKING AUTO
GO

USE [AdventureWorks]
GO
ALTER FULLTEXT INDEX ON [Production].[Product] ADD ([Name])
GO

USE [AdventureWorks]
GO
ALTER FULLTEXT INDEX ON [Production].[Product] ENABLE
GO

USE [msdb]
GO
DECLARE @jobId BINARY(16)
EXEC msdb.dbo.sp_add_job @job_name=N'Start Optimize Catalog Population on AdventureWorks.zProduction.Product',
@enabled=1,
@start_step_id=1,
@description=N'Scheduled full-text optimize catalog population for full-text catalog zProduction.Product in database AdventureWorks. This job was created by the Full-Text Catalog Scheduling dialog or Full-Text Indexing Wizard.',
@category_name=N'Full-Text', @job_id = @jobId OUTPUT
select @jobId
GO

EXEC msdb.dbo.sp_add_jobserver @job_name=N'Start Optimize Catalog Population on AdventureWorks.zProduction.Product', @server_name = N'JTKLAPTOP'
GO

USE [msdb]
GO
DECLARE @schedule_id int
EXEC msdb.dbo.sp_add_jobschedule @job_name=N'Start Optimize Catalog Population on AdventureWorks.zProduction.Product', @name=N'zProduction.Product Full Text Catalog ',
@enabled=1,
@freq_type=4,
@freq_interval=1,
@freq_subday_type=1,
@freq_subday_interval=0,
@freq_relative_interval=0,
@freq_recurrence_factor=1,
@active_start_date=20070924,
@active_end_date=99991231,
@active_start_time=20000,
@active_end_time=235959,
 @schedule_id = @schedule_id OUTPUT

select @schedule_id
GO

USE [msdb]
GO
EXEC msdb.dbo.sp_add_jobstep @job_name=N'Start Optimize Catalog Population on AdventureWorks.zProduction.Product', @step_name=N'Full-Text Indexing',
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=1,
@on_success_step_id=-1,
@on_fail_action=2,
@on_fail_step_id=-1,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0,
 @subsystem=N'TSQL',
@command=N'USE [AdventureWorks]
ALTER FULLTEXT CATALOG [zProduction.Product] REORGANIZE',
@database_name=N'master'
GO

Querying with the Full Text Catalog

Here are a few Full Text Catalog query examples using the CONTAINS command against the AdventureWorks database, as a point of reference:

Product ID and Product Name Selection

USE AdventureWorks;
GO
SELECT ProductID, [Name]
FROM Production.Product
WHERE CONTAINS([Name], '"*washer*" OR "*ball*"');
GO

Description Selection

USE AdventureWorks;
GO
SELECT ProductDescriptionID, Description
FROM Production.ProductDescription
WHERE CONTAINS(Description, '"*technology*" OR "*performance*"');
GO

Rebuilding the Full Text Catalog

One item to keep in mind is that the Full Text Catalogs need to be rebuilt automatically, on a schedule, or on an as-needed basis in order to include all of the data updated since the initial population.  Please reference the options below, which correspond to the track changes setting on the Tables/Views page described above:

  • Automatic - The data in the full-text index is automatically updated as the data in the corresponding table is inserted, updated or deleted. 
  • Manual - In this scenario, when the indexed data is inserted, updated or deleted, SQL Server will track the change, but not update the index as is the case with the automatic option. A SQL Server Agent job needs to run in order to update the index.
  • Do not track changes - When the indexed data is inserted, updated or deleted, SQL Server will not track the change and the index must be rebuilt to reflect all of the underlying data changes.

Each of these options needs to be understood in terms of data availability versus performance.  If the data in the Full Text Catalog always needs to be up to date at any cost, use the automatic setting.  If data changes need to be recorded but should not update the Full Text indexes immediately, use the manual setting; this balances some of the performance needs by rebuilding the Full Text indexes during a low-usage period.  If the data is updated in bulk on a regular basis, then not tracking changes is probably appropriate, with the understanding that the Full Text index will need to be updated following the bulk loading of the data.
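
As a point of reference, with the manual and 'do not track changes' settings the index can be brought up to date with T-SQL along these lines (a sketch using the Production.Product full text index created above):

USE [AdventureWorks]
GO
-- Manual change tracking: apply the changes SQL Server has tracked so far.
ALTER FULLTEXT INDEX ON [Production].[Product] START UPDATE POPULATION
GO
-- No change tracking: repopulate the index from scratch, e.g. after a bulk load.
ALTER FULLTEXT INDEX ON [Production].[Product] START FULL POPULATION
GO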

Compress Asp.net Ajax Web Service Response - Save Bandwidth

In this post, I will show you how to compress the ASP.NET AJAX web service response. To understand the benefits of compression, let's start with a simple example. Consider a web service which returns a large amount of data, like the following:

[WebMethod()]
public string GetLargeData()
{
    using (StreamReader sr = File.OpenText(Server.MapPath("~/DataFile.txt")))
    {
        return sr.ReadToEnd();
    }
}

The web method reads a large text file (around 100KB) and returns its contents. Once we call this method from a page, the network activity shown in Firebug looks like the following:

[Firebug screenshot: plain (uncompressed) response]

Now, let's examine the HttpModule which compresses the AJAX web service response. The following shows the complete code of this module:

using System;
using System.IO;
using System.IO.Compression;
using System.Globalization;
using System.Web;

public class JsonCompressionModule : IHttpModule
{
    public JsonCompressionModule()
    {
    }

    public void Dispose()
    {
    }

    public void Init(HttpApplication app)
    {
        app.PreRequestHandlerExecute += new EventHandler(Compress);
    }

    private void Compress(object sender, EventArgs e)
    {
        HttpApplication app = (HttpApplication)sender;
        HttpRequest request = app.Request;
        HttpResponse response = app.Response;

        // An AJAX web service request's content type always starts with application/json
        if (request.ContentType.ToLower(CultureInfo.InvariantCulture).StartsWith("application/json"))
        {
            // The user may be running an older version of IE which does not support compression, so skip those
            if (!((request.Browser.IsBrowser("IE")) && (request.Browser.MajorVersion <= 6)))
            {
                string acceptEncoding = request.Headers["Accept-Encoding"];

                if (!string.IsNullOrEmpty(acceptEncoding))
                {
                    acceptEncoding = acceptEncoding.ToLower(CultureInfo.InvariantCulture);

                    if (acceptEncoding.Contains("gzip"))
                    {
                        response.Filter = new GZipStream(response.Filter, CompressionMode.Compress);
                        response.AddHeader("Content-encoding", "gzip");
                    }
                    else if (acceptEncoding.Contains("deflate"))
                    {
                        response.Filter = new DeflateStream(response.Filter, CompressionMode.Compress);
                        response.AddHeader("Content-encoding", "deflate");
                    }
                }
            }
        }
    }
}

Next, register this module in the web.config like the following:

<httpModules>
<add name="ScriptModule" type="System.Web.Handlers.ScriptModule, System.Web.Extensions, Version=1.0.61025.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"/>
<add name="JsonCompressionModule" type="JsonCompressionModule"/>
</httpModules>

This time, the network activity looks like the following:

[Firebug screenshot: compressed JSON response]

So by adding this little module, we have saved 74KB. Now consider the impact of this in a high-traffic AJAX web application :-). You will find the complete source code at the bottom of this post. If you want to learn more optimization tips, check out my previous posts Implement Yahoo's YSlow in your Asp.net pages and Combine Multiple JavaScript and CSS Files and Remove Overheads.

Pivots with Dynamic Columns in SQL Server 2005

 

Pivots in SQL Server 2005 can rotate a table, i.e. they can turn rows into columns. PIVOTs are frequently used in reports, and they are reasonably simple to work with. However, I've seen quite a few questions about this operator. Most questions were about the column list in the PIVOT statement. This list is fixed, but many times the new columns are determined by the report at a later stage. This problem is easily solved when we mix pivots with dynamic SQL, so here is a very simple example of how to dynamically generate the pivot statement:

PIVOT allows you to turn data rows into columns. For example, if you have a table like this (I use only three months here for simplicity):

CREATE TABLE Sales ([Month] VARCHAR(20) ,SaleAmount INT)

INSERT INTO Sales VALUES ('January', 100)
INSERT INTO Sales VALUES ('February', 200)
INSERT INTO Sales VALUES ('March', 300)

SELECT * FROM SALES
 

Month             SaleAmount
----------------  -----------
January           100
February          200
March             300 


Suppose we wanted to convert the above into this:

 
January     February    March
----------- ----------  ----------
100         200         300


We can do this using the PIVOT operator, as follows:

SELECT  [January]
      , [February]
      , [March]
FROM    ( SELECT    [Month]
                  , SaleAmount
          FROM      Sales
        ) p PIVOT ( SUM(SaleAmount)
                    FOR [Month] 
IN ([January],[February],[March])
                  ) AS pvt


However, in the above example, I have the column names fixed as the first three months. If I want to create a result in which the columns are dynamic (for example, they are read from a table), then I need to construct the above query dynamically. To demonstrate this let’s look at the following example:

In the first table I have the column names I want to use:

CREATE TABLE Table1 (ColId INT,ColName VARCHAR(10))
INSERT INTO Table1 VALUES(1, 'Country')
INSERT INTO Table1 VALUES(2, 'Month')
INSERT INTO Table1 VALUES(3, 'Day')


In the second table I have the data. This consists of a row identifier (tID), a column ID (ColID) that refers to the column type in Table1, and a value:

CREATE TABLE Table2 (tID INT,ColID INT,Txt VARCHAR(10))

INSERT INTO Table2 VALUES (1,1, 'US')
INSERT INTO Table2 VALUES (1,2, 'July')
INSERT INTO Table2 VALUES (1,3, '4')
INSERT INTO Table2 VALUES (2,1, 'US')
INSERT INTO Table2 VALUES (2,2, 'Sep')
INSERT INTO Table2 VALUES (2,3, '11')
INSERT INTO Table2 VALUES (3,1, 'US')
INSERT INTO Table2 VALUES (3,2, 'Dec')
INSERT INTO Table2 VALUES (3,3, '25')


Now I would like to retrieve data from these two tables, in the following format:

tID         Country    Day        Month
----------- ---------- ---------- ----------
1           US         4          July
2           US         11         Sep
3           US         25         Dec 


In other words I want to turn the data rows in Table2 into columns. If I had a fixed set of columns for the result, i.e. the columns Country, Day, and Month were fixed, I could use SQL Server 2005’s PIVOT operator in a query like:

SELECT  tID
      , [Country]
      , [Day]
      , [Month]
FROM    ( SELECT    t2.tID
                  , t1.ColName
                  , t2.Txt
          FROM      Table1 AS t1
                    JOIN Table2 
AS t2 ON t1.ColId = t2.ColID
        ) p PIVOT ( MAX([Txt])
                    FOR ColName IN ( [Country], [Day],
                                     [Month] ) ) AS pvt
ORDER BY tID ;


However I need to construct this query dynamically, because the column names Country, Day, and Month are specified in a table, and can be changed independently from my query. In our case these columns are given in Table1.

In the first step to generate the final pivot query I need to create the list of columns, in this case [Country], [Day], [Month].

Since there is no string concatenation aggregator in SQL (a concatenation aggregator would not be deterministic without some order restriction), and since the column names are stored in rows of a table, I need to flatten these columns into a single row or variable. There are various solutions to achieve this. One solution would be to use a query like:

DECLARE @cols NVARCHAR(2000)
SELECT  @cols = COALESCE(@cols + ',[' + colName + ']',
                         '[' + colName + ']')
FROM    Table1
ORDER BY colName


This query works both on SQL Server 2000 and 2005. It is efficient, but some may not like it because it uses the same variable (@cols) on both sides of an assignment. Another solution that works on SQL Server 2005 only is to use XML PATH.

DECLARE @cols NVARCHAR(2000)
SELECT  @cols = STUFF(( SELECT DISTINCT TOP 100 PERCENT
                                '],[' + t2.ColName
                        FROM    Table1 AS t2
                        ORDER BY '],[' + t2.ColName
                        FOR XML PATH('')
                      ), 1, 2, '') + ']'


This second query (I’ve seen this posted by Peter Larsson) has the advantage that it does not use the @cols variable on the right hand side. I like this solution more, since this can be extended as a general string concatenation aggregate in more complex queries.

Both of the above queries generate, from Table1, the string '[Country],[Day],[Month]'. This column list is used twice in the pivot query that we aim to construct. Once it is used in the list of columns that we want to retrieve, and once it is used as the list of values that should become columns. Having constructed this list of columns above, we can just concatenate it with the missing parts of the pivot query like:

DECLARE @query NVARCHAR(4000)
SET @query = N'SELECT tID, '+
@cols +'
FROM
(SELECT  t2.tID
      , t1.ColName
      , t2.Txt
FROM    Table1 AS t1
        JOIN Table2 AS t2 ON t1.ColId = t2.ColID) p
PIVOT
(
MAX([Txt])
FOR ColName IN
( '+
@cols +' )
) AS pvt
ORDER BY tID;'

Executing this with

EXECUTE(@query)

will give us the expected result: a table that is pivoted and shows columns that were specified in a table:

tID         Country    Day        Month
----------- ---------- ---------- ----------
1           US         4          July
2           US         11         Sep
3           US         25         Dec

Copy Text or Image into or out of SQL Server

OSQL: Storing result of a Stored Procedure in a file

 

BCP works OK if you want to store the contents of a table to a text file. If you want the contents of a stored procedure, it gets complicated. In summary, you must use the stored procedure to fill a table, then use BCP to export the table, and finally you need to clear the table. I wish there was a better way!
Wait. There is, and it's called OSQL. OSQL is a command line program that allows you to run SQL statements. Let's look at the simplest form:


osql /U sa /P password /d pubs /S Server9 /Q "Select * from Authors"


Remember you are typing this into the operating system (command prompt) and NOT Query Analyzer. This will connect to the pubs database on Server9, run the query Select * from Authors, and output the results back to stdout (which is the screen unless you redirect it). And the argument switches (-U, -P, etc.) are case sensitive. Go figure.
You wanted to get the result of a stored procedure. You can do that like this:


osql /U sa /P password /d pubs /S Server9 /Q "sp_help"


Now we need to save this in a file. The simplest way is like this:


osql /U sa /P password /d pubs /S Server9 /Q "sp_help" -o ofile.txt


This will store the results in a file called ofile.txt. You can store the results as Unicode by also specifying the -u switch. You can also embed path information in the file name using quotes.
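
For example, the following stores the output as Unicode in a path containing spaces (the path here is just an illustration):


osql /U sa /P password /d pubs /S Server9 /Q "sp_help" -u -o "c:\my output\ofile.txt"
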
There are a couple of other cool things you can do with this. It's really easy to put the SQL statement you want to run in an environment variable using a batch file. Your batch file will look something like this:

set mysqlcommand=sp_help
osql /U sa /P password /d pubs /S Server9 /Q "%mysqlcommand%"


This makes it pretty easy to build a batch system. You can also capture the SQL Server result code at the operating system level using the -b flag.
Other popular settings include disabling headers, setting the row width, using trusted connections and input redirection. With input redirection you can put a series of SQL commands into a file and execute them. Books Online has quite a bit more detail on OSQL if you're so inclined.

Exporting Data Programmatically with bcp and xp_cmdshell

Creating CSV Files Using BCP and Stored Procedures

Create a simple CSV file

The simplest way to copy data from a database table to file is to use the basic BCP command:

BCP master..sysobjects out c:\sysobjects.txt -c -t, -T -S<servername>


The basic format for the BCP command for creating a CSV file is as follows:

BCP <table> out <filename> <switches>

The switches used here are:


  • -c Output in ASCII with the default field terminator (tab) and row terminator (crlf)
  • -t override the field terminator with ","
  • -T use a trusted connection. Note that -U and -P may be used instead for username/password
  • -S connect to this server to execute the command

Note that, like DTS/SSIS, BCP is a client utility, hence you need to supply the connection information.

For transfer of data between SQL servers, in place of -c, use -n or -N for native data format (-N = Unicode). This is much faster and avoids data conversion problems. Please refer to Books Online for the complete format of the BCP command.

As BCP is a command line utility it is executed from T-SQL using xp_cmdshell. Create a directory called BCP on your c: drive and execute:

declare @sql varchar(8000)

select @sql = 'bcp master..sysobjects out
c:\bcp\sysobjects.txt -c -t, -T -S'+ @@servername

exec master..xp_cmdshell @sql


Other field and row delimiters


Often, character data includes commas which will be interpreted in the file as a field terminator. A simple way to cater for this is to use a different field delimiter which does not appear in the data. Commonly used characters are pipe (|):

declare @sql varchar(8000)

select @sql = 'bcp master..sysobjects out
c:\bcp\sysobjects.txt -c -t| -T -S' + @@servername

exec master..xp_cmdshell @sql


And caret (^):

declare @sql varchar(8000)

select @sql = 'bcp master..sysobjects out
c:\bcp\sysobjects.txt -c -t^ -T -S' + @@servername

exec master..xp_cmdshell @sql


The terminators are not limited to a single character, so if necessary you can use (|^):

declare @sql varchar(8000)

select @sql = 'bcp master..sysobjects out
c:\bcp\sysobjects.txt -c -t|^ -T -S' + @@servername

exec master..xp_cmdshell @sql


Note that this will increase the size of the file and so slow down the import/export. Another way to cater for embedded commas is to "quote encapsulate" character data – more about that later.

It is unusual to need to change the row terminator from the default of crlf (carriage return, line feed) but occasionally you will need the cr or lf on its own.

To do this use the hex value cr = 13 = 0x0D, lf = 10 = 0x0A. If you can't remember these values they are easily obtained, as follows:

select cr = ascii('

')

select lf = ascii(right('

',1))


To use these in the BCP command for the row terminator, the -r switch is used:

declare @sql varchar(8000)

select @sql = 'bcp master..sysobjects out
                 c:\bcp\sysobjects.txt -c -t, -r0x0D -T -S' + @@servername

exec master..xp_cmdshell @sql


When the resulting file is opened in Notepad the row terminators will not end the line; the character should appear as a non-ASCII character (a little oblong). If opened or copied to Query Analyzer (or Management Studio) these will be interpreted as line breaks and the file will be more readable.

The terminator characters can be encapsulated with double quotes in the command to allow for space. I like to do this for anything other than a single character delimiter. The following example gives a crlf row terminator and | ^ field terminator.

declare @sql varchar(8000)
select @sql = 'bcp master..sysobjects out
                 c:\bcp\sysobjects.txt -c -t"| ^" -r"0x0D0A" -T -S'
+ @@servername
exec master..xp_cmdshell @sql


Formatting the extracted data


If you do not require all the columns/rows, or you wish to alter the data, a simple method is to create a view. This can be used in the BCP command in the same way as a table. The following view allows you to extract just the name, the formatted create date and time, and then order the results according to the date created (most recent first):

use tempdb
go
create view vw_bcpMasterSysobjects
as
  select top 100 percent
      name ,
      crdate = convert(varchar(8), crdate, 112) ,
      crtime = convert(varchar(8), crdate, 108)
   from master..sysobjects
   order by crdate desc
go
declare @sql varchar(8000)
select @sql = 'bcp tempdb..vw_bcpMasterSysobjects out
                 c:\bcp\sysobjects.txt -c -t, -T -S' + @@servername
exec master..xp_cmdshell @sql


Now we can quote encapsulate the extracted data by including the formatting in the view:

use tempdb
go
create view vw_bcpMasterSysobjects
as
   select top 100 percent
      name = '"' + name + '"' ,
      crdate = '"' + convert(varchar(8), crdate, 112) + '"' ,
      crtime = '"' + convert(varchar(8), crdate, 108) + '"'
   from master..sysobjects
   order by crdate desc
go
declare @sql varchar(8000)
select @sql = 'bcp tempdb..vw_bcpMasterSysobjects out
                 c:\bcp\sysobjects.txt -c -t, -T -S' + @@servername
exec master..xp_cmdshell @sql


Note:
Quote encapsulation can also be performed by use of a format file.


Complete control – stored procedures


If you examined the BCP syntax in BOL, you might have noticed that it is possible to extract from a query by using the queryout keyword in place of out.

So, for example, an equivalent but neater version of the previous code extract would place the ORDER BY clause in the BCP statement rather than the view:

use tempdb
go
create view vw_bcpMasterSysobjects
as
   select
      name = '"' + name + '"' ,
      crdate = '"' + convert(varchar(8), crdate, 112) + '"' ,
      crtime = '"' + convert(varchar(8), crdate, 108) + '"'
   from master..sysobjects
go
declare @sql varchar(8000)
select @sql = 'bcp "select * from tempdb..vw_bcpMasterSysobjects
order by crdate desc, crtime desc"
queryout c:\bcp\sysobjects.txt -c -t, -T -S'
+ @@servername
exec master..xp_cmdshell @sql


Of course, strictly speaking, the view was not necessary at all as the query could have been included in the BCP statement but that can get difficult to maintain.

The queryout method allows you to BCP from the result of a stored procedure, which opens up a lot of possibilities and offers a lot of control over the file format. For anything other than a simple table extract I would tend to use this method rather than a view. I would also format each line within the stored procedure. This means that the formatting can be tested independently from the file creation.

Employing this technique, the above extract becomes:

use tempdb
go
create proc s_bcpMasterSysobjects
as
   select   '"' + name + '"'
            + ',' + '"' + convert(varchar(8), crdate, 112) + '"'
            + ',' + '"' + convert(varchar(8), crdate, 108) + '"'
   from master..sysobjects
   order by crdate desc
go
declare @sql varchar(8000)
select @sql = 'bcp "exec tempdb..s_bcpMasterSysobjects"
queryout c:\bcp\sysobjects.txt -c -t, -T -S'
+ @@servername
exec master..xp_cmdshell @sql


More complex formatting


We can now change the format of the extracted data to anything we require. For example, to include column names at the top and a rowcount at the end:

use tempdb
go
create proc s_bcpMasterSysobjects
as
      set nocount on
     
      create table #a (seq int, crdate datetime, s varchar(1000))
      -- header - column headers
      insert      #a (seq, crdate, s)
      select      1, null,
                  '"name","crdate","crtime"'
     
      -- data
      insert      #a (seq, crdate, s)
      select      2, crdate,
                          '"' + name + '"'
                  + ',' + '"' + convert(varchar(8), crdate, 112) + '"'
                  + ',' + '"' + convert(varchar(8), crdate, 108) + '"'
      from master..sysobjects

      -- trailer - rowcount
      insert      #a (seq, crdate, s)
      select      3, null,
                  'rowcount = ' + convert(varchar(20),count(*)-1)
      from #a

      select      s
      from  #a
      order by seq, crdate desc
go


This stored procedure will format the data as required, but when it is run via the BCP command it receives an error:

"Invalid object name '#a'"

Using the profiler you can see that when using queryout, the query is called twice. BCP tries to get the result set format by calling the stored procedure with fmtonly. This does not execute the stored procedure but returns the format of the resultset. Unfortunately, it does not work if the stored procedure creates and uses a temp table. This problem can be circumvented by including a set fmtonly off command in the execution:

declare @sql varchar(8000)
select @sql = 'bcp "set fmtonly off exec tempdb..s_bcpMasterSysobjects"
queryout c:\bcp\sysobjects.txt -c -T -S' + @@servername
exec master..xp_cmdshell @sql


Note that the column terminator is not used since the resultset is a single column.

---Editor's Note---
This was the only piece of code that I failed to get working. The stored procedure executed fine outside the BCP, but when I tried to run the above command I received an error:

SQLState = HY010, NativeError = 0
Error = [Microsoft][SQL Native Client]Function sequence error
NULL

The author was unable to replicate this error. Is anyone aware of what might be causing this on my machine?
---End Editor's Note---

Look at this execution using the profiler and you will see set fmtonly off followed by set fmtonly on. There is a potential problem with this though: it means that the stored procedure is executed twice, once for the format and once to extract the data – due to the fmtonly setting, both these calls will produce a resultset. This needs to be taken into account when considering how long the process will take and its impact on your system.

As the stored procedure is executed twice it should not contain any updates of permanent data. Particularly if data is flagged as exported then the flagging will be performed on the first call and no data will be extracted for the second call. All in all, this method should be used with caution and comments (warnings) placed in the stored procedure for the aid of future maintenance.

To get round all of these problems, use the stored procedure to insert into a table then, from the BCP, extract from that table. This gives you added features that can be valuable on systems that are performing a lot of extracts:


  • It gives a record of the data extracted – if the file is lost it can be recreated.
  • The data extracted can be presented to a user for viewing from the database.
  • If there are problems with the file, the support can see the data without needing access to the file.

For this process, the format stored procedure, s_bcpMasterSysobjects, will insert into the table using an identifier and the BCP will extract using the same identifier. This process can be controlled by a (control) stored procedure which allocates the identifier and passes it to the format stored procedure, which inserts the extracted data into a table with that identifier. It then calls another stored procedure or in-line code to create files (via BCP) for each identifier which is not marked as complete. The identifier is flagged as complete after the file creation. In this way the file creation becomes independent of the extract of the data.

This identifier can be passed to both processes from a controlling stored procedure; alternatively, the format stored procedure can get the next available identifier and the extract can flag the identifier as extracted, so it extracts everything that has not yet been extracted, allowing the export to be independent of the extract.

Which technique you use will depend on your system. The following code demonstrates the use of a controlling stored procedure:

use tempdb
go
create table Extract
      (
      Extract_ID  int ,
      Seq1        varchar(100) null ,
      Seq2        varchar(100) null ,
      Seq3        varchar(100) null ,
      Data        varchar(8000) ,
      InsertDate  datetime default getdate()
      )
go
create proc s_bcpMasterSysobjects
@ExtractID int
as
declare @rowcount int
      set nocount on
     
      -- header - column headers
      insert      Extract (Extract_ID, Seq1, Data)
      select      @ExtractID,
                  '01' ,
                  '"name","crdate","crtime"'
     
      -- data
      insert      Extract (Extract_ID, Seq1, Seq2, Data)
      select      @ExtractID,
                  '02' ,
                  + convert(varchar(100), '99990101' - crdate, 121) ,
                          '"' + name + '"'
                  + ',' + '"' + convert(varchar(8), crdate, 112) + '"'
                  + ',' + '"' + convert(varchar(8), crdate, 108) + '"'
      from master..sysobjects

      select @rowcount = @@rowcount
     
      -- trailer - rowcount
      insert      Extract (Extract_ID, Seq1, Data)
      select      @ExtractID,
                  '03' ,
                  'rowcount = ' + convert(varchar(20),@rowcount)
go
create proc s_Extract
@ExtractID int
as
      select      Data
      from  Extract
      where Extract_ID = @ExtractID
      order by Seq1, Seq2, Seq3
go


Now the data is extracted via:

Create table ExportLog (Export_id int, Status varchar(20))

Insert ExportLog select 25, 'Extracting'
exec tempdb..s_bcpMasterSysobjects 25
update ExportLog set Status = 'Exporting' where Export_id = 25
declare @sql varchar(8000)
select @sql = 'bcp "exec tempdb..s_Extract 25"
queryout c:\bcp\sysobjects.txt -c -T -S' + @@servername
exec master..xp_cmdshell @sql
update ExportLog set Status = 'complete' where Export_id = 25


You can view the data extracted via:

exec tempdb..s_Extract 25


Alternatively, you can use the following, which will also give the date that the data was extracted:

select * from tempdb..Extract order by Seq1, Seq2, Seq3


Note:
For the extract I have given three sort columns, Seq1, Seq2, and Seq3, which are all sorted ascending. This means that the extracting stored procedure needs to place the values in ascending order (see the date manipulation). This could instead use ascending and descending sort columns, or their use could depend on the extract type.

For a system which is performing a lot of small exports I will usually implement this process, and a similar process for imports.

Extracting all tables from a database


This is usually performed to transfer data between servers/databases and so should use the native data format. The easiest way is to create the BCP statements from a query then execute them:

select 'exec master..xp_cmdshell'
            + ' '''
            + 'bcp'
            + ' ' + TABLE_CATALOG + '.' + TABLE_SCHEMA + '.' + TABLE_NAME
            + ' out'
            + ' c:\bcp\'
            + TABLE_CATALOG + '.' + TABLE_SCHEMA + '.' + TABLE_NAME + '.bcp'
            + ' -N'
            + ' -T'
            + ' -S' + @@servername
            + ''''
from INFORMATION_SCHEMA.TABLES
where TABLE_TYPE = 'BASE TABLE'


The result of this will be a series of BCP statements of the form:

exec master..xp_cmdshell 'bcp tempdb.dbo.Extract out
c:\bcp\tempdb.dbo.Extract.bcp -N -T -S<servername>'


which will extract all tables in the database. To import just change the "out" to "in".
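
For example, the import counterpart of the statement above would be:

exec master..xp_cmdshell 'bcp tempdb.dbo.Extract in
c:\bcp\tempdb.dbo.Extract.bcp -N -T -S<servername>'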

Summary


We have seen how to use BCP to perform a simple extract of a table to a CSV file, how to perform slightly more complex extracts using a view, how to extract using a stored procedure that formats the data, and how to build a system which allows the data to be extracted, saved in a table, and viewed/exported from there.

Use SqlBulkCopy to Quickly Load Data from your Client to SQL Server

Getting The Most Out of SQL Server 2005 UDTs and UDAs

Tuesday, September 18, 2007

What Data is in SQL Server's Memory?

SQL Server memory is primarily used to store data (the buffer pool) and query plans (the plan cache). In this article I'll show how much memory is allocated to the data buffer, and I'll explain how to determine which tables and indexes are in the buffer memory of your server.

SQL Server stores its data in 8KB data pages.  As these pages are read off disk they are stored in memory.  This is referred to as buffer memory.  A list of all the data pages in memory is stored in the dynamic management view sys.dm_os_buffer_descriptors.
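
For example, a simple SELECT from this DMV:

select *
from sys.dm_os_buffer_descriptors

returns a result set like this: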

database_id file_id page_id page_level allocation_unit_id page_type row_count free_space_in_bytes is_modified
----------- ------- ------- ---------- ------------------ --------- --------- ------------------- -----------
2 1 8716 0 71942940205187072 IAM_PAGE 2 6 1
2 1 15178 0 440359678902272 DATA_PAGE 42 1086 1
1 1 331 0 281474980642816 IAM_PAGE 2 6 0
1 1 239 0 281474980642816 DATA_PAGE 1 6790 0
5 1 45652 0 72057594089766912 DATA_PAGE 80 696 0
2 1 9860 0 169048845058048 INDEX_PAGE 0 8096 1
5 1 35979 0 71798504602664960 TEXT_MIX_PAGE 35 338 0
1 1 376 1 458752 INDEX_PAGE 2 8062 0
. . . .

I'm running these queries on the server that hosts SQLTeam.com.  It's running SQL Server 2005 Express Edition so the memory is capped at 1GB.  We can see what database this data page came from using the database_id column.  If this page belongs to an index we can see the index level of the page in the page_level column.  We can also see what type of page this is by looking at the page_type column.  My result set included Index Allocation Map (IAM) pages, data pages, index pages and text pages.  You can see a full list of page types in Books Online under Pages and Extents.  There is also a flag (is_modified) that tells us whether this page has been changed since it was read from disk (i.e. the page is "dirty").
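
For example, grouping the rows in this DMV by page_type shows how the buffered pages break down by type:

select page_type, count(*) AS Buffered_Page_Count
from sys.dm_os_buffer_descriptors
group by page_type
order by Buffered_Page_Count DESC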

We can do a little grouping and summing on this table like so:

select count(*) AS Buffered_Page_Count
,count(*) * 8192 / (1024 * 1024) as Buffer_Pool_MB
from sys.dm_os_buffer_descriptors

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Buffered_Page_Count Buffer_Pool_MB
------------------- --------------
59405 464

This tells us that there are 59,405 data pages in memory that take 464MB of RAM.  If I turn on Performance Monitor (PerfMon) and watch the Database Pages counter in the SQLServer:Buffer Manager object I'll see the exact same 59,405 listed.  This number fluctuates a little minute to minute but shouldn't be changing dramatically on a production server.

The next thing I typically want to see is how much memory is going to each database.  On my server I run this query:

SELECT LEFT(CASE database_id 
WHEN 32767 THEN 'ResourceDb'
ELSE db_name(database_id)
END, 20) AS Database_Name,
count(*)AS Buffered_Page_Count,
count(*) * 8192 / (1024 * 1024) as Buffer_Pool_MB
FROM sys.dm_os_buffer_descriptors
GROUP BY db_name(database_id) ,database_id
ORDER BY Buffered_Page_Count DESC

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Database_Name Buffered_Page_Count Buffer_Pool_MB
-------------------- ------------------- --------------
sqlteam 28145 219
SqlTeamBeta 15854 123
tempdb 8887 69
SQLTeamBlogsNew 5167 40
clearpass_isc 1144 8
eNewsletterPro 367 2
BillGraziano 250 1
ResourceDb 64 0
ClearPass_MIS 58 0
master 35 0
ClearTrace 16 0
msdb 1 0

It looks like most of the data buffer is being allocated to SQLTeam.  The next level of drill-down is the object level inside the database.  That query looks like this:

SELECT TOP 25 
obj.[name],
i.[name],
i.[type_desc],
count(*)AS Buffered_Page_Count ,
count(*) * 8192 / (1024 * 1024) as Buffer_MB
-- ,obj.name ,obj.index_id, i.[name]
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN
(
SELECT object_name(object_id) AS name
,index_id ,allocation_unit_id, object_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.hobt_id
AND (au.type = 1 OR au.type = 3)
UNION ALL
SELECT object_name(object_id) AS name
,index_id, allocation_unit_id, object_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.partition_id
AND au.type = 2
) AS obj
ON bd.allocation_unit_id = obj.allocation_unit_id
LEFT JOIN sys.indexes i on i.object_id = obj.object_id AND i.index_id = obj.index_id
WHERE database_id = db_id()
GROUP BY obj.name, obj.index_id , i.[name],i.[type_desc]
ORDER BY Buffered_Page_Count DESC

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

name name type_desc Buffered_Pages Buffer_MB
------------------------------ ----------------------------------- ----------------- -------------- ---------
FORUM_REPLY IX_TOPIC_DATE CLUSTERED 13287 103
FORUM_TOPICS PK_FORUM_TOPICS CLUSTERED 6028 47
FORUM_MEMBERS PK_FORUM_MEMBERS CLUSTERED 2276 17
FORUM_REPLY IX_FORUM_REPLY NONCLUSTERED 1201 9
fulltext_index_map_341576255 i1 CLUSTERED 925 7
FORUM_MEMBERS_PENDING PK__FORUM_MEMBERS_PE__46E78A0C CLUSTERED 491 3
sqlteam_Article PK_sqlteam_Article CLUSTERED 445 3
FORUM_TOPICS IX_FORUM_TOPICS_FORUM_ID_LAST_POST NONCLUSTERED 431 3
fulltext_index_map_373576369 i1 CLUSTERED 304 2
sysobjvalues clst CLUSTERED 249 1
FORUM_MEMBERS IX_LAST_POST NONCLUSTERED 222 1
FORUM_TOPICS IX_FORUM_TOPICS_TOPIC_ID NONCLUSTERED 173 1
FORUM_REPLY IX_REPLY_FORUM_R_STATUS NONCLUSTERED 153 1
FORUM_MEMBERS IX_FORUM_MEMBERS NONCLUSTERED 123 0
. . . . .

This query only returns the results for a specific database.  This is the result from the SQLTeam.com database.  Remember that keeping a clustered index in the buffer (or memory) is the same as keeping the table in memory. 

The forum tables are the largest in the database and typically get the most traffic.  I'm not surprised to see them at the top of the list.  I was a little surprised at how quickly the memory usage dropped off.  I expected it to be spread out more evenly. The "full text" entries in the list are for the full text indexes that are on the forum tables.

Saturday, September 15, 2007

Downloading and Playing YouTube Videos on Your PC > Tools for Downloading YouTube Videos

InformIT: Downloading and Playing YouTube Videos on Your PC > Tools for Downloading YouTube Videos

Tools for Downloading YouTube Videos
Several websites let you do what YouTube doesn’t—save YouTube videos to your computer’s hard drive. And after you’ve saved a YouTube video, you can watch it anytime you want, even if you’re not connected to the Internet.
The easiest way to save your YouTube videos is to use a video downloader site designed just for that purpose. These sites take the Flash video from the YouTube site and save it to your PC’s hard drive. Once it’s on your hard drive, you can view it at any time.
To use one of these sites, go to the YouTube site, open the desired video viewing page, and then copy the URL for this page. Now go to the video downloader site and paste that URL into the appropriate box. Click the Download button, and the site will download the selected video to your disk, saving it in the .FLV format. (If prompted, you’ll need to give the downloaded file a .FLV extension.)
Some of the more popular YouTube downloader sites include the following:
dlThis.com Video Grabber
FeelingTea
GooTube FLV Retriever
KeepVid
KissYouTube
Ripzor YouTube Video Ripper
SaveTube
SaveYouTube.com
TechCrunch YouTube Video Download Tool
VideoDL
VideoDownloader
YouTube Downloader
YouTubeX.com
YouTubia
All of these sites are completely free to use, and will download and save any YouTube video (and, in some cases, videos from other streaming video sites).
In addition, if you use the Internet Explorer web browser, you can use the FlvGetter plug-in to download FLV-format videos from within the browser. If you use Mozilla Firefox instead, similar FLV downloading plug-ins for your browser include UnPlug and VideoDownloader.

Playing YouTube Videos with a Flash Video Player
Once you’ve downloaded a YouTube video file, how do you play it? Simple—you use a video player program that can play back .FLV format files. Not all video players support the .FLV format, however. You’ll need to use one of the following:
Flash Guru Flash Video Player
Flash Video Player
FLVPlayer
Riva FLV Player
Wimpy Standalone FLV Player
All of these video players work in much the same fashion. Launch the player, open the file for the video you want to view, and then click the Play button. Use the player’s transport controls to pause, stop, rewind, and fast-forward the video.
Downloading and Playing YouTube Videos with RealPlayer
There’s a new way to download, save, and play YouTube videos—and it’s an all-in-one solution. The latest version of RealPlayer, currently in beta testing, enables one-button downloading of any video from the YouTube site, and also functions as a player for .FLV-format videos (among others).
When you install the new RealPlayer, you also install a special add-in for Internet Explorer. This add-in automatically displays a Download This Video button above the top right corner of any video you view on the YouTube site. (It displays the same button on any video streaming site, so you can download videos from any website that offers streaming videos, such as CNN.com or ESPN.) To display the Download This Video button, simply hover your cursor over the video you want to download; the button should appear automatically.
Note that you don’t have to have the RealPlayer program running to display the Download This Video button in your web browser. The button is an add-in program that launches automatically whenever you launch your web browser.
To download a video, just click the Download This Video button. RealPlayer displays the Download & Recording Manager window and automatically saves the video to your hard drive. You don’t have to do anything; the file is automatically named and saved.
To play back a saved video, launch the RealPlayer program and select the My Library tab. The videos you’ve downloaded in this fashion should be listed here; double-click a video to start playback. If you’ve downloaded a video using another download website, select File, Open to locate the file on your hard drive, and then load the video for playback.
Playback takes place within the RealPlayer window, at whatever size the window appears. To view the video full-screen, click the Full-Screen Theater button; to view the video at its original size, regardless of how big the RealPlayer window is, click the 1X button. You can use RealPlayer’s transport controls to pause, stop, rewind, and fast-forward the video; you can even play back in slow motion, by clicking Pause and then using the Slow Reverse and Slow Forward buttons.
All in all, the new RealPlayer is an elegant solution for downloading and playing YouTube and other streaming videos. And the best thing is, it’s totally free—which means it’s worth your consideration.

.netCHARTING

Dino Esposito reviews .netCHARTING


Got DotNet? Got Charting? Here’s .netCHARTING Then
If you remember what graphics programming, and charting in particular, was like in the pre-.NET era (seen from today it looks like a sort of Jurassic Windows), you can only thank fate for letting you approach this special branch of programming in the .NET age.

The .NET Framework comes with an advanced library of functions collectively known as GDI+ in honor of the glorious (and notorious…) Windows Graphics Device Interface—joy and sorrow (well, mostly sorrow) for a generation of Windows programmers.

GDI+ services fall into two broad categories: 2D vector graphics and imaging. 2D vector graphics involves drawing simple figures such as lines, curves, and polygons. Under the umbrella of imaging are functions to display, manipulate, save, and convert bitmap and vector images. Finally, a third category of functions can be identified: typography, which has to do with the display of text in a variety of fonts, sizes, and styles.

As you can see, GDI+ only provides the bedrock of graphics development; with its low-level tools you can build more advanced and sophisticated forms of graphics, such as charts. As long as you just need to render a series of data as plain vertical bars, writing it yourself is still an option. But what happens when your customers begin asking for more?

Using plain GDI+ classes, producing a pie chart is definitely possible, but certainly not trivial.

The code to place labels at the center of each slice is tedious to figure out and write. A 3D style with lighting is a nice enhancement, but it requires quite a few lines of smart code.
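
To make the point concrete, here is a minimal sketch of what raw GDI+ drawing with System.Drawing involves: bare pie slices from hard-coded sample data. Labels, a legend, 3D and lighting effects would all be additional hand-written code.

// Minimal "do it yourself" pie with System.Drawing (sample data, no labels or 3D).
using System.Drawing;
using System.Drawing.Imaging;

class PieSketch
{
    static void Main()
    {
        float[] values = { 30f, 45f, 25f };   // sample data (percentages)
        Brush[] brushes = { Brushes.SteelBlue, Brushes.Orange, Brushes.SeaGreen };

        using (Bitmap bmp = new Bitmap(300, 300))
        using (Graphics g = Graphics.FromImage(bmp))
        {
            float start = 0f;
            for (int i = 0; i < values.Length; i++)
            {
                float sweep = 360f * values[i] / 100f;   // angular size of the slice
                g.FillPie(brushes[i], 10f, 10f, 280f, 280f, start, sweep);
                start += sweep;
            }
            bmp.Save("pie.png", ImageFormat.Png);
        }
    }
}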

The bottom line is: yes, you can do it yourself, but is it time- and cost-effective? Moreover, does what you actually get, at the cost of precious hours of non-business development, meet the expectations of the customer? And even when the result meets expectations, you've lost a great opportunity to exceed them at the same cost, or even more cheaply.

A pie chart is an eye-catching way of presenting data, but it isn't appropriate for every series of data. The real payoff comes when you can switch to an alternative chart type with only minor changes. Trust me: the chart below is not that easy to craft.

Even harder is coming up with a manual implementation of a bevel chart like the one that follows. Effects like transparency, 3D styling, and backgrounds require code; and code requires time and money. More importantly, that is not the type of code you're mostly paid for.

In summary, the professional help of a well-designed graphics library is first of all a first-class service you offer to your clients. You ought to consider that. Always. (And don't wait until you need to render a gauge chart to seriously consider it.)
What’s the Perfect Fit?
Choosing the right library may not be easy and, to some extent, is a matter of personal preference. However, any library you nominate must meet a few requirements in order to assist you, the developer, in delivering a first-class service to your customer.

My ideal graphics library should:
Be easy to use
Sport a .NET-ish programming interface
Supply a great variety of charts and speak the language of statistics and charting
Be effective in terms of graphic results, required programming effort, and cost
Provide a rich user interface that goes beyond a long list of chart types to add support for stunning features like transparency, backgrounds, gradients, nice labeling, and more
Be internally optimized for performance and offer control over features like caching and cache duration

Did I forget something? Oh, sure: it must not be overly complex or unwieldy. Unwieldy software tends to scare me at first, so I love any piece of software that, no matter how complex, lets you get started with a simple but significant example.

I believe that .netCHARTING meets all of these requirements. And, at least for me, it rocks.
Working with .netCHARTING
As amazing as it may seem, building a simple .netCHARTING ASP.NET page really is simple. You start by registering the control assembly with the page and placing an instance of the control in the ASPX source.


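A sketch of the registration directive and control declaration follows; the tag prefix, namespace, and assembly names here are assumptions, so check the .netCHARTING documentation for the exact values used by your version:

<%@ Register TagPrefix="dotnet" Namespace="dotnetCHARTING" Assembly="dotnetCHARTING" %>
<dotnet:Chart id="chart" runat="server" />

The id given to the control is the name (chart) used in the code snippets that follow.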


Next, you need to feed the control some data to generate the chart. The .netCHARTING data model consists of elements which collectively form a series. The .netCHARTING control pulls its data from a series. One or more series can be grouped in a collection and passed to the control.
The Element class represents an element on the chart and provides properties to hold different types of data such as X and Y values, percentage completed for Gantt charts, and many more. In addition, the Element class provides control over other attributes of a single element such as colors and markers.
The Series class represents a group of elements and provides default properties of all elements within it as well as specific properties for the series such as legend entries. The Series class comes with data manipulation methods through which you can, for example, sort elements and perform some calculation and derive new expression-based elements for the series.
You associate a series to the .netCHARTING control using the SeriesCollection property, as below:

Element e = new Element();
e.Name = "My Element";
e.YValue = 10;
Series s = new Series();
s.Name = "My Series";
s.Elements.Add(e);
SeriesCollection sc = new SeriesCollection();
sc.Add(s);
chart.SeriesCollection.Add(sc);


The code snippet first instantiates and names a new element and assigns it a value for the Y axis. Next, it creates a new series and fills it with the previously created element. The Elements property of the Series class lists the elements currently within the series. Finally, the series is added to a SeriesCollection object which, in turn, is added to the property of the same name on the .netCHARTING control. This code snippet is usually located in the Page_Load event. The Render method of the control accesses the data in the bound series collection and renders the chart.

While extremely flexible (you create every element individually), the preceding code is not realistic. In most cases you get the data to render from business objects that retrieve it from databases, Excel documents, or XML streams. Business objects normally expose data using ADO.NET container objects like the DataSet, or streaming interfaces like the data reader. To meet these requirements, the .netCHARTING control also accepts data from a DataSet. For consistency, though, the DataSet must be wrapped in a Series object. Here's an example.

void Page_Load(object sender, EventArgs e)
{
// Set global properties
chart.Title = "Item sales report";
chart.TempDirectory = "temp";
chart.Type = ChartType.Pie;

// Adding the series programmatically
chart.Series.Name = "Item sales";
chart.Series.Data = GetDataSet();
chart.SeriesCollection.Add();
}
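
GetDataSet() here is a placeholder for whatever data-access code you already have; any method returning a DataSet will do. A hypothetical example using plain ADO.NET (the connection string and query are made up for illustration) might look like this:

// Requires System.Data and System.Data.SqlClient.
DataSet GetDataSet()
{
    // Hypothetical connection string and query, for illustration only.
    string connString = "server=(local);database=Sales;integrated security=SSPI";
    string query = "SELECT Item, UnitsSold FROM ItemSales";

    using (SqlConnection conn = new SqlConnection(connString))
    using (SqlDataAdapter adapter = new SqlDataAdapter(query, conn))
    {
        DataSet ds = new DataSet();
        adapter.Fill(ds);   // Fill opens and closes the connection on its own
        return ds;
    }
}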

The Data property of the Series class can be filled with any ADO.NET enumerable object, including data readers. The feasible chart types are listed in the ChartType enumeration; which global properties you can set is a function of the selected chart type.

The TempDirectory property controls the directory where images generated by the control are temporarily stored. You must grant write permission on this directory on the web server machine. Each image is saved to a server-side file to avoid taxing the server's memory. By default, the image is regenerated on every hit unless you set the CacheDuration property. When set, the chart image is generated on the first hit and preserved for the specified number of seconds. All requests arriving in this period are served from the cached image, with no rendering work by the chart control. A Boolean property, UseFile, lets you indicate whether the generated image is saved as a file in the TempDirectory folder or streamed directly to the browser.
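
Based on the properties just described, a couple of lines are enough to turn caching on; the duration value is arbitrary and the exact property types should be checked against the product documentation:

chart.TempDirectory = "temp";   // the ASP.NET account needs write permission here
chart.CacheDuration = 300;      // reuse the generated image for 300 seconds
chart.UseFile = true;           // keep the image as a file in TempDirectory instead of streaming it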

One more class worth mentioning is the DataEngine class. First and foremost, the DataEngine object can be configured to collect data either by querying databases or by consuming disconnected DataTable objects. The data is then converted to a SeriesCollection, which is the only type of data the chart control recognizes. In addition, the DataEngine offers many handy data manipulation features specifically designed to serve common needs such as date aggregation.

DataEngine de = new DataEngine();
de.StartDate = "1/1/02 8:00:00 AM";
de.EndDate = "1/31/02 8:00:00 AM";   // the end of the window must fall after StartDate
de.DateGrouping = TimeInterval.Days;
de.SqlStatement = "SELECT names, values FROM myTable " +
    "WHERE start > #StartDate# AND end < #EndDate#";
SeriesCollection sc = de.GetSeries();
chart.SeriesCollection.Add(sc);

The DataEngine also supports date grouping at the desired granularity, from years down to minutes. In the newest version, 2.5, .netCHARTING supports extensive arithmetic operations including average, minimum, maximum and many more; in earlier versions, only the sum operation was supported.

Often you need to display a chart to complement a grid of data or vice versa. The .netCHARTING control makes this feature as easy to implement as assigning the ID of the desired page DataGrid to the DataGrid property of the control.

chart.DataGrid = "myGrid";

The DataGrid property takes the ID of a grid control defined within the page. If set, the data represented in the chart is displayed in the specified grid control.
Advanced Capabilities
The .netCHARTING control provides several advanced capabilities. I'd like to briefly discuss three of them here: drill-down, the file manager, and the countless options for aligning text and inserting annotations.
Drill-down is the control's ability to support a logically related chain of charts. The idea is that you display a first chart and, by clicking each of the displayed elements, drill into the data represented there to obtain a detail view. The level of nesting is up to you and exploits the aforementioned date-grouping capabilities. Here's an example:

Chart.DateGrouping = TimeInterval.Years;
Chart.DrillDownChain="Years,Quarters,Months,days=Days,hours,minutes";
Chart.DefaultSeries.DefaultElement.ToolTip="%yvalue";


The DrillDownChain property defines the various levels of detail you wish to see. No other code is needed.

All charts generated by the control can be saved to a server-side file. You do this using the FileManager object. The SaveImage method and the ImageFormat enumerated type allow you to choose the name and format of the image file. The code below shows how to save the chart to a file, which is then assigned to the ImageUrl property of an ASP.NET Image control.

img.ImageUrl = Chart.FileManager.SaveImage(Chart.GetChartBitmap());

In the newest version, 2.5, even the legend can be exported as a distinct image file. You use the code below to assign the bitmap of the legend to an Image control within the host page.

leg.ImageUrl = Chart.FileManager.SaveImage(Chart.GetLegendBitmap());

In version 2.5, .netCHARTING also fully supports image maps—that is, images that include hotspots for clicking. The control exposes the ImageMapName property to let you assign a name to the map and ImageMapText to insert the HTML block that physically represents the list of clickable areas within the chart.

An annotation is a box of text with a position, data source, and hotspot functionality. You can associate annotations with elements of a chart like in the picture below.
Tooltips and legend boxes can be defined based on a text template. Here’s an example:

chart.LegendBox.Template ="%icon %name";

You define a template by composing special syntax elements known as tokens. A token is a name prefixed with a percentage sign. Several tokens are predefined for you to incorporate commonly used data. For example, the legend template shown above displays the icon and the name of the entry. In this case, the icon is a small rectangle reproducing the same graphic fill of the corresponding chart element. The name instead is the name assigned to the element in the series of data. See the documentation for the complete list of supported tokens for each chart component.
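
The same tokens work for tooltips as well; here is a small sketch combining the tokens mentioned above:

chart.LegendBox.Template = "%icon %name";
chart.DefaultSeries.DefaultElement.ToolTip = "%name: %yvalue";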
Summary
In .netCHARTING, the list of supported chart types is amazingly long and rich (gauge charts are the latest entry), but not endless, which means the .netCHARTING control is, overall, well balanced. It is not too heavy on memory, and not too slim as far as capabilities are concerned. Each element of each series can be easily customized based on runtime conditions; for example, you might want to change the color of all slices representing sales that exceeded expectations. Labeling, coloring, and styling are pretty rich too. For more details, the documentation is quick with brief explanations and, better yet, adequate code snippets in C# and Visual Basic .NET.

Another aspect to emphasize is that to use the control effectively you don't have to be a .NET guru or super-expert. Shortcuts like the date-aggregation features of the data engine are designed to answer common requirements and make things simple for professionals other than experienced .NET developers as well. Bear in mind, one of the strengths of this component is that you can build an effective data-driven chart with only four lines of code. Want an example?

chart.TempDirectory="temp";
chart.DefaultSeries.ConnectionString = "data.mdb";
chart.Series.SqlStatement= "SELECT Customer, Sales FROM Table";
chart.SeriesCollection.Add();

Admittedly, this article has only touched the tip of the .netCHARTING iceberg, but it is enough to illustrate the main themes of this control. The next step is up to you. Enjoy!