Saturday, June 30, 2007

15 Seconds : SQL Server 2005 XQuery and XML-DML - Part 1

15 Seconds : Paging in SQL Server 2005

15 Seconds : Paging in SQL Server 2005


15 Seconds : Paging in SQL Server 2005
David Beahm
06/28 / 2007

Introduction

Developers and database administrators have long debated methods for paging recordset results from Microsoft SQL Server, trying to balance ease of use with performance. The simplest methods were less efficient because they retrieved entire datasets from SQL Server before eliminating records which were not to be included, while the best-performing methods handled all paging on the server with more complex scripting. The ROW_NUMBER() function introduced in SQL Server 2005 provides an efficient way to limit results relatively easily.

Paging Efficiency

In order to scale well, most applications only work with a portion of the available data at a given time. Web-based data maintenance applications are the most common example of this, and several data-bindable ASP.NET classes (such as GridView and Datagrid) have built-in support for paging results. While it is possible to handle paging within the web page code, this may require transferring all of the data from the database server to the web server every time the control is updated. To improve performance and efficiency, data which will not be used should be eliminated from processing as early as possible.

Paging Methods

Many popular databases offer functions allowing you to limit which rows are returned for a given query based upon their position within the record set. For example, MySQL provides the LIMIT qualifier, which takes two parameters. The first LIMIT parameter specifies which (zero-based) row number will be the first record returned, and the second parameter specifies the maximum number of records returned. The query:

SELECT * FROM table LIMIT 20,13

...will return the 20th through the 32nd records -- assuming at least 33 records are available to return. If fewer than 33 records are available, the query will return all records from record 20 on. If fewer than 20 records are available, none will be returned.

SQL Server does not have this functionality, however the 2005 release does have a number of other new tricks. For instance, support for CLR procedures means it is possible to use existing paging methods to write VB.NET or C# code that would execute within the SQL Server environment. Unfortunately, CLR procedures are not as efficient as native Transact SQL. To ensure best performance, queries should still be written in TSQL whenever practical.

Using ROW_NUMBER()

TSQL in the 2005 release includes the ROW_NUMBER() function, which adds an integer field to each record with the record's ordinal result set number. Stated more simply, it adds the record's position within the result set as an additional field so that the first record has a 1, the second a 2, etc. This may appear to be of little value, however by using nested queries we can use this to our advantage.

To demonstrate ROW_NUMBER() and to explore how the paging solution works, create a simple salary table and populate it with random data using the following commands:

CREATE TABLE [dbo].[Salaries](
[person] [nvarchar](50) NOT NULL,
[income] [money] NOT NULL,
CONSTRAINT [PK_salaries] PRIMARY KEY CLUSTERED(
[person] ASC
)) ON [PRIMARY]
GO

INSERT INTO Salaries VALUES ('Joe', '28000')
INSERT INTO Salaries VALUES ('Sue', '96000')
INSERT INTO Salaries VALUES ('Michael', '45000')
INSERT INTO Salaries VALUES ('John', '67000')
INSERT INTO Salaries VALUES ('Ralph', '18000')
INSERT INTO Salaries VALUES ('Karen', '73000')
INSERT INTO Salaries VALUES ('Waldo', '47000')
INSERT INTO Salaries VALUES ('Eva', '51000')
INSERT INTO Salaries VALUES ('Emerson', '84000')
INSERT INTO Salaries VALUES ('Stanley', '59000')
INSERT INTO Salaries VALUES ('Jorge', '48000')
INSERT INTO Salaries VALUES ('Constance', '51000')
INSERT INTO Salaries VALUES ('Amelia', '36000')
INSERT INTO Salaries VALUES ('Anna', '49000')
INSERT INTO Salaries VALUES ('Danielle', '68000')
INSERT INTO Salaries VALUES ('Stephanie', '47000')
INSERT INTO Salaries VALUES ('Elizabeth', '23000')

The ROW_NUMBER() function has no parameters - it simply adds the row number to each record in the result set. To ensure the numbering is consistent, however, SQL Server needs to know how to sort the data. Because of this, ROW_NUMBER() must immediately be followed by the OVER() function. OVER() has one required parameter, which is an ORDER BY clause. The basic syntax for querying the Salaries table is:

SELECT ROW_NUMBER() OVER(ORDER BY person), person, income
FROM Salaries

This returns the following result:

(No column name)personincome
1Amelia36000.00
2Anna49000.00
3Constance51000.00
4Danielle68000.00
5Elizabeth23000.00
6Emerson84000.00
7Eva51000.00
8Joe28000.00
9John67000.00
10Jorge48000.00
11Karen73000.00
12Michael45000.00
13Ralph18000.00
14Stanley59000.00
15Stephanie47000.00
16Sue96000.00
17Waldo47000.00

The Salaries data now appears sorted by person, and it has an extra column indicating each record's position within the results.

If for any reason you wanted the results to display in a different order than they were numbered in, you can include a different ORDER BY clause as part of the normal SELECT syntax:

SELECT ROW_NUMBER() OVER(ORDER BY person), person, income
FROM Salaries
ORDER BY income

This returns the following result:

(No column name)personincome
13Ralph18000.00
5Elizabeth23000.00
8Joe28000.00
1Amelia36000.00
12Michael45000.00
15Stephanie47000.00
17Waldo47000.00
10Jorge48000.00
2Anna49000.00
3Constance51000.00
7Eva51000.00
14Stanley59000.00
9John67000.00
4Danielle68000.00
11Karen73000.00
6Emerson84000.00
16Sue96000.00

If we want to limit the results displayed to a certain range, we need to nest this SELECT inside another one and provide a name for the ROW_NUMBER() column. To limit our results to records 5 through 9, we can use the following query:

SELECT *
FROM (SELECT ROW_NUMBER() OVER(ORDER BY person) AS
rownum, person, income FROM Salaries) AS Salaries1
WHERE rownum >= 5 AND rownum <= 9

This returns the following result:

rownumpersonincome
5Elizabeth23000.00
6Emerson84000.00
7Eva51000.00
8Joe28000.00
9John67000.00

Again, we can change the sort order by adding an ORDER BY clause. This is most easily accomplished by using the outer SELECT statement:

SELECT *
FROM (SELECT ROW_NUMBER() OVER(ORDER BY person) AS
rownum, person, income FROM Salaries) AS Salaries1
WHERE rownum >= 5 AND rownum <= 9
ORDER BY income

This returns the following result:

rownumpersonincome
5Elizabeth23000.00
8Joe28000.00
7Eva51000.00
9John67000.00
6Emerson84000.00

If we want to support the same type of arguments that MySQL's LIMIT() supports, we can create a stored procedure that accepts a beginning point and a maximum number of records to return. ROW_NUMBER requires that the data be sorted, so we will also have a required parameter for the ORDER BY clause. Execute the following statement to create a new stored procedure:

CREATE PROCEDURE [dbo].[pageSalaries]
@start int = 1
,@maxct int = 5
,@sort nvarchar(200)
AS
SET NOCOUNT ON
DECLARE
@STMT nvarchar(max), -- SQL statement to execute
@ubound int

IF @start < 1 SET @start = 1
IF @maxct < 1 SET @maxct = 1
SET @ubound = @start + @maxct
SET @STMT = ' SELECT person, income
FROM (
SELECT ROW_NUMBER() OVER(ORDER BY ' + @sort + ') AS row, *
FROM Salaries
) AS tbl
WHERE row >= ' + CONVERT(varchar(9), @start) + ' AND
row < ' + CONVERT(varchar(9), @ubound)
EXEC (@STMT) -- return requested records

The pageSalaries procedure begins with SET NOCOUNT ON to disable the record count message (a common step for optimizing query performance). We then declare two necessary variables, @STMT and @ubound. Because we want to be able to change what ORDER BY argument is used, we need to dynamically generate our query statement by storing it in @STMT. The next lines ensure that only positive numbers are used for the starting position and maximum size, then calculate the range of ROW_NUMBER() values being requested. (If we wanted to be zero-based like MySQL's LIMIT, we could do so with a few minor tweaks.) Once the dynamic SQL command has been strung together, it is executed so that the results are returned.

Execute the following statement to test the stored procedure:

pageSalaries 4, 7, 'income'

This returns the following result:

personincome
Amelia36000.00
Michael45000.00
Stephanie47000.00
Waldo47000.00
Jorge48000.00
Anna49000.00
Constance51000.00

If we execute:

pageSalaries 13, 7, 'income'

we receive back:

personincome
John67000.00
Danielle68000.00
Karen73000.00
Emerson84000.00
Sue96000.00

... because the query goes beyond the number of records available.

Taking this one step further, we can make a stored procedure that does a more general form of paging. In fact, it can be generalized to the point that it can be used to return any collection of fields, in any order, with any filtering clause. To create this wunderkind marvel, execute the following command:

CREATE PROCEDURE [dbo].[utilPAGE]
@datasrc nvarchar(200)
,@orderBy nvarchar(200)
,@fieldlist nvarchar(200) = '*'
,@filter nvarchar(200) = ''
,@pageNum int = 1
,@pageSize int = NULL
AS
SET NOCOUNT ON
DECLARE
@STMT nvarchar(max) -- SQL to execute
,@recct int -- total # of records (for GridView paging interface)

IF LTRIM(RTRIM(@filter)) = '' SET @filter = '1 = 1'
IF @pageSize IS NULL BEGIN
SET @STMT = 'SELECT ' + @fieldlist +
'FROM ' + @datasrc +
'WHERE ' + @filter +
'ORDER BY ' + @orderBy
EXEC (@STMT) -- return requested records
END ELSE BEGIN
SET @STMT = 'SELECT @recct = COUNT(*)
FROM ' + @datasrc + '
WHERE ' + @filter
EXEC sp_executeSQL @STMT, @params = N'@recct INT OUTPUT', @recct = @recct OUTPUT
SELECT @recct AS recct -- return the total # of records

DECLARE
@lbound int,
@ubound int

SET @pageNum = ABS(@pageNum)
SET @pageSize = ABS(@pageSize)
IF @pageNum < 1 SET @pageNum = 1
IF @pageSize < 1 SET @pageSize = 1
SET @lbound = ((@pageNum - 1) * @pageSize)
SET @ubound = @lbound + @pageSize + 1
IF @lbound >= @recct BEGIN
SET @ubound = @recct + 1
SET @lbound = @ubound - (@pageSize + 1) -- return the last page of records if -- no records would be on the
-- specified page
END
SET @STMT = 'SELECT ' + @fieldlist + '
FROM (
SELECT ROW_NUMBER() OVER(ORDER BY ' + @orderBy + ') AS row, *
FROM ' + @datasrc + '
WHERE ' + @filter + '
) AS tbl
WHERE
row > ' + CONVERT(varchar(9), @lbound) + ' AND
row < ' + CONVERT(varchar(9), @ubound)
EXEC (@STMT) -- return requested records
END

You may receive the following error message from SQL Server, which you can confidently ignore:

Cannot add rows to sys.sql_dependencies for the stored procedure because it depends on the missing table 'sp_executeSQL'. The stored procedure will still be created; however, it cannot be successfully executed until the table exists.

The utilPage procedure accepts 6 parameters:

@datasrc - the table (or stored procedure, etc.) name
@orderBy- the ORDER BY clause
@fieldlis- the fields to return (including calculated expressions)
@filter- the WHERE clause
@pageNum- the page to return (must be greater than or equal to one)
@pageSize- the number of records per page

The stored procedure needs the name of a data source to query against (such as a table) and one or more fields to sort by (since OVER() requires an ORDER BY clause). If @filter is blank (the default), it will be set to "1 = 1" as a simple way to select all records. If @pageSize is not supplied, the query will run without paging and will not return a record count.

If, however, @pageSize is supplied, a version of the query is executed to get the total number of records. In order to have this record count available within the procedure and as a returned value, we use sp_executeSQL to support executing the statement while returning an output parameter. The record count is used to prevent returning empty results when possible, and to support paging interfaces that calculate the number of pages available (such as GridView). If we were calling this stored procedure to populate a GridView, we would return @recct as a ReturnValue parameter instead of using a result set, but we will use a result set for demonstration purposes.

The procedure calculates what the actual record positions will be for the requested page. Rather than allow the query to fail, there are safety checks ensuring that @pageSize and @pageNum are greater than zero, and that the result set will not be empty. If the specified page is out of range, this procedure will return the last possible page of records. This is helpful if a user changes more than one setting before refreshing their data, or if a significant amount of data is deleted between requests.

The remainder of the procedure is virtually identical to the pageSalaries procedure. To test the utilPAGE stored procedure, execute the following statement:

utilPAGE 'Salaries', 'person', '*', 'income > 1000', 2, 4

This returns the following two result sets:

recct
17

rowpersonincome
5Elizabeth23000
6Emerson84000
7Eva51000
8Joe28000

If we execute:

utilPAGE 'Salaries', 'person', 'person, income', '', 13, 3

...we receive back:

recct
17

personincome
Stephanie47000
Sue96000
Waldo47000

Even though the request should be for records 36 through 38 - far outside of what is available - the procedure returns the last available page of records. In contrast, requesting the third page with seven records per page using:

utilPAGE 'Salaries', 'person', 'person, income', '', 3, 7

...returns the last three records, as the page is not completely out of bounds:

personincome
Stephanie47000
Sue96000
Waldo47000

All of these examples are based on simple single-table queries, which may not reflect what you need in the real world. While the utilPAGE procedure does not support ad-hoc JOINs, it does work with SQL Views. If you want paging support for multi-table queries, you should create a View (with all of the necessary JOINs) to use as the data source. Using a View follows good design practices as it ensures that your Joins are performed consistently, allows easier ad-hoc querying from the command line, and is much easier to troubleshoot than a stored procedure's dynamic SELECT statement logic.

Conclusion

While SQL Server does not have as simple a method for paging results as some other databases, features introduced in the 2005 release have made it possible to page results efficiently more easily than ever before. In the next article in this series, we will go a step further and integrate this paging logic with a GridView through a Data Access Layer.


rbldnsd

rbldnsd



Platform is Windows XP.
Output folder: C:\Program Files\wrbldnsd
Extract: license.txt
Output folder: C:\Program Files\wrbldnsd\bin
Extract: vcwbldd.dll
Extract: cygiconv-2.dll
Extract: cygintl-2.dll
Extract: cygintl-3.dll
Extract: cygncurses-8.dll
Extract: cygpcre-0.dll
Extract: cygreadline6.dll
Extract: cygcrypto-0.9.8.dll
Extract: cygz.dll
Extract: cygminires.dll
Extract: rsync.exe
Extract: rbldnsd.exe
Extract: cygrunsrv.exe
Extract: bash.exe
Extract: chmod.exe
Extract: cksum.exe
Extract: cmp.exe
Extract: cp.exe
Extract: date.exe
Extract: gawk.exe
Extract: gkill.exe
Extract: grep.exe
Extract: ls.exe
Extract: mv.exe
Extract: ps.exe
Extract: rm.exe
Extract: sed.exe
Extract: tail.exe
Extract: touch.exe
Extract: ssh.exe
Extract: get-stats.sh
Extract: get-clear-stats.sh
Output folder: C:\Program Files\wrbldnsd\doc
Extract: rsync.html
Extract: rbldnsd.html
Output folder: C:\Program Files\wrbldnsd\etc
Extract: profile
Output folder: C:\Program Files\wrbldnsd
Output folder: C:\Program Files\wrbldnsd\rbldnsd
Extract: pbl
Extract: sbl
Extract: xbl
Output folder: C:\Program Files\wrbldnsd
Create folder: C:\Program Files\wrbldnsd\tmp
Set Root
Configure rbldnsd as a service
Service : rbldnsd
Description : A small and fast DNS daemon to serve DNSBL zones
Current State : Stopped
Command : /bin/rbldnsd -n -f -b 0.0.0.0/53 -w /rbldnsd sbl.dnsbl:ip4set:sbl xbl.dnsbl:ip4set:xbl pbl.dnsbl:ip4set:pbl sbl-xbl.dnsbl:ip4set:sbl sbl-xbl.dnsbl:ip4set:xbl zen.dnsbl:ip4set:sbl zen.dnsbl:ip4set:pbl zen.dnsbl:ip4set:xbl
stdin path : /dev/null
stdout path : C:\Program Files\wrbldnsd\rbldnsd-stdout.log
stderr path : C:\Program Files\wrbldnsd\rbldnsd-stderr.log
Environment : CYGWIN="nontsec binmode"
Process Type : Own Process
Startup : Automatic
Dependencies : tcpip
Account : LocalSystem

Created uninstaller: C:\Program Files\wrbldnsd\uninstall.exe
Make shortcuts
Create folder: C:\Documents and Settings\rgr\Start Menu\Programs\wrbldnsd
Output folder: C:\Program Files\wrbldnsd
Create shortcut: C:\Documents and Settings\rgr\Start Menu\Programs\wrbldnsd\Uninstall wrbldnsd.lnk
Create shortcut: C:\Documents and Settings\rgr\Start Menu\Programs\wrbldnsd\Write stats to log file.lnk
Create shortcut: C:\Documents and Settings\rgr\Start Menu\Programs\wrbldnsd\Write stats to log file & Clear stats.lnk
Create folder: C:\Documents and Settings\rgr\Start Menu\Programs\wrbldnsd\Documentation
Create shortcut: C:\Documents and Settings\rgr\Start Menu\Programs\wrbldnsd\Documentation\rsync.lnk
Create shortcut: C:\Documents and Settings\rgr\Start Menu\Programs\wrbldnsd\Documentation\rbldnsd.lnk
Start rbldnsd service
Completed





rbldnsd-stdout.log



rbldnsd: listening on 0.0.0.0/53
rbldnsd: ip4set:sbl: 20060203 150652: e32/24/16/8=0/0/0/0
rbldnsd: ip4set:xbl: 20060203 150652: e32/24/16/8=0/0/0/0
rbldnsd: ip4set:pbl: 20060203 150652: e32/24/16/8=0/0/0/0
rbldnsd: zones reloaded, time 0.1e/0.0u sec, mem arena=256 free=9 mmap=0 Kb
rbldnsd: rbldnsd version 0.996 (19 Feb 2006) started (1 socket(s), 5 zone(s))
rbldnsd: stats for 225secs zone sbl.dnsbl: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 225secs zone xbl.dnsbl: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 225secs zone pbl.dnsbl: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 225secs zone sbl-xbl.dnsbl: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 225secs zone zen.dnsbl: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 225sec: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 1154secs zone sbl.dnsbl: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 1154secs zone xbl.dnsbl: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 1154secs zone pbl.dnsbl: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 1154secs zone sbl-xbl.dnsbl: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 1154secs zone zen.dnsbl: tot=0 ok=0 nxd=0 err=0 in=0 out=0
rbldnsd: stats for 1154sec: tot=0 ok=0 nxd=0 err=0 in=0 out=0



  1. rbldnsd: Small Daemon for DNSBLs
    http://www.corpit.ru/mjt/rbldnsd.html

  2. How to setup a local RBL mirror
    http://www.surbl.org/rbldnsd-howto.html

  3. http://www.surbl.org/rsync-signup.html

  4. http://www.itefix.no/

  5. http://www.mneylon.com/blog/archives/2004/11/13/dns-blacklists-setting-up-a-local-mirror/

  6. http://www.ladro.com/docs/dns/rblsmtpd.html




Technorati :
Del.icio.us :
Ice Rocket :
Flickr :
Zooomr :
Buzznet :
Riya :
43 Things :

Dirvish - fast, disk based, rotating network backup system

Dirvish is a fast, disk based, rotating network backup system.
read more at: Welcome to Dirvish

With dirvish you can maintain a set of complete images of your filesystems with unattended creation and expiration. A dirvish backup vault is like a time machine for your data.


Documentation
Manpages
dirvish.8
dirvish.conf.5
dirvish-runall.8
dirvish-expire.8
dirvish-locate.8

http://www.dirvish.org/dirvish.8.html