Monday, May 14, 2007

Asynchronous File I/O

Dev Source: Making Asynchronous File I/O Work in .NET: A Survival Guide


Jim Mischel

Picture two members of the Microsoft .NET Framework development team meeting in the break room.

Programmer 1: "Can you imagine an application developer trying to do an asynchronous write in a C# program?"

Programmer 2: "Ha! I wonder if he'll just code it up and assume that it's asynchronous, or if he'll notice that it blocks and spend hours trying to figure out why."

Programmer 1: "Either way it'd be a hoot to watch, wouldn't it?"

Programmer 2: "Yeah. I love messing with peoples' heads."

Okay, so perhaps the .NET Framework developers didn't purposely make the asynchronous file operations work synchronously. It's even possible that whoever wrote the asynchronous file I/O support didn't know that it sometimes doesn't work as expected. But that shortcoming had to be revealed during testing, right?

Apparently not, as the .NET Framework SDK documentation makes almost no mention of the possibility that an asynchronous file operation will block and complete synchronously. In addition, most asynchronous I/O samples in the SDK and on other Web sites show "asynchronous" operations that don't actually execute in the background.

In this article, I'll show you how the asynchronous file I/O routines are supposed to work, demonstrate that they often don't work as expected, explain why, and show you how to guarantee that your asynchronous file operations occur in the background without blocking the main execution thread.

A Refresher

To a processor capable of several billion operations per second, an I/O device that requires milliseconds for positioning and transfers data at a couple hundred megabits per second is slow, even glacial. Whenever your application writes data to or reads data from a disk or network file, the processor spends most of its time waiting for the I/O channel or servicing other tasks. In either case, your program's execution is blocked, waiting for the file operation to complete. Even with a fast hard drive that can write 10 MB or more per second, a 10 MB write costs a full second during which your application is unresponsive. Try blocking your GUI application for a full second, and see how quickly your users complain about a clunky interface.

Another situation in which slow file operations are annoying is when you have to read from or write to multiple files. Imagine having to read two different large files when your application starts. Each file takes 15 seconds to load, which means that it takes 30 seconds for your application to load. It probably wouldn't surprise you to learn that you could load both files in a total of about 15 seconds if you could tell the computer to do two things at once. While it's waiting on the I/O channel for one of the files, it can be gathering data from the other file.

Operations that occur in the order that they're specified are said to occur synchronously. Most of the time, we want our programs to operate synchronously, because the program's logic depends on it. It would be unfortunate, for example, if the computer tried to use the result of a calculation before the result was computed.

Sometimes, though, we don't really care about the order of some intermediate steps, just so long as all of the steps are completed before we have to use any of the results. For example, think of the instructions to bake cookies:

Gather the ingredients
Preheat the oven to 450 degrees
Lightly grease a cookie tray
Mix the ingredients in a bowl
Put the cookie dough on the tray
Place tray in oven and cook until golden brown

It's pretty obvious here that it doesn't matter in what order the first three operations happen, or if they all happen concurrently. You just have to make sure that you gather the ingredients before you try to mix them, that the cookie tray is greased and the ingredients are mixed before you put the dough on it, and that the oven is preheated before you pop the tray in. If you have a helper, you can save time by having him preheat the oven and grease the cookie tray while you're gathering and mixing the ingredients.

What you have here, and in many computer programs, are small sets of operations that can happen in any order, plus synchronization points where you ensure that the previous operations are complete. If you add the synchronization points to the cookie instructions, you can draw a flow chart that illustrates the tasks that can be performed concurrently.
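To make the idea concrete, here's a minimal C# sketch of the cookie workflow, with the preheating and greasing handed off to background tasks. (PreheatOven, GreaseTray, and the other methods are hypothetical stand-ins, not code from this article; the WaitOne calls are the synchronization point.)

using System.Threading;

static void BakeCookies()
{
    ManualResetEvent ovenReady = new ManualResetEvent(false);
    ManualResetEvent trayReady = new ManualResetEvent(false);

    // hand the order-independent tasks to helper threads
    ThreadPool.QueueUserWorkItem(delegate { PreheatOven(); ovenReady.Set(); });
    ThreadPool.QueueUserWorkItem(delegate { GreaseTray(); trayReady.Set(); });

    // the main thread keeps working in the meantime
    GatherIngredients();
    MixIngredients();

    // synchronization point: both helper tasks must finish before we continue
    ovenReady.WaitOne();
    trayReady.WaitOne();

    PutDoughOnTray();
    Bake();
}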

Wouldn't it be nice if you could do the same thing with your file I/O? Just think if you could load seldom-used data in the background while the user is navigating through menus. That would be a much nicer user experience than staring at a splash screen while the data buffers are initialized.

Reading a File Asynchronously

The designers of the .NET Framework went out of their way to make asynchronous I/O easy to use. In its simplest form, writing a program to do asynchronous reads and writes is almost as easy as using standard, synchronous I/O.

Discounting the changes in how you open the file and the name of the function you call, the only real difference with asynchronous I/O is the additional requirement of a synchronization point. At some time you have to check the result of the operation, termed "harvesting the result."

For example, consider this code snippet that uses synchronous operations to read data from a file:

static void synchronousRead()
{
    byte[] data = new byte[BUFFER_SIZE];
    FileStream fs = new FileStream("readtest.dat", FileMode.Open,
        FileAccess.Read, FileShare.None);
    try
    {
        int bytesRead = fs.Read(data, 0, BUFFER_SIZE);
        Console.WriteLine("{0} bytes read", bytesRead);
    }
    finally
    {
        fs.Close();
    }
}

To do the same thing asynchronously, you have to pass a couple of other options to the FileStream constructor, call BeginRead rather than Read, and you have to create the synchronization point where you harvest the result. Let me show you the code first, and then I'll explain how it works.

static void asyncRead()
{
    byte[] data = new byte[BUFFER_SIZE];
    FileStream fs = new FileStream("readtest.dat", FileMode.Open,
        FileAccess.Read, FileShare.None, 1, true);
    try
    {
        // initiate an asynchronous read
        IAsyncResult ar = fs.BeginRead(data, 0, data.Length, null, null);

        // The read is proceeding in the background.
        // You can do other processing here.
        // When you need to access the data that's been read, you
        // need to ensure that the read has completed, and then
        // harvest the result.

        // wait for the operation to complete
        ar.AsyncWaitHandle.WaitOne();

        // harvest the result
        int bytesRead = fs.EndRead(ar);

        Console.WriteLine("{0} bytes read", bytesRead);
    }
    finally
    {
        fs.Close();
    }
}

The first difference in the asynchronous read code is the way that you open the file. To perform asynchronous file I/O in .NET, you must create the FileStream with the useAsync flag set to true. The only way to set this flag is to call one of the FileStream constructors that allow it to be specified. If you try to do asynchronous operations on a file that was not opened with this flag set, the operations will proceed synchronously.
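For reference, here are two ways of opening a file for asynchronous access (a sketch; the FileOptions overload is the .NET 2.0 spelling of the same request):

// bool useAsync overload
FileStream fs1 = new FileStream("readtest.dat", FileMode.Open,
    FileAccess.Read, FileShare.None, 4096, true);

// FileOptions overload (.NET 2.0); FileOptions.Asynchronous has the same effect
FileStream fs2 = new FileStream("readtest.dat", FileMode.Open,
    FileAccess.Read, FileShare.None, 4096, FileOptions.Asynchronous);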

To initiate an asynchronous read, you call BeginRead, passing it the same parameters that you pass to Read, plus two others: a completion delegate and a state object. In this example, I've supplied null for both of those parameters. Later in the article, I describe how to use those two parameters, and provide an example.

Calling BeginRead causes the read operation to begin executing in a background thread. The main thread returns from the BeginRead call and continues executing. You can perform any processing you like after BeginRead returns, except that you can't touch the buffer that's being filled by the read operation (the data buffer, in this example). Reading from or writing to that buffer while the read operation is running can corrupt the data being read.

The value returned from BeginRead is an object that implements the IAsyncResult interface. This object provides information about the read operation, and is used both to determine when the operation is complete and to harvest the result. The IsCompleted property tells you whether the asynchronous operation has completed. The CompletedSynchronously property tells you whether the operation was completed "immediately" during the BeginRead call, and the AsyncState property contains a reference to the state object that you passed to BeginRead. (I discuss CompletedSynchronously and AsyncState a little later.)

The whole point to asynchronous I/O is that you want the file operation to run in the background while you do other things. When it's time to work with the results of the I/O operation, you synchronize and continue on your way.

There are two parts to synchronizing: determining that the operation has completed, and harvesting the result. Whereas there's only one way to harvest the result, there are two ways to wait for the operation to complete: you can poll, or you can wait on an event.

The code in the example above waits for completion by calling the WaitOne method on the IAsyncResult object's AsyncWaitHandle property. This uses a Windows synchronization object that is signaled when the operation completes. It is very processor-efficient, because the waiting thread (your main thread, in this case) consumes zero processor cycles while it waits. When the read is complete, the wait handle is signaled and Windows transfers control to the waiting thread. This is the preferred way to synchronize when you want to spawn an asynchronous read, perform some processing, and then wait for the read to complete before continuing.

The other way to wait for completion is to poll: periodically check the status of the IsCompleted flag, and continue processing until the flag is set. Here's a code snippet that does just that.

// wait for the operation to complete
while (!ar.IsCompleted)
{
    System.Threading.Thread.Sleep(10);
}

This is the recommended method of waiting for completion if you want to do some processing until the read is done. Perhaps you want to run some animation and pre-calculate some noncritical items while waiting for the read to finish. In the code example, the thread just sleeps for 10 milliseconds between each check of IsCompleted. You could just as well have coded a processing loop that takes a small amount of time to complete, as sketched below.
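For instance, here's a variation on the loop above (a sketch only; doOneWorkUnit is a hypothetical method that performs one small slice of work, such as an animation frame or a pre-calculation):

// wait for the operation to complete, doing useful work between checks
while (!ar.IsCompleted)
{
    doOneWorkUnit();   // hypothetical: one small slice of foreground work
}
// harvest the result once the read has finished
int bytesRead = fs.EndRead(ar);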

The nice thing about the polling method is that it gives us a simple way to determine whether the I/O operation is proceeding synchronously or asynchronously. If you add an output statement to the loop, you should see at least some output before the I/O operation completes. Provided, of course, that the I/O was handled asynchronously. The code below modifies the example above slightly, checking the CompletedSynchronously flag and displaying a period each time through the polling loop.

static void asyncRead()
{
    byte[] data = new byte[BUFFER_SIZE];
    FileStream fs = new FileStream("e:\\readtest.dat", FileMode.Open,
        FileAccess.Read, FileShare.None, 1, true);
    try
    {
        // initiate an asynchronous read
        IAsyncResult ar = fs.BeginRead(data, 0, data.Length, null, null);
        if (ar.CompletedSynchronously)
        {
            Console.WriteLine("Operation completed synchronously.");
        }
        else
        {
            // Read is proceeding in the background.
            // wait for the operation to complete
            while (!ar.IsCompleted)
            {
                Console.Write('.');
                System.Threading.Thread.Sleep(10);
            }
            Console.WriteLine();
        }
        // harvest the result
        int bytesRead = fs.EndRead(ar);

        Console.WriteLine("{0} bytes read", bytesRead);
    }
    finally
    {
        fs.Close();
    }
}

If the operation proceeds in the background, you should see at least one period displayed on the screen. If you run this on a fast hard drive, especially if BUFFER_SIZE is small, one period is all that you'll see. Try reading from a USB flash drive or from a network drive with a buffer size of five megabytes. Then you should see a lot of periods racing across the screen.

Don't be surprised if, when you run the program twice against the same file, the second run takes only a fraction of the time of the first. Windows is very good about caching data (sometimes too good, as you'll see), so the second time through, the data is already in memory and Windows just transfers it to your buffer. No device access required.

Sometimes, your program will "lock up" (stop responding) for a time after you call BeginRead, yet when the call returns, the CompletedSynchronously property of the returned IAsyncResult is not set. In that case, you'll see one period displayed on the screen, and then the operation is complete. This doesn't happen very often, if at all, during asynchronous reads, but it happens often during asynchronous writes, as you'll see below.

Completion Callbacks and State Objects

In some cases, you want to execute some code immediately after the asynchronous I/O operation completes, without waiting for the main thread to finish what it's doing. Or perhaps you don't want the main thread to be concerned with cleaning up after the read (calling EndRead and closing the stream). That's where the completion callback comes in; you can set things up so that the background thread that executes the asynchronous read transfers control to the completion callback method after the read completes.

The IAsyncResult reference that was created when you initiated the read with BeginRead is passed to the completion callback as its only parameter. A reference to the state object that you passed to BeginRead is stored in the IAsyncResult object. This functionality allows true "fire and forget" asynchronous I/O. Here's an example:

static private byte[] globalData = new byte[BUFFER_SIZE];

static void asyncReadWithCallback()
{
    FileStream fs = new FileStream("readtest.dat", FileMode.Open,
        FileAccess.Read, FileShare.None, 1, true);
    // initiate an asynchronous read
    IAsyncResult ar = fs.BeginRead(globalData, 0, BUFFER_SIZE,
        new AsyncCallback(readCallback), fs);
    if (ar.CompletedSynchronously)
    {
        Console.WriteLine("Operation completed synchronously.");
    }
    // The read has completed or it is proceeding in the background.
    // Either way, we don't care, as this main thread will not
    // be accessing the information, or we have some other way to
    // ensure that the read is finished before we access the data.

    // Finish whatever processing here ...

    // Note that we don't close the stream here, because
    // it will be closed by the callback function.
}

// readCallback is called when the asynchronous read completes.
// It harvests the result and closes the stream.
// You could use this routine to signal a manual event or do
// other processing that must occur when the read is complete.
static void readCallback(IAsyncResult ar)
{
    FileStream fs = (FileStream)ar.AsyncState;
    int bytesRead = fs.EndRead(ar);
    Console.WriteLine("{0} bytes read", bytesRead);
    fs.Close();
}

The call to BeginRead creates an AsyncCallback delegate that references the callback function, and passes the FileStream reference as the state object. The FileStream reference is stored in the IAsyncResult object, and is available in the AsyncState property. You can pass any type as the state object, and then cast the AsyncState property back to that type from within the callback function. If you want to do any operations on the file stream, be sure to include the stream in whatever state object you create.
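Here's a minimal sketch of that advice (ReadState and its members are my own names, not part of the Framework): bundle the stream, the buffer, and anything else the callback needs into one object, and pass that object to BeginRead.

class ReadState
{
    public FileStream Stream;   // needed so the callback can call EndRead and Close
    public byte[] Buffer;       // the buffer being filled by the read
    public string Name;         // hypothetical extra context for the callback
}

static void startRead()
{
    ReadState state = new ReadState();
    state.Buffer = new byte[BUFFER_SIZE];
    state.Stream = new FileStream("readtest.dat", FileMode.Open,
        FileAccess.Read, FileShare.None, 1, true);
    state.Name = "configuration data";
    state.Stream.BeginRead(state.Buffer, 0, state.Buffer.Length,
        new AsyncCallback(readDone), state);
}

static void readDone(IAsyncResult ar)
{
    // cast AsyncState back to the type that was passed to BeginRead
    ReadState state = (ReadState)ar.AsyncState;
    int bytesRead = state.Stream.EndRead(ar);
    Console.WriteLine("{0}: {1} bytes read", state.Name, bytesRead);
    state.Stream.Close();
}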

The documentation for BeginRead, by the way, says that "on Windows, all I/O operations smaller than 64 KB will complete synchronously for better performance." My testing with small reads shows that this is not true. I've tried many different buffer sizes, from 2 bytes to 64 KB, and not one of them completed synchronously. At least, ar.CompletedSynchronously was never set. It's possible that the operations were blocking but being reported as having completed asynchronously.
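If you want to reproduce the test, a sketch along these lines will do it (this is a reconstruction, not the article's original test code):

static void probeBufferSizes()
{
    foreach (int size in new int[] { 2, 512, 4096, 65536 })
    {
        byte[] buffer = new byte[size];
        using (FileStream fs = new FileStream("readtest.dat", FileMode.Open,
            FileAccess.Read, FileShare.None, 1, true))
        {
            IAsyncResult ar = fs.BeginRead(buffer, 0, buffer.Length, null, null);
            Console.WriteLine("{0,6} bytes: CompletedSynchronously = {1}",
                size, ar.CompletedSynchronously);
            // always harvest the result, even for a probe
            fs.EndRead(ar);
        }
    }
}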

The .NET Framework's implementation of asynchronous reading is simple to use, and quite powerful. Asynchronous writes work in much the same way, at least in theory.

Asynchronous Writes

To write data asynchronously, you use the same type of code as for a read. The only real difference is that you call BeginWrite and EndWrite instead of BeginRead and EndRead, and the EndWrite method has no return value. Other than that, the code is identical. The I/O completion routine and the state object also operate exactly the same for writes as for reads. Below is the code for a simple asynchronous write.

static void asyncWrite()
{
    byte[] data = new byte[BUFFER_SIZE];
    FileStream fs = new FileStream("writetest.dat", FileMode.Create,
        FileAccess.Write, FileShare.None, 1, true);
    try
    {
        // initiate an asynchronous write
        Console.WriteLine("start write");
        IAsyncResult ar = fs.BeginWrite(data, 0,
            data.Length, null, null);
        if (ar.CompletedSynchronously)
        {
            Console.WriteLine("Operation completed synchronously.");
        }
        else
        {
            // write is proceeding in the background.
            // wait for the operation to complete
            while (!ar.IsCompleted)
            {
                Console.Write('.');
                System.Threading.Thread.Sleep(10);
            }
            Console.WriteLine();
        }
        // harvest the result
        fs.EndWrite(ar);
        Console.WriteLine("data written");
    }
    finally
    {
        fs.Close();
    }
}

I added the line that outputs "start write," because I want to illustrate that asynchronous writes don't always proceed asynchronously. If you use this code to output a file to the local computer's hard drive, you'll probably see one period displayed before the program exits. Local hard drives are so fast, and Windows is so good at buffering, that local file operations appear to be instantaneous, even when they aren't.

If you want a surprise, change the program so that it outputs to a network device or a USB flash drive — something slower than a local hard drive. If your system is anything like the others I've tested this program on, you will see the message "start write", and then the program will display nothing for a while. Finally, it will display one period and then exit. The program obviously blocks during the I/O operation, but the CompletedSynchronously property isn't set. To paraphrase that most famous line from Hamlet, "Something is rotten inside of Windows."

What They Don't Tell You

Neither the .NET Framework SDK documentation for BeginWrite nor the Windows API documentation for WriteFile (the API function that BeginWrite uses) makes much mention of the possibility that an asynchronous write might block. The documentation for BeginWrite does say that "the underlying operating system resources might allow access in only one of these modes." Still, if the OS didn't support asynchronous writes, you'd expect BeginWrite to return an IAsyncResult with the CompletedSynchronously flag set. Right?

It took some digging, but I finally came across a Microsoft Knowledge Base article titled Asynchronous Disk I/O Appears as Synchronous on Windows NT, Windows 2000, and Windows XP. This article is chock full of good information. For example, under the heading "Set Up Asynchronous I/O," the article states:

Be careful when coding for asynchronous I/O because the system reserves the right to make an operation synchronous if it needs to. Therefore, it is best if you write the program to correctly handle an I/O operation that may be completed either synchronously or asynchronously.

Having programmed computers for 25 years, I understood and even expected that. It's interesting, though, that the SDK documentation doesn't mention it, nor do any of the many articles I've found about asynchronous I/O under .NET. That still didn't answer my question as to why asynchronous writes were blocking.

The article covers many reasons why your asynchronous file operations might appear synchronous. I found my answer to the problem under the heading "Extending a file." As the article states:

Another reason that I/O operations are completed synchronously is the operations themselves. On Windows NT, any write operation to a file that extends its length will be synchronous.

The article mentions Windows NT specifically, but I've seen the same behavior on Windows 2000 and Windows Server 2003. The rest of the section goes on to explain how to get around this limitation, and the security reasons for not attempting the workaround.

The problem is that the FileStream constructor I call creates a new file (FileMode.Create), which means that any existing file is truncated to zero bytes, just like a newly created file. Writing any data to that file extends it, thereby causing the operation to proceed synchronously. I proved that this is the cause of the blocking by running my program once and then changing the open mode to FileMode.Open. When I ran the program again, it opened the existing file and wrote data to it from the beginning, overwriting the existing data. That operation proceeded asynchronously, as you would expect.

I don't know enough about Windows internals and the NTFS file system to understand why extending a file has to be a synchronous operation. It'd be interesting, although not terribly useful, to learn the reason, and I encourage any member of the Windows team to contact me with an explanation. Understanding why doesn't solve my problem, though. I want my asynchronous writes!

Alternate Feline De-furring Method

The problem here is that the operating system blocks the thread during a write that extends a file, and I don't want my thread blocked. That was the whole point of attempting an asynchronous write operation in the first place. Sometimes if you want something done write (er, right), you just have to do it yourself. Since BeginWrite won't guarantee asynchronous operation, I decided to do the I/O by creating and calling an asynchronous delegate. Doing so presents two primary advantages:

I/O is guaranteed to be asynchronous, because it executes on a background thread.
The code executed in the background can involve multiple reads and writes, with arbitrarily complex calculations interspersed between the I/O operations.

Creating an asynchronous delegate isn't much more complicated than creating a file I/O completion callback. All you need to do is define the method that will execute on the background thread, define a delegate type for it, instantiate the delegate, and invoke it. The code below replaces the asynchronous write code from the last example.

// Define the method type that will be called
private delegate void WriteDelegate(byte[] dataBuffer);

static void asyncWriteWithDelegate()
{
    byte[] data = new byte[BUFFER_SIZE];

    // create a new WriteDelegate
    WriteDelegate dlgt = new WriteDelegate(WriteTheData);

    // Invoke the delegate asynchronously.
    IAsyncResult ar = dlgt.BeginInvoke(data, null, null);

    // WriteTheData is now executing asynchronously.
    // Continue with foreground processing here.
    while (!ar.IsCompleted)
    {
        Console.Write('.');
        System.Threading.Thread.Sleep(10);
    }
    Console.WriteLine();

    // harvest the result
    dlgt.EndInvoke(ar);
}

static void WriteTheData(byte[] dataBuffer)
{
    using (FileStream fs = new FileStream("e:\\writetest.dat", FileMode.Create,
        FileAccess.Write, FileShare.None))
    {
        Console.WriteLine("Begin write...");
        fs.Write(dataBuffer, 0, dataBuffer.Length);
        Console.WriteLine("Write complete");
    }
}

Notice the similarity between this code and the asynchronous write code that uses FileStream.BeginWrite. Granted, you have to define and create a delegate, but other than that, the code is almost identical. Calling BeginInvoke is similar to calling BeginWrite, and you harvest the result by calling EndInvoke in the same way that you call EndWrite. You even check for completion in the same way: by testing IAsyncResult.IsCompleted, or by waiting on the AsyncWaitHandle.

You can even pass a completion callback routine and state object to BeginInvoke in the same way that you can for BeginRead and BeginWrite. The completion callback is executed on the background thread after the delegate is done executing, and the state object is passed in the AsyncState property of the IAsyncResult object.
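Here's a sketch of that combination (names are mine, not from the article): the delegate itself is passed as the state object, so the callback can retrieve it and call EndInvoke, giving true fire-and-forget behavior.

static void asyncWriteFireAndForget(byte[] data)
{
    WriteDelegate dlgt = new WriteDelegate(WriteTheData);
    // pass the delegate as the state object so the callback can call EndInvoke
    dlgt.BeginInvoke(data, new AsyncCallback(writeDone), dlgt);
    // the main thread continues immediately; writeDone runs on the
    // background thread after WriteTheData returns
}

static void writeDone(IAsyncResult ar)
{
    WriteDelegate dlgt = (WriteDelegate)ar.AsyncState;
    dlgt.EndInvoke(ar);   // harvest the result; re-raises any exception
    Console.WriteLine("background write finished");
}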

Note also that WriteTheData calls Write rather than BeginWrite to initiate the I/O. Since WriteTheData is executing on a background thread, it makes little sense to do the I/O asynchronously. This would hold true for reads, too, although I guess there are rare situations in which you'd want a background thread to do asynchronous I/O.

The second advantage of using asynchronous delegates rather than BeginRead or BeginWrite sometimes is overlooked. More often than not, you want your program's initialization code to do more than just read a file into a buffer — something that the built-in asynchronous functionality does quite well. Very often, you want to read the file in blocks (or records), and process those blocks to create internal data structures, or do some post-processing as you're writing data to a file. Perhaps you want to read or write an encrypted file. In those and many other common situations, the standard asynchronous I/O support supplied by FileStream isn't enough. Considering how easy it is to create and invoke an asynchronous delegate, I've found that it's easier to do all of my asynchronous I/O—even simple file reads into a buffer—through asynchronous delegates. Doing so guarantees that the operation will execute in the background, and also makes it easier for me to make the inevitable changes when the client requests different functionality.
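As an illustration of that last point, here's a hedged sketch of a background method that reads a file in blocks and processes each block as it arrives; processBlock is a hypothetical placeholder for whatever per-block work you need (decryption, parsing, building data structures). You would invoke it through BeginInvoke on a matching delegate, exactly as in the write example above.

static void LoadAndProcess(string path)
{
    byte[] block = new byte[32768];
    using (FileStream fs = new FileStream(path, FileMode.Open,
        FileAccess.Read, FileShare.Read))
    {
        int bytesRead;
        // read and process one block at a time
        while ((bytesRead = fs.Read(block, 0, block.Length)) > 0)
        {
            processBlock(block, bytesRead);   // hypothetical per-block work
        }
    }
}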

Asynchronous I/O to Other Devices

All classes that inherit from System.IO.Stream implement the BeginRead and BeginWrite methods. However, the default implementations of these methods in System.IO.Stream simply call Read and Write, making the operations synchronous. Only those classes that override BeginRead and BeginWrite actually support asynchronous I/O. Apart from FileStream, of the .NET Framework classes that inherit from System.IO.Stream, only System.Net.Sockets.NetworkStream includes full support for asynchronous operation. All others depend on the default implementation of BeginRead and BeginWrite, which means that I/O operations on those streams will block the calling thread until the operation completes.
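For completeness, here's what asynchronous use of NetworkStream looks like (a sketch; it mirrors the file examples above and assumes a connected Socket):

// requires: using System.Net.Sockets;
static void readFromSocket(Socket socket)
{
    byte[] buffer = new byte[8192];
    NetworkStream ns = new NetworkStream(socket);
    IAsyncResult ar = ns.BeginRead(buffer, 0, buffer.Length, null, null);

    // ... other processing here ...

    ar.AsyncWaitHandle.WaitOne();      // synchronization point
    int bytesRead = ns.EndRead(ar);    // harvest the result
    Console.WriteLine("{0} bytes received", bytesRead);
}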

Copyright © 2005 Ziff Davis Media Inc. All Rights Reserved. Originally appearing in Dev Source.

Thursday, May 10, 2007

Digits to Charts - The Code Project - SOAP and XML

This article presents several XSLT stylesheets for visualizing numerical data rows contained, as you may have guessed, within XML files. The article explains the details of stylesheet setup and the template design rationale.
Stylesheets for the following types of charts are described:
Bar graph:
simple (distinct rows, overlaid rows);
stacked;
stacked & normalized to 100%.
Histogram:
simple (distinct columns, overlaid columns);
stacked;
stacked & normalized to 100%.
This article assumes you are familiar with XSLT 1.0 and, to some extent, CSS Level 1.0 standards.

Dynamic Cross-Tabs/Pivot Tables - SQL Server Information at SQLTeam.com

EXECUTE crosstab 'select title from titles inner join sales on (sales.title_id=titles.title_id)
group by title', 'sum(qty)','stor_id','stores'



EXECUTE crosstab 'select pub_name, count(qty) as orders, sum(qty) as total
from sales inner join titles on (sales.title_id=titles.title_id)
right join publishers on (publishers.pub_id=titles.pub_id)
group by pub_name', 'sum(qty)','type','titles'



EXECUTE crosstab 'SELECT LastName FROM Employees INNER JOIN Orders
ON (Employees.EmployeeID=Orders.EmployeeID)
GROUP BY LastName', 'count(lastname)', 'Year(OrderDate)', 'Orders'






IMHO, the best feature of MS Access is the TRANSFORM statement, used to create cross-tabs/pivot tables. It does all of the work of dynamically generating the cross-tabulation and the summary calculations. T-SQL unfortunately doesn't have this statement, so you're stuck using complicated SQL commands, expensive 3rd party products, or exotic OLAP to make pivot tables...or you can use the following procedure to dynamically create them!

I got the idea from this question, asking how to "undo" a pivot table, and then I started working on how to create them in T-SQL. There are numerous ways of doing pivot tables, and this site has several examples (and lots of other cool stuff). The standard method uses a CASE statement, with one CASE for each pivot value (the column headings created by cross-tabbing the pivot column). The greatest shortcoming is finding a way to handle an unknown or changing number of pivot values. Obviously you have to know these values beforehand, and you must add a CASE for each new, distinct value inserted into the pivot column. The code listed below will do all of the work for you:

m0n0wall

m0n0wall is a project aimed at creating a complete, embedded firewall software package that, when used together with an embedded PC, provides all the important features of commercial firewall boxes (including ease of use) at a fraction of the price (free software). m0n0wall is based on a bare-bones version of FreeBSD, along with a web server, PHP and a few other utilities. The entire system configuration is stored in one single XML text file to keep things transparent. m0n0wall is probably the first UNIX system that has its boot-time configuration done with PHP, rather than the usual shell scripts, and that has the entire system configuration stored in XML format.


Facts
The m0n0wall system currently takes up less than 6 MB on the Compact Flash card (or CD-ROM), and contains
all the required FreeBSD components (kernel, user programs)
ipfilter
PHP (CGI version)
mini_httpd
MPD
ISC DHCP server
ez-ipupdate (for DynDNS updates)
Dnsmasq (for the caching DNS forwarder)
racoon (for IPsec IKE)
UCD-SNMP
choparp
BPALogin
On a net4501, m0n0wall provides a WAN <-> LAN TCP throughput of about 17 Mbps, including NAT, when run with the default configuration. On faster platforms (like net4801 or WRAP), throughput in excess of 50 Mbps is possible (and > 100 Mbps with newer standard PCs).
On a net4501, m0n0wall boots to a fully working state in less than 40 seconds after power-up, including POST (with a properly configured BIOS)


Features
At this time, m0n0wall can be used as-is with the Wireless Router Application Platform from PC Engines (www.pcengines.ch), the net45xx/net48xx embedded PCs from Soekris Engineering (www.soekris.com), or most standard PCs (with a BIOS that supports booting from CD-ROM (El Torito standard) for the CD-ROM version). m0n0wall already provides many of the features of expensive commercial firewalls, including:
web interface (supports SSL)
serial console interface for recovery
set LAN IP address
reset password
restore factory defaults
reboot system
wireless support (access point with PRISM-II/2.5/3 cards, BSS/IBSS with other cards including Cisco)
captive portal
802.1Q VLAN support
stateful packet filtering
block/pass rules
logging
NAT/PAT (including 1:1)
DHCP client, PPPoE, PPTP and Telstra BigPond Cable support on the WAN interface
IPsec VPN tunnels (IKE; with support for hardware crypto cards, mobile clients and certificates)
PPTP VPN (with RADIUS server support)
static routes
DHCP server and relay
caching DNS forwarder
DynDNS client and RFC 2136 DNS updater
SNMP agent
traffic shaper
SVG-based traffic grapher
firmware upgrade through the web browser
Wake on LAN client
configuration backup/restore
host/network aliases

Monitoring and Analyzing a Load Test Result

SQL Server 2005: Regular Expressions Make Pattern Matching And Data Extraction Easier -- MSDN Magazine, February 2007

  • Efficient SQL querying using regular expressions
  • Support in SQL Server 2005 for regular expressions
  • Using .NET Regex classes from SQL Server
  • Effective uses for regular expressions in a database

select dbo.RegexMatch( N'123-45-6789', N'^\d{3}-\d{2}-\d{4}$' )
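The excerpt uses a dbo.RegexMatch UDF whose implementation isn't reproduced here; a minimal SQL CLR reconstruction in C# might look like this (the Options field and its flags are an assumption, mirroring the field the RegexGroup function below relies on):

// Reconstruction (not the article's code) of a RegexMatch SQL CLR UDF.
using System.Data.SqlTypes;
using System.Text.RegularExpressions;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    // shared pattern options; the exact flags are an assumption
    private static readonly RegexOptions Options =
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline;

    [SqlFunction]
    public static SqlBoolean RegexMatch( SqlChars input, SqlString pattern )
    {
        Regex regex = new Regex( pattern.Value, Options );
        return regex.IsMatch( new string( input.Value ) );
    }
}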


select ROUTINE_NAME
from INFORMATION_SCHEMA.ROUTINES
where ROUTINE_TYPE = N'PROCEDURE'
and dbo.RegexMatch( ROUTINE_NAME,
N'^usp_(Insert|Update|Delete|Select)([A-Z][a-z]+)+$' ) = 0


select ROUTINE_NAME
from INFORMATION_SCHEMA.ROUTINES
where ROUTINE_TYPE = N'PROCEDURE'
and ( LEN( ROUTINE_NAME ) < 11
or LEFT( ROUTINE_NAME, 4 ) <> N'usp_'
or SUBSTRING( ROUTINE_NAME, 5, 6 ) not in
( N'Insert', N'Update', N'Delete', N'Select' ) )


CREATE TABLE [Account]
(
[AccountNumber] nvarchar(20) CHECK (dbo.RegexMatch(
[AccountNumber], '^[A-Z]{3,5}\d{5}-\d{3}$' ) = 1),
[PhoneNumber] nchar(13) CHECK (dbo.RegexMatch(
[PhoneNumber], '^\(\d{3}\)\d{3}-\d{4}$' ) = 1),
[ZipCode] nvarchar(10) CHECK (dbo.RegexMatch(
[ZipCode], '^\d{5}(\-\d{4})?$' ) = 1)
)



Data Extraction

The grouping features of regular expressions can be used to extract data from a string. My RegexGroup function provides that functionality to T-SQL:

[SqlFunction]
public static SqlChars RegexGroup(
    SqlChars input, SqlString pattern, SqlString name )
{
    Regex regex = new Regex( pattern.Value, Options );
    Match match = regex.Match( new string( input.Value ) );
    return match.Success ?
        new SqlChars( match.Groups[name.Value].Value ) : SqlChars.Null;
}


select distinct dbo.RegexGroup( [Url],
    N'https?://(?<server>([\w-]+\.)*[\w-]+)', N'server' )
from [UrlTable]


CREATE TABLE [Email]
(
    [Address] nvarchar(max),
    [Mailbox] as dbo.RegexGroup( [Address],
        N'(?<mailbox>[^@]*)@', N'mailbox' ),
    [Domain] as dbo.RegexGroup( [Address], N'@(?<domain>.*)', N'domain' )
)

Pattern Storage

All of the patterns used by these functions are just strings, which means that any of them can be stored in a table within your database. Most databases that store international data have a table representing countries. By adding a few extra columns to that table, you could store country-specific validation patterns. That would allow the constraint applied to an address row to vary based on the country for that row.

In databases that store data on behalf of clients, there is typically already a table representing a client. That table can be used to store grouping patterns that let you describe the way raw client data is stored within the database, and this allows you to create computed columns to pull the data you actually need from the client data. For example, if each of your clients has unique schemes for account numbers and you only need specific pieces of that account number, you could easily create an expression that pulls the correct piece of information for each client.


declare @text nvarchar(max), @pattern nvarchar(max)
select
@text = N'Here are four words.',
@pattern = '\w+'
select count(distinct [Text])
from dbo.RegexMatches( @text, @pattern )


declare @pattern nvarchar(max), @list nvarchar(max)
select @pattern = N'[^,]+', @list = N'2,4,6'

select d.* from [Data] d
inner join dbo.RegexMatches( @list, @pattern ) re
on d.[ID] = re.[Text]





SQL Server Best Practices on Microsoft TechNet

Get the real-world guidelines, expert tips, and rock-solid guidance to take your SQL Server implementation to the next level. These SQL Server best practices draw on the extensive experience and expertise of respected developers and engineers at Microsoft, who walk you through the specifics of solving particularly difficult issues.



Technical White Papers

Deep level technical papers on specific SQL Server topics that were tested and validated by SQL Development.

SQL Server 2005 Deployment Guidance for Web Hosting Environments
Resolving Common Connectivity Issues in SQL Server 2005 Analysis Services Connectivity Scenarios
Implementing Application Failover with Database Mirroring
OLAP Design Best Practices for Analysis Services 2005
DBCC SHOWCONTIG Improvements and Comparison between SQL Server 2000 and SQL Server 2005
TEMPDB Capacity Planning and Concurrency Considerations for Index Create and Rebuild
Loading Bulk Data into a Partitioned Table
Database Mirroring Best Practices and Performance Considerations
SQL Server 2005 Performance Tuning using Waits and Queues
Analysis Services 2005 Performance Guide
SAP with Microsoft SQL Server 2005: Best Practices for High Availability, Performance, and Scalability
Microsoft SQL Server 2005 Tuning Tips for PeopleSoft 8.x
Troubleshooting Performance Problems in SQL Server 2005

SQL Server Best Practices ToolBox

Scripts and tools for performance tuning and troubleshooting SQL Server 2005

Top 10 Lists

Summary lists (usually consisting of 10 items) of recommendations, best practices and common issues for specific customer scenarios by the SQL Server Customer Advisory Team.

Top 10 Best Practices for SQL Server Maintenance for SAP
Top 10 Hidden Gems in SQL Server 2005
Top 10 SQL Server 2005 Performance Issues for Data Warehouse and Reporting Applications
OLTP Top 6 Performance Issues for OLTP Applications
Storage Top 10 Best Practices

Best Practices in SQL Server Books Online

Best Practices for Replication Administration
Replication Security Best Practices
Best Practices for Recovering a Database to a Specific Recovery Point

Complete List of Best SEO-Tools

What are the best ways to boost your position in search engines? What keywords should you use on your web pages? And which tools should you use to improve the quality of backlinks, link popularity and Google PageRank? We deliver answers. Here is a list of the most useful SEO tools you might want to use while developing and optimizing your next web site. Our personal choice: 156 Seo Tools, one of the most comprehensive lists of essential SEO references and tools. For even more resources, check out the SEO section of The Web-Developer’s Handbook.

Thursday, May 3, 2007

Using RADIUS For WLAN Authentication, Part II

Commercial RADIUS Server: In the long run, fully-supported commercial products can save you time, and time means money. 802.1X-capable RADIUS Server products are available from a variety of sources, including:
Aradial WiFi
Bridgewater Wi-Fi AAA
Cisco Secure Access Control Server
Funk Odyssey
IEA RadiusNT
Infoblox RADIUS One Appliance
Interlink Secure.XS
LeapPoint AiroPoint Appliance
Meetinghouse AEGIS
OSC Radiator
Vircom VOP Radius
Commercial RADIUS Servers vary in price and capacity. For example, Interlink's Secure.XS starts at $2375 for 250 users. $2500 will also buy you one Funk Odyssey Server, including 25 Odyssey Client software licenses. VOP Radius Small Business starts at $995 for 100 users. A single-server Radiator license will run you $720.

Wednesday, May 2, 2007

Generic P2P Architecture, Tutorial and Example - The Code Project - VB / VBScript



This generic P2P architecture tutorial was brought to you by Planet API – Searching thousands of ASPX / ASP.NET Public Webmethods / Public Webservices, Search Engine APIs, Novelty APIs, B2B APIs, Game APIs, P2P APIs, Fun APIs, AI Chatbot APIs, etc.
Overview of P2P Culture
P2P (Peer To Peer) is when multiple computers think collectively toward a shared objective. Computer programs that use fewer central servers and rely on a collection of computers (such as Gnutella, distributed media streaming, or networks of DCC-based IRC fservers) tend to be referred to as more P2P. Computer programs where many end users communicate with a few central services tend to be referred to as less P2P, or not P2P. To fully understand and leverage P2P technology, one must let go of the dogma that our computer programs must be united by servers in our physical possession in order to synchronize activities. Rather, think of computer programs from a more digital-life-oriented perspective: break the software up over multiple machines, and make no single part of it critical to the collective objective.
P2P Philosophy
“Single servants are less powerful than a single server, but the collective of many servants is more powerful than any single server” - Daniel Stephen Rule.
For example, a large software company gives each employee a very small amount of responsibility. Even if this means you get your month's coding done in a few days, it is more beneficial to the company as a whole not to rely too much on any single employee; this gives the company more overall stability and ultimately lets it write larger, more complex software packages than any single person is capable of. Your software is more P2P if you leverage this same principle to achieve more bandwidth and computing speed.
Basic P2P Terminology
Peer or Servant
A computer program that acts as both a client and a server for the entire P2P network.
Connection Manager
A light server application that provides a starting point for applications which enter a P2P network. The less the connection manager is involved in the objective of your overall application, the more P2P your application is. The more P2P your application is, the less strain on your own hardware.

How to do pointers in Visual Basic - The Code Project - VB / VBScript

Here is a simple (and not complete) implementation of a linked list. (On the form, put a Command button named Command1.)


Option Explicit
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
    (Destination As Any, Source As Any, ByVal Length As Long)
Private Declare Function GetProcessHeap Lib "kernel32" () As Long
Private Declare Function HeapAlloc Lib "kernel32" _
    (ByVal hHeap As Long, ByVal dwFlags As Long, _
    ByVal dwBytes As Long) As Long
Private Declare Function HeapFree Lib "kernel32" _
    (ByVal hHeap As Long, ByVal dwFlags As Long, _
    lpMem As Any) As Long
Private Declare Sub CopyMemoryPut Lib "kernel32" Alias _
    "RtlMoveMemory" (ByVal Destination As Long, _
    Source As Any, ByVal Length As Long)
Private Declare Sub CopyMemoryRead Lib "kernel32" Alias _
    "RtlMoveMemory" (Destination As Any, _
    ByVal Source As Long, ByVal Length As Long)

Dim pHead As Long 'pointer to the head of the list

Private Type ListElement
    strData As String * 255 'VB strings are Unicode: 2 bytes per character
    pNext As Long           'pointer to next element; 0 if end of list
End Type
'Note: use LenB(element) rather than a hard-coded byte count when
'allocating and copying, so the size always matches the in-memory layout.

Private Sub CreateLinkedList()
    'add two items to the list
    Dim pFirst As Long, pSecond As Long 'local pointers
    Dim hHeap As Long
    Dim le As ListElement 'used only to measure the structure size
    'get the process heap first
    hHeap = GetProcessHeap()
    'allocate memory for the first and second elements
    pFirst = HeapAlloc(hHeap, 0, LenB(le))
    pSecond = HeapAlloc(hHeap, 0, LenB(le))
    If pFirst <> 0 And pSecond <> 0 Then
        'memory is allocated; fill and link the two elements
        PutDataIntoStructure pFirst, "Hello", pSecond
        PutDataIntoStructure pSecond, "Pointers", 0
        pHead = pFirst
    End If
End Sub

Private Sub Command1_Click()
    CreateLinkedList
    ReadLinkedListDataAndFreeMemory
End Sub

Private Sub PutDataIntoStructure(ByVal ptr As Long, _
    szdata As String, ByVal ptrNext As Long)
    Dim le As ListElement
    le.strData = szdata
    le.pNext = ptrNext
    CopyMemoryPut ptr, le, LenB(le)
End Sub

Private Sub ReadDataToStructure(ByVal ptr As Long, _
    struct As ListElement)
    Dim le As ListElement
    CopyMemoryRead le, ptr, LenB(le)
    struct.strData = le.strData
    struct.pNext = le.pNext
End Sub

Private Sub ReadLinkedListDataAndFreeMemory()
    Dim pLocal As Long
    Dim hHeap As Long
    Dim le As ListElement
    Dim strData As String
    pLocal = pHead
    hHeap = GetProcessHeap()
    Do While pLocal <> 0
        ReadDataToStructure pLocal, le
        strData = strData & vbCrLf & le.strData
        HeapFree hHeap, 0, pLocal
        pLocal = le.pNext
    Loop
    MsgBox strData
End Sub

Private Sub Form_Load()

End Sub

Retrieving hardware information with WMI - The Code Project - VB / VBScript

Find out what your friends/co-workers have been doing on the Internet! - The Code Project - VB / VBScript

What is an Index.dat file?

Index.dat files are hidden files on your computer that record many of the Web sites you have visited; every URL and Web page is listed there. This script is for you if internet privacy is an issue, or if you're just curious to see what surfing habits people have on your PC or network. In Windows XP, here's where you'll usually find them:

C:\Documents and Settings\<username>\Local Settings\History\History.IE5\index.dat

VBScript for reading and writing to the Windows host file - The Code Project - VB / VBScript

Asynchronous processing - Basics and a walkthrough with VB6/ VB.NET - I - The Code Project - VB / VBScript

MySQL Stored Functions

Continuing with our series on Stored Procedures and Functions (see part 1, part 2, or part 3), this month we focus on Stored Functions. Most of what we have covered in those earlier tutorials is relevant here, so I suggest you read those first if you haven't already.
What's a Stored Function
If procedural programming is new to you, you may be wondering what the difference is between a Stored Procedure and a Stored Function. Not too much, really. A function always returns a result, and can be called inside an SQL statement just like ordinary SQL functions. A function parameter is the equivalent of the IN procedure parameter, since a function uses the RETURN keyword to determine what is passed back. Stored functions are also slightly more limited than stored procedures in which SQL statements they can run.

PrimeBase XT (PBXT) pluggable storage engine for MySQL


PrimeBase XT (PBXT) is a transactional storage engine for MySQL. It has been designed for modern, web-based, high-concurrency environments and heavy update loads. PBXT uses a number of new techniques to achieve its goals. The design and other features of the implementation are described in a White Paper.

PrimeBase XT is hosted on SourceForge.net. Here you can download the latest source code, track changes and updates and report bugs.
NEW! PBXT is now available for Windows NT/XP. Build it yourself, or install the binary included in the PBXT binary distribution. The PBXT binary distribution includes binary versions of the plug-in for MySQL 5.1 and 5.2 running on a number of platforms. An install shell script automatically detects which version of the MySQL server is running and installs the correct plug-in (see the package README for details).

News and Information
For the latest news and information in and around PBXT read the PrimeBase XT Blog:
The PrimeBase XT Blog
Last update: Feb, 2007
Subscribe to the PrimeBase XT Blog feed to stay up-to-date with the latest news.
http://www.primebase.com/xt/pbxt.rss
Robin Schumacher, Product Director at MySQL, takes a look at PBXT.
A Look at the PBXT Storage Engine New!
Mar, 2007
PBXT Presentation at the Hamburg MySQL September Meetup.
XT Presentation (PDF)
Sept 4, 2006
MySQL Community Relations Manager, Lenz Grimmer, spoke to Paul McCullagh about the Users Conference, PBXT and the Community
Interview with Paul McCullagh, developer of the PrimeBase XT Storage Engine
May, 2006
For details about the architecture and implentation of PBXT, read the White Paper:
PrimeBase XT White Paper (PDF)
Mar 20, 2006

Download XT
Source distribution of the PBXT 0.9.86 Beta pluggable storage engine for MySQL 5.1.14/15/16 Beta and MySQL 5.2.0/3 Alpha.
pbxt-0.9.86-beta.tar.gz New!
Last update: Apr 7, 2007
Binary distribution of the PBXT 0.9.86 Beta pluggable storage engine for MySQL 5.1.16 and 5.2.3 running on Linux (32/64-bit), Mac OS X (x86/ppc) and Windows.
pbxt-0.9.86-plugins.tar.gz New!
Apr 7, 2007
Binary distribution of the PBXT 0.9.85 Beta pluggable storage engine.
pbxt-0.9.85-plugins.tar.gz
Mar 15, 2007
Binary distribution of the PBXT 0.9.8 Beta pluggable storage engine.
pbxt-0.9.8-plugins.tar.gz
Mar 9, 2007
PrimeBase XT 0.9.7 Beta integrated into MySQL 4.1.21.
mysql-4.1.21-pbxt-0.9.7b.tar.gz
Sept 29, 2006
The PBXT engine integrated into MySQL 4.1.16, the nightly build of November 4, 2005
mysql-4.1.16-pbxt-0.9.6.tar.gz
Aug 7, 2006

Documentation and Notes
The latest changes, and bug fixes are documented in the release notes.
Release Notes (TXT)
Last update: Apr 7, 2007
A list of planned features and bugs still to be fixed.
TO-DO List (TXT)
Apr 7, 2007
Notes on how to build, install and test PrimeBase XT and MySQL
Building PBXT (TXT)
Jan 30, 2007
So that mysql-test-run runs with PBXT as the default engine, some changes were made to the script. These changes illustrate the differences between MyISAM and PBXT.
mysql-test-run-changes (TXT)
Aug 5, 2006

MySQL 5 Storage Engines

MySQL 5 offers a number of new storage engines (previously called table types). In addition to the default MyISAM storage engine, and the InnoDB, BDB, HEAP and MERGE storage engines, there are four new types: CSV, ARCHIVE, FEDERATED and EXAMPLE, as well as a new name for the HEAP storage engine. It is now called the MEMORY storage engine. None of the new types are available by default - you can check for sure with the SHOW ENGINES statement. Here is what is on my default version of MySQL Max:
mysql> SHOW ENGINES;
+------------+---------+------------------------------------------------------------+
| Engine     | Support | Comment                                                    |
+------------+---------+------------------------------------------------------------+
| MyISAM     | DEFAULT | Default engine as of MySQL 3.23 with great performance     |
| HEAP       | YES     | Alias for MEMORY                                           |
| MEMORY     | YES     | Hash based, stored in memory, useful for temporary tables  |
| MERGE      | YES     | Collection of identical MyISAM tables                      |
| MRG_MYISAM | YES     | Alias for MERGE                                            |
| ISAM       | NO      | Obsolete storage engine, now replaced by MyISAM            |
| MRG_ISAM   | NO      | Obsolete storage engine, now replaced by MERGE             |
| InnoDB     | YES     | Supports transactions, row-level locking, and foreign keys |
| INNOBASE   | YES     | Alias for INNODB                                           |
| BDB        | YES     | Supports transactions and page-level locking               |
| BERKELEYDB | YES     | Alias for BDB                                              |
| NDBCLUSTER | NO      | Clustered, fault-tolerant, memory-based tables             |
| NDB        | NO      | Alias for NDBCLUSTER                                       |
| EXAMPLE    | NO      | Example storage engine                                     |
| ARCHIVE    | NO      | Archive storage engine                                     |
| CSV        | NO      | CSV storage engine                                         |
+------------+---------+------------------------------------------------------------+

To add support for the missing storage engines, you currently need to build MySQL yourself with the appropriate configure options.


The FEDERATED storage engine
Added in MySQL 5.0.3; to make use of it, you need to use the --with-federated-storage-engine option to configure when building MySQL. The FEDERATED storage engine allows you to access data from a table on another database server. That table can make use of any storage engine. Let's see it in action. First, CREATE a table on a remote server (you can do this on the same server for testing purposes, but doing so is fairly pointless otherwise):

CREATE TABLE myisam_table (f1 INT, PRIMARY KEY(f1))
ENGINE=MYISAM;

Assuming that the default is set to create MyISAM tables (FEDERATED tables can access tables of any type), the above statement creates a definition file (.frm), an index file (.MYI) and a data file (.MYD). If you had created an InnoDB table, MySQL would create a definition file (.frm) and a combined index and data file (.ibd). Now create the FEDERATED table on another server. The original table must always exist first:

CREATE TABLE federated_table (f1 INT, PRIMARY KEY(f1))
ENGINE=FEDERATED
COMMENT='mysql://username:password@hostname.co.za:3306/dbname/myisam_table';


The ARCHIVE storage engine
Added in MySQL 4.1.3, the ARCHIVE storage engine lives up to its name by storing large amounts of data without taking up too much space. It makes no use of any sort of indexing, and there is no means to repair the table should it become corrupted during a crash. To enable this storage engine, use the --with-archive-storage-engine configure option when building MySQL.
mysql> CREATE TABLE archive_names(firstname CHAR(30), surname CHAR(40), age INT) ENGINE = ARCHIVE;