Friday, August 17, 2007

Query Performance Tuning (SQL Server Compact Edition)

You can improve your SQL Server 2005 Compact Edition (SQL Server Compact Edition) application performance by optimizing the queries you use. The following sections outline techniques you can use to optimize query performance.

Improve Indexes

Creating useful indexes is one of the most important ways to achieve better query performance. Useful indexes help you find data with fewer disk I/O operations and less system resource usage.

To create useful indexes, you must understand how the data is used, the types of queries and how frequently they are run, and how the query processor can use indexes to find your data quickly.

When you choose which indexes to create, examine your critical queries, whose performance will affect the user experience most. Create indexes to specifically aid these queries. After adding an index, rerun the query to see whether performance is improved. If it is not, remove the index.

As with most performance optimization techniques, there are tradeoffs. For example, with more indexes, SELECT queries will potentially run faster. However, DML (INSERT, UPDATE, and DELETE) operations will slow down significantly because more indexes must be maintained with each operation. Therefore, if your queries are mostly SELECT statements, more indexes can be helpful. If your application performs many DML operations, you should be conservative with the number of indexes you create.

SQL Server Compact Edition includes support for showplans, which help assess and optimize queries. SQL Server Compact Edition uses the same showplan schema as SQL Server 2005 except SQL Server Compact Edition uses a subset of the operators. For more information, see the Microsoft Showplan Schema at http://schemas.microsoft.com/sqlserver/2004/07/showplan/.

The next few sections provide additional information about creating useful indexes.

Create Highly-Selective Indexes
Indexing on columns used in the WHERE clause of your critical queries frequently improves performance. However, this depends on how selective the index is. Selectivity is the ratio of qualifying rows to total rows. If the ratio is low, the index is highly selective. It can get rid of most of the rows and greatly reduce the size of the result set. It is therefore a useful index to create. By contrast, an index that is not selective is not as useful.

A unique index has the greatest selectivity. Only one row can match, which makes it most helpful for queries that intend to return exactly one row. For example, an index on a unique ID column will help you find a particular row quickly.

You can evaluate the selectivity of an index by running the sp_show_statistics stored procedures on SQL Server Compact Edition tables. For example, if you are evaluating the selectivity of two columns, "Customer ID" and "Ship Via", you can run the following stored procedures:

sp_show_statistics_steps 'orders', 'customer id';


RANGE_HI_KEY  RANGE_ROWS  EQ_ROWS  DISTINCT_RANGE_ROWS
-------------------------------------------------------
ALFKI         0           7        0
ANATR         0           4        0
ANTON         0           13       0
AROUT         0           14       0
BERGS         0           23       0
BLAUS         0           8        0
BLONP         0           14       0
BOLID         0           7        0
BONAP         0           19       0
BOTTM         0           20       0
BSBEV         0           12       0
CACTU         0           6        0
CENTC         0           3        0
CHOPS         0           12       0
COMMI         0           5        0
CONSH         0           4        0
DRACD         0           9        0
DUMON         0           8        0
EASTC         0           13       0
ERNSH         0           33       0

(90 rows affected)

And

sp_show_statistics_steps 'orders', 'ship via';


RANGE_HI_KEY  RANGE_ROWS  EQ_ROWS  DISTINCT_RANGE_ROWS
-------------------------------------------------------
1             0           320      0
2             0           425      0
3             0           333      0

(3 rows affected)


The results show that the "Customer ID" column has a much lower degree of duplication. This means an index on it will be more selective than an index on the "Ship Via" column.

For more information about using these stored procedures, see sp_show_statistics (SQL Server Compact Edition), sp_show_statistics_steps (SQL Server Compact Edition), and sp_show_statistics_columns (SQL Server Compact Edition).

Create Multiple-Column Indexes
Multiple-column indexes are natural extensions of single-column indexes. Multiple-column indexes are useful for evaluating filter expressions that match a prefix set of key columns. For example, the composite index CREATE INDEX Idx_Emp_Name ON Employees ("Last Name" ASC, "First Name" ASC) helps evaluate the following queries:

... WHERE "Last Name" = 'Doe'

... WHERE "Last Name" = 'Doe' AND "First Name" = 'John'

... WHERE "First Name" = 'John' AND "Last Name" = 'Doe'

However, it is not useful for this query:

... WHERE "First Name" = 'John'

When you create a multiple-column index, you should put the most selective columns leftmost in the key. This makes the index more selective when matching several expressions.
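For illustration, here is a minimal ADO.NET sketch that creates the composite index above from a SQL Server Compact Edition application; the connection string and database file name are assumptions, not part of the original example.

using System.Data.SqlServerCe;

class IndexSetup
{
    static void CreateEmployeeNameIndex()
    {
        // Assumes a Northwind-style database file and an Employees table
        // with "Last Name" and "First Name" columns, as in the text above.
        using (SqlCeConnection conn = new SqlCeConnection("Data Source=Northwind.sdf"))
        using (SqlCeCommand cmd = conn.CreateCommand())
        {
            conn.Open();
            cmd.CommandText =
                "CREATE INDEX Idx_Emp_Name ON Employees (\"Last Name\" ASC, \"First Name\" ASC)";
            cmd.ExecuteNonQuery();
        }
    }
}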

Avoid Indexing Small Tables
A small table is one whose contents fit in one or just a few data pages. Avoid indexing very small tables because it is typically more efficient to do a table scan. This saves the cost of loading and processing index pages. By not creating an index on very small tables, you remove the chance of the optimizer selecting one.

SQL Server Compact Edition stores data in 4 KB pages. The page count can be approximated by using the following formula, although the actual count might be slightly larger because of storage engine overhead.

<# of pages> = (<size of a row in bytes> * <# of rows>) / 4096

For example, suppose a table has the following schema:

Column Name   Type (size)
Order ID      INTEGER (4 bytes)
Product ID    INTEGER (4 bytes)
Unit Price    MONEY (8 bytes)
Quantity      SMALLINT (2 bytes)
Discount      REAL (4 bytes)

The table has 2820 rows. According to the formula, it takes about 16 pages to store its data:

<# of pages> = ((4 + 4 + 8 + 2 + 4) * 2820) / 4096 = 15.15 pages
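If you want to run this arithmetic in code, a trivial helper such as the following sketch (it is not part of SQL Server Compact Edition) does the job:

class PageEstimate
{
    // Rough estimate only: (row size in bytes * number of rows) / 4096-byte pages.
    // The storage engine adds some overhead, so the real count is slightly larger.
    static double EstimatePages(int rowSizeInBytes, int rowCount)
    {
        return (rowSizeInBytes * (double)rowCount) / 4096;
    }

    static void Main()
    {
        // 4 + 4 + 8 + 2 + 4 = 22 bytes per row, 2820 rows => about 15.15 pages.
        System.Console.WriteLine(EstimatePages(22, 2820));
    }
}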

Choose What to Index

We recommend that you always create indexes on primary keys. It is frequently useful to also create indexes on foreign keys, because primary keys and foreign keys are frequently used to join tables. Indexes on these keys let the optimizer consider more efficient index join algorithms. If your query joins tables by using other columns, it is frequently helpful to create indexes on those columns for the same reason.

When primary key and foreign key constraints are created, SQL Server Compact Edition automatically creates indexes for them and takes advantage of them when optimizing queries. Remember to keep primary keys and foreign keys small. Joins run faster this way.
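To make this concrete, the following sketch creates a foreign key constraint from code; the constraint, table, and column names are illustrative only. Once the constraint exists, SQL Server Compact Edition builds the supporting index automatically, as described above.

using System.Data.SqlServerCe;

class ForeignKeySetup
{
    static void AddOrderDetailsForeignKey(SqlCeConnection conn)
    {
        using (SqlCeCommand cmd = conn.CreateCommand())
        {
            // The engine creates an index for this key and can use it when joining
            // Orders to "Order Details".
            cmd.CommandText =
                "ALTER TABLE \"Order Details\" ADD CONSTRAINT FK_OrderDetails_Orders " +
                "FOREIGN KEY (\"Order ID\") REFERENCES Orders (\"Order ID\")";
            cmd.ExecuteNonQuery();
        }
    }
}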

Use Indexes with Filter Clauses
Indexes can be used to speed up the evaluation of certain types of filter clauses. Although all filter clauses reduce the final result set of a query, some can also help reduce the amount of data that must be scanned.

A search argument (SARG) limits a search because it specifies an exact match, a range of values, or a conjunction of two or more items joined by AND. It has one of the following forms:

Column operator <constant or variable>

<constant or variable> operator Column

SARG operators include =, >, <, >=, <=, IN, BETWEEN, and sometimes LIKE (in cases of prefix matching, such as LIKE 'John%'). A SARG can include multiple conditions joined with an AND. SARGs can be queries that match a specific value, such as:

"Customer ID" = 'ANTON'

'Doe' = "Last Name"

SARGs can also be queries that match a range of values, such as:

"Order Date" > '1/1/2002'

"Customer ID" > 'ABCDE' AND "Customer ID" < 'EDCBA'

"Customer ID" IN ('ANTON', 'AROUT')

An expression that does not use SARG operators does not improve performance, because the SQL Server Compact Edition query processor has to evaluate every row to determine whether it meets the filter clause. Therefore, an index is not useful on expressions that do not use SARG operators. Non-SARG operators include NOT, <>, NOT EXISTS, NOT IN, NOT LIKE, and intrinsic functions.

Use the Query Optimizer

When determining the access methods for base tables, the SQL Server Compact Edition optimizer determines whether an index exists for a SARG clause. If an index exists, the optimizer evaluates the index by calculating how many rows are returned. It then estimates the cost of finding the qualifying rows by using the index, and chooses indexed access if it has a lower cost than a table scan. An index is potentially useful if its first column or prefix set of columns is used in the SARG, and the SARG establishes a lower bound, upper bound, or both, to limit the search.

Understand Response Time vs. Total Time

Response time is the time it takes for a query to return the first record. Total time is the time it takes for the query to return all records. For an interactive application, response time is important because it is the perceived time for the user to receive visual affirmation that a query is being processed. For a batch application, total time reflects the overall throughput. You have to determine what the performance criteria are for your application and queries, and then design accordingly.

For example, suppose the query returns 100 records and is used to populate a list with the first five records. In this case, you are not concerned with how long it takes to return all 100 records. Instead, you want the query to return the first few records quickly, so that you can populate the list.

Many query operations can be performed without having to store intermediate results. These operations are said to be pipelined. Examples of pipelined operations are projections, selections, and joins. Queries implemented with these operations can return results immediately. Other operations, such as SORT and GROUP-BY, require using all their input before returning results to their parent operations. These operations are said to require materialization. Queries implemented with these operations typically have an initial delay because of materialization. After this initial delay, they typically return records very quickly.

Queries with response time requirements should avoid materialization. For example, using an index to implement ORDER-BY yields better response time than does using sorting. The following section describes this in more detail.

Index the ORDER-BY / GROUP-BY / DISTINCT Columns for Better Response Time
The ORDER-BY, GROUP-BY, and DISTINCT operations are all types of sorting. The SQL Server Compact Edition query processor implements sorting in two ways. If records are already sorted by an index, the processor needs to use only the index. Otherwise, the processor has to use a temporary work table to sort the records first. Such preliminary sorting can cause significant initial delays on devices with lower power CPUs and limited memory, and should be avoided if response time is important.

In the context of multiple-column indexes, for ORDER-BY or GROUP-BY to consider a particular index, the ORDER-BY or GROUP-BY columns must match the prefix set of index columns with the exact order. For example, the index CREATE INDEX Emp_Name ON Employees ("Last Name" ASC, "First Name" ASC) can help optimize the following queries:

... ORDER BY / GROUP BY "Last Name" ...

... ORDER BY / GROUP BY "Last Name", "First Name" ...

It will not help optimize:

... ORDER BY / GROUP BY "First Name" ...

... ORDER BY / GROUP BY "First Name", "Last Name" ...

For a DISTINCT operation to consider a multiple-column index, the projection list must match all index columns, although they do not have to be in the exact order. The previous index can help optimize the following queries:

... DISTINCT "Last Name", "First Name" ...

... DISTINCT "First Name", "Last Name" ...

It will not help optimize:

... DISTINCT "First Name" ...

... DISTINCT "Last Name" ...

Note:
If your query always returns unique rows on its own, avoid specifying the DISTINCT keyword, because it only adds overhead.




Rewrite Subqueries to Use JOIN

Sometimes you can rewrite a subquery to use JOIN and achieve better performance. The advantage of creating a JOIN is that you can evaluate tables in a different order from that defined by the query. The advantage of using a subquery is that it is frequently not necessary to scan all rows from the subquery to evaluate the subquery expression. For example, an EXISTS subquery can return TRUE upon seeing the first qualifying row.

Note:
The SQL Server Compact Edition query processor always rewrites the IN subquery to use JOIN. You do not have to try this approach with queries that contain the IN subquery clause.



For example, to determine all the orders that have at least one item with a 25 percent discount or more, you can use the following EXISTS subquery:

SELECT "Order ID" FROM Orders O

WHERE EXISTS (SELECT "Order ID"

FROM "Order Details" OD

WHERE O."Order ID" = OD."Order ID"

AND Discount >= 0.25)

You can also rewrite this by using JOIN:

SELECT DISTINCT O."Order ID" FROM Orders O INNER JOIN "Order Details"

OD ON O."Order ID" = OD."Order ID" WHERE Discount >= 0.25

Limit Using Outer JOINs
OUTER JOINs are treated differently from INNER JOINs: the optimizer does not try to rearrange the join order of OUTER JOIN tables as it does for INNER JOIN tables. The outer table (the left table in a LEFT OUTER JOIN and the right table in a RIGHT OUTER JOIN) is accessed first, followed by the inner table. This fixed join order can lead to execution plans that are less than optimal.

For more information about a query that contains INNER JOIN, see Microsoft Knowledge Base.

Use Parameterized Queries

If your application runs a series of queries that are only different in some constants, you can improve performance by using a parameterized query. For example, to return orders by different customers, you can run the following query:

SELECT "Customer ID" FROM Orders WHERE "Order ID" = ?

Parameterized queries yield better performance by compiling the query only once and executing the compiled plan multiple times. Programmatically, you must hold on to the command object that contains the cached query plan; destroying the previous command object and creating a new one destroys the cached plan and requires the query to be recompiled. If you must run several parameterized queries in an interleaved manner, you can create several command objects, each caching the execution plan for a parameterized query. This way, you effectively avoid recompilation for all of them.
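The following is a minimal ADO.NET sketch of this pattern; the connection string, parameter name, and order IDs are assumptions. The command object is created and prepared once, and only the parameter value changes between executions, so the cached plan is reused.

using System;
using System.Data;
using System.Data.SqlServerCe;

class ParameterizedLookup
{
    static void PrintCustomers(int[] orderIds)
    {
        using (SqlCeConnection conn = new SqlCeConnection("Data Source=Northwind.sdf"))
        using (SqlCeCommand cmd = conn.CreateCommand())
        {
            conn.Open();
            cmd.CommandText = "SELECT \"Customer ID\" FROM Orders WHERE \"Order ID\" = @orderId";
            cmd.Parameters.Add("@orderId", SqlDbType.Int);
            cmd.Prepare();   // compile once

            foreach (int orderId in orderIds)
            {
                cmd.Parameters["@orderId"].Value = orderId;
                object customerId = cmd.ExecuteScalar();   // reuse the compiled plan
                Console.WriteLine("{0}: {1}", orderId, customerId);
            }
        }
    }
}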

Query Only When You Must

The SQL Server Compact Edition query processor is a powerful tool for querying data stored in your relational database. However, there is an intrinsic cost associated with any query processor: it must compile, optimize, and generate an execution plan before it starts doing the real work of executing the plan. With simple queries that finish quickly, this overhead can account for a large share of the total cost, so implementing the lookup yourself can sometimes provide a vast performance improvement. If every millisecond counts in your critical component, we recommend that you consider the alternative of implementing simple queries yourself. For large and complex queries, the job is still best left to the query processor.

For example, suppose you want to look up the customer ID for a series of orders arranged by their order IDs. There are two ways to accomplish this. First, you could follow these steps for each lookup:

Open the Orders base table

Find the row, using the specific "Order ID"

Retrieve the "Customer ID"

Or you could issue the following query for each lookup:

SELECT "Customer ID" FROM Orders WHERE "Order ID" =


The query-based solution is simpler but slower than the manual solution, because the SQL Server Compact Edition query processor translates the declarative SQL statement into the same three operations that you could implement manually. Those three steps are then performed in sequence. Your choice of which method to use will depend on whether simplicity or performance is more important in your application.
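As a rough sketch of the manual approach, SQL Server Compact Edition lets you open the base table directly with a table-direct command and seek on an index; the index name used here is an assumption, and error handling is omitted.

using System.Data;
using System.Data.SqlServerCe;

class ManualLookup
{
    static string LookupCustomerId(SqlCeConnection conn, int orderId)
    {
        using (SqlCeCommand cmd = conn.CreateCommand())
        {
            cmd.CommandType = CommandType.TableDirect;   // open the Orders base table
            cmd.CommandText = "Orders";
            cmd.IndexName = "Idx_Orders_OrderId";        // assumed index on "Order ID"
            using (SqlCeResultSet rs = cmd.ExecuteResultSet(ResultSetOptions.Scrollable))
            {
                // Find the row using the specific "Order ID", then retrieve the "Customer ID".
                if (rs.Seek(DbSeekOptions.FirstEqual, orderId) && rs.Read())
                {
                    return rs.GetString(rs.GetOrdinal("Customer ID"));
                }
                return null;
            }
        }
    }
}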

Monday, August 06, 2007

Participating in Transactions in XML Web Services Created Using ASP.NET

The transaction support for XML Web services leverages the support found in the common language runtime, which is based on the same distributed transaction model found in Microsoft Transaction Server (MTS) and COM+ Services. The model is based on declaratively deciding whether an object participates in a transaction, rather than writing specific code to handle committing and rolling back a transaction. For an XML Web service created using ASP.NET, you can declare an XML Web service's transactional behavior by setting the TransactionOption property of the WebMethod attribute applied to an XML Web service method. If an exception is thrown while the XML Web service method is executing, the transaction is automatically aborted; conversely, if no exception occurs, the transaction is automatically committed.

The TransactionOption property of the WebMethod attribute specifies how an XML Web service method participates in a transaction. Although this declarative level represents the logic of a transaction, it is one step removed from the physical transaction. A physical transaction occurs when a transactional object accesses a data resource, such as a database or message queue. The transaction associated with the object automatically flows to the appropriate resource manager. A .NET Framework data provider, such as the .NET Framework Data Provider for SQL Server or the .NET Framework Data Provider for OLE DB, looks up the transaction in the object's context and enlists in the transaction through the Distributed Transaction Coordinator (DTC). The entire transaction occurs automatically.

XML Web service methods can only participate in a transaction as the root of a new transaction. As the root of a new transaction, all interactions with resource managers, such as servers running Microsoft SQL Server, Microsoft Message Queuing (also known as MSMQ), and Microsoft Host Integration Server maintain the ACID properties required to run robust distributed applications. XML Web service methods that call other XML Web service methods participate in different transactions, as transactions do not flow across XML Web service methods.

The following code example shows an XML Web service that exposes a single XML Web service method, called DeleteAuthor. This XML Web service method performs a database operation that is scoped within a transaction. If the database operation throws an exception, the transaction is automatically aborted; otherwise, the transaction is automatically committed.

<%@ WebService Language="C#" Class="Orders" %>
<%@ Assembly name="System.EnterpriseServices,Version=1.0.3300.0,Culture=neutral,PublicKeyToken=b03f5f7f11d50a3a" %>

using System;
using System.Data;
using System.Data.SqlClient;
using System.Web.Services;
using System.EnterpriseServices;

public class Orders : WebService
{
    [ WebMethod(TransactionOption=TransactionOption.RequiresNew)]
    public int DeleteAuthor(string lastName)
    {
        String deleteCmdSQL = "DELETE FROM authors WHERE au_lname='" +
            lastName + "'";
        String exceptionCausingCmdSQL = "DELETE FROM NonExistingTable WHERE au_lname='" +
            lastName + "'";

        SqlConnection sqlConn = new SqlConnection(
            "Persist Security Info=False;Integrated Security=SSPI;database=pubs;server=myserver");

        SqlCommand deleteCmd = new SqlCommand(deleteCmdSQL, sqlConn);
        SqlCommand exceptionCausingCmd = new SqlCommand(exceptionCausingCmdSQL, sqlConn);

        // This command should execute properly.
        deleteCmd.Connection.Open();
        deleteCmd.ExecuteNonQuery();

        // This command results in an exception, so the first command is
        // automatically rolled back. Because the XML Web service method is
        // participating in a transaction, and an exception occurs, ASP.NET
        // automatically aborts the transaction. The deleteCmd that
        // executed properly is rolled back.
        int cmdResult = exceptionCausingCmd.ExecuteNonQuery();

        sqlConn.Close();

        return cmdResult;
    }
}

Managing State in XML Web Services Created Using ASP.NET

XML Web services have access to the same state management options as other ASP.NET applications when the class implementing the XML Web service derives from the WebService class. The WebService class contains many of the common ASP.NET objects, including the Session and Application objects.

The Application object provides a mechanism for storing data that is accessible to all code running within the Web application, whereas the Session object allows data to be stored on a per-client session basis. If the client supports cookies, a cookie can identify the client session. Data stored in the Session object is available only when the EnableSession property of the WebMethod attribute is set to true for a class deriving from WebService. A class deriving from WebService automatically has access to the Application object.


To access and store state specific to a particular client session

Declare an XML Web service method, setting the EnableSession property of the WebMethod attribute to true.

[ WebMethod(EnableSession=true) ]
public int PerSessionServiceUsage()

The following code example is an XML Web service with two XML Web service methods: ServiceUsage and PerSessionServiceUsage. ServiceUsage is a hit counter that increments every time the ServiceUsage XML Web service method is accessed, regardless of the client communicating with the XML Web service method. For instance, if three clients call the ServiceUsage XML Web service method consecutively, the last one receives a return value of 3. PerSessionServiceUsage, however, is a hit counter for a particular client session. If three clients access PerSessionServiceUsage consecutively, each will receive the same result of 1 on the first call.


<%@ WebService Language="C#" Class="ServerUsage" %>
using System.Web.Services;

public class ServerUsage : WebService {
[ WebMethod(Description="Number of times this service has been accessed.") ]
public int ServiceUsage() {
// If the XML Web service method hasn't been accessed,
// initialize it to 1.
if (Application["appMyServiceUsage"] == null)
{
Application["appMyServiceUsage"] = 1;
}
else
{
// Increment the usage count.
Application["appMyServiceUsage"] = ((int) Application["appMyServiceUsage"]) + 1;
}
return (int) Application["appMyServiceUsage"];
}

[ WebMethod(Description="Number of times a particualr client session has accessed this XML Web service method.",EnableSession=true) ]
public int PerSessionServiceUsage() {
// If the XML Web service method hasn't been accessed, initialize
// it to 1.
if (Session["MyServiceUsage"] == null)
{
Session["MyServiceUsage"] = 1;
}
else
{
// Increment the usage count.
Session["MyServiceUsage"] = ((int) Session["MyServiceUsage"]) + 1;
}
return (int) Session["MyServiceUsage"];
}
}

Wednesday, August 01, 2007

Post data to other Web pages with ASP.NET 2.0

ASP.NET 2.0's PostBackUrl attribute allows you to designate where a Web form and its data is sent when submitted.

Standard HTML forms allow you to post or send data to another page or application via the method attribute of the form element. In ASP.NET 1.x, Web pages utilise the postback mechanism where the page's data is submitted back to the page itself. ASP.NET 2.0 provides additional functionality by allowing a Web page to be submitted to another page. This week, I examine this new feature.


The old approach

I want to take a minute to cover the older HTML approach to gain perspective. The HTML form element contains the action attribute to designate what server side resource will handle the submitted data. The following code provides an example.

<html>
<head>
<title>Sample HTML form</title>
</head>
<body>
<form name="frmSample" method="post" action="target_url">
<input type="text" name="fullname" id="fullname" />
<input type="submit" name="Submit" value="submit" />
</form>
</body>
</html>

The value entered in the text field (fullname) is submitted to the page or program specified in the form element's action attribute. For ASP.NET developers, standard HTML forms are seldom, if ever, used.

ASP.NET developers have plenty of options when faced with the task of passing values from page to page. This includes session variables, cookies, querystring variables, caching, and even Server.Transfer, but ASP.NET 2.0 adds another option.


ASP.NET 2.0 alternative

When designing ASP.NET 2.0, Microsoft recognised the need to easily cross post data between Web forms. With that in mind, the PostBackUrl attribute was added to the ASP.NET button control. It allows you to designate where the form and its data are sent when submitted (via the URL assigned to the PostBackUrl attribute). Basically, cross posting is a client side transfer that uses JavaScript behind the scenes.

The ASP.NET Web form in Listing A has two text fields (name and e-mail address) and a button to submit the data. The PostBackUrl attribute of the submit button is assigned to another Web form, so the data is sent to that page when the form is submitted. Note: The form is set up to post the data when it is submitted via the method attribute of the form element, but this is unnecessary since all cross postbacks utilise post by design.

Listing A

<%@ Page language="vb" %>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >

<html><head>

<title>Cross Postback Example</title>

</head><body>

<form id="frmCrossPostback1" method="post" runat="server">

<asp:Label ID="lblName" runat="server" Text="Name:"></asp:Label>

<asp:TextBox ID="txtName" runat="server"></asp:TextBox><br />

<asp:Label ID="lblE-mailAddress" runat="server" Text="E-mail:"></asp:Label>

<asp:TextBox ID="txtE-mailAddress" runat="server"></asp:TextBox><br />

<asp:Button ID="btnSubmit" runat="server" Text="Submit" PostBackUrl="CrossPostback2.aspx" />

</form></body></html>
Working with previous pages

The IsPostBack property of an ASP.NET Page object is not set to true when the page is loaded via a cross postback call. However, a new property called PreviousPage allows you to access and work with pages utilising cross postbacks.

When a cross page request occurs, the PreviousPage property of the current Page class holds a reference to the page that caused the postback. If the page is not the target of a cross-page posting, or if the pages are in different applications, the PreviousPage property is not initialised.

You can determine whether a page is being loaded as a result of a cross postback by checking the PreviousPage object. A null value indicates a regular load, while a non-null value signals a cross postback. Also, the Page class contains a new property called IsCrossPagePostBack to specifically determine whether the page is loading as the result of a cross postback.

Once you determine that a cross postback has occurred, you can access controls on the calling page via the PreviousPage object's FindControl method. The code in Listing B is the second page in our example; it is called by the page in the previous listing.

Listing B


<%@ Page language="vb" %>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >

<html><head>

<title>Cross Postback Example 2</title>

</head><body>

<script language="vb" runat="server">

Sub Page_Load()

If Not (Page.PreviousPage Is Nothing) Then

If Not (Page.IsCrossPagePostBack) Then
Response.Write("Name:" + CType(PreviousPage.FindControl("txtName"), TextBox).Text + "<BR>")
Response.Write("E-mail:" + CType(PreviousPage.FindControl("txtE-mailAddress"), TextBox).Text + "<BR>")

End If

End If

End Sub

</script></body></html>
The page determines whether it is being called via a cross postback. If so, the values from the calling page are accessed by using the FindControl method, converting the controls it returns to TextBox controls, and displaying their Text properties.


You can convert the entire PreviousPage object to the type of the page initiating the cross postback. This approach allows you to access the public properties and methods of a page. Before I offer an example of this technique, I must rewrite the first example to include public properties. Listing C is the first listing with two properties added to give access to field values.

Listing C


<%@ Page language="vb" %>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >

<html><head>

<title>Cross Postback Example</title>

<script language="vb" runat="server">

Public ReadOnly Property Name

Get

Return Me.txtName.Text

End Get

End Property

Public ReadOnly Property EmailAddress

Get

Return Me.txtEmailAddress.Text

End Get

End Property

</script></head><body>

<form id="frmCrossPostback1" method="post" runat="server">

<asp:Label ID="lblName" runat="server" Text="Name:"></asp:Label>

<asp:TextBox ID="txtName" runat="server"></asp:TextBox><br />

<asp:Label ID="lblE-mailAddress" runat="server" Text="E-mail:"></asp:Label>

<asp:TextBox ID="txtE-mailAddress" runat="server"></asp:TextBox><br />

<asp:Button ID="btnSubmit" runat="server" Text="Submit" PostBackUrl="CrossPostback2.aspx" />

</form></body></html>
Now that the properties have been established, you can easily access them. One caveat is that the Page class's PreviousPage object must be converted to the appropriate page type before you can properly work with the properties.

Listing D illustrates this point by referencing the calling page at the top of the page so that it can be used in the code. Once it is referenced, the VB.NET code converts the PreviousPage object to the appropriate type using the CType function. At this point, the properties may be used as the code demonstrates.

Listing D


<%@ Page language="vb"
%>

<%@ Reference Page="~/CrossPostback1.aspx" %>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >

<html><head>

<title>Cross Postback Example 3</title>

</head><body>

<script language="vb" runat="server">

Sub Page_Load()

Dim cppPage As CrossPostback1_aspx

If Not (Page.PreviousPage Is Nothing) Then

If Not (Page.IsCrossPagePostBack) Then

If (Page.PreviousPage.IsValid) Then
cppPage = CType(PreviousPage, CrossPostBack1_aspx)
Response.Write("Name:" + cppPage.Name + "<br>")
Response.Write("E-mail:" + cppPage.E-mailAddress)

End If

End If

End If

End Sub

</script></body></html>

A note about the use of the IsValid property of the PreviousPage object in the previous listing: the IsValid property of a previous page allows you to ensure the page passed all validation tests before working with it.

Summary

Passing data values between Web pages has many applications, including maintaining personal user information. Legacy Web solutions, like using the querystring and cookies, allow you to pass and maintain values, and you can easily direct one page to another when it is submitted. ASP.NET 1.1 supported these solutions as well as additional ones, but ASP.NET 2.0 addresses the issue head on by supporting cross page postbacks. This makes it easy for one Web page to process data from another. Take advantage of this new concept when you're working on your next ASP.NET 2.0 application.

Tuesday, July 31, 2007

Generic Classes

Many developers will view themselves primarily as consumers of generics. However, as you get more comfortable with generics, you're likely to find yourself introducing your own generic classes and frameworks. Before you can make that leap, though, you'll need to get comfortable with all the syntactic mutations that come along with creating your own generic classes. Fortunately, you'll notice that the syntax rules for defining generic classes follow many of the same patterns you've already grown accustomed to with non-generic types. So, although there are certainly plenty of new generic concepts you'll need to absorb, you're likely to find it quite easy to make the transition to writing your own generic types.

Parameterizing Types

In a very general sense, a generic class is really just a class that accepts parameters. As such, a generic class really ends up representing more of an abstract blueprint for a type that will, ultimately, be used in the construction of one or more specific types at run-time. This is one area where, I believe, the C++ term templates actually provides developers with a better conceptual model. This term conjures up a clearer metaphor for how the type parameters of a generic class serve as placeholders that get replaced by actual data types when a generic class is constructed. Of course, as you might expect, this same term also brings with it some conceptual inaccuracies that don't precisely match generics.

The idea of parameterizing your classes shouldn't seem all that foreign. In reality, the mindset behind parameterizing a class is not all that different than the rationale you would use for parameterizing a method in one of your existing classes. The goals in both scenarios are conceptually very similar. For example, suppose you had the following method in one of your classes that was used to locate all retired employees that had an age that was greater than or equal to the passed-in parameter (minAge):

[C# code]
public IList LookupRetiredEmployees(int minAge) {
    IList retVal = new ArrayList();
    foreach (Employee emp in masterEmployeeCollection) {
        if ((emp.Age >= minAge) && (emp.Status == EmpStatus.Retired))
            retVal.Add(emp);
    }
    return retVal;
}
Now, at some point, you happen to identify a handful of additional methods that are providing similar functionality. Each of these methods only varies based on the status (Retired, Active, and so on) of the employees being processed. This represents an obvious opportunity to refactor through parameterization. By adding status as a parameter to this method, you can make it much more versatile and eliminate the need for all the separate implementations. This is something you've likely done. It's a simple, common flavor of refactoring that happens every day.

So, with this example in mind, you can imagine applying this same mentality to your classes. Classes, like methods, can now be viewed as being further generalized through the use of type parameters. To better grasp this concept, let's go ahead and build a non-generic class that will be your candidate for further generalization:

[C# code]
public class CustomerStack {
    private Customer[] _items;
    private int _count;

    public void Push(Customer item) {...}
    public Customer Pop() {...}
}
This is the classic implementation of a type-safe stack that has been created to contain collections of Customers. There's nothing spectacular about it. But, as should be apparent by now, this class is the perfect candidate to be refactored with generics. To make your stack generic, you simply need to add a type parameter (T in this example) to your type and replace all of your references to the Customer with the name of your generic type parameter. The result would appear as follows:

[C# code]
public class Stack<T> {
    private T[] _items;
    private int _count;

    public void Push(T item) {...}
    public T Pop() {...}
}
Pretty simple. It's really not all that different than adding a parameter to a method. It's as if generics have just allowed you to widen the scope of what can be parameterized to include classes.
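To see the payoff, here is a minimal usage sketch: the same generic Stack definition constructed with two different type arguments (Customer is assumed to be defined elsewhere, as in the earlier example).

[C# code]
Stack<Customer> customers = new Stack<Customer>();
customers.Push(new Customer());
Customer c = customers.Pop();

Stack<int> numbers = new Stack<int>();
numbers.Push(42);
int n = numbers.Pop();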

Type Parameters
By now, you should be comfortable with the idea of type parameters and how they serve as a type placeholder for the type arguments that will be supplied when your generic class is constructed. Now let's look at what, precisely, can appear in a type parameter list for a generic class.

First, let's start with the names that can be assigned to type parameters. The rules for naming a type parameter are similar to the rules used when defining any identifier. That said, there are guidelines that you should follow in the naming of your type parameters to improve the readability and maintainability of your generic class. These guidelines, and others, are discussed in Chapter 10, "Generics Guidelines" of Professional .NET 2.0 Generics.

A generic class may also accept multiple type parameters. These parameters are provided as a delimited list of identifiers:

[C# code]
public class Stack<T, U> {...}
As you might suspect, each type parameter name must be unique within the parameter list as well as within the scope of the class. You cannot, for example, have a type parameter T along with a field that is also named T. You are also prevented from having a type parameter and the class that accepts that parameter share the same name. Fortunately, the names you're likely to use for type parameters and classes will rarely cause collisions.

In terms of scope, a type parameter can only be referenced within the scope of the generic class that declared it. So, if you have a child generic class B that descends from generic class A, class B will not be able to reference any type parameters that were declared as part of class A.

The list of type parameters may also contain constraints that are used to further qualify what type arguments can be supplied of a given type parameter. Chapter 7, "Generic Constraints" (Professional .NET 2.0 Generics) looks into the relevance and application of constraints in more detail.
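To give a flavor of what a constraint looks like before you get to that chapter, here is a small illustrative sketch (this class is not an example from the book): the where clause restricts the type arguments callers may supply, so Add can rely on IComparable<T>.

[C# code]
using System;
using System.Collections.Generic;

public class SortedCollection<T> where T : IComparable<T>
{
    private List<T> _items = new List<T>();

    public void Add(T item)
    {
        // BinarySearch uses the IComparable<T> implementation guaranteed by the constraint.
        int index = _items.BinarySearch(item);
        _items.Insert(index < 0 ? ~index : index, item);
    }
}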

Overloaded Types
The .NET implementation of generics allows programmers to create overloaded types. This means that types, like methods, can be overloaded based on their type parameter signature. Consider the declarations of the following types:
[C# Code]
public class MyType {
}

public class MyType<T> {
    ...
}

public class MyType<T, U> {
    ...
}
Three types are declared here, all with the same name but different type parameter lists. At first glance, this may seem invalid. However, if you look at it from an overloading perspective, you can see how the compiler would treat each of these three types as being unique. This can introduce some level of confusion for clients, and this is certainly something you'll want to factor in as you consider building your own generic types. That said, this is still a very powerful concept that, when leveraged correctly, can enrich the power of your generic types.

Static Constructors
All classes support the idea of a static (shared) constructor. As you might expect, a static constructor is a constructor that can be called without requiring clients to create an instance of a given class. These constructors provide a convenient mechanism for initializing classes that leverage static types.

Now, when it comes to generics, you have to also consider the accessibility of your class's type parameters within the scope of your static constructor. As it turns out, static constructors are granted full access to any type parameters that are associated with your generic classes. Here's an example of a static constructor in action:

[C# code]
using System;
using System.Collections.Generic;

public class MySampleClass<T> {
    private static List<T> _values;

    static MySampleClass() {
        if (typeof(T).IsAbstract)
            throw new Exception("T must not be abstract");
        else
            _values = new List<T>();
    }
}
This example creates a class that accepts a single type parameter, T. The class has a data member that is used to hold a static collection of items of type T. However, you want to be sure, as part of initialization, that T is never abstract. In order to enforce this constraint, this example includes a static constructor that examines the type information about T and throws an exception if the type of T is abstract. If it's not abstract, the constructor proceeds with the initialization of its static collection.

This is just one application of static constructors and generic types. You should be able to see, from this example, how static constructors can be used as a common mechanism for initializing any generic class that has static data members.

Monday, July 23, 2007

A Preview of What's New in C# 3.0

This article discusses the following major new enhancements expected in C# 3.0:

Implicitly typed local variables
Anonymous types
Extension methods
Object and collection initializers
Lambda expressions
Query expressions
Expression Trees

Implicitly Typed Local Variables
C# 3.0 introduces a new keyword called "var". Var allows you to declare a new variable, whose type is implicitly inferred from the expression used to initialize the variable. In other words, the following is valid syntax in C# 3.0:

var i = 1;
The preceding line initializes the variable i to value 1 and gives it the type of integer. Note that "i" is strongly typed to an integer—it is not an object or a VB6 variant, nor does it carry the overhead of an object or a variant.

To ensure the strongly typed nature of the variable that is declared with the var keyword, C# 3.0 requires that you put the assignment (initializer) on the same line as the declaration (declarator). Also, the initializer has to be an expression, not an object or collection initializer, and it cannot be null. If multiple declarators exist on the same variable, they must all evaluate to the same type at compile time.

Implicitly typed arrays, on the other hand, are possible using a slightly different syntax, as shown below:

var intArr = new[] {1,2,3,4} ;
The above line of code would end up declaring intArr as int[].

The var keyword allows you to refer to instances of anonymous types (described in the next section) and yet the instances are statically typed. So, when you create instances of a class that contain an arbitrary set of data, you don't need to predefine a class to both hold that structure and be able to hold that data in a statically typed variable.

Anonymous types

C# 3.0 gives you the flexibility to create an instance of a class without having to write code for the class beforehand. So, you now can write code as shown below:

new {hair="black", skin="green", teethCount=64}
The preceding line of code, with the help of the "new" keyword, gives you a new type that has three properties: hair, skin, and teethCount. Behind the scenes, the C# compiler would create a class that looks as follows:

class __Anonymous1
{
    private string _hair = "black";
    private string _skin = "green";
    private int _teethCount = 64;

    public string hair { get { return _hair; } set { _hair = value; } }
    public string skin { get { return _skin; } set { _skin = value; } }
    public int teethCount { get { return _teethCount; } set { _teethCount = value; } }
}

In fact, if another anonymous type that specified the same sequence of names and types were created, the compiler would be smart enough to create only a single anonymous type for both instances to use. Also, because the instances are, as you may have guessed, simply instances of the same class, they can be exchanged because the types are really the same.

Now you have a class, but you still need something to hold an instance of the above class. This is where the "var" keyword comes in handy; it lets you hold a statically typed instance of the above instance of the anonymous type. Here is a rather simple and easy use of an anonymous type:

var frankenstein = new {hair="black", skin="green", teethCount=64};

Extension methods

Extension methods enable you to extend various types with additional static methods. However, they are quite limited and should be used as a last resort—only where instance methods are insufficient.

Extension methods can be declared only in static classes and are identified by the keyword "this" as a modifier on the first parameter of the method. The following is an example of a valid extension method:

public static int ToInt32(this string s)
{
return Convert.ToInt32(s) ;
}

If the static class that contains the above method is imported using the "using" keyword, the ToInt32 method will appear in existing types (albeit in lower precedence to existing instance methods), and you will be able to compile and execute code that looks as follows:

string s = "1";
int i = s.ToInt32();

This allows you to take advantage of the extensible nature of various built-in or defined types and add newer methods to them.
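For completeness, here is the same method hosted in a static class; the class and namespace names are assumptions. Importing the namespace with a using directive is what makes ToInt32 show up on string.

using System;

namespace MyStringExtensions
{
    public static class StringConversions
    {
        // The "this" modifier on the first parameter marks this as an extension method.
        public static int ToInt32(this string s)
        {
            return Convert.ToInt32(s);
        }
    }
}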

Object and collection initializers
C# 3.0 is expected to allow you to include an initializer that specifies the initial values of the members of a newly created object or collection. This enables you to combine declaration and initialization in one step.

For instance, if you defined a CoOrdinate class as follows:

public class CoOrdinate
{
public int x ;
public int y;
}

You then could declare and initialize a CoOrdinate object using an object initializer, like this:

var myCoOrd = new CoOrdinate{ x = 0, y= 0} ;
The above code may have made you raise your eyebrows and ask, "Why not just write the following:"

var myCoOrd = new CoOrdinate(0, 0) ;
Note: I never declared a constructor that accepted two parameters in my class. In fact, initializing the object using an object initializer essentially is equivalent to calling a parameterless (default) constructor of the CoOrdinate object and then assigning the relevant values.
Similarly, you should easily be able to give values to collections in a rather concise and compact manner in C# 3.0. For instance, the following C# 2.0 code:

List<string> animals = new List<string>();
animals.Add("monkey");
animals.Add("donkey");
animals.Add("cow");
animals.Add("dog");
animals.Add("cat");

Now can be shortened to simply:

List<string> animals = new List<string> {
"monkey", "donkey", "cow", "dog", "cat" } ;

Lambda expressions
C# 1.x allowed you to write code blocks in methods, which you could invoke easily using delegates. Delegates are definitely useful, and they are used throughout the framework, but in many instances you had to declare a method or a class just to use one. Thus, to give you an easier and more concise way of writing code, C# 2.0 allowed you to replace standard calls to delegates with anonymous methods. The following code may have been written in .NET 1.1 or earlier:

class Program
{
    delegate void DemoDelegate();

    static void Main(string[] args)
    {
        DemoDelegate myDelegate = new DemoDelegate(SayHi);
        myDelegate();
    }

    static void SayHi()
    {
        Console.WriteLine("Hiya!!");
    }
}

In C# 2.0, using anonymous methods, you could rewrite the code as follows:

class Program
{
    delegate void DemoDelegate();

    static void Main(string[] args)
    {
        DemoDelegate myDelegate = delegate()
        {
            Console.WriteLine("Hiya!!");
        };
        myDelegate();
    }
}

Whereas anonymous methods are a step above method-based delegate invocation, lambda expressions allow you to write anonymous methods in a more concise, functional syntax.

You can write a lambda expression as a parameter list, followed by the => token, followed by an expression or statement block. The above code can now be replaced with the following code:

class Program
{
    delegate void DemoDelegate();

    static void Main(string[] args)
    {
        DemoDelegate myDelegate = () => Console.WriteLine("Hiya!!");
        myDelegate();
    }
}

Although Lambda expressions may appear to be simply a more concise way of writing anonymous methods, in reality they also are a functional superset of anonymous methods. Specifically, Lambda expressions offer the following additional functionality:

They permit parameter types to be inferred. Anonymous methods will require you to explicitly state each and every type.
They can hold either query expressions (described in the following section) or C# statements.
They can be treated as data using expression trees (described later). This cannot be done using Anonymous methods.


Query expressions
Even though further enhancements may be introduced in the coming months as C# 3.0 matures, the new features described in the preceding sections make it a lot easier to work with data inside C# in general. Query expressions, also known as LINQ (Language Integrated Query), allow you to write SQL-like syntax in C#.

For instance, you may have a class that describes your data as follows:

public class CoOrdinate
{
public int x ;
public int y;
}

You now could easily declare the logical equivalent of a database table inside C# as follows:

// Use Object and collection initializers
List<CoOrdinate> coords = ... ;

And now that you have your data as a collection that implements IEnumerable<T>, you easily can query this data as follows:

var filteredCoords =
    from c in coords
    where c.x == 1
    select new { c.x, c.y };

In the SQL-like syntax above, "from", "where", and "select" are query expressions that take advantage of C# 3.0 features such as anonymous types, extension methods, implicit typed local variables, and so forth. This way, you can leverage SQL-like syntax and work with disconnected data easily.

Each query expression is actually translated into a C#-like invocation behind the scenes. For instance, the following:

where c.x == 1
Translates to this:

coords.Where(c => c.x == 1)
As you can see, the above looks an awful lot like a lambda expression and extension method. C# 3.0 has many other query expressions and rules that surround them.


Expression Trees

C# 3.0 includes a new type that allows expressions to be treated as data at runtime. This type, System.Expressions.Expression, is simply an in-memory representation of a lambda expression. The end result is that your code can modify and inspect lambda expressions at runtime.

The following is an example of an expression tree:

Expression<DemoDelegate> filter = () => Console.WriteLine("Hiya!!");
With the above expression tree setup, you easily can inspect the contents of the tree by using various properties on the filter variable.
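For instance, you could dump parts of the tree as shown in the following sketch; the property names follow the released System.Linq.Expressions API and may differ slightly in the preview bits.

Console.WriteLine(filter.Body);             // the body of the lambda (the WriteLine call)
Console.WriteLine(filter.Parameters.Count); // 0 -- this lambda takes no parameters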

Wednesday, July 18, 2007

FileSystemWatcher in .Net

Use FileSystemWatcher to watch for changes in a specified directory. You can watch for changes in files and subdirectories of the specified directory. The component can watch files on a local computer, a network drive, or a remote computer.

The .NET FileSystemWatcher class makes it possible to quickly and easily launch business processes when certain files or directories are created, modified, or deleted. The FileSystemWatcher class, for example, can be quite useful in application integration, by way of monitoring incoming data files and processing them once an event is raised. It listens to file system change notifications and raises events when a directory, or file(s) in a directory, changes.

That said, it takes some understanding of this class to make it work efficiently. For example, a denial-of-service attack is possible if a malicious program gains access to a directory the FileSystemWatcher component is monitoring and generates so many changes that the component cannot manage them, which can cause a buffer overflow or other drastic effects. Following are some tips and notes on how to use the FileSystemWatcher class to build a more robust solution:

1. Events being raised twice - An event will be raised twice if an event handler (AddHandler FSW.Created, AddressOf FSW_Created) is explicitly specified. This is because, by default, the public events automatically call the respective protected methods (OnChanged, OnCreated, OnDeleted, OnRenamed). To correct this problem, simply remove the explicit event handler (AddHandler ...).

2. Events being raised multiple times - In some cases a single event can generate multiple events that are handled by the component. Because FileSystemWatcher monitors the operating system activities, all events that applications fire will be picked up. For example, when a file is moved from one directory to another, several OnChanged and some OnCreated and OnDeleted events might be raised. Moving a file is a complex operation that consists of multiple simple operations, therefore raising multiple events. Such multiple events can be correctly handled and processed by using a simple boolean flag check (for a first-come-first-serve approach) or by setting the FileSystemWatcher.NotifyFilter property to one of the NotifyFilters values.

3. Thread Safety - Any public static (Shared in VB) members of this class type are thread safe. Any instance members are not guaranteed to be thread safe.

4. File System Type - The FileSystemWatcher does not raise events for CDs and DVDs, because time stamps and properties cannot change for such media types.

5. Filter - The Filter property can be used to watch for changes in a certain type(s) of file(s) (eg: "*.txt"). To watch for changes in all files, set the Filter property to an empty string (""). Hidden files are also monitored by the FileSystemWatcher.

6. Internal Buffer - The FileSystemWatcher component uses an internal buffer to keep track of file system actions. You should set the buffer to an appropriate size for the approximate number of events you expect to receive. By default, the buffer is set to a size of 4 KB. A 4 KB buffer can track changes on approximately 80 files in a directory. Each event takes up 16 bytes in the buffer, plus enough bytes to store the name of the file, in Unicode (2 bytes per character), that the event occurred on. You can use this information to approximate the buffer size you will need. You (re)set the buffer size by setting the InternalBufferSize property. If there are many changes in a short time, the buffer can overflow. This causes the component to lose track of changes in the directory, and it will only provide blanket notification. Increasing the size of the buffer is expensive, as it comes from non-paged memory that cannot be swapped out to disk, so keep the buffer as small as possible. Setting the Filter does not decrease what goes into the buffer. If you are using Microsoft Windows 2000, you should increase the buffer size in increments of 4 KB, because this corresponds to the operating system's default page size. With any other operating system, you should increase the buffer size in increments that correspond to the operating system's default page size. If you are unsure of the default page size for the operating system you are using, the safest way to proceed is to just double the original size of the buffer. This will maintain the original interval needed for your operating system.

7. Directories - Changing a file within a directory you are monitoring with a FileSystemWatcher component generates not only a Changed event on the file but also a similar event for the directory itself. This is because the directory maintains several types of information for each file it contains - the names and sizes of files, their modification dates, attributes, and so on. Use the IncludeSubdirectories property to indicate whether or not you want to include the subdirectories of the directory you are watching. If you turn this off when you do not need it, you will receive fewer events than when it is turned on.

8. Stop Watching - When file system monitoring is not needed anymore, you should set the EnableRaisingEvents property to False to disable the component, otherwise the component will continue listening.

The FileSystemWatcher class provides a useful feature but it should be implemented along with a thorough examination.
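As a small sketch of the housekeeping settings from points 6 through 8 above (the values are illustrative, and watcher is assumed to be an existing FileSystemWatcher instance):

// Illustrative values only.
watcher.InternalBufferSize = 4096 * 4;   // grow the buffer in 4 KB increments (point 6)
watcher.IncludeSubdirectories = false;   // fewer events when subdirectories are not needed (point 7)
// ...later, when monitoring is no longer needed (point 8):
watcher.EnableRaisingEvents = false;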

using System;
using System.IO;

public class Watcher
{

public static void Main()
{

string[] args = System.Environment.GetCommandLineArgs();

// If a directory is not specified, exit program.
if(args.Length != 2)
{
// Display the proper way to call the program.
Console.WriteLine("Usage: Watcher.exe (directory)");
return;
}

// Create a new FileSystemWatcher and set its properties.
FileSystemWatcher watcher = new FileSystemWatcher();
watcher.Path = args[1];
/* Watch for changes in LastAccess and LastWrite times, and
the renaming of files or directories. */
watcher.NotifyFilter = NotifyFilters.LastAccess | NotifyFilters.LastWrite
| NotifyFilters.FileName | NotifyFilters.DirectoryName;
// Only watch text files.
watcher.Filter = "*.txt";

// Add event handlers.
watcher.Changed += new FileSystemEventHandler(OnChanged);
watcher.Created += new FileSystemEventHandler(OnChanged);
watcher.Deleted += new FileSystemEventHandler(OnChanged);
watcher.Renamed += new RenamedEventHandler(OnRenamed);

// Begin watching.
watcher.EnableRaisingEvents = true;

// Wait for the user to quit the program.
Console.WriteLine("Press \'q\' to quit the sample.");
while(Console.Read()!='q');
}

// Define the event handlers.
private static void OnChanged(object source, FileSystemEventArgs e)
{
// Specify what is done when a file is changed, created, or deleted.
Console.WriteLine("File: " + e.FullPath + " " + e.ChangeType);
}

private static void OnRenamed(object source, RenamedEventArgs e)
{
// Specify what is done when a file is renamed.
Console.WriteLine("File: {0} renamed to {1}", e.OldFullPath, e.FullPath);
}
}

Thursday, July 12, 2007

Performance Modeling

Objectives

Engineer for performance up front.
Manage performance risks.
Map business requirements to performance objectives.
Balance performance against other quality-of-service requirements.
Identify and analyze key performance scenarios.
Identify and allocate budgets.
Evaluate your model to ensure you meet your performance objectives.
Identify metrics and test cases.

Overview
This chapter presents performance modeling based on approaches used by Microsoft teams. Similar approaches are recommended elsewhere, including in the book Performance Solutions by Connie U. Smith and Lloyd G. Williams. You and your organization will need to adapt the process for your environment.

Performance modeling is a structured and repeatable approach to modeling the performance of your software. It begins during the early phases of your application design and continues throughout the application life cycle.

Performance is generally ignored until there is a problem. There are several problems with this reactive approach:

Performance problems are frequently introduced early in the design.
Design issues cannot always be fixed through tuning or more efficient coding.
Fixing architectural or design issues later in the cycle is not always possible. At best, it is inefficient, and is usually very expensive.
When you create performance models, you identify application scenarios and your performance objectives. Your performance objectives are your measurable criteria, such as response time, throughput (how much work in how much time), and resource utilization (CPU, memory, disk I/O, and network I/O). You break down your performance scenarios into steps and assign performance budgets. Your budget defines the resources and constraints across your performance objectives.

Performance modeling provides several important benefits:

Performance becomes part of your design.
Modeling helps answer the question "Will your design support your performance objectives?" By building and analyzing models, you can evaluate tradeoffs before you actually build the solution.
You know explicitly what design decisions are influenced by performance and the constraints performance puts on future design decisions. Frequently, these decisions are not captured and can lead to maintenance efforts that work against your original goals.

You avoid surprises in terms of performance when your application is released into production.

You end up with a document of itemized scenarios that help you quickly see what is important. That translates to where to instrument, what to test for, and how to know whether you are on or off track for meeting your performance goals.
Upfront performance modeling is not a replacement for scenario-based load testing or prototyping to validate your design. In fact, you have to prototype and test to determine what things cost and to see if your plan makes sense. Data from your prototypes can help you evaluate early design decisions before implementing a design that will not allow you to meet your performance goals.

The performance model presented in this chapter consists of two parts:

An information structure to help you capture performance-related information. This information can be filled out partially with assumptions and requirements, and it can be made more comprehensive according to your needs.
A process that helps you incrementally define and capture the information, so that the teams working on your solution focus on using, capturing, and sharing the appropriate information.

To use this performance model, do the following:

Set goals. Capture whatever partial performance-related information you have, including your application prototype's metrics, important scenarios, workloads, goals, or budgets. The performance model presented in this chapter is designed to use the partial information you might have in these areas as input. You do not have to completely fill out the data or have a complete understanding of your own requirements and solutions.
Measure. Execute the suggested tasks in the process to iteratively set goals and measure the result of your action, by using the partially completed model as a guide of what to focus on. This allows you to add and refine the information in your model. The new data will inform the next round of goal setting and measurement.
Why Model Performance?
A performance model provides a path to discover what you do not know. The benefits of performance modeling include the following:

Performance becomes a feature of your development process and not an afterthought.
You evaluate your tradeoffs earlier in the life cycle based on measurements.
Test cases show you whether you are trending toward or away from the performance objectives throughout your application life cycle.
Modeling allows you to evaluate your design before investing time and resources to implement a flawed design. Having the processing steps for your performance scenarios laid out enables you to understand the nature of your application's work. By knowing the nature of this work and the constraints affecting that work, you can make more informed decisions.

Your model can reveal the following about your application:

What are the relevant code paths and how do they affect performance?
Where does the use of resources or computation affect performance?
Which are the most frequently executed code paths? This helps you identify where to spend time tuning.
What are the key steps that access resources and lead to contention?
Where is your code in relation to resources (local, remote)?
What tradeoffs have you made for performance?
Which components have relationships to other components or resources?
Where are your synchronous and asynchronous calls?
What is your I/O-bound work and what is your CPU-bound work?
And the model can reveal the following about your goals:

What is the priority and achievability of different performance goals?
Where have your performance goals affected design?

Risk Management

The time, effort, and money you invest up front in performance modeling should be proportional to project risk. For a project with significant risk, where performance is critical, you may spend more time and energy up front developing your model. For a project where performance is less of a concern, your modeling approach might be as simple as white-boarding your performance scenarios.

Budget
Performance modeling is essentially a "budgeting" exercise. Budget represents your constraints and enables you to specify how much you can spend (resource-wise) and how you plan to spend it. Constraints govern your total spending, and then you can decide where to spend to get to the total. You assign budget in terms of response time, throughput, latency, and resource utilization.

Performance modeling does not need to involve a lot of up-front work. In fact, it should be part of the work you already do. To get started, you can even use a whiteboard to quickly capture the key scenarios and break them down into component steps.

If you know your goals, you can quickly assess if your scenarios and steps are within range, or if you need to change your design to accommodate the budget. If you do not know your goals (particularly resource utilization), you need to define your baselines. Either way, it is not long before you can start prototyping and measuring to get some data to work with.

What You Must Know
Performance models are created in document form by using the tool of your choice (a simple Word document works well). The document becomes a communication point for other team members. The performance model contains a lot of key information, including goals, budgets (time and resource utilization), scenarios, and workloads. Use the performance model to play out possibilities and evaluate alternatives, before committing to a design or implementation decision. You need to measure to know the cost of your tools. For example, how much does a certain API cost you?

Best Practices
Consider the following best practices when creating performance models:

Determine response time and resource utilization budgets for your design.
Identify your target deployment environment.
Do not replace scenario-based load testing with performance modeling, for the following reasons:
Performance modeling suggests which areas should be worked on but cannot predict the improvement caused by a change.
Performance modeling informs the scenario-based load testing by providing goals and useful measurements.
Modeled performance may ignore many scenario-based load conditions that can have an enormous impact on overall performance.
Information in the Performance Model
The information in the performance model is divided into different areas. Each area focuses on capturing one perspective. Each area has important attributes that help you execute the process.

Information in the Performance Model

Application Description: The design of the application in terms of its layers and its target infrastructure.
Scenarios: Critical and significant use cases, sequence diagrams, and user stories relevant to performance.
Performance Objectives: Response time, throughput, and resource utilization.
Budgets: Constraints you set on the execution of use cases, such as maximum execution time and resource utilization levels, including CPU, memory, disk I/O, and network I/O.
Measurements: Actual performance metrics from running tests, in terms of resource costs and performance issues.
Workload Goals: Goals for the number of users, concurrent users, data volumes, and information about the desired use of the application.
Baseline Hardware: Description of the hardware on which tests will be run, in terms of network topology, bandwidth, CPU, memory, disk, and so on.

Additional Information You Might Need

Quality-of-Service (QoS) Requirements: QoS requirements, such as security, maintainability, and interoperability, may impact your performance. You should have an agreement across software and infrastructure teams about QoS restrictions and requirements.
Workload Requirements: Total number of users, concurrent users, data volumes, and information about the expected use of the application.

Inputs

A number of inputs are required for the performance modeling process. These include initial (maybe even tentative) information about the following:

Scenarios and design documentation about critical and significant use cases.
Application design and target infrastructure and any constraints imposed by the infrastructure.
QoS requirements and infrastructure constraints, including service level agreements (SLAs).
Workload requirements derived from marketing data on prospective customers.
Outputs

The output from performance modeling is the following:
A performance model document.
Test cases with goals.
Performance Model Document
The performance model document may contain the following:

Performance objectives.
Budgets.
Workloads.
Itemized scenarios with goals.
Test cases with goals.
An itemized scenario is a scenario that you have broken down into processing steps. For example, an order scenario might include authentication, order input validation, business rules validation, and orders being committed to the database. The itemized scenarios include assigned budgets and performance objectives for each step in the scenario.

Test Cases with Goals
You use test cases to generate performance metrics. They validate your application against performance objectives. Test cases help you determine whether you are trending toward or away from your performance objectives.

Eight-Step Performance Model

The performance modeling process involves the following steps:

Identify key scenarios. Identify scenarios where performance is important and scenarios that pose the most risk to your performance objectives.
Identify workload. Identify how many users and concurrent users your system needs to support.
Identify performance objectives. Define performance objectives for each of your key scenarios. Performance objectives reflect business requirements.
Identify budget. Identify your budget or constraints. This includes the maximum execution time in which an operation must be completed and resource utilization constraints, such as CPU, memory, disk I/O, and network I/O.
Identify processing steps. Break down your key scenarios into component processing steps.
Allocate budget. Spread your budget (determined in Step 4) across your processing steps (determined in Step 5) to meet your performance objectives (defined in Step 3).
Evaluate. Evaluate your design against objectives and budget. You may need to modify your design or spread your response time and resource utilization budget differently to meet your performance objectives.
Validate. Validate your model and estimates. This is an ongoing activity and includes prototyping, assessing, and measuring.
The next sections describe each of the preceding steps.

Step 1 – Identify Key Scenarios
Identify your application scenarios that are important from a performance perspective. If you have documented use cases or user stories, use them to help you define your scenarios. Key scenarios include the following:

Critical scenarios.
Significant scenarios.
Critical Scenarios
These are the scenarios that have specific performance expectations or requirements. Examples include scenarios covered by SLAs or those that have specific performance objectives.

Significant Scenarios
Significant scenarios do not have specific performance objectives such as a response time goal, but they may impact other critical scenarios.

To help identify significant scenarios, identify scenarios with the following characteristics:

Scenarios that run in parallel to a performance-critical scenario.
Scenarios that are frequently executed.
Scenarios that account for a high percentage of system use.
Scenarios that consume significant system resources.
Do not ignore your significant scenarios. Your significant scenarios can influence whether your critical scenarios meet their performance objectives. Also, do not forget to consider how your system will behave if different significant or critical scenarios are being run concurrently by different users. This "parallel integration" often drives key decisions about your application's units of work. For example, to keep search response brisk, you might need to commit orders one line item at a time.

Step 2 – Identify Workload
Workload is usually derived from marketing data. It includes the following:
Total users.
Concurrently active users.
Data volumes.
Transaction volumes and transaction mix.
For performance modeling, you need to identify how this workload applies to an individual scenario. The following are example requirements:
You might need to support 100 concurrent users browsing.
You might need to support 10 concurrent users placing orders.
Note Concurrent users are those users that hit a Web site at exactly the same moment. Simultaneous users are those users who have active connections to the same site.

Step 3 – Identify Performance Objectives

For each scenario identified in Step 1, write down the performance objectives. The performance objectives are determined by your business requirements.

Performance objectives usually include the following:

Response time. For example, the product catalog must be displayed in less than 3 seconds.
Throughput. For example, the system must support 100 transactions per second.
Resource utilization. A frequently overlooked aspect is how much resource your application is consuming, in terms of CPU, memory, disk I/O, and network I/O.
Consider the following when establishing your performance objectives:

Workload requirements.
Service level agreements.
Response times.
Projected growth.
Lifetime of your application.
For projected growth, you need to consider whether your design will meet your needs in six months time, or one year from now. If the application has a lifetime of only six months, are you prepared to trade some extensibility for performance? If your application is likely to have a long lifetime, what performance are you willing to trade for maintainability?

Step 4 – Identify Budget
Budgets are your constraints. For example, what is the longest acceptable amount of time that an operation should take to complete, beyond which your application fails to meet its performance objectives.

Your budget is usually specified in terms of the following:

Execution time.
Resource utilization.
Execution Time
Your execution time constraints determine the maximum amount of time that particular operations can take.

Resource Utilization
Resource utilization requirements define the threshold utilization levels for available resources. For example, you might have a peak processor utilization limit of 75 percent and your memory consumption must not exceed 50 MB.

Common resources to consider include the following:

CPU.
Memory.
Network I/O.
Disk I/O.
More Information

Additional Considerations
Execution time and resource utilization are helpful in the context of your performance objectives. However, budget has several other dimensions you may be subject to. Other considerations for budget might include the following:

Network. Network considerations include bandwidth.
Hardware. Hardware considerations include items, such as servers, memory, and CPUs.
Resource dependencies. Resource dependency considerations include items, such as the number of available database connections and Web service connections.
Shared resources. Shared resource considerations include items, such as the amount of bandwidth you have, the amount of CPU you get if you share a server with other applications, and the amount of memory you get.
Project resources. From a project perspective, budget is also a constraint, such as time and cost.

Step 5 – Identify Processing Steps
Itemize your scenarios and divide them into separate processing steps, such as those shown in Table 2.3. If you are familiar with UML, use cases and sequence diagrams can be used as input. Similarly, Extreme Programming user stories can provide useful input to this step.

Table 2.3: Processing Steps
1. An order is submitted by client.
2. The client authentication token is validated.
3. Order input is validated.
4. Business rules validate the order.
5. The order is sent to a database server.
6. The order is processed.
7. A response is sent to the client.

An added benefit of identifying processing steps is that they help you identify those points within your application where you should consider adding custom instrumentation. Instrumentation helps you to provide actual costs and timings when you begin testing your application.
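
As an illustration of such instrumentation (a sketch, not part of the original text), the following times one hypothetical processing step by using the .NET Framework 2.0 Stopwatch and Trace classes; ValidateOrder and the message format are assumptions:

using System.Diagnostics;

public class OrderPipeline
{
    public void ProcessOrder(object order)
    {
        Stopwatch timer = Stopwatch.StartNew();

        ValidateOrder(order);   // Step 3 of the itemized order scenario.

        timer.Stop();
        // Emit the measurement so tests can compare it against the step's budget.
        Trace.WriteLine("ValidateOrder took " + timer.ElapsedMilliseconds + " ms");
    }

    private void ValidateOrder(object order)
    {
        // Hypothetical validation logic goes here.
    }
}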

Step 6 – Allocate Budget
Spread your budget (determined in Step 4, "Identify Budget") across your processing steps (determined in Step 5, "Identify Processing Steps") to meet your performance objectives. You need to consider execution time and resource utilization. Some of the budget may apply to only one processing step. Some of the budget may apply to the scenario and some of it may apply across scenarios.

Assigning Execution Time to Steps
When assigning time to processing steps, if you do not know how much time to assign, simply divide the total time equally between the steps. At this point, it is not important for the values to be precise because the budget will be reassessed after measuring actual time, but it is important to have an idea of the values. Do not insist on perfection, but aim for a reasonable degree of confidence that you are on track.

You do not want to get stuck, but, at the same time, you do not want to wait until your application is built and instrumented to get real numbers. Where you do not know execution times, you need to try spreading the time evenly, see where there might be problems or where there is tension.

If dividing the budget shows that each step has ample time, there is no need to examine these further. However, for the ones that look risky, conduct some experiments (for example, with prototypes) to verify that what you will need to do is possible, and then proceed.

Note that one or more of your steps may have a fixed time. For example, you may make a database call that you know will not complete in less than 3 seconds. Other times are variable. The fixed and variable costs must be less than or equal to the allocated budget for the scenario.

Assigning Resource Utilization Requirements
When assigning resources to processing steps, consider the following:

Know the cost of your materials. For example, what does technology X cost in comparison to technology Y?
Know the budget allocated for hardware. This defines the total resources available at your disposal.
Know the hardware systems already in place.
Know your application functionality. For example, heavy XML document processing may require more CPU, chatty database access or Web service communication may require more network bandwidth, or large file uploads may require more disk I/O.

Step 7 – Evaluate
Evaluate the feasibility and effectiveness of the budget before time and effort is spent on prototyping and testing. Review the performance objectives and consider the following questions:

Does the budget meet the objectives?
Is the budget realistic? It is during the first evaluation that you identify new experiments you should do to get more accurate budget numbers.
Does the model identify a resource hot spot?
Are there more efficient alternatives?
Can the design or features be reduced or modified to meet the objectives?
Can you improve efficiency in terms of resource consumption or time?
Would an alternative pattern, design, or deployment topology provide a better solution?
What are you trading off? Are you trading productivity, scalability, maintainability, or security for performance?
Consider the following actions:

Modify your design.
Reevaluate requirements.
Change the way you allocate budget.

Step 8 – Validate
Validate your model and estimates. Continue to create prototypes and measure the performance of the use cases by capturing metrics. This is an ongoing activity that includes prototyping and measuring. Continue to perform validation checks until your performance goals are met.

The further you are in your project's life cycle, the greater the accuracy of the validation. Early on, validation is based on available benchmarks and prototype code, or just proof-of-concept code. Later, you can measure the actual code as your application develops.

Wednesday, July 11, 2007

Improving .NET Application Performance and Scalability

Architecture and Design Solutions
If you are an architect, this guide provides the following solutions to help you design Microsoft® .NET applications to meet your performance objectives:

How to balance performance with quality-of-service (QoS) requirements
Do not consider performance in isolation. Balance your performance requirements with other QoS attributes such as security and maintainability.

How to identify and evaluate performance issues
Use performance modeling early in the design process to help evaluate your design decisions against your objectives before you commit time and resources. Identify your performance objectives, your workload, and your budgets. Budgets are your constraints. These include maximum execution time and resource utilization such as CPU, memory, disk I/O, and network I/O.


How to perform architecture and design reviews
Review the design of your application in relation to your target deployment environment, any constraints that might be imposed, and your defined performance goals. Use the categories that are defined by the performance and scalability frame promoted by this guide to help partition the analysis of your application and to analyze the approach taken for each area. The categories represent key areas that frequently affect application performance and scalability. Use the categories to organize and prioritize areas for review.


How to choose a deployment topology
When you design your application architecture, you must take into account corporate policies and procedures together with the infrastructure that you plan to deploy your application on. If the target environment is rigid, your application design must reflect the restrictions that exist in that rigid environment. Your application design must also take into account QoS attributes such as security and maintainability. Sometimes you must make design tradeoffs because of protocol restrictions, and network topologies.

Identify the requirements and constraints that exist between application architecture and infrastructure architecture early in the development process. This helps you choose appropriate architectures and helps you resolve conflicts between application and infrastructure architecture early in the process.

Use a layered design that includes presentation, business, and data access logic. A well-layered design generally makes it easier to scale your application and improves maintainability. A well-layered design also creates predictable points in your application where it makes sense (or not) to make remote calls.

To avoid remote calls and additional network latency, stay in the same process where possible and adopt a non-distributed architecture, where layers are located inside your Web application process on the Web server.

If you do need a distributed architecture, consider the implications of remote communication when you design your interfaces. For example, you might need a distributed architecture because security policy prevents you from running business logic on your Web server, or because you need to share business logic with other applications. Try to reduce round trips and the amount of traffic that you send over the network.

How to design for required performance and scalability
Use tried and tested design principles. Focus on the critical areas where the correct approach is essential and where mistakes are often made. Use the categories described by the performance frame that is defined in this guide to help organize and prioritize performance issues. Categories include data structures and algorithms, communication, concurrency, resource management, coupling and cohesion, and caching and state management.

How to pass data across the tiers
Prioritize performance, maintenance, and ease of development when you select an approach. Custom classes allow you to implement efficient serialization. Use structures if you can to avoid implementing your own serialization. You can use XML for interoperability and flexibility. However, XML is verbose and can require considerable parsing effort. Applications that use XML may pass large amounts of data over the network. Use a DataReader object to render data as quickly as possible, but do not pass DataReader objects between layers because they require an open connection. The DataSet option provides great flexibility; you can use it to cache data across requests. DataSet objects are expensive to create and serialize. Typed DataSet objects permit clients to access fields by name and to avoid the collection lookup overhead.


How to choose between Web services, remoting, and Enterprise Services
Web services are the preferred communication mechanism for crossing application boundaries, including platform, deployment, and trust boundaries. The Microsoft product team recommendations for working with ASP.NET Web services, Enterprise Services, and .NET remoting are summarized in the following list:

Build services by using ASP.NET Web services.
Enhance your ASP.NET Web services with Web Services Enhancements (WSE) if you need the WSE feature set and if you can accept the support policy.
Use object technology, such as Enterprise Services or .NET remoting, within the implementation of a service.
Use Enterprise Services inside your service boundaries when the following conditions are true:
You need the Enterprise Services feature set. This feature set includes object pooling, declarative transactions, distributed transactions, role-based security, and queued components.

You are communicating between components on a local server, and you have performance issues with ASP.NET Web services or WSE.

Use .NET remoting inside your service boundaries when the following conditions are true:
You need in-process, cross-application domain communication. Remoting has been optimized to pass calls between application domains extremely efficiently.

You need to support custom wire protocols. Understand, however, that this customization will not port cleanly to future Microsoft implementations.

When you work with ASP.NET Web services, Enterprise Services, or .NET remoting, you should consider the following caveats:

If you use ASP.NET Web services, avoid using low-level extensibility features such as the HttpContext object. If you do use the HttpContext object, abstract your access to it.
If you use .NET remoting, avoid or abstract using low-level extensibility such as .NET remoting sinks and custom channels.
If you use Enterprise Services, avoid passing object references inside Enterprise Services. Also, do not use COM+ APIs. Instead, use types from the System.EnterpriseServices namespace.

How to design remote interfaces
When you create interfaces that are designed for remote access, consider the level of chatty communication, the intended unit of work, and the need to maintain state on either side of the conversation.

As a general rule, you should avoid property-based interfaces. You should also avoid any chatty interface that requires the client to call multiple methods to perform a single logical unit of work. Provide sufficiently granular methods. To reduce network round trips, pass data through parameters as described by the data transfer object pattern instead of forcing property access. Also try to reduce the amount of data that is sent over the remote method calls to reduce serialization overhead and network latency.
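
A minimal sketch of the data transfer object idea mentioned above; the CustomerSummary type, its fields, and the ICustomerService interface are hypothetical names used only for illustration:

using System;

// Hypothetical data transfer object: one serializable bundle of fields
// returned by a single remote call instead of several property accesses.
[Serializable]
public class CustomerSummary
{
    public int CustomerId;
    public string Name;
    public string Email;
    public DateTime LastOrderDate;
}

public interface ICustomerService
{
    // One chunky call instead of a chatty, property-based interface.
    CustomerSummary GetCustomerSummary(int customerId);
}

The client makes one remote call to GetCustomerSummary and then works against the returned object locally, instead of making one remote property access per field.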

If you have existing objects that expose chatty interfaces, you can use a data facade pattern to provide a coarse-grained wrapper. The wrapper object would have a coarse-grained interface that encapsulates and coordinates the functionality of one or more objects that have not been designed for efficient remote access.

Alternatively, consider the remote transfer object pattern where you wrap and return the data you need. Instead of making a remote call to fetch individual data items, you fetch a data object by value in a single remote call. You then operate locally against the locally cached data. In some scenarios where you may need to ultimately update the data on the server, the wrapper object exposes a single method that you call to send the data back to the server.

How to choose between service orientation and object orientation
When you are designing distributed applications, services are the preferred approach. While object-orientation provides a pure view of what a system should look like and is good for producing logical models, a pure object-based approach often does not take into account real-world aspects such as physical distribution, trust boundaries, and network communication. A pure object-based approach also does not take into account nonfunctional requirements such as performance and security.

Table 1 summarizes some key differences between object orientation and service orientation.

Table 1: Object Orientation vs. Service Orientation

Object orientation: Assumes a homogeneous platform and execution environment. Service orientation: Assumes a heterogeneous platform and execution environment.
Object orientation: Share types, not schemas. Service orientation: Share schemas, not types.
Object orientation: Assumes cheap, transparent communication. Service orientation: Assumes variable-cost, explicit communication.
Object orientation: Objects are linked; object identity and lifetime are maintained by the infrastructure. Service orientation: Services are autonomous; security and failure isolation are a must.
Object orientation: Typically requires synchronized deployment of both client and server. Service orientation: Allows continuous, separate deployment of client and server.
Object orientation: Is easy to conceptualize and thus provides a natural path to follow. Service orientation: Builds on ideas from component software and distributed objects; the dominant theme is to manage/reduce sharing between services.
Object orientation: Provides no explicit guidelines for state management and ownership. Service orientation: Owns and maintains state or uses reference state.
Object orientation: Assumes a predictable sequence, timeframe, and outcome of invocations. Service orientation: Assumes message-oriented, potentially asynchronous, and long-running communications.
Object orientation: Goal is to transparently use functions and types remotely. Service orientation: Goal is to provide inter-service isolation and wire interoperability based on standards.

Common application boundaries include platform, deployment, trust, and evolution. Evolution refers to whether or not you develop and upgrade applications together. When you evaluate architecture and design decisions around your application boundaries, consider the following:

Objects and remote procedure calls (RPC) are appropriate within boundaries.
Services are appropriate across and within boundaries.

Development Solutions
If you are a developer, this guide provides the following solutions:

Improving Managed Code Performance
How to conduct performance reviews of managed code
Use analysis tools such as FxCop.exe to analyze binary assemblies, to ensure that they conform to the Microsoft .NET Framework design guidelines, and to evaluate specific features including garbage collection overhead, threading, and asynchronous processing.

Use the CLR Profiler tool to look inside the managed heap to analyze problems that include excessive garbage collection activity and memory leaks. For more information, see "How To: Use CLR Profiler" in the "How To" section of this guide.

How to design efficient types
Should your classes be thread safe? What performance issues are associated with using properties? What are the performance implications of supporting inheritance?

How to manage memory efficiently
Write code to help the garbage collector do its job efficiently. Minimize hidden allocations, and avoid promoting short-lived objects, preallocating memory, chunking memory, and forcing garbage collections. Understand how pinning memory can fragment the managed heap.

Identify and analyze the allocation profile of your application by using CLR Profiler.

How to use multithreading in .NET applications
Minimize thread creation, and use the self-tuning thread pool for multithreaded work. Avoid creating threads on a per-request basis. Also avoid using Thread.Abort or Thread.Suspend.
Make sure that you appropriately tune the thread pool for ASP.NET applications and for Web services. For more information, see "How to tune the ASP.NET thread pool" later in this document.
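
A brief sketch of queuing work to the self-tuning thread pool instead of creating a thread per request; ProcessWorkItem is a hypothetical method:

using System.Threading;

public class WorkDispatcher
{
    public void Dispatch(object workItem)
    {
        // Let the runtime's thread pool schedule the work rather than
        // creating (and later aborting) a dedicated thread per request.
        ThreadPool.QueueUserWorkItem(new WaitCallback(ProcessWorkItem), workItem);
    }

    private static void ProcessWorkItem(object state)
    {
        // Hypothetical per-item processing goes here.
    }
}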

How to use asynchronous calls
Asynchronous calls may benefit client-side applications where you need to maintain user interface responsiveness. Asynchronous calls may also be appropriate on the server, particularly for I/O bound operations. However, you should avoid asynchronous calls that do not add parallelism and that block the calling thread immediately after initiating the asynchronous call. In these situations, there is no benefit to making asynchronous calls.

How to clean up resources
Release resources as soon as you have finished with them. Use finally blocks or the C# using statement to make sure that resources are released even if an exception occurs. Make sure that you call Dispose (or Close) on any disposable object that implements the IDisposable interface. Use finalizers on classes that hold on to unmanaged resources across client calls. Use the Dispose pattern to help ensure that you implement Dispose functionality and finalizers (if they are required) correctly and efficiently.
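
For example, a minimal sketch of the using statement ensuring Dispose is called even when an exception occurs (the file path is supplied by the caller and assumed to exist):

using System.IO;

public class FileReaderExample
{
    public static string ReadAllText(string path)
    {
        // The using statement calls Dispose (which closes the file handle)
        // even if ReadToEnd throws.
        using (StreamReader reader = new StreamReader(path))
        {
            return reader.ReadToEnd();
        }
    }
}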

How to avoid unnecessary boxing
Excessive boxing can lead to garbage collection and performance issues. Avoid treating value types as reference types where possible. Consider using arrays or custom collection classes to hold value types. To identify boxing, examine your Microsoft intermediate language (MSIL) code and search for the box and unbox instructions.
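
A small illustration (not from the original guide): storing value types in an object-based collection boxes each value, whereas a typed array does not:

using System;
using System.Collections;

public class BoxingExample
{
    public static void Run()
    {
        // Each Add boxes the int into a heap object (ArrayList stores object).
        ArrayList boxed = new ArrayList();
        for (int i = 0; i < 1000; i++)
        {
            boxed.Add(i);               // box
        }
        int first = (int)boxed[0];      // unbox
        Console.WriteLine(first);

        // A typed array (or a custom typed collection) avoids boxing entirely.
        int[] unboxed = new int[1000];
        for (int i = 0; i < 1000; i++)
        {
            unboxed[i] = i;             // no box
        }
    }
}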

How to handle exceptions
Exceptions can be expensive. You should not use exceptions for regular application logic. However, use structured exception handling to build robust code, and use exceptions instead of error codes where possible. While exceptions do carry a performance penalty, they are more expressive and less error prone than error codes.

Write code that avoids unnecessary exceptions. Use finally blocks to guarantee resources are cleaned up when exceptions occur. For example, close your database connections in a finally block. You do not need a catch block with a finally block. Finally blocks that are not related to exceptions are inexpensive.

How to work with strings efficiently
Excessive string concatenation results in many unnecessary allocations that create extra work for the garbage collector. Use StringBuilder when you need to create complex string manipulations and when you need to concatenate strings multiple times. If you know the number of appends and concatenate strings in a single statement or operation, prefer the + operator. Use Response.Write in ASP.NET applications to benefit from string buffering when a concatenated string is to be displayed on a Web page.
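
A short sketch contrasting the two cases; the method names are illustrative only:

using System.Text;

public class StringExample
{
    // Unknown number of appends inside a loop: prefer StringBuilder.
    public static string BuildReport(string[] lines)
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < lines.Length; i++)
        {
            builder.Append(lines[i]);
            builder.Append("\r\n");
        }
        return builder.ToString();
    }

    // Known, small number of concatenations in one statement: the + operator is fine.
    public static string BuildGreeting(string title, string name)
    {
        return "Hello, " + title + " " + name + ".";
    }
}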

How to choose between arrays and collections
Arrays are the fastest of all collection types, so unless you need special functionality, such as dynamic extension of the collection, sorting, or searching, you should use arrays. If you need a collection type, choose the most appropriate type based on your functionality requirements to avoid performance penalties.

Use ArrayList to store custom object types and particularly when the data changes frequently and you perform frequent insert and delete operations. Avoid using ArrayList for storing strings.
Use a StringCollection to store strings.
Use a Hashtable to store a large number of records and to store data that may or may not change frequently. Use Hashtable for frequently queried data such as product catalogs where a product ID is the key.
Use a HybridDictionary to store frequently queried data when you expect the number of records to be low most of the time with occasional increases in size.
Use a ListDictionary to store small amounts of data (fewer than 10 items).
Use a NameValueCollection to store strings of key-value pairs in a presorted order. Use this type for data that changes frequently where you need to insert and delete items regularly and where you need to cache items for fast retrieval.
Use a Queue when you need to access data sequentially (first in is first out) based on priority.
Use a Stack in scenarios where you need to process items in a last-in, first-out manner.
Use a SortedList for fast object retrieval using an index or key. However, avoid using a SortedList for large data changes because the cost of inserting the large amount of data is high. For large data changes, use an ArrayList and then sort it by calling the Sort method.


How to improve serialization performance
Reduce the amount of data that is serialized by using the XmlIgnore or NonSerialized attributes. XmlIgnore applies to XML serialization that is performed by the XmlSerializer, which is used by Web services. The NonSerialized attribute applies to .NET Framework serialization used in conjunction with the BinaryFormatter and SoapFormatter. The BinaryFormatter produces the most compact data stream, although for interoperability reasons you often need to use XML or SOAP serialization.
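
A minimal sketch of the two attributes on a hypothetical type; XmlIgnore affects XmlSerializer output, and NonSerialized (which applies to fields) affects the BinaryFormatter and SoapFormatter:

using System;
using System.Xml.Serialization;

[Serializable]
public class CustomerRecord
{
    public string Name;

    // Excluded from XmlSerializer output (for example, Web service responses).
    [XmlIgnore]
    public string DisplayCache;

    // Excluded from BinaryFormatter/SoapFormatter output; valid on fields only.
    [NonSerialized]
    public string SessionToken;
}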

You can also implement ISerializable to explicitly control serialization and to determine the exact fields to be serialized from a type. However, using ISerializable to explicitly control serialization is not recommended because it prevents you from using new and enhanced formatters provided by future versions of the .NET Framework.

If versioning is a key consideration for you, consider using a SerializationInfoEnumerator to enumerate through the set of serialized fields before you try to deserialize them.

To improve DataSet serialization, you can use column name aliasing, you can avoid serializing both the original and the updated data values, and you can reduce the number of DataTable instances that you serialize.


How to improve code access security performance
Code access security ensures that your code and the code that calls your code are authorized to perform specific privileged operations and to access privileged resources like the file system, the registry, the network, databases, and other resources. The permission asserts and permission demands in the code you write and call directly affects the number and the cost of the security stack walks that you need.

How to reduce working set size
A smaller working set produces better system performance. Fewer larger assemblies rather than many smaller assemblies help reduce working set size. Using the Native Image Generator (Ngen.exe) to precompile code may also help. For more information, see "Working Set Considerations" in Chapter 5, "Improving Managed Code Performance."

How to develop SMP friendly code
To write managed code that works well with symmetric multiprocessor (SMP) servers, avoid contentious locks and do not create lots of threads. Instead, favor the ASP.NET thread pool and allow it to decide the number of threads to release.

If you run your application on a multiprocessor computer, use the server GC instead of the workstation GC. The server GC is optimized for throughput, memory consumption, and multiprocessor scalability. ASP.NET automatically loads the server GC. If you do not use ASP.NET, you have to load the server GC programmatically. The next version of the .NET Framework provides a configurable switch.


How to time managed code in nanoseconds
Use the Microsoft Win32® functions QueryPerformanceCounter and QueryPerformanceFrequency to measure performance. To create a managed wrapper for these functions, see "How To: Time Managed Code Using QueryPerformanceCounter and QueryPerformanceFrequency" in the "How To" section of this guide.

Note At the time of this writing, the .NET Framework 2.0 (code-named "Whidbey") provides a wrapper to simplify using QueryPerformanceCounter and QueryPerformanceFrequency.
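
As a rough sketch only (the referenced "How To" article contains the full version), a managed wrapper over the two Win32 functions might look like the following; the HighResTimer class name is hypothetical:

using System;
using System.Runtime.InteropServices;

public class HighResTimer
{
    [DllImport("Kernel32.dll")]
    private static extern bool QueryPerformanceCounter(out long lpPerformanceCount);

    [DllImport("Kernel32.dll")]
    private static extern bool QueryPerformanceFrequency(out long lpFrequency);

    private long start;

    public void Start()
    {
        QueryPerformanceCounter(out start);
    }

    // Returns the elapsed time in seconds since Start was called.
    public double Stop()
    {
        long stop;
        long frequency;
        QueryPerformanceCounter(out stop);
        QueryPerformanceFrequency(out frequency);
        return (stop - start) / (double)frequency;
    }
}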
How to instrument managed code
Instrument your application to measure your processing steps for your key performance scenarios. You may need to measure resource utilization, latency, and throughput. Instrumentation helps you identify where bottlenecks exist in your application. Make your instrumentation configurable; be able to control event types and to switch your instrumentation off completely. Options for instrumentation include the following:

Event Tracing for Windows (ETW). Event Tracing for Windows is the recommended approach because it is the least expensive to use in terms of execution time and resource utilization.
Trace and Debug classes. The Trace class lets you instrument your release and debug code. You can use the Debug class to output debug information and to check logic for assertions in code. These classes are in the System.Diagnostics namespace.
Custom performance counters. You can use custom counters to time key scenarios within your application. For example, you might use a custom counter to time how long it takes to place an order. For implementation details, see "How To: Use Custom Performance Counters from ASP.NET" in the "How To" section of this guide.
Windows Management Instrumentation (WMI). WMI is the core instrumentation technology built into the Microsoft Windows® operating system. Logging to a WMI sink is more expensive compared to other sinks.
Enterprise Instrumentation Framework (EIF). EIF provides a framework for instrumentation. It provides a unified API. You can configure the events that you generate, and you can configure the way the events are logged. For example, you can configure the events to be logged in the Windows event log or in Microsoft SQL Server™. The levels of granularity of tracing are also configurable. EIF is available as a free download at http://www.microsoft.com/downloads/details.aspx?FamilyId=80DF04BC-267D-4919-8BB4-1F84B7EB1368&displaylang=en.
For more information, see "How To: Use EIF" In the "How To" section of this guide.

How to decide when to use the Native Image Generator (Ngen.exe)
The Native Image Generator (Ngen.exe) allows you to run the just-in-time (JIT) compiler on your assembly's MSIL to generate native machine code that is cached to disk. Ngen.exe for the .NET Framework version 1.0 and version 1.1 was primarily designed for the common language runtime (CLR), where it has produced significant performance improvements. To identify whether or not Ngen.exe provides any benefit for your particular application, you need to measure performance with and without using Ngen.exe. Before you use Ngen.exe, consider the following:

Ngen.exe is most appropriate for any scenario that benefits from better page sharing and working set reduction. For example, it is most appropriate for client scenarios that require fast startup to be responsive, for shared libraries, and for multiple-instance applications.
Ngen.exe is not recommended for ASP.NET version 1.0 and 1.1 because the assemblies that Ngen.exe produces cannot be shared between application domains. At the time of this writing, the .NET Framework 2.0 (code-named "Whidbey") includes a version of Ngen.exe that produces images that can be shared between application domains.
If you do decide to use Ngen.exe:

Measure your performance with and without Ngen.exe.
Make sure that you regenerate your native image when you ship new versions of your assemblies for bug fixes or for updates, or when something your assembly depends on changes.


Improving Data Access Performance

How to improve data access performance
Your goal is to minimize processing on the server and at the client and to minimize the amount of data passed over the network. Use database connection pooling to share connections across requests. Keep transactions as short as possible to minimize lock durations and to improve concurrency. However, do not make transactions so short that access to the database becomes too chatty.

How to page records
When you deliver large result sets to the user, allow the user to page through the data one page at a time. When you choose a paging solution, considerations include server-side processing, data volumes and network bandwidth restrictions, and client-side processing.

The built-in paging solutions provided by the ADO.NET DataAdapter and DataGrid are only appropriate for small amounts of data. For larger result sets, you can use the SQL Server SELECT TOP statement to restrict the size of the result set. For tables that do not have a strictly increasing key column, you can use a nested SELECT TOP query. You can also use temporary tables when data is retrieved from complex queries and is too large to be transmitted and stored on the Web layer, and when the data is application-wide and applicable to all users.


How to serialize DataSets efficiently
Default DataSet serialization is not the most efficient. For information about how to improve this, see "How To: Improve Serialization Performance" in the "How To" section of this guide.

How to choose between dynamic SQL and stored procedures
Stored procedures generally provide improved performance in comparison to dynamic SQL statements. From a security standpoint, you need to consider the potential for SQL injection and authorization. Both approaches are susceptible to SQL injection if they are poorly written. Database authorization is often easier to manage when you use stored procedures because you can restrict your application's service accounts to only run specific stored procedures and to prevent them from accessing tables directly.

If you use stored procedures, follow these guidelines:

Try to avoid recompiles.
Use the Parameters collection to help prevent SQL injection.
Avoid building dynamic SQL within the stored procedure.
Avoid mixing business logic in your stored procedures.
If you use dynamic SQL, follow these guidelines:

Use the Parameters collection to help prevent SQL injection (see the sketch after this list).
Batch statements if possible.
Consider maintainability. For example, you have to decide if it is easier for you to update resource files or to update compiled statements in code.
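
A minimal sketch of the Parameters collection guidance from both lists above; the connection string, the Orders table, and the column size are hypothetical:

using System.Data;
using System.Data.SqlClient;

public class OrderQueries
{
    public static DataSet GetOrdersForCustomer(string connectionString, string customerId)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand(
                   "SELECT OrderId, OrderDate FROM Orders WHERE CustomerId = @CustomerId",
                   connection))
        {
            // The value is passed as data, not concatenated into the SQL text,
            // which helps prevent SQL injection.
            command.Parameters.Add("@CustomerId", SqlDbType.NChar, 5).Value = customerId;

            SqlDataAdapter adapter = new SqlDataAdapter(command);
            DataSet results = new DataSet();
            adapter.Fill(results);
            return results;
        }
    }
}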

How to choose between a DataSet and a DataReader
Do not use a DataSet object for scenarios where you can use a DataReader object. Use a DataReader if you need forward-only, read-only access to data and if you do not need to cache the data. Do not pass DataReader objects across physical server boundaries because they require open connections. Use the DataSet when you need the added flexibility or when you need to cache data between requests.
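
A brief sketch of the forward-only, read-only case (connection string, table, and column are hypothetical); the reader is consumed and closed before the method returns, so no open connection is passed across layers:

using System;
using System.Data.SqlClient;

public class ProductReader
{
    public static void PrintProductNames(string connectionString)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        using (SqlCommand command = new SqlCommand(
                   "SELECT ProductName FROM Products", connection))
        {
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Assumes ProductName is a non-null string column.
                    Console.WriteLine(reader.GetString(0));
                }
            }   // The reader, and its use of the connection, is released here.
        }
    }
}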


How to perform transactions in .NET
You can perform transactions using T-SQL commands, ADO.NET, or Enterprise Services. T-SQL transactions are most efficient for server-controlled transactions on a single data store. If you need to have multiple calls to a single data store participate in a transaction, use ADO.NET manual transactions. Use Enterprise Services declarative transactions for transactions that span multiple data stores.

When you choose a transaction approach, you also have to consider ease of development. Although Enterprise Services transactions are not as quick as manual transactions, they are easier to develop and lead to middle tier solutions that are flexible and easy to maintain.

Regardless of your choice of transaction type, keep transactions as short as possible, consider your isolation level, and keep read operations to a minimum inside a transaction.
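
A minimal sketch of an ADO.NET manual transaction against a single data store; the connection string, tables, and SQL statements are hypothetical, and the transaction is rolled back if either command fails:

using System.Data.SqlClient;

public class OrderWriter
{
    public static void PlaceOrder(string connectionString)
    {
        using (SqlConnection connection = new SqlConnection(connectionString))
        {
            connection.Open();
            SqlTransaction transaction = connection.BeginTransaction();
            try
            {
                SqlCommand insertOrder = new SqlCommand(
                    "INSERT INTO Orders (CustomerId) VALUES ('ALFKI')",
                    connection, transaction);
                insertOrder.ExecuteNonQuery();

                SqlCommand updateStock = new SqlCommand(
                    "UPDATE Products SET UnitsInStock = UnitsInStock - 1 WHERE ProductId = 1",
                    connection, transaction);
                updateStock.ExecuteNonQuery();

                // Keep the transaction short: commit as soon as the work is done.
                transaction.Commit();
            }
            catch
            {
                transaction.Rollback();
                throw;
            }
        }
    }
}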

How to optimize queries
Start by isolating long-running queries by using SQL Profiler. Next, identify the root cause of the long-running query by using SQL Query Analyzer, which may reveal missing or inefficient indexes. Use the Index Tuning Wizard for help selecting the correct indexes to build. For large databases, defragment your indexes at regular intervals.

Improving ASP.NET Performance
How to build efficient Web pages
Start by trimming your page size and by minimizing the number and the size of graphics, particularly in low network bandwidth scenarios. Partition your pages to benefit from improved caching efficiency. Disable view state for pages that do not need it; for example, disable view state for pages that do not post back to the server or that do not use server controls. Ensure pages are batch-compiled. Enable buffering so that ASP.NET batches work on the server and avoids chatty communication with the client. You should also know the cost of using server controls.



How to tune the ASP.NET thread pool
If your application queues requests with idle CPU, you should tune the thread pool.

For applications that serve requests quickly, consider the following settings in the Machine.config file:
Set maxconnection to 12 times the number of CPUs.

Set maxIoThreads and maxWorkerThreads to 100.

Set minFreeThreads to 88 times the number of CPUs.

Set minLocalRequestFreeThreads to 76 times the number of CPUs.

For applications that experience burst loads (unusually high loads) between lengthy periods of idle time, consider testing your application by increasing the minWorkerThreads and minIOThreads settings.
For applications that make long-running calls, consider the following settings in the Machine.config file:
Set maxconnection to 12 times the number of CPUs.

Set maxIoThreads and maxWorkerThreads to 100.

Now test the application without changing the default setting for minFreeThreads. If you see high CPU utilization and context switching, test by reducing maxWorkerThreads or increasing minFreeThreads.

For ASP.NET applications that use the ASPCOMPAT flag, you should ensure that the total thread count for the worker process does not exceed the following value:
75 + ((maxWorkerThreads + maxIoThreads) * #CPUs * 2)
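For example, on a 2-CPU server that uses the suggested values of 100 for both maxWorkerThreads and maxIoThreads, this works out to 75 + ((100 + 100) * 2 * 2) = 875 threads.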

For more information and implementation details, see "Formula for Reducing Contention" in Chapter 6, "Improving ASP.NET Performance." Also see "Tuning Options" in the "ASP.NET Tuning" section in Chapter 17, "Tuning .NET Application Performance."

How to handle long-running calls
Long-running calls from ASP.NET applications block the calling thread. Under load, this may quickly cause thread starvation, where your application uses all available threads and stops responding because there are not enough threads available. It may also quickly cause queuing and rejected requests. A common example is an ASP.NET application that calls a long-running Web service and blocks the calling thread while it waits. In this scenario, you can call the Web service asynchronously and then display a busy page or a progress page on the client. By retaining the Web service proxy in server-side state and polling from the browser by using the <meta> refresh tag, you can detect when the Web service call completes and then return the data to the client.

If design changes are not an alternative, consider tuning the thread pool as described earlier.

How to cache data
ASP.NET can cache data by using the Cache API, by using output caching, or by using partial page fragment caching. Regardless of the implementation approach, you need to consider an appropriate caching policy that identifies the data you want to cache, the place you want to cache the data in, and how frequently you want to update the cache.

To use effective fragment caching, separate the static and the dynamic areas of your page, and use user controls.

How to call STA components from ASP.NET
STA components must be called by the thread that creates them. This thread affinity can create a significant bottleneck. Rewrite the STA component by using managed code if you can. Otherwise, make sure you use the ASPCOMPAT attribute on the pages that call the component to avoid thread switching overhead. Do not put STA components in session state to avoid limiting access to a single thread. Avoid STA components entirely if you can.

How to handle session state
If you do not need session state, disable it. If you do need session state, you have three options:

The in-process state store
The out-of-process state service
SQL Server
The in-process state store offers the best performance, but it introduces process affinity, which prevents you from scaling out your solution in a Web farm. For Web farm scenarios, you need one of the out-of-process stores. However, the out-of-process stores incur the overhead of serialization and network latency. Be aware that any object that you want to store in out-of-process session state must be serializable.

Other optimizations include using primitive types where you can to minimize serialization overhead and using the ReadOnly attribute on pages that only read session state.

Improving Web Services Performance
The solutions in this section show how to improve Web service performance. The majority of the solutions are detailed in Chapter 10, "Improving Web Services Performance."

How to improve Web service performance
Start by tuning the thread pool. If you have sufficient CPU and if you have queued work, apply the tuning formula specified in Chapter 10. Make sure that you pool Web service connections. Make sure that you send only the data you need to send, and ensure that you design for chunky interfaces. Also consider using asynchronous server-side processing if your Web service performs extensive I/O operations. Consider caching for reference data and for any internal data that your Web service relies upon.


How to handle large data transfer
To perform large data transfers, start by checking that the maxRequestLength parameter in the <httpRuntime> element of your configuration file is large enough. This parameter limits the maximum SOAP message size for a Web service. Next, check your timeout settings. Set an appropriate timeout on the Web service proxy, and make sure that your ASP.NET timeout is larger than your Web service timeout.

You can handle large data transfer in a number of ways:

Use a byte array parameter. Using a byte array parameter is a simple approach, but if a failure occurs midway through the transfer, the failure forces you to start again from the beginning. When you are uploading data, this approach can also make your Web service subject to denial-of-service attacks.
Return a URL. Return a URL to a file, and then use HTTP to download the file.
Use streaming. If you need to transfer large amounts of data (such as several megabytes) from a Web method, consider streaming to avoid having to buffer large amounts of data in memory at the server and client. You can stream data from a Web service either by implementing IList or by implementing IXmlSerializable.
For more information, see "Bulk Data Transfer" in Chapter 10, "Improving Web Services Performance."

How to handle attachments
You have various options when you are handling attachments by using Web services. When you are choosing an option, consider the following:

WS-Attachments. Web Services Enhancements (WSE) version 1.0 and 2.0 support Web services attachments (WS-Attachments). WS-Attachments use Direct Internet Message Encapsulation (DIME) as an encoding format. While DIME is a supported part of WSE, Microsoft is not investing in this approach long term. DIME is limited because the attachments are outside the SOAP envelope.
Base64 encoding. For today, you should use Base64 encoding in place of WS-Attachments when you have advanced Web services requirements such as security. Base64 encoding creates a larger message payload that may be up to two times the original size. For messages that have large attachments, you can implement a WSE filter to compress the message by using tools like GZIP before you send the message over the network. If you cannot afford the message size that Base64 introduces and if you can rely on the transport for security (for example, Secure Sockets Layer [SSL] or Internet Protocol Security [IPSec]), consider the WS-Attachments implementation in WSE. Securing the message is preferred to securing the transport so that messages can be routed securely. Transport security only addresses point-to-point communication.
SOAP Message Transmission Optimization Mechanism (MTOM). MTOM, which is a derivative work of SOAP Messages with Attachments (SwA), is the likely future interop technology. MTOM is being standardized by the World Wide Web Consortium (W3C) and is easier to compose than SwA.
SwA, also known as WS-I Attachments Profile 1.0, is not supported by Microsoft.

How to improve .NET remoting performance
Remoting is for local, in-process, cross-application domain communication or for integration with legacy systems. If you use remoting, reduce round trips by using chunky interfaces. Improve serialization performance by serializing only the data you need. Use the NonSerialized attribute to prevent unnecessary fields from being serialized.

How to serialize DataSet instances efficiently over remoting
Try to improve serialization efficiency in the following ways:

Use column name aliasing to reduce the size of column names.
Avoid serializing the original and new values for DataSet fields if you do not need to.
Serialize only those DataTable instances in the DataSet that you require. DataSet instances serialize as XML.
To implement binary serialization, see Knowledge Base article 829740, "Improving DataSet Serialization and Remoting Performance," at http://support.microsoft.com/default.aspx?scid=kb;en-us;829740.

Improving Enterprise Services Performance
The solutions in this section show how to improve the performance of your Enterprise Services applications and serviced components. The majority of the solutions are detailed in Chapter 8, "Improving Enterprise Services Performance."

How to improve Enterprise Services performance
Only use Enterprise Services if you need a service. If you need a service, prefer library applications for in-process performance. Use Enterprise Services transactions if you need distributed transactions, but be aware that manual transactions that use ADO.NET or T-SQL offer superior performance for transactions against a single resource manager. Remember to balance performance with ease of development. Declarative Enterprise Services transactions offer the easiest programming model. Also consider your transaction isolation level.

Use object pooling for objects that take a long time to initialize. Make sure that you release objects back to the pool promptly. A good way to do this is to annotate your method with the AutoComplete attribute. Also, clients should call Dispose promptly on the serviced component. Avoid using packet privacy authentication if you call your serviced components over an IPSec encrypted link. Avoid impersonation, and use a single service identity to access your downstream database to benefit from connection pooling.