Tech Articles: August 2006

Wednesday, August 23, 2006

Iterators in C# 2.0

In C# 1.1, you can iterate over data structures such as arrays and collections using a foreach loop.
string[] cities = {"New York","Paris","London"};
foreach(string city in cities)
{
Console.WriteLine(city);
}
In fact, you can use any custom data collection in the foreach loop, as long as that collection type implements a GetEnumerator method that returns an IEnumerator interface. Usually you do this by implementing the IEnumerable interface:
public interface IEnumerable
{
IEnumerator GetEnumerator();
}
public interface IEnumerator
{
object Current{get;}
bool MoveNext();
void Reset();
}
Often, the class that iterates over a collection by implementing IEnumerator is provided as a nested class of the collection type to be iterated. This iterator type maintains the state of the iteration. A nested class often works better as an enumerator because it has access to all the private members of its containing class. This is, of course, the Iterator design pattern, which shields iterating clients from the actual implementation details of the underlying data structure, enabling the use of the same client-side iteration logic over multiple data structures, as shown in Figure 1.

Figure 1 Iterator Design Pattern
In addition, because each iterator maintains separate iteration state, multiple clients can execute separate concurrent iterations. Data structures such as the Array and the Queue support iteration out of the box by implementing IEnumerable. The code generated for the foreach loop simply obtains an IEnumerator object by calling the class's GetEnumerator method and uses it in a while loop to iterate over the collection by repeatedly calling its MoveNext method and reading its Current property. You can use IEnumerator directly (without resorting to a foreach statement) if you need explicit iteration over the collection.
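As a rough sketch of the explicit iteration described above, the helper below walks any IEnumerable by hand. The ExplicitIteration class and its JoinAll method are hypothetical names used only for illustration; the try/finally mirrors the cleanup a foreach statement performs when the enumerator happens to be disposable.

```csharp
using System;
using System.Collections;

static class ExplicitIteration
{
    // Hypothetical helper: joins the items of any IEnumerable with commas,
    // driving the enumerator manually instead of using foreach.
    public static string JoinAll(IEnumerable items)
    {
        string result = "";
        IEnumerator enumerator = items.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                // Current is typed as object, so value types are boxed here.
                object item = enumerator.Current;
                result += (result.Length > 0 ? "," : "") + item;
            }
        }
        finally
        {
            // Dispose the enumerator if it supports disposal.
            IDisposable disposable = enumerator as IDisposable;
            if (disposable != null)
                disposable.Dispose();
        }
        return result;
    }
}
```

Calling ExplicitIteration.JoinAll(cities) with the cities array from the first snippet visits the same items, in the same order, as the foreach loop.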
But there are some problems with this approach. The first is that if the collection contains value types, obtaining the items requires boxing and unboxing them because IEnumerator.Current returns an Object. This results in potential performance degradation and increased pressure on the managed heap. Even if the collection contains reference types, you still incur the penalty of down-casting from Object. While unfamiliar to most developers, in C# 1.0 you can actually implement the iterator pattern for a foreach loop without implementing IEnumerator or IEnumerable: the compiler only needs a public GetEnumerator method whose return type exposes a MoveNext method and a Current property. If Current is strongly typed, the compiler calls the strongly typed version, avoiding the casting and boxing. The result is that even in version 1.0 it's possible not to incur the performance penalty.
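A minimal sketch of that pattern-based approach follows. IntRange, its nested Enumerator, and IntRangeDemo are hypothetical types invented for this example; neither class implements IEnumerable or IEnumerator, yet foreach accepts the collection because the method and property shapes match.

```csharp
using System;

public class IntRange
{
    private readonly int m_Start, m_Count;

    public IntRange(int start, int count)
    {
        m_Start = start;
        m_Count = count;
    }

    // No IEnumerable here: foreach only needs a public GetEnumerator method.
    public Enumerator GetEnumerator()
    {
        return new Enumerator(m_Start, m_Count);
    }

    // Nested enumerator class that keeps the iteration state.
    public class Enumerator
    {
        private readonly int m_End;
        private int m_Current;

        internal Enumerator(int start, int count)
        {
            m_Current = start - 1;   // positioned before the first item
            m_End = start + count;
        }

        // Strongly typed Current: the int is returned without boxing.
        public int Current { get { return m_Current; } }

        public bool MoveNext() { return ++m_Current < m_End; }
    }
}

public static class IntRangeDemo
{
    // Sums the range with an ordinary foreach loop.
    public static int Sum(IntRange range)
    {
        int total = 0;
        foreach (int value in range)
            total += value;
        return total;
    }
}
```

The trade-off is that a type written this way cannot be passed to code that expects IEnumerable, which is one reason the generic interfaces are usually the more convenient route.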
To better formulate this solution and to make it easier to implement, the Microsoft .NET Framework 2.0 defines the generic, type-safe IEnumerable<ItemType> and IEnumerator<ItemType> interfaces in the System.Collections.Generic namespace:
public interface IEnumerable<ItemType>
{
IEnumerator<ItemType> GetEnumerator();
}
public interface IEnumerator<ItemType> : IDisposable
{
ItemType Current{get;}
bool MoveNext();
}
Besides making use of generics, the new interfaces are slightly different from their predecessors. Unlike IEnumerator, IEnumerator<ItemType> derives from IDisposable and does not declare a Reset method of its own.
The second and more difficult problem is implementing the iterator. Although that implementation is straightforward for simple cases, it is challenging with more advanced data structures, such as binary trees, which require recursive traversal and maintaining iteration state through the recursion. Moreover, if you require various iteration options, such as head-to-tail and tail-to-head on a linked list, the code for the linked list will be bloated with various iterator implementations. This is exactly the problem that C# 2.0 iterators were designed to address. Using iterators, you can have the C# compiler generate the implementation of IEnumerator for you. The C# compiler can automatically generate a nested class to maintain the iteration state. You can use iterators on a generic collection or on a type-specific collection. All you need to do is tell the compiler what to yield in each iteration. As with manually providing an iterator, you need to expose a GetEnumerator method, typically by implementing IEnumerable or IEnumerable<ItemType>.
You tell the compiler what to yield using the new C# yield return statement.
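As an illustration, here is one way a collection might use yield return. This is a sketch rather than a reference implementation: the CityCollection shape shown is an assumption that reuses the city names from the first snippet.

```csharp
using System.Collections;
using System.Collections.Generic;

public class CityCollection : IEnumerable<string>
{
    string[] m_Cities = { "New York", "Paris", "London" };

    // The compiler turns this method into a generated enumerator class
    // that remembers where the loop stopped between MoveNext calls.
    public IEnumerator<string> GetEnumerator()
    {
        for (int i = 0; i < m_Cities.Length; i++)
            yield return m_Cities[i];
    }

    // The non-generic interface simply forwards to the generic version.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

A client can then write foreach(string city in new CityCollection()) without knowing anything about the array inside.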

You can also use C# iterators on non-generic collections.
In addition, you can use C# iterators on fully generic collections. When using a generic collection and iterators, the specific type used for IEnumerable<ItemType> in the foreach loop is known to the compiler from the type used when declaring the collection (a string in this case):
LinkedList<string> list = new LinkedList<string>();
/* Some initialization of list, then */
foreach(string item in list)
{
Trace.WriteLine(item);
}
This is similar to any other derivation from a generic interface.
If for some reason you want to stop the iteration midstream, use the yield break statement. For example, the following iterator will only yield the values 1, 2, and 3:
public IEnumerator<int> GetEnumerator()
{
for(int i = 1;i< 5;i++)
{
yield return i;
if(i > 2)
yield break;
}
}
Your collection can easily expose multiple iterators, each used to traverse the collection differently. For example, to traverse the CityCollection class in reverse order, provide a property of type IEnumerable<string> called Reverse:
public class CityCollection
{
string[] m_Cities = {"New York","Paris","London"};
public IEnumerable<string> Reverse
{
get
{
for(int i=m_Cities.Length-1; i>= 0; i--)
yield return m_Cities[i];
}
}
}
Then use the Reverse property in a foreach loop:
CityCollection collection = new CityCollection();
foreach(string city in collection.Reverse)
{
Trace.WriteLine(city);
}
There are some limitations to where and how you can use the yield return statement. A method or a property that has a yield return statement cannot also contain a return statement because that would improperly break the iteration. You cannot use yield return in an anonymous method, nor can you place a yield return statement inside a try statement with a catch block (and also not inside a catch or a finally block).
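Returning to the binary tree case mentioned earlier, the following sketch shows how an iterator can carry iteration state through a recursive traversal. Node<T>, BinaryTree<T>, InOrder, and ScanInOrder are hypothetical names chosen for this example, and the tree is deliberately minimal (no insertion or balancing logic).

```csharp
using System.Collections.Generic;

public class Node<T>
{
    public T Item;
    public Node<T> Left, Right;

    public Node(T item, Node<T> left, Node<T> right)
    {
        Item = item;
        Left = left;
        Right = right;
    }
}

public class BinaryTree<T>
{
    private Node<T> m_Root;

    public BinaryTree(Node<T> root)
    {
        m_Root = root;
    }

    // Exposes an in-order traversal as a named iterator.
    public IEnumerable<T> InOrder
    {
        get { return ScanInOrder(m_Root); }
    }

    // Recursive iterator: left subtree, then the node, then right subtree.
    private IEnumerable<T> ScanInOrder(Node<T> node)
    {
        if (node == null)
            yield break;
        foreach (T item in ScanInOrder(node.Left))
            yield return item;
        yield return node.Item;
        foreach (T item in ScanInOrder(node.Right))
            yield return item;
    }
}
```

Each level of recursion creates its own enumerator, so this form favors clarity over speed on very deep trees.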

Friday, August 18, 2006

CLS and Non-CLS Exceptions

All programming languages for the CLR must support throwing Exception-derived objects because the CLS mandates this. However, the CLR actually allows an instance of any type to be thrown, and some programming languages allow code to throw non-CLS-compliant exception objects such as a String, an Int32, or anything else. The C# compiler allows code to throw only Exception-derived objects, whereas code written in IL assembly language or C++/CLI can throw Exception-derived objects as well as objects that are not derived from Exception.

Prior to version 2.0 of the CLR, when programmers wrote catch blocks to catch Exception, they were catching CLS-compliant exceptions only. If a C# method called a method written in another language, and that method threw a non-CLS-compliant exception, a catch (Exception) block in the C# code would not catch this exception at all.

In version 2.0 of the CLR, Microsoft has introduced a new RuntimeWrappedException class. This class is derived from Exception, so it is a CLS-compliant type. The RuntimeWrappedException class contains a private field of type Object, exposed through its read-only WrappedException property. In version 2.0 of the CLR, when a non-CLS-compliant exception is thrown, the CLR automatically constructs an instance of the RuntimeWrappedException class and initializes its private field to refer to the object that was actually thrown.

private void TestMethod()
{
    try
    {
        // some code
    }
    catch (Exception e)
    {
        // Before C# 2.0, this block catches CLS-compliant exceptions only.
        // In C# 2.0, this block catches both CLS-compliant and non-CLS-compliant exceptions.
        throw;
    }
    catch
    {
        // In all versions of C#, this block catches both CLS-compliant and non-CLS-compliant exceptions.
    }
}

If the above code is recompiled for CLR 2.0, the second catch block will never execute, and the C# compiler will indicate this by issuing a warning: "CS1058: A previous catch clause already catches all exceptions. All non-exceptions thrown will be wrapped in a System.Runtime.CompilerServices.RuntimeWrappedException."

There are two ways for developers to migrate code from an earlier version of the .NET Framework to version 2.0:

1. You can merge the code from the two catch blocks into a single catch block.
2. You can tell the CLR that the code in your assembly wants to play by the old rules. That is, tell the CLR that your catch (Exception) blocks should not catch an instance of the new RuntimeWrappedException class.
[assembly:RuntimeCompatibility(WrapNonExceptionThrows = false)]
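To see the wrapping in action from C#, something has to throw a non-Exception object, which C# itself cannot do. The sketch below works around that by emitting two IL instructions with System.Reflection.Emit; that technique, and the NonClsThrowDemo, CreateStringThrower, and CatchWrapped names, are assumptions made for this illustration. It relies on the default behavior, so it assumes the assembly does not carry the WrapNonExceptionThrows = false attribute shown above.

```csharp
using System;
using System.Reflection.Emit;
using System.Runtime.CompilerServices;

static class NonClsThrowDemo
{
    // Builds a tiny method in IL that throws a string object.
    public static Action CreateStringThrower(string payload)
    {
        DynamicMethod method = new DynamicMethod("ThrowString", typeof(void), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldstr, payload);   // push the string
        il.Emit(OpCodes.Throw);            // throw it, although it is not an Exception
        return (Action)method.CreateDelegate(typeof(Action));
    }

    // Returns the object that was actually thrown, or null if nothing was thrown.
    public static object CatchWrapped(Action thrower)
    {
        try
        {
            thrower();
            return null;
        }
        catch (RuntimeWrappedException e)
        {
            // The CLR wrapped the non-CLS-compliant object; unwrap it here.
            return e.WrappedException;
        }
    }
}
```

A catch (Exception) block would catch the same wrapper, since RuntimeWrappedException derives from Exception.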

Monday, August 07, 2006

Features of ADO.NET 2.0

1. Bulk Copy Operation

Bulk copying of data from one data source to another is a new feature in ADO.NET 2.0. Bulk copy classes provide the fastest way to transfer a set of data from one source to another. A data provider can supply its own bulk copy classes. For example, in the SQL Server .NET data provider, the bulk copy operation is handled by the SqlBulkCopy class, whose WriteToServer method can read from a DataTable, an array of DataRow objects, or a DataReader.

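// BinaryFormatter lives in System.Runtime.Serialization.Formatters.Binary
BinaryFormatter format = new BinaryFormatter();
DataSet ds = (DataSet)DataGridView1.DataSource;
// Ask the DataSet to serialize itself as binary rather than XML
ds.RemotingFormat = SerializationFormat.Binary;
using (FileStream fs = new FileStream(@"c:\sar1.bin", FileMode.CreateNew))
{
    format.Serialize(fs, ds);
}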

This code snippet serializes the DataSet to a FileStream in binary format. If you compare the file sizes produced by XML and binary formatting, the XML output is more than three times larger than the binary output. When remoting a DataSet of more than 1,000 rows, binary formatting can be around 80 times faster than XML formatting.

2. DataSet and DataReader Transfer

In ADO.NET 2.0, you can load a DataReader directly into a DataSet or DataTable, and you can get a DataReader back from a DataSet or DataTable. DataTable now has most of the methods of DataSet; for example, the WriteXml and ReadXml methods are now available on DataTable as well. A new Load method on DataSet and DataTable loads the contents of a DataReader. In the other direction, DataSet and DataTable have a CreateDataReader method (named GetDataReader in early pre-release builds) that returns a DataReader over their contents. You can even transfer data between a DataTable and a DataView.

SqlDataReader dr;
SqlConnection conn = new SqlConnection(Conn_str);
conn.Open();
SqlCommand sqlc = new SqlCommand("Select * from Orders", conn);
dr = sqlc.ExecuteReader(CommandBehavior.CloseConnection);
DataTable dt = new DataTable("Orders");
dt.Load(dr); // reads the rows of the first result set into the table
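The reverse direction can be tried without a database. The sketch below builds a small in-memory table, obtains a reader over it with CreateDataReader, and loads that reader into a second table. ReaderRoundTrip, BuildOrders, and CopyViaReader are hypothetical helper names, and the two sample rows are made up for the example.

```csharp
using System;
using System.Data;

static class ReaderRoundTrip
{
    // Builds a small in-memory table; the column names and values are illustrative.
    public static DataTable BuildOrders()
    {
        DataTable orders = new DataTable("Orders");
        orders.Columns.Add("OrderID", typeof(int));
        orders.Columns.Add("Customer", typeof(string));
        orders.Rows.Add(1, "First customer");
        orders.Rows.Add(2, "Second customer");
        return orders;
    }

    // DataTable -> DataReader -> DataTable.
    public static DataTable CopyViaReader(DataTable source)
    {
        DataTable copy = new DataTable(source.TableName);
        using (DataTableReader reader = source.CreateDataReader())
        {
            // Load infers the columns from the reader because 'copy' has none yet.
            copy.Load(reader);
        }
        return copy;
    }
}
```

Because DataTableReader implements IDataReader, the same Load call works whether the reader comes from a database command or from another table.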

3. Data Paging

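Pre-release builds of ADO.NET 2.0 gave the command object a new execute method called ExecutePageReader. This method takes three parameters: CommandBehavior, startIndex, and pageSize. So if you want to get rows 101 to 200, you call this method with a start index of 101 and a page size of 100. ExecutePageReader is not part of SqlCommand in the released framework, so the snippet below applies to the beta API only.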

SqlDataReader dr;
SqlConnection conn = new SqlConnection(Conn_str);
conn.Open();
SqlCommand sqlc = new SqlCommand("Select * from Orders", conn);
// start index 101, page size 100: rows 101 to 200
dr = sqlc.ExecutePageReader(CommandBehavior.CloseConnection, 101, 100);

4. Multiple Active ResultSets

With this feature (MARS), a connection can have more than one pending request at a time; that is, multiple active DataReaders are possible on one connection. Previously, if a DataReader was open and the same connection was used for another DataReader, the following error was raised: "System.InvalidOperationException: There is already an open DataReader associated with this connection which must be closed first". With MARS this error no longer occurs. This feature is supported only on SQL Server 2005 (code-named Yukon).

5. Batch Updates

In previous versions of ADO.NET, if you changed a DataSet and saved the changes with the DataAdapter.Update method, the adapter made one round trip to the data source for each modified row. This is fine with a few records, but if 100 rows were modified, the data access layer made 100 separate calls to the database, which is not acceptable. In this release, Microsoft has changed this behavior by exposing a property called UpdateBatchSize. It specifies how many rows of the DataSet are grouped into a single hit to the database. For example, to send 50 records per hit, set UpdateBatchSize to 50.

Wednesday, August 02, 2006

Checked and Unchecked Primitive Type Operations

Programmers are well aware that many arithmetic operations on primitives could result in an overflow:
Byte b = 100;
b = (Byte)(b + 200); // b now contains 44 (300 wraps around modulo 256)

In most programming scenarios, this silent overflow is undesirable and, if not detected, causes the application to behave in strange and unusual ways. In some rare programming scenarios, this overflow is not only acceptable but also desired.

The CLR offers IL instructions that allow the compiler to choose the desired behavior. The CLR has an instruction called add that adds two values together. The add instruction performs no overflow checking. The CLR also has an instruction called add.ovf that also adds two values together. However, add.ovf throws a System.OverflowException if an overflow occurs. The CLR also has similar IL instructions for subtraction, multiplication, and data conversion.

C# allows the programmer to decide how overflows should be handled. By default, overflow checking is turned off. This means that the compiler generates IL code by using the versions of the add, subtract, multiply, and conversion instructions that don't include overflow checking. As a result, the code runs faster, but developers must be assured that overflows won't occur.

One way to get the C# compiler to control overflows is to use the /checked+ compiler switch. This switch tells the compiler to generate code that uses the overflow-checking versions of these instructions.

Rather than have overflow checking turned on or off globally, programmers are much more likely to want to decide case by case whether to have overflow checking. C# allows this flexibility by offering checked and unchecked operators and statements.

Byte b = 100;
b = checked((Byte)(b + 200)); // OverflowException is thrown
or
checked
{
    Byte b = 100;
    b += 200; // OverflowException is thrown
}
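The two behaviors can be compared side by side with a small sketch. OverflowDemo and its two methods are hypothetical names for this example; the operators are written out explicitly so the result does not depend on the /checked compiler setting.

```csharp
using System;

static class OverflowDemo
{
    // Wraps around silently, whatever the project-wide setting is.
    public static byte AddUnchecked(byte value, int amount)
    {
        return unchecked((byte)(value + amount));
    }

    // Returns null when the result does not fit in a byte.
    public static byte? AddChecked(byte value, int amount)
    {
        try
        {
            return checked((byte)(value + amount));
        }
        catch (OverflowException)
        {
            return null;
        }
    }
}
```

With the values from the snippets above, AddUnchecked(100, 200) gives 44, while AddChecked(100, 200) reports the overflow by returning null.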

Tuesday, August 01, 2006

CLR Event Model

The common language runtime event model is based on delegates. A delegate is a type-safe way to invoke a callback method. Callback methods are the means by which objects receive the notifications they subscribed to.
To help you fully understand the way events work within the CLR, I'll start with a scenario in which events are useful. Suppose you want to design an e-mail application. When an e-mail message arrives, the user might like the message to be forwarded to a fax machine or a pager.
In architecting this application, let's say that you'll first design a type, called MailManager, that receives the incoming e-mail messages. MailManager will expose an event called NewMail. Other types (such as Fax and Pager) may register interest in this event. When MailManager receives a new e-mail message, it will raise the event, causing the message to be distributed to each of the registered objects.

When the application initializes, let's instantiate just one MailManager instance; the application can then instantiate any number of Fax and Pager types. The figure above shows how the application initializes and what happens when a new e-mail message arrives.

The application initializes by constructing an instance of MailManager. MailManager offers a NewMail event. When the Fax and Pager objects are constructed, they register themselves with MailManager's NewMail event so that MailManager knows to notify the Fax and Pager objects when new e-mail messages arrive.

Step #1: Define a type that will hold any additional information that should be sent to receivers of the event notification

internal class NewMailEventArgs : EventArgs
{
    private readonly string m_from, m_to, m_subject;

    public NewMailEventArgs(string from, string to, string subject)
    {
        m_from = from;
        m_to = to;
        m_subject = subject;
    }

    public string From
    {
        get { return m_from; }
    }

    public string To
    {
        get { return m_to; }
    }

    public string Subject
    {
        get { return m_subject; }
    }
}

Step #2: Define the event member

internal class MailManager
{
    public event EventHandler<NewMailEventArgs> NewMail;
    ......
}

Step #3: Define a method responsible for raising the event to notify registered objects that the event has occurred

internal class MailManager
{
    protected virtual void OnNewMail(NewMailEventArgs e)
    {
        // Copy the delegate to a temporary so it cannot change between the null check and the call
        EventHandler<NewMailEventArgs> temp = NewMail;
        if (temp != null)
        {
            temp(this, e);
        }
    }
}

Step #4: Define a method that translates the input into the desired event

internal class MailManager
{
    public void SimulateNewMail(string from, string to, string subject)
    {
        NewMailEventArgs e = new NewMailEventArgs(from, to, subject);
        OnNewMail(e);
    }
}

Designing a Type That Listens for an Event

internal class Fax
{
    public Fax(MailManager mm)
    {
        // Register the callback with MailManager's NewMail event
        mm.NewMail += FaxMsg;
    }

    private void FaxMsg(object sender, NewMailEventArgs e)
    {
        Console.WriteLine("From={0}, To={1}, Subject={2}", e.From, e.To, e.Subject);
    }

    public void Unregister(MailManager mm)
    {
        // Remove the callback from MailManager's NewMail event
        mm.NewMail -= FaxMsg;
    }
}
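Putting the four steps and the listener together, a compact end-to-end sketch could look like the following. It makes two assumptions that are not in the steps above: the types are declared public so they can be used from anywhere, and Fax records each message in a hypothetical Sent list instead of writing to the console, so that the effect of registering and unregistering is easy to inspect.

```csharp
using System;
using System.Collections.Generic;

public class NewMailEventArgs : EventArgs
{
    private readonly string m_from, m_to, m_subject;

    public NewMailEventArgs(string from, string to, string subject)
    {
        m_from = from;
        m_to = to;
        m_subject = subject;
    }

    public string From { get { return m_from; } }
    public string To { get { return m_to; } }
    public string Subject { get { return m_subject; } }
}

public class MailManager
{
    public event EventHandler<NewMailEventArgs> NewMail;

    protected virtual void OnNewMail(NewMailEventArgs e)
    {
        // Work on a copy so the delegate cannot change after the null check.
        EventHandler<NewMailEventArgs> temp = NewMail;
        if (temp != null)
            temp(this, e);
    }

    public void SimulateNewMail(string from, string to, string subject)
    {
        OnNewMail(new NewMailEventArgs(from, to, subject));
    }
}

public class Fax
{
    // Hypothetical addition: remembers what would have been faxed.
    public readonly List<string> Sent = new List<string>();

    public Fax(MailManager mm)
    {
        mm.NewMail += FaxMsg;
    }

    private void FaxMsg(object sender, NewMailEventArgs e)
    {
        Sent.Add(string.Format("From={0}, To={1}, Subject={2}", e.From, e.To, e.Subject));
    }

    public void Unregister(MailManager mm)
    {
        mm.NewMail -= FaxMsg;
    }
}
```

After constructing a MailManager and a Fax, each SimulateNewMail call adds one entry to Sent until Unregister is called; later messages no longer reach the Fax object.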
