I am currently looking for a contract in the London area -

If you're hiring a .NET contractor in or around London, look no further!

Dynamic Lambda Expressions - creating functions at run-time


One of my favourite new features in C# 3 are lambda expressions. These nutty little functions are essential to the inner workings of Linq, and open many doors for the adventurous programmer.

It's important to know the difference between Func<> expressions and Expression<> expressions. If you have a lambda of type Func<>, it's simply a function pointer - a strongly typed delegate to a (usually) anonymous method. However, if your expression is of type Expression<>, it's actually a reference to an expression tree - an in-memory tree structure containing nodes that represent the operations that you have defined in your lambda.

This difference is crucial - if it is an Expression<>, it's not actually a function to be executed but rather an abstract description of it - this means we can manipulate and construct them at will, then compile them into executable functions whenever we want to use them.

Consider the following lambda to square an integer:

Expression<Func<int, int>> f = (int x) => x * x;
This gets translated into a statement to build the expression tree that looks like the following:
ParameterExpression x = Expression.Parameter(typeof(int), "x");

Expression<Func<int, int>> f =
    Expression.Lambda<Func<int, int>>
    (
        Expression.Multiply(x, x),
        new ParameterExpression[] { x }
    );
We can then compile it into a "real" delegate and call it using DynamicInvoke().
// Compile it
Func<int, int> func = f.Compile();

// Cast it to a Delegate type
Delegate del = func;

// Call it - prints "4"
Console.Out.WriteLine(del.DynamicInvoke(2).ToString());
So, armed with this technique, we can construct functions at runtime. In the following code sample, I've created a descendent of a DataContext class that prints out the type and primary key value of every updated row when SubmitChanges() is called. It creates a lambda expression tree for each entity type that simply returns the entity's primary key value, and compiles that into a regular delegate.
public class MyNorthwindDataContext : NorthwindDataContext
{
    private Dictionary<Type, Delegate> m_Keys = 
            new Dictionary<Type,Delegate>();

    public override void SubmitChanges(ConflictMode failureMode)
    {
        ChangeSet changes = this.GetChangeSet();
        
        PrintChanges(changes.Updates);
        
        base.SubmitChanges(failureMode);
    }
    
    private void PrintChanges(IList<object> changes)
    {
        foreach (object entity in changes)
        {
            Delegate pk = null;
            if (m_Keys.ContainsKey(entity.GetType()))
            {
                pk = m_Keys[entity.GetType()];
            }
            else
            {
                pk = CreatePrimaryKeyDelegate(entity.GetType());
                m_Keys[entity.GetType()] = pk;
            }
            
            Console.Out.WriteLine(String.Format(
                "An entity of type {0} with primary key {1} was updated.", 
                entity.GetType().Name, 
                pk.DynamicInvoke(entity)));
        }
    }
    
    // This method returns a delegate that returns the primary key
    // value from a DLINQ data entity.
    private Delegate CreatePrimaryKeyDelegate(Type type)
    {
        Type col = typeof(ColumnAttribute);
        ParameterExpression x = Expression.Parameter(typeof(object), "x");
        Expression<Func<object, object>> expression =
            Expression.Lambda<Func<object, object>>
            (
                Expression.TypeAs
                (
                    Expression.Call
                    (
                        Expression.TypeAs(x, type),
                        (
                            // Get the primary key column's get_ method by
                            // examining the attributes.
                            from p in type.GetProperties
                                (BindingFlags.Instance | BindingFlags.Public)
                            where
                                p.IsDefined(col, false)
                                && (p.GetCustomAttributes(col, false)[0]
                                     as ColumnAttribute).IsPrimaryKey == true
                            select p
                        ).First().GetGetMethod()
                    ),
                    typeof(object)
                ),
                new [] { x }
            );

        return expression.Compile();
    }
}
Using the code above:
using (NorthwindDataContext ctx = new MyNorthwindDataContext())
{
    List<Order> orders = (
        from o in ctx.Orders 
        where o.CustomerID == "ALFKI" 
        select o
    ).Take(5).ToList();
    
    orders.ForEach(order => order.Freight = 1);

    ctx.SubmitChanges();        
}
When run, the above code yields the following output:

Review: Koru Instant Energy Strips


Koru is a Maori word that symbolises growth, strength, and peace. It's also the name of a new energy caffinated strip, in the same vein as the Diablo Energy Strips and many other similar products.

They are little strips of what looks like rice paper, designed to disolve on the tongue. The packet claims they work by "trans-mucosal delivery action", working "sublingually" - i.e. "they dissolve on your tongue". The idea behind these is that as the caffeine is absorbed through the mucous membranes in the mouth immediately, the caffeine reaches your bloodstream much faster than anything that you swallow. It's the caffeine equivalent of smoking crack, basically.

Having tried some of these before (and being particularly unable to pass up the opportunity of buying something new and caffinated) I decided to pick up a pack and write a review.

I found them at the checkout of the Whistlestop shop in London Bridge station for the princely sum of £1.99. The packet, a foil bag roughly the size of a playing card, contains 4 of the strips purported to "awaken the mind and the body". No mention is made of the caffeine content of the strips, either on the packet or the web site.

Each strip contains: maltodextrim, hydroxypropylmethyl cellulose, water, triethyl citrate, sorbitol, flavourings (including caffeine), sweeteners (sucralose, aspartame), and titanium dioxide as a colouring. The strips lack the ususal cocktail of ingredients that you often find in new energy products (taurine, ginseng, vitamin-B, and others), so you'll have to be content with the caffeine.

The front of the pack is decorated with slightly disturbing Maori print of what looks a bit like a gorilla, called the "Koru Face Device".

Inside

Inside the pack are 4 strips individually wrapped in foil packets, and a little sticker of the Face Device. The strips are attractively wrapped, with bold lettering on the front, and again the Face Device on the back.

The strips themselves are an off-white, like thick rice paper, with white lettering spelling "Koru" printed diagonally across them.

Consumption

The strips are meant to be placed on the tongue, where they quickly disolve. The pleasant mint flavour, not at all unlike Mentos, quickly gives way to the familiar bitter taste of caffeine, which is enough to give an indication of the caffeine content - very high.

Within a minute I was feeling rather refreshed and alert (with my hangover receding), and without the jitters that often accompany a large dose of caffeine. I could do without my morning coffee after just one of these, and the peak effect lasted for around an hour.

Koru Instant Energy Strips are an expensive but effective way of caffinating yourself. At £1.99 for 4 they are way cheaper than a latte, but still very pricey considering Boots sells a cheaper own-brand version with 28 strips in the pack (although without a scary Maori picture anywhere to be seen). The caffeine content, while not known, does feel very high for this kind of product, and is well worth a try.

Using memcached from C#


In my last post I showed you how to run memcached, the popular distributed hash table, on Windows. Now it's time to start using it from your .NET application.

Getting the client

The Win32 client can be downloaded here. It's a source release, available for .NET 1.1 and 2.0, so you'll need to compile the solution before you can use it. Once you have compiled it, create a new console application and add a reference to Memcached.ClientLibrary.

Writing your first memcached program

As it's just really a very scalable hash table, memcached is very simple to code against. There are only a few things you need to configure.

memcached requires a list of all the servers in the pool to connect to. The hash function that memcached uses to choose which machine to store values on requires knowledge of all the machines in the pool to function correctly, so maintaining this list is a concern when using memcached in the pool. For now, however, we'll just use a single local server for simplicity, running with default port (11211) and cache size (64Mb), running with the very verbose option - this will allow us to monitor the memcached process from the command line.

C:\memcached>memcached -vv


Now to write some code. We are running a single server locally, so let's let memcached know about it.

string[] servers = { "127.0.0.1:11211" };
SockIOPool pool = SockIOPool.GetInstance();
pool.SetServers(servers);
pool.Initialize();


Our memcached client is now ready to use. Let's try and stick some values in the cache and get them back out again.

// Create the client 
MemcachedClient mc = new MemcachedClient();

// Set the value in the cache
mc.Set("andy", "rocks");

// Make sure it's there
Console.WriteLine("The key " + (mc.KeyExists("andy") ? "exists" : "doesn't exist") + "!");

// Fetch from the cache
string cachedValue = mc.Get("andy") as string;

// Display the fetched value
Console.WriteLine("Retrieved the value '" + cachedValue + "' from the cache!");


If we go back to our memcached.exe window, something like the following should appear when the above is run:

<96 new client connection
<100 new client connection
<104 new client connection
<104 set andy 0 0 6
>104 STORED
<104 get andy
>104 sending key andy
>104 END
<104 get andy
>104 sending key andy
>104 END
<104 connection closed.
<96 connection closed.
<100 connection closed.


The MemcachedClient.Get() method returns a value of type object. This you can cast to whatever type you originally put in the cache. However, if you want to store objects, then you must serialize them, either yourself, or with the built-in .NET serialization mechanisms.

If the memcached server is not found or an error occurs during sets and gets, no exception will be thrown to the calling code. Basically, if your cache servers encounter a problem, to your code it will just appear as if the cache is permanently empty. Bad for performance, but good for uptime.

Using with Linq Queries

C# 3 and Linq's syntax extensions allow us to write some funky shortcuts for things like caching systems. We can cache the results of DLinq queries in memcached the same as any other data, as long as we make sure we have a serialization policy for the classes.

For example, the following code from a data access layer method to return all the users of a site will first check memcached for the data. If it is not found, it will execute the query and store the results in memcached. The code to manage the caching is written in the form of an extension method, so you can simply add a single call to your DAL methods to enable memcached caching of query results.

Extension method:
public static IEnumerable<T> CachedQuery<T>
        (this IQueryable<T> query, string key) where T : class
{
    if (cache.KeyExists(key))
    {
        return (IEnumerable<T>)cache.Get(key);
    }
    else
    {
        IEnumerable<T> items = query.ToList();
        cache.Set(key, items);
        return items;
    }
}


Example Usage:
public static IEnumerable<User> GetAllUsers()
{
    // Retrieve from cache if it exists, otherwise run the query
    return (from u in ctx.Users select u).CachedQuery("allusers");
}


The above code ignores nasty things like serialization, but they are easily added in. There is an effort to standardize an object serialization format for use in memcached (very useful when you have clients from multiple platforms and languages accessing the same pools) to JSON, but you are free to choose the serialization mechanism. I would recommend against XML however, due to the excessive redundancy and large documents produced in the language.

Multiple Servers

Using memcached with multiple servers in the pool is almost as easy as using with a single one. Just supply the SockIOPool class with the list of all the servers you wish to pool together.
string[] servers = {
    "192.168.1.100:11211",
    "192.168.1.101:11211",
    "192.168.1.102:11211",
    "192.168.1.103:11211",
    "192.168.1.104:11211",
};
SockIOPool pool = SockIOPool.GetInstance();
pool.SetServers(servers);
pool.Initialize();


Note, that all your clients must be aware of the same set of servers, otherwise the hashing function to find data on servers will not work consistently between clients, which will adversely affect your hit ratio.

Running memcached on Windows


Memcached is a giant distributed hash table, allowing you to cache data in-memory across multiple machines in your data centre. Commonly, in the web 2.0 world, it's used to store the results of frequently-run queries to avoid hitting the database. This allows the database to scale much further, as it moves much of the load into the fast and almost transparent cache, allowing the database to concentrate on write operations. Many users of memcached report (when implemented correctly) database load dropping by a factor of 10 after implementing query caching in memcached.

Running memcached on Windows

Why? Linux is normally the OS of choice for running memcached, due to its stability, performance, and low TCO - I'd never recommend to run memcached on Windows in a production environment. However, the memcached network protocol is the same regardless of the client or server OS, meaning that organisations that develop mainly on the Microsoft platform can use a Linux cluster in production, but still conveniently run memcached on the local Windows development server.

Downloading the memcached server and client

Memcached is available from Danga (of LiveJournal fame) here. Memcached server for Windows is available here, and you can grab the latest .NET client from SourceForge here.

First Look

Unless you get the source release, the downloaded zip file contains a single file, memcached.exe – no documentation, no release notes. If you run this exe with the “–h” option you get the following list of options.

C:\memcached>memcached.exe -h
memcached 1.2.6
-p       TCP port number to listen on (default: 11211)
-U       UDP port number to listen on (default: 0, off)
-s      unix socket path to listen on (disables network support)
-a      access mask for unix socket, in octal (default 0700)
-l   interface to listen on, default is INDRR_ANY
-d start          tell memcached to start
-d restart        tell running memcached to do a graceful restart
-d stop|shutdown  tell running memcached to shutdown
-d install        install memcached service
-d uninstall      uninstall memcached service
-r            maximize core file limit
-u  assume identity of  (only when run as root)
-m       max memory to use for items in megabytes, default is 64 MB
-M            return error on memory exhausted (rather than removing items)
-c       max simultaneous connections, default is 1024
-k            lock down all paged memory.  Note that there is a
              limit on how much memory you may lock.  Trying to
              allocate more than that would fail, so be sure you
              set the limit correctly for the user you started
              the daemon with (not for -u  user;
              under sh this is done with 'ulimit -S -l NUM_KB').
-v            verbose (print errors/warnings while in event loop)
-vv           very verbose (also print client commands/reponses)
-h            print this help and exit
-i            print memcached and libevent license
-b            run a managed instanced (mnemonic: buckets)
-P      save PID in , only used with -d option
-f    chunk size growth factor, default 1.25
-n     minimum space allocated for key+value+flags, default 48


Running the server

Memcached has no configuration file – it's controlled purely by the command line parameters above. To run the server with default settings – 64Mb cache memory, and listening on port 11211.

If you use the “-vv” flag, you get a much more verbose output with details of client connections: Your memcached server is now running.

Running memcached as a Windows Service

The Win32 port also allows you to install and run memcached as a service, using the “-d” (daemonizer) switch with its various options. To install it, run the command:

memcached.exe –d install


This will install the server as a service available in the control panel. To start it, either run the command:

memcached.exe –d start


… or start it from your Services manager:

Configuration

The two main settings that may require configuration are the cache size, and port.

The default cache size is 64mb, which for any web 2.0 application is a pretty paltry cache. You can change the cache size (in megabytes) using the "-m" switch:

memcached.exe –m 1024


The above will configure memcached to use a gigabyte of memory.

Drawbacks to running as a service

If you configure memcached to run as a Windows service, you (as of release 1.2.6) lose the ability to configure the sevice as you would via the command line. The size of the cache defaults to 64mb, and the port to 11211. Parameters supplied on the command line when using “-d start” are ignored.

Linq tip - returning polymorphic types from a query


I haven't posted since February, mainly because my time has been taken up with preparations for, and actually doing, the Mongol Rally 2008 - driving from London to Mongolia in a 1988 Fiat Panda. We completed it in 6 weeks 2 days, having travelled through 16 countries, 7 time zones, 3 rivers, 2 deserts, 1 blizzard, and 0 Starbucks.

While I'm getting back into the swing of things, I thought I'd share a little Linq tip with you. Quite often I have data in a database that has a 'type' column, indicating the type of data in another column. When I deserialize this from the database, I would like to change the type of object I am creating based on the value of this column, normally with a common base class.

The select statement in Linq allows us to put any expression we like in, as long as it conforms to the type rules of the context of the query. So, using the conditional assignment operator (?), we can return a collection of objects descended from a common type, like this:

abstract class Number { }
class EvenNumber : Number { }
class OddNumber : Number { }

IEnumerable<Number> GetNumbers()
{
 return from i in Enumerable.Range(1, 10)
  select
   (i % 2 == 0) ? 
    (Number)new EvenNumber() : (Number)new OddNumber();
}


Note that you have to cast both sides of the expression to the base type you are returning. That is because the compiler isn't clever enough to work out that both sides of the expression are type equivalent when the return type is their superclass.

This method can be used in a database query as shown below.

IEnumberable<User> GetUsers()
{
 using (MyDatabaseContext ctx = new MyDatabaseContext())
 {
  return from u in ctx.Users
   select
    u.user_type == UserType.Admin ? 
      (User)new AdministratorUser() : 
      (User)new NormalUser();
 }
}


It should be noted that this technique doesn't scale to more than 3 different types however - due to the spaghetti code created when nesting lots of ? expressions together. Maybe we should all petition Anders for a case-expression similar to SQL? :)

Generating Fractals with LINQ


Jump to the code

After seeing Luke Hoban's excellent LINQ ray tracer, I wondered if the same techniques could be used in other algorithmic image generators - specifically in generating the Mandlelbrot Set fractal.

I've never written a fractal generator before, so I used the Wikipedia article on the Mandelbrot Set, and some Java sources I found on the web as a base. Using a complex number class (strangely .NET does not provide its own) the algorithm is fairly simple.

Escape Time Algorithm

The simplest method of generating the image uses something called the escape time algorithm. Simply put, it iterates over every pixel in the image and calculates a number which is used to select a colour to draw that pixel - the number comes from repeating a calculation in the complex number plane and determining if during each repetition it satisfies an 'escape' condition - hence the name. Points within the mandelbrot set do not 'escape' (or escape to infinity), and so after a maximum number of iterations is hit the point is coloured black.

The maximum number of iterations is defined by the programmer, and determines the detail level of the fractal. The more iterations allowed, the better quality the output.

It's probably easier to describe in code.
public IEnumerable<PixelData> GenerateSet(int width, int height, int maxIterations)
{
  List<PixelData> result = new List<PixelData>();

  double xmin = -2.5;
  double xmax = 1;
  double ymin = -1.25;
  double ymax = 1.25;
  double dx = (xmax - xmin) / width;
  double dy = (ymax - ymin) / height;
  double x = xmin + dx / 2;
  
  for (int i = 0; i < width; i++)
  {
    double y = ymax - dy / 2;

    for (int j = 0; j < height; j++)
    {
      y = ymax - dy / 2;
      int iterations = countIterations(x, y, maxIterations);
      
      result.Add(new PixelData()
      {  
        X = i,
        Y = j,
        Iterations = iterations < maxIterations ? (int?)iterations : null
      });
      
      y -= dy;
    }
    
    x += dx;
  }
  
  return result;
}


int countIterations(double x, double y, int maxIterations)
{
  int result = 0;
  Complex c = new Complex(x, y);
  Complex z = new Complex(x, y);
  
  while (result <= maxIterations && z.Magnitude() < 2)
  {
    z = z * z + c;
    result++;      
  }

  return result;
}
This simple algorithm can produce some pretty fractals, but it is an iterative method, and this method doesn't translate easily to LINQ.

Eliminating the Outer Loops

The outer loops of the code above are the ones that iterate over the X and Y co-ordinates of the image. Using the Enumerable.Range() iterator we can replace those loops with LINQ statements that have the same effect:

from i in Enumerable.Range(0, width)
   let dx = (xmax - xmin) / width
   let dy = (ymax - ymin) / height
   let x = (xmin + dx / 2) + (i * dx)
   select from j in Enumerable.Range(0, height)
      let y = (ymax - dy / 2) - (j * dy)
      let iterations = countIterations(x, y)
      select new
      {
        X = i,
        Y = j,
        Iterations = iterations < maxIterations ? (int?)iterations : null
      };
This has the same effect as the two nested for-loops in the first example that run over the width and height of the image. But, we're still relying on the countIterations() method, that does the actual calculation of the escape count of the pixel. Clearly, if we are to call this a LINQ fractal generator, we must find some way to do this within in the context of a LINQ query.

This is where it start to get a bit tricky.

Eliminating the inner loop

The countIterations() method essentially counts the number of times it needs to calculate the value of z = z2 + c before the magnitude, or absolute value, of z 'escapes' and becomes greater than 2. Since LINQ (or lambda expressions) do not afford us while-loops we must find another way to calculate this.

What we can do is replace the iterative calculation of z with a recursive one. Instead of escaping the algorithm, we can calculate the magnitiude of z for every value between 0 and the maximum number of iterations - then from that set select the maximum value that satisifes our escape condition, that the magnitude of z (| z |) < 2.

We can define a lambda expresssion that calculates the magnitude of z after any iteration by recusively calling itself, like this:
Func<int, Complex, Complex> f = 
    (int n, Complex z) => n > 0 ? f(n - 1, z) * f(n - 1, z) + c : z;
Phew. That's a bit nasty. It also won't compile, as the function f has not yet been fully defined. To get round this, we need to resort to using the Y-combinator. I used the same method as LukeH, and it results in the following code to generate the magnitude of z in the range 0...maxIterations:
from n in Enumerable.Range(0, maxIterations)
  let c = new Complex(x, y)
  let func = (Func<Func<int, Complex, Complex>, Func<int, Complex, Complex>>)
    (f => (n, z) => n > 0 ? f(n - 1, z) * f(n - 1, z) + c : z)
  let escape = Y(func)
  select escape(n, new Complex(x, y))
So, adding a where clause, an orderby clause and using the FirstOrDefault() method, we can get the expression to return the escape iteration count, or 0 if it escapes to infinity, as below.
let c = new Complex(x, y)
let func = (Func<Func<int, Complex, Complex>, Func<int, Complex, Complex>>)
  (f => (n, z) => n > 0 ? f(n - 1, z) * f(n - 1, z) + c : z)
let escape = Y(func)
let iterations = (from n in Enumerable.Range(0, maxIterations)
  where escape(n, new Complex(x, y)).Magnitude() < 2
  orderby n descending
  select n).FirstOrDefault()
So, now all we need to do is fit that into the outer loop query we wrote before.

Final result: the code

The final result looks something like this:
from i in Enumerable.Range(0, width)
  let dx = (xmax - xmin) / width
  let dy = (ymax - ymin) / height
  let x = (xmin + dx / 2) + (i * dx)
  select from j in Enumerable.Range(0, height)
    let y = (ymax - dy / 2) - (j * dy)
    let c = new Complex(x, y)
    let func = (Func<Func<int, Complex, Complex>, Func<int, Complex, Complex>>)
      (f => (n, z) => n > 0 ? f(n - 1, z) * f(n - 1, z) + c : z)
    let escape = Y(func)
    let iterations = (from n in Enumerable.Range(0, maxIterations)
      where escape(n, new Complex(x, y)).Magnitude() < 2
      orderby n descending
      select n).FirstOrDefault()
    select new PixelData()
    {
      X = i,
      Y = j,
      Iterations = iterations < maxIterations ? (int?)iterations : null
    };
Please note the above code is horribly inefficient. Using the iterative version I have been merrily running it over 256 iterations in almost real-time. but the LINQ version takes half a minute or so to execute over 8 - and I haven't managed to have the patience to run it over 16 yet. Since the number of iterations affects the quality of the picture, 8 iterations yields a rather sorry fractal, but given the time this method produces fractals every bit as pretty as the iterative version.

Iterator blocks and the "yield return" statement - #2


This is the second part of a two-part series. For the first post, see Iterator blocks and the "yield return" statement - #1.

IEnumerator

Before we jump into the nasty world of state machines and wotnot, let's start by re-familiarizing ourselves with the IEnumerator interface:
public interface IEnumerator
{
    bool MoveNext();
    object Current { get; }
    void Reset();
}
The important things here are MoveNext() and Current. The MoveNext() method moves to the next item in the set and returns a boolean indicating if there are any more items left in the set, and the Current property allows us to get the current item. This is all we need to support the foreach loop - it simply calls MoveNext() until it returns false, and uses the Current property for get the value for the loop variable.

IEnumerable and Iterator Blocks

Consider the following code from the previous post:
public static IEnumerable Workdays()
{
 yield return "Monday";
 yield return "Tuesday";
 yield return "Wednesday";
 yield return "Thursday";
 yield return "Friday";
}

static void Main()
{
 foreach (string day in Workdays())
 {
  Console.WriteLine(day);
 }
}
The foreach statement only operates over IEnumerable objects - so clearly the Workdays() method must be returning an IEnumerable instance (as its method signature suggests). So how how does it do that? We need to peek inside the compiled code with Reflector to find out the answer.

Inside the compiled code

Using Reflector, we are able to decompile the generated IL and get an approximate view of what the C# to create it would be. So, navigating to our type, we see that the Workdays() method is actually compiled into:
public static IEnumerable Workdays()
{
    return new <Workdays>d__0(-2);
}
What the... ? That is not the code we wrote. And what is this strangely named type, "<Workdays>d__0"?

State machines

To answer those questions, we need to take a step backwards, and look at something computer scientists call a state machine.

A state machine is a structure in which execution has a defined state, and from that state, it has a discrete list of subsequent states that it may transition into. A simple example would be to consider the road network to be a state machine - and the position of your car on the roads to be the state. If your car pulls up to a T-junction, you may turn either left or right - you may not jump somewhere two miles away onto a different road. Therefore the list of states that you can transition to is limited to a finite set, left and right. A state machine in computing works on the same principle - the program that is being executed has a definite state, and the state machine represents the current state, and the list of possible states that it can transition to next.

So how does this relate to our simple yield return example? Lets have another look in Reflector at the generated code, and specifically that strange <Workdays>d__0 type.

Looking closer

The class it generates is quite large and scary looking - I have stripped down to the relevant methods:
private sealed class <Workdays>d__0 : IEnumerable<object>, IEnumerable, IEnumerator<object>, IEnumerator, IDisposable
{
    private int <>1__state;
    private object <>2__current;

    public <Workdays>d__0(int <>1__state)
    {
        this.<>1__state = <>1__state;
    }

    private bool MoveNext()
    {
        switch (this.<>1__state)
        {
            case 0:
                this.<>1__state = -1;
                this.<>2__current = "Monday";
                this.<>1__state = 1;
                return true;

            case 1:
                this.<>1__state = -1;
                this.<>2__current = "Tuesday";
                this.<>1__state = 2;
                return true;

            case 2:
                this.<>1__state = -1;
                this.<>2__current = "Wednesday";
                this.<>1__state = 3;
                return true;

            case 3:
                this.<>1__state = -1;
                this.<>2__current = "Thursday";
                this.<>1__state = 4;
                return true;

            case 4:
                this.<>1__state = -1;
                this.<>2__current = "Friday";
                this.<>1__state = 5;
                return true;

            case 5:
                this.<>1__state = -1;
                break;
        }
        return false;
    }

    object IEnumerator.Current
    {
        get
        {
            return this.<>2__current;
        }
    }
}
The first thing we notice from this is that all the identifier names are mangled and that makes the code hard to follow. So, I've cleaned up the source further with meaningful identifier names:
private sealed class WorkdaysEnumerator : IEnumerable<object>, IEnumerable, IEnumerator<object>, IEnumerator, IDisposable
{
    private int m_State;
    private object m_Current;

    public WorkdaysEnumerator(int initialState)
    {
        this.m_State = initialState;
    }

    private bool MoveNext()
    {
        switch (this.m_State)
        {
            case 0:
                this.m_State = -1;
                this.m_Current = "Monday";
                this.m_State = 1;
                return true;

            case 1:
                this.m_State = -1;
                this.m_Current = "Tuesday";
                this.m_State = 2;
                return true;

            case 2:
                this.m_State = -1;
                this.m_Current = "Wednesday";
                this.m_State = 3;
                return true;

            case 3:
                this.m_State = -1;
                this.m_Current = "Thursday";
                this.m_State = 4;
                return true;

            case 4:
                this.m_State = -1;
                this.m_Current = "Friday";
                this.m_State = 5;
                return true;

            case 5:
                this.m_State = -1;
                break;
        }
        return false;
    }

    object IEnumerator.Current
    {
        get
        {
            return this.m_Current;
        }
    }
}
That's a bit clearer - we can see the all important MoveNext() and Current members. So how does it work?

Stepping into the State Machine

The code above is a simple implementation of a state machine. The current state is an integer, m_State in the cleaned up code, and it holds a value indicating what state the machine is currently in - in this example, it has five valid states, one per workday. The switch statement is the guts of the machine, that reads the state value and determines what do do next. Let's pick apart one branch of the switch statement:
case 3:
    this.m_State = -1;
    this.m_Current = "Thursday";
    this.m_State = 4;
    return true;
This branch is only executed when the state is equal to 3, which is set when it leaves the "Wednesday" branch. It sets the Current property to be "Thursday", and increases the state variable by one, making it ready for the next call to MoveNext() (which would set Current to "Friday").

Let's also have a closer look at the sequence of calls that are generated when a programmer writes a foreach loop over this iterator.
foreach (string day in Workdays())
{
 Console.WriteLine(day);
} 
This code actually compiles into something that looks like the below:
string day;
IEnumerable enumerator = new WorkdayEnumerator(0);

while (enumerator.MoveNext())
{
 day = enumerator.Current;
 Console.Writeline(day);
}
I've simplified the above a touch for clarity. Let's run through the sequence of calls that this loop would make and explain what happens to the state machine.

Step Value of m_State (before) Value of m_State (after) Value of Current MoveNext() returned
1 0 1 "Monday" true
2 1 2 "Tuesday" true
3 2 3 "Wednesday" true
4 3 4 "Thursday" true
5 4 5 "Friday" true
6 5 -1 "Friday" false


As you can see, when it hits the last valid state ("Friday"), it still makes one more call to MoveNext() - but since it has set the m_State value to 5, which is not a valid state, MoveNext() returns false and the loop exits.

Wrapping up and further reading

yield return allows us to write simple code to generate lists from complex structures without having to implement our own state machines. It also has some nice uses that are not immediately apparent - since an iterator can return an infinite series of values it can be put to use in many different contexts. Hopefully from this series of posts you have learned how to use yield return, and understand some of the complexity of the code that is generated behind the scenes. Go forth and iterate.

For more information, you might want to read the following.

Iterator blocks and the "yield return" statement - #1


This is the first part of a two-part series. For the second post, see Iterator blocks and the "yield return" statement - #2.

C# 2.0 introduced a wonderful and underused piece of syntax - the "yield return" statement. This is the first of two posts - in this one I will explain what yield return is and how to use it, and in the second of I'll explain how it actually works internally.

Iterators

An iterator in C# is a coding construct that allows you to loop over a set of values using the foreach loop, without having to implement the entire IEnumerator interface. If you've ever rolled your own collection classes that needed you to implement IEnumerator, you might know what pain it can sometimes be. In particular, it is difficult to implement on non-linear data structures, such as trees, without some nasty hacking.

C# 2.0 allows us to avoid implementing the full IEnumerator interface, by writing a special iterator method, using yield return to provide values back to the caller. Such a method is called an iterator block - a method that returns either IEnumerable or IEnumerable<> - which can be either an instance or a static method.

Yield return

The yield return statement is used to return a single value from the iterator block. A simple example is below (taken from the MSDN documentation)
public static IEnumerable Power(int number, int exponent)
{
    int counter = 0;
    int result = 1;
    while (counter++ < exponent)
    {
        result = result * number;
        yield return result;
    }
}
In this example, the yield return statement is executed once per loop, returning the next value in the power set for the given parameters. It doesn't need to be used in a loop however - the following is perfectly valid:
public static IEnumerable Workdays()
{
 yield return "Monday";
 yield return "Tuesday";
 yield return "Wednesday";
 yield return "Thursday";
 yield return "Friday";
}

static void Main()
{
 foreach (string day in Workdays())
 {
  Console.WriteLine(day);
 }
}


Limitations

You can't nest yield return statements inside a try.. catch block, it can't be unsafe code, and can't appear in an anonymous method or lambda expression. Also, the iterator block cannot take ref or out parameters. This is all due to the way yield return is implemented internally, which I will be covering in the next post of this series.

A YouTube API parser in one line of code


Recently I had to write a simple parser for the YouTube API, to get a list of the videos posted by a user. Not so long ago I would have used XmlDocument, and manually iterated over the result of an XPath query in some nasty looking code.

C# 3.0 and the LINQ project provides us with a new framework for dealing with XML documents - the XDocument.

XDocument allows us to use the Standard Query Operators of LINQ to deal with the objects and collections inside a document, and provides many useful features to a developer. One of the pains I felt with XmlDocument was that dealing with namespaces, using the XmlNamespaceManager, was clunky, and difficult to remember. XDocument (and the rest of the classes in System.Xml.Linq) has much better programmatic support for namespaces. In particular, the overloaded addition operator (+) allows us to easily concatenate namespace prefixes and element names together inline, which leads to much more followable code.

Anyway, on to the code.

The following code snippet shows how easy it is to parse a XML document using XDocument and LINQ - it deserializes the document into a collection of objects in just one line of code.

from feed in doc.Elements("feed")
select new Feed
{
    ID = (string)feed.Element("id"),
    Logo = (string)feed.Element("logo"),
    Title = (string)feed.Element("title"),
    Updated = (DateTime)feed.Element("updated"),
    Author = new Author
    {
        Name = (string)feed.Element("author").Element("name"),
        Uri = (string)feed.Element("author").Element("uri")
    },
    Entries = 
    (
        from entry in feed.Elements("entry")
        select new Entry
        {
            Author = new Author
            {
                Name = (string)entry.Element("author").Element("name"),
                Uri = (string)entry.Element("author").Element("uri")
            },
            Content = (string)entry.Element("content"),
            ID = (string)entry.Element("id"),
            Logo = (string)entry.Element("logo"),
            Title = (string)entry.Element("title"),
            Updated = (DateTime)entry.Element("updated"),
            Url = (string)entry.Elements("link").Attributes().FirstOrDefault
                    (a => a.Value == "alternate").Parent.Attribute("href").Value,
            Media = new Media
            {
                Description = (string)entry.Element("group").Element("description"),
                Keywords = (string)entry.Element("group").Element("keywords"),
                Thumbnails =
                (
                    from thumb in entry.Element("group").Elements("thumbnail") select thumb.Attribute("url").Value
                    
                ),
                Title = (string)entry.Element("group").Element("title"),
            }
        }
    )
}.First();