Using memcached from C#


In my last post I showed you how to run memcached, the popular distributed hash table, on Windows. Now it's time to start using it from your .NET application.

Getting the client

The Win32 client can be downloaded here. It's a source release, available for .NET 1.1 and 2.0, so you'll need to compile the solution before you can use it. Once you have compiled it, create a new console application and add a reference to Memcached.ClientLibrary.

Writing your first memcached program

Because it's really just a very scalable hash table, memcached is very simple to code against. There are only a few things you need to configure.

memcached requires a list of all the servers in the pool. The hash function that the client uses to choose which machine stores a value needs to know about every machine in the pool to work correctly, so maintaining this list is a concern when running memcached across a pool. For now, however, we'll keep things simple and use a single local server with the default port (11211) and cache size (64 MB), started with the very verbose option so that we can monitor the memcached process from the command line.

C:\memcached>memcached -vv


Now to write some code. We are running a single server locally, so let's let memcached know about it.

string[] servers = { "127.0.0.1:11211" };
SockIOPool pool = SockIOPool.GetInstance();
pool.SetServers(servers);
pool.Initialize();


Our memcached client is now ready to use. Let's try and stick some values in the cache and get them back out again.

// Create the client 
MemcachedClient mc = new MemcachedClient();

// Set the value in the cache
mc.Set("andy", "rocks");

// Make sure it's there
Console.WriteLine("The key " + (mc.KeyExists("andy") ? "exists" : "doesn't exist") + "!");

// Fetch from the cache
string cachedValue = mc.Get("andy") as string;

// Display the fetched value
Console.WriteLine("Retrieved the value '" + cachedValue + "' from the cache!");


If we go back to our memcached.exe window, something like the following should appear when the above is run:

<96 new client connection
<100 new client connection
<104 new client connection
<104 set andy 0 0 6
>104 STORED
<104 get andy
>104 sending key andy
>104 END
<104 get andy
>104 sending key andy
>104 END
<104 connection closed.
<96 connection closed.
<100 connection closed.


The MemcachedClient.Get() method returns a value of type object, which you can cast to whatever type you originally put in the cache. However, if you want to store your own objects, you must serialize them, either yourself or with the built-in .NET serialization mechanisms.

If the memcached server is not found or an error occurs during sets and gets, no exception will be thrown to the calling code. Basically, if your cache servers encounter a problem, to your code it will just appear as if the cache is permanently empty. Bad for performance, but good for uptime.
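To make that behaviour concrete, here is a minimal sketch of how calling code can treat a null result as a cache miss and fall back to the real data source. The CacheAside class and its GetOrLoad method are hypothetical helpers of my own, not part of Memcached.ClientLibrary. The two delegates simply stand in for MemcachedClient.Get and MemcachedClient.Set, so the example runs without a server.

```csharp
using System;

// Hypothetical helper for illustration; not part of Memcached.ClientLibrary.
public static class CacheAside
{
    // 'get' and 'set' stand in for MemcachedClient.Get and MemcachedClient.Set.
    public static T GetOrLoad<T>(
        Func<string, object> get,
        Action<string, object> set,
        string key,
        Func<T> load) where T : class
    {
        // A null result is treated as a miss: the key is absent,
        // or the server could not be reached.
        T cached = get(key) as T;
        if (cached != null)
        {
            return cached;
        }

        // Fall back to the real data source, then try to repopulate the cache.
        T fresh = load();
        set(key, fresh);
        return fresh;
    }
}

public static class Program
{
    public static void Main()
    {
        // Simulate a dead cache: every Get returns null and every Set is dropped.
        int loads = 0;
        string value = CacheAside.GetOrLoad<string>(
            key => null,
            (key, item) => { },
            "andy",
            () => { loads++; return "rocks"; });

        Console.WriteLine(value + " (loaded " + loads + " time(s))");
    }
}
```

With a real client you would pass mc.Get and mc.Set in place of the two lambdas. The point of the sketch is only that a broken cache costs you a trip to the data source every time, rather than an exception.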

Using with Linq Queries

C# 3 and Linq's syntax extensions allow us to write some funky shortcuts for things like caching systems. We can cache the results of DLinq queries in memcached the same as any other data, as long as we make sure we have a serialization policy for the classes.

For example, the following code from a data access layer method to return all the users of a site will first check memcached for the data. If it is not found, it will execute the query and store the results in memcached. The code to manage the caching is written in the form of an extension method, so you can simply add a single call to your DAL methods to enable memcached caching of query results.

Extension method:
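// Assumes the enclosing static class has a static MemcachedClient field named 'cache'
public static IEnumerable<T> CachedQuery<T>
        (this IQueryable<T> query, string key) where T : class
{
    // A single Get saves a round trip, and avoids the key expiring
    // between a KeyExists check and the Get that follows it
    IEnumerable<T> cached = cache.Get(key) as IEnumerable<T>;
    if (cached != null)
    {
        return cached;
    }

    // Cache miss: run the query, materialize the results and store them
    List<T> items = query.ToList();
    cache.Set(key, items);
    return items;
}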


Example Usage:
public static IEnumerable<User> GetAllUsers()
{
    // Retrieve from cache if it exists, otherwise run the query
    return (from u in ctx.Users select u).CachedQuery("allusers");
}


The above code ignores nasty things like serialization, but they are easily added in. There is an effort to standardize on JSON as the object serialization format for memcached (very useful when you have clients on multiple platforms and languages accessing the same pools), but you are free to choose the serialization mechanism. I would recommend against XML, however, because of its redundancy and the large documents it produces.
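As a rough sketch of what JSON serialization could look like, the code below encodes a list of objects to a JSON string before caching it and decodes it again on the way out. It makes several assumptions: a current .NET runtime where System.Text.Json is available (it did not exist when the client library above was released), a hypothetical UserDto class and JsonCacheCodec helper, and a plain Dictionary standing in for the memcached client.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical DTO used only for this illustration.
public class UserDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper; not part of any memcached client library.
public static class JsonCacheCodec
{
    // Turn a value into a JSON string that a client in any language can read.
    public static string Encode<T>(T value)
    {
        return JsonSerializer.Serialize(value);
    }

    // Rebuild the value from the JSON string fetched from the cache.
    public static T Decode<T>(string json)
    {
        return JsonSerializer.Deserialize<T>(json);
    }
}

public static class Program
{
    public static void Main()
    {
        // A Dictionary stands in for the memcached client here.
        var fakeCache = new Dictionary<string, string>();

        var users = new List<UserDto> { new UserDto { Id = 1, Name = "andy" } };
        fakeCache["allusers"] = JsonCacheCodec.Encode(users);

        List<UserDto> restored =
            JsonCacheCodec.Decode<List<UserDto>>(fakeCache["allusers"]);
        Console.WriteLine(restored[0].Name); // prints "andy"
    }
}
```

Storing a plain string also means the cached value no longer depends on .NET type names, which is what makes it readable from other platforms.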

Multiple Servers

Using memcached with multiple servers in the pool is almost as easy as using it with a single one. Just supply the SockIOPool class with the list of all the servers you wish to pool together.
string[] servers = {
    "192.168.1.100:11211",
    "192.168.1.101:11211",
    "192.168.1.102:11211",
    "192.168.1.103:11211",
    "192.168.1.104:11211",
};
SockIOPool pool = SockIOPool.GetInstance();
pool.SetServers(servers);
pool.Initialize();


Note that all your clients must be aware of the same set of servers. Otherwise the hashing function used to find data on the servers will not work consistently between clients, which will adversely affect your hit ratio.
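To see why mismatched server lists hurt, here is a small sketch of the general idea: hash the key, then take the result modulo the number of servers. This is an illustration only. The ServerPicker class is hypothetical, and the hashing that Memcached.ClientLibrary actually uses may well differ. The hash is an FNV-1a-style function, chosen simply because it gives the same answer on every machine.

```csharp
using System;

// Hypothetical illustration of key-to-server mapping; the real client
// library may use a different hashing scheme.
public static class ServerPicker
{
    // FNV-1a-style hash over the characters of the key.
    public static uint Hash(string key)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (char c in key)
            {
                hash ^= c;
                hash *= 16777619;
            }
            return hash;
        }
    }

    // The chosen server depends on the key AND on the length of the list.
    public static string Pick(string key, string[] servers)
    {
        return servers[Hash(key) % (uint)servers.Length];
    }
}

public static class Program
{
    public static void Main()
    {
        // Client B is missing one of the servers that client A knows about.
        string[] clientA = { "192.168.1.100:11211", "192.168.1.101:11211", "192.168.1.102:11211" };
        string[] clientB = { "192.168.1.100:11211", "192.168.1.101:11211" };

        foreach (string key in new[] { "andy", "allusers", "session:42" })
        {
            Console.WriteLine(key + ": A -> " + ServerPicker.Pick(key, clientA)
                + ", B -> " + ServerPicker.Pick(key, clientB));
        }
    }
}
```

Whenever the two clients pick different servers for the same key, a value stored by one is a miss for the other, and that is where the hit ratio goes.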

Running memcached on Windows


Memcached is a giant distributed hash table that lets you cache data in memory across multiple machines in your data centre. In the web 2.0 world, it's commonly used to store the results of frequently run queries to avoid hitting the database. This lets the database scale much further: much of the read load moves into the fast and almost transparent cache, leaving the database to concentrate on write operations. Many memcached users report that, when query caching is implemented correctly, database load drops by a factor of 10.

Running memcached on Windows

Why? Linux is normally the OS of choice for running memcached, due to its stability, performance, and low TCO. I'd never recommend running memcached on Windows in a production environment. However, the memcached network protocol is the same regardless of the client or server OS, which means that organisations that develop mainly on the Microsoft platform can use a Linux cluster in production and still conveniently run memcached on a local Windows development server.

Downloading the memcached server and client

Memcached is available from Danga (of LiveJournal fame) here. Memcached server for Windows is available here, and you can grab the latest .NET client from SourceForge here.

First Look

Unless you get the source release, the downloaded zip file contains a single file, memcached.exe – no documentation, no release notes. If you run this exe with the “-h” option you get the following list of options.

C:\memcached>memcached.exe -h
memcached 1.2.6
-p <num>      TCP port number to listen on (default: 11211)
-U <num>      UDP port number to listen on (default: 0, off)
-s <file>     unix socket path to listen on (disables network support)
-a <mask>     access mask for unix socket, in octal (default 0700)
-l <ip_addr>  interface to listen on, default is INDRR_ANY
-d start          tell memcached to start
-d restart        tell running memcached to do a graceful restart
-d stop|shutdown  tell running memcached to shutdown
-d install        install memcached service
-d uninstall      uninstall memcached service
-r            maximize core file limit
-u <username> assume identity of <username> (only when run as root)
-m <num>      max memory to use for items in megabytes, default is 64 MB
-M            return error on memory exhausted (rather than removing items)
-c <num>      max simultaneous connections, default is 1024
-k            lock down all paged memory.  Note that there is a
              limit on how much memory you may lock.  Trying to
              allocate more than that would fail, so be sure you
              set the limit correctly for the user you started
              the daemon with (not for -u <username> user;
              under sh this is done with 'ulimit -S -l NUM_KB').
-v            verbose (print errors/warnings while in event loop)
-vv           very verbose (also print client commands/reponses)
-h            print this help and exit
-i            print memcached and libevent license
-b            run a managed instanced (mnemonic: buckets)
-P <file>     save PID in <file>, only used with -d option
-f <factor>   chunk size growth factor, default 1.25
-n <bytes>    minimum space allocated for key+value+flags, default 48


Running the server

Memcached has no configuration file: it's controlled purely by the command line parameters above. To run the server with the default settings (64 MB of cache memory, listening on port 11211), run memcached.exe with no parameters.

If you use the “-vv” flag, you get much more verbose output with details of client connections. Your memcached server is now running.

Running memcached as a Windows Service

The Win32 port also allows you to install and run memcached as a service, using the “-d” (daemonizer) switch with its various options. To install it, run the command:

memcached.exe -d install


This will install the server as a service available in the control panel. To start it, either run the command:

memcached.exe -d start


… or start it from your Services manager.

Configuration

The two main settings that may require configuration are the cache size and the port.

The default cache size is 64 MB, which is pretty paltry for any web 2.0 application. You can change the cache size (in megabytes) using the "-m" switch:

memcached.exe -m 1024


The above will configure memcached to use a gigabyte of memory.

Drawbacks to running as a service

If you configure memcached to run as a Windows service, you lose (as of release 1.2.6) the ability to configure the service as you would via the command line. The size of the cache defaults to 64 MB, and the port to 11211. Parameters supplied on the command line when using “-d start” are ignored.

Linq tip - returning polymorphic types from a query


I haven't posted since February, mainly because my time has been taken up with preparations for, and actually doing, the Mongol Rally 2008 - driving from London to Mongolia in a 1988 Fiat Panda. We completed it in 6 weeks 2 days, having travelled through 16 countries, 7 time zones, 3 rivers, 2 deserts, 1 blizzard, and 0 Starbucks.

While I'm getting back into the swing of things, I thought I'd share a little Linq tip with you. Quite often I have data in a database that has a 'type' column, indicating the type of data in another column. When I deserialize this from the database, I would like to change the type of object I am creating based on the value of this column, normally with a common base class.

The select clause in Linq allows us to put in any expression we like, as long as it conforms to the type rules of the context of the query. So, using the conditional operator (?:), we can return a collection of objects descended from a common type, like this:

abstract class Number { }
class EvenNumber : Number { }
class OddNumber : Number { }

IEnumerable<Number> GetNumbers()
{
 return from i in Enumerable.Range(1, 10)
  select
   (i % 2 == 0) ? 
    (Number)new EvenNumber() : (Number)new OddNumber();
}


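Note that the expression needs a cast to the base type you are returning. Strictly, casting one side is enough, but casting both keeps the expression symmetrical. The cast is needed because the compiler works out the type of a conditional expression from its two operands alone, and neither EvenNumber nor OddNumber converts to the other, so without a cast it cannot settle on their common superclass.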

This method can be used in a database query as shown below.

IEnumerable<User> GetUsers()
{
 using (MyDatabaseContext ctx = new MyDatabaseContext())
 {
  // ToList() runs the query now, before the context is disposed
  return (from u in ctx.Users
   select
    u.user_type == UserType.Admin ? 
      (User)new AdministratorUser() : 
      (User)new NormalUser()).ToList();
 }
}


Note, however, that this technique doesn't scale beyond about three different types, because nesting lots of ?: expressions quickly turns into spaghetti code. Maybe we should all petition Anders for a case expression similar to SQL's? :)
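In the meantime, one possible workaround is to move the branching into a small factory method and call that from the select. The sketch below runs against an in-memory sequence. The UserFactory class and the extra ModeratorUser and GuestUser types are hypothetical, added only to show more than two branches. Whether a database Linq provider will accept a method call like this inside a query is something you would need to check for your own setup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only.
public enum UserType { Normal, Admin, Moderator, Guest }

public abstract class User { }
public class NormalUser : User { }
public class AdministratorUser : User { }
public class ModeratorUser : User { }
public class GuestUser : User { }

public static class UserFactory
{
    // One switch replaces a chain of nested conditional expressions.
    public static User Create(UserType type)
    {
        switch (type)
        {
            case UserType.Admin: return new AdministratorUser();
            case UserType.Moderator: return new ModeratorUser();
            case UserType.Guest: return new GuestUser();
            default: return new NormalUser();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var types = new[] { UserType.Admin, UserType.Guest, UserType.Normal };

        // The return type of Create is already User, so no casts are needed.
        IEnumerable<User> users = from t in types select UserFactory.Create(t);

        foreach (User u in users)
        {
            Console.WriteLine(u.GetType().Name);
        }
        // prints AdministratorUser, GuestUser, NormalUser (one per line)
    }
}
```

Because the method's declared return type is the base class, the casts disappear as well, and adding a new type means adding one case rather than another level of nesting.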