Deep dive into C# dynamic

One of the most notable additions to C# 4 is dynamic. It has been discussed a lot in the community, but the Dynamic Language Runtime (DLR) behind it is often overlooked. In this article, we will look at the internal implementation of the DLR, the C# compiler’s machinery, and the Polymorphic Inline Cache (PIC) technique, which is also used, for example, in the Google V8 engine.

Before moving on, I would like to brush up on some terms and concepts.

For brevity, “variable” here means any data object (a reference, a constant, an expression, etc.).

From the type-checking perspective, programming languages are usually divided into statically typed (in simple words, a variable’s type is specified at its declaration and cannot be changed later) and dynamically typed (a variable’s type is determined when a value is assigned to it and may change with a later assignment).

C# is an example of a statically typed language, while Python and Ruby are dynamically typed.

In the context of type safety, programming languages can be divided into weakly typed (a variable does not have a strictly defined type) and strongly, or strictly, typed (a variable has a strictly defined type that cannot be changed later).

C# 4 dynamic keyword

While dynamic adds the ability to write clean code and interact with dynamic languages like IronPython and IronRuby, C# continues to be a strongly and statically typed language.

Before diving into the internals of dynamic, here is a simple example:

// the runtime type of the variable is System.String
dynamic d = "stringValue";
Console.WriteLine(d.GetType());

// no exception is thrown at run time
d = d + "otherString";

Console.WriteLine(d);

// now the variable holds a System.Int32 value
d = 100;
Console.WriteLine(d.GetType());
Console.WriteLine(d);

// no exception is thrown at run time
d++;

Console.WriteLine(d);

d = "stringAgain";

// a RuntimeBinderException is thrown at run time
d++;

Console.WriteLine(d);

The result of the execution is shown below in the screenshot:

And what do we see? What is the typing here (strong or weak)?

Short answer: it is strong typing, and here’s why.

Unlike other C# built-in types (such as string, int, object, and so on), dynamic does not map directly to any of the base BCL types. Instead, dynamic is a special alias for System.Object with additional metadata needed for proper late binding.
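The "alias for System.Object plus metadata" claim can be checked with reflection. The sketch below is only an illustration: the class DynamicMetadataDemo and the method Echo are hypothetical names introduced for this example. It inspects a dynamic parameter and looks for System.Runtime.CompilerServices.DynamicAttribute, which the compiler attaches to it.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class DynamicMetadataDemo
{
    // Hypothetical method; it exists only so that there is a dynamic parameter to inspect.
    public static void Echo(dynamic value) { }

    public static void Main()
    {
        ParameterInfo parameter = typeof(DynamicMetadataDemo)
            .GetMethod(nameof(Echo))
            .GetParameters()[0];

        // At the CLR level the parameter is a plain System.Object.
        Console.WriteLine(parameter.ParameterType == typeof(object)); // True

        // The "dynamic" part survives only as an attribute in the metadata.
        Console.WriteLine(parameter.IsDefined(typeof(DynamicAttribute), false)); // True
    }
}
```

In other words, dynamic exists for the compiler, not for the CLR: the runtime only ever sees object.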

So:

dynamic d = 100;
d++;

will be compiled as:

object d = 100;
object arg = d;
if (Program.<dynamicMethod>o__SiteContainerd.<>p__Sitee == null)
{
    Program.<dynamicMethod>o__SiteContainerd.<>p__Sitee = CallSite<Func<CallSite, object, object>>.Create(Binder.UnaryOperation(CSharpBinderFlags.None, ExpressionType.Increment, typeof(Program), new CSharpArgumentInfo[]
    {
            CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null)
    }));
}
d = Program.<dynamicMethod>o__SiteContainerd.<>p__Sitee.Target(Program.<dynamicMethod>o__SiteContainerd.<>p__Sitee, arg);

As you can see, a variable d of type object is declared. Next, binders from the Microsoft.CSharp library come into play.
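To see the binder enforce types at run time, the failing increment from the first example can be wrapped in a try/catch. This is a minimal sketch: RuntimeBindingDemo and TryIncrement are hypothetical names, and the project is assumed to reference the Microsoft.CSharp assembly.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class RuntimeBindingDemo
{
    // Returns the binder's error message, or null if the increment succeeded.
    public static string TryIncrement(dynamic value)
    {
        try
        {
            value++; // bound at run time against the actual type of the value
            return null;
        }
        catch (RuntimeBinderException ex)
        {
            // The binder rejected the operation, much as the compiler
            // would have done for a statically typed operand.
            return ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryIncrement(100) ?? "int: incremented");
        Console.WriteLine(TryIncrement("stringAgain") ?? "string: incremented");
    }
}
```

The second call ends up in the catch block: the value is still a string, and the binder refuses to apply ++ to it instead of silently converting it.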

DLR

For each dynamic expression in the code, the compiler generates a separate dynamic call site that represents the operation.

dynamic d = 100;
d++;

For this code, a class like the one below will be generated:

private static class <dynamicMethod>o__SiteContainerd
{
    // Fields
    public static CallSite<Func<CallSite, object, object>> <>p__Sitee;
}

The <>p__Sitee field is of type System.Runtime.CompilerServices.CallSite<T>. Let’s review this class in more detail.

public sealed class CallSite<T> : CallSite where T : class
{
    public T Target;
    public T Update { get; }
    public static CallSite<T> Create(CallSiteBinder binder);
}

Although the Target field is generic, it is in fact always a delegate. So the last line in the above example is not a plain operation but an invocation of that delegate:

d = Program.<dynamicMethod>o__SiteContainerd.<>p__Sitee.Target(Program.<dynamicMethod>o__SiteContainerd.<>p__Sitee, arg);

The static Create method of the CallSite class is:

public static CallSite<T> Create(CallSiteBinder binder)
{
    if (!typeof(T).IsSubclassOf(typeof(MulticastDelegate)))
    {
        throw Error.TypeMustBeDerivedFromSystemDelegate();
    }
    return new CallSite<T>(binder);
}
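The code the compiler generates can also be written by hand, which makes the moving parts easier to see. The sketch below assumes a reference to the Microsoft.CSharp assembly. ManualCallSiteDemo and Increment are hypothetical names; the binder call mirrors the decompiled code shown earlier.

```csharp
using System;
using System.Linq.Expressions;
using System.Runtime.CompilerServices;
using Microsoft.CSharp.RuntimeBinder;

public static class ManualCallSiteDemo
{
    // One call site per operation, created once and then reused,
    // like the static field in the generated site container.
    private static readonly CallSite<Func<CallSite, object, object>> IncrementSite =
        CallSite<Func<CallSite, object, object>>.Create(
            Binder.UnaryOperation(
                CSharpBinderFlags.None,
                ExpressionType.Increment,
                typeof(ManualCallSiteDemo),
                new[] { CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null) }));

    public static object Increment(object value)
    {
        // Target is a delegate; the call site itself is always its first argument.
        return IncrementSite.Target(IncrementSite, value);
    }

    public static void Main()
    {
        Console.WriteLine(Increment(100)); // 101
        Console.WriteLine(Increment(41L)); // 42
    }
}
```

The same call site serves both the int and the long argument; what changes between the calls is the rule that the binder produces for each runtime type.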

The Target field is an L0 cache (there are also L1 and L2 caches) that is used to quickly dispatch calls based on call history.

Please note that the call site is “self-learning”, so the DLR needs to periodically update the value of Target.

To describe the logic of the DLR, I will paraphrase Eric Lippert’s answer on this matter:

First, the runtime decides what kind of object it is dealing with (for example, a COM object or a plain CLR object). Next comes the compiler. Since there is no need for a lexer and parser, the DLR uses a special version of the C# compiler that has only a metadata analyzer, a semantic analyzer for expressions, and an emitter that produces expression trees instead of IL. The metadata analyzer uses reflection to determine the type of the object, which is then passed to the semantic analyzer to determine whether a method can be called or an operation can be performed. Next, an expression tree is built, as if you had written the operation in a lambda expression. The C# compiler returns the expression tree to the DLR along with a caching policy. The DLR compiles the tree into a delegate and stores this delegate in the cache associated with the call site.

The Update property of the CallSite<T> class is used for that. When a dynamic operation is invoked through the Target field and no cached rule matches the arguments, the call is redirected to the Update delegate, where the binders are invoked. The next time the call occurs with arguments of the same types, the already prepared delegate is used instead of repeating the steps above.
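The "self-learning" behaviour can be observed by comparing the Target delegate before and after the first call. This is a hedged sketch: TargetUpdateDemo and CreateIncrementSite are hypothetical names, the Microsoft.CSharp assembly is assumed to be referenced, and the exact moment at which the runtime swaps Target is an implementation detail, so the second printed value is not guaranteed.

```csharp
using System;
using System.Linq.Expressions;
using System.Runtime.CompilerServices;
using Microsoft.CSharp.RuntimeBinder;

public static class TargetUpdateDemo
{
    public static CallSite<Func<CallSite, object, object>> CreateIncrementSite()
    {
        return CallSite<Func<CallSite, object, object>>.Create(
            Binder.UnaryOperation(
                CSharpBinderFlags.None,
                ExpressionType.Increment,
                typeof(TargetUpdateDemo),
                new[] { CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null) }));
    }

    public static void Main()
    {
        var site = CreateIncrementSite();

        Delegate beforeFirstCall = site.Target;
        object result = site.Target(site, 100); // first call: nothing is cached yet
        Delegate afterFirstCall = site.Target;

        Console.WriteLine(result);

        // If the runtime replaced Target with a rule specialized for int,
        // the two delegates are different objects and this prints False.
        Console.WriteLine(ReferenceEquals(beforeFirstCall, afterFirstCall));
    }
}
```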

Polymorphic Inline Cache

The performance of dynamic languages suffers from the extra checks and lookups they have to perform. A straightforward implementation would resolve a method reference at run time by examining the possible candidates on every call. In statically typed languages (or languages with enough type hints and type inference), the compiler can emit instructions or direct function calls that are already matched to every call site, because all the necessary information is available at compile time.

In practice, repeated operations that use different methods with a common signature can be reduced to a single call site. For example, the first time an expression like x + y is evaluated with x and y both integers, a function that adds two integers can be created on the fly and cached. After that, there is no need to look up the function or piece of code that corresponds to that signature or operation.

This delegate caching mechanism, in which the call site is self-learning and self-updating, is called a Polymorphic Inline Cache. Why?

Polymorphic. The target of a call site may hold several rules whose signatures differ only in their type parameters.

Inline. The life cycle of a dynamic call site is tied to the place in the code where it is created.

Cache. All operations are based on a multi-level cache (L0, L1, L2).
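As a rough illustration of the idea, here is a toy call site for x + y that learns one rule per pair of operand types. It is not the DLR implementation: AddSite, Resolve and SlowPathCount are hypothetical names, there is a single dictionary instead of the L0/L1/L2 levels, and the "binder" is a hard-coded switch over two type pairs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy call site for "x + y"; a sketch of the caching idea only.
public sealed class AddSite
{
    // Rules learned so far, keyed by the runtime types of both operands.
    private readonly Dictionary<(Type, Type), Func<object, object, object>> _rules =
        new Dictionary<(Type, Type), Func<object, object, object>>();

    // Counts how many times the expensive lookup had to run.
    public int SlowPathCount { get; private set; }

    public object Invoke(object x, object y)
    {
        var key = (x.GetType(), y.GetType());
        if (!_rules.TryGetValue(key, out var rule))
        {
            // Cache miss: resolve the operation once and remember the result.
            SlowPathCount++;
            rule = Resolve(key.Item1, key.Item2);
            _rules[key] = rule;
        }
        return rule(x, y);
    }

    // Stand-in for the binder: picks an implementation for the operand types.
    private static Func<object, object, object> Resolve(Type left, Type right)
    {
        if (left == typeof(int) && right == typeof(int))
            return (a, b) => (int)a + (int)b;
        if (left == typeof(string) && right == typeof(string))
            return (a, b) => (string)a + (string)b;
        throw new InvalidOperationException($"No '+' rule for {left} and {right}");
    }
}

public static class AddSiteDemo
{
    public static void Main()
    {
        var site = new AddSite();
        Console.WriteLine(site.Invoke(1, 2));     // 3
        Console.WriteLine(site.Invoke(3, 4));     // 7
        Console.WriteLine(site.Invoke("a", "b")); // ab
        Console.WriteLine(site.SlowPathCount);    // 2
    }
}
```

The second integer addition reuses the rule created by the first one, so the slow path runs only twice for three calls: once per distinct pair of operand types.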