Ok, ok. It took me 2 days. But I was doing other things with my life 🤷. But anyway, onto the real story.
So a few days ago, I was writing some internal documentation for my job at Ziosk (did I mention we are hiring?), specifically a primer for newcomers to Microsoft Orleans. To start with, Orleans is Microsoft's implementation of the Distributed Virtual Actor model, which makes people's eyes glaze over, so I try to build intuition in other ways. I have fallen into a pattern of introducing Orleans as a "nanoservice" platform. You see, microservices are small independent APIs that handle some specific functionality. But the more microservices we add, the higher the complexity and the greater the woes. Orleans skips a lot of that pain and lets us break out functionality as if we were programming normal C# classes and interfaces. This enables us to define smaller and more fine-grained services if our designs call for it. Hence, "nanoservices".
While I was explaining this in my document, I mused a bit that since nano is smaller than micro, you could build microservices out of smaller nanoservices. Then it clicked! Maybe I could actually do that! And what could be a more hip way to deploy a microservice than to use a serverless platform and pay 6 times the rate 😃! So I cracked open VSCode and started writing.
What exactly are we building?
Well, a serverless platform! So what is that? Let's ask Wikipedia:
Serverless computing is a cloud computing execution model in which the cloud provider allocates machine resources on demand, taking care of the servers on behalf of their customers.
It further states:
Serverless computing does not hold resources in volatile memory; computing is rather done in short bursts with the results persisted to storage. When an app is not in use, there are no computing resources allocated to the app. Pricing is based on the actual amount of resources consumed by an application.
Ok, so it's a platform that abstracts the developer completely away from the machine. It also does computation in short bursts (request/response processing) and costs the client nothing while it is not executing.
From this we can distill a few basic requirements:
- You need to be able to deploy the serverless code.
- You need to be able to execute that code to do some processing.
- You need to be able to pay for consumption and nothing else.
How are we going to build it?
We are going to use one of my favorite tech stacks: C# + Orleans + gRPC. Actually F# is my favorite language, but that would be harder for you folks to follow along with, so back to C#, my 3rd favorite language 😉. The serverless code will be in JavaScript because it is the most popular language on earth and it is well supported. As for Orleans, I think it will fit the workload very well and make prototyping easier. gRPC is a no-brainer for me: it gives me compile-time safety for my client and server code, so it's my default choice.
We are going to build this app on the fly and not do a lot of up-front design. This was thought of and built on a whim, so I didn't want to try to make something perfect. This was a personal hackathon. If you stop by my repo, you can see it evolve commit by commit. Let's jump in!
Building out the Control Plane
Our system will consist of a control plane and a data plane. The control plane is how routes are registered and other administrative tasks are handled. The data plane is where requests flow in the front door and are ultimately served by the serverless code. The control plane will need a proper interface that we can talk to programmatically. So I reach for my favorite tool for APIs, which is gRPC. gRPC lets me define a service in an interface definition language (Protocol Buffers) and then generate both server and client code.
The control plane basically looks like this:
service ControlPlane {
  rpc QueryUsage (QueryConsumptionRequest)
      returns (QueryConsumptionResponse) {}
  rpc RegisterHandler (RegisterHandlerRequest)
      returns (RegisterHandlerResponse) {}
}
So there are two methods, QueryUsage and RegisterHandler. This is the bare minimum for us to compute billing and upload code. These calls will be handled by an ASP.NET gRPC controller that then forwards the details onward to Orleans. We are hosting the gRPC controllers in the same process as the Orleans silo because it is faster and simpler. The controllers are not very complex; here is the code for `RegisterHandler` with the error handling removed for brevity:
// Handler ids are case-insensitive, so normalize before looking anything up
var normalizedHandlerName =
    helpers.GetNormalizedHandlerName(request.HandlerId);

// Grains are addressed by key; this gets a reference to the grain for this handler
var grain = clusterClient.GetGrain<IJavascriptGrain>(normalizedHandlerName);

// Hand the script body to the grain to validate and persist
await grain.Import(request.HandlerJsBody);

return new RegisterHandlerResponse { Success = true };
That's about it. We convert the handler id to lowercase, look up the Orleans grain by that id, call Import on the grain and return a result. This is missing auth, but hang with me.
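For context, that body lives inside a gRPC service class that ASP.NET Core hosts for us. Here is a rough sketch of the surrounding plumbing, assuming the ControlPlane.ControlPlaneBase class generated from the proto above and constructor injection of Orleans' IClusterClient; the service class name and the helpers type are my own stand-ins, not necessarily what the repo uses:

using Grpc.Core;
using Orleans;

public class ControlPlaneService : ControlPlane.ControlPlaneBase
{
    private readonly IClusterClient clusterClient;   // Orleans client injected by DI
    private readonly HandlerNameHelpers helpers;     // hypothetical helper type

    public ControlPlaneService(IClusterClient clusterClient, HandlerNameHelpers helpers)
    {
        this.clusterClient = clusterClient;
        this.helpers = helpers;
    }

    public override async Task<RegisterHandlerResponse> RegisterHandler(
        RegisterHandlerRequest request, ServerCallContext context)
    {
        var normalizedHandlerName = helpers.GetNormalizedHandlerName(request.HandlerId);
        var grain = clusterClient.GetGrain<IJavascriptGrain>(normalizedHandlerName);
        await grain.Import(request.HandlerJsBody);
        return new RegisterHandlerResponse { Success = true };
    }
}

QueryUsage gets a similar override; we will come back to that when we talk about billing.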
You may be curious about the Import method, so here it is:
private Script? Script { get; set; } // parsed script, cached in memory only

public async Task Import(string code)
{
    // Throws if the JavaScript is not syntactically valid
    var script = ParseScriptOrThrow(code);
    Script = script;

    // Persist the raw source so the grain can reload it later
    state.State ??= new JavascriptGrainState();
    state.State.Code = code;
    await state.WriteStateAsync(); // save to storage
}
The method parses the code to ensure the syntax is valid, because why let someone push broken code 😐. After that it stores the parsed script in a plain in-memory property on the class (which is just fine) and then updates the persistent state with the raw code string and saves it. (We'll see why later...🙂).
Actually, let's see why now 😃.
The Why and the How of Orleans
You see, a Grain in Orleans is like an instance of an object. In fact, it literally is an object instance in .NET. It has methods and fields and a constructor. We can store things in the fields of this object, like the Script property we saw above, and they will disappear when the object is collected. However, we can mark some of those fields and properties as persistent, and then Orleans takes over control of the data.
A Grain in Orleans is never created and it is never destroyed. That is why we call it the Virtual Actor pattern. You call up a grain by providing a unique key, much like a primary key in a database. If this grain has state associated with it, Orleans transparently loads that data into memory when it routes a request to that Grain. After that, any method calls to the Grain can use that state as normal. If the Grain instance in memory hasn't had a method call sent to it for a while, Orleans will deactivate the grain and save its state back to persistent storage (such as S3 or a database).
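To make that concrete, here is a minimal sketch of how the grain behind all of this could be declared, assuming Orleans' IPersistentState facility and the names we have already seen; the state name and storage provider name in the attribute are placeholders of mine:

using System.Threading.Tasks;
using Orleans;
using Orleans.Runtime;

// Grains are addressed by a key; ours uses the normalized handler name as a string key
public interface IJavascriptGrain : IGrainWithStringKey
{
    Task Import(string code);
}

// The slice of the grain's state that Orleans persists on our behalf
public class JavascriptGrainState
{
    public string? Code { get; set; }
}

public class JavascriptGrain : Grain, IJavascriptGrain
{
    // Loaded from storage when the grain activates, written back by WriteStateAsync
    private readonly IPersistentState<JavascriptGrainState> state;

    public JavascriptGrain(
        [PersistentState("javascript", "Default")] IPersistentState<JavascriptGrainState> state)
    {
        this.state = state;
    }

    public async Task Import(string code)
    {
        // (syntax validation omitted here; see the Import method shown earlier)
        state.State ??= new JavascriptGrainState();
        state.State.Code = code;
        await state.WriteStateAsync();
    }
}

That [PersistentState] attribute is the whole persistence story from the grain's point of view; which storage actually backs it is configured once at the silo level.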
Does this sound a little familiar?
Yep, most serverless platforms work the same way. If your Azure Function or AWS Lambda isn't used for a while, it goes to sleep, and then the next time it is called it has to warm up. This is called the cold start problem and it can be a big annoyance for some use cases (but my prototype doesn't suffer much from this 😁). So now you see why Orleans is a good fit. We store the raw script in the database and have Orleans load it on demand when we need it. Once loaded, we parse the script once, cache the result in the Grain's Script property, and then reuse the cached Script to execute request after request.
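In code, that load-then-cache dance can be as small as the sketch below. The execution path is next post's topic, so EnsureScriptParsed is purely my own illustration of the pattern, built from the pieces we have already seen:

// Re-parse the persisted source the first time this activation needs it,
// then reuse the cached Script for every request after that.
private Script EnsureScriptParsed()
{
    if (Script is null)
    {
        // state.State.Code was loaded from storage when Orleans activated the grain
        Script = ParseScriptOrThrow(state.State!.Code!);
    }
    return Script;
}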
The alternative?
We, of course, could have done all this ourselves with plain ASP.NET. But then we would have needed to:
- Manually integrate with some persistent storage technology to store the scripts and the consumption data.
- Design a caching mechanism to keep the scripts in memory so we aren't clobbering the persistent store.
- Build out a cache eviction policy so that we aren't holding every script ever in memory.
- Figure out how to store consumption data and when to write it back to the database.
- Finally, once all of that is done, worry about the hard problem of scaling this to run on multiple servers in a coherent way without a lot of coordination traffic between the server instances, which is not a trivial problem.
But we don't have to, because Orleans exists and it does all of this for us and more. ✨
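As an aside, co-hosting the gRPC endpoint and the Orleans silo in one process, like I mentioned earlier, is only a few lines of wiring. Here is a minimal sketch using localhost clustering and in-memory grain storage; the real project's configuration is almost certainly different, but the shape is the same:

using Orleans.Hosting;

var builder = WebApplication.CreateBuilder(args);

// Run an Orleans silo inside the same process as the ASP.NET Core app
builder.Host.UseOrleans(silo =>
{
    silo.UseLocalhostClustering();          // single-node clustering for local development
    silo.AddMemoryGrainStorage("Default");  // grain state provider (placeholder name)
});

builder.Services.AddGrpc();

var app = builder.Build();
app.MapGrpcService<ControlPlaneService>();  // the control plane service sketched earlier
app.Run();

Swap the memory storage for a real provider and the localhost clustering for a proper membership provider, and the same code scales out across machines.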
This blog post is getting a little long, so let's take a break here. Next time, we will see how the data plane is implemented and how to measure consumption for billing purposes. Until then!