Microsoft Orleans - Long running CPU bound work

I’ve been talking about the idea of long running, CPU bound, synchronous work for a long time. Now, I’ve finally taken the time to build it out into a NuGet package!

I’ve been talking about it for years at this point. Between my other posts on Orleans and that one time (so far?) I was on a podcast to talk about it, I never got around to writing down the thing I was always talking about. That has now changed in the form of a NuGet package, and you too can solve problems (maybe!) with the small amount of code the package exposes!

As a refresher, I needed a means of distributing tens of thousands of crypto calls, all hitting within a single moment. I had done some experimenting and proof-of-concept work, and at the time settled on Orleans to accomplish the distribution. Though I (still) haven’t used Orleans for its more “intended” use case of doing lots of distributed asynchronous work, using it the way we do works pretty well for us.

If I were to do it all over again, perhaps it would make more sense to use a message queueing system with tried and true infrastructure. I’m not sure whether I knew of these concepts at the time, but that route may have been a better option. Even though the Orleans option seems like it would “conceptually” work, I’ve still not personally worked on bringing up a queue system with workers and notifications. I feel like that was quite a brain fart when re-reading… but basically there may have been better ways to accomplish what this code does, and I still haven’t experimented with those better ways!

Anyway, onto the NuGet package!

The package

There are only a few moving parts when it comes to the package:

ISyncWorker<TRequest, TResult>

All your CPU bound, long running, synchronous work will implement this interface, though it will do so by extending the SyncWorker<TRequest, TResult> abstract base class.

This interface exposes a few separate methods/contracts that allow for the interaction with the long running grain work:

  • Start(TRequest)
    • This method is the “Entry point” into dispatching work to the grain. This method is the only one that takes in a parameter in the form of the arguments needed to perform the work. In the case of PasswordVerifier, those parameters are both “the password” and “the password hash”, both of which are needed to actually do the verification of a password.
  • GetWorkStatus()
    • This method is used as a sort of “polling” mechanism, where callers can retrieve the current status of the work the grain is performing. Once the status is Completed or Faulted, data can be retrieved from one of the following two methods, depending on which status the grain is in.
  • GetResult()
    • This method gets the result of a grain’s execution when the grain is in a Completed state, in the form of TResult.
  • GetException()
    • This method gets the exception from a grain’s execution when the grain is in a Faulted state.
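
To make the start/poll/fetch flow concrete, here is a minimal sketch of how a caller might drive that contract. It is a stand-in, not the package's source: the method names come from the list above, but the Task-based return types, the NotStarted and Running status values, the StartAndPoll helper, and the UpperCaseWorker class are all assumptions made so the example can run without a silo.

```csharp
using System;
using System.Threading.Tasks;

// Completed and Faulted are named in the post; NotStarted and Running are assumed.
public enum SyncWorkStatus { NotStarted, Running, Completed, Faulted }

// Stand-in for the package's contract. Method names come from the post;
// the Task-based return types are assumptions.
public interface ISyncWorker<TRequest, TResult>
{
    Task<bool> Start(TRequest request);
    Task<SyncWorkStatus> GetWorkStatus();
    Task<TResult> GetResult();
    Task<Exception> GetException();
}

public static class SyncWorkerPolling
{
    // Hypothetical helper: dispatch the work, poll until it finishes,
    // then return the result or rethrow the stored exception.
    public static async Task<TResult> StartAndPoll<TRequest, TResult>(
        ISyncWorker<TRequest, TResult> worker, TRequest request, int pollDelayMs = 100)
    {
        await worker.Start(request);
        while (true)
        {
            switch (await worker.GetWorkStatus())
            {
                case SyncWorkStatus.Completed:
                    return await worker.GetResult();
                case SyncWorkStatus.Faulted:
                    throw await worker.GetException();
                default:
                    // NotStarted or Running: wait a bit and ask again.
                    await Task.Delay(pollDelayMs);
                    break;
            }
        }
    }
}

// Trivial in-memory worker, present only so the helper can be exercised
// without an Orleans silo. A real worker would be a grain.
public sealed class UpperCaseWorker : ISyncWorker<string, string>
{
    private Task<string> _task;

    public Task<bool> Start(string request)
    {
        _task = Task.Run(() => request.ToUpperInvariant());
        return Task.FromResult(true);
    }

    public Task<SyncWorkStatus> GetWorkStatus() => Task.FromResult(
        _task == null ? SyncWorkStatus.NotStarted
        : _task.IsFaulted ? SyncWorkStatus.Faulted
        : _task.IsCompleted ? SyncWorkStatus.Completed
        : SyncWorkStatus.Running);

    public Task<string> GetResult() => _task;

    public Task<Exception> GetException() =>
        Task.FromResult<Exception>(_task.Exception.InnerException);
}
```

Against a real cluster the worker would presumably be obtained from the grain factory rather than constructed directly, and the polling delay would be tuned to how long the work is expected to take.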

SyncWorker<TRequest, TResult>

This is the abstract base class of the grains. This class implements ISyncWorker<TRequest, TResult> and provides the implementation of the methods described in the previously mentioned interface, as well as a few other methods:

  • CreateTask(TRequest request)
    • This private method is invoked by the Start(TRequest request) method, which sets the _task state on the instance. It enqueues the long running work onto a LimitedConcurrencyLevelTaskScheduler.
  • PerformWork(TRequest request)
    • This is the abstract method of this abstract class. It is invoked through the creation of the _task, and is the “thing” that implementations of this class need to provide.
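
The relationship between Start, CreateTask, and PerformWork can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the package's code: SyncWorkerSketch has no Orleans dependencies, PerformWork is assumed to return its result synchronously, and PasswordVerifierRequest, PasswordVerifierSketch, and the Hash helper are hypothetical names. SHA-256 is only a placeholder for a deliberately slow password hash and should not be used for real passwords.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public enum SyncWorkStatus { NotStarted, Running, Completed, Faulted }

// Simplified stand-in for the base class described above.
public abstract class SyncWorkerSketch<TRequest, TResult>
{
    private readonly TaskScheduler _scheduler;
    private Task<TResult> _task; // null until Start is called

    protected SyncWorkerSketch(TaskScheduler scheduler) => _scheduler = scheduler;

    // Queues the work and returns immediately; false means work was already started.
    public Task<bool> Start(TRequest request)
    {
        if (_task != null) return Task.FromResult(false);
        _task = CreateTask(request);
        return Task.FromResult(true);
    }

    public Task<SyncWorkStatus> GetWorkStatus()
    {
        if (_task == null) return Task.FromResult(SyncWorkStatus.NotStarted);
        if (_task.IsFaulted) return Task.FromResult(SyncWorkStatus.Faulted);
        if (_task.IsCompleted) return Task.FromResult(SyncWorkStatus.Completed);
        return Task.FromResult(SyncWorkStatus.Running);
    }

    // Only valid once the status is Completed.
    public Task<TResult> GetResult() =>
        _task != null && _task.Status == TaskStatus.RanToCompletion
            ? _task
            : throw new InvalidOperationException("Work has not completed.");

    // Only valid once the status is Faulted.
    public Task<Exception> GetException() =>
        _task != null && _task.IsFaulted
            ? Task.FromResult<Exception>(_task.Exception.InnerException)
            : throw new InvalidOperationException("Work has not faulted.");

    // Wraps PerformWork in a task queued on the supplied scheduler, which is
    // where a limited concurrency scheduler would be plugged in.
    private Task<TResult> CreateTask(TRequest request) =>
        Task.Factory.StartNew(
            () => PerformWork(request),
            CancellationToken.None,
            TaskCreationOptions.None,
            _scheduler);

    // The one member that concrete workers have to supply.
    protected abstract TResult PerformWork(TRequest request);
}

// Hypothetical request type carrying the two inputs the post mentions.
public sealed class PasswordVerifierRequest
{
    public string Password { get; set; }
    public string PasswordHash { get; set; }
}

public sealed class PasswordVerifierSketch : SyncWorkerSketch<PasswordVerifierRequest, bool>
{
    public PasswordVerifierSketch(TaskScheduler scheduler) : base(scheduler) { }

    // Placeholder hash so the example needs nothing beyond the base class library.
    public static string Hash(string password)
    {
        using (var sha = SHA256.Create())
            return Convert.ToBase64String(sha.ComputeHash(Encoding.UTF8.GetBytes(password)));
    }

    protected override bool PerformWork(PasswordVerifierRequest request) =>
        Hash(request.Password) == request.PasswordHash;
}
```

The point of the shape is that a concrete worker only overrides PerformWork, while the status tracking and the scheduling stay in the base class.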

LimitedConcurrencyLevelTaskScheduler

This is a class that was more or less copied from here. Its intended use is to limit the amount of work that can be performed at any one time as it relates to the work queued against this scheduler.

Without this scheduler, queueing a massive amount of work on the “normal” scheduler quickly overwhelms the Orleans silo, leaving no resources available for the asynchronous messaging calls Orleans requires to operate. With this scheduler set to some configurable level under the “amount of work that could conceivably be done concurrently”, the Orleans cluster is able to continue its asynchronous communication while also working through the long running tasks queued on the limited concurrency task scheduler.
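
The effect of capping concurrency can be seen in a small standalone sketch. Note the swap: this uses the base class library's ConcurrentExclusiveSchedulerPair rather than the package's LimitedConcurrencyLevelTaskScheduler, because it gives a similar cap on concurrency without needing the package. LimitedConcurrencyDemo and RunAndMeasurePeak are hypothetical names, and Thread.Sleep stands in for real CPU bound work.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LimitedConcurrencyDemo
{
    // Queues `workItems` blocking jobs on a scheduler capped at `maxConcurrency`
    // and returns the highest number of jobs observed running at the same time.
    public static int RunAndMeasurePeak(int workItems, int maxConcurrency)
    {
        // BCL scheduler that runs at most `maxConcurrency` tasks concurrently.
        var scheduler = new ConcurrentExclusiveSchedulerPair(
            TaskScheduler.Default, maxConcurrency).ConcurrentScheduler;

        int running = 0, peak = 0;

        var tasks = Enumerable.Range(0, workItems).Select(_ =>
            Task.Factory.StartNew(() =>
            {
                int now = Interlocked.Increment(ref running);

                // Record a new peak if this job pushed the count higher.
                int seen;
                while (now > (seen = Volatile.Read(ref peak)))
                    Interlocked.CompareExchange(ref peak, now, seen);

                Thread.Sleep(20); // stand-in for long running synchronous work
                Interlocked.Decrement(ref running);
            },
            CancellationToken.None,
            TaskCreationOptions.None,
            scheduler)).ToArray();

        Task.WaitAll(tasks);
        return peak;
    }
}
```

However many items are queued, the measured peak should never exceed the cap, which is what leaves threads free for everything else. One plausible choice for the cap is a couple fewer than Environment.ProcessorCount, though the right value depends on the host and is an assumption here rather than something the post specifies.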

Not the package

Aside from the package itself, the repository has a few other projects:

Orleans.SyncWork.Demo.Api

A “sample” implementation of a set of Orleans grains, as well as the cluster hosting. In this project, several long running grains are implemented and registered to the Orleans silo. Those grains are then exposed through API endpoints, or are used as a means of testing some of the logic of SyncWorker<TRequest, TResult> within the Orleans.SyncWork.Tests project.

This project exposes an Orleans Dashboard, as well as SwaggerUI, for keeping an eye on the test cluster and invoking calls to the API respectively.

The dashboard depicting 10k grains being fired off simultaneously

Orleans.SyncWork.Demo.Api.Benchmark

A benchmark project has been set up to get an idea of the timing differences between serial execution, Parallel.For, and the parallel execution offered through the SyncWorker<TRequest, TResult> implementation. The latter can be slower than Parallel.For, but faster than serial execution. Keep in mind, though, that this benchmarking is done completely locally, whereas in a real-world scenario you’d bring up multiple silo hosts to make up a highly available cluster. At that point it stands to reason that the throughput of a cluster using this package would exceed what a single machine can accomplish.

An initial running of the benchmark gave the results:

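| Method                | Mean     | Error    | StdDev   |
|-----------------------|----------|----------|----------|
| Serial                | 12.284 s | 0.0145 s | 0.0135 s |
| MultipleTasks         | 12.274 s | 0.0073 s | 0.0065 s |
| MultipleParallelTasks | 1.723 s  | 0.0185 s | 0.0144 s |
| OrleansTasks          | 1.118 s  | 0.0080 s | 0.0074 s |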

Though keep in mind this can be further improved with an actual cluster of Orleans silos, rather than just my one local silo.

Orleans.SyncWork.Tests

Unit testing project for the work exposed from the NuGet package. These tests bring up a “TestCluster” which is used for the full duration of the tests against the grains.

One of the tests in particular throws 10k grains onto the cluster at once, all of which are long running (~200ms each) on my machine. That is more than enough time to overload the cluster if the limited concurrency task scheduler is not working alongside the SyncWorker base class correctly.

Wrap up

So that’s it. This is quite a bit simpler of an implementation than what I originally ended up making for work. The same basic idea is there, but the abstraction is quite different. If it holds up well enough, maybe using this package will be in order!

You can use this package to enable your Orleans cluster to handle your long running sync work, like tens of thousands of crypto calls!

do it

Resources

Author

Russ Hammett

Posted on

2021-11-22

Updated on

2022-10-13
