I wish we used the actor model more; it seems like a much better alternative to green threads, semaphores, or locks and mutexes for parallelism. With Moore's law struggling to keep going, we can no longer afford to write our code as synchronous code first instead of async or parallel by default, in my humble opinion.
My understanding is that goroutines are pretty popular for parallelism in Go, and at some point I'm going to try using the features of the language I'm currently using (F#'s MailboxProcessor) to put my money where my mouth is.
But it continues to baffle me why we don't make fuller use of the multiprocessing capabilities of the processors we have today and the fortunes of RAM we have now, even taking into account that we are using higher-level languages and not constantly profiling what we write for performance.
Surely there must be a way to make writing parallel-first code as natural to humans as our imperative code is today, right? Whether that be channels or actors or something else?
Bonus: On a purely emotive level, anything with the words "complex" and "technique" in its name will have a rough time gaining mindshare with new users. If the goal is adoption of languages that sport the features from this article, they may want to pick a less scary-sounding name simply for pragmatic reasons.
Nitpick: Actors or goroutines or async are mostly used to deal with concurrency. CUDA & friends are used for parallelism.
The paper: The motivation is a problem suitable to be solved via rule systems. It's a bit unclear why the rule system needs to be grafted on top of an actor model.
Actors: Programming with async messages is hard. Message ordering matters, no clear error reporting, no stack traces, distributed state. This is hardware-hard, as in bring-in-formal-verification-methods hard. To be avoided in favor of a centralized coordinator solution if at all possible.
I wonder if, alternatively, one could use Rx processing over the stream of events.
Motivating example from the paper:
1. Turn on the lights in a room if someone enters and the ambient light is less than 40 lux.
2. Turn off the lights in a room after two minutes without detecting any movement.
3. Send a notification when a window has been open for over an hour.
4. Send a notification if someone presses the doorbell, but only if no notification was already sent in the past 30 seconds.
5. Detect home arrival or leaving based on a particular sequence of messages, and activate the corresponding scene.
6. Send a notification if the combined electricity consumption of the past three weeks is greater than 200 kWh.
7. Send a notification if the boiler fires three Floor Heating Failures and one InternalFailure within the past hour, but only if no notification was sent in the past hour.
> The motivation is a problem suitable to be solved via rule systems
> I wonder if, alternatively, one could use Rx processing over the stream of events.
Seems like the right solution, and additionally doesn't need to be centralized, strictly speaking - you can ship the code for Rx operations anywhere.
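To make that concrete, here's a minimal sketch of rule 4 from the list above (notify on a doorbell press, but at most once per 30 seconds) as a reactive pipeline. This assumes RxPY 3.x; the `doorbell_presses` stream and `send_notification` are placeholders, not anything from the paper.

```python
from rx.subject import Subject
from rx import operators as ops

doorbell_presses = Subject()  # stream of doorbell-press events

def send_notification(event):
    print("doorbell:", event)

doorbell_presses.pipe(
    # emit the first press, then drop further presses for 30 seconds
    ops.throttle_first(30.0),
).subscribe(send_notification)

# fed by whatever integrates with the actual doorbell hardware:
doorbell_presses.on_next({"device": "front-door"})
```

The nice part is that the "no notification in the past 30 seconds" state lives inside the operator instead of being spread across actors.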
> Message ordering matters, no clear error reporting, no stack traces, distributed state
It's funny. For all the useless Java "factory" patterns, the one thing there ISN'T a factory for - Futures - is the one thing you could use to easily solve your list of issues and really improve the quality of actor / async programming.
This is fundamentally what NewRelic and other instrumenting libraries do though, also there's stuff like JetBrain's `Schedule` and `Execute`. It's a bit of a puzzle to me why this isn't just standardized.
>It's funny. For all the useless Java "factory" patterns, the one thing there ISN'T a factory for - Futures[...].
>It's a bit of a puzzle to me why this isn't just standardized.
It kind of is. A `FutureFactory` is pretty much `IO` (as in, Haskell's IO).
It just so happens that a lot of implementations don't want you to call `unsafeRun` (so that `unsafeRun` is only called at the edge of the world), and some implementations don't like it, but nothing stops you from having an `unsafeRunToFuture` that returns a `Future`. This is actually pretty common in Scala.
I'm a simple bear. I have always struggled with the architecture and organization of my async code. Stuff like exception handling, percolating errors up, rational back pressure and retry.
My goal is to rewrite some of my server code using Project Loom, to see how it helps.
"why we don't make fuller use of ... the fortunes of RAM we have now"
I know you mention this as an aside, but I'd appreciate examples.
I've been personally frustrated by teammates using external key/value stores for hot data sets which would trivially fit within RAM. But haven't had ready-made solutions.
One recurring objection, for which I've had no good response, is how to quickly hoist (prime, seed, better term needed here) the RAM of new instances.
Last time this came up, I helped come up with a compromise kludge using Redis. We added a local Redis instance to each web server, had the "master" create regular snapshots, copied that snapshot onto new EC2 instances during boot, and used the snapshot to prime each instance's local cache.
It worked pretty well. Though I would have preferred in-process, using Redis allowed better inspection and debugging, which was pretty cool. But devops-wise, this Rube Goldberg kludge broke most teammates' brains.
Thanks for listening. I keep looking to see if anyone's got a more turnkey solution.
"I've been personally frustrated by teammates using external key/value stores for hot data sets which would trivially fit within RAM. But haven't had ready-made solutions."
You don't need a ready made solution unless you are trying to keep the data consistent, too. If you are, better to use something off the shelf. If you aren't...just deserialize it into whatever local data format makes sense, the same as you'd do reading from an external cache. You basically just have some push/poll mechanism to get data from your data store when updated/at intervals, deserialize it, and now it lives in RAM.
The one downside of that is that if autoscaling is thrashing instances, it can thrash your data store (though that's true with Redis etc. too, depending on what is loading data into it; having a caching layer in between DOES decouple that). You also have to decide what happens if the data store is unreachable (I've done things like query the rest of the cluster for the data, just via a REST API, because the total amount needed was all of a few megs).
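For what it's worth, the poll-and-swap version of this can be very small. A rough sketch, with `load_snapshot_from_store` standing in for whatever reads your datastore or snapshot:

```python
import threading
import time

_cache = {}  # the hot data set, held in process memory

def load_snapshot_from_store() -> dict:
    # placeholder: fetch the current snapshot and deserialize it
    return {"user:42": {"plan": "pro"}}

def _refresh_loop(interval_seconds: int = 60):
    global _cache
    while True:
        fresh = load_snapshot_from_store()
        _cache = fresh  # swap the whole dict so readers never see a partial update
        time.sleep(interval_seconds)

def get(key):
    return _cache.get(key)

# prime once at boot, then refresh in the background
_cache = load_snapshot_from_store()
threading.Thread(target=_refresh_loop, daemon=True).start()
```

Priming a new instance is then just the one snapshot load before it starts serving traffic.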
My experience with puniverse/quasar, Loom's predecessor, has been overall positive. There's a whole actors framework built on top of it. The real problem is how clumsy it is to work with modern Futures code; you have to wrap everything.
We used the actor model through the Thespian Python library[0] for a project: a Raspberry Pi connecting through Bluetooth Low Energy (BLE) to a "fitness tracker" and streaming data to our backend through 4G dongles.
It had to be plug-and-play [for non-technology-savvy users]. The Raspberry Pi was unattended and had to reconnect to the internet and to the device automatically, always. It also had to check data and pull in code updates automatically.
They were distributed geographically, in different time zones, with unstable internet.
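For context, the basic Thespian shape is roughly this (a toy greeter actor, not the project's actual code):

```python
from thespian.actors import Actor, ActorSystem

class Greeter(Actor):
    # Thespian delivers every message, plus the sender's address, here
    def receiveMessage(self, message, sender):
        if isinstance(message, str):
            self.send(sender, f"hello, {message}")

if __name__ == "__main__":
    system = ActorSystem()                   # in-process system base by default
    greeter = system.createActor(Greeter)
    print(system.ask(greeter, "world", 1))   # -> "hello, world"
    system.shutdown()
```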
Timezones are not easy to work with. I've had nightmares about them.
> But it continues to baffle me why we don't make fuller use of the multiprocessing capabilities of the processors we have today and the fortunes of RAM we have now, even taking into account that we are using higher-level languages and not constantly profiling what we write for performance.
When I write parallel-first code in a fancy language, I find that I can get better absolute performance by just writing plain old single-threaded C. And when my data gets big, it is relatively easy to turn my C-algorithm-using-arrays into a C-algorithm-using-arrays-backed-by-mmap. Mmap is harder to use in every other language I've tried - or comes with additional worries.
And often a single-threaded algorithm is still the way to go, even in a multi-user system, because you can usually let the 'next layer up' multiplex the requests for you, whether it's your web framework or nginx or whatever.
I think that might be the way to make 'fuller use of the multiprocessing capabilities of the processors we have today and the fortunes of RAM'.
I've never used Clojure's syntax, but my suspicion is that until first-class parallelism can be used with what feels like just an annotation or another type in imperative code paths, without having to change one's way of thinking much from the C abstract-machine concept, we won't get to the goal of parallel-by-default very quickly.
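Python's standard library gets part of the way toward that feel for embarrassingly parallel loops, in that the call shape barely changes. A sketch, with `expensive` and `inputs` as stand-ins:

```python
from concurrent.futures import ProcessPoolExecutor

def expensive(x: int) -> int:
    # stand-in for a CPU-bound task
    return sum(i * i for i in range(x))

if __name__ == "__main__":
    inputs = [10_000, 20_000, 30_000, 40_000]

    serial = list(map(expensive, inputs))        # sequential baseline

    # parallel version: same map shape, a pool supplies the workers
    with ProcessPoolExecutor() as pool:
        parallel = list(pool.map(expensive, inputs))

    assert serial == parallel
```

It's still far from "just another type", since the process-pool model brings its own restrictions (picklable arguments, startup cost), so the point about not wanting to change one's mental model stands.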
Not really in Erlang, since you manually build the actors with a recursive tail call. There's nothing called an "actor" in Erlang, but it's definitely an actor-based language.
> Elixir can be regarded as a modern Erlang (e.g., with macros) that runs atop BEAM; i.e., the Erlang virtual machine.
I've heard quite a few descriptions of Elixir relative to Erlang, but I'm not quite sure about this one. Erlang has macros[0], though the syntax and general format do differ from Elixir's[1]. And though Erlang has been around since at least 1986, it has changed a lot in that time and is what I'd consider a modern programming language today, used at places such as WhatsApp, Grindr, Pinterest, etc.
I think a lot of people use "modern" to mean "doesn't look dated or unfamiliar to me." In programming language commentary, it generally means "looks like C or ALGOL."