Contents
New: 10Nov2013, updated 22Feb2024 (some typos moved to note 254, plus some added). This page is in group Technology.
This note on the Internet Archive‘s Wayback Machine: A reactive manifest
Disclaimer
Whether this note manifests itself as a manifest, remains to be manifested.
Background
In this blog note I will try to critically analyse The Reactive Manifesto [1]. The first version of this manifesto was written by Jonas Bonér (he states so in a blog post that I found by searching [2]), but the web page and document are anonymous (rather strange for a manifesto). It is embraced by the Scala programming language people Martin Odersky, Erik Meijer and Roland Kuhn on their «Principles of Reactive Programming» page at the École Polytechnique Fédérale de Lausanne in Switzerland [3].
Now, after having read The Reactive Manifesto and seen it also being signed by lots of people – I want to react. I read the paper as rather rhetorical and almost like a creed. I think the outline is fine, but the main fault of the paper is that it sets its scope as Universal.
Disclaimer: update 15Oct2015. Version 2.0 was published on September 16 2014 (also [1]). It has been reduced from the nine rather airy pages of version 1.1 of Sept. 23 2013 to two rather condensed pages! (Could they have read this note?) Therefore, some of my comments below (made on 1.1) may also be interesting to read in this new context. Event-driven, Scalable, Resilient, Responsive are now instead Responsive, Resilient, Elastic and Message Driven (in that order). «Event-driven» these days seems to be falling out of popularity (in favour of, yes: «reactive programming», as with Reactive Extensions / ReactiveX / RxSwift [25] (much thanks to Erik Meijer) or Clojure and core.async [26] (Rich Hickey)). They also seem to have removed the explicit polarisation between everything good (asynchronous, non-blocking) and everything evil (synchronous, blocking). Version 2 seems like a more edible document. It's more mature to talk about what you have than to paint black what the others have (which, by the way, also solves the problems they say it doesn't). I guess the next version will also have changed message driven to change driven? (Update 22Feb2024: there is no newer version.) After I wrote this note I have written more about some of this, as in Not so blocking after all.
I try to understand matters in this note. It will have errors, and as I read it over and over I change and add and remove. I see holes and a lack of streamlining. But it would be up to you to pinpoint concrete errors and comment on them. Then I will tag these points. But nobody knows everything, so maybe even you will learn from this note – if I manage to keep your attention all the way to the far end.
Disclaimer: if any of this is seen as offensive, it’s not meant to be – even if I did want to make some points. But I could always restate.
I remember back in 2001 that Autronica was a member of the J Consortium. It aimed at defining a standard for real-time Java (NIST was in it, and the work was backed by HP, Microsoft, Newmonics etc.). I was at two meetings: one at Ericsson at Älvsjö, Stockholm in 1999, and one in Karlsruhe, Germany in 2001. At the latter meeting I also attended the Embedded Systems trade fair in Nürnberg (Nuremberg), Germany (it's called Embedded World now). There I met a very interesting guy with a small company who talked about «reactive systems». He was from Switzerland. What I heard was familiar and alien at the same time. Didn't we at Autronica make reactive systems; hadn't I, for one, written highly reactive systems in the occam 2 programming language for some ten years? What's more reactive than collecting real-time data from the combustion chambers of the diesel engines of some 50% of the world's largest ships – and doing real-time calculations on the data – aligning the samples with pins on the flywheel to get the right rotation angle? Where timing is more than crucial. As I learned, an engine like that doesn't rotate evenly; the angular speed varies over one rotation. We even had to take that into consideration, because the equation for delivered power is very dependent on accurate angular measurements. We had 32 samples for each rotation on up to 16 (I think) cylinders, and these engines, I seem to remember, went to about 1000 RPM (and much more if a turbine engine). When I came back to work I started a discussion thread about reactive systems.
Where the J Consortium failed on Java and the JVM intricacies and its applicability to real-time, Martin Odersky at EPFL around the same time designed Scala and had it run on the JVM.
I think the J Consortium (Java Consortium) and its RTJWG (Real-Time Java Working Group) was in competition with Sun's own Java Community Process, leading to javax.realtime; see its present state at [4]. So there may have been several reasons for the failure. I assume that many technologies have diverse success and failure criteria, not only technical ones. Like signing a petition for it. That may be a good idea.
Reactive what?
The Reactive Manifesto is probably meant to be a white paper for the ideas of a reactive programming framework at least for the Scala language.
In the «Reactive programming» article at Wikipedia I read:
– In computing, reactive programming is a programming paradigm oriented around data flows and the propagation of change. This means that it should be possible to express static or dynamic data flows with ease in the programming languages used, and that the underlying execution model will automatically propagate changes through the data flow.
– Sometimes the term reactive programming refers to the architectural level of software engineering, where individual nodes in the data flow graph are ordinary programs that communicate with each other. [5]
And in the «Real-time computing» article at Wikipedia I read:
«In computer science, real-time computing (RTC), or reactive computing, is the study of hardware and software systems that are subject to a «real-time constraint» – e.g. operational deadlines from event to system response.» [17]
Summary:
- (1) Reactive applications (the only term in The Reactive Manifesto)
- (2) Reactive programming (a separate Wikipedia article, [5])
- (3) Reactive computing == real-time computing (in «Real-time computing» at Wikipedia, [17])
The interesting thing is that (1) is not mentioned in (2) or (3) – and (2) and (3) are not mentioned in (1). And «Scala» is not mentioned in (1). The Reactive Manifesto may be trying to reserve the term «Reactive applications» for its own use, and trying to tag the traits as Universal (again, with a capital U). In my opinion the solution is not universal, but absolutely a possible solution. The phrase should be used in the context of one of the points above.
However, for this blog note I am not differentiating between them.
I have tried to understand their context in much of my writing, because they are not alone. They may even be the vast majority. My recent blog notes Some questions about SDL, Pike & Sutter: Concurrency vs. Concurrency, Block play for blockers and Eventual concurrency try to do this. Also Not so blocking after all. Adding these up, I could stop writing now. (Update 22Feb2024: I still haven't quit.) I may have said it all. But repetition is a well-known argumentation technique – used all over The Reactive Manifesto. I'll try to twist my perspectives on this, too.
I have tried to make a sketch. I must admit that, even in view of the blog notes above, I only had this figure ready days after studying and writing this blog note:
It was especially putting «reactive» into the figure's context that I struggled with. Some of it should be clearer when I have written the last word of this note. In the figure I mention systems built with basically (A) synchronous messages (CSP), (B) asynchronous messages (SDL), (C) asynchronous: callbacks, event-loop, futures, and (D) synchronous: time-slot. I also mention (a) explicit concurrency, (b) implicit concurrency, (c) state machines and (d) stateless designs. All of these may be used to design reactive systems.
The figure is my jotting down words to see matters here in perspective. It’s not like Peter Wegner‘s classical taxonomy [21]. He identified six orthogonal dimensions of object-oriented language design: objects, types, delegation, abstraction, concurrency, persistence. He also discusses both actors and concurrency there. That paper can’t be read enough.
I think that The Reactive Manifesto tries to establish the top-right methodology as best for reactive applications. I try to understand in this blog note in which context they are right. And sometimes I believe they are plain right – but also plain wrong.
Thought experiments
These two thought experiments are naïve, and have nothing to do with reactive programming in the narrow sense. But in the larger sense they may be OK tools for getting a feeling of things.
Clockwork
Imagine a clockwork with toothed wheels that may or may not touch; in either thought-case they should logically or physically be pushed by each other. There is a computer system that understands what a wheel is doing (rotating or not), and it controls one actuator for every wheel to tell it what to do (rotate one or several positions). The system is safety-critical (the time shown must at every moment be correct), it is accurate (time is kept correct by adjusting the pendulum or balance wheel, forming a phase-locked loop) and it is reactive (it must be possible to change between winter and summer time). Some internet clock is the reference.
The computer system has many small programs that send messages, asynchronously or synchronously. If all messages are put into a shared message pool, it is important that no single message controlling one wheel may block another, so an asynchronous, non-blocking scheme is chosen. Conversely, if the messages are sent over individual channels at the points where the toothed wheels would have touched, it is fine to do synchronisation and communication as the same happening. This means that blocking is fine. In the latter system adding asynchronicity adds nothing; in the former, going for synchronous blocking may cause the clock to halt. This could be modelled and analysed in a tool.
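Here is a minimal sketch of the two wirings, in Go terms (my construction, nothing from the Manifesto or from any clock design): a shared pool where a send must never block, and a per-contact channel where the blocking send itself is the synchronisation:

```go
// Sketch of the clockwork thought experiment's two wirings.
package main

import "fmt"

type msg struct{ wheel, steps int }

func main() {
	// (1) Shared message pool: buffered, and a send must never block
	// another, so on overflow we must drop (or grow the buffer).
	pool := make(chan msg, 8)
	select {
	case pool <- msg{wheel: 1, steps: 1}:
	default: // pool full: lose the message rather than halt the clock
	}

	// (2) Per-contact channel: unbuffered, so the send IS the touching
	// of the teeth. Blocking here is the synchronisation itself.
	contact := make(chan msg)
	go func() { contact <- msg{wheel: 2, steps: 1} }() // driving wheel pushes
	m := <-contact                                     // driven wheel accepts the push
	fmt.Println("wheel", m.wheel, "advances", m.steps, "- pool holds", len(pool))
}
```

In wiring (2) nothing can be lost and nothing needs a buffer; in wiring (1) the price of «never blocking» is a drop (or an unbounded buffer).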
Update 9Mar2022. The first quote below is from the Reactive Manifesto (2014); the second is my try at an antonym, comparing, again, apples and bananas. But I'm not certain whether The Reactive Manifesto, in its way of reasoning, doesn't fall victim to the same.

From the Manifesto: «If all of our components support mobility, and local communication is just an optimization, then we do not have to define a static system topology and deployment model upfront. We can leave this decision to the operations personnel and the runtime, which can adapt and optimize the system depending on how it is used.»

My antonym: Two (mechanical) clocks have the same purpose: to rotate three hands. Two brass plates make up a case, where the toothed wheels are placed in holes. The two clocks are designed according to the spec, but behind the clock face they differ. The user can do what's in the spec, but has no need to rearrange the wheels.
Coffee blade grinder
This morning we had little coffee left, and I put what little we had into our blade grinder. What surprised me was that it took longer to grind the beans than when the grinder is full; at least longer per bean on average. The amateur scientist in me concluded that the blade hit beans less often. Closer beans would have been better.
This system was too soft to be effective. Too much travelling of beans. Reducing the asynchronicity by pushing the top down to decrease the cavity would have been better. Tighter coupling is best (as in fastest) in this case.
Once and for all !?
This chapter will perhaps look formal, but I am afraid it isn't; the terms are loosely defined. My goal is to set up a framework that I can use to discuss The Reactive Manifesto. I try to use axioms and lemmas:
«An axiom, or postulate, is a premise or starting point of reasoning. As classically conceived, an axiom is a premise so evident as to be accepted as true without controversy.» [6]
«In mathematics, a lemma or helping theorem is a proven proposition which is used as a stepping stone to a larger result rather than as a statement of interest by itself.» [7]
The «software system» in scope here is of the real-time type: a reactive and concurrent software system – towards safety-critical.
A1. No software system (with an external interface) can be built from only asynchronous or only synchronous components
A2. As long as the specifications are met, there is (yet, perhaps) no «best» methodology for building a software system. (Internally there may be more or less elegant solutions, depending on taste and experience as well as hard facts – as with any «inside»)
L1. A component may be built from only asynchronous or only synchronous building blocks. Buffers may be built from synchronous building blocks, and feedback mechanisms may be built from asynchronous building blocks
L2. Some sort of asynchronicity is needed between a fast producer and a slow consumer (by some sort of buffer)
L3. Asynchronicity will be bounded, because no physical buffer is infinite
L4. When a buffer (of data or components, etc.) is full (and the consumer is not able to consume), either data is allowed to be lost or the fast producer is blocked from filling in more (by synchronous feedback). (See the sketch after this list)
L5. An «external» buffer is harder to do bookkeeping on than an «internal» buffer (like flushing, rearranging, prioritising). (This point is perhaps out of scope, but I think about examples like «send and forget»-buffer pools, pipes and buffered channels)
L6. When a building block has nothing more to do (for whatever reason) it may become idle at «no cost». (Blocking is one such «idle» state.) (This point is perhaps too closely tied to a particular «process model», see below.) Update 9Mar2022: However, in the Manifesto (2014) they do make a point of it: «A non-blocking API to a resource allows the caller the option to do other work rather than be blocked waiting on the resource to become available. This may be complemented by allowing the client of the resource to register for getting notified when the resource is available or the operation has completed.» The first point, well – I have never felt I needed this on a general basis. The second point's last part, fine. It's the «asynch pattern» used all over the place. My XCHAN uses it, and the ill-fated xC language's interface with [[notification]] and [[clears_notification]] uses it (xC is C plus x). But, contrary to what the Manifesto authors think, this is perfectly possible with synchronous and blocking behaviour under the hood. It may even be better, since queueing won't be necessary (ref. the Ravenscar profile)
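Before leaving these, here is a minimal sketch of lemma L4, assuming nothing beyond Go's bounded (buffered) channels: when the bound is reached, either the producer blocks (synchronous feedback), or data is lost. There is no third option:

```go
// Lemma L4 in code: a full bounded buffer forces block-or-drop.
package main

import "fmt"

func main() {
	buf := make(chan int, 2) // L3: every real buffer is bounded

	// Variant 1: blocking producer. A third plain send would block
	// right here until the consumer takes an item: the synchronous
	// feedback of L4.
	buf <- 1
	buf <- 2

	// Variant 2: lossy producer. Keep the producer non-blocked by
	// dropping when full.
	select {
	case buf <- 3:
	default:
		fmt.Println("buffer full: value 3 is lost")
	}

	fmt.Println(<-buf, <-buf) // consumer drains: prints 1 2
}
```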
I hope I haven’t forgotten or repeated anything. I would change it on no warning. And, no – I didn’t do this to provoke. I am naïvely hoping that the points would converge to reality. But I must admit that The Reactive Manifesto provoked me, thinking parts of it diverted…
The Reactive Manifesto
Here is the outline of The Reactive Manifesto:
- The Need to Go Reactive
- Reactive Applications
- Event-driven
- Scalable
- Resilient
- Responsive
- Conclusion
The outline of The Reactive Manifesto is easy to agree with, and the professional knowledge of its writer(s) I simply accept. But they are wrong on some aspects. As mentioned, they compare apples with bananas and draw incorrect conclusions. It's even worse: they compare apples with what they think a banana is. Their context is difficult to grasp.
I will do some cut and paste from the manifesto, version 1.1.
The Need to Go Reactive (pp 1-2)
But now a new architecture has evolved to let developers conceptualize and build applications that satisfy today’s demands. We call these Reactive Applications. This architecture allows developers to build systems that are event-driven, scalable, resilient and responsive: delivering highly responsive user experiences with a real-time feel, backed by a scalable and resilient application stack, ready to be deployed on multicore and cloud computing architectures. The Reactive Manifesto describes these critical traits which are needed for going reactive. (p1-2)
This is fine, and a good start. However, I doubt whether the term «reactive applications» should be introduced as anything new. And I'd say that any real-time system has been as resilient and responsive as possible for at least thirty years. Scalability is nice, and I agree that it's a goal; it is not something we have been very good at. I used to believe that anything message'ish was «event-driven», but the use these days as «event loop, event as callback and rather single-threaded» seems to have taken over. My trying to understand the rationale for this I have (as mentioned) discussed in an earlier blog note (Eventual concurrency). Especially study Rob Pike's seminal paper «A concurrent window system» (1989), where he shows another alternative.
(After I wrote this note I rediscovered a paper by Peter H. Welch about the dangers of callbacks in OO [23].)
Event-driven (p 3)
An application based on asynchronous communication implements a loosely coupled design, much better so than one based purely on synchronous method calls. The sender and recipient can be implemented without regards to the details of how the events are propagated, allowing the interfaces to focus on the content of the communication. This leads to an implementation which is easier to extend, evolve and maintain, giving you more flexibility and reducing maintenance cost. (p3)
I have discussed the relationship between cohesion and coupling in a published paper [9], where I argued the opposite way: to achieve high cohesion and low coupling, the synchronous approach offers more control than the asynchronous. Of course, and this is the default: any needed asynchronism must be implemented when it is needed to cover the spec. But those software processes (or tasks) that are not close to I/O are basically synchronous.
I remember back in the nineties, when we developed products with transputers from Inmos in the UK. They had four high-speed links (20 Mbit/s!) and could be booted and debugged from any of them. They could be connected any way we liked (via the four links), and there was a configuration language. There was also a router chip. The configuration had loops to place processes on processors, and links were connected. The code being placed was written with no regard to where it should run – except the I/O-close processes. When we got the Virtual Channel Router from the University of Southampton, another level of connectivity was introduced: we could connect channels as well, that is – channels between individual processes on processors. The channels were routed across the links in invisible bundles. (A company called XMOS these days has a transputer-like processor; I have described its deterministic thread behaviour (even if the xC language has selective choice, which may resolve nondeterministically) in a blog note (Nondeterminism).)
Of course the internals of each component are made with no regard to how the messages are propagated. In some of my listed blog notes and papers I have discussed WYSIWYG semantics [10]. The asynchronous case does not have WYSIWYG semantics: to code my process here, I must know how all the other processes that it receives input from (or communicates with) are coded over there. The message sequence (protocol) is not enough; what you see (in here) is not what you get (because you have to know more than the protocol: when can this message happen?). (See BitReactive below; they have automatic formal verification of all aspects of an API.) Also (and this is perhaps even more important), one cannot say that an event (message) cannot happen in this state (but one can save it). And one cannot hinder a sender from changing state (again and again) after it has sent a message to this process. So a possible session between processes becomes quite complex to code, since any (not wanted) message can come in between.
So, the Manifest’s «This leads to an implementation which is easier to extend, evolve and maintain» is plain wrong, seen in the context of WYSIWYG semantics. The components of an asynchronous system, if they have correct internal cohesion (doing their job proper, but not more) are up for having low Office Mapping Factor, a term I introduced in mentioned [9].
Agree on the protocol in a meeting, then run to your offices and code. After a while the persons doing asynchronous communication will have to get out of their offices to talk about how they are doing the programming. The synchronous-system programmers will only need to ask how the others are doing.
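To illustrate the WYSIWYG point in Go terms (my sketch, not from [10]): with one channel per peer, a process in a given state simply does not listen on channels that are illegal in that state, and the blocking send holds the sender back. With a single shared mailbox, any message may arrive in any state and must be handled or saved:

```go
// One channel per peer: the receiver's state decides what can arrive.
package main

import "fmt"

func main() {
	start := make(chan string)
	data := make(chan string)

	go func() {
		start <- "go"      // held back until the receiver is in state A
		data <- "payload"  // held back until the receiver is in state B
	}()

	// State A: only «start» is legal, so that is all we listen to.
	fmt.Println("state A got:", <-start)

	// State B: now «data» is legal. Nothing could sneak in between:
	// what you see in this process is what you get.
	fmt.Println("state B got:", <-data)
}
```

With a shared mailbox both messages would land in the same queue, and the code for state A would have to decide what to do if «payload» arrived first.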
Event-driven (p 3-4)
There are several other errors on page 3. Most message-based processes «remain dormant» when they are waiting for a message – also the synchronous ones! There is no busy poll. The Manifesto is probably correct in assuming that some systems do busy-poll for messages, but personally I haven't seen any over the last 35 years. Another thing, being woken up when it's not necessary is also sometimes done; I don't think that's mentioned in the Manifesto. Like Java's notifyAll(), where all thread-runnables in a set are awoken, for most of them only to discover that the wake-up was not for them. (But bear in mind that busy-poll may be smart! If you have a number of interrupts that you need to serve, then polling each of them, say every 100 µs, may be smarter than letting them all go through as standard interrupts. The worst case of the first is easy to calculate (since it's always worst, you can measure it on a scope) – the worst case for the non-busy-polled is much more difficult to calculate. That is, if you need to guarantee a worst-case behaviour.)
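A hedged sketch of that «smart busy-poll» in Go (the flags array is hypothetical, standing in for hardware status registers):

```go
// Poll all sources at a fixed period instead of taking N interrupts:
// every pass does the same worst-case work, easy to measure on a scope.
package main

import (
	"fmt"
	"time"
)

var flags [4]bool // hypothetical: set by hardware / interrupt stubs

func serve(i int) { fmt.Println("served source", i) }

func main() {
	flags[2] = true // pretend source 2 raised its flag

	tick := time.NewTicker(100 * time.Microsecond)
	defer tick.Stop()

	for n := 0; n < 3; n++ { // a real system would loop forever
		<-tick.C
		for i := range flags { // same scan every pass
			if flags[i] {
				flags[i] = false
				serve(i)
			}
		}
	}
}
```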
Next matter. I have not been able to see data to support this statement:
A non-blocking application that is under heavy load can thus have lower latency and higher throughput than a traditional application based on blocking synchronization and communication primitives. (p3)
I believe this is based on assumptions. There may be very fast (mostly) asynchronous implementations, and there may be very fast (mostly) synchronous ones. Any particular implementation may be compared with any other, and they may be more or less smart, or based more or less on a certain technology. But I have not seen any research showing that it's the technologies in the larger sense that make the difference. I have discussed this in a blog note mentioned above (Some questions about SDL).
I have a hypothesis that it's our anthropomorphising – in this case, moving the terms «wait» and «block» into our human understanding of them – that may be the cause of the assumption that «not waiting» and «not blocking» are Good (while «waiting» and especially «blocking» are Evil). There are technical reasons as well (later), but the meaning of language may explain why it catches on so well (update, and again: my Not so blocking after all is a more comprehensive analysis than what I started here):
| Wait | Block |
|---|---|
| Something we don't want to do. It's boring, and had we come in time we would not have had to wait. Everybody knows that you can't proceed with what you are waiting for while you wait… | This is even worse than waiting. We're squeezed. A person comes out of the dark and blocks you. You'll be late, and can't even send a message! Everybody knows that being blocked is terrible… |
| …except if it's because you have done your job and want to come back regularly (like reading the temperature once per second, to warn about ice on the road) | …except if it protects something (like the brakes on a car) – by at the moment doing nothing |

Depending on your «process model», waiting and blocking will not cause the processor to sleep or become blocked! The run-time system (operating system) will let any other ready process do its work. If there is nothing more to do it will of course sleep to save power, but it will never enter sleep because a process blocks – it will only pass the cycles over to another process that has more to do.
This has nothing to do with latency or throughput, which should be parameters in the spec. The implementer can use either asynchronous, synchronous or (in my opinion) mixed methodologies.
Using either a spring or a wire to pull a car will get it there. Or, using either a spring or a rod to push the car will get it there:
Also, there is «Event-driven systems tend to rely on push rather than pull or poll, i.e. they push data towards consumers when it is available instead of wasting resources by having the consumers continually ask for or wait on the data». I fail to see what the «instead of» alternative is. As discussed above, there is no busy poll (in the synchronous case), if that is the alternative. That being said, the first(?) mechanism to avoid busy poll was invented by Dijkstra in 1965: the semaphore. Advertising that a car has safety belts is hardly a good strategy these days. I fail to see why anybody would sign a petition for this, now. But the authors know this…
Event-driven (p 4)
Here’s a paragraph I could have signed since it’s about good OSI model levelling, «a conceptual model that characterises and standardises the internal functions of a communication system by partitioning it into abstraction layers.» [15]
The decoupling of event generation and processing allows the runtime platform to take care of the synchronization details and how events are dispatched across threads, while the programming abstraction is raised to the level of business workflows. You think about how events propagate through your system and how components interact instead of fiddling around with low-level primitives such as threads and locks. (p4)
The interesting thing is that they talk so much about asynchronous systems, and still they talk about «synchronization details» here. Their run-time system is called Akka («a toolkit and runtime for building highly concurrent, distributed, and fault tolerant event-driven applications on the JVM») [11]. Akka uses futures to implement these «synchronization details».
The Wikipedia article about this certainly helps [12]; however, the Scala document makes more sense [13]. (The latter is almost in the form of a sales document, sprinkled with superlatives. So The Reactive Manifesto is in good tradition. But they are both interesting reading!)
(Update and aside: see how James Long describes promises here)
Excerpts from the Scala doc. [13]:
– Note that Future[T] is a type which denotes future objects, whereas future is a method which creates and schedules an asynchronous computation, and then returns a future object which will be completed with the result of that computation.
– Above, we first import the contents of the scala.concurrent package to make the type Future and the construct future visible.
– To better utilize the CPU until the response arrives, we should not block the rest of the program – this computation should be scheduled asynchronously. The future method does exactly that – it performs the specified computation block concurrently, in this case sending a request to the server and waiting for a response.
– In many future implementations, once the client of the future becomes interested in its result, it has to block its own computation and wait until the future is completed – only then can it use the value of the future to continue its own computation. Although this is allowed by the Scala Future API as we will show later, from a performance point of view a better way to do it is in a completely non-blocking way, by registering a callback on the future. This callback is called asynchronously once the future is completed.
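To see the two styles from the excerpts side by side, here is a sketch with a Go channel standing in for Future[T] (my analogy, not Scala's API): the client may block on the result, or register a callback that runs when the result is ready:

```go
// A one-shot channel as a «future»: block on it, or hand it to a
// goroutine that acts as the registered callback.
package main

import (
	"fmt"
	"time"
)

func request() <-chan string { // returns a one-shot «future»
	out := make(chan string, 1)
	go func() {
		time.Sleep(10 * time.Millisecond) // the remote call
		out <- "response"
	}()
	return out
}

func main() {
	// Style 1: block until completed. Only this goroutine waits; the
	// run-time schedules everybody else meanwhile.
	fmt.Println("blocking style:", <-request())

	// Style 2: register a continuation and carry on.
	done := make(chan struct{})
	f := request()
	go func() { fmt.Println("callback style:", <-f); close(done) }()
	// ... do other work here ...
	<-done
}
```

Note that in style 1 the «block» costs nothing globally – which is the point of the next paragraphs.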
This is where the fear of blocking has its root: the used «process model» – as mentioned above. Again, I have discussed this in several of the mentioned blog notes. Correct me if I am (very) wrong on this:
The fear of blocking comes from the fact that the application code does not live in a world of explicit concurrency: it is some kind of object and lives in a world of objects. Concurrent behaviour is hidden behind mechanisms like future. In CSP-type languages, a process (occam) or goroutine (Go) is also sequential inside, but it is a known fact that all the components it communicates with are also process or goroutine, which run concurrently with each other. (We would say parallel when there are more cores, which is OK. Shared or distributed memory would then become an issue.) Concurrency/parallelism and the mechanisms to synchronise and communicate are first-class citizens of the language (like channels and selective choice). Blocking is a no-fear issue because you know that there are other ants in the colony. However, in the other case fear of blocking is necessary, because conceptually you are alone with a list of work probably undone. Unrelated to the implementation, a user should not need to wait for parts that conceptually do not belong together – waiting for one browser tab to finish before a new tab gets attention is 100% unnecessary! (Is Chrome 100% OK here? (2012))
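A small sketch of the «other ants in the colony» point, in Go: one goroutine blocks – potentially forever – and the rest of the system is completely unaffected:

```go
// Fear of blocking belongs to the model with one thread of control.
package main

import (
	"fmt"
	"time"
)

func main() {
	never := make(chan int) // nobody ever sends: a worst-case block

	go func() { <-never }() // this ant is blocked ...

	for i := 1; i <= 3; i++ { // ... while the rest of the colony works
		fmt.Println("tab", i, "rendered")
		time.Sleep(time.Millisecond)
	}
}
```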
The Process Oriented Programming [14] I am used to is different from the Object Oriented (or Actor Oriented(?)) model described in The Reactive Manifesto. As far as I can see, Scala takes concurrency in through a library, based on Java's process model [22]. This is somewhat contrary to Wegner, who states that «actor languages = objects + abstraction + concurrency – classes – inheritance – strong typing» [21] – with concurrency being a first-class citizen.
But I think that much of this is about explicit vs. implicit concurrency at the language level. Concurrency/parallelism is composable; so at some level «up» some concurrency is hidden in any system. (CSP even has an explicit hide operator.) But none of these methodologies necessarily build more reactive systems than the other!
I’ll repeat myself. The basic flaw of The Reactive Manifesto is that it compares apples with bananas, and to me it looks like they haven’t seen or tasted bananas. (*) A side effect is that I learnt a lot by building these arguments!
(*) I may fall victim to the same flaw myself: I have only read and heard about the Scala/Actor model – but I have only tasted bananas. That's why I'd very much like comments and corrections from critics! I would try to correct my flaws. That being said, The Reactive Manifesto is a more official document than a blog note by some programmer – it is even up for signing!
I added this when I re-read some of my sources. Wikipedia tries to put Scala, Akka and concurrency in context:
Scala standard library includes support for the actor model, in addition to the standard Java concurrency APIs. The company called Typesafe provides a stack that includes Akka, a separate open source framework that provides actor-based concurrency. [22].
Next matter:
– Therefore it is important that the entire solution is asynchronous and non-blocking.
– An application must be reactive from top to bottom. (p 4)
If I (again) take the view that this paper is about «reactive applications» as applications that are reactive in the general sense, then these statements are plain wrong. In my world a reactive application may be seen as a black box, which must be as reactive as the specification requires it to be. That the authors here have chosen a specific paradigm to solve the inside doesn’t mean that any inside has to follow that rule. It is inaccurate since their scope is not well enough defined.
I’ll explain a little more, and why I object. The Reactive Manifesto could have been called something like The Gravity Manifesto. Then everybody who would read it would reject any assertion that Gravity is green. When The Reactive Manifesto authors have chosen such a title, and the inside pretends to be Universal, then they just can’t write anything like this. In my opinion I suggest that they firmly set the scope of the manifest. If it looks and reads like Gravity then it should describe Gravity.
Then they go on to quote Amdahl’s Law in favour of their mission, which seems to be «the need for eliminating the weakest link in the chain». I do not understand why the law or graph is relevant here. I read that:
Amdahl’s law is only applicable in certain fork-join programming paradigms. Specifically, it is applicable to workloads where some code runs as a single thread followed by some embarrassingly parallel code, e.g., matrix-matrix-multiply or other HPC kernels. [16]
Maybe this is the situation they are referring to. If not, remove the graph.
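For reference, a minimal worked example of the law (my numbers, not the Manifesto's): with parallelisable fraction p, the speedup on N processors is S(N) = 1 / ((1 − p) + p/N), capped at 1/(1 − p) no matter how many processors you add:

```go
// Amdahl's law, worked for p = 0.95: the cap is 1/(1-p) = 20.
package main

import "fmt"

func speedup(p, n float64) float64 {
	return 1 / ((1 - p) + p/n)
}

func main() {
	for _, n := range []float64{2, 8, 64, 1e6} {
		fmt.Printf("p=0.95, N=%7.0f -> speedup %.2f\n", n, speedup(0.95, n))
	}
	// Prints 1.90, 5.93, 15.42, ~20.00: even a million processors
	// cannot push past the serial fraction.
}
```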
Scalable (p 5)
I’ll start with this quote:
An event-driven system based on asynchronous message-passing provides the foundation for scalability. The loose coupling and location independence between components and subsystems make it possible to scale out the system onto multiple nodes while retaining the same programming model with the same semantics. (p 4)
When I attended the World occam Transputer User Group (WoTUG) conferences in the nineties I was interested in the embedded type of applications. But there were a lot of presentations on speedup of different algorithms, processor topology, connectivity and scheduling policies, farming etc. that I had to listen to. I learnt that speedup is not simple and that the linear part (if it exists) only is a short segment of any speedup curve.
I also learnt that to scale was in many cases to move the core off-chip, to another location, leaving the shared memory behind.
I also learnt that some processes (programs) have relationships like always having to wait for data (a butterfly in an FFT) or not really needing to wait for data (like a web page that fills in the needed parts of the screen when (and if) they arrive). Some implementations will keep their semantics when they are moved to multicore; some will break their semantics, especially of course if they move out of the shared memory (like, alas, Go goroutines). Internal cohesion is always best when it's high enough but no higher than necessary: parallelising such a program is always difficult – whether the subcomponents are loosely coupled or not.
So I fail to understand why loose coupling should be a single prerequisite for scalability. Location (in)dependence is hidden in the problem that is going to be parallelised. I have discussed some of this with Ami Marowka in my Nondeterminism note (the [Parallel programming] chapter); he gave me some interesting examples.
Next point:
It is important to understand that the goal is not to try to implement transparent distributed computing, distributed objects or RPC-style communication—this has been tried before and it has failed. Instead we need to embrace the network by representing it directly in the programming model through asynchronous message-passing. (p 4)
If you farm out the programs and wait for independent data from several RPC calls in parallel, then RPC may also be fine. Saying that RPC as a general mechanism has failed leaves little respect for what RPC offers, also when it comes to inspiring analogues. I am certain it fails in some circumstances, but that is no guarantee that one's own favourite might not also fail; future-proofing is difficult. Here's what Wikipedia says about invocation and blocking in the RPC article (in addition to mentioning the Interface Description Language, IDL):
While the server is processing the call, the client is blocked (it waits until the server has finished processing before resuming execution), unless the client sends an asynchronous request to the server, such as an XMLHttpRequest call. There are many variations and subtleties in various implementations, resulting in a variety of different (incompatible) RPC protocols (Wikipedia, RPC)
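A sketch of the «farm out and wait in parallel» point in Go (the endpoint names and the stub are made up): each RPC call blocks only its own goroutine, and we wait for all of them together:

```go
// Parallel fan-out of blocking calls: blocking per call, parallel overall.
package main

import (
	"fmt"
	"sync"
)

func call(endpoint string) string { return "reply from " + endpoint } // stub RPC

func main() {
	endpoints := []string{"quotes", "news", "weather"}
	replies := make([]string, len(endpoints))

	var wg sync.WaitGroup
	for i, ep := range endpoints {
		wg.Add(1)
		go func(i int, ep string) { // one blocked goroutine per call
			defer wg.Done()
			replies[i] = call(ep)
		}(i, ep)
	}
	wg.Wait() // block here until all three have answered

	fmt.Println(replies)
}
```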
I read in the Amdahl’s Law Wikipedia article about Gustafson’s Law: some times we are not interested in speed, but solving the largest possible problem. «If the non-parallelizable portion of the problem is fixed, or grows very slowly with problem size (e.g., O(log n)), then additional processors can increase the possible problem size without limit».
«Embrace the network» the Manifesto's authors try to explain in the context of failed RPC-style communication. I do not understand.
True scalability naturally involves distributed computing and with that inter-node communication which means traversing the network, that as we know is inherently unreliable. It is therefore important to make the constraints, trade-offs and failure scenarios of network programming explicit in the programming model instead of hiding them behind leaky abstractions that try to “simplify” things. As a consequence it is equally important to provide programming tools which encapsulate common building blocks for solving the typical problems arising in a distributed environment—like mechanisms for achieving consensus or messaging abstractions which offer higher degrees of reliability. (p 5-6)
This is somewhat loaded language. I'd like to know exactly what a «leaky abstraction» is. And I'd like to see how they relate these sentences to OSI layering: at some level the network is «inherently unreliable», but at the application layer it must probably be seen as inherently reliable. I can imagine that a distributed FFT sometimes might relate to placement and connectivity, but never to reliability – and then have to do its own retransmissions.
If you have retransmission in one layer, should you also do it in the layer above? You got one message from the transport layer: it failed after n retries. If it's because the line is broken, retrying won't help. If it's congestion or something else (why should the upper layer know this?) – should it think that it needs to retransmit? If it gets a signal from the lower layer that the connection is up again, where should it resend from? This is a design question, something that's in the spec. OK, try to resend fire-alarm messages but throw away the prewarning messages? Or send only the first fire-alarm message; then they would know it's burning.
I am not certain how much a language can do for me in such a scenario.
Any protocol needs consensus. The occam and Go compilers check this at compile time. The Ariane 5 flight 501 disaster mistook one protocol for another. Sending the protocol description over a protocol is possible, but when the units come to the talking phase they must understand each other. Having control of these matters, I agree, gives a higher degree of reliability. But this is not new.
It surprises me that the authors of The Reactive Manifesto are not concerned about the asynchronous model's tilting message sequence diagrams. I have discussed this in chapter «5.1 SlopingMessageSequenceDiagram» of [9] (or press the figure). This is part of the WYSIWYG semantics discussed earlier. Here's the figure from the 2007 paper:

I think I discovered that asynchronous messages have sloping MSCs all by myself!-)
Resilient and Responsive chapters (p 6-9)
There is more on asynchronous matters in those chapters; I won’t repeat. But there are also more interesting matters:
I have read that the Akka run-time system (which I think is what they are really talking about) has been built with traits from Erlang's Supervisor module. Akka is presented to Scala as a library. In the Erlang documentation I read:
Supervisor. A behaviour module for implementing a supervisor, a process which supervises other processes called child processes. A child process can either be another supervisor or a worker process. [18]
I guess that in the Akka system there would be actors that are supervised? I like some aspects of this.
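A minimal sketch of the supervision idea in Go terms (my construction – not Akka's or Erlang's API): a supervisor restarts a child worker when it fails; restart policies, back-off and supervision trees are left out:

```go
// A naive supervisor: run the child, recover from its failure, restart.
package main

import (
	"fmt"
	"time"
)

func worker(id int) {
	defer fmt.Println("worker", id, "stopped")
	panic("simulated failure") // errors happen!
}

func supervise(restarts int) {
	for id := 1; id <= restarts; id++ {
		func() {
			defer func() {
				if r := recover(); r != nil {
					fmt.Println("supervisor: child failed:", r)
				}
			}()
			worker(id)
		}()
		time.Sleep(time.Millisecond) // naive restart delay
	}
}

func main() { supervise(3) }
```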
However, I’d like to make a final comment about exception handling, because the supervisor seems to have flavours of it. The philosophical question is why should we build mechanisms to restore a situation if it’s because we have done lousy programming? (Because errors happen!) C.A.R. Hoare discussed this in his Turing Award lecture in 1980 [19]. He was against exception handling being built into Ada, or into any language. The Go designers to some extent seem to agree with him, after some 20-30 years with exception handling present in this world.
If my Java BankID application crashes with a crash log of 33 thrown exceptions, I am only concerned about the crash, not that there was an error in the parser module, deep inside. How a supervisor would react to the same would probably also be described in the specification. I'd much rather the module didn't crash at all. That was also Hoare's concern.
Hoare had at that time written the first version of his CSP book. In 1985 he released the second version, where processes communicate over named channels instead of with named processes. This was after he and David May had designed the transputer as an occam machine. A year ago (2013) I finally took a course (and exam) on CSP and got hands-on with the CSPm language. Previously I have worked with Promela and Spin. And I would like to get my fingers on PAT (Process Analysis Toolkit [24]). All these, and many others, are contributions to increasing the quality of software. At some level software may be proven error-free or according to spec.
Even if error handling is important, it’s probably as important to learn about how to avoid some of the errors. And what to do when failures happen is a specification requirement. No programming paradigm shall decide what a module that controls the flaps on an airplane should do when it fails.
But still, I like Erlang’s impressive up-time statistics. And didn’t I read the other day that Python code had an impressingly low error rate (it has built-in exception handling)?
Finally
Programming paradigms are useful tools for making the systems we want. The risk (probability * consequence) of using these systems will not be zero. I have used «a life» to try to lower the risk of «my» software. But I am afraid, there is no silver bullet. Not even clean, blocking, synchronous designs between lots of processes.
How complex should simplicity be allowed to be?
I will finally mention my XCHAN suggestion [20]. It is a safe synchronous channel type that never blocks. Is it then asynchronous? (No!)
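Here is a hedged Go sketch of the XCHAN idea as I read [20] (the names are mine, not the paper's): a send never blocks – if the channel is not ready the sender is told at once, and a separate signal channel says when it is worth trying again:

```go
// An XCHAN-like pattern: non-blocking send plus a ready-again signal.
package main

import "fmt"

type XChan struct {
	data  chan int      // the underlying channel (capacity 1 here)
	ready chan struct{} // signalled when a failed sender may retry
}

func NewXChan() *XChan {
	return &XChan{data: make(chan int, 1), ready: make(chan struct{}, 1)}
}

// TrySend never blocks: true means the value went through.
func (x *XChan) TrySend(v int) bool {
	select {
	case x.data <- v:
		return true
	default:
		return false
	}
}

// Recv takes a value and signals any waiting sender to retry.
func (x *XChan) Recv() int {
	v := <-x.data
	select {
	case x.ready <- struct{}{}:
	default:
	}
	return v
}

func main() {
	x := NewXChan()
	fmt.Println(x.TrySend(1)) // true
	fmt.Println(x.TrySend(2)) // false: not ready, but we were not blocked
	fmt.Println(x.Recv())     // 1
	<-x.ready                 // the ready signal: now it's worth retrying
	fmt.Println(x.TrySend(2)) // true
}
```

The point is that no data is ever lost and no unbounded queue is needed: the sender keeps the value until it is told it can deliver.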
Probably in line with what I just wrote: I will have a look at the company BitReactive [8]. They have developed a tool suite and methodology for reactive systems. They are concerned about getting the properties of the system correct, and they do automatic formal analysis by a combination of what’s possible with the tool and the contracts between the modules. This looks close to what I have learnt in life. I’ll register and read more.
Updates
Smaller internal edits will not be recorded here, only changes caused by external comments (if any!-). Newest on top:
- Mar2022
- Several edits. Search for «2022». We discussed The Reactive Manifesto in a literature study group I'm in, with people from NTNU and Sintef here in Trondheim (Sverre, Erling, Henrik and myself). A catch point: «Isn't it proactive systems we want?» 😳
- 15Oct2015
- As mentioned in a disclaimer at the top, version 2.0 was published on September 16 2014 [1].
- 26Nov2013
- I have added a comment, also pointing to here, at http://pchiusano.blogspot.fr/2013/11/the-reactive-manifesto-is-not-even-wrong.html (if it’s not being moderated away, I’ll wait and see)
- 25Nov2013
- Started a discussion thread called «A blog note commenting on ‘The Reactive Manifesto'» on the Scala user group, see https://groups.google.com/forum/#!topic/scala-user/Cq1YxC1Fjjs
- 24Nov2013
- Submitted a comment about this blog note on http://www.typesafe.com/blog/why_do_we_need_a_reactive_manifesto%3F (I think you need to be logged in there to see it (but it only refers back to here))
References
- The Reactive Manifesto, at http://www.reactivemanifesto.org/. The downloadable PDF is from September 23 2013 (v1.1): http://www.reactivemanifesto.org/pdf/the-reactive-manifesto-1.1.pdf. It's a living document at github, see https://github.com/reactivemanifesto/reactivemanifesto. Version 2.0 is much shorter, only two pages: http://www.reactivemanifesto.org/pdf/the-reactive-manifesto-2.0.pdf
- Why Do We Need a Reactive Manifesto? by Jonas Bonér, see http://www.typesafe.com/blog/why_do_we_need_a_reactive_manifesto%3F. Typesafe seems to be Martin Odersky's and Jonas Bonér's company, set up to handle Scala for the industry.
- Principles of Reactive programming, Martin Odersky, Erik Meijer and Roland Kuhn at a page at the École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland, see https://www.coursera.org/course/reactive
- Real-time Java at Wikipedia, see http://en.wikipedia.org/wiki/Real_time_Java
- Reactive programming on Wikipedia, see http://en.wikipedia.org/wiki/Reactive_programming
- Axiom on Wikipedia, see http://en.wikipedia.org/wiki/Axiom
- Lemma on Wikipedia, see http://en.wikipedia.org/wiki/Lemma_(mathematics). The «Lemma (logic)» article didn’t help me much.
- BitReactive, see http://www.bitreactive.com/. This company is resident in Trondheim, where I live!
- High Cohesion and Low Coupling: the Office Mapping Factor, Øyvind Teig inCommunicating Process Architectures 2007, Peter Welch et al(Eds.), ISBN 978-1-58603-767-3, see http://www.teigfam.net/oyvind/pub/pub_details.html#TheOfficeMappingFactor
- WYSIWYG semantics, see http://web.archive.org/web/19991013044050/http:/www.cs.bris.ac.uk/~alan/Java/ieeelet.html (wait a while, the Internet Archive needs time to find it)
- Akka toolkit and run-time system, see http://akka.io/
- Futures and promises on Wikipedia, see http://en.wikipedia.org/wiki/Futures_and_promises
- Futures and promises in Scala, see http://docs.scala-lang.org/overviews/core/futures.html
- Process-oriented programming on Wikipedia, see http://en.wikipedia.org/wiki/Process-oriented_programming
- OSI model at Wikipedia, see http://en.wikipedia.org/wiki/Osi_model
- Parallel Programming: When Amdahl’s law is inapplicable?, see http://www.futurechips.org/thoughts-for-researchers/parallel-programming-gene-amdahl-said.html.
- Real-time computing at Wikipedia, see https://en.wikipedia.org/wiki/Real-time_systems
- Erlang’s supervisor, see – http://www.erlang.org/doc/man/supervisor.html
- C.A.R. Hoare Turing Award lecture (1980), see http://www.cs.fsu.edu/~engelen/courses/COP4610/hoare.pdf
- «XCHANs: Notes on a New Channel Type» , Øyvind Teig (CPA-2012), see http://www.teigfam.net/oyvind/pub/CPA2012/paper.pdf
- P.Wegner: «Dimensions of Object-Based language design«. In Proc.of the OOPSLA ´87 Conf. on Object- Oriented Programming Systems, Languages and Applications, 1987. See http://www-public.int-evry.fr/~gibson/Teaching/CSC7322/ReadingMaterial/Wegner87.pdf
- Scala programming language at Wikipedia, see http://en.wikipedia.org/wiki/Scala_(programming_language)#Concurrency
- «Life of occam-Pi» by Peter H. Welch, School of Computing, University of Kent, UK, at Communicating Process Architectures 2013 (CPA-2013). Paper: http://www.wotug.org/papers/CPA-2013/Welch13a/Welch13a.pdf. Presentation: http://www.wotug.org/papers/CPA-2013/Welch13a/Welch13a-slides.pdf. Also referenced at https://www.teigfam.net/oyvind/home/technology/079-wysiwyg-semantics/#Ref03
- Process Analysis Toolkit (PAT), by National University of Singapore, see http://pat.comp.nus.edu.sg/ (2022: not updated for five years…)
- ReactiveX / RxSwift, see https://github.com/ReactiveX/RxSwift
- clojure and core.async, see https://www.teigfam.net/oyvind/home/technology/084-csp-on-node-js-and-clojurescript-by-javascript/#clojure_concurrency
Comments

From a reader:

«I am not sold on The Reactive Manifesto either, although I am taking the course so I may have an informed opinion of it. I do agree that the document, like the Akka documentation, reads too much like a sales pitch.

I am going to ignore some of the broader, general criticisms you have. That's not because I think you're incorrect, but because I have a weak grasp of some of the concepts you mention and thus my opinion is not well-formed. Instead, I'll focus on the particular use case that got me interested in the Functional Reactive concept, which was building a set of moderately complicated web pages for an application.

In these pages, the logic was complex. If the user clicked options A and B, then option C was greyed out and option D became checked automatically. If the user filled in the text box next to option C with a string that matched a certain regular expression, options E and F would become enabled and the text box next to option F would automatically populate with a dynamically calculated value. Etc., etc. There were a few Ajax (server-side) calls in the page, but the great majority of the logic was synchronous. But it wasn't a synchronous interaction between two elements, or three; it was between thirty, or fifty. The original implementation of these pages used an enormous nested decision-tree implementation, «if … { if … { if … {} else {} } else {} } else … { }», and was loaded with errors. So instead I wrote an individual function for each element that set the element's visibility and contents based upon all the elements on the page it depended upon. Then I created a Javascript map of element ids to arrays of handler functions that were dependent upon that id. Then one global handler was added to the page, which took the ID of the element modified, looked it up in the Javascript map, and invoked each of the handler functions sequentially.

In turn, each of those handlers could change an element that would invoke yet another nested function, so the cascading changes could be very inefficient and redundant. But the logic went from difficult to manage correctly to trivially easy. I think the treatment of each element with its own handler function and the map of ids to handlers is similar to «Functional Reactive», though I did this work months before I had encountered the phrase.

I can't speak for performance, but I think giving portions of your application a handler that accepts events may be an advantage as your application gets more complex. You can test that handler in isolation, and new items that trigger events may sometimes be able to interact with that handler without requiring any new changes to the handler itself. Conceptually I see it as REST for intra-process events.»

My reply: It sounds like you were right in rewriting that system. I'd be very interested in which language (library) the synchronous system was coded, to understand the «process model» (or «unit of concurrency»). If there was some nested decision-tree there, I personally haven't seen that in any synchronous system. In what kind of module would that be necessary? I assume it's as error-prone as nested callbacks, which node.js code presumably may end up with. Rob Pike showed a complete synchronous window system, mentioned above. Perhaps you should read it and see if that functionality may be compared with your system. There is a CSP library for Python, called PyCSP, which I think can be used client side – and I'd love to try it. Do observe that farming out an «asynchronous job» is standard procedure also for a synchronous system, because one would naturally neither block nor poll for the reply. Study Go code, and its select statement.