1 Introduction
Actor languages and libraries support an easy-to-use mechanism for expressing concurrent computations. The program spawns as many actors as needed, and actors strictly communicate via messages. Pure message passing eliminates the possibility of race conditions arising from a lack of mutual exclusion, a major problem in shared-state concurrency. An investigation (Tasharofi et al., Reference Tasharofi, Dinges and Johnson2013) of actively used actor programming libraries suggests, however, that programmers often use actors with threads and shared state, if available. When interviewed, programmers tend to point to the ease of sharing some state between computational components running in parallel. In short, programmers wish for some amount of sharing among actors.
The dataspaces actor model supplements the message-exchanges of the actor model with a space for sharing state in a controlled manner (Garnock-Jones et al., Reference Garnock-Jones, Tobin-Hochstadt and Felleisen2014; Garnock-Jones & Felleisen, Reference Garnock-Jones and Felleisen2016; Caldwell et al., Reference Caldwell, Garnock-Jones and Felleisen2020). It accepts the actor model’s premise that computation is conversation—exchange of messages—but it also embraces the demand of conversations for context. Roughly speaking, a context consists of the public information—corresponding to fragments of an actor’s current state—that the participants remain aware of for a period of time. Such public information, dubbed an assertion, is placed into a shared dataspace. From a programmer’s perspective, an assertion is a part of an actor’s state that it wishes to share publicly.
From the dataspace’s point of view, an actor is a function that reacts to changes in the published states of other actors. Even a message is a change to this shared state: it shows up and disappears immediately. Indeed, basic message-passing may be expressed with dataspaces in just this manner. The functional, black-box interface for actors lends itself to system-level reasoning but, when it comes to expressing the behavior of an individual actor, a purely functional notation is not a good fit. The core programming tasks—situationally engaging and disengaging in behavior and managing and reacting to changes in shared state—are inherently temporal. Hence, the model calls for a novel way of expressing a concurrent actor in a context-sensitive conversation.
In response, this paper presents the design of a language of facets and a system for reasoning about their temporal behavior. From the perspective of this facet language, an actor consists of several pieces—the facets—each of which represents the actor as a participant in distinct (parts of) conversations. The concept addresses the central concerns of actors:
-
• engaging and disengaging in conversations;
-
• maintaining local state;
-
• permitting access to shared, public state; and
-
• reacting to external events.
The benefits of facets are evident in improving the organization of dataspace actor programs and supporting a conversational style of concurrency. Furthermore, facets directly enable type-based static reasoning about communication.
Dining Concurrency Researchers. To illustrate our meaning of conversational concurrency, imagine a table of dining and conversing attendees at a conference on concurrency research. Experience suggests a number of common characteristics for such a scenario:
-
• The conversation is likely to branch into several subconversations as new topics are broached. Previous topics may be returned to—or not.
-
• The participants may engage in one subconversation or several simultaneously; they may leave one conversation to join another, only to return later to the first one.
-
• The set of participants changes over time. Late-arrivals may join the table after the conversation commences while early-departers exit the conversation before its conclusion.
-
• The conversation is contextual. A newcomer to the conversation is likely to listen for a while to pick up the context. Alternatively, this newcomer may explicitly state an interest in the contextual background.
-
• Finally, these participants identify one another through a number of characteristics, including their presentations at the conference, their interests, their association with particular institutions, and so on. They may or may not know each others’ names. Moreover, a particular participant’s identifying characteristic may vary from conversation to conversation.
While Djikstra’s dining philosophers illustrate how coordination avoids certain pitfalls of concurrency, namely deadlocked program states, the scenario of dining conference attendees paints a portrait of how developers desire well-designed concurrent programs to behave.
The notions of dataspace and facet equip developers with mechanisms to program in this conversational style: means for representing and sharing context, evolving actor behavior in the face of shifting demands, and a flexible notion of identity and participation in a conversation. Sections 2 and 3 return to these ideas.
Roadmap. The presentation proceeds in three steps: Sections 2 through 4 introduce the ideas via examples; the formal models of facet actors and dataspace programs are the subject of Sections 5, 6, and 7, respectively; and Section 8 states the key theorem concerning the temporal type checking of facet actors. Section 9 briefly explains the implementation, and Section 10 reports the results of evaluating the ability to verify behavioral properties of realistic programs. The remaining sections are about related and future work.
2 Background: Dataspaces, actors
The dataspace actor model generalizes the publish/subscribe protocol of Linda’s Tuplespaces (Carriero et al., Reference Carriero, Gelernter, Mattson and Sherman1994). When actors wish to share pieces of state with others, they deposit those with a dedicated actor, the dataspace. All actors connected to a dataspace can query this shared space. The dataspace thus controls how published state is shared within a group of conversing actors.
A simulation of a smart home allows a comprehensive illustration. Let us say the smart home supports three kinds of devices: smart lights; a presence sensor, which detects how many people are in a room; and a control hub for accessing and controlling the other devices. The simulation expresses each of these devices as an actor (Light, Sensor, Hub) embedded in a common dataspace.
Figure 1 depicts a moment in such a simulation. The connecting dataspace is the large rectangle. Each actor has deposited aspects of its state pertinent to the others; these are called assertions.Footnote 1 For example, the Sensor actor has published the assertion inRoom[3], denoting the number of people in the room. Published pieces of state—assertions—are made up from values (e.g., numbers, strings) as well as immutable data structures (lists, structs).
An actor subscribes to publications of state of others via assertions of interest. In the simulation, the Light actor turns a bulb on or off based on the occupancy of the room, meaning it needs to know about the inRoom assertion. To get this information and keep it up-to-date, it registers ?(inRoom[ $\star$ ]) with the dataspace: inRoom[ $\star$ ] is an assertion pattern where $\star$ is a match-anything wildcard and ? turns it into an assertion of interest, or just interest for short.
Communication in dataspace programs arises from matches between interests and assertions. The dataspace provides a routing mechanism, depicted in Figure 1 as a cloud, for detecting such matches. Routing tracks changes to the set of assertions matching each actor’s interests. When this set changes, the dataspace generates an event describing the change and informs the interested actor(s). In the running example of the smart home, the presence Sensor actor updates its assertion from inRoom[3] to inRoom[2] if someone leaves. In turn, the dataspace informs the Light actor of this change, because of its interest.
An actor’s response to an event is one of the following actions:
-
• it may add and/or subtract assertions from its published state;
-
• it may spawn additional actors; and
-
• it may exit, by choice or by an uncaught run-time exception.
When an actor exits, the dataspace removes all of its assertions. Since this removal is a change to the assertion set, it triggers an event for all interested actors. In response, other actors may react to component failures in a graceful manner. For example, when a light burns out in the smart home simulation, its corresponding Light actor exits. This triggers the removal of its light[ON] assertion. In response, the dataspace informs the Hub actor because it is interested in such assertions. This actor may then take appropriate action, such as display a notification.
Dataspace communication is anonymous. That is, the model does not provide actors with an address or process-ID (PID) to serve as a communication handle. Rather, correlation information—that is, data dictating to which actor(s) some data is relevant—must be included in the exchanged assertions and interests. This allows identifying information to be encoded in a domain-specific fashion. For example, the Light actor could include its serial number in its assertion light[serial , ON], allowing future communications specific to that device to refer to it by its serial number.
Dining Concurrency Researchers. The dataspace model addresses several needs of the conversational style introduced in the dining concurrency setting of the introduction:
-
• Assertions directly represent conversational context.
-
• Anonymous actors give a flexible notion of identity and conversational participant.
-
• The notion of a change-in-state as an event merges the means through which a new participant receives prior context in a conversation with normal communication.
-
• Similarly, notification of withdrawn assertions is the same—and automatic—whether via normal behavior or exceptional crashes.
In sum, the dataspace model abstracts over the processing of routing events by actors. The model requires only that an actor has a functional event–transducer interface:
How to express an implementation of this interface is an entirely different question.
3 Facets
With the facet notation, a programmer expresses an individual actor as a tree of facets. Roughly speaking, a facet groups some of the actor’s state with the behavior related to a particular conversation. The behavior must specify the contributions (assertions and interests) to a conversation together with reactions to the utterances of other participants. The tree structure reflects the tendency of conversations to branch from one another while accruing context. This section introduces the facet language with the code for some of the actors from the smart home simulation of the preceding section.
Figure 2 presents the code for the Light actor written in our Racket-based implementation. The listing includes the spawn expression that launches the actor (line 8). The actor engages in a single conversation concerning the presence of people in the room and the state of the light. Hence, it starts a single facet (line 2), named light-facet. The start-facet form imperatively extends the actor’s facet tree with additional behavior. The new facet joins the tree as a child of the enclosing facet. The first facet started by the actor, as is the case for light-facet, becomes the root of the tree.
A facet keeps the actor-relevant state in fields. The light-facet facet creates a single field, which keeps track of the current state of the light (line 3). This light-state field holds values of type Bool and is initialized to ON, a constant defined elsewhere in the program. The expression (! light-state) accesses the current value of the field, while (:= light-state OFF) updates it. The type of the light-state field handle itself is (Field Bool), analogous to the type of a mutable reference cell. A type annotation on a field is optional; when elided, it defaults to the type of the initialization expression.
A facet interacts with the connected dataspace via endpoints. Endpoints come in two varieties—assertions and event handlers—corresponding to incoming and outgoing information. The light-facet facet comes with three endpoints:
-
• The first one is an assertion endpoint (line 4). It shares information about the state of the light with the dataspace. The assertion is a struct of type light. The light assertion carries one value, the contents of the light-state field accessed via $\mathtt{!}$ . The endpoint establishes an assertion in the dataspace while the facet it belongs to is active. In this case, since the actor boots with the light-facet facet, it initially makes such an assertion. Moreover, the facet keeps its assertion up to date. When an assertion endpoint refers to a field, the endpoint and field establish a dataflow link. The dataflow mechanism propagates field updates in order to keep the actor’s assertions up-to-date with respect to its fields. Thus, every time light-state changes, the actor automatically withdraws the previous light assertion and replaces it with a version reflecting the new light-state.
-
• The other endpoints are $\mathtt{on}$ (event) handlers (lines 5-6). Each consists of two pieces:
1. The event specification describes the nature and structure of the event. The nature of an event is expressed in terms of a keyword— $\mathtt{asserted}$ or $\mathtt{retracted}$ —corresponding to the appearance or disappearance of assertions in the dataspace. The structure of the event is expressed with a pattern, articulating a query for specific structures in the dataspace. The actor automatically issues an assertion of interest corresponding to the event handler’s pattern. In the running example, the event handler wishes to know about in-room assertions in the dataspace; it does so with placing ?(in-room 0) in the dataspace. An event handler’s assertion of interest is initialized in the same manner as an assert endpoint. Thus, the actor boots with assertions of interests corresponding to its initial event handlers. Moreover, if the patterns of the event handlers reference the values of any fields, the assertions of interest are kept up to date through the same dataflow mechanism.
2. The body part of the handler is code that executes in response to every matching event. In the running example, each event handler’s body updates the facet’s state via the light-state field. Recall that updates to this field automatically propagate to the facet’s assertion via dataflow.
The light-actor function is effectful. When called from inside an actor spawn statement (line 8), or the body of a facet’s event handler endpoint, it has the effect of starting a new facet with the described behavior. In our implementation, programmers may mix facet operations, field creation, and endpoint creation together with definitions and expressions. With this flexibility, any aspect of an actor’s behavior may be defined inside a procedure. This can improve the readability of the code as well as facilitate re-use and sharing of code among different actor definitions.
Figure 3 visualizes the light-facet facet of the Light actor. The facet encapsulates the light-state field and the two endpoints. The solid-blue arrow indicates the dataflow link between the field and the assertion endpoint; the dotted-blue arrow indicates that a value flows from the body of the on handler into the field. Finally, the red arrows show the manifestation of the endpoints in the dataspace as an assertion and an interest for the two endpoints, respectively.
Let’s equip the smart home simulation with additional features and observe how the language of facets facilitates the corresponding adaptation of the Light actor:
Wall Switches control the power to lights in a room. When a switch is off, the light must be off and is unable to communicate with other devices. Flipping the switch on turns the light on and allows for communication with, and control by, other devices.
Multiple Rooms complicate the simulation, too. Certain interactions are possible only among devices located within the same room.
Configuration is essential for software systems. Initially, the only devices are the control hub and the wall switches. The user may install lights and presence sensors. After installing a device, the user must assign it to a specific room via the hub.
Figure 4 shows how to adapt the implementation of the Light actor to this revised scenario. An unrelated difference to Figure 2 is the use of spawn inside of the spawn-light-actor function—a simplification of the presentation. This function is now called by a UI component when the user installs a light. It takes one String-typed argument. The argument designates to which wall switch the light is connected; that is, the argument represents the inherent connection between elements on a power circuit. Once the function is called, it generates a unique ID for the light (line 2), akin to a manufacturer serial number. After that, it spawns the light actor with one initial facet (lines 3-4).
Equipped with a basic understanding of this function, let us look at the revised light-facet facet. Recall that the actor should not engage in any behavior unless the connected switch is flipped on. Once it is on, the actor participates in conversation(s) until the switch is turned off. A basic facet-oriented actor would start a facet in response to the first kind of event and shut it down when the second kind occurs. This pattern is so common that the facet language comes with a during expression (line 5), which combines an assertion pattern that can be used with either asserted or retracted.Footnote 2
Here the facet operates while the assertion of (wall-switch wall-switch-id ON) exists in the dataspace—representing the interval of time during which the room’s switch is in the on position. As the switch is flipped into the on position, the facet announces its existence and (ON) state using a field and assertion endpoint as before (lines 6-7). This light assertion allows the Hub actor to detect the installation of a new device. If this is the first time the Hub actor encounters this device ID, it prompts the user to assign it to a room. Once the assignment is complete, the Hub actor deposits a room-assignment assertion (line 8) to inform the Light actor. The subpattern $room binds the corresponding portion of the room-assignment assertion to the name room. Knowing what room it is in allows the Light actor to converse with any presence sensor in the same room, turning on and off accordingly (lines 9-10).
Multifaceted Actors. The revised implementation of the Light actor starts several facets. Eventually it may run three concurrent facets: light-facet and one per during expression. Each of these facets corresponds to a particular context within an ongoing conversation. Furthermore, the facets are nested, just like actor conversations. When one facet starts another, the first is the parent to the second. Stopping a facet (via stop) means terminating it and all of its children. In the context of the Light actor, the wall-switch facet may shut down when the switch is flipped off; in this case, shutting down this facet also shuts down the nested facets associated with during expressions. A stop form may optionally specify some continuation behavior, as in (stop $\mathit{orig}$ (start-facet $\mathit{cont}\ldots)$ ). In that case, the $\mathit{cont}$ facet takes the place of $\mathit{orig}$ in the tree. Continuation facets address the need to transition from one task to another.
Wider Facet Trees. While the facets in the Light actor are nested in a linear fashion, bushy trees are equally common. Consider the Hub actor. A user interacts with the Hub to configure each light. Once configured, the Hub both monitors and controls these lights. To accomplish this task, the actor starts a facet for each separate conversation with the various Light actors.
Figure 5 shows a portion of the actor’s implementation. The actor starts a facet named hub (line 5), which creates one field (lights, line 6) for storing information about the lights in the system. The field is a dictionary. For each registered light, this dictionary associates the light’s ID with a light struct that records its room and its known state. The auxiliary function set-light-state (lines 27–29) takes a reference to the lights field as an argument, of type (Field LightDict), and updates the dictionary for a particular light.
Using during, the actor converses with each individual light. The during reacts in response to new light assertions.Footnote 3 Such assertions can mean one of two things: the installation of a light or the restoration of power to an already installed light. In the first case, lights does not contain an entry with that ID; in the second case, it does. Each of these situations calls for different behavior, but no matter what, the reaction should happen only when the facet starts up.
To express this idea directly, the language enables facets to react to the start and stop events, respectively. That is, instead of a reaction to a dataspace assertion, a facet’s endpoints may specify a start or stop specific reaction. Here the during facet uses start to decide how to act (line 8). In the handler (lines 9–11), it uses lights to look up what it knows about the light. In the first case and match branch, the facet retrieves the light’s room assignment and starts another facet to control it via control-light (lines 19–25). In the second case, the actor calls get-room-from-user, which engages in yet another dialog with the user to assign the new light to a room. This nested facet stops itself, with some explicit continuation behavior: a call to control-light.
The control-light function starts a facet to assert the light’s room-assignment and to react to changes in its state. When the state changes, the facet updates lights. Finally, when the facet stops—due to termination of the outer during—it records the light as off. Stop handlers, like exception handlers, enable separation between the decision to disengage in behavior and the definition of how to clean up in that event. In this case, the decision is implicit in the during on line 7’s retracted event handler. Meanwhile, the code for cleaning up is in the control-light function (line 24). Crucially, the need to clean up only arises after an initial provisioning step, so it would not make sense to have the functionality at the same level as the during. In the case of a crash, an actor does not run any stop event handlers. Since the state of the actor could be inconsistent, with fields partially updated, orderly shutdown may not be possible.
Figure 6 visualizes the facets of the Light and Hub actors. In the case of the Light actor, the tree is deeply nested, as the actor engages in new behaviors as it accumulates conversational context. By contrast, the Hub actor’s facet tree exhibits breadth, as it starts a new facet in reaction to each installed light. In turn, each of those facets have sub-trees either to interact with the user or to control the light. Each sub-tree of the hub facet is independent of the others, evolving based on its interactions with its controlled light and the user without needing to know the current state (or existence) of the others.
3.1 Derived forms
Figure 7 displays derived forms for common patterns. One exceedingly common pattern is for a facet to be constructed in response to the appearance of an assertion and torn down in response to its disappearance. The Light and Hub actors both fall into this category. We abstract this pattern into the during form. Each time the asserted event fires, any pattern variables bound in pattern are instantiated to yield a concrete assertion, $\textit{pattern}'$ . The (on (retracted …)) endpoint monitors this assertion in the new facet.
Figure 7 also shows a related derived form, during/spawn. In contrast to during, this form does not create a facet within the actor but spawns an actor in the dataspace. The critical difference concerns failure. While a failure in during tears down the current actor, a failure in during/spawn terminates the separate actor but leaves the spawning one alone.
Both during and during/spawn allow the programmer to directly express matching of supply to demand. Assertions matched by the pattern are interpreted as demand; the during-body constitutes the supply. As demand increases via new matching assertions in the dataspace, the supply expressions are executed to match. As demand decreases again due to the withdrawal of assertions, (on (retracted …)) endpoints automatically terminate the facet or actor in response. Resources are allocated via (on start …) clauses and released in (on stop …) clauses in response to changing needs.
Figure 7 defines a second family of derived forms concerning the integration of newly arrived assertions into local fields. While actors often just deal with singleton assertions, other scenarios call for a set, a hash table, or even an aggregate summary of several assertions. To support this idiom, the facet language provides derived constructs called queries. Queries define and subsequently update fields, making their results available for use in neighboring facet endpoints.
One such form, define/query-value, is useful for local tracking of an assertion kind in which there may be zero or one instance in the dataspace. For example, we could extend the smart home program with persistence, so that the user’s assignments of devices to rooms are saved between runs of the program. A separate actor would handle persisting such configurations to the file system or other location and loading them when the program starts. Then, when the Hub actor boots, it needs to operate based on a reloaded configuration or start with a default fresh one. It may use
to create a field that defaults to DC (short for default configuration) value unless there is a configuration assertion, in which case it uses the value of that assertion.
The second such form, define/query-set, collects information from relevant assertions in a set.Footnote 4 A user-interface component for the smart home may use it to show the user all of the lights currently installed in the system:
The query scheme directly generalizes to structures such as hash tables, etc., and also allows aggregations analogous to SQL’s COUNT(*) and GROUP BY.
3.2 The case for facets
Returning to the dining concurrency researchers example, facets support a conversational style in several ways:
-
• Engaging in a new conversation is as simple as starting a facet. Likewise, stopping a facet disengages from a conversation.
-
• The during form directly connects context to behavior.
-
• The tree of facets mirrors the way (sub)conversations branch from existing conversations.
Defining actors using facets provides additional advantages over basic functions and objects, as well. Appendix A provides an implementation of the light actor using a procedural notation as a concrete basis of comparison, demonstrating and explaining these benefits.
Generally put, the facet notation combines and extends a number of concepts from other languages and empowers programmers to use them in a synergistic manner. For example, end-point programming may remind the reader of the event-handling and reactive programming style of Esterel (Berry & Gonthier, Reference Berry and Gonthier1992) and Crime (Mostinckx et al., Reference Mostinckx, Lombide Carreton and De Meuter2008). Dataflow exists in many forms (Demetrescu et al., Reference Demetrescu, Finocchi and Ribichini2011). Updates to behavior have been around as long as the Actor model (Hewitt et al., Reference Hewitt, Bishop and Steiger1973). The during behavior is the clearest example of the synergistic use of these features. It precisely describes a temporal engagement with the dataspace; sets up and tears down local behavior as needed; and simultaneously takes into account the accumulated conversational context.
4 Static checks: Types and behavior
Programming actors requires the development of protocols. A protocol coordinates the flow of data among actors in a conversation. Designing the program requires reasoning about protocols while writing code. An actor language ought to assist programmers with this reasoning process and ideally with checking it statically.
The language of facets comes with a structural type system and a behavioral one. The first is somewhat conventional (Section 4.1); also see Caldwell et al. (Reference Caldwell, Garnock-Jones and Felleisen2020)’s work on structural types for dataspaces. The second enables programmers to specify an actor’s dynamic behavior (Section 4.2) and perform some static checking (Section 4.4). The bridge between the two is an effect-type system (Section 4.3). Section 4.5 applies this checking to an example specification from the smart home.
4.1 Basic types
The starting point is a structural type system. Base types describe data such as Int, Bool, etc. A struct-type definition such as (struct light ([on? : Bool])) introduces the type constructor LightT.Footnote 5 . Thus, (light #true) has type (LightT Bool).
Our previous work on structural types (Caldwell et al., Reference Caldwell, Garnock-Jones and Felleisen2020) for purely functional dataspace actors validates that it prevents mistakes involving the shape of assertions. That is, type checking incorporates knowledge of how a dataspace routes assertions among actors and uses the types to confirm that assertions cannot upset routing. It is straightforward to adapt this system and its proof of soundness to facet-oriented dataspace actors.
4.2 Specification
Like for structural properties of routing assertions, programmers need a language for behavioral—that is, temporal—properties, too. Like structural types give a syntax and semantics to claims about properties of assertions, behavioral types should allow the programmer to make claims concerning the behavior of actors in a dataspace. We employ a form of linear temporal logic (LTL) for this purpose, though of course, its basic propositions must somehow involve the assertions that actors deposit into and withdraw from dataspaces. In addition to introducing this language, this section also informally explains what it means for a facet-based actor program to live up to such a specification.
LTL has a long history as a specification language for concurrent programs and the basis for mechanized checks (Manna & Pnueli, Reference Manna and Pnueli1991; Holzmann, Reference Holzmann1997; Pnueli, Reference Pnueli1977). When compared to other temporal logics, such as the branching-time computation tree logic (CTL), LTL is typically considered more intuitive and at least as amenable to efficient model checking (Vardi, Reference Vardi2001; Rozier, Reference Rozier2011). Moreover, compared to typical protocol description languages such as choreographies (Carbone & Montesi, Reference Carbone and Montesi2013) or session types (Honda et al., Reference Honda, Yoshida and Carbone2008), LTL allows the focus to be on what is communicated rather than who does the communicating, suiting the anonymous style of dataspace programming.
The base LTL predicates are types that describe assertions in the dataspace. For example, the formula (LightT Bool) holds at any moment iff some actor has deposited an assertion (light #true) or (light #false) in the dataspace. Specifications may also use negation (Not), and (Not (LightT Bool)) means no assertion of this structure type is in the dataspace.
Other connectives are conjunction (And), disjunction (Or), and logical implication (Implies). With temporal connectives, a programmer can specify the evolution of dataspace programs. The proposition (Always (LightT Bool)) holds if the dataspace contains a light assertion at every moment during execution, regardless of which actor has deposited it there. Meanwhile,
says that eventually there will not be a (wall-switch-on) assertion in the dataspace, but until then there is a light assertion. The formula
describes an execution state with an assertion matching in the dataspace or the execution arrives at such a state within a finite number of steps.
A dataspace program satisfies a specification if its dataspace evolves according to the interpretation of the LTL formula. Thus, a program lives up to the just-mentioned formula if, at some point during execution, an actor makes a matching light assertion. In particular, any dataspace program that includes the light-facet actor from the preceding section satisfies this formula, because it makes a light assertion as it boots.
Consider another proposition:
It is the strong version of Until, requiring that (Not wallSwitchOn) eventually be true.
Take the dataspace program that runs the two actors on the left:
The evolution of the dataspace is on the right. It shows which assertions are in the dataspace for the first four time steps, starting with the boot state.
After booting, actor 1 makes a light assertion and actor 2 adds a (wall-switch-on) to the dataspace. Hence, the dataspace configuration at that point matches the antecedent of the implication. Consequently, the Until property must also describe that state so that the latter models the complete formula. Due to actor 1’s assertion, actor 2 stops its only facet, leading to the withdrawal of the assertion. Since the light assertion remains active until that point, the implication still holds. Furthermore, actor 1 notices the disappearance of the (wall-switch-on), and in response, replaces its own assertion the dataspace. As a result, the next configurations also match the antecedent of the implication as does the until property.
While the notion of satisfaction is intuitive, the eventual goal is to mechanize this type-checking of specifications. In this regard, the sample explanations are also suggestive. What is needed is an abstract description of the behavior of the actors and their interaction with dataspaces via assertions. And again, just like for structural type checking, it is the shape of exchanges that can serve in this role, not the precise assertions.
4.3 Effect types for facets
The end of the preceding section almost dictates the third step. Historically, type systems research uses effect-type systems for descriptions of program behavior (Lucassen & Gifford, Reference Lucassen and Gifford1988). In the setting of facet-oriented actors, the to-be-observed effects are caused by actors overall, facets, and endpoints. Hence, each corresponding term-level construct comes with a type-level construct for describing the type of its effects. See Figure 8 for the actual notation.
Let us illustrate the synthesis of effect types with the code for spawning the simplistic Light actor from Section 3 (without rooms and power switches). Ignoring fields and expressions, the code describes a single actor and a single facet nested within this actor. The facet itself comes with three end points: a plain assertion endpoint, an on-asserted one and an on-retracted one. The behavioral type mirrors this description, including the ordering of the endpoints within the facet:
Here is how to read the type in terms of the actor’s communication behavior:
-
• it makes an assertion of type (LightT Bool);
-
• it expresses interest to assertions of type (InRoomT Int);
-
• it reacts to their appearance and disappearance; and
-
• its reactions do not change its communication behavior.
For this last point, note that the bodies of the On types are empty.
From this perspective, a behavioral type is a simplified actor. This simplified actor communicates via types of assertions rather than value assertions. By taking the behavior type of each actor in a program—say, the smart home program—we get a communication-only program, that is, a collection of simplified actors and their types of assertion exchanges with dataspaces. Accordingly, the number of possible behaviors is much lower—and that is precisely what enables a mechanical checking of (some) specifications.
4.4 Checking specifications mechanically
The transition of possible interactions from concrete value assertions to types of assertions enable specification checking. At a rather high level, the checker infers effect types from a program and compiles them to the language of a model checker. The compilation encodes effect types as simplified actors. This model-checking program is combined with the programmer’s LTL formula. If the model-checking attempt fails, the error is reported in terms of a trace of these simplified actor programs.
The theory behind this idea is presented in Sections 5 and 7. Section 8 verifies why this approach works. The implementation is sketched in Section 9. Section 10 evaluates its ability to check behavioral properties of realistic programs. The following subsection illustrates the working of the model checker with an example.
4.5 Revisiting the smart-home example
To demonstrate the power of the specification language, let us revisit the smart home example from Section 3. The code presented in Figure 4 is buggy, and attempting to check it against a natural specification exposes the bug, Here is the specification of the expected behavior of the lights:
This specification describes the expected outcomes of interactions between the presence sensor, light, and light-switch actors. It states that the light turns on and off in response to the sensor actor’s InRoomT assertions, as long as it is powered (WallSwitchOn).
When analyzed with respect to this specification, our implementation of LTL checking reports the error and provides a counterexample in terms of the involved actors and their assertions:
If a presence sensor is installed and configured after the light in the same room is turned on even though it is empty, the light stays on due to the sensor’s initial reading.
Based on this trace, we can identify the active assertions and thus the active facets of the Light actor. This assessment provides a starting point for determining where to locate the bug. A close look reveals that the light-facet facet in Figure 4 reacts only to the retraction of (in-room room 0) to determine that the room is occupied. If the room is occupied when the light regains power, no such retraction will take place. To fix this problem, we can amend the facet to query the state of in-room assertions on startup. The boxed code in Figure 9 is all that is needed to make the actor satisfy its LTL specification.
5 Facets: The semantics
A proper language design should come with a rigorous blueprint, especially if its rationale includes claims about the reasoning power of its type system. The presented language consists of two pieces: the facet notation and its connection to dataspaces. The former deserves a formal semantics; the latter has been presented thoroughly elsewhere (Garnock-Jones & Felleisen, Reference Garnock-Jones and Felleisen2016; Caldwell et al., Reference Caldwell, Garnock-Jones and Felleisen2020), but Section 7.1 provides an abbreviated treatment so that the meta-theorems can be stated in a self-contained manner.
This section presents the formal model of a single facet-based actor (Sections 5.1 and 5.2). It describes how a facet-based actor updates its state in response to an event: the evolution of the facet tree; the invocation of event handlers; and field maintenance.
5.1 Formal syntax
Figure 10 defines the abstract syntax of facet-oriented actors. Statements $\mathrm{Pr}$ imperatively update the actor’s state in one of several ways:
-
• $\mathtt{start}\,\,\,{\mathrm{fn}}\,\,\,{\overrightarrow{\mathrm{e}}}\,\,\,{(\overrightarrow{\mathrm{D} \,\,\mathrm{Pr}})}$ —starting a facet with name $\mathrm{fn}$ , assertions $\overrightarrow{\mathrm{e}}$ , and event handlers $(\overrightarrow{\mathrm{D} \ \mathrm{Pr}})$ ;
-
• $\mathtt{stop}\,\,{\mathrm{fn}}\,\,{\mathrm{Pr}}$ —terminating a facet, along with all of its children, and engaging in some continuation behavior ${\mathrm{Pr}}$ ;
-
• —creating a mutable field for $\mathrm{Pr}$ ;
-
• ${\mathrm{x}}:={\mathrm{e}}$ —assigning a new value to a field; and
-
• —spawning an additional actor.
Statements may introduce local, lexically scoped variables with . Otherwise, they are just like statements in other programming languages that compose sequentially ( ${\mathrm{Pr}};{\mathrm{Pr}}$ ). For technical reasons, the grammar includes no-op (skip).
The set of expressions comprises
-
• b—basic values;
-
• ${p}({\overrightarrow{\mathrm{e}}})$ —invocations of potentially partial primitive operations;
-
• $\mathrm{x}$ —references to variable names;
-
• $!{\mathrm{x}}$ —field access;
-
• $?{\mathrm{e}}$ —assertion of interest in $\mathrm{e}$ ;
-
• $({\overrightarrow{\mathrm{e}}})$ —tuple with tag m;
-
• $\star$ —wildcard assertion patterns/templates; and
-
• ${\mathrm{x}}:{\tau}$ —binding patterns.
A vector of assertion templates ( $\overrightarrow{\mathrm{e}}$ ) defines the assertions made by a facet. Templates describe families of assertions built with labels (m), interests ( ${?}$ ), wildcards ( $\star$ ), and expressions ( $\mathrm{e}$ ).
Event handlers consist of an event description ( $\mathrm{D}$ ) and a body ( $\mathrm{Pr}$ ). Events describe occurrences that are either internal to the actor ( $\mathtt{start}$ , $\mathtt{stop}$ ) or particular assertions in the dataspace ( $\mathtt{asserted}\,\,{\mathrm{e}}$ , $\mathtt{retracted}\,\,{\mathrm{e}}$ ). In the latter case, the description includes a pattern expression $\mathrm{e}$ defining the relevant assertions. In order to construct patterns, expressions additionally include match-anything wildcards ( $\star$ ) and binding variables ( ${\mathrm{x}}:{\tau}$ ).
In comparison to the implemented syntax, the model uses some obvious short-hands and some less obvious simplifications:
-
• The assertion endpoints of a facet are all declared together.
-
• The event handlers come as a sequence of description/handler body pairs.
-
• The fields shared by a facet’s endpoints are declared outside the facet.
-
• The assertion labels replace struct names.
-
• Binding variables in patterns require a type annotation.
Here is the basic Light actor from Figure 2 in the model’s syntax:
5.2 Semantics
Since individual actors consume events and produce assertions, we choose to specify the semantics of facet-oriented actors via a labeled transition system. Each transition acts on a machine state and constructs the next one. The transition labels $\mathit{l}$ describe either inputs in the form of event patches ( $\Delta$ ) or outputs in the form of spawned actors ( $\overline{{\mathrm{Pr}}}$ ). An unlabeled transition relation ( $\bullet$ ), represents actor-internal work. Patches $\Delta$ comprise two disjoint assertions sets $\pi$ , representing added and removed assertions. Assertions $\mathrm{u}$ range over basic values, labeled tuples, and assertions of interest.
The formulation of machines requires syntax for describing the states in addition to the syntax of programs. Figure 11 defines this syntax and the machine states. The error state represents an actor that has crashed due to an unhandled error.
An actor machine state consists of five components: . The components track different kinds of state and knowledge:
-
• the activity state of facets
The machine must express which facets are currently active as well as the parent-child relationship among them. This aspect takes the form of a tree: $\mathrm{FT}$ . A node $\mathrm{fn}\,\,[{\overrightarrow{\mathrm{e}}}(\overrightarrow{\mathrm{D}\,\,\mathrm{Pr}})].\overrightarrow{\mathrm{FT}}$ in this tree describes an active facet in terms of its name, its assertion and event handler endpoints, and its children, while the empty tree takes the form $\epsilon$ .
-
• the instruction stream within the active facets
A machine state identifies a sequence of ready-to-perform instructions $\mathrm{I}$ , and it also maintains a queue of pending scripts $\mathrm{PS}$ . A pending script ( ${\mathrm{fid}},\,\,{\mathrm{Pr}}$ ) is a sequence of statements to be performed and pertinent context information. The context is a facet ID $\langle{\overrightarrow{\mathrm{fn}}}\rangle$ identifying the facet from which the script originated. A facet ID is a sequence of facet names forming the path from the root of the tree to that particular facet. Instructions $\mathrm{I}$ can $\mathtt{start}$ a new facet at a designated location in the tree; $\mathtt{stop}$ a running one; and $\mathtt{spawn}$ an actor.
-
• the information known to the actor
Specifically, the store $\sigma$ holds the contents of the fields, while the set of assertions $\pi$ keeps track of the actor’s current knowledge of assertions in the dataspace. Newly booted facets receive these assertions as an event.Footnote 6
Finally, we need some miscellaneous pieces of syntax. An $\mathrm{Evt}$ is either a patch or facet $\mathtt{start}$ / $\mathtt{stop}$ . Evaluation reduces expressions to values $\mathrm{v}$ .
Figure 12—above the horizontal line—shows the initial machine state $\mathrm{M}$ for the light-facet facet from above. The rest of the figure displays the first two transitions of the machine.
Transition Relations. Figure 13 defines the transition relation on machine states that describes the behavior of a single facet-based actor in response to incoming events. For the moment we ignore how such events arrive; Section 7 covers that aspect of the semantics. We use $\mathrm{M} \longrightarrow \mathrm{M}'$ as shorthand for .
Generally speaking, the machine works in the following fashion. Initially, the machine state describes an inert actor, meaning there are no instructions to perform nor pending scripts to execute. When the machine receives an event patch, it is matched against the event handlers of the actor’s facets. For each successful match, the body of the event handler is enqueued as a pending script.
Once the matching process is completed, the machine executes the scripts. It dequeues the first one and interprets the instructions, one at a time. Some instructions generate internal events (facet start and stop). For those the machine matches them again against the facet tree, and this process may enqueue additional scripts. The machine continues in this manner until it reaches inertness again. At that time, another external event may be injected.
Here are descriptions of the machine’s five transition rules:
-
inject —The rule’s purpose is to dispatch an event to an inert actor. Due to dataspace routing, the event is guaranteed to conform to the actor’s interests. Incorporating the incoming patch via $\oplus$ yields a new set $\pi'$ of assertions known by the actor. The dispatch metafunction matches the event against each handler in the tree of facets, yielding a pending script for each match.
-
transfer —The machine transfers the next pending script to the current register and partially evaluates ( ${p-e}$ -s) its internal statements. In doing so, it updates and creates fields in the store. The process eliminates expressions and yields a sequence of instructions that alter the active facet tree ( $\mathtt{start}$ / $\mathtt{stop}$ ) or correspond to an output by the actor ( $\mathtt{spawn}$ ). Partially evaluating the script means eval-ing expressions, which may include applications of partial primitives. In the event that such a primitive fails, the actor crashes. The error state represents a crashed actor without any behavior.
-
start—The next instruction demands the start of a facet. The new facet is given a fresh name; the bodies of each of its event handlers refer to the new name. The dispatch function is used to create two scripts: one to signal a $\mathtt{start}$ event and one to boot the event handlers that correspond to the known assertions of the actor. These scripts are enqueued and the new facet is inserted in the tree. The instruction specifies the location for the new facet in the tree as a path ( $\mathrm{fid}$ ) to the new facet’s parent. The locate metafunction constructs the appropriate context. It is possible that no such context exists, which happens when the desired parent, or one of its anscestors, is terminated via a $\mathtt{stop}$ instruction in the time between the script containing this $\mathtt{start}$ statement being enqueued and the instruction executing. In that case, the new facet is immediately terminated. The dispatch function informs it of the $\mathtt{stop}$ event. The resulting scripts are enqueued, and the facet is never inserted into the tree.
-
stop—The machine terminates an active facet and its children, dispatch-ing a $\mathtt{stop}$ event to allow these facets to shut down in an orderly manner. The instruction may designate a facet that is already eliminated. In this case, there is no more.
-
spawn—The facet wishes to create an actor, which the machine treats as an output instruction. Formally, the transition comes with a $\overline{{\mathrm{Pr}}}$ label.
The top-most machine state in Figure 12 transitions to an inert state in two steps. The first one injects the script, consuming the patch $\,\,{\mathtt{{in-room}(0)}}{\emptyset}$ as the label indicates. The second enqueues the script’s instruction and performs it.
5.2.1 Metafunctions
The transition system of Figure 13 refers to several metafunctions. This section provides their formal definitions. The definitions make use of secondary metafunctions, which are defined in Appendix B.
The ${boot}_{\mathrm{Pr}}$ function initializes the machine from an actor description. It recognizes only scripts that create some number of fields and then $\mathtt{start}$ a single facet. It creates each field and then boots the facet as a child of a synthetic root facet with no behavior. The synthetic root facet allows the initial facet to be replaced by any number of facets via $\mathtt{stop}$ :
The dispatch function matches an event against each handler in a tree of facets, accumulating the successes as a list of pending scripts:
It utilizes a helper function, dispatch1, to instantiate the body of a single event handler with the substitution(s) arising from matching against a single event. Substitutions $\gamma$ map variables to values and matching a pattern against a set of assertions yields a set of substitutions $\mathcal{S}$ :
Finally, unroll sequentializes a set of statements using an arbitrary yet fixed ordering:
The ${p-e}$ function partially evaluates a script, yielding a sequence of instructions and an updated store:
The ${facet-context}$ function produces the path of facet names (a $\mathrm{fid}$ ) leading to the hole in a context:
Meanwhile, locate performs the reverse direction, producing the context for a particular $\mathrm{fid}$ in a tree of facets:
6 Types for facets
Facet types (Section 6.1) play two roles. First, they ensure basic type safety in the form of a standard soundness theorem. Second, they reflect facet operations to the type level with a type-and-effect style. By lifting the individual actor semantics to the type level (Section 6.2), and then dataspace programs, type-level programs can be assigned an operational semantics that enables model checking temporal logic claims.
6.1 Type syntax and judgment
Figure 14 defines the syntax of types. Effect types T directly reflect the core facet syntax (Figure 10), including facet $\mathtt{start}$ / $\mathtt{stop}$ and $\mathtt{spawn}$ operations of $\mathrm{Pr}$ . Event descriptor types $\mathrm{D}_{\mathrm{T}}$ correspond directly to the term level descriptors $\mathrm{D}$ . Types $\tau$ summarize the result of expressions, with base types B describing primitive values. Type environments $\Gamma$ associate variable names from patterns and fields with their types and facet names with the marker FacetName.
Figure 15 specifies the core type judgment on facets: It ensures that facet and field names are used appropriately while collecting certain facet operations as an effect type T. As mentioned, this effect type is used for behavioral analysis. Auxiliary judgments apply to the different categories of syntax; Appendix D provides their definitions.
Here are explanations of the key typing rules:
T-Start assigns types to a facet-creation instruction. It appeals to an auxiliary judgment for checking the assertion expressions $\mathrm{e}$ , which also produces a description of the assertion types $\tau$ . The condition $\mathtt{assertable}({\tau})$ , defined in Appendix D, ensures that assertion types do not contain binding patterns or field names. The event descriptors $\mathrm{D}$ for each event handler are checked in the same manner, yielding a type $\mathrm{D}_{\mathrm{T}}$ for each. The final aspect to check is the body $\mathrm{Pr}$ of each event handler. For each such body, the environment is extended with the name of the facet being started as well as the binding variables from the pattern in the event descriptor (via the ${bindings}_{\mathrm{D}}$ metafunction). Checking each body produces a type level description of its behavior, T. The product of the rule is a Start effect, combining the types of the assertions plus each event handler description and its body.
T-Stop is the analogue of for facet termination. It consults the environment to ensure that the named facet is in context. The ${prune-up-to}$ function removes the target facet’s name from the environment, as well as the names of any of its descendants, when checking the continuation $\mathrm{Pr}$ . The resulting effect type is to $\mathtt{Stop}$ the same facet with a continuation behavior.
T-Field, T-Assign check field creation and assignment, respectively. The judgment for field creation checks the type of the initial expression and associates the name with an appropriate type in the environment for the remaining statements. The judgment for field assignment looks up that type in the environment, making sure that the type of the assigned expression matches. Field updates and references are not included in effect types. Thus, field creation yields the effect type from its enclosed statements while field assignment yields a Skip effect.
T-Spawn concerns the creation of actors. The rule utilizes the prune function (Appendix D) to remove identifiers associated with fields and facet names from the running actor, because these names are local. The spawned actor’s behavior $\mathrm{Pr}$ is checked within this restricted context, and it has the (sole) effect of Start-ing a facet. The effect type of the $\mathtt{spawn}$ statement is a $\mathtt{Spawn}$ with a type description of the actor’s initial facet.
6.2 Machine types and type-level machines
The judgment $\vdash_{\mathrm{M}}:\mathrm{M_{\mathrm{T}}}$ relates a machine $\mathrm{M}$ to a machine type $\mathrm{M_{\mathrm{T}}}$ . A machine type $\langle\mathrm{FT}_\mathrm{T};{\overrightarrow{\mathrm{I}}}_\mathrm{T};\overrightarrow{\mathrm{PS}_{\mathrm{T}}};{\pi_{\tau}};{\sigma_{\tau}}\rangle$ includes a facet tree type ( $\mathrm{FT}_{\mathrm{T}}$ ), sequence of instruction types ( $\overrightarrow{\mathrm{I_{\mathrm{T}}}}$ ), sequence of pending script types ( $\overrightarrow{\mathrm{PS_{\mathrm{T}}}}$ ), set of assertion types ( $\pi_{\tau}$ ), and store type ( $\sigma_{\tau}$ ). Appendix F defines the full judgment and syntax. In brief, the syntax of machine types reflects the syntax of machines (Figure 11) in the same manner that effect types T (Figure 14) reflect the syntax of statements $\mathrm{Pr}$ (Figure 10).
The statement of soundness for facet-based actors relies on two auxiliary notions. First, we say $\mathrm{M}$ is an inert machine state, $\texttt{inert}({\mathrm{M}})$ , iff . That is, it has neither instructions to perform nor pending scripts to execute. Second, we designate a set of internal transition labels, $\mathit{l_{\bullet}}$ . Internal labels correspond to the machine processing instructions and scripts. Thus, they consist of each $\bullet$ and $\overline{{\mathrm{Pr}}}$ label, but not $\Delta$ .
Theorem 1 (Soundness). If and then either:
-
• and ; or
-
•
Proof By the standard progress and preservation properties (Wright & Felleisen, Reference Wright and Felleisen1994). Appendix G contains the full statement and proof of each property. To show termination of internal reduction sequences, note the lack of recursive facilities in the language, and that each event handler in a machine’s facet tree may activate a maximum of one time between external events. $\Box$
If a well-typed single-actor machine configuration can take an inject step, it eventually transitions to an inert machine state or a state that represents an exceptional state (due to a misapplication of a partial primitive).
The transition system for machine states operates on machine types with only minor syntactic adjustments. Appendix F defines the full type-level transition system. Section 8 makes use of this notion for behavioral analysis of actors. Here, we note that transitions on type machines are well-defined.
Theorem 2 (Well-definedness). Either $\mathrm{M_{\mathrm{T}}} \longrightarrow^{\mathit{l_{\tau}}} \mathrm{M_{\mathrm{T}}}'$ or $\texttt{inert}({\mathrm{M_{\mathrm{T}}}})$ .
Proof Follows that of Theorem 1. $\Box$
7 Dataspaces: The semantics of programs
While the preceding section describes the semantics of a single actor, a full understanding necessitates a semantics of entire dataspace programs, that is, programs with many, dynamically created and terminated, actors. Figure 16 defines the abstract syntax (left) and the semantics (right) of dataspace programs.
A dataspace configuration consists of a queue of pending patches $({\ell}, {\Delta})$ , labeled by the originating actor; a table of assertions ( ${\mathrm{R}}$ ); and a series of actors (A). The table R is a set of pairs associating each assertion with the identity $\ell$ of the originating actor. An active actor in the dataspace associates an address with an actor state, $\Sigma$ . The state consists of a pending event $\Delta$ and behavior B. The pending event accumulates a patch for the actor between its turns. An actor behavior $({\mathrm{f}, {\mathrm{v}}})$ pairs a behavior function with a private state. The behavior function f is a total mapping between a pending event and the private state of the actor to a new state and possibly some actions (act). The model abstracts over the language for specifying behavior functions so both facet-based code as well as alternatives work. Actions include actor specifications P and patches $\Delta$ . An actor specification actor ${\mathrm{f}}\,{\mathrm{v}}\,{\pi}$ consists of three pieces: a behavior function; its initial state, which is any value from the language for expressing behavior functions; and its initial assertions.
The semantics is defined in terms of a transition relation on dataspace configurations. The relation interprets actor actions and dispatches events in a single rule ( $\textsf{step-par}$ ). It starts by interpreting the queue of pending patches. The update metafunction handles a single queue item, updating the dataspace’s assertion table and dispatching events. The pending event for each actor aggregates a patch describing relevant updates. Next, the ${dispatch}_{\mathrm{DS}}$ metafunction applies the behavior of each actor to its pending patch, producing an updated actor state and some outputs. Any spawn actions are booted, yielding a new address, actor state, and assertions. Finally, all of the patches and boot assertions are collected into a new queue for the next configuration. It includes the updated assertion table and freshly booted actors. Appendix E contains the full definition of these metafunctions.
Definition 3 ( $\mathrm{DS}_i$ ). A run of a dataspace configuration DS is a potentially infinite sequence of dataspaces configurations, where each neighboring pair is related by the transition relation:
It is a run that represents the temporal behavior of a dataspace program and its individual actors, and it is thus the foundation for verifying the value of our behavioral types.
Definition 4 ( ${inert_{\mathrm{DS}}}$ ). A dataspace configuration is inert iff each patch and pending event is empty.
Theorem 5 (Dataspace Soundness). Either or there exists $\mathrm{DS}'$ such that $\mathrm{DS} \longrightarrow_{\mathrm{DS}} \mathrm{DS}'$ .
Proof Garnock-Jones’s dissertation proves the soundness of dataspace systems, as well as a number of desirable properties (Garnock-Jones Reference Garnock-Jones2017, theorem 4.17).
7.1 Facet dataspace programs
Caldwell et al., (Reference Caldwell, Garnock-Jones and Felleisen2020) show how to link a semantics for individual actors with the dataspace semantics and prove the soundness of the composite system. Their semantics coincides with ours, so that the soundness of the system follows from Theorem 5, because each actor implements the functional interface of a $\textbf{BehFun}$ . The ${interp_{\mathrm{M}}}$ function is the key component connecting the semantics of dataspaces with the semantics of facet actors. It implements the dataspace behavior interface $\textbf{BehFun}$ for facet machine states, and it applies a pending patch to a machine state via an -step, then allows the machine to continue via internal transitions until reaching either inertness or an error state. The result of the function is the new machine state together with a patch describing the difference(s) in assertions between the initial and final states. Furthermore, any $\overline{{\mathrm{Pr}}}$ transitions are translated to dataspace specifications P:
The ${boot}_{\mathrm{P}}$ metafunction translates a facet-level $\mathtt{spawn}$ to a dataspace process description:
7.2 Type-level dataspace programs
Section 6.2 sketches how to lift the transition system on a facet machine to operate on a facet machine type. Similarly, the dataspace transition system above naturally lifts to the type level with only minor adjustments. A type-level dataspace consists of actors communicating in terms of assertion types $\tau$ . Facet machine types provide behavior to actors in a type-level dataspace the same way that facet machines give behavior to term-level dataspace actors. We write $\mathrm{DS}_{\mathrm{T}}$ to refer to a dataspace configuration with type assertions.
7.3 Example
Figure 17 shows a dataspace configuration based on the smart home example and its evolution. For readability, the example employs a notation different from that of Figure 16, giving each actor a name rather than an address and aggregating various other parts of the syntax. Each actor is displayed as a triple, as in for an actor with name nm. The leftmost element is the current pending event for the actor, while on the right are its current set of assertions. The example treats the behavior and private state of each actor as a black box, displaying them as a box containing the actor’s name. For brevity, we use two additional notational shorthands. As a pending event, $\cdot$ represents the empty , while a single assertion $\mathrm{u}$ stands for the patch .
The configuration contains three actors, representing the control hub, a light with the ID “l1”, and a presence sensor with the ID “ps1”. Each transition of the relation from Figure 16 is broken into two steps. The first routes assertions among the actors and updates pending events when there is relevant information. The second dispatches each actor on its pending event and updates its current assertions.
The system proceeds in the following manner:
1. Initially, the light actor announces its presence and on/off state, as well as a subscription to a room assignment.
2. The hub actor’s interest in light assertions yields a corresponding event.
3. In response, the hub actor introduces a room assignment to the “bed” room.
4. Routing results in an event informing the light actor of the assignment.
5. The light actor’s reaction is to introduce a subscription for assertions that the “bed” room is empty.
6. The subscription matches the existing empty assertion, immediately yielding an event for the light actor.
7. The light actor updates its light assertion to the off state in response to the room being empty.
8. The hub actor receives a patch describing the change to the light assertion.
9. Finally, the hub actor processes the latest patch without changing its assertions, yielding an inert system.
Figure 18 shows the corresponding type-level system. The primary difference is that the assertions in the dataspace are types rather than values; the names of the actors have also been capitalized to emphasize the difference between the two examples. The transitions follow the same logic as those of Figure 17, with one exception. The example encodes the on/off state of the light as a Bool. Consequently, the types of assertions in the dataspace do not change when the light switches from on to off in response to an empty assertion (type). Thus, the type-level system reaches inertness in fewer transitions than the term-level one.
8 The predictive nature of behavioral types
Equipped with a semantics of facets and dataspaces, we are now in a position to verify the predictive nature of our behavioral types. Specifically, we wish to show that if a program’s behavioral types satisfy a specification, the program itself does, too. We start with a model-appropriate definition of our LTL specifications.
Definition 6 (LTL).
The atomic propositions are types $\tau$ of dataspace assertions. Our basic syntax comes with two temporal operators: (next) and (strong-until). The set of formulas includes negation () and disjunction (). Note Standard constructs such as conjunction (), eventually (), and always () are derived forms.
Formulating the meaning of the LTL specifications takes two steps. The first introduces the set of atomic propositions of a dataspace.
Definition 7 ().
That is, an atomic proposition belongs to a dataspace configuration, if its table of assertions u contains an assertion (by any actor l) that type checks as $\tau$ . We overload the notation for type-level dataspace configurations, . At the type level, an atomic proposition $\tau$ belongs to the configuration if such a type is present in the configuration’s assertions ( $\mathrm{R_{\mathrm{T}}}$ ).
The second step concerns the notion of satisfaction, that is, when a dataspace configuration—and its run—satisfies an LTL specification.
Definition 8 () For both term and type-level configurations,
At the term level, atomic propositions hold in dataspace configurations where an actor is making an assertion of that type. At the type level, they hold for configurations where an actor is asserting that type. The temporal constructor holds in a dataspace configuration when $\psi$ holds starting from the next step in the run. Similarly, the other temporal operator, strong until (), holds when $\psi_1$ is true for some finite number of steps in a dataspace run, at which point $\psi_2$ is true. All other syntactic forms have their usual meanings.
At this point, we can define what it means for a program—a description of the initial facet-based actors in the dataspace, or their types—to satisfy a specification.
Definition 9 ( $\cdot~\models~\psi$ ).
Note how the definitions use the first successor of the initial dataspace configuration to check the desired formula. They ignore the initial one because all of the actors’ initial assertions are in the pending action queue, waiting to be interpreted. The first transition step moves these assertions into the configuration’s assertion table.
Figure 19 illustrates how to translate the example specification from Section 4.2 into the formal syntax. It also indicates what it means at the term and the type level.
The type system and the behavioral type system properly predict the behavior of actors and programs. Naturally, the first is the basis for the second. That is, all correspondence between terms and types relies on standard soundness (Theorem 1). The point of soundness is to confirm that there are distinct classes of actor programs: those that live up to (type) expectations and those that raise exceptions. We collect the first kind in a set for a configuration M that does not error due to the application of a partial primitive, such as division by zero. We write if evaluates to a safe machine state.
Next we establish a correspondence between the behavior of a single actor and its type. Specifically, a term-level actor machine is related to a type-level actor machine if they make the same assertion types; the type-level machine types the term-level one; they react to the same (types of) events; and their reactions to those events yield related states.
Definition 10. if and only if:
-
•
-
• $\forall \mathit{l}$ such that $\mathrm{M} \longrightarrow^{\mathit{l}} \mathrm{M}'$ , such that $\mathrm{M_{\mathrm{T}}} \longrightarrow^\mathit{l_{\tau}} \mathrm{M_{\mathrm{T}}}'$ and
-
• $\forall \mathit{l_{\tau}}$ such that , such that and
Appendix F provides the straightforward definitions of the $\vdash_{\pi}$ and $\vdash_{\mathit{l}}$ judgments.
Assuming the machine belongs to the set of safe ones, typing implies correspondence.
Theorem 11. If and then .
Proof The initial machine states for a term and its type are related, as are their assertions. The key meta functions preserve typing, and a suitably related event can always be constructed. See Appendix G for the full statement of these properties and their proofs.
Theorem 11 establishes a similarity between term and type level machines based on their inputs (injected events) and output labels (spawned actors).
The point of this theorem is that an actor machine and its type implement the same interface, up to typing. A dataspace program consisting of actor machines interacts with its context in the same way as a dataspace program consisting of those actor machines’ types.
Finally, we can state the key theorem, namely, that type-level behavior carries over to term-level dataspace programs. That is, if a collection of actor terms $\overrightarrow{\mathrm{Pr}}$ have types $\overrightarrow{\mathrm{T}}$ and boots to exception-free dataspace programs, then LTL properties satisfied by the type-level dataspace program also hold for the term-level one.
Theorem 12 (LTL Transference). If
-
•
-
•
-
•
then
Proof Due to the simulation property (Theorem 11), the table of assertions in each step of the term and type level dataspace runs is related by typing. Therefore, the same atomic predicates ( $\textbf{AP}$ ) hold in each, so the same LTL properties apply.
Note The assumption of safety for the single-actor machines is standard for the statement of partial correctness. In essence, it is equivalent to the conventional assumption “if the statement terminates properly” in Floyd-Hoare logic (Floyd, Reference Floyd1967; Hoare, Reference Hoare1969).
In detail, our type analysis does not seek to capture the behavior of actors in the case of uncaught exceptions. While dealing with failures is a core principle of the actor and dataspace model, the current type system does not capture this notion.
If we were to account for the possibility of unexpected exceptions, every non-trivial actor would include the behavior of immediately crashing. Every program would then include the situation where all its actors crash and other such degenerate cases, vastly increasing the difficulty of stating LTL properties of the system. Even the addition of supervision actors would not avoid this collapse in reasoning capability.
We conjecture that by generalizing work on exception analysis via type systems (Yi & Ryu, Reference Yi and Ryu2002) it might be possible to extend our type system to reason about such cases precisely, at least for some limited class of programs. We leave this problem as future work.
9 Implementation
An implementation of the theoretical design requires three pieces: the language itself (its syntax and semantics); a type checker that derives effect types; and a compilation of these effect types plus the specification to a model-checking system. The following three subsections sketch these three pieces of our implementation.
9.1 Implementing the behavior of facet-oriented actors
The basic language implementation of facet-oriented actors consists of three layers:
-
• a syntax layer, which provides the facet notation of the first half of this paper;
-
• a runtime system, which provides data structures and functions that implement the behavior of facets and endpoints; and
-
• an imperative dataflow network for tracking changes to fields and scheduling the re-evaluation of dependent computations.
These pieces are built atop the existing implementation of the dataspace model. The dataspace implementation does not require any modifications in order to support these elements, just as the dataspace model predicts (Garnock-Jones & Felleisen, Reference Garnock-Jones and Felleisen2016).
The syntax layer essentially turns the surface forms into calls to functions provided by the runtime system. It makes use of syntax/parse, Racket’s high-level syntax extension system (Culpepper & Felleisen, Reference Culpepper and Felleisen2010, Reference Culpepper and Felleisen2012). Implementing the surface language with syntax/parse greatly simplifies the addition of actor syntax. Moreover, the resulting interface can be extended and grown as the language develops, and the implementation can benefit from any improvements to the underlying dataspace library. It also facilitates the addition of a type checker.
The run-time system is the most sophisticated of the three layers. The key elements strongly mirror the data structures of the machine and the meta-functions of the formal model of Section 5. Technically speaking, the run-time system maintains data structures corresponding to the different parts of machine states $\mathrm{M}$ and defines functions for dealing with facets, enqueuing scripts, etc. The implementation differs from the model primarily in its support for efficient re-evaluation of the assertions of an actor as fields are updated.
Finally, the runtime maintains a bipartite, directed dataflow graph for each facet-oriented actor. A source node represents a field, a target node an endpoint, and edges the dependencies of the endpoints on the fields. Each endpoint representation contains a procedure that is used to compute the set of assertions to be associated with the endpoint. By recording field dependencies during the execution of such procedures, the implementation learns which endpoints must have their assertion sets recomputed in response to a field change.
9.2 Implementing the type checker
The type checker is implemented with the Turnstile meta-DSL (Chang et al., Reference Chang, Knauth and Greenman2017). Roughly speaking, Turnstile macros supplement syntax/parse with a mechanism to check types first and elaborate the given source syntax into target terms later. That is, each syntactic form is defined as a macro that performs some type checking before elaborating to syntax that implements the run-time behavior. Type information is propagated via metadata on syntax objects.
For example, the define-typed-syntax macro for start-facet looks like this:
Line by line, this definition reads the following way:
-
• The macro applies to syntax that matches the (start-facet id expr …+) pattern.Footnote 7 The pattern expects an identifier and a nonempty sequence of endpoint expressions.
-
• The code between pronounced “elaborates to”—and the dashed line analyzes and checks these endpoint forms:
1. The first clause elaborates each endpoint expression, ep, individually in an environment extension that records name as the name of a facet. This allows Turnstile to resolve each identifier reference within each ep to the proper type.
2. The result of a successfulFootnote 8 elaboration is bound to the name ep-, while the types of effects performed by ep are bound to the name effs.
3. The trailing ellipse dictates that each ep form is analyzed in this manner.
-
• The #:fail-unless clause enforces the side-condition on the corresponding type-checking rule. It makes sure that the body has only endpoint installation effects, such as from the use of on, assert, and field; it disallows the following effect types: start-facet, stop, and spawn. The actual work is left to a helper function: all-endpoint-effects?. If the check fails, the checker signals a type error.
-
• The clause below the dashed line consists of the macro’s two outputs:
1. The first output is the target expression that implements the behavior of start-facet. The produced expression calls the install-facet! procedure, which is provided by the run-time support system for facets. The procedure consumes the name and a thunk that runs the results of elaborating the endpoint forms—in the specified order. These endpoint forms then elaborate to calls to other procedures from the run-time system, and so on.
2. The second output attaches a type to the elaboration. Here it describes the effect type of the form: StartFacet, with endpoint types described by effs ….
Generally speaking, the define-typed-syntax macros are almost verbatim transliterations of the typing rules presented in Figure 15.
9.3 Implementing the model checker
A programmer initiates a system verification with respect to some specification by adding the following formula to the program:
In terms of Theorem 12, this formula requests a check of where the LTL formula $\psi$ corresponds to spec, and T is the series of actor-types, one per actor in the dataspace program. If the check succeeds, then the actors of these types jointly realize the specification.
The implementation of verify-actors utilizes the model checker SPIN (Holzmann, Reference Holzmann1997) to check such uses. That is, it translates each use of verify-actors into a Promela program, that is, SPIN’s input forms.
More precisely, the translation turns each actor-type into a Promela process with a state-machine driven behavior. Each state corresponds to a set of active facets within the actor. An encoding of the dataspace routing algorithm allows processes to react to the appearance and disappearance of assertions, while an additionally generated process performs message dispatching.
SPIN is invoked on this Promela program, plus a translation of the LTL $\psi$ . If verification fails, the trace provided by SPIN is back-translated into the actions of the type-level actors.
Figure 20 displays a sample Promela program. It is the result of translating the Light actor’s type. The figure annotates the Promela code with in-line explanations. We italicize the annotations to distinguish them from the actual code and comments, which are enclosed within /*…*/. The rest of the Promela process follows the same form.
Blocks of code within the actor (lines 15, 23) execute atomically to avoid any unnecessary interleaving with other actor’s initialization in the SPIN search space. Additionally, comments within the code, such as those on lines 16 and 30, embed information from the Racket source program; this information is used to back-translate counterexamples of the SPIN model to the source level.
10 Evaluation
The usefulness of this model-checking approach depends on two primary factors:
-
• Whether the specification language can express important properties of dataspace communication as they arise in actual programs; and
-
• Whether the implementation can successfully check those properties that are expressible.
This section presents the results of an analysis of these factors in the context of a corpus of facet-based dataspace programs implemented in Racket. Inspecting each program yields a number of behavioral properties important to program correctness. These properties provide the basis of the evaluation.
10.1 Corpus
This section describes the programs in the corpus, as well as several behavioral properties for each program that comprise the subject of the evaluation. One property—deadlock-freedom—is relevant to several examples and thus factored into its own Section (10.2), which discusses the topic in the context of dataspace programs and our behavioral checks. Each program consists of about 500 lines of code, comprising the concurrent core of a larger system.
Smarthome. This program is an extended version of the earlier examples, including a text-based user interface and temperature sensors and controls.
-
• Light Presence. This property describes the desired interaction between presence sensors and lights in the smarthome program. The specification essentially states that once a presence sensor and light have been installed in a room, the light turns on or off as the user moves in or out of the room, respectively.Footnote 9 Note that while on the surface this property only talks about light and presence sensor actors, the behavior of those actors depends on additional interactions from the user interface, the actor representation of the user, and the hub actor.
-
• Steady State Temperature. This property specifies the desired behavior of the thermostat in the smarthome: when the user sets a desired temperature, the system gradually heats or cools the home until the target is reached, before turning off.
Data Processing. This program implements the core of a streaming data processing framework, inspired by Flink.Footnote 10 Clients submit jobs consisting of some number of inter-dependent tasks. Several actors collaborate to manage the underlying computational resources for executing tasks, tracking intermediate progress, and delivering the end result.
-
• Task Delegation. This property checks the working of three actors that collaborate to perform tasks by treating them as a black box: each assigned task is eventually performed. Performing a task involves delegating it from the job manager to a task manager, and then from a task manager to a task runner, and finally propagating the results back in the opposite direction.
-
• Job Completion. This is another black-box correctness property: when a client submits a job, the job’s results eventually become available. When a client submits a job, a job manager actor analyzes the request and computes a directed, acyclic graph of tasks. It then processes the task graph by assigning tasks as they become ready to other actors, before finally announcing the results of the job. A weak variant of this property states that each job either completes or processes tasks in an infinite loop.
-
• Load Balancing. Task-runner actors are capable of performing one task at a time. A task manager actor monitors the status of task performers and assigns them tasks when they are idle. Likewise, each task manager has a capacity based on the number of task runners it manages. The job manager assigns tasks to task managers based on their current free capacity. This property states that the task manager never assigns a task to a busy runner and the job runner never assigns a task to a task manager that is at capacity. A weak variant of the property pertains to the scenario where each task manager oversees exactly one task runner actor.
Caucus. This program represents a geographically distributed, iterative election in the style of a caucus, inspired by the “two-buyer problem” from the behavioral types literature (Honda et al., Reference Honda, Yoshida and Carbone2008). A distinct actor represents each voter and candidate. After a registration period, voting commences across a collection of regions. Within each region, voting proceeds in rounds until a single candidate receives a majority share of the vote. Once every region has reached a decision, the winner of the most regions is pronounced the winner of the election. In each region, an actor guards against various forms of misbehavior from voters and candidates, such as voting twice.
-
• Resolution. The election completes. A weak variant of the property states that either the election completes or holds infinitely many rounds of voting.
-
• Candidate (Mis)Behavior. This property is the specification for a well-behaved candidate actor in the election. A well-behaved candidate actor announces its candidacy as an assertion and maintains it until either the election is over or an election agent informs it that it is no longer in the running.
-
• Voter (Mis)Behavior. This property is the specification for a well-behaved voter actor. A well-behaved voter actor registers in exactly one region and then participates in each round of voting in that region by casting exactly one vote for one of the eligible candidates. A weak variant of the property states that the voter registers and always votes at least once in each round.
Windowing System. This program implements a basic graphical windowing system. A collection of driver actors collaborate to provide the primary interface. A layout-solving actor powers graphical output by combining requested sizes and layout styles (such as vertical or tabular) with the actual dimensions of the system window to compute the actual size and location of each item. Another interface actor provides descriptions of mouse events, including when the mouse is touching a window, mouse presses, and releases. The mouse interface allows for an actor mix-in (a function that starts facets or endpoints) to implement drag-and-drop behavior, which can then be instantiated and reused freely. The rest of the program consists of actors implementing rudimentary windowed applications and menus.
-
• Layout. This property describes the behavior of the layout engine, which spawns solvers for horizontal, vertical, and tabular layouts on demand. The specification is based on a comment in the original, untyped program, describing the desired behavior; it essentially demands that requests for layouts are answered with solutions:
This comment closely corresponds to the specification language of assertion types and LTL connectives provided by the implementation.
-
• Menu Duration. This property checks the lifetime of menu items: that they appear in response to selecting a menu and that they remain until either a selection is made or a mouse click occurs, either selecting a menu item or closing the menu.
Web Chat. This program implements the server for a chat service loosely based on Slack.Footnote 11 It allows a user to connect, sign up for an account, and then create and join conversations. Connected users may request to follow one another. Accepted follow requests lead to each user being added to the contact list of the other. Users may join the conversations of their contacts. The system implements an option for permission delegation, so that if user A chooses to invite user B to a conversation, user B may then invite user C, and so on.
-
• Conversation Release. This property is based on a comment in the original, untyped program, essentially calling for the release of resources in the event that a request is canceled before the response has materialized:
A weak version of the property relaxes the need for the resources to all be removed “at once.”
-
• Contact Release. This property is also based on a similar comment in the code to the conversation release property, but in the module for managing the contacts list for users:
A weak version of the property relaxes the need for the resources to all be removed “at once.”
10.2 Deadlock freedom
Dataspaces implement a form of asynchronous message-passing, so in a technical sense every dataspace program is free from deadlocks. However, dataspace programs may reach a stuck state where each actor waits for further communication before continuing. Such a situation is sometimes referred to as a soft deadlock. The corpus includes several programs that feature numerous kinds of actors participating in interwoven conversations, making freedom from such soft deadlocks a desirable property to check.
Figure 21 demonstrates a simple program simulating two friends attempting to make plans. One friend actor, once it knows the time of the meeting, is ready to suggest a location. Meanwhile, the other friend actor, is ready to suggest a time depending on the location. The result is no communication at all. Such situations can arise from poorly designed protocols and buggy implementations. Figure 21 is a case of the former.
One way of describing soft deadlocks is that an actor states an interest, that interest is never withdrawn (that is, the interest remains relevant to the actor), and no matching assertion ever arises. The following LTL formula states this property with respect to the friend1 actor’s interest in Time assertions:
Checking this LTL property against the implementations of the friend1 and friend2 actors fails, as should be expected for Figure 21. The resulting trace illustrates the two actors waiting for one another with active subscriptions but no matching assertions.
A similar property describes the deadlock from the perspective of the friend2 actor’s interest in Place assertions, which fails a similar check:
Generalizing, a program may be checked for deadlocks by taking each assertion of interest ? $\tau$ and checking the following LTL formula:
Such a check succeeds if the model checker can show that the program never deadlocks. This property may be too strong for the level of type precision. An alternative approach is to state in LTL that the program definitely deadlocks, and see if model checking produces a counterexample, implying that the program has some non-deadlocking executions:
This section refers to the former property as strong (soft) deadlock freedom and the latter as weak (soft) deadlock freedom.
10.3 Results
Table 1 summarizes the results. Each row of the table describes a property from the corresponding program. A ✓ in the “Expressible” column indicates that the implementation can express the property and a corresponding ✓ in the “Checkable” column indicates that the implementation successfully checked the property. Otherwise, the process of stating and checking could not be completed. Section 10.4 discusses these cases.
10.3.1 Performance
One possibility is that, even if the implementation can express and check a desired property, the result may not return within a reasonable amount of time. The potential for a program’s state space to grow exponentially is unavoidable. Eventually, even the most optimized checkers will require enormous amounts of memory and face slowdowns. Determining a threshold for a “reasonable” amount of time is tricky, as different developers have different conceptions. This evaluation sets the upper bound for a check that could still be useful at eight hours. That is a long time, but short enough to be a part of continuous integration (CI) tests that run on a nightly basis, with the results ready for review the next morning. Table 2 details the running time and peak memory usage for each successfully checked property.
The performance clears the “integration test” threshold by a wide margin. All of the properties take seconds or a few minutes rather than hours to check. The property that takes the longest to check, deadlock freedom for the smarthome program, consists of 22 individual properties, one for each type of subscription in the program. Each sub-property leads to a separate invocation of the model checker, one for each type of interest in the program. These results put the performance in the realm of “unit test” acceptability, where the checks can be run along with a unit test suite before committing each change to a project.
10.4 Interpretation and discussion
This section discusses each of the properties that the implementation failed to support and the primary reason(s). In a few cases, interesting details for successfully checked properties are provided as well.
Data Processing.
-
• Job Completion. The implementation is not able to check this property. The checker is not able to reason about the termination of the task-processing loop in the protocol. Even though the loop follows an inductive structure—processing a DAG—this information is not propagated to the type level. Moreover, since an LTL formula always talks about all possible executions of a system, the specification language cannot express that it is at least possible to reach the desired end state.
-
• Load Balancing. The implementation is unable to state and check the property. For a task manager with an arbitrary number of associated task runners, stating the specification requires stating relations between numbers. Checking the property would similarly require reasoning about arithmetic operations. Moreover, the property inherently deals with the multiplicity of assertions of a certain types, namely task-assignment assertions. Violating the property means having more than the expected number of such assertions.
Checking the weak variant of the specification makes use of a coincidence in the program for working around the inability to reason about the multiplicity of assertions. Because there happen to exist two distinct types of tasks, a limited form of multiplicity can be checked—namely, if there are some of one type of task assigned to a manager, there are none of the other, and vice versa.
Caucus.
-
• Resolution. Checking this property also encounters difficulty due to the potential non-termination of a loop in the protocol. The implementation is unable to reason about the termination of the loop, even though the loop matches a simple inductive structure (one candidate is removed each round).
-
• Voter (Mis)Behavior. The implementation lacks the precision to state the full specification. The primary impediments are the inability to correlate information from one assertion to another, such as the name of the candidate the voter chooses with the names of the candidates on the ballot, as well as reason about the multiplicity of assertions. However, the weak specification is still able to distinguish a correct implementation of a voter from several malicious ones.
-
• Deadlock Freedom. While Section 10.2 discusses deadlock freedom, one interesting point to note is that this check uncovered a bug in the program: at the end of each round of voting, a region manager actor announces intermediary results that inform the candidates of their status in the race, except for the final round of voting, the manager actor simply announces the winner. Due to an oversight, the original implementation of the candidate actors only listened for the intermediate results, not the winner announcement. As a result, they were soft deadlocked waiting for an intermediate update that never materializes. The fix to the bug is to have the candidate actor listen and react to the final election result.
Web Chat.
-
• Conversation Release. This property is checkable with the implementation, with one caveat: the phrase “at once” entails a level of precision that is beyond the implementation. That is, it can check that the required assertions disappear, but it does not have a way of expressing that the concerned parties, “all eligible answerers,” all react within a set timeframe. However, that part of the specification is of dubious importance when scrutinized: dataspace routing will always dispatch an event to all interested actors when the question is withdrawn. Each such actor will have the opportunity to react to the event in due time. Since dataspace scheduling is fair, there is little reason to worry about such timing details. Thus, the full specification is beyond the implementation, but the weak version, without the timing constraint, is checkable.
-
• Contact Release. Just as with the conversation release property, the “at once” phrasing creates difficulty but is of questionable importance. So again, the full specification is not checkable but a weak variant without the timing constraint is.
Threats to Validity. There are several threats to validity of this evaluation. The corpus comprises a small number of programs, authored by only three individuals. The programs for the corpus were not selected randomly. Rather, we sought programs whose implementations involved a high degree of concurrency and communication, as opposed to business logic or other concerns. Moreover, the identification of properties and subsequent verification efforts were conducted by the authors of the tools. We are highly familiar with both the capabilities and limitations, potentially biasing the search for properties that are a good match for the current capabilities.
11 Related work
The design of the facet language has two distinct goals. First, the notation greatly facilitates the expression of the temporal relationships with respect to—their conversations with—other actors. See Appendix A for an extended comparison of facets to an implementation of the same behavior in a functional style. Second, in contrast to higher order functional programming of actors, the first-order character of the facet language allows the construction of reasoning tools that assist programmers with the validation of essential protocol properties. That is, the facet language forms capture the input/output behavior of an actor without the need for additional data- or control-flow analysis. See Section 10 for a basic illustration. Our results are thus related to two distinct bodies of research: design of concurrent languages (Section 11.1) and reasoning with automated behavioral types (Section 11.2). The dissertations of Garnock-Jones (Reference Garnock-Jones2017) and Caldwell (Reference Caldwell2023) also feature extended discussions of related languages and analysis tools.
11.1 Facets and the design of concurrent languages
Programming languages provide limited support for organizing conversations among groups of actors and their internals. Garnock-Jones et al. (Reference Garnock-Jones, Tobin-Hochstadt and Felleisen2014, 2016) compare dataspace communication with related coordination mechanisms such as the Conversation Calculus (Vieira et al., Reference Vieira, Caires and Seco2008), the Mobile Ambient Calculus (Cardelli & Gordon, Reference Cardelli and Gordon2000), the join calculus (Fournet & Gonthier, Reference Fournet and Gonthier2000), and tuplespaces (Carriero et al. Reference Carriero, Gelernter, Mattson and Sherman1994; Murphy et al., Reference Murphy, Picco and Roman2006), as well as various actor systems and process calculi. Here, we focus on linguistic features for organizing the code of individual actors.
State-Machine Actors. Actor systems such as Akka (Akka Project, Reference Project2022) and Erlang/OTP (Armstrong, Reference Armstrong2003) provide means of organizing actors as state machines. Erlang’s gen_statem interface exemplifies these mechanisms. The programmer describes each state in the machine as a procedure mapping an incoming message to the next state and some actions to carry out. These state transition functions also operate on the actor’s private store, handled separately from the state of the machine.
Actors organized as event-driven state machines lose access to contextual information. Instead, such notations force programmers to encode context into each state and save related information explicitly in the private store. Such encodings, however, make it cumbersome to create actors that simultaneously participate in multiple conversations or alternatively engage and disengage in multiple intertwined behaviors, such as the Hub and Light actors from Section 3.
The gen_statem interface and its sibling gen_server address only one of the concerns of facets in a limited fashion: abstracting control state. Language support ends at instantiating the interface with callbacks. An actor that deals with multiple types of messages in a single state must often manually demultiplex the message to match it with the proper response. Callbacks use the familiar state-passing style, operating on a monolithic store. Though callbacks may coordinate startup and shutdown of the actor, initiating and closing a conversation comes without linguistic support.
Fact Spaces and Crime. The inspirational fact spaces model (Mostinckx et al., Reference Mostinckx, Scholliers, Philips, Herzeel and De Meuter2007) shares many similarities with the dataspace model. Fact spaces build on TOTAM (Scholliers et al., Reference Scholliers, González Boix and De Meuter2009, Reference Scholliers, González Boix, De Meuter and D’Hondt2010; González Boix et al., Reference González Boix, Scholliers, De Meuter and D’Hondt2014), a form of tuplespace, to equip actors with a means of moving and sharing state. In fact spaces, programs react to both the appearance and disappearance of facts from a shared repository. Reactions are described in a logic coordination language, allowing the computation of new facts based on current facts via forward-chaining. The language implementation (implicitly) records the dependencies between facts and invokes application actions specified by logic rules.
The implementation of the fact space model, dubbed Crime (Mostinckx et al., Reference Mostinckx, Lombide Carreton and De Meuter2008), integrates the fact spaces model with the AmbientTalk language (Van Cutsem et al., Reference Van Cutsem, Mostinckx, González Boix, Dedecker and De Meuter2007; Van Cutsem et al., Reference Van Cutsem, Gonzalez Boix, Scholliers, Lombide Carreton, Harnie, Pinte and De Meuter2014). AmbientTalk is an instance of the Actor model (Agha Reference Agha1986; De Koster, et al., Reference De Koster, Van Cutsem and De Meuter2016) in the mold of E (Miller et al., Reference Miller, Tribble and Shapiro2005). In E and AmbientTalk, objects are organized into vats; each vat runs its own event loop, dispatching incoming messages to the contained objects. Combining AmbientTalk and Crime requires bridging the gap between the events corresponding to assertion and retraction of facts and the message exchange of AmbientTalk. The solution incorporates reactive programming, in the mold of FrTime (Cooper & Krishnamurthi, Reference Cooper and Krishnamurthi2006). The result is that actors may define time-varying behaviors, with values shifting based on the available facts.
The differences between Crime and facet-oriented dataspace actors stem from the absence of a unifying notion of a conversational frame. Crime comes without any representation of a conversation, hence programs lack the conversational structure of (nested) facets. Time-varying collections do not offer a way of propagating changes to the tuples in the shared space, unlike the fields and query forms that connect to an actor’s endpoints. Because Crime does not group the components of conversational behavior, no automatic support exists for the release of associated resources. Programmers must carefully reason about the relationships between components to ensure that the state of the actor and associated tuples remain consistent. The underlying E language offers object references to denote specific conversations within the heap of a given actor and employs method dispatch as a limited pattern matcher over received messages.
Concurrency Within Active Objects. Some Active Objects languages allow for concurrency within an Actor or Actor-like entity (de Boer et al., Reference de Boer, Clarke and Johnsen2007; Schäfer & Poetzsch-Heffter Reference Schäfer and Poetzsch-Heffter2008, Reference Schäfer and Poetzsch-Heffter2010). Typically, this concurrency is achieved by allowing multiple message dispatches to be active within a single active object. This allows for splitting control and state among numerous objects, resembling the way our facets and fields individually contribute to the actor’s overall behavior.
Much like with conventional object-oriented programming, communication between active objects forms a graph structure. Each active object is a node in the graph and has an edge to each object it communicates with via asynchronous method invocation. Indeed, many active object languages support both the communication-level graph and the traditional object-oriented “refers to” relationship with support for passive objects (Boer et al., Reference Boer, Serbanescu, Hähnle, Henrio, Rochas, Din, Johnsen, Sirjani, Khamespanah, Fernandez-Reyes and Yang2017). Each active object (node in the communication graph) may implement its behavior using any number of such passive objects following traditional object-oriented design (Gamma et al., Reference Gamma, Helm, Johnson and Vlissides1994).
Dataspaces and faceted actors possess the same graph-like communication structure. Each actor (node) is connected to each other actor that it communicates with via assertions (edge). Thinking in terms of the graph abstracts away the differences between the message-passing paradigms employed by active objects (point-to-point) and dataspaces (publish/subscribe). Just as how an active object may implement its behavior using any number of passive objects with their own fields and method definitions, a facet-based actor defines its behavior across a collection of facets, fields, and endpoints. The primary difference is that passive objects form a graph, while facets form a (dynamically changing) tree. The tree structure of facets enables the ability to program directly to the concerns of startup and, especially, shutdown (via the stop statement and on stop event handlers).
One interpretation of this analysis is that it may be fruitful to adapt the facet notation to the domain of active objects. Combining facets for defining behavior with passive objects for defining state has the potential to synthesize the best of both worlds.
ConGoLog. While communication-by-assertion in dataspaces is inspired by Prolog fact databases, ConGoLog (De Giacomo et al., Reference De Giacomo, Lespérance and Levesque2000) is a more direct adaptation of logic programming to concurrency. In ConGoLog, a programmer first models a particular domain—typical applications include robotic control systems—with a collection of axioms describing relevant properties and the effects of performing particular actions. The main program then specifies some number of concurrent agents—as logic programs—emitting effects, each of which may update the stated properties, and reacting to the truthiness of such properties.
ConGoLog and dataspaces share a deep connection but also differ in significant ways with respect to their conceptions of state and agent. As a variant of the actor model, dataspaces provide each actor with its own isolated memory, while the dataspace itself contains information shared with the rest of the program. Information shared in the dataspace is tied to the identity of some particular actor, imposing a rigid structure on updates and a connection to component failure. Meanwhile, ConGoLog lacks the distinction between local state of an agent and globally visible state information of the situation. Concurrent ConGoLog agents come without the structure provided by facets for grouping related behavior and state, as well as orchestrating startup, shutdown, and component evolution—partly because these concerns are not a goal of the design.
Sparrow. The Sparrow DSL (Avila et al., Reference Avila, De Koster and De Meuter2020) extends Erlang with the ability to react to complex event patterns. That is, rather than just handling one message at a time from its mailbox, a programmer may utilize Sparrow to specify an actor’s behavior dependent on particular combinations of messages, timing constraints, and so on. Reactions to a given pattern may be dynamically added to and removed from an actor’s behavior. A reaction is like a facet with a single event-handler endpoint. They lack the other features of our language, like the hierarchical structure, start-up and shut-down behavior, and grouping of facets, as well as localized state of fields.
Dataflow. The simple dataflow system described here is most similar to the simple dependency tracking approach to object-oriented reactive programming described by Salvaneschi & Mezini (Reference Salvaneschi and Mezini2014, section 2.3) and was in fact directly inspired by the dependency tracking of JavaScript frameworks such as KnockoutFootnote 12 (Sanderson, Reference Sanderson2010) and Meteor.Footnote 13
11.2 Behavioral type systems
In general, the design and theory of a type system is intimately tied to the underlying language. This is doubly true of behavioral type systems, which express control-flow aspects of the program, not just the static relationships found in purely structural type systems. Consequently, applying ideas around the actor model to dataspaces and facet-oriented actors poses a serious challenge. Dataspaces provide a form of publish/subscribe communication (Eugster et al., Reference Eugster, Felber, Guerraoui and Kermarrec2003) that has received little attention in the context of behavioral type systems. While encoding dataspaces in a traditional message-passing (or channel-based) setting is possible, the encoding would obscure a program’s communication patterns too much for the targeted behavioral type system to be of use. Application-specific information could potentially be recovered by utilizing more powerful type-level reasoning, such as dependent session types (Toninho et al., Reference Toninho, Caires and Pfenning2011), but presently bringing such machinery to bear in a usable programming language remains unsolved.
Our work specifically draws inspiration from studying the approach to type checking more so than the specifics of the systems. In that sense, our design follows that of Igarashi & Kobayashi (Reference Igarashi and Kobayashi2001) and Chaki et al. (Reference Chaki, Rajamani and Rehof2002), in which the type checker constructs type-level processes as abstractions for term-level behavior. The type-level processes serve as the basis for behavioral analysis. In the case of Igarashi & Kobayashi (Reference Igarashi and Kobayashi2001), the checks are generic. Subtyping and other parts of the type checker may be instantiated to yield checks for races, deadlocks, etc. Chaki et al. (Reference Chaki, Rajamani and Rehof2002) use the SPIN model checker for simulation-based subtyping and analyzing conformance to LTL specifications. In their case, LTL formulae state properties of channels in the system.
Effpi. The Effpi message-passing framework (Scalas et al., Reference Scalas, Yoshida and Benussi2019) reflects core process and communication operations to the type level and uses dependent function types to precisely track which channels are used. By model checking these type-level descriptions, it becomes possible to verify certain properties stated in a temporal logic, including deadlock-freedom and some communication patterns. By contrast, we seek to verify any LTL formula, allowing application-specific correctness properties. The biggest difference to our work is the communication paradigm. While Effpi uses message-passing along channels in a process calculus, our actors share knowledge via a dataspace.
Conversation Types. Conversation types (Caires & Vieira, Reference Caires and Vieira2009) add behavioral checking to the Conversation Calculus (Vieira et al., Reference Vieira, Caires and Seco2008). A conversation type describes a sequence of message exchanges within a particular context, i.e., conversation. Like the global types of Honda et al. (Reference Honda, Yoshida and Carbone2008), a conversation type may be decomposed to type(s) describing the actions of the individual processes participating in the conversation. Crucially, the decomposition is flexible. A given conversation type may be realized as the composition of numerous different combinations of participant types. This flexibility meshes well with the anonymous communication in the conversation calculus, allowing a degree of agnosticism with regard to the number of participants in a conversation and their individual roles when looking from the global perspective. At first glance, this approach may work for dataspace actors, but because every communication is between a single sender and a single receiver, the model remains similar to channel-based models rather than publish/subscribe communication. In addition to structural message safety, their types ensure a degree of deadlock-freedom.
Active Object Languages. There are several notable behavioral type systems and efforts to perform static verification on active object languages. Behavioral type systems for active objects tend to focus on the problem of detecting deadlocks (Henrio et al., Reference Henrio, Laneve and Mastandrea2017). They employ a similar types-as-processes technique, capturing key information, such as dependencies between futures, as effect types. Proof rules (i.e., type checking) can then determine whether the program may deadlock. Our behavioral checks of facet programs instead treat effect types as abstracted, but still executable, processes, with the goal of both (model)checking properties of programs and validating the design of the DSL.
The Rebeca modeling language (Sirjani et al., Reference Sirjani, Movaghar, Shali and de Boer2004) is the closest to our own behavioral checking. With Rebeca, a programmer may use communication and abstraction facilities of active objects to define a modes of their program or system. The Rebeca model checker, Modere (Jaghoori et al., Reference Jaghoori, Movaghar and Sirjani2006), can then verify properties of the model specified as temporal logic formulae. Rebeca has numerous extensions, notably broadcasting (Yousefi et al., Reference Yousefi, Ghassemi and Khosravi2015). The goal of our work is to check properties of programs versus models of programs and to thus validate the design of the facet DSL. It might be possible to target Rebeca as a backend for the implementation, rather than SPIN, and benefit from its message-passing-specific optimizations.
ConGoLog. In bounded contexts, it is possible to decide some temporal properties of ConGoLog programs (De Giacomo et al., Reference De Giacomo, Lespérance, Patrizi and Sardina2016). Thus, it ought to be possible to develop a model checker based on these decision procedures that could express and check temporal properties similar to the ones that we explore for dataspace programs. No such model-checker implementation appears in the literature.
Types for the Join Calculus. The join calculus (Fournet & Gonthier, Reference Fournet and Gonthier2000) shares some similarities with dataspace actors. It has a soup of messages versus the table of assertions in a dataspace. The event handler endpoints of facets resemble join patterns, especially in the objective join calculus (Fournet et al., Reference Fournet, Laneve, Maranget and Rémy2000). Recent work (Crafa & Padovani, Reference Crafa and Padovani2017) has developed behavioral types for the join calculus around the notion of type state (Strom & Yemini, Reference Strom and Yemini1986). They describe the types of messages understood by an object and use connectives such as conjunction, disjunction, and repetition to determine how a reference to an object can and must be used. The resulting design is able to elegantly track the dynamic interface of, e.g., a lock object. Unfortunately, they note the challenge of implementing such connectives, as well as the necessary substructural support.
Mailbox Types. Similarly to the behavioral type systems for the Join Calculus, mailbox types (de’Liguoro & Padovani, Reference de’Liguoro and Padovani2018) use a commutative regular expression of messages to describe the inputs to an actor. Though the description of messages is unordered, sequencing is still expressible using mailbox-passing. Following a familiar technique in behavioral systems, type analysis collects information from the program as the basis for analysis. In this case, it is a dependency graph among mailboxes. The dependency graphs are vital for detecting and preventing programs that may deadlock.
Session Types. Type systems for the $\pi$ -calculus and related process calculi have been widely studied. Session type systems (Honda et al., Reference Honda, Vasconcelos and Kubo1998) in particular have been utilized to describe and verify the communication properties of many kinds of systems. Our methods are closely related to the generic $\pi$ -calculus type system of Igarashi & Kobayashi (Reference Igarashi and Kobayashi2001). They relate processes with abstract process types, including the primary process constructors: channel send and receive, parallel composition, and so on. Their system is parametric over constraints on process types, allowing different instantiations targeted for different properties, such as deadlock prevention.
Multiparty session types (Honda et al., Reference Honda, Yoshida and Carbone2008; Scalas & Yoshida, Reference Scalas and Yoshida2019) come closer to describing the group oriented communications between dataspace actors. Global types provide a perspective for describing the protocol among a group of participants. Though efforts toward adapting session type theories to actor systems have found some success (Mostrous & Vasconcelos, Reference Mostrous and Vasconcelos2011; Crafa Reference Crafa2012; Neykova & Yoshida, Reference Neykova and Yoshida2014), much work remains to be done to handle the full spectrum of actor programming and its relatives, such as the dataspace model.
12 Conclusion
The notation of facets for dataspace actors is good for concurrent actor programming. Our examples show how the notation directly addresses the most common concerns: engaging and disengaging in behavior, managing local state and resources, sharing information during a conversation, etc. Furthermore, the language design captures the key components of behavior, allowing a types-as-processes approach to analyzing temporal behavior.
The facet notation is also good for reasoning about programs. A type system can ensure basic soundness and provides the basis of behavioral verification. The evaluation on realistic programs illustrates the system’s usefulness. Our formal work proves that this approach is well-founded; our evaluation supplies initial evidence of the practicality of the approach.
The work itself suggests several directions for future improvements. Concerning the facet notation, the design could benefit from Avila et al. (Reference Avila, De Koster and De Meuter2020)’s language of communication patterns. Concerning the behavioral type system, precision could be improved by using a dependency notation for correlation information in assertions similar to the work of Scalas et al. (Reference Scalas, Yoshida and Benussi2019) for channel names. Finally, an additional validation could come from porting our work to a widely used actor system, such as Akka. The broadened context would introduce additional research challenges, such as accounting for the possibility of communication failures, as well as shed additional light on the usefulness and usability of our design.
Acknowledgments
We would like to thank Mitch Gamburg for his assistance in developing programs for the evaluation.
Financial support
This work was partially supported by NSF grant SHF 2315884.
Conflicts of interest
None.
Appendix
A Comparing Facet-based and Functional Actors
This section provides additional motivation for the facet notation via a comparison to the results of implementing actor behaviors in a procedural manner.
Figure A1 implements the behavior of the light-actor (Figure 4) using a procedural interface to the Racket dataspace library. In the procedural interface, the behavior of an individual actor is specified as a state transducer function.
The BEHAVIOR function (lines 3–8) implements this interface. Its inputs are a representation of an event and the current-state of the actor. The body of the BEHAVIOR function dispatches to one of several sub-behavior functions for the actor. The current-state is an instance of the light-actor-state struct containing two pieces of information. The first is the actor’s current control state, dictating which conversation(s) to engage in response to an event. The actor’s interactions are simple enough that its control may be encoded as a switch.Footnote 14 The second holds any additional data relevant to the active sub-behavior.
Placing the light-actor into a dataspace as an instance of an actor struct (lines 9–11) requires specifying the initial light-actor-state as well as the actor’s initial assertions, establishing interest in receiving wall-switch information. Note that the actor’s ctrl-state—UNPOWERED—is expressed separately but must be in sync with the initial assertion(s).
The unpowered, powered, and located sub-routines implement the actor’s various sub-behaviors. Each routine computes a transition-to struct as its result. The first slot of the transition-to specifies a new private state for the actor, while the second requests updates to the actor’s current assertions. The updates are a list, with each item calling for the introduction of a new assertion (assert) or withdrawal of an existing one (retract).
A
The unpowered routine (lines 13–19) responds to notification that the wall-switch is ON with a transition-to the POWERED state. In the POWERED state, the actor uses the powered-state struct as its private state to keep track of whether the light is currently ON or OFF. Additionally, it introduces the light assertion and expresses interest in a room-assignment.
The powered routine (lines 21–32) implements two possibilities. The first is the event that the wall switch is turned off (line 23). The resulting transition returns to the “unpowered” state, retracting the powered-specific assertions. The second possibility is the appearance of a room-assignment assertion (line 28). In that case, the actor transitions to the LOCATED state. The private state of the actor must then keep track of both the current ON/OFF status of the light as well as the assigned room. The located-state struct serves this purpose. Finally, the actor introduces a new interest in presence sensor information for the assigned room.
The located routine (lines 34–54) must still react to the withdrawal of wall-switch assertions, in the manner of powered. To accomplish this goal, it invokes the powered function on the event and a synthesized instance of powered-state. A procedure-based actor must determine how to combine the state-changes and actions from each of its conversations on a case-by-case basis. Here, it inspects the resulting transition. If the transition is to the “unpowered” state, the located function augments the transition with the retraction of interest in presence sensor assertions. Otherwise, the actor ignores the transition and proceeds by analyzing the event. Depending on the presence sensor information, the actor either updates the light’s private state and public assertion to ON or OFF.
Analysis. The facet-based implementation enjoys a number of advantages.
Spatial locality. The procedural actor describes its initial assertion in one place and updates to its assertions throughout different locations in the code. The actor manipulates its light assertion in four different locations (lines 18, 26, 46, and 50). The problem is not unique to assertion-manipulating dataspace actors. Consider the code for a networked actor. The code for connecting is typically in one place, while code for using the connection or tearing it down is somewhere else. By contrast, in the facet language, an actor accomplishes the same behavior with a single endpoint—which is less error prone than when functionality is distributed over many locations.
Conversational structure. The organization of the facet code directly reflects the relationships among the actor’s conversations. The interaction with the presence sensor takes place as a sub-conversation in the context of a room-assignment assertion, which is dependent on the conversation with the wall switch. Recovering the same relationships from the implementation of the procedural actor requires a careful inspection of its code.
Localized State and Automatic Demultiplexing. Facet-based actors define state item-by-item and in the relevant context, and dataflow keeps public and private state automatically in sync. Likewise, the control state of the facet-based actor is maintained by the language implementation, which automatically routes incoming assertion patches to the (pattern)matching endpoints.
The contrast between the code in a facet notation and a procedural language is large, even when the latter is enriched with imperative assignment or objects. For simple conversations, switches like those in the light actor combined with assignments or objects will simplify the procedural code. Complications arise for an actor conducting parallel conversations. While each conversation’s state could be encapsulated in an object, doing so would have the unfortunate consequence that the code for manually demultiplexing incoming events and the code for maintaining a local view would be at separate places.
Similarly, a procedural language forces an actor to maintain its control state explicitly. In our concrete example, the ctrl-state field is the symptom of this problem. Continuations (delimited or unconstrained) (Felleisen, Reference Felleisen1988; Flatt et al. Reference Flatt, Yu, Findler and Felleisen2007) do not solve the problem either, as code (or patterns) for managing the continuations must still be deployed to keep track of the state.
Brevity. Finally, the facet-based implementation requires far fewer lines of code.
B Facet Metafunctions
The semantics of Section 5.2 refers to several secondary metafunctions. This section provides their formal definitions.
The ${match_{\mathrm{D}}}$ function matches an event against an event descriptor. Successful matches yield a set of substitutions for the binding variables of the event descriptor’s pattern:
The project function handles matching a pattern against a set of assertions, yielding a set of substititutions. The set is empty when there are no matching assertions in the set:
The helper inst partially matches a pattern against a value, leaving wildcards and binders in place:
The ${match}_{\mathrm{v}}$ function implements basic value-against-pattern matching. We assume without loss of generality that all binders in a given pattern are unique:
The $\oplus$ operation updates an actor’s knowledge of assertions based on a patch:
The evaluator for expressions is eval. The function is partial due to the possibility of unbound field names. A primitive interpretation $\delta$ is a partial function from an operation p and a vector of values the result value:
The ${assertions-of}$ function produces the current assertions made by an actor given its active facet tree and field store:
It utilizes a family of helper functions for determining the assertions associated with endpoints. The $assertions-of_{\mathrm{e}}$ function does the work for assertion endpoints:
While the $assertions-of_{\mathrm{D}}$ metafunction produces the assertion of interest when needed by an event-handler endpoint:
C The Extended Light Actor
The full implementation of the light actor in Figure 4 is a good example for illustrating how the facet machine deals with loading and manipulating states. Here is the fixed version of the actor description, assuming suitable bindings for wall-switch-id and my-id:
Figure C1 shows the complete machine transition sequence, starting from the initial state. From there, the machine executes a labeled transition to get the start instruction of the facet script. A transition launches the script, which then immediately issues an action patch via another labeled transition. At this point, the facet is ready to shut down, which the machine realizes with $\textsf{stop}$ transition.
D Type Judgments
The following definitions complete the type judgment for facet actor terms (section 6.1).
The judgment for event descriptions is :
The judgment for expressions is :
Types suitable for asserting, :
Types suitable for patterns, :
The following metafunctions complete the definition of the type judgment.
The ${bindings}_{\mathrm{D}}$ metafunction creates a type environment from the binding variables in an event description:
It utilizes a helper function for patterns, ${bindings}_{\mathrm{P}}$ :
Two metafunctions prune type environments to help keep track of facet and field names. The first, prune, removes all facet and field names from an environment, i.e.,variables that are not shared between actors:
The second, ${prune-up-to}$ , removes all facet names that are children of a particular facet, giving an environment of names of active facets after that facet stops:
E Dataspace Metafunctions
The semantics of Section 7 refers to several metafunctions. For completeness, this section provides their formal definitions. But, also see the work of Caldwell et al. (Reference Caldwell, Garnock-Jones and Felleisen2020) plus Garnock-Jones & Felleisen (Reference Garnock-Jones and Felleisen2016), which fully specify the semantics of dataspace systems as well. The primary difference between the semantics presented here and prior work is the aggregation of incoming patches performed by ${bc}_{\Delta}$ below.
The ${boot}_{\mathrm{DS}}$ metafunction initializes a dataspace program based on a collection of initial facet-based actor descriptions:
The ${boot}_{\Sigma}$ function boots a process description to an actor state:
The update metafunction is the work horse of dataspace event dispatch:
The ${bc}_{\Delta}$ metafunction updates the pending event for an actor if it has an active interest:
where
The $\circ$ operator applies two patches in sequence, maintaining disjointness:
The helper function patch calculates the difference in assertions between two machine states:
The function translates a facet machine’s internal transition label to a sequence of (zero or one) actions:
The $assertions-of_{\mathrm{M}}$ function extends the ${assertions-of}$ family of functions, determining the active assertions of a facet machine:
The $\oplus_{\mathrm{R}}$ operator updates the table of active assertions based on an actor’s patch:
The ${dispatch}_{\mathrm{DS}}$ function invokes an actor’s behavior on its pending event, when non-empty:
F Type Level Programs
Section 6.2 sketches a semantics for the behavioral types of facets, appealing to a type-level facet machine. In the same vein, Section 7.2 informally presents a semantics for the types of dataspaces. This section supplements these with formal definitions. The definitions of the metafunctions are just like those from Appendix B; their definitions are omitted.
Figure C2 defines the syntax for describing the states of a type-level facet machine and Figure C3 the transition relation.
F.1 Machine Typing
Theorem 1, soundness for the type-level facet machine, needs matching type judgments.
The type judgment for facet machines, $\vdash_{\mathrm{M}}:\mathrm{M_{\mathrm{T}}}$ :
The ${store-bindings}$ function creates a type environment from a store:
The type judgment for facet trees, :
The type judgment for machine instructions, :
The type judgment for pending scripts, :
The type judgment for assertion sets, :
The type judgment for assertions, :
The type judgment for patches, :
The type judgment for events, :
The type judgment for stores, :
The type judgment for substitutions, :
The type judgment for transition labels, :
F.2 Type-level Dataspaces
A type-level dataspace has nearly the same semantics as the one from Section 7, with two modifications. First, assertions range over types $\tau$ . Second, actor behaviors are defined in terms of facet machine types, defined above, with related metafunctions such as ${interp_{\mathrm{M}}}$ lifted to the type level, mutatis mutandis.
G Proof Details
The following lemmas establish that the machine transitions for actors and types are related via typing, under certain conditions. The transitions fall roughly in the following categories:
-
• initialization (Lemmas 13);
-
• transitions in response to an external stimulus (Lemmas 15, 16, 17); and
-
• transitions for performing internal work (Lemmas 18, 19).
Finally, Lemma 20 establishes that related type and term machines make related assertions.
Lemma 13 (Boot). If
-
•
-
•
then
-
•
-
•
Proof By induction on the type derivation and the soundness of the dispatch metafunction (Lemma 16).
Definition 14 (Store Typing).
In other words, initialization of related actor terms and types yields related term and type machines.
Lemma 15 (Matching Events). If
-
•
-
•
then iff with .
Proof By induction on the pattern type derivation.
Essentially, if a pattern matches an assertion, then the pattern’s type matches the assertion’s type, yielding related substitutions, and vice versa.
Lemma 16 (Dispatch). If
-
•
-
•
-
•
-
•
-
•
-
•
then
iff
with
Proof By induction on the type derivations. The dispatch lemma (Lemma 16) lifts to events in general, so the term and type applications of ${match_{\mathrm{D}}}$ yield related substitutions. When applied to related event-handler bodies, the related substitutions yield related scripts.
In other words, dispatching related events yields related scripts to execute.
Lemma 17 (Partial Evaluation). If and then iff with
Proof By induction on the type derivation.
That is, partially evaluating a pending script and its type yields related instructions.
Lemma 18 (Preservation). If and then iff with
Proof Following a standard approach (Wright & Felleisen, Reference Wright and Felleisen1994), we can show that at most one of the machine transition rules can apply. Through the application of related lemmas, such as Lemmas 16 and 17, the updated parts of the machine state remain related via typing.
In other words, if a term machine takes a transition, then its type can take a transition to a related type machine, up to the type of the store. Since type machines do not manipulate or depend on the store at all, the store from the destination term-level machine state is translated to a new store type for the type-level.
Lemma 19 (Progress). If either or there exists $\mathrm{M}',\mathrm{M_{\mathrm{T}}}'$ such that and .
Proof By inspection of the machine state.
Lemma 20 (Machine Assertions). If then
Proof By induction on the type derivation and via similar properties for the family of functions.
This establishes that related term and type machines make the same assertions.
Discussions
No Discussions have been published for this article.