Void-safety: Background, definition, and tools

Background

The primary focus of Eiffel is on software quality. Void-safety, like static typing, is another facility for improving software quality. Void-safe software is protected from run time errors caused by calls to void references, and therefore will be more reliable than software in which calls to void targets can occur. The analogy to static typing is a useful one. In fact, void-safe capability could be seen as an extension to the type system, or a step beyond static typing, because the mechanism for ensuring void-safety is integrated into the type system.

Static typing

You know that static typing eliminates a whole class of software failures. This is done by making an assurance at compile time about a feature call of the form: x.f (a) Such a feature call is judged acceptable at compile time only if the type of x has a feature f and that any arguments, represented here by a, number the same as the formal arguments of f, and are compatible with the types of those formal arguments.

In statically typed languages like Eiffel, the compiler guarantees that you cannot, at run time, have a situation in which feature f is not applicable to the object attached to x. If you've ever been a Smalltalk programmer, you are certainly familiar with this most common of errors that manifests itself as "Message not understood." It happens because Smalltalk is not statically typed.

Void-unsafe software

Static typing will ensure that there is some feature f that can be applied at run time to x in the example above. But it does not assure us that, in the case in which x is a reference, that there will always be an object attached to x at any time x.f (a) is executed.

This problem is not unique to Eiffel. Other environments that allow or mandate reference semantics also allow the possibility of void-unsafe run time errors. If you've worked in Java or .NET you may have seen the NullReferenceException. Sometimes you might have experienced this rather poetic sounding message: "Object reference not set to an instance of an object". In Eiffel you would see "Feature call on void target". All these are the hallmarks of run time errors resulting from void-unsafe software.

Note: If you need a review of difference between reference types and expanded types in Eiffel, see the chapter of the Eiffel Tutorial dedicated to the Eiffel execution model.

Of course this is not an issue with instances of expanded types, because these instances are indeed "expanded" within their parent objects. But we could not imagine a world with expanded types only. References are important for performance reasons and for modeling purposes. For example, consider that a car has an engine and a manufacturer. When we model cars in software, it might be appropriate for engines to be expanded types, as each car has one engine. But certainly the same is not true for manufacturer. Many cars can share, through a reference, a single manufacturer.

So, references are necessary, but we want them to be trouble free.

Void-safe software

Void-safe software, then, is software in which the compiler can give assurance, through a static analysis of the code, that at run time whenever a feature is applied to a reference, that the reference in question will have an object attached. This means that the feature call x.f (a) is valid only if we are assured that x will be attached to an object when the call executes.

Info: This validity rule is called the Target rule, validity code VUTA, and is the primary rule for void-safety. In the following discussion, you will see that other validity rules are involved, too. You can see the formal definition of all validity rules in the ISO/ECMA standard document available online.

Once we have committed ourselves to this validity rule, we must have a strategy for complying with the rule.

Elements of the void-safe strategy

Here are the tools of void-safe trade. They will each be addressed in more detail throughout the documentation that follows. As you look at these elements it helps to try to think about things from the compiler's viewpoint ... after all, it is the compiler that we expect to give us the guarantee that our code is indeed void-safe.

First let's look at a couple of approaches that won't work.

It might occur to us that we could enforce compliance with the target rule by simply eliminating the concept of void references. But this would not be practical. Void is a valuable abstraction that is useful in many situations, such as providing void links in structures. So, we must keep void ... but we want to keep it under control.

Another thought might be that we could just have the compiler do all the work for us. But would be impossibly time consuming for the compiler to investigate every conceivable execution path available to a system to make certain that every possible feature call was made on an attached reference.

So, all of this boils down to the fact that we have to take some actions that help the compiler along. That's what the following are about.

Certified Attachment Patterns (CAPs)

We know that in the context of certain code patterns, it is clear that it would be impossible for a reference to be void. These patterns are identified and we call them CAPs, short for Certified Attachment Patterns. Here is a very straightforward example expressed in a syntax that should be familiar to all Eiffel developers: if x /= Void then -- ... Any other instructions here that do not assign to x x.f (a) end Here a check is made to ensure x is not void. Then as long as no assignments to x are made in the interim, a feature f can be applied to x with the certainty that x will be attached at the time ... and importantly, this can be determined at compile time. So, we say that this code pattern is a CAP for x.

It is important to understand that in this example (and with other CAPs), x is allowed to be a local variable or formal argument only. That is, x may not be an attribute or general expression (with one exception which we will see below). Direct access to class attribute references cannot be allowed via a CAP due to the fact that they could be set to void by a routine call in some execution path invoked by the intervening instructions or possibly even different process thread. In a later section, we will see that this is not quite such a limitation as it may appear at this point.

Note: You will find more useful information about CAPs in More about CAPs. Learn how certain code patterns are determined to be CAPs in What makes a Certified Attachment Pattern.

The attached syntax (object test)

For the purposes of void-safety, the attached syntax does double duty for us. It allows us to make certain that a reference is attached, and it provides us a safe way to access objects that are attached to class attributes.

We noted earlier that this code if x /= Void then -- ... Any other instructions here that do not assign to x x.f (a) end creates a CAP for feature calls on x, but only if x is a local variable or a formal argument.

By using the attached syntax, we can perform an object test on a variable. That is, the attached syntax is a BOOLEAN expression which provides an answer to the question "Is x attached to an object?" At the same time, if indeed x is attached to an object, the attached syntax will deliver to us a fresh local variable, also attached to x's object, on which we can make feature calls. if attached x as l_x then l_x.f (a) end In the example above, x is tested to make certain that it is attached. If so, the new local l_x becomes attached to the same object as x. And so the object can be used safely even if x is a class attribute. So, the attached syntax, is really another CAP, because it provides a clearly verifiable place for the application of features to targets that are guaranteed not to be void.

Note: The attached syntax has other syntax variations as well as other uses. These will be discussed later.

One way to make sure we comply with the target rule would be always use a CAP or the attached syntax every time we want to apply a feature to a referenced object. That might work, but it falls among the impractical approaches to the problem ... it would break a very high percentage of existing Eiffel code, not to mention cluttering things up quite a bit.

Types as "attached" or "detachable"

Rather than trying to protect every feature call, Eiffel allows us to declare any variable as being of an attached type. This is an important extension to the Eiffel type system.

In Eiffel prior to the introduction of void-safe facilities, any reference variable could be set to Void. So, all variables were considered '"detachable"'.

The current standard Eiffel supports a mixture of attached and detachable types. When a variable is declared of an attached type, as in the following example, then the compiler will prevent it from being set to Void or set to anything that can be set to Void.


my_attached_string: attached STRING

It is easy to imagine that the more declarations are of attached types, the easier it will be to guarantee that a call to a void target cannot take place at run time. In fact, if every declaration was guaranteed to be of an attached type, then that would be all that was needed to satisfy the Target rule.

However, it wouldn't be workable to have only attached types, because sometimes it's important to allow references to have a value of Void.

When it is necessary to allow Void as a value, a declaration can use the detachable mark as in the following. my_detachable_string: detachable STRING

This doesn't mean that on every declaration you must put either an attached mark or a detachable mark. Declarations that are unmarked are allowed. If a declaration contains neither attached nor detachable, then it is assumed to be attached.

In Eiffel then, all declarations will have types that are either attached or detachable. As a result, we need only use CAPs and the attached syntax with detachable types. So the important thing to remember is that direct access to class attributes of detachable types is never void-safe.

Attachment and conformance

The distinction between attached and detachable types results in a small but important addition to the rules of conformance. Because variables declared as attached types can never be void, then it is important not to allow any assignment of a detachable source to an attached target. However, assigning an attached source to a detachable target is permissible. The following code shows both cases (as described earlier, class types are attached by default). my_attached_string: STRING my_detachable_string: detachable STRING ... my_attached_string := my_detachable_string -- Invalid my_detachable_string := my_attached_string -- Valid

Initialization rule

If we have attached types, then we can assume variables declared of these types, once attached, will always be attached. But how do they get attached in the first place? That's what the initialization rule is all about.

The rule says that at any place in which a variable is accessed, it must be properly set. A variable's being properly set has a precise, but not particularly simple definition in the Eiffel standard.

Info: You can find the formal definition of the Variable initialization rule, validity code VEVI, and its related concepts such as properly set variables in the ISO/ECMA standard document.

Still, it's not too hard to understand the basics of initializing variables of attached types:

  • For the initialization of attributes of a class, we can apply a rule similar to that of the initial evaluation of class invariants ... that is, everything must be in order upon completion of a creation procedure. If a class attribute is of an attached type, then each of the class's creation procedures is responsible for making sure that the attribute is attached to an object upon its completion.
  • A local variable is considered properly set if it is initialized at some point preceding its use in any execution path in which it is used. So immediately after its create instruction, the local variable would be considered properly set. But if the create occurred in the then part of an if instruction, the local variable would not be properly set in the else part of that same if instruction:

my_routine -- Illustrate properly set local variable local l_my_string: STRING do if my_condition then create l_my_string.make_empty -- ... l_my_string is properly set here else -- ... l_my_string is not properly set here end end

  • A variable is considered properly set if it is self-initializing. What it means to be self-initializing is explained below.

Self-initializing attributes

A self-initializing attribute is guaranteed to have a value when accessed at run time. Declarations of self-initializing attributes are characterized by the use of the code that follows the attribute keyword. The code is executed to initialize the attribute in the case that the attribute is accessed prior to being initialized in any other way.

So, self-initializing attributes are ordinary attributes, with the restriction that they are of both attached types and reference types (i.e., not expanded types or constants). Self-initializing attributes still can be, and typically will be initialized in the traditional ways. The difference is that the code in the attribute part serves as a kind of safety net guaranteeing that a self-initializing attribute will never be void, even if it is accessed prior to being initialized by one of the traditional means.

value: STRING attribute create Result.make_empty end

In the example above, the attribute value will be attached to an object of type STRING, in fact, the empty string, if no other initialization occurs before the first access of value.

Rule for conformance

You will remember that the Eiffel type system dictates that an assignment instruction: x := yis valid only if the type of y is compatible with the type of x. Compatibility, in turn, means either conversion or conformance.

The fact that all types are either attached or detachable adds another dimension to rule for conformance:

  • If x is of an attached type, then y must be of an attached type.

This prevents us from circumventing attached status at run time. If x is of a detachable type, then y could be either a detachable or attached type.

The same goes for routine calls. In a call: z.r (y)where x is the formal argument for r, then if x is of an attached type, then y must be of an attached type.

Stable attributes

Stable attributes are really stable detachable attributes, as adding the concept of stability is meaningful only for detachable attributes. Declaring a detachable attribute as stable, means that it behaves like a detachable attribute except that its assignment rules mimic those of attached attributes. In other words, a stable attribute does not need to be attached during object creation the way that attributes declared as attached must. But like attached type attributes, stable attributes can never be the target of an assignment in which the source is Void or a detachable type. my_test: detachable TEST note option: stable attribute end

This means that even though stable attributes do not need to be initialized like attributes of attached types, once they are attached to an object, they can never be void again.

Stable attributes are also interesting in that they are the only exception to the rule given above in the CAPs section that stated that direct access to attributes cannot be protected by a CAP. A stable attribute can be used under the protection of a CAP. This is because once a stable attribute has an object attached, it can never again be set to Void. So there's no worry about having the attribute's state going unexpectedly from attached to non-attached because of the actions of other routines or threads.

Rule for generic parameters

Generic classes provide another question. A generic class likeclass C [G] ...allows us to create a type by providing a specific actual generic parameter for the formal parameter G.

So, two valid derivations are: my_integer_derivation: C [INTEGER]and my_employee_derivation: C [EMPLOYEE]

If class C contains a declaration: x: GWhat do we know about the void-safety of x ?

In the case of the INTEGER derivation above, we know x is safe because INTEGER is an expanded type. But often types like EMPLOYEE will be reference types which could be void at run time.

For a class like C [G], G is considered detachable. As a result, because of the rule for conformance, any class will work for an actual generic parameter. That means that both of the following are valid generic derivations:

my_detachable_string_derivation: C [detachable STRING] my_attached_string_derivation: C [attached STRING]

If C contains a declaration x: G, the application of features to x must include verification of attachment (CAPs, attached syntax, etc).

Constrained genericity can be used to create generic classes in which the generic parameter represents an attached type. If class C had been defined as: class C [G -> attached ANY] ...then x in this class has attached type G. Consequently, the actual generic type in any derivation must be attached ... and feature calls on x are safe.

Rule for ARRAYs

The rule for generic parameters applies to all generic types ... except ARRAYs. In the typical creation of an ARRAY, we would provide a minimum and maximum index. my_array: ARRAY [STRING] ... create my_array.make (1, 100)During creation, an area to store the appropriate number of entries is also created. And depending upon the actual generic parameter, these entries are either objects for expanded types or references for reference types.

In the case of an actual generic parameter of an attached reference type, all the elements must be attached to instances of type during the creation of the ARRAY. The make procedure would not do this. Creation of an ARRAY in which the actual generic parameter is attached must be done using the make_filled creation procedure. create my_array.make_filled ("", 1, 100)The first argument is an object of the actual generic type, in this case an empty STRING. Every entry in the newly created ARRAY will be initialized to reference this object.

For more detail on void-safe use of arrays and other generic classes, see the section: Using generic classes.