My three Eiffelwishes, part 3: readable types

by David Le Bansais (modified: 2010 Jul 28)

In a previous blog entry, I related how I modified my code to split classes into two parts, separating commands and queries. In the conclusion, I suggested that Eiffel could be extended to support this. I will now formalize the language extension and demonstrate how to use it to write even more solid code.

The Command-Query Separation (CQS) principle states that features of a class should be either commands, which may modify the state of an object but do not return a result, or queries, which return a result but don't modify the object. When classes are designed with CQS in mind, assertions are easier to write: they simply call only features that return a result. This guarantees that a program will produce the same output whether assertions are turned on or off, since they don't change the state of objects.
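
To make the point about assertions concrete, here is a minimal sketch. It is written in C++ rather than Eiffel only so that it can be compiled as-is; the `Counter` class and the `run` and `run_cqs` functions are hypothetical names invented for the illustration. The feature `next` both changes the object and returns a result, so a check that calls it changes the final state, while a check that calls the pure query `value` does not.

```cpp
#include <cassert>

// Hypothetical class that violates CQS: next() changes the state AND returns a result.
class Counter {
public:
    int next() { return ++value_; }       // command and query mixed together
    int value() const { return value_; }  // pure query: no state change
private:
    int value_ = 0;
};

// Simulates running the same routine with assertion checking on or off.
// The "assertion" calls a feature that has a side effect.
int run(bool assertions_enabled) {
    Counter c;
    c.next();
    if (assertions_enabled) {
        bool ok = c.next() > 0;
        (void)ok;
    }
    return c.value();
}

// Same routine, but the check only calls the pure query.
int run_cqs(bool assertions_enabled) {
    Counter c;
    c.next();
    if (assertions_enabled) {
        bool ok = c.value() > 0;
        (void)ok;
    }
    return c.value();
}
```

With the mixed feature, `run(false)` and `run(true)` return different values; with the CQS-compliant check, `run_cqs` returns the same value either way.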

The situation becomes more complicated when the features called in assertions take parameters. They must call only queries, not just on the object being probed but also on the parameters.

Therefore, for CQS to work properly, every class must follow this principle. The burden, for the programmer, is that each call must then be verified manually. In particular, if a query changes the state of one of its parameters, this state must be restored before the query returns the result. It is easy, in these conditions, to make mistakes.

Eiffel doesn't force programmers to follow CQS, but doesn't assist them in that task either. A feature that modifies one of its parameters and returns a result is a valid feature, by the language rules.

First approach

Let's introduce a new query keyword, to mark features as queries (and therefore, features without it as commands). The syntax would be the following:

query feature_name (parameters_list): return_type
body...end

A feature marked with the query keyword would have to follow some rules:

  • Only make unqualified calls to features that are also marked as queries.
  • Only make qualified calls to features that are also marked as queries, unless it applies to a local target.

The first restriction is severe since it forces us to change legacy code to add the query keyword in all classes. This can be alleviated if we trust libraries to follow CQS, and assume all features returning a result are queries.

This approach has some drawbacks:

  • The code of the feature can make an alias to one of the parameters, then call commands on the alias. The example below demonstrates this problem.

query some_feature (p: SOME_CLASS): ANY
local
    param_alias: like p
do
    ...
    param_alias := p
    param_alias.some_command
    ...
end

  • It should be possible to call commands on a parameter, if the object referenced by the parameter is restored to its previous state before the feature returns a result. Forbidding calls to commands is too restrictive.

Another approach: types that are only readable

A better option is to provide a finer grain of control over which feature is a query, and when calls are restricted to queries. This is a two-step process:

  • Mark features of a class that are queries as such.
  • Mark types for any target (including Current) to accept only calls to queries, when appropriate.

For this purpose, query is abandoned, and I'll use readable instead. This choice is discussed later in the post.

Here is a quick example of how it works; the syntax and rules are formalized in the next sections.

class
    A

feature

    readable some_query: SOME_TYPE
    do
        ...
    end

    some_command
    do
        ...
    end

end

class
    CLIENT

feature

    test (p: readable A)
    local
        x: A
        t: SOME_TYPE
    do
        ...
        t := p.some_query
        t := x.some_query
        ...
        p.some_command -- Compiler error: only features marked with readable can be called on p
        x.some_command
        ...
    end

end

In this example, the parameter p is of a type that has the readable tag. Only attributes, and features marked as readable, are available to p. x doesn't have this tag, and therefore all features of A can be called on the object referenced by x. Obviously, the assignment x := p would work around the restriction, so this assignment will be forbidden by the new rules detailed below.

Note that, contrary to the first approach using the query keyword, the some_query feature can do whatever it wants with the object, including changing its state. This allows, for instance, the creation of a data cache to make subsequent calls to the feature faster.
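
As a rough analogy for readers who know C++ (an illustration only, not a definition of the proposed Eiffel semantics): a readable feature resembles a `const` member function, a readable type resembles a reference to `const`, and the cache mentioned above resembles a `mutable` member. The sketch below assumes that mapping; the class, its members and the detection helper `can_call_command` are hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

class A {
public:
    // Analogue of `readable some_query`: callable on a read-only target,
    // yet free to fill a cache on the first call.
    int some_query() const {
        if (!cached_) {
            cache_ = 6 * 7;  // stands in for a slow computation
            cached_ = true;
            ++computations_;
        }
        return cache_;
    }

    // Analogue of `some_command`: not callable on a read-only target.
    void some_command() { cached_ = false; }

    int computations() const { return computations_; }

private:
    mutable bool cached_ = false;
    mutable int cache_ = 0;
    mutable int computations_ = 0;
};

// Compile-time check: can some_command be called on a target of type T?
template <typename T>
auto can_call_command_impl(int)
    -> decltype(std::declval<T>().some_command(), std::true_type{});
template <typename T>
std::false_type can_call_command_impl(...);
template <typename T>
using can_call_command = decltype(can_call_command_impl<T>(0));

static_assert(can_call_command<A&>::value, "x: A accepts commands");
static_assert(!can_call_command<const A&>::value, "p: readable A rejects commands");

// Analogue of `test (p: readable A)`: only queries are available on p.
int test(const A& p) {
    p.some_query();
    return p.some_query();  // second call is served from the cache
}
```

The two `static_assert` lines play the role of the compiler error in the Eiffel example: the command is available on a plain target but not on a read-only one.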

Formal rules

I introduce a new keyword: readable. This keyword is used in two different contexts: readable features and readable types.

Readable features

The BNF syntax for feature declarations is modified as follows:

Feature_declaration: [readable] New_feature_list Declaration_body

New_feature_list: {New_feature "," …}+

New_feature: [frozen] Extended_feature_name

In this context, all feature names (some can be frozen) refer to a single feature, and this feature has the optional readable mark. This mark will be checked if the target of a call has only restricted access to the class the feature belongs to.

Readable types

Everywhere a construct is declared with a type, the type declaration is modified as follows:

Type: Readable_Type | Anchored

Anchored: [Attachment_mark] like [readable] Anchor

Readable_Type: [readable] Type_With_Access

Type_With_Access: Class_or_tuple_type | Formal_generic_name

These rules allow the following uses of readable:

  • x: readable STRING
  • x: readable ARRAY[INTEGER]
  • x: ARRAY[readable STRING]
  • x: TUPLE[r_str: readable STRING; str: STRING]
  • x: readable TUPLE[INTEGER; readable BOOLEAN]
  • x(r_str: readable STRING) do...end
  • x(a,b,c: INTEGER; str: readable STRING): readable STRING do...end
  • inspect x when {STRING} ...when {readable STRING} ... end
  • x := create {readable STRING}.make_empty
  • create {readable STRING} x.make_empty
  • if attached {readable STRING} x as r_str then...end
  • agent f({STRING}?, {readable STRING}?)
  • x := {readable INTEGER}.min_value
  • x: like readable feature_name
  • x: like readable Current

In the non-object call {readable INTEGER}.min_value, there is no object whose state can be modified, so in practice there is no difference between {readable INTEGER}.min_value and {INTEGER}.min_value: if one is a valid (or invalid) call, the other is too.

Anchored types

With the rules from the previous section, the syntax x: readable like feature_name is not permitted.

If x is declared as x: like readable feature_name and feature_name's type is already readable, a warning is issued. This is because feature_name can only be redefined in a type that is also readable (see the conformance section below).

Inheriting as readable

The inherit clause of a class is modified as follows:

Inherit_clause: inherit [Non_conformance] Parent_list

Parent_list: {Parent ";" …}+

Parent: [readable] Class_type [Feature_adaptation]

A parent class inherited as readable will not make commands available to the child class. More specifically, the only features available to unqualified calls in the parent class are:

  • Features that are marked with readable
  • Attributes
  • Features listed in the create clause. These are included so the parent can initialize properly.

Note that nothing prevents a feature in the descendant from calling one of the creation features of the parent. It's the responsibility of the programmer to call them only in the context of an object creation.

Conformance rules will be updated accordingly (see the corresponding section).

Generic constraints

The declaration of formal generics isn't modified, but it uses the new declaration of types:

Multiple_constraint: "{" Constraint_list "}"

Constraint_list: {Single_constraint "," …}+

Single_constraint: Type [Renaming]

It is therefore possible to write the following:

class A[G->readable ANY]

x: A[readable STRING]
x: A[STRING] -- This shouldn't compile

Conversion

Conversion clauses can be used to convert to and from readable types, with a separate feature for the type tagged as readable if desired. For instance, we can have:

class
    A

create
    make_from_string,
    make_from_readable_string

convert
    make_from_string({STRING}),
    make_from_readable_string({readable STRING}),
    to_string: {STRING},
    to_readable_string: {readable STRING}

feature

    make_from_string(s: STRING) do...end

    make_from_readable_string(s: readable STRING) do...end

    readable to_string: STRING do...end

    readable to_readable_string: readable STRING do...end

end

Most classes will just use the version with a readable input, since they typically don't modify the object being converted. In this example, I marked to_string and to_readable_string as readable, since they are queries.

Call restrictions

In qualified or unqualified calls, there is always a target of some type T. Sometimes this type is explicit, and sometimes it's inferred using many rules, but in the end there is always a type T. This type may or may not have the readable tag. If it doesn't, all usual rules apply to decide if the call is valid, like the export status of the called feature for instance.

If the type T is tagged as readable, the call is valid only if, in addition, at least one of the following holds:

  • The feature is marked as readable
  • The feature is an attribute
  • The call is a creation instruction (or expression)

The last case is needed to allow the creation of objects that have been declared as readable, without making the code needlessly complicated. For instance, we can always write the following:

local
    x: readable STRING
do
    create {STRING} x.make_empty
    ...
end

But it's just easier to use create x.make_empty.

Assignment

Assigning a new value to an attribute can be a regular assignment, a tuple field assignment, a conversion or the call to an assigner. The last two cases are handled with the call restriction rules described above.

The case of a regular attachment, for reference types, is a simple conformance check, described in the next section. For expanded types, assigning to a readable entity is invalid.

For tuples, assigning a field of a readable tuple is invalid. Consider the code below:

x: TUPLE[s: readable STRING]
y: readable TUPLE[s: STRING]

x.s := "" -- Valid, STRING conforms to readable STRING
y.s := "" -- Invalid, y is readable
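
The difference between a readable field and a readable tuple can be sketched with the C++ `const` analogy, under the assumption that pointers stand in for Eiffel references. The structs `X` and `Y` and the aliases `XField` and `YField` are hypothetical names for this illustration: a pointer to `const` inside a writable struct can be re-attached, while any field of a `const` struct cannot.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// x: TUPLE[s: readable STRING] -- the field refers to a read-only string.
struct X { const std::string* s; };
// y: readable TUPLE[s: STRING] -- modelled below as `const Y`.
struct Y { std::string* s; };

// Type of the expression `x.s` on a writable X, and of `y.s` on a read-only Y.
using XField = decltype((std::declval<X&>().s));        // const std::string*&
using YField = decltype((std::declval<const Y&>().s));  // std::string* const&

static_assert(std::is_assignable<XField, std::string*>::value,
              "x.s can be re-attached: the tuple itself is writable");
static_assert(!std::is_assignable<YField, std::string*>::value,
              "y.s cannot be re-attached: the tuple is read-only");

// Runtime counterpart of the valid assignment `x.s := ""`.
bool rebind_field() {
    std::string str = "";
    X x{nullptr};
    x.s = &str;
    return x.s == &str;
}
```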

Conformance

The check of conformance between two types must be extended to handle differences in the readable tag. This can be seen as an extra dimension in the description of classes and types. The rules are simple:

  • T conforms to readable T: if x is of type readable T and y is of type T, then x := y is valid.
  • readable T doesn't conform to T. Some languages may allow it, like C++ with the const_cast operator, but in my experience there is always a way to redesign the code to remove the need for this.
  • If class B is a descendant of the base class A, but with the inherit readable A clause, readable B conforms to readable A, but B doesn't. Neither conforms to A.
  • If G is a generic parameter with a readable constraint, only readable types conform to the constraint.
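
The first two rules have a direct counterpart in C++ pointer conversions, which the following sketch checks at compile time. It assumes the same analogy as the `const_cast` remark above (pointer to `const` standing in for a readable reference); the struct `T` and the function `attach_readable` are hypothetical. The rules about `inherit readable` and generic constraints have no equivalent here and are not covered.

```cpp
#include <cassert>
#include <type_traits>

struct T { int value = 0; };

// T conforms to readable T.
static_assert(std::is_convertible<T*, const T*>::value,
              "a writable reference can be attached to a read-only one");
// readable T doesn't conform to T.
static_assert(!std::is_convertible<const T*, T*>::value,
              "a read-only reference cannot be attached to a writable one");

// x: readable T; y: T; x := y
const T* attach_readable(T* y) {
    const T* x = y;  // implicit conversion, no cast needed
    return x;
}
```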

Redefinition of a readable feature

If a feature is marked as readable, all redefinitions in descendants must also be readable. The reasoning is that, since the feature is a query, it should remain a query even if the implementation changes in descendants. Remember that readable features can still change the state of the object: the mark only identifies the feature as a query, and it doesn't put any constraints on what the redefined versions do.

Error messages

Along with the introduction of the concept of readable features and types, I created a new category of specific error messages, VURE. They correspond to the following errors:

VURE1
Calling a feature without the readable mark on a target of a readable type.
Details:

  • Enclosing feature name
  • Called feature name
  • Type name (readable)

VURE2
Renaming, changing the export status, redefining, undefining, selecting and otherwise using a feature when the parent class is inherited as readable, and the feature doesn't have the readable mark.
(This replaces VHRC, VLEL, VDRS, VDUS, VMSS, VEEN since it's a special case of these validity rules)
Details: same as the replaced error.

VURE3
A formal generic parameter is constrained to be a readable type, but the actual type is not readable.
(This replaces VTGD2 since it's a special case of constrained genericity rules)
Details: same as the replaced error.

VURE4
Assignment to a target of readable expanded type
(This covers the case of assigning the field of a readable tuple as well)
Details:

  • Enclosing feature name
  • Target of the assignment
  • Type name (readable)

VURE5
A readable type doesn't conform to a non-readable type
(This replaces VNCC since it's a special case of conformance rules)
Details: same as the replaced error.

VURE6
A readable feature is redefined but the redefinition doesn't have a readable mark
(This replaces VDRD since it's a special case of redefinition rules)
Details: same as the replaced error.

The choice of readable

I chose to use only one keyword, when in fact it's used for two separate concepts:

  • Readable features, marking a feature as a query
  • Readable types, an attribute of types used for additional validity rules.

Ideally, I should have kept the query keyword for features, and the readable keyword for types. In my first example, the code would look like this:

class
    A

feature

    query some_query: SOME_TYPE
    do
        ...
    end

    some_command
    do
        ...
    end

end

class
    CLIENT

feature

    test (p: readable A)
    local
        x: A
        t: SOME_TYPE
    do
        ...
    end

end

I think it makes more sense to mark queries with the query keyword, and types that can only be used to read the state of an object with readable. A counter argument is that, since both concepts are related, it makes sense to use just one keyword for both, and readable STRING seems a better choice than query STRING.

There is of course the fact that the number of keywords and reserved identifiers should be kept to a minimum. And finally, since legacy code will sometimes use them and therefore must be updated, it's better to just have one identifier name to change. On this subject, I found that readable is used as an identifier in some classes of EiffelStudio, but replacing it with a different name wasn't too much of an effort.

Application to manifest strings

A special case that could use the concept of readable types is the basic type STRING. In Eiffel, it is possible to modify manifest strings after creation:

local
    s: STRING
do
    s := "Hello"
    s.append(", world");
    print (s) -- Prints "Hello, world"
end

Some consider this behavior normal and expected, while others point out that manifest strings look like constants. A solution to reconcile both views could be to offer the option to have manifest strings be of type readable STRING. With the option set, s := "Hello" would be rejected when s is a plain STRING, since readable STRING doesn't conform to STRING. Programmers who want manifest strings to behave as constants can then set the option and write:

local
    s: STRING
do
    create s.make_empty
    s.append("Hello");
    s.append(", world");
    print (s)
end
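
For comparison, C++ already treats string literals this way: a literal is read-only, and a modifiable string has to be built by copying it. The sketch below mirrors the Eiffel code above under that analogy; `build_greeting` is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// A C++ string literal is read-only: its type is an array of const char.
static_assert(std::is_same<decltype("Hello"), const char (&)[6]>::value,
              "string literals are const");

std::string build_greeting() {
    std::string s;        // create s.make_empty
    s.append("Hello");    // copies the read-only literal into s
    s.append(", world");
    return s;
}
```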

Improper uses of readable types

The concept of readable types doesn't make all programs foolproof. If a class A makes one of the objects of class B it references available to clients, the B object can be modified even if A is readable. Consider this example:

class
    B

feature

    some_command
    do
        ...
    end

end

class
    A

feature

    readable some_query: B
    do
        ...
    end

end

class
    CLIENT

feature

    test
    local
        a: readable A
        b: B
    do
        ...
        b := a.some_query
        b.some_command -- Valid since b is of type B, not readable B.
        ...
    end

end

We shouldn't change the design of readable types to force some_query in the example above to return a readable B. Perhaps A is some factory, and some_query is designed to build new objects, intended to be fully modifiable by the client.
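
The same loophole exists in C++, where `const` is also shallow: a `const` member function may hand out a writable pointer to an object it references. A minimal sketch, assuming the `const` analogy used by the conformance rules; the classes mirror the Eiffel example, and `demo_loophole` is a hypothetical name.

```cpp
#include <cassert>

struct B {
    int state = 0;
    void some_command() { ++state; }
};

class A {
public:
    explicit A(B* b) : b_(b) {}
    // A query callable on a read-only A, yet it returns a writable B.
    B* some_query() const { return b_; }
private:
    B* b_;
};

int demo_loophole() {
    B shared;
    const A a(&shared);     // a: readable A
    B* b = a.some_query();  // b: B, not readable B
    b->some_command();      // valid: modifies the object referenced by a
    return shared.state;
}
```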

The workaround is to change the code slightly and handle both cases:

class
    A

feature

    readable some_query_readable: readable B
    do
        Result := some_query
    end

    some_query: B
    do
        ...
    end

end

class
    CLIENT

feature

    test
    local
        a: readable A
        b: readable B
    do
        ...
        b := a.some_query_readable
        ...
    end

end

Notice how some_query lost the readable tag. It is now only available to clients with unrestricted access. The other version some_query_readable returns a readable type, as expected, and is therefore safe.

This pattern is probably pretty common, and leads to some code duplication. If there were a way for the designer of A to indicate that some_query should return a readable B when called on a readable A object, all this code would be unnecessary.

I didn't find a satisfactory syntax without introducing a new keyword. I believe it could be done reusing the like keyword:

readable some_query_readable: B readable like Current

But that's some verbose and confusing English!
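
For comparison, C++ has the same duplication with `const`. The usual idiom there is a pair of overloads with the same name, one selected for read-only targets and one for writable targets, so the client at least uses a single name. The sketch below shows that idiom as a point of reference, not as a proposal for the Eiffel syntax; the classes mirror the example above and `demo` is a hypothetical name.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct B {
    int state = 0;
    void some_command() { ++state; }
};

class A {
public:
    // Selected for a writable target: returns a writable B.
    B& some_query() { return b_; }
    // Selected for a read-only target: returns a read-only B.
    const B& some_query() const { return b_; }
private:
    B b_;
};

// The result type follows the target type.
static_assert(std::is_same<decltype(std::declval<A&>().some_query()), B&>::value,
              "writable target, writable result");
static_assert(std::is_same<decltype(std::declval<const A&>().some_query()), const B&>::value,
              "read-only target, read-only result");

int demo() {
    A a;
    a.some_query().some_command();  // allowed: a is writable
    const A& r = a;
    // r.some_query().some_command() would not compile: the result is read-only.
    return r.some_query().state;
}
```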

Implementation

I have started implementing these changes, based on version 6.6.8.3873 of EiffelStudio (the current official release). Unfortunately, the compiler code is quite complex, and my time is limited. I can offer a partial implementation: basically everything that can emit a VURE1 or VURE4 error code, with the exception of the like readable Feature_Name syntax.

It's available for the 32-bit Windows platform at http://www.filefactory.com/file/b2bh3cc/n/ec.zip (for a week). Other platforms are available on request, provided I can fix there the compilation issues I ran into lately and have already fixed for Windows.

If time allows, I hope to realize a complete implementation, although this partial version is sufficient for my own needs.

Comments
  • Helmut Brandl (29/7/2010)

    Your approach is very interesting.

    However I don't understand what it means to mark a feature which is already a query with `readable'. A query is already a query. A query should not change the state of any already existing object. I admit that this is not easy to check, because pure syntax analysis is not sufficient. But this is the common understanding of a query.

    • David Le Bansais (29/7/2010)

      You're right that in almost every case a query doesn't change the state of the object. However, sometimes the query can take a long time to execute, and to make it faster the result can be saved in a cache.

      So, in practice a query can modify an object, even though it will always return the same result.

      If we want to enforce CQS on every program, then the readable mark becomes unnecessary. But CQS might not be the best way to implement a class every time. It's mentioned in http://en.wikipedia.org/wiki/Command-query_separation, although I don't find the arguments compelling on either side.

      So I wanted to keep some freedom in deciding when an object state is really changing, and introduced the readable mark for features.

      • Helmut Brandl (29/7/2010)

        Ok, now I understand the reason to mark a query readable.

        Saving the result of a query in a cache in order to avoid lengthy recalculations is not a real state change. I don't consider this a violation of CQS. It could even be checked by the compiler. We could add a validity rule that a query can modify a private attribute only if the attribute is bound in the class invariant to be equal to the result of a query.

    • David Le Bansais (30/7/2010)

      I forgot another reason to use readable for features

      If you take a look at the second-to-last section in my post, it has this code:

      class
          A

      feature

          readable some_query_readable: readable B
          do
              Result := some_query
          end

          some_query: B
          do
              ...
          end

      end

      Notice how some_query doesn't have the readable tag? This is because it's intended for targets with full access. Readable targets should use some_query_readable. But as I posted, I'm not happy with the way it looks and would rather have a single feature with a return type that depends on the target.

      • Manu (31/7/2010)

        The SCOOP model introduces a new typing rule in which the return type and the argument types of a feature depend on the type of the target. This could possibly be reused in your case. See http://se.ethz.ch/teaching/ss2007/0268/lectures/09-coop-scoop.pdf for more details on the SCOOP type system.

        • David Le Bansais (31/7/2010)

          Thanks for the link, it's an interesting read.

          Unless I'm mistaken, the result type of a feature in that system is fixed by the rules. In my case, it's an option of the feature to adapt the type of the returned value to the target or not.

          • Factories will want to return a reference to a writable entity
          • Most other queries will return a reference to a read-only entity if the target is read-only.

          I'll come up with something as I continue with the implementation.

  • Colin LeMahieu (31/7/2010)

    Can some formal language be put in to parts of this stating the goal of the 'query' or 'readable' keywords saying whether that should signify a 'pure function' or a 'referentially transparent' function or other?

    It's interesting that you describe manual caching. Most languages that enforce or understand referential transparency will have the compiler or runtime do the common subexpression caching.

    • David Le Bansais (31/7/2010)

      I'm sorry, I don't understand which part of my post, or comments, your question is referring to. I'd like to answer, but I have the feeling my answer would be irrelevant for your question.

      • Colin LeMahieu (31/7/2010)

        I see in one of the examples a 'readable' version calling a non-readable version; this seems backwards.

        Usually with other languages the process goes like this:

        thing (i: INTEGER_32; j: INTEGER_32)
        local
            a: INTEGER_32
        do
            a := slow (i, j) + slow (i, j) * slow (i, j) * slow (i, j)
        end

        slow (i: INTEGER_32; j: INTEGER_32): INTEGER_32
        do
            ... -- Slow operation
        end

        If slow is referentially transparent, the compiler can cache the value of 'slow' and only evaluate the function once.

        Since a lot of academic research has been done on the topic of referential transparency, it would be nice to know how this relates to the 'readable' keyword, if at all.

        • David Le Bansais (31/7/2010)

          Ha, I see what you mean now!

          The readable keyword for a feature marks it as available for readable types, so a compiler could certainly perform the optimization you described for such features, since it's an explicit hint that the feature will return the same result. No research necessary here.

          I'm reluctant to use the 'referentially transparent' expression because I'm not familiar with it and it might imply other properties of the feature, so I'll stick to readable here.

          The reason the call looks backward in my example with some_query and some_query_readable is that if a feature lacks the readable mark, it doesn't necessarily mean it modifies the state of the object. In fact, in that example, neither does. So one can call the other. But the lack of a readable mark for some_query makes it unavailable for readable types, and this is the property I was trying to demonstrate in the example. If the target is of a readable type, only some_query_readable can be called on it, and that feature returns a reference to a readable type. If the target is of a 'normal' type, both features can be called on it to get whichever version of the return type is wanted.

          Manual caching, in my experience, happens a lot more often than this compiler optimization. Slow queries aren't that common, and usually parameters will vary. Finally, all the caches I designed contain more data than initially requested and can satisfy other requests. But, of course, this is the experience of a single programmer (see Prof. Meyer's recent post on that subject).