Moving to a readable/writable model
Introduction
Work a long time on your code, and you will have countless opportunities to experiment with software design. Today, I'd like to describe how I moved some classes to a new model for separating queries and commands.
The need for this arose from complex assertions. It turns out that if the code called during assertion evaluation involves a large number of classes and features, it becomes difficult to ensure that it has no side effects. My first approach was to verify the state of my objects dynamically and ensure it wasn't modified. Lately, I moved to a more static approach whereby classes involved in assertion evaluation only contain queries. It is this work I present here. The tradeoffs are discussed in the conclusion.
Starting with an example
I will begin with a SAMPLE class, containing some queries and commands.
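A minimal sketch of what such a class might look like. Only reference_value, reference_attribute and expanded_value are names taken from the text; make, expanded_attribute and the two setters are assumed names chosen for illustration, as are the INTEGER and STRING types.

```eiffel
class
    SAMPLE

create
    make

feature {NONE} -- Initialization

    make
            -- Start with a zero value and an empty string.
        do
            expanded_attribute := 0
            create reference_attribute.make_empty
        end

feature -- Queries

    expanded_value: INTEGER
            -- Value of `expanded_attribute'; the caller receives a copy.
        do
            Result := expanded_attribute
        end

    reference_value: STRING
            -- Direct reference to `reference_attribute';
            -- a client can modify the string through it.
        do
            Result := reference_attribute
        end

feature -- Commands

    set_expanded_value (a_value: INTEGER)
            -- Assumed setter name, for illustration only.
        do
            expanded_attribute := a_value
        ensure
            value_set: expanded_value = a_value
        end

    set_reference_value (a_value: STRING)
            -- Assumed setter name, for illustration only.
        do
            reference_attribute := a_value
        ensure
            value_set: reference_value = a_value
        end

feature {NONE} -- Implementation

    expanded_attribute: INTEGER

    reference_attribute: STRING

end
```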
This is close to the most basic example of a class with queries and commands. I separated expanded and reference types to demonstrate how a query can indirectly allow side effects to happen during execution. Here, a client class can call the reference_value feature to obtain access to reference_attribute and modify it. In the case of the expanded_value feature, the object is copied and can't produce side-effects.
The new model consists of splitting SAMPLE in two classes, SAMPLE and READABLE_SAMPLE.
Splitting the class
Let's see how we can do it on the sample code. All queries that can't produce side effects are moved to READABLE_SAMPLE, and the remaining features stay in SAMPLE. Finally, SAMPLE inherits from READABLE_SAMPLE to produce a class that behaves identically to the old version. Creation procedures can't be considered as producing side effects, but they can be called by clients if their export status is not {NONE}, and in that case they are commands. This problem is easily fixed by setting the export status to {NONE} in READABLE_SAMPLE, and to the value it should have in the SAMPLE class.
Here is a first draft of the new code.
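A sketch of how this first draft could look, under assumed names: make, expanded_attribute and the two setters are illustrative and not taken from the text. Each class would live in its own file.

```eiffel
-- File: readable_sample.e
class
    READABLE_SAMPLE

create
    make

feature {NONE} -- Initialization

    make
            -- Not exported, so clients cannot call it as a command.
        do
            create reference_attribute.make_empty
        end

feature -- Queries

    expanded_value: INTEGER
        do
            Result := expanded_attribute
        end

    reference_value: STRING
            -- Still a direct reference to `reference_attribute'.
        do
            Result := reference_attribute
        end

feature {NONE} -- Implementation

    expanded_attribute: INTEGER

    reference_attribute: STRING

end

-- File: sample.e
class
    SAMPLE

inherit
    READABLE_SAMPLE

create
    make

feature -- Commands

    set_expanded_value (a_value: INTEGER)
            -- Assumed setter name, for illustration only.
        do
            expanded_attribute := a_value
        end

    set_reference_value (a_value: STRING)
            -- Assumed setter name, for illustration only.
        do
            reference_attribute := a_value
        end

end
```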
This first approach allows us to pass references to READABLE_SAMPLE objects when we don't want clients to be able to modify the object. It has some flaws, one of them already mentioned.
- It does nothing about commands inherited from ANY, most notably copy.
- The object can be modified if a client calls reference_value to get a reference to reference_attribute.
Before we refine the model, let's see an example of code that takes advantage of it.
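A possible client, sketched under assumptions: SAMPLE_CLIENT, process and set_expanded_value are hypothetical names, and the check performed by is_valid is arbitrary. Only is_valid and a_sample come from the text.

```eiffel
class
    SAMPLE_CLIENT

feature -- Status report

    is_valid (a_sample: READABLE_SAMPLE): BOOLEAN
            -- Only queries of READABLE_SAMPLE can be called on `a_sample';
            -- a call such as `a_sample.set_expanded_value (0)' would be
            -- rejected at compile time.
        do
            Result := a_sample.expanded_value >= 0
        end

feature -- Basic operations

    process (a_sample: SAMPLE)
            -- The precondition receives `a_sample' through its
            -- read-only type, since SAMPLE conforms to READABLE_SAMPLE.
        require
            valid_sample: is_valid (a_sample)
        do
            a_sample.set_expanded_value (a_sample.expanded_value + 1)
        end

end
```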
In is_valid only queries can be called, and if a_sample is passed to other routines, they will also be unable to call commands. Actually, this is not entirely true. Since the object is in fact a SAMPLE object, a client routine could obtain a reference to the object as SAMPLE from the reference to READABLE_SAMPLE with a Certified Attachment Pattern (CAP). Consider the code below:
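A sketch of such a bypass, using an object test. ROGUE_CLIENT, sneak_write and set_expanded_value are hypothetical names used for illustration.

```eiffel
class
    ROGUE_CLIENT

feature -- Basic operations

    sneak_write (a_sample: READABLE_SAMPLE)
            -- Recover the writable view of `a_sample' when its
            -- dynamic type is SAMPLE, then call a command on it.
        do
            if attached {SAMPLE} a_sample as l_writable then
                l_writable.set_expanded_value (42)
            end
        end

end
```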
There is nothing we can do about this other than performing a copy of the object into a new READABLE_SAMPLE object. But this hurts performance, and the new model becomes useless since we could have kept the SAMPLE class as is, with all side effects happening in the copy. We will assume that the writer of the code respects the intent of the new model and doesn't do such a horrible thing!
Refining the model
The case of the copy feature is typical and we will meet it again when we consider descendants of SAMPLE. Here is a feature that should not be in READABLE_SAMPLE and yet must be in SAMPLE. The solution to this problem is two-fold:
- We change the export status of copy so that clients cannot call it. The {ANY} export status will be reinstated in SAMPLE.
- As an additional precaution, we rename copy into a new name intended to discourage descendants of READABLE_SAMPLE from using it.
Indeed, any class inheriting from READABLE_SAMPLE could call copy. By changing its name, we make sure that descendant classes know what they are doing. The original name can be restored in SAMPLE, in the same way the export status is.
This is how we handle the first problem. But what about the second? We have two options:
- Return a copy of reference_attribute instead of a direct reference to it. This is at the cost of performance and should be avoided when possible.
- If reference_attribute is of a type that has a corresponding READABLE_ counterpart, return the read-only type in READABLE_SAMPLE, and the full type in a redefined version of the query in SAMPLE.
Fortunately for us, STRING is already implemented like that since it inherits (in its latest incarnation) from READABLE_STRING, or more accurately from the corresponding sized variant. The solution is therefore to have two versions of reference_value with different return types. We can take advantage of this to make both versions available in SAMPLE, since returning a read-only version makes sense even for the modifiable class. But in this case they must have different names. This is easily obtained with renaming, as the reader can see below:
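A sketch of the refined pair of classes, under assumptions: STRING is taken to be STRING_8, so the read-only type is READABLE_STRING_8; forbidden_copy and make are illustrative names; the commands of SAMPLE are omitted for brevity. Each class would live in its own file.

```eiffel
-- File: readable_sample.e
class
    READABLE_SAMPLE

inherit
    ANY
        rename
            copy as forbidden_copy
        export
            {NONE} forbidden_copy
        end

create
    make

feature {NONE} -- Initialization

    make
        do
            create reference_attribute.make_empty
        end

feature -- Queries

    reference_value: READABLE_STRING_8
            -- Read-only view of `reference_attribute'.
        do
            Result := reference_attribute
        end

feature {NONE} -- Implementation

    reference_attribute: STRING

end

-- File: sample.e
class
    SAMPLE

inherit
    READABLE_SAMPLE
        rename
            forbidden_copy as copy,
            reference_value as readable_reference_value
        export
            {ANY} copy
        end

create
    make

feature -- Queries

    reference_value: STRING
            -- Full access to `reference_attribute'.
        do
            Result := reference_attribute
        end

end
```

With this layout, a call to reference_value through a READABLE_SAMPLE reference still reaches the read-only version, even when the object is a SAMPLE, because the inherited feature keeps its identity under its new name.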
Here, a client of SAMPLE can obtain either full access to reference_attribute by calling reference_value, or read-only access by calling readable_reference_value. A client of READABLE_SAMPLE can only get the read-only reference with a call to reference_value. Hence the semantics of this feature are preserved.
The sample class doesn't have an invariant, because invariants didn't create any problem for me. I just moved all invariants to the readable version of the split classes.
Inheritance
Single inheritance
Things can be formalized when a descendant inherits from a parent and both can be split. Let's call PARENT and READABLE_PARENT the classes to inherit from, CHILD and READABLE_CHILD the descendants. The inheritance scheme is CHILD -> READABLE_CHILD -> PARENT -> READABLE_PARENT. What we need is to make commands and queries that can lead to side effects coming from PARENT unavailable to clients of READABLE_CHILD.
- In READABLE_CHILD
- Commands are renamed as forbidden_<command name> and their export status changed to {NONE}.
- Queries allowing side effects have a readable_<query name> version. We rename <query name> to forbidden_<query name> and readable_<query name> to <query name>. Similarly to commands, we also change their export status.
- In CHILD
- We reinstate commands with their original name and export status.
- We rename queries again (and change export) to the readable_<query name> and <query name> versions.
The resulting code, quite complicated, is listed below. SAMPLE and READABLE_SAMPLE are used to demonstrate queries that return a read-only version of an object to avoid side-effects.
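A sketch of the four classes, following the rules above. It assumes the SAMPLE and READABLE_SAMPLE classes of the article with a creation procedure named make; count, set_count, sample and sample_attribute are hypothetical features. Each class would live in its own file, and the export restriction in READABLE_CHILD relies on the compiler accepting it.

```eiffel
-- File: readable_parent.e
class
    READABLE_PARENT

create
    make

feature {NONE} -- Initialization

    make
        do
            create sample_attribute.make
        end

feature -- Queries

    count: INTEGER

    sample: READABLE_SAMPLE
            -- Read-only view of the owned sample.
        do
            Result := sample_attribute
        end

feature {NONE} -- Implementation

    sample_attribute: SAMPLE

end

-- File: parent.e
class
    PARENT

inherit
    READABLE_PARENT
        rename
            sample as readable_sample
        end

create
    make

feature -- Queries

    sample: SAMPLE
            -- Writable view of the owned sample.
        do
            Result := sample_attribute
        end

feature -- Commands

    set_count (a_count: INTEGER)
        do
            count := a_count
        end

end

-- File: readable_child.e
class
    READABLE_CHILD

inherit
    PARENT
        rename
            set_count as forbidden_set_count,
            sample as forbidden_sample,
            readable_sample as sample
        export
            {NONE} forbidden_set_count, forbidden_sample
        end

create
    make

end

-- File: child.e
class
    CHILD

inherit
    READABLE_CHILD
        rename
            forbidden_set_count as set_count,
            sample as readable_sample,
            forbidden_sample as sample
        export
            {ANY} set_count, sample
        end

create
    make

end
```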
As long as all classes follow this model, assertions can be coded using only the READABLE_xxx version of classes as parameters.
On one occasion, almost the entire code of a class was dedicated to creating the object. This forced me to put all the code in the readable version of the class. If no class other than SAMPLE inherits from READABLE_SAMPLE, then it's not much of a problem. Otherwise, the solution consists of implementing these features directly with the forbidden_ prefix and an export to {NONE}.
Multiple inheritance
The case of multiple inheritance is slightly more complicated, but the rules are identical. In one case I had to merge different versions of copy. Taking the example of the PARENT and CHILD classes again, here is the corresponding code.
Creation of readable objects
In my sample code, READABLE_SAMPLE is a class with a create clause. That is, READABLE_SAMPLE objects can be created, and no command can ever be called on them. They represent constant objects with a state fixed at creation. This turned out to be a problem in my code, because of anchored types and result types that are sometimes tied to the readable version of SAMPLE. For instance, if a query returns a READABLE_SAMPLE object destined to be stored and modified later, no modification will actually be possible if the object was created with code like create Result.make. I tracked down these cases as follows:
- I made all readable versions of classes deferred and removed their create clause.
- I recompiled and fixed all errors by changing code like create Result.make to Result := create {SAMPLE}.make
- I made all readable versions normal classes again (i.e. not deferred anymore) and reinstated the create clauses.
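The kind of change involved can be sketched as follows. SAMPLE_FACTORY and new_sample are hypothetical names, and make is assumed to be the creation procedure of SAMPLE.

```eiffel
class
    SAMPLE_FACTORY

feature -- Factory

    new_sample: READABLE_SAMPLE
            -- A sample whose dynamic type is SAMPLE, so that it can
            -- still be modified later through a SAMPLE reference.
        do
                -- `create Result.make' would instead build a plain
                -- READABLE_SAMPLE object, fixed at creation.
            Result := create {SAMPLE}.make
        end

end
```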
Deferred and once routines
Deferred routines can be a problem if they are commands. This is because they should be implemented in the writable version of the class, which means the readable version must be declared as deferred and therefore cannot have creation procedures. The solution I found was to declare the deferred command in the readable version of the class, but with an empty body and with export {NONE}. Then I redefine it with the appropriate implementation in the writable version.
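A sketch of this workaround, with hypothetical classes READABLE_COUNTER and COUNTER and a hypothetical command reset. The redefinition sits in an unrestricted feature clause, which is assumed here to be what makes it available to clients of COUNTER. Each class would live in its own file.

```eiffel
-- File: readable_counter.e
class
    READABLE_COUNTER

feature -- Queries

    value: INTEGER

feature {NONE} -- Commands hidden from clients

    reset
            -- Conceptually deferred; the empty body keeps this
            -- class effective, so its objects can be created.
        do
        end

end

-- File: counter.e
class
    COUNTER

inherit
    READABLE_COUNTER
        redefine
            reset
        end

feature -- Commands

    reset
            -- Actual implementation of the command.
        do
            value := 0
        end

    increment
        do
            value := value + 1
        end

end
```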
Once routines are easier to manage. They create a problem because there must not be two versions of the once routine. For instance, if once_fcn is the routine name, we must not end up with readable_once_fcn too, otherwise it could be executed twice. This case happens only if the once routine is a function returning a reference to a writable object, and to a read-only object for the readable_once_fcn case. The solution consists of moving the code to a third implementation_once_fcn once function, belonging to the readable version of the class of course. once_fcn and readable_once_fcn can then be separate routines returning the appropriate object type, since they call a once function themselves.
Here is a more comprehensive example:
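A sketch using the feature names from the paragraph above. READABLE_HOLDER and HOLDER are hypothetical class names, and SAMPLE is assumed to have a creation procedure named make. Each class would live in its own file.

```eiffel
-- File: readable_holder.e
class
    READABLE_HOLDER

feature -- Queries

    once_fcn: READABLE_SAMPLE
            -- Read-only view of the shared object.
        do
            Result := implementation_once_fcn
        end

feature {NONE} -- Implementation

    implementation_once_fcn: SAMPLE
            -- The only once function; its body runs a single time.
        once
            create Result.make
        end

end

-- File: holder.e
class
    HOLDER

inherit
    READABLE_HOLDER
        rename
            once_fcn as readable_once_fcn
        end

feature -- Queries

    once_fcn: SAMPLE
            -- Writable view of the same shared object.
        do
            Result := implementation_once_fcn
        end

end
```

Both once_fcn and readable_once_fcn are ordinary functions here, so they can coexist without the creation code ever running twice.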
The idea of moving the code to a third function makes sense for all functions, and not only if they are "once", for the purpose of avoiding code replication. However, most of the time I didn't do it because the resulting code was more complicated and harder to understand.
Generics
When the sample class is parameterized with generic classes, as in SAMPLE[G], I needed to access the readable version of G. There is unfortunately no easy way to do that, so eventually I resorted to recoding SAMPLE[G] as SAMPLE[READABLE_G, G->READABLE_G]. Here, the intended purpose is to have G be a direct descendant of READABLE_G, but this is not something that can be enforced. For generic parameters that are not split, I could use the class twice, as in SAMPLE[INTEGER, INTEGER], since a class always conforms to itself. This works with expanded classes, or classes that only have queries, but not with most classes coming from libraries.
Unfortunately, EiffelStudio doesn't fully implement the SAMPLE[READABLE_G, G->READABLE_G] syntax. G isn't considered conforming to READABLE_G when checking constraints. Let's make it clearer with an example:
Here G is not recognized as a descendant of READABLE_G and this code won't compile.
I worked around this issue with a hack in the compiler, until better support for this kind of constraint is added to the language.
Conclusion
I demonstrated how I changed my code to move to a read/write, read-only class model. The change was necessary to write complex code to be used during assertions evaluation. Another, easier way to obtain the same result is to execute the assertion code on a copy of the program state, since it guarantees (within some limits) that there will be no side-effects on the core of the program. The tradeoff is to exchange simpler code for poor performance.
- The new model forced me to duplicate almost all classes.
- All class headers can easily be cluttered with a lot of rename and export clauses.
- Class names are longer, and if they use generics, a lot longer.
- Renaming a feature to forbidden_<feature name> is not a proper protection against using the feature.
- This doesn't work well with libraries. I ended up rewriting several of the basic structure classes.
On this subject, things can be automated since I've been following a small set of rules. If the language had support for this model, the code would look a lot simpler. For instance, introducing a new keyword 'readable', here is what the compiler could do:
- Split all classes in two, 'readable SAMPLE' being an alias to READABLE_SAMPLE.
- Move all queries to the readable version of the class. To indicate which feature is a query, just put 'readable' in front of the feature name.
- For duplicated queries (<feature name> and readable_<feature name>) make the readable version available with 'readable <feature name>'.
- Hide commands instead of renaming them to forbidden_<command name>.
- If G is a generic, make 'readable G' an alias for READABLE_G and take SAMPLE[G] to mean SAMPLE[readable G, G->readable G].
- Provide default yet unavailable versions of deferred features that are commands, so a readable class can be created. Code once functions to be called once even if they have two versions.
This would need to be better formalized of course. Perhaps I'll play with the compiler to try it one day. I'll live with the modified code for now.
Just a few comments regarding the Eiffel ECMA specification of a few things mentioned here. ECMA is planning on removing the duplication/copy routines from ANY. The other thing is that ECMA is currently forbidding restriction of export which is not yet enforced in EiffelStudio for backward compatibility. These are things to keep in mind with what you are proposing.
I can't comment about the removal of copy, I don't know what it will be replaced with.
The export redefinitions I made are just a precaution, since features for which export is restricted are also renamed. In the semi-serious proposal of my last section, they would actually be hidden. So I'm not really worried about a change in the way EiffelStudio handles restricted exports.
However, perhaps you could add a -strict option to enforce ECMA rules? This would help people that want to stay ECMA compliant. But unless I'm mistaken, EiffelStudio is the main if not only compiler out there, so being ES compliant has higher priority obviously.