Stumbling across errors in language design
The recently floated proposal to add the `across' looping construct to Eiffel seems seriously mistaken. I will go through the reasons that make it undesirable for me, and as a bonus end with a general theory of why such proposals occur.
What happened to the minimalism principle?
Eiffel used to have one good way of doing each thing, for clarity, homogeneity and common understanding in the context of team work, and that was well worth the modest cost of the odd slightly painful border case. We now have not one but three loop constructs: legacy loops, which I’ll call neanderthal loops in the rest of this post; across loops, let’s call them neo-neanderthal loops; and agent-based iterators, the ones you use in modern Eiffel.
Not an everyday construct
The new construct is being justified by saying that loops in their most general form are an everyday programming construct. In fact, loops in regular programming all come from a very small family of very regular loops, usually over containers or similar structures such as INTEGER_INTERVAL, that are very well captured by iteration agents (do_all and friends). A full-time Eiffel programmer would probably need to write an explicit loop once a year or less. The very rare cases where I have to write a neanderthal loop are due to omissions in libraries that sometimes lack an obvious iteration routine (almost always a general one; I’m not advocating adding convoluted iteration routines to containers that apply to only one specific case). Lazy readers might think “but I write loops all the time”, but if they look at all the loops they’ve written and think about how each could be replaced with an iteration agent, they will find very few cases that require an explicit loop. Neanderthal-loop fans often tell me that some loops can’t be written with a generic agent iterator, but when I ask them to show me an example, all they can find are cases where they were being lazy or used a sadly incomplete library (I don’t think anyone who promised me an example ever delivered a credible one, though I’m not saying a few cases don’t exist; they’re just very rare).
The great advantage of using agent-based iterators is that their behaviour (termination etc) is much easier to reason about, by hand or machine. If your system has a handful of generic loops (in the implementation of the container library) it’s much easier to reason about instances of such loops once you’ve proven and modelled the core library routines' behaviour.
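As a minimal sketch of the agent-based style described above, the following class traverses a list with do_all instead of an explicit loop. The class name DO_ALL_DEMO and the helper print_line are hypothetical names chosen for illustration; only LINKED_LIST and its do_all feature come from EiffelBase.

```eiffel
class
	DO_ALL_DEMO

create
	make

feature {NONE} -- Initialization

	make
			-- Print every item of a small list without an explicit loop.
		local
			colours: LINKED_LIST [STRING]
		do
			create colours.make
			colours.extend ("red")
			colours.extend ("blue")
			colours.extend ("green")
				-- `do_all' owns the traversal; the client only supplies the action.
			colours.do_all (agent print_line)
		end

feature {NONE} -- Implementation

	print_line (a_text: STRING)
			-- Print `a_text' followed by a newline (hypothetical helper).
		do
			print (a_text)
			print ("%N")
		end

end
```

The client code contains no cursor movement and no exit condition, so termination depends only on the library's implementation of do_all.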
Core language constructs shouldn't depend on libraries
Using a keyword as syntactic sugar for a library is upside down. Core language constructs should have minimal dependencies, if any at all, on upper-level libraries in general, and certainly not on individual routines or classes fairly far down the abstraction chain. What next, syntactic sugar for log routines?
This gratuitous syntactic sugar also pollutes the keywords namespace, which ideally should be kept small and manageable, and certainly not used for superfluous constructs that are well dealt with by the core language.
Eiffel is verbose on purpose
The verbosity argument is also thoroughly anti-Eiffel: Eiffel is about clarity, not about line count. If we’re going that way we might as well go all-punctuation: {*o:i|print(i)*}. With a bit of effort the gold standard of syntax mediocrity, Perl, could be within our reach!
The verbosity comparison in the proposal between fully expanded neanderthal loops and compact neo-neanderthal variants is disingenuous at best; the fair comparison would be to present neanderthal loops like this:
which is a style both used by leading practitioners and the one consistent with the style of the neo-neanderthal examples. This reduces the line count "advantage" to hardly anything.
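To make the layout argument concrete, here is a sketch contrasting a fully expanded explicit loop with a compact one-line layout of the same loop. This is an illustration under assumptions, not necessarily the exact layout the post originally showed; the class name COMPACT_LOOP_DEMO is hypothetical.

```eiffel
class
	COMPACT_LOOP_DEMO

create
	make

feature {NONE} -- Initialization

	make
			-- Traverse the same list twice, with two source layouts.
		local
			colours: LINKED_LIST [STRING]
		do
			create colours.make
			colours.extend ("red")
			colours.extend ("blue")
				-- Fully expanded layout: one keyword per line.
			from
				colours.start
			until
				colours.after
			loop
				print (colours.item)
				colours.forth
			end
				-- Compact layout: the same loop on a single line.
			from colours.start until colours.after loop print (colours.item); colours.forth end
		end

end
```

Both loops have identical semantics; only the line count differs.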
A bad take at a known construct
If I were trying to add a loop construct of that genre, I’d hide the cursor totally and do something like:
which is what other languages with neo-neanderthal looping constructs do, for a good reason. We don’t need to be gratuitously worse than the common practice for the sake of originality.
The construct above could be generalised to the hash table case with something like ‘across table with item, key loop’ — were it not anyway totally superfluous given that do_all_with_key does it much better (the only reason it’s not in HASH_TABLE is that as far as I know the maintainer got arrested for earlier crimes against software quality — the class is a crime hotspot — so the class has been unmaintained since the lengthy court martial proceedings started). Low-crime hash table classes like the one in Gobo are well equipped.
Going past the Good Idea Entropy Axiom
To finish, I’ll set out my theory of why this proposal happened.
One of the fundamental laws of the universe is the Good Idea Entropy Axiom, which reads:
– The aggregate quality of ideas available in the universe must never increase.
and I think that Bertrand has applied this axiom to himself individually and, to compensate for the recently introduced and excellent theory of aliasing, has had to introduce neo-neanderthal loops so that his net contribution to the world of ideas is not positive. But the universe is full of people who produce more than their fair share of bad ideas, so to each according to their ability: I urge Bertrand to leave to others the job of creating junk language proposals and to keep to the supply of good ideas.
I can't say I have followed the debate, but really, are we getting another loop construct? Please no. Pascal had 3 loops and one of them was used so infrequently that I had to look up the semantics whenever I used it or read code containing it.
The minimalist approach is there for a reason. Syntactic sugar has its place, but should be clearly warranted. Can Eiffel programmers vote on the proposal?
Agreed
While it is often hard to resist new syntax, hey, Eiffel is elegant (sure I don't use it daily, and maybe that has skewed my vision).
One might say that Eiffel could make an even bigger contribution by adding statically-typed macros! :)
Some people such as myself don't like to use iteration agents because they statically strip contracts; I prefer an adapter object.
Almost all languages rely on a kernel library: C without a C runtime, which includes malloc (), isn't very useful. To handle release of external resources, all GC'ed languages need to expose some type of finalizer class.
Don't get me wrong, I wasn't a proponent of this language construct, but I don't find some of these arguments to be valid discourse.
I also don't like the 'once' keyword. I stay away from 'once' and 'agent' as much as possible.
Neanderthal-loop problem
It seems to me that the neanderthal-loop approach to the following problem is more straightforward than the agent-iterator approach (but I concede it can be done with an agent-iterator).
Given:
Not to me
To me this seems a clear win for the iterator:
Not quite
This gives
equal (l_result, "redbluegreen")
but it should be
assert_equal ("l_result correct", "red, blue, and green", l_result)
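For reference, the behaviour asked for here (a comma between items and "and" before the last one) can be sketched as an explicit loop. The class name ENGLISH_LIST_DEMO and the function as_english_list are assumptions made for illustration; the sketch does not special-case two-item lists, which would come out as "red, and blue".

```eiffel
class
	ENGLISH_LIST_DEMO

create
	make

feature {NONE} -- Initialization

	make
			-- Print the English list for a sample array.
		do
			print (as_english_list (<<"red", "blue", "green">>))
			print ("%N")
		end

feature -- Conversion

	as_english_list (words: ARRAY [STRING]): STRING
			-- Items of `words' joined in the form "a, b, and c".
		local
			i: INTEGER
		do
			create Result.make_empty
			from
				i := words.lower
			invariant
				index_in_range: words.lower <= i and i <= words.upper + 1
			until
				i > words.upper
			loop
				if i > words.lower then
						-- Every item except the first is preceded by a separator.
					Result.append (", ")
					if i = words.upper then
						Result.append ("and ")
					end
				end
				Result.append (words [i])
				i := i + 1
			variant
				words.upper - i + 1
			end
		end

end
```

For the sample array, the result is "red, blue, and green".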
No agent loops
I disagree with Franck's criticism of the new loop construct. I think it is a valuable contribution to the language.
Pros and cons of agents
I am surprised at the absolute denigration of agents.
I can see pros and cons in their use.
Against them is the overhead in ISE's compiler (but this is a problem to be tackled - gec produced faster code with do_all the one time we measured it). Also the catcall/conformance problem. But SmartEiffel got this right, I seem to recall.
More serious is the problem of writing a contract for agent-based iterators. I think this requires static checking of contracts. We have a problem with implementing runtime checking. But the use of any particular iterator at a given point in client code is not so problematic: a static check by the code reviewer should be straightforward. Certainly more so than for loops.
And of course inline agents never have any justification whatsoever, but that is another issue.
I occasionally code explicit loops. But I need to find a very strong reason to do so, or Franck will (rightly) laugh at me.
Recently I coded a four-way loop nested three-deep. My excuse was that I was transcribing a C routine whose workings I did not understand, nor was I eager to attempt to understand such code. So I equipped my routine with so many loop invariant and variant clauses (incidentally discovering that you can't have multiple variant assertions in a single loop, which was annoying as I had co-varying variables) that I would not be surprised if I broke the world record for the number of assertions in a routine. My intention was that once I had the thing fully working, I'd be able to understand it with the aid of the assertions, and translate it to an iterator version, so that other developers could understand it (I haven't reached that stage yet though, as it doesn't always work correctly). Note that across would not have been usable at all in that particular case.
The other chief reason I find for writing explicit loops is library deficiencies. These come in two varieties. The first is where the library just doesn't implement an obvious iterator. This is a simple defect, and if you are allowed to edit the library, that is not a problem. But sometimes you can't. The second is where the particular iterator you want is so obscure that it is hard to imagine anyone wanting to re-use it. In this case I code an explicit loop, and bemoan that I'm using Eiffel rather than Haskell. Note that in such a case across would be automatically ruled out.
But on the question of adding a keyword I am 100% with Franck. Especially one tied to a particular library class. In addition to his arguments, which tree traversal order is it going to use? And in general, given a data structure that might be traversed in many ways, how do you make the decision that a particular route is most desirable?
SmartEiffel agents
I prefer SmartEiffel's covariant agents to standard Eiffel's contravariant (or did they end up being no-variant?) agents.
But I wouldn't say SmartEiffel "got this right", because I recall that the objection to covariant agents was that it wouldn't work if the agent had an open target.
As someone who's never used an agent with an open target, personally I'd rather have covariant agents and remove open targets from the language.
Traversal order
The traversal order issue is addressed by the library features that allow specifying the required one. For an indexable structure this is done by calling reversed on a cursor:
across my_list.new_cursor.reversed as c loop print (c.item) end
A similar technique is used for tree traversal: when the desired order does not match the default one, it can be specified explicitly using the appropriate features for preorder, postorder, inorder, level-order and other variants, following the approach above.
I'll try again
OK. Sorry I didn't read carefully enough.
In that case it is necessary to write the agent:
Again, it is nice and neat, and no loop syntax to obscure the logic.
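A minimal sketch of the kind of agent-based solution under discussion, assuming ARRAY's do_all_with_index passes each item followed by its index to the action. The names ENGLISH_LIST_AGENT_DEMO, as_english_list and the argument layout of extend_english_list are hypothetical choices for this illustration, not a reconstruction of the commenter's code.

```eiffel
class
	ENGLISH_LIST_AGENT_DEMO

create
	make

feature {NONE} -- Initialization

	make
			-- Print the English list for a sample array.
		do
			print (as_english_list (<<"red", "blue", "green">>))
			print ("%N")
		end

feature -- Conversion

	as_english_list (words: ARRAY [STRING]): STRING
			-- Items of `words' joined in the form "a, b, and c".
		do
			create Result.make_empty
				-- The two open arguments (?) receive the item and its index;
				-- the bounds and the accumulator are closed arguments.
			words.do_all_with_index (agent extend_english_list (?, ?, words.lower, words.upper, Result))
		end

feature {NONE} -- Implementation

	extend_english_list (a_word: STRING; an_index, a_first, a_last: INTEGER; a_list: STRING)
			-- Append `a_word' to `a_list', preceded by the separator
			-- appropriate for position `an_index' within `a_first' .. `a_last'.
		do
			if an_index > a_first then
				a_list.append (", ")
				if an_index = a_last then
					a_list.append ("and ")
				end
			end
			a_list.append (a_word)
		end

end
```

The traversal itself lives entirely in do_all_with_index; the agent routine contains only the per-item logic, which is the trade-off debated in the comments that follow.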
Just agent syntax to obscure the logic ;-)
Honestly, although I've written quite a few of these do_all loops, I find lines of code like this hard to read:
I have to think quite hard to figure out what those question marks are about, and why they are in that particular position in the argument list, cross-referencing with the routine signature in order to nut it out. And although I've now figured out why the word and the index are open arguments (they are the things that vary as you traverse the array), it's still not clear to me how they get to be passed in that order. I guess it would be clearer if I went and looked at do_all_with_index.
So after spending five minutes understanding that line of code, I'm wiser about do_all_with_index. Heck, I might even get the bright idea of using it myself one day. But pity the poor schmuck (or more likely schmucks, plural, given that any line of code, although written only once, will be read many times) who has to comprehend my clever agent-based loop.
Give me a neo-neanderthal loop please.
Not definitively more straightforward
The logic in extend_english_list makes sense only in the context of an iteration. I find that removing it from the context (a loop) highlights the logic but obscures the purpose of the logic. It is also hard to see how one would use extend_english_list outside of an iteration, and it is not desirable as a reusable procedure in the context of a call to do_all_with_index. For clarity and ease of use, one still has to encapsulate the call to do_all_with_index within a function:
So you end up with two features, only one of which (as_english_list) is really intended for reuse. Instead, we can do it with a single loop function:
The agent is more reusable
First, in reply to Peter's comment, you have no need to understand what the question marks mean when reading that line of code (I never do) - you just read the agent name (unless you're reviewing the code for correctness, that is - if the two arguments have the same type, the compiler won't do that for you). Writing the code is a different matter. You need to learn the meaning of the agent. But you only have to do this once.
To Neal's point about (perceived lack of) reuse. There is a second use for this agent - it's in do_parallel_with_index (hypothetical, but not too difficult to implement). The loop is inherently sequential. Also, you are re-writing the loop, as it is already present in the body of do_all_with_index, and extensively tested (I use that iterator all the time. It was added to EiffelBase at my request. Incidentally, this provides a nice illustrative story about the benefits of only writing the loop once. The first version I submitted to Eiffel Software had a bug - when lower did not equal one. The bug is fixed, and remains fixed if you use that iterator. But if you write the loop yourself each time, you might make the same mistake as I did.) If I had to choose between Bernd's position (no agents), and Franck's (no loops), then I'd go with Franck every time. In fact for my home programming I now use a loop-free language (Haskell).
P.S. Neal, your loop lacks invariant and variants. If you added the variant, it would catch the non-termination bug quickly. In "A Touch of Class", professor Meyer says don't even think of coding a loop without writing invariants and variants. Franck just drops the without clause.
Well, of course that particular agent couldn't be reused for a parallel computation. But the principle is right.
No new agent required
You can also do it without writing any agent routine:
It's more concise than even the variant-free loop version, and you can even run the comma-plussing in parallel! :-)