Object Data Instance Notation - ODIN library

by Thomas Beale (modified: 2014 Mar 24)

In the openEHR (EHR = Electronic Health Record) project, we use a serialisation format called ODIN, an alternative to JSON, XML etc. It was developed before JSON existed, and is more comprehensive (particularly in terms of leaf types). It is also (in my view) easier to read, and it's certainly easier to write (in JSON, [] and "" will eventually drive you crazy). You can read about it here.

This library is currently used for:

  • application .cfg files
  • multi-lingual error messages
  • object model schemas (i.e. as a replacement for XMI)
  • direct serialisations of object structures in other programming artefacts

The library consists of:

  • parser and serialiser
  • DT (data tree) library - intermediate DOM-like representation
  • DT <=> object bidirectional data converter
  • Xpath-like path functionality on the data tree

It is implemented in a number of languages.

I have often wondered if it could become a de facto serialisation for use with Eiffel, since none of the other formats in use are very friendly (data serialisation, XML .ecf files), and there is no standard format at all for multi-lingual error and UI messages.

- thomas

Comments
  • Larry Rix (10 years ago 26/3/2014)

    Is the ODIN library a part of the normal Eiffel Studio installation?

    I have looked for all *.ecf files in the Eiffel Software folder under Programs (x86) and cannot find anything looking like it.

  • Jocelyn-Fiat (10 years ago 26/3/2014)

    No, it is not shipped with Eiffel Studio. However, it could be included under "contrib/library".

    • Thomas Beale (10 years ago 27/3/2014)

      I'm not suggesting we just put it in the contrib libraries. I think it's more important to work out a) if it could be a useful syntax, b) if the Data Tree (DOM-like) approach is useful, c) if an Xpath facility within data hierarchies is useful, and finally d) if a bidirectional object materialisation facility is useful.

      The ODIN library does this:

      ODIN syntax text file <=> Data Tree (addressable name/value tree) <=> native Eiffel objects
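To make the middle and right-hand parts of that pipeline concrete, here is a minimal Python sketch of a round trip between native objects and a generic name/value tree. It is not the ODIN library's API: the `Subject` class, the `_type` marker key, and the `to_tree` / `from_tree` helpers are all hypothetical names chosen for illustration, and the text parsing step is left out entirely.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Subject:
    # Hypothetical example class, not part of any ODIN library.
    name: str
    teacher: str
    weighting: float


def to_tree(obj):
    """Object -> data tree: a dict holding a type marker plus attribute values."""
    if is_dataclass(obj):
        node = {"_type": type(obj).__name__}  # "_type" is an assumed marker key
        for f in fields(obj):
            node[f.name] = to_tree(getattr(obj, f.name))
        return node
    return obj  # leaf value, kept as-is


def from_tree(node, registry):
    """Data tree -> object, using the type marker to look up the class."""
    if isinstance(node, dict) and "_type" in node:
        cls = registry[node["_type"]]
        kwargs = {k: from_tree(v, registry) for k, v in node.items() if k != "_type"}
        return cls(**kwargs)
    return node


registry = {"Subject": Subject}
original = Subject(name="philosophy", teacher="Plato", weighting=76.0)
tree = to_tree(original)
restored = from_tree(tree, registry)
```

Because the tree carries the type name, the reverse conversion can rebuild the original class instead of a bare dictionary.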

      As far as I know, there isn't any other facility that allows you to serialise objects into a readable syntax in Eiffel. There is eJSON but AFAIK independent_store doesn't use that. I'm not sure JSON would work anyway because it would lose type information.
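The concern about losing type information can be illustrated outside Eiffel. The following Python sketch uses only the standard `json` and `decimal` modules; the `record` dictionary is made-up sample data. It shows one kind of loss (a decimal value has to be coerced before encoding, and a tuple comes back as a list), not a statement about what eJSON does.

```python
import json
from decimal import Decimal

# Made-up sample data: a tuple and a Decimal leaf.
record = {"lesson_time": (8, 30, 0), "weighting": Decimal("76.0")}

# The standard encoder has no JSON representation for Decimal.
try:
    json.dumps(record)
    decimal_ok = True
except TypeError:
    decimal_ok = False

# Coercing to float gets the value through, but the original type is gone.
encoded = json.dumps(
    {"lesson_time": record["lesson_time"], "weighting": float(record["weighting"])}
)
decoded = json.loads(encoded)  # the tuple comes back as a list, the Decimal as a float
```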

      In my view, there should be a single standard syntax not only for serialised object networks, but for all such text files - .ecfs, .cfgs, error message files and so on.

      How are people implementing .cfg and error message files today?

      - thomas

      • Larry Rix (10 years ago 29/3/2014)

        I was reflecting on ODIN and wondering if JSON could be retro-fitted and adopted. The reasoning for this was to foster acceptance. Then I came to my senses. This is the same mental disorder infecting others having attempted the "change-it-a-little-for-the-sake-of-acceptance" model. This is not what is needed or wanted.

        That led to another bit of reasoning and a conclusion: ODIN is not the answer either.

        What is needed is an Eiffel-lite notation. One that is Eiffel-ish, but stripped of the parts needed by the Eiffel compiler because the intended purpose is not compiling, but object serialization. Thus, we already have a notation that is well founded, well thought out, and needs only to be made into an Eiffel-lite notation. To this end, I reworked your example from the README.md file at https://github.com/openEHR/odin.

        {SCHOOL_SCHEDULE} [
            lesson_times := {TIME} <<"08:30:00", "09:30:00", "10:30:00">>
            locations := <<"under this big plane tree", "under the north arch", "in a garden">>
            subjects := {SUBJECT} <<
                [
                    key := "philosophy: plato",
                    name := "philosophy",
                    teacher := "Plato",
                    topics := <<"meta-physics", "natural science">>,
                    weighting := {DECIMAL} 76.0
                ],
                [
                    key := "philosophy: kant",
                    name := "philosophy",
                    teacher := "Plato",
                    topics := <<"meaning and reason", "meta-physics", "ethics">>,
                    weighting := {DECIMAL} 80.0
                ],
                [
                    key := "art",
                    name := "art",
                    teacher := "Goya",
                    topics := <<"technique", "portraiture", "satire">>,
                    weighting := {DECIMAL} 78.0
                ]
            >>
        ]

        NOTES:

        1. Notice that the keywords are removed (e.g. "class" keyword, which is replaced by the standard { ... } notation).

        2. Manifest values are used where they can be (just as it is in Eiffel code).

        3. Serialized objects are denoted as manifest TUPLEs (e.g. [ ... ]).

        4. Data type metadata is communicated just after the assignment operator (e.g. "... := {DECIMAL}").

        5. In the case of ARRAY [G] (e.g. "... subjects := {SUBJECT} << ... >>"), we provide a metadata type declaration just before the manifest, which declares the type of the items within the array.
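As a rough illustration of notes 2 to 4, here is a small Python sketch that renders nested values in this Eiffel-like form. It is only a guess at how such a serialiser might look: the `emit` function is hypothetical, mapping Python floats to `{DECIMAL}` is an assumption, and string escaping and the array item-type prefix from note 5 are not handled.

```python
def emit(value):
    """Render a Python value in the Eiffel-like notation sketched in the notes."""
    if isinstance(value, str):
        return '"' + value + '"'  # manifest string (no escaping handled)
    if isinstance(value, float):
        return "{DECIMAL} " + repr(value)  # assumed mapping: float -> DECIMAL
    if isinstance(value, list):
        return "<<" + ", ".join(emit(v) for v in value) + ">>"  # manifest array
    if isinstance(value, dict):
        # A serialised object is written as a manifest TUPLE with named fields.
        body = ", ".join(f"{name} := {emit(v)}" for name, v in value.items())
        return "[ " + body + " ]"
    raise TypeError(f"no notation defined for {type(value).__name__}")


subject = {
    "name": "philosophy",
    "topics": ["meta-physics", "natural science"],
    "weighting": 76.0,
}
text = emit(subject)
```

With the sample data above, `text` holds one subject entry written on a single line in the same shape as the entries in the example.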

        What I find interesting about the proposed alternative notation above is that it reads like Eiffel. I do not need much of an additional specification to get my Eiffel-brain wrapped around what the notation is communicating. Unlike JSON, which is rife with the same inane syntactical sugar of curly-brace-hell, I am free to read quickly and easily in something quite Eiffel-ish.

        I am interested to get the feedback of the group.

        Cheers!

        • Eric Bezault (10 years ago 30/3/2014)

          Description of Objects in Eiffel

          In the late 90s I was working for a company in London (TCAM) on an Eiffel project for the Chicago Board of Trade. As part of our project we designed a language similar to Eiffel to describe objects. We named it DOE (Description of Objects in Eiffel). Here is an example of such a file:

          object
              ledger: LEDGER is
                  do
                      name: STRING is "cbot"
                      tag: STRING is "market"
                      tracing: STRING is "off"
                      template: STRING is "market"
                      business_id: RECORD is
                          do
                              market: STRING is "cbot"
                              contract_spec: STRING is ""
                              contract: STRING is ""
                              category: STRING is "market"
                          end
                      all_cspecs_id: RECORD is
                          do
                              market: STRING is "cbot"
                              contract_spec: STRING is "*"
                              contract: STRING is ""
                              category: STRING is "contract_spec"
                          end
                  end
          end

          We were also able to write more complex objects (with two objects pointing to the same third object, etc.). It was also possible to have an indexing clause. And we had the framework to read these files into Eiffel objects and to write Eiffel objects to these files (like we do with `independent_store', except that the storable file is human-readable).

          I thought it was interesting to mention this old experience in this blog (I had difficulty finding an example in my old backups). Food for thought...

          • Larry Rix (10 years ago 30/3/2014)

            Indeed, it is interesting and sparks a thought in me from what has been shared so far: Some form of these notations could be applied to code as some sort of a "complex constant" -- that is -- when accessed by a Client, the Eiffel code would construct the complex object from the static definition. Because it is a mixture of manifests and statically defined reference types, the compiler would still be able to apply static type checking as long as the complex constant was typed to an existing class in the project universe. Eric, your example, I would re-write as:

            ledger: LEDGER = [
                name: STRING = "cbot"
                tag: STRING = "market"
                tracing: STRING = "off"
                template: STRING = "market"
                business_id: RECORD = [
                    market: STRING = "cbot"
                    contract_spec: STRING = ""
                    contract: STRING = ""
                    category: STRING = "market"
                    my_counterpart: RECORD = Current.parent.all_cspecs_id
                ]
                all_cspecs_id: RECORD = [
                    market: STRING = "cbot"
                    contract_spec: STRING = "*"
                    contract: STRING = ""
                    category: STRING = "contract_spec"
                    my_counterpart: RECORD = Current.parent.business_id
                ]
                reference_to_self: {like Current} = Current
                internal_array: {like Current.parent.parent.another_feature} = Current.parent.parent.another_feature
            ]

            another_feature: ARRAYED_LIST [INTEGER]
                    -- Some other feature to reference from within `ledger'.
                attribute
                    create Result.make (0)
                end

            Above, the structure is similar to writing an Eiffel constant, except the object is allowed to be more complex. In this case, a reference type like LEDGER is allowed to be represented as a new form of named TUPLE, where the named elements are themselves allowed to be constants and complex constants. Thus, the elements name, tag, tracing, and template are all STRING constants, whereas the reference types `business_id' and `all_cspecs_id' are references to RECORD. Each instance of RECORD is allowed to be another new manifest TUPLE form, with its own set of constant elements.

            In this way, the `ledger' feature in our class will be a Once that returns a single object and reference, unless (of course) it is twinned (i.e. my_ledger := ledger.twin).
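For readers less familiar with Once features, the sharing behaviour described here can be approximated in Python. This is only an analogy under stated assumptions: `functools.lru_cache` stands in for a Once feature, `copy.copy` stands in for `twin`, and the `ledger` function with its dictionary contents is a hypothetical stand-in for the complex constant.

```python
import copy
from functools import lru_cache


@lru_cache(maxsize=None)
def ledger():
    """Built on first call only; later calls return the same object."""
    return {"name": "cbot", "tag": "market"}


first = ledger()
second = ledger()                 # same object as `first`
my_ledger = copy.copy(ledger())   # a separate object with equal contents
```

Changes made through `first` are visible through `second`, while `my_ledger` can be modified at its top level without affecting the shared object.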

            Moreover, this same structure can be stored in a file, is quite human readable, and is easily a specification for both serialization and deserialization.

            Also, the reference classes used within the complex constants would have to exist somewhere in the class universe so the compiler would have oversight of them. In this case, the class RECORD would need to already exist. Additionally, RECORD would need to satisfy Void-safety for all of its attributes. Attached attributes would need to be provably attached when each instance of RECORD is built. Invariant contracts would apply at the end of the creation cycle for the reference class. Perhaps a flaw in the thinking is the assumption that the value of an attribute (e.g. blah: STRING = "another_blah") would be valid. One would be limited to trusting programmer discipline to get the values of manifests like STRINGs correct (e.g. "another_blah" might not be a legal value, and a validator might not be achievable if the STRING is complex enough). Perhaps one could open the door to the constant being set by some call on the RECORD class at creation time (e.g. blah: STRING = Current.calculate_my_blah)? Yet, for a specification that is strictly applied to serialization and deserialization of objects to and from text, such things are not needed, as the values are known from the objects being serialized to text.

            I have added one more feature to your example to see how Current would apply. In this case, the feature is `reference_to_self'. A more interesting problem would be to provide a reference between `business_id' and `all_cspecs_id'. In the code above, the example is that one can simply use `Current.parent' to drill up and then back down. In this manner, one could make complex references between the objects in the complex constant.

            It is further conceivable that one could make references outside of the complex constant. In the code example above, the `internal_array' reaches above `ledger' and then back down to `another_feature' in the same enclosing class. However, such a thing would not look this way when serialized to text for storage in some data repository.

            For me, the dual applicability of the notation to both the creation and use of complex constants and to the serialization and deserialization of objects to and from text is highly intriguing. The code is so much like what Eiffel is already, simple, elegant, and flexible in its capacity to model complex objects as constants. I am liking it the more I look at it.

            More food for thought! :-)

        • Thomas Beale (10 years ago 1/4/2014)

          This is getting fun.. the above reworking looks quite good. I'll just make a few notes on aspects of the existing ODIN syntax that may be of interest:

          1. it is designed so that all leaf types are syntactically distinct, i.e. you don't need type ids. See https://github.com/openEHR/odin#primitive-types for the current list. Now, this doesn't quite work for Eiffel, because we have all kinds of INTEGER_16, INTEGER_32, 64, STRING_XXX etc. So in that case, the ODIN serialiser has a mode where you can force type information to always be included. I have tested this with Eiffel types, and it works mostly as expected. There are some problems with Character and String 32 that I have not figured out.

          2. in the original ODIN, the [] weren't chosen casually; they are intended to map to the predicate [] of Xpath. If you follow the path that is formed by descending the tree, picking up attribute names and [] segments, you get paths like this: school_schedule/subjects["philosophy:plato"]/topics and so on, which are a) easy to understand and parse for people who have seen Xpath, and b) easy to actually convert to formal Xpath, if needed.

          3. Paths as in 2. above are how two objects can refer to a third one, rather than having all objects inline (i.e. by value). Example here: https://github.com/openEHR/odin#object-referencing

          4. Another point about ODIN structure of repeated blocks of

             ["key"] = <value>
             ["key"] = <value>
             ["key"] = <value>
             ...

          is that it is designed to map directly to HASH_TABLE or the equivalent in any language, or just as easily, a LIST or ARRAY of any kind (by ignoring the keys). The loss of the [] for keys and addition of the Eiffel manifest array delimiters <<>> obfuscates this somewhat.

          5. I did originally think of using {} for the type markers, but decided to go with () a la C type-cast notation.
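Points 2 and 4 can be sketched together in Python: keyed blocks held as a dictionary (the rough equivalent of HASH_TABLE), read as a plain list by ignoring the keys, and addressed with an Xpath-like path. This is not the ODIN library's path implementation. The `resolve` function and `SEGMENT` pattern are hypothetical, the data is a cut-down version of the school schedule example, and keys containing "/" or a double quote are not supported.

```python
import re

# Keyed blocks map naturally onto a dictionary.
root = {
    "school_schedule": {
        "subjects": {
            "philosophy:plato": {"topics": ["meta-physics", "natural science"]},
            "art": {"topics": ["technique", "portraiture", "satire"]},
        }
    }
}

# One path segment: an attribute name, optionally followed by ["key"].
SEGMENT = re.compile(r'(\w+)(?:\["([^"]*)"\])?')


def resolve(tree, path):
    """Follow a path such as a/b["key"]/c down a nested dictionary."""
    node = tree
    for segment in path.split("/"):
        match = SEGMENT.fullmatch(segment)
        if match is None:
            raise ValueError(f"bad path segment: {segment!r}")
        attribute, key = match.groups()
        node = node[attribute]
        if key is not None:
            node = node[key]  # the [] predicate selects one keyed block
    return node


topics = resolve(root, 'school_schedule/subjects["philosophy:plato"]/topics')

# Ignoring the keys gives the same blocks as a plain list.
subject_list = list(root["school_schedule"]["subjects"].values())
```

The same kind of path could serve as a reference from one object to another, as point 3 describes, instead of repeating the target inline.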

          Now, in the Eiffel version above, [] are used to designate TUPLEs, <<>> collections, and {} types, which is natural enough to read, at least for us Eiffelists. I do wonder, however, if we should care about those who don't use Eiffel, i.e. most of the world. In openEHR, we use ODIN a lot, but hardly anyone uses Eiffel. Our serious uses of it are:

          • object schemas - e.g. https://github.com/openEHR/odin/blob/master/examples/openehr_demographic_102.bmm
          • complex artefact serialisation - e.g. https://github.com/openEHR/odin/blob/master/examples/openEHR-EHR-OBSERVATION.apgar.v1.odin

          In the openEHR community, there are ODIN parsers and serialisers in many languages. Also, the object schemas above are now generated by an Enterprise Architect plug-in. I am not sure if I could have convinced the author of that to work with an Eiffel-specific syntax!

          Just to be clear, I'm not at all against a remapped syntax or different appearance, just pointing out some background facts in case people here think they may be relevant.

          BTW, to give an idea of what it takes to interconvert between ODIN and Eiffel objects via the Data Tree structure, see this class - https://github.com/wolandscat/EOMF/blob/master/library/odin/data_tree/interface/dt_object_converter.e