You are here: Home V2 Software Software More ... Memops Code Generation Strategy Documents Unique and ordered collections

Unique and ordered collections

Semantic and implementation rules for collections, with isUnique and isOrdered attributes. IMPLEMENTED. Rasmus Fogh 2007

Nomenclature

  • Unique: U        hicard != 1, duplicates not allowed
  • Ordered: O      hicard != 1, order is significant and preserved
  • Single : S         hicard is 1. By definition these must be U+, O-
  • None : N          No role - the other side of a one-way link.


A link between classes  A---B with roles A.b unique, cardinality 0..*, role B.a ordered, 2..2, is be written as :
U+,* <--> O+, 2..2

Collections are modified by the three modifier functions set, add, and remove, as well as by create and delete.

Defaults

  • Roles or attributes without explicit cardinality default to 0..1
  • Attributes with hicard != 1 default to U-,O+
  • Roles with hicard != 1 default to U+,O-

 

Limits

  • Parent roles must be S, 1..1, frozen, and composite aggregation
  • Child roles must be either S, 0..1 or U+, O-. In the latter case a special function, opType sorted returns the child objects sorted by local key.
  • elements that are part of the local key must be n..n and frozen.

 

Collection implementations

The return values of functions uses the implementations below. Where the language allows, immutable versions (Python frozenset, tuple) are used for the return values of get operations. findAll will return modifiable versions.

  • U+, O- (set) is implemented as a set
  • U+, O+ (unique list or ordered set) is implemented as a list
  • U-, O+(list) is implemented as a list
  • U-, O_ (bag) is implemented as a list, but the implementation does not preserve element order. A bag implementation could have been used, but the languages looked at so far do not have a standard one.


The internal implementations in the objects follow the rules given above (for file implementations at least). Child links with hicard != 1 are an exception; they are stored internally as dictionaries (maps) with the local key as the dictionary key. This will speed up uniqueness checking the getByKey function, and relevant findFirst and findAll functions.

Semantics


The semantics of attributes follow straight from the chosen implementations. The semantics of links, which are mostly bidirectional, are anything but simple.

Links

  • All links are collections of object pairs.
  • For links where both ends are U+, S, or N, duplicate pairs are forbidden
  • For links where one end is U-, duplicate pairs are allowed.
  • Links that are U+ <--> U- behave like the corrresponding U- <--> U- links, except that function returns on the U+ side have duplicate values removed. This is probably the best way of implementing them too
  • Links that are S <--> U- behave like U+ <--> U- links, except for the changes that follow from the fact that S roles return a value rather than a list.

 

Functions

  • Add functions conceptually add an object pair. A.addB will throw an error if A.b is U+ (but B.addA will not).
  • Remove have two cases: A.removeB will remove a single object pair if A.b is U-, and will remove all object pairs if A.b is U+. A.removeB will always throw an error if there is nothing to remove.
  • Set uses the following algorithm:
    1. The set of object pairs that are present in both previousValue and newValue are retained unmodified
    2. The set of object pairs that are present in previousValue but not in currentValue are removed, one by one
    3. The set of object pairs that are present in newValue but not in previousVlaue are added, one by one.

      One effect of the set function is that setting a role to the value it already has will not change anything; setting it to a reordering of the values will change nothing except the order.

 

Ordering

  • New object pairs, be it from add, set, or create functions, are added to the end of the ordered list.
  • Removed object pairs, whether from remove or set functions, are removed from the beginning of the list.
  • The set function is the only function that allows proper control of ordering: to reorder the objects in a role or attribute, get them, reorder them, and set them back.

 

Special limitations

  • Two-way links that are O+ at both ends are forbidden. This should apparently make database implementations easier.
  • Two-way links must currently be U+ at both ends, unless they are derived or Implementation. As the principles in this page shows that need not always be so, but the machinery does not currently support two-way U- links

 

Questions

  • Is the prohibition on O+ <--> O+ links necessary?
  • We are sorting objects on the basis of their local key (for children) or their full key (for unordered crosslinks). The local key may be a list and may contain links to other objects. Will the sorting be a problem in any of our target languages? It is OK that sorting that involves object pointers might depend on the exact object addresses. We do not necessarily need the sorting to be the same between diferent sessions.
    Clarification:
    Sorting has no connection to the ordering in O+ roles. It is used only within the 'sortedXxx' function (e.g. PeakList.sortedPeaks), that return elements either sorted by local key (for chlildren), or by the full key (for unordered crosslinks). The normal way of getting either is getElements (e.g. PeakList.getPeaks), which will return a set. The sortedElements functions let you get objects in a predictable order, which will be useful for tables etc. Some cases have links, possibly with hicard != 1 as part of their key, and here some languages might find it hard to provide a sort. ChemBond, with key .chemAtoms would be an example.
  • Crosslinks have a 'sorted' function based on the full keys arther than the local keys. It may be slow, but it will work.