Department of Computer Science,
University of Adelaide,
South Australia 5005,
Australia.
email: fred@cs.adelaide.edu.au
The introduction of concurrency and distribution impose certain requirements on type equivalence checking. A concurrent system may involve multiple type equivalence tests over the same representations. The equivalence algorithm has to ensure that individual checks cannot interfere with each other. When distribution is introduced data may migrate between systems requiring a mechanism for checking the type of imported data. If copies of a type representation are passed with data, the necessary checking can be performed. However, this means that the type equivalence mechanisms must be able to cope with multiple copies of a type's representation.
A type equivalence mechanism will now be presented that is based on a graph representation of type. The representation accommodates recursive type definitions including the full range of data types available in a persistent programming language such as Napier88[7]. The mechanism presented is based on a hybrid of two existing Napier88 type equivalence mechanisms[2,8] that preserves their good points whilst solving their short-comings in the context of a concurrent and distributed system.
rec type TYPE is structure( label : int ; specificInfo : string ; references : list[ TYPE ] )
This is sufficient to uniquely represent any type describable by the Napier88 type system using some straight forward mapping rules. The label field indicates if the type is a base-type or a structure, procedure, vector, etc. The specificInfo contains information such as structure field names, concatenated together with markers to form a single string. This string has an implicit ordering that is part of the type description. A structural type equivalence algorithm over this structure must check for equality of the label and specificInfo fields before recursively checking any representation found in the references list. This is illustrated in Figure 1.
rec let eqType = proc( a,b : TYPE -> bool ) a = b or in_remembered_set( a,b ) or (a( label ) = b( label ) and a( specificInfo ) = b( specificInfo ) and eqList( a( references ),b( references ))) & eqList = proc( a,b : list[ TYPE ] -> bool ) (a is tip and b is tip) or (a isnt tip and b isnt tip and eqType( head( a ),head( b ) ) and eqList( tail( a ),tail( b ) ))
The time complexity of this algorithm is dependent on the implementation of the remembered set. It is also a potential source of error in a concurrent system since the remembered set may be shared by several concurrent type equivalence checks. To achieve a low time complexity and maximise concurrency the following two techniques are employed by the new algorithm.
The remembered set is implemented as a hash table[2]. In order to obtain suitable hash keys, the compiler annotates all TYPEs it creates with an extra field that contains a random number. Within any compilation unit, the random numbers should be unique and the compiler can arrange for there to be only one copy of each TYPE. Consequently, the type equivalence mechanism performs well when comparing types from the same compilation unit, nearly all comparisons would succeed at the identity check. A new, empty hash table is used for every type equivalence check to avoid interference between concurrent type equivalence checks.
Since, for a given compilation unit and therefore each copy of a TYPE, the random keys should be unique, the hash table can be keyed by always using either the first or second TYPE being compared. The end results should be that few collisions will occur and so inserting pairs into the hash table and searching for pairs will be on average O( 1 ). Given this, the overall complexity of a type equivalence check is O( N ) where N is the number of component types being compared. Thus, comparing very large type representations from different compilation units should be relatively efficient.
An alternative approach to the hash table is to build trees of equivalent types. This idea originates in solving the union-find problem[3], solutions to which are essential in many applications of graph theory[9]. In this approach every TYPE has and extra field which can point to a tree root. Initially this will point to a TYPE's own tree root[8]. Since this technique requires modifying TYPEs, it is only used to remember pairs of types once an initial comparison succeeds. Any potential interference between tree merging operations can be ignored since its sole function is to optimise future checks and cannot cause a check to succeed in error.
2. Connor, R.C.H. Types and Polymorphism in Persistent Programming Systems (PhD Thesis), Department of Computer Science, University of St.Andrews, CS/91/3, 1991.
3. Cormen, Leirson, Rivest. Chapter 22, Introduction to Algorithms, MIT Press, 1990, pp440-464.
4. Dearle, A. & Brown, A.L. Safe Browsing in a Strongly Typed Persistent Environment., Computer Journal, Vol. 31, No. 6, (December 1988), 540-545.
5. Farkas, A. & Dearle, A. The Octopus Model and its Implementation. Technical Report PS-17, Department of Computer Science, University of Adelaide. August 1993.
6. Kirby, G.N.C. (Ph.D. Thesis) Reflection and Hyper-Programming in Persistent Programming Systems, University of St.Andrews, 1992.
7. Morrison R., Brown A.L., Connor R. & Dearle A. The Napier88 Reference Manual. Universities of Glasgow and St.Andrews PPRR-77, Scotland, 1989.
8. Crawley, S. A New Napier88 Type Checker, Technical Report, D.S.T.O. Salisbury, South Australia, Australia.
9. Aho A.V. & Ullman J.D. Spanning Trees, Foundations of Computer Science, C Edition, Chapter 9, Computer Science Press, 1995, pp466-484.