Copyright (C) Andras Zsoter, 1996.

Does Late Binding Really Have To Be Slow?

András Zsótér
Department of Chemistry
The University of Hong Kong
Pokfulam Rd., Hong Kong

Introduction

As many programmers will agree, object-oriented programming is today's software tool of choice. It is no surprise that many Forth dialects already have some sort of OOP support, either built into the kernel [2,3] or as an add-on feature [1]. On the other hand, many in the Forth community argue that the overhead involved in virtual method calls and field accesses is unacceptable in time-critical applications. This paper is directed towards them, and its main goal is to make object-oriented techniques more attractive to those who like the "close to silicon" approach.

In this paper the implementation details of my Object-oriented Forth model (1) [3] will be presented. In this model the overhead involved in virtual method calls and field accesses is very low. On the Intel486(TM) microprocessor there is, in fact, no overhead at all in virtual method calls and field accesses; the only overhead is in object instance access. In other words, once an object has become active, every virtual method call takes exactly the same time as any other call, and a field access takes as much time as an ordinary variable access. Because the usefulness of this implementation technique is not limited to Forth, most of the following ideas have already been published [4]; here the emphasis will be on our language-specific advantages.

An Object as a "Mini Universe"

I was reading an old article in Byte [5] which complained about the instructions and clock cycles spent on ordinary housekeeping in any program. The problem is that function parameters are passed on the stack, so the caller has to push them onto the stack and the callee has to fish them out (2). In a well-factored program this parameter passing can be significant.
Of course, for us Forth programmers the painful part is not parameter passing (we use the stack for our temporaries anyway) but the need to rearrange the stack. A piece of code using only global variables would be much cleaner and faster because it would not have to fiddle around with the parameters. The problem is that such a routine would be extremely inflexible and not at all reusable. Considering all this, I immediately thought about OOP. An object can have its own world where variables (the fields of the object) are all, in some way, at a fixed address, so they do not have to be passed as parameters. All a routine (a method in this case) has to know is which object (3) is calling it. Explicit parameter passing is required only for communication with the outside world.

In object-oriented languages objects and their methods -- either virtual or static -- constitute an alternative way of factoring. Not only can common instruction sequences (4) be factored out, but data structures and the pieces of code handling them together, which can lead to a much cleaner program design (5).

Methods or Messages?

I have to explain at this point why I prefer the term "virtual method" to "message". The latter implies that one object communicates with another, and only as a special case can the sender and the receiver of a message be the same object. The word "method" only indicates that the routine is somehow special in the sense that it operates on an object instance and assumes certain properties of that object (e.g. its memory layout). In other words, a method is a piece of code which implements one particular kind of behavior of a class of objects. The more specific term "virtual method" means a method whose identity (the actual piece of code) is not known when the program is compiled (6), so it must be determined at run time.
Because in my OOF methods are often used to factor out reusable functionality inside an object, I consider the term "message" rather misleading.

An "Object-oriented Architecture"

The above ideas might look good on paper, but to learn more about them we need an implementation. When I started to experiment with my OOF I wanted to comply with the following conditions:
As a working hypothesis, let us consider a microprocessor which has built-in support for the above-mentioned kind of object-oriented behavior. We need one register to contain the base address of the active object; all field addresses will be relative to this base address. Another register is needed to hold the address of the Virtual Method Table, or VMT, which contains the addresses of the virtual method instances (8). Our theoretical microprocessor needs a couple of instructions to load these registers, to read their contents, and to save them on a stack.

So the basic ideas are clear, but the next problem is that the whole thing has to be implemented somehow in hardware. Making an object-oriented microprocessor is far beyond our facilities, so I had to emulate the above behavior on an existing one. Our PCs are based on the Intel486(TM) chip, which does not have very many registers, but enough to spare two for OOP support. This microprocessor also has quite flexible addressing modes, so once the base address of an object is in a register, one instruction is enough to read or write one of its fields. The same can be said about the VMT pointer and virtual method calls. In my Dynamic Object-oriented Forth or DOOF (the implementation of my OOF model for the Linux operating system) I have used the following register assignment:

eax : TOS = The Top Of Stack.
ebx : The Object Pointer (base address of the active object).
esi : The VMT Pointer (address of the active object's VMT).
edi : The data stack pointer.

Examples

Listing One (10) shows a sample object which is just a point on the screen (actually a character position on a character screen). This simple object has two fields, its X and Y coordinates in a Cartesian coordinate system. A Point can show itself and hide (delete) itself.

Listing One

Objects DEFINITIONS          \ Objects is the base of the Class hierarchy.
3 Class Points               \ A new class with 3 new virtual methods.
Points                       \ The detailed description of the new class.
  Field X                    \ The X coordinate of the Point.
  Field Y                    \ The Y coordinate of the Point.
  Method Show ( -- )         \ How to show a Point.
  Method Hide ( -- )         \ How to hide a Point.
  Method Move ( dY dX -- )   \ How to move a Point a given distance.
As Init use: ( Y X -- ) X ! Y ! ;M   \ Initialization of a new object instance.
: GotoXY ( -- ) X @ Y @ AT-XY ;      \ This is a static method.
As Show use: GotoXY [CHAR] * EMIT ;M
As Hide use: GotoXY SPACE ;M
As Move use: Hide X +! Y +! Show ;M
10 40 Obj P                          \ Create a new object instance.
: Test ( dY dX -- ) P { Move } ;     \ A sample word using our object.

These are the most basic virtual methods, or types of behavior, a Point can manifest. One example of a more complicated kind of behavior is that a Point can move itself a specified distance in the X and Y directions. A good example of field accesses is GotoXY, which is a static method. As we will see later, there would be no performance penalty if we implemented it as a virtual method. DOOF generated the following sequence of Intel486(TM) instructions (11) for GotoXY:

sub  edi,4         ; Make room on the stack.
mov  [edi],eax     ; Save top of stack.
lea  eax,[ebx].X   ; Get the address of the first field.
mov  eax,[eax]     ; @
sub  edi,4         ; Make room on the stack.
mov  [edi],eax     ; Save top of stack.
lea  eax,[ebx].Y   ; The address of the next field.
mov  eax,[eax]     ; @
call AtXY          ; An ordinary call of a Forth word.
ret                ; Return.

Not quite a piece of highly efficient code, but so far no optimization has been used (12). A more intelligent compiler would output something like the following:

sub  edi,8         ; Make room for two items on the stack.
mov  [edi+4],eax   ; Save top of stack.
mov  eax,[ebx].X   ; Get the value of the first field.
mov  [edi],eax     ; Save the value on the stack.
mov  eax,[ebx].Y   ; The value of the next field.
call AtXY          ; An ordinary call of a Forth word.
ret                ; Return.

This code is concise and efficient, but the compiler has to be smart to produce this kind of output. With present compiler technology this optimization is possible [6,8] (13). Nevertheless, the first version of the code can be produced by any dumb compiler.
If we take a closer look at either version of the code, we can see that the number of instructions needed to access a field of the object is the same as the number needed to access a global variable (given the same smartness of the compiler) (14). So far so good, but OOP is not only about field accesses. The quality of an OOP implementation depends on the efficiency of virtual method calls. The only virtual method instance in our example which calls other virtual methods is Move, which has been compiled to the following sequence of instructions by DOOF:

call [esi].Hide    ; Call Hide.
sub  edi,4         ; Make room on the stack.
mov  [edi],eax     ; Save top of stack.
lea  eax,[ebx].X   ; Get the address of the first field.
mov  edx,[edi]     ; Move value to a register.
add  [eax],edx     ; Add to the cell at the address.
mov  eax,[edi+4]   ; Move next top item to tos register.
add  edi,8         ; Adjust stack pointer.
sub  edi,4         ; Make room on the stack.
mov  [edi],eax     ; Save top of stack.
lea  eax,[ebx].Y   ; Get the address of the next field.
mov  edx,[edi]     ; Move value to a register.
add  [eax],edx     ; Add to the cell at the address.
mov  eax,[edi+4]   ; Move next top item to tos register.
add  edi,8         ; Adjust stack pointer.
call [esi].Show    ; Call Show.
ret                ; Return.

Again, this is rough code generated by a dumb compiler. On the other hand, what we are interested in is clearly visible in the above listing: the virtual method calls are single instructions (15). In other words, a virtual method call is exactly as expensive as any other call as long as our program does not change the active object instance.

Changing the Active Object Instance

The word Test in our example changes the active object and then calls one of its virtual methods. The machine code generated by DOOF reads (16):

sub  edi,4         ; Make room on the stack.
mov  [edi],eax     ; Save top of stack.
mov  eax,offset P  ; Push the address of the object onto the stack.
push esi           ; Save the VMT Pointer.
push ebx           ; Save the Object Pointer.
mov  ebx,eax       ; Load Object Pointer from the stack.
mov  esi,[ebx-4]   ; Load VMT Pointer with the new object's VMT.
mov  eax,[edi]     ; Reload top of stack.
add  edi,4         ; Adjust stack pointer.
call [esi].Move    ; Call virtual method.
pop  ebx           ; Restore Object Pointer.
pop  esi           ; Restore VMT Pointer.
ret                ; Return.

Of course, the fiddling with the stack makes our code somewhat obscure again. A smarter compiler could emit something like the following:

push esi           ; Save the VMT Pointer.
push ebx           ; Save the Object Pointer.
mov  ebx,offset P  ; Load the Object Pointer.
mov  esi,[ebx-4]   ; Load VMT Pointer with the new object's VMT.
call [esi].Move    ; Call virtual method.
pop  ebx           ; Restore Object Pointer.
pop  esi           ; Restore VMT Pointer.
ret                ; Return.

From the second version of the code which can be generated from Test we can clearly see that the overhead of changing the active object is significant (six housekeeping instructions for only one useful virtual method call). On the other hand, if the virtual method calls other virtual methods (as in our case) or accesses a couple of fields in the active object before the program switches to another object, the code can be significantly smaller and faster than in implementations which pass the object as a parameter on the stack (17) [4].

Multiple Inheritance and Other Bells and Whistles

My OOF has always been intended to be a low-level language which provides the advantages of late binding with no, or only a minor, performance hit. In such a low-level concept more complicated constructs like multiple inheritance and operator overloading have no place. Of course, most of them can be implemented on top of the features provided by OOF. For example, multiple inheritance can be provided by aggregating several objects together, and for every method the compiler can present the appropriate part of the object [10].
By making the class Classes -- the special vocabulary type representing classes of objects [3] -- a part of the class hierarchy, DOOF provides a way to derive more sophisticated tools for creating new types (18).

Operator overloading is more troublesome. To meaningfully provide operators which can do arithmetic on different types of numbers, we have to provide a way for the operator to check the types of its operands. Traditionally, items on Forth's stack are typeless, so this checking cannot be done without substantially modifying the internals of our traditional model of a Forth engine. The resulting language is still Forth, but the overhead of the run-time type checking can make it extremely unattractive to the designers of time-critical applications.

In arguments about Forth, the complete lack of predefined data structures is often listed as one of the major weaknesses of the language. After reading some C++ literature [9,10] and seeing the complaints about C's array concept and its incompatibility with C++ objects (19), I have started to think otherwise: our greatest weakness can be our biggest strength (20), because we do not have to take anything away from the language to adopt OOP. Plain old-fashioned Forth is the best starting point for making some kind of Forth++ (21). We do not have an array concept to be incompatible with the more carefully designed data aggregates needed for an object-oriented system. Forth -- very luckily -- has remained a low-level tool, a kind of portable assembler, for most of its users. My OOF model has been designed to live peacefully with this approach.

Conclusions

In this paper a fast and effective way of implementing OOP has been presented. The benefits of the technique are most useful in applications where speed is critical. The suggested technique provides the benefits of object-oriented design without the need for a sophisticated compiler.
Although the examples are Intel-specific, the presented argument holds true on most modern microprocessors. The usefulness of this approach is not limited to one particular programming language; nevertheless, languages which encourage extensive factoring will benefit most from it. In Forth, where the average definition is only a couple of words long (22), the implicit parameters passed from method to method at no extra cost can make a program based on OOP even faster than its counterpart implemented using more conservative techniques (23).

References
Trademarks Mentioned

Intel486(TM) is a trademark of Intel Corporation.

Glossary
Notes