On Object Orientation

There are many articles and discussion threads on the web regarding the nature of Object-oriented (OO) programming. Most come at the question from a Programming Language Theory (PLT) perspective, in terms of the formal semantics of objects. I'm not going to do that here. Instead, I'm going to write about objects from an implementation perspective, approaching the subject first from below and then from above.

Objects in Terms of Machine Sympathy

The first OO code with which I became intimately familiar was the Virtual File System (VFS) layer of the SunOS kernel as documented in Kleiman's 1986 paper. Motivated primarily by the desire to provide transparent access to remote fileservers using Sun's Network File System (NFS), the VFS provided an extensible interface by which new file systems could be plugged into the OS without any change to the code of programs that use file system services (which is pretty much all of them).

Some might reject this example because it's written C, which is not itself "an OO language", but this is actually my point. The code I'm about to explain isn't object-oriented by default, as -- say -- SmallTalk code would be. It's object-oriented because the authors decided that those principles were the best way to solve the problem at hand, and thus it admirably demonstrates the mechanism of OO.

Two Structs to Make a File System

The data structures used to represent a file system include a pair of C structs, the first of which is:

struct vfs {
        struct vfs    *vfs_next;         /* next vfs in list */
        struct vfsops *vfs_op;           /* operations on vfs */
        struct vnode  *vfs_vnodecovered; /* vnode we cover */
        int           vfs_flag;          /* flags */
        int           vfs_bsize;         /* native block size */
        caddr_t       vfs_data;          /* private data */
};

The vfs struct contains a description of a file system -- flags, block size, and the like. But the kernel deals with more than one file system, so this struct also has a pointer, *vfs_next, to the next file system in a list.

Of interest for the present discussion is the *vfs_op pointer, which points to a second struct:

struct vfsops {
        int (*vfs_mount)();
        int (*vfs_unmount)();
        int (*vfs_root)();
        int (*vfs_statfs)();
        int (*vfs_sync)();
        int (*vfs_fid)();
        int (*vfs_vget)();
};

This struct is a set of function pointers, each of which points to a function that implements one of the operations that can be performed on the file system as a whole (mount, unmount, sync, and the like).

Likewise, another pair of structs called vnode and vnodeops abstract the the operations that can be conducted on the contents of a mounted file system using a set of function pointers that implement open, close, read, write, and so on.

In OO terms, the vfs struct is an object that implements an interface defined by the functions in the vfsops struct, and each of the function pointers in that struct points to a function that acts as a method that can be called on that object. When these functions are called, a pointer to the appropriate vfs struct must be passed as the first parameter, making the vfs struct equivalent to a this pointer in C++ (which was implemented exactly this way in older compilers that translated C++ to C).

This is also fundamentally how other OO systems work, give or take some hacks for performance, as Rob Pike wrote in 1989:

I argue that clear use of function pointers is the heart of object-oriented programming. Given a set of operations you want to perform on data, and a set of data types you want to respond to those operations, the easiest way to put the program together is with a group of function pointers for each type. This, in a nutshell, defines class and method.

In addition to the well-defined interface provided by the VFS, systems that implemented the VFS typically also provided a file system called nullfs whose functions did nothing at all. New file system implementations could begin with this code and replace each function until they've built a working file system. In this way, nullfs can be viewed as a prototype -- in the Self/Javascript sense -- for VFS objects.

It's worth going back and reading over those two structs again. Even if you don't know much about file systems, it's pretty easy to see the broader pattern: objects that implement interfaces that define methods. There are many definitions of OO programming, but the basics are pretty simple and the OO approach can be applied in just about any language. In the next section you'll see a simple OO system written in 15 lines of code (but not in C).

Objects in the Lambda Calculus

We now have a sense of what OO programming is from the perspective of the machine, but what does it mean for us humans?

The history of OO programming in the commonly used sense of the term is inextricably bound to the history of Functional Programming (FP) in general, and the Lisp family of languages in particular. Alan Kay famously said that "Lisp is the most important idea in computer science," and his research team begins new language experiments by implementing a Lisp with which to build whatever they've got in mind.

One of the reasons for this is that it's trivial to build an OO system using FP techniques. If, for example, we were to re-write the vfsops struct in functional terms, each vfsops function pointer could be treated as a partially applied function bound to the appropriate vfs structure (shown here in Clojure notation):

;; trimmed for brevity
(def filesystem-handle
  (let [vfs    {:data    "Implementation specific"}
        vfsops {:mount   (partial mount vfs)
                :unmount (partial unmount vfs)}]
    vfsops))

The simplicity of implementing objects using closures spawned a long-running debate in the PLT community on whether objects are really anything more than closures in disguise (and vice versa), in response to which there's an MIT koan that runs:

The venerable master Qc Na was walking with his student, Anton. Hoping to prompt the master into a discussion, Anton said "Master, I have heard that objects are a very good thing - is this true?" Qc Na looked pityingly at his student and replied, "Foolish pupil - objects are merely a poor man's closures."

Chastised, Anton took his leave from his master and returned to his cell, intent on studying closures. He carefully read the entire "Lambda: The Ultimate..." series of papers and its cousins, and implemented a small Scheme interpreter with a closure-based object system. He learned much, and looked forward to informing his master of his progress.

On his next walk with Qc Na, Anton attempted to impress his master by saying "Master, I have diligently studied the matter, and now understand that objects are truly a poor man's closures." Qc Na responded by hitting Anton with his stick, saying "When will you learn? Closures are a poor man's object." At that moment, Anton became enlightened.

And, indeed, if we take the above Clojure example slightly further, we can write a 15-line Clojure macro that implements a primitive object system:

(defmacro defclass [class-name & slots]
  (let [ctor-name (symbol (str "new-" class-name))
        slot-map  (apply hash-map slots)]
    `(defn ~ctor-name [& init-values#]
       (let [this-ptr# (ref (merge ~slot-map (apply hash-map init-values#)))
             meths# (apply hash-map
                           (mapcat
                            (fn [slot#]
                              (list
                               (.replace (str slot#) ":" ":get-")
                               #((deref this-ptr#) slot#)
                               (.replace (str slot#) ":" ":set-")
                               #(dosync (alter this-ptr# assoc slot# %))))
                            (keys ~slot-map)))]
         (fn [& args#] (apply (get meths# (str (first args#))) (rest args#)))))))

;; new fighter class with defaults for slots
(defclass fighter :height 1.78 :weight 80 :reach 1.83)

;; a new fighter instance with default values
(def wallace (new-fighter))

(wallace :get-weight)
;; => 80

;; a new fighter instance with a default overridden
(def butch (new-fighter :weight 78))

(butch :get-weight)
;; => 78

(butch :get-reach)
;; => 1.83

;; call the setter on this object's reach
(butch :set-reach 1.78)
;; => {:weight 78, :height 1.78, :reach 1.78}

(butch :get-reach)
;; => 1.78

OO programming is, at heart, just one approach to organizing functions and data, which are the true building blocks of computer programs. There are domains in which these techniques pay dividends, such as simulation (thus Simula), and others where they're a burden. The question of whether to use them comes down to how we most naturally and comfortably model a given problem domain. Your language shouldn't force you to use OO, or any other abstraction, but rather give you the tools to build whichever you need.

Thanks to Mark Carmichael, Paul Ford, Jon Riecke, Steven Sargent, Stan Switzer and Joseph Wilk for reading and commenting on drafts of this piece.

This entry is part of my journal, published July 29, 2013, in Berlin.