Intro to OOP

Code Walk

Issue 1 - October 1995

This month's Code Walk column offers an introduction to the concepts of object oriented programming.

Level: Introductory

The fundamental difference between procedural and object programming is this: procedural programming focuses on verbs and oop focuses on nouns. When designing a program using procedural programming, a designer thinks about what processes must be performed. What will the program do? Object oriented programming asks, What are the different types of data, and how do they interact?

Abstracting the nouns allows real things and their relationships to be represented more directly in the software. In a bank, there are accounts, loans, and customers. If the bank opens a new account, say an interest checking account, this can be added to the system as a new object. Because it is an account, it should be able to do most of the common work that all accounts do, plus it should be able to do the special processing at the end of the month that makes it an interest checking account. Oop provides this close relationship between real things and the code which represents them.

An easier transition

It may seem that thinking about nouns instead of verbs is something radical and difficult. But oop can also be thought of as procedural programming that conforms to special rules and disciplines. That is to say, when procedural programming is practiced with a particular discipline and style, it naturally grows into oop.

When you add encapsulation, polymorphism, and inheritance, the various components of the procedural code begin to fall together into objects.

To show how this can happen, let's watch a mythical programmer, Vern, as he struggles through writing applications for The National McGreedy Bank. His first task was to write a program to print monthly checking account statements.

Because he was thinking of verbs instead of nouns, Vern started by writing a function called print_statements. He read customer names from a file and inserted them into an array. Since there were only 2000 banking customers, he made his array large enough to hold 3000 elements, to allow for growth.

He loaded the array somewhere near the beginning of program execution and then accessed it globally by the different modules that interacted with the list of customers.

Then he learned that he needed to print the statements in order by zip code so the bank could save money mailing the statements. So, Vern changed the insertion module to insert a customer into the array by moving all elements following the insertion point down one position. The program was released and it worked well.
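Vern's shifting insertion might have looked something like the sketch below. The article never shows this code, so the customer record and its fields, the names customers, customer_count, and insert_customer, and the return codes are all assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_CUSTOMERS 3000   /* Vern's original capacity */

/* Hypothetical customer record; the article does not list its fields. */
struct customer {
    char name[40];
    long zip_code;
};

/* The global array that every module reaches into directly. */
static struct customer customers[MAX_CUSTOMERS];
static int customer_count = 0;

/* Insert *c so the array stays sorted by zip code.
   Returns 0 on success, -1 if the array is already full. */
int insert_customer(const struct customer *c)
{
    int pos = 0;

    assert(c != NULL);
    if (customer_count >= MAX_CUSTOMERS)
        return -1;

    /* Find the first element with a larger zip code. */
    while (pos < customer_count && customers[pos].zip_code <= c->zip_code)
        pos++;

    /* Move all elements from the insertion point down one position. */
    memmove(&customers[pos + 1], &customers[pos],
            (size_t)(customer_count - pos) * sizeof customers[0]);

    customers[pos] = *c;
    customer_count++;
    return 0;
}
```

Each insertion may have to move every element already in the array, so the cost of loading the whole list grows roughly with the square of the number of customers.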

Then, the bank went through a large marketing plan and by offering a free oven mitt or plaid plastic thermos to all new customers, increased its depositors to 7000.

The next month, Vern's boss told him that the statement printing program had failed. Vern then printed out the program and began to examine the many functions. Of course he found that the array wasn't big enough, so he changed it to hold 10,000 customers, figuring that would certainly be plenty.

Now the program ran, but it took two hours instead of a half hour. He found that his zip code insertion sort was horribly inefficient and he decided to change the array to a binary tree that performs insertions much more quickly.

"How long will that take?" asked his boss.

Vern scratched his head and said, "Well, just about every module does something with the array, to read an address, or cross reference an account number."

"How long?"

"I'll have to change something in every function."

"How long?"

"It will practically be a rewrite. It took three months to write it in the first place."

"So what are you telling me, Vern? Are you telling me three months?"

"Maybe."

And thus, Vern learned the first principle of oop.

Encapsulation

Encapsulation is the technique of creating data structures whose implementation is unknown to the functions which use them. All interaction with a data structure is through functions which know the details of implementation. When Vern used an array and then thought of using a binary tree, he was using data structures, but not encapsulated ones.

What he wanted was not an array of customers or even a binary tree of customers. What he wanted was a data structure called customer group, which would allow him to insert customers and examine them one at a time.
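One way to express such a customer group in C is an incomplete struct type with a handful of access functions. The sketch below is only an assumption about what Vern might have written: the names group_create, group_insert, group_first, group_next, and group_destroy, and the fields of the customer record, are hypothetical. The implementation shown happens to be a growable sorted array, but callers cannot tell.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical customer record. */
struct customer { long account_num; long zip_code; };

/* ---- Interface: all that the rest of the program relies on ---- */
struct customer_group;   /* incomplete type, layout hidden from callers */
struct customer_group *group_create(void);
int group_insert(struct customer_group *g, const struct customer *c);
const struct customer *group_first(struct customer_group *g);
const struct customer *group_next(struct customer_group *g);
void group_destroy(struct customer_group *g);

/* ---- Implementation: could become a binary tree without
        changing any caller ---- */
struct customer_group {
    struct customer *items;
    size_t count, capacity, cursor;
};

struct customer_group *group_create(void)
{
    struct customer_group *g = malloc(sizeof *g);
    if (g != NULL) {
        g->items = NULL;
        g->count = g->capacity = g->cursor = 0;
    }
    return g;
}

/* Insert in zip code order; returns 0 on success, -1 if out of memory. */
int group_insert(struct customer_group *g, const struct customer *c)
{
    size_t pos;

    assert(g != NULL && c != NULL);
    if (g->count == g->capacity) {           /* grow instead of failing */
        size_t new_cap = g->capacity ? g->capacity * 2 : 16;
        struct customer *p = realloc(g->items, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        g->items = p;
        g->capacity = new_cap;
    }
    pos = g->count;
    while (pos > 0 && g->items[pos - 1].zip_code > c->zip_code) {
        g->items[pos] = g->items[pos - 1];   /* shift larger entries down */
        pos--;
    }
    g->items[pos] = *c;
    g->count++;
    return 0;
}

/* Examine customers one at a time; NULL marks the end. */
const struct customer *group_next(struct customer_group *g)
{
    return g->cursor < g->count ? &g->items[g->cursor++] : NULL;
}

const struct customer *group_first(struct customer_group *g)
{
    g->cursor = 0;
    return group_next(g);
}

void group_destroy(struct customer_group *g)
{
    if (g != NULL) {
        free(g->items);
        free(g);
    }
}
```

In a real program the interface would normally live in a header file and the struct definition in a single source file, so that no other module could touch the fields even by accident.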

Luckily he realized this, and in the four and a half months he spent rewriting the statement printing program, he encapsulated not only customer group, but also the checking account itself.

This was a good thing because the very next day after the program was finished, the bank changed its policy and decided to begin printing monthly saving statements as well as checking statements.

"Can you do that?" Vern's boss asked.

Vern poured himself a cupful of jolt cola from his new plastic thermos and thought for a moment.

His boss continued, "Won't this encapsulation thing you came up with help?"

"Oh, definitely. But checking accounts are different. And you said that you want only a single statement if a customer has both a savings and a checking account, so that's really three types of statements."

"How long will it take?"

Vern knew that three months would not be a good answer, so he merely said, "I'll look into it and let you know."

Luckily, for Vern's sake, and that of his wife and baby, it took him only a week. First he needed to write the new functions to calculate, format and print the savings, checking, and combined statements. Then, at each point where he called one of his encapsulated functions, he used if statements like this:

/* Report heading */
if ( customer.checking_num && customer.saving_num )
   format_combined_heading (statement, customer);
else if ( customer.checking_num )
   format_checking_heading (statement, customer);
else if ( customer.saving_num )
   format_saving_heading (statement, customer);

and

/* Print the balance */
if ( customer.checking_num && customer.saving_num )
   format_combined_balance (statement, customer);
else if ( customer.checking_num )
   format_checking_balance (statement, customer);
else if ( customer.saving_num )
   format_saving_balance (statement, customer);

Vern was very proud of his quick turnaround and after this latest enhancement, he scheduled vacation to take his family to Disneyland. However, as he was packing his suitcase the next Monday morning, the phone rang. Vern stopped humming "It's a small world after all" and answered the phone.

"Vern, can you explain why the checking statements of new customers are indicating a service charge after Mister McGreedy paid two million dollars for the ad slogan: New customers get no service at McGreedy?"

"Maybe the computer needs cleaning."

Hearing no response from his boss, Vern quickly hung up and went back into work, where he found that he had forgotten to add extra if statements to account for new customers.

In a few minutes, he added extra if statements on the report printing functions having to do with service charges, totals, and itemization. For example:

/* Show Service Charge Transactions */
if ( customer.checking_num && customer.saving_num && customer.new )
   format_combined_interest_new_cust (statement, customer);
else if ( customer.checking_num && customer.saving_num )
   format_combined_interest (statement, customer);
else if ( customer.checking_num && customer.new )
   format_checking_interest_new_cust (statement, customer);
else if ( customer.checking_num )
   format_checking_interest (statement, customer);
else if ( customer.saving_num )
   format_saving_interest (statement, customer);

After modifying the seven or eight critical if blocks, he went back to his boss and announced that it should work now.

"Are you sure?" his boss asked.

"I think so. Can I go see Snow White now?"

"Take off those mouse ears and wait for the statements to print."

When Vern got back from his vacation, he discovered another principle of oop.

Polymorphism

What Vern did was to add a set of function pointers to his account and customer data structures. He then replaced all the multiple if statements with a single function call invoked through the function pointer. So, for example, he always called account.format_interest(account, customer, statement), which pointed to one of the five different functions.

Vern was very lucky to have thought of this, for the very next day, the bank created a new type of saving account, a money market account. This time, all Vern had to do was write new functions for the new account, set the function pointers, and let it go.

Because all the functions that did things with accounts did them by invoking functions through the account's function table, he knew that he didn't even have to look at the rest of the program. He didn't have to add new if statements throughout the code. After he attached the functions to the new account type, he was done.

He realized that polymorphism allows new features to be added to a program with additions only. Existing code does not have to be touched.

This is possible because the functions are accessed through pointers and these can be set to anything while the program is running. This is called dynamic, or run time, binding of functions.
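The function pointer approach described above could be sketched as follows. This is a simplified illustration, not Vern's actual code: the struct layout, the name make_account, and the three-argument signature are assumptions (the call quoted in the article also passes a customer and a statement).

```c
#include <assert.h>
#include <stdio.h>

struct account;   /* forward declaration for the pointer type below */

/* Hypothetical signature for a formatting function. */
typedef int (*format_fn)(const struct account *acct, char *out, size_t size);

struct account {
    const char *owner;
    double      balance;
    format_fn   format_balance;   /* bound when the account is created */
};

int format_checking_balance(const struct account *a, char *out, size_t n)
{
    return snprintf(out, n, "Checking: %s %.2f", a->owner, a->balance);
}

int format_saving_balance(const struct account *a, char *out, size_t n)
{
    return snprintf(out, n, "Saving: %s %.2f", a->owner, a->balance);
}

/* Added later for the new account type; nothing above or below changes. */
int format_money_market_balance(const struct account *a, char *out, size_t n)
{
    return snprintf(out, n, "Money market: %s %.2f", a->owner, a->balance);
}

/* Setting the pointer here is the run time binding. */
struct account make_account(const char *owner, double balance, format_fn fn)
{
    struct account a;

    assert(fn != NULL);
    a.owner = owner;
    a.balance = balance;
    a.format_balance = fn;
    return a;
}

/* Statement code: one call through the pointer, no if statements. */
int write_balance_line(const struct account *a, char *out, size_t n)
{
    return a->format_balance(a, out, n);
}
```

Here write_balance_line never needs to know how many account types exist, which is why adding the money market formatter requires no edits to it.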

As soon as he was finished adding the new saving account type, his boss came back and told him that the bank was now offering an interest bearing checking account for customers that have a thousand dollars or more in a savings or money market account.

"How long will that take?"

"Not long," Vern answered. "When I create the checking account data structure, I'll examine the balance of the other accounts, if they exist, and if either is greater than or equal to a thousand dollars, I'll assign its end of the month processing function pointer to a new function that first calls the old end of month function and then adds a new transaction to compute the interest."

"Huh?"

"I'll have it by lunch."

"Good."

"Oh, do I use the interest computation from the savings account or the money market account?"

"I didn't ask. I didn't think we'd need to know that for another couple days."

Vern had discovered another principle of oop.

Inheritance

What Vern discovered was that to add a new data structure that is almost identical to one that already exists, all you need to do is create a new object which inherits the common behavior, and then redefine a particular behavior or simply add to it by first calling the original function and then adding the new code.
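In Vern's C program, this kind of inheritance might be approximated as in the sketch below. The field names, the flat service charge, and the simple monthly rate are assumptions made for illustration, and the check on the customer's other account balances is left out.

```c
#include <assert.h>

struct account;
typedef void (*month_end_fn)(struct account *acct);

struct account {
    double       balance;
    double       service_charge;   /* hypothetical flat monthly charge */
    double       monthly_rate;     /* hypothetical; used by interest checking */
    month_end_fn end_of_month;
};

/* Original behavior: ordinary checking deducts the service charge. */
static void checking_end_of_month(struct account *a)
{
    a->balance -= a->service_charge;
}

/* New behavior: call the original function first, then add to it. */
static void interest_checking_end_of_month(struct account *a)
{
    checking_end_of_month(a);
    a->balance += a->balance * a->monthly_rate;
}

struct account make_checking(double balance, double service_charge)
{
    struct account a;

    assert(service_charge >= 0.0);
    a.balance = balance;
    a.service_charge = service_charge;
    a.monthly_rate = 0.0;
    a.end_of_month = checking_end_of_month;
    return a;
}

/* "Inherit" everything from checking, then redefine one behavior. */
struct account make_interest_checking(double balance, double service_charge,
                                      double monthly_rate)
{
    struct account a = make_checking(balance, service_charge);

    a.monthly_rate = monthly_rate;
    a.end_of_month = interest_checking_end_of_month;
    return a;
}
```

Callers still write acct.end_of_month(&acct) for either kind of account; only the constructor knows which function was attached.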

Language support

Vern discovered on his own the principles of oop and his new statement printing program became very flexible and reliable. But he had to be responsible for supporting the encapsulation, polymorphism, and inheritance himself.

When an object oriented language is used, like C++, these principles are supported by the language itself. Encapsulation can be enforced with the private access specifier. Polymorphism is supported through virtual functions, and inheritance is supported through the mechanism of base and derived classes.

So, when Vern heard of C++, he decided to rewrite his program using this language.

He understood all the principles of oop, but he soon felt lost in the cumbersome syntax of the language.

One day his boss came by and asked, "How's the new rewrite going?"

"I don't know. What's a protected abstract virtual base pure virtual private destructor? And do I really need one? Also why is a friend operator overloading function always declared within a class definition if its scope is global? And what's the declaration for a pointer to an array of references to functions returning a reference to a pointer to int?"

"Huh?"

"If I declare a virtual function as inline and then redefine it in a derived class, does that mean that I do or don't have to recompile the user of the abstract class? Should I not declare virtual functions inline at all?"

"In what line?"

"Look here. Do you think I should make this a pure virtual abstract base class and use multiple inheritance, or instantiate it as a contained member object within the other class?"

"How long do you think this is going to take?"

Vern sat back a moment and looked at the five books lying open on his desk, then looked at the notes he had been making and the printout of the original program and said, "I don't know, two, three months."

No silver bullet

Vern soon learned that while C++ is the accepted oop language throughout the industry, its leadership role is more because of its roots in C than its implementation of oop, which is often cryptic and awkward. Other, more pure object languages have the same advantages of oop but are not so full of obscure rules. These rules come from the fact that, while C++ supports polymorphism and inheritance, it is still very much a strongly typed language. To allow the flexibility of operator and function overloading, polymorphism, and inheritance and still retain compiler type checking requires special techniques like copy constructors and conversion functions, which must all be controlled by the programmer. So, C++ is object oriented, very flexible, strongly typed, and can be rather difficult.

Beyond the syntax of C++, Vern discovered other objections to using oop. Are indirect functions less efficient than regular functions? Some languages actually perform a table lookup by name whenever a function is invoked. With C++, there are tables of functions, called v-tables, but no name search is performed so calling a function through the v-table takes only slightly longer than a regular function call.

However, oop also encourages the wrapping of objects within other objects, to preserve a strict encapsulated interface. Where a procedural language would do things in a direct manner, the same implementation in an object oriented language might require one function to call another function which performs some translation then calls another function that calls a function to do the actual work. Many times, these functions are tiny and do simple things like change the order or amount of indirection of parameters, or add or remove parameters. All these extra functions help provide proper encapsulation, but at the expense of function calling overhead.

C++ answers this by providing inline functions. When a regular function is called, the compiler normally produces machine code to save local variables on the stack and to branch to the function. An inline function causes the actual code of the function to be repeated at each call. This allows an object to have as many jacket functions as it wants without paying the price of actual function calls.

He also learned that some people object to the style of oop in general. If classes are badly designed, it may be difficult to extend them through inheritance. If you want to inherit only part of a class definition, you're stuck taking everything. As one person put it, "Even if you only want a banana, you have to take the entire gorilla."

Programming with objects often diverts the design process from focusing on the end product to worrying about the subtle options of class construction like when to use abstract base classes, how flat or deep the inheritance tree should be, how different classes interact with each other, and when to use inclusion instead of inheritance.

Understanding the objects and their relationships is only the beginning, and finding the most appropriate way to represent them is a problem best solved by experience.

Vern eventually overcame the syntax and design questions, with a little help from C++ programming journals, user groups, and online forums. His ability to write programs quickly with few errors helped him get promoted. He soon found himself assigned to direct the development of a new transaction processing program. Then he learned of another obstacle to oop.

The culture of oop

One day Vern was called into his boss' office.

"I understand, Vern, that you have given assignments to some programmers. Does this mean that in only four short weeks you have finished your entire analysis and design of the transaction processing program?"

"Of course not, but . . ."

"Then why are you trying to implement something before you have a design?"

"We're prototyping the concepts as we go. Because of the encapsulation and flexibility of object oriented programming, we can determine the major conceptual components and begin designing some of their behavior and their abstract interfaces. We don't have to know the details of any of the classes just yet."

"Vern, those programmers have other responsibilities. They can't be taken off their project just to help you figure out the details of your design. That's your job."

"But they're not just testing the design. They're building the actual program. Building it, testing it, trying different things. Many things they learn will affect the design earlier rather than later, and as requirements are finalized, they will be able to start filling in the details of the derived classes to do the specific work."

"It still sounds as if you're getting them to do your job. Why did you pick those two anyway? They're the busiest of the group."

"Because they're the only ones who know C++ or anything about object programming. And they wanted to work on it."

"Listen, Vern, you're a bright guy, but I think this object thing is just another passing phase like Prolog or CASE tools. They may be good theoretically, but they don't get the job done. I know you feel strongly about it, but it's becoming disruptive to how we do things here."

Vern thought about that a moment then decided to go ahead and speak his mind. "Well," he began slowly, "maybe we should change how things are done here."

That's when Vern realized a very basic truth of oop. Its principles can be thought of as simple enhancements to procedural programming, but to properly program with objects requires a new way of thinking about software development. The roles of programmers and designers must grow closer together. The single pass approach of analysis, design, implementation, testing, and delivery does not support the type of dynamic development available with oop, which allows the design to be tested and tuned and which enables prototype code to become part of the final product in a clean manner.

"And why should we do that?" his boss wanted to know.

"For the same reason we no longer use assembly language and don't hire keypunch operators. The software industry is about change. You can deal with change or ignore it, but it doesn't go away."

"Vern, we're talking about retraining, changing policies and priorities, and remember you had a lot of problems learning that language of yours. It won't be easy for anyone."

"No, maybe not. But we're not here to do the easy thing. We're here to write good software and I think this will help us do that."

His boss leaned back heavily into his chair, sighed explosively and said, "I'll think about it."

:^D