Hiding Implementation Details
 

Code Review 2: Hiding Implementation Details

Embedded classes, protected constructors, hiding constants, anonymous enums, namespaces.

A good software engineer is like a spy. He exposes information to his collaborators on a need-to-know basis, because knowing too much may get them in trouble. I’m not warning here about the dangers of industrial espionage. I’m talking about the constant struggle with complexity. The more details are hidden, the simpler it is to understand what’s going on.

Using Embedded Classes

The class Link is only used internally by the linked list and its friend the sequencer. Frankly, nobody else should even know about its existence. The potential for code reuse of the class Link outside of the class List is minimal. So why don’t we hide the definition of Link inside the private section of the class definition of List?
class List
{
    friend class ListSeq;
public:
    List ();
    ~List ();
    void Add (int id);
private:
    // nested class definition
    class Link
    {
    public:
        Link (Link * pNext, int id)
            : _pNext (pNext), _id (id) {}

        Link *   Next () const { return _pNext; }
        int        Id () const { return _id; }
    private:
        Link     * _pNext;
        int        _id;
    };

private:
    // const, because ListSeq calls it through a List const &
    Link const * GetHead () const { return _pHead; }

    Link* _pHead;
};

The syntax of class embedding is self-explanatory.

Class ListSeq has a data member that is a pointer to Link. Being a friend of List, it has no problem accessing the private definition of class Link. However, it has to qualify the name Link with the name of the enclosing class List--the new name becomes List::Link.
class ListSeq
{
public:
    bool AtEnd () const { return _pLink == 0; }
    void Advance () { _pLink = _pLink->Next(); }
    int GetId () const { return _pLink->Id (); }
protected:
    ListSeq (List const & list)
        : _pLink (list.GetHead ()) {}
private:
    // usage of nested class
    List::Link const *_pLink;
};

The classes List and ListSeq went through some additional privatization (in the case of ListSeq, it should probably be called "protectization"). I made the GetHead method private, but I made ListSeq a friend, so it can still call it. I also made the constructor of ListSeq protected, because we never create it in our program--we only use objects of the derived class, IdSeq.
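To see these access rules at work, here is a minimal self-contained sketch. The bodies of the List constructor, destructor and Add are my assumptions (insertion at the head of the list), and DemoSeq and SumIds are hypothetical stand-ins for IdSeq and its client.

```cpp
#include <cassert>

class List
{
    friend class ListSeq;
public:
    List () : _pHead (0) {}
    ~List ()
    {
        // free all links (assumed implementation)
        while (_pHead != 0)
        {
            Link * pLink = _pHead;
            _pHead = _pHead->Next ();
            delete pLink;
        }
    }
    // assumed implementation: insert at the head of the list
    void Add (int id) { _pHead = new Link (_pHead, id); }
private:
    // nested class, invisible to clients of List
    class Link
    {
    public:
        Link (Link * pNext, int id)
            : _pNext (pNext), _id (id) {}
        Link * Next () const { return _pNext; }
        int    Id () const { return _id; }
    private:
        Link * _pNext;
        int    _id;
    };

    Link const * GetHead () const { return _pHead; }

    Link * _pHead;
};

class ListSeq
{
public:
    bool AtEnd () const { return _pLink == 0; }
    void Advance () { _pLink = _pLink->Next (); }
    int GetId () const
    {
        assert (!AtEnd ());
        return _pLink->Id ();
    }
protected:
    // only derived classes can construct a ListSeq
    ListSeq (List const & list)
        : _pLink (list.GetHead ()) {}
private:
    // a friend of List may name the private nested class
    List::Link const * _pLink;
};

// hypothetical derived class, playing the role of IdSeq
class DemoSeq: public ListSeq
{
public:
    DemoSeq (List const & list) : ListSeq (list) {}
};

// hypothetical client: adds up all the ids stored in the list
int SumIds (List const & list)
{
    int sum = 0;
    for (DemoSeq seq (list); !seq.AtEnd (); seq.Advance ())
        sum += seq.GetId ();
    return sum;
}

// Neither of these would compile outside of List and ListSeq:
//     List::Link link (0, 1);  // Link is private within List
//     ListSeq seq (list);      // the constructor is protected
```

A client can only reach the list contents through a derived sequencer, which is exactly the restriction described above.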

I might have gone too far with privatization here, making these classes more difficult to reuse. It's important, however, to know how far you can go and make an informed decision when to stop.

Combining Classes

Conceptually, the sequencer object is very closely tied to the list object. This relationship is only partly reflected in our code by having ListSeq be a friend of List. But we can do much better than that--we can embed the sequencer class inside the list class. This time, however, we don't want to make it private--we want the clients of List to be able to use it. As you know, outside of the embedding class, the client may only access the embedded class by prefixing it with the name of the outer class. In this case, the (scope-resolution) prefix would be List::. It makes sense then to shorten the name of the embedded class to Seq. On the outside it will be seen as List::Seq, and on the inside (of List) there is no danger of name conflict.

Here's the modified declaration of List:
class List
{
public:
    List ();
    ~List ();
    void Add (int id);
private:
    class Link
    {
    public:
        Link (Link * pNext, int id)
            : _pNext (pNext), _id (id) {}

        Link *  Next () const { return _pNext; }
        int     Id () const { return _id; }
    private:
        Link *  _pNext;
        int     _id;
    };

public:
    class Seq
    {
    public:
        Seq (List const & list)
            : _pLink (list.GetHead ()) {}
        bool AtEnd () const { return _pLink == 0; }
        void Advance () { _pLink = _pLink->Next (); }
        int GetId () const { return _pLink->Id (); }
    private:

        Link const * _pLink; // current link
    };

    friend Seq;
private:
    Link const * GetHead () const { return _pHead; }

    Link * _pHead;
};

Notice, by the way, how I declared Seq to be a friend of List following its class declaration. At that point the compiler knows that Seq is a class.

The only client of our sequencer is the hash table sequencer, IdSeq. We have to modify its definition accordingly.
class IdSeq: public List::Seq
{
public:
    IdSeq (HTable const & htab, char const * str)
        : List::Seq (htab.Find (str)) {}
};

And, while we're at it, how about moving this class definition where it belongs, inside the class HTable? As before, we can shorten its name to Seq and export it as HTable::Seq. And here's how we will use it inside SymbolTable::Find:
for (HTable::Seq seq (_htab, str);
     !seq.AtEnd ();
     seq.Advance ())
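The following sketch puts the two embedded sequencers together so that the loop above can be tried in isolation. The List is the one from this section with assumed bodies for its constructor, destructor and Add. The HTable shown here is a toy of my own (a fixed array of lists, a made-up Hash function and an Add method), not the hash table from the program, and SumIds is a hypothetical client.

```cpp
#include <cassert>

class List
{
public:
    List () : _pHead (0) {}
    ~List ()
    {
        while (_pHead != 0)
        {
            Link * pLink = _pHead;
            _pHead = _pHead->Next ();
            delete pLink;
        }
    }
    // assumed implementation: insert at the head of the list
    void Add (int id) { _pHead = new Link (_pHead, id); }
private:
    class Link
    {
    public:
        Link (Link * pNext, int id)
            : _pNext (pNext), _id (id) {}
        Link * Next () const { return _pNext; }
        int    Id () const { return _id; }
    private:
        Link * _pNext;
        int    _id;
    };
public:
    // public nested class, seen by clients as List::Seq
    class Seq
    {
    public:
        Seq (List const & list)
            : _pLink (list.GetHead ()) {}
        bool AtEnd () const { return _pLink == 0; }
        void Advance () { _pLink = _pLink->Next (); }
        int GetId () const
        {
            assert (!AtEnd ());
            return _pLink->Id ();
        }
    private:
        Link const * _pLink; // current link
    };

    friend class Seq;
private:
    Link const * GetHead () const { return _pHead; }

    Link * _pHead;
};

// toy hash table: every detail below is an assumption
class HTable
{
public:
    enum { size = 8 };
    void Add (char const * str, int id)
    {
        _aList [Hash (str)].Add (id);
    }
    List const & Find (char const * str) const
    {
        return _aList [Hash (str)];
    }
    // seen by clients as HTable::Seq
    class Seq: public List::Seq
    {
    public:
        Seq (HTable const & htab, char const * str)
            : List::Seq (htab.Find (str)) {}
    };
private:
    static int Hash (char const * str)
    {
        unsigned h = 0;
        for (; *str != '\0'; ++str)
            h = h * 31 + static_cast<unsigned char> (*str);
        return h % size;
    }
    List _aList [size];
};

// hypothetical client: adds up the ids in the bucket of str
int SumIds (HTable const & htab, char const * str)
{
    int sum = 0;
    for (HTable::Seq seq (htab, str);
         !seq.AtEnd ();
         seq.Advance ())
    {
        sum += seq.GetId ();
    }
    return sum;
}
```

Notice that the client never mentions List or Link. All it needs to know is the name HTable::Seq.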

Combining Things using Namespaces

There is one more example of a set of related entities that we would like to combine more tightly in our program. But this time it's a mixture of classes and data. I'm talking about the whole complex of FunctionTable, FunctionEntry and FunctionArray (add to it also the definition of CoTan which is never used outside of the context of the function table). Of course, I could embed FunctionEntry inside FunctionTable, make CoTan a static method and declare FunctionArray a static member (we discussed this option earlier). There is however a better solution. In C++ we can create a higher-level grouping called a namespace. Just look at the names of objects we're trying to combine. Except for CoTan, they all share the same prefix, Function. So let's call our namespace Function and start by embedding the class definition of Table (formerly known as FunctionTable) in it.
namespace Function
{
    class Table
    {
    public:
        Table (SymbolTable & symTab);
        ~Table () { delete []_pFun; }
        int Size () const { return _size; }
        PFun GetFun (int id) { return _pFun [id]; }
    private:
        PFun * _pFun;
        int    _size;
    };
}

The beauty of a namespace is that you can continue it in the implementation file. Here's the condensed version of the file funtab.cpp:
namespace Function
{
    double CoTan (double x) {...}
    class Entry {...};
    Entry Array [] =
    {...};
    Table::Table (SymbolTable & symTab)
        : _size(sizeof Array / sizeof Array [0])
    {...}
}

As you might have guessed, the next step is to replace all occurrences of FunctionTable in the rest of the program by Function::Table. The only tricky part is the forward declaration in the header parse.h. You can't just say class Function::Table;, because the compiler hasn't seen the declaration of the Function namespace (remember, the point of using a forward declaration was to avoid including funtab.h). We have to tell the compiler not only that Table is a class, but also that it's declared inside the Function namespace. Here's how we do it:
namespace Function
{
    class Table;
}

class Parser
{
public:
    Parser (Scanner & scanner,
            Store & store,
            Function::Table & funTab,
            SymbolTable & symTab );
    ...
};
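Here is a single-file sketch of the same arrangement, with comments marking what would go into which file. It is simplified under several assumptions of mine: the Parser is stripped down to one reference member, Table points at a static array instead of allocating its own, and Square, Twice and FunCount are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>

// --- parse.h: forward declaration inside the namespace ---
namespace Function
{
    class Table;
}

class Parser
{
public:
    // a reference to an incomplete type is fine here
    explicit Parser (Function::Table & funTab)
        : _funTab (funTab) {}
    int FunCount () const; // needs the full definition of Table
private:
    Function::Table & _funTab;
};

// --- funtab.h: the namespace is opened again ---
namespace Function
{
    typedef double (*PFun) (double);

    class Table
    {
    public:
        Table ();
        int Size () const { return _size; }
        PFun GetFun (int id) const
        {
            assert (id >= 0 && id < _size);
            return _pFun [id];
        }
    private:
        PFun const * _pFun;
        int          _size;
    };
}

// --- funtab.cpp: and continued once more ---
namespace Function
{
    double Square (double x) { return x * x; }
    double Twice (double x) { return 2 * x; }
    double CoTan (double x) { return std::cos (x) / std::sin (x); }

    PFun const Array [] = { Square, Twice, CoTan };

    Table::Table ()
        : _pFun (Array),
          _size (sizeof Array / sizeof Array [0])
    {}
}

// --- parse.cpp: by now Function::Table is complete ---
int Parser::FunCount () const { return _funTab.Size (); }
```

Only parse.cpp has to include funtab.h. The header parse.h gets away with the three-line forward declaration.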

By the way, the whole C++ Standard Library is enclosed in a namespace. Its name is std. Now you understand the std:: prefixes in front of cin, cout, endl, etc. (You've also learned how to avoid these prefixes with the using keyword.)

Hiding Constants in Enumerations

There are several constants in our program that are specific to the implementation of certain classes. It would be natural to hide the definitions of these constants inside the definitions of classes that use them. It turns out that we can do it using enums. We don’t even have to give names to enums—they can be anonymous.

Look how many ways of defining constants there are in C++. There is the old-style C #define preprocessor macro, there is a type-safe global const and, finally, there is the minimum-scope enum. Which one is the best? It all depends on type. If you need a typed constant, say, a double or a (user defined) Vector, use a global const. If you just need a generic integral type constant—as in the case of an array bound—look at its scope. If it’s needed by one class only, or a closely related group of classes, use an enum. If its scope is larger, use a global const int. A #define is a hack that can be used to bypass type checking or avoid type conversions--avoid it at all costs. By the way, debuggers don’t see the names of constants introduced through #defines. They appear as literal numerical values. It might be a problem sometimes when you don’t remember what that 74 stood for.
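The three styles can be put side by side in a small sketch. All the names here (MAX_COLUMN, pi, maxLineLen, Stack, maxStack) are made up for the comparison and do not come from our program.

```cpp
// Old C style: the preprocessor pastes in 74 before the compiler
// ever sees the name, so it has no type and respects no scope
#define MAX_COLUMN 74

// Typed constant: use a global const when the type matters
const double pi = 3.14159265358979;

// Integral constant needed all over the program
const int maxLineLen = 256;

// Integral constant needed by one class only: an anonymous enum
class Stack
{
public:
    enum { maxStack = 16 };

    Stack () : _top (0) {}
    bool IsFull () const { return _top == maxStack; }
    void Push (int i)
    {
        if (!IsFull ())
            _arr [_top++] = i;
    }
    int Count () const { return _top; }
private:
    int _arr [maxStack]; // the enum serves as an array bound
    int _top;
};
```

Outside of Stack the constant has to be spelled Stack::maxStack, so it can't clash with somebody else's maxStack. The macro offers no such protection.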

Here’s the first example of hiding constants using enumerations. The constant idNotFound is specific to the SymbolTable.
class SymbolTable
{
public:
    // Embedded anonymous enum
    enum { idNotFound = -1 };
    …
};

No change is required in the implementation of Find. Being a method of SymbolTable, Find can access idNotFound with no qualifications.

Not so with the parser. It can still use the constant idNotFound, since it is defined in the public section of SymbolTable, but it has to qualify its name with the name of the class where it's embedded.
if (_scanner.Token () == tLParen) // function call
{
    _scanner.Accept (); // accept '('
    pNode = Expr ();
    if (_scanner.Token() == tRParen)
        _scanner.Accept (); // accept ')'
    else
        _status = stError;
    // The use of embedded enum
    if (id != SymbolTable::idNotFound 
        && id < _funTab.Size ())
    {
        pNode = new FunNode (
            _funTab.GetFun (id), pNode );
    }
    else
    {
        cerr << "Unknown function \"";
        cerr << strSymbol << "\"\n";
    }
}
else
{
    // Factor := Ident
    if (id == SymbolTable::idNotFound)
    {
        // add new identifier to the symbol table
        id = _symTab.ForceAdd (strSymbol);
        if (id == SymbolTable::idNotFound)
        {
            cerr << "Error: Too many variables\n";
            _status = stError;
            pNode = 0;
        }
    }
    if (id != SymbolTable::idNotFound)
        pNode = new VarNode (id, _store);
}

Maximum symbol length might be considered an internal limitation of the Scanner (although one might argue that it is a limitation of the "language" that it recognizes--we’ll actually remove this limitation later).
class Scanner
{
public:
    // Embedded anonymous enum
    enum { maxSymLen = 80 };
    …
};
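To try out the embedded enum in isolation, here's a toy SymbolTable. Only idNotFound and the names Find and ForceAdd come from our program. The storage scheme, the maxSymbols enum and the helper IsKnown are assumptions made to keep the sketch short.

```cpp
#include <cassert>
#include <cstring>

class SymbolTable
{
public:
    // Embedded anonymous enums
    enum { idNotFound = -1 };
    enum { maxSymbols = 4 }; // hypothetical limit

    SymbolTable () : _count (0) {}

    // returns the new id, or idNotFound when the table is full
    int ForceAdd (char const * str)
    {
        assert (str != 0);
        if (_count == maxSymbols)
            return idNotFound;
        _names [_count] = str; // stores the pointer, not a copy
        return _count++;
    }

    int Find (char const * str) const
    {
        for (int i = 0; i < _count; ++i)
        {
            if (std::strcmp (_names [i], str) == 0)
                return i;
        }
        // inside the class no qualification is needed
        return idNotFound;
    }
private:
    char const * _names [maxSymbols];
    int          _count;
};

// hypothetical client: outside the class the name must be qualified
bool IsKnown (SymbolTable const & symTab, char const * str)
{
    return symTab.Find (str) != SymbolTable::idNotFound;
}
```

The same enum doubles as the error return of ForceAdd, which is how the parser code above detects that there are too many variables.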

Hiding Constants in Local Variables

A constant that is only used within a single code fragment should not, in general, be exposed in the global scope. It can just as well be defined within the scope of its usefulness. The compiler will still do inlining of such constants (that is, it will substitute the occurrences of the constant name with its literal value, rather than introducing a separate variable in memory). We have a few such constants that are only used in main:
int main ()
{
    const int maxBuf = 100;
    const int maxSymbols = 40;

    char buf [maxBuf];
    Status status;
    SymbolTable symTab (maxSymbols);
    …
}
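The same idea works in any function, not just in main. In the sketch below, CopyTruncated is a hypothetical helper invented for this illustration. Its constant maxBuf is invisible to the rest of the program, yet it can still serve as an array bound.

```cpp
#include <cassert>

// hypothetical helper: copies src into a local buffer, truncating
// it if necessary, and returns the number of characters copied
int CopyTruncated (char const * src)
{
    assert (src != 0);
    const int maxBuf = 8; // known only inside this function

    char buf [maxBuf];
    int i = 0;
    for (; i < maxBuf - 1 && src [i] != '\0'; ++i)
        buf [i] = src [i];
    buf [i] = '\0';
    return i;
}
```

Another function is free to define its own maxBuf with a different value, and the two will never interfere.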