By Per Nilsson
Environment: Developed in (but not restricted to) VC6.
This article is an
introduction to templates, iterators, and temporary classes illustrated by an implementation of a red-black tree.
The
Balanced Binary Tree articlegives a brief introduction to binary trees (and to recursion). There,my aim was to describe the binary tree concept with as fewtechnicalities as possible. For instance, the balancing techniquedescribed is easy to understand (I think), though not very practical inRealLife[size=-1]TM.
This time, I have a different approach, more focused on a tree'simplementation and coding details. This article is not a description ofhow red-black balancing works, but uses a red-black implementation toillustrate some more or less interesting coding concepts.
In one way, it could be considered a sequel to the Balanced Binary Tree article, answering questions such as:
- How could one implement a efficiently balanced tree that makes sense in the real world?
- Is recursion really the thing? Is there perhaps something even better?
But, it could also just be regarded as an introduction to how to use templates, iterators, and temporary classes.
The Self-Balancing TreeThe idea is to have a binary tree that fairlycheap; it keeps itself fairly balanced at all times. I've chosen thered-black tree algorithm. In short, it works like this:
- Whenever a new element is inserted or deleted, the elements are shuffled about to keep the balance.
A red-black tree accomplishes this by colormarking every element as either red or black, and by applying a set ofrules on the colored elements' relation, a tree gets fairly balanced.
Articles and information on red-black trees are already out there on the Net; use your favourite search engine...
TemplatesTemplates are really cool, but could at first glance could seem a bit complicated. But trust me, they aren't really.
A template is like a "#define on steroids." Itis a way to just define a behavior without deciding what specific kindof data the behavior is acting upon. The compiler will generate theneccessary code when it feels it has to.
Example:
template <class MyType>
class MyClass
{
MyType mFoo;
public:
MyType GetFoo() const { return mFoo;}
.
.
}
Here there is a template class (note thetemplate keyword and the < >'s), MyClass. It has an attribute,mFoo, but at this point we don't know what kind if type it really is.All we know is that there is an attribute and a method, GetFoo(), thatreturns it. Later on, when we can use the class, we specify what MyTypereally is:
.
.
MyClass<CString> b; // b is a MyClass whose
// MyType is a CString
CString s = b.GetFoo();
typedef MyClass<int> MyIntClass; // MyIntClass is a
// MyClass whose MyType
// is an int
MyIntClass a;
int i = a.GetFoo();
.
.
Does this seem to fit into a binary tree idea,or what? Ideally, we'd want to implement a binary tree that isall-purpose, generic, and so forth, and we now have a technique that atone place just could describe how the tree behaves, and later on wedecide what kind of elements it should hold.
We know (or at least can guess) that eachelement in the tree has some kind of key, and some kind of value. Atthis point, we just don't care which specific types they are. Kind oflike this:
template <class KeyType, class ValueType>
class BinTree
{
.
.
}
The source code actually has two templateclasses, the BinTree<class KeyType, class ValueType>, andBinTreeNode<class KeyType, class ValueType>. A BinTreeNode is theelement stored in the tree, where BinTree is the actual tree gizmo.
Some notes on template classes:
- They must be 100% inline.
- Only the stuff that is actually used willgenerate code. That is, it is not a problem if one designs a bigtemplate class with a lot of functions, but the application uses onlyone of them. Those not used will simply not exist in the executable.
IteratorsAn iterator is (often a class) used for navigating through some collection of data.
Typically, to use an iterator, you would:
- Initialize it.
- Determine whether it is finished.
- Get access to the element the iterator currently is at.
- Increment it to the next element.
Consider the simplest, and most well-used iterator in the history of software engineering:
The int!
(Sure, it isn't only just an iterator, but anyway...)
for(int i=0; i<10; i++)
{
Foo f = mFoo;
.
.
}
- The i=0 part initializes the iterator.
- The i<10 part determines whether it is finished.
- The mFoo part gives access to the element the iterator currently is at.
- The i++ part increments the iterator to the next element.
With a more generic approach, we could describe it like this:
for(SomeIterator i=theFirstElement;!i.isBeyondLastElement();i++)
{
Element e = i.GetElement();
}
For arrays and such, the int is sufficient. But,what if we want to get elements stored in another fashion? Given thegeneric approach, do we really need to care about the way the elementsare stored? Couldn't we just let the iterator handle that issue?
Well, of course we can, and that is also one ofthe big advantages of iterators, as they encapsulate the way data isstored, and just provide a means to access it.
And another thing, let's say you want to be ableto traverse the data in different ways. With the iterator approach, allyou have to do is to define another iterator and let it do the work.
Consider a Binary Tree:
- If we want to access the tree's element in asorted way, we could define a SortedWayIterator (conceptually, it couldbe used for access in reverse order as well, initialize it to the lastelement and -- instead of ++ the way through).
- If we want to access the elements in random order, we could define a RandomOrderIterator.
- If we want to access the elements in another fashion, we just define another iterator for the task.
The provided source implements four different kinds of iterators for traversing through the binary tree. With a tree like...

...elements are traversed like this:
| Iterator | Description | Order |
| BinTree::Iterator(++) | Sorted. | A B C D E F |
| BinTree::Iterator(--) | Reversed | F E D C B A |
| BinTree::ParentFirstIterator | Parent elements are traversed before their children. | D B A C F E |
| BinTree::ParentLastIterator | Parent elements are traversed after their children. | A C B E F D |
| BinTree::ByLevelIterator | Top to bottom, level by level, left to right. | D B F A C E |
(Exercise for the reader: Do ParentLastIterator with a -- on the ParentFirstIterator instead.)
The code using the iterators doesn't differ much, though. It is really just
for(BinTree::[IteratorNameHere]
i=theTree.Get[SomeKindOf]Iterator();!i.atEnd();i++)
{
// The iterators use the * (and ->) operator to provide access
// to the elements
BinTree::Node currentNode = (*i);
.
.
}
Iterators vs RecursionThe iterators differ quite a lot from the wayone uses recursion to traverse the tree. The iterators simply keeptrack of one node at the time and, depending on its children, parents,and whatnot, figures out which is the next node to visit.
The ByLevelIterator is an exception to this; itwill allocate an array (with roughly tree.Size()/2 nodes) that itupdates as it goes. The other iterators have virtually no memory impact(well, they do hold a pointer to the root, and a pointer to the currentnode, but in a world where we count memory in 100s of Mb, it's nothing).
The big advantage is, however, that iterators"hide" the way the data is stored. You could have similar iterators forother kinds of data structures (arrays, database stuff, whatever).Recursion is pretty much a special tree-thingie.
Temporary ClassesI'd like to be able to use the [] operator like this:
myTree["Foo"] = 42;and I expect it to work like this:
- If "Foo" already exist just update its value, else insert a new element ("Foo",42).
That's all quite easy to do; just let the tree have a [] operator that takes a KeyType as parameter and returns a ValueType.
However, I would also like to be able to access data in a similar manner, like this:
int i = myTree["Foo"];And now the big issue is:
What if "Foo" doesn't exist? Well, unlike the myTree["Foo"] = 42; situation, I
don't want a new element to be created. Actually, I think I'd like to throw an exeption instead.
So, the question is really:
- How can I let the tree's [] operator behave differently depending on what side of the = sign it is?
The solution isn't all that complicated (but I still think it's kinda cool):
Let the [] return a (temporary) class (instead of just the ValueType) that:
- Overloads the assignment operator
- Has a ValueType operator
The assignment operator will then handle themyTree["Foo"] = 42; situation, and the ValueType operator handles the i= myTree["Foo"] situation. From the provided source:
template <class KeyType, class ValueType>
class BinTree
{
public:
.
.
// BinTree::AccessClass. Used by the [] operator.
class AccessClass
{
// Let BinTree be the only one who can instantiate this class.
friend class BinTree<KeyType, ValueType>;
public:
// Assignment operator. Handles the myTree["Foo"] = 32;
situation
operator=(const ValueType& value)
{
// Just use the tree's Set method, it handles already
// exists/not exist
situation
mTree.Set(mKey,value);
}
// ValueType operator. Handles int i=myTree["Foo"],
// myTree["Bar"] == 44 etc.
operator ValueType()
{
Node* node = mTree.Find(mKey);
// Not found
if (node==0)
{
throw "Item not found";
}
return node->GetValue();
}
private:
//------------------------------
// Private Construction
//------------------------------
AccessClass(BinTree& tree, const KeyType& key):mTree(tree),
mKey(key){}
//------------------------------
// Disabled Methods
//------------------------------
// Default constructor
AccessClass();
//------------------------------
// Private Members
//------------------------------
BinTree& mTree;
const KeyType& mKey;
}; // AccessClass;
.
.
// operator [] for access to elements
AccessClass operator[](const KeyType& k)
{
return AccessClass(*this, k);
}
.
.
It will also handle comparisons, such astree["Foo"]==43 or tree["Foo"]==tree["Bar"] (any ValueType access,really) such that it will throw an exception if the requested elementdoesn't exist.
One could let AccessClass implement == operators(two operators: one taking ValueType as param and another takingAccessClass) that return false instead of throwing exceptions whenelements don't exist; in most cases they'd probably work.
They would have a slight problem, though:
tree["NonExistent"] == 43 would return false, while 43 == tree["NonExistent"] would throw an exception. It'd be a bit inconsistent.
But We Have This Already!With the Standard Template Library,
STL, comes someclasses (std::map and std::set) that conceptually are implementedpretty much like this:
- They are template classes.
- They use iterators.
- Internally, they are red-black trees.
If you really don't care about the implementation (but then, you probably shouldn't read this article, anyway), use them!
They do have a tendency to flood the output with inanecompiler warnings, though, and are for some reason as for from humanlyreadable as you can get (people designing code like that should bepublicly spanked, IMHO).
Anyway, if you are interested in implementation details, I hope I've given you something to start with...
About the Source CodeBinTree.hEverything is implemented as template classes in a single, commented, file. It has no external dependencies.
BinTreeTest.
cppThe demo project is a simple consoleapplication that tests that the tree works as expected. Check itssource to see how one uses the darn thing...
The red-black implementation in the source is to some extent based on:
http://www.oopweb.com/Algorithms/Documents/PLDS210/Volume/red_black.html
Downloads
Download demo project - 9 Kb
Download source - 6 Kb