
The Cantor Set
onsider
a line segment of unit length. Remove its middle third. Now remove
the middle thirds from the remaining two segments. Now remove the
middle thirds from the remaining four segments. Now remove the middle
thirds from the remaining eight segments. Now remove ... well, you get
the idea. If you could continue this construction through infinitely
many steps, what would you have left?
What remains after infinitely many steps is a remarkable subset of the real numbers called the Cantor set, or “Cantor’s Dust.” At first glance one may reasonably wonder if there is anything left. After all, the lengths of the intervals we removed all add up to 1, exactly the length of the segment we started with:
Yet,
remarkably, we can show that there are just as many “points” remaining
as there were before we began! This startling fact is only one of the
many surprising properties exhibited by the Cantor set. Before
we begin to expose these properties, it is important to be quite
precise about this construction. Let us agree that the segments we
remove at each stage of the construction are open intervals. That is, in the first step we remove all of the points between
1/3 and 2/3, but leave the end points, and similarly for each
successive stage. A little reflection will convince you that these
endpoints we leave behind never get removed, since at each stage
we are only removing parts that lie strictly between the endpoints left
behind at the previous stage. Thus we see that our Cantor set cannot be
empty, since it contains 0, 1, 1/3, 2/3, 1/9, 2/9, 7/9, 8/9, 1/27, and
so on. But in fact there is much more that remains. To see this, recall that we may choose any number base
to represent real numbers. That is, there is nothing necessary or even
special about our common use of base ten; we can just as easily
represent our numbers using base two, or base three, or any other base.
When
a number is written in base two it is said to be in binary notation,
and when it is written in base three it is said to be in ternary
notation. Let's focus on the ternary representations of the decimals
between 0 and 1. Since, in base three, 1/3 is equal to 0.1, and 2/3 is
equal to 0.2, we see that in the first stage of the construction (when
we removed the middle third of the unit interval) we actually removed
all of the real numbers whose ternary decimal representation have a 1 in
the first decimal place, except for 0.1 itself. (Also, 0.1 is the same
as 0.0222... in base three, so if we choose this representation we are
removing all the ternary decimals with 1 in the first decimal
place.) In the same way, the second stage of the construction removes
all those ternary decimals that have a 1 in the second decimal place.
The third stage removes those with a 1 in the third decimal place, and
so on. (Convince yourself that this is so. Begin by noticing that 1/9
is equal to 0.01 and 2/9 is equal to 0.02 in base three.) Thus,
after everything has been removed, the numbers that are left – that is,
the numbers making up the Cantor set – are precisely those whose
ternary decimal representations consist entirely of 0’s and 2’s. What
numbers does this include, besides the ones already noted above? How
many are there? Lots.
Consider 1/4. This is not one of the endpoints (those all have powers
of three in the denominator), but it is not hard to show that 1/4 is in
the Cantor set. Begin by writing 4 in ternary notation (as 11 – one
“1” plus one “3”), and then use long division to get its ternary decimal
representation:
Since
the decimal expansion of 1/4 consists entirely of 0’s and 2’s, it was
never removed during the construction of the Cantor set, so it's still
there ... somewhere! Asking
how many numbers are left, as you can easily see, is to ask how many
numbers can be represented in ternary notation with no “1” in
any decimal place. But this must be as many as there are real numbers in
the unit interval – for consider: we may represent all the real
numbers between 0 and 1 in binary, and this is just every possible
decimal with a 1 or a 0 in each decimal place. And there can be no more
and no less of these than there are ternary decimals with a 0 or a 2 in
each decimal place. They correspond in an obvious way. The
conclusion is inescapable: once we remove all those intervals, the
number of points remaining is no less than the number we started with.
(If this seems inconceivable to you, you might wish to read the Infinity Minitext.) Let
us examine that “correspondence” more closely. The idea is evident:
for every number whose ternary decimal expansion consists entirely of
0’s and 2’s, match it with the corresponding number whose binary decimal
expansion has 0’s in the same place, and 1’s wherever the ternary
number had 2’s. Thus, 1/4 in ternary gets matched with 1/3 in binary:
This is evidently a function that is surjective. Moreover, it is continuous. (Elements that are “close” in the domain are mapped to elements that are “close” in the range.) We can extend this to a function,
called the Cantor function, from the entire unit interval onto itself,
by simply agreeing to let its value on the missing intervals be the
constant values which equal the values of the original function on the
endpoints of those intervals. For example, the Cantor function will map
each point in the first middlethird interval (1/3, 2/3)
to 1/2, the value of the original function on the points 1/3 and 2/3.
(Recall that 1/3 has ternary representation 0.0222... and 2/3 has
ternary representation 0.2, which map to 0.0111... and 0.1 respectively,
and these both represent the number 1/2 in binary.) If we were to
attempt to graph this function, it would look like this:
The
flat parts are the images of all of the “middle thirds,” and these are
all connected by the images of the Cantor set itself. This construction
has been called the “Devil's Staircase” since it has infinitely many
“steps.” A few more analytical tidbits: Since each interval removed was open, and there were only countably many of them, their union is also open. Thus, the Cantor set (which is the complement of this union) is closed. That is, it contains all of its accumulation points. Moreover, every point of the Cantor set is an accumulation point, since within any neighborhood
of a number whose ternary expansion consists entirely of 0’s and 2’s
one may find other such numbers. Consequently, the Cantor set is a perfect set
in the topologist’s sense. Finally, since any open neighborhood of any
point of the Cantor set contains an open set which is disjoint from the
Cantor set, we have that the Cantor set is nowhere dense. Altogether a remarkable set.
ADDENDUM:
Many
have gotten to know the Cantor set as a fractal. We touch here briefly
on these properties. Let us again visualize the construction:
Although
the Cantor set itself is to be thought of as the “final row” in this
picture, the picture considered altogether is very suggestive. Notice
that at each stage the picture is “doubled” into two copies which
precisely resemble the whole, but which at each stage become twothirds
smaller. Together
these properties – selfsimilarity at every scale over a uniform
reduction of scale – qualify the Cantor set as a fractal with Hausdorf
dimension given by:
The
Cantor set is an instructively simple example of a fractal,
demonstrating that our geometrical intuitions about space (even such
simple spaces as the unit interval) can fail to capture much of the deep
structure inherent in those very intuitions.






