Functional style frequently uses sequences. Nick Weatherhead applies these ideas to combinations in C++.
Previously [ Weatherhead15 ], I discussed the use of C++ templates to compile time program some well-known string katas. Template metaprogramming in this way imposes a functional style with the immutability of variables a key consideration. In this article I’m applying the same treatment to combinations of k elements and a variation on Eratosthenes sieve. The foundations for each are built as we go along including holding data within lists, adding and removing items from them, implementing queues, defining sequences and operating on them with sets. There is a bit to it so you may wish to skip ahead to a solution and come back to the detail. Either is fine but you’ll probably want to keep Listing 1 to hand for reference.
Lists
Without arrays and mutability at our disposal, a structure is required that can represent a list and after each operation produces a new variant. It’s common for functional languages to have an operation which adds an element to the beginning of a list and is typically known as cons (short for construct). This functional list differs from the imperative viewpoint of linearly linking elements with a head at one end, a body and a tail at the other. Instead it is woven from a compound pair of pairs. Each pair comprises a head plus tail that splits until a null tail terminates a given path. It’s good to have an appreciation of its recursive nature and the other features it affords see [ SICP96 ]. However, for the most part, this article will attempt to adhere to the metaphor for a single chain.
1 #include <iostream> 2 3 using namespace std; 4 5 struct nop { 6 7 static const char delim = '\0'; 8 9 friend ostream& operator<<( ostream& os 10 , const nop& ) { return os; } 11 }; 12 13 struct nil : nop { 14 15 typedef nil type; 16 17 template< size_t SIZE = 0 > struct size { 18 19 static const size_t value = SIZE; 20 21 friend ostream& operator<<( ostream& os 22 , const size& ) { return os << SIZE; } 23 }; 24 25 template< class RHS > 26 struct append : RHS { typedef RHS type; }; 27 template< class RHS > 28 struct prepend : append< RHS > { }; 29 template< class RHS > 30 struct uunion : append< RHS > { }; 31 template< class RHS > 32 struct except : nop { typedef nil type; }; 33 template< class RHS > 34 struct intrsct : except< RHS > { }; 35 36 template< size_t N, size_t I_SIZE, class O 37 , size_t O_SIZE = 0 > struct kombine_ 38 : conditional< O_SIZE == N, O, nop >::type 39 { }; 40 }; 41 42 template< class HEAD, class TAIL 43 , char DELIM = ',' > struct list_ { 44 45 #define t typename 46 47 typedef list_ type; typedef list_ LHS; 48 typedef HEAD head; typedef TAIL tail; 49 50 static const char delim = DELIM; 51 52 friend ostream& operator<<( ostream& os 53 , const list_& ) { return os 54 << HEAD( ) << tail::delim << TAIL( ); } 55 56 template< size_t SIZE = 0 > struct size 57 : tail::template size< SIZE + 1 > { }; 58 59 template< class RHS > struct append 60 : list_< LHS, RHS, DELIM > { }; 61 template< class RHS > struct prepend 62 : list_< RHS, LHS, DELIM > { }; 63 64 template< class LHS, class RHS > 65 struct append_ 66 : LHS::type::template append< RHS > { }; 67 template< class LHS, class RHS > 68 struct prepend_ 69 : LHS::type::template prepend< RHS > { }; 70 template< class LHS, class RHS > 71 struct except_ 72 : LHS::type::template except< RHS > { }; 73 template< class LHS, class RHS > 74 struct uunion_ 75 : LHS::type::template uunion< RHS > { }; 76 template< class LHS, class RHS > 77 struct intrsct_ 78 : LHS::type::template intrsct< RHS > { }; 79 80 template< class RHS, bool = true > 81 struct uunion 82 : conditional< ( LHS::head::value < 83 RHS::head::value ) 84 , append_< LHS::head 85 , uunion_< LHS::tail, RHS > > 86 , append_< t RHS::head 87 , uunion_< LHS , t RHS::tail > > 88 >::type { 89 typedef HEAD head; typedef TAIL tail; }; 90 91 template< bool NA > struct uunion< nil, NA > 92 : append_< head, tail > { }; 93 94 template< class RHS, bool = true > 95 struct intrsct 96 : conditional< is_same< LHS, RHS >::value 97 , LHS 98 , t conditional< ( LHS::head::value == 99 RHS::head::value ) 100 , append_< LHS::head 101 , intrsct_< LHS::tail, t RHS::tail > > 102 , t conditional< ( LHS::head::value < 103 RHS::head::value ) 104 , intrsct_< LHS::tail, RHS > 105 , intrsct_< LHS , t RHS::tail > 106 >::type 107 >::type 108 >::type { 109 typedef HEAD head; typedef TAIL tail; }; 110 111 template< bool NA > 112 struct intrsct< nil, NA > : nil { }; 113 114 template< class RHS, bool = true > 115 struct except 116 : conditional< is_same< LHS, RHS >::value 117 , nil 118 ,t conditional< ( LHS::head::value < 119 RHS::head::value ) 120 , append_< t LHS::head 121 , except_< LHS::tail, RHS > > 122 , t conditional< ( LHS::head::value > 123 RHS::head::value ) 124 , except_< LHS , t RHS::tail > 125 , except_< LHS::tail, t RHS::tail > 126 >::type 127 >::type 128 >::type { 129 typedef HEAD head; typedef TAIL tail; }; 130 131 template< bool NA > struct except< nil, NA > 132 : append_< head, tail > { }; 133 134 template< size_t N, size_t I_SIZE 135 , class O, size_t O_SIZE > class kkombine_ { 136 137 friend ostream& operator<<( ostream& os 138 , const kkombine_& that ) { 139 140 static const int i_size = I_SIZE - 1; 141 142 return os 143 << t TAIL::template kombine_< N, i_size 144 , prepend_< HEAD, O >, O_SIZE + 1 >( ) 145 << t TAIL::template kombine_< N, i_size 146 , O , O_SIZE >( ); 147 } }; 148 149 template< size_t N, size_t I_SIZE, class O 150 = nop, size_t O_SIZE = 0 > struct kombine_ 151 : conditional< !O_SIZE && I_SIZE == N 152 , prepend_< type, nop > 153 , t conditional< O_SIZE == N, O 154 , t conditional< !I_SIZE, nil 155 , kkombine_< N, I_SIZE, O, O_SIZE > 156 >::type 157 >::type 158 >::type { }; 159 160 template< size_t N > struct kombine 161 : kombine_< N, size< >::value > { }; 162 163 template< size_t N, bool NA = true > 164 struct powerset_ { 165 166 friend ostream& operator<<( ostream& os 167 , const powerset_& that ) { return os 168 << powerset_< N - 1 >( ) 169 << kombine< N >( ); } 170 }; 171 172 template< bool NA > 173 struct powerset_< 0, NA > : nop { }; 174 175 struct powerset 176 : powerset_< size< >::value > { }; 177 178 #undef t 179 }; 180 181 template< class HEAD, class TAIL 182 , char DELIM = ',' > struct list 183 : list_< HEAD, TAIL, DELIM > { }; 184 185 template< class TAIL, char DELIM > 186 struct list_< nop, TAIL, DELIM > : TAIL { 187 188 friend ostream& operator<<( 189 ostream& os, const list_& ) { 190 return os << ' ' << TAIL( ); } 191 }; 192 193 template< class HEAD, char DELIM > 194 struct list< HEAD, nil, DELIM > 195 : list_< HEAD, nil, DELIM > { 196 197 typedef HEAD type; 198 199 friend ostream& operator<<( 200 ostream& os, const list& ) { 201 return os << HEAD::value; } 202 203 template< class RHS > struct append 204 : list< HEAD, RHS, DELIM > { }; 205 template< class RHS > struct prepend 206 : list< RHS, HEAD, DELIM > { }; 207 }; 208 209 template< class T, T V, class VS, char DELIM > 210 struct v 211 : list< v< T, V, nil, DELIM >, VS, DELIM > { 212 static const T value = V; }; |
Listing 1 |
In Listing 1,
list_
(42–179) takes a
HEAD
element and
TAIL
elements; an additional parameter
DELIM
defines the separator used between elements when printing. Appending an underscore to a class’s name is the convention used here to indicate internal usage, hence,
list
(181–183) exposes
list_
. No-operation
nop
(5–11) simply places a null marker into the output as a guard element. A specialisation of
list_
(185–191) which has
nop
as its head element is used to indicate that it’s a list of lists and when printed each list can be delimited by another character. Another
list
(193–207) is provided for elements at the end of a list i.e. that’s
TAIL
is
nil
(13–40) where some functions are overridden whilst
nil
implements functions that operate on an empty list. When performingoperations between elements it is convenient to think in terms of what appears on the right and left hand sides, so
list_
provides some internal functions that take
RHS
and
LHS
respectively. A value of type
v
(209–212) is itself a
list_
with its
HEAD
being a single value element of the same type. Thus each element can be operated on in the same way.
Queues
Imperative lists and vectors describe consecutive elements, as does the functional list, and all can be used as the underlying structure of a queue. Removing the
HEAD
of a
list_
requires a simple reference to the
TAIL
elements. To
prepend
(28–29, 61–62, 205–206) elements another
list_
is created with the new content in the
HEAD
and the existing content in the
TAIL
;
append
(28–29, 59–60, 203–204) just reverses the concatenation of the
HEAD
and
TAIL
. There are also implementations of these that take explicit left and right hand side arguments – see
append_
(65–66) and
prepend_
(67–78).
Size
Functional programs don’t have loop constructs so rely on recursion to perform inductive operations. Templates don’t optimise tail recursive calls so each grows the stack, hence compilers place a limit on its depth (note that despite this if an expression is sufficiently long it’s still feasible to exhaust the memory). A classic way to observe this is to use a size function
size< >
(17–23, 56–57) that iterates over a list to obtain a count. What’s not always clear is that functors used as default template parameter arguments can be evaluated independently of the template to which they’re applied. Take
list_<HEAD, TAIL>::kombine_<N, I_SIZE, O = nop, O_SIZE = 0>
(149–158), if the input size
I_SIZE
is defaulted to
size<>
one might expect it to be calculated, as per a regular function, on invocation. However, as
size<>
’s template parameters are not dependent on
kombine_
’s the compiler instantiates it whenever
list_
is called. This is okay for k-combinations but would unintentionally execute for primes too. To prevent this the default is wrapped by
kombine<N>
(160–161). Alternatively how about having a size attribute i.e.
size = 1
for single elements and
size = HEAD::size + TAIL::size
as they are combined? This is fine when the
HEAD
and
TAIL
are lists. However, they are also used for list expressions; size isn’t intrinsic to these and their resultant type is unknown until fully evaluated.
Combinations
The brief is to find all the unique combinations of k elements from a set of distinct values. For the purpose of this discussion each element has its own letter and the input set is in alphabetical order.
In Figure 1 ( k -combinations) the input elements are treated as a queue; at each step the last item is removed from the top of the queue and either placed at the bottom of the output queue when branching right or dropped when going left. If the output reaches the desired length before the input queue is exhausted then it forms a terminal node and branching ceases. Further, if the input is of the required length and the output queue is empty then the input can be directly substituted for the output without further branching.
Here
k
-combinations (Listing 2) are represented with consecutive characters, specified by
c
and
cs
. As previously mentioned
list_<HEAD, TAIL>::kombine<N>
(160–161) wraps the initial call
list_<HEAD, TAIL>::kombine_<N, size< >, nop, 0>
. This takes parameters for the desired combination length, the size of the input list which when first called is the size of the initial list, the output list and its size which are initially empty. The underlying definition
list_<HEAD, TAIL>::kombine_<N, I_SIZE, O, O_SIZE>
(149–158) makes several checks. The first sees if the output can be directly substituted with the input. Otherwise a check is made to see if the output list has reached the desired length. If neither of these are satisfied
list_<HEAD, TAIL >::kkombine_<N, I_SIZE, O, O_SIZE>
(134–147) is called to branch left and right. If the end of the input list is reached i.e.
nil::kombine_<N, size<>, nop, 0>
(36–39) branching ceases with or without the output having reached the specified length. The
powerset
(175–176) is implemented as all the combinations of
k
for 0 to
n
elements.
… template< char C, class CS = nil , char DELIM = '\0' > struct c : v< char, C, CS, DELIM > { }; template< char C, char... CS > struct cs : c< C, cs< CS... > > { }; template< char C > struct cs< C > : c< C > { }; int main( ) { /* kombine('abcd', 2)=' ab ac ad bc bd cd' * * powerset('abcd')=' a b c d ab ac ad bc * * bd cd abc abd acd bcd abcd' */ cout << "\nkombine('abcd', 2)='" << cs<'a','b','c','d'>::kombine<2>( )<< "'" << "\npowerset('abcd')='" << cs<'a','b','c','d'>::powerset( ) << "'"; } |
Listing 2 |
It can be seen from the enumeration of elements that the permutations, whilst not strictly necessary, are in lexicographic order. Another way of representing combinations is in binary whereby each bit set maps to a corresponding value to print. Investigation of this is left as an additional exercise.
Member template specialisation
You may have noticed that there are a number of member template functors that take an additional, seemingly redundant, parameter; in each case these have a specialisation. For example the power set is all the subsets of an input set including itself and the empty set. In the general case this is lists between the length of the input list
powerset_<N>
to
powerset_<1>
(163–170). A specialisation
powerset_<0>
(172–173) terminates the set with an empty list. Nested classes are dependent on their enclosing template types, hence if explicitly specialised the enclosing classes need to be too. A work-around to this restriction is to provide partial specialisations by adding a dummy parameter.
Sequences
Lists can be long; however, if the content is not arbitrarily distributed then it can be described as a function rather than with individual elements. Evaluating the function and generating the list can then be deferred until first use. This is the case for sieving ranges of numbers with set intervals.
In Listing 3 (Sequences & Sets) a cons of integers
i
is defined for use as a sequence
seq
. The common case produces all integers in ascending order between a low
LO
and high
HI
value. The interval
I
can be altered to increase the step size. Reversing direction by beginning at a high value, ending with a low value and specifying a negative increment will produce a sequence in descending order. At each step the current value is appended to the list and, whilst within bounds, the next step is defined as a sequence of itself with an incremented starting range.
… template< int I, class IS = nil , char DELIM = ',' > struct i : v< int, I, IS, DELIM > { }; template< int I, int... IS > struct is : i< I, is< IS... > > { }; template< int I > struct is< I > : i< I > { }; template< int LO, int HI, int I = 1 , char DELIM = ',' > struct seq : conditional< ( I > 0 ? LO + I <= HI : LO >= HI - I ) , i< LO, seq< LO + I, HI, I, DELIM > > , i< LO > >::type { }; int main( ) { /* seq(-3, 3, 2).uunion(seq(-2, 2, 2))= * * -3,-2,-1,0,1,2,3 * * seq(-3, 3).intrsct(seq(-2, 2, 2))= * * -2,0,2 * * seq(-3, 3).except(seq(-2, 2, 2))= * * -3,-1,1,3 */ cout << "\nseq(-3, 3, 2).uunion(seq(-2, 2, 2))=" << seq<-3, 3, 2>::uunion< seq<-2, 2, 2>>( ) << "\nseq(-3, 3).intrsct(seq(-2, 2, 2))=" << seq<-3, 3>::intrsct<seq< -2, 2, 2>>( ) << "\nseq(-3, 3).except(seq(-2, 2, 2))=" << seq<-3, 3>::except<seq<-2, 2, 2>>( ); } |
Listing 3 |
Sets
In addition to sequences some set theory is required to sieve. Union, intersection and difference of ascending ordered sets are provided (a description of each follows) although not all will be required. When combining operations it’s feasible that earlier operations result in an empty list, hence the member templates for sets (30–34) in
nil
.
Union
The general case
list<HEAD, TAIL>::uunion<RHS>
(80–89) determines which
HEAD
, the
LHS
or
RHS
, is less than the other. Whichever is chosen becomes the
HEAD
of a new list with the rest being the union of its
TAIL
with the other list. The union of a list with an empty list is the list itself (91–92). As it isn’t fully defined at the point of parsing its creation is delayed with a concatenation operation i.e.
append_<HEAD, TAIL>
.
Intersection
The general case
list<HEAD, TAIL>::intrsct<R>
(94–109) determines whether the
LHS
and
RHS
’s
HEAD
s are of equal value; if they are one becomes the
HEAD
of the resulting list with the rest comprising the intersection of the respective list’s
TAIL
s. Otherwise, the
HEAD
with least value is discarded and the intersection between the remaining elements is performed. Intersection with an empty list (111–112) is, of course,
nil
.
Set difference
Known here as
list<HEAD, TAIL>::except<R>
(114–129) it is the equivalent of minus where any elements in the
LHS
that also appear in the
RHS
are removed. If both the
LHS
and
RHS
are identical then they cancel one another out resulting in the empty list
nil
; otherwise, the
HEAD
of each is compared. If the
LHS
is less than the
RHS
then it becomes the
HEAD
of a new list with the rest being the set difference between its
TAIL
and the
RHS
’s. If it’s greater the
RHS
’s
HEAD
is removed and if equal both
HEAD
s are removed with the set difference calculated between the remaining elements. If the
RHS
is an empty list then there is nothing to minus (131–132); as with union the list isn’t fully defined at the point of parsing so a concatenation operation i.e.
append_<HEAD, TAIL>
delays its creation.
Sieves
Eratosthenes proposed a simple way for finding primes up to a specified integer using the efficient principal of sieving. There are other well-known variations such as Euler’s sieve. The basic mechanism removes multiples of each integer between 2 and n thereby leaving only those that are divisible by one and themselves. There are some quick ways to refine this algorithm which also reduce recursive calls. All even numbers with the exception of 2 can be removed from the initial range; that is all multiples seeded from an even index and every other value from those with an odd. Further, as in table 1, higher order progressions have some values that overlap with lower ones. Thus sequences can begin at the square of their seed. Similarly it is unnecessary to go beyond an index that is √ n as if n is not prime it must have a divisor less than or equal to √ n [ SICP96 ]. Table 1 shows the arithmetic progressions required to sieve integers between 2 and 100.
|
||||||||||||||||||||||||||||||||||||||||
Table 1 |
A common solution is to have an array of elements indexed from 1 to n , marking 1 for removal, then generating multiples from 2 to the square root of n , beginning each sequence with the square of its index and marking these for removal too, and finally printing all unmarked values. Instead the algorithm can be conceived in terms of operations on ordered sets.
In Listing 4 (Eratosthenes Sieve) the template parameter
HI
is the limit value,
N
is the current sieving multiple beginning at 3 and incrementing by 2 for odd intervals.
O
is the output list which will be successively sieved; it begins as a list of a fixed value of 2 followed by the sequence of odd numbers up to and including the limit value. Whilst the square of the multiple i.e.
LO = N * N
is less than or equal to the
HI
boundary every other odd multiple
N + N
is sieved from the output list.
… template< int HI, int N = 3 , class O = i< 2, seq< 3, HI, 2 > > , int LO = N * N > struct primes : conditional< LO <= HI , primes< HI, N + 2 , typename O::type::template except< seq< LO, HI, N + N > > > , O >::type { }; int main( ) { /* primes(350)=2,3,5,7,11,13,17,19,23,29,31 * * ,37,41,43,47,53,59,61,67,71,73,79,83,89 * * ,97,101,103,107,109,113,127,131,137,139 * * ,149,151,157,163,167,173,179,181,191,193 * * ,197,199,211,223,227,229,233,239,241,251 * * ,257,263,269,271,277,281,283,293,307,311 * * ,313,317,331,337,347,349 */ cout << "\nprimes(350)=" << primes<350>( ); } |
Listing 4 |
Summary
In [ Weatherhead15 ] I covered ASCII to integer conversion, roman numerals and palindromes. Each used a variant of the list construct to represent strings. Here it was used to generate combinations with a queue and to sieve sequences by applying set operations. Representative of the way runtime classes are developed the data and methods were encapsulated for reuse. Further ways to experiment with this include implementing other sieves and adding functions to filter and sort.
Note
All the code in this article compiles with GCC 4.9.2 and Clang 3.5.2 using -std=c++11 . However, your mileage may vary with other compilers. Also whilst a null character should not be displayed in a terminal, some platforms show them as a space.
References
[SICP96] Harold Abelson and Gerald Jay Sussman with Julie Sussman. Structure and Interpretation of Computer Programs Second Edition pp.53–54, 116, 139–145, The MIT Press, 1996.
[Weatherhead15] Nick Weatherhead. Template Programming Compile Time String Functions, Overload 128, August 2015.
Further reading
Chris Oldwood. List incomprehension, CVu Volume 26 Issue 2 pp.7–8, May 2014.
Stuart Golodetz. Functional Programming Using C++ Templates (Part 1), Overload 81, October 2007.
Acknowledgements
I’d like to thank the Overload review team for providing their feedback which enabled me to elevate the content presented in this article.