• 10 min Reading
  • 160 Views
  • 3 Sharings
Table of contents

Let $E$ be a finite set and $\mathcal{P}_E(k)$ be the number of subsets of $E$ with $k$ elements.
Put $n = card(E)$ with $n \geq k \geq 0$.
The number of elements of $\mathcal{P}_E(k)$ is the $k^{th}$ binomial coefficient:

$$\mathcal{C}^n_k = n(n-1)(n-2)...(n-k-1) = \frac{n!}{k!(n-k)!} = \binom{n}{k}$$

To enumerate the set of elements of $\mathcal{P}_E(k)$, use this recurrence formula:

$$\mathcal{P}_E(k) = \left\{ \begin{array}{ll} \emptyset & \text{if } k = 0 \\ \{ e \cup p \;|\; e \in E, p \in \mathcal{P}_{E \backslash \{e\}}(k-1) \} & \text{if } k \neq 0 \end{array} \right.$$
from typing import Iterable, Set, FrozenSet

def combinations(E: Iterable, k: int) -> Set[FrozenSet]:
    if k < 0:
        raise ValueError(f"k={k} should be positive")
    elif k == 0 or k > len(E):
        return set()
    elif k == len(E):
        return set(frozenset(E))

    E_e = set(E)

    def _combinations(e: int, E: Iterable):
        nonlocal E_e
        E_e.remove(e)
        return combinations(E_e, k - 1)

    if k == 1:
        return {frozenset(e) for e in E}
    else:
        return {frozenset(p) | {e} for e in E for p in _combinations(e, E)}

Usage of frozenset is justified in python because a set contained only hashable element, then cannot contain element of type set. Because a mutable object like set is unhashable unlike in immutable object like frozenset. This first implementation is not very performing. In addition of multiple copies of $E$ costly, the recursion is non-tail and stack of memory should overflow and reach out a limit case, in python sys.getrecursionlimit() is by default 1000. And yield keyword could be used to avoid storing in memory the set of elements of $\mathcal{P}_E(k)$ that growths factorially.

Another way to enumerate the set of elements of $\mathcal{P}_E(k)$ is by filtering the set of subsets of $E$, called $\mathcal{P}_E$, that have a length of $k$:

$$\mathcal{P}_E(k) = \{ e \;|\; e \in \mathcal{P}_E, len(e) = k \}$$
from typing import Iterable, Generator, FrozenSet

def combinations(E: Iterable, k: int) -> Generator[FrozenSet]:
    E_l = list(E)
    n =  len(E)
    for i in range(1 << n):
        bits = [int(b) for b in bin(i)[2:]]
        n_one_bits = sum(bits)
        if n_one_bits != k:
            continue
        element = set()
        bits = [0]*(n - len(bits)) + bits
        for index, bit in enumerate(bits):
            if bit == 1:
                element.add(E_l[index])
        yield frozenset(element)

Note that, $\mathcal{P}_E$ is in bijection with $\{ b_1, b_2, ..., b_n \,|\, \forall i \in \left[\!\left[1, n\right]\!\right] b_i \in \{0, 1\} \}$.
$b_i$ is the value of the indicator function that indicates that the $i^{th}$ element of $E$ with 1-based indexing is in the related subset of $E$, if and only if $b_i = 1$.

But this iterative implementation is not efficient yet, the number of elements of $\mathcal{P}_E$ is :

$$card(\mathcal{P}_E) = card\left(\coprod_{k=0}^{n}\mathcal{P}_E(k)\right) = \sum_{k=0}^{n} card(\mathcal{P}_E(k)) = \sum_{k=0}^{n} \binom{n}{k} = (1 + 1)^n = 2^n$$

thanks to binomial formula. Also retrieve directly the result thanks to the bijection mentioned above. The for loop from $\left[\!\left[0, 2^n - 1\right]\!\right]$ could be shorter.

But why reinventing the wheel, use the python standard library:

https://docs.python.org/3/library/itertools.html?highlight=itertools#itertools.combinations

from typing import Iterable, Generator, Tuple
import itertools

def combinations(E: Iterable, k: int) -> Generator[Tuple]:
    return itertools.combinations(E, k)

Combinations are emitted in lexicographic sort order, and implementation could be in relation to permutation on a set of length $k$. Here you have tuple objects, not frozenset objects.


Written by Gilles Legoux (glegoux) - Software Engineer.