Formalized Zeckendorf Theorem

This blog post presents my formalized proof of the Zeckendorf Theorem using the theorem prover Isabelle/HOL

"Isabelle/HOL: a proof assistant for higher-order logic"
Nipkow, Tobias and Paulson, Lawrence C and Wenzel, Markus, 2002
[3] . First, I'll describe the original proof, and then I will discuss my formalization. The formal proof is available on the Archive of Formal Proofs and on GitHub.

Proof

Informally, the Zeckendorf theorem states that every positive integer can be represented uniquely as the sum of one or more distinct Fibonacci numbers, such that the sum does not include any two consecutive Fibonacci numbers. Fibonacci numbers are a sequence of natural numbers defined as

F_0 = 0 \qquad F_1 = 1 F_n = F_{n-1} + F_{n-2}

for n > 1 . More precisely, the theorem states that if N is any positive integer, there exist positive integers c_i \ge 2 , with c_{i+1} > c_i + 1 , such that

N = \sum_{i=0}^k F_{c_i}

Let's now walk through the original proof from the Dutch mathematician Gerrit Lekkerkerker

"Voorstelling van natuurlijke getallen door een som van getallen van Fibonacci"
Gerrit C. Lekkerkerker, 1951 (Accessed 21.12.2024)
[1] . We prove the theorem by induction over N . For N = 1 , we choose c_0 = 2 and have

\sum_{i = 0}^0 F_{c_i} = F_{c_0} = F_2 = 1

It is obvious that this representation is unique. Now we can continue by assuming that the Zeckendorf theorem holds for 1, \dots, N-1 and proving the theorem for N . We can observe that every positive natural number is either a Fibonacci number or lies between two consecutive Fibonacci numbers. Therefore, for every N > 0 , there exists a Fibonacci number F_i such that

\begin{equation} F_i \le N < F_{i+1} \end{equation}

If N is not a Fibonacci number, the first inequality is strict. If N is a Fibonacci number, there is nothing to prove (since N can be represented as a sum with F_i as the only element). Therefore, we will continue assuming N is not a Fibonacci number. By subtracting F_i from equation (1), we obtain 0 < N - F_i < F_{i+1} - F_i . By the definition of Fibonacci numbers, we can rewrite the inequality as 0 < N - F_i < F_{i-1} . Using the induction hypothesis, we can represent N - F_i as a sum of Fibonacci numbers.

N - F_i = \sum_{j=0}^k F_{c_j} \quad \text{with} \quad c_i \ge 2,~ c_{j+1} > c_j + 1

By adding F_i to this sum, we obtain a representation of N as a sum of Fibonacci numbers.

N = F_i + \sum_{j=0}^k F_{c_j}

As a result of N - F_i < F_{i-1} , we can deduce that F_i and F_{i-1} are not in the sum. Therefore, i > c_k + 1 . From the induction hypothesis, we know that the representation of N - F_i is unique. Thus, the representation of N is also unique. A different proof of the uniqueness of the representation can be found in the paper by Nik Henderson

"What is Zeckendorf's Theorem?"
Nik Henderson, 2016
[2] .

Isabelle/HOL Formalization

After revisiting the original proof, we can now discuss the main steps of my formalization of the Zeckendorf theorem in Isabelle/HOL

"Isabelle/HOL: a proof assistant for higher-order logic"
Nipkow, Tobias and Paulson, Lawrence C and Wenzel, Markus, 2002
[3] . The formal proof is available in the Archive of Formal Proofs.

The definition of Fibonacci numbers is already implemented in Isabelle/HOL's Number Theory library. After introducing some auxiliary lemmas and definitions, we can prove inequality (1) for the case where N is not a Fibonacci number.

  lemma no_fib_betw_fibs: 
    assumes "¬ is_fib n"
    shows "∃ i. fib i < nn < fib (Suc i)" 
  proof - 
    have finite_le_fib: "finite {i. fib i < n}" 
        using finite_smaller_fibs by auto
    obtain i where max_def: "i = Max {i. fib i < n}" by blast
    show "∃ i. fib i < nn < fib (Suc i)"
    proof(intro exI conjI)
        have "(Suc i) ∉ {i. fib i < n}" 
            using max_def Max_ge Suc_n_not_le_n finite_le_fib by blast
        thus "fib (Suc i) > n" 
            using ‹¬ is_fib n› fib_implies_is_fib linorder_less_linear by blast
    qed(insert max_def Max_in ‹¬ is_fib n› finite_le_fib no_fib_implies_le_fib_idx_set, auto) 
  qed

Here, I use the fact that the set of indices of Fibonacci numbers smaller than N is finite. By definition, the maximum of this set, i , must satisfy the inequalities in (1). Next, we can already prove the existence of the Zeckendorf representation.

  theorem zeckendorf_existence:
    assumes "n > 0"
    shows " c k. n = ( i=0..k. fib (c i))  inc_seq_on c {0..k-1}  (i{0..k}. c i  2)" 
        using assms
  proof(induct n rule: nat_less_induct)
    case (1 n)
    then show ?case
    proof(cases "is_fib n")
        case False
        obtain i where bounds: "fib i < n" "n < fib (Suc i)" "i > 0"
            using no_fib_betw_fibs 1(2) False by force
        then obtain c k where seq: "(n - fib i) = ( i=0..k. fib (c i))" "inc_seq_on c {0..k-1}" " i{0..k}. c i  2"
            using 1 fib_neq_0_nat zero_less_diff diff_less by metis
        let ?c' = "(λ n. if n = k+1 then i else c n)"
        have diff_le_fib: "n - fib i < fib(i-1)"
            using bounds fib2 not0_implies_Suc[of i] by auto
        hence ck_lt_fib: "fib (c k) < fib i" 
            using fib_Suc_mono[of "i-1"] bounds by (simp add: sum.last_plus seq)
        have "inc_seq_on ?c' {0..k}"
            using inc_seq_on_aux[OF seq(2) diff_le_fib ck_lt_fib seq(1)] One_nat_def 
                inc_seq_on_def leI seq by force
        moreover have "n = ( i=0..k+1. fib (?c' i))" 
            using bounds seq by simp
        moreover have " i  {0..k+1}. ?c' i  2" 
            using seq bounds fib2 not0_implies_Suc[of i] atLeastAtMost_iff 
                diff_is_0_eq' less_nat_zero_code not_less_eq_eq by fastforce
        ultimately show ?thesis by fastforce
    qed(insert fib_implies_zeckendorf, auto)
  qed

Here, I formalize the fact that every positive integer has a Zeckendorf representation. The property inc_seq_on is used to formalize the constraint c_{j+1} > c_j + 1 on the domain {0,\dots,k-1} . Introducing the domain for the mapping c should simplify reasoning and make it easier to extend the mapping. The existence proof is then done by induction over N . Note that we use the nat_less_induct rule, which provides the induction hypothesis for all smaller integers. The default induction rule would not be sufficient in this case. Next, I distinguish whether N is a Fibonacci number. If it is, the formalization is straightforward, and Isabelle can prove that case automatically using the fib_implies_zeckendorf lemma (in the penultimate qed statement). If N is not a Fibonacci number, we can use the previous lemma to infer the inequalities from (1). Then, we apply the induction hypothesis to find the Zeckendorf representation for N - F_i and define the new mapping c' . This new mapping simply appends i to the series of indices. The proof is completed by showing that c' satisfies the constraints and forms a Zeckendorf representation of N . So far, only the existence of the Zeckendorf representation has been proven. However, there might be multiple representations for a given number. The next step is to prove that the representation is unique.

  theorem zeckendorf_unique:
    assumes "n > 0"
    assumes "n = ( i=0..k. fib (c i))" "inc_seq_on c {0..k-1}" "i{0..k}. c i  2" 
    assumes "n = ( i=0..k'. fib (c' i))" "inc_seq_on c' {0..k'-1}" "i{0..k'}. c' i  2"
    shows "k = k'  ( i  {0..k}. c i = c' i)"
        using assms
  proof(induct n arbitrary: k k' rule: nat_less_induct)
    case IH: (1 n)
    consider "n = 0" | "n = 1" | "n  2" by linarith
    then show ?case
    proof(cases)
      case 3
      obtain i where bounds: "fib i  n" "fib(Suc i) > n" "2  i" 
        using betw_fibs nat_ge_2_fib_idx_bound 3 by blast
      have last_idx_eq: "c' k' = i" "c k = i" "c' k' = c k"
        using last_fib_sum_index_constraint[OF 3] IH(6-8) IH(3-5) bounds by blast+
      then show ?thesis
      proof(cases "is_fib n")
        case True
        hence "fib i = n" 
          unfolding is_fib_def using bounds IH(2-8) fib_mono leD nle_le not_less_eq_eq by metis
        hence "k = 0" "c 0 = i" "k' = 0" "c' 0 = i"
          using fib_unique_fib_sum 3 IH(3-8) by metis+
          then show ?thesis by simp
      next
        case False
        have "k > 0" 
          using IH(3) False unfolding is_fib_def by fastforce
        have "k' > 0"
          using IH(6) False  unfolding is_fib_def by fastforce
        have "0 < n - fib (c k)" using False bounds last_idx_eq(2) unfolding is_fib_def by fastforce
        moreover have "n - fib (c k) < n" 
          using bounds last_idx_eq by (simp add: dual_order.strict_trans1 fib_neq_0_nat)
        moreover have "n - fib (c k) = (i = 0..k-1. fib (c i))" 
          using sum.atLeast0_atMost_Suc[of "λ i. fib (c i)" "k-1"] Suc_diff_1 k > 0 IH(3) by simp
        moreover have "n - fib (c' k' ) = (i = 0..k'-1. fib (c' i))" 
          using sum.atLeast0_atMost_Suc[of "λ i. fib (c' i)" "k'-1"] Suc_diff_1 k' > 0 IH(6) by simp
        moreover have "inc_seq_on c {0..k-1 - 1}" "i{0..k-1}. 2  c i"
          using IH(4,5) unfolding inc_seq_on_def by auto
        moreover have "inc_seq_on c' {0..k'-1 - 1}" "i{0..k'-1}. 2  c' i"
          using IH(7,8) unfolding inc_seq_on_def by auto
        ultimately have "k-1 = k'-1  (i{0..k-1}. c i = c' i)"
          using IH(1) unfolding last_idx_eq by blast 
        then show ?thesis
          using IH(1) last_idx_eq by (metis One_nat_def Suc_pred 0 < k' 0 < k atLeastAtMost_iff le_Suc_eq)
      qed
    qed(insert IH one_unique_fib_sum, auto)
  qed

My approach to the formalization of the uniqueness proof was to assume that two Zeckendorf representations of N exist and show that the two mappings c and c' are identical. As in the previous proof, I use induction over N with Isabelle's nat_less_induct induction rule. Additionally, I introduce k and k' as arbitrary variables. The arbitrary keyword in the induction rule ensures that k and k' remain universally quantified throughout the proof. This is necessary to show k = k' . Next, I construct a case distinction on N , where the cases for N=0 and N=1 can be solved automatically by Isabelle. In the third case, I further distinguish whether N is a Fibonacci number. If N is a Fibonacci number, we can use the fact that Fibonacci numbers trivially have a unique representation. If N is not a Fibonacci number, we can use the induction hypothesis and derive a Zeckendorf representation for N - F_i . From the induction hypothesis, we can also conclude that k - 1 = k' - 1 and c_i = c_i' for all 0 \le i \le k-1 . Remember, our goal is to prove k = k' and c_i = c_i' for all 0 \le i \le k . Since k = k' trivially follows from k - 1 = k' - 1 , the main goal is to show c_k = c_{k'} . Isabelle can prove this by using the following equation.

N - F_{c_k} = \sum_{j=0}^k F_{c_j} = \sum_{j=0}^{k'} F_{c'_j} = N - F_{c_k'}

This concludes the formalization of the Zeckendorf theorem proof. I am sure that this is not the most elegant solution. A week after finishing this formalization, I had the idea to use lists of indices in the summation. I believe this would simplify Isabelle's reasoning because I could remove some auxiliary lemmas and the domain from the inc_seq_on property. If I have time in the future, I may redo the formalization in a cleaner version.

References

[1] Gerrit C. Lekkerkerker. "Voorstelling van natuurlijke getallen door een som van getallen van Fibonacci", 1951 (Accessed 21.12.2024)

[2] Nik Henderson. "What is Zeckendorf's Theorem?", 2016

[3] Nipkow, Tobias and Paulson, Lawrence C and Wenzel, Markus. "Isabelle/HOL: a proof assistant for higher-order logic", 2002