## 26.12.20

### Kripke semantics and Tableaux for Intuitionistic Logic

Simply-typed $\lambda$ calculus terms can be seen as encoding of natural deduction proofs of intuitionistic propositional logic (IPC - the C is for calculus; IPL is also common). This is the Curry Howard correspondence.


As an example, consider the formula $\qneg{\qneg{(\qp \qor \qneg{\qp)}}}$. We can take $\qneg{\qX}$ as abbreviation for $\qX \qimplies \bot$. Is it "obvious" how to find a term of type $((\qp \qor \qneg{\qp}) \qimplies \bot) \qimplies \bot$? Here is a natural deduction proof of (the names are from this list of intuitionistic natural deduction rules from the post above; I can't be bothered with typesetting a tree). $$\begin{array}{lll} 1. & \qneg{(\qp \qor \qneg{\qp)}} & \mathrm{(assumption)} \\ 2. & \qp & \mathrm{(assumption)} \\ 3. & \qp \qor \qneg{\qp} & \mathrm{(via}~\orintro/1~\mathrm{from~2.}\\ 4. & \bot & \mathrm{(via}~\implelim~\mathrm{apply~1.~to~3.)} \\ 5. & \qneg{\qp} & \mathrm{(via}~\implintro~\mathrm{from~[2.]~to~4.)} \\ 6. & \qp \qor \qneg{\qp} & \mathrm{(via}~\orintro/2~\mathrm{from~5.)} \\ 7. & \bot & \mathrm{(via}~\implelim~\mathrm{apply~1.~to~6.)} \\ 8. & \qneg{\qneg{(\qp \qor \qneg{\qp)}}}& \mathrm{(via}~\implintro~\mathrm{from~[1.]~to~7.)} \\ \end{array}$$

Maybe a better question to ask is: Why does this proof looks the way it looks? How, starting from the formula, can we arrive at such a proof, with steps that are so simple that we can leave them to a machine?

# Intuitionistic Tableaux

Kripke semantics and tableaux give us an alternative view of proving IPC formulae. The tableau calculus we discuss here is not very different from sequent calculus, it uses "block tableaux" where each node is set of (signed) formulae. It will be a refutation calculus - despite proving intuitionistic logic!

The definitions here follow Melvin Fitting's book [1] and Smullyan's unified presentation [2].

In contrast to block tableaux, "analytic tableaux" have one formula per node and we consider the set of all formulas on a branch instead of looking at a node). These and connexion methods are out of scope today, just like modal logic or the algebraic view (denotational semantics). The only differences from the aforementioned texts is that instead of using negation as primitive, I will use $\qfalse$ as primitive, and I make it explicit when a principal $T \qX$ formulae has to be preserved in tableau extension.

[1] Melvin Fitting. Intuitionistic Logic, Model Theory and Forcing [researchgate pdf].

[2] Raymond M. Smullyan. A generalization of intuitionistic and modal logics. https://doi.org/10.1016/S0049-237X(08)71543-X

### Syntax for Intuitionistic and Minimal Propositional Logic

Once more, we define of well-formed formulae (wff):

• a propositional variable $\qp$ is a wff
• $\qfalse$ is a wff
• if $\qX$ and $\qY$ are wff, then $\qX \wedge \qY$ is a wff.
• if $\qX$ and $\qY$ are wff, then $\qX \vee \qY$ is a wff.
• if $\qX$ is a wff, then $\qX \qimplies \qY$ is a wff

We use $\qq,\qp$ for propositional variables and $\qX, \qY, \qZ$ as metavariables for formulae. The symbol $\qfalse$ stands for "falsehood", "falsity" or "absurd" and is also written as $\bot$. We use $\qfalse$ and $\qimplies$ as primitive (built-in) symbols, and define negation $\qneg{X}$ as an abbreviation for $\qX \qimplies \qfalse$. This goes well with the intuitionist notion that "a proof of $\qneg{X}$ amounts to a construction that transforms a proof $\qX$ to a proof of falsehood."

### Kripke semantics

A (Kripke) model for intuitionistic propositional logic is a triple $\langle \mathcal{G}, \mathcal{R}, \Vdash \rangle$ where $\mathcal{G}$ is a set of states, $\mathcal{R}$ an accessibility relation between states that is reflexive and transitive, and a relation $\Vdash$ between states and formulae that satisfies conditions $K_0 - K_3$ below. We use $\Gamma, \Delta$ for elements of $\mathcal{G}$, read $\Gamma \Vdash \qX$ as "$\Gamma$ forces $\qX$" and write $\Gamma^\star$ for "any $\Delta$ with $\Gamma \mathcal{R} \Delta$".

($K_0$) If $\Gamma \Vdash \qX$ then $\Gamma^\star \Vdash \qX$

($K_1$) $\Gamma \Vdash \qX \qand \qY$ iff $\Gamma \Vdash \qX$ and $\Gamma \Vdash \qY$

($K_2$) $\Gamma \Vdash \qX \qor \qY$ iff $\Gamma \Vdash \qX$ or $\Gamma \Vdash \qY$

($K_3$) $\Gamma \Vdash \qX \qimplies \qY$ iff for every $\Gamma^\star$, either $\Gamma^\star \not\Vdash \qX$ or $\Gamma^\star \Vdash \qY$.

($K_4$) $\Gamma \not\Vdash \qfalse$.

A formula is valid in a model if $\Gamma \Vdash \qX$ for all $\Gamma \in \mathcal{G}$. A formula is valid if it is valid in all models. Note that with $K_0$ in mind, we could also have written conditions $K_1$ and $K_2$ differently:

($K_1'$) $\Gamma \Vdash \qX \qand \qY$ iff for every $\Gamma^\star$, $\Gamma^\star \Vdash \qX$ and $\Gamma^\star \Vdash \qY$

($K_2'$) $\Gamma \Vdash \qX \qor \qY$ iff for every $\Gamma^\star$, $\Gamma^\star \Vdash \qX$ or $\Gamma^\star \Vdash \qY$

In the above definition, it is understood that $\Vdash$ is "seeded" with a relation $\Vdash^{0}$ between states and propositional variables $\qp$. This $\Vdash^{0}$ is then extended to a relation between states and whole formulae. It is also common to define states directly as subsets of the "available" propositional variables (the ones forced in that state). Here is an example, with two states $\Gamma = \{\qq\}$ and $\Delta = \{\qp, \qq\}$:

The relation $\Vdash$ takes all reachable states into consideration, and once a state forces $\Gamma \Vdash \qq$, any subsequent $\Gamma^\star$ must force it, too. The "truth in a state" is rather more dynamic than absolute "truth" we are used to from classical logic. One can give this a temporal epistemic interpretation and say states represent subjective knowledge of true facts, and that this knowledge "grows over time" (as we access subsequent states).

As an exercise: Convince yourself that $\Delta \Vdash \qp \qand \qq$, $\Gamma \not\Vdash \qp$ and the formula $\qp \qimplies \qq$ is valid in this model. Note further that $\qp \vee \qneg{p}$ is not valid. We have $\Gamma \not\Vdash \qp$, which is very different from saying $\Gamma \Vdash \qneg{p}$.


### Kripke semantics for Signed Formulae

On towards a proof system for IPC based on an intuitionistic tableau calculus (or just intuitionistic tableaux). To this end, we introduce signed formulae, using symbols $T$ and $F$ and writing $T\qX, F\qX$ for a signed version of formula $X$:
• $\Gamma \vdash T\qX$ means $\Gamma \Vdash \qX$ and
• $\Gamma \vdash F\qX$ means $\Gamma \not\Vdash \qX$.

In classical logic, $T\qX$ would mean $\qX$ is true and $F\qX$ would mean $\qX$ is false (or $\qneg{X}$ is true). The way to understand $\Gamma \vdash F\qX$ non-classically is to read it as "it is not known that $\Gamma \Vdash \qX$." Note that $\Gamma$ is merely a state, not a context or set of formulae - we are using this assertion as a building block to compare proof situations. For example, the assertions $T\qX \qand \qY$ and $F\qX$ cannot both be true at the same time (or in the same state).

We follow Smullyan's uniform notation, and classify our formulae into different types (whether they are "conjunctive" or "disjunctive" in nature):

• type $\alpha$ is any signed formula of the shape $T\qX \qand \qY, F\qX \qor \qY, F\qX \qimplies \qY, F\qfalse$
• type $\beta$ is any formula $F\qX \qand \qY, T\qX \qor \qY, T\qX \qimplies \qY, T\qfalse$.

Since we added $\qfalse$ as primitive, we need to say something about its components, too. We treat $\qfalse$ like we treat an empty disjunction.

For a signed formula $\alpha (\beta)$ we called certain signed subformula its components $\alpha_1$ and $\alpha_2$ (same for $\beta$), according to this table: $$\begin{array}{c|c|c}% \alpha & \alpha_1 & \alpha_2 \\ \hline T\qX \qand \qY & T\qX & T\qY \\ \hline F\qX \qor \qY & F\qX & F\qY \\ \hline F\qX \qimplies \qY & T\qX & F\qY \\ \hline F \qfalse & F \qfalse & F \qfalse \end{array} \quad \quad \begin{array}{c|c|c}% \beta & \beta_1 & \beta_2 \\ \hline F\qX \qand \qY & F\qX & F\qY \\ \hline T\qX \qor \qY & T\qX & T\qY \\ \hline T\qX \qimplies \qY & F\qX & T\qY \\ \hline T \qfalse & T \qfalse & T \qfalse \end{array}$$

Again, for $\qfalse$ we have subformulae that are not proper, but this is not going to be a concern.

Let us call the conjugate of $\overline{T\qX}$ the corresponding $F\qX$ and vice versa. Note that the conjugate of a formula of type $\alpha$ is of type $\beta$ and vice versa. In classical logic, a set of signed formulae $\{T\qX_1,\ldots,T\qX_n, F\qY_1, \ldots, \qY_m\}$ would correspond to a sequent calculus sequent $\qX_1, \ldots \qX_n \longrightarrow \qY_1, \ldots, \qY_m$. The sequent would be provable iff the set of signed formulae is unsatisfiable. The distinction into type $\alpha$ and $\beta$ explain how a formula can either be split into its components for conjunctive formulae $\alpha$ or the proof needs to branch into two possibilities for disjunctive formulae $\beta$.

For intuitionistic logic, we are going to call all formulae $T \qX$ permanent and all formulae $F \qX$ co-permanent. With all this setup, we can remix our definition of Kripke model and $\Vdash$ into the following conditions using signed formulae $\qX$:

$(U_0(a))$ Either $\Gamma \Vdash \qX$ or $\Gamma \Vdash \overline{\qX}$, but not both.

$(U_0(b))$ If $\Gamma \Vdash \qX$ and $\qX$ is permanent, then $\Gamma^\star \Vdash \qX$.

$(U_1)$ For any permanent $\alpha$, $\Gamma \Vdash \alpha$ iff for every $\Gamma^\star$, $\Gamma^\star \Vdash \alpha_1$ and $\Gamma^\star \Vdash \alpha_2$

$(U_2)$ For any permanent $\beta$, $\Gamma \Vdash \beta$ iff for every $\Gamma^\star$, $\Gamma^\star \Vdash \beta_1$ or $\Gamma^\star \Vdash \beta_2$

The behavior on co-permanent $\alpha$ and $\beta$ is determined by $U_1$ and $U_2$ too, by taking into consideration $(U_0(b))$. Observe:

$(U_1')$ For any co-permanent $\alpha$, $\Gamma \Vdash \alpha$ iff for some $\Gamma^\star$, $\Gamma^\star \Vdash \alpha_1$ and $\Gamma^\star \Vdash \alpha_2$

$(U_2')$ For any co-permanent $\beta$, $\Gamma \Vdash \beta$ iff for some $\Gamma^\star$, $\Gamma^\star \Vdash \beta_1$ or $\Gamma^\star \Vdash \beta_2$

As an exercise: Go back to the example above, and convince yourself that $\Delta \Vdash T\qp \qand \qq$ and $\Gamma \Vdash F\qp$. Check that the formula $T \qp \qimplies \qq$ is valid in every state, using the conditions $U_\star$ above. Now do the same for $T \qp \vee \qneg{p}$, remembering that it stands for $T \qp \vee (\qp \qimplies \qfalse)$. Now, to keep it interesting, what about $\Gamma \Vdash F \qp \vee \qneg{p}$? Is $F \qp \vee \qneg{p}$ valid?

A final definition is to call a formula special if it is co-permanent but has a permanent component, otherwise it is called ordinary. The only special formula we have is $F\qX \qimplies \qY$, of type $\alpha$. If we had negation as a built-in operator, that would be a second special $\alpha$. Check out Smullyan's paper for predicate logic and other generalizations that require special $\beta$.

### Tableau Proofs

A tableau proof is like a sequent calculus proof written upside-down: we draw it with the root at the top, growing towards the bottom. Instead of $S \cup \qX$, we write $S, \qX$. A vertical bar $\vert$ indicates that there are two children. $\qS_T$ means $\{ \qT\qX \vert \qT\qX \in \qS\}$, and $\alpha_P$ means repeat $\alpha$ if it is permanent (same for $\beta$). This is just to emphasize that $T \qX$ (possibly) needs be to preserved; in sequent calculus, this would surface as a structural operation. $$\begin{array}{ll} (A1) & \mathrm{for~ordinary} ~ \alpha: ~ \LARGE \frac{\qS,~ \alpha }{\qS,~\alpha_P,~ \alpha_1,~ \alpha_2 } \\ (A2) & \mathrm{for~special} ~ \alpha: ~ \LARGE \frac{\qS,~ \alpha }{\qS_P,~ \alpha_1,~ \alpha_2 } \\ (B1) & \mathrm{for~ordinary} ~ \beta: ~ \LARGE \frac{\qS,~ \beta }{\qS,~\beta_P, ~ \beta_1 ~ \vert ~ \qS,~\beta_P,~\beta_2 } \\ \end{array}$$

Now, a configuration is a collection $[~\qS_1, \qS_2, \ldots, \qS_n~]$ of sets of signed formulae. A rule takes a configuration and replaces one $S_i$ with its result (or its two results). A tableau is a finite sequence of configurations $C_1, \ldots C_n$ in which each configuration except the first resulted from its predecessor by application of a rule.

A set of signed formulae is closed if it contains both $\qT \qX$ and $\qF \qX$ for some formula $\qX$ or if it contains $\qT \qfalse$. A configuration $[~\qS_1, \qS_2, \ldots, \qS_n~]$ is closed if all sets $\qS_i$ are closed. A tableau $C_1, \ldots C_n$ is closed if some $C_i$ is closed.

A tableau for a signed set of formula $\qS$ is a tableau $C_1, \ldots, C_n$ with $C_1 = \{\qS\}$. A finite set of of signed formulae $\qS$ is inconsistent if some tableau for $S$ is closed. Otherwise, $\qS$ is consistent. A formula $\qX$ is a theorem, written $\vdash_I \qX$, if $\{\qF \qX\}$ is inconsistent, and a closed tableau for $\{\qF\qX\}$ is called a proof of $\qX$.

What we have done here is to define provability of $X$ as inconsistency of $FX$. This refutation calculus may not be appealing to an intuitionist.

Let's look at a proof of $\qneg{\qneg{(\qX \vee \qneg{X})}}$ in our $\qfalse$-primitive logic. $$\begin{array}{l} [~\{\qF \qneg{\qneg{(\qX \vee \qneg{\qX})})} ] \\ [~\{\qT \qneg{(\qX \vee \qneg{\qX}})\} ] \\ [~\{\qT \qneg{(\qX \vee \qneg{\qX}}), \qF \qX \vee \qneg{\qX}\} ] \\ [~\{\qT \qneg{(\qX \vee \qneg{\qX}}), \qF \qX, \qF \qneg{\qX}\} ] \\ [~\{\qT \qneg{(\qX \vee \qneg{\qX}}), \qT \qX\} ] \\ [~\{\qT \qneg{(\qX \vee \qneg{\qX}}), \qF \qX \vee (\qneg{\qX}), \qT \qX\} ] \\ [~\{\qT \qneg{(\qX \vee \qneg{\qX}}), \qF \qX, \qF \qneg{\qX}, \qT \qX\} ] \\ \end{array}$$ The same proof in a bit more verbose: $$\begin{array}{l} [~\{\qF ((\qX \vee (\qX \qimplies \qfalse)) \qimplies \qfalse) \qimplies \qfalse\}~] \\ \alpha (F\qimplies) \\ [~\{\qT ((\qX \vee (\qX \qimplies \qfalse)) \qimplies \qfalse, \qF \bot)\}~] \\ \beta (T\qimplies) \\ [~\{\qT ((\qX \vee (\qX \qimplies \qfalse)) \qimplies \qfalse), \qF (\qX \vee (\qX \qimplies \qfalse)), \qF \qfalse\}~, \{\ldots, \qT \qfalse\}] \\ \alpha (F\qor)\\ [~\{\qT ((\qX \vee (\qX \qimplies \qfalse)) \qimplies \qfalse), \qF \qX, \qF (\qX \qimplies \qfalse), \qF \qfalse\}~, \{\ldots, \qT \qfalse\}] \\ \alpha (F\qimplies)\\ [~\{\qT ((\qX \vee (\qX \qimplies \qfalse)) \qimplies \qfalse), \qT \qX, \qF \qfalse\}, \{\ldots, \qT \qfalse\}] \\ \beta (T\qimplies)\\ [~\{\qT ((\qX \vee (\qX \qimplies \qfalse)) \qimplies \qfalse), \qF (\qX \vee (\qX \qimplies \qfalse)), \qT \qX, \qF \qfalse \}, \\ ~ \{\ldots, \qT \qfalse\}, \{\ldots, \qT \qfalse\}] \\ \alpha (F\qor)\\ [~\{\qT ((\qX \vee (\qX \qimplies \qfalse)) \qimplies \qfalse), \qF \qX, \qF (\qX \qimplies \qfalse), \qT \qX, \qF \qfalse \}, \\ ~ \{\ldots, \qT \qfalse\}, \{\ldots, \qT \qfalse\}] \\ \end{array}$$

In applying the rules of the tableau calculus, we are taking the formula we want to prove apart and deal with its subformulae until we close a branch (configuration). One never needs to guess an intermediate formula or assumption. However, it is not an entirely deterministic procedure, in the sense that we still need to pick which of the formulas we want to take apart. Automated reasoning! And it should be easy to convert the result into a natural deduction proof (similar to Prawitz's view that a sequent calculus proof can be seen as a description of how to construct a natural deduction proof).

Another thing that is great about the tableaux method is that if after taking everything apart, you end up with open branches (that cannot be closed), you can read them as models that provide counter-examples. The application of a "special" rule indicates that we transitioned to another state.

#### What is special about intuitionistic tableaux?

Let's consider that signed formula $\qF \qX$ correspond to the right-hand side of a sequent calculus (the succedent). So unlike Gentzen's LJ, it is quite fine to have multiple formulae mingle on the right-hand side and stay intuitionistic, as long as the right-$\rightarrow$ rule (i.e. $\qF \qX \qimplies \qY$) keeps only one. This has also been explained here:

[3] Valeria de Paiva, Luiz Carlos Pereira. A short note on intuitionistic propositional logic with multiple conclusions [researchgate pdf]

The assertion $\qF \qX \qimplies \qY$ says: "at present, we think we can prove $\qX$ without at the same time proving $\qY$". In considering such a possible proof $\qT \qX$, we need to do bookkeeping to ensure that anything we prove along the way is separate from other formulae $\qF \qZ$ that we labeled as not-yet-known. This is why we the special $\alpha$ rule use $\qS_P$ instead of $\qS$. This looks like the essence of intuitionistic reasoning as far as proof rules are concerned. It also makes intuitionistic tableaux a bit more complicated than classical.

#### Coda

I hope the above sheds some light on the connection between tableaux calculus, Kripke semantics and intuitionistic logic. Soundness and completeness can be found in Fitting's book, as well as generalizations to modal logic, predicate logic. Of course, for stronger logics, carrying over soundness and completeness does not mean we retain decidability; stronger logics require not only taking formulae apart, but also instantiating them, which increases non-determinism and search space.

# Intro


I am going to refer to material in "Higher order logic and equality" often. Like in that post, there is actually no original material here; I am just gathering definitions that is probably well-known to some grad students and experts (yet nobody calls it standard since neither higher-order logic nor intuitionistic logic are known very well outside certain circles who deal with formal logic and proof assistants and the like). This is my feeble attempt at an exposition.

# The Logic of a Topos

We take a $\lambda$ calculus with surjective pairing and a type $\omicron$ which we take as a substitute for "truth values" (topos theory texts use $\Omega$ but we want easy comparison with the classical version linked above, and we spell out $\qPi \lambda x. t$ without setting up an abbreviation $\forall x. t$ and we also spell out $\qSigma \lambda x. t$ without setting up an abbreviation $\exists x. t$). We want to set this up in an intuitionistic manner, which means doing away with the idea that $\omicron$ stands for two truth values, and not using classical logical equivalences to define $\qor$ and $\qSigma$. The terms would then correspond to a particular cartesian-closed category, the free topos.

We again want to define everything in terms of equality $\qeq_\alpha: \alpha \rightarrow \alpha \rightarrow \omicron$. What is this equality doing? In our langugage, it talks about equality of terms, and in categorical logic, terms correspond to arrows and thus equality should talk about equality of arrows. (There is some detail hidden here as our language will also let us talk about "global elements" $1 \stackrel{c}{\longrightarrow} C$ which are written as constant terms $c: C$). We take the following elements of our logical language: \begin{align*} \qtrue:\ & \omicron\\ \qand:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \qimplies:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \qPi_\alpha:\ & (\alpha \rightarrow \omicron) \rightarrow \omicron\\ \end{align*}

... and define them using the only primitive $\qeq$: \begin{align*} \qtrue &:= \qeq_{\tprop \rightarrow \tprop \rightarrow \tprop} \qeq_\tprop \qeq_\tprop\\ \qand &:= \lambda x:\tprop. \lambda y:\tprop. \\ &\quad \quad \qeq_{\tprop \rightarrow \tprop \rightarrow \tprop} \\ &\quad \quad \quad (\lambda z: \tprop \rightarrow \tprop \rightarrow \tprop. z~\qtrue~\qtrue) \\ &\quad \quad \quad (\lambda z: \tprop \rightarrow \tprop \rightarrow \tprop. z~x~y) \\ \qimplies &:= \lambda x:\tprop. \lambda y:\tprop. \qeq~x~(\qand~x~y)\\ \qPi_\alpha &:= \qeq_{(\alpha \rightarrow \tprop) \rightarrow (\alpha \rightarrow \tprop) \rightarrow \tprop}~(\lambda x : \alpha. \qtrue) \end{align*}

These definitions are the same as for classical logic. An intro to topos theory will typically describe $\qand$ and $\qimplies$ using commutative diagrams, but here we are defining the language first. Moreover, the language has only one primitive $\qeq$, though that primitive is instantiated at different types.

# False

Now we look at $\qfalse$, which is also defined the same as in classical case: \begin{align*} \qfalse &: \omicron\\ \qfalse &:= \qeq_{\tprop \rightarrow \tprop} (\lambda x:\tprop. \qtrue) (\lambda x:\tprop. x)\\ \end{align*}

In a topos theory text, one encounters $\forall v:\omicron. v$, which is shorthand for $\qPi\lambda v:\omicron. v$ and actually expands to the very same definition above. The definition that mentions $\qeq$ explicitly has the advantage that it explains what is going on: it morally says "if there is anything other than $\qtrue$ in $\omicron$, then $\lambda x:\omicron. \qtrue$ and $\mathsf{id}_\omicron$ are not the same." So we have a value $\qfalse$, but it need to be the only thing distinct from $\qtrue$ in $\omicron$.

# Not, Or and Exists

On to intuitionistic negation $\qnot$. In intuitionistic logic, we would define negation like this: \begin{align*} \qnot &: \omicron \rightarrow \omicron\\ \qnot &:= \lambda x: \tprop. \qimplies(x, \qfalse) \end{align*}

If we consult the classical version $\qnot := \qeq_{\tprop \rightarrow \tprop} \qfalse$, we again find that this is not that different.

Finally, let's look at $\qor$ and $\qSigma$. We define these using $\qPi$: \begin{align*} \qor &: \omicron \rightarrow \omicron \rightarrow \omicron\\ \qor &:= \lambda x: \omicron. \lambda y: \omicron. \qPi \lambda r:\omicron. \\ & \quad \quad \qimplies(\qand(\qimplies(x, r), \qimplies(y,r)), r) \\ \\ \qSigma_\alpha &: (\alpha \rightarrow \omicron) \rightarrow \omicron \\ \qSigma_\alpha &:= \lambda x: \alpha \rightarrow \omicron. \qPi \lambda r:\omicron. \\ &\quad \quad (\qPi \lambda z:\alpha. \qimplies(\qimplies(x z, r), vr) \end{align*}

We can get an intuitive understanding of these definitions: for $\qor$, if we have a way of obtaining an element $r$ (result) from $x$ and one of obtaining the same from $y$, then also from $(\qor~x~y)$. Similarly for $\qSigma$, if we have a way of obtaining any result $r$ from the "instantiated" predicate $x z$, then also from $\qSigma x$.

The definition of $\qor$ in terms of $\qand$ and $\qPi$ is not very different from natural deduction rule for "or elimination": $$\cfrac{\begin{array}[cc]{} \\ {} \\ A \vee B \end{array} \quad \begin{array}[cc]{} [A] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array}\quad \begin{array}[cc]{} [B] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array} } {C}~\color{blue}{(\vee{\mathrm{\small ELIM}})}$$

Checking the the natural deduction rule for existential elimination and comparing it with the encoding above is left as an exercise.

# The end of the beginning

Higher order logic seems to deserve more attention. We have set up a logical system based on $\lambda$ calculus without making use of the Curry Howard correspondence. While the previous discussion focused on classical higher order logic as introduced by Church in 1940, in this post we looked at intuitionistic higher order logic. The interest in the latter is at least very heavily influenced by topos theory.

It is slightly confusing that when discussing elementary toposes, one talks of first-order logic when in reality the logic is inherently higher-order (because a topos is cartesian closed). I guess the charitable perspective is what people mean is "you can do everything that you would want to do in formal set theory using first-order logic."

The Lambek and Scott book "Introduction to higher order categorical logic" dwells a bit on the fact that it is always possible to get a formulation of the internal language of a topos using equality. Andrews $Q_0$ develops simple type theory / higher order logic using a single equality operator $Q$. Various theorem proving systems have been based on this (TPS and HOL).

One could say that the way higher-order logic admits the definition of logical connectives in terms of other is an "encoding" of natural deduction rules. Could one use this idea to build a logical framework, a representation of derivations? Indeed this is what is underlying Isabelle/Pure framework and $\lambda$Prolog.

It would seem worthwhile to write a proper book that takes a logical perspective on the $\lambda$ calculus, developing higher order logic as language, category theory as semantics and arriving at elementary toposes as models. It would probably feel very elementary and pedestrian in the academic world, but I do think it would be useful. It feels to me that there is so much "background knowledge" to gather the relevant theory on logic and computation that most practitioners will not have the energy (and this says something about the potential for adopting of formal methods). Until such a book arrives (I'm not sure if I would have the energy), I hope my out-of-order blog posts provide some help on the reader's journey. There are certainly a lot of of connections between logic, language, types and semantics that are left to be explored, besides Curry Howard and dependent types.

## 27.9.20

### Programmers love Higher-Order Logic

My previous take on higher-order logic was classical, channeling Church's simple theory of types and Andrews $Q_0$. In that last post, I was hinting at motivation. So I want to expand a bit now how this stuff can be useful for building software at large. I am posting some ramblings here.

# Equality is Fundamental

There are two different "levels" at which a programmer deals with some kind of reasoning that involves equality:

• while writing code: need to be able to tell when two objects are the same.
• when designing or arguing that code is correct: reasoning whether behavior like output values conform to a (however implicit) specification

In both cases, one needs to be able to identify whether objects (or behaviors or input-output pairs) are "the same" as expected ones. But the latter activity, reasoning about program correctness, happens in some form of logic. Much that has been written and is being written on formal logic is about mathematics and mathematical convention. This is true whether we talk about semantics of programming languages or program specification.

But for programmers, there are benefits to knowing formal reasoning even if one is not interested in the foundations of math, axiomatizations of set theory etc. It is no secret that in mathematics as in programming, you run into situations where there are multiple different ways to represent some data, and one needs to chose or translate between those representations. While programmers resign to writing "proto-to-proto" translation over and over (which often are not as trivial as it seems), in mathematics it is customary to leave out the awkward details, introduce "abuse of notiation" and quotienting to keep arguments informal, but concise.

The heart and soul of much mathematics consists of the fact that the “same” object can be presented to us in different ways. [1]

[1] Barry Mazur. When is one thing equal to some other thing?

# Equality in First Order Logic

We take a leap towards "formal systems", where objects are represented using (strings of) symbols, along with strict rules to form statements and derive new true statements from existing ones.

A typical definition of a first-order logic language contains constant and function symbols, predicate symbols, logical operators.

There is a "design choice" here whether equality is considered as built-in, magically denoting exactly the equality in the underlying sets, or whether equality is a predicate with certain special properties. Wikipedia" talks about "first-order logic with equality" (when built-in) and first-order logic without equality (when it is not).

When equality does not come built-in, one needs to add axioms. The equality axioms are special, in the sense that they have to be preserved by all the function and relation symbols. This looks like:

• $\mathsf{eq}$ is reflexitve, symmetric, transitive
• if $\mathsf{eq}(x_1, y_1), \ldots, \mathsf{eq}(x_n, y_n)$, then $\mathsf{eq}(f(x_1,\ldots,x_n),f(y_1,\ldots,y_n)$ for all function symbols $f$ of arity $n$
• if $\mathsf{eq}(x_1, y_1), \ldots, \mathsf{eq}(x_n, y_n)$, then $R(x_1,\ldots,x_n) \Leftrightarrow R(y_1,\ldots,y_n)$ for all relation symbols $R$ of arity $n$

With a "defined" equality, is it is well possible to interpret logical statements involving $\mathsf{eq}$ in such a way that two distinct elements are regarded as equal according to $\mathsf{eq}$.

# Coding our own equality

When we are talking about practical software development, there is really no hope of defining a platonic ideals; yet when techniques of reasoning, informaticians are in various ways confined to results and methods that formalizing mathematics.

Take this example: We'd like our terms to represent free monoid on $X$, so we'd like these terms to satisfy the following equations (or identities): $$s \cdot 1 = s$$ $$t = 1 \cdot t$$ $$(s \cdot t) \cdot u = s \cdot (t \cdot u)$$

Consider definitions in some fictional language to express the "interface" of a monoid: $$\begin{array}{ll} \mathsf{interface}~ M[X] \\ \quad \mathsf{unit}(): M[X]\\ \quad \mathsf{emb}(x: X): M[X]\\ \quad \mathsf{dot}(s: M[X], t: M[X]): M[X]\\ \quad \mathsf{eq}(s: M[X], t: M[X]): \mathsf{bool} \end{array}$$

Here, $\mathsf{unit}$ stands for $1$, $\mathsf{emb}(x)$ is for embedding $x$ and $\mathsf{dot}$ is the combine operaion in $MX$. If our fictional language has algebraic data types, we can use those to build terms which represent the algebraic theory of monoids: $$\begin{array}{ll} \mathsf{data}~ \mathsf{Mon}[X] \\ \quad \mathsf{Unit}(): \mathsf{Mon}[X]\\ \quad \mathsf{Emb}(x: X): \mathsf{Mon}[X]\\ \quad \mathsf{Dot}(s: \mathsf{Mon}[X], t: \mathsf{Mon}[X]): \mathsf{Mon}[X]\\ \quad \mathsf{eqMon}(s: \mathsf{Mon}[X], t: \mathsf{Mon}[X]): \mathsf{bool} \end{array}$$

It is now really obvious what these constructors stand for, but we note that while the equality $\mathsf{eqMon}$ returns true whenever two terms are equal when considered as elements of the free monoid, there is at least one more equality relation on terms that is different ("structural equality" or "syntactic equality", which will consider $\mathsf{dot}(\mathsf{dot}(s, t), u)$ as distinct from $\mathsf{dot}(s, \mathsf{dot}(t, u))$).

Mathematically speaking, we want to consider the quotient set $\mathsf{Mon}[X] / \mathsf{eqMon}$, and work with equivalence classes.

# Formulas = Functions to Truth Values

After all this talk about equality in algebraic theories, we want to turn equality in the context of logic, specifically higher-order logic.

In classical logic, a proposition (logical statement) stands for a truth value, and there are exactly two of then (the two-value type "bool"). Boole was the first to associate propositional connectives with algebraic operations. In this perspective, the the truth values are part of the language, even though this was later abandoned as in Tarski-style semantics.

Programmers are quite used to having functions return $\mathsf{bool}$. In terms of mathematical logic, our custom-made equality "relation" $\mathsf{eqMon}$ above is in fact a function to the set of truth values. Could we not do away with relations and talk in terms of functions?

Yes, we can; using truth values and functions in a formal language of propositions is the essential summary of classical higher order logic (or Church's simple theory of types. There is a more compelling example: if you consider "set builder notation" $$\{ x \in A \ |\ \phi(x) \}$$ then this describes a subset of $A$. We can choose to describe $\phi$ as defining a (unary) relation, but viewing it as a function to $\mathsf{bool}$ is no doubt more natural in a context like SELECT x FROM A WHERE  $\phi$ where "relations" are given by their extensions (as tables) and computation in the form of expressions is available.

The point of (classical) higher-order logic is that using the $\lambda$ calculus and truth values, we have everything we need to define the meaning of all logical connectives, and the types prevent logical paradoxes of "naively-untyped set theory". Since the truth values are part of the language, it is possible to define the meaning purely through definitions that involve equality.

More on this and classical higher order logic in that last post on Higher Order Logic and Equality and the next one I'm going to write in a second, where I talk about the intuitionistic version.

I will just end with a note that there is no distinction between "predicate", "relation" and "set" is higher-order logic, and this is a good thing. An expression of type $\iota \rightarrow \omicron$ (where $\iota$ are the individuals and $\omicron$ the truth values) defines a (sub)set of individuals, a predicate on the set of individuals, a unary relation, all at the same time. $$\mathsf{even} := \lambda x:\iota. \Sigma \lambda y:\iota. \mathsf{eq}(x, \mathsf{times}(2, y))$$ $$\mathsf{greater} := \lambda x:\iota.\lambda y:\iota. \Sigma \lambda z:\iota. \mathsf{eq}(x, \mathsf{plus}(y,z))$$ $$\mathsf{lowerbound} := \lambda x:\iota.\lambda y:\iota \rightarrow \omicron. \Pi \lambda z:\iota. \mathsf{implies}(y(z), \mathsf{greater}(z, x))$$

The confusing bit for programmers may be that this thing is called simple type theory when in fact there is only one "sort" of individuals. This makes more sense when considering that it is no problem to come up with multiple "sorts" of individuals but that wouldn't give sets ($\iota \rightarrow \omicron$), sets of sets $(\iota \rightarrow \omicron) \rightarrow \omicron$, sets of sets of sets (and so on) of individuals. It's this distinction of hierarchies of sets/predicates/relations that earn Church's higher order logic the name "simple type theory": only terms with the right types are considered well-formed and meaningful.

## 19.7.20

### Modules, components and their limits

I read Clear explanation of Rust's module system. The TL;DR the module system in rust is a way to (somewhat) break/hide the link between what things are defined and what source files the things are defined in. A rust module provides control over visibility across source files and makes explicit how references reach across "modules".

The post starts with a request to ignore what "module" means in other languages, and this plunged me into thoughts on all the things that are called modules. It is somewhat sad but to be expected that a basic word like "module" (or "package") could never have a standard meaning, despite all these programming languages supposedly built while we have all this PL theory.

The more academic vocabulary has "module" as a language concept that comes with a scope and visibility, a way to bundle types and named values. It's somehwat independent of installation, deployment (the only touch point being support for separate compilation), but - to my knowledge - there is no talk of an "installed-at-a-version" module. So the rust module concept is pretty consistent with that. Arranging a set of modules that fit together is the programmer's problem, not the researcher's.

Alas, there is this thing that happens at the level of Rust crates/npm packages, and it is really a different problem. Nevertheless, those things are also called modules in some languages (C++20 modules, Java modules, go modules).

In Java, source files in the same directory make a Java package and they come with a way to control visibility. The association between files and packages is rigid. More recently, Java modules were added to express dependencies between modules, visibility of packages - something like a "packages of packages", but really more on the level of Rust crates.

I am reminded of "components" (if anybody remembers component-oriented programming) because deployment is part of this game - the notion of dependencies only makes sense in the context of some installation. But then components used to be about runtime services (in addition to code), object brokers and such and here we would actually like to talk about less dynamic things, just code and functionality in the form of source and/or libraries. Still, components are things that get assembled and exist at versions, so I'd say it's an appropriate, if old-fashioned name.

Now, such a (static) component would abstractly describe what npm packages or Rust crates achieve, and also form of Debian development packages (sigh) where you install a C library in binary form and its header files. That gives us even more of a twist now, which is that C++ 20 modules are really modules in the sense of achieving grouping for visibility and scope (check) but at the same time that whole notion has a huge impact on C++ since it should actually make it easier to deploy C++ components. A relevant quote from the C++20 link above:

After a module is compiled once, the results are stored in a binary file that describes all the exported types, functions and templates. That file can be processed much faster than a header file, and can be reused by the compiler every place where the module is imported in a project.

So while a C++20 module is language concept, it is foreseeable that one day all self-respecting newly developed libraries will be deployed as (installed at a version) modules. Oh well. There was really no reason to hope that a standard meaning for "modules" and the concept "above" could be justified in any way, across languages. At least, there remains an insight that it really makes sense to deal with the two levels separately.

I sure wish we could all call the installed-at-a-version thing "component", and the language concept "modules". Just two paragraphs on how that might sound:

Every language needs a module system: a way to deal with grouping, interfaces (in terms of making visible - "exporting" only the interface and keeping the rest internal). Including files was always bad and indicates the lack of a module system. It's when one needs to deal with independently developed, deployed and release units that there comes the problem of dependencies, evolution and versions, and these things are the components. Components are related to modules in that they need to contain at least one of them for talking about the things that are provided and brought in scope.

If there is a dependency between versions of components and there are backwards incompatible (!) changes in the depended-upon component (or a left-pad situation), then this is something that needs to be dealt through changes to the depending component. That isn't something that module systems (or any other language design) can help much with.

## 18.7.20

### Relating partial evaluation, multi-stage programming and macros

Here is some quick fact on two metaprogramming techniques. $$\begin{array}{l} \mathsf{letrec}\ \mathtt{pow}\ =\ \lambda n : \mathsf{Int}. \lambda x : \mathsf{Int}. \\ \quad \mathsf{if}\ (\mathsf{eqInt}\ n\ 0)\ 1\ (x\ * (\mathtt{pow}\ (n - 1)\ x)) \end{array}$$

What can we do with an expression $\mathtt{pow}\ 3$?

## Partial evaluation

The "partial evaluation" answer is to inline the definition, and evaluate what can be evaluated, yielding a residual program. $$\begin{array}{l} \lambda x : \mathsf{Int}. \mathsf{if}\ (\mathsf{eqInt}\ 3\ 0)\ 1\ (x * (\mathtt{pow}\ (3 - 1)\ x)\\ \lambda x : \mathsf{Int}. x * (\mathtt{pow}\ 2\ x)\\ \ldots\\ \lambda x : \mathsf{Int}. x * (x * (x * 1)) \end{array}$$

## Staging

The multi-stage programming answer (or staging) is to rethink $\mathtt{pow}$ as a transformation of code. $$\newcommand{\qquote}[1]{\color{blue}{[|}#1\color{blue}{|]}} \begin{array}{l} \mathsf{letrec}\ \mathtt{pow}\ =\ \lambda n : \mathsf{Int}. \lambda x : \mathsf{\color{blue}{Code[Int]}}. \\ \quad \mathsf{if}\ (\mathsf{eqInt}\ n\ 0)\ \color{blue}{[|}1\color{blue}{|]}\ \color{blue}{[|}(x * (\mathtt{pow}\ (n - 1)\ x)\color{blue}{|]} \\ \mathsf{let}\ \mathtt{mkpow}\ =\ \lambda n : \mathsf{Int}. \qquote{\lambda y : \mathsf{Int}. (\mathtt{pow}\ n \qquote{y})}\\ \end{array}$$

The notation was (probably) not designed to scare you:

• $\qquote{\ldots}$ is quasi-quotation. What is between these brackets is code

The activities in constructing models and software may be similar to mathematics, but they are certainly not the same. Each domain is a mathematical universe, with random and often contradictory pieces of information waiting to be organized in "islands of consistency". Hopefully we will one day reach some widespread understanding and usable tool support for formal reasoning that can serve both practical informatics as well as formalizing mathematics.

[If you want to discuss, the HN thread is a great place to leave comments]