Intuitionistic Higher-Order Logic and Equality


I have posted plenty of motivation for considering equality and higher-order logic elsewhere on this blog by now. Let me finally get to the intuitionistic version of higher-order logic, since I talked about the classical version previously and then also talked at length about "Intuitionistic Logic and Natural Deduction" without really getting closer to the matter. Intuitionistic higher-order logic is what let's you talk about an elementary topos, and using equality to characterize the logical connectives gives an insightful perspective on those commutative diagrams for people who are familiar with $\lambda$ calculus. \( \newcommand{\qeq}[0]{\color{blue} {\mathrm{eq}}} \newcommand{\qSigma}[0]{\color{blue} \Sigma} \newcommand{\qPi}[0]{\color{blue} \Pi} \newcommand{\qor}[0]{\color{blue} {\mathrm{or}}} \newcommand{\tprop}[0]{\omicron} \newcommand{\qtrue}[0]{{\color{blue} {\mathrm{true}}}} \newcommand{\qand}[0]{\color{blue} {\mathrm{and}}} \newcommand{\qimplies}[0]{\color{blue} {\mathrm{implies}}} \newcommand{\qPi}[0]{\color{blue} \Pi} \newcommand{\qfalse}[0]{{\color{blue} {\mathrm{false}}}} \newcommand{\qnot}[0]{{\color{blue} {\mathrm{not}}}} \)

I am going to refer to material in "Higher order logic and equality" often. Like in that post, there is actually no original material here; I am just gathering definitions that is probably well-known to some grad students and experts (yet nobody calls it standard since neither higher-order logic nor intuitionistic logic are known very well outside certain circles who deal with formal logic and proof assistants and the like). This is my feeble attempt at an exposition.

The Logic of a Topos

We take a $\lambda$ calculus with surjective pairing and a type $\omicron$ which we take as a substitute for "truth values" (topos theory texts use $\Omega$ but we want easy comparison with the classical version linked above, and we spell out $\qPi \lambda x. t$ without setting up an abbreviation $\forall x. t$ and we also spell out $\qSigma \lambda x. t$ without setting up an abbreviation $\exists x. t$). We want to set this up in an intuitionistic manner, which means doing away with the idea that $\omicron$ stands for two truth values, and not using classical logical equivalences to define $\qor$ and $\qSigma$. The terms would then correspond to a particular cartesian-closed category, the free topos.

We again want to define everything in terms of equality $\qeq_\alpha: \alpha \rightarrow \alpha \rightarrow \omicron$. What is this equality doing? In our langugage, it talks about equality of terms, and in categorical logic, terms correspond to arrows and thus equality should talk about equality of arrows. (There is some detail hidden here as our language will also let us talk about "global elements" $1 \stackrel{c}{\longrightarrow} C$ which are written as constant terms $c: C$). We take the following elements of our logical language: $$ \begin{align*} \qtrue:\ & \omicron\\ \qand:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \qimplies:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \qPi_\alpha:\ & (\alpha \rightarrow \omicron) \rightarrow \omicron\\ \end{align*} $$

... and define them using the only primitive $\qeq$: $$ \begin{align*} \qtrue &:= \qeq_{\tprop \rightarrow \tprop \rightarrow \tprop} \qeq_\tprop \qeq_\tprop\\ \qand &:= \lambda x:\tprop. \lambda y:\tprop. \\ &\quad \quad \qeq_{\tprop \rightarrow \tprop \rightarrow \tprop} \\ &\quad \quad \quad (\lambda z: \tprop \rightarrow \tprop \rightarrow \tprop. z~\qtrue~\qtrue) \\ &\quad \quad \quad (\lambda z: \tprop \rightarrow \tprop \rightarrow \tprop. z~x~y) \\ \qimplies &:= \lambda x:\tprop. \lambda y:\tprop. \qeq~x~(\qand~x~y)\\ \qPi_\alpha &:= \qeq_{(\alpha \rightarrow \tprop) \rightarrow (\alpha \rightarrow \tprop) \rightarrow \tprop}~(\lambda x : \alpha. \qtrue) \end{align*} $$

These definitions are the same as for classical logic. An intro to topos theory will typically describe $\qand$ and $\qimplies$ using commutative diagrams, but here we are defining the language first. Moreover, the language has only one primitive $\qeq$, though that primitive is instantiated at different types.


Now we look at $\qfalse$, which is also defined the same as in classical case: $$ \begin{align*} \qfalse &: \omicron\\ \qfalse &:= \qeq_{\tprop \rightarrow \tprop} (\lambda x:\tprop. \qtrue) (\lambda x:\tprop. x)\\ \end{align*} $$

In a topos theory text, one encounters $\forall v:\omicron. v$, which is shorthand for $\qPi\lambda v:\omicron. v$ and actually expands to the very same definition above. The definition that mentions $\qeq$ explicitly has the advantage that it explains what is going on: it morally says "if there is anything other than $\qtrue$ in $\omicron$, then $\lambda x:\omicron. \qtrue$ and $\mathsf{id}_\omicron$ are not the same." So we have a value $\qfalse$, but it need to be the only thing distinct from $\qtrue$ in $\omicron$.

Not, Or and Exists

On to intuitionistic negation $\qnot$. In intuitionistic logic, we would define negation like this: $$ \begin{align*} \qnot &: \omicron \rightarrow \omicron\\ \qnot &:= \lambda x: \tprop. \qimplies(x, \qfalse) \end{align*} $$

If we consult the classical version $\qnot := \qeq_{\tprop \rightarrow \tprop} \qfalse$, we again find that this is not that different.

Finally, let's look at $\qor$ and $\qSigma$. We define these using $\qPi$: $$ \begin{align*} \qor &: \omicron \rightarrow \omicron \rightarrow \omicron\\ \qor &:= \lambda x: \omicron. \lambda y: \omicron. \qPi \lambda r:\omicron. \\ & \quad \quad \qimplies(\qand(\qimplies(x, r), \qimplies(y,r)), r) \\ \\ \qSigma_\alpha &: (\alpha \rightarrow \omicron) \rightarrow \omicron \\ \qSigma_\alpha &:= \lambda x: \alpha \rightarrow \omicron. \qPi \lambda r:\omicron. \\ &\quad \quad (\qPi \lambda z:\alpha. \qimplies(\qimplies(x z, r), vr) \end{align*} $$

We can get an intuitive understanding of these definitions: for $\qor$, if we have a way of obtaining an element $r$ (result) from $x$ and one of obtaining the same from $y$, then also from $(\qor~x~y)$. Similarly for $\qSigma$, if we have a way of obtaining any result $r$ from the "instantiated" predicate $x z$, then also from $\qSigma x$.

The definition of $\qor$ in terms of $\qand$ and $\qPi$ is not very different from natural deduction rule for "or elimination": $$ \cfrac{\begin{array}[cc]{} \\ {} \\ A \vee B \end{array} \quad \begin{array}[cc]{} [A] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array}\quad \begin{array}[cc]{} [B] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array} } {C}~\color{blue}{(\vee{\mathrm{\small ELIM}})} $$

Checking the the natural deduction rule for existential elimination and comparing it with the encoding above is left as an exercise.

The end of the beginning

Higher order logic seems to deserve more attention. We have set up a logical system based on $\lambda$ calculus without making use of the Curry Howard correspondence. While the previous discussion focused on classical higher order logic as introduced by Church in 1940, in this post we looked at intuitionistic higher order logic. The interest in the latter is at least very heavily influenced by topos theory.

It is slightly confusing that when discussing elementary toposes, one talks of first-order logic when in reality the logic is inherently higher-order (because a topos is cartesian closed). I guess the charitable perspective is what people mean is "you can do everything that you would want to do in formal set theory using first-order logic."

The Lambek and Scott book "Introduction to higher order categorical logic" dwells a bit on the fact that it is always possible to get a formulation of the internal language of a topos using equality. Andrews $Q_0$ develops simple type theory / higher order logic using a single equality operator $Q$. Various theorem proving systems have been based on this (TPS and HOL).

One could say that the way higher-order logic admits the definition of logical connectives in terms of other is an "encoding" of natural deduction rules. Could one use this idea to build a logical framework, a representation of derivations? Indeed this is what is underlying Isabelle/Pure framework and $\lambda$Prolog.

It would seem worthwhile to write a proper book that takes a logical perspective on the $\lambda$ calculus, developing higher order logic as language, category theory as semantics and arriving at elementary toposes as models. It would probably feel very elementary and pedestrian in the academic world, but I do think it would be useful. It feels to me that there is so much "background knowledge" to gather the relevant theory on logic and computation that most practitioners will not have the energy (and this says something about the potential for adopting of formal methods). Until such a book arrives (I'm not sure if I would have the energy), I hope my out-of-order blog posts provide some help on the reader's journey. There are certainly a lot of of connections between logic, language, types and semantics that are left to be explored, besides Curry Howard and dependent types.


Programmers love Higher-Order Logic

My previous take on higher-order logic was classical, channeling Church's simple theory of types and Andrews $Q_0$. In that last post, I was hinting at motivation. So I want to expand a bit now how this stuff can be useful for building software at large. I am posting some ramblings here.

Equality is Fundamental

There are two different "levels" at which a programmer deals with some kind of reasoning that involves equality:

  • while writing code: need to be able to tell when two objects are the same.
  • when designing or arguing that code is correct: reasoning whether behavior like output values conform to a (however implicit) specification

In both cases, one needs to be able to identify whether objects (or behaviors or input-output pairs) are "the same" as expected ones. But the latter activity, reasoning about program correctness, happens in some form of logic. Much that has been written and is being written on formal logic is about mathematics and mathematical convention. This is true whether we talk about semantics of programming languages or program specification.

But for programmers, there are benefits to knowing formal reasoning even if one is not interested in the foundations of math, axiomatizations of set theory etc. It is no secret that in mathematics as in programming, you run into situations where there are multiple different ways to represent some data, and one needs to chose or translate between those representations. While programmers resign to writing "proto-to-proto" translation over and over (which often are not as trivial as it seems), in mathematics it is customary to leave out the awkward details, introduce "abuse of notiation" and quotienting to keep arguments informal, but concise.

The heart and soul of much mathematics consists of the fact that the “same” object can be presented to us in different ways. [1]

[1] Barry Mazur. When is one thing equal to some other thing?

Equality in First Order Logic

We take a leap towards "formal systems", where objects are represented using (strings of) symbols, along with strict rules to form statements and derive new true statements from existing ones.

A typical definition of a first-order logic language contains constant and function symbols, predicate symbols, logical operators.

There is a "design choice" here whether equality is considered as built-in, magically denoting exactly the equality in the underlying sets, or whether equality is a predicate with certain special properties. Wikipedia" talks about "first-order logic with equality" (when built-in) and first-order logic without equality (when it is not).

When equality does not come built-in, one needs to add axioms. The equality axioms are special, in the sense that they have to be preserved by all the function and relation symbols. This looks like:

  • $\mathsf{eq}$ is reflexitve, symmetric, transitive
  • if $\mathsf{eq}(x_1, y_1), \ldots, \mathsf{eq}(x_n, y_n)$, then $\mathsf{eq}(f(x_1,\ldots,x_n),f(y_1,\ldots,y_n)$ for all function symbols $f$ of arity $n$
  • if $\mathsf{eq}(x_1, y_1), \ldots, \mathsf{eq}(x_n, y_n)$, then $R(x_1,\ldots,x_n) \Leftrightarrow R(y_1,\ldots,y_n)$ for all relation symbols $R$ of arity $n$

With a "defined" equality, is it is well possible to interpret logical statements involving $\mathsf{eq}$ in such a way that two distinct elements are regarded as equal according to $\mathsf{eq}$.

Coding our own equality

When we are talking about practical software development, there is really no hope of defining a platonic ideals; yet when techniques of reasoning, informaticians are in various ways confined to results and methods that formalizing mathematics.

Take this example: We'd like our terms to represent free monoid on $X$, so we'd like these terms to satisfy the following equations (or identities): $$s \cdot 1 = s$$ $$t = 1 \cdot t$$ $$(s \cdot t) \cdot u = s \cdot (t \cdot u)$$

Consider definitions in some fictional language to express the "interface" of a monoid: $$ \begin{array}{ll} \mathsf{interface}~ M[X] \\ \quad \mathsf{unit}(): M[X]\\ \quad \mathsf{emb}(x: X): M[X]\\ \quad \mathsf{dot}(s: M[X], t: M[X]): M[X]\\ \quad \mathsf{eq}(s: M[X], t: M[X]): \mathsf{bool} \end{array}$$

Here, $\mathsf{unit}$ stands for $1$, $\mathsf{emb}(x)$ is for embedding $x$ and $\mathsf{dot}$ is the combine operaion in $MX$. If our fictional language has algebraic data types, we can use those to build terms which represent the algebraic theory of monoids: $$ \begin{array}{ll} \mathsf{data}~ \mathsf{Mon}[X] \\ \quad \mathsf{Unit}(): \mathsf{Mon}[X]\\ \quad \mathsf{Emb}(x: X): \mathsf{Mon}[X]\\ \quad \mathsf{Dot}(s: \mathsf{Mon}[X], t: \mathsf{Mon}[X]): \mathsf{Mon}[X]\\ \quad \mathsf{eqMon}(s: \mathsf{Mon}[X], t: \mathsf{Mon}[X]): \mathsf{bool} \end{array}$$

It is now really obvious what these constructors stand for, but we note that while the equality $\mathsf{eqMon}$ returns true whenever two terms are equal when considered as elements of the free monoid, there is at least one more equality relation on terms that is different ("structural equality" or "syntactic equality", which will consider $\mathsf{dot}(\mathsf{dot}(s, t), u)$ as distinct from $\mathsf{dot}(s, \mathsf{dot}(t, u))$).

Mathematically speaking, we want to consider the quotient set $\mathsf{Mon}[X] / \mathsf{eqMon}$, and work with equivalence classes.

Formulas = Functions to Truth Values

After all this talk about equality in algebraic theories, we want to turn equality in the context of logic, specifically higher-order logic.

In classical logic, a proposition (logical statement) stands for a truth value, and there are exactly two of then (the two-value type "bool"). Boole was the first to associate propositional connectives with algebraic operations. In this perspective, the the truth values are part of the language, even though this was later abandoned as in Tarski-style semantics.

Programmers are quite used to having functions return $\mathsf{bool}$. In terms of mathematical logic, our custom-made equality "relation" $\mathsf{eqMon}$ above is in fact a function to the set of truth values. Could we not do away with relations and talk in terms of functions?

Yes, we can; using truth values and functions in a formal language of propositions is the essential summary of classical higher order logic (or Church's simple theory of types. There is a more compelling example: if you consider "set builder notation" $$ \{ x \in A \ |\ \phi(x) \} $$ then this describes a subset of $A$. We can choose to describe $\phi$ as defining a (unary) relation, but viewing it as a function to $\mathsf{bool}$ is no doubt more natural in a context like SELECT x FROM A WHERE $\phi$ where "relations" are given by their extensions (as tables) and computation in the form of expressions is available.

The point of (classical) higher-order logic is that using the $\lambda$ calculus and truth values, we have everything we need to define the meaning of all logical connectives, and the types prevent logical paradoxes of "naively-untyped set theory". Since the truth values are part of the language, it is possible to define the meaning purely through definitions that involve equality.

More on this and classical higher order logic in that last post on Higher Order Logic and Equality and the next one I'm going to write in a second, where I talk about the intuitionistic version.

I will just end with a note that there is no distinction between "predicate", "relation" and "set" is higher-order logic, and this is a good thing. An expression of type $\iota \rightarrow \omicron$ (where $\iota$ are the individuals and $\omicron$ the truth values) defines a (sub)set of individuals, a predicate on the set of individuals, a unary relation, all at the same time. $$\mathsf{even} := \lambda x:\iota. \Sigma \lambda y:\iota. \mathsf{eq}(x, \mathsf{times}(2, y))$$ $$\mathsf{greater} := \lambda x:\iota.\lambda y:\iota. \Sigma \lambda z:\iota. \mathsf{eq}(x, \mathsf{plus}(y,z))$$ $$\mathsf{lowerbound} := \lambda x:\iota.\lambda y:\iota \rightarrow \omicron. \Pi \lambda z:\iota. \mathsf{implies}(y(z), \mathsf{greater}(z, x))$$

The confusing bit for programmers may be that this thing is called simple type theory when in fact there is only one "sort" of individuals. This makes more sense when considering that it is no problem to come up with multiple "sorts" of individuals but that wouldn't give sets ($\iota \rightarrow \omicron$), sets of sets $(\iota \rightarrow \omicron) \rightarrow \omicron$, sets of sets of sets (and so on) of individuals. It's this distinction of hierarchies of sets/predicates/relations that earn Church's higher order logic the name "simple type theory": only terms with the right types are considered well-formed and meaningful.


Modules, components and their limits

I read Clear explanation of Rust's module system. The TL;DR the module system in rust is a way to (somewhat) break/hide the link between what things are defined and what source files the things are defined in. A rust module provides control over visibility across source files and makes explicit how references reach across "modules".

The post starts with a request to ignore what "module" means in other languages, and this plunged me into thoughts on all the things that are called modules. It is somewhat sad but to be expected that a basic word like "module" (or "package") could never have a standard meaning, despite all these programming languages supposedly built while we have all this PL theory.

The more academic vocabulary has "module" as a language concept that comes with a scope and visibility, a way to bundle types and named values. It's somehwat independent of installation, deployment (the only touch point being support for separate compilation), but - to my knowledge - there is no talk of an "installed-at-a-version" module. So the rust module concept is pretty consistent with that. Arranging a set of modules that fit together is the programmer's problem, not the researcher's.

Alas, there is this thing that happens at the level of Rust crates/npm packages, and it is really a different problem. Nevertheless, those things are also called modules in some languages (C++20 modules, Java modules, go modules).

In Java, source files in the same directory make a Java package and they come with a way to control visibility. The association between files and packages is rigid. More recently, Java modules were added to express dependencies between modules, visibility of packages - something like a "packages of packages", but really more on the level of Rust crates.

I am reminded of "components" (if anybody remembers component-oriented programming) because deployment is part of this game - the notion of dependencies only makes sense in the context of some installation. But then components used to be about runtime services (in addition to code), object brokers and such and here we would actually like to talk about less dynamic things, just code and functionality in the form of source and/or libraries. Still, components are things that get assembled and exist at versions, so I'd say it's an appropriate, if old-fashioned name.

Now, such a (static) component would abstractly describe what npm packages or Rust crates achieve, and also form of Debian development packages (sigh) where you install a C library in binary form and its header files. That gives us even more of a twist now, which is that C++ 20 modules are really modules in the sense of achieving grouping for visibility and scope (check) but at the same time that whole notion has a huge impact on C++ since it should actually make it easier to deploy C++ components. A relevant quote from the C++20 link above:

After a module is compiled once, the results are stored in a binary file that describes all the exported types, functions and templates. That file can be processed much faster than a header file, and can be reused by the compiler every place where the module is imported in a project.

So while a C++20 module is language concept, it is foreseeable that one day all self-respecting newly developed libraries will be deployed as (installed at a version) modules. Oh well. There was really no reason to hope that a standard meaning for "modules" and the concept "above" could be justified in any way, across languages. At least, there remains an insight that it really makes sense to deal with the two levels separately.

I sure wish we could all call the installed-at-a-version thing "component", and the language concept "modules". Just two paragraphs on how that might sound:

Every language needs a module system: a way to deal with grouping, interfaces (in terms of making visible - "exporting" only the interface and keeping the rest internal). Including files was always bad and indicates the lack of a module system. It's when one needs to deal with independently developed, deployed and release units that there comes the problem of dependencies, evolution and versions, and these things are the components. Components are related to modules in that they need to contain at least one of them for talking about the things that are provided and brought in scope.

If there is a dependency between versions of components and there are backwards incompatible (!) changes in the depended-upon component (or a left-pad situation), then this is something that needs to be dealt through changes to the depending component. That isn't something that module systems (or any other language design) can help much with.


Relating partial evaluation, multi-stage programming and macros

Here is some quick fact on two metaprogramming techniques. $$ \begin{array}{l} \mathsf{letrec}\ \mathtt{pow}\ =\ \lambda n : \mathsf{Int}. \lambda x : \mathsf{Int}. \\ \quad \mathsf{if}\ (\mathsf{eqInt}\ n\ 0)\ 1\ (x\ * (\mathtt{pow}\ (n - 1)\ x)) \end{array} $$

What can we do with an expression $\mathtt{pow}\ 3$?

Partial evaluation

The "partial evaluation" answer is to inline the definition, and evaluate what can be evaluated, yielding a residual program. $$ \begin{array}{l} \lambda x : \mathsf{Int}. \mathsf{if}\ (\mathsf{eqInt}\ 3\ 0)\ 1\ (x * (\mathtt{pow}\ (3 - 1)\ x)\\ \lambda x : \mathsf{Int}. x * (\mathtt{pow}\ 2\ x)\\ \ldots\\ \lambda x : \mathsf{Int}. x * (x * (x * 1)) \end{array} $$


The multi-stage programming answer (or staging) is to rethink $\mathtt{pow}$ as a transformation of code. $$ \newcommand{\qquote}[1]{\color{blue}{[|}#1\color{blue}{|]}} \begin{array}{l} \mathsf{letrec}\ \mathtt{pow}\ =\ \lambda n : \mathsf{Int}. \lambda x : \mathsf{\color{blue}{Code[Int]}}. \\ \quad \mathsf{if}\ (\mathsf{eqInt}\ n\ 0)\ \color{blue}{[|}1\color{blue}{|]}\ \color{blue}{[|}($x * $(\mathtt{pow}\ (n - 1)\ x)\color{blue}{|]} \\ \mathsf{let}\ \mathtt{mkpow}\ =\ \lambda n : \mathsf{Int}. \qquote{\lambda y : \mathsf{Int}. $(\mathtt{pow}\ n \qquote{y})}\\ \end{array} $$

The notation was (probably) not designed to scare you:

  • $\qquote{\ldots}$ is quasi-quotation. What is between these brackets is code
  • $\$(\ldots)$ is used within a quasi-quoted expression to splice code.

Now $\mathtt{mkpow}\ 3$ is going to give us the code of the desired power function: $$ \begin{array}{l} \qquote{\lambda y : \mathsf{Int}. $(\mathtt{pow}\ 3\ \qquote{y})}\\ \qquote{\lambda y : \mathsf{Int}. $(\mathsf{if}\ (\mathsf{eqInt}\ 3\ 0)\ \qquote{1}\ \qquote{($\qquote{y} * $(\mathtt{pow}\ (3 - 1)\ \qquote{y})})}\\ \qquote{\lambda y : \mathsf{Int}. $\qquote{($\qquote{y} * $(\mathtt{pow}\ 2\ y)})}\\ \qquote{\lambda y : \mathsf{Int}. $\qquote{y} * $(\mathtt{pow}\ 2\ y)}\\ \qquote{\lambda y : \mathsf{Int}. y * $(\mathtt{pow}\ 2\ y)}\\ \qquote{\ldots}\\ \qquote{\lambda y : \mathsf{Int}. y * (y * (y * 1))} \end{array} $$

There can be nested levels of $\color{blue}{\mathsf{Code}}$ and this explains the term multi-stage programming. For objects of this type to be really useful, there should be a method $\mathsf{run}: \color{blue}{\mathsf{Code}[T]} \rightarrow T$.

Macros are very related to the above: a macro gives the programmer a way to give a definition that is elaborated at compile time. This is similar to what is going on in splicing. The special case is that it is the compiler that calls $\mathsf{run}$; in other words,the context in which a macro is elaborated is the "compile-time stage". The design of Scala's metaprogramming facilities is consistent with all this, and I picked this up from "A practical unification of multi-stage programming and macros" by Stucki, Biboudis, Odersky.

That's all; this is shallow as all my posts and there would be more to say; "Modal analysis of staged computation" by Davies and Pfenning would be a good place to continue for the curious and/or impatient. I'd like to connect these views on staging and macros and modal logic, but that's for another day.


Language, Logic and Modeling

Language is the parrot of thought, and a hard to teach one, nothing more. („Sprache ist der Papagei des Gedankens, und ein schwer gelehriger, nichts weiter.“) ― Friedrich Hebbel

Here's a short note about a perspective on logic for informatics, and decidedly not mathematics.

Formal logic has its roots in formalizing mathematics. Expressing the steps of reasoning in a formal system would assure us that there are no "gaps". I wrote a brief account on the beginnings of formal logic elsewhere.

The 20th century brought with it the advent of machines for information processing. This has gone way beyond mathematics already. In informatics (a better word for "computer science") one deals with information processing, and its practical kin, software engineering, deals with systems of all kinds where information gets in contact with reality.

There is a view that holds that we can only understand reality through models (see constructivist epistemology). Now, whether one subscribes to this or not, it is obvious that informatics relies on models of reality; information itself is always part of some model of reality. If you worked with customers or product managers, you know it is shaky and always incomplete series of models of reality, always adapting to new circumstances.

All I am saying is: formal logic can and should help us as a tool for formulate precisely these models. Like in mathematics, we do not want to have "gaps" in our reasoning. This is a somewhat ambitious direction, as software systems are not always built according to specification, and specifications are usually prose. And it is not only about modeling the domain.

"Functional programming" has gathered a large following in recent years, and I think this is because less because "software should be mathematics", but programmers benefit from constructing their code according to models that leaves no "gaps". The tool at hand, a (statically typed) functional programming languages, forces one to think thoroughly about data and the operations that are going to be carried out.

One does not need to resort to philosophy for challenging this. Large software systems are distributed over many machines, and gives rise to challenges that require methods different from those that are mathematics (and different from programs that run on a single machine). While functional requirements lead us to modeling domain concepts, designing a system that meets those is constrainted by non-functional requirements. We need to consider of performance and machine capacity, and evaluate which underlying technologies to build on, what guarantees they provide. It is here that engineering can be meaningfully distinguished from programming, and work is influenced less by theorems, more by best practices and experience.

Engineering and domain modeling are full of abstract models (the favorite tool being: diagrams of boxes with their meaning provided by handwaving). Formal logic and actual, manifest models where people could experiment are not encountered very often, yet expressing these models, reasoning with them and testing/verifying them would be critical since mistakes are costly. Moreover, a large number of people have to understand and act in accordance to these models. Often, the only time where this becomes obvious are the APIs between subsystems, a point where much modeling has already happened and cannot be changed.

I will go further and claim that Conway's Law is a consequence of each team forming their own "unit of modeling". It would be a lot better when we learnt to communicate explicitly and precisely what our models are, explain our needs to model things differently and how we can translate between our different models instead of insisting on some "right" way of seeing something. I place my hopes in logic as a language and tool to do just that. Mathematics is a discipline where the practice of modeling a mathematical "universe" has advanced very far, and it is no wonder that people seeking to use precise methods turn to it. It is a mistake to interpret this field of application of formal logic as some sort of mathematical monopoly. Language is a tool to communicate, a logic is a style of language.


I don't know how to Semantics

I missed out my daily writing practice of bad, unfinished, unedited thoughts. So why not combine it with idle thoughts on logic. The title alludes to the friendly LOGO error message, and this exchange on Twitter:

Andrej is not wrong, but axiomatic semantics is a thing in PL. So I'm collecting now the two different contexts for the word and idea of "semantics", the mathematical logic, one in programming languages. I am generally trying to embrace multitude of meanings in different contexts, but given the strong connections between PL and logic, it is worth pondering what is going on here.

Linguistic meaning: the writing on the wall

The word "semantics" comes from σῆμα "sign", "marking" and refers to the study of meaning (somewhat associated with linguistics, the study of language). Now meaning a very broad thing to study, and how we create meaning and interpret signs proper (beyond language) leads us to semiotics. I promise I am not going to spiral into ever higher level of abstractions and get back to the more mundane divergences of academic disciplines; still I want to share a story to show the depth of the question "how we make meaning of something" and what meaning really is.

Here is an example of linguistic and semiotic meaning. I believe I had heard and understood in English the idiom "the writing is on the wall" long before I heard the biblical story. It is hard to tell, since I do not remember. Maybe I looked up the German word "Menetekel" – which is the same but doesn't sound as cool and portentous – and then forgot about it. Maybe I learned and chose to forget biblical reference. English is full of references to stories that become idioms and detached from their origins (I also learned very recently that "drinking the Kool-Aid" originates from a Jonestown massacre), yet somehow they convey meaning.

Conveying meaning in written language can only happen through more written language; however, language is definitely more than what is written. I claim it's possible that even if you never heard this story from the Bible and have no access to it, you can nevertheless learn the full meaning of the saying "the writing is on the wall" when you witness it being used in some situation that discusses the terrible ending to come. Whether that is from the textual, narrative context, or rather from the faces, gestures of the participant or from your understanding of the reality of the thing it is said in reference to, that's hard to tell.

Kamus writes how many meanings each word has, then, I wondered and tried to write down a Kamus showing, to how many words a meaning corresponds. (Bediuzzaman Said Nursi)
Wisdom is a tree which sprouts in the heart and fruits on the tongue. (Ali ibn Abi Talib)

Now, back to our divergent use of "semantics" in mathematical logic and programming languages.

Semantics in mathematical logic

The predominant view in mathematical logic today is Tarski's conception of semantics and truth, adapted to model theory. This is what is taught to undergrads, and I shall like to deconstruct the assumptions that come with it:

  • Statements are either true or false, or otherwise put, there is an objective notion of truth that we try to capture with logic languages.
  • The meaning for a logical model is given by describing the interpretation of the language in a model
  • The way we communicate or describe a model and the interpretation are well understood.

All issues one may have are subdued with the last point. You cannot really question why models are given in "informal set theory" because the whole meaning exercise requires some common ground. If one ever wonders how to give meaning to those things that are supposed to provide the well-understood meaning, one finds that formal set theory is relatively disconnected from what one is trying to do (and turns to type theory).

I find it very striking that at the time Tarski formulated his original criterion for defining truth, the lingua franca was higher-order logic (which is equivalent to simple type theory), not set theory. I do feel something was lost there. Classical higher-order logic, we are concerned with mapping to truth values, but point being about the mappings (not the "sets"). So rather than being the "informal set", a grouping of objects by some property, or a formal set which described by an obscure string formal set theory, a "higher-order logic" set is defined by its mapping. A "woman" is defined by a mapping

woman: Individuals -> {true, false}

... and you can argue when it returns true. I find this closer to discussions on meaning in the linguistic sense.

Semantics in programming languages

In programming languages, "semantics" refers to the "meaning" to programming language (this is actually a surprising and confusing statement). Now, "meaning" is something a lot narrower than linguistic meaning.

  • Something does not have to be universally and objectively true to give "meaning"
  • The conventions we can rely on when giving meaning are entirely different.
  • The way we communicate or describe a model and the interpretation are well enough understood.
  • "Meaning" can be something like a specification that clarifies what statements and expressions of the language do. In practice, this is all informal. In research, there are three "styles":

    • Denotational semantics. We choose to embrace the conventions of mathematics as our medium for conveying meaning as objective truth
    • Operational semantics. The meaning is in the behavior, the execution of the programs. See what this interpreter does? That is the meaning.
    • Axiomatic semantics. The meaning of a program is properties you can state.

    This distinction is entirely standard and helpful, I learnt these as undergrad and e.g. Benjamin Pierce "Types and Programming Languages". It is also largely ignored in industry, where a natural languages specification relying on conventions is all you get for "meaning" (and stackoverflow may cover the gaps).


    Programming actually has its roots in formal logic. It is not a coincidence that the act of setting up formal languages and the burden of having to define meaning is shared by people who mathematical logic (ok, I should say formal logic to not exclude the philosophers) and programming languages, and that informaticians take an interest in proof assistants and machine-checked proof.

    Observe that all styles leave open a path to full formalization, but for axiomatic semantics it is somewhat odd since we'd have to formalize logic in logic. The boundaries are also not as clear-cut, since one can also describe an interpreter by some mathematical structure (labeled transition systems) and there is als formal methods, verification and the "theory of a program" and how it needs some semantics to build on (not necessarily "the" semantics). The taxonomy is still very useful, though, since it says something about the intention.

    My personal hobby is to study stacking systems of logic and reasoning, "compositional". Category theory is useful here, precisely because it permits to consistently ignore what is the foundation, and describe everything in terms of mappings (arrows), just like higher-order logic. I shall leave this topic for another post.


Go Karma Schizophrenia

Today, I tried to busy myself with doing some programming exercise in go. Not the board game, the language. Here are some idle thoughts on programming languages.

I had forgotten how good I can be at procrastination. I drank about a dozen cups of tea (soar throat), checked Twitter and mail... it was terrible.

If you studied programming languages, there are many things that seem off about golang. I could even understand if the lack of generics had been justifed as: "we value type inference and structural subtyping higher than having generics." but no, they are on the record as "we never needed them". ok... Opinions on programming language design are a dime a dozen.

I do not aspire to exercise my golang skills because it is a "great programming language", or the appeal to simplicity. Following a change of role at work, I simply expect to be dealing with a lot of go code in my job. That, and "modern C++". It's as simple as that. And sure enough, you can get used to anything.

When I was a grad student working on a certain programming language, I remember how convinced we were that we'd be changing the world. And indeed we have, but then a whole bunch of others have, too. Golang did not exist back then, and neither did "modern C++". I did have some experience in C++ programming, from peeking at programming courses for math students during uni (absolutely nothing related to actual C++ in my informatics curriculum, though there was a lecture on object-oriented programming) and my two internships. And also learned some C from a friend, one seminar and then many hours of toying around. This changed with my job of course, but the point is I perceived this dichotomy betwen programming as taught and research in academia and programming as done in industry or applied research contexts "with an industrial flavor". As grad students working on a language, we were convinced we have the best language, and determined to make it bestest, but I had seen "the other side".

I continued to dabble in C++ in those times, because I was sure I'd need it someday, it's the reality that cannot be ignored when studying programming languages. I described this as "professional schizophrenia": you split yourself into the part of you that accepts the "industry" programming languages, and the part that supports progress based on research which mostly gets ignored. I was then quite surprised when C# 2.0 got generics and Java 1.5 quickly followed suit. It looks like pattern matching makes it into the mainstream, too.

Another way to view this discrepancy is pragmatism: you use what's there, things are the way they are due to path dependence, economic pressure or some other reason that was of concern to somebody. Languages are tools, no need for judgment, just move towards the objective. This is my modus operandi, but it feels good to know that a domain-specific language that is using elements of an advanced type system is sometimes the thing that moves you towards that objective. Still, I am surprised how much one can be stuck in "my programming language is my identity", at myself in hindsight but also all the fan discussion.

Today, after procrastination, I think there might also be some karma. While I had not been too vocal, I'm sure in a weak moment, I must have said something like "golang is not a good language" aloud. I remember looking at source code that I considered to be looking like line noise (the inline & turned me off), surely I must have expressed my disaffection a few times. After a whole day of pain-avoidance behavior – the other name for procrastination – I realize now that maybe this is what has been hunting me. I have been paying for my remarks, dearly. Wake up, Burak, it's time to do your exercise and show how much of a pragmatic you really are.

On the bright side, after all this suffering, I am hoping to be born again as a happy gopher tomorrow.


Nothing is real

We are the children of concrete and steel
This is the place where the truth is concealed
This is the time when the lie is revealed
Everything is possible, but nothing is real

Living Colour - Type

Ryuichi had to get out. The kids were running around the house and screaming, their juvenile energy levels blending with the boredom of a cloudy Sunday morning to form an explosive mixture. He checked the viral pollution levels and left unceremoniously. It wasn't ok to leave Lingzhao alone with the kids, but still he had to get out or the noise and shouting would get to him and erase all possibility of having thought. He needed space to think, and silence.

The streets had not yet filled with people, only a few families strolled, then shining beautiful Black faces gleaming in the African sunlight. The Chromatic Cure had worked its wonders, he thought. After the first "baby pandemic" of 2020 and the 2nd civil war for racial justice came the Dinosaur diseases, caused multiple strains of airborne virus out of the remnants of Siberian permafrost. It affected people with light skin color disproportionately, which had eventually led to a treatment. The skin's pigments were modified to transform UV light into virus-defeating proteins, but it needed Black pigments to work, a lot of them. Ryuichi vaguely remembered from his medical biochemistry class in kindergarten. "And that is why everybody is Black now", the teacher had said. The Unified Society logo on her uniform caught his eye, it was one of his earliest memories.

Weekend or not, those remaining pesky white supremacists were still trouble, Ryuichi thought. Their enclaves were on life support from the Unified Society. Large compounds, former prison complexes that had been sealed airtight and determined to hold out, fueled by conspiracy theories and delusion. Endless debates had ensued whether the move to reactivate the old underground supply channels to keep them alive was a justified use of scarce resources. Compassion won out and the prison enclaves became the only refuge for white people who preferred living with other fellow white male supremacists.

Ryuichi worked on the team that intercepted electomagnetic signals and gathered intelligence. Two individuals, bugmold and GDJ, had emerged as leaders, one based on ideology and the other seemingly controled the software that kept the modified prison complex working. Very little was known about the fate of women in the prison complexes; there had been many who chose to give up freedom to retain their skin color. But violent fights had erased all but a few, who willingly accepted bugmold's tales of superior intelligence and the day of revenge, and GDJ's programming school of thought in order to keep the automated manufacturing going and launch InfoSec and fake news attacks on the Unified Society. The brutal fights in the White Supremacy enclaves had become a pattern that repeated itself once a decade. They even had a name for this: Natural Selection.

Ryuichi found a bench and took a break from his walk. It would soon be time.

Compassion also was the argument that led to the Plan being accepted. The intelligence gathered from Ryuichi's group revealed more and more signs that the crooked ideology and its implementation through software protocols led to prisons being a place of crimes against humanity and intolerable suffering, and human rights group revolted. The cure had been perfected and encapsulated; no complicated week-long procedure was necessary anymore, the biochemists of the Unified Society had created a new airborne virus that would set off the process of transforming the skin color. Everybody has a right to keep their skin color, the chairman of the assembly said, as long as it is Black. Even the most ardent supporters of "tolerance" could not bear the tales of natural selection.

Ryuichi checked his watch: it was noon. The sealed gray metal boxes with today's supplies to the prisons, delivered synchronously on a global automated schedule, just opened with a click. He smiled and thought, now that White is history, the team will be asking for a new project tomorrow. But it was the weekend, after all. Time to get back and see what the kids are up to.


Gotta not-do what you gotta not-do

Today's writing is about wu wei - "effortless action".

Wu wei comes from Taoism. I don't know very much about Taoism. I do not have a plan to become a thought leader on Taoism. My only personal connection to Chinese philosophical tradition is that I learned Tae Kwon Do as a teenager and I got curious about the "way" teachers description that the roots of these are not "martial arts" but a "philosophy (or way) of movement".

I was intrigued by this idea that "attaining perfection (of movement) through practice" was not for becoming be perfect fighter, but to attain wisdom. I went on to read Tao Te Ching, I Ching, and while I did not understand much, I became sensitive to these topics. [A friend at work, big fan of martial-arts, laughed at me for taking this philosophy stuff seriously - but then, "If the Tao would not be laughed at this way, it would not be the Tao!"]

The point is, you don't have to be an philosopher to appreciate that the idea of effortless action is one that is found in many forms and cultures. It's a pattern easy to observe, if you have the eye for it.

Take "flow or "being in the zone". There are three conditions for flow to occur:

  • Goals are clear
  • Feedback is immediate
  • A balance exists between opportunity and capacity

This resonates with the programmer working on a software artifact, with the amateur musician playing a piece on the piano, when doing sports (I am going to stop here). It also resonates indirectly, think of a manager who wants to create a productive environment for their team.

The analytical, self-conscious and doubtful mind takes a back seat and you "become one" with the thing you are doing. There is a view of taoism where this is not only at the individual level, but also at the level of society; this is what is described in the Tao Te Ching and I Ching (and discussed in the wikipedia article on wu wei).

It should be intuitive (if somewhat illogical in the realm of words) that "effortless action" requires to put in some effort. The programmer had to learn many things in order to get flow state, hacking away at this code. The musician had to practice countless hours in order to be able to play the piece. That effort is not felt in the moment of flow. You are building up capacity.

Building up capacity, if done right, is subject to compounding effects. You don't want to only learn, but also watch your rate of learning. Is there a better method? Am I still working towards the same goal?

Now, when we think of "goals" we are slightly at odds with "effortless action" and the Taoist ideal, since setting a goal can already be "forced." "Effortless action" is not forced, and also not goal-oriented.

Effortless action and taoism in general are not concerned with values or morality. Taoism features the theme of man's inability to force changes upon society. This is a bit revolting at first sight, since today's society is full of injustice. Black lives matter!

Things start making a bit more sense when one thinks of the attempts to shape a society or the world by ideology, this is bound to fail or become unsustainable (let's count capitalism as the ideology of $$$ here). You can have the best law, it is not by itself going to guarantee justice. Or, when you think of some software to help with some process, it is helpful to look at those things that do not need to be specified - because it is hopeless to specify them and you need to rely on "common sense" or some norm or culture. Also related is: "There is no way to peace, peace is the way"

The effort on the society level could be framed as "building the capacity and opportunity for justice". So an abstract, non-descript goal, direction, objective like "justice" or "I want to be good at X" is ok and not a goal in the sense of forcing some direction; we call it a goal since we live in times where we have an extraordinary freedom in choosing what we want to do with our lives. Choices do not preclude wu wei. I'm inclined to think that wu wei is related authenticity and focus; not getting distracted in futile matters, not doing something because you feel you have to, not doing something for purely "political" (tactical) reasons, but focus on the essence. Another quote that comes to mind here, "Never mistake activity for progress."

There is a lot more to find out about this. I am not an expert on Taoism and I still do not intend to become one; but maybe there's a reminder here for looking at one's own patterns of doing and not doing.


Life has a limit but knowledge has none

“Your life has a limit, but knowledge has none. If you use what is limited to pursue what has no limit, you will be in danger.” Zhuangzi

If DRMacIver in Subsetting life can draw an analogy between the limited nature of consciousness / the human condition and how C++ programmers effectively know and use only a fraction of the C++ standard — a little subset of the possibilities — then I can do the same with Zhuangzi's warning and ... something. Thank you, David.

I have written on intuitionist mathematics and logic before, so today it's not going to be about effective computations being terminating ones, potential infinity vs actual infinity. No, let's talk about models.

I found this quote in a little notebook of mine, an entry dated 20 years ago. I had an actual habit of writing notes, long-winded streams of consciousness, full of philosophy and reflection; a note on the fleeting nature consciousness next to a bitter account of unrequited love; a seemingly never-ending search. I threw away most of those notebooks, but for some reason I kept this one and found it the other day while clearing out the basement.

A model is always a simplified image of something. That something is not necessarily "reality"; sometimes the thing that is simplified is just another model, or construction. When you make a model of anything, you leave out some parts (attributes) and retain others (Herbert Stachowiak calls this Verkürzungsmerkmal, or attribute of abstraction). Constructing models is key activity for the informatician, and I am glad that undergrads get exposed to Stachowiak's model theory (slides in German).

Any programmer engages in the business in creating models. Every data structure is part of a model, and data structures serve different purposes. What is right in one context is not appropriate in another. Premature optimization is the adapation of code (and frequently also data) for the purpose of optimizing the performance of some operation, often losing flexibility in the process, and making other operations slow or more expensive. Premature means it's not good and the implicit recommendation is that it is good to keep things general until you know what operations you need.

Let's not forget about consciousness and life. As a human, you need to make models, too. You need to understand and interact with your fellow humans, understand the consequences of your actions. This is not easy. For goodness sake, you need to understand yourself. What was it that made me throw those notebooks away?

There is this theory that there is only one consciousness. Carl Gustav Jung writes about this in Typologie, as a key pitfall that makes some introverts insufferable (I felt seen) since they tend to confuse their self and that universal consciousness. "One love" stands for universal love and respect for all beings, no distinctions. Religion literally means connecting your self with the One. The vedic religion talks about consciousness. You don't need to reach for religion for accepting a universality of the basic conditions and limitations in which consciousness and thinking occurs, though.

But not only life is limited, what each individual knows and is able to do with the knowledge they have, is limited, too. Every human is absolutely unique, not only in the various social or economic conditions they are born in, but also in their continued perspective. I like this opening phrase from Jean Piaget's genetic epistemology

GENETIC EPISTEMOLOGY attempts to explain knowledge, and in particular scientific knowledge, on the basis of its history, its sociogenesis, and especially the psychological origins of the notions and operations upon which it is based.

Piaget set out to study the development of children, and ended up with a theory of knowledge. Is it constructed models all the way down? Piaget classifies knowledge into three kinds, phyical, social and "relationships" which he calls logico-mathematical models. Why does he calls the latter thing "logico-mathematical"? I think because mathematics and to some extent logical descriptions of mathematics are used here as synonyms for a kind of knowledge where we are completely free to choose what we are modeling and why; at the same time the models are supposed to be consistent. We want to be sure that "they work", that they are good for something.

Now, the human condition isn't only about the subject. Different people have different models, so it is not unusual that programmers end up spending 50% or more of their time resolving their disagreements about models. True progress is when people learn something from each other, bridge differences, teach each other and enable others to overcome their limitations.

I don't know how I end up on taoism, but these are the "ten thousand things". Consciousness is not limited, but each individual slice of it is. And there are challenges and limits to collaboration, many of them stemming from social, economical limitations or preconceptions. Every individual does feel the need to belong; this is where rationality itself reaches its limits.

I am just going to end here. A lot of disagreements would be easily bridged if we made more of an effort to understand each other: what models, what limitations, what perspectives makes the people say what they say, do what they do. It is not consciousness that is limited; it is its manifestation in the individual — and sometimes just the willingness to take the perspective of the other.


Intuitionistic Propositional Logic and Natural Deduction

Here is a primer on intuitionistic propositional logic and natural deduction, along with a brief account of their historical context. The bits of historical context may explain why formal logic is somewhat entangled with the discourse on "foundations of mathematics". I frankly wanted to research and type up my notes on something else, but then started "writing backwards" and thought I'd make this the introductory notes I wish I had before learning about these things.

The topics here can be seen as useful background knowledge for anyone learning about sequent calculus, proof theory, Martin-Löf type theory (aka intuitionistic type theory aka constructive type theory), the Curry-Howard-Lambek correspondence with category theory and particularly cartesian closed categories, categorical logic; I may write about these connections at another time, but this is supposed to be an accessible overview. We proceed according to this plan:

The focus on propositional logic is for simplicity, and to build a bridge to my last post where I briefly described the Curry-Howard correspondence. One can add symbols and rules for quantifiers to obtain first-order predicate logic, and the inclined reader should consult the more detailed texts on these topics, but proof terms would no longer be simply-typed $\lambda$-calculus, and then with higher-order logic there is also a different way to talk about sets of individual objects.

The beginnings of formal logic

For ages, mathematics involved theorems and proofs. Yet, the beginning of formal logic is marked by Gottlob Frege's and Bertrand Russell's works.

Gottlob Frege described a graphical notation that captured sound reasoning in precise rules. It was the beginning of the logicist program of demonstrating that all mathematics could be carried out in formal logic. Bertrand Russell advanced this program considerably, also inventing type theory in the process.

Formal logic, symbolic logic refers to logical statements and reasoning being expressed in a formal language, with symbols representing logical statements and rules of symbol manipulation describing how to obtain new statements from existing ones, preserving validity. Frege and Russell's deductive systems of reasoning are today presented as Hilbert systems (more on those below).

Georg Cantor had proposed set theory as a foundational system earlier, but a formal account of set theory had to wait until Zermelo published his axioms. People (including Cantor himself) had found paradoxes in set theory or what they understood to be set theory. The search for the "right" foundations continued, while alternative systems of formal logic were proposed (for example, Schönfinkel's combinatory logic, which does not use any variables). The period that followed the beginnings saw what is today called the foundational crisis of mathematics.

Axiomatic proof systems

David Hilbert and his school systematized mathematical logic, the application of formal logic to formalizing mathematics, and "metamathematics".

Around 1920, the “Hilbert style” axiomatic approach, as it is often called, was known to everyone and dominated the logical scene [1]

Hilbert systems are a uniform presentation of formal deductive reasoning and can be treated as synonymous with axiomatic proof systems. There are "logical axioms" that capture the process of sound reasoning and there are "mathematical axioms" which would capture the mathematical theory we want to formalize. The only inference rules are the rule of substitution and modus ponens. A proof is a sequence of formulas, with each term being either an axiom or derived from a previous terms by application of the inference rules. If formula $A$ is derivable this way this is written $\vdash A$.

Later, John von Neumann refined axiomatic proof systems using axiom schemes: axioms schemes contain metavariables, and all instance count as axioms and modus ponens becomes the only rule. Here is such a Hilbert-style axiomatic proof system for propositional logic, due to Jan Łukasiewicz: $$ \begin{array}{l} 1.~ A \rightarrow (B \rightarrow A) \\ 2.~ (A \rightarrow (B \rightarrow C)) \rightarrow ((A \rightarrow B) \rightarrow (A \rightarrow C)) \\ 3.~ (\neg B \rightarrow \neg A) \rightarrow (A \rightarrow B) \\ \end{array} $$

The rule of Modus Ponens lets us infer from $A \rightarrow B$ and $A$ the new formula $B$. Note that the other operators $\vee$ and $\wedge$ are considered as abbrevations; this is referred to as Ockham's razor, since it uses a minimum of primitives $A \vee B$ is turned into $\neg A \rightarrow B$, and $A \wedge B$ is $\neg (A \rightarrow \neg B)$.

It is amazing that this axiomatic proof systems is complete: all tautologies can be proven from a few axiom schemes. Many texts on formal logic still use axiomatic proof system, presumably because these permit to talk about formal logic with a minimal amount of machinery for syntax and logical axioms; appealing to the deduction theorem and admissible rules hide much of the structure of the proof, which is not considered interesting to begin with; all that matters is the fact that something can be proven.

[1] Stanford Encyclopedia of Philosophy The Development of Proof Theory

[2] Stanford Encyclopedia of Philosophy The Emergence of First-Order Logic

What is intuitionistic logic?

L.E.J. Brouwer rejected the idea that logics would be prior to mathematics and proposed intuitionism as philosophy of mathematics; he understood mathematics as mental constructions that come first, before any rendering in logic.

We summarize the intuitionist positions as follows:

  • An intuitionist rejects the law of the excluded middle $A \vee \neg A$ when $A$ is a statement about infinite sets
  • An intuitionist rejects the idea of an "actual infinite" altogether and regards infinity as "potential infinity", approximable by repeating a process, but never completed.
  • An intuitionist aims to base all mathematics on effective constructions that can be verified based on intuition (such as the intuition of being able to carry out a countable number of steps)

These restrictions avoid paradoxes of naive set theory, at the cost of rejecting large parts of established mathematical practice.

Brouwer's student Heyting [3] and also Kolmogorov [4] formulated axioms of intuitionistic logic as axiomatic proof systems.

Their efforts is what enabled further discussion (beyond philosophical disagreement) and defined intuitionistic logic [5]. Today, maybe due to the multiple available proof systems for formal logic, many sources define intuitionistic logic informally by reproducing the "Brouwer-Heyting-Kolmogorov (BHK) interpretation" [wikipedia]. This is a description of what is an acceptable proof. We skip a discussion of these because we will be giving natural deduction rules below. The key part is that structure of the proof matters now; the structure contains the constructions that the intuitionist cares about.

Intuitionistic logic is only considered one form of constructive logic and constructive mathematics; there are other logics that are compatible with the BHK interpretation, and multiple possible interpretations of the word construction and function, and then I think most mathematicians do not bother to formal logic (while still caring about precise language and rigour). A different system of logic that is motivated by enabling constructive reasoning is Martin-Löf type theory (also called intuitionistic type theory or constructive type theory).

[3] Arend Heyting (1930). Die formalen Regeln der intuitionistischen Logik., 3 Teile, In: Sitzungsberichte der preußischen Akademie der Wissenschaften. phys.-math. Klasse, 1930, 42–56, 57–71, 158–169.

[4] Thierry Coquand. Kolmogorov's contribution to intuitionistic logic. 10.1007/978-3-540-36351-4_2.

[5] Stanford Encyclopedia of Philosophy The Development of Intuitionistic Logic.

Gentzen's natural deduction and sequent calculus

Gerhard Gentzen was David Hilbert's student and sparked the entire field of structural proof theory with his seminal article and thesis that introduced both natural deduction and sequent calculus - in both intuitionistic and classical variants [6].

Gentzen was not the first to invent natural deduction. Jan Łukasiewicz observed that the axiomatic systems of Frege, Russell and Hilbert do not reflect mathematical practice; mathematicians start their proofs from assumptions, applying "proof figures" to derive new results. His student Stanisław Jaśkowski came up with a calculus of natural deduction, where derivations from assumptions are drawn in boxes [7].

Gentzen pointed out that proofs come with structure: a proof may contain a "detour", and such detours can always be removed, yielding a normal form. In order to demonstrate this, he reformulated natural deduction as the sequent calculus and the normalization through his celebrated cut-elimination theorem. In the sequent calculus, lists of formulae are manipulated (the word sequent is a random word chosen by Kleene to resemble "Sequenz"). Much later, Dag Prawitz proved the corresponding normalization result directly for natural deduction [8].

Natural deduction captures formally and precisely reflect how a mathematician proves things while still admitting a "natural" understanding of what is going on - unlike axiomatic proof systems where structure is artificial and ultimately does not matter as it was obtained by modus ponens and axioms.

[6] Gentzen, Gerhard Karl Erich (1935). "Untersuchungen über das logische Schließen. I". Mathematische Zeitschrift. 39 (2): 176–210. doi:10.1007/BF01201353. (link)

[7] Stanisław Jaśkowski (1934) On the Rules of Suppositions in Formal Logic Studia Logica 1, pp. 5–32 (reprinted in: Storrs McCall (ed.), Polish Logic 1920-1939, Oxford University Press, 1967 pp. 232–258)

[8] Dag Prawitz (1965). "Natural Deduction : A proof theoretical study" (reissued 2006 Dover) \(\newcommand{\andintro}[0]{\color{blue}{(\wedge{\mathrm{\small INTRO}})}}\) \(\newcommand{\andelim}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM}})}}\) \(\newcommand{\andelimleft}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM/1}})}}\) \(\newcommand{\andelimright}[0]{\color{blue}{(\wedge{\mathrm{\small ELIM/2}})}}\) \(\newcommand{\orintro}[0]{\color{blue}{(\vee{\mathrm{\small INTRO}})}}\) \(\newcommand{\orelim}[0]{\color{blue}{(\vee{\mathrm{\small ELIM}})}}\) \(\newcommand{\implintro}[0]{\color{blue}{(\rightarrow{\mathrm{\small INTRO}})}}\) \(\newcommand{\implelim}[0]{\color{blue}{(\rightarrow{\mathrm{\small ELIM}})}}\) \(\newcommand{\notelim}[0]{\color{blue}{(\neg{\mathrm{\small ELIM}})}}\) \(\newcommand{\botelim}[0]{\color{blue}{(\bot{\mathrm{\small ELIM}})}}\)

Intuitionistic natural deduction NJ

We now look at the calculus Gentzen calls NJ, but for propositional logic. For notation, we use $\wedge$ instead of & for conjunction and $\rightarrow$ instead of $\supset$ for implication. Our formulae are defined as:

  • $\bot$ is a formula ('falsehood')
  • a propositional variable $p$ is a formula,
  • if $A$ is a formula, so is $\neg A$
  • if $A, B$ are formulae, so are $A \vee B$, $A \wedge B$ and $A \rightarrow B$.

We shall need to use inference rules and derivations. An inference rule has the form: $$ \frac{X_1 \quad \ldots \quad X_n}{Y} $$

with the meaning that $X_i$ are premises and $Y$ the conclusion of the rule. In natural deduction, each premise and each conclusion of an inference rules is simply a formula. A derivation is an arrangement of inference rules in tree form; the derived formula is at the bottom and inference rules connect each conclusion to premises above the bar. The derivations built up from (instance of) inference rules this way gives a proof of the formula at the root.

The rules are as follows (the notation will be explained below): $$ \cfrac{A\quad B}{A \wedge B}~\color{blue}{(\wedge\mathrm{\small INTRO})} \quad \cfrac{A \wedge B}{A}~\color{blue}{(\wedge\mathrm{\small ELIM/1}}) \quad \cfrac{A \wedge B}{B}~\color{blue}{(\wedge\mathrm{\small ELIM/2})}$$ $$ \cfrac{A}{A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/1}}}) \quad \quad \frac{B}{A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/2}})} $$ $$ \cfrac{\begin{array}[cc]{} \\ {} \\ A \vee B \end{array} \quad \begin{array}[cc]{} [A] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array}\quad \begin{array}[cc]{} [B] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array} } {C}~\color{blue}{(\vee{\mathrm{\small ELIM}})} $$ $$ \cfrac{\begin{array}[cc]{} [A] \\ ~~\vdots \\ \bot \end{array} }{\neg A}~\color{blue}{(\neg{\mathrm{\small INTRO}})} \quad \cfrac{A \quad \neg{A}}{\bot}~\color{blue}{(\neg{\mathrm{\small ELIM}})} $$ $$ \cfrac{ \begin{array}[cc]{} [A] \\ ~~\vdots \\ \phantom{[}B\phantom{]} \end{array}}{A \rightarrow B} ~\implintro % % -> ELIM % \quad \quad \frac{A \rightarrow B \quad A}{B}~\implelim $$ $$ \cfrac{\bot}{A}~\botelim $$

Each rule preserves validity, which ensures that derivations will produce correct results (soundness). If one does not consider the proof-theoretical explanation of what an inference rule is as sufficient, one needs to talk about models and interpretations.

Moreover the collection of rules is systematic in a different way: if we ignore the duplicity of $\color{blue}{(\wedge{\mathrm{\small ELIM}})},\color{blue}{(\vee{\mathrm{\small INTRO}})}$ rules, we can say: for every logical operator $\neg, \wedge, \vee, \rightarrow$ there is one introduction rule and one elimination rule. In a sense, the introduction rule defines the meaning of the operator, and the elimination rule is ultimately a consequence of this definition: it removes precisely what was introduced! These are the "detours" in the structure of proofs, which can be removed.

In order to avoid unnecessary complication, we need to convene that in rule $\notelim$, the conclusion $A$ is always different from $\bot$. If we omit the rule $\notelim$, we end up with minimal logic. Let us now take two rules and read what they are saying:

  • $\andelimleft$ is the (left) elimination rule for $\wedge$, and says: if you have a proof of $A \wedge B$, then you also have a proof of $A$
  • $\andintro$ is the introduction rule for $\wedge$, and says: if you have a proof of $A$ and a proof of $B$, you may combine them into a proof of $A \wedge B$.

Assumptions. As mentioned, a derivation may start in any formula $A$ at the leafs. In the course of the proof, the assumptions may be discharged, as seen in the inference rules $\color{blue}{(\rightarrow{\mathrm{\small INTRO}})}$, $\color{blue}{(\neg{\mathrm{\small INTRO}})}$ and $\color{blue}{(\vee{\mathrm{\small ELIM}})}$. The discharged assumption is then marked $[A]$ to indicate that the conclusion does not depend on it. The rule $\color{blue}{(\rightarrow{\mathrm{\small INTRO}})}$ says if you can prove $B$ from assumption $A$, then we have proven from $A$ follows from $B$, i.e. $A \rightarrow B$. Note that it is possible that there are several other, undischarged assumptions still present in the derivation, and that the discharged assumption may have occurred multiple times in the subderivation, with all those occurrences being discharged at once.

Let's go through an example proof of $X \vee (Y \wedge Z) \rightarrow (X \vee Y) \wedge (X \vee Z)$. For this, we will need a derivation that ends in $\implintro$. $$ \cfrac{ \begin{array}[cc]{} [X \vee (Y \wedge Z)] \\ ~~\vdots \\ (X \vee Y) \wedge (X \vee Z) \end{array} }{ X \vee (Y \wedge Z) \rightarrow (X \vee Y) \wedge (X \vee Z) }{\implintro} $$ For the upper part, we need $\orelim$. $$ \cfrac{ \begin{array}[cc]{} {\phantom{X}} \\ {\phantom{\vdots}} \\ X \vee (Y \vee Z) \end{array} \quad \begin{array}[cc]{} [X] \\ ~~\vdots \\ (X \vee Y) \wedge (X \vee Z) \end{array} \quad \begin{array}[cc]{} [Y \wedge Z] \\ ~~\vdots \\ (X \vee Y) \wedge (X \vee Z) \end{array} }{ (X \vee Y) \wedge (X \vee Z) }{\orelim} $$ The two branches are now easy: $$ \cfrac{\cfrac{X}{(X \vee Y)}\orintro\quad\cfrac{X}{(X \vee Z)}\orintro} {(X \vee Y) \wedge (X \vee Z)}{\andintro} $$ $$ \cfrac{\cfrac{\cfrac{Y \wedge Z}{Y}\andelimleft}{(X \vee Y)}\orintro\quad\cfrac{\cfrac{Y \wedge Z}{Z}\andelimright}{(X \vee Z)}\orintro} {(X \vee Y) \wedge (X \vee Z)}{\andintro} $$

Intuitionistic treatment of negation. What if we treated $\neg A$ as an abbreviation for $A \rightarrow \bot$? Then the the role of $\color{blue}{(\neg{\mathrm{\small INTRO}})}$ is played by $\color{blue}{(\rightarrow{\mathrm{\small INTRO}})}$ and the role of $\color{blue}{(\neg{\mathrm{\small ELIM}})}$ is played by $\color{blue}{(\rightarrow{\mathrm{\small ELIM}})}$. In fact, every derivation in the language with primitive $\neg$ could be rewritten into a derivation for the language where $\neg$ is an abbreviation. Neat!

Normal form. A natural deduction proof is said to be in normal form if there is no detour, i.e. if there is no formula that is both the conclusion of an introduction rule and at the same time a major premise of an elimination rule. This is due to what Prawitz calls the inversion principle: whenever there is an elimination rule which operates on the result of an introduction rule above, then the results of that elimination rule were already present among the premises of the introduction rule. A normal derivation will - roughly - take apart the assumptions at the leaves with elimination rules and then build up the conclusion with introduction rules. (The $\orelim$ rule complicates this a little, since it passes through a formula $C$ from its minor premises; the proof by induction on the structure of derivations will transform such proofs by rewriting them and push the application of $\orelim$ downwards).

Curry-Howard correspondence. As discussed last time, we can assign terms of simply-typed $\lambda$-calculus to these derivations. Propositions of IPC are then exactly the types of simply-typed $\lambda$-calculus, with $\wedge$ product types and $\vee$ for sum (coproduct) types. Or, from the vantage point of logic, $\lambda$-terms are proof terms. Normalization of proofs corresponds to normalization of $\lambda$-terms; $\beta$-reduction corresponds to removal of a detour involving $\implintro$ and $\implelim$.

Still on Curry-Howard, it's a bit less obvious how to interpret $\botelim$ as a typing rule; the answer is that it needs an $\mathsf{abort}$ constant which (similar to throwing an exception) passes control to an imaginary top level function that never terminates. This and more is explained succinctly in Zena Ariola's slides [8], but I do want to point out how this yields a neat programmer's intuition for the convention regarding $\botelim$: since obviously an program $\mathsf{abort}(\ldots(\mathsf{abort}(t)\ldots)$ could always be replaced by a simple $\mathsf{abort}(t)$.

[8] Zena Ariola. The interplay between logic and computation. slides from a talk at "Women in Logic 2019".

Natural deduction in sequent calculus style

There is something odd about the above presentation: we have propositions at the leaves of a derivation which are temporarily treated as assumptions and then cease to be assumptions when they are discharged. It is also possible to not discharge them, or discharge only some of them when making a proof step. Here, we present an alternative formulation that makes it clear what happens with assumptions at every inference step.

(I thought for a long time that it was type theorists or programming language researchers who invented this, suspecting typographical reasons, but to my surprise I learned that it was in fact Gentzen [9] himself in 1936 who introduced this alternative in the paper where he proved the consistency of Peano arithmetic using transfinite induction.)

We shall present now intuitionistic natural deduction "in sequent calculus style". Derivations are still tree-shaped, however instead for formulas, premises and conclusions are sequents of the form $\Gamma \vdash A$ where $\Gamma$ is a (possibly empty) set of formulae. The meaning of a sequent is "from assumptions $\Gamma$, one can (intuitionistically) prove $A$". We write the union of such sets with a comma, omit braces around singleton sets and treat $\neg A$ as abbreviation for $A \rightarrow \bot$ so we don't have to give any additional rules. Then, the rules look like this: $$ \cfrac{\Gamma \vdash A\quad \Delta \vdash B}{\Gamma, \Delta \vdash A \wedge B}~\color{blue}{(\wedge\mathrm{\small INTRO})} \quad \cfrac{\Gamma \vdash A \wedge B}{\Gamma \vdash A}~\color{blue}{(\wedge\mathrm{\small ELIM/1}}) \quad \cfrac{\Gamma \vdash A \wedge B}{\Gamma \vdash B}~\color{blue}{(\wedge\mathrm{\small ELIM/2})}$$ $$ \cfrac{\Gamma \vdash A}{\Gamma \vdash A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/1}}}) \quad \quad \frac{\Gamma \vdash B}{\Gamma \vdash A \vee B}~\color{blue}{(\vee{\mathrm{\small INTRO/2}})} $$ $$ \cfrac{\Gamma \vdash A \vee B \quad\quad \Delta, A \vdash C \quad\quad \Theta, B \vdash C}{\Gamma, \Delta, \Theta \vdash C}~\color{blue}{(\vee{\mathrm{\small ELIM}})} $$ $$ \cfrac{\Gamma, A \vdash B}{\Gamma \vdash A \rightarrow B}~\implintro % % -> ELIM % \quad \quad \cfrac{\Gamma \vdash A \rightarrow B \quad \Delta \vdash A}{\Gamma, \Delta \vdash B}~\implelim $$ $$ \cfrac{\Gamma \vdash \bot}{\Gamma \vdash A}~\botelim $$

These rules make it clear that it is now "compulsory" to discharge assumptions. When a rule has multiple premises, the assumptions are combined. By choosing broad enough set of assumptions ahead of time (weakening) the different sets of assumptions can be assumed to be the same. When moving towards sequent calculus, one can do away with the treatment of assumptions as sets and treat them as ordered sequences but with structural rules like exchange and contraction; removing one or both structural rules then yields sub-structural logic and proof theory.

[9] Gerhard Gentzen (1936) "Die Widerspruchsfreiheit der reinen Zahlentheorie"


We are at the end of our tour; I hope this can provide a good starting point for studying the multiple connections between $\lambda$-calculus, logic and proof theory.

Sound reasoning - preferably automated - and some form of formal logic and proof are going to be key when modeling some domain and constructing correct software for it. Programmers are used to deal with precise formulations (programming languages, specs) but only a subset will be interested in the "formalization of all mathematics"; a programmer with a basic understanding of simply-typed $\lambda$-calculus should ideally have a very clear path to understanding logic, proof assistants, type theory - I don't think that this is the case today.

The activities in constructing models and software may be similar to mathematics, but they are certainly not the same. Each domain is a mathematical universe, with random and often contradictory pieces of information waiting to be organized in "islands of consistency". Hopefully we will one day reach some widespread understanding and usable tool support for formal reasoning that can serve both practical informatics as well as formalizing mathematics.

[If you want to discuss, the HN thread is a great place to leave comments]


Higher-order Logic and Equality.

In this post, I want to sketch a particular perspective on $\lambda$-calculus and higher-order logic and Church's simple theory of types. I have a few motivations to write this up. One of them is that as an undergrad (ages ago), upon encountering first-order predicate logic, I had endlessly wondered why one couldn't use "boolean functions" and something like functional programming for logic. It was only much later that I discovered Church's 1940 typed $\lambda$ calculus was in a sense, just that.

When people think of the connection between $\lambda$-calculus and logic, the first thing that comes to their mind may well be the Curry-Howard correspondence. Similarly, the first thing that people may think of when hearing type theory is Martin Löf type theory. So another motivation is to have a place that describes higher-order logic a.k.a. simple type theory - which after all was the first appearance of typed $\lambda$ calculus - and how there are multiple ways to relate it to formal logic and proofs.

Preliminary: Natural Deduction

Natural deduction [wikipedia], in short, is a system of reasoning involving inference rules to derive new true statements from existing ones. What sets them apart from a Hilbert system [wikipedia] is that logical operators themselves are described using inference rules, proofs are trees (derivations) formed from such rule.

In this journey, we present natural deduction rules in sequent calculus style: assumptions are tracked in a context $\Gamma$.

Minimal Logic

Minimal logic [wikipedia], also referred to as the positive fragment of intuitionistic logic, is a deductive system for propositional logic that does not contain any inference rule for falsehood. Here are the rules: $$ \frac{}{\Gamma, A \vdash A}\quad\quad \frac{\Gamma, A \vdash B}{\Gamma \vdash A \rightarrow B} $$ $$ \frac{\Gamma \vdash A \rightarrow B \quad \Gamma \vdash A}{\Gamma \vdash B}$$ We could have added conjunction $\wedge$ or disjunction $\vee$ or a symbol $\bot$ to stand for a false proposition, along with the convention that $\neg A = A \rightarrow \bot$; it would still be minimal logic. "Full" intuitionistic logic has a rule that goes from $\bot$ to an arbitrary proposition $A$ (ex falso quodlibet); and other rules, like double negation elimitation, get us classical logic.

The naming scheme that calls this "minimal" is consistent with Dag Prawitz's "Natural Deduction: A Proof-Theoretical Study." Yet, I believe it is fairly common to call this kind of logic intuitionistic logic, even though there is no rule for $\bot$.

The proofs for this logic have a structure that is familiar to all students of functional programming languages.

Simply typed $\lambda$-calculus

We now follow the celebrated "propositions-as-types" or Curry-Howard correspondence and assign names (variables) to assumptions, terms (expressions) to proofs, using this language: $$ s, t ::= x \enspace \vert \enspace (\lambda x: T. t) \enspace \vert \enspace(s\ t) \enspace $$ $$ T ::= b_i \enspace \vert \enspace T \rightarrow T $$

Here $b_i$ are some given base types. Our annotated inference rules: $$ \frac{}{\Gamma, x:A \vdash x:A}\quad\quad \frac{\Gamma, x:A \vdash s:B}{\Gamma \vdash (\lambda x:A. s) : A \rightarrow B} $$ $$ \frac{\Gamma \vdash s:A \rightarrow B \quad \Gamma \vdash t:A}{\Gamma \vdash (s\ t):B}$$ A term encodes the entire derivation (or natural deduction proof) of the proposition/type.

Reduction as proof normalization

From the Curry-Howard angle, functional programming is "proof normalization": derivations that contain redundant steps get simplified to smaller derivations. The redundancy is always of the form that part of the derivation needs an assumption $x:A$, and another part of the derivation actually "produces" a term of type $A$. In more detail, consider a derivation $\Gamma \vdash (\lambda x:A. s): A \rightarrow B$ and a derivation $\Gamma \vdash t: A$, which can be combined to yield $\Gamma \vdash ((\lambda x:A. s)\ t) : B$. The detour can be removed by taking every occurrence of $x$ within $s$ and replacing it with $t$. This yields a smaller derivation $\Gamma \vdash s[t/x] : B$. If two derivations stand in this relation, we call them $\beta$-equivalent. $$ ((\lambda x:A. t)\ s) \approx_\beta s[t/x] $$

Here $[t/x]$ is notation for substituting $t$ for $x$, and we hurry to point out that care must be taken that we may need to rename bound variables. Two terms that only differ in the names of their bound variables are $\alpha$-equivalent and we will pretend they are equal. If we used de Bruijn indices instead of variable names, there would be no need for $\alpha$-equivalence.

[Digression: the identity above is not an algebraic identity - these are syntactic objects which involve variable binding. Sometimes it is ok to treat lambda-calculi as an algebra, and this leads to combinatory logic, see neelk's answer on stackexchange and Selinger's paper.]

If we give a direction to the identity, we obtain $\beta$ reduction: $$((\lambda x:A. t) s) \Rightarrow_\beta s[t/x]$$ We can show that $\Rightarrow_\beta$ is strongly normalizing and therefore terminating reduction relation, and that $\Rightarrow_\beta$ is confluent (Church-Rosser property), so all reductions end up in the same normal form. This yields a decision procedure for the "word problem": to find out whether $s \approx_\beta t$, reduce both to normal forms and check that they are equal.

We now shift our perspective, leave the Curry-Howard correspondence aside and look at the theory of equality between these expressions. \(\newcommand{\qeq}[0]{\color{blue} {\mathrm{eq}}}\)

Higher-order Logic, using Equality alone

We are going to build a calculus for classical logic, based on our language. Our development here follows the $Q_0$ type theory (due to Peter Andrews). This is a type theory which is close to Church's 1940 simply-typed $\lambda$ calculus, but uses only equality and a description operator. We will omit the description operator.

We need two base types: $\iota$ for "individuals" and $\omicron$ for "propositions". Since the logic is classical, every proposition will be either true of false, so $\omicron$ can be seen as the type these "truth values". With this setup, we can view any term of type $\iota \rightarrow \omicron$ as describing a set of individuals, or alternatively, a predicate. This resembles the characteristic function of a set in mathematics and also what a programmer calls a Boolean function.

[ The name "higher-order" is appropriate, since this logic lets us also talk about set of sets of individuals, via terms of type $(\iota \rightarrow \omicron) \rightarrow \omicron$ and so on. This use also matches the terminology "higher-order function", which is not a coincidence. Note that you can quantify over "smaller-order predicates", but you can't express a "set of all sets."]

The only built-in/primitive notion is a logical constant $\qeq_\alpha: \alpha \rightarrow \alpha \rightarrow \omicron$ which serves as equality operator for any type $\alpha$. Before we build propositions out of logical operators and quantifiers, let us write down types of such operators, with the common convention that $\alpha \rightarrow \beta \rightarrow \gamma$ stands for $\alpha \rightarrow (\beta \rightarrow \gamma)$ the types would be: $$ \newcommand{\qtrue}[0]{{\color{blue} {\mathrm{true}}}} \newcommand{\qfalse}[0]{{\color{blue} {\mathrm{false}}}} \newcommand{\qnot}[0]{{\color{blue} {\mathrm{not}}}} \newcommand{\qor}[0]{\color{blue} {\mathrm{or}}} \newcommand{\qand}[0]{\color{blue} {\mathrm{and}}} \newcommand{\qimplies}[0]{\color{blue} {\mathrm{implies}}} \begin{align*} \qtrue:\ & \omicron\\ \qfalse:\ & \omicron\\ \qnot:\ & \omicron \rightarrow \omicron\\ \qand:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \qor:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \qimplies:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \end{align*} $$ Using again a notational convention that $((r~s)~t)$ is written $r~s~t$, the tautology $A \vee B \rightarrow B \vee A$ is expressed as $\qimplies~(\qor~A~B)~(\qor~B~A)$.

What about quantifiers? Since we have $\lambda$, for variable binding, there is no need for other variable binders like $\forall$ or $\exists$. Instead we introduce universal and existential quantification using two higher-order operators: $$ \newcommand{\qPi}[0]{\color{blue} \Pi} \newcommand{\qSigma}[0]{\color{blue} \Sigma} \begin{align*} \qPi_\alpha:\ & (\alpha \rightarrow \omicron) \rightarrow \omicron\\ \qSigma_\alpha:\ & (\alpha \rightarrow \omicron) \rightarrow \omicron\\ \end{align*} $$ So $\forall x .\exists y. x = y$ would be expressed as $\qPi \lambda x. \qSigma \lambda y. \qeq~x~y$. People studying classical logical systems frequently "apply Occam's razor" and try to define as few thing as possible. The $Q_0$ way is to play Occam's razor as much as possible. All logical operators are "implemented" using only $\qeq$. $$ \newcommand{\tprop}[0]{\omicron} \begin{align*} \qtrue &:= \qeq_{\tprop \rightarrow \tprop \rightarrow \tprop} \qeq_\tprop \qeq_\tprop\\ \qfalse &:= \qeq_{\tprop \rightarrow \tprop} (\lambda x:\tprop. \qtrue) (\lambda x:\tprop. x)\\ \qnot &:= \qeq_{\tprop \rightarrow \tprop} \qfalse\\ \qand &:= \lambda x:\tprop. \lambda y:\tprop. \\ &\quad \quad \qeq_{\tprop \rightarrow \tprop \rightarrow \tprop} \\ &\quad \quad \quad (\lambda z: \tprop \rightarrow \tprop \rightarrow \tprop. z~\qtrue~\qtrue) \\ &\quad \quad \quad (\lambda z: \tprop \rightarrow \tprop \rightarrow \tprop. z~x~y) \\ \qor &:= \lambda x:\tprop. \lambda y:\tprop. \\ &\quad \quad \qnot~(\qand~(\qnot~x)~(\qnot~y))\\ \qimplies &:= \lambda x:\tprop. \lambda y:\tprop. \qeq~x~(\qand~x~y)\\ \qPi_\alpha &:= \qeq_{(\alpha \rightarrow \tprop) \rightarrow (\alpha \rightarrow \tprop) \rightarrow \tprop}~(\lambda x : \alpha. \qtrue)\\ \qSigma_{\alpha} &:= \lambda x: \alpha \rightarrow \tprop. \qnot~~\qPi_{\alpha}~~\qnot~x.\\ \end{align*} $$

One advantage of having everything defined in terms of $\qeq$ is that when we give inference rules to explain how reasoning take place, they all involve only $\qeq$. We are using the notation of inference rules to describe an equational theory. The only assertions $\Gamma \vdash r$ that are derived are (more or less - up to $\beta$-equivalence) of the form $\Gamma \vdash \qeq_\alpha~s~t$, but we explicitly write some assumptions that we make about types. Here they are: $$ \frac{\Gamma \vdash s : \tprop \rightarrow \tprop}{\Gamma \vdash \qeq_\tprop~(\qand~(s~\qtrue)~(s~\qfalse))~(\qPi_\tprop~\lambda x : \tprop.~s~x)} $$ $$ \frac{\Gamma \vdash s : \alpha \quad \Gamma \vdash t:\alpha \quad \Gamma \vdash z: \alpha \rightarrow \tprop}{\Gamma \vdash \qimplies~(\qeq_\alpha~s~t)~(\qeq_\tprop~(z~s)~(z~t))} $$ $$ \frac{\Gamma \vdash s : \beta \rightarrow \alpha \quad \Gamma \vdash t:\beta \rightarrow \alpha}{\Gamma \vdash \qeq~(\qeq_{\beta \rightarrow \alpha}~s~t) (\qPi_\alpha \lambda x:\alpha.\qeq_\alpha~(s~x)~(t~x))} $$ $$ \frac{}{\Gamma \vdash \qeq~((\lambda x : \alpha. s)~t)~(s [t/x])} $$ $$ \frac{\Gamma, x: \alpha \vdash r : \beta \quad \Gamma \vdash r[A/x] \quad \vdash \qeq_\alpha~A~B}{\Gamma \vdash r[B/x]} $$

The first rule is an axiom that says $\tprop$ consists of $\qtrue$ and $\qfalse$. This looks like a data type specification for $\tprop$. The second rule is an axiom that says that if two objects are equal then they have the same properties (Leibniz would say "indiscernibility of identicals"). The third rule is an axiom that says two expressions of function type are equal, if they agree on all arguments; this is the extensional definition of function equality.

The fourth rule is an axiom for $\beta$-conversion. We're conveniently hiding all the naming stuff again behind $[t/x]$ notation; but essentially this lets us apply $\beta$-reduction as we please.

The fifth rule is a proper inference rule - the only actual inference rule, which has a premiss $\Gamma \vdash r$ - that says "equals can be replaced by equals", anywhere in a term. This is where the actual reasoning happens. For example, a proposition $\forall x:\alpha. P$ is encoded as a $\qPi \lambda x:\alpha. P$ and proving it "within the system" is literally showing that $\lambda x:\alpha. P$ is equal to $\lambda x:\alpha. \mathsf{true}$.

Wrapping up

I presented in a notation/style that is different from both Andrews's book or the Stanford Encyclopedia of Philosophy's entry on Church's type theory. It is closer to the way a typed programming languages would be formulated.

We have only scratched the surface, but sketched how a whole formal system of reasoning can indeed be based on $\lambda$ calculus playing the role of characteristic function. And it didn't actually involve Curry-Howard correspondence. I like to think of this situation as "multiple levels of logic," and topics like higher-order unification and meta-logic/logical frameworks make use of this.

I hope these notes can help raise awareness for what is going when simple type theory and higher order logic. A lot more can be said, but that is for another post. If you are interested to dive deeper, here are some references that talk about higher-order logic.

  Stanford Encyclopedia of Philosophy's entry on Church's type theory

  Jean Gallier. "Constructive Logics. Part I: A Tutorial on Proof Systems and Typed λ-Calculi", freely accessible on his logic papers page.

  William M. Farmer "The Seven Virtues of Type Theory" https://doi.org/10.1016/j.jal.2007.11.001

  Peter Andrews "An Introduction to Mathematical Logic and Type Theory: to Truth through Proof" (2002)

  Alonzo Church. "A Formulation of the Simple Theory of Types". J. Symb. Logic, 5 (1) (1940)

  Dag Prawitz. "Natural deduction: A Proof-Theoretical Study" (2006) Dover - new issue of the celebrated 1965 work.