# Intro


I am going to refer to material in "Higher order logic and equality" often. Like in that post, there is actually no original material here; I am just gathering definitions that is probably well-known to some grad students and experts (yet nobody calls it standard since neither higher-order logic nor intuitionistic logic are known very well outside certain circles who deal with formal logic and proof assistants and the like). This is my feeble attempt at an exposition.

# The Logic of a Topos

We take a $\lambda$ calculus with surjective pairing and a type $\omicron$ which we take as a substitute for "truth values" (topos theory texts use $\Omega$ but we want easy comparison with the classical version linked above, and we spell out $\qPi \lambda x. t$ without setting up an abbreviation $\forall x. t$ and we also spell out $\qSigma \lambda x. t$ without setting up an abbreviation $\exists x. t$). We want to set this up in an intuitionistic manner, which means doing away with the idea that $\omicron$ stands for two truth values, and not using classical logical equivalences to define $\qor$ and $\qSigma$. The terms would then correspond to a particular cartesian-closed category, the free topos.

We again want to define everything in terms of equality $\qeq_\alpha: \alpha \rightarrow \alpha \rightarrow \omicron$. What is this equality doing? In our langugage, it talks about equality of terms, and in categorical logic, terms correspond to arrows and thus equality should talk about equality of arrows. (There is some detail hidden here as our language will also let us talk about "global elements" $1 \stackrel{c}{\longrightarrow} C$ which are written as constant terms $c: C$). We take the following elements of our logical language: \begin{align*} \qtrue:\ & \omicron\\ \qand:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \qimplies:\ & \omicron \rightarrow \omicron \rightarrow \omicron\\ \qPi_\alpha:\ & (\alpha \rightarrow \omicron) \rightarrow \omicron\\ \end{align*}

... and define them using the only primitive $\qeq$: \begin{align*} \qtrue &:= \qeq_{\tprop \rightarrow \tprop \rightarrow \tprop} \qeq_\tprop \qeq_\tprop\\ \qand &:= \lambda x:\tprop. \lambda y:\tprop. \\ &\quad \quad \qeq_{\tprop \rightarrow \tprop \rightarrow \tprop} \\ &\quad \quad \quad (\lambda z: \tprop \rightarrow \tprop \rightarrow \tprop. z~\qtrue~\qtrue) \\ &\quad \quad \quad (\lambda z: \tprop \rightarrow \tprop \rightarrow \tprop. z~x~y) \\ \qimplies &:= \lambda x:\tprop. \lambda y:\tprop. \qeq~x~(\qand~x~y)\\ \qPi_\alpha &:= \qeq_{(\alpha \rightarrow \tprop) \rightarrow (\alpha \rightarrow \tprop) \rightarrow \tprop}~(\lambda x : \alpha. \qtrue) \end{align*}

These definitions are the same as for classical logic. An intro to topos theory will typically describe $\qand$ and $\qimplies$ using commutative diagrams, but here we are defining the language first. Moreover, the language has only one primitive $\qeq$, though that primitive is instantiated at different types.

# False

Now we look at $\qfalse$, which is also defined the same as in classical case: \begin{align*} \qfalse &: \omicron\\ \qfalse &:= \qeq_{\tprop \rightarrow \tprop} (\lambda x:\tprop. \qtrue) (\lambda x:\tprop. x)\\ \end{align*}

In a topos theory text, one encounters $\forall v:\omicron. v$, which is shorthand for $\qPi\lambda v:\omicron. v$ and actually expands to the very same definition above. The definition that mentions $\qeq$ explicitly has the advantage that it explains what is going on: it morally says "if there is anything other than $\qtrue$ in $\omicron$, then $\lambda x:\omicron. \qtrue$ and $\mathsf{id}_\omicron$ are not the same." So we have a value $\qfalse$, but it need to be the only thing distinct from $\qtrue$ in $\omicron$.

# Not, Or and Exists

On to intuitionistic negation $\qnot$. In intuitionistic logic, we would define negation like this: \begin{align*} \qnot &: \omicron \rightarrow \omicron\\ \qnot &:= \lambda x: \tprop. \qimplies(x, \qfalse) \end{align*}

If we consult the classical version $\qnot := \qeq_{\tprop \rightarrow \tprop} \qfalse$, we again find that this is not that different.

Finally, let's look at $\qor$ and $\qSigma$. We define these using $\qPi$: \begin{align*} \qor &: \omicron \rightarrow \omicron \rightarrow \omicron\\ \qor &:= \lambda x: \omicron. \lambda y: \omicron. \qPi \lambda r:\omicron. \\ & \quad \quad \qimplies(\qand(\qimplies(x, r), \qimplies(y,r)), r) \\ \\ \qSigma_\alpha &: (\alpha \rightarrow \omicron) \rightarrow \omicron \\ \qSigma_\alpha &:= \lambda x: \alpha \rightarrow \omicron. \qPi \lambda r:\omicron. \\ &\quad \quad (\qPi \lambda z:\alpha. \qimplies(\qimplies(x z, r), vr) \end{align*}

We can get an intuitive understanding of these definitions: for $\qor$, if we have a way of obtaining an element $r$ (result) from $x$ and one of obtaining the same from $y$, then also from $(\qor~x~y)$. Similarly for $\qSigma$, if we have a way of obtaining any result $r$ from the "instantiated" predicate $x z$, then also from $\qSigma x$.

The definition of $\qor$ in terms of $\qand$ and $\qPi$ is not very different from natural deduction rule for "or elimination": $$\cfrac{\begin{array}[cc]{} \\ {} \\ A \vee B \end{array} \quad \begin{array}[cc]{} [A] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array}\quad \begin{array}[cc]{} [B] \\ ~~\vdots \\ \phantom{[}C\phantom{]} \end{array} } {C}~\color{blue}{(\vee{\mathrm{\small ELIM}})}$$

Checking the the natural deduction rule for existential elimination and comparing it with the encoding above is left as an exercise.

# The end of the beginning

Higher order logic seems to deserve more attention. We have set up a logical system based on $\lambda$ calculus without making use of the Curry Howard correspondence. While the previous discussion focused on classical higher order logic as introduced by Church in 1940, in this post we looked at intuitionistic higher order logic. The interest in the latter is at least very heavily influenced by topos theory.

It is slightly confusing that when discussing elementary toposes, one talks of first-order logic when in reality the logic is inherently higher-order (because a topos is cartesian closed). I guess the charitable perspective is what people mean is "you can do everything that you would want to do in formal set theory using first-order logic."

The Lambek and Scott book "Introduction to higher order categorical logic" dwells a bit on the fact that it is always possible to get a formulation of the internal language of a topos using equality. Andrews $Q_0$ develops simple type theory / higher order logic using a single equality operator $Q$. Various theorem proving systems have been based on this (TPS and HOL).

One could say that the way higher-order logic admits the definition of logical connectives in terms of other is an "encoding" of natural deduction rules. Could one use this idea to build a logical framework, a representation of derivations? Indeed this is what is underlying Isabelle/Pure framework and $\lambda$Prolog.

It would seem worthwhile to write a proper book that takes a logical perspective on the $\lambda$ calculus, developing higher order logic as language, category theory as semantics and arriving at elementary toposes as models. It would probably feel very elementary and pedestrian in the academic world, but I do think it would be useful. It feels to me that there is so much "background knowledge" to gather the relevant theory on logic and computation that most practitioners will not have the energy (and this says something about the potential for adopting of formal methods). Until such a book arrives (I'm not sure if I would have the energy), I hope my out-of-order blog posts provide some help on the reader's journey. There are certainly a lot of of connections between logic, language, types and semantics that are left to be explored, besides Curry Howard and dependent types.

## 27.9.20

### Programmers love Higher-Order Logic

My previous take on higher-order logic was classical, channeling Church's simple theory of types and Andrews $Q_0$. In that last post, I was hinting at motivation. So I want to expand a bit now how this stuff can be useful for building software at large. I am posting some ramblings here.

# Equality is Fundamental

There are two different "levels" at which a programmer deals with some kind of reasoning that involves equality:

• while writing code: need to be able to tell when two objects are the same.
• when designing or arguing that code is correct: reasoning whether behavior like output values conform to a (however implicit) specification

In both cases, one needs to be able to identify whether objects (or behaviors or input-output pairs) are "the same" as expected ones. But the latter activity, reasoning about program correctness, happens in some form of logic. Much that has been written and is being written on formal logic is about mathematics and mathematical convention. This is true whether we talk about semantics of programming languages or program specification.

But for programmers, there are benefits to knowing formal reasoning even if one is not interested in the foundations of math, axiomatizations of set theory etc. It is no secret that in mathematics as in programming, you run into situations where there are multiple different ways to represent some data, and one needs to chose or translate between those representations. While programmers resign to writing "proto-to-proto" translation over and over (which often are not as trivial as it seems), in mathematics it is customary to leave out the awkward details, introduce "abuse of notiation" and quotienting to keep arguments informal, but concise.

The heart and soul of much mathematics consists of the fact that the “same” object can be presented to us in different ways. [1]

[1] Barry Mazur. When is one thing equal to some other thing?

# Equality in First Order Logic

We take a leap towards "formal systems", where objects are represented using (strings of) symbols, along with strict rules to form statements and derive new true statements from existing ones.

A typical definition of a first-order logic language contains constant and function symbols, predicate symbols, logical operators.

There is a "design choice" here whether equality is considered as built-in, magically denoting exactly the equality in the underlying sets, or whether equality is a predicate with certain special properties. Wikipedia" talks about "first-order logic with equality" (when built-in) and first-order logic without equality (when it is not).

When equality does not come built-in, one needs to add axioms. The equality axioms are special, in the sense that they have to be preserved by all the function and relation symbols. This looks like:

• $\mathsf{eq}$ is reflexitve, symmetric, transitive
• if $\mathsf{eq}(x_1, y_1), \ldots, \mathsf{eq}(x_n, y_n)$, then $\mathsf{eq}(f(x_1,\ldots,x_n),f(y_1,\ldots,y_n)$ for all function symbols $f$ of arity $n$
• if $\mathsf{eq}(x_1, y_1), \ldots, \mathsf{eq}(x_n, y_n)$, then $R(x_1,\ldots,x_n) \Leftrightarrow R(y_1,\ldots,y_n)$ for all relation symbols $R$ of arity $n$

With a "defined" equality, is it is well possible to interpret logical statements involving $\mathsf{eq}$ in such a way that two distinct elements are regarded as equal according to $\mathsf{eq}$.

# Coding our own equality

When we are talking about practical software development, there is really no hope of defining a platonic ideals; yet when techniques of reasoning, informaticians are in various ways confined to results and methods that formalizing mathematics.

Take this example: We'd like our terms to represent free monoid on $X$, so we'd like these terms to satisfy the following equations (or identities): $$s \cdot 1 = s$$ $$t = 1 \cdot t$$ $$(s \cdot t) \cdot u = s \cdot (t \cdot u)$$

Consider definitions in some fictional language to express the "interface" of a monoid: $$\begin{array}{ll} \mathsf{interface}~ M[X] \\ \quad \mathsf{unit}(): M[X]\\ \quad \mathsf{emb}(x: X): M[X]\\ \quad \mathsf{dot}(s: M[X], t: M[X]): M[X]\\ \quad \mathsf{eq}(s: M[X], t: M[X]): \mathsf{bool} \end{array}$$

Here, $\mathsf{unit}$ stands for $1$, $\mathsf{emb}(x)$ is for embedding $x$ and $\mathsf{dot}$ is the combine operaion in $MX$. If our fictional language has algebraic data types, we can use those to build terms which represent the algebraic theory of monoids: $$\begin{array}{ll} \mathsf{data}~ \mathsf{Mon}[X] \\ \quad \mathsf{Unit}(): \mathsf{Mon}[X]\\ \quad \mathsf{Emb}(x: X): \mathsf{Mon}[X]\\ \quad \mathsf{Dot}(s: \mathsf{Mon}[X], t: \mathsf{Mon}[X]): \mathsf{Mon}[X]\\ \quad \mathsf{eqMon}(s: \mathsf{Mon}[X], t: \mathsf{Mon}[X]): \mathsf{bool} \end{array}$$

It is now really obvious what these constructors stand for, but we note that while the equality $\mathsf{eqMon}$ returns true whenever two terms are equal when considered as elements of the free monoid, there is at least one more equality relation on terms that is different ("structural equality" or "syntactic equality", which will consider $\mathsf{dot}(\mathsf{dot}(s, t), u)$ as distinct from $\mathsf{dot}(s, \mathsf{dot}(t, u))$).

Mathematically speaking, we want to consider the quotient set $\mathsf{Mon}[X] / \mathsf{eqMon}$, and work with equivalence classes.

# Formulas = Functions to Truth Values

After all this talk about equality in algebraic theories, we want to turn equality in the context of logic, specifically higher-order logic.

In classical logic, a proposition (logical statement) stands for a truth value, and there are exactly two of then (the two-value type "bool"). Boole was the first to associate propositional connectives with algebraic operations. In this perspective, the the truth values are part of the language, even though this was later abandoned as in Tarski-style semantics.

Programmers are quite used to having functions return $\mathsf{bool}$. In terms of mathematical logic, our custom-made equality "relation" $\mathsf{eqMon}$ above is in fact a function to the set of truth values. Could we not do away with relations and talk in terms of functions?

Yes, we can; using truth values and functions in a formal language of propositions is the essential summary of classical higher order logic (or Church's simple theory of types. There is a more compelling example: if you consider "set builder notation" $$\{ x \in A \ |\ \phi(x) \}$$ then this describes a subset of $A$. We can choose to describe $\phi$ as defining a (unary) relation, but viewing it as a function to $\mathsf{bool}$ is no doubt more natural in a context like SELECT x FROM A WHERE  $\phi$ where "relations" are given by their extensions (as tables) and computation in the form of expressions is available.

The point of (classical) higher-order logic is that using the $\lambda$ calculus and truth values, we have everything we need to define the meaning of all logical connectives, and the types prevent logical paradoxes of "naively-untyped set theory". Since the truth values are part of the language, it is possible to define the meaning purely through definitions that involve equality.

More on this and classical higher order logic in that last post on Higher Order Logic and Equality and the next one I'm going to write in a second, where I talk about the intuitionistic version.

I will just end with a note that there is no distinction between "predicate", "relation" and "set" is higher-order logic, and this is a good thing. An expression of type $\iota \rightarrow \omicron$ (where $\iota$ are the individuals and $\omicron$ the truth values) defines a (sub)set of individuals, a predicate on the set of individuals, a unary relation, all at the same time. $$\mathsf{even} := \lambda x:\iota. \Sigma \lambda y:\iota. \mathsf{eq}(x, \mathsf{times}(2, y))$$ $$\mathsf{greater} := \lambda x:\iota.\lambda y:\iota. \Sigma \lambda z:\iota. \mathsf{eq}(x, \mathsf{plus}(y,z))$$ $$\mathsf{lowerbound} := \lambda x:\iota.\lambda y:\iota \rightarrow \omicron. \Pi \lambda z:\iota. \mathsf{implies}(y(z), \mathsf{greater}(z, x))$$

The confusing bit for programmers may be that this thing is called simple type theory when in fact there is only one "sort" of individuals. This makes more sense when considering that it is no problem to come up with multiple "sorts" of individuals but that wouldn't give sets ($\iota \rightarrow \omicron$), sets of sets $(\iota \rightarrow \omicron) \rightarrow \omicron$, sets of sets of sets (and so on) of individuals. It's this distinction of hierarchies of sets/predicates/relations that earn Church's higher order logic the name "simple type theory": only terms with the right types are considered well-formed and meaningful.

## 19.7.20

### Modules, components and their limits

I read Clear explanation of Rust's module system. The TL;DR the module system in rust is a way to (somewhat) break/hide the link between what things are defined and what source files the things are defined in. A rust module provides control over visibility across source files and makes explicit how references reach across "modules".

The post starts with a request to ignore what "module" means in other languages, and this plunged me into thoughts on all the things that are called modules. It is somewhat sad but to be expected that a basic word like "module" (or "package") could never have a standard meaning, despite all these programming languages supposedly built while we have all this PL theory.

The more academic vocabulary has "module" as a language concept that comes with a scope and visibility, a way to bundle types and named values. It's somehwat independent of installation, deployment (the only touch point being support for separate compilation), but - to my knowledge - there is no talk of an "installed-at-a-version" module. So the rust module concept is pretty consistent with that. Arranging a set of modules that fit together is the programmer's problem, not the researcher's.

Alas, there is this thing that happens at the level of Rust crates/npm packages, and it is really a different problem. Nevertheless, those things are also called modules in some languages (C++20 modules, Java modules, go modules).

In Java, source files in the same directory make a Java package and they come with a way to control visibility. The association between files and packages is rigid. More recently, Java modules were added to express dependencies between modules, visibility of packages - something like a "packages of packages", but really more on the level of Rust crates.

I am reminded of "components" (if anybody remembers component-oriented programming) because deployment is part of this game - the notion of dependencies only makes sense in the context of some installation. But then components used to be about runtime services (in addition to code), object brokers and such and here we would actually like to talk about less dynamic things, just code and functionality in the form of source and/or libraries. Still, components are things that get assembled and exist at versions, so I'd say it's an appropriate, if old-fashioned name.

Now, such a (static) component would abstractly describe what npm packages or Rust crates achieve, and also form of Debian development packages (sigh) where you install a C library in binary form and its header files. That gives us even more of a twist now, which is that C++ 20 modules are really modules in the sense of achieving grouping for visibility and scope (check) but at the same time that whole notion has a huge impact on C++ since it should actually make it easier to deploy C++ components. A relevant quote from the C++20 link above:

After a module is compiled once, the results are stored in a binary file that describes all the exported types, functions and templates. That file can be processed much faster than a header file, and can be reused by the compiler every place where the module is imported in a project.

So while a C++20 module is language concept, it is foreseeable that one day all self-respecting newly developed libraries will be deployed as (installed at a version) modules. Oh well. There was really no reason to hope that a standard meaning for "modules" and the concept "above" could be justified in any way, across languages. At least, there remains an insight that it really makes sense to deal with the two levels separately.

I sure wish we could all call the installed-at-a-version thing "component", and the language concept "modules". Just two paragraphs on how that might sound:

Every language needs a module system: a way to deal with grouping, interfaces (in terms of making visible - "exporting" only the interface and keeping the rest internal). Including files was always bad and indicates the lack of a module system. It's when one needs to deal with independently developed, deployed and release units that there comes the problem of dependencies, evolution and versions, and these things are the components. Components are related to modules in that they need to contain at least one of them for talking about the things that are provided and brought in scope.

If there is a dependency between versions of components and there are backwards incompatible (!) changes in the depended-upon component (or a left-pad situation), then this is something that needs to be dealt through changes to the depending component. That isn't something that module systems (or any other language design) can help much with.

## 18.7.20

### Relating partial evaluation, multi-stage programming and macros

Here is some quick fact on two metaprogramming techniques. $$\begin{array}{l} \mathsf{letrec}\ \mathtt{pow}\ =\ \lambda n : \mathsf{Int}. \lambda x : \mathsf{Int}. \\ \quad \mathsf{if}\ (\mathsf{eqInt}\ n\ 0)\ 1\ (x\ * (\mathtt{pow}\ (n - 1)\ x)) \end{array}$$

What can we do with an expression $\mathtt{pow}\ 3$?

## Partial evaluation

The "partial evaluation" answer is to inline the definition, and evaluate what can be evaluated, yielding a residual program. $$\begin{array}{l} \lambda x : \mathsf{Int}. \mathsf{if}\ (\mathsf{eqInt}\ 3\ 0)\ 1\ (x * (\mathtt{pow}\ (3 - 1)\ x)\\ \lambda x : \mathsf{Int}. x * (\mathtt{pow}\ 2\ x)\\ \ldots\\ \lambda x : \mathsf{Int}. x * (x * (x * 1)) \end{array}$$

## Staging

The multi-stage programming answer (or staging) is to rethink $\mathtt{pow}$ as a transformation of code. $$\newcommand{\qquote}[1]{\color{blue}{[|}#1\color{blue}{|]}} \begin{array}{l} \mathsf{letrec}\ \mathtt{pow}\ =\ \lambda n : \mathsf{Int}. \lambda x : \mathsf{\color{blue}{Code[Int]}}. \\ \quad \mathsf{if}\ (\mathsf{eqInt}\ n\ 0)\ \color{blue}{[|}1\color{blue}{|]}\ \color{blue}{[|}(x * (\mathtt{pow}\ (n - 1)\ x)\color{blue}{|]} \\ \mathsf{let}\ \mathtt{mkpow}\ =\ \lambda n : \mathsf{Int}. \qquote{\lambda y : \mathsf{Int}. (\mathtt{pow}\ n \qquote{y})}\\ \end{array}$$

The notation was (probably) not designed to scare you:

• $\qquote{\ldots}$ is quasi-quotation. What is between these brackets is code

I hope these notes can help raise awareness for what is going when simple type theory and higher order logic. A lot more can be said, but that is for another post. If you are interested to dive deeper, here are some references that talk about higher-order logic.

Stanford Encyclopedia of Philosophy's entry on Church's type theory

Jean Gallier. "Constructive Logics. Part I: A Tutorial on Proof Systems and Typed λ-Calculi", freely accessible on his logic papers page.

William M. Farmer "The Seven Virtues of Type Theory" https://doi.org/10.1016/j.jal.2007.11.001

Peter Andrews "An Introduction to Mathematical Logic and Type Theory: to Truth through Proof" (2002)

Alonzo Church. "A Formulation of the Simple Theory of Types". J. Symb. Logic, 5 (1) (1940)

Dag Prawitz. "Natural deduction: A Proof-Theoretical Study" (2006) Dover - new issue of the celebrated 1965 work.