Monday, May 9, 2016

Löb's theorem is (almost) the Y combinator

As a computer scientist, one of the stranger sociological facts you'll encounter is that regular mathematicians tend to dislike it (and its cousin formal logic). The reason for that is nicely encapsulated in the following anecdote that Danny Hillis tells about Richard Feynman and the design of the Connection Machine computer:

[...] Richard had completed his analysis of the behavior of the router, and much to our surprise and amusement, he presented his answer in the form of a set of partial differential equations. To a physicist this may seem natural, but to a computer designer, treating a set of boolean circuits as a continuous, differentiable system is a bit strange. Feynman's router equations were in terms of variables representing continuous quantities such as "the average number of 1 bits in a message address." I was much more accustomed to seeing analysis in terms of inductive proof and case analysis than taking the derivative of "the number of 1's" with respect to time. Our discrete analysis said we needed seven buffers per chip; Feynman's equations suggested that we only needed five. We decided to play it safe and ignore Feynman.

The decision to ignore Feynman's analysis was made in September, but by next spring we were up against a wall. The chips that we had designed were slightly too big to manufacture and the only way to solve the problem was to cut the number of buffers per chip back to five. Since Feynman's equations claimed we could do this safely, his unconventional methods of analysis started looking better and better to us. We decided to go ahead and make the chips with the smaller number of buffers.

Fortunately, he was right. When we put together the chips the machine worked.

I've put the relevant bit in bold. To most mathematicians, inductive proofs are the tool of last resort. You always try to weave together the known properties of your object together in a creative way, and do an exhaustive case analysis only if all else fails. However, to logicians and computer scientists, induction is the tool of first resort – as soon as we have an object, we grind it to dust with induction and reassemble it bit by bit with the properties we want.

Regular mathematicians find this process roughly as appealing as filling in tax forms, but it does come with its compensations. One of those compensations is that we get to collect a wide variety of fixed point theorems, to justify an equally wide variety of recursive constructions.

The modal version of Löb's theorem is one of the more exotic fixed point theorems in logic. It is often glossed as representing the "abstract content" of Gödel's incompleteness theorem, which is the sort of explanation that conceals as much as it reveals. If you look at the proof, though, you'll see that its structure is a very close match for something very familiar to computer scientists – the proof of Löb's theorem is (almost) a derivation of the Y combinator!

To understand this, let's try formulating the fragment of provability logic we need as a natural deduction system. The types we will use are the following: $$ \begin{array}{lcl} A & ::= & b \bnfalt A \to B \bnfalt \mu a.\,A(a) \bnfalt \Box A \end{array} $$

This looks like the simply-typed lambda calculus, extended with recursive types and a special $\Box A$ type former. This is almost, but not quite, correct! $\Box A$ is indeed a modal operator -- in particular, it satisfies the axioms of the K4 modality. However, the recursive type $\mu a.\,A$ is not an ordinary recursive type, because unrestricted recursive types lead to logical inconsistency. Instead, $\mu a.\,A$ is a modal fixed point. That is, it satisfies the isomorphism: $$ \mu a.\,A \simeq A{[\Box(\mu a.\,A)/a]} $$

Hopefully, this will become a bit clearer if we present typing and inference rules for this logic. We'll use a variant of Davies and Pfenning's judgemental formulation of modal logic, modified to support the K4 modality rather than the S4 modality. The basic judgement will be $\judge{\Delta; \Gamma}{e}{A}$, where $\Delta$ are modal variables and $\Gamma$ are ordinary variables. $$ \begin{array}{ll} \rule{ x:A \in \Gamma} { \judge{\Delta; \Gamma}{x}{A} } & \\[1em] \rule{ \judge{\Delta; \Gamma, x:A}{e}{B} } { \judge{\Delta; \Gamma}{\fun{x}{e}}{A \to B} } & \rule{ \judge{\Delta; \Gamma}{e}{A \to B} & \judge{\Delta; \Gamma}{e'}{A} } { \judge{\Delta; \Gamma}{e\;e'}{B} } \\[1em] \rule{ \judge{\Delta; \Gamma}{e}{A{[\Box(\mu a.\,A)/a]}} } { \judge{\Delta; \Gamma}{\fold(e)}{\mu a.\,A} } & \rule{ \judge{\Delta; \Gamma}{e}{\mu a.\,A} } { \judge{\Delta; \Gamma}{\unfold(e)}{A{[\Box(\mu a.\,A)/a]}} } \\[1em] \rule{ \judge{\Delta; \Delta}{e}{A} } { \judge{\Delta; \Gamma}{\box(e)}{\Box A} } & \rule{ \judge{\Delta; \Gamma}{e}{\Box A} & \judge{\Delta, x:A; \Gamma}{e'}{C} } { \judge{\Delta; \Gamma}{\letv{\box(x)}{e}{e'}}{C} } \end{array} $$

The two key rules are the introduction and elimination rules for $\Box A$.

The elimination rule is the standard Davies-Pfenning rule. It says that if you can prove $\Box A$, you can add an $A$ to the modal context $\Delta$ and carry on with your proof.

The introduction rule is different, and weaker than the DP S4 rule. It says you can introduce a $\Box A$, if you can prove $A$ only using the modal hypotheses in $\Delta$. So the variables in $\Gamma$ are deleted, and the ones in $\Delta$ are copied into the ordinary hypothetical context. This setup is weaker than the DP introduction rule, and captures the fact that in K4 you can't derive the comonadic unit rule that $\Box A \to A$.

Once we have this, it becomes possible to derive Löb's theorem. Before we do that, let's look at the derivation of the Y combinator. To derive a fixed point at a type $P$, we introduce a recursive type $μa.\, a \to P$, and then we can add fold's and unfold's to the standard definition of the Y combinator:

let y : (P → P) → P = 
  λf. let f' = f in                   
      let g =                         
         (λr. let r' = r in          
              f' (unfold r' (r')))
      in 
      (g (fold g))     

You could also write this more compactly as λf. let g = λx. f (unfold x x) in g (fold g), but I have introduced some spurious let-bindings for a reason which will become apparent in a second.

Namely, virtually the same construction works with Löb's theorem! We will take the modal recursive type $\mu a.\,a \to P$, and then add some boxing and unboxing to manage the boxes (I'll annotate the variables with their types so that you can see what's going on):

let lob : □(□P → P) → □P = 
  λf. let box(f)' = f in                      // f' modal, □P → P
      let box(g) =                            // g modal, □(μa. a → P) → P)
         box(λr. let box(r') = r in           // r' modal, μa. a → P
                 f' box(unfold r' (box r')))  // application type P
      in 
      box(g (fold g))     

Note that if you took $\Box$ to be the identity constructor, and erased all the $\box$'s from this term, then this proof of Löb's theorem is actually exactly the same as Curry's Y combinator!

However, people more typically take $\Box$ to mean something like "quotation" or "provability". What proof shows is that there's nothing mysterious going on with self-reference and quotation -- or rather, it is precisely the same mystery as the universality of the lambda-calculus.

Postscript: One of the nice things about blogging is that you can be more informal than in a paper, but it's also one of the pitfalls. I got an email asking about the history of this system, and of course a blog post doesn't have a related work section! Basically, a few years ago, I worked out a calculus giving a syntax for applicative functors. An applicative is basically a monoidal functor with a strength, and I noticed that the modal logic K is basically a monoidal functor without a strength. There matters stood until last year I read Kenneth Foner's Haskell Symposium paper Getting a Quick Fix on Comonads, which was about programming with Löb's theorem. This was interesting, and after Wikipedia's proof of Löb's theorem, I saw that (a) the proof was basically Curry's paradox, and (b) I could adapt the rules for applicatives to model the modal logic K4, which could (with a small bit of tinkering) have a very similar calculus to applicatives. (In fact, the proof I give of Löb's theorem is a transliteration of the proof on Wikipedia into the system in this blog post!)

I should emphasize that I haven't really proved anything about these systems: I think the proofs should be straightforward, but I haven't done them. Luckily, at least for K, I don't have to -- recently, Alex Kavvos has put a draft paper on the arXiv, System K: a simple modal λ-calculus, which works out the metatheory of K with a dual-zone Davies/Pfenning-style calculus. If you need syntax for applicative functors minus strength, do take a look!

7 comments:

  1. Neat! Is there a simple way to see that the modal mu *is* consistent?

    ReplyDelete
  2. I like this take on Lob's theorem. There was a nice related paper at the 2015 Haskell Symposium: http://dl.acm.org/citation.cfm?doid=2804302.2804310

    ReplyDelete
  3. Very nice! I wonder though if you can remove the explicit occurrences of ◻ in the fold/unfold rules to get more orthogonal rules by restricting the grammar of types to:

    A ::= A→B | μt.A(t) | ◻T
    T ::= A | t

    To me, at least, this also seems to more directly capture a modal restriction on recursion. I think it's also at least slightly more general. I'd guess consistency can be shown by induction on the modal nesting of types.

    ReplyDelete
  4. Actually, this isn't more general, the more general version is where you introduce the recursive type variable in the modal context, hence it only becomes available once you pass to a reachable world via the box rule. You can do this with the original syntax of types, but the well-formedness judgement for types should check that the modal scoping is correct, hence including modal type assumptions that are copied to ordinary assumptions when you pass through a ◻ type constructor. It seems pretty natural to have such "modal type variables" in the context of a modal type system, but in S4 I guess they would be equivalent to the ordinary type variables.

    ReplyDelete
  5. Hmmm. induction |-> coinduction (?!)

    ReplyDelete
  6. Hi Rowan, have you written up how to handle modal type variables anywhere? It's not obvious to me what the right rules are....

    ReplyDelete
  7. Not written up, no, aside from a rough piece of paper and a comment on this blog post at this point, it may fail to be sensible on closer examination. The idea is add assumptions of the form "a=A : type" to the modal context at the fold rule, and have them out of scope at first, but flowing to the non-modal context along with the term variables. I can write up the rules if that's not clear.

    ReplyDelete