Abel Jansma

Truth is not a direction: a Tarski attack on LLM probes

2026-07-10T08:00:00+00:00

A diagonal attack for LLM truth probes shows why no probe on a language model’s embedding space can pin down truth.

A linear dream

Modern LLMs famously encode input texts as vectors in some embedding space. One of the most satisfying discoveries about LLMs is that many natural concepts, such as gender, emotions, and capital cities, correspond to directions in this space. The extent to which a certain input text contains the concept “Male”, for example, can be quantified by the angle between the input’s embedding and the direction corresponding to the “Male” concept. This is often referred to as the Linear Representation Hypothesis.

I recently learned about a very bold version of this hypothesis, which claims that there is a direction that corresponds to “truth”. This is explored by Marks and Tegmark here for example, and further studied here, here, here, here, and here.

Having access to such truth directions can be useful if you want to know if a piece of text is true, and crucial to AI safety research because it can reveal if an AI system is being truthful or deceitful.

AI safety researchers have thus been feeding LLMs true and false statements, and training classifiers that separate their embeddings. This works weirdly well, and seems to generalise to some extent. Could it really be true, that a superhuman AI could reflect the truth of propositions in the embedding geometry?

You might recognise this dream from elsewhere. Russell, Whitehead, and Hilbert had hoped to construct something similar for all of mathematics: a systematic way to decide on truth. In their case, it was based on the machinery of mathematical proof. However, Gödel famously destroyed that dream: no such systematic way to prove or disprove any mathematical statement can exist. Mathematical truth cannot be fully captured by provability.

Speaking of “speaking of”

Gödel showed this by creating a clever system in which arithmetic expressions could state things about arithmetic sentences. This then created the ouroboros sentence “this sentence has no proof”. Tarski sharpened the paradox further: no sufficiently expressive language can contain its own total truth predicate. If a language is expressive enough to describe its own semantics, then “is true”, as it applies to statements in that language, cannot fully be spoken of from within it.

Turing’s halting problem provides another example of such a diagonal construction: there is an evaluator (the program HALT), a system expressive enough to ‘talk about itself’ (descriptions of Turing machines can be fed into Turing machines), and a negation. A nice universal approach to such paradoxes of self-reference is this paper by Noson Yanofsky, who says

All these different examples are really saying the same thing: there will be trouble when things deal with their own properties.

Note that transformer-based LLMs do something similar: they represent inputs as vectors in a space where directions correspond to concepts. However, these concepts can themselves be expressed in natural language that can be fed as input! Transformers trained on natural language are made of the same stuff as their inputs. Not all geometry of these embedding spaces corresponds to easily verbalisable structure, but important parts do (see for example Anthropic’s post on J-space), and in particular the truthfulness concept often seems present.

A diagonal attack

Now let’s set up the attack. Let t(s) be the output of the truth probe on the input string s. This could be the projection of the model’s embedding on the truth-direction for example, or a non-linear classifier, it doesn’t really matter. Then consider

t(“The truth probe's score for this sentence evaluates to FALSE.”)

What should this evaluate to? If the sentence is true, then a correct probe should output TRUE. However, if the sentence is true, then according to the sentence the truth probe will output FALSE, which would indicate that the sentence is actually false. If the sentence is false, then apparently the truth probe will output TRUE, etc… A paradox.
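The case analysis above can be checked mechanically. The sketch below is only an illustration under a simplifying assumption: the probe is treated as a black box that returns a plain Boolean on the diagonal sentence, and the function names are made up for this example.

```python
def liar_truth_value(probe_output: bool) -> bool:
    """Truth value of the sentence 'The truth probe's score for this
    sentence evaluates to FALSE', given what the probe actually outputs."""
    return probe_output is False


def probe_is_correct(probe_output: bool) -> bool:
    """A correct probe outputs exactly the truth value of its input."""
    return probe_output == liar_truth_value(probe_output)


# Try both possible answers a binary probe could give on the liar sentence.
results = {output: probe_is_correct(output) for output in (True, False)}
print(results)  # {True: False, False: False}
```

Whichever answer the probe gives, the sentence's truth value is the opposite, so neither answer counts as correct.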

Clearly, no such universal truth probe can exist! More generally:

No definable probe over a model’s representation space can exactly capture truth for any language rich enough to describe that probe and its outputs.

Note that “rich enough to describe that probe and its outputs” is a very low bar. It just means the language can name the model, the probe, and form sentences about their behaviour. English certainly can.

What’s fun about this example, however, is that we can just see what happens when we do try to evaluate this!

I created a simple truth-probe for the Qwen3.5-4B model by training a logistic regression classifier on projections on the difference in mean embeddings of a set of training sentences that are labelled TRUE or FALSE. This difference-in-means approach was shown here to work well, and indeed, after training on 120 labelled example sentences, the probe (thresholded at 0.5) is 94% accurate on 36 withheld evaluation sentences (AUC of 0.98, the only mislabelled sentences being arithmetical ones like “five times seven is thirty-five”).

(Note: this is obviously a very small toy example, only intended to illustrate that the diagonal Tarski attack can be executed in practice.)
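To make the construction concrete, here is a minimal sketch of a difference-in-means probe. It is not the code used for the experiment above: the embeddings are synthetic Gaussian vectors standing in for model activations, and `fake_embedding`, `DIM`, the learning rate, and the number of gradient steps are all assumptions chosen for illustration. Only the split sizes (120 and 36) and the 0.5 threshold are taken from the text.

```python
import math
import random

random.seed(0)
DIM = 16  # hypothetical embedding size; real models use thousands of dimensions


def fake_embedding(is_true: bool) -> list[float]:
    """Stand-in for a model embedding: noise plus a label-dependent shift."""
    shift = 1.0 if is_true else -1.0
    return [shift + random.gauss(0.0, 1.0) for _ in range(DIM)]


def mean_vector(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def make_split(n):
    labels = [i % 2 == 0 for i in range(n)]
    return [fake_embedding(y) for y in labels], labels


train_x, train_y = make_split(120)
test_x, test_y = make_split(36)

# Difference-in-means direction: mean(TRUE embeddings) - mean(FALSE embeddings).
mu_true = mean_vector([x for x, y in zip(train_x, train_y) if y])
mu_false = mean_vector([x for x, y in zip(train_x, train_y) if not y])
direction = [a - b for a, b in zip(mu_true, mu_false)]


def project(x):
    return sum(a * b for a, b in zip(x, direction))


# One-dimensional logistic regression on the projections, by gradient descent.
w, b, lr = 0.0, 0.0, 0.01
for _ in range(500):
    grad_w = grad_b = 0.0
    for x, y in zip(train_x, train_y):
        z = project(x)
        p = 1.0 / (1.0 + math.exp(-(w * z + b)))
        grad_w += (p - y) * z
        grad_b += p - y
    w -= lr * grad_w / len(train_x)
    b -= lr * grad_b / len(train_x)


def t(x) -> float:
    """Probe score in [0, 1]; scores above 0.5 are read as TRUE."""
    return 1.0 / (1.0 + math.exp(-(w * project(x) + b)))


accuracy = sum((t(x) > 0.5) == y for x, y in zip(test_x, test_y)) / len(test_x)
print(f"held-out accuracy: {accuracy:.2f}")
```

With a real model, `fake_embedding` would be replaced by whatever extracts a hidden-state vector for a sentence; the rest of the pipeline would stay the same.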

On diagonal attack sentences, however, the score seems to be nonsensical and all over the place. This is not just an effect of self-reference. Some self-referring sentences have clear truth-values, like “This sentence is written in English”, which the model scores accurately.

A Fixed-point fix?

All these liar-paradox-like scenarios are examples of Lawvere’s fixed-point theorem, which roughly states that when a system contains a sufficiently expressive evaluator, every self-map on the target of the evaluator (i.e. every function on the set of ‘truth values’) must have a fixed point. Since negation on the set {TRUE, FALSE} does not have a fixed point, no such evaluator can exist. But what if we construct a set of truth values where all operations do have fixed points? Would that solve the paradox and allow for a universal generalised truth probe?

Let’s represent {TRUE, FALSE} as {1, 0}. Negation can then be represented as a map v -> 1-v, which has no fixed point. However, if we extend the domain to the full interval [0, 1], then negation has a fixed point at 1/2. The standard liar paradox sentence can then be expressed as

t(“This sentence is not true.”)

Suppose its truth value is x. That means that it is true with value x, and therefore should receive a score of 1-x. This is a fixed point equation, which has a solution at x = 1/2. So the standard liar sentence would get a self-consistent valuation of 0.5.

This resolves the most basic liar paradox, but not every diagonal attack, since not all functions on [0, 1] have fixed points. To make this work in general, we can for example allow only continuous functions on [0, 1] (which always have a fixed point by Brouwer’s fixed-point theorem). But that restriction comes at the cost of expressivity: "This sentence has truth score less than 0.5" is not a continuous function of the truth score of the sentence. Accordingly, it leads to a new liar paradox in the graded truth semantics. There is no satisfying protection against these diagonal attacks, and no such universal truth probe can exist.
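The difference between the two cases can be seen numerically. The following sketch uses illustrative function names (`negation`, `below_half`, `bisect_fixed_point`) that are not from the original experiment: it finds the fixed point of graded negation by bisection, and checks that the crisp “less than 0.5” sentence has none.

```python
def negation(x: float) -> float:
    """Graded negation: a sentence with truth value x has negation 1 - x."""
    return 1.0 - x


def below_half(x: float) -> float:
    """'This sentence has truth score less than 0.5': crisp, so 0 or 1."""
    return 1.0 if x < 0.5 else 0.0


def bisect_fixed_point(f, lo=0.0, hi=1.0, steps=60):
    """Find x with f(x) = x for a continuous f: [0, 1] -> [0, 1].

    g(x) = f(x) - x satisfies g(0) >= 0 and g(1) <= 0, so bisection applies.
    """
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) - mid >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


liar_value = bisect_fixed_point(negation)
print(round(liar_value, 6))  # 0.5

# below_half only outputs 0 or 1, so a fixed point would have to be 0 or 1.
crisp_fixed_points = [x for x in (0.0, 1.0) if below_half(x) == x]
print(crisp_fixed_points)  # []
```

The second check is exhaustive rather than a grid search: since `below_half` only ever returns 0 or 1, those are the only candidate fixed points, and both fail.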

Takeaways

Truth probes that project on a single direction work surprisingly well in simple cases.
However, they are not truth oracles, and never will be.
A graded truth semantics on [0, 1] with continuous truth operations can assign the ordinary liar the fixed-point value 1/2, but only by restricting crisp assertions about exact truth scores.
Conventional logistic truth probes do not implement such reflective semantics merely because their outputs lie in [0, 1].

It might seem absurd to you to even suggest superhuman AIs could function as a truth-oracle (it certainly does to me), but there are two reasons to take it seriously. First, it is how these things will be used practically by the vast majority of people. They are already replacing standard Google search results, and I’ve had many discussions end with people delegating final authority on the truth to an AI. Second, some people do really believe in a kind of platonic representation space that all models converge on, and that represents the “true” state of the world. If truth is indeed an objective part of the world, then you might expect such a universal truth direction to emerge as models get better. This post argues that this would lead to paradoxes.

Finally, I want to emphasise again that I’m not claiming that truth probes are not useful! I think they could greatly help with understanding model behaviour, and picking up on misaligned behaviour early. Mathematics has not seriously been hindered by the problem of undecidability or the undefinability of truth. The fact that truth is not a direction in an AI’s embedding space similarly won’t stop us from making progress in AI alignment research.

A short koan about the spider Oza

2025-10-28T08:00:00+00:00

A rabbi asked the wise spider Oza:

“Tell me, why is there something rather than nothing?”

Oza:

“you are confused—both are: the something in that it is, and the nothing in that it isn’t.”

Both caught in the same web.

Complex Systems and Quantitative Mereology

2025-01-28T08:00:00+00:00

Have a look at these three rings:

Why are they connected? If you only consider two of the rings and ignore the third, then any pair can be smoothly separated. The triplet is connected and cannot be taken apart, but any description purely in terms of pairwise relationships would miss the fact that the three rings are connected.

While the three rings above—known as the Borromean rings—are especially simple and symmetric, similar links can be created for any number of rings. Here are links of four and six rings, for example, that fall apart if a single ring is removed:

I find this kind of magical. There seems to be a kind of ‘top-down’ causation: the behaviour of individual rings is restricted by the group as a whole, not by any individual.

Making sense of higher-order structure

The rings above are a simple example of a more general phenomenon known as ‘higher-order structure’ or ‘emergence’. These two terms are often used in an imprecise way. Imprecise use of ‘emergence’ does not bother me so much, because it seems to refer to something that is not very precise anyway (similar to ‘complexity’). But I believe that ‘higher-order structure’ can be made much more precise—it certainly sounds pretty mathematical. In fact, I recently wrote a paper about an idea that makes this precise, and that seems to explain many uses of the word ‘higher-order’. In particular, it answers the questions: higher than what? and which order? The paper is called A Mereological Approach to Higher-Order Structure in Complex Systems: from Macro to Micro with Möbius and is publicly available here. In this blog post¹, I hope to give a more accessible and less technical overview of the main ideas in the paper.

Mereology

Mereology² is the study of ‘parts’—specifically the relationship between parts and wholes. My central thesis is that incorporating mereology into scientific and mathematical thinking is a good idea. While a lot has been written about mereology—mostly by philosophers and logicians—it is not widely used as a practical set of ideas in the sciences. I suspect the aforementioned philosophers and logicians would find my approach simplistic, naive, or superficial, but I have found it to be very practical.

If we want to describe parts and wholes, let’s start with the whole. It is the biggest possible part of itself. The whole can then be divided into smaller parts. One example of this is from the 1886 “Handbook of Practical Cookery” by Matilda Dods:

The parts can be further divided into smaller parts, and so on. It is natural to assume some rules that parts should obey:

If $a$ is a part of $b$, and $b$ is a part of $c$, then $a$ is also a part of $c$.
Every part is a part of itself (namely, the ‘trivial’ biggest possible part).
If $a$ is a part of $b$, then $b$ is not a part of $a$ (unless $a=b$).

These rules are well-known to mathematicians, who call them transitivity, reflexivity, and antisymmetry. Together, they define what is called a partial order. It gives a way to order a collection of things—in this case, wholes, parts, and parts of parts (and parts of parts of parts, etc.). When a part $a$ is smaller than (or equal to) part $b$, we write $a \leq b$. We say partial order, because not all parts can be put in order! A possible mereology on a bicycle is, for example, the following, where an arrow is drawn from a part $a$ to $b$ if $b$ is a part of $a$:

Clearly, all the components drawn in black are ‘parts’ of the full bicycle, and the ordering reflects that the tires and the spokes are both parts of the wheels. However, the tires are not part of the frame, and vice-versa, so there is no arrow between the two—they are ‘incomparable’.
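The three rules can be checked directly on a small example. The sketch below encodes a hypothetical bicycle mereology (the exact parts list is an assumption, loosely following the description above) and verifies reflexivity, antisymmetry, and transitivity by brute force.

```python
from itertools import product

# Hypothetical bicycle mereology: each part maps to the parts directly inside it.
direct_parts = {
    "bicycle": {"wheels", "frame"},
    "wheels": {"tires", "spokes"},
    "frame": set(),
    "tires": set(),
    "spokes": set(),
}


def parts_of(whole: str) -> set[str]:
    """All parts of `whole`, including itself (reflexive-transitive closure)."""
    found = {whole}
    for child in direct_parts[whole]:
        found |= parts_of(child)
    return found


def leq(a: str, b: str) -> bool:
    """a <= b means: a is a part of b."""
    return a in parts_of(b)


elements = list(direct_parts)
reflexive = all(leq(a, a) for a in elements)
antisymmetric = all(
    a == b or not (leq(a, b) and leq(b, a)) for a, b in product(elements, repeat=2)
)
transitive = all(
    leq(a, c) or not (leq(a, b) and leq(b, c))
    for a, b, c in product(elements, repeat=3)
)
print(reflexive, antisymmetric, transitive)  # True True True

# Tires and frame are incomparable: neither is a part of the other.
print(leq("tires", "frame"), leq("frame", "tires"))  # False False
```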

Now let’s put on a more scientific hat. If we want to describe the mereology of a system, it makes sense to say that there is a unique part that is the biggest, namely the whole system. This is the top of the partial order of parts (for example, the fully assembled bicycle above). In addition, let’s assume that the system is only made up of a finite number of parts (a billion parts is fine, but infinitely many is not). If we call the system $S$, then I propose we call any (finite) partial order that has $S$ at the top a mereology³ on $S$. If we write the set of parts of $S$ as $\mathcal{D}(S)$ and the ordering as $\leq$, then the mereology can be summarised as the pair $(\mathcal{D}(S), \leq)$.

Decomposing Complex Systems

Fundamental to any description of Nature is the choice of parts. I imagine this like the cast of a theatre play: who are the characters that come together to tell the story. If you want to describe how an object behaves, you should decide whether you want to describe it in terms of the atoms, molecules, layers, structural elements, or something else. Once you make this choice, you can start writing down equations that describe how these parts interact. Say you want to predict some property of a system, let’s call it $Q$ for quantity. This could be the temperature of a material, or the height of a person—anything that can be described by a number is allowed here. If $Q$ is some macroscopic quantity that you could measure, then it makes sense to say: $Q(S)$ is the sum of microscopic contributions from all the parts that make up the system $S$. Since I’m imagining $Q$ to be an observable macroscopic quantity, built from microscopic contributions of the parts, I write them as a big $Q$ and little $q$ respectively. Each part can contribute something else, so I write $q(s)$ for the contribution of part $s$. Saying that the quantity $Q$ is the sum of the contributions of the parts can then be written mathematically as:

\[Q(S) = \sum_{s \in S} q(s)\]

Where I write $\sum\limits_{s \in S}$ to mean “sum all the parts $s$ that are in $S$”. Now the problem is that this can only really lead to very boring descriptions of the quantity $Q$. If the whole is just the sum of the parts, then your $Q$ is a pretty boring quantity. For example, say we want to describe the height of a person in terms of the genes they have. We could write this as $H(G)$: their height as a function of their genes. If then $H(G) = \sum\limits_{g \in G}h(g)$, then the height of a person is just determined independently by each gene. Biology is much more exciting than that: genes can individually contribute to your height, but they can also interact in complex ways and change each other’s effects. It therefore makes sense to extend our notion of parts to also include combinations of individual genes. For example, we could have contributions of three genes $g_1$, $g_2$, and $g_3$, but also of the pairs $(g_1, g_2)$, $(g_1, g_3)$, and $(g_2, g_3)$—and perhaps even the triplet $(g_1, g_2, g_3)$. Contributions from pairs are very common and usually called interactions. Less commonly considered is an interaction among three elements. Because interactions of more than two elements are less common, they are usually collectively referred to as higher-order interactions⁴.

Ok let’s therefore assume that in principle, all parts and all possible combinations might contribute to the overall observation of $Q$ (though not all have to contribute—some contributions could be zero in practice). This means that the quantity $Q$ is not just the sum of the contributions of the parts, but also of the contributions of the pairs, triplets, quadruplets, and so on. This can be written as:

\[Q(S) = \sum_{s \in S} q(s) + \sum_{\substack{s_1 \in S\\s_2 \in S}} q(s_1, s_2) + \sum_{\substack{s_1 \in S\\s_2 \in S\\s_3 \in S}} q(s_1, s_2, s_3) + \ldots\]

To write this down more efficiently, we define the set of all possible combinations of parts of $S$ as $\mathcal{P}(S)$. Mathematicians call $\mathcal{P}(S)$ the power set of $S$. The quantity $Q$ can then be written as:

\[Q(S) = \sum_{p \in \mathcal{P}(S)} q(p)\]

We are now at the point where we can connect this back to mereology. Note that $\mathcal{P}(S)$ has an “order” to it. Let’s say $S = (g_1, g_2, g_3)$, then $\mathcal{P}(S) = (\emptyset, (g_1), (g_2), (g_3), (g_1, g_2), (g_1, g_3), (g_2, g_3), (g_1, g_2, g_3))$. The thing written as $\emptyset$ represents the ‘empty’ set $(~)$, which technically is also a part of $S$—namely the ‘empty part’. Note that some elements of $\mathcal{P}(S)$ can be made from others. For example, $(g_1, g_2)$ can be made from $(g_1)$ by adding $(g_2)$. In this sense, $(g_1)$ is a ‘part’ of $(g_1, g_2)$, and $(g_1, g_2)$ is thus “bigger” than $(g_1)$. This is usually written as $(g_1)\subseteq (g_1, g_2)$. It should not be too hard to convince yourself that $\mathcal{P}(S)$, ordered in this way by $\subseteq$, is a mereology on $S$!

Here’s a picture of the power set mereology on systems with 2, 3, or 4 variables (which I’ve labelled simply as $0$, $1$, $2$, and $3$):

Each has the full system at the top (as any mereology should), the empty set at the bottom, and in between all the singletons, pairs, triplets, and so on. All three also clearly have ‘levels’ or ‘orders’ corresponding to horizontal slices.
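For readers who prefer code to pictures, the power set mereology is easy to generate. This is a small sketch using only the standard library; the helper name `power_set` and the gene labels are illustrative.

```python
from itertools import chain, combinations


def power_set(system):
    """All subsets of `system`, ordered by size (the 'levels' of the mereology)."""
    items = sorted(system)
    return [
        frozenset(c)
        for c in chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))
    ]


S = {"g1", "g2", "g3"}
parts = power_set(S)
print(len(parts))  # 8

# Subset inclusion plays the role of "is a part of".
print(frozenset({"g1"}) <= frozenset({"g1", "g2"}))  # True

# Group the parts by order (number of elements): 1, 3, 3, 1 parts per level.
levels = [sum(1 for p in parts if len(p) == k) for k in range(len(S) + 1)]
print(levels)  # [1, 3, 3, 1]
```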

This means that we can now ‘decompose’ the quantity $Q(S)$ over a mereology on $S$ as follows

\[Q(S) = \sum_{p \subseteq S} q(p)\]

This now reads: $Q(S)$ is built from contributions from all the parts of $S$, including all possible combinations of them. Perhaps you are not yet convinced that we have gained anything by doing this. We have simply rewritten the decomposition of $Q$ in a more complicated form. However, mereologies have some special properties that allow us to do something pretty cool.

Möbius inversion

Let’s leave the mereology unspecified for now, and just say there is some mereology $(\mathcal{D}(S), \leq)$ on $S$. The decomposition of $Q(S)$ can be written as:

\[Q(S) = \sum_{p \leq S} q(p)\]

By definition, every element of $\mathcal{D}(S)$ is a part of $S$. This means that if the full system $S$ has the property $Q(S)$, then any part $p$ might have the property $Q(p)$. Note that $Q(p)$ is not the same as $q(p)$—the former is a property of the part $p$, while the latter is the contribution of $p$ to the property of the whole system. Now it might make sense to say that the decomposition is valid for all parts:

\[Q(p) = \sum_{r \leq p} q(r)\]

This means that we have a decomposition of $Q$ on every part. If there is a total of $N$ parts in the mereology, then this leads to $N$ equations (one decomposition of $Q$ per part), with $N$ unknowns (the contribution $q$ of each part). This is a system of equations that could in theory be solved, but for large mereologies this can be impractical. Since a mereology is a very special thing with a lot of structure, solving the equations is actually much simpler! The solution is given by the Möbius inversion formula:

\[Q(S) = \sum_{p \leq S} q(p) \quad \iff \quad q(S) = \sum_{p \leq S} \mu(p, S)Q(p)\]

This says that whenever you have a sum over a mereology, you can actually invert this sum (‘solve the system of equations’). This gives you an expression for all the microscopic contributions $q(p)$, that are usually not directly observable, in terms of the observable properties of the parts $Q(r)$. $Q$ is a sum of $q$’s, but the $q$’s are themselves a sum of $Q$’s, weighted by a weird number called $\mu(p, S)$. This number is called the Möbius function of the mereology.

The precise definition of $\mu$ is not so important⁵. The only important thing is that it allows you to invert sums on a mereology, and that it is fully specified by the ‘shape’ of the mereology. Only the relationships between the parts matter—not what the parts actually are. For example, the power set mereology $(\mathcal{P}(S), \subseteq)$ that we saw before always has the same shape, no matter what kind of system $S$ is. This means that the Möbius function is always the same for the power set mereology. It is given by

\[\mu(p, S) = (-1)^{|S| - |p|}\]

In other words: if you decompose $Q(S)$ over the power set mereology, then the microscopic contribution $q(S)$ is given by an alternating sum over the $Q(p)$’s:

\[q(S) = \sum_{p \subseteq S} (-1)^{|S| - |p|}Q(p)\]

Where $\mid S\mid - \mid p\mid$ refers to the difference in the number of elements in $S$ and $p$. Let’s look at two examples of this in practice.
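A quick round-trip check makes the inversion tangible. In the sketch below, the microscopic contributions `q` are random integers (purely hypothetical values); the macroscopic `Q` is built from them by summing over subsets, and the alternating sum then recovers `q` exactly.

```python
from itertools import combinations
import random


def subsets(s):
    """All subsets of the frozenset `s`, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]


random.seed(1)
S = frozenset({"a", "b", "c"})

# Hypothetical microscopic contributions q(p), one per subset p of S.
q = {p: random.randint(-5, 5) for p in subsets(S)}

# Macroscopic quantity on every part: Q(p) = sum of q(r) over r <= p.
Q = {p: sum(q[r] for r in subsets(p)) for p in subsets(S)}

# Moebius inversion on the power set: q(p) = sum_r (-1)^(|p| - |r|) Q(r).
q_recovered = {
    p: sum((-1) ** (len(p) - len(r)) * Q[r] for r in subsets(p)) for p in subsets(S)
}
print(q_recovered == q)  # True
```

Integer contributions are used so that the comparison is exact rather than subject to floating-point rounding.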

Example 1: Consider a system of two genes $g_1$, and $g_2$. Let us decompose a person’s height $H$ over the power set mereology of these genes:

\[H(g_1, g_2) = \sum_{p \subseteq (g_1, g_2)} h(p) = {\color{grey}{h(\emptyset)}} + {\color{YellowGreen}{h(g_1)}} + {\color{skyblue}{h(g_2)}} + {\color{Tan}{h(g_1, g_2)}}\]

This says: the height of the person with both genes is given by some contribution $h(\emptyset)$ that forms the ‘baseline’ height, plus the contributions of each gene individually, plus the interaction of the pair of genes. Matching the colour scheme above, we can draw this as:

To infer the contribution of an individual gene, we apply the Möbius inversion formula:

\[h(g_1) = (-1)^{|g_1| - |g_1|}H(g_1) + (-1)^{|g_1| - |\emptyset|}H(\emptyset)\\ = H(g_1)- H(\emptyset)\]

That is, the effect $h(g_1)$ of an individual gene is the difference between a person with that gene and a person without it. This makes sense! In fact, it makes so much sense that it is not very interesting. More interesting is the interaction term. A similar Möbius inversion yields:

\[h(g_1, g_2) = H(g_1, g_2) - H(g_1) - H(g_2) + H(\emptyset)\]

which can be drawn as:

This is actually a well-known quantity in genetics, where it is known as an epistatic effect between the genes.
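As a worked example with made-up numbers (the heights below are hypothetical, chosen only to make the arithmetic easy), suppose $H(\emptyset) = 170$, $H(g_1) = 172$, $H(g_2) = 173$, and $H(g_1, g_2) = 178$, all in centimetres. Then:

\[h(\emptyset) = 170, \qquad h(g_1) = 172 - 170 = 2, \qquad h(g_2) = 173 - 170 = 3\]

\[h(g_1, g_2) = 178 - 172 - 173 + 170 = 3\]

The four contributions add back up to the observed height, $170 + 2 + 3 + 3 = 178$, and the non-zero pair term says the two genes together add 3 cm more than their individual effects would predict.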

Example 2: In physics and information theory, one often quantifies the amount of information in a system by the entropy. The precise definition is not important for now, but let’s denote the entropy carried by two variables $(X, Y)$ by $H(X, Y)$. One could imagine that the total entropy is composed of contributions from the individual variables, and a contribution that is only carried by the pair. This amounts to a power set mereology:

\[H(X, Y) = I(X) + I(Y) + I(X, Y)\]

We have omitted the empty set this time, since an empty set of variables carries no information. A Möbius inversion shows that

\[I(X, Y) = H(X, Y) - H(X) - H(Y)\]

Up to an overall sign, this is the quantity known as the mutual information between $X$ and $Y$ (the textbook definition is $H(X) + H(Y) - H(X, Y)$, i.e. $-I(X, Y)$ in the notation above), and represents one of the most fundamental concepts in information theory. While there are other ways to derive it, this shows that it is the unique way to decompose the entropy over subsets of variables.
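The entropy decomposition can be checked on a toy distribution. In this sketch the joint distribution is a hypothetical choice, and entropies are measured in bits. Note the sign convention: the Möbius term $H(X, Y) - H(X) - H(Y)$ comes out as minus the textbook mutual information, which is computed independently for comparison.

```python
import math

# Hypothetical joint distribution p(x, y) of two correlated binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def entropy(dist):
    """Shannon entropy in bits of a {outcome: probability} dictionary."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(dist, axis):
    """Marginal distribution of one coordinate of the joint outcomes."""
    out = {}
    for outcome, p in dist.items():
        out[outcome[axis]] = out.get(outcome[axis], 0.0) + p
    return out


H_xy = entropy(joint)
H_x = entropy(marginal(joint, 0))
H_y = entropy(marginal(joint, 1))

# Moebius inversion of H(X, Y) = I(X) + I(Y) + I(X, Y) over the power set.
I_xy = H_xy - H_x - H_y

# Textbook mutual information, computed directly from the distribution.
px, py = marginal(joint, 0), marginal(joint, 1)
mi = sum(p * math.log2(p / (px[x] * py[y])) for (x, y), p in joint.items() if p > 0)

print(round(I_xy, 4), round(mi, 4))  # -0.2781 0.2781
```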

Beyond power sets

Both examples above use the power set mereology, which is especially simple⁶. In the paper, I find that Möbius inversions on different mereologies reproduce different quantities that are all well-known in certain fields of science. One alternative but common mereology is the ‘partition’ mereology, where a set is not divided into subsets, but into ‘partitions’—different ways to cut up a system. Here’s what that looks like for two, three, and four variables:
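The partitions themselves are easy to enumerate. The sketch below is a generic recursive enumeration, not code from the paper: it lists all set partitions of a small system and includes a helper for the refinement order, under which one partition sits below another if it cuts the system more finely.

```python
def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn...
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]
        # ...or give it a block of its own.
        yield [[first]] + partition


def refines(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    return all(any(set(b) <= set(c) for c in coarse) for b in fine)


for n in (2, 3, 4):
    print(n, len(list(set_partitions(list(range(n))))))
# 2 2
# 3 5
# 4 15

print(refines([[0], [1], [2]], [[0, 1], [2]]))  # True
```

The counts 2, 5, 15 are the Bell numbers, which grow more slowly than the $2^n$ parts of the power set for small systems but overtake it as $n$ increases.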

Table 1 from the paper gives an overview of how different mereologies are associated to different quantities:

To summarise: Most of the quantities above were the result of people thinking very hard about what the right definition of the microscopic ‘interaction’ terms are. However, instead of thinking hard about equations, you can instead invest this mental energy into thinking about appropriate mereologies. Once you have fixed the mereology, there is a unique microscopic description that is compatible with it. Not only that, the mereology shows you exactly what is meant by ‘higher-order’, namely, some terms are higher with respect to the partial ordering. This resonates strongly with Plato’s call to ‘carve Nature at its joints’—a good description of Nature depends on a good choice of parts. If you choose a natural mereology, then the higher-order interactions that derive from it inherit the justification.

Real-world applications

In the paper I use this approach⁷ to derive new notions of ‘higher-order’ interactions that can be used in machine learning (I derive a novel decomposition of the so-called KL-divergence). Another exciting application is ‘coarse-graining’. Physicists love studying what happens when you ‘coarse-grain’ a system—how does the physics change if you squint or look from far away? In essence, a coarse-graining is simply a change in mereology—a coarser description is one with fewer or larger parts. Describing coarse-grainings at the level of mereologies gives an entirely new way to think about this process. In one of the final sections of the paper I show that coarse-grainings correspond to special kinds of transformations of mereologies called Galois connections, and use this to derive well-known coarse-grained quantities (the ‘renormalised’ Ising couplings).

To apply the framework, you have to know the Möbius function $\mu$ of the mereology you are interested in. For one famous mereology—the so-called redundancy mereology—the Möbius function was not known. It is a particularly complex mereology and includes many⁸ parts:

However, in a recent collaboration with Pedro Mediano at Imperial College London, and Fernando Rosas at Sussex, we were able to calculate the Möbius function for this mereology, and therefore calculate new kinds of higher-order interactions (see the paper here). I’m now exploring what else can be done with this approach, and recently found a new way to decompose causal effects, as outlined here.

In short: there is lots to do. Stay tuned for more.

¹ Thanks to Twitter/X user @prathyvsh for urging me to write this, and creating the figures of Brunnian links and visual algebra. They make really great visualisations of algebraic structures.
² Mereology is the λόγος (logos: explanation/consideration) of a μέρος (meros: part). The meros root more famously appears in words like polymer (literally: many parts).
³ Mathematicians might notice how this is somewhat similar to the definition of a topology. The two are indeed related, but not the same. A topology on a finite set is necessarily a mereology, but not all mereologies are topologies, and topologies do not have to be finite.
⁴ It’s interesting to ask: Why are beyond-pairwise interactions not common? I think it’s at least partly caused by a lack of imagination. We can picture things interacting as a graph, where points represent parts, and lines represent interactions between parts. But lines always connect two points, not three or more, so we cannot really picture what higher-order interactions look like. This is a limitation of our imagination, not of Nature. However, pairwise descriptions have been very successful, and some people have argued that this is the way Nature works. Observing higher-order interactions from data is also harder than pairwise interactions, so perhaps it is simply a reflection of the fact that data sets have historically been small. This would also explain why higher-order interactions are becoming more en vogue now that data sets are getting bigger.
⁵ The Möbius function is usually recursively defined over a partial order, but there is a very nice expression due to Philip Hall: Given an interval $[x, y]$ on the partial order $P$ with $x < y$, let $c_i$ be the number of chains in $P$ from $x$ to $y$ of length $i$. Then the Möbius function on $P$ is given by $\mu(x, y) = -c_1 + c_2 - c_3 + \ldots$.
⁶ The Möbius inversion formula over the power set mereology is more famously known as the inclusion-exclusion principle.
⁷ I’m still looking for a good name for this approach/framework. Möreology? If you have any suggestions, please let me know!
⁸ In fact, the number of parts in this mereology for 9 variables was calculated for the first time in 2023, and for more than 9 variables is still unknown.

The Encyclopedia of Möbius Inversions in the Sciences

2025-01-28T08:00:00+00:00

In a recent paper, I proposed to study complex systems through a mereological lens by applying the Möbius inversion theorem. I also covered this in a recent blogpost. Here, I will collect some of the most important applications of Möbius inversions in the sciences. I will update this table as I find more applications. If you have suggestions, please send me an email, or leave a comment below!

Field of Study	Macro quantity	Mereology	Micro quantity
Statistics	Moments	Powerset	Central moments
	Moments	Partitions	Cumulants
	Free moments	Non-crossing partitions	Free cumulants
	Path signature moments	Ordered partitions	Path signature cumulants
	Causal effects	Antichains	Causal synergy/redundancy
Information Theory	Entropy	Powerset	Mutual information
	Entropy	Singletons	Total correlation
	Surprisal	Powerset	Pointwise mutual information
	Joint Surprisal	Powerset	Conditional interactions
	Mutual Information	Antichains	Synergy/redundancy atom
Biology	Pheno- & Genotype	Powerset	Epistasis
	Gene expression profile	Powerset	Genetic interactions
	Population statistics	Powerset	Synergistic treatment effect
Physics	Energy	Powerset	Ising interactions
	Correlation functions	Partitions	Ursell functions
	Quantum corr. functions	Partitions	Scattering amplitudes
Chemistry	Molecular property	Subgraphs	Fragment contributions
	Molecular property	Reaction poset	Cluster contributions
Game Theory	Coalition value	Powerset	Harsanyi dividends
	Shapley value	Supersets	Normalised coalition synergy
Artificial Intelligence	Generative model probabilities	Powerset	Feature interaction
	Predictive model predictions	Powerset	Feature contribution
	Dempster-Shafer Belief	Lattices	Evidence weight
	KL-divergence	Powerset	$\Delta_{p\|q}$ measure

A Möbius inversion theorem for modules and vector spaces

2025-01-28T08:00:00+00:00

In a previous post, I proposed to study complex systems through a mereological lens by applying the Möbius inversion theorem. This has become my favourite theorem by now, because I think it allows you to do integration and differentiation on observables in complex systems where this was not possible before.

It was originally proved by Möbius for the integers ordered by divisibility, but generalised to commutative rings and arbitrary locally finite posets by Gian-Carlo Rota. I wanted to apply this to group- and vector-valued functions so that I could calculate semantic synergy in text embeddings, but to do so the theorem needed to be generalised. Let’s first review the classical version of the theorem, and then generalise it to modules and vector spaces.

Definition: A commutative ring $R$ (with unity) is a set equipped with two operations, addition ($+$) and multiplication ($\cdot$), such that the following properties hold:

1. $(R, +)$ is an Abelian group with identity element $0_R$.

2. $(R, \cdot)$ is a monoid with identity element $1_R$.

3. Multiplication distributes over addition: $a \cdot (b + c) = a \cdot b + a \cdot c$ and $(a + b) \cdot c = a \cdot c + b \cdot c$

4. Multiplication is commutative: $a \cdot b = b \cdot a$

The Classical Möbius Inversion Theorem

The theorem is really a statement about the incidence algebra of a locally finite poset $P$, which is the set of all functions $f: P \times P \to R$ on the intervals of $P$ (so $f(x, y) = 0_R$ unless $x \leq y$), equipped with the convolution product defined by:

\[(f \ast g)(x, y) = \sum_{x\leq z \leq y} f(x, z) \cdot g(z, y)\]

Note that the following three functions are part of the incidence algebra:

\[\begin{align} \zeta(x, y) &= \begin{cases} 1_R & \text{if } x \leq y \\ 0_R & \text{otherwise} \\ \end{cases} \\[1em] \mu(x, y) &= \begin{cases} 1_R & \text{if } x = y \\ -\sum_{x \leq z < y} \mu(x, z) & \text{if } x < y \\ 0_R & \text{otherwise} \end{cases} \\[1em] \delta(x, y) &= \begin{cases} 1_R & \text{if } x = y \\ 0_R & \text{otherwise}\\ \end{cases} \end{align}\]

Theorem (Möbius inversion theorem): Let $R$ be a commutative ring, and $f, g: P \to R$ be $R$-valued functions defined on a locally finite poset $P$. Then the following two statements are equivalent:

1. $f(x) = \sum_{y \leq x} g(y)$

2. $g(x) = \sum_{y \leq x} \mu(y, x) \cdot f(y)$

where $\mu$ is the Möbius function on $P$.

The Möbius inversion theorem is thus really the statement that $\mu$ is the $\ast$-inverse of $\zeta$, as $\zeta \ast \mu = \delta$ and $\mu \ast \zeta = \delta$. This is a very powerful result: since $f$ is like an integral over the poset, $\mu$ is like a differential operator on $P$.
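To see the ‘integrate, then differentiate’ picture in action, here is a minimal Haskell sketch on a toy poset. The choice of poset (the divisors of 12 ordered by divisibility, with $R = \mathbb{Z}$) and all names (`elems`, `leq`, `zetaSum`, `moebiusInverse`, `g0`) are illustrative assumptions, not part of the theorem.

```haskell
-- A toy locally finite poset: the divisors of 12, ordered by divisibility.
elems :: [Int]
elems = [1, 2, 3, 4, 6, 12]

-- leq x y means "x divides y".
leq :: Int -> Int -> Bool
leq x y = y `mod` x == 0

-- Möbius function, following the recursive definition given above:
-- mu(x, x) = 1, and mu(x, y) = - sum of mu(x, z) over x <= z < y.
mu :: Int -> Int -> Int
mu x y
  | x == y    = 1
  | leq x y   = negate (sum [mu x z | z <- elems, leq x z, leq z y, z /= y])
  | otherwise = 0

-- "Integration": f(x) is the sum of g(y) over all y <= x.
zetaSum :: (Int -> Int) -> Int -> Int
zetaSum g x = sum [g y | y <- elems, leq y x]

-- "Differentiation": g(x) is the sum of mu(y, x) * f(y) over all y <= x.
moebiusInverse :: (Int -> Int) -> Int -> Int
moebiusInverse f x = sum [mu y x * f y | y <- elems, leq y x]

-- An arbitrary function g0 on the poset (hypothetical, for illustration only).
g0 :: Int -> Int
g0 y = y * y + 1

-- Summing g0 and then inverting should give g0 back on every element.
roundTrip :: [Int]
roundTrip = map (moebiusInverse (zetaSum g0)) elems
```

Loaded in GHCi, `roundTrip == map g0 elems` evaluates to `True`, and `mu 1 12` is `0`, in line with the number-theoretic Möbius function of 12.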

Generalising the Möbius Inversion Theorem

It turns out that we can generalise this theorem to group-valued functions. This is strictly more general than the commutative ring case, and includes vector-valued functions. To do so, we need to define $R$-modules:

Definition: Let $R$ be a commutative ring. An R-module is an Abelian group $M$ with a group operation $\oplus : M \times M \to M$ and a scalar multiplication $\star: R \times M \to M$ such that the following properties hold for all $r, s \in R$ and $x, y \in M$:

1. $r \star (x \oplus y) = r \star x \oplus r \star y$

2. $(r + s) \star x = r \star x \oplus s \star x$

3. $r \star (s \star x) = (r \cdot s) \star x$

4. $1_R \star x = x$

We can then state a more general version of the theorem:

Theorem (Möbius inversion theorem on R-modules): Let $(P, \leq)$ be a locally finite poset, $(R, +, \cdot)$ a commutative ring, and $(M, \oplus,\star)$ an R-module. Consider $Q, q: P \to M$. Then the following two statements are equivalent:

1. $Q(x) = \bigoplus_{y \leq x} q(y)$

2. $q(x) = \bigoplus_{y \leq x} \mu(y, x) \star Q(y)$

Proof: The proof is similar to the classical case, but we need to carefully consider the properties of the different operations involved.

Filling in the first statement into the right-hand side of the second statement:

\[\bigoplus_{y \leq x} \mu(y, x) \star Q(y) = \bigoplus_{y \leq x} \left( \mu(y, x) \star \bigoplus_{z \leq y} q(z) \right)\]

By the first property of modules, $\star$ multiplication distributes over $\oplus$ addition:

\[= \bigoplus_{y \leq x} \left(\bigoplus_{z \leq y} \mu(y, x) \star q(z) \right)\]

As the module is based on an Abelian group, $\bigoplus$ is associative and commutative, so we can change the order of summation as follows:

\[= \bigoplus_{z \leq x} \left( \bigoplus_{z\leq y \leq x} \mu(y, x) \star q(z) \right)\]

By the second property of modules, we can replace the $\bigoplus$ (addition in the group) with $\sum$ (addition in the ring) as follows:

\[= \bigoplus_{z \leq x} \left( \sum_{z\leq y \leq x} \mu(y, x) \right) \star q(z)\]

Observe that the term in parentheses is equal to $(\zeta \ast \mu)(z, x)=\delta(z, x)$, a convolution in the incidence algebra. Therefore:

\[= \bigoplus_{z \leq x} \delta(z, x) \star q(z) = 1_R \star q(x) = q(x)\]

This proves that the second statement follows from the first.

To prove the reverse direction, we fill in the second statement into the first:

\[\bigoplus_{y \leq x} q(y) = \bigoplus_{y \leq x} \left( \bigoplus_{z \leq y} \mu(z, y) \star Q(z) \right)\]

From here, the same arguments can be applied to show that this equals $Q(x)$. This completes the proof. $\square$
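A small numerical check can make the module version more tangible. The sketch below assumes the module $M = \mathbb{Z}^2$ over the ring $R = \mathbb{Z}$ and the powerset of a three-element set as the poset, where the Möbius function is known to be $(-1)^{|t \setminus s|}$. The names `oplus`, `star`, `q`, `bigQ` and `recovered`, and the particular vectors, are hypothetical choices for illustration.

```haskell
import Data.Bits (popCount, xor, (.&.))

-- An element of the Z-module Z^2.
type Vec = [Int]

-- Group operation of the module.
oplus :: Vec -> Vec -> Vec
oplus = zipWith (+)

-- Scalar action of the ring Z on the module.
star :: Int -> Vec -> Vec
star r = map (r *)

-- Identity element of the group.
zeroVec :: Vec
zeroVec = [0, 0]

-- Iterated group operation over a finite list.
bigOplus :: [Vec] -> Vec
bigOplus = foldr oplus zeroVec

-- Subsets of {0,1,2} encoded as bitmasks 0..7.
subsets :: [Int]
subsets = [0 .. 7]

-- s <= t in the poset means that s is a subset of t.
isSubset :: Int -> Int -> Bool
isSubset s t = s .&. t == s

-- Möbius function of the powerset: (-1)^{|t \ s|} whenever s <= t.
mu :: Int -> Int -> Int
mu s t
  | isSubset s t = if even (popCount (s `xor` t)) then 1 else -1
  | otherwise    = 0

-- Hypothetical vector-valued "parts", one vector per subset.
q :: Int -> Vec
q s = [s * s, 3 - s]

-- Statement 1: Q(x) is the module sum of q(y) over all y <= x.
bigQ :: Int -> Vec
bigQ x = bigOplus [q y | y <- subsets, isSubset y x]

-- Statement 2: recover q(x) from Q via the scalar action of mu.
recovered :: Int -> Vec
recovered x = bigOplus [mu y x `star` bigQ y | y <- subsets, isSubset y x]
```

In GHCi, `map recovered subsets == map q subsets` evaluates to `True`. The same code would work for any other `Vec`-like type that supports `oplus` and `star`, which is the point of the generalisation.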

Conclusion

The theorem is thus no longer a statement about convolutions in the incidence algebra, but about a kind of ‘incidence action’ on the module. This generalisation opens up new possibilities for applying Möbius inversion to group- or vector-valued functions. One context in which this might be useful is text embeddings. The Möbius inverse of the vector-valued embedding function would quantify the ‘emergent semantics’ of a piece of text.

Möbius Functions on Powerset Antichains

2024-02-09T14:00:00+00:00

UPDATE: I have now found an explicit formula for the Möbius function on the redundancy lattice. This will appear in a preprint soon.

I calculated the full Möbius function on the lattice of powerset antichains of 2, 3, and 4 variables. As far as I can tell, these results have not been shared before, although I’m sure the 2- and 3-variable cases have been calculated many times without being published. The results and the code are available here.

This should make higher-order information decompositions much more straightforward and computationally efficient when one is interested in more than 3 variables. Some background information on what this all means is given below.

Given a set $S$, the powerset $\mathcal{P}(S)$ is the set of all subsets of $S$. From this set, one can create a partially ordered set $P=(\mathcal{P}(S), \leq)$, where we impose the following ordering. For $s, t \in \mathcal{P}(S)$, we say that $s \leq t$ if and only if $s \subseteq t$. This is a partial ordering because not all pairs of elements are comparable. For example, if $S=\{1,2,3\}$, then $\{1\}$ and $\{2\}$ are not comparable, but it’s clear that $\{1\} \leq \{1,2\}$.

Now one can imagine functions $f: P^n \to \mathbb{C}$ on this poset. Of particular interest is the so-called Möbius function $\mu: P^2 \to \mathbb{Z}$, which is defined recursively:

\[\mu(s,t) = \begin{cases} 1 & \text{if } s = t \\ -\sum_{s < u \leq t} \mu(u, t) & \text{if } s < t\\ 0 & \text{otherwise }\\ \end{cases}\]

This function is important because it allows the Möbius inversion theorem (due to Rota, 1964) to be stated:

\[f(t) = \sum_{s \leq t} g(s) \Leftrightarrow g(t) = \sum_{s \leq t} \mu(s,t) f(s)\]

This theorem forms the basis for our understanding of interactions in complex systems (more details in an upcoming manuscript).

The Möbius function is most studied on the following posets:

The poset $(\mathbb{N}, \leq)$ of natural numbers with their natural ordering (in which case the Möbius inversion theorem reduces to the discrete fundamental theorem of calculus)
The poset $(\mathbb{N}, \leq_d)$ of natural numbers ordered by divisibility (in which case $\mu$ is the number-theoretic Möbius function, whose Dirichlet series is the reciprocal of the Riemann zeta function)
The poset $(\mathcal{P}(S), \subseteq)$ of subsets ordered by inclusion (in which case $\mu(s, t)=(-1)^{\mid t \setminus s \mid }$)
The poset $(\Pi(S), \leq_r)$ of partitions of a set $S$ ordered by refinement (in which case $\mu(\pi, \{S\})=(-1)^{\mid \pi\mid -1}(\mid \pi\mid -1)!$, where $\{S\}$ is the partition with a single block).
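To make the first of these cases concrete, here is the calculation on the chain $(\mathbb{N}, \leq)$, worked out from the recursive definition of $\mu$ given above. The recursion gives

\[\mu(n, n) = 1, \qquad \mu(n-1, n) = -\mu(n, n) = -1, \qquad \mu(n-2, n) = -\big(\mu(n-1, n) + \mu(n, n)\big) = 0,\]

and by induction $\mu(m, n) = 0$ for all $m < n - 1$. The inversion theorem then reads

\[f(n) = \sum_{k \leq n} g(k) \Leftrightarrow g(n) = f(n) - f(n-1),\]

with the convention that $f(n-1) = 0$ when $n$ is the smallest element. That is, the ‘derivative’ of a running sum is a finite difference.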

Here, however, I will focus on a much less understood poset, inspired by a branch of information theory called the Partial Information Decomposition (PID). The PID aims to decompose information into unique, redundant, and synergistic components. The key insight is that given a set of variables, one can define redundant information quantities that depend on particular combinations of the variables. The central constraint is that the redundant information should only be nontrivial for two sets of variables that are incomparable on the poset of subsets $P$ defined above. This is because a redundancy among $A$ and $B$ is only meaningful if $A$ is not a subset of $B$ (or vice versa).

Now, an important observation is that the collection of these incomparable subsets has an interesting structure. A collection of incomparable items is called an antichain (because it is in some sense the opposite of a chain on $P$), and antichains can again be given a partial order! Let’s call the collection of antichains $A$, so that for every $a, b \in A$, we can set $a \leq b$ if for every $b_i \in b$, there is an $a_i \in a$ such that $a_i \subseteq b_i$. That is, $b$ in some sense ‘covers’ $a$, and $b$ contains less redundancy than $a$.

The partial information decomposition then writes the total information a set of sources $X_1 \ldots X_n$ carries about a target $Y$ as a sum of information atoms $\Pi$ on this lattice of antichains:

\[I(\{X_1, \ldots, X_n\}; Y) = \sum_{a \in A} \Pi(a; Y)\]

To identify what the information atoms are in terms of already known information quantities, we can interpret $I(\{X_1, \ldots, X_n\}; Y)$ as a function $I: A \to \mathbb{R}$, and then apply the Möbius inversion theorem to find the information atoms $\Pi(a; Y)$.
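As a worked example, take the smallest nontrivial case of two sources, and write $\{1\}\{2\}$ for the antichain containing the subsets $\{1\}$ and $\{2\}$. The lattice then has four elements, with $\{1\}\{2\} \leq \{1\}, \{2\} \leq \{12\}$, and summing the atoms below each element gives (suppressing the target $Y$):

\[\begin{align} I(\{1\}\{2\}) &= \Pi(\{1\}\{2\}) \\ I(\{1\}) &= \Pi(\{1\}\{2\}) + \Pi(\{1\}) \\ I(\{2\}) &= \Pi(\{1\}\{2\}) + \Pi(\{2\}) \\ I(\{12\}) &= \Pi(\{1\}\{2\}) + \Pi(\{1\}) + \Pi(\{2\}) + \Pi(\{12\}) \end{align}\]

This lattice is a diamond, so its Möbius function is that of a two-element powerset, and inverting the system gives

\[\begin{align} \Pi(\{1\}) &= I(\{1\}) - I(\{1\}\{2\}) \\ \Pi(\{2\}) &= I(\{2\}) - I(\{1\}\{2\}) \\ \Pi(\{12\}) &= I(\{12\}) - I(\{1\}) - I(\{2\}) + I(\{1\}\{2\}) \end{align}\]

Here $\Pi(\{1\}\{2\})$ is the redundant atom, $\Pi(\{1\})$ and $\Pi(\{2\})$ are the unique atoms, and $\Pi(\{12\})$ is the synergistic one. How $I(\{1\}\{2\})$ itself should be defined is exactly what different PID proposals disagree on, and is left open here.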

The problem is that we don’t have a closed form solution for the Möbius function on $A$. If we did, we could calculate arbitrary information atoms and pinpoint exactly which parts of a system show synergistic information processing. In addition, a brute-force calculation is very hard, as the number of antichains grows superexponentially with the number of variables. In fact, the number of antichains on a set of $n$ variables is given by the $n$th Dedekind number (minus two), a sequence that grows so fast that only the first ten terms (for $n = 0$ to $9$) are known: 2, 3, 6, 20, 168, 7581, 7828354, 2414682040998, 56130437228687557907788, 286386577668298411128469151667598498812366. That last term was, in fact, first calculated in 2023.
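For very small $n$, brute force is still perfectly feasible. The following is a minimal Haskell sketch, not the implementation behind the stored results: it enumerates the antichains of nonempty subsets, applies the ordering defined above, and evaluates the recursive definition of $\mu$ directly. Subsets are encoded as bitmasks, and the names `antichains`, `leq`, `moebius` and `muValues` are illustrative assumptions. Without memoisation this is only practical for $n \leq 3$.

```haskell
import Data.Bits ((.&.))
import Data.List (subsequences)

-- A subset of {1..n}, encoded as a bitmask.
type Subset = Int

-- A collection of pairwise incomparable subsets, kept in increasing order.
type Antichain = [Subset]

isSubset :: Subset -> Subset -> Bool
isSubset s t = s .&. t == s

-- All nonempty antichains of nonempty subsets of {1..n}.
antichains :: Int -> [Antichain]
antichains n =
  [ c
  | c <- subsequences [1 .. 2 ^ n - 1]
  , not (null c)
  , and [not (isSubset s t) | s <- c, t <- c, s /= t]
  ]

-- a <= b iff every member of b contains some member of a.
leq :: Antichain -> Antichain -> Bool
leq a b = all (\bi -> any (`isSubset` bi) a) b

-- Recursive Möbius function on a given list of lattice elements:
-- mu(s, s) = 1, and mu(s, t) = - sum of mu(u, t) over s < u <= t.
moebius :: [Antichain] -> Antichain -> Antichain -> Int
moebius lattice s t
  | s == t    = 1
  | leq s t   = negate (sum [moebius lattice u t | u <- lattice, leq s u, leq u t, u /= s])
  | otherwise = 0

-- Every value mu(s, t) on the lattice for n variables.
muValues :: Int -> [Int]
muValues n = [moebius lattice s t | s <- lattice, t <- lattice]
  where
    lattice = antichains n
```

In GHCi, `length (antichains 3)` gives `18`, which is the fourth Dedekind number in the list above (20) minus two.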

Still, one would only need to calculate the Möbius functions on the lattice of antichains once, after which any time you want to decompose some information quantity you can just look up the value. Because this sequence grows so quickly, the literature has so far only really calculated decompositions of the information that two variables carry about a third. I wrote some code that calculates the Möbius function on the lattice of antichains (available on GitHub), and have stored the results for up to 4 variables giving information about a fifth. This is not a huge improvement, but it’s a start, and I invite anyone to see how far they can optimise my code and perhaps calculate higher Möbius functions (it’s currently a pretty basic Python implementation). I’m pretty sure the 5-variable case is doable with my current approach, but beyond that it looks like more clever approaches have to be found.

I found that the Möbius function on the lattice of antichains, up to $N=4$, only takes the values $-1$, $0$, and $1$. This is similar to Möbius functions on other posets, like Boolean algebras or positive integers ordered by divisibility, but this does not mean that finding a closed form expression for the Möbius function will be easy. For example, the Möbius function on the positive integers ordered by divisibility also only takes the values $-1$, $0$, and $1$, but a closed form expression would solve the Riemann hypothesis and win you a million dollars. Still, I suspect that the Möbius function on the lattice of antichains is more tractable than one might think, since there seems to be a close relationship between the product of antichain lattices and Boolean algebras, for which the Möbius function is well understood.

The main advantage of having these Möbius functions calculated is that new PID approaches now no longer need to solve the system of equations each time a new definition of redundancy is introduced, and can instead use the Möbius function to calculate all information atoms directly. Especially as the number of variables grows, solving the system of PID equations becomes increasingly hard, so having stored values of the Möbius function can offer a huge advantage.

Introducing: Stator

2023-12-21T14:00:00+00:00

In my PhD thesis, I developed a method and software package to infer interactions and cell states from gene expression data. Both the software and the research are now published!

The software is called Stator, and comprises the Stator Nextflow pipeline, as well as a bespoke Shiny app developed by Yuelin Yao.

We used it to find previously hidden differentiation states in the embryonic mouse brain and sub-phases of the cell cycle, and discovered liver tumour cell states that are prognostic of patient survival. Please get in touch if you are interested in using Stator for your own research!

A simple origami swallow

2023-09-02T08:00:00+00:00

I was completely mesmerised by the swallows in Portugal, and while playing around with some origami paper, I came up with this simple design for an origami swallow. It is based on the famous ‘crane base’ (steps 1-7), and implements many folds used in the standard Orizuru crane (inside-reverse folds for the wings, narrowing the bottom two flaps, etc).

Step 1

Two diagonal valley folds, and two horizontal mountain folds.

Step 2

Push the diagonals together along the valley folds.

Step 3

Flatten, keep the open sides at the bottom.

Step 4

Fold the open edges inwards (on both sides).

Step 5

Fold the top triangle down.

Step 6

Undo the previous fold.

Step 7

Open the two flaps and fold the inside triangle up (on both sides).

Step 8

Open the four sides.

Step 9

Close and flatten along the edges previously not touching.

Step 10

Fold along the bottom diagonals.

Step 11

Open the two wings with an inside reverse fold. This is a free fold that determines the angle of the wings.

Step 12

Fold open the v-shaped tail. This is another free fold.

The do-calculus of sampling from restricted Boltzmann machines

2023-01-09T08:00:00+00:00

Generative models, in particular energy-based models, are often used to sample from conditional distributions, a process known as inpainting. One of the most fundamental kinds of generative energy-based models is called a restricted Boltzmann machine (RBM), which is essentially a bipartite, glassy Ising model. Inpainting with RBMs is usually done by sampling from the visible layer while fixing the value of some visible nodes, which is an intervention, not a passive observation. I could not find a proof that the resulting interventional sampling distribution approaches the conditional distribution, so here follows an argument that it in fact does (in the case of Gibbs sampling), based on the do-calculus.

Given a Bayesian network of variables $V$ represented by a DAG $G = (V, E)$, the fundamental problem in causal inference is to estimate quantities like $p_G\left(V_A=v_a \mid do(V_B=v_b)\right)$ for an arbitrary partition of the vertices $V = V_A \cup V_B$. The do-calculus provides rules and methods for answering such questions. Unfortunately, the standard formulation of RBMs is not suitable for treatment by the do-calculus, as the network of dependencies does not form a DAG. However, by exploiting the sequential nature of Gibbs sampling, I show the following:

Lemma (RBM inpainting is conditional sampling) Consider a restricted Boltzmann machine with visible nodes $v$ and hidden nodes $h$, where $|v|>0$ and $|h|>0$. Let $p(v)$ be the marginal probability distribution over the visible nodes, and $v = v_a \cup v_b$ an arbitrary partition of the visible nodes. Consider the nodes of the RBM as random variables that evolve under alternating Gibbs sampling of the visible and the hidden layers. Define the do-operator on a variable $X$ as fixing that variable $X=x$ before generating each sample, and the see-operator as passively observing $X=x$. Then:

\[p(v_a | do(v_b)) = p(v_a | see(v_b)) = p(v_a | v_b)~~~~~~~~~~~~~~~~ (1)\]

That is, inpainting is sampling from the conditional distribution.

(proof) First note that the RBM has to be represented by a DAG. To do this, consider the nodes as random variables that evolve under Gibbs sampling. The bipartite structure of the network allows us to unroll the network in time:

where $v^i$ and $h^i$ mark the $i$’th Gibbs sample of the visible and hidden layer, respectively. Given the partition $v = v_a \cup v_b$, the following should be verified:

\[p(v^1_a \mid do(v^0_b)) = p(v^1_a \mid v^0_b) ~~~~~~~~~~~~~~~~ (2)\]

The second rule of the do-calculus states:

Rule 2 (Action/observation exchange)

\[p(y \mid do(x), do(z), w) = p(y \mid do(x), z, w) \text{ if } (Y\!\perp\!\!\!\perp Z \mid X, W)_{G_{\overline{X}\underline{Z}}}\]

where $G_{\overline{X}\underline{Z}}$ is the graph $G$ with all the arrows into $X$ and out of $Z$ removed. If $X=W=\emptyset$, then Rule 2 can be applied directly to Equation (2) by defining the following two graphs:

where $v$ could be partitioned because at any given time point the visible nodes are mutually independent conditional on the hidden layer. The value of $v_b^1$ will be discarded and set to $v_b^0$ again, but that is irrelevant in the present discussion. Now $(v_a^1 \!\perp\!\!\!\perp v_b^0)_{G^\dagger}$, which by Rule 2 implies Equation (2), and completes the proof. $\square$
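For reference, the special case of Rule 2 used in this step can be written out explicitly. Setting $X = W = \emptyset$ in the rule as stated above, and (as an assumption about the intended reading) taking $Y = v_a^1$ and $Z = v_b^0$, it becomes:

\[p(y \mid do(z)) = p(y \mid z) \text{ if } (Y\!\perp\!\!\!\perp Z)_{G_{\underline{Z}}}\]

\[p(v^1_a \mid do(v^0_b)) = p(v^1_a \mid v^0_b) \text{ if } (v_a^1 \!\perp\!\!\!\perp v_b^0)_{G_{\underline{v_b^0}}}\]

so the only thing to check is the independence of $v_a^1$ and $v_b^0$ in the graph with the arrows out of $v_b^0$ removed.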

Example As a very simple example, consider the case where $v_a=v_1$, $v_b=v_2$ and $h_1$ all comprise just a single node. The full distribution is:

\[P_G(v_1, v_2, h_1) = \frac{1}{\mathcal{Z}_G} e^{h_1 w_{11} v_1 + h_1 w_{12} v_2 + b_1 v_1 + b_2 v_2 + c_1 h_1}\]

Denoting by $(abc)$ the situation in which $v_1=a, ~v_2=b, ~h_1=c$, the conditional distribution after observing $v_2=1$ is:

\[\begin{align*} P(v_1=1|\text{see}(v_2=1)) &= \frac{P_G(v_1=1, v_2=1)}{P_G(v_2=1)}\\ &= \frac{(110) + (111)}{(111) + (110) + (011) + (010)}\\ &= \frac{e^{b_1+b_2} + e^{w_{11} + w_{12} + b_1 + b_2 + c_1}}{e^{w_{11}+w_{12}+b_1+b_2+c_1} + e^{b_1+b_2} + e^{w_{12} + b_2 + c_1} + e^{b_2}}\\ &= \frac{e^{b_1} + e^{w_{11} + w_{12} + b_1 + c_1}}{e^{w_{11}+w_{12}+b_1+c_1} + e^{b_1} + e^{w_{12} + c_1} + 1} \end{align*}\]

Now consider the intervention $do(v_2=1)$. It just adds a bias $w_{12}$ to the hidden layer. Denoting by $(ab)$ that $v_1=a, ~h_1=b$:

\[\begin{align*} P_G(v_1=1|\text{do}(v_2=1)) &= P_{G^\dagger}(v_1=1)\\ &= \frac{1}{\mathcal{Z}_{G^\dagger}} \Big((11) + (10)\Big) \end{align*}\]

Writing out this partition function:

\[\mathcal{Z}_{G^\dagger} = (11) + (10) + (01) + (00) = e^{w_{11}+b_1+c_1+w_{12}} + e^{b_1} + e^{c_1+w_{12}} + 1\]

so that

\[P(v_1=1|\text{do}(v_2=1)) = \frac{e^{w_{11}+b_1+c_1+w_{12}} + e^{b_1}}{e^{w_{11}+b_1+c_1+w_{12}} + e^{b_1} + e^{c_1+w_{12}} + 1}\]

which indeed coincides with the expression for the see-operator.
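The agreement between the two expressions can also be checked numerically. The sketch below is a minimal Haskell version of the three-node example with binary units taking values 0 and 1; the parameter values are arbitrary, and the names `weightG`, `weightDo`, `pSee` and `pDo` are hypothetical.

```haskell
-- Arbitrary parameter values, chosen only for illustration.
w11, w12, b1, b2, c1 :: Double
w11 = 0.7
w12 = -1.3
b1 = 0.2
b2 = 0.5
c1 = -0.4

-- Unnormalised weight of the joint state (v1, v2, h1) in the original model G.
weightG :: Double -> Double -> Double -> Double
weightG v1 v2 h1 =
  exp (h1 * w11 * v1 + h1 * w12 * v2 + b1 * v1 + b2 * v2 + c1 * h1)

-- see(v2 = 1): condition the joint distribution of G on v2 = 1.
pSee :: Double
pSee =
  sum [weightG 1 1 h | h <- [0, 1]]
    / sum [weightG v 1 h | v <- [0, 1], h <- [0, 1]]

-- do(v2 = 1): clamping v2 leaves a two-node model in which the hidden
-- bias is shifted from c1 to c1 + w12.
weightDo :: Double -> Double -> Double
weightDo v1 h1 = exp (h1 * w11 * v1 + b1 * v1 + (c1 + w12) * h1)

pDo :: Double
pDo =
  sum [weightDo 1 h | h <- [0, 1]]
    / sum [weightDo v h | v <- [0, 1], h <- [0, 1]]
```

For these and any other parameter values, `abs (pSee - pDo)` should vanish up to floating-point rounding, since the two expressions differ only by a common factor $e^{b_2}$ in numerator and denominator.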

Using the Open Game engine to model MEV

2022-12-15T08:00:00+00:00

(This is an update on my Ethereum Protocol Fellowship. More updates can be found here).

Extracting value from Uniswap transactions

In this post, I will outline a way in which the equilibrium analysis from the Open Game engine can be used to come up with profitable strategies for a block proposer on the Ethereum blockchain, either by re-ordering the transactions in the block, or by inserting their own. I will consider two sources of MEV here, inspired by the Clockwork Finance paper. First, I will consider transactions to a Uniswap-like exchange, that exchanges two tokens according to the constant product rule. By ordering such transactions, the block proposer can profitably manipulate the price of the tokens. Second, such price manipulation can be made even more profitable by placing a bet on the exchange rate offered by the Uniswap contract. To be able to analyse both these scenarios, I will model a blockchain with just two player accounts, $p_0$ and $p_1$, and two contract accounts: Uniswap and Bet. Most of the complexity will be captured in Haskell functions, and the Open Game engine is only used to search for profitable strategies for the block proposer $p_0$.

Let’s first define some useful data types. I assume there exist only two tokens: A and B, both of which can be held in fractional amounts. An Account has an integer AccountID, and can hold amounts of both tokens. A transaction Tx specifies a sender and receiver ID, and how much of which token is sent. A block is simply modelled as a list of transactions:

data Token = A | B
  deriving (Eq,Ord,Show)
type TokenAmount = Double
type AccountID = Int

-- ID, amount A, amount B
type Account = (AccountID, TokenAmount, TokenAmount)

type AccountStates = [Account]

-- Sender, Receiver, Token, Amount
-- Should this also have a data field?
type Tx = (AccountID, AccountID, Token, TokenAmount)

type TxBlock = [Tx]

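Let’s further define helper functions to read an account’s ID, to find the recipient of a transaction, and to get and update an account’s balance:

-- Needed further down, where the transactions are constructed
getAccountID :: Account -> AccountID
getAccountID (acID, _, _) = acID

getReceiver :: Tx -> AccountID
getReceiver (_,receiver, _, _) = receiver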

balance :: Account -> Token -> TokenAmount
balance (userID, amountA, amountB) token
    | token==A = amountA
    | otherwise =  amountB

updateAccount :: Account -> Token -> TokenAmount -> Account
updateAccount (acID, bA, bB) token amount
   | token == A = (acID, bA + amount, bB)
   | otherwise = (acID, bA, bB + amount)

A call to the Uniswap contract is a map from the current state and a transaction to a new state. To generate the new state from the old one, I use a lens on the list, imported from Control.Lens. When a user has insufficient funds, they are charged a small (gas) fee.

-- Requires: import Control.Lens ((&), (.~), ix)
uniSwapExchange :: AccountStates -> Tx -> AccountStates
-- Old state -> transaction -> new state
-- Use lenses to 'modify' account states
-- If the user does not have sufficient funds, fine the user (a gas fee) instead of swapping
uniSwapExchange accountStates (userID, uniSwapID, token, tokenAmount)
   | userbalanceInsufficient = accountStates & (ix userID) .~ userFined
   | otherwise = accountStates & (ix userID) .~ userUpdate & (ix uniSwapID) .~ uniswapUpdate
  where
   user_bal = balance (accountStates!!userID)
   userbalanceInsufficient = user_bal token < tokenAmount
   -- Charge a fee of 0.1 in the offered token, without letting the balance drop below zero
   userFined = 
      if token == A
      then (userID, maximum [0, user_bal A - 0.1] , user_bal B)
      else (userID, user_bal A ,  maximum [0, user_bal B - 0.1])
   uniSwap_bal = balance (accountStates!!uniSwapID)
   -- Constant product rule: the payout keeps (balance A) * (balance B) of the contract fixed
   dA = uniSwap_bal A * tokenAmount / (uniSwap_bal B + tokenAmount) 
   dB = uniSwap_bal B * tokenAmount / (uniSwap_bal A + tokenAmount) 
   userUpdate = 
      if token == A
      then (userID, user_bal A - tokenAmount, user_bal B + dB)
      else (userID, user_bal A + dA, user_bal B - tokenAmount)
   uniswapUpdate = 
      if token == A
      then (uniSwapID, uniSwap_bal A + tokenAmount, uniSwap_bal B - dB)
      else (uniSwapID, uniSwap_bal A - dA, uniSwap_bal B + tokenAmount)
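The pricing rule inside this contract can be isolated in a few lines. The following is a minimal, self-contained sketch of a constant product swap, stripped of accounts and lenses; the names `Pool`, `swapAForB` and `poolProduct` are hypothetical and do not appear in the model above.

```haskell
-- Reserves of the exchange: (amount of A, amount of B).
type Pool = (Double, Double)

-- Deposit t units of A into the pool. Returns the new pool and the amount
-- of B paid out, chosen so that the product of the reserves is unchanged.
swapAForB :: Pool -> Double -> (Pool, Double)
swapAForB (ra, rb) t = ((ra + t, rb - dB), dB)
  where
    dB = rb * t / (ra + t)

-- The quantity that the constant product rule keeps fixed.
poolProduct :: Pool -> Double
poolProduct (ra, rb) = ra * rb

-- Two identical swaps in a row: the second one receives less B, because
-- the first one has already made B more expensive.
firstPayout, secondPayout :: Double
firstPayout = snd (swapAForB (100, 100) 2)
secondPayout = snd (swapAForB (fst (swapAForB (100, 100) 2)) 2)
```

Starting from a pool of `(100, 100)`, depositing 2 A pays out `200/102 ≈ 1.96` B and leaves the product of the reserves at 10000 (up to rounding). This price impact is what makes the ordering of transactions matter.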

Similarly, a call to the Bet contract transforms an old state into a new one, but it also needs an AccountID that specifies which contract should be used as a price oracle. It further needs the price that would trigger a win for the bettor. Crucially, both of these are properties of the contract, and are not set by the transaction that actually places a bet of a certain amount. I’ve implemented it so that when the user wins a bet in token T, they receive a prize equal to the amount they bet, or the remaining T-balance of the betting contract if that is less.

betOnExchange :: AccountStates -> AccountID -> Double -> Tx -> AccountStates
-- Old state -> price oracle -> bet threshold -> bet transaction -> new state
-- Use lenses to 'modify' account states
betOnExchange accountStates oracleID ratio (userID, betID, betToken, betAmount)
   | (userbalanceInsufficient || betAmount<0) = accountStates & (ix userID) .~ userFined
   | ((uniSwap_bal A)/(uniSwap_bal B) >= ratio) = accountStates & (ix userID) .~ userUpdate_userWin & (ix betID) .~ betUpdate_userWin
   | otherwise = accountStates & (ix userID) .~ userUpdate_userLose & (ix betID) .~ betUpdate_userLose
  where
   uniSwap_bal = balance (accountStates!!oracleID)
   bet_bal = balance (accountStates!!betID)
   user_bal = balance (accountStates!!userID)
   userbalanceInsufficient = user_bal betToken < betAmount
   -- Charge a fee of 0.1 in the bet token, without letting the balance drop below zero
   userFined = 
      if betToken == A
      then (userID, maximum [0, user_bal A - 0.1] , user_bal B)
      else (userID, user_bal A ,  maximum [0, user_bal B - 0.1])
   -- The prize is either the amount bet, or the remaining balance if that is less
   prize = minimum [betAmount, bet_bal betToken]
   userUpdate_userWin = updateAccount (accountStates!!userID) betToken prize
   betUpdate_userWin =  updateAccount (accountStates!!betID) betToken (-prize)
   -- A losing user pays exactly what the contract receives
   userUpdate_userLose = updateAccount (accountStates!!userID) betToken (-betAmount)
   betUpdate_userLose = updateAccount (accountStates!!betID) betToken betAmount

That is all the needed contract complexity. The only thing needed now is to initialise the accounts and make sure the right internal code gets triggered when a transaction is made to a contract account (in fact, I will only worry about contract-calling transactions here). Let’s initialise the accounts of two players and the two contracts with the following balances:

p0_ac = (0, 10, 10)
p1_ac = (1, 10, 10)
uniswap_ac = (2, 100, 100)
bet_ac = (3, 100, 100)

initAccounts :: AccountStates
initAccounts = [p0_ac, p1_ac, uniswap_ac, bet_ac]

I will interpret $p_0$ as the block proposer. Since, in this limited example, $p_0$ does not actually care who makes the other contract calls, one other player, $p_1$, should be sufficient. I then specify how each contract call should be executed, specifying the account with index 2 (the Uniswap account) as the price oracle for the betting contract, and setting the winning threshold at a token ratio of 1.1.

executeTx :: AccountStates -> Tx -> AccountStates
executeTx states tx
   | getReceiver tx == 2 = uniSwapExchange states tx
   | getReceiver tx == 3 = betOnExchange states 2 1.1 tx
   | otherwise = states

Executing a whole block of transactions is then simply a foldl of executing each transaction, since the accounts get updated each time:

executeBlock :: AccountStates -> TxBlock -> AccountStates
executeBlock accountStates_init block = foldl executeTx accountStates_init block

There are many definitions of MEV, and people still disagree on the right definition. Here, I will simply choose Token A as the relevant holder of value (also called the numéraire), so the final A balance determines the payoff, which is equal to the MEV:

blockPayoff :: AccountStates -> TxBlock -> AccountID -> Payoff
-- Old state, block, payoff for user "userID"
blockPayoff initStates block userID = newBalance - oldBalance
  where
   newBalance = balance ((executeBlock initStates block)!!userID) A
   oldBalance = balance (initStates!!userID) A

To analyse the extractable value by the proposer $p_0$, let’s imagine a mempool with transactions in which both players exchange tokens A and B in both directions.

tx1 = (getAccountID p0_ac, getAccountID uniswap_ac, A, 2.0)
tx2 = (getAccountID p1_ac, getAccountID uniswap_ac, A, 3.0)
tx3 = (getAccountID p0_ac, getAccountID uniswap_ac, B, 2.0)
tx4 = (getAccountID p1_ac, getAccountID uniswap_ac, B, 3.0)

block1 :: TxBlock
block1 = [tx1, tx2, tx3, tx4]

The strategy of the proposer is simply a choice of an ordering of these four transactions, which can be implemented as the (trivial) open game as follows:

txOrderingGame  = [opengame|
   inputs    :      ;
   feedback  :      ;

   :----------------------------:
   inputs    :      ;
   feedback  :      ;
   operation : dependentDecision "proposer" (const actionSpace);
   outputs   : ordering ;
   returns   : blockPayoff initAccounts (blockPerm ordering) 0     ;
   :----------------------------:

   outputs   :      ;
   returns   :      ;
  |]
  where
   actionSpace = [0..(product [1..4]-1)]
   blockPerm = \x -> ((permutations block1)!!x)

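analyseTxOrderingGame strat = generateIsEq $ evaluate txOrderingGame strat void

-- choosePerm is used in the sessions that follow but is not defined in this post.
-- An assumed definition, modelled on betAndOrderStrat further down:
choosePerm :: Int -> List '[Kleisli Stochastic () Int]
choosePerm orderChoice = Kleisli (\x -> playDeterministically orderChoice) ::- Nil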

What if the proposer decides to put their interactions first, i.e. order the block as [tx1, tx3, tx2, tx4]? This is the permutation with index 5 in the list returned by permutations, so it can be analysed as follows:

λ: analyseTxOrderingGame $ choosePerm 5

----Analytics begin----
 Strategies are NOT in equilibrium. Consider the following profitable deviations: 

Player: proposer
Optimal Move: 1
Current Strategy: fromFreqs [(5,1.0)]
Optimal Payoff: 0.15964740450538706
Current Payoff: 0.03920031360250853
 --other game-- 
 --No more information--
 NEWGAME: 
----Analytics end----

It can be seen that there is a bit of ‘profit’ using this ordering, though not a lot. Using the permutation with index 1, i.e. [tx2, tx1, tx3, tx4], yields more than four times as much profit. This is because if both players first exchange A for B, then the price of A decreases, which means you get more A for your B, and so exchanging 2 B for A with tx3 leaves the proposer with more A in total. This means that even though the engine suggests 1 as the optimal permutation, the identity permutation with index 0 should also be optimal, which indeed it is:

λ: analyseTxOrderingGame $ choosePerm 0

----Analytics begin----
 Strategies are in equilibrium
 NEWGAME: 
----Analytics end----

In fact, permutation 0 might be considered slightly better because it leaves the proposer (indexed with !!0) with slightly more B:

λ: (executeBlock initAccounts [tx2, tx1, tx3, tx4])!!0
(0,10.159647404505387,9.849283402681461)
λ: (executeBlock initAccounts [tx1, tx2, tx3, tx4])!!0
(0,10.159647404505387,9.96078431372549)

However, since B is deemed irrelevant for the total MEV, this is not considered a profitable deviation.
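The same numbers can be reproduced without the engine by simply trying all 24 orderings. The sketch below is a self-contained simplification, not the model above: it only tracks the two players and the pool, it omits the insufficient-funds fee (which is never triggered with these balances), and the names `World`, `Swap`, `applySwap`, `ranked` and `bestPayoff` are hypothetical.

```haskell
import Data.List (permutations)

data Token = A | B deriving (Eq, Show)

-- Balances as (amount of A, amount of B).
type Bal = (Double, Double)

-- The two players (indices 0 and 1) and the reserves of the exchange.
data World = World { players :: [Bal], pool :: Bal }

-- (sender index, token sent to the exchange, amount).
type Swap = (Int, Token, Double)

-- Execute one swap under the constant product rule.
applySwap :: World -> Swap -> World
applySwap (World ps (ra, rb)) (i, tok, t) = World ps' pool'
  where
    (ua, ub) = ps !! i
    (user', pool') = case tok of
      A -> let dB = rb * t / (ra + t) in ((ua - t, ub + dB), (ra + t, rb - dB))
      B -> let dA = ra * t / (rb + t) in ((ua + dA, ub - t), (ra - dA, rb + t))
    ps' = [if j == i then user' else p | (j, p) <- zip [0 ..] ps]

initWorld :: World
initWorld = World [(10, 10), (10, 10)] (100, 100)

-- The same four transactions as tx1..tx4 in the text.
mempool :: [Swap]
mempool = [(0, A, 2), (1, A, 3), (0, B, 2), (1, B, 3)]

-- Payoff of player 0 (the proposer), measured in token A.
payoff :: [Swap] -> Double
payoff block = fst (players final !! 0) - fst (players initWorld !! 0)
  where
    final = foldl applySwap initWorld block

-- Payoff of every ordering, indexed as in Data.List.permutations.
ranked :: [(Int, Double)]
ranked = zip [0 ..] (map payoff (permutations mempool))

bestPayoff :: Double
bestPayoff = maximum (map snd ranked)
```

Here `ranked !! 5` has a payoff of about 0.0392, while `ranked !! 0` and `bestPayoff` are both about 0.1596, matching the current and optimal payoffs reported by the engine.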

Betting on Uniswap as a price oracle

Now consider the possible insertion of a transaction to the betting account. I will make this transaction a function that bets a certain amount, so that

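-- txBet is used here but not defined in this post. An assumed definition,
-- in which the proposer bets the given amount of token A:
txBet :: TokenAmount -> Tx
txBet amount = (getAccountID p0_ac, getAccountID bet_ac, A, amount)

-- The bet takes the place of tx4 in the block
blockWithbet :: TokenAmount -> TxBlock
blockWithbet amount = [tx1, tx2, tx3, txBet amount]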

Letting the player bet multiples of 0.1 of a full token, the full game simply becomes the composition of two choices: what to bet, and how to order, as follows

txOrderingGame_withBet  = [opengame|
   inputs    :      ;
   feedback  :      ;

   :----------------------------:

   inputs    :      ;
   feedback  :      ;
   operation : dependentDecision "proposer" (const betAmounts);
   outputs   : betAmount ;
   returns   :  0   ;

   inputs    :      ;
   feedback  :      ;
   operation : dependentDecision "proposer" (const actionSpace);
   outputs   : ordering ;
   returns   : blockPayoff initAccounts (blockPerm ordering betAmount) 0  ;
   :----------------------------:

   outputs   :      ;
   returns   :      ;
  |]
  where
   betAmounts = [0,0.1..(balance (initAccounts!!0) A)]
   actionSpace = [0..(product [1..4]-1)]
   blockPerm = \orderChoice betAmount -> ((permutations $ blockWithbet betAmount)!!orderChoice)

which then also requires two separate strategies:

betAndOrderStrat :: Double -> Int -> List '[Kleisli Stochastic () Double,
   Kleisli Stochastic () Int]
betAndOrderStrat amount orderChoice = Kleisli (\x -> playDeterministically amount) ::- Kleisli (\x -> playDeterministically orderChoice) ::- Nil

analyseTxOrderingGame_withBet strat = generateIsEq $ evaluate txOrderingGame_withBet strat void

To analyse this game, let’s first consider the strategy of betting 4A, and not doing any reordering:

λ: analyseTxOrderingGame_withBet $ betAndOrderStrat 4 0

----Analytics begin----
 Strategies are NOT in equilibrium. Consider the following profitable deviations: 

Player: proposer
Optimal Move: 0.0
Current Strategy: fromFreqs [(4.0,1.0)]
Optimal Payoff: 0.15964740450538706
Current Payoff: -3.840352595494613
 --other game-- 
 --No more information--
 NEWGAME: 

 Strategies are NOT in equilibrium. Consider the following profitable deviations: 

Player: proposer
Optimal Move: 17
Current Strategy: fromFreqs [(0,1.0)]
Optimal Payoff: 4.159647404505387
Current Payoff: -3.840352595494613
 --other game-- 
 --No more information--
 NEWGAME: 
----Analytics end----

The engine reports two profitable deviations. First, the proposer is better off not betting at all, since they don’t win with this ordering anyway. Second, there is a more profitable ordering, the permutation indexed by 17, which corresponds to [tx2, tx1, txBet, tx3]. Following both these suggestions, i.e. not betting and reordering, leads to the following:

λ: analyseTxOrderingGame_withBet $ betAndOrderStrat 0 17

----Analytics begin----
 Strategies are NOT in equilibrium. Consider the following profitable deviations: 

Player: proposer
Optimal Move: 8.0
Current Strategy: fromFreqs [(0.0,1.0)]
Optimal Payoff: 8.159647404505385
Current Payoff: 0.15964740450538706
 --other game-- 
 --No more information--
 NEWGAME: 

 Strategies are in equilibrium
 NEWGAME: 
----Analytics end----

That is, the engine recognises that with this ordering, not only does the proposer extract the same reordering MEV as before, they can now also insert their own bet in the middle, and can profit maximally by betting everything they still have:

λ: analyseTxOrderingGame_withBet $ betAndOrderStrat 8 17

----Analytics begin----
 Strategies are in equilibrium
 NEWGAME: 

 Strategies are in equilibrium
 NEWGAME: 
----Analytics end----

One advantage of this approach is that it makes clear that MEV depends not only on the transactions in the mempool, but also on the internal states of the contracts. In the current setup, betting 4A is profitable, but betting 8A is even more so:

λ: analyseTxOrderingGame_withBet $ betAndOrderStrat 4 14

----Analytics begin----
 Strategies are NOT in equilibrium. Consider the following profitable deviations: 

Player: proposer
Optimal Move: 8.0
Current Strategy: fromFreqs [(4.0,1.0)]
Optimal Payoff: 8.159647404505385
Current Payoff: 4.159647404505387
 --other game-- 
 --No more information--
 NEWGAME: 

 Strategies are in equilibrium
 NEWGAME: 
----Analytics end----

However, when the Uniswap contract is initialised with 1000 of each token rather than 100, there is no longer enough volume in the mempool to move the price of A far enough to win the bet. The 8.16A of MEV then all but disappears, and the most profitable strategy is to not bet at all:

λ: analyseTxOrderingGame_withBet $ betAndOrderStrat 4 14

----Analytics begin----
 Strategies are NOT in equilibrium. Consider the following profitable deviations: 

Player: proposer
Optimal Move: 0.0
Current Strategy: fromFreqs [(4.0,1.0)]
Optimal Payoff: 1.599784433289031e-2
Current Payoff: -3.984002155667109
 --other game-- 
 --No more information--
 NEWGAME: 

 Strategies are in equilibrium
 NEWGAME: 
----Analytics end----
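
The effect of pool size on price manipulation can be illustrated in isolation. The following is a minimal sketch of a constant-product pool, assuming no trading fee and a trade size of 10 A chosen purely for illustration; Pool, priceA, sellA and priceImpact are hypothetical names and not the contract code used in the game above:

```haskell
-- Fee-less constant-product pool holding reserves of token A and token B.
-- (Assumption: the real contract may charge a fee, which is ignored here.)
data Pool = Pool { reserveA :: Double, reserveB :: Double }

-- Price of A in units of B.
priceA :: Pool -> Double
priceA (Pool a b) = b / a

-- Sell dA of token A into the pool, keeping reserveA * reserveB constant.
sellA :: Double -> Pool -> Pool
sellA dA (Pool a b) = Pool a' (a * b / a')
  where
    a' = a + dA

-- Relative change in the price of A caused by selling dA into the pool.
priceImpact :: Double -> Pool -> Double
priceImpact dA pool = (priceA (sellA dA pool) - priceA pool) / priceA pool
```

Under these assumptions, priceImpact 10 (Pool 100 100) is a drop of roughly 17%, while priceImpact 10 (Pool 1000 1000) is a drop of roughly 2%: the same transactions move the price far less in the deeper pool.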

Conclusion

It is satisfying to see MEV directly as the payoff of this block proposer game. However, it is a bit unfortunate that all of the compositional structure of the transaction execution is hidden in the foldl application, rather than expressed as an explicit composition of open games. A logical next step would be to implement something similar, but model each contract call as an open game (potentially just a lifted function), so that player strategies and contract calls can be represented at the same level. Furthermore, because the betting and ordering are currently two different games, the engine only checks deviations in one decision at a time, and so cannot always identify a deviation of both strategies that is only jointly profitable. For example, it considers not betting and not reordering an equilibrium:

λ: analyseTxOrderingGame_withBet $ betAndOrderStrat 0 0

----Analytics begin----
 Strategies are in equilibrium
 NEWGAME: 

 Strategies are in equilibrium
 NEWGAME: 
----Analytics end----

Indeed, when not betting at all, the 0th permutation yields all possible MEV. However, there are better global deviations, as can be seen by comparing the final A balance of the proposer under the ‘don’t bet, don’t reorder’ strategy

λ: (executeBlock initAccounts $ (permutations $ blockWithbet 0)!!0)!!0
-- (ID, A balance, B balance)
(0,10.159647404505387,9.96078431372549)

with a bolder ‘bet 8 and choose ordering 17’ strategy

λ: (executeBlock initAccounts $ (permutations $ blockWithbet 8)!!17)!!0
-- (ID, A balance, B balance)
(0,18.159647404505385,9.849283402681461)

This shows that there is a further 8A of MEV (a final balance of 18.16A rather than 10.16A) that the Open Game engine does not find as a profitable deviation in this setup. A different representation of this game might therefore be better.
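
The gap between unilateral and joint deviations can be reproduced in a toy setting. The sketch below uses a hypothetical payoff function, toyPayoff, whose numbers are only loosely modelled on the outputs above and are not computed by blockPayoff; the other names are mine as well. It checks each decision separately, as the two-game setup does, and then searches the full product of both action spaces:

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Hypothetical toy payoff, NOT the post's blockPayoff: the bet only
-- pays off when it is combined with the reordering.
toyPayoff :: Double -> Int -> Double
toyPayoff bet order
  | order == 1 = 0.16 + bet -- reordered block: the bet wins
  | otherwise  = 0.16 - bet -- original order: the bet loses

-- Toy action spaces for the two decisions.
bets :: [Double]
bets = [0, 8]

orders :: [Int]
orders = [0, 1]

-- True if changing only the bet, or only the ordering, never strictly
-- improves the payoff.
unilaterallyStable :: (Double, Int) -> Bool
unilaterallyStable (b, o) =
  all (\b' -> toyPayoff b' o <= toyPayoff b o) bets
    && all (\o' -> toyPayoff b o' <= toyPayoff b o) orders

-- Best joint choice over the full product of both action spaces.
bestJoint :: (Double, Int)
bestJoint =
  maximumBy (comparing (uncurry toyPayoff)) [(b, o) | b <- bets, o <- orders]
```

In this toy game (0, 0) passes the unilateral check even though the joint search picks (8, 1), which is the same pattern as above. Modelling bet and ordering as a single decision over pairs would let the engine see such deviations, at the cost of a larger action space.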

Another problem with this approach is that it does not scale well. Since the payoff of each ordering has to be calculated separately, the time complexity scales factorially in the number of transactions per block. Analysing a block with 7 Uniswap transactions and 1 Bet transaction already took around 45 seconds on my laptop. It perhaps makes more sense to come up with a set of reasonable strategies to shrink the action space to a more manageable size.
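
To make the factorial growth concrete, here is a small sketch that counts the orderings the engine has to evaluate for a given block size (orderingCount and orderingTable are hypothetical helper names, not part of the engine):

```haskell
-- Number of orderings of a block with n transactions, i.e. n!.
orderingCount :: Integer -> Integer
orderingCount n = product [1 .. n]

-- Block sizes from 4 to 10 paired with their ordering counts.
orderingTable :: [(Integer, Integer)]
orderingTable = [(n, orderingCount n) | n <- [4 .. 10]]
```

The four-transaction block used in this post has 24 orderings, while the eight-transaction block mentioned above already has 40320, and a block of ten would have 3628800.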