Skip to content

Pairwise Compatibility Based Puzzle Solving

emarj edited this page Jun 9, 2025 · 6 revisions

Puzzle, Pieces and features

What is precisely a piece? It is difficult to say. It could be an image, a polygon, a point cloud, etc. and it could be enriched with some features, like segmentation masks, bounding boxes, colours, hyper-spectral information etc.

To formalize mathematically all these possibilities would be very time consuming. We will just say that we have a set universal set $\mathcal{P}$ of pieces. A puzzle is a finite subset of this universe of cardinality $N$, whose elements we denote by $P_i$ with $i \in [N]$, where $[N] := {0,1,\dots,N-1}$. We think of an element $P$ as an object containing all the information and features at our disposal.

In practice, pieces are given as an object with basic features (image or point cloud for example) and then more advances features are extracted after. Moreover, if some form of statical learning is used, some of these features could be implicitly extracted during the training process.

Sometime it can be necessary to distinguish between pieces $P_i$ and their extracted features $\mathrm{F}(P_i)$, we will not always do that in the following. To be more precise, there is be a map $\mathrm{F}: \mathcal{P} \to \mathcal{F}$ called feature extractor. Let $\psi$ be function that computes some quantity given a pair of pieces $P_i,P_j$. This will be encoded as a map $\psi: \mathcal{P}^2 \to H$, but most likely the map will factor through the feature extractor, meaning that there exist a map $\hat{\psi}: \mathcal{F}^2 \to H$ such that $\hat{\psi} \circ \mathrm{F} = \psi$.

Configurations

An object in the plane can described by the position $\mathrm{x} = (x,y) \in \mathbb{R}^2$ of its centroid and the angle $\theta$ of rotation around it.

Angles

We want to account for the periodicity of the angle. Thus we describe each object by an element of $Q := \mathbb{R}^2 \times S^1$, where $S^1$ is the unitary circle. We can describe $S^1$ as $\mathbb{R}/2\pi\mathbb{Z}$, that is, we identify two angles if they differ by an integer multiple of $2\pi$, or as a subset $\{e^{i\theta} : \theta \in \mathbb{R}\}$ of the complex plane.

To implement this effectively we could store a unitary complex number $e^{i\theta}= \cos \theta +i\sin \theta$ and use complex operations on it. This is somewhat equivalent to use quaternions for 3D rotations.

If we implement this simply as a number we should manually keep track of the periodicity.

Relative Configurations

A pair of objects in the plane is described by an element of $Q^2 := Q \times Q$.

A translation $T_{\mathrm{v}}$ by a vector $\mathrm{v}$ acts on $Q$ by

$$T_{\mathrm{v}} \cdot (\mathrm{x},\theta) = (\mathrm{x} + \mathrm{v},\theta)$$

and rotations $R_{\alpha}$ by angle $\alpha$ around $O = (0,0)$ acts on $Q$ by

$$R_\alpha \cdot (\mathrm{x},\theta) = (R_\alpha\mathrm{x},\theta + \alpha)$$

Let $E^+(2)$ be the group of rigid motions generated by rotations around the origin and translations. This group acts on $Q$ as described above and also on $Q^2$ simply by setting $g \cdot (q_1,q_2) := (g\cdot q_1, g\cdot q_2)$.

In puzzle solving one is mainly interested in relative configurations between objects. More precisely, we want pairs of absolute configurations to be equivalent if they differ by rotations or translations. Thus the set of relative configurations can be seen as the orbit group of $Q^2$ under the action of the group of rigid motions $E^+(2)$. We define $Q_{rel} := Q^2/E^+(2)$.

There is a bijection $Q \cong Q_{rel}$ constructed as follows:

$$Q \to Q_{rel},\quad (\mathrm{x},\theta) \mapsto E^+(2)\cdot((O,0),(\mathrm{x},\theta))$$

In more layman terms, this means that we can describe relative configurations by fixing the first object in the origin, not rotated, and by giving the absolute configuration of the second object. From now on we will simply adopt this convention and we identify $Q$ and $Q_{rel}$.

Given a pair of absolute configurations $((\mathrm{x},\theta), (\mathrm{x}',\beta)) \in Q^2$, we can get a relative configuration in standard form by applying $T_{-\mathrm{x}}$ followed by $R_{-\theta}$, obtaining

$$(R_{-\theta}(\mathrm{x}' - \mathrm{x}),\beta - \theta)$$

Remark. If $(\mathrm{x},\theta) = [(O,(\mathrm{x},\theta))]$ is the configuration of object $i$ with respect to $j$, the configuration of object $j$ with respect to $i$ is obtained as follows. Since

$$R_{-\theta} \circ T_{(-\mathrm{x})} \cdot ((O,0),(\mathrm{x},\theta)) = R_{-\theta} \cdot ((-\mathrm{x},0),(O,\theta)) = ((R_{-\theta}(-\mathrm{x}),-\theta),(O,0))$$

by exchanging the roles of object $i$ and $j$, that is $(q_1,q_2) \mapsto (q_2,q_1)$, we conclude that the relative configuration of object $j$ with respect to $i$ in standard form is

$$J(\mathrm{x},\theta) := (- R_{-\theta}(\mathrm{x}),-\theta)$$

Notice that $J: Q \to Q$ defines an involution, that is $J^2 = \mathrm{id}$.

In summary, a relative configuration is a pair $(\mathrm{x},\theta)$ (or a triple $(x,y,\theta)$ if you prefer) that represents the configuration of the second object with respect to the first.

Solutions

Intuitively speaking, a solution to a puzzle $\mathbb{P}$ of size $N$ can be described as a list of absolute configurations, one for each pieces, that is, a solution can be seen as an element of $Q^N$. The orbit of $E^+(2)$ acting on $Q^N$ is a class of equivalent solutions. Thus a solution should be defined more precisely as an equivalence class, that is, as an element of $Q^N/E^+(2)$.

As was done for the relative configuration of pairs of pieces, even in this case we can select an anchor and express a solution with respect to it. Let $k \in [N]$ be the index of the anchored piece, then we can select as representative the solution such that $P_k$ is the origin.

There are other ways to describe a solution, but they should be equivalent to this one. For example a solution could be given as a list of $N-1$ relative configurations $q_{ij}$ with $(i,j)$ and edge of $T(K_N)$, a spanning tree of the complete graph $K_N$. To get a solution as above, is enough to fix an anchor piece $p$ and a star centered in $p$ as a spanning tree.


Relative Pairwise Compatibility

Configuration score function and Pairwise compatibility

A configuration scoring (or score) function (CSF) is a mapping $$\phi: Q_{rel} \to \mathbb{R}$$that to each configuration assigns a score. The set of scoring functions is denoted by $\mathrm{CSF}(Q_{rel})$.

A (pairwise) compatibility function (PCF) is a function that takes in input two pieces and gives a configuration score function

$$\Psi: \mathcal{P}^2 \to \mathrm{CSF}(Q_{rel})$$

or, equivalently, this can be seen a real valued function on concrete relative configurations:

$$\Psi: \mathcal{P}^2 \times Q_{rel} \to \mathbb{R}$$

We set $\Psi(P,P) = 0$ for every $P \in \mathcal{P}$.

We must have

$$\Psi(P_1,P_2,(\mathrm{x},\theta)) = \Psi(P_2,P_1,J(\mathrm{x},\theta)) = \Psi(P_2,P_1,(R_{-\theta}(-\mathrm{x}),-\theta)) $$

for pieces $P_1,P_2 \in \mathcal{P}$, with $R_\theta$ and $J$ as above (see Configurations).


Given natural number $n$, a pairwise score matrix (PSM) is a matrix of scoring functions, that is, an element $\Phi$ of $M_n(\mathrm{CSF(Q_{rel})})$, or more explicitly

$$\Phi: [N]^2 \to \mathrm{CSF}(Q_{rel}): (i,j) \mapsto \Phi_{ij}$$

Let ${P_1,\dots,P_n} \subset \mathcal{P}$ be a puzzle. A PCF induces a PSM by setting:

$$ \Phi_{ij} := \Psi(P_i,P_j) $$

We will often abuse the notation and denote this simply by $\Psi_{ij}$.

Allowed Region

This kind of compatibility allows for pieces to be compatible even if they are not touching. This could be helpful when we exploit certain features that can have an "action at a distance". Example line continuation... In many cases, however, we require the pieces to touch.

On the other hand, in most cases we do not want pieces to overlap/compenetrate, thus $\Phi_{ij}(x,y,\theta)$ will be negative for every overlap configuration.

Clone this wiki locally