\section{Outlook}
The first problem that should be tackled is the gap between validation and test results on the AIDS dataset. Increasing the number of graphs might already close this gap. However, to verify any improvement we should also consider working with multiple completely distinct validation sets at once.
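A minimal sketch of such a protocol, assuming the dataset is available as a plain list of graphs (all function names, split sizes and the seed below are placeholders for illustration, not part of the current pipeline):
\begin{verbatim}
import random

def make_disjoint_validation_sets(graphs, n_sets=3, set_size=100, seed=0):
    # Split off several completely disjoint validation sets; the remaining
    # graphs form the training set. Sizes and seed are illustrative only.
    rng = random.Random(seed)
    pool = list(graphs)
    rng.shuffle(pool)
    val_sets = [pool[i * set_size:(i + 1) * set_size] for i in range(n_sets)]
    train_set = pool[n_sets * set_size:]
    return train_set, val_sets
\end{verbatim}
An improvement would then only be trusted if it shows up consistently across all validation sets rather than on a single one.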
The underperforming consensus step should also be improved in the future. Since we can produce a solid initial matching, simply applying the same ``softmax-no-norms'' consensus of \cite{fey2020_update} should yield results at least as good as those they report. Exactly replicating their model is therefore a good starting point and would allow us to investigate which modifications can improve these results further.
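As a first step towards such a replication, the local part of the matching can be sketched as a row-wise softmax over embedding similarities. This PyTorch-style fragment only illustrates the general idea and is not a faithful re-implementation of the architecture of \cite{fey2020_update}:
\begin{verbatim}
import torch

def soft_correspondence(h_src, h_tgt):
    # h_src: [N, d] source node embeddings, h_tgt: [M, d] target node embeddings.
    # Returns an [N, M] soft assignment in which every source node
    # distributes its probability mass over the target nodes.
    sim = h_src @ h_tgt.t()
    return torch.softmax(sim, dim=-1)
\end{verbatim}
The consensus stage of their model then iteratively refines this initial matrix using neighbourhood information from both graphs.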
After that, we might want to revisit the four defining properties of a metric:
\begin{itemize}
\itemsep0em
\item $d(x,y) \ge 0$ non-negativity
\item $d(x,y) = 0 \Leftrightarrow x = y$ identity of indiscernibles
\item $d(x,y) = d(y,x)$ symmetry
\item $d(x,y) \le d(x,z) + d(z, y)$ subadditivity or triangle inequality
\end{itemize}
We already ensured non-negativity and symmetry, but it would certainly be interesting to investigate whether we can also ensure the triangle inequality. Perhaps that leads to a small modification of the model that induces a strong and useful prior. Investigating the identity of indiscernibles could likewise yield interesting results: if this property could be guaranteed by an efficiently computable model, it would decide the graph isomorphism problem and thereby settle the long-standing question of its computational complexity. However, doing so requires more powerful embedding networks, because a standard GNN cannot distinguish between certain graphs\footnote{For example, a GNN cannot distinguish between a hexagon and two disconnected triangles.} and will therefore yield the same set of embeddings in such cases.

Maybe easier to handle, and therefore more promising, is translational invariance, $d(x,y) = d(x+a,y+a)$, which might be a valid assumption in many cases. Graph sums can be defined by adding the adjacency matrices of equal-sized graphs (\citealp{graph_sum2004}) or, in the context of the graph edit distance, perhaps by adding a node that is connected to all other nodes. If no model modifications that enforce translational invariance or the triangle inequality can be derived, one could still use these properties for data augmentation, which could certainly lead to better generalization.
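To make the graph-sum idea concrete, here is a small sketch of both variants mentioned above, operating on dense adjacency matrices; how node labels and edit costs would be handled is deliberately left open:
\begin{verbatim}
import numpy as np

def graph_sum(adj_x, adj_a):
    # Sum of two equal-sized graphs via their adjacency matrices; entries are
    # clipped so that the result is again a simple graph.
    assert adj_x.shape == adj_a.shape
    return np.clip(adj_x + adj_a, 0, 1)

def add_universal_node(adj):
    # Alternative "translation" in the edit-distance setting: append one node
    # that is connected to all existing nodes.
    n = adj.shape[0]
    out = np.zeros((n + 1, n + 1), dtype=adj.dtype)
    out[:n, :n] = adj
    out[n, :n] = 1
    out[:n, n] = 1
    return out
\end{verbatim}
For data augmentation one would apply the same translation $a$ to both graphs of a training pair and keep the target distance unchanged, exploiting $d(x,y) = d(x+a,y+a)$.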
A different direction for further research is to speed up the current model. It seems inevitable that the whole process is quadratic in the number of nodes due to the use of the cost matrix $C \in \mathbb{R}^{N \times N}$. However, we might be able to achieve linear run time by using the Nystr{\"{o}}m method or multiscale matching. \cite{nytrom2019} show that, using the Nystr{\"{o}}m method together with Sinkhorn scaling, one can accurately approximate the Sinkhorn distance without ever computing the full cost matrix. The Nystr{\"{o}}m method relies on a set of landmarks to speed up the matching; similarly, multiscale matching matches clusters of nodes against each other (\citealp{multiscale2016}). Ideally, one would combine both approaches to increase accuracy, since landmarks and clusters have complementary weaknesses: landmarks can vastly overestimate small distances, whereas multiscale matching never computes distances above a certain threshold and instead approximates them with infinity.
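A rough sketch of this idea, assuming node embeddings in $\mathbb{R}^d$, uniform marginals and a Gaussian Gibbs kernel (the landmark choice and all parameters are assumptions made for illustration, not the algorithm of \cite{nytrom2019} verbatim):
\begin{verbatim}
import numpy as np

def gibbs_kernel(a, b, eps):
    # exp(-||a_i - b_j||^2 / eps) between two point sets.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / eps)

def nystrom_sinkhorn(x, y, landmarks, eps=0.1, iters=200):
    # Sinkhorn scaling with a Nystroem-approximated kernel: the full N x M
    # kernel is never formed, only thin factors through the landmarks.
    k_xl = gibbs_kernel(x, landmarks, eps)                      # N x L
    k_ly = gibbs_kernel(landmarks, y, eps)                      # L x M
    k_ll_inv = np.linalg.pinv(gibbs_kernel(landmarks, landmarks, eps))

    apply_k = lambda v: k_xl @ (k_ll_inv @ (k_ly @ v))          # ~ K v
    apply_kt = lambda u: k_ly.T @ (k_ll_inv.T @ (k_xl.T @ u))   # ~ K^T u

    a = np.full(x.shape[0], 1.0 / x.shape[0])                   # uniform marginals
    b = np.full(y.shape[0], 1.0 / y.shape[0])
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(iters):
        # Clamp to guard against tiny or negative values from the approximation.
        u = a / np.maximum(apply_k(v), 1e-30)
        v = b / np.maximum(apply_kt(u), 1e-30)
    return u, v  # scalings from which the transport plan and cost can be read off
\end{verbatim}
With $L$ landmarks every operation above costs roughly $O(NL)$ instead of $O(N^2)$, which is where the hoped-for speed-up would come from.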
To showcase the advantage of our full BP cost matrix, it would be interesting to investigate ``sparse matchings''. \cite{fey2020_update} use only pairs of graphs in which every node of the source graph finds a match in the target graph. We could simply add the remaining pairs to the training set while validating on both groups individually. There is a good chance that we would significantly outperform previous methods on this specialized task.
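A hypothetical helper for that evaluation protocol could look as follows; the pair format and the name \texttt{matching} are assumptions for this sketch, not part of the existing code base:
\begin{verbatim}
def split_by_coverage(pairs):
    # Separate pairs in which every source node has a ground-truth partner in
    # the target graph (the setting of Fey et al.) from the remaining "sparse"
    # pairs, so that both groups can be validated individually.
    dense, sparse = [], []
    for source, target, matching in pairs:
        if all(partner is not None for partner in matching):
            dense.append((source, target, matching))
        else:
            sparse.append((source, target, matching))
    return dense, sparse
\end{verbatim}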