class: middle, center, title-slide
Lecture 9: Graph neural networks
Prof. Gilles Louppe
???
R: since the lecture is short, add a code example at the end. Check UDL
- Graphs
- Graph neural networks
- Special cases
- Applications
class: middle
Many real-world problems do not fit into the tabular format of machine learning. Instead, many problems involve data that is naturally represented as .italic[graphs]:
- (bio)molecular structures,
- traffic networks,
- scene graphs,
- social networks,
- computer programs,
- ... and many more!
class: middle
Molecules are naturally represented as graphs, where nodes are atoms and edges are bonds.
Features can be associated with each node and edge, e.g. the atomic number, the number of bonds, etc.
.footnote[Credits: Petar Veličković, CST Wednesday Seminar, 2021.]
class: middle
An interesting problem is to predict whether a molecule is a potent drug. This can be formulated as a binary classification problem, where
- the input is a graph representation of the molecule and
- the output is a binary label (e.g., whether the drug will inhibit bacterial growth).
.footnote[Credits: Petar Veličković, CST Wednesday Seminar, 2021.]
???
- Binary classification on whether the drug will inhibit bacterial growth (E. coli).
- Train a graph neural network (GNN) on a curated dataset of $O(10^4)$ known drugs.
Once a GNN can accurately predict whether a molecule is a potent drug, we can use it on arbitrary new graphs to identify potential drugs:
- Run on large dataset of candidates.
- Select top-100 with the highest predicted potency.
- Manually inspect top-100.
--
This very approach led to the discovery of .italic[Halicin], a previously overlooked compound that is a highly potent antibiotic!
.footnote[Credits: Petar Veličković, CST Wednesday Seminar, 2021.]
class: middle
class: middle, black-slide
class: middle
class: middle
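A graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ is defined by a set of nodes $\mathcal{V}$ and a set of edges $\mathcal{E}$ between them.
Edges can be represented by a binary adjacency matrix $\mathbf{A}$ of size $|\mathcal{V}| \times |\mathcal{V}|$, where $\mathbf{A}_{ij} = 1$ if nodes $i$ and $j$ are connected by an edge and $0$ otherwise.
The features of the nodes are represented by a matrix $\mathbf{X} \in \mathbb{R}^{|\mathcal{V}| \times d}$, whose $i$-th row $\mathbf{x}_i$ is the $d$-dimensional feature vector of node $i$.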
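As a minimal sketch, a small undirected graph can be stored with NumPy as an adjacency matrix and a node feature matrix. The 4-node graph, its edge list and the 2-dimensional features below are made up for illustration.

```python
import numpy as np

# Hypothetical undirected graph with 4 nodes and 4 edges.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n_nodes = 4

# Adjacency matrix: A[i, j] = 1 if nodes i and j are connected.
A = np.zeros((n_nodes, n_nodes), dtype=int)
for i, j in edges:
    A[i, j] = 1
    A[j, i] = 1  # undirected graph, so A is symmetric

# Node feature matrix: one row of (made-up) features per node.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.5, 0.5]])

degrees = A.sum(axis=1)  # number of neighbors of each node
print(A)
print(degrees)
```

Row $i$ of `A` lists the neighbors of node $i$, and row $i$ of `X` holds its features.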
???
Draw an example on the board.
class: middle
.grid[
.kol-1-2[
] ]
???
Draw on the blackboard, nodes, edges and their features.
class: middle
Given a graph $\mathcal{G}$ represented by $(\mathbf{X}, \mathbf{A})$, we want to make
- graph-level predictions $y \in \mathcal{Y}$, using graph-level functions $f(\mathbf{X}, \mathbf{A})$,
- node-level predictions $\mathbf{y} \in \mathcal{Y}^{|\mathcal{V}|}$, using node-level functions $\mathbf{F}(\mathbf{X}, \mathbf{A})$,
- edge-level predictions $\mathbf{y} \in \mathcal{Y}^{|\mathcal{E}|}$, using edge-level functions $\mathbf{F}(\mathbf{X}, \mathbf{A})$.
class: middle
.footnote[Credits: Simon J.D. Prince, Understanding Deep Learning, 2023.]
class: middle
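A permutation matrix $\mathbf{P}$ is a square binary matrix with exactly one entry equal to $1$ in each row and each column, and $0$ elsewhere. Reordering the nodes of a graph with $\mathbf{P}$ maps its representation $(\mathbf{X}, \mathbf{A})$ to $(\mathbf{P}\mathbf{X}, \mathbf{P}\mathbf{A}\mathbf{P}^T)$.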
class: middle
class: middle
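The very representation $(\mathbf{X}, \mathbf{A})$ of a graph imposes an ordering on its nodes, even though the nodes of a graph have no intrinsic order.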
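For graph-level tasks, we want permutation invariance, i.e. $$f(\mathbf{P}\mathbf{X}, \mathbf{P}\mathbf{A}\mathbf{P}^T) = f(\mathbf{X}, \mathbf{A})$$ for all permutation matrices $\mathbf{P}$.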
class: middle
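For node-level tasks, we want permutation equivariance, i.e. $$\mathbf{F}(\mathbf{P}\mathbf{X}, \mathbf{P}\mathbf{A}\mathbf{P}^T) = \mathbf{P}\mathbf{F}(\mathbf{X}, \mathbf{A})$$ for all permutation matrices $\mathbf{P}$.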
???
That is, permuting the nodes of the graph should modify the results only up to their permutation.
class: middle
class: middle
Graph neural networks (GNNs) are neural networks that operate on graphs. They implement graph-level permutation invariance and node-level permutation equivariance.
The general blueprint is to stack permutation equivariant function(s), optionally followed by a permutation invariant function.
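The blueprint can be checked numerically. The sketch below assumes a deliberately simple equivariant layer, $\tanh((\mathbf{A} + \mathbf{I})\mathbf{X}\mathbf{W})$, followed by sum pooling as the invariant function. The layer, the random graph and the weights `W` are illustrative choices, not a reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 5, 3, 4

# Random symmetric adjacency matrix without self-loops, and random features.
upper = np.triu(rng.integers(0, 2, size=(n, n)), 1)
A = upper + upper.T
X = rng.normal(size=(n, d))
W = rng.normal(size=(d, k))  # weights shared by all nodes

def F(X, A):
    """Node-level function: each node sums its own and its neighbors' features."""
    return np.tanh((A + np.eye(len(A))) @ X @ W)

def f(X, A):
    """Graph-level function: sum-pool the node representations."""
    return F(X, A).sum(axis=0)

# Random permutation matrix P (one 1 per row and per column).
P = np.eye(n)[rng.permutation(n)]

# Permuting the nodes permutes the rows of F and leaves f unchanged.
equivariant = np.allclose(F(P @ X, P @ A @ P.T), P @ F(X, A))
invariant = np.allclose(f(P @ X, P @ A @ P.T), f(X, A))
print(equivariant, invariant)
```

Both checks hold because the weights are shared across nodes and the sum does not depend on the order of its terms.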
class: middle
.footnote[Image credits: Petar Veličković, Everything is Connected: Graph Neural Networks, 2023.]
class: middle
If we denote
???
Illustrate on the board.
New feature vectors
class: middle
A strong inductive bias of graph neural networks is based on locality. It assumes that the information about a node is most relevant to its close neighbors rather than distant ones.
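For a node $i$, its neighborhood $\mathcal{N}_i$ is the set of nodes connected to $i$ by an edge.
Accordingly, $\mathbf{X}_{\mathcal{N}_i}$ denotes the (multi)set of feature vectors of the neighbors of node $i$.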
As previously,
class: middle
.footnote[Note:
class: middle
Permutation equivariant functions
Similarly to regular layers, a GNN layer computes a new representation $$\mathbf{H} = \mathbf{F}(\mathbf{X}, \mathbf{A}) = \begin{bmatrix}\mathbf{h}_1\\
\vdots\\
\mathbf{h}_{|\mathcal{V}|}\end{bmatrix} = \begin{bmatrix}\phi(\mathbf{x}_1, \mathbf{X}_{\mathcal{N}_1})\\
\vdots\\
\phi(\mathbf{x}_{|\mathcal{V}|}, \mathbf{X}_{\mathcal{N}_{|\mathcal{V}|}})\end{bmatrix}$$ from the input representation $\mathbf{X}$.
class: middle
GNN layers are usually classified into three spatial flavors depending on how they implement the propagation operator $\phi$:
- Convolutional
- Attentional
- Message-passing
class: middle
.grid[ .kol-1-2[
Features of neighboring nodes are aggregated with fixed coefficients
Example:
.footnote[Image credits: Bronstein et al., Geometric Deep Learning, 2021.]
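A minimal NumPy sketch of a convolutional layer, assuming the generic form $\mathbf{h}_i = \phi(\mathbf{x}_i, \bigoplus_{j \in \mathcal{N}_i} c_{ij} \varphi(\mathbf{x}_j))$ of Bronstein et al. The coefficients $c_{ij} = 1 / \sqrt{d_i d_j}$ with self-loops follow the graph convolutional network of Kipf and Welling. This is one common choice and not necessarily the example intended on this slide. The graph and the weights `W` are made up.

```python
import numpy as np

def gcn_layer(X, A, W):
    """Convolutional layer with fixed coefficients c_ij = 1 / sqrt(d_i d_j)."""
    A_hat = A + np.eye(len(A))            # add self-loops
    d = A_hat.sum(axis=1)                 # degrees, self-loops included
    C = A_hat / np.sqrt(np.outer(d, d))   # coefficients depend on the graph only
    return np.maximum(C @ X @ W, 0.0)     # aggregate, transform, ReLU

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
X = np.eye(4)        # one-hot node features
W = np.ones((4, 2))  # hypothetical weights
H = gcn_layer(X, A, W)
print(H.shape)
```

The coefficients are computed once from $\mathbf{A}$ and never depend on the node features.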
???
Illustrate on the board, with a larger graph.
class: middle
.grid[ .kol-1-2[
Features of neighboring nodes are aggregated with implicit weights via an attention mechanism:
Example:
.footnote[Image credits: Bronstein et al., Geometric Deep Learning, 2021.]
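A minimal NumPy sketch of an attentional layer, assuming the generic form $\mathbf{h}_i = \phi(\mathbf{x}_i, \bigoplus_{j \in \mathcal{N}_i} a(\mathbf{x}_i, \mathbf{x}_j) \varphi(\mathbf{x}_j))$ of Bronstein et al. The scoring function here is a scaled dot product followed by a softmax over each neighborhood, which is an assumption made for brevity. Graph attention networks use a different scoring function. The projection matrices `Wq`, `Wk`, `Wv` are hypothetical.

```python
import numpy as np

def attention_layer(X, A, Wq, Wk, Wv):
    """Attentional layer: neighbor weights are computed from the features."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])      # pairwise scores a(x_i, x_j)
    mask = (A + np.eye(len(A))) > 0             # neighbors and the node itself
    scores = np.where(mask, scores, -np.inf)    # ignore non-neighbors
    scores = scores - scores.max(axis=1, keepdims=True)
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)  # softmax per neighborhood
    return alpha @ V, alpha

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
X = rng.normal(size=(3, 4))
Wq, Wk, Wv = (rng.normal(size=(4, 2)) for _ in range(3))
H, alpha = attention_layer(X, A, Wq, Wk, Wv)
print(alpha.round(2))
```

Unlike the convolutional flavor, the weights `alpha` change whenever the features `X` change.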
class: middle
.grid[ .kol-1-2[
Compute arbitrary vectors (or .italic[messages]) to be sent across the edges of the graph:
This is the most generic form of GNN layers.
.footnote[Image credits: Bronstein et al., Geometric Deep Learning, 2021.]
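A minimal NumPy sketch of a message-passing layer, assuming the generic form $\mathbf{h}_i = \phi(\mathbf{x}_i, \bigoplus_{j \in \mathcal{N}_i} \psi(\mathbf{x}_i, \mathbf{x}_j))$ of Bronstein et al. with sum aggregation. The message function $\psi$ and the update function $\phi$ are one-layer networks with hypothetical weights `W_msg` and `W_upd`.

```python
import numpy as np

def message_passing_layer(X, edges, W_msg, W_upd):
    """Message-passing layer with sum aggregation over incoming edges."""
    n = X.shape[0]
    M = np.zeros((n, W_msg.shape[1]))
    for j, i in edges:  # directed edge j -> i
        # Message psi(x_i, x_j): computed from both endpoints of the edge.
        m = np.tanh(np.concatenate([X[i], X[j]]) @ W_msg)
        M[i] += m  # permutation-invariant aggregation (sum)
    # Update phi(x_i, aggregated messages).
    return np.tanh(np.concatenate([X, M], axis=1) @ W_upd)

rng = np.random.default_rng(0)
d, m, k = 3, 4, 2
X = rng.normal(size=(4, d))
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]  # node 3 is isolated
W_msg = rng.normal(size=(2 * d, m))
W_upd = rng.normal(size=(d + m, k))
H = message_passing_layer(X, edges, W_msg, W_upd)
print(H.shape)
```

The explicit loop over edges is for readability only. Practical implementations vectorize it with scatter operations.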
???
Illustrate on the board, with a larger graph.
class: middle
Several GNN layers of any flavor can be applied in parallel and their outputs combined (e.g., by concatenation or averaging) to form the final representation $\mathbf{H}$.
This is similar to having multiple kernels in a convolutional layer or multiple attention heads in an attention layer.
???
Draw a full architecture on the blackboard, stopping at the parallel composition.
class: middle
.footnote[Image credits: Petar Veličković, Graph Attention Networks, 2017.]
class: middle
Layers can be stacked in series to form deep graph neural networks:
$$\begin{aligned}\mathbf{H}_0 &= \mathbf{X} \\ \mathbf{H}_1 &= \mathbf{F}_1(\mathbf{H}_0, \mathbf{A}) \\ &\;\;\vdots \\ \mathbf{H}_L &= \mathbf{F}_L(\mathbf{H}_{L-1}, \mathbf{A}) \end{aligned}$$
This is similar to stacking convolutional layers in a convolutional neural network or stacking transformer blocks in a transformer.
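The effect of depth can be observed on a toy example. The sketch below assumes a path graph with 5 nodes and weight-free linear layers $\mathbf{H}_l = (\mathbf{A} + \mathbf{I})\mathbf{H}_{l-1}$, chosen only to make the propagation easy to follow. It perturbs the input feature of node 4 and reports which output nodes change.

```python
import numpy as np

# Path graph 0 - 1 - 2 - 3 - 4.
n = 5
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1

def gnn(X, A, n_layers):
    """Stack of linear layers H_l = (A + I) H_{l-1} (no weights, for clarity)."""
    H = X
    for _ in range(n_layers):
        H = (A + np.eye(len(A))) @ H
    return H

X = np.ones((n, 1))
X_perturbed = X.copy()
X_perturbed[4] += 1.0  # change the input feature of node 4 only

receptive = {}
for n_layers in [1, 2, 3]:
    diff = gnn(X_perturbed, A, n_layers) - gnn(X, A, n_layers)
    receptive[n_layers] = np.flatnonzero(np.abs(diff[:, 0]) > 1e-12).tolist()
    print(n_layers, receptive[n_layers])
```

With $L$ layers, the perturbation reaches exactly the nodes within $L$ hops of node 4.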
???
Continue the drawing on the blackboard, by stacking layers in series.
Elaborate on the propagation of information across the graph.
- The effective neighborhood of a node grows with the depth of the network
- Similar to CNNs in which the effective receptive field grows with the depth of the network.
class: middle, center
.center.width-50[] .italic[1-layer GNN]
Stacking layers in series increases the effective receptive field of each node.
class: middle, center
count: false
.center.width-50[] .italic[2-layer GNN]
Stacking layers in series increases the effective receptive field of each node.
class: middle, center
count: false
.center.width-50[] .italic[3-layer GNN]
Stacking layers in series increases the effective receptive field of each node.
class: middle
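To make graph-level predictions, the node representations $\mathbf{H}_L$ are pooled into a single graph representation with a permutation-invariant function (e.g., a sum or a mean over the nodes), which is then fed to a prediction head.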
class: middle
class: middle
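When the set of edges is empty, the graph reduces to a set of isolated nodes. In this case, every neighborhood $\mathcal{N}_i$ is empty and a GNN layer applies the same function to each node independently, $\mathbf{h}_i = \phi(\mathbf{x}_i)$.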
Such a structure is often referred to as a .italic[Deep Set] and can be considered as a special case of a GNN.
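A quick numerical check of this special case, assuming the simple layer $\tanh((\mathbf{A} + \mathbf{I})\mathbf{X}\mathbf{W})$ with sum pooling. Both the layer and the weights `W` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 6, 3, 4
X = rng.normal(size=(n, d))
W = rng.normal(size=(d, k))
A_empty = np.zeros((n, n))  # no edges

def gnn_layer(X, A, W):
    """Simple GNN layer: sum over the node itself and its neighbors."""
    return np.tanh((A + np.eye(len(A))) @ X @ W)

def deep_set(X, W):
    """Deep Set: the same function applied to every element, then sum-pooled."""
    return np.tanh(X @ W).sum(axis=0)

graph_output = gnn_layer(X, A_empty, W).sum(axis=0)
set_output = deep_set(X, W)
print(np.allclose(graph_output, set_output))
```

Without edges, the pooled GNN output coincides with the Deep Set output and does not depend on the order of the elements.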
class: middle
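When nodes are expected to have a relational structure but the edges are unknown, it is common to assume that all nodes are connected to each other. In this case, $\mathcal{N}_i = \mathcal{V}$ for every node $i$, so that each node aggregates information from all the nodes of the graph.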
class: middle
For convolutional GNN layers,
class: middle
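For attentional GNN layers however, each node computes its own attention weights over all the other nodes, which is what a self-attention layer does.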
In other words, the transformer architecture is a special case of a GNN.
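This correspondence can be checked on a small example. The sketch below assumes scaled dot-product scores and a single head without positional encodings. It compares an attentional GNN layer on a fully connected graph with a plain self-attention layer. All weights are hypothetical.

```python
import numpy as np

def softmax(S):
    S = S - S.max(axis=1, keepdims=True)
    E = np.exp(S)
    return E / E.sum(axis=1, keepdims=True)

def graph_attention(X, A, Wq, Wk, Wv):
    """Attentional GNN layer restricted to the edges of A (plus self-loops)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    S = Q @ K.T / np.sqrt(K.shape[1])
    mask = (A + np.eye(len(A))) > 0
    return softmax(np.where(mask, S, -np.inf)) @ V

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over all tokens."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    return softmax(Q @ K.T / np.sqrt(K.shape[1])) @ V

rng = np.random.default_rng(0)
n, d, k = 5, 4, 3
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, k)) for _ in range(3))
A_full = np.ones((n, n)) - np.eye(n)  # fully connected graph

same = np.allclose(graph_attention(X, A_full, Wq, Wk, Wv),
                   self_attention(X, Wq, Wk, Wv))
print(same)
```

On a fully connected graph the mask removes nothing, so the two layers compute the same output.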
class: middle
Images can be represented as graphs, where each pixel is a node and the edges are defined by the spatial adjacency of the pixels.
That is, convolutional neural networks can be seen as a special case of GNNs.
class: middle
class: middle
.footnote[Credits: Stokes et al, A Deep Learning Approach to Antibiotic Discovery, 2020.]
class: middle
.footnote[Credits: Shi and Rajkumar, Point-GNN, 2020.]
class: middle
.grid[
.kol-2-3[
.footnote[Credits: Derrow-Pinion et al, 2021.]
class: middle
.footnote[Credits: Sanchez-Gonzalez et al, 2020.]
class: middle, black-slide
.center[