While GNNs operate on usual (normally sparse) graphs, Graph Transformers (GTs) operate on the fully-connected graph where each node is connected to every other node in the graph. On one hand, this brings back the O(N²) complexity in the number of nodes N. On the other hand, GTs do not suffer from oversmoothing, which is a common problem of long-distance message passing. Fully-connected graphs mean we have 'true' edges from the original graph and 'fake' edges added by the fully-connected transformation, and we want to distinguish those (see the first sketch at the end of this section). Even more importantly, we need a way to imbue nodes with some positional features, otherwise GTs fall behind GNNs (as shown in the 2020 paper of Dwivedi and Bresson).

Perhaps the two most visible graph transformer models this year were SAN (Spectral Attention Nets) and Graphormer. SAN by Kreuzer, Beaini, et al. employs the top-k eigenvalues and eigenvectors of the Laplacian, showing that spectral features alone can distinguish graphs considered isomorphic by the 1-WL test (a sketch of such positional features follows below). Having spectral features concatenated with the input node features, SAN outperforms sparse GNNs on many molecular tasks.

GNNs + Combinatorial Optimization & Algorithms

2021 was a big year for this emerging subfield! Xu et al. in their outstanding ICLR'21 paper study extrapolation of neural nets and arrive at several spectacular outcomes. With the notion of algorithmic alignment (introduced by the same authors in the spotlight ICLR'20 paper), the authors show that GNNs are well-aligned with dynamic programming (DP). Compared to the ICLR'20 paper, here the authors discuss a stronger extrapolation condition - linear alignment to DP. Indeed, compare an iteration of the classical Bellman-Ford algorithm for finding shortest paths with an Aggregate-Combine step of a message passing GNN - you'd find a lot of commonalities (a toy sketch of this correspondence is given below). Further, the authors show that choosing a proper aggregation function for a GNN is crucial when modeling a particular DP algorithm, e.g., for Bellman-Ford we need a min-aggregator. I would also recommend checking out a very illustrative talk by Stefanie Jegelka explaining the main outcomes of this work at the Deep Learning and Combinatorial Optimization 2021 workshop.

The blueprint of neural algorithmic reasoning explains how neural networks can mimic and empower the execution process of usually discrete algorithms in the embedding space. In the Encode-Process-Decode fashion, abstract inputs (obtained from natural inputs) are processed by the neural net (the Processor), and its outputs are decoded into abstract outputs, which can then be mapped to more natural task-specific outputs (the last sketch below outlines this pattern).
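To make the 'true' vs. 'fake' edge distinction concrete, here is a minimal PyTorch sketch of dot-product attention over the fully-connected graph with a learned per-edge-type bias. It is my own illustration rather than SAN's or Graphormer's actual mechanism; the function name and the edge_type_bias embedding are hypothetical.

```python
import torch
import torch.nn as nn

def attention_with_edge_types(q: torch.Tensor, k: torch.Tensor,
                              true_edge_mask: torch.Tensor,
                              edge_type_bias: nn.Embedding) -> torch.Tensor:
    """Dot-product attention over the fully-connected graph, biased by edge type.

    q, k:           (N, d) query / key matrices for N nodes.
    true_edge_mask: (N, N) boolean, True where the edge exists in the original graph.
    edge_type_bias: nn.Embedding(2, 1) with one learned scalar per edge type
                    (0 = 'fake' edge from the fully-connected transformation,
                     1 = 'true' edge from the original graph).
    """
    scores = (q @ k.T) / q.shape[-1] ** 0.5                   # (N, N) pairwise logits
    bias = edge_type_bias(true_edge_mask.long()).squeeze(-1)  # (N, N) learned per-pair bias
    return torch.softmax(scores + bias, dim=-1)               # attend over all node pairs
```

Real GT models fold much richer edge and structural features into this bias term; the point here is only that every node pair gets an attention score while the model can still tell original edges apart from added ones.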
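The spectral positional features used by SAN can be approximated with the following generic NumPy sketch: take the k lowest-frequency eigenvectors of the graph Laplacian and concatenate them to the node features. This is a rough illustration of Laplacian positional encodings, not SAN's exact recipe; the function name and the choice of the unnormalized Laplacian are assumptions.

```python
import numpy as np

def laplacian_positional_features(adj: np.ndarray, x: np.ndarray, k: int = 4) -> np.ndarray:
    """Concatenate the k lowest-frequency Laplacian eigenvectors to the node features.

    adj: (N, N) symmetric adjacency matrix; x: (N, F) input node features.
    A generic sketch of Laplacian positional encodings, not SAN's exact recipe.
    Assumes N > k + 1.
    """
    deg = adj.sum(axis=1)
    lap = np.diag(deg) - adj                    # unnormalized graph Laplacian L = D - A
    eigvals, eigvecs = np.linalg.eigh(lap)      # eigenpairs sorted by ascending eigenvalue
    pos = eigvecs[:, 1:k + 1]                   # skip the trivial constant eigenvector
    # Eigenvector signs are arbitrary; randomly flipping them during training is a common remedy.
    return np.concatenate([x, pos], axis=1)     # (N, F + k)
```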
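To see the Bellman-Ford / message-passing analogy in code, here is a toy sketch (my own, not code from the paper) of one relaxation sweep written as a min-aggregation Aggregate-Combine step.

```python
import numpy as np

def bellman_ford_step(dist: np.ndarray, edges: list) -> np.ndarray:
    """One Bellman-Ford relaxation sweep, phrased as min-aggregation message passing.

    dist:  (N,) current shortest-path estimates.
    edges: iterable of (u, v, w) directed edges with weight w.
    Each node v aggregates the messages dist[u] + w from its in-neighbours with min
    (the Aggregate step) and then combines the result with its own state, again via min
    (the Combine step).
    """
    messages = [[] for _ in range(len(dist))]
    for u, v, w in edges:
        messages[v].append(dist[u] + w)                # message sent along edge u -> v
    new_dist = dist.copy()
    for v, msgs in enumerate(messages):
        if msgs:
            new_dist[v] = min(dist[v], min(msgs))      # min-aggregate, then min-combine
    return new_dist

# Running the sweep N - 1 times from dist = [0, inf, ..., inf] recovers shortest paths.
dist = np.full(4, np.inf)
dist[0] = 0.0
edges = [(0, 1, 2.0), (0, 2, 5.0), (1, 2, 1.0), (2, 3, 2.0)]
for _ in range(len(dist) - 1):
    dist = bellman_ford_step(dist, edges)
print(dist)  # [0. 2. 3. 5.]
```

A GNN layer whose aggregator behaves like min can imitate exactly this update, which is the kind of alignment the paper studies.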
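Finally, the Encode-Process-Decode pattern itself can be sketched as a tiny PyTorch module, with a GRU cell standing in for the GNN processor; the class name, layer choices, and number of processing steps are illustrative assumptions rather than the blueprint's prescribed architecture.

```python
import torch
import torch.nn as nn

class EncodeProcessDecode(nn.Module):
    """Schematic Encode-Process-Decode pipeline (illustrative sketch, not the papers' code).

    An encoder maps natural inputs to abstract (latent) inputs, a processor network
    iterates on them in the embedding space (mimicking the steps of a discrete algorithm),
    and a decoder maps the final latent state to abstract outputs that a task-specific
    head can turn back into natural outputs.
    """

    def __init__(self, in_dim: int, hidden_dim: int, out_dim: int, steps: int = 3):
        super().__init__()
        self.encoder = nn.Linear(in_dim, hidden_dim)
        self.processor = nn.GRUCell(hidden_dim, hidden_dim)   # stand-in for a GNN processor
        self.decoder = nn.Linear(hidden_dim, out_dim)
        self.steps = steps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.encoder(x))        # natural input  -> abstract input
        state = torch.zeros_like(h)
        for _ in range(self.steps):            # "execute" the algorithm in latent space
            state = self.processor(h, state)
        return self.decoder(state)             # abstract output -> task-specific head downstream
```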