Do Deep Graph Neural Networks exist?
One of the open questions in the GNN literature is whether deep GNNs, i.e. GNNs with many layers (e.g. more than 10), are useful.
There is a theoretical paper, What graph neural networks cannot learn: depth vs width, which proves that the product of the number of layers d and the embedding size (width) w of each layer must grow at least proportionally to the number of nodes n in the graph if the GNN is to compute many Turing-computable functions, i.e. dw = Ω(n). So if a graph has 10K nodes and we use a common embedding size w of 128 or 256, the number of layers d would have to be roughly 40 to 80.
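As a quick sanity check, here is the arithmetic behind those numbers (a back-of-the-envelope sketch; the concrete values of n and w are illustrative choices, not taken from the paper):

```python
# Back-of-the-envelope check of the depth * width bound (dw = Omega(n)).
n = 10_000            # number of nodes in the graph
for w in (128, 256):  # common embedding (width) choices
    d = n / w         # depth needed so that d * w ~ n
    print(f"width={w:4d} -> depth ~ {d:.0f} layers")
# width= 128 -> depth ~ 78 layers
# width= 256 -> depth ~ 39 layers
```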
There is a cost associated with each layer: every node has to look at all of its neighbors and aggregate their information. So most implementations use at most around 5 layers, for the obvious reason that deeper models are very time-consuming to compute, as sketched below.
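To make the per-layer cost concrete, here is a minimal NumPy-only sketch of one mean-aggregation layer (a hypothetical illustration; real libraries such as PyTorch Geometric or DGL use sparse operations, but the asymptotic cost per layer is the same):

```python
import numpy as np

def mean_aggregate(h, neighbors):
    """One message-passing layer: every node averages its neighbors' embeddings.

    h         : (n, w) array of node embeddings
    neighbors : list of neighbor-index lists, one per node
    Each layer touches every edge once, so the cost is O(|E| * w) per layer
    and O(d * |E| * w) for a d-layer network -- depth is expensive.
    """
    out = np.empty_like(h)
    for v, nbrs in enumerate(neighbors):
        # isolated nodes simply keep their own embedding
        out[v] = h[nbrs].mean(axis=0) if nbrs else h[v]
    return out

# tiny example: a path graph 0 - 1 - 2
h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(mean_aggregate(h, [[1], [0, 2], [1]]))
```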
Somewhat to the contrary, another theoretical paper, Graph Neural Networks Exponentially Lose Expressive Power for Node Classification, shows that under certain conditions on the graph, a deep GNN will essentially carry only degree information for each node, which is the most local property a node can have. This does not contradict the previous paper because (1) this paper's result holds in the limit of infinite depth, (2) the previous paper says that if dw < O(n) there exists a graph instance on which the GNN fails, which does not mean the result is universal for all graphs, and (3) this paper requires certain conditions that hold only for a narrow family of graphs.
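A toy experiment makes this limiting behavior tangible. The sketch below is my own illustration of the oversmoothing phenomenon using a linear GCN (repeated multiplication by the normalized adjacency matrix), not the paper's actual construction; the graph size, feature width, and depth are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = (rng.random((n, n)) < 0.1).astype(float)
A = np.triu(A, 1)
A[np.arange(n - 1), np.arange(1, n)] = 1.0  # path edges keep the graph connected
A = A + A.T                                  # random undirected graph

A_hat = A + np.eye(n)                        # add self-loops
deg = A_hat.sum(axis=1)
A_hat /= np.sqrt(np.outer(deg, deg))         # D^{-1/2} (A + I) D^{-1/2}

h = rng.standard_normal((n, 16))             # random initial node features
for _ in range(200):                         # 200 linear GCN layers, no nonlinearity
    h = A_hat @ h

# In the limit, row i becomes proportional to sqrt(deg[i]) times a vector
# shared by all nodes, i.e., nodes are distinguished by degree alone.
h_scaled = h / np.sqrt(deg)[:, None]
print(f"max deviation across rows: {np.abs(h_scaled - h_scaled[0]).max():.1e}")
```

The printed deviation should be near zero: after degree rescaling, all node embeddings have collapsed to the same point, so degree is the only information the network still carries about each node.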
Beyond this, there is the question of double descent: whether it occurs in the GNN setting is yet another open problem.
So, my response is that for now we still have little understanding of whether deep GNNs are useful and, if so, how to make them efficient in practice.