arXiv:2510.14336

DARTS-GT: Differentiable Architecture Search for Graph Transformers with Quantifiable Instance-Specific Interpretability Analysis

Published on Oct 16, 2025

Abstract

Graph Transformers (GTs) have emerged as powerful architectures for graph-structured data, yet they remain constrained by rigid designs and lack quantifiable interpretability. Current state-of-the-art GTs commit to a fixed GNN type across all layers, missing the potential benefits of depth-specific component selection, and their complex architectures are opaque: performance gains cannot be attributed to meaningful patterns rather than spurious correlations. We redesign GT attention through asymmetry, decoupling structural encoding from feature representation: queries derive from node features, while keys and values come from GNN transformations. Within this framework, we use Differentiable ARchiTecture Search (DARTS) to select the optimal GNN operator at each layer, enabling depth-wise heterogeneity inside the transformer attention itself (DARTS-GT). To understand the discovered architectures, we develop the first quantitative interpretability framework for GTs, based on causal ablation. Our metrics (Head-deviation, Specialization, and Focus) identify which heads and nodes drive predictions while enabling model comparison. Experiments across eight benchmarks show that DARTS-GT achieves state-of-the-art results on four datasets while remaining competitive on the others, with the discovered architectures revealing dataset-specific patterns. Our interpretability analysis shows that visual attention salience and causal importance do not always correlate, indicating that widely used visualization approaches may miss the components that actually matter. Crucially, the heterogeneous architectures found by DARTS-GT consistently produced more interpretable models than the baselines, establishing that Graph Transformers need not choose between performance and interpretability.
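The asymmetric attention and per-layer operator search are concrete enough to sketch. Below is a minimal, illustrative PyTorch sketch, not the authors' implementation: the module names (MixedGNNOp, AsymmetricGTAttention), the tiny two-operator candidate set, the dense row-normalized adjacency, and the head_mask ablation hook are all assumptions made for brevity, and head_deviation only gestures at the spirit of the paper's Head-deviation metric.

```python
# Illustrative sketch only: names, operator set, and metric definition
# are assumptions, not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedGNNOp(nn.Module):
    """DARTS-style continuous relaxation: a softmax-weighted mixture over
    candidate GNN operators (here just two toy candidates)."""
    def __init__(self, dim):
        super().__init__()
        self.lin_gcn = nn.Linear(dim, dim)    # mean-aggregation ("GCN-like")
        self.lin_skip = nn.Linear(dim, dim)   # identity / no structural mixing
        self.alpha = nn.Parameter(torch.zeros(2))  # architecture weights

    def forward(self, x, adj):
        # adj: dense, row-normalized adjacency, shape (n, n)
        w = F.softmax(self.alpha, dim=0)
        return w[0] * self.lin_gcn(adj @ x) + w[1] * self.lin_skip(x)

class AsymmetricGTAttention(nn.Module):
    """Asymmetric GT attention: queries from raw node features,
    keys/values from the (searched) GNN transformation."""
    def __init__(self, dim, heads=4):
        super().__init__()
        assert dim % heads == 0
        self.heads, self.dk = heads, dim // heads
        self.q_proj = nn.Linear(dim, dim)   # feature path (queries)
        self.kv_gnn = MixedGNNOp(dim)       # structural path (keys/values)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)
        # per-head gate used for causal ablation (1 = keep, 0 = knock out)
        self.register_buffer("head_mask", torch.ones(heads))

    def forward(self, x, adj):
        n, d = x.shape
        h = self.kv_gnn(x, adj)                      # GNN-transformed features
        q = self.q_proj(x).view(n, self.heads, self.dk)
        k = self.k_proj(h).view(n, self.heads, self.dk)
        v = self.v_proj(h).view(n, self.heads, self.dk)
        att = torch.einsum("qhd,khd->hqk", q, k) / self.dk ** 0.5
        out = torch.einsum("hqk,khd->qhd", att.softmax(dim=-1), v)
        out = out * self.head_mask.view(1, self.heads, 1)  # ablation hook
        return self.out(out.reshape(n, d))

def head_deviation(layer, x, adj, head):
    """Output deviation when one head is causally ablated -- a guess at the
    spirit of the paper's Head-deviation metric, not its exact definition."""
    with torch.no_grad():
        base = layer(x, adj)
        layer.head_mask[head] = 0.0
        ablated = layer(x, adj)
        layer.head_mask[head] = 1.0
    return (base - ablated).norm().item()

# Tiny usage example on random data:
x = torch.randn(6, 16)
adj = torch.softmax(torch.rand(6, 6), dim=-1)    # stand-in normalized adjacency
layer = AsymmetricGTAttention(dim=16, heads=4)
print(layer(x, adj).shape)                       # torch.Size([6, 16])
print([round(head_deviation(layer, x, adj, h), 3) for h in range(4)])
```

In full DARTS, the architecture weights (alpha above) are optimized on held-out data in alternation with the model weights, and the final architecture keeps the highest-weighted operator; the paper applies this search over real GNN operators independently at each transformer layer, which is what yields the depth-wise heterogeneity.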
