arXiv cs.LG
5/13/2026

Structural Interpretations of Protein Language Model Representations via Differentiable Graph Partitioning
Short summary
Researchers developed SoftBlobGIN, a lightweight graph neural network that interprets protein language model (ESM-2) features by projecting them onto protein contact graphs and performing structure-aware message passing. The method achieves 92.8% accuracy on enzyme classification and 0.983 AUROC on binding-site detection, and it identifies functional substructures automatically without retraining the base model. Its learned blob partitions group residues into biologically meaningful clusters, which makes the predictions of protein language models more transparent and directly auditable.
- Novel GNN method (SoftBlobGIN) interprets protein language models via structural graph projections
- Achieves 92.8% enzyme classification accuracy and 0.983 AUROC on binding-site detection
- Automatically identifies biologically meaningful functional substructures without retraining
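The pipeline described above (per-residue features placed on a contact graph, message passing over that graph, then a soft grouping of residues into blobs) can be illustrated with a toy sketch. This is not the authors' implementation: the summary gives no architectural details, so the 8 Å contact cutoff, the single-layer GIN-style update, the softmax assignment, and every function name below are assumptions chosen for illustration, and random numbers stand in for real ESM-2 embeddings and residue coordinates.

```python
import math
import random

def contact_graph(coords, cutoff=8.0):
    """Neighbor lists: residues i and j are linked if closer than `cutoff`.
    The 8.0 cutoff is an assumed, commonly used value, not taken from the paper."""
    n = len(coords)
    return [[j for j in range(n)
             if j != i and math.dist(coords[i], coords[j]) < cutoff]
            for i in range(n)]

def matvec(x, w):
    """Multiply a feature vector x (length d) by a weight matrix w (d x m)."""
    return [sum(x[a] * w[a][b] for a in range(len(x))) for b in range(len(w[0]))]

def gin_layer(h, neighbors, w, eps=0.0):
    """GIN-style update, simplified to one linear map plus ReLU:
    h_i' = ReLU(W^T ((1 + eps) * h_i + sum of neighbor features))."""
    out = []
    for i, nbrs in enumerate(neighbors):
        agg = [(1.0 + eps) * v for v in h[i]]
        for j in nbrs:
            agg = [a + b for a, b in zip(agg, h[j])]
        out.append([max(0.0, v) for v in matvec(agg, w)])
    return out

def soft_assign(h, w_assign):
    """Softmax over K clusters, giving each residue a soft blob membership."""
    rows = []
    for x in h:
        logits = matvec(x, w_assign)
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(v - m) for v in logits]
        z = sum(exps)
        rows.append([e / z for e in exps])
    return rows

def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
n_res, d_in, d_hid, k = 10, 6, 4, 3

# Stand-ins for residue coordinates and per-residue language-model embeddings.
coords = [[rng.gauss(0.0, 6.0) for _ in range(3)] for _ in range(n_res)]
feats = rand_matrix(n_res, d_in, rng)

neighbors = contact_graph(coords)
hidden = gin_layer(feats, neighbors, rand_matrix(d_in, d_hid, rng))
blobs = soft_assign(hidden, rand_matrix(d_hid, k, rng))

# Hard label per residue, read off the soft assignment for inspection.
labels = [row.index(max(row)) for row in blobs]
```

Because the assignment is a softmax rather than a hard argmax, it stays differentiable, which is what would allow a partition of this kind to be learned jointly with a downstream classifier. Reading off `labels` afterwards gives one cluster per residue that can be compared against known functional sites.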
Generated with AI, which can make mistakes.