arXiv cs.LG

arXiv cs.LG publishes articles covering LLMs, AI, analysis, and data. It is a trusted source for AI and technology insights.

Rotation-Preserving Supervised Fine-Tuning

23d

QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization

23d

LEAP: Unlocking dLLM Parallelism via Lookahead Early-Convergence Token Detection

23d

Interpretable EEG Microstate Discovery via Variational Deep Embedding: A Systematic Architecture Search with Multi-Quadrant Evaluation

23d

$\xi$-DPO: Direct Preference Optimization via Ratio Reward Margin

23d

Adaptive scheduling steers diffusion L

23d

Hierarchical Multi-Scale Graph Neural Networks: Scalable Heterophilous Learning with Oversmoothing and Oversquashing Mitigation

23d

TMPO: Trajectory Matching Policy Optimization for Diverse and Efficient Diffusion Alignment

23d

Structural Interpretations of Protein Language Model Representations via Differentiable Graph Partitioning

23d

Vertex-Softmax: Tight Transformer Verification via Exact Softmax Optimization

23d

Statistical Inference and Quality Measures of KV Cache Quantisations Inspired by TurboQuant

24d

Distributional Reinforcement Learning via the Cramér Distance

24d

Path-Based Gradient Boosting for Graph-Level Prediction

24d

Reinforcement learning for inverse structural design and rapid laser cutting of kirigami prototypes

24d

The Safety-Aware Denoiser for Text Diffusion Models

24d

Feature Repulsion and Spectral Lock-in: An Empirical Study of Two-Layer Network Grokking

24d

Do Foundation Model Embeddings Improve Cross-Country Crop Yield Generalisation? A Leave-One-Country-Out Evaluation in Sub-Saharan Africa

24d

Geometry-free prediction of inertial lift forces in microfluidic devices using deep learning

24d

BaLoRA: Bayesian Low-Rank Adaptation of Large Scale Models

24d

TTCD: Transformer Integrated Temporal Causal Discovery from Non-Stationary Time Series Data

24d

Toeplitz MLP Mixers are Low Complexity, Information-Rich Sequence Models

25d

A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence

25d

ESA satellite telemetry anomaly detection

25d

On the Role of Strain and Vorticity in Numerical Integration Error for Flow Matching

25d