arXiv cs.CL
5/12/2026

Built Environment Reasoning from Remote Sensing Imagery Using Large Vision-Language Models
Short summary
Researchers applied multimodal LLMs, specifically InternVL and Qwen, to remote sensing imagery for smart-city applications, covering automated design suggestions, constructability assessment, and risk identification. The study compared the models' accuracy and reliability across spatial scales to show how vision-language models can support built environment reasoning. The results suggest that combining remote sensing with advanced LLMs can improve smart-city decision-making and urban planning workflows.
- Tested InternVL and Qwen models on satellite imagery for urban planning tasks
- Models performed well on design suggestions, constructability assessment, and risk identification
- Demonstrates the potential of multimodal AI for smart-city infrastructure decisions
Generated with AI, which can make mistakes.