arXiv cs.CL
5/12/2026
Built Environment Reasoning from Remote Sensing Imagery Using Large Vision-Language Models

Short summary

Researchers applied multimodal LLMs, specifically InternVL and Qwen, to remote sensing imagery for smart-city applications, covering three tasks: automated design suggestions, constructability assessment, and risk identification. The study compared the models' accuracy and reliability across spatial scales to show how vision-language models can support reasoning about the built environment. The results suggest that combining remote sensing imagery with vision-language models can improve smart-city decision-making and urban planning workflows.

  • Tested InternVL and Qwen models on remote sensing imagery for urban planning tasks
  • Compared accuracy and reliability across spatial scales on design suggestions, constructability assessment, and risk identification
  • Demonstrates the potential of multimodal AI for smart-city infrastructure decisions

Generated with AI, which can make mistakes.
