jina-embeddings-v5-omni: Geometry-preserving Embeddings via Locked Aligned Towers

Short summary

Jina AI introduces GELATO, an approach that extends frozen multimodal embedding models by training only 0.35% of their weights. The jina-embeddings-v5-omni suite embeds text, image, audio, and video in a single semantic space while maintaining exact backward compatibility with the v5 Text models. Reported benchmarks show performance on par with larger systems, so multimodal applications can be deployed efficiently without retraining core components.

• The GELATO method freezes the backbone models and trains only the connector layers (0.35% of weights)
• jina-embeddings-v5-omni supports text, image, audio, and video in a unified embedding space
• Achieves performance competitive with larger multimodal models at a much lower training cost
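
To make the "0.35% of weights" figure concrete, here is a minimal sketch of how a trainable-parameter fraction is computed when the backbones are frozen and only a connector is trained. The module names and parameter counts are hypothetical toy values chosen so the arithmetic is easy to follow; they are not the real sizes of any jina-embeddings-v5-omni component, and this is not the authors' code.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """A named block of parameters; `trainable=False` means frozen."""
    name: str
    num_params: int
    trainable: bool


def trainable_fraction(modules):
    """Return the share of all parameters that receive gradient updates."""
    total = sum(m.num_params for m in modules)
    trainable = sum(m.num_params for m in modules if m.trainable)
    return trainable / total


# Hypothetical toy counts (assumption): two frozen towers plus one small
# trainable connector, sized so the total is exactly 1,000,000 parameters.
modules = [
    Module("text_tower", 600_000, trainable=False),
    Module("media_encoders", 396_500, trainable=False),
    Module("connector", 3_500, trainable=True),
]

fraction = trainable_fraction(modules)
print(f"trainable share: {fraction:.2%}")
```

Because the frozen modules never change during training, any embedding computed by the text tower alone is the same before and after the connector is trained, which is the intuition behind the backward-compatibility claim in the summary.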

Generated with AI, which can make mistakes.

#ai-tools #research-breakthrough #ai-agents

Read full article at arXiv cs.CL
