
Liquid AI Introduces LFM2.5-Embedding-350M and LFM2.5-ColBERT-350M: Dense Bi-Encoder and Late-Interaction M…

AI Quick Briefs Editorial Desk · June 19, 2026

What it does

Liquid AI has launched two retrieval models: LFM2.5-Embedding-350M, a dense bi-encoder, and LFM2.5-ColBERT-350M, a late-interaction ColBERT model. Together they are meant to power fast multilingual search. They support 11 languages and are designed to run efficiently on edge devices, where low latency and limited resources matter.

Why it matters

Multilingual search in resource-constrained environments is hard to do well. Dense bi-encoders compress each query and document into a single vector, which makes candidate retrieval fast but usually costs some retrieval precision. Late-interaction models like ColBERT recover that precision by re-ranking candidates on token-level interactions between query and document embeddings, but they tend to be slower and heavier. Liquid AI’s approach pairs the two models to boost both speed and accuracy on edge hardware.

This pairing matters because companies building global search on mobile, embedded, or offline devices often have to balance response time, power consumption, and multilingual coverage. Liquid AI’s new LFM2.5 retrievers aim to ease that trade-off while covering 11 languages, which makes them relevant for practical deployments where cloud calls aren’t ideal or possible.
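The retrieve-then-rerank pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Liquid AI's implementation: the vectors are hand-made toy values rather than outputs of the LFM2.5 models, and the function names (`dense_retrieve`, `maxsim_score`, `retrieve_then_rerank`) are hypothetical. Stage one ranks documents by cosine similarity of single vectors, as a bi-encoder would; stage two re-scores the shortlist with the MaxSim operator used by ColBERT-style late interaction.

```python
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dense_retrieve(query_vec, doc_vecs, k):
    """Stage 1 (bi-encoder style): one vector per text, top-k by cosine similarity."""
    q = normalize(query_vec)
    scored = [(dot(q, normalize(d)), i) for i, d in enumerate(doc_vecs)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]

def maxsim_score(query_tokens, doc_tokens):
    """Stage 2 (late interaction): for each query token, take its best-matching
    document token, then sum those maxima (the MaxSim operator)."""
    q = [normalize(t) for t in query_tokens]
    d = [normalize(t) for t in doc_tokens]
    return sum(max(dot(qt, dt) for dt in d) for qt in q)

def retrieve_then_rerank(query_vec, query_tokens, doc_vecs, doc_tokens, k):
    """Shortlist k candidates cheaply, then reorder only those with MaxSim."""
    candidates = dense_retrieve(query_vec, doc_vecs, k)
    return sorted(
        candidates,
        key=lambda i: maxsim_score(query_tokens, doc_tokens[i]),
        reverse=True,
    )

# Toy data (assumption: stand-ins for real model outputs).
query_vec = [1.0, 0.0]                       # single pooled query vector
doc_vecs = [[0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]

query_tokens = [[1.0, 0.0], [0.0, 1.0]]      # one vector per query token
doc_tokens = [
    [[1.0, 0.0], [1.0, 0.0]],                # doc 0 covers only the first query token
    [[1.0, 0.0], [0.0, 1.0]],                # doc 1 covers both query tokens
    [[0.0, 1.0]],                            # doc 2 is dropped by stage 1
]

shortlist = dense_retrieve(query_vec, doc_vecs, k=2)
reranked = retrieve_then_rerank(query_vec, query_tokens, doc_vecs, doc_tokens, k=2)
print("dense shortlist:", shortlist)
print("after MaxSim rerank:", reranked)
```

In this toy case the dense stage prefers document 0, while the token-level pass promotes document 1 because it matches both query tokens. The expensive scoring only touches the shortlist, which is why the combination can suit constrained hardware.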

Who it is for

Developers working on search features in apps serving multiple languages will find these models useful. Teams building edge products, from smart assistants to on-device document search, benefit from the balance of performance and model size. Enterprises seeking to reduce cloud dependency and latency in user-facing search can also consider these models for use in their pipelines.

The catch

While combining bi-encoder and ColBERT models aims to optimize speed and accuracy, edge deployment always involves compromises. At 350 million parameters, the models may still be too large for some low-resource devices. Additionally, supporting 11 languages means performance could vary by language, potentially requiring customization or additional tuning for specific markets or domains.
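To put the parameter count in perspective, a rough weights-only estimate helps. The sketch below assumes exactly 350 million parameters (read off the "350M" in the model names) and a few common numeric precisions; whether Liquid AI ships the models at any of these precisions is not stated in the announcement. It also ignores activations, the tokenizer, runtime overhead, and any stored document index.

```python
# Assumption: parameter count taken from the "350M" in the model names.
PARAMS = 350_000_000

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_megabytes(params, bytes_per_param):
    """Weights-only size in megabytes (1 MB = 10**6 bytes)."""
    return params * bytes_per_param / 1e6

for name, width in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_megabytes(PARAMS, width):.0f} MB")
```

By this estimate the weights alone need about 1,400 MB at 32-bit precision and about 175 MB at 4-bit, so the precision a device can run matters as much as the headline parameter count.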

What to watch next

Look for benchmarks and real-world tests comparing the LFM2.5 models to established multilingual retrievers on latency, accuracy, and resource use. Also watch how Liquid AI integrates these models with its tools and SDKs to ease developer adoption on edge platforms. Any expansion of language coverage or further reduction in model size would strengthen its position in edge search.
