Chang She

CEO
LanceDB

Presentation Title:

The New Data Lake Format for Multimodal AI

Presentation Summary:

AI puts new data types, such as embeddings, images, and video, and new workloads, such as search and training, front and center. The core of the traditional data lake is a Parquet + Iceberg storage layer built for fast OLAP queries, and it cannot meet the new demands of AI. Instead, Lance format is the new open-source standard for the storage layer for AI data.

For the world’s most cutting-edge multimodal companies, such as Midjourney, Luma Labs, and WorldLabs, Lance format delivers far faster performance than Parquet, better schema evolution than Iceberg, fast search over tens of billions of vectors, and superior training performance for LLMs and VLMs. The future of multimodal AI applications and models is being built on Lance today.

About | Chang She

Chang is the CEO and Co-founder of LanceDB and has been making data tooling for ML/AI for almost two decades.

One of the original co-authors of the pandas project, Chang started LanceDB to make it easy for AI teams to work with all of the data that doesn't fit neatly into dataframes — from embeddings to images, from audio to video, at petabyte scale.