Hagay Lupesko

Engineering Lead
Databricks

Presentation Title:

Scaling to the Future: Training 100B+ Parameter Models on Trillions of Tokens

Presentation Summary:

In this talk, we’ll look at what it takes to train large language models (LLMs) with more than 100 billion parameters. We’ll cover how thousands of GPUs are orchestrated to process trillions of tokens, and discuss the model architectures, training optimizations, hardware capabilities, and orchestration and fault-tolerance strategies required to sustain weeks-long training runs.

The AI Conference 2023, a two-day event on AGI, LLMs, Infrastructure, Alignment, AI Startups, and Neural Architectures.

This talk is suitable for ML researchers, engineers, and anyone curious about the “sausage making” behind large-scale LLM training.

About | Hagay Lupesko

Hagay Lupesko is an engineering lead at Databricks AI, where he focuses on making generative AI training and inference efficient, fast, and accessible. Prior to Databricks, Hagay led AI engineering at MosaicML (acquired by Databricks), Meta AI, and AWS ML. He has shipped products across a range of domains, from 3D medical imaging, through global-scale web systems, to deep learning systems powering apps and services used by billions of people worldwide.