Training state-of-the-art large language models requires a deep understanding of data preparation, model construction, hyperparameter tuning, and hardware infrastructure.
In this talk, Cerebras Systems VP of Product Andy Hock will share best practices and novel methods that Cerebras developed and applied over the past year, which can improve training efficiency by up to an order of magnitude. Specifically, he will share Cerebras’ experience with non-English and multilingual datasets, long and variable-length sequences, and maximal update parametrization. He will also demonstrate how to train models with over 100B parameters without code changes or third-party libraries by taking advantage of Cerebras’ weight streaming architecture.
Dr. Andy Hock is Vice President and Head of Product at Cerebras Systems. He is a researcher turned product leader and entrepreneur with a track record of bringing transformative technology products to market at scale.
At Cerebras, he and his team are responsible for product strategy and roadmap across AI research, hardware, and software, as well as ML solutions engineering. Before Cerebras, Andy was Senior Director for Advanced Technologies for Skybox Imaging leading up to its $500M acquisition by Google in 2014. At Google, he led Product Management for ML/AI data product and platform development for the Skybox / Terra Bella project before joining Cerebras.
Earlier in his career, Andy was a Senior Scientist and Senior Manager for signal and image processing algorithm and software development with Arete Associates. He holds a PhD in Geophysics and Space Physics from UCLA.