Have you ever wondered how large organizations and high-tech unicorns build platforms on Kubernetes to run all kinds of workloads: web, stateless, stateful, batch, and even AI?
Kubernetes’ dynamic resource scheduling, automated orchestration, and vibrant ecosystem of frameworks make it ideal for building AI/ML platforms. Such a platform becomes highly scalable when it combines GKE in the cloud with disposable GPUs and TPUs.
In this talk, we will explore some recent OSS technologies that enable efficient job management and powerful distributed computing while reducing wasted compute resources and maximising utilisation. These capabilities are essential when training and serving large Generative AI models.