Join Michael Yuan, CEO of @SecondStateInc, as he explores lightweight large language model (LLM) inference with WebAssembly (WASM).
In this video, Michael demonstrates how to run full-scale LLMs like LLaMA across platforms, from personal laptops to cloud servers, with the efficiency and portability of WASM. He addresses the challenges of running LLMs in cloud environments, walks through practical demos, and discusses future applications.
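Want to try it yourself after watching? Here is a minimal sketch of the kind of demo shown, using WasmEdge's WASI-NN GGML plugin and the LlamaEdge llama-chat.wasm app (the model file and prompt template below are illustrative; swap in the GGUF model you want):

# Install WasmEdge with the WASI-NN GGML (llama.cpp) plugin
curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | bash -s -- --plugin wasi_nn-ggml

# Download a chat app compiled to WASM, plus a GGUF model file
curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-chat.wasm
curl -LO https://huggingface.co/second-state/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf

# Run the model locally; the same .wasm binary runs unchanged on a GPU cloud server
wasmedge --dir .:. --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf llama-chat.wasm -p llama-2-chat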
Learn more about Civo Navigate here -► https://www.civo.com/navigate
====================
Get free credit to try the world’s first K3s-powered, managed Kubernetes service.
Sign up to Civo -► https://www.civo.com
Get started with Kubernetes at Civo Academy -► https://www.civo.com/academy
Subscribe to our YouTube Channel -► http://civo.io/subscribe
Follow Civo:
• Twitter -► https://twitter.com/civocloud
• GitHub -► https://github.com/civo
• LinkedIn -► https://www.linkedin.com/company/civocloud
• Facebook -► https://www.facebook.com/civocloud
0:00 - Introduction and Background
2:00 - Challenges of Running Large Language Models
6:00 - Importance of Portability in AI Workloads
9:00 - Introduction to WebAssembly (WASM)
11:30 - Benefits of WASM for AI Inference
15:00 - Live Demo Setup
18:00 - Running WASM Locally
20:30 - Deploying WASM on Cloud with NVIDIA
23:00 - Practical Use Cases and Applications