NVIDIA Triton LLM Inference Server for Llama3.2 models

About This Workshop

Leverage the power of NVIDIA GPUs and advanced inference frameworks, such as NVIDIA Triton Inference Server and TensorRT, to gain hands-on experience deploying scalable Llama 3.2 models for large language model inference. This workshop offers a comprehensive introduction to deploying and optimizing AI models with NVIDIA Triton, focusing on key tools and techniques that improve inference throughput and reduce latency. Attendees will work through real-world scenarios on an A10 shape, either standalone or within OKE (Oracle Kubernetes Engine), learning to streamline model management and fully utilize GPU resources for efficient AI deployments.
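Once deployed, Triton serves models over an HTTP endpoint. The sketch below builds a request payload for Triton's `generate` endpoint as exposed by the TensorRT-LLM backend; the host placeholder, the model name `ensemble`, and the parameter values are assumptions to adapt to your own deployment.

```python
import json

# Hypothetical Triton host; replace with your instance's address.
TRITON_URL = "http://<triton-host>:8000"

# The TensorRT-LLM backend commonly exposes a generate endpoint per model;
# "ensemble" is an assumed model name from a typical TensorRT-LLM setup.
endpoint = f"{TRITON_URL}/v2/models/ensemble/generate"

# Request body for a single text-generation call against Llama 3.2.
payload = {
    "text_input": "What is NVIDIA Triton?",  # the prompt
    "max_tokens": 128,                       # cap on generated tokens
    "temperature": 0.7,                      # sampling temperature
}

body = json.dumps(payload)
print(endpoint)
print(body)
```

Sending `body` to `endpoint` with any HTTP client (for example `curl -X POST -d` or Python's `urllib.request`) returns the model's completion as JSON.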

Workshop Info

1 hour
  • Lab 1 - Provision the resources for an A10 instance
  • Lab 2 - Provision the resources for OKE

Prerequisites
  • Administrative access to an OCI tenancy
  • Ability to spin up A10 instances in OCI
  • Ability to create resources with public IP addresses (load balancer, instances, OCI API endpoint)

 
