Optimization

Explore techniques for maximizing performance, including model customization, sharding strategies, Pallas kernels, and benchmarking.

🛠️ Customizing Model Configs

Customize your LLM configurations to achieve higher Model FLOPs Utilization (MFU) on TPUs.

Customizing model configs for TPUs
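
As a rough illustration of the MFU metric mentioned above, the sketch below computes it as achieved model FLOPs per second divided by the aggregate peak FLOPs per second of the hardware. The function names, the `6 * num_params` rule of thumb for dense-transformer training FLOPs per token (which ignores attention FLOPs), and all numeric values are assumptions chosen for illustration; they are not taken from the linked guide or from any specific TPU generation.

```python
def approx_training_flops_per_token(num_params: float) -> float:
    """Common rule of thumb: ~6 FLOPs per parameter per token for training
    (forward + backward) a dense transformer. Ignores attention FLOPs."""
    return 6.0 * num_params


def model_flops_utilization(
    tokens_per_second: float,
    flops_per_token: float,
    num_chips: int,
    peak_flops_per_chip: float,
) -> float:
    """MFU = achieved model FLOP/s divided by aggregate peak FLOP/s."""
    achieved_flops_per_second = tokens_per_second * flops_per_token
    peak_flops_per_second = num_chips * peak_flops_per_chip
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical numbers, for illustration only.
flops_per_token = approx_training_flops_per_token(num_params=1e9)
mfu = model_flops_utilization(
    tokens_per_second=1e5,
    flops_per_token=flops_per_token,
    num_chips=8,
    peak_flops_per_chip=1.5e14,
)
print(f"MFU: {mfu:.2%}")
```

Because MFU counts only the FLOPs the model itself requires, changing a config so that the same model trains at a higher token throughput raises this ratio directly.
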

🥞 Sharding Strategies

Choose efficient sharding strategies (FSDP, TP, EP, PP) using roofline analysis, and understand how arithmetic intensity affects that choice.

Sharding on TPUs
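
To make the arithmetic intensity idea above concrete, here is a minimal sketch for a single matrix multiplication of shape `[b, d] x [d, f]`. Arithmetic intensity is FLOPs performed per byte moved; in a roofline model, an operation is compute-bound when its intensity is at least the hardware's ratio of peak FLOP/s to memory bandwidth. The helper names and the hardware figures below are hypothetical placeholders, not specifications of any real chip, and the byte count assumes each operand and the output cross memory exactly once.

```python
def matmul_arithmetic_intensity(b: int, d: int, f: int, bytes_per_elem: int = 2) -> float:
    """FLOPs per byte for a [b, d] x [d, f] matmul (bf16 by default)."""
    flops = 2 * b * d * f  # one multiply and one add per output contribution
    bytes_moved = bytes_per_elem * (b * d + d * f + b * f)  # read A, read B, write C
    return flops / bytes_moved


def is_compute_bound(intensity: float, peak_flops_per_s: float, mem_bytes_per_s: float) -> bool:
    """Roofline test: compare intensity against the hardware's critical ratio."""
    return intensity >= peak_flops_per_s / mem_bytes_per_s


# Hypothetical accelerator: 2e14 FLOP/s peak and 8e11 B/s memory bandwidth,
# which gives a critical intensity of 250 FLOPs/byte.
PEAK_FLOPS = 2e14
MEM_BW = 8e11

large = matmul_arithmetic_intensity(1024, 1024, 1024)  # large batch dimension
small = matmul_arithmetic_intensity(8, 1024, 1024)     # small batch dimension

print(f"large batch: {large:.1f} FLOPs/byte, compute-bound={is_compute_bound(large, PEAK_FLOPS, MEM_BW)}")
print(f"small batch: {small:.1f} FLOPs/byte, compute-bound={is_compute_bound(small, PEAK_FLOPS, MEM_BW)}")
```

The same comparison can be applied to communication by substituting interconnect bandwidth for memory bandwidth, which is one way to reason about whether a given sharding strategy keeps a workload compute-bound.
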

⚡ Pallas Kernels

Write custom Pallas kernels when you need fine-grained control over performance-critical operations.

Optimizing with Pallas kernels

📈 Benchmarking & Tuning

Set up benchmarks, tune performance, and analyze the resulting metrics.

Benchmarking & tuning guide