Optimization

Explore techniques for maximizing performance, including model customization, sharding strategies, Pallas kernels, and benchmarking.

🛠️ Customizing Model Configs

Customize and tune your LLM configurations to achieve higher Model FLOPs Utilization (MFU) on TPUs.

🥞 Sharding Strategies

Use roofline analysis and arithmetic intensity to choose an efficient sharding strategy: fully sharded data parallelism (FSDP), tensor parallelism (TP), expert parallelism (EP), or pipeline parallelism (PP).

⚡ Pallas Kernels

Write custom Pallas kernels when you need fine-grained control over how an operation runs on the accelerator.

📈 Benchmarking & Tuning

Set up benchmarks, tune performance, and analyze the resulting metrics.