Optimization
Explore techniques for maximizing performance, including model customization, sharding strategies, Pallas kernels, and benchmarking.
🛠️ Customizing Model Configs
Customize your LLM configurations to achieve higher Model FLOPs Utilization (MFU) on TPUs.
🥞 Sharding Strategies
Choose efficient sharding strategies (FSDP, TP, EP, PP) using roofline analysis and an understanding of arithmetic intensity.
⚡ Pallas Kernels
Write custom Pallas kernels for fine-grained control over performance-critical operations.
📈 Benchmarking & Tuning
Set up benchmarks, tune performance, and analyze the resulting metrics.
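
As a rough illustration of the roofline reasoning mentioned under Sharding Strategies, the sketch below estimates the arithmetic intensity (FLOPs per byte moved) of a matrix multiplication and compares it with a hardware ridge point. The function names and the peak-FLOPs and bandwidth figures are hypothetical placeholders chosen for illustration, not values from these guides or from any particular TPU, and the model assumes each operand and the output cross the memory boundary exactly once.

```python
def matmul_arithmetic_intensity(m, k, n, bytes_per_element=2):
    """FLOPs per byte for an (m, k) @ (k, n) matmul.

    Assumes both inputs are read once and the output is written once,
    with every element stored in `bytes_per_element` bytes (2 for bf16).
    """
    flops = 2 * m * k * n  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved


def is_compute_bound(intensity, peak_flops_per_s, bandwidth_bytes_per_s):
    """True when intensity reaches the ridge point (peak FLOP/s / bandwidth)."""
    ridge_point = peak_flops_per_s / bandwidth_bytes_per_s
    return intensity >= ridge_point


# Hypothetical accelerator figures, for illustration only.
PEAK_FLOPS_PER_S = 2.0e14
BANDWIDTH_BYTES_PER_S = 1.0e12

# A small batch against a large weight matrix versus a large square matmul.
small_batch = matmul_arithmetic_intensity(8, 4096, 4096)
large_batch = matmul_arithmetic_intensity(4096, 4096, 4096)

for name, intensity in [("small batch", small_batch), ("large batch", large_batch)]:
    bound = (
        "compute-bound"
        if is_compute_bound(intensity, PEAK_FLOPS_PER_S, BANDWIDTH_BYTES_PER_S)
        else "memory-bound"
    )
    print(f"{name}: {intensity:.1f} FLOPs/byte, {bound}")
```

Under these assumed numbers, the small-batch case falls below the ridge point and the large square matmul sits above it. The same kind of comparison, extended to include communication cost, is one way to reason about which sharding strategy keeps the hardware busy.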