maxtext.input_pipeline.packing.prefill_packing module