MaxText documentation

maxtext.kernels.attention package

Attention kernels.

Submodules

  • maxtext.kernels.attention.jax_flash_attention module
    • flash_attention_block_masked()
    • mask_blocker()
  • maxtext.kernels.attention.ragged_attention module
    • get_mha_cost_estimate()
    • reference_mqa()
    • reference_mha()
    • reference_gqa()
    • ragged_flash_attention_kernel()
    • ragged_mqa()
    • ragged_mha()
    • ragged_gqa()
  • maxtext.kernels.attention.splash_attention_kernel module
    • SegmentIds
      • SegmentIds.q
      • SegmentIds.kv
    • get_kernel_name()
    • attention_reference()
    • attention_reference_custom()
    • make_attention_reference()
    • make_masked_mha_reference()
    • make_masked_mqa_reference()
    • QKVLayout
      • QKVLayout.HEAD_DIM_MINOR
      • QKVLayout.SEQ_MINOR
    • from_head_minor()
    • BlockSizes
      • BlockSizes.block_q
      • BlockSizes.block_kv
      • BlockSizes.block_kv_compute
      • BlockSizes.block_q_dkv
      • BlockSizes.block_kv_dkv
      • BlockSizes.block_kv_dkv_compute
      • BlockSizes.block_q_dq
      • BlockSizes.block_kv_dq
      • BlockSizes.use_fused_bwd_kernel
      • BlockSizes.q_layout
      • BlockSizes.k_layout
      • BlockSizes.v_layout
      • BlockSizes.has_backward_blocks
      • BlockSizes.get_default()
    • flash_attention_kernel()
    • SplashAttentionKernel
      • SplashAttentionKernel.manual_fwd()
      • SplashAttentionKernel.manual_bwd()
      • SplashAttentionKernel.manual_sharding_spec()
      • SplashAttentionKernel.tree_flatten()
      • SplashAttentionKernel.tree_unflatten()
    • make_splash_mha()
    • make_splash_mqa()
    • make_splash_mha_single_device()
    • make_splash_mqa_single_device()

By MaxText developers

© Copyright 2023–2026, Google LLC.