Run MaxText Python Notebooks on TPUs#

This guide provides step-by-step instructions for running MaxText Python notebooks on the two most popular platforms: Google Colab and a local JupyterLab environment.

Prerequisites#

Before starting, make sure you have:

  • ✅ Basic familiarity with Jupyter, Python, and Git

For Method 2 (Visual Studio Code) and Method 3 (Local Jupyter Lab) only:

  • ✅ A Google Cloud Platform (GCP) account with billing enabled

  • ✅ TPU quota available in your region (check under IAM & Admin → Quotas)

  • ✅ The tpu.nodes.create IAM permission, which is required to create a TPU VM

  • ✅ gcloud CLI installed locally

  • ✅ Firewall rules open for port 8888 (Jupyter) if accessing directly
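Whether port 8888 is actually reachable can be checked from Python before you open a browser. The snippet below is a minimal sketch, not part of MaxText: `port_is_open` is a hypothetical helper name, and `localhost` assumes Jupyter runs on the same machine or is reached through an SSH tunnel.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 8888 is the Jupyter port named in the prerequisites; the host is an assumption.
print("Jupyter port reachable:", port_is_open("localhost", 8888))
```

If this prints `False` for a remote host while Jupyter is running, the firewall rule or the tunnel is the likely place to look.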

Method 1: Google Colab with TPU#

This is the fastest way to run MaxText Python notebooks without managing infrastructure.

⚠️ IMPORTANT NOTE ⚠️ The free tier of Google Colab provides access to a v5e-1 TPU, but this access is not guaranteed and is subject to availability and usage limits.

Currently, this method only supports the sft_qwen3_demo.ipynb notebook, which demonstrates Qwen3-0.6B SFT training and evaluation on OpenAI’s GSM8K dataset. If you want to run other notebooks, please use the local Jupyter Lab setup method.

Before proceeding, please verify that the specific notebook you are running works reliably on the free-tier TPU resources. If you encounter frequent disconnections or resource limitations, you may need to:

  • Upgrade to a Colab Pro or Pro+ subscription for more stable and powerful TPU access.

  • Move to the local Jupyter Lab setup method, with access to a more powerful TPU machine.

Step 1: Choose an Example#

1.a. Visit the MaxText examples directory on GitHub.

1.b. Find the notebook you want to run (e.g., sft_qwen3_demo.ipynb) and copy its URL.

Step 2: Import into Colab#

2.a. Go to Google Colab and sign in.

2.b. Select File -> Open Notebook.

2.c. Select the GitHub tab.

2.d. Paste the target .ipynb link you copied in step 1.b and press Enter.
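As an alternative to pasting the link into the GitHub tab, Colab can also open a GitHub-hosted notebook directly when the GitHub path is appended to `https://colab.research.google.com/github/`. The sketch below shows that URL rewrite. `github_to_colab` is a hypothetical helper, and the example URL is a placeholder, not the real location of any MaxText notebook.

```python
from urllib.parse import urlparse

COLAB_GITHUB_PREFIX = "https://colab.research.google.com/github/"


def github_to_colab(url: str) -> str:
    """Convert a github.com notebook URL (.../blob/<branch>/<path>.ipynb) to a Colab URL."""
    parsed = urlparse(url)
    if parsed.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {url}")
    parts = parsed.path.strip("/").split("/")
    # Expected layout: <org>/<repo>/blob/<branch>/<path to notebook>.ipynb
    if len(parts) < 5 or parts[2] != "blob" or not parts[-1].endswith(".ipynb"):
        raise ValueError(f"not a link to a notebook file: {url}")
    return COLAB_GITHUB_PREFIX + "/".join(parts)


# Placeholder URL for illustration only; substitute the link copied in step 1.b.
example = "https://github.com/example-org/example-repo/blob/main/examples/demo.ipynb"
print(github_to_colab(example))
```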

Step 3: Enable TPU Runtime#

3.a. Select Runtime -> Change runtime type.

3.b. Select your desired TPU under Hardware accelerator.

3.c. Click Save.
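After the runtime restarts, a quick check in a notebook cell can confirm that a TPU is attached. This is a minimal sketch that assumes JAX is already installed in the runtime. `summarize_platforms` and `has_tpu` are illustrative helper names, not MaxText functions.

```python
from collections import Counter


def summarize_platforms(platforms):
    """Count devices per platform name, for example {"tpu": 1}."""
    return dict(Counter(platforms))


def has_tpu(platforms):
    """Return True if at least one device reports the "tpu" platform."""
    return "tpu" in platforms


try:
    import jax

    # Each JAX device exposes a `platform` string such as "cpu", "gpu", or "tpu".
    platforms = [device.platform for device in jax.devices()]
except (ImportError, RuntimeError):
    # JAX is missing or no backend could be initialized; report no devices.
    platforms = []

print("Devices by platform:", summarize_platforms(platforms))
print("TPU attached:", has_tpu(platforms))
```

If no TPU is reported, revisit step 3.b and confirm that the runtime type was saved before running the notebook.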

Step 4: Run the Notebook#

Follow the instructions within the notebook cells to install dependencies and run the training/inference.

Available Examples#

Supervised Fine-Tuning (SFT)#

  • sft_qwen3_demo.ipynb → Qwen3-0.6B SFT training and evaluation on OpenAI’s GSM8K dataset. This notebook is friendly for beginners and runs successfully on Google Colab’s free-tier v5e-1 TPU runtime.

  • sft_llama3_demo.ipynb → Llama3.1-8B SFT training on Hugging Face ultrachat_200k dataset. We recommend running this on a v5p-8 TPU VM using the port-forwarding method.

Reinforcement Learning (GRPO/GSPO) Training#

  • rl_llama3_demo.ipynb → GRPO/GSPO training on OpenAI’s GSM8K dataset. We recommend running this on a v5p-8 TPU VM using the port-forwarding method.

Common Pitfalls & Debugging#

| Issue | Solution |
| --- | --- |
| ❌ TPU runtime mismatch | Check that the TPU runtime version matches the VM image |
| ❌ Colab disconnects | Save checkpoints to GCS or Drive regularly |
| ❌ “RESOURCE_EXHAUSTED” errors | Use a smaller batch size, or use v5e-8 instead of v5e-1 |
| ❌ Firewall blocked | Ensure port 8888 is open, or always use SSH tunneling |
| ❌ Path confusion | In Colab use /content/maxtext; on a TPU VM use ~/maxtext |
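The path confusion above can be avoided by resolving the repository root once at the top of a notebook. The sketch below assumes the two locations named in this guide. `maxtext_root` is a hypothetical helper, and checking `sys.modules` for `google.colab` is a common heuristic for detecting Colab rather than an official API.

```python
import os
import sys


def maxtext_root(in_colab=None):
    """Return the MaxText checkout path for the current environment.

    Pass in_colab explicitly to override the automatic detection.
    """
    if in_colab is None:
        # Heuristic: the google.colab package is preloaded in Colab kernels.
        in_colab = "google.colab" in sys.modules
    if in_colab:
        return "/content/maxtext"
    # On a TPU VM the guide expects the checkout in the home directory.
    return os.path.expanduser("~/maxtext")


print("Using MaxText root:", maxtext_root())
```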

Support and Resources#

Contributing#

If you encounter issues or have improvements for this guide, please:

  1. Open an issue on the MaxText repository

  2. Submit a pull request with your improvements

  3. Share your experience in the discussions


Happy Training! 🚀