HLO Graph Diff Verification Testing#

This document explains what HLO is, what the HLO Graph Diff tests check, and how to manage their reference baselines.

What is HLO?#

HLO (High-Level Optimizer) is the intermediate representation that XLA (Accelerated Linear Algebra) uses to describe a computation graph as the compiler lowers it.

An HLO module records:

  • The sequence of low-level math operations (dot products, convolutions, additions).

  • Tensor shapes and numerical precisions.

  • The sharding mappings that partition arrays across a multi-pod TPU cluster.
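To make the three items above concrete, here is a minimal sketch that pulls operations, shapes, precisions, and sharding annotations out of a small HLO text snippet. The snippet is hand-written for illustration and is not output from this project. The regular expressions handle only this simplified form, and `summarize_hlo` is a hypothetical helper, not part of the test suite.

```python
import re

# Hand-written, illustrative HLO text (an assumption, not a real compiler dump).
SAMPLE_HLO = """\
HloModule jit_matmul

ENTRY main.4 {
  Arg_0.1 = f32[8,16]{1,0} parameter(0), sharding={devices=[2,1]0,1}
  Arg_1.2 = f32[16,4]{1,0} parameter(1), sharding={replicated}
  ROOT dot.3 = f32[8,4]{1,0} dot(Arg_0.1, Arg_1.2), lhs_contracting_dims={1}, rhs_contracting_dims={0}
}
"""

# Simplified pattern for lines shaped like: name = dtype[dims]{layout} opcode(...)
INSTRUCTION = re.compile(
    r"^\s*(?:ROOT\s+)?(?P<name>[\w.\-]+)\s*=\s*"
    r"(?P<dtype>\w+)\[(?P<dims>[\d,]*)\](?:\{[\d,]*\})?\s+"
    r"(?P<opcode>[\w\-]+)\("
)
SHARDING = re.compile(r"sharding=(\{.*?\})")


def summarize_hlo(text):
    """Return one dict per instruction line: opcode, dtype, shape, sharding."""
    rows = []
    for line in text.splitlines():
        match = INSTRUCTION.match(line)
        if match is None:
            continue  # module header, ENTRY line, closing brace, blank lines
        sharding = SHARDING.search(line)
        rows.append({
            "name": match["name"],
            "opcode": match["opcode"],
            "dtype": match["dtype"],
            "shape": tuple(int(d) for d in match["dims"].split(",") if d),
            "sharding": sharding.group(1) if sharding else None,
        })
    return rows


if __name__ == "__main__":
    for row in summarize_hlo(SAMPLE_HLO):
        print(row)
```

Each instruction line carries its own result type, so a change in precision (for example `f32` to `bf16`) or in a sharding annotation shows up as a textual difference on that line.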

Purpose of HloDiffTest#

The primary purpose of the TestHloDiff validation checks is to ensure that refactoring PRs change only code structure and do not unintentionally affect graph lowering or performance.

  • For pure refactors: The HLO graph should remain strictly identical. Any detected deviation signals that execution boundaries or operation pipelines may have changed under the hood.

  • For dependency updates: Changes to framework dependencies (such as updating JAX or XLA versions) are expected to alter the compiled HLO output slightly, so updating the baselines is appropriate in those cases.


How the Test Works#

This test runs automatically as part of the tpu-integration CI test suite on every Pull Request.

When the test method executes, it performs the following sequence of actions:

  1. Triggers compilation: It runs the compilation-only phase of the model training lifecycle (by invoking train_compile.main()) without allocating hardware compute nodes or running optimization passes.

  2. Dumps HLO modules: It instructs the XLA compiler backend to capture the lowered HLO graphs and dump them to text files.

  3. Compares strictly: It compares the structural lines of the generated HLO directly against the baseline .txt files stored under tests/utils/.
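The comparison in step 3 can be sketched with Python's standard difflib module. This is only an illustration under stated assumptions: the function names are hypothetical, and the normalization used here (dropping blank lines and surrounding whitespace) is a guess at what "structural lines" means, not the filter the real test applies.

```python
import difflib


def structural_lines(hlo_text):
    """Drop blank lines and surrounding whitespace (assumed normalization)."""
    return [line.strip() for line in hlo_text.splitlines() if line.strip()]


def diff_hlo(baseline_text, generated_text):
    """Return a unified diff as a list of lines; an empty list means a match."""
    return list(
        difflib.unified_diff(
            structural_lines(baseline_text),
            structural_lines(generated_text),
            fromfile="baseline",
            tofile="generated",
            lineterm="",
        )
    )


# Hand-written example inputs: the generated graph changes one result precision.
BASELINE = """\
ENTRY main.3 {
  Arg_0.1 = f32[4]{0} parameter(0)
  ROOT add.2 = f32[4]{0} add(Arg_0.1, Arg_0.1)
}
"""
GENERATED = BASELINE.replace("ROOT add.2 = f32", "ROOT add.2 = bf16")

if __name__ == "__main__":
    for line in diff_hlo(BASELINE, GENERATED):
        print(line)
```

A strict check of this kind fails on any textual difference, which is why a dependency update that slightly changes the compiled output also requires new baselines.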


Updating HLO reference files#

When an intentional architecture change alters graph lowering, the reference baselines must be updated.

[!IMPORTANT]
Running the update script locally is acceptable, but baselines generated that way can cause remote CI tests to fail. The PR verification pipelines run the tests in a strictly locked GitHub Actions environment, and even small differences in locally installed libraries can introduce slight deviations in the lowered graph. If locally generated baselines cause a remote CI check to fail, use the GitHub Action trigger described below to generate environment-matching baselines.

Method 2: Local Execution#

If you need to run the test manually during development:

source .venv/bin/activate
pytest tests/integration/hlo_diff_test.py -v

Or, to force an update of the local baselines:

python3 tests/utils/update_hlo_references.py