maxtext.integration.tunix.weight_mapping.llama3 module

Defines the weight mapping from MaxText’s Llama3 model to a vLLM-compatible format.

This module provides the LLAMA3_VLLM_MAPPING dataclass, which contains all the configuration needed to convert MaxText’s Llama3 model weights into a HuggingFace-compatible format that vLLM can load. This includes:

- A direct mapping of parameter names.
- Sharding specifications for distributed environments.
- Hook functions for complex transformations (e.g., RoPE reordering).

class maxtext.integration.tunix.weight_mapping.llama3.LLAMA3_VLLM_MAPPING

Bases: object

Maps MaxText Llama 2 and Llama 3 weights to the corresponding vLLM Llama 2 and Llama 3 weights.

static to_hf_hook_fns()

Defines and returns hook functions for weight transformations.

These hooks are applied to specific weights during the conversion from MaxText to a HuggingFace-compatible format. They handle transformations like RoPE reordering and query scaling that are not simple re-mappings.

Returns:

A dictionary where keys are MaxText parameter names and values are the corresponding transformation functions.
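The shape of such a hook dictionary can be sketched as follows. This is a minimal illustration, not the module’s actual code: the parameter names are hypothetical, the weights are plain Python lists standing in for arrays, and both the RoPE layout change (interleaved pairs to split halves) and the direction of the query scaling (multiplying by the square root of the head dimension) are assumptions.

```python
import math


def reorder_rope(head):
    """Move even-indexed entries to the front and odd-indexed entries to the back.

    Assumption: the source layout interleaves each rotary pair
    (x0, y0, x1, y1, ...) and the target layout splits the halves
    (x0, x1, ..., y0, y1, ...).
    """
    return head[::2] + head[1::2]


def scale_query(head):
    """Multiply every entry by sqrt(head_dim); the direction is an assumption."""
    scale = math.sqrt(len(head))
    return [value * scale for value in head]


def query_hook(head):
    """Scale the query weights first, then reorder them for RoPE."""
    return reorder_rope(scale_query(head))


# Hypothetical parameter names, for illustration only.
hook_fns = {
    "decoder.layers.query.kernel": query_hook,
    "decoder.layers.key.kernel": reorder_rope,
}


def apply_hooks(weights, hooks):
    """Apply the hook registered for each name; other weights pass through."""
    return {
        name: hooks[name](value) if name in hooks else value
        for name, value in weights.items()
    }


weights = {
    "decoder.layers.query.kernel": [1.0, 2.0, 3.0, 4.0],
    "decoder.layers.key.kernel": [1.0, 2.0, 3.0, 4.0],
    "decoder.norm.scale": [1.0, 1.0],
}
converted = apply_hooks(weights, hook_fns)
print(converted["decoder.layers.query.kernel"])  # [2.0, 6.0, 4.0, 8.0]
```

Keying the hooks by parameter name means that weights without an entry pass through unchanged, so only the tensors that need more than a rename carry a transformation.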

static to_hf_transpose_keys()

Returns the keys of weights that need to be transposed.

Returns:

An empty dictionary, as no keys require transposition for this mapping.

static lora_to_hf_mappings()

Provides the mapping for LoRA (Low-Rank Adaptation) weights.

Returns:

None, as LoRA mappings are not defined for this model.

static to_hf_mapping()

Mapping from the MaxText model to the HuggingFace-compatible vLLM model.

Currently, the parameter mapping conforms to the Tunix API, which combines the parameter name and sharding in one dictionary. This may change in the future if the two are decoupled.
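To make the combined layout concrete, the sketch below assumes that each MaxText parameter name maps to a pair of target name and sharding tuple, and shows how such a dictionary could be split into two. The parameter names, the sharding axis names, and the `decouple` helper are all hypothetical; the actual entries returned by `to_hf_mapping()` may differ.

```python
# Hypothetical entries: source name -> (target name, sharding spec).
combined_mapping = {
    "base.token_embedder.embedding": (
        "model.embed_tokens.weight",
        ("model", None),
    ),
    "base.decoder.layers.self_attention.query.kernel": (
        "model.layers.*.self_attn.q_proj.weight",
        (None, "layer", "model", None),
    ),
}


def decouple(mapping):
    """Split a combined mapping into a name mapping and a sharding mapping."""
    names = {source: target for source, (target, _) in mapping.items()}
    shardings = {source: spec for source, (_, spec) in mapping.items()}
    return names, shardings


names, shardings = decouple(combined_mapping)
print(names["base.token_embedder.embedding"])      # model.embed_tokens.weight
print(shardings["base.token_embedder.embedding"])  # ('model', None)
```

Keeping both pieces in one dictionary guarantees that every renamed parameter has a sharding entry, at the cost of coupling two concerns that could otherwise evolve separately.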