maxtext.integration.tunix.weight_mapping.qwen2 module

Defines the weight mapping from MaxText’s Qwen2 model to a vLLM-compatible format.

This module provides the QWEN2_VLLM_MAPPING dataclass, which contains the configuration needed to convert MaxText’s Qwen2 model weights into a format that vLLM can load. This includes:

- A direct mapping of parameter names.
- Sharding specifications for distributed environments.

class maxtext.integration.tunix.weight_mapping.qwen2.QWEN2_VLLM_MAPPING

Bases: object

Mapping MaxText Qwen2 weights to vLLM’s Qwen2 weights.

static to_hf_hook_fns()

Returns a dictionary of hook functions to be applied to MaxText weights.

Returns:

An empty dictionary, as no hook functions are needed for this mapping.

static to_hf_transpose_keys()

Returns the keys for weights that need to be transposed.

Returns:

An empty dictionary, as no keys require transposition for this mapping.
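As a rough illustration of why empty return values amount to no-ops, the sketch below shows how a consumer might apply hook functions and transpositions to a dictionary of weights. The helper `apply_hooks_and_transposes`, the assumption that transpose keys map a parameter name to an axis permutation, and the sample weight name are all hypothetical; none of them is taken from MaxText or Tunix.

```python
import numpy as np


def apply_hooks_and_transposes(weights, hook_fns, transpose_keys):
    """Hypothetical consumer: run per-key hooks, then per-key axis permutations."""
    out = {}
    for name, value in weights.items():
        if name in hook_fns:
            # A hook is assumed to be a callable that takes and returns an array.
            value = hook_fns[name](value)
        if name in transpose_keys:
            # A transpose entry is assumed to be an axis permutation.
            value = np.transpose(value, transpose_keys[name])
        out[name] = value
    return out


# Illustrative weight name, not a real Qwen2 parameter path.
weights = {"example.q_proj": np.arange(6).reshape(2, 3)}

# Empty dictionaries, as documented for this mapping: weights pass through as-is.
unchanged = apply_hooks_and_transposes(weights, {}, {})

# For contrast, a non-empty configuration that doubles and then transposes.
changed = apply_hooks_and_transposes(
    weights,
    {"example.q_proj": lambda w: w * 2},
    {"example.q_proj": (1, 0)},
)
```

With both dictionaries empty, every weight is returned untouched, which is the behavior the two methods above describe for Qwen2.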

static lora_to_hf_mappings()

Provides the mapping for LoRA (Low-Rank Adaptation) weights.

Returns:

None, as LoRA mappings are not defined for this model.

static to_hf_mapping()

Returns the parameter mapping from the MaxText model to the HuggingFace-format model used by vLLM.

Currently, the parameter mapping conforms to the Tunix API, which combines the parameter name and the sharding specification in one dictionary. This may change in the future if the two are decoupled.
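To make the idea of a combined dictionary concrete, here is a minimal sketch, assuming each entry maps a source parameter name to a pair of target name and sharding specification. The exact structure expected by Tunix is not shown in this page, so the entry layout, the parameter names, the axis name "model", and the `decouple` helper are all hypothetical.

```python
# Hypothetical combined mapping: source name -> (target name, sharding spec).
# Names and sharding axes are placeholders, not real Qwen2 parameter paths.
COMBINED_MAPPING = {
    "src.token_embedder.embedding": ("dst.embed_tokens.weight", ("model", None)),
    "src.final_norm.scale": ("dst.norm.weight", (None,)),
}


def decouple(combined):
    """Split a combined mapping into separate name and sharding dictionaries."""
    names = {src: target for src, (target, _) in combined.items()}
    sharding = {src: spec for src, (_, spec) in combined.items()}
    return names, sharding


names, sharding = decouple(COMBINED_MAPPING)
```

Splitting the two concerns this way is one possible shape for the decoupling the docstring mentions; the actual future API may differ.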