maxtext.models.qwen2 module

Qwen2 family of model decoder layers.

class maxtext.models.qwen2.AttentionWithNorm(*args, **kwargs)

Bases: Module

Base class providing shared components: a self-attention block with normalization.

Parameters:

args (Any)
kwargs (Any)

Return type:

Any

apply_attention_with_norm(inputs, decoder_segment_ids, decoder_positions, deterministic, model_mode, kv_cache=None, attention_metadata=None)

Applies self-attention with pre- and post-layer normalization.

Parameters:

inputs (Array)
decoder_segment_ids (None | Array)
decoder_positions (None | Array)
deterministic (bool)
model_mode (str)
kv_cache (None | Array)
attention_metadata (None | dict[str, Any])
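
The pattern this method describes can be sketched in plain JAX. This is a minimal illustration, not the MaxText implementation: the names rms_norm, causal_self_attention, attention_with_norm_sketch, and the params dictionary are all hypothetical, and the use of RMS normalization, a single attention head, and a residual connection placed before the post-attention norm are assumptions made for the example. Segment IDs, positions, KV caching, and model_mode are left out.

```python
import jax
import jax.numpy as jnp


def rms_norm(x, scale, eps=1e-6):
    # Root-mean-square normalization over the feature axis (assumed norm type).
    var = jnp.mean(jnp.square(x), axis=-1, keepdims=True)
    return x * jax.lax.rsqrt(var + eps) * scale


def causal_self_attention(x, wq, wk, wv, wo):
    # Single-head causal attention; x has shape [batch, seq, dim].
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = jnp.einsum("bqd,bkd->bqk", q, k) / jnp.sqrt(q.shape[-1])
    seq = x.shape[1]
    mask = jnp.tril(jnp.ones((seq, seq), dtype=bool))
    scores = jnp.where(mask, scores, -1e9)  # hide future positions
    weights = jax.nn.softmax(scores, axis=-1)
    return jnp.einsum("bqk,bkd->bqd", weights, v) @ wo


def attention_with_norm_sketch(inputs, params):
    # Hypothetical stand-in for apply_attention_with_norm.
    normed = rms_norm(inputs, params["pre_scale"])        # pre-attention norm
    attn_out = causal_self_attention(
        normed, params["wq"], params["wk"], params["wv"], params["wo"]
    )
    residual = inputs + attn_out                          # assumed residual add
    hidden = rms_norm(residual, params["post_scale"])     # post-attention norm
    return hidden, residual


# Small demo with random weights.
dim = 8
keys = jax.random.split(jax.random.PRNGKey(0), 5)
params = {
    "pre_scale": jnp.ones((dim,)),
    "post_scale": jnp.ones((dim,)),
    "wq": 0.1 * jax.random.normal(keys[0], (dim, dim)),
    "wk": 0.1 * jax.random.normal(keys[1], (dim, dim)),
    "wv": 0.1 * jax.random.normal(keys[2], (dim, dim)),
    "wo": 0.1 * jax.random.normal(keys[3], (dim, dim)),
}
inputs = jax.random.normal(keys[4], (2, 4, dim))  # [batch, seq, dim]
hidden, residual = attention_with_norm_sketch(inputs, params)
```

Returning both the normalized hidden state and the un-normalized residual lets a subclass feed the former into its feed-forward block and add the result back onto the latter; whether the real method returns this pair should be checked against the source.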

class maxtext.models.qwen2.Qwen2DecoderLayer(*args, **kwargs)

Bases: AttentionWithNorm

Qwen2 Transformer decoder layer (dense).

Parameters:

args (Any)
kwargs (Any)

Return type:

Any
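
As a rough illustration of what "dense" means here, the sketch below shows a gated SiLU (SwiGLU-style) feed-forward block with a residual connection, the kind of non-mixture-of-experts MLP generally associated with Qwen2 models. The function names, weight names, and shapes are hypothetical, and how Qwen2DecoderLayer actually wires its MLP to the attention output is an assumption, not something this page states.

```python
import jax
import jax.numpy as jnp


def swiglu_mlp(x, w_gate, w_up, w_down):
    # Gated feed-forward: silu(x @ w_gate) scales (x @ w_up) elementwise.
    return (jax.nn.silu(x @ w_gate) * (x @ w_up)) @ w_down


def dense_layer_tail_sketch(hidden, residual, w_gate, w_up, w_down):
    # Hypothetical second half of a dense decoder layer:
    # feed-forward on the normalized hidden state, added back onto the residual.
    return residual + swiglu_mlp(hidden, w_gate, w_up, w_down)


# Small demo with random weights.
dim, ff_dim = 8, 16
keys = jax.random.split(jax.random.PRNGKey(0), 5)
hidden = jax.random.normal(keys[0], (2, 4, dim))    # normalized attention output
residual = jax.random.normal(keys[1], (2, 4, dim))  # pre-norm residual stream
w_gate = 0.1 * jax.random.normal(keys[2], (dim, ff_dim))
w_up = 0.1 * jax.random.normal(keys[3], (dim, ff_dim))
w_down = 0.1 * jax.random.normal(keys[4], (ff_dim, dim))
layer_out = dense_layer_tail_sketch(hidden, residual, w_gate, w_up, w_down)
```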
