maxtext.layers.encoders module

Module for encoder layers.

class maxtext.layers.encoders.VisionEncoder(*args, **kwargs)

Bases: Module

Vision encoder to encode images into soft tokens.

Parameters:
  • args (Any)
  • kwargs (Any)

Return type:
  Any
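
The reference above does not describe how images become soft tokens. As a rough illustration only, the following dependency-free sketch shows one common approach (ViT-style patch embedding): the image is cut into non-overlapping patches, and each flattened patch is projected to the model width, yielding one continuous vector ("soft token") per patch. The names `patchify`, `project`, and `encode_image`, the patch size, and the random weights are all assumptions made for this example; they are not part of MaxText and do not reflect how `VisionEncoder` is implemented.

```python
import random


def patchify(image, patch):
    """Split an H x W x C nested-list image into flattened, non-overlapping patches."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one patch x patch x C block into a single vector.
            flat = [
                channel
                for row in image[top:top + patch]
                for pixel in row[left:left + patch]
                for channel in pixel
            ]
            patches.append(flat)
    return patches


def project(vectors, weights):
    """Multiply each input vector by a (d_in x d_model) weight matrix."""
    columns = list(zip(*weights))
    return [
        [sum(x * w for x, w in zip(vec, column)) for column in columns]
        for vec in vectors
    ]


def encode_image(image, weights, patch):
    """Return one d_model-sized soft token per image patch (hypothetical helper)."""
    return project(patchify(image, patch), weights)


rng = random.Random(0)
# Toy 8 x 8 RGB image and a random stand-in for learned projection weights.
image = [[[rng.random() for _ in range(3)] for _ in range(8)] for _ in range(8)]
d_in, d_model = 4 * 4 * 3, 16
weights = [[rng.gauss(0.0, 0.02) for _ in range(d_model)] for _ in range(d_in)]

soft_tokens = encode_image(image, weights, patch=4)
print(len(soft_tokens), len(soft_tokens[0]))  # prints: 4 16
```

The point of the sketch is the shape contract: the number of soft tokens follows from the image and patch sizes, while the width of each token matches the embedding width the downstream model expects.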

class maxtext.layers.encoders.AudioEncoder(*args, **kwargs)

Bases: Module

Audio encoder to encode audio features into soft tokens.

Parameters:
  • args (Any)
  • kwargs (Any)

Return type:
  Any
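
For audio, the number of soft tokens typically depends on how many feature frames the input contains. The sketch below is a generic illustration of one possible reduction step, frame stacking, and is not taken from the `AudioEncoder` source: the helper `stack_frames`, the stride of 4, and the 80-bin feature size are all assumptions chosen for the example. In a real encoder each stacked vector would then pass through learned layers to reach the model width.

```python
def stack_frames(frames, stride):
    """Concatenate each run of `stride` consecutive frames into one vector.

    Any incomplete run at the end of the sequence is dropped.
    """
    usable = len(frames) - len(frames) % stride
    return [
        [value for frame in frames[start:start + stride] for value in frame]
        for start in range(0, usable, stride)
    ]


# Toy input: 10 frames of 80 features each (for example, log-mel bins).
frames = [[float(t)] * 80 for t in range(10)]
stacked = stack_frames(frames, stride=4)
print(len(stacked), len(stacked[0]))  # prints: 2 320
```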

maxtext.layers.encoders.vision_encoder_as_linen(config, mesh)

Creates a VisionEncoder module.

Parameters:
  • config (Any)
  • mesh (Mesh)

maxtext.layers.encoders.audio_encoder_as_linen(config, mesh)

Creates an AudioEncoder module.

Parameters:
  • config (Any)
  • mesh (Mesh)
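
A minimal usage sketch for the two factory functions follows. It relies only on the signatures documented above. The helper name `build_encoders` is hypothetical, `config` is assumed to be an already-initialized MaxText configuration object, and `mesh` is assumed to be a `jax.sharding.Mesh` (this page only says `Mesh`). The return types are not documented here; the `_as_linen` suffix suggests Flax Linen-compatible modules, but treat that as an assumption as well.

```python
def build_encoders(config, mesh):
    """Create the vision and audio encoder modules from one config and mesh.

    Hypothetical convenience wrapper; not part of MaxText.
    """
    # Imported lazily so the helper can be defined without MaxText installed.
    from maxtext.layers import encoders

    vision = encoders.vision_encoder_as_linen(config, mesh)
    audio = encoders.audio_encoder_as_linen(config, mesh)
    return vision, audio
```

Both factories take the same `(config, mesh)` pair, so a single configuration and device mesh can be shared across modalities.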