# Auto Classes

In many cases, the architecture you want to use can be guessed from the name or the path of the pretrained model you
are supplying to the `from_pretrained()` method. AutoClasses are here to do this job for you so that you
automatically retrieve the relevant model given the name/path to the pretrained weights/config/vocabulary.

Instantiating one of [AutoConfig](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoConfig), [AutoModel](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel), or
[AutoTokenizer](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoTokenizer) will directly create an instance of the relevant architecture. For instance

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("google-bert/bert-base-cased", device_map="auto")
```

will create a model that is an instance of [BertModel](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertModel).

There is one `AutoModel` class for each task.

## Extending the Auto Classes

Each of the auto classes has a method to be extended with your custom classes. For instance, if you have defined a
custom model class `NewModel`, make sure you also have a `NewModelConfig`; you can then add them to the auto
classes like this:

```python
from transformers import AutoConfig, AutoModel

AutoConfig.register("new-model", NewModelConfig)
AutoModel.register(NewModelConfig, NewModel)
```

You will then be able to use the auto classes as you usually would!

If your `NewModelConfig` is a subclass of [PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), make sure its
`model_type` attribute is set to the same key you use when registering the config (here `"new-model"`).

Likewise, if your `NewModel` is a subclass of [PreTrainedModel](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel), make sure its
`config_class` attribute is set to the same class you use when registering the model (here
`NewModelConfig`).
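Conceptually, the two `register` calls above populate two lookup tables: one from the `model_type` string to a config class, and one from the config class to a model class. The following is a rough, self-contained sketch of that registry pattern, not the actual transformers implementation; the names `CONFIG_MAPPING`, `MODEL_MAPPING`, and the minimal `NewModelConfig`/`NewModel` classes are illustrative stand-ins.

```python
# Illustrative sketch of the registry pattern behind the auto classes
# (not the actual transformers implementation).

class NewModelConfig:
    model_type = "new-model"  # must match the key used at registration

class NewModel:
    config_class = NewModelConfig  # must match the config used at registration

    def __init__(self, config):
        self.config = config

# AutoConfig.register maps a model_type string to a config class;
# AutoModel.register maps a config class to a model class.
CONFIG_MAPPING = {}
MODEL_MAPPING = {}

def register_config(model_type, config_cls):
    CONFIG_MAPPING[model_type] = config_cls

def register_model(config_cls, model_cls):
    MODEL_MAPPING[config_cls] = model_cls

register_config("new-model", NewModelConfig)
register_model(NewModelConfig, NewModel)

# Dispatch: resolve the model_type key to a config class,
# then the config class to a model class.
config = CONFIG_MAPPING["new-model"]()
model = MODEL_MAPPING[type(config)](config)
```

This is why the `model_type` and `config_class` attributes must match the registration keys: they are what the dispatch step uses to find the right classes.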

## AutoConfig[[transformers.AutoConfig]]

#### transformers.AutoConfig[[transformers.AutoConfig]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/configuration_auto.py#L264)

This is a generic configuration class that will be instantiated as one of the configuration classes of the library
when created with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoConfig.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoConfig.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/configuration_auto.py#L287)

`from_pretrained(pretrained_model_name_or_path: str | os.PathLike[str], **kwargs)`

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  Can be either:

  - A string, the *model id* of a pretrained model configuration hosted inside a model repo on
    huggingface.co.
  - A path to a *directory* containing a configuration file saved using the
    [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.save_pretrained) method, or the [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) method,
    e.g., `./my_model_directory/`.
  - A path to a saved configuration JSON *file*, e.g.,
    `./my_model_directory/configuration.json`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model configuration should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force (re-)downloading the model weights and configuration files, overriding the
  cached versions if they exist.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. Since models and other artifacts on huggingface.co are stored in a
  git-based system, `revision` can be any identifier allowed by git: a branch name, a tag name, or a
  commit id.
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final configuration object.

  If `True`, then this function returns a `Tuple(config, unused_kwargs)` where *unused_kwargs* is a
  dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the
  part of `kwargs` which has not been used to update `config` and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (additional keyword arguments, *optional*) --
  The values in kwargs of any keys which are configuration attributes will be used to override the loaded
  values. Behavior concerning key/value pairs whose keys are *not* configuration attributes is controlled
  by the `return_unused_kwargs` keyword parameter.
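The interaction between `kwargs` and `return_unused_kwargs` can be pictured as a simple split: keys that match configuration attributes override the loaded config, and the rest are set aside. The sketch below is hypothetical, not transformers code; `split_kwargs` and the `config_attrs` set are illustrative stand-ins for the configuration's real attributes.

```python
# Hypothetical sketch of how from_pretrained separates kwargs
# (not the actual transformers implementation).

def split_kwargs(config_attrs, **kwargs):
    """Separate kwargs into config overrides and unused leftovers."""
    used = {k: v for k, v in kwargs.items() if k in config_attrs}
    unused = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return used, unused

# Suppose the loaded config exposes these attributes:
config_attrs = {"hidden_size", "num_attention_heads", "output_attentions"}

used, unused = split_kwargs(config_attrs, output_attentions=True, foo=False)
# `output_attentions` overrides the loaded value; `foo` is not a config
# attribute, so it lands in unused_kwargs, which is returned to the caller
# only when return_unused_kwargs=True.
print(used)    # {'output_attentions': True}
print(unused)  # {'foo': False}
```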

Instantiate one of the configuration classes of the library from a pretrained model configuration.

The configuration class to instantiate is selected based on the `model_type` property of the config object that
is loaded, or when it's missing, by falling back to using pattern matching on `pretrained_model_name_or_path`:

- **EvollaModel** -- [EvollaConfig](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaConfig) (EvollaConfig model)
- **afmoe** -- [AfmoeConfig](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeConfig) (AfmoeConfig model)
- **aimv2** -- [Aimv2Config](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2Config) (Aimv2Config model)
- **aimv2_text_model** -- [Aimv2TextConfig](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2TextConfig) (Aimv2TextConfig model)
- **aimv2_vision_model** -- [Aimv2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2VisionConfig) (Aimv2VisionConfig model)
- **albert** -- [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) (AlbertConfig model)
- **align** -- [AlignConfig](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignConfig) (AlignConfig model)
- **align_text_model** -- [AlignTextConfig](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignTextConfig) (AlignTextConfig model)
- **align_vision_model** -- [AlignVisionConfig](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignVisionConfig) (AlignVisionConfig model)
- **altclip** -- [AltCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPConfig) (AltCLIPConfig model)
- **altclip_text_model** -- [AltCLIPTextConfig](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPTextConfig) (AltCLIPTextConfig model)
- **altclip_vision_model** -- [AltCLIPVisionConfig](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPVisionConfig) (AltCLIPVisionConfig model)
- **apertus** -- [ApertusConfig](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusConfig) (ApertusConfig model)
- **arcee** -- [ArceeConfig](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeConfig) (ArceeConfig model)
- **aria** -- [AriaConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaConfig) (AriaConfig model)
- **aria_text** -- [AriaTextConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextConfig) (AriaTextConfig model)
- **audio-spectrogram-transformer** -- [ASTConfig](/docs/transformers/v5.8.0/en/model_doc/audio-spectrogram-transformer#transformers.ASTConfig) (ASTConfig model)
- **audioflamingo3** -- [AudioFlamingo3Config](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3Config) (AudioFlamingo3Config model)
- **audioflamingo3_encoder** -- [AudioFlamingo3EncoderConfig](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3EncoderConfig) (AudioFlamingo3EncoderConfig model)
- **autoformer** -- [AutoformerConfig](/docs/transformers/v5.8.0/en/model_doc/autoformer#transformers.AutoformerConfig) (AutoformerConfig model)
- **aya_vision** -- [AyaVisionConfig](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionConfig) (AyaVisionConfig model)
- **bamba** -- [BambaConfig](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaConfig) (BambaConfig model)
- **bark** -- [BarkConfig](/docs/transformers/v5.8.0/en/model_doc/bark#transformers.BarkConfig) (BarkConfig model)
- **bart** -- [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) (BartConfig model)
- **beit** -- [BeitConfig](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitConfig) (BeitConfig model)
- **bert** -- [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) (BertConfig model)
- **bert-generation** -- [BertGenerationConfig](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationConfig) (BertGenerationConfig model)
- **big_bird** -- [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) (BigBirdConfig model)
- **bigbird_pegasus** -- [BigBirdPegasusConfig](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) (BigBirdPegasusConfig model)
- **biogpt** -- [BioGptConfig](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptConfig) (BioGptConfig model)
- **bit** -- [BitConfig](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitConfig) (BitConfig model)
- **bitnet** -- [BitNetConfig](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetConfig) (BitNetConfig model)
- **blenderbot** -- [BlenderbotConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotConfig) (BlenderbotConfig model)
- **blenderbot-small** -- [BlenderbotSmallConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) (BlenderbotSmallConfig model)
- **blip** -- [BlipConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipConfig) (BlipConfig model)
- **blip-2** -- [Blip2Config](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Config) (Blip2Config model)
- **blip_2_qformer** -- [Blip2QFormerConfig](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2QFormerConfig) (Blip2QFormerConfig model)
- **blip_2_vision_model** -- [Blip2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2VisionConfig) (Blip2VisionConfig model)
- **blip_text_model** -- [BlipTextConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipTextConfig) (BlipTextConfig model)
- **blip_vision_model** -- [BlipVisionConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipVisionConfig) (BlipVisionConfig model)
- **bloom** -- [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) (BloomConfig model)
- **blt** -- [BltConfig](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltConfig) (BltConfig model)
- **blt_global_transformer** -- `BltGlobalTransformerConfig` (BltGlobalTransformerConfig model)
- **blt_local_decoder** -- `BltLocalDecoderConfig` (BltLocalDecoderConfig model)
- **blt_local_encoder** -- `BltLocalEncoderConfig` (BltLocalEncoderConfig model)
- **blt_patcher** -- `BltPatcherConfig` (BltPatcherConfig model)
- **bridgetower** -- [BridgeTowerConfig](/docs/transformers/v5.8.0/en/model_doc/bridgetower#transformers.BridgeTowerConfig) (BridgeTowerConfig model)
- **bridgetower_text_model** -- [BridgeTowerTextConfig](/docs/transformers/v5.8.0/en/model_doc/bridgetower#transformers.BridgeTowerTextConfig) (BridgeTowerTextConfig model)
- **bridgetower_vision_model** -- [BridgeTowerVisionConfig](/docs/transformers/v5.8.0/en/model_doc/bridgetower#transformers.BridgeTowerVisionConfig) (BridgeTowerVisionConfig model)
- **bros** -- [BrosConfig](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosConfig) (BrosConfig model)
- **camembert** -- [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) (CamembertConfig model)
- **canine** -- [CanineConfig](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineConfig) (CanineConfig model)
- **chameleon** -- [ChameleonConfig](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonConfig) (ChameleonConfig model)
- **chameleon_vqgan** -- [ChameleonVQVAEConfig](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonVQVAEConfig) (ChameleonVQVAEConfig model)
- **chinese_clip** -- [ChineseCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPConfig) (ChineseCLIPConfig model)
- **chinese_clip_text_model** -- [ChineseCLIPTextConfig](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPTextConfig) (ChineseCLIPTextConfig model)
- **chinese_clip_vision_model** -- [ChineseCLIPVisionConfig](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPVisionConfig) (ChineseCLIPVisionConfig model)
- **chmv2** -- [CHMv2Config](/docs/transformers/v5.8.0/en/model_doc/chmv2#transformers.CHMv2Config) (CHMv2Config model)
- **clap** -- [ClapConfig](/docs/transformers/v5.8.0/en/model_doc/clap#transformers.ClapConfig) (ClapConfig model)
- **clap_audio_model** -- [ClapAudioConfig](/docs/transformers/v5.8.0/en/model_doc/clap#transformers.ClapAudioConfig) (ClapAudioConfig model)
- **clap_text_model** -- [ClapTextConfig](/docs/transformers/v5.8.0/en/model_doc/clap#transformers.ClapTextConfig) (ClapTextConfig model)
- **clip** -- [CLIPConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPConfig) (CLIPConfig model)
- **clip_text_model** -- [CLIPTextConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTextConfig) (CLIPTextConfig model)
- **clip_vision_model** -- [CLIPVisionConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPVisionConfig) (CLIPVisionConfig model)
- **clipseg** -- [CLIPSegConfig](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegConfig) (CLIPSegConfig model)
- **clipseg_text_model** -- [CLIPSegTextConfig](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegTextConfig) (CLIPSegTextConfig model)
- **clipseg_vision_model** -- [CLIPSegVisionConfig](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegVisionConfig) (CLIPSegVisionConfig model)
- **clvp** -- [ClvpConfig](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpConfig) (ClvpConfig model)
- **clvp_decoder** -- [ClvpDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpDecoderConfig) (ClvpDecoderConfig model)
- **clvp_encoder** -- [ClvpEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpEncoderConfig) (ClvpEncoderConfig model)
- **codegen** -- [CodeGenConfig](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenConfig) (CodeGenConfig model)
- **cohere** -- [CohereConfig](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereConfig) (CohereConfig model)
- **cohere2** -- [Cohere2Config](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2Config) (Cohere2Config model)
- **cohere2_vision** -- [Cohere2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionConfig) (Cohere2VisionConfig model)
- **cohere_asr** -- [CohereAsrConfig](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrConfig) (CohereAsrConfig model)
- **colmodernvbert** -- [ColModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/colmodernvbert#transformers.ColModernVBertConfig) (ColModernVBertConfig model)
- **colpali** -- [ColPaliConfig](/docs/transformers/v5.8.0/en/model_doc/colpali#transformers.ColPaliConfig) (ColPaliConfig model)
- **colqwen2** -- [ColQwen2Config](/docs/transformers/v5.8.0/en/model_doc/colqwen2#transformers.ColQwen2Config) (ColQwen2Config model)
- **conditional_detr** -- [ConditionalDetrConfig](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrConfig) (ConditionalDetrConfig model)
- **convbert** -- [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) (ConvBertConfig model)
- **convnext** -- [ConvNextConfig](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextConfig) (ConvNextConfig model)
- **convnextv2** -- [ConvNextV2Config](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2Config) (ConvNextV2Config model)
- **cpmant** -- [CpmAntConfig](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntConfig) (CpmAntConfig model)
- **csm** -- [CsmConfig](/docs/transformers/v5.8.0/en/model_doc/csm#transformers.CsmConfig) (CsmConfig model)
- **csm_depth_decoder_model** -- [CsmDepthDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/csm#transformers.CsmDepthDecoderConfig) (CsmDepthDecoderConfig model)
- **ctrl** -- [CTRLConfig](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLConfig) (CTRLConfig model)
- **cvt** -- [CvtConfig](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtConfig) (CvtConfig model)
- **cwm** -- [CwmConfig](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmConfig) (CwmConfig model)
- **d_fine** -- [DFineConfig](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineConfig) (DFineConfig model)
- **dab-detr** -- [DabDetrConfig](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrConfig) (DabDetrConfig model)
- **dac** -- [DacConfig](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacConfig) (DacConfig model)
- **data2vec-audio** -- [Data2VecAudioConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioConfig) (Data2VecAudioConfig model)
- **data2vec-text** -- [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) (Data2VecTextConfig model)
- **data2vec-vision** -- [Data2VecVisionConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionConfig) (Data2VecVisionConfig model)
- **dbrx** -- [DbrxConfig](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxConfig) (DbrxConfig model)
- **deberta** -- [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) (DebertaConfig model)
- **deberta-v2** -- [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) (DebertaV2Config model)
- **decision_transformer** -- [DecisionTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/decision_transformer#transformers.DecisionTransformerConfig) (DecisionTransformerConfig model)
- **deepseek_v2** -- [DeepseekV2Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2Config) (DeepseekV2Config model)
- **deepseek_v3** -- [DeepseekV3Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Config) (DeepseekV3Config model)
- **deepseek_v4** -- [DeepseekV4Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4Config) (DeepseekV4Config model)
- **deepseek_vl** -- [DeepseekVLConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLConfig) (DeepseekVLConfig model)
- **deepseek_vl_hybrid** -- [DeepseekVLHybridConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridConfig) (DeepseekVLHybridConfig model)
- **deformable_detr** -- [DeformableDetrConfig](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrConfig) (DeformableDetrConfig model)
- **deimv2** -- [Deimv2Config](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2Config) (Deimv2Config model)
- **deit** -- [DeiTConfig](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTConfig) (DeiTConfig model)
- **depth_anything** -- [DepthAnythingConfig](/docs/transformers/v5.8.0/en/model_doc/depth_anything#transformers.DepthAnythingConfig) (DepthAnythingConfig model)
- **depth_pro** -- [DepthProConfig](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProConfig) (DepthProConfig model)
- **detr** -- [DetrConfig](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrConfig) (DetrConfig model)
- **dia** -- [DiaConfig](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaConfig) (DiaConfig model)
- **dia_decoder** -- [DiaDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaDecoderConfig) (DiaDecoderConfig model)
- **dia_encoder** -- [DiaEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaEncoderConfig) (DiaEncoderConfig model)
- **diffllama** -- [DiffLlamaConfig](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaConfig) (DiffLlamaConfig model)
- **dinat** -- [DinatConfig](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatConfig) (DinatConfig model)
- **dinov2** -- [Dinov2Config](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2Config) (Dinov2Config model)
- **dinov2_with_registers** -- [Dinov2WithRegistersConfig](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersConfig) (Dinov2WithRegistersConfig model)
- **dinov3_convnext** -- [DINOv3ConvNextConfig](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ConvNextConfig) (DINOv3ConvNextConfig model)
- **dinov3_vit** -- [DINOv3ViTConfig](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ViTConfig) (DINOv3ViTConfig model)
- **distilbert** -- [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) (DistilBertConfig model)
- **doge** -- [DogeConfig](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeConfig) (DogeConfig model)
- **donut-swin** -- [DonutSwinConfig](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinConfig) (DonutSwinConfig model)
- **dots1** -- [Dots1Config](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1Config) (Dots1Config model)
- **dpr** -- [DPRConfig](/docs/transformers/v5.8.0/en/model_doc/dpr#transformers.DPRConfig) (DPRConfig model)
- **dpt** -- [DPTConfig](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTConfig) (DPTConfig model)
- **edgetam** -- [EdgeTamConfig](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamConfig) (EdgeTamConfig model)
- **edgetam_video** -- [EdgeTamVideoConfig](/docs/transformers/v5.8.0/en/model_doc/edgetam_video#transformers.EdgeTamVideoConfig) (EdgeTamVideoConfig model)
- **edgetam_vision_model** -- [EdgeTamVisionConfig](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamVisionConfig) (EdgeTamVisionConfig model)
- **efficientloftr** -- [EfficientLoFTRConfig](/docs/transformers/v5.8.0/en/model_doc/efficientloftr#transformers.EfficientLoFTRConfig) (EfficientLoFTRConfig model)
- **efficientnet** -- [EfficientNetConfig](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetConfig) (EfficientNetConfig model)
- **electra** -- [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) (ElectraConfig model)
- **emu3** -- [Emu3Config](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Config) (Emu3Config model)
- **emu3_text_model** -- [Emu3TextConfig](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3TextConfig) (Emu3TextConfig model)
- **emu3_vqgan** -- [Emu3VQVAEConfig](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3VQVAEConfig) (Emu3VQVAEConfig model)
- **encodec** -- [EncodecConfig](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecConfig) (EncodecConfig model)
- **encoder-decoder** -- [EncoderDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/encoder-decoder#transformers.EncoderDecoderConfig) (EncoderDecoderConfig model)
- **eomt** -- [EomtConfig](/docs/transformers/v5.8.0/en/model_doc/eomt#transformers.EomtConfig) (EomtConfig model)
- **eomt_dinov3** -- [EomtDinov3Config](/docs/transformers/v5.8.0/en/model_doc/eomt_dinov3#transformers.EomtDinov3Config) (EomtDinov3Config model)
- **ernie** -- [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) (ErnieConfig model)
- **ernie4_5** -- [Ernie4_5Config](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5Config) (Ernie4_5Config model)
- **ernie4_5_moe** -- [Ernie4_5_MoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeConfig) (Ernie4_5_MoeConfig model)
- **ernie4_5_vl_moe** -- [Ernie4_5_VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeConfig) (Ernie4_5_VLMoeConfig model)
- **ernie4_5_vl_moe_text** -- [Ernie4_5_VLMoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeTextConfig) (Ernie4_5_VLMoeTextConfig model)
- **ernie4_5_vl_moe_vision** -- [Ernie4_5_VLMoeVisionConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeVisionConfig) (Ernie4_5_VLMoeVisionConfig model)
- **esm** -- [EsmConfig](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmConfig) (EsmConfig model)
- **eurobert** -- [EuroBertConfig](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertConfig) (EuroBertConfig model)
- **evolla** -- [EvollaConfig](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaConfig) (EvollaConfig model)
- **exaone4** -- [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) (Exaone4Config model)
- **exaone4_5** -- [Exaone4_5_Config](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_Config) (Exaone4_5_Config model)
- **exaone4_5_vision** -- [Exaone4_5_VisionConfig](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_VisionConfig) (Exaone4_5_VisionConfig model)
- **exaone_moe** -- [ExaoneMoeConfig](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeConfig) (ExaoneMoeConfig model)
- **falcon** -- [FalconConfig](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconConfig) (FalconConfig model)
- **falcon_h1** -- [FalconH1Config](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1Config) (FalconH1Config model)
- **falcon_mamba** -- [FalconMambaConfig](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaConfig) (FalconMambaConfig model)
- **fast_vlm** -- [FastVlmConfig](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmConfig) (FastVlmConfig model)
- **fastspeech2_conformer** -- [FastSpeech2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerConfig) (FastSpeech2ConformerConfig model)
- **fastspeech2_conformer_hifigan** -- [FastSpeech2ConformerHifiGanConfig](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerHifiGanConfig) (FastSpeech2ConformerHifiGanConfig model)
- **fastspeech2_conformer_with_hifigan** -- [FastSpeech2ConformerWithHifiGanConfig](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerWithHifiGanConfig) (FastSpeech2ConformerWithHifiGanConfig model)
- **flaubert** -- [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) (FlaubertConfig model)
- **flava** -- [FlavaConfig](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaConfig) (FlavaConfig model)
- **flava_image_model** -- [FlavaImageConfig](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaImageConfig) (FlavaImageConfig model)
- **flava_multimodal_model** -- [FlavaMultimodalConfig](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaMultimodalConfig) (FlavaMultimodalConfig model)
- **flava_text_model** -- [FlavaTextConfig](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaTextConfig) (FlavaTextConfig model)
- **flex_olmo** -- [FlexOlmoConfig](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoConfig) (FlexOlmoConfig model)
- **florence2** -- [Florence2Config](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Config) (Florence2Config model)
- **florence_vision** -- [Florence2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2VisionConfig) (Florence2VisionConfig model)
- **fnet** -- [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) (FNetConfig model)
- **focalnet** -- [FocalNetConfig](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetConfig) (FocalNetConfig model)
- **fsmt** -- [FSMTConfig](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTConfig) (FSMTConfig model)
- **funnel** -- [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) (FunnelConfig model)
- **fuyu** -- [FuyuConfig](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuConfig) (FuyuConfig model)
- **gemma** -- [GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaConfig) (GemmaConfig model)
- **gemma2** -- [Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Config) (Gemma2Config model)
- **gemma3** -- [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) (Gemma3Config model)
- **gemma3_text** -- [Gemma3TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextConfig) (Gemma3TextConfig model)
- **gemma3n** -- [Gemma3nConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nConfig) (Gemma3nConfig model)
- **gemma3n_audio** -- [Gemma3nAudioConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nAudioConfig) (Gemma3nAudioConfig model)
- **gemma3n_text** -- [Gemma3nTextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nTextConfig) (Gemma3nTextConfig model)
- **gemma3n_vision** -- [Gemma3nVisionConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nVisionConfig) (Gemma3nVisionConfig model)
- **gemma4** -- [Gemma4Config](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Config) (Gemma4Config model)
- **gemma4_assistant** -- [Gemma4AssistantConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4_assistant#transformers.Gemma4AssistantConfig) (Gemma4AssistantConfig model)
- **gemma4_audio** -- [Gemma4AudioConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4AudioConfig) (Gemma4AudioConfig model)
- **gemma4_text** -- [Gemma4TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4TextConfig) (Gemma4TextConfig model)
- **gemma4_vision** -- [Gemma4VisionConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4VisionConfig) (Gemma4VisionConfig model)
- **git** -- [GitConfig](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitConfig) (GitConfig model)
- **git_vision_model** -- [GitVisionConfig](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitVisionConfig) (GitVisionConfig model)
- **glm** -- [GlmConfig](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmConfig) (GlmConfig model)
- **glm4** -- [Glm4Config](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Config) (Glm4Config model)
- **glm46v** -- [Glm46VConfig](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VConfig) (Glm46VConfig model)
- **glm4_moe** -- [Glm4MoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeConfig) (Glm4MoeConfig model)
- **glm4_moe_lite** -- [Glm4MoeLiteConfig](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteConfig) (Glm4MoeLiteConfig model)
- **glm4v** -- [Glm4vConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vConfig) (Glm4vConfig model)
- **glm4v_moe** -- [Glm4vMoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeConfig) (Glm4vMoeConfig model)
- **glm4v_moe_text** -- [Glm4vMoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeTextConfig) (Glm4vMoeTextConfig model)
- **glm4v_moe_vision** -- [Glm4vMoeVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeVisionConfig) (Glm4vMoeVisionConfig model)
- **glm4v_text** -- [Glm4vTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vTextConfig) (Glm4vTextConfig model)
- **glm4v_vision** -- [Glm4vVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vVisionConfig) (Glm4vVisionConfig model)
- **glm_image** -- [GlmImageConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageConfig) (GlmImageConfig model)
- **glm_image_text** -- [GlmImageTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageTextConfig) (GlmImageTextConfig model)
- **glm_image_vision** -- [GlmImageVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVisionConfig) (GlmImageVisionConfig model)
- **glm_image_vqmodel** -- [GlmImageVQVAEConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVQVAEConfig) (GlmImageVQVAEConfig model)
- **glm_moe_dsa** -- [GlmMoeDsaConfig](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaConfig) (GlmMoeDsaConfig model)
- **glm_ocr** -- [GlmOcrConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrConfig) (GlmOcrConfig model)
- **glm_ocr_text** -- [GlmOcrTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrTextConfig) (GlmOcrTextConfig model)
- **glm_ocr_vision** -- [GlmOcrVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrVisionConfig) (GlmOcrVisionConfig model)
- **glmasr** -- [GlmAsrConfig](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrConfig) (GlmAsrConfig model)
- **glmasr_encoder** -- [GlmAsrEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrEncoderConfig) (GlmAsrEncoderConfig model)
- **glpn** -- [GLPNConfig](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNConfig) (GLPNConfig model)
- **got_ocr2** -- [GotOcr2Config](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Config) (GotOcr2Config model)
- **gpt-sw3** -- [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) (GPT2Config model)
- **gpt2** -- [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) (GPT2Config model)
- **gpt_bigcode** -- [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) (GPTBigCodeConfig model)
- **gpt_neo** -- [GPTNeoConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoConfig) (GPTNeoConfig model)
- **gpt_neox** -- [GPTNeoXConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXConfig) (GPTNeoXConfig model)
- **gpt_neox_japanese** -- [GPTNeoXJapaneseConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseConfig) (GPTNeoXJapaneseConfig model)
- **gpt_oss** -- [GptOssConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssConfig) (GptOssConfig model)
- **gptj** -- [GPTJConfig](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJConfig) (GPTJConfig model)
- **granite** -- [GraniteConfig](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteConfig) (GraniteConfig model)
- **granite4_vision** -- [Granite4VisionConfig](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionConfig) (Granite4VisionConfig model)
- **granite4_vision_text** -- [Granite4VisionTextConfig](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionTextConfig) (Granite4VisionTextConfig model)
- **granite_speech** -- [GraniteSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechConfig) (GraniteSpeechConfig model)
- **granite_speech_encoder** -- [GraniteSpeechEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechEncoderConfig) (GraniteSpeechEncoderConfig model)
- **granite_speech_plus** -- [GraniteSpeechPlusConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusConfig) (GraniteSpeechPlusConfig model)
- **granite_speech_plus_encoder** -- [GraniteSpeechPlusEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusEncoderConfig) (GraniteSpeechPlusEncoderConfig model)
- **granitemoe** -- [GraniteMoeConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeConfig) (GraniteMoeConfig model)
- **granitemoehybrid** -- [GraniteMoeHybridConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridConfig) (GraniteMoeHybridConfig model)
- **granitemoeshared** -- [GraniteMoeSharedConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedConfig) (GraniteMoeSharedConfig model)
- **grounding-dino** -- [GroundingDinoConfig](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoConfig) (GroundingDinoConfig model)
- **groupvit** -- [GroupViTConfig](/docs/transformers/v5.8.0/en/model_doc/groupvit#transformers.GroupViTConfig) (GroupViTConfig model)
- **groupvit_text_model** -- [GroupViTTextConfig](/docs/transformers/v5.8.0/en/model_doc/groupvit#transformers.GroupViTTextConfig) (GroupViTTextConfig model)
- **groupvit_vision_model** -- [GroupViTVisionConfig](/docs/transformers/v5.8.0/en/model_doc/groupvit#transformers.GroupViTVisionConfig) (GroupViTVisionConfig model)
- **helium** -- [HeliumConfig](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumConfig) (HeliumConfig model)
- **hgnet_v2** -- [HGNetV2Config](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2Config) (HGNetV2Config model)
- **hiera** -- [HieraConfig](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraConfig) (HieraConfig model)
- **higgs_audio_v2** -- [HiggsAudioV2Config](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2#transformers.HiggsAudioV2Config) (HiggsAudioV2Config model)
- **higgs_audio_v2_tokenizer** -- [HiggsAudioV2TokenizerConfig](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerConfig) (HiggsAudioV2TokenizerConfig model)
- **hubert** -- [HubertConfig](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertConfig) (HubertConfig model)
- **hunyuan_v1_dense** -- [HunYuanDenseV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1Config) (HunYuanDenseV1Config model)
- **hunyuan_v1_moe** -- [HunYuanMoEV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1Config) (HunYuanMoEV1Config model)
- **hy_v3** -- [HYV3Config](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3Config) (HYV3Config model)
- **ibert** -- [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) (IBertConfig model)
- **idefics** -- [IdeficsConfig](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsConfig) (IdeficsConfig model)
- **idefics2** -- [Idefics2Config](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Config) (Idefics2Config model)
- **idefics2_perceiver** -- [Idefics2PerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2PerceiverConfig) (Idefics2PerceiverConfig model)
- **idefics2_vision** -- [Idefics2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2VisionConfig) (Idefics2VisionConfig model)
- **idefics3** -- [Idefics3Config](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Config) (Idefics3Config model)
- **idefics3_vision** -- [Idefics3VisionConfig](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3VisionConfig) (Idefics3VisionConfig model)
- **idefics_perciever** -- [IdeficsPerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsPerceiverConfig) (IdeficsPerceiverConfig model)
- **idefics_vision** -- [IdeficsVisionConfig](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsVisionConfig) (IdeficsVisionConfig model)
- **ijepa** -- [IJepaConfig](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaConfig) (IJepaConfig model)
- **imagegpt** -- [ImageGPTConfig](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTConfig) (ImageGPTConfig model)
- **informer** -- [InformerConfig](/docs/transformers/v5.8.0/en/model_doc/informer#transformers.InformerConfig) (InformerConfig model)
- **instructblip** -- [InstructBlipConfig](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipConfig) (InstructBlipConfig model)
- **instructblip_qformer** -- [InstructBlipQFormerConfig](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipQFormerConfig) (InstructBlipQFormerConfig model)
- **instructblip_vision_model** -- [InstructBlipVisionConfig](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipVisionConfig) (InstructBlipVisionConfig model)
- **instructblipvideo** -- [InstructBlipVideoConfig](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoConfig) (InstructBlipVideoConfig model)
- **instructblipvideo_qformer** -- [InstructBlipVideoQFormerConfig](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoQFormerConfig) (InstructBlipVideoQFormerConfig model)
- **instructblipvideo_vision_model** -- [InstructBlipVideoVisionConfig](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoVisionConfig) (InstructBlipVideoVisionConfig model)
- **internvl** -- [InternVLConfig](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLConfig) (InternVLConfig model)
- **internvl_vision** -- [InternVLVisionConfig](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLVisionConfig) (InternVLVisionConfig model)
- **jais2** -- [Jais2Config](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2Config) (Jais2Config model)
- **jamba** -- [JambaConfig](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaConfig) (JambaConfig model)
- **janus** -- [JanusConfig](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusConfig) (JanusConfig model)
- **janus_vision_model** -- [JanusVisionConfig](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusVisionConfig) (JanusVisionConfig model)
- **janus_vqgan** -- [JanusVQVAEConfig](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusVQVAEConfig) (JanusVQVAEConfig model)
- **jetmoe** -- [JetMoeConfig](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeConfig) (JetMoeConfig model)
- **jina_embeddings_v3** -- [JinaEmbeddingsV3Config](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Config) (JinaEmbeddingsV3Config model)
- **kosmos-2** -- [Kosmos2Config](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2Config) (Kosmos2Config model)
- **kosmos-2.5** -- [Kosmos2_5Config](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5Config) (Kosmos2_5Config model)
- **kosmos_2_5_text_model** -- [Kosmos2_5TextConfig](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5TextConfig) (Kosmos2_5TextConfig model)
- **kosmos_2_5_vision_model** -- [Kosmos2_5VisionConfig](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5VisionConfig) (Kosmos2_5VisionConfig model)
- **kosmos_2_text_model** -- [Kosmos2TextConfig](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2TextConfig) (Kosmos2TextConfig model)
- **kosmos_2_vision_model** -- [Kosmos2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2VisionConfig) (Kosmos2VisionConfig model)
- **kyutai_speech_to_text** -- [KyutaiSpeechToTextConfig](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextConfig) (KyutaiSpeechToTextConfig model)
- **laguna** -- [LagunaConfig](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaConfig) (LagunaConfig model)
- **lasr_ctc** -- [LasrCTCConfig](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrCTCConfig) (LasrCTCConfig model)
- **lasr_encoder** -- [LasrEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrEncoderConfig) (LasrEncoderConfig model)
- **layoutlm** -- [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) (LayoutLMConfig model)
- **layoutlmv2** -- [LayoutLMv2Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Config) (LayoutLMv2Config model)
- **layoutlmv3** -- [LayoutLMv3Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Config) (LayoutLMv3Config model)
- **layoutxlm** -- [LayoutXLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutxlm#transformers.LayoutXLMConfig) (LayoutXLMConfig model)
- **led** -- [LEDConfig](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDConfig) (LEDConfig model)
- **levit** -- [LevitConfig](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitConfig) (LevitConfig model)
- **lfm2** -- [Lfm2Config](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2Config) (Lfm2Config model)
- **lfm2_moe** -- [Lfm2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeConfig) (Lfm2MoeConfig model)
- **lfm2_vl** -- [Lfm2VlConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlConfig) (Lfm2VlConfig model)
- **lightglue** -- [LightGlueConfig](/docs/transformers/v5.8.0/en/model_doc/lightglue#transformers.LightGlueConfig) (LightGlueConfig model)
- **lighton_ocr** -- [LightOnOcrConfig](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrConfig) (LightOnOcrConfig model)
- **lilt** -- [LiltConfig](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltConfig) (LiltConfig model)
- **llama** -- [LlamaConfig](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaConfig) (LlamaConfig model)
- **llama4** -- [Llama4Config](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4Config) (Llama4Config model)
- **llama4_text** -- [Llama4TextConfig](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4TextConfig) (Llama4TextConfig model)
- **llama4_vision_model** -- [Llama4VisionConfig](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4VisionConfig) (Llama4VisionConfig model)
- **llava** -- [LlavaConfig](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaConfig) (LlavaConfig model)
- **llava_next** -- [LlavaNextConfig](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextConfig) (LlavaNextConfig model)
- **llava_next_video** -- [LlavaNextVideoConfig](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoConfig) (LlavaNextVideoConfig model)
- **llava_onevision** -- [LlavaOnevisionConfig](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionConfig) (LlavaOnevisionConfig model)
- **longcat_flash** -- [LongcatFlashConfig](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashConfig) (LongcatFlashConfig model)
- **longformer** -- [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) (LongformerConfig model)
- **longt5** -- [LongT5Config](/docs/transformers/v5.8.0/en/model_doc/longt5#transformers.LongT5Config) (LongT5Config model)
- **luke** -- [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) (LukeConfig model)
- **lw_detr** -- [LwDetrConfig](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrConfig) (LwDetrConfig model)
- **lw_detr_vit** -- [LwDetrViTConfig](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrViTConfig) (LwDetrViTConfig model)
- **lxmert** -- [LxmertConfig](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertConfig) (LxmertConfig model)
- **m2m_100** -- [M2M100Config](/docs/transformers/v5.8.0/en/model_doc/m2m_100#transformers.M2M100Config) (M2M100Config model)
- **mamba** -- [MambaConfig](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaConfig) (MambaConfig model)
- **mamba2** -- [Mamba2Config](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2Config) (Mamba2Config model)
- **marian** -- [MarianConfig](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianConfig) (MarianConfig model)
- **markuplm** -- [MarkupLMConfig](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMConfig) (MarkupLMConfig model)
- **mask2former** -- [Mask2FormerConfig](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerConfig) (Mask2FormerConfig model)
- **maskformer** -- [MaskFormerConfig](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerConfig) (MaskFormerConfig model)
- **maskformer-swin** -- `MaskFormerSwinConfig` (MaskFormerSwinConfig model)
- **mbart** -- [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) (MBartConfig model)
- **megatron-bert** -- [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) (MegatronBertConfig model)
- **metaclip_2** -- [MetaClip2Config](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Config) (MetaClip2Config model)
- **metaclip_2_text_model** -- [MetaClip2TextConfig](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2TextConfig) (MetaClip2TextConfig model)
- **metaclip_2_vision_model** -- [MetaClip2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2VisionConfig) (MetaClip2VisionConfig model)
- **mgp-str** -- [MgpstrConfig](/docs/transformers/v5.8.0/en/model_doc/mgp-str#transformers.MgpstrConfig) (MgpstrConfig model)
- **mimi** -- [MimiConfig](/docs/transformers/v5.8.0/en/model_doc/mimi#transformers.MimiConfig) (MimiConfig model)
- **minicpmv4_6** -- [MiniCPMV4_6Config](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6Config) (MiniCPMV4_6Config model)
- **minicpmv4_6_vision** -- [MiniCPMV4_6VisionConfig](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6VisionConfig) (MiniCPMV4_6VisionConfig model)
- **minimax** -- [MiniMaxConfig](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxConfig) (MiniMaxConfig model)
- **minimax_m2** -- [MiniMaxM2Config](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2Config) (MiniMaxM2Config model)
- **ministral** -- [MinistralConfig](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralConfig) (MinistralConfig model)
- **ministral3** -- [Ministral3Config](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Config) (Ministral3Config model)
- **mistral** -- [MistralConfig](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralConfig) (MistralConfig model)
- **mistral3** -- [Mistral3Config](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Config) (Mistral3Config model)
- **mistral4** -- [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) (Mistral4Config model)
- **mixtral** -- [MixtralConfig](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralConfig) (MixtralConfig model)
- **mlcd** -- [MLCDVisionConfig](/docs/transformers/v5.8.0/en/model_doc/mlcd#transformers.MLCDVisionConfig) (MLCDVisionConfig model)
- **mlcd_vision_model** -- [MLCDVisionConfig](/docs/transformers/v5.8.0/en/model_doc/mlcd#transformers.MLCDVisionConfig) (MLCDVisionConfig model)
- **mllama** -- [MllamaConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaConfig) (MllamaConfig model)
- **mllama_text_model** -- [MllamaTextConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaTextConfig) (MllamaTextConfig model)
- **mllama_vision_model** -- [MllamaVisionConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaVisionConfig) (MllamaVisionConfig model)
- **mm-grounding-dino** -- [MMGroundingDinoConfig](/docs/transformers/v5.8.0/en/model_doc/mm-grounding-dino#transformers.MMGroundingDinoConfig) (MMGroundingDinoConfig model)
- **mobilebert** -- [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) (MobileBertConfig model)
- **mobilenet_v1** -- [MobileNetV1Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1Config) (MobileNetV1Config model)
- **mobilenet_v2** -- [MobileNetV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2Config) (MobileNetV2Config model)
- **mobilevit** -- [MobileViTConfig](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTConfig) (MobileViTConfig model)
- **mobilevitv2** -- [MobileViTV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2Config) (MobileViTV2Config model)
- **modernbert** -- [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) (ModernBertConfig model)
- **modernbert-decoder** -- [ModernBertDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderConfig) (ModernBertDecoderConfig model)
- **modernvbert** -- [ModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertConfig) (ModernVBertConfig model)
- **moonshine** -- [MoonshineConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineConfig) (MoonshineConfig model)
- **moonshine_streaming** -- [MoonshineStreamingConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingConfig) (MoonshineStreamingConfig model)
- **moonshine_streaming_encoder** -- [MoonshineStreamingEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingEncoderConfig) (MoonshineStreamingEncoderConfig model)
- **moshi** -- [MoshiConfig](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiConfig) (MoshiConfig model)
- **moshi_depth** -- [MoshiDepthConfig](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiDepthConfig) (MoshiDepthConfig model)
- **mpnet** -- [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) (MPNetConfig model)
- **mpt** -- [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) (MptConfig model)
- **mra** -- [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) (MraConfig model)
- **mt5** -- [MT5Config](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Config) (MT5Config model)
- **musicflamingo** -- [MusicFlamingoConfig](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoConfig) (MusicFlamingoConfig model)
- **musicgen** -- [MusicgenConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenConfig) (MusicgenConfig model)
- **musicgen_decoder** -- [MusicgenDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenDecoderConfig) (MusicgenDecoderConfig model)
- **musicgen_melody** -- [MusicgenMelodyConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyConfig) (MusicgenMelodyConfig model)
- **musicgen_melody_decoder** -- [MusicgenMelodyDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyDecoderConfig) (MusicgenMelodyDecoderConfig model)
- **mvp** -- [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) (MvpConfig model)
- **nanochat** -- [NanoChatConfig](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatConfig) (NanoChatConfig model)
- **nemotron** -- [NemotronConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronConfig) (NemotronConfig model)
- **nemotron_h** -- [NemotronHConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHConfig) (NemotronHConfig model)
- **nllb-moe** -- [NllbMoeConfig](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeConfig) (NllbMoeConfig model)
- **nomic_bert** -- [NomicBertConfig](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertConfig) (NomicBertConfig model)
- **nougat** -- [NougatConfig](/docs/transformers/v5.8.0/en/model_doc/nougat#transformers.NougatConfig) (NougatConfig model)
- **nystromformer** -- [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) (NystromformerConfig model)
- **olmo** -- [OlmoConfig](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoConfig) (OlmoConfig model)
- **olmo2** -- [Olmo2Config](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2Config) (Olmo2Config model)
- **olmo3** -- [Olmo3Config](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3Config) (Olmo3Config model)
- **olmo_hybrid** -- [OlmoHybridConfig](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridConfig) (OlmoHybridConfig model)
- **olmoe** -- [OlmoeConfig](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeConfig) (OlmoeConfig model)
- **omdet-turbo** -- [OmDetTurboConfig](/docs/transformers/v5.8.0/en/model_doc/omdet-turbo#transformers.OmDetTurboConfig) (OmDetTurboConfig model)
- **oneformer** -- [OneFormerConfig](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerConfig) (OneFormerConfig model)
- **openai-gpt** -- [OpenAIGPTConfig](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTConfig) (OpenAIGPTConfig model)
- **openai_privacy_filter** -- [OpenAIPrivacyFilterConfig](/docs/transformers/v5.8.0/en/model_doc/openai_privacy_filter#transformers.OpenAIPrivacyFilterConfig) (OpenAIPrivacyFilterConfig model)
- **opt** -- [OPTConfig](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTConfig) (OPTConfig model)
- **ovis2** -- [Ovis2Config](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2Config) (Ovis2Config model)
- **owlv2** -- [Owlv2Config](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2Config) (Owlv2Config model)
- **owlv2_text_model** -- [Owlv2TextConfig](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2TextConfig) (Owlv2TextConfig model)
- **owlv2_vision_model** -- [Owlv2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2VisionConfig) (Owlv2VisionConfig model)
- **owlvit** -- [OwlViTConfig](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTConfig) (OwlViTConfig model)
- **owlvit_text_model** -- [OwlViTTextConfig](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTTextConfig) (OwlViTTextConfig model)
- **owlvit_vision_model** -- [OwlViTVisionConfig](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTVisionConfig) (OwlViTVisionConfig model)
- **paddleocr_vl** -- [PaddleOCRVLConfig](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVLConfig) (PaddleOCRVLConfig model)
- **paddleocr_vl_text** -- [PaddleOCRTextConfig](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRTextConfig) (PaddleOCRTextConfig model)
- **paddleocr_vl_vision** -- [PaddleOCRVisionConfig](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVisionConfig) (PaddleOCRVisionConfig model)
- **paligemma** -- [PaliGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaConfig) (PaliGemmaConfig model)
- **parakeet_ctc** -- [ParakeetCTCConfig](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetCTCConfig) (ParakeetCTCConfig model)
- **parakeet_encoder** -- [ParakeetEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetEncoderConfig) (ParakeetEncoderConfig model)
- **patchtsmixer** -- [PatchTSMixerConfig](/docs/transformers/v5.8.0/en/model_doc/patchtsmixer#transformers.PatchTSMixerConfig) (PatchTSMixerConfig model)
- **patchtst** -- [PatchTSTConfig](/docs/transformers/v5.8.0/en/model_doc/patchtst#transformers.PatchTSTConfig) (PatchTSTConfig model)
- **pe_audio** -- [PeAudioConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioConfig) (PeAudioConfig model)
- **pe_audio_encoder** -- [PeAudioEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioEncoderConfig) (PeAudioEncoderConfig model)
- **pe_audio_video** -- [PeAudioVideoConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoConfig) (PeAudioVideoConfig model)
- **pe_audio_video_encoder** -- [PeAudioVideoEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoEncoderConfig) (PeAudioVideoEncoderConfig model)
- **pe_video** -- [PeVideoConfig](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoConfig) (PeVideoConfig model)
- **pe_video_encoder** -- [PeVideoEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoEncoderConfig) (PeVideoEncoderConfig model)
- **pegasus** -- [PegasusConfig](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusConfig) (PegasusConfig model)
- **pegasus_x** -- [PegasusXConfig](/docs/transformers/v5.8.0/en/model_doc/pegasus_x#transformers.PegasusXConfig) (PegasusXConfig model)
- **perceiver** -- [PerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverConfig) (PerceiverConfig model)
- **perception_lm** -- [PerceptionLMConfig](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMConfig) (PerceptionLMConfig model)
- **persimmon** -- [PersimmonConfig](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonConfig) (PersimmonConfig model)
- **phi** -- [PhiConfig](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiConfig) (PhiConfig model)
- **phi3** -- [Phi3Config](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Config) (Phi3Config model)
- **phi4_multimodal** -- [Phi4MultimodalConfig](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalConfig) (Phi4MultimodalConfig model)
- **phi4_multimodal_audio** -- [Phi4MultimodalAudioConfig](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalAudioConfig) (Phi4MultimodalAudioConfig model)
- **phi4_multimodal_vision** -- [Phi4MultimodalVisionConfig](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalVisionConfig) (Phi4MultimodalVisionConfig model)
- **phimoe** -- [PhimoeConfig](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeConfig) (PhimoeConfig model)
- **pi0** -- [PI0Config](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0Config) (PI0Config model)
- **pix2struct** -- [Pix2StructConfig](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructConfig) (Pix2StructConfig model)
- **pix2struct_text_model** -- [Pix2StructTextConfig](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructTextConfig) (Pix2StructTextConfig model)
- **pix2struct_vision_model** -- [Pix2StructVisionConfig](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructVisionConfig) (Pix2StructVisionConfig model)
- **pixio** -- [PixioConfig](/docs/transformers/v5.8.0/en/model_doc/pixio#transformers.PixioConfig) (PixioConfig model)
- **pixtral** -- [PixtralVisionConfig](/docs/transformers/v5.8.0/en/model_doc/pixtral#transformers.PixtralVisionConfig) (PixtralVisionConfig model)
- **plbart** -- [PLBartConfig](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartConfig) (PLBartConfig model)
- **poolformer** -- [PoolFormerConfig](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerConfig) (PoolFormerConfig model)
- **pop2piano** -- [Pop2PianoConfig](/docs/transformers/v5.8.0/en/model_doc/pop2piano#transformers.Pop2PianoConfig) (Pop2PianoConfig model)
- **pp_chart2table** -- [PPChart2TableConfig](/docs/transformers/v5.8.0/en/model_doc/pp_chart2table#transformers.PPChart2TableConfig) (PPChart2TableConfig model)
- **pp_doclayout_v2** -- [PPDocLayoutV2Config](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v2#transformers.PPDocLayoutV2Config) (PPDocLayoutV2Config model)
- **pp_doclayout_v3** -- [PPDocLayoutV3Config](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3Config) (PPDocLayoutV3Config model)
- **pp_formulanet** -- [PPFormulaNetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_formulanet#transformers.PPFormulaNetConfig) (PPFormulaNetConfig model)
- **pp_lcnet** -- [PPLCNetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_lcnet#transformers.PPLCNetConfig) (PPLCNetConfig model)
- **pp_lcnet_v3** -- [PPLCNetV3Config](/docs/transformers/v5.8.0/en/model_doc/pp_lcnet_v3#transformers.PPLCNetV3Config) (PPLCNetV3Config model)
- **pp_ocrv5_mobile_det** -- [PPOCRV5MobileDetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_det#transformers.PPOCRV5MobileDetConfig) (PPOCRV5MobileDetConfig model)
- **pp_ocrv5_mobile_rec** -- [PPOCRV5MobileRecConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecConfig) (PPOCRV5MobileRecConfig model)
- **pp_ocrv5_server_det** -- [PPOCRV5ServerDetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_det#transformers.PPOCRV5ServerDetConfig) (PPOCRV5ServerDetConfig model)
- **pp_ocrv5_server_rec** -- [PPOCRV5ServerRecConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecConfig) (PPOCRV5ServerRecConfig model)
- **prompt_depth_anything** -- [PromptDepthAnythingConfig](/docs/transformers/v5.8.0/en/model_doc/prompt_depth_anything#transformers.PromptDepthAnythingConfig) (PromptDepthAnythingConfig model)
- **prophetnet** -- [ProphetNetConfig](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetConfig) (ProphetNetConfig model)
- **pvt** -- [PvtConfig](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtConfig) (PvtConfig model)
- **pvt_v2** -- [PvtV2Config](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2Config) (PvtV2Config model)
- **qianfan_ocr** -- [QianfanOCRConfig](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRConfig) (QianfanOCRConfig model)
- **qianfan_ocr_vision** -- [QianfanOCRVisionConfig](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRVisionConfig) (QianfanOCRVisionConfig model)
- **qwen2** -- [Qwen2Config](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Config) (Qwen2Config model)
- **qwen2_5_omni** -- [Qwen2_5OmniConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniConfig) (Qwen2_5OmniConfig model)
- **qwen2_5_omni_audio_encoder** -- [Qwen2_5OmniAudioEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniAudioEncoderConfig) (Qwen2_5OmniAudioEncoderConfig model)
- **qwen2_5_omni_bigvgan** -- [Qwen2_5OmniBigVGANConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniBigVGANConfig) (Qwen2_5OmniBigVGANConfig model)
- **qwen2_5_omni_dit** -- [Qwen2_5OmniDiTConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniDiTConfig) (Qwen2_5OmniDiTConfig model)
- **qwen2_5_omni_talker** -- [Qwen2_5OmniTalkerConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniTalkerConfig) (Qwen2_5OmniTalkerConfig model)
- **qwen2_5_omni_text** -- [Qwen2_5OmniTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniTextConfig) (Qwen2_5OmniTextConfig model)
- **qwen2_5_omni_thinker** -- [Qwen2_5OmniThinkerConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniThinkerConfig) (Qwen2_5OmniThinkerConfig model)
- **qwen2_5_omni_token2wav** -- [Qwen2_5OmniToken2WavConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniToken2WavConfig) (Qwen2_5OmniToken2WavConfig model)
- **qwen2_5_omni_vision_encoder** -- [Qwen2_5OmniVisionEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniVisionEncoderConfig) (Qwen2_5OmniVisionEncoderConfig model)
- **qwen2_5_vl** -- [Qwen2_5_VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLConfig) (Qwen2_5_VLConfig model)
- **qwen2_5_vl_text** -- [Qwen2_5_VLTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLTextConfig) (Qwen2_5_VLTextConfig model)
- **qwen2_5_vl_vision** -- [Qwen2_5_VLVisionConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLVisionConfig) (Qwen2_5_VLVisionConfig model)
- **qwen2_audio** -- [Qwen2AudioConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioConfig) (Qwen2AudioConfig model)
- **qwen2_audio_encoder** -- [Qwen2AudioEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioEncoderConfig) (Qwen2AudioEncoderConfig model)
- **qwen2_moe** -- [Qwen2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeConfig) (Qwen2MoeConfig model)
- **qwen2_vl** -- [Qwen2VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLConfig) (Qwen2VLConfig model)
- **qwen2_vl_text** -- [Qwen2VLTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLTextConfig) (Qwen2VLTextConfig model)
- **qwen2_vl_vision** -- [Qwen2VLVisionConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLVisionConfig) (Qwen2VLVisionConfig model)
- **qwen3** -- [Qwen3Config](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Config) (Qwen3Config model)
- **qwen3_5** -- [Qwen3_5Config](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Config) (Qwen3_5Config model)
- **qwen3_5_moe** -- [Qwen3_5MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeConfig) (Qwen3_5MoeConfig model)
- **qwen3_5_moe_text** -- [Qwen3_5MoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeTextConfig) (Qwen3_5MoeTextConfig model)
- **qwen3_5_moe_vision** -- [Qwen3_5MoeVisionConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeVisionConfig) (Qwen3_5MoeVisionConfig model)
- **qwen3_5_text** -- [Qwen3_5TextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5TextConfig) (Qwen3_5TextConfig model)
- **qwen3_5_vision** -- [Qwen3_5VisionConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5VisionConfig) (Qwen3_5VisionConfig model)
- **qwen3_moe** -- [Qwen3MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeConfig) (Qwen3MoeConfig model)
- **qwen3_next** -- [Qwen3NextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextConfig) (Qwen3NextConfig model)
- **qwen3_omni_moe** -- [Qwen3OmniMoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeConfig) (Qwen3OmniMoeConfig model)
- **qwen3_omni_moe_audio_encoder** -- [Qwen3OmniMoeAudioEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeAudioEncoderConfig) (Qwen3OmniMoeAudioEncoderConfig model)
- **qwen3_omni_moe_talker_code_predictor** -- [Qwen3OmniMoeTalkerCodePredictorConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeTalkerCodePredictorConfig) (Qwen3OmniMoeTalkerCodePredictorConfig model)
- **qwen3_omni_moe_talker_text** -- [Qwen3OmniMoeTalkerTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeTalkerTextConfig) (Qwen3OmniMoeTalkerTextConfig model)
- **qwen3_omni_moe_text** -- [Qwen3OmniMoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeTextConfig) (Qwen3OmniMoeTextConfig model)
- **qwen3_omni_moe_thinker** -- [Qwen3OmniMoeThinkerConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeThinkerConfig) (Qwen3OmniMoeThinkerConfig model)
- **qwen3_omni_moe_vision_encoder** -- [Qwen3OmniMoeVisionEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeVisionEncoderConfig) (Qwen3OmniMoeVisionEncoderConfig model)
- **qwen3_vl** -- [Qwen3VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLConfig) (Qwen3VLConfig model)
- **qwen3_vl_moe** -- [Qwen3VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeConfig) (Qwen3VLMoeConfig model)
- **qwen3_vl_moe_text** -- [Qwen3VLMoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeTextConfig) (Qwen3VLMoeTextConfig model)
- **qwen3_vl_moe_vision** -- [Qwen3VLMoeVisionConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeVisionConfig) (Qwen3VLMoeVisionConfig model)
- **qwen3_vl_text** -- [Qwen3VLTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLTextConfig) (Qwen3VLTextConfig model)
- **qwen3_vl_vision** -- [Qwen3VLVisionConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLVisionConfig) (Qwen3VLVisionConfig model)
- **rag** -- [RagConfig](/docs/transformers/v5.8.0/en/model_doc/rag#transformers.RagConfig) (RagConfig model)
- **recurrent_gemma** -- [RecurrentGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaConfig) (RecurrentGemmaConfig model)
- **reformer** -- [ReformerConfig](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerConfig) (ReformerConfig model)
- **regnet** -- [RegNetConfig](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetConfig) (RegNetConfig model)
- **rembert** -- [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) (RemBertConfig model)
- **resnet** -- [ResNetConfig](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetConfig) (ResNetConfig model)
- **roberta** -- [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) (RobertaConfig model)
- **roberta-prelayernorm** -- [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) (RobertaPreLayerNormConfig model)
- **roc_bert** -- [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) (RoCBertConfig model)
- **roformer** -- [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) (RoFormerConfig model)
- **rt_detr** -- [RTDetrConfig](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrConfig) (RTDetrConfig model)
- **rt_detr_resnet** -- [RTDetrResNetConfig](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrResNetConfig) (RTDetrResNetConfig model)
- **rt_detr_v2** -- [RTDetrV2Config](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2Config) (RTDetrV2Config model)
- **rwkv** -- [RwkvConfig](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvConfig) (RwkvConfig model)
- **sam** -- [SamConfig](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamConfig) (SamConfig model)
- **sam2** -- [Sam2Config](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2Config) (Sam2Config model)
- **sam2_hiera_det_model** -- [Sam2HieraDetConfig](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2HieraDetConfig) (Sam2HieraDetConfig model)
- **sam2_video** -- [Sam2VideoConfig](/docs/transformers/v5.8.0/en/model_doc/sam2_video#transformers.Sam2VideoConfig) (Sam2VideoConfig model)
- **sam2_vision_model** -- [Sam2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2VisionConfig) (Sam2VisionConfig model)
- **sam3** -- [Sam3Config](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3Config) (Sam3Config model)
- **sam3_detr_decoder** -- [Sam3DETRDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3DETRDecoderConfig) (Sam3DETRDecoderConfig model)
- **sam3_detr_encoder** -- [Sam3DETREncoderConfig](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3DETREncoderConfig) (Sam3DETREncoderConfig model)
- **sam3_geometry_encoder** -- [Sam3GeometryEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3GeometryEncoderConfig) (Sam3GeometryEncoderConfig model)
- **sam3_lite_text** -- [Sam3LiteTextConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextConfig) (Sam3LiteTextConfig model)
- **sam3_lite_text_detr_decoder** -- [Sam3LiteTextDETRDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextDETRDecoderConfig) (Sam3LiteTextDETRDecoderConfig model)
- **sam3_lite_text_detr_encoder** -- [Sam3LiteTextDETREncoderConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextDETREncoderConfig) (Sam3LiteTextDETREncoderConfig model)
- **sam3_lite_text_geometry_encoder** -- [Sam3LiteTextGeometryEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextGeometryEncoderConfig) (Sam3LiteTextGeometryEncoderConfig model)
- **sam3_lite_text_mask_decoder** -- [Sam3LiteTextMaskDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextMaskDecoderConfig) (Sam3LiteTextMaskDecoderConfig model)
- **sam3_lite_text_text_model** -- [Sam3LiteTextTextConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextTextConfig) (Sam3LiteTextTextConfig model)
- **sam3_mask_decoder** -- [Sam3MaskDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3MaskDecoderConfig) (Sam3MaskDecoderConfig model)
- **sam3_tracker** -- [Sam3TrackerConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker#transformers.Sam3TrackerConfig) (Sam3TrackerConfig model)
- **sam3_tracker_video** -- [Sam3TrackerVideoConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker_video#transformers.Sam3TrackerVideoConfig) (Sam3TrackerVideoConfig model)
- **sam3_video** -- [Sam3VideoConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_video#transformers.Sam3VideoConfig) (Sam3VideoConfig model)
- **sam3_vision_model** -- [Sam3VisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3VisionConfig) (Sam3VisionConfig model)
- **sam3_vit_model** -- [Sam3ViTConfig](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3ViTConfig) (Sam3ViTConfig model)
- **sam_hq** -- [SamHQConfig](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQConfig) (SamHQConfig model)
- **sam_hq_vision_model** -- [SamHQVisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQVisionConfig) (SamHQVisionConfig model)
- **sam_vision_model** -- [SamVisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamVisionConfig) (SamVisionConfig model)
- **seamless_m4t** -- [SeamlessM4TConfig](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TConfig) (SeamlessM4TConfig model)
- **seamless_m4t_v2** -- [SeamlessM4Tv2Config](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2Config) (SeamlessM4Tv2Config model)
- **seed_oss** -- [SeedOssConfig](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssConfig) (SeedOssConfig model)
- **segformer** -- [SegformerConfig](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerConfig) (SegformerConfig model)
- **seggpt** -- [SegGptConfig](/docs/transformers/v5.8.0/en/model_doc/seggpt#transformers.SegGptConfig) (SegGptConfig model)
- **sew** -- [SEWConfig](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWConfig) (SEWConfig model)
- **sew-d** -- [SEWDConfig](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDConfig) (SEWDConfig model)
- **shieldgemma2** -- [ShieldGemma2Config](/docs/transformers/v5.8.0/en/model_doc/shieldgemma2#transformers.ShieldGemma2Config) (ShieldGemma2Config model)
- **siglip** -- [SiglipConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipConfig) (SiglipConfig model)
- **siglip2** -- [Siglip2Config](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Config) (Siglip2Config model)
- **siglip2_text_model** -- [Siglip2TextConfig](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2TextConfig) (Siglip2TextConfig model)
- **siglip2_vision_model** -- [Siglip2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2VisionConfig) (Siglip2VisionConfig model)
- **siglip_text_model** -- [SiglipTextConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipTextConfig) (SiglipTextConfig model)
- **siglip_vision_model** -- [SiglipVisionConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipVisionConfig) (SiglipVisionConfig model)
- **slanet** -- [SLANetConfig](/docs/transformers/v5.8.0/en/model_doc/slanet#transformers.SLANetConfig) (SLANetConfig model)
- **slanext** -- [SLANeXtConfig](/docs/transformers/v5.8.0/en/model_doc/slanext#transformers.SLANeXtConfig) (SLANeXtConfig model)
- **smollm3** -- [SmolLM3Config](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Config) (SmolLM3Config model)
- **smolvlm** -- [SmolVLMConfig](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMConfig) (SmolVLMConfig model)
- **smolvlm_vision** -- [SmolVLMVisionConfig](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMVisionConfig) (SmolVLMVisionConfig model)
- **solar_open** -- [SolarOpenConfig](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenConfig) (SolarOpenConfig model)
- **speech-encoder-decoder** -- [SpeechEncoderDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/speech-encoder-decoder#transformers.SpeechEncoderDecoderConfig) (SpeechEncoderDecoderConfig model)
- **speech_to_text** -- [Speech2TextConfig](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextConfig) (Speech2TextConfig model)
- **speecht5** -- [SpeechT5Config](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5Config) (SpeechT5Config model)
- **speecht5_hifigan** -- [SpeechT5HifiGanConfig](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5HifiGanConfig) (SpeechT5HifiGanConfig model)
- **splinter** -- [SplinterConfig](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterConfig) (SplinterConfig model)
- **squeezebert** -- [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) (SqueezeBertConfig model)
- **stablelm** -- [StableLmConfig](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmConfig) (StableLmConfig model)
- **starcoder2** -- [Starcoder2Config](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Config) (Starcoder2Config model)
- **superglue** -- [SuperGlueConfig](/docs/transformers/v5.8.0/en/model_doc/superglue#transformers.SuperGlueConfig) (SuperGlueConfig model)
- **superpoint** -- [SuperPointConfig](/docs/transformers/v5.8.0/en/model_doc/superpoint#transformers.SuperPointConfig) (SuperPointConfig model)
- **swiftformer** -- [SwiftFormerConfig](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerConfig) (SwiftFormerConfig model)
- **swin** -- [SwinConfig](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinConfig) (SwinConfig model)
- **swin2sr** -- [Swin2SRConfig](/docs/transformers/v5.8.0/en/model_doc/swin2sr#transformers.Swin2SRConfig) (Swin2SRConfig model)
- **swinv2** -- [Swinv2Config](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2Config) (Swinv2Config model)
- **switch_transformers** -- [SwitchTransformersConfig](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersConfig) (SwitchTransformersConfig model)
- **t5** -- [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) (T5Config model)
- **t5_gemma_module** -- [T5GemmaModuleConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaModuleConfig) (T5GemmaModuleConfig model)
- **t5gemma** -- [T5GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaConfig) (T5GemmaConfig model)
- **t5gemma2** -- [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) (T5Gemma2Config model)
- **t5gemma2_decoder** -- [T5Gemma2DecoderConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2DecoderConfig) (T5Gemma2DecoderConfig model)
- **t5gemma2_encoder** -- [T5Gemma2EncoderConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2EncoderConfig) (T5Gemma2EncoderConfig model)
- **t5gemma2_text** -- [T5Gemma2TextConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2TextConfig) (T5Gemma2TextConfig model)
- **table-transformer** -- [TableTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerConfig) (TableTransformerConfig model)
- **tapas** -- [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) (TapasConfig model)
- **textnet** -- [TextNetConfig](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetConfig) (TextNetConfig model)
- **time_series_transformer** -- [TimeSeriesTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/time_series_transformer#transformers.TimeSeriesTransformerConfig) (TimeSeriesTransformerConfig model)
- **timesfm** -- [TimesFmConfig](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmConfig) (TimesFmConfig model)
- **timesfm2_5** -- [TimesFm2_5Config](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5Config) (TimesFm2_5Config model)
- **timesformer** -- [TimesformerConfig](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerConfig) (TimesformerConfig model)
- **timm_backbone** -- [TimmBackboneConfig](/docs/transformers/v5.8.0/en/main_classes/backbones#transformers.TimmBackboneConfig) (TimmBackboneConfig model)
- **timm_wrapper** -- [TimmWrapperConfig](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperConfig) (TimmWrapperConfig model)
- **trocr** -- [TrOCRConfig](/docs/transformers/v5.8.0/en/model_doc/trocr#transformers.TrOCRConfig) (TrOCRConfig model)
- **tvp** -- [TvpConfig](/docs/transformers/v5.8.0/en/model_doc/tvp#transformers.TvpConfig) (TvpConfig model)
- **udop** -- [UdopConfig](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopConfig) (UdopConfig model)
- **umt5** -- [UMT5Config](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Config) (UMT5Config model)
- **unispeech** -- [UniSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechConfig) (UniSpeechConfig model)
- **unispeech-sat** -- [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) (UniSpeechSatConfig model)
- **univnet** -- [UnivNetConfig](/docs/transformers/v5.8.0/en/model_doc/univnet#transformers.UnivNetConfig) (UnivNetConfig model)
- **upernet** -- [UperNetConfig](/docs/transformers/v5.8.0/en/model_doc/upernet#transformers.UperNetConfig) (UperNetConfig model)
- **uvdoc** -- [UVDocConfig](/docs/transformers/v5.8.0/en/model_doc/uvdoc#transformers.UVDocConfig) (UVDocConfig model)
- **uvdoc_backbone** -- [UVDocBackboneConfig](/docs/transformers/v5.8.0/en/model_doc/uvdoc#transformers.UVDocBackboneConfig) (UVDocBackboneConfig model)
- **vaultgemma** -- [VaultGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaConfig) (VaultGemmaConfig model)
- **vibevoice_acoustic_tokenizer** -- [VibeVoiceAcousticTokenizerConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerConfig) (VibeVoiceAcousticTokenizerConfig model)
- **vibevoice_acoustic_tokenizer_decoder** -- [VibeVoiceAcousticTokenizerDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerDecoderConfig) (VibeVoiceAcousticTokenizerDecoderConfig model)
- **vibevoice_acoustic_tokenizer_encoder** -- [VibeVoiceAcousticTokenizerEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerEncoderConfig) (VibeVoiceAcousticTokenizerEncoderConfig model)
- **vibevoice_asr** -- [VibeVoiceAsrConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrConfig) (VibeVoiceAsrConfig model)
- **video_llama_3** -- [VideoLlama3Config](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3Config) (VideoLlama3Config model)
- **video_llama_3_vision** -- [VideoLlama3VisionConfig](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3VisionConfig) (VideoLlama3VisionConfig model)
- **video_llava** -- [VideoLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaConfig) (VideoLlavaConfig model)
- **videomae** -- [VideoMAEConfig](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEConfig) (VideoMAEConfig model)
- **videomt** -- [VideomtConfig](/docs/transformers/v5.8.0/en/model_doc/videomt#transformers.VideomtConfig) (VideomtConfig model)
- **vilt** -- [ViltConfig](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltConfig) (ViltConfig model)
- **vipllava** -- [VipLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaConfig) (VipLlavaConfig model)
- **vision-encoder-decoder** -- [VisionEncoderDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderConfig) (VisionEncoderDecoderConfig model)
- **vision-text-dual-encoder** -- [VisionTextDualEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/vision-text-dual-encoder#transformers.VisionTextDualEncoderConfig) (VisionTextDualEncoderConfig model)
- **visual_bert** -- [VisualBertConfig](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertConfig) (VisualBertConfig model)
- **vit** -- [ViTConfig](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTConfig) (ViTConfig model)
- **vit_mae** -- [ViTMAEConfig](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEConfig) (ViTMAEConfig model)
- **vit_msn** -- [ViTMSNConfig](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNConfig) (ViTMSNConfig model)
- **vitdet** -- [VitDetConfig](/docs/transformers/v5.8.0/en/model_doc/vitdet#transformers.VitDetConfig) (VitDetConfig model)
- **vitmatte** -- [VitMatteConfig](/docs/transformers/v5.8.0/en/model_doc/vitmatte#transformers.VitMatteConfig) (VitMatteConfig model)
- **vitpose** -- [VitPoseConfig](/docs/transformers/v5.8.0/en/model_doc/vitpose#transformers.VitPoseConfig) (VitPoseConfig model)
- **vitpose_backbone** -- `VitPoseBackboneConfig` (VitPoseBackboneConfig model)
- **vits** -- [VitsConfig](/docs/transformers/v5.8.0/en/model_doc/vits#transformers.VitsConfig) (VitsConfig model)
- **vivit** -- [VivitConfig](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitConfig) (VivitConfig model)
- **vjepa2** -- [VJEPA2Config](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2Config) (VJEPA2Config model)
- **voxtral** -- [VoxtralConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralConfig) (VoxtralConfig model)
- **voxtral_encoder** -- [VoxtralEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralEncoderConfig) (VoxtralEncoderConfig model)
- **voxtral_realtime** -- [VoxtralRealtimeConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeConfig) (VoxtralRealtimeConfig model)
- **voxtral_realtime_encoder** -- [VoxtralRealtimeEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeEncoderConfig) (VoxtralRealtimeEncoderConfig model)
- **voxtral_realtime_text** -- [VoxtralRealtimeTextConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeTextConfig) (VoxtralRealtimeTextConfig model)
- **wav2vec2** -- [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) (Wav2Vec2Config model)
- **wav2vec2-bert** -- [Wav2Vec2BertConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertConfig) (Wav2Vec2BertConfig model)
- **wav2vec2-conformer** -- [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) (Wav2Vec2ConformerConfig model)
- **wavlm** -- [WavLMConfig](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMConfig) (WavLMConfig model)
- **whisper** -- [WhisperConfig](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperConfig) (WhisperConfig model)
- **xclip** -- [XCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/xclip#transformers.XCLIPConfig) (XCLIPConfig model)
- **xclip_text_model** -- [XCLIPTextConfig](/docs/transformers/v5.8.0/en/model_doc/xclip#transformers.XCLIPTextConfig) (XCLIPTextConfig model)
- **xclip_vision_model** -- [XCLIPVisionConfig](/docs/transformers/v5.8.0/en/model_doc/xclip#transformers.XCLIPVisionConfig) (XCLIPVisionConfig model)
- **xcodec** -- [XcodecConfig](/docs/transformers/v5.8.0/en/model_doc/xcodec#transformers.XcodecConfig) (XcodecConfig model)
- **xglm** -- [XGLMConfig](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMConfig) (XGLMConfig model)
- **xlm** -- [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) (XLMConfig model)
- **xlm-roberta** -- [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) (XLMRobertaConfig model)
- **xlm-roberta-xl** -- [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) (XLMRobertaXLConfig model)
- **xlnet** -- [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) (XLNetConfig model)
- **xlstm** -- [xLSTMConfig](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMConfig) (xLSTMConfig model)
- **xmod** -- [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) (XmodConfig model)
- **yolos** -- [YolosConfig](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosConfig) (YolosConfig model)
- **yoso** -- [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) (YosoConfig model)
- **youtu** -- [YoutuConfig](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuConfig) (YoutuConfig model)
- **zamba** -- [ZambaConfig](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaConfig) (ZambaConfig model)
- **zamba2** -- [Zamba2Config](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2Config) (Zamba2Config model)
- **zoedepth** -- [ZoeDepthConfig](/docs/transformers/v5.8.0/en/model_doc/zoedepth#transformers.ZoeDepthConfig) (ZoeDepthConfig model)
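
Conceptually, the mapping above is a registry from a `model_type` string to a config class, extended via `AutoConfig.register`. The following is a minimal, hypothetical sketch of that dispatch, not the actual `transformers` internals; `WhisperConfig` here is a stand-in class, and the `register`/`for_model` helpers only mimic the corresponding `AutoConfig` class methods:

```python
# Illustrative sketch of how an Auto class resolves a `model_type` key
# to a config class -- NOT the real transformers implementation.

CONFIG_MAPPING = {}  # model_type string -> config class

def register(model_type, config_class):
    """Mimics AutoConfig.register: bind a model_type key to a config class."""
    if model_type in CONFIG_MAPPING:
        raise ValueError(f"{model_type!r} is already registered")
    CONFIG_MAPPING[model_type] = config_class

def for_model(model_type, **kwargs):
    """Mimics AutoConfig.for_model: instantiate the config for a key."""
    if model_type not in CONFIG_MAPPING:
        raise ValueError(f"Unrecognized model type: {model_type!r}")
    return CONFIG_MAPPING[model_type](**kwargs)

class WhisperConfig:  # stand-in for a real config class
    model_type = "whisper"
    def __init__(self, vocab_size=51865):
        self.vocab_size = vocab_size

register("whisper", WhisperConfig)
config = for_model("whisper", vocab_size=32000)
print(type(config).__name__, config.vocab_size)  # WhisperConfig 32000
```

The real `from_pretrained()` adds one step in front of this lookup: it first reads the `model_type` field out of the downloaded `config.json`, then dispatches as above.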

Examples:

```python
>>> from transformers import AutoConfig

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased")

>>> # Download configuration from huggingface.co (user-uploaded) and cache.
>>> config = AutoConfig.from_pretrained("dbmdz/bert-base-german-cased")

>>> # If configuration file is in a directory (e.g., was saved using *save_pretrained('./test/saved_model/')*).
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/")

>>> # Load a specific configuration file.
>>> config = AutoConfig.from_pretrained("./test/bert_saved_model/my_configuration.json")

>>> # Change some config attributes when loading a pretrained config.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-uncased", output_attentions=True, foo=False)
>>> config.output_attentions
True

>>> config, unused_kwargs = AutoConfig.from_pretrained(
...     "google-bert/bert-base-uncased", output_attentions=True, foo=False, return_unused_kwargs=True
... )
>>> config.output_attentions
True

>>> unused_kwargs
{'foo': False}
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:

  - A string, the *model id* of a pretrained model configuration hosted inside a model repo on huggingface.co.
  - A path to a *directory* containing a configuration file saved using the [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.save_pretrained) method, or the [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) method, e.g., `./my_model_directory/`.
  - A path to a saved configuration JSON *file*, e.g., `./my_model_directory/configuration.json`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, this function returns just the final configuration object. If `True`, it returns a `Tuple(config, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of `kwargs` which has not been used to update `config` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (additional keyword arguments, *optional*) : The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* configuration attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoConfig.register]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/configuration_auto.py#L424)

Register a new configuration for this class.

**Parameters:**

model_type (`str`) : The model type like "bert" or "gpt".

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The config to register.
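As a sketch of how `register` slots into the auto machinery: the `"my-new-model"` key and `MyNewModelConfig` class below are hypothetical, and the import falls back to the older `PretrainedConfig` spelling in case a pre-v5 release is installed.

```python
from transformers import AutoConfig

try:
    from transformers import PreTrainedConfig  # v5 naming
except ImportError:
    from transformers import PretrainedConfig as PreTrainedConfig  # pre-v5 naming

# Hypothetical custom config; model_type must match the key used at registration.
class MyNewModelConfig(PreTrainedConfig):
    model_type = "my-new-model"

    def __init__(self, hidden_size=64, **kwargs):
        self.hidden_size = hidden_size
        super().__init__(**kwargs)

AutoConfig.register("my-new-model", MyNewModelConfig)

# AutoConfig now resolves the registered key to the custom class.
config = AutoConfig.for_model("my-new-model", hidden_size=128)
```

After registration, `AutoConfig.from_pretrained` will likewise resolve any checkpoint whose config declares `"model_type": "my-new-model"` to this class.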

## AutoTokenizer[[transformers.AutoTokenizer]]

#### transformers.AutoTokenizer[[transformers.AutoTokenizer]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/tokenization_auto.py#L564)

This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when
created with the [AutoTokenizer.from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoTokenizer.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).
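A minimal usage sketch, mirroring the `AutoConfig` examples above (the checkpoint downloads from huggingface.co on first use):

```python
from transformers import AutoTokenizer

# Download vocabulary and tokenizer config from huggingface.co and cache them.
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

# The concrete class is picked from the checkpoint's config (here, a BERT tokenizer).
encoded = tokenizer("Hello world!")
print(type(tokenizer).__name__)
print(encoded["input_ids"])
```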

#### from_pretrained[[transformers.AutoTokenizer.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/tokenization_auto.py#L578)

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  Can be either:

  - A string, the *model id* of a predefined tokenizer hosted inside a model repo on huggingface.co.
  - A path to a *directory* containing vocabulary files required by the tokenizer, for instance saved
    using the [save_pretrained()](/docs/transformers/v5.8.0/en/internal/tokenization_utils#transformers.PreTrainedTokenizerBase.save_pretrained) method, e.g., `./my_model_directory/`.
  - A path to a single saved vocabulary file, if and only if the tokenizer only requires a
    single vocabulary file (like BERT or XLNet), e.g., `./my_model_directory/vocab.txt`. (Not
    applicable to all derived classes.)
- **inputs** (additional positional arguments, *optional*) --
  Will be passed along to the Tokenizer `__init__()` method.
- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) --
  The configuration object used to determine the tokenizer class to instantiate.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model configuration should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force the (re-)download of the model weights and configuration files, overriding the
  cached versions if they exist.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **subfolder** (`str`, *optional*) --
  In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for
  facebook/rag-token-base), specify it here.
- **tokenizer_type** (`str`, *optional*) --
  Tokenizer type to be loaded.
- **backend** (`str`, *optional*, defaults to `"tokenizers"`) --
  Backend to use for tokenization. Valid options are:
  - `"tokenizers"`: Use the HuggingFace tokenizers library backend (default)
  - `"sentencepiece"`: Use the SentencePiece backend
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (additional keyword arguments, *optional*) --
  Will be passed to the Tokenizer `__init__()` method. Can be used to set special tokens like
  `bos_token`, `eos_token`, `unk_token`, `sep_token`, `pad_token`, `cls_token`, `mask_token`,
  `additional_special_tokens`. See the parameters of `__init__()` for more details.

Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.

The tokenizer class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or, when that is missing, by
falling back to pattern matching on `pretrained_model_name_or_path`:

- **aimv2** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (Aimv2Config model)
- **albert** -- [AlbertTokenizer](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertTokenizer) (AlbertConfig model)
- **align** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (AlignConfig model)
- **audioflamingo3** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (AudioFlamingo3Config model)
- **aya_vision** -- [CohereTokenizer](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereTokenizer) (AyaVisionConfig model)
- **bark** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (BarkConfig model)
- **bart** -- [RobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (BartConfig model)
- **bert** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (BertConfig model)
- **bert-generation** -- [BertGenerationTokenizer](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationTokenizer) (BertGenerationConfig model)
- **big_bird** -- [BigBirdTokenizer](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdTokenizer) (BigBirdConfig model)
- **bigbird_pegasus** -- [PegasusTokenizer](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusTokenizer) (BigBirdPegasusConfig model)
- **biogpt** -- [BioGptTokenizer](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptTokenizer) (BioGptConfig model)
- **blenderbot** -- [BlenderbotTokenizer](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotTokenizer) (BlenderbotConfig model)
- **blenderbot-small** -- [BlenderbotSmallTokenizer](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallTokenizer) (BlenderbotSmallConfig model)
- **blip** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (BlipConfig model)
- **blip-2** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (Blip2Config model)
- **bridgetower** -- [RobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (BridgeTowerConfig model)
- **bros** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (BrosConfig model)
- **camembert** -- [CamembertTokenizer](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertTokenizer) (CamembertConfig model)
- **canine** -- [CanineTokenizer](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineTokenizer) (CanineConfig model)
- **chameleon** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (ChameleonConfig model)
- **chinese_clip** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (ChineseCLIPConfig model)
- **clap** -- [RobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (ClapConfig model)
- **clip** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (CLIPConfig model)
- **clipseg** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (CLIPSegConfig model)
- **clvp** -- [ClvpTokenizer](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpTokenizer) (ClvpConfig model)
- **codegen** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (CodeGenConfig model)
- **cohere** -- [CohereTokenizer](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereTokenizer) (CohereConfig model)
- **cohere2** -- [CohereTokenizer](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereTokenizer) (Cohere2Config model)
- **cohere_asr** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (CohereAsrConfig model)
- **colqwen2** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (ColQwen2Config model)
- **convbert** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (ConvBertConfig model)
- **cpmant** -- [CpmAntTokenizer](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntTokenizer) (CpmAntConfig model)
- **ctrl** -- [CTRLTokenizer](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLTokenizer) (CTRLConfig model)
- **data2vec-audio** -- [Wav2Vec2CTCTokenizer](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2CTCTokenizer) (Data2VecAudioConfig model)
- **data2vec-text** -- [RobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (Data2VecTextConfig model)
- **dbrx** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (DbrxConfig model)
- **deberta** -- [DebertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaTokenizer) (DebertaConfig model)
- **deberta-v2** -- [DebertaV2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Tokenizer) (DebertaV2Config model)
- **deepseek_v2** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (DeepseekV2Config model)
- **deepseek_v3** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (DeepseekV3Config model)
- **deepseek_v4** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (DeepseekV4Config model)
- **deepseek_vl** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (DeepseekVLConfig model)
- **deepseek_vl_hybrid** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (DeepseekVLHybridConfig model)
- **dia** -- [DiaTokenizer](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaTokenizer) (DiaConfig model)
- **distilbert** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (DistilBertConfig model)
- **dpr** -- [DPRQuestionEncoderTokenizer](/docs/transformers/v5.8.0/en/model_doc/dpr#transformers.DPRQuestionEncoderTokenizer) (DPRConfig model)
- **electra** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (ElectraConfig model)
- **emu3** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (Emu3Config model)
- **ernie** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (ErnieConfig model)
- **esm** -- [EsmTokenizer](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmTokenizer) (EsmConfig model)
- **falcon_mamba** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (FalconMambaConfig model)
- **fastspeech2_conformer** -- `None` (FastSpeech2ConformerConfig model)
- **flaubert** -- [FlaubertTokenizer](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertTokenizer) (FlaubertConfig model)
- **flava** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (FlavaConfig model)
- **flex_olmo** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (FlexOlmoConfig model)
- **florence2** -- [BartTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (Florence2Config model)
- **fnet** -- [FNetTokenizer](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetTokenizer) (FNetConfig model)
- **fsmt** -- [FSMTTokenizer](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTTokenizer) (FSMTConfig model)
- **funnel** -- [FunnelTokenizer](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelTokenizer) (FunnelConfig model)
- **fuyu** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (FuyuConfig model)
- **gemma** -- [GemmaTokenizer](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaTokenizer) (GemmaConfig model)
- **gemma2** -- [GemmaTokenizer](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaTokenizer) (Gemma2Config model)
- **gemma3** -- [GemmaTokenizer](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaTokenizer) (Gemma3Config model)
- **gemma3_text** -- [GemmaTokenizer](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaTokenizer) (Gemma3TextConfig model)
- **gemma3n** -- [GemmaTokenizer](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaTokenizer) (Gemma3nConfig model)
- **gemma3n_text** -- [GemmaTokenizer](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaTokenizer) (Gemma3nTextConfig model)
- **git** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (GitConfig model)
- **glm** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (GlmConfig model)
- **glm4** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (Glm4Config model)
- **glm4_moe** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (Glm4MoeConfig model)
- **glm4_moe_lite** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (Glm4MoeLiteConfig model)
- **glm4v** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (Glm4vConfig model)
- **glm4v_moe** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (Glm4vMoeConfig model)
- **glm_image** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (GlmImageConfig model)
- **glmasr** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (GlmAsrConfig model)
- **got_ocr2** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (GotOcr2Config model)
- **gpt-sw3** -- [GPTSw3Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt-sw3#transformers.GPTSw3Tokenizer) (GPT2Config model)
- **gpt2** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (GPT2Config model)
- **gpt_bigcode** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (GPTBigCodeConfig model)
- **gpt_neo** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (GPTNeoConfig model)
- **gpt_neox** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (GPTNeoXConfig model)
- **gpt_neox_japanese** -- [GPTNeoXJapaneseTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseTokenizer) (GPTNeoXJapaneseConfig model)
- **gptj** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (GPTJConfig model)
- **granite** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (GraniteConfig model)
- **granitemoe** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (GraniteMoeConfig model)
- **granitemoehybrid** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (GraniteMoeHybridConfig model)
- **granitemoeshared** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (GraniteMoeSharedConfig model)
- **grounding-dino** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (GroundingDinoConfig model)
- **groupvit** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (GroupViTConfig model)
- **hubert** -- [Wav2Vec2CTCTokenizer](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2CTCTokenizer) (HubertConfig model)
- **ibert** -- [RobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (IBertConfig model)
- **idefics** -- [LlamaTokenizer](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaTokenizer) (IdeficsConfig model)
- **idefics2** -- [LlamaTokenizer](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaTokenizer) (Idefics2Config model)
- **instructblip** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (InstructBlipConfig model)
- **instructblipvideo** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (InstructBlipVideoConfig model)
- **internvl** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (InternVLConfig model)
- **jais2** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (Jais2Config model)
- **jamba** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (JambaConfig model)
- **janus** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (JanusConfig model)
- **jina_embeddings_v3** -- [XLMRobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaTokenizer) (JinaEmbeddingsV3Config model)
- **kosmos-2** -- [XLMRobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaTokenizer) (Kosmos2Config model)
- **lasr_ctc** -- [LasrTokenizer](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrTokenizer) (LasrCTCConfig model)
- **lasr_encoder** -- [LasrTokenizer](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrTokenizer) (LasrEncoderConfig model)
- **layoutlm** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (LayoutLMConfig model)
- **layoutlmv2** -- [LayoutLMv2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Tokenizer) (LayoutLMv2Config model)
- **layoutlmv3** -- [LayoutLMv3Tokenizer](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Tokenizer) (LayoutLMv3Config model)
- **layoutxlm** -- [LayoutXLMTokenizer](/docs/transformers/v5.8.0/en/model_doc/layoutxlm#transformers.LayoutXLMTokenizer) (LayoutXLMConfig model)
- **led** -- [LEDTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (LEDConfig model)
- **lighton_ocr** -- [Qwen2TokenizerFast](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (LightOnOcrConfig model)
- **lilt** -- [RobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (LiltConfig model)
- **llava** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (LlavaConfig model)
- **llava_next** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (LlavaNextConfig model)
- **longformer** -- [RobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (LongformerConfig model)
- **luke** -- [LukeTokenizer](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeTokenizer) (LukeConfig model)
- **lxmert** -- [LxmertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (LxmertConfig model)
- **m2m_100** -- [M2M100Tokenizer](/docs/transformers/v5.8.0/en/model_doc/m2m_100#transformers.M2M100Tokenizer) (M2M100Config model)
- **mamba** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (MambaConfig model)
- **mamba2** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (Mamba2Config model)
- **marian** -- [MarianTokenizer](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianTokenizer) (MarianConfig model)
- **markuplm** -- [MarkupLMTokenizer](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMTokenizer) (MarkupLMConfig model)
- **mbart** -- [MBartTokenizer](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartTokenizer) (MBartConfig model)
- **megatron-bert** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (MegatronBertConfig model)
- **metaclip_2** -- [XLMRobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaTokenizer) (MetaClip2Config model)
- **mgp-str** -- [MgpstrTokenizer](/docs/transformers/v5.8.0/en/model_doc/mgp-str#transformers.MgpstrTokenizer) (MgpstrConfig model)
- **minicpmv4_6** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (MiniCPMV4_6Config model)
- **minimax_m2** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (MiniMaxM2Config model)
- **ministral** -- [MistralCommonBackend](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralCommonBackend) (MinistralConfig model)
- **ministral3** -- [MistralCommonBackend](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralCommonBackend) (Ministral3Config model)
- **mistral** -- [MistralCommonBackend](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralCommonBackend) (MistralConfig model)
- **mistral3** -- [MistralCommonBackend](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralCommonBackend) (Mistral3Config model)
- **mixtral** -- [MistralCommonBackend](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralCommonBackend) (MixtralConfig model)
- **mm-grounding-dino** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (MMGroundingDinoConfig model)
- **mobilebert** -- [MobileBertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (MobileBertConfig model)
- **modernbert** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (ModernBertConfig model)
- **mpnet** -- [MPNetTokenizer](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetTokenizer) (MPNetConfig model)
- **mpt** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (MptConfig model)
- **mra** -- [RobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (MraConfig model)
- **mt5** -- [T5Tokenizer](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Tokenizer) (MT5Config model)
- **musicgen** -- [T5Tokenizer](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Tokenizer) (MusicgenConfig model)
- **musicgen_melody** -- [T5Tokenizer](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Tokenizer) (MusicgenMelodyConfig model)
- **mvp** -- [MvpTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (MvpConfig model)
- **nemotron** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (NemotronConfig model)
- **nllb-moe** -- [NllbTokenizer](/docs/transformers/v5.8.0/en/model_doc/nllb#transformers.NllbTokenizer) (NllbMoeConfig model)
- **nomic_bert** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (NomicBertConfig model)
- **nougat** -- [NougatTokenizer](/docs/transformers/v5.8.0/en/model_doc/nougat#transformers.NougatTokenizer) (NougatConfig model)
- **nystromformer** -- [AlbertTokenizer](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertTokenizer) (NystromformerConfig model)
- **olmo** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (OlmoConfig model)
- **olmo2** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (Olmo2Config model)
- **olmo3** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (Olmo3Config model)
- **olmo_hybrid** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (OlmoHybridConfig model)
- **olmoe** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (OlmoeConfig model)
- **omdet-turbo** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (OmDetTurboConfig model)
- **oneformer** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (OneFormerConfig model)
- **openai-gpt** -- [OpenAIGPTTokenizer](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTTokenizer) (OpenAIGPTConfig model)
- **opt** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (OPTConfig model)
- **ovis2** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Ovis2Config model)
- **owlv2** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (Owlv2Config model)
- **owlvit** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (OwlViTConfig model)
- **pegasus** -- [PegasusTokenizer](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusTokenizer) (PegasusConfig model)
- **pegasus_x** -- [PegasusTokenizer](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusTokenizer) (PegasusXConfig model)
- **perceiver** -- [PerceiverTokenizer](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverTokenizer) (PerceiverConfig model)
- **phi** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (PhiConfig model)
- **phi3** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (Phi3Config model)
- **phimoe** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (PhimoeConfig model)
- **pix2struct** -- [T5Tokenizer](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Tokenizer) (Pix2StructConfig model)
- **pixtral** -- [MistralCommonBackend](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralCommonBackend) (PixtralVisionConfig model)
- **plbart** -- [PLBartTokenizer](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartTokenizer) (PLBartConfig model)
- **pp_formulanet** -- [NougatTokenizer](/docs/transformers/v5.8.0/en/model_doc/nougat#transformers.NougatTokenizer) (PPFormulaNetConfig model)
- **prophetnet** -- [ProphetNetTokenizer](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetTokenizer) (ProphetNetConfig model)
- **qianfan_ocr** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (QianfanOCRConfig model)
- **qwen2** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen2Config model)
- **qwen2_5_omni** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen2_5OmniConfig model)
- **qwen2_5_vl** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen2_5_VLConfig model)
- **qwen2_audio** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen2AudioConfig model)
- **qwen2_moe** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen2MoeConfig model)
- **qwen2_vl** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen2VLConfig model)
- **qwen3** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen3Config model)
- **qwen3_5** -- [Qwen3_5Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Tokenizer) (Qwen3_5Config model)
- **qwen3_5_moe** -- [Qwen3_5Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Tokenizer) (Qwen3_5MoeConfig model)
- **qwen3_moe** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen3MoeConfig model)
- **qwen3_next** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen3NextConfig model)
- **qwen3_omni_moe** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen3OmniMoeConfig model)
- **qwen3_vl** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen3VLConfig model)
- **qwen3_vl_moe** -- [Qwen2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Tokenizer) (Qwen3VLMoeConfig model)
- **rag** -- [RagTokenizer](/docs/transformers/v5.8.0/en/model_doc/rag#transformers.RagTokenizer) (RagConfig model)
- **recurrent_gemma** -- [GemmaTokenizer](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaTokenizer) (RecurrentGemmaConfig model)
- **reformer** -- [ReformerTokenizer](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerTokenizer) (ReformerConfig model)
- **rembert** -- [RemBertTokenizer](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertTokenizer) (RemBertConfig model)
- **roberta** -- [RobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (RobertaConfig model)
- **roberta-prelayernorm** -- [RobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.RobertaTokenizer) (RobertaPreLayerNormConfig model)
- **roc_bert** -- [RoCBertTokenizer](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertTokenizer) (RoCBertConfig model)
- **roformer** -- [RoFormerTokenizer](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerTokenizer) (RoFormerConfig model)
- **rwkv** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (RwkvConfig model)
- **sam3** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (Sam3Config model)
- **sam3_video** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (Sam3VideoConfig model)
- **seamless_m4t** -- [SeamlessM4TTokenizer](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TTokenizer) (SeamlessM4TConfig model)
- **seamless_m4t_v2** -- [SeamlessM4TTokenizer](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TTokenizer) (SeamlessM4Tv2Config model)
- **shieldgemma2** -- [GemmaTokenizer](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaTokenizer) (ShieldGemma2Config model)
- **siglip** -- [SiglipTokenizer](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipTokenizer) (SiglipConfig model)
- **siglip2** -- [Siglip2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Tokenizer) (Siglip2Config model)
- **speech_to_text** -- [Speech2TextTokenizer](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextTokenizer) (Speech2TextConfig model)
- **speecht5** -- [SpeechT5Tokenizer](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5Tokenizer) (SpeechT5Config model)
- **splinter** -- [SplinterTokenizer](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterTokenizer) (SplinterConfig model)
- **squeezebert** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (SqueezeBertConfig model)
- **stablelm** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (StableLmConfig model)
- **starcoder2** -- [GPT2Tokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Tokenizer) (Starcoder2Config model)
- **switch_transformers** -- [T5Tokenizer](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Tokenizer) (SwitchTransformersConfig model)
- **t5** -- [T5Tokenizer](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Tokenizer) (T5Config model)
- **t5gemma** -- [GemmaTokenizer](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaTokenizer) (T5GemmaConfig model)
- **tapas** -- [TapasTokenizer](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasTokenizer) (TapasConfig model)
- **trocr** -- [XLMRobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaTokenizer) (TrOCRConfig model)
- **tvp** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (TvpConfig model)
- **udop** -- [UdopTokenizer](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopTokenizer) (UdopConfig model)
- **umt5** -- [T5Tokenizer](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Tokenizer) (UMT5Config model)
- **unispeech** -- [Wav2Vec2CTCTokenizer](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2CTCTokenizer) (UniSpeechConfig model)
- **unispeech-sat** -- [Wav2Vec2CTCTokenizer](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2CTCTokenizer) (UniSpeechSatConfig model)
- **vilt** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (ViltConfig model)
- **vipllava** -- [TokenizersBackend](/docs/transformers/v5.8.0/en/main_classes/tokenizer#transformers.TokenizersBackend) (VipLlavaConfig model)
- **visual_bert** -- [BertTokenizer](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.BertTokenizer) (VisualBertConfig model)
- **vits** -- [VitsTokenizer](/docs/transformers/v5.8.0/en/model_doc/vits#transformers.VitsTokenizer) (VitsConfig model)
- **voxtral** -- [MistralCommonBackend](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralCommonBackend) (VoxtralConfig model)
- **voxtral_realtime** -- [MistralCommonBackend](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralCommonBackend) (VoxtralRealtimeConfig model)
- **wav2vec2** -- [Wav2Vec2CTCTokenizer](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2CTCTokenizer) (Wav2Vec2Config model)
- **wav2vec2-bert** -- [Wav2Vec2CTCTokenizer](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2CTCTokenizer) (Wav2Vec2BertConfig model)
- **wav2vec2-conformer** -- [Wav2Vec2CTCTokenizer](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2CTCTokenizer) (Wav2Vec2ConformerConfig model)
- **whisper** -- [WhisperTokenizer](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperTokenizer) (WhisperConfig model)
- **xclip** -- [CLIPTokenizer](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTokenizer) (XCLIPConfig model)
- **xglm** -- [XGLMTokenizer](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMTokenizer) (XGLMConfig model)
- **xlm** -- [XLMTokenizer](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMTokenizer) (XLMConfig model)
- **xlm-roberta** -- [XLMRobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaTokenizer) (XLMRobertaConfig model)
- **xlm-roberta-xl** -- [XLMRobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaTokenizer) (XLMRobertaXLConfig model)
- **xlnet** -- [XLNetTokenizer](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetTokenizer) (XLNetConfig model)
- **xlstm** -- [GPTNeoXTokenizer](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXTokenizer) (xLSTMConfig model)
- **xmod** -- [XLMRobertaTokenizer](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaTokenizer) (XmodConfig model)
- **yoso** -- [AlbertTokenizer](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertTokenizer) (YosoConfig model)

Examples:

```python
>>> from transformers import AutoTokenizer

>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-cased")

>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using *save_pretrained('./test/saved_model/')*)
>>> # tokenizer = AutoTokenizer.from_pretrained("./test/bert_saved_model/")

>>> # Download vocabulary from huggingface.co and define model-specific arguments
>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base", add_prefix_space=True)

>>> # Explicitly use the tokenizers backend
>>> tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer", backend="tokenizers")

>>> # Explicitly use the sentencepiece backend
>>> tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer", backend="sentencepiece")
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a predefined tokenizer hosted inside a model repo on huggingface.co. - A path to a *directory* containing vocabulary files required by the tokenizer, for instance saved using the [save_pretrained()](/docs/transformers/v5.8.0/en/internal/tokenization_utils#transformers.PreTrainedTokenizerBase.save_pretrained) method, e.g., `./my_model_directory/`. - a path to a single saved vocabulary file if and only if the tokenizer only requires a single vocabulary file (like Bert or XLNet), e.g.: `./my_model_directory/vocab.txt`. (Not applicable to all derived classes)

inputs (additional positional arguments, *optional*) : Will be passed along to the Tokenizer `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : The configuration object used to determine the tokenizer class to instantiate.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force (re-)downloading the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

subfolder (`str`, *optional*) : In case the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g. for facebook/rag-token-base), specify it here.

tokenizer_type (`str`, *optional*) : Tokenizer type to be loaded.

backend (`str`, *optional*, defaults to `"tokenizers"`) : Backend to use for tokenization. Valid options are: - `"tokenizers"`: Use the HuggingFace tokenizers library backend (default) - `"sentencepiece"`: Use the SentencePiece backend

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (additional keyword arguments, *optional*) : Will be passed to the Tokenizer `__init__()` method. Can be used to set special tokens like `bos_token`, `eos_token`, `unk_token`, `sep_token`, `pad_token`, `cls_token`, `mask_token`, `additional_special_tokens`. See parameters in the `__init__()` for more details.
#### register[[transformers.AutoTokenizer.register]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/tokenization_auto.py#L858)

Register a new tokenizer in this mapping.

**Parameters:**

config_class ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The configuration corresponding to the model to register.

tokenizer_class : The tokenizer class to register (preferred parameter in v5).

slow_tokenizer_class : (Deprecated) The slow tokenizer to register.

fast_tokenizer_class : (Deprecated) The fast tokenizer to register.

## AutoFeatureExtractor[[transformers.AutoFeatureExtractor]]

#### transformers.AutoFeatureExtractor[[transformers.AutoFeatureExtractor]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/feature_extraction_auto.py#L232)

This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the
library when created with the [AutoFeatureExtractor.from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoFeatureExtractor.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoFeatureExtractor.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/feature_extraction_auto.py#L246)

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  This can be either:

  - a string, the *model id* of a pretrained feature_extractor hosted inside a model repo on
    huggingface.co.
  - a path to a *directory* containing a feature extractor file saved using the
    [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/feature_extractor#transformers.FeatureExtractionMixin.save_pretrained) method, e.g.,
    `./my_model_directory/`.
  - a path to a saved feature extractor JSON *file*, e.g.,
    `./my_model_directory/preprocessor_config.json`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model feature extractor should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force (re-)downloading the feature extractor files and override the cached versions
  if they exist.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **token** (`str` or *bool*, *optional*) --
  The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
  when running `hf auth login` (stored in `~/.huggingface`).
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final feature extractor object. If `True`, then this
  function returns a `Tuple(feature_extractor, unused_kwargs)` where *unused_kwargs* is a dictionary
  consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of
  `kwargs` which has not been used to update `feature_extractor` and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (`dict[str, Any]`, *optional*) --
  The values in kwargs of any keys which are feature extractor attributes will be used to override the
  loaded values. Behavior concerning key/value pairs whose keys are *not* feature extractor attributes is
  controlled by the `return_unused_kwargs` keyword parameter.

Instantiate one of the feature extractor classes of the library from a pretrained model vocabulary.

The feature extractor class to instantiate is selected based on the `model_type` property of the config object
(either passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's
missing, by falling back to using pattern matching on `pretrained_model_name_or_path`:

- **audio-spectrogram-transformer** -- [ASTFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/audio-spectrogram-transformer#transformers.ASTFeatureExtractor) (ASTConfig model)
- **audioflamingo3** -- [WhisperFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperFeatureExtractor) (AudioFlamingo3Config model)
- **clap** -- [ClapFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/clap#transformers.ClapFeatureExtractor) (ClapConfig model)
- **clvp** -- [ClvpFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpFeatureExtractor) (ClvpConfig model)
- **cohere_asr** -- [CohereAsrFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrFeatureExtractor) (CohereAsrConfig model)
- **csm** -- [EncodecFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecFeatureExtractor) (CsmConfig model)
- **dac** -- [DacFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacFeatureExtractor) (DacConfig model)
- **data2vec-audio** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (Data2VecAudioConfig model)
- **dia** -- [DiaFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaFeatureExtractor) (DiaConfig model)
- **encodec** -- [EncodecFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecFeatureExtractor) (EncodecConfig model)
- **gemma3n** -- [Gemma3nAudioFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nAudioFeatureExtractor) (Gemma3nConfig model)
- **gemma4** -- [Gemma4AudioFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4AudioFeatureExtractor) (Gemma4Config model)
- **glmasr** -- [WhisperFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperFeatureExtractor) (GlmAsrConfig model)
- **granite_speech** -- [GraniteSpeechFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechFeatureExtractor) (GraniteSpeechConfig model)
- **granite_speech_plus** -- [GraniteSpeechFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechFeatureExtractor) (GraniteSpeechPlusConfig model)
- **higgs_audio_v2_tokenizer** -- [DacFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacFeatureExtractor) (HiggsAudioV2TokenizerConfig model)
- **hubert** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (HubertConfig model)
- **kyutai_speech_to_text** -- [KyutaiSpeechToTextFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextFeatureExtractor) (KyutaiSpeechToTextConfig model)
- **lasr_ctc** -- [LasrFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrFeatureExtractor) (LasrCTCConfig model)
- **lasr_encoder** -- [LasrFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrFeatureExtractor) (LasrEncoderConfig model)
- **markuplm** -- [MarkupLMFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMFeatureExtractor) (MarkupLMConfig model)
- **mimi** -- [EncodecFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecFeatureExtractor) (MimiConfig model)
- **moonshine** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (MoonshineConfig model)
- **moshi** -- [EncodecFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecFeatureExtractor) (MoshiConfig model)
- **musicgen** -- [EncodecFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecFeatureExtractor) (MusicgenConfig model)
- **musicgen_melody** -- [MusicgenMelodyFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyFeatureExtractor) (MusicgenMelodyConfig model)
- **parakeet_ctc** -- [ParakeetFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetFeatureExtractor) (ParakeetCTCConfig model)
- **parakeet_encoder** -- [ParakeetFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetFeatureExtractor) (ParakeetEncoderConfig model)
- **pe_audio** -- [PeAudioFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioFeatureExtractor) (PeAudioConfig model)
- **pe_audio_video** -- [PeAudioFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioFeatureExtractor) (PeAudioVideoConfig model)
- **phi4_multimodal** -- [Phi4MultimodalFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalFeatureExtractor) (Phi4MultimodalConfig model)
- **pop2piano** -- [Pop2PianoFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/pop2piano#transformers.Pop2PianoFeatureExtractor) (Pop2PianoConfig model)
- **qwen2_5_omni** -- [WhisperFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperFeatureExtractor) (Qwen2_5OmniConfig model)
- **qwen2_audio** -- [WhisperFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperFeatureExtractor) (Qwen2AudioConfig model)
- **qwen3_omni_moe** -- [WhisperFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperFeatureExtractor) (Qwen3OmniMoeConfig model)
- **seamless_m4t** -- [SeamlessM4TFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TFeatureExtractor) (SeamlessM4TConfig model)
- **seamless_m4t_v2** -- [SeamlessM4TFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TFeatureExtractor) (SeamlessM4Tv2Config model)
- **sew** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (SEWConfig model)
- **sew-d** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (SEWDConfig model)
- **speech_to_text** -- [Speech2TextFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextFeatureExtractor) (Speech2TextConfig model)
- **speecht5** -- [SpeechT5FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5FeatureExtractor) (SpeechT5Config model)
- **unispeech** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (UniSpeechConfig model)
- **unispeech-sat** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (UniSpeechSatConfig model)
- **univnet** -- [UnivNetFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/univnet#transformers.UnivNetFeatureExtractor) (UnivNetConfig model)
- **vibevoice_acoustic_tokenizer** -- [VibeVoiceAcousticTokenizerFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerFeatureExtractor) (VibeVoiceAcousticTokenizerConfig model)
- **vibevoice_asr** -- [VibeVoiceAcousticTokenizerFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerFeatureExtractor) (VibeVoiceAsrConfig model)
- **voxtral** -- [WhisperFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperFeatureExtractor) (VoxtralConfig model)
- **voxtral_realtime** -- [VoxtralRealtimeFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeFeatureExtractor) (VoxtralRealtimeConfig model)
- **wav2vec2** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (Wav2Vec2Config model)
- **wav2vec2-bert** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (Wav2Vec2BertConfig model)
- **wav2vec2-conformer** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (Wav2Vec2ConformerConfig model)
- **wavlm** -- [Wav2Vec2FeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2FeatureExtractor) (WavLMConfig model)
- **whisper** -- [WhisperFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperFeatureExtractor) (WhisperConfig model)
- **xcodec** -- [DacFeatureExtractor](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacFeatureExtractor) (XcodecConfig model)

Passing `token=True` is required when you want to use a private model.

Examples:

```python
>>> from transformers import AutoFeatureExtractor

>>> # Download feature extractor from huggingface.co and cache.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If feature extractor files are in a directory (e.g. feature extractor was saved using *save_pretrained('./test/saved_model/')*)
>>> # feature_extractor = AutoFeatureExtractor.from_pretrained("./test/saved_model/")
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : This can be either:  - a string, the *model id* of a pretrained feature_extractor hosted inside a model repo on huggingface.co. - a path to a *directory* containing a feature extractor file saved using the [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/feature_extractor#transformers.FeatureExtractionMixin.save_pretrained) method, e.g., `./my_model_directory/`. - a path to a saved feature extractor JSON *file*, e.g., `./my_model_directory/preprocessor_config.json`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force (re-)downloading the feature extractor files and override the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

token (`str` or *bool*, *optional*) : The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `hf auth login` (stored in `~/.huggingface`).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, then this function returns just the final feature extractor object. If `True`, then this function returns a `Tuple(feature_extractor, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of `kwargs` which has not been used to update `feature_extractor` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (`dict[str, Any]`, *optional*) : The values in kwargs of any keys which are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* feature extractor attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoFeatureExtractor.register]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/feature_extraction_auto.py#L374)

Register a new feature extractor for this class.

**Parameters:**

config_class ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The configuration corresponding to the model to register.

feature_extractor_class (`FeatureExtractionMixin`) : The feature extractor class to register.
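A minimal sketch of registering a custom feature extractor, following the same pattern as the other auto classes. The `NewAudio*` names are hypothetical; the example subclasses `Wav2Vec2FeatureExtractor` so the registered class stays functional, and the `try`/`except` import only guards against the pre-v5 `PretrainedConfig` spelling:

```python
from transformers import AutoConfig, AutoFeatureExtractor, Wav2Vec2FeatureExtractor

try:  # v5 spelling
    from transformers import PreTrainedConfig
except ImportError:  # pre-v5 spelling
    from transformers import PretrainedConfig as PreTrainedConfig


class NewAudioConfig(PreTrainedConfig):
    # Must match the key passed to AutoConfig.register below.
    model_type = "new-audio"


class NewAudioFeatureExtractor(Wav2Vec2FeatureExtractor):
    # Hypothetical extractor inheriting Wav2Vec2's raw-waveform preprocessing.
    pass


AutoConfig.register("new-audio", NewAudioConfig)
AutoFeatureExtractor.register(NewAudioConfig, NewAudioFeatureExtractor)
```

Once registered, `AutoFeatureExtractor.from_pretrained` resolves any checkpoint whose config has `"model_type": "new-audio"` to `NewAudioFeatureExtractor`.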

## AutoImageProcessor[[transformers.AutoImageProcessor]]

#### transformers.AutoImageProcessor[[transformers.AutoImageProcessor]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/image_processing_auto.py#L445)

This is a generic image processor class that will be instantiated as one of the image processor classes of the
library when created with the [AutoImageProcessor.from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoImageProcessor.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoImageProcessor.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/image_processing_auto.py#L459)

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  This can be either:

  - a string, the *model id* of a pretrained image_processor hosted inside a model repo on
    huggingface.co.
  - a path to a *directory* containing an image processor file saved using the
    [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/image_processor#transformers.ImageProcessingMixin.save_pretrained) method, e.g.,
    `./my_model_directory/`.
  - a path to a saved image processor JSON *file*, e.g.,
    `./my_model_directory/preprocessor_config.json`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model image processor should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force (re-)downloading the image processor files and override the cached versions if
  they exist.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.
- **token** (`str` or *bool*, *optional*) --
  The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
  when running `hf auth login` (stored in `~/.huggingface`).
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **use_fast** (`bool`, *optional*, defaults to `False`) --
  **Deprecated**: Use `backend="torchvision"` instead. This parameter is kept for backward compatibility.
  Use a fast torchvision-based image processor if it is supported for a given model.
  If a fast image processor is not available for a given model, a normal numpy-based image processor
  is returned instead.
- **backend** (`str`, *optional*, defaults to `None`) --
  The backend to use for image processing. Can be:
  - `None`: Automatically select the best available backend (torchvision if available, otherwise pil)
  - `"torchvision"`: Use Torchvision backend (GPU-accelerated, faster)
  - `"pil"`: Use PIL backend (portable, CPU-only)
  - Any custom backend name registered via `register()` method
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final image processor object. If `True`, then this
  function returns a `Tuple(image_processor, unused_kwargs)` where *unused_kwargs* is a dictionary
  consisting of the key/value pairs whose keys are not image processor attributes: i.e., the part of
  `kwargs` which has not been used to update `image_processor` and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **image_processor_filename** (`str`, *optional*, defaults to `"config.json"`) --
  The name of the file in the model directory to use for the image processor config.
- **kwargs** (`dict[str, Any]`, *optional*) --
  The values in kwargs of any keys which are image processor attributes will be used to override the
  loaded values. Behavior concerning key/value pairs whose keys are *not* image processor attributes is
  controlled by the `return_unused_kwargs` keyword parameter.

Instantiate one of the image processor classes of the library from a pretrained model.

The image processor class to instantiate is selected based on the `model_type` property of the config object
(either passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's
missing, by falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aimv2** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (Aimv2Config model)
- **aimv2_vision_model** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (Aimv2VisionConfig model)
- **align** -- `{'torchvision': 'EfficientNetImageProcessor', 'pil': 'EfficientNetImageProcessorPil'}` (AlignConfig model)
- **altclip** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (AltCLIPConfig model)
- **aria** -- `{'pil': 'AriaImageProcessorPil', 'torchvision': 'AriaImageProcessor'}` (AriaConfig model)
- **aya_vision** -- `{'torchvision': 'GotOcr2ImageProcessor', 'pil': 'GotOcr2ImageProcessorPil'}` (AyaVisionConfig model)
- **beit** -- `{'pil': 'BeitImageProcessorPil', 'torchvision': 'BeitImageProcessor'}` (BeitConfig model)
- **bit** -- `{'pil': 'BitImageProcessorPil', 'torchvision': 'BitImageProcessor'}` (BitConfig model)
- **blip** -- `{'pil': 'BlipImageProcessorPil', 'torchvision': 'BlipImageProcessor'}` (BlipConfig model)
- **blip-2** -- `{'torchvision': 'BlipImageProcessor', 'pil': 'BlipImageProcessorPil'}` (Blip2Config model)
- **bridgetower** -- `{'pil': 'BridgeTowerImageProcessorPil', 'torchvision': 'BridgeTowerImageProcessor'}` (BridgeTowerConfig model)
- **chameleon** -- `{'pil': 'ChameleonImageProcessorPil', 'torchvision': 'ChameleonImageProcessor'}` (ChameleonConfig model)
- **chinese_clip** -- `{'pil': 'ChineseCLIPImageProcessorPil', 'torchvision': 'ChineseCLIPImageProcessor'}` (ChineseCLIPConfig model)
- **chmv2** -- `{'torchvision': 'CHMv2ImageProcessor'}` (CHMv2Config model)
- **clip** -- `{'pil': 'CLIPImageProcessorPil', 'torchvision': 'CLIPImageProcessor'}` (CLIPConfig model)
- **clipseg** -- `{'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'}` (CLIPSegConfig model)
- **cohere2_vision** -- `{'torchvision': 'Cohere2VisionImageProcessor'}` (Cohere2VisionConfig model)
- **colpali** -- `{'torchvision': 'SiglipImageProcessor', 'pil': 'SiglipImageProcessorPil'}` (ColPaliConfig model)
- **colqwen2** -- `{'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'}` (ColQwen2Config model)
- **conditional_detr** -- `{'pil': 'ConditionalDetrImageProcessorPil', 'torchvision': 'ConditionalDetrImageProcessor'}` (ConditionalDetrConfig model)
- **convnext** -- `{'pil': 'ConvNextImageProcessorPil', 'torchvision': 'ConvNextImageProcessor'}` (ConvNextConfig model)
- **convnextv2** -- `{'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'}` (ConvNextV2Config model)
- **cvt** -- `{'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'}` (CvtConfig model)
- **data2vec-vision** -- `{'torchvision': 'BeitImageProcessor', 'pil': 'BeitImageProcessorPil'}` (Data2VecVisionConfig model)
- **deepseek_vl** -- `{'pil': 'DeepseekVLImageProcessorPil', 'torchvision': 'DeepseekVLImageProcessor'}` (DeepseekVLConfig model)
- **deepseek_vl_hybrid** -- `{'pil': 'DeepseekVLHybridImageProcessorPil', 'torchvision': 'DeepseekVLHybridImageProcessor'}` (DeepseekVLHybridConfig model)
- **deformable_detr** -- `{'pil': 'DeformableDetrImageProcessorPil', 'torchvision': 'DeformableDetrImageProcessor'}` (DeformableDetrConfig model)
- **deimv2** -- `{'torchvision': 'RTDetrImageProcessor', 'pil': 'RTDetrImageProcessorPil'}` (Deimv2Config model)
- **deit** -- `{'pil': 'DeiTImageProcessorPil', 'torchvision': 'DeiTImageProcessor'}` (DeiTConfig model)
- **depth_anything** -- `{'torchvision': 'DPTImageProcessor', 'pil': 'DPTImageProcessorPil'}` (DepthAnythingConfig model)
- **depth_pro** -- `{'torchvision': 'DepthProImageProcessor'}` (DepthProConfig model)
- **detr** -- `{'pil': 'DetrImageProcessorPil', 'torchvision': 'DetrImageProcessor'}` (DetrConfig model)
- **dinat** -- `{'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'}` (DinatConfig model)
- **dinov2** -- `{'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'}` (Dinov2Config model)
- **dinov3_vit** -- `{'torchvision': 'DINOv3ViTImageProcessor'}` (DINOv3ViTConfig model)
- **donut-swin** -- `{'torchvision': 'DonutImageProcessor', 'pil': 'DonutImageProcessorPil'}` (DonutSwinConfig model)
- **dpt** -- `{'pil': 'DPTImageProcessorPil', 'torchvision': 'DPTImageProcessor'}` (DPTConfig model)
- **edgetam** -- `{'torchvision': 'Sam2ImageProcessor'}` (EdgeTamConfig model)
- **efficientloftr** -- `{'pil': 'EfficientLoFTRImageProcessorPil', 'torchvision': 'EfficientLoFTRImageProcessor'}` (EfficientLoFTRConfig model)
- **efficientnet** -- `{'pil': 'EfficientNetImageProcessorPil', 'torchvision': 'EfficientNetImageProcessor'}` (EfficientNetConfig model)
- **emu3** -- `{'pil': 'Emu3ImageProcessor'}` (Emu3Config model)
- **eomt** -- `{'pil': 'EomtImageProcessorPil', 'torchvision': 'EomtImageProcessor'}` (EomtConfig model)
- **eomt_dinov3** -- `{'torchvision': 'EomtImageProcessor', 'pil': 'EomtImageProcessorPil'}` (EomtDinov3Config model)
- **ernie4_5_vl_moe** -- `{'pil': 'Ernie4_5_VLMoeImageProcessorPil', 'torchvision': 'Ernie4_5_VLMoeImageProcessor'}` (Ernie4_5_VLMoeConfig model)
- **exaone4_5** -- `{'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'}` (Exaone4_5_Config model)
- **flava** -- `{'pil': 'FlavaImageProcessorPil', 'torchvision': 'FlavaImageProcessor'}` (FlavaConfig model)
- **florence2** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (Florence2Config model)
- **focalnet** -- `{'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'}` (FocalNetConfig model)
- **fuyu** -- `{'pil': 'FuyuImageProcessorPil', 'torchvision': 'FuyuImageProcessor'}` (FuyuConfig model)
- **gemma3** -- `{'pil': 'Gemma3ImageProcessorPil', 'torchvision': 'Gemma3ImageProcessor'}` (Gemma3Config model)
- **gemma3n** -- `{'torchvision': 'SiglipImageProcessor', 'pil': 'SiglipImageProcessorPil'}` (Gemma3nConfig model)
- **gemma4** -- `{'pil': 'Gemma4ImageProcessorPil', 'torchvision': 'Gemma4ImageProcessor'}` (Gemma4Config model)
- **git** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (GitConfig model)
- **glm46v** -- `{'pil': 'Glm46VImageProcessorPil', 'torchvision': 'Glm46VImageProcessor'}` (Glm46VConfig model)
- **glm4v** -- `{'pil': 'Glm4vImageProcessorPil', 'torchvision': 'Glm4vImageProcessor'}` (Glm4vConfig model)
- **glm_image** -- `{'pil': 'GlmImageImageProcessorPil', 'torchvision': 'GlmImageImageProcessor'}` (GlmImageConfig model)
- **glpn** -- `{'pil': 'GLPNImageProcessorPil', 'torchvision': 'GLPNImageProcessor'}` (GLPNConfig model)
- **got_ocr2** -- `{'pil': 'GotOcr2ImageProcessorPil', 'torchvision': 'GotOcr2ImageProcessor'}` (GotOcr2Config model)
- **granite4_vision** -- `{'torchvision': 'LlavaNextImageProcessor', 'pil': 'LlavaNextImageProcessorPil'}` (Granite4VisionConfig model)
- **grounding-dino** -- `{'pil': 'GroundingDinoImageProcessorPil', 'torchvision': 'GroundingDinoImageProcessor'}` (GroundingDinoConfig model)
- **groupvit** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (GroupViTConfig model)
- **hiera** -- `{'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'}` (HieraConfig model)
- **idefics** -- `{'pil': 'IdeficsImageProcessorPil', 'torchvision': 'IdeficsImageProcessor'}` (IdeficsConfig model)
- **idefics2** -- `{'pil': 'Idefics2ImageProcessorPil', 'torchvision': 'Idefics2ImageProcessor'}` (Idefics2Config model)
- **idefics3** -- `{'pil': 'Idefics3ImageProcessorPil', 'torchvision': 'Idefics3ImageProcessor'}` (Idefics3Config model)
- **ijepa** -- `{'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'}` (IJepaConfig model)
- **imagegpt** -- `{'pil': 'ImageGPTImageProcessorPil', 'torchvision': 'ImageGPTImageProcessor'}` (ImageGPTConfig model)
- **instructblip** -- `{'torchvision': 'BlipImageProcessor', 'pil': 'BlipImageProcessorPil'}` (InstructBlipConfig model)
- **internvl** -- `{'torchvision': 'GotOcr2ImageProcessor', 'pil': 'GotOcr2ImageProcessorPil'}` (InternVLConfig model)
- **janus** -- `{'pil': 'JanusImageProcessorPil', 'torchvision': 'JanusImageProcessor'}` (JanusConfig model)
- **kosmos-2** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (Kosmos2Config model)
- **kosmos-2.5** -- `{'torchvision': 'Kosmos2_5ImageProcessor', 'pil': 'Kosmos2_5ImageProcessorPil'}` (Kosmos2_5Config model)
- **layoutlmv2** -- `{'pil': 'LayoutLMv2ImageProcessorPil', 'torchvision': 'LayoutLMv2ImageProcessor'}` (LayoutLMv2Config model)
- **layoutlmv3** -- `{'pil': 'LayoutLMv3ImageProcessorPil', 'torchvision': 'LayoutLMv3ImageProcessor'}` (LayoutLMv3Config model)
- **layoutxlm** -- `{'torchvision': 'LayoutLMv2ImageProcessor', 'pil': 'LayoutLMv2ImageProcessorPil'}` (LayoutXLMConfig model)
- **levit** -- `{'pil': 'LevitImageProcessorPil', 'torchvision': 'LevitImageProcessor'}` (LevitConfig model)
- **lfm2_vl** -- `{'torchvision': 'Lfm2VlImageProcessor'}` (Lfm2VlConfig model)
- **lightglue** -- `{'pil': 'LightGlueImageProcessorPil', 'torchvision': 'LightGlueImageProcessor'}` (LightGlueConfig model)
- **lighton_ocr** -- `{'torchvision': 'PixtralImageProcessor', 'pil': 'PixtralImageProcessorPil'}` (LightOnOcrConfig model)
- **llama4** -- `{'torchvision': 'Llama4ImageProcessor'}` (Llama4Config model)
- **llava** -- `{'pil': 'LlavaImageProcessorPil', 'torchvision': 'LlavaImageProcessor'}` (LlavaConfig model)
- **llava_next** -- `{'pil': 'LlavaNextImageProcessorPil', 'torchvision': 'LlavaNextImageProcessor'}` (LlavaNextConfig model)
- **llava_next_video** -- `{'torchvision': 'LlavaNextImageProcessor', 'pil': 'LlavaNextImageProcessorPil'}` (LlavaNextVideoConfig model)
- **llava_onevision** -- `{'pil': 'LlavaOnevisionImageProcessorPil', 'torchvision': 'LlavaOnevisionImageProcessor'}` (LlavaOnevisionConfig model)
- **lw_detr** -- `{'torchvision': 'DeformableDetrImageProcessor', 'pil': 'DeformableDetrImageProcessorPil'}` (LwDetrConfig model)
- **mask2former** -- `{'pil': 'Mask2FormerImageProcessorPil', 'torchvision': 'Mask2FormerImageProcessor'}` (Mask2FormerConfig model)
- **maskformer** -- `{'pil': 'MaskFormerImageProcessorPil', 'torchvision': 'MaskFormerImageProcessor'}` (MaskFormerConfig model)
- **metaclip_2** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (MetaClip2Config model)
- **mgp-str** -- `{'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'}` (MgpstrConfig model)
- **minicpmv4_6** -- `{'pil': 'MiniCPMV4_6ImageProcessorPil', 'torchvision': 'MiniCPMV4_6ImageProcessor'}` (MiniCPMV4_6Config model)
- **mistral3** -- `{'torchvision': 'PixtralImageProcessor', 'pil': 'PixtralImageProcessorPil'}` (Mistral3Config model)
- **mlcd** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (MLCDVisionConfig model)
- **mllama** -- `{'pil': 'MllamaImageProcessorPil', 'torchvision': 'MllamaImageProcessor'}` (MllamaConfig model)
- **mm-grounding-dino** -- `{'torchvision': 'GroundingDinoImageProcessor', 'pil': 'GroundingDinoImageProcessorPil'}` (MMGroundingDinoConfig model)
- **mobilenet_v1** -- `{'pil': 'MobileNetV1ImageProcessorPil', 'torchvision': 'MobileNetV1ImageProcessor'}` (MobileNetV1Config model)
- **mobilenet_v2** -- `{'pil': 'MobileNetV2ImageProcessorPil', 'torchvision': 'MobileNetV2ImageProcessor'}` (MobileNetV2Config model)
- **mobilevit** -- `{'pil': 'MobileViTImageProcessorPil', 'torchvision': 'MobileViTImageProcessor'}` (MobileViTConfig model)
- **mobilevitv2** -- `{'torchvision': 'MobileViTImageProcessor', 'pil': 'MobileViTImageProcessorPil'}` (MobileViTV2Config model)
- **nougat** -- `{'pil': 'NougatImageProcessorPil', 'torchvision': 'NougatImageProcessor'}` (NougatConfig model)
- **omdet-turbo** -- `{'torchvision': 'DetrImageProcessor', 'pil': 'DetrImageProcessorPil'}` (OmDetTurboConfig model)
- **oneformer** -- `{'pil': 'OneFormerImageProcessorPil', 'torchvision': 'OneFormerImageProcessor'}` (OneFormerConfig model)
- **ovis2** -- `{'pil': 'Ovis2ImageProcessorPil', 'torchvision': 'Ovis2ImageProcessor'}` (Ovis2Config model)
- **owlv2** -- `{'pil': 'Owlv2ImageProcessorPil', 'torchvision': 'Owlv2ImageProcessor'}` (Owlv2Config model)
- **owlvit** -- `{'pil': 'OwlViTImageProcessorPil', 'torchvision': 'OwlViTImageProcessor'}` (OwlViTConfig model)
- **paddleocr_vl** -- `{'pil': 'PaddleOCRVLImageProcessorPil', 'torchvision': 'PaddleOCRVLImageProcessor'}` (PaddleOCRVLConfig model)
- **paligemma** -- `{'torchvision': 'SiglipImageProcessor', 'pil': 'SiglipImageProcessorPil'}` (PaliGemmaConfig model)
- **perceiver** -- `{'pil': 'PerceiverImageProcessorPil', 'torchvision': 'PerceiverImageProcessor'}` (PerceiverConfig model)
- **perception_lm** -- `{'torchvision': 'PerceptionLMImageProcessor'}` (PerceptionLMConfig model)
- **phi4_multimodal** -- `{'torchvision': 'Phi4MultimodalImageProcessor'}` (Phi4MultimodalConfig model)
- **pi0** -- `{'torchvision': 'PI0ImageProcessor'}` (PI0Config model)
- **pix2struct** -- `{'pil': 'Pix2StructImageProcessorPil', 'torchvision': 'Pix2StructImageProcessor'}` (Pix2StructConfig model)
- **pixio** -- `{'torchvision': 'BitImageProcessor', 'pil': 'BitImageProcessorPil'}` (PixioConfig model)
- **pixtral** -- `{'pil': 'PixtralImageProcessorPil', 'torchvision': 'PixtralImageProcessor'}` (PixtralVisionConfig model)
- **poolformer** -- `{'pil': 'PoolFormerImageProcessorPil', 'torchvision': 'PoolFormerImageProcessor'}` (PoolFormerConfig model)
- **pp_chart2table** -- `{'pil': 'PPChart2TableImageProcessorPil', 'torchvision': 'PPChart2TableImageProcessor'}` (PPChart2TableConfig model)
- **pp_doclayout_v2** -- `{'torchvision': 'PPDocLayoutV2ImageProcessor'}` (PPDocLayoutV2Config model)
- **pp_doclayout_v3** -- `{'torchvision': 'PPDocLayoutV3ImageProcessor'}` (PPDocLayoutV3Config model)
- **pp_formulanet** -- `{'torchvision': 'PPFormulaNetImageProcessor'}` (PPFormulaNetConfig model)
- **pp_lcnet** -- `{'torchvision': 'PPLCNetImageProcessor'}` (PPLCNetConfig model)
- **pp_ocrv5_mobile_det** -- `{'torchvision': 'PPOCRV5ServerDetImageProcessor'}` (PPOCRV5MobileDetConfig model)
- **pp_ocrv5_mobile_rec** -- `{'torchvision': 'PPOCRV5ServerRecImageProcessor'}` (PPOCRV5MobileRecConfig model)
- **pp_ocrv5_server_det** -- `{'torchvision': 'PPOCRV5ServerDetImageProcessor'}` (PPOCRV5ServerDetConfig model)
- **pp_ocrv5_server_rec** -- `{'torchvision': 'PPOCRV5ServerRecImageProcessor'}` (PPOCRV5ServerRecConfig model)
- **prompt_depth_anything** -- `{'pil': 'PromptDepthAnythingImageProcessorPil', 'torchvision': 'PromptDepthAnythingImageProcessor'}` (PromptDepthAnythingConfig model)
- **pvt** -- `{'pil': 'PvtImageProcessorPil', 'torchvision': 'PvtImageProcessor'}` (PvtConfig model)
- **pvt_v2** -- `{'torchvision': 'PvtImageProcessor', 'pil': 'PvtImageProcessorPil'}` (PvtV2Config model)
- **qianfan_ocr** -- `{'torchvision': 'GotOcr2ImageProcessor', 'pil': 'GotOcr2ImageProcessorPil'}` (QianfanOCRConfig model)
- **qwen2_5_omni** -- `{'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'}` (Qwen2_5OmniConfig model)
- **qwen2_5_vl** -- `{'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'}` (Qwen2_5_VLConfig model)
- **qwen2_vl** -- `{'pil': 'Qwen2VLImageProcessorPil', 'torchvision': 'Qwen2VLImageProcessor'}` (Qwen2VLConfig model)
- **qwen3_5** -- `{'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'}` (Qwen3_5Config model)
- **qwen3_5_moe** -- `{'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'}` (Qwen3_5MoeConfig model)
- **qwen3_omni_moe** -- `{'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'}` (Qwen3OmniMoeConfig model)
- **qwen3_vl** -- `{'torchvision': 'Qwen2VLImageProcessor', 'pil': 'Qwen2VLImageProcessorPil'}` (Qwen3VLConfig model)
- **regnet** -- `{'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'}` (RegNetConfig model)
- **resnet** -- `{'torchvision': 'ConvNextImageProcessor', 'pil': 'ConvNextImageProcessorPil'}` (ResNetConfig model)
- **rt_detr** -- `{'pil': 'RTDetrImageProcessorPil', 'torchvision': 'RTDetrImageProcessor'}` (RTDetrConfig model)
- **sam** -- `{'pil': 'SamImageProcessorPil', 'torchvision': 'SamImageProcessor'}` (SamConfig model)
- **sam2** -- `{'torchvision': 'Sam2ImageProcessor'}` (Sam2Config model)
- **sam2_video** -- `{'torchvision': 'Sam2ImageProcessor'}` (Sam2VideoConfig model)
- **sam3** -- `{'torchvision': 'Sam3ImageProcessor'}` (Sam3Config model)
- **sam3_lite_text** -- `{'torchvision': 'Sam3ImageProcessor'}` (Sam3LiteTextConfig model)
- **sam3_tracker** -- `{'torchvision': 'Sam3ImageProcessor'}` (Sam3TrackerConfig model)
- **sam3_tracker_video** -- `{'torchvision': 'Sam3ImageProcessor'}` (Sam3TrackerVideoConfig model)
- **sam3_video** -- `{'torchvision': 'Sam3ImageProcessor'}` (Sam3VideoConfig model)
- **sam_hq** -- `{'torchvision': 'SamImageProcessor', 'pil': 'SamImageProcessorPil'}` (SamHQConfig model)
- **segformer** -- `{'pil': 'SegformerImageProcessorPil', 'torchvision': 'SegformerImageProcessor'}` (SegformerConfig model)
- **seggpt** -- `{'pil': 'SegGptImageProcessorPil', 'torchvision': 'SegGptImageProcessor'}` (SegGptConfig model)
- **shieldgemma2** -- `{'torchvision': 'Gemma3ImageProcessor', 'pil': 'Gemma3ImageProcessorPil'}` (ShieldGemma2Config model)
- **siglip** -- `{'pil': 'SiglipImageProcessorPil', 'torchvision': 'SiglipImageProcessor'}` (SiglipConfig model)
- **siglip2** -- `{'pil': 'Siglip2ImageProcessorPil', 'torchvision': 'Siglip2ImageProcessor'}` (Siglip2Config model)
- **slanet** -- `{'torchvision': 'SLANeXtImageProcessor'}` (SLANetConfig model)
- **slanext** -- `{'torchvision': 'SLANeXtImageProcessor'}` (SLANeXtConfig model)
- **smolvlm** -- `{'pil': 'SmolVLMImageProcessorPil', 'torchvision': 'SmolVLMImageProcessor'}` (SmolVLMConfig model)
- **superglue** -- `{'pil': 'SuperGlueImageProcessorPil', 'torchvision': 'SuperGlueImageProcessor'}` (SuperGlueConfig model)
- **superpoint** -- `{'pil': 'SuperPointImageProcessorPil', 'torchvision': 'SuperPointImageProcessor'}` (SuperPointConfig model)
- **swiftformer** -- `{'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'}` (SwiftFormerConfig model)
- **swin** -- `{'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'}` (SwinConfig model)
- **swin2sr** -- `{'pil': 'Swin2SRImageProcessorPil', 'torchvision': 'Swin2SRImageProcessor'}` (Swin2SRConfig model)
- **swinv2** -- `{'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'}` (Swinv2Config model)
- **t5gemma2** -- `{'torchvision': 'Gemma3ImageProcessor', 'pil': 'Gemma3ImageProcessorPil'}` (T5Gemma2Config model)
- **t5gemma2_encoder** -- `{'torchvision': 'Gemma3ImageProcessor', 'pil': 'Gemma3ImageProcessorPil'}` (T5Gemma2EncoderConfig model)
- **table-transformer** -- `{'torchvision': 'DetrImageProcessor', 'pil': 'DetrImageProcessorPil'}` (TableTransformerConfig model)
- **textnet** -- `{'pil': 'TextNetImageProcessorPil', 'torchvision': 'TextNetImageProcessor'}` (TextNetConfig model)
- **timesformer** -- `{'pil': 'VideoMAEImageProcessorPil', 'torchvision': 'VideoMAEImageProcessor'}` (TimesformerConfig model)
- **timm_wrapper** -- `{'pil': 'TimmWrapperImageProcessor'}` (TimmWrapperConfig model)
- **trocr** -- `{'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'}` (TrOCRConfig model)
- **tvp** -- `{'pil': 'TvpImageProcessorPil', 'torchvision': 'TvpImageProcessor'}` (TvpConfig model)
- **udop** -- `{'torchvision': 'LayoutLMv3ImageProcessor', 'pil': 'LayoutLMv3ImageProcessorPil'}` (UdopConfig model)
- **upernet** -- `{'torchvision': 'SegformerImageProcessor', 'pil': 'SegformerImageProcessorPil'}` (UperNetConfig model)
- **uvdoc** -- `{'torchvision': 'UVDocImageProcessor'}` (UVDocConfig model)
- **video_llama_3** -- `{'pil': 'VideoLlama3ImageProcessorPil', 'torchvision': 'VideoLlama3ImageProcessor'}` (VideoLlama3Config model)
- **video_llava** -- `{'pil': 'VideoLlavaImageProcessor'}` (VideoLlavaConfig model)
- **videomae** -- `{'pil': 'VideoMAEImageProcessorPil', 'torchvision': 'VideoMAEImageProcessor'}` (VideoMAEConfig model)
- **vilt** -- `{'pil': 'ViltImageProcessorPil', 'torchvision': 'ViltImageProcessor'}` (ViltConfig model)
- **vipllava** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (VipLlavaConfig model)
- **vit** -- `{'pil': 'ViTImageProcessorPil', 'torchvision': 'ViTImageProcessor'}` (ViTConfig model)
- **vit_mae** -- `{'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'}` (ViTMAEConfig model)
- **vit_msn** -- `{'torchvision': 'ViTImageProcessor', 'pil': 'ViTImageProcessorPil'}` (ViTMSNConfig model)
- **vitmatte** -- `{'pil': 'VitMatteImageProcessorPil', 'torchvision': 'VitMatteImageProcessor'}` (VitMatteConfig model)
- **vitpose** -- `{'pil': 'VitPoseImageProcessorPil', 'torchvision': 'VitPoseImageProcessor'}` (VitPoseConfig model)
- **vivit** -- `{'torchvision': 'VivitImageProcessor'}` (VivitConfig model)
- **xclip** -- `{'torchvision': 'CLIPImageProcessor', 'pil': 'CLIPImageProcessorPil'}` (XCLIPConfig model)
- **yolos** -- `{'pil': 'YolosImageProcessorPil', 'torchvision': 'YolosImageProcessor'}` (YolosConfig model)
- **zoedepth** -- `{'pil': 'ZoeDepthImageProcessorPil', 'torchvision': 'ZoeDepthImageProcessor'}` (ZoeDepthConfig model)
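The lookup described above — the config's `model_type` first, then pattern matching on the name or path — can be sketched as follows. This is an illustrative stand-in with a toy two-entry mapping, not the library's actual implementation:

```python
# Toy mapping standing in for the full model_type -> image processor table above.
MAPPING = {"clip": "CLIPImageProcessor", "vit": "ViTImageProcessor"}

def select_image_processor(model_type, name_or_path):
    """Prefer the config's model_type; fall back to pattern matching on the name/path."""
    if model_type in MAPPING:
        return MAPPING[model_type]
    for key, processor in MAPPING.items():
        if key in name_or_path:  # fallback: substring match on the model id or path
            return processor
    raise ValueError(f"Could not infer an image processor for {name_or_path!r}")

print(select_image_processor("clip", "openai/clip-vit-base-patch32"))  # CLIPImageProcessor
print(select_image_processor(None, "google/vit-base-patch16-224"))     # ViTImageProcessor
```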

Passing `token=True` is required when you want to use a private model.

Examples:

```python
>>> from transformers import AutoImageProcessor

>>> # Download image processor from huggingface.co and cache.
>>> image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")

>>> # If image processor files are in a directory (e.g. image processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # image_processor = AutoImageProcessor.from_pretrained("./test/saved_model/")
```
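The `backend` default described above (torchvision when available, otherwise PIL) can be sketched with a small stdlib-only helper. This is a hedged illustration of the documented behavior, not the library's internal code; here "available" is approximated by presence in a backend-to-class mapping:

```python
def resolve_backend(requested, registered):
    """Pick a backend key from `registered` (a backend-name -> class mapping).

    Mirrors the documented default: an explicit request wins; otherwise
    torchvision is preferred when present, with a PIL fallback.
    """
    if requested is not None:
        if requested not in registered:
            raise ValueError(f"Unknown backend {requested!r}; choose from {sorted(registered)}")
        return requested
    return "torchvision" if "torchvision" in registered else "pil"

registered = {"pil": "CLIPImageProcessorPil", "torchvision": "CLIPImageProcessor"}
print(resolve_backend(None, registered))   # torchvision: preferred when present
print(resolve_backend("pil", registered))  # pil: an explicit request wins
```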

#### register[[transformers.AutoImageProcessor.register]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/image_processing_auto.py#L649)

Register a new image processor for this class.

**Parameters:**

config_class ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The configuration corresponding to the model to register.

slow_image_processor_class (`type`, *optional*) : The PIL backend image processor class (deprecated, use `image_processor_classes={"pil": ...}`).

fast_image_processor_class (`type`, *optional*) : The Torchvision backend image processor class (deprecated, use `image_processor_classes={"torchvision": ...}`).

image_processor_classes (`dict[str, type]`, *optional*) : Dictionary mapping backend names to image processor classes. Allows registering custom backends. Example: `{"pil": MyPilProcessor, "torchvision": MyTorchvisionProcessor, "custom": MyCustomProcessor}`

exist_ok (`bool`, *optional*, defaults to `False`) : If `True`, allow overwriting existing registrations.

## AutoVideoProcessor[[transformers.AutoVideoProcessor]]

#### transformers.AutoVideoProcessor[[transformers.AutoVideoProcessor]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/video_processing_auto.py#L227)

This is a generic video processor class that will be instantiated as one of the video processor classes of the
library when created with the [AutoVideoProcessor.from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoVideoProcessor.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoVideoProcessor.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/video_processing_auto.py#L241)

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  This can be either:

  - a string, the *model id* of a pretrained video_processor hosted inside a model repo on
    huggingface.co.
  - a path to a *directory* containing a video processor file saved using the
    [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/video_processor#transformers.BaseVideoProcessor.save_pretrained) method, e.g.,
    `./my_model_directory/`.
  - a path to a saved video processor JSON *file*, e.g.,
    `./my_model_directory/preprocessor_config.json`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model video processor should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force (re-)downloading the video processor files, overriding any cached versions.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}.` The proxies are used on each request.
- **token** (`str` or `bool`, *optional*) --
  The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
  when running `hf auth login` (stored in `~/.huggingface`).
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final video processor object. If `True`, then this
  function returns a `Tuple(video_processor, unused_kwargs)` where *unused_kwargs* is a dictionary
  consisting of the key/value pairs whose keys are not video processor attributes: i.e., the part of
  `kwargs` which has not been used to update `video_processor` and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (`dict[str, Any]`, *optional*) --
  The values in kwargs of any keys which are video processor attributes will be used to override the
  loaded values. Behavior concerning key/value pairs whose keys are *not* video processor attributes is
  controlled by the `return_unused_kwargs` keyword parameter.

Instantiate one of the video processor classes of the library from a pretrained model configuration.

The video processor class to instantiate is selected based on the `model_type` property of the config object
(either passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's
missing, by falling back to using pattern matching on `pretrained_model_name_or_path`:

- **ernie4_5_vl_moe** -- [Ernie4_5_VLMoeVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeVideoProcessor) (Ernie4_5_VLMoeConfig model)
- **exaone4_5** -- [Qwen2VLVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLVideoProcessor) (Exaone4_5_Config model)
- **gemma4** -- [Gemma4VideoProcessor](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4VideoProcessor) (Gemma4Config model)
- **glm46v** -- [Glm46VVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VVideoProcessor) (Glm46VConfig model)
- **glm4v** -- [Glm4vVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vVideoProcessor) (Glm4vConfig model)
- **instructblip** -- [InstructBlipVideoVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoVideoProcessor) (InstructBlipConfig model)
- **instructblipvideo** -- [InstructBlipVideoVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoVideoProcessor) (InstructBlipVideoConfig model)
- **internvl** -- [InternVLVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLVideoProcessor) (InternVLConfig model)
- **llava_next_video** -- [LlavaNextVideoVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoVideoProcessor) (LlavaNextVideoConfig model)
- **llava_onevision** -- [LlavaOnevisionVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionVideoProcessor) (LlavaOnevisionConfig model)
- **minicpmv4_6** -- [MiniCPMV4_6VideoProcessor](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6VideoProcessor) (MiniCPMV4_6Config model)
- **pe_audio_video** -- [PeVideoVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoVideoProcessor) (PeAudioVideoConfig model)
- **pe_video** -- [PeVideoVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoVideoProcessor) (PeVideoConfig model)
- **perception_lm** -- [PerceptionLMVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMVideoProcessor) (PerceptionLMConfig model)
- **qwen2_5_omni** -- [Qwen2VLVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLVideoProcessor) (Qwen2_5OmniConfig model)
- **qwen2_5_vl** -- [Qwen2VLVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLVideoProcessor) (Qwen2_5_VLConfig model)
- **qwen2_vl** -- [Qwen2VLVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLVideoProcessor) (Qwen2VLConfig model)
- **qwen3_5** -- [Qwen3VLVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLVideoProcessor) (Qwen3_5Config model)
- **qwen3_5_moe** -- [Qwen3VLVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLVideoProcessor) (Qwen3_5MoeConfig model)
- **qwen3_omni_moe** -- [Qwen2VLVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLVideoProcessor) (Qwen3OmniMoeConfig model)
- **qwen3_vl** -- [Qwen3VLVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLVideoProcessor) (Qwen3VLConfig model)
- **qwen3_vl_moe** -- [Qwen3VLVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLVideoProcessor) (Qwen3VLMoeConfig model)
- **sam2_video** -- [Sam2VideoVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/sam2_video#transformers.Sam2VideoVideoProcessor) (Sam2VideoConfig model)
- **smolvlm** -- [SmolVLMVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMVideoProcessor) (SmolVLMConfig model)
- **video_llama_3** -- [VideoLlama3VideoProcessor](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3VideoProcessor) (VideoLlama3Config model)
- **video_llava** -- [VideoLlavaVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaVideoProcessor) (VideoLlavaConfig model)
- **videomae** -- [VideoMAEVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEVideoProcessor) (VideoMAEConfig model)
- **videomt** -- [VideomtVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/videomt#transformers.VideomtVideoProcessor) (VideomtConfig model)
- **vjepa2** -- [VJEPA2VideoProcessor](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2VideoProcessor) (VJEPA2Config model)
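The resolution order just described — `model_type` from the config first, then pattern matching on the name/path — can be sketched in plain Python. The mapping and function names below are illustrative stand-ins, not the real `transformers` internals:

```python
# Illustrative subset of the model_type -> video processor class mapping above.
VIDEO_PROCESSOR_MAPPING_NAMES = {
    "qwen2_vl": "Qwen2VLVideoProcessor",
    "llava_onevision": "LlavaOnevisionVideoProcessor",
    "vjepa2": "VJEPA2VideoProcessor",
}

def resolve_video_processor(model_type, pretrained_model_name_or_path):
    """Pick a video processor class name: config's model_type first, then pattern matching."""
    # 1) Prefer the model_type stored in the loaded config, if any.
    if model_type in VIDEO_PROCESSOR_MAPPING_NAMES:
        return VIDEO_PROCESSOR_MAPPING_NAMES[model_type]
    # 2) Fall back to pattern matching on the name/path.
    lowered = pretrained_model_name_or_path.lower()
    for key, class_name in VIDEO_PROCESSOR_MAPPING_NAMES.items():
        if key in lowered:
            return class_name
    raise ValueError(f"Unrecognized model in {pretrained_model_name_or_path!r}")
```

When the config carries a recognized `model_type`, the fallback never runs; pattern matching only applies when that property is missing.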

Passing `token=True` is required when you want to use a private model.

Examples:

```python
>>> from transformers import AutoVideoProcessor

>>> # Download video processor from huggingface.co and cache.
>>> video_processor = AutoVideoProcessor.from_pretrained("llava-hf/llava-onevision-qwen2-0.5b-ov-hf")

>>> # If video processor files are in a directory (e.g. video processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # video_processor = AutoVideoProcessor.from_pretrained("./test/saved_model/")
```
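The `return_unused_kwargs` behavior documented above boils down to partitioning `kwargs` by whether each key is a known video processor attribute. A minimal sketch — the attribute and kwarg names here are assumptions for illustration:

```python
def split_processor_kwargs(processor_attrs, kwargs):
    """Split kwargs into overrides applied to the processor and leftover (unused) entries."""
    overrides = {k: v for k, v in kwargs.items() if k in processor_attrs}
    unused = {k: v for k, v in kwargs.items() if k not in processor_attrs}
    return overrides, unused

# `size` stands in for a real video processor attribute; `my_custom_flag` is not one.
overrides, unused = split_processor_kwargs(
    {"size", "do_resize"}, {"size": 224, "my_custom_flag": True}
)
```

With `return_unused_kwargs=False`, only the updated processor is returned and the leftovers are discarded; with `True`, you get both back as a tuple.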

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : This can be either:

- a string, the *model id* of a pretrained video_processor hosted inside a model repo on huggingface.co.
- a path to a *directory* containing a video processor file saved using the [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/video_processor#transformers.BaseVideoProcessor.save_pretrained) method, e.g., `./my_model_directory/`.
- a path to a saved video processor JSON *file*, e.g., `./my_model_directory/preprocessor_config.json`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model video processor should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force (re-)downloading the video processor files and override the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}.` The proxies are used on each request.

token (`str` or *bool*, *optional*) : The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `hf auth login` (stored in `~/.huggingface`).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, then this function returns just the final video processor object. If `True`, then this function returns a `Tuple(video_processor, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not video processor attributes: i.e., the part of `kwargs` which has not been used to update `video_processor` and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (`dict[str, Any]`, *optional*) : The values in kwargs of any keys which are video processor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* video processor attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoVideoProcessor.register]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/video_processing_auto.py#L390)

Register a new video processor for this class.

**Parameters:**

config_class ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The configuration corresponding to the model to register.

video_processor_class ([BaseVideoProcessor](/docs/transformers/v5.8.0/en/main_classes/video_processor#transformers.BaseVideoProcessor)) : The video processor to register.
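Conceptually, `register()` extends the config→processor mapping that `from_pretrained()` consults. The toy registry below sketches that mechanism, including the consistency check between the processor's declared `config_class` and the registered config — all class names are placeholders, not the real implementation:

```python
class AutoVideoProcessorRegistry:
    """Toy stand-in for AutoVideoProcessor.register(config_class, video_processor_class)."""

    _mapping: dict = {}

    @classmethod
    def register(cls, config_class, video_processor_class):
        # Mirror the consistency check: the processor's declared config_class must match.
        declared = getattr(video_processor_class, "config_class", None)
        if declared is not None and declared is not config_class:
            raise ValueError("video_processor_class.config_class must match config_class")
        cls._mapping[config_class] = video_processor_class


class NewModelConfig:  # placeholder custom config
    model_type = "new-model"


class NewModelVideoProcessor:  # placeholder custom video processor
    config_class = NewModelConfig


AutoVideoProcessorRegistry.register(NewModelConfig, NewModelVideoProcessor)
```

After registration, a lookup keyed on the config class resolves to the custom processor, which is what lets the auto class instantiate it transparently.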

## AutoProcessor[[transformers.AutoProcessor]]

#### transformers.AutoProcessor[[transformers.AutoProcessor]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/processing_auto.py#L222)

This is a generic processor class that will be instantiated as one of the processor classes of the library when
created with the [AutoProcessor.from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoProcessor.from_pretrained) class method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_pretrained[[transformers.AutoProcessor.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/processing_auto.py#L236)

- **pretrained_model_name_or_path** (`str` or `os.PathLike`) --
  This can be either:

  - a string, the *model id* of a pretrained feature_extractor hosted inside a model repo on
    huggingface.co.
  - a path to a *directory* containing processor files saved using the `save_pretrained()` method,
    e.g., `./my_model_directory/`.
- **cache_dir** (`str` or `os.PathLike`, *optional*) --
  Path to a directory in which a downloaded pretrained model feature extractor should be cached if the
  standard cache should not be used.
- **force_download** (`bool`, *optional*, defaults to `False`) --
  Whether or not to force (re-)downloading the feature extractor files and override the cached versions
  if they exist.
- **proxies** (`dict[str, str]`, *optional*) --
  A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128',
  'http://hostname': 'foo.bar:4012'}.` The proxies are used on each request.
- **token** (`str` or *bool*, *optional*) --
  The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
  when running `hf auth login` (stored in `~/.huggingface`).
- **revision** (`str`, *optional*, defaults to `"main"`) --
  The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a
  git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any
  identifier allowed by git.
- **return_unused_kwargs** (`bool`, *optional*, defaults to `False`) --
  If `False`, then this function returns just the final feature extractor object. If `True`, then this
  function returns a `Tuple(feature_extractor, unused_kwargs)` where *unused_kwargs* is a dictionary
  consisting of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of
  `kwargs` which has not been used to update `feature_extractor` and is otherwise ignored.
- **trust_remote_code** (`bool`, *optional*, defaults to `False`) --
  Whether or not to allow for custom models defined on the Hub in their own modeling files. This option
  should only be set to `True` for repositories you trust and in which you have read the code, as it will
  execute code present on the Hub on your local machine.
- **kwargs** (`dict[str, Any]`, *optional*) --
  The values in kwargs of any keys which are feature extractor attributes will be used to override the
  loaded values. Behavior concerning key/value pairs whose keys are *not* feature extractor attributes is
  controlled by the `return_unused_kwargs` keyword parameter.

Instantiate one of the processor classes of the library from a pretrained model configuration.

The processor class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible):

- **aimv2** -- [CLIPProcessor](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPProcessor) (Aimv2Config model)
- **align** -- [AlignProcessor](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignProcessor) (AlignConfig model)
- **altclip** -- [AltCLIPProcessor](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPProcessor) (AltCLIPConfig model)
- **aria** -- [AriaProcessor](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaProcessor) (AriaConfig model)
- **audioflamingo3** -- [AudioFlamingo3Processor](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3Processor) (AudioFlamingo3Config model)
- **aya_vision** -- [AyaVisionProcessor](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionProcessor) (AyaVisionConfig model)
- **bark** -- [BarkProcessor](/docs/transformers/v5.8.0/en/model_doc/bark#transformers.BarkProcessor) (BarkConfig model)
- **blip** -- [BlipProcessor](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipProcessor) (BlipConfig model)
- **blip-2** -- [Blip2Processor](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Processor) (Blip2Config model)
- **bridgetower** -- [BridgeTowerProcessor](/docs/transformers/v5.8.0/en/model_doc/bridgetower#transformers.BridgeTowerProcessor) (BridgeTowerConfig model)
- **chameleon** -- [ChameleonProcessor](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonProcessor) (ChameleonConfig model)
- **chinese_clip** -- [ChineseCLIPProcessor](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPProcessor) (ChineseCLIPConfig model)
- **clap** -- [ClapProcessor](/docs/transformers/v5.8.0/en/model_doc/clap#transformers.ClapProcessor) (ClapConfig model)
- **clip** -- [CLIPProcessor](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPProcessor) (CLIPConfig model)
- **clipseg** -- [CLIPSegProcessor](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegProcessor) (CLIPSegConfig model)
- **clvp** -- [ClvpProcessor](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpProcessor) (ClvpConfig model)
- **cohere2_vision** -- [Cohere2VisionProcessor](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionProcessor) (Cohere2VisionConfig model)
- **cohere_asr** -- [CohereAsrProcessor](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrProcessor) (CohereAsrConfig model)
- **colmodernvbert** -- [ColModernVBertProcessor](/docs/transformers/v5.8.0/en/model_doc/colmodernvbert#transformers.ColModernVBertProcessor) (ColModernVBertConfig model)
- **colpali** -- [ColPaliProcessor](/docs/transformers/v5.8.0/en/model_doc/colpali#transformers.ColPaliProcessor) (ColPaliConfig model)
- **colqwen2** -- [ColQwen2Processor](/docs/transformers/v5.8.0/en/model_doc/colqwen2#transformers.ColQwen2Processor) (ColQwen2Config model)
- **deepseek_vl** -- [DeepseekVLProcessor](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLProcessor) (DeepseekVLConfig model)
- **deepseek_vl_hybrid** -- [DeepseekVLHybridProcessor](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridProcessor) (DeepseekVLHybridConfig model)
- **dia** -- [DiaProcessor](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaProcessor) (DiaConfig model)
- **edgetam** -- [Sam2Processor](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2Processor) (EdgeTamConfig model)
- **emu3** -- [Emu3Processor](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Processor) (Emu3Config model)
- **ernie4_5_vl_moe** -- [Ernie4_5_VLMoeProcessor](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeProcessor) (Ernie4_5_VLMoeConfig model)
- **evolla** -- [EvollaProcessor](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaProcessor) (EvollaConfig model)
- **exaone4_5** -- [Exaone4_5_Processor](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_Processor) (Exaone4_5_Config model)
- **flava** -- [FlavaProcessor](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaProcessor) (FlavaConfig model)
- **florence2** -- [Florence2Processor](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Processor) (Florence2Config model)
- **fuyu** -- [FuyuProcessor](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuProcessor) (FuyuConfig model)
- **gemma3** -- [Gemma3Processor](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Processor) (Gemma3Config model)
- **gemma3n** -- [Gemma3nProcessor](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nProcessor) (Gemma3nConfig model)
- **gemma4** -- [Gemma4Processor](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Processor) (Gemma4Config model)
- **git** -- [GitProcessor](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitProcessor) (GitConfig model)
- **glm46v** -- [Glm46VProcessor](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VProcessor) (Glm46VConfig model)
- **glm4v** -- [Glm4vProcessor](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vProcessor) (Glm4vConfig model)
- **glm4v_moe** -- [Glm4vProcessor](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vProcessor) (Glm4vMoeConfig model)
- **glm_image** -- [Glm4vProcessor](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vProcessor) (GlmImageConfig model)
- **glmasr** -- [GlmAsrProcessor](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrProcessor) (GlmAsrConfig model)
- **got_ocr2** -- [GotOcr2Processor](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Processor) (GotOcr2Config model)
- **granite4_vision** -- [Granite4VisionProcessor](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionProcessor) (Granite4VisionConfig model)
- **granite_speech** -- [GraniteSpeechProcessor](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechProcessor) (GraniteSpeechConfig model)
- **granite_speech_plus** -- [GraniteSpeechProcessor](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechProcessor) (GraniteSpeechPlusConfig model)
- **grounding-dino** -- [GroundingDinoProcessor](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoProcessor) (GroundingDinoConfig model)
- **groupvit** -- [CLIPProcessor](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPProcessor) (GroupViTConfig model)
- **higgs_audio_v2** -- [HiggsAudioV2Processor](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2#transformers.HiggsAudioV2Processor) (HiggsAudioV2Config model)
- **hubert** -- [Wav2Vec2Processor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Processor) (HubertConfig model)
- **idefics** -- [IdeficsProcessor](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsProcessor) (IdeficsConfig model)
- **idefics2** -- [Idefics2Processor](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Processor) (Idefics2Config model)
- **idefics3** -- [Idefics3Processor](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Processor) (Idefics3Config model)
- **instructblip** -- [InstructBlipProcessor](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipProcessor) (InstructBlipConfig model)
- **instructblipvideo** -- [InstructBlipVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoProcessor) (InstructBlipVideoConfig model)
- **internvl** -- [InternVLProcessor](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLProcessor) (InternVLConfig model)
- **janus** -- [JanusProcessor](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusProcessor) (JanusConfig model)
- **kosmos-2** -- [Kosmos2Processor](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2Processor) (Kosmos2Config model)
- **kosmos-2.5** -- [Kosmos2_5Processor](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5Processor) (Kosmos2_5Config model)
- **kyutai_speech_to_text** -- [KyutaiSpeechToTextProcessor](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextProcessor) (KyutaiSpeechToTextConfig model)
- **lasr_ctc** -- [LasrProcessor](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrProcessor) (LasrCTCConfig model)
- **lasr_encoder** -- [LasrProcessor](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrProcessor) (LasrEncoderConfig model)
- **layoutlmv2** -- [LayoutLMv2Processor](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Processor) (LayoutLMv2Config model)
- **layoutlmv3** -- [LayoutLMv3Processor](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Processor) (LayoutLMv3Config model)
- **layoutxlm** -- [LayoutXLMProcessor](/docs/transformers/v5.8.0/en/model_doc/layoutxlm#transformers.LayoutXLMProcessor) (LayoutXLMConfig model)
- **lfm2_vl** -- [Lfm2VlProcessor](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlProcessor) (Lfm2VlConfig model)
- **lighton_ocr** -- [LightOnOcrProcessor](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrProcessor) (LightOnOcrConfig model)
- **llama4** -- [Llama4Processor](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4Processor) (Llama4Config model)
- **llava** -- [LlavaProcessor](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaProcessor) (LlavaConfig model)
- **llava_next** -- [LlavaNextProcessor](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextProcessor) (LlavaNextConfig model)
- **llava_next_video** -- [LlavaNextVideoProcessor](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoProcessor) (LlavaNextVideoConfig model)
- **llava_onevision** -- [LlavaOnevisionProcessor](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionProcessor) (LlavaOnevisionConfig model)
- **markuplm** -- [MarkupLMProcessor](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMProcessor) (MarkupLMConfig model)
- **metaclip_2** -- [CLIPProcessor](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPProcessor) (MetaClip2Config model)
- **mgp-str** -- [MgpstrProcessor](/docs/transformers/v5.8.0/en/model_doc/mgp-str#transformers.MgpstrProcessor) (MgpstrConfig model)
- **minicpmv4_6** -- [MiniCPMV4_6Processor](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6Processor) (MiniCPMV4_6Config model)
- **mistral3** -- [PixtralProcessor](/docs/transformers/v5.8.0/en/model_doc/pixtral#transformers.PixtralProcessor) (Mistral3Config model)
- **mllama** -- [MllamaProcessor](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaProcessor) (MllamaConfig model)
- **mm-grounding-dino** -- [GroundingDinoProcessor](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoProcessor) (MMGroundingDinoConfig model)
- **modernvbert** -- [Idefics3Processor](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Processor) (ModernVBertConfig model)
- **moonshine** -- [Wav2Vec2Processor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Processor) (MoonshineConfig model)
- **moonshine_streaming** -- [MoonshineStreamingProcessor](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingProcessor) (MoonshineStreamingConfig model)
- **musicflamingo** -- [MusicFlamingoProcessor](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoProcessor) (MusicFlamingoConfig model)
- **omdet-turbo** -- [OmDetTurboProcessor](/docs/transformers/v5.8.0/en/model_doc/omdet-turbo#transformers.OmDetTurboProcessor) (OmDetTurboConfig model)
- **oneformer** -- [OneFormerProcessor](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerProcessor) (OneFormerConfig model)
- **ovis2** -- [Ovis2Processor](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2Processor) (Ovis2Config model)
- **owlv2** -- [Owlv2Processor](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2Processor) (Owlv2Config model)
- **owlvit** -- [OwlViTProcessor](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTProcessor) (OwlViTConfig model)
- **paddleocr_vl** -- [PaddleOCRVLProcessor](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVLProcessor) (PaddleOCRVLConfig model)
- **paligemma** -- [PaliGemmaProcessor](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaProcessor) (PaliGemmaConfig model)
- **perception_lm** -- [PerceptionLMProcessor](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMProcessor) (PerceptionLMConfig model)
- **phi4_multimodal** -- [Phi4MultimodalProcessor](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalProcessor) (Phi4MultimodalConfig model)
- **pi0** -- [PI0Processor](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0Processor) (PI0Config model)
- **pix2struct** -- [Pix2StructProcessor](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructProcessor) (Pix2StructConfig model)
- **pixtral** -- [PixtralProcessor](/docs/transformers/v5.8.0/en/model_doc/pixtral#transformers.PixtralProcessor) (PixtralVisionConfig model)
- **pop2piano** -- [Pop2PianoProcessor](/docs/transformers/v5.8.0/en/model_doc/pop2piano#transformers.Pop2PianoProcessor) (Pop2PianoConfig model)
- **pp_chart2table** -- [PPChart2TableProcessor](/docs/transformers/v5.8.0/en/model_doc/pp_chart2table#transformers.PPChart2TableProcessor) (PPChart2TableConfig model)
- **pp_formulanet** -- [PPFormulaNetProcessor](/docs/transformers/v5.8.0/en/model_doc/pp_formulanet#transformers.PPFormulaNetProcessor) (PPFormulaNetConfig model)
- **qianfan_ocr** -- [QianfanOCRProcessor](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRProcessor) (QianfanOCRConfig model)
- **qwen2_5_omni** -- [Qwen2_5OmniProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniProcessor) (Qwen2_5OmniConfig model)
- **qwen2_5_vl** -- [Qwen2_5_VLProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLProcessor) (Qwen2_5_VLConfig model)
- **qwen2_audio** -- [Qwen2AudioProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioProcessor) (Qwen2AudioConfig model)
- **qwen2_vl** -- [Qwen2VLProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLProcessor) (Qwen2VLConfig model)
- **qwen3_5** -- [Qwen3VLProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLProcessor) (Qwen3_5Config model)
- **qwen3_5_moe** -- [Qwen3VLProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLProcessor) (Qwen3_5MoeConfig model)
- **qwen3_omni_moe** -- [Qwen3OmniMoeProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeProcessor) (Qwen3OmniMoeConfig model)
- **qwen3_vl** -- [Qwen3VLProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLProcessor) (Qwen3VLConfig model)
- **qwen3_vl_moe** -- [Qwen3VLProcessor](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLProcessor) (Qwen3VLMoeConfig model)
- **sam** -- [SamProcessor](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamProcessor) (SamConfig model)
- **sam2** -- [Sam2Processor](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2Processor) (Sam2Config model)
- **sam3** -- [Sam3Processor](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3Processor) (Sam3Config model)
- **sam3_lite_text** -- [Sam3Processor](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3Processor) (Sam3LiteTextConfig model)
- **sam_hq** -- [SamHQProcessor](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQProcessor) (SamHQConfig model)
- **seamless_m4t** -- [SeamlessM4TProcessor](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TProcessor) (SeamlessM4TConfig model)
- **sew** -- [Wav2Vec2Processor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Processor) (SEWConfig model)
- **sew-d** -- [Wav2Vec2Processor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Processor) (SEWDConfig model)
- **shieldgemma2** -- [ShieldGemma2Processor](/docs/transformers/v5.8.0/en/model_doc/shieldgemma2#transformers.ShieldGemma2Processor) (ShieldGemma2Config model)
- **siglip** -- [SiglipProcessor](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipProcessor) (SiglipConfig model)
- **siglip2** -- [Siglip2Processor](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Processor) (Siglip2Config model)
- **smolvlm** -- [SmolVLMProcessor](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMProcessor) (SmolVLMConfig model)
- **speech_to_text** -- [Speech2TextProcessor](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextProcessor) (Speech2TextConfig model)
- **speecht5** -- [SpeechT5Processor](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5Processor) (SpeechT5Config model)
- **t5gemma2** -- [Gemma3Processor](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Processor) (T5Gemma2Config model)
- **t5gemma2_encoder** -- [Gemma3Processor](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Processor) (T5Gemma2EncoderConfig model)
- **trocr** -- [TrOCRProcessor](/docs/transformers/v5.8.0/en/model_doc/trocr#transformers.TrOCRProcessor) (TrOCRConfig model)
- **tvp** -- [TvpProcessor](/docs/transformers/v5.8.0/en/model_doc/tvp#transformers.TvpProcessor) (TvpConfig model)
- **udop** -- [UdopProcessor](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopProcessor) (UdopConfig model)
- **unispeech** -- [Wav2Vec2Processor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Processor) (UniSpeechConfig model)
- **unispeech-sat** -- [Wav2Vec2Processor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Processor) (UniSpeechSatConfig model)
- **vibevoice_asr** -- [VibeVoiceAsrProcessor](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrProcessor) (VibeVoiceAsrConfig model)
- **video_llava** -- [VideoLlavaProcessor](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaProcessor) (VideoLlavaConfig model)
- **vilt** -- [ViltProcessor](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltProcessor) (ViltConfig model)
- **vipllava** -- [LlavaProcessor](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaProcessor) (VipLlavaConfig model)
- **vision-text-dual-encoder** -- [VisionTextDualEncoderProcessor](/docs/transformers/v5.8.0/en/model_doc/vision-text-dual-encoder#transformers.VisionTextDualEncoderProcessor) (VisionTextDualEncoderConfig model)
- **voxtral** -- [VoxtralProcessor](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralProcessor) (VoxtralConfig model)
- **voxtral_realtime** -- [VoxtralRealtimeProcessor](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeProcessor) (VoxtralRealtimeConfig model)
- **wav2vec2** -- [Wav2Vec2Processor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Processor) (Wav2Vec2Config model)
- **wav2vec2-bert** -- [Wav2Vec2Processor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Processor) (Wav2Vec2BertConfig model)
- **wav2vec2-conformer** -- [Wav2Vec2Processor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Processor) (Wav2Vec2ConformerConfig model)
- **wavlm** -- [Wav2Vec2Processor](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Processor) (WavLMConfig model)
- **whisper** -- [WhisperProcessor](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperProcessor) (WhisperConfig model)
- **xclip** -- [XCLIPProcessor](/docs/transformers/v5.8.0/en/model_doc/xclip#transformers.XCLIPProcessor) (XCLIPConfig model)

Passing `token=True` is required when you want to use a private model.

Examples:

```python
>>> from transformers import AutoProcessor

>>> # Download processor from huggingface.co and cache.
>>> processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If processor files are in a directory (e.g. processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # processor = AutoProcessor.from_pretrained("./test/saved_model/")
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : This can be either:

- a string, the *model id* of a pretrained processor hosted inside a model repo on huggingface.co.
- a path to a *directory* containing processor files saved using the `save_pretrained()` method, e.g., `./my_model_directory/`.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which the downloaded pretrained processor should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force a (re-)download of the processor files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

token (`str` or `bool`, *optional*) : The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated when running `hf auth login` (stored in `~/.huggingface`).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, `revision` can be any identifier allowed by git.

return_unused_kwargs (`bool`, *optional*, defaults to `False`) : If `False`, this function returns just the final processor object. If `True`, it returns a tuple `(processor, unused_kwargs)`, where *unused_kwargs* is a dictionary of the key/value pairs whose keys are not processor attributes, i.e., the part of `kwargs` that has not been used to update the processor and is otherwise ignored.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

kwargs (`dict[str, Any]`, *optional*) : Values in `kwargs` for keys that are processor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* processor attributes is controlled by the `return_unused_kwargs` keyword parameter.
#### register[[transformers.AutoProcessor.register]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/processing_auto.py#L459)

Register a new processor for this class.

**Parameters:**

config_class ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The configuration corresponding to the model to register.

processor_class ([ProcessorMixin](/docs/transformers/v5.8.0/en/main_classes/processors#transformers.ProcessorMixin)) : The processor to register.
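As a minimal sketch, registering a custom processor follows the same pattern shown for the other auto classes: register the config under a key, then map that config class to the processor class. The `"new-processor-model"` key and the `NewConfig`/`NewProcessor` classes below are hypothetical, for illustration only.

```python
from transformers import AutoConfig, AutoProcessor
from transformers.processing_utils import ProcessorMixin

try:  # the config base class was renamed across major versions
    from transformers import PreTrainedConfig
except ImportError:
    from transformers import PretrainedConfig as PreTrainedConfig


class NewConfig(PreTrainedConfig):
    # must match the key passed to AutoConfig.register below
    model_type = "new-processor-model"


class NewProcessor(ProcessorMixin):
    # a real processor would declare its components here,
    # e.g. attributes = ["feature_extractor", "tokenizer"]
    attributes = []

    def __init__(self, **kwargs):
        pass


AutoConfig.register("new-processor-model", NewConfig)
AutoProcessor.register(NewConfig, NewProcessor)
```

After this, `AutoProcessor.from_pretrained()` will resolve repositories whose config declares `model_type: "new-processor-model"` to `NewProcessor`.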

## Generic model classes

The following auto classes are available for instantiating a base model class without a specific head.

### AutoModel[[transformers.AutoModel]]

#### transformers.AutoModel[[transformers.AutoModel]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L1997)

This is a generic model class that will be instantiated as one of the base model classes of the library when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
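For instance (a minimal sketch using a deliberately tiny, randomly initialized BERT configuration), `from_config()` selects the concrete class from the configuration type without downloading any weights, while calling the constructor directly raises an error:

```python
from transformers import AutoModel, BertConfig

# A tiny config so the randomly initialized model is cheap to build.
config = BertConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)

# The class is selected from the config type: BertConfig -> BertModel.
# Note: from_config builds a randomly initialized model; it loads no weights.
model = AutoModel.from_config(config)
print(type(model).__name__)  # BertModel

# Direct instantiation is not supported.
try:
    AutoModel()
except OSError:  # raised as EnvironmentError
    print("AutoModel() cannot be called directly")
```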

#### from_config[[transformers.AutoModel.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

**Parameters:**

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [ASTConfig](/docs/transformers/v5.8.0/en/model_doc/audio-spectrogram-transformer#transformers.ASTConfig) configuration class: [ASTModel](/docs/transformers/v5.8.0/en/model_doc/audio-spectrogram-transformer#transformers.ASTModel) (ASTConfig model)
  - [AfmoeConfig](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeConfig) configuration class: [AfmoeModel](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeModel) (AfmoeConfig model)
  - [Aimv2Config](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2Config) configuration class: [Aimv2Model](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2Model) (Aimv2Config model)
  - [Aimv2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2VisionConfig) configuration class: [Aimv2VisionModel](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2VisionModel) (Aimv2VisionConfig model)
  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertModel` (AlbertConfig model)
  - [AlignConfig](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignConfig) configuration class: [AlignModel](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignModel) (AlignConfig model)
  - [AltCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPModel) (AltCLIPConfig model)
  - [ApertusConfig](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusConfig) configuration class: [ApertusModel](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusModel) (ApertusConfig model)
  - [ArceeConfig](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeConfig) configuration class: [ArceeModel](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeModel) (ArceeConfig model)
  - [AriaConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaConfig) configuration class: [AriaModel](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaModel) (AriaConfig model)
  - [AriaTextConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextConfig) configuration class: [AriaTextModel](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextModel) (AriaTextConfig model)
  - [AudioFlamingo3Config](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3Config) configuration class: [AudioFlamingo3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3ForConditionalGeneration) (AudioFlamingo3Config model)
  - [AudioFlamingo3EncoderConfig](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3EncoderConfig) configuration class: [AudioFlamingo3Encoder](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3Encoder) (AudioFlamingo3EncoderConfig model)
  - [AutoformerConfig](/docs/transformers/v5.8.0/en/model_doc/autoformer#transformers.AutoformerConfig) configuration class: [AutoformerModel](/docs/transformers/v5.8.0/en/model_doc/autoformer#transformers.AutoformerModel) (AutoformerConfig model)
  - [AyaVisionConfig](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionConfig) configuration class: [AyaVisionModel](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionModel) (AyaVisionConfig model)
  - [BambaConfig](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaConfig) configuration class: [BambaModel](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaModel) (BambaConfig model)
  - [BarkConfig](/docs/transformers/v5.8.0/en/model_doc/bark#transformers.BarkConfig) configuration class: [BarkModel](/docs/transformers/v5.8.0/en/model_doc/bark#transformers.BarkModel) (BarkConfig model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartModel](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartModel) (BartConfig model)
  - [BeitConfig](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitConfig) configuration class: [BeitModel](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitModel) (BeitConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertModel](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertModel) (BertConfig model)
  - [BertGenerationConfig](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationConfig) configuration class: [BertGenerationEncoder](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationEncoder) (BertGenerationConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdModel](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdModel) (BigBirdConfig model)
  - [BigBirdPegasusConfig](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusModel](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusModel) (BigBirdPegasusConfig model)
  - [BioGptConfig](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptModel](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptModel) (BioGptConfig model)
  - [BitConfig](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitConfig) configuration class: [BitModel](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitModel) (BitConfig model)
  - [BitNetConfig](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetConfig) configuration class: [BitNetModel](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetModel) (BitNetConfig model)
  - [BlenderbotConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [BlenderbotModel](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotModel) (BlenderbotConfig model)
  - [BlenderbotSmallConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [BlenderbotSmallModel](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallModel) (BlenderbotSmallConfig model)
  - [Blip2Config](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2Model](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Model) (Blip2Config model)
  - [Blip2QFormerConfig](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2QFormerConfig) configuration class: [Blip2QFormerModel](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2QFormerModel) (Blip2QFormerConfig model)
  - [BlipConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipModel) (BlipConfig model)
  - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomModel](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomModel) (BloomConfig model)
  - [BltConfig](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltConfig) configuration class: [BltModel](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltModel) (BltConfig model)
  - [BridgeTowerConfig](/docs/transformers/v5.8.0/en/model_doc/bridgetower#transformers.BridgeTowerConfig) configuration class: [BridgeTowerModel](/docs/transformers/v5.8.0/en/model_doc/bridgetower#transformers.BridgeTowerModel) (BridgeTowerConfig model)
  - [BrosConfig](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosConfig) configuration class: [BrosModel](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosModel) (BrosConfig model)
  - [CLIPConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPModel) (CLIPConfig model)
  - [CLIPSegConfig](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSegConfig model)
  - [CLIPTextConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTextConfig) configuration class: [CLIPTextModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTextModel) (CLIPTextConfig model)
  - [CLIPVisionConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPVisionConfig) configuration class: [CLIPVisionModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPVisionModel) (CLIPVisionConfig model)
  - [CTRLConfig](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLModel](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLModel) (CTRLConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertModel](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertModel) (CamembertConfig model)
  - [CanineConfig](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineConfig) configuration class: [CanineModel](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineModel) (CanineConfig model)
  - [ChameleonConfig](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonConfig) configuration class: [ChameleonModel](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonModel) (ChameleonConfig model)
  - [ChineseCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPConfig) configuration class: [ChineseCLIPModel](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPModel) (ChineseCLIPConfig model)
  - [ChineseCLIPVisionConfig](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPVisionConfig) configuration class: [ChineseCLIPVisionModel](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPVisionModel) (ChineseCLIPVisionConfig model)
  - [ClapConfig](/docs/transformers/v5.8.0/en/model_doc/clap#transformers.ClapConfig) configuration class: [ClapModel](/docs/transformers/v5.8.0/en/model_doc/clap#transformers.ClapModel) (ClapConfig model)
  - [ClvpConfig](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpConfig) configuration class: [ClvpModelForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpModelForConditionalGeneration) (ClvpConfig model)
  - [CodeGenConfig](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenModel](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenModel) (CodeGenConfig model)
  - [Cohere2Config](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2Config) configuration class: [Cohere2Model](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2Model) (Cohere2Config model)
  - [Cohere2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionConfig) configuration class: [Cohere2VisionModel](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionModel) (Cohere2VisionConfig model)
  - [CohereAsrConfig](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrConfig) configuration class: [CohereAsrModel](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrModel) (CohereAsrConfig model)
  - [CohereConfig](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereConfig) configuration class: [CohereModel](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereModel) (CohereConfig model)
  - [ConditionalDetrConfig](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrConfig) configuration class: [ConditionalDetrModel](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrModel) (ConditionalDetrConfig model)
  - [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertModel](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertModel) (ConvBertConfig model)
  - [ConvNextConfig](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextConfig) configuration class: [ConvNextModel](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextModel) (ConvNextConfig model)
  - [ConvNextV2Config](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2Config) configuration class: [ConvNextV2Model](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2Model) (ConvNextV2Config model)
  - [CpmAntConfig](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntConfig) configuration class: [CpmAntModel](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntModel) (CpmAntConfig model)
  - [CsmConfig](/docs/transformers/v5.8.0/en/model_doc/csm#transformers.CsmConfig) configuration class: [CsmForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/csm#transformers.CsmForConditionalGeneration) (CsmConfig model)
  - [CvtConfig](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtConfig) configuration class: [CvtModel](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtModel) (CvtConfig model)
  - [CwmConfig](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmConfig) configuration class: [CwmModel](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmModel) (CwmConfig model)
  - [DFineConfig](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineConfig) configuration class: [DFineModel](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineModel) (DFineConfig model)
  - [DINOv3ConvNextConfig](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ConvNextConfig) configuration class: [DINOv3ConvNextModel](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ConvNextModel) (DINOv3ConvNextConfig model)
  - [DINOv3ViTConfig](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ViTConfig) configuration class: [DINOv3ViTModel](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ViTModel) (DINOv3ViTConfig model)
  - [DPRConfig](/docs/transformers/v5.8.0/en/model_doc/dpr#transformers.DPRConfig) configuration class: [DPRQuestionEncoder](/docs/transformers/v5.8.0/en/model_doc/dpr#transformers.DPRQuestionEncoder) (DPRConfig model)
  - [DPTConfig](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTConfig) configuration class: [DPTModel](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTModel) (DPTConfig model)
  - [DabDetrConfig](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrConfig) configuration class: [DabDetrModel](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrModel) (DabDetrConfig model)
  - [DacConfig](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacConfig) configuration class: [DacModel](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacModel) (DacConfig model)
  - [Data2VecAudioConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioModel](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioModel) (Data2VecAudioConfig model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextModel](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextModel) (Data2VecTextConfig model)
  - [Data2VecVisionConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionModel](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionModel) (Data2VecVisionConfig model)
  - [DbrxConfig](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxConfig) configuration class: [DbrxModel](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxModel) (DbrxConfig model)
  - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaModel](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaModel) (DebertaConfig model)
  - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2Model](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Model) (DebertaV2Config model)
  - [DecisionTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/decision_transformer#transformers.DecisionTransformerConfig) configuration class: [DecisionTransformerModel](/docs/transformers/v5.8.0/en/model_doc/decision_transformer#transformers.DecisionTransformerModel) (DecisionTransformerConfig model)
  - [DeepseekV2Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2Config) configuration class: [DeepseekV2Model](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2Model) (DeepseekV2Config model)
  - [DeepseekV3Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3Model](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Model) (DeepseekV3Config model)
  - [DeepseekV4Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4Config) configuration class: [DeepseekV4Model](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4Model) (DeepseekV4Config model)
  - [DeepseekVLConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLConfig) configuration class: [DeepseekVLModel](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLModel) (DeepseekVLConfig model)
  - [DeepseekVLHybridConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridConfig) configuration class: [DeepseekVLHybridModel](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridModel) (DeepseekVLHybridConfig model)
  - [DeformableDetrConfig](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrConfig) configuration class: [DeformableDetrModel](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrModel) (DeformableDetrConfig model)
  - [DeiTConfig](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTModel](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTModel) (DeiTConfig model)
  - [Deimv2Config](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2Config) configuration class: [Deimv2Model](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2Model) (Deimv2Config model)
  - [DepthProConfig](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProConfig) configuration class: [DepthProModel](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProModel) (DepthProConfig model)
  - [DetrConfig](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrConfig) configuration class: [DetrModel](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrModel) (DetrConfig model)
  - [DiaConfig](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaConfig) configuration class: [DiaModel](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaModel) (DiaConfig model)
  - [DiffLlamaConfig](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaConfig) configuration class: [DiffLlamaModel](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaModel) (DiffLlamaConfig model)
  - [DinatConfig](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatConfig) configuration class: [DinatModel](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatModel) (DinatConfig model)
  - [Dinov2Config](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2Config) configuration class: [Dinov2Model](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2Model) (Dinov2Config model)
  - [Dinov2WithRegistersConfig](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersConfig) configuration class: [Dinov2WithRegistersModel](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersModel) (Dinov2WithRegistersConfig model)
  - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertModel](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertModel) (DistilBertConfig model)
  - [DogeConfig](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeConfig) configuration class: [DogeModel](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeModel) (DogeConfig model)
  - [DonutSwinConfig](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinConfig) configuration class: [DonutSwinModel](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinModel) (DonutSwinConfig model)
  - [Dots1Config](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1Config) configuration class: [Dots1Model](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1Model) (Dots1Config model)
  - [EdgeTamConfig](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamConfig) configuration class: [EdgeTamModel](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamModel) (EdgeTamConfig model)
  - [EdgeTamVideoConfig](/docs/transformers/v5.8.0/en/model_doc/edgetam_video#transformers.EdgeTamVideoConfig) configuration class: [EdgeTamVideoModel](/docs/transformers/v5.8.0/en/model_doc/edgetam_video#transformers.EdgeTamVideoModel) (EdgeTamVideoConfig model)
  - [EdgeTamVisionConfig](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamVisionConfig) configuration class: [EdgeTamVisionModel](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamVisionModel) (EdgeTamVisionConfig model)
  - [EfficientLoFTRConfig](/docs/transformers/v5.8.0/en/model_doc/efficientloftr#transformers.EfficientLoFTRConfig) configuration class: [EfficientLoFTRModel](/docs/transformers/v5.8.0/en/model_doc/efficientloftr#transformers.EfficientLoFTRModel) (EfficientLoFTRConfig model)
  - [EfficientNetConfig](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetConfig) configuration class: [EfficientNetModel](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetModel) (EfficientNetConfig model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraModel](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraModel) (ElectraConfig model)
  - [Emu3Config](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Config) configuration class: [Emu3Model](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Model) (Emu3Config model)
  - [EncodecConfig](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecConfig) configuration class: [EncodecModel](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecModel) (EncodecConfig model)
  - [Ernie4_5Config](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5Config) configuration class: [Ernie4_5Model](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5Model) (Ernie4_5Config model)
  - [Ernie4_5_MoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeConfig) configuration class: [Ernie4_5_MoeModel](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeModel) (Ernie4_5_MoeConfig model)
  - [Ernie4_5_VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeConfig) configuration class: [Ernie4_5_VLMoeModel](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeModel) (Ernie4_5_VLMoeConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieModel](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieModel) (ErnieConfig model)
  - [EsmConfig](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmConfig) configuration class: [EsmModel](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmModel) (EsmConfig model)
  - [EuroBertConfig](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertConfig) configuration class: [EuroBertModel](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertModel) (EuroBertConfig model)
  - [EvollaConfig](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaConfig) configuration class: [EvollaModel](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaModel) (EvollaConfig model)
  - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4Model](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Model) (Exaone4Config model)
  - [Exaone4_5_Config](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_Config) configuration class: [Exaone4_5_Model](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_Model) (Exaone4_5_Config model)
  - [Exaone4_5_VisionConfig](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_VisionConfig) configuration class: [Exaone4_5_VisionModel](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_VisionModel) (Exaone4_5_VisionConfig model)
  - [ExaoneMoeConfig](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeConfig) configuration class: [ExaoneMoeModel](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeModel) (ExaoneMoeConfig model)
  - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetModel](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetModel) (FNetConfig model)
  - [FSMTConfig](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTConfig) configuration class: [FSMTModel](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTModel) (FSMTConfig model)
  - [FalconConfig](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconConfig) configuration class: [FalconModel](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconModel) (FalconConfig model)
  - [FalconH1Config](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1Config) configuration class: [FalconH1Model](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1Model) (FalconH1Config model)
  - [FalconMambaConfig](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaConfig) configuration class: [FalconMambaModel](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaModel) (FalconMambaConfig model)
  - [FastSpeech2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerConfig) configuration class: [FastSpeech2ConformerModel](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerModel) (FastSpeech2ConformerConfig model)
  - [FastSpeech2ConformerWithHifiGanConfig](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerWithHifiGanConfig) configuration class: [FastSpeech2ConformerWithHifiGan](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerWithHifiGan) (FastSpeech2ConformerWithHifiGanConfig model)
  - [FastVlmConfig](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmConfig) configuration class: [FastVlmModel](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmModel) (FastVlmConfig model)
  - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertModel](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertModel) (FlaubertConfig model)
  - [FlavaConfig](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaConfig) configuration class: [FlavaModel](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaModel) (FlavaConfig model)
  - [FlexOlmoConfig](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoConfig) configuration class: [FlexOlmoModel](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoModel) (FlexOlmoConfig model)
  - [Florence2Config](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Config) configuration class: [Florence2Model](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Model) (Florence2Config model)
  - [FocalNetConfig](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetConfig) configuration class: [FocalNetModel](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetModel) (FocalNetConfig model)
  - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelModel](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelModel) or [FunnelBaseModel](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelBaseModel) (FunnelConfig model)
  - [FuyuConfig](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuConfig) configuration class: [FuyuModel](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuModel) (FuyuConfig model)
  - [GLPNConfig](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNConfig) configuration class: [GLPNModel](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNModel) (GLPNConfig model)
  - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2Model](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Model) (GPT2Config model)
  - [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) configuration class: [GPTBigCodeModel](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeModel) (GPTBigCodeConfig model)
  - [GPTJConfig](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJConfig) configuration class: [GPTJModel](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJModel) (GPTJConfig model)
  - [GPTNeoConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoConfig) configuration class: [GPTNeoModel](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoModel) (GPTNeoConfig model)
  - [GPTNeoXConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXConfig) configuration class: [GPTNeoXModel](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXModel) (GPTNeoXConfig model)
  - [GPTNeoXJapaneseConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseConfig) configuration class: [GPTNeoXJapaneseModel](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseModel) (GPTNeoXJapaneseConfig model)
  - [Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2Model](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Model) (Gemma2Config model)
  - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3Model](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Model) (Gemma3Config model)
  - [Gemma3TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: [Gemma3TextModel](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextModel) (Gemma3TextConfig model)
  - [Gemma3nAudioConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nAudioConfig) configuration class: `Gemma3nAudioEncoder` (Gemma3nAudioConfig model)
  - [Gemma3nConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nConfig) configuration class: [Gemma3nModel](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nModel) (Gemma3nConfig model)
  - [Gemma3nTextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nTextConfig) configuration class: [Gemma3nTextModel](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nTextModel) (Gemma3nTextConfig model)
  - [Gemma3nVisionConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nVisionConfig) configuration class: [TimmWrapperModel](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperModel) (Gemma3nVisionConfig model)
  - [Gemma4AudioConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4AudioConfig) configuration class: [Gemma4AudioModel](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4AudioModel) (Gemma4AudioConfig model)
  - [Gemma4Config](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Config) configuration class: [Gemma4Model](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Model) (Gemma4Config model)
  - [Gemma4TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4TextConfig) configuration class: [Gemma4TextModel](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4TextModel) (Gemma4TextConfig model)
  - [Gemma4VisionConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4VisionConfig) configuration class: [Gemma4VisionModel](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4VisionModel) (Gemma4VisionConfig model)
  - [GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaModel](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaModel) (GemmaConfig model)
  - [GitConfig](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitConfig) configuration class: [GitModel](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitModel) (GitConfig model)
  - [Glm46VConfig](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VConfig) configuration class: [Glm46VModel](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VModel) (Glm46VConfig model)
  - [Glm4Config](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Config) configuration class: [Glm4Model](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Model) (Glm4Config model)
  - [Glm4MoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeConfig) configuration class: [Glm4MoeModel](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeModel) (Glm4MoeConfig model)
  - [Glm4MoeLiteConfig](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteConfig) configuration class: [Glm4MoeLiteModel](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteModel) (Glm4MoeLiteConfig model)
  - [Glm4vConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vConfig) configuration class: [Glm4vModel](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vModel) (Glm4vConfig model)
  - [Glm4vMoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeConfig) configuration class: [Glm4vMoeModel](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeModel) (Glm4vMoeConfig model)
  - [Glm4vMoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeTextConfig) configuration class: [Glm4vMoeTextModel](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeTextModel) (Glm4vMoeTextConfig model)
  - [Glm4vMoeVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeVisionConfig) configuration class: [Glm4vMoeVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeVisionModel) (Glm4vMoeVisionConfig model)
  - [Glm4vTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vTextConfig) configuration class: [Glm4vTextModel](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vTextModel) (Glm4vTextConfig model)
  - [Glm4vVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vVisionConfig) configuration class: [Glm4vVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vVisionModel) (Glm4vVisionConfig model)
  - [GlmAsrConfig](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrConfig) configuration class: [GlmAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrForConditionalGeneration) (GlmAsrConfig model)
  - [GlmAsrEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrEncoderConfig) configuration class: [GlmAsrEncoder](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrEncoder) (GlmAsrEncoderConfig model)
  - [GlmConfig](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmConfig) configuration class: [GlmModel](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmModel) (GlmConfig model)
  - [GlmImageConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageConfig) configuration class: [GlmImageModel](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageModel) (GlmImageConfig model)
  - [GlmImageTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageTextConfig) configuration class: [GlmImageTextModel](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageTextModel) (GlmImageTextConfig model)
  - [GlmImageVQVAEConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVQVAEConfig) configuration class: [GlmImageVQVAE](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVQVAE) (GlmImageVQVAEConfig model)
  - [GlmImageVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVisionConfig) configuration class: [GlmImageVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVisionModel) (GlmImageVisionConfig model)
  - [GlmMoeDsaConfig](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaConfig) configuration class: [GlmMoeDsaModel](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaModel) (GlmMoeDsaConfig model)
  - [GlmOcrConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrConfig) configuration class: [GlmOcrModel](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrModel) (GlmOcrConfig model)
  - [GlmOcrTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrTextConfig) configuration class: [GlmOcrTextModel](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrTextModel) (GlmOcrTextConfig model)
  - [GlmOcrVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrVisionConfig) configuration class: [GlmOcrVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrVisionModel) (GlmOcrVisionConfig model)
  - [GotOcr2Config](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Config) configuration class: [GotOcr2Model](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Model) (GotOcr2Config model)
  - [GptOssConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssConfig) configuration class: [GptOssModel](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssModel) (GptOssConfig model)
  - [Granite4VisionConfig](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionConfig) configuration class: [Granite4VisionModel](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionModel) (Granite4VisionConfig model)
  - [GraniteConfig](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteConfig) configuration class: [GraniteModel](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteModel) (GraniteConfig model)
  - [GraniteMoeConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeConfig) configuration class: [GraniteMoeModel](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeModel) (GraniteMoeConfig model)
  - [GraniteMoeHybridConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridConfig) configuration class: [GraniteMoeHybridModel](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridModel) (GraniteMoeHybridConfig model)
  - [GraniteMoeSharedConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedConfig) configuration class: [GraniteMoeSharedModel](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedModel) (GraniteMoeSharedConfig model)
  - [GraniteSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechConfig) configuration class: [GraniteSpeechForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechForConditionalGeneration) (GraniteSpeechConfig model)
  - [GroundingDinoConfig](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoConfig) configuration class: [GroundingDinoModel](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoModel) (GroundingDinoConfig model)
  - [GroupViTConfig](/docs/transformers/v5.8.0/en/model_doc/groupvit#transformers.GroupViTConfig) configuration class: [GroupViTModel](/docs/transformers/v5.8.0/en/model_doc/groupvit#transformers.GroupViTModel) (GroupViTConfig model)
  - [HGNetV2Config](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2Config) configuration class: [HGNetV2Backbone](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2Backbone) (HGNetV2Config model)
  - [HYV3Config](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3Config) configuration class: [HYV3Model](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3Model) (HYV3Config model)
  - [HeliumConfig](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumConfig) configuration class: [HeliumModel](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumModel) (HeliumConfig model)
  - [HieraConfig](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraConfig) configuration class: [HieraModel](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraModel) (HieraConfig model)
  - [HiggsAudioV2Config](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2#transformers.HiggsAudioV2Config) configuration class: [HiggsAudioV2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2#transformers.HiggsAudioV2ForConditionalGeneration) (HiggsAudioV2Config model)
  - [HiggsAudioV2TokenizerConfig](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerConfig) configuration class: [HiggsAudioV2TokenizerModel](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerModel) (HiggsAudioV2TokenizerConfig model)
  - [HubertConfig](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertConfig) configuration class: [HubertModel](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertModel) (HubertConfig model)
  - [HunYuanDenseV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1Config) configuration class: [HunYuanDenseV1Model](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1Model) (HunYuanDenseV1Config model)
  - [HunYuanMoEV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1Config) configuration class: [HunYuanMoEV1Model](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1Model) (HunYuanMoEV1Config model)
  - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertModel](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertModel) (IBertConfig model)
  - [IJepaConfig](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaConfig) configuration class: [IJepaModel](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaModel) (IJepaConfig model)
  - [Idefics2Config](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Config) configuration class: [Idefics2Model](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Model) (Idefics2Config model)
  - [Idefics3Config](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Config) configuration class: [Idefics3Model](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Model) (Idefics3Config model)
  - [Idefics3VisionConfig](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3VisionConfig) configuration class: [Idefics3VisionTransformer](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3VisionTransformer) (Idefics3VisionConfig model)
  - [IdeficsConfig](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsConfig) configuration class: [IdeficsModel](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsModel) (IdeficsConfig model)
  - [ImageGPTConfig](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTConfig) configuration class: [ImageGPTModel](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTModel) (ImageGPTConfig model)
  - [InformerConfig](/docs/transformers/v5.8.0/en/model_doc/informer#transformers.InformerConfig) configuration class: [InformerModel](/docs/transformers/v5.8.0/en/model_doc/informer#transformers.InformerModel) (InformerConfig model)
  - [InstructBlipConfig](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipConfig) configuration class: [InstructBlipModel](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipModel) (InstructBlipConfig model)
  - [InstructBlipVideoConfig](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoConfig) configuration class: [InstructBlipVideoModel](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoModel) (InstructBlipVideoConfig model)
  - [InternVLConfig](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLConfig) configuration class: [InternVLModel](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLModel) (InternVLConfig model)
  - [InternVLVisionConfig](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLVisionConfig) configuration class: [InternVLVisionModel](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLVisionModel) (InternVLVisionConfig model)
  - [Jais2Config](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2Config) configuration class: [Jais2Model](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2Model) (Jais2Config model)
  - [JambaConfig](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaModel](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaModel) (JambaConfig model)
  - [JanusConfig](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusConfig) configuration class: [JanusModel](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusModel) (JanusConfig model)
  - [JetMoeConfig](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeConfig) configuration class: [JetMoeModel](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeModel) (JetMoeConfig model)
  - [JinaEmbeddingsV3Config](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Config) configuration class: [JinaEmbeddingsV3Model](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Model) (JinaEmbeddingsV3Config model)
  - [Kosmos2Config](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2Config) configuration class: [Kosmos2Model](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2Model) (Kosmos2Config model)
  - [Kosmos2_5Config](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5Config) configuration class: [Kosmos2_5Model](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5Model) (Kosmos2_5Config model)
  - [KyutaiSpeechToTextConfig](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextConfig) configuration class: [KyutaiSpeechToTextModel](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextModel) (KyutaiSpeechToTextConfig model)
  - [LEDConfig](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDConfig) configuration class: [LEDModel](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDModel) (LEDConfig model)
  - [LagunaConfig](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaConfig) configuration class: [LagunaModel](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaModel) (LagunaConfig model)
  - [LasrCTCConfig](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrCTCConfig) configuration class: [LasrForCTC](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrForCTC) (LasrCTCConfig model)
  - [LasrEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrEncoderConfig) configuration class: [LasrEncoder](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrEncoder) (LasrEncoderConfig model)
  - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMModel](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMModel) (LayoutLMConfig model)
  - [LayoutLMv2Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Config) configuration class: [LayoutLMv2Model](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Model) (LayoutLMv2Config model)
  - [LayoutLMv3Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Config) configuration class: [LayoutLMv3Model](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Model) (LayoutLMv3Config model)
  - [LevitConfig](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitConfig) configuration class: [LevitModel](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitModel) (LevitConfig model)
  - [Lfm2Config](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2Config) configuration class: [Lfm2Model](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2Model) (Lfm2Config model)
  - [Lfm2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeConfig) configuration class: [Lfm2MoeModel](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeModel) (Lfm2MoeConfig model)
  - [Lfm2VlConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlConfig) configuration class: [Lfm2VlModel](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlModel) (Lfm2VlConfig model)
  - [LightGlueConfig](/docs/transformers/v5.8.0/en/model_doc/lightglue#transformers.LightGlueConfig) configuration class: [LightGlueForKeypointMatching](/docs/transformers/v5.8.0/en/model_doc/lightglue#transformers.LightGlueForKeypointMatching) (LightGlueConfig model)
  - [LightOnOcrConfig](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrConfig) configuration class: [LightOnOcrModel](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrModel) (LightOnOcrConfig model)
  - [LiltConfig](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltConfig) configuration class: [LiltModel](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltModel) (LiltConfig model)
  - [Llama4Config](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4Config) configuration class: [Llama4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForConditionalGeneration) (Llama4Config model)
  - [Llama4TextConfig](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4TextConfig) configuration class: [Llama4TextModel](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4TextModel) (Llama4TextConfig model)
  - [LlamaConfig](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaModel](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaModel) (LlamaConfig model)
  - [LlavaConfig](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaConfig) configuration class: [LlavaModel](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaModel) (LlavaConfig model)
  - [LlavaNextConfig](/docs/transformers/v5.8.0/en/model_doc/llava_next#transformers.LlavaNextConfig) configuration class: [LlavaNextModel](/docs/transformers/v5.8.0/en/model_doc/llava_next#transformers.LlavaNextModel) (LlavaNextConfig model)
  - [LlavaNextVideoConfig](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoConfig) configuration class: [LlavaNextVideoModel](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoModel) (LlavaNextVideoConfig model)
  - [LlavaOnevisionConfig](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionConfig) configuration class: [LlavaOnevisionModel](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionModel) (LlavaOnevisionConfig model)
  - [LongT5Config](/docs/transformers/v5.8.0/en/model_doc/longt5#transformers.LongT5Config) configuration class: [LongT5Model](/docs/transformers/v5.8.0/en/model_doc/longt5#transformers.LongT5Model) (LongT5Config model)
  - [LongcatFlashConfig](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashConfig) configuration class: [LongcatFlashModel](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashModel) (LongcatFlashConfig model)
  - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerModel](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerModel) (LongformerConfig model)
  - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeModel](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeModel) (LukeConfig model)
  - [LwDetrConfig](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrConfig) configuration class: [LwDetrModel](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrModel) (LwDetrConfig model)
  - [LxmertConfig](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertConfig) configuration class: [LxmertModel](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertModel) (LxmertConfig model)
  - [M2M100Config](/docs/transformers/v5.8.0/en/model_doc/m2m_100#transformers.M2M100Config) configuration class: [M2M100Model](/docs/transformers/v5.8.0/en/model_doc/m2m_100#transformers.M2M100Model) (M2M100Config model)
  - [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) configuration class: [MBartModel](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartModel) (MBartConfig model)
  - [MLCDVisionConfig](/docs/transformers/v5.8.0/en/model_doc/mlcd#transformers.MLCDVisionConfig) configuration class: [MLCDVisionModel](/docs/transformers/v5.8.0/en/model_doc/mlcd#transformers.MLCDVisionModel) (MLCDVisionConfig model)
  - [MMGroundingDinoConfig](/docs/transformers/v5.8.0/en/model_doc/mm-grounding-dino#transformers.MMGroundingDinoConfig) configuration class: [MMGroundingDinoModel](/docs/transformers/v5.8.0/en/model_doc/mm-grounding-dino#transformers.MMGroundingDinoModel) (MMGroundingDinoConfig model)
  - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetModel](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetModel) (MPNetConfig model)
  - [MT5Config](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Config) configuration class: [MT5Model](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Model) (MT5Config model)
  - [Mamba2Config](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2Model](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2Model) (Mamba2Config model)
  - [MambaConfig](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaModel](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaModel) (MambaConfig model)
  - [MarianConfig](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianConfig) configuration class: [MarianModel](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianModel) (MarianConfig model)
  - [MarkupLMConfig](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMConfig) configuration class: [MarkupLMModel](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMModel) (MarkupLMConfig model)
  - [Mask2FormerConfig](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerConfig) configuration class: [Mask2FormerModel](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerModel) (Mask2FormerConfig model)
  - [MaskFormerConfig](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerConfig) configuration class: [MaskFormerModel](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerModel) (MaskFormerConfig model)
  - `MaskFormerSwinConfig` configuration class: `MaskFormerSwinModel` (MaskFormerSwinConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertModel](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertModel) (MegatronBertConfig model)
  - [MetaClip2Config](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Config) configuration class: [MetaClip2Model](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Model) (MetaClip2Config model)
  - [MgpstrConfig](/docs/transformers/v5.8.0/en/model_doc/mgp-str#transformers.MgpstrConfig) configuration class: [MgpstrForSceneTextRecognition](/docs/transformers/v5.8.0/en/model_doc/mgp-str#transformers.MgpstrForSceneTextRecognition) (MgpstrConfig model)
  - [MimiConfig](/docs/transformers/v5.8.0/en/model_doc/mimi#transformers.MimiConfig) configuration class: [MimiModel](/docs/transformers/v5.8.0/en/model_doc/mimi#transformers.MimiModel) (MimiConfig model)
  - [MiniCPMV4_6Config](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6Config) configuration class: [MiniCPMV4_6Model](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6Model) (MiniCPMV4_6Config model)
  - [MiniMaxConfig](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxConfig) configuration class: [MiniMaxModel](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxModel) (MiniMaxConfig model)
  - [MiniMaxM2Config](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2Config) configuration class: [MiniMaxM2Model](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2Model) (MiniMaxM2Config model)
  - [Ministral3Config](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Config) configuration class: [Ministral3Model](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Model) (Ministral3Config model)
  - [MinistralConfig](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralConfig) configuration class: [MinistralModel](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralModel) (MinistralConfig model)
  - [Mistral3Config](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Config) configuration class: [Mistral3Model](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Model) (Mistral3Config model)
  - [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: [Mistral4Model](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Model) (Mistral4Config model)
  - [MistralConfig](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralModel](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralModel) (MistralConfig model)
  - [MixtralConfig](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralConfig) configuration class: [MixtralModel](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralModel) (MixtralConfig model)
  - [MllamaConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaConfig) configuration class: [MllamaModel](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaModel) (MllamaConfig model)
  - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertModel](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertModel) (MobileBertConfig model)
  - [MobileNetV1Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1Config) configuration class: [MobileNetV1Model](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1Model) (MobileNetV1Config model)
  - [MobileNetV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2Config) configuration class: [MobileNetV2Model](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2Model) (MobileNetV2Config model)
  - [MobileViTConfig](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTConfig) configuration class: [MobileViTModel](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTModel) (MobileViTConfig model)
  - [MobileViTV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2Config) configuration class: [MobileViTV2Model](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2Model) (MobileViTV2Config model)
  - [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) configuration class: [ModernBertModel](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertModel) (ModernBertConfig model)
  - [ModernBertDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderConfig) configuration class: [ModernBertDecoderModel](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderModel) (ModernBertDecoderConfig model)
  - [ModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertConfig) configuration class: [ModernVBertModel](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertModel) (ModernVBertConfig model)
  - [MoonshineConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineConfig) configuration class: [MoonshineModel](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineModel) (MoonshineConfig model)
  - [MoonshineStreamingConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingConfig) configuration class: [MoonshineStreamingModel](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingModel) (MoonshineStreamingConfig model)
  - [MoshiConfig](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiConfig) configuration class: [MoshiModel](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiModel) (MoshiConfig model)
  - [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptModel](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptModel) (MptConfig model)
  - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraModel](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraModel) (MraConfig model)
  - [MusicFlamingoConfig](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoConfig) configuration class: [MusicFlamingoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoForConditionalGeneration) (MusicFlamingoConfig model)
  - [MusicgenConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenConfig) configuration class: [MusicgenModel](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenModel) (MusicgenConfig model)
  - [MusicgenMelodyConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyConfig) configuration class: [MusicgenMelodyModel](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyModel) (MusicgenMelodyConfig model)
  - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpModel](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpModel) (MvpConfig model)
  - [NanoChatConfig](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatConfig) configuration class: [NanoChatModel](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatModel) (NanoChatConfig model)
  - [NemotronConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronConfig) configuration class: [NemotronModel](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronModel) (NemotronConfig model)
  - [NemotronHConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHConfig) configuration class: [NemotronHModel](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHModel) (NemotronHConfig model)
  - [NllbMoeConfig](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeConfig) configuration class: [NllbMoeModel](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeModel) (NllbMoeConfig model)
  - [NomicBertConfig](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertConfig) configuration class: [NomicBertModel](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertModel) (NomicBertConfig model)
  - [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) configuration class: [NystromformerModel](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerModel) (NystromformerConfig model)
  - [OPTConfig](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTConfig) configuration class: [OPTModel](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTModel) (OPTConfig model)
  - [Olmo2Config](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2Config) configuration class: [Olmo2Model](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2Model) (Olmo2Config model)
  - [Olmo3Config](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3Config) configuration class: [Olmo3Model](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3Model) (Olmo3Config model)
  - [OlmoConfig](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoConfig) configuration class: [OlmoModel](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoModel) (OlmoConfig model)
  - [OlmoHybridConfig](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridConfig) configuration class: [OlmoHybridModel](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridModel) (OlmoHybridConfig model)
  - [OlmoeConfig](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeConfig) configuration class: [OlmoeModel](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeModel) (OlmoeConfig model)
  - [OmDetTurboConfig](/docs/transformers/v5.8.0/en/model_doc/omdet-turbo#transformers.OmDetTurboConfig) configuration class: [OmDetTurboForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/omdet-turbo#transformers.OmDetTurboForObjectDetection) (OmDetTurboConfig model)
  - [OneFormerConfig](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerConfig) configuration class: [OneFormerModel](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerModel) (OneFormerConfig model)
  - [OpenAIGPTConfig](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTModel](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTModel) (OpenAIGPTConfig model)
  - [OpenAIPrivacyFilterConfig](/docs/transformers/v5.8.0/en/model_doc/openai_privacy_filter#transformers.OpenAIPrivacyFilterConfig) configuration class: [OpenAIPrivacyFilterModel](/docs/transformers/v5.8.0/en/model_doc/openai_privacy_filter#transformers.OpenAIPrivacyFilterModel) (OpenAIPrivacyFilterConfig model)
  - [Ovis2Config](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2Config) configuration class: [Ovis2Model](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2Model) (Ovis2Config model)
  - [OwlViTConfig](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTConfig) configuration class: [OwlViTModel](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTModel) (OwlViTConfig model)
  - [Owlv2Config](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2Config) configuration class: [Owlv2Model](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2Model) (Owlv2Config model)
  - [PI0Config](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0Config) configuration class: [PI0Model](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0Model) (PI0Config model)
  - [PLBartConfig](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartConfig) configuration class: [PLBartModel](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartModel) (PLBartConfig model)
  - [PPDocLayoutV3Config](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3Config) configuration class: [PPDocLayoutV3Model](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3Model) (PPDocLayoutV3Config model)
  - [PPOCRV5MobileRecConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecConfig) configuration class: [PPOCRV5MobileRecModel](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecModel) (PPOCRV5MobileRecConfig model)
  - [PPOCRV5ServerRecConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecConfig) configuration class: [PPOCRV5ServerRecModel](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecModel) (PPOCRV5ServerRecConfig model)
  - [PaliGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: [PaliGemmaModel](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaModel) (PaliGemmaConfig model)
  - [ParakeetCTCConfig](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetCTCConfig) configuration class: [ParakeetForCTC](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetForCTC) (ParakeetCTCConfig model)
  - [ParakeetEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetEncoderConfig) configuration class: [ParakeetEncoder](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetEncoder) (ParakeetEncoderConfig model)
  - [PatchTSMixerConfig](/docs/transformers/v5.8.0/en/model_doc/patchtsmixer#transformers.PatchTSMixerConfig) configuration class: [PatchTSMixerModel](/docs/transformers/v5.8.0/en/model_doc/patchtsmixer#transformers.PatchTSMixerModel) (PatchTSMixerConfig model)
  - [PatchTSTConfig](/docs/transformers/v5.8.0/en/model_doc/patchtst#transformers.PatchTSTConfig) configuration class: [PatchTSTModel](/docs/transformers/v5.8.0/en/model_doc/patchtst#transformers.PatchTSTModel) (PatchTSTConfig model)
  - [PeAudioConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioConfig) configuration class: [PeAudioModel](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioModel) (PeAudioConfig model)
  - [PeAudioEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioEncoderConfig) configuration class: [PeAudioEncoder](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioEncoder) (PeAudioEncoderConfig model)
  - [PeAudioVideoConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoConfig) configuration class: [PeAudioVideoModel](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoModel) (PeAudioVideoConfig model)
  - [PeAudioVideoEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoEncoderConfig) configuration class: [PeAudioVideoEncoder](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoEncoder) (PeAudioVideoEncoderConfig model)
  - [PeVideoConfig](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoConfig) configuration class: [PeVideoModel](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoModel) (PeVideoConfig model)
  - [PeVideoEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoEncoderConfig) configuration class: [PeVideoEncoder](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoEncoder) (PeVideoEncoderConfig model)
  - [PegasusConfig](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusConfig) configuration class: [PegasusModel](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusModel) (PegasusConfig model)
  - [PegasusXConfig](/docs/transformers/v5.8.0/en/model_doc/pegasus_x#transformers.PegasusXConfig) configuration class: [PegasusXModel](/docs/transformers/v5.8.0/en/model_doc/pegasus_x#transformers.PegasusXModel) (PegasusXConfig model)
  - [PerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverConfig) configuration class: [PerceiverModel](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverModel) (PerceiverConfig model)
  - [PerceptionLMConfig](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMConfig) configuration class: [PerceptionLMModel](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMModel) (PerceptionLMConfig model)
  - [PersimmonConfig](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonConfig) configuration class: [PersimmonModel](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonModel) (PersimmonConfig model)
  - [Phi3Config](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Config) configuration class: [Phi3Model](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Model) (Phi3Config model)
  - [Phi4MultimodalConfig](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalConfig) configuration class: [Phi4MultimodalModel](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalModel) (Phi4MultimodalConfig model)
  - [PhiConfig](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiConfig) configuration class: [PhiModel](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiModel) (PhiConfig model)
  - [PhimoeConfig](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeConfig) configuration class: [PhimoeModel](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeModel) (PhimoeConfig model)
  - [PixioConfig](/docs/transformers/v5.8.0/en/model_doc/pixio#transformers.PixioConfig) configuration class: [PixioModel](/docs/transformers/v5.8.0/en/model_doc/pixio#transformers.PixioModel) (PixioConfig model)
  - [PixtralVisionConfig](/docs/transformers/v5.8.0/en/model_doc/pixtral#transformers.PixtralVisionConfig) configuration class: [PixtralVisionModel](/docs/transformers/v5.8.0/en/model_doc/pixtral#transformers.PixtralVisionModel) (PixtralVisionConfig model)
  - [PoolFormerConfig](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerConfig) configuration class: [PoolFormerModel](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerModel) (PoolFormerConfig model)
  - [ProphetNetConfig](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetConfig) configuration class: [ProphetNetModel](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetModel) (ProphetNetConfig model)
  - [PvtConfig](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtConfig) configuration class: [PvtModel](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtModel) (PvtConfig model)
  - [PvtV2Config](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2Config) configuration class: [PvtV2Model](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2Model) (PvtV2Config model)
  - [QianfanOCRConfig](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRConfig) configuration class: [QianfanOCRModel](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRModel) (QianfanOCRConfig model)
  - [QianfanOCRVisionConfig](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRVisionConfig) configuration class: [QianfanOCRVisionModel](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRVisionModel) (QianfanOCRVisionConfig model)
  - [Qwen2AudioEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioEncoderConfig) configuration class: [Qwen2AudioEncoder](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioEncoder) (Qwen2AudioEncoderConfig model)
  - [Qwen2Config](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Config) configuration class: [Qwen2Model](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Model) (Qwen2Config model)
  - [Qwen2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeConfig) configuration class: [Qwen2MoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeModel) (Qwen2MoeConfig model)
  - [Qwen2VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLConfig) configuration class: [Qwen2VLModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLModel) (Qwen2VLConfig model)
  - [Qwen2VLTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLTextConfig) configuration class: [Qwen2VLTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLTextModel) (Qwen2VLTextConfig model)
  - [Qwen2_5_VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLConfig) configuration class: [Qwen2_5_VLModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLModel) (Qwen2_5_VLConfig model)
  - [Qwen2_5_VLTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLTextConfig) configuration class: [Qwen2_5_VLTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLTextModel) (Qwen2_5_VLTextConfig model)
  - [Qwen3Config](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Config) configuration class: [Qwen3Model](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Model) (Qwen3Config model)
  - [Qwen3MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeConfig) configuration class: [Qwen3MoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeModel) (Qwen3MoeConfig model)
  - [Qwen3NextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextConfig) configuration class: [Qwen3NextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextModel) (Qwen3NextConfig model)
  - [Qwen3VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLConfig) configuration class: [Qwen3VLModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLModel) (Qwen3VLConfig model)
  - [Qwen3VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeConfig) configuration class: [Qwen3VLMoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeModel) (Qwen3VLMoeConfig model)
  - [Qwen3VLMoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeTextConfig) configuration class: [Qwen3VLMoeTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeTextModel) (Qwen3VLMoeTextConfig model)
  - [Qwen3VLTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLTextConfig) configuration class: [Qwen3VLTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLTextModel) (Qwen3VLTextConfig model)
  - [Qwen3_5Config](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Config) configuration class: [Qwen3_5Model](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Model) (Qwen3_5Config model)
  - [Qwen3_5MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeConfig) configuration class: [Qwen3_5MoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeModel) (Qwen3_5MoeConfig model)
  - [Qwen3_5MoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeTextConfig) configuration class: [Qwen3_5MoeTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeTextModel) (Qwen3_5MoeTextConfig model)
  - [Qwen3_5TextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5TextConfig) configuration class: [Qwen3_5TextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5TextModel) (Qwen3_5TextConfig model)
  - [RTDetrConfig](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrConfig) configuration class: [RTDetrModel](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrModel) (RTDetrConfig model)
  - [RTDetrV2Config](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2Config) configuration class: [RTDetrV2Model](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2Model) (RTDetrV2Config model)
  - [RecurrentGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaConfig) configuration class: [RecurrentGemmaModel](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaModel) (RecurrentGemmaConfig model)
  - [ReformerConfig](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerConfig) configuration class: [ReformerModel](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerModel) (ReformerConfig model)
  - [RegNetConfig](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetConfig) configuration class: [RegNetModel](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetModel) (RegNetConfig model)
  - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertModel](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertModel) (RemBertConfig model)
  - [ResNetConfig](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetConfig) configuration class: [ResNetModel](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetModel) (ResNetConfig model)
  - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertModel](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertModel) (RoCBertConfig model)
  - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerModel](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerModel) (RoFormerConfig model)
  - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaModel](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaModel) (RobertaConfig model)
  - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormModel](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormModel) (RobertaPreLayerNormConfig model)
  - [RwkvConfig](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvConfig) configuration class: [RwkvModel](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvModel) (RwkvConfig model)
  - [SEWConfig](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWConfig) configuration class: [SEWModel](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWModel) (SEWConfig model)
  - [SEWDConfig](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDConfig) configuration class: [SEWDModel](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDModel) (SEWDConfig model)
  - [Sam2Config](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2Config) configuration class: [Sam2Model](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2Model) (Sam2Config model)
  - [Sam2HieraDetConfig](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2HieraDetConfig) configuration class: [Sam2HieraDetModel](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2HieraDetModel) (Sam2HieraDetConfig model)
  - [Sam2VideoConfig](/docs/transformers/v5.8.0/en/model_doc/sam2_video#transformers.Sam2VideoConfig) configuration class: [Sam2VideoModel](/docs/transformers/v5.8.0/en/model_doc/sam2_video#transformers.Sam2VideoModel) (Sam2VideoConfig model)
  - [Sam2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2VisionConfig) configuration class: [Sam2VisionModel](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2VisionModel) (Sam2VisionConfig model)
  - [Sam3Config](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3Config) configuration class: [Sam3Model](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3Model) (Sam3Config model)
  - [Sam3LiteTextConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextConfig) configuration class: [Sam3LiteTextModel](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextModel) (Sam3LiteTextConfig model)
  - [Sam3LiteTextTextConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextTextConfig) configuration class: [Sam3LiteTextTextModel](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextTextModel) (Sam3LiteTextTextConfig model)
  - [Sam3TrackerConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker#transformers.Sam3TrackerConfig) configuration class: [Sam3TrackerModel](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker#transformers.Sam3TrackerModel) (Sam3TrackerConfig model)
  - [Sam3TrackerVideoConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker_video#transformers.Sam3TrackerVideoConfig) configuration class: [Sam3TrackerVideoModel](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker_video#transformers.Sam3TrackerVideoModel) (Sam3TrackerVideoConfig model)
  - [Sam3ViTConfig](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3ViTConfig) configuration class: [Sam3ViTModel](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3ViTModel) (Sam3ViTConfig model)
  - [Sam3VideoConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_video#transformers.Sam3VideoConfig) configuration class: [Sam3VideoModel](/docs/transformers/v5.8.0/en/model_doc/sam3_video#transformers.Sam3VideoModel) (Sam3VideoConfig model)
  - [Sam3VisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3VisionConfig) configuration class: [Sam3VisionModel](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3VisionModel) (Sam3VisionConfig model)
  - [SamConfig](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamConfig) configuration class: [SamModel](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamModel) (SamConfig model)
  - [SamHQConfig](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQConfig) configuration class: [SamHQModel](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQModel) (SamHQConfig model)
  - [SamHQVisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQVisionConfig) configuration class: [SamHQVisionModel](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQVisionModel) (SamHQVisionConfig model)
  - [SamVisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamVisionConfig) configuration class: [SamVisionModel](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamVisionModel) (SamVisionConfig model)
  - [SeamlessM4TConfig](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TConfig) configuration class: [SeamlessM4TModel](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TModel) (SeamlessM4TConfig model)
  - [SeamlessM4Tv2Config](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2Config) configuration class: [SeamlessM4Tv2Model](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2Model) (SeamlessM4Tv2Config model)
  - [SeedOssConfig](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssConfig) configuration class: [SeedOssModel](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssModel) (SeedOssConfig model)
  - [SegGptConfig](/docs/transformers/v5.8.0/en/model_doc/seggpt#transformers.SegGptConfig) configuration class: [SegGptModel](/docs/transformers/v5.8.0/en/model_doc/seggpt#transformers.SegGptModel) (SegGptConfig model)
  - [SegformerConfig](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerConfig) configuration class: [SegformerModel](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerModel) (SegformerConfig model)
  - [Siglip2Config](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Config) configuration class: [Siglip2Model](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Model) (Siglip2Config model)
  - [Siglip2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2VisionConfig) configuration class: [Siglip2VisionModel](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2VisionModel) (Siglip2VisionConfig model)
  - [SiglipConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipModel](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipModel) (SiglipConfig model)
  - [SiglipVisionConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipVisionConfig) configuration class: [SiglipVisionModel](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipVisionModel) (SiglipVisionConfig model)
  - [SmolLM3Config](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Config) configuration class: [SmolLM3Model](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Model) (SmolLM3Config model)
  - [SmolVLMConfig](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMConfig) configuration class: [SmolVLMModel](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMModel) (SmolVLMConfig model)
  - [SmolVLMVisionConfig](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMVisionConfig) configuration class: [SmolVLMVisionTransformer](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMVisionTransformer) (SmolVLMVisionConfig model)
  - [SolarOpenConfig](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenConfig) configuration class: [SolarOpenModel](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenModel) (SolarOpenConfig model)
  - [Speech2TextConfig](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextConfig) configuration class: [Speech2TextModel](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextModel) (Speech2TextConfig model)
  - [SpeechT5Config](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5Config) configuration class: [SpeechT5Model](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5Model) (SpeechT5Config model)
  - [SplinterConfig](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterConfig) configuration class: [SplinterModel](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterModel) (SplinterConfig model)
  - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertModel](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertModel) (SqueezeBertConfig model)
  - [StableLmConfig](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmConfig) configuration class: [StableLmModel](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmModel) (StableLmConfig model)
  - [Starcoder2Config](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Config) configuration class: [Starcoder2Model](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Model) (Starcoder2Config model)
  - [SwiftFormerConfig](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerConfig) configuration class: [SwiftFormerModel](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerModel) (SwiftFormerConfig model)
  - [Swin2SRConfig](/docs/transformers/v5.8.0/en/model_doc/swin2sr#transformers.Swin2SRConfig) configuration class: [Swin2SRModel](/docs/transformers/v5.8.0/en/model_doc/swin2sr#transformers.Swin2SRModel) (Swin2SRConfig model)
  - [SwinConfig](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinConfig) configuration class: [SwinModel](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinModel) (SwinConfig model)
  - [Swinv2Config](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2Model](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2Model) (Swinv2Config model)
  - [SwitchTransformersConfig](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersConfig) configuration class: [SwitchTransformersModel](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersModel) (SwitchTransformersConfig model)
  - [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) configuration class: [T5Model](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Model) (T5Config model)
  - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2Model](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Model) (T5Gemma2Config model)
  - [T5Gemma2EncoderConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2EncoderConfig) configuration class: `T5Gemma2Encoder` (T5Gemma2EncoderConfig model)
  - [T5GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaConfig) configuration class: [T5GemmaModel](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaModel) (T5GemmaConfig model)
  - [TableTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerConfig) configuration class: [TableTransformerModel](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerModel) (TableTransformerConfig model)
  - [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) configuration class: [TapasModel](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasModel) (TapasConfig model)
  - [TextNetConfig](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetConfig) configuration class: [TextNetModel](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetModel) (TextNetConfig model)
  - [TimeSeriesTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/time_series_transformer#transformers.TimeSeriesTransformerConfig) configuration class: [TimeSeriesTransformerModel](/docs/transformers/v5.8.0/en/model_doc/time_series_transformer#transformers.TimeSeriesTransformerModel) (TimeSeriesTransformerConfig model)
  - [TimesFm2_5Config](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5Config) configuration class: [TimesFm2_5Model](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5Model) (TimesFm2_5Config model)
  - [TimesFmConfig](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmConfig) configuration class: [TimesFmModel](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmModel) (TimesFmConfig model)
  - [TimesformerConfig](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerConfig) configuration class: [TimesformerModel](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerModel) (TimesformerConfig model)
  - [TimmBackboneConfig](/docs/transformers/v5.8.0/en/main_classes/backbones#transformers.TimmBackboneConfig) configuration class: [TimmBackbone](/docs/transformers/v5.8.0/en/main_classes/backbones#transformers.TimmBackbone) (TimmBackboneConfig model)
  - [TimmWrapperConfig](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperConfig) configuration class: [TimmWrapperModel](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperModel) (TimmWrapperConfig model)
  - [TvpConfig](/docs/transformers/v5.8.0/en/model_doc/tvp#transformers.TvpConfig) configuration class: [TvpModel](/docs/transformers/v5.8.0/en/model_doc/tvp#transformers.TvpModel) (TvpConfig model)
  - [UMT5Config](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Config) configuration class: [UMT5Model](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Model) (UMT5Config model)
  - [UVDocConfig](/docs/transformers/v5.8.0/en/model_doc/uvdoc#transformers.UVDocConfig) configuration class: [UVDocModel](/docs/transformers/v5.8.0/en/model_doc/uvdoc#transformers.UVDocModel) (UVDocConfig model)
  - [UdopConfig](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopConfig) configuration class: [UdopModel](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopModel) (UdopConfig model)
  - [UniSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechConfig) configuration class: [UniSpeechModel](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechModel) (UniSpeechConfig model)
  - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: [UniSpeechSatModel](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatModel) (UniSpeechSatConfig model)
  - [UnivNetConfig](/docs/transformers/v5.8.0/en/model_doc/univnet#transformers.UnivNetConfig) configuration class: [UnivNetModel](/docs/transformers/v5.8.0/en/model_doc/univnet#transformers.UnivNetModel) (UnivNetConfig model)
  - [VJEPA2Config](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2Config) configuration class: [VJEPA2Model](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2Model) (VJEPA2Config model)
  - [VaultGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaConfig) configuration class: [VaultGemmaModel](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaModel) (VaultGemmaConfig model)
  - [ViTConfig](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTConfig) configuration class: [ViTModel](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTModel) (ViTConfig model)
  - [ViTMAEConfig](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEConfig) configuration class: [ViTMAEModel](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEModel) (ViTMAEConfig model)
  - [ViTMSNConfig](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNConfig) configuration class: [ViTMSNModel](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNModel) (ViTMSNConfig model)
  - [VibeVoiceAcousticTokenizerConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerConfig) configuration class: [VibeVoiceAcousticTokenizerModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerModel) (VibeVoiceAcousticTokenizerConfig model)
  - [VibeVoiceAcousticTokenizerDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerDecoderConfig) configuration class: [VibeVoiceAcousticTokenizerDecoderModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerDecoderModel) (VibeVoiceAcousticTokenizerDecoderConfig model)
  - [VibeVoiceAcousticTokenizerEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerEncoderConfig) configuration class: [VibeVoiceAcousticTokenizerEncoderModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerEncoderModel) (VibeVoiceAcousticTokenizerEncoderConfig model)
  - [VibeVoiceAsrConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrConfig) configuration class: [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
  - [VideoLlama3Config](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3Config) configuration class: [VideoLlama3Model](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3Model) (VideoLlama3Config model)
  - [VideoLlama3VisionConfig](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3VisionConfig) configuration class: [VideoLlama3VisionModel](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3VisionModel) (VideoLlama3VisionConfig model)
  - [VideoLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaConfig) configuration class: [VideoLlavaModel](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaModel) (VideoLlavaConfig model)
  - [VideoMAEConfig](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEConfig) configuration class: [VideoMAEModel](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEModel) (VideoMAEConfig model)
  - [ViltConfig](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltConfig) configuration class: [ViltModel](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltModel) (ViltConfig model)
  - [VipLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaConfig) configuration class: [VipLlavaModel](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaModel) (VipLlavaConfig model)
  - [VisionTextDualEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/vision-text-dual-encoder#transformers.VisionTextDualEncoderConfig) configuration class: [VisionTextDualEncoderModel](/docs/transformers/v5.8.0/en/model_doc/vision-text-dual-encoder#transformers.VisionTextDualEncoderModel) (VisionTextDualEncoderConfig model)
  - [VisualBertConfig](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertConfig) configuration class: [VisualBertModel](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertModel) (VisualBertConfig model)
  - [VitDetConfig](/docs/transformers/v5.8.0/en/model_doc/vitdet#transformers.VitDetConfig) configuration class: [VitDetModel](/docs/transformers/v5.8.0/en/model_doc/vitdet#transformers.VitDetModel) (VitDetConfig model)
  - [VitsConfig](/docs/transformers/v5.8.0/en/model_doc/vits#transformers.VitsConfig) configuration class: [VitsModel](/docs/transformers/v5.8.0/en/model_doc/vits#transformers.VitsModel) (VitsConfig model)
  - [VivitConfig](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitConfig) configuration class: [VivitModel](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitModel) (VivitConfig model)
  - [VoxtralConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralConfig) configuration class: [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
  - [VoxtralEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralEncoderConfig) configuration class: [VoxtralEncoder](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralEncoder) (VoxtralEncoderConfig model)
  - [VoxtralRealtimeConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeConfig) configuration class: [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
  - [VoxtralRealtimeEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeEncoderConfig) configuration class: [VoxtralRealtimeEncoder](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeEncoder) (VoxtralRealtimeEncoderConfig model)
  - [VoxtralRealtimeTextConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeTextConfig) configuration class: `VoxtralRealtimeTextModel` (VoxtralRealtimeTextConfig model)
  - [Wav2Vec2BertConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertConfig) configuration class: [Wav2Vec2BertModel](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertModel) (Wav2Vec2BertConfig model)
  - [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2Model](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Model) (Wav2Vec2Config model)
  - [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerModel](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerModel) (Wav2Vec2ConformerConfig model)
  - [WavLMConfig](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMConfig) configuration class: [WavLMModel](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMModel) (WavLMConfig model)
  - [WhisperConfig](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperModel](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperModel) (WhisperConfig model)
  - [XCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/xclip#transformers.XCLIPConfig) configuration class: [XCLIPModel](/docs/transformers/v5.8.0/en/model_doc/xclip#transformers.XCLIPModel) (XCLIPConfig model)
  - [XGLMConfig](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMConfig) configuration class: [XGLMModel](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMModel) (XGLMConfig model)
  - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMModel) (XLMConfig model)
  - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaModel](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaModel) (XLMRobertaConfig model)
  - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLModel](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLModel) (XLMRobertaXLConfig model)
  - [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetModel](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetModel) (XLNetConfig model)
  - [XcodecConfig](/docs/transformers/v5.8.0/en/model_doc/xcodec#transformers.XcodecConfig) configuration class: [XcodecModel](/docs/transformers/v5.8.0/en/model_doc/xcodec#transformers.XcodecModel) (XcodecConfig model)
  - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodModel](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodModel) (XmodConfig model)
  - [YolosConfig](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosConfig) configuration class: [YolosModel](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosModel) (YolosConfig model)
  - [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) configuration class: [YosoModel](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoModel) (YosoConfig model)
  - [YoutuConfig](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuConfig) configuration class: [YoutuModel](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuModel) (YoutuConfig model)
  - [Zamba2Config](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2Config) configuration class: [Zamba2Model](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2Model) (Zamba2Config model)
  - [ZambaConfig](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaConfig) configuration class: [ZambaModel](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaModel) (ZambaConfig model)
  - [xLSTMConfig](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMConfig) configuration class: [xLSTMModel](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMModel) (xLSTMConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, SDPA is used when available (torch>=2.1.1); otherwise the manual `"eager"` implementation is the default.

Instantiates one of the base model classes of the library from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModel.from_config(config)
```

**Parameters:**

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [ASTConfig](/docs/transformers/v5.8.0/en/model_doc/audio-spectrogram-transformer#transformers.ASTConfig) configuration class: [ASTModel](/docs/transformers/v5.8.0/en/model_doc/audio-spectrogram-transformer#transformers.ASTModel) (ASTConfig model)
  - [AfmoeConfig](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeConfig) configuration class: [AfmoeModel](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeModel) (AfmoeConfig model)
  - [Aimv2Config](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2Config) configuration class: [Aimv2Model](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2Model) (Aimv2Config model)
  - [Aimv2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2VisionConfig) configuration class: [Aimv2VisionModel](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2VisionModel) (Aimv2VisionConfig model)
  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertModel` (AlbertConfig model)
  - [AlignConfig](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignConfig) configuration class: [AlignModel](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignModel) (AlignConfig model)
  - [AltCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPModel) (AltCLIPConfig model)
  - [ApertusConfig](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusConfig) configuration class: [ApertusModel](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusModel) (ApertusConfig model)
  - [ArceeConfig](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeConfig) configuration class: [ArceeModel](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeModel) (ArceeConfig model)
  - [AriaConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaConfig) configuration class: [AriaModel](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaModel) (AriaConfig model)
  - [AriaTextConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextConfig) configuration class: [AriaTextModel](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextModel) (AriaTextConfig model)
  - [AudioFlamingo3Config](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3Config) configuration class: [AudioFlamingo3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3ForConditionalGeneration) (AudioFlamingo3Config model)
  - [AudioFlamingo3EncoderConfig](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3EncoderConfig) configuration class: [AudioFlamingo3Encoder](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3Encoder) (AudioFlamingo3EncoderConfig model)
  - [AutoformerConfig](/docs/transformers/v5.8.0/en/model_doc/autoformer#transformers.AutoformerConfig) configuration class: [AutoformerModel](/docs/transformers/v5.8.0/en/model_doc/autoformer#transformers.AutoformerModel) (AutoformerConfig model)
  - [AyaVisionConfig](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionConfig) configuration class: [AyaVisionModel](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionModel) (AyaVisionConfig model)
  - [BambaConfig](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaConfig) configuration class: [BambaModel](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaModel) (BambaConfig model)
  - [BarkConfig](/docs/transformers/v5.8.0/en/model_doc/bark#transformers.BarkConfig) configuration class: [BarkModel](/docs/transformers/v5.8.0/en/model_doc/bark#transformers.BarkModel) (BarkConfig model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartModel](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartModel) (BartConfig model)
  - [BeitConfig](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitConfig) configuration class: [BeitModel](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitModel) (BeitConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertModel](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertModel) (BertConfig model)
  - [BertGenerationConfig](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationConfig) configuration class: [BertGenerationEncoder](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationEncoder) (BertGenerationConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdModel](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdModel) (BigBirdConfig model)
  - [BigBirdPegasusConfig](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusModel](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusModel) (BigBirdPegasusConfig model)
  - [BioGptConfig](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptModel](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptModel) (BioGptConfig model)
  - [BitConfig](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitConfig) configuration class: [BitModel](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitModel) (BitConfig model)
  - [BitNetConfig](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetConfig) configuration class: [BitNetModel](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetModel) (BitNetConfig model)
  - [BlenderbotConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [BlenderbotModel](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotModel) (BlenderbotConfig model)
  - [BlenderbotSmallConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [BlenderbotSmallModel](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallModel) (BlenderbotSmallConfig model)
  - [Blip2Config](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2Model](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Model) (Blip2Config model)
  - [Blip2QFormerConfig](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2QFormerConfig) configuration class: [Blip2QFormerModel](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2QFormerModel) (Blip2QFormerConfig model)
  - [BlipConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipModel) (BlipConfig model)
  - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomModel](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomModel) (BloomConfig model)
  - [BltConfig](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltConfig) configuration class: [BltModel](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltModel) (BltConfig model)
  - [BridgeTowerConfig](/docs/transformers/v5.8.0/en/model_doc/bridgetower#transformers.BridgeTowerConfig) configuration class: [BridgeTowerModel](/docs/transformers/v5.8.0/en/model_doc/bridgetower#transformers.BridgeTowerModel) (BridgeTowerConfig model)
  - [BrosConfig](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosConfig) configuration class: [BrosModel](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosModel) (BrosConfig model)
  - [CLIPConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPModel) (CLIPConfig model)
  - [CLIPSegConfig](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSegConfig model)
  - [CLIPTextConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTextConfig) configuration class: [CLIPTextModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTextModel) (CLIPTextConfig model)
  - [CLIPVisionConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPVisionConfig) configuration class: [CLIPVisionModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPVisionModel) (CLIPVisionConfig model)
  - [CTRLConfig](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLModel](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLModel) (CTRLConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertModel](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertModel) (CamembertConfig model)
  - [CanineConfig](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineConfig) configuration class: [CanineModel](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineModel) (CanineConfig model)
  - [ChameleonConfig](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonConfig) configuration class: [ChameleonModel](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonModel) (ChameleonConfig model)
  - [ChineseCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPConfig) configuration class: [ChineseCLIPModel](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPModel) (ChineseCLIPConfig model)
  - [ChineseCLIPVisionConfig](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPVisionConfig) configuration class: [ChineseCLIPVisionModel](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPVisionModel) (ChineseCLIPVisionConfig model)
  - [ClapConfig](/docs/transformers/v5.8.0/en/model_doc/clap#transformers.ClapConfig) configuration class: [ClapModel](/docs/transformers/v5.8.0/en/model_doc/clap#transformers.ClapModel) (ClapConfig model)
  - [ClvpConfig](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpConfig) configuration class: [ClvpModelForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpModelForConditionalGeneration) (ClvpConfig model)
  - [CodeGenConfig](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenModel](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenModel) (CodeGenConfig model)
  - [Cohere2Config](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2Config) configuration class: [Cohere2Model](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2Model) (Cohere2Config model)
  - [Cohere2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionConfig) configuration class: [Cohere2VisionModel](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionModel) (Cohere2VisionConfig model)
  - [CohereAsrConfig](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrConfig) configuration class: [CohereAsrModel](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrModel) (CohereAsrConfig model)
  - [CohereConfig](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereConfig) configuration class: [CohereModel](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereModel) (CohereConfig model)
  - [ConditionalDetrConfig](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrConfig) configuration class: [ConditionalDetrModel](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrModel) (ConditionalDetrConfig model)
  - [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertModel](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertModel) (ConvBertConfig model)
  - [ConvNextConfig](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextConfig) configuration class: [ConvNextModel](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextModel) (ConvNextConfig model)
  - [ConvNextV2Config](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2Config) configuration class: [ConvNextV2Model](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2Model) (ConvNextV2Config model)
  - [CpmAntConfig](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntConfig) configuration class: [CpmAntModel](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntModel) (CpmAntConfig model)
  - [CsmConfig](/docs/transformers/v5.8.0/en/model_doc/csm#transformers.CsmConfig) configuration class: [CsmForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/csm#transformers.CsmForConditionalGeneration) (CsmConfig model)
  - [CvtConfig](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtConfig) configuration class: [CvtModel](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtModel) (CvtConfig model)
  - 
[CwmConfig](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmConfig) configuration class: [CwmModel](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmModel) (CwmConfig model) - [DFineConfig](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineConfig) configuration class: [DFineModel](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineModel) (DFineConfig model) - [DINOv3ConvNextConfig](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ConvNextConfig) configuration class: [DINOv3ConvNextModel](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ConvNextModel) (DINOv3ConvNextConfig model) - [DINOv3ViTConfig](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ViTConfig) configuration class: [DINOv3ViTModel](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ViTModel) (DINOv3ViTConfig model) - [DPRConfig](/docs/transformers/v5.8.0/en/model_doc/dpr#transformers.DPRConfig) configuration class: [DPRQuestionEncoder](/docs/transformers/v5.8.0/en/model_doc/dpr#transformers.DPRQuestionEncoder) (DPRConfig model) - [DPTConfig](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTConfig) configuration class: [DPTModel](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTModel) (DPTConfig model) - [DabDetrConfig](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrConfig) configuration class: [DabDetrModel](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrModel) (DabDetrConfig model) - [DacConfig](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacConfig) configuration class: [DacModel](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacModel) (DacConfig model) - [Data2VecAudioConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioModel](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioModel) (Data2VecAudioConfig model) - 
[Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextModel](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextModel) (Data2VecTextConfig model) - [Data2VecVisionConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionModel](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionModel) (Data2VecVisionConfig model) - [DbrxConfig](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxConfig) configuration class: [DbrxModel](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxModel) (DbrxConfig model) - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaModel](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaModel) (DebertaConfig model) - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2Model](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Model) (DebertaV2Config model) - [DecisionTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/decision_transformer#transformers.DecisionTransformerConfig) configuration class: [DecisionTransformerModel](/docs/transformers/v5.8.0/en/model_doc/decision_transformer#transformers.DecisionTransformerModel) (DecisionTransformerConfig model) - [DeepseekV2Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2Config) configuration class: [DeepseekV2Model](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2Model) (DeepseekV2Config model) - [DeepseekV3Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3Model](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Model) (DeepseekV3Config model) - 
[DeepseekV4Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4Config) configuration class: [DeepseekV4Model](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4Model) (DeepseekV4Config model) - [DeepseekVLConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLConfig) configuration class: [DeepseekVLModel](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLModel) (DeepseekVLConfig model) - [DeepseekVLHybridConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridConfig) configuration class: [DeepseekVLHybridModel](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridModel) (DeepseekVLHybridConfig model) - [DeformableDetrConfig](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrConfig) configuration class: [DeformableDetrModel](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrModel) (DeformableDetrConfig model) - [DeiTConfig](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTModel](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTModel) (DeiTConfig model) - [Deimv2Config](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2Config) configuration class: [Deimv2Model](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2Model) (Deimv2Config model) - [DepthProConfig](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProConfig) configuration class: [DepthProModel](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProModel) (DepthProConfig model) - [DetrConfig](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrConfig) configuration class: [DetrModel](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrModel) (DetrConfig model) - 
[DiaConfig](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaConfig) configuration class: [DiaModel](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaModel) (DiaConfig model) - [DiffLlamaConfig](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaConfig) configuration class: [DiffLlamaModel](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaModel) (DiffLlamaConfig model) - [DinatConfig](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatConfig) configuration class: [DinatModel](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatModel) (DinatConfig model) - [Dinov2Config](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2Config) configuration class: [Dinov2Model](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2Model) (Dinov2Config model) - [Dinov2WithRegistersConfig](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersConfig) configuration class: [Dinov2WithRegistersModel](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersModel) (Dinov2WithRegistersConfig model) - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertModel](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertModel) (DistilBertConfig model) - [DogeConfig](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeConfig) configuration class: [DogeModel](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeModel) (DogeConfig model) - [DonutSwinConfig](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinConfig) configuration class: [DonutSwinModel](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinModel) (DonutSwinConfig model) - [Dots1Config](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1Config) configuration class: 
[Dots1Model](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1Model) (Dots1Config model) - [EdgeTamConfig](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamConfig) configuration class: [EdgeTamModel](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamModel) (EdgeTamConfig model) - [EdgeTamVideoConfig](/docs/transformers/v5.8.0/en/model_doc/edgetam_video#transformers.EdgeTamVideoConfig) configuration class: [EdgeTamVideoModel](/docs/transformers/v5.8.0/en/model_doc/edgetam_video#transformers.EdgeTamVideoModel) (EdgeTamVideoConfig model) - [EdgeTamVisionConfig](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamVisionConfig) configuration class: [EdgeTamVisionModel](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamVisionModel) (EdgeTamVisionConfig model) - [EfficientLoFTRConfig](/docs/transformers/v5.8.0/en/model_doc/efficientloftr#transformers.EfficientLoFTRConfig) configuration class: [EfficientLoFTRModel](/docs/transformers/v5.8.0/en/model_doc/efficientloftr#transformers.EfficientLoFTRModel) (EfficientLoFTRConfig model) - [EfficientNetConfig](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetConfig) configuration class: [EfficientNetModel](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetModel) (EfficientNetConfig model) - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraModel](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraModel) (ElectraConfig model) - [Emu3Config](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Config) configuration class: [Emu3Model](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Model) (Emu3Config model) - [EncodecConfig](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecConfig) configuration class: 
[EncodecModel](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecModel) (EncodecConfig model) - [Ernie4_5Config](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5Config) configuration class: [Ernie4_5Model](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5Model) (Ernie4_5Config model) - [Ernie4_5_MoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeConfig) configuration class: [Ernie4_5_MoeModel](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeModel) (Ernie4_5_MoeConfig model) - [Ernie4_5_VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeConfig) configuration class: [Ernie4_5_VLMoeModel](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeModel) (Ernie4_5_VLMoeConfig model) - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieModel](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieModel) (ErnieConfig model) - [EsmConfig](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmConfig) configuration class: [EsmModel](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmModel) (EsmConfig model) - [EuroBertConfig](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertConfig) configuration class: [EuroBertModel](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertModel) (EuroBertConfig model) - [EvollaConfig](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaConfig) configuration class: [EvollaModel](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaModel) (EvollaConfig model) - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4Model](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Model) (Exaone4Config model) - 
[Exaone4_5_Config](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_Config) configuration class: [Exaone4_5_Model](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_Model) (Exaone4_5_Config model) - [Exaone4_5_VisionConfig](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_VisionConfig) configuration class: [Exaone4_5_VisionModel](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_VisionModel) (Exaone4_5_VisionConfig model) - [ExaoneMoeConfig](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeConfig) configuration class: [ExaoneMoeModel](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeModel) (ExaoneMoeConfig model) - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetModel](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetModel) (FNetConfig model) - [FSMTConfig](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTConfig) configuration class: [FSMTModel](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTModel) (FSMTConfig model) - [FalconConfig](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconConfig) configuration class: [FalconModel](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconModel) (FalconConfig model) - [FalconH1Config](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1Config) configuration class: [FalconH1Model](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1Model) (FalconH1Config model) - [FalconMambaConfig](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaConfig) configuration class: [FalconMambaModel](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaModel) (FalconMambaConfig model) - 
[FastSpeech2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerConfig) configuration class: [FastSpeech2ConformerModel](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerModel) (FastSpeech2ConformerConfig model) - [FastSpeech2ConformerWithHifiGanConfig](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerWithHifiGanConfig) configuration class: [FastSpeech2ConformerWithHifiGan](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerWithHifiGan) (FastSpeech2ConformerWithHifiGanConfig model) - [FastVlmConfig](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmConfig) configuration class: [FastVlmModel](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmModel) (FastVlmConfig model) - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertModel](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertModel) (FlaubertConfig model) - [FlavaConfig](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaConfig) configuration class: [FlavaModel](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaModel) (FlavaConfig model) - [FlexOlmoConfig](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoConfig) configuration class: [FlexOlmoModel](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoModel) (FlexOlmoConfig model) - [Florence2Config](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Config) configuration class: [Florence2Model](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Model) (Florence2Config model) - [FocalNetConfig](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetConfig) configuration class: 
[FocalNetModel](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetModel) (FocalNetConfig model) - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelModel](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelModel) or [FunnelBaseModel](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelBaseModel) (FunnelConfig model) - [FuyuConfig](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuConfig) configuration class: [FuyuModel](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuModel) (FuyuConfig model) - [GLPNConfig](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNConfig) configuration class: [GLPNModel](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNModel) (GLPNConfig model) - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2Model](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Model) (GPT2Config model) - [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) configuration class: [GPTBigCodeModel](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeModel) (GPTBigCodeConfig model) - [GPTJConfig](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJConfig) configuration class: [GPTJModel](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJModel) (GPTJConfig model) - [GPTNeoConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoConfig) configuration class: [GPTNeoModel](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoModel) (GPTNeoConfig model) - [GPTNeoXConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXConfig) configuration class: [GPTNeoXModel](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXModel) (GPTNeoXConfig model) - 
[GPTNeoXJapaneseConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseConfig) configuration class: [GPTNeoXJapaneseModel](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseModel) (GPTNeoXJapaneseConfig model) - [Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2Model](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Model) (Gemma2Config model) - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3Model](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Model) (Gemma3Config model) - [Gemma3TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: [Gemma3TextModel](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextModel) (Gemma3TextConfig model) - [Gemma3nAudioConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nAudioConfig) configuration class: `Gemma3nAudioEncoder` (Gemma3nAudioConfig model) - [Gemma3nConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nConfig) configuration class: [Gemma3nModel](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nModel) (Gemma3nConfig model) - [Gemma3nTextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nTextConfig) configuration class: [Gemma3nTextModel](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nTextModel) (Gemma3nTextConfig model) - [Gemma3nVisionConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nVisionConfig) configuration class: [TimmWrapperModel](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperModel) (Gemma3nVisionConfig model) - [Gemma4AudioConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4AudioConfig) configuration class: 
[Gemma4AudioModel](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4AudioModel) (Gemma4AudioConfig model) - [Gemma4Config](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Config) configuration class: [Gemma4Model](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Model) (Gemma4Config model) - [Gemma4TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4TextConfig) configuration class: [Gemma4TextModel](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4TextModel) (Gemma4TextConfig model) - [Gemma4VisionConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4VisionConfig) configuration class: [Gemma4VisionModel](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4VisionModel) (Gemma4VisionConfig model) - [GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaModel](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaModel) (GemmaConfig model) - [GitConfig](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitConfig) configuration class: [GitModel](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitModel) (GitConfig model) - [Glm46VConfig](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VConfig) configuration class: [Glm46VModel](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VModel) (Glm46VConfig model) - [Glm4Config](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Config) configuration class: [Glm4Model](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Model) (Glm4Config model) - [Glm4MoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeConfig) configuration class: [Glm4MoeModel](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeModel) (Glm4MoeConfig model) - [Glm4MoeLiteConfig](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteConfig) 
configuration class: [Glm4MoeLiteModel](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteModel) (Glm4MoeLiteConfig model) - [Glm4vConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vConfig) configuration class: [Glm4vModel](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vModel) (Glm4vConfig model) - [Glm4vMoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeConfig) configuration class: [Glm4vMoeModel](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeModel) (Glm4vMoeConfig model) - [Glm4vMoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeTextConfig) configuration class: [Glm4vMoeTextModel](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeTextModel) (Glm4vMoeTextConfig model) - [Glm4vMoeVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeVisionConfig) configuration class: [Glm4vMoeVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeVisionModel) (Glm4vMoeVisionConfig model) - [Glm4vTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vTextConfig) configuration class: [Glm4vTextModel](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vTextModel) (Glm4vTextConfig model) - [Glm4vVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vVisionConfig) configuration class: [Glm4vVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vVisionModel) (Glm4vVisionConfig model) - [GlmAsrConfig](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrConfig) configuration class: [GlmAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrForConditionalGeneration) (GlmAsrConfig model) - [GlmAsrEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrEncoderConfig) configuration class: 
[GlmAsrEncoder](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrEncoder) (GlmAsrEncoderConfig model) - [GlmConfig](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmConfig) configuration class: [GlmModel](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmModel) (GlmConfig model) - [GlmImageConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageConfig) configuration class: [GlmImageModel](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageModel) (GlmImageConfig model) - [GlmImageTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageTextConfig) configuration class: [GlmImageTextModel](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageTextModel) (GlmImageTextConfig model) - [GlmImageVQVAEConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVQVAEConfig) configuration class: [GlmImageVQVAE](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVQVAE) (GlmImageVQVAEConfig model) - [GlmImageVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVisionConfig) configuration class: [GlmImageVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVisionModel) (GlmImageVisionConfig model) - [GlmMoeDsaConfig](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaConfig) configuration class: [GlmMoeDsaModel](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaModel) (GlmMoeDsaConfig model) - [GlmOcrConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrConfig) configuration class: [GlmOcrModel](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrModel) (GlmOcrConfig model) - [GlmOcrTextConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrTextConfig) configuration class: [GlmOcrTextModel](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrTextModel) 
(GlmOcrTextConfig model)
- [GlmOcrVisionConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrVisionConfig) configuration class: [GlmOcrVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrVisionModel) (GlmOcrVisionConfig model)
- [GotOcr2Config](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Config) configuration class: [GotOcr2Model](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Model) (GotOcr2Config model)
- [GptOssConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssConfig) configuration class: [GptOssModel](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssModel) (GptOssConfig model)
- [Granite4VisionConfig](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionConfig) configuration class: [Granite4VisionModel](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionModel) (Granite4VisionConfig model)
- [GraniteConfig](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteConfig) configuration class: [GraniteModel](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteModel) (GraniteConfig model)
- [GraniteMoeConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeConfig) configuration class: [GraniteMoeModel](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeModel) (GraniteMoeConfig model)
- [GraniteMoeHybridConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridConfig) configuration class: [GraniteMoeHybridModel](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridModel) (GraniteMoeHybridConfig model)
- [GraniteMoeSharedConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedConfig) configuration class: [GraniteMoeSharedModel](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedModel) (GraniteMoeSharedConfig model)
- [GraniteSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechConfig) configuration class: [GraniteSpeechForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechForConditionalGeneration) (GraniteSpeechConfig model)
- [GroundingDinoConfig](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoConfig) configuration class: [GroundingDinoModel](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoModel) (GroundingDinoConfig model)
- [GroupViTConfig](/docs/transformers/v5.8.0/en/model_doc/groupvit#transformers.GroupViTConfig) configuration class: [GroupViTModel](/docs/transformers/v5.8.0/en/model_doc/groupvit#transformers.GroupViTModel) (GroupViTConfig model)
- [HGNetV2Config](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2Config) configuration class: [HGNetV2Backbone](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2Backbone) (HGNetV2Config model)
- [HYV3Config](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3Config) configuration class: [HYV3Model](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3Model) (HYV3Config model)
- [HeliumConfig](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumConfig) configuration class: [HeliumModel](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumModel) (HeliumConfig model)
- [HieraConfig](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraConfig) configuration class: [HieraModel](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraModel) (HieraConfig model)
- [HiggsAudioV2Config](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2#transformers.HiggsAudioV2Config) configuration class: [HiggsAudioV2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2#transformers.HiggsAudioV2ForConditionalGeneration) (HiggsAudioV2Config model)
- [HiggsAudioV2TokenizerConfig](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerConfig) configuration class: [HiggsAudioV2TokenizerModel](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerModel) (HiggsAudioV2TokenizerConfig model)
- [HubertConfig](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertConfig) configuration class: [HubertModel](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertModel) (HubertConfig model)
- [HunYuanDenseV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1Config) configuration class: [HunYuanDenseV1Model](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1Model) (HunYuanDenseV1Config model)
- [HunYuanMoEV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1Config) configuration class: [HunYuanMoEV1Model](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1Model) (HunYuanMoEV1Config model)
- [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertModel](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertModel) (IBertConfig model)
- [IJepaConfig](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaConfig) configuration class: [IJepaModel](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaModel) (IJepaConfig model)
- [Idefics2Config](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Config) configuration class: [Idefics2Model](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Model) (Idefics2Config model)
- [Idefics3Config](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Config) configuration class: [Idefics3Model](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Model) (Idefics3Config model)
- [Idefics3VisionConfig](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3VisionConfig) configuration class: [Idefics3VisionTransformer](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3VisionTransformer) (Idefics3VisionConfig model)
- [IdeficsConfig](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsConfig) configuration class: [IdeficsModel](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsModel) (IdeficsConfig model)
- [ImageGPTConfig](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTConfig) configuration class: [ImageGPTModel](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTModel) (ImageGPTConfig model)
- [InformerConfig](/docs/transformers/v5.8.0/en/model_doc/informer#transformers.InformerConfig) configuration class: [InformerModel](/docs/transformers/v5.8.0/en/model_doc/informer#transformers.InformerModel) (InformerConfig model)
- [InstructBlipConfig](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipConfig) configuration class: [InstructBlipModel](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipModel) (InstructBlipConfig model)
- [InstructBlipVideoConfig](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoConfig) configuration class: [InstructBlipVideoModel](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoModel) (InstructBlipVideoConfig model)
- [InternVLConfig](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLConfig) configuration class: [InternVLModel](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLModel) (InternVLConfig model)
- [InternVLVisionConfig](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLVisionConfig) configuration class: [InternVLVisionModel](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLVisionModel) (InternVLVisionConfig model)
- [Jais2Config](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2Config) configuration class: [Jais2Model](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2Model) (Jais2Config model)
- [JambaConfig](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaModel](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaModel) (JambaConfig model)
- [JanusConfig](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusConfig) configuration class: [JanusModel](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusModel) (JanusConfig model)
- [JetMoeConfig](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeConfig) configuration class: [JetMoeModel](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeModel) (JetMoeConfig model)
- [JinaEmbeddingsV3Config](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Config) configuration class: [JinaEmbeddingsV3Model](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Model) (JinaEmbeddingsV3Config model)
- [Kosmos2Config](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2Config) configuration class: [Kosmos2Model](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2Model) (Kosmos2Config model)
- [Kosmos2_5Config](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5Config) configuration class: [Kosmos2_5Model](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5Model) (Kosmos2_5Config model)
- [KyutaiSpeechToTextConfig](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextConfig) configuration class: [KyutaiSpeechToTextModel](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextModel) (KyutaiSpeechToTextConfig model)
- [LEDConfig](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDConfig) configuration class: [LEDModel](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDModel) (LEDConfig model)
- [LagunaConfig](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaConfig) configuration class: [LagunaModel](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaModel) (LagunaConfig model)
- [LasrCTCConfig](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrCTCConfig) configuration class: [LasrForCTC](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrForCTC) (LasrCTCConfig model)
- [LasrEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrEncoderConfig) configuration class: [LasrEncoder](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrEncoder) (LasrEncoderConfig model)
- [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMModel](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMModel) (LayoutLMConfig model)
- [LayoutLMv2Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Config) configuration class: [LayoutLMv2Model](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Model) (LayoutLMv2Config model)
- [LayoutLMv3Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Config) configuration class: [LayoutLMv3Model](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Model) (LayoutLMv3Config model)
- [LevitConfig](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitConfig) configuration class: [LevitModel](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitModel) (LevitConfig model)
- [Lfm2Config](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2Config) configuration class: [Lfm2Model](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2Model) (Lfm2Config model)
- [Lfm2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeConfig) configuration class: [Lfm2MoeModel](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeModel) (Lfm2MoeConfig model)
- [Lfm2VlConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlConfig) configuration class: [Lfm2VlModel](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlModel) (Lfm2VlConfig model)
- [LightGlueConfig](/docs/transformers/v5.8.0/en/model_doc/lightglue#transformers.LightGlueConfig) configuration class: [LightGlueForKeypointMatching](/docs/transformers/v5.8.0/en/model_doc/lightglue#transformers.LightGlueForKeypointMatching) (LightGlueConfig model)
- [LightOnOcrConfig](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrConfig) configuration class: [LightOnOcrModel](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrModel) (LightOnOcrConfig model)
- [LiltConfig](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltConfig) configuration class: [LiltModel](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltModel) (LiltConfig model)
- [Llama4Config](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4Config) configuration class: [Llama4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForConditionalGeneration) (Llama4Config model)
- [Llama4TextConfig](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4TextConfig) configuration class: [Llama4TextModel](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4TextModel) (Llama4TextConfig model)
- [LlamaConfig](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaModel](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaModel) (LlamaConfig model)
- [LlavaConfig](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaConfig) configuration class: [LlavaModel](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaModel) (LlavaConfig model)
- [LlavaNextConfig](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextConfig) configuration class: [LlavaNextModel](/docs/transformers/v5.8.0/en/model_doc/llava_next#transformers.LlavaNextModel) (LlavaNextConfig model)
- [LlavaNextVideoConfig](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoConfig) configuration class: [LlavaNextVideoModel](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoModel) (LlavaNextVideoConfig model)
- [LlavaOnevisionConfig](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionConfig) configuration class: [LlavaOnevisionModel](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionModel) (LlavaOnevisionConfig model)
- [LongT5Config](/docs/transformers/v5.8.0/en/model_doc/longt5#transformers.LongT5Config) configuration class: [LongT5Model](/docs/transformers/v5.8.0/en/model_doc/longt5#transformers.LongT5Model) (LongT5Config model)
- [LongcatFlashConfig](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashConfig) configuration class: [LongcatFlashModel](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashModel) (LongcatFlashConfig model)
- [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerModel](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerModel) (LongformerConfig model)
- [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeModel](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeModel) (LukeConfig model)
- [LwDetrConfig](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrConfig) configuration class: [LwDetrModel](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrModel) (LwDetrConfig model)
- [LxmertConfig](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertConfig) configuration class: [LxmertModel](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertModel) (LxmertConfig model)
- [M2M100Config](/docs/transformers/v5.8.0/en/model_doc/m2m_100#transformers.M2M100Config) configuration class: [M2M100Model](/docs/transformers/v5.8.0/en/model_doc/m2m_100#transformers.M2M100Model) (M2M100Config model)
- [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) configuration class: [MBartModel](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartModel) (MBartConfig model)
- [MLCDVisionConfig](/docs/transformers/v5.8.0/en/model_doc/mlcd#transformers.MLCDVisionConfig) configuration class: [MLCDVisionModel](/docs/transformers/v5.8.0/en/model_doc/mlcd#transformers.MLCDVisionModel) (MLCDVisionConfig model)
- [MMGroundingDinoConfig](/docs/transformers/v5.8.0/en/model_doc/mm-grounding-dino#transformers.MMGroundingDinoConfig) configuration class: [MMGroundingDinoModel](/docs/transformers/v5.8.0/en/model_doc/mm-grounding-dino#transformers.MMGroundingDinoModel) (MMGroundingDinoConfig model)
- [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetModel](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetModel) (MPNetConfig model)
- [MT5Config](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Config) configuration class: [MT5Model](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Model) (MT5Config model)
- [Mamba2Config](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2Model](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2Model) (Mamba2Config model)
- [MambaConfig](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaModel](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaModel) (MambaConfig model)
- [MarianConfig](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianConfig) configuration class: [MarianModel](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianModel) (MarianConfig model)
- [MarkupLMConfig](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMConfig) configuration class: [MarkupLMModel](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMModel) (MarkupLMConfig model)
- [Mask2FormerConfig](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerConfig) configuration class: [Mask2FormerModel](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerModel) (Mask2FormerConfig model)
- [MaskFormerConfig](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerConfig) configuration class: [MaskFormerModel](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerModel) (MaskFormerConfig model)
- `MaskFormerSwinConfig` configuration class: `MaskFormerSwinModel` (MaskFormerSwinConfig model)
- [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertModel](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertModel) (MegatronBertConfig model)
- [MetaClip2Config](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Config) configuration class: [MetaClip2Model](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Model) (MetaClip2Config model)
- [MgpstrConfig](/docs/transformers/v5.8.0/en/model_doc/mgp-str#transformers.MgpstrConfig) configuration class: [MgpstrForSceneTextRecognition](/docs/transformers/v5.8.0/en/model_doc/mgp-str#transformers.MgpstrForSceneTextRecognition) (MgpstrConfig model)
- [MimiConfig](/docs/transformers/v5.8.0/en/model_doc/mimi#transformers.MimiConfig) configuration class: [MimiModel](/docs/transformers/v5.8.0/en/model_doc/mimi#transformers.MimiModel) (MimiConfig model)
- [MiniCPMV4_6Config](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6Config) configuration class: [MiniCPMV4_6Model](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6Model) (MiniCPMV4_6Config model)
- [MiniMaxConfig](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxConfig) configuration class: [MiniMaxModel](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxModel) (MiniMaxConfig model)
- [MiniMaxM2Config](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2Config) configuration class: [MiniMaxM2Model](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2Model) (MiniMaxM2Config model)
- [Ministral3Config](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Config) configuration class: [Ministral3Model](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Model) (Ministral3Config model)
- [MinistralConfig](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralConfig) configuration class: [MinistralModel](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralModel) (MinistralConfig model)
- [Mistral3Config](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Config) configuration class: [Mistral3Model](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Model) (Mistral3Config model)
- [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: [Mistral4Model](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Model) (Mistral4Config model)
- [MistralConfig](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralModel](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralModel) (MistralConfig model)
- [MixtralConfig](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralConfig) configuration class: [MixtralModel](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralModel) (MixtralConfig model)
- [MllamaConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaConfig) configuration class: [MllamaModel](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaModel) (MllamaConfig model)
- [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertModel](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertModel) (MobileBertConfig model)
- [MobileNetV1Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1Config) configuration class: [MobileNetV1Model](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1Model) (MobileNetV1Config model)
- [MobileNetV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2Config) configuration class: [MobileNetV2Model](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2Model) (MobileNetV2Config model)
- [MobileViTConfig](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTConfig) configuration class: [MobileViTModel](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTModel) (MobileViTConfig model)
- [MobileViTV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2Config) configuration class: [MobileViTV2Model](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2Model) (MobileViTV2Config model)
- [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) configuration class: [ModernBertModel](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertModel) (ModernBertConfig model)
- [ModernBertDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderConfig) configuration class: [ModernBertDecoderModel](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderModel) (ModernBertDecoderConfig model)
- [ModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertConfig) configuration class: [ModernVBertModel](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertModel) (ModernVBertConfig model)
- [MoonshineConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineConfig) configuration class: [MoonshineModel](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineModel) (MoonshineConfig model)
- [MoonshineStreamingConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingConfig) configuration class: [MoonshineStreamingModel](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingModel) (MoonshineStreamingConfig model)
- [MoshiConfig](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiConfig) configuration class: [MoshiModel](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiModel) (MoshiConfig model)
- [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptModel](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptModel) (MptConfig model)
- [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraModel](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraModel) (MraConfig model)
- [MusicFlamingoConfig](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoConfig) configuration class: [MusicFlamingoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoForConditionalGeneration) (MusicFlamingoConfig model)
- [MusicgenConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenConfig) configuration class: [MusicgenModel](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenModel) (MusicgenConfig model)
- [MusicgenMelodyConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyConfig) configuration class: [MusicgenMelodyModel](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyModel) (MusicgenMelodyConfig model)
- [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpModel](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpModel) (MvpConfig model)
- [NanoChatConfig](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatConfig) configuration class: [NanoChatModel](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatModel) (NanoChatConfig model)
- [NemotronConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronConfig) configuration class: [NemotronModel](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronModel) (NemotronConfig model)
- [NemotronHConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHConfig) configuration class: [NemotronHModel](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHModel) (NemotronHConfig model)
- [NllbMoeConfig](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeConfig) configuration class: [NllbMoeModel](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeModel) (NllbMoeConfig model)
- [NomicBertConfig](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertConfig) configuration class: [NomicBertModel](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertModel) (NomicBertConfig model)
- [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) configuration class: [NystromformerModel](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerModel) (NystromformerConfig model)
- [OPTConfig](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTConfig) configuration class: [OPTModel](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTModel) (OPTConfig model)
- [Olmo2Config](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2Config) configuration class: [Olmo2Model](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2Model) (Olmo2Config model)
- [Olmo3Config](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3Config) configuration class: [Olmo3Model](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3Model) (Olmo3Config model)
- [OlmoConfig](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoConfig) configuration class: [OlmoModel](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoModel) (OlmoConfig model)
- [OlmoHybridConfig](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridConfig) configuration class: [OlmoHybridModel](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridModel) (OlmoHybridConfig model)
- [OlmoeConfig](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeConfig) configuration class: [OlmoeModel](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeModel) (OlmoeConfig model)
- [OmDetTurboConfig](/docs/transformers/v5.8.0/en/model_doc/omdet-turbo#transformers.OmDetTurboConfig) configuration class: [OmDetTurboForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/omdet-turbo#transformers.OmDetTurboForObjectDetection) (OmDetTurboConfig model)
- [OneFormerConfig](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerConfig) configuration class: [OneFormerModel](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerModel) (OneFormerConfig model)
- [OpenAIGPTConfig](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTModel](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTModel) (OpenAIGPTConfig model)
- [OpenAIPrivacyFilterConfig](/docs/transformers/v5.8.0/en/model_doc/openai_privacy_filter#transformers.OpenAIPrivacyFilterConfig) configuration class: [OpenAIPrivacyFilterModel](/docs/transformers/v5.8.0/en/model_doc/openai_privacy_filter#transformers.OpenAIPrivacyFilterModel) (OpenAIPrivacyFilterConfig model)
- [Ovis2Config](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2Config) configuration class: [Ovis2Model](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2Model) (Ovis2Config model)
- [OwlViTConfig](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTConfig) configuration class: [OwlViTModel](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTModel) (OwlViTConfig model)
- [Owlv2Config](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2Config) configuration class: [Owlv2Model](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2Model) (Owlv2Config model)
- [PI0Config](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0Config) configuration class: [PI0Model](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0Model) (PI0Config model)
- [PLBartConfig](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartConfig) configuration class: [PLBartModel](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartModel) (PLBartConfig model)
- [PPDocLayoutV3Config](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3Config) configuration class: [PPDocLayoutV3Model](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3Model) (PPDocLayoutV3Config model)
- [PPOCRV5MobileRecConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecConfig) configuration class: [PPOCRV5MobileRecModel](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecModel) (PPOCRV5MobileRecConfig model)
- [PPOCRV5ServerRecConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecConfig) configuration class: [PPOCRV5ServerRecModel](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecModel) (PPOCRV5ServerRecConfig model)
- [PaliGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: [PaliGemmaModel](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaModel) (PaliGemmaConfig model)
- [ParakeetCTCConfig](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetCTCConfig) configuration class: [ParakeetForCTC](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetForCTC) (ParakeetCTCConfig model)
- [ParakeetEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetEncoderConfig) configuration class: [ParakeetEncoder](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetEncoder) (ParakeetEncoderConfig model)
- [PatchTSMixerConfig](/docs/transformers/v5.8.0/en/model_doc/patchtsmixer#transformers.PatchTSMixerConfig) configuration class: [PatchTSMixerModel](/docs/transformers/v5.8.0/en/model_doc/patchtsmixer#transformers.PatchTSMixerModel) (PatchTSMixerConfig model)
- [PatchTSTConfig](/docs/transformers/v5.8.0/en/model_doc/patchtst#transformers.PatchTSTConfig) configuration class: [PatchTSTModel](/docs/transformers/v5.8.0/en/model_doc/patchtst#transformers.PatchTSTModel) (PatchTSTConfig model)
- [PeAudioConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioConfig) configuration class: [PeAudioModel](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioModel) (PeAudioConfig model)
- [PeAudioEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioEncoderConfig) configuration class: [PeAudioEncoder](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioEncoder) (PeAudioEncoderConfig model)
- [PeAudioVideoConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoConfig) configuration class: [PeAudioVideoModel](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoModel) (PeAudioVideoConfig model)
- [PeAudioVideoEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoEncoderConfig) configuration class: [PeAudioVideoEncoder](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoEncoder) (PeAudioVideoEncoderConfig model)
- [PeVideoConfig](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoConfig) configuration class: [PeVideoModel](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoModel) (PeVideoConfig model)
- [PeVideoEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoEncoderConfig) configuration class: [PeVideoEncoder](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoEncoder) (PeVideoEncoderConfig model)
- [PegasusConfig](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusConfig) configuration class: [PegasusModel](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusModel) (PegasusConfig model)
- [PegasusXConfig](/docs/transformers/v5.8.0/en/model_doc/pegasus_x#transformers.PegasusXConfig) configuration class: [PegasusXModel](/docs/transformers/v5.8.0/en/model_doc/pegasus_x#transformers.PegasusXModel) (PegasusXConfig model)
- [PerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverConfig) configuration class: [PerceiverModel](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverModel) (PerceiverConfig model)
- [PerceptionLMConfig](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMConfig) configuration class: [PerceptionLMModel](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMModel) (PerceptionLMConfig model)
- [PersimmonConfig](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonConfig) configuration class: [PersimmonModel](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonModel) (PersimmonConfig model)
- [Phi3Config](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Config) configuration class: [Phi3Model](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Model) (Phi3Config model)
- [Phi4MultimodalConfig](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalConfig) configuration class: [Phi4MultimodalModel](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalModel) (Phi4MultimodalConfig model)
- [PhiConfig](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiConfig) configuration class: [PhiModel](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiModel) (PhiConfig model)
- [PhimoeConfig](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeConfig) configuration class: [PhimoeModel](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeModel) (PhimoeConfig model)
- [PixioConfig](/docs/transformers/v5.8.0/en/model_doc/pixio#transformers.PixioConfig) configuration class: [PixioModel](/docs/transformers/v5.8.0/en/model_doc/pixio#transformers.PixioModel) (PixioConfig model)
- [PixtralVisionConfig](/docs/transformers/v5.8.0/en/model_doc/pixtral#transformers.PixtralVisionConfig) configuration class: [PixtralVisionModel](/docs/transformers/v5.8.0/en/model_doc/pixtral#transformers.PixtralVisionModel) (PixtralVisionConfig model)
- [PoolFormerConfig](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerConfig) configuration class: [PoolFormerModel](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerModel) (PoolFormerConfig model)
- [ProphetNetConfig](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetConfig) configuration class: [ProphetNetModel](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetModel) (ProphetNetConfig model)
- [PvtConfig](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtConfig) configuration class: [PvtModel](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtModel) (PvtConfig model)
- [PvtV2Config](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2Config) configuration class: [PvtV2Model](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2Model) (PvtV2Config model)
- [QianfanOCRConfig](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRConfig) configuration class: [QianfanOCRModel](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRModel) (QianfanOCRConfig model)
- [QianfanOCRVisionConfig](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRVisionConfig) configuration class: [QianfanOCRVisionModel](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRVisionModel) (QianfanOCRVisionConfig model)
- [Qwen2AudioEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioEncoderConfig) configuration class: [Qwen2AudioEncoder](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioEncoder) (Qwen2AudioEncoderConfig model)
- [Qwen2Config](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Config) configuration class: [Qwen2Model](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Model) (Qwen2Config model)
- [Qwen2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeConfig) configuration class: [Qwen2MoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeModel) (Qwen2MoeConfig model)
- [Qwen2VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLConfig) configuration class: [Qwen2VLModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLModel) (Qwen2VLConfig model)
- [Qwen2VLTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLTextConfig) configuration class: [Qwen2VLTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLTextModel) (Qwen2VLTextConfig model)
- [Qwen2_5_VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLConfig) configuration class: [Qwen2_5_VLModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLModel) (Qwen2_5_VLConfig model)
- [Qwen2_5_VLTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLTextConfig) configuration class: [Qwen2_5_VLTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLTextModel) (Qwen2_5_VLTextConfig model)
- [Qwen3Config](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Config) configuration class: [Qwen3Model](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Model) (Qwen3Config model)
- [Qwen3MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeConfig) configuration class: [Qwen3MoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeModel) (Qwen3MoeConfig model)
- [Qwen3NextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextConfig) configuration class: 
[Qwen3NextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextModel) (Qwen3NextConfig model) - [Qwen3VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLConfig) configuration class: [Qwen3VLModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLModel) (Qwen3VLConfig model) - [Qwen3VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeConfig) configuration class: [Qwen3VLMoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeModel) (Qwen3VLMoeConfig model) - [Qwen3VLMoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeTextConfig) configuration class: [Qwen3VLMoeTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeTextModel) (Qwen3VLMoeTextConfig model) - [Qwen3VLTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLTextConfig) configuration class: [Qwen3VLTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLTextModel) (Qwen3VLTextConfig model) - [Qwen3_5Config](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Config) configuration class: [Qwen3_5Model](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Model) (Qwen3_5Config model) - [Qwen3_5MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeConfig) configuration class: [Qwen3_5MoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeModel) (Qwen3_5MoeConfig model) - [Qwen3_5MoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeTextConfig) configuration class: [Qwen3_5MoeTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeTextModel) (Qwen3_5MoeTextConfig model) - [Qwen3_5TextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5TextConfig) configuration class: 
[Qwen3_5TextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5TextModel) (Qwen3_5TextConfig model) - [RTDetrConfig](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrConfig) configuration class: [RTDetrModel](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrModel) (RTDetrConfig model) - [RTDetrV2Config](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2Config) configuration class: [RTDetrV2Model](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2Model) (RTDetrV2Config model) - [RecurrentGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaConfig) configuration class: [RecurrentGemmaModel](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaModel) (RecurrentGemmaConfig model) - [ReformerConfig](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerConfig) configuration class: [ReformerModel](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerModel) (ReformerConfig model) - [RegNetConfig](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetConfig) configuration class: [RegNetModel](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetModel) (RegNetConfig model) - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertModel](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertModel) (RemBertConfig model) - [ResNetConfig](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetConfig) configuration class: [ResNetModel](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetModel) (ResNetConfig model) - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertModel](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertModel) (RoCBertConfig model) - 
[RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerModel](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerModel) (RoFormerConfig model) - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaModel](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaModel) (RobertaConfig model) - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormModel](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormModel) (RobertaPreLayerNormConfig model) - [RwkvConfig](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvConfig) configuration class: [RwkvModel](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvModel) (RwkvConfig model) - [SEWConfig](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWConfig) configuration class: [SEWModel](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWModel) (SEWConfig model) - [SEWDConfig](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDConfig) configuration class: [SEWDModel](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDModel) (SEWDConfig model) - [Sam2Config](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2Config) configuration class: [Sam2Model](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2Model) (Sam2Config model) - [Sam2HieraDetConfig](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2HieraDetConfig) configuration class: [Sam2HieraDetModel](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2HieraDetModel) (Sam2HieraDetConfig model) - [Sam2VideoConfig](/docs/transformers/v5.8.0/en/model_doc/sam2_video#transformers.Sam2VideoConfig) configuration class: 
[Sam2VideoModel](/docs/transformers/v5.8.0/en/model_doc/sam2_video#transformers.Sam2VideoModel) (Sam2VideoConfig model) - [Sam2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2VisionConfig) configuration class: [Sam2VisionModel](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2VisionModel) (Sam2VisionConfig model) - [Sam3Config](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3Config) configuration class: [Sam3Model](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3Model) (Sam3Config model) - [Sam3LiteTextConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextConfig) configuration class: [Sam3LiteTextModel](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextModel) (Sam3LiteTextConfig model) - [Sam3LiteTextTextConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextTextConfig) configuration class: [Sam3LiteTextTextModel](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextTextModel) (Sam3LiteTextTextConfig model) - [Sam3TrackerConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker#transformers.Sam3TrackerConfig) configuration class: [Sam3TrackerModel](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker#transformers.Sam3TrackerModel) (Sam3TrackerConfig model) - [Sam3TrackerVideoConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker_video#transformers.Sam3TrackerVideoConfig) configuration class: [Sam3TrackerVideoModel](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker_video#transformers.Sam3TrackerVideoModel) (Sam3TrackerVideoConfig model) - [Sam3ViTConfig](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3ViTConfig) configuration class: [Sam3ViTModel](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3ViTModel) (Sam3ViTConfig model) - [Sam3VideoConfig](/docs/transformers/v5.8.0/en/model_doc/sam3_video#transformers.Sam3VideoConfig) configuration class: 
[Sam3VideoModel](/docs/transformers/v5.8.0/en/model_doc/sam3_video#transformers.Sam3VideoModel) (Sam3VideoConfig model) - [Sam3VisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3VisionConfig) configuration class: [Sam3VisionModel](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3VisionModel) (Sam3VisionConfig model) - [SamConfig](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamConfig) configuration class: [SamModel](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamModel) (SamConfig model) - [SamHQConfig](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQConfig) configuration class: [SamHQModel](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQModel) (SamHQConfig model) - [SamHQVisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQVisionConfig) configuration class: [SamHQVisionModel](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQVisionModel) (SamHQVisionConfig model) - [SamVisionConfig](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamVisionConfig) configuration class: [SamVisionModel](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamVisionModel) (SamVisionConfig model) - [SeamlessM4TConfig](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TConfig) configuration class: [SeamlessM4TModel](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TModel) (SeamlessM4TConfig model) - [SeamlessM4Tv2Config](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2Config) configuration class: [SeamlessM4Tv2Model](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2Model) (SeamlessM4Tv2Config model) - [SeedOssConfig](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssConfig) configuration class: [SeedOssModel](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssModel) (SeedOssConfig model) - 
[SegGptConfig](/docs/transformers/v5.8.0/en/model_doc/seggpt#transformers.SegGptConfig) configuration class: [SegGptModel](/docs/transformers/v5.8.0/en/model_doc/seggpt#transformers.SegGptModel) (SegGptConfig model) - [SegformerConfig](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerConfig) configuration class: [SegformerModel](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerModel) (SegformerConfig model) - [Siglip2Config](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Config) configuration class: [Siglip2Model](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Model) (Siglip2Config model) - [Siglip2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2VisionConfig) configuration class: [Siglip2VisionModel](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2VisionModel) (Siglip2VisionConfig model) - [SiglipConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipModel](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipModel) (SiglipConfig model) - [SiglipVisionConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipVisionConfig) configuration class: [SiglipVisionModel](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipVisionModel) (SiglipVisionConfig model) - [SmolLM3Config](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Config) configuration class: [SmolLM3Model](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Model) (SmolLM3Config model) - [SmolVLMConfig](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMConfig) configuration class: [SmolVLMModel](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMModel) (SmolVLMConfig model) - [SmolVLMVisionConfig](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMVisionConfig) configuration class: 
[SmolVLMVisionTransformer](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMVisionTransformer) (SmolVLMVisionConfig model) - [SolarOpenConfig](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenConfig) configuration class: [SolarOpenModel](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenModel) (SolarOpenConfig model) - [Speech2TextConfig](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextConfig) configuration class: [Speech2TextModel](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextModel) (Speech2TextConfig model) - [SpeechT5Config](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5Config) configuration class: [SpeechT5Model](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5Model) (SpeechT5Config model) - [SplinterConfig](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterConfig) configuration class: [SplinterModel](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterModel) (SplinterConfig model) - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertModel](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertModel) (SqueezeBertConfig model) - [StableLmConfig](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmConfig) configuration class: [StableLmModel](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmModel) (StableLmConfig model) - [Starcoder2Config](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Config) configuration class: [Starcoder2Model](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Model) (Starcoder2Config model) - [SwiftFormerConfig](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerConfig) configuration class: 
[SwiftFormerModel](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerModel) (SwiftFormerConfig model) - [Swin2SRConfig](/docs/transformers/v5.8.0/en/model_doc/swin2sr#transformers.Swin2SRConfig) configuration class: [Swin2SRModel](/docs/transformers/v5.8.0/en/model_doc/swin2sr#transformers.Swin2SRModel) (Swin2SRConfig model) - [SwinConfig](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinConfig) configuration class: [SwinModel](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinModel) (SwinConfig model) - [Swinv2Config](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2Model](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2Model) (Swinv2Config model) - [SwitchTransformersConfig](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersConfig) configuration class: [SwitchTransformersModel](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersModel) (SwitchTransformersConfig model) - [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) configuration class: [T5Model](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Model) (T5Config model) - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2Model](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Model) (T5Gemma2Config model) - [T5Gemma2EncoderConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2EncoderConfig) configuration class: `T5Gemma2Encoder` (T5Gemma2EncoderConfig model) - [T5GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaConfig) configuration class: [T5GemmaModel](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaModel) (T5GemmaConfig model) - 
[TableTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerConfig) configuration class: [TableTransformerModel](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerModel) (TableTransformerConfig model) - [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) configuration class: [TapasModel](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasModel) (TapasConfig model) - [TextNetConfig](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetConfig) configuration class: [TextNetModel](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetModel) (TextNetConfig model) - [TimeSeriesTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/time_series_transformer#transformers.TimeSeriesTransformerConfig) configuration class: [TimeSeriesTransformerModel](/docs/transformers/v5.8.0/en/model_doc/time_series_transformer#transformers.TimeSeriesTransformerModel) (TimeSeriesTransformerConfig model) - [TimesFm2_5Config](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5Config) configuration class: [TimesFm2_5Model](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5Model) (TimesFm2_5Config model) - [TimesFmConfig](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmConfig) configuration class: [TimesFmModel](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmModel) (TimesFmConfig model) - [TimesformerConfig](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerConfig) configuration class: [TimesformerModel](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerModel) (TimesformerConfig model) - [TimmBackboneConfig](/docs/transformers/v5.8.0/en/main_classes/backbones#transformers.TimmBackboneConfig) configuration class: [TimmBackbone](/docs/transformers/v5.8.0/en/main_classes/backbones#transformers.TimmBackbone) 
(TimmBackboneConfig model) - [TimmWrapperConfig](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperConfig) configuration class: [TimmWrapperModel](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperModel) (TimmWrapperConfig model) - [TvpConfig](/docs/transformers/v5.8.0/en/model_doc/tvp#transformers.TvpConfig) configuration class: [TvpModel](/docs/transformers/v5.8.0/en/model_doc/tvp#transformers.TvpModel) (TvpConfig model) - [UMT5Config](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Config) configuration class: [UMT5Model](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Model) (UMT5Config model) - [UVDocConfig](/docs/transformers/v5.8.0/en/model_doc/uvdoc#transformers.UVDocConfig) configuration class: [UVDocModel](/docs/transformers/v5.8.0/en/model_doc/uvdoc#transformers.UVDocModel) (UVDocConfig model) - [UdopConfig](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopConfig) configuration class: [UdopModel](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopModel) (UdopConfig model) - [UniSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechConfig) configuration class: [UniSpeechModel](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechModel) (UniSpeechConfig model) - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: [UniSpeechSatModel](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatModel) (UniSpeechSatConfig model) - [UnivNetConfig](/docs/transformers/v5.8.0/en/model_doc/univnet#transformers.UnivNetConfig) configuration class: [UnivNetModel](/docs/transformers/v5.8.0/en/model_doc/univnet#transformers.UnivNetModel) (UnivNetConfig model) - [VJEPA2Config](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2Config) configuration class: 
[VJEPA2Model](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2Model) (VJEPA2Config model) - [VaultGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaConfig) configuration class: [VaultGemmaModel](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaModel) (VaultGemmaConfig model) - [ViTConfig](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTConfig) configuration class: [ViTModel](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTModel) (ViTConfig model) - [ViTMAEConfig](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEConfig) configuration class: [ViTMAEModel](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEModel) (ViTMAEConfig model) - [ViTMSNConfig](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNConfig) configuration class: [ViTMSNModel](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNModel) (ViTMSNConfig model) - [VibeVoiceAcousticTokenizerConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerConfig) configuration class: [VibeVoiceAcousticTokenizerModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerModel) (VibeVoiceAcousticTokenizerConfig model) - [VibeVoiceAcousticTokenizerDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerDecoderConfig) configuration class: [VibeVoiceAcousticTokenizerDecoderModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerDecoderModel) (VibeVoiceAcousticTokenizerDecoderConfig model) - [VibeVoiceAcousticTokenizerEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerEncoderConfig) configuration class: 
[VibeVoiceAcousticTokenizerEncoderModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerEncoderModel) (VibeVoiceAcousticTokenizerEncoderConfig model) - [VibeVoiceAsrConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrConfig) configuration class: [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model) - [VideoLlama3Config](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3Config) configuration class: [VideoLlama3Model](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3Model) (VideoLlama3Config model) - [VideoLlama3VisionConfig](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3VisionConfig) configuration class: [VideoLlama3VisionModel](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3VisionModel) (VideoLlama3VisionConfig model) - [VideoLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaConfig) configuration class: [VideoLlavaModel](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaModel) (VideoLlavaConfig model) - [VideoMAEConfig](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEConfig) configuration class: [VideoMAEModel](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEModel) (VideoMAEConfig model) - [ViltConfig](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltConfig) configuration class: [ViltModel](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltModel) (ViltConfig model) - [VipLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaConfig) configuration class: [VipLlavaModel](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaModel) (VipLlavaConfig model) - 
[VisionTextDualEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/vision-text-dual-encoder#transformers.VisionTextDualEncoderConfig) configuration class: [VisionTextDualEncoderModel](/docs/transformers/v5.8.0/en/model_doc/vision-text-dual-encoder#transformers.VisionTextDualEncoderModel) (VisionTextDualEncoderConfig model) - [VisualBertConfig](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertConfig) configuration class: [VisualBertModel](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertModel) (VisualBertConfig model) - [VitDetConfig](/docs/transformers/v5.8.0/en/model_doc/vitdet#transformers.VitDetConfig) configuration class: [VitDetModel](/docs/transformers/v5.8.0/en/model_doc/vitdet#transformers.VitDetModel) (VitDetConfig model) - [VitsConfig](/docs/transformers/v5.8.0/en/model_doc/vits#transformers.VitsConfig) configuration class: [VitsModel](/docs/transformers/v5.8.0/en/model_doc/vits#transformers.VitsModel) (VitsConfig model) - [VivitConfig](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitConfig) configuration class: [VivitModel](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitModel) (VivitConfig model) - [VoxtralConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralConfig) configuration class: [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model) - [VoxtralEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralEncoderConfig) configuration class: [VoxtralEncoder](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralEncoder) (VoxtralEncoderConfig model) - [VoxtralRealtimeConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeConfig) configuration class: 
[VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
- [VoxtralRealtimeEncoderConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeEncoderConfig) configuration class: [VoxtralRealtimeEncoder](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeEncoder) (VoxtralRealtimeEncoderConfig model)
- [VoxtralRealtimeTextConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeTextConfig) configuration class: `VoxtralRealtimeTextModel` (VoxtralRealtimeTextConfig model)
- [Wav2Vec2BertConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertConfig) configuration class: [Wav2Vec2BertModel](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertModel) (Wav2Vec2BertConfig model)
- [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2Model](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Model) (Wav2Vec2Config model)
- [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerModel](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerModel) (Wav2Vec2ConformerConfig model)
- [WavLMConfig](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMConfig) configuration class: [WavLMModel](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMModel) (WavLMConfig model)
- [WhisperConfig](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperModel](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperModel) (WhisperConfig model)
- [XCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/xclip#transformers.XCLIPConfig) configuration class: [XCLIPModel](/docs/transformers/v5.8.0/en/model_doc/xclip#transformers.XCLIPModel) (XCLIPConfig model)
- [XGLMConfig](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMConfig) configuration class: [XGLMModel](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMModel) (XGLMConfig model)
- [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMModel) (XLMConfig model)
- [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaModel](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaModel) (XLMRobertaConfig model)
- [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLModel](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLModel) (XLMRobertaXLConfig model)
- [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetModel](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetModel) (XLNetConfig model)
- [XcodecConfig](/docs/transformers/v5.8.0/en/model_doc/xcodec#transformers.XcodecConfig) configuration class: [XcodecModel](/docs/transformers/v5.8.0/en/model_doc/xcodec#transformers.XcodecModel) (XcodecConfig model)
- [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodModel](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodModel) (XmodConfig model)
- [YolosConfig](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosConfig) configuration class: [YolosModel](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosModel) (YolosConfig model)
- [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) configuration class: [YosoModel](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoModel) (YosoConfig model)
- [YoutuConfig](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuConfig) configuration class: [YoutuModel](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuModel) (YoutuConfig model)
- [Zamba2Config](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2Config) configuration class: [Zamba2Model](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2Model) (Zamba2Config model)
- [ZambaConfig](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaConfig) configuration class: [ZambaModel](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaModel) (ZambaConfig model)
- [xLSTMConfig](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMConfig) configuration class: [xLSTMModel](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMModel) (xLSTMConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, SDPA is used if it is available and torch>=2.1.1; otherwise the default is the manual `"eager"` implementation.
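The version-dependent default can be sketched as a small helper. This is only an illustrative approximation of the rule stated above, not the actual resolution logic inside `transformers`; the function name is hypothetical.

```python
# Simplified sketch of the default-selection rule described above:
# SDPA when available on torch >= 2.1.1, otherwise the manual "eager"
# implementation. Illustrative only -- not the library's actual code.
def pick_default_attn_implementation(torch_version: str, sdpa_available: bool) -> str:
    # Drop any local build suffix such as "+cu121" before parsing.
    parts = torch_version.split("+")[0].split(".")
    version = tuple(int(p) for p in parts[:3])
    if sdpa_available and version >= (2, 1, 1):
        return "sdpa"
    return "eager"

print(pick_default_attn_implementation("2.2.0", sdpa_available=True))   # sdpa
print(pick_default_attn_implementation("2.0.1", sdpa_available=True))   # eager
```

To request a specific implementation rather than the default, pass it explicitly, e.g. `AutoModel.from_pretrained(..., attn_implementation="flash_attention_2")`.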
#### from_pretrained[[transformers.AutoModel.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible). When that property is
missing, selection falls back to pattern matching on `pretrained_model_name_or_path`:

- **afmoe** -- [AfmoeModel](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeModel) (AfmoeConfig model)
- **aimv2** -- [Aimv2Model](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2Model) (Aimv2Config model)
- **aimv2_vision_model** -- [Aimv2VisionModel](/docs/transformers/v5.8.0/en/model_doc/aimv2#transformers.Aimv2VisionModel) (Aimv2VisionConfig model)
- **albert** -- `AlbertModel` (AlbertConfig model)
- **align** -- [AlignModel](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignModel) (AlignConfig model)
- **altclip** -- [AltCLIPModel](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPModel) (AltCLIPConfig model)
- **apertus** -- [ApertusModel](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusModel) (ApertusConfig model)
- **arcee** -- [ArceeModel](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeModel) (ArceeConfig model)
- **aria** -- [AriaModel](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaModel) (AriaConfig model)
- **aria_text** -- [AriaTextModel](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextModel) (AriaTextConfig model)
- **audio-spectrogram-transformer** -- [ASTModel](/docs/transformers/v5.8.0/en/model_doc/audio-spectrogram-transformer#transformers.ASTModel) (ASTConfig model)
- **audioflamingo3** -- [AudioFlamingo3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3ForConditionalGeneration) (AudioFlamingo3Config model)
- **audioflamingo3_encoder** -- [AudioFlamingo3Encoder](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3Encoder) (AudioFlamingo3EncoderConfig model)
- **autoformer** -- [AutoformerModel](/docs/transformers/v5.8.0/en/model_doc/autoformer#transformers.AutoformerModel) (AutoformerConfig model)
- **aya_vision** -- [AyaVisionModel](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionModel) (AyaVisionConfig model)
- **bamba** -- [BambaModel](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaModel) (BambaConfig model)
- **bark** -- [BarkModel](/docs/transformers/v5.8.0/en/model_doc/bark#transformers.BarkModel) (BarkConfig model)
- **bart** -- [BartModel](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartModel) (BartConfig model)
- **beit** -- [BeitModel](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitModel) (BeitConfig model)
- **bert** -- [BertModel](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertModel) (BertConfig model)
- **bert-generation** -- [BertGenerationEncoder](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationEncoder) (BertGenerationConfig model)
- **big_bird** -- [BigBirdModel](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdModel) (BigBirdConfig model)
- **bigbird_pegasus** -- [BigBirdPegasusModel](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusModel) (BigBirdPegasusConfig model)
- **biogpt** -- [BioGptModel](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptModel) (BioGptConfig model)
- **bit** -- [BitModel](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitModel) (BitConfig model)
- **bitnet** -- [BitNetModel](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetModel) (BitNetConfig model)
- **blenderbot** -- [BlenderbotModel](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotModel) (BlenderbotConfig model)
- **blenderbot-small** -- [BlenderbotSmallModel](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallModel) (BlenderbotSmallConfig model)
- **blip** -- [BlipModel](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipModel) (BlipConfig model)
- **blip-2** -- [Blip2Model](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Model) (Blip2Config model)
- **blip_2_qformer** -- [Blip2QFormerModel](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2QFormerModel) (Blip2QFormerConfig model)
- **bloom** -- [BloomModel](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomModel) (BloomConfig model)
- **blt** -- [BltModel](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltModel) (BltConfig model)
- **bridgetower** -- [BridgeTowerModel](/docs/transformers/v5.8.0/en/model_doc/bridgetower#transformers.BridgeTowerModel) (BridgeTowerConfig model)
- **bros** -- [BrosModel](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosModel) (BrosConfig model)
- **camembert** -- [CamembertModel](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertModel) (CamembertConfig model)
- **canine** -- [CanineModel](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineModel) (CanineConfig model)
- **chameleon** -- [ChameleonModel](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonModel) (ChameleonConfig model)
- **chinese_clip** -- [ChineseCLIPModel](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPModel) (ChineseCLIPConfig model)
- **chinese_clip_vision_model** -- [ChineseCLIPVisionModel](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPVisionModel) (ChineseCLIPVisionConfig model)
- **clap** -- [ClapModel](/docs/transformers/v5.8.0/en/model_doc/clap#transformers.ClapModel) (ClapConfig model)
- **clip** -- [CLIPModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPModel) (CLIPConfig model)
- **clip_text_model** -- [CLIPTextModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPTextModel) (CLIPTextConfig model)
- **clip_vision_model** -- [CLIPVisionModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPVisionModel) (CLIPVisionConfig model)
- **clipseg** -- [CLIPSegModel](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSegConfig model)
- **clvp** -- [ClvpModelForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/clvp#transformers.ClvpModelForConditionalGeneration) (ClvpConfig model)
- **codegen** -- [CodeGenModel](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenModel) (CodeGenConfig model)
- **cohere** -- [CohereModel](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereModel) (CohereConfig model)
- **cohere2** -- [Cohere2Model](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2Model) (Cohere2Config model)
- **cohere2_vision** -- [Cohere2VisionModel](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionModel) (Cohere2VisionConfig model)
- **cohere_asr** -- [CohereAsrModel](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrModel) (CohereAsrConfig model)
- **conditional_detr** -- [ConditionalDetrModel](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrModel) (ConditionalDetrConfig model)
- **convbert** -- [ConvBertModel](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertModel) (ConvBertConfig model)
- **convnext** -- [ConvNextModel](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextModel) (ConvNextConfig model)
- **convnextv2** -- [ConvNextV2Model](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2Model) (ConvNextV2Config model)
- **cpmant** -- [CpmAntModel](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntModel) (CpmAntConfig model)
- **csm** -- [CsmForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/csm#transformers.CsmForConditionalGeneration) (CsmConfig model)
- **ctrl** -- [CTRLModel](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLModel) (CTRLConfig model)
- **cvt** -- [CvtModel](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtModel) (CvtConfig model)
- **cwm** -- [CwmModel](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmModel) (CwmConfig model)
- **d_fine** -- [DFineModel](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineModel) (DFineConfig model)
- **dab-detr** -- [DabDetrModel](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrModel) (DabDetrConfig model)
- **dac** -- [DacModel](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacModel) (DacConfig model)
- **data2vec-audio** -- [Data2VecAudioModel](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioModel) (Data2VecAudioConfig model)
- **data2vec-text** -- [Data2VecTextModel](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextModel) (Data2VecTextConfig model)
- **data2vec-vision** -- [Data2VecVisionModel](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionModel) (Data2VecVisionConfig model)
- **dbrx** -- [DbrxModel](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxModel) (DbrxConfig model)
- **deberta** -- [DebertaModel](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaModel) (DebertaConfig model)
- **deberta-v2** -- [DebertaV2Model](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Model) (DebertaV2Config model)
- **decision_transformer** -- [DecisionTransformerModel](/docs/transformers/v5.8.0/en/model_doc/decision_transformer#transformers.DecisionTransformerModel) (DecisionTransformerConfig model)
- **deepseek_v2** -- [DeepseekV2Model](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2Model) (DeepseekV2Config model)
- **deepseek_v3** -- [DeepseekV3Model](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Model) (DeepseekV3Config model)
- **deepseek_v4** -- [DeepseekV4Model](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4Model) (DeepseekV4Config model)
- **deepseek_vl** -- [DeepseekVLModel](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLModel) (DeepseekVLConfig model)
- **deepseek_vl_hybrid** -- [DeepseekVLHybridModel](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridModel) (DeepseekVLHybridConfig model)
- **deformable_detr** -- [DeformableDetrModel](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrModel) (DeformableDetrConfig model)
- **deimv2** -- [Deimv2Model](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2Model) (Deimv2Config model)
- **deit** -- [DeiTModel](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTModel) (DeiTConfig model)
- **depth_pro** -- [DepthProModel](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProModel) (DepthProConfig model)
- **detr** -- [DetrModel](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrModel) (DetrConfig model)
- **dia** -- [DiaModel](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaModel) (DiaConfig model)
- **diffllama** -- [DiffLlamaModel](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaModel) (DiffLlamaConfig model)
- **dinat** -- [DinatModel](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatModel) (DinatConfig model)
- **dinov2** -- [Dinov2Model](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2Model) (Dinov2Config model)
- **dinov2_with_registers** -- [Dinov2WithRegistersModel](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersModel) (Dinov2WithRegistersConfig model)
- **dinov3_convnext** -- [DINOv3ConvNextModel](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ConvNextModel) (DINOv3ConvNextConfig model)
- **dinov3_vit** -- [DINOv3ViTModel](/docs/transformers/v5.8.0/en/model_doc/dinov3#transformers.DINOv3ViTModel) (DINOv3ViTConfig model)
- **distilbert** -- [DistilBertModel](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertModel) (DistilBertConfig model)
- **doge** -- [DogeModel](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeModel) (DogeConfig model)
- **donut-swin** -- [DonutSwinModel](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinModel) (DonutSwinConfig model)
- **dots1** -- [Dots1Model](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1Model) (Dots1Config model)
- **dpr** -- [DPRQuestionEncoder](/docs/transformers/v5.8.0/en/model_doc/dpr#transformers.DPRQuestionEncoder) (DPRConfig model)
- **dpt** -- [DPTModel](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTModel) (DPTConfig model)
- **edgetam** -- [EdgeTamModel](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamModel) (EdgeTamConfig model)
- **edgetam_video** -- [EdgeTamVideoModel](/docs/transformers/v5.8.0/en/model_doc/edgetam_video#transformers.EdgeTamVideoModel) (EdgeTamVideoConfig model)
- **edgetam_vision_model** -- [EdgeTamVisionModel](/docs/transformers/v5.8.0/en/model_doc/edgetam#transformers.EdgeTamVisionModel) (EdgeTamVisionConfig model)
- **efficientloftr** -- [EfficientLoFTRModel](/docs/transformers/v5.8.0/en/model_doc/efficientloftr#transformers.EfficientLoFTRModel) (EfficientLoFTRConfig model)
- **efficientnet** -- [EfficientNetModel](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetModel) (EfficientNetConfig model)
- **electra** -- [ElectraModel](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraModel) (ElectraConfig model)
- **emu3** -- [Emu3Model](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Model) (Emu3Config model)
- **encodec** -- [EncodecModel](/docs/transformers/v5.8.0/en/model_doc/encodec#transformers.EncodecModel) (EncodecConfig model)
- **ernie** -- [ErnieModel](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieModel) (ErnieConfig model)
- **ernie4_5** -- [Ernie4_5Model](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5Model) (Ernie4_5Config model)
- **ernie4_5_moe** -- [Ernie4_5_MoeModel](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeModel) (Ernie4_5_MoeConfig model)
- **ernie4_5_vl_moe** -- [Ernie4_5_VLMoeModel](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeModel) (Ernie4_5_VLMoeConfig model)
- **esm** -- [EsmModel](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmModel) (EsmConfig model)
- **eurobert** -- [EuroBertModel](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertModel) (EuroBertConfig model)
- **evolla** -- [EvollaModel](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaModel) (EvollaConfig model)
- **exaone4** -- [Exaone4Model](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Model) (Exaone4Config model)
- **exaone4_5** -- [Exaone4_5_Model](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_Model) (Exaone4_5_Config model)
- **exaone4_5_vision** -- [Exaone4_5_VisionModel](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_VisionModel) (Exaone4_5_VisionConfig model)
- **exaone_moe** -- [ExaoneMoeModel](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeModel) (ExaoneMoeConfig model)
- **falcon** -- [FalconModel](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconModel) (FalconConfig model)
- **falcon_h1** -- [FalconH1Model](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1Model) (FalconH1Config model)
- **falcon_mamba** -- [FalconMambaModel](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaModel) (FalconMambaConfig model)
- **fast_vlm** -- [FastVlmModel](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmModel) (FastVlmConfig model)
- **fastspeech2_conformer** -- [FastSpeech2ConformerModel](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerModel) (FastSpeech2ConformerConfig model)
- **fastspeech2_conformer_with_hifigan** -- [FastSpeech2ConformerWithHifiGan](/docs/transformers/v5.8.0/en/model_doc/fastspeech2_conformer#transformers.FastSpeech2ConformerWithHifiGan) (FastSpeech2ConformerWithHifiGanConfig model)
- **flaubert** -- [FlaubertModel](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertModel) (FlaubertConfig model)
- **flava** -- [FlavaModel](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaModel) (FlavaConfig model)
- **flex_olmo** -- [FlexOlmoModel](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoModel) (FlexOlmoConfig model)
- **florence2** -- [Florence2Model](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Model) (Florence2Config model)
- **fnet** -- [FNetModel](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetModel) (FNetConfig model)
- **focalnet** -- [FocalNetModel](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetModel) (FocalNetConfig model)
- **fsmt** -- [FSMTModel](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTModel) (FSMTConfig model)
- **funnel** -- [FunnelModel](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelModel) or [FunnelBaseModel](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelBaseModel) (FunnelConfig model)
- **fuyu** -- [FuyuModel](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuModel) (FuyuConfig model)
- **gemma** -- [GemmaModel](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaModel) (GemmaConfig model)
- **gemma2** -- [Gemma2Model](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Model) (Gemma2Config model)
- **gemma3** -- [Gemma3Model](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Model) (Gemma3Config model)
- **gemma3_text** -- [Gemma3TextModel](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextModel) (Gemma3TextConfig model)
- **gemma3n** -- [Gemma3nModel](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nModel) (Gemma3nConfig model)
- **gemma3n_audio** -- `Gemma3nAudioEncoder` (Gemma3nAudioConfig model)
- **gemma3n_text** -- [Gemma3nTextModel](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nTextModel) (Gemma3nTextConfig model)
- **gemma3n_vision** -- [TimmWrapperModel](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperModel) (Gemma3nVisionConfig model)
- **gemma4** -- [Gemma4Model](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Model) (Gemma4Config model)
- **gemma4_audio** -- [Gemma4AudioModel](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4AudioModel) (Gemma4AudioConfig model)
- **gemma4_text** -- [Gemma4TextModel](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4TextModel) (Gemma4TextConfig model)
- **gemma4_vision** -- [Gemma4VisionModel](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4VisionModel) (Gemma4VisionConfig model)
- **git** -- [GitModel](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitModel) (GitConfig model)
- **glm** -- [GlmModel](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmModel) (GlmConfig model)
- **glm4** -- [Glm4Model](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Model) (Glm4Config model)
- **glm46v** -- [Glm46VModel](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VModel) (Glm46VConfig model)
- **glm4_moe** -- [Glm4MoeModel](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeModel) (Glm4MoeConfig model)
- **glm4_moe_lite** -- [Glm4MoeLiteModel](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteModel) (Glm4MoeLiteConfig model)
- **glm4v** -- [Glm4vModel](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vModel) (Glm4vConfig model)
- **glm4v_moe** -- [Glm4vMoeModel](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeModel) (Glm4vMoeConfig model)
- **glm4v_moe_text** -- [Glm4vMoeTextModel](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeTextModel) (Glm4vMoeTextConfig model)
- **glm4v_moe_vision** -- [Glm4vMoeVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeVisionModel) (Glm4vMoeVisionConfig model)
- **glm4v_text** -- [Glm4vTextModel](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vTextModel) (Glm4vTextConfig model)
- **glm4v_vision** -- [Glm4vVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vVisionModel) (Glm4vVisionConfig model)
- **glm_image** -- [GlmImageModel](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageModel) (GlmImageConfig model)
- **glm_image_text** -- [GlmImageTextModel](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageTextModel) (GlmImageTextConfig model)
- **glm_image_vision** -- [GlmImageVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVisionModel) (GlmImageVisionConfig model)
- **glm_image_vqmodel** -- [GlmImageVQVAE](/docs/transformers/v5.8.0/en/model_doc/glm_image#transformers.GlmImageVQVAE) (GlmImageVQVAEConfig model)
- **glm_moe_dsa** -- [GlmMoeDsaModel](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaModel) (GlmMoeDsaConfig model)
- **glm_ocr** -- [GlmOcrModel](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrModel) (GlmOcrConfig model)
- **glm_ocr_text** -- [GlmOcrTextModel](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrTextModel) (GlmOcrTextConfig model)
- **glm_ocr_vision** -- [GlmOcrVisionModel](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrVisionModel) (GlmOcrVisionConfig model)
- **glmasr** -- [GlmAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrForConditionalGeneration) (GlmAsrConfig model)
- **glmasr_encoder** -- [GlmAsrEncoder](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrEncoder) (GlmAsrEncoderConfig model)
- **glpn** -- [GLPNModel](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNModel) (GLPNConfig model)
- **got_ocr2** -- [GotOcr2Model](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Model) (GotOcr2Config model)
- **gpt-sw3** -- [GPT2Model](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Model) (GPT2Config model)
- **gpt2** -- [GPT2Model](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Model) (GPT2Config model)
- **gpt_bigcode** -- [GPTBigCodeModel](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeModel) (GPTBigCodeConfig model)
- **gpt_neo** -- [GPTNeoModel](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoModel) (GPTNeoConfig model)
- **gpt_neox** -- [GPTNeoXModel](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXModel) (GPTNeoXConfig model)
- **gpt_neox_japanese** -- [GPTNeoXJapaneseModel](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseModel) (GPTNeoXJapaneseConfig model)
- **gpt_oss** -- [GptOssModel](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssModel) (GptOssConfig model)
- **gptj** -- [GPTJModel](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJModel) (GPTJConfig model)
- **granite** -- [GraniteModel](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteModel) (GraniteConfig model)
- **granite4_vision** -- [Granite4VisionModel](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionModel) (Granite4VisionConfig model)
- **granite_speech** -- [GraniteSpeechForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechForConditionalGeneration) (GraniteSpeechConfig model)
- **granitemoe** -- [GraniteMoeModel](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeModel) (GraniteMoeConfig model)
- **granitemoehybrid** -- [GraniteMoeHybridModel](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridModel) (GraniteMoeHybridConfig model)
- **granitemoeshared** -- [GraniteMoeSharedModel](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedModel) (GraniteMoeSharedConfig model)
- **grounding-dino** -- [GroundingDinoModel](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoModel) (GroundingDinoConfig model)
- **groupvit** -- [GroupViTModel](/docs/transformers/v5.8.0/en/model_doc/groupvit#transformers.GroupViTModel) (GroupViTConfig model)
- **helium** -- [HeliumModel](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumModel) (HeliumConfig model)
- **hgnet_v2** -- [HGNetV2Backbone](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2Backbone) (HGNetV2Config model)
- **hiera** -- [HieraModel](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraModel) (HieraConfig model)
- **higgs_audio_v2** -- [HiggsAudioV2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2#transformers.HiggsAudioV2ForConditionalGeneration) (HiggsAudioV2Config model)
- **higgs_audio_v2_tokenizer** -- [HiggsAudioV2TokenizerModel](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerModel) (HiggsAudioV2TokenizerConfig model)
- **hubert** -- [HubertModel](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertModel) (HubertConfig model)
- **hunyuan_v1_dense** -- [HunYuanDenseV1Model](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1Model) (HunYuanDenseV1Config model)
- **hunyuan_v1_moe** -- [HunYuanMoEV1Model](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1Model) (HunYuanMoEV1Config model)
- **hy_v3** -- [HYV3Model](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3Model) (HYV3Config model)
- **ibert** -- [IBertModel](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertModel) (IBertConfig model)
- **idefics** -- [IdeficsModel](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsModel) (IdeficsConfig model)
- **idefics2** -- [Idefics2Model](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Model) (Idefics2Config model)
- **idefics3** -- [Idefics3Model](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Model) (Idefics3Config model)
- **idefics3_vision** -- [Idefics3VisionTransformer](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3VisionTransformer) (Idefics3VisionConfig model)
- **ijepa** -- [IJepaModel](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaModel) (IJepaConfig model)
- **imagegpt** -- [ImageGPTModel](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTModel) (ImageGPTConfig model)
- **informer** -- [InformerModel](/docs/transformers/v5.8.0/en/model_doc/informer#transformers.InformerModel) (InformerConfig model)
- **instructblip** -- [InstructBlipModel](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipModel) (InstructBlipConfig model)
- **instructblipvideo** -- [InstructBlipVideoModel](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoModel) (InstructBlipVideoConfig model)
- **internvl** -- [InternVLModel](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLModel) (InternVLConfig model)
- **internvl_vision** -- [InternVLVisionModel](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLVisionModel) (InternVLVisionConfig model)
- **jais2** -- [Jais2Model](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2Model) (Jais2Config model)
- **jamba** -- [JambaModel](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaModel) (JambaConfig model)
- **janus** -- [JanusModel](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusModel) (JanusConfig model)
- **jetmoe** -- [JetMoeModel](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeModel) (JetMoeConfig model)
- **jina_embeddings_v3** -- [JinaEmbeddingsV3Model](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Model) (JinaEmbeddingsV3Config model)
- **kosmos-2** -- [Kosmos2Model](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2Model) (Kosmos2Config model)
- **kosmos-2.5** -- [Kosmos2_5Model](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5Model) (Kosmos2_5Config model)
- **kyutai_speech_to_text** -- [KyutaiSpeechToTextModel](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextModel) (KyutaiSpeechToTextConfig model)
- **laguna** -- [LagunaModel](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaModel) (LagunaConfig model)
- **lasr_ctc** -- [LasrForCTC](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrForCTC) (LasrCTCConfig model)
- **lasr_encoder** -- [LasrEncoder](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrEncoder) (LasrEncoderConfig model)
- **layoutlm** -- [LayoutLMModel](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMModel) (LayoutLMConfig model)
- **layoutlmv2** -- [LayoutLMv2Model](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Model) (LayoutLMv2Config model)
- **layoutlmv3** -- [LayoutLMv3Model](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Model) (LayoutLMv3Config model)
- **led** -- [LEDModel](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDModel) (LEDConfig model)
- **levit** -- [LevitModel](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitModel) (LevitConfig model)
- **lfm2** -- [Lfm2Model](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2Model) (Lfm2Config model)
- **lfm2_moe** -- [Lfm2MoeModel](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeModel) (Lfm2MoeConfig model)
- **lfm2_vl** -- [Lfm2VlModel](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlModel) (Lfm2VlConfig model)
- **lightglue** -- [LightGlueForKeypointMatching](/docs/transformers/v5.8.0/en/model_doc/lightglue#transformers.LightGlueForKeypointMatching) (LightGlueConfig model)
- **lighton_ocr** -- [LightOnOcrModel](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrModel) (LightOnOcrConfig model)
- **lilt** -- [LiltModel](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltModel) (LiltConfig model)
- **llama** -- [LlamaModel](/docs/transformers/v5.8.0/en/model_doc/llama#transformers.LlamaModel) (LlamaConfig model)
- **llama4** -- [Llama4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForConditionalGeneration) (Llama4Config model)
- **llama4_text** -- [Llama4TextModel](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4TextModel) (Llama4TextConfig model)
- **llava** -- [LlavaModel](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaModel) (LlavaConfig model)
- **llava_next** -- [LlavaNextModel](/docs/transformers/v5.8.0/en/model_doc/llava_next#transformers.LlavaNextModel) (LlavaNextConfig model)
- **llava_next_video** -- [LlavaNextVideoModel](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoModel) (LlavaNextVideoConfig model)
- **llava_onevision** -- [LlavaOnevisionModel](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionModel) (LlavaOnevisionConfig model)
- **longcat_flash** -- [LongcatFlashModel](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashModel) (LongcatFlashConfig model)
- **longformer** -- [LongformerModel](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerModel) (LongformerConfig model)
- **longt5** -- [LongT5Model](/docs/transformers/v5.8.0/en/model_doc/longt5#transformers.LongT5Model) (LongT5Config model)
- **luke** -- [LukeModel](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeModel) (LukeConfig model)
- **lw_detr** -- [LwDetrModel](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrModel) (LwDetrConfig model)
- **lxmert** -- [LxmertModel](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertModel) (LxmertConfig model)
- **m2m_100** -- [M2M100Model](/docs/transformers/v5.8.0/en/model_doc/m2m_100#transformers.M2M100Model) (M2M100Config model)
- **mamba** -- [MambaModel](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaModel) (MambaConfig model)
- **mamba2** -- [Mamba2Model](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2Model) (Mamba2Config model)
- **marian** -- [MarianModel](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianModel) (MarianConfig model)
- **markuplm** -- [MarkupLMModel](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMModel) (MarkupLMConfig model)
- **mask2former** -- [Mask2FormerModel](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerModel) (Mask2FormerConfig model)
- **maskformer** -- [MaskFormerModel](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerModel) (MaskFormerConfig model)
- **maskformer-swin** -- `MaskFormerSwinModel` (MaskFormerSwinConfig model)
- **mbart** -- [MBartModel](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartModel) (MBartConfig model)
- **megatron-bert** -- [MegatronBertModel](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertModel) (MegatronBertConfig model)
- **metaclip_2** -- [MetaClip2Model](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Model) (MetaClip2Config model)
- **mgp-str** -- [MgpstrForSceneTextRecognition](/docs/transformers/v5.8.0/en/model_doc/mgp-str#transformers.MgpstrForSceneTextRecognition) (MgpstrConfig model)
- **mimi** -- [MimiModel](/docs/transformers/v5.8.0/en/model_doc/mimi#transformers.MimiModel) (MimiConfig model)
- **minicpmv4_6** -- [MiniCPMV4_6Model](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6Model) (MiniCPMV4_6Config model)
- **minimax** -- [MiniMaxModel](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxModel) (MiniMaxConfig model)
- **minimax_m2** -- [MiniMaxM2Model](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2Model) (MiniMaxM2Config model)
- **ministral** -- [MinistralModel](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralModel) (MinistralConfig model)
- **ministral3** -- [Ministral3Model](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Model) (Ministral3Config model)
- **mistral** -- [MistralModel](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralModel) (MistralConfig model)
- **mistral3** -- [Mistral3Model](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Model) (Mistral3Config model)
- **mistral4** -- [Mistral4Model](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Model) (Mistral4Config model)
- **mixtral** -- [MixtralModel](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralModel) (MixtralConfig model)
- **mlcd** -- [MLCDVisionModel](/docs/transformers/v5.8.0/en/model_doc/mlcd#transformers.MLCDVisionModel) (MLCDVisionConfig model)
- **mlcd_vision_model** -- [MLCDVisionModel](/docs/transformers/v5.8.0/en/model_doc/mlcd#transformers.MLCDVisionModel) (MLCDVisionConfig model)
- **mllama** -- [MllamaModel](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaModel) (MllamaConfig model)
- **mm-grounding-dino** -- [MMGroundingDinoModel](/docs/transformers/v5.8.0/en/model_doc/mm-grounding-dino#transformers.MMGroundingDinoModel) (MMGroundingDinoConfig model)
- **mobilebert** -- [MobileBertModel](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertModel) (MobileBertConfig model)
- **mobilenet_v1** -- [MobileNetV1Model](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1Model) (MobileNetV1Config model)
- **mobilenet_v2** -- [MobileNetV2Model](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2Model) (MobileNetV2Config model)
- **mobilevit** -- [MobileViTModel](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTModel) (MobileViTConfig model)
- **mobilevitv2** -- [MobileViTV2Model](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2Model) (MobileViTV2Config model)
- **modernbert** -- [ModernBertModel](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertModel) (ModernBertConfig model)
- **modernbert-decoder** -- [ModernBertDecoderModel](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderModel) (ModernBertDecoderConfig model)
- **modernvbert** -- [ModernVBertModel](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertModel) (ModernVBertConfig model)
- **moonshine** -- [MoonshineModel](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineModel) (MoonshineConfig model)
- **moonshine_streaming** -- [MoonshineStreamingModel](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingModel) (MoonshineStreamingConfig model)
- **moshi** -- [MoshiModel](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiModel) (MoshiConfig model)
- **mpnet** -- [MPNetModel](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetModel) (MPNetConfig model)
- **mpt** -- [MptModel](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptModel) (MptConfig model)
- **mra** -- [MraModel](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraModel) (MraConfig model)
- **mt5** -- [MT5Model](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Model) (MT5Config model)
- **musicflamingo** -- [MusicFlamingoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoForConditionalGeneration) (MusicFlamingoConfig model)
- **musicgen** -- [MusicgenModel](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenModel) (MusicgenConfig model)
- **musicgen_melody** -- [MusicgenMelodyModel](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyModel) (MusicgenMelodyConfig model)
- **mvp** -- [MvpModel](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpModel) (MvpConfig model)
- **nanochat** -- [NanoChatModel](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatModel) (NanoChatConfig model)
- **nemotron** -- [NemotronModel](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronModel) (NemotronConfig model)
- **nemotron_h** -- [NemotronHModel](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHModel) (NemotronHConfig model)
- **nllb-moe** -- [NllbMoeModel](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeModel) (NllbMoeConfig model)
- **nomic_bert** -- [NomicBertModel](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertModel) (NomicBertConfig model)
- **nystromformer** -- [NystromformerModel](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerModel) (NystromformerConfig model)
- **olmo** -- [OlmoModel](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoModel) (OlmoConfig model)
- **olmo2** -- [Olmo2Model](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2Model) (Olmo2Config model)
- **olmo3** -- [Olmo3Model](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3Model) (Olmo3Config model)
- **olmo_hybrid** -- [OlmoHybridModel](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridModel) (OlmoHybridConfig model)
- **olmoe** -- [OlmoeModel](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeModel) (OlmoeConfig model)
- **omdet-turbo** -- [OmDetTurboForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/omdet-turbo#transformers.OmDetTurboForObjectDetection) (OmDetTurboConfig model)
- **oneformer** -- [OneFormerModel](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerModel) (OneFormerConfig model)
- **openai-gpt** -- [OpenAIGPTModel](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTModel) (OpenAIGPTConfig model)
- **openai_privacy_filter** -- [OpenAIPrivacyFilterModel](/docs/transformers/v5.8.0/en/model_doc/openai_privacy_filter#transformers.OpenAIPrivacyFilterModel) (OpenAIPrivacyFilterConfig model)
- **opt** -- [OPTModel](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTModel) (OPTConfig model)
- **ovis2** -- [Ovis2Model](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2Model) (Ovis2Config model)
- **owlv2** -- [Owlv2Model](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2Model) (Owlv2Config model)
- **owlvit** -- [OwlViTModel](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTModel) (OwlViTConfig model)
- **paligemma** -- [PaliGemmaModel](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaModel) (PaliGemmaConfig model)
- **parakeet_ctc** -- [ParakeetForCTC](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetForCTC) (ParakeetCTCConfig model)
- **parakeet_encoder** -- [ParakeetEncoder](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetEncoder) (ParakeetEncoderConfig model)
- **patchtsmixer** -- [PatchTSMixerModel](/docs/transformers/v5.8.0/en/model_doc/patchtsmixer#transformers.PatchTSMixerModel) (PatchTSMixerConfig model)
- **patchtst** -- [PatchTSTModel](/docs/transformers/v5.8.0/en/model_doc/patchtst#transformers.PatchTSTModel) (PatchTSTConfig model)
- **pe_audio** -- [PeAudioModel](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioModel) (PeAudioConfig model)
- **pe_audio_encoder** -- [PeAudioEncoder](/docs/transformers/v5.8.0/en/model_doc/pe_audio#transformers.PeAudioEncoder) (PeAudioEncoderConfig model)
- **pe_audio_video** -- [PeAudioVideoModel](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoModel) (PeAudioVideoConfig model)
- **pe_audio_video_encoder** -- [PeAudioVideoEncoder](/docs/transformers/v5.8.0/en/model_doc/pe_audio_video#transformers.PeAudioVideoEncoder) (PeAudioVideoEncoderConfig model)
- **pe_video** -- [PeVideoModel](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoModel) (PeVideoConfig model)
- **pe_video_encoder** -- [PeVideoEncoder](/docs/transformers/v5.8.0/en/model_doc/pe_video#transformers.PeVideoEncoder) (PeVideoEncoderConfig model)
- **pegasus** -- [PegasusModel](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusModel) (PegasusConfig model)
- **pegasus_x** -- [PegasusXModel](/docs/transformers/v5.8.0/en/model_doc/pegasus_x#transformers.PegasusXModel) (PegasusXConfig model)
- **perceiver** -- [PerceiverModel](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverModel) (PerceiverConfig model)
- **perception_lm** -- [PerceptionLMModel](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMModel) (PerceptionLMConfig model)
- **persimmon** -- [PersimmonModel](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonModel) (PersimmonConfig model)
- **phi** -- [PhiModel](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiModel) (PhiConfig model)
- **phi3** -- [Phi3Model](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Model) (Phi3Config model)
- **phi4_multimodal** -- [Phi4MultimodalModel](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalModel) (Phi4MultimodalConfig model)
- **phimoe** -- [PhimoeModel](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeModel) (PhimoeConfig model)
- **pi0** -- [PI0Model](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0Model) (PI0Config model)
- **pixio** -- [PixioModel](/docs/transformers/v5.8.0/en/model_doc/pixio#transformers.PixioModel) (PixioConfig model)
- **pixtral** -- [PixtralVisionModel](/docs/transformers/v5.8.0/en/model_doc/pixtral#transformers.PixtralVisionModel) (PixtralVisionConfig model)
- **plbart** -- [PLBartModel](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartModel) (PLBartConfig model)
- **poolformer** -- [PoolFormerModel](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerModel) (PoolFormerConfig model)
- **pp_doclayout_v3** -- [PPDocLayoutV3Model](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3Model) (PPDocLayoutV3Config model)
- **pp_ocrv5_mobile_rec** -- [PPOCRV5MobileRecModel](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecModel) (PPOCRV5MobileRecConfig model)
- **pp_ocrv5_server_rec** -- [PPOCRV5ServerRecModel](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecModel) (PPOCRV5ServerRecConfig model)
- **prophetnet** -- [ProphetNetModel](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetModel) (ProphetNetConfig model)
- **pvt** -- [PvtModel](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtModel) (PvtConfig model)
- **pvt_v2** -- [PvtV2Model](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2Model) (PvtV2Config model)
- **qianfan_ocr** -- [QianfanOCRModel](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRModel) (QianfanOCRConfig model)
- **qianfan_ocr_vision** -- [QianfanOCRVisionModel](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRVisionModel) (QianfanOCRVisionConfig model)
- **qwen2** -- [Qwen2Model](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Model) (Qwen2Config model)
- **qwen2_5_vl** -- [Qwen2_5_VLModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLModel) (Qwen2_5_VLConfig model)
- **qwen2_5_vl_text** -- [Qwen2_5_VLTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLTextModel) (Qwen2_5_VLTextConfig model)
- **qwen2_audio_encoder** -- [Qwen2AudioEncoder](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioEncoder) (Qwen2AudioEncoderConfig model)
- **qwen2_moe** -- [Qwen2MoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeModel) (Qwen2MoeConfig model)
- **qwen2_vl** -- [Qwen2VLModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLModel) (Qwen2VLConfig model)
- **qwen2_vl_text** -- [Qwen2VLTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLTextModel) (Qwen2VLTextConfig model)
- **qwen3** -- [Qwen3Model](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Model) (Qwen3Config model)
- **qwen3_5** -- [Qwen3_5Model](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Model) (Qwen3_5Config model)
- **qwen3_5_moe** -- [Qwen3_5MoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeModel) (Qwen3_5MoeConfig model)
- **qwen3_5_moe_text** -- [Qwen3_5MoeTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeTextModel) (Qwen3_5MoeTextConfig model)
- **qwen3_5_text** -- [Qwen3_5TextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5TextModel) (Qwen3_5TextConfig model)
- **qwen3_moe** -- [Qwen3MoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeModel) (Qwen3MoeConfig model)
- **qwen3_next** -- [Qwen3NextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextModel) (Qwen3NextConfig model)
- **qwen3_vl** -- [Qwen3VLModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLModel) (Qwen3VLConfig model)
- **qwen3_vl_moe** -- [Qwen3VLMoeModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeModel) (Qwen3VLMoeConfig model)
- **qwen3_vl_moe_text** -- [Qwen3VLMoeTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeTextModel) (Qwen3VLMoeTextConfig model)
- **qwen3_vl_text** -- [Qwen3VLTextModel](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLTextModel) (Qwen3VLTextConfig model)
- **recurrent_gemma** -- [RecurrentGemmaModel](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaModel) (RecurrentGemmaConfig model)
- **reformer** -- [ReformerModel](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerModel) (ReformerConfig model)
- **regnet** -- [RegNetModel](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetModel) (RegNetConfig model)
- **rembert** -- [RemBertModel](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertModel) (RemBertConfig model)
- **resnet** -- [ResNetModel](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetModel) (ResNetConfig model)
- **roberta** -- [RobertaModel](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaModel) (RobertaConfig model)
- **roberta-prelayernorm** -- [RobertaPreLayerNormModel](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormModel) (RobertaPreLayerNormConfig model)
- **roc_bert** -- [RoCBertModel](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertModel) (RoCBertConfig model)
- **roformer** -- [RoFormerModel](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerModel) (RoFormerConfig model)
- **rt_detr** -- [RTDetrModel](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrModel) (RTDetrConfig model)
- **rt_detr_v2** -- [RTDetrV2Model](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2Model) (RTDetrV2Config model)
- **rwkv** -- [RwkvModel](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvModel) (RwkvConfig model)
- **sam** -- [SamModel](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamModel) (SamConfig model)
- **sam2** -- [Sam2Model](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2Model) (Sam2Config model)
- **sam2_hiera_det_model** -- [Sam2HieraDetModel](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2HieraDetModel) (Sam2HieraDetConfig model)
- **sam2_video** -- [Sam2VideoModel](/docs/transformers/v5.8.0/en/model_doc/sam2_video#transformers.Sam2VideoModel) (Sam2VideoConfig model)
- **sam2_vision_model** -- [Sam2VisionModel](/docs/transformers/v5.8.0/en/model_doc/sam2#transformers.Sam2VisionModel) (Sam2VisionConfig model)
- **sam3** -- [Sam3Model](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3Model) (Sam3Config model)
- **sam3_lite_text** -- [Sam3LiteTextModel](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextModel) (Sam3LiteTextConfig model)
- **sam3_lite_text_text_model** -- [Sam3LiteTextTextModel](/docs/transformers/v5.8.0/en/model_doc/sam3_lite_text#transformers.Sam3LiteTextTextModel) (Sam3LiteTextTextConfig model)
- **sam3_tracker** -- [Sam3TrackerModel](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker#transformers.Sam3TrackerModel) (Sam3TrackerConfig model)
- **sam3_tracker_video** -- [Sam3TrackerVideoModel](/docs/transformers/v5.8.0/en/model_doc/sam3_tracker_video#transformers.Sam3TrackerVideoModel) (Sam3TrackerVideoConfig model)
- **sam3_video** -- [Sam3VideoModel](/docs/transformers/v5.8.0/en/model_doc/sam3_video#transformers.Sam3VideoModel) (Sam3VideoConfig model)
- **sam3_vision_model** -- [Sam3VisionModel](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3VisionModel) (Sam3VisionConfig model)
- **sam3_vit_model** -- [Sam3ViTModel](/docs/transformers/v5.8.0/en/model_doc/sam3#transformers.Sam3ViTModel) (Sam3ViTConfig model)
- **sam_hq** -- [SamHQModel](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQModel) (SamHQConfig model)
- **sam_hq_vision_model** -- [SamHQVisionModel](/docs/transformers/v5.8.0/en/model_doc/sam_hq#transformers.SamHQVisionModel) (SamHQVisionConfig model)
- **sam_vision_model** -- [SamVisionModel](/docs/transformers/v5.8.0/en/model_doc/sam#transformers.SamVisionModel) (SamVisionConfig model)
- **seamless_m4t** -- [SeamlessM4TModel](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TModel) (SeamlessM4TConfig model)
- **seamless_m4t_v2** -- [SeamlessM4Tv2Model](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2Model) (SeamlessM4Tv2Config model)
- **seed_oss** -- [SeedOssModel](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssModel) (SeedOssConfig model)
- **segformer** -- [SegformerModel](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerModel) (SegformerConfig model)
- **seggpt** -- [SegGptModel](/docs/transformers/v5.8.0/en/model_doc/seggpt#transformers.SegGptModel) (SegGptConfig model)
- **sew** -- [SEWModel](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWModel) (SEWConfig model)
- **sew-d** -- [SEWDModel](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDModel) (SEWDConfig model)
- **siglip** -- [SiglipModel](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipModel) (SiglipConfig model)
- **siglip2** -- [Siglip2Model](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Model) (Siglip2Config model)
- **siglip2_vision_model** -- [Siglip2VisionModel](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2VisionModel) (Siglip2VisionConfig model)
- **siglip_vision_model** -- [SiglipVisionModel](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipVisionModel) (SiglipVisionConfig model)
- **smollm3** -- [SmolLM3Model](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Model) (SmolLM3Config model)
- **smolvlm** -- [SmolVLMModel](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMModel) (SmolVLMConfig model)
- **smolvlm_vision** -- [SmolVLMVisionTransformer](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMVisionTransformer) (SmolVLMVisionConfig model)
- **solar_open** -- [SolarOpenModel](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenModel) (SolarOpenConfig model)
- **speech_to_text** -- [Speech2TextModel](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextModel) (Speech2TextConfig model)
- **speecht5** -- [SpeechT5Model](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5Model) (SpeechT5Config model)
- **splinter** -- [SplinterModel](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterModel) (SplinterConfig model)
- **squeezebert** -- [SqueezeBertModel](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertModel) (SqueezeBertConfig model)
- **stablelm** -- [StableLmModel](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmModel) (StableLmConfig model)
- **starcoder2** -- [Starcoder2Model](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Model) (Starcoder2Config model)
- **swiftformer** -- [SwiftFormerModel](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerModel) (SwiftFormerConfig model)
- **swin** -- [SwinModel](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinModel) (SwinConfig model)
- **swin2sr** -- [Swin2SRModel](/docs/transformers/v5.8.0/en/model_doc/swin2sr#transformers.Swin2SRModel) (Swin2SRConfig model)
- **swinv2** -- [Swinv2Model](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2Model) (Swinv2Config model)
- **switch_transformers** -- [SwitchTransformersModel](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersModel) (SwitchTransformersConfig model)
- **t5** -- [T5Model](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Model) (T5Config model)
- **t5gemma** -- [T5GemmaModel](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaModel) (T5GemmaConfig model)
- **t5gemma2** -- [T5Gemma2Model](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Model) (T5Gemma2Config model)
- **t5gemma2_encoder** -- `T5Gemma2Encoder` (T5Gemma2EncoderConfig model)
- **table-transformer** -- [TableTransformerModel](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerModel) (TableTransformerConfig model)
- **tapas** -- [TapasModel](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasModel) (TapasConfig model)
- **textnet** -- [TextNetModel](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetModel) (TextNetConfig model)
- **time_series_transformer** -- [TimeSeriesTransformerModel](/docs/transformers/v5.8.0/en/model_doc/time_series_transformer#transformers.TimeSeriesTransformerModel) (TimeSeriesTransformerConfig model)
- **timesfm** -- [TimesFmModel](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmModel) (TimesFmConfig model)
- **timesfm2_5** -- [TimesFm2_5Model](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5Model) (TimesFm2_5Config model)
- **timesformer** -- [TimesformerModel](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerModel) (TimesformerConfig model)
- **timm_backbone** -- [TimmBackbone](/docs/transformers/v5.8.0/en/main_classes/backbones#transformers.TimmBackbone) (TimmBackboneConfig model)
- **timm_wrapper** -- [TimmWrapperModel](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperModel) (TimmWrapperConfig model)
- **tvp** -- [TvpModel](/docs/transformers/v5.8.0/en/model_doc/tvp#transformers.TvpModel) (TvpConfig model)
- **udop** -- [UdopModel](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopModel) (UdopConfig model)
- **umt5** -- [UMT5Model](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Model) (UMT5Config model)
- **unispeech** -- [UniSpeechModel](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechModel) (UniSpeechConfig model)
- **unispeech-sat** -- [UniSpeechSatModel](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatModel) (UniSpeechSatConfig model)
- **univnet** -- [UnivNetModel](/docs/transformers/v5.8.0/en/model_doc/univnet#transformers.UnivNetModel) (UnivNetConfig model)
- **uvdoc** -- [UVDocModel](/docs/transformers/v5.8.0/en/model_doc/uvdoc#transformers.UVDocModel) (UVDocConfig model)
- **vaultgemma** -- [VaultGemmaModel](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaModel) (VaultGemmaConfig model)
- **vibevoice_acoustic_tokenizer** -- [VibeVoiceAcousticTokenizerModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerModel) (VibeVoiceAcousticTokenizerConfig model)
- **vibevoice_acoustic_tokenizer_decoder** -- [VibeVoiceAcousticTokenizerDecoderModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerDecoderModel) (VibeVoiceAcousticTokenizerDecoderConfig model)
- **vibevoice_acoustic_tokenizer_encoder** -- [VibeVoiceAcousticTokenizerEncoderModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerEncoderModel) (VibeVoiceAcousticTokenizerEncoderConfig model)
- **vibevoice_asr** -- [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
- **video_llama_3** -- [VideoLlama3Model](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3Model) (VideoLlama3Config model)
- **video_llama_3_vision** -- [VideoLlama3VisionModel](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3VisionModel) (VideoLlama3VisionConfig model)
- **video_llava** -- [VideoLlavaModel](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaModel) (VideoLlavaConfig model)
- **videomae** -- [VideoMAEModel](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEModel) (VideoMAEConfig model)
- **vilt** -- [ViltModel](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltModel) (ViltConfig model)
- **vipllava** -- [VipLlavaModel](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaModel) (VipLlavaConfig model)
- **vision-text-dual-encoder** -- [VisionTextDualEncoderModel](/docs/transformers/v5.8.0/en/model_doc/vision-text-dual-encoder#transformers.VisionTextDualEncoderModel) (VisionTextDualEncoderConfig model)
- **visual_bert** -- [VisualBertModel](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertModel) (VisualBertConfig model)
- **vit** -- [ViTModel](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTModel) (ViTConfig model)
- **vit_mae** -- [ViTMAEModel](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEModel) (ViTMAEConfig model)
- **vit_msn** -- [ViTMSNModel](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNModel) (ViTMSNConfig model)
- **vitdet** -- [VitDetModel](/docs/transformers/v5.8.0/en/model_doc/vitdet#transformers.VitDetModel) (VitDetConfig model)
- **vits** -- [VitsModel](/docs/transformers/v5.8.0/en/model_doc/vits#transformers.VitsModel) (VitsConfig model)
- **vivit** -- [VivitModel](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitModel) (VivitConfig model)
- **vjepa2** -- [VJEPA2Model](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2Model) (VJEPA2Config model)
- **voxtral** -- [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
- **voxtral_encoder** -- [VoxtralEncoder](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralEncoder) (VoxtralEncoderConfig model)
- **voxtral_realtime** -- [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
- **voxtral_realtime_encoder** -- [VoxtralRealtimeEncoder](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeEncoder) (VoxtralRealtimeEncoderConfig model)
- **voxtral_realtime_text** -- `VoxtralRealtimeTextModel` (VoxtralRealtimeTextConfig model)
- **wav2vec2** -- [Wav2Vec2Model](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Model) (Wav2Vec2Config model)
- **wav2vec2-bert** -- [Wav2Vec2BertModel](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertModel) (Wav2Vec2BertConfig model)
- **wav2vec2-conformer** -- [Wav2Vec2ConformerModel](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerModel) (Wav2Vec2ConformerConfig model)
- **wavlm** -- [WavLMModel](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMModel) (WavLMConfig model)
- **whisper** -- [WhisperModel](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperModel) (WhisperConfig model)
- **xclip** -- [XCLIPModel](/docs/transformers/v5.8.0/en/model_doc/xclip#transformers.XCLIPModel) (XCLIPConfig model)
- **xcodec** -- [XcodecModel](/docs/transformers/v5.8.0/en/model_doc/xcodec#transformers.XcodecModel) (XcodecConfig model)
- **xglm** -- [XGLMModel](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMModel) (XGLMConfig model)
- **xlm** -- [XLMModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMModel) (XLMConfig model)
- **xlm-roberta** -- [XLMRobertaModel](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaModel) (XLMRobertaConfig model)
- **xlm-roberta-xl** -- [XLMRobertaXLModel](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLModel) (XLMRobertaXLConfig model)
- **xlnet** -- [XLNetModel](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetModel) (XLNetConfig model)
- **xlstm** -- [xLSTMModel](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMModel) (xLSTMConfig model)
- **xmod** -- [XmodModel](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodModel) (XmodConfig model)
- **yolos** -- [YolosModel](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosModel) (YolosConfig model)
- **yoso** -- [YosoModel](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoModel) (YosoConfig model)
- **youtu** -- [YoutuModel](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuModel) (YoutuConfig model)
- **zamba** -- [ZambaModel](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaModel) (ZambaConfig model)
- **zamba2** -- [Zamba2Model](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2Model) (Zamba2Config model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, first set it back in training mode with `model.train()`.
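The toggle can be sketched offline with a small, randomly initialized model (the tiny `BertConfig` sizes below are arbitrary, chosen only to keep the example fast; no weights are downloaded):

```python
from transformers import AutoModel, BertConfig

# Tiny, randomly initialized BERT -- built locally, no download needed.
config = BertConfig(hidden_size=32, num_hidden_layers=2, num_attention_heads=2, intermediate_size=64)
model = AutoModel.from_config(config)

model.eval()  # inference mode: dropout modules are deactivated
assert not model.training

model.train()  # back to training mode: dropout is active again
assert model.training
```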

Examples:

```python
>>> from transformers import AutoConfig, AutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (i.e., do not try to download the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
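When you supply a `config` yourself, no configuration-level updates happen at load time: the model takes the configuration as-is. This can be sketched offline with a locally constructed config (the tiny, hypothetical sizes below keep the random-weight model small; `from_pretrained` with downloaded weights behaves the same way when passed `config=...`):

```python
from transformers import AutoModel, BertConfig

# Build the configuration yourself and set attributes up front...
config = BertConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    output_attentions=True,
)

# ...the auto class then uses it verbatim; nothing is overridden at load time.
model = AutoModel.from_config(config)
assert model.config.output_attentions
```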

## Generic pretraining classes

The following auto classes are available for instantiating a model with a pretraining head.

### AutoModelForPreTraining[[transformers.AutoModelForPreTraining]]

#### transformers.AutoModelForPreTraining[[transformers.AutoModelForPreTraining]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2004)

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForPreTraining.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForPreTraining` (AlbertConfig model)
  - [AudioFlamingo3Config](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3Config) configuration class: [AudioFlamingo3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3ForConditionalGeneration) (AudioFlamingo3Config model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForConditionalGeneration) (BartConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForPreTraining) (BertConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForPreTraining](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForPreTraining) (BigBirdConfig model)
  - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForCausalLM) (BloomConfig model)
  - [CTRLConfig](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLLMHeadModel) (CTRLConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForMaskedLM) (CamembertConfig model)
  - [ColModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/colmodernvbert#transformers.ColModernVBertConfig) configuration class: [ColModernVBertForRetrieval](/docs/transformers/v5.8.0/en/model_doc/colmodernvbert#transformers.ColModernVBertForRetrieval) (ColModernVBertConfig model)
  - [ColPaliConfig](/docs/transformers/v5.8.0/en/model_doc/colpali#transformers.ColPaliConfig) configuration class: [ColPaliForRetrieval](/docs/transformers/v5.8.0/en/model_doc/colpali#transformers.ColPaliForRetrieval) (ColPaliConfig model)
  - [ColQwen2Config](/docs/transformers/v5.8.0/en/model_doc/colqwen2#transformers.ColQwen2Config) configuration class: [ColQwen2ForRetrieval](/docs/transformers/v5.8.0/en/model_doc/colqwen2#transformers.ColQwen2ForRetrieval) (ColQwen2Config model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecTextConfig model)
  - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForMaskedLM) (DebertaConfig model)
  - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DebertaV2Config model)
  - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForMaskedLM) (DistilBertConfig model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForPreTraining](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForPreTraining) (ElectraConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForPreTraining](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForPreTraining) (ErnieConfig model)
  - [EvollaConfig](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaConfig) configuration class: [EvollaForProteinText2Text](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaForProteinText2Text) (EvollaConfig model)
  - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForCausalLM) (Exaone4Config model)
  - [ExaoneMoeConfig](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeConfig) configuration class: [ExaoneMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeForCausalLM) (ExaoneMoeConfig model)
  - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForPreTraining](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForPreTraining) (FNetConfig model)
  - [FSMTConfig](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTConfig) configuration class: [FSMTForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTForConditionalGeneration) (FSMTConfig model)
  - [FalconMambaConfig](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaConfig) configuration class: [FalconMambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaForCausalLM) (FalconMambaConfig model)
  - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertWithLMHeadModel) (FlaubertConfig model)
  - [FlavaConfig](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaConfig) configuration class: [FlavaForPreTraining](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaForPreTraining) (FlavaConfig model)
  - [Florence2Config](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Config) configuration class: [Florence2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2ForConditionalGeneration) (Florence2Config model)
  - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelForPreTraining](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForPreTraining) (FunnelConfig model)
  - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2LMHeadModel](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2LMHeadModel) (GPT2Config model)
  - [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) configuration class: [GPTBigCodeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForCausalLM) (GPTBigCodeConfig model)
  - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model)
  - [Gemma4Config](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Config) configuration class: [Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model)
  - [GlmAsrConfig](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrConfig) configuration class: [GlmAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrForConditionalGeneration) (GlmAsrConfig model)
  - [HieraConfig](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraConfig) configuration class: [HieraForPreTraining](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraForPreTraining) (HieraConfig model)
  - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForMaskedLM) (IBertConfig model)
  - [Idefics2Config](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Config) configuration class: [Idefics2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2ForConditionalGeneration) (Idefics2Config model)
  - [Idefics3Config](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Config) configuration class: [Idefics3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3ForConditionalGeneration) (Idefics3Config model)
  - [IdeficsConfig](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsConfig) configuration class: [IdeficsForVisionText2Text](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsForVisionText2Text) (IdeficsConfig model)
  - [JanusConfig](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusConfig) configuration class: [JanusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusForConditionalGeneration) (JanusConfig model)
  - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForMaskedLM) (LayoutLMConfig model)
  - [LlavaConfig](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaConfig) configuration class: [LlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaForConditionalGeneration) (LlavaConfig model)
  - [LlavaNextConfig](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextConfig) configuration class: [LlavaNextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextForConditionalGeneration) (LlavaNextConfig model)
  - [LlavaNextVideoConfig](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoConfig) configuration class: [LlavaNextVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoForConditionalGeneration) (LlavaNextVideoConfig model)
  - [LlavaOnevisionConfig](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionConfig) configuration class: [LlavaOnevisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionForConditionalGeneration) (LlavaOnevisionConfig model)
  - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForMaskedLM) (LongformerConfig model)
  - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForMaskedLM) (LukeConfig model)
  - [LxmertConfig](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertConfig) configuration class: [LxmertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertForPreTraining) (LxmertConfig model)
  - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForMaskedLM) (MPNetConfig model)
  - [Mamba2Config](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2ForCausalLM) (Mamba2Config model)
  - [MambaConfig](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaForCausalLM) (MambaConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForPreTraining) (MegatronBertConfig model)
  - [Mistral3Config](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Config) configuration class: [Mistral3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3ForConditionalGeneration) (Mistral3Config model)
  - [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: [Mistral4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForCausalLM) (Mistral4Config model)
  - [MllamaConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaConfig) configuration class: [MllamaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForConditionalGeneration) (MllamaConfig model)
  - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForPreTraining) (MobileBertConfig model)
  - [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForCausalLM) (MptConfig model)
  - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForMaskedLM) (MraConfig model)
  - [MusicFlamingoConfig](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoConfig) configuration class: [MusicFlamingoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoForConditionalGeneration) (MusicFlamingoConfig model)
  - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForConditionalGeneration) (MvpConfig model)
  - [NanoChatConfig](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatConfig) configuration class: [NanoChatForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatForCausalLM) (NanoChatConfig model)
  - [NllbMoeConfig](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeConfig) configuration class: [NllbMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeForConditionalGeneration) (NllbMoeConfig model)
  - [OpenAIGPTConfig](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAIGPTConfig model)
  - [PaliGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: [PaliGemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemmaConfig model)
  - [Qwen2AudioConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioConfig) configuration class: [Qwen2AudioForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioForConditionalGeneration) (Qwen2AudioConfig model)
  - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForPreTraining) (RoCBertConfig model)
  - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForMaskedLM) (RobertaConfig model)
  - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForMaskedLM) (RobertaPreLayerNormConfig model)
  - [RwkvConfig](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvConfig) configuration class: [RwkvForCausalLM](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvForCausalLM) (RwkvConfig model)
  - [SplinterConfig](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterConfig) configuration class: [SplinterForPreTraining](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterForPreTraining) (SplinterConfig model)
  - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForMaskedLM) (SqueezeBertConfig model)
  - [SwitchTransformersConfig](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersConfig) configuration class: [SwitchTransformersForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersForConditionalGeneration) (SwitchTransformersConfig model)
  - [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) configuration class: [T5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForConditionalGeneration) (T5Config model)
  - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForConditionalGeneration) (T5Gemma2Config model)
  - [T5GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaConfig) configuration class: [T5GemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForConditionalGeneration) (T5GemmaConfig model)
  - [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) configuration class: [TapasForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForMaskedLM) (TapasConfig model)
  - [UniSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechConfig) configuration class: [UniSpeechForPreTraining](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechForPreTraining) (UniSpeechConfig model)
  - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: [UniSpeechSatForPreTraining](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForPreTraining) (UniSpeechSatConfig model)
  - [ViTMAEConfig](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEConfig) configuration class: [ViTMAEForPreTraining](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEForPreTraining) (ViTMAEConfig model)
  - [VibeVoiceAsrConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrConfig) configuration class: [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
  - [VideoLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaConfig) configuration class: [VideoLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaForConditionalGeneration) (VideoLlavaConfig model)
  - [VideoMAEConfig](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEConfig) configuration class: [VideoMAEForPreTraining](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEForPreTraining) (VideoMAEConfig model)
  - [VipLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaConfig) configuration class: [VipLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaForConditionalGeneration) (VipLlavaConfig model)
  - [VisualBertConfig](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertConfig) configuration class: [VisualBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertForPreTraining) (VisualBertConfig model)
  - [VoxtralConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralConfig) configuration class: [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
  - [VoxtralRealtimeConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeConfig) configuration class: [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
  - [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2ForPreTraining](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForPreTraining) (Wav2Vec2Config model)
  - [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerForPreTraining](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForPreTraining) (Wav2Vec2ConformerConfig model)
  - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMWithLMHeadModel) (XLMConfig model)
  - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForMaskedLM) (XLMRobertaConfig model)
  - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForMaskedLM) (XLMRobertaXLConfig model)
  - [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetLMHeadModel) (XLNetConfig model)
  - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForMaskedLM) (XmodConfig model)
  - [xLSTMConfig](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMConfig) configuration class: [xLSTMForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMForCausalLM) (xLSTMConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForPreTraining.from_config(config)
```
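As the note above says, `from_config()` builds the architecture with freshly initialized (random) weights rather than loading a checkpoint. A minimal sketch, assuming `transformers` and `torch` are installed; the tiny config sizes are arbitrary and chosen only to keep the model small:

```python
from transformers import BertConfig, AutoModelForPreTraining

# Build a tiny BERT config locally (sizes are arbitrary, for illustration only).
config = BertConfig(
    hidden_size=64,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=128,
)

# from_config() dispatches on the config class: BertConfig -> BertForPreTraining.
# The weights are randomly initialized, not loaded from a checkpoint.
model = AutoModelForPreTraining.from_config(config)
print(type(model).__name__)  # BertForPreTraining
```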

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForPreTraining` (AlbertConfig model)
  - [AudioFlamingo3Config](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3Config) configuration class: [AudioFlamingo3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3ForConditionalGeneration) (AudioFlamingo3Config model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForConditionalGeneration) (BartConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForPreTraining) (BertConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForPreTraining](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForPreTraining) (BigBirdConfig model)
  - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForCausalLM) (BloomConfig model)
  - [CTRLConfig](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLLMHeadModel) (CTRLConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForMaskedLM) (CamembertConfig model)
  - [ColModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/colmodernvbert#transformers.ColModernVBertConfig) configuration class: [ColModernVBertForRetrieval](/docs/transformers/v5.8.0/en/model_doc/colmodernvbert#transformers.ColModernVBertForRetrieval) (ColModernVBertConfig model)
  - [ColPaliConfig](/docs/transformers/v5.8.0/en/model_doc/colpali#transformers.ColPaliConfig) configuration class: [ColPaliForRetrieval](/docs/transformers/v5.8.0/en/model_doc/colpali#transformers.ColPaliForRetrieval) (ColPaliConfig model)
  - [ColQwen2Config](/docs/transformers/v5.8.0/en/model_doc/colqwen2#transformers.ColQwen2Config) configuration class: [ColQwen2ForRetrieval](/docs/transformers/v5.8.0/en/model_doc/colqwen2#transformers.ColQwen2ForRetrieval) (ColQwen2Config model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecTextConfig model)
  - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForMaskedLM) (DebertaConfig model)
  - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DebertaV2Config model)
  - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForMaskedLM) (DistilBertConfig model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForPreTraining](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForPreTraining) (ElectraConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForPreTraining](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForPreTraining) (ErnieConfig model)
  - [EvollaConfig](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaConfig) configuration class: [EvollaForProteinText2Text](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaForProteinText2Text) (EvollaConfig model)
  - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForCausalLM) (Exaone4Config model)
  - [ExaoneMoeConfig](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeConfig) configuration class: [ExaoneMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeForCausalLM) (ExaoneMoeConfig model)
  - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForPreTraining](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForPreTraining) (FNetConfig model)
  - [FSMTConfig](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTConfig) configuration class: [FSMTForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTForConditionalGeneration) (FSMTConfig model)
  - [FalconMambaConfig](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaConfig) configuration class: [FalconMambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaForCausalLM) (FalconMambaConfig model)
  - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertWithLMHeadModel) (FlaubertConfig model)
  - [FlavaConfig](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaConfig) configuration class: [FlavaForPreTraining](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaForPreTraining) (FlavaConfig model)
  - [Florence2Config](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Config) configuration class: [Florence2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2ForConditionalGeneration) (Florence2Config model)
  - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelForPreTraining](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForPreTraining) (FunnelConfig model)
  - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2LMHeadModel](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2LMHeadModel) (GPT2Config model)
  - [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) configuration class: [GPTBigCodeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForCausalLM) (GPTBigCodeConfig model)
  - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model)
  - [Gemma4Config](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Config) configuration class: [Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model)
  - [GlmAsrConfig](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrConfig) configuration class: [GlmAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrForConditionalGeneration) (GlmAsrConfig model)
  - [HieraConfig](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraConfig) configuration class: [HieraForPreTraining](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraForPreTraining) (HieraConfig model)
  - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForMaskedLM) (IBertConfig model)
  - [Idefics2Config](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Config) configuration class: [Idefics2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2ForConditionalGeneration) (Idefics2Config model)
  - [Idefics3Config](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Config) configuration class: [Idefics3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3ForConditionalGeneration) (Idefics3Config model)
  - [IdeficsConfig](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsConfig) configuration class: [IdeficsForVisionText2Text](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsForVisionText2Text) (IdeficsConfig model)
  - [JanusConfig](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusConfig) configuration class: [JanusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusForConditionalGeneration) (JanusConfig model)
  - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForMaskedLM) (LayoutLMConfig model)
  - [LlavaConfig](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaConfig) configuration class: [LlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaForConditionalGeneration) (LlavaConfig model)
  - [LlavaNextConfig](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextConfig) configuration class: [LlavaNextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextForConditionalGeneration) (LlavaNextConfig model)
  - [LlavaNextVideoConfig](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoConfig) configuration class: [LlavaNextVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoForConditionalGeneration) (LlavaNextVideoConfig model)
  - [LlavaOnevisionConfig](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionConfig) configuration class: [LlavaOnevisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionForConditionalGeneration) (LlavaOnevisionConfig model)
  - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForMaskedLM) (LongformerConfig model)
  - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForMaskedLM) (LukeConfig model)
  - [LxmertConfig](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertConfig) configuration class: [LxmertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertForPreTraining) (LxmertConfig model)
  - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForMaskedLM) (MPNetConfig model)
  - [Mamba2Config](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2ForCausalLM) (Mamba2Config model)
  - [MambaConfig](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaForCausalLM) (MambaConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForPreTraining) (MegatronBertConfig model)
  - [Mistral3Config](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Config) configuration class: [Mistral3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3ForConditionalGeneration) (Mistral3Config model)
  - [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: [Mistral4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForCausalLM) (Mistral4Config model)
  - [MllamaConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaConfig) configuration class: [MllamaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForConditionalGeneration) (MllamaConfig model)
  - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForPreTraining) (MobileBertConfig model)
  - [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForCausalLM) (MptConfig model)
  - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForMaskedLM) (MraConfig model)
  - [MusicFlamingoConfig](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoConfig) configuration class: [MusicFlamingoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoForConditionalGeneration) (MusicFlamingoConfig model)
  - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForConditionalGeneration) (MvpConfig model)
  - [NanoChatConfig](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatConfig) configuration class: [NanoChatForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatForCausalLM) (NanoChatConfig model)
  - [NllbMoeConfig](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeConfig) configuration class: [NllbMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeForConditionalGeneration) (NllbMoeConfig model)
  - [OpenAIGPTConfig](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAIGPTConfig model)
  - [PaliGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: [PaliGemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemmaConfig model)
  - [Qwen2AudioConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioConfig) configuration class: [Qwen2AudioForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioForConditionalGeneration) (Qwen2AudioConfig model)
  - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForPreTraining) (RoCBertConfig model)
  - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForMaskedLM) (RobertaConfig model)
  - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForMaskedLM) (RobertaPreLayerNormConfig model)
  - [RwkvConfig](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvConfig) configuration class: [RwkvForCausalLM](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvForCausalLM) (RwkvConfig model)
  - [SplinterConfig](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterConfig) configuration class: [SplinterForPreTraining](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterForPreTraining) (SplinterConfig model)
  - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForMaskedLM) (SqueezeBertConfig model)
  - [SwitchTransformersConfig](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersConfig) configuration class: [SwitchTransformersForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersForConditionalGeneration) (SwitchTransformersConfig model)
  - [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) configuration class: [T5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForConditionalGeneration) (T5Config model)
  - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForConditionalGeneration) (T5Gemma2Config model)
  - [T5GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaConfig) configuration class: [T5GemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForConditionalGeneration) (T5GemmaConfig model)
  - [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) configuration class: [TapasForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForMaskedLM) (TapasConfig model)
  - [UniSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechConfig) configuration class: [UniSpeechForPreTraining](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechForPreTraining) (UniSpeechConfig model)
  - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: [UniSpeechSatForPreTraining](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForPreTraining) (UniSpeechSatConfig model)
  - [ViTMAEConfig](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEConfig) configuration class: [ViTMAEForPreTraining](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEForPreTraining) (ViTMAEConfig model)
  - [VibeVoiceAsrConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrConfig) configuration class: [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
  - [VideoLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaConfig) configuration class: [VideoLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaForConditionalGeneration) (VideoLlavaConfig model)
  - [VideoMAEConfig](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEConfig) configuration class: [VideoMAEForPreTraining](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEForPreTraining) (VideoMAEConfig model)
  - [VipLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaConfig) configuration class: [VipLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaForConditionalGeneration) (VipLlavaConfig model)
  - [VisualBertConfig](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertConfig) configuration class: [VisualBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertForPreTraining) (VisualBertConfig model)
  - [VoxtralConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralConfig) configuration class: [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
  - [VoxtralRealtimeConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeConfig) configuration class: [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
  - [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2ForPreTraining](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForPreTraining) (Wav2Vec2Config model)
  - [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerForPreTraining](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForPreTraining) (Wav2Vec2ConformerConfig model)
  - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMWithLMHeadModel) (XLMConfig model)
  - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForMaskedLM) (XLMRobertaConfig model)
  - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForMaskedLM) (XLMRobertaXLConfig model)
  - [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetLMHeadModel) (XLNetConfig model)
  - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForMaskedLM) (XmodConfig model)
  - [xLSTMConfig](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMConfig) configuration class: [xLSTMForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMForCausalLM) (xLSTMConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForPreTraining.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or, when that is missing,
by falling back to pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `AlbertForPreTraining` (AlbertConfig model)
- **audioflamingo3** -- [AudioFlamingo3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3ForConditionalGeneration) (AudioFlamingo3Config model)
- **bart** -- [BartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForConditionalGeneration) (BartConfig model)
- **bert** -- [BertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForPreTraining) (BertConfig model)
- **big_bird** -- [BigBirdForPreTraining](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForPreTraining) (BigBirdConfig model)
- **bloom** -- [BloomForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForCausalLM) (BloomConfig model)
- **camembert** -- [CamembertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForMaskedLM) (CamembertConfig model)
- **colmodernvbert** -- [ColModernVBertForRetrieval](/docs/transformers/v5.8.0/en/model_doc/colmodernvbert#transformers.ColModernVBertForRetrieval) (ColModernVBertConfig model)
- **colpali** -- [ColPaliForRetrieval](/docs/transformers/v5.8.0/en/model_doc/colpali#transformers.ColPaliForRetrieval) (ColPaliConfig model)
- **colqwen2** -- [ColQwen2ForRetrieval](/docs/transformers/v5.8.0/en/model_doc/colqwen2#transformers.ColQwen2ForRetrieval) (ColQwen2Config model)
- **ctrl** -- [CTRLLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLLMHeadModel) (CTRLConfig model)
- **data2vec-text** -- [Data2VecTextForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecTextConfig model)
- **deberta** -- [DebertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForMaskedLM) (DebertaConfig model)
- **deberta-v2** -- [DebertaV2ForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DebertaV2Config model)
- **distilbert** -- [DistilBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForMaskedLM) (DistilBertConfig model)
- **electra** -- [ElectraForPreTraining](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForPreTraining) (ElectraConfig model)
- **ernie** -- [ErnieForPreTraining](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForPreTraining) (ErnieConfig model)
- **evolla** -- [EvollaForProteinText2Text](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaForProteinText2Text) (EvollaConfig model)
- **exaone4** -- [Exaone4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForCausalLM) (Exaone4Config model)
- **exaone_moe** -- [ExaoneMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeForCausalLM) (ExaoneMoeConfig model)
- **falcon_mamba** -- [FalconMambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaForCausalLM) (FalconMambaConfig model)
- **flaubert** -- [FlaubertWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertWithLMHeadModel) (FlaubertConfig model)
- **flava** -- [FlavaForPreTraining](/docs/transformers/v5.8.0/en/model_doc/flava#transformers.FlavaForPreTraining) (FlavaConfig model)
- **florence2** -- [Florence2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2ForConditionalGeneration) (Florence2Config model)
- **fnet** -- [FNetForPreTraining](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForPreTraining) (FNetConfig model)
- **fsmt** -- [FSMTForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTForConditionalGeneration) (FSMTConfig model)
- **funnel** -- [FunnelForPreTraining](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForPreTraining) (FunnelConfig model)
- **gemma3** -- [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model)
- **gemma4** -- [Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model)
- **glmasr** -- [GlmAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrForConditionalGeneration) (GlmAsrConfig model)
- **gpt-sw3** -- [GPT2LMHeadModel](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2LMHeadModel) (GPT2Config model)
- **gpt2** -- [GPT2LMHeadModel](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2LMHeadModel) (GPT2Config model)
- **gpt_bigcode** -- [GPTBigCodeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForCausalLM) (GPTBigCodeConfig model)
- **hiera** -- [HieraForPreTraining](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraForPreTraining) (HieraConfig model)
- **ibert** -- [IBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForMaskedLM) (IBertConfig model)
- **idefics** -- [IdeficsForVisionText2Text](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsForVisionText2Text) (IdeficsConfig model)
- **idefics2** -- [Idefics2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2ForConditionalGeneration) (Idefics2Config model)
- **idefics3** -- [Idefics3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3ForConditionalGeneration) (Idefics3Config model)
- **janus** -- [JanusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusForConditionalGeneration) (JanusConfig model)
- **layoutlm** -- [LayoutLMForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForMaskedLM) (LayoutLMConfig model)
- **llava** -- [LlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaForConditionalGeneration) (LlavaConfig model)
- **llava_next** -- [LlavaNextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextForConditionalGeneration) (LlavaNextConfig model)
- **llava_next_video** -- [LlavaNextVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoForConditionalGeneration) (LlavaNextVideoConfig model)
- **llava_onevision** -- [LlavaOnevisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionForConditionalGeneration) (LlavaOnevisionConfig model)
- **longformer** -- [LongformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForMaskedLM) (LongformerConfig model)
- **luke** -- [LukeForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForMaskedLM) (LukeConfig model)
- **lxmert** -- [LxmertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertForPreTraining) (LxmertConfig model)
- **mamba** -- [MambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaForCausalLM) (MambaConfig model)
- **mamba2** -- [Mamba2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2ForCausalLM) (Mamba2Config model)
- **megatron-bert** -- [MegatronBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForPreTraining) (MegatronBertConfig model)
- **mistral3** -- [Mistral3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3ForConditionalGeneration) (Mistral3Config model)
- **mistral4** -- [Mistral4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForCausalLM) (Mistral4Config model)
- **mllama** -- [MllamaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForConditionalGeneration) (MllamaConfig model)
- **mobilebert** -- [MobileBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForPreTraining) (MobileBertConfig model)
- **mpnet** -- [MPNetForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForMaskedLM) (MPNetConfig model)
- **mpt** -- [MptForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForCausalLM) (MptConfig model)
- **mra** -- [MraForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForMaskedLM) (MraConfig model)
- **musicflamingo** -- [MusicFlamingoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoForConditionalGeneration) (MusicFlamingoConfig model)
- **mvp** -- [MvpForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForConditionalGeneration) (MvpConfig model)
- **nanochat** -- [NanoChatForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatForCausalLM) (NanoChatConfig model)
- **nllb-moe** -- [NllbMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeForConditionalGeneration) (NllbMoeConfig model)
- **openai-gpt** -- [OpenAIGPTLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAIGPTConfig model)
- **paligemma** -- [PaliGemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemmaConfig model)
- **qwen2_audio** -- [Qwen2AudioForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioForConditionalGeneration) (Qwen2AudioConfig model)
- **roberta** -- [RobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForMaskedLM) (RobertaConfig model)
- **roberta-prelayernorm** -- [RobertaPreLayerNormForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForMaskedLM) (RobertaPreLayerNormConfig model)
- **roc_bert** -- [RoCBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForPreTraining) (RoCBertConfig model)
- **rwkv** -- [RwkvForCausalLM](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvForCausalLM) (RwkvConfig model)
- **splinter** -- [SplinterForPreTraining](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterForPreTraining) (SplinterConfig model)
- **squeezebert** -- [SqueezeBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForMaskedLM) (SqueezeBertConfig model)
- **switch_transformers** -- [SwitchTransformersForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersForConditionalGeneration) (SwitchTransformersConfig model)
- **t5** -- [T5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForConditionalGeneration) (T5Config model)
- **t5gemma** -- [T5GemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForConditionalGeneration) (T5GemmaConfig model)
- **t5gemma2** -- [T5Gemma2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForConditionalGeneration) (T5Gemma2Config model)
- **tapas** -- [TapasForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForMaskedLM) (TapasConfig model)
- **unispeech** -- [UniSpeechForPreTraining](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechForPreTraining) (UniSpeechConfig model)
- **unispeech-sat** -- [UniSpeechSatForPreTraining](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForPreTraining) (UniSpeechSatConfig model)
- **vibevoice_asr** -- [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
- **video_llava** -- [VideoLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaForConditionalGeneration) (VideoLlavaConfig model)
- **videomae** -- [VideoMAEForPreTraining](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEForPreTraining) (VideoMAEConfig model)
- **vipllava** -- [VipLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaForConditionalGeneration) (VipLlavaConfig model)
- **visual_bert** -- [VisualBertForPreTraining](/docs/transformers/v5.8.0/en/model_doc/visual_bert#transformers.VisualBertForPreTraining) (VisualBertConfig model)
- **vit_mae** -- [ViTMAEForPreTraining](/docs/transformers/v5.8.0/en/model_doc/vit_mae#transformers.ViTMAEForPreTraining) (ViTMAEConfig model)
- **voxtral** -- [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
- **voxtral_realtime** -- [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
- **wav2vec2** -- [Wav2Vec2ForPreTraining](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForPreTraining) (Wav2Vec2Config model)
- **wav2vec2-conformer** -- [Wav2Vec2ConformerForPreTraining](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForPreTraining) (Wav2Vec2ConformerConfig model)
- **xlm** -- [XLMWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMWithLMHeadModel) (XLMConfig model)
- **xlm-roberta** -- [XLMRobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForMaskedLM) (XLMRobertaConfig model)
- **xlm-roberta-xl** -- [XLMRobertaXLForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForMaskedLM) (XLMRobertaXLConfig model)
- **xlnet** -- [XLNetLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetLMHeadModel) (XLNetConfig model)
- **xlstm** -- [xLSTMForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMForCausalLM) (xLSTMConfig model)
- **xmod** -- [XmodForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForMaskedLM) (XmodConfig model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
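This toggle is the standard PyTorch `training` flag inherited from `torch.nn.Module`. As a rough, framework-free sketch of the behavior (the class here is illustrative, not the transformers API):

```python
class TinyModel:
    """Minimal stand-in for the train/eval switch on a loaded model.

    Real transformers models inherit this from torch.nn.Module; this
    sketch only mirrors the flag so the pattern is visible.
    """

    def __init__(self):
        # from_pretrained() returns models with training disabled,
        # i.e. as if model.eval() had already been called.
        self.training = False

    def train(self, mode: bool = True):
        # Enables dropout / batch-norm updates when mode is True.
        self.training = mode
        return self

    def eval(self):
        # Equivalent to train(False): deterministic inference behavior.
        return self.train(False)


model = TinyModel()
assert model.training is False  # default after loading: eval mode
model.train()                   # switch to training before fine-tuning
assert model.training is True
model.eval()                    # back to inference mode
assert model.training is False
```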

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
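The two `kwargs` routing paths described above can be sketched in plain Python. The function name and arguments here are hypothetical, for illustration only; the real logic lives inside `from_pretrained()`:

```python
def split_kwargs(config_attrs, kwargs, config_provided):
    """Sketch of how from_pretrained routes **kwargs (illustrative only).

    - If a config object was passed in, every kwarg goes straight to the
      model's __init__ (the config is assumed to be fully set up).
    - Otherwise, kwargs matching configuration attributes override the
      loaded config; the remainder go to the model's __init__.
    """
    if config_provided:
        return {}, dict(kwargs)
    config_updates = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return config_updates, model_kwargs


# Example: output_attentions is a config attribute, custom_flag is not.
updates, model_kwargs = split_kwargs(
    {"output_attentions", "hidden_size"},
    {"output_attentions": True, "custom_flag": 1},
    config_provided=False,
)
assert updates == {"output_attentions": True}
assert model_kwargs == {"custom_flag": 1}
```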

## Natural Language Processing

The following auto classes are available for the following natural language processing tasks.

### AutoModelForCausalLM[[transformers.AutoModelForCausalLM]]

#### transformers.AutoModelForCausalLM[[transformers.AutoModelForCausalLM]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2011)

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
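Under the hood, each auto class keeps a mapping from configuration class to model class, and `from_config()` is essentially a dictionary lookup over that mapping. A stripped-down sketch, using hypothetical classes rather than the real registry:

```python
class FooConfig:
    """Hypothetical config class (stand-in for e.g. BertConfig)."""


class FooForCausalLM:
    """Hypothetical model class selected for FooConfig."""

    def __init__(self, config):
        self.config = config


class SketchAutoModelForCausalLM:
    """Toy version of the config-class -> model-class dispatch."""

    _registry = {FooConfig: FooForCausalLM}

    def __init__(self):
        # Mirrors the real auto classes: direct instantiation is an error.
        raise EnvironmentError(
            "Use from_config() or from_pretrained() instead of __init__()."
        )

    @classmethod
    def register(cls, config_class, model_class):
        # Counterpart of AutoModelForCausalLM.register(...)
        cls._registry[config_class] = model_class

    @classmethod
    def from_config(cls, config):
        # Pick the model class based on the config's class, then build it.
        model_class = cls._registry[type(config)]
        return model_class(config)


model = SketchAutoModelForCausalLM.from_config(FooConfig())
assert isinstance(model, FooForCausalLM)
```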

##### from_config[[transformers.AutoModelForCausalLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:

  - [AfmoeConfig](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeConfig) configuration class: [AfmoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeForCausalLM) (AfmoeConfig model)
  - [ApertusConfig](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusConfig) configuration class: [ApertusForCausalLM](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusForCausalLM) (ApertusConfig model)
  - [ArceeConfig](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeConfig) configuration class: [ArceeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForCausalLM) (ArceeConfig model)
  - [AriaTextConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextConfig) configuration class: [AriaTextForCausalLM](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextForCausalLM) (AriaTextConfig model)
  - [BambaConfig](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaConfig) configuration class: [BambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaForCausalLM) (BambaConfig model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForCausalLM) (BartConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertLMHeadModel) (BertConfig model)
  - [BertGenerationConfig](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationConfig) configuration class: [BertGenerationDecoder](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationDecoder) (BertGenerationConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForCausalLM](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForCausalLM) (BigBirdConfig model)
  - [BigBirdPegasusConfig](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForCausalLM) (BigBirdPegasusConfig model)
  - [BioGptConfig](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForCausalLM](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptForCausalLM) (BioGptConfig model)
  - [BitNetConfig](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetConfig) configuration class: [BitNetForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetForCausalLM) (BitNetConfig model)
  - [BlenderbotConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [BlenderbotForCausalLM](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotForCausalLM) (BlenderbotConfig model)
  - [BlenderbotSmallConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [BlenderbotSmallForCausalLM](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallForCausalLM) (BlenderbotSmallConfig model)
  - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForCausalLM) (BloomConfig model)
  - [BltConfig](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltConfig) configuration class: [BltForCausalLM](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltForCausalLM) (BltConfig model)
  - [CTRLConfig](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLLMHeadModel) (CTRLConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForCausalLM) (CamembertConfig model)
  - [CodeGenConfig](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenForCausalLM](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenForCausalLM) (CodeGenConfig model)
  - [Cohere2Config](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2Config) configuration class: [Cohere2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2ForCausalLM) (Cohere2Config model)
  - [CohereConfig](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereConfig) configuration class: [CohereForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereForCausalLM) (CohereConfig model)
  - [CpmAntConfig](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntConfig) configuration class: [CpmAntForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntForCausalLM) (CpmAntConfig model)
  - [CwmConfig](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmConfig) configuration class: [CwmForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmForCausalLM) (CwmConfig model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForCausalLM](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForCausalLM) (Data2VecTextConfig model)
  - [DbrxConfig](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxConfig) configuration class: [DbrxForCausalLM](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxForCausalLM) (DbrxConfig model)
  - [DeepseekV2Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2Config) configuration class: [DeepseekV2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2ForCausalLM) (DeepseekV2Config model)
  - [DeepseekV3Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3ForCausalLM) (DeepseekV3Config model)
  - [DeepseekV4Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4Config) configuration class: [DeepseekV4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4ForCausalLM) (DeepseekV4Config model)
  - [DiffLlamaConfig](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaConfig) configuration class: [DiffLlamaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForCausalLM) (DiffLlamaConfig model)
  - [DogeConfig](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeConfig) configuration class: [DogeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeForCausalLM) (DogeConfig model)
  - [Dots1Config](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1Config) configuration class: [Dots1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1ForCausalLM) (Dots1Config model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForCausalLM](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForCausalLM) (ElectraConfig model)
  - [Emu3Config](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Config) configuration class: [Emu3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3ForCausalLM) (Emu3Config model)
  - [Ernie4_5Config](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5Config) configuration class: [Ernie4_5ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5ForCausalLM) (Ernie4_5Config model)
  - [Ernie4_5_MoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeConfig) configuration class: [Ernie4_5_MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeForCausalLM) (Ernie4_5_MoeConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForCausalLM) (ErnieConfig model)
  - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForCausalLM) (Exaone4Config model)
  - [ExaoneMoeConfig](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeConfig) configuration class: [ExaoneMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeForCausalLM) (ExaoneMoeConfig model)
  - [FalconConfig](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconConfig) configuration class: [FalconForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForCausalLM) (FalconConfig model)
  - [FalconH1Config](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1Config) configuration class: [FalconH1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1ForCausalLM) (FalconH1Config model)
  - [FalconMambaConfig](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaConfig) configuration class: [FalconMambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaForCausalLM) (FalconMambaConfig model)
  - [FlexOlmoConfig](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoConfig) configuration class: [FlexOlmoForCausalLM](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoForCausalLM) (FlexOlmoConfig model)
  - [FuyuConfig](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuConfig) configuration class: [FuyuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuForCausalLM) (FuyuConfig model)
  - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2LMHeadModel](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2LMHeadModel) (GPT2Config model)
  - [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) configuration class: [GPTBigCodeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForCausalLM) (GPTBigCodeConfig model)
  - [GPTJConfig](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJConfig) configuration class: [GPTJForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJForCausalLM) (GPTJConfig model)
  - [GPTNeoConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoConfig) configuration class: [GPTNeoForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForCausalLM) (GPTNeoConfig model)
  - [GPTNeoXConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXConfig) configuration class: [GPTNeoXForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForCausalLM) (GPTNeoXConfig model)
  - [GPTNeoXJapaneseConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseConfig) configuration class: [GPTNeoXJapaneseForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseForCausalLM) (GPTNeoXJapaneseConfig model)
  - [Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2ForCausalLM) (Gemma2Config model)
  - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model)
  - [Gemma3TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: [Gemma3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForCausalLM) (Gemma3TextConfig model)
  - [Gemma3nConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nConfig) configuration class: [Gemma3nForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForConditionalGeneration) (Gemma3nConfig model)
  - [Gemma3nTextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nTextConfig) configuration class: [Gemma3nForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForCausalLM) (Gemma3nTextConfig model)
  - [Gemma4AssistantConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4_assistant#transformers.Gemma4AssistantConfig) configuration class: [Gemma4AssistantForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma4_assistant#transformers.Gemma4AssistantForCausalLM) (Gemma4AssistantConfig model)
  - [Gemma4Config](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Config) configuration class: [Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model)
  - [Gemma4TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4TextConfig) configuration class: [Gemma4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForCausalLM) (Gemma4TextConfig model)
  - [GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaForCausalLM) (GemmaConfig model)
  - [GitConfig](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitConfig) configuration class: [GitForCausalLM](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitForCausalLM) (GitConfig model)
  - [Glm4Config](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Config) configuration class: [Glm4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4ForCausalLM) (Glm4Config model)
  - [Glm4MoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeConfig) configuration class: [Glm4MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeForCausalLM) (Glm4MoeConfig model)
  - [Glm4MoeLiteConfig](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteConfig) configuration class: [Glm4MoeLiteForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteForCausalLM) (Glm4MoeLiteConfig model)
  - [GlmConfig](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmConfig) configuration class: [GlmForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmForCausalLM) (GlmConfig model)
  - [GlmMoeDsaConfig](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaConfig) configuration class: [GlmMoeDsaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaForCausalLM) (GlmMoeDsaConfig model)
  - [GotOcr2Config](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Config) configuration class: [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (GotOcr2Config model)
  - [GptOssConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssConfig) configuration class: [GptOssForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssForCausalLM) (GptOssConfig model)
  - [GraniteConfig](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteConfig) configuration class: [GraniteForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteForCausalLM) (GraniteConfig model)
  - [GraniteMoeConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeConfig) configuration class: [GraniteMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeForCausalLM) (GraniteMoeConfig model)
  - [GraniteMoeHybridConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridConfig) configuration class: [GraniteMoeHybridForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridForCausalLM) (GraniteMoeHybridConfig model)
  - [GraniteMoeSharedConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedConfig) configuration class: [GraniteMoeSharedForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedForCausalLM) (GraniteMoeSharedConfig model)
  - [HYV3Config](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3Config) configuration class: [HYV3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3ForCausalLM) (HYV3Config model)
  - [HeliumConfig](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumConfig) configuration class: [HeliumForCausalLM](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumForCausalLM) (HeliumConfig model)
  - [HunYuanDenseV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1Config) configuration class: [HunYuanDenseV1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1ForCausalLM) (HunYuanDenseV1Config model)
  - [HunYuanMoEV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1Config) configuration class: [HunYuanMoEV1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1ForCausalLM) (HunYuanMoEV1Config model)
  - [Jais2Config](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2Config) configuration class: [Jais2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2ForCausalLM) (Jais2Config model)
  - [JambaConfig](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaForCausalLM) (JambaConfig model)
  - [JetMoeConfig](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeConfig) configuration class: [JetMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeForCausalLM) (JetMoeConfig model)
  - [LagunaConfig](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaConfig) configuration class: [LagunaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaForCausalLM) (LagunaConfig model)
  - [Lfm2Config](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2Config) configuration class: [Lfm2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2ForCausalLM) (Lfm2Config model)
  - [Lfm2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeConfig) configuration class: [Lfm2MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeForCausalLM) (Lfm2MoeConfig model)
  - [Llama4Config](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4Config) configuration class: [Llama4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForCausalLM) (Llama4Config model)
  - [Llama4TextConfig](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4TextConfig) configuration class: [Llama4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForCausalLM) (Llama4TextConfig model)
  - [LlamaConfig](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaForCausalLM) (LlamaConfig model)
  - [LongcatFlashConfig](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashConfig) configuration class: [LongcatFlashForCausalLM](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashForCausalLM) (LongcatFlashConfig model)
  - [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) configuration class: [MBartForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForCausalLM) (MBartConfig model)
  - [Mamba2Config](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2ForCausalLM) (Mamba2Config model)
  - [MambaConfig](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaForCausalLM) (MambaConfig model)
  - [MarianConfig](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianConfig) configuration class: [MarianForCausalLM](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianForCausalLM) (MarianConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForCausalLM) (MegatronBertConfig model)
  - [MiniMaxConfig](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxConfig) configuration class: [MiniMaxForCausalLM](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForCausalLM) (MiniMaxConfig model)
  - [MiniMaxM2Config](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2Config) configuration class: [MiniMaxM2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2ForCausalLM) (MiniMaxM2Config model)
  - [Ministral3Config](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Config) configuration class: [Ministral3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForCausalLM) (Ministral3Config model)
  - [MinistralConfig](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralConfig) configuration class: [MinistralForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForCausalLM) (MinistralConfig model)
  - [MistralConfig](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForCausalLM) (MistralConfig model)
  - [MixtralConfig](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralConfig) configuration class: [MixtralForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForCausalLM) (MixtralConfig model)
  - [MllamaConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaConfig) configuration class: [MllamaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForCausalLM) (MllamaConfig model)
  - [ModernBertDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderConfig) configuration class: [ModernBertDecoderForCausalLM](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderForCausalLM) (ModernBertDecoderConfig model)
  - [MoshiConfig](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiConfig) configuration class: [MoshiForCausalLM](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiForCausalLM) (MoshiConfig model)
  - [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForCausalLM) (MptConfig model)
  - [MusicgenConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenConfig) configuration class: [MusicgenForCausalLM](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenForCausalLM) (MusicgenConfig model)
  - [MusicgenMelodyConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyConfig) configuration class: [MusicgenMelodyForCausalLM](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyForCausalLM) (MusicgenMelodyConfig model)
  - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForCausalLM) (MvpConfig model)
  - [NanoChatConfig](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatConfig) configuration class: [NanoChatForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatForCausalLM) (NanoChatConfig model)
  - [NemotronConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronConfig) configuration class: [NemotronForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForCausalLM) (NemotronConfig model)
  - [NemotronHConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHConfig) configuration class: [NemotronHForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHForCausalLM) (NemotronHConfig model)
  - [OPTConfig](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTConfig) configuration class: [OPTForCausalLM](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTForCausalLM) (OPTConfig model)
  - [Olmo2Config](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2Config) configuration class: [Olmo2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2ForCausalLM) (Olmo2Config model)
  - [Olmo3Config](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3Config) configuration class: [Olmo3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3ForCausalLM) (Olmo3Config model)
  - [OlmoConfig](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoConfig) configuration class: [OlmoForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoForCausalLM) (OlmoConfig model)
  - [OlmoHybridConfig](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridConfig) configuration class: [OlmoHybridForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridForCausalLM) (OlmoHybridConfig model)
  - [OlmoeConfig](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeConfig) configuration class: [OlmoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeForCausalLM) (OlmoeConfig model)
  - [OpenAIGPTConfig](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAIGPTConfig model)
  - [PLBartConfig](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartConfig) configuration class: [PLBartForCausalLM](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartForCausalLM) (PLBartConfig model)
  - [PegasusConfig](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusConfig) configuration class: [PegasusForCausalLM](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusForCausalLM) (PegasusConfig model)
  - [PersimmonConfig](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonConfig) configuration class: [PersimmonForCausalLM](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonForCausalLM) (PersimmonConfig model)
  - [Phi3Config](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Config) configuration class: [Phi3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3ForCausalLM) (Phi3Config model)
  - [Phi4MultimodalConfig](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalConfig) configuration class: [Phi4MultimodalForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalForCausalLM) (Phi4MultimodalConfig model)
  - [PhiConfig](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiConfig) configuration class: [PhiForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiForCausalLM) (PhiConfig model)
  - [PhimoeConfig](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeConfig) configuration class: [PhimoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeForCausalLM) (PhimoeConfig model)
  - [ProphetNetConfig](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetConfig) configuration class: [ProphetNetForCausalLM](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetForCausalLM) (ProphetNetConfig model)
  - [Qwen2Config](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Config) configuration class: [Qwen2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForCausalLM) (Qwen2Config model)
  - [Qwen2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeConfig) configuration class: [Qwen2MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForCausalLM) (Qwen2MoeConfig model)
  - [Qwen3Config](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Config) configuration class: [Qwen3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForCausalLM) (Qwen3Config model)
  - [Qwen3MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeConfig) configuration class: [Qwen3MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForCausalLM) (Qwen3MoeConfig model)
  - [Qwen3NextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextConfig) configuration class: [Qwen3NextForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForCausalLM) (Qwen3NextConfig model)
  - [Qwen3_5Config](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Config) configuration class: [Qwen3_5ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForCausalLM) (Qwen3_5Config model)
  - [Qwen3_5MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeConfig) configuration class: [Qwen3_5MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForCausalLM) (Qwen3_5MoeConfig model)
  - [Qwen3_5MoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeTextConfig) configuration class: [Qwen3_5MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForCausalLM) (Qwen3_5MoeTextConfig model)
  - [Qwen3_5TextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5TextConfig) configuration class: [Qwen3_5ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForCausalLM) (Qwen3_5TextConfig model)
  - [RecurrentGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaConfig) configuration class: [RecurrentGemmaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaForCausalLM) (RecurrentGemmaConfig model)
  - [ReformerConfig](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerConfig) configuration class: [ReformerModelWithLMHead](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerModelWithLMHead) (ReformerConfig model)
  - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForCausalLM) (RemBertConfig model)
  - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForCausalLM) (RoCBertConfig model)
  - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForCausalLM) (RoFormerConfig model)
  - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForCausalLM) (RobertaConfig model)
  - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForCausalLM) (RobertaPreLayerNormConfig model)
  - [RwkvConfig](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvConfig) configuration class: [RwkvForCausalLM](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvForCausalLM) (RwkvConfig model)
  - [SeedOssConfig](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssConfig) configuration class: [SeedOssForCausalLM](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForCausalLM) (SeedOssConfig model)
  - [SmolLM3Config](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Config) configuration class: [SmolLM3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForCausalLM) (SmolLM3Config model)
  - [SolarOpenConfig](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenConfig) configuration class: [SolarOpenForCausalLM](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenForCausalLM) (SolarOpenConfig model)
  - [StableLmConfig](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmConfig) configuration class: [StableLmForCausalLM](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmForCausalLM) (StableLmConfig model)
  - [Starcoder2Config](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Config) configuration class: [Starcoder2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2ForCausalLM) (Starcoder2Config model)
  - [TrOCRConfig](/docs/transformers/v5.8.0/en/model_doc/trocr#transformers.TrOCRConfig) configuration class: [TrOCRForCausalLM](/docs/transformers/v5.8.0/en/model_doc/trocr#transformers.TrOCRForCausalLM) (TrOCRConfig model)
  - [VaultGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaConfig) configuration class: [VaultGemmaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaForCausalLM) (VaultGemmaConfig model)
  - [WhisperConfig](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperForCausalLM](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperForCausalLM) (WhisperConfig model)
  - [XGLMConfig](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMConfig) configuration class: [XGLMForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMForCausalLM) (XGLMConfig model)
  - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMWithLMHeadModel) (XLMConfig model)
  - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForCausalLM) (XLMRobertaConfig model)
  - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForCausalLM) (XLMRobertaXLConfig model)
  - [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetLMHeadModel) (XLNetConfig model)
  - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForCausalLM) (XmodConfig model)
  - [YoutuConfig](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuConfig) configuration class: [YoutuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuForCausalLM) (YoutuConfig model)
  - [Zamba2Config](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2Config) configuration class: [Zamba2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2ForCausalLM) (Zamba2Config model)
  - [ZambaConfig](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaConfig) configuration class: [ZambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaForCausalLM) (ZambaConfig model)
  - [xLSTMConfig](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMConfig) configuration class: [xLSTMForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMForCausalLM) (xLSTMConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, SDPA is used if available for torch>=2.1.1; otherwise, the manual `"eager"` implementation is used.
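
The default selection described above can be sketched as a small helper. This is an illustration of the documented fallback rule only; `pick_attn_implementation` is a hypothetical function, not part of the transformers API.

```python
def pick_attn_implementation(requested=None, torch_version=(2, 1, 1), sdpa_available=True):
    """Sketch of the documented fallback: an explicit request wins; otherwise
    prefer "sdpa" when available on torch >= 2.1.1, else fall back to "eager"."""
    valid = {"eager", "sdpa", "flash_attention_2", "flash_attention_3"}
    if requested is not None:
        if requested not in valid:
            raise ValueError(f"Unknown attention implementation: {requested!r}")
        return requested
    # Default path: SDPA for torch >= 2.1.1 when available, else manual eager.
    if sdpa_available and torch_version >= (2, 1, 1):
        return "sdpa"
    return "eager"

print(pick_attn_implementation())                         # sdpa
print(pick_attn_implementation(torch_version=(2, 0, 1)))  # eager
print(pick_attn_implementation("flash_attention_2"))      # flash_attention_2
```

In the real API, the string is simply passed as `attn_implementation=...` to `from_pretrained()` or `from_config()`.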

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForCausalLM.from_config(config)
```
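
Conceptually, `from_config` looks up the model class registered for the exact type of the configuration and instantiates it, as in the mapping listed above. The following is a minimal sketch of that dispatch under toy names (`ToyBertConfig`, `ToyBertLMHeadModel`, `CAUSAL_LM_MAPPING` are illustrative stand-ins, not the real transformers internals):

```python
class ToyBertConfig:
    """Stand-in for a configuration class such as BertConfig."""
    model_type = "bert"

class ToyBertLMHeadModel:
    """Stand-in for the causal-LM model class registered for that config."""
    def __init__(self, config):
        self.config = config

# Analogue of the configuration-class -> model-class table listed above.
CAUSAL_LM_MAPPING = {ToyBertConfig: ToyBertLMHeadModel}

def from_config(config):
    """Select the model class registered for type(config) and instantiate it."""
    try:
        model_cls = CAUSAL_LM_MAPPING[type(config)]
    except KeyError:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    return model_cls(config)

model = from_config(ToyBertConfig())
print(type(model).__name__)  # ToyBertLMHeadModel
```

Extending the auto classes with `AutoConfig.register` and `AutoModel.register` amounts to adding an entry to a mapping like this one.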

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:

  - [AfmoeConfig](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeConfig) configuration class: [AfmoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeForCausalLM) (AfmoeConfig model)
  - [ApertusConfig](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusConfig) configuration class: [ApertusForCausalLM](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusForCausalLM) (ApertusConfig model)
  - [ArceeConfig](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeConfig) configuration class: [ArceeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForCausalLM) (ArceeConfig model)
  - [AriaTextConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextConfig) configuration class: [AriaTextForCausalLM](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextForCausalLM) (AriaTextConfig model)
  - [BambaConfig](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaConfig) configuration class: [BambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaForCausalLM) (BambaConfig model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForCausalLM) (BartConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertLMHeadModel) (BertConfig model)
  - [BertGenerationConfig](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationConfig) configuration class: [BertGenerationDecoder](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationDecoder) (BertGenerationConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForCausalLM](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForCausalLM) (BigBirdConfig model)
  - [BigBirdPegasusConfig](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForCausalLM) (BigBirdPegasusConfig model)
  - [BioGptConfig](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForCausalLM](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptForCausalLM) (BioGptConfig model)
  - [BitNetConfig](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetConfig) configuration class: [BitNetForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetForCausalLM) (BitNetConfig model)
  - [BlenderbotConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [BlenderbotForCausalLM](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotForCausalLM) (BlenderbotConfig model)
  - [BlenderbotSmallConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [BlenderbotSmallForCausalLM](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallForCausalLM) (BlenderbotSmallConfig model)
  - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForCausalLM) (BloomConfig model)
  - [BltConfig](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltConfig) configuration class: [BltForCausalLM](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltForCausalLM) (BltConfig model)
  - [CTRLConfig](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLLMHeadModel) (CTRLConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForCausalLM) (CamembertConfig model)
  - [CodeGenConfig](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenConfig) configuration class: [CodeGenForCausalLM](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenForCausalLM) (CodeGenConfig model)
  - [Cohere2Config](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2Config) configuration class: [Cohere2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2ForCausalLM) (Cohere2Config model)
  - [CohereConfig](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereConfig) configuration class: [CohereForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereForCausalLM) (CohereConfig model)
  - [CpmAntConfig](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntConfig) configuration class: [CpmAntForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntForCausalLM) (CpmAntConfig model)
  - [CwmConfig](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmConfig) configuration class: [CwmForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmForCausalLM) (CwmConfig model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForCausalLM](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForCausalLM) (Data2VecTextConfig model)
  - [DbrxConfig](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxConfig) configuration class: [DbrxForCausalLM](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxForCausalLM) (DbrxConfig model)
  - [DeepseekV2Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2Config) configuration class: [DeepseekV2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2ForCausalLM) (DeepseekV2Config model)
  - [DeepseekV3Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3ForCausalLM) (DeepseekV3Config model)
  - [DeepseekV4Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4Config) configuration class: [DeepseekV4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4ForCausalLM) (DeepseekV4Config model)
  - [DiffLlamaConfig](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaConfig) configuration class: [DiffLlamaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForCausalLM) (DiffLlamaConfig model)
  - [DogeConfig](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeConfig) configuration class: [DogeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeForCausalLM) (DogeConfig model)
  - [Dots1Config](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1Config) configuration class: [Dots1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1ForCausalLM) (Dots1Config model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForCausalLM](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForCausalLM) (ElectraConfig model)
  - [Emu3Config](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Config) configuration class: [Emu3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3ForCausalLM) (Emu3Config model)
  - [Ernie4_5Config](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5Config) configuration class: [Ernie4_5ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5ForCausalLM) (Ernie4_5Config model)
  - [Ernie4_5_MoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeConfig) configuration class: [Ernie4_5_MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeForCausalLM) (Ernie4_5_MoeConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForCausalLM) (ErnieConfig model)
  - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForCausalLM) (Exaone4Config model)
  - [ExaoneMoeConfig](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeConfig) configuration class: [ExaoneMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeForCausalLM) (ExaoneMoeConfig model)
  - [FalconConfig](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconConfig) configuration class: [FalconForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForCausalLM) (FalconConfig model)
  - [FalconH1Config](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1Config) configuration class: [FalconH1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1ForCausalLM) (FalconH1Config model)
  - [FalconMambaConfig](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaConfig) configuration class: [FalconMambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaForCausalLM) (FalconMambaConfig model)
  - [FlexOlmoConfig](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoConfig) configuration class: [FlexOlmoForCausalLM](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoForCausalLM) (FlexOlmoConfig model)
  - [FuyuConfig](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuConfig) configuration class: [FuyuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuForCausalLM) (FuyuConfig model)
  - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2LMHeadModel](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2LMHeadModel) (GPT2Config model)
  - [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) configuration class: [GPTBigCodeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForCausalLM) (GPTBigCodeConfig model)
  - [GPTJConfig](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJConfig) configuration class: [GPTJForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJForCausalLM) (GPTJConfig model)
  - [GPTNeoConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoConfig) configuration class: [GPTNeoForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForCausalLM) (GPTNeoConfig model)
  - [GPTNeoXConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXConfig) configuration class: [GPTNeoXForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForCausalLM) (GPTNeoXConfig model)
  - [GPTNeoXJapaneseConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseConfig) configuration class: [GPTNeoXJapaneseForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseForCausalLM) (GPTNeoXJapaneseConfig model)
  - [Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2ForCausalLM) (Gemma2Config model)
  - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model)
  - [Gemma3TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: [Gemma3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForCausalLM) (Gemma3TextConfig model)
  - [Gemma3nConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nConfig) configuration class: [Gemma3nForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForConditionalGeneration) (Gemma3nConfig model)
  - [Gemma3nTextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nTextConfig) configuration class: [Gemma3nForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForCausalLM) (Gemma3nTextConfig model)
  - [Gemma4AssistantConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4_assistant#transformers.Gemma4AssistantConfig) configuration class: [Gemma4AssistantForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma4_assistant#transformers.Gemma4AssistantForCausalLM) (Gemma4AssistantConfig model)
  - [Gemma4Config](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Config) configuration class: 
[Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model) - [Gemma4TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4TextConfig) configuration class: [Gemma4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForCausalLM) (Gemma4TextConfig model) - [GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaForCausalLM) (GemmaConfig model) - [GitConfig](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitConfig) configuration class: [GitForCausalLM](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitForCausalLM) (GitConfig model) - [Glm4Config](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Config) configuration class: [Glm4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4ForCausalLM) (Glm4Config model) - [Glm4MoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeConfig) configuration class: [Glm4MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeForCausalLM) (Glm4MoeConfig model) - [Glm4MoeLiteConfig](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteConfig) configuration class: [Glm4MoeLiteForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteForCausalLM) (Glm4MoeLiteConfig model) - [GlmConfig](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmConfig) configuration class: [GlmForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmForCausalLM) (GlmConfig model) - [GlmMoeDsaConfig](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaConfig) configuration class: [GlmMoeDsaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaForCausalLM) (GlmMoeDsaConfig 
model) - [GotOcr2Config](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Config) configuration class: [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (GotOcr2Config model) - [GptOssConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssConfig) configuration class: [GptOssForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssForCausalLM) (GptOssConfig model) - [GraniteConfig](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteConfig) configuration class: [GraniteForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteForCausalLM) (GraniteConfig model) - [GraniteMoeConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeConfig) configuration class: [GraniteMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeForCausalLM) (GraniteMoeConfig model) - [GraniteMoeHybridConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridConfig) configuration class: [GraniteMoeHybridForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridForCausalLM) (GraniteMoeHybridConfig model) - [GraniteMoeSharedConfig](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedConfig) configuration class: [GraniteMoeSharedForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedForCausalLM) (GraniteMoeSharedConfig model) - [HYV3Config](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3Config) configuration class: [HYV3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3ForCausalLM) (HYV3Config model) - [HeliumConfig](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumConfig) configuration class: 
[HeliumForCausalLM](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumForCausalLM) (HeliumConfig model) - [HunYuanDenseV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1Config) configuration class: [HunYuanDenseV1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1ForCausalLM) (HunYuanDenseV1Config model) - [HunYuanMoEV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1Config) configuration class: [HunYuanMoEV1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1ForCausalLM) (HunYuanMoEV1Config model) - [Jais2Config](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2Config) configuration class: [Jais2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2ForCausalLM) (Jais2Config model) - [JambaConfig](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaForCausalLM) (JambaConfig model) - [JetMoeConfig](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeConfig) configuration class: [JetMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeForCausalLM) (JetMoeConfig model) - [LagunaConfig](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaConfig) configuration class: [LagunaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaForCausalLM) (LagunaConfig model) - [Lfm2Config](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2Config) configuration class: [Lfm2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2ForCausalLM) (Lfm2Config model) - [Lfm2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeConfig) configuration class: 
[Lfm2MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeForCausalLM) (Lfm2MoeConfig model) - [Llama4Config](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4Config) configuration class: [Llama4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForCausalLM) (Llama4Config model) - [Llama4TextConfig](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4TextConfig) configuration class: [Llama4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForCausalLM) (Llama4TextConfig model) - [LlamaConfig](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaForCausalLM) (LlamaConfig model) - [LongcatFlashConfig](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashConfig) configuration class: [LongcatFlashForCausalLM](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashForCausalLM) (LongcatFlashConfig model) - [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) configuration class: [MBartForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForCausalLM) (MBartConfig model) - [Mamba2Config](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2Config) configuration class: [Mamba2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2ForCausalLM) (Mamba2Config model) - [MambaConfig](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaConfig) configuration class: [MambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaForCausalLM) (MambaConfig model) - [MarianConfig](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianConfig) configuration class: [MarianForCausalLM](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianForCausalLM) (MarianConfig 
model) - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForCausalLM) (MegatronBertConfig model) - [MiniMaxConfig](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxConfig) configuration class: [MiniMaxForCausalLM](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForCausalLM) (MiniMaxConfig model) - [MiniMaxM2Config](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2Config) configuration class: [MiniMaxM2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2ForCausalLM) (MiniMaxM2Config model) - [Ministral3Config](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Config) configuration class: [Ministral3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForCausalLM) (Ministral3Config model) - [MinistralConfig](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralConfig) configuration class: [MinistralForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForCausalLM) (MinistralConfig model) - [MistralConfig](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForCausalLM) (MistralConfig model) - [MixtralConfig](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralConfig) configuration class: [MixtralForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForCausalLM) (MixtralConfig model) - [MllamaConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaConfig) configuration class: [MllamaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForCausalLM) (MllamaConfig model) - 
[ModernBertDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderConfig) configuration class: [ModernBertDecoderForCausalLM](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderForCausalLM) (ModernBertDecoderConfig model) - [MoshiConfig](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiConfig) configuration class: [MoshiForCausalLM](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiForCausalLM) (MoshiConfig model) - [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForCausalLM) (MptConfig model) - [MusicgenConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenConfig) configuration class: [MusicgenForCausalLM](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenForCausalLM) (MusicgenConfig model) - [MusicgenMelodyConfig](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyConfig) configuration class: [MusicgenMelodyForCausalLM](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyForCausalLM) (MusicgenMelodyConfig model) - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForCausalLM) (MvpConfig model) - [NanoChatConfig](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatConfig) configuration class: [NanoChatForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatForCausalLM) (NanoChatConfig model) - [NemotronConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronConfig) configuration class: [NemotronForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForCausalLM) (NemotronConfig model) - 
[NemotronHConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHConfig) configuration class: [NemotronHForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHForCausalLM) (NemotronHConfig model) - [OPTConfig](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTConfig) configuration class: [OPTForCausalLM](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTForCausalLM) (OPTConfig model) - [Olmo2Config](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2Config) configuration class: [Olmo2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2ForCausalLM) (Olmo2Config model) - [Olmo3Config](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3Config) configuration class: [Olmo3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3ForCausalLM) (Olmo3Config model) - [OlmoConfig](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoConfig) configuration class: [OlmoForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoForCausalLM) (OlmoConfig model) - [OlmoHybridConfig](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridConfig) configuration class: [OlmoHybridForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridForCausalLM) (OlmoHybridConfig model) - [OlmoeConfig](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeConfig) configuration class: [OlmoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeForCausalLM) (OlmoeConfig model) - [OpenAIGPTConfig](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAIGPTConfig model) - [PLBartConfig](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartConfig) configuration class: 
[PLBartForCausalLM](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartForCausalLM) (PLBartConfig model) - [PegasusConfig](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusConfig) configuration class: [PegasusForCausalLM](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusForCausalLM) (PegasusConfig model) - [PersimmonConfig](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonConfig) configuration class: [PersimmonForCausalLM](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonForCausalLM) (PersimmonConfig model) - [Phi3Config](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Config) configuration class: [Phi3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3ForCausalLM) (Phi3Config model) - [Phi4MultimodalConfig](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalConfig) configuration class: [Phi4MultimodalForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalForCausalLM) (Phi4MultimodalConfig model) - [PhiConfig](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiConfig) configuration class: [PhiForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiForCausalLM) (PhiConfig model) - [PhimoeConfig](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeConfig) configuration class: [PhimoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeForCausalLM) (PhimoeConfig model) - [ProphetNetConfig](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetConfig) configuration class: [ProphetNetForCausalLM](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetForCausalLM) (ProphetNetConfig model) - [Qwen2Config](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Config) configuration class: 
[Qwen2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForCausalLM) (Qwen2Config model) - [Qwen2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeConfig) configuration class: [Qwen2MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForCausalLM) (Qwen2MoeConfig model) - [Qwen3Config](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Config) configuration class: [Qwen3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForCausalLM) (Qwen3Config model) - [Qwen3MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeConfig) configuration class: [Qwen3MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForCausalLM) (Qwen3MoeConfig model) - [Qwen3NextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextConfig) configuration class: [Qwen3NextForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForCausalLM) (Qwen3NextConfig model) - [Qwen3_5Config](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Config) configuration class: [Qwen3_5ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForCausalLM) (Qwen3_5Config model) - [Qwen3_5MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeConfig) configuration class: [Qwen3_5MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForCausalLM) (Qwen3_5MoeConfig model) - [Qwen3_5MoeTextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeTextConfig) configuration class: [Qwen3_5MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForCausalLM) (Qwen3_5MoeTextConfig model) - [Qwen3_5TextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5TextConfig) configuration class: 
[Qwen3_5ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForCausalLM) (Qwen3_5TextConfig model) - [RecurrentGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaConfig) configuration class: [RecurrentGemmaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaForCausalLM) (RecurrentGemmaConfig model) - [ReformerConfig](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerConfig) configuration class: [ReformerModelWithLMHead](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerModelWithLMHead) (ReformerConfig model) - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForCausalLM) (RemBertConfig model) - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForCausalLM) (RoCBertConfig model) - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForCausalLM) (RoFormerConfig model) - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForCausalLM) (RobertaConfig model) - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForCausalLM) (RobertaPreLayerNormConfig model) - 
[RwkvConfig](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvConfig) configuration class: [RwkvForCausalLM](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvForCausalLM) (RwkvConfig model) - [SeedOssConfig](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssConfig) configuration class: [SeedOssForCausalLM](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForCausalLM) (SeedOssConfig model) - [SmolLM3Config](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Config) configuration class: [SmolLM3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForCausalLM) (SmolLM3Config model) - [SolarOpenConfig](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenConfig) configuration class: [SolarOpenForCausalLM](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenForCausalLM) (SolarOpenConfig model) - [StableLmConfig](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmConfig) configuration class: [StableLmForCausalLM](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmForCausalLM) (StableLmConfig model) - [Starcoder2Config](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Config) configuration class: [Starcoder2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2ForCausalLM) (Starcoder2Config model) - [TrOCRConfig](/docs/transformers/v5.8.0/en/model_doc/trocr#transformers.TrOCRConfig) configuration class: [TrOCRForCausalLM](/docs/transformers/v5.8.0/en/model_doc/trocr#transformers.TrOCRForCausalLM) (TrOCRConfig model) - [VaultGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaConfig) configuration class: [VaultGemmaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaForCausalLM) (VaultGemmaConfig model) - 
[WhisperConfig](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperForCausalLM](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperForCausalLM) (WhisperConfig model) - [XGLMConfig](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMConfig) configuration class: [XGLMForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMForCausalLM) (XGLMConfig model) - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMWithLMHeadModel) (XLMConfig model) - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForCausalLM) (XLMRobertaConfig model) - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForCausalLM) (XLMRobertaXLConfig model) - [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetLMHeadModel) (XLNetConfig model) - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForCausalLM) (XmodConfig model) - [YoutuConfig](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuConfig) configuration class: [YoutuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuForCausalLM) (YoutuConfig model) - [Zamba2Config](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2Config) configuration class: 
[Zamba2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2ForCausalLM) (Zamba2Config model) - [ZambaConfig](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaConfig) configuration class: [ZambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaForCausalLM) (ZambaConfig model) - [xLSTMConfig](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMConfig) configuration class: [xLSTMForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMForCausalLM) (xLSTMConfig model)
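The mapping above is what drives dispatch from a config instance: handing one to the auto class returns a freshly initialized (randomly weighted) model of the matching architecture, with no checkpoint download. A minimal sketch using GPT-2, whose mapping (GPT2Config to GPT2LMHeadModel) appears in the list above:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Build a default GPT-2 config; no weights are downloaded.
config = AutoConfig.for_model("gpt2")

# Dispatch is driven by the config class: GPT2Config -> GPT2LMHeadModel.
# The returned model has randomly initialized weights.
model = AutoModelForCausalLM.from_config(config)
print(type(model).__name__)  # GPT2LMHeadModel
```

Because no weights are loaded, this is also a cheap way to check which architecture a given config resolves to.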

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, SDPA is used when it is available and torch>=2.1.1; otherwise the manual `"eager"` implementation is used.
#### from_pretrained[[transformers.AutoModelForCausalLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or, when that is missing, by
falling back to pattern matching on `pretrained_model_name_or_path`:

- **afmoe** -- [AfmoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/afmoe#transformers.AfmoeForCausalLM) (AfmoeConfig model)
- **apertus** -- [ApertusForCausalLM](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusForCausalLM) (ApertusConfig model)
- **arcee** -- [ArceeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForCausalLM) (ArceeConfig model)
- **aria_text** -- [AriaTextForCausalLM](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaTextForCausalLM) (AriaTextConfig model)
- **bamba** -- [BambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bamba#transformers.BambaForCausalLM) (BambaConfig model)
- **bart** -- [BartForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForCausalLM) (BartConfig model)
- **bert** -- [BertLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertLMHeadModel) (BertConfig model)
- **bert-generation** -- [BertGenerationDecoder](/docs/transformers/v5.8.0/en/model_doc/bert-generation#transformers.BertGenerationDecoder) (BertGenerationConfig model)
- **big_bird** -- [BigBirdForCausalLM](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForCausalLM) (BigBirdConfig model)
- **bigbird_pegasus** -- [BigBirdPegasusForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForCausalLM) (BigBirdPegasusConfig model)
- **biogpt** -- [BioGptForCausalLM](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptForCausalLM) (BioGptConfig model)
- **bitnet** -- [BitNetForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bitnet#transformers.BitNetForCausalLM) (BitNetConfig model)
- **blenderbot** -- [BlenderbotForCausalLM](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotForCausalLM) (BlenderbotConfig model)
- **blenderbot-small** -- [BlenderbotSmallForCausalLM](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallForCausalLM) (BlenderbotSmallConfig model)
- **bloom** -- [BloomForCausalLM](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForCausalLM) (BloomConfig model)
- **blt** -- [BltForCausalLM](/docs/transformers/v5.8.0/en/model_doc/blt#transformers.BltForCausalLM) (BltConfig model)
- **camembert** -- [CamembertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForCausalLM) (CamembertConfig model)
- **codegen** -- [CodeGenForCausalLM](/docs/transformers/v5.8.0/en/model_doc/codegen#transformers.CodeGenForCausalLM) (CodeGenConfig model)
- **cohere** -- [CohereForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cohere#transformers.CohereForCausalLM) (CohereConfig model)
- **cohere2** -- [Cohere2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cohere2#transformers.Cohere2ForCausalLM) (Cohere2Config model)
- **cpmant** -- [CpmAntForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cpmant#transformers.CpmAntForCausalLM) (CpmAntConfig model)
- **ctrl** -- [CTRLLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLLMHeadModel) (CTRLConfig model)
- **cwm** -- [CwmForCausalLM](/docs/transformers/v5.8.0/en/model_doc/cwm#transformers.CwmForCausalLM) (CwmConfig model)
- **data2vec-text** -- [Data2VecTextForCausalLM](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForCausalLM) (Data2VecTextConfig model)
- **dbrx** -- [DbrxForCausalLM](/docs/transformers/v5.8.0/en/model_doc/dbrx#transformers.DbrxForCausalLM) (DbrxConfig model)
- **deepseek_v2** -- [DeepseekV2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2ForCausalLM) (DeepseekV2Config model)
- **deepseek_v3** -- [DeepseekV3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3ForCausalLM) (DeepseekV3Config model)
- **deepseek_v4** -- [DeepseekV4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/deepseek_v4#transformers.DeepseekV4ForCausalLM) (DeepseekV4Config model)
- **diffllama** -- [DiffLlamaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForCausalLM) (DiffLlamaConfig model)
- **doge** -- [DogeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeForCausalLM) (DogeConfig model)
- **dots1** -- [Dots1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/dots1#transformers.Dots1ForCausalLM) (Dots1Config model)
- **electra** -- [ElectraForCausalLM](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForCausalLM) (ElectraConfig model)
- **emu3** -- [Emu3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3ForCausalLM) (Emu3Config model)
- **ernie** -- [ErnieForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForCausalLM) (ErnieConfig model)
- **ernie4_5** -- [Ernie4_5ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ernie4_5#transformers.Ernie4_5ForCausalLM) (Ernie4_5Config model)
- **ernie4_5_moe** -- [Ernie4_5_MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_moe#transformers.Ernie4_5_MoeForCausalLM) (Ernie4_5_MoeConfig model)
- **exaone4** -- [Exaone4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForCausalLM) (Exaone4Config model)
- **exaone_moe** -- [ExaoneMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/exaone_moe#transformers.ExaoneMoeForCausalLM) (ExaoneMoeConfig model)
- **falcon** -- [FalconForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForCausalLM) (FalconConfig model)
- **falcon_h1** -- [FalconH1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon_h1#transformers.FalconH1ForCausalLM) (FalconH1Config model)
- **falcon_mamba** -- [FalconMambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/falcon_mamba#transformers.FalconMambaForCausalLM) (FalconMambaConfig model)
- **flex_olmo** -- [FlexOlmoForCausalLM](/docs/transformers/v5.8.0/en/model_doc/flex_olmo#transformers.FlexOlmoForCausalLM) (FlexOlmoConfig model)
- **fuyu** -- [FuyuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuForCausalLM) (FuyuConfig model)
- **gemma** -- [GemmaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaForCausalLM) (GemmaConfig model)
- **gemma2** -- [Gemma2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2ForCausalLM) (Gemma2Config model)
- **gemma3** -- [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model)
- **gemma3_text** -- [Gemma3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForCausalLM) (Gemma3TextConfig model)
- **gemma3n** -- [Gemma3nForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForConditionalGeneration) (Gemma3nConfig model)
- **gemma3n_text** -- [Gemma3nForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForCausalLM) (Gemma3nTextConfig model)
- **gemma4** -- [Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model)
- **gemma4_assistant** -- [Gemma4AssistantForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma4_assistant#transformers.Gemma4AssistantForCausalLM) (Gemma4AssistantConfig model)
- **gemma4_text** -- [Gemma4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForCausalLM) (Gemma4TextConfig model)
- **git** -- [GitForCausalLM](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitForCausalLM) (GitConfig model)
- **glm** -- [GlmForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmForCausalLM) (GlmConfig model)
- **glm4** -- [Glm4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4ForCausalLM) (Glm4Config model)
- **glm4_moe** -- [Glm4MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm4_moe#transformers.Glm4MoeForCausalLM) (Glm4MoeConfig model)
- **glm4_moe_lite** -- [Glm4MoeLiteForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm4_moe_lite#transformers.Glm4MoeLiteForCausalLM) (Glm4MoeLiteConfig model)
- **glm_moe_dsa** -- [GlmMoeDsaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/glm_moe_dsa#transformers.GlmMoeDsaForCausalLM) (GlmMoeDsaConfig model)
- **got_ocr2** -- [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (GotOcr2Config model)
- **gpt-sw3** -- [GPT2LMHeadModel](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2LMHeadModel) (GPT2Config model)
- **gpt2** -- [GPT2LMHeadModel](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2LMHeadModel) (GPT2Config model)
- **gpt_bigcode** -- [GPTBigCodeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForCausalLM) (GPTBigCodeConfig model)
- **gpt_neo** -- [GPTNeoForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForCausalLM) (GPTNeoConfig model)
- **gpt_neox** -- [GPTNeoXForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForCausalLM) (GPTNeoXConfig model)
- **gpt_neox_japanese** -- [GPTNeoXJapaneseForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_neox_japanese#transformers.GPTNeoXJapaneseForCausalLM) (GPTNeoXJapaneseConfig model)
- **gpt_oss** -- [GptOssForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssForCausalLM) (GptOssConfig model)
- **gptj** -- [GPTJForCausalLM](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJForCausalLM) (GPTJConfig model)
- **granite** -- [GraniteForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granite#transformers.GraniteForCausalLM) (GraniteConfig model)
- **granitemoe** -- [GraniteMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granitemoe#transformers.GraniteMoeForCausalLM) (GraniteMoeConfig model)
- **granitemoehybrid** -- [GraniteMoeHybridForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granitemoehybrid#transformers.GraniteMoeHybridForCausalLM) (GraniteMoeHybridConfig model)
- **granitemoeshared** -- [GraniteMoeSharedForCausalLM](/docs/transformers/v5.8.0/en/model_doc/granitemoeshared#transformers.GraniteMoeSharedForCausalLM) (GraniteMoeSharedConfig model)
- **helium** -- [HeliumForCausalLM](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumForCausalLM) (HeliumConfig model)
- **hunyuan_v1_dense** -- [HunYuanDenseV1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1ForCausalLM) (HunYuanDenseV1Config model)
- **hunyuan_v1_moe** -- [HunYuanMoEV1ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1ForCausalLM) (HunYuanMoEV1Config model)
- **hy_v3** -- [HYV3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/hy_v3#transformers.HYV3ForCausalLM) (HYV3Config model)
- **jais2** -- [Jais2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/jais2#transformers.Jais2ForCausalLM) (Jais2Config model)
- **jamba** -- [JambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaForCausalLM) (JambaConfig model)
- **jetmoe** -- [JetMoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeForCausalLM) (JetMoeConfig model)
- **laguna** -- [LagunaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/laguna#transformers.LagunaForCausalLM) (LagunaConfig model)
- **lfm2** -- [Lfm2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/lfm2#transformers.Lfm2ForCausalLM) (Lfm2Config model)
- **lfm2_moe** -- [Lfm2MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/lfm2_moe#transformers.Lfm2MoeForCausalLM) (Lfm2MoeConfig model)
- **llama** -- [LlamaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaForCausalLM) (LlamaConfig model)
- **llama4** -- [Llama4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForCausalLM) (Llama4Config model)
- **llama4_text** -- [Llama4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForCausalLM) (Llama4TextConfig model)
- **longcat_flash** -- [LongcatFlashForCausalLM](/docs/transformers/v5.8.0/en/model_doc/longcat_flash#transformers.LongcatFlashForCausalLM) (LongcatFlashConfig model)
- **mamba** -- [MambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba#transformers.MambaForCausalLM) (MambaConfig model)
- **mamba2** -- [Mamba2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mamba2#transformers.Mamba2ForCausalLM) (Mamba2Config model)
- **marian** -- [MarianForCausalLM](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianForCausalLM) (MarianConfig model)
- **mbart** -- [MBartForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForCausalLM) (MBartConfig model)
- **megatron-bert** -- [MegatronBertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForCausalLM) (MegatronBertConfig model)
- **minimax** -- [MiniMaxForCausalLM](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForCausalLM) (MiniMaxConfig model)
- **minimax_m2** -- [MiniMaxM2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/minimax_m2#transformers.MiniMaxM2ForCausalLM) (MiniMaxM2Config model)
- **ministral** -- [MinistralForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForCausalLM) (MinistralConfig model)
- **ministral3** -- [Ministral3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForCausalLM) (Ministral3Config model)
- **mistral** -- [MistralForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForCausalLM) (MistralConfig model)
- **mixtral** -- [MixtralForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForCausalLM) (MixtralConfig model)
- **mllama** -- [MllamaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForCausalLM) (MllamaConfig model)
- **modernbert-decoder** -- [ModernBertDecoderForCausalLM](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderForCausalLM) (ModernBertDecoderConfig model)
- **moshi** -- [MoshiForCausalLM](/docs/transformers/v5.8.0/en/model_doc/moshi#transformers.MoshiForCausalLM) (MoshiConfig model)
- **mpt** -- [MptForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForCausalLM) (MptConfig model)
- **musicgen** -- [MusicgenForCausalLM](/docs/transformers/v5.8.0/en/model_doc/musicgen#transformers.MusicgenForCausalLM) (MusicgenConfig model)
- **musicgen_melody** -- [MusicgenMelodyForCausalLM](/docs/transformers/v5.8.0/en/model_doc/musicgen_melody#transformers.MusicgenMelodyForCausalLM) (MusicgenMelodyConfig model)
- **mvp** -- [MvpForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForCausalLM) (MvpConfig model)
- **nanochat** -- [NanoChatForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nanochat#transformers.NanoChatForCausalLM) (NanoChatConfig model)
- **nemotron** -- [NemotronForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForCausalLM) (NemotronConfig model)
- **nemotron_h** -- [NemotronHForCausalLM](/docs/transformers/v5.8.0/en/model_doc/nemotron_h#transformers.NemotronHForCausalLM) (NemotronHConfig model)
- **olmo** -- [OlmoForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoForCausalLM) (OlmoConfig model)
- **olmo2** -- [Olmo2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2ForCausalLM) (Olmo2Config model)
- **olmo3** -- [Olmo3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3ForCausalLM) (Olmo3Config model)
- **olmo_hybrid** -- [OlmoHybridForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmo_hybrid#transformers.OlmoHybridForCausalLM) (OlmoHybridConfig model)
- **olmoe** -- [OlmoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/olmoe#transformers.OlmoeForCausalLM) (OlmoeConfig model)
- **openai-gpt** -- [OpenAIGPTLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTLMHeadModel) (OpenAIGPTConfig model)
- **opt** -- [OPTForCausalLM](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTForCausalLM) (OPTConfig model)
- **pegasus** -- [PegasusForCausalLM](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusForCausalLM) (PegasusConfig model)
- **persimmon** -- [PersimmonForCausalLM](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonForCausalLM) (PersimmonConfig model)
- **phi** -- [PhiForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiForCausalLM) (PhiConfig model)
- **phi3** -- [Phi3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3ForCausalLM) (Phi3Config model)
- **phi4_multimodal** -- [Phi4MultimodalForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalForCausalLM) (Phi4MultimodalConfig model)
- **phimoe** -- [PhimoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeForCausalLM) (PhimoeConfig model)
- **plbart** -- [PLBartForCausalLM](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartForCausalLM) (PLBartConfig model)
- **prophetnet** -- [ProphetNetForCausalLM](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetForCausalLM) (ProphetNetConfig model)
- **qwen2** -- [Qwen2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForCausalLM) (Qwen2Config model)
- **qwen2_moe** -- [Qwen2MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForCausalLM) (Qwen2MoeConfig model)
- **qwen3** -- [Qwen3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForCausalLM) (Qwen3Config model)
- **qwen3_5** -- [Qwen3_5ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForCausalLM) (Qwen3_5Config model)
- **qwen3_5_moe** -- [Qwen3_5MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForCausalLM) (Qwen3_5MoeConfig model)
- **qwen3_5_moe_text** -- [Qwen3_5MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForCausalLM) (Qwen3_5MoeTextConfig model)
- **qwen3_5_text** -- [Qwen3_5ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForCausalLM) (Qwen3_5TextConfig model)
- **qwen3_moe** -- [Qwen3MoeForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForCausalLM) (Qwen3MoeConfig model)
- **qwen3_next** -- [Qwen3NextForCausalLM](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForCausalLM) (Qwen3NextConfig model)
- **recurrent_gemma** -- [RecurrentGemmaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/recurrent_gemma#transformers.RecurrentGemmaForCausalLM) (RecurrentGemmaConfig model)
- **reformer** -- [ReformerModelWithLMHead](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerModelWithLMHead) (ReformerConfig model)
- **rembert** -- [RemBertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForCausalLM) (RemBertConfig model)
- **roberta** -- [RobertaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForCausalLM) (RobertaConfig model)
- **roberta-prelayernorm** -- [RobertaPreLayerNormForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForCausalLM) (RobertaPreLayerNormConfig model)
- **roc_bert** -- [RoCBertForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForCausalLM) (RoCBertConfig model)
- **roformer** -- [RoFormerForCausalLM](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForCausalLM) (RoFormerConfig model)
- **rwkv** -- [RwkvForCausalLM](/docs/transformers/v5.8.0/en/model_doc/rwkv#transformers.RwkvForCausalLM) (RwkvConfig model)
- **seed_oss** -- [SeedOssForCausalLM](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForCausalLM) (SeedOssConfig model)
- **smollm3** -- [SmolLM3ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForCausalLM) (SmolLM3Config model)
- **solar_open** -- [SolarOpenForCausalLM](/docs/transformers/v5.8.0/en/model_doc/solar_open#transformers.SolarOpenForCausalLM) (SolarOpenConfig model)
- **stablelm** -- [StableLmForCausalLM](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmForCausalLM) (StableLmConfig model)
- **starcoder2** -- [Starcoder2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2ForCausalLM) (Starcoder2Config model)
- **trocr** -- [TrOCRForCausalLM](/docs/transformers/v5.8.0/en/model_doc/trocr#transformers.TrOCRForCausalLM) (TrOCRConfig model)
- **vaultgemma** -- [VaultGemmaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/vaultgemma#transformers.VaultGemmaForCausalLM) (VaultGemmaConfig model)
- **whisper** -- [WhisperForCausalLM](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperForCausalLM) (WhisperConfig model)
- **xglm** -- [XGLMForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xglm#transformers.XGLMForCausalLM) (XGLMConfig model)
- **xlm** -- [XLMWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMWithLMHeadModel) (XLMConfig model)
- **xlm-roberta** -- [XLMRobertaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForCausalLM) (XLMRobertaConfig model)
- **xlm-roberta-xl** -- [XLMRobertaXLForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForCausalLM) (XLMRobertaXLConfig model)
- **xlnet** -- [XLNetLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetLMHeadModel) (XLNetConfig model)
- **xlstm** -- [xLSTMForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xlstm#transformers.xLSTMForCausalLM) (xLSTMConfig model)
- **xmod** -- [XmodForCausalLM](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForCausalLM) (XmodConfig model)
- **youtu** -- [YoutuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/youtu#transformers.YoutuForCausalLM) (YoutuConfig model)
- **zamba** -- [ZambaForCausalLM](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaForCausalLM) (ZambaConfig model)
- **zamba2** -- [Zamba2ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2ForCausalLM) (Zamba2Config model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, first set it back in training mode with `model.train()`.
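The `eval()`/`train()` toggle simply flips a `training` flag on the module (and, in torch, on all of its submodules); a minimal pure-Python sketch of those semantics, using a hypothetical `TinyModule` rather than the actual torch implementation:

```python
class TinyModule:
    """Hypothetical sketch of torch.nn.Module's train()/eval() semantics."""

    def __init__(self):
        # torch modules start in training mode
        self.training = True

    def train(self, mode=True):
        # set the mode flag; torch also recurses into child modules
        self.training = mode
        return self

    def eval(self):
        # eval() is just train(False)
        return self.train(False)


m = TinyModule()
m.eval()    # what from_pretrained() does for you
m.train()   # what you call before fine-tuning
```

Layers such as dropout and batch norm consult this flag at forward time, which is why the mode matters for training versus inference.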

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. Since models and other artifacts on huggingface.co are stored in a git-based system, `revision` can be any identifier allowed by git: a branch name, a tag name, or a commit id.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. Since models and other artifacts on huggingface.co are stored in a git-based system, `code_revision` can be any identifier allowed by git: a branch name, a tag name, or a commit id.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
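When no explicit `config` is passed, the kwargs dispatch described above amounts to partitioning `**kwargs` by whether each key names a configuration attribute; a minimal sketch of that partitioning logic (the helper `split_kwargs` and the attribute set are hypothetical, not the library's internals):

```python
def split_kwargs(config_attrs, kwargs):
    """Split **kwargs into config overrides vs. arguments for the model's
    __init__(), mirroring the behavior documented for from_pretrained()."""
    config_overrides = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return config_overrides, model_kwargs


# `output_attentions` is a config attribute, so it overrides the config;
# `state_dict` is not, so it is forwarded to the model.
overrides, model_kwargs = split_kwargs(
    {"output_attentions", "hidden_size"},
    {"output_attentions": True, "state_dict": {}},
)
```

This is why `model.config.output_attentions` reflects the kwarg in the example above even though no `config` object was passed.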

### AutoModelForMaskedLM[[transformers.AutoModelForMaskedLM]]

#### transformers.AutoModelForMaskedLM[[transformers.AutoModelForMaskedLM]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2028)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
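Conceptually, `from_config()` is a dictionary dispatch from configuration class to model class, which is exactly what the mapping below enumerates; a minimal sketch with hypothetical `DemoConfig`/`DemoModel` stand-ins for real config and model classes:

```python
class DemoConfig:
    """Hypothetical stand-in for a PreTrainedConfig subclass."""


class DemoModel:
    """Hypothetical stand-in for the matching model class."""

    def __init__(self, config):
        self.config = config


# The auto class keeps a registry keyed by config *type*; AutoModel.register()
# adds entries to a mapping like this one.
_MODEL_MAPPING = {DemoConfig: DemoModel}


def from_config(config):
    # Look up the model class for this configuration and instantiate it.
    model_cls = _MODEL_MAPPING[type(config)]
    return model_cls(config)


model = from_config(DemoConfig())
```

An unregistered configuration class would raise a `KeyError` here; the real auto classes raise a `ValueError` listing the supported configurations.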

##### from_config[[transformers.AutoModelForMaskedLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

`from_config(**kwargs)`

**Parameters:**

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForMaskedLM` (AlbertConfig model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForConditionalGeneration) (BartConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForMaskedLM) (BertConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForMaskedLM) (BigBirdConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForMaskedLM) (CamembertConfig model)
  - [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForMaskedLM) (ConvBertConfig model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecTextConfig model)
  - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForMaskedLM) (DebertaConfig model)
  - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DebertaV2Config model)
  - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForMaskedLM) (DistilBertConfig model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForMaskedLM) (ElectraConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForMaskedLM) (ErnieConfig model)
  - [EsmConfig](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmConfig) configuration class: [EsmForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmForMaskedLM) (EsmConfig model)
  - [EuroBertConfig](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertConfig) configuration class: [EuroBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertForMaskedLM) (EuroBertConfig model)
  - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForMaskedLM) (FNetConfig model)
  - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertWithLMHeadModel) (FlaubertConfig model)
  - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForMaskedLM) (FunnelConfig model)
  - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForMaskedLM) (IBertConfig model)
  - [JinaEmbeddingsV3Config](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Config) configuration class: [JinaEmbeddingsV3ForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForMaskedLM) (JinaEmbeddingsV3Config model)
  - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForMaskedLM) (LayoutLMConfig model)
  - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForMaskedLM) (LongformerConfig model)
  - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForMaskedLM) (LukeConfig model)
  - [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) configuration class: [MBartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForConditionalGeneration) (MBartConfig model)
  - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForMaskedLM) (MPNetConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForMaskedLM) (MegatronBertConfig model)
  - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForMaskedLM) (MobileBertConfig model)
  - [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) configuration class: [ModernBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForMaskedLM) (ModernBertConfig model)
  - [ModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertConfig) configuration class: [ModernVBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertForMaskedLM) (ModernVBertConfig model)
  - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForMaskedLM) (MraConfig model)
  - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForConditionalGeneration) (MvpConfig model)
  - [NomicBertConfig](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertConfig) configuration class: [NomicBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertForMaskedLM) (NomicBertConfig model)
  - [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) configuration class: [NystromformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForMaskedLM) (NystromformerConfig model)
  - [PerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverConfig) configuration class: [PerceiverForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForMaskedLM) (PerceiverConfig model)
  - [ReformerConfig](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerConfig) configuration class: [ReformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerForMaskedLM) (ReformerConfig model)
  - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForMaskedLM) (RemBertConfig model)
  - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForMaskedLM) (RoCBertConfig model)
  - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForMaskedLM) (RoFormerConfig model)
  - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForMaskedLM) (RobertaConfig model)
  - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForMaskedLM) (RobertaPreLayerNormConfig model)
  - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForMaskedLM) (SqueezeBertConfig model)
  - [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) configuration class: [TapasForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForMaskedLM) (TapasConfig model)
  - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMWithLMHeadModel) (XLMConfig model)
  - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForMaskedLM) (XLMRobertaConfig model)
  - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForMaskedLM) (XLMRobertaXLConfig model)
  - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForMaskedLM) (XmodConfig model)
  - [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) configuration class: [YosoForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForMaskedLM) (YosoConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMaskedLM.from_config(config)
```
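As a concrete illustration of the note above, `from_config()` only builds the architecture with randomly initialized weights and downloads nothing. The sketch below uses `AutoConfig.for_model()` with deliberately tiny, hypothetical hyperparameters so the model stays small:

```python
from transformers import AutoConfig, AutoModelForMaskedLM

# Build a deliberately tiny BERT config locally; no files are downloaded.
config = AutoConfig.for_model(
    "bert",
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    vocab_size=128,
)

# from_config instantiates the class matching the configuration
# (here BertForMaskedLM) with randomly initialized weights --
# use from_pretrained() instead to load pretrained weights.
model = AutoModelForMaskedLM.from_config(config)
print(type(model).__name__)  # BertForMaskedLM
```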

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForMaskedLM` (AlbertConfig model) - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForConditionalGeneration) (BartConfig model) - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForMaskedLM) (BertConfig model) - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForMaskedLM) (BigBirdConfig model) - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForMaskedLM) (CamembertConfig model) - [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForMaskedLM) (ConvBertConfig model) - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecTextConfig model) - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: 
[DebertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForMaskedLM) (DebertaConfig model) - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DebertaV2Config model) - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForMaskedLM) (DistilBertConfig model) - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForMaskedLM) (ElectraConfig model) - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForMaskedLM) (ErnieConfig model) - [EsmConfig](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmConfig) configuration class: [EsmForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmForMaskedLM) (EsmConfig model) - [EuroBertConfig](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertConfig) configuration class: [EuroBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertForMaskedLM) (EuroBertConfig model) - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForMaskedLM) (FNetConfig model) - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertWithLMHeadModel) 
(FlaubertConfig model) - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForMaskedLM) (FunnelConfig model) - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForMaskedLM) (IBertConfig model) - [JinaEmbeddingsV3Config](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Config) configuration class: [JinaEmbeddingsV3ForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForMaskedLM) (JinaEmbeddingsV3Config model) - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForMaskedLM) (LayoutLMConfig model) - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForMaskedLM) (LongformerConfig model) - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForMaskedLM) (LukeConfig model) - [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) configuration class: [MBartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForConditionalGeneration) (MBartConfig model) - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForMaskedLM) (MPNetConfig model) - 
[MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForMaskedLM) (MegatronBertConfig model) - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForMaskedLM) (MobileBertConfig model) - [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) configuration class: [ModernBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForMaskedLM) (ModernBertConfig model) - [ModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertConfig) configuration class: [ModernVBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertForMaskedLM) (ModernVBertConfig model) - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForMaskedLM) (MraConfig model) - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForConditionalGeneration) (MvpConfig model) - [NomicBertConfig](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertConfig) configuration class: [NomicBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertForMaskedLM) (NomicBertConfig model) - [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) configuration class: 
[NystromformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForMaskedLM) (NystromformerConfig model) - [PerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverConfig) configuration class: [PerceiverForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForMaskedLM) (PerceiverConfig model) - [ReformerConfig](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerConfig) configuration class: [ReformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerForMaskedLM) (ReformerConfig model) - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForMaskedLM) (RemBertConfig model) - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForMaskedLM) (RoCBertConfig model) - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForMaskedLM) (RoFormerConfig model) - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForMaskedLM) (RobertaConfig model) - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForMaskedLM) (RobertaPreLayerNormConfig model) - 
[SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForMaskedLM) (SqueezeBertConfig model) - [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) configuration class: [TapasForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForMaskedLM) (TapasConfig model) - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMWithLMHeadModel) (XLMConfig model) - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForMaskedLM) (XLMRobertaConfig model) - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForMaskedLM) (XLMRobertaXLConfig model) - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForMaskedLM) (XmodConfig model) - [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) configuration class: [YosoForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForMaskedLM) (YosoConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForMaskedLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `AlbertForMaskedLM` (AlbertConfig model)
- **bart** -- [BartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForConditionalGeneration) (BartConfig model)
- **bert** -- [BertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForMaskedLM) (BertConfig model)
- **big_bird** -- [BigBirdForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForMaskedLM) (BigBirdConfig model)
- **camembert** -- [CamembertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForMaskedLM) (CamembertConfig model)
- **convbert** -- [ConvBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForMaskedLM) (ConvBertConfig model)
- **data2vec-text** -- [Data2VecTextForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForMaskedLM) (Data2VecTextConfig model)
- **deberta** -- [DebertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForMaskedLM) (DebertaConfig model)
- **deberta-v2** -- [DebertaV2ForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForMaskedLM) (DebertaV2Config model)
- **distilbert** -- [DistilBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForMaskedLM) (DistilBertConfig model)
- **electra** -- [ElectraForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForMaskedLM) (ElectraConfig model)
- **ernie** -- [ErnieForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForMaskedLM) (ErnieConfig model)
- **esm** -- [EsmForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmForMaskedLM) (EsmConfig model)
- **eurobert** -- [EuroBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertForMaskedLM) (EuroBertConfig model)
- **flaubert** -- [FlaubertWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertWithLMHeadModel) (FlaubertConfig model)
- **fnet** -- [FNetForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForMaskedLM) (FNetConfig model)
- **funnel** -- [FunnelForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForMaskedLM) (FunnelConfig model)
- **ibert** -- [IBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForMaskedLM) (IBertConfig model)
- **jina_embeddings_v3** -- [JinaEmbeddingsV3ForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForMaskedLM) (JinaEmbeddingsV3Config model)
- **layoutlm** -- [LayoutLMForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForMaskedLM) (LayoutLMConfig model)
- **longformer** -- [LongformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForMaskedLM) (LongformerConfig model)
- **luke** -- [LukeForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForMaskedLM) (LukeConfig model)
- **mbart** -- [MBartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForConditionalGeneration) (MBartConfig model)
- **megatron-bert** -- [MegatronBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForMaskedLM) (MegatronBertConfig model)
- **mobilebert** -- [MobileBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForMaskedLM) (MobileBertConfig model)
- **modernbert** -- [ModernBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForMaskedLM) (ModernBertConfig model)
- **modernvbert** -- [ModernVBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertForMaskedLM) (ModernVBertConfig model)
- **mpnet** -- [MPNetForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForMaskedLM) (MPNetConfig model)
- **mra** -- [MraForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForMaskedLM) (MraConfig model)
- **mvp** -- [MvpForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForConditionalGeneration) (MvpConfig model)
- **nomic_bert** -- [NomicBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertForMaskedLM) (NomicBertConfig model)
- **nystromformer** -- [NystromformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForMaskedLM) (NystromformerConfig model)
- **perceiver** -- [PerceiverForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForMaskedLM) (PerceiverConfig model)
- **reformer** -- [ReformerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerForMaskedLM) (ReformerConfig model)
- **rembert** -- [RemBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForMaskedLM) (RemBertConfig model)
- **roberta** -- [RobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForMaskedLM) (RobertaConfig model)
- **roberta-prelayernorm** -- [RobertaPreLayerNormForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForMaskedLM) (RobertaPreLayerNormConfig model)
- **roc_bert** -- [RoCBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForMaskedLM) (RoCBertConfig model)
- **roformer** -- [RoFormerForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForMaskedLM) (RoFormerConfig model)
- **squeezebert** -- [SqueezeBertForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForMaskedLM) (SqueezeBertConfig model)
- **tapas** -- [TapasForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForMaskedLM) (TapasConfig model)
- **xlm** -- [XLMWithLMHeadModel](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMWithLMHeadModel) (XLMConfig model)
- **xlm-roberta** -- [XLMRobertaForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForMaskedLM) (XLMRobertaConfig model)
- **xlm-roberta-xl** -- [XLMRobertaXLForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForMaskedLM) (XLMRobertaXLConfig model)
- **xmod** -- [XmodForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForMaskedLM) (XmodConfig model)
- **yoso** -- [YosoForMaskedLM](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForMaskedLM) (YosoConfig model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```
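Beyond loading, the checkpoint from the example above can be queried directly for a masked-token prediction. This is a usage sketch, not part of the official docstring: it downloads the model on first run, and the predicted token is model-dependent, so it is printed rather than hard-coded:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

# Tokenize a sentence containing the tokenizer's mask token.
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Find the [MASK] position and take the highest-scoring vocabulary token.
mask_positions = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))
```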

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (`dict[str, torch.Tensor]`, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForMaskGeneration[[transformers.AutoModelForMaskGeneration]]

#### transformers.AutoModelForMaskGeneration[[transformers.AutoModelForMaskGeneration]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L1977)

### AutoModelForSeq2SeqLM[[transformers.AutoModelForSeq2SeqLM]]

#### transformers.AutoModelForSeq2SeqLM[[transformers.AutoModelForSeq2SeqLM]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2035)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSeq2SeqLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AudioFlamingo3Config](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3Config) configuration class: [AudioFlamingo3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3ForConditionalGeneration) (AudioFlamingo3Config model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForConditionalGeneration) (BartConfig model)
  - [BigBirdPegasusConfig](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForConditionalGeneration) (BigBirdPegasusConfig model)
  - [BlenderbotConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotConfig) configuration class: [BlenderbotForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotForConditionalGeneration) (BlenderbotConfig model)
  - [BlenderbotSmallConfig](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallConfig) configuration class: [BlenderbotSmallForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallForConditionalGeneration) (BlenderbotSmallConfig model)
  - [EncoderDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/encoder-decoder#transformers.EncoderDecoderConfig) configuration class: [EncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/encoder-decoder#transformers.EncoderDecoderModel) (EncoderDecoderConfig model)
  - [FSMTConfig](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTConfig) configuration class: [FSMTForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTForConditionalGeneration) (FSMTConfig model)
  - [GlmAsrConfig](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrConfig) configuration class: [GlmAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrForConditionalGeneration) (GlmAsrConfig model)
  - [GraniteSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechConfig) configuration class: [GraniteSpeechForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechForConditionalGeneration) (GraniteSpeechConfig model)
  - [GraniteSpeechPlusConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusConfig) configuration class: [GraniteSpeechPlusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusForConditionalGeneration) (GraniteSpeechPlusConfig model)
  - [LEDConfig](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDConfig) configuration class: [LEDForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDForConditionalGeneration) (LEDConfig model)
  - [LongT5Config](/docs/transformers/v5.8.0/en/model_doc/longt5#transformers.LongT5Config) configuration class: [LongT5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/longt5#transformers.LongT5ForConditionalGeneration) (LongT5Config model)
  - [M2M100Config](/docs/transformers/v5.8.0/en/model_doc/m2m_100#transformers.M2M100Config) configuration class: [M2M100ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/m2m_100#transformers.M2M100ForConditionalGeneration) (M2M100Config model)
  - [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) configuration class: [MBartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForConditionalGeneration) (MBartConfig model)
  - [MT5Config](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Config) configuration class: [MT5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5ForConditionalGeneration) (MT5Config model)
  - [MarianConfig](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianConfig) configuration class: [MarianMTModel](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianMTModel) (MarianConfig model)
  - [MusicFlamingoConfig](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoConfig) configuration class: [MusicFlamingoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoForConditionalGeneration) (MusicFlamingoConfig model)
  - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForConditionalGeneration) (MvpConfig model)
  - [NllbMoeConfig](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeConfig) configuration class: [NllbMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeForConditionalGeneration) (NllbMoeConfig model)
  - [PLBartConfig](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartConfig) configuration class: [PLBartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartForConditionalGeneration) (PLBartConfig model)
  - [PegasusConfig](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusConfig) configuration class: [PegasusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusForConditionalGeneration) (PegasusConfig model)
  - [PegasusXConfig](/docs/transformers/v5.8.0/en/model_doc/pegasus_x#transformers.PegasusXConfig) configuration class: [PegasusXForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pegasus_x#transformers.PegasusXForConditionalGeneration) (PegasusXConfig model)
  - [ProphetNetConfig](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetConfig) configuration class: [ProphetNetForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetForConditionalGeneration) (ProphetNetConfig model)
  - [Qwen2AudioConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioConfig) configuration class: [Qwen2AudioForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioForConditionalGeneration) (Qwen2AudioConfig model)
  - [SeamlessM4TConfig](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TConfig) configuration class: [SeamlessM4TForTextToText](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TForTextToText) (SeamlessM4TConfig model)
  - [SeamlessM4Tv2Config](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2Config) configuration class: [SeamlessM4Tv2ForTextToText](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2ForTextToText) (SeamlessM4Tv2Config model)
  - [SwitchTransformersConfig](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersConfig) configuration class: [SwitchTransformersForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersForConditionalGeneration) (SwitchTransformersConfig model)
  - [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) configuration class: [T5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForConditionalGeneration) (T5Config model)
  - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForConditionalGeneration) (T5Gemma2Config model)
  - [T5GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaConfig) configuration class: [T5GemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForConditionalGeneration) (T5GemmaConfig model)
  - [UMT5Config](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Config) configuration class: [UMT5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5ForConditionalGeneration) (UMT5Config model)
  - [VibeVoiceAsrConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrConfig) configuration class: [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
  - [VoxtralConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralConfig) configuration class: [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
  - [VoxtralRealtimeConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeConfig) configuration class: [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = AutoModelForSeq2SeqLM.from_config(config)
```
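
Under the hood, `from_config()` behaves like a lookup keyed on the configuration object's class. The sketch below is purely illustrative; all class names and the `CONFIG_TO_MODEL` dict are hypothetical stand-ins, not the actual transformers internals:

```python
# Illustrative sketch: from_config() dispatches on the *type* of the
# config object, much like a dict keyed by configuration class.
# All names here are hypothetical stand-ins.
class FakeT5Config: ...
class FakeBartConfig: ...

class FakeT5Model:
    def __init__(self, config):
        self.config = config

class FakeBartModel:
    def __init__(self, config):
        self.config = config

CONFIG_TO_MODEL = {FakeT5Config: FakeT5Model, FakeBartConfig: FakeBartModel}

def from_config(config):
    try:
        model_cls = CONFIG_TO_MODEL[type(config)]
    except KeyError:
        raise ValueError(f"Unrecognized configuration class {type(config)!r}")
    # The model is built with fresh (random) weights -- no weights are loaded.
    return model_cls(config)

model = from_config(FakeT5Config())
```

This is also why registering a custom model via `AutoModel.register(NewModelConfig, NewModel)` is enough for the auto class to pick it up: registration just adds one more entry to this kind of mapping.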

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class; see the full configuration-to-model mapping listed under `from_config` above.

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForSeq2SeqLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **audioflamingo3** -- [AudioFlamingo3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/audioflamingo3#transformers.AudioFlamingo3ForConditionalGeneration) (AudioFlamingo3Config model)
- **bart** -- [BartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForConditionalGeneration) (BartConfig model)
- **bigbird_pegasus** -- [BigBirdPegasusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForConditionalGeneration) (BigBirdPegasusConfig model)
- **blenderbot** -- [BlenderbotForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blenderbot#transformers.BlenderbotForConditionalGeneration) (BlenderbotConfig model)
- **blenderbot-small** -- [BlenderbotSmallForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blenderbot-small#transformers.BlenderbotSmallForConditionalGeneration) (BlenderbotSmallConfig model)
- **encoder-decoder** -- [EncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/encoder-decoder#transformers.EncoderDecoderModel) (EncoderDecoderConfig model)
- **fsmt** -- [FSMTForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/fsmt#transformers.FSMTForConditionalGeneration) (FSMTConfig model)
- **glmasr** -- [GlmAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrForConditionalGeneration) (GlmAsrConfig model)
- **granite_speech** -- [GraniteSpeechForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechForConditionalGeneration) (GraniteSpeechConfig model)
- **granite_speech_plus** -- [GraniteSpeechPlusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusForConditionalGeneration) (GraniteSpeechPlusConfig model)
- **led** -- [LEDForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDForConditionalGeneration) (LEDConfig model)
- **longt5** -- [LongT5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/longt5#transformers.LongT5ForConditionalGeneration) (LongT5Config model)
- **m2m_100** -- [M2M100ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/m2m_100#transformers.M2M100ForConditionalGeneration) (M2M100Config model)
- **marian** -- [MarianMTModel](/docs/transformers/v5.8.0/en/model_doc/marian#transformers.MarianMTModel) (MarianConfig model)
- **mbart** -- [MBartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForConditionalGeneration) (MBartConfig model)
- **mt5** -- [MT5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5ForConditionalGeneration) (MT5Config model)
- **musicflamingo** -- [MusicFlamingoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/musicflamingo#transformers.MusicFlamingoForConditionalGeneration) (MusicFlamingoConfig model)
- **mvp** -- [MvpForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForConditionalGeneration) (MvpConfig model)
- **nllb-moe** -- [NllbMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/nllb-moe#transformers.NllbMoeForConditionalGeneration) (NllbMoeConfig model)
- **pegasus** -- [PegasusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pegasus#transformers.PegasusForConditionalGeneration) (PegasusConfig model)
- **pegasus_x** -- [PegasusXForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pegasus_x#transformers.PegasusXForConditionalGeneration) (PegasusXConfig model)
- **plbart** -- [PLBartForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartForConditionalGeneration) (PLBartConfig model)
- **prophetnet** -- [ProphetNetForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/prophetnet#transformers.ProphetNetForConditionalGeneration) (ProphetNetConfig model)
- **qwen2_audio** -- [Qwen2AudioForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioForConditionalGeneration) (Qwen2AudioConfig model)
- **seamless_m4t** -- [SeamlessM4TForTextToText](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TForTextToText) (SeamlessM4TConfig model)
- **seamless_m4t_v2** -- [SeamlessM4Tv2ForTextToText](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2ForTextToText) (SeamlessM4Tv2Config model)
- **switch_transformers** -- [SwitchTransformersForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/switch_transformers#transformers.SwitchTransformersForConditionalGeneration) (SwitchTransformersConfig model)
- **t5** -- [T5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForConditionalGeneration) (T5Config model)
- **t5gemma** -- [T5GemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForConditionalGeneration) (T5GemmaConfig model)
- **t5gemma2** -- [T5Gemma2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForConditionalGeneration) (T5Gemma2Config model)
- **umt5** -- [UMT5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5ForConditionalGeneration) (UMT5Config model)
- **vibevoice_asr** -- [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
- **voxtral** -- [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
- **voxtral_realtime** -- [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
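
The resolution described above amounts to a dictionary lookup on `model_type`, with substring matching on the name/path as a fallback. A minimal sketch with a hypothetical mapping (the real one lives in `modeling_auto.py` and is far larger):

```python
# Illustrative sketch of from_pretrained() model resolution:
# prefer the config's model_type, fall back to pattern matching
# on the pretrained name/path. Hypothetical mapping, not the real one.
MODEL_MAPPING = {
    "t5": "T5ForConditionalGeneration",
    "bart": "BartForConditionalGeneration",
    "marian": "MarianMTModel",
}

def resolve(model_type=None, name_or_path=""):
    if model_type is not None:
        # Normal path: the config's model_type selects the class directly.
        return MODEL_MAPPING[model_type]
    # Fallback: pattern-match the key against the name/path.
    for key, cls_name in MODEL_MAPPING.items():
        if key in name_or_path.lower():
            return cls_name
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")

resolve(model_type="t5")                   # resolved from the config
resolve(name_or_path="google-t5/t5-base")  # pattern-matching fallback
```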

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. Since we use a git-based system for storing models and other artifacts on huggingface.co, it can be a branch name, a tag name, a commit id, or any other identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. Since we use a git-based system for storing models and other artifacts on huggingface.co, it can be a branch name, a tag name, a commit id, or any other identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
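The kwargs-forwarding behavior described above can be sketched as follows; this is a minimal illustration, assuming no explicit `config` is passed, so the keyword argument is first applied to the auto-loaded configuration:

```python
from transformers import AutoModel

# No explicit `config` is given, so `output_attentions=True` matches a
# configuration attribute and overrides it before the model is built.
model = AutoModel.from_pretrained(
    "google-bert/bert-base-cased", output_attentions=True
)

# The override is visible on the loaded model's configuration.
assert model.config.output_attentions is True
```

Keyword arguments that do not correspond to any configuration attribute would instead be forwarded to the underlying model's `__init__`.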

### AutoModelForSequenceClassification[[transformers.AutoModelForSequenceClassification]]

#### transformers.AutoModelForSequenceClassification[[transformers.AutoModelForSequenceClassification]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2046)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
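A short sketch of both points above: direct construction raises an error, and the factory class methods are the supported entry point (the checkpoint name is illustrative; any sequence-classification checkpoint works):

```python
from transformers import AutoModelForSequenceClassification

# Direct construction is disallowed; the auto classes are pure factories.
try:
    AutoModelForSequenceClassification()
except EnvironmentError as err:
    print("expected error:", err)

# Use the factory class method instead.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
)
```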

#### from_config[[transformers.AutoModelForSequenceClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForSequenceClassification` (AlbertConfig model)
  - [ArceeConfig](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeConfig) configuration class: [ArceeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForSequenceClassification) (ArceeConfig model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForSequenceClassification) (BartConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForSequenceClassification) (BertConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForSequenceClassification) (BigBirdConfig model)
  - [BigBirdPegasusConfig](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForSequenceClassification) (BigBirdPegasusConfig model)
  - [BioGptConfig](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptForSequenceClassification) (BioGptConfig model)
  - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForSequenceClassification) (BloomConfig model)
  - [CTRLConfig](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLForSequenceClassification) (CTRLConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForSequenceClassification) (CamembertConfig model)
  - [CanineConfig](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineForSequenceClassification) (CanineConfig model)
  - [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForSequenceClassification) (ConvBertConfig model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForSequenceClassification) (Data2VecTextConfig model)
  - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForSequenceClassification) (DebertaConfig model)
  - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForSequenceClassification) (DebertaV2Config model)
  - [DeepseekV2Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2Config) configuration class: [DeepseekV2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2ForSequenceClassification) (DeepseekV2Config model)
  - [DeepseekV3Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3ForSequenceClassification) (DeepseekV3Config model)
  - [DiffLlamaConfig](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaConfig) configuration class: [DiffLlamaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForSequenceClassification) (DiffLlamaConfig model)
  - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForSequenceClassification) (DistilBertConfig model)
  - [DogeConfig](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeConfig) configuration class: [DogeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeForSequenceClassification) (DogeConfig model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForSequenceClassification) (ElectraConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForSequenceClassification) (ErnieConfig model)
  - [EsmConfig](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmConfig) configuration class: [EsmForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmForSequenceClassification) (EsmConfig model)
  - [EuroBertConfig](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertConfig) configuration class: [EuroBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertForSequenceClassification) (EuroBertConfig model)
  - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForSequenceClassification) (Exaone4Config model)
  - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForSequenceClassification) (FNetConfig model)
  - [FalconConfig](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconConfig) configuration class: [FalconForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForSequenceClassification) (FalconConfig model)
  - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertForSequenceClassification) (FlaubertConfig model)
  - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForSequenceClassification) (FunnelConfig model)
  - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2ForSequenceClassification) (GPT2Config model)
  - [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) configuration class: [GPTBigCodeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForSequenceClassification) (GPTBigCodeConfig model)
  - [GPTJConfig](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJConfig) configuration class: [GPTJForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJForSequenceClassification) (GPTJConfig model)
  - [GPTNeoConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoConfig) configuration class: [GPTNeoForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForSequenceClassification) (GPTNeoConfig model)
  - [GPTNeoXConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXConfig) configuration class: [GPTNeoXForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForSequenceClassification) (GPTNeoXConfig model)
  - [Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2ForSequenceClassification) (Gemma2Config model)
  - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForSequenceClassification) (Gemma3Config model)
  - [Gemma3TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: [Gemma3TextForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextForSequenceClassification) (Gemma3TextConfig model)
  - [GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaForSequenceClassification) (GemmaConfig model)
  - [Glm4Config](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Config) configuration class: [Glm4ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4ForSequenceClassification) (Glm4Config model)
  - [GlmConfig](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmConfig) configuration class: [GlmForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmForSequenceClassification) (GlmConfig model)
  - [GptOssConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssConfig) configuration class: [GptOssForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssForSequenceClassification) (GptOssConfig model)
  - [HeliumConfig](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumConfig) configuration class: [HeliumForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumForSequenceClassification) (HeliumConfig model)
  - [HunYuanDenseV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1Config) configuration class: [HunYuanDenseV1ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1ForSequenceClassification) (HunYuanDenseV1Config model)
  - [HunYuanMoEV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1Config) configuration class: [HunYuanMoEV1ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1ForSequenceClassification) (HunYuanMoEV1Config model)
  - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForSequenceClassification) (IBertConfig model)
  - [JambaConfig](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaForSequenceClassification) (JambaConfig model)
  - [JetMoeConfig](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeConfig) configuration class: [JetMoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeForSequenceClassification) (JetMoeConfig model)
  - [JinaEmbeddingsV3Config](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Config) configuration class: [JinaEmbeddingsV3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForSequenceClassification) (JinaEmbeddingsV3Config model)
  - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForSequenceClassification) (LayoutLMConfig model)
  - [LayoutLMv2Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Config) configuration class: [LayoutLMv2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForSequenceClassification) (LayoutLMv2Config model)
  - [LayoutLMv3Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Config) configuration class: [LayoutLMv3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForSequenceClassification) (LayoutLMv3Config model)
  - [LiltConfig](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltConfig) configuration class: [LiltForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltForSequenceClassification) (LiltConfig model)
  - [LlamaConfig](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaForSequenceClassification) (LlamaConfig model)
  - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForSequenceClassification) (LongformerConfig model)
  - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForSequenceClassification) (LukeConfig model)
  - [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) configuration class: [MBartForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForSequenceClassification) (MBartConfig model)
  - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForSequenceClassification) (MPNetConfig model)
  - [MT5Config](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Config) configuration class: [MT5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5ForSequenceClassification) (MT5Config model)
  - [MarkupLMConfig](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMConfig) configuration class: [MarkupLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMForSequenceClassification) (MarkupLMConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForSequenceClassification) (MegatronBertConfig model)
  - [MiniMaxConfig](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxConfig) configuration class: [MiniMaxForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForSequenceClassification) (MiniMaxConfig model)
  - [Ministral3Config](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Config) configuration class: [Ministral3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForSequenceClassification) (Ministral3Config model)
  - [MinistralConfig](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralConfig) configuration class: [MinistralForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForSequenceClassification) (MinistralConfig model)
  - [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: [Mistral4ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForSequenceClassification) (Mistral4Config model)
  - [MistralConfig](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForSequenceClassification) (MistralConfig model)
  - [MixtralConfig](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralConfig) configuration class: [MixtralForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForSequenceClassification) (MixtralConfig model)
  - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForSequenceClassification) (MobileBertConfig model)
  - [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) configuration class: [ModernBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForSequenceClassification) (ModernBertConfig model)
  - [ModernBertDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderConfig) configuration class: [ModernBertDecoderForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderForSequenceClassification) (ModernBertDecoderConfig model)
  - [ModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertConfig) configuration class: [ModernVBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertForSequenceClassification) (ModernVBertConfig model)
  - [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForSequenceClassification) (MptConfig model)
  - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForSequenceClassification) (MraConfig model)
  - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForSequenceClassification) (MvpConfig model)
  - [NemotronConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronConfig) configuration class: [NemotronForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForSequenceClassification) (NemotronConfig model)
  - [NomicBertConfig](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertConfig) configuration class: [NomicBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertForSequenceClassification) (NomicBertConfig model)
  - [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) configuration class: [NystromformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForSequenceClassification) (NystromformerConfig model)
  - [OPTConfig](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTConfig) configuration class: [OPTForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTForSequenceClassification) (OPTConfig model)
  - [Olmo2Config](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2Config) configuration class: [Olmo2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2ForSequenceClassification) (Olmo2Config model)
  - [Olmo3Config](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3Config) configuration class: [Olmo3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3ForSequenceClassification) (Olmo3Config model)
  - [OlmoConfig](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoConfig) configuration class: [OlmoForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoForSequenceClassification) (OlmoConfig model)
  - [OpenAIGPTConfig](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTForSequenceClassification) (OpenAIGPTConfig model)
  - [PLBartConfig](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartConfig) configuration class: [PLBartForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartForSequenceClassification) (PLBartConfig model)
  - [PerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverConfig) configuration class: [PerceiverForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForSequenceClassification) (PerceiverConfig model)
  - [PersimmonConfig](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonConfig) configuration class: [PersimmonForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonForSequenceClassification) (PersimmonConfig model)
  - [Phi3Config](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Config) configuration class: [Phi3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3ForSequenceClassification) (Phi3Config model)
  - [PhiConfig](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiConfig) configuration class: [PhiForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiForSequenceClassification) (PhiConfig model)
  - [PhimoeConfig](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeConfig) configuration class: [PhimoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeForSequenceClassification) (PhimoeConfig model)
  - [Qwen2Config](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Config) configuration class: [Qwen2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForSequenceClassification) (Qwen2Config model)
  - [Qwen2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeConfig) configuration class: [Qwen2MoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForSequenceClassification) (Qwen2MoeConfig model)
  - [Qwen3Config](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Config) configuration class: [Qwen3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForSequenceClassification) (Qwen3Config model)
  - [Qwen3MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeConfig) configuration class: [Qwen3MoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForSequenceClassification) (Qwen3MoeConfig model)
  - [Qwen3NextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextConfig) configuration class: [Qwen3NextForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForSequenceClassification) (Qwen3NextConfig model)
  - [Qwen3_5Config](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Config) configuration class: [Qwen3_5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForSequenceClassification) (Qwen3_5Config model)
  - [Qwen3_5TextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5TextConfig) configuration class: [Qwen3_5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForSequenceClassification) (Qwen3_5TextConfig model)
  - [ReformerConfig](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerConfig) configuration class: [ReformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerForSequenceClassification) (ReformerConfig model)
  - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForSequenceClassification) (RemBertConfig model)
  - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForSequenceClassification) (RoCBertConfig model)
  - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForSequenceClassification) (RoFormerConfig model)
  - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForSequenceClassification) (RobertaConfig model)
  - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForSequenceClassification) (RobertaPreLayerNormConfig model)
  - [SeedOssConfig](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssConfig) configuration class: [SeedOssForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForSequenceClassification) (SeedOssConfig model)
  - [SmolLM3Config](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Config) configuration class: [SmolLM3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForSequenceClassification) (SmolLM3Config model)
  - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForSequenceClassification) (SqueezeBertConfig model)
  - [StableLmConfig](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmConfig) configuration class: [StableLmForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmForSequenceClassification) (StableLmConfig model)
  - [Starcoder2Config](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Config) configuration class: [Starcoder2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2ForSequenceClassification) (Starcoder2Config model)
  - [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) configuration class: [T5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForSequenceClassification) (T5Config model)
  - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForSequenceClassification) (T5Gemma2Config model)
  - [T5GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaConfig) configuration class: [T5GemmaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForSequenceClassification) (T5GemmaConfig model)
  - [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) configuration class: [TapasForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForSequenceClassification) (TapasConfig model)
  - [UMT5Config](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Config) configuration class: [UMT5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5ForSequenceClassification) (UMT5Config model)
  - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMForSequenceClassification) (XLMConfig model)
  - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForSequenceClassification) (XLMRobertaConfig model)
  - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForSequenceClassification) (XLMRobertaXLConfig model)
  - [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetForSequenceClassification) (XLNetConfig model)
  - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForSequenceClassification) (XmodConfig model)
  - [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) configuration class: [YosoForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForSequenceClassification) (YosoConfig model)
  - [Zamba2Config](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2Config) configuration class: [Zamba2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2ForSequenceClassification) (Zamba2Config model)
  - [ZambaConfig](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaConfig) configuration class: [ZambaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaForSequenceClassification) (ZambaConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
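  As a minimal sketch, the attention implementation can be passed as a keyword argument when instantiating from a configuration. `"eager"` is used here because it is always available; `"sdpa"` or the flash-attention backends can be requested the same way when your environment supports them:

  ```python
  from transformers import AutoConfig, AutoModelForSequenceClassification

  # Build the architecture from the config only (random weights) and
  # request the manual eager attention implementation.
  config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
  model = AutoModelForSequenceClassification.from_config(config, attn_implementation="eager")
  print(model.config._attn_implementation)  # "eager"
  ```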

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForSequenceClassification.from_config(config)
```
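Because `from_config()` initializes weights randomly rather than loading them, configuration overrides take effect immediately in the instantiated architecture. A short sketch (the label count of 5 is arbitrary, chosen only for illustration):

```python
from transformers import AutoConfig, AutoModelForSequenceClassification

# Override the label count in the config; the freshly initialized
# classification head follows it.
config = AutoConfig.from_pretrained("google-bert/bert-base-cased", num_labels=5)
model = AutoModelForSequenceClassification.from_config(config)
print(model.classifier.out_features)  # 5
```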

**Parameters:**

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForSequenceClassification` (AlbertConfig model)
  - [ArceeConfig](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeConfig) configuration class: [ArceeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForSequenceClassification) (ArceeConfig model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForSequenceClassification) (BartConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForSequenceClassification) (BertConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForSequenceClassification) (BigBirdConfig model)
  - [BigBirdPegasusConfig](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForSequenceClassification) (BigBirdPegasusConfig model)
  - [BioGptConfig](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptForSequenceClassification) (BioGptConfig model)
  - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForSequenceClassification) (BloomConfig model)
  - [CTRLConfig](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLConfig) configuration class: [CTRLForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLForSequenceClassification) (CTRLConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForSequenceClassification) (CamembertConfig model)
  - [CanineConfig](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineForSequenceClassification) (CanineConfig model)
  - [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForSequenceClassification) (ConvBertConfig model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForSequenceClassification) (Data2VecTextConfig model)
  - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForSequenceClassification) (DebertaConfig model)
  - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForSequenceClassification) (DebertaV2Config model)
  - [DeepseekV2Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2Config) configuration class: [DeepseekV2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2ForSequenceClassification) (DeepseekV2Config model)
  - [DeepseekV3Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3ForSequenceClassification) (DeepseekV3Config model)
  - [DiffLlamaConfig](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaConfig) configuration class: [DiffLlamaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForSequenceClassification) (DiffLlamaConfig model)
  - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForSequenceClassification) (DistilBertConfig model)
  - [DogeConfig](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeConfig) configuration class: [DogeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeForSequenceClassification) (DogeConfig model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForSequenceClassification) (ElectraConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForSequenceClassification) (ErnieConfig model)
  - [EsmConfig](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmConfig) configuration class: [EsmForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmForSequenceClassification) (EsmConfig model)
  - [EuroBertConfig](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertConfig) configuration class: [EuroBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertForSequenceClassification) (EuroBertConfig model)
  - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForSequenceClassification) (Exaone4Config model)
  - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForSequenceClassification) (FNetConfig model)
  - [FalconConfig](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconConfig) configuration class: [FalconForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForSequenceClassification) (FalconConfig model)
  - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertForSequenceClassification) (FlaubertConfig model)
  - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForSequenceClassification) (FunnelConfig model)
  - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2ForSequenceClassification) (GPT2Config model)
  - [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) configuration class: [GPTBigCodeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForSequenceClassification) (GPTBigCodeConfig model)
  - [GPTJConfig](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJConfig) configuration class: [GPTJForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJForSequenceClassification) (GPTJConfig model)
  - [GPTNeoConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoConfig) configuration class: [GPTNeoForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForSequenceClassification) (GPTNeoConfig model)
  - [GPTNeoXConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXConfig) configuration class: [GPTNeoXForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForSequenceClassification) (GPTNeoXConfig model)
  - [Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2ForSequenceClassification) (Gemma2Config model)
  - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForSequenceClassification) (Gemma3Config model)
  - [Gemma3TextConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextConfig) configuration class: [Gemma3TextForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextForSequenceClassification) (Gemma3TextConfig model)
  - [GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaForSequenceClassification) (GemmaConfig model)
  - [Glm4Config](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Config) configuration class: [Glm4ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4ForSequenceClassification) (Glm4Config model)
  - [GlmConfig](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmConfig) configuration class: [GlmForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmForSequenceClassification) (GlmConfig model)
  - [GptOssConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssConfig) configuration class: [GptOssForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssForSequenceClassification) (GptOssConfig model)
  - [HeliumConfig](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumConfig) configuration class: [HeliumForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumForSequenceClassification) (HeliumConfig model)
  - [HunYuanDenseV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1Config) configuration class: [HunYuanDenseV1ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1ForSequenceClassification) (HunYuanDenseV1Config model)
  - [HunYuanMoEV1Config](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1Config) configuration class: [HunYuanMoEV1ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1ForSequenceClassification) (HunYuanMoEV1Config model)
  - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForSequenceClassification) (IBertConfig model)
  - [JambaConfig](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaConfig) configuration class: [JambaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaForSequenceClassification) (JambaConfig model)
  - [JetMoeConfig](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeConfig) configuration class: [JetMoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeForSequenceClassification) (JetMoeConfig model)
  - [JinaEmbeddingsV3Config](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Config) configuration class: [JinaEmbeddingsV3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForSequenceClassification) (JinaEmbeddingsV3Config model)
  - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForSequenceClassification) (LayoutLMConfig model)
  - [LayoutLMv2Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Config) configuration class: [LayoutLMv2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForSequenceClassification) (LayoutLMv2Config model)
  - [LayoutLMv3Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Config) configuration class: [LayoutLMv3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForSequenceClassification) (LayoutLMv3Config model)
  - [LiltConfig](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltConfig) configuration class: [LiltForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltForSequenceClassification) (LiltConfig model)
  - [LlamaConfig](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaForSequenceClassification) (LlamaConfig model)
  - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForSequenceClassification) (LongformerConfig model)
  - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForSequenceClassification) (LukeConfig model)
  - [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) configuration class: [MBartForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForSequenceClassification) (MBartConfig model)
  - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForSequenceClassification) (MPNetConfig model)
  - [MT5Config](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Config) configuration class: [MT5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5ForSequenceClassification) (MT5Config model)
  - [MarkupLMConfig](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMConfig) configuration class: [MarkupLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMForSequenceClassification) (MarkupLMConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForSequenceClassification) (MegatronBertConfig model)
  - [MiniMaxConfig](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxConfig) configuration class: [MiniMaxForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForSequenceClassification) (MiniMaxConfig model)
  - [Ministral3Config](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Config) configuration class: [Ministral3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForSequenceClassification) (Ministral3Config model)
  - [MinistralConfig](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralConfig) configuration class: [MinistralForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForSequenceClassification) (MinistralConfig model)
  - [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: [Mistral4ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForSequenceClassification) (Mistral4Config model)
  - [MistralConfig](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForSequenceClassification) (MistralConfig model)
  - [MixtralConfig](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralConfig) configuration class: [MixtralForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForSequenceClassification) (MixtralConfig model)
  - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForSequenceClassification) (MobileBertConfig model)
  - [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) configuration class: [ModernBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForSequenceClassification) (ModernBertConfig model)
  - [ModernBertDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderConfig) configuration class: [ModernBertDecoderForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderForSequenceClassification) (ModernBertDecoderConfig model)
  - [ModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertConfig) configuration class: [ModernVBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertForSequenceClassification) (ModernVBertConfig model)
  - [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForSequenceClassification) (MptConfig model)
  - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForSequenceClassification) (MraConfig model)
  - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForSequenceClassification) (MvpConfig model)
  - [NemotronConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronConfig) configuration class: [NemotronForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForSequenceClassification) (NemotronConfig model)
  - [NomicBertConfig](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertConfig) configuration class: [NomicBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertForSequenceClassification) (NomicBertConfig model)
  - [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) configuration class: [NystromformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForSequenceClassification) (NystromformerConfig model)
  - [OPTConfig](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTConfig) configuration class: [OPTForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTForSequenceClassification) (OPTConfig model)
  - [Olmo2Config](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2Config) configuration class: [Olmo2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2ForSequenceClassification) (Olmo2Config model)
  - [Olmo3Config](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3Config) configuration class: [Olmo3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3ForSequenceClassification) (Olmo3Config model)
  - [OlmoConfig](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoConfig) configuration class: [OlmoForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoForSequenceClassification) (OlmoConfig model)
  - [OpenAIGPTConfig](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTConfig) configuration class: [OpenAIGPTForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTForSequenceClassification) (OpenAIGPTConfig model)
  - [PLBartConfig](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartConfig) configuration class: [PLBartForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartForSequenceClassification) (PLBartConfig model)
  - [PerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverConfig) configuration class: [PerceiverForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForSequenceClassification) (PerceiverConfig model)
  - [PersimmonConfig](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonConfig) configuration class: [PersimmonForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonForSequenceClassification) (PersimmonConfig model)
  - [Phi3Config](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Config) configuration class: [Phi3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3ForSequenceClassification) (Phi3Config model)
  - [PhiConfig](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiConfig) configuration class: [PhiForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiForSequenceClassification) (PhiConfig model)
  - [PhimoeConfig](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeConfig) configuration class: [PhimoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeForSequenceClassification) (PhimoeConfig model)
  - [Qwen2Config](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Config) configuration class: [Qwen2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForSequenceClassification) (Qwen2Config model)
  - [Qwen2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeConfig) configuration class: [Qwen2MoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForSequenceClassification) (Qwen2MoeConfig model)
  - [Qwen3Config](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Config) configuration class: [Qwen3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForSequenceClassification) (Qwen3Config model)
  - [Qwen3MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeConfig) configuration class: [Qwen3MoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForSequenceClassification) (Qwen3MoeConfig model)
  - [Qwen3NextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextConfig) configuration class: [Qwen3NextForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForSequenceClassification) (Qwen3NextConfig model)
  - [Qwen3_5Config](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Config) configuration class: [Qwen3_5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForSequenceClassification) (Qwen3_5Config model)
  - [Qwen3_5TextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5TextConfig) configuration class: [Qwen3_5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForSequenceClassification) (Qwen3_5TextConfig model)
  - [ReformerConfig](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerConfig) configuration class: [ReformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerForSequenceClassification) (ReformerConfig model)
  - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForSequenceClassification) (RemBertConfig model)
  - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForSequenceClassification) (RoCBertConfig model)
  - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForSequenceClassification) (RoFormerConfig model)
  - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForSequenceClassification) (RobertaConfig model)
  - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForSequenceClassification) (RobertaPreLayerNormConfig model)
  - [SeedOssConfig](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssConfig) configuration class: [SeedOssForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForSequenceClassification) (SeedOssConfig model)
  - [SmolLM3Config](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Config) configuration class: [SmolLM3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForSequenceClassification) (SmolLM3Config model)
  - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForSequenceClassification) (SqueezeBertConfig model)
  - [StableLmConfig](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmConfig) configuration class: [StableLmForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmForSequenceClassification) (StableLmConfig model)
  - [Starcoder2Config](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Config) configuration class: [Starcoder2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2ForSequenceClassification) (Starcoder2Config model)
  - [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) configuration class: [T5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForSequenceClassification) (T5Config model)
  - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForSequenceClassification) (T5Gemma2Config model)
  - [T5GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaConfig) configuration class: [T5GemmaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForSequenceClassification) (T5GemmaConfig model)
  - [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) configuration class: [TapasForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForSequenceClassification) (TapasConfig model)
  - [UMT5Config](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Config) configuration class: [UMT5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5ForSequenceClassification) (UMT5Config model)
  - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMForSequenceClassification) (XLMConfig model)
  - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForSequenceClassification) (XLMRobertaConfig model)
  - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForSequenceClassification) (XLMRobertaXLConfig model)
  - [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetForSequenceClassification) (XLNetConfig model)
  - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForSequenceClassification) (XmodConfig model)
  - [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) configuration class: [YosoForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForSequenceClassification) (YosoConfig model)
  - [Zamba2Config](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2Config) configuration class: [Zamba2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2ForSequenceClassification) (Zamba2Config model)
  - [ZambaConfig](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaConfig) configuration class: [ZambaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaForSequenceClassification) (ZambaConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, SDPA is used when available (torch>=2.1.1); otherwise the manual `"eager"` implementation is used.
#### from_pretrained[[transformers.AutoModelForSequenceClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a sequence classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [AlbertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertForSequenceClassification) (AlbertConfig model)
- **arcee** -- [ArceeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForSequenceClassification) (ArceeConfig model)
- **bart** -- [BartForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForSequenceClassification) (BartConfig model)
- **bert** -- [BertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForSequenceClassification) (BertConfig model)
- **big_bird** -- [BigBirdForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForSequenceClassification) (BigBirdConfig model)
- **bigbird_pegasus** -- [BigBirdPegasusForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForSequenceClassification) (BigBirdPegasusConfig model)
- **biogpt** -- [BioGptForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptForSequenceClassification) (BioGptConfig model)
- **bloom** -- [BloomForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForSequenceClassification) (BloomConfig model)
- **camembert** -- [CamembertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForSequenceClassification) (CamembertConfig model)
- **canine** -- [CanineForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineForSequenceClassification) (CanineConfig model)
- **convbert** -- [ConvBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForSequenceClassification) (ConvBertConfig model)
- **ctrl** -- [CTRLForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ctrl#transformers.CTRLForSequenceClassification) (CTRLConfig model)
- **data2vec-text** -- [Data2VecTextForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForSequenceClassification) (Data2VecTextConfig model)
- **deberta** -- [DebertaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForSequenceClassification) (DebertaConfig model)
- **deberta-v2** -- [DebertaV2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForSequenceClassification) (DebertaV2Config model)
- **deepseek_v2** -- [DeepseekV2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deepseek_v2#transformers.DeepseekV2ForSequenceClassification) (DeepseekV2Config model)
- **deepseek_v3** -- [DeepseekV3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3ForSequenceClassification) (DeepseekV3Config model)
- **diffllama** -- [DiffLlamaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForSequenceClassification) (DiffLlamaConfig model)
- **distilbert** -- [DistilBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForSequenceClassification) (DistilBertConfig model)
- **doge** -- [DogeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/doge#transformers.DogeForSequenceClassification) (DogeConfig model)
- **electra** -- [ElectraForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForSequenceClassification) (ElectraConfig model)
- **ernie** -- [ErnieForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForSequenceClassification) (ErnieConfig model)
- **esm** -- [EsmForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmForSequenceClassification) (EsmConfig model)
- **eurobert** -- [EuroBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertForSequenceClassification) (EuroBertConfig model)
- **exaone4** -- [Exaone4ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForSequenceClassification) (Exaone4Config model)
- **falcon** -- [FalconForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForSequenceClassification) (FalconConfig model)
- **flaubert** -- [FlaubertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertForSequenceClassification) (FlaubertConfig model)
- **fnet** -- [FNetForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForSequenceClassification) (FNetConfig model)
- **funnel** -- [FunnelForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForSequenceClassification) (FunnelConfig model)
- **gemma** -- [GemmaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaForSequenceClassification) (GemmaConfig model)
- **gemma2** -- [Gemma2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2ForSequenceClassification) (Gemma2Config model)
- **gemma3** -- [Gemma3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForSequenceClassification) (Gemma3Config model)
- **gemma3_text** -- [Gemma3TextForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3TextForSequenceClassification) (Gemma3TextConfig model)
- **glm** -- [GlmForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmForSequenceClassification) (GlmConfig model)
- **glm4** -- [Glm4ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4ForSequenceClassification) (Glm4Config model)
- **gpt-sw3** -- [GPT2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2ForSequenceClassification) (GPT2Config model)
- **gpt2** -- [GPT2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2ForSequenceClassification) (GPT2Config model)
- **gpt_bigcode** -- [GPTBigCodeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForSequenceClassification) (GPTBigCodeConfig model)
- **gpt_neo** -- [GPTNeoForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForSequenceClassification) (GPTNeoConfig model)
- **gpt_neox** -- [GPTNeoXForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForSequenceClassification) (GPTNeoXConfig model)
- **gpt_oss** -- [GptOssForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssForSequenceClassification) (GptOssConfig model)
- **gptj** -- [GPTJForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJForSequenceClassification) (GPTJConfig model)
- **helium** -- [HeliumForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumForSequenceClassification) (HeliumConfig model)
- **hunyuan_v1_dense** -- [HunYuanDenseV1ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_dense#transformers.HunYuanDenseV1ForSequenceClassification) (HunYuanDenseV1Config model)
- **hunyuan_v1_moe** -- [HunYuanMoEV1ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/hunyuan_v1_moe#transformers.HunYuanMoEV1ForSequenceClassification) (HunYuanMoEV1Config model)
- **ibert** -- [IBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForSequenceClassification) (IBertConfig model)
- **jamba** -- [JambaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/jamba#transformers.JambaForSequenceClassification) (JambaConfig model)
- **jetmoe** -- [JetMoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/jetmoe#transformers.JetMoeForSequenceClassification) (JetMoeConfig model)
- **jina_embeddings_v3** -- [JinaEmbeddingsV3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForSequenceClassification) (JinaEmbeddingsV3Config model)
- **layoutlm** -- [LayoutLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForSequenceClassification) (LayoutLMConfig model)
- **layoutlmv2** -- [LayoutLMv2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForSequenceClassification) (LayoutLMv2Config model)
- **layoutlmv3** -- [LayoutLMv3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForSequenceClassification) (LayoutLMv3Config model)
- **lilt** -- [LiltForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltForSequenceClassification) (LiltConfig model)
- **llama** -- [LlamaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaForSequenceClassification) (LlamaConfig model)
- **longformer** -- [LongformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForSequenceClassification) (LongformerConfig model)
- **luke** -- [LukeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForSequenceClassification) (LukeConfig model)
- **markuplm** -- [MarkupLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMForSequenceClassification) (MarkupLMConfig model)
- **mbart** -- [MBartForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForSequenceClassification) (MBartConfig model)
- **megatron-bert** -- [MegatronBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForSequenceClassification) (MegatronBertConfig model)
- **minimax** -- [MiniMaxForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForSequenceClassification) (MiniMaxConfig model)
- **ministral** -- [MinistralForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForSequenceClassification) (MinistralConfig model)
- **ministral3** -- [Ministral3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForSequenceClassification) (Ministral3Config model)
- **mistral** -- [MistralForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForSequenceClassification) (MistralConfig model)
- **mistral4** -- [Mistral4ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForSequenceClassification) (Mistral4Config model)
- **mixtral** -- [MixtralForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForSequenceClassification) (MixtralConfig model)
- **mobilebert** -- [MobileBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForSequenceClassification) (MobileBertConfig model)
- **modernbert** -- [ModernBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForSequenceClassification) (ModernBertConfig model)
- **modernbert-decoder** -- [ModernBertDecoderForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/modernbert-decoder#transformers.ModernBertDecoderForSequenceClassification) (ModernBertDecoderConfig model)
- **modernvbert** -- [ModernVBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertForSequenceClassification) (ModernVBertConfig model)
- **mpnet** -- [MPNetForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForSequenceClassification) (MPNetConfig model)
- **mpt** -- [MptForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForSequenceClassification) (MptConfig model)
- **mra** -- [MraForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForSequenceClassification) (MraConfig model)
- **mt5** -- [MT5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5ForSequenceClassification) (MT5Config model)
- **mvp** -- [MvpForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForSequenceClassification) (MvpConfig model)
- **nemotron** -- [NemotronForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForSequenceClassification) (NemotronConfig model)
- **nomic_bert** -- [NomicBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertForSequenceClassification) (NomicBertConfig model)
- **nystromformer** -- [NystromformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForSequenceClassification) (NystromformerConfig model)
- **olmo** -- [OlmoForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/olmo#transformers.OlmoForSequenceClassification) (OlmoConfig model)
- **olmo2** -- [Olmo2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/olmo2#transformers.Olmo2ForSequenceClassification) (Olmo2Config model)
- **olmo3** -- [Olmo3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/olmo3#transformers.Olmo3ForSequenceClassification) (Olmo3Config model)
- **openai-gpt** -- [OpenAIGPTForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/openai-gpt#transformers.OpenAIGPTForSequenceClassification) (OpenAIGPTConfig model)
- **opt** -- [OPTForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTForSequenceClassification) (OPTConfig model)
- **perceiver** -- [PerceiverForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForSequenceClassification) (PerceiverConfig model)
- **persimmon** -- [PersimmonForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonForSequenceClassification) (PersimmonConfig model)
- **phi** -- [PhiForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiForSequenceClassification) (PhiConfig model)
- **phi3** -- [Phi3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3ForSequenceClassification) (Phi3Config model)
- **phimoe** -- [PhimoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/phimoe#transformers.PhimoeForSequenceClassification) (PhimoeConfig model)
- **plbart** -- [PLBartForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/plbart#transformers.PLBartForSequenceClassification) (PLBartConfig model)
- **qwen2** -- [Qwen2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForSequenceClassification) (Qwen2Config model)
- **qwen2_moe** -- [Qwen2MoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForSequenceClassification) (Qwen2MoeConfig model)
- **qwen3** -- [Qwen3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForSequenceClassification) (Qwen3Config model)
- **qwen3_5** -- [Qwen3_5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForSequenceClassification) (Qwen3_5Config model)
- **qwen3_5_text** -- [Qwen3_5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForSequenceClassification) (Qwen3_5TextConfig model)
- **qwen3_moe** -- [Qwen3MoeForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForSequenceClassification) (Qwen3MoeConfig model)
- **qwen3_next** -- [Qwen3NextForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForSequenceClassification) (Qwen3NextConfig model)
- **reformer** -- [ReformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerForSequenceClassification) (ReformerConfig model)
- **rembert** -- [RemBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForSequenceClassification) (RemBertConfig model)
- **roberta** -- [RobertaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForSequenceClassification) (RobertaConfig model)
- **roberta-prelayernorm** -- [RobertaPreLayerNormForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForSequenceClassification) (RobertaPreLayerNormConfig model)
- **roc_bert** -- [RoCBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForSequenceClassification) (RoCBertConfig model)
- **roformer** -- [RoFormerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForSequenceClassification) (RoFormerConfig model)
- **seed_oss** -- [SeedOssForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForSequenceClassification) (SeedOssConfig model)
- **smollm3** -- [SmolLM3ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForSequenceClassification) (SmolLM3Config model)
- **squeezebert** -- [SqueezeBertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForSequenceClassification) (SqueezeBertConfig model)
- **stablelm** -- [StableLmForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmForSequenceClassification) (StableLmConfig model)
- **starcoder2** -- [Starcoder2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2ForSequenceClassification) (Starcoder2Config model)
- **t5** -- [T5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForSequenceClassification) (T5Config model)
- **t5gemma** -- [T5GemmaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForSequenceClassification) (T5GemmaConfig model)
- **t5gemma2** -- [T5Gemma2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForSequenceClassification) (T5Gemma2Config model)
- **tapas** -- [TapasForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForSequenceClassification) (TapasConfig model)
- **umt5** -- [UMT5ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5ForSequenceClassification) (UMT5Config model)
- **xlm** -- [XLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMForSequenceClassification) (XLMConfig model)
- **xlm-roberta** -- [XLMRobertaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForSequenceClassification) (XLMRobertaConfig model)
- **xlm-roberta-xl** -- [XLMRobertaXLForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForSequenceClassification) (XLMRobertaXLConfig model)
- **xlnet** -- [XLNetForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetForSequenceClassification) (XLNetConfig model)
- **xmod** -- [XmodForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForSequenceClassification) (XmodConfig model)
- **yoso** -- [YosoForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForSequenceClassification) (YosoConfig model)
- **zamba** -- [ZambaForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/zamba#transformers.ZambaForSequenceClassification) (ZambaConfig model)
- **zamba2** -- [Zamba2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/zamba2#transformers.Zamba2ForSequenceClassification) (Zamba2Config model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForMultipleChoice[[transformers.AutoModelForMultipleChoice]]

#### transformers.AutoModelForMultipleChoice[[transformers.AutoModelForMultipleChoice]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2102)

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
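Since the class itself cannot be constructed, you go through one of the two factory methods instead. A minimal sketch with `from_config()` (using a deliberately tiny BERT config so the model is randomly initialized and nothing is downloaded):

```python
from transformers import AutoModelForMultipleChoice, BertConfig

# AutoModelForMultipleChoice() itself raises an error;
# use from_config() or from_pretrained() instead.
config = BertConfig(hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
                    intermediate_size=64)

# The config class selects the architecture: BertConfig -> BertForMultipleChoice.
model = AutoModelForMultipleChoice.from_config(config)
```

Unlike `from_pretrained()`, this does not load any weights; it only affects which architecture is instantiated.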

#### from_config[[transformers.AutoModelForMultipleChoice.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: [AlbertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertForMultipleChoice) (AlbertConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForMultipleChoice) (BertConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForMultipleChoice) (BigBirdConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForMultipleChoice) (CamembertConfig model)
  - [CanineConfig](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineForMultipleChoice) (CanineConfig model)
  - [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForMultipleChoice) (ConvBertConfig model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForMultipleChoice) (Data2VecTextConfig model)
  - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForMultipleChoice) (DebertaV2Config model)
  - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForMultipleChoice) (DistilBertConfig model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForMultipleChoice) (ElectraConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForMultipleChoice) (ErnieConfig model)
  - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForMultipleChoice) (FNetConfig model)
  - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertForMultipleChoice) (FlaubertConfig model)
  - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForMultipleChoice) (FunnelConfig model)
  - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForMultipleChoice) (IBertConfig model)
  - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForMultipleChoice) (LongformerConfig model)
  - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForMultipleChoice) (LukeConfig model)
  - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForMultipleChoice) (MPNetConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForMultipleChoice) (MegatronBertConfig model)
  - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForMultipleChoice) (MobileBertConfig model)
  - [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) configuration class: [ModernBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForMultipleChoice) (ModernBertConfig model)
  - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForMultipleChoice) (MraConfig model)
  - [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) configuration class: [NystromformerForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForMultipleChoice) (NystromformerConfig model)
  - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForMultipleChoice) (RemBertConfig model)
  - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForMultipleChoice) (RoCBertConfig model)
  - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForMultipleChoice) (RoFormerConfig model)
  - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForMultipleChoice) (RobertaConfig model)
  - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForMultipleChoice) (RobertaPreLayerNormConfig model)
  - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForMultipleChoice) (SqueezeBertConfig model)
  - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMForMultipleChoice) (XLMConfig model)
  - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForMultipleChoice) (XLMRobertaConfig model)
  - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForMultipleChoice) (XLMRobertaXLConfig model)
  - [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetForMultipleChoice) (XLNetConfig model)
  - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForMultipleChoice) (XmodConfig model)
  - [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) configuration class: [YosoForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForMultipleChoice) (YosoConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMultipleChoice.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class, following the configuration-to-model mapping listed above (for instance, a [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) yields a [BertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForMultipleChoice)).

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
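The default backend selection described above can be sketched as a small helper. This is a hypothetical simplification (the function name and arguments are illustrative, not the library's internals, which also check hardware and per-model support):

```python
# Hypothetical sketch of the attn_implementation fallback order described
# above; NOT the actual transformers selection logic.
def pick_attn_implementation(requested=None, torch_version=(2, 1, 1),
                             flash_attn_installed=False):
    """Return the attention backend that would be used."""
    if requested is not None:
        # An explicit request wins, but flash attention needs its package.
        if requested.startswith("flash_attention") and not flash_attn_installed:
            raise ValueError(f"{requested} requested but flash-attn is not installed")
        return requested
    # Default: SDPA for torch >= 2.1.1, otherwise the manual eager path.
    return "sdpa" if torch_version >= (2, 1, 1) else "eager"
```

For example, with no explicit request on torch 2.0, the helper falls back to `"eager"`.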
#### from_pretrained[[transformers.AutoModelForMultipleChoice.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a multiple choice head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [AlbertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertForMultipleChoice) (AlbertConfig model)
- **bert** -- [BertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForMultipleChoice) (BertConfig model)
- **big_bird** -- [BigBirdForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForMultipleChoice) (BigBirdConfig model)
- **camembert** -- [CamembertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForMultipleChoice) (CamembertConfig model)
- **canine** -- [CanineForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineForMultipleChoice) (CanineConfig model)
- **convbert** -- [ConvBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForMultipleChoice) (ConvBertConfig model)
- **data2vec-text** -- [Data2VecTextForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForMultipleChoice) (Data2VecTextConfig model)
- **deberta-v2** -- [DebertaV2ForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForMultipleChoice) (DebertaV2Config model)
- **distilbert** -- [DistilBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForMultipleChoice) (DistilBertConfig model)
- **electra** -- [ElectraForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForMultipleChoice) (ElectraConfig model)
- **ernie** -- [ErnieForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForMultipleChoice) (ErnieConfig model)
- **flaubert** -- [FlaubertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertForMultipleChoice) (FlaubertConfig model)
- **fnet** -- [FNetForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForMultipleChoice) (FNetConfig model)
- **funnel** -- [FunnelForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForMultipleChoice) (FunnelConfig model)
- **ibert** -- [IBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForMultipleChoice) (IBertConfig model)
- **longformer** -- [LongformerForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForMultipleChoice) (LongformerConfig model)
- **luke** -- [LukeForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForMultipleChoice) (LukeConfig model)
- **megatron-bert** -- [MegatronBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForMultipleChoice) (MegatronBertConfig model)
- **mobilebert** -- [MobileBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForMultipleChoice) (MobileBertConfig model)
- **modernbert** -- [ModernBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForMultipleChoice) (ModernBertConfig model)
- **mpnet** -- [MPNetForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForMultipleChoice) (MPNetConfig model)
- **mra** -- [MraForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForMultipleChoice) (MraConfig model)
- **nystromformer** -- [NystromformerForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForMultipleChoice) (NystromformerConfig model)
- **rembert** -- [RemBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForMultipleChoice) (RemBertConfig model)
- **roberta** -- [RobertaForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForMultipleChoice) (RobertaConfig model)
- **roberta-prelayernorm** -- [RobertaPreLayerNormForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForMultipleChoice) (RobertaPreLayerNormConfig model)
- **roc_bert** -- [RoCBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForMultipleChoice) (RoCBertConfig model)
- **roformer** -- [RoFormerForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForMultipleChoice) (RoFormerConfig model)
- **squeezebert** -- [SqueezeBertForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForMultipleChoice) (SqueezeBertConfig model)
- **xlm** -- [XLMForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMForMultipleChoice) (XLMConfig model)
- **xlm-roberta** -- [XLMRobertaForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForMultipleChoice) (XLMRobertaConfig model)
- **xlm-roberta-xl** -- [XLMRobertaXLForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForMultipleChoice) (XLMRobertaXLConfig model)
- **xlnet** -- [XLNetForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetForMultipleChoice) (XLNetConfig model)
- **xmod** -- [XmodForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForMultipleChoice) (XmodConfig model)
- **yoso** -- [YosoForMultipleChoice](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForMultipleChoice) (YosoConfig model)
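The two-step dispatch described above (an exact `model_type` lookup, then pattern matching on the name or path) can be sketched with a plain dictionary. The registry entries and function below are placeholders for illustration, not the library's actual mapping:

```python
# Toy registry: model_type key -> class name (placeholders, not the real map).
REGISTRY = {
    "bert": "BertForMultipleChoice",
    "roberta": "RobertaForMultipleChoice",
    "xlm-roberta": "XLMRobertaForMultipleChoice",
}

def resolve_model_class(model_type=None, name_or_path=""):
    # Step 1: the config's model_type, when present, decides directly.
    if model_type is not None:
        return REGISTRY[model_type]
    # Step 2: fall back to substring matching on the name/path; the longest
    # matching key wins, so "xlm-roberta" beats "roberta" and "bert".
    matches = [key for key in REGISTRY if key in name_or_path]
    if not matches:
        raise ValueError(f"Could not infer model type from {name_or_path!r}")
    return REGISTRY[max(matches, key=len)]
```

With this sketch, `resolve_model_class(name_or_path="google-bert/bert-base-cased")` picks the BERT class, while an `xlm-roberta` path resolves to the XLM-RoBERTa class even though `"bert"` and `"roberta"` also occur in the name.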

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```
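For intuition, a multiple-choice head scores each (example, choice) pair and returns one logit per choice, so predictions are an argmax over the choice dimension. The toy sketch below mimics that shape handling with plain lists and a stand-in scoring function; nothing here is the real transformers API:

```python
# Toy sketch of multiple-choice output shapes, using lists instead of tensors.
def multiple_choice_logits(batch_of_choices, score_fn):
    """batch_of_choices: list of examples, each a list of choice strings.

    Returns one score per choice, shaped (batch, num_choices), as if the
    encoder had been run on the flattened (batch * num_choices) input.
    """
    return [[score_fn(choice) for choice in choices]
            for choices in batch_of_choices]

def predict(logits):
    # argmax over the choice dimension: index of the best-scoring choice.
    return [row.index(max(row)) for row in logits]
```

A real model replaces `score_fn` with an encoder plus a classification head, but the reshaping from per-pair scores back to `(batch, num_choices)` logits works the same way.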

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (`dict[str, torch.Tensor]`, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForNextSentencePrediction[[transformers.AutoModelForNextSentencePrediction]]

#### transformers.AutoModelForNextSentencePrediction[[transformers.AutoModelForNextSentencePrediction]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2109)

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForNextSentencePrediction.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForNextSentencePrediction) (BertConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForNextSentencePrediction) (ErnieConfig model)
  - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForNextSentencePrediction) (FNetConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForNextSentencePrediction) (MegatronBertConfig model)
  - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForNextSentencePrediction) (MobileBertConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.
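Because `from_config()` only needs a configuration object, the dispatch can also be sketched entirely offline by constructing a config by hand (the tiny sizes below are hypothetical, chosen only to keep instantiation fast):

```python
from transformers import BertConfig, AutoModelForNextSentencePrediction

# A hand-built config: nothing is downloaded, and no pretrained weights are loaded.
config = BertConfig(
    hidden_size=32, num_hidden_layers=2, num_attention_heads=2, intermediate_size=64
)

# Dispatch is by configuration class: BertConfig -> BertForNextSentencePrediction.
model = AutoModelForNextSentencePrediction.from_config(config)
print(type(model).__name__)  # BertForNextSentencePrediction
```

The resulting model is randomly initialized; it only becomes useful after training or after loading weights with `from_pretrained()`.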

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForNextSentencePrediction.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForNextSentencePrediction) (BertConfig model) - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForNextSentencePrediction) (ErnieConfig model) - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForNextSentencePrediction) (FNetConfig model) - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForNextSentencePrediction) (MegatronBertConfig model) - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForNextSentencePrediction) (MobileBertConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForNextSentencePrediction.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **bert** -- [BertForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForNextSentencePrediction) (BertConfig model)
- **ernie** -- [ErnieForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForNextSentencePrediction) (ErnieConfig model)
- **fnet** -- [FNetForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForNextSentencePrediction) (FNetConfig model)
- **megatron-bert** -- [MegatronBertForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForNextSentencePrediction) (MegatronBertConfig model)
- **mobilebert** -- [MobileBertForNextSentencePrediction](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForNextSentencePrediction) (MobileBertConfig model)
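The `model_type` key that drives this dispatch lives on the configuration class itself and can be inspected without downloading anything:

```python
from transformers import BertConfig, MobileBertConfig

# Each configuration class carries the `model_type` string used as the dispatch key.
print(BertConfig().model_type)        # bert
print(MobileBertConfig().model_type)  # mobilebert
```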

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it being loaded) and initiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForTokenClassification[[transformers.AutoModelForTokenClassification]]

#### transformers.AutoModelForTokenClassification[[transformers.AutoModelForTokenClassification]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2095)

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
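The guard against direct construction can be observed without any download; the `EnvironmentError` caught below is what the auto factory raises in current releases, but treat the exact type as illustrative:

```python
from transformers import AutoModelForTokenClassification

# Direct construction is blocked; use from_pretrained() or from_config() instead.
try:
    AutoModelForTokenClassification()
except EnvironmentError as err:  # raised by the auto factory's __init__
    print("construction refused:", err)
```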

#### from_config[[transformers.AutoModelForTokenClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForTokenClassification` (AlbertConfig model)
  - [ApertusConfig](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusConfig) configuration class: [ApertusForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusForTokenClassification) (ApertusConfig model)
  - [ArceeConfig](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeConfig) configuration class: [ArceeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForTokenClassification) (ArceeConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForTokenClassification) (BertConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForTokenClassification) (BigBirdConfig model)
  - [BioGptConfig](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptForTokenClassification) (BioGptConfig model)
  - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForTokenClassification) (BloomConfig model)
  - [BrosConfig](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosConfig) configuration class: [BrosForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosForTokenClassification) (BrosConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForTokenClassification) (CamembertConfig model)
  - [CanineConfig](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineForTokenClassification) (CanineConfig model)
  - [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForTokenClassification) (ConvBertConfig model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForTokenClassification) (Data2VecTextConfig model)
  - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForTokenClassification) (DebertaConfig model)
  - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForTokenClassification) (DebertaV2Config model)
  - [DeepseekV3Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: [DeepseekV3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3ForTokenClassification) (DeepseekV3Config model)
  - [DiffLlamaConfig](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaConfig) configuration class: [DiffLlamaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForTokenClassification) (DiffLlamaConfig model)
  - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForTokenClassification) (DistilBertConfig model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForTokenClassification) (ElectraConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForTokenClassification) (ErnieConfig model)
  - [EsmConfig](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmConfig) configuration class: [EsmForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmForTokenClassification) (EsmConfig model)
  - [EuroBertConfig](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertConfig) configuration class: [EuroBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertForTokenClassification) (EuroBertConfig model)
  - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForTokenClassification) (Exaone4Config model)
  - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForTokenClassification) (FNetConfig model)
  - [FalconConfig](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconConfig) configuration class: [FalconForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForTokenClassification) (FalconConfig model)
  - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertForTokenClassification) (FlaubertConfig model)
  - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForTokenClassification) (FunnelConfig model)
  - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2ForTokenClassification) (GPT2Config model)
  - [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) configuration class: [GPTBigCodeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForTokenClassification) (GPTBigCodeConfig model)
  - [GPTNeoConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoConfig) configuration class: [GPTNeoForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForTokenClassification) (GPTNeoConfig model)
  - [GPTNeoXConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXConfig) configuration class: [GPTNeoXForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForTokenClassification) (GPTNeoXConfig model)
  - [Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2ForTokenClassification) (Gemma2Config model)
  - [GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaForTokenClassification) (GemmaConfig model)
  - [Glm4Config](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Config) configuration class: [Glm4ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4ForTokenClassification) (Glm4Config model)
  - [GlmConfig](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmConfig) configuration class: [GlmForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmForTokenClassification) (GlmConfig model)
  - [GptOssConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssConfig) configuration class: [GptOssForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssForTokenClassification) (GptOssConfig model)
  - [HeliumConfig](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumConfig) configuration class: [HeliumForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumForTokenClassification) (HeliumConfig model)
  - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForTokenClassification) (IBertConfig model)
  - [JinaEmbeddingsV3Config](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Config) configuration class: [JinaEmbeddingsV3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForTokenClassification) (JinaEmbeddingsV3Config model)
  - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForTokenClassification) (LayoutLMConfig model)
  - [LayoutLMv2Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Config) configuration class: [LayoutLMv2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForTokenClassification) (LayoutLMv2Config model)
  - [LayoutLMv3Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Config) configuration class: [LayoutLMv3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForTokenClassification) (LayoutLMv3Config model)
  - [LiltConfig](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltConfig) configuration class: [LiltForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltForTokenClassification) (LiltConfig model)
  - [LlamaConfig](/docs/transformers/v5.8.0/en/model_doc/llama#transformers.LlamaConfig) configuration class: [LlamaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/llama#transformers.LlamaForTokenClassification) (LlamaConfig model)
  - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForTokenClassification) (LongformerConfig model)
  - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForTokenClassification) (LukeConfig model)
  - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForTokenClassification) (MPNetConfig model)
  - [MT5Config](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Config) configuration class: [MT5ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5ForTokenClassification) (MT5Config model)
  - [MarkupLMConfig](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMConfig) configuration class: [MarkupLMForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMForTokenClassification) (MarkupLMConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForTokenClassification) (MegatronBertConfig model)
  - [MiniMaxConfig](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxConfig) configuration class: [MiniMaxForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForTokenClassification) (MiniMaxConfig model)
  - [Ministral3Config](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Config) configuration class: [Ministral3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForTokenClassification) (Ministral3Config model)
  - [MinistralConfig](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralConfig) configuration class: [MinistralForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForTokenClassification) (MinistralConfig model)
  - [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: [Mistral4ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForTokenClassification) (Mistral4Config model)
  - [MistralConfig](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForTokenClassification) (MistralConfig model)
  - [MixtralConfig](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralConfig) configuration class: [MixtralForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForTokenClassification) (MixtralConfig model)
  - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForTokenClassification) (MobileBertConfig model)
  - [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) configuration class: [ModernBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForTokenClassification) (ModernBertConfig model)
  - [ModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertConfig) configuration class: [ModernVBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertForTokenClassification) (ModernVBertConfig model)
  - [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForTokenClassification) (MptConfig model)
  - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForTokenClassification) (MraConfig model)
  - [NemotronConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronConfig) configuration class: [NemotronForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForTokenClassification) (NemotronConfig model)
  - [NomicBertConfig](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertConfig) configuration class: [NomicBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertForTokenClassification) (NomicBertConfig model)
  - [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) configuration class: [NystromformerForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForTokenClassification) (NystromformerConfig model)
  - [OpenAIPrivacyFilterConfig](/docs/transformers/v5.8.0/en/model_doc/openai_privacy_filter#transformers.OpenAIPrivacyFilterConfig) configuration class: [OpenAIPrivacyFilterForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/openai_privacy_filter#transformers.OpenAIPrivacyFilterForTokenClassification) (OpenAIPrivacyFilterConfig model)
  - [PersimmonConfig](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonConfig) configuration class: [PersimmonForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonForTokenClassification) (PersimmonConfig model)
  - [Phi3Config](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Config) configuration class: [Phi3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3ForTokenClassification) (Phi3Config model)
  - [PhiConfig](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiConfig) configuration class: [PhiForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiForTokenClassification) (PhiConfig model)
  - [Qwen2Config](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Config) configuration class: [Qwen2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForTokenClassification) (Qwen2Config model)
  - [Qwen2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeConfig) configuration class: [Qwen2MoeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForTokenClassification) (Qwen2MoeConfig model)
  - [Qwen3Config](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Config) configuration class: [Qwen3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForTokenClassification) (Qwen3Config model)
  - [Qwen3MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeConfig) configuration class: [Qwen3MoeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForTokenClassification) (Qwen3MoeConfig model)
  - [Qwen3NextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextConfig) configuration class: [Qwen3NextForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForTokenClassification) (Qwen3NextConfig model)
  - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForTokenClassification) (RemBertConfig model)
  - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForTokenClassification) (RoCBertConfig model)
  - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForTokenClassification) (RoFormerConfig model)
  - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForTokenClassification) (RobertaConfig model)
  - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForTokenClassification) (RobertaPreLayerNormConfig model)
  - [SeedOssConfig](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssConfig) configuration class: [SeedOssForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForTokenClassification) (SeedOssConfig model)
  - [SmolLM3Config](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Config) configuration class: [SmolLM3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForTokenClassification) (SmolLM3Config model)
  - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForTokenClassification) (SqueezeBertConfig model)
  - [StableLmConfig](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmConfig) configuration class: [StableLmForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmForTokenClassification) (StableLmConfig model)
  - [Starcoder2Config](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Config) configuration class: [Starcoder2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2ForTokenClassification) (Starcoder2Config model)
  - [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) configuration class: [T5ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForTokenClassification) (T5Config model)
  - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForTokenClassification) (T5Gemma2Config model)
  - [T5GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaConfig) configuration class: [T5GemmaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForTokenClassification) (T5GemmaConfig model)
  - [UMT5Config](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Config) configuration class: [UMT5ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5ForTokenClassification) (UMT5Config model)
  - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMForTokenClassification) (XLMConfig model)
  - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForTokenClassification) (XLMRobertaConfig model)
  - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForTokenClassification) (XLMRobertaXLConfig model)
  - [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetForTokenClassification) (XLNetConfig model)
  - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForTokenClassification) (XmodConfig model)
  - [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) configuration class: [YosoForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForTokenClassification) (YosoConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForTokenClassification.from_config(config)
```
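The dispatch performed by `from_config()` can be pictured as a dictionary lookup from the configuration's class to the matching model class. The sketch below uses stand-in classes and a stand-in registry name to illustrate the idea; it is not the library's actual implementation:

```python
# Illustrative sketch of config-to-model dispatch. The classes and the
# registry below are stand-ins, not the real transformers internals.


class BertConfig:
    model_type = "bert"


class BertForTokenClassification:
    def __init__(self, config):
        self.config = config


# A registry mapping configuration classes to model classes, analogous
# to the mapping table documented above.
TOKEN_CLASSIFICATION_MAPPING = {
    BertConfig: BertForTokenClassification,
}


def from_config(config):
    # Select the model class based on the *type* of the config object.
    try:
        model_class = TOKEN_CLASSIFICATION_MAPPING[type(config)]
    except KeyError:
        raise ValueError(f"Unrecognized configuration class {type(config)!r}")
    # Instantiating from a config creates the architecture with fresh
    # (random) weights; no pretrained weights are loaded.
    return model_class(config)


model = from_config(BertConfig())
print(type(model).__name__)  # BertForTokenClassification
```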

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForTokenClassification` (AlbertConfig model) - [ApertusConfig](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusConfig) configuration class: [ApertusForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusForTokenClassification) (ApertusConfig model) - [ArceeConfig](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeConfig) configuration class: [ArceeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForTokenClassification) (ArceeConfig model) - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForTokenClassification) (BertConfig model) - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForTokenClassification) (BigBirdConfig model) - [BioGptConfig](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptConfig) configuration class: [BioGptForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptForTokenClassification) (BioGptConfig model) - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForTokenClassification) (BloomConfig model) - [BrosConfig](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosConfig) configuration class: 
[BrosForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosForTokenClassification) (BrosConfig model) - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForTokenClassification) (CamembertConfig model) - [CanineConfig](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineForTokenClassification) (CanineConfig model) - [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForTokenClassification) (ConvBertConfig model) - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForTokenClassification) (Data2VecTextConfig model) - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForTokenClassification) (DebertaConfig model) - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForTokenClassification) (DebertaV2Config model) - [DeepseekV3Config](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3Config) configuration class: 
[DeepseekV3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3ForTokenClassification) (DeepseekV3Config model) - [DiffLlamaConfig](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaConfig) configuration class: [DiffLlamaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForTokenClassification) (DiffLlamaConfig model) - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForTokenClassification) (DistilBertConfig model) - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForTokenClassification) (ElectraConfig model) - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForTokenClassification) (ErnieConfig model) - [EsmConfig](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmConfig) configuration class: [EsmForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmForTokenClassification) (EsmConfig model) - [EuroBertConfig](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertConfig) configuration class: [EuroBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertForTokenClassification) (EuroBertConfig model) - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForTokenClassification) (Exaone4Config model) - 
[FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForTokenClassification) (FNetConfig model) - [FalconConfig](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconConfig) configuration class: [FalconForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForTokenClassification) (FalconConfig model) - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertForTokenClassification) (FlaubertConfig model) - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForTokenClassification) (FunnelConfig model) - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2ForTokenClassification) (GPT2Config model) - [GPTBigCodeConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeConfig) configuration class: [GPTBigCodeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForTokenClassification) (GPTBigCodeConfig model) - [GPTNeoConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoConfig) configuration class: [GPTNeoForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForTokenClassification) (GPTNeoConfig model) - [GPTNeoXConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXConfig) configuration class: 
[GPTNeoXForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForTokenClassification) (GPTNeoXConfig model) - [Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2Config) configuration class: [Gemma2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2ForTokenClassification) (Gemma2Config model) - [GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaConfig) configuration class: [GemmaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaForTokenClassification) (GemmaConfig model) - [Glm4Config](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4Config) configuration class: [Glm4ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4ForTokenClassification) (Glm4Config model) - [GlmConfig](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmConfig) configuration class: [GlmForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmForTokenClassification) (GlmConfig model) - [GptOssConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssConfig) configuration class: [GptOssForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssForTokenClassification) (GptOssConfig model) - [HeliumConfig](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumConfig) configuration class: [HeliumForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumForTokenClassification) (HeliumConfig model) - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForTokenClassification) (IBertConfig model) - [JinaEmbeddingsV3Config](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Config) configuration 
class: [JinaEmbeddingsV3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForTokenClassification) (JinaEmbeddingsV3Config model) - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForTokenClassification) (LayoutLMConfig model) - [LayoutLMv2Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Config) configuration class: [LayoutLMv2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForTokenClassification) (LayoutLMv2Config model) - [LayoutLMv3Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Config) configuration class: [LayoutLMv3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForTokenClassification) (LayoutLMv3Config model) - [LiltConfig](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltConfig) configuration class: [LiltForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltForTokenClassification) (LiltConfig model) - [LlamaConfig](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/llama#transformers.LlamaForTokenClassification) (LlamaConfig model) - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForTokenClassification) (LongformerConfig model) - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForTokenClassification) 
(LukeConfig model) - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForTokenClassification) (MPNetConfig model) - [MT5Config](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Config) configuration class: [MT5ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5ForTokenClassification) (MT5Config model) - [MarkupLMConfig](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMConfig) configuration class: [MarkupLMForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMForTokenClassification) (MarkupLMConfig model) - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForTokenClassification) (MegatronBertConfig model) - [MiniMaxConfig](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxConfig) configuration class: [MiniMaxForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForTokenClassification) (MiniMaxConfig model) - [Ministral3Config](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Config) configuration class: [Ministral3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForTokenClassification) (Ministral3Config model) - [MinistralConfig](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralConfig) configuration class: [MinistralForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForTokenClassification) (MinistralConfig model) - [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: 
[Mistral4ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForTokenClassification) (Mistral4Config model) - [MistralConfig](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForTokenClassification) (MistralConfig model) - [MixtralConfig](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralConfig) configuration class: [MixtralForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForTokenClassification) (MixtralConfig model) - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForTokenClassification) (MobileBertConfig model) - [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) configuration class: [ModernBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForTokenClassification) (ModernBertConfig model) - [ModernVBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertConfig) configuration class: [ModernVBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertForTokenClassification) (ModernVBertConfig model) - [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForTokenClassification) (MptConfig model) - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForTokenClassification) (MraConfig model) - 
[NemotronConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronConfig) configuration class: [NemotronForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForTokenClassification) (NemotronConfig model) - [NomicBertConfig](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertConfig) configuration class: [NomicBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertForTokenClassification) (NomicBertConfig model) - [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) configuration class: [NystromformerForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForTokenClassification) (NystromformerConfig model) - [OpenAIPrivacyFilterConfig](/docs/transformers/v5.8.0/en/model_doc/openai_privacy_filter#transformers.OpenAIPrivacyFilterConfig) configuration class: [OpenAIPrivacyFilterForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/openai_privacy_filter#transformers.OpenAIPrivacyFilterForTokenClassification) (OpenAIPrivacyFilterConfig model) - [PersimmonConfig](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonConfig) configuration class: [PersimmonForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonForTokenClassification) (PersimmonConfig model) - [Phi3Config](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3Config) configuration class: [Phi3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3ForTokenClassification) (Phi3Config model) - [PhiConfig](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiConfig) configuration class: [PhiForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiForTokenClassification) (PhiConfig model) - 
[Qwen2Config](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Config) configuration class: [Qwen2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForTokenClassification) (Qwen2Config model) - [Qwen2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeConfig) configuration class: [Qwen2MoeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForTokenClassification) (Qwen2MoeConfig model) - [Qwen3Config](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Config) configuration class: [Qwen3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForTokenClassification) (Qwen3Config model) - [Qwen3MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeConfig) configuration class: [Qwen3MoeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForTokenClassification) (Qwen3MoeConfig model) - [Qwen3NextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextConfig) configuration class: [Qwen3NextForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForTokenClassification) (Qwen3NextConfig model) - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForTokenClassification) (RemBertConfig model) - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForTokenClassification) (RoCBertConfig model) - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: 
[RoFormerForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForTokenClassification) (RoFormerConfig model) - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForTokenClassification) (RobertaConfig model) - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForTokenClassification) (RobertaPreLayerNormConfig model) - [SeedOssConfig](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssConfig) configuration class: [SeedOssForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForTokenClassification) (SeedOssConfig model) - [SmolLM3Config](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Config) configuration class: [SmolLM3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForTokenClassification) (SmolLM3Config model) - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForTokenClassification) (SqueezeBertConfig model) - [StableLmConfig](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmConfig) configuration class: [StableLmForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmForTokenClassification) (StableLmConfig model) - [Starcoder2Config](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2Config) configuration class: 
[Starcoder2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2ForTokenClassification) (Starcoder2Config model) - [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) configuration class: [T5ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForTokenClassification) (T5Config model) - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForTokenClassification) (T5Gemma2Config model) - [T5GemmaConfig](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaConfig) configuration class: [T5GemmaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForTokenClassification) (T5GemmaConfig model) - [UMT5Config](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Config) configuration class: [UMT5ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5ForTokenClassification) (UMT5Config model) - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMForTokenClassification) (XLMConfig model) - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForTokenClassification) (XLMRobertaConfig model) - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForTokenClassification) (XLMRobertaXLConfig model) - 
[XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetForTokenClassification) (XLNetConfig model) - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForTokenClassification) (XmodConfig model) - [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) configuration class: [YosoForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForTokenClassification) (YosoConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
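The default described for `attn_implementation` (SDPA on torch>=2.1.1 when available, otherwise eager) amounts to a version check. The sketch below illustrates that rule with a minimal hand-rolled `parse_version` helper; both functions are hypothetical stand-ins, not the library's actual selection logic:

```python
def parse_version(version: str) -> tuple:
    # Minimal stand-in: turn "2.1.1" into (2, 1, 1). Real code would use
    # packaging.version.parse, which also handles pre-release suffixes.
    return tuple(int(part) for part in version.split(".")[:3])


def default_attn_implementation(torch_version: str, sdpa_available: bool = True) -> str:
    # Hypothetical sketch of the documented default: SDPA when available
    # on torch >= 2.1.1, otherwise the manual "eager" implementation.
    if sdpa_available and parse_version(torch_version) >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.4.0"))  # sdpa
print(default_attn_implementation("2.0.1"))  # eager
```

Passing an explicit `attn_implementation="flash_attention_2"` (or any other supported value) to `from_config()` or `from_pretrained()` overrides this default.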
#### from_pretrained[[transformers.AutoModelForTokenClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- `AlbertForTokenClassification` (AlbertConfig model)
- **apertus** -- [ApertusForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/apertus#transformers.ApertusForTokenClassification) (ApertusConfig model)
- **arcee** -- [ArceeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForTokenClassification) (ArceeConfig model)
- **bert** -- [BertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForTokenClassification) (BertConfig model)
- **big_bird** -- [BigBirdForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForTokenClassification) (BigBirdConfig model)
- **biogpt** -- [BioGptForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/biogpt#transformers.BioGptForTokenClassification) (BioGptConfig model)
- **bloom** -- [BloomForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForTokenClassification) (BloomConfig model)
- **bros** -- [BrosForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/bros#transformers.BrosForTokenClassification) (BrosConfig model)
- **camembert** -- [CamembertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForTokenClassification) (CamembertConfig model)
- **canine** -- [CanineForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineForTokenClassification) (CanineConfig model)
- **convbert** -- [ConvBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForTokenClassification) (ConvBertConfig model)
- **data2vec-text** -- [Data2VecTextForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForTokenClassification) (Data2VecTextConfig model)
- **deberta** -- [DebertaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForTokenClassification) (DebertaConfig model)
- **deberta-v2** -- [DebertaV2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForTokenClassification) (DebertaV2Config model)
- **deepseek_v3** -- [DeepseekV3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/deepseek_v3#transformers.DeepseekV3ForTokenClassification) (DeepseekV3Config model)
- **diffllama** -- [DiffLlamaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForTokenClassification) (DiffLlamaConfig model)
- **distilbert** -- [DistilBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForTokenClassification) (DistilBertConfig model)
- **electra** -- [ElectraForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForTokenClassification) (ElectraConfig model)
- **ernie** -- [ErnieForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForTokenClassification) (ErnieConfig model)
- **esm** -- [EsmForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/esm#transformers.EsmForTokenClassification) (EsmConfig model)
- **eurobert** -- [EuroBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/eurobert#transformers.EuroBertForTokenClassification) (EuroBertConfig model)
- **exaone4** -- [Exaone4ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForTokenClassification) (Exaone4Config model)
- **falcon** -- [FalconForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForTokenClassification) (FalconConfig model)
- **flaubert** -- [FlaubertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertForTokenClassification) (FlaubertConfig model)
- **fnet** -- [FNetForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForTokenClassification) (FNetConfig model)
- **funnel** -- [FunnelForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForTokenClassification) (FunnelConfig model)
- **gemma** -- [GemmaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gemma#transformers.GemmaForTokenClassification) (GemmaConfig model)
- **gemma2** -- [Gemma2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gemma2#transformers.Gemma2ForTokenClassification) (Gemma2Config model)
- **glm** -- [GlmForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/glm#transformers.GlmForTokenClassification) (GlmConfig model)
- **glm4** -- [Glm4ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/glm4#transformers.Glm4ForTokenClassification) (Glm4Config model)
- **gpt-sw3** -- [GPT2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2ForTokenClassification) (GPT2Config model)
- **gpt2** -- [GPT2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2ForTokenClassification) (GPT2Config model)
- **gpt_bigcode** -- [GPTBigCodeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_bigcode#transformers.GPTBigCodeForTokenClassification) (GPTBigCodeConfig model)
- **gpt_neo** -- [GPTNeoForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForTokenClassification) (GPTNeoConfig model)
- **gpt_neox** -- [GPTNeoXForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForTokenClassification) (GPTNeoXConfig model)
- **gpt_oss** -- [GptOssForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/gpt_oss#transformers.GptOssForTokenClassification) (GptOssConfig model)
- **helium** -- [HeliumForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/helium#transformers.HeliumForTokenClassification) (HeliumConfig model)
- **ibert** -- [IBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForTokenClassification) (IBertConfig model)
- **jina_embeddings_v3** -- [JinaEmbeddingsV3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForTokenClassification) (JinaEmbeddingsV3Config model)
- **layoutlm** -- [LayoutLMForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForTokenClassification) (LayoutLMConfig model)
- **layoutlmv2** -- [LayoutLMv2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForTokenClassification) (LayoutLMv2Config model)
- **layoutlmv3** -- [LayoutLMv3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForTokenClassification) (LayoutLMv3Config model)
- **lilt** -- [LiltForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltForTokenClassification) (LiltConfig model)
- **llama** -- [LlamaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/llama#transformers.LlamaForTokenClassification) (LlamaConfig model)
- **longformer** -- [LongformerForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForTokenClassification) (LongformerConfig model)
- **luke** -- [LukeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForTokenClassification) (LukeConfig model)
- **markuplm** -- [MarkupLMForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMForTokenClassification) (MarkupLMConfig model)
- **megatron-bert** -- [MegatronBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForTokenClassification) (MegatronBertConfig model)
- **minimax** -- [MiniMaxForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForTokenClassification) (MiniMaxConfig model)
- **ministral** -- [MinistralForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForTokenClassification) (MinistralConfig model)
- **ministral3** -- [Ministral3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForTokenClassification) (Ministral3Config model)
- **mistral** -- [MistralForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForTokenClassification) (MistralConfig model)
- **mistral4** -- [Mistral4ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForTokenClassification) (Mistral4Config model)
- **mixtral** -- [MixtralForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForTokenClassification) (MixtralConfig model)
- **mobilebert** -- [MobileBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForTokenClassification) (MobileBertConfig model)
- **modernbert** -- [ModernBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForTokenClassification) (ModernBertConfig model)
- **modernvbert** -- [ModernVBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/modernvbert#transformers.ModernVBertForTokenClassification) (ModernVBertConfig model)
- **mpnet** -- [MPNetForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForTokenClassification) (MPNetConfig model)
- **mpt** -- [MptForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForTokenClassification) (MptConfig model)
- **mra** -- [MraForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForTokenClassification) (MraConfig model)
- **mt5** -- [MT5ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5ForTokenClassification) (MT5Config model)
- **nemotron** -- [NemotronForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForTokenClassification) (NemotronConfig model)
- **nomic_bert** -- [NomicBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/nomic_bert#transformers.NomicBertForTokenClassification) (NomicBertConfig model)
- **nystromformer** -- [NystromformerForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForTokenClassification) (NystromformerConfig model)
- **persimmon** -- [PersimmonForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/persimmon#transformers.PersimmonForTokenClassification) (PersimmonConfig model)
- **phi** -- [PhiForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/phi#transformers.PhiForTokenClassification) (PhiConfig model)
- **phi3** -- [Phi3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/phi3#transformers.Phi3ForTokenClassification) (Phi3Config model)
- **qwen2** -- [Qwen2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForTokenClassification) (Qwen2Config model)
- **qwen2_moe** -- [Qwen2MoeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForTokenClassification) (Qwen2MoeConfig model)
- **qwen3** -- [Qwen3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForTokenClassification) (Qwen3Config model)
- **qwen3_moe** -- [Qwen3MoeForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForTokenClassification) (Qwen3MoeConfig model)
- **qwen3_next** -- [Qwen3NextForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForTokenClassification) (Qwen3NextConfig model)
- **rembert** -- [RemBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForTokenClassification) (RemBertConfig model)
- **roberta** -- [RobertaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForTokenClassification) (RobertaConfig model)
- **roberta-prelayernorm** -- [RobertaPreLayerNormForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForTokenClassification) (RobertaPreLayerNormConfig model)
- **roc_bert** -- [RoCBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForTokenClassification) (RoCBertConfig model)
- **roformer** -- [RoFormerForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForTokenClassification) (RoFormerConfig model)
- **seed_oss** -- [SeedOssForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForTokenClassification) (SeedOssConfig model)
- **smollm3** -- [SmolLM3ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForTokenClassification) (SmolLM3Config model)
- **squeezebert** -- [SqueezeBertForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForTokenClassification) (SqueezeBertConfig model)
- **stablelm** -- [StableLmForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/stablelm#transformers.StableLmForTokenClassification) (StableLmConfig model)
- **starcoder2** -- [Starcoder2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/starcoder2#transformers.Starcoder2ForTokenClassification) (Starcoder2Config model)
- **t5** -- [T5ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForTokenClassification) (T5Config model)
- **t5gemma** -- [T5GemmaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma#transformers.T5GemmaForTokenClassification) (T5GemmaConfig model)
- **t5gemma2** -- [T5Gemma2ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForTokenClassification) (T5Gemma2Config model)
- **umt5** -- [UMT5ForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5ForTokenClassification) (UMT5Config model)
- **xlm** -- [XLMForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMForTokenClassification) (XLMConfig model)
- **xlm-roberta** -- [XLMRobertaForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForTokenClassification) (XLMRobertaConfig model)
- **xlm-roberta-xl** -- [XLMRobertaXLForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForTokenClassification) (XLMRobertaXLConfig model)
- **xlnet** -- [XLNetForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetForTokenClassification) (XLNetConfig model)
- **xmod** -- [XmodForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForTokenClassification) (XmodConfig model)
- **yoso** -- [YosoForTokenClassification](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForTokenClassification) (YosoConfig model)
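
The selection logic described above (prefer the config's `model_type`, then fall back to pattern matching on the name or path) can be sketched as a toy in plain Python. This is an illustrative simplification, not the library's actual implementation, and the mapping below covers only three of the entries listed above:

```python
# Toy sketch of how an Auto class picks a concrete model class.
# The real mapping covers every architecture in the list above.
MODEL_MAPPING = {
    "bert": "BertForTokenClassification",
    "roberta": "RobertaForTokenClassification",
    "xlm-roberta": "XLMRobertaForTokenClassification",
}


def resolve_model_class(model_type=None, name_or_path=""):
    # 1) Prefer the `model_type` recorded in the config.
    if model_type in MODEL_MAPPING:
        return MODEL_MAPPING[model_type]
    # 2) Fall back to pattern matching on the name/path, trying the
    #    longest keys first so "xlm-roberta" wins over "roberta".
    for key in sorted(MODEL_MAPPING, key=len, reverse=True):
        if key in name_or_path.lower():
            return MODEL_MAPPING[key]
    raise ValueError(f"Unrecognized model: {name_or_path}")


print(resolve_model_class(model_type="bert"))
# BertForTokenClassification
print(resolve_model_class(name_or_path="FacebookAI/xlm-roberta-base"))
# XLMRobertaForTokenClassification
```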

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```
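
A token classification model produces one logit vector per input token; predictions are obtained by taking the argmax over the label dimension and mapping the resulting ids through `model.config.id2label`. The sketch below illustrates only that final step with made-up logits and label names (no model is loaded):

```python
# Sketch: turning per-token logits into string labels.
# The label set and logits here are fabricated for illustration;
# a real model exposes the mapping as `model.config.id2label`.
id2label = {0: "O", 1: "B-PER", 2: "I-PER"}

# One row of fake logits per token (3 tokens x 3 labels).
logits = [
    [2.1, 0.3, -1.0],  # highest score at index 0 -> "O"
    [0.1, 3.2, 0.4],   # highest score at index 1 -> "B-PER"
    [-0.5, 0.2, 2.8],  # highest score at index 2 -> "I-PER"
]

labels = [id2label[max(range(len(row)), key=row.__getitem__)] for row in logits]
print(labels)  # ['O', 'B-PER', 'I-PER']
```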

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:

  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
  - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:

  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
  - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory.
  - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:

  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
  - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
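
The two `kwargs` behaviors can be illustrated with a small standalone sketch. `ToyConfig` and `ToyModel` are hypothetical stand-ins; the real routing happens inside `from_pretrained()`:

```python
class ToyConfig:
    """Stand-in for a PreTrainedConfig with two attributes."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 8


class ToyModel:
    """Stand-in for a model: records its config and extra init kwargs."""

    def __init__(self, config, **model_kwargs):
        self.config = config
        self.model_kwargs = model_kwargs


def toy_from_pretrained(config=None, **kwargs):
    if config is not None:
        # A config was supplied: all kwargs go straight to the model.
        return ToyModel(config, **kwargs)
    # No config: kwargs matching config attributes override the config;
    # the rest are forwarded to the model's __init__.
    config = ToyConfig()
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            model_kwargs[key] = value
    return ToyModel(config, **model_kwargs)


model = toy_from_pretrained(output_attentions=True, foo=1)
print(model.config.output_attentions)  # True  (config attribute overridden)
print(model.model_kwargs)              # {'foo': 1}  (forwarded to __init__)
```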

### AutoModelForQuestionAnswering[[transformers.AutoModelForQuestionAnswering]]

#### transformers.AutoModelForQuestionAnswering[[transformers.AutoModelForQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2055)

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
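
This guard can be mimicked with a minimal sketch (hypothetical `Toy*` classes, not the library code): the auto class raises from `__init__` and instead builds the concrete class looked up from the config:

```python
class ToyConcreteModel:
    """Stand-in for a concrete architecture class."""

    def __init__(self, config):
        self.config = config


class ToyAutoModel:
    """Stand-in for an Auto class: direct instantiation is an error."""

    def __init__(self):
        raise EnvironmentError(
            "ToyAutoModel is designed to be instantiated using the "
            "`from_pretrained()` or `from_config()` methods."
        )

    @classmethod
    def from_config(cls, config):
        # Build the concrete class for this config, never `cls` itself.
        mapping = {"toy": ToyConcreteModel}
        return mapping[config["model_type"]](config)


try:
    ToyAutoModel()
except EnvironmentError as err:
    print("raised:", err)

model = ToyAutoModel.from_config({"model_type": "toy"})
print(type(model).__name__)  # ToyConcreteModel
```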

#### from_config[[transformers.AutoModelForQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlbertConfig](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertConfig) configuration class: `AlbertForQuestionAnswering` (AlbertConfig model)
  - [ArceeConfig](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeConfig) configuration class: [ArceeForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForQuestionAnswering) (ArceeConfig model)
  - [BartConfig](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartConfig) configuration class: [BartForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForQuestionAnswering) (BartConfig model)
  - [BertConfig](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertConfig) configuration class: [BertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForQuestionAnswering) (BertConfig model)
  - [BigBirdConfig](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdConfig) configuration class: [BigBirdForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForQuestionAnswering) (BigBirdConfig model)
  - [BigBirdPegasusConfig](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusConfig) configuration class: [BigBirdPegasusForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForQuestionAnswering) (BigBirdPegasusConfig model)
  - [BloomConfig](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomConfig) configuration class: [BloomForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForQuestionAnswering) (BloomConfig model)
  - [CamembertConfig](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertConfig) configuration class: [CamembertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForQuestionAnswering) (CamembertConfig model)
  - [CanineConfig](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineConfig) configuration class: [CanineForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineForQuestionAnswering) (CanineConfig model)
  - [ConvBertConfig](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertConfig) configuration class: [ConvBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForQuestionAnswering) (ConvBertConfig model)
  - [Data2VecTextConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextConfig) configuration class: [Data2VecTextForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForQuestionAnswering) (Data2VecTextConfig model)
  - [DebertaConfig](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaConfig) configuration class: [DebertaForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForQuestionAnswering) (DebertaConfig model)
  - [DebertaV2Config](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2Config) configuration class: [DebertaV2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForQuestionAnswering) (DebertaV2Config model)
  - [DiffLlamaConfig](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaConfig) configuration class: [DiffLlamaForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForQuestionAnswering) (DiffLlamaConfig model)
  - [DistilBertConfig](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertConfig) configuration class: [DistilBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForQuestionAnswering) (DistilBertConfig model)
  - [ElectraConfig](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraConfig) configuration class: [ElectraForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForQuestionAnswering) (ElectraConfig model)
  - [ErnieConfig](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieConfig) configuration class: [ErnieForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForQuestionAnswering) (ErnieConfig model)
  - [Exaone4Config](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4Config) configuration class: [Exaone4ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForQuestionAnswering) (Exaone4Config model)
  - [FNetConfig](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetConfig) configuration class: [FNetForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForQuestionAnswering) (FNetConfig model)
  - [FalconConfig](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconConfig) configuration class: [FalconForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForQuestionAnswering) (FalconConfig model)
  - [FlaubertConfig](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertConfig) configuration class: [FlaubertForQuestionAnsweringSimple](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertForQuestionAnsweringSimple) (FlaubertConfig model)
  - [FunnelConfig](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelConfig) configuration class: [FunnelForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForQuestionAnswering) (FunnelConfig model)
  - [GPT2Config](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2Config) configuration class: [GPT2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2ForQuestionAnswering) (GPT2Config model)
  - [GPTJConfig](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJConfig) configuration class: [GPTJForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJForQuestionAnswering) (GPTJConfig model)
  - [GPTNeoConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoConfig) configuration class: [GPTNeoForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForQuestionAnswering) (GPTNeoConfig model)
  - [GPTNeoXConfig](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXConfig) configuration class: [GPTNeoXForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForQuestionAnswering) (GPTNeoXConfig model)
  - [IBertConfig](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertConfig) configuration class: [IBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForQuestionAnswering) (IBertConfig model)
  - [JinaEmbeddingsV3Config](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3Config) configuration class: [JinaEmbeddingsV3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForQuestionAnswering) (JinaEmbeddingsV3Config model)
  - [LEDConfig](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDConfig) configuration class: [LEDForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDForQuestionAnswering) (LEDConfig model)
  - [LayoutLMv2Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Config) configuration class: [LayoutLMv2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForQuestionAnswering) (LayoutLMv2Config model)
  - [LayoutLMv3Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Config) configuration class: [LayoutLMv3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForQuestionAnswering) (LayoutLMv3Config model)
  - [LiltConfig](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltConfig) configuration class: [LiltForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltForQuestionAnswering) (LiltConfig model)
  - [LlamaConfig](/docs/transformers/v5.8.0/en/model_doc/llama2#transformers.LlamaConfig) configuration class: [LlamaForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/llama#transformers.LlamaForQuestionAnswering) (LlamaConfig model)
  - [LongformerConfig](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerConfig) configuration class: [LongformerForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForQuestionAnswering) (LongformerConfig model)
  - [LukeConfig](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeConfig) configuration class: [LukeForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForQuestionAnswering) (LukeConfig model)
  - [LxmertConfig](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertConfig) configuration class: [LxmertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertForQuestionAnswering) (LxmertConfig model)
  - [MBartConfig](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartConfig) configuration class: [MBartForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForQuestionAnswering) (MBartConfig model)
  - [MPNetConfig](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetConfig) configuration class: [MPNetForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForQuestionAnswering) (MPNetConfig model)
  - [MT5Config](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5Config) configuration class: [MT5ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5ForQuestionAnswering) (MT5Config model)
  - [MarkupLMConfig](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMConfig) configuration class: [MarkupLMForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMForQuestionAnswering) (MarkupLMConfig model)
  - [MegatronBertConfig](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertConfig) configuration class: [MegatronBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForQuestionAnswering) (MegatronBertConfig model)
  - [MiniMaxConfig](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxConfig) configuration class: [MiniMaxForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForQuestionAnswering) (MiniMaxConfig model)
  - [Ministral3Config](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3Config) configuration class: [Ministral3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForQuestionAnswering) (Ministral3Config model)
  - [MinistralConfig](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralConfig) configuration class: [MinistralForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForQuestionAnswering) (MinistralConfig model)
  - [MistralConfig](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralConfig) configuration class: [MistralForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForQuestionAnswering) (MistralConfig model)
  - [MixtralConfig](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralConfig) configuration class: [MixtralForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForQuestionAnswering) (MixtralConfig model)
  - [MobileBertConfig](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertConfig) configuration class: [MobileBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForQuestionAnswering) (MobileBertConfig model)
  - [ModernBertConfig](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertConfig) configuration class: [ModernBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForQuestionAnswering) (ModernBertConfig model)
  - [MptConfig](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptConfig) configuration class: [MptForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForQuestionAnswering) (MptConfig model)
  - [MraConfig](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraConfig) configuration class: [MraForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForQuestionAnswering) (MraConfig model)
  - [MvpConfig](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpConfig) configuration class: [MvpForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForQuestionAnswering) (MvpConfig model)
  - [NemotronConfig](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronConfig) configuration class: [NemotronForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForQuestionAnswering) (NemotronConfig model)
  - [NystromformerConfig](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerConfig) configuration class: [NystromformerForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForQuestionAnswering) (NystromformerConfig model)
  - [OPTConfig](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTConfig) configuration class: [OPTForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTForQuestionAnswering) (OPTConfig model)
  - [Qwen2Config](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2Config) configuration class: [Qwen2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForQuestionAnswering) (Qwen2Config model)
  - [Qwen2MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeConfig) configuration class: [Qwen2MoeForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForQuestionAnswering) (Qwen2MoeConfig model)
  - [Qwen3Config](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3Config) configuration class: [Qwen3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForQuestionAnswering) (Qwen3Config model)
  - [Qwen3MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeConfig) configuration class: [Qwen3MoeForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForQuestionAnswering) (Qwen3MoeConfig model)
  - [Qwen3NextConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextConfig) configuration class: [Qwen3NextForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForQuestionAnswering) (Qwen3NextConfig model)
  - [ReformerConfig](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerConfig) configuration class: [ReformerForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerForQuestionAnswering) (ReformerConfig model)
  - [RemBertConfig](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertConfig) configuration class: [RemBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForQuestionAnswering) (RemBertConfig model)
  - [RoCBertConfig](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertConfig) configuration class: [RoCBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForQuestionAnswering) (RoCBertConfig model)
  - [RoFormerConfig](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerConfig) configuration class: [RoFormerForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForQuestionAnswering) (RoFormerConfig model)
  - [RobertaConfig](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaConfig) configuration class: [RobertaForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForQuestionAnswering) (RobertaConfig model)
  - [RobertaPreLayerNormConfig](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormConfig) configuration class: [RobertaPreLayerNormForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForQuestionAnswering) (RobertaPreLayerNormConfig model)
  - [SeedOssConfig](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssConfig) configuration class: [SeedOssForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForQuestionAnswering) (SeedOssConfig model)
  - [SmolLM3Config](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3Config) configuration class: [SmolLM3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForQuestionAnswering) (SmolLM3Config model)
  - [SplinterConfig](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterConfig) configuration class: [SplinterForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterForQuestionAnswering) (SplinterConfig model)
  - [SqueezeBertConfig](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertConfig) configuration class: [SqueezeBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForQuestionAnswering) (SqueezeBertConfig model)
  - [T5Config](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5Config) configuration class: [T5ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForQuestionAnswering) (T5Config model)
  - [UMT5Config](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5Config) configuration class: [UMT5ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5ForQuestionAnswering) (UMT5Config model)
  - [XLMConfig](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMConfig) configuration class: [XLMForQuestionAnsweringSimple](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMForQuestionAnsweringSimple) (XLMConfig model)
  - [XLMRobertaConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaConfig) configuration class: [XLMRobertaForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForQuestionAnswering) (XLMRobertaConfig model)
  - [XLMRobertaXLConfig](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLConfig) configuration class: [XLMRobertaXLForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForQuestionAnswering) (XLMRobertaXLConfig model)
  - [XLNetConfig](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetConfig) configuration class: [XLNetForQuestionAnsweringSimple](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetForQuestionAnsweringSimple) (XLNetConfig model)
  - [XmodConfig](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodConfig) configuration class: [XmodForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForQuestionAnswering) (XmodConfig model)
  - [YosoConfig](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoConfig) configuration class: [YosoForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForQuestionAnswering) (YosoConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForQuestionAnswering.from_config(config)
```
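The dispatch described above can be pictured as a lookup from configuration class to model class. The sketch below is a simplified illustration of that pattern only; the class names are stand-ins and this is not the actual transformers implementation:

```python
# Simplified sketch of how an Auto class dispatches on the config type.
# These classes are hypothetical stand-ins, not the real transformers ones.

class BertConfig:
    model_type = "bert"

class BertForQuestionAnswering:
    def __init__(self, config):
        self.config = config

class AutoModelForQA:
    # Registry mapping each config class to its question answering model class.
    _registry = {BertConfig: BertForQuestionAnswering}

    @classmethod
    def from_config(cls, config):
        # Select the model class registered for this configuration class.
        model_cls = cls._registry[type(config)]
        return model_cls(config)

model = AutoModelForQA.from_config(BertConfig())
print(type(model).__name__)  # BertForQuestionAnswering
```

Note that, as in the real API, this only builds the object from the configuration; no weights are loaded.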

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The configuration object. The model class to instantiate is selected based on the configuration class, following the mapping listed above.

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
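The default selection logic described for `attn_implementation` can be sketched as follows. This is a hypothetical helper written only to illustrate the documented fallback order, not the actual transformers code:

```python
# Sketch of the documented attention-implementation defaults.
# pick_attn_implementation is a hypothetical illustration, not a real API.

def pick_attn_implementation(requested=None, torch_version=(2, 2, 0), sdpa_available=True):
    """Return the attention backend, mirroring the documented defaults."""
    if requested is not None:
        # An explicit choice ("eager", "sdpa", "flash_attention_2", ...) always wins.
        return requested
    # SDPA is the default for torch >= 2.1.1 when it is available.
    if sdpa_available and torch_version >= (2, 1, 1):
        return "sdpa"
    # Otherwise fall back to the manual eager implementation.
    return "eager"

print(pick_attn_implementation())                        # sdpa
print(pick_attn_implementation(torch_version=(2, 0, 0))) # eager
```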
#### from_pretrained[[transformers.AutoModelForQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **albert** -- [AlbertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/albert#transformers.AlbertForQuestionAnswering) (AlbertConfig model)
- **arcee** -- [ArceeForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/arcee#transformers.ArceeForQuestionAnswering) (ArceeConfig model)
- **bart** -- [BartForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/bart#transformers.BartForQuestionAnswering) (BartConfig model)
- **bert** -- [BertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/bert#transformers.BertForQuestionAnswering) (BertConfig model)
- **big_bird** -- [BigBirdForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/big_bird#transformers.BigBirdForQuestionAnswering) (BigBirdConfig model)
- **bigbird_pegasus** -- [BigBirdPegasusForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/bigbird_pegasus#transformers.BigBirdPegasusForQuestionAnswering) (BigBirdPegasusConfig model)
- **bloom** -- [BloomForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/bloom#transformers.BloomForQuestionAnswering) (BloomConfig model)
- **camembert** -- [CamembertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/camembert#transformers.CamembertForQuestionAnswering) (CamembertConfig model)
- **canine** -- [CanineForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/canine#transformers.CanineForQuestionAnswering) (CanineConfig model)
- **convbert** -- [ConvBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/convbert#transformers.ConvBertForQuestionAnswering) (ConvBertConfig model)
- **data2vec-text** -- [Data2VecTextForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecTextForQuestionAnswering) (Data2VecTextConfig model)
- **deberta** -- [DebertaForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/deberta#transformers.DebertaForQuestionAnswering) (DebertaConfig model)
- **deberta-v2** -- [DebertaV2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/deberta-v2#transformers.DebertaV2ForQuestionAnswering) (DebertaV2Config model)
- **diffllama** -- [DiffLlamaForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/diffllama#transformers.DiffLlamaForQuestionAnswering) (DiffLlamaConfig model)
- **distilbert** -- [DistilBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/distilbert#transformers.DistilBertForQuestionAnswering) (DistilBertConfig model)
- **electra** -- [ElectraForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/electra#transformers.ElectraForQuestionAnswering) (ElectraConfig model)
- **ernie** -- [ErnieForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/ernie#transformers.ErnieForQuestionAnswering) (ErnieConfig model)
- **exaone4** -- [Exaone4ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/exaone4#transformers.Exaone4ForQuestionAnswering) (Exaone4Config model)
- **falcon** -- [FalconForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/falcon#transformers.FalconForQuestionAnswering) (FalconConfig model)
- **flaubert** -- [FlaubertForQuestionAnsweringSimple](/docs/transformers/v5.8.0/en/model_doc/flaubert#transformers.FlaubertForQuestionAnsweringSimple) (FlaubertConfig model)
- **fnet** -- [FNetForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/fnet#transformers.FNetForQuestionAnswering) (FNetConfig model)
- **funnel** -- [FunnelForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/funnel#transformers.FunnelForQuestionAnswering) (FunnelConfig model)
- **gpt2** -- [GPT2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/gpt2#transformers.GPT2ForQuestionAnswering) (GPT2Config model)
- **gpt_neo** -- [GPTNeoForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/gpt_neo#transformers.GPTNeoForQuestionAnswering) (GPTNeoConfig model)
- **gpt_neox** -- [GPTNeoXForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/gpt_neox#transformers.GPTNeoXForQuestionAnswering) (GPTNeoXConfig model)
- **gptj** -- [GPTJForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/gptj#transformers.GPTJForQuestionAnswering) (GPTJConfig model)
- **ibert** -- [IBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/ibert#transformers.IBertForQuestionAnswering) (IBertConfig model)
- **jina_embeddings_v3** -- [JinaEmbeddingsV3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/jina_embeddings_v3#transformers.JinaEmbeddingsV3ForQuestionAnswering) (JinaEmbeddingsV3Config model)
- **layoutlmv2** -- [LayoutLMv2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForQuestionAnswering) (LayoutLMv2Config model)
- **layoutlmv3** -- [LayoutLMv3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForQuestionAnswering) (LayoutLMv3Config model)
- **led** -- [LEDForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/led#transformers.LEDForQuestionAnswering) (LEDConfig model)
- **lilt** -- [LiltForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/lilt#transformers.LiltForQuestionAnswering) (LiltConfig model)
- **llama** -- [LlamaForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/llama#transformers.LlamaForQuestionAnswering) (LlamaConfig model)
- **longformer** -- [LongformerForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/longformer#transformers.LongformerForQuestionAnswering) (LongformerConfig model)
- **luke** -- [LukeForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/luke#transformers.LukeForQuestionAnswering) (LukeConfig model)
- **lxmert** -- [LxmertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/lxmert#transformers.LxmertForQuestionAnswering) (LxmertConfig model)
- **markuplm** -- [MarkupLMForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/markuplm#transformers.MarkupLMForQuestionAnswering) (MarkupLMConfig model)
- **mbart** -- [MBartForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mbart#transformers.MBartForQuestionAnswering) (MBartConfig model)
- **megatron-bert** -- [MegatronBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/megatron-bert#transformers.MegatronBertForQuestionAnswering) (MegatronBertConfig model)
- **minimax** -- [MiniMaxForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/minimax#transformers.MiniMaxForQuestionAnswering) (MiniMaxConfig model)
- **ministral** -- [MinistralForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/ministral#transformers.MinistralForQuestionAnswering) (MinistralConfig model)
- **ministral3** -- [Ministral3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/ministral3#transformers.Ministral3ForQuestionAnswering) (Ministral3Config model)
- **mistral** -- [MistralForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mistral#transformers.MistralForQuestionAnswering) (MistralConfig model)
- **mixtral** -- [MixtralForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mixtral#transformers.MixtralForQuestionAnswering) (MixtralConfig model)
- **mobilebert** -- [MobileBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mobilebert#transformers.MobileBertForQuestionAnswering) (MobileBertConfig model)
- **modernbert** -- [ModernBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/modernbert#transformers.ModernBertForQuestionAnswering) (ModernBertConfig model)
- **mpnet** -- [MPNetForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mpnet#transformers.MPNetForQuestionAnswering) (MPNetConfig model)
- **mpt** -- [MptForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mpt#transformers.MptForQuestionAnswering) (MptConfig model)
- **mra** -- [MraForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mra#transformers.MraForQuestionAnswering) (MraConfig model)
- **mt5** -- [MT5ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mt5#transformers.MT5ForQuestionAnswering) (MT5Config model)
- **mvp** -- [MvpForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/mvp#transformers.MvpForQuestionAnswering) (MvpConfig model)
- **nemotron** -- [NemotronForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/nemotron#transformers.NemotronForQuestionAnswering) (NemotronConfig model)
- **nystromformer** -- [NystromformerForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/nystromformer#transformers.NystromformerForQuestionAnswering) (NystromformerConfig model)
- **opt** -- [OPTForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/opt#transformers.OPTForQuestionAnswering) (OPTConfig model)
- **qwen2** -- [Qwen2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/qwen2#transformers.Qwen2ForQuestionAnswering) (Qwen2Config model)
- **qwen2_moe** -- [Qwen2MoeForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/qwen2_moe#transformers.Qwen2MoeForQuestionAnswering) (Qwen2MoeConfig model)
- **qwen3** -- [Qwen3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/qwen3#transformers.Qwen3ForQuestionAnswering) (Qwen3Config model)
- **qwen3_moe** -- [Qwen3MoeForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/qwen3_moe#transformers.Qwen3MoeForQuestionAnswering) (Qwen3MoeConfig model)
- **qwen3_next** -- [Qwen3NextForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/qwen3_next#transformers.Qwen3NextForQuestionAnswering) (Qwen3NextConfig model)
- **reformer** -- [ReformerForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/reformer#transformers.ReformerForQuestionAnswering) (ReformerConfig model)
- **rembert** -- [RemBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/rembert#transformers.RemBertForQuestionAnswering) (RemBertConfig model)
- **roberta** -- [RobertaForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/roberta#transformers.RobertaForQuestionAnswering) (RobertaConfig model)
- **roberta-prelayernorm** -- [RobertaPreLayerNormForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/roberta-prelayernorm#transformers.RobertaPreLayerNormForQuestionAnswering) (RobertaPreLayerNormConfig model)
- **roc_bert** -- [RoCBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/roc_bert#transformers.RoCBertForQuestionAnswering) (RoCBertConfig model)
- **roformer** -- [RoFormerForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/roformer#transformers.RoFormerForQuestionAnswering) (RoFormerConfig model)
- **seed_oss** -- [SeedOssForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/seed_oss#transformers.SeedOssForQuestionAnswering) (SeedOssConfig model)
- **smollm3** -- [SmolLM3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/smollm3#transformers.SmolLM3ForQuestionAnswering) (SmolLM3Config model)
- **splinter** -- [SplinterForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/splinter#transformers.SplinterForQuestionAnswering) (SplinterConfig model)
- **squeezebert** -- [SqueezeBertForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/squeezebert#transformers.SqueezeBertForQuestionAnswering) (SqueezeBertConfig model)
- **t5** -- [T5ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/t5#transformers.T5ForQuestionAnswering) (T5Config model)
- **umt5** -- [UMT5ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/umt5#transformers.UMT5ForQuestionAnswering) (UMT5Config model)
- **xlm** -- [XLMForQuestionAnsweringSimple](/docs/transformers/v5.8.0/en/model_doc/xlm#transformers.XLMForQuestionAnsweringSimple) (XLMConfig model)
- **xlm-roberta** -- [XLMRobertaForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta#transformers.XLMRobertaForQuestionAnswering) (XLMRobertaConfig model)
- **xlm-roberta-xl** -- [XLMRobertaXLForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/xlm-roberta-xl#transformers.XLMRobertaXLForQuestionAnswering) (XLMRobertaXLConfig model)
- **xlnet** -- [XLNetForQuestionAnsweringSimple](/docs/transformers/v5.8.0/en/model_doc/xlnet#transformers.XLNetForQuestionAnsweringSimple) (XLNetConfig model)
- **xmod** -- [XmodForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/xmod#transformers.XmodForQuestionAnswering) (XmodConfig model)
- **yoso** -- [YosoForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/yoso#transformers.YosoForQuestionAnswering) (YosoConfig model)
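The two-step dispatch described above (an exact `model_type` lookup, with pattern matching on the name or path as a fallback) can be sketched in plain Python. The registry contents and function names here are illustrative only, not the library's internals:

```python
# Illustrative sketch of the auto-class dispatch, not transformers internals:
# prefer the config's model_type key; fall back to substring matching on the
# pretrained name/path when model_type is missing.
REGISTRY = {
    "bert": "BertForQuestionAnswering",
    "distilbert": "DistilBertForQuestionAnswering",
}

def resolve(model_type, name_or_path):
    if model_type in REGISTRY:              # preferred: config.model_type
        return REGISTRY[model_type]
    for key, cls in REGISTRY.items():       # fallback: pattern match on the name
        if key in name_or_path:
            return cls
    raise ValueError(f"Unrecognized model: {name_or_path}")

resolved = resolve(None, "google-bert/bert-base-cased")
```

The real mapping is ordered so that longer keys (e.g. `distilbert`) are tried before shorter substrings of them (e.g. `bert`); this sketch elides that detail.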

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```
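The evaluation-mode default mentioned above can be illustrated with a minimal stand-in for a module's `training` flag (a sketch, not `torch.nn.Module` itself):

```python
# Minimal sketch of the train/eval flag that from_pretrained manipulates:
# loaded models start in eval mode (dropout etc. disabled); call .train()
# before fine-tuning. ModuleSketch is an illustrative stand-in.
class ModuleSketch:
    def __init__(self):
        self.training = True   # freshly constructed modules default to train mode

    def eval(self):
        self.training = False
        return self

    def train(self, mode=True):
        self.training = mode
        return self

model = ModuleSketch().eval()  # from_pretrained calls eval() after loading
assert model.training is False
model.train()                  # required before fine-tuning
assert model.training is True
```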

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
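The kwargs routing described above for the no-`config` case can be sketched as a simple split; the helper name and attribute set below are illustrative, not library internals:

```python
# Sketch of how **kwargs are routed when no explicit config is passed:
# keys matching configuration attributes override the config, the rest
# are forwarded to the model's __init__. route_kwargs is a hypothetical helper.
def route_kwargs(config_attributes, kwargs):
    config_updates = {k: v for k, v in kwargs.items() if k in config_attributes}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attributes}
    return config_updates, model_kwargs

updates, rest = route_kwargs(
    {"output_attentions", "hidden_size"},
    {"output_attentions": True, "custom_model_arg": 1},
)
```

When a `config` object is passed explicitly, no such split happens: everything goes straight to the model's `__init__`.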

### AutoModelForTextEncoding[[transformers.AutoModelForTextEncoding]]

#### transformers.AutoModelForTextEncoding[[transformers.AutoModelForTextEncoding]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L1989)

## Computer vision

The following auto classes are available for the following computer vision tasks.

### AutoModelForDepthEstimation[[transformers.AutoModelForDepthEstimation]]

#### transformers.AutoModelForDepthEstimation[[transformers.AutoModelForDepthEstimation]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2193)

This is a generic model class that will be instantiated as one of the model classes of the library (with a depth estimation head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForDepthEstimation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [CHMv2Config](/docs/transformers/v5.8.0/en/model_doc/chmv2#transformers.CHMv2Config) configuration class: [CHMv2ForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/chmv2#transformers.CHMv2ForDepthEstimation) (CHMv2Config model)
  - [DPTConfig](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTConfig) configuration class: [DPTForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTForDepthEstimation) (DPTConfig model)
  - [DepthAnythingConfig](/docs/transformers/v5.8.0/en/model_doc/depth_anything#transformers.DepthAnythingConfig) configuration class: [DepthAnythingForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/depth_anything#transformers.DepthAnythingForDepthEstimation) (DepthAnythingConfig model)
  - [DepthProConfig](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProConfig) configuration class: [DepthProForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProForDepthEstimation) (DepthProConfig model)
  - [GLPNConfig](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNConfig) configuration class: [GLPNForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNForDepthEstimation) (GLPNConfig model)
  - [PromptDepthAnythingConfig](/docs/transformers/v5.8.0/en/model_doc/prompt_depth_anything#transformers.PromptDepthAnythingConfig) configuration class: [PromptDepthAnythingForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/prompt_depth_anything#transformers.PromptDepthAnythingForDepthEstimation) (PromptDepthAnythingConfig model)
  - [ZoeDepthConfig](/docs/transformers/v5.8.0/en/model_doc/zoedepth#transformers.ZoeDepthConfig) configuration class: [ZoeDepthForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/zoedepth#transformers.ZoeDepthForDepthEstimation) (ZoeDepthConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a depth estimation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("Intel/dpt-large")
>>> model = AutoModelForDepthEstimation.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [CHMv2Config](/docs/transformers/v5.8.0/en/model_doc/chmv2#transformers.CHMv2Config) configuration class: [CHMv2ForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/chmv2#transformers.CHMv2ForDepthEstimation) (CHMv2Config model) - [DPTConfig](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTConfig) configuration class: [DPTForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTForDepthEstimation) (DPTConfig model) - [DepthAnythingConfig](/docs/transformers/v5.8.0/en/model_doc/depth_anything#transformers.DepthAnythingConfig) configuration class: [DepthAnythingForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/depth_anything#transformers.DepthAnythingForDepthEstimation) (DepthAnythingConfig model) - [DepthProConfig](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProConfig) configuration class: [DepthProForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProForDepthEstimation) (DepthProConfig model) - [GLPNConfig](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNConfig) configuration class: [GLPNForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNForDepthEstimation) (GLPNConfig model) - [PromptDepthAnythingConfig](/docs/transformers/v5.8.0/en/model_doc/prompt_depth_anything#transformers.PromptDepthAnythingConfig) configuration class: [PromptDepthAnythingForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/prompt_depth_anything#transformers.PromptDepthAnythingForDepthEstimation) (PromptDepthAnythingConfig model) - [ZoeDepthConfig](/docs/transformers/v5.8.0/en/model_doc/zoedepth#transformers.ZoeDepthConfig) configuration class: [ZoeDepthForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/zoedepth#transformers.ZoeDepthForDepthEstimation) (ZoeDepthConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.AutoModelForDepthEstimation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a depth estimation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **chmv2** -- [CHMv2ForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/chmv2#transformers.CHMv2ForDepthEstimation) (CHMv2Config model)
- **depth_anything** -- [DepthAnythingForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/depth_anything#transformers.DepthAnythingForDepthEstimation) (DepthAnythingConfig model)
- **depth_pro** -- [DepthProForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/depth_pro#transformers.DepthProForDepthEstimation) (DepthProConfig model)
- **dpt** -- [DPTForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTForDepthEstimation) (DPTConfig model)
- **glpn** -- [GLPNForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/glpn#transformers.GLPNForDepthEstimation) (GLPNConfig model)
- **prompt_depth_anything** -- [PromptDepthAnythingForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/prompt_depth_anything#transformers.PromptDepthAnythingForDepthEstimation) (PromptDepthAnythingConfig model)
- **zoedepth** -- [ZoeDepthForDepthEstimation](/docs/transformers/v5.8.0/en/model_doc/zoedepth#transformers.ZoeDepthForDepthEstimation) (ZoeDepthConfig model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large")

>>> # Update configuration during loading
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForTextRecognition[[transformers.AutoModelForTextRecognition]]

#### transformers.AutoModelForTextRecognition[[transformers.AutoModelForTextRecognition]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2200)

This is a generic model class that will be instantiated as one of the model classes of the library (with a text recognition head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForTextRecognition.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [PPOCRV5MobileRecConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecConfig) configuration class: [PPOCRV5MobileRecForTextRecognition](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecForTextRecognition) (PPOCRV5MobileRecConfig model)
  - [PPOCRV5ServerRecConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecConfig) configuration class: [PPOCRV5ServerRecForTextRecognition](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecForTextRecognition) (PPOCRV5ServerRecConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a text recognition head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTextRecognition

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForTextRecognition.from_config(config)
```
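Under the hood, `from_config` dispatches on the type of the configuration object: each auto class keeps a registry mapping configuration classes to model classes and instantiates whichever one matches. A minimal pure-Python sketch of this pattern (the classes and registry below are illustrative stand-ins, not the actual transformers internals):

```python
# Illustrative stand-ins for a registered config/model pair.
class PPOCRV5MobileRecConfig:
    model_type = "pp_ocrv5_mobile_rec"

class PPOCRV5MobileRecForTextRecognition:
    def __init__(self, config):
        self.config = config

# Registry mapping configuration class -> model class (a stand-in for
# the mapping the real auto classes maintain).
_CONFIG_TO_MODEL = {
    PPOCRV5MobileRecConfig: PPOCRV5MobileRecForTextRecognition,
}

class AutoModelForTextRecognitionSketch:
    def __init__(self):
        # Mirrors the real auto classes: direct instantiation is an error.
        raise EnvironmentError("Use from_config or from_pretrained instead.")

    @classmethod
    def from_config(cls, config, **kwargs):
        # Look up the model class registered for this configuration type
        # and instantiate it.
        model_cls = _CONFIG_TO_MODEL[type(config)]
        return model_cls(config, **kwargs)

model = AutoModelForTextRecognitionSketch.from_config(PPOCRV5MobileRecConfig())
print(type(model).__name__)  # PPOCRV5MobileRecForTextRecognition
```

Registering a new pair, as described in "Extending the Auto Classes" above, amounts to adding one more entry to this mapping.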

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [PPOCRV5MobileRecConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecConfig) configuration class: [PPOCRV5MobileRecForTextRecognition](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecForTextRecognition) (PPOCRV5MobileRecConfig model) - [PPOCRV5ServerRecConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecConfig) configuration class: [PPOCRV5ServerRecForTextRecognition](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecForTextRecognition) (PPOCRV5ServerRecConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForTextRecognition.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a text recognition head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **pp_ocrv5_mobile_rec** -- [PPOCRV5MobileRecForTextRecognition](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_rec#transformers.PPOCRV5MobileRecForTextRecognition) (PPOCRV5MobileRecConfig model)
- **pp_ocrv5_server_rec** -- [PPOCRV5ServerRecForTextRecognition](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_rec#transformers.PPOCRV5ServerRecForTextRecognition) (PPOCRV5ServerRecConfig model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTextRecognition

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTextRecognition.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForTextRecognition.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```
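The two-step selection described above — prefer the config's `model_type`, then fall back to pattern matching on the name or path — can be sketched in a few lines. The lookup table and helper here are illustrative assumptions, not the transformers implementation:

```python
# Stand-in for the model_type -> model class table shown above.
MODEL_TYPE_TO_CLASS = {
    "pp_ocrv5_mobile_rec": "PPOCRV5MobileRecForTextRecognition",
    "pp_ocrv5_server_rec": "PPOCRV5ServerRecForTextRecognition",
}

def select_model_class(name_or_path, model_type=None):
    """Pick a model class name: prefer the config's model_type,
    otherwise pattern-match against the model name or path."""
    if model_type is not None:
        return MODEL_TYPE_TO_CLASS[model_type]
    # Fallback: take the longest registered model type that appears
    # in the name or path, so more specific keys win.
    matches = [t for t in MODEL_TYPE_TO_CLASS if t in name_or_path]
    if not matches:
        raise ValueError(f"Could not infer a model type from {name_or_path!r}")
    return MODEL_TYPE_TO_CLASS[max(matches, key=len)]

print(select_model_class("org/pp_ocrv5_mobile_rec-finetuned"))
# PPOCRV5MobileRecForTextRecognition
```

When a `config` object is available (passed in or loaded from the checkpoint), its `model_type` drives the choice and the pattern matching is never consulted.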

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForTableRecognition[[transformers.AutoModelForTableRecognition]]

#### transformers.AutoModelForTableRecognition[[transformers.AutoModelForTableRecognition]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2207)

This is a generic model class that will be instantiated as one of the model classes of the library (with a table recognition head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForTableRecognition.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [SLANeXtConfig](/docs/transformers/v5.8.0/en/model_doc/slanext#transformers.SLANeXtConfig) configuration class: [SLANeXtForTableRecognition](/docs/transformers/v5.8.0/en/model_doc/slanext#transformers.SLANeXtForTableRecognition) (SLANeXtConfig model)
  - [SLANetConfig](/docs/transformers/v5.8.0/en/model_doc/slanet#transformers.SLANetConfig) configuration class: [SLANetForTableRecognition](/docs/transformers/v5.8.0/en/model_doc/slanet#transformers.SLANetForTableRecognition) (SLANetConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a table recognition head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTableRecognition

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForTableRecognition.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [SLANeXtConfig](/docs/transformers/v5.8.0/en/model_doc/slanext#transformers.SLANeXtConfig) configuration class: [SLANeXtForTableRecognition](/docs/transformers/v5.8.0/en/model_doc/slanext#transformers.SLANeXtForTableRecognition) (SLANeXtConfig model) - [SLANetConfig](/docs/transformers/v5.8.0/en/model_doc/slanet#transformers.SLANetConfig) configuration class: [SLANetForTableRecognition](/docs/transformers/v5.8.0/en/model_doc/slanet#transformers.SLANetForTableRecognition) (SLANetConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForTableRecognition.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a table recognition head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **slanet** -- [SLANetForTableRecognition](/docs/transformers/v5.8.0/en/model_doc/slanet#transformers.SLANetForTableRecognition) (SLANetConfig model)
- **slanext** -- [SLANeXtForTableRecognition](/docs/transformers/v5.8.0/en/model_doc/slanext#transformers.SLANeXtForTableRecognition) (SLANeXtConfig model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTableRecognition

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableRecognition.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForTableRecognition.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForImageClassification[[transformers.AutoModelForImageClassification]]

#### transformers.AutoModelForImageClassification[[transformers.AutoModelForImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2118)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BeitConfig](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitConfig) configuration class: [BeitForImageClassification](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitForImageClassification) (BeitConfig model)
  - [BitConfig](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitConfig) configuration class: [BitForImageClassification](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitForImageClassification) (BitConfig model)
  - [CLIPConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPForImageClassification](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPForImageClassification) (CLIPConfig model)
  - [ConvNextConfig](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextConfig) configuration class: [ConvNextForImageClassification](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextForImageClassification) (ConvNextConfig model)
  - [ConvNextV2Config](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2Config) configuration class: [ConvNextV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2ForImageClassification) (ConvNextV2Config model)
  - [CvtConfig](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtConfig) configuration class: [CvtForImageClassification](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtForImageClassification) (CvtConfig model)
  - [Data2VecVisionConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionForImageClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionForImageClassification) (Data2VecVisionConfig model)
  - [DeiTConfig](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTForImageClassification) or [DeiTForImageClassificationWithTeacher](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTForImageClassificationWithTeacher) (DeiTConfig model)
  - [DinatConfig](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatConfig) configuration class: [DinatForImageClassification](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatForImageClassification) (DinatConfig model)
  - [Dinov2Config](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2Config) configuration class: [Dinov2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2ForImageClassification) (Dinov2Config model)
  - [Dinov2WithRegistersConfig](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersConfig) configuration class: [Dinov2WithRegistersForImageClassification](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersForImageClassification) (Dinov2WithRegistersConfig model)
  - [DonutSwinConfig](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinConfig) configuration class: [DonutSwinForImageClassification](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinForImageClassification) (DonutSwinConfig model)
  - [EfficientNetConfig](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetConfig) configuration class: [EfficientNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetForImageClassification) (EfficientNetConfig model)
  - [FocalNetConfig](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetConfig) configuration class: [FocalNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetForImageClassification) (FocalNetConfig model)
  - [HGNetV2Config](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2Config) configuration class: [HGNetV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2ForImageClassification) (HGNetV2Config model)
  - [HieraConfig](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraConfig) configuration class: [HieraForImageClassification](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraForImageClassification) (HieraConfig model)
  - [IJepaConfig](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaConfig) configuration class: [IJepaForImageClassification](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaForImageClassification) (IJepaConfig model)
  - [ImageGPTConfig](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTConfig) configuration class: [ImageGPTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTForImageClassification) (ImageGPTConfig model)
  - [LevitConfig](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitConfig) configuration class: [LevitForImageClassification](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitForImageClassification) or [LevitForImageClassificationWithTeacher](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitForImageClassificationWithTeacher) (LevitConfig model)
  - [MetaClip2Config](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Config) configuration class: [MetaClip2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2ForImageClassification) (MetaClip2Config model)
  - [MobileNetV1Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1Config) configuration class: [MobileNetV1ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1ForImageClassification) (MobileNetV1Config model)
  - [MobileNetV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2Config) configuration class: [MobileNetV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2ForImageClassification) (MobileNetV2Config model)
  - [MobileViTConfig](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTConfig) configuration class: [MobileViTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTForImageClassification) (MobileViTConfig model)
  - [MobileViTV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2Config) configuration class: [MobileViTV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2ForImageClassification) (MobileViTV2Config model)
  - [PPLCNetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_lcnet#transformers.PPLCNetConfig) configuration class: [PPLCNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/pp_lcnet#transformers.PPLCNetForImageClassification) (PPLCNetConfig model)
  - [PerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverConfig) configuration class: [PerceiverForImageClassificationLearned](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForImageClassificationLearned) or [PerceiverForImageClassificationFourier](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForImageClassificationFourier) or [PerceiverForImageClassificationConvProcessing](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForImageClassificationConvProcessing) (PerceiverConfig model)
  - [PoolFormerConfig](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerConfig) configuration class: [PoolFormerForImageClassification](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerForImageClassification) (PoolFormerConfig model)
  - [PvtConfig](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtConfig) configuration class: [PvtForImageClassification](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtForImageClassification) (PvtConfig model)
  - [PvtV2Config](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2Config) configuration class: [PvtV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2ForImageClassification) (PvtV2Config model)
  - [RegNetConfig](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetConfig) configuration class: [RegNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetForImageClassification) (RegNetConfig model)
  - [ResNetConfig](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetConfig) configuration class: [ResNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetForImageClassification) (ResNetConfig model)
  - [SegformerConfig](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerConfig) configuration class: [SegformerForImageClassification](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerForImageClassification) (SegformerConfig model)
  - [ShieldGemma2Config](/docs/transformers/v5.8.0/en/model_doc/shieldgemma2#transformers.ShieldGemma2Config) configuration class: [ShieldGemma2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/shieldgemma2#transformers.ShieldGemma2ForImageClassification) (ShieldGemma2Config model)
  - [Siglip2Config](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Config) configuration class: [Siglip2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2ForImageClassification) (Siglip2Config model)
  - [SiglipConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipForImageClassification](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipForImageClassification) (SiglipConfig model)
  - [SwiftFormerConfig](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerConfig) configuration class: [SwiftFormerForImageClassification](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerForImageClassification) (SwiftFormerConfig model)
  - [SwinConfig](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinConfig) configuration class: [SwinForImageClassification](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinForImageClassification) (SwinConfig model)
  - [Swinv2Config](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2ForImageClassification) (Swinv2Config model)
  - [TextNetConfig](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetConfig) configuration class: [TextNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetForImageClassification) (TextNetConfig model)
  - [TimmWrapperConfig](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperConfig) configuration class: [TimmWrapperForImageClassification](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperForImageClassification) (TimmWrapperConfig model)
  - [ViTConfig](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTConfig) configuration class: [ViTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTForImageClassification) (ViTConfig model)
  - [ViTMSNConfig](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNConfig) configuration class: [ViTMSNForImageClassification](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNForImageClassification) (ViTMSNConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = AutoModelForImageClassification.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BeitConfig](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitConfig) configuration class: [BeitForImageClassification](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitForImageClassification) (BeitConfig model) - [BitConfig](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitConfig) configuration class: [BitForImageClassification](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitForImageClassification) (BitConfig model) - [CLIPConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPForImageClassification](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPForImageClassification) (CLIPConfig model) - [ConvNextConfig](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextConfig) configuration class: [ConvNextForImageClassification](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextForImageClassification) (ConvNextConfig model) - [ConvNextV2Config](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2Config) configuration class: [ConvNextV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2ForImageClassification) (ConvNextV2Config model) - [CvtConfig](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtConfig) configuration class: [CvtForImageClassification](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtForImageClassification) (CvtConfig model) - [Data2VecVisionConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionForImageClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionForImageClassification) (Data2VecVisionConfig model) - 
[DeiTConfig](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTForImageClassification) or [DeiTForImageClassificationWithTeacher](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTForImageClassificationWithTeacher) (DeiTConfig model) - [DinatConfig](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatConfig) configuration class: [DinatForImageClassification](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatForImageClassification) (DinatConfig model) - [Dinov2Config](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2Config) configuration class: [Dinov2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2ForImageClassification) (Dinov2Config model) - [Dinov2WithRegistersConfig](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersConfig) configuration class: [Dinov2WithRegistersForImageClassification](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersForImageClassification) (Dinov2WithRegistersConfig model) - [DonutSwinConfig](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinConfig) configuration class: [DonutSwinForImageClassification](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinForImageClassification) (DonutSwinConfig model) - [EfficientNetConfig](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetConfig) configuration class: [EfficientNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetForImageClassification) (EfficientNetConfig model) - [FocalNetConfig](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetConfig) configuration class: 
[FocalNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetForImageClassification) (FocalNetConfig model) - [HGNetV2Config](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2Config) configuration class: [HGNetV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2ForImageClassification) (HGNetV2Config model) - [HieraConfig](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraConfig) configuration class: [HieraForImageClassification](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraForImageClassification) (HieraConfig model) - [IJepaConfig](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaConfig) configuration class: [IJepaForImageClassification](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaForImageClassification) (IJepaConfig model) - [ImageGPTConfig](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTConfig) configuration class: [ImageGPTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTForImageClassification) (ImageGPTConfig model) - [LevitConfig](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitConfig) configuration class: [LevitForImageClassification](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitForImageClassification) or [LevitForImageClassificationWithTeacher](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitForImageClassificationWithTeacher) (LevitConfig model) - [MetaClip2Config](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Config) configuration class: [MetaClip2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2ForImageClassification) (MetaClip2Config model) - [MobileNetV1Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1Config) configuration class: 
[MobileNetV1ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1ForImageClassification) (MobileNetV1Config model) - [MobileNetV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2Config) configuration class: [MobileNetV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2ForImageClassification) (MobileNetV2Config model) - [MobileViTConfig](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTConfig) configuration class: [MobileViTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTForImageClassification) (MobileViTConfig model) - [MobileViTV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2Config) configuration class: [MobileViTV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2ForImageClassification) (MobileViTV2Config model) - [PPLCNetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_lcnet#transformers.PPLCNetConfig) configuration class: [PPLCNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/pp_lcnet#transformers.PPLCNetForImageClassification) (PPLCNetConfig model) - [PerceiverConfig](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverConfig) configuration class: [PerceiverForImageClassificationLearned](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForImageClassificationLearned) or [PerceiverForImageClassificationFourier](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForImageClassificationFourier) or [PerceiverForImageClassificationConvProcessing](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForImageClassificationConvProcessing) (PerceiverConfig model) - [PoolFormerConfig](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerConfig) configuration class: 
[PoolFormerForImageClassification](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerForImageClassification) (PoolFormerConfig model) - [PvtConfig](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtConfig) configuration class: [PvtForImageClassification](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtForImageClassification) (PvtConfig model) - [PvtV2Config](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2Config) configuration class: [PvtV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2ForImageClassification) (PvtV2Config model) - [RegNetConfig](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetConfig) configuration class: [RegNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetForImageClassification) (RegNetConfig model) - [ResNetConfig](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetConfig) configuration class: [ResNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetForImageClassification) (ResNetConfig model) - [SegformerConfig](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerConfig) configuration class: [SegformerForImageClassification](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerForImageClassification) (SegformerConfig model) - [ShieldGemma2Config](/docs/transformers/v5.8.0/en/model_doc/shieldgemma2#transformers.ShieldGemma2Config) configuration class: [ShieldGemma2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/shieldgemma2#transformers.ShieldGemma2ForImageClassification) (ShieldGemma2Config model) - [Siglip2Config](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Config) configuration class: [Siglip2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2ForImageClassification) (Siglip2Config model) - 
[SiglipConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipForImageClassification](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipForImageClassification) (SiglipConfig model) - [SwiftFormerConfig](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerConfig) configuration class: [SwiftFormerForImageClassification](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerForImageClassification) (SwiftFormerConfig model) - [SwinConfig](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinConfig) configuration class: [SwinForImageClassification](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinForImageClassification) (SwinConfig model) - [Swinv2Config](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2ForImageClassification) (Swinv2Config model) - [TextNetConfig](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetConfig) configuration class: [TextNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetForImageClassification) (TextNetConfig model) - [TimmWrapperConfig](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperConfig) configuration class: [TimmWrapperForImageClassification](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperForImageClassification) (TimmWrapperConfig model) - [ViTConfig](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTConfig) configuration class: [ViTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTForImageClassification) (ViTConfig model) - [ViTMSNConfig](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNConfig) configuration class: 
[ViTMSNForImageClassification](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNForImageClassification) (ViTMSNConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **beit** -- [BeitForImageClassification](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitForImageClassification) (BeitConfig model)
- **bit** -- [BitForImageClassification](/docs/transformers/v5.8.0/en/model_doc/bit#transformers.BitForImageClassification) (BitConfig model)
- **clip** -- [CLIPForImageClassification](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPForImageClassification) (CLIPConfig model)
- **convnext** -- [ConvNextForImageClassification](/docs/transformers/v5.8.0/en/model_doc/convnext#transformers.ConvNextForImageClassification) (ConvNextConfig model)
- **convnextv2** -- [ConvNextV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/convnextv2#transformers.ConvNextV2ForImageClassification) (ConvNextV2Config model)
- **cvt** -- [CvtForImageClassification](/docs/transformers/v5.8.0/en/model_doc/cvt#transformers.CvtForImageClassification) (CvtConfig model)
- **data2vec-vision** -- [Data2VecVisionForImageClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionForImageClassification) (Data2VecVisionConfig model)
- **deit** -- [DeiTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTForImageClassification) or [DeiTForImageClassificationWithTeacher](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTForImageClassificationWithTeacher) (DeiTConfig model)
- **dinat** -- [DinatForImageClassification](/docs/transformers/v5.8.0/en/model_doc/dinat#transformers.DinatForImageClassification) (DinatConfig model)
- **dinov2** -- [Dinov2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/dinov2#transformers.Dinov2ForImageClassification) (Dinov2Config model)
- **dinov2_with_registers** -- [Dinov2WithRegistersForImageClassification](/docs/transformers/v5.8.0/en/model_doc/dinov2_with_registers#transformers.Dinov2WithRegistersForImageClassification) (Dinov2WithRegistersConfig model)
- **donut-swin** -- [DonutSwinForImageClassification](/docs/transformers/v5.8.0/en/model_doc/donut#transformers.DonutSwinForImageClassification) (DonutSwinConfig model)
- **efficientnet** -- [EfficientNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/efficientnet#transformers.EfficientNetForImageClassification) (EfficientNetConfig model)
- **focalnet** -- [FocalNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetForImageClassification) (FocalNetConfig model)
- **hgnet_v2** -- [HGNetV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/hgnet_v2#transformers.HGNetV2ForImageClassification) (HGNetV2Config model)
- **hiera** -- [HieraForImageClassification](/docs/transformers/v5.8.0/en/model_doc/hiera#transformers.HieraForImageClassification) (HieraConfig model)
- **ijepa** -- [IJepaForImageClassification](/docs/transformers/v5.8.0/en/model_doc/ijepa#transformers.IJepaForImageClassification) (IJepaConfig model)
- **imagegpt** -- [ImageGPTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/imagegpt#transformers.ImageGPTForImageClassification) (ImageGPTConfig model)
- **levit** -- [LevitForImageClassification](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitForImageClassification) or [LevitForImageClassificationWithTeacher](/docs/transformers/v5.8.0/en/model_doc/levit#transformers.LevitForImageClassificationWithTeacher) (LevitConfig model)
- **metaclip_2** -- [MetaClip2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2ForImageClassification) (MetaClip2Config model)
- **mobilenet_v1** -- [MobileNetV1ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v1#transformers.MobileNetV1ForImageClassification) (MobileNetV1Config model)
- **mobilenet_v2** -- [MobileNetV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2ForImageClassification) (MobileNetV2Config model)
- **mobilevit** -- [MobileViTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTForImageClassification) (MobileViTConfig model)
- **mobilevitv2** -- [MobileViTV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2ForImageClassification) (MobileViTV2Config model)
- **perceiver** -- [PerceiverForImageClassificationLearned](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForImageClassificationLearned) or [PerceiverForImageClassificationFourier](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForImageClassificationFourier) or [PerceiverForImageClassificationConvProcessing](/docs/transformers/v5.8.0/en/model_doc/perceiver#transformers.PerceiverForImageClassificationConvProcessing) (PerceiverConfig model)
- **poolformer** -- [PoolFormerForImageClassification](/docs/transformers/v5.8.0/en/model_doc/poolformer#transformers.PoolFormerForImageClassification) (PoolFormerConfig model)
- **pp_lcnet** -- [PPLCNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/pp_lcnet#transformers.PPLCNetForImageClassification) (PPLCNetConfig model)
- **pvt** -- [PvtForImageClassification](/docs/transformers/v5.8.0/en/model_doc/pvt#transformers.PvtForImageClassification) (PvtConfig model)
- **pvt_v2** -- [PvtV2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/pvt_v2#transformers.PvtV2ForImageClassification) (PvtV2Config model)
- **regnet** -- [RegNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/regnet#transformers.RegNetForImageClassification) (RegNetConfig model)
- **resnet** -- [ResNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/resnet#transformers.ResNetForImageClassification) (ResNetConfig model)
- **segformer** -- [SegformerForImageClassification](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerForImageClassification) (SegformerConfig model)
- **shieldgemma2** -- [ShieldGemma2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/shieldgemma2#transformers.ShieldGemma2ForImageClassification) (ShieldGemma2Config model)
- **siglip** -- [SiglipForImageClassification](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipForImageClassification) (SiglipConfig model)
- **siglip2** -- [Siglip2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2ForImageClassification) (Siglip2Config model)
- **swiftformer** -- [SwiftFormerForImageClassification](/docs/transformers/v5.8.0/en/model_doc/swiftformer#transformers.SwiftFormerForImageClassification) (SwiftFormerConfig model)
- **swin** -- [SwinForImageClassification](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinForImageClassification) (SwinConfig model)
- **swinv2** -- [Swinv2ForImageClassification](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2ForImageClassification) (Swinv2Config model)
- **textnet** -- [TextNetForImageClassification](/docs/transformers/v5.8.0/en/model_doc/textnet#transformers.TextNetForImageClassification) (TextNetConfig model)
- **timm_wrapper** -- [TimmWrapperForImageClassification](/docs/transformers/v5.8.0/en/model_doc/timm_wrapper#transformers.TimmWrapperForImageClassification) (TimmWrapperConfig model)
- **vit** -- [ViTForImageClassification](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTForImageClassification) (ViTConfig model)
- **vit_msn** -- [ViTMSNForImageClassification](/docs/transformers/v5.8.0/en/model_doc/vit_msn#transformers.ViTMSNForImageClassification) (ViTMSNConfig model)
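
The `model_type`-based selection above can be sketched as a simple lookup (a hypothetical helper with a small subset of the real mapping; the actual library resolves each key to a model class rather than a name):

```python
# Minimal sketch of how from_pretrained dispatches on the `model_type` key
# found in a checkpoint's config.json. The mapping below is a hypothetical
# subset; the real one covers every entry listed above.
MODEL_TYPE_TO_CLASS = {
    "vit": "ViTForImageClassification",
    "resnet": "ResNetForImageClassification",
}

def resolve_model_class(config_dict):
    """Return the class name selected for a parsed config.json dict."""
    model_type = config_dict.get("model_type")
    if model_type not in MODEL_TYPE_TO_CLASS:
        raise ValueError(f"Unrecognized model_type: {model_type!r}")
    return MODEL_TYPE_TO_CLASS[model_type]

# config.json for a ViT checkpoint contains "model_type": "vit"
assert resolve_model_class({"model_type": "vit"}) == "ViTForImageClassification"
```

When `model_type` is missing from the config, the library additionally falls back to pattern matching on the checkpoint name, which this sketch omits.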

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True
```
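
The eval/train toggle described above can be sketched with a plain torch module standing in for a loaded classification model (a minimal illustration, not the library's code):

```python
import torch
from torch import nn

# from_pretrained() returns the model already in eval mode. Here a plain
# nn.Sequential with a Dropout layer stands in for a loaded model to show
# what eval()/train() actually toggle.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

model.eval()                    # what from_pretrained() does for you
assert model.training is False  # dropout is a no-op: outputs are deterministic
x = torch.ones(1, 4)
with torch.no_grad():
    assert torch.equal(model(x), model(x))

model.train()                   # switch back before fine-tuning
assert model.training is True
```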

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForVideoClassification[[transformers.AutoModelForVideoClassification]]

#### transformers.AutoModelForVideoClassification[[transformers.AutoModelForVideoClassification]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2214)

This is a generic model class that will be instantiated as one of the model classes of the library (with a video classification head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
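
The no-direct-instantiation guard can be sketched in plain Python (hypothetical class and method bodies; the library's actual implementation differs):

```python
# Minimal sketch of the auto-class pattern: __init__ raises, and instances are
# only ever created through factory classmethods such as from_config().
class AutoModelSketch:
    _model_mapping = {}  # filled by register(); maps config class -> model class

    def __init__(self, *args, **kwargs):
        raise EnvironmentError(
            f"{self.__class__.__name__} is designed to be instantiated using "
            "the `from_pretrained` or `from_config` class methods."
        )

    @classmethod
    def register(cls, config_cls, model_cls):
        cls._model_mapping[config_cls] = model_cls

    @classmethod
    def from_config(cls, config):
        # Bypasses __init__ entirely: the concrete model class is constructed.
        return cls._model_mapping[type(config)](config)


class DummyConfig:
    pass

class DummyModel:
    def __init__(self, config):
        self.config = config


AutoModelSketch.register(DummyConfig, DummyModel)
model = AutoModelSketch.from_config(DummyConfig())
assert isinstance(model, DummyModel)
```

The same shape explains why `AutoConfig.register`/`AutoModel.register` (described at the top of this page) is enough to plug custom classes into the dispatch.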

#### from_config[[transformers.AutoModelForVideoClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [TimesformerConfig](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerConfig) configuration class: [TimesformerForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerForVideoClassification) (TimesformerConfig model)
  - [VJEPA2Config](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2Config) configuration class: [VJEPA2ForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2ForVideoClassification) (VJEPA2Config model)
  - [VideoMAEConfig](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEConfig) configuration class: [VideoMAEForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEForVideoClassification) (VideoMAEConfig model)
  - [VivitConfig](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitConfig) configuration class: [VivitForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitForVideoClassification) (VivitConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a video classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("MCG-NJU/videomae-base")
>>> model = AutoModelForVideoClassification.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [TimesformerConfig](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerConfig) configuration class: [TimesformerForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerForVideoClassification) (TimesformerConfig model) - [VJEPA2Config](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2Config) configuration class: [VJEPA2ForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2ForVideoClassification) (VJEPA2Config model) - [VideoMAEConfig](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEConfig) configuration class: [VideoMAEForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEForVideoClassification) (VideoMAEConfig model) - [VivitConfig](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitConfig) configuration class: [VivitForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitForVideoClassification) (VivitConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForVideoClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a video classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **timesformer** -- [TimesformerForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/timesformer#transformers.TimesformerForVideoClassification) (TimesformerConfig model)
- **videomae** -- [VideoMAEForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/videomae#transformers.VideoMAEForVideoClassification) (VideoMAEConfig model)
- **vivit** -- [VivitForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/vivit#transformers.VivitForVideoClassification) (VivitConfig model)
- **vjepa2** -- [VJEPA2ForVideoClassification](/docs/transformers/v5.8.0/en/model_doc/vjepa2#transformers.VJEPA2ForVideoClassification) (VJEPA2Config model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
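
The eval/train distinction can be seen with a bare dropout layer rather than a full pretrained model; this minimal sketch only illustrates the `model.eval()` behavior mentioned above.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

model.eval()  # what from_pretrained() does by default
x = torch.ones(1, 4)
# Dropout is inactive in eval mode, so the forward pass is deterministic.
assert torch.equal(model(x), model(x))

model.train()  # re-enable dropout (and similar modules) before fine-tuning
print(model.training)  # → True
```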

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVideoClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForVideoClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
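
The kwargs routing described above can be sketched as follows. This is a hypothetical simplification of the behavior, not the library's implementation; `DummyConfig` and `split_kwargs` are illustrative names.

```python
# Sketch of the kwargs routing: keys matching a configuration attribute
# override the config; the remainder is forwarded to the model's __init__.
class DummyConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 16

def split_kwargs(config, **kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)   # config attribute -> override it
        else:
            model_kwargs[key] = value     # unknown key -> model __init__
    return config, model_kwargs

config, rest = split_kwargs(DummyConfig(), output_attentions=True, foo=1)
print(config.output_attentions, rest)  # → True {'foo': 1}
```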

### AutoModelForKeypointDetection[[transformers.AutoModelForKeypointDetection]]

#### transformers.AutoModelForKeypointDetection[[transformers.AutoModelForKeypointDetection]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L1981)

### AutoModelForKeypointMatching[[transformers.AutoModelForKeypointMatching]]

#### transformers.AutoModelForKeypointMatching[[transformers.AutoModelForKeypointMatching]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L1985)

### AutoModelForMaskedImageModeling[[transformers.AutoModelForMaskedImageModeling]]

#### transformers.AutoModelForMaskedImageModeling[[transformers.AutoModelForMaskedImageModeling]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2296)

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForMaskedImageModeling.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [DeiTConfig](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTForMaskedImageModeling) (DeiTConfig model)
  - [FocalNetConfig](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetConfig) configuration class: [FocalNetForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetForMaskedImageModeling) (FocalNetConfig model)
  - [SwinConfig](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinConfig) configuration class: [SwinForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinForMaskedImageModeling) (SwinConfig model)
  - [Swinv2Config](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2ForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2ForMaskedImageModeling) (Swinv2Config model)
  - [ViTConfig](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTConfig) configuration class: [ViTForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTForMaskedImageModeling) (ViTConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMaskedImageModeling.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [DeiTConfig](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTConfig) configuration class: [DeiTForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTForMaskedImageModeling) (DeiTConfig model) - [FocalNetConfig](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetConfig) configuration class: [FocalNetForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetForMaskedImageModeling) (FocalNetConfig model) - [SwinConfig](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinConfig) configuration class: [SwinForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinForMaskedImageModeling) (SwinConfig model) - [Swinv2Config](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2Config) configuration class: [Swinv2ForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2ForMaskedImageModeling) (Swinv2Config model) - [ViTConfig](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTConfig) configuration class: [ViTForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTForMaskedImageModeling) (ViTConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForMaskedImageModeling.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a masked image modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **deit** -- [DeiTForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/deit#transformers.DeiTForMaskedImageModeling) (DeiTConfig model)
- **focalnet** -- [FocalNetForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/focalnet#transformers.FocalNetForMaskedImageModeling) (FocalNetConfig model)
- **swin** -- [SwinForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/swin#transformers.SwinForMaskedImageModeling) (SwinConfig model)
- **swinv2** -- [Swinv2ForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/swinv2#transformers.Swinv2ForMaskedImageModeling) (Swinv2Config model)
- **vit** -- [ViTForMaskedImageModeling](/docs/transformers/v5.8.0/en/model_doc/vit#transformers.ViTForMaskedImageModeling) (ViTConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMaskedImageModeling.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForObjectDetection[[transformers.AutoModelForObjectDetection]]

#### transformers.AutoModelForObjectDetection[[transformers.AutoModelForObjectDetection]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2177)

This is a generic model class that will be instantiated as one of the model classes of the library (with an object detection head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForObjectDetection.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [ConditionalDetrConfig](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrConfig) configuration class: [ConditionalDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrForObjectDetection) (ConditionalDetrConfig model)
  - [DFineConfig](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineConfig) configuration class: [DFineForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineForObjectDetection) (DFineConfig model)
  - [DabDetrConfig](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrConfig) configuration class: [DabDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrForObjectDetection) (DabDetrConfig model)
  - [DeformableDetrConfig](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrConfig) configuration class: [DeformableDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrForObjectDetection) (DeformableDetrConfig model)
  - [Deimv2Config](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2Config) configuration class: [Deimv2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2ForObjectDetection) (Deimv2Config model)
  - [DetrConfig](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrForObjectDetection) (DetrConfig model)
  - [LwDetrConfig](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrConfig) configuration class: [LwDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrForObjectDetection) (LwDetrConfig model)
  - [PPDocLayoutV2Config](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v2#transformers.PPDocLayoutV2Config) configuration class: [PPDocLayoutV2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v2#transformers.PPDocLayoutV2ForObjectDetection) (PPDocLayoutV2Config model)
  - [PPDocLayoutV3Config](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3Config) configuration class: [PPDocLayoutV3ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3ForObjectDetection) (PPDocLayoutV3Config model)
  - [PPOCRV5MobileDetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_det#transformers.PPOCRV5MobileDetConfig) configuration class: [PPOCRV5MobileDetForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_det#transformers.PPOCRV5MobileDetForObjectDetection) (PPOCRV5MobileDetConfig model)
  - [PPOCRV5ServerDetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_det#transformers.PPOCRV5ServerDetConfig) configuration class: [PPOCRV5ServerDetForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_det#transformers.PPOCRV5ServerDetForObjectDetection) (PPOCRV5ServerDetConfig model)
  - [RTDetrConfig](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrConfig) configuration class: [RTDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrForObjectDetection) (RTDetrConfig model)
  - [RTDetrV2Config](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2Config) configuration class: [RTDetrV2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2ForObjectDetection) (RTDetrV2Config model)
  - [TableTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerConfig) configuration class: [TableTransformerForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerForObjectDetection) (TableTransformerConfig model)
  - [YolosConfig](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosConfig) configuration class: [YolosForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosForObjectDetection) (YolosConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an object detection head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForObjectDetection.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [ConditionalDetrConfig](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrConfig) configuration class: [ConditionalDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrForObjectDetection) (ConditionalDetrConfig model) - [DFineConfig](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineConfig) configuration class: [DFineForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineForObjectDetection) (DFineConfig model) - [DabDetrConfig](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrConfig) configuration class: [DabDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrForObjectDetection) (DabDetrConfig model) - [DeformableDetrConfig](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrConfig) configuration class: [DeformableDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrForObjectDetection) (DeformableDetrConfig model) - [Deimv2Config](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2Config) configuration class: [Deimv2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2ForObjectDetection) (Deimv2Config model) - [DetrConfig](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrForObjectDetection) (DetrConfig model) - [LwDetrConfig](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrConfig) configuration class: [LwDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrForObjectDetection) (LwDetrConfig model) - [PPDocLayoutV2Config](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v2#transformers.PPDocLayoutV2Config) configuration class: [PPDocLayoutV2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v2#transformers.PPDocLayoutV2ForObjectDetection) (PPDocLayoutV2Config model) - [PPDocLayoutV3Config](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3Config) configuration class: [PPDocLayoutV3ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3ForObjectDetection) (PPDocLayoutV3Config model) - [PPOCRV5MobileDetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_det#transformers.PPOCRV5MobileDetConfig) configuration class: [PPOCRV5MobileDetForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_det#transformers.PPOCRV5MobileDetForObjectDetection) (PPOCRV5MobileDetConfig model) - [PPOCRV5ServerDetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_det#transformers.PPOCRV5ServerDetConfig) configuration class: [PPOCRV5ServerDetForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_det#transformers.PPOCRV5ServerDetForObjectDetection) (PPOCRV5ServerDetConfig model) - [RTDetrConfig](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrConfig) configuration class: [RTDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrForObjectDetection) (RTDetrConfig model) - [RTDetrV2Config](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2Config) configuration class: [RTDetrV2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2ForObjectDetection) (RTDetrV2Config model) - [TableTransformerConfig](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerConfig) configuration class: [TableTransformerForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerForObjectDetection) (TableTransformerConfig model) - [YolosConfig](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosConfig) configuration class: [YolosForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosForObjectDetection) (YolosConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForObjectDetection.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with an object detection head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **conditional_detr** -- [ConditionalDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/conditional_detr#transformers.ConditionalDetrForObjectDetection) (ConditionalDetrConfig model)
- **d_fine** -- [DFineForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/d_fine#transformers.DFineForObjectDetection) (DFineConfig model)
- **dab-detr** -- [DabDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/dab-detr#transformers.DabDetrForObjectDetection) (DabDetrConfig model)
- **deformable_detr** -- [DeformableDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/deformable_detr#transformers.DeformableDetrForObjectDetection) (DeformableDetrConfig model)
- **deimv2** -- [Deimv2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/deimv2#transformers.Deimv2ForObjectDetection) (Deimv2Config model)
- **detr** -- [DetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrForObjectDetection) (DetrConfig model)
- **lw_detr** -- [LwDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/lw_detr#transformers.LwDetrForObjectDetection) (LwDetrConfig model)
- **pp_doclayout_v2** -- [PPDocLayoutV2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v2#transformers.PPDocLayoutV2ForObjectDetection) (PPDocLayoutV2Config model)
- **pp_doclayout_v3** -- [PPDocLayoutV3ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_doclayout_v3#transformers.PPDocLayoutV3ForObjectDetection) (PPDocLayoutV3Config model)
- **pp_ocrv5_mobile_det** -- [PPOCRV5MobileDetForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_mobile_det#transformers.PPOCRV5MobileDetForObjectDetection) (PPOCRV5MobileDetConfig model)
- **pp_ocrv5_server_det** -- [PPOCRV5ServerDetForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/pp_ocrv5_server_det#transformers.PPOCRV5ServerDetForObjectDetection) (PPOCRV5ServerDetConfig model)
- **rt_detr** -- [RTDetrForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/rt_detr#transformers.RTDetrForObjectDetection) (RTDetrConfig model)
- **rt_detr_v2** -- [RTDetrV2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/rt_detr_v2#transformers.RTDetrV2ForObjectDetection) (RTDetrV2Config model)
- **table-transformer** -- [TableTransformerForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/table-transformer#transformers.TableTransformerForObjectDetection) (TableTransformerConfig model)
- **yolos** -- [YolosForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/yolos#transformers.YolosForObjectDetection) (YolosConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
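
The eval/train toggle is standard PyTorch module behavior, not something specific to the auto classes; a quick illustration with a bare dropout layer:

```python
import torch.nn as nn

layer = nn.Dropout(p=0.5)

# from_pretrained() calls model.eval() for you; here we do it by hand.
layer.eval()
assert layer.training is False  # dropout is now a no-op

# Switch back before fine-tuning.
layer.train()
assert layer.training is True
```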

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForObjectDetection.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForObjectDetection.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForImageSegmentation[[transformers.AutoModelForImageSegmentation]]

#### transformers.AutoModelForImageSegmentation[[transformers.AutoModelForImageSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2134)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image segmentation head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForImageSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [DetrConfig](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForSegmentation](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrForSegmentation) (DetrConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an image segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForImageSegmentation.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [DetrConfig](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForSegmentation](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrForSegmentation) (DetrConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForImageSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with an image segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **detr** -- [DetrForSegmentation](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrForSegmentation) (DetrConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForImageSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForImageToImage[[transformers.AutoModelForImageToImage]]

#### transformers.AutoModelForImageToImage[[transformers.AutoModelForImageToImage]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L1993)

### AutoModelForSemanticSegmentation[[transformers.AutoModelForSemanticSegmentation]]

#### transformers.AutoModelForSemanticSegmentation[[transformers.AutoModelForSemanticSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2141)

This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSemanticSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [BeitConfig](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitConfig) configuration class: [BeitForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitForSemanticSegmentation) (BeitConfig model)
  - [DPTConfig](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTConfig) configuration class: [DPTForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTForSemanticSegmentation) (DPTConfig model)
  - [Data2VecVisionConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionForSemanticSegmentation) (Data2VecVisionConfig model)
  - [MobileNetV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2Config) configuration class: [MobileNetV2ForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2ForSemanticSegmentation) (MobileNetV2Config model)
  - [MobileViTConfig](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTConfig) configuration class: [MobileViTForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTForSemanticSegmentation) (MobileViTConfig model)
  - [MobileViTV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2Config) configuration class: [MobileViTV2ForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2ForSemanticSegmentation) (MobileViTV2Config model)
  - [SegformerConfig](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerConfig) configuration class: [SegformerForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerForSemanticSegmentation) (SegformerConfig model)
  - [UperNetConfig](/docs/transformers/v5.8.0/en/model_doc/upernet#transformers.UperNetConfig) configuration class: [UperNetForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/upernet#transformers.UperNetForSemanticSegmentation) (UperNetConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForSemanticSegmentation.from_config(config)
```
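
The config-to-model resolution can also be checked offline with a tiny config from the mapping above. A sketch using `SegformerConfig` (not from the library docs; every size below is arbitrary and much smaller than any released checkpoint):

```python
from transformers import AutoModelForSemanticSegmentation, SegformerConfig

# Deliberately tiny encoder -- hidden_sizes must be divisible by the
# per-stage attention head counts.
config = SegformerConfig(
    depths=[1, 1, 1, 1],
    hidden_sizes=[8, 16, 32, 64],
    num_attention_heads=[1, 1, 2, 2],
    decoder_hidden_size=32,
    num_labels=2,
)

# SegformerConfig resolves to SegformerForSemanticSegmentation.
model = AutoModelForSemanticSegmentation.from_config(config)
print(type(model).__name__)
```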

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [BeitConfig](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitConfig) configuration class: [BeitForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitForSemanticSegmentation) (BeitConfig model) - [DPTConfig](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTConfig) configuration class: [DPTForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTForSemanticSegmentation) (DPTConfig model) - [Data2VecVisionConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionConfig) configuration class: [Data2VecVisionForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionForSemanticSegmentation) (Data2VecVisionConfig model) - [MobileNetV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2Config) configuration class: [MobileNetV2ForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2ForSemanticSegmentation) (MobileNetV2Config model) - [MobileViTConfig](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTConfig) configuration class: [MobileViTForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTForSemanticSegmentation) (MobileViTConfig model) - [MobileViTV2Config](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2Config) configuration class: [MobileViTV2ForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2ForSemanticSegmentation) (MobileViTV2Config model) - [SegformerConfig](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerConfig) configuration class: [SegformerForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerForSemanticSegmentation) (SegformerConfig model) - [UperNetConfig](/docs/transformers/v5.8.0/en/model_doc/upernet#transformers.UperNetConfig) configuration class: [UperNetForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/upernet#transformers.UperNetForSemanticSegmentation) (UperNetConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForSemanticSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a semantic segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **beit** -- [BeitForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/beit#transformers.BeitForSemanticSegmentation) (BeitConfig model)
- **data2vec-vision** -- [Data2VecVisionForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecVisionForSemanticSegmentation) (Data2VecVisionConfig model)
- **dpt** -- [DPTForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/dpt#transformers.DPTForSemanticSegmentation) (DPTConfig model)
- **mobilenet_v2** -- [MobileNetV2ForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/mobilenet_v2#transformers.MobileNetV2ForSemanticSegmentation) (MobileNetV2Config model)
- **mobilevit** -- [MobileViTForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/mobilevit#transformers.MobileViTForSemanticSegmentation) (MobileViTConfig model)
- **mobilevitv2** -- [MobileViTV2ForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/mobilevitv2#transformers.MobileViTV2ForSemanticSegmentation) (MobileViTV2Config model)
- **segformer** -- [SegformerForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/segformer#transformers.SegformerForSemanticSegmentation) (SegformerConfig model)
- **upernet** -- [UperNetForSemanticSegmentation](/docs/transformers/v5.8.0/en/model_doc/upernet#transformers.UperNetForSemanticSegmentation) (UperNetConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSemanticSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForSemanticSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
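The configuration-driven dispatch described above can be exercised entirely offline by constructing a config locally and handing it to `from_config` (a sketch; the tiny sizes below are illustrative, not a recommended architecture):

```python
from transformers import AutoModel, BertConfig

# A deliberately tiny BERT config so instantiation is fast; all sizes
# here are illustrative only.
config = BertConfig(
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    vocab_size=100,
)

# The auto class dispatches on the configuration class: a BertConfig
# yields a BertModel, with randomly initialized weights and no download.
model = AutoModel.from_config(config)
print(type(model).__name__)  # prints "BertModel"
```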

### AutoModelForInstanceSegmentation[[transformers.AutoModelForInstanceSegmentation]]

#### transformers.AutoModelForInstanceSegmentation[[transformers.AutoModelForInstanceSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2168)

This is a generic model class that will be instantiated as one of the model classes of the library (with an instance segmentation head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForInstanceSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [MaskFormerConfig](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerConfig) configuration class: [MaskFormerForInstanceSegmentation](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerForInstanceSegmentation) (MaskFormerConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an instance segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForInstanceSegmentation.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [MaskFormerConfig](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerConfig) configuration class: [MaskFormerForInstanceSegmentation](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerForInstanceSegmentation) (MaskFormerConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.AutoModelForInstanceSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with an instance segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **maskformer** -- [MaskFormerForInstanceSegmentation](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerForInstanceSegmentation) (MaskFormerConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
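The evaluation/training toggle described above can be checked directly on any instantiated model; the sketch below uses a tiny locally built config purely for illustration, so nothing is downloaded:

```python
from transformers import AutoModel, BertConfig

# Tiny illustrative config; from_config gives random weights, no download.
config = BertConfig(hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64,
                    vocab_size=100)
model = AutoModel.from_config(config)

model.eval()                # what from_pretrained does for you after loading
assert not model.training   # dropout and similar modules are now disabled

model.train()               # switch back before fine-tuning
assert model.training
```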

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForInstanceSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForInstanceSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForUniversalSegmentation[[transformers.AutoModelForUniversalSegmentation]]

#### transformers.AutoModelForUniversalSegmentation[[transformers.AutoModelForUniversalSegmentation]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2159)

This is a generic model class that will be instantiated as one of the model classes of the library (with a universal image segmentation head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForUniversalSegmentation.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [DetrConfig](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForSegmentation](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrForSegmentation) (DetrConfig model)
  - [EomtConfig](/docs/transformers/v5.8.0/en/model_doc/eomt#transformers.EomtConfig) configuration class: [EomtForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/eomt#transformers.EomtForUniversalSegmentation) (EomtConfig model)
  - [EomtDinov3Config](/docs/transformers/v5.8.0/en/model_doc/eomt_dinov3#transformers.EomtDinov3Config) configuration class: [EomtDinov3ForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/eomt_dinov3#transformers.EomtDinov3ForUniversalSegmentation) (EomtDinov3Config model)
  - [Mask2FormerConfig](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerConfig) configuration class: [Mask2FormerForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerForUniversalSegmentation) (Mask2FormerConfig model)
  - [MaskFormerConfig](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerConfig) configuration class: [MaskFormerForInstanceSegmentation](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerForInstanceSegmentation) (MaskFormerConfig model)
  - [OneFormerConfig](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerConfig) configuration class: [OneFormerForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerForUniversalSegmentation) (OneFormerConfig model)
  - [VideomtConfig](/docs/transformers/v5.8.0/en/model_doc/videomt#transformers.VideomtConfig) configuration class: [VideomtForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/videomt#transformers.VideomtForUniversalSegmentation) (VideomtConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a universal image segmentation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForUniversalSegmentation.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [DetrConfig](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrConfig) configuration class: [DetrForSegmentation](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrForSegmentation) (DetrConfig model) - [EomtConfig](/docs/transformers/v5.8.0/en/model_doc/eomt#transformers.EomtConfig) configuration class: [EomtForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/eomt#transformers.EomtForUniversalSegmentation) (EomtConfig model) - [EomtDinov3Config](/docs/transformers/v5.8.0/en/model_doc/eomt_dinov3#transformers.EomtDinov3Config) configuration class: [EomtDinov3ForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/eomt_dinov3#transformers.EomtDinov3ForUniversalSegmentation) (EomtDinov3Config model) - [Mask2FormerConfig](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerConfig) configuration class: [Mask2FormerForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerForUniversalSegmentation) (Mask2FormerConfig model) - [MaskFormerConfig](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerConfig) configuration class: [MaskFormerForInstanceSegmentation](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerForInstanceSegmentation) (MaskFormerConfig model) - [OneFormerConfig](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerConfig) configuration class: [OneFormerForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerForUniversalSegmentation) (OneFormerConfig model) - [VideomtConfig](/docs/transformers/v5.8.0/en/model_doc/videomt#transformers.VideomtConfig) configuration class: [VideomtForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/videomt#transformers.VideomtForUniversalSegmentation) (VideomtConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

#### from_pretrained[[transformers.AutoModelForUniversalSegmentation.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a universal image segmentation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **detr** -- [DetrForSegmentation](/docs/transformers/v5.8.0/en/model_doc/detr#transformers.DetrForSegmentation) (DetrConfig model)
- **eomt** -- [EomtForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/eomt#transformers.EomtForUniversalSegmentation) (EomtConfig model)
- **eomt_dinov3** -- [EomtDinov3ForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/eomt_dinov3#transformers.EomtDinov3ForUniversalSegmentation) (EomtDinov3Config model)
- **mask2former** -- [Mask2FormerForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/mask2former#transformers.Mask2FormerForUniversalSegmentation) (Mask2FormerConfig model)
- **maskformer** -- [MaskFormerForInstanceSegmentation](/docs/transformers/v5.8.0/en/model_doc/maskformer#transformers.MaskFormerForInstanceSegmentation) (MaskFormerConfig model)
- **oneformer** -- [OneFormerForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/oneformer#transformers.OneFormerForUniversalSegmentation) (OneFormerConfig model)
- **videomt** -- [VideomtForUniversalSegmentation](/docs/transformers/v5.8.0/en/model_doc/videomt#transformers.VideomtForUniversalSegmentation) (VideomtConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForUniversalSegmentation.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForUniversalSegmentation.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
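One of the loading paths described above, supplying a local directory whose *config.json* selects the model class, can be sketched entirely offline (the tiny config below is illustrative only):

```python
import tempfile

from transformers import AutoModel, BertConfig

# Tiny illustrative config; from_config gives random weights, no download.
config = BertConfig(hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64,
                    vocab_size=100)
model = AutoModel.from_config(config)

with tempfile.TemporaryDirectory() as tmp:
    model.save_pretrained(tmp)  # writes config.json + the weights file
    # from_pretrained reads config.json, sees model_type == "bert", and
    # instantiates the matching class -- no Hub access for a local path.
    reloaded = AutoModel.from_pretrained(tmp)

print(type(reloaded).__name__)  # prints "BertModel"
```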

### AutoModelForZeroShotImageClassification[[transformers.AutoModelForZeroShotImageClassification]]

#### transformers.AutoModelForZeroShotImageClassification[[transformers.AutoModelForZeroShotImageClassification]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2125)

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForZeroShotImageClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AlignConfig](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignConfig) configuration class: [AlignModel](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignModel) (AlignConfig model)
  - [AltCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPModel) (AltCLIPConfig model)
  - [Blip2Config](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForImageTextRetrieval](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForImageTextRetrieval) (Blip2Config model)
  - [BlipConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipModel) (BlipConfig model)
  - [CLIPConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPModel) (CLIPConfig model)
  - [CLIPSegConfig](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSegConfig model)
  - [ChineseCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPConfig) configuration class: [ChineseCLIPModel](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPModel) (ChineseCLIPConfig model)
  - [MetaClip2Config](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Config) configuration class: [MetaClip2Model](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Model) (MetaClip2Config model)
  - [Siglip2Config](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Config) configuration class: [Siglip2Model](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Model) (Siglip2Config model)
  - [SiglipConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipModel](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipModel) (SiglipConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForZeroShotImageClassification.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AlignConfig](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignConfig) configuration class: [AlignModel](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignModel) (AlignConfig model) - [AltCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPConfig) configuration class: [AltCLIPModel](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPModel) (AltCLIPConfig model) - [Blip2Config](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForImageTextRetrieval](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForImageTextRetrieval) (Blip2Config model) - [BlipConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipConfig) configuration class: [BlipModel](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipModel) (BlipConfig model) - [CLIPConfig](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPConfig) configuration class: [CLIPModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPModel) (CLIPConfig model) - [CLIPSegConfig](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegConfig) configuration class: [CLIPSegModel](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSegConfig model) - [ChineseCLIPConfig](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPConfig) configuration class: [ChineseCLIPModel](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPModel) (ChineseCLIPConfig model) - [MetaClip2Config](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Config) configuration class: [MetaClip2Model](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Model) (MetaClip2Config model) - [Siglip2Config](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Config) configuration class: [Siglip2Model](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Model) (Siglip2Config model) - [SiglipConfig](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipConfig) configuration class: [SiglipModel](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipModel) (SiglipConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
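To make the selection rule concrete: a `CLIPConfig` passed to `from_config()` resolves to a `CLIPModel`, per the mapping above. The sketch below is illustrative only and assumes `transformers` with PyTorch is installed; the model is randomly initialized and no weights are downloaded.

```python
from transformers import AutoModelForZeroShotImageClassification, CLIPConfig

# from_config() selects the model class from the configuration class:
# a CLIPConfig maps to CLIPModel in the table above. Weights are random.
config = CLIPConfig()
model = AutoModelForZeroShotImageClassification.from_config(config)
print(type(model).__name__)  # CLIPModel
```

Because only the configuration is used, this is a cheap way to check which architecture a given config resolves to before committing to a full weight download.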
#### from_pretrained[[transformers.AutoModelForZeroShotImageClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **align** -- [AlignModel](/docs/transformers/v5.8.0/en/model_doc/align#transformers.AlignModel) (AlignConfig model)
- **altclip** -- [AltCLIPModel](/docs/transformers/v5.8.0/en/model_doc/altclip#transformers.AltCLIPModel) (AltCLIPConfig model)
- **blip** -- [BlipModel](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipModel) (BlipConfig model)
- **blip-2** -- [Blip2ForImageTextRetrieval](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForImageTextRetrieval) (Blip2Config model)
- **chinese_clip** -- [ChineseCLIPModel](/docs/transformers/v5.8.0/en/model_doc/chinese_clip#transformers.ChineseCLIPModel) (ChineseCLIPConfig model)
- **clip** -- [CLIPModel](/docs/transformers/v5.8.0/en/model_doc/clip#transformers.CLIPModel) (CLIPConfig model)
- **clipseg** -- [CLIPSegModel](/docs/transformers/v5.8.0/en/model_doc/clipseg#transformers.CLIPSegModel) (CLIPSegConfig model)
- **metaclip_2** -- [MetaClip2Model](/docs/transformers/v5.8.0/en/model_doc/metaclip_2#transformers.MetaClip2Model) (MetaClip2Config model)
- **siglip** -- [SiglipModel](/docs/transformers/v5.8.0/en/model_doc/siglip#transformers.SiglipModel) (SiglipConfig model)
- **siglip2** -- [Siglip2Model](/docs/transformers/v5.8.0/en/model_doc/siglip2#transformers.Siglip2Model) (Siglip2Config model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")

>>> # Update configuration during loading
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForZeroShotObjectDetection[[transformers.AutoModelForZeroShotObjectDetection]]

#### transformers.AutoModelForZeroShotObjectDetection[[transformers.AutoModelForZeroShotObjectDetection]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2184)

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot object detection head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForZeroShotObjectDetection.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

Instantiates one of the model classes of the library (with a zero-shot object detection head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/owlvit-base-patch32")
>>> model = AutoModelForZeroShotObjectDetection.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [GroundingDinoConfig](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoConfig) configuration class: [GroundingDinoForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoForObjectDetection) (GroundingDinoConfig model) - [MMGroundingDinoConfig](/docs/transformers/v5.8.0/en/model_doc/mm-grounding-dino#transformers.MMGroundingDinoConfig) configuration class: [MMGroundingDinoForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/mm-grounding-dino#transformers.MMGroundingDinoForObjectDetection) (MMGroundingDinoConfig model) - [OmDetTurboConfig](/docs/transformers/v5.8.0/en/model_doc/omdet-turbo#transformers.OmDetTurboConfig) configuration class: [OmDetTurboForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/omdet-turbo#transformers.OmDetTurboForObjectDetection) (OmDetTurboConfig model) - [OwlViTConfig](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTConfig) configuration class: [OwlViTForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTForObjectDetection) (OwlViTConfig model) - [Owlv2Config](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2Config) configuration class: [Owlv2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2ForObjectDetection) (Owlv2Config model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForZeroShotObjectDetection.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a zero-shot object detection head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **grounding-dino** -- [GroundingDinoForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/grounding-dino#transformers.GroundingDinoForObjectDetection) (GroundingDinoConfig model)
- **mm-grounding-dino** -- [MMGroundingDinoForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/mm-grounding-dino#transformers.MMGroundingDinoForObjectDetection) (MMGroundingDinoConfig model)
- **omdet-turbo** -- [OmDetTurboForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/omdet-turbo#transformers.OmDetTurboForObjectDetection) (OmDetTurboConfig model)
- **owlv2** -- [Owlv2ForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/owlv2#transformers.Owlv2ForObjectDetection) (Owlv2Config model)
- **owlvit** -- [OwlViTForObjectDetection](/docs/transformers/v5.8.0/en/model_doc/owlvit#transformers.OwlViTForObjectDetection) (OwlViTConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32")

>>> # Update configuration during loading
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

## Audio

The following auto classes are available for the following audio tasks.

### AutoModelForAudioClassification[[transformers.AutoModelForAudioClassification]]

#### transformers.AutoModelForAudioClassification[[transformers.AutoModelForAudioClassification]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2245)

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForAudioClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

Instantiates one of the model classes of the library (with an audio classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/wav2vec2-base-960h")
>>> model = AutoModelForAudioClassification.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [ASTConfig](/docs/transformers/v5.8.0/en/model_doc/audio-spectrogram-transformer#transformers.ASTConfig) configuration class: [ASTForAudioClassification](/docs/transformers/v5.8.0/en/model_doc/audio-spectrogram-transformer#transformers.ASTForAudioClassification) (ASTConfig model) - [Data2VecAudioConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForSequenceClassification) (Data2VecAudioConfig model) - [HubertConfig](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertConfig) configuration class: [HubertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertForSequenceClassification) (HubertConfig model) - [SEWConfig](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWConfig) configuration class: [SEWForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWForSequenceClassification) (SEWConfig model) - [SEWDConfig](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDConfig) configuration class: [SEWDForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDForSequenceClassification) (SEWDConfig model) - [UniSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechConfig) configuration class: [UniSpeechForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechForSequenceClassification) (UniSpeechConfig model) - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: [UniSpeechSatForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForSequenceClassification) (UniSpeechSatConfig model) - [Wav2Vec2BertConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertConfig) configuration class: [Wav2Vec2BertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForSequenceClassification) (Wav2Vec2BertConfig model) - [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForSequenceClassification) (Wav2Vec2Config model) - [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForSequenceClassification) (Wav2Vec2ConformerConfig model) - [WavLMConfig](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMConfig) configuration class: [WavLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForSequenceClassification) (WavLMConfig model) - [WhisperConfig](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperForAudioClassification](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperForAudioClassification) (WhisperConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
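Since a randomly initialized model can still run a forward pass, the input and output shapes of an audio classification head can be checked without downloading any weights. A minimal sketch, assuming `transformers` with PyTorch is installed; `num_labels=4` is an arbitrary illustrative choice.

```python
import torch
from transformers import AutoModelForAudioClassification, Wav2Vec2Config

# A Wav2Vec2Config maps to Wav2Vec2ForSequenceClassification; the weights
# are random, so the logits are meaningless. This only shows the shapes.
config = Wav2Vec2Config(num_labels=4)
model = AutoModelForAudioClassification.from_config(config)
model.eval()

wave = torch.randn(1, 16000)  # one second of fake 16 kHz mono audio
with torch.no_grad():
    logits = model(input_values=wave).logits
print(logits.shape)  # torch.Size([1, 4])
```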
#### from_pretrained[[transformers.AutoModelForAudioClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **audio-spectrogram-transformer** -- [ASTForAudioClassification](/docs/transformers/v5.8.0/en/model_doc/audio-spectrogram-transformer#transformers.ASTForAudioClassification) (ASTConfig model)
- **data2vec-audio** -- [Data2VecAudioForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForSequenceClassification) (Data2VecAudioConfig model)
- **hubert** -- [HubertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertForSequenceClassification) (HubertConfig model)
- **sew** -- [SEWForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWForSequenceClassification) (SEWConfig model)
- **sew-d** -- [SEWDForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDForSequenceClassification) (SEWDConfig model)
- **unispeech** -- [UniSpeechForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechForSequenceClassification) (UniSpeechConfig model)
- **unispeech-sat** -- [UniSpeechSatForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForSequenceClassification) (UniSpeechSatConfig model)
- **wav2vec2** -- [Wav2Vec2ForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForSequenceClassification) (Wav2Vec2Config model)
- **wav2vec2-bert** -- [Wav2Vec2BertForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForSequenceClassification) (Wav2Vec2BertConfig model)
- **wav2vec2-conformer** -- [Wav2Vec2ConformerForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForSequenceClassification) (Wav2Vec2ConformerConfig model)
- **wavlm** -- [WavLMForSequenceClassification](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForSequenceClassification) (WavLMConfig model)
- **whisper** -- [WhisperForAudioClassification](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperForAudioClassification) (WhisperConfig model)
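
The two-step selection described above (an exact match on the config's `model_type`, then pattern matching on the name/path as a fallback) can be sketched as follows. The mapping contents and the helper name are illustrative only, not the real `transformers` internals:

```python
# Illustrative subset of the model_type -> class mapping above.
AUDIO_CLASSIFICATION_MAPPING = {
    "wav2vec2": "Wav2Vec2ForSequenceClassification",
    "whisper": "WhisperForAudioClassification",
    "hubert": "HubertForSequenceClassification",
}

def resolve_model_class(model_type, name_or_path):
    # Primary: exact match on config.model_type.
    if model_type in AUDIO_CLASSIFICATION_MAPPING:
        return AUDIO_CLASSIFICATION_MAPPING[model_type]
    # Fallback: pattern matching on the name/path. Longer keys are tried
    # first so a more specific type like "wav2vec2-bert" would win over
    # "wav2vec2" if both matched.
    for key in sorted(AUDIO_CLASSIFICATION_MAPPING, key=len, reverse=True):
        if key in name_or_path:
            return AUDIO_CLASSIFICATION_MAPPING[key]
    raise ValueError(f"Unrecognized model: {name_or_path}")

print(resolve_model_class("whisper", "openai/whisper-base"))   # WhisperForAudioClassification
print(resolve_model_class(None, "facebook/wav2vec2-base"))     # Wav2Vec2ForSequenceClassification
```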

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioClassification.from_pretrained("facebook/wav2vec2-base")

>>> # Update configuration during loading
>>> model = AutoModelForAudioClassification.from_pretrained("facebook/wav2vec2-base", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of one loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, check whether using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) would not be a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
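
The `kwargs` routing described above (configuration attributes are overridden, everything else is forwarded to the model's `__init__`) can be sketched as a simple split. The helper below is hypothetical, not the actual library code:

```python
# Hypothetical sketch of how from_pretrained splits **kwargs when no
# explicit `config` is passed.
def split_kwargs(config_attrs, kwargs):
    config_updates, model_kwargs = {}, {}
    for key, value in kwargs.items():
        if key in config_attrs:
            config_updates[key] = value   # overrides a config attribute
        else:
            model_kwargs[key] = value     # forwarded to model __init__
    return config_updates, model_kwargs

config_attrs = {"output_attentions", "hidden_size", "num_labels"}
cfg, extra = split_kwargs(config_attrs, {"output_attentions": True, "state_dict": None})
print(cfg)    # {'output_attentions': True}
print(extra)  # {'state_dict': None}
```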

### AutoModelForAudioFrameClassification[[transformers.AutoModelForAudioFrameClassification]]

#### transformers.AutoModelForAudioFrameClassification[[transformers.AutoModelForAudioFrameClassification]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2268)

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio frame (token) classification head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForAudioFrameClassification.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [Data2VecAudioConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForAudioFrameClassification) (Data2VecAudioConfig model)
  - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: [UniSpeechSatForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForAudioFrameClassification) (UniSpeechSatConfig model)
  - [Wav2Vec2BertConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertConfig) configuration class: [Wav2Vec2BertForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForAudioFrameClassification) (Wav2Vec2BertConfig model)
  - [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2ForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForAudioFrameClassification) (Wav2Vec2Config model)
  - [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForAudioFrameClassification) (Wav2Vec2ConformerConfig model)
  - [WavLMConfig](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMConfig) configuration class: [WavLMForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForAudioFrameClassification) (WavLMConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an audio frame (token) classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/wav2vec2-base")
>>> model = AutoModelForAudioFrameClassification.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [Data2VecAudioConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForAudioFrameClassification) (Data2VecAudioConfig model) - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: [UniSpeechSatForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForAudioFrameClassification) (UniSpeechSatConfig model) - [Wav2Vec2BertConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertConfig) configuration class: [Wav2Vec2BertForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForAudioFrameClassification) (Wav2Vec2BertConfig model) - [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2ForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForAudioFrameClassification) (Wav2Vec2Config model) - [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForAudioFrameClassification) (Wav2Vec2ConformerConfig model) - [WavLMConfig](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMConfig) configuration class: [WavLMForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForAudioFrameClassification) (WavLMConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForAudioFrameClassification.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with an audio frame (token) classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-audio** -- [Data2VecAudioForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForAudioFrameClassification) (Data2VecAudioConfig model)
- **unispeech-sat** -- [UniSpeechSatForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForAudioFrameClassification) (UniSpeechSatConfig model)
- **wav2vec2** -- [Wav2Vec2ForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForAudioFrameClassification) (Wav2Vec2Config model)
- **wav2vec2-bert** -- [Wav2Vec2BertForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForAudioFrameClassification) (Wav2Vec2BertConfig model)
- **wav2vec2-conformer** -- [Wav2Vec2ConformerForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForAudioFrameClassification) (Wav2Vec2ConformerConfig model)
- **wavlm** -- [WavLMForAudioFrameClassification](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForAudioFrameClassification) (WavLMConfig model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioFrameClassification.from_pretrained("facebook/wav2vec2-base")

>>> # Update configuration during loading
>>> model = AutoModelForAudioFrameClassification.from_pretrained("facebook/wav2vec2-base", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of one loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, check whether using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) would not be a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it is loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForCTC[[transformers.AutoModelForCTC]]

#### transformers.AutoModelForCTC[[transformers.AutoModelForCTC]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2252)

This is a generic model class that will be instantiated as one of the model classes of the library (with a connectionist temporal classification head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForCTC.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [Data2VecAudioConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForCTC](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForCTC) (Data2VecAudioConfig model)
  - [HubertConfig](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertConfig) configuration class: [HubertForCTC](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertForCTC) (HubertConfig model)
  - [LasrCTCConfig](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrCTCConfig) configuration class: [LasrForCTC](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrForCTC) (LasrCTCConfig model)
  - [ParakeetCTCConfig](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetCTCConfig) configuration class: [ParakeetForCTC](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetForCTC) (ParakeetCTCConfig model)
  - [SEWConfig](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWConfig) configuration class: [SEWForCTC](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWForCTC) (SEWConfig model)
  - [SEWDConfig](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDConfig) configuration class: [SEWDForCTC](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDForCTC) (SEWDConfig model)
  - [UniSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechConfig) configuration class: [UniSpeechForCTC](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechForCTC) (UniSpeechConfig model)
  - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: [UniSpeechSatForCTC](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForCTC) (UniSpeechSatConfig model)
  - [Wav2Vec2BertConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertConfig) configuration class: [Wav2Vec2BertForCTC](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForCTC) (Wav2Vec2BertConfig model)
  - [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2ForCTC](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForCTC) (Wav2Vec2Config model)
  - [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerForCTC](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForCTC) (Wav2Vec2ConformerConfig model)
  - [WavLMConfig](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMConfig) configuration class: [WavLMForCTC](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForCTC) (WavLMConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a connectionist temporal classification head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCTC

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/wav2vec2-base-960h")
>>> model = AutoModelForCTC.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [Data2VecAudioConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForCTC](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForCTC) (Data2VecAudioConfig model) - [HubertConfig](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertConfig) configuration class: [HubertForCTC](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertForCTC) (HubertConfig model) - [LasrCTCConfig](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrCTCConfig) configuration class: [LasrForCTC](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrForCTC) (LasrCTCConfig model) - [ParakeetCTCConfig](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetCTCConfig) configuration class: [ParakeetForCTC](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetForCTC) (ParakeetCTCConfig model) - [SEWConfig](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWConfig) configuration class: [SEWForCTC](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWForCTC) (SEWConfig model) - [SEWDConfig](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDConfig) configuration class: [SEWDForCTC](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDForCTC) (SEWDConfig model) - [UniSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechConfig) configuration class: [UniSpeechForCTC](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechForCTC) (UniSpeechConfig model) - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: 
[UniSpeechSatForCTC](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForCTC) (UniSpeechSatConfig model) - [Wav2Vec2BertConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertConfig) configuration class: [Wav2Vec2BertForCTC](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForCTC) (Wav2Vec2BertConfig model) - [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2ForCTC](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForCTC) (Wav2Vec2Config model) - [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerForCTC](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForCTC) (Wav2Vec2ConformerConfig model) - [WavLMConfig](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMConfig) configuration class: [WavLMForCTC](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForCTC) (WavLMConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForCTC.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a connectionist temporal classification head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-audio** -- [Data2VecAudioForCTC](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForCTC) (Data2VecAudioConfig model)
- **hubert** -- [HubertForCTC](/docs/transformers/v5.8.0/en/model_doc/hubert#transformers.HubertForCTC) (HubertConfig model)
- **lasr_ctc** -- [LasrForCTC](/docs/transformers/v5.8.0/en/model_doc/lasr#transformers.LasrForCTC) (LasrCTCConfig model)
- **parakeet_ctc** -- [ParakeetForCTC](/docs/transformers/v5.8.0/en/model_doc/parakeet#transformers.ParakeetForCTC) (ParakeetCTCConfig model)
- **sew** -- [SEWForCTC](/docs/transformers/v5.8.0/en/model_doc/sew#transformers.SEWForCTC) (SEWConfig model)
- **sew-d** -- [SEWDForCTC](/docs/transformers/v5.8.0/en/model_doc/sew-d#transformers.SEWDForCTC) (SEWDConfig model)
- **unispeech** -- [UniSpeechForCTC](/docs/transformers/v5.8.0/en/model_doc/unispeech#transformers.UniSpeechForCTC) (UniSpeechConfig model)
- **unispeech-sat** -- [UniSpeechSatForCTC](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForCTC) (UniSpeechSatConfig model)
- **wav2vec2** -- [Wav2Vec2ForCTC](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForCTC) (Wav2Vec2Config model)
- **wav2vec2-bert** -- [Wav2Vec2BertForCTC](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForCTC) (Wav2Vec2BertConfig model)
- **wav2vec2-conformer** -- [Wav2Vec2ConformerForCTC](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForCTC) (Wav2Vec2ConformerConfig model)
- **wavlm** -- [WavLMForCTC](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForCTC) (WavLMConfig model)
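The two-stage dispatch described above (a `model_type` lookup with a pattern-matching fallback on the name/path) can be sketched roughly as follows. The mapping and helper names here are hypothetical stand-ins, not the actual `auto_factory` internals:

```python
# A minimal, illustrative sketch of the documented dispatch behavior.
# MODEL_TYPE_TO_CLASS and resolve_ctc_class are hypothetical names; the
# real logic lives in transformers' auto_factory module.

MODEL_TYPE_TO_CLASS = {
    "hubert": "HubertForCTC",
    "wav2vec2": "Wav2Vec2ForCTC",
    "wavlm": "WavLMForCTC",
}

def resolve_ctc_class(model_type, pretrained_model_name_or_path):
    # 1) Prefer the `model_type` property from the loaded config.
    if model_type in MODEL_TYPE_TO_CLASS:
        return MODEL_TYPE_TO_CLASS[model_type]
    # 2) When it's missing, fall back to pattern matching on the name/path,
    #    trying longer keys first so more specific types win.
    for key in sorted(MODEL_TYPE_TO_CLASS, key=len, reverse=True):
        if key in pretrained_model_name_or_path:
            return MODEL_TYPE_TO_CLASS[key]
    raise ValueError(f"Unrecognized model in {pretrained_model_name_or_path}")

print(resolve_ctc_class("wav2vec2", "facebook/wav2vec2-base-960h"))  # Wav2Vec2ForCTC
print(resolve_ctc_class(None, "microsoft/wavlm-base-plus"))          # WavLMForCTC
```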

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForCTC

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h")

>>> # Update configuration during loading
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info(`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only(`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
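The split of `kwargs` described above (keys matching configuration attributes override the config; the rest go to the model's `__init__`) can be illustrated with a small sketch. `FakeConfig` and `split_kwargs` are illustrative stand-ins, not transformers internals:

```python
# Illustrative sketch of how extra kwargs are routed when no explicit
# `config` is passed to from_pretrained(). Names here are hypothetical.

class FakeConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)   # overrides a configuration attribute
        else:
            model_kwargs[key] = value     # forwarded to the model's __init__
    return config, model_kwargs

config, model_kwargs = split_kwargs(
    FakeConfig(), {"output_attentions": True, "custom_arg": 1}
)
print(config.output_attentions)  # True
print(model_kwargs)              # {'custom_arg': 1}
```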

### AutoModelForSpeechSeq2Seq[[transformers.AutoModelForSpeechSeq2Seq]]

#### transformers.AutoModelForSpeechSeq2Seq[[transformers.AutoModelForSpeechSeq2Seq]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2259)

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForSpeechSeq2Seq.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [CohereAsrConfig](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrConfig) configuration class: [CohereAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrForConditionalGeneration) (CohereAsrConfig model)
  - [DiaConfig](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaConfig) configuration class: [DiaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaForConditionalGeneration) (DiaConfig model)
  - [GraniteSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechConfig) configuration class: [GraniteSpeechForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechForConditionalGeneration) (GraniteSpeechConfig model)
  - [GraniteSpeechPlusConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusConfig) configuration class: [GraniteSpeechPlusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusForConditionalGeneration) (GraniteSpeechPlusConfig model)
  - [KyutaiSpeechToTextConfig](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextConfig) configuration class: [KyutaiSpeechToTextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextForConditionalGeneration) (KyutaiSpeechToTextConfig model)
  - [MoonshineConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineConfig) configuration class: [MoonshineForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineForConditionalGeneration) (MoonshineConfig model)
  - [MoonshineStreamingConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingConfig) configuration class: [MoonshineStreamingForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingForConditionalGeneration) (MoonshineStreamingConfig model)
  - [Pop2PianoConfig](/docs/transformers/v5.8.0/en/model_doc/pop2piano#transformers.Pop2PianoConfig) configuration class: [Pop2PianoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pop2piano#transformers.Pop2PianoForConditionalGeneration) (Pop2PianoConfig model)
  - [SeamlessM4TConfig](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TConfig) configuration class: [SeamlessM4TForSpeechToText](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TForSpeechToText) (SeamlessM4TConfig model)
  - [SeamlessM4Tv2Config](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2Config) configuration class: [SeamlessM4Tv2ForSpeechToText](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2ForSpeechToText) (SeamlessM4Tv2Config model)
  - [Speech2TextConfig](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextConfig) configuration class: [Speech2TextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextForConditionalGeneration) (Speech2TextConfig model)
  - [SpeechEncoderDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/speech-encoder-decoder#transformers.SpeechEncoderDecoderConfig) configuration class: [SpeechEncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/speech-encoder-decoder#transformers.SpeechEncoderDecoderModel) (SpeechEncoderDecoderConfig model)
  - [SpeechT5Config](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5Config) configuration class: [SpeechT5ForSpeechToText](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5ForSpeechToText) (SpeechT5Config model)
  - [VibeVoiceAsrConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrConfig) configuration class: [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
  - [VoxtralConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralConfig) configuration class: [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
  - [VoxtralRealtimeConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeConfig) configuration class: [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
  - [WhisperConfig](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperForConditionalGeneration) (WhisperConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = AutoModelForSpeechSeq2Seq.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [CohereAsrConfig](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrConfig) configuration class: [CohereAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrForConditionalGeneration) (CohereAsrConfig model) - [DiaConfig](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaConfig) configuration class: [DiaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaForConditionalGeneration) (DiaConfig model) - [GraniteSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechConfig) configuration class: [GraniteSpeechForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechForConditionalGeneration) (GraniteSpeechConfig model) - [GraniteSpeechPlusConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusConfig) configuration class: [GraniteSpeechPlusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusForConditionalGeneration) (GraniteSpeechPlusConfig model) - [KyutaiSpeechToTextConfig](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextConfig) configuration class: [KyutaiSpeechToTextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextForConditionalGeneration) (KyutaiSpeechToTextConfig model) - [MoonshineConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineConfig) configuration class: [MoonshineForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineForConditionalGeneration) (MoonshineConfig model) - 
[MoonshineStreamingConfig](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingConfig) configuration class: [MoonshineStreamingForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingForConditionalGeneration) (MoonshineStreamingConfig model) - [Pop2PianoConfig](/docs/transformers/v5.8.0/en/model_doc/pop2piano#transformers.Pop2PianoConfig) configuration class: [Pop2PianoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pop2piano#transformers.Pop2PianoForConditionalGeneration) (Pop2PianoConfig model) - [SeamlessM4TConfig](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TConfig) configuration class: [SeamlessM4TForSpeechToText](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TForSpeechToText) (SeamlessM4TConfig model) - [SeamlessM4Tv2Config](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2Config) configuration class: [SeamlessM4Tv2ForSpeechToText](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2ForSpeechToText) (SeamlessM4Tv2Config model) - [Speech2TextConfig](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextConfig) configuration class: [Speech2TextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextForConditionalGeneration) (Speech2TextConfig model) - [SpeechEncoderDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/speech-encoder-decoder#transformers.SpeechEncoderDecoderConfig) configuration class: [SpeechEncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/speech-encoder-decoder#transformers.SpeechEncoderDecoderModel) (SpeechEncoderDecoderConfig model) - [SpeechT5Config](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5Config) configuration class: 
[SpeechT5ForSpeechToText](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5ForSpeechToText) (SpeechT5Config model) - [VibeVoiceAsrConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrConfig) configuration class: [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model) - [VoxtralConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralConfig) configuration class: [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model) - [VoxtralRealtimeConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeConfig) configuration class: [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model) - [WhisperConfig](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperConfig) configuration class: [WhisperForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperForConditionalGeneration) (WhisperConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForSpeechSeq2Seq.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **cohere_asr** -- [CohereAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/cohere_asr#transformers.CohereAsrForConditionalGeneration) (CohereAsrConfig model)
- **dia** -- [DiaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/dia#transformers.DiaForConditionalGeneration) (DiaConfig model)
- **granite_speech** -- [GraniteSpeechForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechForConditionalGeneration) (GraniteSpeechConfig model)
- **granite_speech_plus** -- [GraniteSpeechPlusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusForConditionalGeneration) (GraniteSpeechPlusConfig model)
- **kyutai_speech_to_text** -- [KyutaiSpeechToTextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextForConditionalGeneration) (KyutaiSpeechToTextConfig model)
- **moonshine** -- [MoonshineForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/moonshine#transformers.MoonshineForConditionalGeneration) (MoonshineConfig model)
- **moonshine_streaming** -- [MoonshineStreamingForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/moonshine_streaming#transformers.MoonshineStreamingForConditionalGeneration) (MoonshineStreamingConfig model)
- **pop2piano** -- [Pop2PianoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pop2piano#transformers.Pop2PianoForConditionalGeneration) (Pop2PianoConfig model)
- **seamless_m4t** -- [SeamlessM4TForSpeechToText](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t#transformers.SeamlessM4TForSpeechToText) (SeamlessM4TConfig model)
- **seamless_m4t_v2** -- [SeamlessM4Tv2ForSpeechToText](/docs/transformers/v5.8.0/en/model_doc/seamless_m4t_v2#transformers.SeamlessM4Tv2ForSpeechToText) (SeamlessM4Tv2Config model)
- **speech-encoder-decoder** -- [SpeechEncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/speech-encoder-decoder#transformers.SpeechEncoderDecoderModel) (SpeechEncoderDecoderConfig model)
- **speech_to_text** -- [Speech2TextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/speech_to_text#transformers.Speech2TextForConditionalGeneration) (Speech2TextConfig model)
- **speecht5** -- [SpeechT5ForSpeechToText](/docs/transformers/v5.8.0/en/model_doc/speecht5#transformers.SpeechT5ForSpeechToText) (SpeechT5Config model)
- **vibevoice_asr** -- [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
- **voxtral** -- [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
- **voxtral_realtime** -- [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
- **whisper** -- [WhisperForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/whisper#transformers.WhisperForConditionalGeneration) (WhisperConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
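The effect of this toggle can be shown with a toy stand-in that mirrors the `nn.Module` semantics (a dropout-style layer checks a `training` flag that `eval()` clears and `train()` sets); `ToyDropout` is illustrative only, not the torch implementation:

```python
# Toy stand-in for the train/eval switch: the dropout-like layer is an
# identity in evaluation mode and only drops values while training.
import random

class ToyDropout:
    def __init__(self, p=0.5):
        self.p = p
        self.training = True  # torch modules also default to training mode

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # dropout deactivated, as after from_pretrained()
        return [0.0 if random.random() < self.p else v for v in values]

layer = ToyDropout(p=1.0).eval()
print(layer([1.0, 2.0]))           # [1.0, 2.0] -- eval mode, identity
print(layer.train()([1.0, 2.0]))   # [0.0, 0.0] -- training mode, p=1.0 drops all
```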

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info(`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only(`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForAudioXVector[[transformers.AutoModelForAudioXVector]]

#### transformers.AutoModelForAudioXVector[[transformers.AutoModelForAudioXVector]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2277)

This is a generic model class that will be instantiated as one of the model classes of the library (with an x-vector head for audio retrieval) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForAudioXVector.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [Data2VecAudioConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForXVector](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForXVector) (Data2VecAudioConfig model)
  - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: [UniSpeechSatForXVector](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForXVector) (UniSpeechSatConfig model)
  - [Wav2Vec2BertConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertConfig) configuration class: [Wav2Vec2BertForXVector](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForXVector) (Wav2Vec2BertConfig model)
  - [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2ForXVector](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForXVector) (Wav2Vec2Config model)
  - [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerForXVector](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForXVector) (Wav2Vec2ConformerConfig model)
  - [WavLMConfig](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMConfig) configuration class: [WavLMForXVector](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForXVector) (WavLMConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with an x-vector head for audio retrieval) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioXVector

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/wav2vec2-base")
>>> model = AutoModelForAudioXVector.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [Data2VecAudioConfig](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioConfig) configuration class: [Data2VecAudioForXVector](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForXVector) (Data2VecAudioConfig model) - [UniSpeechSatConfig](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatConfig) configuration class: [UniSpeechSatForXVector](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForXVector) (UniSpeechSatConfig model) - [Wav2Vec2BertConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertConfig) configuration class: [Wav2Vec2BertForXVector](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForXVector) (Wav2Vec2BertConfig model) - [Wav2Vec2Config](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2Config) configuration class: [Wav2Vec2ForXVector](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForXVector) (Wav2Vec2Config model) - [Wav2Vec2ConformerConfig](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerConfig) configuration class: [Wav2Vec2ConformerForXVector](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForXVector) (Wav2Vec2ConformerConfig model) - [WavLMConfig](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMConfig) configuration class: [WavLMForXVector](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForXVector) (WavLMConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForAudioXVector.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with an x-vector head for audio retrieval) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **data2vec-audio** -- [Data2VecAudioForXVector](/docs/transformers/v5.8.0/en/model_doc/data2vec#transformers.Data2VecAudioForXVector) (Data2VecAudioConfig model)
- **unispeech-sat** -- [UniSpeechSatForXVector](/docs/transformers/v5.8.0/en/model_doc/unispeech-sat#transformers.UniSpeechSatForXVector) (UniSpeechSatConfig model)
- **wav2vec2** -- [Wav2Vec2ForXVector](/docs/transformers/v5.8.0/en/model_doc/wav2vec2#transformers.Wav2Vec2ForXVector) (Wav2Vec2Config model)
- **wav2vec2-bert** -- [Wav2Vec2BertForXVector](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-bert#transformers.Wav2Vec2BertForXVector) (Wav2Vec2BertConfig model)
- **wav2vec2-conformer** -- [Wav2Vec2ConformerForXVector](/docs/transformers/v5.8.0/en/model_doc/wav2vec2-conformer#transformers.Wav2Vec2ConformerForXVector) (Wav2Vec2ConformerConfig model)
- **wavlm** -- [WavLMForXVector](/docs/transformers/v5.8.0/en/model_doc/wavlm#transformers.WavLMForXVector) (WavLMConfig model)
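The `model_type` dispatch and name-based fallback described above can be sketched as a plain mapping. This is purely illustrative: the string values stand in for the real model classes, and the library's actual fallback logic is more involved.

```python
# Illustrative sketch of how an Auto class picks a model class from
# `model_type`, falling back to pattern matching on the name/path.
# Keys mirror the list above; values are placeholder class names.
XVECTOR_MAPPING = {
    "data2vec-audio": "Data2VecAudioForXVector",
    "unispeech-sat": "UniSpeechSatForXVector",
    "wav2vec2": "Wav2Vec2ForXVector",
    "wav2vec2-bert": "Wav2Vec2BertForXVector",
    "wav2vec2-conformer": "Wav2Vec2ConformerForXVector",
    "wavlm": "WavLMForXVector",
}

def resolve_class(model_type, name_or_path):
    """Resolve via `model_type` when present, else match on the name/path."""
    if model_type is not None:
        return XVECTOR_MAPPING[model_type]
    # Longest matching key wins, so "wav2vec2-bert" beats "wav2vec2".
    candidates = [k for k in XVECTOR_MAPPING if k in name_or_path]
    if not candidates:
        raise ValueError(f"Could not infer a model type from {name_or_path!r}")
    return XVECTOR_MAPPING[max(candidates, key=len)]
```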

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioXVector

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioXVector.from_pretrained("facebook/wav2vec2-base")

>>> # Update configuration during loading
>>> model = AutoModelForAudioXVector.from_pretrained("facebook/wav2vec2-base", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
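As a rough illustration of the kwargs handling described above (the automatically-loaded-config case), keys that match configuration attributes override the config while the remainder goes to the model's `__init__`. The helper below is a hypothetical sketch, not library code, and the attribute names are examples only:

```python
# Hypothetical sketch of the kwargs split described above: keys matching
# configuration attributes update the config; the rest are forwarded to
# the model's __init__.
def split_kwargs(config_attrs, kwargs):
    overrides = {k: v for k, v in kwargs.items() if k in config_attrs}
    model_kwargs = {k: v for k, v in kwargs.items() if k not in config_attrs}
    return {**config_attrs, **overrides}, model_kwargs

config_attrs = {"output_attentions": False, "hidden_size": 768}
updated, model_kwargs = split_kwargs(
    config_attrs, {"output_attentions": True, "state_dict": None}
)
# updated now has output_attentions=True; state_dict is left for the model.
```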

### AutoModelForTextToSpectrogram[[transformers.AutoModelForTextToSpectrogram]]

#### transformers.AutoModelForTextToSpectrogram[[transformers.AutoModelForTextToSpectrogram]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2281)

### AutoModelForTextToWaveform[[transformers.AutoModelForTextToWaveform]]

#### transformers.AutoModelForTextToWaveform[[transformers.AutoModelForTextToWaveform]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2285)

### AutoModelForAudioTokenization[[transformers.AutoModelForAudioTokenization]]

#### transformers.AutoModelForAudioTokenization[[transformers.AutoModelForAudioTokenization]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2303)

This is a generic model class that will be instantiated as one of the model classes of the library (with a codebook-based audio tokenization head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).
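The "cannot be instantiated directly" behaviour is a small factory pattern. A minimal, hypothetical sketch (not the library's actual implementation):

```python
# Hypothetical sketch: __init__ always raises, and instances are only
# produced via an alternative constructor, mirroring how Auto classes
# force you through from_pretrained()/from_config().
class AutoLike:
    def __init__(self):
        raise OSError(
            f"{self.__class__.__name__} is designed to be instantiated "
            "using `from_pretrained()` or `from_config()`."
        )

    @classmethod
    def from_config(cls, config):
        # Bypass __init__ and attach the config; the real factory would
        # dispatch to a concrete model class here.
        obj = cls.__new__(cls)
        obj.config = config
        return obj
```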

#### from_config[[transformers.AutoModelForAudioTokenization.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [DacConfig](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacConfig) configuration class: [DacModel](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacModel) (DacConfig model)
  - [HiggsAudioV2TokenizerConfig](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerConfig) configuration class: [HiggsAudioV2TokenizerModel](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerModel) (HiggsAudioV2TokenizerConfig model)
  - [VibeVoiceAcousticTokenizerConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerConfig) configuration class: [VibeVoiceAcousticTokenizerModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerModel) (VibeVoiceAcousticTokenizerConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a codebook-based audio tokenization head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioTokenization

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("descript/dac_16khz")
>>> model = AutoModelForAudioTokenization.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [DacConfig](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacConfig) configuration class: [DacModel](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacModel) (DacConfig model) - [HiggsAudioV2TokenizerConfig](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerConfig) configuration class: [HiggsAudioV2TokenizerModel](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerModel) (HiggsAudioV2TokenizerConfig model) - [VibeVoiceAcousticTokenizerConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerConfig) configuration class: [VibeVoiceAcousticTokenizerModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerModel) (VibeVoiceAcousticTokenizerConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForAudioTokenization.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a codebook-based audio tokenization head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **dac** -- [DacModel](/docs/transformers/v5.8.0/en/model_doc/dac#transformers.DacModel) (DacConfig model)
- **higgs_audio_v2_tokenizer** -- [HiggsAudioV2TokenizerModel](/docs/transformers/v5.8.0/en/model_doc/higgs_audio_v2_tokenizer#transformers.HiggsAudioV2TokenizerModel) (HiggsAudioV2TokenizerConfig model)
- **vibevoice_acoustic_tokenizer** -- [VibeVoiceAcousticTokenizerModel](/docs/transformers/v5.8.0/en/model_doc/vibevoice_acoustic_tokenizer#transformers.VibeVoiceAcousticTokenizerModel) (VibeVoiceAcousticTokenizerConfig model)

The model is set in evaluation mode by default using `model.eval()` (so, for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForAudioTokenization

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioTokenization.from_pretrained("descript/dac_16khz")

>>> # Update configuration during loading
>>> model = AutoModelForAudioTokenization.from_pretrained("descript/dac_16khz", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, `code_revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

## Multimodal

The following auto classes are available for multimodal tasks.

### AutoModelForMultimodalLM[[transformers.AutoModelForMultimodalLM]]

#### transformers.AutoModelForMultimodalLM[[transformers.AutoModelForMultimodalLM]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2238)

This is a generic model class that will be instantiated as one of the model classes of the library (with a multimodal generation head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForMultimodalLM.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AriaConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaConfig) configuration class: [AriaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaForConditionalGeneration) (AriaConfig model)
  - [AyaVisionConfig](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionConfig) configuration class: [AyaVisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionForConditionalGeneration) (AyaVisionConfig model)
  - [Blip2Config](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (Blip2Config model)
  - [BlipConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipConfig) configuration class: [BlipForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipForConditionalGeneration) (BlipConfig model)
  - [ChameleonConfig](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonConfig) configuration class: [ChameleonForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonForConditionalGeneration) (ChameleonConfig model)
  - [Cohere2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionConfig) configuration class: [Cohere2VisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionForConditionalGeneration) (Cohere2VisionConfig model)
  - [DeepseekVLConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLConfig) configuration class: [DeepseekVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLForConditionalGeneration) (DeepseekVLConfig model)
  - [DeepseekVLHybridConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridConfig) configuration class: [DeepseekVLHybridForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridForConditionalGeneration) (DeepseekVLHybridConfig model)
  - [Emu3Config](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Config) configuration class: [Emu3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3ForConditionalGeneration) (Emu3Config model)
  - [Ernie4_5_VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeConfig) configuration class: [Ernie4_5_VLMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeForConditionalGeneration) (Ernie4_5_VLMoeConfig model)
  - [EvollaConfig](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaConfig) configuration class: [EvollaForProteinText2Text](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaForProteinText2Text) (EvollaConfig model)
  - [Exaone4_5_Config](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_Config) configuration class: [Exaone4_5_ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_ForConditionalGeneration) (Exaone4_5_Config model)
  - [FastVlmConfig](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmConfig) configuration class: [FastVlmForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmForConditionalGeneration) (FastVlmConfig model)
  - [Florence2Config](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Config) configuration class: [Florence2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2ForConditionalGeneration) (Florence2Config model)
  - [FuyuConfig](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuConfig) configuration class: [FuyuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuForCausalLM) (FuyuConfig model)
  - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model)
  - [Gemma3nConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nConfig) configuration class: [Gemma3nForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForConditionalGeneration) (Gemma3nConfig model)
  - [Gemma4Config](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Config) configuration class: [Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model)
  - [GitConfig](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitConfig) configuration class: [GitForCausalLM](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitForCausalLM) (GitConfig model)
  - [Glm46VConfig](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VConfig) configuration class: [Glm46VForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VForConditionalGeneration) (Glm46VConfig model)
  - [Glm4vConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vConfig) configuration class: [Glm4vForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vForConditionalGeneration) (Glm4vConfig model)
  - [Glm4vMoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeConfig) configuration class: [Glm4vMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeForConditionalGeneration) (Glm4vMoeConfig model)
  - [GlmAsrConfig](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrConfig) configuration class: [GlmAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrForConditionalGeneration) (GlmAsrConfig model)
  - [GlmOcrConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrConfig) configuration class: [GlmOcrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrForConditionalGeneration) (GlmOcrConfig model)
  - [GotOcr2Config](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Config) configuration class: [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (GotOcr2Config model)
  - [Granite4VisionConfig](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionConfig) configuration class: [Granite4VisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionForConditionalGeneration) (Granite4VisionConfig model)
  - [GraniteSpeechConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechConfig) configuration class: [GraniteSpeechForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechForConditionalGeneration) (GraniteSpeechConfig model)
  - [GraniteSpeechPlusConfig](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusConfig) configuration class: [GraniteSpeechPlusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusForConditionalGeneration) (GraniteSpeechPlusConfig model)
  - [Idefics2Config](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Config) configuration class: [Idefics2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2ForConditionalGeneration) (Idefics2Config model)
  - [Idefics3Config](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Config) configuration class: [Idefics3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3ForConditionalGeneration) (Idefics3Config model)
  - [IdeficsConfig](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsConfig) configuration class: [IdeficsForVisionText2Text](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsForVisionText2Text) (IdeficsConfig model)
  - [InstructBlipConfig](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipConfig) configuration class: [InstructBlipForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipForConditionalGeneration) (InstructBlipConfig model)
  - [InstructBlipVideoConfig](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoConfig) configuration class: [InstructBlipVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoForConditionalGeneration) (InstructBlipVideoConfig model)
  - [InternVLConfig](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLConfig) configuration class: [InternVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLForConditionalGeneration) (InternVLConfig model)
  - [JanusConfig](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusConfig) configuration class: [JanusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusForConditionalGeneration) (JanusConfig model)
  - [Kosmos2Config](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2Config) configuration class: [Kosmos2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2ForConditionalGeneration) (Kosmos2Config model)
  - [Kosmos2_5Config](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5Config) configuration class: [Kosmos2_5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5ForConditionalGeneration) (Kosmos2_5Config model)
  - [KyutaiSpeechToTextConfig](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextConfig) configuration class: [KyutaiSpeechToTextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextForConditionalGeneration) (KyutaiSpeechToTextConfig model)
  - [Lfm2VlConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlConfig) configuration class: [Lfm2VlForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlForConditionalGeneration) (Lfm2VlConfig model)
  - [LightOnOcrConfig](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrConfig) configuration class: [LightOnOcrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrForConditionalGeneration) (LightOnOcrConfig model)
  - [Llama4Config](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4Config) configuration class: [Llama4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForConditionalGeneration) (Llama4Config model)
  - [LlavaConfig](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaConfig) configuration class: [LlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaForConditionalGeneration) (LlavaConfig model)
  - [LlavaNextConfig](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextConfig) configuration class: [LlavaNextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextForConditionalGeneration) (LlavaNextConfig model)
  - [LlavaNextVideoConfig](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoConfig) configuration class: [LlavaNextVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoForConditionalGeneration) (LlavaNextVideoConfig model)
  - [LlavaOnevisionConfig](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionConfig) configuration class: [LlavaOnevisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionForConditionalGeneration) (LlavaOnevisionConfig model)
  - [MiniCPMV4_6Config](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6Config) configuration class: [MiniCPMV4_6ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6ForConditionalGeneration) (MiniCPMV4_6Config model)
  - [Mistral3Config](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Config) configuration class: [Mistral3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3ForConditionalGeneration) (Mistral3Config model)
  - [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: [Mistral4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForCausalLM) (Mistral4Config model)
  - [MllamaConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaConfig) configuration class: [MllamaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForConditionalGeneration) (MllamaConfig model)
  - [Ovis2Config](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2Config) configuration class: [Ovis2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2ForConditionalGeneration) (Ovis2Config model)
  - [PI0Config](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0Config) configuration class: [PI0ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0ForConditionalGeneration) (PI0Config model)
  - [PPChart2TableConfig](/docs/transformers/v5.8.0/en/model_doc/pp_chart2table#transformers.PPChart2TableConfig) configuration class: [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (PPChart2TableConfig model)
  - [PPFormulaNetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_formulanet#transformers.PPFormulaNetConfig) configuration class: [PPFormulaNetForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pp_formulanet#transformers.PPFormulaNetForConditionalGeneration) (PPFormulaNetConfig model)
  - [PaddleOCRVLConfig](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVLConfig) configuration class: [PaddleOCRVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVLForConditionalGeneration) (PaddleOCRVLConfig model)
  - [PaliGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: [PaliGemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemmaConfig model)
  - [PerceptionLMConfig](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMConfig) configuration class: [PerceptionLMForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMForConditionalGeneration) (PerceptionLMConfig model)
  - [Phi4MultimodalConfig](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalConfig) configuration class: [Phi4MultimodalForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalForCausalLM) (Phi4MultimodalConfig model)
  - [Pix2StructConfig](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructConfig) configuration class: [Pix2StructForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructForConditionalGeneration) (Pix2StructConfig model)
  - [QianfanOCRConfig](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRConfig) configuration class: [QianfanOCRForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRForConditionalGeneration) (QianfanOCRConfig model)
  - [Qwen2AudioConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioConfig) configuration class: [Qwen2AudioForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioForConditionalGeneration) (Qwen2AudioConfig model)
  - [Qwen2VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLConfig) configuration class: [Qwen2VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLForConditionalGeneration) (Qwen2VLConfig model)
  - [Qwen2_5OmniConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniConfig) configuration class: [Qwen2_5OmniForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniForConditionalGeneration) (Qwen2_5OmniConfig model)
  - [Qwen2_5OmniThinkerConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniThinkerConfig) configuration class: [Qwen2_5OmniThinkerForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniThinkerForConditionalGeneration) (Qwen2_5OmniThinkerConfig model)
  - [Qwen2_5_VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLConfig) configuration class: [Qwen2_5_VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLForConditionalGeneration) (Qwen2_5_VLConfig model)
  - [Qwen3OmniMoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeConfig) configuration class: [Qwen3OmniMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeForConditionalGeneration) (Qwen3OmniMoeConfig model)
  - [Qwen3OmniMoeThinkerConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeThinkerConfig) configuration class: [Qwen3OmniMoeThinkerForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeThinkerForConditionalGeneration) (Qwen3OmniMoeThinkerConfig model)
  - [Qwen3VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLConfig) configuration class: [Qwen3VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLForConditionalGeneration) (Qwen3VLConfig model)
  - [Qwen3VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeConfig) configuration class: [Qwen3VLMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeForConditionalGeneration) (Qwen3VLMoeConfig model)
  - [Qwen3_5Config](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Config) configuration class: [Qwen3_5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForConditionalGeneration) (Qwen3_5Config model)
  - [Qwen3_5MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeConfig) configuration class: [Qwen3_5MoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForConditionalGeneration) (Qwen3_5MoeConfig model)
  - [ShieldGemma2Config](/docs/transformers/v5.8.0/en/model_doc/shieldgemma2#transformers.ShieldGemma2Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (ShieldGemma2Config model)
  - [SmolVLMConfig](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMConfig) configuration class: [SmolVLMForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMForConditionalGeneration) (SmolVLMConfig model)
  - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForConditionalGeneration) (T5Gemma2Config model)
  - [UdopConfig](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopConfig) configuration class: [UdopForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopForConditionalGeneration) (UdopConfig model)
  - [VibeVoiceAsrConfig](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrConfig) configuration class: [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
  - [VideoLlama3Config](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3Config) configuration class: [VideoLlama3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3ForConditionalGeneration) (VideoLlama3Config model)
  - [VideoLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaConfig) configuration class: [VideoLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaForConditionalGeneration) (VideoLlavaConfig model)
  - [VipLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaConfig) configuration class: [VipLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaForConditionalGeneration) (VipLlavaConfig model)
  - [VisionEncoderDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderConfig) configuration class: [VisionEncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderModel) (VisionEncoderDecoderConfig model)
  - [VoxtralConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralConfig) configuration class: [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
  - [VoxtralRealtimeConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeConfig) configuration class: [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
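The fallback described above (SDPA when available on torch>=2.1.1, otherwise eager) amounts to a simple selection rule. As a minimal sketch, the function below is illustrative only, not the actual transformers internals; the helper name and availability flags are assumptions:

```python
# Illustrative sketch of the default attention-implementation fallback.
# The function name and flags are hypothetical, not transformers internals.

def select_attn_implementation(requested=None, torch_version=(2, 1, 1), sdpa_available=True):
    """Pick an attention implementation following the documented defaults."""
    valid = {"eager", "sdpa", "flash_attention_2", "flash_attention_3"}
    if requested is not None:
        # An explicit attn_implementation argument always wins.
        if requested not in valid:
            raise ValueError(f"Unknown attention implementation: {requested}")
        return requested
    # By default, use SDPA when available on torch >= 2.1.1; otherwise eager.
    if sdpa_available and torch_version >= (2, 1, 1):
        return "sdpa"
    return "eager"

print(select_attn_implementation())                         # sdpa
print(select_attn_implementation(torch_version=(2, 0, 0)))  # eager
print(select_attn_implementation("flash_attention_2"))      # flash_attention_2
```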

Instantiates one of the model classes of the library (with a multimodal generation head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMultimodalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("llava-hf/llava-1.5-7b-hf")
>>> model = AutoModelForMultimodalLM.from_config(config)
```
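The dispatch performed by `from_config` above is essentially a dictionary lookup keyed by the configuration class, as the mapping list suggests. A minimal sketch of that mechanism, using stand-in classes rather than the real transformers ones:

```python
# Minimal sketch of Auto-class dispatch: a registry keyed by config class.
# All class names below are stand-ins, not the real transformers classes.

class NewModelConfig:
    model_type = "new-model"

class NewModel:
    def __init__(self, config):
        self.config = config

class AutoModelSketch:
    _model_mapping = {}  # config class -> model class

    @classmethod
    def register(cls, config_class, model_class):
        cls._model_mapping[config_class] = model_class

    @classmethod
    def from_config(cls, config):
        # Look up the model class by the exact type of the config.
        try:
            model_class = cls._model_mapping[type(config)]
        except KeyError:
            raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
        return model_class(config)

AutoModelSketch.register(NewModelConfig, NewModel)
model = AutoModelSketch.from_config(NewModelConfig())
print(type(model).__name__)  # NewModel
```

An unregistered configuration class raises a `ValueError`, mirroring the behavior of the real auto classes when a config is not in their mapping.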

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class. The full configuration-to-model mapping is the list given above (for instance, a [LlavaConfig](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaConfig) instantiates a [LlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaForConditionalGeneration)).
[VipLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaConfig) configuration class: [VipLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaForConditionalGeneration) (VipLlavaConfig model) - [VisionEncoderDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderConfig) configuration class: [VisionEncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderModel) (VisionEncoderDecoderConfig model) - [VoxtralConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralConfig) configuration class: [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model) - [VoxtralRealtimeConfig](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeConfig) configuration class: [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
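The default selection described above (SDPA when available, otherwise eager) can be sketched as follows. This is an illustrative simplification, not the library's actual internal logic; the function name is made up for this example.

```python
def pick_attn_implementation(requested=None, sdpa_available=True):
    """Return the attention backend to use, mimicking the documented default."""
    valid = {"eager", "sdpa", "flash_attention_2", "flash_attention_3"}
    if requested is not None:
        if requested not in valid:
            raise ValueError(f"Unknown attention implementation: {requested!r}")
        return requested  # an explicit user choice always wins
    # No explicit choice: prefer SDPA (available for torch>=2.1.1),
    # otherwise fall back to the manual eager implementation.
    return "sdpa" if sdpa_available else "eager"

print(pick_attn_implementation())                      # sdpa
print(pick_attn_implementation(sdpa_available=False))  # eager
```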
#### from_pretrained[[transformers.AutoModelForMultimodalLM.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a multimodal generation head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aria** -- [AriaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaForConditionalGeneration) (AriaConfig model)
- **aya_vision** -- [AyaVisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionForConditionalGeneration) (AyaVisionConfig model)
- **blip** -- [BlipForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipForConditionalGeneration) (BlipConfig model)
- **blip-2** -- [Blip2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (Blip2Config model)
- **chameleon** -- [ChameleonForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonForConditionalGeneration) (ChameleonConfig model)
- **cohere2_vision** -- [Cohere2VisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionForConditionalGeneration) (Cohere2VisionConfig model)
- **deepseek_vl** -- [DeepseekVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLForConditionalGeneration) (DeepseekVLConfig model)
- **deepseek_vl_hybrid** -- [DeepseekVLHybridForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridForConditionalGeneration) (DeepseekVLHybridConfig model)
- **emu3** -- [Emu3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3ForConditionalGeneration) (Emu3Config model)
- **ernie4_5_vl_moe** -- [Ernie4_5_VLMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeForConditionalGeneration) (Ernie4_5_VLMoeConfig model)
- **evolla** -- [EvollaForProteinText2Text](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaForProteinText2Text) (EvollaConfig model)
- **exaone4_5** -- [Exaone4_5_ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_ForConditionalGeneration) (Exaone4_5_Config model)
- **fast_vlm** -- [FastVlmForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmForConditionalGeneration) (FastVlmConfig model)
- **florence2** -- [Florence2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2ForConditionalGeneration) (Florence2Config model)
- **fuyu** -- [FuyuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuForCausalLM) (FuyuConfig model)
- **gemma3** -- [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model)
- **gemma3n** -- [Gemma3nForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForConditionalGeneration) (Gemma3nConfig model)
- **gemma4** -- [Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model)
- **git** -- [GitForCausalLM](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitForCausalLM) (GitConfig model)
- **glm46v** -- [Glm46VForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VForConditionalGeneration) (Glm46VConfig model)
- **glm4v** -- [Glm4vForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vForConditionalGeneration) (Glm4vConfig model)
- **glm4v_moe** -- [Glm4vMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeForConditionalGeneration) (Glm4vMoeConfig model)
- **glm_ocr** -- [GlmOcrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrForConditionalGeneration) (GlmOcrConfig model)
- **glmasr** -- [GlmAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glmasr#transformers.GlmAsrForConditionalGeneration) (GlmAsrConfig model)
- **got_ocr2** -- [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (GotOcr2Config model)
- **granite4_vision** -- [Granite4VisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionForConditionalGeneration) (Granite4VisionConfig model)
- **granite_speech** -- [GraniteSpeechForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech#transformers.GraniteSpeechForConditionalGeneration) (GraniteSpeechConfig model)
- **granite_speech_plus** -- [GraniteSpeechPlusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite_speech_plus#transformers.GraniteSpeechPlusForConditionalGeneration) (GraniteSpeechPlusConfig model)
- **idefics** -- [IdeficsForVisionText2Text](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsForVisionText2Text) (IdeficsConfig model)
- **idefics2** -- [Idefics2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2ForConditionalGeneration) (Idefics2Config model)
- **idefics3** -- [Idefics3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3ForConditionalGeneration) (Idefics3Config model)
- **instructblip** -- [InstructBlipForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipForConditionalGeneration) (InstructBlipConfig model)
- **instructblipvideo** -- [InstructBlipVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoForConditionalGeneration) (InstructBlipVideoConfig model)
- **internvl** -- [InternVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLForConditionalGeneration) (InternVLConfig model)
- **janus** -- [JanusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusForConditionalGeneration) (JanusConfig model)
- **kosmos-2** -- [Kosmos2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2ForConditionalGeneration) (Kosmos2Config model)
- **kosmos-2.5** -- [Kosmos2_5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5ForConditionalGeneration) (Kosmos2_5Config model)
- **kyutai_speech_to_text** -- [KyutaiSpeechToTextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kyutai_speech_to_text#transformers.KyutaiSpeechToTextForConditionalGeneration) (KyutaiSpeechToTextConfig model)
- **lfm2_vl** -- [Lfm2VlForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlForConditionalGeneration) (Lfm2VlConfig model)
- **lighton_ocr** -- [LightOnOcrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrForConditionalGeneration) (LightOnOcrConfig model)
- **llama4** -- [Llama4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForConditionalGeneration) (Llama4Config model)
- **llava** -- [LlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaForConditionalGeneration) (LlavaConfig model)
- **llava_next** -- [LlavaNextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextForConditionalGeneration) (LlavaNextConfig model)
- **llava_next_video** -- [LlavaNextVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoForConditionalGeneration) (LlavaNextVideoConfig model)
- **llava_onevision** -- [LlavaOnevisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionForConditionalGeneration) (LlavaOnevisionConfig model)
- **minicpmv4_6** -- [MiniCPMV4_6ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6ForConditionalGeneration) (MiniCPMV4_6Config model)
- **mistral3** -- [Mistral3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3ForConditionalGeneration) (Mistral3Config model)
- **mistral4** -- [Mistral4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForCausalLM) (Mistral4Config model)
- **mllama** -- [MllamaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForConditionalGeneration) (MllamaConfig model)
- **ovis2** -- [Ovis2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2ForConditionalGeneration) (Ovis2Config model)
- **paddleocr_vl** -- [PaddleOCRVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVLForConditionalGeneration) (PaddleOCRVLConfig model)
- **paligemma** -- [PaliGemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemmaConfig model)
- **perception_lm** -- [PerceptionLMForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMForConditionalGeneration) (PerceptionLMConfig model)
- **phi4_multimodal** -- [Phi4MultimodalForCausalLM](/docs/transformers/v5.8.0/en/model_doc/phi4_multimodal#transformers.Phi4MultimodalForCausalLM) (Phi4MultimodalConfig model)
- **pi0** -- [PI0ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0ForConditionalGeneration) (PI0Config model)
- **pix2struct** -- [Pix2StructForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructForConditionalGeneration) (Pix2StructConfig model)
- **pp_chart2table** -- [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (PPChart2TableConfig model)
- **pp_formulanet** -- [PPFormulaNetForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pp_formulanet#transformers.PPFormulaNetForConditionalGeneration) (PPFormulaNetConfig model)
- **qianfan_ocr** -- [QianfanOCRForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRForConditionalGeneration) (QianfanOCRConfig model)
- **qwen2_5_omni** -- [Qwen2_5OmniForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniForConditionalGeneration) (Qwen2_5OmniConfig model)
- **qwen2_5_omni_thinker** -- [Qwen2_5OmniThinkerForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniThinkerForConditionalGeneration) (Qwen2_5OmniThinkerConfig model)
- **qwen2_5_vl** -- [Qwen2_5_VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLForConditionalGeneration) (Qwen2_5_VLConfig model)
- **qwen2_audio** -- [Qwen2AudioForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_audio#transformers.Qwen2AudioForConditionalGeneration) (Qwen2AudioConfig model)
- **qwen2_vl** -- [Qwen2VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLForConditionalGeneration) (Qwen2VLConfig model)
- **qwen3_5** -- [Qwen3_5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForConditionalGeneration) (Qwen3_5Config model)
- **qwen3_5_moe** -- [Qwen3_5MoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForConditionalGeneration) (Qwen3_5MoeConfig model)
- **qwen3_omni_moe** -- [Qwen3OmniMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeForConditionalGeneration) (Qwen3OmniMoeConfig model)
- **qwen3_omni_moe_thinker** -- [Qwen3OmniMoeThinkerForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeThinkerForConditionalGeneration) (Qwen3OmniMoeThinkerConfig model)
- **qwen3_vl** -- [Qwen3VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLForConditionalGeneration) (Qwen3VLConfig model)
- **qwen3_vl_moe** -- [Qwen3VLMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeForConditionalGeneration) (Qwen3VLMoeConfig model)
- **shieldgemma2** -- [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (ShieldGemma2Config model)
- **smolvlm** -- [SmolVLMForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMForConditionalGeneration) (SmolVLMConfig model)
- **t5gemma2** -- [T5Gemma2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForConditionalGeneration) (T5Gemma2Config model)
- **udop** -- [UdopForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopForConditionalGeneration) (UdopConfig model)
- **vibevoice_asr** -- [VibeVoiceAsrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vibevoice_asr#transformers.VibeVoiceAsrForConditionalGeneration) (VibeVoiceAsrConfig model)
- **video_llama_3** -- [VideoLlama3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3ForConditionalGeneration) (VideoLlama3Config model)
- **video_llava** -- [VideoLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaForConditionalGeneration) (VideoLlavaConfig model)
- **vipllava** -- [VipLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaForConditionalGeneration) (VipLlavaConfig model)
- **vision-encoder-decoder** -- [VisionEncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderModel) (VisionEncoderDecoderConfig model)
- **voxtral** -- [VoxtralForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral#transformers.VoxtralForConditionalGeneration) (VoxtralConfig model)
- **voxtral_realtime** -- [VoxtralRealtimeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/voxtral_realtime#transformers.VoxtralRealtimeForConditionalGeneration) (VoxtralRealtimeConfig model)
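Conceptually, the dispatch above is a lookup from the config's `model_type` to a model class. The sketch below is a hypothetical, heavily simplified version of that registry (only a few of the entries listed above are included, and class names are shown as strings):

```python
# Hypothetical subset of the model_type -> class mapping used for dispatch.
MULTIMODAL_LM_MAPPING = {
    "llava": "LlavaForConditionalGeneration",
    "qwen2_vl": "Qwen2VLForConditionalGeneration",
    "paligemma": "PaliGemmaForConditionalGeneration",
}

def resolve_model_class(model_type):
    """Mimic the auto-class lookup: unknown model types raise an error."""
    try:
        return MULTIMODAL_LM_MAPPING[model_type]
    except KeyError:
        raise ValueError(
            f"Unrecognized model_type {model_type!r} for AutoModelForMultimodalLM"
        ) from None

print(resolve_model_class("llava"))  # LlavaForConditionalGeneration
```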

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForMultimodalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultimodalLM.from_pretrained("llava-hf/llava-1.5-7b-hf")

>>> # Update configuration during loading
>>> model = AutoModelForMultimodalLM.from_pretrained("llava-hf/llava-1.5-7b-hf", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (`dict[str, torch.Tensor]`, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
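The routing described above (when no explicit `config` is passed) can be sketched as follows. This is a hypothetical simplification, not the actual implementation, and `some_model_arg` is a made-up keyword used only for illustration:

```python
# Sketch of how **kwargs are split when no explicit config is provided:
# keys matching configuration attributes update the config; the rest are
# forwarded to the model's __init__.
def route_kwargs(config_attrs, kwargs):
    config_updates, model_kwargs = {}, {}
    for key, value in kwargs.items():
        target = config_updates if key in config_attrs else model_kwargs
        target[key] = value
    return config_updates, model_kwargs

# "output_attentions" is a config attribute, so it overrides the config value;
# "some_model_arg" (hypothetical) is passed through to the model __init__.
cfg, extra = route_kwargs(
    {"output_attentions": False, "hidden_size": 768},
    {"output_attentions": True, "some_model_arg": 1},
)
print(cfg)    # {'output_attentions': True}
print(extra)  # {'some_model_arg': 1}
```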

### AutoModelForTableQuestionAnswering[[transformers.AutoModelForTableQuestionAnswering]]

#### transformers.AutoModelForTableQuestionAnswering[[transformers.AutoModelForTableQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2062)

This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForTableQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) configuration class: [TapasForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForQuestionAnswering) (TapasConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a table question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = AutoModelForTableQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [TapasConfig](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasConfig) configuration class: [TapasForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForQuestionAnswering) (TapasConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForTableQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **tapas** -- [TapasForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/tapas#transformers.TapasForQuestionAnswering) (TapasConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

>>> # Update configuration during loading
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and to instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.
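The two-way split of `kwargs` described above can be sketched with a toy helper (a simplified illustration with hypothetical names, not the library's actual code): keys matching config attributes override the config, and the remainder is forwarded to the model's `__init__`.

```python
# Simplified sketch (hypothetical, not the actual transformers source) of how
# `kwargs` are split when no `config` is passed to from_pretrained().
class MockConfig:
    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 768

def split_kwargs(config, **kwargs):
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the config attribute
        else:
            model_kwargs[key] = value    # left for the model's __init__
    return config, model_kwargs

config, model_kwargs = split_kwargs(MockConfig(), output_attentions=True, custom_flag=1)
print(config.output_attentions)  # True
print(model_kwargs)              # {'custom_flag': 1}
```

This mirrors why `output_attentions=True` in the examples above ends up on `model.config` rather than being passed to the model constructor.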

### AutoModelForDocumentQuestionAnswering[[transformers.AutoModelForDocumentQuestionAnswering]]

#### transformers.AutoModelForDocumentQuestionAnswering[[transformers.AutoModelForDocumentQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2084)

This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForDocumentQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForQuestionAnswering) (LayoutLMConfig model)
  - [LayoutLMv2Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Config) configuration class: [LayoutLMv2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForQuestionAnswering) (LayoutLMv2Config model)
  - [LayoutLMv3Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Config) configuration class: [LayoutLMv3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForQuestionAnswering) (LayoutLMv3Config model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a document question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = AutoModelForDocumentQuestionAnswering.from_config(config)
```
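The note above can be made concrete with a toy model (hypothetical classes, not the library's API): `from_config` only builds the architecture with freshly initialized weights, while `from_pretrained` additionally restores saved ones.

```python
# Toy illustration (hypothetical classes) of why from_config() gives random
# weights while from_pretrained() restores saved ones.
import random

class TinyModel:
    def __init__(self, hidden_size):
        # Fresh, randomly initialized parameters.
        self.weights = [random.random() for _ in range(hidden_size)]

    @classmethod
    def from_config(cls, config):
        # Only the architecture comes from the config; weights stay random.
        return cls(config["hidden_size"])

    @classmethod
    def from_pretrained(cls, config, state_dict):
        model = cls(config["hidden_size"])
        model.weights = list(state_dict["weights"])  # restore saved weights
        return model

config = {"hidden_size": 4}
saved = {"weights": [0.1, 0.2, 0.3, 0.4]}
fresh = TinyModel.from_config(config)
restored = TinyModel.from_pretrained(config, saved)
print(restored.weights)  # [0.1, 0.2, 0.3, 0.4]
```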

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [LayoutLMConfig](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMConfig) configuration class: [LayoutLMForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForQuestionAnswering) (LayoutLMConfig model) - [LayoutLMv2Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2Config) configuration class: [LayoutLMv2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForQuestionAnswering) (LayoutLMv2Config model) - [LayoutLMv3Config](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3Config) configuration class: [LayoutLMv3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForQuestionAnswering) (LayoutLMv3Config model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForDocumentQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a document question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **layoutlm** -- [LayoutLMForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlm#transformers.LayoutLMForQuestionAnswering) (LayoutLMConfig model)
- **layoutlmv2** -- [LayoutLMv2ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlmv2#transformers.LayoutLMv2ForQuestionAnswering) (LayoutLMv2Config model)
- **layoutlmv3** -- [LayoutLMv3ForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/layoutlmv3#transformers.LayoutLMv3ForQuestionAnswering) (LayoutLMv3Config model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")

>>> # Update configuration during loading
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and to instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForVisualQuestionAnswering[[transformers.AutoModelForVisualQuestionAnswering]]

#### transformers.AutoModelForVisualQuestionAnswering[[transformers.AutoModelForVisualQuestionAnswering]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2073)

This is a generic model class that will be instantiated as one of the model classes of the library (with a visual question answering head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForVisualQuestionAnswering.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [Blip2Config](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (Blip2Config model)
  - [BlipConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipConfig) configuration class: [BlipForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipForQuestionAnswering) (BlipConfig model)
  - [ViltConfig](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltConfig) configuration class: [ViltForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltForQuestionAnswering) (ViltConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a visual question answering head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
>>> model = AutoModelForVisualQuestionAnswering.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [Blip2Config](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (Blip2Config model) - [BlipConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipConfig) configuration class: [BlipForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipForQuestionAnswering) (BlipConfig model) - [ViltConfig](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltConfig) configuration class: [ViltForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltForQuestionAnswering) (ViltConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForVisualQuestionAnswering.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a visual question answering head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **blip** -- [BlipForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipForQuestionAnswering) (BlipConfig model)
- **blip-2** -- [Blip2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (Blip2Config model)
- **vilt** -- [ViltForQuestionAnswering](/docs/transformers/v5.8.0/en/model_doc/vilt#transformers.ViltForQuestionAnswering) (ViltConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

>>> # Update configuration during loading
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and to instantiate the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done). - If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

### AutoModelForImageTextToText[[transformers.AutoModelForImageTextToText]]

#### transformers.AutoModelForImageTextToText[[transformers.AutoModelForImageTextToText]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2221)

This is a generic model class that will be instantiated as one of the model classes of the library (with an image-text-to-text modeling head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

from_configtransformers.AutoModelForImageTextToText.from_confighttps://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206[{"name": "**kwargs", "val": ""}]- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [AriaConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaConfig) configuration class: [AriaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaForConditionalGeneration) (AriaConfig model)
  - [AyaVisionConfig](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionConfig) configuration class: [AyaVisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionForConditionalGeneration) (AyaVisionConfig model)
  - [Blip2Config](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (Blip2Config model)
  - [BlipConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipConfig) configuration class: [BlipForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipForConditionalGeneration) (BlipConfig model)
  - [ChameleonConfig](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonConfig) configuration class: [ChameleonForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonForConditionalGeneration) (ChameleonConfig model)
  - [Cohere2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionConfig) configuration class: [Cohere2VisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionForConditionalGeneration) (Cohere2VisionConfig model)
  - [DeepseekVLConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLConfig) configuration class: [DeepseekVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLForConditionalGeneration) (DeepseekVLConfig model)
  - [DeepseekVLHybridConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridConfig) configuration class: [DeepseekVLHybridForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridForConditionalGeneration) (DeepseekVLHybridConfig model)
  - [Emu3Config](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Config) configuration class: [Emu3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3ForConditionalGeneration) (Emu3Config model)
  - [Ernie4_5_VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeConfig) configuration class: [Ernie4_5_VLMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeForConditionalGeneration) (Ernie4_5_VLMoeConfig model)
  - [EvollaConfig](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaConfig) configuration class: [EvollaForProteinText2Text](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaForProteinText2Text) (EvollaConfig model)
  - [Exaone4_5_Config](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_Config) configuration class: [Exaone4_5_ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_ForConditionalGeneration) (Exaone4_5_Config model)
  - [FastVlmConfig](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmConfig) configuration class: [FastVlmForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmForConditionalGeneration) (FastVlmConfig model)
  - [Florence2Config](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Config) configuration class: [Florence2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2ForConditionalGeneration) (Florence2Config model)
  - [FuyuConfig](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuConfig) configuration class: [FuyuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuForCausalLM) (FuyuConfig model)
  - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model)
  - [Gemma3nConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nConfig) configuration class: [Gemma3nForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForConditionalGeneration) (Gemma3nConfig model)
  - [Gemma4Config](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Config) configuration class: [Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model)
  - [GitConfig](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitConfig) configuration class: [GitForCausalLM](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitForCausalLM) (GitConfig model)
  - [Glm46VConfig](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VConfig) configuration class: [Glm46VForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VForConditionalGeneration) (Glm46VConfig model)
  - [Glm4vConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vConfig) configuration class: [Glm4vForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vForConditionalGeneration) (Glm4vConfig model)
  - [Glm4vMoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeConfig) configuration class: [Glm4vMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeForConditionalGeneration) (Glm4vMoeConfig model)
  - [GlmOcrConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrConfig) configuration class: [GlmOcrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrForConditionalGeneration) (GlmOcrConfig model)
  - [GotOcr2Config](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Config) configuration class: [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (GotOcr2Config model)
  - [Granite4VisionConfig](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionConfig) configuration class: [Granite4VisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionForConditionalGeneration) (Granite4VisionConfig model)
  - [Idefics2Config](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Config) configuration class: [Idefics2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2ForConditionalGeneration) (Idefics2Config model)
  - [Idefics3Config](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Config) configuration class: [Idefics3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3ForConditionalGeneration) (Idefics3Config model)
  - [IdeficsConfig](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsConfig) configuration class: [IdeficsForVisionText2Text](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsForVisionText2Text) (IdeficsConfig model)
  - [InstructBlipConfig](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipConfig) configuration class: [InstructBlipForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipForConditionalGeneration) (InstructBlipConfig model)
  - [InstructBlipVideoConfig](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoConfig) configuration class: [InstructBlipVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoForConditionalGeneration) (InstructBlipVideoConfig model)
  - [InternVLConfig](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLConfig) configuration class: [InternVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLForConditionalGeneration) (InternVLConfig model)
  - [JanusConfig](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusConfig) configuration class: [JanusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusForConditionalGeneration) (JanusConfig model)
  - [Kosmos2Config](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2Config) configuration class: [Kosmos2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2ForConditionalGeneration) (Kosmos2Config model)
  - [Kosmos2_5Config](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5Config) configuration class: [Kosmos2_5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5ForConditionalGeneration) (Kosmos2_5Config model)
  - [Lfm2VlConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlConfig) configuration class: [Lfm2VlForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlForConditionalGeneration) (Lfm2VlConfig model)
  - [LightOnOcrConfig](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrConfig) configuration class: [LightOnOcrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrForConditionalGeneration) (LightOnOcrConfig model)
  - [Llama4Config](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4Config) configuration class: [Llama4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForConditionalGeneration) (Llama4Config model)
  - [LlavaConfig](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaConfig) configuration class: [LlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaForConditionalGeneration) (LlavaConfig model)
  - [LlavaNextConfig](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextConfig) configuration class: [LlavaNextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextForConditionalGeneration) (LlavaNextConfig model)
  - [LlavaNextVideoConfig](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoConfig) configuration class: [LlavaNextVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoForConditionalGeneration) (LlavaNextVideoConfig model)
  - [LlavaOnevisionConfig](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionConfig) configuration class: [LlavaOnevisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionForConditionalGeneration) (LlavaOnevisionConfig model)
  - [MiniCPMV4_6Config](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6Config) configuration class: [MiniCPMV4_6ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6ForConditionalGeneration) (MiniCPMV4_6Config model)
  - [Mistral3Config](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Config) configuration class: [Mistral3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3ForConditionalGeneration) (Mistral3Config model)
  - [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: [Mistral4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForCausalLM) (Mistral4Config model)
  - [MllamaConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaConfig) configuration class: [MllamaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForConditionalGeneration) (MllamaConfig model)
  - [Ovis2Config](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2Config) configuration class: [Ovis2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2ForConditionalGeneration) (Ovis2Config model)
  - [PI0Config](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0Config) configuration class: [PI0ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0ForConditionalGeneration) (PI0Config model)
  - [PPChart2TableConfig](/docs/transformers/v5.8.0/en/model_doc/pp_chart2table#transformers.PPChart2TableConfig) configuration class: [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (PPChart2TableConfig model)
  - [PPFormulaNetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_formulanet#transformers.PPFormulaNetConfig) configuration class: [PPFormulaNetForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pp_formulanet#transformers.PPFormulaNetForConditionalGeneration) (PPFormulaNetConfig model)
  - [PaddleOCRVLConfig](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVLConfig) configuration class: [PaddleOCRVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVLForConditionalGeneration) (PaddleOCRVLConfig model)
  - [PaliGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: [PaliGemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemmaConfig model)
  - [PerceptionLMConfig](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMConfig) configuration class: [PerceptionLMForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMForConditionalGeneration) (PerceptionLMConfig model)
  - [Pix2StructConfig](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructConfig) configuration class: [Pix2StructForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructForConditionalGeneration) (Pix2StructConfig model)
  - [QianfanOCRConfig](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRConfig) configuration class: [QianfanOCRForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRForConditionalGeneration) (QianfanOCRConfig model)
  - [Qwen2VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLConfig) configuration class: [Qwen2VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLForConditionalGeneration) (Qwen2VLConfig model)
  - [Qwen2_5OmniThinkerConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniThinkerConfig) configuration class: [Qwen2_5OmniThinkerForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniThinkerForConditionalGeneration) (Qwen2_5OmniThinkerConfig model)
  - [Qwen2_5_VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLConfig) configuration class: [Qwen2_5_VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLForConditionalGeneration) (Qwen2_5_VLConfig model)
  - [Qwen3OmniMoeThinkerConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeThinkerConfig) configuration class: [Qwen3OmniMoeThinkerForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeThinkerForConditionalGeneration) (Qwen3OmniMoeThinkerConfig model)
  - [Qwen3VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLConfig) configuration class: [Qwen3VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLForConditionalGeneration) (Qwen3VLConfig model)
  - [Qwen3VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeConfig) configuration class: [Qwen3VLMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeForConditionalGeneration) (Qwen3VLMoeConfig model)
  - [Qwen3_5Config](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Config) configuration class: [Qwen3_5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForConditionalGeneration) (Qwen3_5Config model)
  - [Qwen3_5MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeConfig) configuration class: [Qwen3_5MoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForConditionalGeneration) (Qwen3_5MoeConfig model)
  - [ShieldGemma2Config](/docs/transformers/v5.8.0/en/model_doc/shieldgemma2#transformers.ShieldGemma2Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (ShieldGemma2Config model)
  - [SmolVLMConfig](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMConfig) configuration class: [SmolVLMForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMForConditionalGeneration) (SmolVLMConfig model)
  - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForConditionalGeneration) (T5Gemma2Config model)
  - [UdopConfig](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopConfig) configuration class: [UdopForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopForConditionalGeneration) (UdopConfig model)
  - [VideoLlama3Config](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3Config) configuration class: [VideoLlama3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3ForConditionalGeneration) (VideoLlama3Config model)
  - [VideoLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaConfig) configuration class: [VideoLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaForConditionalGeneration) (VideoLlavaConfig model)
  - [VipLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaConfig) configuration class: [VipLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaForConditionalGeneration) (VipLlavaConfig model)
  - [VisionEncoderDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderConfig) configuration class: [VisionEncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderModel) (VisionEncoderDecoderConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (a manual implementation of attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, SDPA is used if available and torch>=2.1.1; otherwise the manual `"eager"` implementation is used.
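
The default selection described above (SDPA when available on a recent enough torch, `"eager"` otherwise) can be sketched as a simple version check. This is an illustrative stand-in, not the actual transformers dispatch code:

```python
# Simplified sketch of the default attention-backend choice described above.
# Hypothetical helper, not part of the transformers API.

def pick_attn_implementation(torch_version: str, sdpa_available: bool) -> str:
    """Return "sdpa" when SDPA is available and torch >= 2.1.1, else "eager"."""
    version = tuple(int(part) for part in torch_version.split(".")[:3])
    if sdpa_available and version >= (2, 1, 1):
        return "sdpa"
    return "eager"

print(pick_attn_implementation("2.4.0", sdpa_available=True))   # sdpa
print(pick_attn_implementation("2.0.1", sdpa_available=True))   # eager
print(pick_attn_implementation("2.4.0", sdpa_available=False))  # eager
```

In practice you can also bypass the default entirely by passing `attn_implementation="..."` explicitly when instantiating the model.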

Instantiates one of the model classes of the library (with an image-text-to-text modeling head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageTextToText

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("llava-hf/llava-1.5-7b-hf")
>>> model = AutoModelForImageTextToText.from_config(config)
```
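
Conceptually, `from_config()` is a lookup from the configuration's class to the matching model class in the mapping listed above. The sketch below illustrates that dispatch with plain Python stand-ins; the class bodies are hypothetical and do not reflect the real transformers implementations:

```python
# Minimal sketch of the config-class -> model-class dispatch performed by
# an auto class's from_config(). All classes here are illustrative
# stand-ins, not the real transformers objects.

class LlavaConfig:
    model_type = "llava"

class LlavaForConditionalGeneration:
    def __init__(self, config):
        self.config = config

class AutoModelForImageTextToText:
    # One model class per config class, mirroring the mapping listed above.
    _model_mapping = {LlavaConfig: LlavaForConditionalGeneration}

    @classmethod
    def from_config(cls, config):
        model_class = cls._model_mapping[type(config)]
        # Instantiating from a config builds the architecture with random
        # weights; no pretrained weights are downloaded or loaded.
        return model_class(config)

model = AutoModelForImageTextToText.from_config(LlavaConfig())
print(type(model).__name__)  # LlavaForConditionalGeneration
```

Registering a custom config/model pair (as described in "Extending the Auto Classes") amounts to adding one more entry to this mapping.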

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [AriaConfig](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaConfig) configuration class: [AriaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaForConditionalGeneration) (AriaConfig model) - [AyaVisionConfig](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionConfig) configuration class: [AyaVisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionForConditionalGeneration) (AyaVisionConfig model) - [Blip2Config](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2Config) configuration class: [Blip2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (Blip2Config model) - [BlipConfig](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipConfig) configuration class: [BlipForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipForConditionalGeneration) (BlipConfig model) - [ChameleonConfig](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonConfig) configuration class: [ChameleonForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonForConditionalGeneration) (ChameleonConfig model) - [Cohere2VisionConfig](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionConfig) configuration class: [Cohere2VisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionForConditionalGeneration) (Cohere2VisionConfig model) - [DeepseekVLConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLConfig) configuration class: 
[DeepseekVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLForConditionalGeneration) (DeepseekVLConfig model) - [DeepseekVLHybridConfig](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridConfig) configuration class: [DeepseekVLHybridForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridForConditionalGeneration) (DeepseekVLHybridConfig model) - [Emu3Config](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3Config) configuration class: [Emu3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3ForConditionalGeneration) (Emu3Config model) - [Ernie4_5_VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeConfig) configuration class: [Ernie4_5_VLMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeForConditionalGeneration) (Ernie4_5_VLMoeConfig model) - [EvollaConfig](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaConfig) configuration class: [EvollaForProteinText2Text](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaForProteinText2Text) (EvollaConfig model) - [Exaone4_5_Config](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_Config) configuration class: [Exaone4_5_ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_ForConditionalGeneration) (Exaone4_5_Config model) - [FastVlmConfig](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmConfig) configuration class: [FastVlmForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmForConditionalGeneration) (FastVlmConfig model) - [Florence2Config](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2Config) configuration class: 
[Florence2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2ForConditionalGeneration) (Florence2Config model) - [FuyuConfig](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuConfig) configuration class: [FuyuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuForCausalLM) (FuyuConfig model) - [Gemma3Config](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model) - [Gemma3nConfig](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nConfig) configuration class: [Gemma3nForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForConditionalGeneration) (Gemma3nConfig model) - [Gemma4Config](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4Config) configuration class: [Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model) - [GitConfig](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitConfig) configuration class: [GitForCausalLM](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitForCausalLM) (GitConfig model) - [Glm46VConfig](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VConfig) configuration class: [Glm46VForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VForConditionalGeneration) (Glm46VConfig model) - [Glm4vConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vConfig) configuration class: [Glm4vForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vForConditionalGeneration) (Glm4vConfig model) - [Glm4vMoeConfig](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeConfig) configuration class: 
[Glm4vMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeForConditionalGeneration) (Glm4vMoeConfig model) - [GlmOcrConfig](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrConfig) configuration class: [GlmOcrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrForConditionalGeneration) (GlmOcrConfig model) - [GotOcr2Config](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2Config) configuration class: [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (GotOcr2Config model) - [Granite4VisionConfig](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionConfig) configuration class: [Granite4VisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionForConditionalGeneration) (Granite4VisionConfig model) - [Idefics2Config](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2Config) configuration class: [Idefics2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2ForConditionalGeneration) (Idefics2Config model) - [Idefics3Config](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3Config) configuration class: [Idefics3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3ForConditionalGeneration) (Idefics3Config model) - [IdeficsConfig](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsConfig) configuration class: [IdeficsForVisionText2Text](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsForVisionText2Text) (IdeficsConfig model) - [InstructBlipConfig](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipConfig) configuration class: 
[InstructBlipForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipForConditionalGeneration) (InstructBlipConfig model) - [InstructBlipVideoConfig](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoConfig) configuration class: [InstructBlipVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoForConditionalGeneration) (InstructBlipVideoConfig model) - [InternVLConfig](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLConfig) configuration class: [InternVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLForConditionalGeneration) (InternVLConfig model) - [JanusConfig](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusConfig) configuration class: [JanusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusForConditionalGeneration) (JanusConfig model) - [Kosmos2Config](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2Config) configuration class: [Kosmos2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2ForConditionalGeneration) (Kosmos2Config model) - [Kosmos2_5Config](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5Config) configuration class: [Kosmos2_5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5ForConditionalGeneration) (Kosmos2_5Config model) - [Lfm2VlConfig](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlConfig) configuration class: [Lfm2VlForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlForConditionalGeneration) (Lfm2VlConfig model) - [LightOnOcrConfig](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrConfig) configuration class: 
[LightOnOcrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrForConditionalGeneration) (LightOnOcrConfig model) - [Llama4Config](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4Config) configuration class: [Llama4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForConditionalGeneration) (Llama4Config model) - [LlavaConfig](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaConfig) configuration class: [LlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaForConditionalGeneration) (LlavaConfig model) - [LlavaNextConfig](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextConfig) configuration class: [LlavaNextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextForConditionalGeneration) (LlavaNextConfig model) - [LlavaNextVideoConfig](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoConfig) configuration class: [LlavaNextVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoForConditionalGeneration) (LlavaNextVideoConfig model) - [LlavaOnevisionConfig](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionConfig) configuration class: [LlavaOnevisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionForConditionalGeneration) (LlavaOnevisionConfig model) - [MiniCPMV4_6Config](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6Config) configuration class: [MiniCPMV4_6ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6ForConditionalGeneration) (MiniCPMV4_6Config model) - [Mistral3Config](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3Config) configuration class: 
[Mistral3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3ForConditionalGeneration) (Mistral3Config model) - [Mistral4Config](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4Config) configuration class: [Mistral4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForCausalLM) (Mistral4Config model) - [MllamaConfig](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaConfig) configuration class: [MllamaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForConditionalGeneration) (MllamaConfig model) - [Ovis2Config](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2Config) configuration class: [Ovis2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2ForConditionalGeneration) (Ovis2Config model) - [PI0Config](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0Config) configuration class: [PI0ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0ForConditionalGeneration) (PI0Config model) - [PPChart2TableConfig](/docs/transformers/v5.8.0/en/model_doc/pp_chart2table#transformers.PPChart2TableConfig) configuration class: [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (PPChart2TableConfig model) - [PPFormulaNetConfig](/docs/transformers/v5.8.0/en/model_doc/pp_formulanet#transformers.PPFormulaNetConfig) configuration class: [PPFormulaNetForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pp_formulanet#transformers.PPFormulaNetForConditionalGeneration) (PPFormulaNetConfig model) - [PaddleOCRVLConfig](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVLConfig) configuration class: [PaddleOCRVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVLForConditionalGeneration) 
(PaddleOCRVLConfig model) - [PaliGemmaConfig](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaConfig) configuration class: [PaliGemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemmaConfig model) - [PerceptionLMConfig](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMConfig) configuration class: [PerceptionLMForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMForConditionalGeneration) (PerceptionLMConfig model) - [Pix2StructConfig](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructConfig) configuration class: [Pix2StructForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructForConditionalGeneration) (Pix2StructConfig model) - [QianfanOCRConfig](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRConfig) configuration class: [QianfanOCRForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRForConditionalGeneration) (QianfanOCRConfig model) - [Qwen2VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLConfig) configuration class: [Qwen2VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLForConditionalGeneration) (Qwen2VLConfig model) - [Qwen2_5OmniThinkerConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniThinkerConfig) configuration class: [Qwen2_5OmniThinkerForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniThinkerForConditionalGeneration) (Qwen2_5OmniThinkerConfig model) - [Qwen2_5_VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLConfig) configuration class: 
[Qwen2_5_VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLForConditionalGeneration) (Qwen2_5_VLConfig model) - [Qwen3OmniMoeThinkerConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeThinkerConfig) configuration class: [Qwen3OmniMoeThinkerForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeThinkerForConditionalGeneration) (Qwen3OmniMoeThinkerConfig model) - [Qwen3VLConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLConfig) configuration class: [Qwen3VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLForConditionalGeneration) (Qwen3VLConfig model) - [Qwen3VLMoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeConfig) configuration class: [Qwen3VLMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeForConditionalGeneration) (Qwen3VLMoeConfig model) - [Qwen3_5Config](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5Config) configuration class: [Qwen3_5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForConditionalGeneration) (Qwen3_5Config model) - [Qwen3_5MoeConfig](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeConfig) configuration class: [Qwen3_5MoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForConditionalGeneration) (Qwen3_5MoeConfig model) - [ShieldGemma2Config](/docs/transformers/v5.8.0/en/model_doc/shieldgemma2#transformers.ShieldGemma2Config) configuration class: [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (ShieldGemma2Config model) - [SmolVLMConfig](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMConfig) configuration class: 
[SmolVLMForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMForConditionalGeneration) (SmolVLMConfig model) - [T5Gemma2Config](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2Config) configuration class: [T5Gemma2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForConditionalGeneration) (T5Gemma2Config model) - [UdopConfig](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopConfig) configuration class: [UdopForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopForConditionalGeneration) (UdopConfig model) - [VideoLlama3Config](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3Config) configuration class: [VideoLlama3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3ForConditionalGeneration) (VideoLlama3Config model) - [VideoLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaConfig) configuration class: [VideoLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaForConditionalGeneration) (VideoLlavaConfig model) - [VipLlavaConfig](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaConfig) configuration class: [VipLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaForConditionalGeneration) (VipLlavaConfig model) - [VisionEncoderDecoderConfig](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderConfig) configuration class: [VisionEncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderModel) (VisionEncoderDecoderConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
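The SDPA-first fallback described above can be sketched as follows. This is an illustrative sketch of the selection rule, not the library's internal code; the helper name `pick_attn_implementation` is hypothetical.

```python
import torch

# Hedged sketch (not the library's actual code) of the default described
# above: honor an explicit request, otherwise prefer SDPA when torch
# provides it, and fall back to the manual "eager" implementation.
def pick_attn_implementation(requested=None):
    if requested is not None:
        # An explicit request (e.g. "flash_attention_2") is used as-is.
        return requested
    if hasattr(torch.nn.functional, "scaled_dot_product_attention"):
        return "sdpa"
    return "eager"
```

In practice you simply pass the string to `from_pretrained(..., attn_implementation="sdpa")` and the library performs this resolution for you.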
#### from_pretrained[[transformers.AutoModelForImageTextToText.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with an image-text-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **aria** -- [AriaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/aria#transformers.AriaForConditionalGeneration) (AriaConfig model)
- **aya_vision** -- [AyaVisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/aya_vision#transformers.AyaVisionForConditionalGeneration) (AyaVisionConfig model)
- **blip** -- [BlipForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip#transformers.BlipForConditionalGeneration) (BlipConfig model)
- **blip-2** -- [Blip2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/blip-2#transformers.Blip2ForConditionalGeneration) (Blip2Config model)
- **chameleon** -- [ChameleonForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/chameleon#transformers.ChameleonForConditionalGeneration) (ChameleonConfig model)
- **cohere2_vision** -- [Cohere2VisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/cohere2_vision#transformers.Cohere2VisionForConditionalGeneration) (Cohere2VisionConfig model)
- **deepseek_vl** -- [DeepseekVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl#transformers.DeepseekVLForConditionalGeneration) (DeepseekVLConfig model)
- **deepseek_vl_hybrid** -- [DeepseekVLHybridForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/deepseek_vl_hybrid#transformers.DeepseekVLHybridForConditionalGeneration) (DeepseekVLHybridConfig model)
- **emu3** -- [Emu3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/emu3#transformers.Emu3ForConditionalGeneration) (Emu3Config model)
- **ernie4_5_vl_moe** -- [Ernie4_5_VLMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/ernie4_5_vl_moe#transformers.Ernie4_5_VLMoeForConditionalGeneration) (Ernie4_5_VLMoeConfig model)
- **evolla** -- [EvollaForProteinText2Text](/docs/transformers/v5.8.0/en/model_doc/evolla#transformers.EvollaForProteinText2Text) (EvollaConfig model)
- **exaone4_5** -- [Exaone4_5_ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/exaone4_5#transformers.Exaone4_5_ForConditionalGeneration) (Exaone4_5_Config model)
- **fast_vlm** -- [FastVlmForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/fast_vlm#transformers.FastVlmForConditionalGeneration) (FastVlmConfig model)
- **florence2** -- [Florence2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/florence2#transformers.Florence2ForConditionalGeneration) (Florence2Config model)
- **fuyu** -- [FuyuForCausalLM](/docs/transformers/v5.8.0/en/model_doc/fuyu#transformers.FuyuForCausalLM) (FuyuConfig model)
- **gemma3** -- [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (Gemma3Config model)
- **gemma3n** -- [Gemma3nForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3n#transformers.Gemma3nForConditionalGeneration) (Gemma3nConfig model)
- **gemma4** -- [Gemma4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma4#transformers.Gemma4ForConditionalGeneration) (Gemma4Config model)
- **git** -- [GitForCausalLM](/docs/transformers/v5.8.0/en/model_doc/git#transformers.GitForCausalLM) (GitConfig model)
- **glm46v** -- [Glm46VForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm46v#transformers.Glm46VForConditionalGeneration) (Glm46VConfig model)
- **glm4v** -- [Glm4vForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm4v#transformers.Glm4vForConditionalGeneration) (Glm4vConfig model)
- **glm4v_moe** -- [Glm4vMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm4v_moe#transformers.Glm4vMoeForConditionalGeneration) (Glm4vMoeConfig model)
- **glm_ocr** -- [GlmOcrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/glm_ocr#transformers.GlmOcrForConditionalGeneration) (GlmOcrConfig model)
- **got_ocr2** -- [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (GotOcr2Config model)
- **granite4_vision** -- [Granite4VisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granite4_vision#transformers.Granite4VisionForConditionalGeneration) (Granite4VisionConfig model)
- **idefics** -- [IdeficsForVisionText2Text](/docs/transformers/v5.8.0/en/model_doc/idefics#transformers.IdeficsForVisionText2Text) (IdeficsConfig model)
- **idefics2** -- [Idefics2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics2#transformers.Idefics2ForConditionalGeneration) (Idefics2Config model)
- **idefics3** -- [Idefics3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/idefics3#transformers.Idefics3ForConditionalGeneration) (Idefics3Config model)
- **instructblip** -- [InstructBlipForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/instructblip#transformers.InstructBlipForConditionalGeneration) (InstructBlipConfig model)
- **instructblipvideo** -- [InstructBlipVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/instructblipvideo#transformers.InstructBlipVideoForConditionalGeneration) (InstructBlipVideoConfig model)
- **internvl** -- [InternVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/internvl#transformers.InternVLForConditionalGeneration) (InternVLConfig model)
- **janus** -- [JanusForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/janus#transformers.JanusForConditionalGeneration) (JanusConfig model)
- **kosmos-2** -- [Kosmos2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kosmos-2#transformers.Kosmos2ForConditionalGeneration) (Kosmos2Config model)
- **kosmos-2.5** -- [Kosmos2_5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/kosmos2_5#transformers.Kosmos2_5ForConditionalGeneration) (Kosmos2_5Config model)
- **lfm2_vl** -- [Lfm2VlForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/lfm2_vl#transformers.Lfm2VlForConditionalGeneration) (Lfm2VlConfig model)
- **lighton_ocr** -- [LightOnOcrForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/lighton_ocr#transformers.LightOnOcrForConditionalGeneration) (LightOnOcrConfig model)
- **llama4** -- [Llama4ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llama4#transformers.Llama4ForConditionalGeneration) (Llama4Config model)
- **llava** -- [LlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava#transformers.LlavaForConditionalGeneration) (LlavaConfig model)
- **llava_next** -- [LlavaNextForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/granitevision#transformers.LlavaNextForConditionalGeneration) (LlavaNextConfig model)
- **llava_next_video** -- [LlavaNextVideoForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_next_video#transformers.LlavaNextVideoForConditionalGeneration) (LlavaNextVideoConfig model)
- **llava_onevision** -- [LlavaOnevisionForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/llava_onevision#transformers.LlavaOnevisionForConditionalGeneration) (LlavaOnevisionConfig model)
- **minicpmv4_6** -- [MiniCPMV4_6ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/minicpmv4_6#transformers.MiniCPMV4_6ForConditionalGeneration) (MiniCPMV4_6Config model)
- **mistral3** -- [Mistral3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mistral3#transformers.Mistral3ForConditionalGeneration) (Mistral3Config model)
- **mistral4** -- [Mistral4ForCausalLM](/docs/transformers/v5.8.0/en/model_doc/mistral4#transformers.Mistral4ForCausalLM) (Mistral4Config model)
- **mllama** -- [MllamaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/mllama#transformers.MllamaForConditionalGeneration) (MllamaConfig model)
- **ovis2** -- [Ovis2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/ovis2#transformers.Ovis2ForConditionalGeneration) (Ovis2Config model)
- **paddleocr_vl** -- [PaddleOCRVLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paddleocr_vl#transformers.PaddleOCRVLForConditionalGeneration) (PaddleOCRVLConfig model)
- **paligemma** -- [PaliGemmaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/paligemma#transformers.PaliGemmaForConditionalGeneration) (PaliGemmaConfig model)
- **perception_lm** -- [PerceptionLMForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/perception_lm#transformers.PerceptionLMForConditionalGeneration) (PerceptionLMConfig model)
- **pi0** -- [PI0ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pi0#transformers.PI0ForConditionalGeneration) (PI0Config model)
- **pix2struct** -- [Pix2StructForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pix2struct#transformers.Pix2StructForConditionalGeneration) (Pix2StructConfig model)
- **pp_chart2table** -- [GotOcr2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/got_ocr2#transformers.GotOcr2ForConditionalGeneration) (PPChart2TableConfig model)
- **pp_formulanet** -- [PPFormulaNetForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/pp_formulanet#transformers.PPFormulaNetForConditionalGeneration) (PPFormulaNetConfig model)
- **qianfan_ocr** -- [QianfanOCRForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qianfan_ocr#transformers.QianfanOCRForConditionalGeneration) (QianfanOCRConfig model)
- **qwen2_5_omni_thinker** -- [Qwen2_5OmniThinkerForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_omni#transformers.Qwen2_5OmniThinkerForConditionalGeneration) (Qwen2_5OmniThinkerConfig model)
- **qwen2_5_vl** -- [Qwen2_5_VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_5_vl#transformers.Qwen2_5_VLForConditionalGeneration) (Qwen2_5_VLConfig model)
- **qwen2_vl** -- [Qwen2VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen2_vl#transformers.Qwen2VLForConditionalGeneration) (Qwen2VLConfig model)
- **qwen3_5** -- [Qwen3_5ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_5#transformers.Qwen3_5ForConditionalGeneration) (Qwen3_5Config model)
- **qwen3_5_moe** -- [Qwen3_5MoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_5_moe#transformers.Qwen3_5MoeForConditionalGeneration) (Qwen3_5MoeConfig model)
- **qwen3_omni_moe_thinker** -- [Qwen3OmniMoeThinkerForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_omni_moe#transformers.Qwen3OmniMoeThinkerForConditionalGeneration) (Qwen3OmniMoeThinkerConfig model)
- **qwen3_vl** -- [Qwen3VLForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl#transformers.Qwen3VLForConditionalGeneration) (Qwen3VLConfig model)
- **qwen3_vl_moe** -- [Qwen3VLMoeForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/qwen3_vl_moe#transformers.Qwen3VLMoeForConditionalGeneration) (Qwen3VLMoeConfig model)
- **shieldgemma2** -- [Gemma3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/gemma3#transformers.Gemma3ForConditionalGeneration) (ShieldGemma2Config model)
- **smolvlm** -- [SmolVLMForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/smolvlm#transformers.SmolVLMForConditionalGeneration) (SmolVLMConfig model)
- **t5gemma2** -- [T5Gemma2ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/t5gemma2#transformers.T5Gemma2ForConditionalGeneration) (T5Gemma2Config model)
- **udop** -- [UdopForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/udop#transformers.UdopForConditionalGeneration) (UdopConfig model)
- **video_llama_3** -- [VideoLlama3ForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llama_3#transformers.VideoLlama3ForConditionalGeneration) (VideoLlama3Config model)
- **video_llava** -- [VideoLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/video_llava#transformers.VideoLlavaForConditionalGeneration) (VideoLlavaConfig model)
- **vipllava** -- [VipLlavaForConditionalGeneration](/docs/transformers/v5.8.0/en/model_doc/vipllava#transformers.VipLlavaForConditionalGeneration) (VipLlavaConfig model)
- **vision-encoder-decoder** -- [VisionEncoderDecoderModel](/docs/transformers/v5.8.0/en/model_doc/vision-encoder-decoder#transformers.VisionEncoderDecoderModel) (VisionEncoderDecoderConfig model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.
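The evaluation/training toggle above is standard `torch.nn.Module` behavior, so it can be demonstrated with a toy module (this stand-in is illustrative, not a real pretrained model):

```python
from torch import nn

# Toy stand-in for a loaded model: dropout is only active in training mode.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

model.eval()               # what from_pretrained() does for you
assert not model.training  # dropout is now disabled

model.train()              # switch back before fine-tuning
assert model.training
```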

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForImageTextToText

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageTextToText.from_pretrained("Salesforce/blip-image-captioning-base")

>>> # Update configuration during loading
>>> model = AutoModelForImageTextToText.from_pretrained("Salesforce/blip-image-captioning-base", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

## Time Series

### AutoModelForTimeSeriesPrediction[[transformers.AutoModelForTimeSeriesPrediction]]

#### transformers.AutoModelForTimeSeriesPrediction[[transformers.AutoModelForTimeSeriesPrediction]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/modeling_auto.py#L2150)

This is a generic model class that will be instantiated as one of the model classes of the library (with a time-series prediction head) when created
with the [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) class method or the [from_config()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_config) class
method.

This class cannot be instantiated directly using `__init__()` (throws an error).

#### from_config[[transformers.AutoModelForTimeSeriesPrediction.from_config]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L206)

- **config** ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) --
  The model class to instantiate is selected based on the configuration class:

  - [TimesFm2_5Config](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5Config) configuration class: [TimesFm2_5ModelForPrediction](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5ModelForPrediction) (TimesFm2_5Config model)
  - [TimesFmConfig](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmConfig) configuration class: [TimesFmModelForPrediction](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmModelForPrediction) (TimesFmConfig model)
- **attn_implementation** (`str`, *optional*) --
  The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.

Instantiates one of the model classes of the library (with a time-series prediction head) from a configuration.

Note:
Loading a model from its configuration file does **not** load the model weights. It only affects the
model's configuration. Use [from_pretrained()](/docs/transformers/v5.8.0/en/model_doc/auto#transformers.AutoModel.from_pretrained) to load the model weights.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTimeSeriesPrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/timesfm-2.0-500m-pytorch")
>>> model = AutoModelForTimeSeriesPrediction.from_config(config)
```

**Parameters:**

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig)) : The model class to instantiate is selected based on the configuration class:  - [TimesFm2_5Config](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5Config) configuration class: [TimesFm2_5ModelForPrediction](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5ModelForPrediction) (TimesFm2_5Config model) - [TimesFmConfig](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmConfig) configuration class: [TimesFmModelForPrediction](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmModelForPrediction) (TimesFmConfig model)

attn_implementation (`str`, *optional*) : The attention implementation to use in the model (if relevant). Can be any of `"eager"` (manual implementation of the attention), `"sdpa"` (using [`F.scaled_dot_product_attention`](https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html)), `"flash_attention_2"` (using [Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention)), or `"flash_attention_3"` (using [Dao-AILab/flash-attention/hopper](https://github.com/Dao-AILab/flash-attention/tree/main/hopper)). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual `"eager"` implementation.
#### from_pretrained[[transformers.AutoModelForTimeSeriesPrediction.from_pretrained]]

[Source](https://github.com/huggingface/transformers/blob/v5.8.0/src/transformers/models/auto/auto_factory.py#L263)

Instantiate one of the model classes of the library (with a time-series prediction head) from a pretrained model.

The model class to instantiate is selected based on the `model_type` property of the config object (either
passed as an argument or loaded from `pretrained_model_name_or_path` if possible), or when it's missing, by
falling back to using pattern matching on `pretrained_model_name_or_path`:

- **timesfm** -- [TimesFmModelForPrediction](/docs/transformers/v5.8.0/en/model_doc/timesfm#transformers.TimesFmModelForPrediction) (TimesFmConfig model)
- **timesfm2_5** -- [TimesFm2_5ModelForPrediction](/docs/transformers/v5.8.0/en/model_doc/timesfm2_5#transformers.TimesFm2_5ModelForPrediction) (TimesFm2_5Config model)

The model is set in evaluation mode by default using `model.eval()` (so for instance, dropout modules are
deactivated). To train the model, you should first set it back in training mode with `model.train()`.

Examples:

```python
>>> from transformers import AutoConfig, AutoModelForTimeSeriesPrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained("google/timesfm-2.0-500m-pytorch")

>>> # Update configuration during loading
>>> model = AutoModelForTimeSeriesPrediction.from_pretrained("google/timesfm-2.0-500m-pytorch", output_attentions=True)
>>> model.config.output_attentions
True
```

**Parameters:**

pretrained_model_name_or_path (`str` or `os.PathLike`) : Can be either:  - A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co. - A path to a *directory* containing model weights saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained), e.g., `./my_model_directory/`.

model_args (additional positional arguments, *optional*) : Will be passed along to the underlying model `__init__()` method.

config ([PreTrainedConfig](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig), *optional*) : Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:  - The model is a model provided by the library (loaded with the *model id* string of a pretrained model). - The model was saved using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and is reloaded by supplying the save directory. - The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in the directory.

state_dict (*dict[str, torch.Tensor]*, *optional*) : A state dictionary to use instead of a state dictionary loaded from the saved weights file.  This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check if using [save_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.save_pretrained) and [from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/model#transformers.PreTrainedModel.from_pretrained) is not a simpler option.

cache_dir (`str` or `os.PathLike`, *optional*) : Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.

force_download (`bool`, *optional*, defaults to `False`) : Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.

proxies (`dict[str, str]`, *optional*) : A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}`. The proxies are used on each request.

output_loading_info (`bool`, *optional*, defaults to `False`) : Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.

local_files_only (`bool`, *optional*, defaults to `False`) : Whether or not to only look at local files (e.g., not try downloading the model).

revision (`str`, *optional*, defaults to `"main"`) : The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

trust_remote_code (`bool`, *optional*, defaults to `False`) : Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.

code_revision (`str`, *optional*, defaults to `"main"`) : The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.

kwargs (additional keyword arguments, *optional*) : Can be used to update the configuration object (after it has been loaded) and initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:  - If a configuration is provided with `config`, `**kwargs` will be directly passed to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done) - If a configuration is not provided, `kwargs` will be first passed to the configuration class initialization function ([from_pretrained()](/docs/transformers/v5.8.0/en/main_classes/configuration#transformers.PreTrainedConfig.from_pretrained)). Each key of `kwargs` that corresponds to a configuration attribute will be used to override said attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.

