Model Card: GPT-2-DAPT-IMDb

A domain-adapted GPT-2, further pre-trained on the IMDb dataset text.

Model Details

Description

This model is based on the GPT-2 architecture and was further pre-trained (domain-adapted) on the text of the IMDb dataset, excluding its test split.

Developed by: Cesar Gonzalez-Gutierrez
Funded by: ERC
Architecture: GPT-2
Language: English
License: MIT
Base model: GPT-2

Checkpoints

Intermediate checkpoints from the pre-training process are available and can be accessed using specific tags, which correspond to training epochs and steps:

Epoch	Step	Epoch tag	Step tag
1	703	epoch-1	step-703
5	3515	epoch-5	step-3515
10	7031	epoch-10	step-7031
20	14063	epoch-20	step-14063
30	21095	epoch-30	step-21095
40	28126	epoch-40	step-28126
50	35150	epoch-50	step-35150
60	42182	epoch-60	step-42182
70	49214	epoch-70	step-49214
80	56240	epoch-80	step-56240

To load a model from a specific intermediate checkpoint, use the revision parameter with the corresponding tag:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("<model-name>", revision="<checkpoint-tag>")
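
As a minimal sketch of how the tags in the checkpoint table might be used in practice, the snippet below validates an epoch or step against the listed checkpoints and builds the matching tag before passing it as revision. The helper names checkpoint_tag and load_checkpoint are hypothetical and not part of any library, and the default repository ID is taken from the model tree listing on this page.

```python
# Epoch -> step pairs copied from the checkpoint table in this card.
CHECKPOINTS = {
    1: 703, 5: 3515, 10: 7031, 20: 14063, 30: 21095,
    40: 28126, 50: 35150, 60: 42182, 70: 49214, 80: 56240,
}


def checkpoint_tag(epoch=None, step=None):
    """Return the tag for a listed checkpoint (hypothetical helper)."""
    if (epoch is None) == (step is None):
        raise ValueError("pass exactly one of epoch or step")
    if epoch is not None:
        if epoch not in CHECKPOINTS:
            raise ValueError(f"no checkpoint listed for epoch {epoch}")
        return f"epoch-{epoch}"
    if step not in CHECKPOINTS.values():
        raise ValueError(f"no checkpoint listed for step {step}")
    return f"step-{step}"


def load_checkpoint(tag, repo_id="cglez/gpt2-dapt-imdb"):
    """Load the model weights stored under the given tag (hypothetical helper)."""
    # Imported lazily so checkpoint_tag works without transformers installed.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(repo_id, revision=tag)
```

For example, load_checkpoint(checkpoint_tag(epoch=10)) would request the weights tagged epoch-10; the call downloads the model, so it needs network access.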

Sources

Paper: [Information pending]

Training Details

For more details on the training procedure, please refer to the base model's documentation: Training procedure.

Training Data

All texts from the IMDb dataset, excluding the test partition.

Preprocessing

[More Information Needed]

Training Hyperparameters

Precision: fp16
Batch size: 8
Gradient accumulation steps: 12
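
The batch size and gradient accumulation settings combine into an effective batch size per optimizer update. The arithmetic below is a sketch under assumptions this card does not confirm: that the batch size of 8 is per device, that training ran on a single device, and that the steps in the checkpoint table count optimizer updates. The sequences-per-epoch figure is therefore an estimate implied by those assumptions, not a reported value.

```python
# Values stated in this card.
BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 12

# Assumption (not stated in the card): one device, batch size is per device.
NUM_DEVICES = 1

# Sequences consumed per optimizer update.
effective_batch = BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# (epoch, step) pairs copied from the checkpoint table in this card.
checkpoints = [
    (1, 703), (5, 3515), (10, 7031), (20, 14063), (30, 21095),
    (40, 28126), (50, 35150), (60, 42182), (70, 49214), (80, 56240),
]

# Steps per epoch implied by each checkpoint, then their rounded mean.
steps_per_epoch = [step / epoch for epoch, step in checkpoints]
approx_steps = round(sum(steps_per_epoch) / len(steps_per_epoch))

# Estimated sequences per epoch, assuming steps are optimizer updates.
sequences_per_epoch = approx_steps * effective_batch

print(f"effective batch size: {effective_batch}")
print(f"approx. steps per epoch: {approx_steps}")
print(f"approx. sequences per epoch: {sequences_per_epoch}")
```

If the run used more than one device, or if the listed batch size is already the global one, NUM_DEVICES and the resulting estimates would need to be adjusted accordingly.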

Uses

For typical use cases and limitations, please refer to the base model's guidance: Intended uses & limitations.

Bias, Risks, and Limitations

This model inherits potential risks and limitations from the base model. Refer to: Limitations and bias.

Environmental Impact

Hardware Type: NVIDIA A100 PCIE 40GB
Runtime: 10 h
Cluster Provider: Artemisa
Compute Region: EU
Carbon Emitted: 1.55 kg CO2 eq.

Citation

BibTeX:

[More Information Needed]

Safetensors

Model size: 0.1B params
Tensor type: F32

Model tree for cglez/gpt2-dapt-imdb

Base model: openai-community/gpt2