---
language:
- en
- fr
- ro
- de
license: apache-2.0
library_name: transformers
datasets:
- c4
---

# Model Card for EncT5

EncT5 is a variant of T5 that mainly uses the encoder for non-autoregressive (i.e., classification and regression)
tasks. The model is from the paper [Fine-tuning T5 Encoder for Non-autoregressive Tasks](https://arxiv.org/abs/2110.08426)
by Frederick Liu, Terry Huang, Shihang Lyu, Siamak Shakeri, Hongkun Yu, and Jing Li.

## Model Details

### Model Description

EncT5 uses the same base weights as T5, but **must be fine-tuned before use**. EncT5 has several special
features (a rough sketch of the architecture follows the list below):

1. There are fewer decoder layers (a single decoder layer by default), so the model has fewer parameters and requires
   fewer resources than the standard T5.
2. There is a separate decoder word embedding, with the decoder input ids being predefined constants. During
   fine-tuning, the decoder embedding learns to use these constants as "prompts" to the encoder for the corresponding
   classification/regression tasks.
3. There is a classification head on top of the decoder output.

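The following is a minimal, hypothetical sketch of how these pieces fit together, assuming a standard `transformers`
T5 backbone. The class and variable names are illustrative and do not come from the official implementation (see the
repository below for the real code).

```python
import torch
import torch.nn as nn
from transformers import T5Model

class EncT5Sketch(nn.Module):
    """Illustrative sketch of the EncT5 design, not the official implementation."""

    def __init__(self, num_labels: int, model_name: str = "google-t5/t5-base"):
        super().__init__()
        # Load the T5 backbone with a single decoder layer; the remaining
        # pretrained decoder layers are dropped (from_pretrained warns but proceeds).
        self.t5 = T5Model.from_pretrained(model_name, num_decoder_layers=1)
        hidden_size = self.t5.config.d_model
        # Separate decoder word embedding over a tiny "prompt" vocabulary
        # (a single constant id here); it is learned during fine-tuning.
        self.decoder_embedding = nn.Embedding(1, hidden_size)
        # Classification head on top of the decoder output.
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        batch_size = input_ids.size(0)
        # The decoder input ids are a predefined constant (0) for every example.
        decoder_input_ids = torch.zeros(batch_size, 1, dtype=torch.long, device=input_ids.device)
        decoder_inputs_embeds = self.decoder_embedding(decoder_input_ids)
        outputs = self.t5(
            input_ids=input_ids,
            attention_mask=attention_mask,
            decoder_inputs_embeds=decoder_inputs_embeds,
        )
        # Single decoder position -> logits for classification/regression.
        return self.classifier(outputs.last_hidden_state[:, 0, :])
```
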
Research has shown that this model can be more efficient than, and as effective as, T5 and BERT for
non-autoregressive tasks such as classification and regression.

- **Developed by:** Frederick Liu, Terry Huang, Shihang Lyu, Siamak Shakeri, Hongkun Yu, and Jing Li. See the
  [associated paper](https://arxiv.org/abs/2110.08426).
- **Model type:** Language Model
- **Language(s) (NLP):** English, French, Romanian, German
- **License:** Apache 2.0
- **Based on model:** [T5](https://huggingface.co/google-t5/t5-base)
- **Repository:** [GitHub repo](https://github.com/hackyon/EncT5)
- **Paper:** [Fine-tuning T5 Encoder for Non-autoregressive Tasks](https://arxiv.org/abs/2110.08426)

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForSequenceClassification

# trust_remote_code=True is required because EncT5 uses custom model code.
model = AutoModelForSequenceClassification.from_pretrained("hackyon/enct5-base", trust_remote_code=True)
# Fine-tune the model before use.
```

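Once the model has been fine-tuned, inference might look like the sketch below. It assumes the t5-base tokenizer
matches (the repository may ship its own) and that the custom model returns a standard output with a `logits` field.

```python
import torch
from transformers import AutoTokenizer

# Assumption: the t5-base tokenizer matches, since the weights are copied from t5-base.
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")

inputs = tokenizer("This movie was great!", return_tensors="pt")
with torch.no_grad():
    # `model` is the fine-tuned classifier loaded above; assuming it returns
    # a standard SequenceClassifierOutput with a `logits` field.
    logits = model(**inputs).logits
predicted_class = logits.argmax(dim=-1).item()
```
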
See the [GitHub repo](https://github.com/hackyon/EncT5) for a more comprehensive guide.

## Training Details

### Training Data

The weights of this model are directly copied from [t5-base](https://huggingface.co/google-t5/t5-base).

### Training Procedure

This model **must be fine-tuned** before use. The decoder word embedding and classification head are both untrained.

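A minimal fine-tuning sketch using the `transformers` Trainer is shown below. The dataset, column names, and
hyperparameters are illustrative assumptions, not prescribed by the paper or the repository.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "hackyon/enct5-base", num_labels=2, trust_remote_code=True
)
# Assumption: the t5-base tokenizer matches, since the weights are copied from t5-base.
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")

# Illustrative dataset choice: GLUE SST-2 (binary sentiment classification).
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="enct5-sst2", num_train_epochs=1),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
```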