Add metadata for license, library, and pipeline tag and add paper/code links
#1 opened by nielsr (HF Staff)
README.md CHANGED
@@ -1,18 +1,25 @@
 ---
 datasets:
 - ExponentialScience/DLT-Sentiment-News
 language:
 - en
-
--
 ---
 # LedgerBERT-Market-Sentiment
 
 ## Model Description
 
 ### Model Summary
 
-LedgerBERT-Market-Sentiment is a fine-tuned version of LedgerBERT
 
 This model is particularly effective for analyzing cryptocurrency news headlines, social media posts, and other DLT-related content where understanding market sentiment is important.
 
@@ -88,7 +95,7 @@ The dataset provides domain expertise through crowdsourced annotations from cryp
 
 **Note:** News articles are absent from the DLT-Corpus used to pre-train LedgerBERT, making this an out-of-domain generalization test that demonstrates the model's robust language understanding.
 
-For more details on the dataset used for
 
 ### Training Procedure
 
@@ -161,13 +168,14 @@ for text in texts:
     predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
     predicted_class = predictions.argmax(dim=-1).item()
 
-    # Map to labels
-    labels = ["
     sentiment = labels[predicted_class]
     confidence = predictions[0][predicted_class].item()
 
     print(f"Text: {text}")
-    print(f"Sentiment: {sentiment} (confidence: {confidence:.3f})
 ```
 
 ### Batch Processing
@@ -193,7 +201,8 @@ results = classifier(texts, truncation=True, max_length=512)
 
 for text, result in zip(texts, results):
     print(f"Text: {text}")
-    print(f"Sentiment: {result['label']} (score: {result['score']:.3f})
 ```
 
 ### Integration with News Feeds
@@ -218,7 +227,8 @@ for entry in feed.entries[:5]: # Process first 5 entries
 
     print(f"Headline: {title}")
     print(f"Market Sentiment: {result['label']} ({result['score']:.2%})")
-    print(f"Link: {entry.link}
 ```
 
 ## Citation
@@ -245,7 +255,7 @@ If you use LedgerBERT-Market-Sentiment in your research, please cite:
 
 ### Additional Fine-tuned Models
 
-LedgerBERT can also be fine-tuned for other sentiment dimensions available in the DLT-Sentiment-News dataset
 - **Content Characteristics** (liked, disliked, neutral)
 - **Engagement Quality** (important, lol, neutral)
 
@@ -1,18 +1,25 @@
 ---
+base_model: ExponentialScience/LedgerBERT
 datasets:
 - ExponentialScience/DLT-Sentiment-News
 language:
 - en
+library_name: transformers
+license: cc-by-nc-4.0
+pipeline_tag: text-classification
 ---
+
 # LedgerBERT-Market-Sentiment
 
+This model was introduced in the paper [DLT-Corpus: A Large-Scale Text Collection for the Distributed Ledger Technology Domain](https://huggingface.co/papers/2602.22045).
+
+The official code repository is available [here](https://github.com/dlt-science/DLT-Corpus).
+
 ## Model Description
 
 ### Model Summary
 
+LedgerBERT-Market-Sentiment is a fine-tuned version of [LedgerBERT](https://huggingface.co/ExponentialScience/LedgerBERT) specialized for sentiment analysis of cryptocurrency and DLT-related content. The model classifies text into three market direction sentiment categories: **bullish** (positive market outlook), **bearish** (negative market outlook), and **neutral** (balanced or unclear market direction).
 
 This model is particularly effective for analyzing cryptocurrency news headlines, social media posts, and other DLT-related content where understanding market sentiment is important.
 
@@ -88,7 +95,7 @@ The dataset provides domain expertise through crowdsourced annotations from cryp
 
 **Note:** News articles are absent from the DLT-Corpus used to pre-train LedgerBERT, making this an out-of-domain generalization test that demonstrates the model's robust language understanding.
 
+For more details on the dataset used for fine-tuning, see: https://huggingface.co/datasets/ExponentialScience/DLT-Sentiment-News
 
 ### Training Procedure
 
@@ -161,13 +168,14 @@ for text in texts:
     predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
     predicted_class = predictions.argmax(dim=-1).item()
 
+    # Map to labels based on config.json
+    labels = ["neutral", "bearish", "bullish"]
     sentiment = labels[predicted_class]
     confidence = predictions[0][predicted_class].item()
 
     print(f"Text: {text}")
+    print(f"Sentiment: {sentiment} (confidence: {confidence:.3f})")
+    print()
 ```
 
 ### Batch Processing
@@ -193,7 +201,8 @@ results = classifier(texts, truncation=True, max_length=512)
 
 for text, result in zip(texts, results):
     print(f"Text: {text}")
+    print(f"Sentiment: {result['label']} (score: {result['score']:.3f})")
+    print()
 ```
 
 ### Integration with News Feeds
@@ -218,7 +227,8 @@ for entry in feed.entries[:5]: # Process first 5 entries
 
     print(f"Headline: {title}")
     print(f"Market Sentiment: {result['label']} ({result['score']:.2%})")
+    print(f"Link: {entry.link}")
+    print()
 ```
 
 ## Citation
@@ -245,7 +255,7 @@ If you use LedgerBERT-Market-Sentiment in your research, please cite:
 
 ### Additional Fine-tuned Models
 
+LedgerBERT can also be fine-tuned for other sentiment dimensions available in the DLT-Sentiment-News dataset:
 - **Content Characteristics** (liked, disliked, neutral)
 - **Engagement Quality** (important, lol, neutral)
 
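The hard-coded `labels` list in the inference example only works if its order matches the class indices in the model's `config.json`. A minimal sketch of a dictionary-based lookup that avoids relying on list position is shown below. The `ID2LABEL` values mirror the order given in this diff and are an assumption, not a verified copy of the config; `softmax` and `decode` are hypothetical helper names, and the logits are made up for illustration.

```python
import math

# Assumed mapping, mirroring the order in the diff
# (labels = ["neutral", "bearish", "bullish"]). The authoritative
# source is the id2label field in the model's config.json.
ID2LABEL = {0: "neutral", 1: "bearish", 2: "bullish"}


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, id2label=ID2LABEL):
    """Return (label, confidence) for one row of classifier logits."""
    if len(logits) != len(id2label):
        raise ValueError("logit count does not match number of labels")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical logits for a single input text.
label, confidence = decode([0.2, -1.3, 2.1])
print(f"Sentiment: {label} (confidence: {confidence:.3f})")
```

When the model is loaded with `transformers`, the same mapping is exposed as `model.config.id2label`, so passing that dictionary to `decode` keeps the example in sync with whatever the config actually contains.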