Instructions for using sriksven/FinanceForge-8b with libraries and local apps.

Libraries

How to use sriksven/FinanceForge-8b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="sriksven/FinanceForge-8b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("sriksven/FinanceForge-8b")
model = AutoModelForCausalLM.from_pretrained("sriksven/FinanceForge-8b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use sriksven/FinanceForge-8b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "sriksven/FinanceForge-8b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "sriksven/FinanceForge-8b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
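
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the vLLM server started above is listening on `localhost:8000`. The helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="sriksven/FinanceForge-8b",
                       base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the assistant's reply text (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]
```

With the server running, `send_chat_request(build_chat_request("What is the capital of France?"))` should return the reply text. Building and sending are kept separate so the payload can be inspected without a live server.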

Use Docker

docker model run hf.co/sriksven/FinanceForge-8b

SGLang

How to use sriksven/FinanceForge-8b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "sriksven/FinanceForge-8b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "sriksven/FinanceForge-8b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "sriksven/FinanceForge-8b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "sriksven/FinanceForge-8b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Unsloth Studio

How to use sriksven/FinanceForge-8b with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for sriksven/FinanceForge-8b to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for sriksven/FinanceForge-8b to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for sriksven/FinanceForge-8b to start chatting

Load model with FastModel

# Install first (shell command): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="sriksven/FinanceForge-8b",
    max_seq_length=2048,
)

Docker Model Runner
How to use sriksven/FinanceForge-8b with Docker Model Runner:
```
docker model run hf.co/sriksven/FinanceForge-8b
```

krishna-finance-7b

A fine-tuned Qwen2.5-7B-Instruct model specialized for financial question answering and quantitative reasoning. It was trained on a mix of financial QA and instruction-following datasets to handle earnings analysis, ratio calculations, financial statement interpretation, and investment reasoning.

Key Details


| Field | Value |
|---|---|
| Base model | Qwen/Qwen2.5-7B-Instruct |
| Method | QLoRA (4-bit NF4, rank 16, alpha 16) |
| Library | Unsloth + TRL SFTTrainer |
| Datasets | TheFinAI/flare-finqa (5K) + Sujet-Finance-Instruct-177k (5K) |
| Total examples | 10,000 |
| Hardware | NVIDIA RTX A5000 (24GB VRAM) on RunPod |
| Training time | ~2.75 hours |
| Parameters trained | 40.4M of 7.66B (0.53%) |
| Format | ChatML (`<\|im_start\|>` / `<\|im_end\|>`) |
| Output | Merged 16-bit safetensors |
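
The trained-parameter figure can be sanity-checked with a short calculation. This is a sketch under one assumption the card does not state: that LoRA adapters were attached to all seven linear projections in every transformer layer. The layer dimensions are the ones published in the Qwen2.5-7B config.

```python
# Qwen2.5-7B architecture values (from the published model config).
HIDDEN = 3584          # hidden_size
INTERMEDIATE = 18944   # intermediate_size (MLP width)
LAYERS = 28            # num_hidden_layers
KV_DIM = 4 * 128       # num_key_value_heads * head_dim

RANK = 16  # LoRA rank reported in Key Details

# Assumption: adapters on all seven linear projections per layer.
# Each entry is (in_features, out_features).
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank, targets, layers):
    """Each adapter adds A (rank x in) and B (out x rank) per wrapped projection."""
    per_layer = sum(rank * (fan_in + fan_out) for fan_in, fan_out in targets.values())
    return per_layer * layers


trainable = lora_param_count(RANK, TARGETS, LAYERS)
total = 7.66e9  # total parameter count reported in Key Details
print(f"{trainable:,} trainable ({100 * trainable / total:.2f}% of total)")
```

Under these assumptions the count comes to 40,370,176, which is consistent with the reported 40.4M (0.53%).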

Dataset Composition

The training data blends two complementary sources:

- FinQA (5,000 examples): financial question answering requiring numerical reasoning over earnings reports, balance sheets, and financial tables. Teaches the model to extract numbers, perform calculations, and explain financial logic step by step.
- Sujet Finance Instruct (5,000 examples): broad financial instruction data covering investment analysis, market concepts, risk assessment, portfolio management, and financial planning. Gives the model general financial fluency.
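
One plausible way to build such a 50/50 blend is sketched below. The card does not describe the sampling procedure, so random sampling, the seed, and the shuffle are assumptions. The pools are stand-in records rather than the real datasets, and their sizes are placeholders.

```python
import random


def blend_datasets(finqa, sujet, per_source=5_000, seed=42):
    """Draw per_source examples from each source without replacement, then shuffle."""
    rng = random.Random(seed)
    mixed = rng.sample(finqa, per_source) + rng.sample(sujet, per_source)
    rng.shuffle(mixed)
    return mixed


# Stand-in records; a real run would load the two datasets instead.
finqa_pool = [{"source": "finqa", "id": i} for i in range(8_000)]
sujet_pool = [{"source": "sujet", "id": i} for i in range(177_000)]

train_set = blend_datasets(finqa_pool, sujet_pool)
print(len(train_set))
```

Fixing the seed keeps the blend reproducible across runs.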

Usage

Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("sriksven/krishna-finance-7b")
tokenizer = AutoTokenizer.from_pretrained("sriksven/krishna-finance-7b")

messages = [
    {
        "role": "system",
        "content": "You are a financial analyst. Answer questions about financial data with precise calculations and step-by-step reasoning.",
    },
    {
        "role": "user",
        "content": "A company reported revenue of $120M and cost of goods sold of $75M. Operating expenses were $25M. Calculate the gross margin and operating margin.",
    },
]

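# Request a dict of tensors so the call works whether the installed
# Transformers version returns a tensor or a BatchEncoding by default.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))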
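
To check the model's answer to the example prompt, the expected figures can be computed directly. This is a small reference calculation, not output from the model. The function name `margins` is illustrative.

```python
def margins(revenue, cogs, operating_expenses):
    """Return (gross margin, operating margin) as fractions of revenue."""
    gross_profit = revenue - cogs
    operating_income = gross_profit - operating_expenses
    return gross_profit / revenue, operating_income / revenue


# Figures from the example prompt, in millions of dollars.
gross, operating = margins(revenue=120.0, cogs=75.0, operating_expenses=25.0)
print(f"Gross margin: {gross:.1%}")
print(f"Operating margin: {operating:.1%}")
```

For these inputs the gross margin is 37.5% ($45M / $120M) and the operating margin is about 16.7% ($20M / $120M).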

Unsloth (faster inference)

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="sriksven/krishna-finance-7b",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

Example Capabilities

- Financial ratio calculation: gross margin, operating margin, ROE, P/E, debt-to-equity
- Earnings analysis: interpreting revenue trends, YoY growth, segment performance
- Financial statement reading: balance sheet, income statement, cash flow analysis
- Investment reasoning: valuation approaches, risk factors, portfolio considerations
- Quantitative QA: multi-step numerical reasoning over financial data

Intended Use

- Financial question answering systems
- Finance-focused chatbots or copilots
- Quantitative analysis assistants for analysts and students
- Research on domain-specific LLM fine-tuning in finance

Limitations

- Not a financial advisor: outputs should not be used as investment advice
- Trained on English-language financial data only
- May hallucinate financial figures that are not present in the input context
- No access to real-time market data: knowledge is limited to patterns in the training data
- Not evaluated against established financial NLP benchmarks (FinQA leaderboard, etc.)
- Works best when prompts follow the same system prompt format used in training
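
For reference, a ChatML prompt with a system message can be rendered by hand as sketched below. This is a minimal illustration of the format only. In practice `tokenizer.apply_chat_template` should be preferred, since it applies the template shipped with the model. The helper name `to_chatml` is hypothetical, and the system prompt is the one from the Usage example, which is assumed rather than confirmed to match training.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = to_chatml([
    {
        "role": "system",
        "content": "You are a financial analyst. Answer questions about financial "
                   "data with precise calculations and step-by-step reasoning.",
    },
    {"role": "user", "content": "What is a current ratio?"},
])
print(prompt)
```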

Training Infrastructure


| Setting | Value |
|---|---|
| GPU | NVIDIA RTX A5000 24GB |
| Cloud | RunPod ($0.27/hr) |
| Framework | Unsloth 2026.5.2 + TRL + Transformers 5.5.0 |
| Precision | BF16 training, 4-bit NF4 base quantization |
| Optimizer | AdamW 8-bit |
| Learning rate | 2e-4, linear decay |
| Batch size | 16 effective (4 per device × 4 accumulation) |
| Packing | Enabled |
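
The arithmetic behind these settings can be sketched as follows. The sketch assumes a single GPU and a schedule with no warmup, neither of which the card states. The 100-step run is a hypothetical value used only to illustrate the decay, and the cost is approximate because the training time is itself approximate.

```python
PER_DEVICE_BATCH = 4
GRAD_ACCUM_STEPS = 4
PEAK_LR = 2e-4
HOURLY_RATE_USD = 0.27
TRAINING_HOURS = 2.75


def linear_decay_lr(step, total_steps, peak_lr=PEAK_LR):
    """Learning rate that falls linearly from peak_lr at step 0 to 0 at total_steps."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return peak_lr * remaining


effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS
approx_cost = HOURLY_RATE_USD * TRAINING_HOURS

print(f"Effective batch size: {effective_batch}")
print(f"Approximate GPU cost: ${approx_cost:.2f}")
print(f"LR halfway through a 100-step run: {linear_decay_lr(50, 100):.1e}")
```

With the listed values this gives an effective batch size of 16 and a GPU cost of roughly $0.74 for the run.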

Source Code

Training scripts and configs: github.com/sriksven/LLM-FineTune-Suite

License

Apache 2.0

Safetensors

Model size: 8B params
Tensor type: BF16

Model tree for sriksven/FinanceForge-8b

Base model: Qwen/Qwen2.5-7B
Finetuned: Qwen/Qwen2.5-7B-Instruct