Phi-1.5-RLLMv3
Collection
This collection presents the ten RLLM steps (training runs) intended to improve Phi-1.5's outputs towards coherence and politeness.
Companion Post: Research Log, RLLMv3 (GPT2-XL, Phi-1.5 and Falcon-RW-1B)
Main post: BetterDAN, AI Machiavelli & Oppo Jailbreaks vs. SOTA models & GPT2XL_RLLMv3
Related post: Coherence (and Response Time) Test