# LLMPopcorn Usage Instructions
Welcome to LLMPopcorn! This guide will help you generate video titles and prompts, as well as create AI-generated videos based on those prompts.
## Prerequisites

### Install Required Python Packages

Before running the scripts, ensure that you have installed the necessary Python packages. You can do this by executing the following command:

```shell
pip install torch transformers diffusers tqdm numpy pandas sentence-transformers faiss-cpu openai huggingface_hub safetensors
```
### Download the MicroLens Dataset

Download the following files from the MicroLens dataset and place them in the Microlens/ folder:

| File | Description |
|---|---|
| MicroLens-100k_likes_and_views.txt | Video engagement stats (tab-separated) |
| MicroLens-100k_title_en.csv | Cover image descriptions (comma-separated) |
| Microlens100K_captions_en.csv | Video captions in English (tab-separated) |
| MicroLens-100k_comment_en.txt | User comments (tab-separated) |
| tags_to_summary.csv | Video category tags (comma-separated) |
Your directory structure should look like:

```
LLMPopcorn/
├── Microlens/
│   ├── MicroLens-100k_likes_and_views.txt
│   ├── MicroLens-100k_title_en.csv
│   ├── Microlens100K_captions_en.csv
│   ├── MicroLens-100k_comment_en.txt
│   └── tags_to_summary.csv
├── PE.py
├── pipline.py
└── ...
```
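Before running the pipeline, you can sanity-check that all five raw files are in place with a short script. This is just a convenience sketch (the helper name `check_microlens_dir` is not part of the repository); the file list mirrors the table above, and `Microlens/` is assumed to sit next to the scripts:

```python
from pathlib import Path

# Expected raw MicroLens files, matching the table above
REQUIRED_FILES = [
    "MicroLens-100k_likes_and_views.txt",
    "MicroLens-100k_title_en.csv",
    "Microlens100K_captions_en.csv",
    "MicroLens-100k_comment_en.txt",
    "tags_to_summary.csv",
]

def check_microlens_dir(root="Microlens"):
    """Return the list of required files missing from the dataset folder."""
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]

missing = check_microlens_dir()
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All MicroLens files found.")
```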
## Step 1: Generate Video Titles and Prompts

To generate video titles and prompts, run the LLMPopcorn.py script:

```shell
python LLMPopcorn.py
```

To run the enhanced version of LLMPopcorn, execute the PE.py script:

```shell
python PE.py
```
## Step 2: Generate AI Videos

To create AI-generated videos, execute the generating_images_videos_three.py script:

```shell
python generating_images_videos_three.py
```
## Step 3: Clone the Evaluation Code

Clone the evaluation code from the MMRA repository; then, following the instructions there, you can evaluate the generated videos.
## Tutorial: Using the Prompts Dataset
You can easily download and use the structured prompts directly from Hugging Face:
### 1. Install datasets

```shell
pip install datasets
```
### 2. Load the Dataset in Python

```python
from datasets import load_dataset

# Load the LLMPopcorn prompts
dataset = load_dataset("junchenfu/llmpopcorn_prompts")

# Access the data (abstract or concrete)
for item in dataset["train"]:
    print(f"Type: {item['type']}, Prompt: {item['prompt']}")
```
This dataset contains both abstract and concrete prompts, which you can use as input for the video generation scripts in Step 2.
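Since the dataset mixes both prompt types, you may want to separate them before feeding them to Step 2. A minimal sketch, using the same `type` and `prompt` fields as the snippet above (the sample records here are made up for illustration, not real dataset entries):

```python
# Illustrative records with the same fields as the Hugging Face dataset;
# the prompt texts are invented for this sketch.
sample = [
    {"type": "abstract", "prompt": "A calming nature scene"},
    {"type": "concrete", "prompt": "A close-up of rain on a window at dusk"},
    {"type": "abstract", "prompt": "An energetic city morning"},
]

def split_by_type(items):
    """Group prompt records into abstract and concrete buckets."""
    buckets = {"abstract": [], "concrete": []}
    for item in items:
        buckets[item["type"]].append(item["prompt"])
    return buckets

buckets = split_by_type(sample)
print(len(buckets["abstract"]), "abstract,", len(buckets["concrete"]), "concrete")
```

The same function works on `dataset["train"]` directly, since each item exposes the `type` and `prompt` keys.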
## RAG Reference Dataset: MicroLens
For the RAG-enhanced pipeline (PE.py + pipline.py), we provide a pre-processed version of the MicroLens dataset on Hugging Face so you don't need to download and process the raw files manually.
The dataset is available at: junchenfu/microlens_rag
It contains 19,560 video entries across 22 categories with the following fields:
| Column | Description |
|---|---|
| video_id | Unique video identifier |
| title_en | Cover image description (used as title) |
| cover_desc | Cover image description |
| caption_en | Full video caption in English |
| partition | Video category (e.g., Anime, Game, Delicacy) |
| likes | Number of likes |
| views | Number of views |
| comment_count | Number of comments (used as popularity signal) |
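To illustrate how the engagement columns could be combined into a popularity score, here is a toy example on a few rows with the same schema. The weighting is purely illustrative and is not the signal PE.py actually computes:

```python
import pandas as pd

# Toy rows with the same columns as junchenfu/microlens_rag
df = pd.DataFrame({
    "video_id": ["v1", "v2", "v3"],
    "partition": ["Game", "Delicacy", "Game"],
    "likes": [120, 30, 480],
    "views": [2000, 900, 5000],
    "comment_count": [15, 4, 60],
})

# Illustrative score: comments weighted most heavily, then likes, then views
df["popularity"] = df["comment_count"] * 2 + df["likes"] * 0.1 + df["views"] * 0.01

# Top video per category
top = df.sort_values("popularity", ascending=False).groupby("partition").head(1)
print(top[["partition", "video_id", "popularity"]])
```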
### Load the RAG Dataset in Python

```python
from datasets import load_dataset

rag_dataset = load_dataset("junchenfu/microlens_rag")

# Access as a pandas DataFrame
df = rag_dataset["train"].to_pandas()
print(df.head())
print(f"Total: {len(df)} videos, {df['partition'].nunique()} categories")
```
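For RAG, the pipeline installs sentence-transformers and faiss-cpu to embed and index the captions. As a minimal stand-in for that retrieval step, here is a plain-NumPy cosine-similarity search over hand-made vectors; in the real pipeline the vectors would be sentence-transformer embeddings of caption_en, indexed with FAISS:

```python
import numpy as np

# Stand-in for sentence-transformer embeddings of three caption_en texts
corpus_vecs = np.array([
    [1.0, 0.0, 0.0],   # e.g. a cooking video
    [0.0, 1.0, 0.0],   # e.g. a gaming video
    [0.7, 0.7, 0.0],   # e.g. a cooking-stream hybrid
])
query_vec = np.array([1.0, 0.1, 0.0])  # stand-in query embedding

def top_k_cosine(query, corpus, k=2):
    """Return indices of the k corpus vectors most similar to the query."""
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    sims = c @ q
    return np.argsort(-sims)[:k]

print(top_k_cosine(query_vec, corpus_vecs))  # nearest captions first
```

Swapping this for a FAISS inner-product index over normalized embeddings gives the same ranking at scale.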