---
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- graphic-design
- design-generation
- layout-planning
- qwen3
base_model: Qwen/Qwen3-8B
---

# DesignAsCode Semantic Planner

The Semantic Planner for the [DesignAsCode](https://github.com/liuziyuan1109/design-as-code) pipeline. Given a natural-language design request, it generates a structured design plan, including layout reasoning, layer grouping, image generation prompts, and text element specifications.

## Model Details

| | |
|---|---|
| **Base Model** | Qwen3-8B |
| **Fine-tuning** | Supervised Fine-Tuning (SFT) |
| **Size** | 16 GB (fp16) |
| **Context Window** | 8,192 tokens |

## Training Data

Trained on ~10k examples sampled from the [DesignAsCode Training Data](https://huggingface.co/datasets/Tony1109/DesignAsCode-training-data), which contains 19,479 design samples distilled from the [Crello](https://huggingface.co/datasets/cyberagent/crello) dataset using GPT-4o and GPT-o3. No additional data was used.

### Training Format

- **Input:** `prompt`, the natural-language design request
- **Output:** `layout_thought` + `grouping` + `image_generator` + `generate_text`

See the [training data repo](https://huggingface.co/datasets/Tony1109/DesignAsCode-training-data) for field details.

## Training Configuration

| | |
|---|---|
| **Batch Size** | 1 |
| **Gradient Accumulation** | 2 |
| **Learning Rate** | 5e-5 (AdamW) |
| **Epochs** | 2 |
| **Max Sequence Length** | 8,192 tokens |
| **Precision** | bfloat16 |
| **Loss** | Completion-only (only on generated tokens) |
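
Completion-only loss is typically implemented by setting the prompt tokens' labels to `-100`, the index that `torch.nn.CrossEntropyLoss` ignores, so gradients flow only through the generated plan tokens. A minimal sketch of the masking step (illustrative toy data, not the actual training code):

```python
import torch

IGNORE_INDEX = -100  # ignored by torch.nn.CrossEntropyLoss

def build_labels(input_ids: torch.Tensor, prompt_len: int) -> torch.Tensor:
    """Mask the prompt portion so loss is computed only on the completion."""
    labels = input_ids.clone()
    labels[:prompt_len] = IGNORE_INDEX
    return labels

# toy example: 4 prompt tokens followed by 3 completion tokens
ids = torch.tensor([101, 5, 6, 7, 42, 43, 102])
labels = build_labels(ids, prompt_len=4)
```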

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_path = "Tony1109/DesignAsCode-planner"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto"
)
```
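
Continuing the loading snippet above, a generation call might look like the following; the chat-template usage and the `max_new_tokens` value are assumptions, not settings taken from the project repo:

```python
# assumes `tokenizer` and `model` from the loading snippet above
messages = [{"role": "user", "content": "Design a poster for a summer music festival."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=2048)
# decode only the newly generated tokens, skipping the echoed prompt
plan = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(plan)
```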

For full pipeline usage (plan → implement → reflection), see the [project repo](https://github.com/liuziyuan1109/design-as-code) and [Quick Start](https://github.com/liuziyuan1109/design-as-code#quick-start).

## Outputs

The model generates semi-structured text with XML tags:

- `<layout_thought>...</layout_thought>`: detailed layout reasoning
- `<grouping>...</grouping>`: JSON array grouping related layers with thematic labels
- `<image_generator>...</image_generator>`: JSON array of per-layer image generation prompts
- `<generate_text>...</generate_text>`: JSON array of text element specifications (font, size, alignment, etc.)
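
The tagged sections can be pulled out with a simple regex; a sketch (the tag names come from the list above, but the parsing code and sample plan are illustrative, not part of the pipeline):

```python
import json
import re

def extract(tag: str, text: str) -> str:
    """Return the content between <tag> and </tag>, or '' if absent."""
    m = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    return m.group(1).strip() if m else ""

plan = (
    "<layout_thought>Center the title.</layout_thought>"
    '<grouping>[{"label": "header", "layers": [0, 1]}]</grouping>'
)
thought = extract("layout_thought", plan)
grouping = json.loads(extract("grouping", plan))
```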

## Ethical Considerations

- Designs should be reviewed by humans before production use.
- May reflect biases present in the training data.
- Generated content should be checked for copyright compliance.

## Citation

```bibtex
@article{liu2026designascode,
  title   = {DesignAsCode: Bridging Structural Editability and Visual Fidelity in Graphic Design Generation},
  author  = {Liu, Ziyuan and Sun, Shizhao and Huang, Danqing and Shi, Yingdong and Zhang, Meisheng and Li, Ji and Yu, Jingsong and Bian, Jiang},
  journal = {arXiv preprint arXiv:2602.17690},
  year    = {2026},
  url     = {https://arxiv.org/abs/2602.17690}
}
```