the prompt "a cat" gets blocked
JSON-formatting the prompt helps sometimes, but the behavior is still strange.
Thank you for the protection provided by your filter
- error:
[ERROR] Error running sage attention: Unsupported head_dim: 256, using pytorch attention instead.
It could have been a very dangerous cat.
This is a new level of safety cucking. imagine art does this with their models, but this is new for an open source model.
It seems to need quite a structured prompt, or you can use an LLM node to build one for you from a basic prompt input,
and after that you never see this blocked message again.
I did not prompt for this text
Just use the suggested prompt structure from them, and you can prompt for pretty much anything..
And not see any block
https://huggingface.co/ideogram-ai/ideogram-4-fp8/discussions/5#6a208948d2802761c1c3a188
i am using that already
this is interesting. Exploring the frontier indeed. Is this coming from the prompt enhancer? the clip? the model itself?
as for the model i hear a lot about text. How good is it with the images stuff, when it occasionally works?
It's baked into the model; it's not part of the prompt enhancer. Either RL, or they just trained on this image paired with NSFW prompts.
It's the prompt structure.
With a correct JSON prompt I never see the filter, even for slightly raunchy or sexy images (which Ideogram even has on its own website). It's not a model for more hardcore NSFW, of course, but the filter seems to trigger on unrelated prompts that have nothing to do with NSFW, because of the prompt structure.
But apparently they are working on a fix for that. If you use an LLM or some other way to properly structure the prompt, you won't see the safety filter.
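For anyone who wants to try the structured route without an LLM, here is a minimal sketch that wraps a plain prompt in the JSON layout from the example posted later in this thread. The key names are copied from that example; the helper name `build_structured_prompt`, the sample values, and the bbox numbers are made up for illustration, and I am assuming (not confirming) that the model accepts a prompt with only these fields filled in.

```python
import json

def build_structured_prompt(description, aesthetics="", lighting="", medium="", elements=None):
    """Wrap a plain prompt in the JSON layout seen in this thread (hypothetical helper)."""
    prompt = {
        "high_level_description": description,
        "style_description": {
            "aesthetics": aesthetics,
            "lighting": lighting,
            "medium": medium,
        },
        "compositional_deconstruction": {
            # Each element is a (bbox, desc) pair; the bbox values are placeholders,
            # the coordinate convention is not documented in this thread.
            "elements": [
                {"type": "obj", "bbox": list(bbox), "desc": desc}
                for bbox, desc in (elements or [])
            ]
        },
    }
    return json.dumps(prompt, indent=2)

structured = build_structured_prompt(
    "A photograph of a cat sitting on a windowsill.",
    aesthetics="Calm, natural.",
    lighting="Soft daylight.",
    medium="Photograph.",
    elements=[((200, 200, 800, 800), "A grey tabby cat looking out of the window.")],
)
print(structured)
```

The printed string is what you would paste in as the prompt. Whether this is enough to avoid the block is something to test; the thread only reports that well-structured prompts tend to get through.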
would be better without such (excessive) "sa-fe-ty", it will already inspire other companies to do the same thing. I am glad I haven't downloaded this hell, I think I respect myself :>
The prompt format isn't really the problem in this case. Just remember that SD3 forced us to use natural language instead of tags; this reminds me of that.
The fact that the devs thought their model was "open source" is one more sign that they did not apply enough care.
But the prompt format absolutely is the issue. It doesn't fail on NSFW, but on anything like "woman in the grass", cat, car. It has nothing to do with safety when a prompt for a car fails, obviously.
With the correct prompt format, it never fails. I don't see this "safety filter" at all anymore.
https://huggingface.co/ideogram-ai/ideogram-4-fp8/discussions/5#6a2172dbbe3b28898d637750
"I am glad I haven't downloaded this hell"
That's all up to you, of course, but it's probably one of the best open source image models currently.
I've been creating some amazing posters, infographics, text and more that I am not able to do easily with other models.
Ok ideogram lawyer
Ppl trying to get something worked up so they can get busy with their life. I think you should just ignore them
Well, if it's true that it's about the prompt format, then it's true.
But I don't like the very fact that you need an LLM to write a prompt. That isn't true of all the models people say it about; most are perfectly fine with simple prompts. Some models even feel creatively suffocated if you spell out every detail for them. But this one seems to strictly require it.
Which brings me back to the question of whether it's worth the trouble (even if it's not censored). There's also the irritating fact that this kind of "safety" is hard-wired in at all. (I love this nice Orwellianism, "safety" for censorship, inspired almost exclusively by American pathological prudery.)
They seem to be working on it.
Personally I think it's worth it to use an LLM for the JSON prompt until there is a fix.
The model is extremely good at text, infographics, composition etc.
But yeah, I understand the hassle. I have a node in Comfy that does it all automagically on the fly. But if you have to go to an LLM website for each prompt, it's for sure a bit of a hassle.
Maybe they will update the models with a fix before long; at least some posts on Twitter (or X) suggested they would.
It's a bit of a bummer for sure, since some people are reacting negatively.
Personally, it wasn't on my bingo card that Ideogram would go open source, and I'm grateful they did, despite the prompt challenges.
Open weights, not open source. Devalues work for those labs which actually do opensource their models. Can't wait for the trend of the safety filter garbage baked into the weights to take off.
The term Orwellian-ism is actually quite apt, since the model can get tripped up on a prompt like the one below, which I assume is due to the word "intimate". Time to brush up on your Newspeak, comrade.
{
"high_level_description": "A close-up portrait photograph of an elderly potter in her sunlit studio, clay on her hands, looking just off-camera with a faint smile, shot on Kodak Portra 400 with shallow depth of field.",
"style_description": {
"aesthetics": "Warm, intimate, documentary portrait.",
"lighting": "Soft late-afternoon window light from the left, gentle falloff into shadow.",
"photo": "85mm, f/2, Kodak Portra 400, shallow depth of field, natural skin texture.",
"medium": "Photograph.",
"color_palette": ["#6B4A2F", "#D8C3A5", "#A7553C"]
},
"compositional_deconstruction": {
"elements": [
{ "type": "obj", "bbox": [80, 220, 760, 780], "desc": "Woman in her seventies filling most of the frame, face on the upper third, three-quarter view turned slightly left, soft smile, fine wrinkles and visible pores, silver hair loosely tied back, clay-streaked linen apron over a charcoal shirt, eyes catching the window light." },
{ "type": "obj", "bbox": [620, 300, 920, 700], "desc": "Her hands in the lower foreground, fingers dusted with grey wet clay, resting near a half-formed bowl on a potter's wheel, slightly soft in focus." }
]
}
}
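To check a guess like the "intimate" one, it helps to see exactly which fields of the JSON prompt contain a given word, then swap it out and resubmit. A rough sketch follows; `find_word` is a hypothetical helper, the prompt is a shortened copy of the one above, and nothing here is confirmed to be how the filter actually decides.

```python
import json

def find_word(node, word, path=""):
    """Yield the path of every string value containing `word` (case-insensitive)."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_word(value, word, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_word(value, word, f"{path}[{index}]")
    elif isinstance(node, str) and word.lower() in node.lower():
        yield path

# Shortened copy of the prompt above, kept only to illustrate the lookup.
prompt = json.loads("""
{
  "high_level_description": "A close-up portrait photograph of an elderly potter in her sunlit studio.",
  "style_description": {
    "aesthetics": "Warm, intimate, documentary portrait.",
    "medium": "Photograph."
  }
}
""")

hits = list(find_word(prompt, "intimate"))
print(hits)  # ['style_description.aesthetics']
```

If the prompt goes through once that one field is reworded, the word is a likely trigger; if it still gets blocked, the cause is somewhere else.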





