ideogram-ai/ideogram-4-fp8 · the prompt "a cat" gets blocked

Manni1000

2 days ago

the prompt "a cat" gets blocked

Manni1000

2 days ago

JSON-formatting the prompts helps sometimes, but the behavior is still strange.

LSI

2 days ago

•

edited 2 days ago

Thank you for the protection provided by your filter

error:
[ERROR] Error running sage attention: Unsupported head_dim: 256, using pytorch attention instead.
[ERROR] Error running sage attention: Unsupported head_dim: 256, using pytorch attention instead.
[ERROR] Error running sage attention: Unsupported head_dim: 256, using pytorch attention instead.
[ERROR] Error running sage attention: Unsupported head_dim: 256, using pytorch attention instead.
[ERROR] Error running sage attention: Unsupported head_dim: 256, using pytorch attention instead.
[ERROR] Error running sage attention: Unsupported head_dim: 256, using pytorch attention instead.

CalamitousFelicitousness

2 days ago

It could have been a very dangerous cat.

sasori7713

2 days ago

This is a new level of safety cucking. Imagine Art does this with their models, but this is new for an open source model.

RuneXX

2 days ago

It seems to need a fairly structured prompt; alternatively, use an LLM node to build one for you from a basic prompt input.
After that you never see this blocked message again.
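
As a rough illustration of what wrapping a basic prompt in a structure could look like, here is a minimal Python sketch. The key names (`high_level_description`, `style_description`, `compositional_deconstruction`, `elements`, `type`, `bbox`, `desc`) are copied from the example JSON prompt posted further down in this thread. The helper name `wrap_prompt`, the default values, and the full-frame `bbox` are assumptions made for illustration, not Ideogram's documented schema.

```python
import json


def wrap_prompt(basic_prompt: str, medium: str = "Photograph.") -> str:
    """Wrap a short prompt in a JSON skeleton (hypothetical helper).

    Key names follow the example prompt posted in this thread; the
    default values and the 0-1000 bbox range are assumptions.
    """
    structured = {
        "high_level_description": basic_prompt,
        "style_description": {
            "medium": medium,
        },
        "compositional_deconstruction": {
            "elements": [
                {
                    "type": "obj",
                    # Assumed: a single element covering the whole frame.
                    "bbox": [0, 0, 1000, 1000],
                    "desc": basic_prompt,
                }
            ]
        },
    }
    return json.dumps(structured, indent=2)


print(wrap_prompt("a cat"))
```

An LLM node would normally fill these fields with much richer descriptions. Whether a skeleton this thin is enough to avoid the blocked message is not confirmed anywhere in this thread.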

Manni1000

2 days ago

They seem to be aware of the issue, but I would say it's not just a higher false-positive rate: it happens nearly 99% of the time when the prompt is short or not JSON.

Manni1000

2 days ago

I still get a partial warning even with perfect JSON. I think it just does not like shorter prompts.

Manni1000

2 days ago

I did not prompt for this text.

RuneXX

2 days ago

Just use their suggested prompt structure, and you can prompt for pretty much anything without seeing any block.

https://huggingface.co/ideogram-ai/ideogram-4-fp8/discussions/5#6a208948d2802761c1c3a188

Manni1000

2 days ago

I am using that already.

Andyx1976

1 day ago

This is interesting. Exploring the frontier indeed. Is this coming from the prompt enhancer? The CLIP? The model itself?
As for the model, I hear a lot about text. How good is it with the image stuff, when it occasionally works?

Manni1000

1 day ago

It's baked into the model; it's not part of the prompt enhancer. Either RL, or they just trained on this image paired with NSFW prompts.

RuneXX

1 day ago

•

edited 1 day ago

It's the prompt structure.
With the correct JSON prompt I never see the filter, even for somewhat raunchy or sexy images (which Ideogram even has on their own website). It's not a model for more hardcore NSFW, of course, but the filter seems to trigger on unrelated prompts that have nothing to do with NSFW, apparently because of the prompt structure.

But apparently they are working on a fix for that. And if you use an LLM or another way to properly structure the prompt, you won't see the safety filter.
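
For anyone unsure whether their prompt is "properly structured", a quick shape check before sending can help. The following is a minimal Python sketch. The expected key names are an assumption taken from the example JSON prompt shared in this thread, not from official documentation, and the function name `check_prompt` is hypothetical. It only checks the JSON shape and does not reproduce whatever the model itself reacts to.

```python
import json

# Top-level keys seen in the example prompt shared in this thread
# (assumed to be the expected structure; not an official schema).
EXPECTED_KEYS = (
    "high_level_description",
    "style_description",
    "compositional_deconstruction",
)


def check_prompt(prompt: str) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    try:
        data = json.loads(prompt)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]
    return [f"missing key: {key}" for key in EXPECTED_KEYS if key not in data]


print(check_prompt("a cat"))  # ['not valid JSON']
```

A plain string like "a cat" fails at the first step, which matches the kind of prompt people in this thread report as blocked.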

dummy9996

1 day ago

It would be better without such (excessive) "sa-fe-ty"; it will inspire other companies to do the same thing. I am glad I haven't downloaded this hell, I think I respect myself :>
The prompt format isn't really the problem in this case. Just remember that SD3 forced us to use natural language instead of tags; it reminds me of that.
The fact that the devs thought their model was "open source" is +1 point that they could not apply enough care.

RuneXX

1 day ago

•

edited 1 day ago

But the prompt format absolutely is the problem... It doesn't fail on NSFW, but on anything like "woman in the grass", cat, car. A failed prompt for a car obviously has nothing to do with safety.
With the correct prompt format, it never fails. I don't see this "safety filter" anymore at all.

https://huggingface.co/ideogram-ai/ideogram-4-fp8/discussions/5#6a2172dbbe3b28898d637750

> I am glad I haven't downloaded this hell

That's all up to you of course, but it's probably one of the best open source image models currently.
I've been creating some amazing posters, infographics, text and more that I am not able to do easily with other models.

ComUser

1 day ago

•

edited 1 day ago

> It's the prompt structure.
> With the correct JSON prompt I never see the filter, even for somewhat raunchy or sexy images (which Ideogram even has on their own website). It's not a model for more hardcore NSFW, of course, but the filter seems to trigger on unrelated prompts that have nothing to do with NSFW, apparently because of the prompt structure.
>
> But apparently they are working on a fix for that. And if you use an LLM or another way to properly structure the prompt, you won't see the safety filter.

Ok ideogram lawyer 🙂

roverdude

1 day ago

> But the prompt format absolutely is the problem... It doesn't fail on NSFW, but on anything like "woman in the grass", cat, car. A failed prompt for a car obviously has nothing to do with safety.
> With the correct prompt format, it never fails. I don't see this "safety filter" anymore at all.
>
> https://huggingface.co/ideogram-ai/ideogram-4-fp8/discussions/5#6a2172dbbe3b28898d637750
>
> > I am glad I haven't downloaded this hell
>
> That's all up to you of course, but it's probably one of the best open source image models currently.
> I've been creating some amazing posters, infographics, text and more that I am not able to do easily with other models.

People are trying to get something worked up so they can get busy with their lives. I think you should just ignore them.

Andyx1976

about 22 hours ago

Well, if it's true that it's about the prompt format, then it's true.
But I don't like the very fact that you need an LLM to write the prompt. That's not the case for all the models people say this about; most are perfectly fine with simple prompts. Some models even feel kind of creatively suffocated if you spell out every detail for them. But this one seems to strictly require it.

Which brings me back to the question of whether it's worth the trouble (even if it's not censored). There's also the irritating fact that this kind of "safety" is hard-wired in at all. (I love this nice Orwellianism, "safety" for censorship, inspired almost exclusively by American pathological prudery.)

RuneXX

about 21 hours ago

•

edited about 21 hours ago

They seem to be working on it.

Personally I think it's worth it to use an LLM for the JSON prompt until there is a fix.
The model is extremely good at text, infographics, composition etc.

But yeah, I understand the hassle. I have a node in Comfy that does it all automagically on the fly. But if you have to go to an LLM website for each prompt, it's for sure a bit of a hassle.

Maybe they will update the models with a fix before long; at least some posts on Twitter (or X) suggested they would.

It's a bit of a bummer for sure, since some people are reacting negatively.
Personally, it wasn't on my bingo card that Ideogram would go open source, and I'm grateful they did, despite the prompt challenges.

CalamitousFelicitousness

about 18 hours ago

Open weights, not open source. Calling it open source devalues the work of those labs that actually do open-source their models. Can't wait for the trend of safety filter garbage baked into the weights to take off.

> Which brings me back to the question of whether it's worth the trouble (even if it's not censored). There's also the irritating fact that this kind of "safety" is hard-wired in at all. (I love this nice Orwellianism, "safety" for censorship, inspired almost exclusively by American pathological prudery.)

The term Orwellianism is actually quite apt, since the model can get tripped up on a prompt like the one below, which I assume is due to the word "intimate". Time to brush up on your Newspeak, comrade.

{
  "high_level_description": "A close-up portrait photograph of an elderly potter in her sunlit studio, clay on her hands, looking just off-camera with a faint smile, shot on Kodak Portra 400 with shallow depth of field.",
  "style_description": {
    "aesthetics": "Warm, intimate, documentary portrait.",
    "lighting": "Soft late-afternoon window light from the left, gentle falloff into shadow.",
    "photo": "85mm, f/2, Kodak Portra 400, shallow depth of field, natural skin texture.",
    "medium": "Photograph.",
    "color_palette": ["#6B4A2F", "#D8C3A5", "#A7553C"]
  },
  "compositional_deconstruction": {
    "elements": [
      { "type": "obj", "bbox": [80, 220, 760, 780], "desc": "Woman in her seventies filling most of the frame, face on the upper third, three-quarter view turned slightly left, soft smile, fine wrinkles and visible pores, silver hair loosely tied back, clay-streaked linen apron over a charcoal shirt, eyes catching the window light." },
      { "type": "obj", "bbox": [620, 300, 920, 700], "desc": "Her hands in the lower foreground, fingers dusted with grey wet clay, resting near a half-formed bowl on a potter's wheel, slightly soft in focus." }
    ]
  }