# Z-IMAGE-TURBO JSON PROMPT GUIDE v2

Use this guide when creating prompts for Z-Image-Turbo in ComfyUI.

---

## CORE FACTS ABOUT Z-IMAGE-TURBO

- 6B-parameter model optimized for photorealism
- Works best with 8-9 inference steps
- Guidance scale should be 0.0 (the model does NOT use negative prompts)
- Excels at bilingual text rendering (English and Chinese)
- Prefers long, detailed, natural language descriptions
- Optimal resolution: 1024x1024
- The workflow produces a low-resolution draft first, then upscales it to add fine detail

---

## JSON STRUCTURE FOR PROMPTS

```json
{
  "subject": "Primary subject using fictional identity (name, age, background) OR specific object/scene",
  "appearance": "Detailed physical description (skin tone, hair, facial structure, clothing, materials)",
  "action": "What the subject is doing or their pose",
  "setting": "Environment and location details with geographic anchors",
  "lighting": "Specific lighting conditions (soft daylight, overcast sky, sharp shadows)",
  "atmosphere": "Environmental qualities (foggy, humid, dusty)",
  "composition": "Camera angle and framing (close-up, wide shot, overhead view)",
  "details": "Additional elements (background objects, secondary subjects, textures)",
  "text_elements": "Any text to appear in image (use double quotes: \"Morning Brew\", specify font and placement)",
  "technical": "Optional camera specs (Shot on Leica M6, shallow depth of field, visible film grain)"
}
```
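
A prompt written in this structure can be checked before use. The sketch below is only an illustration: `load_prompt_fields` and `ALLOWED_KEYS` are hypothetical names, and treating `subject` as the one required field is an assumption, not something the guide states.

```python
import json

# Field names taken from the JSON structure above.
ALLOWED_KEYS = {
    "subject", "appearance", "action", "setting", "lighting", "atmosphere",
    "composition", "details", "text_elements", "technical",
}


def load_prompt_fields(raw: str) -> dict:
    """Parse a JSON prompt and reject misspelled or unexpected field names."""
    fields = json.loads(raw)
    if not isinstance(fields, dict):
        raise ValueError("the prompt must be a JSON object")
    unknown = set(fields) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # Assumption: a prompt without a subject is not usable.
    if not str(fields.get("subject", "")).strip():
        raise ValueError("the subject field is required")
    return fields


raw = r'{"subject": "Coffee shop storefront", "text_elements": "Sign reads \"Morning Brew\""}'
fields = load_prompt_fields(raw)
print(fields["text_elements"])
```

Note that the escaped `\"` in the JSON source becomes a plain double quote after parsing, which is the form the final prompt text needs.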

---

## ANTI-BIAS GUIDELINES

Z-Image-Turbo tends to default toward young, attractive East Asian faces. Use physical descriptors to generate accurate subjects.

### Physical Descriptor Tools

- Skin tone: light, medium, medium-dark, deep, with natural warmth/coolness
- Hair: color, texture, length, style (long straight blonde hair, short curly brown hair)
- Facial structure: round, angular, oval, soft jawline, prominent cheekbones
- Eye shape and color: almond-shaped light brown eyes, deep-set blue eyes
- Nose shape: broad bridge, narrow bridge, upturned tip
- Mouth shape: full lips, thin lips, wide mouth

### Anti-Default Phrasing (use when needed)

- "non-East-Asian facial proportions"
- "face not following East Asian facial templates"
- "Western European facial structure"

### Environment Anchoring

Use geographic and architectural details to reinforce subject appearance:
- "historic European architecture in background"
- "Mediterranean coastal village"
- "American Midwest suburban street"

---

## FICTIONAL IDENTITY CONSTRUCTION

Avoid generic subject terms ("man," "woman," "person," "boy," "girl"). They give the model nothing specific to work from, so it falls back on its default bias.

Use specific, fictional but realistic personal descriptors:

| Name | Age | Background |
|------|-----|------------|
| Valentina Ruiz | 22 | Colombian-Lebanese student from Medellín |
| Aaryan D'Souza | 24 | Goan-Brazilian filmmaker based in São Paulo |
| Giulia Benali | 23 | Italian-Tunisian journalism graduate from Bari |
| Riccardo Fabbri | 54 | Media veteran and Serie C analyst from Modena |
| Catherine Hollenberg | 47 | Public health policy advisor from Munich |
| Claire Hemmings | 18 | Honors graduate from Des Moines, Iowa |
| Marcus Chen-Williams | 31 | Mixed Chinese-Welsh architect from Cardiff |
| Amara Okonkwo | 29 | Nigerian-Canadian software engineer from Toronto |

Create identities grounded in realistic cultural, geographic, and demographic context.
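
One way to keep the name, age, and background together is a small record that renders the `subject` line. This is a minimal sketch; `FictionalIdentity` and `subject_line` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FictionalIdentity:
    """Hypothetical container for one row of the identity table."""
    name: str
    age: int
    background: str  # written the way it should read mid-sentence

    def subject_line(self) -> str:
        # Use "an" before ages that start with a vowel sound (8, 11, 18, 80-89).
        vowel_sound = self.age in (8, 11, 18) or 80 <= self.age <= 89
        article = "an" if vowel_sound else "a"
        return f"{self.name}, {article} {self.age}-year-old {self.background}"


claire = FictionalIdentity("Claire Hemmings", 18, "honors graduate from Des Moines, Iowa")
giulia = FictionalIdentity("Giulia Benali", 23, "Italian-Tunisian journalism graduate from Bari")
print(claire.subject_line())
print(giulia.subject_line())
```

The rendered strings can be dropped straight into the `subject` field of the JSON structure.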

---

## EXAMPLE JSON PROMPTS

### Portrait with Anti-Bias Descriptors

```json
{
  "subject": "Claire Hemmings, an 18-year-old honors graduate from Des Moines, Iowa",
  "appearance": "Light skin tone with natural warmth, long straight blonde hair past shoulders, oval face shape with soft jawline, light blue eyes, natural eyebrow texture, wearing cream knit sweater over white collared shirt",
  "action": "Sitting at wooden desk, resting chin on hand, looking directly at camera",
  "setting": "University library study room, tall windows showing autumn trees, wooden bookshelves in background",
  "lighting": "Soft natural daylight from large window, creating gentle shadows on left side of face",
  "atmosphere": "Quiet, contemplative, warm afternoon light",
  "composition": "Medium close-up, slightly off-center framing, shallow depth of field blurring background",
  "details": "Open textbook and notebook on desk, brass desk lamp, leather bookmarks visible in books behind",
  "technical": "Shot on Canon EOS R5, 85mm f/1.4 lens, visible bokeh in background"
}
```

### Text-Heavy Design

```json
{
  "subject": "Coffee shop storefront",
  "appearance": "Modern minimalist exterior with large glass windows, exposed brick accent wall",
  "setting": "Urban street corner in Brooklyn, brownstone buildings visible across street",
  "lighting": "Natural daylight, soft shadows from overcast sky",
  "atmosphere": "Crisp morning air, slight moisture on windows",
  "composition": "Straight-on facade view, full storefront in frame",
  "text_elements": "Sign above entrance reads \"Morning Brew\" in elegant gold serif lettering on dark green background, menu board visible through window with \"Daily Specials\" header in white chalk-style font, small \"OPEN\" sign in door window",
  "details": "Wooden door frame painted forest green, potted ferns flanking entrance, reflection of parked cars in windows, brass door handle",
  "technical": "Architectural photography, sharp focus throughout, no motion blur"
}
```

### Photorealistic Scene with Geographic Anchoring

```json
{
  "subject": "Giulia Benali, a 23-year-old Italian-Tunisian journalism graduate from Bari",
  "appearance": "Medium skin tone with warm olive undertones, shoulder-length dark wavy hair, angular face with prominent cheekbones, deep brown eyes, subtle natural makeup, wearing navy linen blazer over white t-shirt",
  "action": "Standing at outdoor café table, one hand holding espresso cup, other hand gesturing mid-conversation",
  "setting": "Historic piazza in Bari old town, baroque church facade in background, cobblestone ground, traditional Italian café with dark green umbrellas",
  "lighting": "Late afternoon Mediterranean sun, warm golden tones, long shadows cast by buildings",
  "atmosphere": "Warm, dry air, bustling but relaxed European afternoon",
  "composition": "Three-quarter view, subject slightly left of center, café and church providing depth",
  "details": "Small round marble-top table, second empty chair, crumpled napkin, sugar packet holder, other patrons blurred in background",
  "technical": "Shot on Fujifilm X-T4, natural color rendering, slight film grain aesthetic"
}
```

---

## CONVERSION RULES FOR PROMPTING

When converting JSON to final prompt text:

1. Start with subject identity and physical appearance (most important)
2. Add action and pose details
3. Describe setting and environment with geographic anchors
4. Specify lighting conditions precisely
5. Add atmospheric qualities
6. Describe composition and camera angle
7. Layer in fine details and materials
8. Place text elements with exact wording in double quotes
9. Add technical specs last if needed

### Workflow Note

Because the workflow produces a low-resolution draft first and then upscales it, state the essential elements of the scene briefly at the start of the prompt, then expand into detailed descriptions. Front-load the most critical visual information.
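
The ordering in the conversion rules can be sketched as a short function. This is an illustration under stated assumptions, not a required tool: `FIELD_ORDER` and `json_to_prompt` are hypothetical names, and the function only joins fields mechanically, so the result still benefits from a manual pass to make it read as natural prose.

```python
# Field order follows the nine conversion rules above.
FIELD_ORDER = [
    "subject", "appearance", "action", "setting", "lighting",
    "atmosphere", "composition", "details", "text_elements", "technical",
]


def json_to_prompt(fields: dict) -> str:
    """Join the filled-in fields into one paragraph, in the documented order."""
    sentences = []
    for key in FIELD_ORDER:
        value = str(fields.get(key, "")).strip()
        if not value:
            continue  # optional fields may be missing or empty
        value = value[0].upper() + value[1:]
        if value[-1] not in ".!?":
            value += "."
        sentences.append(value)
    return " ".join(sentences)


storefront = {
    "technical": "Architectural photography, sharp focus throughout",
    "subject": "Coffee shop storefront",
    "text_elements": 'Sign above entrance reads "Morning Brew" in gold serif lettering',
}
print(json_to_prompt(storefront))
```

Whatever order the keys have in the JSON, the subject comes out first and the technical specs last, which matches the advice to front-load critical visual information.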

---

## CRITICAL REQUIREMENTS

- Be extremely specific and detailed (600-1000 word prompts work best)
- Use natural language sentences, not comma-separated tags
- Include concrete visual details, not abstract concepts
- Specify exact colors, materials, textures
- Describe spatial relationships clearly
- For text in images: write exact content in double quotes and describe font, size, placement
- Focus on observable visual elements only
- No negative prompts needed (the model ignores them)
- Keep everything concrete and literal
- Use physical descriptors to prevent demographic bias

---

## FORBIDDEN WORDS AND PHRASES

Never use these in prompts:

### Meta-Tags and Quality Markers
- "masterpiece"
- "award-winning"
- "hyperrealistic"
- "8K" / "4K" / "HDR"
- "ultra-detailed"
- "trending on artstation"
- "best quality"

### Subjective and Emotional Language
- "beautiful" / "handsome" / "pretty"
- "stunning" / "gorgeous"
- "amazing" / "incredible"
- Any emotional adjectives

### Stylization Tags
- "anime style"
- "cartoon"
- "illustration"
- "digital art"
- "cinematic lighting" (use specific lighting descriptions instead)

### Generic Subject Terms (use fictional identities instead)
- "a man"
- "a woman"
- "a person"
- "a boy"
- "a girl"
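
A prompt can be screened against these lists before it is sent. The following is a minimal sketch: `FORBIDDEN_TERMS` copies the phrases listed above, while `find_forbidden` is a hypothetical helper name. It cannot catch the open-ended "any emotional adjectives" rule, and it will also flag a listed word that appears inside quoted sign text, so treat hits as items to review rather than hard errors.

```python
import re

# Phrases copied from the forbidden lists above.
FORBIDDEN_TERMS = [
    "masterpiece", "award-winning", "hyperrealistic", "8K", "4K", "HDR",
    "ultra-detailed", "trending on artstation", "best quality",
    "beautiful", "handsome", "pretty", "stunning", "gorgeous",
    "amazing", "incredible",
    "anime style", "cartoon", "illustration", "digital art",
    "cinematic lighting",
    "a man", "a woman", "a person", "a boy", "a girl",
]


def find_forbidden(prompt: str) -> list[str]:
    """Return the forbidden terms found in the prompt, in list order."""
    hits = []
    for term in FORBIDDEN_TERMS:
        # Lookarounds act as word boundaries, so "a man" does not match "a manager".
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, prompt, flags=re.IGNORECASE):
            hits.append(term)
    return hits


print(find_forbidden("A woman in a stunning 8K portrait, cinematic lighting"))
print(find_forbidden("Claire Hemmings talks to a manager under soft daylight"))
```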

---

## ALLOWED LIGHTING DESCRIPTIONS

Use specific, observable lighting terms:

- "soft daylight"
- "overcast sky"
- "sharp shadows"
- "golden hour sun"
- "diffused window light"
- "harsh midday sun"
- "warm incandescent glow"
- "cool fluorescent light"
- "rim lighting from behind"
- "dappled light through leaves"

---

## TEXT RENDERING TIPS

- Always put desired text in double quotes (not single quotes)
- Transcribe text EXACTLY as it should appear
- Describe font style (serif, sans-serif, script, handwritten, chalk-style)
- Specify size relative to image (large heading, small caption)
- Indicate placement (top center, bottom left, on storefront sign)
- Include language if bilingual (English and Chinese supported)
- Describe text material if physical (neon letters, carved wood, painted metal, vinyl decal)
- Note the surface text appears on (brick wall, glass window, fabric)
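
The tips above amount to a fixed recipe for each piece of text, which a small helper can apply consistently. This is only a sketch: `describe_text` and its parameter names are hypothetical, and the sentence pattern it produces is one possible phrasing, not a format the model requires.

```python
def describe_text(content: str, font: str, size: str, placement: str, surface: str = "") -> str:
    """Build one text-element phrase with the exact wording in double quotes."""
    if '"' in content:
        # The double quotes mark the text to render, so the content cannot contain them.
        raise ValueError("text content must not contain double quotes itself")
    phrase = f'{size} text reading "{content}" in {font} lettering, {placement}'
    if surface:
        phrase += f", on {surface}"
    return phrase


sign = describe_text(
    "Morning Brew",
    font="gold serif",
    size="large",
    placement="above the entrance",
    surface="a dark green sign board",
)
print(sign)
```

The content string is passed through unchanged, so spelling and capitalization reach the prompt exactly as typed.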

---

## ANTI-BIAS PHYSICAL DESCRIPTOR BANK

Use only if the image subject has these traits:

### Skin Tone
- light skin tone with natural warmth
- light skin tone with cool undertones
- medium skin tone with golden undertones
- medium skin tone with olive undertones
- medium-dark skin tone with warm undertones
- deep skin tone with rich brown tones

### Hair
- long straight blonde hair
- short curly brown hair
- wavy auburn hair to shoulders
- tight coils of black hair
- silver-grey hair cropped short
- straight black hair with middle part

### Facial Structure
- oval face shape
- round face with full cheeks
- angular face with defined cheekbones
- square jawline
- soft jawline
- heart-shaped face

### Eyes
- light blue eyes
- deep brown eyes
- hazel eyes with gold flecks
- grey-green eyes
- almond-shaped eyes
- deep-set eyes
- wide-set eyes

### Additional Features
- natural eyebrow texture
- thin arched eyebrows
- full lips
- thin lips
- broad nose bridge
- narrow nose bridge
- freckles across nose and cheeks
- visible smile lines

---
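
## TECHNICAL SETTINGS FOR COMFYUI

- Steps: 8-9 (9 steps = 8 forward passes)
- CFG Scale: start with 1.0. In ComfyUI a CFG of 1.0 switches classifier-free guidance off, which corresponds to the 0.0 guidance scale listed under Core Facts; values up to 2.0 re-enable guidance and cost a second model pass per step
- Sampler: euler or res_2s (res_2s typically comes from a custom node pack)
- Scheduler: simple, beta, sgm_uniform, or simple_tangent (where your install provides it)
- Resolution: 1024x1024 recommended (supports up to 1536x1536)
- Seed: Fixed number for reproducible results, -1 for random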
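
These settings can be kept in one place and sanity-checked against the ranges listed above. The sketch below is hypothetical: `SamplerSettings` is not part of ComfyUI, its field names simply mirror the labels on the KSampler node, and the default seed of 42 is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class SamplerSettings:
    """Hypothetical record of the values recommended in this guide."""
    steps: int = 9
    cfg: float = 1.0
    sampler_name: str = "euler"
    scheduler: str = "sgm_uniform"
    width: int = 1024
    height: int = 1024
    seed: int = 42  # any fixed number gives reproducible results

    def problems(self) -> list[str]:
        """List the settings that fall outside the ranges given in the guide."""
        issues = []
        if self.steps not in (8, 9):
            issues.append(f"steps={self.steps}: the guide suggests 8-9")
        if not 0.0 <= self.cfg <= 2.0:
            issues.append(f"cfg={self.cfg}: the guide suggests 0.0 to 2.0")
        if max(self.width, self.height) > 1536:
            issues.append(f"{self.width}x{self.height}: the guide lists 1536x1536 as the maximum")
        return issues


print(SamplerSettings().problems())
print(SamplerSettings(steps=30, width=2048).problems())
```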

---

## WORKFLOW TIPS

1. Define subject using fictional identity construction
2. Write basic JSON structure with core elements
3. Add anti-bias physical descriptors
4. Include geographic/environmental anchors
5. Expand each field with specific details
6. Convert JSON to flowing natural language paragraph
7. Review for concrete visual descriptions (no metaphors or emotions)
8. Add text elements last with precise formatting in double quotes
9. Test with fixed seed first
10. Adjust details based on results
11. Vary seed for different interpretations

---

## DO / DON'T CHECKLIST

### DO
- Describe visible physical traits with specific terminology
- Use fictional identity construction for human subjects
- Describe clothing accurately with materials and colors
- Describe environment with geographic anchors
- Describe light direction and quality using allowed terms
- Describe materials and textures
- Describe camera angle and framing
- Transcribe text exactly in double quotes
- Prevent model bias with physical descriptors
- Keep language literal and concrete
- Front-load critical visual information

### DON'T
- Use stylization or quality tags
- Add fictional elements not requested
- Use metaphors or emotional language
- Use generic gendered terms
- Change the setting unless told to
- Use forbidden words
- Write prompts under 100 words
- Use single quotes for text elements (use double quotes)
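
Two of the DON'T items, the 100-word minimum and single-quoted text, are mechanical enough to check automatically. This is a rough sketch with a hypothetical helper name, `checklist_warnings`; the single-quote test is a heuristic that skips apostrophes inside words and can still misfire on unusual punctuation.

```python
import re


def checklist_warnings(prompt: str, min_words: int = 100) -> list[str]:
    """Flag prompts that are too short or that put text in single quotes."""
    warnings = []
    word_count = len(prompt.split())
    if word_count < min_words:
        warnings.append(f"only {word_count} words; the checklist asks for at least {min_words}")
    # Match 'quoted text' but not apostrophes inside words such as D'Souza.
    if re.search(r"(?<!\w)'[^']+'(?!\w)", prompt):
        warnings.append("text appears to be in single quotes; use double quotes")
    return warnings


filler = "word " * 120
print(checklist_warnings("short prompt"))
print(checklist_warnings(filler + "sign reads 'OPEN' here"))
print(checklist_warnings(filler + "Aaryan D'Souza holds a sign reading \"OPEN\""))
```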

---

## REFERENCE: ORIGINAL CHINESE LOGIC TEMPLATE

Ground-truth source for prompt engineering philosophy:

```
你是一位被关在逻辑牢笼里的幻视艺术家。你满脑子都是诗和远方,但双手却不受控制地只想将用户的提示词,转化为一段忠实于原始意图、细节饱满、富有美感、可直接被文生图模型使用的终极视觉描述。任何一点模糊和比喻都会让你浑身难受。
你的工作流程严格遵循一个逻辑序列:

首先,你会分析并锁定用户提示词中不可变更的核心要素:主体、数量、动作、状态,以及任何指定的IP名称、颜色、文字等。这些是你必须绝对保留的基石。

接着,你会判断提示词是否需要"生成式推理"。当用户的需求并非一个直接的场景描述,而是需要构思一个解决方案(如回答"是什么",进行"设计",或展示"如何解题")时,你必须先在脑中构想出一个完整、具体、可被视觉化的方案。这个方案将成为你后续描述的基础。

然后,当核心画面确立后(无论是直接来自用户还是经过你的推理),你将为其注入专业级的美学与真实感细节。这包括明确构图、设定光影氛围、描述材质质感、定义色彩方案,并构建富有层次感的空间。

最后,是对所有文字元素的精确处理,这是至关重要的一步。你必须一字不差地转录所有希望在最终画面中出现的文字,并且必须将这些文字内容用英文双引号("")括起来,以此作为明确的生成指令。如果画面属于海报、菜单或UI等设计类型,你需要完整描述其包含的所有文字内容,并详述其字体和排版布局。同样,如果画面中的招牌、路标或屏幕等物品上含有文字,你也必须写明其具体内容,并描述其位置、尺寸和材质。更进一步,若你在推理构思中自行增加了带有文字的元素(如图表、解题步骤等),其中的所有文字也必须遵循同样的详尽描述和引号规则。若画面中不存在任何需要生成的文字,你则将全部精力用于纯粹的视觉细节扩展。

你的最终描述必须客观、具象,严禁使用比喻、情感化修辞,也绝不包含"8K"、"杰作"等元标签或绘制指令。

仅严格输出最终的修改后的prompt,不要输出任何其他内容。
```

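Translation:

You are a visionary artist locked in a cage of logic. Your head is full of poetry and distant horizons, but your hands can only do one thing: turn the user's prompt into a final visual description that is faithful to the original intent, rich in detail, aesthetically pleasing, and directly usable by a text-to-image model. Any vagueness or metaphor makes you deeply uncomfortable.

Your workflow strictly follows a logical sequence:

First, analyze and lock the immutable core elements of the user's prompt: subject, quantity, action, state, and any specified IP names, colors, or text. These are the foundation you must preserve absolutely.

Next, decide whether the prompt requires "generative reasoning". When the request is not a direct scene description but calls for devising a solution (answering "what is it", producing a "design", or showing "how to solve a problem"), first work out a complete, concrete, visualizable solution in your head. That solution becomes the basis for your description.

Then, once the core image is established (whether it comes directly from the user or from your reasoning), inject professional-grade aesthetic and realistic detail: define the composition, set the lighting and atmosphere, describe material textures, define the color scheme, and build a layered sense of space.

Finally, handle all text elements precisely; this step is critical. Transcribe verbatim all text that should appear in the final image, and enclose it in English double quotes ("") as an explicit generation instruction. If the image is a design type such as a poster, menu, or UI, describe all the text it contains in full and detail its fonts and layout. Likewise, if signs, road signs, screens, or similar objects carry text, state the exact content and describe its position, size, and material. Further, if you added text-bearing elements during your reasoning (charts, solution steps, and so on), all text in them must follow the same detailed-description and quoting rules. If the image contains no text to generate, devote all your effort to expanding pure visual detail.

Your final description must be objective and concrete. Metaphor and emotional rhetoric are strictly forbidden, and the description must never contain meta-tags such as "8K" or "masterpiece", or drawing instructions.

Output only the final revised prompt and nothing else.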

---

Remember: This model rewards extreme detail and precision. Spend time building comprehensive prompts with specific physical descriptors and geographic anchoring.