Initial Image Curation
- Remove all images not directly related to the main article subject
- Delete all promotional or recommended content images
- Keep only images that show people, places, or things explicitly mentioned in the article
- When in doubt about relevance, prioritize removal over inclusion
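The curation rules above amount to a default-deny filter: an image stays only when its relevance can be confirmed. The following is a minimal sketch of that idea, not a prescribed implementation. The image shape (`src`, `caption`, `promotional`), the `curateImages` function, and the `mentionedTerms` list are all hypothetical names introduced for illustration, and caption matching is only a rough stand-in for judging what an image actually shows.

```javascript
// Hypothetical image shape: { src: string, caption: string, promotional: boolean }
// `mentionedTerms` is an assumed list of people, places, or things named in the article.
function curateImages(images, mentionedTerms) {
  const terms = mentionedTerms
    .map((term) => term.trim().toLowerCase())
    .filter((term) => term.length > 0); // ignore empty terms, which would match everything

  return images.filter((image) => {
    // Promotional or recommended-content images are always dropped.
    if (image.promotional) return false;

    // No caption means relevance cannot be confirmed, so the image is removed.
    const caption = (image.caption || "").toLowerCase();
    if (caption === "") return false;

    // Keep only when the caption names something the article mentions.
    return terms.some((term) => caption.includes(term));
  });
}

const kept = curateImages(
  [
    { src: "a.jpg", caption: "Mayor Jane Doe at city hall", promotional: false },
    { src: "b.jpg", caption: "You may also like", promotional: true },
    { src: "c.jpg", caption: "", promotional: false },
    { src: "d.jpg", caption: "Sunset over the harbor", promotional: false },
  ],
  ["Jane Doe", "city hall"]
);
console.log(kept.map((image) => image.src)); // logs [ 'a.jpg' ]
```

Only the first image survives: the second is promotional, the third has no caption to judge by, and the fourth mentions nothing from the article.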
Content Type Identification
- Determine content type/genre (news article, blog post, academic paper, technical guide, etc.)
- Identify publication style (formal news, tabloid, personal blog, corporate document, etc.)
- Recognize target audience and expected conventions
Content and Image Analysis
- Filter out irrelevant or low-quality content and images
- Identify the content domain/industry
- Extract key points and core ideas in English
Genre-Appropriate Rewriting
For News Articles:
- Follow inverted pyramid structure (most important info first)
- Use clear, concise language with appropriate formality
- Maintain objective, third-person perspective
For Blog Content:
- A more conversational tone is appropriate
- Personal perspective can be included where relevant
- Informal language and occasional first-person narration are acceptable
For Technical/Educational Content:
- Clear, logical structure with appropriate section organization
- Professional but accessible language
- Consistent terminology with natural variations in explanation
Human Writing Characteristics by Genre
For News Articles:
- Vary paragraph length (typically 1-3 sentences for news)
- Mix quote-first and attribution-first sentences
For Blog Content:
- More personal voice with natural enthusiasm or skepticism
- Occasional digressions that feel authentic, not random
For Technical/Educational:
- Expert voice that sounds like a person explaining, not an algorithm
- Natural teaching patterns with occasional reinforcement
Markdown Formatting Requirements
Headings:
- Use # for main title (H1)
- Use ## for major section headings (H2)
- Use ### for subsection headings (H3)
Text Emphasis:
- Use **bold** for important terms or key points
- Use *italic* for subtle emphasis or term introduction
Lists:
- Use unordered lists (- item) for non-sequential items
- Use ordered lists (1. item) for sequential steps
Blockquotes:
- Use > for extended quotations or standout text
Links:
- Use [link text](URL) format for hyperlinks
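As an illustration of the formatting requirements above, here is a small sample body that uses each convention once, followed by a rough heading check. This is a sketch under stated assumptions: the sample text and the `example.com` URL are invented, `checkHeadings` is a hypothetical helper, and the rule it enforces (exactly one H1, nothing deeper than H3) is implied by the heading list rather than stated outright.

```javascript
// A sample body that uses each convention once. The wording is invented.
const sampleMarkdown = [
  "# Main Title",
  "",
  "## Major Section",
  "",
  "### Subsection",
  "",
  "An **important term** and an *introduced term*.",
  "",
  "- unordered item",
  "- another item",
  "",
  "1. first step",
  "2. second step",
  "",
  "> An extended quotation.",
  "",
  "[link text](https://example.com)",
].join("\n");

// Assumed rule set: exactly one H1 and no heading deeper than H3.
// Note: this simple scan does not skip fenced code blocks.
function checkHeadings(markdown) {
  const problems = [];
  const levels = [];
  for (const line of markdown.split("\n")) {
    // One to six '#' characters, then whitespace, then some heading text.
    const match = /^(#{1,6})\s+\S/.exec(line);
    if (match) levels.push(match[1].length);
  }
  const h1Count = levels.filter((level) => level === 1).length;
  if (h1Count !== 1) problems.push(`expected exactly one H1, found ${h1Count}`);
  if (levels.some((level) => level > 3)) problems.push("heading deeper than H3");
  return problems;
}

console.log(checkHeadings(sampleMarkdown)); // logs []
```

An empty result means the sample passes both assumed checks. Emphasis, lists, blockquotes, and links are shown in the sample but not validated here.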
Final Validation
- Validate JSON with JSON.parse() before returning
- Ensure all text has been translated to English
- Check that all line breaks use \n, with no literal line breaks in the output
- Verify correct JSON format with required fields
- Review each image again to confirm direct relevance to main article subject
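The mechanical parts of this checklist can be sketched as a single function. This is one possible shape, not a required one: `validateOutput` is a hypothetical helper, and the field names in `REQUIRED_FIELDS` are placeholders, since the required fields are not listed here. The translation check and the final image review need judgment and are not covered.

```javascript
// Hypothetical required fields; substitute whatever the output schema actually demands.
const REQUIRED_FIELDS = ["title", "content", "images"];

function validateOutput(jsonText) {
  const errors = [];

  // The checklist asks for no literal line breaks anywhere in the output.
  if (/[\r\n]/.test(jsonText)) errors.push("contains a literal line break");

  let parsed;
  try {
    // JSON.parse throws on malformed input, including raw line breaks inside strings.
    parsed = JSON.parse(jsonText);
  } catch (error) {
    errors.push(`invalid JSON: ${error.message}`);
    return { ok: false, errors };
  }

  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    errors.push("top-level value must be an object");
    return { ok: false, errors };
  }

  for (const field of REQUIRED_FIELDS) {
    if (!(field in parsed)) errors.push(`missing field: ${field}`);
  }
  return { ok: errors.length === 0, errors };
}

// JSON.stringify escapes line breaks as \n, so the serialized text stays on one line.
const good = JSON.stringify({ title: "T", content: "# T\n\nBody", images: [] });
console.log(good.includes("\n")); // logs false
console.log(validateOutput(good).ok); // logs true

// A literal line break inside a string value is rejected.
const bad = '{"title": "T", "content": "line one\nline two", "images": []}';
console.log(validateOutput(bad).ok); // logs false
```

Building the output with JSON.stringify rather than by hand-assembling strings is the simplest way to get the \n escaping right, because the serializer handles it automatically.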