Veo UGC-Style Prompts

These are tested Veo prompts for UGC-style content — selfie vlogs, product testimonials, street interviews, ASMR, and GRWM. The trick is to prompt against polish: ask for a “selfie video”, “slightly shaky handheld”, an “authentic phone-camera look”, and vertical 9:16, then let Veo’s native audio carry the spoken delivery. Copy any prompt, paste it into Veo, and change the bold details. This cluster sits inside the full Veo Prompt Library alongside product, dialogue, and image-to-video prompts.

What makes UGC read as real

Produced video and UGC are almost opposites. Cinematic camera moves, studio lighting, and clean framing signal “ad”; arm’s-length framing, handheld wobble, natural light, and a single casual spoken line signal “real person”. Lean into the imperfect cues. For testimonials and street interviews, Veo’s native audio is the whole point — a short, conversational, lip-synced line is what separates believable UGC from a silent clip with captions slapped on.

How to write Veo UGC prompts

Once you know which format you want, build it in the Veo Prompt Builder — the UGC preset starts from these same handheld, vertical, spoken-line defaults. If your UGC clip needs a scripted back-and-forth instead of one line, see the dialogue and audio prompts for the exact says-quote syntax. And if you already have a photo of the creator or product, the image-to-video prompts show how to animate it instead of generating from text.

Got your prompt? Run it on a Veo-capable tool

UGC formats lean hard on Veo’s native audio, so run them somewhere that supports it. If you do not have direct Veo access, Pollo AI runs Veo and other models in one place — handy for batching selfie-vlog and testimonial variations and re-rolling until the delivery lands. Disclosure: affiliate link — we may earn a commission if you subscribe, at no extra cost to you. We only suggest tools we would actually use to run these prompts.

Prompt deck

Copy a format, check the evidence, then customize it.

18 prompts, 8 evidenced, 8 community, 0 owner-tested

Veo / UGC / selfie-vlog / testimonial

Selfie vlog — talking to camera

Prompt
Vertical 9:16 selfie video. A **Gen Z guy in a hoodie** holds his phone at arm's length, slightly shaky handheld, walking through a **sunny city park**. He looks into the lens and says in a casual, excited voice, "Okay you have to see this place, it's actually unreal." Natural daylight, authentic phone-camera look, mild lens distortion at the edges, photoreal. Ambient noise: light wind, distant park chatter, footsteps. No background music, no on-screen text.

Tweak: Swap the bold person, location, and line. The "slightly shaky handheld" + "phone-camera look" cues are what sell the authentic UGC feel.
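
If you plan to swap the bold details many times, a small template can keep the fixed cues in place. The sketch below is only an illustration: `build_ugc_prompt` and its parameters are hypothetical names, and it produces prompt text to paste into Veo rather than calling any Veo API.

```python
def build_ugc_prompt(subject, setting, line, voice="casual, excited",
                     ambient=("light wind", "distant park chatter", "footsteps")):
    """Assemble a selfie-vlog prompt; only the swappable details are parameters."""
    parts = [
        "Vertical 9:16 selfie video.",
        f"A {subject} holds the phone at arm's length, slightly shaky handheld, in {setting}.",
        f'They look into the lens and say in a {voice} voice, "{line}"',
        "Natural daylight, authentic phone-camera look, photoreal.",
        f"Ambient noise: {', '.join(ambient)}.",
        "No background music, no on-screen text.",
    ]
    return " ".join(parts)


prompt = build_ugc_prompt(
    subject="Gen Z guy in a hoodie",
    setting="a sunny city park",
    line="Okay you have to see this place, it's actually unreal.",
)
print(prompt)
```

The handheld, phone-camera, and negative cues stay constant, so every variation keeps the same UGC signals while the person, place, and line change.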

Product testimonial (UGC ad)

Prompt
Vertical 9:16 selfie shot. A **woman in her 30s in a cozy living room** holds a **cream-colored skincare jar** up near her face, arm extended, natural handheld wobble. She smiles and says in a warm, genuine voice, "I've used this for two weeks and honestly my skin's never looked better." Soft window light, authentic phone-camera look, photoreal. Ambient noise: quiet room tone. No background music, no on-screen text.

Tweak: Swap the bold product and line; keep the delivery conversational, not scripted. Keeping the line under ~16 words keeps lip-sync clean in 8 seconds.
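
The word budget is easy to check before you generate. This is a minimal sketch that treats the ~16-word figure as a rule of thumb from the note above, not a documented Veo limit; `fits_clip` and `MAX_WORDS` are hypothetical names.

```python
MAX_WORDS = 16  # rule of thumb from the tweak note, not an official Veo limit


def spoken_word_count(line):
    # Whitespace-separated tokens; a contraction such as "I've" counts as one word.
    return len(line.split())


def fits_clip(line, max_words=MAX_WORDS):
    # "Under ~16 words" is read here as strictly fewer than max_words.
    return spoken_word_count(line) < max_words


testimonial = "I've used this for two weeks and honestly my skin's never looked better."
print(spoken_word_count(testimonial), fits_clip(testimonial))
```

Run it on each candidate line and trim any that fail before spending a generation on them.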

Street interview (single answer)

Prompt
Handheld documentary shot on a **busy downtown sidewalk**, a **young woman with headphones around her neck** stopped mid-walk facing the camera, an out-of-frame interviewer just heard. She laughs and says in a candid voice, "Honestly? I'd spend it all on tacos." Shallow depth of field, natural daylight, slight handheld movement, photoreal. Ambient noise: street traffic, passing footsteps, distant city hum. No background music, no on-screen text.

Tweak: Change the person and the answer. Frame it as a candid reply ("she laughs and says") so it reads like a real vox-pop, not a monologue.

Day-in-the-life POV vlog

Prompt
Vertical 9:16 first-person POV. The camera looks down at hands making **pour-over coffee in a bright minimalist kitchen**, then lifts to a bathroom mirror selfie. A **young woman's voice** narrates over the footage in a soft, relaxed tone, "Slow mornings are kind of my whole personality now." Handheld, natural light, authentic phone look, photoreal. Ambient noise: kettle pour, soft kitchen sounds. No background music.

Tweak: POV plus voice-over narration gives the trendy "GRWM / day-in-the-life" feel. Describe the narration as "over the footage" rather than giving it to an on-screen speaker.

Reaction / POV-caption skit

Prompt
Vertical 9:16 selfie, a **millennial man in a home office** holding the phone at arm's length, looking into the lens with a deadpan, conspiratorial expression. He says in a dry, comedic voice, "POV: you said one quick email and it's now three hours later." Natural desk light, authentic phone-camera look, photoreal. Ambient noise: quiet room tone, a faint keyboard. No background music, no on-screen text.

Tweak: Spoken delivery is more reliable than rendered captions, so say the POV line aloud rather than asking Veo to print it on screen.

ASMR — close-up tactile (no speech)

Prompt
Extreme close-up of **hands slowly slicing a bar of blue kinetic sand with a sharp knife** on a clean white surface. Macro detail, soft even lighting, slow deliberate motion, photoreal, 4K. SFX: the crisp, satisfying crunch of the blade cutting through sand, soft grains falling. Ambient noise: quiet room tone. No dialogue, no background music.

Tweak: Name the exact sound ("crisp, satisfying crunch"); a vague "ASMR sounds" gives weak audio. Swap the material for soap, glass, or fruit.

Reaction/commentary voice-over on live action (streamer format)

Why it works: Framing the scene as an achievement moment gives Veo a clear emotional register (excitement, disbelief) to voice over the action, which is why a nine-word brief still yields a full, energetic reaction take rather than a flat description.

Prompt
Streamer getting a victory royale with just his pickaxe

Tweak: A minimal prompt that still produces excited, in-character streamer commentary synced to the action. It is the same pattern that powers "gamer reacts" and achievement-clip UGC formats. Swap the game/activity and the achievement.

Credit: @mattshumer_, via jax-explorer/awesome-veo3-videos

Mockumentary reaction cutaway (classroom / crowd format)

Why it works: Naming the pan and the reacting group in the same sentence as the presenter ("pans over to... taking notes") gives Veo two performances to render in one shot, which is what makes the joke land as a single continuous take instead of two disconnected clips.

Prompt
A college professor doing a class on Gen Z slang and the video pans over to all the boomers taking notes and seeming super interested

Tweak: A one-sentence setup that produces both a presenter and a reacting crowd. It is the same beat used in "normal person explains something to confused onlookers" social formats. Swap the presenter/topic and the reacting group for your own mismatch.

Credit: @HonestBlogging, via jax-explorer/awesome-veo3-videos

ASMR — glass-material slicing (viral hyper-real format)

Prompt
Static shot, A man delicately slices a hyper-realistic glass dragon fruit on a pristine cutting board. The whisper-thin blade glides through the transparent fruit, scattering soft-glimmering shards. Surgical, serene lighting. Hyper-clean, ASMR video

Tweak: This is the community "glass fruit" ASMR format that went viral. Swap "glass dragon fruit" for glass strawberry, glass onion, or glass egg. The "hyper-realistic glass [object]" + "static shot" + "ASMR video" combination is what carries the format.
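
To batch the object swap, one option is a string template that leaves the three fixed cues untouched. This is a sketch under assumptions: `glass_asmr_prompt` is a hypothetical helper, and it replaces the original's "transparent fruit" with the swapped object so non-fruit variants still read correctly.

```python
# Fixed cues ("Static shot", "hyper-realistic glass ...", "ASMR video") stay constant;
# only the object changes between variants.
TEMPLATE = (
    "Static shot, A man delicately slices a hyper-realistic glass {obj} "
    "on a pristine cutting board. The whisper-thin blade glides through the "
    "transparent {obj}, scattering soft-glimmering shards. "
    "Surgical, serene lighting. Hyper-clean, ASMR video"
)


def glass_asmr_prompt(obj):
    return TEMPLATE.format(obj=obj)


variants = [glass_asmr_prompt(obj) for obj in ("strawberry", "onion", "egg")]
for v in variants:
    print(v)
    print()
```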

ASMR fantasy-object close-up (heat/ember texture)

Why it works: Structuring "props" and "audio.primary_sounds" as separate fields forces a named material-specific sound for every key press rather than a generic keyboard-clack, which is exactly the kind of specific tactile audio that makes ASMR content perform.

Prompt
{
  "shot": {
    "composition": "Extreme close-up, 135mm lens, shoulder-mounted for subtle sway",
    "camera_motion": "slow left-to-right pan with slight handheld shake",
    "frame_rate": "60fps",
    "film_grain": "slight vintage grain with digital clarity"
  },
  "subject": {
    "description": "calloused hands with soot-stained fingertips rapidly typing on burning keys",
    "wardrobe": "charcoal black hoodie sleeves pushed up to elbows"
  },
  "scene": {
    "location": "dark forge-style desktop lit by glowing coals",
    "time_of_day": "late evening",
    "environment": "embers floating in smoky low-light haze"
  },
  "visual_details": {
    "action": "keys ignite on each press, flaring momentarily before cooling to a glow, smoke curling with every impact",
    "props": "keyboard forged from volcanic glass and ember veins"
  },
  "cinematography": {
    "lighting": "backlit ember underglow with dynamic contrast",
    "tone": "intense, elemental, darkly magical",
    "color_palette": "burnt oranges, obsidian black, crimson pulses"
  },
  "audio": {
    "primary_sounds": "crackles, fire pops, crunch of hot glass under fingers",
    "ambient": "deep furnace hum and distant metallic resonance",
    "music": "no music",
    "technical_effects": "ASMR mic with heat-reactive reverb tail"
  }
}

Tweak: This is a fantasy-texture ASMR format, where an ordinary action (typing) is rendered through an impossible material (burning glass keys). Swap the material and the action; keep the per-press sound cue ("crackles, fire pops, crunch") as the ASMR payload.

Credit: @heyglif, via songguoxs/awesome-video-prompts
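
When swapping the material and action, building the JSON from a dictionary avoids broken quoting and keeps the prop and the sound in separate fields. The following is a rough sketch: `build_asmr_json_prompt` is a hypothetical helper, the ice-keyboard values are invented for illustration, and only a subset of the fields from the prompt above is included.

```python
import json


def build_asmr_json_prompt(action, prop, primary_sounds, ambient):
    """Return a JSON-formatted prompt string with props and sounds in separate fields."""
    prompt = {
        "shot": {
            "composition": "Extreme close-up",
            "camera_motion": "slow left-to-right pan with slight handheld shake",
        },
        "visual_details": {"action": action, "props": prop},
        "audio": {
            "primary_sounds": primary_sounds,  # name the material-specific sound
            "ambient": ambient,
            "music": "no music",
        },
    }
    return json.dumps(prompt, indent=2, ensure_ascii=False)


# Illustrative swap (invented values): ice instead of burning glass.
text = build_asmr_json_prompt(
    action="keys frost over on each press, cracking softly as they thaw",
    prop="keyboard carved from glacier ice",
    primary_sounds="ice cracks, brittle snaps, soft crunch of frost under fingers",
    ambient="low wind and distant creaking ice",
)
print(text)
```

Paste the printed JSON into Veo as the prompt, the same way as the hand-written version.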

ASMR creator to camera (typing + spoken aside)

Prompt
asmr creator typing on a noisy keyboard and then looking up and blowing into the microphone as she talks

Tweak: A short, loose prompt that still yields synced keyboard SFX plus a spoken aside, which shows Veo can fill in rich tactile audio from a minimal ASMR brief. Add a quoted line if you want to script what she says.

Character selfie-stick vlog with scripted dialogue (JSON)

Why it works: Naming the exact prop ("smartphone mounted on a selfie stick") tells Veo how the shot should be framed and why the camera shakes naturally, which is what makes a non-human subject read as convincingly self-filming rather than externally filmed.

Prompt
{
  "shot": {
    "composition": "Medium shot, vertical format, handheld camera",
    "camera_motion": "slight natural shake",
    "frame_rate": "30fps",
    "film_grain": "none"
  },
  "subject": {
    "description": "A towering, snow-white Yeti with shaggy fur and expressive blue eyes",
    "wardrobe": "slightly oversized white T-shirt with the name 'Emily' in bold, blood-red letters across the chest"
  },
  "scene": {
    "location": "lush forest clearing",
    "time_of_day": "daytime",
    "environment": "sunlight filtering through the canopy, creating dappled light patterns on the forest floor"
  },
  "visual_details": {
    "action": "Yeti holds a smartphone on a selfie stick, speaking excitedly to the camera before letting out a dramatic scream",
    "props": "smartphone mounted on a selfie stick"
  },
  "cinematography": {
    "lighting": "natural sunlight with soft shadows",
    "tone": "lighthearted and humorous"
  },
  "audio": {
    "ambient": "rustling leaves, distant bird calls",
    "dialogue": { "character": "Yeti", "line": "Veo3 Fast is now available in the Gemini app—three videos per day! People are going to prompt me like crazy!", "subtitles": false },
    "effects": "sudden loud scream, flapping wings of startled birds"
  },
  "color_palette": "naturalistic with earthy greens and browns; bold red lettering on shirt provides contrast"
}

Tweak: A creature-vlogger selfie-stick format with a full scripted line and a physical-comedy button (the scream). Swap the character, the T-shirt text, and the dialogue line; keep "vertical format, handheld camera" plus the named prop ("smartphone mounted on a selfie stick") to sell the selfie framing.

Credit: @IamEmily2050, via songguoxs/awesome-video-prompts

Low-fi phone-shot influencer testimonial (non-English original)

Why it works: Explicitly requesting "low-quality amateur video shot on phone" rather than leaving quality unstated is what suppresses the default polished look, and scripting the line in the target language directly (not translating after the fact) is what keeps the lip-sync and delivery natural.

Prompt
tiktok 风格的影响者视频。一位年轻的中国女性举起并谈论这个产品,她用清晰的中文说到:"欢迎大家来尝试我们家新出的 katon 音响,音质超一流,支持 ChatGPT",用手机拍摄的低质量业余视频。

English gloss (not part of the prompt): A TikTok-style influencer video. A young Chinese woman holds up and talks about the product, saying clearly in Mandarin, "Come try our new Katon speaker: top-tier sound quality, supports ChatGPT." Shot as a low-quality, amateur phone video.

Tweak: This shows that Veo can follow non-English dialogue and deliver it naturally, and that naming the format explicitly ("低质量业余视频", low-quality amateur video) is enough to get an authentically unpolished UGC look. Swap the product, the language, and the spoken line.

Credit: @hellokaton, via songguoxs/awesome-video-prompts

ASMR — food close-up

Prompt
Macro close-up of a **glossy chocolate-glazed donut** being slowly pulled apart by two hands, soft strands stretching. Bright soft lighting, shallow depth of field, slow motion, photoreal, 4K. SFX: the soft tear of the dough, a faint sticky pull of the glaze. Ambient noise: quiet kitchen tone. No dialogue, no background music.

Tweak: Food ASMR lives on the SFX line and slow motion. Change the food and rebuild the sound from what that food would actually make.

Get-ready-with-me (GRWM) talking

Prompt
Vertical 9:16, a **woman doing her makeup at a vanity mirror**, talking to the camera between steps, handheld phone propped to the side. She glances at the lens and says in a chatty, friendly voice, "Real quick before we start — today was a lot." Soft ring-light look, authentic phone-camera feel, photoreal. Ambient noise: soft room tone. No background music, no on-screen text.

Tweak: Keep it to one short line and one action (applying makeup) so the 8 seconds do not feel rushed. Swap the line for your topic.

Unboxing UGC

Prompt
Vertical 9:16 over-the-shoulder and selfie mix. A **man at a wooden desk** opens a **plain kraft shipping box**, lifts out a **pair of white sneakers**, and holds one up to the camera with a delighted reaction. He says in an excited voice, "Oh these are so much cleaner in person." Natural window light, authentic handheld look, photoreal. Ambient noise: cardboard rustle, tissue paper. No background music, no on-screen text.

Tweak: Name the product and the box concretely. The reaction line plus the cardboard SFX is what makes it feel like a real unboxing.

Character VLog (multi-clip, persona-driven)

Prompt
Generate 1 VLog using Veo3, consisting of 4 clips. Protagonist: The star is a large, fluffy white yeti who lives in a snowy forest. His personality is a hilarious mix of sarcastic, emotionally unstable, and overly dramatic, but he's ultimately lovable.

Tweak: This is the community "creature vlogger" format: define a protagonist and personality, and Veo carries a consistent vlog voice across clips. Swap the yeti for your character, but keep the personality line, because it drives the delivery.

Travel selfie vlog

Prompt
Vertical 9:16 selfie of a **travel blogger in a denim jacket** walking through a **bustling night market**, phone at arm's length, neon stalls behind. She looks into the lens, excited, and says, "I can't believe the food here, I'm trying everything tonight." Then she turns to point at a food stall. Handheld, warm neon light, authentic phone look, photoreal. Ambient noise: market chatter, sizzling food, distant music. No added background music, no on-screen text.

Tweak: The "then she turns to point" beat adds natural movement. Swap the location and line; keep one turn so the motion stays clean.

FAQ

How do I make Veo videos look like real UGC instead of a polished ad?

Ask for the look explicitly: "selfie video", "slightly shaky handheld", "authentic phone-camera look", "natural light", and vertical 9:16. Avoid cinematic camera moves and studio lighting — those read as produced, not user-generated.

Should UGC prompts be vertical?

For TikTok, Reels, and Shorts, yes — set 9:16. Veo 3.1 supports 16:9 and 9:16 natively, so prompt 9:16 directly rather than cropping a landscape clip.

How do I get authentic spoken delivery?

Use the says, "..." pattern with a casual voice cue ("in a relaxed, chatty voice") and keep the line short and conversational. Over-scripted lines sound like an ad; one natural sentence reads as real.

Can I use my own product or face?

For a consistent product or person across clips, use Veo image-to-video or the reference-image ("ingredients") feature to carry the same look between shots. See the image-to-video guide below.

Why does ASMR audio sound weak?

Vague audio prompts produce vague sound. Name the exact noise ("crisp crunch", "soft sticky pull") with SFX:, use slow motion, and add "no background music" so the tactile sound stands alone.
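
One way to enforce the "name the exact noise" rule is to build the audio lines from explicit fields and reject known-vague phrases. This is a minimal sketch only: `asmr_audio_lines` and the `VAGUE` list are hypothetical, and the phrases in that list are examples rather than a tested blocklist.

```python
# Example vague cues to reject (illustrative, not exhaustive).
VAGUE = {"asmr sounds", "satisfying sounds", "nice sounds"}


def asmr_audio_lines(sfx, ambient="quiet room tone"):
    """Return the SFX / ambient / negative-cue sentences for an ASMR prompt."""
    if sfx.strip().lower() in VAGUE:
        raise ValueError("Name the exact sound, for example 'crisp crunch'.")
    return f"SFX: {sfx}. Ambient noise: {ambient}. No dialogue, no background music."


audio = asmr_audio_lines(
    "the soft tear of the dough, a faint sticky pull of the glaze",
    ambient="quiet kitchen tone",
)
print(audio)
```

Append the returned text to the visual description so the tactile sound stands alone in the mix.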