In the first thirty seconds of the director and artist Paul Trillo's short film "Thank You for Not Answering," a woman gazes out the window of a subway car that appears to have sunk underwater. A man appears in the window swimming toward the car, his body materializing from the darkness and swirling water. It's a frightening, claustrophobic, violent scene, one that could have taken hundreds of thousands of dollars' worth of props and special effects to shoot, but Trillo generated it in a matter of minutes using an experimental tool kit made by an artificial-intelligence company called Runway. The figures in the film appear real, played by humans who might actually be underwater. A second glance, though, reveals the uncanniness in their blank eyes, distended limbs, and mushy features. The surreal or hyperreal aesthetic of A.I.-generated video may rely on models trained on live-action footage, but the result "feels closer to dreaming," Trillo told me. He continued, "It's closer to when you close your eyes and try to remember something."
"Thank You for Not Answering" is an evocation of loneliness and isolation. A reedy voice-over, from an A.I.-generated vocal model trained on Harry Dean Stanton's monologue in the film "Paris, Texas," reads a script written by Trillo: a voice mail left on an answering machine, mourning the loss of possibilities and memories, perhaps the ruins of a relationship. "One day, the entirety of our lives will be at our backs and the what-if of it all will still haunt me," the eerie voice says, rambling over the film's two and a half minutes. Trillo wrote the script during the height of the pandemic, a moment of total disconnection. He set it over a cascade of A.I. imagery: flashes of flooding subway cars, phone booths in the desert, elegantly dressed people at parties, and apartments lit up at night. The vibe is part Edward Hopper and part David Lynch, a filmic inspiration of Trillo's.
To make the clips, Trillo first generated still images suggesting the scenes he had in mind, using the A.I. tool Stable Diffusion, which was co-created by Runway's team. Much as with DALL-E, another image generator, he typed in a text prompt describing the content he wanted in the image, along with adjectives to nail down its aesthetic. These stills functioned as concept art or storyboarding. He then fed the generated images, one per clip, into Runway, together with another paragraph-length text prompt describing the motion and animation he desired in the video, including suggestions for camera movements. Runway chugged away, then spat out a short clip roughly reflecting the image and prompts.
Trillo knit together the many resulting clips, each of which required multiple permutations, to create the final short film. He demonstrated the process to me during a Zoom call; in seconds, it was possible to render, for example, a tracking shot of a woman crying alone in a softly lit restaurant. His prompt included a hash of S.E.O.-esque terms meant to goad the machine into creating a particularly cinematic aesthetic: "Moody lighting, iconic, visually stunning, immersive, impactful." Trillo was enthralled by the process: "The speed in which I could operate was unlike anything I had experienced." He continued, "It felt like being able to fly in a dream." The A.I. tool was "co-directing" alongside him: "It's making a lot of decisions I didn't."
A.I. imagery has its flaws—human faces tend to be misshapen, hands are still difficult, and natural bodily motion is hard to render—but the film uses these shortcomings to its advantage. It doesn't matter that the scenes don't look perfectly real; their oneiric quality makes them all the more haunting, doubling the plaintiveness of the voice-over. Photorealism wouldn't match the material, though the film comes close enough to be briefly mistaken for real. ("The more shadows you have, the more believable something is," Trillo said.) The director wanted to achieve effects with A.I. that wouldn't be possible with traditional tools, whether special effects or cinematography: "It is a little boring to do stuff that you could have shot with a real camera and performer." At the same time, Runway allowed Trillo to visualize scenes that he couldn't otherwise have afforded, like the flooding subway car. "It's very good with fire and explosions and water. The more organic it is, the better it is," he said. (Runway developed its own in-house video-generating model and would not disclose which data the model was trained on.)
A.I. has become part of Trillo’s filmmaking palette; he also used it to generate the backgrounds of an animated commercial for GoFundMe. It drastically lowers the barrier to creating visual effects. Cristóbal Valenzuela, the C.E.O. and a co-founder of Runway, cast it as a radical shift. “The cost of creating content is going down to zero. Most of the content you will consume and watch and create will be generated,” he told me. He envisions a fundamental change in the way that films are created and received: if everyone can generate a realistic explosion by simply typing text into a box, the explosion won’t be so remarkable onscreen anymore. Filmmaking will be “a curatorial and editorial job, where you’re going through iterations and picking the ones that are more interesting.”
The phrase “A.I.-generated film” is something of a misnomer. In Trillo’s case, the director wrote a script, assembled a visual aesthetic, determined which scenes to create, selected from Runway’s results, and then edited the clips into a threaded, thematically coherent finished product. Generative tools supplied the media—voice, faces, scenery, and animation—but the human creative element is still present in every step of the process. Trillo does not feel as though he is outmoding himself by using A.I. “I’m not interested in this replacing anything,” he said.