Once the avatar is generated, its mouth and body move in time with the scripted audio. While the scripts were once pre-written by humans, companies are now using large language models to generate them too.
Now, all the human workers have to do is input basic information such as the name and price of the product being sold, proofread the generated script, and watch the digital influencer go live. A more advanced version of the technology can detect live comments and pull matching answers from its database in real time, so it looks as if the AI streamer is actively communicating with the audience. It can even adjust its marketing strategy based on the number of viewers, Sima says.
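To make the comment-matching idea concrete, here is a minimal sketch of how a live comment might be paired with a stored answer. It is an illustration only, not the method any company named here uses: the `FAQ` entries, the canned replies, and the `match_answer` function are all hypothetical, and simple keyword overlap stands in for whatever retrieval technique a real product relies on.

```python
import re

# Hypothetical answer database: each entry pairs trigger keywords with a canned reply.
# The keywords and replies are invented for illustration.
FAQ = [
    ({"price", "cost", "much"}, "The current price is listed in the shopping cart."),
    ({"shipping", "delivery", "ship"}, "Shipping details are on the product page."),
    ({"size", "sizes", "fit"}, "Available sizes are listed on the product page."),
]


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def match_answer(comment, faq=FAQ, min_overlap=1):
    """Return the reply whose keywords overlap most with the comment, or None."""
    words = tokenize(comment)
    best_reply, best_score = None, 0
    for keywords, reply in faq:
        score = len(words & keywords)
        if score > best_score:
            best_reply, best_score = reply, score
    return best_reply if best_score >= min_overlap else None


if __name__ == "__main__":
    for comment in ["How much does it cost?", "hello everyone"]:
        print(comment, "->", match_answer(comment))
```

Comments that match nothing return `None`, so under this sketch the avatar would simply carry on with its script rather than answer off-topic chatter.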
These livestream AI clones are trained on the common scripts and gestures seen in e-commerce videos, says Huang Wei, the director of virtual influencer livestreaming business at the Chinese AI company Xiaoice. The company has a database of nearly a hundred pre-designed movements.
“For example, [when human streamers say] ‘Welcome to my livestream channel. Move your fingers and hit the follow button,’ they are definitely pointing their finger upward, because that’s where the ‘Follow’ button is on the screen of most mobile livestream apps,” says Huang. Similarly, when streamers introduce a new product, they point down—to the shopping cart, where viewers can find all products. Xiaoice’s AI streamers replicate all these common tricks. “We want to make sure the spoken language and the body language are matching. You don’t want it to be talking about the Follow button while it’s clapping its hands. That would look weird,” she says.
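The pairing Huang describes, in which certain phrases go with certain movements, can be pictured as a simple lookup. The sketch below is an assumption-laden illustration rather than Xiaoice's system: the gesture names, the `GESTURE_CUES` table, and both functions are hypothetical, and plain substring matching stands in for whatever the company actually uses.

```python
# Hypothetical cue table: a phrase in the script line maps to a pre-designed movement.
# Gesture names are invented; the two pointing directions follow the examples in the text.
GESTURE_CUES = [
    ("follow button", "point_up"),    # the Follow button sits near the top of the screen
    ("shopping cart", "point_down"),  # the cart sits near the bottom
    ("new product", "point_down"),
]
DEFAULT_GESTURE = "idle"


def pick_gesture(line):
    """Return the gesture for the first cue phrase found in the line."""
    text = line.lower()
    for phrase, gesture in GESTURE_CUES:
        if phrase in text:
            return gesture
    return DEFAULT_GESTURE


def annotate_script(lines):
    """Pair every script line with the gesture the avatar should perform."""
    return [(line, pick_gesture(line)) for line in lines]


if __name__ == "__main__":
    script = [
        "Welcome to my livestream channel. Hit the Follow button!",
        "Let me introduce a new product.",
        "Thanks for watching.",
    ]
    for line, gesture in annotate_script(script):
        print(f"{gesture:>10}  {line}")
```

Falling back to a neutral gesture when no cue matches reflects the concern in the quote: a mismatched movement looks worse than none at all.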
Spun off from Microsoft Software Technology Center Asia in 2020, Xiaoice has always been focused on creating more human-like AI, particularly avatars that are capable of showing emotions. “Traditional e-commerce sites just feel like a shelf of goods to most customers. It’s cold. In livestreaming, there is more emotional connection between the host and the viewers, and they can introduce the products better,” Huang says.
After piloting with a few clients last year, Xiaoice officially launched its service of generating under-$1,000 digital clones this year; like Silicon Intelligence, Xiaoice only needs human streamers to provide a one-minute video of themselves.
And as with its competitors, Xiaoice's clients can spend more to fine-tune the details. For example, Liu Jianhong, a Chinese sports announcer, made an exquisite clone of himself during the 2022 FIFA World Cup to read out the match results and other relevant news on Douyin.
A cheap replacement for human streamers
These generated streamers won’t be able to beat the star e-commerce influencers, Huang says, but they are good enough to replace mid-tier ones. Human creators, including those who used their videos to train their AI clones, are already feeling the squeeze from their digital rivals to some extent. It’s harder to get a job as an e-commerce livestream host this year, and the average salary for livestream hosts in China went down 20% compared to 2022, according to the analytics firm iiMedia Research.