The novelty of AI chatbots is giving way to a new frontier: visual creation. While text-based conversational updates once dominated the headlines, recent data from Appfigures indicates that image-generation models are now the most effective driver of mobile app adoption. These visual upgrades generate 6.5 times more downloads than traditional model improvements or voice-interface features.
The Visual Surge: Gemini and ChatGPT
Leading the charge are the industry’s biggest players, who have seen massive spikes in user acquisition following the release of specialized image tools.
- Google Gemini: The introduction of the Gemini 2.5 Flash image model, dubbed “Nano Banana,” resulted in a staggering 22 million incremental downloads in just 28 days. This represented a 4x lift in the app’s standard download volume.
- OpenAI: When ChatGPT launched its GPT-4o image model, it secured over 12 million new installs within a month. This visual-centric update outperformed the download metrics of the GPT-4.5 and GPT-5 text model releases by a factor of 4.5.
Even video-centric content is contributing to this trend. Meta AI saw a boost of 2.6 million downloads following the release of “Vibes,” its AI-driven video feed, suggesting that users increasingly favor apps that offer rich, visual experiences over text-only interactions.
The Conversion Challenge: Downloads vs. Dollars
While “curiosity downloads” are easy to trigger with flashy image tools, converting those users into paying subscribers remains a hurdle for most developers.
Despite its massive download spike, Google’s Nano Banana model generated only an estimated $181,000 in gross consumer spending during its launch window. Similarly, Meta’s video features failed to produce significant immediate revenue.
OpenAI stands as the clear outlier. The GPT-4o image release didn’t just bring in users; it successfully converted them, leading to $70 million in consumer spending within the first 28 days of its launch.
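To put the gap in perspective, the figures above can be turned into a rough spend-per-download estimate. The short Python sketch below is illustrative only: the function name is hypothetical, the inputs are the rounded numbers quoted in this article, and it assumes the download and spending windows for each launch are comparable. Because ChatGPT's installs are reported as "over 12 million," its per-download figure is best read as an approximate upper bound.

```python
def spend_per_download(gross_spend_usd: float, downloads: float) -> float:
    """Return gross consumer spending per incremental download (hypothetical helper)."""
    return gross_spend_usd / downloads


# Rounded figures quoted in the article (estimates, not exact counts).
gemini = spend_per_download(181_000, 22_000_000)      # Nano Banana launch window
chatgpt = spend_per_download(70_000_000, 12_000_000)  # GPT-4o image launch

# How many times more each ChatGPT download was worth, under these assumptions.
ratio = chatgpt / gemini

print(f"Gemini:  ${gemini:.4f} per download")
print(f"ChatGPT: ${chatgpt:.2f} per download")
print(f"ChatGPT earned roughly {ratio:.0f}x more per download")
```

Under these assumptions, Gemini's launch works out to less than one cent of consumer spending per incremental download, while ChatGPT's comes to nearly six dollars.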
Exceptions to the Rule
Not every success story is tied to pixels. DeepSeek defied the visual trend with its R1 model, which drove 28 million downloads in early 2025. This growth was fueled not by image generation, but by the tech industry’s fascination with the company’s ability to train high-performing models at a fraction of the cost of its competitors.
However, for the broader consumer market, the message is clear: if you want to scale an AI app today, visual capabilities are the most potent hook available.







