OpenAI and advanced image generation, Europe's defense mobilization, China’s commercial space industry

Mar 28 2025

Defense Space AI

3 Shifts Edition (Mar 28 2025): OpenAI and advanced image generation, Europe's defense mobilization, China’s commercial space industry

14 min read

1. OpenAI and advanced image generation

On Tuesday, OpenAI turned on GPT-4o’s Image Generation, a model “capable of precise, accurate, photorealistic outputs” that industry watchers are calling “insane.” Unlike OpenAI’s DALL-E, 4o Image Generation is a native capability of the GPT-4o multimodal model, which means it can take more precise direction from users’ prompts and iterate on images with users to produce more useful results. The release – which is among a swath of advanced image-generating features recently introduced by leading AI players (e.g. Google, xAI) – signals an inflection point for image generation.

4o Image Generation can produce, for instance, high-fidelity versions of images in the Studio Ghibli style popularized by movies like My Neighbor Totoro. (OpenAI CEO Sam Altman changed his X profile photo to a Studio Ghibli-style version, which is up as of this writing.) It can generate a comic strip in which Elon Musk explains quantum computing, a version of a user’s profile pic as a Muppet astronaut, an image restyled to evoke the ’80s, detailed street signage for witches with accurate text rendering, scientific diagrams in an illustrated style, a mockup of a game controller that exists nowhere else, a Mughal portrait in the Rembrandt style, a new furniture design based on a color swatch, a Korean organic restaurant menu with illustrations, or a high-end wedding invitation with iconography, among many other possibilities.

4o Image Generation is autoregressive, producing images sequentially from left to right and top to bottom. In contrast, OpenAI’s DALL-E uses a diffusion technique that refines the whole image at once. OpenAI reportedly trained 4o Image Generation using reinforcement learning, drawing on 100+ human workers.
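The difference in generation order can be illustrated with a toy sketch. This is not OpenAI's architecture: real models work on learned tokens or latents, and the function names here (`autoregressive_generate`, `diffusion_like_generate`, `running_count`) are hypothetical stand-ins chosen only to show raster-order generation next to whole-image refinement.

```python
import random


def autoregressive_generate(height, width, next_value):
    """Fill a grid one cell at a time, left to right and top to bottom."""
    grid = [[None] * width for _ in range(height)]
    order = []  # records the sequence in which cells were produced
    for row in range(height):
        for col in range(width):
            # Each new cell can depend on every cell produced before it.
            history = [grid[r][c] for r, c in order]
            grid[row][col] = next_value(history)
            order.append((row, col))
    return grid, order


def diffusion_like_generate(height, width, steps, seed=0):
    """Start from noise everywhere, then refine every cell together per step."""
    rng = random.Random(seed)
    grid = [[rng.random() for _ in range(width)] for _ in range(height)]
    target = 0.5  # toy stand-in for what a denoiser would predict
    for _ in range(steps):
        # All cells move toward the target in the same step.
        grid = [[v + 0.5 * (target - v) for v in row] for row in grid]
    return grid


def running_count(history):
    """Toy 'model': the next value is the number of cells generated so far."""
    return len(history)


if __name__ == "__main__":
    grid, order = autoregressive_generate(2, 3, running_count)
    print(order)  # cell positions in raster order
    print(grid)   # values reflect the order in which cells were filled
    print(diffusion_like_generate(2, 3, steps=10))
```

In the autoregressive function each cell is conditioned on everything generated before it. In the diffusion-like function no cell is ever finished before the others.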

“Useful image generation,” as OpenAI calls it, has long been limited by models’ challenges in taking close direction from users while iterating on images. The non-deterministic nature of prior generations of the technology meant that achieving a desired specific outcome often required some measure of luck. While generative AI could create “surreal, breathtaking scenes,” it was much harder to create images that conveyed precise information and meaning.

Because the model can accurately follow precise direction, users can, for instance, add a hat and monocle to a cat, turn the detective cat into a AAA video-game image, update the image ratio, add a steampunk Manhattan background, and create a game-style character profile for the cat. A graphic designer producing a sticker can start with a raccoon, change the style and color, and add a chew mark to the strawberry and residue around the raccoon’s mouth. The same designer can start with a product image of a chainsaw and turn it into an ad, complete with a tagline, in which a grandma uses the chainsaw to carve the turkey at Thanksgiving. Perhaps more usefully, 4o Image Generation can also create images based on a user-uploaded brand style guide, logos or business icons with transparent backgrounds, or other work products.

Alternatively, 4o Image Generation can incorporate world knowledge into images, so the user can prompt the AI with just a high-level description if they prefer. For instance, it can design an infographic on why San Francisco is so foggy, a graphic with recipes for cocktails, or an educational poster about whales – all without the user having to supply the specific information to be displayed.

The images produced by 4o Image Generation can be photorealistic, in addition to being able to mimic certain styles. Examples provided by OpenAI include a paparazzi-style photo of Karl Marx with shopping bags walking through a mall parking lot, a time-stamped developed photograph of a girl drinking a smoothie at a Toronto farmer’s market, and a fruit bowl full of miniature planets. Like other ChatGPT-generated images, the images produced by 4o are owned by the user and can be used within the scope of OpenAI’s usage policies.

4o Image Generation has its limitations. It can crop images too tightly near the bottom, hallucinate information, struggle to render more than 10-20 distinct concepts in a single image, represent graphs imprecisely, and depict non-Latin characters inaccurately. OpenAI has instituted certain safeguards on what can be generated – barring, for instance, photorealistic images of children. It also adds standard C2PA (Coalition for Content Provenance and Authenticity) metadata to all assets, enabling their origin to be traced. Altman has said 4o Image Generation has some flexibility to be a little offensive, within reason, if directed to do so by the user – a move potentially influenced by the success of xAI’s Grok 3.

OpenAI’s new image-generation feature is available to ChatGPT users in OpenAI’s Plus, Pro, and Team tiers, and will become available to Enterprise and Edu users next week. The Free tier initially had access, but OpenAI had to pull it back because its “GPUs are melting” due to demand. Free users will “soon” be able to generate 3 images per day, the same limit that applies to DALL-E 3. Other tiers will see limits as well, according to Altman. Developers will also be able to generate images using GPT‑4o through OpenAI’s API within the next few weeks.

OpenAI isn’t the only player working on native image generation, although it reportedly has the most advanced model among those tested. Its release follows soon after Google’s announcement of its own native image-generation feature for Gemini 2.0 Flash, to positive reviews. (Google’s Gemini 2.5 Pro is the model currently topping the leaderboards.) Google’s feature, like OpenAI’s, allows for story/illustration generation, conversational image-editing, incorporation of world understanding, and better text rendering. Google’s model will reportedly even remove watermarks from images.

Last week, Elon Musk’s xAI introduced image generation through its API, using its “grok-2-image-1212” model (which is more limited than its Grok 3). The feature can generate up to 10 images per request, at $0.07 per image, although it does not yet support adjusting image quality, size, or style. (Musk’s X social platform added a photorealistic image generator called Aurora from xAI in Dec 2024.) There are also image-generation offerings from players like Runway, Adobe, Playground AI, and others.
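To make the xAI figures concrete, here is a minimal arithmetic sketch using only the two numbers cited above: 10 images per request and $0.07 per image. It assumes flat per-image pricing with no per-request fee, which the article does not confirm. The function name `plan_image_batch` is hypothetical, and no real API is called.

```python
import math

MAX_IMAGES_PER_REQUEST = 10  # per-request cap cited in the article
PRICE_PER_IMAGE_CENTS = 7    # $0.07 per image, kept in cents to avoid float drift


def plan_image_batch(total_images):
    """Return (requests_needed, cost_in_dollars) for a batch of images.

    Assumes flat per-image pricing with no per-request charge.
    """
    if total_images < 0:
        raise ValueError("total_images must be non-negative")
    requests_needed = math.ceil(total_images / MAX_IMAGES_PER_REQUEST)
    cost_cents = total_images * PRICE_PER_IMAGE_CENTS
    return requests_needed, cost_cents / 100


if __name__ == "__main__":
    requests_needed, cost = plan_image_batch(25)
    print(f"{requests_needed} requests, ${cost:.2f}")
```

Under these assumptions, a 25-image job would take three requests and cost $1.75.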

4o Image Generation’s level of capability is already spurring more debate around the use of human creative work to train AI models that compete with creators. Style – such as Studio Ghibli’s style – may not be explicitly protected in US IP law. However, whether the use of copyrighted content for training should be considered “fair use” – especially when AI-generated work might have a negative commercial impact on the original creators – is still an unresolved question. OpenAI’s 4o Image Generation was trained using both publicly available data and proprietary data obtained through OpenAI’s partnerships (e.g. Shutterstock).

While OpenAI’s models continue to rank near the top of the leaderboards (although not at the very top), the company seems to have shifted its strategy away from being the leading provider of best-in-class models. Rather than hanging its hat on being a model purveyor, OpenAI is instead aiming to be the next global big tech firm with 1B+ daily active users. ChatGPT already had 400M+ weekly users as of Feb 2025 (up 33% from 300M in Dec 2024), including 2M paid enterprise users. (The ChatGPT.com website alone sees 5B+ visits every month.) OpenAI is aiming to reach 1B users and triple revenue to $12.7B by the end of 2025.
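The growth figures above can be sanity-checked with a few lines of arithmetic. This sketch uses only the numbers quoted in the article. The baseline revenue is merely what "triple to $12.7B" implies, not a reported figure. Comparing the 1B-user goal against the weekly-user count is also an assumption, since the article does not say which user metric the target refers to.

```python
def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Figures quoted in the article
weekly_users_dec_2024 = 300e6
weekly_users_feb_2025 = 400e6
user_target = 1e9
revenue_target_bn = 12.7

# Growth from Dec 2024 to Feb 2025
growth_pct = percent_change(weekly_users_dec_2024, weekly_users_feb_2025)

# Multiple of the Feb 2025 weekly-user figure needed to hit 1B
# (assumes the target is comparable to weekly users)
multiple_needed = user_target / weekly_users_feb_2025

# Baseline implied by "triple revenue to $12.7B" (an inference, not a reported number)
implied_baseline_bn = revenue_target_bn / 3

if __name__ == "__main__":
    print(f"User growth: {growth_pct:.0f}%")
    print(f"Multiple needed to reach 1B users: {multiple_needed:.1f}x")
    print(f"Implied baseline revenue: ${implied_baseline_bn:.2f}B")
```

On these numbers, the 33% growth figure checks out, reaching 1B users would take 2.5x the Feb 2025 weekly figure, and the implied revenue baseline is roughly $4.2B.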

OpenAI has notably been broadening its scope in this vein. In just the past month, OpenAI has revealed GPT-4.5, its NextGenAI research and education consortium, enterprise tools for building AI agents, a new model able to write artful short stories, next-gen audio models for voice agents, a collaboration with the MIT Media Lab studying how AI can affect well-being, the OpenAI Academy hub for AI literacy, updates on its security initiatives, and now its 4o Image Generation. OpenAI is also expanding the scope of its leaders – with Mark Chen as Chief Research Officer focusing on “faster translation of research into products people love”; and Brad Lightcap as Chief Operating Officer focusing on global deployment, including “business strategy, key partnerships, infrastructure, and operational excellence to maximize the impact of our research.”

Perhaps most notably, Altman has hinted that OpenAI may open up some of its models to compete with open-source players like DeepSeek. Just this week, OpenAI added support for rival Anthropic’s open-source Model Context Protocol (MCP), which is used to connect a user’s data sources to AI applications.

We seem to be getting closer to the point where AI does most of the work and pure creativity can be of value on its own. In such a world, the only limits – at least in the digital realm – are the limits of the imagination. On the other hand, with fewer barriers to producing creative work, more people will become creators, and each will need less effort to create. This could mean diminishing returns to average creativity, and perhaps even diminishing returns to above-average creativity if IP protections are eroded. Tools like OpenAI’s image generation are likely to end up having the greatest impact on the middle 50% of the creator economy, touching platforms like Etsy as well as professions like graphic design and photography.