A Next.js quickstart for creating and editing images and videos with Google's latest Gemini API models, including Veo 3.1, Imagen 4, and Gemini 2.5 Flash Image (aka nano banana).
Note
If you want a full studio, consider Google's Flow (a professional environment for Veo/Imagen). Use this repo as a lightweight studio to learn how to build your own UI that generates content with Google's AI models via the Gemini API.
(This is not an official Google product.)
The quickstart provides a unified composer UI with different modes for content creation:
- Create Image: Generate images from text prompts using Imagen 4 or Gemini 2.5 Flash Image.
- Edit Image: Edit an image based on a text prompt using Gemini 2.5 Flash Image.
- Compose Image: Combine multiple images with a text prompt to create a new image using Gemini 2.5 Flash Image.
- Create Video: Generate videos from text prompts or an initial image using Veo 3.1.
- Extend Video: Extend existing Veo-generated videos with a new prompt.
- Interpolate Video: Generate a video by providing a start and end frame.
- Reference Images: Use up to three images to guide the video generation process.
- Seamless navigation between modes after generating content
- Download generated images & videos
- Cut videos directly in the browser to specific time ranges
- Prompt Magic: Enhance your prompts with AI using Gemini 2.5 Flash.
- Dark Mode: Toggle between light and dark modes for a better user experience.
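Cutting a video to a time range needs start and end values that stay inside the clip. As a minimal sketch, the range could be normalized like this before it is applied. The `clampTrimRange` name and `TrimRange` shape are hypothetical and not taken from this repo's video player.

```typescript
// Hypothetical helper for illustration; not the repo's actual trimming code.
interface TrimRange {
  start: number; // seconds from the beginning of the clip
  end: number;   // seconds from the beginning of the clip
}

function clampTrimRange(start: number, end: number, duration: number): TrimRange {
  if (!isFinite(duration) || duration <= 0) {
    throw new Error("duration must be a positive, finite number of seconds");
  }
  // Treat NaN or infinite handles as "not set" and fall back to the full clip.
  const rawStart = isFinite(start) ? start : 0;
  const rawEnd = isFinite(end) ? end : duration;
  // Order the two handles, then clamp both into [0, duration].
  const lo = Math.max(0, Math.min(rawStart, rawEnd));
  const hi = Math.min(duration, Math.max(rawStart, rawEnd));
  return { start: Math.min(lo, duration), end: Math.max(hi, 0) };
}

// Usage: handles dragged past the clip edges are pulled back in.
const trimmed = clampTrimRange(-2, 12, 8);
// trimmed is { start: 0, end: 8 }
```

Swapped handles are reordered rather than rejected, which tends to be friendlier for a drag-based UI. A stricter variant could throw instead.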
Follow these steps to get the application running locally for development and testing.
1. Prerequisites:
- Node.js and npm (or yarn/pnpm)
- GEMINI_API_KEY: The application requires a Gemini API key. Either create a .env file in the project root and add your API key as GEMINI_API_KEY="YOUR_API_KEY", or set the environment variable in your system.
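To fail early with a readable message when the key is missing, server code could check the environment before calling the API. The sketch below is an assumption about how such a check might look. `requireGeminiApiKey` and the `Env` type are hypothetical names, and the repo may read the key differently.

```typescript
// Hypothetical helper for illustration; not taken from the repo.
type Env = { [name: string]: string | undefined };

function requireGeminiApiKey(env: Env): string {
  // Missing and whitespace-only values are both treated as "not set".
  const key = (env["GEMINI_API_KEY"] || "").trim();
  // Also reject the placeholder value copied verbatim from the instructions.
  if (key === "" || key === "YOUR_API_KEY") {
    throw new Error(
      "GEMINI_API_KEY is not set. Add it to .env or export it in your shell."
    );
  }
  return key;
}
```

In a Next.js server route this would typically be called as `requireGeminiApiKey(process.env)`, since Next.js loads `.env` files into `process.env` on the server.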
Warning
Veo 3.1, Imagen 4, and Gemini 2.5 Flash Image are part of the Gemini API paid tier. You need to be on the paid tier to use these models.
2. Install Dependencies:
    npm install
3. Run Development Server:
    npm run dev
Open your browser and navigate to http://localhost:3000 to see the application.
The project is a standard Next.js application with the following key directories:
- app/: Contains the main application logic and pages
  - page.tsx: Main page with the unified composer UI.
  - api/: API routes for different operations
    - imagen/generate/: Image generation with Imagen 4
    - gemini/generate/: Image generation with Gemini 2.5 Flash Image
    - gemini/edit/: Image editing/composition with Gemini 2.5 Flash Image
    - veo/generate/: Video generation operations
    - veo/operation/: Check video generation status
    - veo/download/: Download generated videos
- components/: Reusable React components
  - ui/Composer.tsx: The main unified composer for all interactions.
  - ui/VideoPlayer.tsx: Video player with trimming
  - ui/ModelSelector.tsx: Model selection component
  - ui/dropzone.tsx: Drag-and-drop component for file uploads.
- lib/: Utility functions and schema definitions
- public/: Static assets
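In the Next.js App Router, a `route.ts` file under `app/` is served at the URL that matches its folder path, which is how the `api/` directories above become endpoints. The following sketch illustrates that mapping for the simple case only. `routeFileToUrlPath` is a hypothetical helper, not part of Next.js or this repo, and it ignores route groups and dynamic segments.

```typescript
// Hypothetical illustration of the folder-to-URL convention.
// Does not handle route groups like (group), dynamic segments like [id],
// or a route file placed directly in app/.
function routeFileToUrlPath(file: string): string | null {
  const match = /^app\/(.*)\/route\.(ts|js)$/.exec(file);
  return match ? "/" + match[1] : null;
}

// Usage:
const veoGenerateUrl = routeFileToUrlPath("app/api/veo/generate/route.ts");
// veoGenerateUrl is "/api/veo/generate"
```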
- Gemini API docs: https://ai.google.dev/gemini-api/docs
- Veo 3 Guide: https://ai.google.dev/gemini-api/docs/video?example=dialogue
- Imagen 4 Guide: https://ai.google.dev/gemini-api/docs/imagen
The application uses the following API routes to interact with the Google models:
- app/api/imagen/generate/route.ts: Handles image generation requests with Imagen 4.
- app/api/gemini/generate/route.ts: Handles image generation requests with Gemini 2.5 Flash Image.
- app/api/gemini/edit/route.ts: Handles image editing and composition with Gemini 2.5 Flash Image (supports multiple images).
- app/api/veo/generate/route.ts: Handles video generation requests with Veo 3.1, including extension, interpolation, and reference images.
- app/api/veo/operation/route.ts: Checks the status of video generation operations.
- app/api/veo/download/route.ts: Downloads generated videos.
- app/api/gemini/prompt-magic/route.ts: Enhances user prompts with Gemini 2.5 Flash.
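One way to picture how a unified composer could choose among these routes is a small lookup from the active mode to an endpoint. This is a sketch under assumptions: the `ComposerMode` type and `endpointFor` function are hypothetical, and the actual composer may dispatch requests differently. Only the URL paths come from the route files listed above.

```typescript
// Hypothetical types for illustration; not the repo's actual state model.
type ImageModel = "imagen" | "gemini";

type ComposerMode =
  | { kind: "create-image"; model: ImageModel }
  | { kind: "edit-image" }
  | { kind: "compose-image" }
  | { kind: "create-video" }
  | { kind: "prompt-magic" };

function endpointFor(mode: ComposerMode): string {
  switch (mode.kind) {
    case "create-image":
      // Image creation has one route per model family.
      return mode.model === "imagen"
        ? "/api/imagen/generate"
        : "/api/gemini/generate";
    case "edit-image":
    case "compose-image":
      // Editing and composition share the same route.
      return "/api/gemini/edit";
    case "create-video":
      return "/api/veo/generate";
    case "prompt-magic":
      return "/api/gemini/prompt-magic";
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a mode is added
      // without a matching case.
      const unreachable: never = mode;
      throw new Error("Unhandled mode: " + JSON.stringify(unreachable));
    }
  }
}
```

Video requests differ from the image routes in that generation is followed by the operation route to check status and the download route to fetch the result, so a client would call more than one endpoint for that mode.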
- Next.js - React framework for building the user interface
- React - JavaScript library for building user interfaces
- Tailwind CSS - For styling
- Gemini API with:
  - Veo 3.1 - For video generation
  - Imagen 4 - For high-quality image generation
  - Gemini 2.5 Flash Image - For fast image generation, editing, and composition
- Want a feature? Please open an issue describing the use case and proposed behavior.
This project is licensed under the Apache License 2.0.


