Skip to content

marlonka/veo-3-gemini-api-quickstart

 
 

Repository files navigation

Gemini API Veo 3.1 & Nano Banana Quickstart

A NextJs quickstart for creating and editing images and videos using Google's latest Gemini API models including Veo 3.1, Imagen 4, and Gemini 2.5 Flash Image aka nano banana.

Compose
Compose
Edit
Edit
Video
Video

Note

If you want a full studio, consider Google's Flow (a professional environment for Veo/Imagen). Use this repo as a lightweight studio to learn how to build your own UI that generates content with Google's AI models via the Gemini API.

(This is not an official Google product.)

Features

The quickstart provides a unified composer UI with different modes for content creation:

  • Create Image: Generate images from text prompts using Imagen 4 or Gemini 2.5 Flash Image.
  • Edit Image: Edit an image based on a text prompt using Gemini 2.5 Flash Image.
  • Compose Image: Combine multiple images with a text prompt to create a new image using Gemini 2.5 Flash Image.
  • Create Video: Generate videos from text prompts or an initial image using Veo 3.1.
  • Extend Video: Extend existing Veo-generated videos with a new prompt.
  • Interpolate Video: Generate a video by providing a start and end frame.
  • Reference Images: Use up to three images to guide the video generation process.

Quick Actions & UI Features

  • Seamless navigation between modes after generating content
  • Download generated images & videos
  • Cut videos directly in the browser to specific time ranges
  • Prompt Magic: Enhance your prompts with AI using Gemini 2.5 Flash.
  • Dark Mode: Toggle between light and dark modes for a better user experience.

Getting Started: Development and Local Testing

Follow these steps to get the application running locally for development and testing.

1. Prerequisites:

  • Node.js and npm (or yarn/pnpm)
  • GEMINI_API_KEY: The application requires a GEMINI API key. Either create a .env file in the project root and add your API key: GEMINI_API_KEY="YOUR_API_KEY" or set the environment variable in your system.

Warning

Google Veo 3, Imagen 4, and Gemini 2.5 Flash Image are part of the Gemini API Paid tier. You will need to be on the paid tier to use these models.

2. Install Dependencies:

npm install

3. Run Development Server:

npm run dev

Open your browser and navigate to http://localhost:3000 to see the application.

Project Structure

The project is a standard Next.js application with the following key directories:

  • app/: Contains the main application logic and pages
    • page.tsx: Main page with the unified composer UI.
    • api/: API routes for different operations
      • imagen/generate/: Image generation with Imagen 4
      • gemini/generate/: Image generation with Gemini 2.5 Flash Image
      • gemini/edit/: Image editing/composition with Gemini 2.5 Flash Image
      • veo/generate/: Video generation operations
      • veo/operation/: Check video generation status
      • veo/download/: Download generated videos
  • components/: Reusable React components
    • ui/Composer.tsx: The main unified composer for all interactions.
    • ui/VideoPlayer.tsx: Video player with trimming
    • ui/ModelSelector.tsx: Model selection component
    • ui/dropzone.tsx: Drag-and-drop component for file uploads.
  • lib/: Utility functions and schema definitions
  • public/: Static assets

Official Docs and Resources

  • Gemini API docs: https://ai.google.dev/gemini-api/docs
  • Veo 3 Guide: https://ai.google.dev/gemini-api/docs/video?example=dialogue
  • Imagen 4 Guide: https://ai.google.dev/gemini-api/docs/imagen

How it Works

The application uses the following API routes to interact with the Google models:

Image APIs

  • app/api/imagen/generate/route.ts: Handles image generation requests with Imagen 4
  • app/api/gemini/generate/route.ts: Handles image generation requests with Gemini 2.5 Flash Image
  • app/api/gemini/edit/route.ts: Handles image editing and composition with Gemini 2.5 Flash (supports multiple images)

Video APIs

  • app/api/veo/generate/route.ts: Handles video generation requests with Veo 3.1, including extension, interpolation, and reference images.
  • app/api/veo/operation/route.ts: Checks the status of video generation operations.
  • app/api/veo/download/route.ts: Downloads generated videos.

Other APIs

  • app/api/gemini/prompt-magic/route.ts: Enhances user prompts with Gemini 2.5 Flash.

Technologies Used

  • Next.js - React framework for building the user interface
  • React - JavaScript library for building user interfaces
  • Tailwind CSS - For styling
  • Gemini API with:
  • Veo 3 - For video generation
  • Imagen 4 - For high-quality image generation
  • Gemini 2.5 Flash - For fast image generation, editing, and composition

Questions and feature requests

  • Want a feature? Please open an issue describing the use case and proposed behavior.

License

This project is licensed under the Apache License 2.0.

About

Veo 3 Gemini API Quickstart

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • TypeScript 95.2%
  • CSS 4.4%
  • JavaScript 0.4%