
🎬 DeepRat LTX Video - AI Video Generation


Advanced AI-powered video generation using the LTX Video model

🚀 Try Demo | 📖 Documentation | 🎨 Examples


boxing_cat.mp4

✨ Features

  • 🎬 Text-to-Video: Generate videos from text descriptions
  • πŸ–ΌοΈ Image-to-Video: Animate static images with AI
  • 🎞️ Video-to-Video: Transform and enhance existing videos
  • 🎯 Multi-Conditioning: Apply multiple conditions at specific frames
  • ⚡ High Performance: Optimized for CUDA with CPU fallback
  • 🎨 Flexible Control: Fine-tune every aspect of generation

📸 Examples

Text-to-Video (PtV)

Input Output
Example 1: Majestic Black Lion
Seed: 1363812591
Prompt: A beautiful and powerful black lion
1363812591.mp4
Example 2: Snow-Capped Mountains
Seed: 3804031196
Prompt: A view from above of beautiful snow-capped mountains
3804031196.mp4
Example 3: Tiger Attacking Wild Boar
Seed: 1397763684
Prompt: A tiger jumping/attacking a wild boar
1397763684.mp4

Image-to-Video (ItV)

Input (Image) Output (Video)
1830526882
Seed: 1830526882
Prompt: the two cats in the image are fighting each other with kicks and Muay Thai fists in a very active and dizzying way like an action fight
1830526882.mp4
738317591
Seed: 738317591
Prompt: The cats from the picture are boxing each other agresivelly
738317591.mp4
3858595085
Seed: 3858595085
Prompt: take the lion from the drawing and remove the background. turns the lion into a 3d model
3858595085.mp4
4273030543
Seed: 4273030543
Prompt: Make the skater in the image suffer a fall
4273030543.mp4
1747446564
Seed: 1747446564
Prompt: The monkey takes a cool look and then puts on sunglasses
1747446564.mp4

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/DeepRatAI/LTX-FastVideo-ZeroGPU_Optimized.git
cd LTX-FastVideo-ZeroGPU_Optimized

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Basic Usage

Text-to-Video

python inference.py \
  --prompt "A beautiful sunset over the ocean" \
  --height 704 \
  --width 1216 \
  --num_frames 121 \
  --seed 42

Image-to-Video

python inference.py \
  --prompt "The person in the image starts walking" \
  --conditioning_media_paths path/to/image.jpg \
  --conditioning_strengths 0.8 \
  --conditioning_start_frames 0 \
  --height 704 \
  --width 1216 \
  --num_frames 121 \
  --seed 42

πŸŽ›οΈ Parameters Guide

Parameter               Description           Default   Range
--height                Output video height   704       256-720
--width                 Output video width    1216      256-1280
--num_frames            Number of frames      121       1-257
--frame_rate            FPS of output         30        1-60
--seed                  Random seed           171198    Any integer
--guidance_scale        Prompt adherence      3.0       1.0-20.0
--num_inference_steps   Quality steps         50        1-100
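
The ranges in the table can be checked before starting a long run. The sketch below is illustrative only: `PARAM_RANGES` and `check_params` are hypothetical names, and the bounds are copied from the table above rather than read from `inference.py`, which may enforce different limits.

```python
# Hypothetical helper: bounds are copied from the parameter table,
# not taken from inference.py itself.
PARAM_RANGES = {
    "height": (256, 720),
    "width": (256, 1280),
    "num_frames": (1, 257),
    "frame_rate": (1, 60),
    "guidance_scale": (1.0, 20.0),
    "num_inference_steps": (1, 100),
}


def check_params(**params):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    for name, value in params.items():
        if name not in PARAM_RANGES:
            continue  # e.g. seed, which accepts any integer
        low, high = PARAM_RANGES[name]
        if not low <= value <= high:
            problems.append(f"--{name}={value} is outside {low}-{high}")
    return problems


if __name__ == "__main__":
    print(check_params(height=704, width=1216, num_frames=121, guidance_scale=3.0))
    print(check_params(height=1080, frame_rate=120))
```

Returning a list instead of raising on the first failure lets a caller report every out-of-range flag at once.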

Conditioning Parameters

  • --conditioning_media_paths: Path(s) to conditioning images/videos
  • --conditioning_strengths: Strength of each condition (0.0-1.0)
  • --conditioning_start_frames: Frame index where condition starts
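
For multi-conditioning, the three flags have to stay aligned: one path, one strength, and one start frame per condition. A minimal sketch of building such a command from Python follows. `build_command` is a hypothetical helper, and it assumes each conditioning flag accepts several space-separated values in matching order, which should be confirmed against `python inference.py --help`.

```python
import shlex


def build_command(prompt, conditions, height=704, width=1216, num_frames=121, seed=42):
    """Build an argument list for inference.py.

    `conditions` is a list of (media_path, strength, start_frame) tuples.
    Assumption: each conditioning flag takes one value per condition.
    """
    cmd = ["python", "inference.py", "--prompt", prompt,
           "--height", str(height), "--width", str(width),
           "--num_frames", str(num_frames), "--seed", str(seed)]
    if conditions:
        for path, strength, start in conditions:
            if not 0.0 <= strength <= 1.0:
                raise ValueError(f"strength for {path} must be in 0.0-1.0, got {strength}")
            if not 0 <= start < num_frames:
                raise ValueError(f"start frame for {path} must be in 0-{num_frames - 1}, got {start}")
        # Transpose the tuples so each flag receives its own column of values.
        paths, strengths, starts = zip(*conditions)
        cmd += ["--conditioning_media_paths", *paths]
        cmd += ["--conditioning_strengths", *map(str, strengths)]
        cmd += ["--conditioning_start_frames", *map(str, starts)]
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "The person in the image starts walking",
        [("path/to/first.jpg", 0.8, 0), ("path/to/second.jpg", 0.9, 64)],
    )
    print(shlex.join(cmd))  # pass `cmd` to subprocess.run to execute it
```

Keeping each condition as a single tuple makes it hard to end up with mismatched list lengths across the three flags.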

📊 Technical Details

Model Architecture

  • Base Model: LTX Video (Lightricks)
  • Precision: Mixed (BF16/FP32)
  • VAE: Causal Video Autoencoder
  • Transformer: 3D Transformer with symmetric patchifier
  • Scheduler: Rectified Flow

System Requirements

  • GPU: NVIDIA GPU with 16GB+ VRAM (recommended)
  • RAM: 32GB+ recommended
  • Storage: 50GB+ for models
  • Python: 3.10+
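
A quick way to see whether a machine meets the GPU recommendation is to query PyTorch before running inference. This is a sketch, not part of the repository: `describe_device` is a hypothetical helper, and the 16 GB threshold is taken from the list above.

```python
def describe_device(min_vram_gb=16):
    """Return (device, note), falling back to CPU when PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return "cpu", "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "cpu", "no CUDA device found; inference will be very slow"
    # total_memory is reported in bytes for the first visible GPU.
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    if total_gb < min_vram_gb:
        return "cuda", f"{total_gb:.1f} GB VRAM is below the recommended {min_vram_gb} GB"
    return "cuda", f"{total_gb:.1f} GB VRAM"


if __name__ == "__main__":
    device, note = describe_device()
    print(f"{device}: {note}")
```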

Supported Resolutions

  • Width: 256px - 1280px (divisible by 32)
  • Height: 256px - 720px (divisible by 32)
  • Frames: 1 - 257 (formula: N * 8 + 1)
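
These constraints can be applied automatically by snapping a requested size to the nearest valid value. The helpers below are a hypothetical sketch based only on the rules listed above (multiples of 32, frame counts of the form 8 * N + 1). Note that 720 is not a multiple of 32, so the largest height that satisfies both the range and the divisibility rule is 704, which matches the default.

```python
def snap_dimension(value, low, high, multiple=32):
    """Round to the nearest multiple of `multiple` that lies inside [low, high]."""
    smallest = -(-low // multiple) * multiple  # first multiple >= low
    largest = (high // multiple) * multiple    # last multiple <= high
    snapped = round(value / multiple) * multiple
    return max(smallest, min(largest, snapped))


def snap_num_frames(value, max_frames=257):
    """Round to the nearest frame count of the form 8 * N + 1, with N >= 0."""
    n = round((value - 1) / 8)
    n = max(0, min((max_frames - 1) // 8, n))
    return 8 * n + 1


if __name__ == "__main__":
    width = snap_dimension(1000, 256, 1280)
    height = snap_dimension(720, 256, 720)
    frames = snap_num_frames(120)
    print(width, height, frames)
    print(f"clip length at 30 fps: {frames / 30:.2f} s")
```

At the default 30 fps, the default 121 frames correspond to roughly four seconds of video.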

💡 Tips for Best Results

Text-to-Video

  • Be specific about motion, lighting, and camera movement
  • Use descriptive language: "slowly", "dramatic", "cinematic"
  • Start with lower resolutions for faster iteration
  • Avoid overly complex or contradictory prompts

Image-to-Video

  • Use conditioning strength 0.7-0.9 for natural motion
  • Clear, high-quality input images work best
  • Describe the desired motion explicitly
  • Avoid very low conditioning strength (<0.5)

General Tips

  • Use negative prompts to avoid unwanted elements
  • Adjust guidance scale: lower (2-4) for creativity, higher (5-8) for accuracy
  • More inference steps = better quality but slower
  • Use consistent seeds to reproduce results
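
One way to apply the guidance-scale and seed tips together is to hold the seed fixed and vary only the guidance scale, so that differences between outputs come from that one setting. The sketch below only prints the commands. `guidance_sweep` is a hypothetical helper, and the flag names are those listed in the Parameters Guide.

```python
import shlex


def guidance_sweep(prompt, seed, scales, steps=50):
    """Yield one inference.py command per guidance scale, all sharing the same seed."""
    for scale in scales:
        yield ["python", "inference.py", "--prompt", prompt, "--seed", str(seed),
               "--guidance_scale", str(scale), "--num_inference_steps", str(steps)]


if __name__ == "__main__":
    for cmd in guidance_sweep("A beautiful sunset over the ocean", seed=42,
                              scales=[2.0, 3.0, 5.0, 8.0]):
        print(shlex.join(cmd))
```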

πŸ“ Project Structure

LTX-FastVideo-ZeroGPU_Optimized/
├── inference.py              # Main inference script
├── app.py                    # Gradio web interface
├── configs/                  # Configuration files
│   └── ltxv-13b-0.9.7-dev.yaml
├── ltx_video/                # Core LTX Video modules
│   ├── models/
│   ├── pipelines/
│   └── schedulers/
├── examples/                 # Example outputs
│   ├── PtV/                  # Text-to-Video (PtV) examples
│   └── ItV/                  # Image-to-Video examples
└── requirements.txt          # Python dependencies

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License; see the LICENSE file for details.


πŸ™ Credits & Acknowledgments


🔗 Links


πŸ› Known Issues & Limitations

  • Very high resolutions (>1280Γ—720) require significant VRAM
  • CPU inference is extremely slow (GPU strongly recommended)
  • Long prompts (>77 tokens) may be truncated
  • Some complex motions may not be fully captured

Made with ❤️ by DeepRat for the community

⭐ Star us on GitHub. It helps!
