Advanced AI-powered video generation using the LTX Video model
Demo: boxing_cat.mp4
- 🎬 Text-to-Video: Generate videos from text descriptions
- 🖼️ Image-to-Video: Animate static images with AI
- 🎞️ Video-to-Video: Transform and enhance existing videos
- 🎯 Multi-Conditioning: Apply multiple conditions at specific frames
- ⚡ High Performance: Optimized for CUDA with CPU fallback
- 🎨 Flexible Control: Fine-tune every aspect of generation
| Input | Output |
|---|---|
| Example 1 - Majestic Black Lion<br>Seed: 1363812591<br>Prompt: A beautiful and powerful black lion | 1363812591.mp4 |
| Example 2 - Snow-Capped Mountains<br>Seed: 3804031196<br>Prompt: A view from above of beautiful snow-capped mountains | 3804031196.mp4 |
| Example 3 - Tiger Attacking Wild Boar<br>Seed: 1397763684<br>Prompt: A tiger jumping/attacking a wild boar | 1397763684.mp4 |
| Input (Image) | Output (Video) |
|---|---|
|  | 1830526882.mp4 |
|  | 738317591.mp4 |
|  | 3858595085.mp4 |
|  | 4273030543.mp4 |
|  | 1747446564.mp4 |
```bash
# Clone the repository
git clone https://github.com/DeepRatAI/LTX-FastVideo-ZeroGPU_Optimized.git
cd LTX-FastVideo-ZeroGPU_Optimized

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```
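
Before the first run, it can help to confirm that PyTorch can see a GPU, since the project is optimized for CUDA and the CPU fallback is very slow. A minimal check, assuming requirements.txt installs PyTorch:

```bash
# Prints True plus the GPU name if CUDA is available, otherwise warns about the CPU fallback
python -c "import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU only: expect very slow inference')"
```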
--prompt "A beautiful sunset over the ocean" \
--height 704 \
--width 1216 \
--num_frames 121 \
--seed 42
```

Image-to-video with conditioning:

```bash
python inference.py \
--prompt "The person in the image starts walking" \
--conditioning_media_paths path/to/image.jpg \
--conditioning_strengths 0.8 \
--conditioning_start_frames 0 \
--height 704 \
--width 1216 \
--num_frames 121 \
--seed 42
```

| Parameter | Description | Default | Range |
|---|---|---|---|
| `--height` | Output video height | 704 | 256-720 |
| `--width` | Output video width | 1216 | 256-1280 |
| `--num_frames` | Number of frames | 121 | 1-257 |
| `--frame_rate` | FPS of output | 30 | 1-60 |
| `--seed` | Random seed | 171198 | Any integer |
| `--guidance_scale` | Prompt adherence | 3.0 | 1.0-20.0 |
| `--num_inference_steps` | Quality steps | 50 | 1-100 |

Conditioning options:

- `--conditioning_media_paths`: Path(s) to conditioning images/videos
- `--conditioning_strengths`: Strength of each condition (0.0-1.0)
- `--conditioning_start_frames`: Frame index where each condition starts
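
These three flags are parallel lists, so several conditions can be anchored at different frames (the multi-conditioning feature above). A minimal sketch with two conditioning images; the file paths are placeholders, and the space-separated list syntax is an assumption about how inference.py parses repeated values:

```bash
# Placeholder paths: condition frame 0 on one image and frame 64 on another
python inference.py \
--prompt "The scene gradually shifts from day to night" \
--conditioning_media_paths path/to/day.jpg path/to/night.jpg \
--conditioning_strengths 0.9 0.8 \
--conditioning_start_frames 0 64 \
--height 704 \
--width 1216 \
--num_frames 121
```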
- Base Model: LTX Video (Lightricks)
- Precision: Mixed (BF16/FP32)
- VAE: Causal Video Autoencoder
- Transformer: 3D Transformer with symmetric patchifier
- Scheduler: Rectified Flow
- GPU: NVIDIA GPU with 16GB+ VRAM (recommended)
- RAM: 32GB+ recommended
- Storage: 50GB+ for models
- Python: 3.10+
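
Most of that storage goes to the LTX-Video weights themselves. If you want to pre-fetch them instead of relying on whatever download path inference.py uses on first run, one option (not part of this repo's documented workflow) is the Hugging Face Hub CLI:

```bash
# Optional: pre-download the base model weights from the Hugging Face Hub
pip install -U "huggingface_hub[cli]"
huggingface-cli download Lightricks/LTX-Video
```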
- Width: 256px - 1280px (divisible by 32)
- Height: 256px - 720px (divisible by 32)
- Frames: 1 - 257 (formula: N * 8 + 1)
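
As a quick sanity check of the constraints above (dimensions divisible by 32, frame counts of the form N * 8 + 1):

```bash
# Valid frame counts follow N*8 + 1
for n in 3 7 15 32; do echo $((n * 8 + 1)); done   # 25 57 121 257
# Width and height must be divisible by 32 (a remainder of 0 means valid)
echo $((1216 % 32)) $((704 % 32))
```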
- Be specific about motion, lighting, and camera movement
- Use descriptive language: "slowly", "dramatic", "cinematic"
- Start with lower resolutions for faster iteration
- Avoid overly complex or contradictory prompts
- Use conditioning strength 0.7β0.9 for natural motion
- Clear, high-quality input images work best
- Describe the desired motion explicitly
- Avoid very low conditioning strength (<0.5)
- Use negative prompts to avoid unwanted elements
- Adjust guidance scale: lower (2-4) for creativity, higher (5-8) for accuracy (see the example after this list)
- More inference steps = better quality but slower
- Use consistent seeds to reproduce results
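
Putting several of these tips together, a draft-quality pass might use a lower resolution, a shorter clip, and fewer steps with a fixed seed, then scale back up once the prompt behaves as intended. The specific values below are just one reasonable starting point within the documented ranges:

```bash
# Quick draft: smaller output, shorter clip, fewer steps, fixed seed for reproducibility
python inference.py \
--prompt "A slow cinematic pan over snow-capped mountains at sunrise" \
--height 480 \
--width 704 \
--num_frames 57 \
--num_inference_steps 30 \
--guidance_scale 3.0 \
--seed 42
```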
```
LTX-FastVideo-ZeroGPU_Optimized/
├── inference.py                  # Main inference script
├── app.py                        # Gradio web interface
├── configs/                      # Configuration files
│   └── ltxv-13b-0.9.7-dev.yaml
├── ltx_video/                    # Core LTX Video modules
│   ├── models/
│   ├── pipelines/
│   └── schedulers/
├── examples/                     # Example outputs
│   ├── PtV/                      # Picture-to-Video examples
│   └── ItV/                      # Image-to-Video examples
└── requirements.txt              # Python dependencies
```
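
app.py provides the Gradio web interface. For local use it should be enough to run it directly and open the URL Gradio prints to the terminal (by default Gradio serves on http://127.0.0.1:7860, though app.py may configure this differently):

```bash
# Launch the Gradio web interface locally
python app.py
```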
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License; see the LICENSE file for details.
- LTX Video: Lightricks
- Model: Lightricks/LTX-Video
- Paper: LTX-Video: Realtime Video Latent Diffusion
- Interface: Built with Gradio
- Maintainer: DeepRat (solo)
- 🤗 Hugging Face Space
- 📦 Model Card
- 🐙 Original Repository
- 📄 Research Paper
- Very high resolutions (>1280×720) require significant VRAM
- CPU inference is extremely slow (GPU strongly recommended)
- Long prompts (>77 tokens) may be truncated
- Some complex motions may not be fully captured
Made with ❤️ by DeepRat for the Community
⭐ Star us on GitHub, it helps!




