feat(cli): Add flash init with project skeleton template and in-place initialization #110

deanq · 2025-11-10T05:52:55Z

Flash CLI Polish: Simplified Skeleton, Async Support, and Bug Fixes

Summary

This PR improves the Flash CLI by simplifying the developer experience, fixing critical bugs, and removing unnecessary complexity. It makes Flash more approachable for new users while keeping all core functionality, and it adds support for async functions in the @remote decorator.

Key Improvements

Simplified Skeleton Template (New!)

What Changed:

Replaced the complex ML/data-processing examples with simple "hello world" style examples demonstrating the @remote decorator
Restructured workers: workers/example/ → workers/gpu/, added workers/cpu/

GPU Worker (workers/gpu/endpoint.py):

@remote(resource_config=gpu_config, dependencies=["torch"])
async def gpu_hello(input_data: dict) -> dict:
    """Simple GPU worker with GPU detection."""
    import torch

    # Query device details only when CUDA is available;
    # get_device_name(0) raises on a machine without a GPU.
    available = torch.cuda.is_available()

    # Returns GPU info: name, count, memory
    return {
        "status": "success",
        "message": input_data.get("message", "Hello from GPU!"),
        "gpu_info": {
            "available": available,
            "name": torch.cuda.get_device_name(0) if available else None,
            "count": torch.cuda.device_count(),
            "memory_gb": (
                torch.cuda.get_device_properties(0).total_memory / (1024**3)
                if available
                else None
            ),
        },
    }

CPU Worker (workers/cpu/endpoint.py):

@remote(resource_config=cpu_config)
async def cpu_hello(input_data: dict) -> dict:
    """Simple CPU worker."""
    return {
        "status": "success",
        "message": input_data.get("message", "Hello from CPU!"),
        "worker_type": "CPU"
    }

Removed Conda Environment Management (New!)

What Changed:

Removed all conda-related imports and utilities
Removed --no-env flag from flash init
Removed automatic conda environment creation
Removed REQUIRED_PACKAGES constant
Updated all documentation to use standard pip workflow

Updated Documentation

CLI Documentation (cli/docs/):

Removed all conda references
Updated flash-init.md with new structure
Updated README.md with simplified examples
Updated test endpoints to /gpu/hello and /cpu/hello

Skeleton README:

Reduced from 620 → 252 lines
Focused on core concepts (Remote Execution, Resource Scaling)
Clear GPU/CPU type lists
Simple development workflow
Removed marketing language per guidelines

In-Place Initialization

Features:

Initialize in current directory: flash init or flash init .
Create new directory: flash init project-name
Automatic conflict detection with file listing
Interactive confirmation before overwriting
--force flag to skip confirmation
Smart project naming from directory
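
The directory handling and conflict check listed above could be sketched roughly as follows. This is an illustration only: the names `resolve_project_dir`, `find_conflicts`, and `TEMPLATE_FILES` are hypothetical, and the actual logic in `init.py` and `skeleton.py` may differ.

```python
from pathlib import Path
from typing import Optional

# Hypothetical subset of skeleton files, for illustration only.
TEMPLATE_FILES = ["main.py", "requirements.txt", "README.md", ".env"]


def resolve_project_dir(project_name: Optional[str], cwd: Path) -> tuple[Path, bool]:
    """Return (target directory, is_in_place) for a `flash init` argument."""
    if project_name is None or project_name == ".":
        # `flash init` and `flash init .` both target the current directory.
        return cwd, True
    # `flash init <name>` targets a new directory below the current one.
    return cwd / project_name, False


def find_conflicts(project_dir: Path, template_files: list[str]) -> list[str]:
    """List the template files that already exist in the target directory."""
    return [name for name in template_files if (project_dir / name).exists()]
```

With this shape, the project name for an in-place init can be taken from `project_dir.name`, and a non-empty conflict list is what would trigger the confirmation prompt unless `--force` is given.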

Manual Testing

# Test flash init in empty directory
cd /tmp/test-project && flash init
# Creates skeleton with gpu/ and cpu/ workers

# Test flash build with uv
cd /tmp/test-project && flash build
# Uses 'uv pip' automatically, finds workers in subdirectories

# Test async functions
# Both sync and async @remote functions work

# Test GPU detection
python -m workers.gpu.endpoint
# Returns GPU info (name: "RTX 4090", memory: 23.53 GB)

Reviewers: Please pay special attention to:

Simplified skeleton template - Is it easier for new users?
Async function support - Critical bug fix
UV pip fallback - Modern workflow compatibility
Documentation clarity - Did we achieve simplification goals?

This enables deployment of load-balancer-based endpoints. Previously, only queue-based endpoints were supported.

- Add comprehensive README.md with complete documentation
- Add example GPU worker with @remote decorator
- Add example CPU interface with todo list functions
- Add FastAPI main.py entry point
- Add .env template with commented values
- Add .flashignore and .gitignore
- Add requirements.txt with dependencies
- Add skeleton.py with conflict detection and template creation

Enables flash init to initialize projects in the current directory with interactive overwrite warnings.

Usage:
- `flash init` → Initialize in current directory
- `flash init .` → Initialize in current directory
- `flash init <name>` → Create new directory (existing behavior)

Features:
- Automatic conflict detection for existing files
- Interactive confirmation prompt with file list
- `--force` flag to skip prompts
- Smart project naming (uses current dir name when in-place)
- Conditional "Next steps" (omits cd when in current dir)
- Separate success messages for each mode

Fixed critical documentation errors in code examples:
- Correct GPU Types
- Correct CPU Types

Copilot

Pull Request Overview

This PR introduces comprehensive flash init functionality with project skeleton templates, in-place initialization support, and conflict detection. It adds accurate GPU/CPU type documentation, template-based project creation, and improved UX for different initialization modes.

Key Changes:

Added ServerlessType enum to distinguish between queue-based (QB) and load-balancer (LB) serverless endpoints
Implemented in-place initialization allowing flash init in current directory with conflict detection
Created complete project template structure with GPU workers, CPU interfaces, and comprehensive documentation
Refactored skeleton creation from hardcoded strings to file-based templates with substitution support
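
As a rough illustration of the first item, an enum that distinguishes the two endpoint kinds might look like the sketch below. Only the name `ServerlessType` and the QB/LB variants come from the review summary; the member values and the `describe` helper are assumptions made for this example.

```python
from enum import Enum


class ServerlessType(str, Enum):
    """Endpoint kinds named in the review summary; the values are assumed."""

    QB = "QB"  # queue-based
    LB = "LB"  # load-balancer


def describe(kind: ServerlessType) -> str:
    """Hypothetical helper that maps a variant to a readable label."""
    labels = {
        ServerlessType.QB: "queue-based",
        ServerlessType.LB: "load-balancer",
    }
    return labels[kind]
```

Subclassing `str` keeps the members easy to serialize, which is one common reason to choose this shape; whether the real enum does so is not stated here.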

Reviewed Changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| `src/tetra_rp/core/resources/serverless.py` | Added ServerlessType enum with QB/LB variants and integrated it into ServerlessResource |
| `src/tetra_rp/core/resources/__init__.py` | Exported ServerlessType for public API use |
| `src/tetra_rp/__init__.py` | Added ServerlessType to package exports |
| `src/tetra_rp/cli/main.py` | Refactored command registration from direct imports to inline function wrappers, made project_name optional |
| `src/tetra_rp/cli/commands/init.py` | Implemented in-place initialization with conflict detection and conditional messaging |
| `src/tetra_rp/cli/utils/skeleton.py` | Replaced hardcoded templates with file-based template system and added conflict detection |
| `src/tetra_rp/cli/utils/skeleton_template/main.py` | FastAPI application entry point template with worker router registration |
| `src/tetra_rp/cli/utils/skeleton_template/workers/example/endpoint.py` | GPU worker template using LiveServerless with RTX 4090 configuration |
| `src/tetra_rp/cli/utils/skeleton_template/workers/example/__init__.py` | GPU worker router template demonstrating queue-based handler pattern |
| `src/tetra_rp/cli/utils/skeleton_template/workers/interface/endpoint.py` | CPU interface template with multiple remote functions |
| `src/tetra_rp/cli/utils/skeleton_template/workers/interface/__init__.py` | CPU interface router template demonstrating load-balancer pattern |
| `src/tetra_rp/cli/utils/skeleton_template/requirements.txt` | Minimal dependency specification for new projects |
| `src/tetra_rp/cli/utils/skeleton_template/README.md` | Comprehensive documentation covering architecture, deployment, API reference, and troubleshooting |
| `src/tetra_rp/cli/utils/skeleton_template/.env` | Environment variable template with commented defaults |
| `src/tetra_rp/cli/utils/skeleton_template/.gitignore` | Standard Python/IDE/OS ignore patterns |
| `src/tetra_rp/cli/utils/skeleton_template/.flashignore` | Flash build-specific ignore patterns |
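
The file-based template system with substitution mentioned for `skeleton.py` could work along the lines of this sketch. The function name `render_template_tree` and the `{{key}}` placeholder syntax are assumptions for illustration; the PR does not show the real placeholder format.

```python
from pathlib import Path


def render_template_tree(
    template_dir: Path, target_dir: Path, substitutions: dict[str, str]
) -> list[Path]:
    """Copy every file under template_dir into target_dir, filling placeholders."""
    created = []
    for source in sorted(template_dir.rglob("*")):
        if source.is_dir():
            continue
        destination = target_dir / source.relative_to(template_dir)
        destination.parent.mkdir(parents=True, exist_ok=True)
        text = source.read_text(encoding="utf-8")
        # Hypothetical placeholder syntax: {{key}} is replaced by its value.
        for key, value in substitutions.items():
            text = text.replace("{{" + key + "}}", value)
        destination.write_text(text, encoding="utf-8")
        created.append(destination)
    return created
```

Keeping templates as real files rather than hardcoded strings means they can be linted and edited like any other project file.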


src/tetra_rp/cli/utils/skeleton_template/main.py

src/tetra_rp/cli/utils/skeleton_template/README.md

…on README

Address Copilot review feedback on PR 110:

**Quick Start fixes:**
- GPU endpoint: /example/process → /example/
- CPU endpoint: /interface/todo/list → /interface/list
- Fix request model: {"task": "..."} → {"data": "..."}

**API Reference fixes:**
- GPU Worker: /example/process → /example/
- GPU request: {"input_data": {...}} → {"data": "string"}
- CPU list: /interface/todo/list → /interface/list
- CPU add: /interface/todo/add → /interface/add
- CPU delete: DELETE /interface/todo/delete → POST /interface/delete
- Request bodies: {"item": "..."} → {"data": "..."}
- Response format: direct strings → {"result": "string"}

All endpoints now match actual FastAPI router implementation.

Add `if __name__ == "__main__"` test block to CPU interface endpoint to match the pattern in GPU worker endpoint.

Features:
- Test all three interface functions in parallel using asyncio.gather()
- Demonstrates concurrent execution of remote CPU functions
- Provides example of running endpoint directly for testing

Users can now test CPU interface by running: python workers/interface/endpoint.py
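
A test block of the kind this commit describes might look like the sketch below. The three functions are local stand-ins with hypothetical names and return values; in the template they would be `@remote`-decorated functions running on CPU workers.

```python
import asyncio


# Local stand-ins for the three remote interface functions (names are hypothetical).
async def todo_list() -> dict:
    await asyncio.sleep(0.01)  # simulate remote latency
    return {"result": "listed"}


async def todo_add(item: str) -> dict:
    await asyncio.sleep(0.01)
    return {"result": f"added {item}"}


async def todo_delete(item: str) -> dict:
    await asyncio.sleep(0.01)
    return {"result": f"deleted {item}"}


async def run_all() -> list:
    # gather() schedules all three coroutines concurrently and
    # returns their results in the order the awaitables were passed.
    return await asyncio.gather(todo_list(), todo_add("milk"), todo_delete("milk"))


if __name__ == "__main__":
    for result in asyncio.run(run_all()):
        print(result)
```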

KAJdev

the example seems a bit convoluted, if this is supposed to be a skeleton for starting projects, it should be minimal? Think cargo inits 3 line hello world main.rs, or wrangler inits 10 line default worker script.

src/tetra_rp/cli/main.py

rambo-runpod · 2025-11-11T16:13:15Z

some feedback and qs. i was able to deploy a serverless endpoint and tested out flash init. loved that we can provision infra and have remote code execution all within flash :)

flash init is not creating requirements.txt, README.md, .env for me when running locally
should we create a requirements.txt or a pyproject.toml instead since we're using uv. or we can give the user the option to specify which pkg mgr they're using and generate the corresponding file?
are we planning to have flash/tetra_rp sdk manage state? for example, if i want to update the workersMax for an existing serverless endpoint rerunning my python script should update that config. maybe this is what the deploy command is for?
should we standardize on snake case naming convention for params? we use it for the remote decorator but not for the resource objects
similar sentiment to zeke, i think flash init should spin up a boilerplate main.py file with the sdk imported with either the worker example inline or we can prompt in the cli to see what they want to spin up (pods, sls, network vols, etc)
not related to flash init but more to the project as whole, have we explored separating the functionality of deploying infra vs code execution on infra? it may make sense for a pattern like having the cli flash deploy handle deployments and flash run for remote execution.

rambo-runpod · 2025-11-12T16:50:55Z

another thought just now as we were discussing flash: though lower priority, it would be good to show pricing each time we deploy new resources

jhcipar · 2025-11-12T18:57:24Z

Similar comments as above I think, I also just wonder if, in the spirit of making the entrypoint for Flash as dead simple as possible, we have pointers towards running a command where we execute worker code directly instead of first standing up the API server and sending requests to it

I know the end goal is the fastapi server, but it feels like a big potential barrier to getting your head wrapped around how this works if you have to absorb the concept of LiveServerless and making routes with an api server, etc. right off the bat

In other words I'm not opposed to that being part of the init skeleton, but the simplest unit that makes Flash feel magical is the ability to run function code on remote infra in a way that feels effortless and like how I'd write normal code. If I immediately have to think about an API server and then make requests to it instead of just running some function code, that feels like a potential place where we introduce complexity up front when we could move it backwards in the journey a little bit

also - not part of this PR probably but is there a way we could use uv inside of the flash app? Right now it feels weird to have uv as the "outer" package manager but then the expectation is that you use conda for a flash project

Reverted aggressive lazy-loading patterns that broke IDE reference linking and type checking. Performance is maintained through upstream boto3 lazy-loading in runpod-python.

Changes:
- Restored direct imports in __init__.py (removed __getattr__ pattern)
- Removed get_runpod() lazy-loading function
- Fixed ResourceManager to load resources at init
- Added Typer decorators to command functions for proper CLI signatures
- Improved flash init overwrite behavior with user prompts
- Updated tests to handle new initialization pattern

Breaking: None - all CLI commands work identically
Performance: Maintained with upstream boto3 fix (0.6s cold start)
Developer Experience: IDE navigation, autocomplete, and type checking restored
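
For context, the `__getattr__` pattern that this commit removed is Python's module-level lazy-attribute hook (PEP 562). The sketch below shows the general pattern, not the code that was reverted; `make_lazy_module` and `demo_pkg` are hypothetical names. Because the attribute does not exist in the module namespace until first access, static tools such as IDE indexers cannot resolve it, which matches the breakage described above.

```python
import importlib
import types


def make_lazy_module(name: str, lazy_attrs: dict[str, str]) -> types.ModuleType:
    """Build a module whose listed attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str):
        # Called only when normal attribute lookup on the module fails.
        if attr in lazy_attrs:
            value = importlib.import_module(lazy_attrs[attr])
            setattr(module, attr, value)  # cache so later lookups are direct
            return value
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    module.__getattr__ = __getattr__
    return module


# Hypothetical package that exposes the stdlib json module lazily.
pkg = make_lazy_module("demo_pkg", {"json_lib": "json"})
```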

Support both `def` and `async def` function definitions in the @remote decorator.

Changes:
- Modified AST parser to recognize both FunctionDef and AsyncFunctionDef nodes
- Added textwrap.dedent() to handle indented function definitions
- Updated client.py docstring with async/sync examples
- Added comprehensive test suite for async function support

Fixes ValueError when wrapping async functions with the @remote decorator.

Support both `def` and `async def` function definitions in the `@remote` decorator.

Changes:
- Modified AST parser to recognize both FunctionDef and AsyncFunctionDef nodes
- Added textwrap.dedent() to handle indented function definitions
- Updated client.py docstring with async/sync examples
- Added comprehensive test suite for async function support

Fixes ValueError when wrapping async functions with the remote decorator.

…rator' of https://github.com/runpod/tetra-rp into deanq/ae-1469-bug-support-async-and-non-for-remote-decorator

Fixes ValueError when using the @remote decorator with async functions by:
- Adding inspect.unwrap() to get original function from wrapper
- Supporting ast.AsyncFunctionDef in addition to ast.FunctionDef
- Adding early dedenting to handle functions defined in classes/methods

This enables proper source extraction for both sync and async decorated functions, resolving the "Could not find function definition" error.
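
The dedent-then-parse approach described in this commit can be sketched as follows. `find_function_def` is a hypothetical name, and the sketch works on a source string directly; in the real flow the source would presumably come from something like `inspect.getsource(inspect.unwrap(func))`, which is omitted here to keep the example self-contained.

```python
import ast
import textwrap
from typing import Union

FunctionNode = Union[ast.FunctionDef, ast.AsyncFunctionDef]


def find_function_def(source: str, name: str) -> FunctionNode:
    """Locate a sync or async function definition by name in source text."""
    # Dedent first: source taken from inside a class or method is indented
    # and would otherwise fail to parse.
    tree = ast.parse(textwrap.dedent(source))
    for node in ast.walk(tree):
        # Accept both node types; checking only FunctionDef misses `async def`.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            return node
    raise ValueError(f"Could not find function definition for {name!r}")
```

Decorators in the parsed text are never evaluated, so the source can reference names such as `remote` without them being importable at parse time.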

…rator' into deanq/ae-1249-improve-cli

Simplify skeleton template to focus on basic patterns:

**GPU Worker (workers/gpu/):**
- Simple hello-world response with GPU detection
- Returns GPU name, count, and memory via PyTorch
- Minimal dependencies (only torch)
- Clear demonstration of @remote decorator

**CPU Worker (workers/cpu/):**
- Simple hello-world response
- No complex data processing
- Minimal dependencies (no pandas/numpy)
- Clear demonstration of CpuLiveServerless

**README Updates:**
- Simplified documentation focusing on Tetra patterns
- Removed complex ML/data processing examples
- Added clear dependency management section
- Updated API examples to /hello endpoints

**Changes:**
- Renamed workers/example/ → workers/gpu/
- Simplified gpu_hello() to return GPU info
- Simplified cpu_hello() to basic response
- Updated router endpoints from /matrix and /process to /hello
- Reduced README from 620 lines to 252 lines (-368 lines)

This makes the skeleton more approachable for new users learning the @remote decorator pattern without the cognitive overhead of understanding PyTorch matrix operations or pandas data analysis.

Update project structure display in flash init command output to show the current skeleton structure:
- workers/example/ → workers/gpu/
- workers/interface/ → workers/cpu/

This matches the skeleton template refactoring from commit 590df46.

Simplify flash init command by removing all conda-related functionality:

**Removed:**
- Conda utility imports
- REQUIRED_PACKAGES list
- --no-env flag
- Conda environment creation and package installation logic
- Conditional output based on conda environment status

**Simplified:**
- Next steps always show: pip install, add API key, flash run
- Users manage their own virtual environments (venv, conda, poetry, etc.)
- Reduced file from 191 lines to 116 lines (-75 lines)

This removes unnecessary complexity and lets users choose their preferred environment management tool.

Update CLI documentation to reflect removal of conda environment management:

**flash-init.md:**
- Removed --no-env option
- Removed conda environment creation documentation
- Updated project structure to show gpu/ and cpu/ workers
- Simplified next steps to use pip install

**README.md:**
- Removed conda activate from quick start
- Removed --no-env from flash init options
- Updated project structure to match skeleton template
- Updated test endpoints to /gpu/hello and /cpu/hello

Documentation now matches the simplified init command behavior.

Change glob pattern from '*.py' to '**/*.py' in extract_remote_dependencies() to recursively search worker subdirectories.

This fixes the build command to work with the new skeleton structure where workers are organized in subdirectories (workers/gpu/, workers/cpu/) instead of directly in the workers/ directory. The build command can now find and extract dependencies from @remote decorators in workers/gpu/endpoint.py and workers/cpu/endpoint.py.
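
The effect of the pattern change can be seen with `pathlib` alone. The helper name `find_worker_files` below is hypothetical and only wraps the two patterns named in the commit; it is not the body of `extract_remote_dependencies()`.

```python
from pathlib import Path


def find_worker_files(workers_dir: Path, recursive: bool = True) -> list[Path]:
    """Return Python files under workers_dir, optionally searching subdirectories."""
    # '*.py' matches only files directly inside workers_dir;
    # '**/*.py' also descends into subdirectories such as gpu/ and cpu/.
    pattern = "**/*.py" if recursive else "*.py"
    return sorted(workers_dir.glob(pattern))
```

With the skeleton layout, the non-recursive pattern finds nothing in `workers/gpu/` or `workers/cpu/`, which is why the build step previously missed those endpoints.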

Add detailed error messages when pip check fails:
- Show stderr output from pip --version command
- Show exception details if subprocess fails
- Add helpful hint to install pip with ensurepip

This helps users diagnose why pip detection is failing instead of showing a generic 'pip not found' error.

Detect and use 'uv pip' as fallback when 'python -m pip' is not available. This handles uv-created virtual environments that don't include pip by default.

The build command now:
1. Tries python -m pip first
2. Falls back to uv pip if pip not in venv
3. Shows a note when using uv pip
4. Provides helpful install instructions if neither works

Fixes build errors in uv environments without pip installed.
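
The fallback order described above could be sketched like this. The function names are hypothetical, and detecting uv by looking for the executable on PATH is an assumption of this sketch rather than something the commit message states.

```python
import shutil
import subprocess
import sys
from typing import Optional


def pip_available(python: str = sys.executable) -> bool:
    """Check whether `python -m pip --version` succeeds for this interpreter."""
    try:
        result = subprocess.run(
            [python, "-m", "pip", "--version"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return False
    return result.returncode == 0


def select_pip_command(has_pip: bool, has_uv: bool) -> Optional[list[str]]:
    """Prefer `python -m pip`, fall back to `uv pip`, else report nothing usable."""
    if has_pip:
        return [sys.executable, "-m", "pip"]
    if has_uv:
        return ["uv", "pip"]
    return None


def detect_pip_command() -> Optional[list[str]]:
    # Assumption: uv counts as available when its executable is on PATH.
    return select_pip_command(pip_available(), shutil.which("uv") is not None)
```

Separating the decision (`select_pip_command`) from the environment probes keeps the fallback order easy to test without touching the real environment.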

deanq · 2025-11-14T10:47:53Z

I have updated this branch so that the init skeleton is simple. I would recommend the following steps to get started:

mkdir myapp
cd myapp
uv init
uv venv --python 3.11
uv add git+https://github.com/runpod/tetra-rp/@deanq/ae-1249-improve-cli
# flash commands...

Thanks, @KAJdev @rambo-runpod @jhcipar @PranjalJain-1

Copilot

Pull Request Overview

Copilot reviewed 31 out of 33 changed files in this pull request and generated 4 comments.


src/tetra_rp/cli/utils/skeleton_template/main.py

src/tetra_rp/cli/docs/README.md

src/tetra_rp/cli/docs/flash-init.md

src/tetra_rp/cli/utils/skeleton_template/workers/gpu/endpoint.py

deanq · 2025-11-14T11:09:26Z

> some feedback and qs. i was able to deploy a serverless endpoint and tested out flash init. loved that we can provision infra and have remote code execution all within flash :)
>
> flash init is not creating requirements.txt, README.md, .env for me when running locally

This should be fixed now

> should we create a requirements.txt or a pyproject.toml instead since we're using uv. or we can give the user the option to specify which pkg mgr they're using and generate the corresponding file?

Yes, this is addressed with the latest changes

> are we planning to have flash/tetra_rp sdk manage state? for example, if i want to update the workersMax for an existing serverless endpoint rerunning my python script should update that config. maybe this is what the deploy command is for?

Yes, that's a separate task and PR by Jacob

> should we standardize on snake case naming convention for params? we use it for the remote decorator but not for the resource objects

This is tricky. The reason for the non-pythonic naming has to do with our backend that uses camelCase. We want to use Pydantic's (de)serializer with as little maintenance as possible. If we have to add a snake-to-camel and back translator, we're just adding unnecessary overhead.

> similar sentiment to zeke, i think flash init should spin up a boilerplate main.py file with the sdk imported with either the worker example inline or we can prompt in the cli to see what they want to spin up (pods, sls, network vols, etc)

This has been addressed with the latest changes. I would start with an easy skeleton and encourage the use of our separate examples repo for deep-dive.

> not related to flash init but more to the project as whole, have we explored separating the functionality of deploying infra vs code execution on infra? it may make sense for a pattern like having the cli flash deploy handle deployments and flash run for remote execution.

Yes, flash build and deploy will put these all up on our serverless platform. The flash run is basically a local preview of how that works.

…ences

- Remove workers/interface/ directory from skeleton template (was supposed to be removed but wasn't)
- Fix main.py home endpoint to reference correct paths: /gpu/hello and /cpu/hello (was /gpu/matrix and /cpu/process)
- Addresses PR feedback about skeleton complexity

- Change main.py default port from 8000 to 8888 to match flash run command
- Ensures consistency between skeleton template and CLI behavior
- Documentation already uses port 8888 correctly
- Addresses Copilot feedback about port inconsistency

Temporarily disable build and deploy commands with coming soon message. This release focuses on core init and run functionality, with build and deploy features scheduled for future releases.

jhcipar · 2025-11-14T17:19:41Z

src/tetra_rp/cli/commands/init.py

+        actual_project_name = project_dir.name
+    else:
+        # Create new directory
+        project_dir = Path(project_name)


I really like this behavior! I feel like with most cli tool inits I don't like when the default behavior is to create a subfolder inside of my working directory, so it's great to support both that and generating a named project folder

deanq added 4 commits November 7, 2025 15:07

feat(resources): Support for Serverless.type QB|LB

2059d97

This enables deployment of load-balancer-based endpoints. Previously, only queue-based endpoints were supported.

fix(cli): correct GPU and CPU types in skeleton README

dce72e9

Fixed critical documentation errors in code examples:
- Correct GPU Types
- Correct CPU Types

deanq requested review from KAJdev, Copilot, jhcipar and rambo-runpod November 10, 2025 05:52

Copilot AI reviewed Nov 10, 2025


chore(cli): lint and format fixes

5d62eb3

deanq changed the base branch from main to deanq/ae-1102-flash-support-for-load-balancer-based-endpoints November 10, 2025 06:00

deanq added 2 commits November 9, 2025 22:06

KAJdev reviewed Nov 10, 2025


Base automatically changed from deanq/ae-1102-flash-support-for-load-balancer-based-endpoints to main November 11, 2025 04:50

Merge branch 'main' into deanq/ae-1249-improve-cli

6f0aaf7

deanq added 11 commits November 13, 2025 16:38

perf: reduce the load footprint from runpod library

53795b9

build: pegged latest main branch of runpod-python for now

aed00e3

test: fix test for lazy-loaded runpod

8cbb166

fix: ruff format

62715d8

Merge branch 'deanq/ae-1469-bug-support-async-and-non-for-remote-deco…

698596c

…rator' of https://github.com/runpod/tetra-rp into deanq/ae-1469-bug-support-async-and-non-for-remote-decorator

Merge branch 'deanq/ae-1469-bug-support-async-and-non-for-remote-deco…

37ecaa1

…rator' into deanq/ae-1249-improve-cli

chore: quality check

a089d42

deanq added 7 commits November 14, 2025 02:16

deanq requested review from KAJdev and Copilot November 14, 2025 10:48

Copilot AI reviewed Nov 14, 2025


deanq added 3 commits November 14, 2025 03:30

feat(cli): limit flash CLI to init and run commands for this release

5a27411

Temporarily disable build and deploy commands with coming soon message. This release focuses on core init and run functionality, with build and deploy features scheduled for future releases.

jhcipar approved these changes Nov 14, 2025


deanq and others added 2 commits November 14, 2025 10:52

Merge branch 'main' into deanq/ae-1249-improve-cli

8061b4c

Merge branch 'main' into deanq/ae-1249-improve-cli

2f29c5e

deanq merged commit 155d6ee into main Nov 14, 2025
7 checks passed

deanq deleted the deanq/ae-1249-improve-cli branch November 14, 2025 19:05

runpod-release-please-bot bot mentioned this pull request Nov 14, 2025

chore: release 0.16.0 #114

Merged
