@deanq deanq commented Nov 10, 2025

Flash CLI Polish: Simplified Skeleton, Async Support, and Bug Fixes

Summary

Comprehensive improvements to the Flash CLI focused on simplifying the developer experience, fixing critical bugs, and removing unnecessary complexity. This PR makes Flash more approachable for new users while maintaining all core functionality and adding robust async function support.

Key Improvements

Simplified Skeleton Template (New!)

What Changed:

  • Replaced the complex ML/data-processing examples with simple "hello world" style examples demonstrating the @remote decorator
  • Restructured workers: workers/example/ → workers/gpu/, workers/interface/ → workers/cpu/

GPU Worker (workers/gpu/endpoint.py):

@remote(resource_config=gpu_config, dependencies=["torch"])
async def gpu_hello(input_data: dict) -> dict:
    """Simple GPU worker with GPU detection."""
    import torch

    # Query device details only when CUDA is available;
    # get_device_name(0) and get_device_properties(0) raise otherwise.
    available = torch.cuda.is_available()
    props = torch.cuda.get_device_properties(0) if available else None

    # Returns GPU info: name, count, memory
    return {
        "status": "success",
        "message": input_data.get("message", "Hello from GPU!"),
        "gpu_info": {
            "available": available,
            "name": props.name if props else None,
            "count": torch.cuda.device_count(),
            "memory_gb": props.total_memory / (1024**3) if props else None,
        },
    }

CPU Worker (workers/cpu/endpoint.py):

@remote(resource_config=cpu_config)
async def cpu_hello(input_data: dict) -> dict:
    """Simple CPU worker."""
    return {
        "status": "success",
        "message": input_data.get("message", "Hello from CPU!"),
        "worker_type": "CPU"
    }

Removed Conda Environment Management (New!)

What Changed:

  • Removed all conda-related imports and utilities
  • Removed --no-env flag from flash init
  • Removed automatic conda environment creation
  • Removed REQUIRED_PACKAGES constant
  • Updated all documentation to use standard pip workflow

Updated Documentation

CLI Documentation (cli/docs/):

  • Removed all conda references
  • Updated flash-init.md with new structure
  • Updated README.md with simplified examples
  • Updated test endpoints to /gpu/hello and /cpu/hello

Skeleton README:

  • Reduced from 620 → 252 lines
  • Focused on core concepts (Remote Execution, Resource Scaling)
  • Clear GPU/CPU type lists
  • Simple development workflow
  • Removed marketing language per guidelines

In-Place Initialization

Features:

  • Initialize in current directory: flash init or flash init .
  • Create new directory: flash init project-name
  • Automatic conflict detection with file listing
  • Interactive confirmation before overwriting
  • --force flag to skip confirmation
  • Smart project naming from directory

Manual Testing

# Test flash init in empty directory
mkdir -p /tmp/test-project && cd /tmp/test-project && flash init
# Creates skeleton with gpu/ and cpu/ workers

# Test flash build with uv (run from the project root created above)
flash build
# Falls back to 'uv pip' when pip is missing from the venv; finds workers in subdirectories

# Test async functions
# Both sync and async @remote functions work

# Test GPU detection
python -m workers.gpu.endpoint
# Returns GPU info (name: "RTX 4090", memory: 23.53 GB)

Reviewers: Please pay special attention to:

  1. Simplified skeleton template - Is it easier for new users?
  2. Async function support - Critical bug fix
  3. UV pip fallback - Modern workflow compatibility
  4. Documentation clarity - Did we achieve simplification goals?

deanq added 4 commits November 7, 2025 15:07

This activates support for deploying load-balancer-based endpoints. Previously, only queue-based endpoints were covered.

- Add comprehensive README.md with complete documentation
- Add example GPU worker with @remote decorator
- Add example CPU interface with todo list functions
- Add FastAPI main.py entry point
- Add .env template with commented values
- Add .flashignore and .gitignore
- Add requirements.txt with dependencies
- Add skeleton.py with conflict detection and template creation

Enables flash init to initialize projects in the current directory with
interactive overwrite warnings.

Usage:
- `flash init` → Initialize in current directory
- `flash init .` → Initialize in current directory
- `flash init <name>` → Create new directory (existing behavior)

Features:
- Automatic conflict detection for existing files
- Interactive confirmation prompt with file list
- `--force` flag to skip prompts
- Smart project naming (uses current dir name when in-place)
- Conditional "Next steps" (omits cd when in current dir)
- Separate success messages for each mode

Fixed critical documentation errors in code examples:
- Correct GPU Types
- Correct CPU Types

Copilot AI left a comment

Pull Request Overview

This PR introduces comprehensive flash init functionality with project skeleton templates, in-place initialization support, and conflict detection. It adds accurate GPU/CPU type documentation, template-based project creation, and improved UX for different initialization modes.

Key Changes:

  • Added ServerlessType enum to distinguish between queue-based (QB) and load-balancer (LB) serverless endpoints
  • Implemented in-place initialization allowing flash init in current directory with conflict detection
  • Created complete project template structure with GPU workers, CPU interfaces, and comprehensive documentation
  • Refactored skeleton creation from hardcoded strings to file-based templates with substitution support
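
For readers unfamiliar with the QB/LB distinction mentioned above, an enum of this kind might look like the sketch below. The member names QB and LB come from the review summary; the string values and the `describe` helper are assumptions for illustration and may not match serverless.py.

```python
from enum import Enum


class ServerlessType(str, Enum):
    """Kind of serverless endpoint (string values here are assumed)."""

    QB = "QB"  # queue-based: requests are queued and picked up by workers
    LB = "LB"  # load-balancer: requests are routed directly to workers


def describe(endpoint_type: ServerlessType) -> str:
    """Hypothetical helper that returns a human-readable label."""
    labels = {
        ServerlessType.QB: "queue-based",
        ServerlessType.LB: "load-balancer",
    }
    return labels[endpoint_type]
```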

Reviewed Changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `src/tetra_rp/core/resources/serverless.py` | Added ServerlessType enum with QB/LB variants and integrated it into ServerlessResource |
| `src/tetra_rp/core/resources/__init__.py` | Exported ServerlessType for public API use |
| `src/tetra_rp/__init__.py` | Added ServerlessType to package exports |
| `src/tetra_rp/cli/main.py` | Refactored command registration from direct imports to inline function wrappers, made project_name optional |
| `src/tetra_rp/cli/commands/init.py` | Implemented in-place initialization with conflict detection and conditional messaging |
| `src/tetra_rp/cli/utils/skeleton.py` | Replaced hardcoded templates with file-based template system and added conflict detection |
| `src/tetra_rp/cli/utils/skeleton_template/main.py` | FastAPI application entry point template with worker router registration |
| `src/tetra_rp/cli/utils/skeleton_template/workers/example/endpoint.py` | GPU worker template using LiveServerless with RTX 4090 configuration |
| `src/tetra_rp/cli/utils/skeleton_template/workers/example/__init__.py` | GPU worker router template demonstrating queue-based handler pattern |
| `src/tetra_rp/cli/utils/skeleton_template/workers/interface/endpoint.py` | CPU interface template with multiple remote functions |
| `src/tetra_rp/cli/utils/skeleton_template/workers/interface/__init__.py` | CPU interface router template demonstrating load-balancer pattern |
| `src/tetra_rp/cli/utils/skeleton_template/requirements.txt` | Minimal dependency specification for new projects |
| `src/tetra_rp/cli/utils/skeleton_template/README.md` | Comprehensive documentation covering architecture, deployment, API reference, and troubleshooting |
| `src/tetra_rp/cli/utils/skeleton_template/.env` | Environment variable template with commented defaults |
| `src/tetra_rp/cli/utils/skeleton_template/.gitignore` | Standard Python/IDE/OS ignore patterns |
| `src/tetra_rp/cli/utils/skeleton_template/.flashignore` | Flash build-specific ignore patterns |
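
A file-based template system with substitution, as described for skeleton.py, could work along the lines of this sketch. It is an assumption-laden illustration: the function name `create_from_template` and the `${project_name}` placeholder syntax are hypothetical, and the actual substitution mechanism in the PR is not shown in this conversation.

```python
import shutil
from pathlib import Path
from string import Template


def create_from_template(
    template_dir: Path, project_dir: Path, values: dict[str, str]
) -> list[Path]:
    """Copy every template file into project_dir, filling in placeholders."""
    written = []
    for src in sorted(template_dir.rglob("*")):
        if not src.is_file():
            continue
        dest = project_dir / src.relative_to(template_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        try:
            text = src.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8 text: copy the file unchanged.
            shutil.copy2(src, dest)
        else:
            # safe_substitute leaves unknown or malformed placeholders as-is.
            dest.write_text(Template(text).safe_substitute(values), encoding="utf-8")
        written.append(dest)
    return written
```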

@deanq deanq changed the base branch from main to deanq/ae-1102-flash-support-for-load-balancer-based-endpoints November 10, 2025 06:00
deanq added 2 commits November 9, 2025 22:06
…on README

Address Copilot review feedback on PR 110:

**Quick Start fixes:**
- GPU endpoint: /example/process → /example/
- CPU endpoint: /interface/todo/list → /interface/list
- Fix request model: {"task": "..."} → {"data": "..."}

**API Reference fixes:**
- GPU Worker: /example/process → /example/
- GPU request: {"input_data": {...}} → {"data": "string"}
- CPU list: /interface/todo/list → /interface/list
- CPU add: /interface/todo/add → /interface/add
- CPU delete: DELETE /interface/todo/delete → POST /interface/delete
- Request bodies: {"item": "..."} → {"data": "..."}
- Response format: direct strings → {"result": "string"}

All endpoints now match actual FastAPI router implementation.

Add `if __name__ == "__main__"` test block to CPU interface endpoint
to match the pattern in GPU worker endpoint.

Features:
- Test all three interface functions in parallel using asyncio.gather()
- Demonstrates concurrent execution of remote CPU functions
- Provides example of running endpoint directly for testing

Users can now test CPU interface by running:
python workers/interface/endpoint.py
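
A test block of the kind this commit describes might look like the sketch below. The three functions are local stand-ins with hypothetical names and return values. In the skeleton they would be decorated with @remote and run on remote CPU workers; here they run locally so the example stays self-contained.

```python
import asyncio


async def todo_list() -> dict:
    """Stand-in for a remote CPU function that lists items."""
    return {"result": "no items"}


async def todo_add(data: str) -> dict:
    """Stand-in for a remote CPU function that adds an item."""
    return {"result": f"added {data}"}


async def todo_delete(data: str) -> dict:
    """Stand-in for a remote CPU function that deletes an item."""
    return {"result": f"deleted {data}"}


async def main() -> list[dict]:
    # gather() schedules all three coroutines concurrently and returns
    # their results in the order the coroutines were passed in.
    return await asyncio.gather(
        todo_list(),
        todo_add("buy milk"),
        todo_delete("buy milk"),
    )


if __name__ == "__main__":
    for result in asyncio.run(main()):
        print(result)
```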

@KAJdev KAJdev left a comment

the example seems a bit convoluted. if this is supposed to be a skeleton for starting projects, shouldn't it be minimal? Think of cargo init's 3-line hello world main.rs, or wrangler init's 10-line default worker script.

Base automatically changed from deanq/ae-1102-flash-support-for-load-balancer-based-endpoints to main November 11, 2025 04:50
@rambo-runpod

some feedback and qs. i was able to deploy a serverless endpoint and tested out flash init. loved that we can provision infra and have remote code execution all within flash :)

  • flash init is not creating requirements.txt, README.md, .env for me when running locally
  • should we create a requirements.txt or a pyproject.toml instead since we're using uv. or we can give the user the option to specify which pkg mgr they're using and generate the corresponding file?
  • are we planning to have flash/tetra_rp sdk manage state? for example, if i want to update the workersMax for an existing serverless endpoint rerunning my python script should update that config. maybe this is what the deploy command is for?
  • should we standardize on snake case naming convention for params? we use it for the remote decorator but not for the resource objects
  • similar sentiment to zeke, i think flash init should spin up a boilerplate main.py file with the sdk imported with either the worker example inline or we can prompt in the cli to see what they want to spin up (pods, sls, network vols, etc)
  • not related to flash init but more to the project as whole, have we explored separating the functionality of deploying infra vs code execution on infra? it may make sense for a pattern like having the cli flash deploy handle deployments and flash run for remote execution.

@rambo-runpod

another thought just now as we were discussing flash: though lower priority, it would be good to show pricing each time we deploy new resources

jhcipar commented Nov 12, 2025

Similar comments as above I think, I also just wonder if, in the spirit of making the entrypoint for Flash as dead simple as possible, we have pointers towards running a command where we execute worker code directly instead of first standing up the API server and sending requests to it

I know the end goal is the FastAPI server, but it feels like a big potential barrier to getting your head wrapped around how this works if you have to absorb the concept of LiveServerless and making routes with an API server, etc. right off the bat

In other words I'm not opposed to that being part of the init skeleton, but the simplest unit that makes Flash feel magical is the ability to run function code on remote infra in a way that feels effortless and like how I'd write normal code. If I immediately have to think about an API server and then make requests to it instead of just running some function code, that feels like a potential place where we introduce complexity up front when we could move it backwards in the journey a little bit

also - not part of this PR probably but is there a way we could use uv inside of the flash app? Right now it feels weird to have uv as the "outer" package manager but then the expectation is that you use conda for a flash project

deanq added 11 commits November 13, 2025 16:38
Reverted aggressive lazy-loading patterns that broke IDE reference linking
and type checking. Performance is maintained through upstream boto3
lazy-loading in runpod-python.

Changes:
- Restored direct imports in __init__.py (removed __getattr__ pattern)
- Removed get_runpod() lazy-loading function
- Fixed ResourceManager to load resources at init
- Added Typer decorators to command functions for proper CLI signatures
- Improved flash init overwrite behavior with user prompts
- Updated tests to handle new initialization pattern

Breaking: None - all CLI commands work identically
Performance: Maintained with upstream boto3 fix (0.6s cold start)
Developer Experience: IDE navigation, autocomplete, and type checking restored

Support both `def` and `async def` function definitions in the `@remote` decorator.

Changes:
- Modified AST parser to recognize both FunctionDef and AsyncFunctionDef nodes
- Added textwrap.dedent() to handle indented function definitions
- Updated client.py docstring with async/sync examples
- Added comprehensive test suite for async function support

Fixes ValueError when wrapping async functions with the `@remote` decorator.

Support both `def` and `async def` function definitions in the `@remote` decorator.

Changes:
- Modified AST parser to recognize both FunctionDef and AsyncFunctionDef nodes
- Added textwrap.dedent() to handle indented function definitions
- Updated client.py docstring with async/sync examples
- Added comprehensive test suite for async function support

Fixes ValueError when wrapping async functions with the remote decorator.

…rator' of https://github.com/runpod/tetra-rp into deanq/ae-1469-bug-support-async-and-non-for-remote-decorator

Fixes ValueError when using the @remote decorator with async functions by:
- Adding inspect.unwrap() to get original function from wrapper
- Supporting ast.AsyncFunctionDef in addition to ast.FunctionDef
- Adding early dedenting to handle functions defined in classes/methods

This enables proper source extraction for both sync and async decorated
functions, resolving the "Could not find function definition" error.
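
The three steps in this fix (unwrap, dedent, accept both node types) can be sketched as below. The function names are hypothetical and this is not the code in client.py; it only illustrates the technique using the standard-library calls the commit message names.

```python
import ast
import inspect
import textwrap


def find_function_node(source: str, name: str) -> ast.AST:
    """Find a sync or async function definition by name in source text."""
    # Dedent first so functions defined inside classes or methods still parse.
    tree = ast.parse(textwrap.dedent(source))
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            return node
    raise ValueError(f"Could not find function definition for {name!r}")


def find_definition_of(func) -> ast.AST:
    """Locate the definition of a (possibly decorated) function object."""
    # unwrap() follows __wrapped__ back to the original function, so the
    # source of the user's function is read rather than a wrapper's.
    original = inspect.unwrap(func)
    return find_function_node(inspect.getsource(original), original.__name__)
```

Checking only `ast.FunctionDef` is what makes an `async def` invisible to the search, which matches the "Could not find function definition" error described above.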
Simplify skeleton template to focus on basic patterns:

**GPU Worker (workers/gpu/):**
- Simple hello-world response with GPU detection
- Returns GPU name, count, and memory via PyTorch
- Minimal dependencies (only torch)
- Clear demonstration of @remote decorator

**CPU Worker (workers/cpu/):**
- Simple hello-world response
- No complex data processing
- Minimal dependencies (no pandas/numpy)
- Clear demonstration of CpuLiveServerless

**README Updates:**
- Simplified documentation focusing on Tetra patterns
- Removed complex ML/data processing examples
- Added clear dependency management section
- Updated API examples to /hello endpoints

**Changes:**
- Renamed workers/example/ → workers/gpu/
- Simplified gpu_hello() to return GPU info
- Simplified cpu_hello() to basic response
- Updated router endpoints from /matrix and /process to /hello
- Reduced README from 620 lines to 252 lines (-368 lines)

This makes the skeleton more approachable for new users learning
the @remote decorator pattern without the cognitive overhead of
understanding PyTorch matrix operations or pandas data analysis.

Update project structure display in flash init command output to show
the current skeleton structure:
- workers/example/ → workers/gpu/
- workers/interface/ → workers/cpu/

This matches the skeleton template refactoring from commit 590df46.

Simplify flash init command by removing all conda-related functionality:

**Removed:**
- Conda utility imports
- REQUIRED_PACKAGES list
- --no-env flag
- Conda environment creation and package installation logic
- Conditional output based on conda environment status

**Simplified:**
- Next steps always show: pip install, add API key, flash run
- Users manage their own virtual environments (venv, conda, poetry, etc.)
- Reduced file from 191 lines to 116 lines (-75 lines)

This removes unnecessary complexity and lets users choose their preferred
environment management tool.

Update CLI documentation to reflect removal of conda environment management:

**flash-init.md:**
- Removed --no-env option
- Removed conda environment creation documentation
- Updated project structure to show gpu/ and cpu/ workers
- Simplified next steps to use pip install

**README.md:**
- Removed conda activate from quick start
- Removed --no-env from flash init options
- Updated project structure to match skeleton template
- Updated test endpoints to /gpu/hello and /cpu/hello

Documentation now matches the simplified init command behavior.

Change glob pattern from '*.py' to '**/*.py' in extract_remote_dependencies()
to recursively search worker subdirectories.

This fixes the build command to work with the new skeleton structure where
workers are organized in subdirectories (workers/gpu/, workers/cpu/) instead
of directly in the workers/ directory.

The build command can now find and extract dependencies from @remote
decorators in workers/gpu/endpoint.py and workers/cpu/endpoint.py.
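
The difference between the two patterns can be seen with pathlib, as in this small sketch. The helper name `find_worker_files` is hypothetical, and extract_remote_dependencies() may use a different globbing API.

```python
from pathlib import Path


def find_worker_files(workers_dir: Path, recursive: bool = True) -> list[str]:
    """Return Python files under workers_dir as sorted relative paths."""
    # '*.py' matches only files directly inside workers_dir;
    # '**/*.py' also descends into subdirectories such as gpu/ and cpu/.
    pattern = "**/*.py" if recursive else "*.py"
    return sorted(p.relative_to(workers_dir).as_posix() for p in workers_dir.glob(pattern))
```

With a layout of workers/__init__.py, workers/gpu/endpoint.py, and workers/cpu/endpoint.py, the non-recursive pattern finds only the top-level file, so the endpoints in the subdirectories are never scanned.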
Add detailed error messages when pip check fails:
- Show stderr output from pip --version command
- Show exception details if subprocess fails
- Add helpful hint to install pip with ensurepip

This helps users diagnose why pip detection is failing instead of
showing a generic 'pip not found' error.

Detect and use 'uv pip' as fallback when 'python -m pip' is not available.
This handles uv-created virtual environments that don't include pip by default.

The build command now:
1. Tries python -m pip first
2. Falls back to uv pip if pip not in venv
3. Shows a note when using uv pip
4. Provides helpful install instructions if neither works

Fixes build errors in uv environments without pip installed.
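
The fallback order listed in this commit could be sketched as follows. The function names and the error text are hypothetical, and the real build command may probe for the tools differently.

```python
import shutil
import subprocess
import sys
from typing import Optional


def pip_available(python: str = sys.executable) -> bool:
    """True if `python -m pip --version` exits successfully."""
    try:
        result = subprocess.run(
            [python, "-m", "pip", "--version"], capture_output=True, text=True
        )
    except OSError:
        return False
    return result.returncode == 0


def select_pip_command(
    has_pip: bool, uv_path: Optional[str], python: str = sys.executable
) -> list[str]:
    """Pick the installer prefix: pip first, then uv pip, else fail with a hint."""
    if has_pip:
        return [python, "-m", "pip"]
    if uv_path:
        # uv-created virtual environments do not include pip by default.
        return [uv_path, "pip"]
    raise RuntimeError(
        "Neither pip nor uv was found. Try `python -m ensurepip --upgrade` or install uv."
    )
```

A caller would combine the two as `select_pip_command(pip_available(), shutil.which("uv"))` and append the install arguments to the returned list.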

deanq commented Nov 14, 2025

I have updated this branch so that the init skeleton is simple. I would recommend the following steps to get started:

mkdir myapp
cd myapp
uv init
uv venv --python 3.11
uv add git+https://github.com/runpod/tetra-rp/@deanq/ae-1249-improve-cli
# flash commands...

Thanks, @KAJdev @rambo-runpod @jhcipar @PranjalJain-1

@deanq deanq requested review from KAJdev and Copilot November 14, 2025 10:48

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 31 out of 33 changed files in this pull request and generated 4 comments.


deanq commented Nov 14, 2025

some feedback and qs. i was able to deploy a serverless endpoint and tested out flash init. loved that we can provision infra and have remote code execution all within flash :)

  • flash init is not creating requirements.txt, README.md, .env for me when running locally

This should be fixed now

  • should we create a requirements.txt or a pyproject.toml instead since we're using uv. or we can give the user the option to specify which pkg mgr they're using and generate the corresponding file?

Yes, this is addressed with the latest changes

  • are we planning to have flash/tetra_rp sdk manage state? for example, if i want to update the workersMax for an existing serverless endpoint rerunning my python script should update that config. maybe this is what the deploy command is for?

Yes, that's a separate task and PR by Jacob

  • should we standardize on snake case naming convention for params? we use it for the remote decorator but not for the resource objects

This is tricky. The reason for the non-pythonic naming has to do with our backend that uses camelCase. We want to use Pydantic's (de)serializer with as little maintenance as possible. If we have to add a snake-to-camel and back translator, we're just adding unnecessary overhead.

  • similar sentiment to zeke, i think flash init should spin up a boilerplate main.py file with the sdk imported with either the worker example inline or we can prompt in the cli to see what they want to spin up (pods, sls, network vols, etc)

This has been addressed with the latest changes. I would start with an easy skeleton and encourage the use of our separate examples repo for deep-dive.

  • not related to flash init but more to the project as whole, have we explored separating the functionality of deploying infra vs code execution on infra? it may make sense for a pattern like having the cli flash deploy handle deployments and flash run for remote execution.

Yes, flash build and deploy will put these all up on our serverless platform. The flash run is basically a local preview of how that works.

…ences

- Remove workers/interface/ directory from skeleton template (was supposed to be removed but wasn't)
- Fix main.py home endpoint to reference correct paths: /gpu/hello and /cpu/hello (was /gpu/matrix and /cpu/process)
- Addresses PR feedback about skeleton complexity
- Change main.py default port from 8000 to 8888 to match flash run command
- Ensures consistency between skeleton template and CLI behavior
- Documentation already uses port 8888 correctly
- Addresses Copilot feedback about port inconsistency
Temporarily disable build and deploy commands with coming soon message. This release focuses on core init and run functionality, with build and deploy features scheduled for future releases.

        actual_project_name = project_dir.name
    else:
        # Create new directory
        project_dir = Path(project_name)

I really like this behavior! I feel like with most cli tool inits I don't like when the default behavior is to create a subfolder inside of my working directory, so it's great to support both that and generating a named project folder

@deanq deanq merged commit 155d6ee into main Nov 14, 2025
7 checks passed
@deanq deanq deleted the deanq/ae-1249-improve-cli branch November 14, 2025 19:05