@edmundmiller
Contributor

Overview

This PR adds a Pulumi Infrastructure as Code project for tracking the iGenomes S3 infrastructure. The project imports the existing ngi-igenomes bucket, which is listed in the AWS Open Data Registry, into Pulumi state, and provides reference documentation and integration points for the nf-core infrastructure ecosystem.

What's Added

Core Infrastructure

  • 📦 S3 Bucket Import: Imports existing ngi-igenomes bucket with protective measures
  • 🔐 1Password Integration: Secure credential management using the same patterns as AWSMegatests
  • 🏗️ Modular Architecture: Clean separation of concerns (config, providers, infrastructure, utils)
  • 🛡️ Read-Only Tracking: Protected resources with extensive ignore_changes for properties we can't manage
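
As a rough illustration of the import-and-protect approach described above, the sketch below shows how an existing bucket can be adopted with Pulumi's `import_`, `protect`, and `ignore_changes` resource options. It is not the code from this PR: the helper names and the list of ignored properties are assumptions chosen for the example.

```python
from typing import Any

BUCKET_NAME = "ngi-igenomes"

# Illustrative only (assumption): properties owned by the bucket's maintainers
# that this stack should not try to reconcile. Pulumi expects the camelCase
# schema names here.
IGNORED_PROPERTIES = ["acl", "policy", "tags"]


def import_option_kwargs(bucket_name: str = BUCKET_NAME) -> dict[str, Any]:
    """Keyword arguments for pulumi.ResourceOptions when adopting the bucket."""
    return {
        "import_": bucket_name,  # adopt the existing bucket by its ID (the bucket name)
        "protect": True,         # make Pulumi refuse to delete the resource
        "ignore_changes": list(IGNORED_PROPERTIES),
    }


def import_igenomes_bucket(provider=None):
    """Declare the bucket resource; must be called from inside a Pulumi program."""
    # Imported lazily so the helper above can be inspected without Pulumi installed.
    import pulumi
    import pulumi_aws as aws

    opts = pulumi.ResourceOptions(provider=provider, **import_option_kwargs())
    return aws.s3.Bucket("ngi-igenomes", bucket=BUCKET_NAME, opts=opts)
```

Keeping the options in a plain dictionary makes the protective settings easy to review and unit test separately from the Pulumi engine.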

Documentation (1500+ lines total)

  • README.md (320+ lines): Comprehensive user documentation with usage examples
  • CLAUDE.md (360+ lines): AI assistant context with architecture and troubleshooting
  • CONTEXT7.md: AWS SDK documentation integration guide
  • SETUP_VERIFICATION.md: Step-by-step verification and testing guide

Project Structure

pulumi/igenomes/
├── __main__.py              # Main Pulumi program
├── Pulumi.yaml              # Project configuration
├── pyproject.toml           # Python dependencies
├── .envrc                   # 1Password credential loading
├── test_setup.sh            # Automated verification script
└── src/                     # Modular source code
    ├── config/              # Environment variable loading
    ├── providers/           # AWS provider configuration
    ├── infrastructure/      # S3 bucket import logic
    └── utils/               # Centralized constants
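
To make the role of `src/config/` more concrete, here is a minimal sketch of environment variable loading with validation. It assumes the `.envrc` exports the standard AWS variable names; the `AwsConfig` class, the `load_config` function, and the `eu-west-1` default region are hypothetical and not taken from the PR.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Standard AWS credential variables; assumed to be exported by .envrc.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


@dataclass(frozen=True)
class AwsConfig:
    access_key_id: str
    secret_access_key: str
    region: str


def load_config(env: Mapping[str, str] = os.environ) -> AwsConfig:
    """Read credentials from the environment and fail early if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
            + ". Has direnv loaded .envrc?"
        )
    return AwsConfig(
        access_key_id=env["AWS_ACCESS_KEY_ID"],
        secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
        # Default region is an assumption for this example.
        region=env.get("AWS_DEFAULT_REGION", "eu-west-1"),
    )
```

Passing the environment as a parameter lets the loader be tested with a plain dictionary instead of real credentials.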

Features

Security

  • ✅ All credentials from 1Password (never in git)
  • ✅ Protected resources prevent accidental deletion
  • ✅ Read-only tracking for AWS Open Data bucket
  • ✅ Comprehensive .gitignore for secrets

Integration

  • ✅ Follows AWSMegatests project patterns
  • ✅ Uses shared S3 backend (nf-core-pulumi-state)
  • ✅ Consistent 1Password integration approach
  • ✅ Ready for nf-core infrastructure ecosystem

Outputs

The project exports rich metadata for integration:

  • Bucket information (name, ARN, region, description)
  • Usage examples (S3 URIs, CLI commands, Nextflow configs)
  • Documentation links (AWS Open Data, GitHub, docs)
  • Resource tracking (Pulumi IDs for state management)
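
The exported metadata could be assembled along the following lines. This is a sketch only: the output key names, the `build_outputs` helper, and the `eu-west-1` region are assumptions for illustration, and the actual exports in the PR may differ.

```python
BUCKET_NAME = "ngi-igenomes"
REGION = "eu-west-1"  # assumption; check against the real bucket


def build_outputs(bucket_name: str = BUCKET_NAME, region: str = REGION) -> dict[str, str]:
    """Build a flat mapping of stack outputs describing the bucket."""
    s3_uri = f"s3://{bucket_name}/"
    return {
        "bucket_name": bucket_name,
        # S3 bucket ARNs carry no region or account ID.
        "bucket_arn": f"arn:aws:s3:::{bucket_name}",
        "bucket_region": region,
        "s3_uri": s3_uri,
        # Unauthenticated listing of a public bucket with the AWS CLI.
        "cli_example": f"aws s3 ls --no-sign-request --region {region} {s3_uri}",
    }
```

Inside a Pulumi program, each entry could then be published with `pulumi.export(key, value)` so that other stacks can read it.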

About iGenomes

The ngi-igenomes bucket is hosted on AWS, listed in the AWS Open Data Registry, and contains:

  • ~5TB of reference genome data
  • 30+ species reference genomes
  • Pre-built indices for alignment tools (BWA, STAR, Bowtie2, etc.)
  • Annotation files in GTF and BED formats
  • GATK bundles for human genomes

Usage

Once merged, users can:

cd pulumi/igenomes

# Authenticate with 1Password
eval "$(op signin)"

# Enable direnv
direnv allow

# Preview infrastructure import
direnv exec . uv run pulumi preview

# Import resources
direnv exec . uv run pulumi up

Testing

The project includes an automated test script:

bash test_setup.sh

This verifies:

  • 1Password authentication
  • AWS credential loading
  • UV and dependencies
  • Pulumi backend connection
  • Stack initialization
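
For readers who want a quick preflight before running the script, the first few checks can be approximated in Python. The sketch below only tests that the command-line tools named in this PR are on `PATH`; it does not verify authentication, the Pulumi backend, or stack state, and the `find_missing_tools` helper is hypothetical rather than part of `test_setup.sh`.

```python
import shutil
from typing import Callable, Iterable, Optional

# CLI tools referenced in this PR: 1Password CLI, direnv, uv, and Pulumi.
REQUIRED_TOOLS = ("op", "direnv", "uv", "pulumi")


def find_missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on PATH, in the order given."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools found on PATH")
```

The `which` parameter exists so the check can be exercised in tests without depending on what is installed locally.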

Why This Matters

  1. Infrastructure as Code: Tracks important nf-core infrastructure in version control
  2. Documentation: Provides comprehensive reference for iGenomes usage
  3. Integration: Ready for use in other nf-core Pulumi projects
  4. Consistency: Follows established patterns from AWSMegatests
  5. Security: Demonstrates proper credential management with 1Password

Related Projects

  • AWSMegatests: AWS Batch compute environments for testing
  • pulumi_state: S3 backend for Pulumi state storage
  • seqera_platform: Seqera Platform workspace management

🤖 Generated with Claude Code

@edmundmiller edmundmiller requested review from a team and maxulysse as code owners October 28, 2025 11:36
@edmundmiller edmundmiller force-pushed the feat/pulumi-igenomes-infrastructure branch from 1ec7204 to 3327e42 Compare October 28, 2025 15:42
Add comprehensive Pulumi project for managing iGenomes S3 infrastructure with proper import workflow and 1Password integration.

## Features

- S3 bucket import for ngi-igenomes (AWS Open Data Registry)
- Secure credential management via 1Password
- Modular architecture (config, providers, infrastructure, utils)
- Comprehensive documentation (README, CLAUDE.md, Context7 guide)
- Protected resources with read-only tracking approach
- Rich metadata exports for integration with nf-core ecosystem

## Project Structure

- `__main__.py`: Main Pulumi program with S3 import logic
- `src/`: Modular source code organization
  - `config/`: Environment variable loading and validation
  - `providers/`: AWS provider configuration
  - `infrastructure/`: S3 bucket import implementation
  - `utils/`: Centralized constants
- Documentation:
  - `README.md`: Comprehensive user documentation (320+ lines)
  - `CLAUDE.md`: AI assistant context (360+ lines)
  - `CONTEXT7.md`: AWS SDK documentation guide
  - `SETUP_VERIFICATION.md`: Verification and testing guide
- `test_setup.sh`: Automated setup verification script
- `.envrc`: 1Password credential loading configuration

## Security

- All credentials from 1Password (never in git)
- Protected resources prevent accidental deletion
- Read-only tracking for AWS Open Data bucket
- Extensive ignore_changes for properties we can't manage

## Integration

- Follows AWSMegatests project patterns
- Uses shared S3 backend (nf-core-pulumi-state)
- Consistent 1Password integration approach
- Ready for nf-core infrastructure ecosystem

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@edmundmiller edmundmiller force-pushed the feat/pulumi-igenomes-infrastructure branch from 3327e42 to 6b705f1 Compare October 28, 2025 15:44