-
Notifications
You must be signed in to change notification settings - Fork 3
Add ingest task definition #117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
This commit adds a task definition for ingestion. The task definition ARN is injected into the environments of tasks that require it. This is in preparation for supporting isolated ingestion for on premises deployments.
|
See polytomic/app#12381. |
|
@codex review |
|
@greptile review |
jpalawaga
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm no expert with this stuff, though it seems like mostly a cmd-c cmd-v of the existing service definitions, which makes sense to me.
| scheduler_log_group = module.ecs_log_groups["scheduler"].cloudwatch_log_group_name | ||
| env = merge(local.environment.env, { | ||
| INGEST_EXECUTOR_TASK_DEFINITION = aws_ecs_task_definition.ingest.arn, | ||
| INGEST_EXECUTOR_CONTAINER_NAME = "ingest" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, but where are these used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR adds infrastructure support for isolated data ingestion in Polytomic's on-premises ECS deployments. The changes introduce a new "ingest" task definition that follows the existing service patterns, complete with logging, resource configuration, and environment variable management. The ingest task definition ARN is injected into worker, sync, and scheduler services via INGEST_EXECUTOR_TASK_DEFINITION and INGEST_EXECUTOR_CONTAINER_NAME environment variables, enabling these services to dynamically spawn dedicated ingestion containers using ECS Run Task API. This architectural pattern separates compute-intensive ingestion workloads from core services, improving resource isolation and scaling flexibility. The implementation maintains consistency with existing task definitions while introducing new CPU/memory configuration variables with sensible defaults (2 vCPU, 8GB memory) that reflect the expected resource requirements for data ingestion operations.
Changed Files
| Filename | Score | Overview |
|---|---|---|
terraform/modules/ecs/logs.tf |
5/5 | Added "ingest" to the list of services requiring CloudWatch log groups |
terraform/modules/ecs/main.tf |
5/5 | Added ingest memory configuration to the locals environment block |
terraform/modules/ecs/README.md |
5/5 | Auto-generated documentation update for new ingest task definition and variables |
terraform/modules/ecs/vars.tf |
5/5 | Added CPU and memory configuration variables for the ingest container |
terraform/modules/ecs/task-definitions/ingest.json.tftpl |
4/5 | New ECS task definition template following established patterns with proper configuration |
terraform/modules/ecs/ecs-tasks.tf |
2/5 | Added ingest task definition but missing corresponding ECS service to deploy it |
Confidence score: 2/5
- This PR introduces a potentially incomplete implementation that may not function as intended
- Score lowered due to missing ECS service for the new ingest task definition, making the task unusable in practice, and potential environment variable merging issues in task definitions
- Pay close attention to
terraform/modules/ecs/ecs-tasks.tfwhich defines the task but provides no deployment mechanism
Sequence Diagram
sequenceDiagram
participant User
participant Terraform
participant AWS_ECS
participant Task_Definitions as Task Definitions
participant ECS_Services as ECS Services
participant Environment as Environment Variables
User->>Terraform: "Apply terraform configuration"
Terraform->>Task_Definitions: "Create ingest task definition"
Task_Definitions->>AWS_ECS: "Register aws_ecs_task_definition.ingest"
Note over Task_Definitions: Task definition includes:<br/>- CPU: polytomic_resource_ingest_cpu<br/>- Memory: polytomic_resource_ingest_memory<br/>- Container: ingest.json.tftpl
Terraform->>Environment: "Set INGEST_EXECUTOR_TASK_DEFINITION"
Environment->>Task_Definitions: "Inject ingest task ARN into worker"
Environment->>Task_Definitions: "Inject ingest task ARN into sync"
Environment->>Task_Definitions: "Inject ingest task ARN into scheduler"
Note over Environment: Environment variables added:<br/>INGEST_EXECUTOR_TASK_DEFINITION<br/>INGEST_EXECUTOR_CONTAINER_NAME
Terraform->>ECS_Services: "Update worker service with ingest task ARN"
Terraform->>ECS_Services: "Update sync service with ingest task ARN"
Terraform->>ECS_Services: "Update scheduler service with ingest task ARN"
AWS_ECS->>User: "Ingest task definition ready for isolated ingestion"
Additional Comments (1)
-
terraform/modules/ecs/ecs-tasks.tf, line 233-274 (link)logic: Missing ECS service for ingest task definition. The task definition is created but no
aws_ecs_serviceresource exists to run it, making the task unreachable.
6 files reviewed, 2 comments
| env = merge(local.environment.env, { | ||
| INGEST_EXECUTOR_TASK_DEFINITION = aws_ecs_task_definition.ingest.arn, | ||
| INGEST_EXECUTOR_CONTAINER_NAME = "ingest" | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Environment variable structure inconsistency. This creates env.env nesting which differs from other tasks that use flat environment variables.
Prompt To Fix With AI
This is a comment left during a code review.
Path: terraform/modules/ecs/ecs-tasks.tf
Line: 72:75
Comment:
**logic:** Environment variable structure inconsistency. This creates `env.env` nesting which differs from other tasks that use flat environment variables.
How can I resolve this? If you propose a fix, please make it concise.
This commit adds a task definition for ingestion. The task definition ARN is injected into the environments of tasks that require it. This is in preparation for supporting isolated ingestion for on premises deployments.