@lawrencejones
Member

@isaacseymour I think we were actually pointed at a commit that isn't on any branch, or attached to anything at all, which isn't ideal. I've taken the commit we were pointing at from the app (which is your last commit) and cherry-picked the optimisations on top.

I'm going to point the codebase at this until we actually fix things and figure out what our strategy is with upstream.

isaacseymour and others added 2 commits January 29, 2025 22:20
This replicates an issue we're seeing in the OpenAPIv3 generator, where
if two request bodies are structurally similar (i.e. same primitive
types), but not identical (in this example both have a single string
attribute with the same name, but with different enum validations), the
request body for those two methods points at the same type schema.

There's a (previously failing) test with:
- two types that are 'similar' (both have a string-type attr called
  `my_attr`), but not identical (they are different enums)
- two different services with the same method, each referencing
  a different type

The resulting OpenAPIv3 schema generates a single
`MethodEnumRequestBody` type, and uses it for the `requestBody` type of
both methods.

This is incorrect, since the two methods use different enums.

The fix here is to update the `hashAttribute` method to hash the
`.Validation` of the attribute, if it is set 🎉
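
To illustrate the idea behind the fix, here is a minimal sketch of a hash that folds the validation into the result. The `Attribute` and `Validation` types below are simplified, hypothetical stand-ins and not Goa's real definitions, and the actual `hashAttribute` may be structured quite differently.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Validation and Attribute are hypothetical, simplified stand-ins for the
// design types. They exist only to illustrate the hashing idea.
type Validation struct {
	Enum []string
}

type Attribute struct {
	Name       string
	Type       string
	Validation *Validation
}

// hashAttribute hashes the attribute's name and type, and also its
// validation when one is set, so that attributes differing only in their
// enum values produce different hashes.
func hashAttribute(a Attribute) uint64 {
	h := fnv.New64a()
	fmt.Fprintf(h, "name=%q;type=%q;", a.Name, a.Type)
	if a.Validation != nil {
		fmt.Fprintf(h, "enum=%q;", a.Validation.Enum)
	}
	return h.Sum64()
}

func main() {
	first := Attribute{Name: "my_attr", Type: "string", Validation: &Validation{Enum: []string{"a", "b"}}}
	second := Attribute{Name: "my_attr", Type: "string", Validation: &Validation{Enum: []string{"c", "d"}}}
	// The two attributes share a name and primitive type but not an enum,
	// so they should no longer be treated as the same schema.
	fmt.Println("same hash:", hashAttribute(first) == hashAttribute(second))
}
```

Without the `Validation` branch, both attributes above would hash identically, which is the collision the test reproduces.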

tl;dr: make codegen much faster through parallel writing of files

As a recap, the generation process has several stages:

1. **Compile temporary binary**: Goa creates a temporary Go program that
   imports the design package (which triggers package initialization and
   DSL execution via blank import)
2. **Execute binary**: The binary runs through multiple phases:
   - Package initialization (runs DSL definitions)
   - `eval.RunDSL()` - processes the DSL in 4 phases (execute, prepare,
     validate, finalize)
   - `generator.Generate()` - produces the actual Go files

Measuring codegen for the incident-io codebase on 3,036 generated files:

**Total time: 61 seconds**

Breakdown:
- build.Import: 117ms
- NewGenerator (packages.Load): 52ms
- Write (generate main.go): 14ms
- Compile (go get + go build): 3.6s
  - packages.Load: 47ms
  - go get: 514ms
  - go build: 3.0s
- **Run (execute binary): 52.2s** ⚠️ 85% of total time
  - Check eval.Context.Errors: <1ms
  - eval.RunDSL(): 105ms
  - **generator.Generate(): 51.6s** ⚠️ Main bottleneck
    - Stage 1 (Compute design roots): <1ms
    - Stage 2 (Compute gen package): 33ms
    - Stage 3 (Retrieve generators): <1ms
    - Stage 4 (Pre-generation plugins): <1ms
    - **Stage 5 (Generate files): 26.2s** (3 generators, sequential)
      - Generator 0: 7.2s → 1,438 files
      - Generator 1: 18.8s → 1,594 files
      - Generator 2: 0.2s → 4 files
    - Stage 6 (Post-generation plugins): <1ms
    - **Stage 7 (Write files): 32.1s** ⚠️ Biggest bottleneck (52% of total time)
    - Stage 8 (Compute filenames): 2ms

This commit tries optimising the biggest stage of this process by
changing file rendering from a sequential loop to a parallel worker pool:

```go
// Before: Sequential (32.1s for 3,036 files)
for _, f := range genfiles {
    filename, err := f.Render(dir)
    // ...
}

// After: Parallel with runtime.NumCPU() workers
numWorkers := runtime.NumCPU()
// Worker pool processes files concurrently
```

This reduced execution time from **61 seconds total** to **34 seconds
total**, an overall speedup of 1.8x.
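
For readers who want to see the shape of such a pool, here is a self-contained sketch under stated assumptions. The `renderer` interface, `renderAll`, and `fakeFile` are hypothetical names made up for illustration; only the `Render(dir)` call and `runtime.NumCPU()` come from the snippet above, and the actual change in the commit may differ.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// renderer is a hypothetical stand-in for a generated file. It mirrors the
// f.Render(dir) call in the snippet above.
type renderer interface {
	Render(dir string) (string, error)
}

// renderAll renders every file using numWorkers goroutines. It returns the
// filenames in input order, or the error of the earliest failing file in
// input order. It does not stop early when a file fails.
func renderAll(files []renderer, dir string, numWorkers int) ([]string, error) {
	if numWorkers < 1 {
		numWorkers = 1
	}
	names := make([]string, len(files))
	errs := make([]error, len(files))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < numWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is handed to exactly one worker, so the writes
			// to names[i] and errs[i] never overlap.
			for i := range jobs {
				names[i], errs[i] = files[i].Render(dir)
			}
		}()
	}
	for i := range files {
		jobs <- i
	}
	close(jobs)
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return names, nil
}

// fakeFile is a toy renderer that only computes a path.
type fakeFile struct{ name string }

func (f fakeFile) Render(dir string) (string, error) { return dir + "/" + f.name, nil }

func main() {
	files := []renderer{fakeFile{"a.go"}, fakeFile{"b.go"}, fakeFile{"c.go"}}
	names, err := renderAll(files, "gen", runtime.NumCPU())
	fmt.Println(names, err)
}
```

Writing results into preallocated slices by index keeps the output order deterministic without a mutex, which matters if later stages depend on the order of filenames.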

While here, I also tried parallelising Stage 5 (generator functions) but
hit infinite recursion in `AsObject()` when handling circular type
references concurrently. This is where I'd go for the next biggest
speed-up.
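
The root cause of the recursion in `AsObject()` is not described here, so the following is only a general sketch of one way to make a recursive walk over circular type references safe to run concurrently: keep the visited set local to each call rather than in shared state. `typeNode` and `countTypes` are hypothetical and unrelated to Goa's real types.

```go
package main

import (
	"fmt"
	"sync"
)

// typeNode is a hypothetical, simplified type: a name plus named fields
// that may point back at an ancestor, forming a cycle.
type typeNode struct {
	Name   string
	Fields map[string]*typeNode
}

// countTypes walks the graph from root and counts the distinct types it
// reaches. The seen set is local to each call, so concurrent walks over the
// same (read-only) graph share no mutable state.
func countTypes(root *typeNode) int {
	seen := make(map[*typeNode]bool)
	var walk func(n *typeNode)
	walk = func(n *typeNode) {
		if n == nil || seen[n] {
			return // already visited: this is what breaks the cycle
		}
		seen[n] = true
		for _, f := range n.Fields {
			walk(f)
		}
	}
	walk(root)
	return len(seen)
}

func main() {
	// Build a cycle: User -> Team -> User.
	user := &typeNode{Name: "User", Fields: map[string]*typeNode{}}
	team := &typeNode{Name: "Team", Fields: map[string]*typeNode{"owner": user}}
	user.Fields["team"] = team

	var wg sync.WaitGroup
	results := make([]int, 4)
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = countTypes(user)
		}(i)
	}
	wg.Wait()
	fmt.Println(results)
}
```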
@isaacseymour isaacseymour force-pushed the v3 branch 4 times, most recently from 1c59d19 to 24c1f32 Compare November 10, 2025 16:48

3 participants