For some reason, when we use an OpenAI client inside a function-based batch UDF, we can't add the concurrency parameter: we get a runtime error referring to pickle serialization. I ran into this while initially developing the batch UDF, and it took a while to reproduce, but it looks like others have run into it as well. I also started a [thread on slack](https://dist-data.slack.com/archives/C052CA6Q9N1/p1756400464828409) because I was unsure whether `daft.func` supports a `concurrency` argument.
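A minimal sketch of the reproduction, under the assumption that the decorator and `concurrency` keyword look roughly like this (the exact `daft.udf`/`daft.func` signature may vary by Daft version, and the model and column names are placeholders):

```python
import daft
from openai import OpenAI

# The client lives in the closure of a function-based batch UDF; it is this
# object that apparently trips up pickle serialization.
client = OpenAI()

# Assumption: the `concurrency` kwarg as written here -- the decorator shape
# may differ across Daft versions.
@daft.udf(return_dtype=daft.DataType.string(), concurrency=4)
def complete_batch(prompts: daft.Series):
    return [
        client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": p}],
        ).choices[0].message.content
        for p in prompts.to_pylist()
    ]

df = daft.from_pydict({"prompt": ["hello", "world"]})
# Expected: a runtime error referring to pickle serialization.
df.with_column("completion", complete_batch(df["prompt"])).collect()
```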
### Scaling headaches (Demonstrated in workload notebook)
The average user looking to leverage Daft for AI inference with a client (whether OpenAI or otherwise) will try either a row-wise UDF or a synchronous batch UDF. These implementations work at small scale but run into issues once users attempt to run them at 2,000+ rows. Regardless of how they arrive at the conclusion, they will eventually attempt to run their inference calls asynchronously, which produces non-blocking errors in the 200-1,500 row range.
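A sketch of the async pattern users typically end up with (the decorator shape, model name, and column names below are assumptions, not the workload notebook's actual code):

```python
import asyncio
import daft
from openai import AsyncOpenAI

@daft.udf(return_dtype=daft.DataType.string())
def classify_batch(prompts: daft.Series):
    client = AsyncOpenAI()

    async def classify_one(prompt: str) -> str:
        resp = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    async def run_all(items):
        # No throttling or retry logic: every row becomes an in-flight
        # request at once, which is where the non-blocking errors in the
        # 200-1,500 row range start to surface.
        return await asyncio.gather(*(classify_one(p) for p in items))

    return asyncio.run(run_all(prompts.to_pylist()))

df = daft.from_pydict({"prompt": ["row 1", "row 2"]})
df.with_column("label", classify_batch(df["prompt"])).collect()
```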