@dsubak
Contributor

@dsubak dsubak commented Nov 6, 2025

Description (What does it do?)

This is a first, rough cut at setting up a JupyterHub authoring environment. It's definitely not complete, but it provides a starting point.

How can this be tested?

Currently it can't be tested and probably doesn't work. Notably, I expect that we may need some changes to the APISIX and Keycloak configuration. However, I wanted to put this up as early as possible so I could get eyes on it.

I have not run this yet, but I want to run it against applications.jupyterhub.CI once it has had some preliminary review. I believe I'll need to run against infrastructure.aws.eks.applications.CI first to get the namespace provisioned.

The pulumi up command output is as follows.

     pulumi:pulumi:Stack                                                     ol-infrastructure-jupyterhub-application-applications.jupyterhub.CI                                                    
     ├─ ol:services:Vault:K8S:VaultDynamicSecret                             jupyterhub-app-db-creds-vaultdynamicsecret                                                                             
     │  └─ kubernetes:yaml/v2:ConfigGroup                                    OLVaultK8SSecret-jupyter-jupyterhub-app-db-creds                                                                       
     │     └─ kubernetes:secrets.hashicorp.com/v1beta1:VaultDynamicSecret    OLVaultK8SSecret-jupyter-jupyterhub-app-db-creds:jupyter/jupyterhub-app-db-creds                                       
     ├─ ol:infrastructure.aws.cloudwatch.OLCloudWatchAlarmRDS                jupyterhub-db-ci-FreeStorageSpace-OLCloudWatchAlarmSimpleRDSConfig                                                     
     │  └─ aws:cloudwatch:MetricAlarm                                        jupyterhub-db-ci-FreeStorageSpace-simple-rds-alarm                                                                     
 ~   ├─ pulumi:pulumi:StackReference                                         infrastructure.aws.policies                                                                                            
     ├─ ol:infrastructure:aws:database:OLAmazonDB                            jupyterhub-db-ci                                                                                                       
     │  ├─ aws:rds:ParameterGroup                                            jupyterhub-db-ci-postgres-parameter-group                                                                              
     │  └─ aws:rds:Instance                                                  jupyterhub-db-ci-postgres-instance                                                                                     
     ├─ aws:ec2:SecurityGroup                                                jupyterhub-db-security-group-jupyterhub-ci                                                                             
     ├─ ol:infrastructure.aws.cloudwatch.OLCloudWatchAlarmRDS                jupyterhub-db-ci-CPUUtilization-OLCloudWatchAlarmSimpleRDSConfig                                                       
     │  └─ aws:cloudwatch:MetricAlarm                                        jupyterhub-db-ci-CPUUtilization-simple-rds-alarm                                                                       
     ├─ ol:infrastructure.aws.cloudwatch.OLCloudWatchAlarmRDS                jupyterhub-db-ci-WriteLatency-OLCloudWatchAlarmSimpleRDSConfig                                                         
     │  └─ aws:cloudwatch:MetricAlarm                                        jupyterhub-db-ci-WriteLatency-simple-rds-alarm                                                                         
     ├─ ol:infrastructure.services.cert_manager:OLCertManagerCert            ol-jupyterhub-cert-manager-certificate-ci                                                                              
     │  ├─ kubernetes:cert-manager.io/v1:Certificate                         ol-cert-manager-certificate-jupyterhub-cert                                                                            
     │  └─ kubernetes:apisix.apache.org/v2:ApisixTls                         ol-cert-manager-apisix-tls-jupyterhub-cert                                                                             
     ├─ ol:services:Vault:K8S:ResourcesConfig                                jupyterhub                                                                                                             
     │  ├─ kubernetes:rbac.authorization.k8s.io/v1:ClusterRoleBinding        jupyterhub-vault-cluster-role-binding                                                                                  
     │  ├─ kubernetes:yaml/v2:ConfigGroup                                    jupyterhub-vso-resources                                                                                               
     │  │  ├─ kubernetes:secrets.hashicorp.com/v1beta1:VaultConnection       jupyterhub-vso-resources:jupyter/jupyterhub-vault-connection                                                           
     │  │  └─ kubernetes:secrets.hashicorp.com/v1beta1:VaultAuth             jupyterhub-vso-resources:jupyter/jupyterhub-auth                                                                       
     │  └─ kubernetes:core/v1:ServiceAccount                                 jupyterhub-vault-service-account                                                                                       
     ├─ vault:kubernetes:AuthBackendRole                                     ol-jupyterhub-vault-k8s-auth-backend-role-ci                                                                           
     ├─ kubernetes:helm.sh/v3:Release                                        jupyterhub-CI-application-helm-release                                                                                 
 ~   ├─ pulumi:pulumi:StackReference                                         infrastructure.aws.dns                                                                                                 
 -   ├─ ol:infrastructure:services:k8s:OLApisixOIDCResources                 ol-k8s-apisix-olapisixoidcresources-ci                                                                                 
 ~   ├─ pulumi:pulumi:StackReference                                         infrastructure.aws.network.CI                                                                                          
 ~   ├─ pulumi:pulumi:StackReference                                         infrastructure.consul.operations.CI                                                                                    
 ~   ├─ pulumi:pulumi:StackReference                                         infrastructure.vault.operations.CI                                                                                     
     ├─ ol:infrastructure.aws.cloudwatch.OLCloudWatchAlarmRDS                jupyterhub-db-ci-ReadLatency-OLCloudWatchAlarmSimpleRDSConfig                                                          
     │  └─ aws:cloudwatch:MetricAlarm                                        jupyterhub-db-ci-ReadLatency-simple-rds-alarm                                                                          
     ├─ ol:services:Vault:DatabaseBackend:postgresql                         jupyterhub                                                                                                             
     │  └─ vault:index:Mount                                                 jupyterhub-mount-point                                                                                                 
     │     └─ vault:database:SecretBackendConnection                         jupyterhub-database-connection                                                                                         
     │        ├─ vault:database:SecretBackendRole                            jupyterhub-database-role-app                                                                                           
     │        ├─ vault:database:SecretBackendRole                            jupyterhub-database-role-admin                                                                                         
     │        └─ vault:database:SecretBackendRole                            jupyterhub-database-role-readonly                                                                                      
     ├─ vault:index:Policy                                                   ol-jupyterhub-vault-policy-ci                                                                                          
 ~   ├─ pulumi:pulumi:StackReference                                         infrastructure.aws.eks.applications.CI                                                                                 
     ├─ pulumi:providers:vault                                               vault-provider                                                                                                         
     ├─ pulumi:providers:kubernetes                                          k8s-provider                                                                                                           
     ├─ ol:infrastructure:services:k8s:OLApisixRoute                         ol-jupyterhub-k8s-apisix-route-ci                                                                                      
     │  └─ kubernetes:apisix.apache.org/v2:ApisixRoute                       OLApisixRoute-ol-jupyterhub-k8s-apisix-route-ci                                                                        
     ├─ ol:infrastructure:services:k8s:OLApisixSharedPlugin                  ol-jupyterhub-external-service-apisix-plugins                                                                          
     │  └─ kubernetes:apisix.apache.org/v2:ApisixPluginConfig                OLApisixSharedPlugin-jupyterhub-ol-shared-plugins                                                                      
 ~   ├─ pulumi:pulumi:StackReference                                         implicit.infrastructure.monitoring                                                                                     
 +   ├─ ol:infrastructure.services.cert_manager:OLCertManagerCert            ol-jupyterhub-authoring-cert-manager-certificate-ci                                                                    
 +   │  ├─ kubernetes:cert-manager.io/v1:Certificate                         ol-cert-manager-certificate-jupyterhub-authoring-cert                                                                  
 +   │  └─ kubernetes:apisix.apache.org/v2:ApisixTls                         ol-cert-manager-apisix-tls-jupyterhub-authoring-cert                                                                   
 +   ├─ ol:infrastructure:services:k8s:OLApisixRoute                         ol-jupyterhub-authoring-k8s-apisix-route-ci                                                                            
 +   │  └─ kubernetes:apisix.apache.org/v2:ApisixRoute                       OLApisixRoute-ol-jupyterhub-authoring-k8s-apisix-route-ci                                                              
 +   ├─ ol:infrastructure:services:k8s:OLApisixSharedPlugin                  ol-jupyterhub-authoring-external-service-apisix-plugins                                                                
 +   │  └─ kubernetes:apisix.apache.org/v2:ApisixPluginConfig                OLApisixSharedPlugin-jupyterhub-authoring-ol-shared-plugins                                                            
 +   ├─ ol:infrastructure:services:k8s:OLApisixOIDCResources                 ol-k8s-apisix-jupyterhub-olapisixoidcresources-ci                                                                      
     │  └─ ol:services:Vault:K8S:VaultStaticSecret                           jupyterhub-oidc-secrets                                                                                                
     │     └─ kubernetes:yaml/v2:ConfigGroup                                 OLVaultK8SSecret-jupyter-ol-apisix-jupyterhub-oidc-secrets                                                             
     │        └─ kubernetes:secrets.hashicorp.com/v1beta1:VaultStaticSecret  OLVaultK8SSecret-jupyter-ol-apisix-jupyterhub-oidc-secrets:jupyter/ol-apisix-jupyterhub-oidc-secrets                   
 +   ├─ ol:infrastructure:services:k8s:OLApisixOIDCResources                 ol-k8s-apisix-jupyterhub-authoring-olapisixoidcresources-ci                                                            
 +   │  └─ ol:services:Vault:K8S:VaultStaticSecret                           jupyterhub-authoring-oidc-secrets                                                                                      
 +   │     └─ kubernetes:yaml/v2:ConfigGroup                                 OLVaultK8SSecret-jupyter-authoring-ol-apisix-jupyterhub-authoring-oidc-secrets                                         
 +   │        └─ kubernetes:secrets.hashicorp.com/v1beta1:VaultStaticSecret  OLVaultK8SSecret-jupyter-authoring-ol-apisix-jupyterhub-authoring-oidc-secrets:jupyter-authoring/ol-apisix-jupyterhub-a
 +   ├─ vault:index:Policy                                                   ol-jupyterhub-authoring-vault-policy-ci                                                                                
 +   ├─ ol:services:Vault:K8S:VaultDynamicSecret                             jupyterhub-authoring-app-db-creds-vaultdynamicsecret                                                                   
 +   │  └─ kubernetes:yaml/v2:ConfigGroup                                    OLVaultK8SSecret-jupyter-authoring-jupyterhub-authoring-app-db-creds                                                   
 +   │     └─ kubernetes:secrets.hashicorp.com/v1beta1:VaultDynamicSecret    OLVaultK8SSecret-jupyter-authoring-jupyterhub-authoring-app-db-creds:jupyter-authoring/jupyterhub-authoring-app-db-cred
 +   ├─ ol:services:Vault:DatabaseBackend:postgresql                         jupyterhub_authoring                                                                                                   
 +   │  └─ vault:index:Mount                                                 jupyterhub_authoring-mount-point                                                                                       
 +   │     └─ vault:database:SecretBackendConnection                         jupyterhub_authoring-database-connection                                                                               
 +   │        ├─ vault:database:SecretBackendRole                            jupyterhub_authoring-database-role-admin                                                                               
 +   │        ├─ vault:database:SecretBackendRole                            jupyterhub_authoring-database-role-readonly                                                                            
 +   │        └─ vault:database:SecretBackendRole                            jupyterhub_authoring-database-role-app                                                                                 
 +   ├─ ol:services:Vault:K8S:ResourcesConfig                                jupyterhub-authoring                                                                                                   
 +   │  ├─ kubernetes:yaml/v2:ConfigGroup                                    jupyterhub-authoring-vso-resources                                                                                     
 +   │  │  ├─ kubernetes:secrets.hashicorp.com/v1beta1:VaultAuth             jupyterhub-authoring-vso-resources:jupyter-authoring/jupyterhub-authoring-auth                                         
 +   │  │  └─ kubernetes:secrets.hashicorp.com/v1beta1:VaultConnection       jupyterhub-authoring-vso-resources:jupyter-authoring/jupyterhub-authoring-vault-connection                             
 +   │  ├─ kubernetes:rbac.authorization.k8s.io/v1:ClusterRoleBinding        jupyterhub-authoring-vault-cluster-role-binding                                                                        
 +   │  └─ kubernetes:core/v1:ServiceAccount                                 jupyterhub-authoring-vault-service-account                                                                             
 +   ├─ vault:kubernetes:AuthBackendRole                                     ol-jupyterhub-authoring-vault-k8s-auth-backend-role-ci                                                                 
 +   └─ kubernetes:helm.sh/v3:Release                                        jupyterhub-authoring-CI-application-helm-release  

Full details are in the attached file:
jupyterhub_authoring_pulumi_ouput.txt

The only resource it wants to destroy is ol-k8s-apisix-olapisixoidcresources-ci - this is because it's now being replaced with an identical one named ol-k8s-apisix-jupyterhub-olapisixoidcresources-ci.

Open Questions

  • Namespace: ATM I'm assuming we'll want this in its own namespace. This is because resources within a namespace need to be unique, and I'm not sure I can easily control the names of the deployments (i.e. hub)
  • Keycloak: I haven't made any keycloak substructure changes for this. I'm not sure if we'll need to or not - we do want this to be more heavily restricted than the learner notebook servers, but I don't know if that can be done while reusing the OIDC config for Jupyter?
  • Vault HCL: Should each deployment get its own HCL policy, or should one HCL policy manage all JupyterHub-related leases?
  • Is the replacement of ol-k8s-apisix-olapisixoidcresources-ci an online operation or will that result in downtime?

Resolved Questions

  • Database: We could probably use the same instance and just create two logical databases. I can't see an obvious reason it wouldn't work; the question was whether we'd prefer hardware-level separation. Resolved: we're going to put everything on one physical database.

@dsubak dsubak requested review from Ardiea and blarghmatey November 6, 2025 19:58
@dsubak
Contributor Author

dsubak commented Nov 6, 2025

@blarghmatey or @Ardiea - this still needs work, but I'd appreciate it if y'all could take a look. Of note, I'd love your opinions on the open questions listed in the description; I'm guessing I'll need another pass once I've got some of those answers!

Member

@blarghmatey blarghmatey left a comment

What would probably make sense is to wrap the majority of the resource provisioning into a function in a sibling Python module, which can then be called with inputs that provide the customization we want on a per-deploy basis. Most of the resources just need an `-authoring` change, so this would cut down a lot on copy-🍝

Copilot should be able to take what you've got here and figure out the majority of the refactoring.
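
A minimal sketch of that idea, assuming the per-deployment differences reduce to a base name plus the stack's environment suffix. `DeploymentNames` and `planned_resource_names` are hypothetical and are not part of the PR; the name patterns are copied from the pulumi output in the description, and upper-casing the suffix for the Helm release name is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentNames:
    """Resource names derived from one base name (hypothetical helper)."""

    base_name: str   # e.g. "jupyterhub" or "jupyterhub-authoring"
    env_suffix: str  # e.g. "ci"

    @property
    def vault_policy(self) -> str:
        return f"ol-{self.base_name}-vault-policy-{self.env_suffix}"

    @property
    def apisix_route(self) -> str:
        return f"ol-{self.base_name}-k8s-apisix-route-{self.env_suffix}"

    @property
    def helm_release(self) -> str:
        # Assumption: the release name embeds the upper-cased suffix.
        return f"{self.base_name}-{self.env_suffix.upper()}-application-helm-release"


def planned_resource_names(names: DeploymentNames) -> list[str]:
    """Stand-in for the provisioning function: one code path per deployment."""
    return [names.vault_policy, names.apisix_route, names.helm_release]


# The same function is called once per deployment instead of copy-pasting.
for base in ("jupyterhub", "jupyterhub-authoring"):
    print(planned_resource_names(DeploymentNames(base, "ci")))
```

Because the existing deployment's names come out unchanged, a refactor along these lines should be close to a no-op for the learner hub while the authoring hub gets its own consistently named resources.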

@blarghmatey
Member

Regarding the RDS, I think it makes sense to keep that on a single piece of hardware since the majority of load will be from the students and not the course authors.

@dsubak
Contributor Author

dsubak commented Nov 12, 2025

@blarghmatey Alright, wrapped most of the deployment related stuff into a function - let me know if this is what you had in mind.

Output looks about right at a high level - it's mostly a no-op for the existing deployment and the new one looks plausibly correct. I did end up with one more open question w/r/t downtime which I've put into the description since I'm definitely not up to speed on the APISix resources; would appreciate some additional guidance!

@dsubak dsubak force-pushed the dansubak/202510_authoring_jupyterhub branch from 4d8f3d4 to d5ab680 Compare November 13, 2025 17:24
@dsubak dsubak requested a review from blarghmatey November 14, 2025 13:01
@blarghmatey blarghmatey requested a review from Copilot November 14, 2025 15:31
Copilot finished reviewing on behalf of blarghmatey November 14, 2025 15:33
Contributor

Copilot AI left a comment

Pull Request Overview

This PR sets up a separate JupyterHub authoring environment for course content creation, running alongside the existing learner-facing JupyterHub deployment. The changes refactor the JupyterHub deployment code into a reusable function to support both environments.

Key changes:

  • Refactored JupyterHub deployment into a parameterized provision_jupyterhub_deployment() function
  • Added separate jupyter-authoring namespace and deployment with its own domain and configuration
  • Both deployments share the same physical RDS database but use separate logical databases and Vault backends

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| deployment.py | New reusable function to provision JupyterHub with all AWS/K8s resources |
| __main__.py | Refactored to use new deployment function for both main and authoring environments |
| author_menu_override.json | Empty JSON config placeholder for authoring environment menu customization |
| author_disabled_extensions.json | Empty JSON config placeholder for authoring environment extension settings |
| EKS Pulumi configs (CI/QA/Production) | Added jupyter-authoring namespace to each environment |
| JupyterHub Pulumi configs (CI/QA/Production) | Added authoring_domain configuration for each environment |

Comment on lines 57 to 72
Args:
base_name: Base name for the deployment (e.g., "jupyterhub" or
"jupyterhub-authoring"). This name is used to derive all resource
names throughout the deployment.
domain_name: Domain name for the deployment (e.g., "jupyter.mitlearn.mit.edu")
namespace: Kubernetes namespace for the deployment
stack_info: Stack information from parse_stack()
jupyterhub_config: Pulumi config for jupyterhub
vault_config: Pulumi config for vault
cluster_stack: EKS cluster stack reference
application_labels: Labels to apply to Kubernetes resources
k8s_global_labels: Global Kubernetes labels
extra_images: List of extra images to pre-pull (optional)
menu_override_json: JSON contents for menu override Jupyter config
disabled_extensions_json: JSON contents for disabled extensions Jupyter config
extra_config: Extra configuration values to merge into the Helm chart values

Copilot AI Nov 14, 2025

The docstring is missing the db_config and app_db parameters which are required arguments in the function signature (lines 45 and 47). These should be documented to explain their purpose in configuring the database connection.
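
One way to catch this kind of drift mechanically is to compare the signature against the docstring. The helper below is a hypothetical sketch: `undocumented_params` and the stand-in `provision_example` function are not from the PR, and it assumes Google-style `name: description` entries like the ones quoted above.

```python
from __future__ import annotations

import inspect


def undocumented_params(func) -> list[str]:
    """Return parameters that never appear as a 'name:' entry in the docstring."""
    doc = inspect.getdoc(func) or ""
    # Treat the text before the first colon on each line as a documented name.
    documented = {
        line.split(":", 1)[0].strip()
        for line in doc.splitlines()
        if ":" in line
    }
    return [
        name
        for name in inspect.signature(func).parameters
        if name not in documented
    ]


def provision_example(base_name, db_config, app_db):
    """Stand-in for the real function, with an incomplete docstring.

    Args:
        base_name: Base name for the deployment.
    """


print(undocumented_params(provision_example))  # ['db_config', 'app_db']
```

The colon-based parsing is deliberately crude; a continuation line that happens to contain a colon would be counted as a documented name, so this is a quick check rather than a real docstring linter.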

f"ol-jupyterhub-vault-policy-{stack_info.env_suffix}",
name="jupyterhub",
policy=Path(__file__).parent.joinpath("jupyterhub_policy.hcl").read_text(),
jupyterhub_authoring_db_config = OLPostgresDBConfig(
Member

Since we're not creating another physical RDS instance, the change that is necessary to allow for an additional logical database is actually to manage that via the Vault role definitions. You can see an example of how to modify that in https://github.com/mitodl/ol-infrastructure/blob/main/src/ol_infrastructure/applications/mit_learn/__main__.py#L539-L752

In the role definition you'll want to add a `create database` statement along the lines of `SELECT 'CREATE DATABASE my_database' WHERE NOT EXISTS (SELECT FROM pg_database WHERE datname = 'my_database')\gexec;`

You'll then need to reference that role in the dynamic credentials used by the authoring hub deployment.

Contributor Author

Alright @blarghmatey, I pattern matched against the link so it looks plausibly correct now. That said, I'm definitely a bit out of my depth, having done very little with postgres or vault so I've got a few questions to run by ya!

Correct me if I'm misreading anything, but after a bit of research I'm not sure this will work as written. I think `\gexec` is specific to psql, and Vault will use a driver, so we won't have it. And since CREATE DATABASE can't run inside a transaction and Vault's PG plugin appears to use transactions, I don't think I can make it conditional with a DO block either.

One suggestion I found was to use dblink_exec (that suggestion also references `\gexec`) - do you know if that's installed and available to Vault?

FWIW, I'm happy to give this a rip on CI and see what happens to start with as long as you think that's safe enough, but any preliminary advice is greatly appreciated!

Contributor Author

@dsubak dsubak Nov 25, 2025

Running it as is gets me the following error:

Diagnostics:
  pulumi:pulumi:Stack (ol-infrastructure-jupyterhub-application-applications.jupyterhub.CI):
    error: update failed
    error: kubernetes:yaml/v2:ConfigGroup resource 'jupyterhub-authoring-vso-resources' has a problem: grpc: the client connection is closing
    error: kubernetes:yaml/v2:ConfigGroup resource 'OLVaultK8SSecret-jupyter-authoring-jupyterhub-authoring-app-db-creds' has a problem: grpc: the client connection is closing

    WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
    I0000 00:00:1764097778.993076 40670945 fork_posix.cc:71] Other threads are currently calling into gRPC, skipping fork() handlers

  vault:database:SecretBackendConnection (jupyterhub_authoring-database-connection):
    error:   sdk-v2/provider2.go:572: sdk.helper_schema: error configuring database connection "postgres-jupyterhub-authoring/config/jupyterhub_authoring": Error making API request.
    
    URL: PUT https://vault-ci.odl.mit.edu/v1/postgres-jupyterhub-authoring/config/jupyterhub_authoring
    Code: 400. Errors:
    
    * error creating database object: error verifying connection: ping failed: failed to connect to `host=jupyterhub-db-ci.cbnm7ajau6mi.us-east-1.rds.amazonaws.com user=oldevops database=jupyterhub_authoring`: server error (FATAL: database "jupyterhub_authoring" does not exist (SQLSTATE 3D000)): [email protected]
    error: 1 error occurred:
    	* error configuring database connection "postgres-jupyterhub-authoring/config/jupyterhub_authoring": Error making API request.
    
    URL: PUT https://vault-ci.odl.mit.edu/v1/postgres-jupyterhub-authoring/config/jupyterhub_authoring
    Code: 400. Errors:
    
    * error creating database object: error verifying connection: ping failed: failed to connect to `host=jupyterhub-db-ci.cbnm7ajau6mi.us-east-1.rds.amazonaws.com user=oldevops database=jupyterhub_authoring`: server error (FATAL: database "jupyterhub_authoring" does not exist (SQLSTATE 3D000))

That's not entirely surprising if I'm reading this right and understanding when everything executes - the database wouldn't exist until the role statements run (assuming the role statements work) so just attempting to connect as the admin user chokes.

Unfortunately that does leave me a bit unsure of how to proceed.

  • I could specify the application jupyter database in the OLPostgresDBConfig, which I think will let it connect and execute the role creation statements, but will that cause issues down the line? I think I can specify the deployment-specific database value in the OLVaultK8SDynamicSecretConfig, but I'm not sure if there are other times it'd connect using the vault info that I should be aware of.
  • I can make the database manually if we'd like - it'd work, but it also means that we'd have a manual step every time we instantiated more than one deployment per stack with distinct databases
  • I could maybe wire up a database resource directly? It's not entirely clear to me if this'll work since I've only been working w/ the OL component resource wrappers, but if I can connect to it and that represents a logical db, I think I can get that going
  • I could try and shove everything into the same logical database. Probably like 50/50 odds that'd work at the outset, but I am pretty sure that it'll be a question of when, not if it comes back to bite us.
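
The second and third options both come down to making sure the logical database exists before Vault verifies its connection. A minimal sketch of that step, assuming a psycopg-style connection object that exposes an `autocommit` attribute and DB-API cursors; `ensure_database` is a hypothetical helper, not something that exists in ol-infrastructure. It splits the psql `\gexec` one-liner into a separate check and create, and relies on autocommit because PostgreSQL rejects CREATE DATABASE inside a transaction block.

```python
import re

# Only plain lower-case identifiers are accepted, since the name is
# interpolated into the CREATE DATABASE statement below.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def ensure_database(conn, name: str) -> bool:
    """Create database `name` if it is missing; return True if it was created.

    `conn` is assumed to be a psycopg-style connection (hypothetical wiring:
    how the admin connection is obtained is left out of this sketch).
    """
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsafe database name: {name!r}")

    # CREATE DATABASE cannot run inside a transaction block.
    conn.autocommit = True
    cur = conn.cursor()
    try:
        # Step 1: the existence check that the psql one-liner did inline.
        cur.execute("SELECT 1 FROM pg_database WHERE datname = %s", (name,))
        if cur.fetchone() is not None:
            return False
        # Step 2: the statement that \gexec would have executed.
        cur.execute(f'CREATE DATABASE "{name}"')
        return True
    finally:
        cur.close()
```

Run once against the shared instance with the admin credentials before the Vault connection is configured, this would keep the step idempotent: a second run finds the database and does nothing.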

@dsubak dsubak force-pushed the dansubak/202510_authoring_jupyterhub branch from 283f9aa to 75b56f0 Compare November 24, 2025 15:05