Merged
1 change: 1 addition & 0 deletions docs/docs.json
@@ -172,6 +172,7 @@
"group": "Deployment & scaling",
"pages": [
"geneva/deployment/index",
"geneva/deployment/helm",
"geneva/deployment/troubleshooting"
]
},
86 changes: 86 additions & 0 deletions docs/geneva/deployment/helm.mdx
@@ -0,0 +1,86 @@
---
title: Deploy Geneva using Helm
sidebarTitle: Helm Deployment
description: Learn how to deploy Geneva on Kubernetes using the Geneva Helm Chart
icon: cogs
---

<Tip>
**Feature Engineering is deployed automatically in LanceDB Enterprise**

In self-managed environments, Geneva can be installed into existing Kubernetes clusters using Helm. Please [contact LanceDB](https://lancedb.com/contact/) for access to the Helm Chart and related resources.
</Tip>

## Prerequisites

- An existing Kubernetes cluster
- One or more existing node pools for Geneva workloads. By default, Geneva uses the node selector
`{"geneva.lancedb.com/ray-head": "true"}` for Ray head nodes, and
Contributor:
I think in our GCP cluster we've got {"geneva.lancedb.com/ray-head": ""} instead of "true". (same for the worker nodes) Is that just a quirk of our setup there?

Contributor Author:
That was fixed recently, both AWS and GCP follow this convention now

`{"geneva.lancedb.com/ray-worker-cpu": "true"}` and `{"geneva.lancedb.com/ray-worker-gpu": "true"}`
for Ray CPU worker and Ray GPU worker nodes, respectively. These selectors can be overridden in the Geneva client.
- Geneva Helm chart. Please [contact LanceDB](https://lancedb.com/contact/) for access to the Helm Chart and related resources.

For more information on deploying the required cloud resources, see the [manual deployment instructions](/geneva/deployment/).
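
As a rough sketch of the node pool prerequisite, the default selectors above correspond to node labels that could be applied with `kubectl label`. The node names (`head-node-1`, `cpu-node-1`, `gpu-node-1`) and the helper `geneva_label_cmd` are hypothetical, and managed node pools on EKS or GKE would more commonly set these labels in the node pool configuration than per node. The helper only prints the commands so they can be reviewed before being run.

```bash
# Sketch: print the kubectl commands that would apply Geneva's default
# node-selector labels. Node names are hypothetical placeholders.

# Build one `kubectl label` command for a node and a Geneva role
# (ray-head, ray-worker-cpu, or ray-worker-gpu).
geneva_label_cmd() {
  local node="$1" role="$2"
  echo "kubectl label nodes ${node} geneva.lancedb.com/${role}=true --overwrite"
}

geneva_label_cmd head-node-1 ray-head
geneva_label_cmd cpu-node-1 ray-worker-cpu
geneva_label_cmd gpu-node-1 ray-worker-gpu
```

Piping the output to `bash` would apply the labels. If the selectors are overridden in the Geneva client, the label keys would need to change to match.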

## Geneva Helm Chart

The Helm chart includes resources required for running [Geneva](https://lancedb.com/docs/geneva/) in Kubernetes.

It includes the services, service accounts, RBAC roles, and related objects that the Geneva client uses to manage resources.

## Install

1. Authenticate with the Kubernetes cluster, i.e. update your kubeconfig
2. Configure Helm chart values

In values.yaml, configure the service account, node selectors, and cloud resources, if applicable.

```
geneva:
# Object storage root URI
rootUri:
value: "s3://my-data-bucket"

serviceAccount:
# Service account for Geneva worker pods and services
annotations:
# Set per-CSP annotations to provide access to CSP resources, e.g.
# eks.amazonaws.com/role-arn: arn:aws:iam::0123456789:role/geneva_service_role
# iam.gke.io/gcp-service-account: geneva-service-account@my-project.iam.gserviceaccount.com

gcp:
# GCP service account email for the Geneva client.
# It should have access to the GKE cluster and "roles/storage.objectUser"
# permissions on the object storage bucket.
# e.g., geneva-client-sa@project-id.iam.gserviceaccount.com
clientServiceAccount: ""

aws:
# AWS IAM role ARN to be assumed by the Geneva client.
# This role should have an access entry to the cluster with username matching the role ARN.
# It should also have r/w access to the object storage bucket.
# e.g., arn:aws:iam::123456789012:role/geneva-client-role
clientRoleArn: ""
```

3. Install the KubeRay operator
```bash
export NAMESPACE=geneva

helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator -n $NAMESPACE --create-namespace
```
4. Install NVIDIA device plugin (if using GPU nodes)

For GPU support, the NVIDIA device plugin must be installed in your Kubernetes cluster:
Contributor:
Is this the same on GCP and Azure?

Contributor Author:
Yeah


```bash
curl https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml > nvidia-device-plugin.yml
kubectl apply -f nvidia-device-plugin.yml
```

5. Install Geneva Helm chart
```bash
helm install geneva ./geneva -n $NAMESPACE --create-namespace
```
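
Once the steps above have completed, a quick check along these lines may help confirm the install. This is only a sketch: the release name `geneva` and the `geneva` namespace come from the commands above, while the `run` wrapper, the `verify_geneva_install` function, and the `DRY_RUN` variable are hypothetical helpers introduced here. By default the script prints the commands rather than running them.

```bash
# Sketch: post-install checks for the release installed above.
# Prints the commands by default; set DRY_RUN=0 to actually run them.
NAMESPACE="${NAMESPACE:-geneva}"
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

verify_geneva_install() {
  run helm status geneva -n "$NAMESPACE"           # state of the Helm release
  run kubectl get pods -n "$NAMESPACE"             # pods in the namespace, including the KubeRay operator
  run kubectl get serviceaccounts -n "$NAMESPACE"  # service accounts in the namespace
}

verify_geneva_install
```

Running it with `DRY_RUN=0` requires `helm` and `kubectl` on the `PATH` and a kubeconfig that points at the target cluster.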
11 changes: 8 additions & 3 deletions docs/geneva/deployment/index.mdx
@@ -1,12 +1,12 @@
---
- title: Geneva on Kubernetes Deployments
- sidebarTitle: Deployment
+ title: Manual Deployment on Kubernetes
+ sidebarTitle: Manual Deployment
description: Learn how to deploy Geneva on Kubernetes using KubeRay for distributed feature engineering workflows on GKE and EKS.
icon: cogs
---

<Tip>
- **Feature Engineering is deployed automatically in LanceDB Enteprise**
+ **Feature Engineering is deployed automatically in LanceDB Enterprise**

Feature Engineering is deployed automatically as part of [LanceDB Enterprise](/enterprise/).
For manual installation in self-managed environments, follow the instructions below.
@@ -23,6 +23,11 @@ See below for installation instructions for:

## Basic Kubernetes Setup

<Tip>
Kubernetes resources can be deployed automatically via [Helm](/geneva/deployment/helm/) or manually
Contributor:
Do we prefer the Helm installation? If so, we should say that, and probably have Helm before Manual in the sidebar. But if we don't care either way, then ignore this comment.

Contributor Author:
Yes we do. I kept this order because the helm install does depend on some cloud resources from this section.

via the instructions below.
</Tip>

In the following sections we'll use these variables:

```bash