docs/src/integrations/dremio.md
Lines changed: 87 additions & 8 deletions
@@ -8,18 +8,97 @@ description: This section shows how you can start using lakeFS with Dremio, a ne
[Dremio](https://www.dremio.com/) is a next-generation data lake engine that liberates your data with live,
interactive queries directly on cloud data lake storage, including S3 and lakeFS.

-## Configuration
+
+## Iceberg REST Catalog
+
+The lakeFS Iceberg REST Catalog allows you to use lakeFS as a [spec-compliant](https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml) Apache [Iceberg REST catalog](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/apache/iceberg/main/open-api/rest-catalog-open-api.yaml),
+so that Dremio can manage and access tables using a standard REST API.
+This is the recommended way to use lakeFS with Dremio, as it allows lakeFS to stay completely outside the data path: data itself is read and written by Dremio executors, directly to the underlying object store. Metadata is managed by Iceberg at the table level, while lakeFS keeps track of new snapshots to provide versioning and isolation.
+
+[Read more about using the Iceberg REST Catalog](./iceberg.md#iceberg-rest-catalog).
+
+### Configuration
+
+To connect Dremio to the lakeFS Iceberg REST Catalog, add an [Iceberg REST Catalog source in Dremio](https://docs.dremio.com/current/data-sources/lakehouse-catalogs/iceberg-rest-catalog/).
+
+1. On the Datasets page, to the right of **Sources** in the left panel, click `+`.
+1. In the **Add Data Source** dialog, under Lakehouse Catalogs, select **Iceberg REST Catalog** Source. The New Iceberg REST Catalog Source dialog box appears, which contains the following tabs:
+1. In **General** →
+- Enter a name for your Iceberg REST Catalog source and specify the endpoint URI (e.g. `https://lakefs.example.com/iceberg/api`)
+- Uncheck "Use vended credentials"
+1. In **Advanced Options** → Catalog Properties, add the following key-value pairs (left = key, right = value):
+To learn more about the Iceberg REST Catalog, see the [Iceberg REST Catalog](./iceberg.md#iceberg-rest-catalog) documentation.
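
Before entering the endpoint URI in Dremio, it can help to confirm that it points at an Iceberg REST catalog. The Iceberg REST specification defines a `GET /v1/config` route under the catalog base URI; the sketch below builds that URL and offers an optional helper to fetch it. The host `lakefs.example.com` is the placeholder from the steps above, the helper names are hypothetical, and `fetch_config` assumes you pass whatever authentication headers your lakeFS installation requires.

```python
import json
import urllib.request
from typing import Optional


def config_url(endpoint: str) -> str:
    """Return the Iceberg REST `v1/config` URL for a catalog base URI."""
    return endpoint.rstrip("/") + "/v1/config"


def fetch_config(endpoint: str, headers: Optional[dict] = None) -> dict:
    """GET the catalog configuration; `headers` carries any auth your server needs."""
    request = urllib.request.Request(config_url(endpoint), headers=headers or {})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Placeholder endpoint taken from the configuration steps (not a real server).
ENDPOINT = "https://lakefs.example.com/iceberg/api"
print(config_url(ENDPOINT))
```

Only `config_url` runs here; call `fetch_config(ENDPOINT, headers=...)` against your own deployment to see the catalog's response.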
+
+## Using Dremio with the S3 Gateway
+
+Alternatively, you can use the S3 Gateway to read and write data to lakeFS from Dremio.
+
+While flexible, this approach requires lakeFS to be involved in the data path, which can be less efficient than the Iceberg
+REST Catalog approach, since all data operations are proxied through the lakeFS server. This is particularly true
+for large data sets, where network bandwidth can become a bottleneck.
+
+### Configuration

Starting from version 3.2.3, Dremio supports MinIO as an [experimental S3-compatible plugin](https://docs.dremio.com/current/sonar/data-sources/object/s3/#configuring-s3-for-minio).
Similarly, you can connect lakeFS with Dremio.

Suppose you already have both lakeFS and Dremio deployed, and want to use Dremio to query your data in the lakeFS repositories.
You can follow the steps listed below to configure the connection in the Dremio UI:

-1. click _Add Data Lake_.
-1. Under _File Stores_, choose _Amazon S3_.
-1. Under _Advanced Options_, check _Enable compatibility mode (experimental)_.
-1. Under _Advanced Options_ > _Connection Properties_, add `fs.s3a.path.style.access` and set the value to `true`.
-1. Under _Advanced Options_ > _Connection Properties_, add `fs.s3a.endpoint` and set lakeFS S3 endpoint to the value.
-1. Under the _General_ tab, specify the _access_key_id_ and _secret_access_key_ provided by lakeFS server.
-1. Click _Save_, and now you should be able to browse lakeFS repositories on Dremio.
+1. Click **Add Data Lake**.
+1. Under **File Stores**, choose **Amazon S3**.
+1. Under **Advanced Options**, check **Enable compatibility mode (experimental)**.
+1. Under **Advanced Options** > **Connection Properties**, add `fs.s3a.path.style.access` and set the value to `true`.
+1. Under **Advanced Options** > **Connection Properties**, add `fs.s3a.endpoint` and set the value to your lakeFS S3 endpoint.
+1. Under the **General** tab, specify the **access_key_id** and **secret_access_key** provided by the lakeFS server.
+1. Click **Save**. You should now be able to browse lakeFS repositories in Dremio.
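
The two connection properties from the steps above can be summarized as a small mapping, which is handy when reviewing or scripting a setup. This is a sketch only: the helper name and the endpoint value `lakefs.example.com` are hypothetical placeholders, and the exact endpoint format (with or without a scheme or port) depends on your deployment.

```python
def s3_gateway_properties(lakefs_s3_endpoint: str) -> dict:
    """Connection properties for a Dremio Amazon S3 source backed by lakeFS."""
    endpoint = lakefs_s3_endpoint.strip()
    if not endpoint:
        raise ValueError("the lakeFS S3 endpoint must not be empty")
    return {
        # Address buckets (lakeFS repositories) by path rather than by subdomain.
        "fs.s3a.path.style.access": "true",
        # Send S3 requests to lakeFS instead of AWS.
        "fs.s3a.endpoint": endpoint,
    }


# Hypothetical endpoint; replace it with your own lakeFS S3 gateway address.
for key, value in s3_gateway_properties("lakefs.example.com").items():
    print(f"{key} = {value}")
```

Each printed pair corresponds to one entry under **Advanced Options** > **Connection Properties**; the access key and secret key are entered separately on the **General** tab.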