Download directly from S3 for faster query times

## Opportunity

Whenever you run a query, Athena saves the result to an S3 bucket as a `.csv` file. You can download that file like any other S3 object.

Another observation is that the Athena API returns at most 1000 rows per page. This has a significant performance impact if you try to download 100k+ rows: that takes 100+ requests, even if the result is only a few MB.

In our case, we query Athena from a different region (on a different continent), so the latency alone across those 100+ requests adds up to multiple seconds.

Downloading from S3 is a single request, which is faster. There are almost no downsides.

## Result

After I implemented fetching directly from S3, we observed a significant speed-up in our query times. For queries that return ~100k rows, the time went from 38 seconds to just 18 seconds, which is more than a 2x improvement. The gain is even larger for queries that return more rows (in some places it was a 4x speed-up).

## Request

It would be nice if some form of S3 fetching were implemented upstream. I have opened a [PR](https://github.com/uber/athenadriver/pull/66) with my implementation, but it's not in a mergeable state right now. I will not have time to clean it up and create a proper PR, but I wanted to share my code anyway in case it helps someone, or in case someone finds the time to properly integrate this functionality into the `athenadriver` API.

Download directly from S3 for faster query times #65

Opportunity

Result

Request

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Download directly from S3 for faster query times #65

Description

Opportunity

Result

Request

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions