Huge tables (WIP) by natmaka · Pull Request #67 · mla/pg_sample

natmaka · 2025-08-28T14:24:01Z

My goal is to speed pg_sample up when the database is "huge" (total size well above the host's RAM).

Approach number 1: give the user a way to make pg_sample use the "SYSTEM" sampling method instead of "BERNOULLI". This patch does that. It seems OK, but I haven't tested it thoroughly.
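For readers unfamiliar with the two methods: in PostgreSQL's TABLESAMPLE clause, BERNOULLI scans the whole table and picks individual rows, while SYSTEM picks whole blocks, which is faster on large tables but can produce clustered samples. The sketch below only shows how a user-selected method could be validated and spliced into a query. It is not pg_sample's code (which is Perl), and `quote_ident` and `sample_query` are hypothetical helper names.

```python
VALID_METHODS = {"BERNOULLI", "SYSTEM"}  # built-in TABLESAMPLE methods


def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def sample_query(schema: str, table: str, percent: float,
                 method: str = "BERNOULLI") -> str:
    """Build a TABLESAMPLE query (hypothetical helper, not pg_sample's API).

    The method name is uppercased and checked against a whitelist, and
    BERNOULLI stays the default when the user gives nothing.
    """
    method = method.upper()
    if method not in VALID_METHODS:
        raise ValueError(f"unsupported sampling method: {method}")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    return f"SELECT * FROM {target} TABLESAMPLE {method} ({percent})"


if __name__ == "__main__":
    print(sample_query("public", "orders", 5, "system"))
```

Whitelisting the method name matters here because it is interpolated into the SQL text rather than passed as a bind parameter.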

Approach number 2: obtain the number of tuples in a table from the statistics collected and stored during an ANALYZE pass, instead of the usual SELECT COUNT(*), which often means reading the whole table. The patch lays the groundwork for this, but most of the work remains to be done. Let me know if this seems interesting to you.
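As a rough illustration of this second approach: the planner's estimate lives in `pg_class.reltuples`, which VACUUM and ANALYZE update. A minimal sketch, assuming a hypothetical `fetch_scalar(sql, params)` callable that runs a query and returns its single value; none of these names come from the patch. A non-positive estimate is treated as "unknown" here (PostgreSQL 14 and later store -1 for a table that was never analyzed), in which case the sketch falls back to an exact count.

```python
from typing import Any, Callable, Tuple

# Planner estimate for one table; regclass resolves a (qualified) table name.
APPROX_SQL = "SELECT reltuples::bigint FROM pg_class WHERE oid = %s::regclass"
# Exact count; the table name is assumed to be quoted by the caller already.
EXACT_SQL = "SELECT count(*) FROM {}"

FetchScalar = Callable[[str, Tuple[Any, ...]], Any]


def row_count(qualified_table: str, fetch_scalar: FetchScalar,
              approx: bool = True) -> int:
    """Return a row count, preferring the ANALYZE-time estimate.

    `fetch_scalar` is a hypothetical callable standing in for a real
    database driver. If the estimate is missing or non-positive, the
    function falls back to a full count(*).
    """
    if approx:
        estimate = fetch_scalar(APPROX_SQL, (qualified_table,))
        if estimate is not None and estimate > 0:
            return int(estimate)
    return int(fetch_scalar(EXACT_SQL.format(qualified_table), ()))
```

The estimate can be stale if the table changed a lot since the last ANALYZE, so it suits sizing a sample better than anything that needs an exact figure.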

There are also various modifications made in "janitor" mode.

mla · 2025-08-28T17:31:26Z

Awesome, thank you @natmaka! Will review this weekend.

mla · 2025-11-23T06:24:59Z

Hey @natmaka, check out the dev branch if you would and see if that's what you're going for.

natmaka · 2025-11-24T08:07:58Z

Hi @mla, I could not run it on a huge DB, but it worked on a small DB.

I hacked together a way to:

ignore any table that vanishes during a run (on the fly)
avoid repeating a notice about the same table if all its tuples are already imported
complete the code enabling an '--approxcount' option (now probably useless), which lets the code use pg_class instead of 'SELECT COUNT(*)...'
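The first two items can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the linked commit: `DbError`, `sample_all`, and the `sample_one` callback are hypothetical names. The one grounded detail is that PostgreSQL reports a missing table with SQLSTATE 42P01 (undefined_table).

```python
from typing import Callable, Iterable, List, Tuple

UNDEFINED_TABLE = "42P01"  # PostgreSQL SQLSTATE for undefined_table


class DbError(Exception):
    """Hypothetical stand-in for a driver error that carries an SQLSTATE."""

    def __init__(self, sqlstate: str, message: str = "") -> None:
        super().__init__(message)
        self.sqlstate = sqlstate


def sample_all(tables: Iterable[str],
               sample_one: Callable[[str], bool],
               log: Callable[[str], None]) -> Tuple[List[str], List[str]]:
    """Sample each table, skipping ones that vanished mid-run.

    `sample_one` (hypothetical) returns True when all of the table's rows
    are already imported. That notice is logged once per table, even if
    the table is visited again on a later pass.
    """
    sampled: List[str] = []
    vanished: List[str] = []
    noticed = set()  # tables whose "already imported" notice was shown
    for table in tables:
        try:
            complete = sample_one(table)
        except DbError as err:
            if err.sqlstate != UNDEFINED_TABLE:
                raise  # any other database error is still fatal
            log(f"{table}: vanished during the run, skipping")
            vanished.append(table)
            continue
        sampled.append(table)
        if complete and table not in noticed:
            log(f"{table}: all rows already imported")
            noticed.add(table)
    return sampled, vanished
```

Checking the SQLSTATE rather than catching every error keeps genuine failures (permissions, connection loss) visible.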

natmaka@df2d882

If any of these seem useful to you, I can submit them as a pull request against the dev branch.

Thank you!

natmaka added 6 commits August 26, 2025 17:58

sampling method, WIP

699f840

sampling method: parameter uppercased , default value enforced

998c416

minor

0e1b965

APPROX_table_tuples_count

365d29f

APPROX_table_tuples_count comments

09a3b92

APPROX_table_tuples_count usage, WIP

25da7ea

vanished tables

df2d882

natmaka wants to merge 7 commits into mla:master from natmaka:master
