A CLI tool for searching text within Apache Parquet files. Works like grep but for Parquet files, with support for recursive directory search and multiple output formats.
Built on top of hyparquet for high-performance Parquet parsing.
Install globally with npm:

    npm install -g parquet-grep

Or use directly with npx:

    npx parquet-grep "search term" file.parquet

Usage:

    parquet-grep [options] <query> [parquet-file]

Options:

- `-i` - Force case-insensitive search (by default: case-insensitive if query is lowercase, case-sensitive if query contains uppercase)
- `--table` - Output in markdown table format (default, grouped by file)
- `--jsonl` - Output as JSON lines (one match per line with filename, rowOffset, and value)
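The default case rule described for `-i` can be sketched in a few lines of JavaScript. This is only an illustration of the rule as stated here, not the tool's actual source: the function names `isCaseSensitive` and `matches` are hypothetical, and substring matching on the stringified cell value is an assumption.

```javascript
// Hypothetical helper: decide whether a query should be matched case-sensitively.
// forceInsensitive mirrors the documented -i flag.
function isCaseSensitive(query, forceInsensitive = false) {
  if (forceInsensitive) return false;
  // A query that differs from its lowercased form contains an uppercase letter.
  return query !== query.toLowerCase();
}

// Hypothetical helper: substring match under the rule above (assumed semantics).
function matches(value, query, forceInsensitive = false) {
  const text = String(value);
  if (isCaseSensitive(query, forceInsensitive)) {
    return text.includes(query);
  }
  return text.toLowerCase().includes(query.toLowerCase());
}
```

Under this sketch, the query `holland` would match `Holland Lop`, while the query `Holland` would not match `holland lop` unless `-i` is given.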
If no file is specified, recursively searches all .parquet files in the current directory, skipping node_modules and hidden directories.
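A recursive search like the one described above could be implemented roughly as follows. This is a minimal sketch using Node's built-in `fs` and `path` modules; the function name `findParquetFiles` is hypothetical, and treating any directory whose name starts with a dot as hidden is an assumption.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: collect .parquet files under `dir`, skipping
// node_modules and directories whose names start with a dot.
function findParquetFiles(dir) {
  const results = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      if (entry.name === "node_modules" || entry.name.startsWith(".")) continue;
      // Descend into ordinary subdirectories.
      results.push(...findParquetFiles(fullPath));
    } else if (entry.isFile() && entry.name.endsWith(".parquet")) {
      results.push(fullPath);
    }
  }
  return results;
}
```

Calling `findParquetFiles(process.cwd())` would return the paths that such a search visits when no file argument is given.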
Search a single file:

    parquet-grep "Holland" bunnies.parquet

Search recursively in current directory:

    parquet-grep "search term"

Case-insensitive search:

    parquet-grep -i "HOLLAND" bunnies.parquet

JSONL output:

    parquet-grep --jsonl "Holland" bunnies.parquet
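Because `--jsonl` emits one JSON object per line, the output is easy to post-process. The sketch below counts matches per file. The helper name `countMatchesByFile` is hypothetical, and the sample records are made up to illustrate the documented fields (`filename`, `rowOffset`, `value`); the exact shape of real output may differ.

```javascript
// Hypothetical helper: count matches per file from --jsonl output text.
function countMatchesByFile(jsonlText) {
  const counts = new Map();
  for (const line of jsonlText.split("\n")) {
    if (line.trim() === "") continue; // ignore blank lines
    const { filename } = JSON.parse(line);
    counts.set(filename, (counts.get(filename) ?? 0) + 1);
  }
  return counts;
}

// Made-up sample records using the documented field names.
const sample = [
  '{"filename":"bunnies.parquet","rowOffset":3,"value":"Holland Lop"}',
  '{"filename":"bunnies.parquet","rowOffset":7,"value":"Holland Lop"}',
  '{"filename":"hares.parquet","rowOffset":0,"value":"Holland"}',
].join("\n");

const counts = countMatchesByFile(sample);
```

In practice the text would come from the command's standard output rather than a hard-coded string.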