pq

A Swiss Army knife for Parquet. Inspect, query, transform, and view Parquet files from the command line.

# What's in this file?
$ pq info events.parquet
File:         events.parquet
Size:         1.2 GiB
Rows:         48,291,037
Row Groups:   12
Columns:      8
Compression:  ZSTD

# What columns does it have?
$ pq schema events.parquet
Schema (8 columns):
├── id: int64
├── event: string
├── user_id: int32
├── ts: timestamp(us)
├── city: string
├── device: string
├── duration_ms: int32
╰── payload: struct
    ├── action: string
    ╰── metadata: map<string, string>

# Peek at the data
$ pq head events.parquet -n 3
╭────┬───────┬─────────┬──────────────────────┬──────────╮
│ id ┆ event ┆ user_id ┆ ts                   ┆ city     │
╞════╪═══════╪═════════╪══════════════════════╪══════════╡
│  1 ┆ click ┆     402 ┆ 2025-01-15T08:23:11Z ┆ Seattle  │
├╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│  2 ┆ view  ┆     117 ┆ 2025-01-15T08:23:14Z ┆ Portland │
├╌╌╌╌┼╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌┤
│  3 ┆ click ┆     892 ┆ 2025-01-15T08:23:19Z ┆ Denver   │
╰────┴───────┴─────────┴──────────────────────┴──────────╯

# SQL queries - reference files directly in FROM
$ pq sql "SELECT city, count(*) n FROM 'events.parquet' GROUP BY city ORDER BY n DESC LIMIT 3"
╭──────────┬───────╮
│ city     ┆ n     │
╞══════════╪═══════╡
│ Seattle  ┆ 12485 │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ Portland ┆ 9712  │
├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌┤
│ Denver   ┆ 8033  │
╰──────────┴───────╯

# jq expressions
$ pq jq events.parquet '{city, event}' | head -2
{"city":"Seattle","event":"click"}
{"city":"Portland","event":"view"}

# Works with remote files too - lazily fetches only the bytes it needs
$ pq count "https://example.com/big-dataset.parquet"
2964624
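Because the `pq jq` output shown above is one JSON object per line, it can be fed straight into ordinary Unix tools. The sketch below is a hypothetical helper (`top_records` is not part of pq) that counts identical records on stdin and prints the most frequent first. It assumes that records with the same values are serialized byte-for-byte identically, as they are in the sample output.

```shell
# Hypothetical helper, not part of pq: count identical lines on stdin and
# print "COUNT RECORD" pairs, most frequent first.
# $1 is the maximum number of lines to print (default 10).
top_records() {
  sort | uniq -c | sort -rn | head -n "${1:-10}" | sed 's/^ *//'
}

# Usage (assumes pq is installed and events.parquet exists):
#   pq jq events.parquet '{city}' | top_records 3
```

For anything beyond a quick tally, the `pq sql` form with `GROUP BY` shown earlier is likely the better fit, since it does the aggregation before any rows are serialized.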

Features

🔍

Inspect

File summary, schema, column statistics, physical layout, and validation

📊

Read

Dump rows, preview head/tail, random sample, fast count, regex search

Query

Full SQL via Apache DataFusion and jq expressions via jaq

🔧

Transform

Project columns, extract row ranges, combine files, split into partitions

📦

Import & Export

Convert between Parquet, CSV, JSON, and JSONL

🖥

Interactive Viewer

Full-screen TUI with scrolling, column navigation, and remote file support

🌐

Remote Files

HTTPS, S3, GCS, and Azure URLs work everywhere; pq lazily fetches only the bytes it needs

📋

Output Formats

Pretty tables in a terminal, JSONL when piped, plus JSON, CSV, and plain TSV

Install

Homebrew (macOS/Linux)

brew install joewalnes/tap/pq

Download binary

Pre-built binaries for macOS and Linux:

# macOS (Apple Silicon)
curl -Lo pq https://github.com/joewalnes/pq/releases/latest/download/pq-darwin-arm64

# Linux (x86_64)
curl -Lo pq https://github.com/joewalnes/pq/releases/latest/download/pq-linux-amd64

# Linux (ARM64)
curl -Lo pq https://github.com/joewalnes/pq/releases/latest/download/pq-linux-arm64

Then make it executable and move it to your PATH:

chmod +x pq
sudo mv pq /usr/local/bin/
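If you script the download, the asset can be chosen from `uname` output instead of by hand. This is a minimal sketch: `pq_asset` is a hypothetical helper name, and it only knows the three assets listed above, so any other platform is rejected rather than guessed.

```shell
# Hypothetical helper: map an OS name and CPU architecture (as reported by
# `uname -s` and `uname -m`) to one of the release asset names listed above.
pq_asset() {
  case "$1/$2" in
    Darwin/arm64)              echo "pq-darwin-arm64" ;;
    Linux/x86_64)              echo "pq-linux-amd64" ;;
    Linux/aarch64|Linux/arm64) echo "pq-linux-arm64" ;;
    *)
      echo "no pre-built pq binary listed for $1 $2" >&2
      return 1
      ;;
  esac
}

# Usage:
#   asset="$(pq_asset "$(uname -s)" "$(uname -m)")" &&
#     curl -Lo pq "https://github.com/joewalnes/pq/releases/latest/download/$asset"
```

Linux reports 64-bit ARM as `aarch64` while macOS reports `arm64`, which is why both spellings are matched.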

From source

Requires Rust 1.75+:

git clone https://github.com/joewalnes/pq.git
cd pq
make install    # builds release binary, copies to ~/.local/bin/pq
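Two things can trip up a source install: a Rust toolchain older than 1.75, and `~/.local/bin` not being on your `PATH`. The sketch below checks both before you build. The function names are hypothetical, and the version comparison only looks at the major and minor components.

```shell
# Hypothetical helper: succeed if version $1 (major.minor[.patch]) is at
# least the major.minor version $2. Sets a few global scratch variables.
version_at_least() {
  have_major="${1%%.*}"; rest="${1#*.}"; have_minor="${rest%%.*}"
  want_major="${2%%.*}"; rest="${2#*.}"; want_minor="${rest%%.*}"
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Hypothetical helper: succeed if directory $1 is an entry in $PATH.
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Usage (rustc prints e.g. "rustc 1.75.0 (<hash> <date>)"):
#   rust_version="$(rustc --version | awk '{print $2}')"
#   version_at_least "$rust_version" 1.75 ||
#     echo "Rust 1.75+ required, found $rust_version" >&2
#   on_path "$HOME/.local/bin" ||
#     echo "add $HOME/.local/bin to PATH to run pq" >&2
```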

Getting started