Analysis Workflows Cheatsheet
Repeatable, step-by-step analysis patterns for the most common VisiData use cases.
Workflow 1: Frequency Distribution
Goal: Count how many rows exist for each value in a column.
1. Open file: vd data.csv
2. Move to column: h/l or arrow keys
3. Cast if needed: # (int) or ~ (str)
4. Open freq table: Shift+F
5. Sort by count: ] (descending)
6. Drill into group: Enter
7. Return: q
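Interactive keystrokes can't be asserted in a script, but the same frequency count can be sketched with plain coreutils. The file and column below are made up for illustration:

```shell
# Hypothetical sample standing in for data.csv
printf 'name,dept\nann,eng\nbob,eng\ncarol,sales\n' > vd_demo_data.csv

# Count rows per value in column 2, highest count first --
# the batch cousin of Shift+F followed by ]
cut -d, -f2 vd_demo_data.csv | tail -n +2 | sort | uniq -c | sort -rn > vd_demo_freq.txt
cat vd_demo_freq.txt
```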
Workflow 2: Pivot Table (Cross-Tabulation)
Goal: Sum/mean of a value column grouped by two categorical columns.
1. Open file: vd data.csv
2. Cast value column: % or #
3. Add aggregator: + → enter: sum
4. Mark row key: ! on first category column
5. Mark col key: ! on second category column
6. Move to value col: h/l
7. Open pivot: Shift+W
8. Explore results: arrow keys, Enter for drill-down
9. Return: q
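As a rough non-interactive analogue, the same sum-by-two-keys can be done in awk (output is a flat triple list rather than the grid Shift+W draws; filenames are hypothetical):

```shell
printf 'region,quarter,sales\neast,Q1,10\neast,Q2,20\nwest,Q1,5\nwest,Q1,7\n' > vd_demo_pivot.csv

# Sum the value column grouped by (row key, column key)
awk -F, 'NR>1 { sum[$1","$2] += $3 } END { for (k in sum) print k","sum[k] }' \
    vd_demo_pivot.csv | sort > vd_demo_pivot_out.csv
cat vd_demo_pivot_out.csv
```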
Workflow 3: Data Profiling (Instant Audit)
Goal: See nulls, distinct counts, min/max for all columns.
1. Open file: vd data.csv
2. Open Describe: Shift+I
3. Read: nulls column → which columns have missing data?
4. Read: distinct → is a column constant? (distinct=1 → useless)
5. Read: min/max → spot outliers
6. Drill into column: Enter → opens frequency table for that column
7. Return: q
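For a sense of what the Describe sheet computes, here is a minimal sketch of two of its statistics (null count and distinct count) for a single column, on a made-up file:

```shell
printf 'id,city\n1,paris\n2,\n3,paris\n4,rome\n' > vd_demo_profile.csv

# Nulls and distinct values in column 2 -- Shift+I does this
# (and min/max, errors, etc.) for every column at once
awk -F, 'NR>1 {
    if ($2 == "") n++
    else if (!($2 in seen)) { seen[$2] = 1; d++ }
} END { printf "nulls=%d distinct=%d\n", n, d }' vd_demo_profile.csv > vd_demo_profile_out.txt
cat vd_demo_profile_out.txt
```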
Workflow 4: Filter and Export
Goal: Extract a subset of rows matching a condition.
1. Open file: vd data.csv
2. Select matching: | regex (by regex in current column)
z| expr (by Python expression)
, (by matching current cell value)
3. Verify count: check status bar (N rows selected)
4. Open filtered: "
5. Save filtered: Ctrl+S → new_filename.csv
6. Return: q
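The select-then-save steps above map onto a one-line awk filter in batch form (column positions and the threshold are illustrative):

```shell
printf 'name,score\nann,91\nbob,55\ncarol,78\n' > vd_demo_rows.csv

# Keep the header plus rows where score > 70 -- the batch analogue
# of selecting with z| and saving the duplicate sheet with Ctrl+S
awk -F, 'NR==1 || $2 > 70' vd_demo_rows.csv > vd_demo_rows_filtered.csv
cat vd_demo_rows_filtered.csv
```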
Workflow 5: Log Parsing
Goal: Parse unstructured log lines into columns.
1. Open log: vd /var/log/nginx/access.log
2. Parse with regex: ; (semicolon)
Enter: (?P<ip>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d+) (?P<bytes>\d+)
3. Type columns: Move to status → #
Move to bytes → #
Move to date → @
4. Analyze: Shift+F on status → distribution
5. Drill into 500s: Enter on 500 row in freq table
6. Export: Ctrl+S → report.csv
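The status-code distribution from steps 2-4 can be approximated in the shell: in common log format the status is the 9th space-delimited token, so no regex is needed for this one field (sample lines are invented):

```shell
cat > vd_demo_access.log <<'EOF'
1.2.3.4 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 512
5.6.7.8 - - [10/Oct/2024:13:55:40 +0000] "POST /api HTTP/1.1" 500 99
EOF

# Count occurrences of each status -- like ; regex then Shift+F
awk '{ count[$9]++ } END { for (s in count) print count[s], s }' \
    vd_demo_access.log | sort > vd_demo_status.txt
cat vd_demo_status.txt
```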
Workflow 6: Data Cleaning
Goal: Clean a messy CSV for import or analysis.
1. Open file: vd raw_data.csv
2. Inspect: Shift+I (Describe Sheet)
3. Fix types: # % @ on numeric/date columns
4. Fill nulls: f (fill down: copy the previous non-null value)
ge "default" (set selected cells in the column to a fixed value)
5. Normalize text: gs → g* → (?i)ACTIVE/active (input format: regex/subst)
6. Remove invalids: z| expr (select bad rows) → gd (delete selected)
7. Check duplicates: Shift+F on key column → look for count > 1
8. Verify: Shift+I again → confirm nulls=0, types correct
9. Export: Ctrl+S → clean_data.csv
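The normalization step (5) has a simple batch equivalent: lower-case one column while leaving the header and the other columns alone (file and column are made up):

```shell
printf 'id,status\n1,ACTIVE\n2,Active\n3,active\n' > vd_demo_messy.csv

# Lower-case column 2 on data rows -- the batch analogue of
# gs then g* with a case-insensitive substitution
awk -F, 'BEGIN { OFS="," } NR>1 { $2 = tolower($2) } 1' \
    vd_demo_messy.csv > vd_demo_clean.csv
cat vd_demo_clean.csv
```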
Workflow 7: Sysadmin Process Inspection
Goal: Interactively inspect running processes.
1. Launch: ps aux | vd -f fixed (first line becomes the header)
2. Type CPU: % on %CPU column
3. Type MEM: % on %MEM column
4. Sort by CPU: ] (descending)
5. Filter high-CPU: ^ to rename %CPU → CPU, then z| → CPU > 5
6. Open filtered: "
7. Inspect PIDs: note the PID column values
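The sort-and-filter part of this workflow can be sketched without vd; the block below uses canned ps-style output so the result is deterministic (against a live system you would pipe `ps aux` straight into awk):

```shell
cat > vd_demo_ps.txt <<'EOF'
USER PID %CPU %MEM COMMAND
root 1 0.1 0.2 init
app 42 7.5 1.1 worker
app 43 0.3 0.9 idle
EOF

# Header plus rows with %CPU (field 3) above 5 -- the z| step in batch form
awk 'NR==1 || $3 > 5' vd_demo_ps.txt > vd_demo_hot.txt
cat vd_demo_hot.txt
```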
Workflow 8: Batch File Conversion
Goal: Convert data format without interactive UI.
# CSV → JSON
vd -b input.csv -o output.json
# JSONL → CSV
vd -b events.jsonl -o events.csv
# Excel → TSV
vd -b report.xlsx -o report.tsv
# With row limit (sample conversion)
vd -b --max-rows 10000 input.csv -o sample.json
# Replay a recorded session on new input
vd --play clean_workflow.vdj --batch new_input.csv -o cleaned.csv
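Batch mode composes naturally with shell loops for converting many files. This is a dry run that only prints the commands it would issue (directory and filenames are hypothetical); drop the `echo` to actually convert:

```shell
mkdir -p vd_demo_in
touch vd_demo_in/a.csv vd_demo_in/b.csv

# Print one vd batch-conversion command per CSV found
for f in vd_demo_in/*.csv; do
    echo vd -b "$f" -o "${f%.csv}.json"
done > vd_demo_cmds.txt
cat vd_demo_cmds.txt
```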
Workflow 9: Join Two Datasets
Goal: Enrich one dataset with columns from another.
1. Open both files: vd file1.csv file2.csv
2. Set key on file1: ! on 'id' column
3. Set key on file2: ! on 'id' column (switch with Ctrl+^)
4. Open Sheets Sheet: Shift+S
5. Select both: s on each sheet row
6. Join: & → choose jointype (inner/outer/full)
7. Result sheet opens: verify row count and columns
8. Save: Ctrl+S → joined.csv
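For comparison, an inner join on a shared id column can also be done with coreutils `join` (assuming bash for the process substitutions; inputs must be sorted on the key, and these files are invented):

```shell
printf 'id,name\n1,ann\n2,bob\n' > vd_demo_left.csv
printf 'id,dept\n1,eng\n2,sales\n' > vd_demo_right.csv

# Inner join on column 1 -- roughly what & (inner) produces in VisiData
join -t, <(tail -n +2 vd_demo_left.csv | sort) \
         <(tail -n +2 vd_demo_right.csv | sort) > vd_demo_joined.csv
cat vd_demo_joined.csv
```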
Workflow 10: CommandLog Replay (Reproducible Analysis)
Goal: Save and replay an analysis for documentation or automation.
During interactive session:
1. Perform all your analysis steps
2. Open CommandLog: Shift+D
3. Save: Ctrl+S → workflow.vdj
Replay later:
vd --play workflow.vdj # interactive replay
vd --play workflow.vdj --batch -o result.csv # batch replay
vd --play workflow.vdj --replay-wait 2 # demo (2s between steps)
Quick Analysis Decision Matrix
| What you need | VisiData command |
|---|---|
| Count by category | Shift+F (frequency table) |
| Sum/mean by category | + aggregator + Shift+F |
| Cross-tab two categories | key columns ! + Shift+W |
| All column statistics | Shift+I (Describe Sheet) |
| Filter to matching rows | \| regex, then " (open selected) |
| Plot a numeric column | . (after setting key column !) |
| Compare two variables | key=x !, y column, . (scatterplot) |
| Join two files | Shift+S → select both → & |
| Convert file format | vd -b input.ext -o output.ext |
| Parse log fields | ; with named capture regex |
Common Command Sequences
# Open → frequency → drill → export
vd f.csv → Shift+F → ] → Enter → Ctrl+S
# Open → type → pivot → export
vd f.csv → # on value → + sum → ! on keys → Shift+W → Ctrl+S
# Open → parse log → analyze → export
vd log → ; regex → # on status → Shift+F → | 500 → " → Ctrl+S
# Open → filter → clean → export
vd f.csv → Shift+I → # % @ → f → gs → g* → gd → Ctrl+S
Next Steps
You have completed the VisiData curriculum. Return to any module for deeper practice:
- What is VisiData — fundamentals
- Log Analysis — server workflows
- Core Keys Cheatsheet — daily reference