Scatterplots and Histograms
VisiData renders scatterplots by plotting a numeric value column against a numeric key column, and histograms through the histogram column built into every frequency table. Both run entirely in the terminal with no external libraries.
Scatterplots reveal correlation between two variables. Histograms reveal the shape of a distribution. This section covers when each is more informative than a plain frequency table.
Scatterplots
Two Numeric Variables
# Mark x-axis column as key (numeric)
# Move to 'response_time' column, cast to int: #
! # key column (x-axis)
# Move to 'bytes_sent' column, cast to int: #
. # plot bytes_sent vs response_time
When the key column is numeric (not categorical), VisiData renders a true scatterplot.
Color-Coded by Category
When a categorical key column is set alongside a numeric key column, VisiData assigns a distinct color to each category:
# Set categorical key column: e.g., 'method' (GET/POST/PUT)
# Move to 'method' column
! # categorical key
# Set numeric key column: e.g., 'response_time'
# Move to 'response_time' column
! # numeric key (x-axis)
# Move to 'bytes_sent' column
. # scatterplot: x=response_time, y=bytes_sent, color=method
Each HTTP method (GET, POST, PUT) appears in a distinct color on the canvas.
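The grouping behind a color-coded plot can be sketched in Python. This is an illustration only, not VisiData's internal code: the `points` list and the `group_by_category` helper are hypothetical, and the integer layer numbers merely stand in for whatever colors the terminal palette provides.

```python
from collections import defaultdict

# Hypothetical rows: (method, response_time, bytes_sent)
points = [
    ("GET", 120, 1200),
    ("POST", 80, 800),
    ("GET", 350, 45000),
    ("PUT", 95, 1100),
]

def group_by_category(rows):
    """Group (x, y) pairs by category and number categories in first-seen order."""
    layers = defaultdict(list)
    for category, x, y in rows:
        layers[category].append((x, y))
    # dicts preserve insertion order, so enumerate yields first-seen numbering
    return {cat: (n, pts) for n, (cat, pts) in enumerate(layers.items(), start=1)}

for category, (layer, pts) in group_by_category(points).items():
    print(f"layer {layer}: {category} -> {pts}")
```

Each category ends up with its own list of points, which is what lets a plot draw (or hide) one category at a time.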
Histograms from Frequency Tables
The frequency table includes a built-in histogram column using ■ characters:
# Open frequency table on any column
Shift+F
# The table includes a 'histogram' column automatically
# This is a text-based bar chart
# To get a canvas-based histogram:
# Move cursor to 'count' column in the frequency table
. # canvas graph of frequency counts
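The idea behind a text histogram column can be sketched in a few lines of Python. This is not VisiData's implementation: the `text_histogram` helper, the bar width of 12, and the sample `methods` list are assumptions chosen for illustration.

```python
from collections import Counter

def text_histogram(values, width=12, mark="■"):
    """Return one line per distinct value, with a bar scaled to the largest count."""
    counts = Counter(values)
    if not counts:
        return []
    largest = max(counts.values())
    lines = []
    for value, count in counts.most_common():
        # Integer scaling: the most common value gets exactly `width` marks
        bar = mark * (width * count // largest)
        lines.append(f"{value:<6} {count:>3} {bar}")
    return lines

# Hypothetical column of HTTP methods
methods = ["GET", "GET", "POST", "GET", "GET", "POST", "GET", "GET"]
for line in text_histogram(methods):
    print(line)
```

Because bars are scaled against the largest count, a single dominant value can shrink every other bar to nothing, which is one reason to switch to the canvas graph.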
Configuring the Canvas
# Set x range manually
x
# Enter: 0 1000 (xmin xmax)
# Set y range manually
y
# Enter: 0 5000 (ymin ymax)
# Reset to auto-fit
_ # zoom to fit full extent
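Why a manual range helps can be seen from the scaling arithmetic. The sketch below is an assumption about how any fixed-width canvas maps values to columns, not VisiData's code; `to_column`, the 40-column width, and the sample values are hypothetical.

```python
def to_column(value, vmin, vmax, ncols):
    """Map a value in [vmin, vmax] to a column index in [0, ncols - 1]; None if outside."""
    if not vmin <= value <= vmax:
        return None
    if vmax == vmin:
        return 0
    return min(ncols - 1, int((value - vmin) * ncols / (vmax - vmin)))

# Hypothetical response times in ms, including two slow outliers
values = [120, 350, 80, 2100, 95, 430, 55, 1800]
ncols = 40  # assumed canvas width in cells

auto = [to_column(v, min(values), max(values), ncols) for v in values]
manual = [to_column(v, 0, 500, ncols) for v in values]

print("auto-fit:", auto)
print("0..500  :", manual)
```

With the auto-fit range, the two outliers stretch the axis and the remaining points crowd into the leftmost columns. Restricting the range to 0..500 drops the outliers from view and spreads the rest across the canvas.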
Practical Use Cases
Correlation: Response Time vs. Bytes Sent
vd /var/log/nginx/access.log
# Cast 'response_time' to int: #, mark as key: !
# Cast 'bytes_sent' to int: #
.
# Scatterplot: do larger responses take longer? Look for a trend.
Distribution of HTTP Status Codes
vd /var/log/nginx/access.log
# Move to 'status' column
Shift+F
# Frequency table shows text histogram:
# 200 ■■■■■■■■■■■■■■■■ 8500
# 301 ■■■■ 400
# 404 ■■ 150
# 500 ■ 12
CPU Load vs Memory Usage
vd /var/log/system_metrics.csv
# Cast 'cpu_percent' to float: %, mark as key: !
# Cast 'mem_percent' to float: %
.
# Scatterplot showing correlation between CPU and memory load
Histogram of Order Values
vd /var/www/html/exports/orders.csv
# Cast 'amount' to float: %
Shift+F
# Frequency table of order amounts with histogram
# Move to the 'count' column and press ] to sort descending: the most common order values rise to the top
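A frequency table counts distinct values, so continuous amounts are often more readable once grouped into ranges. The following is a minimal sketch of fixed-width binning done outside VisiData; the `bin_counts` helper, the bin width of 25, and the sample amounts are all hypothetical.

```python
from collections import Counter

def bin_counts(amounts, bin_width):
    """Count values per half-open bin [lo, lo + bin_width)."""
    counts = Counter()
    for amount in amounts:
        lo = int(amount // bin_width) * bin_width
        counts[(lo, lo + bin_width)] += 1
    return counts

# Hypothetical order amounts
amounts = [12.50, 19.99, 24.00, 31.75, 48.10, 49.99, 52.30, 105.00]

for (lo, hi), count in sorted(bin_counts(amounts, 25).items()):
    print(f"{lo:>4}-{hi:<4} {'■' * count} {count}")
```

Empty bins are simply absent from the result, so a gap between two printed ranges means no orders fell in between.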
Canvas Layer Toggling
Inside the canvas, toggle display of individual plot layers:
1 toggle layer 1 (first plotted column/category)
2 toggle layer 2
...
9 toggle layer 9
Useful when g. plots many overlapping columns — disable layers one at a time to isolate signals.
Reading the Canvas
| Symbol | Meaning |
|---|---|
| · or ⠁⠂⠄⠈ | Sparse data points (Braille cells with few dots) |
| ⠿ or ⣿ | Dense cluster of many data points in one cell |
| color | Distinct categorical values (when a categorical key is set) |
| ─ │ | Axis lines |
| ↑ → labels | Axis labels and scale ticks |
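To see why sparse regions show single dots and dense regions show nearly full cells, it helps to know that each Braille character packs a 2x4 grid of dots. The sketch below plots points onto such a grid using the standard Unicode Braille bit layout (block U+2800). It is an illustration under assumed names (`render_braille`, `sample`) and makes no claim to match VisiData's own canvas code.

```python
# Dot bit for each (row, col) position inside a 2x4 Braille cell
DOT_BITS = [
    (0x01, 0x08),
    (0x02, 0x10),
    (0x04, 0x20),
    (0x40, 0x80),
]

def render_braille(points, cols, rows):
    """Render (x, y) points onto a cols x rows grid of Braille characters."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    xmin, xspan = min(xs), (max(xs) - min(xs)) or 1
    ymin, yspan = min(ys), (max(ys) - min(ys)) or 1
    px_w, px_h = cols * 2, rows * 4  # each cell holds 2x4 dots
    cells = [[0] * cols for _ in range(rows)]
    for x, y in points:
        px = min(px_w - 1, int((x - xmin) * px_w / xspan))
        py = min(px_h - 1, int((y - ymin) * px_h / yspan))
        py = px_h - 1 - py  # flip so larger y is drawn higher
        cells[py // 4][px // 2] |= DOT_BITS[py % 4][px % 2]
    return ["".join(chr(0x2800 + bits) for bits in row) for row in cells]

# Hypothetical (response_ms, bytes_sent) pairs
sample = [(120, 1200), (350, 45000), (80, 800), (95, 1100), (430, 52000), (55, 600)]
for line in render_braille(sample, cols=20, rows=5):
    print(line)
```

Points that fall in the same cell are OR-ed into one character, so a cell with many dots lit marks a cluster rather than a single observation.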
Troubleshooting Matrix
| Problem | Cause | Fix |
|---|---|---|
| Scatterplot looks like vertical lines | X-axis column is categorical | Use a numeric column as the key |
| Canvas all one color | No categorical key set | Mark a categorical column as an additional key |
| Points too sparse to see | Data range too wide | Use x and y to set axis ranges |
| Canvas renders slowly | Very many distinct points | Filter to a sample: --max-rows 10000 |
| ■ histogram too small to read | Terminal too narrow | Widen terminal or use canvas graph |
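When a file is too large to plot comfortably, one option is to draw a random sample before opening it. The sketch below uses reservoir sampling so the file is read only once. It is a pre-processing step outside VisiData, and `sample_rows` and the in-memory test data are hypothetical.

```python
import csv
import io
import random

def sample_rows(lines, k, seed=None):
    """Return the header and a uniform random sample of up to k data rows."""
    rng = random.Random(seed)
    reader = csv.reader(lines)
    header = next(reader)
    reservoir = []
    for i, row in enumerate(reader):
        if i < k:
            reservoir.append(row)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row  # replace with probability k / (i + 1)
    return header, reservoir

# Hypothetical in-memory CSV; a file object from open(...) can be passed the same way
data = io.StringIO("x,y\n" + "".join(f"{n},{n * 2}\n" for n in range(1000)))
header, rows = sample_rows(data, k=10, seed=0)
print(header, len(rows))
```

A uniform sample keeps the overall shape of a scatterplot while reducing the number of points the canvas has to draw.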
Best Practices
- Use scatterplots for correlation discovery between two numeric variables.
- Use histograms (frequency table) for distribution shape of a single variable.
- Use g. to plot multiple numeric columns simultaneously for a multi-variable overview.
- Set axis ranges manually (x and y) when outliers compress the visible range.
Hands-On Practice
cat > /tmp/requests.csv << 'EOF'
response_ms,bytes_sent,method,status
120,1200,GET,200
350,45000,GET,200
80,800,POST,201
2100,200,GET,500
95,1100,GET,200
430,52000,POST,201
55,600,GET,304
1800,150,GET,500
EOF
vd /tmp/requests.csv
# 1. Cast response_ms to int: #, mark as key: !
# 2. Cast bytes_sent to int: #
# 3. Press . → scatterplot: bytes vs response time
# 4. Press q → return
# 5. Move to 'method' column
# 6. Press Shift+F → histogram of methods
# 7. Press . on count column → canvas histogram
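To back up the visual impression from step 3 with a number, the Pearson correlation coefficient of the two columns can be computed separately. The sketch below embeds the same sample rows as a string so it runs on its own. `pearson` is a hand-written helper for illustration, not part of VisiData.

```python
import csv
import io
import math

CSV_TEXT = """response_ms,bytes_sent,method,status
120,1200,GET,200
350,45000,GET,200
80,800,POST,201
2100,200,GET,500
95,1100,GET,200
430,52000,POST,201
55,600,GET,304
1800,150,GET,500
"""

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
xs = [int(r["response_ms"]) for r in rows]
ys = [int(r["bytes_sent"]) for r in rows]
print(f"r = {pearson(xs, ys):.2f}")
```

A value near +1 or -1 indicates a strong linear trend, and a value near 0 indicates none. With only eight rows, and two of them slow error responses, the coefficient is easily dominated by those outliers, which is exactly what the scatterplot makes visible.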