xsv
Fast CSV toolkit written in Rust.
Installation

# Homebrew
brew install xsv
# Cargo (Rust)
cargo install xsv

Why use xsv?
xsv is a Swiss Army knife for CSV data processing. Written in Rust for maximum performance, it enables searching, indexing, joining, and transforming CSV files with SQL-like commands—all from the command line.
Lightning Fast
Built with Rust for blazing-fast CSV processing. Handles gigabytes of data efficiently.
SQL-like Queries
Perform SELECT-, JOIN-, and GROUP BY-style operations on CSV files with an intuitive command syntax.
Flexible Filtering
Search, filter, and select specific rows and columns with regex support.
Rich Features
Index files for fast access, split files, transpose data, and calculate statistics.
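The SQL analogy above can be made concrete even without xsv installed: the sketch below runs the same SELECT / WHERE / GROUP BY shapes in plain awk on a made-up people.csv (file and column names are illustrative; the xsv equivalents are noted in comments).

```shell
# Illustrative data -- any CSV with a header row works the same way.
cat > people.csv <<'EOF'
name,age,city
alice,34,berlin
bob,28,paris
carol,41,berlin
EOF

# SELECT name, age           (with xsv: xsv select name,age people.csv)
awk -F, -v OFS=, '{print $1, $2}' people.csv

# WHERE city = 'berlin'      (with xsv: xsv search -s city berlin people.csv)
awk -F, 'NR == 1 || $3 == "berlin"' people.csv

# GROUP BY city, COUNT(*)    (with xsv: xsv frequency -s city people.csv)
awk -F, 'NR > 1 { n[$3]++ } END { for (c in n) print c "," n[c] }' people.csv
```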
Installation
# macOS (Homebrew)
brew install xsv
# Cargo (any platform with Rust installed)
cargo install xsv
# Arch Linux
pacman -S xsv
# From source
git clone https://github.com/BurntSushi/xsv
cd xsv
cargo install --path .
# Download pre-built binary
wget https://github.com/BurntSushi/xsv/releases/download/0.13.0/xsv-0.13.0-x86_64-unknown-linux-musl.tar.gz
tar xf xsv-0.13.0-x86_64-unknown-linux-musl.tar.gz
sudo mv xsv /usr/local/bin/

Basic Usage
Inspecting CSV Files
# View column headers
xsv headers data.csv
# Display file statistics
xsv stats data.csv
# Show first few rows
xsv slice -l 5 data.csv
# Count rows
xsv count data.csv
# View specific columns
xsv select 1,3,5 data.csv | head
Selecting and Filtering
# Select specific columns by name
xsv select name,email,age data.csv
# Select by index
xsv select 1,2,4 data.csv
# Select range of columns
xsv select 1-5 data.csv
# Search for rows containing text
xsv search 'John' data.csv
# Filter with a regex restricted to one column
xsv search -s email '@gmail\.com' data.csv
Data Transformation
# Rename columns (xsv has no rename command; edit the header row instead)
sed '1s/old_name/new_name/' data.csv
# Windows (CRLF) line endings
xsv fmt --crlf data.csv
# Tab-separated output
xsv fmt -t '\t' data.csv
# Derived column (xsv has no expression command; pipe through awk --
# here first_name and last_name are assumed to be columns 1 and 2)
awk -F, -v OFS=, 'NR==1 {print $0, "full_name"} NR>1 {print $0, $1" "$2}' data.csv
# Sort by column (-N compares numerically, not lexicographically)
xsv sort -N -s age data.csv
# Sort in reverse
xsv sort -N -R -s age data.csv
Common Patterns

Joining CSV Files
# Inner join (the default): users.id against orders.user_id
xsv join id users.csv user_id orders.csv
# Case-insensitive key comparison
xsv join --no-case id users.csv user_id orders.csv
# Left join: keep every row of users.csv
xsv join --left id users.csv user_id orders.csv
# Full outer join
xsv join --full department_id employees.csv id departments.csv
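Under the hood, a join is a keyed lookup between the two files. As a rough sketch of what an inner join computes, here is the same operation in plain awk, assuming illustrative layouts of users.csv (id,name) and orders.csv (order_id,user_id,amount):

```shell
# Illustrative inputs
cat > users.csv <<'EOF'
id,name
1,alice
2,bob
EOF
cat > orders.csv <<'EOF'
order_id,user_id,amount
10,1,9.99
11,2,7.00
EOF

# Hash join: load users keyed by id, then emit each order
# alongside its matching user row.
awk -F, -v OFS=, '
  NR == FNR   { users[$1] = $0; next }  # first file: build the lookup table
  FNR == 1    { next }                  # skip the orders header
  $2 in users { print users[$2], $0 }   # inner join on user_id
' users.csv orders.csv
```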
Statistics and Analysis
# Get detailed statistics (adds median, mode, cardinality, and more)
xsv stats --everything data.csv
# Stats for specific columns
xsv stats --select age,salary data.csv
# Show stats with cardinality
xsv stats --cardinality data.csv
# Count distinct values in a column
xsv frequency -s category data.csv
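The numbers that stats reports come from a single pass over each column. A minimal awk sketch of that pass for one numeric column (illustrative ages.csv, age in column 2):

```shell
cat > ages.csv <<'EOF'
name,age
alice,34
bob,28
carol,41
EOF

# One-pass min/mean/max, roughly what `xsv stats -s age` reports
awk -F, '
  NR == 2 { min = max = $2 }
  NR > 1  { sum += $2; n++
            if ($2 < min) min = $2
            if ($2 > max) max = $2 }
  END     { printf "min=%s mean=%.2f max=%s\n", min, sum / n, max }
' ages.csv
```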
Splitting and Slicing
# Split into chunks of 1000 rows (the output directory comes first)
xsv split -s 1000 split-dir data.csv
# Get 100 rows starting at row 100 (rows are zero-indexed)
xsv slice -s 100 -l 100 data.csv
# Get rows 50 through 149
xsv slice -s 50 -e 150 data.csv
# Split by column value (one output file per distinct country)
xsv partition country split-by-country data.csv
Indexing for Performance
# Create index file for fast access
xsv index large-file.csv
# Row counts and slices are now near-instant
xsv count large-file.csv
xsv slice -i 500000 large-file.csv
# List indexed files
ls -la large-file.csv*
# Index files in batch
for f in *.csv; do xsv index "$f"; done
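The generated .idx file is essentially a table of byte offsets, one per row, which lets xsv seek straight to row N instead of re-reading everything before it. A rough illustration of that lookup table in awk (assumes LF line endings, hence the +1 per line):

```shell
cat > rows.csv <<'EOF'
id,value
1,a
2,b
3,c
EOF

# Byte offset at which each row starts -- the kind of
# lookup table an index file stores.
awk 'BEGIN { offset = 0 }
     { print NR - 1 ": " offset; offset += length($0) + 1 }' rows.csv
```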
Advanced Features

Field Evaluation

xsv has no expression engine, so calculated fields are added by piping through awk (the qsv fork of xsv ships expression commands if you need them natively). Column positions below are assumptions about your data.

# Calculated column: total = price * quantity (columns 1 and 2)
awk -F, -v OFS=, 'NR==1 {print $0, "total"} NR>1 {print $0, $1*$2}' data.csv
# Conditional column from an age field (column 1)
awk -F, -v OFS=, 'NR==1 {print $0, "status"} NR>1 {print $0, ($1>=18 ? "adult" : "minor")}' data.csv
# Lower-case an existing email column (column 1)
awk -F, -v OFS=, 'NR==1 {print} NR>1 {$1 = tolower($1); print}' data.csv
Data Deduplication

xsv has no dedup command either; sort the rows so duplicates become adjacent, then drop them with uniq (the header row is unique and survives).

# Remove duplicate rows
xsv sort data.csv | uniq
# Deduplicate by one column (email assumed in column 2; first occurrence wins)
awk -F, '!seen[$2]++' data.csv
# Count duplicate values
xsv select email data.csv | sort | uniq -c | sort -rn
Format Conversion
# Windows (CRLF) line endings
xsv fmt --crlf data.csv > data-crlf.csv
# Pipe-delimited output for database imports
xsv fmt -t '|' data.csv > data-pipe.csv
# Convert with quoting style
xsv fmt --quote-always data.csv
# Escape quotes with a backslash instead of doubling them
xsv select name,description data.csv | xsv fmt --escape '\'
Combining with Other Tools
# Filter rows, then trim columns (xsv search keeps the header; plain grep would drop it)
xsv search -s status 'active' data.csv | xsv select name,email
# Process each row with xargs (tail strips the header row)
xsv select url data.csv | tail -n +2 | xargs -I {} curl {}
# Combine stats from multiple files (xsv cat merges without duplicating headers)
xsv cat rows file1.csv file2.csv | xsv stats
# Use with awk for complex filtering (NR>1 skips the header)
xsv select name,salary data.csv | awk -F, 'NR>1 && $2 > 50000'

Command Reference
| Command | Description | Example |
|---|---|---|
| count | Count number of rows | xsv count data.csv |
| headers | List column headers | xsv headers data.csv |
| select | Choose columns | xsv select name,age data.csv |
| search | Search for a pattern | xsv search 'value' data.csv |
| join | Join files on a column | xsv join id f1.csv id f2.csv |
| sort | Sort by column | xsv sort -s age data.csv |
| index | Create an index for fast access | xsv index data.csv |
Tips

- Run xsv index on large files first; subsequent count, slice, and stats operations get a significant speedup.
- Use xsv count to get the row count quickly; with an index it is near-instant.
- Combine with standard Unix tools like grep, sort, and awk for more complex operations.
- Prefer the -s flag with column names over numeric indices; selections stay correct if the column order changes.
- xsv stats --cardinality helps identify how many distinct values each column holds.