Terminal Guide

uniq Command Guide

The uniq command filters out adjacent repeated lines in its input. Learn how to find unique entries and count duplicates.

5 min read · Last updated: 2024
Dai Aoki

CEO at init, Inc. / CTO at US & JP startups / Creator of WebTerm

Quick Reference

Basic

uniq file                    Remove adjacent dups
sort file | uniq             Remove all dups
uniq -c file                 Count occurrences

Options

-d    Show only duplicates
-u    Show only unique
-i    Ignore case

Common

sort | uniq -c               Count each line
sort | uniq -c | sort -rn    Top counts
sort -u file                 Alternative dedup


Basic Usage

uniq removes only consecutive duplicate lines, so to deduplicate an entire file you must sort the input first to make repeated lines adjacent.

bash
# Remove duplicates (input must be sorted)
sort file.txt | uniq
Warning
uniq only removes consecutive duplicates. Always use sort first to group duplicates together.
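To see why sorting matters, feed a few lines where the duplicate is not adjacent:

```shell
# "apple" repeats, but not on adjacent lines, so uniq keeps both
printf 'apple\nbanana\napple\n' | uniq
# apple
# banana
# apple

# Sorting first groups the duplicates so uniq can remove them
printf 'apple\nbanana\napple\n' | sort | uniq
# apple
# banana
```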

Common Options

uniq Options

-c      Prefix lines with occurrence count
-d      Only print duplicate lines
-u      Only print unique lines
-i      Ignore case when comparing
-f N    Skip first N fields
-s N    Skip first N characters
-w N    Compare only first N characters

Counting Occurrences

bash
# Count occurrences of each line
sort file.txt | uniq -c

# Output example:
#   3 apple
#   1 banana
#   2 cherry

Finding Duplicates

bash
# Show only lines that appear more than once
sort file.txt | uniq -d

# Show only lines that appear exactly once
sort file.txt | uniq -u
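These filters combine with -c: uniq -cd prints a count for each duplicated line and drops the singletons, which is handy when only the repeats matter:

```shell
# Counts for duplicated lines only; "banana" (1 occurrence) is dropped
printf 'apple\napple\napple\nbanana\ncherry\ncherry\n' | uniq -cd
#   3 apple
#   2 cherry
```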

Case-Insensitive

bash
# Treat "Apple" and "apple" as duplicates
sort -f file.txt | uniq -i

Practical Examples

Find most common lines

bash
sort file.txt | uniq -c | sort -rn | head -10

Count unique visitors from log

bash
# Extract IPs and count unique
awk '{print $1}' access.log | sort | uniq | wc -l

# Most frequent visitors
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10

Find duplicate files by checksum

bash
md5sum * | sort | uniq -d -w32
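Note that -d prints only one representative line per duplicate group. On systems with GNU coreutils, the same trick extends to sha256sum (whose hashes are 64 hex characters) and to the GNU -D option, which lists every member of each group:

```shell
# SHA-256 hashes are 64 hex characters, so widen the comparison
sha256sum * | sort | uniq -d -w64

# GNU extension: -D prints all files in each duplicate group
sha256sum * | sort | uniq -D -w64
```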

Count word frequency

bash
tr '[:upper:]' '[:lower:]' < file.txt | tr -s ' ' '\n' | sort | uniq -c | sort -rn

Find commands you use most

bash
history | awk '{print $2}' | sort | uniq -c | sort -rn | head -10

Remove duplicate lines from file

bash
sort file.txt | uniq > unique.txt

# Or use sort -u
sort -u file.txt > unique.txt

Find lines in file1 but not in file2

bash
# file2 is listed twice so its lines can never be "unique";
# -u then keeps only lines exclusive to file1
# (assumes no line repeats within file1 itself)
sort file1.txt file2.txt file2.txt | uniq -u
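The double listing of file2 only works cleanly when file1 has no internal duplicates. comm is designed for exactly this kind of set comparison on sorted input and avoids that caveat (process substitution here assumes bash or zsh):

```shell
# Lines in file1 but not file2: suppress column 2 (only in file2)
# and column 3 (in both), leaving column 1 (only in file1)
comm -23 <(sort file1.txt) <(sort file2.txt)
```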

Skip Fields or Characters

bash
# Skip first field (compare from second field)
sort -k2 data.txt | uniq -f1

# Skip first 10 characters
sort data.txt | uniq -s10

# Compare only first 20 characters
sort data.txt | uniq -w20
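To make the field-skipping behavior concrete, here is -f1 on two-column sample data (made up for illustration) where the first field differs on every line:

```shell
# First field differs; -f1 skips it, so comparison uses the second field
printf '001 apple\n002 apple\n003 banana\n' | uniq -f1
# 001 apple
# 003 banana
```

uniq keeps the first line of each run, so "001 apple" survives and "002 apple" is dropped.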

uniq vs sort -u

sort -u is often simpler for basic deduplication:

bash
# These are equivalent for basic deduplication
sort file.txt | uniq
sort -u file.txt

# But uniq has more features
sort file.txt | uniq -c    # Count occurrences
sort file.txt | uniq -d    # Show only duplicates

Without Sorting

If you need to remove duplicates while preserving order, use awk:

bash
# Remove duplicates, preserve order
awk '!seen[$0]++' file.txt
Tip
The awk method preserves original order and removes non-consecutive duplicates, but it keeps every unique line in memory, so very large files can use a lot of RAM.
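Side by side on the same input, the two approaches differ in the order of the surviving lines:

```shell
# awk keeps the first occurrence of each line, in original order
printf 'cherry\napple\ncherry\nbanana\napple\n' | awk '!seen[$0]++'
# cherry
# apple
# banana

# sort -u produces sorted output instead
printf 'cherry\napple\ncherry\nbanana\napple\n' | sort -u
# apple
# banana
# cherry
```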

Summary

uniq is essential for data deduplication. Key takeaways:

  • Always sort before uniq
  • Use uniq -c to count occurrences
  • Use uniq -d to find duplicates
  • Use uniq -u to find unique lines
  • Use sort -u for simple deduplication
