How to determine the number of unique column entries in CSV data using Unix
Assume you have CSV data that looks like the following:
Bob, Red, 5
Cindy, Blue, 2
Frank, Red, 3
Lisa, Green, 3
and you would like to determine the number of unique colors that appear in the second column.
Use the following commands at the Unix prompt:
$ export IFS=","
$ cat myfile.csv | while read a b c; do echo "$b"; done | sort | uniq | wc -l
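The result returned is 3. For a better understanding of what each stage does,
remove the commands from the end of the pipeline one at a time and rerun it.
Note that the export command sets IFS, the shell's field separator, to a comma,
so that read splits each line at the commas.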
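As an alternative that leaves IFS untouched, the same count can be sketched with cut and sort -u. This is a minimal sketch, not the method described above: the temporary file stands in for myfile.csv, and the variable names tmpfile and count are illustrative assumptions.

```shell
# Hypothetical stand-in for myfile.csv, holding the sample data from above.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
Bob, Red, 5
Cindy, Blue, 2
Frank, Red, 3
Lisa, Green, 3
EOF

# cut extracts the second comma-separated field, sort -u drops duplicates,
# and wc -l counts the remaining lines. tr strips the padding that some
# wc implementations print before the number.
count=$(cut -d, -f2 "$tmpfile" | sort -u | wc -l | tr -d ' ')
echo "$count"

rm -f "$tmpfile"
```

Because the sample separates fields with a comma followed by a space, each extracted value keeps its leading space. The count is unaffected here, since every value is padded the same way.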