Bash: excluding duplicate rows (both of each pair) based on one column
I have a file (called example.txt) that looks like the following:
a b c d e f h c z b y b c t e f w o f
Based on column 2 only, I wish to identify the rows that have a non-unique entry and remove them completely. My real file may have duplicate entries, triplicate entries, quadruplicate entries, etc. I want to keep only the rows whose entry in column 2 is unique.
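The output file should look like this:
h c w o f
I wanted to do this in R, but the file is big and R is slow and keeps crashing, so I would like to do it in Bash directly. I am new to Bash; I tried the following, but it is not working: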
arraytmp=($(cat example.txt | awk '{print $2}' | sort | uniq -d))
sed "/${arraytmp[@]}\/d" example.txt
If the order does not matter:
awk '{a[$2]=$0; b[$2]++} END{for (i in b){if(b[i]==1){print a[i]}}}' your_file
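If the original row order does need to be kept, one possible variant is to read the file twice: count the column 2 values on the first pass, then print the rows whose value was seen exactly once on the second. This is only a sketch; sample.txt, filtered.txt and the rows in them are made-up names and data for illustration, not the file from the question.

```shell
# Hypothetical sample data (not the asker's file); column 2 is the key.
printf '%s\n' 'a b c' 'd e f' 'h b z' 'k t e' 'm w q' > sample.txt

# Pass 1 (NR==FNR is true only while reading the first copy of the file):
#   count how many times each column-2 value occurs.
# Pass 2: print a row only if its column-2 value occurred exactly once.
awk 'NR==FNR { count[$2]++; next } count[$2] == 1' sample.txt sample.txt > filtered.txt

cat filtered.txt
```

Here the two rows with b in column 2 are both dropped, and the remaining rows come out in their input order. The trade-off is that the file is read twice, but only the per-key counts are held in memory rather than whole rows.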