Bash: excluding duplicate rows (both of each pair) based on one column


I have a file (called example.txt) that looks like the following:

a b c
d e f
h i c
z b y
i b c
t e f
w o f

Based on column 2 only, I wish to identify the rows that have a non-unique entry and remove them completely. The real file may have duplicate entries, triplicate entries, quadruple entries, etc. I want to keep only the rows where the entry in column 2 is unique.

The output file should look like this:

h i c
w o f

I wanted to do this in R, but the file is big and R is slow and keeps crashing, so I would like to do it in bash directly. I am new to bash; I tried this, but it is not working:

arraytmp=($(cat example.txt | awk '{print $2}' | sort | uniq -d))
sed "/${arraytmp[@]}\/d" example.txt

If the order does not matter:

awk '{a[$2]=$0;b[$2]++}END{for (i in b){if(b[i]==1){print a[i]}}}' your_file
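If the original row order does matter, one possible variant is to let awk read the file twice: the first pass counts each column-2 value, and the second pass prints only the rows whose value was counted once. This is a sketch rather than a tested solution for the real file; the sample data below is reconstructed from the question and assumed to be whitespace-separated, and output.txt is just an illustrative name.

```shell
# Recreate the sample input from the question (assumed whitespace-separated).
printf '%s\n' 'a b c' 'd e f' 'h i c' 'z b y' 'i b c' 't e f' 'w o f' > example.txt

# Pass 1 (NR==FNR is true only while reading the first copy of the file):
#   count how often each column-2 value occurs, then skip to the next line.
# Pass 2: print the rows whose column-2 value occurred exactly once.
awk 'NR==FNR { count[$2]++; next } count[$2] == 1' example.txt example.txt > output.txt

cat output.txt
```

Only the counts are held in memory, not the rows themselves, so this may be lighter than the one-pass version on a large file, at the cost of reading the file twice.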
