unix - Removing every character in a .txt file that isn't enclosed by pipes ( | ), except on lines that start with #
I've got a bunch of .txt files like this:

    # title: got stripes
    # artist: johnny cash
    # metre: 4/4
    # tonic: db

    0.000000000 silence
    0.348299319 a, intro, | cb:maj | db:maj | db:maj |, (guitar)
    3.931269841 b, verse, | db:maj | db:maj | ab:maj | ab:maj |, (voice
    8.662993197 | ab:maj | ab:maj | db:maj | db:maj |

    # tonic: eb
    78.145873015    d, modulation, | eb:maj | eb:maj |, (guitar)
    80.474625850    b, verse, | eb:maj | eb:maj | bb:maj | bb:maj |, (voice
    85.104784580    | bb:maj | bb:maj | eb:maj | eb:maj |

and I need to convert them to this:

    # title: got stripes
    # artist: johnny cash
    # metre: 4/4
    # tonic: db

    | cb:maj | db:maj | db:maj |
    | db:maj | db:maj | ab:maj | ab:maj |
    | ab:maj | ab:maj | db:maj | db:maj |

    # tonic: eb
    | eb:maj | eb:maj |
    | eb:maj | eb:maj | bb:maj | bb:maj |
    | bb:maj | bb:maj | eb:maj | eb:maj |

Specifically, this means:
- every line that starts with # needs to stay the same
- every blank line (such as line 5 in the mock example) needs to stay there
- for all other lines, every character that isn't enclosed by pipes ( | ) needs to be removed
 
I have +/- 700 files, in different subdirectories.
I was thinking of writing a sed script, but I can't quite figure out how to do it.
Using sed:

    sed '/^ *#/b;s/^[^|]*//;s/[^|]*$//' filename

How it works:
- If the line begins with # (with optional spaces before the #), branch to the next cycle (i.e. don't do anything).
- Remove everything from the beginning of the line up to the first |.
- Remove everything after the last | up to the end of the line.
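The three steps above can be tried on a few lines of the mock example. This is only a sketch: the function name `strip_outside_pipes` is made up for illustration, and the script is written with separate `-e` expressions so it should behave the same on GNU and BSD sed.

```shell
# Hypothetical helper (name is an assumption): apply the sed script to stdin.
strip_outside_pipes() {
    # 1. lines starting with optional spaces and '#': skip the remaining commands
    # 2. delete everything before the first '|'
    # 3. delete everything after the last '|'
    sed -e '/^ *#/b' -e 's/^[^|]*//;s/[^|]*$//'
}

# Feed a few lines from the mock example through the helper.
printf '%s\n' \
    '# tonic: db' \
    '' \
    '0.000000000 silence' \
    '0.348299319 a, intro, | cb:maj | db:maj | db:maj |, (guitar)' \
    '8.662993197 | ab:maj | ab:maj | db:maj | db:maj |' \
    | strip_outside_pipes
```

Note that a line with no pipes at all, such as `0.000000000 silence`, is not deleted by this script. It is emptied and stays behind as a blank line, so the output contains one more blank line than the input.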
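If you're using BSD sed, split it up so that the b command ends its own -e expression (BSD sed reads the rest of the expression after b as a label name):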
sed -e '/^ *#/b' -e 's/^[^|]*//;s/[^|]*$//;' filename      
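The question mentions roughly 700 files spread over subdirectories, which the commands above don't cover directly. One possible way to do it, as a sketch only: the function name `clean_tree` is an assumption, and it goes through a temporary file instead of `sed -i`, because the `-i` option takes different arguments in GNU and BSD sed. It overwrites the files, so try it on a copy first.

```shell
# Hypothetical wrapper (name is an assumption): rewrite every .txt file
# below the given directory (default: current directory) in place.
clean_tree() {
    find "${1:-.}" -type f -name '*.txt' -exec sh -c '
        for f in "$@"; do
            tmp=$(mktemp) || exit 1
            # Same sed script as above, written to a temporary file first.
            if sed -e "/^ *#/b" -e "s/^[^|]*//;s/[^|]*\$//" "$f" > "$tmp"; then
                # Copy the contents back so the file keeps its permissions.
                cat "$tmp" > "$f"
            fi
            rm -f "$tmp"
        done
    ' sh {} +
}

# Usage (not run here): clean_tree /path/to/files
```

Passing the files with `{} +` lets find hand many file names to one `sh` process instead of starting a new one per file.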