regex - Python regular expression to remove matching brackets in a file


I have a LaTeX file with a lot of text marked with \red{}, but there may also be brackets inside the \red{}, like \red{here \underline{underlined} text}. I want to remove the red color, and after some googling I wrote this Python script:

    import os, re, sys

    # start the program in a terminal:
    #   python redremover.py filename
    # sys.argv[1] has the value filename
    ifn = sys.argv[1]

    # open the file and read it
    f = open(ifn, "r")
    c = f.read()  # the whole file content is stored in the string c

    # remove occurrences of \red{...} in c
    c = re.sub(r'\\red\{(?:[^\}|]*\|)?([^\}|]*)\}', r'\1', c)

    # write c to a new file
    nf = open("redremoved_" + ifn, "w")
    nf.write(c)

    f.close()
    nf.close()

But this converts

\red{here \underline{underlined} text}

to

here \underline{underlined text}

which is not what I want. I want

here \underline{underlined} text

You can't match an undetermined level of nested brackets with the re module, since it doesn't support recursion. To solve that, you can use the new regex module (a third-party package, installable with pip install regex):

    import regex

    c = r'\red{here \underline{underlined} text}'

    c = regex.sub(r'\\red({((?>[^{}]+|(?1))*)})', r'\2', c)

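where (?1) is a recursive call to capture group 1. Group 1 matches a balanced {...} block, and group 2 is the content between its outer braces, which is what the replacement keeps.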
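If installing the regex module is not an option, the same result can probably be had with the standard library alone by counting braces instead of using a recursive pattern. The following is only a minimal sketch of that alternative, not part of the answer above: the function name strip_command and its command parameter are made up for illustration, and it assumes braces are never escaped (no \{ or \}) and that the command is written exactly as \red{ with no space before the brace.

```python
def strip_command(text, command="\\red"):
    """Remove every `command{...}` wrapper, keeping its (possibly nested) content.

    Hypothetical helper: assumes unescaped, balanced braces.
    """
    opener = command + "{"
    out = []
    i = 0
    while True:
        start = text.find(opener, i)
        if start == -1:
            # No more occurrences: keep the rest of the text as-is.
            out.append(text[i:])
            break
        out.append(text[i:start])

        # Walk forward from the opening brace, tracking nesting depth.
        depth = 1
        j = start + len(opener)
        while j < len(text) and depth > 0:
            if text[j] == "{":
                depth += 1
            elif text[j] == "}":
                depth -= 1
            j += 1

        if depth != 0:
            # Unbalanced braces: leave the remainder untouched.
            out.append(text[start:])
            break

        # text[j - 1] is the closing brace that matches the opener.
        inner = text[start + len(opener):j - 1]
        # Recurse so that a \red{...} nested inside another one is also removed.
        out.append(strip_command(inner, command))
        i = j
    return "".join(out)


if __name__ == "__main__":
    print(strip_command(r"\red{here \underline{underlined} text}"))
```

In the script from the question, this would replace the re.sub line with c = strip_command(c). Counting depth is what lets the inner } of \underline{...} be skipped, which the original pattern could not do because it stops at the first closing brace.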

