sequencing - Biopython SeqIO processing NNNNN in *.ab1 files -


Thanks for the help. I apologize in advance if there is a function built into Biopython that handles this; I read the whole manual and couldn't find anything.

Goal: read in a raw sequencing file (*.ab1) and process it using sequence.seq.translate(11). However, I get an error: "Bio.Data.CodonTable.TranslationError: Codon 'NNN' is invalid".

My solution: I added an additional table to CodonTable and commented out the ambiguous checker in Bio.Data.CodonTable (I had to, to make it work).

register_ncbi_table(
    name='Bacteria Sequencing Table',
    alt_name=None,
    id=24,
    table={
        'TTT': 'F', 'TTC': 'F', 'TTA': 'L', 'TTG': 'L', 'TCT': 'S',
        'TCC': 'S', 'TCA': 'S', 'TCG': 'S', 'TAT': 'Y', 'TAC': 'Y',
        'TGT': 'C', 'TGC': 'C', 'TGG': 'W', 'CTT': 'L', 'CTC': 'L',
        'CTA': 'L', 'CTG': 'L', 'CCT': 'P', 'CCC': 'P', 'CCA': 'P',
        'CCG': 'P', 'CAT': 'H', 'CAC': 'H', 'CAA': 'Q', 'CAG': 'Q',
        'CGT': 'R', 'CGC': 'R', 'CGA': 'R', 'CGG': 'R', 'ATT': 'I',
        'ATC': 'I', 'ATA': 'I', 'ATG': 'M', 'ACT': 'T', 'ACC': 'T',
        'ACA': 'T', 'ACG': 'T', 'AAT': 'N', 'AAC': 'N', 'AAA': 'K',
        'AAG': 'K', 'AGT': 'S', 'AGC': 'S', 'AGA': 'R', 'AGG': 'R',
        'GTT': 'V', 'GTC': 'V', 'GTA': 'V', 'GTG': 'V', 'GCT': 'A',
        'GCC': 'A', 'GCA': 'A', 'GCG': 'A', 'GAT': 'D', 'GAC': 'D',
        'GAA': 'E', 'GAG': 'E', 'GGT': 'G', 'GGC': 'G', 'GGA': 'G',
        'GGG': 'G', 'AAN': 'X', 'TAN': 'X', 'GAN': 'X', 'CAN': 'X',
        'ATN': 'X', 'TTN': 'X', 'GTN': 'X', 'CTN': 'X', 'ACN': 'X',
        'TCN': 'X', 'GCN': 'X', 'CCN': 'X', 'AGN': 'X', 'TGN': 'X',
        'GGN': 'X', 'CGN': 'X', 'ANA': 'X', 'TNA': 'X', 'GNA': 'X',
        'CNA': 'X', 'ANT': 'X', 'TNT': 'X', 'GNT': 'X', 'CNT': 'X',
        'ANC': 'X', 'TNC': 'X', 'GNC': 'X', 'CNC': 'X', 'ANG': 'X',
        'TNG': 'X', 'GNG': 'X', 'CNG': 'X', 'NAA': 'X', 'NTA': 'X',
        'NGA': 'X', 'NCA': 'X', 'NAT': 'X', 'NTT': 'X', 'NGT': 'X',
        'NCT': 'X', 'NAC': 'X', 'NTC': 'X', 'NGC': 'X', 'NCC': 'X',
        'NAG': 'X', 'NTG': 'X', 'NGG': 'X', 'NCG': 'X', 'NNN': 'X',
        'ANN': 'X', 'TNN': 'X', 'GNN': 'X', 'CNN': 'X', 'NAN': 'X',
        'NTN': 'X', 'NGN': 'X', 'NCN': 'X', 'NNA': 'X', 'NNT': 'X',
        'NNG': 'X', 'NNC': 'X', 'NNN': 'X'},
    stop_codons=['TAA', 'TAG', 'TGA'],
    start_codons=['TTG', 'CTG', 'ATT', 'ATC', 'ATA', 'ATG', 'GTG'])
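The N-containing entries in a table like this follow a regular pattern (every three-letter codon over A, C, G, T, N with at least one N), so they could be generated instead of typed by hand. Below is a minimal sketch in plain Python that does not touch Biopython; the helper name `n_codons` is hypothetical and only for illustration.

```python
from itertools import product


def n_codons(bases="ACGT", wildcard="N"):
    """Return every 3-letter codon over bases + wildcard that contains the wildcard."""
    alphabet = bases + wildcard
    return ["".join(c) for c in product(alphabet, repeat=3) if wildcard in c]


# Map every N-containing codon to "X" (unknown amino acid),
# mirroring the hand-written entries in the custom table.
ambiguous_entries = {codon: "X" for codon in n_codons()}
```

The resulting dict could be merged into the dict of unambiguous codons before the table is registered, which also avoids accidental duplicate keys.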

The ambiguous checker:

for n in ambiguous_generic_by_id:
    assert ambiguous_rna_by_id[n].forward_table["GUU"] == "V"
    assert ambiguous_rna_by_id[n].forward_table["GUN"] == "V"
    if n != 23:
        # For table 23, UUN = F, L or stop.
        assert ambiguous_rna_by_id[n].forward_table["UUN"] == "X"  # F or L
    # R = A or G, so URR = UAA or UGA / TRA = TAA or TGA = stop codons
    if "UAA" in unambiguous_rna_by_id[n].stop_codons and \
       "UGA" in unambiguous_rna_by_id[n].stop_codons:
        try:
            print(ambiguous_dna_by_id[n].forward_table["TRA"])
            assert False, "Should be a stop only"
        except KeyError:
            pass
    assert "URA" in ambiguous_generic_by_id[n].stop_codons
    assert "URA" in ambiguous_rna_by_id[n].stop_codons
    assert "TRA" in ambiguous_generic_by_id[n].stop_codons
    assert "TRA" in ambiguous_dna_by_id[n].stop_codons
del n

Question 1: I would prefer not to edit the root CodonTable.py file. Any suggestions on how to avoid that?

Question 2: I don't want to comment out the ambiguous checker. Can anyone help me write an exception so that the ambiguous checker ignores the new codon table?

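When you load an ABI file, Biopython sets the Seq alphabet to IUPACUnambiguousDNA(). A first approach would be to set the alphabet to SingleLetterAlphabet() instead:

# Note: Bio.Alphabet was removed in Biopython 1.78, so this applies to older releases.
from Bio import SeqIO
from Bio.Alphabet import SingleLetterAlphabet

for rec in SeqIO.parse("prots.ab1", "abi", alphabet=SingleLetterAlphabet()):
    print(rec.seq.translate(11))

Now the sequence translates, with "X" for any codon containing "N".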
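To make the expected result concrete, here is a rough sketch in plain Python (no Biopython required) of a translation that emits "X" for any codon it cannot resolve. The function name `translate_with_unknowns` is hypothetical. The table is built from the standard NCBI amino-acid string, which tables 1 and 11 share. Unlike an ambiguity-aware translation, this sketch maps every N-containing codon to "X", even one such as GCN where all four completions encode the same amino acid.

```python
from itertools import product

# NCBI ordering: bases T, C, A, G with the third position varying fastest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_with_unknowns(dna):
    """Translate DNA codon by codon, using 'X' for any codon not in the table."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


protein = translate_with_unknowns("ATGNNNGCTTAA")
```

For the input above, ATG is read as M, NNN as X, GCT as A, and TAA as a stop (`*`).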

