a guest
Mar 19th, 2018
import re


class SimplePeptide:
    """
    Allows comparisons between peptide sequences that are identical except
    for the isobaric residues isoleucine and leucine [IL]. MS proteomics
    identifies peptides from their fragment mass profiles. Since I and L have
    the same mass, they cannot be differentiated.

    Equality is implemented by replacing every I or L in the sequence with the
    literal string '[IL]'. Two peptides are then equal when the resulting
    strings match exactly, so no regex matching is needed for the comparison.

    The compiled pattern can also be used for a regex search against a larger
    sequence.

    This was written for putting the object into a pandas DataFrame:
        pdDF['SIMPLE_PEP'] = pdDF['Sequence'].apply(SimplePeptide)
    """

    def __init__(self, sequence):
        self.sequence = sequence
        # Every I or L becomes the character class [IL].
        self.pattern_string = re.sub(r'[IL]', '[IL]', sequence)
        self.pattern = re.compile(self.pattern_string)
        # Hash the normalized string so equal peptides hash equally.
        self.myhash = hash(self.pattern_string)

    def __eq__(self, other):
        # Check the type of `other`, so unrelated objects compare as unequal
        # instead of raising AttributeError.
        if isinstance(other, SimplePeptide):
            return self.pattern_string == other.pattern_string
        return False

    def __hash__(self):
        return self.myhash

    def __lt__(self, other):
        if not isinstance(other, SimplePeptide):
            return NotImplemented
        return self.pattern_string < other.pattern_string

    def __str__(self):
        return self.sequence


pep1 = SimplePeptide('SDFSDFSDFLSDFISDFR')
pep2 = SimplePeptide('SDFSDFSDFLSDFISDFR')
pep3 = SimplePeptide('SDFSDFSDFISDFLSDFR')
pep4 = SimplePeptide('XZCZXCZXCIZXCLSDFR')

print('{} = {} ? {}'.format(pep1, pep2, (pep1 == pep2)))  # SDFSDFSDFLSDFISDFR = SDFSDFSDFLSDFISDFR ? True
print('{} = {} ? {}'.format(pep1, pep3, (pep1 == pep3)))  # SDFSDFSDFLSDFISDFR = SDFSDFSDFISDFLSDFR ? True
print('{} = {} ? {}'.format(pep1, pep4, (pep1 == pep4)))  # SDFSDFSDFLSDFISDFR = XZCZXCZXCIZXCLSDFR ? False
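
The docstring mentions that the compiled pattern can also be used for a regex search against a larger sequence. Below is a minimal sketch of that use, together with set-based de-duplication that relies on `__eq__` and `__hash__` agreeing. The protein string and the short peptide sequences are made up for illustration, and the class is repeated in compact form only so the example runs on its own.

```python
import re


class SimplePeptide:
    """Compact copy of the class above so this sketch is self-contained."""

    def __init__(self, sequence):
        self.sequence = sequence
        self.pattern_string = re.sub(r'[IL]', '[IL]', sequence)
        self.pattern = re.compile(self.pattern_string)

    def __eq__(self, other):
        if not isinstance(other, SimplePeptide):
            return NotImplemented
        return self.pattern_string == other.pattern_string

    def __hash__(self):
        return hash(self.pattern_string)


# Hypothetical protein fragment, invented for this example.
protein = 'MKTAYLAKQRSDFLSDFISDFRGG'

# The query has I and L swapped relative to the protein, yet it still matches
# because each I/L position was turned into the character class [IL].
query = SimplePeptide('SDFISDFLSDFR')
match = query.pattern.search(protein)
found_at = match.start() if match else None
print(found_at)  # 10

# I/L variants collapse to a single entry in a set.
unique = {SimplePeptide(s) for s in ['PEPTIDE', 'PEPTLDE', 'PEPTIDK']}
print(len(unique))  # 2
```

The same hash and equality behavior is what would let a pandas `groupby` or `drop_duplicates` on a column of these objects treat I/L variants as one peptide, though that is not exercised here.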