ibrahim_065

DNA Errors

Apr 19th, 2021
/*
Write a function named dnaErrors that accepts two strings representing DNA sequences as parameters
and returns an integer representing the number of errors found between the two sequences,
using a formula described below. DNA contains nucleotides,
which are represented by four different letters A, C, T, and G. DNA is made up of a pair of nucleotide strands,
where a letter from the first strand is paired with a corresponding letter from the second.
The letters are paired as follows:

- A is paired with T and vice-versa.
- C is paired with G and vice-versa.

Below are two perfectly matched DNA strands. Notice how the letters are paired up according to the above rules.

"GCATGGATTAATATGAGACGACTAATAGGATAGTTACAACCCTTACGTCACCGCCTTGA"
 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
"CGTACCTAATTATACTCTGCTGATTATCCTATCAATGTTGGGAATGCAGTGGCGGAACT"

In some cases, errors occur within DNA molecules; the task of your function is to find two particular kinds of errors:

- Unmatched nucleotides, in which one strand contains a dash ('-') at a given index,
  or does not contain a nucleotide at the given index (if the strings are not the same length).
  Each of these counts as 1 error.
- Point mutations, in which a letter from one strand is matched against the wrong letter in the other strand.
  For example, A might accidentally pair with C, or G might pair with G. Each of these counts as 2 errors.

For example, consider these two DNA strands:

index 01234567890123456789012
     "GGGA-GAATCTCTGGACT"
     "CTCTACTTA-AGACCGGTACAGG"

This pair of strands has three point mutations (at indexes 1, 15, and 17),
and seven unmatched nucleotides (dashes at indexes 4 and 9, and nucleotides
in the second string with no match at indexes 18-22). The point mutations
count as a total of 3 * 2 = 6 errors, and the unmatched nucleotides count as 7 * 1 = 7 errors,
so your function would return an error count of 6 + 7 = 13 total errors if passed the two above strands.

You may assume that each string consists purely of the characters A, C, T, G, and - (the dash character),
but the letters could appear in either upper or lowercase. The strings might be the same length,
or the first or second might be longer than the other. Either string could be very long, very short,
or even the empty string. If the strings match perfectly with no errors as defined above, your function should return 0.
*/

#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>
using namespace std;

// True if c is one of the four nucleotide letters (expects uppercase).
bool isNucleotide(char c)
{
    return c == 'A' || c == 'T' || c == 'G' || c == 'C';
}

int dnaErrors(const string &T_1, const string &T_2)
{
    int len1 = T_1.size();
    int len2 = T_2.size();
    int common = min(len1, len2);
    // Indexes past the end of the shorter strand are unmatched: 1 error each.
    int total_errors = max(len1, len2) - common;
    int error_1 = 0, error_2 = 0; // unmatched nucleotides, point mutations
    for (int i = 0; i < common; i++)
    {
        // Letters may be upper or lowercase, so normalize before comparing.
        char c1 = toupper(static_cast<unsigned char>(T_1[i]));
        char c2 = toupper(static_cast<unsigned char>(T_2[i]));
        if (!isNucleotide(c1))
        {
            error_1++;
        }
        if (!isNucleotide(c2))
        {
            // A dash in both strands at the same index counts twice here;
            // the problem statement does not cover that case.
            error_1++;
        }
        else if (isNucleotide(c1))
        {
            // Both are nucleotides: anything other than A-T or C-G is a point mutation.
            if (!((c1 == 'A' && c2 == 'T') || (c1 == 'T' && c2 == 'A') || (c1 == 'G' && c2 == 'C') || (c1 == 'C' && c2 == 'G')))
            {
                error_2++;
            }
        }
    }
    total_errors += error_1 + error_2 * 2;
    return total_errors;
}

int main()
{
    string dna1, dna2;
    getline(cin, dna1);
    getline(cin, dna2);
    // dnaErrors handles either strand being longer, so no swap is needed.
    cout << "Total Errors: " << dnaErrors(dna1, dna2) << endl;

    return 0;
}
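
The scoring rules in the problem statement can also be sketched with a complement lookup instead of chained comparisons. This is only an illustrative variant, not part of the original paste: the names `complementOf` and `countDnaErrors` are hypothetical, and the sketch assumes the input contains only A, C, T, G and '-' in either case. As in the paste, a dash in both strands at the same index is counted as two errors, which is an assumption because the statement does not cover that case.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the partner of an uppercase nucleotide,
// or '\0' for any other character (such as the dash).
char complementOf(char c)
{
    switch (c)
    {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        default:  return '\0';
    }
}

// Hypothetical variant of dnaErrors built on complementOf.
int countDnaErrors(const std::string &s1, const std::string &s2)
{
    const std::size_t shorter = s1.size() < s2.size() ? s1.size() : s2.size();
    const std::size_t longer  = s1.size() < s2.size() ? s2.size() : s1.size();

    // Each index beyond the shorter strand is one unmatched nucleotide.
    int errors = static_cast<int>(longer - shorter);

    for (std::size_t i = 0; i < shorter; i++)
    {
        // Normalize case so that 'a' and 'A' are treated alike.
        char a = static_cast<char>(std::toupper(static_cast<unsigned char>(s1[i])));
        char b = static_cast<char>(std::toupper(static_cast<unsigned char>(s2[i])));

        int dashes = (a == '-' ? 1 : 0) + (b == '-' ? 1 : 0);
        if (dashes > 0)
            errors += dashes;      // unmatched: 1 error per dash
        else if (complementOf(a) != b)
            errors += 2;           // point mutation: 2 errors
    }
    return errors;
}
```

Calling `countDnaErrors("GGGA-GAATCTCTGGACT", "CTCTACTTA-AGACCGGTACAGG")` on the two strands from the problem statement gives 13, matching the worked example (3 point mutations and 7 unmatched nucleotides).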