purxiz

Code w/ Explanation

Nov 24th, 2016
#include <cstdio>   // printf
#include <cstdlib>  // atoi
#include <fstream>
#include <iostream>
#include <vector>
using namespace std;

int main () { // main function
  ifstream f; //declare a new "in file stream", i.e. a new buffer for reading a file
  f.open ("message_revised.txt"); // open the file message_revised.txt into the buffer stream
  if (!f) { // stop early if the file could not be opened, instead of silently printing nothing
    cerr << "could not open message_revised.txt" << endl;
    return 1;
  }
  char b[9] = ""; //make a new array of characters 9 long. It will store the 8 digits as text, + 1 'null-terminating character' to denote the end of the text.
  //an array holds many different data points. Think of it like 9 cells in Excel.
  //[ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]
  //That is what our array would look like, 9 empty boxes, ready to hold 9 characters.
  vector<vector<int>> combo;
  /*
    The easiest way to think of this is as a 2-d array, aka a matrix, like
    [ ]  [ ]  [ ]
    [ ]  [ ]  [ ]
    [ ]  [ ]  [ ]
    For example, this is a 3 row x 3 column matrix above, and each box can hold 1 piece of data.
    We haven't defined a size for our matrix in the code yet, and in fact we will never explicitly define a size.
    Instead, whenever we encounter a new set of 8 digits, we'll put it in a new row
    [XXXXXXXX]
    [YYYYYYYY]
    [ZZZZZZZZ]
    Like that. Next to each one we keep a counter, and whenever we encounter a duplicate, say we found [XXXXXXXX] again, we add 1 to its counter
    [XXXXXXXX][2]
    [YYYYYYYY][1]
    [ZZZZZZZZ][1]
    The above represents two occurrences of our X value, and 1 each of Y and Z.
    In reality, we'll end up with an n row x 2 column matrix, where n is the number of different combinations of 8 digits.
    Hopefully this makes sense. If not, the comments below should help.
  */
  int count = 0; //make a new variable called count, set it to 0. It will keep track of how many different combinations of 8 digits exist
  while(f.read(b, sizeof(b) - 1)){ //this line can be a little confusing, so I'll devote some space to it right below
  /*
    The word 'while' defines a loop. That is, the code between the '{' bracket at the end of the above line
    and the matching '}' further down (marked "while loop end") will run multiple times. The code in the parentheses, 'f.read(b, sizeof(b) - 1)',
    is a condition. I'll describe what read does in a second, but all you have to know now is that it reads some characters
    from the file defined by 'f' from above. If it manages to read all the characters we asked for, the condition counts as true, and the loop goes again.
    When it runs out of file before getting them all, the condition counts as false, the loop stops, and whatever is after the loop happens.
    The code f.read(b, sizeof(b) - 1) reads characters, like I said earlier.
    The first 'b', before the comma, tells it to store the characters it reads into the array 'b', which we made above.
    The part after the comma tells it how many characters to read. We say sizeof(b) - 1, because 'b' is 9 long, but we only want to
    read 8 characters, because there are 8 digits in each message segment, and we need to leave a space for the null-terminating character,
    as mentioned above. Because we wrote = "" when we made 'b', every data point in it starts out as the null-terminating character,
    so by filling in the first 8, we just leave the 9th character as null-terminating. We don't need to set it explicitly.
    We use sizeof(b) - 1 instead of just saying 8 so that the number can never get out of sync with the array. This isn't technically necessary, but is just good practice.
    A char always takes up exactly 1 byte, on 32-bit and 64-bit computers alike, so sizeof(b) is always 9 here. The benefit is that
    if we ever change the size of 'b', the number of characters we read changes with it automatically.
  */
    int found = 0; //We set found = 0 every time the loop starts, because we're checking a new 8 digits every time.
    //printf("%s\n", b); // Lines like this are for debugging the code, and can be ignored
    for(int j = 0; j < combo.size(); j++){ //a for loop is like a while loop, in that the code inside it runs multiple times.
      //in this case, our for loop runs the code once for each row in our matrix combo
      //the exact syntax is, set j = 0, then start loop: (1) If j < # of rows, run loop, then add 1 to j, then return to (1), otherwise stop looping
      //printf("checking for equality %d == %d\n", atoi(b), combo[j][0]); //ignore
      if (atoi(b) == combo[j][0]) {
        /*
          if statements execute the code inside of them if the condition in parentheses is true.
          In this case, we're checking if the 8 digits the program is currently reading (stored in b) are equal to
          any of the values we've already read (stored in our matrix combo).
          If you remember our matrix, it looks like
                0      1
          0 [XXXXXXXX][2]
          1 [YYYYYYYY][1]
          2 [ZZZZZZZZ][1]
          I added numbers to represent the rows (0, 1, and 2), and columns (0 and 1).
          You can see the values we want to check are in column 0.
          Our for loop adds 1 to j every time, starting at 0.
          So the first time it checks if b equals row 0, column 0
          then row 1, column 0
          then row 2, column 0, etc.
          You can see this in the line combo[j][0], which means row j, column 0.
          The only other thing of interest here is atoi(b). b is an array of characters, and atoi just turns it into a number.
          You can't compare an array of characters to a number (like the ones stored in combo), so we turn it into a number.
        */
          combo[j][1]++;
          /*
            If this happens, it means the if condition was met, i.e. we've already read these 8 digits.
            In that case we'd want to change
            [XXXXXXXX][1] -> [XXXXXXXX][2]
            to indicate that we've found XXXXXXXX again.
            The ++ means add 1. So we add 1 to row j, column 1.
          */
          found = 1; //we set found = 1 if we found it, will come into play in a second
      }
    }//for loop end
    if(found == 0){ //this only happens if we didn't find it.
      //printf("no match was found\n"); //ignore
      combo.push_back(vector<int> ());
      combo[count].push_back(atoi(b));
      combo[count++].push_back(1);
      /*
        The above three lines all go together. They are what happens if we have just read a new 8 digits,
        i.e. 8 digits we haven't read before. We have to add a new row to our combo matrix to hold our new number.
        If you'll remember, b has an explicit size, 9 characters long. To have an array, you need an explicit size.
        However, we don't know exactly how much data we're dealing with, so as I mentioned earlier, combo doesn't
        have an explicit size. Because of this, it can't be an array, so instead it's a vector. A vector works a lot
        like an array, except we can use 'push_back' to just keep adding data to it. An array has a defined size, so
        we can only add data until it fills up, and then, well, it's full, no more data.
        However, earlier you'll notice I said combo was a 2-D vector. This is because it's a vector, where each element
        is also a vector. So while earlier I drew it like this:
        [ ]  [ ]
        [ ]  [ ]
        [ ]  [ ]
        That's not really an accurate representation. In reality it's more like:
        [ [ ] [ ] ], [ [ ] [ ] ]
        Which is probably easier to understand if I put in our data:
        [ [XXXXXXXX] [2] ]
        [ [YYYYYYYY] [1] ]
        So you can see that we kind of have to cheat to make a "matrix." What we're really making is
        just a vector full of 2-long vectors.
        So back to the three lines above, 'combo.push_back(vector<int> ())' adds a new empty row to our outer vector:
        [ [XXXXXXXX] [2] ]
        [ [YYYYYYYY] [1] ]
        [                ]
        The <int> part just means our new row is designed to hold ints, aka integers, aka non-fraction numbers.
        In the next line, combo[count] refers specifically to the row we have just created.
        We push_back atoi(b), adding our current 8 digits into our new row. (Remember, atoi just converts b to a number.)
        Now we have:
        [ [XXXXXXXX] [2] ]
        [ [YYYYYYYY] [1] ]
        [ [ZZZZZZZZ]     ]
        Finally, the third line references our new row again, and says to push_back the value '1', as this is the 1st time
        we have found this 8 digit combo. So now we have:
        [ [XXXXXXXX] [2] ]
        [ [YYYYYYYY] [1] ]
        [ [ZZZZZZZZ] [1] ]
        And boom, we've added [ZZZZZZZZ] successfully!
        The only other thing is the count++ in the third line. This just adds 1 to count so that next time this code runs,
        combo[count] points at the next new row instead of an old one. Think of Excel again, without this line we would
        just keep putting our data in row 0 over and over. With it, we put our first data in row 0, our second data in row 1,
        third data in row 2, etc.
        (Because the ++ is after the variable name 'count', combo[count++] uses the old value of count to pick the row, and count ends up 1 larger afterwards.)
        (Yes that is a little weird, but it's just how it works, and it's pretty convenient for stuff like this.)
        (If we wrote ++count instead of count++, it wouldn't work, because ++count would add 1 first and then try to use a row that doesn't exist yet.)
      */
    }
    f.seekg(2, f.cur); //There are two spaces after each 8 characters in message_revised.txt. This just takes the current position (f.cur)
                       // and adds 2 to it, to skip those two spaces. Then when the loop runs again, it will read the next 8 characters
                       // after the spaces
  }//while loop end

  vector<int> max; //we make another vector called max
  vector<int> pos; //we make another vector called pos (short for position)
  /*
    What this part does is find the most common 8-digit long segment, aka the largest number in column 1, and then save
    its position in the vector pos. Then it sets column 1 to -1 so that the next time the loop runs, it won't find the same
    8 digits again. Since the old maximum is now -1, the new maximum will be the second most common occurrence, and so on and
    so forth. The only thing we haven't encountered yet is printf. This is what prints the values so we as humans can read them.
    It is the output of the program. Google how printf works, there's tons of stuff about it, it's pretty simple.
    You can probably figure out how this works, but if you need help, try this method.
    Delete all but the first 10 8-digit strings in message_revised.txt.
    Write down the starting values of all the variables, i.e. k = 0, max = [-5000000], pos = [0].
    Go through the code and by hand write down the result of every line. It shouldn't take too long as you're
    only dealing with 10 values.
    It will also help to write out what combo looks like for the first 10 segments.
    I.E.
    [66111166][2]
    [76666647][1]
    etc.
  */
  for (int k = 0; k < combo.size(); k++) {
    max.push_back(-5000000);
    pos.push_back(0);
    for(int i = 0; i < combo.size(); i++){
      //printf("%d\n", combo[i][0]); //ignore
      if(combo[i][1] > max[k]){
        max[k] = combo[i][1];
        pos[k] = i;
      }
    }
    printf("%d occurred %d times\n", combo[pos[k]][0], max[k]);
    combo[pos[k]][1] = -1;
  }
  f.close();
  return 0;
}
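
For comparison, here is a minimal sketch of the same "count each segment, then list them most common first" idea using std::map in place of the vector of vectors. This is a swapped-in technique, not the method the paste above uses. The names tallySegments and tallyText are made up for this example, and it assumes the segments are separated only by whitespace (the two spaces mentioned above). It also keeps each segment as text instead of running it through atoi, so a segment like 00000012 keeps its leading zeros.

```cpp
#include <algorithm>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the paste above): tallies every
// whitespace-separated segment read from `in` and returns (segment, count)
// pairs ordered from most common to least common.
std::vector<std::pair<std::string, int>> tallySegments(std::istream& in) {
  std::map<std::string, int> counts;
  std::string segment;
  while (in >> segment) {  // operator>> skips the spaces between segments
    ++counts[segment];     // a missing key starts at 0, so first sight gives 1
  }

  // Copy the (segment, count) pairs out of the map so they can be reordered.
  std::vector<std::pair<std::string, int>> ranked(counts.begin(), counts.end());

  // stable_sort keeps segments with equal counts in the map's sorted order.
  std::stable_sort(ranked.begin(), ranked.end(),
                   [](const std::pair<std::string, int>& a,
                      const std::pair<std::string, int>& b) {
                     return a.second > b.second;
                   });
  return ranked;
}

// Hypothetical convenience wrapper so the helper can be tried on a plain string.
std::vector<std::pair<std::string, int>> tallyText(const std::string& text) {
  std::istringstream in(text);
  return tallySegments(in);
}
```

To try it on the real data, you would open message_revised.txt with an ifstream, pass that stream to tallySegments, and print each pair's first and second members in a loop. The map does the "have we seen this before?" lookup for us, so there is no inner for loop over every row and no found flag to manage.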