- #include <iostream>
- #include <fstream>
- #include <vector>
- #include <cstdio> // for printf
- #include <cstdlib> // for atoi
- using namespace std;
- int main () { // main function
- ifstream f; //declare a new "in file stream" i.e. new buffer for reading a file
- f.open ("message_revised.txt"); // open the file message_revised.txt into the buffer stream
- char b[9] = ""; //make a new array of characters 9 long, this will store the 8 digits as text, + 1 'null-terminating character' to denote the end of the text.
- //an array holds many different data points. Think of it like 9 cells in Excel.
- //[ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]
- //That is what our array would look like, 9 empty boxes, ready to hold 9 characters.
- vector<vector<int>> combo;
- /*
- The easiest way to think of this is a 2-d array aka matrix like
- [ ] [ ] [ ]
- [ ] [ ] [ ]
- [ ] [ ] [ ]
- For example, this is a 3 row x 3 column matrix above, each box can hold 1 piece of data.
- We haven't defined a size for our matrix in the code yet, and in fact we will never explicitly define a size.
- Instead, whenever we encounter a new 8-digits, we'll put it in a new row
- [XXXXXXXX]
- [YYYYYYYY]
- [ZZZZZZZZ]
- Like that. Whenever we encounter a duplicate, say for example we found [XXXXXXXX] again, we add a counter to our matrix
- [XXXXXXXX][2]
- [YYYYYYYY][1]
- [ZZZZZZZZ][1]
- The above represents two occurrences of our X value, and 1 of each of Y and Z.
- In reality, we'll end up with a n row x 2 column matrix, where n is our number of different combinations of 8-digits.
- Hopefully this makes sense, if not the comments below should help
- */
- int count = 0; //make a new variable called count, set it to 0. It will keep track of how many different combinations of 8 digits exist
- while(f.read(b, sizeof(b) - 1)){ //this line can be a little confusing, so I'll devote some space to it right below
- /*
- the word 'while' defines a loop. That is, the code in between the '{' bracket at the end of the above line,
- and its matching '}' further down (marked "while loop end"), will run multiple times. the code in the parentheses 'f.read(b, sizeof(b) -1)'
- is a condition. I'll describe what read does in a second, but all you have to know now is that it reads some characters
- from the file defined by 'f' from above. If it manages to read all the characters we asked for, the condition counts as true, and the loop goes again.
- When it runs out of characters (i.e. it hits the end of the file), the condition counts as false, and the loop stops, and whatever is after the loop happens.
- The code f.read(b, sizeof(b) -1) reads characters, like I said earlier.
- The first 'b', before the comma, tells it to store the characters it reads into the array 'b', which we made above.
- the part after the comma tells it how many characters to read. We say sizeof(b) - 1, because 'b' is 9 long, but we only want to
- read 8 characters, because there are 8 digits in each message segment, and we need to leave a space for the null terminating character,
- as mentioned above. Because we wrote = "" when we made 'b', every data point in the array starts out as the null-terminating character, so by filling in the first 8, we just
- leave the 9th character as null-terminating. We don't need to set it explicitly.
- We use sizeof(b) -1 instead of just saying 8 so that the number of characters we read always matches the size of 'b'. This isn't technically necessary, but is just good practice.
- sizeof(b) is the size of the array in bytes, and a char is always exactly 1 byte, so sizeof(b) is 9 on every computer, 32-bit or 64-bit.
- The benefit is that if we ever change 'b' to hold longer segments, the read length updates automatically, and we can't accidentally read more characters than fit.
- */
- int found = 0; //We set found = 0 every time the loop starts, because we're checking a new 8 digits every time.
- //printf("%s\n", b); // Lines like this are for debugging the code, and can be ignored
- for(int j = 0; j < combo.size(); j++){ //a for loop is like a while loop, in that the code inside it runs multiple times.
- //in this case, our for loop runs the code once for each row in our matrix combo
- //the exact syntax is, set j = 0, then start loop: (1) If j < # of rows, run loop, then add 1 to j, then return to (1), otherwise stop looping
- //printf("checking for equality %d == %d\n", atoi(b), combo[j][0]); //ignore
- if (atoi(b) == combo[j][0]) {
- /*
- if statements execute the code inside of them if the condition in parentheses is true.
- In this case, we're checking if the 8 digits the program is currently reading (stored in b) are equal to
- any of the values we've already read (stored in our matrix combo).
- if you remember our matrix, it looks like
- 0 1
- 0 [XXXXXXXX][2]
- 1 [YYYYYYYY][1]
- 2 [ZZZZZZZZ][1]
- I added numbers to represent the rows (0,1, and 2), and columns (0, and 1).
- You can see the values we want to check are in column 0.
- Our for loop adds 1 to j every time, starting at 0.
- So the first time it checks if b = row 0, column 0
- then row 1, column 0
- then row 2, column 0, etc.
- you can see this in the line combo[j][0], which means row j, column 0
- The only other thing of interest here is atoi(b). b is an array of characters, atoi just turns it into a number.
- You can't compare an array of characters to a number (like the ones stored in combo), so we turn it into a number
- */
- combo[j][1]++;
- /*
- If this happens, it means the if condition was met, i.e. we've already read these 8 digits.
- In that case we'd want to change
- [XXXXXXXX][1] -> [XXXXXXXX][2]
- to indicate that we've found XXXXXXXX again
- the ++ means add 1. So we add 1 to row j, column 1
- */
- found = 1; //we set found = 1 if we found it, will come into play in a second
- }
- }//for loop end
- if(found == 0){ //this only happens if we didn't find it.
- //printf("no match was found\n"); //ignore
- combo.push_back(vector<int> ());
- combo[count].push_back(atoi(b));
- combo[count++].push_back(1);
- /*
- The above three lines all go together. They are what happens if we have just read a new 8 digits
- i.e. 8 digits we haven't read before. We have to add a new row to our combo matrix to add our new number.
- If you'll remember, b has an explicit size, 9 characters long. To have an array, you need an explicit size.
- However, we don't know exactly how much data we're dealing with, so as I mentioned earlier, combo doesn't
- have an explicit size. Because of this, it can't be an array, so instead it's a vector. A vector works a lot
- like an array, except it can grow: we can use 'push_back' to just keep adding data to it. An array has a defined size, so
- we can only add data until it fills up, and then, well, it's full, no more data.
- However, earlier you'll notice I said combo was a 2-D vector. This is because it's a vector, where each element
- is also a vector. So while earlier I drew it like this:
- [ ] [ ]
- [ ] [ ]
- [ ] [ ]
- That's not really an accurate representation. In reality it's more like:
- [ [ ] [ ] ], [ [ ] [ ] ]
- Which is probably easier to understand if I put in our data:
- [ [XXXXXXXX] [2] ]
- [ [YYYYYYYY] [1] ]
- So you can see that we kind of have to cheat to make a "matrix." What we're really making is
- just a vector full of 2-long vectors.
- So back to the three lines above, 'combo.push_back(vector<int> ())' adds a new empty space in our outer vector:
- [ [XXXXXXXX] [2] ]
- [ [YYYYYYYY] [1] ]
- [ ]
- the <int> part just means our new row is designed to hold ints, aka integers aka non-fraction numbers.
- In the next line combo[count] refers specifically to the row we have just created.
- we push_back, atoi(b), adding our current 8 digits into our new row. (remember atoi just converts b to a number)
- Now we have:
- [ [XXXXXXXX] [2] ]
- [ [YYYYYYYY] [1] ]
- [ [ZZZZZZZZ] ]
- Finally, the third line references our new row again, and says to push_back the value '1' as this is the 1st time
- we have found this 8 digit combo. So now we have:
- [ [XXXXXXXX] [2] ]
- [ [YYYYYYYY] [1] ]
- [ [ZZZZZZZZ] [1] ]
- And boom, we've added [ZZZZZZZZ] successfully!
- The only other thing is the count++ in the third line. This just adds 1 to count so that next time this code runs,
- it will add a new row below, instead of overwriting the old one. Think of Excel again, without this line we would
- just keep putting our data in row 0 over and over. With it, we put our first data in row 0, our second data in row 1
- , third data in row 2, etc.
- (because the ++ is after the variable name 'count', combo[count++] uses the old value of count to pick the row, and count is 1 bigger afterwards.)
- (Yes that is a little weird, but that's just how it works, and it's pretty convenient for stuff like this.)
- (If we wrote ++count, instead of count++, it wouldn't work, because ++count would add 1 to count first, and we'd be pointing at a row that doesn't exist yet)
- */
- }
- f.seekg(2, f.cur); //There are two spaces after each 8 characters in message_revised.txt. This just takes the current position (f.cur),
- // and adds 2 to it, to skip those two spaces. Then when the loop runs again, it will read the next 8 characters
- // after the spaces
- }//while loop end
- vector<int> max; //we make another vector called max
- vector<int> pos; //we make another vector called pos (short for position)
- /*
- What this part does, is find the most common 8-digit long segment, aka the largest number in column 1, and then saves
- its position in the vector pos. Then it sets column 1 to -1 so that the next time the loop runs, it won't find the same
- 8-digits again. Since the old maximum is now -1, the new maximum will be the second most common occurrence, and so on and
- so forth. The only thing we haven't encountered yet is printf. This is what prints the values so we as humans can read them.
- It is the output of the program. Google how printf works, there's tons of stuff about it, it's pretty simple.
- You can probably figure out how this works, but if you need help, try this method.
- Delete all but the first 10 8-digit strings in message_revised.txt
- Write down the starting values of all the variables. i.e. k = 0, max = [-5000000], pos = [0]
- go through the code and by hand write down the result of every line. It shouldn't take too long as you're
- only dealing with 10 values.
- It will also help to write out what combo looks like for the first 10 segments.
- I.E.
- [66111166][2]
- [76666647][1]
- etc.
- */
- for (int k = 0; k < combo.size(); k++) {
- max.push_back(-5000000);
- pos.push_back(0);
- for(int i = 0; i < combo.size(); i++){
- //printf("%d\n", combo[i][0]); //ignore
- if(combo[i][1] > max[k]){
- max[k] = combo[i][1];
- pos[k] = i;
- }
- }
- printf("%d occurred %d times\n", combo[pos[k]][0], max[k]);
- combo[pos[k]][1] = -1;
- }
- f.close();
- return 0;
- }
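
For comparison, here is a minimal sketch of the same counting idea using std::map instead of a vector of vectors. It is not part of the program above, just one possible alternative. It assumes the segments are separated by spaces or newlines (so >> can split them and no seekg is needed), and the function names count_segments, most_common_first and rank_text are made up for this example. It keeps each segment as text instead of calling atoi, so a segment that starts with 0 keeps its leading zero.

```cpp
#include <algorithm>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Count how often each whitespace-separated segment appears in the input.
// Assumption: segments are split by spaces/newlines, so operator>> skips
// the gaps for us.
std::map<std::string, int> count_segments(std::istream& in) {
    std::map<std::string, int> counts;
    std::string segment;
    while (in >> segment) {
        ++counts[segment];  // operator[] creates a missing entry as 0, then we add 1
    }
    return counts;
}

// Return (segment, count) rows ordered from most to least common.
// stable_sort keeps tied rows in the map's sorted-by-text order.
std::vector<std::pair<std::string, int>> most_common_first(
        const std::map<std::string, int>& counts) {
    std::vector<std::pair<std::string, int>> rows(counts.begin(), counts.end());
    std::stable_sort(rows.begin(), rows.end(),
        [](const std::pair<std::string, int>& a,
           const std::pair<std::string, int>& b) { return a.second > b.second; });
    return rows;
}

// Hypothetical helper for trying the idea on a string instead of a file.
std::vector<std::pair<std::string, int>> rank_text(const std::string& text) {
    std::istringstream in(text);
    return most_common_first(count_segments(in));
}
```

To use it on the real file, you could pass an ifstream opened on message_revised.txt to count_segments and print each row of most_common_first. The map does the "have we seen this before?" search for us, which replaces the inner for loop and the found flag.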