- % Partial solution to Lab 9, the 'elegant' way (using regular expressions)
- %
- % Note : Extreme hackiness required because MATLAB is shithouse at string
- % handling and requires strings to be stored in cell arrays.
- %
- % Note: Uses the containers.Map data structure, only available in MATLAB
- % R2008b and later. (Why didn't MATLAB have an associative array data type
- % in the first place?!)
- %
- % Li-aung Yip, liaung.yip@ieee.org
- ElementSymbols = { 'H', 'He', 'Li', 'Be', 'O' };
- % Mapping of element symbols to element weights.
- % NOTE: these are illustrative values only; several of them are not real atomic weights.
- ElementWeights = containers.Map({ 'H', 'He', 'Li', 'Be', 'O', 'Fe' }, { 1, 2, 3, 4, 16, 26 });
- Input = 'Fe2H2SO4Fe2O3';
- % Step 1 : "Tokenise" the input string.
- % That is, split the input into (element symbol)(number of atoms) pairs like so:
- % Water: 'H2O' -> 'H2', 'O'
- % Sulphuric Acid 'H2SO4' -> 'H2', 'S', 'O4'
- % Rust 'Fe2O3' -> 'Fe2', 'O3'
- % Hexane 'CH3CH2CH2CH2CH2CH3' -> 'C', 'H3', 'C', 'H2', .... 'C','H3'.
- %
- % What's the pattern here? Each 'token' consists of an element symbol,
- % optionally followed by a number. (An element symbol consists of one
- % capital letter, optionally followed by some lowercase letters.)
- TokenPattern = '[A-Z][a-z]*\d*';
- Tokens = regexp(Input,TokenPattern,'match')
- for Token = Tokens
-     % Step 2 : Convert tokens like 'Fe2' to the symbol 'Fe' and the number 2.
-     % Looping over a cell array yields a 1x1 cell, so unwrap it to a plain string first.
-     TokenText = Token{1};
-     % Pull out the letters at the start
-     ElementSymbol = regexp(TokenText,'^[A-Z][a-z]*','match','once');
-     % Pull out the number at the end
-     NumAtoms = regexp(TokenText,'\d+$','match','once');
-     if ( isempty(NumAtoms) ) % If there were no numbers, assume one atom.
-         NumAtoms = 1;
-     else
-         NumAtoms = str2double(NumAtoms);
-     end
-     if ( isKey(ElementWeights, ElementSymbol) )
-         ElementWeight = ElementWeights(ElementSymbol);
-         TokenWeight = ElementWeight * NumAtoms;
-         fprintf('Token %-6s | Symbol: %-3s | Number of: %-3i | Weight : %i * %i = %i\n',TokenText,ElementSymbol,NumAtoms,NumAtoms,ElementWeight,TokenWeight);
-     else
-         fprintf('I don''t know the weight of the element symbol %s\n',ElementSymbol);
-     end
- end
- Sample Output:
- Tokens =
- 'Fe2' 'H2' 'S' 'O4' 'Fe2' 'O3'
- Token Fe2    | Symbol: Fe  | Number of: 2   | Weight : 2 * 26 = 52
- Token H2     | Symbol: H   | Number of: 2   | Weight : 2 * 1 = 2
- I don't know the weight of the element symbol S
- Token O4     | Symbol: O   | Number of: 4   | Weight : 4 * 16 = 64
- Token Fe2    | Symbol: Fe  | Number of: 2   | Weight : 2 * 26 = 52
- Token O3     | Symbol: O   | Number of: 3   | Weight : 3 * 16 = 48
- Which is most of the way to a solution.
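The remaining step is presumably to add the token weights together. The following is only a sketch of one way to do that, not the lab's official answer. It reuses the same illustrative weights as the script, and it swaps the two separate regexp calls for a single call with named tokens. The names `symbol`, `count`, `Parts` and `TotalWeight` are made up for this example, and stopping with an error on an unknown symbol (rather than printing a message and carrying on) is an assumption about what is wanted.

```matlab
% Hypothetical completion: sum the weight of every token in a formula.
% The weights are the same illustrative values as the script above.
ElementWeights = containers.Map({ 'H', 'He', 'Li', 'Be', 'O', 'Fe' }, ...
                                { 1, 2, 3, 4, 16, 26 });
Input = 'Fe2O3';

% Named tokens capture the symbol and the optional count in one pass.
% In MATLAB this returns a 1-by-n struct array, one element per match.
Parts = regexp(Input, '(?<symbol>[A-Z][a-z]*)(?<count>\d*)', 'names');

TotalWeight = 0;
for k = 1:numel(Parts)
    Symbol = Parts(k).symbol;
    if isempty(Parts(k).count)
        NumAtoms = 1;                          % no digits means one atom
    else
        NumAtoms = str2double(Parts(k).count);
    end
    if ~isKey(ElementWeights, Symbol)
        % Assumption: a partial total is worse than no total, so stop here.
        error('Unknown element symbol: %s', Symbol);
    end
    TotalWeight = TotalWeight + ElementWeights(Symbol) * NumAtoms;
end

fprintf('Total weight of %s = %g\n', Input, TotalWeight);
% With the illustrative weights this prints: Total weight of Fe2O3 = 100
```

Running it on 'Fe2H2SO4Fe2O3' would stop at 'S', because the map has no entry for sulphur; adding the missing symbols to the map is all that is needed to handle longer formulas.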