Advertisement
mixster

Predecrement C

Oct 23rd, 2013
408
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
  1. /* The top level of a --C program a series of 1 or more function definitions.
  2.  * Not having a function named "main" should be considered an error, though
  3.  * "main" may appear anywhere.
  4.  * Comments can also appear on the top level, but they should be tokenised and
  5.  * then dropped (https://www.youtube.com/watch?v=YLdCKDyxQeI) during the
  6.  * lexical analysis.
  7.  */
  8.  
  9. // I'm going to use regular yacc syntax in this document because.
  10.  
  11. /*     function_definition : identifier identifier LBRACKET param_list RBRACKET block
  12.  *                         ;
  13.  
  14. type_identifier identifier
  15.  
  16.  * (names for stuff to be decided)
  17.  *
  18.  * (side note: Params - in the function definition. Args - in the function
  19.  * call.)
  20.  * There are only three types (int, function, void) and one of them is only to
  21.  * say that a function doesn't return anything
  22.  
  23. I'm thinking no void and default to returning 0 in int functions and raising an error in function returning functions.
  24.  
  25. . We may want to just use an
  26.  * IDENTIFIER token for both of them, but an identifier rule would be useful so
  27.  * we can have it do t[0] = t[1].value for all the other times we need to use
  28.  * such identifiers.
  29.  *     def p_identifier(t):
  30.  *         """identifier : IDENTIFIER"""
  31.  *         t[0] = t[1].value
  32.  * (side note: yacc case sensitivity?)
  33.  * (of course we may want it to output something like ("ID", t[1].value) so the
  34.  * execution part can easily tell what kind of eldrich horror it has on its
  35.  * virtual hands)
  36.  *
  37.  * We might also want to separate type identifiers and variable identifiers so
  38.  * that we can get yacc to deal with the fact that there are only three possible
  39.  * types at parse time.
  40.  *     type : TYPE_INT
  41.  *          | TYPE_FUNCTION
  42.  *          ;
  43.  * If we were feeling particularly enterprising/insane we could even make a
  44.  * separate function_type rule for dealing with functions.
  45.  *     type_void : TYPE_VOID
  46.  *               ;
  47.  *     function_type : type
  48.  *                   | type_void
  49.  *                   ;
  50.  * (using "type_void" instead of "TYPE_VOID" works better)
  51.  */
  52. function negative_double_multiplier(int m) {
  53.     int m2 = -2 * m;
  54.     int f(int n) {
  55.         return n * m2;
  56.     }
  57.     return f;
  58. }
  59.  
  60. /*     block : LCURLY declare_list statement_list RCURLY
  61.  *           ;
  62.  
  63. We'll want block : { statement_list } | { declare_list statement_list }
  64. So we don't have to include an empty statement_list
  65.  
  66.  * A block is the body of a function, an if statement, a while loop etc.
  67.  * I'm not sure about this, but would a function definition such as maybe_this
  68.  * below be valid?
  69.  */
  70. int maybe_this(int a, int b)
  71.     return a + b;
  72. /* If so, then the "block" in function_definition above could be replaced with
  73.  * "statement".
  74.  * (side note: statements are executed, expressions are evaluated(?))
  75.  *     statement : block
  76.  *               | expression SEMICOLON
  77.  *               | assign_statement SEMICOLON
  78.  *               | function_definition
  79.  *               | while_statement
  80.  *               | if_statement
  81.  *               | return_statement SEMICOLON
  82.  *               | iter_control SEMICOLON
  83.  *               ;
  84.  
  85. Your side note doesn't make sense as evaluation should be the same as execution. Possibly you can split them based on execution affecting environment.
  86.  
  87. Either way, I'd think:
  88.     statement : block | expression | control_statement
  89.  
  90. control_statement them subsumes while, if, return, iter (we don't need iteration as far as I know, just while)
  91. expression subsumes assignment as (a = b) should evaluate to b in general C and it's easy enough to do
  92. function_definition should only be permitted in declare_list
  93.  
  94.  * The only difference between a function's block and an if statement's one is
  95.  * that the function also has the calling arguments in the local environment.
  96.  
  97. Not a difference at all really, since the calling arguments should behave like a parent environment or be initially set for the current environment
  98.  
  99.  * (also it can't contain iter_control statements (break, continue) as top-level
  100.  * statements, but when not inside a while loop neither can if statements, so
  101.  * we will also have to some logic for that somewhere (while possible,
  102.  * making an alternate statement definition, if_statement definition etc might
  103.  * not be the best (but it would allow for yacc to detect wrongly placed
  104.  * iter_control statements))).
  105.  
  106. I don't think you can elegantly do this in the grammar, however you can maybe set a loop_counter that gets incremented when inside a loop, decerementing when leaving, raising an error when trying to leave l_c=0.
  107.  
  108. In the interpreter, detecting this is trivial since control statements should just bubble up the call chain until reaching a loop object or function object, raising an error if it reaches a function object (break, continue) or the "parent of root" (outside of function return)
  109.  
  110.  *
  111.  * Declaration statements (while they are statements) aren't here because they
  112.  * aren't allowed in statement_list, only in declare_list.
  113.  *
  114.  * One other point: do embedded function definitions have to be at the top of
  115.  * the block (that is to say in the declare_list)? If so then obviously
  116.  * function_definition shouldn't be here.
  117.  *
  118.  *     declare_start : type identifier
  119.  *                   ;
  120.  * I separate this from declare because it can also be used in param_list.
  121.  * (unrelated: (lambda x: x(x))(lambda y: y(y)))
  122.  *     declare : declare_start SEMICOLON
  123.  *             | declare_start ASSIGN expression SEMICOLON
  124.  *             ;
  125.  * ASSIGN is "=", I used the name because it is less ambiguous. Equality is
  126.  * "EQ".
  127.  *
  128.  *     expression : function_call
  129.  *                | unary_expression
  130.  *                | infix_expression
  131.  *                | bracket_expression
  132.  *                | number
  133.  *                | identifier
  134.  *                ;
  135.  
  136. I massively suspect reading about expression grammars is important here to make sure stuff like precedence is inherent in the grammar where feasible.
  137.  
  138.  * I may have forgotten one or two things here.
  139.  */
  140.  
  141. int main() {
  142.     int x;
  143.     function multi = negative_double_multiplier(7);
  144.     x = multi(multi(5));
  145.     return maybe_this(2, x);
  146. }
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement