a guest
Sep 17th, 2024
(*
**** Hacheme ****
*)

(*p
\documentclass[a4paper]{report}
\usepackage{amsmath,amssymb,amsthm}
\usepackage{ocamlweb}
\usepackage{makeidx,glossaries,epigraph}
\usepackage{filecontents}
\usepackage{simpletrs}
\usepackage{expl3}
\usepackage[backend=biber,style=authoryear]{biblatex}

\addbibresource{references.bib}
\input{hacheme-command-definitions}
\input{hacheme-glossary-definitions}
\input{hacheme-index-definitions}
\makeindex
\makeglossaries

\begin{document}

The booklet before you presents the source code of \hacheme, an implementation of the \hachemestd, in a literate manner. This is not a textbook, nor is it a tutorial or a work of academia; it is merely a \litprogram. This form also allows us to apply \formeth{}s to, albeit partially, \verif\ this program --- and, furthermore, to delve into the history of \scheme\ and \lisp.


\section*{Map to the program}\label{sec:prog_map}


\section{A brief history of \lisp}\label{sec:lisp_hist}

\epigraph{``It therefore seems to be desirable to attempt to construct an artificial language which a computer can be programmed to use on problems and self-reference. It should correspond to English in the sense that short English statements about the given subject matter should have short correspondents in the language and so should short arguments or conjectural arguments.''}{J. McCarthy}

In the mid-1950s, \jmccarthy, an assistant professor at \dartmouth, was presented with the language \ipl, toward which he showed little excitement due to its low level and rough shape \parencite{stoyan1984early}. In a surviving archival note, he distills \textit{the problem of programming a computer} into two bullet points: the first is that ``there does not exist an adequate language for human beings to describe procedures to each other'' --- a language which he felt must be \textit{explicit, universal and concise}; the second is that computers of the time needed extra ``engineering consideration'' to be able to understand such a language \parencite{stoyan1984early}.

During a visit to \ibm, he was impressed by the engineering success of the \fortran\ language, developed by a team led by \jbackus. During September and October of that year, \jmccarthy\ wrote some memos about the new language he was brewing in his mind \parencite{stoyan1979lisp}. McCarthy claimed he ``did not understand Church's paper'', so the \lambdacalc\ did not play much of a role in the early construction of \lisp\footnote{However, as we will see later, the \lambdacalc\ plays a huge role in the evolution of \scheme.}.

% TODO: Complete history of lisp

\section{History of \scheme}\label{sec:scheme_hist}


% TODO: Complete history of scheme


*)

(*

\section{\moduledef{Syntax}}\label{sec:mod_syntax}

\subsection{About the \absyn\ of \scheme}

The \modsty{Syntax} module defines the \syntax\ for \hacheme. As mentioned in \ref{sec:about_scheme}, the language uses fully-parenthesized \index{prefix} notation --- in other words, \sexpr{}s. The \modsty{Syntax} module reflects this, but also encodes several \textit{peculiarities} of the \scheme-flavored \sexp.

Were we to use just a \textit{basic} syntax for an \sexp\ language, \sexpatom{}s and \sexplist{}s would have sufficed. But our aim is to encode the syntax of \scheme, not just a run-of-the-mill \sexp\ syntax!

That is why the \modsty{Syntax} module represents the \absyn\ for \scheme\ and adds extra \hacheme-specific elements to it. In reality, the best way to convey an \absyn\ is with \sexp{}s (\parencite{turbak2008design}\refextsection{2.3}). So languages like \scheme, which use a fully-parenthesized syntax, are not only easier to parse; each construct also conveys more information \textit{symbolically}.

The \scheme\ syntax is made up of \datum{}s which, in turn, form an \expr.

% TODO: Complete this section

*)
(*
\moduledefbegin{Syntax}
*)
module Syntax = struct
  (*
  \marktype{Syntax}{ProductType}{t}{The base type for \modsty{Syntax}}
  *)
  type t =
    { contents : expr list
    ; location : Location.t
    }

  (*
  \marktype{Syntax}{SumType}{datum}{The \datum{}s, please see \ref{subsec:syntax_datum}}
  *)
  and datum =
    | DATUM_Boolean of bool
    | DATUM_String of string
    | DATUM_Char of char
    | DATUM_Label of label * datum
    | DATUM_List of datum list
    | DATUM_Symbol of label
    | DATUM_Vector of datum array
    | DATUM_Pair of datum * datum

  (*
  \marktype{Syntax}{SumType}{quote}{The \quote{}s, please see \ref{subsec:syntax_quote}}
  *)
  and quote =
    | QUOTE_Quote of expr
    | QUOTE_Quasiquote of expr
    | QUOTE_Unquote of expr
    | QUOTE_Splice of expr

  (*
  \marktype{Syntax}{SumType}{formals}{The \formal{}s, please see \ref{subsec:syntax_formals}}
  *)
  and formals =
    | FORMALS_IdentList of ident list
    | FORMALS_SoloIdent of ident
    | FORMALS_IdentPair of ident list * ident

  (*
  \marktype{Syntax}{SumType}{expr}{The \expr{}s, please see \ref{subsec:syntax_expr}}
  *)
  and expr =
    | EXPR_Ident of ident
    | EXPR_Thunk of datum
    | EXPR_Call of expr * expr list
    | EXPR_Lambda of ident option * formals * body
    | EXPR_Cond of Conditional.t
    | EXPR_Assign of ident * expr
    | EXPR_Macro of Macro.t
    | EXPR_Include of Include.t
    | EXPR_Quote of quote
    | EXPR_Derived of DerivedExpr.t

  (*
  Auxiliary types for the \modsty{Syntax} module.
  *)
  and ident = string

  and label = string

  and body = expr list
end

(*
\moduledefend{Syntax}
*)
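To make the shape of these types concrete, here is a small, self-contained sketch (separate from the \modsty{Syntax} module above, keeping only the two constructors it needs) that builds the \scheme\ datum `(+ 1 2)` and renders it back to its fully-parenthesized concrete syntax. The `render` helper is hypothetical, not part of \hacheme.

```ocaml
type label = string

type datum =
  | DATUM_Symbol of label
  | DATUM_List of datum list

(* (+ 1 2) as a list datum; the numbers are approximated as symbols
   here, since this trimmed-down datum type has no numeric constructor *)
let plus_one_two =
  DATUM_List [ DATUM_Symbol "+"; DATUM_Symbol "1"; DATUM_Symbol "2" ]

(* render a datum back into fully-parenthesized concrete syntax *)
let rec render (d : datum) : string =
  match d with
  | DATUM_Symbol s -> s
  | DATUM_List ds -> "(" ^ String.concat " " (List.map render ds) ^ ")"

let () = print_endline (render plus_one_two)
```

Running this prints `(+ 1 2)`, illustrating how the nested sum type mirrors the nested parentheses of the source text.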

(*

\section{\moduledef{Parser}}\label{sec:mod_parser}

\subsection{A theory for \sexpr}\label{subsec:theory_sexpr}

A theory for \sexpr{}s was proposed in \parencite{sato1983theory} and developed further in \parencite{sato1985theory}.

% TODO: Discuss Sato's Sexp theory

\subsection{The challenges in parsing \scheme}\label{subsec:challenges_parse_scheme}

Parsing \sexpr{}s is fairly simple; \scheme, however, brandishes them with its own flavor of \textit{extensibility}, meaning that a \scheme\ program is a living thing: it is not merely what is written down in a file, but gets expanded again and again at runtime. This makes a native-code \index{compiler} for \scheme\ quite difficult to build. \parencite{dybvig1987models} discusses three models for implementing \scheme, and seminal as that dissertation is (I have made good use of it as well; this literate program is peppered with citations of it), all three models are what we would today consider \textit{interpretive}. Since \hacheme\ is a compiler and not an interpreter, it must use \partialeval\ to first bring all expressions into a \normalform. The \modsty{Parser} must therefore provide facilities that will be used repeatedly.

*)
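As a toy illustration of this idea (not \hacheme's actual machinery; the type and function names below are invented for the example), partial evaluation can be pictured as folding every constant subexpression, leaving a residual normal form in which only genuinely dynamic parts remain:

```ocaml
(* a minimal expression language: integers, free variables, addition *)
type exp =
  | Num of int
  | Var of string
  | Add of exp * exp

(* fold constant subexpressions; leave dynamic ones as residual code *)
let rec partial_eval (e : exp) : exp =
  match e with
  | Num _ | Var _ -> e
  | Add (a, b) ->
    (match partial_eval a, partial_eval b with
     | Num x, Num y -> Num (x + y)   (* both sides known: fold *)
     | a', b' -> Add (a', b'))       (* otherwise keep the residual *)

(* (+ (+ 1 2) x) partially evaluates to (+ 3 x) *)
let () =
  assert (partial_eval (Add (Add (Num 1, Num 2), Var "x"))
          = Add (Num 3, Var "x"))
```

The assertion shows the normal form: the constant subtree `(+ 1 2)` disappears at compile time, while the variable `x` survives into the residual program.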

(*
\moduledefbegin{Parser}
*)

module Parser = struct

  (*
  \submoduledefbegin{Lexeme}

  The \submodsty{Lexeme} submodule holds the \lexeme\ types for \hacheme.

  \asidebox{Did you know...}{A \textit{lexeme} is like a \textit{raw token}. Just as dough put into the oven bakes and becomes bread, a \lexeme\ put through a \gengrammar\ becomes a \token.}

  *)
  module Lexeme = struct
    type t =
      | LEXEME_Number of string
      | LEXEME_Ident of string
      | LEXEME_String of string
      | LEXEME_Symbol of string
      | LEXEME_Char of string
      | LEXEME_Delim of string
      | LEXEME_Punct of string
  end

  (*
  \submoduledefend{Lexeme}
  *)

  (*
  \markexn{End_of_consumption}{This is raised at the end of the \textit{consume\_\ldots} functions}
  *)
  exception End_of_consumption

  (*
  \markmodfn{Parser}{scan_hacheme}{The lexical scanner for \hacheme, please see \ref{subsec:lexical_scanning_of_hacheme}}
  *)
  let scan_hacheme (input : string) : Lexeme.t list =
    let char_list = Utils.explode input in

    (*
    \markfnfn{scan_hacheme}{scan_number}{The \index{scan_number@\texttt{scan\_number}} function will be invoked when we reach a digit}
    *)
    let scan_number (input' : char list) : string * char list =
      let buffer = Buffer.create 21 in

      (*
      \markrecfn{scan_number}{scan_hacheme}{consume_numeric_chars}{The \recfn\ which will gobble up any numeric symbol}
      *)
      let rec consume_numeric_chars rest =
        match rest with
        | (('0' .. '9' | 'a' .. 'e' | 'E' | '.') as ch) :: tl ->
          Buffer.add_char buffer ch;
          consume_numeric_chars tl
        | _ -> (Buffer.contents buffer, rest)
      in
      consume_numeric_chars input'
    in

    (* TODO: Drive the scanner over char_list, dispatching to scan_number
       and the other consume functions *)
    ignore scan_number;
    ignore char_list;
    []

end

(*
\moduledefend{Parser}
*)
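The scanner above leans on `Utils.explode`, which is defined elsewhere in the program and not shown in this chunk. A plausible definition (an assumption about that helper, not necessarily \hacheme's own) simply turns a string into its list of characters:

```ocaml
(* turn "abc" into ['a'; 'b'; 'c']; the empty string yields [] *)
let explode (s : string) : char list =
  List.init (String.length s) (String.get s)

let () = assert (explode "abc" = ['a'; 'b'; 'c'])
```

Working over a `char list` rather than indexing into the string keeps the consume functions purely pattern-matching, at the cost of allocating one cons cell per character.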
  229.  
(*

\newpage
\printindex

\newpage
\printglossaries

\newpage
\printbibliography

\begin{filecontents}{references.bib}
@phdthesis{dybvig1987models,
  title={Three Implementation Models for Scheme},
  author={Dybvig, R. Kent},
  year={1987},
  school={University of North Carolina at Chapel Hill},
  address={Chapel Hill, NC, USA},
  url={https://dl.acm.org/doi/10.5555/91071}
}

@techreport{scheme2013r7rs,
  title={Revised$^{7}$ Report on the Algorithmic Language Scheme},
  editor={Dybvig, R. Kent and Sperber, Michael and Clinger, William D. and others},
  year={2013},
  institution={Scheme Language Steering Committee},
  number={R7RS},
  url={https://small.r7rs.org/attachment/r7rs.pdf}
}

@article{sato1983theory,
  title={Theory of symbolic expressions, I},
  author={Sato, Masahiko},
  journal={Theoretical Computer Science},
  volume={22},
  number={1-2},
  pages={19--55},
  year={1983},
  publisher={Elsevier}
}

@article{sato1985theory,
  title={Theory of symbolic expressions, II},
  author={Sato, Masahiko},
  journal={Publications of the Research Institute for Mathematical Sciences},
  volume={21},
  number={3},
  pages={455--540},
  year={1985}
}

@inproceedings{stoyan1984early,
  title={Early LISP history (1956--1959)},
  author={Stoyan, Herbert},
  booktitle={Proceedings of the 1984 ACM Symposium on LISP and Functional Programming},
  pages={299--310},
  year={1984}
}

@article{stoyan1979lisp,
  title={LISP history},
  author={Stoyan, Herbert},
  journal={ACM Lisp Bulletin},
  number={3},
  pages={42--53},
  year={1979},
  publisher={ACM New York, NY, USA}
}

@inproceedings{padget1986desiderata,
  title={Desiderata for the standardization of LISP},
  author={Padget, Julian and Chailloux, J{\'e}r{\^o}me and Christaller, Thomas and DeMantaras, Ramon and Dalton, Jeff and Devin, Matthieu and Fitch, John and Krumnack, Timm and Neidl, Eugen and Papon, Eric and others},
  booktitle={Proceedings of the 1986 ACM Conference on LISP and Functional Programming},
  pages={54--66},
  year={1986}
}

@incollection{steele1996evolution,
  title={The evolution of Lisp},
  author={Steele, Guy L. and Gabriel, Richard P.},
  booktitle={History of Programming Languages---II},
  pages={233--330},
  year={1996}
}

@book{turbak2008design,
  title={Design Concepts in Programming Languages},
  author={Turbak, Franklyn and Gifford, David},
  year={2008},
  publisher={MIT Press}
}
\end{filecontents}

\begin{filecontents}{hacheme-command-definitions.tex}

\newcommand\hacheme{Hacheme}

\newcounter{modcnt}
\setcounter{modcnt}{0}
\newcommand\moduledef[1]{%
  \stepcounter{modcnt}\textsc{Module} \#\arabic{modcnt} $\rightarrow$ #1%
}

\end{filecontents}

\begin{filecontents}{hacheme-glossary-definitions.tex}
\newglossaryentry{Term-rewriting}{
  name=Term-rewriting,
  description={}
}
\end{filecontents}

\begin{filecontents}{hacheme-index-definitions.tex}

\newcommand\scheme{Scheme\index{Scheme}}

\newcommand\hachemestd{$R^{7}RS$\index{R7RS@$R^{7}RS$}}

\newcommand\litprogram{\textit{literate program}\index{literate program@\textit{literate program}}}

\newcommand\lisp{Lisp\index{Lisp}}

\newcommand\formeth{formal method\index{formal method}}

\newcommand\verif{\textit{formally verify}\index{verify@\textit{formally verify}}}

\end{filecontents}

\begin{filecontents}{simpletrs.sty}
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{simpletrs}[2024/09/15 A simple package for typesetting term-rewriting systems]

\RequirePackage{amsmath}
\RequirePackage{xparse}

% Simple command for basic term-rewriting rules
\NewDocumentCommand{\trule}{m m}{%
  \[ #1 \to #2 \]%
}

% For conditional term-rewriting rules
\NewDocumentCommand{\truleif}{m m m}{%
  \[ #1 \to #2 \quad \text{if} \quad #3 \]%
}

% For contextual term-rewriting rules
\NewDocumentCommand{\ctrule}{m m m}{%
  \[ #1[#2] \to #1[#3] \]%
}

% For named term-rewriting rules
\NewDocumentCommand{\ntrule}{m m m}{%
  \[ #1 \colon \quad #2 \to #3 \]%
}

\newcommand{\relarrow}{\to}
\end{filecontents}
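As a brief, hypothetical usage sketch of the commands defined above (the rules shown are illustrative examples, not rules used by \hacheme), a plain rule, a named beta-reduction rule, and a conditional rule could be typeset as:

```latex
% hypothetical usage of the simpletrs commands defined above
\trule{(\lambda x.\, e)\; v}{e[v/x]}
\ntrule{\beta}{(\lambda x.\, e)\; v}{e[v/x]}
\truleif{x / y}{\mathrm{undefined}}{y = 0}
```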

\end{document}

*)