(*
**** Hacheme ****
*)

(*p
\documentclass[a4paper]{report}
\usepackage{amsmath,amssymb,amsthm}
\usepackage{ocamlweb}
\usepackage{makeidx,glossaries,epigraph}
\usepackage{filecontents}
\usepackage{simpletrs,formal-grammar}
\usepackage{expl3}
\usepackage[backend=biber,style=authoryear]{biblatex}

\addbibresource{references.bib}
\input{hacheme-command-definitions}
\input{hacheme-glossary-definitions}
\input{hacheme-index-definitions}
\makeindex
\makeglossaries

\begin{document}

The booklet before you presents the source code for \hacheme, an implementation of the \hachemestd, in a literate manner. This is not a textbook, a tutorial, or a work of academia; it is merely a \litprogram. The literate format also allows us to use \formeth{}s to, albeit partially, \verif\ this program --- and, along the way, to delve into the history of \scheme\ and \lisp.


\section*{Map to the program}\label{sec:prog_map}


\section{A brief history of \lisp}\label{sec:lisp_hist}

\epigraph{``It therefore seems to be desirable to attempt to construct an artificial language which a computer can be programmed to use on problems requiring conjecture and self-reference. It should correspond to English in the sense that short English statements about the given subject matter should have short correspondents in the language and so should short arguments or conjectural arguments.''}{J. McCarthy}

In the mid-1950s, \jmccarthy, an assistant professor at \dartmouth, was introduced to the language \ipl, to which he showed no excitement due to its low level and rough shape \parencite{stoyan1984early}. In a surviving archival note, he distills \textit{the problem of programming a computer} into two bullet points: the first being that ``there does not exist an adequate language for human beings to describe procedures to each other'', a language which he felt must be \textit{explicit, universal and concise}; the second being that computers at the time needed extra ``engineering consideration'' to be able to understand such a language \parencite{stoyan1984early}.

In his visit to \ibm, he was impressed by the engineering success of the \fortran\ language, built by a team led by \jbackus. During September and October of that year, \jmccarthy\ wrote some memos about the new language he was brewing in his mind \parencite{stoyan1979lisp}. McCarthy claimed he ``did not understand Church's paper'', so the \lambdacalc\ did not play much of a role in the early construction of \lisp\footnote{However, as we shall see later, the \lambdacalc\ plays a huge role in the evolution of \scheme}.

% TODO: Complete history of lisp

\section{History of \scheme}


% TODO: Complete history of scheme


*)

(*

\section{\moduledef{Syntax}}\label{sec:mod_syntax}

\subsection{About the \absyn\ of \scheme}

The \modsty{Syntax} module defines the \syntax\ for \hacheme. As mentioned in \ref{sec:about_scheme}, the language uses fully-parenthesized \index{prefix} notation, or in other words, \sexpr. The \modsty{Syntax} module reflects this, but it also encodes several \textit{peculiarities} of the \scheme-flavored \sexp.

Were we to use just a \textit{basic} syntax for an \sexp\ language, it would suffice to have \sexpatom{}s and \sexplist{}s. But our aim is to encode the syntax of \scheme, not just a run-of-the-mill \sexp\ syntax!

That is why the \modsty{Syntax} module represents the \absyn\ for \scheme\ and adds extra \hacheme-specific elements to it. In reality, the best way to convey an \absyn\ is using \sexp{}s (\parencite{turbak2008design}\refextsection{2.3}). So languages like \scheme\ that use a fully-parenthesized syntax are not only easier to parse; each construct also conveys more information \textit{symbolically}.

The \scheme\ syntax is made up of \datum{}s which, in turn, form an \expr.

% TODO: Complete this section

*)
(*
\moduledefbegin{Syntax}
*)
module Syntax = struct
(*
\marktype{Syntax}{ProductType}{t}{The base type for \modsty{Syntax}}
*)
  type t =
    { contents : expr list
    ; location : Location.t
    }

(*
\marktype{Syntax}{SumType}{datum}{The \datum{}s, please see \ref{subsec:syntax_datum}}
*)
  and datum =
    | DATUM_Boolean of bool
    | DATUM_String of string
    | DATUM_Char of char
    | DATUM_Label of label * datum
    | DATUM_List of datum list
    | DATUM_Symbol of label
    | DATUM_Vector of datum array
    | DATUM_Pair of datum * datum

(*
\marktype{Syntax}{SumType}{quote}{The \quote{}s, please see \ref{subsec:syntax_quote}}
*)
  and quote =
    | QUOTE_Quote of expr
    | QUOTE_Quasiquote of expr
    | QUOTE_Unquote of expr
    | QUOTE_Splice of expr

(*
\marktype{Syntax}{SumType}{formals}{The \formal{}s, please see \ref{subsec:syntax_formals}}
*)
  and formals =
    | FORMALS_IdentList of ident list
    | FORMALS_SoloIdent of ident
    | FORMALS_IdentPair of ident list * ident

(*
\marktype{Syntax}{SumType}{expr}{The \expr{}s, please see \ref{subsec:syntax_expr}}
*)
  and expr =
    | EXPR_Ident of ident
    | EXPR_Thunk of datum
    | EXPR_Call of expr * expr list
    | EXPR_Lambda of ident option * formals * body
    | EXPR_Cond of Conditional.t
    | EXPR_Assign of ident * expr
    | EXPR_Macro of Macro.t
    | EXPR_Include of Include.t
    | EXPR_Quote of quote
    | EXPR_Derived of DerivedExpr.t

(*
\markauxtype{Syntax}{ident}
*)
  and ident = string

(*
\markauxtype{Syntax}{label}
*)
  and label = string

(*
\markauxtype{Syntax}{body}
*)
  and body = expr list
end

(*
\moduledefend{Syntax}
*)
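(*
To make the mapping from fully-parenthesized text to this \absyn\ concrete, here is a small, self-contained sketch. It uses a pared-down mirror of \datum\ (the real \verb|Syntax.expr| depends on modules defined elsewhere in the program), and symbols stand in for the numbers, since \verb|datum| carries no numeric case yet:

```ocaml
(* A pared-down mirror of Syntax.datum, for illustration only. *)
type datum =
  | DATUM_Boolean of bool
  | DATUM_Symbol of string
  | DATUM_List of datum list
  | DATUM_Pair of datum * datum

(* The Scheme form (+ 1 2), with the numbers represented as symbols. *)
let plus_one_two =
  DATUM_List [ DATUM_Symbol "+"; DATUM_Symbol "1"; DATUM_Symbol "2" ]

(* Render a datum back to fully-parenthesized surface text. *)
let rec render = function
  | DATUM_Boolean b -> if b then "#t" else "#f"
  | DATUM_Symbol s -> s
  | DATUM_List ds -> "(" ^ String.concat " " (List.map render ds) ^ ")"
  | DATUM_Pair (a, b) -> "(" ^ render a ^ " . " ^ render b ^ ")"

let () = print_endline (render plus_one_two)
```

Round-tripping the value through \verb|render| yields the original surface text, \texttt{(+ 1 2)}.
*)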

(*

\section{\moduledef{Parser}}\label{sec:mod_parser}

\subsection{A theory for \sexpr}\label{subsec:theory_sexpr}

A theory for \sexpr\ was proposed in \parencite{sato1983theory} and subsequently extended in \parencite{sato1985theory}.

% TODO: Discuss Sato's Sexp theory

\subsection{The challenges in parsing \scheme}\label{subsec:challenges_parse_scheme}

Parsing \sexpr{}s is pretty simple; \scheme, however, brandishes them with its own flavor of \textit{extensibility}, which makes a \scheme\ program a living thing: it is not merely what is written down in a file, for it gets expanded again and again at runtime. This makes a native-code \index{compiler} for \scheme\ quite difficult to build. \parencite{dybvig1987models} discusses three models for implementing \scheme, and seminal as that dissertation is (I have made good use of it as well; this literate program is peppered with citations of it), all three models are what we today consider \textit{interpretive}. Since \hacheme\ is a compiler and not an interpreter, it must make use of \partialeval\ to first get all the expressions into a \normalform. The \modsty{Parser} must therefore provide facilities that will be used repeatedly.

*)
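(*
To illustrate what getting an expression into a \normalform\ by expansion means in the simplest case, here is a hedged, self-contained sketch (on a minimal stand-in for \datum, since the real \modsty{Syntax} types live elsewhere) of one rewriting step, turning a single-binding \texttt{let} into the application of a \texttt{lambda}:

```ocaml
(* A minimal stand-in for the syntax of s-expressions. *)
type datum =
  | Symbol of string
  | List of datum list

(* One expansion step: (let ((x e)) body) becomes ((lambda (x) body) e).
   Anything that does not match is returned unchanged. *)
let expand_let = function
  | List [ Symbol "let"; List [ List [ x; e ] ]; body ] ->
      List [ List [ Symbol "lambda"; List [ x ]; body ]; e ]
  | d -> d

(* Render a datum back to fully-parenthesized text. *)
let rec render = function
  | Symbol s -> s
  | List ds -> "(" ^ String.concat " " (List.map render ds) ^ ")"

let () =
  let src =
    List [ Symbol "let"; List [ List [ Symbol "x"; Symbol "1" ] ]; Symbol "x" ]
  in
  print_endline (render (expand_let src))
```

The program prints \texttt{((lambda (x) x) 1)}; a real expander would apply such steps repeatedly, over every derived form, until no expansions remain.
*)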

(*
\moduledefbegin{Parser}
*)

module Parser = struct
(*
\marktype{Parser}{ProductType}{t}{The base type for the \modsty{Parser} module}
*)
  type t =
    { input : string
    ; tokens : token list
    }

(*
\markauxty{Parser}{lexeme}
\aside{A lexeme is just a raw token\ldots!}
*)
  and lexeme = string

(*
\marktype{Parser}{SumType}{token}{The type which represents \scheme's various \token{}s}
*)
  and token =
    | TOKEN_Numeric of lexeme
    | TOKEN_Ident of lexeme
    | TOKEN_String of lexeme
    | TOKEN_Char of lexeme
    | TOKEN_Bool of lexeme
    | TOKEN_Delim of lexeme
    | TOKEN_Punct of lexeme

(*
\markmodfn{Parser}{consume_until}{char list -> (char -> bool) -> (lexeme * char list)}{The \index{utility function} to \consume\ characters \textit{until} a condition becomes true, inverse of \funsty{consume_while}}
*)
  let consume_until chlist cond =
    let buffer = Buffer.create 32 in
    let rec aux ch chlist' =
      if cond ch then
        (Buffer.contents buffer, chlist')
      else begin
        Buffer.add_char buffer ch;
        match chlist' with
        | [] -> (Buffer.contents buffer, [])
        | c :: rest -> aux c rest
      end
    in
    match chlist with
    | [] -> ("", [])
    | c :: rest -> aux c rest

(*
\markmodfn{Parser}{consume_while}{char list -> (char -> bool) -> (lexeme * char list)}{The \index{utility function} to \consume\ characters \textit{whilst} a condition is true, inverse of \funsty{consume_until}}
*)
  let consume_while chlist cond =
    let buffer = Buffer.create 32 in
    (* \markauxfn *)
    let rec aux ch chlist' =
      if cond ch then begin
        Buffer.add_char buffer ch;
        match chlist' with
        | [] -> (Buffer.contents buffer, [])
        | c :: rest -> aux c rest
      end
      else
        (Buffer.contents buffer, chlist')
    in
    match chlist with
    | [] -> ("", [])
    | c :: rest -> aux c rest

  let scan_hacheme _input =
    (* TODO *)
    failwith "scan_hacheme: not implemented yet"
end

(*
\moduledefend{Parser}
*)
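(*
As a usage sketch for the \consume\ helpers, here is a self-contained variant of \funsty{consume_while} (restated locally, since the real one lives inside \modsty{Parser}), splitting an identifier off a character stream:

```ocaml
(* A self-contained variant of consume_while: gathers characters into a
   buffer while [cond] holds, returning the lexeme and the remaining list. *)
let consume_while chlist cond =
  let buffer = Buffer.create 32 in
  let rec aux = function
    | c :: rest when cond c -> Buffer.add_char buffer c; aux rest
    | rest -> (Buffer.contents buffer, rest)
  in
  aux chlist

(* Turn a string into the char list representation the scanner works over. *)
let explode s = List.init (String.length s) (String.get s)

let () =
  let ident, rest = consume_while (explode "define x") (fun c -> c <> ' ') in
  Printf.printf "lexeme = %s, remaining chars = %d\n" ident (List.length rest)
```

Here \verb|ident| comes back as \texttt{define} and \verb|rest| still holds the space and the \texttt{x}, ready for the next \consume\ call.
*)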

(*

\newpage
\printindex

\newpage
\printglossaries

\newpage
\printbibliography

\begin{filecontents}{references.bib}
@phdthesis{dybvig1987models,
  title={Three Implementation Models for Scheme},
  author={Dybvig, R. Kent},
  year={1987},
  school={University of North Carolina at Chapel Hill},
  address={Chapel Hill, NC, USA},
  url={https://dl.acm.org/doi/10.5555/91071}
}

@techreport{scheme2013r7rs,
  title={Revised\textsuperscript{7} Report on the Algorithmic Language {Scheme}},
  editor={Shinn, Alex and Cowan, John and Gleckler, Arthur A.},
  year={2013},
  institution={Scheme Language Steering Committee},
  number={R7RS},
  url={https://small.r7rs.org/attachment/r7rs.pdf}
}

@article{sato1983theory,
  title={Theory of Symbolic Expressions, I},
  author={Sato, Masahiko},
  journal={Theoretical Computer Science},
  volume={22},
  number={1-2},
  pages={19--55},
  year={1983},
  publisher={Elsevier}
}

@article{sato1985theory,
  title={Theory of Symbolic Expressions, II},
  author={Sato, Masahiko},
  journal={Publications of the Research Institute for Mathematical Sciences},
  volume={21},
  number={3},
  pages={455--540},
  year={1985}
}

@inproceedings{stoyan1984early,
  title={Early LISP History (1956--1959)},
  author={Stoyan, Herbert},
  booktitle={Proceedings of the 1984 ACM Symposium on LISP and Functional Programming},
  pages={299--310},
  year={1984}
}

@article{stoyan1979lisp,
  title={LISP History},
  author={Stoyan, Herbert},
  journal={ACM Lisp Bulletin},
  number={3},
  pages={42--53},
  year={1979},
  publisher={ACM New York, NY, USA}
}

@inproceedings{padget1986desiderata,
  title={Desiderata for the Standardization of LISP},
  author={Padget, Julian and Chailloux, J{\'e}r{\^o}me and Christaller, Thomas and DeMantaras, Ramon and Dalton, Jeff and Devin, Matthieu and Fitch, John and Krumnack, Timm and Neidl, Eugen and Papon, Eric and others},
  booktitle={Proceedings of the 1986 ACM Conference on LISP and Functional Programming},
  pages={54--66},
  year={1986}
}

@incollection{steele1996evolution,
  title={The Evolution of Lisp},
  author={Steele, Guy L. and Gabriel, Richard P.},
  booktitle={History of Programming Languages---II},
  pages={233--330},
  year={1996}
}

@book{turbak2008design,
  title={Design Concepts in Programming Languages},
  author={Turbak, Franklyn and Gifford, David},
  year={2008},
  publisher={MIT Press}
}
\end{filecontents}

\begin{filecontents}{hacheme-command-definitions.tex}

\newcommand\hacheme{Hacheme} % prints the project name

\newcounter{modcnt}
\setcounter{modcnt}{0}
\newcommand\moduledef[1]{%
  \stepcounter{modcnt}\textsc{Module} \#\arabic{modcnt} $\to$ #1%
}


\end{filecontents}

\begin{filecontents}{hacheme-glossary-definitions.tex}
\newglossaryentry{Term-rewriting}{
  name=Term-rewriting,
  description={}
}
\end{filecontents}

\begin{filecontents}{hacheme-index-definitions.tex}

% Each command prints its term and indexes it, so that uses like
% "\scheme\ program" typeset the word as well as the index entry.
\newcommand\scheme{Scheme\index{Scheme}}

\newcommand\hachemestd{$R^{7}RS$\index{R7RS@$R^{7}RS$}}

\newcommand\litprogram{\textit{literate program}\index{literate program@\textit{literate program}}}

\newcommand\lisp{Lisp\index{Lisp}}

\newcommand\formeth{formal method\index{formal method}}

\newcommand\verif{\textit{formally verify}\index{verify@\textit{formally verify}}}

\end{filecontents}

\begin{filecontents}{simpletrs.sty}
\NeedsTeXFormat{LaTeX2e}
\ProvidesPackage{simpletrs}[2024/09/15 A simple package for typesetting term-rewriting systems]

\RequirePackage{amsmath}
\RequirePackage{xparse}


% Simple command for basic term-rewriting rules
\NewDocumentCommand{\trule}{m m}{
  \[ #1 \to #2 \]
}

% For conditional term-rewriting rules
\NewDocumentCommand{\truleif}{m m m}{
  \[ #1 \to #2 \quad \text{if} \quad #3 \]
}

% For contextual term-rewriting rules
\NewDocumentCommand{\ctrule}{m m m}{
  \[ #1[#2] \to #1[#3] \]
}

% For named term-rewriting rules
\NewDocumentCommand{\ntrule}{m m m}{
  \[ #1 \colon \quad #2 \to #3 \]
}

\newcommand{\relarrow}{\to}
\end{filecontents}
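
As a short usage sketch of the \texttt{simpletrs} commands defined above (the rules shown are illustrative examples, not part of \hacheme):

```latex
% A classic rewrite over pairs, typeset with \trule:
\trule{\mathrm{car}\,(\mathrm{cons}\;x\;y)}{x}

% A named beta-reduction rule, typeset with \ntrule:
\ntrule{\beta}{(\lambda x.\,M)\,N}{M[x := N]}
```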

\end{document}

*)