\documentclass[structure=hierarchic]{cweb}
\usepackage{amsmath,amsthm,amssymb,stmaryrd}
\usepackage{xcolor,tcolorbox,mdframed,epigraph,soul}
\usepackage{array,tabularx,minted,graphicx,fancyhdr}
\usepackage{tikz,simplebnf,oz}
\usepackage{filecontents}
\usepackage[backend=biber,style=authoryear]{biblatex}
\usepackage{makeidx,authblk,hyperref}
\usepackage{lmodern}

\addbibresource{references.bib}
\makeindex

\begin{document}

\maketitle
\tableofcontents
\listoffigures
\listoftables

This is \THIS, a \POSIX-compliant \UNIX\ \Shell{} program written in \ANSI-C. This program is written using the \CWEB{} literate programming framework.

@*Saving the turtles.\label{sec:saving_the_turtles}

It is true that, for the past handful of decades (nearing six), the \UNIX\ operating system and its descendants have played a very large role in the progress of \textit{Common Computation}. I define this term as the aspects of the computational arts which are seldom discussed in academia. The \POSIX\ \Shell{} itself has rarely been the subject of academic scrutiny. The clergy don't see any worth in it, because \Shell{}, by design, is a program which is \textit{non-inductive}. It's hard to verify a program whose entire job is to communicate with a system that is itself hard to verify, and so it's hard to come up with theoretical frameworks about the \UNIX\ \Shell{}. \textcite{felici2024shellfuzzer} admit this, and theirs is one of those rare works of the clergy on the \Shell{}.

But as we said, \Shell{} is \textit{Common}. \Shell{} is \textit{Pop culture}. \Shell{} is perhaps one of the earliest \textit{scripting} languages, barring, of course, \Forth\parencite{rather1996evolution}.

It is upon this \textit{lay} foundation that \THIS\ builds itself. \THIS\ tries not to be a marvel but rather a speck of dust: one \Shell{} amongst the many.

@*Prelude to \THIS.\label{sec:prelude_to_this}

First, let's use \CWEB's macro capabilities to define three numbers, |NAT_SMALL|, |NAT_MEDIUM| and |NAT_LARGE|. These will be useful as general-purpose quantities, such as buffer sizes, throughout the program.

@d NAT_SMALL 32
@ |NAT_SMALL| for small quantities;
@d NAT_MEDIUM 128
@ |NAT_MEDIUM| for medium quantities;
@d NAT_LARGE 1024
@ and |NAT_LARGE| for large quantities.

@ Let's set up our program in three sections:

\begin{enumerate}
\item Including necessary runtime and \UNIX-specific header files;
\item Defining constant symbols;
\item Defining necessary types and data structures.
\end{enumerate}

@p

@<Including necessary runtime files@>@/
@<Defining constant symbols@>@/
@<Defining necessary types and data structures@>@/

@ Let's include the Standard Library runtime header files. Remember that we are limited to \ANSI-C, which means we don't have access to goodies like |stdbool.h| and |stdint.h|. We'll have to make do.

@<Including necessary runtime files@>=
%START_C_CODE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
%END_C_CODE

@ We now include \UNIX-specific header files$\ldots$

@<Including necessary runtime files@>+=
%START_C_CODE
#include <unistd.h>
#include <signal.h>
#include <limits.h>
#include <syslog.h>
%END_C_CODE

@ We now define several types. Our inaugural |typedef| is the |bool_t| type. We define numeric types as well, such as |scalar_t| and |length_t|.

@<Defining necessary types and data structures@>=
%START_C_CODE
typedef int bool_t; /* The bool type */
typedef unsigned long length_t; /* The length type */
typedef double scalar_t; /* The scalar (floating-point) type */
typedef int pstat_t; /* Integer holding the Process state */
typedef int jstate_t; /* Integer holding the Job state */
typedef pid_t jid_t; /* The job ID */
typedef char name_t[NAT_SMALL]; /* A `small` character string for names */
%END_C_CODE
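
@ As a quick illustration of how these types might be used without |stdbool.h|, here is a minimal sketch. The |TRUE| and |FALSE| macros and the |name_set| helper are hypothetical names that do not appear elsewhere in this program; the sketch repeats the |NAT_SMALL|, |bool_t| and |name_t| definitions only so that it compiles on its own.

```c
#include <assert.h>
#include <string.h>

#define NAT_SMALL 32 /* same value as the macro defined earlier */
#define TRUE 1       /* hypothetical: not defined in this program */
#define FALSE 0      /* hypothetical: not defined in this program */

typedef int bool_t;
typedef char name_t[NAT_SMALL];

/* Copy src into dst, truncating if needed.
 * Returns TRUE if src fit without truncation, FALSE otherwise. */
static bool_t name_set(name_t dst, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    strncpy(dst, src, NAT_SMALL - 1);
    dst[NAT_SMALL - 1] = '\0'; /* strncpy does not terminate on truncation */
    return len < NAT_SMALL ? TRUE : FALSE;
}
```

Because |name_t| is a fixed-size array, the helper reports truncation instead of failing, so a caller can decide whether a shortened name is acceptable.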

@ We shall now define several data structures for \THIS. Our \Shell{} is a proper \POSIX\ \Shell{}. For everything standards-related, we shall follow \parencite{posix2017std}, or \POSIXSTDXCU, which from now on we shall refer to as \XCU. According to \XCU, a \Shell{} may or may not have \JobControl. To quote \XCU:

\begin{quotation}
[Job control is a] facility that allows users selectively to stop (suspend) the execution of processes and continue (resume) their execution at a later point. The user typically employs this facility via the interactive interface jointly supplied by the terminal I/O driver and a command interpreter.
\end{quotation}

So we shall now define a data structure (or several) that facilitates \JobControl.

But before that, we must ask ourselves, \textsf{What is a \Shell{} to do$\ldots$?}

This is indeed an interesting question. We \textit{are} writing a \Shell{} program, are we not?

So let's take a detour and define exactly what we want out of this program. First, let's take a look at the history of \Shell.

@*The history of \Shell{}.\label{sec:a_history_of_shell}

\BellLabs\ was originally located at \StreetAddress{463 West Street, New York City}. But at the beginning of \WorldWarII, many of its activities moved out of NYC\parencite{kernighan2020unix}. \BellLabs\ was responsible for the advent of the \Transistor. Due to the grip of \BellTelephones\ on wired communications in the \UnitedStates, the government of the US decided to disallow \BellTelephones\ from ever entering the computer hardware market. This encouraged \BellTelephones\ to enter, or rather, \textit{synthesize} the software market. So they began work on a \TimeSharing\ operating system, \MULTICS, with \MIT. \MULTICS\ failed as a concept. At this time, an employee of \BellLabs, \KenThompson, found a small mini-computer (a \PDPSeven) and wrote for it a much smaller operating system inspired by \MULTICS. This was the beginning of \UNIX, renamed from \UNICS.

\UNIX\ is a time-sharing operating system for a single machine. In a classic time-sharing system, resources are allocated to terminals, most likely connected to a mainframe over a current loop, and each person logs into their terminal and uses their allotment. \UNIX\ ran on a mini-computer, so those resources went to \textit{processes} instead, and to manage those processes, \KenThompson\ created the \UNIX\ \Shell{}.

\Shell{} was based on the work of his friend and co-worker, \McIlroy, who was working on dataflow and coroutines at the same time\parencite{mcilroy1968coroutines}. \Shell\ is, at least partially, a \textit{dataflow language}. It leverages \UNIX's \IPCPIPE{}s to pass data along between applications. This feature, however, did not come to full fruition until \Bourne\ did a full re-evaluation of \Shell\ in 1978\parencite{bourne1978unix}.

After \UNIX\ was standardized into \POSIX, \Shell, too, was standardized. \GNU's \Bash\ is a free-and-open-source version of the \Bourne\ shell. Other \POSIX\ shells include \Zsh, \Mksh, and \Dash.

@*How does \Shell\ operate? (Part I).\label{sec:how_does_shell_operate_part_1}

``The Shell'' is the most visible system interface in \UNIX\parencite{mcilroy1978foreword}. \Shell\ is used to combine programs. In a way, \Shell\ is a \textit{Program-flow manager}.

This program is either \textit{interacted with} or \textit{batched off}. The proper term for `batching off' is a \textit{Shell script}; the proper term for `interacted with' is an \textit{Interactive session}.

I am personally an avid user of the \Fish\ shell. Its syntax differs from the \POSIX\ syntax that we shall be implementing, but it is nonetheless a shell designed for interactive use. For simple commands, the syntax of \Fish\ is mostly identical to \POSIX's. So, to launch my text editor (\NeoVim) to edit this document, I type in:

\begin{ShellScript}{interactive}
neovim /mnt/manifest/\jobname.w
\end{ShellScript}

This is the simplest way one could utilize \Shell. Let's look at a more complex usage (\textit{slightly} more!) of \Shell:

\begin{ShellScript}{interactive}
for i in 1 2 3; do echo \$i; done
\end{ShellScript}

This program prints these three numbers. \Util{echo} is a \Utility; it is not a \Function, unless you override it, that is. We have utils like \Util{rm}, \Util{mv}, and \Util{cp}. But if you define a \Function\ with one of these names, the function shadows the utility.

Now, this is not a \Shell\ tutorial, and as for syntax and grammar, we'll talk about it in \ref{sec:syntax_and_grammar_of_shell}. So for now, this shall suffice to give you an idea of how a shell operation is done.

So when we pass something like \InlineShell{ls -a .} to our shell \Interpreter{} (we'll talk about this concept later!), it will list all the files and subdirectories of the current directory, hidden ones included, in the \REPL. Here, unless shadowed by a \Function, \Util{ls} is a utility.

When we type that, and press \KEnter, what happens in the backend is called a \FEWL. What is a \FEWL? We, of course, must implement it later. But at the moment, what we want to do is to define data structures that facilitate, among many things, a \FEWL. So let's define the |Process| structure.
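
We have not spelled out \FEWL\ yet. Assuming it stands for the fork, exec, wait cycle that shells traditionally use to launch a utility, one pass through it could be sketched as follows. The |run_and_wait| helper is a hypothetical name, and this is a sketch under that assumption rather than the code \THIS\ will end up with.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the utility named by argv[0] with arguments argv and wait for it.
 * Returns the exit status of the child, or -1 on failure. */
static int run_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0)
        return -1;             /* fork failed */
    if (pid == 0) {
        execvp(argv[0], argv); /* returns only on error */
        _exit(127);            /* conventional status for a failed exec */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A real shell repeats this for every command it reads, and it also has to deal with redirections, pipelines and signals, none of which the sketch attempts.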

@**Defining the \Process{} data structure.\label{sec:defining_process_struct}

@<Defining necessary types and...@>+=
/* Let's define our first structure already! */
%START_C_CODE
typedef struct Process {
  pid_t system_id; /* ID of this process in system */
  int internal_id; /* Our ID for this process */
  pstat_t current_state; /* Current status */
  int fno_in,
      fno_out,
      fno_err; /* File descriptors for input, output and error */
  bool_t is_lead; /* Is this a 'lead' in a pipeline? */
  bool_t is_bkgrnd; /* Is this a 'background' process? */
} Process;
%END_C_CODE

@ These fields are mostly self-explanatory; if any of them is unclear, don't worry: \textit{we shall explain}.

Some of these values are flags, and some are enumerations. Of the enumerations, |current_state| holds the internal status of the process, the values of which are listed in \ref{sec:enumerative_defines}. Of the flags, |is_lead| marks the process as the \textit{lead} of a \Pipeline, a concept we shall explain in \ref{sec:explaining_pipelines}, and |is_bkgrnd| marks a process that has been sent to the background as part of \JobControl. We explain job control in \ref{sec:explaining_job_control}.
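
To make the roles of these fields concrete, here is a minimal sketch of how a |Process| might be filled in before anything has been forked. The |process_init| helper and the |PSTAT_NEW| value are hypothetical, since the enumeration of states is not shown here; the type definitions are repeated only so that the sketch compiles on its own.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

typedef int bool_t;
typedef int pstat_t;

typedef struct Process {
    pid_t system_id;
    int internal_id;
    pstat_t current_state;
    int fno_in, fno_out, fno_err;
    bool_t is_lead;
    bool_t is_bkgrnd;
} Process;

#define PSTAT_NEW 0 /* hypothetical state value, not taken from this program */

/* Fill in a Process that has not been forked yet. */
static void process_init(Process *p, int internal_id)
{
    assert(p != NULL);
    p->system_id = (pid_t)-1;        /* no system process exists yet */
    p->internal_id = internal_id;
    p->current_state = PSTAT_NEW;
    p->fno_in = STDIN_FILENO;        /* default to the shell's own streams */
    p->fno_out = STDOUT_FILENO;
    p->fno_err = STDERR_FILENO;
    p->is_lead = 0;
    p->is_bkgrnd = 0;
}
```

Starting from the standard descriptors means that a redirection or a pipe only has to overwrite the one field it affects.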

@**Defining the \Job{} data structure.\label{sec:defining_job_struct}

Our next data structure is |Job|. The |Job| data structure is used in \JobControl. Keep in mind that job control is optional, according to \XCU. In interactive shells it is enabled by default; the \CmdlineSwitch{m} switch controls it, and \InlineShell{set +m} turns it off.

@<Defining necessary types and...@>+=
%START_C_CODE
typedef struct Job {
  jid_t internal_id; /* The internal ID for this job */
  name_t internal_name; /* The internal name for the job */
  jstate_t current_state; /* The current status of this job */
  Process *processes; /* The processes for this job */
} Job;
%END_C_CODE

@ This was |Job|. A small note on |internal_name|: this name is rarely used outside of internal utilities like \InternalUtility{qselect} and \InternalUtility{qsub}, at least according to \XCU. The name of a job is most likely used in the context of \textit{batch jobs} (see \ref{sec:explaining_job_control}).

Speaking of names, we really need a place to store them, don't we? And we also need a data structure, |Symbol|, to give \textit{meaning} to names. A name, by itself, is meaningless. A name might refer to a \Job{}, but, as we'll see in \ref{sec:shell_variables}, a name can carry several \textit{semantics}; \Variable{}s are just one example.

@**Defining the \Symbol{} data structure.\label{sec:defining_symbol_struct}

To handle the name issue, we must define a \Symbol\ data structure.

@<Defining necessary types and...@>+=
%START_C_CODE
typedef struct Symbol {
  symid_t internal_id; /* The internal ID of this symbol */
  name_t symbol_name; /* The symbol name */
  symkind_t symbol_kind; /* The symbol kind */
  union {
    Job *v_job; /* A Job symbol */
    /* TODO ... */
  } value; /* Named, because ANSI C has no anonymous unions */
} Symbol;
%END_C_CODE
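
@ Here is a minimal sketch of how a name could be resolved to its meaning through |Symbol|. Several things in it are assumptions: |symid_t| and |symkind_t| are taken to be plain integers, |SYM_JOB| is a hypothetical kind value, the union is given the name |value| so that the sketch stays within C89, and |Job| is cut down to a stand-in. The helpers |symbol_find| and |symbol_as_job| are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAT_SMALL 32
typedef char name_t[NAT_SMALL];
typedef int symid_t;   /* assumed to be an integer */
typedef int symkind_t; /* assumed to be an integer tag */

#define SYM_JOB 1 /* hypothetical kind value */

typedef struct Job { int internal_id; } Job; /* reduced stand-in */

typedef struct Symbol {
    symid_t internal_id;
    name_t symbol_name;
    symkind_t symbol_kind;
    union { Job *v_job; } value;
} Symbol;

/* Linear search by name; returns NULL if no symbol matches. */
static Symbol *symbol_find(Symbol *table, size_t count, const char *name)
{
    size_t i;

    assert(table != NULL && name != NULL);
    for (i = 0; i < count; i++)
        if (strcmp(table[i].symbol_name, name) == 0)
            return &table[i];
    return NULL;
}

/* Return the Job behind a symbol, or NULL if the symbol is not a job. */
static Job *symbol_as_job(const Symbol *s)
{
    return (s != NULL && s->symbol_kind == SYM_JOB) ? s->value.v_job : NULL;
}
```

Checking |symbol_kind| before touching the union is what keeps the tagged union safe: reading a member other than the one last stored would not give a meaningful value.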

%%% END OF PROGRAM %%%
@

\newpage
\printindex

\newpage
\printbibliography

\begin{filecontents}{references.bib}
@@techreport{mcilroy1978foreword,
 title={The Bell System Technical Journal Foreword},
 author={McIlroy, M.D. and Pinson, E.N. and Tague, B.A.},
 year={1978},
 note={July--August},
 volume={57},
 publisher={Bell Labs}
}
@@article{bourne1978unix,
 title={UNIX time-sharing system: The UNIX shell},
 author={Bourne, Stephen Richard},
 journal={The Bell System Technical Journal},
 volume={57},
 number={6},
 pages={1971--1990},
 year={1978},
 publisher={Nokia Bell Labs}
}
@@techreport{mcilroy1968coroutines,
 author={McIlroy, M. D.},
 title={Coroutines},
 institution={Bell Telephone Laboratories},
 year={1968},
 address={Murray Hill, New Jersey},
 type={Technical report}
}
@@book{kernighan2020unix,
 title={UNIX: A History and a Memoir},
 author={Kernighan, Brian W},
 year={2020},
 publisher={Kindle Direct Publishing Seattle, WA}
}
@@techreport{posix2017std,
 type={Standard},
 key={IEEE 1003.1(TM)},
 year={2017},
 title={IEEE Standard for Information Technology Portable Operating System Interface (POSIX(R))},
 institution={IEEE}
}
@@incollection{rather1996evolution,
 title={The evolution of Forth},
 author={Rather, Elizabeth D and Colburn, Donald R and Moore, Charles H},
 booktitle={History of programming languages---II},
 pages={625--670},
 year={1996}
}
@@article{felici2024shellfuzzer,
 title={Shell Fuzzer: Grammar-based Fuzzing of Shell Interpreters},
 author={Felici, Riccardo and Pozzi, Laura and Furia, Carlo A},
 journal={arXiv preprint arXiv:2408.00433},
 year={2024}
}
@@incollection{ritchie1996development,
 title={The development of the C programming language},
 author={Ritchie, Dennis M},
 booktitle={History of programming languages---II},
 pages={671--698},
 year={1996}
}
@@book{kernighan1988c,
 title={The C programming language},
 author={Kernighan, Brian W and Ritchie, Dennis M},
 year={1988},
 publisher={Prentice-Hall}
}
\end{filecontents}

\end{document}