  1. Working with Unix Processes
  2. Copyright © 2012 Jesse Storimer. All rights reserved. This ebook is licensed for
  3. individual use only.
  4. This is a one-man operation, please respect the time and effort that went into this
  5. book. If you came by a free copy and find it useful, you can compensate me at
  6. http://workingwithunixprocesses.com.
  7. Acknowledgements
  8. A big thank you to a few awesome folks who read early drafts of the book, helped me
  9. understand how to market this thing, gave me a push when I needed it, and were all-
  10. around extremely helpful: Sam Storry, Jesse Kaunisviita, and Marc-André Cournoyer.
  11. I have to express my immense gratitude towards my wife and daughter for not only
  12. supporting the erratic schedule that made this book possible, but also always being
  13. there to provide a second opinion. Without your love and support I couldn't have done
  14. this. You make it all worthwhile.
  16. Contents
  17. Introduction
  18. Primer
  19. Why Care?
  20. Harness the Power!
  21. Overview
  22. System Calls
  23. Nomenclature, wtf(2)
  24. Processes: The Atoms of Unix
  25. Processes Have IDs
  26. Cross Referencing
  27. In the Real World
  28. System Calls
  29. Processes Have Parents
  30. Cross Referencing
  31. In the Real World
  32. System Calls
  33. Processes Have File Descriptors
  34. Everything is a File
  35. Descriptors Represent Resources
  36. Standard Streams
  37. In the Real World
  38. System Calls
  39. Processes Have Resource Limits
  40. Finding the Limits
  41. Soft Limits vs. Hard Limits
  42. Bumping the Soft Limit
  43. Exceeding the Limit
  44. Other Resources
  45. In the Real World
  46. System Calls
  47. Processes Have an Environment
  48. It's a hash, right?
  49. In the Real World
  50. System Calls
  51. Processes Have Arguments
  52. It's an Array!
  53. In the Real World
  54. Processes Have Names
  55. Naming Processes
  56. In the Real World
  57. Processes Have Exit Codes
  58. How to Exit a Process
  59. Processes Can Fork
  60. Use the fork(2), Luke
  61. Multicore Programming?
  62. Using a Block
  63. In the Real World
  64. System Calls
  65. Orphaned Processes
  66. Out of Control
  67. Abandoned Children
  68. Managing Orphans
  69. Processes Are Friendly
  70. Being CoW Friendly
  71. MRI / RBX users
  72. Processes Can Wait
  73. Babysitting
  74. Process.wait and Cousins
  75. Communicating with Process.wait2
  76. Waiting for Specific Children
  77. Race Conditions
  78. In the Real World
  79. System Calls
  80. Zombie Processes
  81. Good Things Come to Those Who wait(2)
  82. What Do Zombies Look Like?
  83. In The Real World
  84. System Calls
  85. Processes Can Get Signals
  86. Trapping SIGCHLD
  87. SIGCHLD and Concurrency
  88. Signals Primer
  89. Where do Signals Come From?
  90. The Big Picture
  91. Redefining Signals
  92. Ignoring Signals
  93. Signal Handlers are Global
  94. Being Nice about Redefining Signals
  95. When Can't You Receive Signals?
  96. In the Real World
  97. System Calls
  98. Processes Can Communicate
  99. Our First Pipe
  100. Pipes Are One-Way Only
  101. Sharing Pipes
  102. Streams vs. Messages
  103. Remote IPC?
  104. In the Real World
  105. System Calls
  106. Daemon Processes
  107. The First Process
  108. Creating Your First Daemon Process
  109. Diving into Rack
  110. Daemonizing a Process, Step by Step
  111. Process Groups and Session Groups
  112. In the Real World
  113. System Calls
  114. Spawning Terminal Processes
  115. fork + exec
  116. Arguments to exec
  117. In the Real World
  118. System Calls
  119. Ending
  120. Abstraction
  121. Communication
  122. Farewell, But Not Goodbye
  123. Appendix: How Resque Manages Processes
  124. The Architecture
  125. Forking for Memory Management
  126. Why Bother?
  127. Doesn't the GC clean up for us?
  128. Appendix: How Unicorn Reaps Worker Processes
  129. Reaping What?
  130. Conclusion
  131. Appendix: Preforking Servers
  132. 1. Efficient use of memory
  133. Many Mongrels
  134. Many Unicorn
  135. 2. Efficient load balancing
  136. 3. Efficient sysadminning
  137. Basic Example of a Preforking Server
  138. Appendix: Spyglass
  139. Spyglass' Architecture
  140. Booting Spyglass
  141. Before a Request Arrives
  142. Connection is Made
  143. Things Get Quiet
  144. Getting Started
  145. Updates
  146. • December 20, 2011 - First public version
  147. • December 21, 2011 - Typos
  148. • December 23, 2011 - Explanation for Process.setsid
  149. • December 27, 2011 - Section on SIGCHLD and concurrency
  150. • December 27, 2011 - Note about redefining 'default' signal handlers
  151. • December 28, 2011 - Typos
  152. • December 31, 2011 - Clarification around exiting with Kernel.raise; Section on
  153. using fork with a block; More typos; Note about getsid(2)
  154. • January 13, 2012 - Improved code highlighting. Improved e-reader formatting.
  155. • February 1, 2012 - New cover art.
  156. • February 7, 2012 - New chapters: zombie processes, environment variables,
  157. preforking servers, the spyglass project. Clarifications on CoW-friendliness in
  158. MRI. Added sections for IO.popen and Open3.
  159. • February 13, 2012 - Clarifications on Process::WNOHANG and file descriptor
  160. relations.
  161. • March 13, 2012 - Include TXT format.
  162. • March 29, 2012 - Formatting and errata.
  163. • April 20, 2012 - New chapters: ARGV and IPC.
  164. • May 15, 2012 - Clarification about reentrancy in signal handlers.
  166. • June 12, 2012 - New chapter on rlimits; Many formatting/syntax updates.
  168. Chapter 1
  169. Introduction
  170. When I was growing up I was sitting in front of a computer every chance I got. Not
  171. because I was programming, but because I was fascinated by what was possible with
  172. this amazing machine. I grew up as a computer user using ICQ, Winamp, and Napster.
  173. As I got older I spent more time playing video games on the computer. At first I was
  174. into first-person shooters and eventually spent most of my time playing real-time
  175. strategy games. And then I discovered that you can play these games online!
  176. Throughout my youth I was a 'computer guy': I knew how to use computers, but I
  177. had no idea how they worked under the hood.
  178. The reason I'm giving you my background is because I want you to know that I was not
  179. a child prodigy. I did not teach myself how to program Basic at age 7. When I took my
  180. first computer programming class I was not teaching the teacher and correcting his
  181. mistakes.
  182. It wasn't until my second year of a University degree that I really came to love
  183. programming as an activity. Some may say that I'm a late bloomer, but I have a feeling
  184. that I'm closer to the norm than you may think.
  185. Although I came to love programming for the sake of programming itself I still didn't
  186. have a good grasp of how the computer was working under the hood. If you had told
  187. me back then that all of my code ran inside of a process I would have looked at you
  188. sideways.
  190. Fortunately for me I was given a great work opportunity at a local web startup. This
  191. gave me a chance to do some programming on a real production system. This changed
  192. everything for me. This gave me a reason to learn how things were working under the
  193. hood.
  194. As I worked on this high-traffic production system I was presented with increasingly
  195. complex problems. As our traffic and resource demands increased we had to begin
  196. looking at our full stack to debug and fix outstanding issues. By just focusing on
  197. the application code we couldn't get the full picture of how the app was functioning.
  198. We had many layers in front of the application: a firewall, load balancer, reverse proxy,
  199. and http cache. We had layers that worked alongside the application: job queue,
  200. database server, and stats collector. Every application will have a different set of
  201. components that comprise it, and this book won't teach you everything there is to
  202. know about all of it.
  203. This book will teach you all you need to know about Unix processes, and that is
  204. guaranteed to improve your understanding of any component at work in your
  205. application.
  206. Through debugging issues I was forced to dig deep into Ruby projects that made use of
  207. Unix programming concepts. Projects like Resque and Unicorn. These two projects
  208. were my introduction to Unix programming in Ruby.
  209. After getting a deeper understanding of how they were working I was able to
  210. diagnose issues faster and with greater understanding, as well as debug pesky
  211. problems that didn't make sense when looking at the application code by itself.
  212. I even started coming up with new, faster, more efficient solutions to the problems I
  213. was solving that used the techniques I was learning from these projects. Alright,
  214. enough about me. Let's go down the rabbit hole.
  216. Chapter 2
  217. Primer
  218. This section will provide background on some key concepts used in the book. It's
  219. definitely recommended that you read this before moving on to the meatier chapters.
  220. Why Care?
  221. The Unix programming model has existed, in some form, since 1970. It was then that
  222. Unix was famously invented at Bell Labs, along with the C programming
  223. language. In the decades that have elapsed since then Unix has stood the test of time
  224. as the operating system of choice for reliability, security, and stability.
  225. Unix programming concepts and techniques are not a fad, they're not the latest
  226. popular programming language. These techniques transcend programming languages.
  227. Whether you're programming in C, C++, Ruby, Python, JavaScript, Haskell, or [insert
  228. your favourite language here] these techniques WILL be useful.
  229. This stuff has existed, largely unchanged, for decades. Smart programmers have been
  230. using Unix programming to solve tough problems with a multitude of programming
  231. languages for the last 40 years, and they will continue to do so for the next 40 years.
  233. Harness the Power!
  234. I'll warn you now, the concepts and techniques described in this book can bring you
  235. great power. With this power you can create new software, understand complex
  236. software that is already out there, even use this knowledge to advance your career to
  237. the next level.
  238. Just remember, with great power comes great responsibility. Read on and I'll tell you
  239. everything you need to know to gain the power and avoid the pitfalls.
  240. Overview
  241. This book is not meant to be read as a reference manual. It's more of a walkthrough.
  242. To get the most out of it you should read it sequentially, since each chapter builds on
  243. the last. Once you're finished you can use the chapter headings to find information if
  244. you need a refresher.
  245. This book contains many code examples. I highly recommend that you follow along
  246. with them by actually running them yourself in a Ruby interpreter. Playing with the
  247. code yourself and making tweaks will help the concepts sink in that much more.
  248. Once you've read through the book and played with the examples I'm sure you'll be
  249. wanting to get your hands on a real world project that's a little more in depth. At that
  250. point have a look at the included Spyglass project.
  251. Spyglass is a web server that was created specifically for inclusion with this book. It's
  252. designed to teach Unix programming concepts. It takes the concepts you learn here
  254. and shows how a real-world project would put them to use. Have a look at the last
  255. chapter in this book for a deeper introduction.
  256. System Calls
  257. To understand system calls first requires a quick explanation of the components of a
  258. Unix system, specifically userland vs. the kernel.
  259. The kernel of your Unix system sits atop the hardware of your computer. It's a
  260. middleman for any interactions that need to happen with the hardware. This includes
  261. things like writing/reading from the filesystem, sending data over the network,
  262. allocating memory, or playing audio over the speakers. Given its power, programs are
  263. not allowed direct access to the kernel. Any communication is done via system calls.
  264. The system call interface connects the kernel to userland. It defines the interactions
  265. that are allowed between your program and the computer hardware.
  266. Userland is where all of your programs run. You can do a lot in your userland programs
  267. without ever making use of a system call: do mathematics, string operations, control
  268. flow with logical statements. But I'd go as far as saying that if you want your programs
  269. to do anything interesting then you'll need to involve the kernel via system calls.
  270. If you were a C programmer this stuff would probably be second nature to you. System
  271. calls are at the heart of C programming.
  272. But I'm going to expect that you, like me, don't have any C programming experience.
  273. You learned to program in a high level language. When you learned to write data to
  274. the filesystem you weren't told which system calls make that happen.
  276. The takeaway here is that system calls allow your user-space programs to interact
  277. indirectly with the hardware of your computer, via the kernel. We'll be looking at
  278. common system calls as we go through the chapters.
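To make the idea concrete, here's a minimal sketch (not taken from any particular project) that writes a file using Ruby's lower-level IO methods. IO.sysopen, IO#syswrite, and IO#close are thin wrappers that roughly correspond to open(2), write(2), and close(2). The helper name write_via_syscalls and the temp file path are assumptions made up for this illustration.

```ruby
require 'tmpdir'

# Hypothetical helper: write a message to a file using Ruby's low-level
# IO methods, each of which asks the kernel to do the real work.
def write_via_syscalls(path, message)
  fd = IO.sysopen(path, 'w')  # roughly open(2); returns an Integer
  io = IO.new(fd, 'w')        # wrap the raw number in an IO object
  io.syswrite(message)        # roughly write(2)
  io.close                    # roughly close(2)
  fd
end

# The path below is an assumption for the demo, not a meaningful location.
path = File.join(Dir.tmpdir, "syscall_demo_#{Process.pid}.txt")
write_via_syscalls(path, "hello from userland\n")
puts File.read(path)
File.delete(path)
```

The Integer returned by IO.sysopen is a file descriptor, something covered in a later chapter.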
  279. Nomenclature, wtf(2)
  280. One of the roadblocks to learning about Unix programming is where to find the
  281. proper documentation. Want to hear the kicker? It's all available via Unix manual
  282. pages (manpages), and if you're using a Unix based computer right now it's already on
  283. your computer!
  284. If you've never used manpages before you can start by invoking the command man man
  285. from a terminal.
  286. Perfect, right? Well, kind of. The manpages for the system call API are a great resource
  287. in two situations:
  288. 1. you're a C programmer who wants to know how to invoke a given system call,
  289. or
  290. 2. you're trying to figure out the purpose of a given system call
  291. I'm going to assume we're not C programmers here, so #1 isn't so useful, but #2 is very
  292. useful.
  293. You'll see references throughout this text to things like this: select(2). This bit of text is
  294. telling you where you can find the manpage for a given system call. You may or may
  295. not know this, but there are many sections to the Unix manpages.
  297. Here's a look at the most commonly used sections of the manpages for FreeBSD and
  298. Linux systems:
  299. • Section 1: General Commands
  300. • Section 2: System Calls
  301. • Section 3: C Library Functions
  302. • Section 4: Special Files
  303. So Section 1 is for general commands (a.k.a. shell commands). If I wanted to refer you
  304. to the manual page for the find command I would write it like this: find(1). This tells
  305. you that there is a manual page for find in section 1 of the manpages.
  306. If I wanted to refer to the manual page for the getpid system call I would write it like
  307. this: getpid(2). This tells you that there is a manual page for getpid in section 2 of the
  308. manpages.
  309. Why do manpages need multiple sections? Because a command may be
  310. available in more than one section, i.e. available as both a shell command and a
  311. system call.
  312. Take stat(1) and stat(2) as an example.
  313. In order to access other sections of the manpages you can specify it like this on the
  314. command line:
  316. $ man 2 getpid
  317. $ man 3 malloc
  318. $ man find # same as man 1 find
  319. This nomenclature was not invented for this book, it's a convention that's used
  320. everywhere[1] when referring to the manpages. So it's a good idea to learn it now and
  321. get comfortable with seeing it.
  323. Processes: The Atoms of Unix
  324. Processes are the building blocks of a Unix system. Why? Because any code that is
  325. executed happens inside a process.
  326. For example, when you launch ruby from the command line a new process is created
  327. for your code. When your code is finished that process exits.
  328. $ ruby -e "p Time.now"
  329. The same is true for all code running on your system. You know that MySQL server
  330. that's always running? That's running in its own process. The e-reader software you're
  331. using right now? That's running in its own process. The email client that's desperately
  332. trying to tell you you have new messages? You should ignore it by the way and keep
  333. reading! It also runs in its own process.
  334. [1] http://en.wikipedia.org/wiki/Man_page#Usage
  336. Things start to get interesting when you realize that one process can spawn and
  337. manage many others. We'll be taking a look at that over the course of this book.
  339. Chapter 3
  340. Processes Have IDs
  341. Every process running on your system has a unique process identifier, hereafter referred
  342. to as 'pid'.
  343. The pid doesn't say anything about the process itself, it's simply a sequential numeric
  344. label. This is how the kernel sees your process: as a number.
  345. Here's how we can inspect the current pid in a ruby program. Fire up irb and try this:
  346. # This line will print the pid of the current ruby process. This might be an
  347. # irb process, a rake process, a rails server, or just a plain ruby script.
  348. puts Process.pid
  349. A pid is a simple, generic representation of a process. Since it's not tied to any aspect
  350. of the content of the process it can be understood from any programming language
  351. and with simple tools. We'll see below how we can use the pid to trace the process
  352. details using different utilities.
  353. Cross Referencing
  354. To get a full picture, we can use ps(1) to cross-reference our pid with what the kernel is
  355. seeing. Leaving your irb session open run the following command at a terminal:
  357. $ ps -p <pid-of-irb-process>
  358. That command should show a process called 'irb' with a pid matching what was
  359. printed in the irb session.
  360. In the Real World
  361. Just knowing the pid isn't all that useful in itself. So where is it used?
  362. A common place you'll find pids in the real world is in log files. When you have
  363. multiple processes logging to one file it's imperative that you're able to tell which log
  364. line comes from which process. Including the pid in each line solves that problem.
  365. Including the pid also allows you to cross reference information with the OS, through
  366. the use of commands like top(1) or lsof(1). Here's some sample output from the
  367. Spyglass server booting up. The first square brackets of each line denote the pid where
  368. the log line is coming from.
  369. [58550] [Spyglass::Server] Listening on port 4545
  370. [58550] [Spyglass::Lookout] Received incoming connection
  371. [58557] [Spyglass::Master] Loaded the app
  372. [58557] [Spyglass::Master] Spawned 4 workers. Babysitting now...
  373. [58558] [Spyglass::Worker] Received connection
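As a rough sketch of how log lines like these might be produced, the snippet below prefixes each message with the current pid. The method name pid_log_line and the Demo::Server component are hypothetical; this is not Spyglass's actual logging code.

```ruby
# Hypothetical helper: build a log line prefixed with the pid of the
# process that is writing it, followed by a component name.
def pid_log_line(component, message)
  "[#{Process.pid}] [#{component}] #{message}"
end

puts pid_log_line('Demo::Server', 'Listening on port 4545')
```

When several processes append to the same file, the leading pid lets you pass the number straight to tools like ps(1) or lsof(1).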
  375. System Calls
  376. Ruby's Process.pid maps to getpid(2).
  377. There is also a global variable that holds the value of the current pid. You can
  378. access it with $$ .
  379. Ruby inherits this behaviour from other languages before it (both Perl and bash
  380. support $$ ), however I avoid it when possible. Typing out Process.pid in full is
  381. much more expressive of your intent than the dollar-dollar variable, and less likely
  382. to confuse those who haven't seen the dollar-dollar before.
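If you'd like to convince yourself that the two are interchangeable, a quick check along these lines should do. The English standard library is included here only to show that a more readable alias for the dollar-dollar variable also exists.

```ruby
require 'English'

# All three expressions refer to the pid of the current process.
puts Process.pid == $$
puts Process.pid == $PROCESS_ID
```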
  384. Chapter 4
  385. Processes Have Parents
  386. Every process running on your system has a parent process. Each process knows its
  387. parent process identifier (hereafter referred to as 'ppid').
  388. In the majority of cases the parent process for a given process is the process that
  389. invoked it. For example, say you're an OS X user who starts up Terminal.app and lands in a
  390. bash prompt. Since everything is a process that action started a new Terminal.app
  391. process, which in turn started a bash process.
  392. The parent of that new bash process will be the Terminal.app process. If you then
  393. invoke ls(1) from the bash prompt, the parent of that ls process will be the bash
  394. process. You get the picture.
  395. Since the kernel deals only in pids there is a way to get the pid of the current parent
  396. process. Here's how it's done in Ruby:
  397. # Notice that this is only one character different from getting the
  398. # pid of the current process.
  399. puts Process.ppid
  400. Cross Referencing
  401. Leaving your irb session open run the following command at a terminal:
  403. $ ps -p <ppid-of-irb-process>
  404. That command should show a process called 'bash' (or 'zsh' or whatever) with a pid
  405. that matches the one that was printed in your irb session.
  406. In the Real World
  407. There aren't a ton of uses for the ppid in the real world. It can be important when
  408. detecting daemon processes, something covered in a later chapter.
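To see the parent/child relationship from inside Ruby, here's a small sketch. It leans on fork and Process.wait2, both of which are covered in later chapters, so treat it as a preview rather than something you need to understand fully yet. The method name child_sees_parent_pid? is made up for this example.

```ruby
# Hypothetical helper: fork a child and have it check that its ppid
# matches the pid of the process that created it.
def child_sees_parent_pid?
  parent_pid = Process.pid

  child_pid = fork do
    # Inside the child. exit! skips at_exit handlers; a status of 0
    # signals that the ppid matched.
    exit!(Process.ppid == parent_pid ? 0 : 1)
  end

  # Wait for the child to finish and inspect its exit status.
  _, status = Process.wait2(child_pid)
  status.success?
end

puts child_sees_parent_pid?
```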
  409. System Calls
  410. Ruby's Process.ppid maps to getppid(2).
  412. Chapter 5
  413. Processes Have File Descriptors
  415. In much the same way as pids represent running processes, file descriptors represent
  416. open files.
  417. Everything is a File
  418. A part of the Unix philosophy: in the land of Unix 'everything is a file'. This means that
  419. devices are treated as files, sockets and pipes are treated as files, and files are treated as
  420. files.
  421. Since all of these things are treated as files I'm going to use the word 'resource'
  422. when I'm talking about files in a general sense (including devices, pipes, sockets,
  423. etc.) and I'll use the word 'file' when I mean the classical definition (a file on the
  424. file system).
  425. Descriptors Represent Resources
  426. Any time that you open a resource in a running process it is assigned a file descriptor
  427. number. File descriptors are NOT shared between unrelated processes, they live and
  428. die with the process they are bound to, just as any open resources for a process are
  430. closed when it exits. There are special semantics for file descriptor sharing when you
  431. fork a process, more on that later.
  432. In Ruby, open resources are represented by the IO class. Any IO object can have an
  433. associated file descriptor number. Use IO#fileno to get access to it.
  434. passwd = File.open('/etc/passwd')
  435. puts passwd.fileno
  436. outputs:
  437. 3
  438. Any resource that your process opens gets a unique number identifying it. This is how
  439. the kernel keeps track of any resources that your process is using.
  440. What happens when we have multiple resources open?
  442. passwd = File.open('/etc/passwd')
  443. puts passwd.fileno
  444. hosts = File.open('/etc/hosts')
  445. puts hosts.fileno
  446. # Close the open passwd file. This frees up its file descriptor
  447. # number to be used by the next opened resource.
  448. passwd.close
  449. null = File.open('/dev/null')
  450. puts null.fileno
  451. outputs:
  452. 3
  453. 4
  454. 3
  455. There are two key takeaways from this example.
  456. 1. File descriptor numbers are assigned the lowest unused value. The first file we
  457. opened, passwd , got file descriptor #3, the next open file got #4 because #3 was
  458. already in use.
  459. 2. Once a resource is closed its file descriptor number becomes available again.
  460. Once we closed the passwd file its file descriptor number became available
  461. again. So when we opened the file at /dev/null it was assigned the lowest
  462. unused value, which was then #3.
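The same numbering rules should apply to resources that aren't classical files. Here's a minimal sketch that opens a device and a pipe and collects their file descriptor numbers. The method name resource_filenos is an assumption for illustration, and the exact numbers you see will depend on what your process already has open.

```ruby
# Hypothetical helper: open a device and both ends of a pipe, then
# return the file descriptor number assigned to each.
def resource_filenos
  device = File.open('/dev/null')  # a device, treated as a file
  reader, writer = IO.pipe         # a pipe has a read end and a write end
  ios = [device, reader, writer]
  ios.map(&:fileno)
ensure
  # Close everything so the descriptor numbers become available again.
  ios.each(&:close) if ios
end

p resource_filenos
```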
  464. It's important to note that file descriptors keep track of open resources only. Closed
  465. resources are not given a file descriptor number.
  466. Stepping back to the kernel's viewpoint again this makes a lot of sense. Once a
  467. resource is closed it no longer needs to interact with the hardware layer so the kernel
  468. can stop keeping track of it.
  469. Given the above, file descriptors are sometimes called 'open file descriptors'. This is a
  470. bit of a misnomer since there is no such thing as a 'closed file descriptor'. In fact, trying
  471. to read the file descriptor number from a closed resource will raise an exception:
  472. passwd = File.open('/etc/passwd')
  473. puts passwd.fileno
  474. passwd.close
  475. puts passwd.fileno
  476. outputs:
  477. 3
  478. -e:4:in `fileno': closed stream (IOError)
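If your code might be handed an IO object that has already been closed, one option is to rescue the IOError (or check IO#closed? first). The helper name safe_fileno below is hypothetical; it's just a sketch of the idea.

```ruby
# Hypothetical helper: return the file descriptor number for an IO
# object, or nil if the underlying resource has already been closed.
def safe_fileno(io)
  io.fileno
rescue IOError
  nil
end

devnull = File.open('/dev/null')
p safe_fileno(devnull)   # an Integer while the resource is open
devnull.close
p devnull.closed?
p safe_fileno(devnull)   # nil once the resource is closed
```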
  479. You may have noticed that when we open a file and ask for its file descriptor number
  480. the lowest value we get is 3. What happened to 0, 1, and 2?
  482. Standard Streams
  483. Every Unix process comes with three open resources. These are your standard input
  484. (STDIN), standard output (STDOUT), and standard error (STDERR) resources.
  485. These standard resources exist for a very important reason that we take for granted
  486. today. STDIN provides a generic way to read input from keyboard devices or pipes,
  487. STDOUT and STDERR provide generic ways to write output to monitors, files,
  488. printers, etc. This was one of the innovations of Unix.
  489. Before STDIN existed your program had to include a keyboard driver for all the
  490. keyboards it wanted to support! And if it wanted to print something to the screen it
  491. had to know how to manipulate the pixels required to do so. So let's all be thankful for
  492. standard streams.
  493. puts STDIN.fileno
  494. puts STDOUT.fileno
  495. puts STDERR.fileno
  496. outputs:
  497. 0
  498. 1
  499. 2
  500. That's where those first 3 file descriptor numbers went to.
  502. In the Real World
  503. File descriptors are at the core of network programming using sockets, pipes, etc. and
  504. are also at the core of any file system operations.
  505. Hence, they are used by every running process and are at the core of most of the
  506. interesting stuff you can do with a computer. You'll see many more examples of how to
  507. use them in the following chapters or in the attached Spyglass project.
  508. System Calls
  509. Many methods on Ruby's IO class map to system calls of the same name. These
  510. include open(2), close(2), read(2), write(2), pipe(2), fsync(2), stat(2), among others.
  512. Chapter 6
  513. Processes Have Resource Limits
  515. In the last chapter we looked at the fact that open resources are represented by file
  516. descriptors. You may have noticed that when resources aren't being closed the file
  517. descriptor numbers continue to increase. This raises the question: how many file
  518. descriptors can one process have?
  519. The answer depends on your system configuration, but the important point is there
  520. are some resource limits imposed on a process by the kernel.
  521. Finding the Limits
  522. We'll continue on the subject of file descriptors. Using Ruby we can ask directly for the
  523. maximum number of allowed file descriptors:
  524. p Process.getrlimit(:NOFILE)
  525. On my machine this snippet outputs:
  526. [2560, 9223372036854775807]
  528. We used a method called Process.getrlimit and asked for the maximum number of
  529. open files using the symbol :NOFILE . It returned a two-element Array.
  530. The first element in the Array is the soft limit for the number of file descriptors, the
  531. second element in the Array is the hard limit for the number of file descriptors.
  532. Soft Limits vs. Hard Limits
  533. What's the difference? Glad you asked. The soft limit isn't really a hard ceiling: if
  534. you exceed the soft limit (in this case by opening more than 2560 resources at once) an
  535. exception will be raised, but you can always raise that limit, up to the hard limit, if you want to.
  536. Note that the hard limit on my system for the number of file descriptors is a
  537. ridiculously large integer. Is it even possible to open that many? Likely not, I'm
  538. sure you'd run into hardware constraints before that many resources could be
  539. opened at once.
  540. On my system that number actually represents infinity. It's repeated in the
  541. constant Process::RLIM_INFINITY . Try comparing those two values to be sure. So,
  542. on my system, I can effectively open as many resources as I'd like, once I bump the
  543. soft limit for my needs.
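Here's a small sketch of that comparison. It destructures the two-element Array into a soft and a hard limit and prints 'unlimited' for any value equal to Process::RLIM_INFINITY, which is the name Ruby's Process module gives this constant. The helper name describe_limit is made up for this example, and the values printed will differ from system to system.

```ruby
# Hypothetical helper: render a resource limit for humans, showing
# 'unlimited' when the value is the platform's infinity marker.
def describe_limit(value)
  value == Process::RLIM_INFINITY ? 'unlimited' : value.to_s
end

soft, hard = Process.getrlimit(:NOFILE)
puts "soft limit: #{describe_limit(soft)}"
puts "hard limit: #{describe_limit(hard)}"
```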
  544. So any process is able to change its own soft limit, but what about the hard limit?
  545. Typically that can only be done by a superuser. However, your process is also able to
  546. bump the hard limit assuming it has the required permissions. If you're interested in
  547. changing the limits at a system-wide level then start by having a look at sysctl(8).
  549. Bumping the Soft Limit
  550. Let's go ahead and bump the soft limit for the current process:
  551. Process.setrlimit(:NOFILE, 4096)
  552. p Process.getrlimit(:NOFILE)
  553. outputs:
  554. [4096, 4096]
  555. You can see that we set a new limit for the number of open files, and upon asking for
  556. that limit again both the hard limit and the soft limit were set to the new value 4096.
  557. We can optionally pass a third argument to Process.setrlimit specifying a new hard
  558. limit as well, assuming we have the permissions to do so. Note that lowering the hard
  559. limit, as we did in that last snippet, is irreversible: once it comes down it won't go back
  560. up.
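If you want to experiment without permanently lowering the hard limit, one approach is to always pass the existing hard limit back as the third argument. The sketch below does that and restores the original soft limit afterwards. The method name with_soft_nofile_limit and the value 256 are assumptions chosen for illustration.

```ruby
# Hypothetical helper: temporarily change only the soft limit for open
# files, leaving the hard limit untouched, then restore the original.
def with_soft_nofile_limit(new_soft)
  soft, hard = Process.getrlimit(:NOFILE)
  # The soft limit can never exceed the hard limit, so clamp it.
  Process.setrlimit(:NOFILE, [new_soft, hard].min, hard)
  yield Process.getrlimit(:NOFILE)
ensure
  # Put the original soft limit back, again passing the hard limit through.
  Process.setrlimit(:NOFILE, soft, hard) if soft
end

with_soft_nofile_limit(256) do |current_soft, current_hard|
  puts "inside:  soft=#{current_soft} hard=#{current_hard}"
end
puts "outside: #{Process.getrlimit(:NOFILE).inspect}"
```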
  561. The following example is a common way to raise the soft limit of a system resource to
  562. be equal with the hard limit, the maximum allowed value.
  563. Process.setrlimit(:NOFILE, Process.getrlimit(:NOFILE)[1])
  565. Exceeding the Limit
  566. Note that exceeding the soft limit will raise Errno::EMFILE :
  567. # Set the maximum number of open files to 3. We know this
  568. # will be maxed out because the standard streams occupy
  569. # the first three file descriptors.
  570. Process.setrlimit(:NOFILE, 3)
  571. File.open('/dev/null')
  572. outputs:
  573. Errno::EMFILE: Too many open files - /dev/null
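Since that snippet also drags the hard limit down to 3 for the rest of the process's life, you may prefer to try it in a throwaway child process. The sketch below is one way to do that, assuming fork is available on your platform and that the standard streams are open as usual. It lowers only the soft limit in the child and reports back through the exit status. The method name emfile_raised_in_child? is hypothetical, and fork and Process.wait2 are explained in later chapters.

```ruby
# Hypothetical helper: in a child process, lower the soft limit for open
# files to 3 and check whether opening one more resource raises EMFILE.
def emfile_raised_in_child?
  pid = fork do
    _soft, hard = Process.getrlimit(:NOFILE)
    # Keep the hard limit as-is; only the soft limit is lowered.
    Process.setrlimit(:NOFILE, 3, hard)
    begin
      File.open('/dev/null')
      exit!(1)  # the open unexpectedly succeeded
    rescue Errno::EMFILE
      exit!(0)  # the limit was enforced
    end
  end

  _, status = Process.wait2(pid)
  status.exitstatus == 0
end

puts emfile_raised_in_child?
```

The parent's own limits are untouched because resource limits are per-process.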
  574. Other Resources
  575. You can use these same methods to check and modify limits on other system
  576. resources. Some common ones are:
  578. # The maximum number of simultaneous processes
  579. # allowed for the current user.
  580. Process.getrlimit(:NPROC)
  581. # The largest size file that may be created.
  582. Process.getrlimit(:FSIZE)
  583. # The maximum size of the stack segment of the
  584. # process.
  585. Process.getrlimit(:STACK)
  586. Have a look at the documentation[1] for Process.getrlimit for a full listing of the
  587. available options.
  590. In the Real World
Most programs never need to modify limits for system resources. However, for some specialized tools this can be very important.
  593. One use case is any process needing to handle thousands of simultaneous network
  594. connections. An example of this is the httperf(1) http performance tool. A command
  595. like httperf --hog --server www --num-conn 5000 will ask httperf(1) to create 5000
  596. concurrent connections. Obviously this will be a problem on my system due to its
  597. default soft limit, so httperf(1) will need to bump its soft limit before it can properly do
  598. its testing.
  599. 1.http://www.ruby-doc.org/core-1.9.3/Process.html#method-c-setrlimit
  601. Another real world use case for limiting system resources is a situation where you
  602. execute third-party code and need to keep it within certain constraints. You could set
  603. limits for the processes running that code and revoke the permissions required to
  604. change them, hence ensuring that they don't use more resources than you allow for
  605. them.
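One way this might look in Ruby is to hand the limits to Process.spawn, which accepts rlimit_* options that apply only to the new process. This is just a sketch: the run_with_core_limit helper is hypothetical, and the core dump limit is used only because lowering it is harmless. Lowering the hard limit is what stops an unprivileged child from raising it again.

```ruby
require 'rbconfig'

# Hypothetical helper: run a Ruby one-liner in a child whose core-dump
# limit is pinned to 0. Returns whatever the child printed to STDOUT.
def run_with_core_limit(code)
  reader, writer = IO.pipe
  pid = Process.spawn(RbConfig.ruby, '-e', code,
                      rlimit_core: [0, 0], # [soft, hard], child only
                      out: writer)
  writer.close          # the parent keeps only the read end
  output = reader.read  # returns once the child closes its STDOUT
  reader.close
  Process.wait(pid)
  output
end

puts run_with_core_limit('p Process.getrlimit(:CORE)')
```

The parent's own limits are not affected; only the spawned process starts life with the lowered values.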
  606. System Calls
  607. Ruby's Process.getrlimit and Process.setrlimit map to getrlimit(2) and setrlimit(2),
  608. respectively.
  610. Chapter 7
Processes Have an Environment
  613. Environment, in this sense, refers to what's known as 'environment variables'.
  614. Environment variables are key-value pairs that hold data for a process.
Every process inherits environment variables from its parent: they are set by a parent process and inherited by its child processes. Environment variables are per-process, and within each process they are global.
  618. Here's a simple example of setting an environment variable in a bash shell, launching a
  619. Ruby process, and reading that environment variable.
  620. $ MESSAGE='wing it' ruby -e "puts ENV['MESSAGE']"
  621. The VAR=value syntax is the bash way of setting environment variables. The same thing
  622. can be accomplished in Ruby using the ENV constant.
  623. # The same thing, with places reversed!
  624. ENV['MESSAGE'] = 'wing it'
  625. system "echo $MESSAGE"
  626. Both of these examples print:
  628. wing it
In bash, environment variables are accessed using the syntax $VAR. As you can tell from these few examples, environment variables can be used to share state between processes running different languages, bash and Ruby in this case.
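The inheritance only goes one way: a child gets a copy of the environment, and whatever it does to that copy stays in the child. Here's a small sketch of that, using backticks to capture a shell's output. The message_seen_by_child helper is a name invented for this example.

```ruby
# The child process started by backticks gets a copy of our environment.
ENV['MESSAGE'] = 'wing it'

def message_seen_by_child
  `echo $MESSAGE`.chomp
end

puts message_seen_by_child # the shell child reads the inherited value

# A child can change its own copy, but the change never travels back up.
system('MESSAGE=changed; export MESSAGE')
puts ENV['MESSAGE']        # the parent still has its original value
```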
  632. It's a hash, right?
Although ENV uses the hash-style accessor API it's not actually a Hash. For instance, it implements Enumerable and some of the Hash API, but not all of it. Key methods like merge are not implemented. So you can do things like ENV.has_key?, but don't count on all hash operations working.
  637. puts ENV['EDITOR']
  638. puts ENV.has_key?('PATH')
  639. puts ENV.is_a?(Hash)
  640. outputs:
  641. vim
  642. true
  643. false
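When you do want the full Hash API, one option is to take a snapshot with ENV.to_hash and work on that. A quick sketch, where env_with is a hypothetical helper and GREETING is an arbitrary key:

```ruby
# ENV.to_hash returns a plain Hash snapshot of the environment.
# Merging into the snapshot leaves the real environment untouched.
def env_with(overrides)
  ENV.to_hash.merge(overrides)
end

snapshot = env_with('GREETING' => 'hello')
puts snapshot.is_a?(Hash)     # a real Hash this time
puts snapshot['GREETING']
puts ENV.has_key?('GREETING') # false, unless GREETING was already set
```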
  645. In the Real World
In the real world environment variables have many uses. Here are a few that are common workflows in the Ruby community:
$ RAILS_ENV=production rails server
$ EDITOR=mate bundle open actionpack
$ QUEUE=default rake resque:work
  651. Environment variables are often used as a generic way to accept input into a
  652. command-line program. Any terminal (on Unix or Windows) already supports them
  653. and most programmers are familiar with them. Using environment variables is often
  654. less overhead than explicitly parsing command line options.
  655. System Calls
  656. There are no system calls for directly manipulating environment variables, but the C
  657. library functions setenv(3) and getenv(3) do the brunt of the work. Also have a look at
  658. environ(7) for an overview.
  660. Chapter 8
  661. Processes Have Arguments
  662. Every process has access to a special array called ARGV . Other programming languages
  663. may implement it slightly differently, but every one has something called 'argv'.
  664. argv is a short form for 'argument vector'. In other words: a vector, or array, of
  665. arguments. It holds the arguments that were passed in to the current process on the
  666. command line. Here's an example of inspecting ARGV and passing in some simple
  667. options.
  668. $ cat argv.rb
  669. p ARGV
  670. $ ruby argv.rb foo bar -va
  671. ["foo", "bar", "-va"]
  672. It's an Array!
Unlike the previous chapter, where we learned that ENV isn't a Hash, ARGV is simply an Array. You can add elements to it, remove elements from it, change the elements it contains, whatever you like. But if it simply represents the arguments passed in on the command line why would you need to change anything?
  677. Some libraries will read from ARGV to parse command line options, for example. You
  678. can programmatically change ARGV before they have a chance to see it in order to
  679. modify the options at runtime.
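Here's a rough sketch of that idea using optparse from the standard library. The parse_options helper and its --verbose flag are invented for the example, and a local array stands in for ARGV so the snippet doesn't depend on how it was launched; ARGV behaves the same way since it's a plain Array.

```ruby
require 'optparse'

# Hypothetical parser with a single --verbose flag.
def parse_options(argv)
  options = { verbose: false }
  parser = OptionParser.new do |opts|
    opts.on('-v', '--verbose') { options[:verbose] = true }
  end
  parser.parse!(argv) # parse! removes the options it recognizes from argv
  options
end

# Force verbose mode on before the parser ever looks at the arguments.
args = ['report.txt']
args.unshift('--verbose')
options = parse_options(args)
puts options[:verbose] # true
p args                 # ["report.txt"]
```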
  681. In the Real World
  682. The most common use case for ARGV is probably for accepting filenames into a
  683. program. It's very common to write a program that takes one or more filenames as
  684. input on the command line and does something useful with them.
  685. The other common use case, as mentioned, is for parsing command line input. There
  686. are many Ruby libraries for dealing with command line input. One called optparse is
  687. available as part of the standard library.
  688. But now that you know how ARGV works you can skip that extra overhead for simple
  689. command line options and do it by hand. If you just want to support a few flags you
  690. can implement them directly as array operations.
  691. # did the user request help?
  692. ARGV.include?('--help')
  693. # get the value of the -c option
  694. ARGV.include?('-c') && ARGV[ARGV.index('-c') + 1]
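If you find yourself repeating that last line, it can be wrapped up in a tiny helper. This is only a sketch, and option_value is a made-up name. It returns nil both when the flag is missing and when nothing follows it.

```ruby
# Hypothetical helper: return the argument following `flag`, or nil when
# the flag is absent or is the last element.
def option_value(argv, flag)
  index = argv.index(flag)
  index && argv[index + 1]
end

p option_value(['-c', 'config.yml', 'input.txt'], '-c') # "config.yml"
p option_value(['input.txt'], '-c')                     # nil
p option_value(['input.txt', '-c'], '-c')               # nil
```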
  696. Chapter 9
  697. Processes Have Names
  698. Unix processes have very few inherent ways of communicating about their state.
  699. Programmers have worked around this and invented things like logfiles. Logfiles allow
  700. processes to communicate anything they want about their state by writing to the
  701. filesystem, but this operates at the level of the filesystem rather than being inherent to
  702. the process itself.
  703. Similarly, processes can use the network to open sockets and communicate with other
  704. processes. But again, that operates at a different level than the process itself, since it
  705. relies on the network.
  706. There are two mechanisms that operate at the level of the process itself that can be
  707. used to communicate information. One is the process name, the other is exit codes.
  708. Naming Processes
  709. Every process on the system has a name. For example, when you start up an irb
  710. session that process is given the name 'irb'. The neat thing about process names is that
  711. they can be changed at runtime and used as a method of communication.
  712. In Ruby you can access the name of the current process in the $PROGRAM_NAME variable.
  713. Similarly, you can assign a value to that global variable to change the name of the
  714. current process.
puts $PROGRAM_NAME
10.downto(1) do |num|
  $PROGRAM_NAME = "Process: #{num}"
  puts $PROGRAM_NAME
end
  721. outputs:
  722. irb
  723. Process: 10
  724. Process: 9
  725. Process: 8
  726. Process: 7
  727. Process: 6
  728. Process: 5
  729. Process: 4
  730. Process: 3
  731. Process: 2
  732. Process: 1
  733. As a fun exercise you can start an irb session, print the pid, and change the process
  734. name. Then you can use the ps(1) utility to see your changes reflected on the system.
Unfortunately, as of Ruby 1.9.3 this global variable (and its mirror $0) is the only mechanism provided by Ruby for this feature. There is not a more intent-revealing way to change the name of the current process.
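Ruby 2.1 and later also provide Process.setproctitle, which changes the title shown by ps(1) without touching $PROGRAM_NAME. Here's a sketch of a status-reporting helper that uses it when available and falls back to the global otherwise. The report_status name and the title format are assumptions made for this example.

```ruby
# Hypothetical helper: report progress through the process title so that
# ps(1) can display it.
def report_status(message)
  title = "worker[#{Process.pid}]: #{message}"
  if Process.respond_to?(:setproctitle)
    Process.setproctitle(title) # Ruby 2.1+; leaves $PROGRAM_NAME alone
  else
    $PROGRAM_NAME = title       # the only option on older Rubies
  end
  title
end

puts report_status('waiting for jobs')
```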
  739. In the Real World
  740. To see an example of how this is used in a real project read through How Resque
  741. Manages Processes in the appendices.
  743. Chapter 10
  744. Processes Have Exit Codes
  745. When a process comes to an end it has one last chance to make its mark on the world:
  746. its exit code. Every process that exits does so with a numeric exit code (0-255)
  747. denoting whether it exited successfully or with an error.
  748. Traditionally, a process that exits with an exit code of 0 is said to be successful. Any
  749. other exit code denotes an error, with different codes pointing to different errors.
  750. Though traditionally they're used to denote different errors, they're really just a
  751. channel for communication. All you need to do is handle the different exit codes that a
  752. process may exit with in a way that suits your program and you've gotten away from
  753. the traditions.
  754. It's usually a good idea to stick with the '0 as success' exit code tradition so that your
  755. programs will play nicely with other Unix tools.
  756. How to Exit a Process
  757. There are several ways you can exit a process in Ruby, each for different purposes.
  758. exit
  759. The simplest way to exit a process is using Kernel#exit . This is also what happens
  760. implicitly when your script ends without an explicit exit statement.
  762. # This will exit the program with the success status code (0).
  763. exit
  764. # You can pass a custom exit code to this method
  765. exit 22
  766. # When Kernel#exit is invoked, before exiting Ruby invokes any blocks
  767. # defined by Kernel#at_exit.
  768. at_exit { puts 'Last!' }
  769. exit
  770. will output:
  771. Last!
  772. exit!
  773. Kernel#exit! is almost exactly the same as Kernel#exit , but with two key differences.
  774. The first is that it sets an unsuccessful status code by default (1), and the second is that
  775. it will not invoke any blocks defined using Kernel#at_exit .
  776. # This will exit the program with a status code 1.
  777. exit!
  779. # You can still pass an exit code.
  780. exit! 33
  781. # This block will never be invoked.
  782. at_exit { puts 'Silence!' }
  783. exit!
  784. abort
  785. Kernel#abort provides a generic way to exit a process unsuccessfully. Kernel#abort will
  786. set the exit code to 1 for the current process.
  787. # Will exit with exit code 1.
  788. abort
  789. # You can pass a message to Kernel#abort. This message will be printed
  790. # to STDERR before the process exits.
  791. abort "Something went horribly wrong."
  792. # Kernel#at_exit blocks are invoked when using Kernel#abort.
  793. at_exit { puts 'Last!' }
  794. abort "Something went horribly wrong."
  795. will output:
  797. Something went horribly wrong.
  798. Last!
  799. raise
  800. A different way to end a process is with an unhandled exception. This is something
  801. that you never want to happen in a production environment, but it's almost always
  802. happening in development and test environments.
  803. Note that Kernel#raise , unlike the previous methods, will not exit the process
  804. immediately. It simply raises an exception that may be rescued somewhere up the
  805. stack. If the exception is not rescued anywhere in the codebase then the unhandled
  806. exception will cause the process to exit.
  807. Ending a process this way will still invoke any at_exit handlers and will print the
  808. exception message and backtrace to STDERR .
  809. # Similar to abort, an unhandled exception will set the exit code to 1.
  810. raise 'hell'
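To compare the exit codes from this chapter side by side, you can run each snippet in its own Ruby process and inspect $? afterwards. This is a sketch: exit_status_of is a made-up helper, and it assumes the ruby binary that's running the script can be launched again through RbConfig.ruby.

```ruby
require 'rbconfig'

# Run a Ruby snippet in a separate process and return its exit code.
# STDERR is discarded so abort/raise messages don't clutter the output.
def exit_status_of(code)
  system(RbConfig.ruby, '-e', code, err: File::NULL)
  $?.exitstatus
end

['exit', 'exit 22', 'exit!', 'exit! 33', 'abort', 'raise "hell"'].each do |code|
  puts "#{code.ljust(14)} => #{exit_status_of(code)}"
end
```

The numbers printed should line up with the defaults described above: 0 for a plain exit, 1 for exit!, abort and an unhandled exception, and the custom code wherever one was passed.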
  812. Chapter 11
  813. Processes Can Fork
  814. Use the fork(2), Luke
Forking is one of the most powerful concepts in Unix programming. The fork(2) system call allows a running process to create a new process programmatically. This new process is an exact copy of the original process.
  818. Up until now we've talked about creating processes by launching them from the
  819. terminal. We've also mentioned low level operating system processes that create other
  820. processes: fork(2) is how they do it.
  821. When forking, the process that initiates the fork(2) is called the "parent", and the
  822. newly created process is called the "child".
  823. The child process inherits a copy of all of the memory in use by the parent
  824. process, as well as any open file descriptors belonging to the parent process.
Let's take a moment to review child processes through the lens of our first three chapters.
  826. Since the child process is an entirely new process, it gets its own unique pid.
  827. The parent of the child process is, obviously, its parent process. So its ppid is set to the
  828. pid of the process that initiated the fork(2).
  830. The child process inherits any open file descriptors from the parent at the time of the
  831. fork(2). It's given the same map of file descriptor numbers that the parent process has.
  832. In this way the two processes can share open files, sockets, etc.
  833. The child process inherits a copy of everything that the parent process has in main
  834. memory. In this way a process could load up a large codebase, say a Rails app, that
  835. occupies 500MB of main memory. Then this process can fork 2 new child processes.
  836. Each of these child processes would effectively have their own copy of that codebase
  837. loaded in memory.
  838. The call to fork returns near-instantly so we now have 3 processes with each using
  839. 500MB of memory. Perfect for when you want to have multiple instances of your
  840. application loaded in memory at the same time. Because only one process needs to
  841. load the app and forking is fast, this method is faster than loading the app 3 times in
  842. separate instances.
  843. The child processes would be free to modify their copy of the memory without
  844. affecting what the parent process has in memory. See the next chapter for a discussion
  845. of copy-on-write and how it affects memory when forking.
  846. Let's get started with forking in Ruby by looking at a mind-bending example:
if fork
  puts "entered the if block"
else
  puts "entered the else block"
end
  852. outputs:
  854. entered the if block
  855. entered the else block
  856. WTF! What's going on here? A call to the fork method has taken the once-familiar
  857. if construct and turned it on its head. Somehow this piece of code is entering both
  858. the if and else block of the if construct!
  859. It's no mystery what's happening here. One call to the fork method actually returns
  860. twice. Remember that fork creates a new process. So it returns once in the calling
  861. process (parent) and once in the newly created process (child).
  862. The last example becomes more obvious if we print the pids.
puts "parent process pid is #{Process.pid}"
if fork
  puts "entered the if block from #{Process.pid}"
else
  puts "entered the else block from #{Process.pid}"
end
  869. outputs:
parent process pid is 21268
  871. entered the if block from 21268
  872. entered the else block from 21282
  874. Now it becomes clear that the code in the if block is being executed by the parent
  875. process, while the code in the else block is being executed by the child process. The
  876. child process will exit after executing its code in the else block, while the parent
  877. process will carry on.
  878. Again, there's a rhythm to this beat, and it has to do with the return value of the fork
  879. method. In the child process fork returns nil . Since nil is falsy it executes the
  880. code in the else block.
  881. In the parent process fork returns the pid of the newly created child process.
  882. Since an integer is truthy it executes the code in the if block.
  883. This concept is illustrated nicely by simply printing the return value of a fork call.
p fork
  885. outputs
  886. 21423
  887. nil
  888. Here we have the two different return values. The first value returned is the pid of the
  889. newly created child process; this comes from the parent. The second return value is
  890. the nil from the child process.
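The two return values can also be checked from code rather than by eye. In the sketch below the child confirms that its ppid matches the process that forked it and reports back through its exit code. The fork_and_compare helper is a made-up name, and it leans on Process.wait2, which is covered in a later chapter.

```ruby
# Fork once and report what each side of the fork saw.
def fork_and_compare
  parent_pid = Process.pid
  child_pid = fork

  if child_pid.nil?
    # Child: fork returned nil, and our parent is the process that forked.
    # exit! skips any at_exit handlers inherited from the parent.
    exit!(Process.ppid == parent_pid ? 0 : 1)
  end

  # Parent: fork returned the child's pid. Collect the child's exit code.
  _, status = Process.wait2(child_pid)
  { child_pid: child_pid, child_saw_parent: status.exitstatus == 0 }
end

result = fork_and_compare
puts "forked child #{result[:child_pid]}, ppid matched: #{result[:child_saw_parent]}"
```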
  892. Multicore Programming?
In a roundabout way, yes. By making new processes, your code is able, but not guaranteed, to be distributed across multiple CPU cores.
  895. Given a system with 4 CPUs, if you fork 4 new processes then those can be handled
  896. each by a separate CPU, giving you multicore concurrency.
  897. However, there's no guarantee that stuff will be happening in parallel. On a busy
  898. system it's possible that all 4 of your processes are handled by the same CPU.
  899. fork(2) creates a new process that's a copy of the old process. So if a process is
  900. using 500MB of main memory, then it forks, now you have 1GB in main memory.
  901. Do this another ten times and you can quickly exhaust main memory. This is often
  902. called a fork bomb. Before you turn up the concurrency make sure that you know
  903. the consequences.
  904. Using a Block
  905. In the example above we've demonstrated fork with an if/else construct. It's also
  906. possible, and more common in Ruby code, to use fork with a block.
  907. When you pass a block to the fork method that block will be executed in the new
  908. child process, while the parent process simply skips over it. The child process exits
  910. when it's done executing the block. It does not continue along the same code path as
  911. the parent.
fork do
  # Code here is only executed in the child process
end
# Code here is only executed in the parent process.
  916. In the Real World
  917. Have a look at either of the appendices, or the attached Spyglass project, to see some
  918. real-world examples of using fork(2).
  919. System Calls
  920. Ruby's Kernel#fork maps to fork(2).
  922. Chapter 12
  923. Orphaned Processes
  924. Out of Control
  925. You may have noticed when running the examples in the last chapter that when child
  926. processes are involved, it's no longer possible to control everything from a terminal
  927. like we're used to.
  928. When starting a process via a terminal, we normally have only one process writing to
  929. STDOUT , taking keyboard input, or listening for that Ctrl-C telling it to exit.
  930. But once that process has forked child processes that all becomes a little more difficult.
  931. When you press Ctrl-C which process should exit? All of them? Only the parent?
  932. It's good to know about this stuff because it's actually very easy to create orphaned
  933. processes:
fork do
  5.times do
    sleep 1
    puts "I'm an orphan!"
  end
end
abort "Parent process died..."
  942. If you run this program from a terminal you'll notice that since the parent process dies
  943. immediately the terminal returns you to the command prompt. At which point, it's
  944. overwritten by the STDOUT from the child process! Strange things can start to happen
  945. when forking processes.
  946. Abandoned Children
  947. What happens to a child process when its parent dies?
  948. The short answer is, nothing. That is to say, the operating system doesn't treat child
  949. processes any differently than any other processes. So, when the parent process dies
  950. the child process continues on; the parent process does not take the child down with
  951. it.
  952. Managing Orphans
  953. Can you still manage orphaned processes?
  954. We're getting a bit ahead of ourselves with this question, but it touches on two
  955. interesting concepts.
  956. The first is something called daemon processes. Daemon processes are long running
  957. processes that are intentionally orphaned and meant to stay running forever. These are
  958. covered in detail in a later chapter.
  960. The second interesting bit here is communicating with processes that are not attached
  961. to a terminal session. You can do this using something called Unix signals. This is also
  962. covered in more detail in a later chapter.
  963. We'll soon talk about how to properly manage and control child processes.
  965. Chapter 13
  966. Processes Are Friendly
  967. Let's take a step back from looking at code for a minute to talk about a higher level
  968. concept and how it's handled in different Ruby implementations.
  969. Being CoW Friendly
  970. As mentioned in the forking chapter, fork(2) creates a new child process that's an exact
  971. copy of the parent process. This includes a copy of everything the parent process has in
  972. memory.
  973. Physically copying all of that data can be considerable overhead, so modern Unix
  974. systems employ something called copy-on-write semantics (CoW) to combat this.
  975. As you may have guessed from the name, CoW delays the actual copying of memory
  976. until it needs to be written.
  977. So a parent process and a child process will actually share the same physical data in
  978. memory until one of them needs to modify it, at which point the memory will be
  979. copied so that proper separation between the two processes can be preserved.
arr = [1,2,3]
fork do
  # At this point the child process has been initialized.
  # Using CoW this process doesn't need to copy the arr variable,
  # since it hasn't modified any shared values it can continue reading
  # from the same memory location as the parent process.
  p arr
end

arr = [1,2,3]
fork do
  # At this point the child process has been initialized.
  # Because of CoW the arr variable hasn't been copied yet.
  arr << 4
  # The above line of code modifies the array, so a copy of
  # the array will need to be made for this process before
  # it can modify it. The array in the parent process remains
  # unchanged.
end
  999. This is a big win when using fork(2) as it saves on resources. It means that fork(2) is
  1000. fast since it doesn't need to copy any of the physical memory of the parent. It also
  1001. means that child processes only get a copy of the data they need, the rest can be
  1002. shared.
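CoW itself isn't something you can observe from Ruby code, but the separation it preserves is. In this sketch the child appends to its copy of the array and sends the new size back over a pipe, while the parent's array stays as it was. The size_after_child_append helper is a hypothetical name, and IO.pipe and Process.wait are used ahead of their own chapters.

```ruby
# Return the array size as seen by a child that appended to its copy.
def size_after_child_append(arr)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    arr << 4             # touches only the child's copy
    writer.puts arr.size
    writer.close
    exit!(0)             # skip at_exit handlers inherited from the parent
  end
  writer.close
  child_size = reader.read.to_i
  reader.close
  Process.wait(pid)
  child_size
end

arr = [1, 2, 3]
puts size_after_child_append(arr) # 4, as seen inside the child
p arr                             # [1, 2, 3], the parent's array is unchanged
```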
  1003. CoW is great, but unfortunately it's not supported in MRI or Rubinius.
  1005. In order for CoW to work properly programs need to be written in a CoW friendly
  1006. manner. In other words, they need to manage memory in a way that makes CoW
  1007. possible. MRI and Rubinius are not written this way.
  1008. Why not?
  1009. MRI's garbage collector uses a 'mark-and-sweep' algorithm. In a nutshell this
  1010. means that when the GC is invoked it must iterate over every known object and
  1011. write to it, either saying it should be garbage collected or it shouldn't. The
  1012. important point here is that every time the GC runs every object in memory is
  1013. written to.
So, after forking, the first GC run negates the benefit that copy-on-write provides.
This is one of the main reasons why Ruby Enterprise Edition[1] was created. REE is CoW friendly. For a deeper look at this issue and how Ruby Enterprise Edition solves it check out their Google Tech Talk: Building a More Efficient Ruby Interpreter[2].
  1021. MRI / RBX users
  1022. What does this mean for users of other Ruby VMs?
  1023. 1.http://www.rubyenterpriseedition.com/
  1024. 2.http://www.youtube.com/watch?v=ghLCtCwAKqQ
  1026. Simply, when forking processes you will not reap the benefits of CoW. Child processes
  1027. will require a complete copy of the memory owned by the calling process.
  1028. If you're building something that will depend heavily on fork(2) then have a serious
look at Ruby Enterprise Edition. Looking ahead to the future, MRI trunk now has patches[3] that make the GC CoW-friendly. So the 2.0 release of Ruby will ship with a CoW-friendly GC!
  1033. 3.http://bugs.ruby-lang.org/issues/5839
  1035. Chapter 14
  1036. Processes Can Wait
  1037. In the examples of fork(2) up until now we have let the parent process continue on in
  1038. parallel with the child process. In some cases this led to weird results, such as when
  1039. the parent process exited before the child process.
  1040. That kind of scenario is really only suitable for one use case, fire and forget. It's useful
  1041. when you want a child process to handle something asynchronously, but the parent
  1042. process still has its own work to do.
message = 'Good Morning'
recipient = 'tree@mybackyard.com'
fork do
  # In this contrived example the parent process forks a child to take
  # care of sending data to the stats collector. Meanwhile the parent
  # process has continued on with its work of sending the actual payload.
  # The parent process doesn't want to be slowed down with this task, and
  # it doesn't matter if this would fail for some reason.
  StatsCollector.record message, recipient
end
# send message to recipient
  1055. Babysitting
  1056. For most other use cases involving fork(2) you'll want some way to keep tabs on your
child processes. In Ruby, one technique for this is provided by Process.wait. Let's rewrite our orphan-inducing example from the last chapter to perform with fewer surprises.
fork do
  5.times do
    sleep 1
    puts "I am an orphan!"
  end
end
Process.wait
abort "Parent process died..."
  1068. This time the output will look like:
  1069. I am an orphan!
  1070. I am an orphan!
  1071. I am an orphan!
  1072. I am an orphan!
  1073. I am an orphan!
  1074. Parent process died...
  1075. Not only that, but control will not be returned to the terminal until all of the output
  1076. has been printed.
  1078. So what does Process.wait do? Process.wait is a blocking call instructing the
  1079. parent process to wait for one of its child processes to exit before continuing.
  1080. Process.wait and Cousins
I mentioned something key in that last statement: Process.wait blocks until any one of its child processes exits. If you have a parent that's babysitting more than one child process and you're using Process.wait, you need to know which one exited. For this, you can use the return value.
  1085. Process.wait returns the pid of the child that exited. Check it out.
# We create 3 child processes.
3.times do
  fork do
    # Each one sleeps for a random number of seconds, less than 5.
    sleep rand(5)
  end
end
3.times do
  # We wait for each child process to exit and print the pid that
  # gets returned.
  puts Process.wait
end
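Since fork returns the child's pid in the parent, you can hang on to those pids and match them against what Process.wait hands back. A sketch of that bookkeeping follows. The fork_and_reap helper is a hypothetical name, and it assumes the process has no other children running.

```ruby
# Fork `count` short-lived children, then wait for each one.
# Returns [forked_pids, reaped_pids].
def fork_and_reap(count)
  forked = Array.new(count) do
    fork do
      sleep rand * 0.2
      exit!(0) # skip at_exit handlers inherited from the parent
    end
  end
  reaped = Array.new(count) { Process.wait }
  [forked, reaped]
end

forked, reaped = fork_and_reap(3)
puts "forked: #{forked.inspect}"
puts "reaped: #{reaped.inspect}"
puts "same children? #{forked.sort == reaped.sort}"
```

The two lists contain the same pids, but usually in a different order, since the children exit whenever their random sleep is over.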
  1099. Communicating with Process.wait2
  1100. But wait! Process.wait has a cousin called Process.wait2 !
  1101. Why the name confusion? It makes sense once you know that Process.wait returns 1
  1102. value (pid), but Process.wait2 returns 2 values (pid, status).
  1103. This status can be used as communication between processes via exit codes. In our
  1104. chapter on Exit Codes we mentioned that you can use exit codes to encode
  1105. information for other processes. Process.wait2 gives you direct access to that
  1106. information.
  1107. The status returned from Process.wait2 is an instance of Process::Status . It has a lot
  1108. of useful information attached to it for figuring out exactly how a process exited.
# We create 5 child processes.
5.times do
  fork do
    # Each generates a random number. If even they exit
    # with a 111 exit code, otherwise they use a 112 exit code.
    if rand(5).even?
      exit 111
    else
      exit 112
    end
  end
end
5.times do
  # We wait for each of the child processes to exit.
  pid, status = Process.wait2
  # If the child process exited with the 111 exit code
  # then we know they encountered an even number.
  if status.exitstatus == 111
    puts "#{pid} encountered an even number!"
  else
    puts "#{pid} encountered an odd number!"
  end
end
  1133. Communication between processes without the filesystem or network!
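Process::Status answers more questions than just the exit code. Here's a sketch that gathers a few of them into a hash. The status_summary helper is made up for the example, while the methods it calls are part of Process::Status.

```ruby
# Fork a child that exits with `code`, then summarize its Process::Status.
def status_summary(code)
  pid = fork { exit!(code) } # exit! skips inherited at_exit handlers
  _, status = Process.wait2(pid)
  {
    exited: status.exited?,     # true when the child ended by exiting
    exitstatus: status.exitstatus,
    success: status.success?,   # true only for exit code 0
    signaled: status.signaled?  # true when a signal killed the child
  }
end

p status_summary(0)
p status_summary(111)
```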
  1135. Waiting for Specific Children
  1136. But wait! The Process.wait cousins have two more cousins. Process.waitpid and
  1137. Process.waitpid2 .
  1138. You can probably guess what these do. They function the same as Process.wait and
  1139. Process.wait2 except, rather than waiting for any child to exit they only wait for a
  1140. specific child to exit, specified by pid.
favourite = fork do
  exit 77
end
middle_child = fork do
  abort "I want to be waited on!"
end
pid, status = Process.waitpid2 favourite
puts status.exitstatus
  1149. Although it appears that Process.wait and Process.waitpid provide different
  1150. behaviour don't be fooled! They are actually aliased to the same thing. Both will
  1151. accept the same arguments and behave the same.
  1152. You can pass a pid to Process.wait in order to get it to wait for a specific child, and
  1153. you can pass -1 as the pid to Process.waitpid to get it to wait for any child process.
  1155. The same is true for Process.wait2 and Process.waitpid2 .
  1156. Just like with Process.pid vs. $$ I think it's important that, as programmers, we
  1157. use the provided tools to reveal our intent where possible. Although these
  1158. methods are identical you should use Process.wait when you're waiting for any
  1159. child process and use Process.waitpid when you're waiting for a specific process.
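These methods also take an optional second argument of flags, which this chapter doesn't go into. One of them, Process::WNOHANG, turns the blocking wait into a quick check. In the sketch below, poll_then_wait is a hypothetical helper and a pipe is used to hold the child open until the parent is ready.

```ruby
# Returns [result_while_running, waited_for_right_child] for one child.
def poll_then_wait
  reader, writer = IO.pipe
  pid = fork do
    writer.close
    reader.read # blocks until the parent closes its write end
    exit!(0)    # skip at_exit handlers inherited from the parent
  end
  reader.close

  # With WNOHANG the call returns nil right away if the child is still running.
  while_running = Process.waitpid(pid, Process::WNOHANG)

  writer.close                      # lets the child finish
  after_exit = Process.waitpid(pid) # blocks until that specific child exits
  [while_running, after_exit == pid]
end

p poll_then_wait # [nil, true]
```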
  1160. Race Conditions
  1161. As you look at these simple code examples you may start to wonder about race
  1162. conditions.
  1163. What if the code that handles one exited process is still running when another child
  1164. process exits? What if I haven't gotten back around to Process.wait and another
  1165. process exits? Let's see:
# We create two child processes.
2.times do
  fork do
    # Both processes exit immediately.
    abort "Finished!"
  end
end
# The parent process waits for the first process, then sleeps for 5 seconds.
# In the meantime the second child process has exited and is no
# longer running.
puts Process.wait
sleep 5
# The parent process asks to wait once again, and amazingly enough, the second
# process' exit information has been queued up and is returned here.
puts Process.wait
  1182. As you can see this technique is free from race conditions. The kernel queues up
  1183. information about exited processes so that the parent always receives the information
  1184. in the order that the children exited.
  1185. So even if the parent is slow at processing each exited child it will always be able to get
  1186. the information for each exited child when it's ready for it.
  1187. Take note that calling any variant of Process.wait when there are no child
  1188. processes will raise Errno::ECHILD . It's always a good idea to keep track of how
  1189. many child processes you have created so you don't encounter this exception.
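If keeping an exact count is awkward, another approach is to treat Errno::ECHILD as the signal that there's nobody left to wait for. Here's a sketch of that, where reap_all_children is a made-up helper that assumes every child of the process should be waited on.

```ruby
# Wait for every remaining child, stopping when the kernel reports
# that there are none left (Errno::ECHILD).
def reap_all_children
  reaped = []
  loop { reaped << Process.wait }
rescue Errno::ECHILD
  reaped
end

pids = Array.new(3) { fork { exit!(0) } } # exit! skips inherited at_exit handlers
p reap_all_children.sort == pids.sort
```

Calling the helper again when no children are left simply returns an empty array instead of raising.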
  1191. In the Real World
The idea of looking in on your child processes is at the core of a common Unix programming pattern. The pattern is sometimes called babysitting processes, master/worker, or preforking.
At the core of this pattern is the concept that you have one process that forks several child processes, for concurrency, and then spends its time looking after them: making sure they are still responsive, reacting if any of them exit, etc.
For example, the Unicorn web server[1] employs this pattern. You tell it how many worker processes you want it to start up for you, 5 for instance.
  1203. Then a unicorn process will boot up that will fork 5 child processes to handle web
  1204. requests. The parent (or master) process maintains a heartbeat with each child and
  1205. ensures that all of the child processes stay responsive.
  1206. This pattern allows for both concurrency and reliability. Read more about Unicorn in
  1207. its Appendix at the end of the book.
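To make the babysitting idea concrete, here's a rough sketch of a parent that starts a few workers and replaces each one that exits. This is not how Unicorn does it; the babysit method, its arguments, and the sleeping stand-in workers are all assumptions for illustration. A real master would loop forever, whereas this one stops after a fixed number of restarts so that it terminates.

```ruby
# Hypothetical babysitter: starts `count` workers and replaces each one that
# exits, until `max_restarts` replacements have been started.
def babysit(count, max_restarts)
  start_worker = -> { fork { sleep 0.1 } } # stand-in for real work
  workers = Array.new(count) { start_worker.call }
  restarts = 0

  until workers.empty?
    dead = Process.wait                # blocks until any child exits
    next unless workers.delete(dead)   # ignore children we didn't start here
    next if restarts >= max_restarts
    workers << start_worker.call
    restarts += 1
  end
  restarts
end

puts babysit(3, 2)
```

The parent spends nearly all of its time blocked in Process.wait, which is exactly the "looking after them" part of the pattern.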
  1208. For an alternative usage of this technique read through the Lookout class in the
  1209. attached Spyglass project.
  1210. System Calls
  1211. Ruby's Process.wait and cousins map to waitpid(2).
  1212. 1.http://unicorn.bogomips.org
  1214. Chapter 15
  1215. Zombie Processes
  1216. At the beginning of the last chapter we looked at an example that used a child process
  1217. to asynchronously handle a task in a fire and forget manner. We need to revisit that
  1218. example and ensure that we clean up that child process appropriately, lest it become a
  1219. zombie!
  1220. Good Things Come to Those Who wait(2)
  1221. In the last chapter I showed that the kernel queues up status information about child
  1222. processes that have exited. So even if you call Process.wait long after the child process
  1223. has exited its status information is still available. I'm sure you can smell a problem
  1224. here...
  1225. The kernel will retain the status of exited child processes until the parent process
  1226. requests that status using Process.wait . If the parent never requests the status then
  1227. the kernel can never reap that status information. So creating fire and forget child
  1228. processes without collecting their status information is a poor use of kernel resources.
  1229. If you're not going to wait for a child process to exit using Process.wait (or the
  1230. technique described in the next chapter) then you need to 'detach' that child
  1231. process. Here's the fire and forget example from last chapter rectified to properly
  1232. detach the child process:
message = 'Good Morning'
recipient = 'tree@mybackyard.com'
pid = fork do
  # In this contrived example the parent process forks a child to take
  # care of sending data to the stats collector. Meanwhile the parent
  # process has continued on with its work of sending the actual payload.
  # The parent process doesn't want to be slowed down with this task, and
  # it doesn't matter if this would fail for some reason.
  StatsCollector.record message, recipient
end
# This line ensures that the process performing the stats collection
# won't become a zombie.
Process.detach(pid)
  1247. What does Process.detach do? It simply spawns a new thread whose sole job is to wait
  1248. for the child process specified by pid to exit. This ensures that the kernel doesn't hang
  1249. on to any status information we don't need.
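Going by that description, you could approximate Process.detach yourself with a thread and a wait call. The sketch below is only an illustration of the idea (detach_by_hand is a made-up name, and the real implementation may differ in its details). It also shows that the thread returned by the real Process.detach hands back the child's status as its value.

```ruby
# Hypothetical stand-in for Process.detach: wait for `pid` in a background thread.
def detach_by_hand(pid)
  Thread.new { Process.wait2(pid).last } # the thread's value is the Process::Status
end

pid = fork { exit 3 }
watcher = detach_by_hand(pid)
puts watcher.value.exitstatus

# The built-in looks the same from the caller's point of view.
pid = fork { exit 4 }
puts Process.detach(pid).value.exitstatus
```

In fire and forget code you would normally ignore the returned thread; calling #value here just makes the reaping visible.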
  1251. What Do Zombies Look Like?
  1252. # Create a child process that exits after 1 second.
  1253. pid = fork { sleep 1 }
  1254. # Print its pid.
  1255. puts pid
  1256. # Put the parent process to sleep indefinitely so we can inspect the
  1257. # process status of the child
  1258. sleep
Running the following command at a terminal, using the pid printed from the last
snippet, will print the status of that zombie process. The status should say 'Z' or 'Z+',
meaning that the process is a zombie.
  1262. ps -ho pid,state -p [pid of zombie process]
In the Real World
  1264. Notice that any dead process whose status hasn't been waited on is a zombie process.
  1265. So every child process that dies while its parent is still active will be a zombie, if only
  1266. for a short time. Once the parent process collects the status from the zombie then it
  1267. effectively disappears, no longer consuming kernel resources.
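You can watch the zombie disappear from inside Ruby, too. In this small sketch (the collect helper is hypothetical) the first wait collects the dead child's status, and a second wait on the same pid finds nothing left to collect:

```ruby
# Hypothetical helper: report whether a status was still there to collect.
def collect(pid)
  Process.wait(pid)
  :reaped
rescue Errno::ECHILD
  :gone
end

pid = fork { exit }
sleep 0.2          # the child has exited by now and lingers as a zombie
puts collect(pid)  # the status is collected and the zombie disappears
puts collect(pid)  # nothing is left for this pid
```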
  1268. It's fairly uncommon to fork child processes in a fire and forget manner, never
  1269. collecting their status. If work needs to be offloaded in the background it's much more
  1270. common to do that with a dedicated background queueing system.
That being said there is a Rubygem called spawn¹ that provides this exact
functionality. Besides providing a generic API over processes or threads, it ensures that
fire and forget processes are properly detached.
  1276. System Calls
There's no system call for Process.detach because it's implemented in Ruby simply as a
thread and Process.wait. The implementation in Rubinius² is stark in its simplicity.
  1280. 1.https://github.com/tra/spawn
2.https://github.com/rubinius/rubinius/blob/c6e8e33b37601d4a082ddcbbd60a568767074771/kernel/common/process.rb#L377-395
  1284. Chapter 16
  1285. Processes Can Get Signals
  1286. In the last chapter we looked at Process.wait . It provides a nice way for a parent
  1287. process to keep tabs on its child processes. However it is a blocking call: it will not
  1288. return until a child process dies.
  1289. What's a busy parent to do? Not every parent has the luxury of waiting around on
  1290. their children all day. There is a solution for the busy parent! And it's our introduction
  1291. to Unix signals.
  1292. Trapping SIGCHLD
  1293. Let's take a simple example from the last chapter and rewrite it for a busy parent
  1294. process.
child_processes = 3
dead_processes = 0
# We fork 3 child processes.
child_processes.times do
  fork do
    # They sleep for 3 seconds.
    sleep 3
  end
end
# Our parent process will be busy doing some intense mathematics.
# But still wants to know when one of its children exits.
# By trapping the :CHLD signal our process will be notified by the kernel
# when one of its children exits.
trap(:CHLD) do
  # Since Process.wait queues up any data that it has for us we can ask for it
  # here, since we know that one of our child processes has exited.
  puts Process.wait
  dead_processes += 1
  # We exit explicitly once all the child processes are accounted for.
  exit if dead_processes == child_processes
end
# Work it.
loop do
  (Math.sqrt(rand(44)) ** 8).floor
  sleep 1
end
  1323. SIGCHLD and Concurrency
  1324. Before we go on I must mention a caveat. Signal delivery is unreliable. By this I
  1325. mean that if your code is handling a CHLD signal while another child process dies you
  1326. may or may not receive a second CHLD signal.
This can lead to inconsistent results with the code snippet above. Sometimes the
timing will be such that things will work out perfectly, and sometimes you'll actually
'miss' an instance of a child process dying.
This behaviour only happens when receiving the same signal several times in quick
succession; you can always count on at least one instance of the signal arriving. This
same caveat is true for other signals you handle in Ruby; read on to hear
more about those.
To properly handle CHLD you must call Process.wait in a loop and look for as many
dead child processes as are available, since you may have received multiple CHLD
signals since entering the signal handler. But... isn't Process.wait a blocking call? If
there's only one dead child process and I call Process.wait again how will I avoid
blocking the whole process?
Now we get to the second argument to Process.wait. In the last chapter we looked at
passing a pid to Process.wait as the first argument, but it also takes a second
argument, flags. One such flag that can be passed tells the kernel not to block if no
child has exited. Just what we need!
There's a constant that represents the value of this flag, Process::WNOHANG, and it can
be used like so:
  1346. Process.wait(-1, Process::WNOHANG)
  1347. Easy enough.
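To see what the non-blocking variant hands back, here's a small sketch. The poll_child helper is hypothetical, and it asks about one specific pid rather than -1 so that the result is predictable:

```ruby
# Hypothetical helper: ask about one child without blocking.
def poll_child(pid)
  Process.wait(pid, Process::WNOHANG) # nil while the child is still running
end

pid = fork { sleep 0.5 }
p poll_child(pid)  # the child is still asleep, so there is nothing to collect
sleep 1
p poll_child(pid)  # the child has exited, so its pid comes back
```

That nil return value is what makes it possible to loop over Process.wait until it has nothing more to report.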
  1348. Here's a rewrite of the code snippet from the beginning of this chapter that won't
  1349. 'miss' any child process deaths:
child_processes = 3
dead_processes = 0
# We fork 3 child processes.
child_processes.times do
  fork do
    # They sleep for 3 seconds.
    sleep 3
  end
end
# Sync $stdout so the call to #puts in the CHLD handler isn't
# buffered. Can cause a ThreadError if a signal handler is
# interrupted after calling #puts. Always a good idea to do
# this if your handlers will be doing IO.
$stdout.sync = true
# Our parent process will be busy doing some intense mathematics.
# But still wants to know when one of its children exits.
# By trapping the :CHLD signal our process will be notified by the kernel
# when one of its children exits.
trap(:CHLD) do
  # Since Process.wait queues up any data that it has for us we can ask for it
  # here, since we know that one of our child processes has exited.
  # We loop over a non-blocking Process.wait to ensure that any dead child
  # processes are accounted for.
  begin
    while pid = Process.wait(-1, Process::WNOHANG)
      puts pid
      dead_processes += 1
      # We exit ourselves once all the child processes are accounted for.
      exit if dead_processes == child_processes
    end
  rescue Errno::ECHILD
  end
end
# Work it.
loop do
  (Math.sqrt(rand(44)) ** 8).floor
  sleep 1
end
One more thing to remember is that Process.wait, even this variant, will raise
  1391. Errno::ECHILD if no child processes exist. Since signals might arrive at any time it's
  1392. possible for the last CHLD signal to arrive after the previous CHLD handler has
  1393. already called Process.wait twice and gotten the last available status. This
  1394. asynchronous stuff can be mind-bending. Any line of code can be interrupted with a
  1395. signal. You've been warned!
  1396. So you must handle the Errno::ECHILD exception in your CHLD signal handler. Also if
  1397. you don't know how many child processes you are waiting on you should rescue that
  1398. exception and handle it properly.
  1399. Signals Primer
This was our first foray into Unix signals. Signals are asynchronous communication.
  1401. When a process receives a signal from the kernel it can do one of the following:
  1402. 1. ignore the signal
  1404. 2. perform a specified action
  1405. 3. perform the default action
  1406. Where do Signals Come From?
  1407. Technically signals are sent by the kernel, just like text messages are sent by a cell
  1408. phone carrier. But text messages have an original sender, and so do signals. Signals are
  1409. sent from one process to another process, using the kernel as a middleman.
  1410. The original purpose of signals was to specify different ways that a process should be
  1411. killed. Let's start there.
  1412. Let's start up two ruby programs and we'll use one to kill the other.
  1413. For these examples we won't use irb because it defines its own signal handlers
  1414. that get in the way of our demonstrations. Instead we'll just use the ruby program
  1415. itself.
  1416. Give this a try: launch the ruby program without any arguments. Enter some code.
  1417. Hit Ctrl-D.
  1418. This executes the code that you entered and then exits.
  1419. Start up two ruby processes using the technique mentioned above and we'll kill one of
  1420. them using a signal.
  1422. 1. In the first ruby session execute the following code:
  1423. puts Process.pid
  1424. sleep # so that we have time to send it a signal
  1425. 2. In the second ruby session issue the following command to kill the first session
  1426. with a signal:
  1427. Process.kill(:INT, <pid of first session>)
  1428. So the second process sent an "INT" signal to the first process, causing it to exit. "INT"
  1429. is short for "INTERRUPT".
  1430. The system default when a process receives this signal is that it should interrupt
  1431. whatever it's doing and exit immediately.
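If juggling two terminals is inconvenient, the same experiment can be sketched in a single script by forking the victim. Everything here is illustrative: interrupt_child is a made-up helper, and resetting INT to "DEFAULT" before forking is only a precaution in case the script was started with INT ignored.

```ruby
# Hypothetical helper: fork a sleeping child, interrupt it, return its status.
def interrupt_child
  # Make sure INT has its default action, in case this script was started
  # with INT ignored; the child inherits the setting.
  previous = trap(:INT, "DEFAULT")
  pid = fork { sleep }
  trap(:INT, previous || "DEFAULT")

  sleep 0.2                 # let the child settle into its sleep
  Process.kill(:INT, pid)
  Process.wait2(pid).last   # Process::Status describing how the child ended
end

status = interrupt_child
puts status.success? ? "child exited cleanly" : "child was interrupted"
```

The child may print an Interrupt message to stderr on its way out; that's Ruby's way of reporting the signal.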
  1432. The Big Picture
  1433. Below is a table showing signals commonly supported on Unix systems. Every Unix
  1434. process will be able to respond to these signals and any signal can be sent to any
  1435. process.
  1436. When naming signals the SIG portion of the name is optional. The Action column in
  1437. the table describes the default action for each signal:
Term
    means that the process will terminate immediately
Core
    means that the process will terminate immediately and dump core (a snapshot of
    its memory)
Ign
    means that the process will ignore the signal
Stop
    means that the process will stop (i.e. pause)
Cont
    means that the process will resume (i.e. unpause)
Signal      Value      Action   Comment
-------------------------------------------------------------------------
SIGHUP       1         Term     Hangup detected on controlling terminal
                                or death of controlling process
SIGINT       2         Term     Interrupt from keyboard
SIGQUIT      3         Core     Quit from keyboard
SIGILL       4         Core     Illegal Instruction
SIGABRT      6         Core     Abort signal from abort(3)
SIGFPE       8         Core     Floating point exception
SIGKILL      9         Term     Kill signal
SIGSEGV     11         Core     Invalid memory reference
SIGPIPE     13         Term     Broken pipe: write to pipe with no readers
SIGALRM     14         Term     Timer signal from alarm(2)
SIGTERM     15         Term     Termination signal
SIGUSR1   30,10,16     Term     User-defined signal 1
SIGUSR2   31,12,17     Term     User-defined signal 2
SIGCHLD   20,17,18     Ign      Child stopped or terminated
SIGCONT   19,18,25     Cont     Continue if stopped
SIGSTOP   17,19,23     Stop     Stop process
SIGTSTP   18,20,24     Stop     Stop typed at tty
SIGTTIN   21,21,26     Stop     tty input for background process
SIGTTOU   22,22,27     Stop     tty output for background process
  1472. The signals SIGKILL and SIGSTOP cannot be trapped, blocked, or ignored.
  1473. This table might seem a bit out of left field, but it gives you a rough idea of what to
  1474. expect when you send a certain signal to a process. You can see that, by default, most
  1475. of the signals terminate a process.
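Some rows in the table list several values, so it can be handy to ask Ruby which numbers apply on the machine you're sitting at. The sketch below uses Signal.list for that, and then pokes at the rule that KILL can't be trapped. The trappable? helper is hypothetical, and it rescues two exception classes because I don't want to assume which one a given Ruby version raises.

```ruby
# Signal.list maps names (without the SIG prefix) to this platform's numbers.
%w[HUP INT KILL TERM USR1 USR2 CHLD].each do |name|
  puts format("SIG%-5s %d", name, Signal.list[name])
end

# Hypothetical helper: can a handler be installed for this signal?
def trappable?(name)
  previous = trap(name) {}
  trap(name, previous || "DEFAULT") # put the earlier handler back
  true
rescue ArgumentError, SystemCallError
  false # Ruby refuses, or the kernel rejects the request
end

puts trappable?(:USR1)
puts trappable?(:KILL)
```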
  1476. It's interesting to note the SIGUSR1 and SIGUSR2 signals. These are signals whose action
  1477. is meant specifically to be defined by your process. We'll see shortly that we're free to
  1479. redefine any of the signal actions that we please, but those two signals are meant for
  1480. your use.
  1481. Redefining Signals
  1482. Let's go back to our two ruby sessions and have some fun.
  1483. 1. In the first ruby session use the following code to redefine the behaviour of the
  1484. INT signal:
  1485. puts Process.pid
  1486. trap(:INT) { print "Na na na, you can't get me" }
  1487. sleep # so that we have time to send it a signal
  1488. Now our process won't exit when it receives the INT signal.
  1489. 2. In the second ruby session issue the following command and notice that the
  1490. first process is taunting us!
  1491. Process.kill(:INT, <pid of first session>)
  1492. 3. You can try using Ctrl-C to kill that first session, and notice that it responds
  1493. the same!
  1494. 4. But as the table said there are some signals that cannot be redefined. SIGKILL
  1495. will show that guy who's boss.
  1497. Process.kill(:KILL, <pid of first session>)
  1498. Ignoring Signals
  1499. 1. In the first ruby session use the following code:
  1500. puts Process.pid
  1501. trap(:INT, "IGNORE")
  1502. sleep # so that we have time to send it a signal
  1503. 2. In the second ruby session issue the following command and notice that the
  1504. first process isn't affected.
  1505. Process.kill(:INT, <pid of first session>)
  1506. The first ruby session is unaffected.
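Here's a single-script sketch of the same experiment, for when a second terminal isn't handy. The survives_int? helper is made up for illustration. It borrows a pipe (covered in the next chapter) purely so the parent knows the child has installed its trap before the signal is sent.

```ruby
# Hypothetical helper: does a child that ignores INT outlive an INT signal?
def survives_int?
  ready_reader, ready_writer = IO.pipe
  pid = fork do
    ready_reader.close
    trap(:INT, "IGNORE")
    ready_writer.puts "ready" # tell the parent the trap is in place
    sleep
  end
  ready_writer.close
  ready_reader.gets           # block until the child reports in
  ready_reader.close

  Process.kill(:INT, pid)
  sleep 0.2                   # give the signal time to (not) take effect
  alive = Process.wait(pid, Process::WNOHANG).nil?
  if alive
    Process.kill(:KILL, pid)  # KILL can't be ignored
    Process.wait(pid)
  end
  alive
end

puts survives_int?
```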
  1507. Signal Handlers are Global
Signals are a great tool and are the perfect fit for certain situations. But it's good to
keep in mind that trapping a signal is a bit like using a global variable: you might
be overwriting something that some other code depends on. And unlike global
variables, signal handlers can't be namespaced.
  1513. So make sure you read this next section before you go and add signal handlers to all of
  1514. your open source libraries :)
  1515. Being Nice about Redefining Signals
  1516. There is a way to preserve handlers defined by other Ruby code, so that your signal
  1517. handler won't trample any other ones that are already defined. It looks something like
  1518. this:
trap(:INT) { puts 'This is the first signal handler' }
old_handler = trap(:INT) {
  old_handler.call
  puts 'This is the second handler'
  exit
}
sleep # so that we have time to send it a signal
  1526. Just send it a Ctrl-C to see the effect. Both signal handlers are called.
  1527. Now let's see if we can preserve the system default behaviour. Hit the code below with
  1528. a Ctrl-C.
system_handler = trap(:INT) {
  puts 'about to exit!'
  system_handler.call
}
sleep # so that we have time to send it a signal
  1535. :/ It blew up that time. So we can't preserve the system default behaviour with this
  1536. technique, but we can preserve other Ruby code handlers that have been defined.
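It helps to look at what trap actually hands back. In this sketch (trap_return_values is a hypothetical helper, and USR2 is used on the assumption that nothing else in the process has touched it) the first call returns a description of the previous setting rather than something callable, while the second call returns the Ruby block we installed:

```ruby
# Hypothetical helper: install a throwaway handler, then restore the original,
# returning what trap reported each time.
def trap_return_values(signal)
  before = trap(signal) { }                   # whatever was in place before us
  ours   = trap(signal, before || "DEFAULT")  # our block comes back; original restored
  [before, ours]
end

before, ours = trap_return_values(:USR2)
p before                     # typically the string "DEFAULT", which has no #call
p ours.respond_to?(:call)
```

So a previous handler can only be chained when it responds to #call, which is the check the 'friendly' trapping idiom relies on.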
In terms of best practices your code probably shouldn't define any signal handlers,
unless it's a server, as in a long-running process that's booted from the command
line. It's very rare that library code should trap a signal.
# The 'friendly' method of trapping a signal.
old_handler = trap(:QUIT) {
  # do some cleanup
  puts 'All done!'
  old_handler.call if old_handler.respond_to?(:call)
}
  1546. This handler for the QUIT signal will preserve any previous QUIT handlers that have
  1547. been defined. Though this looks 'friendly' it's not generally a good idea. Imagine a
  1548. scenario where a Ruby server tells its users they can send it a QUIT signal and it will
  1549. do a graceful shutdown. You tell the users of your library that they can send a QUIT
  1550. signal and it will draw an ASCII rainbow. Now if a user sends the QUIT signal both
  1551. handlers will be invoked. This violates the expectations of both libraries.
  1552. Whether or not you decide to preserve previously defined signal handlers is up to you,
  1553. just make sure you know why you're doing it. If you simply want to wire up some
  1554. behaviour to clean up resources before exiting you can use an at_exit hook, which we
  1555. touched on in the chapter about exit codes.
  1557. When Can't You Receive Signals?
  1558. Your process can receive a signal anytime. That's the beauty of them! They're
  1559. asynchronous.
Your process can be pulled out of a busy for-loop into a signal handler, or even out of a
long sleep. Your process can even be pulled from one signal handler to another if it
receives one signal while processing another. But, as expected, it will always go back
and finish the code in all the handlers that are invoked.
  1564. In the Real World
With signals, any process can communicate with any other process on the system, so
long as it knows its pid and has permission to signal it. This makes signals a very
powerful communication tool. It's common to send signals from the shell using kill(1).
  1568. In the real world signals are mostly used by long running processes like servers and
  1569. daemons. And for the most part it will be the human users who are sending signals
  1570. rather than automated programs.
For instance, the Unicorn web server¹ responds to the INT signal by killing all of its
processes and shutting down immediately. It responds to the USR2 signal by
re-executing itself for a zero-downtime restart. It responds to the TTIN signal by
incrementing the number of worker processes it has running.
  1577. 1.http://unicorn.bogomips.org
See the SIGNALS file included with Unicorn² for a full list of the signals it supports
and how it responds to them.
The memprof project has an interesting example of being a friendly citizen when
handling signals³.
  1585. System Calls
  1586. Ruby's Process.kill maps to kill(2), Kernel#trap maps roughly to sigaction(2).
  1587. signal(7) is also useful.
  1588. 2.http://unicorn.bogomips.org/SIGNALS.html
  1589. 3.https://github.com/ice799/memprof/blob/d4bc228aca323b58fea92dbde20c1f8ec36e5386/lib/memprof/signal.rb#L8-16
  1591. Chapter 17
  1592. Processes Can Communicate
  1593. Up until now we've looked at related processes that share memory and share open
  1594. resources. But what about communicating information between multiple processes?
  1595. This is part of a whole field of study called Inter-process communication (IPC for
short). There are many different ways to do IPC but I'm going to cover two commonly
useful methods: pipes and socket pairs.
  1598. Our First Pipe
  1599. A pipe is a uni-directional stream of data. In other words you can open a pipe, one
  1600. process can 'claim' one end of it and another process can 'claim' the other end. Then
  1601. data can be passed along the pipe but only in one direction. So if one process 'claims'
  1602. the position of reader, rather than writer, it will not be able to write to the pipe. And
  1603. vice versa.
  1604. Before we involve multiple processes let's just look at how to create a pipe and what we
  1605. get from that:
  1606. reader, writer = IO.pipe #=> [#<IO:fd 5>, #<IO:fd 6>]
IO.pipe returns an array with two elements, both of which are IO objects. Ruby's
amazing IO class¹ is the superclass to File, TCPSocket, UDPSocket, and others. As such,
all of these resources have a common interface.
The IO objects returned from IO.pipe can be thought of as something like anonymous
files. You can basically treat them the same way you would a File. You can call #read,
#write, #close, etc. But this object won't respond to #path and won't have a location
on the filesystem.
  1617. Still holding back from bringing in multiple processes let's demonstrate
  1618. communication with a pipe:
  1619. reader, writer = IO.pipe
  1620. writer.write("Into the pipe I go...")
  1621. writer.close
  1622. puts reader.read
  1623. outputs
  1624. Into the pipe I go...
Pretty simple right? Notice that I had to close the writer after I wrote to the pipe?
That's because when the reader calls IO#read it will continue trying to read data until
it sees an EOF (aka. end-of-file marker²). This tells the reader that no more data will
be available for reading.
So long as the writer is still open the reader might see more data, so it waits. Closing
the writer before reading puts an EOF on the pipe, so the reader stops reading after it
gets the initial data. If you skip closing the writer then the reader will block and
continue trying to read indefinitely.
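If you want whatever is sitting in the pipe right now, without waiting for EOF, IO#readpartial is one option. This sketch (the helper name is hypothetical) reads while the writer is still open, then shows that a plain #read returns straight away once the writer has been closed:

```ruby
# Hypothetical helper contrasting #readpartial with #read.
def read_before_and_after_close(message)
  reader, writer = IO.pipe
  writer.write(message)
  early = reader.readpartial(1024) # returns what's available; writer still open
  writer.close
  rest = reader.read               # EOF has been reached, so this returns at once
  [early, rest]
ensure
  reader.close if reader
end

p read_before_and_after_close("Into the pipe I go...")
```

The second element is an empty string: there was nothing left to read, and the closed writer means #read doesn't wait around for more.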
  1634. 1.http://librelist.com/browser//usp.ruby/2011/9/17/the-ruby-io-class/
  1635. 2.http://en.wikipedia.org/wiki/End-of-file
  1637. Pipes Are One-Way Only
  1638. reader, writer = IO.pipe
  1639. reader.write("Trying to get the reader to write something")
  1640. outputs
  1641. >> reader.write("Trying to get the reader to write something")
  1642. IOError: not opened for writing
  1643. from (irb):2:in `write'
  1644. from (irb):2
  1645. The IO objects returned by IO.pipe can only be used for uni-directional
  1646. communication. So the reader can only read and the writer can only write.
  1647. Now let's introduce processes into the mix.
  1648. Sharing Pipes
In the chapter on forking I described how open resources are shared, or copied, when
a process forks a child. Pipes are considered a resource (they get their own file
descriptors and everything), so they are shared with child processes.
  1652. Here's a simple example of using a pipe to communicate between a parent and child
  1653. process. The child indicates to the parent that it has finished an iteration of work by
  1654. writing to the pipe:
reader, writer = IO.pipe
fork do
  reader.close
  10.times do
    # heavy lifting
    writer.puts "Another one bites the dust"
  end
end
writer.close
while message = reader.gets
  $stdout.puts message
end
outputs Another one bites the dust ten times.
Notice that, like above, the unused ends of the pipe are closed so as not to interfere
with EOF being sent. There's actually one more layer when considering EOF now that
two processes are involved. Since the file descriptors were copied there are now 4
instances floating around. Since only two of them will be used to communicate, the
other 2 instances must be closed. Hence the extra instances of closing.
Since the ends of the pipe are IO objects we can call any IO methods on them, not just
#read and #write. In this example I use #puts and #gets to write and read a String
delimited with a newline. I actually used those here to simplify one aspect of pipes:
pipes hold a stream of data.
  1679. Streams vs. Messages
  1680. When I say stream I mean that when writing and reading data to a pipe there's no
  1681. concept of beginning and end. When working with an IO stream, like pipes or TCP
  1682. sockets, you write your data to the stream followed by some protocol-specific
  1683. delimiter. For example, HTTP uses a series of newlines to delimit the headers from the
  1684. body.
  1685. Then when reading data from that IO stream you read it in one chunk at a time,
  1686. stopping when you come across the delimiter. That's why I used #puts and #gets in
  1687. the last example: it used a newline as the delimiter for me.
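To see why the delimiter matters, compare two writes without one against two writes with one. Both helpers below are hypothetical names for illustration:

```ruby
# Two separate writes arrive as one undivided run of bytes.
def stream_without_delimiters
  reader, writer = IO.pipe
  writer.write("first")
  writer.write("second")
  writer.close
  reader.read
ensure
  reader.close if reader
end

# With a newline after each message, #gets can split them apart again.
def stream_with_newlines
  reader, writer = IO.pipe
  writer.puts("first")
  writer.puts("second")
  writer.close
  [reader.gets, reader.gets]
ensure
  reader.close if reader
end

p stream_without_delimiters
p stream_with_newlines
```

The first call prints a single string with both messages run together; the second prints two strings, each still carrying its trailing newline.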
As you may have guessed it's possible to communicate via messages instead of streams.
We can't do it with pipes, but we can do it with Unix sockets. Without going into too
much detail, Unix sockets are a type of socket that can only communicate on the same
physical machine. As such they're much faster than TCP sockets and are a great fit for IPC.
  1692. Here's an example where we create a pair of Unix sockets that can communicate via
  1693. messages:
  1694. require 'socket'
  1695. Socket.pair(:UNIX, :DGRAM, 0) #=> [#<Socket:fd 15>, #<Socket:fd 16>]
This creates a pair of UNIX sockets that are already connected up to each
other. These sockets communicate using datagrams, rather than a stream. In this way
you write a whole message to one of the sockets and read a whole message from the
other socket. No delimiters required.
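A quick way to convince yourself of this is to send two messages back to back and read them out again, all within one process. The helper name is made up, and 1000 is just an arbitrary maximum message length:

```ruby
require 'socket'

# Hypothetical helper: send two datagrams, then receive them one at a time.
def datagram_round_trip
  left, right = Socket.pair(:UNIX, :DGRAM, 0)
  left.send("first", 0)
  left.send("second", 0)
  [right.recv(1000), right.recv(1000)] # each recv returns one whole message
ensure
  left.close if left
  right.close if right
end

p datagram_round_trip
```

Unlike the pipe, the two messages come back as two separate strings even though nothing was written between them.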
  1701. Here's a slightly more complex version of the pipe example where the child process
  1702. actually waits for the parent to tell it what to work on, then it reports back to the
  1703. parent once it's finished the work:
require 'socket'
child_socket, parent_socket = Socket.pair(:UNIX, :DGRAM, 0)
maxlen = 1000
fork do
  parent_socket.close
  4.times do
    instruction = child_socket.recv(maxlen)
    child_socket.send("#{instruction} accomplished!", 0)
  end
end
child_socket.close
2.times do
  parent_socket.send("Heavy lifting", 0)
end
2.times do
  parent_socket.send("Feather lifting", 0)
end
4.times do
  $stdout.puts parent_socket.recv(maxlen)
end
outputs:
  1726. Heavy lifting accomplished!
  1727. Heavy lifting accomplished!
  1728. Feather lifting accomplished!
  1729. Feather lifting accomplished!
  1730. So whereas pipes provide uni-directional communication, a socket pair provides bi-
  1731. directional communication. The parent socket can both read and write to the child
  1732. socket, and vice versa.
  1733. Remote IPC?
  1734. IPC implies communication between processes running on the same machine. If
  1735. you're interested in scaling up from one machine to many machines while still doing
  1736. something resembling IPC there are a few things to look into. The first one would
  1737. simply be to communicate via TCP sockets. This option would require more
boilerplate code than the others for a non-trivial system. Other plausible solutions
would be RPC³ (remote procedure call), a messaging system like ZeroMQ⁴, or the
general body of distributed systems⁵.
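For a taste of the TCP option, here's a rough sketch of the earlier parent/child exchange moved onto a TCP socket. It still runs on one machine (it binds to 127.0.0.1), and tcp_round_trip is a hypothetical helper, but the same calls would work between machines given a reachable address.

```ruby
require 'socket'

# Hypothetical helper: a forked child answers one request over TCP on localhost.
def tcp_round_trip(message)
  server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
  port = server.addr[1]

  pid = fork do
    client = server.accept
    client.puts "#{client.gets.chomp} accomplished!"
    client.close
  end
  server.close                           # the parent doesn't accept connections

  socket = TCPSocket.new("127.0.0.1", port)
  socket.puts message
  reply = socket.gets.chomp
  socket.close
  Process.wait(pid)
  reply
end

puts tcp_round_trip("Heavy lifting")
```

Since TCP is a stream, the newline added by #puts is doing the job of the message delimiter here.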
  1743. In the Real World
  1744. Both pipes and socket pairs are useful abstractions for communicating between
  1745. processes. They're fast and easy. They're often used as a communication channel
  1746. instead of a more brute force approach such as a shared database or log file.
  1747. 3.http://en.wikipedia.org/wiki/Remote_procedure_call
  1748. 4.http://www.zeromq.org/
  1749. 5.http://en.wikipedia.org/wiki/Distributed_computing
  1751. As for which method to use: it depends on your needs. Keep in mind that pipes are
  1752. uni-directional and socket pairs are bi-directional when weighing your decision.
  1753. For a more in-depth example have a look at the Spyglass Master class in the included
  1754. Spyglass project. It uses a more involved example of the code you saw above where
  1755. many child processes communicate over a single pipe with their parent process.
  1756. System Calls
  1757. Ruby's IO.pipe maps to pipe(2), Socket.pair maps to socketpair(2). Socket.recv maps
  1758. to recv(2) and Socket.send maps to send(2).
  1760. Chapter 18
  1761. Daemon Processes
  1762. Daemon processes are processes that run in the background, rather than under the
  1763. control of a user at a terminal. Common examples of daemon processes are things like
  1764. web servers, or database servers which will always be running in the background in
  1765. order to serve requests.
  1766. Daemon processes are also at the core of your operating system. There are many
  1767. processes that are constantly running in the background that keep your system
  1768. functioning normally. These are things like the window server on a GUI system,
  1769. printing services or audio services so that your speakers are always ready to play that
  1770. annoying 'ding' notification.
  1771. The First Process
  1772. There is one daemon process in particular that has special significance for your
  1773. operating system. We talked in a previous chapter about every process having a parent
  1774. process. Can that be true for all processes? What about the very first process on the
  1775. system?
  1776. This is a classic who-created-the-creator kind of problem, and it has a simple answer.
  1777. When the kernel is bootstrapped it spawns a process called the init process. This
  1778. process has a ppid of 0 and is the 'grandparent of all processes'. It's the first one and it
has no ancestor. Its pid is 1.
  1781. Creating Your First Daemon Process
  1782. What do we need to get started? Not much. Any process can be made into a daemon
  1783. process.
Let's look to the rack project [1] for an example here. Rack ships with a rackup command
to serve applications using different Rack-supported web servers. Web servers are a
  1788. great example of a process that will never end; so long as your application is active
  1789. you'll need a server listening for connections.
  1790. The rackup command includes an option to daemonize the server and run it in the
  1791. background. Let's have a look at what that does.
  1792. 1.http://github.com/rack/rack
  1794. Diving into Rack
def daemonize_app
  if RUBY_VERSION < "1.9"
    exit if fork
    Process.setsid
    exit if fork
    Dir.chdir "/"
    STDIN.reopen "/dev/null"
    STDOUT.reopen "/dev/null", "a"
    STDERR.reopen "/dev/null", "a"
  else
    Process.daemon
  end
end
  1808. Lots going on here. Let's first jump to the else block. Ruby 1.9.x ships with a method
  1809. called Process.daemon that will daemonize the current process! How convenient!
But don't you want to know how it works under the hood? I knew ya did! The truth is
that if you look at the MRI source for Process.daemon [2] and stumble through the C code
it ends up doing the exact same thing that Rack does in the if block above.
  1814. So let's continue using that as an example. We'll break down the code line by line.
  1815. 2.https://github.com/ruby/ruby/blob/c852d76f46a68e28200f0c3f68c8c67879e79c86/process.c#L4817-4860
  1817. Daemonizing a Process, Step by Step
  1818. exit if fork
  1819. This line of code makes intelligent use of the return value of the fork method. Recall
  1820. from the forking chapter that fork returns twice, once in the parent process and once
  1821. in the child process. In the parent process it returns the child's pid and in the child
  1822. process it returns nil.
  1823. As always, the return value will be truth-y for the parent and false-y for the child. This
  1824. means that the parent process will exit, and as we know, orphaned child processes
  1825. carry on as normal.
If a process is orphaned then what happens when you ask for Process.ppid?
This is where knowledge of the init process becomes relevant. The ppid of an
orphaned process traditionally becomes 1, because init is the only process that the
kernel can be sure is active at all times. (Some modern systems can hand orphans to a
designated 'subreaper' process instead, so the value is not guaranteed to be 1.)
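A minimal sketch of that re-parenting, assuming a Unix system where fork is available. The helper name orphan_ppid is made up for illustration. An intermediate process forks a grandchild and exits right away; the grandchild waits until its parent is gone and then reports its new ppid through a pipe.

```ruby
# Hypothetical helper: returns [pid of the exited parent, ppid the orphan sees].
def orphan_ppid
  reader, writer = IO.pipe

  intermediate = fork do
    reader.close
    my_pid = Process.pid
    fork do
      # Wait until the kernel has handed us to a new parent.
      sleep 0.01 while Process.ppid == my_pid
      writer.puts Process.ppid
      writer.close
      exit!(0)
    end
    # The intermediate parent exits immediately, orphaning the grandchild.
    exit!(0)
  end

  writer.close
  Process.wait(intermediate)
  new_ppid = reader.gets.to_i
  reader.close
  [intermediate, new_ppid]
end

old_parent, new_ppid = orphan_ppid
puts "parent was #{old_parent}, orphan now reports ppid #{new_ppid}"
```

On many systems the reported value will be 1, but the only thing the sketch relies on is that it differs from the pid of the parent that exited.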
  1830. This first step is imperative when creating a daemon because it causes the terminal
  1831. that invoked this script to think the command is done, returning control to the
  1832. terminal and taking it out of the equation.
  1833. Process.setsid
  1835. Calling Process.setsid does three things:
  1836. 1. The process becomes a session leader of a new session
  1837. 2. The process becomes the process group leader of a new process group
  1838. 3. The process has no controlling terminal
  1839. To understand exactly what effect these three things have we need to step out of the
  1840. context of our Rack example for a moment and look a little deeper.
  1841. Process Groups and Session Groups
  1842. Process groups and session groups are all about job control. By 'job control' I'm
  1843. referring to the way that processes are handled by the terminal.
  1844. We begin with process groups.
  1845. Each and every process belongs to a group, and each group has a unique integer id. A
  1846. process group is just a collection of related processes, typically a parent process and its
children. However you can also group your processes arbitrarily by setting their group
id using Process.setpgid(pid, new_group_id).
  1849. Have a look at the output from the following snippet.
  1850. puts Process.getpgrp
  1851. puts Process.pid
  1853. If you ran that code in an irb session then those two values will be equal. Typically the
  1854. process group id will be the same as the pid of the process group leader. The process
group leader is the 'originating' process of a terminal command, i.e. if you start an irb
  1856. process at the terminal it will become the group leader of a new process group. Any
  1857. child processes that it creates will be made part of the same process group.
  1858. Try out the following example to see that process groups are inherited.
puts Process.pid
puts Process.getpgrp
fork {
  puts Process.pid
  puts Process.getpgrp
}
  1865. You can see that although the child process gets a unique pid it inherits the group id
  1866. from its parent. So these two processes are part of the same group.
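Inheritance is only the default. As a rough sketch of grouping a process differently, the child below moves itself into a brand new process group with Process.setpgid. The helper name child_in_own_group is hypothetical, and the pipe is only there to get the child's answer back to the parent.

```ruby
# Hypothetical helper: forks a child that moves itself into a new
# process group and returns [parent pgid, child pid, child pgid].
def child_in_own_group
  reader, writer = IO.pipe
  parent_pgid = Process.getpgrp

  pid = fork do
    reader.close
    # pid 0 means "this process"; pgid 0 means "use my own pid as the group id".
    Process.setpgid(0, 0)
    writer.puts Process.getpgrp
    writer.close
    exit!(0)
  end

  writer.close
  Process.wait(pid)
  child_pgid = reader.read.to_i
  reader.close
  [parent_pgid, pid, child_pgid]
end

parent_pgid, child_pid, child_pgid = child_in_own_group
puts "parent group: #{parent_pgid}, child pid: #{child_pid}, child group: #{child_pgid}"
```

The child ends up as the leader of its own group, so its group id matches its pid rather than the group id of its parent.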
  1867. You'll recall that we looked previously at Orphaned Processes. In that section I said
  1868. that child processes are not given special treatment by the kernel. Exit a parent process
  1869. and the child will continue on. This is the behaviour when a parent process exits, but
  1870. the behaviour is a bit different when the parent process is being controlled by a
  1871. terminal and is killed by a signal.
  1872. Consider for a moment: a Ruby script that shells out to a long-running shell
  1873. command, eg. a long backup script. What happens if you kill the Ruby script with a
  1874. Ctrl-C?
  1876. If you try this out you'll notice that the long-running backup script is not orphaned, it
  1877. does not continue on when its parent is killed. We haven't set up any code to forward
  1878. the signal from the parent to the child, so how is this done?
The terminal receives the signal and forwards it on to any process in the foreground
process group. In this case, both the Ruby script and the long-running shell command
would be part of the same process group, so they would both be killed by the same signal.
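You can signal a whole process group yourself, without a terminal being involved. The sketch below is an illustration under a few assumptions: the helper name signal_whole_group is invented, the children are plain sleeping Ruby processes, and TERM stands in for the INT that a Ctrl-C would send.

```ruby
# Hypothetical helper: starts `count` sleeping children in one fresh
# process group, signals the group once, and returns their exit statuses.
def signal_whole_group(count = 3)
  pids = []
  count.times do
    pid = fork { sleep } # sleep until signalled
    # The first child becomes the group leader; the rest join its group.
    Process.setpgid(pid, pids.first || pid)
    pids << pid
  end

  # A leading minus on the signal name targets a process group, not one pid.
  Process.kill('-TERM', pids.first)

  pids.map { |pid| Process.wait2(pid).last }
end

signal_whole_group.each { |status| puts status.inspect }
```

One call to Process.kill reaches every member of the group, which is the same delivery mechanism the terminal uses for the foreground process group.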
  1882. And then session groups...
  1883. A session group is one level of abstraction higher up, a collection of process groups.
  1884. Consider the following shell command:
  1885. git log | grep shipped | less
In an interactive shell with job control all three commands are placed in the same
process group, even though none is a child process of another. The shell treats the
whole pipeline as one 'job', which is why one Ctrl-C will kill them all.
That job is in turn part of a session group, along with the shell itself and every other
job started from it. Each invocation from the shell gets its own process group within
that session. An invocation may be a single command or a string of commands joined by
pipes.
  1892. Like in the above example, a session group may be attached to a terminal. It might also
  1893. not be attached to any terminal, as in the case of a daemon.
Again, your terminal handles session groups in a special way: when the terminal hangs
up, the session leader is sent a hangup signal. A shell acting as session leader typically
passes it on to the process groups it started, where it reaches every process in those
groups. Turtles all the way down ;)
There is a system call for retrieving the current session group id, getsid(2). Ruby 1.9's
core library has no interface to it (Ruby 2.0 and later expose it as Process.getsid).
Process.setsid returns the id of the new session group it creates, so you can store that
if you need it.
  1901. So, getting back to our Rack example, in the first line a child process was forked and
  1902. the parent exited. The originating terminal recognized the exit and returned control to
  1903. the user, but the forked process still has the inherited group id and session id from its
  1904. parent. At the moment this forked process is neither a session leader nor a group
  1905. leader.
  1906. So the terminal still has a link to our forked process, if it were to send a signal to its
  1907. session group the forked process would receive it, but we want to be fully detached
  1908. from a terminal.
  1909. Process.setsid will make this forked process the leader of a new process group and a
  1910. new session group. Note that Process.setsid will fail in a process that is already a
  1911. process group leader, it can only be run from child processes.
  1912. This new session group does not have a controlling terminal, but technically one could
  1913. be assigned.
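The effect of Process.setsid described above can be observed with a small experiment. This is only a sketch: the helper name ids_after_setsid is made up, and the call is made in a forked child so that the process running the experiment keeps its own session.

```ruby
# Hypothetical helper: runs Process.setsid in a child and returns
# [pid, process group id, session id] as seen by that child.
def ids_after_setsid
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    # A freshly forked child is not a process group leader,
    # so setsid is allowed here.
    sid = Process.setsid
    writer.puts [Process.pid, Process.getpgrp, sid].join(' ')
    writer.close
    exit!(0)
  end

  writer.close
  Process.wait(pid)
  ids = reader.read.split.map(&:to_i)
  reader.close
  ids
end

pid, pgid, sid = ids_after_setsid
puts "pid=#{pid} pgid=#{pgid} sid=#{sid}"
```

All three numbers come out the same, because the child is now the leader of both a new process group and a new session.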
  1914. exit if fork
  1915. The forked process that had just become a process group and session group leader
  1916. forks again and then exits.
  1917. This newly forked process is no longer a process group leader nor a session leader.
  1918. Since the previous session leader had no controlling terminal, and this process is not a
  1920. session leader, it's guaranteed that this process can never have a controlling terminal.
  1921. Terminals can only be assigned to session leaders.
  1922. This dance ensures that our process is now fully detached from a controlling terminal
  1923. and will run to its completion.
  1924. Dir.chdir "/"
This changes the current working directory to the root directory for the system. This
isn't strictly necessary but it's an extra step to ensure that the current working directory
of the daemon doesn't disappear during its execution.
  1928. This avoids problems where the directory that the daemon was started from gets
  1929. deleted or unmounted for any reason.
  1930. STDIN.reopen "/dev/null"
  1931. STDOUT.reopen "/dev/null", "a"
  1932. STDERR.reopen "/dev/null", "a"
  1933. This sets all of the standard streams to go to /dev/null , a.k.a. to be ignored. Since the
  1934. daemon is no longer attached to a terminal session these are of no use anyway. They
  1935. can't simply be closed because some programs expect them to always be available.
  1936. Redirecting them to /dev/null ensures that they're still available to the program but
  1937. have no effect.
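Putting the steps above together, here is a sketch of a variation that sends the daemon's output to a log file instead of discarding it. It is not Rack's code: the method name daemonize_with_log and the log path are assumptions, and it uses exit! rather than exit so that at_exit handlers don't run in the two short-lived processes.

```ruby
# Hypothetical helper: daemonizes the calling process and sends its output
# to log_path instead of /dev/null. The process that called it exits.
def daemonize_with_log(log_path)
  log_path = File.expand_path(log_path) # resolve before we chdir

  exit!(0) if fork   # parent exits, child carries on
  Process.setsid     # new session, no controlling terminal
  exit!(0) if fork   # session leader exits, grandchild carries on

  Dir.chdir '/'
  STDIN.reopen '/dev/null'
  STDOUT.reopen log_path, 'a'
  STDERR.reopen STDOUT
  STDOUT.sync = true # write log lines immediately
end

# Example usage near the top of a long-running script:
#
#   daemonize_with_log('/tmp/my_daemon.log')
#   loop { puts "still alive at #{Time.now}"; sleep 5 }
```

Since the method ends the process that calls it, try it from a throwaway script rather than from an irb session you want to keep.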
  1939. In the Real World
  1940. As mentioned, the rackup command ships with a command line option for
  1941. daemonizing the process. Same goes with any of the popular Ruby web servers.
If you want to dig in to more internals of daemon processes you should look at the
daemons rubygem [3].
  1945. If you think you want to create a daemon process you should ask yourself one basic
  1946. question: Does this process need to stay responsive forever?
  1947. If the answer is no then you probably want to look at a cron job or background job
  1948. system. If the answer is yes, then you probably have a good candidate for a daemon
  1949. process.
  1950. System Calls
  1951. Ruby's Process.setsid maps to setsid(2), Process.getpgrp maps to getpgrp(2). Other
  1952. system calls mentioned in this chapter were covered in detail in previous chapters.
  1953. 3.http://rubygems.org/gems/daemons
  1955. Chapter 19
Spawning Terminal Processes
  1958. A common interaction in a Ruby program is 'shelling out' from your program to run a
  1959. command in a terminal. This happens especially when I'm writing a Ruby script to
  1960. glue together some common commands for myself. There are several ways you can
  1961. spawn processes to run terminal commands in Ruby.
  1962. Before we look at the different ways of 'shelling out' let's look at the mechanism they're
  1963. all using under the hood.
  1964. fork + exec
  1965. All of the methods described below are variations on one theme: fork(2) + exec(2).
  1966. We've had a good look at fork(2) in previous chapters, but this is our first look at
  1967. exec(2). It's pretty simple, exec(2) allows you to replace the current process with a
  1968. different process.
  1969. Put another way: exec(2) allows you to transform the current process into any other
  1970. process. You can take a Ruby process and turn it into a Python process, or an ls(1)
  1971. process, or another Ruby process.
  1973. exec(2) transforms the process and never returns. Once you've transformed your Ruby
  1974. process into something else you can never come back.
  1975. exec 'ls', '--help'
  1976. The fork + exec combo is a common one when spawning new processes. exec(2) is a
  1977. very powerful and efficient way to transform the current process into another one; the
  1978. only catch is that your current process is gone. That's where fork(2) comes in handy.
  1979. You can use fork(2) to create a new process, then use exec(2) to transform that process
  1980. into anything you like. Voila! Your current process is still running just as it was before
  1981. and you were able to spawn any other process that you want to.
  1982. If your program depends on the output from the exec(2) call you can use the tools you
  1983. learned in previous chapters to handle that. Process.wait will ensure that your
  1984. program waits for the child process to finish whatever it's doing so you can get the
  1985. result back.
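The fork + exec + wait combination described above might look like the following sketch. The helper name run_and_wait is invented for this example; fork, exec and Process.wait2 are the real building blocks.

```ruby
# Hypothetical helper: runs a command in a child process and returns
# its Process::Status once the command has finished.
def run_and_wait(*command)
  pid = fork do
    # The child becomes the command; nothing after exec runs on success.
    exec(*command)
  end
  _, status = Process.wait2(pid)
  status
end

status = run_and_wait('ls', '/')
puts "ls exited with #{status.exitstatus}"
```

The parent keeps running as a Ruby process the whole time; only the forked child is transformed.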
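Keep in mind that exec(2) replaces the memory of the process but, by default, leaves
its open file descriptors open. You can use this to your advantage in certain situations. In
other situations it may cause problems with resource usage. Note that Ruby 2.0 and later
flag the descriptors they open as close-on-exec, so a descriptor you want to share has to
be passed to exec explicitly.
hosts = File.open('/etc/hosts')
# The trailing hosts.fileno => hosts keeps the descriptor open across the exec.
exec 'python', '-c', "import os; print(os.fdopen(#{hosts.fileno}).read())", hosts.fileno => hosts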
  1991. In this example we start up a Ruby program and open the /etc/hosts file. Then we
  1992. exec(2) a python process and tell it to open the file descriptor number that Ruby
  1993. received for opening the /etc/hosts file. You can see that python recognizes this file
  1995. descriptor (because it was shared via exec(2)) and is able to read from it without
  1996. having to open the file again.
  1997. Arguments to exec
Notice in all of the examples above I sent a list of separate arguments to exec, rather
than passing them as one string? There's a subtle difference between the two argument forms.
Pass a single string to exec and it will actually start up a shell process and pass the
string to the shell to interpret. Pass the command and its arguments separately and it will
skip the shell and set up the list directly as the ARGV of the new process.
Generally you want to avoid passing a single string unless you really need to. Pass
separate arguments where possible. Passing a string and running code through the shell
can raise security concerns. If user input is involved it may be possible for them to inject
a malicious command directly in a shell, potentially gaining access to any privileges the
current process has. In a case where you want to do something like
exec("ls * | awk '{print($1)}'") you'll have to pass it as a string.
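To see the difference between the two forms without replacing the current process, the sketch below uses IO.popen (covered later in this chapter) purely as a way to capture output. It follows the same rule as exec: one string with shell syntax goes through a shell, a list of arguments does not.

```ruby
# One string containing shell syntax: a shell expands $HOME first.
via_shell = IO.popen('echo $HOME', &:read)

# Separate arguments: no shell, so echo receives the literal text.
direct = IO.popen(['echo', '$HOME'], &:read)

puts "via shell: #{via_shell}"
puts "direct:    #{direct}"
```

The second form prints the characters $HOME untouched. The same property is what keeps user input from being interpreted as shell commands.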
  2009. Kernel#system
  2010. system('ls')
  2011. system('ls', '--help')
  2012. system('git log | tail -10')
The return value of Kernel#system reflects the exit code of the terminal command in
the most basic way. If the exit code of the terminal command was 0 then it returns
true, otherwise it returns false. If the command could not be run at all it returns nil.
The standard streams of the terminal command are shared with the current
process (through the magic of fork(2)), so any output coming from the terminal
command should be seen in the same way output is seen from the current process.
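A short sketch of those return values. The exact exit code is available afterwards in $?, and the command name in the last call is deliberately one that should not exist on your system.

```ruby
ok      = system('sh', '-c', 'exit 0')
failed  = system('sh', '-c', 'exit 3')
code    = $?.exitstatus # exit code of the most recent child
missing = system('no-such-command-for-this-sketch')

p ok, failed, code, missing
```

The first call returns true and the second false, with 3 left behind in $?. The last returns nil because nothing could be executed.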
  2020. Kernel#`
  2021. `ls`
  2022. `ls --help`
  2023. %x[git log | tail -10]
  2024. Kernel#` works slightly differently. The value returned is the STDOUT of the terminal
  2025. program collected into a String.
  2026. As mentioned, it's using fork(2) under the hood and it doesn't do anything special
  2027. with STDERR , so you can see in the second example that STDERR is printed to the screen
  2028. just as with Kernel#system .
  2029. Kernel#` and %x[] do the exact same thing.
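If you do want STDERR in the returned String, one common approach is to redirect it inside the shell command itself. A small sketch, assuming a POSIX shell is what runs the command:

```ruby
# Only STDOUT is captured; "err" goes straight to the screen.
stdout_only = `echo out; echo err >&2`

# Redirecting inside the command merges STDERR into the captured output.
both = `(echo out; echo err >&2) 2>&1`

p stdout_only
p both
p $?.success?
```

As with Kernel#system, the exit status of the last command is left in $?.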
  2031. Process.spawn
# Requires Ruby 1.9 or later.
  2033. # This call will start up the 'rails server' process with the
  2034. # RAILS_ENV environment variable set to 'test'.
  2035. Process.spawn({'RAILS_ENV' => 'test'}, 'rails server')
  2036. # This call will merge STDERR with STDOUT for the duration
  2037. # of the 'ls --help' program.
  2038. Process.spawn('ls', '--help', STDERR => STDOUT)
  2039. Process.spawn is a bit different than the others in that it is non-blocking.
  2040. If you compare the following two examples you will see that Kernel#system will block
  2041. until the command is finished, whereas Process.spawn will return immediately.
  2042. # Do it the blocking way
  2043. system 'sleep 5'
  2044. # Do it the non-blocking way
  2045. Process.spawn 'sleep 5'
  2046. # Do it the blocking way with Process.spawn
  2047. # Notice that it returns the pid of the child process
  2048. pid = Process.spawn 'sleep 5'
  2049. Process.waitpid(pid)
  2051. The last example in this code block is a really great example of the flexibility of
  2052. Unix programming. In previous chapters we talked a lot about Process.wait , but it
  2053. was always in the context of forking and then running some Ruby code. You can
  2054. see from this example that the kernel cares not what you are doing in your process,
  2055. it will always work the same.
  2056. So even though we fork(2) and then run the sleep(1) program (a C program) the
  2057. kernel still knows how to wait for that process to finish. Not only that, it will be
  2058. able to properly return the exit code just as was happening in our Ruby programs.
  2059. All code looks the same to the kernel; that's what makes it such a flexible system.
  2060. You can use any programming language to interact with any other programming
  2061. language, and all will be treated equally.
Process.spawn takes many options that allow you to control the behaviour of the child
process. I showed a few useful ones in the example above. Consult the official rdoc [1]
for an exhaustive list.
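As one more illustration of those options, this sketch combines an environment hash with a redirection of the child's STDOUT into a pipe. The variable name GREETING is made up for the example.

```ruby
reader, writer = IO.pipe

# Set an environment variable for the child only and point its STDOUT
# at the write end of our pipe.
pid = Process.spawn({ 'GREETING' => 'hello' },
                    'sh', '-c', 'echo "$GREETING from $$"',
                    out: writer)
writer.close # the child holds its own copy of the write end

output = reader.read # returns once the child closes its STDOUT
reader.close
Process.wait(pid)

puts output
puts "child was #{pid}, exit status #{$?.exitstatus}"
```

Process.spawn returned right away; it is reader.read and Process.wait that make the parent wait here.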
  2066. IO.popen
  2067. # This example will return a file descriptor (IO object). Reading from it
  2068. # will return what was printed to STDOUT from the shell command.
  2069. IO.popen('ls')
  2070. 1.http://www.ruby-doc.org/core-1.9.3/Process.html#method-c-spawn
  2072. The most common usage for IO.popen is an implementation of Unix pipes in pure
  2073. Ruby. That's where the 'p' comes from in popen. Underneath it's still doing the
  2074. fork+exec, but it's also setting up a pipe to communicate with the spawned process.
  2075. That pipe is passed as the block argument in the block form of IO.popen .
  2076. # An IO object is passed into the block. In this case we open the stream
  2077. # for writing, so the stream is set to the STDIN of the spawned process.
  2078. #
  2079. # If we open the stream for reading (the default) then
  2080. # the stream is set to the STDOUT of the spawned process.
IO.popen('less', 'w') { |stream|
  stream.puts "some\ndata"
}
With IO.popen you have to choose up front which streams you have access to. You can
write to STDIN, read from STDOUT, or do both with the "r+" mode, but STDERR is never
handed to you as a separate stream.
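A brief sketch of a two-way conversation over IO.popen, using the "r+" mode and tr(1) as a stand-in for any filter program:

```ruby
# 'r+' opens the pipe for both writing to and reading from the child.
shouted = IO.popen(['tr', 'a-z', 'A-Z'], 'r+') do |io|
  io.write 'some data'
  io.close_write # signal end of input so tr can finish
  io.read
end

puts shouted
```

Closing the write side matters: without it tr would keep waiting for more input and io.read would never return.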
  2086. open3
  2087. Open3 allows simultaneous access to the STDIN, STDOUT, and STDERR of a spawned
  2088. process.
# This is available as part of the standard library.
require 'open3'
Open3.popen3('grep', 'data') { |stdin, stdout, stderr|
  stdin.puts "some\ndata"
  stdin.close
  puts stdout.read
}
# Open3 will use Process.spawn when available. Options can be passed to
# Process.spawn like so:
Open3.popen3('ls', '-uhh', :err => :out) { |stdin, stdout, stderr|
  puts stdout.read
}
  2102. Open3 acts like a more flexible version of IO.popen , for those times when you need it.
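When you just want everything a command produced, Open3.capture3 from the same standard library is a convenient shortcut. A small sketch, with a shell one-liner standing in for a real command:

```ruby
require 'open3'

# Returns the child's STDOUT, its STDERR and its Process::Status.
out, err, status = Open3.capture3('sh', '-c', 'echo out; echo err >&2; exit 2')

p out
p err
p status.exitstatus
```

It collects both output streams for you and waits for the command to finish, so there is no need to manage the individual pipes yourself.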
  2103. In the Real World
  2104. All of these methods are common in the Real World. Since they all differ in their
  2105. behaviour you have to select one based on your needs.
  2106. One drawback to all of these methods is that they rely on fork(2). What's wrong with
  2107. that? Imagine this scenario: You have a big Ruby app that is using hundreds of MB of
  2108. memory. You need to shell out. If you use any of the methods above you'll incur the
  2109. cost of forking.
  2110. Even if you're shelling out to a simple ls(1) call the kernel will still need to make sure
  2111. that all of the memory that your Ruby process is using is available for that new ls(1)
  2113. process. Why? Because that's the API of fork(2). When you fork(2) the process the
  2114. kernel doesn't know that you're about to transform that process with an exec(2). You
  2115. may be forking in order to run Ruby code, in which case you'll need to have all of the
  2116. memory available.
  2117. It's good to keep in mind that fork(2) has a cost, and sometimes it can be a
  2118. performance bottleneck. What if you need to shell out a lot and don't want to incur
  2119. the cost of fork(2)?
  2120. There are some native Unix system calls for spawning processes without the overhead
  2121. of fork(2). Unfortunately they don't have support in the Ruby language core library.
However, there is a Rubygem that provides a Ruby interface to these system calls. The
posix-spawn project [2] provides access to posix_spawn(2), which is available on most
Unix systems.
  2126. posix-spawn mimics the Process.spawn API. In fact, most of the options that you pass
  2127. to Process.spawn can also be passed to POSIX::Spawn.spawn . So you can keep using the
  2128. same API and yet reap the benefits of faster, more resource efficient spawning.
At a basic level posix_spawn(2) is a subset of fork(2). Recall the two distinguishing
  2130. attributes of a new child process from fork(2): 1) it gets an exact copy of everything
  2131. that the parent process had in memory, and 2) it gets a copy of all the file descriptors
  2132. that the parent process had open.
  2133. posix_spawn(2) preserves #2, but not #1. That's the big difference between the two. So
  2134. you can expect a newly spawned process to have access to any of the file descriptors
  2135. opened by the parent, but it won't share any of the memory. This is what makes
  2136. posix_spawn(2) faster and more efficient than fork(2). But keep in mind that it also
  2137. makes it less flexible.
  2138. 2.http://github.com/rtomayko/posix-spawn
  2140. System Calls
  2141. Ruby's Kernel#system maps to system(3), Kernel#exec maps to execve(2), IO.popen
  2142. maps to popen(3), posix-spawn uses posix_spawn(2).
  2144. Chapter 20
  2145. Ending
  2146. Working with processes in Unix is about two things: abstraction and communication.
  2147. Abstraction
  2148. The kernel has an extremely abstract (and simple) view of its processes. As
  2149. programmers we're used to looking at source code as the differentiator between two
  2150. programs.
  2151. We are masters of many programming languages, using each for different purposes.
  2152. We couldn't possibly write memory-efficient code in a language with a garbage
  2153. collector, we'll have to use C. But we need objects, let's use C++. On and on.
  2154. But if you ask the kernel it all looks the same. In the end, all of our code is compiled
  2155. down to something simple that the kernel can understand. And when it's working at
  2156. that level all processes are treated the same. Everything gets its numeric identifier and
  2157. is given equal access to the resources of the kernel.
  2158. What's the point of all this jibber-jabber? Using Unix programming lets you twiddle
  2159. with these knobs a little bit. It lets you do things that you can't accomplish when
  2160. working at the programming language level.
  2161. Unix programming is programming language agnostic. It lets you interface your Ruby
  2162. script with a C program, and vice versa. It also lets you reuse its concepts across
  2163. programming languages. The Unix Programming skills that you get from Ruby will be
  2165. just as applicable in Python, or node.js, or C. These are skills that are about
  2166. programming in general.
  2167. Communication
  2168. Besides the basic act of creating new processes, almost everything else we talked about
  2169. was regarding communication. Following the principle of abstraction mentioned
  2170. above, the kernel provides very abstract ways of communicating between processes.
  2171. Using signals any two processes on the system can communicate with each other. By
  2172. naming your processes you can communicate with any user who is inspecting your
  2173. program on the command line. Using exit codes you can send success/failure messages
  2174. to any process that's looking after your own.
  2175. Farewell, But Not Goodbye
  2176. That's the end! Congratulations for making it here! Believe it or not, you now know
  2177. more than most programmers about the inner workings of Unix processes.
Now that you know the fundamentals you can go out and apply your newfound knowledge
to anything that you work on. Things are going to start making more sense for you.
And the more you apply your newfound knowledge, the clearer things will become.
  2181. There's no stopping you now.
  2182. And we haven't even talked about networking :) We'll save that one for another
  2183. edition.
  2185. Read the appendices at the end of this book for a look at some popular Ruby projects
  2186. and how they use Unix processes to be awesome.
  2187. If you have any feedback on this book, find an error or build something cool with your
  2188. newfound knowledge, I'd love to hear it. Send a message to
  2189. jesse@workingwithunixprocesses.com. Happy coding!
  2191. Chapter 21
Appendix: How Resque Manages Processes
This section looks at how a popular Ruby job queue, Resque [1], effectively manages
processes. Specifically it makes use of fork(2) to manage memory, not for concurrency
or speed reasons.
  2198. The Architecture
  2199. To understand why Resque works the way it does we need a basic understanding of
  2200. how the system works.
  2201. From the README:
  2202. Resque is a Redis-backed library for creating background jobs, placing those
  2203. jobs on multiple queues, and processing them later.
  2204. The component that we're interested in is the Resque worker. Resque workers take care
  2205. of the 'processing them later' part. The job of a Resque worker is to boot up, load your
  2206. application environment, then connect to Redis and try to reserve any pending
  2207. background jobs. When it's able to reserve one such job it works off the job, then goes
  2208. back to step 1. Simple enough.
  2209. 1.http://github.com/defunkt/resque#readme
  2211. For an application of non-trivial size one Resque worker is not enough. So it's very
  2212. common to spin up multiple Resque workers in parallel to work off jobs.
  2213. Forking for Memory Management
  2214. Resque workers employ fork(2) for memory management purposes. Let's have a look at
  2215. the relevant bit of code and then dissect it line by line.
if @child = fork
  srand # Reseeding
  procline "Forked #{@child} at #{Time.now.to_i}"
  Process.wait(@child)
else
  procline "Processing #{job.queue} since #{Time.now.to_i}"
  perform(job, &block)
  exit! unless @cant_fork
end
  2225. This bit of code is executed every time Resque works off a job.
  2226. If you've read through the Forking chapter then you'll already be familiar with the if/
  2227. else style here. Otherwise go read it now!
  2228. We'll start by looking at the code inside the parent process (ie. inside the if block).
  2229. srand # Reseeding
This line is here simply because of a bug [2] in a certain patchlevel of MRI Ruby 1.8.7.
  2233. procline "Forked #{@child} at #{Time.now.to_i}"
  2234. procline is Resque's internal way of updating the name of the current process.
Remember we noted that you can change the name of the current process by assigning
to $0, but there's no dedicated method for doing it?
This is Resque's solution. procline sets the name of the current process.
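A helper in that spirit can be tiny. The version below is a hypothetical stand-in rather than Resque's actual implementation, and the "myworker" prefix is an assumption made for this sketch.

```ruby
# Hypothetical stand-in for a procline-style helper.
def procline(message)
  $0 = "myworker: #{message}"
end

procline "Waiting for jobs since #{Time.now.to_i}"
puts $0
```

While the process is running, tools such as ps(1) will show the new name, which makes it easy to see what each worker is doing.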
  2238. Process.wait(@child)
  2239. If you've read the chapter on Process.wait then this line of code should be familiar to
  2240. you.
  2241. The @child variable was assigned the value of the fork call. So in the parent process
  2242. that will be the child pid. This line of code tells the parent process to block until the
  2243. child is finished.
  2244. Now we'll look at what happens in the child process.
  2245. procline "Processing #{job.queue} since #{Time.now.to_i}"
  2246. Notice that both the if and else block make a call to procline. Even though these two
  2247. lines are part of the same logical construct they are being executed in two different
  2248. processes. Since the process name is process-specific these two calls will set the name
  2249. for the parent and child process respectively.
  2250. 2.http://redmine.ruby-lang.org/issues/4338
  2252. perform(job, &block)
  2253. Here in the child process is where the job is actually 'performed' by Resque.
  2254. exit! unless @cant_fork
  2255. Then the child process exits.
  2256. Why Bother?
  2257. As mentioned in the first paragraph of this chapter, Resque isn't doing this to achieve
  2258. concurrency or to make things faster. In fact, it adds an extra step to the processing of
  2259. each job which makes the whole thing slower. So why go to the trouble? Why not just
  2260. process job after job?
Resque uses fork(2) to ensure that the memory usage of its worker processes doesn't
  2262. bloat. Let's review what happens when a Resque worker forks and how that affects the
  2263. Ruby VM.
  2264. You'll recall that fork(2) creates a new process that's an exact copy of the original
  2265. process. The original process, in this case, has preloaded the application environment
  2266. and nothing else. So we know that after forking we'll have a new process with just the
  2267. application environment loaded.
  2268. Then the child process will go to the task of working off the job. This is where memory
  2269. usage can go awry. The background job may require that image files are loaded into
  2270. main memory for processing, or many ActiveRecord objects are fetched from the
  2272. database, or any other operation that requires large amounts of main memory to be
  2273. used.
  2274. Once the child process is finished with the job it exits, which releases all of its memory
  2275. back to the OS to clean up. Then the original process can resume, once again with only
  2276. the application environment loaded.
  2277. So each time after a job is performed by Resque you end up back at a clean slate in
  2278. terms of memory usage. This means that memory usage may spike when jobs are
  2279. being worked on, but it should always come back to that nice baseline.
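The pattern can be reduced to a few lines. This is a simplified sketch of the idea rather than Resque's code: the method name work_off is invented, and a job is assumed to be any object that responds to call.

```ruby
# Hypothetical worker loop: runs each job in its own forked child and
# returns true/false per job depending on whether it succeeded.
def work_off(jobs)
  jobs.map do |job|
    pid = fork do
      begin
        job.call # all memory allocated here dies with the child
        exit!(0)
      rescue StandardError
        exit!(1)
      end
    end
    _, status = Process.wait2(pid)
    status.success?
  end
end

$big = nil
jobs = [
  -> { $big = 'x' * 50_000_000 }, # memory-hungry job
  -> { raise 'job failed' }       # a failing job only takes down its child
]

p work_off(jobs)
p $big.nil?
```

The parent never sees the large string: $big is still nil after the first job, because the allocation happened in a child that has since exited.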
  2280. Doesn't the GC clean up for us?
  2281. Well, yes, but it doesn't do a great job. It does an OK job. The truth is that MRI's GC
  2282. has a hard time releasing memory that it doesn't need anymore.
  2283. When the Ruby VM boots up it is allocated a certain block of main memory by the
  2284. kernel. When it uses up all that it has it needs to ask for another block of main
  2285. memory from the kernel.
Due to numerous issues with Ruby's GC (a naive approach, memory fragmentation) it is rare
that the VM is able to release a block of memory back to the kernel. So the memory
usage of a Ruby process is likely to grow over time, but not to shrink. Now Resque's
approach begins to make sense!
  2290. If the Resque worker simply worked off each job as it became available then it wouldn't
  2291. be able to maintain that nice baseline level of memory usage. As soon as it worked on
  2292. a job that required lots of main memory then that memory would be stuck with the
  2293. worker process until it exited.
Even if subsequent jobs needed much less memory Ruby would have a hard time
giving that memory back to the kernel. Hence, the worker processes would inevitably
get bigger over time, never shrinking.
  2298. Thanks to the power of fork(2) Resque workers are reliable and don't need to be
  2299. restarted after working a certain number of jobs.
  2301. Chapter 22
Appendix: How Unicorn Reaps Worker Processes
Any investigation of Unix programming in the Ruby language would be remiss without
many mentions of the Unicorn web server [1]. Indeed, the project has already been
mentioned several times in this book.
What's the big deal? Unicorn is a web server that attempts to push as much
responsibility onto the kernel as it can. Its codebase is chock full of Unix
programming techniques.
Not only that, but it's performant and reliable. It's used by lots of big Ruby websites
like GitHub and Shopify.
The point is, if this book has whetted your appetite and you want to learn more about
Unix programming in Ruby you should plumb the depths of Unicorn. It may take you
several trips into the belly of the mythical beast but you will come out with better
understanding and new ideas.
  2317. Reaping What?
  2318. Before we dive into the code I'd like to provide a bit of context about how Unicorn
  2319. works. At a very high level Unicorn is a pre-forking web server.
[1] http://unicorn.bogomips.org
  2322. This means that you boot it up and tell it how many worker processes you would like it
  2323. to have. It starts by initializing its network sockets and loading your application. Then
  2324. it uses fork(2) to create the worker processes. It uses the master-worker pattern we
  2325. mentioned in the chapter on forking.
The Unicorn master process keeps a heartbeat on each of its workers and ensures
they're not taking too long to process requests. The code below is used when you tell
the Unicorn master process to exit. As we covered in the chapter on forking, if a parent
process doesn't kill its children before it exits they will continue on without stopping.
So it's important that Unicorn clean up after itself before it exits. Before invoking this
code it will send a QUIT signal to each of its worker processes, instructing them to exit
gracefully.
The code below cleans up Unicorn's internal representation of its
workers and ensures that they all exited properly.
  2335. Let's dive in.
    # reaps all unreaped workers
    def reap_all_workers
      begin
        wpid, status = Process.waitpid2(-1, Process::WNOHANG)
        wpid or return
        if reexec_pid == wpid
          logger.error "reaped #{status.inspect} exec()-ed"
          self.reexec_pid = 0
          self.pid = pid.chomp('.oldbin') if pid
          proc_name 'master'
        else
          worker = WORKERS.delete(wpid) and worker.close rescue nil
          m = "reaped #{status.inspect} worker=#{worker.nr rescue 'unknown'}"
          status.success? ? logger.info(m) : logger.error(m)
        end
      rescue Errno::ECHILD
        break
      end while true
    end
  2356. We'll take it one line at a time:
    begin
      ...
    end while true
The first thing that I want to draw your attention to is the fact that the begin block
that's started on the first line of this method actually starts an endless loop. There are
other ways to write endless loops in Ruby, but the key point is that we're
in an endless loop, so we'll need a hard return or a break in order to finish this
method.
    wpid, status = Process.waitpid2(-1, Process::WNOHANG)
This line should look familiar. We looked at Process.waitpid2 in the chapter
on Process.wait, but we always passed in a pid as the first argument and never passed
anything as the second argument.
So we saw that passing a valid pid as the first argument would cause the Process.waitpid
call to wait only for that pid. What happens when you pass -1 to Process.waitpid? We
know that there are no processes with a pid less than 1, so...
  2373. Passing -1 waits for any child process to exit. It turns out that this is the default
  2374. option to that method. If you don't specify a pid then it uses -1 by default. In this
  2375. case, since the author needed to pass something in for the second argument, the first
  2376. argument couldn't be left blank, so it was set to the default.
Hey, if you're waiting on any child process why not use Process.wait2 then? I suspect
that the author decided here, and I agree with him, that it was most readable to use a
waitpid variation when specifying a value for the pid. As mentioned above the value
specified is simply the default, but nonetheless it's clearest to use waitpid if you're
specifying any value for the pid.
  2382. The second argument is for special flags. Passing Process::WNOHANG causes this
  2383. normally blocking call to become a non-blocking call. When using this flag if there are
  2384. no processes that have exited for us then it will not block and simply return nil.
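Here's a small sketch of that non-blocking behaviour in isolation. It isn't taken from Unicorn. The pipe is just my way of keeping the child alive until the parent is ready, and the variable names are made up for the example.

```ruby
reader, writer = IO.pipe

pid = fork do
  writer.close  # the child only reads
  reader.read   # blocks until the parent closes its end of the pipe
  exit!(0)
end
reader.close    # the parent only writes (well, closes)

# The child is still alive, so the non-blocking call has nothing to
# report and returns nil instead of waiting.
still_running = Process.waitpid2(-1, Process::WNOHANG)
puts "first call returned #{still_running.inspect}"

# Let the child finish, then reap it with a normal blocking call.
writer.close
reaped_pid, status = Process.waitpid2(-1)
puts "reaped #{reaped_pid}, success=#{status.success?}"
```

Without Process::WNOHANG the first call would have blocked forever, since the child was waiting on the parent and the parent on the child.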
    wpid or return
  2387. This line may look a little odd but it's actually a conditional return statement. If wpid
  2388. is nil then we know that the last line returned nil . This would mean that there are no
  2389. child processes that have exited returning their status to us.
  2390. If this is the case then this method will return and its job is done.
    if reexec_pid == wpid
      logger.error "reaped #{status.inspect} exec()-ed"
      self.reexec_pid = 0
      self.pid = pid.chomp('.oldbin') if pid
      proc_name 'master'
I don't want to spend much time talking about this bit. The 'reexec' stuff has to do
with Unicorn internals, specifically how it handles zero-downtime restarts. Perhaps I
can cover that process in a future report.
One thing that I will draw your attention to is the call to proc_name. This is similar to
the procline method from the Resque chapter. Unicorn also has a method for
changing the display name of the current process, a critical piece of communication
with the user of your software.
    else
      worker = WORKERS.delete(wpid) and worker.close rescue nil
  2406. Unicorn stores a list of currently active worker processes in its WORKERS constant.
  2407. WORKERS is a hash where the key is the pid of the worker process and the value is an
  2408. instance of Unicorn::Worker .
  2409. So this line removes the worker process from Unicorn's internal tracking list ( WORKERS )
  2410. and calls #close on the worker instance, which closes its no longer needed heartbeat
  2411. mechanism.
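That one-liner packs in a lot, so here's a rough sketch of how it behaves for a pid that is tracked and for one that isn't. The worker_class struct is a made-up stand-in for Unicorn::Worker and the pids are arbitrary. Only the shape of the delete/and/rescue expression mirrors the original.

```ruby
# Hypothetical stand-in for Unicorn::Worker: it only remembers its number
# and whether close was called.
worker_class = Struct.new(:nr, :closed) do
  def close
    self.closed = true
  end
end

workers = { 101 => worker_class.new(0, false) }
tracked = workers[101]

# Known pid: the worker is removed from the hash and then closed.
worker = workers.delete(101) and worker.close rescue nil
known_label = "worker=#{worker.nr rescue 'unknown'}"

# Unknown pid: delete returns nil, so `and` short-circuits and close is
# never called. The rescue modifier in the label supplies a fallback.
stray = workers.delete(999) and stray.close rescue nil
stray_label = "worker=#{stray.nr rescue 'unknown'}"

puts known_label
puts stray_label
```

The same rescue-with-a-fallback trick shows up in the log message that the method builds next.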
    m = "reaped #{status.inspect} worker=#{worker.nr rescue 'unknown'}"
This line crafts a log message based on the status returned from the Process.waitpid2
call.
The string is crafted by first inspecting the status variable. What does that look like?
Something like this:
    #<Process::Status: pid=32227,exited(0)>
    # or
    #<Process::Status: pid=32308,signaled(SIGINT=2)>
It includes the pid of the ended process, as well as the way it ended. In the first line the
process exited itself with an exit code of 0. In the second line the process was killed
with a signal, SIGINT in this case. So a line like that will be added to the Unicorn log.
The second part of the log line, worker.nr, is Unicorn's internal representation of the
worker's number.
    status.success? ? logger.info(m) : logger.error(m)
This line takes the crafted log message and sends it to the logger. It uses the success?
method on the status object to decide whether to log the message at the INFO level or
the ERROR level.
  2430. The success? method will only return true in one case, when the process exited with
  2431. an exit code of 0. If it exited with a different code it will return false . If it was killed by
  2432. a signal, it will return nil .
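You can see all three return values for yourself with a sketch like the one below. The reap helper is a name I made up for this example, and I use SIGKILL for the signaled case simply because it can't be caught or ignored.

```ruby
# Hypothetical helper: block until the given child exits, return its status.
def reap(pid)
  _, status = Process.waitpid2(pid)
  status
end

clean  = reap(fork { exit!(0) })  # exited with code 0
failed = reap(fork { exit!(1) })  # exited with a non-zero code

victim = fork { sleep }           # would sleep forever if left alone
Process.kill(:KILL, victim)
signaled = reap(victim)           # killed by a signal, never exited

[clean, failed, signaled].each do |status|
  level = status.success? ? :info : :error
  puts "success?=#{status.success?.inspect} logged at #{level}"
end
```

Since nil is falsy, both the failed and the signaled child end up on the error side of the ternary.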
    rescue Errno::ECHILD
      break
This is part of the top-level begin statement in this method. If this exception is raised
then the endless loop that is this method breaks and it will return.
The Errno::ECHILD exception will be raised by Process.waitpid2 (or any of its cousins)
if there are no child processes for the current process. If that happens in this case
then it means the job of this method is done! All of the child processes have been
reaped. So it returns.
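Here's a stripped-down sketch of the same loop shape, with everything Unicorn-specific removed. One deliberate difference: I use the blocking form of Process.waitpid2 (no WNOHANG) so that the loop always runs until Errno::ECHILD is raised. The exit codes are arbitrary.

```ruby
# Fork three children that exit right away with codes 0, 1 and 2.
3.times { |i| fork { exit!(i) } }

exit_codes = []

begin
  # Blocks until some child exits. Once every child has been reaped
  # this raises Errno::ECHILD, which is our cue to stop.
  _pid, status = Process.waitpid2(-1)
  exit_codes << status.exitstatus
rescue Errno::ECHILD
  break
end while true

puts "reaped #{exit_codes.size} children"
```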
  2441. Conclusion
  2442. If this bit of code interested you and you want to learn more about Unix Programming
  2443. in Ruby, Unicorn is a great resource. See the official site at
  2444. http://unicorn.bogomips.org and go learn!
  2446. Chapter 23
Appendix: Preforking Servers
  2449. I'm glad you made it this far because this chapter may be the most action-packed in
  2450. the whole book. Preforking servers bring together a lot of the concepts that are
  2451. explained in this book into a powerful, highly-efficient approach to solving certain
  2452. problems.
There's a good chance that you've used either Phusion Passenger [1] or Unicorn [2]. Both of
those servers, and Spyglass (the web server included with this book), are examples of
preforking servers.
At the core of all these projects is the preforking model. There are a few things about
preforking that make it special. Here are three:
1. Efficient use of memory.
2. Efficient load balancing.
3. Efficient sysadminning.
We'll look at each in turn.
[1] http://www.modrails.com/
[2] http://unicorn.bogomips.org
  2466. 1. Efficient use of memory
  2467. In the chapter on forking we discussed how fork(2) creates a new process that's an
  2468. exact copy of the calling (parent) process. This includes anything that the parent
  2469. process had in memory at the time.
  2470. Loading a Rails App
  2471. On my Macbook Pro loading only Rails 3.1 (no libraries or application code) takes
  2472. in the neighbourhood of 3 seconds. After loading Rails the process is consuming
  2473. about 70MB of memory.
  2474. Whether or not these numbers are exactly the same on your machine isn't
  2475. significant for our purposes. I'll be referring to these as a baseline in the following
  2476. examples.
Preforking uses memory more efficiently than does spawning multiple unrelated
processes. To make the comparison concrete, think of running Unicorn with 10 worker
processes versus running 10 instances of Mongrel (a non-preforking server).
Let's review what will happen from the standpoint of processes, first looking at
Mongrel, then at Unicorn, when we boot up 10 instances of each server.
  2483. Many Mongrels
  2484. Booting up 10 Mongrel processes in parallel will look about the same as booting up 10
  2485. Mongrel processes serially.
When booting them in parallel all 10 processes will be competing for resources from
the kernel. Each will be consuming resources to load Rails, and each can be expected
to take the customary 3 seconds to boot. In total, that's 10 × 3 = 30 seconds of load
time spent across the processes. On top of that, each process will be consuming 70MB
of memory once Rails has been loaded. In total, that's 10 × 70MB = 700MB of memory
for 10 processes.
  2491. A preforking server can do better.
Many Unicorns
  2493. Booting up 10 Unicorn workers will make use of 11 processes. One process will be the
  2494. master, babysitting the other worker processes, of which there are 10.
  2495. When booting Unicorn only one process, the master process, will load Rails. There
  2496. won't be competition for kernel resources.
  2497. The master process will take the customary 3 seconds to load, and forking 10 processes
  2498. will be more-or-less instantaneous. The master process will be consuming 70MB of
  2499. memory to load Rails and, thanks to copy-on-write, the child processes should not be
  2500. using any memory on top of what the master was using.
  2502. The truth is that it does take some time to fork a process (it's not instantaneous) and
  2503. that there is some memory overhead for each child process. These values are negligible
  2504. compared to the overhead of booting many Mongrels. Preforking wins.
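The "load once, fork many" part can be sketched in a few lines. This doesn't measure memory, and the app hash is only a made-up stand-in for a loaded Rails environment. What it shows is that every worker can use data the master built before forking, without building it again.

```ruby
# Stand-in for the expensive load that the master performs exactly once.
app = { routes: Array.new(50_000) { |i| "/path/#{i}" } }

reader, writer = IO.pipe

3.times do
  fork do
    reader.close
    # The worker never built `app`; it inherited it from the master.
    writer.puts "#{Process.pid} #{app[:routes].size}"
    writer.close
    exit!(0)
  end
end

writer.close
reports = reader.readlines.map(&:chomp)
reader.close
Process.waitall

puts reports
```

Each worker reports the same 50,000 routes even though only the master ever ran the code that created them.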
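Keep in mind that the benefits of copy-on-write are forfeited if you're running
MRI 1.9 or earlier, because its garbage collector writes to objects while marking them.
To reap these benefits you need to be using REE, or MRI 2.0 and later, whose GC was
made copy-on-write friendly.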
  2507. 2. Efficient load balancing
I already highlighted the fact that fork(2) creates an exact copy of the calling process.
This includes any file descriptors that the parent process has open.
  2510. The Very Basics of Sockets
  2511. Efficient load balancing has a lot to do with how sockets work. Since we're talking
  2512. about web servers: sockets are important. They're at the very core of networking.
  2513. As I hinted earlier: sockets and networking are a complex topic, too big to fit into
  2514. this book. But you need to understand the very basic workflow in order to
  2515. understand this next part.
  2516. Using a socket involves multiple steps: 1) A socket is opened and binds to a unique
  2517. port, 2) A connection is accepted on that socket using accept(2), and 3) Data can
  2519. be read from this connection, written to the connection, and ultimately the
  2520. connection is closed. The socket stays open, but the connection is closed.
  2521. Typically this would happen in the same process. A socket is opened, then the process
  2522. waits for connections on that socket. The connection is handled, closed, and the loop
  2523. starts over again.
  2524. Preforking servers use a different workflow to let the kernel balance heavy load across
  2525. the socket. Let's look at how that's done.
  2526. In servers like Unicorn and Spyglass the first thing that the master process does is
  2527. open the socket, before even loading the Rails app. This is the socket that is available
  2528. for external connections from web clients. But the master process does not accept
  2529. connections. Thanks to the way fork(2) works, when the master process forks worker
  2530. processes each one gets a copy of the open socket.
  2531. This is where the magic happens.
  2532. Each worker process has an exact copy of the open socket, and each worker process
  2533. attempts to accept connections on that socket using accept(2). This is where the kernel
  2534. takes over and balances load across the 10 copies of the socket. It ensures that one, and
  2535. only one, process can accept each individual connection. Even under heavy load the
  2536. kernel ensures that the load is balanced and that only one process handles each
  2537. connection.
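Here's a small sketch that lets you watch this happen. It's not how Unicorn or Spyglass are written. It binds to port 0 so the kernel picks a free port, and each worker answers with its own pid so the client side can see who handled what. The number of workers and connections is arbitrary.

```ruby
require 'socket'

# The master opens the listening socket, but never calls accept itself.
server = TCPServer.new('127.0.0.1', 0) # port 0: let the kernel choose
port = server.addr[1]

worker_pids = 3.times.map do
  fork do
    loop do
      # Every worker blocks in accept on its copy of the same socket.
      # The kernel hands each incoming connection to exactly one of them.
      connection = server.accept
      connection.puts Process.pid
      connection.close
    end
  end
end

# Play the part of six web clients and record which worker answered.
handled_by = 6.times.map do
  client = TCPSocket.new('127.0.0.1', port)
  pid = client.gets.to_i
  client.close
  pid
end

worker_pids.each { |pid| Process.kill(:TERM, pid) }
Process.waitall
server.close

puts "connections were handled by pids #{handled_by.uniq.inspect}"
```

Which worker gets which connection is up to the kernel, so the spread will differ from run to run. What stays the same is that every connection gets exactly one answer.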
  2538. Compare this to how Mongrel achieves load balancing.
Given 10 unrelated processes that aren't sharing a socket each one must bind to a
unique port. Now a piece of infrastructure must sit in front of all of the Mongrel
processes. It must know which port each Mongrel process is bound to, and it must
do the job of making sure that each Mongrel is handling only one connection at a time
and that connections are load balanced properly.
  2545. Again, preforking wins both for simplicity and resource efficiency.
  2546. 3. Efficient sysadminning
  2547. This point is less technical, more human-centric.
  2548. As someone administering a preforking server you typically only need to issue
  2549. commands (usually signals) to the master process. It will handle keeping track of and
  2550. relaying messages to its worker processes.
When administering many instances of a non-preforking server the sysadmin must
keep track of each instance, administer them separately and ensure that their
commands are followed.
  2554. Basic Example of a Preforking Server
  2555. What follows is some really basic code for a preforking server. It can respond to
  2556. requests in parallel using multiple processes and will leverage the kernel for load
  2557. balancing. For a more involved example of a preforking server I suggest you check out
  2558. the Spyglass source code (next chapter) or the Unicorn source code.
    require 'socket'

    # Open a socket.
    socket = TCPServer.open('0.0.0.0', 8080)

    # Preload app code.
    # require 'config/environment'

    # For keeping track of child process pids. This must be defined before
    # the signal handlers below so that their blocks can see it.
    wpids = []

    # Forward any relevant signals to the child processes.
    [:INT, :QUIT].each do |signal|
      Signal.trap(signal) {
        wpids.each { |wpid| Process.kill(signal, wpid) }
      }
    end

    5.times {
      wpids << fork do
        # Workers inherit the master's handlers; restore the defaults so a
        # forwarded signal actually stops the worker.
        [:INT, :QUIT].each { |signal| Signal.trap(signal, 'DEFAULT') }

        loop {
          connection = socket.accept
          connection.puts 'Hello Readers!'
          connection.close
        }
      end
    }

    Process.waitall
  2583. You can consume it with something like nc(1) or telnet(1) to see it in action.
    $ nc localhost 8080
    $ telnet localhost 8080
Notice that I snuck something new into that one? We haven't seen
Process.waitall yet; it appeared on the last line of the example code above.
Process.waitall is simply a convenience method around Process.wait. It runs a
loop waiting for all child processes to exit and returns an array of [pid, status] pairs.
It's useful when you don't actually want to do anything with the process status info
and just need to wait for the children to exit.
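If you're curious what that return value looks like, here's a quick sketch. The three children and their exit codes are made up for the example.

```ruby
# Fork three children that exit immediately with codes 0, 1 and 2.
child_pids = 3.times.map { |i| fork { exit!(i) } }

# Blocks until every child has exited, then returns one
# [pid, Process::Status] pair per child.
results = Process.waitall

exit_codes = results.map { |_pid, status| status.exitstatus }.sort
puts "waited for #{results.size} children, exit codes #{exit_codes.inspect}"
```

The order of the pairs depends on which child the kernel reports first, which is why the sketch sorts the exit codes before printing them.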
  2594. Chapter 24
  2595. Appendix: Spyglass
  2596. If you want to know even more about Unix processes then your next stop should be
  2597. the included Spyglass project. Why? Because it was written specifically to showcase
  2598. Unix programming concepts.
  2599. If you have a copy of this book but didn't get the included code project, send me
  2600. an email and I'll hook you up: jesse@workingwithunixprocesses.com.
  2601. The case studies you read are meant to showcase the same thing, but at times they can
  2602. be dense and hard to read when you're new to Unix programming. Spyglass is meant to
  2603. bridge that gap.
  2604. Spyglass' Architecture
  2605. Spyglass is a web server. It opens a socket to the outside world and handles web
  2606. requests. Spyglass parses HTTP, is Rack-compliant, and is awesome.
  2607. Here's a brief summary of how to start a Spyglass server and what happens when it
  2608. receives an HTTP request.
  2610. Booting Spyglass
    $ spyglass
    $ spyglass -p other_port
    $ spyglass -h # for help
  2614. Before a Request Arrives
After it boots, control is passed to Spyglass::Lookout. This class DOES NOT preload
the Rack application and knows nothing about HTTP; it just waits for a connection. At
this point in time Spyglass is extremely lightweight: it's nothing more than an
open socket.
  2619. Connection is Made
  2620. When Spyglass::Lookout is notified that a connection has been made it forks a
  2621. Spyglass::Master to actually handle the connection. Spyglass::Lookout uses
  2622. Process.wait after forking the master process, so it remains idle until the master exits.
  2623. Spyglass::Master is responsible for preloading the Rack application and forking/
  2624. babysitting worker processes. The master process itself doesn't know anything about
  2625. HTTP parsing or request handling.
  2626. The real work is done in Spyglass::Worker . It accepts connections using the method
  2627. outlined in the chapter on preforking, leaning on the kernel for load balancing. Once
  2629. it has a connection it parses the HTTP request, calls the Rack app, and writes the
  2630. response to the client.
  2631. Things Get Quiet
So long as there is a steady flow of incoming traffic Spyglass continues to act as a
preforking server. If its internal timeout expires without any more incoming requests
arriving then the master process, and all its worker processes, exit. Control
is returned to Spyglass::Lookout and the workflow begins again.
  2636. Getting Started
Spyglass is not a production-ready server, so don't rush to start using it for your
projects! It's a codebase that's meant to be read. It's heavily commented, and formatted
documentation is generated with rocco [1].
  2641. The best thing to do at this point is enter the code directory that comes with this book
  2642. in your terminal, find the Spyglass codebase, and run rake read . This will open up the
  2643. formatted documentation in your browser for your reading pleasure.
  2644. Now go forth and read the code! And may the fork(2) be with you!
[1] http://rtomayko.github.com/rocco