Read From Multiple Child Processes From Parent
How to use spawn(), exec(), execFile(), and fork()
Update: This article is now part of my book "Node.js Beyond The Basics". Read the updated version of this content and more about Node at jscomplete.com/node-beyond-basics.
Single-threaded, non-blocking performance in Node.js works great for a single process. But eventually, one process in one CPU is not going to be enough to handle the increasing workload of your application.
No matter how powerful your server may be, a single thread can only support a limited load.
The fact that Node.js runs in a single thread does not mean that we can't take advantage of multiple processes and, of course, multiple machines as well.
Using multiple processes is the best way to scale a Node application. Node.js is designed for building distributed applications with many nodes. This is why it's named Node. Scalability is baked into the platform and it's not something you start thinking about later in the lifetime of an application.
This article is a write-up of part of my Pluralsight course about Node.js. I cover similar content in video format there.
Please note that you'll need a good understanding of Node.js events and streams before you read this article. If you haven't already, I recommend that you read these two other articles before you read this one:
Understanding Node.js Event-Driven Architecture
Most of Node's objects, like HTTP requests, responses, and streams, implement the EventEmitter module so they can…
Streams: Everything you need to know
Node.js streams have a reputation for being hard to work with, and even harder to understand. Well I've got good news…
The Child Processes Module
We can easily spin a child process using Node's child_process module and those child processes can easily communicate with each other with a messaging system.
The child_process module enables us to access Operating System functionalities by running any system command inside a, well, child process.
We can control that child process input stream, and listen to its output stream. We can also control the arguments to be passed to the underlying OS command, and we can do whatever we want with that command's output. We can, for example, pipe the output of one command as the input to another (just like we do in Linux) as all inputs and outputs of these commands can be presented to us using Node.js streams.
Note that examples I'll be using in this article are all Linux-based. On Windows, you need to switch the commands I use with their Windows alternatives.
There are four different ways to create a child process in Node: spawn(), fork(), exec(), and execFile().
We're going to see the differences between these four functions and when to use each.
Spawned Child Processes
The spawn function launches a command in a new process and we can use it to pass that command any arguments. For example, here's code to spawn a new process that will execute the pwd command.
    const { spawn } = require('child_process');
    const child = spawn('pwd');
We simply destructure the spawn function out of the child_process module and execute it with the OS command as the first argument.
The result of executing the spawn function (the child object above) is a ChildProcess instance, which implements the EventEmitter API. This means we can register handlers for events on this child object directly. For example, we can do something when the child process exits by registering a handler for the exit event:
    child.on('exit', function (code, signal) {
      console.log('child process exited with ' +
                  `code ${code} and signal ${signal}`);
    });
The handler above gives us the exit code for the child process and the signal, if any, that was used to terminate the child process. This signal variable is null when the child process exits normally.
The other events that we can register handlers for with the ChildProcess instances are disconnect, error, close, and message.
- The disconnect event is emitted when the parent process manually calls the child.disconnect function.
- The error event is emitted if the process could not be spawned or killed.
- The close event is emitted when the stdio streams of a child process get closed.
- The message event is the most important one. It's emitted when the child process uses the process.send() function to send messages. This is how parent/child processes can communicate with each other. We'll see an example of this below.
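As a minimal sketch of how two of these events behave, the snippet below wraps spawn in a promise that reports whichever of error or close fires first. The runAndReport helper is a hypothetical name introduced here for illustration; it is not part of the child_process API.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: resolves with a short description of how the child ended.
function runAndReport(command, args = []) {
  return new Promise((resolve) => {
    const child = spawn(command, args);

    // 'error' is emitted when the process could not be spawned,
    // for example when the command does not exist.
    child.on('error', (err) => resolve(`error: ${err.code}`));

    // 'close' is emitted once the process has ended and its stdio streams are closed.
    child.on('close', (code, signal) => {
      resolve(`close: code=${code} signal=${signal}`);
    });
  });
}
```

A promise settles only once, so when a spawn failure emits error first, any later close event is simply ignored by this helper.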
Every child process also gets the three standard stdio streams, which we can access using child.stdin, child.stdout, and child.stderr.
When those streams get closed, the child process that was using them will emit the close event. This close event is different than the exit event because multiple child processes might share the same stdio streams and so one child process exiting does not mean that the streams got closed.
Since all streams are event emitters, we can listen to different events on those stdio streams that are attached to every child process. Unlike in a normal process though, in a child process, the stdout/stderr streams are readable streams while the stdin stream is a writable one. This is basically the inverse of those types as found in a main process. The events we can use for those streams are the standard ones. Most importantly, on the readable streams, we can listen to the data event, which will have the output of the command or any error encountered while executing the command:
    child.stdout.on('data', (data) => {
      console.log(`child stdout:\n${data}`);
    });
    child.stderr.on('data', (data) => {
      console.error(`child stderr:\n${data}`);
    });
The two handlers above will log both cases to the main process stdout and stderr. When we execute the spawn function above, the output of the pwd command gets printed and the child process exits with code 0, which means no error occurred.
We can pass arguments to the command that's executed by the spawn function using the second argument of the spawn function, which is an array of all the arguments to be passed to the command. For example, to execute the find command on the current directory with a -type f argument (to list files only), we can do:
    const child = spawn('find', ['.', '-type', 'f']);
If an error occurs during the execution of the command, for example, if we give find an invalid destination above, the child.stderr data event handler will be triggered and the exit event handler will report an exit code of 1, which signifies that an error has occurred. The error values actually depend on the host OS and the type of error.
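To look at the exit code and both output streams together, one option is to gather them into a single result. The sketch below assumes a hypothetical collect helper (not something the child_process module provides) that accumulates the data events and resolves once the streams close.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: run a command and gather its exit code and output.
function collect(command, args = []) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let stdout = '';
    let stderr = '';

    // Decode chunks as UTF-8 strings instead of Buffers.
    child.stdout.setEncoding('utf8');
    child.stderr.setEncoding('utf8');

    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stderr.on('data', (chunk) => { stderr += chunk; });

    // Spawn failures (such as a missing command) arrive as 'error'.
    child.on('error', reject);

    // 'close' fires after the streams are done, so the output is complete here.
    child.on('close', (code) => resolve({ code, stdout, stderr }));
  });
}

// Possible usage (Linux): a path that does not exist makes find report on stderr.
// collect('find', ['./no-such-dir', '-type', 'f']).then(console.log);
```

Because this keeps all output in memory, it gives up the streaming advantage of spawn and only suits commands with modest output.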
A child process stdin is a writable stream. We can use it to send a command some input. Just like any writable stream, the easiest way to consume it is using the pipe function. We simply pipe a readable stream into a writable stream. Since the main process stdin is a readable stream, we can pipe that into a child process stdin stream. For example:
    const { spawn } = require('child_process');
    const child = spawn('wc');
    process.stdin.pipe(child.stdin);
    child.stdout.on('data', (data) => {
      console.log(`child stdout:\n${data}`);
    });
In the example above, the child process invokes the wc command, which counts lines, words, and characters in Linux. We then pipe the main process stdin (which is a readable stream) into the child process stdin (which is a writable stream). The result of this combination is that we get a standard input mode where we can type something and when we hit Ctrl+D, what we typed will be used as the input of the wc command.
We can also pipe the standard input/output of multiple processes on each other, just like we can do with Linux commands. For example, we can pipe the stdout of the find command to the stdin of the wc command to count all the files in the current directory:
    const { spawn } = require('child_process');
    const find = spawn('find', ['.', '-type', 'f']);
    const wc = spawn('wc', ['-l']);
    find.stdout.pipe(wc.stdin);
    wc.stdout.on('data', (data) => {
      console.log(`Number of files ${data}`);
    });
I added the -l argument to the wc command to make it count only the lines. When executed, the code above will output a count of all files in all directories under the current one.
Shell Syntax and the exec function
By default, the spawn function does not create a shell to execute the command we pass into it. This makes it slightly more efficient than the exec function, which does create a shell. The exec function has one other major difference. It buffers the command's generated output and passes the whole output value to a callback function (instead of using streams, which is what spawn does).
Here's the previous find | wc example implemented with an exec function.
    const { exec } = require('child_process');
    exec('find . -type f | wc -l', (err, stdout, stderr) => {
      if (err) {
        console.error(`exec error: ${err}`);
        return;
      }
      console.log(`Number of files ${stdout}`);
    });
Since the exec function uses a shell to execute the command, we can use the shell syntax directly here, making use of the shell pipe feature.
Note that using the shell syntax comes at a security risk if you're executing any kind of dynamic input provided externally. A user can simply do a command injection attack using shell syntax characters like ; and $ (for example, command + '; rm -rf ~').
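One common way to reduce that risk is to keep external input out of the command string and pass it as a separate argument to a function that does not use a shell. The sketch below is an illustration under that assumption; safeEcho is a hypothetical helper, and echo simply stands in for whatever command would receive the input.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: hand untrusted text to a command as a plain argument.
function safeEcho(untrustedInput, callback) {
  // No shell is involved here, so characters like ; and $ are not interpreted.
  // The whole string reaches echo as a single argument.
  execFile('echo', [untrustedInput], (err, stdout) => {
    if (err) return callback(err);
    callback(null, stdout);
  });
}

// Possible usage: the text after the semicolon is printed, not executed.
// safeEcho('docs; echo injected', (err, out) => console.log(out));
```

This does not make every input safe (a value starting with a dash may still be read as an option by the target command), so validating input remains worthwhile.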
The exec function buffers the output and passes it to the callback function (the second argument to exec) as the stdout argument there. This stdout argument is the command's output that we want to print out.
The exec function is a good choice if you need to use the shell syntax and if the size of the data expected from the command is small. (Remember, exec will buffer the whole data in memory before returning it.)
The spawn function is a much better choice when the size of the data expected from the command is large, because that data will be streamed with the standard IO objects.
We can make the spawned child process inherit the standard IO objects of its parents if we want to, but also, more importantly, we can make the spawn function use the shell syntax as well. Here's the same find | wc command implemented with the spawn function:
    const child = spawn('find . -type f | wc -l', {
      stdio: 'inherit',
      shell: true
    });
Because of the stdio: 'inherit' option above, when we execute the code, the child process inherits the main process stdin, stdout, and stderr. This causes the child process data events handlers to be triggered on the main process.stdout stream, making the script output the result right away.
Because of the shell: true option above, we were able to use the shell syntax in the passed command, just like we did with exec. But with this code, we still get the advantage of the streaming of data that the spawn function gives us. This is really the best of both worlds.
There are a few other good options we can use in the last argument to the child_process functions besides shell and stdio. We can, for example, use the cwd option to change the working directory of the script. For example, here's the same count-all-files example done with a spawn function using a shell and with a working directory set to my Downloads folder. The cwd option here will make the script count all files I have in ~/Downloads:
    const child = spawn('find . -type f | wc -l', {
      stdio: 'inherit',
      shell: true,
      cwd: '/Users/samer/Downloads'
    });
Another option we can use is the env option to specify the environment variables that will be visible to the new child process. The default for this option is process.env which gives any command access to the current process environment. If we want to override that behavior, we can simply pass an empty object as the env option or new values there to be considered as the only environment variables:
    const child = spawn('echo $ANSWER', {
      stdio: 'inherit',
      shell: true,
      env: { ANSWER: 42 },
    });
The echo command above does not have access to the parent process's environment variables. It can't, for example, access $HOME, but it can access $ANSWER because it was passed as a custom environment variable through the env option.
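If the goal is to add a variable rather than replace the whole environment, one approach is to spread process.env into the env object first. The following is a sketch on Linux; mergedEnv and printAnswer are hypothetical helper names used only for this illustration.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: keep the parent's variables and layer extra ones on top.
function mergedEnv(overrides) {
  return { ...process.env, ...overrides };
}

// Hypothetical helper: echo $ANSWER through a shell and resolve with the output.
function printAnswer(env) {
  return new Promise((resolve, reject) => {
    const child = spawn('echo "$ANSWER"', { shell: true, env });
    let output = '';
    child.stdout.on('data', (chunk) => { output += chunk; });
    child.on('error', reject);
    child.on('close', () => resolve(output.trim()));
  });
}

// Possible usage: the child sees ANSWER as well as inherited variables like HOME.
// printAnswer(mergedEnv({ ANSWER: '42' })).then(console.log);
```

Environment values end up as strings in the child process, which is why the sketch passes '42' as a string.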
One last important child process option to explain here is the detached option, which makes the child process run independently of its parent process.
Assuming we have a file timer.js that keeps the event loop busy:
    setTimeout(() => {
      // keep the event loop busy
    }, 20000);
We can execute it in the background using the detached option:
    const { spawn } = require('child_process');
    const child = spawn('node', ['timer.js'], {
      detached: true,
      stdio: 'ignore'
    });
    child.unref();
The exact behavior of detached child processes depends on the OS. On Windows, the detached child process will have its own console window while on Linux the detached child process will be made the leader of a new process group and session.
If the unref function is called on the detached process, the parent process can exit independently of the child. This can be useful if the child is executing a long-running process, but to keep it running in the background the child's stdio configurations also have to be independent of the parent.
The example above will run a node script (timer.js) in the background by detaching and also ignoring its parent stdio file descriptors so that the parent can terminate while the child keeps running in the background.
The execFile function
If you need to execute a file without using a shell, the execFile function is what you need. It behaves exactly like the exec function, but does not use a shell, which makes it a bit more efficient. On Windows, some files cannot be executed on their own, like .bat or .cmd files. Those files cannot be executed with execFile and either exec or spawn with shell set to true is required to execute them.
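As a small illustration of the call shape, the sketch below runs the current Node binary with --version through execFile. The executable and its arguments are passed separately, and the buffered output arrives in the callback just as it does with exec. The nodeVersion helper is a hypothetical name chosen for this example.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: ask the running Node binary for its version string.
function nodeVersion(callback) {
  // process.execPath is the absolute path of the node executable running this script.
  execFile(process.execPath, ['--version'], (err, stdout, stderr) => {
    if (err) return callback(err);
    callback(null, stdout.trim());
  });
}

// Possible usage:
// nodeVersion((err, version) => console.log(err || version));
```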
The *Sync function
The functions spawn, exec, and execFile from the child_process module also have synchronous blocking versions that will wait until the child process exits.
    const {
      spawnSync,
      execSync,
      execFileSync,
    } = require('child_process');
Those synchronous versions are potentially useful when trying to simplify scripting tasks or any startup processing tasks, but they should be avoided otherwise.
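A brief sketch of what the synchronous versions look like in use: execFileSync returns the command's output directly, while spawnSync returns an object describing how the child ended. The variable names and the tiny inline scripts here are illustrative assumptions.

```javascript
const { execFileSync, spawnSync } = require('child_process');

// execFileSync blocks until the child exits and returns its stdout.
// With the encoding option set, the result is a string instead of a Buffer.
const syncVersion = execFileSync(process.execPath, ['--version'], {
  encoding: 'utf8',
}).trim();

// spawnSync also blocks, and reports the exit code in the status property.
const syncResult = spawnSync(process.execPath, ['-e', 'process.exitCode = 3']);

console.log(syncVersion, syncResult.status);
```

Nothing else in the process runs while these calls wait, which is why they fit scripts and startup code better than request handlers.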
The fork() function
The fork function is a variation of the spawn function for spawning node processes. The biggest difference between spawn and fork is that a communication channel is established to the child process when using fork, so we can use the send function on the forked process along with the global process object itself to exchange messages between the parent and forked processes. We do this through the EventEmitter module interface. Here's an example:
The parent file, parent.js:
    const { fork } = require('child_process');
    const forked = fork('child.js');
    forked.on('message', (msg) => {
      console.log('Message from child', msg);
    });
    forked.send({ hello: 'world' });
The child file, child.js:
    process.on('message', (msg) => {
      console.log('Message from parent:', msg);
    });
    let counter = 0;
    setInterval(() => {
      process.send({ counter: counter++ });
    }, 1000);
In the parent file above, we fork child.js (which will execute the file with the node command) and then we listen for the message event. The message event will be emitted whenever the child uses process.send, which we're doing every second.
To pass down messages from the parent to the child, we can execute the send function on the forked object itself, and then, in the child script, we can listen to the message event on the global process object.
When executing the parent.js file above, it'll first send down the { hello: 'world' } object to be printed by the forked child process and then the forked child process will send an incremented counter value every second to be printed by the parent process.
Let's do a more practical example about the fork function.
Let's say we have an http server that handles two endpoints. One of these endpoints (/compute below) is computationally expensive and will take a few seconds to complete. We can use a long for loop to simulate that:
    const http = require('http');
    const longComputation = () => {
      let sum = 0;
      for (let i = 0; i < 1e9; i++) {
        sum += i;
      }
      return sum;
    };
    const server = http.createServer();
    server.on('request', (req, res) => {
      if (req.url === '/compute') {
        const sum = longComputation();
        return res.end(`Sum is ${sum}`);
      } else {
        res.end('Ok');
      }
    });
    server.listen(3000);
This program has a big problem; when the /compute endpoint is requested, the server will not be able to handle any other requests because the event loop is busy with the long for loop operation.
There are a few ways with which we can solve this problem depending on the nature of the long operation but one solution that works for all operations is to just move the computational operation into another process using fork.
We first move the whole longComputation function into its own file and make it invoke that function when instructed via a message from the main process:
In a new compute.js file:
    const longComputation = () => {
      let sum = 0;
      for (let i = 0; i < 1e9; i++) {
        sum += i;
      }
      return sum;
    };
    process.on('message', (msg) => {
      const sum = longComputation();
      process.send(sum);
    });
Now, instead of doing the long operation in the main process event loop, we can fork the compute.js file and use the messages interface to communicate messages between the server and the forked process.
    const http = require('http');
    const { fork } = require('child_process');
    const server = http.createServer();
    server.on('request', (req, res) => {
      if (req.url === '/compute') {
        const compute = fork('compute.js');
        compute.send('start');
        compute.on('message', sum => {
          res.end(`Sum is ${sum}`);
        });
      } else {
        res.end('Ok');
      }
    });
    server.listen(3000);
When a request to /compute happens now with the above code, we simply send a message to the forked process to start executing the long operation. The main process's event loop will not be blocked.
Once the forked process is done with that long operation, it can send its result back to the parent process using process.send.
In the parent process, we listen to the message event on the forked child process itself. When we get that event, we'll have a sum value ready for us to send to the requesting user over http.
The code above is, of course, limited by the number of processes we can fork, but when we execute it and request the long computation endpoint over http, the main server is not blocked at all and can take further requests.
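One way to keep the number of live child processes from growing with every request is to end each forked process once it has replied. The sketch below assumes a child script that answers a single 'start' message, like compute.js above; computeInChild is a hypothetical helper name, not part of the child_process module.

```javascript
const { fork } = require('child_process');

// Hypothetical helper: run one computation in a forked child, then clean up.
function computeInChild(scriptPath) {
  return new Promise((resolve, reject) => {
    const child = fork(scriptPath);

    child.once('message', (result) => {
      resolve(result);
      // The child's own 'message' listener keeps it alive, so stop it explicitly.
      child.kill();
    });

    child.once('error', reject);

    // If the child exits before replying, report that instead of waiting forever.
    // After a successful resolve this reject call has no effect.
    child.once('exit', (code, signal) => {
      reject(new Error(`child exited early (code ${code}, signal ${signal})`));
    });

    child.send('start');
  });
}

// Possible usage inside the request handler:
// computeInChild('compute.js').then((sum) => res.end(`Sum is ${sum}`));
```

This still forks once per request, so it bounds how long each child lives rather than how many run at the same moment.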
Node's cluster module, which is the topic of my next article, is based on this idea of child process forking and load balancing the requests among the many forks that we can create on any system.
That's all I have for this topic. Thanks for reading! Until next time!
Source: https://www.freecodecamp.org/news/node-js-child-processes-everything-you-need-to-know-e69498fe970a/