In the Node.js servers and scripts you have written thus far, you have already consumed external functionality in the form of modules. In this chapter, I explain how this all works and how to write your own. In addition to all the powerful and functional modules that Node already provides for you, there is a huge community further developing modules that you can take advantage of in your programs, and indeed you can even write your own to give something back!
One of the cool things about Node is that you don’t really distinguish between modules that you have produced and modules that you consume from external repositories, such as those you see later in this chapter via npm
, the Node Package Manager. When you write separate classes and groups of functions in Node, you put them in basically the same format—perhaps with a bit less dressing and documentation—as modules you download from the Internet and use. In fact, it usually takes only an extra bit of JSON and maybe a line or two of code to prepare your code for consumption by others!
Node ships with a large number of built-in modules, all of which are packaged in the node
executable on your system. You can view their source if you download the Node source code from the nodejs.org website. They all live in the lib/ subdirectory.
At a high level, modules are a way to group common functionality in Node.js. If you have a library of functions or classes for working with a particular database server, for example, it would make a lot of sense to put that code into a module and package it for consumption.
Every file in Node.js is a module, although modules do not necessarily have to be this simple. You can package complex modules with many files, unit tests, documentation, and other support files into folders and consume them in the same way you would a module with only a single JavaScript file (see “Writing Modules” later in this chapter).
To write your own module that exposes, or exports, a function called hello_world
, you can write the following and save it to mymodule.js:
exports.hello_world = function () {
console.log("Hello World");
}
The exports
object is a special object created by the Node module system in every file you create and is returned as the value of the require
function when you include that module. It lives off the module
object that every module has and is used to expose functions, variables, or classes. In the simple example here, the module exposes a single function on the exports
object, and to consume it, you could write the following and save it to modtest.js:
var mm = require ('./mymodule');
mm.hello_world();
Running node modtest.js
causes Node to print out "Hello World"
exactly as you would expect. You can expose as many functions and classes off the exports
object as you'd like. For example:
function Greeter (lang) {
this.language = lang;
this.greet = function () {
switch (this.language) {
case "en": return "Hello!";
case "de": return "Hallo!";
case "jp": return "こんにちは！";
default: return "No speaka that language";
}
}
}
exports.hello_world = function () {
console.log("Hello World");
}
exports.goodbye = function () {
console.log("Bye bye!");
}
exports.create_greeter = function (lang) {
return new Greeter(lang);
}
The module
variable given to each module contains information such as the filename of the current module, its child modules, its parent module, and more.
You frequently return objects from modules that you write. There are two key patterns through which you do this.
The previous sample module contains a class called Greeter
. To get an instance of a Greeter
object, you call a creation function—or factory function—to create and return an instance of this class. The basic model is as follows:
function ABC (parms) {
this.varA = ...;
this.varB = ...;
this.functionA = function () {
...
}
}
exports.create_ABC = function (parms) {
return new ABC(parms);
}
The advantage to this model is that the module can still expose other functions and classes via the exports
object.
Another way to expose classes from a module you write would be to completely replace the exports
object in the module with a class that you want people to use:
function ABC () {
this.varA = 10;
this.varB = 20;
this.functionA = function (var1, var2) {
console.log(var1 + " " + var2);
}
}
module.exports = ABC;
To use this module, you would change your code to be the following:
var ABCClass = require('./conmod2');
var obj = new ABCClass();
obj.functionA(1, 2);
Thus, the only thing you are really exposing from the module is a constructor for the class. This approach feels nice and OOP-y, but has the disadvantage of not letting you expose much else from your module; it also tends to feel a bit awkward in the Node way of doing things. I showed it to you here so that you can recognize it for what it is when you see it, but you will almost never use it in this book or your projects—you will largely stick with the factory model.
Apart from writing your own modules and using those provided by Node.js, you will frequently use code written by other people in the Node community and published on the Internet. The most common way this is done today is by using npm
, the Node Package Manager. npm
is installed with your node installation (as you saw in Chapter 1, “Getting Started”), and you can go to the command line and type npm help
to verify that it’s still there and working.
To install modules via npm
, you use the npm install
command. This technique requires only the name of the module package you want to install. Many npm
modules have their source code hosted on github.com, so they usually tell you the name required, for example:
host:ch5 marcw$ npm install mysql
<project>@<version> /Users/marcwan/src/misc/LearningNodeJS/Chapter05
└── mysql@<version> (along with the packages mysql itself depends on)
If you’re not sure of the name of the package you want to install, you can use the npm search command
, as follows:
npm search sql
This command prints the name and description of all matching modules.
However, you’re going to have a far richer and easier experience if you search by visiting npmjs.org and looking there.
npm
installs module packages to the node_modules/ subdirectory of your project. If a module package itself has any dependencies, they are installed to a node_modules/ subdirectory of that module’s folder.
+ project/
+ node_modules/
module1
module2
+ node_modules/
dependency1
main.js
To see a list of all modules that a project is currently using, you can use the npm ls
command:
host:ch05 marcwan$ npm ls
<project>@<version> /Users/marc/src/misc/LearningNodeJS/Chapter05
├── <package>@<version>
└── <package>@<version>
To update an installed package to a newer version, use the npm update
command. If you specify a package name, it updates only that one. If you do not specify a package name, it updates all packages to their latest version. If there are no changes to the package, it will print out nothing:
host:ch5 marcw$ npm update mysql
host:ch5 marcw$
As you have already seen, to include a module in a Node file that you are writing, you use the require
function. To be able to reference the functions and/or classes on that module, you assign the results (the exports
object of the loaded module) to a variable:
var http = require('http');
Included modules are private to the module that includes them, so if a.js loads the http module, then b.js cannot reference it, unless it itself also loads http.
Node.js uses a pretty straightforward set of rules for finding modules requested with the require
function:
1. If the requested module is a built-in one—such as http or fs—Node uses that.
2. If the module name in the require
function begins with a path component (./, ../, or /), Node looks in the specified directory for that module and tries to load it there. If you don't specify a .js extension on your module name, Node first looks for a folder-based module of that name. If it does not find that, it then adds the extensions .js, .json, and .node and tries to load modules of those types. (Modules with the extension .node are compiled add-on modules.)
3. If the module name does not have a path component at the beginning, Node looks in the node_modules/ subfolder of the current folder for the module there. If it is found, that is loaded; otherwise, Node works its way up the directory tree of the current location looking for node_modules/ folders there. If those continue to fail, it looks in some standard default locations, such as /usr/lib or /usr/local/lib, or the global npm folder (typically C:\Users\<username>\AppData\Roaming\npm) if you're running on Windows.
4. If the module isn’t found in any of these locations, an error is thrown.
After a module has been loaded from a particular file or directory, Node.js caches it. Subsequent calls to require
that would load the same module from the same location get the exact same code, with any initialization or other work that has taken place. Where this becomes interesting is in situations where we have a few different people asking for the same module. Consider the following project structure:
+ my_project/
+ node_modules/
+ special_widget/
+ node_modules/
mail_widget (v2.0.1)
mail_widget (v1.0.0)
main.js
utils.js
In this example, if either main.js or utils.js requires mail_widget, it gets v1.0.0 because Node’s search rules find it in the node_modules/ subdirectory of my_project. However, if they require special_widget, which in turn wishes to use mail_widget, special_widget gets its own privately included version of mail_widget, the v2.0.1 one in its own node_modules/ folder.
This is one of the most powerful and awesome features of the Node.js module system! In so many other systems, modules, widgets, or dynamic libraries are all stored in a central location, which creates versioning nightmares when you require packages that themselves require different versions of some other module. In Node, they are free to include these different versions of the other modules, and Node’s namespace and module rules mean that they do not interfere with each other at all! Individual modules and portions of a project are free to include, update, or modify included modules as they see fit without affecting the rest of the system.
In short, Node.js works intuitively, and for perhaps the first time in your life, you don’t have to sit there endlessly cursing the package repository system you’re using.
Consider the following situation:
a.js requires b.js.
b.js requires a.js.
main.js requires a.js.
You can see that you clearly have a cycle in the preceding modules. Node stops cycles from being a problem by simply returning uninitialized modules when it detects one. In the preceding case, the following happens:
main.js is loaded, and code runs that requires a.js.
a.js is loaded, and code runs that requires b.js.
b.js is loaded, and code runs that requires a.js.
Node detects the cycle and returns an object referring to a.js, but does not execute any more code—the loading and initialization of a.js are unfinished at this point!
b.js, a.js, and main.js all finish initializing (in that order), and then the reference from b.js to a.js is valid and fully usable.
Recall that every file in Node.js is itself a module, with a module
and exports
object. However, you also should know that modules can be a bit more complicated than that, with a directory to hold its contents and a file containing packaging information. For those cases in which you want to write a bunch of support files, break up the functionality of the module into separate JavaScript files, or even include unit tests, you can write modules in this format.
The basic format is as follows:
1. Create the folder to hold the module contents.
2. Put a file called package.json into this folder. This file should contain at least a name for the module and the main JavaScript file that Node should load initially for that module.
3. If Node cannot find the package.json file or no main JavaScript file is specified, it looks for index.js (or index.node for compiled add-on modules).
Now take the code you wrote for managing photos and albums in the preceding chapter and put it into a module. Doing so lets you share it with other projects that you write later and isolate the code so you can write unit tests, and so on.
First, create the following directory structure in the source scratch directory (that is, ~/src/scratch or wherever you’re playing around with Node):
+ album_mgr/
+ lib/
+ test/
In the album_mgr folder, create a file called package.json and put the following in it:
{ "name": "album-manager",
"version": "1.0.0",
"main": "./lib/albums.js" }
This is the most basic of package.json files; it tells npm
that the package should have the friendly name album-manager and that the "main" or starting JavaScript file for the package is the albums.js file in the lib/ subdirectory. package.json files can contain many other fields, including descriptions, author information, licensing, etc. The npm
documentation covers this in detail.
The preceding directory structure is by no means mandatory or written in stone; it is simply one of the common layouts for packages that I have found to be useful and have thus latched on to. You are under no obligation to follow it. I do, however, recommend that you start doing things this way and start experimenting with different layouts only after you’re comfortable with the whole system.
Sites such as github.com that are frequently used to host Node module source automatically display Readme documentation if they find it. Thus, it is pretty common for people to include a Readme.md (the “md” stands for markdown and refers to the standard documentation format that github.com uses). You are highly encouraged to write documentation for your modules to help people get started using it. For the album-manager module, I wrote the following Readme file:
# Album-Manager
This is our module for managing photo albums based on a directory. We
assume that, given a path, there is an albums sub-folder, and each of
its individual sub-folders are themselves the albums. Files in those
sub-folders are photos.
## Album Manager
The album manager exposes a single function, `albums`, which returns
an array of `Album` objects for each album it contains.
## Album Object
The album object has the following two properties and one method:
* `name` -- The name of the album
* `path` -- The path to the album
* `photos()` -- Calling this method will return all the album's photos
Now you can write your actual module files. First, start with the promised lib/albums.js, which is just some of the album-loading code from Chapter 4, “Writing Applications,” repackaged into a module-like JavaScript file:
var fs = require('fs'),
album = require('./album.js');
exports.version = "1.0.0";
exports.albums = function (root, callback) {
// we will just assume that any directory in our 'albums'
// subfolder is an album.
fs.readdir(root + "/albums", (err, files) => {
if (err) {
callback(err);
return;
}
var album_list = [];
(function iterator(index) {
if (index == files.length) {
callback(null, album_list);
return;
}
fs.stat(root + "/albums/" + files[index], (err, stats) => {
if (err) {
callback(make_error("file_error",
JSON.stringify(err)));
return;
}
if (stats.isDirectory()) {
var p = root + "/albums/" + files[index];
album_list.push(album.create_album(p));
}
iterator(index + 1);
});
})(0);
});
};
function make_error(err, msg) {
var e = new Error(msg);
e.code = err;
return e;
}
One of the standard things to provide in the exported functionality of modules is a version
member field. Although I don’t always use it, it can be a helpful way for calling modules to check your version and execute different code depending on what it has.
You can see that the album functionality is split into a new file called lib/album.js, and there is a new class called Album
. This class looks as follows:
function Album (album_path) {
this.name = path.basename(album_path);
this.path = album_path;
}
Album.prototype.name = null;
Album.prototype.path = null;
Album.prototype._photos = null;
Album.prototype.photos = function (callback) {
if (this._photos != null) {
callback(null, this._photos);
return;
}
fs.readdir(this.path, (err, files) => {
if (err) {
if (err.code == "ENOENT") {
callback(no_such_album());
} else {
callback(make_error("file_error", JSON.stringify(err)));
}
return;
}
var only_files = [];
var iterator = (index) => {
if (index == files.length) {
callback(null, only_files);
return;
}
fs.stat(this.path + "/" + files[index], (err, stats) => {
if (err) {
callback(make_error("file_error",
JSON.stringify(err)));
return;
}
if (stats.isFile()) {
only_files.push(files[index]);
}
iterator(index + 1);
});
};
iterator(0);
});
};
If you’re confused by the prototype
keyword used a few times in the preceding source code, perhaps now is a good time to jump back to Chapter 2 and review the section on writing classes in JavaScript. The prototype
keyword here is simply a way to set properties on all instances of our Album class.
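As a standalone refresher (not part of the album code), a property set on the prototype is shared by every instance rather than re-created for each one:

```javascript
function Album(album_path) {
    this.path = album_path;                   // per-instance data
}
Album.prototype.describe = function () {      // one function object, shared by all
    return "album at " + this.path;
};

var a = new Album("/photos/italy2012");
var b = new Album("/photos/japan2010");
console.log(a.describe === b.describe);       // true -- the prototype's single copy
console.log(a.describe());                    // album at /photos/italy2012
```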
Again, this Album class is pretty much what you saw in Chapter 4 with the basic JSON server. The only real difference is that it is packaged into a class with a prototype object and a method called photos.
I hope you also noted the following two things:
1. You now use a new built-in module called path, and you use the basename
function on it to extract the album’s name from the path.
2. By using arrow functions for anonymous callbacks within this class, we avoid the problems with the this
pointer mentioned in “Who Am I? Maintaining a Sense of Identity” in Chapter 3, “Asynchronous Programming.” If you’re not sure what we’re talking about here, please take a moment to refer back to that section.
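As a standalone illustration of point 2 (this sketch is not part of the album code), compare a plain function callback with an arrow function inside a method; in Node, setTimeout invokes a plain callback with its own this, so the instance is lost:

```javascript
function Album(album_path) {
    this.path = album_path;
}

Album.prototype.path_later_broken = function (callback) {
    setTimeout(function () {
        callback(this.path);   // plain function: `this` is not the Album here
    }, 10);
};

Album.prototype.path_later = function (callback) {
    setTimeout(() => {
        callback(this.path);   // arrow function: `this` is still the Album
    }, 10);
};

var a = new Album("/photos/italy2012");
a.path_later_broken((p) => console.log(p));   // undefined
a.path_later((p) => console.log(p));          // /photos/italy2012
```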
The rest of the album.js file is simply as follows:
var path = require('path'),
fs = require('fs');
// Album class code goes here
exports.create_album = function (path) {
return new Album(path);
};
function make_error(err, msg) {
var e = new Error(msg);
e.code = err;
return e;
}
function no_such_album() {
return { error: "no_such_album",
message: "The specified album does not exist." };
}
And that is all you need for your album-manager module! To test it, go back to the scratch directory and enter the following test program as atest.js:
var amgr = require('./album_mgr'); // Our module is in the album_mgr dir as per above
amgr.albums('./', function (err, albums) {
if (err) {
console.log("Unexpected error: " + JSON.stringify(err));
return;
}
var iterator = (index) => {
if (index == albums.length) {
console.log("Done");
return;
}
albums[index].photos(function (err, photos) {
if (err) {
console.log("Err loading album: " + JSON.stringify(err));
return;
}
console.log(albums[index].name);
console.log(photos);
console.log("");
iterator(index + 1);
});
}
iterator(0);
});
Now, all you have to do is ensure you have an albums/ subfolder in the current directory, and you should be able to run atest.js and see something like the following:
hostname:Chapter05 marcw$ node atest
australia2010
[ 'aus_01.jpg',
'aus_02.jpg',
'aus_03.jpg',
'aus_04.jpg',
'aus_05.jpg',
'aus_06.jpg',
'aus_07.jpg',
'aus_08.jpg',
'aus_09.jpg' ]
italy2012
[ 'picture_01.jpg',
'picture_02.jpg',
'picture_03.jpg',
'picture_04.jpg',
'picture_05.jpg' ]
japan2010
[ 'picture_001.jpg',
'picture_002.jpg',
'picture_003.jpg',
'picture_004.jpg',
'picture_005.jpg',
'picture_006.jpg',
'picture_007.jpg' ]
Done
You now have a module for working with albums. If you would like to use it in multiple projects, you could copy it to the node_modules/ folder of your other projects, but then you would have a problem: What happens when you want to make a change to your albums module? Do you have to copy the source code over to all the locations it is being used each and every time you change it? Ideally, we’d like to be able to use npm
even for our own private modules but not risk having them get uploaded to the actual npm
repository on the Internet.
Fortunately, npm
solves both of these problems for us. You can modify the package.json file to add the following:
{ "name": "album-manager",
"version": "1.0.0",
"main": "./lib/albums.js",
"private": true }
This code tells npm
to never accidentally publish this to the live npm
repository, which you don’t want for this module now.
Then, you can use the npm link
command, which tells npm
to put a link to the album-manager package in the local machine's default public package repository (such as /usr/local/lib/node_modules on Linux and Mac machines, or C:\Users\<username>\AppData\Roaming\npm on Windows).
host:Chapter05 marcw$ cd album_mgr
host:album_mgr marcw$ sudo npm link
/usr/local/lib/node_modules/album-manager ->
/Users/marcw/src/scratch/Chapter05/album_mgr
Note that depending on how the permissions and such are set up on your local machine, you might need to run this command as super-user with sudo
(Windows users will certainly not need to).
Now, to consume this module, you need to do two things:
1. Refer to 'album-manager'
instead of 'album_mgr'
in the code (because npm
uses the name
field in package.json).
2. Create a reference to the album-manager module with npm
for each project that wants to use it. You can just type npm link album-manager:
host:Chapter05 marcw$ mkdir test_project
host:Chapter05 marcw$ cd test_project/
host:test_project marcw$ npm link album-manager
/Users/marcw/src/scratch/Chapter05/test_project/node_modules/album-manager ->
/usr/local/lib/node_modules/album-manager ->
/Users/marcw/src/scratch/Chapter05/album_mgr
host:test_project marcw$ dir
drwxr-xr-x 3 marcw staff 102 11 20 18:38 node_modules/
host:test_project marcw$ dir node_modules/
lrwxr-xr-x 1 marcw staff 41 11 20 18:38 album-manager@ ->
/usr/local/lib/node_modules/album-manager
Now, you are free to make changes to your original album manager source, and all referencing projects will see changes right away.
If you have written a module that you would like to share with other users, you can publish it to the official npm
registry using npm publish
. This requires you to do the following:
Remove the "private": true
line from the package.json file.
Create an account on the npm
registry servers with npm adduser
.
Optionally, choose to fill in more fields in package.json (run npm help json
to get more information on which fields you might want to add) with things such as a description, author contact information, and host website.
Finally, run npm publish
from the module directory to push it to npm
. That’s it!
host:album_mgr marcw$ npm adduser
Username: learningnode_test
Password:
Email: (this IS public) [email protected]
Logged in as learningnode_test on https://registry.npmjs.org/.
host:album_mgr marcw$ npm publish
+ album-manager@1.0.0
If you accidentally publish something you didn’t mean to or otherwise want to remove from the npm
registry, you can use npm unpublish
:
host:album_mgr marcw$ npm unpublish
npm ERR! Refusing to delete entire project.
npm ERR! Run with --force to do this.
npm ERR! npm unpublish <project>[@<version>]
host:album_mgr marcw$ npm unpublish --force
npm WARN using --force I sure hope you know what you are doing.
- album-manager@1.0.0
If you see the following when trying to publish a module:
npm ERR! publish Failed PUT 403
npm ERR! Darwin 15.6.0
npm ERR! argv "/usr/local/bin/node" "/usr/local/bin/npm" "publish"
npm ERR! node v6.3.1
npm ERR! npm v3.10.3
npm ERR! code E403
npm ERR! you do not have permission to publish "album-manager". Are you logged in as
the correct user? : album-manager
It most likely means that somebody else has registered a module with this name. Your best bet is to choose another name.
You have already used a few of the Node.js built-in modules in code written thus far (http, fs, path, querystring, and url), and you will use many more throughout the rest of the book. However, there are one or two modules you will find yourself using for nearly every single project to manage a problem every Node.js programmer runs into: managing asynchronous code. We show two solutions here.
Consider the case in which you want to write some asynchronous code to
Open a handle to a path.
Determine whether or not the path points to a file.
Load in the contents of the file if the path does point to a file.
Close the file handle and return the contents to the caller.
You've seen almost all this code before, and the function to do this looks something like the following:
var fs = require('fs');
function load_file_contents(path, callback) {
fs.open(path, 'r', (err, f) => {
if (err) {
callback(err);
return;
} else if (!f) {
callback({ error: "invalid_handle",
message: "bad file handle from fs.open"});
return;
}
fs.fstat(f, (err, stats) => {
if (err) {
callback(err);
return;
}
if (stats.isFile()) {
var b = Buffer.alloc(stats.size); // Buffer.alloc replaces the deprecated new Buffer(size)
fs.read(f, b, 0, stats.size, null, (err, br, buf) => {
if (err) {
callback(err);
return;
}
fs.close(f, (err) => {
if (err) {
callback(err);
return;
}
callback(null, b.toString('utf8', 0, br));
});
});
} else {
callback({ error: "not_file",
message: "Can't load directory" });
return;
}
});
});
}
As you can see, even for a short, contrived example such as this, the code is starting to nest pretty seriously and deeply. Nest more than a few levels deep, and you'll find that you cannot fit your code in an 80-column terminal or one page of printed paper any more. It can also be quite difficult to read the code, figure out what variables are being used where, and determine the flow of the functions being called and returned.
To solve this problem, you can use an npm
module called async. The async module provides an intuitive way to structure and organize asynchronous calls, and removes many, if not all, of the tricky parts of asynchronous programming you encounter in Node.js.
You can execute code serially in async in two ways: through the waterfall
function or the series
function (see Figure 5.1).
The waterfall
function takes an array of functions and executes them one at a time, passing the results from each function to the next. At the end, a resulting function is called with the results from the final function in the array. If an error is signaled at any step of the way, execution is halted, and the resulting function is called with that error instead.
For example, you could easily rewrite the previous code cleanly (it’s in the GitHub source tree) using async.waterfall
:
var fs = require('fs');
var async = require('async');
function load_file_contents(path, callback) {
async.waterfall([
function (callback) {
fs.open(path, 'r', callback);
},
// the f (file handle) was passed to the callback at the end of
// the fs.open function call. async passes all params to us.
function (f, callback) {
fs.fstat(f, function (err, stats) {
if (err)
// abort and go straight to resulting function
callback(err);
else
// f and stats are passed to next in waterfall
callback(null, f, stats);
});
},
function (f, stats, callback) {
if (stats.isFile()) {
var b = Buffer.alloc(stats.size); // Buffer.alloc replaces the deprecated new Buffer(size)
fs.read(f, b, 0, stats.size, null, function (err, br, buf) {
if (err)
callback(err);
else
// f and string are passed to next in waterfall
callback(null, f, b.toString('utf8', 0, br));
});
} else {
callback({ error: "not_file",
message: "Can't load directory" });
}
},
function (f, contents, callback) {
fs.close(f, function (err) {
if (err)
callback(err);
else
callback(null, contents);
});
}
]
// this is called after all have executed in success
// case, or as soon as there is an error.
, function (err, file_contents) {
callback(err, file_contents);
});
}
Although the code has grown a little bit in length, when you organize the functions serially in an array like this, the code is significantly cleaner looking and easier to read.
The async.series
function differs from async.waterfall
in two key ways:
Results from one function are not passed to the next; instead, they are collected in an array, which becomes the “results” (the second) parameter to the final resulting function. Each step of the serial call gets one slot in this results array.
You can pass an object to async.series
, and it enumerates the keys and executes the functions assigned to them. In this case, the results are not passed as an array, but an object with the same keys as the functions called.
var async = require("async");
async.series({
numbers: (callback) => {
setTimeout(function () {
callback(null, [ 1, 2, 3 ]);
}, 1500);
},
strings: (callback) => {
setTimeout(function () {
callback(null, [ "a", "b", "c" ]);
}, 2000);
}
},
(err, results) => {
console.log(results);
});
This function generates the following output:
{ numbers: [ 1, 2, 3 ], strings: [ 'a', 'b', 'c' ] }
In the previous async.series
example, there was no reason to use a serial execution sequence for the functions; the second function did not depend on the results of the first, so they could have executed in parallel (see Figure 5.2). For this, async provides async.parallel
, as follows:
var async = require("async");
async.parallel({
numbers: function (callback) {
setTimeout(function () {
callback(null, [ 1, 2, 3 ]);
}, 1500);
},
strings: function (callback) {
setTimeout(function () {
callback(null, [ "a", "b", "c" ]);
}, 2000);
}
},
function (err, results) {
console.log(results);
});
This function generates the exact same output as before.
The most powerful function of them all is the async.auto
function, which lets you mix ordered and unordered functions together into one powerful sequence of functions. In this, you pass an object where keys contain either
A function to execute or
An array of dependencies and then a function to execute. These dependencies are strings and are the names of properties in the object provided to async.auto
. The auto
function waits for these dependencies to finish executing before calling the provided function.
The async.auto
function figures out the required order to execute all the functions, including which can be executed in parallel and which need to wait for others (see Figure 5.3). As with the async.waterfall
function, you can pass results from one function to the next via the callback
parameter:
var async = require("async");
async.auto({
numbers: (callback) => {
setTimeout(() => {
callback(null, [ 1, 2, 3 ]);
}, 1500);
},
strings: (callback) => {
setTimeout(() => {
callback(null, [ "a", "b", "c" ]);
}, 2000);
},
// do not execute this function until numbers and strings are done
// thus_far is an object with numbers and strings as arrays.
assemble: [ 'numbers', 'strings', (thus_far, callback) => {
callback(null, {
numbers: thus_far.numbers.join(", "),
strings: "'" + thus_far.strings.join("', '") + "'"
});
}]
},
// this is called at the end when all other functions have executed. Optional
(err, results) => {
if (err)
console.log(err);
else
console.log(results);
});
The results
parameter passed to the final resulting function is an object in which the properties hold the results of each of the functions executed on the object:
{ numbers: [ 1, 2, 3 ],
strings: [ 'a', 'b', 'c' ],
assemble: { numbers: '1, 2, 3', strings: "'a', 'b', 'c'" } }
In Chapter 3, I showed you how you can use the following pattern to iterate over the items in an array with asynchronous function calls:
var iterator = (i) => {
    if (i < array.length) {
        async_work(function () {    // async_work stands in for any asynchronous call
            iterator(i + 1);
        });
    } else {
        callback(results);
    }
};
iterator(0);
Although this technique works great and is indeed gloriously geeky, it’s a bit more complicated than I’d like. The async module comes to the rescue again with async.forEachSeries
. It iterates over every element in the provided array, calling the given function for each. However, it waits for each to finish executing before calling the next in the series:
async.forEachSeries(
    arr,
    // called once for each element in arr
    (element, callback) => {
        // use element
        callback(null); // YOU MUST CALL ME FOR EACH ELEMENT!
    },
    // called at the end
    (err) => {
        // was there an error? err will be non-null if so
    }
);
To loop over every element without waiting between iterations, and simply have async tell you when all of them have finished, you can use async.forEach
. It is called in exactly the same way but differs in that it does not execute the functions serially; it starts them all in parallel.
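If the async module isn't handy, the parallel behavior can be approximated by hand with a counter. This is only a sketch of the idea, not the module's actual code, and the forEachParallel name is my own invention:

```javascript
// Launch the worker for every element at once, and fire the final
// callback when the last one reports in (or as soon as any fails).
function forEachParallel(arr, worker, done) {
    if (arr.length === 0) return done(null);
    let remaining = arr.length;
    let failed = false;
    arr.forEach((element) => {
        worker(element, (err) => {
            if (failed) return;      // an error was already reported
            if (err) {
                failed = true;
                return done(err);
            }
            remaining -= 1;
            if (remaining === 0) done(null);
        });
    });
}

// Usage: double every number "asynchronously".
const out = [];
forEachParallel([1, 2, 3], (n, cb) => {
    setTimeout(() => { out.push(n * 2); cb(null); }, 10);
}, (err) => {
    if (err) console.log(err);
    else console.log("all done:", out);
});
```

Because the workers all run at once, the order of results is not guaranteed, which is exactly the trade-off between async.forEach and async.forEachSeries.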
The async module contains a ton of other functionality and is truly one of the indispensable modules of Node.js programming today. I highly encourage you to browse the documentation at https://github.com/caolan/async and play around with it. It truly takes the already-enjoyable Node.js programming environment and makes it even better.
While the methods we have looked at thus far for managing asynchronous programming—patterns and the async module—are the primary ways you’ll work with Node.js throughout this book, another popular pattern for managing asynchronous programming is promises.
Promises come in various implementations and flavors. The one we discuss here comes via the bluebird module in Node, which you can install by running npm install bluebird
. There are a number of other modules for promises in Node, most notably promises and Q, but we'll use bluebird for now because it can "promisify" entire modules in Node, which we'll find quite useful. Working with promises requires APIs to be written differently, and the better promise packages can take a regular module and wrap its functions in promise-ready versions.
Just like async, promises seek to make writing asynchronous code easier for you by automatically passing parameters from callbacks to the next function invocation. Similarly, they aim to centralize all error processing in one place at the end.
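Independent of any particular library, the core idea looks like this with native promises. The fetchNumber and double step functions here are invented purely for illustration:

```javascript
// Each step returns a promise; its resolved value becomes the
// argument of the next then(). A rejection anywhere in the chain
// jumps straight to the single catch() at the end.
const fetchNumber = () => Promise.resolve(21);
const double = (n) => Promise.resolve(n * 2);

fetchNumber()
    .then(double)              // receives 21, resolves with 42
    .then((n) => {
        console.log("result:", n);
    })
    .catch((err) => {
        // all errors from any step end up here
        console.log("SO SAD:", err);
    });
```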
Looking back to the above example of opening a file, seeing if it’s actually a file, reading its contents, and then closing it, we can rewrite this using promises in the following manner:
var Promise = require("bluebird");
var fs = Promise.promisifyAll(require("fs"));

function load_file_contents2(filename, callback) {
    var errorHandler = (err) => {
        console.log("SO SAD: " + err);
        callback(err, null);
    };

    fs.openAsync(filename, 'r', 0)
        .then(function (fd) {                           // 1
            fs.fstatAsync(fd)
                .then(function (stats) {
                    if (stats.isFile()) {               // 2
                        var b = Buffer.alloc(stats.size);
                        return fs.readAsync(fd, b, 0, stats.size, null)
                            .then(() => fs.closeAsync(fd))
                            .then(function () {
                                callback(null, b.toString('utf8'));
                            })
                            .catch(errorHandler);
                    }
                });
        })
        .catch(errorHandler);
}
Using APIs with the promise pattern requires modification and conversion to promise-compatible versions. For example, we want to use the File System (fs) APIs with promises in the above example, so we import bluebird and convert the module as follows:
var Promise = require("bluebird");
var fs = Promise.promisifyAll(require("fs"));
The bluebird module leaves all the original functions intact and adds new promisified versions of every function in the module whose callback takes an error as its first parameter. These new versions have Async
appended to the function name (for example, fs.open becomes fs.openAsync).
Effectively, when calling the then
function:
1. The provided function is executed.
2. If the callback receives a non-null error, all further then functions are skipped until the catch
method is reached, and the error is passed to it.
3. If the callback receives a value, it is passed to the next then function in the chain.
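You can watch both behaviors in a short native-promise sketch: values flow from one then to the next, and a thrown error skips every remaining then until the catch:

```javascript
const steps = [];

Promise.resolve(1)
    .then((n) => { steps.push("first:" + n); return n + 1; })
    .then((n) => { steps.push("second:" + n); throw new Error("boom"); })
    .then(() => { steps.push("skipped"); })   // never runs
    .catch((err) => { steps.push("caught:" + err.message); });
```

After the chain settles, steps holds "first:1", "second:2", and "caught:boom"; the third then never executes.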
In the above promises code, we have to deal with a couple of interesting problems. For the part marked with // 1
, we originally wanted to write the code like this:
fs.openAsync("promises.js", 'r')
    .then(fs.fstatAsync)            // fd passed from openAsync to us by promises
    .then(function (stats) {
        if (stats.isFile()) {
            fs.readAsync(fd, ...);  // etc.
This, however, presents a problem. Once you've verified that the path is a file, you want to read from it, which requires the fd
(file descriptor) parameter that openAsync
passes to its callback. Because the promise chain above quietly consumed that value, you no longer have it anywhere to pass to readAsync
. Thus, in the working code you actually create a new function with fd as one of its parameters, so that code anywhere in that scope can refer to the file descriptor.
The second problem is how to deal with the branching at // 2
: you want to execute different code depending on whether the given path is a file. Looking at the above code, you can see that the way to do this is simply to start another promise chain within the first one; there is no limit to how deeply you can nest them!
So, while our code still nests a bit when dealing with all the possible paths, we've managed to make it significantly more compact and to factor all the error-handling code into one place, which is a big improvement over our original function. Much like async, the promises model of asynchronous programming comes with solutions for asynchronous looping and the parallel execution constructs that you would expect from the asynchronous world of Node.
While promises can be an effective and common solution to managing asynchronous code, we’ll continue using mostly async and regular callbacks in this book as I find promises to have two shortcomings that don’t really work for me:
1. Using promises requires either rewriting your APIs to be promise-enabled or using promise modules such as bluebird that provide promisify functionality. The former adds complexity and differs between the various promise systems, while the latter is limited and can't always provide promisified versions of things (e.g., the old fs.exists
function, which never passed an err
as the first parameter to its callback).
2. I find the code you write in promises not particularly readable. To my eyes, it looks complicated and introduces too many new concepts and functions you have to get used to in order to solve all the different problems. In this regard, I find async produces far cleaner code.
As always, you are encouraged to play around with all the different paradigms out there (there are other approaches to solving the asynchronous programming problem!) and choose which one works best for you.
In this chapter you were more formally introduced to modules in Node.js. Although you have seen them before, now you finally know how they’re written, how Node finds them for inclusion, and how to use npm
to find and install them. You can write your own complex modules now with package.json files and link them across your projects or even publish them for others to use via npm
.
Finally, you are now armed with knowledge of various approaches to cleaning up asynchronous programming, in particular async, one of the modules that you will use in nearly every single Node project you write from now on.
Next up: Putting the "web" back in web servers. You'll look at some cool ways to use JSON and Node in your web apps and how to work with some other core Node technologies such as events and streams.