Node’s `require` is dog slow

Our test environment takes 6-9 seconds to load before any tests get run. I tire of this during the ~30 times I run the test suite a day,1 so I wanted to make it faster.

For better or worse, the API runs on Sails.js. Before running model/controller tests, a bootstrap file in our tests calls sails.lift.

require('sails').lift(function(err) {
    // Run the tests
});

This lift call generates about 400 queries against Postgres to retrieve database schema, that each look like this:

SELECT x.nspname || '.' || x.relname as "Table", x.attnum
as "#", x.attname as "Column", x."Type", case x.attnotnull
when true then 'NOT NULL' else '' end as "NULL",
r.conname as "Constraint", r.contype as "C", r.consrc,
fn.nspname || '.' || f.relname as "F Key", d.adsrc as "Default"
FROM (SELECT c.oid, a.attrelid, a.attnum, n.nspname, c.relname,
a.attname, pg_catalog.format_type(a.atttypid, a.atttypmod) as "Type",
a.attnotnull FROM pg_catalog.pg_attribute a, pg_namespace n,
pg_class c WHERE a.attnum > 0 AND NOT a.attisdropped
AND a.attrelid = c.oid and c.relkind not in ('S','v')
and c.relnamespace = n.oid and n.nspname
not in ('pg_catalog','pg_toast','information_schema')) x
left join pg_attrdef d on d.adrelid = x.attrelid
and d.adnum = x.attnum left join pg_constraint r on r.conrelid = x.oid
and r.conkey[1] = x.attnum left join pg_class f on r.confrelid = f.oid
left join pg_namespace fn on f.relnamespace = fn.oid
where x.relname = 'credits' order by 1,2;

I'm not really sure what those queries do, since Sails seems to ignore the schema that's already in the database when generating table queries. Since we use the smallest, safest subset of the ORM we can find, I tried commenting out the sails-postgresql module code that makes those queries to see if the tests would still pass, and the tests did pass... but the load time was still slow.

The next step was to instrument the code to figure out what was taking so long. I wanted to have the Sails loader log the duration of each part of the load process, but this would have required a global variable, and a whole bunch of calls to console.log. It turns out the Unix function ts can do this for you, if log lines are generated at the appropriate times. Basically, it's an instant awesome tool for generating insight into a program's runtime, without needing to generate timestamps in the underlying tool.

NAME
       ts - timestamp input

SYNOPSIS
       ts [-r] [-i | -s] [format]

DESCRIPTION
       ts adds a timestamp to the beginning of each line of input.

I set the Sails logging level to verbose and piped the output through ts '[%Y-%m-%d %H:%M:%.S]'. I pretty quickly found a culprit..

[2015-04-19 21:53:45.730679] verbose: request hook loaded successfully.
[2015-04-19 21:53:45.731032] verbose: Loading the app's models and adapters...
[2015-04-19 21:53:45.731095] verbose: Loading app models...
[2015-04-19 21:53:47.928104] verbose: Loading app adapters...
[2015-04-19 21:53:47.929343] verbose: Loading blueprint middleware...

That's a 2 second gap between loading the models and loading the adapters.

I started adding profiling to the code near the "Loading app models..." line. I expected to see that attaching custom functions (findOne, update, etc) to the Sails models was the part that took so long. Instead I found out that a module called include-all accounted for almost all of the 2.5 seconds. That module simply requires every file in the models directory, about 30 files.

Further reading/logging revealed that each require call was being generated in turn. I've found it, I thought, just use different CPU's to require them all at the same time, and see a speedup. Unfortunately, the require operation is synchronous in Node, and runs on one CPU, so the process can still only perform one require at a time.

I tried just loading one model to see how slow that would be. This script took an average of 700 milliseconds to run, on my high end Macbook Pro:

var start = Date.now();
require('api/models/Drivers');
console.log(Date.now() - start);

700 milliseconds to require a model file, and that file's dependencies! I can send a packet to and from New York 8 times in that amount of time. What the hell is it actually doing? For this I turned to good old strace (or dtruss, as it's ported on Macs). First start up a shell to record syscalls for any process that is called node.

# (You'll want to kill any other node processes before running this.)
sudo dtruss -d -n 'node' > /tmp/require.log 2>&1

Open up another shell session and run your little script that calls require and prints the startup time and then exits. You should have a few thousand lines in a file called /tmp/require.log. Here's what I found near the start:

   1186335 stat64("/Users/burke/code/api/api/models/node_modules/async\0", 0x7FFF5FBFE608, 0x9)             = -1 Err#2
   1186382 stat64("/Users/burke/code/api/api/models/node_modules/async.js\0", 0x7FFF5FBFE5B8, 0x9)          = -1 Err#2
   1186405 stat64("/Users/burke/code/api/api/models/node_modules/async.json\0", 0x7FFF5FBFE5B8, 0x9)                = -1 Err#2
   1186423 stat64("/Users/burke/code/api/api/models/node_modules/async.node\0", 0x7FFF5FBFE5B8, 0x9)                = -1 Err#2
   1186438 stat64("/Users/burke/code/api/api/models/node_modules/async.coffee\0", 0x7FFF5FBFE5B8, 0x9)              = -1 Err#2
   1186473 open("/Users/burke/code/api/api/models/node_modules/async/package.json\0", 0x0, 0x1B6)           = -1 Err#2
   1186501 stat64("/Users/burke/code/api/api/models/node_modules/async/index.js\0", 0x7FFF5FBFE5B8, 0x1B6)          = -1 Err#2
   1186519 stat64("/Users/burke/code/api/api/models/node_modules/async/index.json\0", 0x7FFF5FBFE5B8, 0x1B6)                = -1 Err#2
   1186534 stat64("/Users/burke/code/api/api/models/node_modules/async/index.node\0", 0x7FFF5FBFE5B8, 0x1B6)                = -1 Err#2
   1186554 stat64("/Users/burke/code/api/api/models/node_modules/async/index.coffee\0", 0x7FFF5FBFE5B8, 0x1B6)              = -1 Err#2
   1186580 stat64("/Users/burke/code/api/api/node_modules/async\0", 0x7FFF5FBFE608, 0x1B6)          = -1 Err#2
   1186598 stat64("/Users/burke/code/api/api/node_modules/async.js\0", 0x7FFF5FBFE5B8, 0x1B6)               = -1 Err#2
   1186614 stat64("/Users/burke/code/api/api/node_modules/async.json\0", 0x7FFF5FBFE5B8, 0x1B6)             = -1 Err#2
   1186630 stat64("/Users/burke/code/api/api/node_modules/async.node\0", 0x7FFF5FBFE5B8, 0x1B6)             = -1 Err#2
   1186645 stat64("/Users/burke/code/api/api/node_modules/async.coffee\0", 0x7FFF5FBFE5B8, 0x1B6)           = -1 Err#2
   1186670 open("/Users/burke/code/api/api/node_modules/async/package.json\0", 0x0, 0x1B6)          = -1 Err#2
   1186694 stat64("/Users/burke/code/api/api/node_modules/async/index.js\0", 0x7FFF5FBFE5B8, 0x1B6)                 = -1 Err#2
   1186712 stat64("/Users/burke/code/api/api/node_modules/async/index.json\0", 0x7FFF5FBFE5B8, 0x1B6)               = -1 Err#2
   1186727 stat64("/Users/burke/code/api/api/node_modules/async/index.node\0", 0x7FFF5FBFE5B8, 0x1B6)               = -1 Err#2
   1186742 stat64("/Users/burke/code/api/api/node_modules/async/index.coffee\0", 0x7FFF5FBFE5B8, 0x1B6)             = -1 Err#2
   1186901 stat64("/Users/burke/code/api/node_modules/async\0", 0x7FFF5FBFE608, 0x1B6)              = 0 0
   1186963 stat64("/Users/burke/code/api/node_modules/async.js\0", 0x7FFF5FBFE5B8, 0x1B6)           = -1 Err#2
   1187024 stat64("/Users/burke/code/api/node_modules/async.json\0", 0x7FFF5FBFE5B8, 0x1B6)                 = -1 Err#2
   1187050 stat64("/Users/burke/code/api/node_modules/async.node\0", 0x7FFF5FBFE5B8, 0x1B6)                 = -1 Err#2
   1187074 stat64("/Users/burke/code/api/node_modules/async.coffee\0", 0x7FFF5FBFE5B8, 0x1B6)               = -1 Err#2
      1656 __semwait_signal(0x10F, 0x0, 0x1)                = -1 Err#60
   1187215 open("/Users/burke/code/api/node_modules/async/package.json\0"0x0, 0x1B6)              = 11 0

That's a lot of wasted open()'s, and that's just to find one dependency.2 To load the single model, node had to open and read 300 different files,3 and every require which wasn't a relative dependency did the same find-the-node-module-folder dance. The documentation seems to indicate this is desired behavior, and it doesn't seem like there is any way around this.

Now, failed stat's are not the entire reason require is slow, but if I am reading the timestamps correctly in the strace log, they are a fair amount of it, and the most obviously wasteful thing. I could rewrite every require to be relative, e.g. require('../../node_modules/async') but that seems cumbersome and wasteful, when I can define the exact rule I want before hand: if it's not a relative file path, look in node_modules in the top level of my project.

So that's where we are; require for a single model takes 700 milliseconds, require for all the models takes 2.5 seconds, and there don't seem to be great options for speeding that up. There's some discouraging discussion here from core developers about the possibility of speeding up module import.

You are probably saying "load fewer dependencies", and you are right and I would love to, but that is not an option at this point in time, since "dependencies" basically equals Sails, and while we are trying to move off of Sails, we're stuck on it for the time being. Where I can, I write tests that work with objects in memory and don't touch our models/controllers, but Destroy All Software videos only get you so far.

I will definitely think harder about importing another small module vs. just copying the code/license wholesale into my codebase, and I'll definitely look to add something that can warn about unused imports or variables in the code base.

I'm left concluding that importing modules in Node is dog slow. Yes, 300 imports is a lot, but a 700ms load time seems way too high. Python imports can be slow, but as far as I remember, it's mostly for things that compile C on the fly, and for everything else you can work around it by rearranging sys.path to match the most imports first (impossible here). If there's anything I can do - some kind of compile option, or saving the V8 bytecode or something, I'm open to suggestions.

1. If you follow along with the somewhat-crazy convention of crashing your Node process on an error and having a supervisor restart it, that means that your server takes 6-9 seconds of downtime as well.

2. The story gets even worse if you are writing Coffeescript, since require will also look for files matching .litcoffee and .coffee.md at every level. You can hack require.extensions to delete these keys.

3. For unknown reasons, node occasionally decided not to stat/open some imports that were require'd in the file.

Liked what you read? I am available for hire.

22 thoughts on “Node’s `require` is dog slow

  1. David Chase

    Im curious about the numbers i have benched 7 required files and their dependencies at about 300-400ms together… so i not sure how you got 700ms for one file?

    Reply
    1. kevin Post author

      Hi David – sorry for not being clear – the Drivers file required other JS files. The total number of opened files was around 300. Guessing the speed is related to the number and size of files to parse.

      Reply
  2. Alan

    It appears to be much faster on my machine, which is a fast Windows 8.1 desktop:

    Using procmon I ran some NodeJS code that requires the esprima npm package. It can’t find it in the current dir, so it goes to the parent dir, finds it in node_modules, and then finally finds the correct .js file for the library. From the first failed ::CreateFile() call to the final successful ::CreateFile() call on the correct JS file is only 1.6 milliseconds.

    Reply
    1. kevin Post author

      Hi Alan,
      Is that result for just one required file? I saw roughly the same timestamps for JS, unfortunately the file I was trying to require had over 300 includes. 1.5ms each can add up :(

      Reply
  3. Mike

    Quick question:
    You stated “I could rewrite every require to be relative, e.g. require(‘../../node_modules/async’)”

    Does simply supplying a relative path fix this issue? If so, why? #justCurious

    Great post.

    Reply
  4. Gil Tayar

    We found the same problem, and for us, load times were crucial. What we tried, and succeeded, to do, was to rewire require so that resolving the path to the module is cached, so that the next time the app loads, we load the path resolution to the module and just load it from there.
    This requires a bit of low-level require rewiring, but everyone’s doing that today. :-)

    This sped our load time significantly (although to be honest it was just a POC and we haven’t yet used it in production), especially on Windows, where this problem is significantly worse for some reason.

    Reply
  5. Tim Oxley

    There’s a great way to get to the bottom of this problem – log a new issue on the io.js issue queue along with details about the environment you’re running in – including the version of node and hardware specs. Something reproducible would be ideal.

    Reply
  6. Gleb Bahmutov

    Kevin,
    I am trying to solve this problem in user space by caching resolved module paths, see cache-require-paths. I also have done the loaded source cache separately. The caches remove a lot of stat system calls that are useless during the application’s startup. I am trying to accurately measure the impact on Sails lift off.

    Reply
  7. Travis Webb

    Hey Kevin,

    Could you be more specific about the hardware this is running on? Anywhere on the order of ~10 seconds just to `require()` files would be unprecedentedly slow in my experience.

    Another note: at Balderdash, when we are unit/integration testing our sails.js applications, we typically will use `sails.load` in our test suite for unit testing, and `sails.lift` only for integration testing. This might speed things up in some circumstances.

    We’re working on some performance documentation (https://github.com/balderdashy/sails/issues/2743#issuecomment-80672235); if you have any ideas on what you’d like to see profiled, let us know. Thanks

    – <3 the sails.js team

    Reply
  8. Finlay McWalter

    It’s really hard to see how any practical number of ENOENT stat syscalls can impose any meaningful delay.

    I wrote a little test benchmark in C which stats on 1 million non-existent files (different names each try and each run, so the fs name cache isn’t a factor).

    On an average Linux workstation, on the slowest mechanical usb disk I have, it still gets >300,000 operations per second.

    Reply
  9. Owen Allen

    I’ve been tackling this problem a lot. My application takes roughly 8 seconds to load, about 5s of that appear to be require() time and nothing else. It’s a lot.

    Here’s a question which I think is important to answer: Is it the number of files that’s loaded which causes the slowdown, or is it the size of the files?

    In example, from my tests last night loading express takes around 250ms. Generally we load modules with a single entry point such as require(“express”), if we compiled express down to a single file, how would the load time compare?

    Reply
  10. Pingback: How we took our tests from 100 seconds to 100ms

  11. Tim Popham

    I stumbled across this post while trying to figure why gulp takes so long on my system. I think that this explains it. I’m going to try `rm -rf node_modules && mkdir node_modules`, mounting a tmpfs to node_modules, and then `npm install`. Wish me luck.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Comments are heavily moderated.