Our test environment takes 6-9 seconds to load before any tests get run. I tire of this during the ~30 times I run the test suite a day,[1] so I wanted to make it faster.
For better or worse, the API runs on Sails.js. Before running model/controller tests, a bootstrap file in our tests calls sails.lift:

require('sails').lift(function(err) {
    // Run the tests
});
This lift call generates about 400 queries against Postgres to retrieve the database schema; each one looks like this:
SELECT x.nspname || '.' || x.relname as "Table",
       x.attnum as "#",
       x.attname as "Column",
       x."Type",
       case x.attnotnull when true then 'NOT NULL' else '' end as "NULL",
       r.conname as "Constraint",
       r.contype as "C",
       r.consrc,
       fn.nspname || '.' || f.relname as "F Key",
       d.adsrc as "Default"
FROM (
    SELECT c.oid, a.attrelid, a.attnum, n.nspname, c.relname, a.attname,
           pg_catalog.format_type(a.atttypid, a.atttypmod) as "Type",
           a.attnotnull
    FROM pg_catalog.pg_attribute a, pg_namespace n, pg_class c
    WHERE a.attnum > 0
      AND NOT a.attisdropped
      AND a.attrelid = c.oid
      AND c.relkind not in ('S', 'v')
      AND c.relnamespace = n.oid
      AND n.nspname not in ('pg_catalog', 'pg_toast', 'information_schema')
) x
LEFT JOIN pg_attrdef d ON d.adrelid = x.attrelid AND d.adnum = x.attnum
LEFT JOIN pg_constraint r ON r.conrelid = x.oid AND r.conkey[1] = x.attnum
LEFT JOIN pg_class f ON r.confrelid = f.oid
LEFT JOIN pg_namespace fn ON f.relnamespace = fn.oid
WHERE x.relname = 'credits'
ORDER BY 1, 2;
I'm not really sure what those queries accomplish, since Sails seems to ignore the schema that's already in the database when generating table queries. Since we use the smallest, safest subset of the ORM we can find, I tried commenting out the sails-postgresql module code that makes those queries, to see if the tests would still pass. They did... but the load time was still slow.
The next step was to instrument the code to figure out what was taking so long. I wanted the Sails loader to log the duration of each part of the load process, but doing that by hand would have required a global variable and a whole bunch of calls to console.log. It turns out the Unix tool ts can do this for you, as long as log lines are generated at the appropriate times. Basically, it's an instant awesome tool for generating insight into a program's runtime, without needing to generate timestamps in the underlying program.
NAME
    ts - timestamp input

SYNOPSIS
    ts [-r] [-i | -s] [format]

DESCRIPTION
    ts adds a timestamp to the beginning of each line of input.
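As a rough illustration of what ts does, here's a minimal sketch in Node. The real tool accepts strftime-style formats; this sketch just uses ISO 8601, and the stamp helper name is my own invention, not part of any library.

// Minimal sketch of what `ts` does: prefix a line with a timestamp.
// The real ts takes strftime-style formats; this uses ISO 8601 instead,
// and `stamp` is a hypothetical helper name, not part of any library.
function stamp(line, now = new Date()) {
  return '[' + now.toISOString() + '] ' + line;
}

console.log(stamp('verbose: Loading app models...'));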
I set the Sails logging level to verbose and piped the output through ts '[%Y-%m-%d %H:%M:%.S]'. I pretty quickly found a culprit:
[2015-04-19 21:53:45.730679] verbose: request hook loaded successfully.
[2015-04-19 21:53:45.731032] verbose: Loading the app's models and adapters...
[2015-04-19 21:53:45.731095] verbose: Loading app models...
[2015-04-19 21:53:47.928104] verbose: Loading app adapters...
[2015-04-19 21:53:47.929343] verbose: Loading blueprint middleware...
That's a 2-second gap between loading the models and loading the adapters.
I started adding profiling to the code near the "Loading app models..." line. I expected to see that attaching custom functions (findOne, update, etc.) to the Sails models was the part that took so long. Instead I found that a module called include-all accounted for almost all of the 2.5 seconds. That module simply requires every file in the models directory, about 30 files.
Further reading/logging revealed that each require call was being executed in turn. I've found it, I thought: just use different CPUs to require them all at the same time and see a speedup. Unfortunately, require is synchronous in Node and runs on one CPU, so the process can still only perform one require at a time.
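You can watch that serialization directly by wrapping require and logging how long each call takes. This is a profiling sketch, not production code; it leans on Module.prototype.require, a semi-internal Node API, and the 10ms threshold is an arbitrary choice of mine.

// Sketch: time every require call by wrapping Module.prototype.require.
// This is a semi-internal Node API; fine for one-off profiling, not for
// production. The 10ms threshold is arbitrary.
const Module = require('module');
const originalRequire = Module.prototype.require;

Module.prototype.require = function (id) {
  const start = Date.now();
  const result = originalRequire.apply(this, arguments);
  const elapsed = Date.now() - start;
  if (elapsed > 10) {  // only log the slow ones
    console.log("require('" + id + "') took " + elapsed + 'ms');
  }
  return result;
};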
I tried loading just one model to see how slow that would be. This script took an average of 700 milliseconds to run on my high-end Macbook Pro:

var start = Date.now();
require('api/models/Drivers');
console.log(Date.now() - start);
700 milliseconds to require a model file and that file's dependencies! I can send a packet to New York and back 8 times in that amount of time. What the hell is it actually doing? For this I turned to good old strace (or dtruss, its Mac port). First, start up a shell to record syscalls for any process named node.
# You'll want to kill any other node processes before running this.
sudo dtruss -d -n 'node' > /tmp/require.log 2>&1
Open up another shell session and run the little script that calls require, prints the startup time, and exits. You should end up with a few thousand lines in /tmp/require.log. Here's what I found near the start:
1186335 stat64("/Users/burke/code/api/api/models/node_modules/async\0", 0x7FFF5FBFE608, 0x9) = -1 Err#2
1186382 stat64("/Users/burke/code/api/api/models/node_modules/async.js\0", 0x7FFF5FBFE5B8, 0x9) = -1 Err#2
1186405 stat64("/Users/burke/code/api/api/models/node_modules/async.json\0", 0x7FFF5FBFE5B8, 0x9) = -1 Err#2
1186423 stat64("/Users/burke/code/api/api/models/node_modules/async.node\0", 0x7FFF5FBFE5B8, 0x9) = -1 Err#2
1186438 stat64("/Users/burke/code/api/api/models/node_modules/async.coffee\0", 0x7FFF5FBFE5B8, 0x9) = -1 Err#2
1186473 open("/Users/burke/code/api/api/models/node_modules/async/package.json\0", 0x0, 0x1B6) = -1 Err#2
1186501 stat64("/Users/burke/code/api/api/models/node_modules/async/index.js\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186519 stat64("/Users/burke/code/api/api/models/node_modules/async/index.json\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186534 stat64("/Users/burke/code/api/api/models/node_modules/async/index.node\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186554 stat64("/Users/burke/code/api/api/models/node_modules/async/index.coffee\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186580 stat64("/Users/burke/code/api/api/node_modules/async\0", 0x7FFF5FBFE608, 0x1B6) = -1 Err#2
1186598 stat64("/Users/burke/code/api/api/node_modules/async.js\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186614 stat64("/Users/burke/code/api/api/node_modules/async.json\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186630 stat64("/Users/burke/code/api/api/node_modules/async.node\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186645 stat64("/Users/burke/code/api/api/node_modules/async.coffee\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186670 open("/Users/burke/code/api/api/node_modules/async/package.json\0", 0x0, 0x1B6) = -1 Err#2
1186694 stat64("/Users/burke/code/api/api/node_modules/async/index.js\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186712 stat64("/Users/burke/code/api/api/node_modules/async/index.json\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186727 stat64("/Users/burke/code/api/api/node_modules/async/index.node\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186742 stat64("/Users/burke/code/api/api/node_modules/async/index.coffee\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1186901 stat64("/Users/burke/code/api/node_modules/async\0", 0x7FFF5FBFE608, 0x1B6) = 0 0
1186963 stat64("/Users/burke/code/api/node_modules/async.js\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1187024 stat64("/Users/burke/code/api/node_modules/async.json\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1187050 stat64("/Users/burke/code/api/node_modules/async.node\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1187074 stat64("/Users/burke/code/api/node_modules/async.coffee\0", 0x7FFF5FBFE5B8, 0x1B6) = -1 Err#2
1656 __semwait_signal(0x10F, 0x0, 0x1) = -1 Err#60
1187215 open("/Users/burke/code/api/node_modules/async/package.json\0", 0x0, 0x1B6) = 11 0
That's a lot of wasted open()s, and that's just to find one dependency.[2] To load the single model, node had to open and read 300 different files,[3] and every require that wasn't a relative dependency did the same find-the-node_modules-folder dance. The documentation seems to indicate this is desired behavior, and there doesn't seem to be any way around it.
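The lookup dance in that trace follows Node's documented resolution algorithm: walk up from the requiring file's directory, probing a node_modules folder (and every candidate extension inside it) at each level. Here's a simplified sketch of just the directory walk; the function name is mine, and the real resolver also tries each extension at every level.

// Simplified sketch of the node_modules walk behind all those failed stats:
// list every node_modules directory Node will probe, innermost first.
// (`nodeModulePaths` is a hypothetical helper, not Node's internal one.)
const path = require('path');

function nodeModulePaths(fromDir) {
  const candidates = [];
  let dir = path.resolve(fromDir);
  while (true) {
    if (path.basename(dir) !== 'node_modules') {
      candidates.push(path.join(dir, 'node_modules'));
    }
    const parent = path.dirname(dir);
    if (parent === dir) break;  // hit the filesystem root
    dir = parent;
  }
  return candidates;
}

// Matches the directories probed in the trace above, innermost first:
// api/models/node_modules, api/api/node_modules, api/node_modules, ...
console.log(nodeModulePaths('/Users/burke/code/api/api/models'));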
Now, failed stats are not the entire reason require is slow, but if I am reading the timestamps in the strace log correctly, they account for a fair amount of it, and they are the most obviously wasteful part. I could rewrite every require to be relative, e.g. require('../../node_modules/async'), but that seems cumbersome and wasteful when I can state the exact rule I want beforehand: if it's not a relative file path, look in node_modules in the top level of my project.
So that's where we are: require for a single model takes 700 milliseconds, require for all the models takes 2.5 seconds, and there don't seem to be great options for speeding that up. There's some discouraging discussion here from core developers about the possibility of speeding up module import.
You are probably saying "load fewer dependencies", and you are right and I would love to, but that is not an option at this point in time, since "dependencies" basically equals Sails, and while we are trying to move off of Sails, we're stuck on it for the time being. Where I can, I write tests that work with objects in memory and don't touch our models/controllers, but Destroy All Software videos only get you so far.
I will definitely think harder about importing another small module vs. just copying the code/license wholesale into my codebase, and I'll definitely look to add something that can warn about unused imports or variables in the code base.
I'm left concluding that importing modules in Node is dog slow. Yes, 300 imports is a lot, but a 700ms load time seems way too high. Python imports can be slow too, but as far as I remember that's mostly for things that compile C on the fly, and for everything else you can work around it by rearranging sys.path to match the most common imports first (impossible here). If there's anything I can do, some kind of compile option, or saving the V8 bytecode, or something else, I'm open to suggestions.
1. If you follow along with the somewhat-crazy convention of crashing your Node process on an error and having a supervisor restart it, that means that your server takes 6-9 seconds of downtime as well.
2. The story gets even worse if you are writing Coffeescript, since require will also look for files matching .litcoffee and .coffee.md at every level. You can hack require.extensions to delete these keys.
3. For unknown reasons, node occasionally decided not to stat/open some imports that were require'd in the file.
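The require.extensions hack from footnote 2 can be sketched like this. Note these keys only exist if a CoffeeScript compiler has registered them, and require.extensions is deprecated (though still present), so treat this as a best-effort workaround, not a supported API.

// Footnote 2's hack: delete the CoffeeScript keys so require stops probing
// .coffee/.litcoffee/.coffee.md at every level of the lookup walk.
// require.extensions is deprecated but still present in Node.
['.coffee', '.litcoffee', '.coffee.md'].forEach(function (ext) {
  delete require.extensions[ext];
});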
I'm curious about the numbers. I have benchmarked 7 required files and their dependencies at about 300-400ms together… so I'm not sure how you got 700ms for one file?
Hi David – sorry for not being clear – the Drivers file required other JS files. The total number of opened files was around 300. Guessing the speed is related to the number and size of files to parse.
It appears to be much faster on my machine, which is a fast Windows 8.1 desktop:
Using procmon I ran some NodeJS code that requires the esprima npm package. It can’t find it in the current dir, so it goes to the parent dir, finds it in node_modules, and then finally finds the correct .js file for the library. From the first failed ::CreateFile() call to the final successful ::CreateFile() call on the correct JS file is only 1.6 milliseconds.
Hi Alan,
Is that result for just one required file? I saw roughly the same timestamps for JS, unfortunately the file I was trying to require had over 300 includes. 1.5ms each can add up :(
Just use this module to trace require times: time-require.
Quick question:
You stated “I could rewrite every require to be relative, e.g. require(‘../../node_modules/async’)”
Does simply supplying a relative path fix this issue? If so, why? #justCurious
Great post.
We found the same problem, and for us, load times were crucial. What we tried, successfully, was to rewire require so that the resolved path to each module is cached; the next time the app loads, we read the cached path resolution and load the module directly from there.
This requires a bit of low-level require rewiring, but everyone’s doing that today. :-)
This sped our load time significantly (although to be honest it was just a POC and we haven’t yet used it in production), especially on Windows, where this problem is significantly worse for some reason.
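The approach this commenter describes can be sketched by memoizing Module._resolveFilename. This is my own minimal version of the idea, not the commenter's code: _resolveFilename is an undocumented internal API, and a real implementation would also persist the cache to disk between runs so the second launch skips the stats.

// Sketch of caching resolved module paths, per the comment above.
// Module._resolveFilename is undocumented; a real implementation would
// persist resolveCache across process runs to speed up the next launch.
const Module = require('module');
const resolveCache = Object.create(null);
const originalResolve = Module._resolveFilename;

Module._resolveFilename = function (request, parent) {
  const key = request + '\u0000' + (parent ? parent.filename : '');
  if (!(key in resolveCache)) {
    resolveCache[key] = originalResolve.apply(Module, arguments);
  }
  return resolveCache[key];
};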
There’s a great way to get to the bottom of this problem – log a new issue on the io.js issue queue along with details about the environment you’re running in – including the version of node and hardware specs. Something reproducible would be ideal.
Kevin,
I am trying to solve this problem in user space by caching resolved module paths, see cache-require-paths. I also have done the loaded source cache separately. The caches remove a lot of stat system calls that are useless during the application’s startup. I am trying to accurately measure the impact on Sails lift off.
Hey Kevin,
Could you be more specific about the hardware this is running on? Anywhere on the order of ~10 seconds just to `require()` files would be unprecedentedly slow in my experience.
Another note: at Balderdash, when we are unit/integration testing our sails.js applications, we typically will use `sails.load` in our test suite for unit testing, and `sails.lift` only for integration testing. This might speed things up in some circumstances.
We’re working on some performance documentation (https://github.com/balderdashy/sails/issues/2743#issuecomment-80672235); if you have any ideas on what you’d like to see profiled, let us know. Thanks
– <3 the sails.js team
The speed of require isn’t the problem. It’s how Sails models and their dependencies are written. Try using the migrate: ‘safe’ config to disable migrations which should dramatically reduce the startup time.
https://github.com/balderdashy/sails-docs/blob/master/concepts/ORM/model-settings.md
It’s really hard to see how any practical number of ENOENT stat syscalls can impose any meaningful delay.
I wrote a little test benchmark in C which stats on 1 million non-existent files (different names each try and each run, so the fs name cache isn’t a factor).
On an average Linux workstation, on the slowest mechanical usb disk I have, it still gets >300,000 operations per second.
You can try `require(path.join(require.main.paths[0], 'async'))` instead of `require('../../node_modules/async')`. Also, I packed a Sails-based application with my nodejs compiler (EncloseJS). All modules are loaded into memory, so stat is not run. https://github.com/igorklopov/enclose/tree/master/examples/26-sails
Ok, caching just the module file paths yourself speeds up average load by 25% (across express, karma, grunt and sails); see Faster Node app require. Caching the file source does not save anything. The next major bottleneck is the actual JS code compilation.
Try whether npm dedupe solves the issue (https://docs.npmjs.com/cli/dedupe). It unravels the nested node_modules dependencies into a flat structure, so if several modules use the same dependency, it is loaded from only one source. That should remove some of the unnecessary overhead of loading the same module multiple times.
What about inlining all the dependencies in one huge file? Something like Browserify does
I am having some issues due to node’s require being so slow.
I maintain a project which uses a lot of external dependencies, and it is a CLI tool. It's taking too long to do some things that should happen instantly, and I am working on improving them.
I have created require-time to name the bastards.
https://asciinema.org/a/21275
I’ve been tackling this problem a lot. My application takes roughly 8 seconds to load, about 5s of that appear to be require() time and nothing else. It’s a lot.
Here’s a question which I think is important to answer: Is it the number of files that’s loaded which causes the slowdown, or is it the size of the files?
For example, from my tests last night, loading express takes around 250ms. Generally we load modules with a single entry point such as require("express"); if we compiled express down to a single file, how would the load time compare?
That’s a good question and I don’t know!!
I stumbled across this post while trying to figure why gulp takes so long on my system. I think that this explains it. I’m going to try `rm -rf node_modules && mkdir node_modules`, mounting a tmpfs to node_modules, and then `npm install`. Wish me luck.
FYI – console.time & console.timeEnd
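For reference, the console.time/console.timeEnd approach this commenter suggests looks like this; the label is arbitrary, and the sails require is swapped for a stdlib module here so the snippet runs anywhere.

// Timing a require with console.time/timeEnd, as the commenter suggests.
// The label is arbitrary; timeEnd prints the elapsed time for that label.
console.time('require');
// require('sails');  // substitute whatever module you want to measure
require('path');      // stdlib example so this snippet runs anywhere
console.timeEnd('require');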