
Node.js async.parallel: what are the consequences?


I have the following code:

async.series(tasks, function (err) {
    // Report a failure only when a task actually returned an error.
    if (err) return callback({message: 'tasks execution error', error: err});
    callback(null);
});

where tasks is an array of functions, each of which performs an HTTP request (using the request module) and calls the MongoDB API to store the data (in a MongoHQ instance).

With my current input (~200 tasks to execute), it takes

[normal mode] collection cycle: 1356.843 sec. (22.61405 mins.)

But simply changing series to parallel gives a huge benefit: almost the same number of tasks runs in ~30 secs instead of ~23 mins.

But, knowing that nothing comes for free, I’m trying to understand the consequences of that change. Can I assume that the number of open sockets will be much higher, that memory consumption will grow, and that the DB servers will be hit harder?

The machine I run the code on is an Ubuntu box with only 1GB of RAM, and I saw the app hang there once. Could that be caused by a lack of resources?

Problem courtesy of: Alexander Beletsky

Solution

Your intuition is correct that the parallelism doesn’t come for free, but you certainly may be able to pay for it.

Using a load testing module (or collection of modules) like nodeload, you can quantify how this parallel operation is affecting your server to determine if it is acceptable.

async.parallelLimit can be a good way of limiting server load if you need to (eachLimit has a different signature, but could be used as well), but first it is important to discover whether limiting is necessary. Testing explicitly is the best way to discover the limits of your system.
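To make the idea of a concurrency limit concrete, here is a minimal hand-rolled sketch. It is not the async library's implementation; runWithLimit and the demo tasks are hypothetical names used only for illustration, and each task is assumed to have the Node-style shape function (cb) { ... cb(err, result); }.

```javascript
// Hand-rolled sketch of a concurrency limit (not the async library's code).
// Runs the tasks with at most `limit` of them in flight at any time.
function runWithLimit(tasks, limit, done) {
  const results = new Array(tasks.length);
  let next = 0;         // index of the next task to start
  let running = 0;      // tasks started but not yet finished
  let finished = false; // set once `done` has been called

  if (tasks.length === 0) return done(null, results);

  function launch() {
    while (!finished && running < limit && next < tasks.length) {
      const index = next++;
      running++;
      tasks[index](function (err, result) {
        running--;
        if (finished) return;
        if (err) {
          finished = true;
          return done(err); // stop at the first error
        }
        results[index] = result;
        if (next === tasks.length && running === 0) {
          finished = true;
          return done(null, results);
        }
        launch(); // a slot is free again, start the next task
      });
    }
  }

  launch();
}

// Demo: ten fake I/O tasks, at most three running at once.
let active = 0;
let peak = 0;
const demoTasks = [];
for (let i = 0; i < 10; i++) {
  demoTasks.push(function (cb) {
    active++;
    peak = Math.max(peak, active);
    setTimeout(function () { active--; cb(null, i * 2); }, 5);
  });
}
runWithLimit(demoTasks, 3, function (err, results) {
  console.log('peak concurrency:', peak, 'results:', results.length);
});
```

With the async module itself, the equivalent call would be async.parallelLimit(tasks, limit, callback). A sensible value for the limit is something to find by testing, as described above.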

Beyond this, common pitfalls with async.parallel include wanting more complicated control flow than the function offers (which, from your description, doesn’t seem to apply) and naively using parallel on too large a collection (which may, say, cause you to hit your system’s file descriptor limit if you are writing many files). With your ~200 request-and-save operations on 1GB of RAM, I would imagine you would be fine as long as you aren’t doing much data massaging in the event handlers, but if you are experiencing server hangs, parallelLimit could be a good way out.

Again, testing is the best way to figure these things out.

Solution courtesy of: Wyatt

Discussion

You’ll notice the difference when multiple users connect. In that case the processor has to handle multiple operations, and async tries to give the operations of the different users roughly equal time:

T = task
U = user
(T1.U1 = task 1 of user 1)

T1.U1 => T1.U2 => T2.U1 => T8.U3 => T2.U2 => etc.

This is the opposite of atomicity (so you may need to watch for atomicity in particular DB operations, but that’s another topic).

So it may be faster to run

T2.U1 before T1.U1

which is no problem unless

T2.U1 depends on T1.U1.

That can be prevented by using callbacks; that is what callbacks are for.
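As a small illustration of that last point, here is a sketch in which the second task starts only inside the first task's callback. The functions t1, t2 and runDependent are hypothetical stand-ins for one user's tasks, with timers in place of real I/O.

```javascript
// Hypothetical tasks for one user: T2 needs the value produced by T1.
function t1(cb) {
  setTimeout(function () { cb(null, 21); }, 5);
}

function t2(input, cb) {
  setTimeout(function () { cb(null, input * 2); }, 1);
}

// T2 is started from T1's callback, so it can never run first,
// even though its own timer is shorter.
function runDependent(done) {
  t1(function (err, value) {
    if (err) return done(err);
    t2(value, done);
  });
}

runDependent(function (err, result) {
  console.log(err, result);
});
```

Tasks that do not depend on each other can still be interleaved freely; only the dependent pair needs this ordering.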

…hope this is what you wanted to know… it’s a bit late here.

Discussion courtesy of: Pika

I would point out that async.parallel executes multiple functions concurrently, not (completely) in parallel. It is more like virtual parallelism.

Executing concurrently is like running different programs on a single CPU core via multitasking/scheduling. True parallel execution would be running a different program on each core of a multi-core CPU. This matters because node.js has a single-threaded architecture.
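The distinction can be seen in a small sketch. Two fake I/O operations (timers standing in for HTTP or database calls; fakeIo and interleavingDemo are hypothetical names) are both started before either finishes, and their callbacks then run one at a time on the same thread.

```javascript
// Simulated I/O: logs when the operation starts and when it completes.
function fakeIo(log, name, ms, cb) {
  log.push('start ' + name);
  setTimeout(function () {
    log.push('end ' + name);
    cb();
  }, ms);
}

// Starts two operations back to back and reports the order of events.
function interleavingDemo(done) {
  const log = [];
  let pending = 2;
  function finish() {
    pending--;
    if (pending === 0) done(log);
  }
  fakeIo(log, 'A', 20, finish); // slower operation, started first
  fakeIo(log, 'B', 10, finish); // faster operation, started second
}

interleavingDemo(function (log) {
  console.log(log.join(', '));
  // prints: start A, start B, end B, end A
});
```

Both operations are in flight at the same time, which is where the speed-up for I/O-bound work comes from, but no two callbacks ever execute simultaneously.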

The best thing about node is that you don’t have to worry about I/O. It handles I/O very efficiently.

In your case you are storing data in MongoDB, which is mostly I/O. Running the tasks in parallel will use up your network bandwidth and, if they read from or write to disk, your disk bandwidth too. Your server will not hang because of CPU overload.

The consequence is that if you overburden your server, your requests may fail. You may get an EMFILE error (Too many open files), since each socket counts as a file. Usually connections are pooled: to establish a connection, a socket is picked from the pool and returned to the pool when finished. You can increase the file descriptor limit with ulimit -n xxxx.

When overburdened you may also get socket errors such as ECONNRESET (Error: socket hang up), ECONNREFUSED or ETIMEDOUT, so handle them properly. Also check the maximum number of simultaneous connections allowed by your MongoDB server.
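One possible way to handle these errors is to retry the failed task a few times before giving up. The sketch below assumes the error object carries a code property, as Node's system errors do. The helper names isTransient and withRetry, the list of retryable codes, and the linear backoff are illustrative choices, not something the async or request modules provide.

```javascript
// Error codes named above; treating them as retryable is an assumption.
const TRANSIENT_CODES = ['ECONNRESET', 'ECONNREFUSED', 'ETIMEDOUT', 'EMFILE'];

function isTransient(err) {
  return Boolean(err) && TRANSIENT_CODES.indexOf(err.code) !== -1;
}

// Wraps a Node-style task, function (cb), so that transient failures are
// retried up to maxRetries times with a linearly growing delay.
function withRetry(task, maxRetries, delayMs) {
  return function (callback) {
    let attempt = 0;
    function run() {
      task(function (err, result) {
        if (isTransient(err) && attempt < maxRetries) {
          attempt++;
          return setTimeout(run, delayMs * attempt);
        }
        callback(err, result); // success, non-transient error, or out of retries
      });
    }
    run();
  };
}

// Demo: a fake task that fails once with ECONNRESET, then succeeds.
let demoCalls = 0;
const demoTask = function (cb) {
  demoCalls++;
  if (demoCalls === 1) {
    const err = new Error('socket hang up');
    err.code = 'ECONNRESET';
    return cb(err);
  }
  cb(null, 'stored');
};
withRetry(demoTask, 3, 10)(function (err, result) {
  console.log(err, result, 'attempts:', demoCalls);
});
```

Each entry of the tasks array could be wrapped this way before being handed to async.parallel or async.parallelLimit. Whether retrying is appropriate depends on whether repeating the request and the save is safe for your data.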

Finally, the server can hang because of garbage collection. Garbage collection kicks in once memory grows past a certain point and then runs periodically. The default maximum heap size in V8 is around 1.5 GB, so expect GC to run frequently when memory usage is high. Node will crash with a "process out of memory" error if it asks for more than that limit, so fix any memory leaks in your program. You can look at these tools.

Discussion courtesy of: user568109

The main downside you’ll see here is a spike in database server load. That may or may not be okay depending on your setup.

If your database server is a shared resource then you will probably want to limit the parallel requests by using async.eachLimit instead.

Discussion courtesy of: Daniel

This recipe can be found in its original form on Stack Overflow.
