Node.js: scraping a website after JavaScript has loaded the values

Probably a newbie question on nodejs/jsdom

I am trying to scrape a website using node.js. I am using jsdom and jquery to get the HTML and parse the required things. But somehow the values I am getting are not the ones shown on the website. Basically, the values are changed dynamically by JavaScript, and those are the values I want. The whole reason I was using nodejs/jsdom for scraping was that the page's JS would be executed and I would get the values after that.

Is there some way to tell jsdom to wait until the JavaScript executes? Or have I got this all wrong? I have googled a lot on this matter.

Problem courtesy of: zubinmehta


You would be better off using something like casperjs. It is a testing utility based on phantomjs. It is basically exactly like opening the page in a WebKit browser, just without the GUI. I don't think it works with node, but it should be easy enough to run a casper script and pipe the output back to node. You could write something like:

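var casper = require('casper').create({
    loadImages: true,
    loadPlugins: true,
    verbose: true,
    //logLevel: 'info',
    clientScripts: [
        // paths of any scripts to inject into the page go here
    ],
    viewportSize: {
        width: 1366,
        height: 768
    },
    pageSettings: {
        javascriptEnabled: true,
        userAgent: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.9 Safari/536.5'
    }
});

// Placeholder URL: replace with the page you want to scrape
casper.start('http://example.com/');

casper.thenEvaluate(function () {
    //javascript code to run in the scope of the page
});

// Nothing is executed until run() is called
casper.run();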
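
The "pipe the output back to node" step could look roughly like the following. This is only a sketch under a few assumptions: the casper script prints a single JSON document to stdout, the `casperjs` executable is on the PATH, and the names `runScraper` and `scrape.js` are made up for illustration.

```javascript
// Minimal sketch: run an external scraper script and read back its JSON output.
const { execFileSync } = require('child_process');

/**
 * Runs `command` with `args`, waits for it to exit, and parses its stdout
 * as JSON. Throws if the process exits with a non-zero code, takes longer
 * than 60 seconds, or prints something that is not valid JSON.
 */
function runScraper(command, args) {
  const stdout = execFileSync(command, args, {
    encoding: 'utf8', // return stdout as a string instead of a Buffer
    timeout: 60000    // give up if the scraper hangs
  });
  return JSON.parse(stdout);
}

// Hypothetical usage, assuming scrape.js prints the scraped values as JSON:
//   var values = runScraper('casperjs', ['scrape.js']);
//   console.log(values);
```

Note that `execFileSync` blocks the event loop while the scraper runs, which is fine for a one-off script; in a server you would probably want the asynchronous `execFile` instead.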

Solution courtesy of: tapan


I don’t know if you’re up for alternatives, but when I need such sensitive scraping, I just use Firefox with iMacros. It runs all browser JS just fine, because it is a browser.

Discussion courtesy of: x10

First off, how are you using jsdom? Apparently, jsdom.env does not execute scripts in the DOM, only the scripts that you add in the call to jsdom.env. If you want to execute scripts, I think you should use jsdom.jsdom.

Second, you need to specify an onload handler. This should execute after the document is ready, and hopefully any scripts will have changed the DOM to your liking.

Something like this:

var jsdom = require('jsdom').jsdom
  , document = jsdom(html)  // html is the page source as a string
  , window = document.createWindow();

document.onload = function() {
  // Do your stuff
};

Discussion courtesy of: Linus Gustav Larsson Thiel

This recipe can be found in its original form on Stack Overflow.
