Extract href from an HTML page using PHP

I am trying to extract the news headlines and the link (href) of each headline using the code below, but the link extraction is not working. It only gets the headline. Please help me find out what is wrong with the code.

Page from which I want to get the headlines and links: http://web.tmxmoney.com/news.php?qm_symbol=BCM

$dom->preserveWhiteSpace = true;
$xpath = new DOMXPath($dom);
$rows = $xpath->query('//div');

foreach ($rows as $row) {

    $cols = $row->getElementsByTagName('span');

    $newstitle = $cols->item(0)->nodeValue;

    $link = $cols->item(0)->nodeType === HTML_ELEMENT_NODE ? $cols->item(0)->getElementsByTagName('a')->item(0)->getAttribute('href') : '';

    echo $newstitle . "\n";
    echo $link . "\n\n";
}
?>

Thanks in advance for your help!

Try this:

  $xpath = new DOMXPath($dom);
  $hrefs= $xpath->query('/html/body//a');

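  for ($i = 0; $i < $hrefs->length; $i++) {
      $href = $hrefs->item($i);
      $url = $href->getAttribute('href');
      // Strip characters that are not allowed in a URL
      $url = filter_var($url, FILTER_SANITIZE_URL);

      // Print only values that validate as absolute URLs
      if (filter_var($url, FILTER_VALIDATE_URL) !== false) {
          echo $url . "\n";
      }
  }
  ?>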
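
A likely cause of the empty link in the question is the HTML_ELEMENT_NODE check: PHP's DOM extension does not define that constant (the one for element nodes is XML_ELEMENT_NODE), so the condition cannot succeed. The check is also unnecessary, because getElementsByTagName() only ever returns elements. If you want each headline together with its link, one option is to query the anchors directly and read both the text and the href from the same node. The following is only a minimal sketch: the sample markup and the '//div/span/a[@href]' expression are assumptions based on the div/span structure in the question, not the actual layout of the page.

```php
<?php
// Hypothetical sample markup; the real page structure may differ.
$html = <<<HTML
<html><body>
<div><span><a href="http://example.com/news/1">First headline</a></span></div>
<div><span>No link here</span></div>
<div><span><a href="/news/2">Second headline</a></span></div>
</body></html>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Collect headline text and href from each anchor that has an href attribute
$items = [];
foreach ($xpath->query('//div/span/a[@href]') as $a) {
    $items[] = [
        'title' => trim($a->nodeValue),
        'href'  => $a->getAttribute('href'),
    ];
}

foreach ($items as $item) {
    echo $item['title'] . "\n" . $item['href'] . "\n\n";
}
```

Spans without an anchor are skipped by the XPath expression itself, so no null check is needed inside the loop. Relative hrefs such as /news/2 are returned as written; they would fail FILTER_VALIDATE_URL, so resolve them against the site's base URL first if you need absolute links.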