[Other] A pure Python impl of TextRank for document summarization

Posted by 我在等我的永恒- on 2016-10-4 12:18:42
Python impl for TextRank

  A pure Python implementation of TextRank, based on the Mihalcea 2004 paper. Leading toward integration with the Text Summarization example by Mike Williams.
  Modifications to the original algorithm include:
  • fixed bug; see Java impl, 2008
  • use of lemmatization instead of stemming   
  • verbs included in the graph (but not in the resulting keyphrases)   
  • normalized keyphrase ranks used in summarization  
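
  The ranking idea behind the list above can be sketched in a few lines. This is a minimal illustration, not the code in this project: it assumes the text has already been lemmatized and filtered to nouns, adjectives and verbs, it uses a hand-written power iteration in place of NetworkX, and the names build_graph, pagerank, normalize and the sample tokens are all hypothetical.

```python
from collections import defaultdict


def build_graph(tokens, window=2):
    """Link each token to the tokens that follow it within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected, unweighted graph."""
    nodes = list(graph)
    if not nodes:
        return {}
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbour shares its rank equally among its own neighbours.
            incoming = sum(rank[nbr] / len(graph[nbr]) for nbr in graph[node])
            new_rank[node] = (1.0 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def normalize(rank):
    """Rescale ranks so that they sum to 1.0."""
    total = sum(rank.values())
    if total == 0:
        return {}
    return {node: score / total for node, score in rank.items()}


# Hypothetical, already-lemmatized tokens (an assumption, not project data).
tokens = ["system", "linear", "constraint", "system", "equation", "system", "bound"]
ranks = normalize(pagerank(build_graph(tokens)))
for word, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{score:.4f}\t{word}")
```

  Verbs can take part in the graph as ordinary nodes and then be dropped when keyphrases are assembled, which is one way to read the third modification listed above.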
  Dependencies and Installation

  The code here has dependencies on several other projects:
  • NLTK
  • TextBlob
  • NetworkX
  • datasketch
  To install:
      conda config --add channels https://conda.binstar.org/sloria
      conda install textblob
      sudo python -m nltk.downloader punkt
      sudo python -m nltk.downloader wordnet
      pip install datasketch -U
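
  Before running the stages, it may help to confirm that the four dependencies are importable. The following is a small sketch using only the standard library; the helper name missing_packages is hypothetical, and the check says nothing about whether the NLTK data (punkt, wordnet) has been downloaded.

```python
import importlib.util

# Import names of the dependencies listed above.
REQUIRED = ("nltk", "textblob", "networkx", "datasketch")


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```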
Example Usage

  Run a test case based on the Mihalcea paper:
      ./stage1.py dat/mih.json > out1.json
      ./stage2.py out1.json > out2.json
That test case should produce the following ranked keyphrases:
      0.2230    minimal supporting set
      0.1345    types systems
      0.1339    linear diophantine equations
      0.0802    mixed types
      0.0541    strict inequations
      0.0505    nonstrict inequations
      0.0368    linear constraints
      0.0356    natural numbers
      0.0252    corresponding algorithms
      0.0116    upper bounds
      0.0091    solutions
      0.0027    components
      0.0025    construction
      0.0014    compatibility
      0.0010    criteria
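
To work with a listing like this in Python, the lines can be read back into (rank, phrase) pairs. This is only a sketch: it assumes the listing has been captured as plain text with the rank first and the phrase after some whitespace, and parse_ranked_phrases is a hypothetical helper, not part of the project.

```python
def parse_ranked_phrases(text):
    """Parse lines of the form '<rank> <phrase>' into (rank, phrase) tuples."""
    results = []
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        try:
            score = float(parts[0])
        except ValueError:
            continue  # first field is not a number
        results.append((score, parts[1].strip()))
    return results


# The first three lines of the listing above.
sample = """0.2230    minimal supporting set
0.1345    types systems
0.1339    linear diophantine equations"""

phrases = parse_ranked_phrases(sample)
for score, phrase in phrases:
    print(score, phrase)
```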
Run another test based on the Williams talk:
      ./stage1.py dat/ars.json > out1.json
      ./stage2.py out1.json > out2.json
      ./stage3.py out1.json out2.json
Those results show a summarization similar to that shown on slide 30.
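
One way to picture how keyphrase ranks can drive a summary is to score each sentence by the ranks of the keyphrases it contains and keep the best few. The sketch below is an assumption about the general approach, not a description of stage3.py: it uses plain substring matching rather than anything from datasketch, and the function names and sample sentences are hypothetical.

```python
def score_sentences(sentences, phrase_ranks):
    """Score each sentence by the summed rank of the keyphrases found in it."""
    scored = []
    for index, sentence in enumerate(sentences):
        lowered = sentence.lower()
        score = sum(rank for phrase, rank in phrase_ranks.items() if phrase in lowered)
        scored.append((score, index, sentence))
    return scored


def summarize(sentences, phrase_ranks, limit=2):
    """Keep the `limit` highest-scoring sentences, in their original order."""
    scored = score_sentences(sentences, phrase_ranks)
    top = sorted(scored, key=lambda item: item[0], reverse=True)[:limit]
    return [sentence for _, _, sentence in sorted(top, key=lambda item: item[1])]


# Ranks taken from the first test case; the sentences are made up for illustration.
phrase_ranks = {
    "minimal supporting set": 0.2230,
    "linear constraints": 0.0368,
    "upper bounds": 0.0116,
}
sentences = [
    "Upper bounds are given for the components.",
    "We consider a minimal supporting set of solutions.",
    "The weather was pleasant.",
    "Linear constraints are checked for compatibility.",
]
summary = summarize(sentences, phrase_ranks, limit=2)
print(" ".join(summary))
```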


