Extract the first paragraph of a Wikipedia article (Python)

How can I extract the first paragraph from a Wikipedia article, using Python?

For example, for Albert Einstein
, that would be:

Albert Einstein (pronounced /ˈælbərt ˈaɪnstaɪn/; German: [ˈalbɐt ˈaɪnʃtaɪn] ( listen); 14 March 1879 – 18 April 1955) was a theoretical physicist, philosopher and author who is widely regarded as one of the most influential and iconic scientists and intellectuals of all time. A German-Swiss Nobel laureate, Einstein is often regarded as the father of modern physics.[2] He received the 1921 Nobel Prize in Physics "for his services to theoretical physics, and especially for his discovery of the law of the photoelectric effect".[3]

Some time ago I made two classes for get Wikipedia articles in plain text. I know that they aren’t the best solution, but you can adapt it to your needs:

wikipedia.py

wiki2plain.py

You can use it like this:

from wikipedia import Wikipedia
from wiki2plain import Wiki2Plain

lang = 'simple'
wiki = Wikipedia(lang)

try:
    raw = wiki.article('Uruguay')
except:
    raw = None

if raw:
    wiki2plain = Wiki2Plain(raw)
    content = wiki2plain.text
Hello, buddy!责编内容来自:Hello, buddy! (源链) | 更多关于

阅读提示:酷辣虫无法对本内容的真实性提供任何保证,请自行验证并承担相关的风险与后果!
本站遵循[CC BY-NC-SA 4.0]。如您有版权、意见投诉等问题,请通过eMail联系我们处理。
酷辣虫 » 综合编程 » Extract the first paragraph of a Wikipedia article (Python)

喜欢 (0)or分享给?

专业 x 专注 x 聚合 x 分享 CC BY-NC-SA 4.0

使用声明 | 英豪名录