newsfetch.py

Python scraper that builds a single HTML file from a newspaper RSS feed

Dependencies

sudo apt-get install libxml2-dev libxslt-dev
sudo pip install bs4 feedparser lxml slimmer

Usage

newsfetch.py -u <rss url> -o <output filename>

Default Parameters

  • url : http://www.lemonde.fr/rss/une.xml
  • output : default.html
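
As an illustration of how these defaults could be wired up, here is a minimal argparse sketch. It is not taken from newsfetch.py: the names DEFAULT_URL, DEFAULT_OUTPUT, build_parser and parse_args are hypothetical, and the long option --output is assumed (only -u, -o and --url appear in this README).

```python
import argparse

# Default values as listed in this README.
DEFAULT_URL = "http://www.lemonde.fr/rss/une.xml"
DEFAULT_OUTPUT = "default.html"


def build_parser():
    """Build a command-line parser with the documented defaults (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="newsfetch.py",
        description="Build a single HTML file from a newspaper RSS feed.",
    )
    parser.add_argument("-u", "--url", default=DEFAULT_URL,
                        help="RSS feed URL")
    # "--output" is an assumed long form of the documented "-o" flag.
    parser.add_argument("-o", "--output", default=DEFAULT_OUTPUT,
                        help="output filename")
    return parser


def parse_args(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)
```

Calling parse_args([]) falls back to both defaults, while parse_args(["-u", "https://www.slate.fr/rss.xml"]) overrides only the feed URL.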

How it works

The feed is parsed and a list of available articles is created. For each article, the page behind the feed link is fetched automatically and the content of its <article>...</article> element is extracted.
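
The extraction step can be sketched roughly as follows. This is only an illustration, not the code of newsfetch.py: the script depends on feedparser and bs4, whereas this sketch uses the standard library's html.parser so that it runs without extra packages. Feed parsing and page download are left out, and ArticleExtractor, extract_articles and build_page (including its output layout) are hypothetical names and choices.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collect the markup found inside <article>...</article> elements."""

    def __init__(self):
        # Keep entities such as &amp; untouched so the markup is copied as-is.
        super().__init__(convert_charrefs=False)
        self.depth = 0      # nesting level of open <article> tags
        self.parts = []     # markup pieces of the article being read
        self.articles = []  # finished articles, one string each

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.depth += 1
            if self.depth == 1:
                return  # skip the outer <article> tag itself
        if self.depth:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied once, unchanged.
        if self.depth:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "article" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                self.articles.append("".join(self.parts).strip())
                self.parts = []
                return
        if self.depth:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

    def handle_entityref(self, name):
        if self.depth:
            self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        if self.depth:
            self.parts.append("&#%s;" % name)


def extract_articles(html):
    """Return the inner markup of every top-level <article> element in html."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    return parser.articles


def build_page(articles):
    """Join extracted articles into one HTML document (assumed layout)."""
    body = "\n".join("<article>%s</article>" % a for a in articles)
    return "<!DOCTYPE html>\n<html><body>\n%s\n</body></html>\n" % body
```

In a full run, extract_articles would be called once per downloaded article page, and the combined result of build_page written to the output file.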

Examples

  • ./newsfetch.py --url http://www.vice.com/fr/rss
  • ./newsfetch.py --url http://www.lemonde.fr/rss/une.xml
  • ./newsfetch.py --url https://www.slate.fr/rss.xml
  • ./newsfetch.py --url http://www.lesinrocks.com/feeds/feed-a-la-une/