lwn_weekly: fix security section articles parsing

The security section has no URLs in its article titles, so findNext() simply returns
whatever link comes next after the current tag. As a result, heavy CVE reports are downloaded
and included in the generated document, because links to them usually follow the article title.

Search only beneath the current tag with find() instead, so that only links to actual articles are picked up.
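
The difference between the two lookups can be sketched with the standard library alone. This is only an illustration under stated assumptions: the markup, the article numbers and the helper names (find_in_subtree, find_next_in_document) are hypothetical, and xml.etree.ElementTree stands in for the parser the recipe actually uses. One helper looks only at descendants of the starting tag, the other at everything that follows it in document order.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration only; this is not real LWN HTML.
PAGE = """
<div>
  <h3 id="briefs"><a href="/Articles/100/">Kernel brief</a></h3>
  <h3 id="security">Security updates for Friday</h3>
  <p><a href="/Articles/200/">CVE report</a></p>
</div>
""".strip()

ARTICLE_HREF = re.compile(r'^/Articles/')


def is_article_link(elem):
    """True for an <a> element whose href starts with /Articles/."""
    return elem.tag == 'a' and ARTICLE_HREF.match(elem.get('href', '')) is not None


def find_in_subtree(start):
    """Descendant-only search: the behaviour the commit switches to."""
    for elem in start.iter():
        if elem is not start and is_article_link(elem):
            return elem
    return None


def find_next_in_document(root, start):
    """Document-order search past `start`: the behaviour being replaced."""
    seen_start = False
    for elem in root.iter():  # iter() walks the tree in document order
        if elem is start:
            seen_start = True
            continue
        if seen_start and is_article_link(elem):
            return elem
    return None


def href_or_none(elem):
    return None if elem is None else elem.get('href')


root = ET.fromstring(PAGE)
briefs, security = root.findall('h3')

# A title that contains its own link: both searches agree.
print(href_or_none(find_in_subtree(briefs)))
print(href_or_none(find_next_in_document(root, briefs)))

# A title without a link: the document-order search runs on into the
# following paragraph, the descendant-only search finds nothing.
print(href_or_none(find_in_subtree(security)))
print(href_or_none(find_next_in_document(root, security)))
```

With a link-free title, the descendant-only search returns nothing, so the check on article_anchor that follows the lookup can decide what to do instead of following an unrelated link.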
Sergiy Kibrik 2018-05-25 23:24:41 +03:00
parent f4a3df8523
commit 986ccd6a30


@@ -114,7 +114,7 @@ class WeeklyLWN(BasicNewsRecipe):
 # Most articles have anchors in their titles, *except* the
 # security vulnerabilities
-article_anchor = curr.findNext(
+article_anchor = curr.find(
     name='a', attrs={'href': re.compile('^/Articles/')})
 if article_anchor: