mirror of https://github.com/kovidgoyal/calibre.git (synced 2025-06-23 15:30:45 -04:00)
lwn_weekly: fix security section articles parsing
Because the security section has no URLs in its article titles, findNext() returns whatever link is encountered next after the anchor. As a result, heavy CVE reports are downloaded and included in the generated document, since links to them are usually placed after the article title. Searching under the anchor tag only avoids this and keeps just the links to useful articles.
parent f4a3df8523
commit 986ccd6a30
@@ -114,7 +114,7 @@ class WeeklyLWN(BasicNewsRecipe):
 
 # Most articles have anchors in their titles, *except* the
 # security vulnerabilities
-article_anchor = curr.findNext(
+article_anchor = curr.find(
     name='a', attrs={'href': re.compile('^/Articles/')})
 
 if article_anchor:
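The difference between the two lookups described in the commit message can be illustrated with a small, self-contained sketch. It does not use BeautifulSoup or the recipe's real code: it uses the standard-library `xml.etree.ElementTree` as a stand-in, and the sample markup and the helper names `find_under` and `find_next` are hypothetical, chosen only to mimic a descendant-only search versus a search through everything that follows in document order.

```python
import re
import xml.etree.ElementTree as ET

ARTICLE_HREF = re.compile(r'^/Articles/')

# Hypothetical markup: the first headline carries no link of its own,
# and the next /Articles/ link in the page belongs to a CVE report.
SAMPLE = """<div>
  <h3 id="sec">Security vulnerability roundup</h3>
  <p><a href="/Articles/1001/">Full CVE report</a></p>
  <h3 id="kernel"><a href="/Articles/1002/">Kernel article</a></h3>
</div>"""


def find_under(node):
    """Search only the descendants of node (analogous to curr.find)."""
    for el in node.iter('a'):
        if el is not node and ARTICLE_HREF.match(el.get('href', '')):
            return el
    return None


def find_next(root, node):
    """Search everything after node in document order, descendants
    included (analogous to curr.findNext)."""
    seen = False
    for el in root.iter():
        if el is node:
            seen = True
            continue
        if seen and el.tag == 'a' and ARTICLE_HREF.match(el.get('href', '')):
            return el
    return None


root = ET.fromstring(SAMPLE)
sec = root.find(".//h3[@id='sec']")
kernel = root.find(".//h3[@id='kernel']")

# The forward search leaks out of the link-less headline and picks up
# the CVE report; the descendant-only search finds nothing there.
print(find_next(root, sec).get('href'))
print(find_under(sec))
print(find_under(kernel).get('href'))
```

For a headline that does contain a link, both helpers return the same element, so under these assumptions only the link-less entries are affected by restricting the search.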