[facebook] extract better titles (fixes #14156)

The extractor now tries to find the primary title, which is displayed
in bold above the previously extracted full text. If the uploader set
one, this is also the title shown at the top left when the video is
displayed fullscreen.

For videos posted to groups, the extractor still doesn't use the
given comment text as a title, and now extracts e.g. "[group name]
Public Group" instead of "[group name] has 481 members", which may
or may not be better.
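
The fallback order described above can be sketched outside of youtube-dl as follows. This is only an illustration: `extract_title` and `TITLE_PATTERNS` are hypothetical names, plain `re.search` stands in for the extractor's `_html_search_regex` helper (which does more HTML cleanup), and the sample page is invented. The two patterns are copied from the diff.

```python
import html
import re

# Patterns are tried in order. The first is the pre-existing header lookup,
# the second is the page-title lookup added by this commit.
TITLE_PATTERNS = [
    r'<h2\s+[^>]*class="uiHeaderTitle"[^>]*>([^<]*)</h2>',
    r'(?s)<title id="pageTitle"[^>]*>([^<]*)(?: \| Facebook)</title>',
]


def extract_title(webpage):
    """Return the first non-empty title found, or None (hypothetical helper)."""
    for pattern in TITLE_PATTERNS:
        match = re.search(pattern, webpage)
        if not match:
            continue
        # Decode entities and trim whitespace; an empty result falls through
        # to the next pattern, like the `if not video_title` checks.
        title = html.unescape(match.group(1)).strip()
        if title:
            return title
    return None


# Invented sample markup, not a real Facebook page.
sample = ('<html><head><title id="pageTitle">My Holiday Clip | Facebook'
          '</title></head></html>')
print(extract_title(sample))
```

With the invented sample, the header pattern finds nothing, so the page-title pattern supplies the result with the trailing " | Facebook" removed.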
Moritz Barsnick 2019-01-03 17:00:28 +01:00
parent d7c3af7a72
commit 9dfd27538e
1 changed file with 4 additions and 0 deletions

@@ -410,6 +410,10 @@ class FacebookIE(InfoExtractor):
         video_title = self._html_search_regex(
             r'<h2\s+[^>]*class="uiHeaderTitle"[^>]*>([^<]*)</h2>', webpage,
             'title', default=None)
+        if not video_title:
+            video_title = self._html_search_regex(
+                r'(?s)<title id="pageTitle"[^>]*>([^<]*)(?: \| Facebook)</title>',
+                webpage, 'title', default=None)
         if not video_title:
             video_title = self._html_search_regex(
                 r'(?s)<span class="fbPhotosPhotoCaption".*?id="fbPhotoPageCaption"><span class="hasCaption">(.*?)</span>',