Merge pull request #6 from ytdl-org/master

update
2024-12-25 15:57:55 +01:00 · 2019-08-28 16:32:52 +02:00 · 2019-08-28 16:32:52 +02:00 · 70a1d4caa7
commit 70a1d4caa7
parent 9a6d52db94 b500955a58
18 changed files with 234 additions and 68 deletions
--- a/.github/ISSUE_TEMPLATE/1_broken_site.md
+++ b/.github/ISSUE_TEMPLATE/1_broken_site.md
@ -18,7 +18,7 @@ title: ''

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.08.02. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.08.13. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
 - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -26,7 +26,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->

 - [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2019.08.02**
+- [ ] I've verified that I'm running youtube-dl version **2019.08.13**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar issues including closed ones
@ -41,7 +41,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
- [debug] youtube-dl version 2019.08.02
+ [debug] youtube-dl version 2019.08.13
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/.github/ISSUE_TEMPLATE/2_site_support_request.md
+++ b/.github/ISSUE_TEMPLATE/2_site_support_request.md
@ -19,7 +19,7 @@ labels: 'site-support-request'

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.08.02. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.08.13. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
 - Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->

 - [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2019.08.02**
+- [ ] I've verified that I'm running youtube-dl version **2019.08.13**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that none of provided URLs violate any copyrights
 - [ ] I've searched the bugtracker for similar site support requests including closed ones
--- a/.github/ISSUE_TEMPLATE/3_site_feature_request.md
+++ b/.github/ISSUE_TEMPLATE/3_site_feature_request.md
@ -18,13 +18,13 @@ title: ''

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.08.02. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.08.13. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
 - Finally, put x into all relevant boxes (like this [x])
 -->

 - [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2019.08.02**
+- [ ] I've verified that I'm running youtube-dl version **2019.08.13**
 - [ ] I've searched the bugtracker for similar site feature requests including closed ones


--- a/.github/ISSUE_TEMPLATE/4_bug_report.md
+++ b/.github/ISSUE_TEMPLATE/4_bug_report.md
@ -18,7 +18,7 @@ title: ''

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.08.02. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.08.13. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
 - Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
 - Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
@ -27,7 +27,7 @@ Carefully read and work through this check list in order to prevent the most com
 -->

 - [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2019.08.02**
+- [ ] I've verified that I'm running youtube-dl version **2019.08.13**
 - [ ] I've checked that all provided URLs are alive and playable in a browser
 - [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
 - [ ] I've searched the bugtracker for similar bug reports including closed ones
@ -43,7 +43,7 @@ Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <
 [debug] User config: []
 [debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
 [debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
- [debug] youtube-dl version 2019.08.02
+ [debug] youtube-dl version 2019.08.13
 [debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
 [debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
 [debug] Proxy map: {}
--- a/.github/ISSUE_TEMPLATE/5_feature_request.md
+++ b/.github/ISSUE_TEMPLATE/5_feature_request.md
@ -19,13 +19,13 @@ labels: 'request'

 <!--
 Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.08.02. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
+- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2019.08.13. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
 - Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
 - Finally, put x into all relevant boxes (like this [x])
 -->

 - [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2019.08.02**
+- [ ] I've verified that I'm running youtube-dl version **2019.08.13**
 - [ ] I've searched the bugtracker for similar feature requests including closed ones


--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@ -339,6 +339,72 @@ Incorrect:
 'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
 ```

+### Inline values
+
+Extracting variables is acceptable for reducing code duplication and improving readability of complex expressions. However, you should avoid extracting variables used only once and moving them to opposite parts of the extractor file, which makes reading the linear flow difficult.
+
+#### Example
+
+Correct:
+
+```python
+title = self._html_search_regex(r'<title>([^<]+)</title>', webpage, 'title')
+```
+
+Incorrect:
+
+```python
+TITLE_RE = r'<title>([^<]+)</title>'
+# ...some lines of code...
+title = self._html_search_regex(TITLE_RE, webpage, 'title')
+```
+
+### Collapse fallbacks
+
+Multiple fallback values can quickly become unwieldy. Collapse multiple fallback values into a single expression via a list of patterns.
+
+#### Example
+
+Good:
+
+```python
+description = self._html_search_meta(
+    ['og:description', 'description', 'twitter:description'],
+    webpage, 'description', default=None)
+```
+
+Unwieldy:
+
+```python
+description = (
+    self._og_search_description(webpage, default=None)
+    or self._html_search_meta('description', webpage, default=None)
+    or self._html_search_meta('twitter:description', webpage, default=None))
+```
+
+Methods supporting list of patterns are: `_search_regex`, `_html_search_regex`, `_og_search_property`, `_html_search_meta`.
+
+### Trailing parentheses
+
+Always move trailing parentheses after the last argument.
+
+#### Example
+
+Correct:
+
+```python
+    lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
+    list)
+```
+
+Incorrect:
+
+```python
+    lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
+    list,
+)
+```
+
 ### Use convenience conversion and parsing functions

 Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
--- a/16
+++ b/16
@ -1,3 +1,19 @@
+version 2019.08.13
+
+Core
+* [downloader/fragment] Fix ETA calculation of resumed download (#21992)
+* [YoutubeDL] Check annotations availability (#18582)
+
+Extractors
+* [youtube:playlist] Improve flat extraction (#21927)
+* [youtube] Fix annotations extraction (#22045)
+ [discovery] Extract series meta field (#21808)
+* [youtube] Improve error detection (#16445)
+* [vimeo] Fix album extraction (#1933, #15704, #15855, #18967, #21986)
+ [roosterteeth] Add support for watch URLs
+* [discovery] Limit video data by show slug (#21980)
+
+
 version 2019.08.02

 Extractors
--- a/youtube_dl/YoutubeDL.py
+++ b/youtube_dl/YoutubeDL.py
@ -1783,6 +1783,8 @@ class YoutubeDL(object):
            annofn = replace_extension(filename, 'annotations.xml', info_dict.get('ext'))
            if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(annofn)):
                self.to_screen('[info] Video annotations are already present')
+            elif not info_dict.get('annotations'):
+                self.report_warning('There are no annotations to write.')
            else:
                try:
                    self.to_screen('[info] Writing video annotations to: ' + annofn)
--- a/youtube_dl/downloader/fragment.py
+++ b/youtube_dl/downloader/fragment.py
@ -190,12 +190,13 @@ class FragmentFD(FileDownloader):
        })

    def _start_frag_download(self, ctx):
+        resume_len = ctx['complete_frags_downloaded_bytes']
        total_frags = ctx['total_frags']
        # This dict stores the download progress, it's updated by the progress
        # hook
        state = {
            'status': 'downloading',
-            'downloaded_bytes': ctx['complete_frags_downloaded_bytes'],
+            'downloaded_bytes': resume_len,
            'fragment_index': ctx['fragment_index'],
            'fragment_count': total_frags,
            'filename': ctx['filename'],
@ -234,8 +235,8 @@ class FragmentFD(FileDownloader):
                state['downloaded_bytes'] += frag_downloaded_bytes - ctx['prev_frag_downloaded_bytes']
                if not ctx['live']:
                    state['eta'] = self.calc_eta(
-                        start, time_now, estimated_size,
-                        state['downloaded_bytes'])
+                        start, time_now, estimated_size - resume_len,
+                        state['downloaded_bytes'] - resume_len)
                state['speed'] = s.get('speed') or ctx.get('speed')
                ctx['speed'] = state['speed']
                ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes
--- a/youtube_dl/extractor/bbc.py
+++ b/youtube_dl/extractor/bbc.py
@ -40,6 +40,7 @@ class BBCCoUkIE(InfoExtractor):
                            iplayer(?:/[^/]+)?/(?:episode/|playlist/)|
                            music/(?:clips|audiovideo/popular)[/#]|
                            radio/player/|
+                            sounds/play/|
                            events/[^/]+/play/[^/]+/
                        )
                        (?P<id>%s)(?!/(?:episodes|broadcasts|clips))
@ -70,7 +71,7 @@ class BBCCoUkIE(InfoExtractor):
            'info_dict': {
                'id': 'b039d07m',
                'ext': 'flv',
-                'title': 'Leonard Cohen, Kaleidoscope - BBC Radio 4',
+                'title': 'Kaleidoscope, Leonard Cohen',
                'description': 'The Canadian poet and songwriter reflects on his musical career.',
            },
            'params': {
@ -220,6 +221,20 @@ class BBCCoUkIE(InfoExtractor):
                # rtmp download
                'skip_download': True,
            },
+        }, {
+            'url': 'https://www.bbc.co.uk/sounds/play/m0007jzb',
+            'note': 'Audio',
+            'info_dict': {
+                'id': 'm0007jz9',
+                'ext': 'mp4',
+                'title': 'BBC Proms, 2019, Prom 34: West–Eastern Divan Orchestra',
+                'description': "Live BBC Proms. West–Eastern Divan Orchestra with Daniel Barenboim and Martha Argerich.",
+                'duration': 9840,
+            },
+            'params': {
+                # rtmp download
+                'skip_download': True,
+            }
        }, {
            'url': 'http://www.bbc.co.uk/iplayer/playlist/p01dvks4',
            'only_matching': True,
@ -609,7 +624,7 @@ class BBCIE(BBCCoUkIE):
        'url': 'http://www.bbc.com/news/world-europe-32668511',
        'info_dict': {
            'id': 'world-europe-32668511',
-            'title': 'Russia stages massive WW2 parade despite Western boycott',
+            'title': 'Russia stages massive WW2 parade',
            'description': 'md5:00ff61976f6081841f759a08bf78cc9c',
        },
        'playlist_count': 2,
--- a/youtube_dl/extractor/discovery.py
+++ b/youtube_dl/extractor/discovery.py
@ -94,6 +94,8 @@ class DiscoveryIE(DiscoveryGoBaseIE):
                self._API_BASE_URL + 'content/videos',
                display_id, 'Downloading content JSON metadata',
                headers=headers, query={
+                    'embed': 'show.name',
+                    'fields': 'authenticated,description.detailed,duration,episodeNumber,id,name,parental.rating,season.number,show,tags',
                    'slug': display_id,
                    'show_slug': show_slug,
                })[0]
--- a/youtube_dl/extractor/einthusan.py
+++ b/youtube_dl/extractor/einthusan.py
@ -19,7 +19,7 @@ from ..utils import (


 class EinthusanIE(InfoExtractor):
-    _VALID_URL = r'https?://(?P<host>einthusan\.(?:tv|com))/movie/watch/(?P<id>[^/?#&]+)'
+    _VALID_URL = r'https?://(?P<host>einthusan\.(?:tv|com|ca))/movie/watch/(?P<id>[^/?#&]+)'
    _TESTS = [{
        'url': 'https://einthusan.tv/movie/watch/9097/',
        'md5': 'ff0f7f2065031b8a2cf13a933731c035',
@ -36,6 +36,9 @@ class EinthusanIE(InfoExtractor):
    }, {
        'url': 'https://einthusan.com/movie/watch/9097/',
        'only_matching': True,
+    }, {
+        'url': 'https://einthusan.ca/movie/watch/4E9n/?lang=hindi',
+        'only_matching': True,
    }]

    # reversed from jsoncrypto.prototype.decrypt() in einthusan-PGMovieWatcher.js
--- a/youtube_dl/extractor/openload.py
+++ b/youtube_dl/extractor/openload.py
@ -243,7 +243,12 @@ class PhantomJSwrapper(object):


 class OpenloadIE(InfoExtractor):
-    _DOMAINS = r'(?:openload\.(?:co|io|link|pw)|oload\.(?:tv|best|biz|stream|site|xyz|win|download|cloud|cc|icu|fun|club|info|press|pw|life|live|space|services|website)|oladblock\.(?:services|xyz|me)|openloed\.co)'
+    _DOMAINS = r'''(?x)
+                    (?:
+                        openload\.(?:co|io|link|pw)|
+                        oload\.(?:tv|best|biz|stream|site|xyz|win|download|cloud|cc|icu|fun|club|info|press|pw|life|live|space|services|website|vip)|
+                        oladblock\.(?:services|xyz|me)|openloed\.co)
+                    '''
    _VALID_URL = r'''(?x)
                    https?://
                        (?P<host>
@ -383,6 +388,9 @@ class OpenloadIE(InfoExtractor):
    }, {
        'url': 'https://openloed.co/f/b8NWEgkqNLI/',
        'only_matching': True,
+    }, {
+        'url': 'https://oload.vip/f/kUEfGclsU9o',
+        'only_matching': True,
    }]

    @classmethod
--- a/youtube_dl/extractor/piksel.py
+++ b/youtube_dl/extractor/piksel.py
@ -18,15 +18,14 @@ class PikselIE(InfoExtractor):
    _VALID_URL = r'https?://player\.piksel\.com/v/(?P<id>[a-z0-9]+)'
    _TESTS = [
        {
-            'url': 'http://player.piksel.com/v/nv60p12f',
-            'md5': 'd9c17bbe9c3386344f9cfd32fad8d235',
+            'url': 'http://player.piksel.com/v/ums2867l',
+            'md5': '34e34c8d89dc2559976a6079db531e85',
            'info_dict': {
-                'id': 'nv60p12f',
+                'id': 'ums2867l',
                'ext': 'mp4',
-                'title': 'فن الحياة  - الحلقة 1',
-                'description': 'احدث برامج الداعية الاسلامي " مصطفي حسني " فى رمضان 2016علي النهار نور',
-                'timestamp': 1465231790,
-                'upload_date': '20160606',
+                'title': 'GX-005 with Caption',
+                'timestamp': 1481335659,
+                'upload_date': '20161210'
            }
        },
        {
@ -39,7 +38,7 @@ class PikselIE(InfoExtractor):
                'title': 'WAW- State of Washington vs. Donald J. Trump, et al',
                'description': 'State of Washington vs. Donald J. Trump, et al, Case Number 17-CV-00141-JLR, TRO Hearing, Civil Rights Case, 02/3/2017, 1:00 PM (PST), Seattle Federal Courthouse, Seattle, WA, Judge James L. Robart presiding.',
                'timestamp': 1486171129,
-                'upload_date': '20170204',
+                'upload_date': '20170204'
            }
        }
    ]
@ -113,6 +112,13 @@ class PikselIE(InfoExtractor):
            })
        self._sort_formats(formats)

+        subtitles = {}
+        for caption in video_data.get('captions', []):
+            caption_url = caption.get('url')
+            if caption_url:
+                subtitles.setdefault(caption.get('locale', 'en'), []).append({
+                    'url': caption_url})
+
        return {
            'id': video_id,
            'title': title,
@ -120,4 +126,5 @@ class PikselIE(InfoExtractor):
            'thumbnail': video_data.get('thumbnailUrl'),
            'timestamp': parse_iso8601(video_data.get('dateadd')),
            'formats': formats,
+            'subtitles': subtitles,
        }
--- a/youtube_dl/extractor/safari.py
+++ b/youtube_dl/extractor/safari.py
@ -68,9 +68,10 @@ class SafariBaseIE(InfoExtractor):
            raise ExtractorError(
                'Unable to login: %s' % credentials, expected=True)

-        # oreilly serves two same groot_sessionid cookies in Set-Cookie header
-        # and expects first one to be actually set
-        self._apply_first_set_cookie_header(urlh, 'groot_sessionid')
+        # oreilly serves two same instances of the following cookies
+        # in Set-Cookie header and expects first one to be actually set
+        for cookie in ('groot_sessionid', 'orm-jwt', 'orm-rt'):
+            self._apply_first_set_cookie_header(urlh, cookie)

        _, urlh = self._download_webpage_handle(
            auth.get('redirect_uri') or next_uri, None, 'Completing login',)
--- a/youtube_dl/extractor/usanetwork.py
+++ b/youtube_dl/extractor/usanetwork.py
@ -1,11 +1,9 @@
 # coding: utf-8
 from __future__ import unicode_literals

-import re
-
 from .adobepass import AdobePassIE
 from ..utils import (
-    extract_attributes,
+    NO_DEFAULT,
    smuggle_url,
    update_url_query,
 )
@ -31,22 +29,22 @@ class USANetworkIE(AdobePassIE):
        display_id = self._match_id(url)
        webpage = self._download_webpage(url, display_id)

-        player_params = extract_attributes(self._search_regex(
-            r'(<div[^>]+data-usa-tve-player-container[^>]*>)', webpage, 'player params'))
-        video_id = player_params['data-mpx-guid']
-        title = player_params['data-episode-title']
+        def _x(name, default=NO_DEFAULT):
+            return self._search_regex(
+                r'data-%s\s*=\s*(["\'])(?P<value>(?:(?!\1).)+)\1' % name,
+                webpage, name, default=default, group='value')

-        account_pid, path = re.search(
-            r'data-src="(?:https?)?//player\.theplatform\.com/p/([^/]+)/.*?/(media/guid/\d+/\d+)',
-            webpage).groups()
+        video_id = _x('mpx-guid')
+        title = _x('episode-title')
+        mpx_account_id = _x('mpx-account-id', '2304992029')

        query = {
            'mbr': 'true',
        }
-        if player_params.get('data-is-full-episode') == '1':
+        if _x('is-full-episode', None) == '1':
            query['manifest'] = 'm3u'

-        if player_params.get('data-entitlement') == 'auth':
+        if _x('is-entitlement', None) == '1':
            adobe_pass = {}
            drupal_settings = self._search_regex(
                r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);',
@ -57,7 +55,7 @@ class USANetworkIE(AdobePassIE):
                    adobe_pass = drupal_settings.get('adobePass', {})
            resource = self._get_mvpd_resource(
                adobe_pass.get('adobePassResourceId', 'usa'),
-                title, video_id, player_params.get('data-episode-rating', 'TV-14'))
+                title, video_id, _x('episode-rating', 'TV-14'))
            query['auth'] = self._extract_mvpd_auth(
                url, video_id, adobe_pass.get('adobePassRequestorId', 'usa'), resource)

@ -65,11 +63,11 @@ class USANetworkIE(AdobePassIE):
        info.update({
            '_type': 'url_transparent',
            'url': smuggle_url(update_url_query(
-                'http://link.theplatform.com/s/%s/%s' % (account_pid, path),
+                'http://link.theplatform.com/s/HNK2IC/media/guid/%s/%s' % (mpx_account_id, video_id),
                query), {'force_smil_url': True}),
            'id': video_id,
            'title': title,
-            'series': player_params.get('data-show-title'),
+            'series': _x('show-title', None),
            'episode': title,
            'ie_key': 'ThePlatform',
        })
--- a/youtube_dl/extractor/youtube.py
+++ b/youtube_dl/extractor/youtube.py
@ -31,6 +31,7 @@ from ..utils import (
    clean_html,
    dict_get,
    error_to_compat_str,
+    extract_attributes,
    ExtractorError,
    float_or_none,
    get_element_by_attribute,
@ -324,17 +325,18 @@ class YoutubePlaylistBaseInfoExtractor(YoutubeEntryListBaseInfoExtractor):
        for video_id, video_title in self.extract_videos_from_page(content):
            yield self.url_result(video_id, 'Youtube', video_id, video_title)

-    def extract_videos_from_page(self, page):
-        ids_in_page = []
-        titles_in_page = []
-        for mobj in re.finditer(self._VIDEO_RE, page):
+    def extract_videos_from_page_impl(self, video_re, page, ids_in_page, titles_in_page):
+        for mobj in re.finditer(video_re, page):
            # The link with index 0 is not the first video of the playlist (not sure if still actual)
            if 'index' in mobj.groupdict() and mobj.group('id') == '0':
                continue
            video_id = mobj.group('id')
-            video_title = unescapeHTML(mobj.group('title'))
+            video_title = unescapeHTML(
+                mobj.group('title')) if 'title' in mobj.groupdict() else None
            if video_title:
                video_title = video_title.strip()
+            if video_title == '► Play all':
+                video_title = None
            try:
                idx = ids_in_page.index(video_id)
                if video_title and not titles_in_page[idx]:
@ -342,6 +344,12 @@ class YoutubePlaylistBaseInfoExtractor(YoutubeEntryListBaseInfoExtractor):
            except ValueError:
                ids_in_page.append(video_id)
                titles_in_page.append(video_title)
+
+    def extract_videos_from_page(self, page):
+        ids_in_page = []
+        titles_in_page = []
+        self.extract_videos_from_page_impl(
+            self._VIDEO_RE, page, ids_in_page, titles_in_page)
        return zip(ids_in_page, titles_in_page)


@ -379,8 +387,10 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                            (?:www\.)?invidious\.enkirton\.net/|
                            (?:www\.)?invidious\.13ad\.de/|
                            (?:www\.)?invidious\.mastodon\.host/|
+                            (?:www\.)?invidious\.nixnet\.xyz/|
                            (?:www\.)?tube\.poal\.co/|
                            (?:www\.)?vid\.wxzm\.sx/|
+                            (?:www\.)?yt\.elukerio\.org/|
                            youtube\.googleapis\.com/)                        # the various hostnames, with wildcard subdomains
                         (?:.*?\#/)?                                          # handle anchor (#/) redirect urls
                         (?:                                                  # the various things that can precede the ID:
@ -1595,17 +1605,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
        video_id = mobj.group(2)
        return video_id

-    def _extract_annotations(self, video_id):
-        return self._download_webpage(
-            'https://www.youtube.com/annotations_invideo', video_id,
-            note='Downloading annotations',
-            errnote='Unable to download video annotations', fatal=False,
-            query={
-                'features': 1,
-                'legacy': 1,
-                'video_id': video_id,
-            })
-
    @staticmethod
    def _extract_chapters(description, duration):
        if not description:
@ -1812,10 +1811,15 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
                        break

        def extract_unavailable_message():
-            return self._html_search_regex(
-                (r'(?s)<div[^>]+id=["\']unavailable-submessage["\'][^>]+>(.+?)</div',
-                 r'(?s)<h1[^>]+id=["\']unavailable-message["\'][^>]*>(.+?)</h1>'),
-                video_webpage, 'unavailable message', default=None)
+            messages = []
+            for tag, kind in (('h1', 'message'), ('div', 'submessage')):
+                msg = self._html_search_regex(
+                    r'(?s)<{tag}[^>]+id=["\']unavailable-{kind}["\'][^>]*>(.+?)</{tag}>'.format(tag=tag, kind=kind),
+                    video_webpage, 'unavailable %s' % kind, default=None)
+                if msg:
+                    messages.append(msg)
+            if messages:
+                return '\n'.join(messages)

        if not video_info:
            unavailable_message = extract_unavailable_message()
@ -2277,7 +2281,21 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
        # annotations
        video_annotations = None
        if self._downloader.params.get('writeannotations', False):
-            video_annotations = self._extract_annotations(video_id)
+            xsrf_token = self._search_regex(
+                r'([\'"])XSRF_TOKEN\1\s*:\s*([\'"])(?P<xsrf_token>[A-Za-z0-9+/=]+)\2',
+                video_webpage, 'xsrf token', group='xsrf_token', fatal=False)
+            invideo_url = try_get(
+                player_response, lambda x: x['annotations'][0]['playerAnnotationsUrlsRenderer']['invideoUrl'], compat_str)
+            if xsrf_token and invideo_url:
+                xsrf_field_name = self._search_regex(
+                    r'([\'"])XSRF_FIELD_NAME\1\s*:\s*([\'"])(?P<xsrf_field_name>\w+)\2',
+                    video_webpage, 'xsrf field name',
+                    group='xsrf_field_name', default='session_token')
+                video_annotations = self._download_webpage(
+                    self._proto_relative_url(invideo_url),
+                    video_id, note='Downloading annotations',
+                    errnote='Unable to download video annotations', fatal=False,
+                    data=urlencode_postdata({xsrf_field_name: xsrf_token}))

        chapters = self._extract_chapters(description_original, video_duration)

@ -2435,7 +2453,8 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
                        (%(playlist_id)s)
                     )""" % {'playlist_id': YoutubeBaseInfoExtractor._PLAYLIST_ID_RE}
    _TEMPLATE_URL = 'https://www.youtube.com/playlist?list=%s'
-    _VIDEO_RE = r'href="\s*/watch\?v=(?P<id>[0-9A-Za-z_-]{11})(?:&amp;(?:[^"]*?index=(?P<index>\d+))?(?:[^>]+>(?P<title>[^<]+))?)?'
+    _VIDEO_RE_TPL = r'href="\s*/watch\?v=%s(?:&amp;(?:[^"]*?index=(?P<index>\d+))?(?:[^>]+>(?P<title>[^<]+))?)?'
+    _VIDEO_RE = _VIDEO_RE_TPL % r'(?P<id>[0-9A-Za-z_-]{11})'
    IE_NAME = 'youtube:playlist'
    _TESTS = [{
        'url': 'https://www.youtube.com/playlist?list=PLwiyx1dc3P2JR9N8gQaQN_BCvlSlap7re',
@ -2600,6 +2619,34 @@ class YoutubePlaylistIE(YoutubePlaylistBaseInfoExtractor):
    def _real_initialize(self):
        self._login()

+    def extract_videos_from_page(self, page):
+        ids_in_page = []
+        titles_in_page = []
+
+        for item in re.findall(
+                r'(<[^>]*\bdata-video-id\s*=\s*["\'][0-9A-Za-z_-]{11}[^>]+>)', page):
+            attrs = extract_attributes(item)
+            video_id = attrs['data-video-id']
+            video_title = unescapeHTML(attrs.get('data-title'))
+            if video_title:
+                video_title = video_title.strip()
+            ids_in_page.append(video_id)
+            titles_in_page.append(video_title)
+
+        # Fallback with old _VIDEO_RE
+        self.extract_videos_from_page_impl(
+            self._VIDEO_RE, page, ids_in_page, titles_in_page)
+
+        # Relaxed fallbacks
+        self.extract_videos_from_page_impl(
+            r'href="\s*/watch\?v\s*=\s*(?P<id>[0-9A-Za-z_-]{11})', page,
+            ids_in_page, titles_in_page)
+        self.extract_videos_from_page_impl(
+            r'data-video-ids\s*=\s*["\'](?P<id>[0-9A-Za-z_-]{11})', page,
+            ids_in_page, titles_in_page)
+
+        return zip(ids_in_page, titles_in_page)
+
    def _extract_mix(self, playlist_id):
        # The mixes are generated from a single video
        # the id of the playlist is just 'RD' + video_id
--- a/youtube_dl/version.py
+++ b/youtube_dl/version.py
@ -1,3 +1,3 @@
 from __future__ import unicode_literals

-__version__ = '2019.08.02'
+__version__ = '2019.08.13'