diff --git a/.github/ISSUE_TEMPLATE.md b/.github/ISSUE_TEMPLATE.md deleted file mode 100644 index 6911e2d5c..000000000 --- a/.github/ISSUE_TEMPLATE.md +++ /dev/null @@ -1,62 +0,0 @@ -## Please follow the guide below - -- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly -- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`) -- Use the *Preview* tab to see what your issue will actually look like - ---- - -### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2019.04.24*. If it's not, read [this FAQ entry](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected. -- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2019.04.24** - -### Before submitting an *issue* make sure you have: -- [ ] At least skimmed through the [README](https://github.com/ytdl-org/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/ytdl-org/youtube-dl#faq) and [BUGS](https://github.com/ytdl-org/youtube-dl#bugs) sections -- [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones -- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser -- [ ] Checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/ytdl-org/youtube-dl#video-url-contains-an-ampersand-and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command) - -### What is the purpose of your *issue*? -- [ ] Bug report (encountered problems with youtube-dl) -- [ ] Site support request (request for adding support for a new site) -- [ ] Feature request (request for a new functionality) -- [ ] Question -- [ ] Other - ---- - -### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue* - ---- - -### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows: - -Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v `), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```): - -``` -[debug] System config: [] -[debug] User config: [] -[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] -[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 -[debug] youtube-dl version 2019.04.24 -[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2 -[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4 -[debug] Proxy map: {} -... - -``` - ---- - -### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**): -- Single video: https://www.youtube.com/watch?v=BaW_jenozKc -- Single video: https://youtu.be/BaW_jenozKc -- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc - -Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights. - ---- - -### Description of your *issue*, suggested solution and other information - -Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible. -If work on your *issue* requires account credentials please provide them or explain how one can obtain them. diff --git a/.github/ISSUE_TEMPLATE/1_broken_site.md b/.github/ISSUE_TEMPLATE/1_broken_site.md new file mode 100644 index 000000000..bab917400 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/1_broken_site.md @@ -0,0 +1,63 @@ +--- +name: Broken site support +about: Report broken or misfunctioning site +title: '' +--- + + + + +## Checklist + + + +- [ ] I'm reporting a broken site support +- [ ] I've verified that I'm running youtube-dl version **2019.04.24** +- [ ] I've checked that all provided URLs are alive and playable in a browser +- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped +- [ ] I've searched the bugtracker for similar issues including closed ones + + +## Verbose log + + + +``` +PASTE VERBOSE LOG HERE +``` + + +## Description + + + +WRITE DESCRIPTION HERE diff --git a/.github/ISSUE_TEMPLATE/2_site_support_request.md b/.github/ISSUE_TEMPLATE/2_site_support_request.md new file mode 100644 index 000000000..7d78921ee --- /dev/null +++ b/.github/ISSUE_TEMPLATE/2_site_support_request.md @@ -0,0 +1,54 @@ +--- +name: Site support request +about: Request support for a new site +title: '' +labels: 'site-support-request' +--- + + + + +## Checklist + + + +- [ ] I'm reporting a new site support request +- [ ] I've verified that I'm running youtube-dl version **2019.04.24** +- [ ] I've checked that all provided URLs are alive and playable in a browser +- [ ] I've checked that none of provided URLs violate any copyrights +- [ ] I've searched the bugtracker for similar site support requests including closed ones + + +## Example URLs + + + +- Single video: https://www.youtube.com/watch?v=BaW_jenozKc +- Single video: https://youtu.be/BaW_jenozKc +- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc + + +## Description + + + +WRITE DESCRIPTION HERE diff --git a/.github/ISSUE_TEMPLATE/3_site_feature_request.md b/.github/ISSUE_TEMPLATE/3_site_feature_request.md new file mode 100644 index 000000000..0ed4d1d6a --- /dev/null +++ b/.github/ISSUE_TEMPLATE/3_site_feature_request.md @@ -0,0 +1,37 @@ +--- +name: Site feature request +about: Request a new functionality for a site +title: '' +--- + + + + +## Checklist + + + +- [ ] I'm reporting a site feature request +- [ ] I've verified that I'm running youtube-dl version **2019.04.24** +- [ ] I've searched the bugtracker for similar site feature requests including closed ones + + +## Description + + + +WRITE DESCRIPTION HERE diff --git a/.github/ISSUE_TEMPLATE/4_bug_report.md b/.github/ISSUE_TEMPLATE/4_bug_report.md new file mode 100644 index 000000000..fd9b09d6f --- /dev/null +++ b/.github/ISSUE_TEMPLATE/4_bug_report.md @@ -0,0 +1,65 @@ +--- +name: Bug report +about: Report a bug unrelated to any particular site or extractor +title: '' +--- + + + + +## Checklist + + + +- [ ] I'm reporting a broken site support issue +- [ ] I've verified that I'm running youtube-dl version **2019.04.24** +- [ ] I've checked that all provided URLs are alive and playable in a browser +- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped +- [ ] I've searched the bugtracker for similar bug reports including closed ones +- [ ] I've read bugs section in FAQ + + +## Verbose log + + + +``` +PASTE VERBOSE LOG HERE +``` + + +## Description + + + +WRITE DESCRIPTION HERE diff --git a/.github/ISSUE_TEMPLATE/5_feature_request.md b/.github/ISSUE_TEMPLATE/5_feature_request.md new file mode 100644 index 000000000..94f373d4e --- /dev/null +++ b/.github/ISSUE_TEMPLATE/5_feature_request.md @@ -0,0 +1,38 @@ +--- +name: Feature request +about: Request a new functionality unrelated to any particular site or extractor +title: '' +labels: 'request' +--- + + + + +## Checklist + + + +- [ ] I'm reporting a feature request +- [ ] I've verified that I'm running youtube-dl version **2019.04.24** +- [ ] I've searched the bugtracker for similar feature requests including closed ones + + +## Description + + + +WRITE DESCRIPTION HERE diff --git a/.github/ISSUE_TEMPLATE/6_question.md b/.github/ISSUE_TEMPLATE/6_question.md new file mode 100644 index 000000000..1fd7cd5dc --- /dev/null +++ b/.github/ISSUE_TEMPLATE/6_question.md @@ -0,0 +1,38 @@ +--- +name: Ask question +about: Ask youtube-dl related question +title: '' +labels: 'question' +--- + + + + +## Checklist + + + +- [ ] I'm asking a question +- [ ] I've looked through the README and FAQ for similar questions +- [ ] I've searched the bugtracker for similar questions including closed ones + + +## Question + + + +WRITE QUESTION HERE diff --git a/.github/ISSUE_TEMPLATE_tmpl.md b/.github/ISSUE_TEMPLATE_tmpl.md deleted file mode 100644 index bbd79afb0..000000000 --- a/.github/ISSUE_TEMPLATE_tmpl.md +++ /dev/null @@ -1,62 +0,0 @@ -## Please follow the guide below - -- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly -- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`) -- Use the *Preview* tab to see what your issue will actually look like - ---- - -### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not, read [this FAQ entry](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected. -- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s** - -### Before submitting an *issue* make sure you have: -- [ ] At least skimmed through the [README](https://github.com/ytdl-org/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/ytdl-org/youtube-dl#faq) and [BUGS](https://github.com/ytdl-org/youtube-dl#bugs) sections -- [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones -- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser -- [ ] Checked that all URLs and arguments with special characters are [properly quoted or escaped](https://github.com/ytdl-org/youtube-dl#video-url-contains-an-ampersand-and-im-getting-some-strange-output-1-2839-or-v-is-not-recognized-as-an-internal-or-external-command) - -### What is the purpose of your *issue*? -- [ ] Bug report (encountered problems with youtube-dl) -- [ ] Site support request (request for adding support for a new site) -- [ ] Feature request (request for a new functionality) -- [ ] Question -- [ ] Other - ---- - -### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue* - ---- - -### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows: - -Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v `), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```): - -``` -[debug] System config: [] -[debug] User config: [] -[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj'] -[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251 -[debug] youtube-dl version %(version)s -[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2 -[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4 -[debug] Proxy map: {} -... - -``` - ---- - -### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**): -- Single video: https://www.youtube.com/watch?v=BaW_jenozKc -- Single video: https://youtu.be/BaW_jenozKc -- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc - -Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights. - ---- - -### Description of your *issue*, suggested solution and other information - -Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible. -If work on your *issue* requires account credentials please provide them or explain how one can obtain them. diff --git a/.github/ISSUE_TEMPLATE_tmpl/1_broken_site.md b/.github/ISSUE_TEMPLATE_tmpl/1_broken_site.md new file mode 100644 index 000000000..c7600d5b5 --- /dev/null +++ b/.github/ISSUE_TEMPLATE_tmpl/1_broken_site.md @@ -0,0 +1,63 @@ +--- +name: Broken site support +about: Report broken or misfunctioning site +title: '' +--- + + + + +## Checklist + + + +- [ ] I'm reporting a broken site support +- [ ] I've verified that I'm running youtube-dl version **%(version)s** +- [ ] I've checked that all provided URLs are alive and playable in a browser +- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped +- [ ] I've searched the bugtracker for similar issues including closed ones + + +## Verbose log + + + +``` +PASTE VERBOSE LOG HERE +``` + + +## Description + + + +WRITE DESCRIPTION HERE diff --git a/.github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md b/.github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md new file mode 100644 index 000000000..d4988e639 --- /dev/null +++ b/.github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md @@ -0,0 +1,54 @@ +--- +name: Site support request +about: Request support for a new site +title: '' +labels: 'site-support-request' +--- + + + + +## Checklist + + + +- [ ] I'm reporting a new site support request +- [ ] I've verified that I'm running youtube-dl version **%(version)s** +- [ ] I've checked that all provided URLs are alive and playable in a browser +- [ ] I've checked that none of provided URLs violate any copyrights +- [ ] I've searched the bugtracker for similar site support requests including closed ones + + +## Example URLs + + + +- Single video: https://www.youtube.com/watch?v=BaW_jenozKc +- Single video: https://youtu.be/BaW_jenozKc +- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc + + +## Description + + + +WRITE DESCRIPTION HERE diff --git a/.github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md b/.github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md new file mode 100644 index 000000000..65f0a32f3 --- /dev/null +++ b/.github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md @@ -0,0 +1,37 @@ +--- +name: Site feature request +about: Request a new functionality for a site +title: '' +--- + + + + +## Checklist + + + +- [ ] I'm reporting a site feature request +- [ ] I've verified that I'm running youtube-dl version **%(version)s** +- [ ] I've searched the bugtracker for similar site feature requests including closed ones + + +## Description + + + +WRITE DESCRIPTION HERE diff --git a/.github/ISSUE_TEMPLATE_tmpl/4_bug_report.md b/.github/ISSUE_TEMPLATE_tmpl/4_bug_report.md new file mode 100644 index 000000000..41fb14b72 --- /dev/null +++ b/.github/ISSUE_TEMPLATE_tmpl/4_bug_report.md @@ -0,0 +1,65 @@ +--- +name: Bug report +about: Report a bug unrelated to any particular site or extractor +title: '' +--- + + + + +## Checklist + + + +- [ ] I'm reporting a broken site support issue +- [ ] I've verified that I'm running youtube-dl version **%(version)s** +- [ ] I've checked that all provided URLs are alive and playable in a browser +- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped +- [ ] I've searched the bugtracker for similar bug reports including closed ones +- [ ] I've read bugs section in FAQ + + +## Verbose log + + + +``` +PASTE VERBOSE LOG HERE +``` + + +## Description + + + +WRITE DESCRIPTION HERE diff --git a/.github/ISSUE_TEMPLATE_tmpl/5_feature_request.md b/.github/ISSUE_TEMPLATE_tmpl/5_feature_request.md new file mode 100644 index 000000000..b3431a7f0 --- /dev/null +++ b/.github/ISSUE_TEMPLATE_tmpl/5_feature_request.md @@ -0,0 +1,38 @@ +--- +name: Feature request +about: Request a new functionality unrelated to any particular site or extractor +title: '' +labels: 'request' +--- + + + + +## Checklist + + + +- [ ] I'm reporting a feature request +- [ ] I've verified that I'm running youtube-dl version **%(version)s** +- [ ] I've searched the bugtracker for similar feature requests including closed ones + + +## Description + + + +WRITE DESCRIPTION HERE diff --git a/Makefile b/Makefile index 4a62f44bc..3e17365b8 100644 --- a/Makefile +++ b/Makefile @@ -1,7 +1,7 @@ all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites clean: - rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe + rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe find . -name "*.pyc" -delete find . -name "*.class" -delete @@ -78,8 +78,12 @@ README.md: youtube_dl/*.py youtube_dl/*/*.py CONTRIBUTING.md: README.md $(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md -.github/ISSUE_TEMPLATE.md: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md youtube_dl/version.py - $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md .github/ISSUE_TEMPLATE.md +issuetemplates: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md youtube_dl/version.py + $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE/1_broken_site.md + $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE/2_site_support_request.md + $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md + $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE/4_bug_report.md + $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md .github/ISSUE_TEMPLATE/5_feature_request.md supportedsites: $(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md diff --git a/devscripts/release.sh b/devscripts/release.sh index 4c413bf6d..f2411c927 100755 --- a/devscripts/release.sh +++ b/devscripts/release.sh @@ -78,8 +78,8 @@ sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py sed -i "s//$version/" ChangeLog /bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..." -make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites -git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py ChangeLog +make README.md CONTRIBUTING.md issuetemplates supportedsites +git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE/1_broken_site.md .github/ISSUE_TEMPLATE/2_site_support_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md .github/ISSUE_TEMPLATE/4_bug_report.md .github/ISSUE_TEMPLATE/5_feature_request.md .github/ISSUE_TEMPLATE/6_question.md docs/supportedsites.md youtube_dl/version.py ChangeLog git commit $gpg_sign_commits -m "release $version" /bin/echo -e "\n### Now tagging, signing and pushing..." diff --git a/youtube_dl/extractor/adn.py b/youtube_dl/extractor/adn.py index 923c351e4..c95ad2173 100644 --- a/youtube_dl/extractor/adn.py +++ b/youtube_dl/extractor/adn.py @@ -65,14 +65,15 @@ class ADNIE(InfoExtractor): if subtitle_location: enc_subtitles = self._download_webpage( urljoin(self._BASE_URL, subtitle_location), - video_id, 'Downloading subtitles data', fatal=False) + video_id, 'Downloading subtitles data', fatal=False, + headers={'Origin': 'https://animedigitalnetwork.fr'}) if not enc_subtitles: return None # http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js dec_subtitles = intlist_to_bytes(aes_cbc_decrypt( bytes_to_intlist(compat_b64decode(enc_subtitles[24:])), - bytes_to_intlist(binascii.unhexlify(self._K + '4421de0a5f0814ba')), + bytes_to_intlist(binascii.unhexlify(self._K + '4b8ef13ec1872730')), bytes_to_intlist(compat_b64decode(enc_subtitles[:24])) )) subtitles_json = self._parse_json( diff --git a/youtube_dl/extractor/ccc.py b/youtube_dl/extractor/ccc.py index 734702144..36e6dff72 100644 --- a/youtube_dl/extractor/ccc.py +++ b/youtube_dl/extractor/ccc.py @@ -1,9 +1,12 @@ +# coding: utf-8 from __future__ import unicode_literals from .common import InfoExtractor from ..utils import ( int_or_none, parse_iso8601, + try_get, + url_or_none, ) @@ -18,11 +21,13 @@ class CCCIE(InfoExtractor): 'id': '1839', 'ext': 'mp4', 'title': 'Introduction to Processor Design', + 'creator': 'byterazor', 'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac', 'thumbnail': r're:^https?://.*\.jpg$', 'upload_date': '20131228', 'timestamp': 1388188800, 'duration': 3710, + 'tags': list, } }, { 'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download', @@ -68,6 +73,7 @@ class CCCIE(InfoExtractor): 'id': event_id, 'display_id': display_id, 'title': event_data['title'], + 'creator': try_get(event_data, lambda x: ', '.join(x['persons'])), 'description': event_data.get('description'), 'thumbnail': event_data.get('thumb_url'), 'timestamp': parse_iso8601(event_data.get('date')), @@ -75,3 +81,31 @@ class CCCIE(InfoExtractor): 'tags': event_data.get('tags'), 'formats': formats, } + + +class CCCPlaylistIE(InfoExtractor): + IE_NAME = 'media.ccc.de:lists' + _VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/c/(?P[^/?#&]+)' + _TESTS = [{ + 'url': 'https://media.ccc.de/c/30c3', + 'info_dict': { + 'title': '30C3', + 'id': '30c3', + }, + 'playlist_count': 135, + }] + + def _real_extract(self, url): + playlist_id = self._match_id(url).lower() + + conf = self._download_json( + 'https://media.ccc.de/public/conferences/' + playlist_id, + playlist_id) + + entries = [] + for e in conf['events']: + event_url = url_or_none(e.get('frontend_link')) + if event_url: + entries.append(self.url_result(event_url, ie=CCCIE.ie_key())) + + return self.playlist_result(entries, playlist_id, conf.get('title')) diff --git a/youtube_dl/extractor/cinemax.py b/youtube_dl/extractor/cinemax.py new file mode 100644 index 000000000..7f89d33de --- /dev/null +++ b/youtube_dl/extractor/cinemax.py @@ -0,0 +1,29 @@ +# coding: utf-8 +from __future__ import unicode_literals + +import re + +from .hbo import HBOBaseIE + + +class CinemaxIE(HBOBaseIE): + _VALID_URL = r'https?://(?:www\.)?cinemax\.com/(?P[^/]+/video/[0-9a-z-]+-(?P\d+))' + _TESTS = [{ + 'url': 'https://www.cinemax.com/warrior/video/s1-ep-1-recap-20126903', + 'md5': '82e0734bba8aa7ef526c9dd00cf35a05', + 'info_dict': { + 'id': '20126903', + 'ext': 'mp4', + 'title': 'S1 Ep 1: Recap', + }, + 'expected_warnings': ['Unknown MIME type application/mp4 in DASH manifest'], + }, { + 'url': 'https://www.cinemax.com/warrior/video/s1-ep-1-recap-20126903.embed', + 'only_matching': True, + }] + + def _real_extract(self, url): + path, video_id = re.match(self._VALID_URL, url).groups() + info = self._extract_info('https://www.cinemax.com/%s.xml' % path, video_id) + info['id'] = video_id + return info diff --git a/youtube_dl/extractor/dramafever.py b/youtube_dl/extractor/dramafever.py deleted file mode 100644 index db1de699f..000000000 --- a/youtube_dl/extractor/dramafever.py +++ /dev/null @@ -1,266 +0,0 @@ -# coding: utf-8 -from __future__ import unicode_literals - -import itertools -import json - -from .common import InfoExtractor -from ..compat import ( - compat_HTTPError, - compat_urlparse, -) -from ..utils import ( - clean_html, - ExtractorError, - int_or_none, - parse_age_limit, - parse_duration, - unified_timestamp, - url_or_none, -) - - -class DramaFeverBaseIE(InfoExtractor): - _NETRC_MACHINE = 'dramafever' - - _CONSUMER_SECRET = 'DA59dtVXYLxajktV' - - _consumer_secret = None - - def _get_consumer_secret(self): - mainjs = self._download_webpage( - 'http://www.dramafever.com/static/51afe95/df2014/scripts/main.js', - None, 'Downloading main.js', fatal=False) - if not mainjs: - return self._CONSUMER_SECRET - return self._search_regex( - r"var\s+cs\s*=\s*'([^']+)'", mainjs, - 'consumer secret', default=self._CONSUMER_SECRET) - - def _real_initialize(self): - self._consumer_secret = self._get_consumer_secret() - self._login() - - def _login(self): - username, password = self._get_login_info() - if username is None: - return - - login_form = { - 'username': username, - 'password': password, - } - - try: - response = self._download_json( - 'https://www.dramafever.com/api/users/login', None, 'Logging in', - data=json.dumps(login_form).encode('utf-8'), headers={ - 'x-consumer-key': self._consumer_secret, - }) - except ExtractorError as e: - if isinstance(e.cause, compat_HTTPError) and e.cause.code in (403, 404): - response = self._parse_json( - e.cause.read().decode('utf-8'), None) - else: - raise - - # Successful login - if response.get('result') or response.get('guid') or response.get('user_guid'): - return - - errors = response.get('errors') - if errors and isinstance(errors, list): - error = errors[0] - message = error.get('message') or error['reason'] - raise ExtractorError('Unable to login: %s' % message, expected=True) - raise ExtractorError('Unable to log in') - - -class DramaFeverIE(DramaFeverBaseIE): - IE_NAME = 'dramafever' - _VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P[0-9]+/[0-9]+)(?:/|$)' - _TESTS = [{ - 'url': 'https://www.dramafever.com/drama/4274/1/Heirs/', - 'info_dict': { - 'id': '4274.1', - 'ext': 'wvm', - 'title': 'Heirs - Episode 1', - 'description': 'md5:362a24ba18209f6276e032a651c50bc2', - 'thumbnail': r're:^https?://.*\.jpg', - 'duration': 3783, - 'timestamp': 1381354993, - 'upload_date': '20131009', - 'series': 'Heirs', - 'season_number': 1, - 'episode': 'Episode 1', - 'episode_number': 1, - }, - 'params': { - # m3u8 download - 'skip_download': True, - }, - }, { - 'url': 'http://www.dramafever.com/drama/4826/4/Mnet_Asian_Music_Awards_2015/?ap=1', - 'info_dict': { - 'id': '4826.4', - 'ext': 'flv', - 'title': 'Mnet Asian Music Awards 2015', - 'description': 'md5:3ff2ee8fedaef86e076791c909cf2e91', - 'episode': 'Mnet Asian Music Awards 2015 - Part 3', - 'episode_number': 4, - 'thumbnail': r're:^https?://.*\.jpg', - 'timestamp': 1450213200, - 'upload_date': '20151215', - 'duration': 5359, - }, - 'params': { - # m3u8 download - 'skip_download': True, - }, - }, { - 'url': 'https://www.dramafever.com/zh-cn/drama/4972/15/Doctor_Romantic/', - 'only_matching': True, - }] - - def _call_api(self, path, video_id, note, fatal=False): - return self._download_json( - 'https://www.dramafever.com/api/5/' + path, - video_id, note=note, headers={ - 'x-consumer-key': self._consumer_secret, - }, fatal=fatal) - - def _get_subtitles(self, video_id): - subtitles = {} - subs = self._call_api( - 'video/%s/subtitles/webvtt/' % video_id, video_id, - 'Downloading subtitles JSON', fatal=False) - if not subs or not isinstance(subs, list): - return subtitles - for sub in subs: - if not isinstance(sub, dict): - continue - sub_url = url_or_none(sub.get('url')) - if not sub_url: - continue - subtitles.setdefault( - sub.get('code') or sub.get('language') or 'en', []).append({ - 'url': sub_url - }) - return subtitles - - def _real_extract(self, url): - video_id = self._match_id(url).replace('/', '.') - - series_id, episode_number = video_id.split('.') - - video = self._call_api( - 'series/%s/episodes/%s/' % (series_id, episode_number), video_id, - 'Downloading video JSON') - - formats = [] - download_assets = video.get('download_assets') - if download_assets and isinstance(download_assets, dict): - for format_id, format_dict in download_assets.items(): - if not isinstance(format_dict, dict): - continue - format_url = url_or_none(format_dict.get('url')) - if not format_url: - continue - formats.append({ - 'url': format_url, - 'format_id': format_id, - 'filesize': int_or_none(video.get('filesize')), - }) - - stream = self._call_api( - 'video/%s/stream/' % video_id, video_id, 'Downloading stream JSON', - fatal=False) - if stream: - stream_url = stream.get('stream_url') - if stream_url: - formats.extend(self._extract_m3u8_formats( - stream_url, video_id, 'mp4', entry_protocol='m3u8_native', - m3u8_id='hls', fatal=False)) - self._sort_formats(formats) - - title = video.get('title') or 'Episode %s' % episode_number - description = video.get('description') - thumbnail = video.get('thumbnail') - timestamp = unified_timestamp(video.get('release_date')) - duration = parse_duration(video.get('duration')) - age_limit = parse_age_limit(video.get('tv_rating')) - series = video.get('series_title') - season_number = int_or_none(video.get('season')) - - if series: - title = '%s - %s' % (series, title) - - subtitles = self.extract_subtitles(video_id) - - return { - 'id': video_id, - 'title': title, - 'description': description, - 'thumbnail': thumbnail, - 'duration': duration, - 'timestamp': timestamp, - 'age_limit': age_limit, - 'series': series, - 'season_number': season_number, - 'episode_number': int_or_none(episode_number), - 'formats': formats, - 'subtitles': subtitles, - } - - -class DramaFeverSeriesIE(DramaFeverBaseIE): - IE_NAME = 'dramafever:series' - _VALID_URL = r'https?://(?:www\.)?dramafever\.com/(?:[^/]+/)?drama/(?P[0-9]+)(?:/(?:(?!\d+(?:/|$)).+)?)?$' - _TESTS = [{ - 'url': 'http://www.dramafever.com/drama/4512/Cooking_with_Shin/', - 'info_dict': { - 'id': '4512', - 'title': 'Cooking with Shin', - 'description': 'md5:84a3f26e3cdc3fb7f500211b3593b5c1', - }, - 'playlist_count': 4, - }, { - 'url': 'http://www.dramafever.com/drama/124/IRIS/', - 'info_dict': { - 'id': '124', - 'title': 'IRIS', - 'description': 'md5:b3a30e587cf20c59bd1c01ec0ee1b862', - }, - 'playlist_count': 20, - }] - - _PAGE_SIZE = 60 # max is 60 (see http://api.drama9.com/#get--api-4-episode-series-) - - def _real_extract(self, url): - series_id = self._match_id(url) - - series = self._download_json( - 'http://www.dramafever.com/api/4/series/query/?cs=%s&series_id=%s' - % (self._consumer_secret, series_id), - series_id, 'Downloading series JSON')['series'][series_id] - - title = clean_html(series['name']) - description = clean_html(series.get('description') or series.get('description_short')) - - entries = [] - for page_num in itertools.count(1): - episodes = self._download_json( - 'http://www.dramafever.com/api/4/episode/series/?cs=%s&series_id=%s&page_size=%d&page_number=%d' - % (self._consumer_secret, series_id, self._PAGE_SIZE, page_num), - series_id, 'Downloading episodes JSON page #%d' % page_num) - for episode in episodes.get('value', []): - episode_url = episode.get('episode_url') - if not episode_url: - continue - entries.append(self.url_result( - compat_urlparse.urljoin(url, episode_url), - 'DramaFever', episode.get('guid'))) - if page_num == episodes['num_pages']: - break - - return self.playlist_result(entries, series_id, title, description) diff --git a/youtube_dl/extractor/extractors.py b/youtube_dl/extractor/extractors.py index 6a78d1ece..9073e7a4b 100644 --- a/youtube_dl/extractor/extractors.py +++ b/youtube_dl/extractor/extractors.py @@ -177,7 +177,10 @@ from .cbsnews import ( CBSNewsLiveVideoIE, ) from .cbssports import CBSSportsIE -from .ccc import CCCIE +from .ccc import ( + CCCIE, + CCCPlaylistIE, +) from .ccma import CCMAIE from .cctv import CCTVIE from .cda import CDAIE @@ -194,6 +197,7 @@ from .chirbit import ( ChirbitProfileIE, ) from .cinchcast import CinchcastIE +from .cinemax import CinemaxIE from .ciscolive import ( CiscoLiveSessionIE, CiscoLiveSearchIE, @@ -283,10 +287,6 @@ from .dplay import ( DPlayIE, DPlayItIE, ) -from .dramafever import ( - DramaFeverIE, - DramaFeverSeriesIE, -) from .dreisat import DreiSatIE from .drbonanza import DRBonanzaIE from .drtuber import DrTuberIE @@ -1101,6 +1101,10 @@ from .streetvoice import StreetVoiceIE from .stretchinternet import StretchInternetIE from .stv import STVPlayerIE from .sunporno import SunPornoIE +from .sverigesradio import ( + SverigesRadioEpisodeIE, + SverigesRadioPublicationIE, +) from .svt import ( SVTIE, SVTPageIE, @@ -1422,10 +1426,6 @@ from .weiqitv import WeiqiTVIE from .wimp import WimpIE from .wistia import WistiaIE from .worldstarhiphop import WorldStarHipHopIE -from .wrzuta import ( - WrzutaIE, - WrzutaPlaylistIE, -) from .wsj import ( WSJIE, WSJArticleIE, diff --git a/youtube_dl/extractor/hbo.py b/youtube_dl/extractor/hbo.py index 44440233d..68df748f5 100644 --- a/youtube_dl/extractor/hbo.py +++ b/youtube_dl/extractor/hbo.py @@ -13,19 +13,7 @@ from ..utils import ( ) -class HBOIE(InfoExtractor): - IE_NAME = 'hbo' - _VALID_URL = r'https?://(?:www\.)?hbo\.com/(?:video|embed)(?:/[^/]+)*/(?P[^/?#]+)' - _TEST = { - 'url': 'https://www.hbo.com/video/game-of-thrones/seasons/season-8/videos/trailer', - 'md5': '8126210656f433c452a21367f9ad85b3', - 'info_dict': { - 'id': '22113301', - 'ext': 'mp4', - 'title': 'Game of Thrones - Trailer', - }, - 'expected_warnings': ['Unknown MIME type application/mp4 in DASH manifest'], - } +class HBOBaseIE(InfoExtractor): _FORMATS_INFO = { 'pro7': { 'width': 1280, @@ -65,12 +53,8 @@ class HBOIE(InfoExtractor): }, } - def _real_extract(self, url): - display_id = self._match_id(url) - webpage = self._download_webpage(url, display_id) - location_path = self._parse_json(self._html_search_regex( - r'data-state="({.+?})"', webpage, 'state'), display_id)['video']['locationUrl'] - video_data = self._download_xml(urljoin(url, location_path), display_id) + def _extract_info(self, url, display_id): + video_data = self._download_xml(url, display_id) video_id = xpath_text(video_data, 'id', fatal=True) episode_title = title = xpath_text(video_data, 'title', fatal=True) series = xpath_text(video_data, 'program') @@ -167,3 +151,25 @@ class HBOIE(InfoExtractor): 'thumbnails': thumbnails, 'subtitles': subtitles, } + + +class HBOIE(HBOBaseIE): + IE_NAME = 'hbo' + _VALID_URL = r'https?://(?:www\.)?hbo\.com/(?:video|embed)(?:/[^/]+)*/(?P[^/?#]+)' + _TEST = { + 'url': 'https://www.hbo.com/video/game-of-thrones/seasons/season-8/videos/trailer', + 'md5': '8126210656f433c452a21367f9ad85b3', + 'info_dict': { + 'id': '22113301', + 'ext': 'mp4', + 'title': 'Game of Thrones - Trailer', + }, + 'expected_warnings': ['Unknown MIME type application/mp4 in DASH manifest'], + } + + def _real_extract(self, url): + display_id = self._match_id(url) + webpage = self._download_webpage(url, display_id) + location_path = self._parse_json(self._html_search_regex( + r'data-state="({.+?})"', webpage, 'state'), display_id)['video']['locationUrl'] + return self._extract_info(urljoin(url, location_path), display_id) diff --git a/youtube_dl/extractor/sixplay.py b/youtube_dl/extractor/sixplay.py index 35bc9fa50..2a72af11b 100644 --- a/youtube_dl/extractor/sixplay.py +++ b/youtube_dl/extractor/sixplay.py @@ -65,7 +65,7 @@ class SixPlayIE(InfoExtractor): for asset in assets: asset_url = asset.get('full_physical_path') protocol = asset.get('protocol') - if not asset_url or protocol == 'primetime' or asset.get('type') == 'usp_hlsfp_h264' or asset_url in urls: + if not asset_url or ((protocol == 'primetime' or asset.get('type') == 'usp_hlsfp_h264') and not ('_drmnp.ism/' in asset_url or '_unpnp.ism/' in asset_url)) or asset_url in urls: continue urls.append(asset_url) container = asset.get('video_container') @@ -82,6 +82,7 @@ class SixPlayIE(InfoExtractor): if not urlh: continue asset_url = urlh.geturl() + asset_url = asset_url.replace('_drmnp.ism/', '_unpnp.ism/') for i in range(3, 0, -1): asset_url = asset_url = asset_url.replace('_sd1/', '_sd%d/' % i) m3u8_formats = self._extract_m3u8_formats( diff --git a/youtube_dl/extractor/sverigesradio.py b/youtube_dl/extractor/sverigesradio.py new file mode 100644 index 000000000..aa0691f0d --- /dev/null +++ b/youtube_dl/extractor/sverigesradio.py @@ -0,0 +1,115 @@ +# coding: utf-8 +from __future__ import unicode_literals + +from .common import InfoExtractor +from ..utils import ( + determine_ext, + int_or_none, + str_or_none, +) + + +class SverigesRadioBaseIE(InfoExtractor): + _BASE_URL = 'https://sverigesradio.se/sida/playerajax/' + _QUALITIES = ['low', 'medium', 'high'] + _EXT_TO_CODEC_MAP = { + 'mp3': 'mp3', + 'm4a': 'aac', + } + _CODING_FORMAT_TO_ABR_MAP = { + 5: 128, + 11: 192, + 12: 32, + 13: 96, + } + + def _real_extract(self, url): + audio_id = self._match_id(url) + query = { + 'id': audio_id, + 'type': self._AUDIO_TYPE, + } + + item = self._download_json( + self._BASE_URL + 'audiometadata', audio_id, + 'Downloading audio JSON metadata', query=query)['items'][0] + title = item['subtitle'] + + query['format'] = 'iis' + urls = [] + formats = [] + for quality in self._QUALITIES: + query['quality'] = quality + audio_url_data = self._download_json( + self._BASE_URL + 'getaudiourl', audio_id, + 'Downloading %s format JSON metadata' % quality, + fatal=False, query=query) or {} + audio_url = audio_url_data.get('audioUrl') + if not audio_url or audio_url in urls: + continue + urls.append(audio_url) + ext = determine_ext(audio_url) + coding_format = audio_url_data.get('codingFormat') + abr = int_or_none(self._search_regex( + r'_a(\d+)\.m4a', audio_url, 'audio bitrate', + default=None)) or self._CODING_FORMAT_TO_ABR_MAP.get(coding_format) + formats.append({ + 'abr': abr, + 'acodec': self._EXT_TO_CODEC_MAP.get(ext), + 'ext': ext, + 'format_id': str_or_none(coding_format), + 'vcodec': 'none', + 'url': audio_url, + }) + self._sort_formats(formats) + + return { + 'id': audio_id, + 'title': title, + 'formats': formats, + 'series': item.get('title'), + 'duration': int_or_none(item.get('duration')), + 'thumbnail': item.get('displayimageurl'), + 'description': item.get('description'), + } + + +class SverigesRadioPublicationIE(SverigesRadioBaseIE): + IE_NAME = 'sverigesradio:publication' + _VALID_URL = r'https?://(?:www\.)?sverigesradio\.se/sida/(?:artikel|gruppsida)\.aspx\?.*?\bartikel=(?P[0-9]+)' + _TESTS = [{ + 'url': 'https://sverigesradio.se/sida/artikel.aspx?programid=83&artikel=7038546', + 'md5': '6a4917e1923fccb080e5a206a5afa542', + 'info_dict': { + 'id': '7038546', + 'ext': 'm4a', + 'duration': 132, + 'series': 'Nyheter (Ekot)', + 'title': 'Esa Teittinen: Sanningen har inte kommit fram', + 'description': 'md5:daf7ce66a8f0a53d5465a5984d3839df', + 'thumbnail': r're:^https?://.*\.jpg', + }, + }, { + 'url': 'https://sverigesradio.se/sida/gruppsida.aspx?programid=3304&grupp=6247&artikel=7146887', + 'only_matching': True, + }] + _AUDIO_TYPE = 'publication' + + +class SverigesRadioEpisodeIE(SverigesRadioBaseIE): + IE_NAME = 'sverigesradio:episode' + _VALID_URL = r'https?://(?:www\.)?sverigesradio\.se/(?:sida/)?avsnitt/(?P[0-9]+)' + _TEST = { + 'url': 'https://sverigesradio.se/avsnitt/1140922?programid=1300', + 'md5': '20dc4d8db24228f846be390b0c59a07c', + 'info_dict': { + 'id': '1140922', + 'ext': 'mp3', + 'duration': 3307, + 'series': 'Konflikt', + 'title': 'Metoo och valen', + 'description': 'md5:fcb5c1f667f00badcc702b196f10a27e', + 'thumbnail': r're:^https?://.*\.jpg', + } + } + _AUDIO_TYPE = 'episode' diff --git a/youtube_dl/extractor/twitch.py b/youtube_dl/extractor/twitch.py index 8c87f6dd3..dc5ff29c3 100644 --- a/youtube_dl/extractor/twitch.py +++ b/youtube_dl/extractor/twitch.py @@ -134,12 +134,12 @@ class TwitchBaseIE(InfoExtractor): def _prefer_source(self, formats): try: source = next(f for f in formats if f['format_id'] == 'Source') - source['preference'] = 10 + source['quality'] = 10 except StopIteration: for f in formats: if '/chunked/' in f['url']: f.update({ - 'source_preference': 10, + 'quality': 10, 'format_note': 'Source', }) self._sort_formats(formats) diff --git a/youtube_dl/extractor/wrzuta.py b/youtube_dl/extractor/wrzuta.py deleted file mode 100644 index 0f53f1bcb..000000000 --- a/youtube_dl/extractor/wrzuta.py +++ /dev/null @@ -1,158 +0,0 @@ -# coding: utf-8 -from __future__ import unicode_literals - -import re - -from .common import InfoExtractor -from ..utils import ( - ExtractorError, - int_or_none, - qualities, - remove_start, -) - - -class WrzutaIE(InfoExtractor): - IE_NAME = 'wrzuta.pl' - - _VALID_URL = r'https?://(?P[0-9a-zA-Z]+)\.wrzuta\.pl/(?Pfilm|audio)/(?P[0-9a-zA-Z]+)' - - _TESTS = [{ - 'url': 'http://laboratoriumdextera.wrzuta.pl/film/aq4hIZWrkBu/nike_football_the_last_game', - 'md5': '9e67e05bed7c03b82488d87233a9efe7', - 'info_dict': { - 'id': 'aq4hIZWrkBu', - 'ext': 'mp4', - 'title': 'Nike Football: The Last Game', - 'duration': 307, - 'uploader_id': 'laboratoriumdextera', - 'description': 'md5:7fb5ef3c21c5893375fda51d9b15d9cd', - }, - 'skip': 'Redirected to wrzuta.pl', - }, { - 'url': 'http://vexling.wrzuta.pl/audio/01xBFabGXu6/james_horner_-_into_the_na_39_vi_world_bonus', - 'md5': 'f80564fb5a2ec6ec59705ae2bf2ba56d', - 'info_dict': { - 'id': '01xBFabGXu6', - 'ext': 'mp3', - 'title': 'James Horner - Into The Na\'vi World [Bonus]', - 'description': 'md5:30a70718b2cd9df3120fce4445b0263b', - 'duration': 95, - 'uploader_id': 'vexling', - }, - }] - - def _real_extract(self, url): - mobj = re.match(self._VALID_URL, url) - video_id = mobj.group('id') - typ = mobj.group('typ') - uploader = mobj.group('uploader') - - webpage, urlh = self._download_webpage_handle(url, video_id) - - if urlh.geturl() == 'http://www.wrzuta.pl/': - raise ExtractorError('Video removed', expected=True) - - quality = qualities(['SD', 'MQ', 'HQ', 'HD']) - - audio_table = {'flv': 'mp3', 'webm': 'ogg', '???': 'mp3'} - - embedpage = self._download_json('http://www.wrzuta.pl/npp/embed/%s/%s' % (uploader, video_id), video_id) - - formats = [] - for media in embedpage['url']: - fmt = media['type'].split('@')[0] - if typ == 'audio': - ext = audio_table.get(fmt, fmt) - else: - ext = fmt - - formats.append({ - 'format_id': '%s_%s' % (ext, media['quality'].lower()), - 'url': media['url'], - 'ext': ext, - 'quality': quality(media['quality']), - }) - - self._sort_formats(formats) - - return { - 'id': video_id, - 'title': self._og_search_title(webpage), - 'thumbnail': self._og_search_thumbnail(webpage), - 'formats': formats, - 'duration': int_or_none(embedpage['duration']), - 'uploader_id': uploader, - 'description': self._og_search_description(webpage), - 'age_limit': embedpage.get('minimalAge', 0), - } - - -class WrzutaPlaylistIE(InfoExtractor): - """ - this class covers extraction of wrzuta playlist entries - the extraction process bases on following steps: - * collect information of playlist size - * download all entries provided on - the playlist webpage (the playlist is split - on two pages: first directly reached from webpage - second: downloaded on demand by ajax call and rendered - using the ajax call response) - * in case size of extracted entries not reached total number of entries - use the ajax call to collect the remaining entries - """ - - IE_NAME = 'wrzuta.pl:playlist' - _VALID_URL = r'https?://(?P[0-9a-zA-Z]+)\.wrzuta\.pl/playlista/(?P[0-9a-zA-Z]+)' - _TESTS = [{ - 'url': 'http://miromak71.wrzuta.pl/playlista/7XfO4vE84iR/moja_muza', - 'playlist_mincount': 14, - 'info_dict': { - 'id': '7XfO4vE84iR', - 'title': 'Moja muza', - }, - }, { - 'url': 'http://heroesf70.wrzuta.pl/playlista/6Nj3wQHx756/lipiec_-_lato_2015_muzyka_swiata', - 'playlist_mincount': 144, - 'info_dict': { - 'id': '6Nj3wQHx756', - 'title': 'Lipiec - Lato 2015 Muzyka Świata', - }, - }, { - 'url': 'http://miromak71.wrzuta.pl/playlista/7XfO4vE84iR', - 'only_matching': True, - }] - - def _real_extract(self, url): - mobj = re.match(self._VALID_URL, url) - playlist_id = mobj.group('id') - uploader = mobj.group('uploader') - - webpage = self._download_webpage(url, playlist_id) - - playlist_size = int_or_none(self._html_search_regex( - (r']+class=["\']playlist-counter["\'][^>]*>\d+/(\d+)', - r']+class=["\']all-counter["\'][^>]*>(.+?)'), - webpage, 'playlist size', default=None)) - - playlist_title = remove_start( - self._og_search_title(webpage), 'Playlista: ') - - entries = [] - if playlist_size: - entries = [ - self.url_result(entry_url) - for _, entry_url in re.findall( - r']+href=(["\'])(http.+?)\1[^>]+class=["\']playlist-file-page', - webpage)] - if playlist_size > len(entries): - playlist_content = self._download_json( - 'http://%s.wrzuta.pl/xhr/get_playlist_offset/%s' % (uploader, playlist_id), - playlist_id, - 'Downloading playlist JSON', - 'Unable to download playlist JSON') - entries.extend([ - self.url_result(entry['filelink']) - for entry in playlist_content.get('files', []) if entry.get('filelink')]) - - return self.playlist_result(entries, playlist_id, playlist_title) diff --git a/youtube_dl/extractor/youtube.py b/youtube_dl/extractor/youtube.py index 1bc2c27ad..9d542f893 100644 --- a/youtube_dl/extractor/youtube.py +++ b/youtube_dl/extractor/youtube.py @@ -27,6 +27,7 @@ from ..compat import ( ) from ..utils import ( clean_html, + dict_get, error_to_compat_str, ExtractorError, float_or_none, @@ -908,6 +909,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor): 'creator': 'Todd Haberman, Daniel Law Heath and Aaron Kaplan', 'track': 'Dark Walk - Position Music', 'artist': 'Todd Haberman, Daniel Law Heath and Aaron Kaplan', + 'album': 'Position Music - Production Music Vol. 143 - Dark Walk', }, 'params': { 'skip_download': True, @@ -1086,7 +1088,95 @@ class YoutubeIE(YoutubeBaseInfoExtractor): 'skip_download': True, 'youtube_include_dash_manifest': False, }, - } + }, + { + # Youtube Music Auto-generated description + 'url': 'https://music.youtube.com/watch?v=MgNrAu2pzNs', + 'info_dict': { + 'id': 'MgNrAu2pzNs', + 'ext': 'mp4', + 'title': 'Voyeur Girl', + 'description': 'md5:7ae382a65843d6df2685993e90a8628f', + 'upload_date': '20190312', + 'uploader': 'Various Artists - Topic', + 'uploader_id': 'UCVWKBi1ELZn0QX2CBLSkiyw', + 'artist': 'Stephen', + 'track': 'Voyeur Girl', + 'album': 'it\'s too much love to know my dear', + 'release_date': '20190313', + 'release_year': 2019, + }, + 'params': { + 'skip_download': True, + }, + }, + { + # Youtube Music Auto-generated description + # Retrieve 'artist' field from 'Artist:' in video description + # when it is present on youtube music video + 'url': 'https://www.youtube.com/watch?v=k0jLE7tTwjY', + 'info_dict': { + 'id': 'k0jLE7tTwjY', + 'ext': 'mp4', + 'title': 'Latch Feat. Sam Smith', + 'description': 'md5:3cb1e8101a7c85fcba9b4fb41b951335', + 'upload_date': '20150110', + 'uploader': 'Various Artists - Topic', + 'uploader_id': 'UCNkEcmYdjrH4RqtNgh7BZ9w', + 'artist': 'Disclosure', + 'track': 'Latch Feat. Sam Smith', + 'album': 'Latch Featuring Sam Smith', + 'release_date': '20121008', + 'release_year': 2012, + }, + 'params': { + 'skip_download': True, + }, + }, + { + # Youtube Music Auto-generated description + # handle multiple artists on youtube music video + 'url': 'https://www.youtube.com/watch?v=74qn0eJSjpA', + 'info_dict': { + 'id': '74qn0eJSjpA', + 'ext': 'mp4', + 'title': 'Eastside', + 'description': 'md5:290516bb73dcbfab0dcc4efe6c3de5f2', + 'upload_date': '20180710', + 'uploader': 'Benny Blanco - Topic', + 'uploader_id': 'UCzqz_ksRu_WkIzmivMdIS7A', + 'artist': 'benny blanco, Halsey, Khalid', + 'track': 'Eastside', + 'album': 'Eastside', + 'release_date': '20180713', + 'release_year': 2018, + }, + 'params': { + 'skip_download': True, + }, + }, + { + # Youtube Music Auto-generated description + # handle youtube music video with release_year and no release_date + 'url': 'https://www.youtube.com/watch?v=-hcAI0g-f5M', + 'info_dict': { + 'id': '-hcAI0g-f5M', + 'ext': 'mp4', + 'title': 'Put It On Me', + 'description': 'md5:93c55acc682ae7b0c668f2e34e1c069e', + 'upload_date': '20180426', + 'uploader': 'Matt Maeson - Topic', + 'uploader_id': 'UCnEkIGqtGcQMLk73Kp-Q5LQ', + 'artist': 'Matt Maeson', + 'track': 'Put It On Me', + 'album': 'The Hearse', + 'release_date': None, + 'release_year': 2018, + }, + 'params': { + 'skip_download': True, + }, + }, ] def __init__(self, *args, **kwargs): @@ -1563,6 +1653,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor): def extract_view_count(v_info): return int_or_none(try_get(v_info, lambda x: x['view_count'][0])) + def extract_token(v_info): + return dict_get(v_info, ('account_playback_token', 'accountPlaybackToken', 'token')) + player_response = {} # Get video info @@ -1622,7 +1715,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor): # The general idea is to take a union of itags of both DASH manifests (for example # video with such 'manifest behavior' see https://github.com/ytdl-org/youtube-dl/issues/6093) self.report_video_info_webpage_download(video_id) - for el in ('info', 'embedded', 'detailpage', 'vevo', ''): + for el in ('embedded', 'detailpage', 'vevo', ''): query = { 'video_id': video_id, 'ps': 'default', @@ -1652,7 +1745,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor): view_count = extract_view_count(get_video_info) if not video_info: video_info = get_video_info - get_token = get_video_info.get('token') or get_video_info.get('account_playback_token') + get_token = extract_token(get_video_info) if get_token: # Different get_video_info requests may report different results, e.g. # some may report video unavailability, but some may serve it without @@ -1663,7 +1756,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor): # due to YouTube measures against IP ranges of hosting providers. # Working around by preferring the first succeeded video_info containing # the token if no such video_info yet was found. - token = video_info.get('token') or video_info.get('account_playback_token') + token = extract_token(video_info) if not token: video_info = get_video_info break @@ -1680,28 +1773,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor): raise ExtractorError( 'YouTube said: %s' % unavailable_message, expected=True, video_id=video_id) - token = video_info.get('token') or video_info.get('account_playback_token') - if not token: - if 'reason' in video_info: - if 'The uploader has not made this video available in your country.' in video_info['reason']: - regions_allowed = self._html_search_meta( - 'regionsAllowed', video_webpage, default=None) - countries = regions_allowed.split(',') if regions_allowed else None - self.raise_geo_restricted( - msg=video_info['reason'][0], countries=countries) - reason = video_info['reason'][0] - if 'Invalid parameters' in reason: - unavailable_message = extract_unavailable_message() - if unavailable_message: - reason = unavailable_message - raise ExtractorError( - 'YouTube said: %s' % reason, - expected=True, video_id=video_id) - else: - raise ExtractorError( - '"token" parameter not in video info for unknown reason', - video_id=video_id) - if video_info.get('license_info'): raise ExtractorError('This video is DRM protected.', expected=True) @@ -2073,6 +2144,27 @@ class YoutubeIE(YoutubeBaseInfoExtractor): track = extract_meta('Song') artist = extract_meta('Artist') + album = extract_meta('Album') + + # Youtube Music Auto-generated description + release_date = release_year = None + if video_description: + mobj = re.search(r'(?s)Provided to YouTube by [^\n]+\n+(?P[^·]+)·(?P[^\n]+)\n+(?P[^\n]+)(?:.+?℗\s*(?P\d{4})(?!\d))?(?:.+?Released on\s*:\s*(?P\d{4}-\d{2}-\d{2}))?(.+?\nArtist\s*:\s*(?P[^\n]+))?', video_description) + if mobj: + if not track: + track = mobj.group('track').strip() + if not artist: + artist = mobj.group('clean_artist') or ', '.join(a.strip() for a in mobj.group('artist').split('·')) + if not album: + album = mobj.group('album'.strip()) + release_year = mobj.group('release_year') + release_date = mobj.group('release_date') + if release_date: + release_date = release_date.replace('-', '') + if not release_year: + release_year = int(release_date[:4]) + if release_year: + release_year = int(release_year) m_episode = re.search( r']+id="watch7-headline"[^>]*>\s*]*>.*?>(?P[^<]+)\s*S(?P\d+)\s*•\s*E(?P\d+)', @@ -2186,6 +2278,29 @@ class YoutubeIE(YoutubeBaseInfoExtractor): if f.get('vcodec') != 'none': f['stretched_ratio'] = ratio + if not formats: + token = extract_token(video_info) + if not token: + if 'reason' in video_info: + if 'The uploader has not made this video available in your country.' in video_info['reason']: + regions_allowed = self._html_search_meta( + 'regionsAllowed', video_webpage, default=None) + countries = regions_allowed.split(',') if regions_allowed else None + self.raise_geo_restricted( + msg=video_info['reason'][0], countries=countries) + reason = video_info['reason'][0] + if 'Invalid parameters' in reason: + unavailable_message = extract_unavailable_message() + if unavailable_message: + reason = unavailable_message + raise ExtractorError( + 'YouTube said: %s' % reason, + expected=True, video_id=video_id) + else: + raise ExtractorError( + '"token" parameter not in video info for unknown reason', + video_id=video_id) + self._sort_formats(formats) self.mark_watched(video_id, video_info, player_response) @@ -2226,6 +2341,9 @@ class YoutubeIE(YoutubeBaseInfoExtractor): 'episode_number': episode_number, 'track': track, 'artist': artist, + 'album': album, + 'release_date': release_date, + 'release_year': release_year, }