Compare commits

1 Commits

Author | SHA1 | Message | Date
Sergey M․ | a390c247b5 | release 2019.04.17 | 2019-04-17 00:20:09 +07:00
352 changed files with 13016 additions and 17330 deletions

.github/ISSUE_TEMPLATE.md vendored Normal file

@ -0,0 +1,61 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use the *Preview* tab to see what your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *2019.04.17*. If it's not, read [this FAQ entry](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **2019.04.17**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/ytdl-org/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/ytdl-org/youtube-dl#faq) and [BUGS](https://github.com/ytdl-org/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2019.04.17
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.

@ -1,63 +0,0 @@
---
name: Broken site support
about: Report broken or misfunctioning site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.09.20
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

@ -1,54 +0,0 @@
---
name: Site support request
about: Request support for a new site
title: ''
labels: 'site-support-request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones
## Example URLs
<!--
Provide all kinds of example URLs support for which should be included. Replace following example URLs by yours.
-->
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
## Description
<!--
Provide any additional information.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

@ -1,37 +0,0 @@
---
name: Site feature request
about: Request a new functionality for a site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
## Description
<!--
Provide an explanation of your site feature request in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

@ -1,65 +0,0 @@
---
name: Bug report
about: Report a bug unrelated to any particular site or extractor
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Read bugs section in FAQ: http://yt-dl.org/reporting
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
- [ ] I've read bugs section in FAQ
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version 2020.09.20
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

@ -1,38 +0,0 @@
---
name: Feature request
about: Request a new functionality unrelated to any particular site or extractor
title: ''
labels: 'request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is 2020.09.20. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **2020.09.20**
- [ ] I've searched the bugtracker for similar feature requests including closed ones
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

@ -1,38 +0,0 @@
---
name: Ask question
about: Ask youtube-dl related question
title: ''
labels: 'question'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- Look through the README (http://yt-dl.org/readme) and FAQ (http://yt-dl.org/faq) for similar questions
- Search the bugtracker for similar questions: http://yt-dl.org/search-issues
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm asking a question
- [ ] I've looked through the README and FAQ for similar questions
- [ ] I've searched the bugtracker for similar questions including closed ones
## Question
<!--
Ask your question in an arbitrary form. Please make sure it's worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient.
-->
WRITE QUESTION HERE

.github/ISSUE_TEMPLATE_tmpl.md vendored Normal file

@ -0,0 +1,61 @@
## Please follow the guide below
- You will be asked some questions and requested to provide some information, please read them **carefully** and answer honestly
- Put an `x` into all the boxes [ ] relevant to your *issue* (like this: `[x]`)
- Use the *Preview* tab to see what your issue will actually look like
---
### Make sure you are using the *latest* version: run `youtube-dl --version` and ensure your version is *%(version)s*. If it's not, read [this FAQ entry](https://github.com/ytdl-org/youtube-dl/blob/master/README.md#how-do-i-update-youtube-dl) and update. Issues with outdated version will be rejected.
- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**
### Before submitting an *issue* make sure you have:
- [ ] At least skimmed through the [README](https://github.com/ytdl-org/youtube-dl/blob/master/README.md), **most notably** the [FAQ](https://github.com/ytdl-org/youtube-dl#faq) and [BUGS](https://github.com/ytdl-org/youtube-dl#bugs) sections
- [ ] [Searched](https://github.com/ytdl-org/youtube-dl/search?type=Issues) the bugtracker for similar issues including closed ones
- [ ] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser
### What is the purpose of your *issue*?
- [ ] Bug report (encountered problems with youtube-dl)
- [ ] Site support request (request for adding support for a new site)
- [ ] Feature request (request for a new functionality)
- [ ] Question
- [ ] Other
---
### The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your *issue*
---
### If the purpose of this *issue* is a *bug report*, *site support request* or you are not completely sure provide the full verbose output as follows:
Add the `-v` flag to **your command line** you run youtube-dl with (`youtube-dl -v <your command line>`), copy the **whole** output and insert it here. It should look similar to one below (replace it with **your** log inserted between triple ```):
```
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
...
<end of log>
```
---
### If the purpose of this *issue* is a *site support request* please provide all kinds of example URLs support for which should be included (replace following example URLs by **yours**):
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
Note that **youtube-dl does not support sites dedicated to [copyright infringement](https://github.com/ytdl-org/youtube-dl#can-you-add-support-for-this-anime-video-site-or-site-which-shows-current-movies-for-free)**. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
---
### Description of your *issue*, suggested solution and other information
Explanation of your *issue* in arbitrary form goes here. Please make sure the [description is worded well enough to be understood](https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient). Provide as much context and examples as possible.
If work on your *issue* requires account credentials please provide them or explain how one can obtain them.
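
The `%(version)s` tokens in `ISSUE_TEMPLATE_tmpl.md` are Python %-style mapping keys, which is why the rendered `ISSUE_TEMPLATE.md` carries a concrete version such as 2019.04.17 in the same positions. The following is a minimal sketch of that substitution, not the project's actual build script: the function name `render_issue_template` and the two-line sample are hypothetical and used only for illustration.

```python
def render_issue_template(template_text, version):
    """Fill every %(version)s placeholder via %-style mapping substitution."""
    # Assumption: the template holds no other literal '%' characters;
    # any that did exist would have to be written as '%%'.
    return template_text % {'version': version}


# Two lines in the style of the template, used as a small sample input.
sample = (
    "- [ ] I've **verified** and **I assure** that I'm running youtube-dl **%(version)s**\n"
    "[debug] youtube-dl version %(version)s\n"
)

rendered = render_issue_template(sample, '2019.04.17')
print(rendered)
```

Because the substitution is driven by a mapping, the same key can appear any number of times in the template and every occurrence is filled from a single value.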

@ -1,63 +0,0 @@
---
name: Broken site support
about: Report broken or misfunctioning site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar issues including closed ones
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

@ -1,54 +0,0 @@
---
name: Site support request
about: Request support for a new site
title: ''
labels: 'site-support-request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that site you are requesting is not dedicated to copyright infringement, see https://yt-dl.org/copyright-infringement. youtube-dl does not support such sites. In order for site support request to be accepted all provided example URLs should not violate any copyrights.
- Search the bugtracker for similar site support requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a new site support request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that none of provided URLs violate any copyrights
- [ ] I've searched the bugtracker for similar site support requests including closed ones
## Example URLs
<!--
Provide all kinds of example URLs support for which should be included. Replace following example URLs by yours.
-->
- Single video: https://www.youtube.com/watch?v=BaW_jenozKc
- Single video: https://youtu.be/BaW_jenozKc
- Playlist: https://www.youtube.com/playlist?list=PL4lCao7KL_QFVb7Iudeipvc2BCavECqzc
## Description
<!--
Provide any additional information.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

@ -1,37 +0,0 @@
---
name: Site feature request
about: Request a new functionality for a site
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar site feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a site feature request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've searched the bugtracker for similar site feature requests including closed ones
## Description
<!--
Provide an explanation of your site feature request in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

@ -1,65 +0,0 @@
---
name: Bug report
about: Report a bug unrelated to any particular site or extractor
title: ''
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Make sure that all provided video/audio/playlist URLs (if any) are alive and playable in a browser.
- Make sure that all URLs and arguments with special characters are properly quoted or escaped as explained in http://yt-dl.org/escape.
- Search the bugtracker for similar issues: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Read bugs section in FAQ: http://yt-dl.org/reporting
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a broken site support issue
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've checked that all provided URLs are alive and playable in a browser
- [ ] I've checked that all URLs and arguments with special characters are properly quoted or escaped
- [ ] I've searched the bugtracker for similar bug reports including closed ones
- [ ] I've read bugs section in FAQ
## Verbose log
<!--
Provide the complete verbose output of youtube-dl that clearly demonstrates the problem.
Add the `-v` flag to your command line you run youtube-dl with (`youtube-dl -v <your command line>`), copy the WHOLE output and insert it below. It should look similar to this:
[debug] System config: []
[debug] User config: []
[debug] Command-line args: [u'-v', u'http://www.youtube.com/watch?v=BaW_jenozKcj']
[debug] Encodings: locale cp1251, fs mbcs, out cp866, pref cp1251
[debug] youtube-dl version %(version)s
[debug] Python version 2.7.11 - Windows-2003Server-5.2.3790-SP2
[debug] exe versions: ffmpeg N-75573-g1d0487f, ffprobe N-75573-g1d0487f, rtmpdump 2.4
[debug] Proxy map: {}
<more lines>
-->
```
PASTE VERBOSE LOG HERE
```
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
If work on your issue requires account credentials please provide them or explain how one can obtain them.
-->
WRITE DESCRIPTION HERE

@ -1,38 +0,0 @@
---
name: Feature request
about: Request a new functionality unrelated to any particular site or extractor
title: ''
labels: 'request'
---
<!--
######################################################################
WARNING!
IGNORING THE FOLLOWING TEMPLATE WILL RESULT IN ISSUE CLOSED AS INCOMPLETE
######################################################################
-->
## Checklist
<!--
Carefully read and work through this check list in order to prevent the most common mistakes and misuse of youtube-dl:
- First of, make sure you are using the latest version of youtube-dl. Run `youtube-dl --version` and ensure your version is %(version)s. If it's not, see https://yt-dl.org/update on how to update. Issues with outdated version will be REJECTED.
- Search the bugtracker for similar feature requests: http://yt-dl.org/search-issues. DO NOT post duplicates.
- Finally, put x into all relevant boxes (like this [x])
-->
- [ ] I'm reporting a feature request
- [ ] I've verified that I'm running youtube-dl version **%(version)s**
- [ ] I've searched the bugtracker for similar feature requests including closed ones
## Description
<!--
Provide an explanation of your issue in an arbitrary form. Please make sure the description is worded well enough to be understood, see https://github.com/ytdl-org/youtube-dl#is-the-description-of-the-issue-itself-sufficient. Provide any additional information, suggested solution and as much context and examples as possible.
-->
WRITE DESCRIPTION HERE

@ -9,11 +9,10 @@ python:
 - "3.6"
 - "pypy"
 - "pypy3"
-dist: trusty
 env:
 - YTDL_TEST_SET=core
 - YTDL_TEST_SET=download
-jobs:
+matrix:
 include:
 - python: 3.7
 dist: xenial
@ -21,12 +20,6 @@ jobs:
 - python: 3.7
 dist: xenial
 env: YTDL_TEST_SET=download
-- python: 3.8
-dist: xenial
-env: YTDL_TEST_SET=core
-- python: 3.8
-dist: xenial
-env: YTDL_TEST_SET=download
 - python: 3.8-dev
 dist: xenial
 env: YTDL_TEST_SET=core
@ -35,11 +28,6 @@ jobs:
 env: YTDL_TEST_SET=download
 - env: JYTHON=true; YTDL_TEST_SET=core
 - env: JYTHON=true; YTDL_TEST_SET=download
-- name: flake8
-python: 3.8
-dist: xenial
-install: pip install flake8
-script: flake8 .
 fast_finish: true
 allow_failures:
 - env: YTDL_TEST_SET=download

@ -153,7 +153,7 @@ After you have ensured this site is distributing its content legally, you can fo
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/extractors.py). 5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want. 7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want.
8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart): 8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](http://flake8.pycqa.org/en/latest/index.html#quickstart):
$ flake8 youtube_dl/extractor/yourextractor.py $ flake8 youtube_dl/extractor/yourextractor.py
@@ -339,72 +339,6 @@ Incorrect:
'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4' 'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
``` ```
### Inline values
Extracting variables is acceptable for reducing code duplication and improving readability of complex expressions. However, you should avoid extracting variables used only once and moving them to opposite parts of the extractor file, which makes reading the linear flow difficult.
#### Example
Correct:
```python
title = self._html_search_regex(r'<title>([^<]+)</title>', webpage, 'title')
```
Incorrect:
```python
TITLE_RE = r'<title>([^<]+)</title>'
# ...some lines of code...
title = self._html_search_regex(TITLE_RE, webpage, 'title')
```
### Collapse fallbacks
Multiple fallback values can quickly become unwieldy. Collapse multiple fallback values into a single expression via a list of patterns.
#### Example
Good:
```python
description = self._html_search_meta(
['og:description', 'description', 'twitter:description'],
webpage, 'description', default=None)
```
Unwieldy:
```python
description = (
self._og_search_description(webpage, default=None)
or self._html_search_meta('description', webpage, default=None)
or self._html_search_meta('twitter:description', webpage, default=None))
```
Methods supporting list of patterns are: `_search_regex`, `_html_search_regex`, `_og_search_property`, `_html_search_meta`.
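The "collapse fallbacks" rule works because the listed helpers try each pattern in order and return the first hit. A minimal sketch of that behaviour follows, using only the standard library. `search_first` is a hypothetical stand-in written for illustration, not one of the youtube-dl helpers, and the sample `webpage` string is made up.

```python
import re


def search_first(patterns, text, default=None):
    """Return group 1 of the first pattern that matches, else default.

    Hypothetical helper: it only illustrates the "list of patterns" idea.
    """
    if isinstance(patterns, str):
        patterns = [patterns]
    for pattern in patterns:
        match = re.search(pattern, text)
        if match:
            return match.group(1)
    return default


# Made-up page: only the plain "description" meta tag is present.
webpage = '<meta name="description" content="A short clip">'
description = search_first(
    [r'property="og:description" content="([^"]+)"',
     r'name="description" content="([^"]+)"'],
    webpage)
print(description)
```

One call with an ordered list replaces a chain of `or` expressions, and the fallback order stays visible in a single place.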
### Trailing parentheses
Always move trailing parentheses after the last argument.
#### Example
Correct:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list)
```
Incorrect:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list,
)
```
### Use convenience conversion and parsing functions ### Use convenience conversion and parsing functions
Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well. Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.
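The point of wrapping numeric data is that a missing or malformed value becomes `None` instead of raising. A simplified sketch of the idea follows; `int_or_none_sketch` and `float_or_none_sketch` are hypothetical stand-ins and deliberately omit whatever extra parameters the real functions in `youtube_dl/utils.py` accept.

```python
def int_or_none_sketch(value, default=None):
    """Convert value to int, returning default when that is not possible."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def float_or_none_sketch(value, default=None):
    """Convert value to float, returning default when that is not possible."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


# Typical scraped inputs: a numeric string, a placeholder, a missing field.
view_count = int_or_none_sketch('42')
bad_count = int_or_none_sketch('n/a')
missing_count = int_or_none_sketch(None)
duration = float_or_none_sketch('3.5')
```

With this pattern an extractor can pass the result straight into its info dict without a surrounding `try` block.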

1004
ChangeLog

File diff suppressed because it is too large.

@@ -1,7 +1,7 @@
all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites all: youtube-dl README.md CONTRIBUTING.md README.txt youtube-dl.1 youtube-dl.bash-completion youtube-dl.zsh youtube-dl.fish supportedsites
clean: clean:
rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp youtube-dl youtube-dl.exe rm -rf youtube-dl.1.temp.md youtube-dl.1 youtube-dl.bash-completion README.txt MANIFEST build/ dist/ .coverage cover/ youtube-dl.tar.gz youtube-dl.zsh youtube-dl.fish youtube_dl/extractor/lazy_extractors.py *.dump *.part* *.ytdl *.info.json *.mp4 *.m4a *.flv *.mp3 *.avi *.mkv *.webm *.3gp *.wav *.ape *.swf *.jpg *.png CONTRIBUTING.md.tmp ISSUE_TEMPLATE.md.tmp youtube-dl youtube-dl.exe
find . -name "*.pyc" -delete find . -name "*.pyc" -delete
find . -name "*.class" -delete find . -name "*.class" -delete
@@ -78,12 +78,8 @@ README.md: youtube_dl/*.py youtube_dl/*/*.py
CONTRIBUTING.md: README.md CONTRIBUTING.md: README.md
$(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md $(PYTHON) devscripts/make_contributing.py README.md CONTRIBUTING.md
issuetemplates: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md youtube_dl/version.py .github/ISSUE_TEMPLATE.md: devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md youtube_dl/version.py
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/1_broken_site.md .github/ISSUE_TEMPLATE/1_broken_site.md $(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl.md .github/ISSUE_TEMPLATE.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/2_site_support_request.md .github/ISSUE_TEMPLATE/2_site_support_request.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/3_site_feature_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/4_bug_report.md .github/ISSUE_TEMPLATE/4_bug_report.md
$(PYTHON) devscripts/make_issue_template.py .github/ISSUE_TEMPLATE_tmpl/5_feature_request.md .github/ISSUE_TEMPLATE/5_feature_request.md
supportedsites: supportedsites:
$(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md $(PYTHON) devscripts/make_supportedsites.py docs/supportedsites.md
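Both sides of the issue-template rule in the hunk above depend on `youtube_dl/version.py`, which suggests the generator stamps the current version into a template file. A minimal sketch of that kind of substitution follows, assuming a `%(version)s` placeholder. The placeholder name, the `render_issue_template` function and the sample template text are assumptions for illustration, not taken from `devscripts/make_issue_template.py`.

```python
def render_issue_template(template_text, version):
    """Fill a %(version)s placeholder in the template text.

    Hypothetical sketch; the real script may read and write files differently.
    """
    return template_text % {'version': version}


# Made-up one-line template standing in for the template file's contents.
template = 'ensure your version is *%(version)s*.'
rendered = render_issue_template(template, '2019.04.17')
print(rendered)
```

Generating the template from the version file keeps the version quoted in the issue form in step with each release commit.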

@@ -434,9 +434,9 @@ Alternatively, refer to the [developer instructions](#developer-instructions) fo
either the path to the binary or its either the path to the binary or its
containing directory. containing directory.
--exec CMD Execute a command on the file after --exec CMD Execute a command on the file after
downloading and post-processing, similar to downloading, similar to find's -exec
find's -exec syntax. Example: --exec 'adb syntax. Example: --exec 'adb push {}
push {} /sdcard/Music/ && rm {}' /sdcard/Music/ && rm {}'
--convert-subs FORMAT Convert the subtitles to other format --convert-subs FORMAT Convert the subtitles to other format
(currently supported: srt|ass|vtt|lrc) (currently supported: srt|ass|vtt|lrc)
@@ -545,7 +545,7 @@ The basic usage is not to set any template arguments when downloading a single f
- `extractor` (string): Name of the extractor - `extractor` (string): Name of the extractor
- `extractor_key` (string): Key name of the extractor - `extractor_key` (string): Key name of the extractor
- `epoch` (numeric): Unix epoch when creating the file - `epoch` (numeric): Unix epoch when creating the file
- `autonumber` (numeric): Number that will be increased with each download, starting at `--autonumber-start` - `autonumber` (numeric): Five-digit number that will be increased with each download, starting at zero
- `playlist` (string): Name or id of the playlist that contains the video - `playlist` (string): Name or id of the playlist that contains the video
- `playlist_index` (numeric): Index of the video in the playlist padded with leading zeros according to the total length of the playlist - `playlist_index` (numeric): Index of the video in the playlist padded with leading zeros according to the total length of the playlist
- `playlist_id` (string): Playlist identifier - `playlist_id` (string): Playlist identifier
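The two sides of this hunk describe `autonumber` differently: one points at `--autonumber-start`, the other calls it a five-digit number starting at zero. Output templates use Python's `%`-style formatting, so the zero padding can be shown with a short sketch. The field values and the explicit `05d` width below are illustrative assumptions, not taken from youtube-dl's code.

```python
# Hypothetical metadata for one download; the keys mirror README field names.
info = {'autonumber': 1, 'title': 'demo', 'ext': 'mp4'}

# '%(autonumber)05d' pads the counter with leading zeros to five digits.
template = '%(autonumber)05d-%(title)s.%(ext)s'
filename = template % info
print(filename)
```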
@@ -752,8 +752,8 @@ As a last resort, you can also uninstall the version installed by your package m
Afterwards, simply follow [our manual installation instructions](https://ytdl-org.github.io/youtube-dl/download.html): Afterwards, simply follow [our manual installation instructions](https://ytdl-org.github.io/youtube-dl/download.html):
``` ```
sudo wget https://yt-dl.org/downloads/latest/youtube-dl -O /usr/local/bin/youtube-dl sudo wget https://yt-dl.org/latest/youtube-dl -O /usr/local/bin/youtube-dl
sudo chmod a+rx /usr/local/bin/youtube-dl sudo chmod a+x /usr/local/bin/youtube-dl
hash -r hash -r
``` ```
@@ -835,9 +835,7 @@ In February 2015, the new YouTube player contained a character sequence in a str
### HTTP Error 429: Too Many Requests or 402: Payment Required ### HTTP Error 429: Too Many Requests or 402: Payment Required
These two error codes indicate that the service is blocking your IP address because of overuse. Usually this is a soft block meaning that you can gain access again after solving CAPTCHA. Just open a browser and solve a CAPTCHA the service suggests you and after that [pass cookies](#how-do-i-pass-cookies-to-youtube-dl) to youtube-dl. Note that if your machine has multiple external IPs then you should also pass exactly the same IP you've used for solving CAPTCHA with [`--source-address`](#network-options). Also you may need to pass a `User-Agent` HTTP header of your browser with [`--user-agent`](#workarounds). These two error codes indicate that the service is blocking your IP address because of overuse. Contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--source-address` options](#network-options) to select another IP address.
If this is not the case (no CAPTCHA suggested to solve by the service) then you can contact the service and ask them to unblock your IP address, or - if you have acquired a whitelisted IP address already - use the [`--proxy` or `--source-address` options](#network-options) to select another IP address.
### SyntaxError: Non-ASCII character ### SyntaxError: Non-ASCII character
@@ -1032,7 +1030,7 @@ After you have ensured this site is distributing its content legally, you can fo
5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/extractors.py). 5. Add an import in [`youtube_dl/extractor/extractors.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/extractors.py).
6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in. 6. Run `python test/test_download.py TestDownload.test_YourExtractor`. This *should fail* at first, but you can continually re-run it until you're done. If you decide to add more than one test, then rename ``_TEST`` to ``_TESTS`` and make it into a list of dictionaries. The tests will then be named `TestDownload.test_YourExtractor`, `TestDownload.test_YourExtractor_1`, `TestDownload.test_YourExtractor_2`, etc. Note that tests with `only_matching` key in test's dict are not counted in.
7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want. 7. Have a look at [`youtube_dl/extractor/common.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/extractor/common.py) for possible helper methods and a [detailed description of what your extractor should and may return](https://github.com/ytdl-org/youtube-dl/blob/7f41a598b3fba1bcab2817de64a08941200aa3c8/youtube_dl/extractor/common.py#L94-L303). Add tests and code for as many as you want.
8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](https://flake8.pycqa.org/en/latest/index.html#quickstart): 8. Make sure your code follows [youtube-dl coding conventions](#youtube-dl-coding-conventions) and check the code with [flake8](http://flake8.pycqa.org/en/latest/index.html#quickstart):
$ flake8 youtube_dl/extractor/yourextractor.py $ flake8 youtube_dl/extractor/yourextractor.py
@@ -1218,72 +1216,6 @@ Incorrect:
'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4' 'PLMYEtVRpaqY00V9W81Cwmzp6N6vZqfUKD4'
``` ```
### Inline values
Extracting variables is acceptable for reducing code duplication and improving readability of complex expressions. However, you should avoid extracting variables used only once and moving them to opposite parts of the extractor file, which makes reading the linear flow difficult.
#### Example
Correct:
```python
title = self._html_search_regex(r'<title>([^<]+)</title>', webpage, 'title')
```
Incorrect:
```python
TITLE_RE = r'<title>([^<]+)</title>'
# ...some lines of code...
title = self._html_search_regex(TITLE_RE, webpage, 'title')
```
### Collapse fallbacks
Multiple fallback values can quickly become unwieldy. Collapse multiple fallback values into a single expression via a list of patterns.
#### Example
Good:
```python
description = self._html_search_meta(
['og:description', 'description', 'twitter:description'],
webpage, 'description', default=None)
```
Unwieldy:
```python
description = (
self._og_search_description(webpage, default=None)
or self._html_search_meta('description', webpage, default=None)
or self._html_search_meta('twitter:description', webpage, default=None))
```
Methods supporting list of patterns are: `_search_regex`, `_html_search_regex`, `_og_search_property`, `_html_search_meta`.
### Trailing parentheses
Always move trailing parentheses after the last argument.
#### Example
Correct:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list)
```
Incorrect:
```python
lambda x: x['ResultSet']['Result'][0]['VideoUrlSet']['VideoUrl'],
list,
)
```
### Use convenience conversion and parsing functions ### Use convenience conversion and parsing functions
Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well. Wrap all extracted numeric data into safe functions from [`youtube_dl/utils.py`](https://github.com/ytdl-org/youtube-dl/blob/master/youtube_dl/utils.py): `int_or_none`, `float_or_none`. Use them for string to number conversions as well.

@@ -45,12 +45,12 @@ for test in gettestcases():
RESULT = ('.' + domain + '\n' in LIST or '\n' + domain + '\n' in LIST) RESULT = ('.' + domain + '\n' in LIST or '\n' + domain + '\n' in LIST)
if RESULT and ('info_dict' not in test or 'age_limit' not in test['info_dict'] if RESULT and ('info_dict' not in test or 'age_limit' not in test['info_dict'] or
or test['info_dict']['age_limit'] != 18): test['info_dict']['age_limit'] != 18):
print('\nPotential missing age_limit check: {0}'.format(test['name'])) print('\nPotential missing age_limit check: {0}'.format(test['name']))
elif not RESULT and ('info_dict' in test and 'age_limit' in test['info_dict'] elif not RESULT and ('info_dict' in test and 'age_limit' in test['info_dict'] and
and test['info_dict']['age_limit'] == 18): test['info_dict']['age_limit'] == 18):
print('\nPotential false negative: {0}'.format(test['name'])) print('\nPotential false negative: {0}'.format(test['name']))
else: else:

@@ -1,6 +1,7 @@
#!/usr/bin/env python #!/usr/bin/env python
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import io import io
import json import json
import mimetypes import mimetypes
@@ -14,6 +15,7 @@ sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from youtube_dl.compat import ( from youtube_dl.compat import (
compat_basestring, compat_basestring,
compat_input,
compat_getpass, compat_getpass,
compat_print, compat_print,
compat_urllib_request, compat_urllib_request,
@@ -38,20 +40,28 @@ class GitHubReleaser(object):
try: try:
info = netrc.netrc().authenticators(self._NETRC_MACHINE) info = netrc.netrc().authenticators(self._NETRC_MACHINE)
if info is not None: if info is not None:
self._token = info[2] self._username = info[0]
self._password = info[2]
compat_print('Using GitHub credentials found in .netrc...') compat_print('Using GitHub credentials found in .netrc...')
return return
else: else:
compat_print('No GitHub credentials found in .netrc') compat_print('No GitHub credentials found in .netrc')
except (IOError, netrc.NetrcParseError): except (IOError, netrc.NetrcParseError):
compat_print('Unable to parse .netrc') compat_print('Unable to parse .netrc')
self._token = compat_getpass( self._username = compat_input(
'Type your GitHub PAT (personal access token) and press [Return]: ') 'Type your GitHub username or email address and press [Return]: ')
self._password = compat_getpass(
'Type your GitHub password and press [Return]: ')
def _call(self, req): def _call(self, req):
if isinstance(req, compat_basestring): if isinstance(req, compat_basestring):
req = sanitized_Request(req) req = sanitized_Request(req)
req.add_header('Authorization', 'token %s' % self._token) # Authorizing manually since GitHub does not response with 401 with
# WWW-Authenticate header set (see
# https://developer.github.com/v3/#basic-authentication)
b64 = base64.b64encode(
('%s:%s' % (self._username, self._password)).encode('utf-8')).decode('ascii')
req.add_header('Authorization', 'Basic %s' % b64)
response = self._opener.open(req).read().decode('utf-8') response = self._opener.open(req).read().decode('utf-8')
return json.loads(response) return json.loads(response)
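The two sides of this hunk authenticate `_call` differently: one sends a personal access token in an `Authorization: token ...` header, the other builds an HTTP Basic header by hand from a username and password. A minimal standard-library sketch of the Basic construction follows. `basic_auth_header` and the sample credentials are hypothetical and only mirror the `base64` steps visible in the diff.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    # Basic auth joins the credentials with a colon, then base64-encodes them.
    credentials = ('%s:%s' % (username, password)).encode('utf-8')
    return 'Basic %s' % base64.b64encode(credentials).decode('ascii')


# Made-up credentials, for illustration only.
header = basic_auth_header('user', 'pass')
print(header)
```

Setting the header up front avoids relying on a 401 challenge from the server, which is the reason the comment in the hunk gives for authorizing manually.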

@@ -78,8 +78,8 @@ sed -i "s/__version__ = '.*'/__version__ = '$version'/" youtube_dl/version.py
sed -i "s/<unreleased>/$version/" ChangeLog sed -i "s/<unreleased>/$version/" ChangeLog
/bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..." /bin/echo -e "\n### Committing documentation, templates and youtube_dl/version.py..."
make README.md CONTRIBUTING.md issuetemplates supportedsites make README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md supportedsites
git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE/1_broken_site.md .github/ISSUE_TEMPLATE/2_site_support_request.md .github/ISSUE_TEMPLATE/3_site_feature_request.md .github/ISSUE_TEMPLATE/4_bug_report.md .github/ISSUE_TEMPLATE/5_feature_request.md .github/ISSUE_TEMPLATE/6_question.md docs/supportedsites.md youtube_dl/version.py ChangeLog git add README.md CONTRIBUTING.md .github/ISSUE_TEMPLATE.md docs/supportedsites.md youtube_dl/version.py ChangeLog
git commit $gpg_sign_commits -m "release $version" git commit $gpg_sign_commits -m "release $version"
/bin/echo -e "\n### Now tagging, signing and pushing..." /bin/echo -e "\n### Now tagging, signing and pushing..."

@@ -26,13 +26,13 @@
- **AcademicEarth:Course** - **AcademicEarth:Course**
- **acast** - **acast**
- **acast:channel** - **acast:channel**
- **AddAnime**
- **ADN**: Anime Digital Network - **ADN**: Anime Digital Network
- **AdobeConnect** - **AdobeConnect**
- **adobetv** - **AdobeTV**
- **adobetv:channel** - **AdobeTVChannel**
- **adobetv:embed** - **AdobeTVShow**
- **adobetv:show** - **AdobeTVVideo**
- **adobetv:video**
- **AdultSwim** - **AdultSwim**
- **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault - **aenetworks**: A+E Networks: A&E, Lifetime, History.com, FYI Network and History Vault
- **afreecatv**: afreecatv.com - **afreecatv**: afreecatv.com
@@ -58,8 +58,16 @@
- **ARD:mediathek** - **ARD:mediathek**
- **ARDBetaMediathek** - **ARDBetaMediathek**
- **Arkena** - **Arkena**
- **arte.tv**
- **arte.tv:+7** - **arte.tv:+7**
- **arte.tv:cinema**
- **arte.tv:concert**
- **arte.tv:creative**
- **arte.tv:ddc**
- **arte.tv:embed** - **arte.tv:embed**
- **arte.tv:future**
- **arte.tv:info**
- **arte.tv:magazine**
- **arte.tv:playlist** - **arte.tv:playlist**
- **AsianCrush** - **AsianCrush**
- **AsianCrushPlaylist** - **AsianCrushPlaylist**
@@ -70,12 +78,15 @@
- **AudioBoom** - **AudioBoom**
- **audiomack** - **audiomack**
- **audiomack:album** - **audiomack:album**
- **auroravid**: AuroraVid
- **AWAAN** - **AWAAN**
- **awaan:live** - **awaan:live**
- **awaan:season** - **awaan:season**
- **awaan:video** - **awaan:video**
- **AZMedien**: AZ Medien videos - **AZMedien**: AZ Medien videos
- **BaiduVideo**: 百度视频 - **BaiduVideo**: 百度视频
- **bambuser**
- **bambuser:channel**
- **Bandcamp** - **Bandcamp**
- **Bandcamp:album** - **Bandcamp:album**
- **Bandcamp:weekly** - **Bandcamp:weekly**
@@ -96,9 +107,6 @@
- **Bigflix** - **Bigflix**
- **Bild**: Bild.de - **Bild**: Bild.de
- **BiliBili** - **BiliBili**
- **BilibiliAudio**
- **BilibiliAudioAlbum**
- **BiliBiliPlayer**
- **BioBioChileTV** - **BioBioChileTV**
- **BIQLE** - **BIQLE**
- **BitChute** - **BitChute**
@@ -142,7 +150,6 @@
- **CBSInteractive** - **CBSInteractive**
- **CBSLocal** - **CBSLocal**
- **cbsnews**: CBS News - **cbsnews**: CBS News
- **cbsnews:embed**
- **cbsnews:livevideo**: CBS News Live Videos - **cbsnews:livevideo**: CBS News Live Videos
- **CBSSports** - **CBSSports**
- **CCMA** - **CCMA**
@@ -157,7 +164,6 @@
- **chirbit** - **chirbit**
- **chirbit:profile** - **chirbit:profile**
- **Cinchcast** - **Cinchcast**
- **Cinemax**
- **CiscoLiveSearch** - **CiscoLiveSearch**
- **CiscoLiveSession** - **CiscoLiveSession**
- **CJSW** - **CJSW**
@@ -167,6 +173,7 @@
- **Clipsyndicate** - **Clipsyndicate**
- **CloserToTruth** - **CloserToTruth**
- **CloudflareStream** - **CloudflareStream**
- **cloudtime**: CloudTime
- **Cloudy** - **Cloudy**
- **Clubic** - **Clubic**
- **Clyp** - **Clyp**
@@ -176,16 +183,17 @@
- **CNN** - **CNN**
- **CNNArticle** - **CNNArticle**
- **CNNBlogs** - **CNNBlogs**
- **ComCarCoff**
- **ComedyCentral** - **ComedyCentral**
- **ComedyCentralFullEpisodes** - **ComedyCentralFullEpisodes**
- **ComedyCentralShortname** - **ComedyCentralShortname**
- **ComedyCentralTV** - **ComedyCentralTV**
- **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED - **CondeNast**: Condé Nast media group: Allure, Architectural Digest, Ars Technica, Bon Appétit, Brides, Condé Nast, Condé Nast Traveler, Details, Epicurious, GQ, Glamour, Golf Digest, SELF, Teen Vogue, The New Yorker, Vanity Fair, Vogue, W Magazine, WIRED
- **CONtv**
- **Corus** - **Corus**
- **Coub** - **Coub**
- **Cracked** - **Cracked**
- **Crackle** - **Crackle**
- **Criterion**
- **CrooksAndLiars** - **CrooksAndLiars**
- **crunchyroll** - **crunchyroll**
- **crunchyroll:playlist** - **crunchyroll:playlist**
@@ -193,7 +201,6 @@
- **CSpan**: C-SPAN - **CSpan**: C-SPAN
- **CtsNews**: 華視新聞 - **CtsNews**: 華視新聞
- **CTVNews** - **CTVNews**
- **cu.ntv.co.jp**: Nippon Television Network
- **Culturebox** - **Culturebox**
- **CultureUnplugged** - **CultureUnplugged**
- **curiositystream** - **curiositystream**
@@ -203,6 +210,8 @@
- **dailymotion** - **dailymotion**
- **dailymotion:playlist** - **dailymotion:playlist**
- **dailymotion:user** - **dailymotion:user**
- **DaisukiMotto**
- **DaisukiMottoPlaylist**
- **daum.net** - **daum.net**
- **daum.net:clip** - **daum.net:clip**
- **daum.net:playlist** - **daum.net:playlist**
@@ -222,12 +231,13 @@
- **DiscoveryNetworksDe** - **DiscoveryNetworksDe**
- **DiscoveryVR** - **DiscoveryVR**
- **Disney** - **Disney**
- **dlive:stream**
- **dlive:vod**
- **Dotsub** - **Dotsub**
- **DouyuShow** - **DouyuShow**
- **DouyuTV**: 斗鱼 - **DouyuTV**: 斗鱼
- **DPlay** - **DPlay**
- **DPlayIt**
- **dramafever**
- **dramafever:series**
- **DRBonanza** - **DRBonanza**
- **Dropbox** - **Dropbox**
- **DrTuber** - **DrTuber**
@@ -280,12 +290,12 @@
- **FiveThirtyEight** - **FiveThirtyEight**
- **FiveTV** - **FiveTV**
- **Flickr** - **Flickr**
- **Flipagram**
- **Folketinget**: Folketinget (ft.dk; Danish parliament) - **Folketinget**: Folketinget (ft.dk; Danish parliament)
- **FootyRoom** - **FootyRoom**
- **Formula1** - **Formula1**
- **FOX** - **FOX**
- **FOX9** - **FOX9**
- **FOX9News**
- **Foxgay** - **Foxgay**
- **foxnews**: Fox News and Fox Business Video - **foxnews**: Fox News and Fox Business Video
- **foxnews:article** - **foxnews:article**
@@ -305,12 +315,16 @@
- **FrontendMastersCourse** - **FrontendMastersCourse**
- **FrontendMastersLesson** - **FrontendMastersLesson**
- **Funimation** - **Funimation**
- **Funk** - **FunkChannel**
- **FunkMix**
- **FunnyOrDie**
- **Fusion** - **Fusion**
- **Fux** - **Fux**
- **FXNetworks** - **FXNetworks**
- **Gaia** - **Gaia**
- **GameInformer** - **GameInformer**
- **GameOne**
- **gameone:playlist**
- **GameSpot** - **GameSpot**
- **GameStar** - **GameStar**
- **Gaskrank** - **Gaskrank**
@@ -325,12 +339,14 @@
- **Globo** - **Globo**
- **GloboArticle** - **GloboArticle**
- **Go** - **Go**
- **Go90**
- **GodTube** - **GodTube**
- **Golem** - **Golem**
- **GoogleDrive** - **GoogleDrive**
- **Goshgay** - **Goshgay**
- **GPUTechConf** - **GPUTechConf**
- **Groupon** - **Groupon**
- **Hark**
- **hbo** - **hbo**
- **HearThisAt** - **HearThisAt**
- **Heise** - **Heise**
@@ -359,6 +375,7 @@
- **Hungama** - **Hungama**
- **HungamaSong** - **HungamaSong**
- **Hypem** - **Hypem**
- **Iconosquare**
- **ign.com** - **ign.com**
- **imdb**: Internet Movie Database trailers - **imdb**: Internet Movie Database trailers
- **imdb:list**: Internet Movie Database lists - **imdb:list**: Internet Movie Database lists
@@ -390,6 +407,7 @@
- **JeuxVideo** - **JeuxVideo**
- **Joj** - **Joj**
- **Jove** - **Jove**
- **jpopsuki.tv**
- **JWPlatform** - **JWPlatform**
- **Kakao** - **Kakao**
- **Kaltura** - **Kaltura**
@@ -397,14 +415,14 @@
- **Kankan** - **Kankan**
- **Karaoketv** - **Karaoketv**
- **KarriereVideos** - **KarriereVideos**
- **Katsomo** - **keek**
- **KeezMovies** - **KeezMovies**
- **Ketnet** - **Ketnet**
- **KhanAcademy** - **KhanAcademy**
- **KickStarter** - **KickStarter**
- **KinjaEmbed**
- **KinoPoisk** - **KinoPoisk**
- **KonserthusetPlay** - **KonserthusetPlay**
- **kontrtube**: KontrTube.ru - Труба зовёт
- **KrasView**: Красвью - **KrasView**: Красвью
- **Ku6** - **Ku6**
- **KUSI** - **KUSI**
@@ -421,6 +439,7 @@
- **Lcp** - **Lcp**
- **LcpPlay** - **LcpPlay**
- **Le**: 乐视网 - **Le**: 乐视网
- **Learnr**
- **Lecture2Go** - **Lecture2Go**
- **Lecturio** - **Lecturio**
- **LecturioCourse** - **LecturioCourse**
@@ -441,7 +460,6 @@
- **linkedin:learning:course** - **linkedin:learning:course**
- **LinuxAcademy** - **LinuxAcademy**
- **LiTV** - **LiTV**
- **LiveJournal**
- **LiveLeak** - **LiveLeak**
- **LiveLeakEmbed** - **LiveLeakEmbed**
- **livestream** - **livestream**
@@ -454,9 +472,11 @@
- **lynda**: lynda.com videos - **lynda**: lynda.com videos
- **lynda:course**: lynda.com online courses - **lynda:course**: lynda.com online courses
- **m6** - **m6**
- **macgamestore**: MacGameStore trailers
- **mailru**: Видео@Mail.Ru - **mailru**: Видео@Mail.Ru
- **mailru:music**: Музыка@Mail.Ru - **mailru:music**: Музыка@Mail.Ru
- **mailru:music:search**: Музыка@Mail.Ru - **mailru:music:search**: Музыка@Mail.Ru
- **MakerTV**
- **MallTV** - **MallTV**
- **mangomolo:live** - **mangomolo:live**
- **mangomolo:video** - **mangomolo:video**
@@ -467,7 +487,6 @@
- **MatchTV** - **MatchTV**
- **MDR**: MDR.DE and KiKA - **MDR**: MDR.DE and KiKA
- **media.ccc.de** - **media.ccc.de**
- **media.ccc.de:lists**
- **Medialaan** - **Medialaan**
- **Mediaset** - **Mediaset**
- **Mediasite** - **Mediasite**
@@ -483,12 +502,14 @@
- **Mgoon** - **Mgoon**
- **MGTV**: 芒果TV - **MGTV**: 芒果TV
- **MiaoPai** - **MiaoPai**
- **Minhateca**
- **MinistryGrid** - **MinistryGrid**
- **Minoto** - **Minoto**
- **miomio.tv** - **miomio.tv**
- **MiTele**: mitele.es - **MiTele**: mitele.es
- **mixcloud** - **mixcloud**
- **mixcloud:playlist** - **mixcloud:playlist**
- **mixcloud:stream**
- **mixcloud:user** - **mixcloud:user**
- **Mixer:live** - **Mixer:live**
- **Mixer:vod** - **Mixer:vod**
@@ -497,7 +518,6 @@
- **MNetTV** - **MNetTV**
- **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net - **MoeVideo**: LetitBit video services: moevideo.net, playreplay.net and videochart.net
- **Mofosex** - **Mofosex**
- **MofosexEmbed**
- **Mojvideo** - **Mojvideo**
- **Morningstar**: morningstar.com - **Morningstar**: morningstar.com
- **Motherless** - **Motherless**
@@ -511,10 +531,11 @@
- **mtg**: MTG services - **mtg**: MTG services
- **mtv** - **mtv**
- **mtv.de** - **mtv.de**
- **mtv81**
- **mtv:video** - **mtv:video**
- **mtvjapan**
- **mtvservices:embedded** - **mtvservices:embedded**
- **MuenchenTV**: münchen.tv - **MuenchenTV**: münchen.tv
- **MusicPlayOn**
- **mva**: Microsoft Virtual Academy videos - **mva**: Microsoft Virtual Academy videos
- **mva:course**: Microsoft Virtual Academy courses - **mva:course**: Microsoft Virtual Academy courses
- **Mwave** - **Mwave**
@@ -561,6 +582,7 @@
- **NextTV**: 壹電視 - **NextTV**: 壹電視
- **Nexx** - **Nexx**
- **NexxEmbed** - **NexxEmbed**
- **nfb**: National Film Board of Canada
- **nfl.com** - **nfl.com**
- **NhkVod** - **NhkVod**
- **nhl.com** - **nhl.com**
@@ -586,6 +608,7 @@
- **nowness** - **nowness**
- **nowness:playlist** - **nowness:playlist**
- **nowness:series** - **nowness:series**
- **nowvideo**: NowVideo
- **Noz** - **Noz**
- **npo**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **npo**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **npo.nl:live** - **npo.nl:live**
@@ -601,7 +624,6 @@
- **NRKTVEpisodes** - **NRKTVEpisodes**
- **NRKTVSeason** - **NRKTVSeason**
- **NRKTVSeries** - **NRKTVSeries**
- **NRLTV**
- **ntv.ru** - **ntv.ru**
- **Nuvid** - **Nuvid**
- **NYTimes** - **NYTimes**
@@ -619,26 +641,18 @@
- **OnionStudios** - **OnionStudios**
- **Ooyala** - **Ooyala**
- **OoyalaExternal** - **OoyalaExternal**
- **Openload**
- **OraTV** - **OraTV**
- **orf:burgenland**: Radio Burgenland
- **orf:fm4**: radio FM4 - **orf:fm4**: radio FM4
- **orf:fm4:story**: fm4.orf.at stories - **orf:fm4:story**: fm4.orf.at stories
- **orf:iptv**: iptv.ORF.at - **orf:iptv**: iptv.ORF.at
- **orf:kaernten**: Radio Kärnten
- **orf:noe**: Radio Niederösterreich
- **orf:oberoesterreich**: Radio Oberösterreich
- **orf:oe1**: Radio Österreich 1 - **orf:oe1**: Radio Österreich 1
- **orf:oe3**: Radio Österreich 3
- **orf:salzburg**: Radio Salzburg
- **orf:steiermark**: Radio Steiermark
- **orf:tirol**: Radio Tirol
- **orf:tvthek**: ORF TVthek - **orf:tvthek**: ORF TVthek
- **orf:vorarlberg**: Radio Vorarlberg
- **orf:wien**: Radio Wien
- **OsnatelTV** - **OsnatelTV**
- **OutsideTV** - **OutsideTV**
- **PacktPub** - **PacktPub**
- **PacktPubCourse** - **PacktPubCourse**
- **PandaTV**: 熊猫TV
- **pandora.tv**: 판도라TV - **pandora.tv**: 판도라TV
- **ParamountNetwork** - **ParamountNetwork**
- **parliamentlive.tv**: UK parliament videos - **parliamentlive.tv**: UK parliament videos
@@ -674,20 +688,20 @@
- **Pokemon** - **Pokemon**
- **PolskieRadio** - **PolskieRadio**
- **PolskieRadioCategory** - **PolskieRadioCategory**
- **Popcorntimes**
- **PopcornTV** - **PopcornTV**
- **PornCom** - **PornCom**
- **PornerBros** - **PornerBros**
- **PornFlip**
- **PornHd** - **PornHd**
- **PornHub**: PornHub and Thumbzilla - **PornHub**: PornHub and Thumbzilla
- **PornHubPagedVideoList** - **PornHubPlaylist**
- **PornHubUser** - **PornHubUserVideos**
- **PornHubUserVideosUpload**
- **Pornotube** - **Pornotube**
- **PornoVoisines** - **PornoVoisines**
- **PornoXO** - **PornoXO**
- **PornTube** - **PornTube**
- **PressTV** - **PressTV**
- **PromptFile**
- **prosiebensat1**: ProSiebenSat.1 Digital - **prosiebensat1**: ProSiebenSat.1 Digital
- **puhutv** - **puhutv**
- **puhutv:serie** - **puhutv:serie**
@ -717,10 +731,7 @@
- **RayWenderlichCourse** - **RayWenderlichCourse**
- **RBMARadio** - **RBMARadio**
- **RDS**: RDS.ca - **RDS**: RDS.ca
- **RedBull**
- **RedBullEmbed**
- **RedBullTV** - **RedBullTV**
- **RedBullTVRrnContent**
- **Reddit** - **Reddit**
- **RedditR** - **RedditR**
- **RedTube** - **RedTube**
@ -730,6 +741,8 @@
- **Restudy** - **Restudy**
- **Reuters** - **Reuters**
- **ReverbNation** - **ReverbNation**
- **revision**
- **revision3:embed**
- **RICE** - **RICE**
- **RMCDecouverte** - **RMCDecouverte**
- **RockstarGames** - **RockstarGames**
@ -752,6 +765,7 @@
- **rtve.es:television** - **rtve.es:television**
- **RTVNH** - **RTVNH**
- **RTVS** - **RTVS**
- **Rudo**
- **RUHD** - **RUHD**
- **rutube**: Rutube videos - **rutube**: Rutube videos
- **rutube:channel**: Rutube channels - **rutube:channel**: Rutube channels
@ -774,13 +788,11 @@
- **screen.yahoo:search**: Yahoo screen search - **screen.yahoo:search**: Yahoo screen search
- **Screencast** - **Screencast**
- **ScreencastOMatic** - **ScreencastOMatic**
- **ScrippsNetworks**
- **scrippsnetworks:watch** - **scrippsnetworks:watch**
- **SCTE**
- **SCTECourse**
- **Seeker** - **Seeker**
- **SenateISVP** - **SenateISVP**
- **SendtoNews** - **SendtoNews**
- **ServingSys**
- **Servus** - **Servus**
- **Sexu** - **Sexu**
- **SeznamZpravy** - **SeznamZpravy**
@ -791,7 +803,6 @@
- **ShowRoomLive** - **ShowRoomLive**
- **Sina** - **Sina**
- **SkylineWebcams** - **SkylineWebcams**
- **SkyNews**
- **skynewsarabia:article** - **skynewsarabia:article**
- **skynewsarabia:video** - **skynewsarabia:video**
- **SkySports** - **SkySports**
@ -811,7 +822,6 @@
- **soundcloud:set** - **soundcloud:set**
- **soundcloud:trackstation** - **soundcloud:trackstation**
- **soundcloud:user** - **soundcloud:user**
- **SoundcloudEmbed**
- **soundgasm** - **soundgasm**
- **soundgasm:profile** - **soundgasm:profile**
- **southpark.cc.com** - **southpark.cc.com**
@ -838,14 +848,13 @@
- **Steam** - **Steam**
- **Stitcher** - **Stitcher**
- **Streamable** - **Streamable**
- **Streamango**
- **streamcloud.eu** - **streamcloud.eu**
- **StreamCZ** - **StreamCZ**
- **StreetVoice** - **StreetVoice**
- **StretchInternet** - **StretchInternet**
- **stv:player** - **stv:player**
- **SunPorno** - **SunPorno**
- **sverigesradio:episode**
- **sverigesradio:publication**
- **SVT** - **SVT**
- **SVTPage** - **SVTPage**
- **SVTPlay**: SVT Play and Öppet arkiv - **SVTPlay**: SVT Play and Öppet arkiv
@ -879,14 +888,13 @@
- **TeleQuebec** - **TeleQuebec**
- **TeleQuebecEmission** - **TeleQuebecEmission**
- **TeleQuebecLive** - **TeleQuebecLive**
- **TeleQuebecSquat**
- **TeleTask** - **TeleTask**
- **Telewebion** - **Telewebion**
- **TennisTV** - **TennisTV**
- **TenPlay**
- **TF1** - **TF1**
- **TFO** - **TFO**
- **TheIntercept** - **TheIntercept**
- **theoperaplatform**
- **ThePlatform** - **ThePlatform**
- **ThePlatformFeed** - **ThePlatformFeed**
- **TheScene** - **TheScene**
@ -922,12 +930,11 @@
- **tunein:topic** - **tunein:topic**
- **TunePk** - **TunePk**
- **Turbo** - **Turbo**
- **Tutv**
- **tv.dfb.de** - **tv.dfb.de**
- **TV2** - **TV2**
- **tv2.hu** - **tv2.hu**
- **TV2Article** - **TV2Article**
- **TV2DK**
- **TV2DKBornholmPlay**
- **TV4**: tv4.se and tv4play.se - **TV4**: tv4.se and tv4play.se
- **TV5MondePlus**: TV5MONDE+ - **TV5MondePlus**: TV5MONDE+
- **TVA** - **TVA**
@ -952,21 +959,22 @@
- **TVPlayHome** - **TVPlayHome**
- **Tweakers** - **Tweakers**
- **TwitCasting** - **TwitCasting**
- **twitch:chapter**
- **twitch:clips** - **twitch:clips**
- **twitch:profile**
- **twitch:stream** - **twitch:stream**
- **twitch:video**
- **twitch:videos:all**
- **twitch:videos:highlights**
- **twitch:videos:past-broadcasts**
- **twitch:videos:uploads**
- **twitch:vod** - **twitch:vod**
- **TwitchCollection**
- **TwitchVideos**
- **TwitchVideosClips**
- **TwitchVideosCollections**
- **twitter** - **twitter**
- **twitter:amplify** - **twitter:amplify**
- **twitter:broadcast**
- **twitter:card** - **twitter:card**
- **udemy** - **udemy**
- **udemy:course** - **udemy:course**
- **UDNEmbed**: 聯合影音 - **UDNEmbed**: 聯合影音
- **UFCArabia**
- **UFCTV** - **UFCTV**
- **UKTVPlay** - **UKTVPlay**
- **umg:de**: Universal Music Deutschland - **umg:de**: Universal Music Deutschland
@ -987,6 +995,7 @@
- **Vbox7** - **Vbox7**
- **VeeHD** - **VeeHD**
- **Veoh** - **Veoh**
- **Vessel**
- **Vesti**: Вести.Ru - **Vesti**: Вести.Ru
- **Vevo** - **Vevo**
- **VevoPlaylist** - **VevoPlaylist**
@ -1001,12 +1010,15 @@
- **Viddler** - **Viddler**
- **Videa** - **Videa**
- **video.google:search**: Google Video search - **video.google:search**: Google Video search
- **video.mit.edu**
- **VideoDetective** - **VideoDetective**
- **videofy.me** - **videofy.me**
- **videomore** - **videomore**
- **videomore:season** - **videomore:season**
- **videomore:video** - **videomore:video**
- **VideoPremium**
- **VideoPress** - **VideoPress**
- **videoweed**: VideoWeed
- **Vidio** - **Vidio**
- **VidLii** - **VidLii**
- **vidme** - **vidme**
@ -1015,8 +1027,9 @@
- **Vidzi** - **Vidzi**
- **vier**: vier.be and vijf.be - **vier**: vier.be and vijf.be
- **vier:videos** - **vier:videos**
- **viewlift** - **ViewLift**
- **viewlift:embed** - **ViewLiftEmbed**
- **Viewster**
- **Viidea** - **Viidea**
- **viki** - **viki**
- **viki:channel** - **viki:channel**
@ -1052,7 +1065,7 @@
- **VoxMediaVolume** - **VoxMediaVolume**
- **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **vpro**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **Vrak** - **Vrak**
- **VRT**: VRT NWS, Flanders News, Flandern Info and Sporza - **VRT**: deredactie.be, sporza.be, cobra.be and cobra.canvas.be
- **VrtNU**: VrtNU.be - **VrtNU**: VrtNU.be
- **vrv** - **vrv**
- **vrv:series** - **vrv:series**
@ -1082,18 +1095,21 @@
- **Weibo** - **Weibo**
- **WeiboMobile** - **WeiboMobile**
- **WeiqiTV**: WQTV - **WeiqiTV**: WQTV
- **wholecloud**: WholeCloud
- **Wimp**
- **Wistia** - **Wistia**
- **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl - **wnl**: npo.nl, ntr.nl, omroepwnl.nl, zapp.nl and npo3.nl
- **WorldStarHipHop** - **WorldStarHipHop**
- **wrzuta.pl**
- **wrzuta.pl:playlist**
- **WSJ**: Wall Street Journal - **WSJ**: Wall Street Journal
- **WSJArticle** - **WSJArticle**
- **WWE** - **WWE**
- **XBef** - **XBef**
- **XboxClips** - **XboxClips**
- **XFileShare**: XFileShare based sites: ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, XVideoSharing - **XFileShare**: XFileShare based sites: DaClips, FileHoot, GorillaVid, MovPod, PowerWatch, Rapidvideo.ws, TheVideoBee, Vidto, Streamin.To, XVIDSTAGE, Vid ABC, VidBom, vidlo, RapidVideo.TV, FastVideo.me
- **XHamster** - **XHamster**
- **XHamsterEmbed** - **XHamsterEmbed**
- **XHamsterUser**
- **xiami:album**: 虾米音乐 - 专辑 - **xiami:album**: 虾米音乐 - 专辑
- **xiami:artist**: 虾米音乐 - 歌手 - **xiami:artist**: 虾米音乐 - 歌手
- **xiami:collection**: 虾米音乐 - 精选集 - **xiami:collection**: 虾米音乐 - 精选集
@ -1111,7 +1127,6 @@
- **Yahoo**: Yahoo screen and movies - **Yahoo**: Yahoo screen and movies
- **yahoo:gyao** - **yahoo:gyao**
- **yahoo:gyao:player** - **yahoo:gyao:player**
- **yahoo:japannews**: Yahoo! Japan News
- **YandexDisk** - **YandexDisk**
- **yandexmusic:album**: Яндекс.Музыка - Альбом - **yandexmusic:album**: Яндекс.Музыка - Альбом
- **yandexmusic:playlist**: Яндекс.Музыка - Плейлист - **yandexmusic:playlist**: Яндекс.Музыка - Плейлист

@ -3,4 +3,4 @ universal = True
 [flake8]
 exclude = youtube_dl/extractor/__init__.py,devscripts/buildserver.py,devscripts/lazy_load_template.py,devscripts/make_issue_template.py,setup.py,build,.git,venv
-ignore = E402,E501,E731,E741,W503
+ignore = E402,E501,E731,E741

@ -816,15 +816,11 @@ class TestYoutubeDL(unittest.TestCase):
'webpage_url': 'http://example.com', 'webpage_url': 'http://example.com',
} }
def get_downloaded_info_dicts(params):
ydl = YDL(params)
# make a deep copy because the dictionary and nested entries
# can be modified
ydl.process_ie_result(copy.deepcopy(playlist))
return ydl.downloaded_info_dicts
def get_ids(params): def get_ids(params):
return [int(v['id']) for v in get_downloaded_info_dicts(params)] ydl = YDL(params)
# make a copy because the dictionary can be modified
ydl.process_ie_result(playlist.copy())
return [int(v['id']) for v in ydl.downloaded_info_dicts]
result = get_ids({}) result = get_ids({})
self.assertEqual(result, [1, 2, 3, 4]) self.assertEqual(result, [1, 2, 3, 4])
@ -856,22 +852,6 @@ class TestYoutubeDL(unittest.TestCase):
result = get_ids({'playlist_items': '2-4,3-4,3'}) result = get_ids({'playlist_items': '2-4,3-4,3'})
self.assertEqual(result, [2, 3, 4]) self.assertEqual(result, [2, 3, 4])
# Tests for https://github.com/ytdl-org/youtube-dl/issues/10591
# @{
result = get_downloaded_info_dicts({'playlist_items': '2-4,3-4,3'})
self.assertEqual(result[0]['playlist_index'], 2)
self.assertEqual(result[1]['playlist_index'], 3)
result = get_downloaded_info_dicts({'playlist_items': '2-4,3-4,3'})
self.assertEqual(result[0]['playlist_index'], 2)
self.assertEqual(result[1]['playlist_index'], 3)
self.assertEqual(result[2]['playlist_index'], 4)
result = get_downloaded_info_dicts({'playlist_items': '4,2'})
self.assertEqual(result[0]['playlist_index'], 4)
self.assertEqual(result[1]['playlist_index'], 2)
# @}
def test_urlopen_no_file_protocol(self): def test_urlopen_no_file_protocol(self):
# see https://github.com/ytdl-org/youtube-dl/issues/8227 # see https://github.com/ytdl-org/youtube-dl/issues/8227
ydl = YDL() ydl = YDL()
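
The `playlist_items` assertions in this file (`'2-4,3-4,3'` yielding ids `[2, 3, 4]`, and `'4,2'` keeping the order 4, then 2) suggest that the specifier is expanded range by range and de-duplicated in first-seen order. A minimal sketch of that behaviour follows; `expand_playlist_items` is a hypothetical helper written for illustration, not the function youtube-dl itself uses.

```python
def expand_playlist_items(spec):
    """Expand a spec such as '2-4,3-4,3' into ordered, de-duplicated indices."""
    seen = set()
    result = []
    for segment in spec.split(','):
        segment = segment.strip()
        if '-' in segment:
            # 'start-end' is an inclusive range
            start, end = (int(part) for part in segment.split('-', 1))
            candidates = range(start, end + 1)
        else:
            candidates = [int(segment)]
        for index in candidates:
            # keep only the first occurrence, preserving request order
            if index not in seen:
                seen.add(index)
                result.append(index)
    return result


print(expand_playlist_items('2-4,3-4,3'))
print(expand_playlist_items('4,2'))
```

With a list like this available, a `playlist_index` can be taken from the requested item number instead of the running position, which is the distinction the `playlist_index` assertions in the hunk above exercise.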

@ -39,13 +39,6 @@ class TestYoutubeDLCookieJar(unittest.TestCase):
assert_cookie_has_value('HTTPONLY_COOKIE') assert_cookie_has_value('HTTPONLY_COOKIE')
assert_cookie_has_value('JS_ACCESSIBLE_COOKIE') assert_cookie_has_value('JS_ACCESSIBLE_COOKIE')
def test_malformed_cookies(self):
cookiejar = YoutubeDLCookieJar('./test/testdata/cookies/malformed_cookies.txt')
cookiejar.load(ignore_discard=True, ignore_expires=True)
# Cookies should be empty since all malformed cookie file entries
# will be ignored
self.assertFalse(cookiejar._cookies)
if __name__ == '__main__': if __name__ == '__main__':
unittest.main() unittest.main()

@ -44,16 +44,16 @@ class TestAES(unittest.TestCase):
def test_decrypt_text(self): def test_decrypt_text(self):
password = intlist_to_bytes(self.key).decode('utf-8') password = intlist_to_bytes(self.key).decode('utf-8')
encrypted = base64.b64encode( encrypted = base64.b64encode(
intlist_to_bytes(self.iv[:8]) intlist_to_bytes(self.iv[:8]) +
+ b'\x17\x15\x93\xab\x8d\x80V\xcdV\xe0\t\xcdo\xc2\xa5\xd8ksM\r\xe27N\xae' b'\x17\x15\x93\xab\x8d\x80V\xcdV\xe0\t\xcdo\xc2\xa5\xd8ksM\r\xe27N\xae'
).decode('utf-8') ).decode('utf-8')
decrypted = (aes_decrypt_text(encrypted, password, 16)) decrypted = (aes_decrypt_text(encrypted, password, 16))
self.assertEqual(decrypted, self.secret_msg) self.assertEqual(decrypted, self.secret_msg)
password = intlist_to_bytes(self.key).decode('utf-8') password = intlist_to_bytes(self.key).decode('utf-8')
encrypted = base64.b64encode( encrypted = base64.b64encode(
intlist_to_bytes(self.iv[:8]) intlist_to_bytes(self.iv[:8]) +
+ b'\x0b\xe6\xa4\xd9z\x0e\xb8\xb9\xd0\xd4i_\x85\x1d\x99\x98_\xe5\x80\xe7.\xbf\xa5\x83' b'\x0b\xe6\xa4\xd9z\x0e\xb8\xb9\xd0\xd4i_\x85\x1d\x99\x98_\xe5\x80\xe7.\xbf\xa5\x83'
).decode('utf-8') ).decode('utf-8')
decrypted = (aes_decrypt_text(encrypted, password, 32)) decrypted = (aes_decrypt_text(encrypted, password, 32))
self.assertEqual(decrypted, self.secret_msg) self.assertEqual(decrypted, self.secret_msg)

@ -123,6 +123,12 @@ class TestAllURLsMatching(unittest.TestCase):
self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['pbs']) self.assertMatch('http://video.pbs.org/viralplayer/2365173446/', ['pbs'])
self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['pbs']) self.assertMatch('http://video.pbs.org/widget/partnerplayer/980042464/', ['pbs'])
def test_yahoo_https(self):
# https://github.com/ytdl-org/youtube-dl/issues/2701
self.assertMatch(
'https://screen.yahoo.com/smartwatches-latest-wearable-gadgets-163745379-cbs.html',
['Yahoo'])
def test_no_duplicated_ie_names(self): def test_no_duplicated_ie_names(self):
name_accu = collections.defaultdict(list) name_accu = collections.defaultdict(list)
for ie in self.ies: for ie in self.ies:

@ -26,6 +26,7 @@ from youtube_dl.extractor import (
ThePlatformIE, ThePlatformIE,
ThePlatformFeedIE, ThePlatformFeedIE,
RTVEALaCartaIE, RTVEALaCartaIE,
FunnyOrDieIE,
DemocracynowIE, DemocracynowIE,
) )
@ -321,6 +322,18 @@ class TestRtveSubtitles(BaseTestSubtitles):
self.assertEqual(md5(subtitles['es']), '69e70cae2d40574fb7316f31d6eb7fca') self.assertEqual(md5(subtitles['es']), '69e70cae2d40574fb7316f31d6eb7fca')
class TestFunnyOrDieSubtitles(BaseTestSubtitles):
url = 'http://www.funnyordie.com/videos/224829ff6d/judd-apatow-will-direct-your-vine'
IE = FunnyOrDieIE
def test_allsubtitles(self):
self.DL.params['writesubtitles'] = True
self.DL.params['allsubtitles'] = True
subtitles = self.getSubtitles()
self.assertEqual(set(subtitles.keys()), set(['en']))
self.assertEqual(md5(subtitles['en']), 'c5593c193eacd353596c11c2d4f9ecc4')
class TestDemocracynowSubtitles(BaseTestSubtitles): class TestDemocracynowSubtitles(BaseTestSubtitles):
url = 'http://www.democracynow.org/shows/2015/7/3' url = 'http://www.democracynow.org/shows/2015/7/3'
IE = DemocracynowIE IE = DemocracynowIE

@ -34,8 +34,8 @@ def _make_testfunc(testfile):
def test_func(self): def test_func(self):
as_file = os.path.join(TEST_DIR, testfile) as_file = os.path.join(TEST_DIR, testfile)
swf_file = os.path.join(TEST_DIR, test_id + '.swf') swf_file = os.path.join(TEST_DIR, test_id + '.swf')
if ((not os.path.exists(swf_file)) if ((not os.path.exists(swf_file)) or
or os.path.getmtime(swf_file) < os.path.getmtime(as_file)): os.path.getmtime(swf_file) < os.path.getmtime(as_file)):
# Recompile # Recompile
try: try:
subprocess.check_call([ subprocess.check_call([

@ -19,7 +19,6 @@ from youtube_dl.utils import (
age_restricted, age_restricted,
args_to_str, args_to_str,
encode_base_n, encode_base_n,
caesar,
clean_html, clean_html,
date_from_str, date_from_str,
DateRange, DateRange,
@ -70,13 +69,10 @@ from youtube_dl.utils import (
remove_start, remove_start,
remove_end, remove_end,
remove_quotes, remove_quotes,
rot47,
shell_quote, shell_quote,
smuggle_url, smuggle_url,
str_to_int, str_to_int,
strip_jsonp, strip_jsonp,
strip_or_none,
subtitles_filename,
timeconvert, timeconvert,
unescapeHTML, unescapeHTML,
unified_strdate, unified_strdate,
@ -187,7 +183,7 @@ class TestUtil(unittest.TestCase):
self.assertEqual(sanitize_filename( self.assertEqual(sanitize_filename(
'ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖŐØŒÙÚÛÜŰÝÞßàáâãäåæçèéêëìíîïðñòóôõöőøœùúûüűýþÿ', restricted=True), 'ÂÃÄÀÁÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖŐØŒÙÚÛÜŰÝÞßàáâãäåæçèéêëìíîïðñòóôõöőøœùúûüűýþÿ', restricted=True),
'AAAAAAAECEEEEIIIIDNOOOOOOOOEUUUUUYTHssaaaaaaaeceeeeiiiionooooooooeuuuuuythy') 'AAAAAAAECEEEEIIIIDNOOOOOOOOEUUUUUYPssaaaaaaaeceeeeiiiionooooooooeuuuuuypy')
def test_sanitize_ids(self): def test_sanitize_ids(self):
self.assertEqual(sanitize_filename('_n_cd26wFpw', is_id=True), '_n_cd26wFpw') self.assertEqual(sanitize_filename('_n_cd26wFpw', is_id=True), '_n_cd26wFpw')
@ -264,11 +260,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(replace_extension('.abc', 'temp'), '.abc.temp') self.assertEqual(replace_extension('.abc', 'temp'), '.abc.temp')
self.assertEqual(replace_extension('.abc.ext', 'temp'), '.abc.temp') self.assertEqual(replace_extension('.abc.ext', 'temp'), '.abc.temp')
def test_subtitles_filename(self):
self.assertEqual(subtitles_filename('abc.ext', 'en', 'vtt'), 'abc.en.vtt')
self.assertEqual(subtitles_filename('abc.ext', 'en', 'vtt', 'ext'), 'abc.en.vtt')
self.assertEqual(subtitles_filename('abc.unexpected_ext', 'en', 'vtt', 'ext'), 'abc.unexpected_ext.en.vtt')
def test_remove_start(self): def test_remove_start(self):
self.assertEqual(remove_start(None, 'A - '), None) self.assertEqual(remove_start(None, 'A - '), None)
self.assertEqual(remove_start('A - B', 'A - '), 'B') self.assertEqual(remove_start('A - B', 'A - '), 'B')
@ -342,8 +333,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(unified_strdate('July 15th, 2013'), '20130715') self.assertEqual(unified_strdate('July 15th, 2013'), '20130715')
self.assertEqual(unified_strdate('September 1st, 2013'), '20130901') self.assertEqual(unified_strdate('September 1st, 2013'), '20130901')
self.assertEqual(unified_strdate('Sep 2nd, 2013'), '20130902') self.assertEqual(unified_strdate('Sep 2nd, 2013'), '20130902')
self.assertEqual(unified_strdate('November 3rd, 2019'), '20191103')
self.assertEqual(unified_strdate('October 23rd, 2005'), '20051023')
def test_unified_timestamps(self): def test_unified_timestamps(self):
self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600) self.assertEqual(unified_timestamp('December 21, 2010'), 1292889600)
@ -499,12 +488,6 @@ class TestUtil(unittest.TestCase):
def test_str_to_int(self): def test_str_to_int(self):
self.assertEqual(str_to_int('123,456'), 123456) self.assertEqual(str_to_int('123,456'), 123456)
self.assertEqual(str_to_int('123.456'), 123456) self.assertEqual(str_to_int('123.456'), 123456)
self.assertEqual(str_to_int(523), 523)
# Python 3 has no long
if sys.version_info < (3, 0):
eval('self.assertEqual(str_to_int(123456L), 123456)')
self.assertEqual(str_to_int('noninteger'), None)
self.assertEqual(str_to_int([]), None)
def test_url_basename(self): def test_url_basename(self):
self.assertEqual(url_basename('http://foo.de/'), '') self.assertEqual(url_basename('http://foo.de/'), '')
@ -769,18 +752,6 @@ class TestUtil(unittest.TestCase):
d = json.loads(stripped) d = json.loads(stripped)
self.assertEqual(d, {'status': 'success'}) self.assertEqual(d, {'status': 'success'})
def test_strip_or_none(self):
self.assertEqual(strip_or_none(' abc'), 'abc')
self.assertEqual(strip_or_none('abc '), 'abc')
self.assertEqual(strip_or_none(' abc '), 'abc')
self.assertEqual(strip_or_none('\tabc\t'), 'abc')
self.assertEqual(strip_or_none('\n\tabc\n\t'), 'abc')
self.assertEqual(strip_or_none('abc'), 'abc')
self.assertEqual(strip_or_none(''), '')
self.assertEqual(strip_or_none(None), None)
self.assertEqual(strip_or_none(42), None)
self.assertEqual(strip_or_none([]), None)
def test_uppercase_escape(self): def test_uppercase_escape(self):
self.assertEqual(uppercase_escape(''), '') self.assertEqual(uppercase_escape(''), '')
self.assertEqual(uppercase_escape('\\U0001d550'), '𝕐') self.assertEqual(uppercase_escape('\\U0001d550'), '𝕐')
@ -803,8 +774,6 @@ class TestUtil(unittest.TestCase):
self.assertEqual(mimetype2ext('text/vtt'), 'vtt') self.assertEqual(mimetype2ext('text/vtt'), 'vtt')
self.assertEqual(mimetype2ext('text/vtt;charset=utf-8'), 'vtt') self.assertEqual(mimetype2ext('text/vtt;charset=utf-8'), 'vtt')
self.assertEqual(mimetype2ext('text/html; charset=utf-8'), 'html') self.assertEqual(mimetype2ext('text/html; charset=utf-8'), 'html')
self.assertEqual(mimetype2ext('audio/x-wav'), 'wav')
self.assertEqual(mimetype2ext('audio/x-wav;codec=pcm'), 'wav')
def test_month_by_name(self): def test_month_by_name(self):
self.assertEqual(month_by_name(None), None) self.assertEqual(month_by_name(None), None)
@ -840,15 +809,6 @@ class TestUtil(unittest.TestCase):
'vcodec': 'av01.0.05M.08', 'vcodec': 'av01.0.05M.08',
'acodec': 'none', 'acodec': 'none',
}) })
self.assertEqual(parse_codecs('theora, vorbis'), {
'vcodec': 'theora',
'acodec': 'vorbis',
})
self.assertEqual(parse_codecs('unknownvcodec, unknownacodec'), {
'vcodec': 'unknownvcodec',
'acodec': 'unknownacodec',
})
self.assertEqual(parse_codecs('unknown'), {})
def test_escape_rfc3986(self): def test_escape_rfc3986(self):
reserved = "!*'();:@&=+$,/?#[]" reserved = "!*'();:@&=+$,/?#[]"
@ -994,12 +954,6 @@ class TestUtil(unittest.TestCase):
on = js_to_json('{42:4.2e1}') on = js_to_json('{42:4.2e1}')
self.assertEqual(json.loads(on), {'42': 42.0}) self.assertEqual(json.loads(on), {'42': 42.0})
on = js_to_json('{ "0x40": "0x40" }')
self.assertEqual(json.loads(on), {'0x40': '0x40'})
on = js_to_json('{ "040": "040" }')
self.assertEqual(json.loads(on), {'040': '040'})
def test_js_to_json_malformed(self): def test_js_to_json_malformed(self):
self.assertEqual(js_to_json('42a1'), '42"a1"') self.assertEqual(js_to_json('42a1'), '42"a1"')
self.assertEqual(js_to_json('42a-1'), '42"a"-1') self.assertEqual(js_to_json('42a-1'), '42"a"-1')
@ -1385,20 +1339,6 @@ Line 1
self.assertRaises(ValueError, encode_base_n, 0, 70) self.assertRaises(ValueError, encode_base_n, 0, 70)
self.assertRaises(ValueError, encode_base_n, 0, 60, custom_table) self.assertRaises(ValueError, encode_base_n, 0, 60, custom_table)
def test_caesar(self):
self.assertEqual(caesar('ace', 'abcdef', 2), 'cea')
self.assertEqual(caesar('cea', 'abcdef', -2), 'ace')
self.assertEqual(caesar('ace', 'abcdef', -2), 'eac')
self.assertEqual(caesar('eac', 'abcdef', 2), 'ace')
self.assertEqual(caesar('ace', 'abcdef', 0), 'ace')
self.assertEqual(caesar('xyz', 'abcdef', 2), 'xyz')
self.assertEqual(caesar('abc', 'acegik', 2), 'ebg')
self.assertEqual(caesar('ebg', 'acegik', -2), 'abc')
def test_rot47(self):
self.assertEqual(rot47('youtube-dl'), r'J@FEF36\5=')
self.assertEqual(rot47('YOUTUBE-DL'), r'*~&%&qt\s{')
def test_urshift(self): def test_urshift(self):
self.assertEqual(urshift(3, 1), 1) self.assertEqual(urshift(3, 1), 1)
self.assertEqual(urshift(-3, 1), 2147483646) self.assertEqual(urshift(-3, 1), 2147483646)
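
The `test_caesar` and `test_rot47` assertions in this file pin down the behaviour of the two helpers fairly completely: characters found in the alphabet are shifted with wrap-around, and everything else passes through unchanged. The sketch below is one implementation consistent with those assertions; it is an illustration, not necessarily the code in `youtube_dl/utils.py`, and `ROT47_ALPHABET` is a name introduced here.

```python
def caesar(text, alphabet, shift):
    """Shift each character found in `alphabet` by `shift` positions (wrapping);
    characters outside the alphabet are returned unchanged."""
    size = len(alphabet)
    return ''.join(
        alphabet[(alphabet.index(ch) + shift) % size] if ch in alphabet else ch
        for ch in text)


# Printable ASCII from '!' (33) to '~' (126): 94 characters.
ROT47_ALPHABET = ''.join(chr(code) for code in range(33, 127))


def rot47(text):
    """ROT47 is a Caesar shift by 47 over the 94 printable ASCII characters."""
    return caesar(text, ROT47_ALPHABET, 47)


print(caesar('ace', 'abcdef', 2))
print(rot47('youtube-dl'))
```

Because 47 is half of 94, applying `rot47` twice returns the original string.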

@ -267,7 +267,7 @@ class TestYoutubeChapters(unittest.TestCase):
for description, duration, expected_chapters in self._TEST_CASES: for description, duration, expected_chapters in self._TEST_CASES:
ie = YoutubeIE() ie = YoutubeIE()
expect_value( expect_value(
self, ie._extract_chapters_from_description(description, duration), self, ie._extract_chapters(description, duration),
expected_chapters, None) expected_chapters, None)

@ -74,28 +74,6 @@ _TESTS = [
] ]
class TestPlayerInfo(unittest.TestCase):
def test_youtube_extract_player_info(self):
PLAYER_URLS = (
('https://www.youtube.com/s/player/64dddad9/player_ias.vflset/en_US/base.js', '64dddad9'),
# obsolete
('https://www.youtube.com/yts/jsbin/player_ias-vfle4-e03/en_US/base.js', 'vfle4-e03'),
('https://www.youtube.com/yts/jsbin/player_ias-vfl49f_g4/en_US/base.js', 'vfl49f_g4'),
('https://www.youtube.com/yts/jsbin/player_ias-vflCPQUIL/en_US/base.js', 'vflCPQUIL'),
('https://www.youtube.com/yts/jsbin/player-vflzQZbt7/en_US/base.js', 'vflzQZbt7'),
('https://www.youtube.com/yts/jsbin/player-en_US-vflaxXRn1/base.js', 'vflaxXRn1'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflXGBaUN.js', 'vflXGBaUN'),
('https://s.ytimg.com/yts/jsbin/html5player-en_US-vflKjOTVq/html5player.js', 'vflKjOTVq'),
('http://s.ytimg.com/yt/swfbin/watch_as3-vflrEm9Nq.swf', 'vflrEm9Nq'),
('https://s.ytimg.com/yts/swfbin/player-vflenCdZL/watch_as3.swf', 'vflenCdZL'),
)
for player_url, expected_player_id in PLAYER_URLS:
expected_player_type = player_url.split('.')[-1]
player_type, player_id = YoutubeIE._extract_player_info(player_url)
self.assertEqual(player_type, expected_player_type)
self.assertEqual(player_id, expected_player_id)
class TestSignature(unittest.TestCase): class TestSignature(unittest.TestCase):
def setUp(self): def setUp(self):
TEST_DIR = os.path.dirname(os.path.abspath(__file__)) TEST_DIR = os.path.dirname(os.path.abspath(__file__))

@ -1,9 +0,0 @@
# Netscape HTTP Cookie File
# http://curl.haxx.se/rfc/cookie_spec.html
# This is a generated file! Do not edit.
# Cookie file entry with invalid number of fields - 6 instead of 7
www.foobar.foobar FALSE / FALSE 0 COOKIE
# Cookie file entry with invalid expires at
www.foobar.foobar FALSE / FALSE 1.7976931348623157e+308 COOKIE VALUE
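
The two entries in this fixture are malformed in different ways: the first has 6 fields instead of 7, and the second has an expiry that is not a plain integer. A check along those lines can be sketched as below. `is_wellformed_cookie_line` and `EXPECTED_FIELDS` are hypothetical names, and the sketch assumes the fields are tab-separated in the real file (this page renders them with spaces).

```python
EXPECTED_FIELDS = 7  # domain, subdomain flag, path, secure flag, expiry, name, value


def is_wellformed_cookie_line(line):
    """Return True if a Netscape cookie-file entry has 7 tab-separated fields
    and an expiry that is either empty or made of digits only."""
    fields = line.rstrip('\n').split('\t')
    if len(fields) != EXPECTED_FIELDS:
        return False
    expires = fields[4]
    return expires == '' or expires.isdigit()


too_few = '\t'.join(['www.foobar.foobar', 'FALSE', '/', 'FALSE', '0', 'COOKIE'])
bad_expiry = '\t'.join(['www.foobar.foobar', 'FALSE', '/', 'FALSE',
                        '1.7976931348623157e+308', 'COOKIE', 'VALUE'])
print(is_wellformed_cookie_line(too_few))
print(is_wellformed_cookie_line(bad_expiry))
```

A loader that skips every line failing such a check would end up with an empty jar for this fixture, which is what `test_malformed_cookies` asserts.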

@ -92,7 +92,6 @@ from .utils import (
YoutubeDLCookieJar, YoutubeDLCookieJar,
YoutubeDLCookieProcessor, YoutubeDLCookieProcessor,
YoutubeDLHandler, YoutubeDLHandler,
YoutubeDLRedirectHandler,
) )
from .cache import Cache from .cache import Cache
from .extractor import get_info_extractor, gen_extractor_classes, _LAZY_LOADER from .extractor import get_info_extractor, gen_extractor_classes, _LAZY_LOADER
@ -401,9 +400,9 @@ class YoutubeDL(object):
else: else:
raise raise
if (sys.platform != 'win32' if (sys.platform != 'win32' and
and sys.getfilesystemencoding() in ['ascii', 'ANSI_X3.4-1968'] sys.getfilesystemencoding() in ['ascii', 'ANSI_X3.4-1968'] and
and not params.get('restrictfilenames', False)): not params.get('restrictfilenames', False)):
# Unicode filesystem API will throw errors (#1474, #13027) # Unicode filesystem API will throw errors (#1474, #13027)
self.report_warning( self.report_warning(
'Assuming --restrict-filenames since file system encoding ' 'Assuming --restrict-filenames since file system encoding '
@ -441,9 +440,9 @@ class YoutubeDL(object):
if re.match(r'^-[0-9A-Za-z_-]{10}$', a)] if re.match(r'^-[0-9A-Za-z_-]{10}$', a)]
if idxs: if idxs:
correct_argv = ( correct_argv = (
['youtube-dl'] ['youtube-dl'] +
+ [a for i, a in enumerate(argv) if i not in idxs] [a for i, a in enumerate(argv) if i not in idxs] +
+ ['--'] + [argv[i] for i in idxs] ['--'] + [argv[i] for i in idxs]
) )
self.report_warning( self.report_warning(
'Long argument string detected. ' 'Long argument string detected. '
@ -851,11 +850,10 @@ class YoutubeDL(object):
if result_type in ('url', 'url_transparent'): if result_type in ('url', 'url_transparent'):
ie_result['url'] = sanitize_url(ie_result['url']) ie_result['url'] = sanitize_url(ie_result['url'])
extract_flat = self.params.get('extract_flat', False) extract_flat = self.params.get('extract_flat', False)
if ((extract_flat == 'in_playlist' and 'playlist' in extra_info) if ((extract_flat == 'in_playlist' and 'playlist' in extra_info) or
or extract_flat is True): extract_flat is True):
self.__forced_printings( if self.params.get('forcejson', False):
ie_result, self.prepare_filename(ie_result), self.to_stdout(json.dumps(ie_result))
incomplete=True)
return ie_result return ie_result
if result_type == 'video': if result_type == 'video':
@ -991,7 +989,7 @@ class YoutubeDL(object):
'playlist_title': ie_result.get('title'), 'playlist_title': ie_result.get('title'),
'playlist_uploader': ie_result.get('uploader'), 'playlist_uploader': ie_result.get('uploader'),
'playlist_uploader_id': ie_result.get('uploader_id'), 'playlist_uploader_id': ie_result.get('uploader_id'),
'playlist_index': playlistitems[i - 1] if playlistitems else i + playliststart, 'playlist_index': i + playliststart,
'extractor': ie_result['extractor'], 'extractor': ie_result['extractor'],
'webpage_url': ie_result['webpage_url'], 'webpage_url': ie_result['webpage_url'],
'webpage_url_basename': url_basename(ie_result['webpage_url']), 'webpage_url_basename': url_basename(ie_result['webpage_url']),
@ -1621,9 +1619,9 @@ class YoutubeDL(object):
# https://github.com/ytdl-org/youtube-dl/issues/10083). # https://github.com/ytdl-org/youtube-dl/issues/10083).
incomplete_formats = ( incomplete_formats = (
# All formats are video-only or # All formats are video-only or
all(f.get('vcodec') != 'none' and f.get('acodec') == 'none' for f in formats) all(f.get('vcodec') != 'none' and f.get('acodec') == 'none' for f in formats) or
# all formats are audio-only # all formats are audio-only
or all(f.get('vcodec') == 'none' and f.get('acodec') != 'none' for f in formats)) all(f.get('vcodec') == 'none' and f.get('acodec') != 'none' for f in formats))
ctx = { ctx = {
'formats': formats, 'formats': formats,
@ -1695,36 +1693,6 @@ class YoutubeDL(object):
subs[lang] = f subs[lang] = f
return subs return subs
def __forced_printings(self, info_dict, filename, incomplete):
def print_mandatory(field):
if (self.params.get('force%s' % field, False)
and (not incomplete or info_dict.get(field) is not None)):
self.to_stdout(info_dict[field])
def print_optional(field):
if (self.params.get('force%s' % field, False)
and info_dict.get(field) is not None):
self.to_stdout(info_dict[field])
print_mandatory('title')
print_mandatory('id')
if self.params.get('forceurl', False) and not incomplete:
if info_dict.get('requested_formats') is not None:
for f in info_dict['requested_formats']:
self.to_stdout(f['url'] + f.get('play_path', ''))
else:
# For RTMP URLs, also include the playpath
self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
print_optional('thumbnail')
print_optional('description')
if self.params.get('forcefilename', False) and filename is not None:
self.to_stdout(filename)
if self.params.get('forceduration', False) and info_dict.get('duration') is not None:
self.to_stdout(formatSeconds(info_dict['duration']))
print_mandatory('format')
if self.params.get('forcejson', False):
self.to_stdout(json.dumps(info_dict))
def process_info(self, info_dict): def process_info(self, info_dict):
"""Process a single resolved IE result.""" """Process a single resolved IE result."""
@ -1735,8 +1703,9 @@ class YoutubeDL(object):
if self._num_downloads >= int(max_downloads): if self._num_downloads >= int(max_downloads):
raise MaxDownloadsReached() raise MaxDownloadsReached()
# TODO: backward compatibility, to be removed
info_dict['fulltitle'] = info_dict['title'] info_dict['fulltitle'] = info_dict['title']
if len(info_dict['title']) > 200:
info_dict['title'] = info_dict['title'][:197] + '...'
if 'format' not in info_dict: if 'format' not in info_dict:
info_dict['format'] = info_dict['ext'] info_dict['format'] = info_dict['ext']
@ -1751,7 +1720,29 @@ class YoutubeDL(object):
info_dict['_filename'] = filename = self.prepare_filename(info_dict) info_dict['_filename'] = filename = self.prepare_filename(info_dict)
# Forced printings # Forced printings
self.__forced_printings(info_dict, filename, incomplete=False) if self.params.get('forcetitle', False):
self.to_stdout(info_dict['fulltitle'])
if self.params.get('forceid', False):
self.to_stdout(info_dict['id'])
if self.params.get('forceurl', False):
if info_dict.get('requested_formats') is not None:
for f in info_dict['requested_formats']:
self.to_stdout(f['url'] + f.get('play_path', ''))
else:
# For RTMP URLs, also include the playpath
self.to_stdout(info_dict['url'] + info_dict.get('play_path', ''))
if self.params.get('forcethumbnail', False) and info_dict.get('thumbnail') is not None:
self.to_stdout(info_dict['thumbnail'])
if self.params.get('forcedescription', False) and info_dict.get('description') is not None:
self.to_stdout(info_dict['description'])
if self.params.get('forcefilename', False) and filename is not None:
self.to_stdout(filename)
if self.params.get('forceduration', False) and info_dict.get('duration') is not None:
self.to_stdout(formatSeconds(info_dict['duration']))
if self.params.get('forceformat', False):
self.to_stdout(info_dict['format'])
if self.params.get('forcejson', False):
self.to_stdout(json.dumps(info_dict))
# Do nothing else if in simulate mode # Do nothing else if in simulate mode
if self.params.get('simulate', False): if self.params.get('simulate', False):
@ -1792,8 +1783,6 @@ class YoutubeDL(object):
annofn = replace_extension(filename, 'annotations.xml', info_dict.get('ext')) annofn = replace_extension(filename, 'annotations.xml', info_dict.get('ext'))
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(annofn)): if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(annofn)):
self.to_screen('[info] Video annotations are already present') self.to_screen('[info] Video annotations are already present')
elif not info_dict.get('annotations'):
self.report_warning('There are no annotations to write.')
else: else:
try: try:
self.to_screen('[info] Writing video annotations to: ' + annofn) self.to_screen('[info] Writing video annotations to: ' + annofn)
@ -1815,7 +1804,7 @@ class YoutubeDL(object):
ie = self.get_info_extractor(info_dict['extractor_key']) ie = self.get_info_extractor(info_dict['extractor_key'])
for sub_lang, sub_info in subtitles.items(): for sub_lang, sub_info in subtitles.items():
sub_format = sub_info['ext'] sub_format = sub_info['ext']
sub_filename = subtitles_filename(filename, sub_lang, sub_format, info_dict.get('ext')) sub_filename = subtitles_filename(filename, sub_lang, sub_format)
if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)): if self.params.get('nooverwrites', False) and os.path.exists(encodeFilename(sub_filename)):
self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format)) self.to_screen('[info] Video subtitle %s.%s is already present' % (sub_lang, sub_format))
else: else:
@ -1958,8 +1947,8 @@ class YoutubeDL(object):
else: else:
assert fixup_policy in ('ignore', 'never') assert fixup_policy in ('ignore', 'never')
if (info_dict.get('requested_formats') is None if (info_dict.get('requested_formats') is None and
and info_dict.get('container') == 'm4a_dash'): info_dict.get('container') == 'm4a_dash'):
if fixup_policy == 'warn': if fixup_policy == 'warn':
self.report_warning( self.report_warning(
'%s: writing DASH m4a. ' '%s: writing DASH m4a. '
@ -1978,9 +1967,9 @@ class YoutubeDL(object):
else: else:
assert fixup_policy in ('ignore', 'never') assert fixup_policy in ('ignore', 'never')
if (info_dict.get('protocol') == 'm3u8_native' if (info_dict.get('protocol') == 'm3u8_native' or
or info_dict.get('protocol') == 'm3u8' info_dict.get('protocol') == 'm3u8' and
and self.params.get('hls_prefer_native')): self.params.get('hls_prefer_native')):
if fixup_policy == 'warn': if fixup_policy == 'warn':
self.report_warning('%s: malformed AAC bitstream detected.' % ( self.report_warning('%s: malformed AAC bitstream detected.' % (
info_dict['id'])) info_dict['id']))
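Review note: both layouts of the condition in this hunk parse the same way, because `and` binds tighter than `or` in Python. A minimal sketch of that grouping follows; `needs_aac_fixup` is a hypothetical helper written for illustration and is not part of youtube-dl.

```python
def needs_aac_fixup(protocol, hls_prefer_native):
    """Mirror the condition from the hunk above (hypothetical helper).

    `and` has higher precedence than `or`, so the expression reads as
    protocol == 'm3u8_native' or (protocol == 'm3u8' and hls_prefer_native)
    """
    return bool(protocol == 'm3u8_native'
                or protocol == 'm3u8' and hls_prefer_native)


if __name__ == '__main__':
    for proto, native in [('m3u8_native', False), ('m3u8', False),
                          ('m3u8', True), ('https', True)]:
        print(proto, native, needs_aac_fixup(proto, native))
```

Only the position of the line break around the operators differs between the two sides; the evaluated result does not.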
@ -2006,10 +1995,10 @@ class YoutubeDL(object):
def download(self, url_list): def download(self, url_list):
"""Download a given list of URLs.""" """Download a given list of URLs."""
outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL) outtmpl = self.params.get('outtmpl', DEFAULT_OUTTMPL)
if (len(url_list) > 1 if (len(url_list) > 1 and
and outtmpl != '-' outtmpl != '-' and
and '%' not in outtmpl '%' not in outtmpl and
and self.params.get('max_downloads') != 1): self.params.get('max_downloads') != 1):
raise SameFileError(outtmpl) raise SameFileError(outtmpl)
for url in url_list: for url in url_list:
@ -2154,8 +2143,8 @@ class YoutubeDL(object):
if res: if res:
res += ', ' res += ', '
res += '%s container' % fdict['container'] res += '%s container' % fdict['container']
if (fdict.get('vcodec') is not None if (fdict.get('vcodec') is not None and
and fdict.get('vcodec') != 'none'): fdict.get('vcodec') != 'none'):
if res: if res:
res += ', ' res += ', '
res += fdict['vcodec'] res += fdict['vcodec']
@ -2344,7 +2333,6 @@ class YoutubeDL(object):
debuglevel = 1 if self.params.get('debug_printtraffic') else 0 debuglevel = 1 if self.params.get('debug_printtraffic') else 0
https_handler = make_HTTPS_handler(self.params, debuglevel=debuglevel) https_handler = make_HTTPS_handler(self.params, debuglevel=debuglevel)
ydlh = YoutubeDLHandler(self.params, debuglevel=debuglevel) ydlh = YoutubeDLHandler(self.params, debuglevel=debuglevel)
redirect_handler = YoutubeDLRedirectHandler()
data_handler = compat_urllib_request_DataHandler() data_handler = compat_urllib_request_DataHandler()
# When passing our own FileHandler instance, build_opener won't add the # When passing our own FileHandler instance, build_opener won't add the
@ -2358,7 +2346,7 @@ class YoutubeDL(object):
file_handler.file_open = file_open file_handler.file_open = file_open
opener = compat_urllib_request.build_opener( opener = compat_urllib_request.build_opener(
proxy_handler, https_handler, cookie_processor, ydlh, redirect_handler, data_handler, file_handler) proxy_handler, https_handler, cookie_processor, ydlh, data_handler, file_handler)
# Delete the default user-agent header, which would otherwise apply in # Delete the default user-agent header, which would otherwise apply in
# cases where our custom HTTP handler doesn't come into play # cases where our custom HTTP handler doesn't come into play


@ -94,7 +94,7 @@ def _real_main(argv=None):
if opts.verbose: if opts.verbose:
write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n') write_string('[debug] Batch file urls: ' + repr(batch_urls) + '\n')
except IOError: except IOError:
sys.exit('ERROR: batch file %s could not be read' % opts.batchfile) sys.exit('ERROR: batch file could not be read')
all_urls = batch_urls + [url.strip() for url in args] # batch_urls are already striped in read_batch_urls all_urls = batch_urls + [url.strip() for url in args] # batch_urls are already striped in read_batch_urls
_enc = preferredencoding() _enc = preferredencoding()
all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls] all_urls = [url.decode(_enc, 'ignore') if isinstance(url, bytes) else url for url in all_urls]
@ -230,14 +230,14 @@ def _real_main(argv=None):
if opts.allsubtitles and not opts.writeautomaticsub: if opts.allsubtitles and not opts.writeautomaticsub:
opts.writesubtitles = True opts.writesubtitles = True
outtmpl = ((opts.outtmpl is not None and opts.outtmpl) outtmpl = ((opts.outtmpl is not None and opts.outtmpl) or
or (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s') (opts.format == '-1' and opts.usetitle and '%(title)s-%(id)s-%(format)s.%(ext)s') or
or (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s') (opts.format == '-1' and '%(id)s-%(format)s.%(ext)s') or
or (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s') (opts.usetitle and opts.autonumber and '%(autonumber)s-%(title)s-%(id)s.%(ext)s') or
or (opts.usetitle and '%(title)s-%(id)s.%(ext)s') (opts.usetitle and '%(title)s-%(id)s.%(ext)s') or
or (opts.useid and '%(id)s.%(ext)s') (opts.useid and '%(id)s.%(ext)s') or
or (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s') (opts.autonumber and '%(autonumber)s-%(id)s.%(ext)s') or
or DEFAULT_OUTTMPL) DEFAULT_OUTTMPL)
if not os.path.splitext(outtmpl)[1] and opts.extractaudio: if not os.path.splitext(outtmpl)[1] and opts.extractaudio:
parser.error('Cannot download a video and extract audio into the same' parser.error('Cannot download a video and extract audio into the same'
' file! Use "{0}.%(ext)s" instead of "{0}" as the output' ' file! Use "{0}.%(ext)s" instead of "{0}" as the output'


@ -57,17 +57,6 @@ try:
except ImportError: # Python 2 except ImportError: # Python 2
import cookielib as compat_cookiejar import cookielib as compat_cookiejar
if sys.version_info[0] == 2:
class compat_cookiejar_Cookie(compat_cookiejar.Cookie):
def __init__(self, version, name, value, *args, **kwargs):
if isinstance(name, compat_str):
name = name.encode()
if isinstance(value, compat_str):
value = value.encode()
compat_cookiejar.Cookie.__init__(self, version, name, value, *args, **kwargs)
else:
compat_cookiejar_Cookie = compat_cookiejar.Cookie
try: try:
import http.cookies as compat_cookies import http.cookies as compat_cookies
except ImportError: # Python 2 except ImportError: # Python 2
@ -2660,9 +2649,9 @@ else:
try: try:
args = shlex.split('中文') args = shlex.split('中文')
assert (isinstance(args, list) assert (isinstance(args, list) and
and isinstance(args[0], compat_str) isinstance(args[0], compat_str) and
and args[0] == '中文') args[0] == '中文')
compat_shlex_split = shlex.split compat_shlex_split = shlex.split
except (AssertionError, UnicodeEncodeError): except (AssertionError, UnicodeEncodeError):
# Working around shlex issue with unicode strings on some python 2 # Working around shlex issue with unicode strings on some python 2
@ -2765,17 +2754,6 @@ else:
compat_expanduser = os.path.expanduser compat_expanduser = os.path.expanduser
if compat_os_name == 'nt' and sys.version_info < (3, 8):
# os.path.realpath on Windows does not follow symbolic links
# prior to Python 3.8 (see https://bugs.python.org/issue9949)
def compat_realpath(path):
while os.path.islink(path):
path = os.path.abspath(os.readlink(path))
return path
else:
compat_realpath = os.path.realpath
if sys.version_info < (3, 0): if sys.version_info < (3, 0):
def compat_print(s): def compat_print(s):
from .utils import preferredencoding from .utils import preferredencoding
@ -2998,7 +2976,6 @@ __all__ = [
'compat_basestring', 'compat_basestring',
'compat_chr', 'compat_chr',
'compat_cookiejar', 'compat_cookiejar',
'compat_cookiejar_Cookie',
'compat_cookies', 'compat_cookies',
'compat_ctypes_WINFUNCTYPE', 'compat_ctypes_WINFUNCTYPE',
'compat_etree_Element', 'compat_etree_Element',
@ -3021,7 +2998,6 @@ __all__ = [
'compat_os_name', 'compat_os_name',
'compat_parse_qs', 'compat_parse_qs',
'compat_print', 'compat_print',
'compat_realpath',
'compat_setenv', 'compat_setenv',
'compat_shlex_quote', 'compat_shlex_quote',
'compat_shlex_split', 'compat_shlex_split',


@ -176,9 +176,7 @@ class FileDownloader(object):
return return
speed = float(byte_counter) / elapsed speed = float(byte_counter) / elapsed
if speed > rate_limit: if speed > rate_limit:
sleep_time = float(byte_counter) / rate_limit - elapsed time.sleep(max((byte_counter // rate_limit) - elapsed, 0))
if sleep_time > 0:
time.sleep(sleep_time)
def temp_name(self, filename): def temp_name(self, filename):
"""Returns a temporary filename for the given filename.""" """Returns a temporary filename for the given filename."""
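Review note: the two sides of the rate-limit hunk above differ in how the sleep time is computed. One uses true division and the other uses floor division (`//`), which drops the fractional part of a second. A minimal sketch of the float-based calculation, assuming `byte_counter` is in bytes, `rate_limit` in bytes per second, and `elapsed` in seconds; `throttle_delay` is a hypothetical name, not a youtube-dl function.

```python
def throttle_delay(byte_counter, rate_limit, elapsed):
    """Return seconds to sleep so the average speed falls back to rate_limit.

    Hypothetical helper for illustration only.
    """
    if rate_limit is None or byte_counter == 0 or elapsed <= 0:
        return 0.0
    speed = float(byte_counter) / elapsed
    if speed <= rate_limit:
        return 0.0
    # Time the transfer should have taken at the limit, minus time spent so far.
    return max(float(byte_counter) / rate_limit - elapsed, 0.0)


if __name__ == '__main__':
    # 1500 bytes in 1 s against a 1000 B/s limit: half a second too fast.
    print(throttle_delay(1500, 1000, 1.0))
    # Floor division on the same inputs loses the fractional delay.
    print(max((1500 // 1000) - 1.0, 0))
```

With floor division the same inputs yield no delay at all, so short overshoots would go unthrottled.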
@ -332,15 +330,15 @@ class FileDownloader(object):
""" """
nooverwrites_and_exists = ( nooverwrites_and_exists = (
self.params.get('nooverwrites', False) self.params.get('nooverwrites', False) and
and os.path.exists(encodeFilename(filename)) os.path.exists(encodeFilename(filename))
) )
if not hasattr(filename, 'write'): if not hasattr(filename, 'write'):
continuedl_and_exists = ( continuedl_and_exists = (
self.params.get('continuedl', True) self.params.get('continuedl', True) and
and os.path.isfile(encodeFilename(filename)) os.path.isfile(encodeFilename(filename)) and
and not self.params.get('nopart', False) not self.params.get('nopart', False)
) )
# Check file already present # Check file already present


@ -53,7 +53,7 @@ class DashSegmentsFD(FragmentFD):
except compat_urllib_error.HTTPError as err: except compat_urllib_error.HTTPError as err:
# YouTube may often return 404 HTTP error for a fragment causing the # YouTube may often return 404 HTTP error for a fragment causing the
# whole download to fail. However if the same fragment is immediately # whole download to fail. However if the same fragment is immediately
# retried with the same request data this usually succeeds (1-2 attempts # retried with the same request data this usually succeeds (1-2 attemps
# is usually enough) thus allowing to download the whole file successfully. # is usually enough) thus allowing to download the whole file successfully.
# To be future-proof we will retry all fragments that fail with any # To be future-proof we will retry all fragments that fail with any
# HTTP error. # HTTP error.


@ -194,7 +194,6 @@ class Aria2cFD(ExternalFD):
cmd += self._option('--interface', 'source_address') cmd += self._option('--interface', 'source_address')
cmd += self._option('--all-proxy', 'proxy') cmd += self._option('--all-proxy', 'proxy')
cmd += self._bool_option('--check-certificate', 'nocheckcertificate', 'false', 'true', '=') cmd += self._bool_option('--check-certificate', 'nocheckcertificate', 'false', 'true', '=')
cmd += self._bool_option('--remote-time', 'updatetime', 'true', 'false', '=')
cmd += ['--', info_dict['url']] cmd += ['--', info_dict['url']]
return cmd return cmd


@ -238,8 +238,8 @@ def write_metadata_tag(stream, metadata):
def remove_encrypted_media(media): def remove_encrypted_media(media):
return list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib return list(filter(lambda e: 'drmAdditionalHeaderId' not in e.attrib and
and 'drmAdditionalHeaderSetId' not in e.attrib, 'drmAdditionalHeaderSetId' not in e.attrib,
media)) media))
@ -267,8 +267,8 @@ class F4mFD(FragmentFD):
media = doc.findall(_add_ns('media')) media = doc.findall(_add_ns('media'))
if not media: if not media:
self.report_error('No media found') self.report_error('No media found')
for e in (doc.findall(_add_ns('drmAdditionalHeader')) for e in (doc.findall(_add_ns('drmAdditionalHeader')) +
+ doc.findall(_add_ns('drmAdditionalHeaderSet'))): doc.findall(_add_ns('drmAdditionalHeaderSet'))):
# If id attribute is missing it's valid for all media nodes # If id attribute is missing it's valid for all media nodes
# without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute # without drmAdditionalHeaderId or drmAdditionalHeaderSetId attribute
if 'id' not in e.attrib: if 'id' not in e.attrib:


@ -190,13 +190,12 @@ class FragmentFD(FileDownloader):
}) })
def _start_frag_download(self, ctx): def _start_frag_download(self, ctx):
resume_len = ctx['complete_frags_downloaded_bytes']
total_frags = ctx['total_frags'] total_frags = ctx['total_frags']
# This dict stores the download progress, it's updated by the progress # This dict stores the download progress, it's updated by the progress
# hook # hook
state = { state = {
'status': 'downloading', 'status': 'downloading',
'downloaded_bytes': resume_len, 'downloaded_bytes': ctx['complete_frags_downloaded_bytes'],
'fragment_index': ctx['fragment_index'], 'fragment_index': ctx['fragment_index'],
'fragment_count': total_frags, 'fragment_count': total_frags,
'filename': ctx['filename'], 'filename': ctx['filename'],
@ -220,8 +219,8 @@ class FragmentFD(FileDownloader):
frag_total_bytes = s.get('total_bytes') or 0 frag_total_bytes = s.get('total_bytes') or 0
if not ctx['live']: if not ctx['live']:
estimated_size = ( estimated_size = (
(ctx['complete_frags_downloaded_bytes'] + frag_total_bytes) (ctx['complete_frags_downloaded_bytes'] + frag_total_bytes) /
/ (state['fragment_index'] + 1) * total_frags) (state['fragment_index'] + 1) * total_frags)
state['total_bytes_estimate'] = estimated_size state['total_bytes_estimate'] = estimated_size
if s['status'] == 'finished': if s['status'] == 'finished':
@ -235,8 +234,8 @@ class FragmentFD(FileDownloader):
state['downloaded_bytes'] += frag_downloaded_bytes - ctx['prev_frag_downloaded_bytes'] state['downloaded_bytes'] += frag_downloaded_bytes - ctx['prev_frag_downloaded_bytes']
if not ctx['live']: if not ctx['live']:
state['eta'] = self.calc_eta( state['eta'] = self.calc_eta(
start, time_now, estimated_size - resume_len, start, time_now, estimated_size,
state['downloaded_bytes'] - resume_len) state['downloaded_bytes'])
state['speed'] = s.get('speed') or ctx.get('speed') state['speed'] = s.get('speed') or ctx.get('speed')
ctx['speed'] = state['speed'] ctx['speed'] = state['speed']
ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes ctx['prev_frag_downloaded_bytes'] = frag_downloaded_bytes


@ -64,7 +64,7 @@ class HlsFD(FragmentFD):
s = urlh.read().decode('utf-8', 'ignore') s = urlh.read().decode('utf-8', 'ignore')
if not self.can_download(s, info_dict): if not self.can_download(s, info_dict):
if info_dict.get('extra_param_to_segment_url') or info_dict.get('_decryption_key_url'): if info_dict.get('extra_param_to_segment_url'):
self.report_error('pycrypto not found. Please install it.') self.report_error('pycrypto not found. Please install it.')
return False return False
self.report_warning( self.report_warning(
@ -76,12 +76,12 @@ class HlsFD(FragmentFD):
return fd.real_download(filename, info_dict) return fd.real_download(filename, info_dict)
def is_ad_fragment_start(s): def is_ad_fragment_start(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=ad' in s or
or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad')) s.startswith('#UPLYNK-SEGMENT') and s.endswith(',ad'))
def is_ad_fragment_end(s): def is_ad_fragment_end(s):
return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=master' in s return (s.startswith('#ANVATO-SEGMENT-INFO') and 'type=master' in s or
or s.startswith('#UPLYNK-SEGMENT') and s.endswith(',segment')) s.startswith('#UPLYNK-SEGMENT') and s.endswith(',segment'))
media_frags = 0 media_frags = 0
ad_frags = 0 ad_frags = 0
@ -141,7 +141,7 @@ class HlsFD(FragmentFD):
count = 0 count = 0
headers = info_dict.get('http_headers', {}) headers = info_dict.get('http_headers', {})
if byte_range: if byte_range:
headers['Range'] = 'bytes=%d-%d' % (byte_range['start'], byte_range['end'] - 1) headers['Range'] = 'bytes=%d-%d' % (byte_range['start'], byte_range['end'])
while count <= fragment_retries: while count <= fragment_retries:
try: try:
success, frag_content = self._download_fragment( success, frag_content = self._download_fragment(
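Review note: the two sides of this hunk disagree on whether `byte_range['end']` needs `- 1` when building the `Range` header. HTTP byte ranges are inclusive at both ends, so the subtraction is needed only if `end` is an exclusive offset (start plus length). The sketch below assumes that exclusive convention, which the diff itself does not confirm; `range_header` is a hypothetical helper, not a youtube-dl function.

```python
def range_header(start, end_exclusive):
    """Build an HTTP Range header value for the half-open span [start, end_exclusive).

    Hypothetical helper; assumes the end offset is exclusive.
    """
    if end_exclusive <= start:
        raise ValueError('empty or negative byte range')
    # HTTP byte ranges include the last byte position, so subtract one.
    return 'bytes=%d-%d' % (start, end_exclusive - 1)


if __name__ == '__main__':
    # First 100 bytes of a resource: positions 0 through 99.
    print(range_header(0, 100))
```

If `end` were already inclusive, the subtraction would request one byte too few per fragment.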
@ -169,7 +169,7 @@ class HlsFD(FragmentFD):
if decrypt_info['METHOD'] == 'AES-128': if decrypt_info['METHOD'] == 'AES-128':
iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence) iv = decrypt_info.get('IV') or compat_struct_pack('>8xq', media_sequence)
decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen( decrypt_info['KEY'] = decrypt_info.get('KEY') or self.ydl.urlopen(
self._prepare_url(info_dict, info_dict.get('_decryption_key_url') or decrypt_info['URI'])).read() self._prepare_url(info_dict, decrypt_info['URI'])).read()
frag_content = AES.new( frag_content = AES.new(
decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content) decrypt_info['KEY'], AES.MODE_CBC, iv).decrypt(frag_content)
self._append_fragment(ctx, frag_content) self._append_fragment(ctx, frag_content)


@ -46,8 +46,8 @@ class HttpFD(FileDownloader):
is_test = self.params.get('test', False) is_test = self.params.get('test', False)
chunk_size = self._TEST_FILE_SIZE if is_test else ( chunk_size = self._TEST_FILE_SIZE if is_test else (
info_dict.get('downloader_options', {}).get('http_chunk_size') info_dict.get('downloader_options', {}).get('http_chunk_size') or
or self.params.get('http_chunk_size') or 0) self.params.get('http_chunk_size') or 0)
ctx.open_mode = 'wb' ctx.open_mode = 'wb'
ctx.resume_len = 0 ctx.resume_len = 0
@ -106,12 +106,7 @@ class HttpFD(FileDownloader):
set_range(request, range_start, range_end) set_range(request, range_start, range_end)
# Establish connection # Establish connection
try: try:
try: ctx.data = self.ydl.urlopen(request)
ctx.data = self.ydl.urlopen(request)
except (compat_urllib_error.URLError, ) as err:
if isinstance(err.reason, socket.timeout):
raise RetryDownload(err)
raise err
# When trying to resume, Content-Range HTTP header of response has to be checked # When trying to resume, Content-Range HTTP header of response has to be checked
# to match the value of requested Range HTTP header. This is due to a webservers # to match the value of requested Range HTTP header. This is due to a webservers
# that don't support resuming and serve a whole file with no Content-Range # that don't support resuming and serve a whole file with no Content-Range
@ -128,11 +123,11 @@ class HttpFD(FileDownloader):
content_len = int_or_none(content_range_m.group(3)) content_len = int_or_none(content_range_m.group(3))
accept_content_len = ( accept_content_len = (
# Non-chunked download # Non-chunked download
not ctx.chunk_size not ctx.chunk_size or
# Chunked download and requested piece or # Chunked download and requested piece or
# its part is promised to be served # its part is promised to be served
or content_range_end == range_end content_range_end == range_end or
or content_len < range_end) content_len < range_end)
if accept_content_len: if accept_content_len:
ctx.data_len = content_len ctx.data_len = content_len
return return
@ -157,8 +152,8 @@ class HttpFD(FileDownloader):
raise raise
else: else:
# Examine the reported length # Examine the reported length
if (content_length is not None if (content_length is not None and
and (ctx.resume_len - 100 < int(content_length) < ctx.resume_len + 100)): (ctx.resume_len - 100 < int(content_length) < ctx.resume_len + 100)):
# The file had already been fully downloaded. # The file had already been fully downloaded.
# Explanation to the above condition: in issue #175 it was revealed that # Explanation to the above condition: in issue #175 it was revealed that
# YouTube sometimes adds or removes a few bytes from the end of the file, # YouTube sometimes adds or removes a few bytes from the end of the file,
@ -223,27 +218,24 @@ class HttpFD(FileDownloader):
def retry(e): def retry(e):
to_stdout = ctx.tmpfilename == '-' to_stdout = ctx.tmpfilename == '-'
if ctx.stream is not None: if not to_stdout:
if not to_stdout: ctx.stream.close()
ctx.stream.close() ctx.stream = None
ctx.stream = None
ctx.resume_len = byte_counter if to_stdout else os.path.getsize(encodeFilename(ctx.tmpfilename)) ctx.resume_len = byte_counter if to_stdout else os.path.getsize(encodeFilename(ctx.tmpfilename))
raise RetryDownload(e) raise RetryDownload(e)
while True: while True:
try: try:
# Download and write # Download and write
data_block = ctx.data.read(block_size if data_len is None else min(block_size, data_len - byte_counter)) data_block = ctx.data.read(block_size if not is_test else min(block_size, data_len - byte_counter))
# socket.timeout is a subclass of socket.error but may not have # socket.timeout is a subclass of socket.error but may not have
# errno set # errno set
except socket.timeout as e: except socket.timeout as e:
retry(e) retry(e)
except socket.error as e: except socket.error as e:
# SSLError on python 2 (inherits socket.error) may have if e.errno not in (errno.ECONNRESET, errno.ETIMEDOUT):
# no errno set but this error message raise
if e.errno in (errno.ECONNRESET, errno.ETIMEDOUT) or getattr(e, 'message', None) == 'The read operation timed out': retry(e)
retry(e)
raise
byte_counter += len(data_block) byte_counter += len(data_block)
@ -307,7 +299,7 @@ class HttpFD(FileDownloader):
'elapsed': now - ctx.start_time, 'elapsed': now - ctx.start_time,
}) })
if data_len is not None and byte_counter == data_len: if is_test and byte_counter == data_len:
break break
if not is_test and ctx.chunk_size and ctx.data_len is not None and byte_counter < ctx.data_len: if not is_test and ctx.chunk_size and ctx.data_len is not None and byte_counter < ctx.data_len:


@ -146,7 +146,7 @@ def write_piff_header(stream, params):
sps, pps = codec_private_data.split(u32.pack(1))[1:] sps, pps = codec_private_data.split(u32.pack(1))[1:]
avcc_payload = u8.pack(1) # configuration version avcc_payload = u8.pack(1) # configuration version
avcc_payload += sps[1:4] # avc profile indication + profile compatibility + avc level indication avcc_payload += sps[1:4] # avc profile indication + profile compatibility + avc level indication
avcc_payload += u8.pack(0xfc | (params.get('nal_unit_length_field', 4) - 1)) # complete representation (1) + reserved (11111) + length size minus one avcc_payload += u8.pack(0xfc | (params.get('nal_unit_length_field', 4) - 1)) # complete represenation (1) + reserved (11111) + length size minus one
avcc_payload += u8.pack(1) # reserved (0) + number of sps (0000001) avcc_payload += u8.pack(1) # reserved (0) + number of sps (0000001)
avcc_payload += u16.pack(len(sps)) avcc_payload += u16.pack(len(sps))
avcc_payload += sps avcc_payload += sps


@ -110,17 +110,17 @@ class ABCIViewIE(InfoExtractor):
# ABC iview programs are normally available for 14 days only. # ABC iview programs are normally available for 14 days only.
_TESTS = [{ _TESTS = [{
'url': 'https://iview.abc.net.au/show/gruen/series/11/video/LE1927H001S00', 'url': 'https://iview.abc.net.au/show/ben-and-hollys-little-kingdom/series/0/video/ZX9371A050S00',
'md5': '67715ce3c78426b11ba167d875ac6abf', 'md5': 'cde42d728b3b7c2b32b1b94b4a548afc',
'info_dict': { 'info_dict': {
'id': 'LE1927H001S00', 'id': 'ZX9371A050S00',
'ext': 'mp4', 'ext': 'mp4',
'title': "Series 11 Ep 1", 'title': "Gaston's Birthday",
'series': "Gruen", 'series': "Ben And Holly's Little Kingdom",
'description': 'md5:52cc744ad35045baf6aded2ce7287f67', 'description': 'md5:f9de914d02f226968f598ac76f105bcf',
'upload_date': '20190925', 'upload_date': '20180604',
'uploader_id': 'abc1', 'uploader_id': 'abc4kids',
'timestamp': 1569445289, 'timestamp': 1528140219,
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
@ -148,7 +148,7 @@ class ABCIViewIE(InfoExtractor):
'hdnea': token, 'hdnea': token,
}) })
for sd in ('720', 'sd', 'sd-low'): for sd in ('sd', 'sd-low'):
sd_url = try_get( sd_url = try_get(
stream, lambda x: x['streams']['hls'][sd], compat_str) stream, lambda x: x['streams']['hls'][sd], compat_str)
if not sd_url: if not sd_url:


@ -15,13 +15,10 @@ class AbcNewsVideoIE(AMPIE):
IE_NAME = 'abcnews:video' IE_NAME = 'abcnews:video'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?:// https?://
abcnews\.go\.com/
(?: (?:
abcnews\.go\.com/ [^/]+/video/(?P<display_id>[0-9a-z-]+)-|
(?: video/embed\?.*?\bid=
[^/]+/video/(?P<display_id>[0-9a-z-]+)-|
video/embed\?.*?\bid=
)|
fivethirtyeight\.abcnews\.go\.com/video/embed/\d+/
) )
(?P<id>\d+) (?P<id>\d+)
''' '''


@ -4,30 +4,29 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str
from ..utils import ( from ..utils import (
dict_get,
int_or_none, int_or_none,
try_get, parse_iso8601,
) )
class ABCOTVSIE(InfoExtractor): class ABCOTVSIE(InfoExtractor):
IE_NAME = 'abcotvs' IE_NAME = 'abcotvs'
IE_DESC = 'ABC Owned Television Stations' IE_DESC = 'ABC Owned Television Stations'
_VALID_URL = r'https?://(?P<site>abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:(?:/[^/]+)*/(?P<display_id>[^/]+))?/(?P<id>\d+)' _VALID_URL = r'https?://(?:abc(?:7(?:news|ny|chicago)?|11|13|30)|6abc)\.com(?:/[^/]+/(?P<display_id>[^/]+))?/(?P<id>\d+)'
_TESTS = [ _TESTS = [
{ {
'url': 'http://abc7news.com/entertainment/east-bay-museum-celebrates-vintage-synthesizers/472581/', 'url': 'http://abc7news.com/entertainment/east-bay-museum-celebrates-vintage-synthesizers/472581/',
'info_dict': { 'info_dict': {
'id': '472548', 'id': '472581',
'display_id': 'east-bay-museum-celebrates-vintage-synthesizers', 'display_id': 'east-bay-museum-celebrates-vintage-synthesizers',
'ext': 'mp4', 'ext': 'mp4',
'title': 'East Bay museum celebrates synthesized music', 'title': 'East Bay museum celebrates vintage synthesizers',
'description': 'md5:24ed2bd527096ec2a5c67b9d5a9005f3', 'description': 'md5:24ed2bd527096ec2a5c67b9d5a9005f3',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1421118520, 'timestamp': 1421123075,
'upload_date': '20150113', 'upload_date': '20150113',
'uploader': 'Jonathan Bloom',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@ -38,63 +37,39 @@ class ABCOTVSIE(InfoExtractor):
'url': 'http://abc7news.com/472581', 'url': 'http://abc7news.com/472581',
'only_matching': True, 'only_matching': True,
}, },
{
'url': 'https://6abc.com/man-75-killed-after-being-struck-by-vehicle-in-chester/5725182/',
'only_matching': True,
},
] ]
_SITE_MAP = {
'6abc': 'wpvi',
'abc11': 'wtvd',
'abc13': 'ktrk',
'abc30': 'kfsn',
'abc7': 'kabc',
'abc7chicago': 'wls',
'abc7news': 'kgo',
'abc7ny': 'wabc',
}
def _real_extract(self, url): def _real_extract(self, url):
site, display_id, video_id = re.match(self._VALID_URL, url).groups() mobj = re.match(self._VALID_URL, url)
display_id = display_id or video_id video_id = mobj.group('id')
station = self._SITE_MAP[site] display_id = mobj.group('display_id') or video_id
data = self._download_json( webpage = self._download_webpage(url, display_id)
'https://api.abcotvs.com/v2/content', display_id, query={
'id': video_id,
'key': 'otv.web.%s.story' % station,
'station': station,
})['data']
video = try_get(data, lambda x: x['featuredMedia']['video'], dict) or data
video_id = compat_str(dict_get(video, ('id', 'publishedKey'), video_id))
title = video.get('title') or video['linkText']
formats = [] m3u8 = self._html_search_meta(
m3u8_url = video.get('m3u8') 'contentURL', webpage, 'm3u8 url', fatal=True).split('?')[0]
if m3u8_url:
formats = self._extract_m3u8_formats( formats = self._extract_m3u8_formats(m3u8, display_id, 'mp4')
video['m3u8'].split('?')[0], display_id, 'mp4', m3u8_id='hls', fatal=False)
mp4_url = video.get('mp4')
if mp4_url:
formats.append({
'abr': 128,
'format_id': 'https',
'height': 360,
'url': mp4_url,
'width': 640,
})
self._sort_formats(formats) self._sort_formats(formats)
image = video.get('image') or {} title = self._og_search_title(webpage).strip()
description = self._og_search_description(webpage).strip()
thumbnail = self._og_search_thumbnail(webpage)
timestamp = parse_iso8601(self._search_regex(
r'<div class="meta">\s*<time class="timeago" datetime="([^"]+)">',
webpage, 'upload date', fatal=False))
uploader = self._search_regex(
r'rel="author">([^<]+)</a>',
webpage, 'uploader', default=None)
return { return {
'id': video_id, 'id': video_id,
'display_id': display_id, 'display_id': display_id,
'title': title, 'title': title,
'description': dict_get(video, ('description', 'caption'), try_get(video, lambda x: x['meta']['description'])), 'description': description,
'thumbnail': dict_get(image, ('source', 'dynamicSource')), 'thumbnail': thumbnail,
'timestamp': int_or_none(video.get('date')), 'timestamp': timestamp,
'duration': int_or_none(video.get('length')), 'uploader': uploader,
'formats': formats, 'formats': formats,
} }


@ -7,7 +7,6 @@ import functools
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import compat_str
from ..utils import ( from ..utils import (
clean_html,
float_or_none, float_or_none,
int_or_none, int_or_none,
try_get, try_get,
@ -28,7 +27,7 @@ class ACastIE(InfoExtractor):
''' '''
_TESTS = [{ _TESTS = [{
'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna', 'url': 'https://www.acast.com/sparpodcast/2.raggarmordet-rosterurdetforflutna',
'md5': '16d936099ec5ca2d5869e3a813ee8dc4', 'md5': 'a02393c74f3bdb1801c3ec2695577ce0',
'info_dict': { 'info_dict': {
'id': '2a92b283-1a75-4ad8-8396-499c641de0d9', 'id': '2a92b283-1a75-4ad8-8396-499c641de0d9',
'ext': 'mp3', 'ext': 'mp3',
@ -47,37 +46,28 @@ class ACastIE(InfoExtractor):
}, { }, {
'url': 'https://play.acast.com/s/rattegangspodden/s04e09-styckmordet-i-helenelund-del-22', 'url': 'https://play.acast.com/s/rattegangspodden/s04e09-styckmordet-i-helenelund-del-22',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://play.acast.com/s/sparpodcast/2a92b283-1a75-4ad8-8396-499c641de0d9',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
channel, display_id = re.match(self._VALID_URL, url).groups() channel, display_id = re.match(self._VALID_URL, url).groups()
s = self._download_json( s = self._download_json(
'https://feeder.acast.com/api/v1/shows/%s/episodes/%s' % (channel, display_id), 'https://play-api.acast.com/stitch/%s/%s' % (channel, display_id),
display_id) display_id)['result']
media_url = s['url'] media_url = s['url']
if re.search(r'[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}', display_id):
episode_url = s.get('episodeUrl')
if episode_url:
display_id = episode_url
else:
channel, display_id = re.match(self._VALID_URL, s['link']).groups()
cast_data = self._download_json( cast_data = self._download_json(
'https://play-api.acast.com/splash/%s/%s' % (channel, display_id), 'https://play-api.acast.com/splash/%s/%s' % (channel, display_id),
display_id)['result'] display_id)['result']
e = cast_data['episode'] e = cast_data['episode']
title = e.get('name') or s['title'] title = e['name']
return { return {
'id': compat_str(e['id']), 'id': compat_str(e['id']),
'display_id': display_id, 'display_id': display_id,
'url': media_url, 'url': media_url,
'title': title, 'title': title,
'description': e.get('summary') or clean_html(e.get('description') or s.get('description')), 'description': e.get('description') or e.get('summary'),
'thumbnail': e.get('image'), 'thumbnail': e.get('image'),
'timestamp': unified_timestamp(e.get('publishingDate') or s.get('publishDate')), 'timestamp': unified_timestamp(e.get('publishingDate')),
'duration': float_or_none(e.get('duration') or s.get('duration')), 'duration': float_or_none(s.get('duration') or e.get('duration')),
'filesize': int_or_none(e.get('contentLength')), 'filesize': int_or_none(e.get('contentLength')),
'creator': try_get(cast_data, lambda x: x['show']['author'], compat_str), 'creator': try_get(cast_data, lambda x: x['show']['author'], compat_str),
'series': try_get(cast_data, lambda x: x['show']['name'], compat_str), 'series': try_get(cast_data, lambda x: x['show']['name'], compat_str),
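
One side of the hunk above checks whether `display_id` is already an episode UUID before falling back to a slug lookup. The check can be sketched in isolation as follows. `looks_like_uuid` is a hypothetical helper name used only for illustration; the pattern is the one from the diff.

```python
import re

# Same pattern as in the hunk: 8-4-4-4-12 lowercase hex groups.
UUID_RE = re.compile(r'[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}')


def looks_like_uuid(display_id):
    """Return True if display_id contains an episode UUID (hypothetical helper)."""
    return UUID_RE.search(display_id) is not None
```

Because the pattern is used with `search` rather than a full match, it also accepts a UUID embedded in a longer string. That may or may not be intended.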


@ -0,0 +1,95 @@
from __future__ import unicode_literals
import re
from .common import InfoExtractor
from ..compat import (
compat_HTTPError,
compat_str,
compat_urllib_parse_urlencode,
compat_urllib_parse_urlparse,
)
from ..utils import (
ExtractorError,
qualities,
)
class AddAnimeIE(InfoExtractor):
_VALID_URL = r'https?://(?:\w+\.)?add-anime\.net/(?:watch_video\.php\?(?:.*?)v=|video/)(?P<id>[\w_]+)'
_TESTS = [{
'url': 'http://www.add-anime.net/watch_video.php?v=24MR3YO5SAS9',
'md5': '72954ea10bc979ab5e2eb288b21425a0',
'info_dict': {
'id': '24MR3YO5SAS9',
'ext': 'mp4',
'description': 'One Piece 606',
'title': 'One Piece 606',
},
'skip': 'Video is gone',
}, {
'url': 'http://add-anime.net/video/MDUGWYKNGBD8/One-Piece-687',
'only_matching': True,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
try:
webpage = self._download_webpage(url, video_id)
except ExtractorError as ee:
if not isinstance(ee.cause, compat_HTTPError) or \
ee.cause.code != 503:
raise
redir_webpage = ee.cause.read().decode('utf-8')
action = self._search_regex(
r'<form id="challenge-form" action="([^"]+)"',
redir_webpage, 'Redirect form')
vc = self._search_regex(
r'<input type="hidden" name="jschl_vc" value="([^"]+)"/>',
redir_webpage, 'redirect vc value')
av = re.search(
r'a\.value = ([0-9]+)[+]([0-9]+)[*]([0-9]+);',
redir_webpage)
if av is None:
raise ExtractorError('Cannot find redirect math task')
av_res = int(av.group(1)) + int(av.group(2)) * int(av.group(3))
parsed_url = compat_urllib_parse_urlparse(url)
av_val = av_res + len(parsed_url.netloc)
confirm_url = (
parsed_url.scheme + '://' + parsed_url.netloc +
action + '?' +
compat_urllib_parse_urlencode({
'jschl_vc': vc, 'jschl_answer': compat_str(av_val)}))
self._download_webpage(
confirm_url, video_id,
note='Confirming after redirect')
webpage = self._download_webpage(url, video_id)
FORMATS = ('normal', 'hq')
quality = qualities(FORMATS)
formats = []
for format_id in FORMATS:
rex = r"var %s_video_file = '(.*?)';" % re.escape(format_id)
video_url = self._search_regex(rex, webpage, 'video file URLx',
fatal=False)
if not video_url:
continue
formats.append({
'format_id': format_id,
'url': video_url,
'quality': quality(format_id),
})
self._sort_formats(formats)
video_title = self._og_search_title(webpage)
video_description = self._og_search_description(webpage)
return {
'_type': 'video',
'id': video_id,
'formats': formats,
'title': video_title,
'description': video_description
}
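
The challenge handling in the AddAnime extractor above comes down to a small piece of arithmetic: evaluate `a + b * c` from the page's script, then add the length of the host name. Below is a minimal sketch of that step on its own. `solve_challenge` and the sample strings are hypothetical and not part of youtube-dl, and the sketch uses Python 3's `urllib.parse` instead of the `compat` wrappers.

```python
import re
from urllib.parse import urlparse


def solve_challenge(page, url):
    """Return the challenge answer for page, or None if no math task is found."""
    m = re.search(r'a\.value = ([0-9]+)[+]([0-9]+)[*]([0-9]+);', page)
    if m is None:
        return None
    a, b, c = (int(g) for g in m.groups())
    # Multiplication binds tighter than addition, matching the extractor's
    # int(group 1) + int(group 2) * int(group 3).
    return a + b * c + len(urlparse(url).netloc)


# Made-up challenge snippet, for illustration only.
answer = solve_challenge('a.value = 12+3*4;', 'http://www.add-anime.net/video/X')
```

Returning `None` rather than raising keeps the sketch free of youtube-dl's `ExtractorError`; the extractor itself raises when the math task is missing.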


@ -65,15 +65,14 @@ class ADNIE(InfoExtractor):
if subtitle_location: if subtitle_location:
enc_subtitles = self._download_webpage( enc_subtitles = self._download_webpage(
urljoin(self._BASE_URL, subtitle_location), urljoin(self._BASE_URL, subtitle_location),
video_id, 'Downloading subtitles data', fatal=False, video_id, 'Downloading subtitles data', fatal=False)
headers={'Origin': 'https://animedigitalnetwork.fr'})
if not enc_subtitles: if not enc_subtitles:
return None return None
# http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js # http://animedigitalnetwork.fr/components/com_vodvideo/videojs/adn-vjs.min.js
dec_subtitles = intlist_to_bytes(aes_cbc_decrypt( dec_subtitles = intlist_to_bytes(aes_cbc_decrypt(
bytes_to_intlist(compat_b64decode(enc_subtitles[24:])), bytes_to_intlist(compat_b64decode(enc_subtitles[24:])),
bytes_to_intlist(binascii.unhexlify(self._K + '4b8ef13ec1872730')), bytes_to_intlist(binascii.unhexlify(self._K + '4421de0a5f0814ba')),
bytes_to_intlist(compat_b64decode(enc_subtitles[:24])) bytes_to_intlist(compat_b64decode(enc_subtitles[:24]))
)) ))
subtitles_json = self._parse_json( subtitles_json = self._parse_json(


@ -25,11 +25,6 @@ MSO_INFO = {
'username_field': 'username', 'username_field': 'username',
'password_field': 'password', 'password_field': 'password',
}, },
'ATT': {
'name': 'AT&T U-verse',
'username_field': 'userid',
'password_field': 'password',
},
'ATTOTT': { 'ATTOTT': {
'name': 'DIRECTV NOW', 'name': 'DIRECTV NOW',
'username_field': 'email', 'username_field': 'email',


@ -1,119 +1,25 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import functools
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import compat_str
from ..utils import ( from ..utils import (
float_or_none,
int_or_none,
ISO639Utils,
OnDemandPagedList,
parse_duration, parse_duration,
str_or_none,
str_to_int,
unified_strdate, unified_strdate,
str_to_int,
int_or_none,
float_or_none,
ISO639Utils,
determine_ext,
) )
class AdobeTVBaseIE(InfoExtractor): class AdobeTVBaseIE(InfoExtractor):
def _call_api(self, path, video_id, query, note=None): _API_BASE_URL = 'http://tv.adobe.com/api/v4/'
return self._download_json(
'http://tv.adobe.com/api/v4/' + path,
video_id, note, query=query)['data']
def _parse_subtitles(self, video_data, url_key):
subtitles = {}
for translation in video_data.get('translations', []):
vtt_path = translation.get(url_key)
if not vtt_path:
continue
lang = translation.get('language_w3c') or ISO639Utils.long2short(translation['language_medium'])
subtitles.setdefault(lang, []).append({
'ext': 'vtt',
'url': vtt_path,
})
return subtitles
def _parse_video_data(self, video_data):
video_id = compat_str(video_data['id'])
title = video_data['title']
s3_extracted = False
formats = []
for source in video_data.get('videos', []):
source_url = source.get('url')
if not source_url:
continue
f = {
'format_id': source.get('quality_level'),
'fps': int_or_none(source.get('frame_rate')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('video_data_rate')),
'width': int_or_none(source.get('width')),
'url': source_url,
}
original_filename = source.get('original_filename')
if original_filename:
if not (f.get('height') and f.get('width')):
mobj = re.search(r'_(\d+)x(\d+)', original_filename)
if mobj:
f.update({
'height': int(mobj.group(2)),
'width': int(mobj.group(1)),
})
if original_filename.startswith('s3://') and not s3_extracted:
formats.append({
'format_id': 'original',
'preference': 1,
'url': original_filename.replace('s3://', 'https://s3.amazonaws.com/'),
})
s3_extracted = True
formats.append(f)
self._sort_formats(formats)
return {
'id': video_id,
'title': title,
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnail'),
'upload_date': unified_strdate(video_data.get('start_date')),
'duration': parse_duration(video_data.get('duration')),
'view_count': str_to_int(video_data.get('playcount')),
'formats': formats,
'subtitles': self._parse_subtitles(video_data, 'vtt'),
}
class AdobeTVEmbedIE(AdobeTVBaseIE):
IE_NAME = 'adobetv:embed'
_VALID_URL = r'https?://tv\.adobe\.com/embed/\d+/(?P<id>\d+)'
_TEST = {
'url': 'https://tv.adobe.com/embed/22/4153',
'md5': 'c8c0461bf04d54574fc2b4d07ac6783a',
'info_dict': {
'id': '4153',
'ext': 'flv',
'title': 'Creating Graphics Optimized for BlackBerry',
'description': 'md5:eac6e8dced38bdaae51cd94447927459',
'thumbnail': r're:https?://.*\.jpg$',
'upload_date': '20091109',
'duration': 377,
'view_count': int,
},
}
def _real_extract(self, url):
video_id = self._match_id(url)
video_data = self._call_api(
'episode/' + video_id, video_id, {'disclosure': 'standard'})[0]
return self._parse_video_data(video_data)
class AdobeTVIE(AdobeTVBaseIE): class AdobeTVIE(AdobeTVBaseIE):
IE_NAME = 'adobetv'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)' _VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?watch/(?P<show_urlname>[^/]+)/(?P<id>[^/]+)'
_TEST = { _TEST = {
@ -136,33 +42,45 @@ class AdobeTVIE(AdobeTVBaseIE):
if not language: if not language:
language = 'en' language = 'en'
video_data = self._call_api( video_data = self._download_json(
'episode/get', urlname, { self._API_BASE_URL + 'episode/get/?language=%s&show_urlname=%s&urlname=%s&disclosure=standard' % (language, show_urlname, urlname),
'disclosure': 'standard', urlname)['data'][0]
'language': language,
'show_urlname': show_urlname, formats = [{
'urlname': urlname, 'url': source['url'],
})[0] 'format_id': source.get('quality_level') or source['url'].split('-')[-1].split('.')[0] or None,
return self._parse_video_data(video_data) 'width': int_or_none(source.get('width')),
'height': int_or_none(source.get('height')),
'tbr': int_or_none(source.get('video_data_rate')),
} for source in video_data['videos']]
self._sort_formats(formats)
return {
'id': compat_str(video_data['id']),
'title': video_data['title'],
'description': video_data.get('description'),
'thumbnail': video_data.get('thumbnail'),
'upload_date': unified_strdate(video_data.get('start_date')),
'duration': parse_duration(video_data.get('duration')),
'view_count': str_to_int(video_data.get('playcount')),
'formats': formats,
}
class AdobeTVPlaylistBaseIE(AdobeTVBaseIE): class AdobeTVPlaylistBaseIE(AdobeTVBaseIE):
_PAGE_SIZE = 25 def _parse_page_data(self, page_data):
return [self.url_result(self._get_element_url(element_data)) for element_data in page_data]
def _fetch_page(self, display_id, query, page): def _extract_playlist_entries(self, url, display_id):
page += 1 page = self._download_json(url, display_id)
query['page'] = page entries = self._parse_page_data(page['data'])
for element_data in self._call_api( for page_num in range(2, page['paging']['pages'] + 1):
self._RESOURCE, display_id, query, 'Download Page %d' % page): entries.extend(self._parse_page_data(
yield self._process_data(element_data) self._download_json(url + '&page=%d' % page_num, display_id)['data']))
return entries
def _extract_playlist_entries(self, display_id, query):
return OnDemandPagedList(functools.partial(
self._fetch_page, display_id, query), self._PAGE_SIZE)
class AdobeTVShowIE(AdobeTVPlaylistBaseIE): class AdobeTVShowIE(AdobeTVPlaylistBaseIE):
IE_NAME = 'adobetv:show'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?show/(?P<id>[^/]+)' _VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?show/(?P<id>[^/]+)'
_TEST = { _TEST = {
@ -174,31 +92,26 @@ class AdobeTVShowIE(AdobeTVPlaylistBaseIE):
}, },
'playlist_mincount': 136, 'playlist_mincount': 136,
} }
_RESOURCE = 'episode'
_process_data = AdobeTVBaseIE._parse_video_data def _get_element_url(self, element_data):
return element_data['urls'][0]
def _real_extract(self, url): def _real_extract(self, url):
language, show_urlname = re.match(self._VALID_URL, url).groups() language, show_urlname = re.match(self._VALID_URL, url).groups()
if not language: if not language:
language = 'en' language = 'en'
query = { query = 'language=%s&show_urlname=%s' % (language, show_urlname)
'disclosure': 'standard',
'language': language,
'show_urlname': show_urlname,
}
show_data = self._call_api( show_data = self._download_json(self._API_BASE_URL + 'show/get/?%s' % query, show_urlname)['data'][0]
'show/get', show_urlname, query)[0]
return self.playlist_result( return self.playlist_result(
self._extract_playlist_entries(show_urlname, query), self._extract_playlist_entries(self._API_BASE_URL + 'episode/?%s' % query, show_urlname),
str_or_none(show_data.get('id')), compat_str(show_data['id']),
show_data.get('show_name'), show_data['show_name'],
show_data.get('show_description')) show_data['show_description'])
class AdobeTVChannelIE(AdobeTVPlaylistBaseIE): class AdobeTVChannelIE(AdobeTVPlaylistBaseIE):
IE_NAME = 'adobetv:channel'
_VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?channel/(?P<id>[^/]+)(?:/(?P<category_urlname>[^/]+))?' _VALID_URL = r'https?://tv\.adobe\.com/(?:(?P<language>fr|de|es|jp)/)?channel/(?P<id>[^/]+)(?:/(?P<category_urlname>[^/]+))?'
_TEST = { _TEST = {
@ -208,30 +121,24 @@ class AdobeTVChannelIE(AdobeTVPlaylistBaseIE):
}, },
'playlist_mincount': 96, 'playlist_mincount': 96,
} }
_RESOURCE = 'show'
def _process_data(self, show_data): def _get_element_url(self, element_data):
return self.url_result( return element_data['url']
show_data['url'], 'AdobeTVShow', str_or_none(show_data.get('id')))
def _real_extract(self, url): def _real_extract(self, url):
language, channel_urlname, category_urlname = re.match(self._VALID_URL, url).groups() language, channel_urlname, category_urlname = re.match(self._VALID_URL, url).groups()
if not language: if not language:
language = 'en' language = 'en'
query = { query = 'language=%s&channel_urlname=%s' % (language, channel_urlname)
'channel_urlname': channel_urlname,
'language': language,
}
if category_urlname: if category_urlname:
query['category_urlname'] = category_urlname query += '&category_urlname=%s' % category_urlname
return self.playlist_result( return self.playlist_result(
self._extract_playlist_entries(channel_urlname, query), self._extract_playlist_entries(self._API_BASE_URL + 'show/?%s' % query, channel_urlname),
channel_urlname) channel_urlname)
class AdobeTVVideoIE(AdobeTVBaseIE): class AdobeTVVideoIE(InfoExtractor):
IE_NAME = 'adobetv:video'
_VALID_URL = r'https?://video\.tv\.adobe\.com/v/(?P<id>\d+)' _VALID_URL = r'https?://video\.tv\.adobe\.com/v/(?P<id>\d+)'
_TEST = { _TEST = {
@ -253,36 +160,38 @@ class AdobeTVVideoIE(AdobeTVBaseIE):
video_data = self._parse_json(self._search_regex( video_data = self._parse_json(self._search_regex(
r'var\s+bridge\s*=\s*([^;]+);', webpage, 'bridged data'), video_id) r'var\s+bridge\s*=\s*([^;]+);', webpage, 'bridged data'), video_id)
title = video_data['title']
formats = [] formats = [{
sources = video_data.get('sources') or [] 'format_id': '%s-%s' % (determine_ext(source['src']), source.get('height')),
for source in sources: 'url': source['src'],
source_src = source.get('src') 'width': int_or_none(source.get('width')),
if not source_src: 'height': int_or_none(source.get('height')),
continue 'tbr': int_or_none(source.get('bitrate')),
formats.append({ } for source in video_data['sources']]
'filesize': int_or_none(source.get('kilobytes') or None, invscale=1000),
'format_id': '-'.join(filter(None, [source.get('format'), source.get('label')])),
'height': int_or_none(source.get('height') or None),
'tbr': int_or_none(source.get('bitrate') or None),
'width': int_or_none(source.get('width') or None),
'url': source_src,
})
self._sort_formats(formats) self._sort_formats(formats)
# For both metadata and downloaded files the duration varies among # For both metadata and downloaded files the duration varies among
# formats. I just pick the max one # formats. I just pick the max one
duration = max(filter(None, [ duration = max(filter(None, [
float_or_none(source.get('duration'), scale=1000) float_or_none(source.get('duration'), scale=1000)
for source in sources])) for source in video_data['sources']]))
subtitles = {}
for translation in video_data.get('translations', []):
lang_id = translation.get('language_w3c') or ISO639Utils.long2short(translation['language_medium'])
if lang_id not in subtitles:
subtitles[lang_id] = []
subtitles[lang_id].append({
'url': translation['vttPath'],
'ext': 'vtt',
})
return { return {
'id': video_id, 'id': video_id,
'formats': formats, 'formats': formats,
'title': title, 'title': video_data['title'],
'description': video_data.get('description'), 'description': video_data.get('description'),
'thumbnail': video_data.get('video', {}).get('poster'), 'thumbnail': video_data['video'].get('poster'),
'duration': duration, 'duration': duration,
'subtitles': self._parse_subtitles(video_data, 'vttPath'), 'subtitles': subtitles,
} }
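
The subtitle handling in this file builds a dict that maps a language code to a list of VTT entries, once with `setdefault` and once with an explicit membership check. A minimal standalone sketch of the `setdefault` variant follows. `group_subtitles` and the sample records are hypothetical. The fallback through `ISO639Utils.long2short` is left out, so entries without `language_w3c` are skipped here.

```python
def group_subtitles(translations, url_key):
    """Group translation records into {lang: [{'ext': 'vtt', 'url': ...}]}."""
    subtitles = {}
    for translation in translations:
        vtt_path = translation.get(url_key)
        lang = translation.get('language_w3c')
        if not vtt_path or not lang:
            # Skip records with no subtitle URL or no usable language code.
            continue
        subtitles.setdefault(lang, []).append({
            'ext': 'vtt',
            'url': vtt_path,
        })
    return subtitles


# Hypothetical sample data, not taken from the Adobe TV API.
sample = [
    {'language_w3c': 'en', 'vttPath': 'https://example.com/en.vtt'},
    {'language_w3c': 'en', 'vttPath': 'https://example.com/en-cc.vtt'},
    {'language_w3c': 'de'},
]
grouped = group_subtitles(sample, 'vttPath')
```

Passing the URL key as a parameter follows the `_parse_subtitles(video_data, url_key)` signature in the diff, which is called with `'vtt'` in one place and `'vttPath'` in another.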


@ -275,7 +275,7 @@ class AfreecaTVIE(InfoExtractor):
video_element = video_xml.findall(compat_xpath('./track/video'))[-1] video_element = video_xml.findall(compat_xpath('./track/video'))[-1]
if video_element is None or video_element.text is None: if video_element is None or video_element.text is None:
raise ExtractorError( raise ExtractorError(
'Video %s does not exist' % video_id, expected=True) 'Video %s video does not exist' % video_id, expected=True)
video_url = video_element.text.strip() video_url = video_element.text.strip()


@ -5,7 +5,6 @@ from .common import InfoExtractor
from ..utils import ( from ..utils import (
clean_html, clean_html,
int_or_none, int_or_none,
js_to_json,
try_get, try_get,
unified_strdate, unified_strdate,
) )
@ -14,21 +13,22 @@ from ..utils import (
class AmericasTestKitchenIE(InfoExtractor): class AmericasTestKitchenIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?americastestkitchen\.com/(?:episode|videos)/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?americastestkitchen\.com/(?:episode|videos)/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.americastestkitchen.com/episode/582-weeknight-japanese-suppers', 'url': 'https://www.americastestkitchen.com/episode/548-summer-dinner-party',
'md5': 'b861c3e365ac38ad319cfd509c30577f', 'md5': 'b861c3e365ac38ad319cfd509c30577f',
'info_dict': { 'info_dict': {
'id': '5b400b9ee338f922cb06450c', 'id': '1_5g5zua6e',
'title': 'Weeknight Japanese Suppers', 'title': 'Summer Dinner Party',
'ext': 'mp4', 'ext': 'mp4',
'description': 'md5:3d0c1a44bb3b27607ce82652db25b4a8', 'description': 'md5:858d986e73a4826979b6a5d9f8f6a1ec',
'thumbnail': r're:^https?://', 'thumbnail': r're:^https?://.*\.jpg',
'timestamp': 1523664000, 'timestamp': 1497285541,
'upload_date': '20180414', 'upload_date': '20170612',
'release_date': '20180414', 'uploader_id': 'roger.metcalf@americastestkitchen.com',
'release_date': '20170617',
'series': "America's Test Kitchen", 'series': "America's Test Kitchen",
'season_number': 18, 'season_number': 17,
'episode': 'Weeknight Japanese Suppers', 'episode': 'Summer Dinner Party',
'episode_number': 15, 'episode_number': 24,
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
@ -47,7 +47,7 @@ class AmericasTestKitchenIE(InfoExtractor):
self._search_regex( self._search_regex(
r'window\.__INITIAL_STATE__\s*=\s*({.+?})\s*;\s*</script>', r'window\.__INITIAL_STATE__\s*=\s*({.+?})\s*;\s*</script>',
webpage, 'initial context'), webpage, 'initial context'),
video_id, js_to_json) video_id)
ep_data = try_get( ep_data = try_get(
video_data, video_data,
@ -55,7 +55,17 @@ class AmericasTestKitchenIE(InfoExtractor):
lambda x: x['videoDetail']['content']['data']), dict) lambda x: x['videoDetail']['content']['data']), dict)
ep_meta = ep_data.get('full_video', {}) ep_meta = ep_data.get('full_video', {})
zype_id = ep_data.get('zype_id') or ep_meta['zype_id'] zype_id = ep_meta.get('zype_id')
if zype_id:
embed_url = 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % zype_id
ie_key = 'Zype'
else:
partner_id = self._search_regex(
r'src=["\'](?:https?:)?//(?:[^/]+\.)kaltura\.com/(?:[^/]+/)*(?:p|partner_id)/(\d+)',
webpage, 'kaltura partner id')
external_id = ep_data.get('external_id') or ep_meta['external_id']
embed_url = 'kaltura:%s:%s' % (partner_id, external_id)
ie_key = 'Kaltura'
title = ep_data.get('title') or ep_meta.get('title') title = ep_data.get('title') or ep_meta.get('title')
description = clean_html(ep_meta.get('episode_description') or ep_data.get( description = clean_html(ep_meta.get('episode_description') or ep_data.get(
@ -69,8 +79,8 @@ class AmericasTestKitchenIE(InfoExtractor):
return { return {
'_type': 'url_transparent', '_type': 'url_transparent',
'url': 'https://player.zype.com/embed/%s.js?api_key=jZ9GUhRmxcPvX7M3SlfejB6Hle9jyHTdk2jVxG7wOHPLODgncEKVdPYBhuz9iWXQ' % zype_id, 'url': embed_url,
'ie_key': 'Zype', 'ie_key': ie_key,
'title': title, 'title': title,
'description': description, 'description': description,
'thumbnail': thumbnail, 'thumbnail': thumbnail,


@ -1,7 +1,6 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import json
import re import re
from .common import InfoExtractor from .common import InfoExtractor
@ -23,101 +22,7 @@ from ..utils import (
from ..compat import compat_etree_fromstring from ..compat import compat_etree_fromstring
class ARDMediathekBaseIE(InfoExtractor): class ARDMediathekIE(InfoExtractor):
_GEO_COUNTRIES = ['DE']
def _extract_media_info(self, media_info_url, webpage, video_id):
media_info = self._download_json(
media_info_url, video_id, 'Downloading media JSON')
return self._parse_media_info(media_info, video_id, '"fsk"' in webpage)
def _parse_media_info(self, media_info, video_id, fsk):
formats = self._extract_formats(media_info, video_id)
if not formats:
if fsk:
raise ExtractorError(
'This video is only available after 20:00', expected=True)
elif media_info.get('_geoblocked'):
self.raise_geo_restricted(
'This video is not available due to geoblocking',
countries=self._GEO_COUNTRIES)
self._sort_formats(formats)
subtitles = {}
subtitle_url = media_info.get('_subtitleUrl')
if subtitle_url:
subtitles['de'] = [{
'ext': 'ttml',
'url': subtitle_url,
}]
return {
'id': video_id,
'duration': int_or_none(media_info.get('_duration')),
'thumbnail': media_info.get('_previewImage'),
'is_live': media_info.get('_isLive') is True,
'formats': formats,
'subtitles': subtitles,
}
def _extract_formats(self, media_info, video_id):
type_ = media_info.get('_type')
media_array = media_info.get('_mediaArray', [])
formats = []
for num, media in enumerate(media_array):
for stream in media.get('_mediaStreamArray', []):
stream_urls = stream.get('_stream')
if not stream_urls:
continue
if not isinstance(stream_urls, list):
stream_urls = [stream_urls]
quality = stream.get('_quality')
server = stream.get('_server')
for stream_url in stream_urls:
if not url_or_none(stream_url):
continue
ext = determine_ext(stream_url)
if quality != 'auto' and ext in ('f4m', 'm3u8'):
continue
if ext == 'f4m':
formats.extend(self._extract_f4m_formats(
update_url_query(stream_url, {
'hdcore': '3.1.1',
'plugin': 'aasp-3.1.1.69.124'
}), video_id, f4m_id='hds', fatal=False))
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
stream_url, video_id, 'mp4', 'm3u8_native',
m3u8_id='hls', fatal=False))
else:
if server and server.startswith('rtmp'):
f = {
'url': server,
'play_path': stream_url,
'format_id': 'a%s-rtmp-%s' % (num, quality),
}
else:
f = {
'url': stream_url,
'format_id': 'a%s-%s-%s' % (num, ext, quality)
}
m = re.search(
r'_(?P<width>\d+)x(?P<height>\d+)\.mp4$',
stream_url)
if m:
f.update({
'width': int(m.group('width')),
'height': int(m.group('height')),
})
if type_ == 'audio':
f['vcodec'] = 'none'
formats.append(f)
return formats
class ARDMediathekIE(ARDMediathekBaseIE):
IE_NAME = 'ARD:mediathek' IE_NAME = 'ARD:mediathek'
_VALID_URL = r'^https?://(?:(?:(?:www|classic)\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de|one\.ard\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?' _VALID_URL = r'^https?://(?:(?:(?:www|classic)\.)?ardmediathek\.de|mediathek\.(?:daserste|rbb-online)\.de|one\.ard\.de)/(?:.*/)(?P<video_id>[0-9]+|[^0-9][^/\?]+)[^/\?]*(?:\?.*)?'
@ -158,6 +63,94 @@ class ARDMediathekIE(ARDMediathekBaseIE):
def suitable(cls, url): def suitable(cls, url):
return False if ARDBetaMediathekIE.suitable(url) else super(ARDMediathekIE, cls).suitable(url) return False if ARDBetaMediathekIE.suitable(url) else super(ARDMediathekIE, cls).suitable(url)
def _extract_media_info(self, media_info_url, webpage, video_id):
media_info = self._download_json(
media_info_url, video_id, 'Downloading media JSON')
formats = self._extract_formats(media_info, video_id)
if not formats:
if '"fsk"' in webpage:
raise ExtractorError(
'This video is only available after 20:00', expected=True)
elif media_info.get('_geoblocked'):
raise ExtractorError('This video is not available due to geo restriction', expected=True)
self._sort_formats(formats)
duration = int_or_none(media_info.get('_duration'))
thumbnail = media_info.get('_previewImage')
is_live = media_info.get('_isLive') is True
subtitles = {}
subtitle_url = media_info.get('_subtitleUrl')
if subtitle_url:
subtitles['de'] = [{
'ext': 'ttml',
'url': subtitle_url,
}]
return {
'id': video_id,
'duration': duration,
'thumbnail': thumbnail,
'is_live': is_live,
'formats': formats,
'subtitles': subtitles,
}
def _extract_formats(self, media_info, video_id):
type_ = media_info.get('_type')
media_array = media_info.get('_mediaArray', [])
formats = []
for num, media in enumerate(media_array):
for stream in media.get('_mediaStreamArray', []):
stream_urls = stream.get('_stream')
if not stream_urls:
continue
if not isinstance(stream_urls, list):
stream_urls = [stream_urls]
quality = stream.get('_quality')
server = stream.get('_server')
for stream_url in stream_urls:
if not url_or_none(stream_url):
continue
ext = determine_ext(stream_url)
if quality != 'auto' and ext in ('f4m', 'm3u8'):
continue
if ext == 'f4m':
formats.extend(self._extract_f4m_formats(
update_url_query(stream_url, {
'hdcore': '3.1.1',
'plugin': 'aasp-3.1.1.69.124'
}),
video_id, f4m_id='hds', fatal=False))
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
stream_url, video_id, 'mp4', m3u8_id='hls', fatal=False))
else:
if server and server.startswith('rtmp'):
f = {
'url': server,
'play_path': stream_url,
'format_id': 'a%s-rtmp-%s' % (num, quality),
}
else:
f = {
'url': stream_url,
'format_id': 'a%s-%s-%s' % (num, ext, quality)
}
m = re.search(r'_(?P<width>\d+)x(?P<height>\d+)\.mp4$', stream_url)
if m:
f.update({
'width': int(m.group('width')),
'height': int(m.group('height')),
})
if type_ == 'audio':
f['vcodec'] = 'none'
formats.append(f)
return formats
def _real_extract(self, url): def _real_extract(self, url):
# determine video id from url # determine video id from url
m = re.match(self._VALID_URL, url) m = re.match(self._VALID_URL, url)
@ -249,7 +242,7 @@ class ARDMediathekIE(ARDMediathekBaseIE):
class ARDIE(InfoExtractor): class ARDIE(InfoExtractor):
_VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos(?:extern)?/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html' _VALID_URL = r'(?P<mainurl>https?://(www\.)?daserste\.de/[^?#]+/videos/(?P<display_id>[^/?#]+)-(?P<id>[0-9]+))\.html'
_TESTS = [{ _TESTS = [{
# available till 14.02.2019 # available till 14.02.2019
'url': 'http://www.daserste.de/information/talk/maischberger/videos/das-groko-drama-zerlegen-sich-die-volksparteien-video-102.html', 'url': 'http://www.daserste.de/information/talk/maischberger/videos/das-groko-drama-zerlegen-sich-die-volksparteien-video-102.html',
@ -263,9 +256,6 @@ class ARDIE(InfoExtractor):
'upload_date': '20180214', 'upload_date': '20180214',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
}, },
}, {
'url': 'https://www.daserste.de/information/reportage-dokumentation/erlebnis-erde/videosextern/woelfe-und-herdenschutzhunde-ungleiche-brueder-102.html',
'only_matching': True,
}, { }, {
'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html', 'url': 'http://www.daserste.de/information/reportage-dokumentation/dokus/videos/die-story-im-ersten-mission-unter-falscher-flagge-100.html',
'only_matching': True, 'only_matching': True,
@ -312,31 +302,21 @@ class ARDIE(InfoExtractor):
} }
class ARDBetaMediathekIE(ARDMediathekBaseIE): class ARDBetaMediathekIE(InfoExtractor):
_VALID_URL = r'https://(?:(?:beta|www)\.)?ardmediathek\.de/(?P<client>[^/]+)/(?:player|live|video)/(?P<display_id>(?:[^/]+/)*)(?P<video_id>[a-zA-Z0-9]+)' _VALID_URL = r'https://(?:beta|www)\.ardmediathek\.de/[^/]+/(?:player|live)/(?P<video_id>[a-zA-Z0-9]+)(?:/(?P<display_id>[^/?#]+))?'
_TESTS = [{ _TESTS = [{
'url': 'https://ardmediathek.de/ard/video/die-robuste-roswita/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE', 'url': 'https://beta.ardmediathek.de/ard/player/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE/die-robuste-roswita',
'md5': 'dfdc87d2e7e09d073d5a80770a9ce88f', 'md5': '2d02d996156ea3c397cfc5036b5d7f8f',
'info_dict': { 'info_dict': {
'display_id': 'die-robuste-roswita', 'display_id': 'die-robuste-roswita',
'id': '70153354', 'id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
'title': 'Die robuste Roswita', 'title': 'Tatort: Die robuste Roswita',
'description': r're:^Der Mord.*trüber ist als die Ilm.', 'description': r're:^Der Mord.*trüber ist als die Ilm.',
'duration': 5316, 'duration': 5316,
'thumbnail': 'https://img.ardmediathek.de/standard/00/70/15/33/90/-1852531467/16x9/960?mandant=ard', 'thumbnail': 'https://img.ardmediathek.de/standard/00/55/43/59/34/-1774185891/16x9/960?mandant=ard',
'timestamp': 1577047500, 'upload_date': '20180826',
'upload_date': '20191222',
'ext': 'mp4', 'ext': 'mp4',
}, },
}, {
'url': 'https://beta.ardmediathek.de/ard/video/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
'only_matching': True,
}, {
'url': 'https://ardmediathek.de/ard/video/saartalk/saartalk-gesellschaftsgift-haltung-gegen-hass/sr-fernsehen/Y3JpZDovL3NyLW9ubGluZS5kZS9TVF84MTY4MA/',
'only_matching': True,
}, {
'url': 'https://www.ardmediathek.de/ard/video/trailer/private-eyes-s01-e01/one/Y3JpZDovL3dkci5kZS9CZWl0cmFnLTE1MTgwYzczLWNiMTEtNGNkMS1iMjUyLTg5MGYzOWQxZmQ1YQ/',
'only_matching': True,
}, { }, {
'url': 'https://www.ardmediathek.de/ard/player/Y3JpZDovL3N3ci5kZS9hZXgvbzEwNzE5MTU/', 'url': 'https://www.ardmediathek.de/ard/player/Y3JpZDovL3N3ci5kZS9hZXgvbzEwNzE5MTU/',
'only_matching': True, 'only_matching': True,
@ -348,75 +328,73 @@ class ARDBetaMediathekIE(ARDMediathekBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('video_id') video_id = mobj.group('video_id')
display_id = mobj.group('display_id') display_id = mobj.group('display_id') or video_id
if display_id:
display_id = display_id.rstrip('/')
if not display_id:
display_id = video_id
player_page = self._download_json( webpage = self._download_webpage(url, display_id)
'https://api.ardmediathek.de/public-gateway', data_json = self._search_regex(r'window\.__APOLLO_STATE__\s*=\s*(\{.*);\n', webpage, 'json')
display_id, data=json.dumps({ data = self._parse_json(data_json, display_id)
'query': '''{
playerPage(client:"%s", clipId: "%s") { res = {
blockedByFsk 'id': video_id,
broadcastedOn
maturityContentRating
mediaCollection {
_duration
_geoblocked
_isLive
_mediaArray {
_mediaStreamArray {
_quality
_server
_stream
}
}
_previewImage
_subtitleUrl
_type
}
show {
title
}
synopsis
title
tracking {
atiCustomVars {
contentId
}
}
}
}''' % (mobj.group('client'), video_id),
}).encode(), headers={
'Content-Type': 'application/json'
})['data']['playerPage']
title = player_page['title']
content_id = str_or_none(try_get(
player_page, lambda x: x['tracking']['atiCustomVars']['contentId']))
media_collection = player_page.get('mediaCollection') or {}
if not media_collection and content_id:
media_collection = self._download_json(
'https://www.ardmediathek.de/play/media/' + content_id,
content_id, fatal=False) or {}
info = self._parse_media_info(
media_collection, content_id or video_id,
player_page.get('blockedByFsk'))
age_limit = None
description = player_page.get('synopsis')
maturity_content_rating = player_page.get('maturityContentRating')
if maturity_content_rating:
age_limit = int_or_none(maturity_content_rating.lstrip('FSK'))
if not age_limit and description:
age_limit = int_or_none(self._search_regex(
r'\(FSK\s*(\d+)\)\s*$', description, 'age limit', default=None))
info.update({
'age_limit': age_limit,
'display_id': display_id, 'display_id': display_id,
'title': title, }
'description': description, formats = []
'timestamp': unified_timestamp(player_page.get('broadcastedOn')), subtitles = {}
'series': try_get(player_page, lambda x: x['show']['title']), geoblocked = False
for widget in data.values():
if widget.get('_geoblocked') is True:
geoblocked = True
if '_duration' in widget:
res['duration'] = int_or_none(widget['_duration'])
if 'clipTitle' in widget:
res['title'] = widget['clipTitle']
if '_previewImage' in widget:
res['thumbnail'] = widget['_previewImage']
if 'broadcastedOn' in widget:
res['timestamp'] = unified_timestamp(widget['broadcastedOn'])
if 'synopsis' in widget:
res['description'] = widget['synopsis']
subtitle_url = url_or_none(widget.get('_subtitleUrl'))
if subtitle_url:
subtitles.setdefault('de', []).append({
'ext': 'ttml',
'url': subtitle_url,
})
if '_quality' in widget:
format_url = url_or_none(try_get(
widget, lambda x: x['_stream']['json'][0]))
if not format_url:
continue
ext = determine_ext(format_url)
if ext == 'f4m':
formats.extend(self._extract_f4m_formats(
format_url + '?hdcore=3.11.0',
video_id, f4m_id='hds', fatal=False))
elif ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', m3u8_id='hls',
fatal=False))
else:
# HTTP formats are not available when geoblocked is True,
# other formats are fine though
if geoblocked:
continue
quality = str_or_none(widget.get('_quality'))
formats.append({
'format_id': ('http-' + quality) if quality else 'http',
'url': format_url,
'preference': 10, # Plain HTTP, that's nice
})
if not formats and geoblocked:
self.raise_geo_restricted(
msg='This video is not available due to geoblocking',
countries=['DE'])
self._sort_formats(formats)
res.update({
'subtitles': subtitles,
'formats': formats,
}) })
return info
return res
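
The right-hand `_real_extract` above reads its metadata from a JSON blob embedded in the page rather than from the GraphQL gateway. The following is a minimal standalone sketch of that idea, assuming Python 3 and only the standard library; `SAMPLE_PAGE`, `extract_apollo_state` and `summarize` are hypothetical names used for illustration and are not part of youtube-dl.

```python
import json
import re

# Hypothetical page fragment; a real page is far larger and may differ in layout.
SAMPLE_PAGE = (
    '<script>\n'
    'window.__APOLLO_STATE__ = {"w1": {"clipTitle": "Demo", "_duration": 90},'
    ' "w2": {"_geoblocked": true}};\n'
    '</script>\n'
)


def extract_apollo_state(webpage):
    """Return the embedded state as a dict, or None if the marker is absent."""
    # Same pattern as in the diff: capture from the first "{" up to the last
    # ";" on that line ("." does not match newlines by default).
    match = re.search(r'window\.__APOLLO_STATE__\s*=\s*(\{.*);\n', webpage)
    if match is None:
        return None
    return json.loads(match.group(1))


def summarize(state):
    """Scan every widget and keep the fields found, as the diff's loop does."""
    res = {}
    geoblocked = False
    for widget in state.values():
        if widget.get('_geoblocked') is True:
            geoblocked = True
        if 'clipTitle' in widget:
            res['title'] = widget['clipTitle']
        if '_duration' in widget:
            res['duration'] = widget['_duration']
    res['geoblocked'] = geoblocked
    return res


if __name__ == '__main__':
    print(summarize(extract_apollo_state(SAMPLE_PAGE)))
```

Because the loop visits every widget, a later widget can overwrite a field set by an earlier one; the sketch keeps that behaviour rather than guessing at a priority order.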


@ -4,10 +4,17 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_str from ..compat import (
compat_parse_qs,
compat_str,
compat_urllib_parse_urlparse,
)
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
find_xpath_attr,
get_element_by_attribute,
int_or_none, int_or_none,
NO_DEFAULT,
qualities, qualities,
try_get, try_get,
unified_strdate, unified_strdate,
@ -18,7 +25,59 @@ from ..utils import (
# add tests. # add tests.
class ArteTvIE(InfoExtractor):
_VALID_URL = r'https?://videos\.arte\.tv/(?P<lang>fr|de|en|es)/.*-(?P<id>.*?)\.html'
IE_NAME = 'arte.tv'
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
lang = mobj.group('lang')
video_id = mobj.group('id')
ref_xml_url = url.replace('/videos/', '/do_delegate/videos/')
ref_xml_url = ref_xml_url.replace('.html', ',view,asPlayerXml.xml')
ref_xml_doc = self._download_xml(
ref_xml_url, video_id, note='Downloading metadata')
config_node = find_xpath_attr(ref_xml_doc, './/video', 'lang', lang)
config_xml_url = config_node.attrib['ref']
config = self._download_xml(
config_xml_url, video_id, note='Downloading configuration')
formats = [{
'format_id': q.attrib['quality'],
# The playpath starts at 'mp4:', if we don't manually
# split the url, rtmpdump will incorrectly parse them
'url': q.text.split('mp4:', 1)[0],
'play_path': 'mp4:' + q.text.split('mp4:', 1)[1],
'ext': 'flv',
'quality': 2 if q.attrib['quality'] == 'hd' else 1,
} for q in config.findall('./urls/url')]
self._sort_formats(formats)
title = config.find('.//name').text
thumbnail = config.find('.//firstThumbnailUrl').text
return {
'id': video_id,
'title': title,
'thumbnail': thumbnail,
'formats': formats,
}
class ArteTVBaseIE(InfoExtractor): class ArteTVBaseIE(InfoExtractor):
@classmethod
def _extract_url_info(cls, url):
mobj = re.match(cls._VALID_URL, url)
lang = mobj.group('lang')
query = compat_parse_qs(compat_urllib_parse_urlparse(url).query)
if 'vid' in query:
video_id = query['vid'][0]
else:
# This is not a real id, it can be for example AJT for the news
# http://www.arte.tv/guide/fr/emissions/AJT/arte-journal
video_id = mobj.group('id')
return video_id, lang
def _extract_from_json_url(self, json_url, video_id, lang, title=None): def _extract_from_json_url(self, json_url, video_id, lang, title=None):
info = self._download_json(json_url, video_id) info = self._download_json(json_url, video_id)
player_info = info['videoJsonPlayer'] player_info = info['videoJsonPlayer']
@ -49,15 +108,13 @@ class ArteTVBaseIE(InfoExtractor):
'upload_date': unified_strdate(upload_date_str), 'upload_date': unified_strdate(upload_date_str),
'thumbnail': player_info.get('programImage') or player_info.get('VTU', {}).get('IUR'), 'thumbnail': player_info.get('programImage') or player_info.get('VTU', {}).get('IUR'),
} }
qfunc = qualities(['MQ', 'HQ', 'EQ', 'SQ']) qfunc = qualities(['HQ', 'MQ', 'EQ', 'SQ'])
LANGS = { LANGS = {
'fr': 'F', 'fr': 'F',
'de': 'A', 'de': 'A',
'en': 'E[ANG]', 'en': 'E[ANG]',
'es': 'E[ESP]', 'es': 'E[ESP]',
'it': 'E[ITA]',
'pl': 'E[POL]',
} }
langcode = LANGS.get(lang, lang) langcode = LANGS.get(lang, lang)
@ -69,8 +126,8 @@ class ArteTVBaseIE(InfoExtractor):
l = re.escape(langcode) l = re.escape(langcode)
# Language preference from most to least priority # Language preference from most to least priority
# Reference: section 6.8 of # Reference: section 5.6.3 of
# https://www.arte.tv/sites/en/corporate/files/complete-technical-guidelines-arte-geie-v1-07-1.pdf # http://www.arte.tv/sites/en/corporate/files/complete-technical-guidelines-arte-geie-v1-05.pdf
PREFERENCES = ( PREFERENCES = (
# original version in requested language, without subtitles # original version in requested language, without subtitles
r'VO{0}$'.format(l), r'VO{0}$'.format(l),
@ -136,59 +193,274 @@ class ArteTVBaseIE(InfoExtractor):
class ArteTVPlus7IE(ArteTVBaseIE): class ArteTVPlus7IE(ArteTVBaseIE):
IE_NAME = 'arte.tv:+7' IE_NAME = 'arte.tv:+7'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/(?P<lang>fr|de|en|es|it|pl)/videos/(?P<id>\d{6}-\d{3}-[AF])' _VALID_URL = r'https?://(?:(?:www|sites)\.)?arte\.tv/(?:[^/]+/)?(?P<lang>fr|de|en|es)/(?:videos/)?(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.arte.tv/en/videos/088501-000-A/mexico-stealing-petrol-to-survive/', 'url': 'http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D',
'only_matching': True,
}, {
'url': 'http://sites.arte.tv/karambolage/de/video/karambolage-22',
'only_matching': True,
}, {
'url': 'http://www.arte.tv/de/videos/048696-000-A/der-kluge-bauch-unser-zweites-gehirn',
'only_matching': True,
}]
@classmethod
def suitable(cls, url):
return False if ArteTVPlaylistIE.suitable(url) else super(ArteTVPlus7IE, cls).suitable(url)
def _real_extract(self, url):
video_id, lang = self._extract_url_info(url)
webpage = self._download_webpage(url, video_id)
return self._extract_from_webpage(webpage, video_id, lang)
def _extract_from_webpage(self, webpage, video_id, lang):
patterns_templates = (r'arte_vp_url=["\'](.*?%s.*?)["\']', r'data-url=["\']([^"]+%s[^"]+)["\']')
ids = (video_id, '')
# some pages contain multiple videos (like
# http://www.arte.tv/guide/de/sendungen/XEN/xenius/?vid=055918-015_PLUS7-D),
# so we first try to look for json URLs that contain the video id from
# the 'vid' parameter.
patterns = [t % re.escape(_id) for _id in ids for t in patterns_templates]
json_url = self._html_search_regex(
patterns, webpage, 'json vp url', default=None)
if not json_url:
def find_iframe_url(webpage, default=NO_DEFAULT):
return self._html_search_regex(
r'<iframe[^>]+src=(["\'])(?P<url>.+\bjson_url=.+?)\1',
webpage, 'iframe url', group='url', default=default)
iframe_url = find_iframe_url(webpage, None)
if not iframe_url:
embed_url = self._html_search_regex(
r'arte_vp_url_oembed=\'([^\']+?)\'', webpage, 'embed url', default=None)
if embed_url:
player = self._download_json(
embed_url, video_id, 'Downloading player page')
iframe_url = find_iframe_url(player['html'])
# en and es URLs produce react-based pages with different layout (e.g.
# http://www.arte.tv/guide/en/053330-002-A/carnival-italy?zone=world)
if not iframe_url:
program = self._search_regex(
r'program\s*:\s*({.+?["\']embed_html["\'].+?}),?\s*\n',
webpage, 'program', default=None)
if program:
embed_html = self._parse_json(program, video_id)
if embed_html:
iframe_url = find_iframe_url(embed_html['embed_html'])
if iframe_url:
json_url = compat_parse_qs(
compat_urllib_parse_urlparse(iframe_url).query)['json_url'][0]
if json_url:
title = self._search_regex(
r'<h3[^>]+title=(["\'])(?P<title>.+?)\1',
webpage, 'title', default=None, group='title')
return self._extract_from_json_url(json_url, video_id, lang, title=title)
# Different kind of embed URL (e.g.
# http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium)
entries = [
self.url_result(url)
for _, url in re.findall(r'<iframe[^>]+src=(["\'])(?P<url>.+?)\1', webpage)]
return self.playlist_result(entries)
# It also uses the arte_vp_url url from the webpage to extract the information
class ArteTVCreativeIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:creative'
_VALID_URL = r'https?://creative\.arte\.tv/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://creative.arte.tv/fr/episode/osmosis-episode-1',
'info_dict': { 'info_dict': {
'id': '088501-000-A', 'id': '057405-001-A',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Mexico: Stealing Petrol to Survive', 'title': 'OSMOSIS - N\'AYEZ PLUS PEUR D\'AIMER (1)',
'upload_date': '20190628', 'upload_date': '20150716',
},
}, {
'url': 'http://creative.arte.tv/fr/Monty-Python-Reunion',
'playlist_count': 11,
'add_ie': ['Youtube'],
}, {
'url': 'http://creative.arte.tv/de/episode/agentur-amateur-4-der-erste-kunde',
'only_matching': True,
}]
class ArteTVInfoIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:info'
_VALID_URL = r'https?://info\.arte\.tv/(?P<lang>fr|de|en|es)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://info.arte.tv/fr/service-civique-un-cache-misere',
'info_dict': {
'id': '067528-000-A',
'ext': 'mp4',
'title': 'Service civique, un cache misère ?',
'upload_date': '20160403',
}, },
}] }]
class ArteTVFutureIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:future'
_VALID_URL = r'https?://future\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://future.arte.tv/fr/info-sciences/les-ecrevisses-aussi-sont-anxieuses',
'info_dict': {
'id': '050940-028-A',
'ext': 'mp4',
'title': 'Les écrevisses aussi peuvent être anxieuses',
'upload_date': '20140902',
},
}, {
'url': 'http://future.arte.tv/fr/la-science-est-elle-responsable',
'only_matching': True,
}]
class ArteTVDDCIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:ddc'
_VALID_URL = r'https?://ddc\.arte\.tv/(?P<lang>emission|folge)/(?P<id>[^/?#&]+)'
_TESTS = []
def _real_extract(self, url): def _real_extract(self, url):
lang, video_id = re.match(self._VALID_URL, url).groups() video_id, lang = self._extract_url_info(url)
return self._extract_from_json_url( if lang == 'folge':
'https://api.arte.tv/api/player/v1/config/%s/%s' % (lang, video_id), lang = 'de'
video_id, lang) elif lang == 'emission':
lang = 'fr'
webpage = self._download_webpage(url, video_id)
scriptElement = get_element_by_attribute('class', 'visu_video_block', webpage)
script_url = self._html_search_regex(r'src="(.*?)"', scriptElement, 'script url')
javascriptPlayerGenerator = self._download_webpage(script_url, video_id, 'Download javascript player generator')
json_url = self._search_regex(r"json_url=(.*)&rendering_place.*", javascriptPlayerGenerator, 'json url')
return self._extract_from_json_url(json_url, video_id, lang)
class ArteTVConcertIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:concert'
_VALID_URL = r'https?://concert\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://concert.arte.tv/de/notwist-im-pariser-konzertclub-divan-du-monde',
'md5': '9ea035b7bd69696b67aa2ccaaa218161',
'info_dict': {
'id': '186',
'ext': 'mp4',
'title': 'The Notwist im Pariser Konzertclub "Divan du Monde"',
'upload_date': '20140128',
'description': 'md5:486eb08f991552ade77439fe6d82c305',
},
}]
class ArteTVCinemaIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:cinema'
_VALID_URL = r'https?://cinema\.arte\.tv/(?P<lang>fr|de|en|es)/(?P<id>.+)'
_TESTS = [{
'url': 'http://cinema.arte.tv/fr/article/les-ailes-du-desir-de-julia-reck',
'md5': 'a5b9dd5575a11d93daf0e3f404f45438',
'info_dict': {
'id': '062494-000-A',
'ext': 'mp4',
'title': 'Film lauréat du concours web - "Les ailes du désir" de Julia Reck',
'upload_date': '20150807',
},
}]
class ArteTVMagazineIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:magazine'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/magazine/[^/]+/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
# Embedded via <iframe src="http://www.arte.tv/arte_vp/index.php?json_url=..."
'url': 'http://www.arte.tv/magazine/trepalium/fr/entretien-avec-le-realisateur-vincent-lannoo-trepalium',
'md5': '2a9369bcccf847d1c741e51416299f25',
'info_dict': {
'id': '065965-000-A',
'ext': 'mp4',
'title': 'Trepalium - Extrait Ep.01',
'upload_date': '20160121',
},
}, {
# Embedded via <iframe src="http://www.arte.tv/guide/fr/embed/054813-004-A/medium"
'url': 'http://www.arte.tv/magazine/trepalium/fr/episode-0406-replay-trepalium',
'md5': 'fedc64fc7a946110fe311634e79782ca',
'info_dict': {
'id': '054813-004_PLUS7-F',
'ext': 'mp4',
'title': 'Trepalium (4/6)',
'description': 'md5:10057003c34d54e95350be4f9b05cb40',
'upload_date': '20160218',
},
}, {
'url': 'http://www.arte.tv/magazine/metropolis/de/frank-woeste-german-paris-metropolis',
'only_matching': True,
}]
class ArteTVEmbedIE(ArteTVPlus7IE): class ArteTVEmbedIE(ArteTVPlus7IE):
IE_NAME = 'arte.tv:embed' IE_NAME = 'arte.tv:embed'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https://www\.arte\.tv http://www\.arte\.tv
/player/v3/index\.php\?json_url= /(?:playerv2/embed|arte_vp/index)\.php\?json_url=
(?P<json_url> (?P<json_url>
https?://api\.arte\.tv/api/player/v1/config/ http://arte\.tv/papi/tvguide/videos/stream/player/
(?P<lang>[^/]+)/(?P<id>\d{6}-\d{3}-[AF]) (?P<lang>[^/]+)/(?P<id>[^/]+)[^&]*
) )
''' '''
_TESTS = [] _TESTS = []
def _real_extract(self, url): def _real_extract(self, url):
json_url, lang, video_id = re.match(self._VALID_URL, url).groups() mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id')
lang = mobj.group('lang')
json_url = mobj.group('json_url')
return self._extract_from_json_url(json_url, video_id, lang) return self._extract_from_json_url(json_url, video_id, lang)
class TheOperaPlatformIE(ArteTVPlus7IE):
IE_NAME = 'theoperaplatform'
_VALID_URL = r'https?://(?:www\.)?theoperaplatform\.eu/(?P<lang>fr|de|en|es)/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'http://www.theoperaplatform.eu/de/opera/verdi-otello',
'md5': '970655901fa2e82e04c00b955e9afe7b',
'info_dict': {
'id': '060338-009-A',
'ext': 'mp4',
'title': 'Verdi - OTELLO',
'upload_date': '20160927',
},
}]
class ArteTVPlaylistIE(ArteTVBaseIE): class ArteTVPlaylistIE(ArteTVBaseIE):
IE_NAME = 'arte.tv:playlist' IE_NAME = 'arte.tv:playlist'
_VALID_URL = r'https?://(?:www\.)?arte\.tv/(?P<lang>fr|de|en|es|it|pl)/videos/(?P<id>RC-\d{6})' _VALID_URL = r'https?://(?:www\.)?arte\.tv/guide/(?P<lang>fr|de|en|es)/[^#]*#collection/(?P<id>PL-\d+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.arte.tv/en/videos/RC-016954/earn-a-living/', 'url': 'http://www.arte.tv/guide/de/plus7/?country=DE#collection/PL-013263/ARTETV',
'info_dict': { 'info_dict': {
'id': 'RC-016954', 'id': 'PL-013263',
'title': 'Earn a Living', 'title': 'Areva & Uramin',
'description': 'md5:d322c55011514b3a7241f7fb80d494c2', 'description': 'md5:a1dc0312ce357c262259139cfd48c9bf',
}, },
'playlist_mincount': 6, 'playlist_mincount': 6,
}, {
'url': 'http://www.arte.tv/guide/de/playlists?country=DE#collection/PL-013190/ARTETV',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
lang, playlist_id = re.match(self._VALID_URL, url).groups() playlist_id, lang = self._extract_url_info(url)
collection = self._download_json( collection = self._download_json(
'https://api.arte.tv/api/player/v1/collectionData/%s/%s?source=videos' 'https://api.arte.tv/api/player/v1/collectionData/%s/%s?source=videos'
% (lang, playlist_id), playlist_id) % (lang, playlist_id), playlist_id)


@ -5,12 +5,14 @@ import re
from .common import InfoExtractor from .common import InfoExtractor
from .kaltura import KalturaIE from .kaltura import KalturaIE
from ..utils import extract_attributes from ..utils import (
extract_attributes,
remove_end,
)
class AsianCrushIE(InfoExtractor): class AsianCrushIE(InfoExtractor):
_VALID_URL_BASE = r'https?://(?:www\.)?(?P<host>(?:(?:asiancrush|yuyutv|midnightpulp)\.com|cocoro\.tv))' _VALID_URL = r'https?://(?:www\.)?asiancrush\.com/video/(?:[^/]+/)?0+(?P<id>\d+)v\b'
_VALID_URL = r'%s/video/(?:[^/]+/)?0+(?P<id>\d+)v\b' % _VALID_URL_BASE
_TESTS = [{ _TESTS = [{
'url': 'https://www.asiancrush.com/video/012869v/women-who-flirt/', 'url': 'https://www.asiancrush.com/video/012869v/women-who-flirt/',
'md5': 'c3b740e48d0ba002a42c0b72857beae6', 'md5': 'c3b740e48d0ba002a42c0b72857beae6',
@ -18,7 +20,7 @@ class AsianCrushIE(InfoExtractor):
'id': '1_y4tmjm5r', 'id': '1_y4tmjm5r',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Women Who Flirt', 'title': 'Women Who Flirt',
'description': 'md5:7e986615808bcfb11756eb503a751487', 'description': 'md5:3db14e9186197857e7063522cb89a805',
'timestamp': 1496936429, 'timestamp': 1496936429,
'upload_date': '20170608', 'upload_date': '20170608',
'uploader_id': 'craig@crifkin.com', 'uploader_id': 'craig@crifkin.com',
@ -26,27 +28,10 @@ class AsianCrushIE(InfoExtractor):
}, { }, {
'url': 'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/', 'url': 'https://www.asiancrush.com/video/she-was-pretty/011886v-pretty-episode-3/',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.yuyutv.com/video/013886v/the-act-of-killing/',
'only_matching': True,
}, {
'url': 'https://www.yuyutv.com/video/peep-show/013922v-warring-factions/',
'only_matching': True,
}, {
'url': 'https://www.midnightpulp.com/video/010400v/drifters/',
'only_matching': True,
}, {
'url': 'https://www.midnightpulp.com/video/mononoke/016378v-zashikiwarashi-part-1/',
'only_matching': True,
}, {
'url': 'https://www.cocoro.tv/video/the-wonderful-wizard-of-oz/008878v-the-wonderful-wizard-of-oz-ep01/',
'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) video_id = self._match_id(url)
host = mobj.group('host')
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
@ -66,7 +51,7 @@ class AsianCrushIE(InfoExtractor):
r'\bentry_id["\']\s*:\s*["\'](\d+)', webpage, 'entry id') r'\bentry_id["\']\s*:\s*["\'](\d+)', webpage, 'entry id')
player = self._download_webpage( player = self._download_webpage(
'https://api.%s/embeddedVideoPlayer' % host, video_id, 'https://api.asiancrush.com/embeddedVideoPlayer', video_id,
query={'id': entry_id}) query={'id': entry_id})
kaltura_id = self._search_regex( kaltura_id = self._search_regex(
@ -78,23 +63,15 @@ class AsianCrushIE(InfoExtractor):
r'/p(?:artner_id)?/(\d+)', player, 'partner id', r'/p(?:artner_id)?/(\d+)', player, 'partner id',
default='513551') default='513551')
description = self._html_search_regex( return self.url_result(
r'(?s)<div[^>]+\bclass=["\']description["\'][^>]*>(.+?)</div>', 'kaltura:%s:%s' % (partner_id, kaltura_id),
webpage, 'description', fatal=False) ie=KalturaIE.ie_key(), video_id=kaltura_id,
video_title=title)
return {
'_type': 'url_transparent',
'url': 'kaltura:%s:%s' % (partner_id, kaltura_id),
'ie_key': KalturaIE.ie_key(),
'id': video_id,
'title': title,
'description': description,
}
class AsianCrushPlaylistIE(InfoExtractor): class AsianCrushPlaylistIE(InfoExtractor):
_VALID_URL = r'%s/series/0+(?P<id>\d+)s\b' % AsianCrushIE._VALID_URL_BASE _VALID_URL = r'https?://(?:www\.)?asiancrush\.com/series/0+(?P<id>\d+)s\b'
_TESTS = [{ _TEST = {
'url': 'https://www.asiancrush.com/series/012481s/scholar-walks-night/', 'url': 'https://www.asiancrush.com/series/012481s/scholar-walks-night/',
'info_dict': { 'info_dict': {
'id': '12481', 'id': '12481',
@ -102,16 +79,7 @@ class AsianCrushPlaylistIE(InfoExtractor):
'description': 'md5:7addd7c5132a09fd4741152d96cce886', 'description': 'md5:7addd7c5132a09fd4741152d96cce886',
}, },
'playlist_count': 20, 'playlist_count': 20,
}, { }
'url': 'https://www.yuyutv.com/series/013920s/peep-show/',
'only_matching': True,
}, {
'url': 'https://www.midnightpulp.com/series/016375s/mononoke/',
'only_matching': True,
}, {
'url': 'https://www.cocoro.tv/series/008549s/the-wonderful-wizard-of-oz/',
'only_matching': True,
}]
def _real_extract(self, url): def _real_extract(self, url):
playlist_id = self._match_id(url) playlist_id = self._match_id(url)
@ -128,15 +96,15 @@ class AsianCrushPlaylistIE(InfoExtractor):
entries.append(self.url_result( entries.append(self.url_result(
mobj.group('url'), ie=AsianCrushIE.ie_key())) mobj.group('url'), ie=AsianCrushIE.ie_key()))
title = self._html_search_regex( title = remove_end(
r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage, self._html_search_regex(
'title', default=None) or self._og_search_title( r'(?s)<h1\b[^>]\bid=["\']movieTitle[^>]+>(.+?)</h1>', webpage,
webpage, default=None) or self._html_search_meta( 'title', default=None) or self._og_search_title(
'twitter:title', webpage, 'title', webpage, default=None) or self._html_search_meta(
default=None) or self._search_regex( 'twitter:title', webpage, 'title',
r'<title>([^<]+)</title>', webpage, 'title', fatal=False) default=None) or self._search_regex(
if title: r'<title>([^<]+)</title>', webpage, 'title', fatal=False),
title = re.sub(r'\s*\|\s*.+?$', '', title) ' | AsianCrush')
description = self._og_search_description( description = self._og_search_description(
webpage, default=None) or self._html_search_meta( webpage, default=None) or self._html_search_meta(


@ -1,118 +1,202 @@
# coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import time
import hmac
import hashlib
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_HTTPError from ..compat import compat_str
from ..utils import ( from ..utils import (
ExtractorError, ExtractorError,
float_or_none,
int_or_none, int_or_none,
sanitized_Request,
urlencode_postdata, urlencode_postdata,
xpath_text,
) )
class AtresPlayerIE(InfoExtractor): class AtresPlayerIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?atresplayer\.com/[^/]+/[^/]+/[^/]+/[^/]+/(?P<display_id>.+?)_(?P<id>[0-9a-f]{24})' _VALID_URL = r'https?://(?:www\.)?atresplayer\.com/television/[^/]+/[^/]+/[^/]+/(?P<id>.+?)_\d+\.html'
_NETRC_MACHINE = 'atresplayer' _NETRC_MACHINE = 'atresplayer'
_TESTS = [ _TESTS = [
{ {
'url': 'https://www.atresplayer.com/antena3/series/pequenas-coincidencias/temporada-1/capitulo-7-asuntos-pendientes_5d4aa2c57ed1a88fc715a615/', 'url': 'http://www.atresplayer.com/television/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_2014122100174.html',
'md5': 'efd56753cda1bb64df52a3074f62e38a',
'info_dict': { 'info_dict': {
'id': '5d4aa2c57ed1a88fc715a615', 'id': 'capitulo-10-especial-solidario-nochebuena',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Capítulo 7: Asuntos pendientes', 'title': 'Especial Solidario de Nochebuena',
'description': 'md5:7634cdcb4d50d5381bedf93efb537fbc', 'description': 'md5:e2d52ff12214fa937107d21064075bf1',
'duration': 3413, 'duration': 5527.6,
}, 'thumbnail': r're:^https?://.*\.jpg$',
'params': {
'format': 'bestvideo',
}, },
'skip': 'This video is only available for registered users' 'skip': 'This video is only available for registered users'
}, },
{ {
'url': 'https://www.atresplayer.com/lasexta/programas/el-club-de-la-comedia/temporada-4/capitulo-10-especial-solidario-nochebuena_5ad08edf986b2855ed47adc4/', 'url': 'http://www.atresplayer.com/television/especial/videoencuentros/temporada-1/capitulo-112-david-bustamante_2014121600375.html',
'only_matching': True, 'md5': '6e52cbb513c405e403dbacb7aacf8747',
'info_dict': {
'id': 'capitulo-112-david-bustamante',
'ext': 'flv',
'title': 'David Bustamante',
'description': 'md5:f33f1c0a05be57f6708d4dd83a3b81c6',
'duration': 1439.0,
'thumbnail': r're:^https?://.*\.jpg$',
},
}, },
{ {
'url': 'https://www.atresplayer.com/antena3/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_5ad51046986b2886722ccdea/', 'url': 'http://www.atresplayer.com/television/series/el-secreto-de-puente-viejo/el-chico-de-los-tres-lunares/capitulo-977-29-12-14_2014122400174.html',
'only_matching': True, 'only_matching': True,
}, },
] ]
_API_BASE = 'https://api.atresplayer.com/'
_USER_AGENT = 'Dalvik/1.6.0 (Linux; U; Android 4.3; GT-I9300 Build/JSS15J'
_MAGIC = 'QWtMLXs414Yo+c#_+Q#K@NN)'
_TIMESTAMP_SHIFT = 30000
_TIME_API_URL = 'http://servicios.atresplayer.com/api/admin/time.json'
_URL_VIDEO_TEMPLATE = 'https://servicios.atresplayer.com/api/urlVideo/{1}/{0}/{1}|{2}|{3}.json'
_PLAYER_URL_TEMPLATE = 'https://servicios.atresplayer.com/episode/getplayer.json?episodePk=%s'
_EPISODE_URL_TEMPLATE = 'http://www.atresplayer.com/episodexml/%s'
_LOGIN_URL = 'https://servicios.atresplayer.com/j_spring_security_check'
_ERRORS = {
'UNPUBLISHED': 'We\'re sorry, but this video is not yet available.',
'DELETED': 'This video has expired and is no longer available for online streaming.',
'GEOUNPUBLISHED': 'We\'re sorry, but this video is not available in your region due to right restrictions.',
# 'PREMIUM': 'PREMIUM',
}
def _real_initialize(self): def _real_initialize(self):
self._login() self._login()
def _handle_error(self, e, code):
if isinstance(e.cause, compat_HTTPError) and e.cause.code == code:
error = self._parse_json(e.cause.read(), None)
if error.get('error') == 'required_registered':
self.raise_login_required()
raise ExtractorError(error['error_description'], expected=True)
raise
def _login(self): def _login(self):
username, password = self._get_login_info() username, password = self._get_login_info()
if username is None: if username is None:
return return
self._request_webpage( login_form = {
self._API_BASE + 'login', None, 'Downloading login page') 'j_username': username,
'j_password': password,
}
try: request = sanitized_Request(
target_url = self._download_json( self._LOGIN_URL, urlencode_postdata(login_form))
'https://account.atresmedia.com/api/login', None, request.add_header('Content-Type', 'application/x-www-form-urlencoded')
'Logging in', headers={ response = self._download_webpage(
'Content-Type': 'application/x-www-form-urlencoded' request, None, 'Logging in')
}, data=urlencode_postdata({
'username': username,
'password': password,
}))['targetUrl']
except ExtractorError as e:
self._handle_error(e, 400)
self._request_webpage(target_url, None, 'Following Target URL') error = self._html_search_regex(
r'(?s)<ul[^>]+class="[^"]*\blist_error\b[^"]*">(.+?)</ul>',
response, 'error', default=None)
if error:
raise ExtractorError(
'Unable to login: %s' % error, expected=True)
def _real_extract(self, url): def _real_extract(self, url):
display_id, video_id = re.match(self._VALID_URL, url).groups() video_id = self._match_id(url)
try: webpage = self._download_webpage(url, video_id)
episode = self._download_json(
self._API_BASE + 'client/v1/player/episode/' + video_id, video_id)
except ExtractorError as e:
self._handle_error(e, 403)
title = episode['titulo'] episode_id = self._search_regex(
r'episode="([^"]+)"', webpage, 'episode id')
request = sanitized_Request(
self._PLAYER_URL_TEMPLATE % episode_id,
headers={'User-Agent': self._USER_AGENT})
player = self._download_json(request, episode_id, 'Downloading player JSON')
episode_type = player.get('typeOfEpisode')
error_message = self._ERRORS.get(episode_type)
if error_message:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error_message), expected=True)
formats = [] formats = []
for source in episode.get('sources', []): video_url = player.get('urlVideo')
src = source.get('src') if video_url:
if not src: format_info = {
'url': video_url,
'format_id': 'http',
}
mobj = re.search(r'(?P<bitrate>\d+)K_(?P<width>\d+)x(?P<height>\d+)', video_url)
if mobj:
format_info.update({
'width': int_or_none(mobj.group('width')),
'height': int_or_none(mobj.group('height')),
'tbr': int_or_none(mobj.group('bitrate')),
})
formats.append(format_info)
timestamp = int_or_none(self._download_webpage(
self._TIME_API_URL,
video_id, 'Downloading timestamp', fatal=False), 1000, time.time())
timestamp_shifted = compat_str(timestamp + self._TIMESTAMP_SHIFT)
token = hmac.new(
self._MAGIC.encode('ascii'),
(episode_id + timestamp_shifted).encode('utf-8'), hashlib.md5
).hexdigest()
request = sanitized_Request(
self._URL_VIDEO_TEMPLATE.format('windows', episode_id, timestamp_shifted, token),
headers={'User-Agent': self._USER_AGENT})
fmt_json = self._download_json(
request, video_id, 'Downloading windows video JSON')
result = fmt_json.get('resultDes')
if result.lower() != 'ok':
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, result), expected=True)
for format_id, video_url in fmt_json['resultObject'].items():
if format_id == 'token' or not video_url.startswith('http'):
continue continue
src_type = source.get('type') if 'geodeswowsmpra3player' in video_url:
if src_type == 'application/vnd.apple.mpegurl': # f4m_path = video_url.split('smil:', 1)[-1].split('free_', 1)[0]
formats.extend(self._extract_m3u8_formats( # f4m_url = 'http://drg.antena3.com/{0}hds/es/sd.f4m'.format(f4m_path)
src, video_id, 'mp4', 'm3u8_native', # this videos are protected by DRM, the f4m downloader doesn't support them
m3u8_id='hls', fatal=False)) continue
elif src_type == 'application/dash+xml': video_url_hd = video_url.replace('free_es', 'es')
formats.extend(self._extract_mpd_formats( formats.extend(self._extract_f4m_formats(
src, video_id, mpd_id='dash', fatal=False)) video_url_hd[:-9] + '/manifest.f4m', video_id, f4m_id='hds',
fatal=False))
formats.extend(self._extract_mpd_formats(
video_url_hd[:-9] + '/manifest.mpd', video_id, mpd_id='dash',
fatal=False))
self._sort_formats(formats) self._sort_formats(formats)
heartbeat = episode.get('heartbeat') or {} path_data = player.get('pathData')
omniture = episode.get('omniture') or {}
get_meta = lambda x: heartbeat.get(x) or omniture.get(x) episode = self._download_xml(
self._EPISODE_URL_TEMPLATE % path_data, video_id,
'Downloading episode XML')
duration = float_or_none(xpath_text(
episode, './media/asset/info/technical/contentDuration', 'duration'))
art = episode.find('./media/asset/info/art')
title = xpath_text(art, './name', 'title')
description = xpath_text(art, './description', 'description')
thumbnail = xpath_text(episode, './media/asset/files/background', 'thumbnail')
subtitles = {}
subtitle_url = xpath_text(episode, './media/asset/files/subtitle', 'subtitle')
if subtitle_url:
subtitles['es'] = [{
'ext': 'srt',
'url': subtitle_url,
}]
return { return {
'display_id': display_id,
'id': video_id, 'id': video_id,
'title': title, 'title': title,
'description': episode.get('descripcion'), 'description': description,
'thumbnail': episode.get('imgPoster'), 'thumbnail': thumbnail,
'duration': int_or_none(episode.get('duration')), 'duration': duration,
'formats': formats, 'formats': formats,
'channel': get_meta('channel'), 'subtitles': subtitles,
'season': get_meta('season'),
'episode_number': int_or_none(get_meta('episodeNumber')),
} }

View File

@ -2,25 +2,22 @@
from __future__ import unicode_literals from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import float_or_none
clean_html,
float_or_none,
)
class AudioBoomIE(InfoExtractor): class AudioBoomIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)' _VALID_URL = r'https?://(?:www\.)?audioboom\.com/(?:boos|posts)/(?P<id>[0-9]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://audioboom.com/posts/7398103-asim-chaudhry', 'url': 'https://audioboom.com/boos/4279833-3-09-2016-czaban-hour-3?t=0',
'md5': '7b00192e593ff227e6a315486979a42d', 'md5': '63a8d73a055c6ed0f1e51921a10a5a76',
'info_dict': { 'info_dict': {
'id': '7398103', 'id': '4279833',
'ext': 'mp3', 'ext': 'mp3',
'title': 'Asim Chaudhry', 'title': '3/09/2016 Czaban Hour 3',
'description': 'md5:2f3fef17dacc2595b5362e1d7d3602fc', 'description': 'Guest: Nate Davis - NFL free agency, Guest: Stan Gans',
'duration': 4000.99, 'duration': 2245.72,
'uploader': 'Sue Perkins: An hour or so with...', 'uploader': 'SB Nation A.M.',
'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/perkins', 'uploader_url': r're:https?://(?:www\.)?audioboom\.com/channel/steveczabanyahoosportsradio',
} }
}, { }, {
'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0', 'url': 'https://audioboom.com/posts/4279833-3-09-2016-czaban-hour-3?t=0',
@ -35,8 +32,8 @@ class AudioBoomIE(InfoExtractor):
clip = None clip = None
clip_store = self._parse_json( clip_store = self._parse_json(
self._html_search_regex( self._search_regex(
r'data-new-clip-store=(["\'])(?P<json>{.+?})\1', r'data-new-clip-store=(["\'])(?P<json>{.*?"clipId"\s*:\s*%s.*?})\1' % video_id,
webpage, 'clip store', default='{}', group='json'), webpage, 'clip store', default='{}', group='json'),
video_id, fatal=False) video_id, fatal=False)
if clip_store: if clip_store:
@ -50,15 +47,14 @@ class AudioBoomIE(InfoExtractor):
audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property( audio_url = from_clip('clipURLPriorToLoading') or self._og_search_property(
'audio', webpage, 'audio url') 'audio', webpage, 'audio url')
title = from_clip('title') or self._html_search_meta( title = from_clip('title') or self._og_search_title(webpage)
['og:title', 'og:audio:title', 'audio_title'], webpage) description = from_clip('description') or self._og_search_description(webpage)
description = from_clip('description') or clean_html(from_clip('formattedDescription')) or self._og_search_description(webpage)
duration = float_or_none(from_clip('duration') or self._html_search_meta( duration = float_or_none(from_clip('duration') or self._html_search_meta(
'weibo:audio:duration', webpage)) 'weibo:audio:duration', webpage))
uploader = from_clip('author') or self._html_search_meta( uploader = from_clip('author') or self._og_search_property(
['og:audio:artist', 'twitter:audio:artist_name', 'audio_artist'], webpage, 'uploader') 'audio:artist', webpage, 'uploader', fatal=False)
uploader_url = from_clip('author_url') or self._html_search_meta( uploader_url = from_clip('author_url') or self._html_search_meta(
'audioboo:channel', webpage, 'uploader url') 'audioboo:channel', webpage, 'uploader url')

View File

@ -47,19 +47,39 @@ class AZMedienIE(InfoExtractor):
'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1', 'url': 'https://www.telebaern.tv/telebaern-news/montag-1-oktober-2018-ganze-sendung-133531189#video=0_7xjo9lf1',
'only_matching': True 'only_matching': True
}] }]
_API_TEMPL = 'https://www.%s/api/pub/gql/%s/NewsArticleTeaser/cb9f2f81ed22e9b47f4ca64ea3cc5a5d13e88d1d'
_PARTNER_ID = '1719221' _PARTNER_ID = '1719221'
def _real_extract(self, url): def _real_extract(self, url):
host, display_id, article_id, entry_id = re.match(self._VALID_URL, url).groups() mobj = re.match(self._VALID_URL, url)
host = mobj.group('host')
video_id = mobj.group('id')
entry_id = mobj.group('kaltura_id')
if not entry_id: if not entry_id:
entry_id = self._download_json( api_url = 'https://www.%s/api/pub/gql/%s' % (host, host.split('.')[0])
self._API_TEMPL % (host, host.split('.')[0]), display_id, query={ payload = {
'variables': json.dumps({ 'query': '''query VideoContext($articleId: ID!) {
'contextId': 'NewsArticle:' + article_id, article: node(id: $articleId) {
}), ... on Article {
})['data']['context']['mainAsset']['video']['kaltura']['kalturaId'] mainAssetRelation {
asset {
... on VideoAsset {
kalturaId
}
}
}
}
}
}''',
'variables': {'articleId': 'Article:%s' % mobj.group('article_id')},
}
json_data = self._download_json(
api_url, video_id, headers={
'Content-Type': 'application/json',
},
data=json.dumps(payload).encode())
entry_id = json_data['data']['article']['mainAssetRelation']['asset']['kalturaId']
return self.url_result( return self.url_result(
'kaltura:%s:%s' % (self._PARTNER_ID, entry_id), 'kaltura:%s:%s' % (self._PARTNER_ID, entry_id),

View File

@ -0,0 +1,142 @@
from __future__ import unicode_literals
import re
import itertools
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
float_or_none,
int_or_none,
sanitized_Request,
urlencode_postdata,
)
class BambuserIE(InfoExtractor):
IE_NAME = 'bambuser'
_VALID_URL = r'https?://bambuser\.com/v/(?P<id>\d+)'
_API_KEY = '005f64509e19a868399060af746a00aa'
_LOGIN_URL = 'https://bambuser.com/user'
_NETRC_MACHINE = 'bambuser'
_TEST = {
'url': 'http://bambuser.com/v/4050584',
# MD5 seems to be flaky, see https://travis-ci.org/ytdl-org/youtube-dl/jobs/14051016#L388
# 'md5': 'fba8f7693e48fd4e8641b3fd5539a641',
'info_dict': {
'id': '4050584',
'ext': 'flv',
'title': 'Education engineering days - lightning talks',
'duration': 3741,
'uploader': 'pixelversity',
'uploader_id': '344706',
'timestamp': 1382976692,
'upload_date': '20131028',
'view_count': int,
},
'params': {
# The server doesn't respect the 'Range' header, so the whole video would be downloaded,
# which caused the travis builds to fail: https://travis-ci.org/ytdl-org/youtube-dl/jobs/14493845#L59
'skip_download': True,
},
}
def _login(self):
username, password = self._get_login_info()
if username is None:
return
login_form = {
'form_id': 'user_login',
'op': 'Log in',
'name': username,
'pass': password,
}
request = sanitized_Request(
self._LOGIN_URL, urlencode_postdata(login_form))
request.add_header('Referer', self._LOGIN_URL)
response = self._download_webpage(
request, None, 'Logging in')
login_error = self._html_search_regex(
r'(?s)<div class="messages error">(.+?)</div>',
response, 'login error', default=None)
if login_error:
raise ExtractorError(
'Unable to login: %s' % login_error, expected=True)
def _real_initialize(self):
self._login()
def _real_extract(self, url):
video_id = self._match_id(url)
info = self._download_json(
'http://player-c.api.bambuser.com/getVideo.json?api_key=%s&vid=%s'
% (self._API_KEY, video_id), video_id)
error = info.get('error')
if error:
raise ExtractorError(
'%s returned error: %s' % (self.IE_NAME, error), expected=True)
result = info['result']
return {
'id': video_id,
'title': result['title'],
'url': result['url'],
'thumbnail': result.get('preview'),
'duration': int_or_none(result.get('length')),
'uploader': result.get('username'),
'uploader_id': compat_str(result.get('owner', {}).get('uid')),
'timestamp': int_or_none(result.get('created')),
'fps': float_or_none(result.get('framerate')),
'view_count': int_or_none(result.get('views_total')),
'comment_count': int_or_none(result.get('comment_count')),
}
class BambuserChannelIE(InfoExtractor):
IE_NAME = 'bambuser:channel'
_VALID_URL = r'https?://bambuser\.com/channel/(?P<user>.*?)(?:/|#|\?|$)'
# The maximum number we can get with each request
_STEP = 50
_TEST = {
'url': 'http://bambuser.com/channel/pixelversity',
'info_dict': {
'title': 'pixelversity',
},
'playlist_mincount': 60,
}
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
user = mobj.group('user')
urls = []
last_id = ''
for i in itertools.count(1):
req_url = (
'http://bambuser.com/xhr-api/index.php?username={user}'
'&sort=created&access_mode=0%2C1%2C2&limit={count}'
'&method=broadcast&format=json&vid_older_than={last}'
).format(user=user, count=self._STEP, last=last_id)
req = sanitized_Request(req_url)
# Without setting this header, we wouldn't get any result
req.add_header('Referer', 'http://bambuser.com/channel/%s' % user)
data = self._download_json(
req, user, 'Downloading page %d' % i)
results = data['result']
if not results:
break
last_id = results[-1]['vid']
urls.extend(self.url_result(v['page'], 'Bambuser') for v in results)
return {
'_type': 'playlist',
'title': user,
'entries': urls,
}
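
The channel extractor above pages through broadcasts by passing the last seen `vid` back as `vid_older_than` and stops when a page comes back empty. The following is a minimal sketch of that cursor loop only. `fetch_page` is a hypothetical in-memory stand-in for the HTTP request, and the integer ids are made-up sample data, not real API output.

```python
import itertools


def fetch_page(all_vids, older_than, limit):
    """Hypothetical stand-in for the xhr-api request: return up to `limit`
    items, newest first, whose vid is smaller than `older_than`."""
    candidates = [v for v in all_vids if older_than is None or v < older_than]
    return [{'vid': v} for v in sorted(candidates, reverse=True)[:limit]]


def collect_vids(all_vids, step=50):
    """Cursor pagination: keep requesting until an empty page is returned."""
    collected = []
    last_id = None  # the extractor starts with an empty string instead
    for _ in itertools.count(1):
        results = fetch_page(all_vids, last_id, step)
        if not results:
            break
        # The last item of the page becomes the cursor for the next request.
        last_id = results[-1]['vid']
        collected.extend(item['vid'] for item in results)
    return collected
```

Because the loop ends only on an empty page, a channel whose size is an exact multiple of the step costs one extra request, which matches the behaviour of the extractor code shown in the diff.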

View File

@ -40,7 +40,6 @@ class BBCCoUkIE(InfoExtractor):
iplayer(?:/[^/]+)?/(?:episode/|playlist/)| iplayer(?:/[^/]+)?/(?:episode/|playlist/)|
music/(?:clips|audiovideo/popular)[/#]| music/(?:clips|audiovideo/popular)[/#]|
radio/player/| radio/player/|
sounds/play/|
events/[^/]+/play/[^/]+/ events/[^/]+/play/[^/]+/
) )
(?P<id>%s)(?!/(?:episodes|broadcasts|clips)) (?P<id>%s)(?!/(?:episodes|broadcasts|clips))
@ -71,7 +70,7 @@ class BBCCoUkIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': 'b039d07m', 'id': 'b039d07m',
'ext': 'flv', 'ext': 'flv',
'title': 'Kaleidoscope, Leonard Cohen', 'title': 'Leonard Cohen, Kaleidoscope - BBC Radio 4',
'description': 'The Canadian poet and songwriter reflects on his musical career.', 'description': 'The Canadian poet and songwriter reflects on his musical career.',
}, },
'params': { 'params': {
@ -221,20 +220,6 @@ class BBCCoUkIE(InfoExtractor):
# rtmp download # rtmp download
'skip_download': True, 'skip_download': True,
}, },
}, {
'url': 'https://www.bbc.co.uk/sounds/play/m0007jzb',
'note': 'Audio',
'info_dict': {
'id': 'm0007jz9',
'ext': 'mp4',
'title': 'BBC Proms, 2019, Prom 34: WestEastern Divan Orchestra',
'description': "Live BBC Proms. WestEastern Divan Orchestra with Daniel Barenboim and Martha Argerich.",
'duration': 9840,
},
'params': {
# rtmp download
'skip_download': True,
}
}, { }, {
'url': 'http://www.bbc.co.uk/iplayer/playlist/p01dvks4', 'url': 'http://www.bbc.co.uk/iplayer/playlist/p01dvks4',
'only_matching': True, 'only_matching': True,
@ -528,7 +513,7 @@ class BBCCoUkIE(InfoExtractor):
def get_programme_id(item): def get_programme_id(item):
def get_from_attributes(item): def get_from_attributes(item):
for p in ('identifier', 'group'): for p in('identifier', 'group'):
value = item.get(p) value = item.get(p)
if value and re.match(r'^[pb][\da-z]{7}$', value): if value and re.match(r'^[pb][\da-z]{7}$', value):
return value return value
@ -624,7 +609,7 @@ class BBCIE(BBCCoUkIE):
'url': 'http://www.bbc.com/news/world-europe-32668511', 'url': 'http://www.bbc.com/news/world-europe-32668511',
'info_dict': { 'info_dict': {
'id': 'world-europe-32668511', 'id': 'world-europe-32668511',
'title': 'Russia stages massive WW2 parade', 'title': 'Russia stages massive WW2 parade despite Western boycott',
'description': 'md5:00ff61976f6081841f759a08bf78cc9c', 'description': 'md5:00ff61976f6081841f759a08bf78cc9c',
}, },
'playlist_count': 2, 'playlist_count': 2,

View File

@ -99,8 +99,8 @@ class BeamProLiveIE(BeamProBaseIE):
class BeamProVodIE(BeamProBaseIE): class BeamProVodIE(BeamProBaseIE):
IE_NAME = 'Mixer:vod' IE_NAME = 'Mixer:vod'
_VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/[^/?#&]+\?.*?\bvod=(?P<id>[^?#&]+)' _VALID_URL = r'https?://(?:\w+\.)?(?:beam\.pro|mixer\.com)/[^/?#&]+\?.*?\bvod=(?P<id>\d+)'
_TESTS = [{ _TEST = {
'url': 'https://mixer.com/willow8714?vod=2259830', 'url': 'https://mixer.com/willow8714?vod=2259830',
'md5': 'b2431e6e8347dc92ebafb565d368b76b', 'md5': 'b2431e6e8347dc92ebafb565d368b76b',
'info_dict': { 'info_dict': {
@ -119,13 +119,7 @@ class BeamProVodIE(BeamProBaseIE):
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
}, { }
'url': 'https://mixer.com/streamer?vod=IxFno1rqC0S_XJ1a2yGgNw',
'only_matching': True,
}, {
'url': 'https://mixer.com/streamer?vod=Rh3LY0VAqkGpEQUe2pN-ig',
'only_matching': True,
}]
@staticmethod @staticmethod
def _extract_format(vod, vod_type): def _extract_format(vod, vod_type):

View File

@ -1,10 +1,7 @@
from __future__ import unicode_literals from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import compat_str
compat_str,
compat_urlparse,
)
from ..utils import ( from ..utils import (
int_or_none, int_or_none,
unified_timestamp, unified_timestamp,
@ -14,7 +11,6 @@ from ..utils import (
class BeegIE(InfoExtractor): class BeegIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?beeg\.(?:com|porn(?:/video)?)/(?P<id>\d+)' _VALID_URL = r'https?://(?:www\.)?beeg\.(?:com|porn(?:/video)?)/(?P<id>\d+)'
_TESTS = [{ _TESTS = [{
# api/v6 v1
'url': 'http://beeg.com/5416503', 'url': 'http://beeg.com/5416503',
'md5': 'a1a1b1a8bc70a89e49ccfd113aed0820', 'md5': 'a1a1b1a8bc70a89e49ccfd113aed0820',
'info_dict': { 'info_dict': {
@ -28,14 +24,6 @@ class BeegIE(InfoExtractor):
'tags': list, 'tags': list,
'age_limit': 18, 'age_limit': 18,
} }
}, {
# api/v6 v2
'url': 'https://beeg.com/1941093077?t=911-1391',
'only_matching': True,
}, {
# api/v6 v2 w/o t
'url': 'https://beeg.com/1277207756',
'only_matching': True,
}, { }, {
'url': 'https://beeg.porn/video/5416503', 'url': 'https://beeg.porn/video/5416503',
'only_matching': True, 'only_matching': True,
@ -53,25 +41,11 @@ class BeegIE(InfoExtractor):
r'beeg_version\s*=\s*([\da-zA-Z_-]+)', webpage, 'beeg version', r'beeg_version\s*=\s*([\da-zA-Z_-]+)', webpage, 'beeg version',
default='1546225636701') default='1546225636701')
if len(video_id) >= 10:
query = {
'v': 2,
}
qs = compat_urlparse.parse_qs(compat_urlparse.urlparse(url).query)
t = qs.get('t', [''])[0].split('-')
if len(t) > 1:
query.update({
's': t[0],
'e': t[1],
})
else:
query = {'v': 1}
for api_path in ('', 'api.'): for api_path in ('', 'api.'):
video = self._download_json( video = self._download_json(
'https://%sbeeg.com/api/v6/%s/video/%s' 'https://%sbeeg.com/api/v6/%s/video/%s'
% (api_path, beeg_version, video_id), video_id, % (api_path, beeg_version, video_id), video_id,
fatal=api_path == 'api.', query=query) fatal=api_path == 'api.')
if video: if video:
break break
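
The branch that appears only on the left-hand side of this hunk chooses the API query from the length of the video id and an optional `t=start-end` URL parameter. A minimal sketch of that selection follows, assuming Python 3's `urllib.parse` in place of the project's `compat_urlparse` wrapper. `build_query` is a hypothetical helper name, not part of the extractor.

```python
from urllib.parse import parse_qs, urlparse


def build_query(url, video_id):
    """Ids of 10 or more characters use v=2 and may carry a start/end pair
    taken from a `t=start-end` URL parameter; shorter ids use v=1."""
    if len(video_id) < 10:
        return {'v': 1}
    query = {'v': 2}
    # A missing `t` yields [''], which splits into a single empty element.
    t = parse_qs(urlparse(url).query).get('t', [''])[0].split('-')
    if len(t) > 1:
        query.update({'s': t[0], 'e': t[1]})
    return query
```

With the test URLs from the diff, `https://beeg.com/1941093077?t=911-1391` selects `v=2` with `s` and `e` set, while `http://beeg.com/5416503` falls back to `v=1`.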

View File

@ -22,11 +22,10 @@ class BellMediaIE(InfoExtractor):
bravo| bravo|
mtv| mtv|
space| space|
etalk| etalk
marilyn
)\.ca| )\.ca|
(?:much|cp24)\.com much\.com
)/.*?(?:\b(?:vid(?:eoid)?|clipId)=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})''' )/.*?(?:\bvid(?:eoid)?=|-vid|~|%7E|/(?:episode)?)(?P<id>[0-9]{6,})'''
_TESTS = [{ _TESTS = [{
'url': 'https://www.bnnbloomberg.ca/video/david-cockfield-s-top-picks~1403070', 'url': 'https://www.bnnbloomberg.ca/video/david-cockfield-s-top-picks~1403070',
'md5': '36d3ef559cfe8af8efe15922cd3ce950', 'md5': '36d3ef559cfe8af8efe15922cd3ce950',
@ -62,9 +61,6 @@ class BellMediaIE(InfoExtractor):
}, { }, {
'url': 'http://www.etalk.ca/video?videoid=663455', 'url': 'http://www.etalk.ca/video?videoid=663455',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.cp24.com/video?clipId=1982548',
'only_matching': True,
}] }]
_DOMAINS = { _DOMAINS = {
'thecomedynetwork': 'comedy', 'thecomedynetwork': 'comedy',
@ -74,7 +70,6 @@ class BellMediaIE(InfoExtractor):
'animalplanet': 'aniplan', 'animalplanet': 'aniplan',
'etalk': 'ctv', 'etalk': 'ctv',
'bnnbloomberg': 'bnn', 'bnnbloomberg': 'bnn',
'marilyn': 'ctv_marilyn',
} }
def _real_extract(self, url): def _real_extract(self, url):

View File

@ -15,7 +15,6 @@ from ..utils import (
float_or_none, float_or_none,
parse_iso8601, parse_iso8601,
smuggle_url, smuggle_url,
str_or_none,
strip_jsonp, strip_jsonp,
unified_timestamp, unified_timestamp,
unsmuggle_url, unsmuggle_url,
@ -24,18 +23,7 @@ from ..utils import (
class BiliBiliIE(InfoExtractor): class BiliBiliIE(InfoExtractor):
_VALID_URL = r'''(?x) _VALID_URL = r'https?://(?:www\.|bangumi\.|)bilibili\.(?:tv|com)/(?:video/av|anime/(?P<anime_id>\d+)/play#)(?P<id>\d+)'
https?://
(?:(?:www|bangumi)\.)?
bilibili\.(?:tv|com)/
(?:
(?:
video/[aA][vV]|
anime/(?P<anime_id>\d+)/play\#
)(?P<id_bv>\d+)|
video/[bB][vV](?P<id>[^/?#&]+)
)
'''
_TESTS = [{ _TESTS = [{
'url': 'http://www.bilibili.tv/video/av1074402/', 'url': 'http://www.bilibili.tv/video/av1074402/',
@ -103,10 +91,6 @@ class BiliBiliIE(InfoExtractor):
'skip_download': True, # Test metadata only 'skip_download': True, # Test metadata only
}, },
}] }]
}, {
# new BV video id format
'url': 'https://www.bilibili.com/video/BV1JE411F741',
'only_matching': True,
}] }]
_APP_KEY = 'iVGUTjsxvpLeuDCf' _APP_KEY = 'iVGUTjsxvpLeuDCf'
@ -124,7 +108,7 @@ class BiliBiliIE(InfoExtractor):
url, smuggled_data = unsmuggle_url(url, {}) url, smuggled_data = unsmuggle_url(url, {})
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
video_id = mobj.group('id') or mobj.group('id_bv') video_id = mobj.group('id')
anime_id = mobj.group('anime_id') anime_id = mobj.group('anime_id')
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
@ -322,129 +306,3 @@ class BiliBiliBangumiIE(InfoExtractor):
return self.playlist_result( return self.playlist_result(
entries, bangumi_id, entries, bangumi_id,
season_info.get('bangumi_title'), season_info.get('evaluate')) season_info.get('bangumi_title'), season_info.get('evaluate'))
class BilibiliAudioBaseIE(InfoExtractor):
def _call_api(self, path, sid, query=None):
if not query:
query = {'sid': sid}
return self._download_json(
'https://www.bilibili.com/audio/music-service-c/web/' + path,
sid, query=query)['data']
class BilibiliAudioIE(BilibiliAudioBaseIE):
_VALID_URL = r'https?://(?:www\.)?bilibili\.com/audio/au(?P<id>\d+)'
_TEST = {
'url': 'https://www.bilibili.com/audio/au1003142',
'md5': 'fec4987014ec94ef9e666d4d158ad03b',
'info_dict': {
'id': '1003142',
'ext': 'm4a',
'title': '【tsukimi】YELLOW / 神山羊',
'artist': 'tsukimi',
'comment_count': int,
'description': 'YELLOW的mp3版',
'duration': 183,
'subtitles': {
'origin': [{
'ext': 'lrc',
}],
},
'thumbnail': r're:^https?://.+\.jpg',
'timestamp': 1564836614,
'upload_date': '20190803',
'uploader': 'tsukimi-つきみぐー',
'view_count': int,
},
}
def _real_extract(self, url):
au_id = self._match_id(url)
play_data = self._call_api('url', au_id)
formats = [{
'url': play_data['cdns'][0],
'filesize': int_or_none(play_data.get('size')),
}]
song = self._call_api('song/info', au_id)
title = song['title']
statistic = song.get('statistic') or {}
subtitles = None
lyric = song.get('lyric')
if lyric:
subtitles = {
'origin': [{
'url': lyric,
}]
}
return {
'id': au_id,
'title': title,
'formats': formats,
'artist': song.get('author'),
'comment_count': int_or_none(statistic.get('comment')),
'description': song.get('intro'),
'duration': int_or_none(song.get('duration')),
'subtitles': subtitles,
'thumbnail': song.get('cover'),
'timestamp': int_or_none(song.get('passtime')),
'uploader': song.get('uname'),
'view_count': int_or_none(statistic.get('play')),
}
class BilibiliAudioAlbumIE(BilibiliAudioBaseIE):
_VALID_URL = r'https?://(?:www\.)?bilibili\.com/audio/am(?P<id>\d+)'
_TEST = {
'url': 'https://www.bilibili.com/audio/am10624',
'info_dict': {
'id': '10624',
'title': '每日新曲推荐每日11:00更新',
'description': '每天11:00更新为你推送最新音乐',
},
'playlist_count': 19,
}
def _real_extract(self, url):
am_id = self._match_id(url)
songs = self._call_api(
'song/of-menu', am_id, {'sid': am_id, 'pn': 1, 'ps': 100})['data']
entries = []
for song in songs:
sid = str_or_none(song.get('id'))
if not sid:
continue
entries.append(self.url_result(
'https://www.bilibili.com/audio/au' + sid,
BilibiliAudioIE.ie_key(), sid))
if entries:
album_data = self._call_api('menu/info', am_id) or {}
album_title = album_data.get('title')
if album_title:
for entry in entries:
entry['album'] = album_title
return self.playlist_result(
entries, am_id, album_title, album_data.get('intro'))
return self.playlist_result(entries, am_id)
class BiliBiliPlayerIE(InfoExtractor):
_VALID_URL = r'https?://player\.bilibili\.com/player\.html\?.*?\baid=(?P<id>\d+)'
_TEST = {
'url': 'http://player.bilibili.com/player.html?aid=92494333&cid=157926707&page=1',
'only_matching': True,
}
def _real_extract(self, url):
video_id = self._match_id(url)
return self.url_result(
'http://www.bilibili.tv/video/av%s/' % video_id,
ie=BiliBiliIE.ie_key(), video_id=video_id)
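
The verbose `_VALID_URL` on the left-hand side of the earlier hunk accepts both the numeric `av` form and the newer `BV` form. Note that its group names are counter-intuitive: `id_bv` captures the numeric id, and `id` captures the `BV` id. The sketch below copies that pattern to show which group fires for which URL. `extract_id` is a hypothetical helper, and the `example.com` URL is made up for the negative case.

```python
import re

# Pattern copied from the left-hand side of the diff.
VALID_URL = r'''(?x)
    https?://
        (?:(?:www|bangumi)\.)?
        bilibili\.(?:tv|com)/
        (?:
            (?:
                video/[aA][vV]|
                anime/(?P<anime_id>\d+)/play\#
            )(?P<id_bv>\d+)|
            video/[bB][vV](?P<id>[^/?#&]+)
        )
'''


def extract_id(url):
    """Return the video id from either URL form, or None if nothing matches."""
    mobj = re.match(VALID_URL, url)
    if not mobj:
        return None
    return mobj.group('id') or mobj.group('id_bv')
```

In verbose mode an unescaped `#` starts a comment, which is why the pattern escapes it in `play\#`. Inside the character class `[^/?#&]` no escape is needed.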

View File

@ -6,6 +6,7 @@ from ..utils import (
ExtractorError, ExtractorError,
remove_end, remove_end,
) )
from .rudo import RudoIE
class BioBioChileTVIE(InfoExtractor): class BioBioChileTVIE(InfoExtractor):
@ -40,15 +41,11 @@ class BioBioChileTVIE(InfoExtractor):
}, { }, {
'url': 'http://www.biobiochile.cl/noticias/bbtv/comentarios-bio-bio/2016/07/08/edecanes-del-congreso-figuras-decorativas-que-le-cuestan-muy-caro-a-los-chilenos.shtml', 'url': 'http://www.biobiochile.cl/noticias/bbtv/comentarios-bio-bio/2016/07/08/edecanes-del-congreso-figuras-decorativas-que-le-cuestan-muy-caro-a-los-chilenos.shtml',
'info_dict': { 'info_dict': {
'id': 'b4xd0LK3SK', 'id': 'edecanes-del-congreso-figuras-decorativas-que-le-cuestan-muy-caro-a-los-chilenos',
'ext': 'mp4', 'ext': 'mp4',
# TODO: fix url_transparent information overriding 'uploader': '(none)',
# 'uploader': 'Juan Pablo Echenique', 'upload_date': '20160708',
'title': 'Comentario Oscar Cáceres', 'title': 'Edecanes del Congreso: Figuras decorativas que le cuestan muy caro a los chilenos',
},
'params': {
# empty m3u8 manifest
'skip_download': True,
}, },
}, { }, {
'url': 'http://tv.biobiochile.cl/notas/2015/10/22/ninos-transexuales-de-quien-es-la-decision.shtml', 'url': 'http://tv.biobiochile.cl/notas/2015/10/22/ninos-transexuales-de-quien-es-la-decision.shtml',
@ -63,9 +60,7 @@ class BioBioChileTVIE(InfoExtractor):
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
rudo_url = self._search_regex( rudo_url = RudoIE._extract_url(webpage)
r'<iframe[^>]+src=(?P<q1>[\'"])(?P<url>(?:https?:)?//rudo\.video/vod/[0-9a-zA-Z]+)(?P=q1)',
webpage, 'embed URL', None, group='url')
if not rudo_url: if not rudo_url:
raise ExtractorError('No videos found') raise ExtractorError('No videos found')
@ -73,7 +68,7 @@ class BioBioChileTVIE(InfoExtractor):
thumbnail = self._og_search_thumbnail(webpage) thumbnail = self._og_search_thumbnail(webpage)
uploader = self._html_search_regex( uploader = self._html_search_regex(
r'<a[^>]+href=["\'](?:https?://(?:busca|www)\.biobiochile\.cl)?/(?:lista/)?(?:author|autor)[^>]+>(.+?)</a>', r'<a[^>]+href=["\']https?://(?:busca|www)\.biobiochile\.cl/(?:lista/)?(?:author|autor)[^>]+>(.+?)</a>',
webpage, 'uploader', fatal=False) webpage, 'uploader', fatal=False)
return { return {

View File

@ -3,11 +3,10 @@ from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from .vk import VKIE from .vk import VKIE
from ..compat import ( from ..utils import (
compat_b64decode, HEADRequest,
compat_urllib_parse_unquote, int_or_none,
) )
from ..utils import int_or_none
class BIQLEIE(InfoExtractor): class BIQLEIE(InfoExtractor):
@ -43,21 +42,14 @@ class BIQLEIE(InfoExtractor):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id) webpage = self._download_webpage(url, video_id)
embed_url = self._proto_relative_url(self._search_regex( embed_url = self._proto_relative_url(self._search_regex(
r'<iframe.+?src="((?:https?:)?//(?:daxab\.com|dxb\.to|[^/]+/player)/[^"]+)".*?></iframe>', r'<iframe.+?src="((?:https?:)?//daxab\.com/[^"]+)".*?></iframe>',
webpage, 'embed url')) webpage, 'embed url'))
if VKIE.suitable(embed_url): if VKIE.suitable(embed_url):
return self.url_result(embed_url, VKIE.ie_key(), video_id) return self.url_result(embed_url, VKIE.ie_key(), video_id)
embed_page = self._download_webpage( self._request_webpage(
embed_url, video_id, headers={'Referer': url}) HEADRequest(embed_url), video_id, headers={'Referer': url})
video_ext = self._get_cookies(embed_url).get('video_ext') video_id, sig, _, access_token = self._get_cookies(embed_url)['video_ext'].value.split('%3A')
if video_ext:
video_ext = compat_urllib_parse_unquote(video_ext.value)
if not video_ext:
video_ext = compat_b64decode(self._search_regex(
r'video_ext\s*:\s*[\'"]([A-Za-z0-9+/=]+)',
embed_page, 'video_ext')).decode()
video_id, sig, _, access_token = video_ext.split(':')
item = self._download_json( item = self._download_json(
'https://api.vk.com/method/video.get', video_id, 'https://api.vk.com/method/video.get', video_id,
headers={'User-Agent': 'okhttp/3.4.1'}, query={ headers={'User-Agent': 'okhttp/3.4.1'}, query={
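
The left-hand side of this hunk reads `video_ext` from a percent-encoded cookie and falls back to a base64 string scraped from the embed page before splitting it into four colon-separated fields. A minimal sketch of that fallback follows, assuming Python 3 standard-library calls in place of the `compat_*` wrappers. `parse_video_ext` is a hypothetical helper, and the sample values used with it are invented.

```python
import base64
from urllib.parse import unquote


def parse_video_ext(cookie_value=None, page_value=None):
    """Return (video_id, sig, access_token) from either source.

    cookie_value: percent-encoded cookie string, or None if the cookie is absent.
    page_value: base64 string scraped from the embed page, used as a fallback.
    """
    video_ext = unquote(cookie_value) if cookie_value else None
    if not video_ext:
        video_ext = base64.b64decode(page_value).decode()
    # The third field is ignored, as in the extractor.
    video_id, sig, _, access_token = video_ext.split(':')
    return video_id, sig, access_token
```

The right-hand side of the diff skips the fallback and splits the raw cookie value on `%3A`. That gives the same fields when the cookie is present, but raises a `KeyError` when it is not.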

View File

@ -7,7 +7,6 @@ import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
orderedSet, orderedSet,
unified_strdate,
urlencode_postdata, urlencode_postdata,
) )
@ -24,7 +23,6 @@ class BitChuteIE(InfoExtractor):
'description': 'md5:3f21f6fb5b1d17c3dee9cf6b5fe60b3a', 'description': 'md5:3f21f6fb5b1d17c3dee9cf6b5fe60b3a',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Victoria X Rave', 'uploader': 'Victoria X Rave',
'upload_date': '20170813',
}, },
}, { }, {
'url': 'https://www.bitchute.com/embed/lbb5G1hjPhw/', 'url': 'https://www.bitchute.com/embed/lbb5G1hjPhw/',
@ -57,11 +55,6 @@ class BitChuteIE(InfoExtractor):
formats = [ formats = [
{'url': format_url} {'url': format_url}
for format_url in orderedSet(format_urls)] for format_url in orderedSet(format_urls)]
if not formats:
formats = self._parse_html5_media_entries(
url, webpage, video_id)[0]['formats']
self._check_formats(formats, video_id) self._check_formats(formats, video_id)
self._sort_formats(formats) self._sort_formats(formats)
@ -72,13 +65,8 @@ class BitChuteIE(InfoExtractor):
webpage, default=None) or self._html_search_meta( webpage, default=None) or self._html_search_meta(
'twitter:image:src', webpage, 'thumbnail') 'twitter:image:src', webpage, 'thumbnail')
uploader = self._html_search_regex( uploader = self._html_search_regex(
(r'(?s)<div class=["\']channel-banner.*?<p\b[^>]+\bclass=["\']name[^>]+>(.+?)</p>', r'(?s)<p\b[^>]+\bclass=["\']video-author[^>]+>(.+?)</p>', webpage,
r'(?s)<p\b[^>]+\bclass=["\']video-author[^>]+>(.+?)</p>'), 'uploader', fatal=False)
webpage, 'uploader', fatal=False)
upload_date = unified_strdate(self._search_regex(
r'class=["\']video-publish-date[^>]+>[^<]+ at \d+:\d+ UTC on (.+?)\.',
webpage, 'upload date', fatal=False))
return { return {
'id': video_id, 'id': video_id,
@ -86,7 +74,6 @@ class BitChuteIE(InfoExtractor):
'description': description, 'description': description,
'thumbnail': thumbnail, 'thumbnail': thumbnail,
'uploader': uploader, 'uploader': uploader,
'upload_date': upload_date,
'formats': formats, 'formats': formats,
} }

View File

@ -71,7 +71,7 @@ class BleacherReportIE(InfoExtractor):
video = article_data.get('video') video = article_data.get('video')
if video: if video:
video_type = video['type'] video_type = video['type']
if video_type in ('cms.bleacherreport.com', 'vid.bleacherreport.com'): if video_type == 'cms.bleacherreport.com':
info['url'] = 'http://bleacherreport.com/video_embed?id=%s' % video['id'] info['url'] = 'http://bleacherreport.com/video_embed?id=%s' % video['id']
elif video_type == 'ooyala.com': elif video_type == 'ooyala.com':
info['url'] = 'ooyala:%s' % video['id'] info['url'] = 'ooyala:%s' % video['id']
@ -87,9 +87,9 @@ class BleacherReportIE(InfoExtractor):
class BleacherReportCMSIE(AMPIE): class BleacherReportCMSIE(AMPIE):
_VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36}|\d{5})' _VALID_URL = r'https?://(?:www\.)?bleacherreport\.com/video_embed\?id=(?P<id>[0-9a-f-]{36})'
_TESTS = [{ _TESTS = [{
'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1&library=video-cms', 'url': 'http://bleacherreport.com/video_embed?id=8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
'md5': '2e4b0a997f9228ffa31fada5c53d1ed1', 'md5': '2e4b0a997f9228ffa31fada5c53d1ed1',
'info_dict': { 'info_dict': {
'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1', 'id': '8fd44c2f-3dc5-4821-9118-2c825a98c0e1',
@ -101,6 +101,6 @@ class BleacherReportCMSIE(AMPIE):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
info = self._extract_feed_info('http://vid.bleacherreport.com/videos/%s.akamai' % video_id) info = self._extract_feed_info('http://cms.bleacherreport.com/media/items/%s/akamai.json' % video_id)
info['id'] = video_id info['id'] = video_id
return info return info

View File

@ -32,8 +32,8 @@ class BlinkxIE(InfoExtractor):
video_id = self._match_id(url) video_id = self._match_id(url)
display_id = video_id[:8] display_id = video_id[:8]
api_url = ('https://apib4.blinkx.com/api.php?action=play_video&' api_url = ('https://apib4.blinkx.com/api.php?action=play_video&' +
+ 'video=%s' % video_id) 'video=%s' % video_id)
data_json = self._download_webpage(api_url, display_id) data_json = self._download_webpage(api_url, display_id)
data = json.loads(data_json)['api']['results'][0] data = json.loads(data_json)['api']['results'][0]
duration = None duration = None

View File

@ -11,8 +11,8 @@ from ..utils import ExtractorError
class BokeCCBaseIE(InfoExtractor): class BokeCCBaseIE(InfoExtractor):
def _extract_bokecc_formats(self, webpage, video_id, format_id=None): def _extract_bokecc_formats(self, webpage, video_id, format_id=None):
player_params_str = self._html_search_regex( player_params_str = self._html_search_regex(
r'<(?:script|embed)[^>]+src=(?P<q>["\'])(?:https?:)?//p\.bokecc\.com/(?:player|flash/player\.swf)\?(?P<query>.+?)(?P=q)', r'<(?:script|embed)[^>]+src="http://p\.bokecc\.com/player\?([^"]+)',
webpage, 'player params', group='query') webpage, 'player params')
player_params = compat_parse_qs(player_params_str) player_params = compat_parse_qs(player_params_str)
@ -36,9 +36,9 @@ class BokeCCIE(BokeCCBaseIE):
_VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)' _VALID_URL = r'https?://union\.bokecc\.com/playvideo\.bo\?(?P<query>.*)'
_TESTS = [{ _TESTS = [{
'url': 'http://union.bokecc.com/playvideo.bo?vid=E0ABAE9D4F509B189C33DC5901307461&uid=FE644790DE9D154A', 'url': 'http://union.bokecc.com/playvideo.bo?vid=E44D40C15E65EA30&uid=CD0C5D3C8614B28B',
'info_dict': { 'info_dict': {
'id': 'FE644790DE9D154A_E0ABAE9D4F509B189C33DC5901307461', 'id': 'CD0C5D3C8614B28B_E44D40C15E65EA30',
'ext': 'flv', 'ext': 'flv',
'title': 'BokeCC Video', 'title': 'BokeCC Video',
}, },

View File

@ -1,8 +1,6 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
from .adobepass import AdobePassIE from .adobepass import AdobePassIE
from ..utils import ( from ..utils import (
smuggle_url, smuggle_url,
@ -14,16 +12,16 @@ from ..utils import (
class BravoTVIE(AdobePassIE): class BravoTVIE(AdobePassIE):
_VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)' _VALID_URL = r'https?://(?:www\.)?bravotv\.com/(?:[^/]+/)+(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.bravotv.com/top-chef/season-16/episode-15/videos/the-top-chef-season-16-winner-is', 'url': 'http://www.bravotv.com/last-chance-kitchen/season-5/videos/lck-ep-12-fishy-finale',
'md5': 'e34684cfea2a96cd2ee1ef3a60909de9', 'md5': '9086d0b7ef0ea2aabc4781d75f4e5863',
'info_dict': { 'info_dict': {
'id': 'epL0pmK1kQlT', 'id': 'zHyk1_HU_mPy',
'ext': 'mp4', 'ext': 'mp4',
'title': 'The Top Chef Season 16 Winner Is...', 'title': 'LCK Ep 12: Fishy Finale',
'description': 'Find out who takes the title of Top Chef!', 'description': 'S13/E12: Two eliminated chefs have just 12 minutes to cook up a delicious fish dish.',
'uploader': 'NBCU-BRAV', 'uploader': 'NBCU-BRAV',
'upload_date': '20190314', 'upload_date': '20160302',
'timestamp': 1552591860, 'timestamp': 1456945320,
} }
}, { }, {
'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1', 'url': 'http://www.bravotv.com/below-deck/season-3/ep-14-reunion-part-1',
@ -34,38 +32,30 @@ class BravoTVIE(AdobePassIE):
display_id = self._match_id(url) display_id = self._match_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, display_id)
settings = self._parse_json(self._search_regex( settings = self._parse_json(self._search_regex(
r'<script[^>]+data-drupal-selector="drupal-settings-json"[^>]*>({.+?})</script>', webpage, 'drupal settings'), r'jQuery\.extend\(Drupal\.settings\s*,\s*({.+?})\);', webpage, 'drupal settings'),
display_id) display_id)
info = {} info = {}
query = { query = {
'mbr': 'true', 'mbr': 'true',
} }
account_pid, release_pid = [None] * 2 account_pid, release_pid = [None] * 2
tve = settings.get('ls_tve') tve = settings.get('sharedTVE')
if tve: if tve:
query['manifest'] = 'm3u' query['manifest'] = 'm3u'
mobj = re.search(r'<[^>]+id="pdk-player"[^>]+data-url=["\']?(?:https?:)?//player\.theplatform\.com/p/([^/]+)/(?:[^/]+/)*select/([^?#&"\']+)', webpage) account_pid = 'HNK2IC'
if mobj: release_pid = tve['release_pid']
account_pid, tp_path = mobj.groups()
release_pid = tp_path.strip('/').split('/')[-1]
else:
account_pid = 'HNK2IC'
tp_path = release_pid = tve['release_pid']
if tve.get('entitlement') == 'auth': if tve.get('entitlement') == 'auth':
adobe_pass = settings.get('tve_adobe_auth', {}) adobe_pass = settings.get('adobePass', {})
resource = self._get_mvpd_resource( resource = self._get_mvpd_resource(
adobe_pass.get('adobePassResourceId', 'bravo'), adobe_pass.get('adobePassResourceId', 'bravo'),
tve['title'], release_pid, tve.get('rating')) tve['title'], release_pid, tve.get('rating'))
query['auth'] = self._extract_mvpd_auth( query['auth'] = self._extract_mvpd_auth(
url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource) url, release_pid, adobe_pass.get('adobePassRequestorId', 'bravo'), resource)
else: else:
shared_playlist = settings['ls_playlist'] shared_playlist = settings['shared_playlist']
account_pid = shared_playlist['account_pid'] account_pid = shared_playlist['account_pid']
metadata = shared_playlist['video_metadata'][shared_playlist['default_clip']] metadata = shared_playlist['video_metadata'][shared_playlist['default_clip']]
tp_path = release_pid = metadata.get('release_pid') release_pid = metadata['release_pid']
if not release_pid:
release_pid = metadata['guid']
tp_path = 'media/guid/2140479951/' + release_pid
info.update({ info.update({
'title': metadata['title'], 'title': metadata['title'],
'description': metadata.get('description'), 'description': metadata.get('description'),
@ -77,7 +67,7 @@ class BravoTVIE(AdobePassIE):
'_type': 'url_transparent', '_type': 'url_transparent',
'id': release_pid, 'id': release_pid,
'url': smuggle_url(update_url_query( 'url': smuggle_url(update_url_query(
'http://link.theplatform.com/s/%s/%s' % (account_pid, tp_path), 'http://link.theplatform.com/s/%s/%s' % (account_pid, release_pid),
query), {'force_smil_url': True}), query), {'force_smil_url': True}),
'ie_key': 'ThePlatform', 'ie_key': 'ThePlatform',
}) })


@ -2,43 +2,43 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import base64 import base64
import json
import re import re
import struct import struct
from .adobepass import AdobePassIE
from .common import InfoExtractor from .common import InfoExtractor
from .adobepass import AdobePassIE
from ..compat import ( from ..compat import (
compat_etree_fromstring, compat_etree_fromstring,
compat_HTTPError,
compat_parse_qs, compat_parse_qs,
compat_str,
compat_urllib_parse_urlparse, compat_urllib_parse_urlparse,
compat_urlparse, compat_urlparse,
compat_xml_parse_error, compat_xml_parse_error,
compat_HTTPError,
) )
from ..utils import ( from ..utils import (
clean_html, determine_ext,
extract_attributes,
ExtractorError, ExtractorError,
extract_attributes,
find_xpath_attr, find_xpath_attr,
fix_xml_ampersands, fix_xml_ampersands,
float_or_none, float_or_none,
int_or_none,
js_to_json, js_to_json,
mimetype2ext, int_or_none,
parse_iso8601, parse_iso8601,
smuggle_url,
str_or_none,
unescapeHTML, unescapeHTML,
unsmuggle_url, unsmuggle_url,
UnsupportedError,
update_url_query, update_url_query,
url_or_none, clean_html,
mimetype2ext,
) )
class BrightcoveLegacyIE(InfoExtractor): class BrightcoveLegacyIE(InfoExtractor):
IE_NAME = 'brightcove:legacy' IE_NAME = 'brightcove:legacy'
_VALID_URL = r'(?:https?://.*brightcove\.com/(services|viewer).*?\?|brightcove:)(?P<query>.*)' _VALID_URL = r'(?:https?://.*brightcove\.com/(services|viewer).*?\?|brightcove:)(?P<query>.*)'
_FEDERATED_URL = 'http://c.brightcove.com/services/viewer/htmlFederated'
_TESTS = [ _TESTS = [
{ {
@ -55,8 +55,7 @@ class BrightcoveLegacyIE(InfoExtractor):
'timestamp': 1368213670, 'timestamp': 1368213670,
'upload_date': '20130510', 'upload_date': '20130510',
'uploader_id': '1589608506001', 'uploader_id': '1589608506001',
}, }
'skip': 'The player has been deactivated by the content owner',
}, },
{ {
# From http://medianetwork.oracle.com/video/player/1785452137001 # From http://medianetwork.oracle.com/video/player/1785452137001
@ -71,7 +70,6 @@ class BrightcoveLegacyIE(InfoExtractor):
'upload_date': '20120814', 'upload_date': '20120814',
'uploader_id': '1460825906', 'uploader_id': '1460825906',
}, },
'skip': 'video not playable',
}, },
{ {
# From http://mashable.com/2013/10/26/thermoelectric-bracelet-lets-you-control-your-body-temperature/ # From http://mashable.com/2013/10/26/thermoelectric-bracelet-lets-you-control-your-body-temperature/
@ -81,7 +79,7 @@ class BrightcoveLegacyIE(InfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'title': 'This Bracelet Acts as a Personal Thermostat', 'title': 'This Bracelet Acts as a Personal Thermostat',
'description': 'md5:547b78c64f4112766ccf4e151c20b6a0', 'description': 'md5:547b78c64f4112766ccf4e151c20b6a0',
# 'uploader': 'Mashable', 'uploader': 'Mashable',
'timestamp': 1382041798, 'timestamp': 1382041798,
'upload_date': '20131017', 'upload_date': '20131017',
'uploader_id': '1130468786001', 'uploader_id': '1130468786001',
@ -126,7 +124,6 @@ class BrightcoveLegacyIE(InfoExtractor):
'id': '3550319591001', 'id': '3550319591001',
}, },
'playlist_mincount': 7, 'playlist_mincount': 7,
'skip': 'Unsupported URL',
}, },
{ {
# playlist with 'playlistTab' (https://github.com/ytdl-org/youtube-dl/issues/9965) # playlist with 'playlistTab' (https://github.com/ytdl-org/youtube-dl/issues/9965)
@ -136,7 +133,6 @@ class BrightcoveLegacyIE(InfoExtractor):
'title': 'Lesson 08', 'title': 'Lesson 08',
}, },
'playlist_mincount': 10, 'playlist_mincount': 10,
'skip': 'Unsupported URL',
}, },
{ {
# playerID inferred from bcpid # playerID inferred from bcpid
@ -145,6 +141,12 @@ class BrightcoveLegacyIE(InfoExtractor):
'only_matching': True, # Tested in GenericIE 'only_matching': True, # Tested in GenericIE
} }
] ]
FLV_VCODECS = {
1: 'SORENSON',
2: 'ON2',
3: 'H264',
4: 'VP8',
}
@classmethod @classmethod
def _build_brighcove_url(cls, object_str): def _build_brighcove_url(cls, object_str):
@ -236,8 +238,7 @@ class BrightcoveLegacyIE(InfoExtractor):
@classmethod @classmethod
def _make_brightcove_url(cls, params): def _make_brightcove_url(cls, params):
return update_url_query( return update_url_query(cls._FEDERATED_URL, params)
'http://c.brightcove.com/services/viewer/htmlFederated', params)
@classmethod @classmethod
def _extract_brightcove_url(cls, webpage): def _extract_brightcove_url(cls, webpage):
@ -296,12 +297,38 @@ class BrightcoveLegacyIE(InfoExtractor):
videoPlayer = query.get('@videoPlayer') videoPlayer = query.get('@videoPlayer')
if videoPlayer: if videoPlayer:
# We set the original url as the default 'Referer' header # We set the original url as the default 'Referer' header
referer = query.get('linkBaseURL', [None])[0] or smuggled_data.get('Referer', url) referer = smuggled_data.get('Referer', url)
video_id = videoPlayer[0]
if 'playerID' not in query: if 'playerID' not in query:
mobj = re.search(r'/bcpid(\d+)', url) mobj = re.search(r'/bcpid(\d+)', url)
if mobj is not None: if mobj is not None:
query['playerID'] = [mobj.group(1)] query['playerID'] = [mobj.group(1)]
return self._get_video_info(
videoPlayer[0], query, referer=referer)
elif 'playerKey' in query:
player_key = query['playerKey']
return self._get_playlist_info(player_key[0])
else:
raise ExtractorError(
'Cannot find playerKey= variable. Did you forget quotes in a shell invocation?',
expected=True)
def _brightcove_new_url_result(self, publisher_id, video_id):
brightcove_new_url = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s' % (publisher_id, video_id)
return self.url_result(brightcove_new_url, BrightcoveNewIE.ie_key(), video_id)
def _get_video_info(self, video_id, query, referer=None):
headers = {}
linkBase = query.get('linkBaseURL')
if linkBase is not None:
referer = linkBase[0]
if referer is not None:
headers['Referer'] = referer
webpage = self._download_webpage(self._FEDERATED_URL, video_id, headers=headers, query=query)
error_msg = self._html_search_regex(
r"<h1>We're sorry.</h1>([\s\n]*<p>.*?</p>)+", webpage,
'error message', default=None)
if error_msg is not None:
publisher_id = query.get('publisherId') publisher_id = query.get('publisherId')
if publisher_id and publisher_id[0].isdigit(): if publisher_id and publisher_id[0].isdigit():
publisher_id = publisher_id[0] publisher_id = publisher_id[0]
@ -312,9 +339,6 @@ class BrightcoveLegacyIE(InfoExtractor):
else: else:
player_id = query.get('playerID') player_id = query.get('playerID')
if player_id and player_id[0].isdigit(): if player_id and player_id[0].isdigit():
headers = {}
if referer:
headers['Referer'] = referer
player_page = self._download_webpage( player_page = self._download_webpage(
'http://link.brightcove.com/services/player/bcpid' + player_id[0], 'http://link.brightcove.com/services/player/bcpid' + player_id[0],
video_id, headers=headers, fatal=False) video_id, headers=headers, fatal=False)
@ -325,21 +349,141 @@ class BrightcoveLegacyIE(InfoExtractor):
if player_key: if player_key:
enc_pub_id = player_key.split(',')[1].replace('~', '=') enc_pub_id = player_key.split(',')[1].replace('~', '=')
publisher_id = struct.unpack('>Q', base64.urlsafe_b64decode(enc_pub_id))[0] publisher_id = struct.unpack('>Q', base64.urlsafe_b64decode(enc_pub_id))[0]
if publisher_id: if publisher_id:
brightcove_new_url = 'http://players.brightcove.net/%s/default_default/index.html?videoId=%s' % (publisher_id, video_id) return self._brightcove_new_url_result(publisher_id, video_id)
if referer: raise ExtractorError(
brightcove_new_url = smuggle_url(brightcove_new_url, {'referrer': referer}) 'brightcove said: %s' % error_msg, expected=True)
return self.url_result(brightcove_new_url, BrightcoveNewIE.ie_key(), video_id)
# TODO: figure out if it's possible to extract playlistId from playerKey self.report_extraction(video_id)
# elif 'playerKey' in query: info = self._search_regex(r'var experienceJSON = ({.*});', webpage, 'json')
# player_key = query['playerKey'] info = json.loads(info)['data']
# return self._get_playlist_info(player_key[0]) video_info = info['programmedContent']['videoPlayer']['mediaDTO']
raise UnsupportedError(url) video_info['_youtubedl_adServerURL'] = info.get('adServerURL')
return self._extract_video_info(video_info)
def _get_playlist_info(self, player_key):
info_url = 'http://c.brightcove.com/services/json/experience/runtime/?command=get_programming_for_experience&playerKey=%s' % player_key
playlist_info = self._download_webpage(
info_url, player_key, 'Downloading playlist information')
json_data = json.loads(playlist_info)
if 'videoList' in json_data:
playlist_info = json_data['videoList']
playlist_dto = playlist_info['mediaCollectionDTO']
elif 'playlistTabs' in json_data:
playlist_info = json_data['playlistTabs']
playlist_dto = playlist_info['lineupListDTO']['playlistDTOs'][0]
else:
raise ExtractorError('Empty playlist')
videos = [self._extract_video_info(video_info) for video_info in playlist_dto['videoDTOs']]
return self.playlist_result(videos, playlist_id='%s' % playlist_info['id'],
playlist_title=playlist_dto['displayName'])
def _extract_video_info(self, video_info):
video_id = compat_str(video_info['id'])
publisher_id = video_info.get('publisherId')
info = {
'id': video_id,
'title': video_info['displayName'].strip(),
'description': video_info.get('shortDescription'),
'thumbnail': video_info.get('videoStillURL') or video_info.get('thumbnailURL'),
'uploader': video_info.get('publisherName'),
'uploader_id': compat_str(publisher_id) if publisher_id else None,
'duration': float_or_none(video_info.get('length'), 1000),
'timestamp': int_or_none(video_info.get('creationDate'), 1000),
}
renditions = video_info.get('renditions', []) + video_info.get('IOSRenditions', [])
if renditions:
formats = []
for rend in renditions:
url = rend['defaultURL']
if not url:
continue
ext = None
if rend['remote']:
url_comp = compat_urllib_parse_urlparse(url)
if url_comp.path.endswith('.m3u8'):
formats.extend(
self._extract_m3u8_formats(
url, video_id, 'mp4', 'm3u8_native', m3u8_id='hls', fatal=False))
continue
elif 'akamaihd.net' in url_comp.netloc:
# This type of renditions are served through
# akamaihd.net, but they don't use f4m manifests
url = url.replace('control/', '') + '?&v=3.3.0&fp=13&r=FEEFJ&g=RTSJIMBMPFPB'
ext = 'flv'
if ext is None:
ext = determine_ext(url)
tbr = int_or_none(rend.get('encodingRate'), 1000)
a_format = {
'format_id': 'http%s' % ('-%s' % tbr if tbr else ''),
'url': url,
'ext': ext,
'filesize': int_or_none(rend.get('size')) or None,
'tbr': tbr,
}
if rend.get('audioOnly'):
a_format.update({
'vcodec': 'none',
})
else:
a_format.update({
'height': int_or_none(rend.get('frameHeight')),
'width': int_or_none(rend.get('frameWidth')),
'vcodec': rend.get('videoCodec'),
})
# m3u8 manifests with remote == false are media playlists
# Not calling _extract_m3u8_formats here to save network traffic
if ext == 'm3u8':
a_format.update({
'format_id': 'hls%s' % ('-%s' % tbr if tbr else ''),
'ext': 'mp4',
'protocol': 'm3u8_native',
})
formats.append(a_format)
self._sort_formats(formats)
info['formats'] = formats
elif video_info.get('FLVFullLengthURL') is not None:
info.update({
'url': video_info['FLVFullLengthURL'],
'vcodec': self.FLV_VCODECS.get(video_info.get('FLVFullCodec')),
'filesize': int_or_none(video_info.get('FLVFullSize')),
})
if self._downloader.params.get('include_ads', False):
adServerURL = video_info.get('_youtubedl_adServerURL')
if adServerURL:
ad_info = {
'_type': 'url',
'url': adServerURL,
}
if 'url' in info:
return {
'_type': 'playlist',
'title': info['title'],
'entries': [ad_info, info],
}
else:
return ad_info
if not info.get('url') and not info.get('formats'):
uploader_id = info.get('uploader_id')
if uploader_id:
info.update(self._brightcove_new_url_result(uploader_id, video_id))
else:
raise ExtractorError('Unable to extract video url for %s' % video_id)
return info
class BrightcoveNewIE(AdobePassIE): class BrightcoveNewIE(AdobePassIE):
IE_NAME = 'brightcove:new' IE_NAME = 'brightcove:new'
_VALID_URL = r'https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*(?P<content_type>video|playlist)Id=(?P<video_id>\d+|ref:[^&]+)' _VALID_URL = r'https?://players\.brightcove\.net/(?P<account_id>\d+)/(?P<player_id>[^/]+)_(?P<embed>[^/]+)/index\.html\?.*videoId=(?P<video_id>\d+|ref:[^&]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://players.brightcove.net/929656772001/e41d32dc-ec74-459e-a845-6c69f7b724ea_default/index.html?videoId=4463358922001', 'url': 'http://players.brightcove.net/929656772001/e41d32dc-ec74-459e-a845-6c69f7b724ea_default/index.html?videoId=4463358922001',
'md5': 'c8100925723840d4b0d243f7025703be', 'md5': 'c8100925723840d4b0d243f7025703be',
@ -372,21 +516,6 @@ class BrightcoveNewIE(AdobePassIE):
# m3u8 download # m3u8 download
'skip_download': True, 'skip_download': True,
} }
}, {
# playlist stream
'url': 'https://players.brightcove.net/1752604059001/S13cJdUBz_default/index.html?playlistId=5718313430001',
'info_dict': {
'id': '5718313430001',
'title': 'No Audio Playlist',
},
'playlist_count': 7,
'params': {
# m3u8 download
'skip_download': True,
}
}, {
'url': 'http://players.brightcove.net/5690807595001/HyZNerRl7_default/index.html?playlistId=5743160747001',
'only_matching': True,
}, { }, {
# ref: prefixed video id # ref: prefixed video id
'url': 'http://players.brightcove.net/3910869709001/21519b5c-4b3b-4363-accb-bdc8f358f823_default/index.html?videoId=ref:7069442', 'url': 'http://players.brightcove.net/3910869709001/21519b5c-4b3b-4363-accb-bdc8f358f823_default/index.html?videoId=ref:7069442',
@ -426,7 +555,7 @@ class BrightcoveNewIE(AdobePassIE):
# [2] looks like: # [2] looks like:
for video, script_tag, account_id, player_id, embed in re.findall( for video, script_tag, account_id, player_id, embed in re.findall(
r'''(?isx) r'''(?isx)
(<video(?:-js)?\s+[^>]*\bdata-video-id\s*=\s*['"]?[^>]+>) (<video\s+[^>]*\bdata-video-id\s*=\s*['"]?[^>]+>)
(?:.*? (?:.*?
(<script[^>]+ (<script[^>]+
src=["\'](?:https?:)?//players\.brightcove\.net/ src=["\'](?:https?:)?//players\.brightcove\.net/
@ -555,16 +684,10 @@ class BrightcoveNewIE(AdobePassIE):
subtitles = {} subtitles = {}
for text_track in json_data.get('text_tracks', []): for text_track in json_data.get('text_tracks', []):
if text_track.get('kind') != 'captions': if text_track.get('src'):
continue subtitles.setdefault(text_track.get('srclang'), []).append({
text_track_url = url_or_none(text_track.get('src')) 'url': text_track['src'],
if not text_track_url: })
continue
lang = (str_or_none(text_track.get('srclang'))
or str_or_none(text_track.get('label')) or 'en').lower()
subtitles.setdefault(lang, []).append({
'url': text_track_url,
})
is_live = False is_live = False
duration = float_or_none(json_data.get('duration'), 1000) duration = float_or_none(json_data.get('duration'), 1000)
@ -592,65 +715,47 @@ class BrightcoveNewIE(AdobePassIE):
'ip_blocks': smuggled_data.get('geo_ip_blocks'), 'ip_blocks': smuggled_data.get('geo_ip_blocks'),
}) })
account_id, player_id, embed, content_type, video_id = re.match(self._VALID_URL, url).groups() account_id, player_id, embed, video_id = re.match(self._VALID_URL, url).groups()
policy_key_id = '%s_%s' % (account_id, player_id) webpage = self._download_webpage(
policy_key = self._downloader.cache.load('brightcove', policy_key_id) 'http://players.brightcove.net/%s/%s_%s/index.min.js'
policy_key_extracted = False % (account_id, player_id, embed), video_id)
store_pk = lambda x: self._downloader.cache.store('brightcove', policy_key_id, x)
def extract_policy_key(): policy_key = None
webpage = self._download_webpage(
'http://players.brightcove.net/%s/%s_%s/index.min.js'
% (account_id, player_id, embed), video_id)
policy_key = None catalog = self._search_regex(
r'catalog\(({.+?})\);', webpage, 'catalog', default=None)
catalog = self._search_regex( if catalog:
r'catalog\(({.+?})\);', webpage, 'catalog', default=None) catalog = self._parse_json(
js_to_json(catalog), video_id, fatal=False)
if catalog: if catalog:
catalog = self._parse_json( policy_key = catalog.get('policyKey')
js_to_json(catalog), video_id, fatal=False)
if catalog:
policy_key = catalog.get('policyKey')
if not policy_key: if not policy_key:
policy_key = self._search_regex( policy_key = self._search_regex(
r'policyKey\s*:\s*(["\'])(?P<pk>.+?)\1', r'policyKey\s*:\s*(["\'])(?P<pk>.+?)\1',
webpage, 'policy key', group='pk') webpage, 'policy key', group='pk')
store_pk(policy_key) api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/videos/%s' % (account_id, video_id)
return policy_key headers = {
'Accept': 'application/json;pk=%s' % policy_key,
api_url = 'https://edge.api.brightcove.com/playback/v1/accounts/%s/%ss/%s' % (account_id, content_type, video_id) }
headers = {}
referrer = smuggled_data.get('referrer') referrer = smuggled_data.get('referrer')
if referrer: if referrer:
headers.update({ headers.update({
'Referer': referrer, 'Referer': referrer,
'Origin': re.search(r'https?://[^/]+', referrer).group(0), 'Origin': re.search(r'https?://[^/]+', referrer).group(0),
}) })
try:
for _ in range(2): json_data = self._download_json(api_url, video_id, headers=headers)
if not policy_key: except ExtractorError as e:
policy_key = extract_policy_key() if isinstance(e.cause, compat_HTTPError) and e.cause.code == 403:
policy_key_extracted = True json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
headers['Accept'] = 'application/json;pk=%s' % policy_key message = json_data.get('message') or json_data['error_code']
try: if json_data.get('error_subcode') == 'CLIENT_GEO':
json_data = self._download_json(api_url, video_id, headers=headers) self.raise_geo_restricted(msg=message)
break raise ExtractorError(message, expected=True)
except ExtractorError as e: raise
if isinstance(e.cause, compat_HTTPError) and e.cause.code in (401, 403):
json_data = self._parse_json(e.cause.read().decode(), video_id)[0]
message = json_data.get('message') or json_data['error_code']
if json_data.get('error_subcode') == 'CLIENT_GEO':
self.raise_geo_restricted(msg=message)
elif json_data.get('error_code') == 'INVALID_POLICY_KEY' and not policy_key_extracted:
policy_key = None
store_pk(None)
continue
raise ExtractorError(message, expected=True)
raise
errors = json_data.get('errors') errors = json_data.get('errors')
if errors and errors[0].get('error_subcode') == 'TVE_AUTH': if errors and errors[0].get('error_subcode') == 'TVE_AUTH':
@ -666,12 +771,5 @@ class BrightcoveNewIE(AdobePassIE):
'tveToken': tve_token, 'tveToken': tve_token,
}) })
if content_type == 'playlist':
return self.playlist_result(
[self._parse_brightcove_metadata(vid, vid.get('id'), headers)
for vid in json_data.get('videos', []) if vid.get('id')],
json_data.get('id'), json_data.get('name'),
json_data.get('description'))
return self._parse_brightcove_metadata( return self._parse_brightcove_metadata(
json_data, video_id, headers=headers) json_data, video_id, headers=headers)
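
Reviewer note: one side of the BrightcoveNewIE hunk above caches the policy key and retries once when the API rejects a cached key. The control flow can be sketched in isolation as follows. This is a minimal illustration, not the extractor's code: `InvalidPolicyKey`, `fetch_with_cached_key`, and the plain-dict cache are hypothetical stand-ins for the `INVALID_POLICY_KEY` error handling and `self._downloader.cache` seen in the diff.

```python
class InvalidPolicyKey(Exception):
    """Hypothetical stand-in for an INVALID_POLICY_KEY API response."""


def fetch_with_cached_key(cache, cache_id, extract_key, call_api):
    """Call the API with a cached key, re-extracting the key at most once.

    cache       -- plain dict used here in place of a persistent cache
    extract_key -- callable that returns a freshly scraped key
    call_api    -- callable taking the key; raises InvalidPolicyKey on rejection
    """
    key = cache.get(cache_id)
    extracted = False
    for _ in range(2):
        if not key:
            # No usable cached key: scrape a fresh one and remember it.
            key = extract_key()
            cache[cache_id] = key
            extracted = True
        try:
            return call_api(key)
        except InvalidPolicyKey:
            if extracted:
                # A freshly extracted key was rejected; retrying cannot help.
                raise
            # The cached key is stale: drop it and loop once more.
            key = None
            cache.pop(cache_id, None)
    # Not reached: the second pass always extracts, so it returns or raises.
    raise InvalidPolicyKey('no valid policy key')
```

With a stale cached key, the API is called twice: once with the stale key, once with the re-extracted key, and the cache ends up holding the fresh value. The non-caching side of the diff corresponds to always calling `extract_key()` first and never retrying.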


@ -9,26 +9,21 @@ class BusinessInsiderIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?businessinsider\.(?:com|nl)/(?:[^/]+/)*(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?:[^/]+\.)?businessinsider\.(?:com|nl)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'http://uk.businessinsider.com/how-much-radiation-youre-exposed-to-in-everyday-life-2016-6', 'url': 'http://uk.businessinsider.com/how-much-radiation-youre-exposed-to-in-everyday-life-2016-6',
'md5': 'ffed3e1e12a6f950aa2f7d83851b497a', 'md5': 'ca237a53a8eb20b6dc5bd60564d4ab3e',
'info_dict': { 'info_dict': {
'id': 'cjGDb0X9', 'id': 'hZRllCfw',
'ext': 'mp4', 'ext': 'mp4',
'title': "Bananas give you more radiation exposure than living next to a nuclear power plant", 'title': "Here's how much radiation you're exposed to in everyday life",
'description': 'md5:0175a3baf200dd8fa658f94cade841b3', 'description': 'md5:9a0d6e2c279948aadaa5e84d6d9b99bd',
'upload_date': '20160611', 'upload_date': '20170709',
'timestamp': 1465675620, 'timestamp': 1499606400,
},
'params': {
'skip_download': True,
}, },
}, { }, {
'url': 'https://www.businessinsider.nl/5-scientifically-proven-things-make-you-less-attractive-2017-7/', 'url': 'https://www.businessinsider.nl/5-scientifically-proven-things-make-you-less-attractive-2017-7/',
'md5': '43f438dbc6da0b89f5ac42f68529d84a', 'only_matching': True,
'info_dict': {
'id': '5zJwd4FK',
'ext': 'mp4',
'title': 'Deze dingen zorgen ervoor dat je minder snel een date scoort',
'description': 'md5:2af8975825d38a4fed24717bbe51db49',
'upload_date': '20170705',
'timestamp': 1499270528,
},
}, { }, {
'url': 'http://www.businessinsider.com/excel-index-match-vlookup-video-how-to-2015-2?IR=T', 'url': 'http://www.businessinsider.com/excel-index-match-vlookup-video-how-to-2015-2?IR=T',
'only_matching': True, 'only_matching': True,
@ -40,8 +35,7 @@ class BusinessInsiderIE(InfoExtractor):
jwplatform_id = self._search_regex( jwplatform_id = self._search_regex(
(r'data-media-id=["\']([a-zA-Z0-9]{8})', (r'data-media-id=["\']([a-zA-Z0-9]{8})',
r'id=["\']jwplayer_([a-zA-Z0-9]{8})', r'id=["\']jwplayer_([a-zA-Z0-9]{8})',
r'id["\']?\s*:\s*["\']?([a-zA-Z0-9]{8})', r'id["\']?\s*:\s*["\']?([a-zA-Z0-9]{8})'),
r'(?:jwplatform\.com/players/|jwplayer_)([a-zA-Z0-9]{8})'),
webpage, 'jwplatform id') webpage, 'jwplatform id')
return self.url_result( return self.url_result(
'jwplatform:%s' % jwplatform_id, ie=JWPlatformIE.ie_key(), 'jwplatform:%s' % jwplatform_id, ie=JWPlatformIE.ie_key(),


@ -3,18 +3,11 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import (
determine_ext,
merge_dicts,
parse_duration,
url_or_none,
)
class BYUtvIE(InfoExtractor): class BYUtvIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?byutv\.org/(?:watch|player)/(?!event/)(?P<id>[0-9a-f-]+)(?:/(?P<display_id>[^/?#&]+))?' _VALID_URL = r'https?://(?:www\.)?byutv\.org/(?:watch|player)/(?!event/)(?P<id>[0-9a-f-]+)(?:/(?P<display_id>[^/?#&]+))?'
_TESTS = [{ _TESTS = [{
# ooyalaVOD
'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d/studio-c-season-5-episode-5', 'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d/studio-c-season-5-episode-5',
'info_dict': { 'info_dict': {
'id': 'ZvanRocTpW-G5_yZFeltTAMv6jxOU9KH', 'id': 'ZvanRocTpW-G5_yZFeltTAMv6jxOU9KH',
@ -29,20 +22,6 @@ class BYUtvIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
}, },
'add_ie': ['Ooyala'], 'add_ie': ['Ooyala'],
}, {
# dvr
'url': 'https://www.byutv.org/player/8f1dab9b-b243-47c8-b525-3e2d021a3451/byu-softball-pacific-vs-byu-41219---game-2',
'info_dict': {
'id': '8f1dab9b-b243-47c8-b525-3e2d021a3451',
'display_id': 'byu-softball-pacific-vs-byu-41219---game-2',
'ext': 'mp4',
'title': 'Pacific vs. BYU (4/12/19)',
'description': 'md5:1ac7b57cb9a78015910a4834790ce1f3',
'duration': 11645,
},
'params': {
'skip_download': True
},
}, { }, {
'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d', 'url': 'http://www.byutv.org/watch/6587b9a3-89d2-42a6-a7f7-fd2f81840a7d',
'only_matching': True, 'only_matching': True,
@ -56,62 +35,24 @@ class BYUtvIE(InfoExtractor):
video_id = mobj.group('id') video_id = mobj.group('id')
display_id = mobj.group('display_id') or video_id display_id = mobj.group('display_id') or video_id
video = self._download_json( ep = self._download_json(
'https://api.byutv.org/api3/catalog/getvideosforcontent', 'https://api.byutv.org/api3/catalog/getvideosforcontent', video_id,
display_id, query={ query={
'contentid': video_id, 'contentid': video_id,
'channel': 'byutv', 'channel': 'byutv',
'x-byutv-context': 'web$US', 'x-byutv-context': 'web$US',
}, headers={ }, headers={
'x-byutv-context': 'web$US', 'x-byutv-context': 'web$US',
'x-byutv-platformkey': 'xsaaw9c7y5', 'x-byutv-platformkey': 'xsaaw9c7y5',
}) })['ooyalaVOD']
ep = video.get('ooyalaVOD') return {
if ep: '_type': 'url_transparent',
return { 'ie_key': 'Ooyala',
'_type': 'url_transparent', 'url': 'ooyala:%s' % ep['providerId'],
'ie_key': 'Ooyala',
'url': 'ooyala:%s' % ep['providerId'],
'id': video_id,
'display_id': display_id,
'title': ep.get('title'),
'description': ep.get('description'),
'thumbnail': ep.get('imageThumbnail'),
}
info = {}
formats = []
for format_id, ep in video.items():
if not isinstance(ep, dict):
continue
video_url = url_or_none(ep.get('videoUrl'))
if not video_url:
continue
ext = determine_ext(video_url)
if ext == 'm3u8':
formats.extend(self._extract_m3u8_formats(
video_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id='hls', fatal=False))
elif ext == 'mpd':
formats.extend(self._extract_mpd_formats(
video_url, video_id, mpd_id='dash', fatal=False))
else:
formats.append({
'url': video_url,
'format_id': format_id,
})
merge_dicts(info, {
'title': ep.get('title'),
'description': ep.get('description'),
'thumbnail': ep.get('imageThumbnail'),
'duration': parse_duration(ep.get('length')),
})
self._sort_formats(formats)
return merge_dicts(info, {
'id': video_id, 'id': video_id,
'display_id': display_id, 'display_id': display_id,
'title': display_id, 'title': ep.get('title'),
'formats': formats, 'description': ep.get('description'),
}) 'thumbnail': ep.get('imageThumbnail'),
}
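
Reviewer note: one side of the BYUtvIE hunk above walks every entry of the API response and picks an extraction path from the URL's file extension (`m3u8`, `mpd`, or a direct file). The dispatch can be sketched on its own as below. This is an assumption-laden illustration: `guess_ext` and `classify_formats` are hypothetical helpers written for this note, not youtube-dl's `determine_ext` or any extractor method, and the sample URLs are made up.

```python
import posixpath
from urllib.parse import urlparse


def guess_ext(url):
    """Return the lowercase extension of the URL path, ignoring the query."""
    path = urlparse(url).path
    return posixpath.splitext(path)[1].lstrip('.').lower()


def classify_formats(video):
    """Map each usable entry of an API response to a manifest kind.

    Entries that are not dicts, or that carry no 'videoUrl', are skipped,
    mirroring the guards in the diff.
    """
    kinds = {'m3u8': 'hls', 'mpd': 'dash'}
    result = []
    for format_id, ep in video.items():
        if not isinstance(ep, dict):
            continue
        video_url = ep.get('videoUrl')
        if not video_url:
            continue
        # Anything that is not a known manifest type is treated as a direct file.
        result.append((format_id, kinds.get(guess_ext(video_url), 'direct')))
    return result
```

The other side of the diff indexes `['ooyalaVOD']` directly, so a response without that key raises a `KeyError` instead of falling through to this kind of per-entry dispatch.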


@ -13,76 +13,48 @@ from ..utils import (
int_or_none, int_or_none,
merge_dicts, merge_dicts,
parse_iso8601, parse_iso8601,
str_or_none,
url_or_none,
) )
class CanvasIE(InfoExtractor): class CanvasIE(InfoExtractor):
_VALID_URL = r'https?://mediazone\.vrt\.be/api/v1/(?P<site_id>canvas|een|ketnet|vrt(?:video|nieuws)|sporza)/assets/(?P<id>[^/?#&]+)' _VALID_URL = r'https?://mediazone\.vrt\.be/api/v1/(?P<site_id>canvas|een|ketnet|vrtvideo)/assets/(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://mediazone.vrt.be/api/v1/ketnet/assets/md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475', 'url': 'https://mediazone.vrt.be/api/v1/ketnet/assets/md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
'md5': '68993eda72ef62386a15ea2cf3c93107', 'md5': '90139b746a0a9bd7bb631283f6e2a64e',
'info_dict': { 'info_dict': {
'id': 'md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475', 'id': 'md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
'display_id': 'md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475', 'display_id': 'md-ast-4ac54990-ce66-4d00-a8ca-9eac86f4c475',
'ext': 'mp4', 'ext': 'flv',
'title': 'Nachtwacht: De Greystook', 'title': 'Nachtwacht: De Greystook',
'description': 'Nachtwacht: De Greystook', 'description': 'md5:1db3f5dc4c7109c821261e7512975be7',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 1468.04, 'duration': 1468.03,
}, },
'expected_warnings': ['is not a supported codec', 'Unknown MIME type'], 'expected_warnings': ['is not a supported codec', 'Unknown MIME type'],
}, { }, {
'url': 'https://mediazone.vrt.be/api/v1/canvas/assets/mz-ast-5e5f90b6-2d72-4c40-82c2-e134f884e93e', 'url': 'https://mediazone.vrt.be/api/v1/canvas/assets/mz-ast-5e5f90b6-2d72-4c40-82c2-e134f884e93e',
'only_matching': True, 'only_matching': True,
}] }]
_HLS_ENTRY_PROTOCOLS_MAP = {
'HLS': 'm3u8_native',
'HLS_AES': 'm3u8',
}
_REST_API_BASE = 'https://media-services-public.vrt.be/vualto-video-aggregator-web/rest/external/v1'
def _real_extract(self, url): def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url) mobj = re.match(self._VALID_URL, url)
site_id, video_id = mobj.group('site_id'), mobj.group('id') site_id, video_id = mobj.group('site_id'), mobj.group('id')
# Old API endpoint, serves more formats but may fail for some videos
data = self._download_json( data = self._download_json(
'https://mediazone.vrt.be/api/v1/%s/assets/%s' 'https://mediazone.vrt.be/api/v1/%s/assets/%s'
% (site_id, video_id), video_id, 'Downloading asset JSON', % (site_id, video_id), video_id)
'Unable to download asset JSON', fatal=False)
# New API endpoint
if not data:
token = self._download_json(
'%s/tokens' % self._REST_API_BASE, video_id,
'Downloading token', data=b'',
headers={'Content-Type': 'application/json'})['vrtPlayerToken']
data = self._download_json(
'%s/videos/%s' % (self._REST_API_BASE, video_id),
video_id, 'Downloading video JSON', fatal=False, query={
'vrtPlayerToken': token,
'client': '%s@PROD' % site_id,
}, expected_status=400)
message = data.get('message')
if message and not data.get('title'):
if data.get('code') == 'AUTHENTICATION_REQUIRED':
self.raise_login_required(message)
raise ExtractorError(message, expected=True)
title = data['title'] title = data['title']
description = data.get('description') description = data.get('description')
formats = [] formats = []
for target in data['targetUrls']: for target in data['targetUrls']:
format_url, format_type = url_or_none(target.get('url')), str_or_none(target.get('type')) format_url, format_type = target.get('url'), target.get('type')
if not format_url or not format_type: if not format_url or not format_type:
continue continue
format_type = format_type.upper() if format_type == 'HLS':
if format_type in self._HLS_ENTRY_PROTOCOLS_MAP:
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
format_url, video_id, 'mp4', self._HLS_ENTRY_PROTOCOLS_MAP[format_type], format_url, video_id, 'mp4', entry_protocol='m3u8_native',
m3u8_id=format_type, fatal=False)) m3u8_id=format_type, fatal=False))
elif format_type == 'HDS': elif format_type == 'HDS':
formats.extend(self._extract_f4m_formats( formats.extend(self._extract_f4m_formats(
@ -158,20 +130,20 @@ class CanvasEenIE(InfoExtractor):
}, },
'skip': 'Pagina niet gevonden', 'skip': 'Pagina niet gevonden',
}, { }, {
'url': 'https://www.een.be/thuis/emma-pakt-thilly-aan', 'url': 'https://www.een.be/sorry-voor-alles/herbekijk-sorry-voor-alles',
'info_dict': { 'info_dict': {
'id': 'md-ast-3a24ced2-64d7-44fb-b4ed-ed1aafbf90b8', 'id': 'mz-ast-11a587f8-b921-4266-82e2-0bce3e80d07f',
'display_id': 'emma-pakt-thilly-aan', 'display_id': 'herbekijk-sorry-voor-alles',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Emma pakt Thilly aan', 'title': 'Herbekijk Sorry voor alles',
'description': 'md5:c5c9b572388a99b2690030afa3f3bad7', 'description': 'md5:8bb2805df8164e5eb95d6a7a29dc0dd3',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 118.24, 'duration': 3788.06,
}, },
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
'expected_warnings': ['is not a supported codec'], 'skip': 'Episode no longer available',
}, { }, {
'url': 'https://www.canvas.be/check-point/najaar-2016/de-politie-uw-vriend', 'url': 'https://www.canvas.be/check-point/najaar-2016/de-politie-uw-vriend',
'only_matching': True, 'only_matching': True,
@ -207,44 +179,19 @@ class VrtNUIE(GigyaBaseIE):
IE_DESC = 'VrtNU.be' IE_DESC = 'VrtNU.be'
_VALID_URL = r'https?://(?:www\.)?vrt\.be/(?P<site_id>vrtnu)/(?:[^/]+/)*(?P<id>[^/?#&]+)' _VALID_URL = r'https?://(?:www\.)?vrt\.be/(?P<site_id>vrtnu)/(?:[^/]+/)*(?P<id>[^/?#&]+)'
_TESTS = [{ _TESTS = [{
# Available via old API endpoint
'url': 'https://www.vrt.be/vrtnu/a-z/postbus-x/1/postbus-x-s1a1/', 'url': 'https://www.vrt.be/vrtnu/a-z/postbus-x/1/postbus-x-s1a1/',
'info_dict': { 'info_dict': {
'id': 'pbs-pub-2e2d8c27-df26-45c9-9dc6-90c78153044d$vid-90c932b1-e21d-4fb8-99b1-db7b49cf74de', 'id': 'pbs-pub-2e2d8c27-df26-45c9-9dc6-90c78153044d$vid-90c932b1-e21d-4fb8-99b1-db7b49cf74de',
'ext': 'mp4', 'ext': 'flv',
'title': 'De zwarte weduwe', 'title': 'De zwarte weduwe',
'description': 'md5:db1227b0f318c849ba5eab1fef895ee4', 'description': 'md5:d90c21dced7db869a85db89a623998d4',
'duration': 1457.04, 'duration': 1457.04,
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'season': 'Season 1', 'season': '1',
'season_number': 1, 'season_number': 1,
'episode_number': 1, 'episode_number': 1,
}, },
'skip': 'This video is only available for registered users', 'skip': 'This video is only available for registered users'
'params': {
'username': '<snip>',
'password': '<snip>',
},
'expected_warnings': ['is not a supported codec'],
}, {
# Only available via new API endpoint
'url': 'https://www.vrt.be/vrtnu/a-z/kamp-waes/1/kamp-waes-s1a5/',
'info_dict': {
'id': 'pbs-pub-0763b56c-64fb-4d38-b95b-af60bf433c71$vid-ad36a73c-4735-4f1f-b2c0-a38e6e6aa7e1',
'ext': 'mp4',
'title': 'Aflevering 5',
'description': 'Wie valt door de mand tijdens een missie?',
'duration': 2967.06,
'season': 'Season 1',
'season_number': 1,
'episode_number': 5,
},
'skip': 'This video is only available for registered users',
'params': {
'username': '<snip>',
'password': '<snip>',
},
'expected_warnings': ['Unable to download asset JSON', 'is not a supported codec', 'Unknown MIME type'],
}] }]
_NETRC_MACHINE = 'vrtnu' _NETRC_MACHINE = 'vrtnu'
_APIKEY = '3_0Z2HujMtiWq_pkAjgnS2Md2E11a1AwZjYiBETtwNE-EoEHDINgtnvcAOpNgmrVGy' _APIKEY = '3_0Z2HujMtiWq_pkAjgnS2Md2E11a1AwZjYiBETtwNE-EoEHDINgtnvcAOpNgmrVGy'

@ -1,10 +1,8 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import hashlib
import json import json
import re import re
from xml.sax.saxutils import escape
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import (
@ -218,29 +216,6 @@ class CBCWatchBaseIE(InfoExtractor):
'clearleap': 'http://www.clearleap.com/namespace/clearleap/1.0/', 'clearleap': 'http://www.clearleap.com/namespace/clearleap/1.0/',
} }
_GEO_COUNTRIES = ['CA'] _GEO_COUNTRIES = ['CA']
_LOGIN_URL = 'https://api.loginradius.com/identity/v2/auth/login'
_TOKEN_URL = 'https://cloud-api.loginradius.com/sso/jwt/api/token'
_API_KEY = '3f4beddd-2061-49b0-ae80-6f1f2ed65b37'
_NETRC_MACHINE = 'cbcwatch'
def _signature(self, email, password):
data = json.dumps({
'email': email,
'password': password,
}).encode()
headers = {'content-type': 'application/json'}
query = {'apikey': self._API_KEY}
resp = self._download_json(self._LOGIN_URL, None, data=data, headers=headers, query=query)
access_token = resp['access_token']
# token
query = {
'access_token': access_token,
'apikey': self._API_KEY,
'jwtapp': 'jwt',
}
resp = self._download_json(self._TOKEN_URL, None, headers=headers, query=query)
return resp['signature']
def _call_api(self, path, video_id): def _call_api(self, path, video_id):
url = path if path.startswith('http') else self._API_BASE_URL + path url = path if path.startswith('http') else self._API_BASE_URL + path
@ -264,8 +239,7 @@ class CBCWatchBaseIE(InfoExtractor):
def _real_initialize(self): def _real_initialize(self):
if self._valid_device_token(): if self._valid_device_token():
return return
device = self._downloader.cache.load( device = self._downloader.cache.load('cbcwatch', 'device') or {}
'cbcwatch', self._cache_device_key()) or {}
self._device_id, self._device_token = device.get('id'), device.get('token') self._device_id, self._device_token = device.get('id'), device.get('token')
if self._valid_device_token(): if self._valid_device_token():
return return
@ -274,30 +248,16 @@ class CBCWatchBaseIE(InfoExtractor):
def _valid_device_token(self): def _valid_device_token(self):
return self._device_id and self._device_token return self._device_id and self._device_token
def _cache_device_key(self):
email, _ = self._get_login_info()
return '%s_device' % hashlib.sha256(email.encode()).hexdigest() if email else 'device'
def _register_device(self): def _register_device(self):
self._device_id = self._device_token = None
result = self._download_xml( result = self._download_xml(
self._API_BASE_URL + 'device/register', self._API_BASE_URL + 'device/register',
None, 'Acquiring device token', None, 'Acquiring device token',
data=b'<device><type>web</type></device>') data=b'<device><type>web</type></device>')
self._device_id = xpath_text(result, 'deviceId', fatal=True) self._device_id = xpath_text(result, 'deviceId', fatal=True)
email, password = self._get_login_info() self._device_token = xpath_text(result, 'deviceToken', fatal=True)
if email and password:
signature = self._signature(email, password)
data = '<login><token>{0}</token><device><deviceId>{1}</deviceId><type>web</type></device></login>'.format(
escape(signature), escape(self._device_id)).encode()
url = self._API_BASE_URL + 'device/login'
result = self._download_xml(
url, None, data=data,
headers={'content-type': 'application/xml'})
self._device_token = xpath_text(result, 'token', fatal=True)
else:
self._device_token = xpath_text(result, 'deviceToken', fatal=True)
self._downloader.cache.store( self._downloader.cache.store(
'cbcwatch', self._cache_device_key(), { 'cbcwatch', 'device', {
'id': self._device_id, 'id': self._device_id,
'token': self._device_token, 'token': self._device_token,
}) })
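Review note: the left-hand side derives the cache key from the login email, presumably so that device tokens for different accounts do not overwrite each other in the cache. A minimal standalone sketch of that key derivation, assuming Python 3 and a plain `email` argument in place of `self._get_login_info()` (the function name here is illustrative, not part of the extractor):

```python
import hashlib


def cache_device_key(email=None):
    """Return a per-account cache key, or the shared key when no login is set."""
    if not email:
        return 'device'
    # Hash the email so the raw address is not written into the cache file name.
    return '%s_device' % hashlib.sha256(email.encode()).hexdigest()


print(cache_device_key())
print(cache_device_key('user@example.com'))
```

Without credentials the key stays `'device'`, which matches the key the right-hand side uses unconditionally.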

@ -69,7 +69,7 @@ class CBSIE(CBSBaseIE):
last_e = None last_e = None
for item in items_data.findall('.//item'): for item in items_data.findall('.//item'):
asset_type = xpath_text(item, 'assetType') asset_type = xpath_text(item, 'assetType')
if not asset_type or asset_type in asset_types or 'HLS_FPS' in asset_type or 'DASH_CENC' in asset_type: if not asset_type or asset_type in asset_types or asset_type in ('HLS_FPS', 'DASH_CENC'):
continue continue
asset_types.append(asset_type) asset_types.append(asset_type)
query = { query = {
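Review note: the two sides of the `asset_type` check above are not equivalent. One tests for a substring, the other for exact membership in a tuple. A small sketch of the difference, where `'DASH_CENC_PS4'` is a hypothetical asset type chosen only to show a value that the two checks treat differently:

```python
def skip_by_substring(asset_type):
    """Skip any asset type that contains a DRM marker anywhere in its name."""
    return 'HLS_FPS' in asset_type or 'DASH_CENC' in asset_type


def skip_by_membership(asset_type):
    """Skip only asset types that are exactly one of the two DRM names."""
    return asset_type in ('HLS_FPS', 'DASH_CENC')


for asset_type in ('HLS_FPS', 'DASH_CENC_PS4', 'HLS_AES'):
    print(asset_type, skip_by_substring(asset_type), skip_by_membership(asset_type))
```

The substring form also skips variants that merely embed one of the markers, while the membership form lets them through.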

@ -1,62 +1,40 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import re
import zlib
from .common import InfoExtractor from .common import InfoExtractor
from .cbs import CBSIE from .cbs import CBSIE
from ..compat import (
compat_b64decode,
compat_urllib_parse_unquote,
)
from ..utils import ( from ..utils import (
parse_duration, parse_duration,
) )
class CBSNewsEmbedIE(CBSIE):
IE_NAME = 'cbsnews:embed'
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/embed/video[^#]*#(?P<id>.+)'
_TESTS = [{
'url': 'https://www.cbsnews.com/embed/video/?v=1.c9b5b61492913d6660db0b2f03579ef25e86307a#1Vb7b9s2EP5XBAHbT6Gt98PAMKTJ0se6LVjWYWtdGBR1stlIpEBSTtwi%2F%2FvuJNkNhmHdGxgM2NL57vjd6zt%2B8PngdN%2Fyg79qeGvhzN%2FLGrS%2F%2BuBLB531V28%2B%2BO7Qg7%2Fy97r2z3xZ42NW8yLhDbA0S0KWlHnIijwKWJBHZZnHBa8Cgbpdf%2F89NM9Hi9fXifhpr8sr%2FlP848tn%2BTdXycX25zh4cdX%2FvHl6PmmPqnWQv9w8Ed%2B9GjYRim07bFEqdG%2BZVHuwTm65A7bVRrYtR5lAyMox7pigF6W4k%2By91mjspGsJ%2BwVae4%2BsvdnaO1p73HkXs%2FVisUDTGm7R8IcdnOROeq%2B19qT1amhA1VJtPenoTUgrtfKc9m7Rq8dP7nnjwOB7wg7ADdNt7VX64DWAWlKhPtmDEq22g4GF99x6Dk9E8OSsankHXqPNKDxC%2FdK7MLKTircTDgsI3mmj4OBdSq64dy7fd1x577RU1rt4cvMtOaulFYOd%2FLewRWvDO9lIgXFpZSnkZmjbv5SxKTPoQXClFbpsf%2Fhbbpzs0IB3vb8KkyzJQ%2BywOAgCrMpgRrz%2BKk4fvb7kFbR4XJCu0gAdtNO7woCwZTu%2BBUs9bam%2Fds71drVerpeisgrubLjAB4nnOSkWQnfr5W6o1ku5Xpr1MgrCbL0M0vUyDtfLLK15WiYp47xKWSLyjFVpwVmVJSLIoCjSOFkv3W7oKsVliwZJcB9nwXpZ5GEQQwY8jNKqKCBrgjTLeFxgdCIpazojDgnRtn43J6kG7nZ6cAbxh0EeFFk4%2B1u867cY5u4344n%2FxXjCqAjucdTHgLKojNKmSfO8KRsOFY%2FzKEYCKEJBzv90QA9nfm9gL%2BHulaFqUkz9ULUYxl62B3U%2FRVNLA8IhggaPycOoBuwOCESciDQVSSUgiOMsROB%2FhKfwCKOzEk%2B4k6rWd4uuT%2FwTDz7K7t3d3WLO8ISD95jSPQbayBacthbz86XVgxHwhex5zawzgDOmtp%2F3GPcXn0VXHdSS029%2Fj99UC%2FwJUvyKQ%2FzKyixIEVlYJOn4RxxuaH43Ty9fbJ5OObykHH435XAzJTHeOF4hhEUXD8URe%2FQ%2FBT%2BMpf8d5GN02Ox%2FfiGsl7TA7POu1xZ5%2BbTzcAVKMe48mqcC21hkacVEVScM26liVVBnrKkC4CLKyzAvHu0lhEaTKMFwI3a4SN9MsrfYzdBLq2vkwRD1gVviLT8kY9h2CHH6Y%2Bix6609weFtey4ESp60WtyeWMy%2BsmBuhsoKIyuoT%2Bq2R%2FrW5qi3g%2FvzS2j40DoixDP8%2BKP0yUdpXJ4l6Vla%2Bg9vce%2BC4yM5YlUcbA%2F0jLKdpmTwvsdN5z88nAIe08%2F0HgxeG1iv%2B6Hlhjh7uiW0SDzYNI92L401uha3JKYk268UVRzdOzNQvAaJqoXzAc80dAV440NZ1WVVAAMRYQ2KrGJFmDUsq8saWSnjvIj8t78y%2FRa3JRnbHVfyFpfwoDiGpPgjzekyUiKNlU3OMlwuLMmzgvEojllYVE2Z1HhImvsnk%2BuhusTEoB21PAtSFodeFK3iYhXEH9WOG2%2FkOE833sfeG%2Ff5cfHtEFNXgYes0%2FXj7aGivUgJ9XpusCtoNcNYVVnJVrrDo0OmJAutHCpuZul4W9lLcfy7BnuLPT02%2ByXsCTk%2B9zhzswIN04YueNSK%2BPtM0jS88QdLqSLJDTLsuGZJNolm2yO0PXh3UPnz9Ix5bfIAqxPjvETQsDCEiPG4QbqNyhBZISxybLnZYCrW5H3Axp690%2F0BJdXtDZ5ITuM4
xj3f4oUHGzc5JeJmZKpp%2FjwKh4wMV%2FV1yx3emLoR0MwbG4K%2F%2BZgVep3PnzXGDHZ6a3i%2Fk%2BJrONDN13%2Bnq6tBTYk4o7cLGhBtqCC4KwacGHpEVuoH5JNro%2FE6JfE6d5RydbiR76k%2BW5wioDHBIjw1euhHjUGRB0y5A97KoaPx6MlL%2BwgboUVtUFRI%2FLemgTpdtF59ii7pab08kuPcfWzs0l%2FRI5takWnFpka0zOgWRtYcuf9aIxZMxlwr6IiGpsb6j2DQUXPl%2FimXI599Ev7fWjoPD78A',
'only_matching': True,
}]
def _real_extract(self, url):
item = self._parse_json(zlib.decompress(compat_b64decode(
compat_urllib_parse_unquote(self._match_id(url))),
-zlib.MAX_WBITS), None)['video']['items'][0]
return self._extract_video_info(item['mpxRefId'], 'cbsnews')
class CBSNewsIE(CBSIE): class CBSNewsIE(CBSIE):
IE_NAME = 'cbsnews' IE_NAME = 'cbsnews'
IE_DESC = 'CBS News' IE_DESC = 'CBS News'
_VALID_URL = r'https?://(?:www\.)?cbsnews\.com/(?:news|video)/(?P<id>[\da-z_-]+)' _VALID_URL = r'https?://(?:www\.)?cbsnews\.com/(?:news|videos)/(?P<id>[\da-z_-]+)'
_TESTS = [ _TESTS = [
{ {
# 60 minutes # 60 minutes
'url': 'http://www.cbsnews.com/news/artificial-intelligence-positioned-to-be-a-game-changer/', 'url': 'http://www.cbsnews.com/news/artificial-intelligence-positioned-to-be-a-game-changer/',
'info_dict': { 'info_dict': {
'id': 'Y_nf_aEg6WwO9OLAq0MpKaPgfnBUxfW4', 'id': '_B6Ga3VJrI4iQNKsir_cdFo9Re_YJHE_',
'ext': 'flv', 'ext': 'mp4',
'title': 'Artificial Intelligence, real-life applications', 'title': 'Artificial Intelligence',
'description': 'md5:a7aaf27f1b4777244de8b0b442289304', 'description': 'md5:8818145f9974431e0fb58a1b8d69613c',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'duration': 317, 'duration': 1606,
'uploader': 'CBSI-NEW', 'uploader': 'CBSI-NEW',
'timestamp': 1476046464, 'timestamp': 1498431900,
'upload_date': '20161009', 'upload_date': '20170625',
}, },
'params': { 'params': {
# rtmp download # m3u8 download
'skip_download': True, 'skip_download': True,
}, },
}, },
{ {
'url': 'https://www.cbsnews.com/video/fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack/', 'url': 'http://www.cbsnews.com/videos/fort-hood-shooting-army-downplays-mental-illness-as-cause-of-attack/',
'info_dict': { 'info_dict': {
'id': 'SNJBOYzXiWBOvaLsdzwH8fmtP1SCd91Y', 'id': 'SNJBOYzXiWBOvaLsdzwH8fmtP1SCd91Y',
'ext': 'mp4', 'ext': 'mp4',
@ -82,29 +60,37 @@ class CBSNewsIE(CBSIE):
# 48 hours # 48 hours
'url': 'http://www.cbsnews.com/news/maria-ridulph-murder-will-the-nations-oldest-cold-case-to-go-to-trial-ever-get-solved/', 'url': 'http://www.cbsnews.com/news/maria-ridulph-murder-will-the-nations-oldest-cold-case-to-go-to-trial-ever-get-solved/',
'info_dict': { 'info_dict': {
'id': 'QpM5BJjBVEAUFi7ydR9LusS69DPLqPJ1',
'ext': 'mp4',
'title': 'Cold as Ice', 'title': 'Cold as Ice',
'description': 'Can a childhood memory solve the 1957 murder of 7-year-old Maria Ridulph?', 'description': 'Can a childhood memory of a friend\'s murder solve a 1957 cold case? "48 Hours" correspondent Erin Moriarty has the latest.',
'upload_date': '20170604',
'timestamp': 1496538000,
'uploader': 'CBSI-NEW',
},
'params': {
'skip_download': True,
}, },
'playlist_mincount': 7,
}, },
] ]
def _real_extract(self, url): def _real_extract(self, url):
display_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage(url, display_id) webpage = self._download_webpage(url, video_id)
entries = [] video_info = self._parse_json(self._html_search_regex(
for embed_url in re.findall(r'<iframe[^>]+data-src="(https?://(?:www\.)?cbsnews\.com/embed/video/[^#]*#[^"]+)"', webpage): r'(?:<ul class="media-list items" id="media-related-items"[^>]*><li data-video-info|<div id="cbsNewsVideoPlayer" data-video-player-options)=\'({.+?})\'',
entries.append(self.url_result(embed_url, CBSNewsEmbedIE.ie_key())) webpage, 'video JSON info', default='{}'), video_id, fatal=False)
if entries:
return self.playlist_result( if video_info:
entries, playlist_title=self._html_search_meta(['og:title', 'twitter:title'], webpage), item = video_info['item'] if 'item' in video_info else video_info
playlist_description=self._html_search_meta(['og:description', 'twitter:description', 'description'], webpage)) else:
state = self._parse_json(self._search_regex(
r'data-cbsvideoui-options=(["\'])(?P<json>{.+?})\1', webpage,
'playlist JSON info', group='json'), video_id)['state']
item = state['playlist'][state['pid']]
item = self._parse_json(self._html_search_regex(
r'CBSNEWS\.defaultPayload\s*=\s*({.+})',
webpage, 'video JSON info'), display_id)['items'][0]
return self._extract_video_info(item['mpxRefId'], 'cbsnews') return self._extract_video_info(item['mpxRefId'], 'cbsnews')
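Review note: `CBSNewsEmbedIE._real_extract` on the left-hand side unpacks the URL fragment in three steps: percent-decoding, base64 decoding, and inflating a raw DEFLATE stream (hence the negative `wbits`). A standalone sketch of that pipeline, assuming Python 3 standard-library calls instead of the `compat_*` helpers; `encode_embed_fragment` and the sample payload are hypothetical and exist only to produce a fragment to decode:

```python
import base64
import json
import zlib
from urllib.parse import quote, unquote


def decode_embed_fragment(fragment):
    """Percent-decode, base64-decode, then inflate a raw DEFLATE stream into JSON."""
    raw = base64.b64decode(unquote(fragment))
    # A negative wbits value tells zlib to expect no zlib/gzip header or trailer.
    return json.loads(zlib.decompress(raw, -zlib.MAX_WBITS).decode('utf-8'))


def encode_embed_fragment(payload):
    """Inverse helper (hypothetical), used here only to build a sample fragment."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    raw = compressor.compress(json.dumps(payload).encode('utf-8')) + compressor.flush()
    return quote(base64.b64encode(raw).decode('ascii'), safe='')


sample = {'video': {'items': [{'mpxRefId': 'hypothetical-ref-id'}]}}
fragment = encode_embed_fragment(sample)
item = decode_embed_fragment(fragment)['video']['items'][0]
print(item['mpxRefId'])
```

The extractor then passes `item['mpxRefId']` to `_extract_video_info`, as the diff shows.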

@ -1,12 +1,9 @@
# coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
int_or_none, int_or_none,
parse_iso8601, parse_iso8601,
try_get,
url_or_none,
) )
@ -21,13 +18,11 @@ class CCCIE(InfoExtractor):
'id': '1839', 'id': '1839',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Introduction to Processor Design', 'title': 'Introduction to Processor Design',
'creator': 'byterazor',
'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac', 'description': 'md5:df55f6d073d4ceae55aae6f2fd98a0ac',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'upload_date': '20131228', 'upload_date': '20131228',
'timestamp': 1388188800, 'timestamp': 1388188800,
'duration': 3710, 'duration': 3710,
'tags': list,
} }
}, { }, {
'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download', 'url': 'https://media.ccc.de/v/32c3-7368-shopshifting#download',
@ -73,7 +68,6 @@ class CCCIE(InfoExtractor):
'id': event_id, 'id': event_id,
'display_id': display_id, 'display_id': display_id,
'title': event_data['title'], 'title': event_data['title'],
'creator': try_get(event_data, lambda x: ', '.join(x['persons'])),
'description': event_data.get('description'), 'description': event_data.get('description'),
'thumbnail': event_data.get('thumb_url'), 'thumbnail': event_data.get('thumb_url'),
'timestamp': parse_iso8601(event_data.get('date')), 'timestamp': parse_iso8601(event_data.get('date')),
@ -81,31 +75,3 @@ class CCCIE(InfoExtractor):
'tags': event_data.get('tags'), 'tags': event_data.get('tags'),
'formats': formats, 'formats': formats,
} }
class CCCPlaylistIE(InfoExtractor):
IE_NAME = 'media.ccc.de:lists'
_VALID_URL = r'https?://(?:www\.)?media\.ccc\.de/c/(?P<id>[^/?#&]+)'
_TESTS = [{
'url': 'https://media.ccc.de/c/30c3',
'info_dict': {
'title': '30C3',
'id': '30c3',
},
'playlist_count': 135,
}]
def _real_extract(self, url):
playlist_id = self._match_id(url).lower()
conf = self._download_json(
'https://media.ccc.de/public/conferences/' + playlist_id,
playlist_id)
entries = []
for e in conf['events']:
event_url = url_or_none(e.get('frontend_link'))
if event_url:
entries.append(self.url_result(event_url, ie=CCCIE.ie_key()))
return self.playlist_result(entries, playlist_id, conf.get('title'))

@ -147,8 +147,6 @@ class CeskaTelevizeIE(InfoExtractor):
is_live = item.get('type') == 'LIVE' is_live = item.get('type') == 'LIVE'
formats = [] formats = []
for format_id, stream_url in item.get('streamUrls', {}).items(): for format_id, stream_url in item.get('streamUrls', {}).items():
if 'drmOnly=true' in stream_url:
continue
if 'playerType=flash' in stream_url: if 'playerType=flash' in stream_url:
stream_formats = self._extract_m3u8_formats( stream_formats = self._extract_m3u8_formats(
stream_url, playlist_id, 'mp4', 'm3u8_native', stream_url, playlist_id, 'mp4', 'm3u8_native',

@ -32,7 +32,7 @@ class Channel9IE(InfoExtractor):
'upload_date': '20130828', 'upload_date': '20130828',
'session_code': 'KOS002', 'session_code': 'KOS002',
'session_room': 'Arena 1A', 'session_room': 'Arena 1A',
'session_speakers': 'count:5', 'session_speakers': ['Andrew Coates', 'Brady Gaster', 'Mads Kristensen', 'Ed Blankenship', 'Patrick Klug'],
}, },
}, { }, {
'url': 'http://channel9.msdn.com/posts/Self-service-BI-with-Power-BI-nuclear-testing', 'url': 'http://channel9.msdn.com/posts/Self-service-BI-with-Power-BI-nuclear-testing',
@ -64,15 +64,15 @@ class Channel9IE(InfoExtractor):
'params': { 'params': {
'skip_download': True, 'skip_download': True,
}, },
}, {
'url': 'https://channel9.msdn.com/Events/DEVintersection/DEVintersection-2016/RSS',
'info_dict': {
'id': 'Events/DEVintersection/DEVintersection-2016',
'title': 'DEVintersection 2016 Orlando Sessions',
},
'playlist_mincount': 14,
}, { }, {
'url': 'https://channel9.msdn.com/Niners/Splendid22/Queue/76acff796e8f411184b008028e0d492b/RSS', 'url': 'https://channel9.msdn.com/Niners/Splendid22/Queue/76acff796e8f411184b008028e0d492b/RSS',
'info_dict': {
'id': 'Niners/Splendid22/Queue/76acff796e8f411184b008028e0d492b',
'title': 'Channel 9',
},
'playlist_mincount': 100,
}, {
'url': 'https://channel9.msdn.com/Events/DEVintersection/DEVintersection-2016/RSS',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'https://channel9.msdn.com/Events/Speakers/scott-hanselman/RSS?UrlSafeName=scott-hanselman', 'url': 'https://channel9.msdn.com/Events/Speakers/scott-hanselman/RSS?UrlSafeName=scott-hanselman',
@ -112,11 +112,11 @@ class Channel9IE(InfoExtractor):
episode_data), content_path) episode_data), content_path)
content_id = episode_data['contentId'] content_id = episode_data['contentId']
is_session = '/Sessions(' in episode_data['api'] is_session = '/Sessions(' in episode_data['api']
content_url = 'https://channel9.msdn.com/odata' + episode_data['api'] + '?$select=Captions,CommentCount,MediaLengthInSeconds,PublishedDate,Rating,RatingCount,Title,VideoMP4High,VideoMP4Low,VideoMP4Medium,VideoPlayerPreviewImage,VideoWMV,VideoWMVHQ,Views,' content_url = 'https://channel9.msdn.com/odata' + episode_data['api']
if is_session: if is_session:
content_url += 'Code,Description,Room,Slides,Speakers,ZipFile&$expand=Speakers' content_url += '?$expand=Speakers'
else: else:
content_url += 'Authors,Body&$expand=Authors' content_url += '?$expand=Authors'
content_data = self._download_json(content_url, content_id) content_data = self._download_json(content_url, content_id)
title = content_data['Title'] title = content_data['Title']
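Review note: the left-hand side of this hunk asks the OData endpoint for an explicit `$select` field list, while the right-hand side sends only `$expand`. A sketch of how the left-hand URL is assembled, using the field names from the diff; the function name and the sample API paths are hypothetical:

```python
COMMON_FIELDS = [
    'Captions', 'CommentCount', 'MediaLengthInSeconds', 'PublishedDate',
    'Rating', 'RatingCount', 'Title', 'VideoMP4High', 'VideoMP4Low',
    'VideoMP4Medium', 'VideoPlayerPreviewImage', 'VideoWMV', 'VideoWMVHQ',
    'Views',
]
SESSION_FIELDS = ['Code', 'Description', 'Room', 'Slides', 'Speakers', 'ZipFile']
ENTRY_FIELDS = ['Authors', 'Body']


def build_content_url(api_path, is_session):
    """Build the OData URL with a $select list and the matching $expand."""
    extra = SESSION_FIELDS if is_session else ENTRY_FIELDS
    expand = 'Speakers' if is_session else 'Authors'
    return 'https://channel9.msdn.com/odata%s?$select=%s&$expand=%s' % (
        api_path, ','.join(COMMON_FIELDS + extra), expand)


print(build_content_url('/Entries(1)', is_session=False))
print(build_content_url('/Sessions(1)', is_session=True))
```

Keeping the field lists as data rather than one long string literal makes the session and non-session variants easier to compare.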
@ -210,7 +210,7 @@ class Channel9IE(InfoExtractor):
'id': content_id, 'id': content_id,
'title': title, 'title': title,
'description': clean_html(content_data.get('Description') or content_data.get('Body')), 'description': clean_html(content_data.get('Description') or content_data.get('Body')),
'thumbnail': content_data.get('VideoPlayerPreviewImage'), 'thumbnail': content_data.get('Thumbnail') or content_data.get('VideoPlayerPreviewImage'),
'duration': int_or_none(content_data.get('MediaLengthInSeconds')), 'duration': int_or_none(content_data.get('MediaLengthInSeconds')),
'timestamp': parse_iso8601(content_data.get('PublishedDate')), 'timestamp': parse_iso8601(content_data.get('PublishedDate')),
'avg_rating': int_or_none(content_data.get('Rating')), 'avg_rating': int_or_none(content_data.get('Rating')),

@ -3,15 +3,11 @@ from __future__ import unicode_literals
import re import re
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import ExtractorError
ExtractorError,
lowercase_escape,
url_or_none,
)
class ChaturbateIE(InfoExtractor): class ChaturbateIE(InfoExtractor):
_VALID_URL = r'https?://(?:[^/]+\.)?chaturbate\.com/(?:fullvideo/?\?.*?\bb=)?(?P<id>[^/?&#]+)' _VALID_URL = r'https?://(?:[^/]+\.)?chaturbate\.com/(?P<id>[^/?#]+)'
_TESTS = [{ _TESTS = [{
'url': 'https://www.chaturbate.com/siswet19/', 'url': 'https://www.chaturbate.com/siswet19/',
'info_dict': { 'info_dict': {
@ -25,9 +21,6 @@ class ChaturbateIE(InfoExtractor):
'skip_download': True, 'skip_download': True,
}, },
'skip': 'Room is offline', 'skip': 'Room is offline',
}, {
'url': 'https://chaturbate.com/fullvideo/?b=caylin',
'only_matching': True,
}, { }, {
'url': 'https://en.chaturbate.com/siswet19/', 'url': 'https://en.chaturbate.com/siswet19/',
'only_matching': True, 'only_matching': True,
@ -39,34 +32,14 @@ class ChaturbateIE(InfoExtractor):
video_id = self._match_id(url) video_id = self._match_id(url)
webpage = self._download_webpage( webpage = self._download_webpage(
'https://chaturbate.com/%s/' % video_id, video_id, url, video_id, headers=self.geo_verification_headers())
headers=self.geo_verification_headers())
found_m3u8_urls = []
data = self._parse_json(
self._search_regex(
r'initialRoomDossier\s*=\s*(["\'])(?P<value>(?:(?!\1).)+)\1',
webpage, 'data', default='{}', group='value'),
video_id, transform_source=lowercase_escape, fatal=False)
if data:
m3u8_url = url_or_none(data.get('hls_source'))
if m3u8_url:
found_m3u8_urls.append(m3u8_url)
if not found_m3u8_urls:
for m in re.finditer(
r'(\\u002[27])(?P<url>http.+?\.m3u8.*?)\1', webpage):
found_m3u8_urls.append(lowercase_escape(m.group('url')))
if not found_m3u8_urls:
for m in re.finditer(
r'(["\'])(?P<url>http.+?\.m3u8.*?)\1', webpage):
found_m3u8_urls.append(m.group('url'))
m3u8_urls = [] m3u8_urls = []
for found_m3u8_url in found_m3u8_urls:
m3u8_fast_url, m3u8_no_fast_url = found_m3u8_url, found_m3u8_url.replace('_fast', '') for m in re.finditer(
r'(["\'])(?P<url>http.+?\.m3u8.*?)\1', webpage):
m3u8_fast_url, m3u8_no_fast_url = m.group('url'), m.group(
'url').replace('_fast', '')
for m3u8_url in (m3u8_fast_url, m3u8_no_fast_url): for m3u8_url in (m3u8_fast_url, m3u8_no_fast_url):
if m3u8_url not in m3u8_urls: if m3u8_url not in m3u8_urls:
m3u8_urls.append(m3u8_url) m3u8_urls.append(m3u8_url)
@ -86,12 +59,7 @@ class ChaturbateIE(InfoExtractor):
formats = [] formats = []
for m3u8_url in m3u8_urls: for m3u8_url in m3u8_urls:
for known_id in ('fast', 'slow'): m3u8_id = 'fast' if '_fast' in m3u8_url else 'slow'
if '_%s' % known_id in m3u8_url:
m3u8_id = known_id
break
else:
m3u8_id = None
formats.extend(self._extract_m3u8_formats( formats.extend(self._extract_m3u8_formats(
m3u8_url, video_id, ext='mp4', m3u8_url, video_id, ext='mp4',
# ffmpeg skips segments for fast m3u8 # ffmpeg skips segments for fast m3u8
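Review note: both sides expand each discovered playlist URL into a "fast" and a "non-fast" candidate and drop duplicates. The left-hand side additionally labels a format only when the URL carries a known marker. A standalone sketch of those two steps, with hypothetical function names and an `example.com` URL used purely for illustration:

```python
def expand_m3u8_urls(found_urls):
    """Return each URL plus its '_fast'-stripped variant, without duplicates."""
    urls = []
    for found in found_urls:
        for candidate in (found, found.replace('_fast', '')):
            if candidate not in urls:
                urls.append(candidate)
    return urls


def m3u8_id_for(url):
    """Return 'fast' or 'slow' if the URL carries that marker, else None."""
    for known_id in ('fast', 'slow'):
        if '_%s' % known_id in url:
            return known_id
    return None


urls = expand_m3u8_urls(['https://example.com/live_fast/playlist.m3u8'])
for url in urls:
    print(m3u8_id_for(url), url)
```

The right-hand side instead labels every URL without `_fast` as `'slow'`, so the two versions differ for URLs that carry neither marker.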

@ -1,29 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
import re
from .hbo import HBOBaseIE
class CinemaxIE(HBOBaseIE):
_VALID_URL = r'https?://(?:www\.)?cinemax\.com/(?P<path>[^/]+/video/[0-9a-z-]+-(?P<id>\d+))'
_TESTS = [{
'url': 'https://www.cinemax.com/warrior/video/s1-ep-1-recap-20126903',
'md5': '82e0734bba8aa7ef526c9dd00cf35a05',
'info_dict': {
'id': '20126903',
'ext': 'mp4',
'title': 'S1 Ep 1: Recap',
},
'expected_warnings': ['Unknown MIME type application/mp4 in DASH manifest'],
}, {
'url': 'https://www.cinemax.com/warrior/video/s1-ep-1-recap-20126903.embed',
'only_matching': True,
}]
def _real_extract(self, url):
path, video_id = re.match(self._VALID_URL, url).groups()
info = self._extract_info('https://www.cinemax.com/%s.xml' % path, video_id)
info['id'] = video_id
return info

@ -1,24 +1,20 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import re import re
from .common import InfoExtractor from .common import InfoExtractor
class CloudflareStreamIE(InfoExtractor): class CloudflareStreamIE(InfoExtractor):
_DOMAIN_RE = r'(?:cloudflarestream\.com|(?:videodelivery|bytehighway)\.net)'
_EMBED_RE = r'embed\.%s/embed/[^/]+\.js\?.*?\bvideo=' % _DOMAIN_RE
_ID_RE = r'[\da-f]{32}|[\w-]+\.[\w-]+\.[\w-]+'
_VALID_URL = r'''(?x) _VALID_URL = r'''(?x)
https?:// https?://
(?: (?:
(?:watch\.)?%s/| (?:watch\.)?cloudflarestream\.com/|
%s embed\.cloudflarestream\.com/embed/[^/]+\.js\?.*?\bvideo=
) )
(?P<id>%s) (?P<id>[\da-f]+)
''' % (_DOMAIN_RE, _EMBED_RE, _ID_RE) '''
_TESTS = [{ _TESTS = [{
'url': 'https://embed.cloudflarestream.com/embed/we4g.fla9.latest.js?video=31c9291ab41fac05471db4e73aa11717', 'url': 'https://embed.cloudflarestream.com/embed/we4g.fla9.latest.js?video=31c9291ab41fac05471db4e73aa11717',
'info_dict': { 'info_dict': {
@ -35,9 +31,6 @@ class CloudflareStreamIE(InfoExtractor):
}, { }, {
'url': 'https://cloudflarestream.com/31c9291ab41fac05471db4e73aa11717/manifest/video.mpd', 'url': 'https://cloudflarestream.com/31c9291ab41fac05471db4e73aa11717/manifest/video.mpd',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://embed.videodelivery.net/embed/r4xu.fla9.latest.js?video=81d80727f3022488598f68d323c1ad5e',
'only_matching': True,
}] }]
@staticmethod @staticmethod
@ -45,28 +38,23 @@ class CloudflareStreamIE(InfoExtractor):
return [ return [
mobj.group('url') mobj.group('url')
for mobj in re.finditer( for mobj in re.finditer(
r'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//%s(?:%s).*?)\1' % (CloudflareStreamIE._EMBED_RE, CloudflareStreamIE._ID_RE), r'<script[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//embed\.cloudflarestream\.com/embed/[^/]+\.js\?.*?\bvideo=[\da-f]+?.*?)\1',
webpage)] webpage)]
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
domain = 'bytehighway.net' if 'bytehighway.net/' in url else 'videodelivery.net'
base_url = 'https://%s/%s/' % (domain, video_id)
if '.' in video_id:
video_id = self._parse_json(base64.urlsafe_b64decode(
video_id.split('.')[1]), video_id)['sub']
manifest_base_url = base_url + 'manifest/video.'
formats = self._extract_m3u8_formats( formats = self._extract_m3u8_formats(
manifest_base_url + 'm3u8', video_id, 'mp4', 'https://cloudflarestream.com/%s/manifest/video.m3u8' % video_id,
'm3u8_native', m3u8_id='hls', fatal=False) video_id, 'mp4', entry_protocol='m3u8_native', m3u8_id='hls',
fatal=False)
formats.extend(self._extract_mpd_formats( formats.extend(self._extract_mpd_formats(
manifest_base_url + 'mpd', video_id, mpd_id='dash', fatal=False)) 'https://cloudflarestream.com/%s/manifest/video.mpd' % video_id,
video_id, mpd_id='dash', fatal=False))
self._sort_formats(formats) self._sort_formats(formats)
return { return {
'id': video_id, 'id': video_id,
'title': video_id, 'title': video_id,
'thumbnail': base_url + 'thumbnails/thumbnail.jpg',
'formats': formats, 'formats': formats,
} }
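Review note: on the left-hand side, an id containing dots is treated as a three-part token whose middle segment is base64url-encoded JSON with the real video id in `sub`. A sketch of that step, assuming Python 3; unlike the diff, it restores base64 padding before decoding, which is an addition of this sketch. `make_token`, its header, and its signature are hypothetical and only build a sample input:

```python
import base64
import json


def video_id_from_token(token):
    """Return the 'sub' claim from a three-part, dot-separated token."""
    payload = token.split('.')[1]
    # Tokens of this shape often omit '=' padding; restore it before decoding.
    payload += '=' * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload).decode('utf-8'))['sub']


def make_token(claims):
    """Build an unsigned sample token (hypothetical header and signature)."""
    def b64(data):
        return base64.urlsafe_b64encode(data).rstrip(b'=').decode('ascii')
    return '.'.join([
        b64(b'{"alg":"none"}'),
        b64(json.dumps(claims).encode('utf-8')),
        'signature',
    ])


token = make_token({'sub': '31c9291ab41fac05471db4e73aa11717'})
print(video_id_from_token(token))
```

The sketch does not verify the signature. It only reads the claim, which is all the extractor needs to label the video.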

@ -0,0 +1,74 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
int_or_none,
parse_duration,
parse_iso8601,
)
class ComCarCoffIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?comediansincarsgettingcoffee\.com/(?P<id>[a-z0-9\-]*)'
_TESTS = [{
'url': 'http://comediansincarsgettingcoffee.com/miranda-sings-happy-thanksgiving-miranda/',
'info_dict': {
'id': '2494164',
'ext': 'mp4',
'upload_date': '20141127',
'timestamp': 1417107600,
'duration': 1232,
'title': 'Happy Thanksgiving Miranda',
'description': 'Jerry Seinfeld and his special guest Miranda Sings cruise around town in search of coffee, complaining and apologizing along the way.',
},
'params': {
'skip_download': 'requires ffmpeg',
}
}]
def _real_extract(self, url):
display_id = self._match_id(url)
if not display_id:
display_id = 'comediansincarsgettingcoffee.com'
webpage = self._download_webpage(url, display_id)
full_data = self._parse_json(
self._search_regex(
r'window\.app\s*=\s*({.+?});\n', webpage, 'full data json'),
display_id)['videoData']
display_id = full_data['activeVideo']['video']
video_data = full_data.get('videos', {}).get(display_id) or full_data['singleshots'][display_id]
video_id = compat_str(video_data['mediaId'])
title = video_data['title']
formats = self._extract_m3u8_formats(
video_data['mediaUrl'], video_id, 'mp4')
self._sort_formats(formats)
thumbnails = [{
'url': video_data['images']['thumb'],
}, {
'url': video_data['images']['poster'],
}]
timestamp = int_or_none(video_data.get('pubDateTime')) or parse_iso8601(
video_data.get('pubDate'))
duration = int_or_none(video_data.get('durationSeconds')) or parse_duration(
video_data.get('duration'))
return {
'id': video_id,
'display_id': display_id,
'title': title,
'description': video_data.get('description'),
'timestamp': timestamp,
'duration': duration,
'thumbnails': thumbnails,
'formats': formats,
'season_number': int_or_none(video_data.get('season')),
'episode_number': int_or_none(video_data.get('episode')),
'webpage_url': 'http://comediansincarsgettingcoffee.com/%s' % (video_data.get('urlSlug', video_data.get('slug'))),
}

@ -10,13 +10,12 @@ import os
import random import random
import re import re
import socket import socket
import ssl
import sys import sys
import time import time
import math import math
from ..compat import ( from ..compat import (
compat_cookiejar_Cookie, compat_cookiejar,
compat_cookies, compat_cookies,
compat_etree_Element, compat_etree_Element,
compat_etree_fromstring, compat_etree_fromstring,
@ -68,8 +67,6 @@ from ..utils import (
sanitized_Request, sanitized_Request,
sanitize_filename, sanitize_filename,
str_or_none, str_or_none,
str_to_int,
strip_or_none,
unescapeHTML, unescapeHTML,
unified_strdate, unified_strdate,
unified_timestamp, unified_timestamp,
@ -120,7 +117,7 @@ class InfoExtractor(object):
unfragmented media) unfragmented media)
- URL of the MPD manifest or base URL - URL of the MPD manifest or base URL
representing the media if MPD manifest representing the media if MPD manifest
is parsed from a string (in case of is parsed froma string (in case of
fragmented media) fragmented media)
for MSS - URL of the ISM manifest. for MSS - URL of the ISM manifest.
* manifest_url * manifest_url
@ -222,7 +219,7 @@ class InfoExtractor(object):
* "preference" (optional, int) - quality of the image * "preference" (optional, int) - quality of the image
* "width" (optional, int) * "width" (optional, int)
* "height" (optional, int) * "height" (optional, int)
* "resolution" (optional, string "{width}x{height}", * "resolution" (optional, string "{width}x{height"},
deprecated) deprecated)
* "filesize" (optional, int) * "filesize" (optional, int)
thumbnail: Full URL to a video thumbnail image. thumbnail: Full URL to a video thumbnail image.
@ -545,11 +542,11 @@ class InfoExtractor(object):
raise ExtractorError('An extractor error has occurred.', cause=e) raise ExtractorError('An extractor error has occurred.', cause=e)
def __maybe_fake_ip_and_retry(self, countries): def __maybe_fake_ip_and_retry(self, countries):
if (not self._downloader.params.get('geo_bypass_country', None) if (not self._downloader.params.get('geo_bypass_country', None) and
and self._GEO_BYPASS self._GEO_BYPASS and
and self._downloader.params.get('geo_bypass', True) self._downloader.params.get('geo_bypass', True) and
and not self._x_forwarded_for_ip not self._x_forwarded_for_ip and
and countries): countries):
country_code = random.choice(countries) country_code = random.choice(countries)
self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code) self._x_forwarded_for_ip = GeoUtils.random_ipv4(country_code)
if self._x_forwarded_for_ip: if self._x_forwarded_for_ip:
@ -625,12 +622,9 @@ class InfoExtractor(object):
url_or_request = update_url_query(url_or_request, query) url_or_request = update_url_query(url_or_request, query)
if data is not None or headers: if data is not None or headers:
url_or_request = sanitized_Request(url_or_request, data, headers) url_or_request = sanitized_Request(url_or_request, data, headers)
exceptions = [compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error]
if hasattr(ssl, 'CertificateError'):
exceptions.append(ssl.CertificateError)
try: try:
return self._downloader.urlopen(url_or_request) return self._downloader.urlopen(url_or_request)
except tuple(exceptions) as err: except (compat_urllib_error.URLError, compat_http_client.HTTPException, socket.error) as err:
if isinstance(err, compat_urllib_error.HTTPError): if isinstance(err, compat_urllib_error.HTTPError):
if self.__can_accept_status_code(err, expected_status): if self.__can_accept_status_code(err, expected_status):
# Retain reference to error to prevent file object from # Retain reference to error to prevent file object from
@ -688,8 +682,8 @@ class InfoExtractor(object):
def __check_blocked(self, content): def __check_blocked(self, content):
first_block = content[:512] first_block = content[:512]
if ('<title>Access to this site is blocked</title>' in content if ('<title>Access to this site is blocked</title>' in content and
and 'Websense' in first_block): 'Websense' in first_block):
msg = 'Access to this webpage has been blocked by Websense filtering software in your network.' msg = 'Access to this webpage has been blocked by Websense filtering software in your network.'
blocked_iframe = self._html_search_regex( blocked_iframe = self._html_search_regex(
r'<iframe src="([^"]+)"', content, r'<iframe src="([^"]+)"', content,
@ -707,8 +701,8 @@ class InfoExtractor(object):
if block_msg: if block_msg:
msg += ' (Message: "%s")' % block_msg.replace('\n', ' ') msg += ' (Message: "%s")' % block_msg.replace('\n', ' ')
raise ExtractorError(msg, expected=True) raise ExtractorError(msg, expected=True)
if ('<title>TTK :: Доступ к ресурсу ограничен</title>' in content if ('<title>TTK :: Доступ к ресурсу ограничен</title>' in content and
and 'blocklist.rkn.gov.ru' in content): 'blocklist.rkn.gov.ru' in content):
raise ExtractorError( raise ExtractorError(
'Access to this webpage has been blocked by decision of the Russian government. ' 'Access to this webpage has been blocked by decision of the Russian government. '
'Visit http://blocklist.rkn.gov.ru/ for a block reason.', 'Visit http://blocklist.rkn.gov.ru/ for a block reason.',
@ -1187,33 +1181,16 @@ class InfoExtractor(object):
'twitter card player') 'twitter card player')
def _search_json_ld(self, html, video_id, expected_type=None, **kwargs): def _search_json_ld(self, html, video_id, expected_type=None, **kwargs):
json_ld_list = list(re.finditer(JSON_LD_RE, html)) json_ld = self._search_regex(
JSON_LD_RE, html, 'JSON-LD', group='json_ld', **kwargs)
default = kwargs.get('default', NO_DEFAULT) default = kwargs.get('default', NO_DEFAULT)
if not json_ld:
return default if default is not NO_DEFAULT else {}
# JSON-LD may be malformed and thus `fatal` should be respected. # JSON-LD may be malformed and thus `fatal` should be respected.
# At the same time `default` may be passed that assumes `fatal=False` # At the same time `default` may be passed that assumes `fatal=False`
# for _search_regex. Let's simulate the same behavior here as well. # for _search_regex. Let's simulate the same behavior here as well.
fatal = kwargs.get('fatal', True) if default == NO_DEFAULT else False fatal = kwargs.get('fatal', True) if default == NO_DEFAULT else False
json_ld = [] return self._json_ld(json_ld, video_id, fatal=fatal, expected_type=expected_type)
for mobj in json_ld_list:
json_ld_item = self._parse_json(
mobj.group('json_ld'), video_id, fatal=fatal)
if not json_ld_item:
continue
if isinstance(json_ld_item, dict):
json_ld.append(json_ld_item)
elif isinstance(json_ld_item, (list, tuple)):
json_ld.extend(json_ld_item)
if json_ld:
json_ld = self._json_ld(json_ld, video_id, fatal=fatal, expected_type=expected_type)
if json_ld:
return json_ld
if default is not NO_DEFAULT:
return default
elif fatal:
raise RegexNotFoundError('Unable to extract JSON-LD')
else:
self._downloader.report_warning('unable to extract JSON-LD %s' % bug_reports_message())
return {}
def _json_ld(self, json_ld, video_id, fatal=True, expected_type=None): def _json_ld(self, json_ld, video_id, fatal=True, expected_type=None):
if isinstance(json_ld, compat_str): if isinstance(json_ld, compat_str):
@ -1249,10 +1226,7 @@ class InfoExtractor(object):
interaction_type = is_e.get('interactionType') interaction_type = is_e.get('interactionType')
if not isinstance(interaction_type, compat_str): if not isinstance(interaction_type, compat_str):
continue continue
# For interaction count some sites provide string instead of interaction_count = int_or_none(is_e.get('userInteractionCount'))
# an integer (as per spec) with non digit characters (e.g. ",")
# so extracting count with more relaxed str_to_int
interaction_count = str_to_int(is_e.get('userInteractionCount'))
if interaction_count is None: if interaction_count is None:
continue continue
count_kind = INTERACTION_TYPE_MAP.get(interaction_type.split('/')[-1]) count_kind = INTERACTION_TYPE_MAP.get(interaction_type.split('/')[-1])
@ -1272,7 +1246,6 @@ class InfoExtractor(object):
'thumbnail': url_or_none(e.get('thumbnailUrl') or e.get('thumbnailURL')), 'thumbnail': url_or_none(e.get('thumbnailUrl') or e.get('thumbnailURL')),
'duration': parse_duration(e.get('duration')), 'duration': parse_duration(e.get('duration')),
'timestamp': unified_timestamp(e.get('uploadDate')), 'timestamp': unified_timestamp(e.get('uploadDate')),
'uploader': str_or_none(e.get('author')),
'filesize': float_or_none(e.get('contentSize')), 'filesize': float_or_none(e.get('contentSize')),
'tbr': int_or_none(e.get('bitrate')), 'tbr': int_or_none(e.get('bitrate')),
'width': int_or_none(e.get('width')), 'width': int_or_none(e.get('width')),
@ -1282,10 +1255,10 @@ class InfoExtractor(object):
extract_interaction_statistic(e) extract_interaction_statistic(e)
for e in json_ld: for e in json_ld:
if '@context' in e: if isinstance(e.get('@context'), compat_str) and re.match(r'^https?://schema.org/?$', e.get('@context')):
item_type = e.get('@type') item_type = e.get('@type')
if expected_type is not None and expected_type != item_type: if expected_type is not None and expected_type != item_type:
continue return info
if item_type in ('TVEpisode', 'Episode'): if item_type in ('TVEpisode', 'Episode'):
episode_name = unescapeHTML(e.get('name')) episode_name = unescapeHTML(e.get('name'))
info.update({ info.update({
@ -1319,17 +1292,11 @@ class InfoExtractor(object):
}) })
elif item_type == 'VideoObject': elif item_type == 'VideoObject':
extract_video_object(e) extract_video_object(e)
if expected_type is None: continue
continue
else:
break
video = e.get('video') video = e.get('video')
if isinstance(video, dict) and video.get('@type') == 'VideoObject': if isinstance(video, dict) and video.get('@type') == 'VideoObject':
extract_video_object(video) extract_video_object(video)
if expected_type is None: break
continue
else:
break
return dict((k, v) for k, v in info.items() if v is not None) return dict((k, v) for k, v in info.items() if v is not None)
@staticmethod @staticmethod
@ -1456,10 +1423,12 @@ class InfoExtractor(object):
try: try:
self._request_webpage(url, video_id, 'Checking %s URL' % item, headers=headers) self._request_webpage(url, video_id, 'Checking %s URL' % item, headers=headers)
return True return True
except ExtractorError: except ExtractorError as e:
self.to_screen( if isinstance(e.cause, compat_urllib_error.URLError):
'%s: %s URL is invalid, skipping' % (video_id, item)) self.to_screen(
return False '%s: %s URL is invalid, skipping' % (video_id, item))
return False
raise
def http_scheme(self): def http_scheme(self):
""" Either "http:" or "https:", depending on the user's preferences """ """ Either "http:" or "https:", depending on the user's preferences """
@ -1487,14 +1456,14 @@ class InfoExtractor(object):
def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None, def _extract_f4m_formats(self, manifest_url, video_id, preference=None, f4m_id=None,
transform_source=lambda s: fix_xml_ampersands(s).strip(), transform_source=lambda s: fix_xml_ampersands(s).strip(),
fatal=True, m3u8_id=None, data=None, headers={}, query={}): fatal=True, m3u8_id=None):
manifest = self._download_xml( manifest = self._download_xml(
manifest_url, video_id, 'Downloading f4m manifest', manifest_url, video_id, 'Downloading f4m manifest',
'Unable to download f4m manifest', 'Unable to download f4m manifest',
# Some manifests may be malformed, e.g. prosiebensat1 generated manifests # Some manifests may be malformed, e.g. prosiebensat1 generated manifests
# (see https://github.com/ytdl-org/youtube-dl/issues/6215#issuecomment-121704244) # (see https://github.com/ytdl-org/youtube-dl/issues/6215#issuecomment-121704244)
transform_source=transform_source, transform_source=transform_source,
fatal=fatal, data=data, headers=headers, query=query) fatal=fatal)
if manifest is False: if manifest is False:
return [] return []
@ -1618,13 +1587,12 @@ class InfoExtractor(object):
def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None, def _extract_m3u8_formats(self, m3u8_url, video_id, ext=None,
entry_protocol='m3u8', preference=None, entry_protocol='m3u8', preference=None,
m3u8_id=None, note=None, errnote=None, m3u8_id=None, note=None, errnote=None,
fatal=True, live=False, data=None, headers={}, fatal=True, live=False):
query={}):
res = self._download_webpage_handle( res = self._download_webpage_handle(
m3u8_url, video_id, m3u8_url, video_id,
note=note or 'Downloading m3u8 information', note=note or 'Downloading m3u8 information',
errnote=errnote or 'Failed to download m3u8 information', errnote=errnote or 'Failed to download m3u8 information',
fatal=fatal, data=data, headers=headers, query=query) fatal=fatal)
if res is False: if res is False:
return [] return []
@ -1741,8 +1709,8 @@ class InfoExtractor(object):
continue continue
else: else:
tbr = float_or_none( tbr = float_or_none(
last_stream_inf.get('AVERAGE-BANDWIDTH') last_stream_inf.get('AVERAGE-BANDWIDTH') or
or last_stream_inf.get('BANDWIDTH'), scale=1000) last_stream_inf.get('BANDWIDTH'), scale=1000)
format_id = [] format_id = []
if m3u8_id: if m3u8_id:
format_id.append(m3u8_id) format_id.append(m3u8_id)
@ -1798,19 +1766,6 @@ class InfoExtractor(object):
# the same GROUP-ID # the same GROUP-ID
f['acodec'] = 'none' f['acodec'] = 'none'
formats.append(f) formats.append(f)
# for DailyMotion
progressive_uri = last_stream_inf.get('PROGRESSIVE-URI')
if progressive_uri:
http_f = f.copy()
del http_f['manifest_url']
http_f.update({
'format_id': f['format_id'].replace('hls-', 'http-'),
'protocol': 'http',
'url': progressive_uri,
})
formats.append(http_f)
last_stream_inf = {} last_stream_inf = {}
return formats return formats
@ -2055,17 +2010,15 @@ class InfoExtractor(object):
}) })
return entries return entries
def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}, data=None, headers={}, query={}): def _extract_mpd_formats(self, mpd_url, video_id, mpd_id=None, note=None, errnote=None, fatal=True, formats_dict={}):
res = self._download_xml_handle( res = self._download_xml_handle(
mpd_url, video_id, mpd_url, video_id,
note=note or 'Downloading MPD manifest', note=note or 'Downloading MPD manifest',
errnote=errnote or 'Failed to download MPD manifest', errnote=errnote or 'Failed to download MPD manifest',
fatal=fatal, data=data, headers=headers, query=query) fatal=fatal)
if res is False: if res is False:
return [] return []
mpd_doc, urlh = res mpd_doc, urlh = res
if mpd_doc is None:
return []
mpd_base_url = base_url(urlh.geturl()) mpd_base_url = base_url(urlh.geturl())
return self._parse_mpd_formats( return self._parse_mpd_formats(
@ -2363,17 +2316,15 @@ class InfoExtractor(object):
self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type) self.report_warning('Unknown MIME type %s in DASH manifest' % mime_type)
return formats return formats
def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True, data=None, headers={}, query={}): def _extract_ism_formats(self, ism_url, video_id, ism_id=None, note=None, errnote=None, fatal=True):
res = self._download_xml_handle( res = self._download_xml_handle(
ism_url, video_id, ism_url, video_id,
note=note or 'Downloading ISM manifest', note=note or 'Downloading ISM manifest',
errnote=errnote or 'Failed to download ISM manifest', errnote=errnote or 'Failed to download ISM manifest',
fatal=fatal, data=data, headers=headers, query=query) fatal=fatal)
if res is False: if res is False:
return [] return []
ism_doc, urlh = res ism_doc, urlh = res
if ism_doc is None:
return []
return self._parse_ism_formats(ism_doc, urlh.geturl(), ism_id) return self._parse_ism_formats(ism_doc, urlh.geturl(), ism_id)
@ -2527,7 +2478,7 @@ class InfoExtractor(object):
'subtitles': {}, 'subtitles': {},
} }
media_attributes = extract_attributes(media_tag) media_attributes = extract_attributes(media_tag)
src = strip_or_none(media_attributes.get('src')) src = media_attributes.get('src')
if src: if src:
_, formats = _media_formats(src, media_type) _, formats = _media_formats(src, media_type)
media_info['formats'].extend(formats) media_info['formats'].extend(formats)
@ -2537,7 +2488,7 @@ class InfoExtractor(object):
s_attr = extract_attributes(source_tag) s_attr = extract_attributes(source_tag)
# data-video-src and data-src are non standard but seen # data-video-src and data-src are non standard but seen
# several times in the wild # several times in the wild
src = strip_or_none(dict_get(s_attr, ('src', 'data-video-src', 'data-src'))) src = dict_get(s_attr, ('src', 'data-video-src', 'data-src'))
if not src: if not src:
continue continue
f = parse_content_type(s_attr.get('type')) f = parse_content_type(s_attr.get('type'))
@ -2551,8 +2502,8 @@ class InfoExtractor(object):
if str_or_none(s_attr.get(lbl)) if str_or_none(s_attr.get(lbl))
] ]
width = int_or_none(s_attr.get('width')) width = int_or_none(s_attr.get('width'))
height = (int_or_none(s_attr.get('height')) height = (int_or_none(s_attr.get('height')) or
or int_or_none(s_attr.get('res'))) int_or_none(s_attr.get('res')))
if not width or not height: if not width or not height:
for lbl in labels: for lbl in labels:
resolution = parse_resolution(lbl) resolution = parse_resolution(lbl)
@ -2580,7 +2531,7 @@ class InfoExtractor(object):
track_attributes = extract_attributes(track_tag) track_attributes = extract_attributes(track_tag)
kind = track_attributes.get('kind') kind = track_attributes.get('kind')
if not kind or kind in ('subtitles', 'captions'): if not kind or kind in ('subtitles', 'captions'):
src = strip_or_none(track_attributes.get('src')) src = track_attributes.get('src')
if not src: if not src:
continue continue
lang = track_attributes.get('srclang') or track_attributes.get('lang') or track_attributes.get('label') lang = track_attributes.get('srclang') or track_attributes.get('lang') or track_attributes.get('label')
@ -2737,7 +2688,7 @@ class InfoExtractor(object):
entry = { entry = {
'id': this_video_id, 'id': this_video_id,
'title': unescapeHTML(video_data['title'] if require_title else video_data.get('title')), 'title': unescapeHTML(video_data['title'] if require_title else video_data.get('title')),
'description': clean_html(video_data.get('description')), 'description': video_data.get('description'),
'thumbnail': urljoin(base_url, self._proto_relative_url(video_data.get('image'))), 'thumbnail': urljoin(base_url, self._proto_relative_url(video_data.get('image'))),
'timestamp': int_or_none(video_data.get('pubdate')), 'timestamp': int_or_none(video_data.get('pubdate')),
'duration': float_or_none(jwplayer_data.get('duration') or video_data.get('duration')), 'duration': float_or_none(jwplayer_data.get('duration') or video_data.get('duration')),
@ -2852,7 +2803,7 @@ class InfoExtractor(object):
def _set_cookie(self, domain, name, value, expire_time=None, port=None, def _set_cookie(self, domain, name, value, expire_time=None, port=None,
path='/', secure=False, discard=False, rest={}, **kwargs): path='/', secure=False, discard=False, rest={}, **kwargs):
cookie = compat_cookiejar_Cookie( cookie = compat_cookiejar.Cookie(
0, name, value, port, port is not None, domain, True, 0, name, value, port, port is not None, domain, True,
domain.startswith('.'), path, True, secure, expire_time, domain.startswith('.'), path, True, secure, expire_time,
discard, None, None, rest) discard, None, None, rest)
@ -2864,33 +2815,6 @@ class InfoExtractor(object):
self._downloader.cookiejar.add_cookie_header(req) self._downloader.cookiejar.add_cookie_header(req)
return compat_cookies.SimpleCookie(req.get_header('Cookie')) return compat_cookies.SimpleCookie(req.get_header('Cookie'))
def _apply_first_set_cookie_header(self, url_handle, cookie):
"""
Apply first Set-Cookie header instead of the last. Experimental.
Some sites (e.g. [1-3]) may serve two cookies under the same name
in Set-Cookie header and expect the first (old) one to be set rather
than second (new). However, as of RFC6265 the newer one cookie
should be set into cookie store what actually happens.
We will workaround this issue by resetting the cookie to
the first one manually.
1. https://new.vk.com/
2. https://github.com/ytdl-org/youtube-dl/issues/9841#issuecomment-227871201
3. https://learning.oreilly.com/
"""
for header, cookies in url_handle.headers.items():
if header.lower() != 'set-cookie':
continue
if sys.version_info[0] >= 3:
cookies = cookies.encode('iso-8859-1')
cookies = cookies.decode('utf-8')
cookie_value = re.search(
r'%s=(.+?);.*?\b[Dd]omain=(.+?)(?:[,;]|$)' % cookie, cookies)
if cookie_value:
value, domain = cookie_value.groups()
self._set_cookie(domain, cookie, value)
break
def get_testcases(self, include_onlymatching=False): def get_testcases(self, include_onlymatching=False):
t = getattr(self, '_TEST', None) t = getattr(self, '_TEST', None)
if t: if t:
@ -2921,8 +2845,8 @@ class InfoExtractor(object):
return not any_restricted return not any_restricted
def extract_subtitles(self, *args, **kwargs): def extract_subtitles(self, *args, **kwargs):
if (self._downloader.params.get('writesubtitles', False) if (self._downloader.params.get('writesubtitles', False) or
or self._downloader.params.get('listsubtitles')): self._downloader.params.get('listsubtitles')):
return self._get_subtitles(*args, **kwargs) return self._get_subtitles(*args, **kwargs)
return {} return {}
@ -2947,8 +2871,8 @@ class InfoExtractor(object):
return ret return ret
def extract_automatic_captions(self, *args, **kwargs): def extract_automatic_captions(self, *args, **kwargs):
if (self._downloader.params.get('writeautomaticsub', False) if (self._downloader.params.get('writeautomaticsub', False) or
or self._downloader.params.get('listsubtitles')): self._downloader.params.get('listsubtitles')):
return self._get_automatic_captions(*args, **kwargs) return self._get_automatic_captions(*args, **kwargs)
return {} return {}
@ -2956,9 +2880,9 @@ class InfoExtractor(object):
raise NotImplementedError('This method must be implemented by subclasses') raise NotImplementedError('This method must be implemented by subclasses')
def mark_watched(self, *args, **kwargs): def mark_watched(self, *args, **kwargs):
if (self._downloader.params.get('mark_watched', False) if (self._downloader.params.get('mark_watched', False) and
and (self._get_login_info()[0] is not None (self._get_login_info()[0] is not None or
or self._downloader.params.get('cookiefile') is not None)): self._downloader.params.get('cookiefile') is not None)):
self._mark_watched(*args, **kwargs) self._mark_watched(*args, **kwargs)
def _mark_watched(self, *args, **kwargs): def _mark_watched(self, *args, **kwargs):
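
The `_search_json_ld` hunk above shows two strategies: one side takes only the first regex match, the other walks every JSON-LD block and flattens dicts and lists into a single list before handing it to `_json_ld`. A minimal standalone sketch of the collect-and-flatten idea follows. It assumes a simplified pattern (`SIMPLE_JSON_LD_RE` is a hypothetical stand-in, not youtube-dl's `JSON_LD_RE`) and uses plain `json.loads` in place of `_parse_json`; `collect_json_ld` is likewise an illustrative name.

```python
import json
import re

# Hypothetical, simplified stand-in for youtube-dl's JSON_LD_RE.
SIMPLE_JSON_LD_RE = r'(?is)<script[^>]+type=(["\'])application/ld\+json\1[^>]*>(?P<json_ld>.+?)</script>'


def collect_json_ld(html):
    """Return a flat list of JSON-LD items found in html, skipping malformed blocks."""
    items = []
    for mobj in re.finditer(SIMPLE_JSON_LD_RE, html):
        try:
            item = json.loads(mobj.group('json_ld'))
        except ValueError:
            # Malformed block: skip it, roughly what fatal=False would do.
            continue
        if isinstance(item, dict):
            items.append(item)
        elif isinstance(item, (list, tuple)):
            # A block may hold several objects at once; flatten them.
            items.extend(item)
    return items
```

Collecting every block first means a page whose first JSON-LD script is unrelated (or broken) can still yield a usable `VideoObject` from a later one.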


@ -32,19 +32,19 @@ class CommonMistakesIE(InfoExtractor):
class UnicodeBOMIE(InfoExtractor): class UnicodeBOMIE(InfoExtractor):
IE_DESC = False IE_DESC = False
_VALID_URL = r'(?P<bom>\ufeff)(?P<id>.*)$' _VALID_URL = r'(?P<bom>\ufeff)(?P<id>.*)$'
# Disable test for python 3.2 since BOM is broken in re in this version # Disable test for python 3.2 since BOM is broken in re in this version
# (see https://github.com/ytdl-org/youtube-dl/issues/9751) # (see https://github.com/ytdl-org/youtube-dl/issues/9751)
_TESTS = [] if (3, 0) < sys.version_info <= (3, 3) else [{ _TESTS = [] if (3, 0) < sys.version_info <= (3, 3) else [{
'url': '\ufeffhttp://www.youtube.com/watch?v=BaW_jenozKc', 'url': '\ufeffhttp://www.youtube.com/watch?v=BaW_jenozKc',
'only_matching': True, 'only_matching': True,
}] }]
def _real_extract(self, url): def _real_extract(self, url):
real_url = self._match_id(url) real_url = self._match_id(url)
self.report_warning( self.report_warning(
'Your URL starts with a Byte Order Mark (BOM). ' 'Your URL starts with a Byte Order Mark (BOM). '
'Removing the BOM and looking for "%s" ...' % real_url) 'Removing the BOM and looking for "%s" ...' % real_url)
return self.url_result(real_url) return self.url_result(real_url)
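
The `_VALID_URL` of `UnicodeBOMIE` captures everything after a leading Byte Order Mark as the real URL. A small sketch of that pattern outside the extractor framework, assuming Python 3 string semantics; `BOM_URL_RE` and `strip_bom` are hypothetical names used only for illustration:

```python
import re

# Same shape as the extractor's pattern: a leading BOM, then the real URL.
BOM_URL_RE = re.compile('(?P<bom>\ufeff)(?P<id>.*)$')


def strip_bom(url):
    """Return url without a leading BOM; URLs without one are returned unchanged."""
    mobj = BOM_URL_RE.match(url)
    return mobj.group('id') if mobj else url
```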


@ -1,118 +0,0 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
from ..utils import (
float_or_none,
int_or_none,
)
class CONtvIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?contv\.com/details-movie/(?P<id>[^/]+)'
_TESTS = [{
'url': 'https://www.contv.com/details-movie/CEG10022949/days-of-thrills-&-laughter',
'info_dict': {
'id': 'CEG10022949',
'ext': 'mp4',
'title': 'Days Of Thrills & Laughter',
'description': 'md5:5d6b3d0b1829bb93eb72898c734802eb',
'upload_date': '20180703',
'timestamp': 1530634789.61,
},
'params': {
# m3u8 download
'skip_download': True,
},
}, {
'url': 'https://www.contv.com/details-movie/CLIP-show_fotld_bts/fight-of-the-living-dead:-behind-the-scenes-bites',
'info_dict': {
'id': 'CLIP-show_fotld_bts',
'title': 'Fight of the Living Dead: Behind the Scenes Bites',
},
'playlist_mincount': 7,
}]
def _real_extract(self, url):
video_id = self._match_id(url)
details = self._download_json(
'http://metax.contv.live.junctiontv.net/metax/2.5/details/' + video_id,
video_id, query={'device': 'web'})
if details.get('type') == 'episodic':
seasons = self._download_json(
'http://metax.contv.live.junctiontv.net/metax/2.5/seriesfeed/json/' + video_id,
video_id)
entries = []
for season in seasons:
for episode in season.get('episodes', []):
episode_id = episode.get('id')
if not episode_id:
continue
entries.append(self.url_result(
'https://www.contv.com/details-movie/' + episode_id,
CONtvIE.ie_key(), episode_id))
return self.playlist_result(entries, video_id, details.get('title'))
m_details = details['details']
title = details['title']
formats = []
media_hls_url = m_details.get('media_hls_url')
if media_hls_url:
formats.extend(self._extract_m3u8_formats(
media_hls_url, video_id, 'mp4',
m3u8_id='hls', fatal=False))
media_mp4_url = m_details.get('media_mp4_url')
if media_mp4_url:
formats.append({
'format_id': 'http',
'url': media_mp4_url,
})
self._sort_formats(formats)
subtitles = {}
captions = m_details.get('captions') or {}
for caption_url in captions.values():
subtitles.setdefault('en', []).append({
'url': caption_url
})
thumbnails = []
for image in m_details.get('images', []):
image_url = image.get('url')
if not image_url:
continue
thumbnails.append({
'url': image_url,
'width': int_or_none(image.get('width')),
'height': int_or_none(image.get('height')),
})
description = None
for p in ('large_', 'medium_', 'small_', ''):
d = m_details.get(p + 'description')
if d:
description = d
break
return {
'id': video_id,
'title': title,
'formats': formats,
'thumbnails': thumbnails,
'description': description,
'timestamp': float_or_none(details.get('metax_added_on'), 1000),
'subtitles': subtitles,
'duration': float_or_none(m_details.get('duration'), 1000),
'view_count': int_or_none(details.get('num_watched')),
'like_count': int_or_none(details.get('num_fav')),
'categories': details.get('category'),
'tags': details.get('tags'),
'season_number': int_or_none(details.get('season')),
'episode_number': int_or_none(details.get('episode')),
'release_year': int_or_none(details.get('pub_year')),
}
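
The description lookup in `CONtvIE._real_extract` tries progressively shorter variants (`large_`, `medium_`, `small_`, then the bare key) and keeps the first non-empty one. As a rough standalone sketch of that fallback, with `pick_description` being a hypothetical helper name rather than anything in youtube-dl:

```python
def pick_description(details, prefixes=('large_', 'medium_', 'small_', '')):
    """Return the first non-empty '<prefix>description' value, or None."""
    for prefix in prefixes:
        value = details.get(prefix + 'description')
        if value:
            return value
    return None
```

Empty strings are treated the same as missing keys, so a blank `large_description` does not shadow a populated shorter variant.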


@ -4,12 +4,7 @@ from __future__ import unicode_literals
import re import re
from .theplatform import ThePlatformFeedIE from .theplatform import ThePlatformFeedIE
from ..utils import ( from ..utils import int_or_none
dict_get,
ExtractorError,
float_or_none,
int_or_none,
)
class CorusIE(ThePlatformFeedIE): class CorusIE(ThePlatformFeedIE):
@ -17,49 +12,24 @@ class CorusIE(ThePlatformFeedIE):
https?:// https?://
(?:www\.)? (?:www\.)?
(?P<domain> (?P<domain>
(?: (?:globaltv|etcanada)\.com|
globaltv| (?:hgtv|foodnetwork|slice|history|showcase|bigbrothercanada)\.ca
etcanada|
seriesplus|
wnetwork|
ytv
)\.com|
(?:
hgtv|
foodnetwork|
slice|
history|
showcase|
bigbrothercanada|
abcspark|
disney(?:channel|lachaine)
)\.ca
)
/(?:[^/]+/)*
(?:
video\.html\?.*?\bv=|
videos?/(?:[^/]+/)*(?:[a-z0-9-]+-)?
)
(?P<id>
[\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12}|
(?:[A-Z]{4})?\d{12,20}
) )
/(?:video/(?:[^/]+/)?|(?:[^/]+/)+(?:videos/[a-z0-9-]+-|video\.html\?.*?\bv=))
(?P<id>\d+)
''' '''
_TESTS = [{ _TESTS = [{
'url': 'http://www.hgtv.ca/shows/bryan-inc/videos/movie-night-popcorn-with-bryan-870923331648/', 'url': 'http://www.hgtv.ca/shows/bryan-inc/videos/movie-night-popcorn-with-bryan-870923331648/',
'md5': '05dcbca777bf1e58c2acbb57168ad3a6',
'info_dict': { 'info_dict': {
'id': '870923331648', 'id': '870923331648',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Movie Night Popcorn with Bryan', 'title': 'Movie Night Popcorn with Bryan',
'description': 'Bryan whips up homemade popcorn, the old fashion way for Jojo and Lincoln.', 'description': 'Bryan whips up homemade popcorn, the old fashion way for Jojo and Lincoln.',
'uploader': 'SHWM-NEW',
'upload_date': '20170206', 'upload_date': '20170206',
'timestamp': 1486392197, 'timestamp': 1486392197,
}, },
'params': {
'format': 'bestvideo',
'skip_download': True,
},
'expected_warnings': ['Failed to parse JSON'],
}, { }, {
'url': 'http://www.foodnetwork.ca/shows/chopped/video/episode/chocolate-obsession/video.html?v=872683587753', 'url': 'http://www.foodnetwork.ca/shows/chopped/video/episode/chocolate-obsession/video.html?v=872683587753',
'only_matching': True, 'only_matching': True,
@ -78,83 +48,58 @@ class CorusIE(ThePlatformFeedIE):
}, { }, {
'url': 'https://www.bigbrothercanada.ca/video/big-brother-canada-704/1457812035894/', 'url': 'https://www.bigbrothercanada.ca/video/big-brother-canada-704/1457812035894/',
'only_matching': True 'only_matching': True
}, {
'url': 'https://www.seriesplus.com/emissions/dre-mary-mort-sur-ordonnance/videos/deux-coeurs-battant/SERP0055626330000200/',
'only_matching': True
}, {
'url': 'https://www.disneychannel.ca/shows/gabby-duran-the-unsittables/video/crybaby-duran-clip/2f557eec-0588-11ea-ae2b-e2c6776b770e/',
'only_matching': True
}] }]
_GEO_BYPASS = False
_SITE_MAP = { _TP_FEEDS = {
'globaltv': 'series', 'globaltv': {
'etcanada': 'series', 'feed_id': 'ChQqrem0lNUp',
'foodnetwork': 'food', 'account_id': 2269680845,
'bigbrothercanada': 'series', },
'disneychannel': 'disneyen', 'etcanada': {
'disneylachaine': 'disneyfr', 'feed_id': 'ChQqrem0lNUp',
'account_id': 2269680845,
},
'hgtv': {
'feed_id': 'L0BMHXi2no43',
'account_id': 2414428465,
},
'foodnetwork': {
'feed_id': 'ukK8o58zbRmJ',
'account_id': 2414429569,
},
'slice': {
'feed_id': '5tUJLgV2YNJ5',
'account_id': 2414427935,
},
'history': {
'feed_id': 'tQFx_TyyEq4J',
'account_id': 2369613659,
},
'showcase': {
'feed_id': '9H6qyshBZU3E',
'account_id': 2414426607,
},
'bigbrothercanada': {
'feed_id': 'ChQqrem0lNUp',
'account_id': 2269680845,
},
} }
def _real_extract(self, url): def _real_extract(self, url):
domain, video_id = re.match(self._VALID_URL, url).groups() domain, video_id = re.match(self._VALID_URL, url).groups()
site = domain.split('.')[0] feed_info = self._TP_FEEDS[domain.split('.')[0]]
path = self._SITE_MAP.get(site, site) return self._extract_feed_info('dtjsEC', feed_info['feed_id'], 'byId=' + video_id, video_id, lambda e: {
if path != 'series': 'episode_number': int_or_none(e.get('pl1$episode')),
path = 'migration/' + path 'season_number': int_or_none(e.get('pl1$season')),
video = self._download_json( 'series': e.get('pl1$show'),
'https://globalcontent.corusappservices.com/templates/%s/playlist/' % path, }, {
video_id, query={'byId': video_id}, 'HLS': {
headers={'Accept': 'application/json'})[0] 'manifest': 'm3u',
title = video['title'] },
'DesktopHLS Default': {
formats = [] 'manifest': 'm3u',
for source in video.get('sources', []): },
smil_url = source.get('file') 'MP4 MBR': {
if not smil_url: 'manifest': 'm3u',
continue },
source_type = source.get('type') }, feed_info['account_id'])
note = 'Downloading%s smil file' % (' ' + source_type if source_type else '')
resp = self._download_webpage(
smil_url, video_id, note, fatal=False,
headers=self.geo_verification_headers())
if not resp:
continue
error = self._parse_json(resp, video_id, fatal=False)
if error:
if error.get('exception') == 'GeoLocationBlocked':
self.raise_geo_restricted(countries=['CA'])
raise ExtractorError(error['description'])
smil = self._parse_xml(resp, video_id, fatal=False)
if smil is None:
continue
namespace = self._parse_smil_namespace(smil)
formats.extend(self._parse_smil_formats(
smil, smil_url, video_id, namespace))
if not formats and video.get('drm'):
raise ExtractorError('This video is DRM protected.', expected=True)
self._sort_formats(formats)
subtitles = {}
for track in video.get('tracks', []):
track_url = track.get('file')
if not track_url:
continue
lang = 'fr' if site in ('disneylachaine', 'seriesplus') else 'en'
subtitles.setdefault(lang, []).append({'url': track_url})
metadata = video.get('metadata') or {}
get_number = lambda x: int_or_none(video.get('pl1$' + x) or metadata.get(x + 'Number'))
return {
'id': video_id,
'title': title,
'formats': formats,
'thumbnail': dict_get(video, ('defaultThumbnailUrl', 'thumbnail', 'image')),
'description': video.get('description'),
'timestamp': int_or_none(video.get('availableDate'), 1000),
'subtitles': subtitles,
'duration': float_or_none(metadata.get('duration')),
'series': dict_get(video, ('show', 'pl1$show')),
'season_number': get_number('season'),
'episode_number': get_number('episode'),
}
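
The return dict above scales `availableDate` by 1000 and reads season and episode numbers from either a `pl1$`-prefixed key or the `metadata` object. A minimal sketch of that behaviour, assuming a simplified stand-in for youtube-dl's `int_or_none` (the real helper takes more parameters) and a hypothetical `episode_numbers` wrapper:

```python
def int_or_none(value, scale=1):
    """Simplified stand-in: int(value) // scale, or None if conversion fails."""
    if value is None or value == '':
        return None
    try:
        return int(value) // scale
    except (TypeError, ValueError):
        return None


def episode_numbers(video):
    """Hypothetical wrapper mirroring the get_number lambda in the extractor."""
    metadata = video.get('metadata') or {}

    def get_number(name):
        # Prefer the 'pl1$<name>' key, fall back to metadata['<name>Number'].
        return int_or_none(video.get('pl1$' + name) or metadata.get(name + 'Number'))

    return get_number('season'), get_number('episode')
```

Passing a scale of 1000 turns a millisecond epoch value into seconds, which is the unit the `timestamp` field expects.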


@ -0,0 +1,39 @@
# coding: utf-8
from __future__ import unicode_literals
from .common import InfoExtractor
class CriterionIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?criterion\.com/films/(?P<id>[0-9]+)-.+'
_TEST = {
'url': 'http://www.criterion.com/films/184-le-samourai',
'md5': 'bc51beba55685509883a9a7830919ec3',
'info_dict': {
'id': '184',
'ext': 'mp4',
'title': 'Le Samouraï',
'description': 'md5:a2b4b116326558149bef81f76dcbb93f',
'thumbnail': r're:^https?://.*\.jpg$',
}
}
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
final_url = self._search_regex(
r'so\.addVariable\("videoURL", "(.+?)"\)\;', webpage, 'video url')
title = self._og_search_title(webpage)
description = self._html_search_meta('description', webpage)
thumbnail = self._search_regex(
r'so\.addVariable\("thumbnailURL", "(.+?)"\)\;',
webpage, 'thumbnail url')
return {
'id': video_id,
'url': final_url,
'title': title,
'description': description,
'thumbnail': thumbnail,
}
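
`CriterionIE` relies on two regular expressions: `_VALID_URL` to pull the numeric id out of the page URL, and an `so.addVariable(...)` pattern to pull values out of the page source. The sketch below exercises both with the standard `re` module. The sample page string and the `find_variable` helper are assumptions for illustration, not part of the extractor.

```python
import re

VALID_URL = r'https?://(?:www\.)?criterion\.com/films/(?P<id>[0-9]+)-.+'

# Hypothetical page fragment shaped like the markup the extractor expects.
SAMPLE_PAGE = (
    'so.addVariable("videoURL", "http://example.com/v.mp4"); '
    'so.addVariable("thumbnailURL", "http://example.com/t.jpg");'
)


def find_variable(name, page):
    """Return the value passed to so.addVariable(name, ...), or None."""
    m = re.search(r'so\.addVariable\("%s", "(.+?)"\)\;' % re.escape(name), page)
    return m.group(1) if m else None


video_id = re.match(VALID_URL, 'http://www.criterion.com/films/184-le-samourai').group('id')
```

The non-greedy `(.+?)` stops at the first `")`, so each variable is captured separately even when several calls share a line.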


@ -13,7 +13,6 @@ from ..compat import (
compat_b64decode, compat_b64decode,
compat_etree_Element, compat_etree_Element,
compat_etree_fromstring, compat_etree_fromstring,
compat_str,
compat_urllib_parse_urlencode, compat_urllib_parse_urlencode,
compat_urllib_request, compat_urllib_request,
compat_urlparse, compat_urlparse,
@ -26,9 +25,9 @@ from ..utils import (
intlist_to_bytes, intlist_to_bytes,
int_or_none, int_or_none,
lowercase_escape, lowercase_escape,
merge_dicts,
remove_end, remove_end,
sanitized_Request, sanitized_Request,
unified_strdate,
urlencode_postdata, urlencode_postdata,
xpath_text, xpath_text,
) )
@ -104,6 +103,19 @@ class CrunchyrollBaseIE(InfoExtractor):
def _real_initialize(self): def _real_initialize(self):
self._login() self._login()
def _download_webpage(self, url_or_request, *args, **kwargs):
request = (url_or_request if isinstance(url_or_request, compat_urllib_request.Request)
else sanitized_Request(url_or_request))
# Accept-Language must be set explicitly to accept any language to avoid issues
# similar to https://github.com/ytdl-org/youtube-dl/issues/6797.
# Along with the IP address, Crunchyroll uses Accept-Language to guess whether georestriction
# should be imposed or not (from what I can see, it just takes the first language,
# ignoring the priority, and requires it to correspond to the IP). Incidentally, this causes
# Crunchyroll to not work in georestriction cases in some browsers that don't place
# the locale lang first in the header. However, allowing any language seems to work around the issue.
request.add_header('Accept-Language', '*')
return super(CrunchyrollBaseIE, self)._download_webpage(request, *args, **kwargs)
@staticmethod @staticmethod
def _add_skip_wall(url): def _add_skip_wall(url):
parsed_url = compat_urlparse.urlparse(url) parsed_url = compat_urlparse.urlparse(url)
@ -137,7 +149,6 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
# rtmp # rtmp
'skip_download': True, 'skip_download': True,
}, },
'skip': 'Video gone',
}, { }, {
'url': 'http://www.crunchyroll.com/media-589804/culture-japan-1', 'url': 'http://www.crunchyroll.com/media-589804/culture-japan-1',
'info_dict': { 'info_dict': {
@ -159,12 +170,11 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
'info_dict': { 'info_dict': {
'id': '702409', 'id': '702409',
'ext': 'mp4', 'ext': 'mp4',
'title': compat_str, 'title': 'Re:ZERO -Starting Life in Another World- Episode 5 The Morning of Our Promise Is Still Distant',
'description': compat_str, 'description': 'md5:97664de1ab24bbf77a9c01918cb7dca9',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Re:Zero Partners', 'uploader': 'TV TOKYO',
'timestamp': 1462098900, 'upload_date': '20160508',
'upload_date': '20160501',
}, },
'params': { 'params': {
# m3u8 download # m3u8 download
@ -175,13 +185,12 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
'info_dict': { 'info_dict': {
'id': '727589', 'id': '727589',
'ext': 'mp4', 'ext': 'mp4',
'title': compat_str, 'title': "KONOSUBA -God's blessing on this wonderful world! 2 Episode 1 Give Me Deliverance From This Judicial Injustice!",
'description': compat_str, 'description': 'md5:cbcf05e528124b0f3a0a419fc805ea7d',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'uploader': 'Kadokawa Pictures Inc.', 'uploader': 'Kadokawa Pictures Inc.',
'timestamp': 1484130900, 'upload_date': '20170118',
'upload_date': '20170111', 'series': "KONOSUBA -God's blessing on this wonderful world!",
'series': compat_str,
'season': "KONOSUBA -God's blessing on this wonderful world! 2", 'season': "KONOSUBA -God's blessing on this wonderful world! 2",
'season_number': 2, 'season_number': 2,
'episode': 'Give Me Deliverance From This Judicial Injustice!', 'episode': 'Give Me Deliverance From This Judicial Injustice!',
@ -204,11 +213,10 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
'info_dict': { 'info_dict': {
'id': '535080', 'id': '535080',
'ext': 'mp4', 'ext': 'mp4',
'title': compat_str, 'title': '11eyes Episode 1 Red Night ~ Piros éjszaka',
'description': compat_str, 'description': 'Kakeru and Yuka are thrown into an alternate nightmarish world they call "Red Night".',
'uploader': 'Marvelous AQL Inc.', 'uploader': 'Marvelous AQL Inc.',
'timestamp': 1255512600, 'upload_date': '20091021',
'upload_date': '20091014',
}, },
'params': { 'params': {
# Just test metadata extraction # Just test metadata extraction
@ -229,17 +237,15 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
# just test metadata extraction # just test metadata extraction
'skip_download': True, 'skip_download': True,
}, },
'skip': 'Video gone',
}, { }, {
# A video with a vastly different season name compared to the series name # A video with a vastly different season name compared to the series name
'url': 'http://www.crunchyroll.com/nyarko-san-another-crawling-chaos/episode-1-test-590532', 'url': 'http://www.crunchyroll.com/nyarko-san-another-crawling-chaos/episode-1-test-590532',
'info_dict': { 'info_dict': {
'id': '590532', 'id': '590532',
'ext': 'mp4', 'ext': 'mp4',
'title': compat_str, 'title': 'Haiyoru! Nyaruani (ONA) Episode 1 Test',
'description': compat_str, 'description': 'Mahiro and Nyaruko talk about official certification.',
'uploader': 'TV TOKYO', 'uploader': 'TV TOKYO',
'timestamp': 1330956000,
'upload_date': '20120305', 'upload_date': '20120305',
'series': 'Nyarko-san: Another Crawling Chaos', 'series': 'Nyarko-san: Another Crawling Chaos',
'season': 'Haiyoru! Nyaruani (ONA)', 'season': 'Haiyoru! Nyaruani (ONA)',
@ -263,19 +269,6 @@ class CrunchyrollIE(CrunchyrollBaseIE, VRVIE):
'1080': ('80', '108'), '1080': ('80', '108'),
} }
def _download_webpage(self, url_or_request, *args, **kwargs):
request = (url_or_request if isinstance(url_or_request, compat_urllib_request.Request)
else sanitized_Request(url_or_request))
# Accept-Language must be set explicitly to accept any language to avoid issues
# similar to https://github.com/ytdl-org/youtube-dl/issues/6797.
# Along with the IP address, Crunchyroll uses Accept-Language to guess whether georestriction
# should be imposed or not (from what I can see, it just takes the first language,
# ignoring the priority, and requires it to correspond to the IP). Incidentally, this causes
# Crunchyroll to not work in georestriction cases in some browsers that don't place
# the locale lang first in the header. However, allowing any language seems to work around the issue.
request.add_header('Accept-Language', '*')
return super(CrunchyrollBaseIE, self)._download_webpage(request, *args, **kwargs)
def _decrypt_subtitles(self, data, iv, id): def _decrypt_subtitles(self, data, iv, id):
data = bytes_to_intlist(compat_b64decode(data)) data = bytes_to_intlist(compat_b64decode(data))
iv = bytes_to_intlist(compat_b64decode(iv)) iv = bytes_to_intlist(compat_b64decode(iv))
@ -449,21 +442,23 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
webpage, 'language', default=None, group='lang') webpage, 'language', default=None, group='lang')
video_title = self._html_search_regex( video_title = self._html_search_regex(
(r'(?s)<h1[^>]*>((?:(?!<h1).)*?<(?:span[^>]+itemprop=["\']title["\']|meta[^>]+itemprop=["\']position["\'])[^>]*>(?:(?!<h1).)+?)</h1>', r'(?s)<h1[^>]*>((?:(?!<h1).)*?<span[^>]+itemprop=["\']title["\'][^>]*>(?:(?!<h1).)+?)</h1>',
r'<title>(.+?),\s+-\s+.+? Crunchyroll'), webpage, 'video_title')
webpage, 'video_title', default=None)
if not video_title:
video_title = re.sub(r'^Watch\s+', '', self._og_search_description(webpage))
video_title = re.sub(r' {2,}', ' ', video_title) video_title = re.sub(r' {2,}', ' ', video_title)
video_description = (self._parse_json(self._html_search_regex( video_description = (self._parse_json(self._html_search_regex(
r'<script[^>]*>\s*.+?\[media_id=%s\].+?({.+?"description"\s*:.+?})\);' % video_id, r'<script[^>]*>\s*.+?\[media_id=%s\].+?({.+?"description"\s*:.+?})\);' % video_id,
webpage, 'description', default='{}'), video_id) or media_metadata).get('description') webpage, 'description', default='{}'), video_id) or media_metadata).get('description')
if video_description: if video_description:
video_description = lowercase_escape(video_description.replace(r'\r\n', '\n')) video_description = lowercase_escape(video_description.replace(r'\r\n', '\n'))
video_upload_date = self._html_search_regex(
[r'<div>Availability for free users:(.+?)</div>', r'<div>[^<>]+<span>\s*(.+?\d{4})\s*</span></div>'],
webpage, 'video_upload_date', fatal=False, flags=re.DOTALL)
if video_upload_date:
video_upload_date = unified_strdate(video_upload_date)
video_uploader = self._html_search_regex( video_uploader = self._html_search_regex(
# try looking for both an uploader that's a link and one that's not # try looking for both an uploader that's a link and one that's not
[r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>'], [r'<a[^>]+href="/publisher/[^"]+"[^>]*>([^<]+)</a>', r'<div>\s*Publisher:\s*<span>\s*(.+?)\s*</span>\s*</div>'],
webpage, 'video_uploader', default=False) webpage, 'video_uploader', fatal=False)
formats = [] formats = []
for stream in media.get('streams', []): for stream in media.get('streams', []):
@ -616,15 +611,14 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
r'(?s)<h\d[^>]+id=["\']showmedia_about_episode_num[^>]+>.+?</h\d>\s*<h4>\s*Season (\d+)', r'(?s)<h\d[^>]+id=["\']showmedia_about_episode_num[^>]+>.+?</h\d>\s*<h4>\s*Season (\d+)',
webpage, 'season number', default=None)) webpage, 'season number', default=None))
info = self._search_json_ld(webpage, video_id, default={}) return {
return merge_dicts({
'id': video_id, 'id': video_id,
'title': video_title, 'title': video_title,
'description': video_description, 'description': video_description,
'duration': duration, 'duration': duration,
'thumbnail': thumbnail, 'thumbnail': thumbnail,
'uploader': video_uploader, 'uploader': video_uploader,
'upload_date': video_upload_date,
'series': series, 'series': series,
'season': season, 'season': season,
'season_number': season_number, 'season_number': season_number,
@ -632,7 +626,7 @@ Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
'episode_number': episode_number, 'episode_number': episode_number,
'subtitles': subtitles, 'subtitles': subtitles,
'formats': formats, 'formats': formats,
}, info) }
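
One side of this hunk returns a plain dict, while the other wraps it in `merge_dicts(..., info)` so that JSON-LD values fill in fields the page scraping left empty. A minimal sketch of that merge rule, assuming a simplified stand-in for youtube-dl's `merge_dicts` (earlier dicts win, `None` is skipped, and an empty string may be replaced by a later non-empty string):

```python
def merge_dicts(*dicts):
    """Simplified stand-in: merge left to right, keeping the first useful value."""
    merged = {}
    for a_dict in dicts:
        for key, value in a_dict.items():
            if value is None:
                continue  # None never overrides or fills anything
            replaces_empty = (
                key in merged
                and isinstance(value, str) and value
                and isinstance(merged[key], str) and not merged[key])
            if key not in merged or replaces_empty:
                merged[key] = value
    return merged


scraped = {'id': '702409', 'title': '', 'uploader': None}
json_ld = {'id': 'other', 'title': 'From JSON-LD', 'uploader': 'Publisher'}
result = merge_dicts(scraped, json_ld)
```

Putting the scraped dict first means JSON-LD only supplies what scraping could not, which matches the argument order in the hunk.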
class CrunchyrollShowPlaylistIE(CrunchyrollBaseIE): class CrunchyrollShowPlaylistIE(CrunchyrollBaseIE):
@ -667,8 +661,9 @@ class CrunchyrollShowPlaylistIE(CrunchyrollBaseIE):
webpage = self._download_webpage( webpage = self._download_webpage(
self._add_skip_wall(url), show_id, self._add_skip_wall(url), show_id,
headers=self.geo_verification_headers()) headers=self.geo_verification_headers())
title = self._html_search_meta('name', webpage, default=None) title = self._html_search_regex(
r'(?s)<h1[^>]*>\s*<span itemprop="name">(.*?)</span>',
webpage, 'title')
episode_paths = re.findall( episode_paths = re.findall(
r'(?s)<li id="showview_videos_media_(\d+)"[^>]+>.*?<a href="([^"]+)"', r'(?s)<li id="showview_videos_media_(\d+)"[^>]+>.*?<a href="([^"]+)"',
webpage) webpage)
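
The `_download_webpage` override shown in the Crunchyroll hunks above forces `Accept-Language: *` on every request, whether it receives a URL or an existing request object. A minimal standalone sketch of the same idea follows. It uses the standard library's `urllib.request.Request` in place of `sanitized_Request` and `compat_urllib_request` (an assumption made so the snippet runs on its own), and `with_any_language` is a hypothetical helper name.

```python
import urllib.request


def with_any_language(url_or_request):
    """Return a Request that accepts any language, reusing one if given."""
    request = (url_or_request
               if isinstance(url_or_request, urllib.request.Request)
               else urllib.request.Request(url_or_request))
    # Overwrites any Accept-Language value already set on the request.
    request.add_header('Accept-Language', '*')
    return request
```

Note that `urllib` stores header names capitalized as `Accept-language`, so that is the spelling to pass to `get_header` when inspecting the result.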


@ -3,7 +3,6 @@ from __future__ import unicode_literals
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import unified_timestamp from ..utils import unified_timestamp
from .youtube import YoutubeIE
class CtsNewsIE(InfoExtractor): class CtsNewsIE(InfoExtractor):
@ -15,8 +14,8 @@ class CtsNewsIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': '201501291578109', 'id': '201501291578109',
'ext': 'mp4', 'ext': 'mp4',
'title': '以色列.真主黨交火 3人死亡 - 華視新聞網', 'title': '以色列.真主黨交火 3人死亡',
'description': '以色列和黎巴嫩真主黨,爆發五年最嚴重衝突,雙方砲轟交火,兩名以軍死亡,還有一名西班牙籍的聯合國維和人員也不幸罹難。大陸陝西、河南、安徽、江蘇和湖北五個省份出現大暴雪,嚴重影響陸空交通,不過九華山卻出現...', 'description': '以色列和黎巴嫩真主黨,爆發五年最嚴重衝突,雙方砲轟交火,兩名以軍死亡,還有一名西班牙籍的聯合國維和人...',
'timestamp': 1422528540, 'timestamp': 1422528540,
'upload_date': '20150129', 'upload_date': '20150129',
} }
@ -27,7 +26,7 @@ class CtsNewsIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': '201309031304098', 'id': '201309031304098',
'ext': 'mp4', 'ext': 'mp4',
'title': '韓國31歲童顏男 貌如十多歲小孩 - 華視新聞網', 'title': '韓國31歲童顏男 貌如十多歲小孩',
'description': '越有年紀的人越希望看起來年輕一點而南韓卻有一位31歲的男子看起來像是11、12歲的小孩身...', 'description': '越有年紀的人越希望看起來年輕一點而南韓卻有一位31歲的男子看起來像是11、12歲的小孩身...',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
'timestamp': 1378205880, 'timestamp': 1378205880,
@ -63,7 +62,8 @@ class CtsNewsIE(InfoExtractor):
video_url = mp4_feed['source_url'] video_url = mp4_feed['source_url']
else: else:
self.to_screen('Not CTSPlayer video, trying Youtube...') self.to_screen('Not CTSPlayer video, trying Youtube...')
youtube_url = YoutubeIE._extract_url(page) youtube_url = self._search_regex(
r'src="(//www\.youtube\.com/embed/[^"]+)"', page, 'youtube url')
return self.url_result(youtube_url, ie='Youtube') return self.url_result(youtube_url, ie='Youtube')


@ -45,8 +45,8 @@ class DailyMailIE(InfoExtractor):
sources_url = (try_get( sources_url = (try_get(
video_data, video_data,
(lambda x: x['plugins']['sources']['url'], (lambda x: x['plugins']['sources']['url'],
lambda x: x['sources']['url']), compat_str) lambda x: x['sources']['url']), compat_str) or
or 'http://www.dailymail.co.uk/api/player/%s/video-sources.json' % video_id) 'http://www.dailymail.co.uk/api/player/%s/video-sources.json' % video_id)
video_sources = self._download_json(sources_url, video_id) video_sources = self._download_json(sources_url, video_id)
body = video_sources.get('body') body = video_sources.get('body')
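
The hunk above only moves the `or` operator between the end of one line and the start of the next; the lookup logic is the same on both sides. A minimal sketch of that lookup, assuming a simplified stand-in modelled on youtube-dl's `try_get` and a hypothetical `sources_url` wrapper:

```python
def try_get(src, getters, expected_type=None):
    """Simplified stand-in: first getter that succeeds with the expected type."""
    if not isinstance(getters, (list, tuple)):
        getters = [getters]
    for getter in getters:
        try:
            value = getter(src)
        except (AttributeError, KeyError, TypeError, IndexError):
            continue
        if expected_type is None or isinstance(value, expected_type):
            return value
    return None


def sources_url(video_data, video_id):
    """Hypothetical wrapper: embedded URL if present, else the API URL."""
    return (try_get(
        video_data,
        (lambda x: x['plugins']['sources']['url'],
         lambda x: x['sources']['url']), str)
        or 'http://www.dailymail.co.uk/api/player/%s/video-sources.json' % video_id)
```

Because `try_get` returns `None` on any miss or type mismatch, the trailing `or` always has a usable fallback.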


@ -1,105 +1,64 @@
# coding: utf-8 # coding: utf-8
from __future__ import unicode_literals from __future__ import unicode_literals
import base64
import functools import functools
import hashlib
import itertools
import json import json
import random
import re import re
import string
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import compat_HTTPError from ..compat import compat_struct_pack
from ..utils import ( from ..utils import (
age_restricted, determine_ext,
clean_html, error_to_compat_str,
ExtractorError, ExtractorError,
int_or_none, int_or_none,
mimetype2ext,
OnDemandPagedList, OnDemandPagedList,
parse_iso8601,
sanitized_Request,
str_to_int,
try_get, try_get,
unescapeHTML, unescapeHTML,
update_url_query,
url_or_none,
urlencode_postdata, urlencode_postdata,
) )
class DailymotionBaseInfoExtractor(InfoExtractor): class DailymotionBaseInfoExtractor(InfoExtractor):
_FAMILY_FILTER = None
_HEADERS = {
'Content-Type': 'application/json',
'Origin': 'https://www.dailymotion.com',
}
_NETRC_MACHINE = 'dailymotion'
def _get_dailymotion_cookies(self):
return self._get_cookies('https://www.dailymotion.com/')
@staticmethod @staticmethod
def _get_cookie_value(cookies, name): def _build_request(url):
cookie = cookies.get(name) """Build a request with the family filter disabled"""
if cookie: request = sanitized_Request(url)
return cookie.value request.add_header('Cookie', 'family_filter=off; ff=off')
return request
def _set_dailymotion_cookie(self, name, value): def _download_webpage_handle_no_ff(self, url, *args, **kwargs):
self._set_cookie('www.dailymotion.com', name, value) request = self._build_request(url)
return self._download_webpage_handle(request, *args, **kwargs)
def _real_initialize(self): def _download_webpage_no_ff(self, url, *args, **kwargs):
cookies = self._get_dailymotion_cookies() request = self._build_request(url)
ff = self._get_cookie_value(cookies, 'ff') return self._download_webpage(request, *args, **kwargs)
self._FAMILY_FILTER = ff == 'on' if ff else age_restricted(18, self._downloader.params.get('age_limit'))
self._set_dailymotion_cookie('ff', 'on' if self._FAMILY_FILTER else 'off')
def _call_api(self, object_type, xid, object_fields, note, filter_extra=None):
if not self._HEADERS.get('Authorization'):
cookies = self._get_dailymotion_cookies()
token = self._get_cookie_value(cookies, 'access_token') or self._get_cookie_value(cookies, 'client_token')
if not token:
data = {
'client_id': 'f1a362d288c1b98099c7',
'client_secret': 'eea605b96e01c796ff369935357eca920c5da4c5',
}
username, password = self._get_login_info()
if username:
data.update({
'grant_type': 'password',
'password': password,
'username': username,
})
else:
data['grant_type'] = 'client_credentials'
try:
token = self._download_json(
'https://graphql.api.dailymotion.com/oauth/token',
None, 'Downloading Access Token',
data=urlencode_postdata(data))['access_token']
except ExtractorError as e:
if isinstance(e.cause, compat_HTTPError) and e.cause.code == 400:
raise ExtractorError(self._parse_json(
e.cause.read().decode(), xid)['error_description'], expected=True)
raise
self._set_dailymotion_cookie('access_token' if username else 'client_token', token)
self._HEADERS['Authorization'] = 'Bearer ' + token
resp = self._download_json(
'https://graphql.api.dailymotion.com/', xid, note, data=json.dumps({
'query': '''{
%s(xid: "%s"%s) {
%s
}
}''' % (object_type, xid, ', ' + filter_extra if filter_extra else '', object_fields),
}).encode(), headers=self._HEADERS)
obj = resp['data'][object_type]
if not obj:
raise ExtractorError(resp['errors'][0]['message'], expected=True)
return obj
class DailymotionIE(DailymotionBaseInfoExtractor): class DailymotionIE(DailymotionBaseInfoExtractor):
_VALID_URL = r'''(?ix) _VALID_URL = r'(?i)https?://(?:(www|touch)\.)?dailymotion\.[a-z]{2,3}/(?:(?:(?:embed|swf|#)/)?video|swf)/(?P<id>[^/?_]+)'
https?://
(?:
(?:(?:www|touch)\.)?dailymotion\.[a-z]{2,3}/(?:(?:(?:embed|swf|\#)/)?video|swf)|
(?:www\.)?lequipe\.fr/video
)
/(?P<id>[^/?_]+)(?:.+?\bplaylist=(?P<playlist_id>x[0-9a-z]+))?
'''
IE_NAME = 'dailymotion' IE_NAME = 'dailymotion'
_FORMATS = [
('stream_h264_ld_url', 'ld'),
('stream_h264_url', 'standard'),
('stream_h264_hq_url', 'hq'),
('stream_h264_hd_url', 'hd'),
('stream_h264_hd1080_url', 'hd180'),
]
_TESTS = [{ _TESTS = [{
'url': 'http://www.dailymotion.com/video/x5kesuj_office-christmas-party-review-jason-bateman-olivia-munn-t-j-miller_news', 'url': 'http://www.dailymotion.com/video/x5kesuj_office-christmas-party-review-jason-bateman-olivia-munn-t-j-miller_news',
'md5': '074b95bdee76b9e3654137aee9c79dfe', 'md5': '074b95bdee76b9e3654137aee9c79dfe',
@ -108,6 +67,7 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
'ext': 'mp4', 'ext': 'mp4',
'title': 'Office Christmas Party Review Jason Bateman, Olivia Munn, T.J. Miller', 'title': 'Office Christmas Party Review Jason Bateman, Olivia Munn, T.J. Miller',
'description': 'Office Christmas Party Review - Jason Bateman, Olivia Munn, T.J. Miller', 'description': 'Office Christmas Party Review - Jason Bateman, Olivia Munn, T.J. Miller',
'thumbnail': r're:^https?:.*\.(?:jpg|png)$',
'duration': 187, 'duration': 187,
'timestamp': 1493651285, 'timestamp': 1493651285,
'upload_date': '20170501', 'upload_date': '20170501',
@ -173,171 +133,274 @@ class DailymotionIE(DailymotionBaseInfoExtractor):
}, { }, {
'url': 'http://www.dailymotion.com/swf/x3ss1m_funny-magic-trick-barry-and-stuart_fun', 'url': 'http://www.dailymotion.com/swf/x3ss1m_funny-magic-trick-barry-and-stuart_fun',
'only_matching': True, 'only_matching': True,
}, {
'url': 'https://www.lequipe.fr/video/x791mem',
'only_matching': True,
}, {
'url': 'https://www.lequipe.fr/video/k7MtHciueyTcrFtFKA2',
'only_matching': True,
}, {
'url': 'https://www.dailymotion.com/video/x3z49k?playlist=xv4bw',
'only_matching': True,
}] }]
_GEO_BYPASS = False
_COMMON_MEDIA_FIELDS = '''description
geoblockedCountries {
allowed
}
xid'''
@staticmethod @staticmethod
def _extract_urls(webpage): def _extract_urls(webpage):
urls = []
# Look for embedded Dailymotion player # Look for embedded Dailymotion player
# https://developer.dailymotion.com/player#player-parameters matches = re.findall(
for mobj in re.finditer( r'<(?:(?:embed|iframe)[^>]+?src=|input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=)(["\'])(?P<url>(?:https?:)?//(?:www\.)?dailymotion\.com/(?:embed|swf)/video/.+?)\1', webpage)
r'<(?:(?:embed|iframe)[^>]+?src=|input[^>]+id=[\'"]dmcloudUrlEmissionSelect[\'"][^>]+value=)(["\'])(?P<url>(?:https?:)?//(?:www\.)?dailymotion\.com/(?:embed|swf)/video/.+?)\1', webpage): return list(map(lambda m: unescapeHTML(m[1]), matches))
urls.append(unescapeHTML(mobj.group('url')))
for mobj in re.finditer(
r'(?s)DM\.player\([^,]+,\s*{.*?video[\'"]?\s*:\s*["\']?(?P<id>[0-9a-zA-Z]+).+?}\s*\);', webpage):
urls.append('https://www.dailymotion.com/embed/video/' + mobj.group('id'))
return urls
def _real_extract(self, url): def _real_extract(self, url):
video_id, playlist_id = re.match(self._VALID_URL, url).groups() video_id = self._match_id(url)
if playlist_id: webpage = self._download_webpage_no_ff(
if not self._downloader.params.get('noplaylist'): 'https://www.dailymotion.com/video/%s' % video_id, video_id)
self.to_screen('Downloading playlist %s - add --no-playlist to just download video' % playlist_id)
return self.url_result(
'http://www.dailymotion.com/playlist/' + playlist_id,
'DailymotionPlaylist', playlist_id)
self.to_screen('Downloading just video %s because of --no-playlist' % video_id)
password = self._downloader.params.get('videopassword') age_limit = self._rta_search(webpage)
media = self._call_api(
'media', video_id, '''... on Video {
%s
stats {
likes {
total
}
views {
total
}
}
}
... on Live {
%s
audienceCount
isOnAir
}''' % (self._COMMON_MEDIA_FIELDS, self._COMMON_MEDIA_FIELDS), 'Downloading media JSON metadata',
'password: "%s"' % self._downloader.params.get('videopassword') if password else None)
xid = media['xid']
metadata = self._download_json( description = self._og_search_description(
'https://www.dailymotion.com/player/metadata/video/' + xid, webpage, default=None) or self._html_search_meta(
xid, 'Downloading metadata JSON', 'description', webpage, 'description')
query={'app': 'com.dailymotion.neon'})
error = metadata.get('error') view_count_str = self._search_regex(
if error: (r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserPlays:([\s\d,.]+)"',
title = error.get('title') or error['raw_message'] r'video_views_count[^>]+>\s+([\s\d\,.]+)'),
# See https://developer.dailymotion.com/api#access-error webpage, 'view count', default=None)
if error.get('code') == 'DM007': if view_count_str:
allowed_countries = try_get(media, lambda x: x['geoblockedCountries']['allowed'], list) view_count_str = re.sub(r'\s', '', view_count_str)
self.raise_geo_restricted(msg=title, countries=allowed_countries) view_count = str_to_int(view_count_str)
raise ExtractorError( comment_count = int_or_none(self._search_regex(
'%s said: %s' % (self.IE_NAME, title), expected=True) r'<meta[^>]+itemprop="interactionCount"[^>]+content="UserComments:(\d+)"',
webpage, 'comment count', default=None))
title = metadata['title'] player_v5 = self._search_regex(
is_live = media.get('isOnAir') [r'buildPlayer\(({.+?})\);\n', # See https://github.com/ytdl-org/youtube-dl/issues/7826
formats = [] r'playerV5\s*=\s*dmp\.create\([^,]+?,\s*({.+?})\);',
for quality, media_list in metadata['qualities'].items(): r'buildPlayer\(({.+?})\);',
for m in media_list: r'var\s+config\s*=\s*({.+?});',
media_url = m.get('url') # New layout regex (see https://github.com/ytdl-org/youtube-dl/issues/13580)
media_type = m.get('type') r'__PLAYER_CONFIG__\s*=\s*({.+?});'],
if not media_url or media_type == 'application/vnd.lumberjack.manifest': webpage, 'player v5', default=None)
continue if player_v5:
if media_type == 'application/x-mpegURL': player = self._parse_json(player_v5, video_id, fatal=False) or {}
formats.extend(self._extract_m3u8_formats( metadata = try_get(player, lambda x: x['metadata'], dict)
media_url, video_id, 'mp4', if not metadata:
'm3u8' if is_live else 'm3u8_native', metadata_url = url_or_none(try_get(
m3u8_id='hls', fatal=False)) player, lambda x: x['context']['metadata_template_url1']))
if metadata_url:
metadata_url = metadata_url.replace(':videoId', video_id)
else: else:
f = { metadata_url = update_url_query(
'url': media_url, 'https://www.dailymotion.com/player/metadata/video/%s'
'format_id': 'http-' + quality, % video_id, {
} 'embedder': url,
m = re.search(r'/H264-(\d+)x(\d+)(?:-(60)/)?', media_url) 'integration': 'inline',
if m: 'GK_PV5_NEON': '1',
width, height, fps = map(int_or_none, m.groups())
f.update({
'fps': fps,
'height': height,
'width': width,
}) })
formats.append(f) metadata = self._download_json(
for f in formats: metadata_url, video_id, 'Downloading metadata JSON')
f['url'] = f['url'].split('#')[0]
if not f.get('fps') and f['format_id'].endswith('@60'): if try_get(metadata, lambda x: x['error']['type']) == 'password_protected':
f['fps'] = 60 password = self._downloader.params.get('videopassword')
if password:
r = int(metadata['id'][1:], 36)
us64e = lambda x: base64.urlsafe_b64encode(x).decode().strip('=')
t = ''.join(random.choice(string.ascii_letters) for i in range(10))
n = us64e(compat_struct_pack('I', r))
i = us64e(hashlib.md5(('%s%d%s' % (password, r, t)).encode()).digest())
metadata = self._download_json(
'http://www.dailymotion.com/player/metadata/video/p' + i + t + n, video_id)
self._check_error(metadata)
formats = []
for quality, media_list in metadata['qualities'].items():
for media in media_list:
media_url = media.get('url')
if not media_url:
continue
type_ = media.get('type')
if type_ == 'application/vnd.lumberjack.manifest':
continue
ext = mimetype2ext(type_) or determine_ext(media_url)
if ext == 'm3u8':
m3u8_formats = self._extract_m3u8_formats(
media_url, video_id, 'mp4', preference=-1,
m3u8_id='hls', fatal=False)
for f in m3u8_formats:
f['url'] = f['url'].split('#')[0]
formats.append(f)
elif ext == 'f4m':
formats.extend(self._extract_f4m_formats(
media_url, video_id, preference=-1, f4m_id='hds', fatal=False))
else:
f = {
'url': media_url,
'format_id': 'http-%s' % quality,
'ext': ext,
}
m = re.search(r'H264-(?P<width>\d+)x(?P<height>\d+)', media_url)
if m:
f.update({
'width': int(m.group('width')),
'height': int(m.group('height')),
})
formats.append(f)
self._sort_formats(formats)
title = metadata['title']
duration = int_or_none(metadata.get('duration'))
timestamp = int_or_none(metadata.get('created_time'))
thumbnail = metadata.get('poster_url')
uploader = metadata.get('owner', {}).get('screenname')
uploader_id = metadata.get('owner', {}).get('id')
subtitles = {}
subtitles_data = metadata.get('subtitles', {}).get('data', {})
if subtitles_data and isinstance(subtitles_data, dict):
for subtitle_lang, subtitle in subtitles_data.items():
subtitles[subtitle_lang] = [{
'ext': determine_ext(subtitle_url),
'url': subtitle_url,
} for subtitle_url in subtitle.get('urls', [])]
return {
'id': video_id,
'title': title,
'description': description,
'thumbnail': thumbnail,
'duration': duration,
'timestamp': timestamp,
'uploader': uploader,
'uploader_id': uploader_id,
'age_limit': age_limit,
'view_count': view_count,
'comment_count': comment_count,
'formats': formats,
'subtitles': subtitles,
}
# vevo embed
vevo_id = self._search_regex(
r'<link rel="video_src" href="[^"]*?vevo\.com[^"]*?video=(?P<id>[\w]*)',
webpage, 'vevo embed', default=None)
if vevo_id:
return self.url_result('vevo:%s' % vevo_id, 'Vevo')
# fallback old player
embed_page = self._download_webpage_no_ff(
'https://www.dailymotion.com/embed/video/%s' % video_id,
video_id, 'Downloading embed page')
timestamp = parse_iso8601(self._html_search_meta(
'video:release_date', webpage, 'upload date'))
info = self._parse_json(
self._search_regex(
r'var info = ({.*?}),$', embed_page,
'video info', flags=re.MULTILINE),
video_id)
self._check_error(info)
formats = []
for (key, format_id) in self._FORMATS:
video_url = info.get(key)
if video_url is not None:
m_size = re.search(r'H264-(\d+)x(\d+)', video_url)
if m_size is not None:
width, height = map(int_or_none, (m_size.group(1), m_size.group(2)))
else:
width, height = None, None
formats.append({
'url': video_url,
'ext': 'mp4',
'format_id': format_id,
'width': width,
'height': height,
})
self._sort_formats(formats) self._sort_formats(formats)
subtitles = {} # subtitles
subtitles_data = try_get(metadata, lambda x: x['subtitles']['data'], dict) or {} video_subtitles = self.extract_subtitles(video_id, webpage)
for subtitle_lang, subtitle in subtitles_data.items():
subtitles[subtitle_lang] = [{
'url': subtitle_url,
} for subtitle_url in subtitle.get('urls', [])]
thumbnails = [] title = self._og_search_title(webpage, default=None)
for height, poster_url in metadata.get('posters', {}).items(): if title is None:
thumbnails.append({ title = self._html_search_regex(
'height': int_or_none(height), r'(?s)<span\s+id="video_title"[^>]*>(.*?)</span>', webpage,
'id': height, 'title')
'url': poster_url,
})
owner = metadata.get('owner') or {}
stats = media.get('stats') or {}
get_count = lambda x: int_or_none(try_get(stats, lambda y: y[x + 's']['total']))
return { return {
'id': video_id, 'id': video_id,
'title': self._live_title(title) if is_live else title,
'description': clean_html(media.get('description')),
'thumbnails': thumbnails,
'duration': int_or_none(metadata.get('duration')) or None,
'timestamp': int_or_none(metadata.get('created_time')),
'uploader': owner.get('screenname'),
'uploader_id': owner.get('id') or metadata.get('screenname'),
'age_limit': 18 if metadata.get('explicit') else 0,
'tags': metadata.get('tags'),
'view_count': get_count('view') or int_or_none(media.get('audienceCount')),
'like_count': get_count('like'),
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'uploader': info['owner.screenname'],
'is_live': is_live, 'timestamp': timestamp,
'title': title,
'description': description,
'subtitles': video_subtitles,
'thumbnail': info['thumbnail_url'],
'age_limit': age_limit,
'view_count': view_count,
'duration': info['duration']
} }
def _check_error(self, info):
error = info.get('error')
if error:
title = error.get('title') or error['message']
# See https://developer.dailymotion.com/api#access-error
if error.get('code') == 'DM007':
self.raise_geo_restricted(msg=title)
raise ExtractorError(
'%s said: %s' % (self.IE_NAME, title), expected=True)
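
The password-protected branch of the page-scraping `_real_extract` above builds a metadata path from the video id, the password, and a random salt. The sketch below reproduces that computation in isolation. It assumes `struct.pack` as a stand-in for `compat_struct_pack`, and the `salt` parameter and the `password_metadata_path` name are hypothetical additions so the result can be made reproducible.

```python
import base64
import hashlib
import random
import string
import struct


def us64e(data):
    """URL-safe base64 without '=' padding."""
    return base64.urlsafe_b64encode(data).decode().strip('=')


def password_metadata_path(xid, password, salt=None):
    """Return the 'p...' path segment used for password-protected metadata."""
    r = int(xid[1:], 36)  # numeric id: the xid without its leading 'x', base 36
    t = salt or ''.join(random.choice(string.ascii_letters) for _ in range(10))
    n = us64e(struct.pack('I', r))  # 4-byte unsigned int, native byte order
    i = us64e(hashlib.md5(('%s%d%s' % (password, r, t)).encode()).digest())
    return 'p' + i + t + n


path = password_metadata_path('x5kesuj', 'secret', salt='abcdefghij')
```

`struct.pack('I', ...)` only accepts values that fit in 32 bits, so this sketch would raise for ids whose base-36 value exceeds that range.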
class DailymotionPlaylistBaseIE(DailymotionBaseInfoExtractor): def _get_subtitles(self, video_id, webpage):
try:
sub_list = self._download_webpage(
'https://api.dailymotion.com/video/%s/subtitles?fields=id,language,url' % video_id,
video_id, note=False)
except ExtractorError as err:
self._downloader.report_warning('unable to download video subtitles: %s' % error_to_compat_str(err))
return {}
info = json.loads(sub_list)
if (info['total'] > 0):
sub_lang_list = dict((l['language'], [{'url': l['url'], 'ext': 'srt'}]) for l in info['list'])
return sub_lang_list
self._downloader.report_warning('video doesn\'t have subtitles')
return {}
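
`_get_subtitles` above turns the subtitles API response into the `{language: [{'url': ..., 'ext': 'srt'}]}` mapping that youtube-dl expects. A minimal sketch of just that transformation, assuming a made-up payload with the `total` and `list` fields the method reads (`subtitles_from_api` is a hypothetical name):

```python
import json


def subtitles_from_api(sub_list):
    """Map a JSON subtitles listing to {language: [subtitle entry]}."""
    info = json.loads(sub_list)
    if info['total'] > 0:
        return dict(
            (item['language'], [{'url': item['url'], 'ext': 'srt'}])
            for item in info['list'])
    return {}


# Hypothetical payload shaped like the fields requested via ?fields=id,language,url
payload = json.dumps({
    'total': 1,
    'list': [{'id': 'abc', 'language': 'en', 'url': 'http://example.com/en.srt'}],
})
subtitles = subtitles_from_api(payload)
```

If two entries share a language, the later one replaces the earlier one, since each language maps to a single-element list.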
class DailymotionPlaylistIE(DailymotionBaseInfoExtractor):
IE_NAME = 'dailymotion:playlist'
_VALID_URL = r'(?:https?://)?(?:www\.)?dailymotion\.[a-z]{2,3}/playlist/(?P<id>x[0-9a-z]+)'
_TESTS = [{
'url': 'http://www.dailymotion.com/playlist/xv4bw_nqtv_sport/1#video=xl8v3q',
'info_dict': {
'title': 'SPORT',
'id': 'xv4bw',
},
'playlist_mincount': 20,
}]
_PAGE_SIZE = 100 _PAGE_SIZE = 100
def _fetch_page(self, playlist_id, page): def _fetch_page(self, playlist_id, authorizaion, page):
page += 1 page += 1
videos = self._call_api( videos = self._download_json(
self._OBJECT_TYPE, playlist_id, 'https://graphql.api.dailymotion.com',
'''videos(allowExplicit: %s, first: %d, page: %d) { playlist_id, 'Downloading page %d' % page,
data=json.dumps({
'query': '''{
collection(xid: "%s") {
videos(first: %d, page: %d) {
pageInfo {
hasNextPage
nextPage
}
edges { edges {
node { node {
xid xid
url url
} }
} }
}''' % ('false' if self._FAMILY_FILTER else 'true', self._PAGE_SIZE, page), }
'Downloading page %d' % page)['videos'] }
}''' % (playlist_id, self._PAGE_SIZE, page)
}).encode(), headers={
'Authorization': authorizaion,
'Origin': 'https://www.dailymotion.com',
})['data']['collection']['videos']
for edge in videos['edges']: for edge in videos['edges']:
node = edge['node'] node = edge['node']
yield self.url_result( yield self.url_result(
@@ -345,49 +408,86 @@ class DailymotionPlaylistBaseIE(DailymotionBaseInfoExtractor):
def _real_extract(self, url): def _real_extract(self, url):
playlist_id = self._match_id(url) playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
api = self._parse_json(self._search_regex(
r'__PLAYER_CONFIG__\s*=\s*({.+?});',
webpage, 'player config'), playlist_id)['context']['api']
auth = self._download_json(
api.get('auth_url', 'https://graphql.api.dailymotion.com/oauth/token'),
playlist_id, data=urlencode_postdata({
'client_id': api.get('client_id', 'f1a362d288c1b98099c7'),
'client_secret': api.get('client_secret', 'eea605b96e01c796ff369935357eca920c5da4c5'),
'grant_type': 'client_credentials',
}))
authorizaion = '%s %s' % (auth.get('token_type', 'Bearer'), auth['access_token'])
entries = OnDemandPagedList(functools.partial( entries = OnDemandPagedList(functools.partial(
self._fetch_page, playlist_id), self._PAGE_SIZE) self._fetch_page, playlist_id, authorizaion), self._PAGE_SIZE)
return self.playlist_result( return self.playlist_result(
entries, playlist_id) entries, playlist_id,
self._og_search_title(webpage))
class DailymotionPlaylistIE(DailymotionPlaylistBaseIE): class DailymotionUserIE(DailymotionBaseInfoExtractor):
IE_NAME = 'dailymotion:playlist'
_VALID_URL = r'(?:https?://)?(?:www\.)?dailymotion\.[a-z]{2,3}/playlist/(?P<id>x[0-9a-z]+)'
_TESTS = [{
'url': 'http://www.dailymotion.com/playlist/xv4bw_nqtv_sport/1#video=xl8v3q',
'info_dict': {
'id': 'xv4bw',
},
'playlist_mincount': 20,
}]
_OBJECT_TYPE = 'collection'
class DailymotionUserIE(DailymotionPlaylistBaseIE):
IE_NAME = 'dailymotion:user' IE_NAME = 'dailymotion:user'
_VALID_URL = r'https?://(?:www\.)?dailymotion\.[a-z]{2,3}/(?!(?:embed|swf|#|video|playlist)/)(?:(?:old/)?user/)?(?P<id>[^/]+)' _VALID_URL = r'https?://(?:www\.)?dailymotion\.[a-z]{2,3}/(?!(?:embed|swf|#|video|playlist)/)(?:(?:old/)?user/)?(?P<user>[^/]+)'
_MORE_PAGES_INDICATOR = r'(?s)<div class="pages[^"]*">.*?<a\s+class="[^"]*?icon-arrow_right[^"]*?"'
_PAGE_TEMPLATE = 'http://www.dailymotion.com/user/%s/%s'
_TESTS = [{ _TESTS = [{
'url': 'https://www.dailymotion.com/user/nqtv', 'url': 'https://www.dailymotion.com/user/nqtv',
'info_dict': { 'info_dict': {
'id': 'nqtv', 'id': 'nqtv',
'title': 'Rémi Gaillard',
}, },
'playlist_mincount': 152, 'playlist_mincount': 100,
}, { }, {
'url': 'http://www.dailymotion.com/user/UnderProject', 'url': 'http://www.dailymotion.com/user/UnderProject',
'info_dict': { 'info_dict': {
'id': 'UnderProject', 'id': 'UnderProject',
'title': 'UnderProject',
}, },
'playlist_mincount': 1000, 'playlist_mincount': 1800,
'expected_warnings': [
'Stopped at duplicated page',
],
'skip': 'Takes too long time', 'skip': 'Takes too long time',
}, {
'url': 'https://www.dailymotion.com/user/nqtv',
'info_dict': {
'id': 'nqtv',
},
'playlist_mincount': 148,
'params': {
'age_limit': 0,
},
}] }]
_OBJECT_TYPE = 'channel'
def _extract_entries(self, id):
video_ids = set()
processed_urls = set()
for pagenum in itertools.count(1):
page_url = self._PAGE_TEMPLATE % (id, pagenum)
webpage, urlh = self._download_webpage_handle_no_ff(
page_url, id, 'Downloading page %s' % pagenum)
if urlh.geturl() in processed_urls:
self.report_warning('Stopped at duplicated page %s, which is the same as %s' % (
page_url, urlh.geturl()), id)
break
processed_urls.add(urlh.geturl())
for video_id in re.findall(r'data-xid="(.+?)"', webpage):
if video_id not in video_ids:
yield self.url_result(
'http://www.dailymotion.com/video/%s' % video_id,
DailymotionIE.ie_key(), video_id)
video_ids.add(video_id)
if re.search(self._MORE_PAGES_INDICATOR, webpage) is None:
break
def _real_extract(self, url):
mobj = re.match(self._VALID_URL, url)
user = mobj.group('user')
webpage = self._download_webpage(
'https://www.dailymotion.com/user/%s' % user, user)
full_user = unescapeHTML(self._html_search_regex(
r'<a class="nav-image" title="([^"]+)" href="/%s">' % re.escape(user),
webpage, 'user'))
return {
'_type': 'playlist',
'id': user,
'title': full_user,
'entries': self._extract_entries(user),
}

View File

@@ -0,0 +1,154 @@
from __future__ import unicode_literals
import base64
import json
import random
import re
from .common import InfoExtractor
from ..aes import (
aes_cbc_decrypt,
aes_cbc_encrypt,
)
from ..compat import compat_b64decode
from ..utils import (
bytes_to_intlist,
bytes_to_long,
extract_attributes,
ExtractorError,
intlist_to_bytes,
js_to_json,
int_or_none,
long_to_bytes,
pkcs1pad,
)
class DaisukiMottoIE(InfoExtractor):
_VALID_URL = r'https?://motto\.daisuki\.net/framewatch/embed/[^/]+/(?P<id>[0-9a-zA-Z]{3})'
_TEST = {
'url': 'http://motto.daisuki.net/framewatch/embed/embedDRAGONBALLSUPERUniverseSurvivalsaga/V2e/760/428',
'info_dict': {
'id': 'V2e',
'ext': 'mp4',
'title': '#117 SHOWDOWN OF LOVE! ANDROIDS VS UNIVERSE 2!!',
'subtitles': {
'mul': [{
'ext': 'ttml',
}],
},
},
'params': {
'skip_download': True, # AES-encrypted HLS stream
},
}
# The public key in PEM format can be found in clientlibs_anime_watch.min.js
_RSA_KEY = (0xc5524c25e8e14b366b3754940beeb6f96cb7e2feef0b932c7659a0c5c3bf173d602464c2df73d693b513ae06ff1be8f367529ab30bf969c5640522181f2a0c51ea546ae120d3d8d908595e4eff765b389cde080a1ef7f1bbfb07411cc568db73b7f521cedf270cbfbe0ddbc29b1ac9d0f2d8f4359098caffee6d07915020077d, 65537)
def _real_extract(self, url):
video_id = self._match_id(url)
webpage = self._download_webpage(url, video_id)
flashvars = self._parse_json(self._search_regex(
r'(?s)var\s+flashvars\s*=\s*({.+?});', webpage, 'flashvars'),
video_id, transform_source=js_to_json)
iv = [0] * 16
data = {}
for key in ('device_cd', 'mv_id', 'ss1_prm', 'ss2_prm', 'ss3_prm', 'ss_id'):
data[key] = flashvars.get(key, '')
encrypted_rtn = None
# Some AES keys are rejected, so retry with a freshly generated key on each attempt
for idx in range(5):
aes_key = [random.randint(0, 254) for _ in range(32)]
padded_aeskey = intlist_to_bytes(pkcs1pad(aes_key, 128))
n, e = self._RSA_KEY
encrypted_aeskey = long_to_bytes(pow(bytes_to_long(padded_aeskey), e, n))
init_data = self._download_json(
'http://motto.daisuki.net/fastAPI/bgn/init/',
video_id, query={
's': flashvars.get('s', ''),
'c': flashvars.get('ss3_prm', ''),
'e': url,
'd': base64.b64encode(intlist_to_bytes(aes_cbc_encrypt(
bytes_to_intlist(json.dumps(data)),
aes_key, iv))).decode('ascii'),
'a': base64.b64encode(encrypted_aeskey).decode('ascii'),
}, note='Downloading JSON metadata' + (' (try #%d)' % (idx + 1) if idx > 0 else ''))
if 'rtn' in init_data:
encrypted_rtn = init_data['rtn']
break
self._sleep(5, video_id)
if encrypted_rtn is None:
raise ExtractorError('Failed to fetch init data')
rtn = self._parse_json(
intlist_to_bytes(aes_cbc_decrypt(bytes_to_intlist(
compat_b64decode(encrypted_rtn)),
aes_key, iv)).decode('utf-8').rstrip('\0'),
video_id)
title = rtn['title_str']
formats = self._extract_m3u8_formats(
rtn['play_url'], video_id, ext='mp4', entry_protocol='m3u8_native')
subtitles = {}
caption_url = rtn.get('caption_url')
if caption_url:
# mul: multiple languages
subtitles['mul'] = [{
'url': caption_url,
'ext': 'ttml',
}]
return {
'id': video_id,
'title': title,
'formats': formats,
'subtitles': subtitles,
}
class DaisukiMottoPlaylistIE(InfoExtractor):
_VALID_URL = r'https?://motto\.daisuki\.net/(?P<id>information)/'
_TEST = {
'url': 'http://motto.daisuki.net/information/',
'info_dict': {
'title': 'DRAGON BALL SUPER',
},
'playlist_mincount': 117,
}
def _real_extract(self, url):
playlist_id = self._match_id(url)
webpage = self._download_webpage(url, playlist_id)
entries = []
for li in re.findall(r'(<li[^>]+?data-product_id="[a-zA-Z0-9]{3}"[^>]+>)', webpage):
attr = extract_attributes(li)
ad_id = attr.get('data-ad_id')
product_id = attr.get('data-product_id')
if ad_id and product_id:
episode_id = attr.get('data-chapter')
entries.append({
'_type': 'url_transparent',
'url': 'http://motto.daisuki.net/framewatch/embed/%s/%s/760/428' % (ad_id, product_id),
'episode_id': episode_id,
'episode_number': int_or_none(episode_id),
'ie_key': 'DaisukiMotto',
})
return self.playlist_result(entries, playlist_title='DRAGON BALL SUPER')

View File

@@ -2,21 +2,25 @@
from __future__ import unicode_literals from __future__ import unicode_literals
import re
import itertools import itertools
from .common import InfoExtractor from .common import InfoExtractor
from ..compat import ( from ..compat import (
compat_parse_qs, compat_parse_qs,
compat_urllib_parse_unquote, compat_urllib_parse_unquote,
compat_urllib_parse_urlencode,
compat_urlparse, compat_urlparse,
) )
from ..utils import (
int_or_none,
str_to_int,
xpath_text,
unescapeHTML,
)
class DaumBaseIE(InfoExtractor): class DaumIE(InfoExtractor):
_KAKAO_EMBED_BASE = 'http://tv.kakao.com/embed/player/cliplink/'
class DaumIE(DaumBaseIE):
_VALID_URL = r'https?://(?:(?:m\.)?tvpot\.daum\.net/v/|videofarm\.daum\.net/controller/player/VodPlayer\.swf\?vid=)(?P<id>[^?#&]+)' _VALID_URL = r'https?://(?:(?:m\.)?tvpot\.daum\.net/v/|videofarm\.daum\.net/controller/player/VodPlayer\.swf\?vid=)(?P<id>[^?#&]+)'
IE_NAME = 'daum.net' IE_NAME = 'daum.net'
@@ -32,9 +36,6 @@ class DaumIE(DaumBaseIE):
'duration': 2117, 'duration': 2117,
'view_count': int, 'view_count': int,
'comment_count': int, 'comment_count': int,
'uploader_id': 186139,
'uploader': '콘간지',
'timestamp': 1387310323,
}, },
}, { }, {
'url': 'http://m.tvpot.daum.net/v/65139429', 'url': 'http://m.tvpot.daum.net/v/65139429',
@@ -43,14 +44,11 @@ class DaumIE(DaumBaseIE):
'ext': 'mp4', 'ext': 'mp4',
'title': '1297회, \'아빠 아들로 태어나길 잘 했어\' 민수, 감동의 눈물[아빠 어디가] 20150118', 'title': '1297회, \'아빠 아들로 태어나길 잘 했어\' 민수, 감동의 눈물[아빠 어디가] 20150118',
'description': 'md5:79794514261164ff27e36a21ad229fc5', 'description': 'md5:79794514261164ff27e36a21ad229fc5',
'upload_date': '20150118', 'upload_date': '20150604',
'thumbnail': r're:^https?://.*\.(?:jpg|png)', 'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 154, 'duration': 154,
'view_count': int, 'view_count': int,
'comment_count': int, 'comment_count': int,
'uploader': 'MBC 예능',
'uploader_id': 132251,
'timestamp': 1421604228,
}, },
}, { }, {
'url': 'http://tvpot.daum.net/v/07dXWRka62Y%24', 'url': 'http://tvpot.daum.net/v/07dXWRka62Y%24',
@@ -61,15 +59,12 @@ class DaumIE(DaumBaseIE):
'id': 'vwIpVpCQsT8$', 'id': 'vwIpVpCQsT8$',
'ext': 'flv', 'ext': 'flv',
'title': '01-Korean War ( Trouble on the horizon )', 'title': '01-Korean War ( Trouble on the horizon )',
'description': 'Korean War 01\r\nTrouble on the horizon\r\n전쟁의 먹구름', 'description': '\nKorean War 01\nTrouble on the horizon\n전쟁의 먹구름',
'upload_date': '20080223', 'upload_date': '20080223',
'thumbnail': r're:^https?://.*\.(?:jpg|png)', 'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 249, 'duration': 249,
'view_count': int, 'view_count': int,
'comment_count': int, 'comment_count': int,
'uploader': '까칠한 墮落始祖 황비홍님의',
'uploader_id': 560824,
'timestamp': 1203770745,
}, },
}, { }, {
# Requires dte_type=WEB (#9972) # Requires dte_type=WEB (#9972)
@@ -78,24 +73,60 @@ class DaumIE(DaumBaseIE):
'info_dict': { 'info_dict': {
'id': 's3794Uf1NZeZ1qMpGpeqeRU', 'id': 's3794Uf1NZeZ1qMpGpeqeRU',
'ext': 'mp4', 'ext': 'mp4',
'title': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)', 'title': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny) [쇼! 음악중심] 508회 20160611',
'description': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)\r\n\r\n[쇼! 음악중심] 20160611, 507회', 'description': '러블리즈 - Destiny (나의 지구) (Lovelyz - Destiny)\n\n[쇼! 음악중심] 20160611, 507회',
'upload_date': '20170129', 'upload_date': '20160611',
'uploader': '쇼! 음악중심',
'uploader_id': 2653210,
'timestamp': 1485684628,
}, },
}] }]
def _real_extract(self, url): def _real_extract(self, url):
video_id = compat_urllib_parse_unquote(self._match_id(url)) video_id = compat_urllib_parse_unquote(self._match_id(url))
if not video_id.isdigit(): movie_data = self._download_json(
video_id += '@my' 'http://videofarm.daum.net/controller/api/closed/v1_2/IntegratedMovieData.json',
return self.url_result( video_id, 'Downloading video formats info', query={'vid': video_id, 'dte_type': 'WEB'})
self._KAKAO_EMBED_BASE + video_id, 'Kakao', video_id)
# For urls like http://m.tvpot.daum.net/v/65139429, where the video_id is really a clipid
if not movie_data.get('output_list', {}).get('output_list') and re.match(r'^\d+$', video_id):
return self.url_result('http://tvpot.daum.net/clip/ClipView.do?clipid=%s' % video_id)
info = self._download_xml(
'http://tvpot.daum.net/clip/ClipInfoXml.do', video_id,
'Downloading video info', query={'vid': video_id})
formats = []
for format_el in movie_data['output_list']['output_list']:
profile = format_el['profile']
format_query = compat_urllib_parse_urlencode({
'vid': video_id,
'profile': profile,
})
url_doc = self._download_xml(
'http://videofarm.daum.net/controller/api/open/v1_2/MovieLocation.apixml?' + format_query,
video_id, note='Downloading video data for %s format' % profile)
format_url = url_doc.find('result/url').text
formats.append({
'url': format_url,
'format_id': profile,
'width': int_or_none(format_el.get('width')),
'height': int_or_none(format_el.get('height')),
'filesize': int_or_none(format_el.get('filesize')),
})
self._sort_formats(formats)
return {
'id': video_id,
'title': info.find('TITLE').text,
'formats': formats,
'thumbnail': xpath_text(info, 'THUMB_URL'),
'description': xpath_text(info, 'CONTENTS'),
'duration': int_or_none(xpath_text(info, 'DURATION')),
'upload_date': info.find('REGDTTM').text[:8],
'view_count': str_to_int(xpath_text(info, 'PLAY_CNT')),
'comment_count': str_to_int(xpath_text(info, 'COMMENT_CNT')),
}
class DaumClipIE(DaumBaseIE): class DaumClipIE(InfoExtractor):
_VALID_URL = r'https?://(?:m\.)?tvpot\.daum\.net/(?:clip/ClipView.(?:do|tv)|mypot/View.do)\?.*?clipid=(?P<id>\d+)' _VALID_URL = r'https?://(?:m\.)?tvpot\.daum\.net/(?:clip/ClipView.(?:do|tv)|mypot/View.do)\?.*?clipid=(?P<id>\d+)'
IE_NAME = 'daum.net:clip' IE_NAME = 'daum.net:clip'
_URL_TEMPLATE = 'http://tvpot.daum.net/clip/ClipView.do?clipid=%s' _URL_TEMPLATE = 'http://tvpot.daum.net/clip/ClipView.do?clipid=%s'
@@ -111,9 +142,6 @@ class DaumClipIE(DaumBaseIE):
'thumbnail': r're:^https?://.*\.(?:jpg|png)', 'thumbnail': r're:^https?://.*\.(?:jpg|png)',
'duration': 3868, 'duration': 3868,
'view_count': int, 'view_count': int,
'uploader': 'GOMeXP',
'uploader_id': 6667,
'timestamp': 1377911092,
}, },
}, { }, {
'url': 'http://m.tvpot.daum.net/clip/ClipView.tv?clipid=54999425', 'url': 'http://m.tvpot.daum.net/clip/ClipView.tv?clipid=54999425',
@@ -126,8 +154,22 @@ class DaumClipIE(DaumBaseIE):
def _real_extract(self, url): def _real_extract(self, url):
video_id = self._match_id(url) video_id = self._match_id(url)
return self.url_result( clip_info = self._download_json(
self._KAKAO_EMBED_BASE + video_id, 'Kakao', video_id) 'http://tvpot.daum.net/mypot/json/GetClipInfo.do?clipid=%s' % video_id,
video_id, 'Downloading clip info')['clip_bean']
return {
'_type': 'url_transparent',
'id': video_id,
'url': 'http://tvpot.daum.net/v/%s' % clip_info['vid'],
'title': unescapeHTML(clip_info['title']),
'thumbnail': clip_info.get('thumb_url'),
'description': clip_info.get('contents'),
'duration': int_or_none(clip_info.get('duration')),
'upload_date': clip_info.get('up_date')[:8],
'view_count': int_or_none(clip_info.get('play_count')),
'ie_key': 'Daum',
}
class DaumListIE(InfoExtractor): class DaumListIE(InfoExtractor):

View File

@@ -7,51 +7,50 @@ from .common import InfoExtractor
class DBTVIE(InfoExtractor): class DBTVIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?dagbladet\.no/video/(?:(?:embed|(?P<display_id>[^/]+))/)?(?P<id>[0-9A-Za-z_-]{11}|[a-zA-Z0-9]{8})' _VALID_URL = r'https?://(?:www\.)?dbtv\.no/(?:[^/]+/)?(?P<id>[0-9]+)(?:#(?P<display_id>.+))?'
_TESTS = [{ _TESTS = [{
'url': 'https://www.dagbladet.no/video/PynxJnNWChE/', 'url': 'http://dbtv.no/3649835190001#Skulle_teste_ut_fornøyelsespark,_men_kollegaen_var_bare_opptatt_av_bikinikroppen',
'md5': 'b8f850ba1860adbda668d367f9b77699', 'md5': '2e24f67936517b143a234b4cadf792ec',
'info_dict': { 'info_dict': {
'id': 'PynxJnNWChE', 'id': '3649835190001',
'display_id': 'Skulle_teste_ut_fornøyelsespark,_men_kollegaen_var_bare_opptatt_av_bikinikroppen',
'ext': 'mp4', 'ext': 'mp4',
'title': 'Skulle teste ut fornøyelsespark, men kollegaen var bare opptatt av bikinikroppen', 'title': 'Skulle teste ut fornøyelsespark, men kollegaen var bare opptatt av bikinikroppen',
'description': 'md5:49cc8370e7d66e8a2ef15c3b4631fd3f', 'description': 'md5:1504a54606c4dde3e4e61fc97aa857e0',
'thumbnail': r're:https?://.*\.jpg', 'thumbnail': r're:https?://.*\.jpg',
'upload_date': '20160916', 'timestamp': 1404039863,
'duration': 69, 'upload_date': '20140629',
'uploader_id': 'UCk5pvsyZJoYJBd7_oFPTlRQ', 'duration': 69.544,
'uploader': 'Dagbladet', 'uploader_id': '1027729757001',
}, },
'add_ie': ['Youtube'] 'add_ie': ['BrightcoveNew']
}, { }, {
'url': 'https://www.dagbladet.no/video/embed/xlGmyIeN9Jo/?autoplay=false', 'url': 'http://dbtv.no/3649835190001',
'only_matching': True, 'only_matching': True,
}, { }, {
'url': 'https://www.dagbladet.no/video/truer-iran-bor-passe-dere/PalfB2Cw', 'url': 'http://www.dbtv.no/lazyplayer/4631135248001',
'only_matching': True,
}, {
'url': 'http://dbtv.no/vice/5000634109001',
'only_matching': True,
}, {
'url': 'http://dbtv.no/filmtrailer/3359293614001',
'only_matching': True, 'only_matching': True,
}] }]
@staticmethod @staticmethod
def _extract_urls(webpage): def _extract_urls(webpage):
return [url for _, url in re.findall( return [url for _, url in re.findall(
r'<iframe[^>]+src=(["\'])((?:https?:)?//(?:www\.)?dagbladet\.no/video/embed/(?:[0-9A-Za-z_-]{11}|[a-zA-Z0-9]{8}).*?)\1', r'<iframe[^>]+src=(["\'])((?:https?:)?//(?:www\.)?dbtv\.no/(?:lazy)?player/\d+.*?)\1',
webpage)] webpage)]
def _real_extract(self, url): def _real_extract(self, url):
display_id, video_id = re.match(self._VALID_URL, url).groups() video_id, display_id = re.match(self._VALID_URL, url).groups()
info = {
return {
'_type': 'url_transparent', '_type': 'url_transparent',
'url': 'http://players.brightcove.net/1027729757001/default_default/index.html?videoId=%s' % video_id,
'id': video_id, 'id': video_id,
'display_id': display_id, 'display_id': display_id,
'ie_key': 'BrightcoveNew',
} }
if len(video_id) == 11:
info.update({
'url': video_id,
'ie_key': 'Youtube',
})
else:
info.update({
'url': 'jwplatform:' + video_id,
'ie_key': 'JWPlatform',
})
return info

View File

@@ -16,11 +16,10 @@ class DctpTvIE(InfoExtractor):
_TESTS = [{ _TESTS = [{
# 4x3 # 4x3
'url': 'http://www.dctp.tv/filme/videoinstallation-fuer-eine-kaufhausfassade/', 'url': 'http://www.dctp.tv/filme/videoinstallation-fuer-eine-kaufhausfassade/',
'md5': '3ffbd1556c3fe210724d7088fad723e3',
'info_dict': { 'info_dict': {
'id': '95eaa4f33dad413aa17b4ee613cccc6c', 'id': '95eaa4f33dad413aa17b4ee613cccc6c',
'display_id': 'videoinstallation-fuer-eine-kaufhausfassade', 'display_id': 'videoinstallation-fuer-eine-kaufhausfassade',
'ext': 'm4v', 'ext': 'flv',
'title': 'Videoinstallation für eine Kaufhausfassade', 'title': 'Videoinstallation für eine Kaufhausfassade',
'description': 'Kurzfilm', 'description': 'Kurzfilm',
'thumbnail': r're:^https?://.*\.jpg$', 'thumbnail': r're:^https?://.*\.jpg$',
@@ -28,6 +27,10 @@ class DctpTvIE(InfoExtractor):
'timestamp': 1302172322, 'timestamp': 1302172322,
'upload_date': '20110407', 'upload_date': '20110407',
}, },
'params': {
# rtmp download
'skip_download': True,
},
}, { }, {
# 16x9 # 16x9
'url': 'http://www.dctp.tv/filme/sind-youtuber-die-besseren-lehrer/', 'url': 'http://www.dctp.tv/filme/sind-youtuber-die-besseren-lehrer/',
@@ -56,26 +59,33 @@ class DctpTvIE(InfoExtractor):
uuid = media['uuid'] uuid = media['uuid']
title = media['title'] title = media['title']
is_wide = media.get('is_wide') ratio = '16x9' if media.get('is_wide') else '4x3'
formats = [] play_path = 'mp4:%s_dctp_0500_%s.m4v' % (uuid, ratio)
def add_formats(suffix): servers = self._download_json(
templ = 'https://%%s/%s_dctp_%s.m4v' % (uuid, suffix) 'http://www.dctp.tv/streaming_servers/', display_id,
formats.extend([{ note='Downloading server list JSON', fatal=False)
'format_id': 'hls-' + suffix,
'url': templ % 'cdn-segments.dctp.tv' + '/playlist.m3u8',
'protocol': 'm3u8_native',
}, {
'format_id': 's3-' + suffix,
'url': templ % 'completed-media.s3.amazonaws.com',
}, {
'format_id': 'http-' + suffix,
'url': templ % 'cdn-media.dctp.tv',
}])
add_formats('0500_' + ('16x9' if is_wide else '4x3')) if servers:
if is_wide: endpoint = next(
add_formats('720p') server['endpoint']
for server in servers
if url_or_none(server.get('endpoint')) and
'cloudfront' in server['endpoint'])
else:
endpoint = 'rtmpe://s2pqqn4u96e4j8.cloudfront.net/cfx/st/'
app = self._search_regex(
r'^rtmpe?://[^/]+/(?P<app>.*)$', endpoint, 'app')
formats = [{
'url': endpoint,
'app': app,
'play_path': play_path,
'page_url': url,
'player_url': 'http://svm-prod-dctptv-static.s3.amazonaws.com/dctptv-relaunch2012-110.swf',
'ext': 'flv',
}]
thumbnails = [] thumbnails = []
images = media.get('images') images = media.get('images')

Some files were not shown because too many files have changed in this diff