Sergey M
30eecc6a04
Merge pull request #7296 from jaimeMF/xml_attrib_unicode
...
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x (…
2015-10-31 18:15:21 +00:00
Sergey M․
dbd82a1d4f
[extractor/common] Fix m3u8 extraction on failure
2015-11-01 00:01:34 +06:00
Sergey M․
dc519b5421
[extractor/common] Make ie_key and IE_NAME return unicode string
2015-10-31 23:12:57 +06:00
Jaime Marquínez Ferrándiz
36e6f62cd0
Use a wrapper around xml.etree.ElementTree.fromstring in python 2.x ( #7178 )
...
Attributes aren't unicode objects, so they couldn't be directly used in info_dict fields (for example '--write-description' doesn't work with bytes).
2015-10-25 20:13:16 +01:00
remitamine
3711304510
[extractor/common] get the redirected m3u8_url in _extract_m3u8_formats
2015-10-24 19:01:54 +06:00
Jaime Marquínez Ferrándiz
865d1fbafc
[extractor/common] Remove unused import
2015-10-24 12:39:23 +02:00
Sergey M․
943a1e24b8
[extractor/common] Use more generic URLError in _is_valid_url
2015-10-24 16:25:04 +06:00
Sergey M․
02835c6bf4
[extractor/common] Document repost_count
2015-10-18 09:34:54 +06:00
Sergey M․
448ef1f31c
[extractor/common] Allow angle brackets in attributes in _og_regexes ( #7215 )
2015-10-18 09:11:02 +06:00
Sergey M․
7a6d76a64d
[extractor/common] Require closing quote in _og_regexes ( Closes #7174 )
...
E.g. do not match `property='og:video:type'` when `og:video` is requested.
2015-10-14 20:49:39 +06:00
Sergey M․
4180a3d8b7
[extractor/common] Allow quoteless content attribute in og regexes ( Closes #7115 )
2015-10-10 01:46:01 +06:00
Yen Chi Hsuan
57935b2564
[extractor/common] Allow HTML5 unquoted attribute values
...
Fixes #7108
HTML5 allows unquoted attribute values. See the "Unquoted attribute value
syntax" section [1] for more information
[1] http://www.w3.org/TR/html5/syntax.html
2015-10-09 14:11:00 +08:00
Sergey M․
4bba371644
[YoutubeDL] Autocalculate ext for subtitles when missing
2015-10-04 20:42:26 +06:00
Sergey M․
e5851b963a
[extractor/common] Make f4m extraction for SMIL non fatal
2015-10-01 23:04:56 +06:00
Sergey M․
4de6131090
[extractor/common] Add fatal to _extract_f4m_formats
2015-10-01 23:03:31 +06:00
Sergey M․
3a1341a7bc
[extractor/common] Make m3u8 extraction for SMIL non fatal
2015-10-01 22:59:20 +06:00
Sergey M․
c78e48177c
[extractor/common] Check validity of direct URLs
2015-10-01 22:54:54 +06:00
Sergey M․
647eab4541
[extractor/common] Extract upload date from SMIL
2015-10-01 22:20:28 +06:00
Sergey M․
1e5bcdec02
[extractor/common] Extract images from SMIL
2015-10-01 22:20:21 +06:00
Sergey M․
e7d8e98a9f
[extractor/common] Allow float bitrates
2015-10-01 22:20:15 +06:00
Sergey M․
8aab976bbd
[extractor/common] Document release_date field
2015-09-26 21:07:54 +06:00
Sergey M․
c430802e32
[extractor/common] Add raise_geo_restricted
2015-09-22 21:50:20 +06:00
Sergey M․
586f1cc532
[extractor/common] Skip html comment tags ( Closes #6822 )
2015-09-11 21:07:32 +06:00
Sergey M․
73eb13dfc7
[extractor/common] Case insensitive inputs extraction
2015-09-11 20:43:05 +06:00
Sergey M․
be0e5dbd83
[extractor/common] Extract submit inputs
2015-09-06 07:20:47 +06:00
Sergey M․
43e7d3c945
[extractor/common] Add raise_login_required
2015-08-26 21:24:47 +06:00
Jaime Marquínez Ferrándiz
8c97f81943
[common] Follow convention of using 'cls' in classmethods
2015-08-21 11:35:51 +02:00
Yen Chi Hsuan
f738dd7b7c
[common] Remove debugging code
2015-08-21 01:43:22 +08:00
Yen Chi Hsuan
912e0b7e46
[common] Add _merge_subtitles()
2015-08-21 01:37:07 +08:00
Yen Chi Hsuan
03bc7237ad
[common] _parse_smil_subtitles: accept lang as the subtitle language
2015-08-20 23:18:58 +08:00
Sergey M․
5cdefc4625
[extractor/common] Add more subtitle mime types for guess when ext is missing
2015-08-20 01:02:50 +06:00
Sergey M․
ce00af8767
[extractor/common] Add default subtitles lang
2015-08-20 00:56:17 +06:00
Yen Chi Hsuan
f877c6ae5a
[theplatform] Use InfoExtractor._parse_smil_formats()
2015-08-19 23:11:25 +08:00
Sergey M․
e64b756943
[extractor/common] Interactive TFA code input
2015-08-15 21:55:07 +06:00
Sergey M․
201ea3ee8e
[extractor/common] Improve _hidden_inputs
2015-08-15 21:52:22 +06:00
Sergey M․
8b9848ac56
[extractor/common] Expand meta regex
2015-08-15 15:58:30 +06:00
Sergey M․
942acef594
[extractor/common] Extract _parse_xspf
2015-08-09 19:41:55 +06:00
Sergey M․
98044462b1
[extractor/common] Use playlist id as default title
2015-08-09 19:18:50 +06:00
Sergey M․
e0b9d78fab
[extractor/common] Clarify playlists can have description field
2015-08-09 19:09:50 +06:00
Sergey M․
8d6765cf48
[extractor/generic] Add generic support for xspf playlist extraction
2015-08-09 19:07:18 +06:00
Sergey M.
d5d7bdaeb5
Merge pull request #6428 from dstftw/improve-generic-smil-support
...
Improve generic SMIL support
2015-08-08 05:47:33 +06:00
Sergey M․
5b0c40da24
[extractor/common] Expand meta regex
2015-08-08 03:36:29 +06:00
Sergey M․
17712eeb19
[extractor/common] Extract namespace parse routine
2015-08-02 01:31:17 +06:00
Sergey M․
41c3a5a7be
[extractor/common] Fix python 3
2015-08-02 01:20:49 +06:00
Sergey M․
a107193e4b
[extractor/common] Extract f4m and m3u8 formats, subtitles and info
2015-08-02 01:13:21 +06:00
remitamine
799207e838
[viewster] extract the api auth token
...
Closes #6406 .
2015-07-30 12:55:48 +02:00
Sergey M․
864f24bd2c
[extractor/common] Add _meta_regex and clarify tags field
2015-07-29 03:43:03 +06:00
Purdea Andrei
5316bf7487
Documented tags as a possible dict key
2015-07-28 18:30:42 +03:00
Sergey M․
10952eb2cf
[extractor/common] Consistent URL spelling
2015-07-23 23:37:45 +06:00
Jaime Marquínez Ferrándiz
297a564bee
[youtube] Extract end_time
2015-07-23 13:20:21 +02:00
Jaime Marquínez Ferrándiz
7c80519cbf
[youtube] Extract start_time
...
From the 't=*' in the url.
Currently youtube-dl doesn't use the value, but it was requested for the mpv plugin.
2015-07-20 21:10:28 +02:00
Sergey M․
74fe23ec35
[extractor/common] Style
2015-07-18 16:35:28 +06:00
Yen Chi Hsuan
a38436e889
[extractor/common] Add 'transform_source' parameter to _extract_f4m_formats()
2015-07-17 12:02:49 +08:00
Sergey M․
31c746e5dc
[extractor/common] Keep going if some media_url is missing
2015-07-16 01:25:33 +06:00
Sergey M․
70f0f5a8ca
[extractor/common] Recursively extract child f4m manifests
2015-07-16 01:15:15 +06:00
Sergey M․
cc357c4db8
[extractor/common] Properly handle full URLs
2015-07-16 01:14:52 +06:00
Sergey M․
97f4aecfc1
[extractor/common] Handle malformed f4m manifests
2015-07-16 01:14:08 +06:00
Sergey M․
cf61d96df0
[extractor/common] Add _form_hidden_inputs
2015-07-14 22:38:10 +06:00
Sergey M․
f8da79f828
[extractor/common] Improve _form_hidden_inputs and rename to _hidden_inputs
2015-07-14 22:36:30 +06:00
Sergey M․
27713812a0
[extractor/common] Add method for extracting form hidden input fields as dict
2015-07-10 21:49:09 +06:00
Yen Chi Hsuan
13af92fdc4
[common] Add 'fatal' to _extract_m3u8_formats
2015-07-06 08:39:38 +08:00
Sergey M․
5414623791
[extractor/common] Remove superfluous line
2015-06-29 00:49:19 +06:00
Sergey M․
c342041fba
[extractor/common] Use NO_DEFAULT from utils
2015-06-28 22:56:45 +06:00
Yen Chi Hsuan
621ed9f5f4
[common] Add note and errnote field for _extract_m3u8_formats
2015-06-07 16:33:22 +08:00
Sergey M․
baa43cbaf0
[extractor/common] Relax valid url check verbosity
2015-05-17 02:59:35 +06:00
Yen Chi Hsuan
c1c924abfe
[utils,common] Merge format_srt_time and _subtitles_timecode
...
format_srt_time uses a comma as the delimiter between seconds and
milliseconds while _subtitles_timecode uses a dot. All .srt examples I
found on the Internet use a comma, so I use a comma in the merged
version. See http://matroska.org/technical/specs/subtitles/srt.html and
http://devel.aegisub.org/wiki/SubtitleFormats/SRT
2015-05-12 13:04:54 +08:00
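A minimal sketch of the comma-delimited SRT timestamp described in the commit above. The function name `srt_timecode` is hypothetical and is not claimed to match the merged helper in utils; it only illustrates the `HH:MM:SS,mmm` layout with a comma before the milliseconds.

```python
def srt_timecode(seconds):
    """Format a duration in seconds as an SRT timestamp (HH:MM:SS,mmm).

    Hypothetical helper for illustration; SRT separates seconds and
    milliseconds with a comma rather than a dot.
    """
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3600 * 1000)
    minutes, rem = divmod(rem, 60 * 1000)
    secs, ms = divmod(rem, 1000)
    return '%02d:%02d:%02d,%03d' % (hours, minutes, secs, ms)
```

Working in integer milliseconds avoids carrying float rounding errors into the individual fields.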
Yen Chi Hsuan
05d5392cda
[common] Ignore subtitles in m3u8
2015-05-07 18:06:22 +08:00
Sergey M․
74f728249f
[extractor/common] Fall back to empty string for (yet) missing format_id in _sort_formats ( Closes #5624 )
2015-05-06 21:24:24 +06:00
Jaime Marquínez Ferrándiz
2ddcd88129
Remove code that was only used by the Grooveshark extractor
2015-05-02 17:29:56 +02:00
zouhair
cf0649f8b7
Typo: twice "the the" to "the"
2015-04-29 11:03:10 -04:00
Sergey M․
3ded7bac16
[extractor/common] Add ability to specify custom field preference for _sort_formats
2015-04-20 21:13:31 +06:00
Jaime Marquínez Ferrándiz
08f2a92c9c
InfoExtractor._search_regex: Suggest updating when the regex is not found (suggested in #5442 )
...
Reuse the same message from ExtractorError
2015-04-17 14:55:24 +02:00
Yen Chi Hsuan
c9a779695d
[extractor/common] Add the encoding parameter
...
The QQMusic info extractor needs a forced encoding to work correctly.
2015-04-16 17:34:54 +08:00
Sergey M․
830d53bfae
[utils] Add video_title for url_result
2015-04-12 23:11:47 +06:00
Sergey M․
e21a55abcc
[extractor/common] Remove f4m section
...
It's now provided by `f4m_id`
2015-04-04 23:05:25 +06:00
Sergey M․
4a34f69ea6
[extractor/common] Add subtitles timecode formatter
2015-03-13 21:38:28 +06:00
Sergey M․
f207019ce5
[extractor/common] Remove 'm3u8' from quality selection URL
2015-03-06 22:53:53 +06:00
Sergey M․
8dc9d361c2
[extractor/common] Fix format_id when last_media is None and always include m3u8_id if present
...
The rationale behind `m3u8_id` was to resolve duplicates when processing several m3u8 playlists within the same media that give equal resulting `format_id`'s,
e.g. `youtube-dl http://www.rts.ch/play/tv/passe-moi-les-jumelles/video/la-fee-des-bois-mustang-les-chemins-du-vent?id=3854925 -F`
2015-03-06 22:52:50 +06:00
Philipp Hagemeister
a0bb7c5593
[extractor/common] Improve m3u format IDs ( #5143 )
2015-03-06 10:49:42 +01:00
Sergey M․
2f0f6578c3
[extractor/common] Assume non HTTP(S) URLs valid
2015-03-02 22:38:44 +06:00
Philipp Hagemeister
72a406e7aa
[extractor/common] Pass in video_id ( #5057 )
2015-02-26 01:35:43 +01:00
Antti Ajanki
6f4ba54079
[extractor/common] Extract HTTP (possibly f4m) URLs from a .smil file
2015-02-24 21:22:59 +02:00
Antti Ajanki
637570326b
[extractor/common] Extract the first of a seq of videos in a .smil file
2015-02-24 21:22:59 +02:00
Jaime Marquínez Ferrándiz
bfc993cc91
Merge branch 'subtitles-rework'
...
(Closes PR #4964 )
2015-02-23 17:13:03 +01:00
Sergey M․
9fe6ef7ab2
[extractor/common] Fix preference for m3u8 quality selection URL
2015-02-23 03:30:10 +06:00
Philipp Hagemeister
8fb3ac3649
PEP8: W503
2015-02-21 14:55:13 +01:00
Philipp Hagemeister
77b2986b5b
[extractor/common] Recognize Indian censorship ( #5021 )
2015-02-21 14:51:07 +01:00
Jaime Marquínez Ferrándiz
9868ea4936
[extractor/common] Simplify subtitles handling methods
...
Initially I was going to use a single method for handling both subtitles and automatic captions, that's why I used the 'list_subtitles' and the 'subtitles' variables.
2015-02-17 22:16:29 +01:00
Philipp Hagemeister
fa15607773
PEP8 fixes
2015-02-17 21:46:20 +01:00
Jaime Marquínez Ferrándiz
4cd95bcbc3
[twitch:stream] Prefer the 'source' format ( fixes #4972 )
2015-02-17 18:57:01 +01:00
Sergey M․
4069766c52
[extractor/common] Test URLs with GET
2015-02-17 22:35:27 +06:00
Jaime Marquínez Ferrándiz
360e1ca5cc
[youtube] Convert to new subtitles system
...
The automatic captions are stored in the 'automatic_captions' field, which is used if no normal subtitles are found for a specific language.
2015-02-16 22:47:39 +01:00
Jaime Marquínez Ferrándiz
c84dd8a90d
[YoutubeDL] store the subtitles to download in the 'requested_subtitles' field
...
We need to keep the original subtitles information, so that the '--load-info' option can be used to list or select the subtitles again.
We'll also be able to have a separate field for storing the automatic captions info.
2015-02-16 21:51:08 +01:00
Jaime Marquínez Ferrándiz
a504ced097
Improve subtitles support
...
For each language the extractor builds a list with the available formats sorted (like for video formats), then YoutubeDL selects one of them using the '--sub-format' option which now allows giving the format preferences (for example 'ass/srt/best').
For each format the 'url' field can be set so that we only download the contents if needed, or if the contents need to be processed (like in crunchyroll) the 'data' field can be used.
The reasons for this change are:
* We weren't checking that the format given with '--sub-format' was available, checking it in each extractor would be repetitive.
* It makes it easy to support giving a format preference.
* The subtitles were automatically downloaded in the extractor, but I think that if you use for example the '--dump-json' option you want to finish as fast as possible.
Currently only the ted extractor has been updated, but the old system still works.
2015-02-16 21:51:03 +01:00
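The selection step described in the commit above can be sketched roughly as follows. This is an illustration only: `select_subtitle_format` is a hypothetical name, and the sketch assumes each language's list is sorted from worst to best and that every entry carries an 'ext' key next to 'url' or 'data'.

```python
def select_subtitle_format(formats, preference='best'):
    """Pick one subtitle entry using a preference such as 'ass/srt/best'.

    `formats` is assumed to be sorted from worst to best. Each preference
    is tried in order; 'best' means the last (best) available entry.
    """
    for wanted in preference.split('/'):
        if wanted == 'best':
            return formats[-1] if formats else None
        # Search from best to worst for an entry with the requested extension.
        for entry in reversed(formats):
            if entry.get('ext') == wanted:
                return entry
    # None of the requested formats is available for this language.
    return None
```

Doing the lookup once, outside the extractors, is what allows the availability check for '--sub-format' to live in a single place.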
Philipp Hagemeister
03cd72b007
[extractor/common] Move up filesize
...
filesize and tbr should correlate, so it doesn't make sense to treat them differently.
2015-02-16 04:39:22 +01:00
Jaime Marquínez Ferrándiz
6ca7732d5e
[extractor/common] Fix link to external documentation
2015-02-14 22:20:24 +01:00
Jaime Marquínez Ferrándiz
2d30521ab9
[youtube] Extract average rating ( closes #2362 )
2015-02-11 18:39:31 +01:00
Philipp Hagemeister
9650885be9
[escapist] Filter video differently ( Fixes #4919 )
2015-02-10 15:55:51 +01:00
Philipp Hagemeister
7e5db8c930
[options] Add --no-color
2015-02-10 04:22:10 +01:00
Philipp Hagemeister
3a5bcd0326
[extractor/common] Wrap extractor errors ( Fixes #1194 )
...
For now, we just wrap some common errors. More may follow. We do not want to catch actual programming errors in the extractors, such as 1 // 0.
2015-02-10 01:17:23 +01:00
Naglis Jonaitis
69319969de
[extractor/common] Add new helper method _family_friendly_search
2015-02-08 17:39:00 +02:00
Philipp Hagemeister
1e1896f2de
[extractor/common] Correct sort order.
...
We should look at height and width before ext_preference.
2015-02-06 15:16:45 +01:00
Sergey M․
3900eec27c
[extractor/common] Fix 2.0 manifest extraction ( Closes #4830 )
2015-02-06 04:29:29 +06:00
Sergey M․
60ca389c64
[extractor/common] Prefix f4m/m3u8 entries with identifier
2015-02-05 22:16:27 +06:00
Philipp Hagemeister
9bb8e0a3f9
[wsj] Add new extractor ( Fixes #4854 )
2015-02-03 10:58:28 +01:00
Philipp Hagemeister
1a6373ef39
[sort_formats] Prefer bitrate over video size
...
720p @ 1000KB/s looks way better than 1080p @ 500KB/s
2015-02-03 10:53:07 +01:00
Philipp Hagemeister
995029a142
[nerdist] Add new extractor ( Fixes #4851 )
2015-02-02 23:38:35 +01:00
Philipp Hagemeister
b04b885271
[extractor/common] Document all protocol values
2015-01-30 15:53:16 +01:00
Sergey M․
96a53167fa
[common] Generalize URLs' HTTP errors pre-testing
2015-01-26 00:32:31 +06:00
Philipp Hagemeister
3dee7826e7
[rtl2] PEP8, simplify, make rtmp tests run ( #470 )
2015-01-25 18:09:48 +01:00
Philipp Hagemeister
cfb56d1af3
Add --list-thumbnails
2015-01-25 02:43:19 +01:00
Jaime Marquínez Ferrándiz
e1554a407d
[extractors] Use http_headers for setting the User-Agent and the Referer
2015-01-24 18:23:53 +01:00
Philipp Hagemeister
121c09c7be
Merge remote-tracking branch 'Dineshs91/f4m-2.0'
2015-01-10 17:51:52 +01:00
Philipp Hagemeister
6271f1cad9
[youtube|ffmpeg] Automatically correct video with non-square pixels ( Fixes #4674 )
2015-01-10 05:45:51 +01:00
Philipp Hagemeister
ff21a8e0ee
Merge remote-tracking branch 'Tithen-Firion/master'
2015-01-10 02:26:21 +01:00
Philipp Hagemeister
dd622d7c4e
[netzkino] Add new extractor ( Fixes #4669 )
2015-01-09 23:59:18 +01:00
Philipp Hagemeister
bec2248141
[InfoExtractor/common] Correct and test meta tag matching
2015-01-08 16:14:50 +01:00
Philipp Hagemeister
0590062925
Respect age_limit when listing extractors ( Fixes #4653 )
2015-01-07 07:20:20 +01:00
Philipp Hagemeister
e65566a9cc
[youtube] Correct handling when DASH manifest is not necessary to find all formats
2015-01-03 18:33:38 +01:00
Sergey M․
6c6f1408f2
[extractor/common] Allow multiline content tags
2015-01-01 00:37:14 +06:00
Jaime Marquínez Ferrándiz
5d3808524d
[extractor/common] Update docstring: replace FileDownloader with YoutubeDL
2014-12-21 16:58:29 +01:00
Philipp Hagemeister
bf94e38d3d
Merge remote-tracking branch 'Tithen-Firion/hsw-update'
2014-12-12 04:10:55 +01:00
Philipp Hagemeister
f5e43bc695
[vine] Provide alt_title ( Fixes #4448 )
2014-12-12 03:34:28 +01:00
Sergey M․
e89a2aabed
[extractor/common] Add generic SMIL formats extraction routine
2014-12-09 22:28:28 +06:00
Philipp Hagemeister
f58766ce5c
[extractor/common] Document ie_key in url results
2014-12-09 10:58:06 +01:00
Sergey M․
acf5cbfe93
[extractor/common] Add description to playlist_result
2014-12-07 01:46:30 +06:00
Philipp Hagemeister
b82f815f37
Allow iterators for playlist result entries
2014-12-06 14:02:19 +01:00
Tithen-Firion
ebb6419960
[common] Split _download_json
...
Add ability for extractor to use _parse_json
2014-12-05 12:21:21 +01:00
Tithen-Firion
995ad69c54
[common] Add new parameters for _download_webpage
2014-12-04 14:16:09 +01:00
Philipp Hagemeister
810fb84d5e
pep8 and minor beautification all around
2014-12-04 08:27:40 +01:00
Jaime Marquínez Ferrándiz
42939b6129
[youtube] Use a cookie for setting the language
...
This way, we don't have to do an additional request
2014-11-30 00:03:59 +01:00
Philipp Hagemeister
4e262a8838
[generic] Detect direct video links ( Fixes #4149 , #4313 )
2014-11-26 10:44:39 +01:00
Jouke Waleson
9e1a5b8455
PEP8: applied even more rules
2014-11-23 21:39:15 +01:00
Jouke Waleson
5f6a1245ff
PEP8 applied
2014-11-23 20:41:03 +01:00
Philipp Hagemeister
fed5d03260
[extractor/common] Document _type values (Motivated by #4254 )
2014-11-20 16:47:59 +01:00
Philipp Hagemeister
aff2f4f4f5
[arte] Clean up format sorting mess
...
We now use our standard sorting facilities. As a side effect, it's finally possible to download German videos from French URLs and vice versa.
2014-11-20 12:06:35 +01:00
Philipp Hagemeister
711ede6e1b
[heise] Fix description, thumbnail and format ID
2014-11-04 23:14:16 +01:00
Philipp Hagemeister
8c25f81bee
[util] Move compatibility functions out of util
...
utils is large enough without these compatibility functions.
Everything that is present in newer versions of Python (i.e. with dev Python it's just an import) goes into compat.py .
Everything else (i.e. youtube-dl-specific helpers) goes into utils.py .
2014-11-02 11:23:42 +01:00
Philipp Hagemeister
2c8e03d937
Sort formats by fps as well
2014-10-30 09:40:52 +01:00
Philipp Hagemeister
fbb21cf528
[youtube] Add formats 298, 299 ( Fixes #4056 )
2014-10-30 09:34:13 +01:00
Philipp Hagemeister
81515ad9f6
[extractor/common] Improve m3u8 output
2014-10-27 02:28:37 +01:00
Philipp Hagemeister
23be51d8ce
[generic] Handle audio streams that do not implement HEAD ( Fixes #4032 )
2014-10-26 17:05:44 +01:00
Philipp Hagemeister
c64ed2a310
[viddler] Use API
2014-10-25 00:11:12 +02:00
Philipp Hagemeister
1ede5b2481
[glide] Simplify
2014-10-24 15:34:19 +02:00
dinesh
7a47d07c6d
[extractor/common] href attribute added
2014-10-24 09:47:39 +05:30
dinesh
34e48bed3b
[extractor/common] Added support for f4m manifest Version 2.0
2014-10-24 02:41:10 +05:30
Sergey M․
5f58165def
[extractor/common] Fix dumping requests with long file abspath on Windows
2014-10-14 21:43:48 +07:00
Philipp Hagemeister
d838b1bd4a
[utils] Default age_limit to None
...
If we can't parse it, it means we don't have any information, not that the content is unrestricted.
2014-10-03 20:17:12 +02:00
Philipp Hagemeister
e7b6d12254
[utils] Improve and test js_to_json
2014-10-01 00:08:34 +02:00
Philipp Hagemeister
b14f3a4c1d
[golem] Simplify ( #3828 )
2014-09-28 10:35:19 +02:00
Philipp Hagemeister
ed9266db90
[common] Add new helper function _match_id
2014-09-28 09:31:58 +02:00
Philipp Hagemeister
f4b1c7adb8
[muenchentv] Move live title generation to common
2014-09-28 08:53:52 +02:00
Philipp Hagemeister
f0b5d6af74
[vevo] Support 1080p videos ( Fixes #3656 )
2014-09-24 14:16:56 +02:00
Philipp Hagemeister
7267bd536f
[muenchentv] Add support ( Fixes #3507 )
2014-09-19 09:57:53 +02:00
Sergey M․
9ebf22b7d9
[common] Improve codecs extraction from m3u8
2014-09-01 20:13:04 +07:00
Philipp Hagemeister
daebaab692
[extractor/common] Correct typo
2014-08-28 13:04:49 +02:00
Philipp Hagemeister
3524cc25ca
[sportdeutschland] Add support for more plain videos
2014-08-28 10:55:32 +02:00
Philipp Hagemeister
f1a9d64eea
[extractor/common] Modernize
2014-08-28 01:04:43 +02:00
Philipp Hagemeister
da9ec3b932
[musicvault] Add extractor ( Fixes #3593 )
2014-08-27 01:44:47 +02:00
Philipp Hagemeister
704df56da7
[sportdeutschland] add new extractor
2014-08-26 12:51:13 +02:00
Philipp Hagemeister
b252735910
[extractor/common] Generate better f4m format IDs
2014-08-25 13:03:08 +02:00
Philipp Hagemeister
9480d1a566
Merge remote-tracking branch 'riking/twofactor'
2014-08-24 07:14:23 +02:00
Philipp Hagemeister
d769be6c96
[grooveshark,http] Make HTTP POST downloads work
2014-08-24 01:31:35 +02:00
Philipp Hagemeister
a36819731b
[escapist] Add support for og:video:url ( Fixes #3557 )
2014-08-21 13:05:24 +02:00
riking
165250ff5e
Remove debug prints
2014-08-16 14:49:30 -07:00
riking
83317f6938
[youtube] Add two-factor account signin (TOTP only)
...
Additional work is required to prompt the user for the SMS or phone call codes, as there is no framework currently to prompt the user during an extraction operation.
Fixes #3533
2014-08-16 14:48:17 -07:00
Jaime Marquínez Ferrándiz
f036a6328e
[extractor/common] _extract_f4m_formats: Use more specific messages when downloading the manifest
2014-07-28 15:42:19 +02:00
Jaime Marquínez Ferrándiz
31bb8d3f51
[bloomberg] Extract the available formats ( closes #2776 )
...
It uses a helper method in the InfoExtractor class.
The downloader will pick the requested formats using the bitrate in the info dict.
2014-07-28 15:32:38 +02:00
Philipp Hagemeister
c3415d1bac
[extractor/common] PEP8
2014-07-25 10:43:03 +02:00
Philipp Hagemeister
b090af5922
[vube] Fix comment count
2014-07-23 01:27:25 +02:00
Philipp Hagemeister
1a30deca50
[teachertube] Fix title and playlist recognition
2014-07-21 12:47:01 +02:00
Philipp Hagemeister
9732d77ed2
[snotr] PEP8 and minor fixes ( #3296 )
2014-07-21 12:02:44 +02:00
Philipp Hagemeister
40c696e5c6
[screencast] Add support for more video types ( #3236 )
2014-07-11 15:39:24 +02:00
Philipp Hagemeister
4094b6e36d
[vodlocker] PEP8, generalization, and simplification ( #3223 )
2014-07-11 10:57:40 +02:00
Jaime Marquínez Ferrándiz
78338f71ca
[livestream:original] Add support for folder urls ( closes #2631 )
...
The webpage only contains shortened links for the videos; since the server
doesn't support HEAD requests, we use a specific extractor for them.
2014-06-26 16:34:36 +02:00
Philipp Hagemeister
d551980823
[spiegeltv] Simplify and PEP8
2014-06-07 15:35:13 +02:00
Philipp Hagemeister
ad3bc6acd5
Document and test categories ( #2923 )
2014-05-15 12:41:42 +02:00
Philipp Hagemeister
5afa7f8bee
[extractor/common] --write-pages: Correct file name if video_id is None
2014-05-15 12:39:33 +02:00
Philipp Hagemeister
57c7411f46
[mixcloud] Shed API dependency ( #2904 )
2014-05-13 09:42:38 +02:00
Philipp Hagemeister
c1bce22f23
[extractor/common] Protect against long video IDs and URLs
2014-05-12 21:58:23 +02:00
Philipp Hagemeister
2099125333
[soundcloud/generic] Add support for playlists
2014-05-05 03:15:17 +02:00
Philipp Hagemeister
28746fbd59
[bilibili] Add preliminary support ( #2174 )
...
The URL http://www.bilibili.tv/video/av636603/index_2.html does not work yet.
2014-04-21 13:46:41 +02:00
Anisse Astier
ec0fafbb19
[extractor/common] fallback on utf-8 when charset is not found
...
fixes #2721
2014-04-07 23:10:16 +02:00
Philipp Hagemeister
b6cfde99b7
Only mention websense URL once
2014-04-03 08:12:53 +02:00
Philipp Hagemeister
2410c43d83
Detect Websense censorship ( Fixes #2670 )
2014-04-03 06:09:38 +02:00
Philipp Hagemeister
38d63d846e
[extractor/common] Clarify preference key in formats
2014-03-23 17:41:43 +01:00
Philipp Hagemeister
955c451456
Rename upload_timestamp to timestamp
2014-03-13 18:45:14 +01:00
Philipp Hagemeister
9d2ecdbc71
[vevo] Centralize timestamp handling
2014-03-13 15:30:25 +01:00
Philipp Hagemeister
5a25f39653
Correct extractor documentation
2014-03-10 13:09:55 +01:00
Philipp Hagemeister
9f62eaf4ef
[canal13cl] Add test and improve extraction ( #2498 )
2014-03-03 12:53:11 +01:00
Philipp Hagemeister
0afef30b23
Add display_id field
2014-03-03 12:06:28 +01:00
Philipp Hagemeister
81c2f20b53
[youtube] Correct invalid JSON ( Fixes #2353 )
2014-02-09 17:56:10 +01:00
dst
c1206423c4
Fix extraction of og content in single quotes
2014-01-31 03:57:33 +07:00
Jaime Marquínez Ferrándiz
0c708f11cb
[bloomberg] Fix ooyala url extraction
...
Added a helper method to InfoExtractor for searching the ‘twitter:player’ meta property.
Now the OoyalaIE also recognizes the ‘ec’ parameter in the url as the embed code.
2014-01-29 18:03:32 +01:00
Philipp Hagemeister
7e8caf30c0
Throw an error if no video formats are found
2014-01-27 07:31:54 +01:00
Philipp Hagemeister
db1f388878
[huffpost] Add support
2014-01-27 05:47:38 +01:00
Jaime Marquínez Ferrándiz
944d65c762
[extractor/common] Encode the url when calculating the md5 with --write-pages option
...
This doesn’t cause any problem in python 2.*, but on python 3 the `md5` function only accepts bytes.
2014-01-25 15:32:56 +01:00
Philipp Hagemeister
1394ce65b4
[youtube] Add new formats ( Fixes #2221 )
2014-01-23 23:54:06 +01:00
Philipp Hagemeister
50317b111d
Merge branch 'youtube-dash-manifest'
...
Conflicts:
youtube_dl/extractor/youtube.py
2014-01-22 19:58:31 +01:00
Philipp Hagemeister
9d4288b2d4
[extractor/common] Clarify when we do and do not generate the filename
2014-01-21 01:41:13 +01:00
Philipp Hagemeister
b60016e831
Deal with implicitly UTF-16 decoded webpages
...
These webpages don't specify an encoding and rely on the BOM
2014-01-21 01:39:40 +01:00
Philipp Hagemeister
dd27fd1739
[youtube] Download DASH manifest
...
If given, download and parse the DASH manifest file, in order to get ultra-HQ formats.
Fixes #2166
2014-01-19 05:47:20 +01:00
Philipp Hagemeister
3ec05685f7
[extractor/common] Limit --write-pages filename to 200 chars
...
This avoids problems with very long URLs.
2014-01-17 14:47:47 +01:00
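One way to cap a dump filename at 200 characters while keeping names distinct is to replace the tail with a hash of the full name. The sketch below is an assumption about how such a limit could be implemented, not a copy of the code in extractor/common; `dump_filename` and `max_len` are hypothetical names.

```python
import hashlib


def dump_filename(basename, max_len=200):
    """Shorten `basename` to at most `max_len` characters.

    Hypothetical helper: over-long names keep their prefix and get an md5
    suffix of the whole name, so two long URLs that share a prefix still
    map to different files.
    """
    if len(basename) > max_len:
        # md5 only accepts bytes on Python 3, so encode the name first.
        suffix = '___' + hashlib.md5(basename.encode('utf-8')).hexdigest()
        basename = basename[:max_len - len(suffix)] + suffix
    return basename
```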
Philipp Hagemeister
9933b57430
[pornhub] Use centralized sorting
2014-01-07 10:25:34 +01:00
Philipp Hagemeister
3d3538e422
[khanacademy] Add support ( Fixes #2066 )
2014-01-07 09:35:34 +01:00
Philipp Hagemeister
5d73273f6f
[orf] Use new extraction method ( Fixes #2057 )
2014-01-06 17:15:27 +01:00
Philipp Hagemeister
9887c9b2d6
[jpopsuki] Simplify
2014-01-03 12:51:37 +01:00
Philipp Hagemeister
08d13955dd
[wistia] Prefer original video format above all others
...
We could also set up a formula which would weigh filesize/bitrate and vcodec/acodec (say, 1GB h264 < 3 GB MPEG2 < 2 GB h264), but that would get really messy real soon.
2014-01-01 20:23:49 +01:00
Philipp Hagemeister
5d4f3985be
Document that format_id field should be present
2013-12-26 21:19:00 +01:00
Philipp Hagemeister
7217e148fb
[yahoo] Use centralized sorting, and add tbr field
2013-12-25 15:18:40 +01:00
Philipp Hagemeister
c7deaa4c74
[zdf] Use centralized sorting
2013-12-24 23:32:04 +01:00
Philipp Hagemeister
e6812ac99d
[spiegel] Use centralized sorting
2013-12-24 12:40:23 +01:00
Philipp Hagemeister
4bcc7bd1f2
Add temporary _sort_formats helper function
2013-12-24 12:31:42 +01:00
Philipp Hagemeister
f49d89ee04
Add a resolution field and improve general --list-formats output
2013-12-24 11:56:02 +01:00
Philipp Hagemeister
f45f96f8f8
[myvideo] Use RTMP instead of RTMPT ( Fixes #2032 )
2013-12-23 15:57:43 +01:00
Philipp Hagemeister
1538eff6d8
[bliptv] Remove support for direct downloads
...
This is now handled by the generic IE
2013-12-23 15:49:21 +01:00
Philipp Hagemeister
aa94a6d315
[aparat] Add support ( Fixes #2012 )
2013-12-20 17:05:39 +01:00
Jaime Marquínez Ferrándiz
c0d0b01f0e
[generic] Detect ooyala videos ( fixes #2013 )
2013-12-19 20:32:12 +01:00
Philipp Hagemeister
46374a56b2
[youtube] Do not warn for videos with allow_rating=0
...
This fixes #1982
Test video: http://www.youtube.com/watch?v=gi2uH3YxohU
2013-12-17 02:49:56 +01:00
Itay Brandes
87a28127d2
_search_regex's "isatty" call fails with Py2exe's
...
_search_regex calls the sys.stderr.isatty() function on unix systems.
Py2exe uses a custom Stderr() stream which doesn't have an `isatty()`
function, leading to a crash.
This is easily fixed by checking that it's a unix system first.
2013-12-16 21:50:26 +01:00
Philipp Hagemeister
d67b0b1596
Reorder info_dict documentation
2013-12-16 14:13:40 +01:00
Philipp Hagemeister
c0ba0f4859
Document duration field
2013-12-16 04:09:43 +01:00
Philipp Hagemeister
e2b38da931
[mtv] Fixup incorrectly encoded XML documents
2013-12-10 12:45:22 +01:00
Philipp Hagemeister
7cc3570e53
Add fatal=False parameter to _download_* functions.
...
This allows us to simplify the calls in the youtube extractor even further.
2013-12-09 01:49:03 +01:00
Philipp Hagemeister
19e3dfc9f8
[9gag] Like/dislike count ( #1895 )
2013-12-05 18:29:07 +01:00
Philipp Hagemeister
aaebed13a8
[smotri] Simplify
2013-12-02 17:08:17 +01:00
Philipp Hagemeister
2a275ab007
[zdf] Use _download_xml
2013-11-28 05:47:50 +01:00
Philipp Hagemeister
79d09f47c2
Merge branch 'opener-to-ydl'
2013-11-25 03:30:37 +01:00
Philipp Hagemeister
c059bdd432
Remove quality_name field and improve zdf extractor
2013-11-25 03:28:55 +01:00
Philipp Hagemeister
02dbf93f0e
[zdf/common] Use API in ZDF extractor.
...
This also comes with a lot of extra format fields
Fixes #1518
2013-11-25 03:13:22 +01:00
Philipp Hagemeister
e03db0a077
Merge branch 'master' into opener-to-ydl
2013-11-24 15:18:44 +01:00
Jaime Marquínez Ferrándiz
267ed0c5d3
[collegehumor] Encode the xml before calling xml.etree.ElementTree.fromstring ( fixes #1822 )
...
Uses a new helper method in InfoExtractor: _download_xml
2013-11-24 14:59:19 +01:00
Philipp Hagemeister
7012b23c94
Match --download-archive during playlist processing ( Fixes #1745 )
2013-11-22 22:46:46 +01:00
Philipp Hagemeister
dca0872056
Move the opener to the YoutubeDL object.
...
This is the first step towards being able to just import youtube_dl and start using it.
Apart from removing global state, this would fix problems like #1805 .
2013-11-22 19:57:52 +01:00
Philipp Hagemeister
5904088811
Add support for tou.tv ( Fixes #1792 )
2013-11-20 06:13:19 +01:00
Philipp Hagemeister
91c7271aab
Add automatic generation of format note based on bitrate and codecs
2013-11-16 01:08:43 +01:00
Jaime Marquínez Ferrándiz
78fb87b283
Don't accept '>' inside the content attribute in OpenGraph regexes
2013-11-15 12:54:13 +01:00
Jaime Marquínez Ferrándiz
ab2d524780
Improve the OpenGraph regex
...
* Do not accept '>' between the property and content attributes.
* Recognize the properties if the content attribute is before the property attribute using two regexes (fixes the extraction of the description for SlideshareIE).
2013-11-15 12:24:54 +01:00
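The two-regex approach from the commit above might look like the sketch below. The patterns are simplified assumptions written for illustration (the names `og_regexes` and `og_search_property` here are hypothetical stand-ins, not the project's `_og_regexes`); they only show the two attribute orders and the rule that '>' may not appear between the attributes.

```python
import re

# Quoted content value; '>' is excluded so a match cannot leave the tag.
CONTENT_RE = r'''content=(?:"([^">]+?)"|'([^'>]+?)')'''


def og_regexes(prop):
    """Return two patterns: property before content, and content before property."""
    property_re = r'''property=['"]og:%s['"]''' % re.escape(prop)
    template = r'<meta[^>]+?%s[^>]+?%s'
    return [
        template % (property_re, CONTENT_RE),
        template % (CONTENT_RE, property_re),
    ]


def og_search_property(prop, html):
    """Return the content of the first matching og:<prop> meta tag, or None."""
    for regex in og_regexes(prop):
        match = re.search(regex, html)
        if match:
            # Exactly one of the two quote-style groups is set.
            return next(g for g in match.groups() if g is not None)
    return None
```

Because every gap between attributes is matched with `[^>]`, a property in one tag can never be paired with a content attribute from a later tag.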
Philipp Hagemeister
eb0a839866
[common] Simplify og_search_property
2013-11-12 10:36:23 +01:00
Marcin Cieślak
a8eeb0597b
Fix AssertionError when og property not found
...
On tvp.pl some webpages contain OpenGraph
metadata and some don't.
If og property is not found, _og_search_description
fails with
WARNING: unable to extract OpenGraph description; please report this issue on http://yt-dl.org/bug
Traceback (most recent call last):
  File "/usr/home/saper/bin/youtube-dl", line 18, in <module>
    youtube_dl.main()
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 766, in main
    _real_main(argv)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/__init__.py", line 719, in _real_main
    retcode = ydl.download(all_urls)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 715, in download
    videos = self.extract_info(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/YoutubeDL.py", line 348, in extract_info
    ie_result = ie.extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 125, in extract
    return self._real_extract(url)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/tvp.py", line 56, in _real_extract
    info['description'] = self._og_search_description(webpage)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 331, in _og_search_description
    return self._og_search_property('description', html, fatal=False, **kargs)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/extractor/common.py", line 325, in _og_search_property
    return unescapeHTML(escaped)
  File "/usr/home/saper/sw/youtube-dl/youtube_dl/utils.py", line 494, in unescapeHTML
    assert type(s) == type(u'')
AssertionError
The patch allows me to use:
    try:
        info['description'] = self._og_search_description(webpage)
        info['thumbnail'] = self._og_search_thumbnail(webpage)
    except RegexNotFoundError:
        pass
2013-11-05 23:19:29 +01:00
Jaime Marquínez Ferrándiz
9103bbc5cd
Add the 'webpage_url' field to info_dict
...
The URL of the video page; it must allow the result to be reproduced.
It's automatically set by YoutubeDL if it's missing.
2013-11-03 12:11:13 +01:00
Philipp Hagemeister
b5d0d817bc
Remove superfluous space
2013-10-30 01:09:44 +01:00
Philipp Hagemeister
ebc14f251c
Merge remote-tracking branch 'origin/master'
2013-10-28 10:44:13 +01:00
Philipp Hagemeister
d41e6efc85
New debug option --write-pages
2013-10-28 10:44:02 +01:00
Filippo Valsorda
8ffa13e03e
[Instagram] get the non-https link, as they are serving Akamai cert from an instagram.com domain
2013-10-28 02:34:29 -04:00
Jaime Marquínez Ferrándiz
55b3e45bba
[vimeo] Fix pro videos and player.vimeo.com urls
...
The old process can still be used for those videos.
Added RegexNotFoundError, which is raised by _search_regex if it can't extract the info.
2013-10-23 14:38:03 +02:00
Jaime Marquínez Ferrándiz
8c51aa6506
The 'format' field now defaults to '{format_id} - {width}x{height}{format_note}'
...
Following the YoutubeIE format. The 'format_note' gives additional info about the format, for example '3D' or 'DASH video'.
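
A minimal sketch of how such a default could be assembled from an info dict. The function name `default_format`, the `'?'` fallbacks and the parentheses around the note are assumptions for illustration; only the template string comes from the commit message.

```python
def default_format(info):
    # Hypothetical helper: fill in the 'format' field when it is missing.
    note = info.get('format_note')
    return '{format_id} - {width}x{height}{format_note}'.format(
        format_id=info.get('format_id', '?'),
        width=info.get('width', '?'),
        height=info.get('height', '?'),
        # Assumption: the note is appended in parentheses, or omitted.
        format_note=' ({0})'.format(note) if note else '',
    )
```

With `{'format_id': '22', 'width': 1280, 'height': 720, 'format_note': '3D'}` this sketch yields `22 - 1280x720 (3D)`.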
2013-10-21 14:42:06 +02:00
Philipp Hagemeister
416a5efce7
fix typos
2013-10-18 00:49:45 +02:00
Philipp Hagemeister
8dbe9899a9
Allow users to specify an age limit ( fixes #1545 )
...
With these changes, users can now restrict which videos are downloaded based on the intended audience, by specifying their age with --age-limit YEARS .
Add rudimentary support in youtube, pornotube, and youporn.
2013-10-06 06:08:56 +02:00
Philipp Hagemeister
2f5865cc6d
Clarify that url and ext are optional when formats is given ( #980 )
2013-10-04 11:09:43 +02:00
Philipp Hagemeister
deefc05b88
Document formats (for #980 )
2013-10-04 10:40:42 +02:00
Jaime Marquínez Ferrándiz
0d75ae2ce3
Fix detection of the webpage charset if it's declared using ' instead of "
...
Like in "<meta charset='utf-8'/>"
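
As a rough illustration, a charset lookup that accepts either quote style (or none) could be written as below. The function name `declared_charset` and the pattern are assumptions, not the expression used in the project.

```python
import re


def declared_charset(head_bytes):
    # Hypothetical helper: look for charset=... inside a <meta> tag,
    # with the value optionally wrapped in single or double quotes.
    m = re.search(br'<meta[^>]+charset=[\'"]?([A-Za-z0-9_.:-]+)', head_bytes)
    return m.group(1).decode('ascii') if m else None
```

The optional `[\'"]?` is what lets `<meta charset='utf-8'/>` and `<meta charset="utf-8"/>` produce the same result.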
2013-08-29 11:35:15 +02:00
Philipp Hagemeister
f143d86ad2
[sohu] Handle encoding, and fix tests
2013-08-28 14:00:05 +02:00
Philipp Hagemeister
6d69d03bac
Merge remote-tracking branch 'origin/reuse_ies'
2013-08-28 13:05:21 +02:00
Philipp Hagemeister
2eabb80254
[addanime] improve
2013-08-28 04:25:38 +02:00
Jaime Marquínez Ferrándiz
9e9c164052
Merge pull request #937 from jaimeMF/subtitles_rework
...
Subtitles rework
2013-08-23 02:40:25 -07:00
Philipp Hagemeister
79cb25776f
Cache suitable regular expressions
...
This speeds up TestAllURLsMatching.test_no_duplicates by about 8000% at the cost of minimal memory overhead.
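
The caching idea can be sketched as follows: compile the URL pattern once per class and reuse it on later calls. The class name `Extractor`, the example pattern and the attribute name `_VALID_URL_RE` are assumptions made for this sketch.

```python
import re


class Extractor(object):
    # Hypothetical URL pattern, for illustration only.
    _VALID_URL = r'https?://(?:www\.)?example\.com/watch/(?P<id>\d+)'

    @classmethod
    def suitable(cls, url):
        # Check cls.__dict__ rather than getattr() so that every subclass
        # compiles and stores its own pattern instead of inheriting one.
        if '_VALID_URL_RE' not in cls.__dict__:
            cls._VALID_URL_RE = re.compile(cls._VALID_URL)
        return cls._VALID_URL_RE.match(url) is not None
```

The memory cost is one compiled pattern per extractor class, which matches the "minimal memory overhead" trade-off described above.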
2013-08-21 04:06:48 +02:00
Jaime Marquínez Ferrándiz
5d51a883c2
Use a dictionary for storing the subtitles
...
Errors while getting the subtitles are reported as warnings; if no subtitles are found, an empty dict is returned.
2013-07-20 12:52:25 +02:00
Philipp Hagemeister
f38de77f6e
Use unescapeHTML for OpenGraph properties
...
These are attribute values, so we don't need the more complex and whitespace-destroying cleanHTML - we just need to unescape quotes, that's it.
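
To illustrate the difference, the sketch below uses the standard library's `html.unescape` as a stand-in for the project's `unescapeHTML`; that substitution and the wrapper name `unescape_attribute` are assumptions. Entities are decoded while whitespace is left untouched.

```python
import html


def unescape_attribute(value):
    # Stand-in for unescapeHTML: decode entities, keep whitespace as-is.
    return html.unescape(value)


escaped = 'Tom &amp; Jerry\n&quot;live&quot;'
plain = unescape_attribute(escaped)
```

After this runs, `plain` still contains the newline from the attribute value, with `&amp;` and `&quot;` replaced by `&` and `"`.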
2013-07-17 10:38:23 +02:00
Philipp Hagemeister
b9d3e1635f
Strip hash info from URL when making requests ( Fixes #1038 )
2013-07-13 22:52:12 +02:00
Philipp Hagemeister
3c4e6d8337
Improve OpenGraph property matching
2013-07-13 20:39:47 +02:00
Jaime Marquínez Ferrándiz
44dbe89035
Use re.DOTALL by default when searching OpenGraph properties
2013-07-13 11:29:08 +02:00
Jaime Marquínez Ferrándiz
46720279c2
InfoExtractor: add some helper methods to extract OpenGraph info
2013-07-12 22:12:04 +02:00
Philipp Hagemeister
690e872c51
Remove video_result helper method
...
Calling it was more complex than actually including the type in the video info
2013-07-11 12:12:30 +02:00
Jaime Marquínez Ferrándiz
56c7366547
YoutubeIE: reuse instances of InfoExtractors ( closes #998 )
...
When an IE is added to the list, it's also added to a dictionary. When an IE is requested it first looks in the dictionary and if there's no instance it will create a new one.
That way _real_initialize is only called once for each IE, saving time if it needs to login for example.
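
The lookup described above could be sketched like this. `Registry` and `DemoIE` are hypothetical names, and the method names are assumptions rather than the downloader's actual interface.

```python
class DemoIE(object):
    """Hypothetical extractor that counts how often it is constructed."""
    created = 0

    def __init__(self):
        DemoIE.created += 1

    @classmethod
    def ie_key(cls):
        return 'Demo'


class Registry(object):
    def __init__(self):
        self._ies = []           # ordered list of extractors
        self._ie_instances = {}  # ie_key -> shared instance

    def add_info_extractor(self, ie):
        # Adding to the list also records the instance in the dictionary.
        self._ies.append(ie)
        self._ie_instances[ie.ie_key()] = ie

    def get_info_extractor(self, ie_class):
        # Reuse the stored instance; create one only on the first request.
        ie = self._ie_instances.get(ie_class.ie_key())
        if ie is None:
            ie = ie_class()
            self.add_info_extractor(ie)
        return ie
```

Requesting `DemoIE` twice from the same `Registry` returns the same object, so any expensive one-time setup such as a login would run only once.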
2013-07-08 15:14:27 +02:00
Philipp Hagemeister
d93e4dcbb7
Merge branch 'master' of github.com:rg3/youtube-dl
2013-07-08 01:15:19 +02:00
Philipp Hagemeister
73e79f2a1b
[3sat] Add support ( Fixes #1001 )
2013-07-08 01:13:55 +02:00
Jaime Marquínez Ferrándiz
fc79158de2
VimeoIE: authentication support ( closes #885 ) and add a method in the base InfoExtractor to get the login info
2013-07-07 23:24:34 +02:00
Philipp Hagemeister
0f81866329
Add --list-extractor-descriptions (human-readable list of IEs)
2013-07-01 18:52:19 +02:00
Philipp Hagemeister
f3d294617f
Document view_count ( Closes #963 )
2013-06-29 16:32:28 +02:00
Filippo Valsorda
98bcd2834a
improve generic and encrypted signature error messages
2013-06-25 16:47:16 +02:00
Philipp Hagemeister
3c25b9abae
Remove useless headers
2013-06-23 20:35:50 +02:00
Philipp Hagemeister
d6983cb460
Fix generic class move (add all files)
2013-06-23 19:57:38 +02:00