Mirror of https://codeberg.org/polarisfm/youtube-dl, synced 2024-11-22 16:44:32 +01:00
c9a9ccf8a3
These improvements apply to reading the list of URLs from the file supplied via the `--batch-file` (`-a`) command line option.

1. Skip blank and empty lines in the file. Currently, lines with leading whitespace are only skipped when that whitespace is followed by a comment character (`#`, `;`, or `]`). This means that empty lines and lines consisting only of whitespace are returned as (trimmed) empty strings in the list of URLs to process.

2. [bug fix] Detect and remove the Unicode BOM when the file descriptor is already decoding Unicode. With Python 3, the `batch_fd` enumerator returns the lines of the file as Unicode. For UTF-8, this means that the raw BOM bytes from the file (`\xef \xbb \xbf`) show up converted into a single `\ufeff` character prefixed to the first enumerated text line.

   This fix solves several buggy interactions between the presence of a BOM, the skipping of comments and/or blank lines, and ensuring the list of URLs is consistently trimmed. For example, if the first line of the file is blank, the BOM is incorrectly returned as a URL standing alone. If the first line contains a URL, that URL is prefixed with this unwanted single character, and the BOM's presence also inhibits the proper trimming of any leading whitespace.

   Currently, the `UnicodeBOMIE` helper attempts to recover from some of these error cases, but this fix prevents the error from happening in the first place (at least on Python 3). In any case, the `UnicodeBOMIE` approach is flawed, because it is illogical for a BOM to appear in the (non-batch) URL(s) specified directly on the command line, or, for that matter, in URLs *after the first line* of a batch list.

3. Having fixed `read_batch_urls` so that it more consistently enumerates only properly trimmed URLs, it can also do a quick on-the-fly elimination of exact duplicates, without disturbing the order in which they are listed.

| ||
---|---|---|
downloader | ||
extractor | ||
postprocessor | ||
__init__.py | ||
__main__.py | ||
aes.py | ||
cache.py | ||
compat.py | ||
jsinterp.py | ||
options.py | ||
socks.py | ||
swfinterp.py | ||
update.py | ||
utils.py | ||
version.py | ||
YoutubeDL.py |
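
The three batch-file changes described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `read_batch_urls_sketch` and the constants `BOM` and `COMMENT_PREFIXES` are hypothetical, and the sketch assumes the input is any iterable of lines (text or UTF-8 bytes).

```python
import io

# Hypothetical names for illustration; they are not taken from the commit.
BOM = '\ufeff'                       # how a decoded UTF-8 BOM appears in text
COMMENT_PREFIXES = ('#', ';', ']')   # comment characters named in the message


def read_batch_urls_sketch(lines):
    """Return trimmed, de-duplicated URLs from an iterable of lines."""
    seen = set()
    urls = []
    for index, line in enumerate(lines):
        # Accept raw bytes as well; a UTF-8 BOM then decodes to '\ufeff'.
        if isinstance(line, bytes):
            line = line.decode('utf-8', 'replace')
        # A BOM can only legitimately appear at the very start of the file.
        # It must go before strip(), since '\ufeff' is not whitespace.
        if index == 0 and line.startswith(BOM):
            line = line[len(BOM):]
        url = line.strip()
        # Skip blank lines, whitespace-only lines, and comments.
        if not url or url.startswith(COMMENT_PREFIXES):
            continue
        # Drop exact duplicates while keeping first-seen order.
        if url in seen:
            continue
        seen.add(url)
        urls.append(url)
    return urls


# Example input: BOM on an otherwise blank first line, padding, a comment,
# blank lines, and a repeated URL.
sample = io.StringIO(
    '\ufeff\n'
    '  https://example.com/a  \n'
    '# a comment\n'
    '\n'
    '   \n'
    'https://example.com/b\n'
    'https://example.com/a\n'
)
print(read_batch_urls_sketch(sample))
# → ['https://example.com/a', 'https://example.com/b']
```

Removing the BOM before trimming matters because `str.strip()` does not treat `\ufeff` as whitespace, which is why a leading BOM would otherwise block trimming of the first line.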