Building a YouTube downloader from scratch in Python
There are many resources on how to download YouTube videos with ready-made downloaders; this one is different. This article shows how to build a YouTube video/audio downloader in Python without a dedicated YouTube library. The only third-party package used is requests, for the HTTP calls.
This article is heavily inspired by https://kakoc.blog/blog/myox-youtube-downloader/, which shows how to build a basic YouTube downloader in Rust.
We assume that every input URL has this format: https://www.youtube.com/watch?v=rUWxSEwctFU.
This makes it easy to extract the video ID from the URL: we replace the fixed prefix, https://www.youtube.com/watch?v=, with an empty string.
video_id = url.replace("https://www.youtube.com/watch?v=", "")
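The replace call works only for URLs in exactly that form. As a minimal sketch, the ID can also be read from the v query parameter with the standard library, which tolerates extra parameters such as &t=10s. The helper name extract_video_id is hypothetical and not part of the original code.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url: str) -> str:
    """Return the value of the `v` query parameter of a watch URL.

    Hypothetical helper: assumes a https://www.youtube.com/watch?v=... URL.
    """
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    if not values:
        raise ValueError(f"no video ID found in {url!r}")
    return values[0]


print(extract_video_id("https://www.youtube.com/watch?v=rUWxSEwctFU&t=10s"))
# rUWxSEwctFU
```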
Now, let's get the video information from the get_video_info endpoint:
import requests


def endpoint(video_id):
    # Build the get_video_info URL for the given video ID
    return f"https://www.youtube.com/get_video_info?video_id={video_id}&el=embedded&ps=default"


response = requests.get(endpoint(video_id))
x = response.content.decode("utf-8")  # raw query string
This returns a lot of information in the form of a query string. A query string is a sequence of key=value pairs separated by &, for example:
first=value1&second=value2&...
This is what the output looks like:
c=WEB&fexp=23735348%2C23744176%2C23748146%2C23804281%2C23839597%2C23856950%2C23857948%2C23858057%2C23868330%2C23877069%2C23882685%2C23884386%2C23885566
Python can parse the query string into a dictionary with urllib.parse.parse_qs. Every value comes back as a list of strings:
import urllib.parse as urlparse
d = urlparse.parse_qs(x)
The parsed data looks like this:
{'c': ['WEB'],
'cr': ['US'],
'csi_page_type': ['embed'],
'csn': ['kUm6X_jhIdXRqAG0r5v4DQ'],
'cver': ['2.20201120.01.00'],
'enablecsi': ['1'],
...
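Two details of parse_qs are easy to miss: every value is a list, and percent-encoded characters such as %2C are decoded. A small sketch using a shortened version of the response above:

```python
from urllib.parse import parse_qs

# Shortened version of the response shown earlier.
sample = "c=WEB&fexp=23735348%2C23744176&cr=US"
parsed = parse_qs(sample)

# Each value is a list because a key may appear more than once.
print(parsed["c"])     # ['WEB']
# %2C is decoded to a comma.
print(parsed["fexp"])  # ['23735348,23744176']
```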
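Now we have a dictionary. The data we need is under the player_response key, whose value is a one-element list holding a JSON string, so we take the first item and decode it:
import json
d = json.loads(d.get('player_response')[0])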
All the data about the video streams is inside streamingData:
>> d = d.get('streamingData')
>> d.keys()
dict_keys(['expiresInSeconds', 'formats', 'adaptiveFormats'])
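The whole path from the raw response to streamingData can be sketched as one function. The function name and the tiny sample payload are made up for illustration so that the snippet runs without a network call; a real response contains many more fields.

```python
import json
from urllib.parse import parse_qs, urlencode


def extract_streaming_data(raw: str) -> dict:
    """Parse the raw query string and return the streamingData dictionary."""
    fields = parse_qs(raw)
    if "player_response" not in fields:
        raise KeyError("player_response is missing from the response")
    # parse_qs wraps each value in a list; the JSON text is the first item.
    player_response = json.loads(fields["player_response"][0])
    return player_response.get("streamingData", {})


# Made-up payload that mimics the shape of the real response.
fake_player_response = {
    "streamingData": {"expiresInSeconds": "21540", "formats": [], "adaptiveFormats": []}
}
raw = urlencode({"c": "WEB", "player_response": json.dumps(fake_player_response)})

print(list(extract_streaming_data(raw).keys()))
# ['expiresInSeconds', 'formats', 'adaptiveFormats']
```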
The keys adaptiveFormats and formats contain the available video formats and the links to download them.
>> d = d.get('formats')[0]
>> d
{'approxDurationMs': '10007',
'audioChannels': 2,
'audioQuality': 'AUDIO_QUALITY_LOW',
'audioSampleRate': '44100',
'averageBitrate': 736573,
'bitrate': 740048,
'contentLength': '921361',
'fps': 25,
'height': 360,
'itag': 18,
'lastModified': '1458553967463178',
'mimeType': 'video/mp4; codecs="avc1.42001E, mp4a.40.2"',
'projectionType': 'RECTANGULAR',
'quality': 'medium',
'qualityLabel': '360p',
'url': 'https://r6---sn-un57en7l.googlevideo.com/videoplayback?expire=1606065649&ei=kUm6X_jhIdXRqAG0r5v...',
'width': 640}
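With several entries in formats, one way to choose is to pick the one with the largest height. The sketch below assumes each entry carries the height and bitrate keys shown above; the function name and the second entry in the sample list are invented for the example.

```python
def pick_best_format(formats: list) -> dict:
    """Return the entry with the greatest height (ties broken by bitrate)."""
    if not formats:
        raise ValueError("no formats available")
    return max(formats, key=lambda f: (f.get("height", 0), f.get("bitrate", 0)))


# Sample entries: the first uses values from the output above,
# the second is invented for illustration.
formats = [
    {"itag": 18, "height": 360, "bitrate": 740048, "qualityLabel": "360p"},
    {"itag": 22, "height": 720, "bitrate": 1500000, "qualityLabel": "720p"},
]

best = pick_best_format(formats)
print(best["itag"], best["qualityLabel"])
# 22 720p
```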
Once you have the link you need, download it with:
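def download_file(url, filename) -> None:
    chunk_size = 1024  # bytes read per chunk
    r = requests.get(url, stream=True)
    r.raise_for_status()  # stop on HTTP errors instead of saving an error page
    with open(filename, "wb") as f:
        for chunk in r.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)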
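If you want to avoid third-party packages entirely, the same download can be sketched with the standard library. This is an alternative to the requests version above, not a replacement for it; the function name download_file_stdlib and the default chunk size are assumptions made for the example.

```python
import shutil
import urllib.request


def download_file_stdlib(url: str, filename: str, chunk_size: int = 64 * 1024) -> None:
    """Stream the resource at `url` into `filename` in fixed-size chunks."""
    with urllib.request.urlopen(url) as response, open(filename, "wb") as f:
        # copyfileobj reads and writes chunk_size bytes at a time,
        # so the whole video is never held in memory.
        shutil.copyfileobj(response, f, length=chunk_size)
```

For example, download_file_stdlib(d["url"], "video.mp4") would save the format selected earlier.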
That's it. We have successfully built a YouTube downloader🥳🥳.
But this is not very convenient to use, so I decided to build a YouTube downloader that lets you choose among the available formats before downloading.
Introducing pytdl
pronounced pie-tee-dee-el
Remember, this is not as good as the ones out there😁.
Install
pip install -U pytdl --user
Usage
pytdl https://www.youtube.com/watch?v=E5c_Ty3RVGA
Links
GitHub: pytdl
Computer Scientist and Deep Learning Researcher