Python Script to download resources from NPTEL in all formats available
Before we get started, ensure that you have installed Python3 on your machine.
( contains a sample code that does all the basic operations)
- Beautiful Soup
- Requests
The available formats for downloading are
- Video
- MP4
- 3GP
- Audio
- MP3
- Transcript
The resources are available to download from the link -[Course ID]
Eg -
For steps to reach this website, click this link
- Include the following import in your file:
  from main import nptel as nptel
- Create an object with the URL of the download page
Download = nptel("")
ObjectName.getLinks(Format of the content, 'mod'+(Week Number))
returns the download links of all the videos of the specified format and provided week.
Links = Download.getLinks('mp4','mod03')
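To make the two arguments concrete, here is a minimal sketch of how a list of links could be narrowed by format and week. It is not the script's actual implementation of getLinks. The helper name `filter_links` is hypothetical, the sample links are made up for illustration, and the sketch assumes every link carries a `filename=modXXlecYY.ext` query parameter like the sample partial link in this README.

```python
from urllib.parse import urlsplit, parse_qs


def filter_links(links, fmt, week=None):
    """Keep links whose filename parameter matches the format and, optionally, the week tag."""
    selected = []
    for link in links:
        # Read the 'filename' query parameter, e.g. 'mod03lec12.mp4'
        params = parse_qs(urlsplit(link).query)
        filename = params.get('filename', [''])[0]
        if not filename.endswith('.' + fmt):
            continue
        if week is not None and not filename.startswith(week):
            continue
        selected.append(link)
    return selected


# Made-up sample links, shaped like the partial links described in this README
sample = [
    '/courses/download_mp4.php?subjectId=106106198&filename=mod01lec01.mp4&subjectName=Intro',
    '/courses/download_mp4.php?subjectId=106106198&filename=mod03lec12.mp4&subjectName=Search',
    '/courses/download_mp3.php?subjectId=106106198&filename=mod03lec12.mp3&subjectName=Search',
]
print(filter_links(sample, 'mp4', 'mod03'))
```

Leaving out the week argument keeps every link of the given format, which mirrors passing only the format to getLinks.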
For weeks you can use
'mod'+"%02d"%(Week Number)
to automatically format the week number.
For downloading a specific format, use the corresponding string given below:
- MP4 ->
- FLV ->
- 3GP ->
- MP3 ->
- PDF ->
- If all videos of a format are to be downloaded, pass only the format to .getLinks
Links = Download.getLinks('mp4')
- PDF can only be downloaded fully and not week wise.
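The week formatting described above can be wrapped in a small helper. This is only a sketch: the function names `week_tag` and `week_tag_f` are hypothetical and not part of the script.

```python
def week_tag(week_number):
    """Return the 'modNN' string for a week, zero-padded to two digits."""
    return 'mod' + "%02d" % week_number


def week_tag_f(week_number):
    """Same result as week_tag, written with an f-string."""
    return f"mod{week_number:02d}"


for week in (1, 3, 12):
    print(week_tag(week))
```

The result can be passed as the second argument, for example `Download.getLinks('mp4', week_tag(3))`.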
Downloading from the link
After step 2, the download links are stored in a list. To download them, pass each URL together with a file name to the download call.
The list of URLs for PDF files contains the complete link to each file
>>> Links = Download.getLinks('English')
>>> Links[0]
''
count = 1
for url in Links:
    print('\nDownloading PDF ',count)
    , str(count)+'.'+'pdf')
    count = count + 1
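The loop above can also be written with `enumerate`. The sketch below assumes the download call is `urllib.request.urlretrieve` from the standard library. The call name is missing from the text above, so this is a reasonable guess rather than what the script necessarily uses. The helper name `download_all` and its `fetch` parameter are hypothetical.

```python
from urllib.request import urlretrieve


def download_all(links, extension, fetch=urlretrieve):
    """Save each complete URL as 1.<extension>, 2.<extension>, ... and return the file names."""
    saved = []
    for count, url in enumerate(links, start=1):
        filename = str(count) + '.' + extension
        print('\nDownloading', extension.upper(), count)
        # Assumption: urlretrieve(url, filename) writes the response to disk
        fetch(url, filename)
        saved.append(filename)
    return saved
```

With the list from getLinks, a call such as `download_all(Links, 'pdf')` would save the files as 1.pdf, 2.pdf, and so on. The `fetch` parameter exists only so the loop can be tried without network access.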
- All other files except PDF
The list of URLs for these files contains only a partial link to each file
>>> Links = Download.getLinks('mp4')
>>> Links[0]
'/courses/download_mp4.php?subjectId=106106198&filename=mod01lec01.mp4&subjectName=Introduction to the Course History of Artificial Intelligence'
Therefore we have to prepend '' to the URL when passing it
Here the file name is obtained from the URL by splitting it on '='.
count = 0
for url in Links:
    print('\nDownloading ',Links[count].split('=')[3])
    ''+ url , str(count+1)+"."+Links[count].split('=')[3]+'.'+format)
    count = count + 1
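Taking the fourth piece of `split('=')` only works while the query parameters keep their current order. As an alternative sketch, the standard library's `urllib.parse` can read `subjectName` by name and percent-encode the spaces in the link. `BASE_URL` is a placeholder because the site prefix is not given in the text above, and `build_download` is a hypothetical helper, not part of the script.

```python
from urllib.parse import urlsplit, parse_qs, quote

# Placeholder only: the real site prefix is not given in this README
BASE_URL = 'https://example.invalid'


def build_download(link, index, fmt, base_url=BASE_URL):
    """Return (full_url, filename) for one partial link."""
    # Look up the lecture title by parameter name instead of by position
    params = parse_qs(urlsplit(link).query)
    title = params['subjectName'][0]
    # Percent-encode spaces while keeping the URL separators intact
    full_url = base_url + quote(link, safe='/?=&')
    filename = f"{index}.{title}.{fmt}"
    return full_url, filename


link = ('/courses/download_mp4.php?subjectId=106106198&filename=mod01lec01.mp4'
        '&subjectName=Introduction to the Course History of Artificial Intelligence')
url, name = build_download(link, 1, 'mp4')
print(name)
```

The file name follows the same number, title, extension pattern as the loop above.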