This Python application downloads batches of lesson recordings from Politecnico di Milano.
It is intended for downloading a large number of recordings (e.g. entire courses) quickly and easily. If you want to download a single recording, consider using this browser extension.
- Polimi recordings downloader
- Python
- aria2: this needs to be in your $PATH (for example, put aria2c.exe inside C:\Program Files\aria2c and add this folder to $PATH)
- (Optional) Create a virtual environment. Inside the project folder run:
  python -m venv .venv
  Then activate the environment with .venv\Scripts\activate.bat on Windows, or with source .venv/bin/activate on Unix/macOS. See here for more information about virtual environments. If you know how to use Poetry, you can use that instead.
- Install the libraries:
  pip install -r requirements.txt
Run python -m prd --help for information about usage and additional options.
This app can download the recordings from:
- A URL to a recordings archive page that lists all the recordings you want to download
- A txt file with the links to the recordings
- A URL to a Webeep "Recordings" page where the professor links the recordings
- The URL of a public (no authentication) webpage where the professor links directly to the videos (for example, the professor's personal site)
- An HTML file with the direct links to the videos. This is useful when the webpage is behind authentication.
This mode parses a page from the recordings archives to fetch the download links of the videos.
In order to download a batch of recordings some steps are required:
- With your browser open the recordings archives. From the browser copy the SSL_JSESSIONID cookie value (domain is www11.ceda.polimi.it) and set it using:
  python -m prd set-cookie SSL_JSESSIONID "{COOKIE_VALUE}"
  SSL_JSESSIONID must be taken from the recordings page, because it can change between different pages of the web services.
- With your browser open Webex and log in. From the browser copy the ticket cookie value and set it using:
  python -m prd set-cookie ticket "{COOKIE_VALUE}"
- With your browser navigate to the recordings archive and search for a course to download. Try to have all the recordings on a single page.
- Make sure all the recordings you want are on the page.
- Copy the current URL and run:
  python -m prd archives "{URL}"
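The steps above can also be scripted. The following is a minimal sketch, not part of the app itself: the helper name archives_commands is hypothetical, and it only assembles the three commands shown above as argument lists so they can be reviewed or passed to subprocess.run.

```python
import shlex
import sys


def archives_commands(ssl_jsessionid, ticket, archive_url, python=sys.executable):
    """Return the three prd invocations from the steps above as argument lists.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    base = [python, "-m", "prd"]
    return [
        base + ["set-cookie", "SSL_JSESSIONID", ssl_jsessionid],
        base + ["set-cookie", "ticket", ticket],
        base + ["archives", archive_url],
    ]


if __name__ == "__main__":
    # Placeholder values: replace them with the cookies and URL copied from your browser.
    for cmd in archives_commands("{COOKIE_VALUE}", "{COOKIE_VALUE}", "{URL}"):
        print(shlex.join(cmd))
        # To actually run a step: subprocess.run(cmd, check=True)
```

Passing the values as separate list items avoids shell quoting problems with cookie values or URLs that contain special characters.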
This mode parses a TXT file containing the URLs or video ids of some recordings, one per line, in any of these formats:
{VIDEO_ID}
https://politecnicomilano.webex.com/politecnicomilano/ldr.php?RCID={VIDEO_ID}
https://politecnicomilano.webex.com/recordingservice/sites/politecnicomilano/recording/playback/{VIDEO_ID}
https://politecnicomilano.webex.com/recordingservice/sites/politecnicomilano/recording/{VIDEO_ID}/playback
https://politecnicomilano.webex.com/webappng/sites/politecnicomilano/recording/{VIDEO_ID}/playback
https://politecnicomilano.webex.com/webappng/sites/politecnicomilano/recording/playback/{VIDEO_ID}
https://politecnicomilano.webex.com/webappng/sites/politecnicomilano/recording/{VIDEO_ID}
This command supports downloading only one course at a time.
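To check a TXT file before running the command, the accepted formats can be matched with a few regular expressions. This is only an illustrative sketch and not the parser the app uses: the function name extract_video_id is hypothetical, and it assumes that video ids are plain alphanumeric strings.

```python
import re

# Assumption: video ids consist only of letters and digits.
_ID = r"[A-Za-z0-9]+"

_PATTERNS = [
    # .../ldr.php?RCID={VIDEO_ID}
    re.compile(rf"ldr\.php\?RCID=({_ID})"),
    # .../recording/{VIDEO_ID}, .../recording/{VIDEO_ID}/playback,
    # .../recording/playback/{VIDEO_ID}
    re.compile(rf"/recording/(?:playback/)?({_ID})"),
    # A bare {VIDEO_ID} on its own line
    re.compile(rf"^({_ID})$"),
]


def extract_video_id(line):
    """Return the video id found in one line of the TXT file, or None."""
    line = line.strip()
    for pattern in _PATTERNS:
        match = pattern.search(line)
        if match:
            return match.group(1)
    return None
```

Lines for which extract_video_id returns None are worth checking by hand before passing the file to the txt command.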
Some steps are required:
- With your browser open Webex and log in. From the browser copy the ticket cookie value and set it using:
  python -m prd set-cookie ticket "{COOKIE_VALUE}"
- Run:
  python -m prd txt --course="My beautiful course" --academic-year="2021-22" {TXT_FILE}
This mode parses a "Recordings" page where the professor links the recordings.
Some steps are required:
- With your browser open Webeep. From the browser copy the MoodleSession cookie value and set it using:
  python -m prd set-cookie MoodleSession "{COOKIE_VALUE}"
- With your browser open Webex and log in. From the browser copy the ticket cookie value and set it using:
  python -m prd set-cookie ticket "{COOKIE_VALUE}"
- With your browser navigate to the Webeep recordings section and copy the URL of the page.
- Run:
  python -m prd webeep "{WEBEEP_URL}"
This mode parses the URL of a public (i.e. without authentication) HTML page where the professor links directly to the recordings.
Some steps are required:
- With your browser open Webex and log in. From the browser copy the ticket cookie value and set it using:
  python -m prd set-cookie ticket "{COOKIE_VALUE}"
- With your browser navigate to the page where the direct links are placed.
- Copy the URL of the page.
- Run:
  python -m prd webpage-url --course="{COURSE_NAME}" --academic-year="2021-22" "{URL}"
This mode parses an HTML file in which the professor links directly to the recordings.
Some steps are required:
- With your browser open Webex and log in. From the browser copy the ticket cookie value and set it using:
  python -m prd set-cookie ticket "{COOKIE_VALUE}"
- With your browser navigate to the page where the direct links are placed.
- Download the page HTML.
- Run:
  python -m prd webpage-html --course="{COURSE_NAME}" --academic-year="2021-22" {FILE_PATH}
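To preview which links a saved page contains before running the command, the HTML can be scanned with Python's standard html.parser module. This is a rough sketch under stated assumptions and not the app's own parsing logic: the class WebexLinkCollector and the function webex_links are hypothetical names, and the sketch assumes that the recordings appear as ordinary a tags whose href contains politecnicomilano.webex.com.

```python
from html.parser import HTMLParser


class WebexLinkCollector(HTMLParser):
    """Collect href values of <a> tags that point to the Webex site."""

    def __init__(self, host="politecnicomilano.webex.com"):
        super().__init__()
        self.host = host
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep only Webex links and skip duplicates, preserving page order.
        if href and self.host in href and href not in self.links:
            self.links.append(href)


def webex_links(html_text):
    """Return the Webex links found in an HTML document, in page order."""
    collector = WebexLinkCollector()
    collector.feed(html_text)
    return collector.links
```

For a saved page, something like webex_links(open("page.html", encoding="utf-8").read()) would list the candidate links; the file name here is only a placeholder.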
Inside the output folder there will be:
- A dowaload_links.txt file, which is the one fed to aria2. If the option --no-aria2c is used, this file will contain a list of download links to be passed to another program (for example, Free Download Manager) to download the recordings.
- One folder for each course parsed. Inside this folder there will be the recordings and an xlsx file with the recordings metadata (unless --no-create-xlsx is used).
Use the command:
  aria2c --input-file=output/dowaload_links.txt --auto-file-renaming=false --dir=output --max-concurrent-downloads=16 --max-connection-per-server=16
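The same aria2c invocation can be launched from Python. The sketch below is an illustration only: the function names aria2c_command and run_aria2c are hypothetical, the flags are exactly those of the command above, and the default folder and file names are the ones used in this document.

```python
import shutil
import subprocess


def aria2c_command(output_dir="output", links_file="dowaload_links.txt", parallel=16):
    """Build the aria2c command shown above as an argument list."""
    return [
        "aria2c",
        f"--input-file={output_dir}/{links_file}",
        "--auto-file-renaming=false",
        f"--dir={output_dir}",
        f"--max-concurrent-downloads={parallel}",
        f"--max-connection-per-server={parallel}",
    ]


def run_aria2c(**kwargs):
    """Run aria2c, failing early if the executable is not in PATH."""
    if shutil.which("aria2c") is None:
        raise RuntimeError("aria2c was not found in PATH")
    subprocess.run(aria2c_command(**kwargs), check=True)
```

Lowering the parallel value may help if the server starts refusing connections.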