massivedl

by dimkouv

Download a large list of files concurrently

129 stars, 7 forks, 59 commits, 3 releases. Latest release: v1.2. License: GNU General Public License v3.0.

Download a large list of files in parallel.

Install

# for Linux 64-bit (amd64)
wget https://github.com/dimkouv/massivedl/releases/download/v1.2/massivedl_linux_amd64
chmod +x massivedl_linux_amd64
# writing to /usr/local/bin usually requires root
sudo mv massivedl_linux_amd64 /usr/local/bin/massivedl

Usage

Create a .csv file that lists the downloads, with one filename and URL per row:

filename,url
0.png,https://placehold.it/100x100
1.png,https://placehold.it/100x101
2.png,https://placehold.it/100x102
...
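
If you start from a plain list of URLs, a small shell function can produce a CSV in this layout. This is only a sketch: `make_download_csv` and `urls.txt` are hypothetical names and not part of massivedl. It assumes the URLs contain no quotes or line breaks and that naming files by their line index is acceptable.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of massivedl): turn a text file with one URL
# per line into a "filename,url" CSV. Files are named by line index and get
# the extension passed as the third argument (default: png).
make_download_csv() {
    local url_list="$1" out_csv="$2" ext="${3:-png}"
    local i=0 url
    printf 'filename,url\n' > "$out_csv"
    # "|| [ -n "$url" ]" also handles a last line without a trailing newline
    while IFS= read -r url || [ -n "$url" ]; do
        [ -z "$url" ] && continue          # skip blank lines
        printf '%s.%s,%s\n' "$i" "$ext" "$url" >> "$out_csv"
        i=$((i + 1))
    done < "$url_list"
}
```

For example, `make_download_csv urls.txt data.csv png` would write a header row followed by `0.png,<first url>`, `1.png,<second url>`, and so on.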

Assuming the file is named data.csv, we can download the files using the command below. Here -s 1 skips the CSV header row.

massivedl -p 10 -i data.csv -s 1 -o downloads

Command line parameters

-p  (default=10)          : Maximum number of parallel requests
-s  (default=0)           : Number of skipped lines from input csv
-i                        : Input csv file with the list of urls
-o  (default='downloads') : Directory to place the downloads
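
After a run, it can be useful to check which CSV entries have no file in the output directory. The sketch below assumes that massivedl stores each download as `<output dir>/<filename>`, using the name from the filename column. That is an assumption drawn from the example above, not confirmed behavior. `list_missing` is a hypothetical helper and not part of massivedl.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of massivedl): print the CSV rows whose file
# is absent from the output directory.
# Assumptions: the first column is the filename, filenames contain no commas,
# and files are saved as <out_dir>/<filename>.
list_missing() {
    local csv="$1" out_dir="${2:-downloads}" skip="${3:-1}"
    local name rest
    # tail -n +N starts printing at line N, so skip+1 drops the first $skip lines
    tail -n "+$((skip + 1))" "$csv" |
    while IFS=, read -r name rest || [ -n "$name" ]; do
        [ -z "$name" ] && continue         # ignore blank lines
        [ -e "$out_dir/$name" ] || printf '%s,%s\n' "$name" "$rest"
    done
}
```

For example, `list_missing data.csv downloads 1 > missing.csv` would collect the rows that still need downloading. Because that file has no header row, it could presumably be passed back with `-i missing.csv -s 0`.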

Stop and continue later

You can stop a download and resume it later. Press Ctrl+C and the following dialog appears:
...
Do you want to save progress? [Y/n]: yes

Progress has been saved! Use the following command to continue downloading

massivedl --load /path/to/savedfile.save

Use Cases

With this tool I was able to download about 1.5 million images (~60GB) for a machine learning project.
