Download Files Using cURL: Let’s Learn How To Do It

Do you want to download files via the command line on a Linux system? With cURL you can do that.

I will show you how to download files with curl, but let’s start from the basics first.

More generally, you can use curl to transfer data from or to a server.

It supports a long list of protocols and the ones we are interested in for downloading files are HTTP and HTTPS. You can see the full list of protocols supported by curl in its documentation.

For now let’s have a look at how we can use curl to download a file.

Basic Syntax for the cURL Command

Here is the basic syntax to download a file using either HTTP or HTTPS:

curl http://url-for-the-file --output <filename>
curl https://url-for-the-file --output <filename>

The use of HTTP or HTTPS depends on the configuration of the server we are downloading the file from.

The --output flag is used to write the output of the curl command to a file.

So, let’s try to download a file with curl from the official curl website:

myuser@localhost:~$ curl https://curl.haxx.se/docs/protdocs.html --output protdocs.html
   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                  Dload  Upload   Total   Spent    Left  Speed
 100  2063  100  2063    0     0   9008      0 --:--:-- --:--:-- --:--:--  8969 

As you can see, curl shows the progress of the download, and with the ls command we can confirm that the file has been downloaded:

myuser@localhost:~$ ls -ltr
total 8
-rw-r--r--  1 myuser  mygroup  2063  1 Apr 18:42 protdocs.html 

Using the file command we can confirm that this is an HTML file:

myuser@localhost:~$ file protdocs.html 
protdocs.html: HTML document text, ASCII text 

Now, let’s try to use the HTTP protocol instead of the HTTPS one:

myuser@localhost:~$ curl http://curl.haxx.se/docs/protdocs.html --output protdocs.html
   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                  Dload  Upload   Total   Spent    Left  Speed
   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0 

Something doesn’t look right…

For some reason zero data has been transferred as you can see from the curl output.

And if we check the file using the ls command we will notice that the file is empty:

myuser@localhost:~$ ls -ltr
total 0
-rw-r--r--  1 myuser  mygroup  0  1 Apr 18:50 protdocs.html 

So, what’s going on?

To have more details we can use the -v flag that provides verbose output:

myuser@localhost:~$ curl -v http://curl.haxx.se/docs/protdocs.html --output protdocs.html
   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                  Dload  Upload   Total   Spent    Left  Speed
   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 151.101.18.49:80...
 * TCP_NODELAY set
 * Connected to curl.haxx.se (151.101.18.49) port 80 (#0)
 > GET /docs/protdocs.html HTTP/1.1
 > Host: curl.haxx.se
 > User-Agent: curl/7.65.3
 > Accept: */*
 > 
 * Mark bundle as not supporting multiuse
 < HTTP/1.1 301 Moved Permanently
 < Server: Varnish
 < Retry-After: 0
 < Location: https://curl.haxx.se/docs/protdocs.html
 < Content-Length: 0
 < Accept-Ranges: bytes
 < Date: Wed, 01 Apr 2020 17:52:41 GMT
 < Via: 1.1 varnish
 < Connection: close
 < X-Served-By: cache-lcy19240-LCY
 < X-Cache: HIT
 < X-Cache-Hits: 0
 < X-Timer: S1585763561.363434,VS0,VE0
 < 
   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
 * Closing connection 0 

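The answer is in two lines of the output above:

< HTTP/1.1 301 Moved Permanently
< Location: https://curl.haxx.se/docs/protdocs.html

When we connect with curl to the HTTP URL we get redirected to the HTTPS URL with a 301 redirect, a common way to send clients from an HTTP URL to its HTTPS equivalent.

That’s why the file we downloaded is empty: curl does not follow redirects by default, so it saved only the empty body of the 301 response (note the Content-Length: 0 header).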
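One way to handle this redirect is the -L (--location) flag, which tells curl to repeat the request against the URL given in the Location header. Here is a minimal sketch, assuming you are happy to follow wherever the server redirects you. The function name download_following_redirects is made up for this example and is not part of curl.

```shell
# Hypothetical helper (the name is an assumption made for this example).
# -L / --location makes curl re-issue the request to the URL in the
# Location header when the server answers with a redirect such as 301.
download_following_redirects() {
    # $1 is the URL to download, $2 is the file to write to.
    curl -L "$1" --output "$2"
}

# Example usage (needs network access, so it is left commented out):
# download_following_redirects http://curl.haxx.se/docs/protdocs.html protdocs.html
```

With -L the HTTP URL used above should end up saving the HTML page instead of an empty file, as long as the redirect target is reachable.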

What If I Want to Download Files With cURL Without Seeing the Transfer Report?

It’s very common to use curl in a Bash script.

For example, you might have a script that performs a sequence of tasks, and one of them downloads a file using curl before the script moves on to the next tasks.

In this scenario the user running the script might not want to see the transfer report that curl returns by default:

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                  Dload  Upload   Total   Spent    Left  Speed
   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0 

So, how can we hide it?

My first answer would be: let’s have a look at the manual for curl.

To do that on Linux (and other Unix-like) systems you can use the man command:

man curl

The man page for curl is pretty long! 😀

The curl command accepts so many different flags, quite interesting…

I want to make your life easy, so I will tell you how to hide the transfer report. You can use the following flag:

-s, --silent 

So you can use either the short flag -s or the long flag --silent.

It’s a common convention for command line tools to offer a long and a short version of a flag. The same applies to the --output flag, which can also be written as -o.

But how do you know if a flag has a short and a long version?

The man page for the tool will tell you that.

Now, let’s try downloading the file with curl while passing the -s flag:

myuser@localhost:~$ curl -s https://curl.haxx.se/docs/protdocs.html --output protdocs.html
myuser@localhost:~$ ls -ltr
total 8
-rw-r--r--  1 myuser  mygroup  2063  1 Apr 19:03 protdocs.html 

Perfect, the transfer report is not displayed, exactly what we wanted! 🙂

If we use the short flag for --output the command becomes:

myuser@localhost:~$ curl -s https://curl.haxx.se/docs/protdocs.html -o protdocs.html

Note: it’s important to remember that the name of the file to write to must follow the -o flag.

And, what do you think will happen if we don’t provide a file name after the -o flag?

myuser@localhost:~$ curl -s https://curl.haxx.se/docs/protdocs.html -o
curl: option -o: requires parameter
curl: try 'curl --help' or 'curl --manual' for more information 

So, curl is smart enough to detect that I haven’t passed a filename after -o, and it rejects my command, reporting that -o requires a parameter.
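Going back to the Bash script scenario, here is a rough sketch of what a quiet download step could look like. The function name fetch_step and the two extra flags are assumptions added for illustration: -S (--show-error) keeps error messages visible even with -s, and -f (--fail) makes curl return a non-zero exit code on HTTP errors such as 404, so the script can react.

```shell
# Sketch of a script step that downloads a file quietly.
# fetch_step is a hypothetical name chosen for this example.
fetch_step() {
    # $1 is the URL to download, $2 is the file to write to.
    # -s hides the transfer report, -S still prints error messages,
    # -f makes curl exit with a non-zero status on HTTP errors (e.g. 404).
    if curl -sSf "$1" -o "$2"; then
        echo "Downloaded $2"
    else
        echo "Download failed: $1" >&2
        return 1
    fi
}

# Example usage (needs network access, so it is left commented out):
# fetch_step https://curl.haxx.se/docs/protdocs.html protdocs.html && echo "Moving on to the next task"
```

Checking the exit code this way lets the script stop, or skip the following tasks, when the download did not succeed.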

What Happens to the Output Without the --output Flag?

The --output flag is optional. I’m wondering what happens if we don’t pass it to the command…

myuser@localhost:~$ curl -s https://curl.haxx.se/docs/protdocs.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head> <title>curl - Protocol Documentation</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
<link rel="stylesheet" type="text/css" href="https://curl.haxx.se/curl.css">
<link rel="shortcut icon" href="https://curl.haxx.se/favicon.ico">
<link rel="icon" href="https://curl.haxx.se/logo/curl-symbol.svg" type="image/svg+xml">
</head>
<body bgcolor="#ffffff" text="#000000">
<div class="main">
<div class="menu">
<a href="/docs/" class="menuitem" title="Documentation main page">Docs Overview</a>
<a href="https://curl.haxx.se/docs/caextract.html" class="menuitem" title="CA cer
...
...
...
and this continues until the end of the HTML file

So, without the --output flag, the curl command prints the content of the HTML file to standard output, which here means the shell.
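Because the body goes to standard output when --output is omitted, ordinary shell redirection can save it to a file as well. Here is a small sketch of that idea. The helper name save_with_redirect is made up for this example.

```shell
# Hypothetical helper: saves the body using shell redirection instead of -o.
save_with_redirect() {
    # $1 is the URL to download, $2 is the file to write to.
    # curl prints the body to standard output; > sends it to the file.
    curl -s "$1" > "$2"
}

# Example usage (needs network access, so it is left commented out):
# save_with_redirect https://curl.haxx.se/docs/protdocs.html protdocs.html
```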

Can I use the Pipe to Send the Output of cURL to Other Commands?

The pipe is used in Linux to send the standard output of one command to the standard input of another command.

Here is how we can use it with the curl command. In this example we want to see the last 5 lines of the file downloaded using curl.

To print the last lines of the file in the shell we use the tail command.

myuser@localhost:~$ curl -s https://curl.haxx.se/docs/protdocs.html | tail -5
</div>
</div>
<script defer src="https://www.fastly-insights.com/insights.js?k=8cb1247c-87c2-4af9-9229-768b1990f90b" type="text/javascript"></script>
</BODY>
</HTML>

And now let’s say we want to look for any lines in the file that contain the word “javascript”.

We can use the grep command:

myuser@localhost:~$ curl -s https://curl.haxx.se/docs/protdocs.html | grep javascript
<script defer src="https://www.fastly-insights.com/insights.js?k=8cb1247c-87c2-4af9-9229-768b1990f90b" type="text/javascript"></script>

These are just two examples of the commands you can pipe the curl output to…

…I will leave other commands to your imagination! 😀
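As one more sketch along the same lines, the pipe can also feed grep -c to count the matching lines instead of printing them. The helper name count_matching_lines is an assumption made for this example.

```shell
# Hypothetical helper: counts the lines of a downloaded page that contain a word.
count_matching_lines() {
    # $1 is the URL to download, $2 is the word to look for.
    # grep -c prints the number of matching lines instead of the lines themselves.
    curl -s "$1" | grep -c "$2"
}

# Example usage (needs network access, so it is left commented out):
# count_matching_lines https://curl.haxx.se/docs/protdocs.html javascript
```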

Conclusion

In this article you have learned:

  • How to download files with the curl command.
  • Two of the protocols that curl supports (HTTP and HTTPS).
  • The flag used to hide the transfer report.
  • How to use curl together with the pipe.

And if you want to learn another way to use the curl command, you can see how to call an API using curl.
