Difference between revisions of "Curl"

From Christoph's Personal Wiki

Latest revision as of 04:37, 21 April 2023

cURL is a command line tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, TFTP, Telnet, DICT, FILE and LDAP. cURL supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading, Kerberos, HTTP form-based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM and Negotiate for HTTP and kerberos4 for FTP), file transfer resume, HTTP proxy tunnelling, and many other features. cURL is free, open source software distributed under an MIT-style license.

The main purpose of cURL is to automate unattended file transfers or sequences of operations. It is, for example, a good tool for simulating a user's actions in a web browser.

libcurl is the corresponding library/API that users may incorporate into their programs; cURL acts as a stand-alone wrapper around the libcurl library. libcurl provides URL transfer capabilities to numerous applications, open source as well as commercial.

Common options

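-o
save the output to the specified file
-c
write received cookies to the specified file (cookie jar)
-O
save the output to a local file named like the remote file (repeat -O for each URL)
-i
include the HTTP response headers in the output
-I
fetch only the headers (HTTP HEAD request)
-v
verbose output, including request/response headers and TLS handshake details
-k
allow insecure TLS connections (skip certificate verification, e.g., for self-signed certificates)
-C
resume a transfer at the given offset ("-C -" lets curl work out the offset itself)
-f
fail silently on HTTP server errors (no error page is output and the exit status is non-zero)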

Simple usage

  • Get the main page from firefox's web-server:
$ curl http://www.firefox.com/
  • Get the README file from the user's home directory at funet's FTP server:
$ curl ftp://ftp.funet.fi/README
  • Get a web page from a server using port 8000:
$ curl http://www.weirdserver.com:8000/
  • Get a list of a directory of an FTP site:
$ curl ftp://cool.haxx.se/
  • Get a gopher document from a gopher server:
$ curl gopher://gopher.funet.fi
#~OR~
$ curl gopher://gopher.quux.org:70
  • Get the definition of curl from a dictionary:
$ curl dict://dict.org/m:curl
  • Fetch two documents at once:
$ curl ftp://cool.haxx.se/ http://www.weirdserver.com:8000/
  • Fetch only the response headers (this URL answers with a redirect):
$ curl -I http://xtof.ch/skills
HTTP/1.1 301 Moved Permanently
Date: Wed, 15 Apr 2015 21:24:28 GMT
Server: Apache/2.2.15 (CentOS)
Location: http://wiki.christophchamp.com/index.php/Technical_and_Specialized_Skills
Connection: close
Content-Type: text/html; charset=iso-8859-1

# xtof.ch/skills redirects to http://wiki.christophchamp.com/index.php/Technical_and_Specialized_Skills
# So, use "-L" to follow that redirect:
$ curl -L http://xtof.ch/skills
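
When scripting around redirects like the one above, it can help to pull the target out of the headers before deciding whether to follow it. The following is a minimal sketch; the function name location_of is made up for this example, and it assumes the headers arrive on STDIN exactly as "curl -sI" prints them (CR LF line endings, header names in any letter case).

```shell
# Print the value of the first Location header found in HTTP response
# headers read from STDIN. (location_of is a hypothetical helper name.)
location_of() {
  # Header names are case-insensitive and each line ends in CR LF,
  # so lower-case the name and strip the trailing carriage return.
  awk 'tolower($1) == "location:" { sub(/\r$/, ""); print $2; exit }'
}

# Example usage (needs network access):
#   curl -sI http://xtof.ch/skills | location_of
```

curl can also report the redirect target itself through the %{redirect_url} write-out variable, which avoids parsing headers by hand.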

Miscellaneous examples

  • Check on the amount of time it takes to load a website (lookup/connect/transfer times):
$ for i in $(seq 1 3); do
    curl -so /dev/null www.example.com \
    -w "time_namelookup: %{time_namelookup}\
       \ttime_connect: %{time_connect}\
       \ttime_starttransfer: %{time_starttransfer}\
       \ttime_total: %{time_total}\n";
  done
time_namelookup: 0.004  time_connect: 0.005     time_starttransfer: 0.854       time_total: 0.964
time_namelookup: 0.004  time_connect: 0.005     time_starttransfer: 0.575       time_total: 0.617
time_namelookup: 0.004  time_connect: 0.005     time_starttransfer: 0.550       time_total: 0.555
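  • Retry a failed transfer up to 25 times, waiting 20 seconds between attempts, treating "connection refused" as retryable, and using IPv4 only:
$ curl -4 --retry 25 --retry-delay 20 --retry-connrefused http://www.example.com/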
  • Share files via cURL:
$ curl -F "file=@foo.jpg" 0x0.st
https://0x0.st/ou6C.jpg
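
The timing loop in the first example above prints one line per request. To summarise several runs, the time_total values can be averaged with awk. This is only a sketch: avg_time_total is a hypothetical helper name, and it assumes the "label: value" layout produced by the -w format shown in that example.

```shell
# Average the time_total values from lines such as
#   time_namelookup: 0.004  time_connect: 0.005  ...  time_total: 0.964
# read from STDIN. (avg_time_total is a hypothetical helper name.)
avg_time_total() {
  awk '{
         # Find the "time_total:" label and add up the field that follows it.
         for (i = 1; i < NF; i++)
           if ($i == "time_total:") { sum += $(i + 1); n++ }
       }
       END { if (n) printf "%.3f\n", sum / n }'
}

# Example usage: pipe the output of the timing loop into the helper, e.g.
#   for i in $(seq 1 3); do curl -so /dev/null www.example.com \
#     -w "time_total: %{time_total}\n"; done | avg_time_total
```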

Download to a file

  • Get a web page and store in a local file:
$ curl -o thatpage.html http://www.example.com/
  • Get a web page and store it in a local file named after the remote document (this fails if the URL has no file name part):
$ curl -O http://www.example.com/index.html
  • Fetch two files and store them with their remote names:
$ curl -O www.haxx.se/index.html -O curl.haxx.se/download.html

Using cURL for fast downloads

Suppose you want to download the Ubuntu 14.04.3 LTS (Trusty Tahr; 64-bit) ISO from the following three mirrors:

$ url1=http://mirror.pnl.gov/releases/14.04/ubuntu-14.04.3-desktop-amd64.iso
$ url2=http://mirror.scalabledns.com/ubuntu-releases/14.04.3/ubuntu-14.04.3-desktop-amd64.iso
$ url3=http://mirrors.rit.edu/ubuntu-releases/14.04.3/ubuntu-14.04.3-desktop-amd64.iso

Get the total size (in bytes) of the ISO from the Content-Length header. The "tr" strips the carriage return that ends every HTTP header line, and the match is case-insensitive because servers differ in how they capitalise header names:

$ ISOURL=${url1}
$ iso_size=$(curl -sI "${ISOURL}" | tr -d '\r' | awk 'tolower($1) == "content-length:" {print $2}')
$ echo "${iso_size}" | awk '{print $1/1024^2}'    # the same size in MiB

The total size of the ISO is 1054867456 bytes (1006 MiB, roughly 1.05 GB). Using cURL's "--range" ("-r") option, we can download that ISO in 3 parts from the 3 mirrors above simultaneously with the following commands (do not forget the "&" at the end of each, so that every download runs in the background):

$ curl -r 0-499999999 -o ubuntu-14.04.3-desktop-amd64.iso.part1 $url1 &         # 1st 500MB
$ curl -r 500000000-999999999 -o ubuntu-14.04.3-desktop-amd64.iso.part2 $url2 & # 2nd 500MB
$ curl -r 1000000000- -o ubuntu-14.04.3-desktop-amd64.iso.part3 $url3 &         # remaining bytes

After all three parts have downloaded (running `wait` in the same shell blocks until the background jobs finish), `cat` them together into a single ISO:

$ cat ubuntu-14.04.3-desktop-amd64.iso.part? > ubuntu-14.04.3-desktop-amd64.iso

Finally, check the integrity of the ISO by comparing its MD5 checksum with the one published for the original ISO:

$ wget -c http://mirror.pnl.gov/releases/14.04/MD5SUMS
$ grep ubuntu-14.04.3-desktop-amd64.iso MD5SUMS
$ md5sum ubuntu-14.04.3-desktop-amd64.iso

The two values should be identical. Et voilà! You have downloaded that ISO (potentially) much faster than by fetching it from a single mirror in one transfer.
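
Rather than comparing the two checksums by eye, the comparison can be left to md5sum. The sketch below assumes GNU md5sum (its "-c" option reads a checksum list) and that the ISO sits in the current directory under the same name used in MD5SUMS; verify_md5 is a hypothetical helper name.

```shell
# Check a file against its entry in a checksum list such as MD5SUMS.
# Returns 0 if the checksum matches, non-zero otherwise.
# (verify_md5 is a hypothetical helper name.)
verify_md5() {
  file="$1"
  sums="$2"
  # Pick the line(s) naming our file and let md5sum do the comparison.
  grep -F -- "$file" "$sums" | md5sum -c -
}

# Example usage:
#   verify_md5 ubuntu-14.04.3-desktop-amd64.iso MD5SUMS && echo "ISO is intact"
```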

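Note: You could automate the process in a script that loops over the parts. You would use the ${iso_size} from above together with lines like the following, where ${sum} is the byte offset at which the current part starts and ${num} is the part number (both maintained by the loop):

$ blocksize=$((1024 * 512))
$ curl -\# -r ${sum}-$((sum + blocksize - 1)) -o ubuntu-14.04.3-desktop-amd64.iso.part${num} $url1 &

The "-\#" switches from the regular progress meter to a progress "bar". Byte ranges are inclusive at both ends, hence the "- 1" on the end offset: if each part starts ${blocksize} bytes after the previous one, consecutive parts then do not overlap.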
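
As one possible way to drive such a script, the byte ranges can be computed up front. This is a sketch under stated assumptions, not the author's script: byte_ranges is a hypothetical helper name, the file is split into equally sized parts, and the last range is left open-ended so that it picks up the remaining bytes (as in the "1000000000-" example earlier).

```shell
# Print one inclusive byte range ("start-end") per line for a file of
# SIZE bytes split into PARTS pieces; the last range is open-ended.
# (byte_ranges is a hypothetical helper name.)
byte_ranges() {
  size="$1"
  parts="$2"
  chunk=$(( (size + parts - 1) / parts ))   # ceiling division
  i=0
  while [ "$i" -lt "$parts" ]; do
    start=$(( i * chunk ))
    if [ "$i" -eq $(( parts - 1 )) ]; then
      echo "${start}-"
    else
      echo "${start}-$(( start + chunk - 1 ))"
    fi
    i=$(( i + 1 ))
  done
}

# Example usage (needs network access; assumes iso_size and url1 are set):
#   num=1
#   for range in $(byte_ranges "$iso_size" 3); do
#     curl -# -r "$range" -o "ubuntu-14.04.3-desktop-amd64.iso.part${num}" "$url1" &
#     num=$(( num + 1 ))
#   done
#   wait   # block until every background download has finished
```

Keeping the range arithmetic in its own function makes it easy to check the boundaries without starting any downloads.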

Write out variables

With curl "write-out" variables (the "-w" option), one can make curl display information on STDOUT after a completed transfer. The format is a string that may contain plain text mixed with any number of variables. It can be given as a literal "string", read from a file with "@filename", or read from STDIN with "@-".

The variables present in the output format are replaced by the corresponding value or text, as shown in the examples below. All variables are written as %{variable_name}; to output a literal "%", write "%%". You can output a newline with "\n", a carriage return with "\r", or a tab with "\t".
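
The "@filename" form can be sketched as follows. The temporary file and the function name show_stats are assumptions made for this example; each line of the format ends in an explicit "\n" escape, a common pattern for write-out format files, so the line breaks in the output come from those escapes.

```shell
# Store a write-out format in a file. The heredoc is quoted so that the
# shell leaves %{...}, %% and \n untouched for curl to interpret.
fmt_file=$(mktemp)
cat > "$fmt_file" <<'EOF'
http_code: %{http_code}\n
time_total: %{time_total}\n
literal percent sign: 100%%\n
EOF

# Fetch a URL, discard the body and print only the formatted statistics.
# (show_stats is a hypothetical helper name.)
show_stats() {
  curl -s -o /dev/null -w "@${fmt_file}" "$1"
}

# Example usage (needs network access):
#   show_stats http://www.example.com/
# Reading the format from STDIN instead:
#   curl -s -o /dev/null -w @- http://www.example.com/ < "$fmt_file"
```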

As an example of how to use write-out variables, consider my personal URL shortener (or TinyURL) website:

$ curl -I http://www.xtof.ch
HTTP/1.1 200 OK
Date: Wed, 17 Feb 2016 01:39:21 GMT
Server: Apache/2.2.15 (CentOS)
Last-Modified: Sun, 17 May 2015 21:22:51 GMT
ETag: "2a073-d2-5164dadfec0c0"
Accept-Ranges: bytes
Content-Length: 210
X-CLI: Website by Christoph Champ
X-Owner-URL: www.christophchamp.com
X-Wiki-URL: http://xtof.ch/wiki
Connection: close
Content-Type: text/html; charset=UTF-8
$ read -r -d '' WRITE_OUT_VARS <<'EOF'
content_type: %{content_type}
http_code: %{http_code}
http_connect: %{http_connect}
local_ip: %{local_ip}
local_port: %{local_port}
num_connects: %{num_connects}
num_redirects: %{num_redirects}
redirect_url: %{redirect_url}
remote_ip: %{remote_ip}
remote_port: %{remote_port}
size_download: %{size_download}
size_header: %{size_header}
size_upload: %{size_upload}
speed_download: %{speed_download}
speed_upload: %{speed_upload}
ssl_verify_result: %{ssl_verify_result}
time_connect: %{time_connect}
time_namelookup: %{time_namelookup}
time_redirect: %{time_redirect}
time_starttransfer: %{time_starttransfer}
time_total: %{time_total}
url_effective: %{url_effective}
EOF

If I pass a TinyURL to my website, I get the following:

$ curl -sw "${WRITE_OUT_VARS}\n" xtof.ch/cv -o /dev/null
content_type: text/html; charset=iso-8859-1
http_code: 301
http_connect: 000
local_ip: 10.x.x.x
local_port: 56646
num_connects: 1
num_redirects: 0
redirect_url: http://wiki.christophchamp.com/index.php/Curriculum_Vitae
remote_ip: 67.207.152.20
remote_port: 80
size_download: 338
size_header: 257
size_upload: 0
speed_download: 3238.000
speed_upload: 0.000
ssl_verify_result: 0
time_connect: 0.055
time_namelookup: 0.004
time_redirect: 0.000
time_starttransfer: 0.104
time_total: 0.104
url_effective: HTTP://xtof.ch/cv

If I tell curl to follow the redirect URL (i.e., with "-L"), I get the following:

$ curl -sLw "${WRITE_OUT_VARS}\n" xtof.ch/cv -o /dev/null
content_type: text/html; charset=UTF-8
http_code: 200
http_connect: 000
local_ip: 10.x.x.x
local_port: 54964
num_connects: 2
num_redirects: 2
redirect_url: 
remote_ip: 45.56.73.83
remote_port: 80
size_download: 66120
size_header: 1132
size_upload: 0
speed_download: 91268.000
speed_upload: 0.000
ssl_verify_result: 0
time_connect: 0.000
time_namelookup: 0.000
time_redirect: 0.466
time_starttransfer: 0.167
time_total: 0.724
url_effective: http://wiki.christophchamp.com/index.php/Curriculum_Vitae

Note how the "redirect_url", "url_effective", "remote_ip", "num_redirects", etc. have changed.
