Curl vs. Wget (haxx.se)
532 points by of on March 2, 2016 | 197 comments



For my usage:

* Wget's the interactive, end-user tool, and my go-to if I just need to download a file. For that purpose, its defaults are more sane, its command line usage is more straightforward, its documentation is better-organized, and it can continue incomplete downloads, which curl can't.

* Curl's the developer tool-- it's what I'd use if I were building a shell script that needed to download. The command line tool is more unix-y by default (outputs to stdout) and it's more flexible in terms of options. It's also present by default on more systems-- of note, OSX ships curl but not wget out of the box. Its backing library (libcurl) is also pretty nifty, but not really relevant to this comparison.
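
To make that contrast concrete, a rough sketch of the two default behaviors (example.com is just a placeholder):

  wget https://example.com/archive.tar.gz        # saves archive.tar.gz in the current directory
  curl https://example.com/archive.tar.gz > out  # curl streams to stdout unless told otherwise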

This doesn't really need to be an "emacs vs. vim" or "tabs vs. spaces"-type dichotomy: wget and curl do different things well and there's no reason why both shouldn't coexist in one's workflow.


> This doesn't really need to be an "emacs vs. vim" or "tabs vs. spaces"-type dichotomy: wget and curl do different things well and there's no reason why both shouldn't coexist in one's workflow.

Totally agree. I love curl for testing API request/responses manually. It's usually a huge part of navigating my way around a new API that doesn't have a client library available for whatever language I'm using at that time.

I also use it for weird requests that need special headers or authentication.

Wget is the first thing I turn to when I'm downloading anything remote from the command line or scraping some remote content for analysis.


Random plug: Another invaluable tool for API analysis is http://mitmproxy.org.


Thanks for mentioning it! :) One of the authors here, happy to answer any questions.


Wow, this looks extremely interesting, I'll definitely keep this in mind!


Use the Chrome Postman extension together with Postman Interceptor. It's really awesome.


I'd just like to mention HttpRequester here, which is a very similar FLOSS addon for Firefox.


Yes, wget is fantastic for mirroring www and ftp sites and I use it a lot for that purpose. It's magic [0]. I hadn't realized that it didn't support compression though, which might explain why it's so slow in some cases. Not normally a problem as it just runs in the background on a schedule.

Curl supports gzip and deflate. It would be great if support for sdch and br were added too. Brotli is in Firefox 44 and can be enabled in Chrome Canary with a flag. SDCH has been in Chrome for a while and is used on Google servers.

[0] http://www.litkicks.com/WgetMagic
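
For what it's worth, asking curl to negotiate gzip/deflate is a single flag; a minimal sketch (the URL is a placeholder):

  curl --compressed -O https://example.com/page.html   # sends Accept-Encoding and decompresses the response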


The latest Win64 version of curl doesn't seem to support gzip or deflate. I have to remove those options when copying from Chrome developer tools and pasting into a script. I'd report a bug but their page doesn't seem to have an obvious link.



I wonder if the Win64 version is built without zlib or similar.


Pretty much. If I want to save a file: wget. If I want to do _anything_ else: curl. Yes you can write files with curl, no I don't use that functionality very often. I don't think of them as "end user" vs "developer" use cases so much as them being two great tools for different tasks. I do wish that -s was the curl default, since that stderr progress output is pretty lame.


Interestingly, I've somewhat replaced my usual curl calls with HTTPie, at least on my Mac; for distributed scripts it's still curl all the way.


Doesn't curl -C continue an incomplete download? Regardless, I agree that wget is better for just downloading a file.


-C requires you to specify the byte offset, and you also need --append or >> so it's not quite the same.


Actually no. From the manpage:

Use "-C -" to tell curl to automatically find out where/how to resume the transfer. It then uses the given output/input files to figure that out.


Stupid question, but how do things like this resume from where they left off? Wouldn't the server need to be cooperating in this? Is that built into HTTP?


Yes, the server would need to support it. The request is made via an HTTP header (Range[1] IIRC).

Also, I wouldn't consider that a stupid question. :)

[1] https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#Rang...
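
Roughly what that looks like in practice, as a sketch (the URL and offset are made up):

  # ask only for the missing tail of the file via the Range header
  curl -H "Range: bytes=1048576-" -o big.iso.part https://example.com/big.iso

  # or let the tools figure the offset out from the partial file on disk
  curl -C - -O https://example.com/big.iso
  wget -c https://example.com/big.iso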


This is also how download accelerators worked (back in the late nineties, early naught's), by having different connections work at several ranges to maximize bandwidth usage.


Funny enough, it was writing a download accelerator that taught me about the Range HTTP request header.


You specify a byte "range" when requesting a file via HTTP from the server. Not all servers support this, but most do these days.

#Edit. More info here: https://en.wikipedia.org/wiki/Byte_serving


Just looked that up; that seems of rather limited usefulness. The multi-homed usage makes sense but feels out of place, really.


Why? How could it be more useful? HTTP byte ranges are incredibly flexible, since one request can specify many byte ranges at once (it's almost too flexible, since a dumb server can easily be overwhelmed by a maliciously complicated request).

It handles the basic case of fetching the remainder of an interrupted download, and can also support partial downloads e.g. for supporting a video stream with the user jumping to different places in the movie.


Correct, but very unintuitive


Also doesn't work with the command line utility's retry option as it gets the size only once and does not update it when retrying.


The article is not coming up for me. Perhaps it's ycombinated. Anyway I agree curl is more of a developer tool although using it to download files is not the first thing that comes to mind. I use it daily to identify caching issues and redirect issues. The -SLIXGET flags in particular are very useful for this.


Yeah I've been doing some libcurl development today .... Online docs have been unavailable all day ..


Huh. I've always thought of them as the opposite - wget is the full featured spidering tool, curl is the easy to run one when I need a command line thing or to bang web stuff into a janky copy and paste workflow.


This is interesting. Do you want to give your reasons or is it just an arbitrary habit?


wget leaves files everywhere, and isn't useful without picking a few flags to use, carefully. I use it as a tool to check builds for broken links

curl is a unix-way type program that interacts with standard out in a pretty predictable way.


Looks like a frontend to curl that has wget's defaults would be useful.


Merge the code, hard link the filenames to the same executable and get it to change behavior based on the name it's invoked under.


No need to merge code. Take curl as is, and set different defaults based on executable name.
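
A minimal sketch of that idea (the wrapper and its chosen defaults are assumptions, not anything curl ships):

  #!/bin/sh
  # install/hard-link this under both names; pick defaults from the invoked name
  case "$(basename "$0")" in
    wget) exec curl -fL -O -C - --retry 3 "$@" ;;  # wget-ish: write a file, follow redirects, resume
    *)    exec curl "$@" ;;                        # otherwise behave like plain curl
  esac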


In that case I'd love for it to include resuming downloads though, but given the differences I don't think merging is going to be more useful than just adding the feature in curl.

Edit: nevermind, I just learned that wget can download a page's resources or even inline them (-k apparently), that's a bit of a way off from curl's purpose. Better keep them separate tools, although wget might benefit from using libcurl so they don't have to implement lots of stuff like https or http2.


Sounds like a lot of complexity for no benefit.


Either tool can provide the functionality of the other with a little work, is what I meant to imply...


It's what vi does.


I totally agree except over the years I have been using aria2 more and more instead of wget. aria2 supports HTTP/HTTPS, FTP, SFTP, BitTorrent and Metalink with the same sane wget syntax and defaults.

Example:

  $ aria2c http://yourlink.com/file.*

-x2 allows using 2 connections per host.

https://aria2.github.io/


I'm having to use aria2 more and more to maintain decent download speeds, occasionally even from Akamai and CloudFront. S3 seems to be the exception.


curl is the one where I have to remember whether to use -o or -O when trying to download a file with the original filename, so I just use wget instead because it's faster than reading the curl man page.
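
For anyone else who mixes them up (the paths here are made up):

  curl -O https://example.com/files/report.pdf          # capital -O: save as report.pdf (the remote name)
  curl -o out.pdf https://example.com/files/report.pdf  # lowercase -o: save under the name you choose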


I have the same problem. Curl is "unix-y" in the sense that the default options are optimized for a shell script and make no sense for interactive usage.

bropages is where it's at for that kind of stuff. curl is actually their usage example :)

http://bropages.org/


Their example will just dump the webpage to stdout! Almost certainly not the desired behavior given the comment. Then they include a second example of a use case that almost nobody has, instead of giving the option that everyone is actually looking for.


wow what the heck libcurl supports a ridiculous number of protocols (https://curl.haxx.se/libcurl/)


> Wget's [..] defaults are more sane

Does it finally use filename from "Content-Disposition" without need for any switches?


That sounds like a security issue, allowing the attacker to name the file however they want.


"This doesn't really need to be an "emacs vs. vim" or "tabs vs. spaces"-type dichotomy."

Agreed, and even "emacs vs. vim" or "tabs vs. spaces" do not really need to be "emacs vs. vim" or "tabs vs. spaces"-type dichotomies.


My favorite use of wget: mirroring web documentation to my local machine.

    wget -r -l5 -k -np -p https://docs.python.org/2/
Rewrites the links to point local where appropriate, and the ones which are not local remain links to the online documentation. Makes for a nice, seamless experience while browsing documentation.

I also prefer wget to `curl -O` for general file downloads, simply because wget will handle redirects by default, `curl -O` will not. Yes, I could remember yet another argument to curl... but why?

That said, I love curl (combined with `jq`) for playing with rest interfaces.


Thanks for the jq tip. I hadn't seen that before. Link if anyone's interested:

https://stedolan.github.io/jq/


Underscore CLI looks interesting as well, though I haven't personally had much of a chance to play with it. It does require NodeJS, which might be a deal-breaker for some, but if it's already in your toolchain then it might come in handy.

https://github.com/ddopson/underscore-cli


I've written a tool similar to underscore-cli and jq, called jowl. It's designed to be easier to learn (for JavaScript developers) than underscore-cli or jq. The README includes a comparison.

It's still early in its development, and would benefit from a tutorial and a few more features, but it's getting there.

https://www.npmjs.com/package/jowl


N.B., this will make curl redirect by default:

  echo "-L" >> ~/.curlrc


This is a great use, and I've used it for a lot of other sites and documentation. However I just want to point out that Python's documentation happens to come in an easily downloadable archive of HTML files:

https://docs.python.org/2.7/download.html

(you can also find archives for Python 3 and SciPy and NumPy)


wget is ludicrously good for mirroring. I use it to mirror the entirety of EDGAR.

For my money another tool that belongs in the toolbox is perl's WWW::Mechanize and its component WWW::Mechanize::Shell.


How does wget compare with httrack for mirroring?


It's somewhat hard to compare CLI vs GUI programs.. but in my experience, wget --mirror is simpler than HTTrack for simple tasks; HTTrack takes a bit of mucking around just to start downloading, but it has a smaller learning curve for more complex situations.


For docs specifically, you could also try Zeal, which allows you to browse documentation for quite a few programming languages offline, including Mozilla's JS docs and an HTML spec, as you mentioned in some other comment in this thread.

https://zealdocs.org/


Wow. This is so incredibly useful to me.

The general workflow I have followed was, "Download PDF if present. If documentation can be shown in single HTML page, download it, change all the anchors to point to my own filesystem". This changes everything in such a simple, useful, way.


I bet that would be extra useful for creating Dash.app docsets...


If you look at some of the default docsets you can tell that this was most likely done. All of the directories under the file root in the docset are named after domain names (one of the things wget does when crawling a site and traversing multiple domains).


Or just curl -OL


Is this the part where I tell you about pydoc and blow your mind?

I had a patch at one point to make pydoc style itself just like docs.p.o, but I'm not working much in Python these days. Maybe it's not even necessary any more, which would be cool.


Sure. Does it work with golang docs, the w3 HTML5 spec, Mozilla's JS reference, etc?

;)

Python's built in documentation tools are good, but the command I referenced also gives you the full language spec locally, as well as the FAQs and packaging docs.


Also putting this out there—for nicer REST API interaction on the CLI, and a little more user-friendliness, you might also want to add HTTPie[1] to your toolbelt.

It's not going to replace curl or wget usage, but it is a nicer interface in certain circumstances.

[1] https://github.com/jkbrzt/httpie
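
To give a flavour of why it feels nicer for APIs: a request that needs several -H/-d flags in curl looks roughly like this in HTTPie (the URL, header, and fields are made up). As I recall, headers use `:`, string fields `=`, and raw JSON values `:=`.

  http POST https://api.example.com/widgets X-API-Token:secret name=foo count:=3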


HTTPie is amazing for working with JSON/REST interfaces. It's really succinct and generally designed with user-friendliness in mind, versus supporting every nook and cranny of the HTTP RFCs. It's installable via homebrew for Mac users.

Whenever I go back to curl I feel the same minor aggravation I do when moving from homebrew back to apt (want to install? apt-get. Search? apt-cache search. List? dpkg. Remove? apt-get again, for some reason). It's just not a very user-friendly CLI and I'm constantly pulling up the curl man pages.


On Debian these days, they provide an executable named apt that does all the common stuff in one place (install, search, list (mimics dpkg), remove, show, etc.).


> want to install? apt-get. Search? apt-cache search. List? dpkg. Remove? apt-get again, for some reason

I think this is largely why aptitude exists; a single unified command for all common package tasks. Most of the Debian docs have been switched over to recommend its usage, but for some reason Ubuntu hasn't followed suit. That said, I just keep using the same old shell aliases I added in 2003 personally.


Arch Linux has pacman, which is a single program that does all of the things you need from a package manager. OpenSUSE systems have zypper, which also has one command for everything (and has a much more reliable format -- rpm).


How is RPM a more reliable format than dpkg? I've worked with both systems as part of writing a program that generates pacman/dpkg/RPM packages, and I've found dpkg to be the most sane and well-designed of the three, while the RPM format is a horrible and under-documented atrocity.


I meant more reliable than Arch Linux's format. I haven't really worked with dpkg much, so I can't comment.


I've been loving HTTPie, and for my somewhat idiosyncratic usage it's completely replaced curl and wget. (Of course, most people probably don't spend as much time as me fiddling with poorly documented third party REST APIs.)

Also amazing: jq.


I use httpie for interacting with APIs and aria2c for downloading stuff. They're both perfect at their respective jobs.


This is an extremely minor quibble, but the dependency on Python makes me less inclined to use it. I'm stuck on Windows for a lot of the work I do so configuring Python is never fun and there is no one line installation method from what I can tell.

Putting it up on chocolatey[0] might be a good idea. Not sure how feasible that is however.

[0] https://chocolatey.org/packages?q=httpie


Should have led that comment with "I'm on Windows, so".

Of course it's going to be miserable to use any command-line tool on Windows. It's Windows.


I feel wget in PowerShell is useful.

  PS> $page = wget http://www.yahoo.com/
  PS> $page.Images | sort width | select src, width, height | Export-Csv -Encoding utf8 images.CSV


I'm not sure why you feel like that. I switched back to Windows after ~1 year of using OSX, and I wouldn't say it's "miserable"; there is really nothing I could do in OSX's command line that I can't do in Windows.


What is your Windows command-line environment? Plain CMD prompt or Cygwin? Or something else?


Powershell is very powerful and for Windows it beats cygwin/mingw. I'm not quite sure how it measures against Linux shells running on Linux, but Microsoft has really made a proper shell for Windows; too bad it looks so different.

Obviously if you work in a cross-platform environment, cygwin/mingw is still the only thing that will provide you some sort of consistency in your workflow on Windows machines.


I don't agree with your parent, but I used cygwin when I was on Windows and it mostly does what you want if you just need basic command line tools.

The major annoyances were that packages were limited, compiling anything was generally a disaster, and file permissions between Linux/Windows are a mess. I happily used it every day though.


I use git bash, with conemu, to be honest I don't do anything crazy but most of the linux commands I need day to day just work.


I feel like you can't have really scratched the surface of what you can do in a real Unix shell. I doubt that Windows has support for things I use all the time (like process substitution, <(some command), where the output of a command appears as a file to a program).


I doubt you've used Powershell. Using objects on the cmdline is pretty ballin.


I had the same quibble, was going to write my own but found and now use bat.

https://github.com/astaxie/bat/blob/master/README.md

I don't use Windows but I also dislike python dependencies if I can get away with just needing libc.


I've never needed to do more configuration for python than whatever the installer does. What problem are you having?


I've had enough problems in the past that I try to steer clear at this point. I don't do much with python so it's hard to justify spending much time on it. I think my bigger issue is the lack of being able to do cinst httpie


The key is to stick with Python 2.7. Unfortunately Python 3 has significantly damaged the brand. Also, installing msysgit will give you git and the msys command line, which is a nice, easy-to-install, mostly-UNIX command line. It is not a beast like cygwin.

With msysgit (for git and msys) you can practically live in the command line on Windows and with python 2.7 the py things mostly just work. pywin32 helps for some things and if pip install doesn't work there's always the unofficial windows binaries:

http://www.lfd.uci.edu/~gohlke/pythonlibs/


I use the Babun Shell [0] on Windows desktops to provide an environment for tools more typically found and used in a Unix-like environment.

I know it's not the same as installing a tool natively, but it lets me a) use some tools pretty much in the way I prefer to do so and b) check out stuff that I see on HN (and elsewhere) without a PITA install.

Installing was a one line deal if pip is installed (pip install --upgrade httpie), which in turn was a one line installation (wget https://bootstrap.pypa.io/get-pip.py -O - | python)

That's pretty close to one line.

[0] https://babun.github.io/


+1; I likely wouldn't be able to just jump on any host and copy/paste some commands that worked from my laptop with this, but I can do that with curl and assume that it's installed and of a reasonably serviceable version basically everywhere I'd be.


Previously on HN: https://news.ycombinator.com/item?id=10418882 (that's the thread where I first found out about httpie)


That's true of running just about any command-line tool on Windows. We were promised an ecosystem of Powershell cmdlets that operate on objects rather than text, but that hasn't really happened.


Any time I'm critical of Windows treatment of CLIs, I'm either met with "Why would I need a CLI, it's not the 80s" or "PowerShell is vastly superior to bash". The issue keeping the MS world from adopting adequate text only tools seems to be much more related to developer mindset than anything technical.


I think it's the mindset of not being involved in open-source mostly.



Ha, I just got to using `jq` regularly (which is already a huge improvement). Now this ..


I find httpie's lack of manpage quite disturbing though.



Though only briefly mentioned at the bottom of this article, I'd like to give a huge shoutout to aria2. I use it all the time for quick torrent downloads, as it requires no daemon and just seeds until you C-c. It also does a damn good job at downloading a list of files, with multiple segments for each.


+1 for aria2. It's like the VLC of command-line download tools - it Just Works with anything.

https://aria2.github.io/


Also can be installed with brew.


I instinctively go to `wget` when I need to, uhm, get the file into my computer[1]. `curl -O` is a lot more effort :P

Other than that, curl is always better.

[1] Aliasing `wget` to ~`curl -O` might be a good idea :)


I use wget for downloads because it follows links by default and resume is just -c. I never figured out how to make curl do the equivalent of -c.


Let's compare the length of the man pages:

    $ man curl | wc -l
    1728
    $ man wget | wc -l
    1096
How about the --help output?

    $ curl --help | wc -l
    178
    $ wget --help | wc -l
    176
The wget help is nicer, grouping options together by category and with longer text. curl just has a long list of options in alphabetical order. How many (long) options do they have?

    $ curl --help | grep -- -- | wc -l
    175
    $ wget --help | grep -- -- | wc -l
    137
I'd say it is a lot quicker to work out the flags etc you need with wget because there is less to look through.


"I'm glad I typed `man wget` instead of `wget --help`" -- no one ever

You want the `wget --help` text over the man page, 99% of the time. The other 1%, you want the full info manual. The man page is an awful mix between the two; too dense for scanning through for the flag you need, but not containing the full information when you need specifics.


> no one ever

Except [at least] me.

I like man better because it's consistent. Some tools want --help, -help, -h, -H, -\? etc.

I like man better because I can search it.

I like man better because it gives me the details, not just a list.


For all of the reasons you gave (except search--that's what grep is for), I usually reach for `man`. But, for wget the information density of the man page is just wrong. At least these days it has some more information in it--it used to just be a reformatted version of the --help text, plugged into a generic template.


grep on help output is annoying sometimes since many programs send it on stderr and you need to redirect it if you want to pipe it to grep.

Plus even if grep matched something you can't read the context without extra options.

man is much easier than doing all that; by the time you have your full search command, you would have already gotten your info from man.


|& is useful in these cases, as it redirects both stdout and stderr to the piped process's stdin. It's a cshism, but it works in both zsh and modern bash. Much nicer than typing cmd 2>&1 | cmd.

For context try grep -2, where 2 is the desired lines of context.

  $ wget --help |& grep -2 base   
    -i,  --input-file=FILE           download URLs found in local or external FILE
    -F,  --force-html                treat input file as HTML
    -B,  --base=URL                  resolves HTML input-file links (-i -F)
                                       relative to URL
         --config=FILE               specify config file to use
  --
                                       existing files (overwriting them)
    -c,  --continue                  resume getting a partially-downloaded file
         --start-pos=OFFSET          start downloading from zero-based position OFFSET
         --progress=TYPE             select progress gauge type
         --show-progress             display the progress bar in any verbosity mode
(Not that I'm arguing that this is an excuse for wget's [and GNU projects' in general] man pages sucking, but it's a useful workaround.)


I know how to do it, that's not the question.

But rather, doing all that seems easier than man to you?


I really wish the GNU foundation would give up on info pages. Just admit failure and condense them down into full info manpages that I can search easily instead of having to use their 1980s version of a web browser with its awful EMACS-like keybinds.


I recommend using this script[1] and aliasing it to "man". Saves so much time over loading man pages and searching through them. For example, with it aliased to "man", you can run "man wget continue" or "man find -exec" and get just the relevant parts of the man page. And "man git commit -a" also works, despite the separated command name.

1: https://gist.github.com/alphapapa/3cba3ff196147ad42bac


> help is nicer, grouping options together by category ... vs ... long list of options in alphabetical order.

Honestly, either is good. Grouping options is good if you don't know what you're looking for, and alphabetical is good if you do. The bad ones are like the help page for rsync - a ton of options, with no semantic ordering at all.


I don't think you can:

"[Wget's] ability to recover from a prematurely broken transfer and continue downloading has no counterpart in curl."


    curl -C - -O filename url
-C continues from an offset; "-C -" uses the length of the output file as the offset.


That's what I tried a couple years ago and it failed. Might try it again some time.


I guess it's possible that the server didn't support HTTP byte serving via Accept-Ranges:, either for that particular resource or altogether.


I don't think curl can mirror.


"Wget can be typed in using only the left hand on a qwerty keyboard!"

I love both of these, but wish that curl was just like wget in that the default behavior was to download a file, as opposed to pipe it to stdout. (Yes, aliases can help, I know.)


Streaming to stdout is more Unix-y, allowing you to pipe the response into further processes. For example:

    curl http://api.example.com/json | jq '.["someKey"]' # etc., etc.


Or in modern idiomatically insecure install scripts:

    curl http://bogus.example.com/install | sudo bash


Too bad that is a reserved domain. Would be hilarious to put an actual shell script there that would echo ARF ARF ARF in a loop.


It may be more unix-y, but it's less user-friendly if the expectation is to just download a file.

EDIT: Wow, surprised by the downvotes. I don't think I said anything controversial (y'know principle of least surprise and all), but maybe I was being a bit too opaque: wget, by virtue of being the first on the scene, built an expectation that $THING_THAT_GETS_URLS would result in a file without any other input/arguments. Curl, to this day, surprises me because I was around when wget was all you had.


Nonsense. You can use both tools. 'curl' is 'cat url', and by default is meant to behave like 'cat' - that is, send stuff to STDOUT. 'wget' is 'web get', and gets an object from (only) the web or ftp, and plonks it on your filesystem. They both do exactly what they're supposed to (according to name), by default.


> 'curl' is 'cat url',

Whoa... TIL something! I don't know if that's the official etymology, but that's a great mnemonic!

EDIT: ... and yes, I use both tools :).


Well, it looks like I was taught wrong, and it's not 'cat URL' (though that's a good way to think of it), but the rather more direct-to-STDOUT-sounding 'see URL'. TIL something too :)

https://en.wikipedia.org/wiki/CURL


My first impression is 'cat URL' not 'see URL'. The wikipedia article does not have to be official.


Remove your edit, it makes it more likely for you to get downvotes, not less.

If you post something, stand by it, do not worry about downvotes. I don't really like them, but nevertheless I'm proud when I get a downvote - it means I don't have a hive-mind mentality.


Nah, I don't really mind the downvotes per se. I was just surprised, that's all. Anyway, seems that my further clarification in the edit cleared up some confusion on the content of my post, so it's all good.


One nice thing about saving a file vs piping is that wget sets the timestamp on the file based on the remote HTTP headers.

curl -o does not.


curl -Ro does, however.


Fair enough. Perhaps I just need to adjust my expectations.

Or maybe there's a good reason for having two tools :)


I use wget when I need to download things.

curl is for everything else (love it when it comes to debugging some API)... Httpie is not bad for debugging either, but most of the time I forget to use it.


Since aria2 was only passingly mentioned, let me list some of its features:

- Supports splitting and parallelising downloads. Super handy if you're on a not-so-good internet connection.

- Supports bittorrent.

- Can act as a server and has a really nice XML/JSON RPC interface over HTTP or WebSocket (I have a Chrome plugin that integrates with this pretty nicely).

They're not super important features sure but I stick with it because it's typically the fastest tool and I hate waiting.
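
A couple of sketches of what I mean (URL and port are placeholders; check the aria2 docs for the exact option names):

  aria2c -x4 -s4 https://example.com/big.iso     # split the download across 4 connections to the host
  aria2c --enable-rpc --rpc-listen-port=6800     # run as an RPC server that other tools can drive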


Curl gets another point for having better SNI support, as wget versions until relatively recently didn't support it.

This means you can't securely download content using relatively recent (but not the newest) versions of wget (such as any in the Ubuntu 12.04 repos) from a server which uses SNI, unless the domain you're requesting happens to be the default for the server.

As an example, I found the file https://redbot.org/static/style.css only accessible with SNI. Try `wget https://redbot.org/static/style.css` vs. `curl -O https://redbot.org/static/style.css` on Ubuntu 12.04. Domain names which point to S3 buckets (and likely other CDNs) will have similar issues.


For me defaults matter... 99% of the time when I want to use wget or curl, I want to do it to download a file, so I can keep working on it, from the filesystem.

wget does that without any parameters. Curl requires me to remember and provide parameters for this obvious use case.

So wget wins every time.


If nobody's tried it, axel, mentioned in the report as possibly abandoned, has the awesome feature of splitting a download into parts and then establishing that many concurrent TCP connections. Very useful on networks that rate-limit individual TCP flows.
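
Roughly (the URL is a placeholder; check axel --help for the current flags):

  axel -n 10 https://example.com/big.iso   # split the file across 10 concurrent connections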


Axel is great, but nowadays I reach for aria2 instead, which can do the same, but also supports bittorrent and (s)ftp.


Haven't found anything better than axel to saturate the link yet.


Try saldl[1]. It depends on libcurl. So protocol support should be good and reliable.

[1] https://github.com/saldl/saldl



FWIW, this page is incomplete and outdated.

For example, `--mirror-url` was implemented. So, it is now possible to download from two sources concurrently.


We are forgetting our long lost cousin, fetch. http://www.unix.com/man-page/FreeBSD/1/FETCH/


wget has the amazing flag `--page-requisites` though, which downloads all of an HTML document's CSS and images that you might need to display it properly. Lifesaver.


wget has another great flag, -k, which changes references to the css, js, and images to absolute URLs, resulting in a 1 page download that still looks like the original page. It's useful for making dummy pages for clients. I wish curl had this for my OSX friends who need the functionality above. Getting a wget binary onto OSX is a pain but curl is there by default.
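
The combination I reach for when snapshotting a single page looks roughly like this (flags as I understand them from the man page; the URL is a placeholder):

  wget -E -H -k -K -p https://example.com/article.html
  # -p page requisites, -k convert links, -H span hosts for off-site assets,
  # -E add .html extensions, -K keep .orig copies of converted files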


Whoa, nice, thanks for the tip! That’s amazingly useful and I seriously wish I had known about it earlier!

For OS X users, you can get wget pretty easily with Homebrew¹. Just install it, then enter the following:

  brew install wget --with-gpgme --with-iri --with-pcre

…well, those extra options aren’t strictly needed. Just what I used since I wanted wget compiled with support for those things (GnuPG Made Easy², Internationalized Resource Identifiers³, and Perl Compatible Regular Expressions⁴).

You can see all the compile-time options before installing wget by typing in:

  brew info wget

――――――

¹ — http://brew.sh/

² — https://www.gnupg.org/related_software/gpgme/

³ — https://en.wikipedia.org/wiki/Internationalized_resource_ide...

⁴ — https://en.wikipedia.org/wiki/Perl_Compatible_Regular_Expres...


Woah! Had I RTFM'd this would've saved SO. MUCH. AGGRAVATION. in recent months! Cheers!


wget is on homebrew.


After vi vs. emacs, this is truly the great debate of our generation.


Come on people, who here already emails with curl? Admit it.


This would be better directed at those who use Outlook. The ones using curl to send mails will be boasting about it ;)


Really interesting. Under curl he has:

"Much more developer activity. While this can be debated, I consider three metrics here: mailing list activity, source code commit frequency and release frequency. Anyone following these two projects can see that the curl project has a lot higher pace in all these areas, and it has been so for 10+ years. Compare on openhub"

Under wget he has: "GNU. Wget is part of the GNU project and all copyrights are assigned to FSF. The curl project is entirely stand-alone and independent with no organization parenting at all with almost all copyrights owned by Daniel."

Daniel seems pretty wrong here. Curl does not require copyright assignment to him to contribute, and so, really, 389 people own the copyright to curl if the openhub data he points to is correct :)

Even if you give it the benefit of the doubt, it's super unlikely that he owns "almost all", unless there really is not a lot of outside development activity (so this is pretty incongruous with the above statement).

(I'm just about to email him with some comments about this, i just found it interesting)


"almost all copyrights owned by Daniel." and "389 people own the copyright to curl" aren't mutually exclusive. I think Daniel was saying that most of the code is copyright to him, and you are saying that the rest is copyright to 388 other people.


Unmentioned in the article: curl supports --resolve. This single feature helps us test all sorts of scenarios for HTTPS and hostname-based multiplexing where DNS isn't updated or consistent yet, e.g. transferring a site or bringing up cold standbys. Couldn't live without it (well, I could if I wanted to edit /etc/hosts continuously).
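
For the curious, the shape of a --resolve invocation is roughly (host and address are placeholders):

  curl --resolve www.example.com:443:203.0.113.10 https://www.example.com/healthcheck
  # TLS still validates against www.example.com, but the connection goes to the pinned IP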


wget was the first one I learned how to use by trying to recursively download a professor's course website for offline use, and then learning that they hosted the solutions to the assignments there as well..

I did well in that course, granted it was an easy intro to programming one. ;)


> Wget requires no extra options to simply download a remote URL to a local file, while curl requires -o or -O.

I think this is oddly the major reason why wget is more popular. Saving 3 chars + not having to remember the specific curl flag seems to matter more than we might think.


I'm always amused by people who do the opposite: using wget to send GET/POST requests to web servers and having to add `-O /dev/null' (or, even worse, `-O - > /dev/null') to keep from saving the results.


Curl scripts allow an open connection to view all new logs in a session.

Can wget do something similar? I don't know whether it can, but from my point of view, if it cannot, this is like comparing a Phillips-head screwdriver to a power tool with a 500-piece set.


What do you even mean by logs in a session?


For example, in troubleshooting with a bluecoat proxy, I can run a curl session in conjunction with grep to check for very specific types of traffic and leave that script open while I might have an end user test.


Sorry, can't imagine what you mean. Do you just start curl with a list of urls to process and grep for errors? Any specific examples?


It acts like a live packet capture of a log file.

Not sure how else to describe it, other than that I do not think wget is capable of such functionality.


sooo... "wget -O - .... | grep ..." ?


I haven't worked with it in that manner; however, like someone above said, I typically use wget to download and curl to troubleshoot HTTP/HTTPS.

However, if wget is capable of keeping everything open, I suppose that's a point for wget.


aria2 is much more reliable when downloading stuff, especially for links which involve redirections.

For example here's a link to download 7zip for windows from filehippo.com.

Results:

* Curl doesn't download it at all.

  curl -O 'http://filehippo.com/download/file/bf0c7e39c244b0910cfcfaef2af45de88d8cae8cc0f55350074bf1664fbb698d/'
gives:

  curl: Remote file name has no length!
* Wget manages to download the file, but with the wrong name.

  wget 'http://filehippo.com/download/file/bf0c7e39c244b0910cfcfaef2af45de88d8cae8cc0f55350074bf1664fbb698d/'
gives:

  2016-03-03 18:08:21 (75.9 KB/s) - ‘index.html’ saved [1371668/1371668]
* aria2 manages to download the file with the correct name with no additional switches.

  aria2c 'http://filehippo.com/download/file/bf0c7e39c244b0910cfcfaef2af45de88d8cae8cc0f55350074bf1664fbb698d/'
gives:

  03/03 18:08:45 [NOTICE] Download complete: /tmp/7z1514-x64.exe


The URL does not work right now. But I tried another one from the same site.

No client can always get this right. aria2c is not more reliable; it's just choosing to take the filename from the redirect URL. It appears to be the right thing to do in this case, but it would fail if the start URL was actually the one that had the right filename.

Hosts can use the Content-Disposition header if they want to make sure all (capable) clients get the right filename.

In saldl, I implemented `--filename-from-redirect` to handle your use-case. But it's not set by default.


Thanks for the explanation. But generally I have found aria2 to be more reliable in such scenarios.


Useful to know: If you use the Chrome dev tools, in the network tab, you can right click on a request and "Copy as cURL".


My usage pattern has been:

  - wget to download files (or entire sites even)

  - curl to debug everything http/https


For a certain case like creating a Telegram bot, which has no interaction with a browser, do you think we can make use of curl (POST requests) to make PHP sessions work?

As there's no browser interaction in a Telegram bot, the script just receives responses back from the Telegram server. This might help to keep track of user state without needing a db?
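
Not sure if this is exactly what you're after, but if the goal is just to carry a PHP session across server-side curl calls, a cookie jar does that; a sketch (the endpoint and fields are placeholders):

  curl -c session.txt -b session.txt -d 'chat_id=123&text=hi' https://example.com/bot-endpoint.php
  # -c writes received cookies (e.g. PHPSESSID) to the jar, -b sends them back on later calls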


I use curl because it is generally installed. I prefer not to install wget, especially on customer machines because it stops 90% of script kiddies. For some reason wget is the only tool they will attempt to use to download their sploit.


Pretty sure skiddies will not assume most victims have wget already, they'll just ship it with the exploit. If not installing wget is an annoyance to a hacker, they're already in too deep ;)


Your stock standard drive by PHP exploit attempts usually attempt to "wget" another PHP file to public_html.

They try wget, fail, and move on.


I should probably write a "saldl vs. others" page someday.

> Wget supports the Public Suffix List for handling cookie domains, curl does not.

This is outdated info. (lib)curl can be built with libpsl support since 7.46.0.


Released on 2015-12-02, so it won't be in many distros for some years. :-)


Stability and security comes first. So, let's ship an X years old curl release + patches ;)


Nowadays I just use httpie. It's in Python, so it's easy to install on Windows, and it lets me work easily with requests and responses, inspect the content, add coloration, etc. Plus the syntax is much easier.


I like Wget's option to continue a file download if it gets interrupted. I believe you can achieve the same thing in curl but it's not as simple as just setting a flag (-c).


>> Wget can be typed in using only the left hand on a qwerty keyboard!

Great!


Wget is under GPLv3 so that's what I use more often. Sometimes I will use curl in certain cases, but yes, I will use a GPL product over a non-GPL product if given a choice.


The "only need a left hand" sways me for wget.


There is no other industry where tools are debated as much as in IT. We literally waste tons of hours arguing over minor differences and nuances that really should not matter that much.


You clearly aren't familiar with US gun culture, e.g. plastic Glock v. metal 1911, striker fired vs. Single Action, Gaston Glock vs. John Moses Browning! Light, fast 9 mm Europellet vs. heavy, slow .45 (11.5 mm) Auto Colt Pistol ... and those are just two of the most prominent right now. Let me assure you, this isn't unique to IT!


There's a balancing act between trying to cut a tree with a blunt ax that one never resharpens and spending all week in an ax store looking at different axes. (Speaking of which, I'm doing something like that by looking at HN, so I'm not trying to criticize anyone else; I'd be a hypocrite if I did.)


"Give me six hours to chop down a tree and I will spend the first four sharpening the axe." - commonly attributed to Abraham Lincoln


I have seen guitarists spend serious time debating picks.


It is a waste of time; clearly wget and curl are just petty toys. Emacs is obviously the one true tool to meet any purpose.


curl for checking HTTP headers simply with: curl -vskL http://1.2.3.4 -H "Host: example.com" > /dev/null


Why don't you just use the -I flag?


For some sites a HEAD may return different headers than a GET, so it is safer to return the results in full. Also, using -vsk shows the request headers, including the IP, so you can easily see if things such as round-robin DNS are in use, again to assist with debugging.


Who funds projects like this?


TLDR: curl rocks.


Unless you want to download a web site for archiving purposes.


Wget just werks.


You should all check out wpull!



I love wpull!

I'm actually considering using it for a large upcoming project but unfortunately there are some pretty significant bugs in their backlog. wget seems to be a bit more battle hardened.


Why?


Everything 6 years old is new again: https://news.ycombinator.com/item?id=1241479


> Updated: February 26, 2016 17:20 (Central European, Stockholm Sweden)



