* Wget's the interactive, end-user tool, and my go-to if I just need to download a file. For that purpose, its defaults are more sane, its command line usage is more straightforward, its documentation is better-organized, and it can continue incomplete downloads out of the box, which curl can't do as easily.
* Curl's the developer tool-- it's what I'd use if I were building a shell script that needed to download. The command line tool is more unix-y by default (outputs to stdout) and it's more flexible in terms of options. It's also present by default on more systems-- of note, OSX ships curl but not wget out of the box. Its backing library (libcurl) is also pretty nifty, but not really relevant to this comparison.
This doesn't really need to be an "emacs vs. vim" or "tabs vs. spaces"-type dichotomy: wget and curl do different things well and there's no reason why both shouldn't coexist in one's workflow.
> This doesn't really need to be an "emacs vs. vim" or "tabs vs. spaces"-type dichotomy: wget and curl do different things well and there's no reason why both shouldn't coexist in one's workflow.
Totally agree. I love curl for testing API request/responses manually. It's usually a huge part of navigating my way around a new API that doesn't have a client library available for whatever language I'm using at that time.
I also use it for weird requests that need special headers or authentication.
Wget is the first thing I turn to when I'm downloading anything remote from the command line or scraping some remote content for analysis.
Yes, wget is fantastic for mirroring www and ftp sites and I use it a lot for that purpose. It's magic [0]. I hadn't realized that it didn't support compression though, which might explain why it's so slow in some cases. Not normally a problem as it just runs in the background on a schedule.
Curl supports gzip and deflate. It would be great if support for sdch and br were added too. Brotli is in Firefox 44 and can be enabled in Chrome Canary with a flag. SDCH has been in Chrome for a while and is used on Google servers.
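For the curious, curl's --compressed flag is the simple way in: it sends Accept-Encoding for whatever algorithms curl was built with and transparently decompresses the response (the URL below is just a placeholder):
$ curl --compressed -o page.html https://example.com/
$ curl -H 'Accept-Encoding: gzip' https://example.com/ | gunzip > page.html  # manual equivalent, assuming the server actually gzips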
The latest Win64 version of curl doesn't seem to support gzip or deflate. I have to remove those options when copying from Chrome developer tools and pasting into a script. I'd report a bug, but their page doesn't seem to have an obvious link for that.
Pretty much. If I want to save a file: wget. If I want to do _anything_ else: curl. Yes you can write files with curl, no I don't use that functionality very often. I don't think of them as "end user" vs "developer" use cases so much as them being two great tools for different tasks. I do wish that -s was the curl default, since that stderr progress output is pretty lame.
Stupid question, but how do things like this resume from where they left off? Wouldn't the server need to be cooperating in this? Is that built into HTTP?
This is also how download accelerators worked (back in the late nineties and early aughts): by having different connections work on several ranges to maximize bandwidth usage.
Why? How could it be more useful? HTTP byte ranges are incredibly flexible, since one request can specify many byte ranges at once (it's almost too flexible, since a dumb server can easily be overwhelmed by a maliciously complicated request).
It handles the basic case of fetching the remainder of an interrupted download, and can also support partial downloads, e.g. a video stream where the user jumps to different places in the movie.
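To make the mechanics concrete: the client sends a Range header, and a cooperating server answers 206 Partial Content; if the server ignores Range, it replies 200 with the whole file, which is why resume only works when the server plays along. Both tools expose this (placeholder URL):
$ curl -r 0-1023 -o head.bin https://example.com/big.iso  # request just the first KiB
$ wget -c https://example.com/big.iso                     # re-request only the missing tail of a partial file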
The article is not coming up for me. Perhaps it's ycombinated. Anyway I agree curl is more of a developer tool although using it to download files is not the first thing that comes to mind. I use it daily to identify caching issues and redirect issues. The -SLIXGET flags in particular are very useful for this.
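Spelled out, -SLIXGET is -S (show errors), -L (follow redirects), -I (print headers only) and -X GET (send a GET rather than the HEAD that -I implies), which makes it handy for inspecting cache and redirect headers without downloading bodies (placeholder URL):
$ curl -SLIXGET https://example.com/some/page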
Huh. I've always thought of them as the opposite - wget is the full featured spidering tool, curl is the easy to run one when I need a command line thing or to bang web stuff into a janky copy and paste workflow.
In that case I'd love for curl to include resuming downloads, but given the differences I don't think merging is going to be more useful than just adding the feature to curl.
Edit: nevermind, I just learned that wget can download a page's resources or even inline them (-k apparently), that's a bit of a way off from curl's purpose. Better keep them separate tools, although wget might benefit from using libcurl so they don't have to implement lots of stuff like https or http2.
I totally agree except over the years I have been using aria2 more and more instead of wget. aria2 supports HTTP/HTTPS, FTP, SFTP, BitTorrent and Metalink with the same sane wget syntax and defaults.
curl is the one where I can never remember whether to use -o or -O to download a file with its original filename, so I just use wget instead, because that's faster than reading the curl man page.
I have the same problem. Curl is "unix-y" in the sense that the default options are optimized for a shell script and make no sense for interactive usage.
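For reference, this is the contrast the comments above are describing (placeholder URL):
$ wget https://example.com/file.tar.gz                  # saves file.tar.gz, follows redirects
$ curl -O https://example.com/file.tar.gz               # -O: save under the remote name
$ curl -o local.tar.gz https://example.com/file.tar.gz  # -o: save under a name you choose
$ curl -LO https://example.com/file.tar.gz              # add -L, or redirects won't be followed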
bropages is where it's at for that kind of stuff. curl is actually their usage example :)
Their example will just dump the webpage to stdout! That's almost certainly not the desired behavior given the comment. Then they include a second example for a use case that almost nobody has, instead of giving the option that everyone is actually looking for.
My favorite use of wget: mirroring web documentation to my local machine.
wget -r -l5 -k -np -p https://docs.python.org/2/
It rewrites the links to point to local copies where appropriate, and the ones which are not local remain links to the online documentation. Makes for a nice, seamless experience while browsing documentation.
I also prefer wget to `curl -O` for general file downloads, simply because wget handles redirects by default and `curl -O` does not. Yes, I could remember yet another argument to curl... but why?
That said, I love curl (combined with `jq`) for playing with rest interfaces.
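A typical curl-plus-jq one-liner looks like this (the endpoint and field names are made up for illustration):
$ curl -s https://api.example.com/v1/users | jq '.[].name'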
Underscore CLI looks interesting as well, though I haven't personally had much of a chance to play with it. It does require NodeJS, which might be a deal-breaker for some, but if it's already in your toolchain then it might come in handy.
I've written a tool similar to underscore-cli and jq, called jowl. It's designed to be easier to learn (for JavaScript developers) than underscore-cli or jq. The README includes a comparison.
It's still early in its development, and would benefit from a tutorial and a few more features, but it's getting there.
This is a great use, and I've used it for a lot of other sites and documentation. However I just want to point out that Python's documentation happens to come in an easily downloadable archive of HTML files:
It's somewhat hard to compare CLI vs GUI programs, but in my experience, wget --mirror is simpler for simple tasks than HTTrack, which takes a little bit of mucking around before it starts downloading. HTTrack has a smaller learning curve for more complex situations, though.
For docs specifically, you could also try Zeal, which allows you to browse documentation for quite a few programming languages offline, including Mozilla's JS docs and an HTML spec, as you mentioned in some other comment in this thread.
The general workflow I have followed was, "Download the PDF if present. If the documentation can be shown in a single HTML page, download it and change all the anchors to point to my own filesystem." This changes everything in such a simple, useful way.
If you look at some of the default docsets you can tell that this was most likely done. All of the directories under the file root in the docset are named after domain names (one of the things wget does when crawling a site and traversing multiple domains).
Is this the part where I tell you about pydoc and blow your mind?
I had a patch at one point to make pydoc style itself just like docs.p.o, but I'm not working much in Python these days. Maybe it's not even necessary any more, which would be cool.
Sure. Does it work with golang docs, the w3 HTML5 spec, Mozilla's JS reference, etc?
;)
Python's built in documentation tools are good, but the command I referenced also gives you the full language spec locally, as well as the FAQs and packaging docs.
Also putting this out there—for nicer REST API interaction on the CLI, and a little more user-friendliness, you might also want to add HTTPie[1] to your toolbelt.
It's not going to replace curl or wget usage, but it is a nicer interface in certain circumstances.
HTTPie is amazing for working with JSON/REST interfaces. It's really succinct and generally designed with user-friendliness in mind, versus supporting every nook and cranny of the HTTP RFCs. It's installable via homebrew for Mac users.
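A taste of HTTPie's syntax, with a hypothetical endpoint: = sends a JSON string field and := sends raw JSON, so a request body is assembled right on the command line:
$ http PUT https://api.example.com/todos/1 title=groceries done:=true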
Whenever I go back to curl I feel the same minor aggravation I do when moving from homebrew back to apt (want to install? apt-get. Search? apt-cache search. List? dpkg. Remove? apt-get again, for some reason). It's just not a very user-friendly CLI and I'm constantly pulling up the curl man pages.
On Debian these days, they provide an executable named apt that does all the common stuff in one place (install, search, list (mimics dpkg), remove, show, etc.).
> want to install? apt-get. Search? apt-cache search. List? dpkg. Remove? apt-get again, for some reason
I think this is largely why aptitude exists; a single unified command for all common package tasks. Most of the Debian docs have been switched over to recommend its usage, but for some reason Ubuntu hasn't followed suit. That said, I just keep using the same old shell aliases I added in 2003 personally.
Arch Linux has pacman, which is a single program that does all of the things you need from a package manager. OpenSUSE systems have zypper, which also has one command for everything (and has a much more reliable format -- rpm).
How is RPM a more reliable format than dpkg? I've worked with both systems as part of writing a program that generates pacman/dpkg/RPM packages, and I've found dpkg to be the most sane and well-designed of the three, while the RPM format is a horrible and under-documented atrocity.
I've been loving HTTPie, and for my somewhat idiosyncratic usage it's completely replaced curl and wget. (Of course, most people probably don't spend as much time as me fiddling with poorly documented third party REST APIs.)
This is an extremely minor quibble, but the dependency on Python makes me less inclined to use it. I'm stuck on Windows for a lot of the work I do so configuring Python is never fun and there is no one line installation method from what I can tell.
Putting it up on chocolatey[0] might be a good idea. Not sure how feasible that is however.
I'm not sure why you feel that way. I switched back to Windows after ~1 year of using OSX, and I wouldn't say it's "miserable"; there is really nothing I could do in OSX's command line that I can't do in Windows.
Powershell is very powerful, and for Windows it beats cygwin/mingw. I'm not quite sure how it measures against Linux shells running on Linux, but Microsoft has really made a proper shell for Windows; too bad it looks so different.
Obviously if you work in a cross-platform environment, cygwin/mingw is still the only thing that will provide you some sort of consistency in your workflow on Windows machines.
I don't fully agree with your parent, but I used cygwin when I was on Windows and it mostly does what you want if you just need basic command line tools.
The major annoyances were that packages were limited, compiling anything was generally a disaster, and file permissions between Linux/Windows are a mess. I happily used it every day, though.
I feel like you can't have really scratched the surface of what you can do in a real Unix shell. I doubt that Windows has support for things I use all the time, like process substitution (<(some command), where the output of a command appears as a file to a program).
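The classic demonstration of process substitution, for anyone unfamiliar: both command outputs appear to diff as if they were files:
$ diff <(sort a.txt) <(sort b.txt)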
I've had enough problems in the past that I try to steer clear at this point. I don't do much with python so it's hard to justify spending much time on it. I think my bigger issue is the lack of being able to do cinst httpie
The key is to stick with Python 2.7. Unfortunately, Python 3 has significantly damaged the brand. Also, installing msysgit will give you git and the msys command line, which is a nice, easy-to-install, mostly-UNIX command line. It is not a beast like cygwin.
With msysgit (for git and msys) you can practically live in the command line on Windows, and with Python 2.7 the Python things mostly just work. pywin32 helps for some things, and if pip install doesn't work there's always the unofficial Windows binaries:
I use the Babun Shell [0] on Windows desktops to provide an environment for tools more typically found and used in a Unix-like environment.
I know it's not the same as installing a tool natively, but it lets me a) use some tools pretty much in the way I prefer to do so and b) check out stuff that I see on HN (and elsewhere) without a PITA install.
Installing was a one line deal if pip is installed (pip install --upgrade httpie), which in turn was a one line installation (wget https://bootstrap.pypa.io/get-pip.py -O - | python)
+1; I likely wouldn't be able to just jump on any host and copy/paste some commands that worked from my laptop with this, but I can do that with curl and assume that it's installed and of a reasonably serviceable version basically everywhere I'd be.
That's true of running just about any command-line tool on Windows. We were promised an ecosystem of PowerShell cmdlets that operate on objects rather than text, but that hasn't really happened.
Any time I'm critical of Windows treatment of CLIs, I'm either met with "Why would I need a CLI, it's not the 80s" or "PowerShell is vastly superior to bash". The issue keeping the MS world from adopting adequate text only tools seems to be much more related to developer mindset than anything technical.
Though it's only briefly mentioned at the bottom of the article, I'd like to give a huge shoutout to aria2. I use it all the time for quick torrent downloads, as it requires no daemon and just seeds until you C-c. It also does a damn good job of downloading a list of files, with multiple segments for each.
The wget help is nicer, grouping options together by category and with longer text. curl just has a long list of options in alphabetical order. How many (long) options do they have?
"I'm glad I typed `man wget` instead of `wget --help`" -- no one ever
You want the `wget --help` text over the man page, 99% of the time. The other 1%, you want the full info manual. The man page is an awful mix between the two; too dense for scanning through for the flag you need, but not containing the full information when you need specifics.
For all of the reasons you gave (except search--that's what grep is for), I usually reach for `man`. But, for wget the information density of the man page is just wrong. At least these days it has some more information in it--it used to just be a reformatted version of the --help text, plugged into a generic template.
|& is useful in these cases, as it redirects both stdout and stderr to the piped process's stdin. It's a cshism, but it works in both zsh and modern bash. Much nicer than typing cmd 2>&1 | cmd.
For context try grep -2, where 2 is the desired lines of context.
$ wget --help |& grep -2 base
-i, --input-file=FILE download URLs found in local or external FILE
-F, --force-html treat input file as HTML
-B, --base=URL resolves HTML input-file links (-i -F)
relative to URL
--config=FILE specify config file to use
--
existing files (overwriting them)
-c, --continue resume getting a partially-downloaded file
--start-pos=OFFSET start downloading from zero-based position OFFSET
--progress=TYPE select progress gauge type
--show-progress display the progress bar in any verbosity mode
(Not that I'm arguing that this is an excuse for wget's [and GNU projects' in general] man pages sucking, but it's a useful workaround.)
I really wish the GNU foundation would give up on info pages. Just admit failure and condense them down into full info manpages that I can search easily instead of having to use their 1980s version of a web browser with its awful EMACS-like keybinds.
I recommend using this script[1] and aliasing it to "man". Saves so much time over loading man pages and searching through them. For example, with it aliased to "man", you can run "man wget continue" or "man find -exec" and get just the relevant parts of the man page. And "man git commit -a" also works, despite the separated command name.
> help is nicer, grouping options together by category ... vs ... long list of options in alphabetical order.
Honestly, either is good. Grouping options is good if you don't know what you're looking for, and alphabetical is good if you do. The bad ones are like the help page for rsync - a ton of options, with no semantic ordering at all.
"Wget can be typed in using only the left hand on a qwerty keyboard!"
I love both of these, but wish that curl was just like wget in that the default behavior was to download a file, as opposed to pipe it to stdout. (Yes, aliases can help, I know.)
It may be more unix-y, but it's less user-friendly if the expectation is to just download a file.
EDIT: Wow, surprised by the downvotes. I don't think I said anything controversial (y'know principle of least surprise and all), but maybe I was being a bit too opaque: wget, by virtue of being the first on the scene, built an expectation that $THING_THAT_GETS_URLS would result in a file without any other input/arguments. Curl, to this day, surprises me because I was around when wget was all you had.
Nonsense. You can use both tools. 'curl' is 'cat url', and by default is meant to behave like 'cat' - that is, send stuff to STDOUT. 'wget' is 'web get', and gets an object from (only) the web or ftp, and plonks it on your filesystem. They both do exactly what they're supposed to (according to name), by default.
Well, it looks like I was taught wrong, and it's not 'cat URL' (though that's a good way to think of it), but the rather more direct-to-STDOUT-sounding 'see URL'. TIL something too :)
Remove your edit, it makes it more likely for you to get downvotes, not less.
If you post something, stand by it, do not worry about downvotes. I don't really like them, but nevertheless I'm proud when I get a downvote - it means I don't have a hive-mind mentality.
Nah, I don't really mind the downvotes per se. I was just surprised, that's all. Anyway, seems that my further clarification in the edit cleared up some confusion on the content of my post, so it's all good.
curl is for everything else (love it when it comes to debugging some API)... HTTPie is not bad for debugging either, but most of the time I forget to use it.
Since aria2 was only passingly mentioned, let me list some of its features:
- Supports splitting and parallelising downloads. Super handy if you're on a not-so-good internet connection.
- Supports bittorrent.
- Can act as a server and has a really nice XML/JSON RPC interface over HTTP or WebSocket (I have a Chrome plugin that integrates with this pretty nicely).
They're not super-important features, sure, but I stick with it because it's typically the fastest tool and I hate waiting. (Quick example below.)
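A minimal sketch of the split/parallel feature, using aria2c's -x (max connections per server) and -s (split) options (placeholder URL):
$ aria2c -x 8 -s 8 https://example.com/big.iso
$ aria2c some.torrent  # the same binary handles BitTorrent, no daemon needed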
Curl gets another point for having better SNI support, as wget versions until relatively recently didn't support it.
This means you can't securely download content using relatively recent (but not the newest) versions of wget (such as any in the Ubuntu 12.04 repos) from a server which uses SNI, unless the domain you're requesting happens to be the default for the server.
For me defaults matter... 99% of the time when I want to use wget or curl, I want to do it to download a file, so I can keep working on it, from the filesystem.
wget does that without any parameters. curl requires me to remember and provide parameters for this obvious use case.
If nobody's tried it: axel, mentioned in the article as possibly abandoned, has the awesome feature of splitting a download into parts and then establishing that many concurrent TCP connections. Very useful on networks that rate-limit individual TCP flows.
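For reference, axel's -n flag sets the number of concurrent connections (placeholder URL):
$ axel -n 8 https://example.com/big.iso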
wget has the amazing flag `--page-requisites`, though, which downloads all of an HTML document's CSS and images that you might need to display it properly. Lifesaver.
wget has another great flag, -k, which changes references to the CSS, JS, and images to absolute URLs, resulting in a one-page download that still looks like the original page. It's useful for making dummy pages for clients. I wish curl had this for my OSX friends who need the functionality above. Getting a wget binary onto OSX is a pain, but curl is there by default.
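The combination described in the two comments above looks like this (placeholder URL):
$ wget -p -k https://example.com/article.html  # -p grabs CSS/JS/images, -k rewrites the references so the saved page still renders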
…well, those extra options aren’t strictly needed. Just what I used since I wanted wget compiled with support for those things (GnuPG Made Easy², Internationalized Resource Identifiers³, and Perl Compatible Regular Expressions⁴).
You can see all the compile-time options before installing wget by typing in:
"Much more developer activity. While this can be debated, I consider three metrics here: mailing list activity, source code commit frequency and release frequency. Anyone following these two projects can see that the curl project has a lot higher pace in all these areas, and it has been so for 10+ years. Compare on openhub"
Under wget he has:
"GNU. Wget is part of the GNU project and all copyrights are assigned to FSF. The curl project is entirely stand-alone and independent with no organization parenting at all with almost all copyrights owned by Daniel."
Daniel seems pretty wrong here. Curl does not require copyright assignment to him to contribute, and so, really, 389 people own the copyright to curl if the openhub data he points to is correct :)
Even if you give it the benefit of the doubt, it's super unlikely that he owns "almost all", unless there really is not a lot of outside development activity (so this is pretty incongruous with the above statement).
(I'm just about to email him with some comments about this; I just found it interesting.)
"almost all copyrights owned by Daniel." and "389 people own the copyright to curl" aren't mutually exclusive. I think Daniel was saying that most of the code is copyright to him, and you are saying that the rest is copyright to 388 other people.
Unmentioned in the article: curl supports --resolve. This single feature helps us test all sorts of scenarios for HTTPS and hostname-based multiplexing where DNS isn't updated or consistent yet, e.g. transferring a site or bringing up cold standbys. Couldn't live without it (well, I could, if I wanted to edit /etc/hosts continuously).
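For anyone who hasn't used it: --resolve pins a host:port to an address for that one invocation, so you hit the new server with the real Host header and normal certificate checks, no /etc/hosts editing (the IP below is a documentation address):
$ curl --resolve example.com:443:203.0.113.10 https://example.com/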
wget was the first one I learned how to use, by trying to recursively download a professor's course website for offline use, and then learning that they hosted the solutions to the assignments there as well.
I did well in that course, granted it was an easy intro to programming one. ;)
> Wget requires no extra options to simply download a remote URL to a local file, while curl requires -o or -O.
I think this is oddly the major reason why wget is more popular. Saving 3 chars plus not having to remember the specific curl flag seems to matter more than we might think.
I'm always amused by people who do the opposite: using wget to send GET/POST requests to web servers and having to add `-O /dev/null' (or, even worse, `-O - > /dev/null') to keep from saving the results.
Curl scripts let me keep a connection open and watch all new logs arrive in a session.
Can wget do something similar? I don't know whether it can, but from my point of view, if it can't, this is like comparing a Phillips-head screwdriver to a power tool with a 500-piece set.
For example, when troubleshooting with a Bluecoat proxy, I can run a curl session in conjunction with grep to check for very specific types of traffic and leave that script open while I have an end user test.
The URL does not work right now. But I tried another one from the same site.
No client can get this right, always. aria2c is not more reliable. It's just choosing to take the filename from the redirect URL. It appears to be the right thing to do in this case. But it would fail if the start URL was actually the one that had the right filename.
Hosts can use the Content-Disposition header if they want to make sure all (capable) clients get the right filename.
In saldl, I implemented `--filename-from-redirect` to handle your use case. But it's not set by default.
For a certain case like creating a Telegram bot, which has no interaction with a browser, do you think we can make use of curl (POST requests) to make PHP sessions work?
As there's no browser interaction with a Telegram bot, the script just receives responses back from the Telegram server. This might help to keep track of user state without the need for a DB?
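It should, since PHP sessions ride on a cookie (PHPSESSID by default): point the bot's curl calls at a shared cookie jar. The endpoint below is hypothetical; -c writes cookies out, -b sends them back:
$ curl -c /tmp/bot.cookies -b /tmp/bot.cookies -d 'chat_id=123' https://example.com/bot-endpoint.php
A server-side session file is still storage of a sort, but it does spare you from managing a DB yourself.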
I use curl because it is generally installed. I prefer not to install wget, especially on customer machines because it stops 90% of script kiddies. For some reason wget is the only tool they will attempt to use to download their sploit.
Pretty sure skiddies don't assume most victims have wget already; they'll just ship it with the exploit. If the absence of wget is an annoyance to a hacker, they're already in too deep ;)
Nowadays I just use httpie. It's in Python, so it's easy to install on Windows, and it lets me work easily with requests and responses, inspect the content, add coloration, etc. Plus the syntax is much easier.
I like wget's option to continue a file download if it gets interrupted. I believe you can achieve the same thing in curl, but it's not as simple as just setting a flag (-c).
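For the record, curl's version is -C, with `-' telling it to work the offset out from the existing file (placeholder URL):
$ curl -C - -O https://example.com/big.iso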
Wget is under GPLv3, so that's what I use more often. Sometimes I will use curl in certain cases, but yes, I will use a GPL product over a non-GPL product if given the choice.
There is no other industry where tools are debated as much as in IT. We literally waste tons of hours arguing over minor differences and nuances that really shouldn't matter that much.
You clearly aren't familiar with US gun culture, e.g. plastic Glock v. metal 1911, striker fired vs. Single Action, Gaston Glock vs. John Moses Browning! Light, fast 9 mm Europellet vs. heavy, slow .45 (11.5 mm) Auto Colt Pistol ... and those are just two of the most prominent right now. Let me assure you, this isn't unique to IT!
There's a balancing act between trying to cut a tree with a blunt axe that one never resharpens and spending all week in an axe store looking at different axes. (Speaking of which, I'm doing something like that right now by browsing HN, so I'm not trying to knock anyone else; I'd be a hypocrite if I did.)
For some sites a HEAD may return different headers than a GET, so it is safer to return the results in full. Also, using -vsk shows the request headers, including the IP, so you can easily see if things such as round-robin DNS are in use, again to assist with debugging.
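Presumably that's the -v/-s/-k flags bundled, i.e. something like this (placeholder URL); -v prints the connected IP plus request and response headers, -s hides the progress meter, and -k skips certificate verification:
$ curl -vsk -o /dev/null https://example.com/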
I'm actually considering using it for a large upcoming project but unfortunately there are some pretty significant bugs in their backlog. wget seems to be a bit more battle hardened.