Hacker News

require 'net/http'

Net::HTTP.get_print 'www.gun.io', '/'

That's it! In only two lines of Ruby, you can grab a whole web page and print it to the screen. Awesome! And it's in the standard lib ;)

I agree. When I left PHP-land I chose Ruby over Python because the (web) community was larger; there were just more books, blogs, screencasts, etc. for Ruby/Rails.

There seem to be more ideas generated out of the Ruby community. I'd add Sinatra and HAML to the list above; like Rails, they've both been ported to other languages.

Of course, Heroku was Ruby-first, and it's so good that my PHP/Java/Python friends jumped for joy when their language became supported.




arguably, in PHP

    echo file_get_contents('http://www.gun.io');
you lose even more of the line noise of the other languages.

For a language that claims to be "batteries included" it amazes me how many Python tutorials begin with "go get and install this or that package..." How much of the Python standard library is no longer the "Pythonic" way of doing standard things? Why aren't APIs with more refined design, like Requests, pushed back into the standard library, and made the real one way to do it?

Well, OK, you'd have the problem of redundant libraries starting to stack up--it's already looking tiresome. For instance, Requests would have to be urllib3, since there is already a urllib2 and urllib. This article wants me to install simplejson, but there's already a Python json module: oh I see, it was merged in from simplejson, and now they are developed in parallel, so using simplejson may still be better, wtf? Why is there an htmllib in the standard library if I'm always going to use lxml instead? Let's not even discuss eyesores like the heritage of the subprocess and threading libraries...


To all your questions:

Because it takes time. It takes time to add things to the stdlib, especially if they are replacing "an old reliable." In the case of Requests, it is just a wrapper on the stdlib. There is nothing in Requests you can't do with the stdlib; the same goes for pretty much everything else.
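To illustrate the "just a wrapper on the stdlib" point, here is a sketch of a plain-stdlib fetch using modern urllib.request (the descendant of the urllib2 being discussed). To keep it runnable without network access, it fetches a throwaway local file through a file:// URL instead of a real site; the filename and markup are invented for the example.

```python
import tempfile
from pathlib import Path
import urllib.request  # modern home of the urllib/urllib2 machinery

# Write a small local page, then fetch it with nothing but the stdlib.
with tempfile.NamedTemporaryFile("w", suffix=".html", delete=False) as f:
    f.write("<h1>hello</h1>")
    local_url = Path(f.name).as_uri()  # e.g. file:///tmp/....html

with urllib.request.urlopen(local_url) as resp:
    body = resp.read().decode()

print(body)  # <h1>hello</h1>
```

Requests adds nicer ergonomics (sessions, redirects, encodings) on top, but the underlying capability is all there.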

As far as lxml goes, there are a ton of XML parsing/query libraries out there. In lxml's case, it is built on libxml2. I believe there is another library that is also built on libxml2; however, lxml has better syntax for the most part. Should it be in the standard library? Probably not: it depends on a pretty heavyweight C library, and most people don't really need it. When you do, you use it.

I have no idea what is going on with the simplejson business. I just use the json lib.
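For what it's worth, "just use the json lib" works because the stdlib json module exposes the same dumps()/loads() interface as simplejson, the project it was merged in from. A minimal sketch:

```python
import json

# Round-trip a dict through the stdlib json module; simplejson
# offers the same dumps()/loads() API, so code written against
# one usually runs unchanged against the other.
payload = json.dumps({"lib": "json", "stdlib": True})
data = json.loads(payload)
print(data["lib"])
```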

I don't see anything wrong with subprocess or the threading libraries. They do what they need to do. I use subprocess quite often, it works fine. It is way better than the actual legacy version (in the os library).
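The improvement over the os-module legacy can be sketched briefly: where the old style was os.popen("echo hi").read() through a shell, subprocess takes an explicit argument list and captures output without shell parsing. (subprocess.run with capture_output is the modern spelling; the echo command is just a stand-in.)

```python
import subprocess

# Run a command without invoking a shell and capture its output.
# Legacy equivalent: os.popen("echo hi").read()
result = subprocess.run(["echo", "hi"], capture_output=True, text=True)
out = result.stdout.strip()
print(out)  # hi
```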


It seems the performance issues with the stdlib's json version are not in the past: http://news.ycombinator.com/item?id=3128433


Good to know. When I said I don't know what is going on, I meant why the simplejson people are not focusing on the stdlib version. It seems strange to basically have a fork...


> arguably, in PHP

> echo file_get_contents('http://www.gun.io')

I thought it had to be "too good to be true": the links are broken. And since the src links are broken too, images do not display.


Of course, these are all examples of piping the body of an HTTP GET response. There is no parsing going on. You'd need to parse the HTML to translate the links, if this is being served from your own webserver. If you're trying to mirror actual content, wget -r might be a better tool (and can translate the URLs).
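The "parse the HTML to translate the links" step can be sketched with the stdlib html.parser. This toy LinkCollector class (a name invented for the example) just finds the relative href/src values, i.e. the ones a mirroring tool like wget -r would have to rewrite:

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect relative href/src targets: the links that break
    when a fetched page is served from a different host."""
    def __init__(self):
        super().__init__()
        self.relative = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (name in ("href", "src") and value
                    and not value.startswith(("http://", "https://", "//"))):
                self.relative.append(value)

parser = LinkCollector()
parser.feed('<img src="/logo.png"><a href="http://example.com/">x</a>')
print(parser.relative)  # ['/logo.png']
```

A real mirroring tool would then rewrite each collected path against the original base URL; this only shows the detection half.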


What? Both the Python and PHP samples output the same thing (bar one line) - images don't show because the pages contain relative paths (which only work if you have the images stored locally).

    (development)ross@debian:~/hntest$ python download.py > download.py.html
    (development)ross@debian:~/hntest$ php download.php > download.php.html
    (development)ross@debian:~/hntest$ diff download.py.html download.php.html
    242c242
    <             <div style='display:none'><input type='hidden' name='csrfmiddlewaretoken' value='b0c35970dfd374f2b138ed89a4f83a76' /></div>
    ---
    >             <div style='display:none'><input type='hidden' name='csrfmiddlewaretoken' value='018fe81570d710d88ca3f46d1db4c8b7' /></div>
    303d302
    <


urllib is in Python's standard library:

  import urllib  # Python 2 stdlib
  the_page = urllib.urlopen("http://www.gun.io/")
  print ''.join(the_page.readlines())
I could jam it all into one line, but in Python we prefer readable code ;) ;)

The Python community also tends to be quieter than those raving Ruby fanbois. If you'd said it "appeared to be larger," then I would have agreed with you...


Why don't you just use:

  print the_page.read()


Pythonistas should look at the "Requests"[1] lib for Python mentioned here on HN[2] a few days ago, too.

[1]: http://docs.python-requests.org/en/latest/

[2]: http://news.ycombinator.com/item?id=3094695


Explicit is better than implicit. I think keeping a bit of the verbosity helps people know what's going on. That knowledge is invaluable when debugging certain problems or creating solutions that work well.


Have you heard of something called Java? ;)


A good language, like a good anything, is a reasonable language. Giving extremes as arguments against sensible middle positions is the way of madness, paralysis, or extremism.


That's what the post linked to talks about.


Sure, but it's good to repeat it in this case: Python's urllib/urllib2/etc. libraries are famously hard(er than necessary) to use.
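As a small illustration of the verbosity complaint, even attaching a custom header means constructing a Request object by hand. This sketch uses urllib.request (the Python 3 descendant of urllib2) and never actually sends anything over the network:

```python
import urllib.request

# Just setting a header requires explicit Request-object plumbing;
# nothing is transmitted here, we only build the object.
req = urllib.request.Request(
    "http://www.gun.io/",
    headers={"User-Agent": "hn-example"},
)
print(req.full_url)
print(req.get_header("User-agent"))  # note urllib's header-key casing
```

Compare with the one-liner ergonomics (headers as a plain dict argument) that Requests was created to provide.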



