Hacker News

require 'net/http'

Net::HTTP.get_print 'www.gun.io', '/'

That's it! In only two lines of Ruby, you can grab a whole web page and print it to the screen. Awesome! And it's in the standard lib ;)

I agree. When I left PHP-land I chose Ruby over Python because the (web) community was larger; there were just more books, blogs, screencasts, etc. for Ruby/Rails.

There seem to be more ideas generated out of the Ruby community. I'd add Sinatra and HAML to the list above; like Rails, they've both been ported to other languages.

Of course, Heroku was Ruby-first, and it's so good that my PHP/Java/Python friends jumped for joy when their language became supported.




arguably, in PHP

    echo file_get_contents('http://www.gun.io');
you lose even more of the line noise of the other languages.

For a language that claims to be "batteries included" it amazes me how many Python tutorials begin with "go get and install this or that package..." How much of the Python standard library is no longer the "Pythonic" way of doing standard things? Why aren't APIs with more refined design, like Requests, pushed back into the standard library, and made the real one way to do it?

Well, OK, you'd have the problem of redundant libraries starting to stack up--it's already looking tiresome. For instance, Requests would have to be urllib3, since there is already a urllib2 and urllib. This article wants me to install simplejson, but there's already a Python json module: oh I see, it was merged in from simplejson, and now they are developed in parallel, so using simplejson may still be better, wtf? Why is there an htmllib in the standard library if I'm always going to use lxml instead? Let's not even discuss eyesores like the heritage of the subprocess and threading libraries...


To all your questions:

Because it takes time. It takes time to add things to the stdlib, especially if they are replacing "an old reliable." In the case of Requests, it is just a wrapper on the stdlib. There is nothing in Requests you can't do with the stdlib; the same goes for pretty much everything else.
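To illustrate the "just a wrapper on the stdlib" point, here is a sketch of a plain-stdlib fetch using modern urllib.request (the descendant of the urllib2 being discussed). To keep it runnable without network access, it fetches a throwaway local file through a file:// URL instead of a real site; the filename and markup are invented for the example.

```python
import tempfile
from pathlib import Path
import urllib.request  # modern home of the urllib/urllib2 machinery

# Write a small local page, then fetch it with nothing but the stdlib.
with tempfile.NamedTemporaryFile("w", suffix=".html", delete=False) as f:
    f.write("<h1>hello</h1>")
    local_url = Path(f.name).as_uri()  # e.g. file:///tmp/....html

with urllib.request.urlopen(local_url) as resp:
    body = resp.read().decode()

print(body)  # <h1>hello</h1>
```

Requests adds nicer ergonomics (sessions, redirects, encodings) on top, but the underlying capability is all there.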

As far as lxml goes, there are a ton of XML parsing/query libraries out there. In lxml's case, it is built on libxml2. I believe there is another library that is also built on libxml2; however, lxml has better syntax for the most part. Should it be in the standard library? Probably not: it depends on a pretty heavyweight C library, and most people don't really need it. When you do, you use it.

I have no idea what is going on with the simplejson business. I just use the json lib.
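For what it's worth, "just use the json lib" works because the stdlib json module exposes the same dumps()/loads() interface as simplejson, the project it was merged in from. A minimal sketch:

```python
import json

# Round-trip a dict through the stdlib json module; simplejson
# offers the same dumps()/loads() API, so code written against
# one usually runs unchanged against the other.
payload = json.dumps({"lib": "json", "stdlib": True})
data = json.loads(payload)
print(data["lib"])
```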

I don't see anything wrong with subprocess or the threading libraries. They do what they need to do. I use subprocess quite often, it works fine. It is way better than the actual legacy version (in the os library).
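The improvement over the os-module legacy can be sketched briefly: where the old style was os.popen("echo hi").read() through a shell, subprocess takes an explicit argument list and captures output without shell parsing. (subprocess.run with capture_output is the modern spelling; the echo command is just a stand-in.)

```python
import subprocess

# Run a command without invoking a shell and capture its output.
# Legacy equivalent: os.popen("echo hi").read()
result = subprocess.run(["echo", "hi"], capture_output=True, text=True)
out = result.stdout.strip()
print(out)  # hi
```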


It seems the performance issues with the stdlib's json version are not in the past: http://news.ycombinator.com/item?id=3128433


Good to know. When I said I don't know what is going on, I meant why the simplejson people are not focusing on the stdlib version. It seems strange to basically have a fork...


> arguably, in PHP

> echo file_get_contents('http://www.gun.io')

I thought it had to be "too good to be true": the links are broken. And since the src links are broken too, images do not display.


Of course, these are all examples of piping the body of an HTTP GET response. There is no parsing going on. You'd need to parse the HTML to translate the links, if this is being served from your own webserver. If you're trying to mirror actual content, wget -r might be a better tool (and can translate the URLs).
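The "parse the HTML to translate the links" step can be sketched with the stdlib html.parser. This toy LinkCollector class (a name invented for the example) just finds the relative href/src values, i.e. the ones a mirroring tool like wget -r would have to rewrite:

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect relative href/src targets: the links that break
    when a fetched page is served from a different host."""
    def __init__(self):
        super().__init__()
        self.relative = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (name in ("href", "src") and value
                    and not value.startswith(("http://", "https://", "//"))):
                self.relative.append(value)

parser = LinkCollector()
parser.feed('<img src="/logo.png"><a href="http://example.com/">x</a>')
print(parser.relative)  # ['/logo.png']
```

A real mirroring tool would then rewrite each collected path against the original base URL; this only shows the detection half.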


What? Both the Python and PHP samples output the same thing (bar one line) - images don't show because the pages contain relative paths (which only work if you have the images stored locally).

    (development)ross@debian:~/hntest$ python download.py > download.py.html
    (development)ross@debian:~/hntest$ php download.php > download.php.html
    (development)ross@debian:~/hntest$ diff download.py.html download.php.html
    242c242
    <             <div style='display:none'><input type='hidden' name='csrfmiddlewaretoken' value='b0c35970dfd374f2b138ed89a4f83a76' /></div>
    ---
    >             <div style='display:none'><input type='hidden' name='csrfmiddlewaretoken' value='018fe81570d710d88ca3f46d1db4c8b7' /></div>
    303d302
    <


urllib is in Python's standard library:

  import urllib  # Python 2 stdlib
  the_page = urllib.urlopen("http://www.gun.io/")
  print ''.join(the_page.readlines())
I could jam it all into one line, but in Python we prefer readable code ;) ;)

The Python community also tends to be quieter than those raving Ruby fanbois. If you'd said it "appeared to be larger," then I would have agreed with you...


Why don't you just use:

  print the_page.read()


Pythonistas should look at the "Requests"[1] lib for Python mentioned here on HN[2] a few days ago, too.

[1]: http://docs.python-requests.org/en/latest/

[2]: http://news.ycombinator.com/item?id=3094695


Explicit is better than implicit. I think keeping a bit of the verbosity helps people know what's going on. That knowledge is invaluable when debugging certain problems or creating solutions that work well.


Have you heard of something called Java? ;)


A good language, like a good anything, is a reasonable language. Giving extremes as arguments against sensible middle positions is the way of madness, paralysis, or extremism.


That's what the post linked to talks about.


Sure, but it's good to repeat it in this case: Python's urllib/urllib2/etc. libraries are famously hard(er than necessary) to use.
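As a small illustration of the verbosity complaint, even attaching a custom header means constructing a Request object by hand. This sketch uses urllib.request (the Python 3 descendant of urllib2) and never actually sends anything over the network:

```python
import urllib.request

# Just setting a header requires explicit Request-object plumbing;
# nothing is transmitted here, we only build the object.
req = urllib.request.Request(
    "http://www.gun.io/",
    headers={"User-Agent": "hn-example"},
)
print(req.full_url)
print(req.get_header("User-agent"))  # note urllib's header-key casing
```

Compare with the one-liner ergonomics (headers as a plain dict argument) that Requests was created to provide.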



