> Python is the best language in the world for interacting with the web, and I'm going to show you why.
I disagree. I didn't need to read it, but I did, and I still feel that way.
The statement says language, but the actual points made are about the libraries and the community (i.e. the ecosystem).
I think the Ruby ecosystem is actually better, for a few reasons:
* Deep integration with JavaScript (ExecJS and TheRubyRacer are way ahead of what is available in Python, Perl, and PHP)
* Many different excellent choices for talking to HTTP APIs, including the highly innovative Faraday
* A top-notch JVM implementation (JRuby has had more traction than Jython over the last five years)
* Lots of good auth libraries: OAuth (for authenticating with a plethora of outside services), Devise (pre-built auth with confirmation emails, password resets), and OAuth2 provider (server) implementations
* Ruby has evented programming too. Goliath is pretty good.
Python is great but I take issue with saying that it's the best web dev language out there.
My cranky old age is going to show here, but... Python or Perl, all the way.
Both have a long history of "standard" modules for common cases and well-maintained (if not "standard") modules for alternative approaches to the common cases, or for the uncommon cases.
More importantly, both languages have a well-established culture of boringness. This is a quality I actively look for, because it tells me someone wrote code to solve a problem, not because it was the hip thing to do.
There's still room for improvement in both of them, too. If you want to have a lot of connections open, you have to choose between lots of threads with blocking I/O, or using event-driven I/O, or using some kind of magical light-weight cooperatively multitasked greenlet/fiber thing.
As Erlang and Haskell show, it's totally possible to combine an event-driven I/O subsystem with lightweight user-space preemptive multithreading that supports a huge number of threads, takes advantage of multiple cores, and doesn't need to worry about blocking-vs-nonblocking magic. I would love to see that in a more mainstream language.
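For reference, here is a minimal sketch of the cooperative option from that list, using asyncio from today's Python standard library. The `fetch` and `serve_many` names are made up for illustration, and `asyncio.sleep` stands in for a real network read. Note that this is cooperative and single-core, not the preemptive multi-core model that Erlang and Haskell offer.

```python
import asyncio

async def fetch(conn_id, delay):
    # asyncio.sleep stands in for a network read; awaiting it yields to the event loop
    await asyncio.sleep(delay)
    return conn_id

async def serve_many(n):
    # One coroutine per "connection"; all of them share a single OS thread
    tasks = [fetch(i, 0.01) for i in range(n)]
    return await asyncio.gather(*tasks)

results = asyncio.run(serve_many(1000))
print(len(results))
```

A thousand pending "connections" cost a thousand small coroutine objects rather than a thousand threads, but any call that blocks without awaiting will stall all of them.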
It also depends on what you need to do with the web. Perl is currently the best language for interacting with Unicode, for example. If your web needs include Unicode, you might have trouble with Ruby or Python, unless you're willing to accept half-ass, partial, or just plain wrong Unicode implementations.
With Python 2 strings you have to be proactive about it and
- always use u"" unless you know what you're doing
- segregate non-unicode data to boundaries by decode()ing early and encode()ing late, or you're guaranteed to shoot yourself in the foot sooner or later
but if you do that, it's quite smooth sailing.
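A rough sketch of the "decode early, encode late" rule, written so that it should run on both Python 2 and Python 3. The `shout` function and the sample string are invented for illustration.

```python
# -*- coding: utf-8 -*-

def shout(raw_bytes, encoding="utf-8"):
    text = raw_bytes.decode(encoding)   # decode once, at the input boundary
    text = text.upper()                 # everything in the middle works on text only
    return text.encode(encoding)        # encode once, at the output boundary

incoming = u"caf\u00e9".encode("utf-8")  # bytes, as they would arrive from a socket or file
outgoing = shout(incoming)
```

The point is that bytes only exist at the edges; nothing between the decode and the encode ever sees an encoded string.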
Also,
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
helps a lot.
All that just brings you closer to Python 3 anyway and really helps when using 2to3.
Agreed; however, I've run into compatibility issues with 3 a few times and so resolved to stick w/ 2, deal with the text parsing issues myself, keep the libraries I like and leave the ops team alone.
Ruby's character encoding and Unicode support is pretty strong. I'm intrigued how you think it's half-ass, partial or just plain wrong (Really! If there's something really borked with it, it's in my interest to know :-)). Every string has full encoding support and it's baked right in to the language.
Its String functions like upcase or capitalize won’t even look at anything but ASCII.
It’s completely missing a whole lot of critical Unicode functionality:
* casemapping & -folding
* grapheme support
* normalization
* collation
* text segmentation, &c &c &c.
Every Ruby string carries around its encoding, instead of sanely unifying into Unicode internally like nearly everything else does.
Also:
> baked right in to the language
is not synonymous with "intelligently implemented"
Note that I wasn't implying that "half-ass," "partial," and "just plain wrong" necessarily all apply to Python and/or Ruby's implementations. Some may apply to some areas while others may not, and really this extends outside of just Python and Ruby, but I'm trying to stay in context here.
That's it! In only two lines of ruby, you can grab a whole webpage and print it to the screen. Awesome! - and it's in the standard lib ;)
I agree, when I left PHPland I chose Ruby over Python because the (web) community was larger, there were just more books, blogs, screencasts, etc for Ruby/Rails.
There seem to be more ideas generated out of the Ruby community. I'd add Sinatra and HAML to the list above; like Rails, they've both been ported to other languages.
Of course Heroku was Ruby first and it's so good that my PHP/Java/Python friends jumped for joy when their language became supported.
you lose even more of the line noise of the other languages.
For a language that claims to be "batteries included" it amazes me how many Python tutorials begin with "go get and install this or that package..." How much of the Python standard library is no longer the "Pythonic" way of doing standard things? Why aren't APIs with more refined design, like Requests, pushed back into the standard library, and made the real one way to do it?
Well, OK, you'd have the problem of redundant libraries starting to stack up--it's already looking tiresome. For instance, Requests would have to be urllib3, since there is already a urllib2 and urllib. This article wants me to install simplejson, but there's already a Python json module: oh I see, it was merged in from simplejson, and now they are developed in parallel, so using simplejson may still be better, wtf? Why is there an htmllib in the standard library if I'm always going to use lxml instead? Let's not even discuss eyesores like the heritage of the subprocess and threading libraries...
Because it takes time. It takes time to add things to the stdlib, especially if they are replacing "an old reliable." In the case of Requests, it is just a wrapper on the stdlib. There is nothing in Requests you can't do in the stdlib, ditto pretty much everything.
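To illustrate the "you can do it in the stdlib" point, here is a minimal sketch of a GET with query parameters and a custom header using only the standard library. It uses the Python 3 module names (in Python 2 these live in urllib and urllib2), and the helper names, URL, and user agent string are all made up.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def build_request(base_url, params, user_agent="example-client/0.1"):
    # Roughly what requests.get(url, params=..., headers=...) assembles for you
    url = base_url + "?" + urlencode(params)
    return Request(url, headers={"User-Agent": user_agent})

def get_json(base_url, params):
    # Performs the actual network call and parses a JSON body; not invoked below
    with urlopen(build_request(base_url, params), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))

req = build_request("http://example.com/search", {"q": "python", "page": 2})
print(req.full_url)
```

It works, but you assemble the URL, the headers, and the decoding by hand, which is exactly the boilerplate people install Requests to avoid.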
As far as lxml goes, there are a ton of XML parsing/query libraries out there. In lxml's case it is built on libxml2. I believe there is another library that is also built on libxml2; however, lxml has better syntax for the most part. Should it be in the standard library? Probably not: it depends on a pretty heavyweight C library and most people don't really need it. When you do, you use it.
I have no idea what is going on with the simplejson business. I just use the json lib.
I don't see anything wrong with subprocess or the threading libraries. They do what they need to do. I use subprocess quite often, it works fine. It is way better than the actual legacy version (in the os library).
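For anyone who hasn't used it, a small sketch of the typical subprocess pattern (Python 3.5+ for `subprocess.run`). The `run_and_capture` helper is just an illustrative name.

```python
import subprocess
import sys

def run_and_capture(args):
    # Passing a list of arguments avoids going through a shell,
    # unlike the older os.system / os.popen calls
    proc = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return proc.stdout.decode("utf-8").strip()

# Run the current interpreter as a child process so the example is portable
output = run_and_capture([sys.executable, "-c", "print('hello')"])
print(output)
```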
Good to know. When I said I don't know what is going on, I meant why the simplejson people are not focusing on the stdlib version. It seems strange to basically have a fork...
Of course, these are all examples of piping the body of an HTTP GET response. There is no parsing going on. You'd need to parse the HTML to translate the links, if this is being served from your own webserver. If you're trying to mirror actual content, wget -r might be a better tool (and can translate the URLs).
What? Both the Python and PHP samples output the same thing (bar one line) - images don't show because the pages contain relative paths (which only work if you have the images stored locally).
I could jam it all into one line, but in Python we prefer readable code ;) ;)
The Python community also tends to be quieter than those raving Ruby fanbois - if you'd said appeared to be larger, then I would have agreed with you...
Explicit is better than implicit. I think keeping a bit of the verbose helps people know what's going on. That knowledge is infinitely valuable when debugging certain problems or creating solutions that work well.
A good language, as a good anything, is a reasonable language. Giving extremes as arguments against sensible middle positions is the way of madness, paralysis, or extremism..
I wouldn't count the need for a good JVM implementation as a plus mark in a Ruby vs Python comparison. Jython is used when you need to interact with Java libs, not because the C implementation of Python language is slower than the JVM one. The same does not apply for Ruby, you use JRuby because it's better than CRuby. If you want to compare speed you'd be up against pypy.
When a language has multiple viable implementations, it means the language has a good specification. It also means that it doesn't depend too much on platform-specific characteristics. It means it is more portable. Python is both blessed and cursed with a couple of problems ... its behavior is sometimes related to the CPython implementation (e.g. reference counting, __del__), and also some libraries are too big and important to live without them.
One such library is NumPy. Currently you cannot talk about an alternative Python implementation if you don't have NumPy running on it, and that's a fact.
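To make the reference counting and `__del__` point concrete, here is a small sketch; the `Resource` class is hypothetical. On CPython the finalizer normally runs as soon as the last reference goes away, while an implementation with a different garbage collector may only run it at some later collection, so code that relies on the immediate behaviour is quietly tied to CPython.

```python
import gc

class Resource(object):
    released = []  # records which resources have been finalized

    def __init__(self, name):
        self.name = name

    def __del__(self):
        Resource.released.append(self.name)

def drop_one():
    r = Resource("a")
    del r  # the last reference is gone here
    return list(Resource.released)

seen_immediately = drop_one()  # on CPython the finalizer has normally already run
gc.collect()                   # other implementations may need a collection first
seen_after_gc = list(Resource.released)
print(seen_immediately, seen_after_gc)
```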
I'm a Python developer in my day job and I never used Pypy for anything. I only played with it and became frustrated that libraries I relied upon don't work on it.
This is really, really cool. Thanks for making this.
I really dig the Flask project (love the website and the docs, and everybody who uses it raves about it) and I hope to play with it more in the future, but I don't think I'd recommend it to somebody who is a first timer, largely because of one issue: Data. Flask leaves you to sort it out on your own, which is great if you're capable of that, but Django holds your hand, which is more appropriate for a beginner.
I'm pretty sure he means that there's no database support built in, ie. you need to go download something like SQLAlchemy, SQLObject, Storm, etc. If you're already going to do that anyway it's not so much of an issue, but for people who are just getting their feet wet, it's yet another obstacle to overcome.
And it makes sense -- I think Flask is a much easier introduction to config and the VC of MVC development... but once you get to the M, well... SQLAlchemy is great, but the learning curve is steeper. And even though Flask has a well-documented extension for that, it's still another package and a separate piece of documentation.
As a first-timer, I really appreciated having Flask around. It might not have a database layer, but it did have really excellent documentation. I felt like I had a much firmer understanding of what was happening with Flask than I did with Django.
Nice. It does seem like one of the lighter-weight Python web frameworks might fit in better with the overall theme of this intro, and Flask is a great choice.
I thought about doing the same thing as I read through it.
I wrote this for a friend who is just getting started out in python/web stuff. It might be a little n00bly for lots of people here, but it could be interesting for people who aren't pythonic who are thinking about making the switch!
Great that you promote requests, lxml, json (just json, not simplejson since Python 2.6), django to use by a beginner in Python. Pages from http://wiki.python.org/moin/WebProgramming might provide too much choice for a beginner.
You might add a tiny example of a microframework such as Bottle, Flask or an example from Pyramid written in the microframework-ish style.
Recommending `sudo` with `pip install` to a novice is not a good idea. System package manager or `pip install --user` could be used as alternatives. `virtualenv` might be out of topic for a short tutorial.
About JSON: Python 2.6 added a "json" module to the standard library that was forked from simplejson, but in that time, simplejson has improved; the interface is exactly the same as the built-in json module, but simplejson tends to be quite a bit faster.
On my informal benchmark of JSON parsing speed, simplejson was 27x faster than json.
The idiom I've been using is:
try:
    import simplejson as json
except ImportError:
    import json
Documentation for simplejson mentions that it should be faster, though it refers to Python 2.5 and 2.6, so it is unclear whether the claim is outdated:
> It is the externally maintained version of the json library contained in Python 2.6, but maintains compatibility with Python 2.5 and (currently) has significant performance advantages, even without using the optional C extension for speedups.
http://simplejson.readthedocs.org/en/latest/index.html
An order of magnitude difference might be an issue with your python installation.
On my machine the json version shipped with 2.6 (1.9) is 15 times slower than the one shipped with 2.7 (2.0.9). simplejson (2.2.1) is still faster than both, by 40 times and 2 times respectively, for .loads() on timeline.js from the example, as measured by:
% timeit json.loads(timeline_text)
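Outside IPython, roughly the same measurement can be sketched with the stdlib timeit module. The `timeline_text` below is a made-up stand-in for the timeline document, and `time_loads` is just an illustrative helper, so the absolute numbers mean nothing beyond your own machine.

```python
import json
import timeit

# Hypothetical stand-in for the timeline document used in the article
timeline_text = json.dumps([{"id": i, "text": "tweet %d" % i} for i in range(1000)])

def time_loads(module, text, number=100):
    # Average seconds per call to module.loads(text)
    total = timeit.timeit(lambda: module.loads(text), number=number)
    return total / number

per_call = time_loads(json, timeline_text)
print("json.loads: %.6f s per call" % per_call)
```

Passing `simplejson` instead of `json` as the first argument (if it is installed) gives the comparison.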
It seems that C speedups are available on both versions:
BeautifulSoup is poorly maintained — you have to be very specific with which version you're using.
Note: lxml has a number of repair modes that allow it to parse virtually anything. CPU cycles and memory go up quite a bit when they're activated, but it's still better than BeautifulSoup.
Thankfully lxml has a slower-but-more-forgiving mode that you can use when interacting with poorly formatted HTML, which takes advantage of BeautifulSoup http://lxml.de/elementsoup.html
lxml's default parsers are good for xml, atom and xhtml; for html5 and html tag soup, lxml.html.html5parser (which depends on html5lib) is the way to go. For feed tag soup, feedparser still uses BeautifulSoup internally.
Very nice article and thanks for the intro to Requests. Though, I'm not sure we would agree about Django being the best thing to learn in the long run.
I undertook to move to Python for web development about 8 months ago and embarked on a thorough quest for the one true framework. Obviously Django was the first and most recurrent recommendation. Having briefly toyed with it in the past, I really could see myself committing to it for the long run. The various raving reviews also made the choice all the simpler. I was all ready to give it my seal of approval when I encountered my first complaint, which was so compelling that it raised an eyebrow.
It was about an experienced Python programmer explaining that as time went and as he progressed with Django, he found himself increasingly swapping parts out of it for external libraries. SQLAlchemy, WTForms, Jinja2. In the end he had only the routing module and the admin, which wasn't that big a deal for him. He was asking what was the point of using a full-stack framework not necessarily designed with interchangeability in mind, if you end up just using it like a glue mini-framework?
As I dug deeper, I found more similar complaints, all from similarly experienced developers, who all ended up adopting something else with a light pluggable base approach. I heard of repoze.bfg, Pyramid, Werkzeug and a slew of other ones that allow you to get down and dirty fast, while still allowing you to get big in the long run.
Just as you, I was recently asked by a friend wanting to get into web applications development to recommend a platform to work from. I also did point to Django, but explained that it wasn't because it's necessarily the best, but rather because it's the gentler introduction. It comes with batteries, crutches, first aid kit and a nice box of goodies, perfect for someone who has no clue what they're doing.
Note that I'm not dissing Django or relegating it as an amateurish framework. I agonized over my decision and still sometimes experience some Django envy (FYI I adopted Flask and don't regret it one bit). Nonetheless, it's hard to deny that it does a particularly good job of introducing newbies to good concepts fast, while at the same time being notorious for getting in the way of more experienced developers more than some other frameworks do.
The issue that I have with that attitude (as an experienced Python and Django developer) is that there's a lot more to it than just "swapping out Django's DB and replacing it with SQLAlchemy". You miss out on most of the infrastructure around Django's database stuff that makes it so easy and worthwhile to use.
For example, fixtures. In Django, you can include them in your tests and have it All Just Work(TM):
Also note that, while I'm picking on fixtures here, Django also has a bunch of other database related features, like introspecting a pre-existing database, and generating a bunch of Model classes from it. Combine that with South and you're 90% of the way there when migrating the data from a legacy system.
This has turned into a bit of a rant, but I've seen a lot of half-baked reimplementations in "pluggable" architectures of stuff which Django just gets right.
It's not a matter of attitude, it's about having tools, evaluating them for what they're doing for you and making a choice. There's always a conscious trade-off.
You can't focus on what's good about Django, as if someone picking SQLAlchemy or Werkzeug is a fool at a loss. The same goes for the other libraries that I listed.
In my own case, I'm not particularly a fan of opinionated frameworks. I've had my share of griefs with them. Django looked nice, but I easily could identify with the pains those developers went through when dealing with its lack of flexibility at certain corners. I was new at Python, not at web development, I did not need the hand holding, no matter how nice and clever the code was.
This was not a one evening process, I read blog posts, forums, perused StackOverflow and even HN. It took weeks. Feel free to take the journey, then tell me it didn't give you pause.
Flask is a small framework that is far from pretending to be what Django is. It's not even at version 1, but with its flaws, I'm quite happy with what it's allowing me to do.
I don't mind you ranting about my post, just don't assume that I would use Django the way you do. Stuff that you relish might be what turns me off and the beauty of it all is that it's all acceptable.
I really don't understand the "lack of flexibility" thing myself. I've made Django and Django's ORM and admin do all sorts of weird stuff, and it's much easier to build on top of that than to reroll everything, which is what you have to do with Pylons/Flask/Bottle/...
And database/fixture set up is something that you will have to do, assuming that you test your app, and that your app is more sophisticated than "I have some strings which I want to upper case". Ditto for working with legacy databases, migrating data, bla bla blah.
The default option from what I've seen tends to be to set your DB up once and hope for the best, or else repopulate it after every test. Both of these options work great(ish) when you first start your project, but then grind you down six months later when your test suite takes an hour to run. A decent ORM, along with in-memory SQLite and fixture setup is generally what's settled on for most integration suites that I've seen, and Django does that out of the box.
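For anyone who hasn't seen the in-memory SQLite plus fixtures pattern without a framework, here is a bare-bones sketch using only the standard library. The table, fixture rows, and test names are all invented. Django's test runner and fixture loading do the equivalent for you, plus the ORM on top.

```python
import sqlite3
import unittest

# Hypothetical fixture data; Django would load this from a fixture file
FIXTURE_USERS = [(1, "alice"), (2, "bob")]

class UserQueryTest(unittest.TestCase):
    def setUp(self):
        # A fresh in-memory database per test: fast, and nothing to clean up on disk
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        self.db.executemany("INSERT INTO users VALUES (?, ?)", FIXTURE_USERS)

    def tearDown(self):
        self.db.close()

    def test_lookup_by_name(self):
        row = self.db.execute(
            "SELECT id FROM users WHERE name = ?", ("bob",)
        ).fetchone()
        self.assertEqual(row[0], 2)

    def test_inserts_do_not_leak_between_tests(self):
        self.db.execute("INSERT INTO users VALUES (3, 'carol')")
        count = self.db.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        self.assertEqual(count, 3)

def run_suite():
    # Load and run only this test case, returning the unittest result object
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(UserQueryTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test starts from the same known rows, so tests cannot interfere with one another regardless of the order they run in.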
Also - a minor nit, but WTForms and Jinja2 are based on Django's form and template libraries. Switching to them doesn't get you that much - they certainly don't replace any of Django's core infrastructure. And I've found SQLAlchemy to be basically unusable unless you use the new 'declarative base' stuff, which looks suspiciously similar to... Django's ORM :)
> And I've found SQLAlchemy to be basically unusable unless you use the new 'declarative base' stuff, which looks suspiciously similar to... Django's ORM :)
I can relieve your suspicions as I never looked at Django's ORM at all when designing declarative. It's merely poking normal SQLAlchemy attributes, all of which existed before Django was ever released, onto a class. I can assure you active-record style class mapping is not an idea Django invented.
The same design pressure (to make the simple stuff easier) is at work in both cases. Much like, say, Bottle and Flask looking very similar.
The "basically unusable" is probably a little harsh, but I've never really understood the motivation behind some of SQLAlchemy's design decisions. eg. separating one class/table definition into interacting table, model and mapping classes makes no sense at least 99% of the time - that seems like something which should be hidden internally (but accessible if you really need it).
well if you read our docs you'll see that they're entirely in agreement with that. separate mapping/table design is called "classical mapping" and was years ago superseded by the declarative style. Update your sqlalchemy knowledge before commenting on it.
interestingly enough a ton of users still prefer the mappers/tables to be separate.
I'd say the smart bet is in learning technologies that are relevant and to be comfortable with your tools. I don't recall the last time a client asked/cared what technology stack we're using. They usually just need stuff built.
I believe that the python2.6 json module (basically forked from simplejson) is much slower than the current simplejson version. The speedups (simplejson rewrite) didn't get included until 2.7 as I recall (memory is vague on this point).
First, why is the author recommending SimpleJSON? Just use json, it's built into the standard lib! (Also if I'm not mistaken the json implementation included in Python may even be based on simplejson, the APIs are very similar even if not.)
Second, I disagree that Django is the best web framework. It might be the best web framework, but it depends on what you're doing. I've come to prefer Flask for its simplicity and overall the way it feels more Pythonic.
That said, requests cannot be recommended enough! It is an awesome package that should not be missed if you're doing web programming in Python.
requests comes close to Perl's LWP::UserAgent in terms of usability, but LWP::UserAgent has been around for years. I don't know how that makes Python the 'best language for interacting with the web.'
Are you doing anything funky with the file or the file system? Windows, Linux or Mac OS? There should be no need to do what you are saying.
Perhaps you are trying to run "./manage.py runserver" without calling python? In this case you'd be right that you need to have "./" as part of the command call, but you still would have to add the shebang to manage.py and make it executable.