Python Ecosystem - An Introduction (mirnazim.org)
499 points by mnazim on Nov 28, 2011 | 75 comments



Great post. I would add:

* the site module, which is imported by default and is what is responsible for setting up the default sys.path. You can skip 'import site' by running python with the -S switch. the site module is written in python, so you can scan through it and understand how python starts up and inits.

* PYTHONSTARTUP env variable, which points to a python file that is run (like a bashrc, or AUTOEXEC.BAT, if you prefer) on interactive prompt startup. I use this to import custom paths and modules that I want to access from the REPL, such as Google App Engine

* I use pip with local repositories. clone the repos of the libs you need, and then pip install in the virtualenv from that local clone:

    $ pip install git+file:///Users/nik/.python-packages/tornado
(note the triple slash). Or straight from GH:

    $ pip install git+git://nikcub@github.com/nikcub/tornado
this can keep your versions in sync across all projects and virtualenvs and it means no re-downloading and you can setup and update projects while offline.

* don't store the actual project inside the virtualenv. the virtualenv provides the execution context (setup and torn down using the virtualenvwrapper helper scripts). a common practice is to place all your virtualenvs into a directory like ~/.virtualenvs. you should never have to cd into this dir, access it using the wrappers and pip. (edit: also agree with comment below that you shouldn't be sudo'ing).

* just a quick add, I think it is definitely worth learning how to install python from source.
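To see what the site module actually contributes at startup, here is a rough sketch that launches a fresh interpreter with and without -S and compares the resulting sys.path. It assumes Python 3 and that sys.executable points at a usable interpreter; the helper name startup_info is made up for illustration.

```python
import json
import subprocess
import sys

# Code run in the child interpreter: report whether site was skipped
# and what sys.path ended up looking like.
CHILD = (
    "import json, sys; "
    "print(json.dumps({'no_site': sys.flags.no_site, 'path': sys.path}))"
)


def startup_info(skip_site):
    """Run a fresh interpreter (hypothetical helper), optionally with -S."""
    cmd = [sys.executable]
    if skip_site:
        cmd.append("-S")  # skip the automatic 'import site'
    cmd += ["-c", CHILD]
    out = subprocess.check_output(cmd, universal_newlines=True)
    return json.loads(out)


with_site = startup_info(skip_site=False)
without_site = startup_info(skip_site=True)

# Entries that only show up when site runs (typically site-packages dirs).
added_by_site = [p for p in with_site["path"] if p not in without_site["path"]]
print("added by site:", added_by_site)
```

The exact entries depend on the installation, so no particular output is claimed here.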


PYTHONSTARTUP also lets you enable things like

  * coloured prompt
  * tab-completion
  * persistent history
Here's mine: http://mg.pov.lt/pythonrc
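For anyone who wants a starting point rather than a full pythonrc, below is a minimal sketch of a startup file that covers tab-completion and persistent history. It is not the linked file. It assumes GNU readline is available (libedit on macOS binds keys differently), and the function name and the history path are made up for illustration. Recent Python 3 releases already do much of this by default.

```python
import atexit
import os

try:
    import readline
    import rlcompleter  # noqa: F401  (importing it installs the completer)
except ImportError:
    readline = None  # e.g. platforms without readline support

# Hypothetical location; pick whatever path suits you.
HISTORY_FILE = os.path.expanduser("~/.python_history_example")


def _save_history(path):
    """Write the history on exit, ignoring an unwritable location."""
    try:
        readline.write_history_file(path)
    except OSError:
        pass


def enable_repl_niceties(history_file=HISTORY_FILE):
    """Turn on tab-completion and persistent history if readline exists."""
    if readline is None:
        return False
    readline.parse_and_bind("tab: complete")
    try:
        readline.read_history_file(history_file)
    except OSError:
        pass  # no history yet on the first run
    atexit.register(_save_history, history_file)
    return True
```

To use it, save this in a file, add a call to enable_repl_niceties() at the bottom, and point PYTHONSTARTUP at that file.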


A formidable effort.

It might be a matter of taste, but the recommendations given from "Understanding the packages" through "Install packages that need compiling" are almost harmful.

My preference:

* you should not care what your `sys.path` looks like. You need it for debugging if something goes horribly wrong. A tutorial might mention it but things like `sys.path.insert(0,..)` should be avoided or accompanied with a big disclaimer (don't use nuclear weapons if you care about the future)

* the same goes for `PYTHONPATH`. It is a hack that is rarely needed

* don't use `sudo pip`. System packages should be managed by a system packager. Use `pip --user` or create a `virtualenv`

* `pip` can handle tarballs, so there is no need for `python setup.py install` in this case.
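As a small illustration of the `pip --user` point, this sketch asks the standard site module where per-user installs would land, so nothing in the system directories needs sudo. It assumes Python 3; the printed paths differ per machine.

```python
import site
import sys

# Where a per-user install (pip's --user option) would put packages.
user_base = site.getuserbase()
user_site = site.getusersitepackages()

print("user base:", user_base)
print("user site-packages:", user_site)
# ENABLE_USER_SITE is True, False, or None depending on how Python was started.
print("user site enabled:", site.ENABLE_USER_SITE)
print("already on sys.path:", user_site in sys.path)
```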

"Code Like a Pythonista: Idiomatic Python" is worth mentioning http://python.net/~goodger/projects/pycon/2007/idiomatic/han...

Some third-party packages that could be listed (it is subjective):

bpython - interactive prompt; something for tests e.g., pytest, tox, selenium; sphinx - docs; lxml - xml/html; werkzeug - if you are talking about web-development; SQLAlchemy - sql; Cython - C extension, ~ Python syntax; async. libs e.g., gevent, Twisted.


My indented audience was not the pure beginners. I was targeting developers coming to Python from other platforms. Over the past 2 years, if I have learned one thing while training interns, trainees and experienced devs in Python/Django, it is that packaging confuses people a lot: apt-get vs easy_install vs pip. That is why I chose to spend the most time on package management, virtualenv, etc. I think I should have included stronger indications of this.

In general, your feedback is very good and important. Idiomatic Python specifically is an important resource and should have been included.

I will incorporate your feedback as soon as I can squeeze out some time.

Big Thanks!


Great Pythonesque pun there! indented==intended, I'm sure.


This happens to me all the time. My most favorite is when I have to type the word "important", my mind spells "I M P O R T A N T" but my fingers instinctively stop at "import".


My fingers often type "test" when I want to type "text".


Yes, system packages should be managed by a system packager. Unfortunately, for Debian and/or Ubuntu, those packages are often a full version number behind the PyPI version (I'm guessing it's the same for RHL/Fedora). Be very careful using a system packager like apt-get or yum to install a package that is under active development. In that case, it's often better to install from source or use pip/easy_install. But yes, if it's not a system package, don't use sudo for security reasons (although PyPI is pretty safe).


Agree with your points. Would add emphasis on ipython and heavy use of interpreter, pylint instead of pyflakes, "learn and prefer builtin modules" rule, and Python in a Nutshell on the desk.


My reasons for preferring pyflakes over pylint:

* pyflakes is fast

* pyflakes is less likely to flood me with false positives

What are your reasons for preferring pylint?


> System packages should be managed by a system packager.

You shouldn't be using the system Python. Should compile from source and put in /opt/<name of company>/. And that shouldn't be owned by some random user. So, you should be using sudo.


Building your own Python and using the system package manager do not exclude each other. On the contrary, I'd even recommend installing your own built software with your system's package manager.


I'd love to have one of these for Ruby. Every time I want to try out something written in Ruby I run head-first in to the packaging problem - Debian and Ubuntu don't appear to like shipping a working gem (presumably because it conflicts with how apt likes to do things) and the documentation on how to resolve the resulting inscrutable error messages isn't particularly easy to find. The Mac is a bit better, but I still run in to problems far too often.

I'm pretty sure a "Ruby Ecosystem, An Introduction" guide is exactly what I need.


I'm sitting in lecture right now, so here's a short version:

- Use http://beginrescueend.com/ to install ruby itself

- Use http://gembundler.com/ to manage project-level dependencies

Both of those sites have good examples, but if no one steps up I might just write something like the linked guide when time frees up.


I followed this while learning best practices for Ruby:

http://ascarter.net/2011/09/25/modern-ruby-development.html

It seemed to cover everything that I wanted, coming from a familiar Python background of pip/virtualenv/etc. For reference here's the associated HN commentary:

http://news.ycombinator.com/item?id=3044908


Up-vote, because it's a great idea for any language/tool out there. I happen to know ruby, but taking the first steps in anything is difficult, and usually involves looking up a lot of the things you already know from a more familiar place (setting up, using libs, etc). I'd really love to see a community resource like this - perhaps I'll start working on one :)


The gem issue is actually fixed in Debian testing. Not sure when it'll arrive in Ubuntu, but it can't be soon enough.


The way these things get written is by keeping notes as you figure out the bits and pieces ...


You got it spot on. Note keeping is the way to go.

The post was extracted from our internal wiki, whose content has evolved over the past 2 years and was spread all over.

It took me about 20-25 hours, spread over 3 weeks, to put it in the shape of a single post/tutorial.


Have you thought about combining your work with Kenneth Reitz's Python Guide? https://github.com/kennethreitz/python-guide

It looks like you're covering a lot of the same ground.


I would have loved to hear something about unit testing. Is nose or unittest2 the unit testing framework of choice for Python?


> Choose Python 3 only if you need to and/or fully understand the implications.

I would apply the "if you need to" part to Python 2. "3 if you can, 2 if you must"


The argument (for installing 2 as a default) is that a novice to the ecosystem will be disproportionately harmed by the inevitable Python3 experience:

Novice: "I want to do X"

Internet advice: "Use package Y"

Novice: "Okay <install, install>, wtf nothing is working"

<long frustrating debugging session>

Novice: "Oh, wow, this doesn't support Python3 yet. Now I have to ignore all the internet advice and forge my own path, OR port all my code back to 2.7! This language sucks!"

The alternative seems preferable: Stuff works, but occasionally you don't get a whiz-bang feature (you "only" get 2.7's feature set, poor you). Five years pass, you learn the language. Now you've gotta learn a bunch of new habits, which sucks, but it's not as bad because now you're pretty good at Python, so the easy parts are easy.


I think they should be asking "How do I do X" before choosing a version to install. The answer might lead them towards IronPython or Jython, not a version of CPython from python.org. Some questions might even lead people straight to PyPy.

More and more questions have problems that can be solved by Python 3, but it's probably a pain in the ass if you start with 3 then work out how to solve your problem.


Python 3 is still an "eyes open" choice. Python.org all but comes out and says that 2 is still the safer choice.

http://wiki.python.org/moin/Python2orPython3

tl;dr "If 3 does everything you need, great. However, there's a good bit of things that still don't work with 3, in which case 2 is the safer choice. Here's a pretty long list of reasons why 2 is probably better for you. ..."


I agree that as of late November 2011, Python 2.7 is the safer choice for a novice.

But I bet within a year, that's no longer true. (This is dependent on package migration, but there's been a lot of progress lately, and the chances of a novice needing a sophisticated package day one is slim anyway.)

There are a few items that Python 3 fixes that will make this a no-brainer when the vast majority of major packages are ported to 3. Specifically, floating point results of integer division, and print as a function are two that come to mind.

I can't tell you how many times I cursed at the same bug as a beginner (back before division was importable from __future__). 1/2 + 1/2 = 0. Uggh!

And the beginner may as well get in the habit day 1 of using parentheses in statements like print("Hello World").
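A quick sketch of the two differences mentioned, written for Python 3. In Python 2 the same behaviour for `/` needed `from __future__ import division` at the top of the file; the variable names are just for illustration.

```python
# Python 3: "/" is true division, "//" is floor division.
true_sum = 1 / 2 + 1 / 2      # what a beginner expects
floor_sum = 1 // 2 + 1 // 2   # what Python 2 gave for ints with plain "/"

print(true_sum)    # 1.0
print(floor_sum)   # 0

# print is a function in Python 3, so the parentheses are required.
print("Hello World")
```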


I still think that Python 2.7.2 is the only way to go for development unless you already know Python and its ecosystem well enough to know that 3.x will work for you.

That said, I try to use any backported 3.0 features such as .format for string formatting. And for beginners, I heartily recommend using the 'six' module. That way when you need to move to 3.0, porting your code or your skills, will be easy-peasy.


"While not a software tool per se, PEP 8 is a very important resource related to Python."

Actually, it is :)

pip install pep8


Please keep the great feedback coming. I will try to incorporate as much as possible.

I am indebted to HN community for the great feedback so far.


As an additional supplement - here's a pile of links I've collected that might apply: http://jessenoller.com/good-to-great-python-reads/


Man. That's one hell of a collection. Thanks for sharing.


I think that $ sudo apt-get install python-pip is a bad idea.

You should not mix multiple packaging systems on your operating system. Moreover, you can damage it: pip provides more recent packages than your distro, and if you upgrade a lib that has an incompatibility with a part of the system, you can corrupt it. I have no example to give, but I am sure you can find one... Ubuntu now has many tools written in Python.

You should use pip inside a virtualenv only. And, fortunately, when you create a virtualenv, pip is installed in it, and you don't need to use --distribute to have it.


What I tell people is that Python (as installed by apt-get, yum) is not on your system for you to develop with. It is there because some of the system tools are written in Python, including at least part of the apt packaging system. It's also there for System admins to use for writing Python scripts instead of bash scripts.

But for app development, get your own Python, manage it yourself and install 'distribute' so that you have both easy_install and pip to work with. I've taken that to extreme by making a portable Python distro that comes in a tarball and runs on any Linux distro, but even if you only untar the source and run ./configure --prefix=/home/python;make

That will build a default Python with support for any shared libraries for which you have a development version installed ( -dev version on debian/ubuntu, -devel version on redhat/suse)

sudo make install will install it, assuming that you have write permissions on the target prefix that you specified. You can even hide it in your home directory with --prefix=~/tools/python272


I don't understand why you want to use --distribute to have pip and easy_install installed.

If you do not, you also have those tools.


Great post - I wish there was a unified resource for things like that for other languages/tools.

I would only add iPython - a must for any console adventures.


Definitely; IPython is great at making Python more self-documenting, with completion, the ? help function, and the ability to embed an interpreter anywhere in your program.


bpython is much simpler, yet still helpful.


ipython has known issues running certain code. It is not guaranteed to function like a normal interpreter. If you insist on a fancy interpreter, use bpython.


These issues are not known to me, could you expand?


References last far longer than they should, because the interpreter doesn't always let go of references to returned results.

There used to be a problem with certain statements being executed as expressions and printed, but I'm told that's been fixed.

The encoding is always Latin-1. Always. Hope you don't use Unicode literals.

It's not Python. It's Python plus other things. That's always a Bad Thing because it precludes taking results from the interpreter and using them in plain Python contexts. web2py also has this problem.


This is fantastic. Someone could make a well-visited site by creating similar documents for all the major languages.


You really shouldn't tell people to go ask how to install Python on stackoverflow.com. Instead give them a few URLs to stackoverflow questions with the answer such as this one http://stackoverflow.com/questions/7538834/how-to-create-a-p...

Or even better, give them a stackoverflow search like this one http://stackoverflow.com/search?q=%5Bpython%5D+%22install+py...

P.S. I think that your wiki page is a great idea and I'm going to write a custom one for our developer wiki.


Though the article claims to be targeted at users running on linux, most of the info is still quite useful regardless of the platform - just figure out how to install python and pip and the rest is pretty platform agnostic.


Nice article. I am bookmarking it.

Also The Zen of Python can always be accessed by this Easter egg

    >>> import this
http://www.python.org/dev/peps/pep-0020/



I mostly dabble with python. I learned a little reading this, and it raised some questions for me. Is there a reason --distribute is not the default behavior of virtualenv? Is there a plan to incorporate the stuff virtualenvwrapper does into virtualenv (virtualenvwrapper is a pretty cumbersome name, if for no other reason)?


Thank you for this article.

As per the Pragmatic Programmer, I thought I would learn Python this year. It's been a tremendously frustrating experience getting a workable stack installed.

I wish the famous "One -- and preferably only one -- obvious way to do it" Python design philosophy extended to actually installing everything :(


Just curious - which OS/distribution are you on? Ubuntu and Gentoo (and virtualenv+pip if necessary) both make the Python stack a total no-brainer for me.


Try the batteries-included distro from ActiveState.com.


Great article.

If you're on Ubuntu LTS you should install PIP from PyPI (easy_install pip), since the system package management version is outdated and it doesn't have the (very useful, since PyPI likes to go down) --use-mirrors install option. That would be my only recommendation.


I'm surprised nobody has mentioned pythonbrew.

https://github.com/utahta/pythonbrew

It's the Python version of RVM. It is higher level than even virtualenv, and in my opinion, the most seamless way to manage Python environments.


This is very good. I wish a resource like this was around when I was first learning Python. The difficulty of getting things to work around the language has always been a pretty stark contrast to the ease of the language itself.


Looks great. I've started turning my attention to python recently so skimming through this I can already see lots of stuff that'll be very useful. So much that perhaps a linked TOC at the top of the page could be an idea?


Thank you, thank you, thank you. I've been looking for something like this for ages. While Python the language is fantastic, I have found getting into the whole environment quite tricky so your guide is fantastic.


Nice article but you have a typo: updrage instead of upgrade :)


And at the end, it should be "omission" not "ommision". Thanks a lot, wish I had this years ago.


Great article. Is it just me or are the prepend/append examples swapped? i.e. if you want to append TO your PYTHONPATH you should do PYTHONPATH=$PYTHONPATH:/some/new/path


It depends on what you mean by "append", I guess. If you mean add to the search path (e.g. "try this one first"), then /path:$PYTHONPATH is correct. If you mean append to the variable, then your example is correct.
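To make the two readings concrete, here is a small sketch that builds both forms of the variable. The helper names are made up; it uses os.pathsep so the separator is ":" on Linux/Mac and ";" on Windows.

```python
import os


def prepend_path(new_dir, current):
    """new_dir is searched first, like /path:$PYTHONPATH."""
    parts = [new_dir] + [p for p in current.split(os.pathsep) if p]
    return os.pathsep.join(parts)


def append_path(new_dir, current):
    """new_dir is searched last, like $PYTHONPATH:/path."""
    parts = [p for p in current.split(os.pathsep) if p] + [new_dir]
    return os.pathsep.join(parts)


existing = os.pathsep.join(["/a", "/b"])
print(prepend_path("/some/new/path", existing))
print(append_path("/some/new/path", existing))
```

Entries earlier in the variable win when two directories contain a module with the same name, which is why the distinction matters.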


I am not sure why Requests gets this much press. httplib2 has all the features of Requests and more, and their APIs are not that different.


NOTE: there's brokenness lurking in urllib and email modules. The python library modules are more often than not modeled on perfect reality rather than a pragmatic one (e.g., complete violations of RFCs). BTW, if one reads re.py in the dist, you'll notice that it hasn't been touched by Fredrik Lundh since 2001!

I should say I still love python. It's the most fun I've had programming next to Scheme, and well, NodeJS is kind of fun too (in an algol way).


Great post! Covers everything I still didn't understand after reading LPTHW. Thanks so much for sharing this.


pip is a neat system and all, but I still don't see why to use that versus a system package?


Updating the source package is easier for one. Ubuntu's packages are often out of date, especially if you are running an LTS. It also integrates nicely with virtualenv which makes managing dependencies easier, via pip freeze.
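For a feel of what pip freeze reports, here is a rough approximation using the standard library's importlib.metadata (Python 3.8+). This is not how pip implements it; the function name freeze_like is made up, and the output depends entirely on what is installed in the current environment.

```python
from importlib import metadata


def freeze_like():
    """Return sorted 'name==version' pins for installed distributions."""
    pins = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            continue  # skip distributions with unreadable metadata
        if name:
            pins.add("{}=={}".format(name, version))
    return sorted(pins, key=str.lower)


for line in freeze_like():
    print(line)
```

Run inside a virtualenv, the list covers that environment's packages, which is what makes it useful as a requirements file for one project.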


System package availability varies a lot with the distribution you're using. For instance, I use Debian and my web hosting provider is based on CentOS, and both share my REQUIREMENTS.txt file. I can't really depend on the packages being available in every distro's repository, let alone in the same version.

Also, I wouldn't like to install some packages system-wide (a friend's personal project from a github repository) so I use pip and virtualenv in a similar manner as the article recommends to maintain a per project dependency library.

That's how I see it :)


System packages install globally, pip can install globally or inside a virtualenv.
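If you are ever unsure which of the two you are installing into, a sketch like this can tell you whether the running interpreter belongs to a virtualenv. It assumes Python 3; the sys.real_prefix check covers older virtualenv releases, and the function name is made up.

```python
import sys


def in_virtualenv():
    """True if this interpreter runs inside a virtualenv/venv."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv make sys.base_prefix differ from sys.prefix.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base != sys.prefix


print("inside a virtualenv:", in_virtualenv())
print("packages install under:", sys.prefix)
```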


System packages are going to miss some packages that exist in PyPI. Using pip is also good if your company maintains its own internal toolsets and wants to be able to easily reuse those across multiple applications.


virtualenv can't isolate your system's package util like it can isolate pip.


if you have multiple projects on a system you may have different dependency requirements. module with version 1 for one project and the same module with version 2 for the other project.


pkg_resources.require() can handle this case. Though `virtualenv` is more explicit.


would make a good idea for a startup...a Python host that has a simple checkbox interface for installing all this stuff.

That way you can get started with coding instead of having to install everything by yourself.


http://enthought.com/products/epd.php (Disclaimer - I work for enthought)


In the section "The Development Environment" you should mention buildout (http://www.buildout.org/). It is a very simple way to reproduce an environment.


Great great great post!!!!!!!!!!!!!!!!!


Nice list. I would replace django with web2py, it is closer to the python philosophy imho.


great post!



