I don't usually go on tirades like this, but this is what I do for a living.
You'll want to use schemas vs. databases. Not only is it significantly more convenient (i.e., you'll administer 1 database instead of n), it will let you scale. For instance, a postgres database has a pool of shared memory that it uses for query plans, caching, etc., and you'll be able to optimize the use of it across all of your clients, rather than one by one.
Related is the impact on connection resources. A connection to a database may be used for multiple schemas, so you'll be able to make much better use of those. Suppose you have 1000 clients, which each typically have 1-2 users on your application when it's typically in use. In the multiple database config, you'll need one connection pool per database, each pool having at least 1 open connection, so that under no load you're still maintaining 1000 connections. With the schemas approach, you can maximize the throughput and balance the connections proportionally to your active users rather than the number of customers you have.
Oh, one more biggie: remember the shared buffers? Those are typically per-session (connection). So if you frequently run "select price from widgets where category=?", you'll want to parse, rewrite, plan, compile, and cache that as infrequently as possible, so keep those connections and sessions open as long as possible so you get the benefit of prepared statements. You will see load go down if you do this, from reduced CPU for parsing and plan generation, and reduced system wait since you won't be churning your shared buffers as much.
Could you please clarify what you mean by "schemas"?
My understanding is that a schema is an abstract definition of what your database fields/relations are. So that's too vague for me.
Adding a customer_id field to every single table in the database would technically fit the definition of "schema", but I suspect you mean something else altogether. Please elaborate.
Neeleshs' reply is what I meant. In Postgres, it's like a namespace.
But you have a point: adding customer_id is not unreasonable. In fact, this is my preferred logical approach, but I'll typically only do this on Oracle. It's DRY, meaning that, in the abstract definition of schema, you're modeling your relations once. If you have widgets belonging to customers, you have a one widgets relation containing a customer id, rather than one widgets relation in each customer schema. It's an elegant design, and it takes even greater advantage of the benefits I described earlier.
I'd advocate against it when building a multitenant SaaS application on Postgres if the following hold:
1. You have many, many customers
2. Each customer has a lot of data
The reasons:
1. You'll quickly hit a place where it's hard to throw more vertical processing power at it. You'll need a serious SAN with lots of spindles to be able to service many request to a lot of data, and plenty of CPU and RAM to handle the grunt work. Let's face it: you're on Postgres because you're broke :) I really love it, but sometimes I'd rather buy a solution to a problem (e.g., VPD). However, if you're rolling out schemas, you can build a new database and move them discretely. This isn't automatic or transparent, but it's very clean and very easy.
2. You're likely dealing with clients with special needs. It might not be appropriate to roll out new features to every customer all at the same time. If you ALTER TABLE widgets... it's all or nothing. Sometimes that's great, but sometimes it's not, and in building multitenant SaaS apps, I find it's most often not. In repeating your design, you're buying the flexibility to deal with your customers on a more granular level.
3. Others have already mentioned this: sometimes it's easier to articulate the integrity of your system to technical auditors with a hard separation like independent schemas. It's a little easier to show that a particular database account has access only to a particular schema than it is to convince him that your code does a sufficient job of enforcing the rules. I've done a lot of work where regulatory compliance is a concern, and this is always an easier sell.
The thing is that most vendors have solutions to these problems, and if Postgres doesn't have already built, you can probably build it from their primitives. This is just my general thinking under this particular set of circumstances.
I don't usually go on tirades like this, but this is what I do for a living.
You'll want to use schemas vs. databases. Not only is it significantly more convenient (i.e., you'll administer 1 database instead of n), it will let you scale. For instance, a postgres database has a pool of shared memory that it uses for query plans, caching, etc., and you'll be able to optimize the use of it across all of your clients, rather than one by one.
Related is the impact on connection resources. A connection to a database may be used for multiple schemas, so you'll be able to make much better use of those. Suppose you have 1000 clients, which each typically have 1-2 users on your application when it's typically in use. In the multiple database config, you'll need one connection pool per database, each pool having at least 1 open connection, so that under no load you're still maintaining 1000 connections. With the schemas approach, you can maximize the throughput and balance the connections proportionally to your active users rather than the number of customers you have.
Oh, one more biggie: remember the shared buffers? Those are typically per-session (connection). So if you frequently run "select price from widgets where category=?", you'll want to parse, rewrite, plan, compile, and cache that as infrequently as possible, so keep those connections and sessions open as long as possible so you get the benefit of prepared statements. You will see load go down if you do this, from reduced CPU for parsing and plan generation, and reduced system wait since you won't be churning your shared buffers as much.