REST worst practices (jacobian.org)
123 points by rbanffy on Jan 20, 2012 | hide | past | favorite | 64 comments



"HTTP Accept headers (and indeed, use of many fancy headers) are beyond what most client developers will bother with; including the response format in the URL with a ".xml" or ".json" is natural and obvious."

That was in a comment by Alex Payne [1] near the bottom. Obviously it's been quite a few years since he wrote that, but it's what I personally think, too. The extent to which many blog articles and comments take REST adds far too much complication. Are client-side developers going to understand how to use your API? Always keep your target market in mind.

Honestly in my mind that's the most important point to take from this article.

[1] http://al3x.net/about.html


Of course we have to be pragmatic about these things - if you have a business that will live or die based on the success or failure of its API then you have to make your decision based on that - but there are times that it makes sense to push the boundaries of what developers are comfortable with, yes? Otherwise how do we expect things to progress?


How would this be pushing the boundaries?

Putting the format in the URL itself is much easier than having to fiddle with HTTP headers. Not to mention the ability to view API data directly within the browser is a major advantage in discoverability.


Using accept headers and browser discoverability do not have to be mutually exclusive. You can recognize browsers based on user agent and relax the requirements on accept headers.
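To make that relaxation concrete, here's a minimal sketch in plain Python. The function name, browser tokens, and precedence rules are all invented for illustration, not taken from any framework:

```python
# Heuristic tokens for recognizing browser User-Agent strings (illustrative).
BROWSER_TOKENS = ("Mozilla", "Safari", "Chrome")

def choose_format(path, accept_header, user_agent):
    # 1. An explicit extension in the URL always wins.
    if path.endswith(".json"):
        return "json"
    if path.endswith(".xml"):
        return "xml"
    # 2. Honour a specific Accept header when one was sent.
    if "application/json" in accept_header:
        return "json"
    if "application/xml" in accept_header:
        return "xml"
    # 3. Browsers send broad headers like "text/html,*/*"; default them
    #    to JSON so the data is viewable directly in the window.
    if any(token in user_agent for token in BROWSER_TOKENS):
        return "json"
    # 4. A non-browser client with no usable Accept header gets refused.
    return None  # caller would answer 406 Not Acceptable
```

A real implementation would parse Accept's quality values properly; the point is just that a URL extension, an explicit Accept header, and a relaxed browser default can coexist without being mutually exclusive.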


I've never been a big REST fan since I'm not sure that shoehorning into HTTP is a great idea. Seems like using a protocol to solve problems it was never really intended for.

When I am trying to connect two systems I prefer to think about functions rather than resources: basically, create "namespaces" and have something like www.myapp.com/api/namespace/function, then simply pass all data in using POST vars, serialize anything more complicated into JSON, and have the other end return JSON or XML.

Sometimes I will implement a simple header system with some metadata that should be sent and received with each request.

Getting too worried about POST/GET/PUT seems silly when you can just give your functions appropriate names.

I prefer to think of this as SOAP extra light.
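For what it's worth, the "SOAP extra light" style described above can be sketched in a few lines of Python; the names and URL layout here are hypothetical:

```python
import json

# Registry mapping (namespace, function) pairs to plain Python callables.
REGISTRY = {}

def rpc(namespace, name):
    """Register a function under /api/<namespace>/<name>."""
    def decorator(fn):
        REGISTRY[(namespace, name)] = fn
        return fn
    return decorator

@rpc("users", "rename")
def rename_user(user_id, new_name):
    # An example endpoint; the real one would hit a database.
    return {"id": user_id, "name": new_name}

def dispatch(path, post_vars):
    """Route e.g. '/api/users/rename' plus its POST vars to a function."""
    _, namespace, func = path.strip("/").split("/")
    return json.dumps(REGISTRY[(namespace, func)](**post_vars))
```

The trade-off is the one discussed elsewhere in the thread: this is textbook RPC, so you give up HTTP-level caching and method semantics in exchange for self-describing function names.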


I suspect you were being downvoted because you don't understand what REST is. I don't mean this as an insult, most people that talk about REST haven't actually read Roy Fielding's dissertation. It's a good read; I highly recommend it.

If you do give it a read, you'll find that his paper isn't about "shoehorning into HTTP". Actually, the title is pretty clear: _Architectural Styles and the Design of Network-based Software Architectures_. After doing a great job of classifying different types of problems and possible solutions to these problems, Dr. Fielding describes REST as a solution to a particular kind of problem. To quote (from his blog):

> REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them.

It's fine if you don't want to use REST, but it's obvious from your comment that you don't understand what REST is or when you might use it.


Thanks, I've got that bookmarked and will give it a read when I get some time.

I apologize if I have misrepresented the arguments for/against REST; it's just my experience that a lot of the debate around REST seems to be people discussing which resource something should belong to, or whether something should be POST or PUT.

These sorts of conversations don't really make your software better (I often feel the same way about OO inheritance).


Not sure why this is being downvoted quite so hard?


I found that font difficult to read. Especially on top of a flying pony.

As Jare mentioned, I don't like the idea of returning url pointers. But, I always run into a wall when it comes to deciding how 'deep' to load certain entities. Does anyone have any interesting approaches to making it dynamic based on the client request? Or should you just expect the client to make additional calls to get more data.


You're not the first person to wonder about this. Take a read over this link [1] maybe?

[1] http://www.stereoplex.com/blog/mobile-api-design-thinking-be...


Right, to pull out what I think the parent is looking for from the article: you add a depth=n parameter which runs a pass n times on the returned data to replace all IDs/URLs with their corresponding API representations.
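A toy version of that depth=n pass, assuming a flat ID-to-record store (the data and the "type:id" key format are invented for illustration):

```python
# Stand-in for the service's data store, keyed by invented IDs.
DB = {
    "author:1": {"name": "Ursula", "books": ["book:7"]},
    "book:7": {"title": "Left Hand", "author": "author:1"},
}

def expand(value, depth):
    """Replace IDs with their representations, at most `depth` levels deep."""
    if depth == 0:
        return value
    if isinstance(value, str) and value in DB:
        # Swap the ID for its record, expanding one level less inside it.
        return expand(dict(DB[value]), depth - 1)
    if isinstance(value, dict):
        return {k: expand(v, depth) for k, v in value.items()}
    if isinstance(value, list):
        return [expand(v, depth) for v in value]
    return value
```

Note the depth counter doubles as cycle protection: book 7 links back to author 1, but expansion always bottoms out.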


HN comments on that link: http://news.ycombinator.com/item?id=3337357

IMHO, we'll end up with something equivalent to relational algebra. It's the same problem: providing different representations of one data model (e.g. how far to expand links is an instance of denormalizing a flat list). But it won't happen until providing different representations becomes critical.


Thanks, very interesting. After reading the discussion about creating a query language for REST I remembered that Microsoft has something called WCF Data Services which is basically a SQL-like query language for web services.

Perhaps it is inevitable that we'll eventually need rich query languages to really deliver what clients want.


You may be right about queries.

Though, queries are distinct from relations. They are a database concept, existing before Codd's relations and also present in NoSQL etc.

I was thinking of the relational concepts that SQL builds on. A way to separate these concepts is to do the query in two stages: (1) a relational transformation over the database that denormalises it (joining lists) into a list containing your answer; (2) a query to extract that answer. (These two stages are usually redundant; it's just a way to think about it, not how to write the query, nor for a DB to implement it.)

In a sense, the nesting of REST APIs means they are heavily de-normalised: to access some data, you have to start at the top and work your way down that "path". They sometimes have multiple paths to the same data, each one representing a different denormalisation. The reason this is not "relational" is because you can only use pre-existing paths; you can't make up your own as needed.

It's only in a narrow sense that REST APIs are normalised, in that you can only return one list, with no joins in it.

Looking at WCF Data Services http://en.wikipedia.org/wiki/WCF_Data_Services, those examples aren't (necessarily) relational, just queries, in that there is no transformation, just selection of a sublist. It's not fair to judge it on one example, though. I found another, e.g. http://msdn.microsoft.com/en-us/library/dd728279.aspx; it's analogous. Maybe I'll read the specs in detail later. :-)

BTW: A query returns a subset of the database. It seems sensible to return only the subset needed. But there's an odd trade-off in REST: caching more than a specific query needs is more likely to be useful for the next query (in effect, it distributes the database over network caches). Bandwidth is plentiful, so let's return 10KB instead of 10 bytes. But latency is a problem (IMHO); presently, we need several requests to get the data we want.

I just had an idea: a rich REST API query language (as you say), and also have a client-side query language, and a client-side optimisation engine that transforms each query into a form that maximises cache hits. e.g. get much more information than needed, if it happens to be cached nearby. It is like the query optimisations that databases presently use, but on the client, on the other side of the expensive network.

You could also over-request, to populate the cache for future queries (your own and from other users). This might need a lot of tuning, based on actual usage patterns between users. Existing web-caching experience might guide this.

Finally: at the moment, I think it's clearly true that different ways of accessing the same data are not that important. Perhaps they'll never be, for web-data that is always used in the same way. The "enterprise" needed relations because they have heaps of apps using centralised data in different ways (manufacturing, inventory, sales, analysis, etc). However, as the enterprise moves into the cloud, and as developers start to construct services out of services (with much deeper integration than a simple mash-up; more like using remote libraries), this will become extremely important to some users.

whoa, long post. just trying to get my thoughts in order. hope it's of some use.


A well designed REST interface should allow the requester to specify how much information they want to receive.

By default, it's reasonable to assume the requester only wants the barest of information (along with links to related resources) when they access /hero/superman.

But it's perfectly reasonable to provide additional views for common use cases e.g. /hero/superman/powers/detailed or /hero/superman/friends/filter/powers/flight.


Ugh. The way you filter a RESTful resource is using the query string. /heroes/superman/powers/detailed is borderline OK: it could conceivably return a 404. However, the filter business is just messy. /heroes/superman/friends?powers=flight is more elegant.

See http://stackoverflow.com/questions/4024271/rest-api-best-pra...
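For the record, Python's standard library already does the query-string escaping for you; a small sketch (the function name is made up):

```python
from urllib.parse import urlencode

def filter_url(base, **filters):
    """Build a filtered resource URL, escaping values via urlencode."""
    return base + "?" + urlencode(sorted(filters.items()))
```

So `filter_url("/heroes/superman/friends", powers="flight")` yields the elegant form above, and values containing spaces or ampersands are encoded correctly instead of corrupting the URL.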


I think you're probably right about replacing the "filter" part of the URL with a query string instead.


Quick note to readers: this article is from 2008.


Luckily HTTP hasn't changed since then.


I don't really understand the obsession with using HTTP methods. OP brings up a good point that using GET/PUT/POST/DELETE to represent only CRUD is extremely limiting, but then suggests that using even more obscure HTTP methods would be less limiting? I would argue it is not only limiting, it is also obscure: what is it about "PUT" that implies update vs. create?

If you look at the twitter API (https://dev.twitter.com/docs/api) (commonly held up as exemplary REST) you will see exactly 2 methods, GET and POST. GET means no state change other than logging, POST implies state change. This makes so much more sense because then you can use ?command= and have any command you want and not have people confused about what it means.


> commonly held up as exemplary REST

Not by people who know what they're talking about.

> This makes so much more sense because then you can use ?command= and have any command you want and not have people confused about what it means.

That's textbook RPC. It's a pattern that works well for many things, but it's definitely not REST.


Very sincerely, why is REST an idea worth pursuing?


Have you read Roy Fielding's REST dissertation?


Yes, I have. To me there seems to be an abstraction leak. Specifically, he's baked HTML into his entire thesis. The idea of interconnecting hypermedia is great. It does, however, limit one to using specified hypermedia like HTML. HTML is one of the only specified media types that can link out of the box.

One of his goals was that a document would be able to provide links to another document in a natural, intuitive way. It is meant to be so intuitive, in fact, that the interchange of the document doesn't need a schema or WSDL-like system. HTML has this ability. JSON, however, doesn't.

If one wants to nest links in JSON, one has to tell the client how those links will appear. One has to define the structure beyond mere syntactical correctness. This is where REST falls down, IMHO.

So I recommend pursuing pieces of REST. But the full HATEOAS concept seems to meaningfully limit one to semantic HTML.


> To me there seems to be an abstraction leak. Specifically he's baked HTML into his entire thesis.

No he's not, he's baked hyperlinks into his thesis as the base of REST, that has nothing to do with HTML.

> One of his goals was that a document would be able to provide links to another document in a natural, intuitive way.

No. His goal is merely that documents be hyperlinked; how they are hyperlinked is the application's domain.

> It is meant to be so intuitive in fact that the interchange of the document doesn't need a schema or WSDL-like system.

I don't know where you got that one, but it's definitely not part of Fielding's thesis or his comments on the subject since. It's in fact exactly the opposite, the one thing Fielding notes must be documented in details in a RESTful service is media types, aka the documents returned by the service. That's where hyperlink semantics are added.

> JSON however doesn't.

Neither does SGML or XML, that's not an issue.

> If one wants to nest links in JSON, one has to tell the client how those links will appear.

Just as one has done with HTML.

> One has to define the structure beyond mere syntactical correctness.

Just as with HTML.

> This is where REST falls down, IMHO.

That's complete and utter nonsense.

The only difference between HTML and JSON is that HTML is already an application of a meta-document type (well used to be, of SGML, it's drifted further but conceptually it still is) whereas JSON, much like XML, is a meta-document type from which users build applications (what do you think application/xhtml+xml is?).

You sound like you want some magical silver-bullet through which you don't have to document anything and things suddenly understand each others based on fairy dust. Reality does not work that way.


If one wants to nest links in JSON, one has to tell the client how those links will appear. One has to define the structure beyond mere syntactical correctness. This is where REST falls down, IMHO.

Then use HTML. Or some other representation with known link semantics.

I don't see how failures or shortcomings of JSON indicate a problem with REST. They are orthogonal; JSON has nothing to do with REST, it's just a serialization format some people like.


I am not sure who wrote it, or where to even find it again, but I have read exactly what virmundi is referring to. It is possible he is confusing authors.

The long and the short of the document was that a REST client must be able to navigate the entire API when given just a single endpoint URL, much like you can navigate an entire website just by entering the homepage URL into your browser, but without the dependence on HTML.

The leap the document made, and what virmundi seems to be referring to, is that a REST client that speaks, say, JSON is not enough to traverse any random service, as everyone will provide a completely different JSON representation of their data. When you start hardcoding site-specific keys in which to find the action descriptions, you lose all of the flexibility the author claimed REST would provide in the first place.
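To illustrate what single-entry-point navigation could look like, here's a toy sketch where the only things the client hardcodes are the root URL and the media type's link convention (a "links" key, which is an invented convention here, not any standard):

```python
# Stand-in for the service: URL -> JSON document. In reality each lookup
# would be an HTTP GET.
SERVICE = {
    "/": {"links": {"heroes": "/heroes"}},
    "/heroes": {"links": {"superman": "/heroes/superman"}},
    "/heroes/superman": {"name": "Superman", "links": {}},
}

def follow(start, *rels):
    """Follow a chain of link relations from the entry point."""
    doc = SERVICE[start]
    for rel in rels:
        # The client never constructs a URL; it only dereferences links.
        doc = SERVICE[doc["links"][rel]]
    return doc
```

This is exactly where the site-specific key problem bites: the client above is generic only to the extent that every service it talks to agrees on what "links" means, which is why media-type documentation is the one thing a RESTful service must ship.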


Part of my comment is backed by the implicit constraint of HATEOAS, http://en.wikipedia.org/wiki/HATEOAS.

I've downloaded the original thesis. I'll look there to see what's mentioned about the hypermedia linking process.


> Then use HTML. Or some other representation with known link semantics.

Or define link semantics in JSON, the one documentation a restful service should have is that of media types (the documents returned) anyway, the only URL which should appear in the documentation is the root of the service.

Whining that JSON has no link semantics is nonsensical; SGML does not have any either. HTML added those when it was first created as an application of SGML, just as application/xhtml+xml is an application of XML defining links. "XML" as a meta-type does not know about links either.


Actually there is quite a big difference.

PUT specifies the resource to be updated, whilst POST does not. This means that you must (according to RFC 2616) use POST if you want to create a new resource, and PUT if you want to change an existing resource or create one that you know "how to name". This distinction is sometimes necessary, or at least useful (e.g. the idempotence of PUT).

I don't understand how any even half-competent developer would be unable to understand this difference and act accordingly.


What if you want to update a value, but you want to store the number of times it was updated to be returned in the JSON (not idempotent, but not an unreasonable thing to want)? Would you still use PUT?

My point is not that the ascribed meaning of POST and PUT is bad, I think the concepts behind both these functions are great, but what if you want to make a different or more complicated function? It makes sense to me to invent your own rather than be limited to what HTTP gives you.


I suppose you could expose the counter as a separate resource. It depends how important that counter is. If it's somewhat incidental, you could use PUT. If it's key to your system, use POST for updating the counted resource, because it's not idempotent.


You examine the situation and determine what makes the most sense. Do you want the client to repeat the request if it's not sure whether it succeeded or not? If so, use PUT; otherwise use POST.

This isn't complicated stuff.


There's no need to be condescending.


PUT is supposed to be idempotent (http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1...).

Calling "PUT a=b&c=d" three times should be identical to calling it once. So any time you let the system generate an ID for the resource, you should use POST.

PUT works in creation when you know the id beforehand.

I have not worked with amazon s3 but I hear they handle this well. You PUT a file to a url containing the file name. The first PUT creates the file on their end, any future PUT updates that file.
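A toy in-memory sketch of that S3-ish contract, just to pin down the semantics (the status codes follow RFC 2616; everything else is invented):

```python
import itertools

STORE = {}                  # url -> body
_ids = itertools.count(1)   # server-side ID generator for POST

def put(url, body):
    """Client names the resource; repeating the call changes nothing."""
    existed = url in STORE
    STORE[url] = body
    return 200 if existed else 201   # 201 Created on the first PUT

def post(collection_url, body):
    """Server names the resource; every call creates something new."""
    url = "%s/%s" % (collection_url, next(_ids))
    STORE[url] = body
    return url
```

Running `put("/files/a.txt", ...)` twice leaves the store in the same state, while two identical `post("/files", ...)` calls create two distinct resources, which is the whole distinction in four lines.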


What's often left out is the reason that that's a useful rule to have: if PUTs are always idempotent, then you can always safely repeat a PUT if you're not sure whether it succeeded or not. This makes it easier to build robust systems.
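That robustness argument can be made concrete with a small sketch; `send` here is a stand-in for a real HTTP transport, not any particular library:

```python
def put_with_retry(send, url, body, attempts=3):
    """Retry a PUT that may have timed out.

    Safe precisely because PUT is idempotent: replaying a request that
    actually succeeded the first time cannot change the outcome.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return send("PUT", url, body)
        except TimeoutError as exc:
            last_error = exc   # unsure whether it landed; just try again
    raise last_error
```

The same wrapper around POST would be wrong: a timed-out POST that actually succeeded would create a duplicate resource on retry.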


What I like about using GET (and, to some extent, POST) is that it makes it trivial to test your API in a browser while developing it. Just paste in the URL and you get the raw response, fetch time analysis, and everything else the Chrome developer tools provide you.

This shouldn't matter much, but it does.


XHR Poster[0] sounds like an extension you'll enjoy. It provides a much more comprehensive testing interface.

[0] http://goo.gl/UFSdZ (links to Chrome extension store)


I'd add that one of the best features of XHR Poster is that it uses your cookies and any authentication you've already performed in Chrome when sending the requests.

I had to build an API that lived inside a system with entirely too many authentication/cookie checks and using curl was becoming too awkward for testing, this saved me a lot of time.


Thank you so much for the pointer! I've installed it, and I imagine I'll be using it quite a bit in the coming weeks at $work.


Where is the Twitter API held up as "exemplary REST"? I've usually seen Twilio's API as being the de-facto REST API. Not perfect, but pretty close.


> I don't really understand the obsession with using HTTP methods.

If I could break it down to the simplest form:

GET requests can be cached, any other method should be processed.
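A minimal sketch of that rule, with an invented cache layer sitting in front of an origin function:

```python
def make_cached_fetch(origin):
    """Wrap `origin` (a stand-in for a real HTTP call) with a GET cache."""
    cache = {}
    def fetch(method, url):
        if method == "GET":
            if url not in cache:
                cache[url] = origin(method, url)  # only GETs are cacheable
            return cache[url]
        cache.pop(url, None)     # any other method invalidates the copy
        return origin(method, url)
    return fetch
```

This is the payoff of respecting method semantics: intermediaries you don't control (browser caches, CDNs, reverse proxies) can apply exactly this logic, which they can't do if state changes hide behind GET.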


I'm not sure about the rest of the stuff, but the idea that you should return links instead of IDs strikes me as horrible. Your REST API should define a way to ID the objects it deals with, and a way to access information about those objects. An example of such information would be an API to return the URL for a photo given its ID (expecting the client to construct that URL is certainly brittle).

My personal best practice would be to build batching facilities for your APIs. Batching is incredibly important when your application data grows.


No, no.

The point is that REST uses Hypertext as the Engine of Application State. You might have heard of this as HATEOAS.

Ideally, your API is discoverable given the initial URL. If you return IDs, I have to construct URLs based on information I already have. If you return URLs, I just have to make an HTTP request.
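One small server-side sketch of that idea: keep IDs internal and translate them into dereferenceable URLs at the boundary (the URL layout and field names are made up):

```python
def to_representation(book, base="https://api.example.com"):
    """Render an internal record as an API representation with links."""
    return {
        "title": book["title"],
        # The client never sees the raw author ID, only a URL it can GET.
        "author": "%s/authors/%d" % (base, book["author_id"]),
    }
```

Only the server knows the mapping from IDs to URLs, so it stays free to change its URL scheme without every client's hardcoded URL-construction logic breaking.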


> The idea that you should return links instead of IDs strikes me as horrible.

I've not thought much about this but Roy Fielding disagrees [1].

[1] http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hyperte...


In REST there is no need for IDs at all. There are only URLs; the URL of an object is the only way to refer to it.


I almost completely agree, but a problem I often encounter in REST API design is: how to do searches?

Let's say we have an "Event" Resource and it belongs to a "Location" Resource and the Event belongs to "Category" Resources. If you want to provide an endpoint to search events with a GET requests and filter based on Location and Categories (which are identified by URLs) you can end up with really long query strings.

/events?location=URL1&categories[]=URL2&categories[]=URL3&categories[]=URL4&...

This is not only ugly, it can bring up problems on both clients and servers.

Any suggestions on how to handle this?


Why use GET instead of POST? Just POST a document with the parameters to filter, and the server creates a new resource with the search results and returns its URL to the client, which can then GET it.
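A toy sketch of that flow, with an in-memory store standing in for the server (all names and data are invented):

```python
import itertools

SEARCHES = {}                       # url -> cached result list
_search_ids = itertools.count(1)

EVENTS = [
    {"name": "concert", "location": "/loc/1", "category": "/cat/music"},
    {"name": "match", "location": "/loc/2", "category": "/cat/sport"},
]

def post_search(filters):
    """POST the filter document; the search results become a resource."""
    results = [e for e in EVENTS
               if all(e.get(k) == v for k, v in filters.items())]
    url = "/searches/%d" % next(_search_ids)
    SEARCHES[url] = results
    return url                       # client follows up with a GET

def get(url):
    return SEARCHES[url]
```

The arbitrarily large filter document travels in the POST body instead of the query string, and the result gets a stable URL that can be re-fetched, shared, or cached.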


There's really no need to create a separate URL; it's reasonable for POST requests to contain response bodies.


I don't think there's anything particularly unRESTful about using non-URL ids in that case. The critical thing is that the client isn't using out-of-band knowledge (i.e. what the developer read in the API documentation and hardcoded into the client) to construct the query.


URLs, and links. And, more precisely, a URL is the only way to refer to a resource. There may be multiple URLs pointing to the same resource.

Aside from what appeared to be a fundamental misunderstanding on the GP's part, there is the legitimate concern that an individual resource shouldn't need to know its own URL. In the article, that was just glossed over, though.

One could, for example, have a book by ID 1 trickle the ID up the application call path, and at a higher level, a complete URL is generated. In fact, the application could simply be a module inside yet another one that wants to add its own ‘namespace’.

What the URL is and how it's generated doesn't matter. All that does is that the URL is made available (discoverable), and that it goes to the right resource, however that happens.


I'm not sure quite where you're going. You agree that having a client construct its own URLs from IDs is brittle. But then you propose an API to turn photo IDs into URLs—but that service itself lives at a particular URL, and is equally subject to change!

Can you explain a bit more? I'm just surprised because returning URLs (not IDs) is a fundamental idea of proper REST.


If the API entry point is subject to change then the very ability to access the service is in trouble, isn't it?

In particular, my doubts about IDs vs URLs is the ability to store those pieces of data in the client (or intermediate caches) - if the service URLs or namespaces change, the client-side URLs become useless even after the client updates the service entry point. With IDs, that information would still be usable.

I'm absolutely no expert on this so I'm probably in the "HTTP API, not REST API" side of the problem.


Migrating anything will always be hard; sometimes services even migrate IDs (to something that is easier to shard intelligently against). As others mentioned you could put redirects everywhere, but most services just version their REST APIs and then write glue to tie old interfaces to newer data models; see for example Twitter, Amazon AWS, and Instagram.

One other semantic nitpick is that a URL is a subset of URI, which stands for Uniform Resource Identifier, so in some sense you could make a philosophical argument that URIs and IDs are/should be the same thing. If you were really crazy you could store URLs in your DB (obviously not a good idea for many other reasons).


Good point. I think the best way to handle URL changes is to use HTTP status 301 (Moved Permanently) for the old URLs. This can be applied consistently to APIs and webpages. This allows discoverable resources as well as re-discoverable URLs.
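Sketched minimally, with an invented redirect table (the 301 status code is real; the rest is illustrative):

```python
# Map of retired URLs to their new homes after a reorganisation.
MOVED = {"/v1/books/7": "/v2/books/7"}

def handle(url):
    """Return (status, headers/body) for a request, honouring old URLs."""
    if url in MOVED:
        # 301 Moved Permanently: clients and caches may update their links.
        return 301, {"Location": MOVED[url]}
    return 200, {"body": "resource at %s" % url}
```

Any HTTP client that follows redirects keeps working unchanged, and well-behaved clients can rewrite their stored links on seeing the 301.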


This is how a good API would handle the change. Deprecate the first URL, do a redirect to the new one for a period of time, then remove the original endpoints.

Same concept as deprecating code. You don't wipe it out straight away because you have no idea what else might rely on it or how fast they are to react (within reason).

As for the arguments about not using HTTP request methods - they're standard, and are implemented uniformly everywhere (for the best part). Your API isn't standard, but if all APIs use HTTP and its request methods correctly, you only need to learn the general HTTP spec to generally understand all APIs.

Resources and URL structure afterwards is a matter of opinion. What may seem RESTful to me might not be to someone else.


> In particular, my doubts about IDs vs URLs is the ability to store those pieces of data in the client (or intermediate caches) - if the service URLs or namespaces change, the client-side URLs become useless even after the client updates the service entry point.

Which is OK, because the client can re-traverse the service to get back to those resources which are now invalid.


Another worst practice is referring to JSON as the "REST file format" as ESRI's products did two years ago when I last worked with them.


The thing I wonder about with stuff from jacobian.org is whether the guy can actually write ANY code. My gut feeling is he's a guy that has very good political skills and somehow used this as a way to mind control geeks.

Basically, I can't understand why anything he writes is somehow hackernews worthy...is he self-promoting?


> The thing I wonder about with stuff from jacobian.org is whether the guy can actually write ANY code. My gut feeling is he's a guy that has very good political skills and somehow used this as a way to mind control geeks.

So you have no idea who he is, apparently did not look for any information (such as his GitHub page or his being one of Django's lead devs), and decided he was all about "political skills". And somehow failed to realize the article linked above is 3 years old.

> Basically, I can't understand why anything he writes is somehow hackernews worthy...

But that does not prevent you from judging him as "a politician who can't write code". Nice.

> is he self-promoting?

He "self-promotes" 3-year-old articles by not even posting them himself? What sense does that make?

Here's an idea: maybe he wrote good content, and he's not too bad a writer. But you know, if you want to insult him you could at least do it to his face, he's a member on HN after all: http://news.ycombinator.com/user?id=jacobian


is it possible that being a Django "lead dev" is a political position? I noticed he is a literature major at UC Santa Cruz, a school that doesn't give actual letter grades.

Maybe he is the ultimate "trick baby"....or is Django something we cannot question on hacker news. I think this dude is full of shit myself.


And you say that hiding behind a throwaway id... very brave of you.

Instead of insulting people, try to learn to properly discuss ideas. Learn to point out failed reasoning and to defend your own concepts with sound reasoning or facts. Once you start, it's not that hard and it makes the discussion so much more useful. We have limited time. We shouldn't waste it.


For what it's worth and though I shamefully fed it, this seems to be a troll account created solely to insult a few people of the python community: only three comments insulting Jacob here and Jesse Noller in an other thread. I'm pretty sure the guy is not looking for discussion, he should just be flagged and ignored.


Thanks, masklinn, for the nice detective work.



