HTTP Authentication may be RESTful, but it's not very USEful.
Whenever I talk to a REST enthusiast, they tell me I should be using HTTP authentication, not forms with cookies or URL-rewriting, for user authentication. (REST is an architectural style for distributed systems, and a popular way to think about the web.) Most web frameworks, including Java servlets, use one or both of these approaches to make it easy to identify a server-side session object for a given request. A RESTful application, however, would model all server-side state as resources accessible via URIs, so there's no need for that session object. URL-rewriting is particularly anti-REST, because a single resource ends up with many URIs, one for each session ID. Both approaches, especially when combined with server-side session data, tend to erase many of the scalability benefits of HTTP 1.1's caching mechanisms.
Nevertheless, HTTP authentication is not widely used. The two approaches currently defined in the specifications are HTTP "Basic" and "Digest" authentication. Basic is widely supported, but because it effectively transmits the username and password in the clear, it is usually appropriate only over HTTPS. Digest never sends the password in the clear, but is apparently not implemented consistently in clients. In his recent XML.com article, httplib2: HTTP Persistence and Authentication, Joe Gregorio describes the interoperability issues with Digest in more detail, and concludes:
The bad news is that current state of security with HTTP is bad. The best interoperable solution is Basic over HTTPS.
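To make "effectively in the clear" concrete, here is a minimal Python sketch of how a client builds a Basic Authorization header and how anyone who sees that header can reverse it. The function names are my own, and the credentials are the sample pair from the HTTP authentication specification, not anything from Artima.

```python
import base64


def basic_authorization_header(username, password):
    """Build the value of an Authorization header for HTTP Basic."""
    raw = f"{username}:{password}".encode("utf-8")
    # Base64 is an encoding, not encryption: it hides nothing.
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_authorization(header_value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


header = basic_authorization_header("Aladdin", "open sesame")
print(header)  # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(decode_basic_authorization(header))
```

Because the decoding step needs no secret, anyone who can observe the request can read the password, which is why Basic is usually paired with HTTPS.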
With HTTP authentication, the client (such as a web browser) presents its own authentication dialog to the user when the server prompts for credentials, which happens after an initial request for a resource in a protected realm. (A realm names a protected area of the server, typically a URI such as "/admin" and all resources under it, such as "/admin/moderation" and "/admin/users".) The user enters their username and password into the dialog, thereby authenticating themselves to the client for the requested realm. The client then uses those credentials to authenticate with the server for that realm from then on, until the client exits. (That is, there is usually no way to log out of a client authentication except by ending the client process.)
Being a budding REST enthusiast myself, I investigated HTTP authentication, but discovered several usability concerns and a few security concerns:
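The exchange described above can be sketched from the server's side. This is a simplified illustration, not a real server: the realm name, the user store, and the handle function are all hypothetical, and passwords are kept in plaintext only to keep the sketch short.

```python
import base64
import hmac

PROTECTED_PREFIX = "/admin"   # the protected area from the example above
REALM = "Artima Admin"        # hypothetical realm name; browsers show it in the dialog
USERS = {"bill": "s3cret"}    # hypothetical user store, plaintext for brevity only


def check_basic(header_value):
    """Return the username if the Basic credentials are valid, else None."""
    if not header_value or not header_value.lower().startswith("basic "):
        return None
    try:
        decoded = base64.b64decode(header_value[6:].strip()).decode("utf-8")
    except ValueError:  # covers malformed Base64 and invalid UTF-8
        return None
    username, sep, password = decoded.partition(":")
    expected = USERS.get(username)
    if not sep or expected is None:
        return None
    if hmac.compare_digest(password.encode("utf-8"), expected.encode("utf-8")):
        return username
    return None


def handle(path, authorization=None):
    """Return (status, headers, body) for a request to path."""
    protected = path == PROTECTED_PREFIX or path.startswith(PROTECTED_PREFIX + "/")
    if not protected:
        return 200, {}, "Public page"
    user = check_basic(authorization)
    if user is None:
        # The 401 challenge is what makes the client pop up its own dialog.
        return 401, {"WWW-Authenticate": f'Basic realm="{REALM}"'}, "Authentication required"
    return 200, {}, f"Hello, {user}"


# First request has no credentials and is challenged; the retry carries them.
print(handle("/admin/users"))
token = base64.b64encode(b"bill:s3cret").decode("ascii")
print(handle("/admin/users", "Basic " + token))
```

Note that the only text the server controls in this exchange is the realm string; everything else about the prompt belongs to the client.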
1. Username prompt may confuse the user
With HTTP authentication, the client takes the credentials, usually a username and password. This means that the application has no control over what prompt is given for the username. At Artima, we are planning to have a network of sites with single sign-on, and a while back I took to calling the username "Artima ID" on our sign-in page (just as Yahoo calls their login a "Yahoo ID"), to make it obvious what you're logging into. The C++ Source will someday be its own website, but to sign in you will need to use your Artima ID. You won't have a separate ID for The C++ Source. My first concern is that, because I can't put "Artima ID" next to the empty username box with HTTP authentication, people will be confused about what to type there.
Moreover, email addresses are now unique in our database, so if you forget your Artima ID, I'd like to let you use any of the email addresses you attached to your account instead of your Artima ID. And in fact, I may have registration paths where you don't select an Artima ID, so you may not have an ID even if you have an account. So I may want to say "Artima ID or email address:" next to that empty box, or at least add a note explaining that if you've forgotten your ID, you can use your email. If people are authenticating via the client's authentication dialog, I believe it would be harder for me to explain that.
2. Not obvious what to do if you forgot your password
If you have forgotten your password, our sign-in page has a "Forgot your password" link that leads to a way to deal with that problem. It is not as easy to make that option clear using the browser's HTTP authentication dialog.
3. Not obvious what to do if you don't have an account
If you don't have an account yet, our existing sign-in page has a "Sign Up" link to a path that allows you to create a new account. It is not as easy to make that path obvious when using the client's HTTP authentication dialog.
4. No way to do optional authentication
With HTTP authentication, I can force the user to sign in before doing things that require being signed in, like posting to a forum or administration. However, I also want to be able to recognize via the request whether or not the user is signed in. If not, I don't want to force them to sign in, and instead will return a representation for guests. If they are signed in, I want to return a personalized representation of the resource. As far as I can tell, HTTP authentication gives me no way to do that: the only way for the server to ask for credentials is to challenge for them, which forces the sign-in dialog on guests.
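For contrast, here is a rough sketch of the behavior I'm after, using a cookie as the authentication token. The cookie name "auth" and the token table are hypothetical; the point is only that the server can inspect whatever the request carries and choose a representation without ever challenging the user.

```python
from http.cookies import SimpleCookie

# Hypothetical table mapping authentication tokens to usernames.
TOKENS = {"abc123": "bill"}


def representation_for(cookie_header=None):
    """Return a guest or personalized body, never a sign-in challenge."""
    cookies = SimpleCookie()
    if cookie_header:
        cookies.load(cookie_header)
    morsel = cookies.get("auth")  # hypothetical cookie name
    user = TOKENS.get(morsel.value) if morsel is not None else None
    if user is None:
        return "Welcome, guest"
    return f"Welcome back, {user}"


print(representation_for())                           # guest view, no prompt
print(representation_for("auth=abc123; theme=dark"))  # personalized view
```

An unknown or missing token simply falls through to the guest representation, which is exactly the case a challenge-based scheme cannot express.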
5. Difficult to do single sign-on
With cookies or URL-rewriting, I can quite easily enable users to sign onto the entire network of artima.com subdomains in one shot. With HTTP Digest authentication, I can explicitly list the domains that are included in the realm, but I can't do that with Basic. However, listing each subdomain individually with Digest is a scalability concern. It works fine for half a dozen subdomains, but what about 100 or 500?
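To give a feel for the scaling problem, here is a sketch of building a Digest challenge whose domain parameter lists every covered site as a space-separated URI. The subdomain names are made up for illustration, and the other parameters are kept to a bare minimum; a real challenge would need careful nonce handling.

```python
import secrets


def digest_challenge(realm, domains, nonce=None):
    """Build a WWW-Authenticate value for Digest with an explicit domain list."""
    nonce = nonce or secrets.token_hex(16)
    domain_list = " ".join(domains)  # URIs are separated by spaces inside the quotes
    return (
        f'Digest realm="{realm}", domain="{domain_list}", '
        f'nonce="{nonce}", qop="auth", algorithm=MD5'
    )


# Hypothetical subdomains; every one must be spelled out in every challenge.
few = [f"http://site{i}.artima.com/" for i in range(6)]
many = [f"http://site{i}.artima.com/" for i in range(500)]

print(digest_challenge("Artima", few, nonce="fixed-for-demo"))
print(len(digest_challenge("Artima", few)), "characters for 6 subdomains")
print(len(digest_challenge("Artima", many)), "characters for 500 subdomains")
```

The header grows linearly with the number of subdomains, and it has to be regenerated whenever a site is added to the network.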
6. No way to log out
Users are accustomed to having a log out button that enables them to log out of a site before leaving a public terminal. The HTTP authentication protocols provide no way for the server to request that the client erase the credentials for a realm other than prompting for them again, which causes the browser to pop up the dialog again. Today's browsers do not themselves provide a way to log out of a realm, other than quitting the browser.
HTTP authentication workarounds
Despite the presence of potential workarounds, I find HTTP authentication in its current state to be too much trouble to use. I am still planning on using a cookie as an authentication token in our new architecture, or if cookies aren't enabled, URL rewriting. I'm curious if others have any success stories, or the opposite, to share about using HTTP authentication in practice.
I think HTTP auth, at least like it is implemented these days, is not good for most things. We use Basic HTTP Auth (with Apache) internally, to access a Subversion repository. There's no "sign up" page, and we don't want anyone with no username to do anything. In general, HTTP Auth is good if the application developer doesn't want to invest the time for developing a custom auth mechanism (or when you just want to password-protect an existing application).
However, I don't see the problem with cookies. Persistent cookies may be a privacy issue, but session cookies are harmless AFAIK, and I see no reason to block them(*).
One more approach that you didn't cover is the one used by default by ASP.NET applications. They use POST rather than GET, and the session object is embedded in the response sent to the user - and uploaded back whenever a user does something. I don't like it, but it's something.
> I think HTTP auth, at least like it is implemented these days, is not good for most things. We use Basic HTTP Auth (with Apache) internally, to access a Subversion repository. There's no "sign up" page, and we don't want anyone with no username to do anything.
> In general, HTTP Auth is good if the application developer doesn't want to invest the time for developing a custom auth mechanism (or when you just want to password-protect an existing application).
I was at one point considering using Basic HTTP auth over HTTPS for making admins re-authenticate before getting to the /admin realm, for example. But even that may be too much of a pain. If I'm going over HTTPS, then I can just use an admin auth cookie if I want, and it's easier. If I could depend on HTTP Digest auth, that would be different. I haven't tried to see how well it is implemented everywhere, but I see reports of problems, and I don't need to spend time testing cookies.
Another way that occurred to me to secure scary functions, like deleting an entire site's database, would be to send the admin through a link I send in an email to activate the scary admin roles. Or confirm scary things by email before actually doing them. Then even if someone steals their Artima password, they still can't do the really scary things unless they can also get hold of their email. But there aren't that many scary things, and I think I'm going to err on the side of ease of use for the most part. If you have the password, you will be able to do most things without needing to reauthenticate.
> However, I don't see the problem with cookies. Persistent cookies may be a privacy issue, but session cookies are harmless AFAIK, and I see no reason to block them(*).
Here's the relevant portion from Roy Fielding's dissertation:
An example of where an inappropriate extension has been made to the protocol to support features that contradict the desired properties of the generic interface is the introduction of site-wide state information in the form of HTTP cookies. Cookie interaction fails to match REST's model of application state, often resulting in confusion for the typical browser application.
An HTTP cookie is opaque data that can be assigned by the origin server to a user agent by including it within a Set-Cookie response header field, with the intention being that the user agent should include the same cookie on all future requests to that server until it is replaced or expires. Such cookies typically contain an array of user-specific configuration choices, or a token to be matched against the server's database on future requests. The problem is that a cookie is defined as being attached to any future requests for a given set of resource identifiers, usually encompassing an entire site, rather than being associated with the particular application state (the set of currently rendered representations) on the browser. When the browser's history functionality (the "Back" button) is subsequently used to back-up to a view prior to that reflected by the cookie, the browser's application state no longer matches the stored state represented within the cookie. Therefore, the next request sent to the same server will contain a cookie that misrepresents the current application context, leading to confusion on both sides.
Cookies also violate REST because they allow data to be passed without sufficiently identifying its semantics, thus becoming a concern for both security and privacy. The combination of cookies with the Referer [sic] header field makes it possible to track a user as they browse between sites.
Is there a URL where this approach is described?
Anyway, this sounds even worse than URL rewriting to me on first blush.
> Is there a URL where this approach is described?
I'm not a web developer myself, so I don't know where the official documentation is. However, try searching Google for "doPostBack" or "__doPostBack", and you'll see plenty of references to it.
> Anyway, this sounds even worse than URL rewriting to me on first blush.
It is. What's really bad is this encrypted state information can get really big - even as big as a few KBs. Again, I'm not a web developer, but it appears to me that this is just the default behavior that allows applications to Just Work (TM), and it should be tweaked by the developer.
If you are using Apache, you can use something like mod_perl to 'hijack' the authentication mechanism. You can use a custom auth handler to do things like check the URI, the client IP, verify username/password (you could use this for your 'id or email' thing) or any number of other factors. Your handler will then either return 'ok' or 'auth required.' Similarly, you can hijack the access mechanism. A custom access handler would let you (for example) deny access to an admin URI without popping up a new login box.
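As a rough illustration of the logic such handlers might contain, here is a sketch in Python rather than mod_perl. Nothing here is Apache's or mod_perl's actual API: the handler names, return values, account data, and admin network are all hypothetical, and a real handler would compare password hashes instead of plaintext.

```python
import hmac
import ipaddress

# Hypothetical data for illustration only.
PASSWORDS = {"bvenners": "s3cret"}              # account ID -> password
EMAIL_TO_ID = {"bill@example.com": "bvenners"}  # email -> account ID
ADMIN_NETWORK = ipaddress.ip_network("10.0.0.0/8")

OK = "ok"
AUTH_REQUIRED = "auth required"
FORBIDDEN = "forbidden"


def auth_handler(login, password):
    """Accept either an account ID or an email address in the username box."""
    account_id = EMAIL_TO_ID.get(login.lower(), login)
    expected = PASSWORDS.get(account_id)
    if expected is not None and hmac.compare_digest(
        password.encode("utf-8"), expected.encode("utf-8")
    ):
        return OK
    return AUTH_REQUIRED


def access_handler(uri, client_ip):
    """Deny admin URIs from outside the admin network without prompting."""
    is_admin_uri = uri == "/admin" or uri.startswith("/admin/")
    if is_admin_uri and ipaddress.ip_address(client_ip) not in ADMIN_NETWORK:
        return FORBIDDEN
    return OK


print(auth_handler("bill@example.com", "s3cret"))
print(auth_handler("bvenners", "wrong"))
print(access_handler("/admin/users", "203.0.113.9"))
```

This addresses the "ID or email" matching on the server, though the label next to the username box is still whatever the client's dialog shows.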
I think there are some very simple solutions to most of the pain points described in this article.
For example, you can have a welcome page that does not require authentication. The welcome page can have links to sign up, reset password, and login. The login link can point to a page that requires basic HTTP auth. Once the user logs in, the rest of the pages he traverses to can all be covered by HTTP basic auth.
HTTP Auth does allow for providing a hint to the user as to what credentials are needed. If you are using Apache httpd, you can use the AuthName config param to say, "Artima Login".