User Authentication and Authorization Information
User authentication (authn) is how we know that a user is who they claim to be. At a minimum it’s a username and password, but it could include much more if two-factor authentication is used.
User authorization (authz) is what we allow the user to do.
These are very different questions and should be treated as such. Some architectures enforce the split: e.g., if a site uses SiteMinder or a similar tool, the webapp doesn’t have access to authn information at all; it can only add authz.
What is user authn/authz information? It is
- username and/or email
- password
- single sign-on (SSO) identifications
- security tokens (for two-factor authentication)
- security images/phrases (used to prove your site is legitimate to the user)
- groups and roles
It is not
- contact information
- content subscriptions
- or anything else that’s not required to authenticate or authorize the user.
Wrong Approach
Put everything – user authn/authz, static content and dynamic content – into a single database schema.
It’s quick.
It’s easy.
It’s the default behavior for auto-generation tools.
And it’s very, very wrong, since anyone who cracks your webapp has also cracked your user authn/authz data. At best you’ll have a denial-of-service. At worst they can pretend to be other users, add their own highly privileged account, etc.
Separate Schemas and Connection Pools
The quickest solution is to create a separate schema for the user authn/authz data and use a dedicated data source (or Hibernate session) when accessing this data. This schema should be unreadable from the standard data source (or Hibernate session). This gives you a good firewall from the world but isn’t perfect.
A seemingly more robust solution is to use a separate database, not just a separate schema, for the user authn/authz data. This would seem to protect you from misconfigured rights that would allow the dynamic content data source to access the user data source.
Sadly, in some RDBMSs there’s no clear distinction between schemas and databases, and a connection to one “database” can still access another “database” if the necessary rights are granted. You can’t be sure unless you have separate database instances for user authn/authz and dynamic content. This may not be an undue burden if your architecture has a server dedicated to user authn/authz, which is not unreasonable with virtual servers or a cloud design.
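One way to catch the most obvious misconfiguration is a startup check that refuses to run if the authn/authz pool and the dynamic content pool point at the same place. The following is only a minimal sketch under the separate-instance assumption. The class names, JDBC URLs and account names are all hypothetical, and it compares configuration only. It cannot tell you what rights have actually been granted inside the database.

```java
import java.util.Objects;

/** Connection settings for one pool. All values used with it here are hypothetical. */
final class PoolSettings {
    final String jdbcUrl;
    final String dbUser;

    PoolSettings(String jdbcUrl, String dbUser) {
        this.jdbcUrl = Objects.requireNonNull(jdbcUrl);
        this.dbUser = Objects.requireNonNull(dbUser);
    }
}

final class PoolSeparationCheck {
    /** Returns true only if the two pools differ in both JDBC URL and database account. */
    static boolean isSeparated(PoolSettings authPool, PoolSettings contentPool) {
        // Case-insensitive on purpose: treating near-identical values as "the same" is the safe error.
        boolean differentUrl = !authPool.jdbcUrl.equalsIgnoreCase(contentPool.jdbcUrl);
        boolean differentUser = !authPool.dbUser.equalsIgnoreCase(contentPool.dbUser);
        return differentUrl && differentUser;
    }

    /** Intended to be called once at startup, before either pool is created. */
    static void requireSeparated(PoolSettings authPool, PoolSettings contentPool) {
        if (!isSeparated(authPool, contentPool)) {
            throw new IllegalStateException(
                "authn/authz pool and content pool must not share a JDBC URL or database account");
        }
    }

    public static void main(String[] args) {
        // Assumed example values; substitute your own configuration source.
        PoolSettings auth = new PoolSettings("jdbc:postgresql://auth-db.internal/users", "authn_reader");
        PoolSettings content = new PoolSettings("jdbc:postgresql://content-db.internal/site", "webapp");
        requireSeparated(auth, content);
        System.out.println("pools are separated");
    }
}
```

A check like this only guards against copy-and-paste configuration mistakes. The real protection still comes from the database rights themselves.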
Container-Based Authentication
A better solution is container-based authentication. Pull user authn/authz entirely out of the webapp – by the time your webapp gets the request the HttpServletRequest already has all necessary information populated. Your webapp has no access to the container’s authentication information. (Modulo the notes above – you don’t gain anything if the container looks into the same schema as your dynamic content.)
A variant of this is authentication filters put in front of the webapp, e.g., those from Spring Security. It’s a different mechanism but serves the same purpose of keeping a very sharp distinction between user data and dynamic content.
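From the webapp’s side, container-based authentication might look roughly like the sketch below. To keep it compilable without the servlet API, the ContainerRequest interface is a stand-in that mirrors two real HttpServletRequest methods, getRemoteUser() and isUserInRole(String). The "editor" role name and the ReportAccess class are made up for illustration.

```java
import java.util.Set;

/**
 * Stand-in for the two HttpServletRequest methods used here, so the sketch
 * compiles without the servlet API. A real webapp would take HttpServletRequest.
 */
interface ContainerRequest {
    String getRemoteUser();             // null when the container has not authenticated the caller
    boolean isUserInRole(String role);  // answered by the container, not by the webapp
}

final class ReportAccess {
    /** Hypothetical role name; the real one comes from the container's configuration. */
    static final String EDITOR_ROLE = "editor";

    /** Decides access using only what the container already put on the request. */
    static String describe(ContainerRequest request) {
        String user = request.getRemoteUser();
        if (user == null) {
            return "anonymous: denied";
        }
        return request.isUserInRole(EDITOR_ROLE) ? user + ": may edit" : user + ": read-only";
    }

    /** Test double that plays the container's part. */
    static ContainerRequest fakeRequest(String user, Set<String> roles) {
        return new ContainerRequest() {
            @Override public String getRemoteUser() { return user; }
            @Override public boolean isUserInRole(String role) { return roles.contains(role); }
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(fakeRequest("alice", Set.of("editor")))); // alice: may edit
        System.out.println(describe(fakeRequest("bob", Set.of("viewer"))));   // bob: read-only
        System.out.println(describe(fakeRequest(null, Set.of())));            // anonymous: denied
    }
}
```

Note that nothing in ReportAccess touches a password, a token or a user table. The webapp only asks questions the container has already answered.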
The Glitch – Adding and Updating Users
There’s one big glitch here – how do you add or update user information if your webapp can’t access the user authn/authz tables?
The first approach is to create a separate webapp that handles this. Your main webapp can transparently redirect to the second webapp as necessary. The upside is that you can have a consistent look and feel; the downside is that you’re exposing user authn/authz information to the web again.
The second approach is to create a separate REST service that handles this. Your webapp can provide the user interface but call the REST service instead of the standard business layer. The REST service can be within your firewall.
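As a rough illustration of this second approach, the webapp could build a request for the internal service with the JDK’s java.net.http API (Java 11 or later). The endpoint URL, the JSON field names and the UserAdminClient class are assumptions made for this sketch, not part of any real service.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class UserAdminClient {
    /** Hypothetical internal endpoint, assumed reachable only from inside the firewall. */
    static final URI USERS_ENDPOINT = URI.create("https://user-admin.internal.example/users");

    /** Builds (but does not send) a POST asking the user service to create an account. */
    static HttpRequest buildCreateUserRequest(String username, String email) {
        String json = "{\"username\":\"" + escape(username) + "\",\"email\":\"" + escape(email) + "\"}";
        return HttpRequest.newBuilder(USERS_ENDPOINT)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json, StandardCharsets.UTF_8))
                .build();
    }

    /** Minimal escaping of backslashes and quotes only; a real client should use a JSON library. */
    static String escape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        HttpRequest request = buildCreateUserRequest("alice", "alice@example.com");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending the request would be a call such as HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). The point of the design is that the webapp never holds database credentials for the user tables, only the ability to call the service.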
The third approach is to defer this entirely to the container. This ensures maximum separation but makes it difficult to have a consistent look and feel.