Proactive Database Defenses Using Triggers
Bear Giles | January 15, 2017

I’m sure I’ve discussed this a number of years ago but a question came up after the recent Boulder Linux User Group meeting and I decided this would be a good time to revisit it.
The question is: how do you protect sensitive information from illicit insertion or modification when the attacker has full SQL access as the website user?
Important: I am focused on approaches we can use in the database itself, not our application, since the former will protect our data even if an attacker has full access to the database. These approaches are invisible to our database frameworks, e.g., JPA, once we have created the tables.
An Approach Without Triggers
At a minimum we can ensure that the database was properly configured with multiple users:
app_owner – owns the schema and tables. Often does not have INSERT/UPDATE/DELETE (or even SELECT) privileges on the tables.
app_user – owns the data but cannot modify the schema, tables, etc.
We can make this much more secure by splitting app_user into two users, app_reader and app_writer. The former has only SELECT privileges on the tables and is the only account used by user-facing code. The app_writer user adds INSERT/UPDATE/DELETE privileges and is used only by the methods that actually need to modify the data. Data is typically read so much more often than it is written that it often makes sense to view an application as two (or more) separate but related applications. In fact they may be – you can improve security by handling all data manipulation via microservices visible only to the application.
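The privilege split above can be sketched in plain SQL. This is a minimal sketch assuming PostgreSQL; the role names follow the article, but the schema name (app) and the use of default privileges are illustrative assumptions.

```sql
-- Minimal sketch of the app_owner / app_reader / app_writer split,
-- assuming PostgreSQL. Schema name "app" is illustrative.
CREATE ROLE app_owner LOGIN;    -- owns the schema and tables
CREATE ROLE app_reader LOGIN;   -- user-facing code: read only
CREATE ROLE app_writer LOGIN;   -- data-modifying services only

GRANT USAGE ON SCHEMA app TO app_reader, app_writer;
GRANT SELECT ON ALL TABLES IN SCHEMA app TO app_reader;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA app TO app_writer;

-- ensure the grants also apply to tables app_owner creates later
ALTER DEFAULT PRIVILEGES FOR ROLE app_owner IN SCHEMA app
    GRANT SELECT ON TABLES TO app_reader;
ALTER DEFAULT PRIVILEGES FOR ROLE app_owner IN SCHEMA app
    GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO app_writer;
```

The ALTER DEFAULT PRIVILEGES statements matter in practice: without them, every schema migration that adds a table silently leaves the new table unreadable by the application accounts.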
There is a big downside to this: modern database frameworks, e.g., JPA or Hibernate, make heavy use of caching to improve performance. You need to ensure that the app_reader side’s cache is properly updated whenever the corresponding record(s) are modified through the app_writer account.
Security Defense
This is highly database-specific: does the database maintain logs showing when a user attempts a non-permitted action? If so, you can watch the logs on the app_reader account. Any attempt to insert or update data is a strong indication of an attacker.
Triggers Based On Related Information
A 3NF (or higher) database requires that each column be independent. In practice we often perform partial denormalization for performance reasons, e.g., adding a column for the day of the week in addition to the full date. We can easily compute the former from the latter, but doing so takes time and the computed value can’t be indexed.
There’s a risk that a bug or intruder will introduce inconsistencies. One common solution is to use an INSERT OR UPDATE trigger that calculates the value at the time the data is inserted into the database. E.g.,
CREATE FUNCTION calculate_day_of_week() ....

CREATE TABLE date_with_dow (
    date text,
    dow text
);

CREATE FUNCTION set_day_of_week() RETURNS trigger AS $$
BEGIN
    -- NEW holds the incoming row; OLD is not defined on INSERT
    NEW.dow = calculate_day_of_week(NEW.date);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER set_day_of_week BEFORE INSERT OR UPDATE ON date_with_dow
FOR EACH ROW EXECUTE PROCEDURE set_day_of_week();
This ensures that the day of week is properly set. A software bug, or an attacker, can try specifying an invalid value, but the trigger will simply overwrite it.
Of course we don’t really care (much) if the day of the week is incorrect. There are other times when we care a great deal, e.g., cached attributes from digital certificates. If someone can insert a certificate with mismatched cached values, especially if they can replace an existing table entry, they can do a lot of damage unless the code assumes the database could be corrupted and performs its own validation checks on everything it reads back. (First rule of security: never trust anything.) Even with such checks we’ll only know that the data has been corrupted, not when and not how broadly.
Security Defense
Developers are information packrats. Can we learn anything from the provided day of week value?
Yes. It’s a huge red flag if the provided value doesn’t match the calculated value (modulo planned exceptions, e.g., passing null or a sentinel value to indicate that the application is deferring to the database). It’s easy to add a quick test:
CREATE FUNCTION calculate_day_of_week() ....

-- user-defined function that can do anything from adding an entry into
-- a table to sending out an email, SMS, etc., alert
CREATE FUNCTION security_alert() ....

CREATE TABLE date_with_dow (
    date text,
    dow text
);

CREATE FUNCTION set_day_of_week() RETURNS trigger AS $$
DECLARE
    calculated_dow text;
BEGIN
    calculated_dow := calculate_day_of_week(NEW.date);
    -- a mismatch (other than a null "defer to the database" value)
    -- is a red flag, so raise the alarm and reject the row
    IF (NEW.dow <> calculated_dow) THEN
        PERFORM security_alert('bad dow value!');
        RETURN null;
    END IF;
    NEW.dow = calculated_dow;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER set_day_of_week BEFORE INSERT OR UPDATE ON date_with_dow
FOR EACH ROW EXECUTE PROCEDURE set_day_of_week();
Sidenote: check your database documentation for more ideas. For instance, many applications use @PrePersist annotations to autofill creationDate and lastUpdateDate. It’s easy to do this via a trigger – and a trigger ensures that the data is updated even if an attacker comes in via SQL injection or direct access. More impressively, you can write audit information to a separate table, perhaps even in a separate schema on which app_user only has INSERT privileges, to prevent an attacker from learning what the system has recorded about them, much less altering or deleting that information.
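The autofill idea can be sketched as a trigger. This is a minimal sketch assuming PostgreSQL; the table and column names are illustrative.

```sql
-- Sketch of database-side autofill for audit columns, assuming PostgreSQL.
-- Table and column names are illustrative.
CREATE TABLE document (
    id bigserial PRIMARY KEY,
    body text,
    creation_date timestamptz,
    last_update_date timestamptz
);

CREATE FUNCTION set_audit_dates() RETURNS trigger AS $$
BEGIN
    IF (TG_OP = 'INSERT') THEN
        NEW.creation_date = now();
    ELSE
        -- never allow the creation date to be rewritten
        NEW.creation_date = OLD.creation_date;
    END IF;
    NEW.last_update_date = now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER set_audit_dates BEFORE INSERT OR UPDATE ON document
FOR EACH ROW EXECUTE PROCEDURE set_audit_dates();
```

Whatever values the application (or an attacker) supplies for these columns are simply discarded by the trigger.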
I’ve written triggers that generate XML representations of the OLD and NEW values and write them to an audit table together with the date, etc. On INSERT the OLD data is null; on DELETE the NEW data is null. Using XML allows a common audit table (the table name is just a field) and makes it easy to add a transaction id, etc.
It is then easy to use a bit of simple XML diff code against the audit table to see exactly what changed and when.
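A generic audit trigger along those lines can be sketched as follows, assuming PostgreSQL. The article used XML; this sketch uses row_to_json instead, since Postgres provides that row conversion natively. The audit_log names are illustrative, and the trigger is attached to the date_with_dow table from the earlier example.

```sql
-- Sketch of a generic audit trigger, assuming PostgreSQL. The article
-- used XML; JSON is shown here because row_to_json() is built in.
CREATE TABLE audit_log (
    id bigserial PRIMARY KEY,
    table_name text,
    operation text,
    old_value json,   -- null on INSERT
    new_value json,   -- null on DELETE
    changed_at timestamptz DEFAULT now()
);

CREATE FUNCTION audit_row() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit_log (table_name, operation, old_value, new_value)
    VALUES (TG_TABLE_NAME, TG_OP,
            CASE WHEN TG_OP = 'INSERT' THEN NULL ELSE row_to_json(OLD) END,
            CASE WHEN TG_OP = 'DELETE' THEN NULL ELSE row_to_json(NEW) END);
    IF (TG_OP = 'DELETE') THEN RETURN OLD; ELSE RETURN NEW; END IF;
END;
$$ LANGUAGE plpgsql;

-- the same function can be attached to any table:
CREATE TRIGGER audit_date_with_dow AFTER INSERT OR UPDATE OR DELETE
ON date_with_dow FOR EACH ROW EXECUTE PROCEDURE audit_row();
```

Because TG_TABLE_NAME and TG_OP are provided by the trigger machinery, one function serves every audited table.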
Triggers Based On Secrets
What about tables where there are no “related” columns? Can we use a trigger to detect an illicit attempt to INSERT or UPDATE a record?
Yes!
In this case we want to add an extra column to the table. It can be anything – the sole purpose is to create a way to pass a validation token to the trigger.
What are validation tokens?
A validation token can be anything you want. A few examples are:
A constant – this is the easiest, and it remains effective as long as you can keep it secret. An example is ’42’. An obvious variant is the sum of several other columns in the table. This value should not be written to the database or it will be exposed to anyone with SELECT privileges.
A time-based value – your webserver and database will have closely synced clocks, so you can use a time-based protocol such as the Time-based One-Time Password (TOTP) algorithm. If both the database and application servers use NTP you can keep the window as small as a few seconds. Just remember to accept one tick on either side when validating the token – NTP keeps the clocks synchronized, but there can still be a very small skew, plus network latency to consider.
Note: TOTP requires a shared secret and is independent of the contents of the INSERT or UPDATE statement.
You can save a time-based value but it is meaningless without a timestamp – and some algorithms can be cracked if you have a series of values and the starting time.
An HMAC value – most people are familiar with standard cryptographic hashes such as MD5 or SHA-1 (both considered broken) or SHA-256. They’re powerful tools – but in part because everyone computes the same value given the same input.
In our case we want an HMAC – a cryptographically strong message digest that also requires a secret key. An attacker without the key cannot generate a valid HMAC, but anyone holding the shared secret can both generate and verify one. An HMAC needs a message to digest, and that message should be intrinsic to the record – for instance a digital certificate, a PDF document, even a hashed password. Don’t use it on the primary key or any value that can be readily reused.
You can freely save an HMAC value.
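The HMAC check can be sketched as a trigger, assuming PostgreSQL with the pgcrypto extension (which provides hmac(data, key, type)). The table, column names, and hard-coded key are illustrative only – real key management is the hard part and is hand-waved here.

```sql
-- Sketch of an HMAC validation-token check, assuming PostgreSQL with
-- pgcrypto. The hard-coded key is a placeholder, not a recommendation.
CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE TABLE certificate (
    id bigserial PRIMARY KEY,
    cert bytea,       -- the intrinsic value we protect
    token bytea       -- HMAC computed by the application
);

CREATE FUNCTION verify_cert_token() RETURNS trigger AS $$
BEGIN
    IF NEW.token IS DISTINCT FROM
            hmac(NEW.cert, 'shared-secret-key'::bytea, 'sha256') THEN
        RAISE EXCEPTION 'invalid HMAC token for certificate';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER verify_cert_token BEFORE INSERT OR UPDATE ON certificate
FOR EACH ROW EXECUTE PROCEDURE verify_cert_token();
```

An attacker who can issue INSERT or UPDATE statements but doesn’t hold the key cannot produce a token that satisfies this check.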
Subsequent validation
We would like to know that values haven’t been corrupted, e.g., by an attacker knowledgeable enough to disable the trigger, insert bad values, and then restore the trigger. The last step matters because we can (and should) run periodic scans to ensure that all security-related features, like these database triggers, are still in place. Can we use these techniques to validate the records after the fact?
Constant value: no.
Time-based value: only if we record a timestamp as well, and if we do then we have to assume the secret can be compromised. So… no.
HMAC value: yes.
Backups and Restorations
Backups and restorations have the same problems as subsequent validations. You can’t allow any magic values to be backed up (or an attacker could learn them by stealing the backup media), and you can’t allow time-based values plus timestamps to be backed up (or an attacker could recover the shared secret the same way). That means you need to disable the trigger when restoring data, and you can’t verify that the restored data is valid afterwards. Remember: you can’t trust backup media!
The exception is HMAC tokens. They can be safely backed up and restored even if the triggers are in place.
Security Defense
You can add a token column to any table. As always it’s a balance between security and convenience, and the less powerful techniques may be Good Enough for your needs. But for highly sensitive records, especially those that are inserted or updated relatively infrequently, an HMAC token may be a good investment.
Implementation-wise: on the application side you can write a @PrePersist method that handles creation of the TOTP or HMAC token. It’s a standard calculation, and the biggest issue, as always, is key management. On the database side you’ll need a crypto library that supports whatever token method you choose.
Shadow Tables
Finally, there are two concerns with the last approach. First, it requires crypto libraries in your database, which may not be available. Second, if a value is inserted it’s impossible to know for sure that it was your application that inserted it.
There’s a solution to this which is entirely app-side. It might not give you immediate notification of a problem but it still gives you some strong protection when you read the data.
You start as above – add a @PrePersist method that calculates an HMAC code. Only now you edit the domain bean so that the HMAC column uses a @SecondaryTable instead of the main table. (I think you can even specify a different schema if you want even higher security.) From the data perspective this is just two tables with a 1:1 relationship but from the source code perspective it’s still a single object.
Putting this into a separate table, if not a separate schema as well, means that a casual attacker will not know that it is there. They might succeed in inserting or modifying data but not realize that the changes will be detected even if audit triggers are disabled.
The final step is adding a @PostLoad method that verifies the HMAC code. If it’s good you can have confidence the data hasn’t been corrupted. If it’s incorrect or missing you know there’s a problem and you shouldn’t trust it.
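The table layout behind this shadow-table arrangement can be sketched in DDL. This is a minimal sketch assuming PostgreSQL; the schema and table names are illustrative, and the application-side @PrePersist/@PostLoad wiring is as described above.

```sql
-- Sketch of the shadow-table layout, assuming PostgreSQL. The HMAC
-- column lives in a separate schema so a casual attacker browsing the
-- main schema won't see it. Names are illustrative.
CREATE SCHEMA shadow;

CREATE TABLE certificate (
    id bigserial PRIMARY KEY,
    cert bytea
);

CREATE TABLE shadow.certificate_hmac (
    certificate_id bigint PRIMARY KEY REFERENCES certificate (id),
    token bytea   -- HMAC written by the application's @PrePersist method
);
```

From the database’s perspective these are two tables with a 1:1 relationship; from the JPA entity’s perspective (via @SecondaryTable) it is still a single object.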
In advanced setups the developers won’t even know that the extra data is present – you can do a lot with AOP, and some teams are organized so that developers write unsecured code while a security team – focused entirely on security, not features – adds security through AOP code woven into the existing code. But that’s a topic for a future blog….