Archive

For December, 2011

Introduction to PostgreSQL PL/Java, part 3: Triggers

In the past two entries I discussed the basics of PL/Java. Here I will describe one of the most powerful uses of stored procedures – triggers.

Triggers

A database trigger is a stored procedure that runs automatically during three of the four CRUD (create-read-update-delete) operations:

  • insertion - the trigger is provided the new value and is able to modify the values or prohibit the operation outright.
  • update - the trigger is provided both old and new values. Again it is able to modify the values or prohibit the operation.
  • deletion - the trigger is provided the old value. It is not able to modify the value but can prohibit the operation.

A trigger can be run before or after the operation. You would execute a trigger before an operation if you want to modify the values; you would execute it after an operation if you want to log the results.

Typical Usage

Insertion and Update: Data Validation

A pre-trigger on insert and update operations can be used to enforce data integrity and consistency. In this case the results are either accepted or the operation is prohibited.
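As a rough sketch of the accept-or-prohibit idea, the check itself can be an ordinary static method that either returns normally or throws. The class name RowValidator, the username value and its rules are made up for illustration. The sketch assumes that an exception escaping a PL/Java trigger method aborts the statement, and it reuses SQLSTATE 09000, the value PL/Java's TriggerException (listed later in this post) defines as TRIGGER_ACTION_EXCEPTION.

```java
import java.sql.SQLException;

// Hypothetical helper; not part of PL/Java.
final class RowValidator {
    // SQLSTATE "triggered action exception", the same constant TriggerException uses.
    static final String TRIGGER_ACTION_EXCEPTION = "09000";

    // Returns normally to accept the value, throws to prohibit the operation.
    static void validate(String username) throws SQLException {
        if (username == null || username.trim().isEmpty()) {
            throw new SQLException("username must not be blank", TRIGGER_ACTION_EXCEPTION);
        }
        if (username.length() > 10) {
            throw new SQLException("username is longer than 10 characters", TRIGGER_ACTION_EXCEPTION);
        }
    }
}
```

Inside a BEFORE INSERT OR UPDATE trigger method the call would look something like RowValidator.validate(td.getNew().getString("username")).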

Insertion and Update: Data Normalization and Sanitization

Sometimes values can have multiple representations or potentially be dangerous. A pre-trigger is a chance to clean up the data, e.g., to tidy up XML or replace < with &lt; and > with &gt;.
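The angle-bracket replacement mentioned above could be sketched as a small helper like the one below. XmlText and escape are hypothetical names, and the sketch only handles the three characters that matter for element text; a real sanitizer would likely cover quotes and control characters too.

```java
// Hypothetical helper; escapes the characters that are special in XML element text.
final class XmlText {
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;  // must be escaped so existing entities are not misread
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }
}
```

A pre-trigger would apply escape to each text column before the row is stored.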

All Operations: Audit Logging

A post-trigger on all operations can be used to enforce audit logging. Applications can log their own actions but can’t log direct access to the database; a trigger records changes no matter where they come from.

A trigger can be run for each row or after completion of an entire statement. Update triggers can also be conditional.

Triggers can be used to create ‘updateable views’.

PL/Java Implementation

Any Java method can be used as a trigger provided it is a public static method returning void that takes a single argument, a TriggerData object. Triggers can be declared “FOR EACH ROW” or “FOR EACH STATEMENT”.

A TriggerData for a “FOR EACH ROW” trigger contains a single-row, read-only ResultSet as the ‘old’ value on updates and deletions, and a single-row, updatable ResultSet as the ‘new’ value on insertions and updates. This can be used to modify content, log actions, etc.

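package sandbox;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.sql.Types;

import org.postgresql.pljava.TriggerData;

public class AuditTrigger {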

    public static void auditFoobar(TriggerData td) throws SQLException {

        Connection conn = DriverManager
                .getConnection("jdbc:default:connection");
        PreparedStatement ps = conn
                .prepareStatement("insert into javatest.foobar_audit(what, whenn, data) values (?, ?, ?::xml)");

        if (td.isFiredByInsert()) {
            ps.setString(1, "INSERT");
        } else if (td.isFiredByUpdate()) {
            ps.setString(1, "UPDATE");
        } else if (td.isFiredByDelete()) {
            ps.setString(1, "DELETE");
        }
        ps.setTimestamp(2, new Timestamp(System.currentTimeMillis()));

        ResultSet rs = td.getNew();
        if (rs != null) {
            ps.setString(3, toXml(rs));
        } else {
            ps.setNull(3, Types.VARCHAR);
        }

        ps.execute();
        ps.close();
    }

    // simple marshaler. We could use jaxb or similar library.
    // Note: values are copied in as-is, so a '<' or '&' in the data would produce invalid XML.
    static String toXml(ResultSet rs) throws SQLException {
        String foo = rs.getString(1);
        if (rs.wasNull()) {
            foo = "";
        }
        String bar = rs.getString(2);
        if (rs.wasNull()) {
            bar = "";
        }
        return String.format("<my-class><foo>%s</foo><bar>%s</bar></my-class>", foo, bar);
    }
}

and

  CREATE TABLE javatest.foobar (
       foo   varchar(10),
       bar   varchar(10)
  );

  CREATE TABLE javatest.foobar_audit (
       what  varchar(10) not null,
       whenn timestamp not null,
       data  xml
  );

  CREATE FUNCTION javatest.audit_foobar()
      RETURNS trigger
      AS 'sandbox.AuditTrigger.auditFoobar'
      LANGUAGE 'java';

  CREATE TRIGGER foobar_audit
      AFTER INSERT OR UPDATE OR DELETE ON javatest.foobar
      FOR EACH ROW
      EXECUTE PROCEDURE javatest.audit_foobar();
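The audit trigger above only reads the row. A BEFORE trigger can also change it through the updatable ‘new’ ResultSet. Here is a minimal sketch, assuming a hypothetical helper class NormalizeRow and the foo column of javatest.foobar. In PL/Java it would be called as NormalizeRow.normalize(td.getNew()) from a public static void trigger method attached with BEFORE INSERT OR UPDATE ... FOR EACH ROW.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Locale;

// Hypothetical helper; the ResultSet is the single-row 'new' value from TriggerData.getNew().
final class NormalizeRow {
    // Trims and lower-cases the "foo" column of the row that is about to be stored.
    static void normalize(ResultSet newRow) throws SQLException {
        String foo = newRow.getString("foo");
        if (foo != null) {  // SQL NULL stays NULL
            newRow.updateString("foo", foo.trim().toLowerCase(Locale.ROOT));
        }
    }
}
```

According to the TriggerData documentation below, the trigger manager picks up the changes made to this row when the trigger call returns, so no explicit update statement is needed.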

Rules

Rules are a PostgreSQL extension. They are similar to triggers but a bit more flexible. One important difference is that rules can be triggered on a SELECT statement, not just INSERT, UPDATE and DELETE.

Rules, unlike triggers, use standard functions.

The Interface

As before, I have not been able to find a recent version of PL/Java in a Maven repository, so I am including the files here for your convenience.

TriggerData

/*
 * Copyright (c) 2004, 2005, 2006 TADA AB - Taby Sweden
 * Distributed under the terms shown in the file COPYRIGHT
 * found in the root folder of this project or at
 * http://eng.tada.se/osprojects/COPYRIGHT.html
 */
package org.postgresql.pljava;

import java.sql.ResultSet;
import java.sql.SQLException;

/**
 * The SQL 2003 spec. does not stipulate a standard way of mapping
 * triggers to functions. The PLJava mapping use this interface. All
 * functions that are intended to be triggers must be public, static,
 * return void, and take a <code>TriggerData</code> as their argument.
 *
 * @author Thomas Hallgren
 */
public interface TriggerData
{
	/**
	 * Returns the ResultSet that represents the new row. This ResultSet will
	 * be null for delete triggers and for triggers that was fired for
	 * statement. The returned set will be updateable and positioned on a
	 * valid row. When the trigger call returns, the trigger manager will see
	 * the changes that has been made to this row and construct a new tuple
	 * which will become the new or updated row.
	 *
	 * @return An updateable <code>ResultSet</code> containing one row or
	 *         <code>null</code>.
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	ResultSet getNew() throws SQLException;

	/**
	 * Returns the ResultSet that represents the old row. This ResultSet will
	 * be null for insert triggers and for triggers that was fired for
	 * statement. The returned set will be read-only and positioned on a
	 * valid row.
	 *
	 * @return A read-only <code>ResultSet</code> containing one row or
	 *         <code>null</code>.
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	ResultSet getOld() throws SQLException;

	/**
	 * Returns the arguments for this trigger (as declared in the <code>CREATE TRIGGER</code>
	 * statement). If the trigger has no arguments, this method will return an
	 * array with size 0.
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	String[] getArguments() throws SQLException;

	/**
	 * Returns the name of the trigger (as declared in the <code>CREATE TRIGGER</code>
	 * statement).
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	String getName() throws SQLException;

	/**
	 * Returns the name of the table for which this trigger was created (as
	 * declared in the <code>CREATE TRIGGER</code> statement).
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	String getTableName() throws SQLException;

	/**
	 * Returns the name of the schema of the table for which this trigger was created (as
	 * declared in the <code>CREATE TRIGGER</code> statement).
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	String getSchemaName() throws SQLException;

	/**
	 * Returns <code>true</code> if the trigger was fired after the statement
	 * or row action that it is associated with.
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	boolean isFiredAfter() throws SQLException;

	/**
	 * Returns <code>true</code> if the trigger was fired before the
	 * statement or row action that it is associated with.
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	boolean isFiredBefore() throws SQLException;

	/**
	 * Returns <code>true</code> if this trigger is fired once for each row
	 * (as opposed to once for the entire statement).
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	boolean isFiredForEachRow() throws SQLException;

	/**
	 * Returns <code>true</code> if this trigger is fired once for the entire
	 * statement (as opposed to once for each row).
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	boolean isFiredForStatement() throws SQLException;

	/**
	 * Returns <code>true</code> if this trigger was fired by a <code>DELETE</code>.
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	boolean isFiredByDelete() throws SQLException;

	/**
	 * Returns <code>true</code> if this trigger was fired by an <code>INSERT</code>.
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	boolean isFiredByInsert() throws SQLException;

	/**
	 * Returns <code>true</code> if this trigger was fired by an <code>UPDATE</code>.
	 *
	 * @throws SQLException
	 *             if the contained native buffer has gone stale.
	 */
	boolean isFiredByUpdate() throws SQLException;
}

TriggerException

/*
 * Copyright (c) 2004, 2005, 2006 TADA AB - Taby Sweden
 * Distributed under the terms shown in the file COPYRIGHT
 * found in the root folder of this project or at
 * http://eng.tada.se/osprojects/COPYRIGHT.html
 */
package org.postgresql.pljava;

import java.sql.SQLException;

/**
 * An exception specially suited to be thrown from within a method
 * designated to be a trigger function. The message generated by
 * this exception will contain information on what trigger and
 * what relation it was that caused the exception
 *
 * @author Thomas Hallgren
 */
public class TriggerException extends SQLException
{
    private static final long serialVersionUID = 5543711707414329116L;

    private static boolean s_recursionLock = false;

    public static final String TRIGGER_ACTION_EXCEPTION = "09000";

    private static final String makeMessage(TriggerData td, String message)
    {
        StringBuffer bld = new StringBuffer();
        bld.append("In Trigger ");
        if(!s_recursionLock)
        {
            s_recursionLock = true;
            try
            {
                bld.append(td.getName());
                bld.append(" on relation ");
                bld.append(td.getTableName());
            }
            catch(SQLException e)
            {
                bld.append("(exception while generating exception message)");
            }
            finally
            {
                s_recursionLock = false;
            }
        }
        if(message != null)
        {
            bld.append(": ");
            bld.append(message);
        }
        return bld.toString();
    }

    /**
     * Create an exception based on the <code>TriggerData</code> that was
     * passed to the trigger method.
     * @param td The <code>TriggerData</code> that was passed to the trigger
     * method.
     */
    public TriggerException(TriggerData td)
    {
        super(makeMessage(td, null), TRIGGER_ACTION_EXCEPTION);
    }

    /**
     * Create an exception based on the <code>TriggerData</code> that was
     * passed to the trigger method and an additional message.
     * @param td The <code>TriggerData</code> that was passed to the trigger
     * method.
     * @param reason An additional message with info about the exception.
     */
    public TriggerException(TriggerData td, String reason)
    {
        super(makeMessage(td, reason), TRIGGER_ACTION_EXCEPTION);
    }
}

More Information

PostgreSQL ‘create trigger’ documentation.

PostgreSQL ‘create rule’ documentation.

Introduction to PostgreSQL PL/Java, part 2: Working with Lists

Last time I discussed the basics of working with PL/Java. This time I’ll discuss working with lists.

Lists of Scalar Values

Lists of scalar values are returned as Iterators in the java world and SETOF in the SQL world.

    public static Iterator<String> colors() {
        List<String> colors = Arrays.asList("red", "green", "blue");
        return colors.iterator();
    }

and

  CREATE FUNCTION javatest.colors()
      RETURNS SETOF varchar
      AS 'sandbox.PLJava.colors'
      IMMUTABLE LANGUAGE java;

I’ve added the IMMUTABLE keyword since this function will always return the same values. This allows the database to perform caching and query optimization.

You don’t need to know the results, or even the size of the results, before you start. Following is a sequence that’s believed to always terminate but this hasn’t been proven. (It is the sequence behind the Collatz conjecture, also known as the 3n + 1 problem.) As a side note, this isn’t a complete solution since it doesn’t check for overflows – a correct implementation should either check this or use BigInteger.

    public static Iterator<Integer> seq(int start) {
        Iterator<Integer> iter = null;
        try {
            iter = new SeqIterator(start);
        } catch (IllegalArgumentException e) {
            // should log error...
        }
        return iter;
    }

    public static class SeqIterator implements Iterator<Integer> {
        private int next;
        private boolean done = false;

        public SeqIterator(int start) {
            if (start <= 0) {
                throw new IllegalArgumentException();
            }
            this.next = start;
        }

        @Override
        public boolean hasNext() {
            return !done;
        }

        @Override
        public Integer next() {
            int value = next;
            next = (next % 2 == 0) ? next / 2 : 3 * next + 1;
            done = (value == 1);
            return value;
        }

        @Override
        public void remove() {
            throw new UnsupportedOperationException();
        }
    }

and

  CREATE FUNCTION javatest.seq(int)
      RETURNS SETOF int
      AS 'sandbox.PLJava.seq'
      IMMUTABLE LANGUAGE java;
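The overflow caveat mentioned above could be handled roughly as follows. This is a sketch, not a drop-in replacement: CheckedSeqIterator is a hypothetical name, and it assumes Java 8 or later for Math.multiplyExact and Math.addExact, which throw ArithmeticException instead of silently wrapping around.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical overflow-checked variant of the sequence iterator.
final class CheckedSeqIterator implements Iterator<Integer> {
    private int next;
    private boolean done = false;

    CheckedSeqIterator(int start) {
        if (start <= 0) {
            throw new IllegalArgumentException("start must be positive");
        }
        this.next = start;
    }

    @Override
    public boolean hasNext() {
        return !done;
    }

    @Override
    public Integer next() {
        if (done) {
            throw new NoSuchElementException();
        }
        int value = next;
        done = (value == 1);
        if (!done) {
            // The exact-arithmetic calls throw ArithmeticException if 3 * value + 1
            // does not fit in an int; the exception surfaces from this call to next().
            next = (value % 2 == 0)
                    ? value / 2
                    : Math.addExact(Math.multiplyExact(3, value), 1);
        }
        return value;
    }
}
```

Starting from 6 the iterator yields 6, 3, 10, 5, 16, 8, 4, 2, 1 and then stops. Switching the field to BigInteger would remove the limit entirely at the cost of a different SQL return type.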

All things being equal it is better to create each result as needed. This usually reduces the memory footprint and avoids unnecessary work if the query has a LIMIT clause.

Single Tuples

A single tuple is returned in a ResultSet.

    public static boolean singleWord(ResultSet receiver) throws SQLException {
        receiver.updateString("English", "hello");
        receiver.updateString("Spanish", "hola");
        return true;
    }

and

  CREATE TYPE word AS (
      English varchar,
      Spanish varchar);

  CREATE FUNCTION javatest.single_word()
      RETURNS word
      AS 'sandbox.PLJava.singleWord'
      IMMUTABLE LANGUAGE java;

A valid result is indicated by returning true, a null result is indicated by returning false. A complex type can be passed into a java method in the same manner – it is a read-only ResultSet containing a single row.
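Going the other direction, a method that receives a complex type might look like the sketch below. WordFormatter and describe are hypothetical names. On the SQL side I would expect a function declared as taking a word argument and returning varchar, but I have not shown that declaration here.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical example of reading a 'word' tuple passed in as a single-row ResultSet.
final class WordFormatter {
    static String describe(ResultSet word) throws SQLException {
        String english = word.getString("English");
        String spanish = word.getString("Spanish");
        if (english == null || spanish == null) {
            return null;  // a null return maps to SQL NULL
        }
        return english + " = " + spanish;
    }
}
```

The ResultSet is already positioned on its only row, so there is no call to next().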

Lists of Tuples

Returning lists of complex values requires a class implementing one of two interfaces.

org.postgresql.pljava.ResultSetProvider

A ResultSetProvider is used when the results can be created programmatically or on an as-needed basis.

   public static ResultSetProvider listWords() {
        return new WordProvider();
    }

    public static class WordProvider implements ResultSetProvider {
        private final Map<String,String> words = new HashMap<String,String>();
        private final Iterator<String> keys;

        public WordProvider() {
            words.put("one", "uno");
            words.put("two", "dos");
            words.put("three", "tres");
            words.put("four", "cuatro");
            keys = words.keySet().iterator();
        }

        @Override
        public boolean assignRowValues(ResultSet receiver, int currentRow)
                throws SQLException {
            if (!keys.hasNext()) {
                return false;
            }
            String key = keys.next();
            receiver.updateString("English", key);
            receiver.updateString("Spanish", words.get(key));
            return true;
        }

        @Override
        public void close() throws SQLException {
        }
    }

and

    CREATE FUNCTION javatest.list_words()
      RETURNS SETOF word
      AS 'sandbox.PLJava.listWords'
      IMMUTABLE LANGUAGE java;
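One thing to keep in mind with the provider above is that a HashMap does not promise any iteration order, so the rows may come back in a different order than they were added. A variant that uses the currentRow argument as an index keeps the order fixed. This is only a sketch: RowProvider is a local stand-in that copies the two method signatures of org.postgresql.pljava.ResultSetProvider so the example compiles without the PL/Java jar, and OrderedWordProvider is a hypothetical name. Real code would implement the PL/Java interface directly.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

// Stand-in for org.postgresql.pljava.ResultSetProvider (same two methods).
interface RowProvider {
    boolean assignRowValues(ResultSet receiver, int currentRow) throws SQLException;
    void close() throws SQLException;
}

// Hypothetical provider that returns rows in a fixed order.
final class OrderedWordProvider implements RowProvider {
    private static final String[][] WORDS = {
        { "one", "uno" }, { "two", "dos" }, { "three", "tres" }, { "four", "cuatro" }
    };

    @Override
    public boolean assignRowValues(ResultSet receiver, int currentRow) throws SQLException {
        // The interface documentation says the first call has row number 0.
        if (currentRow >= WORDS.length) {
            return false;  // end of data
        }
        receiver.updateString("English", WORDS[currentRow][0]);
        receiver.updateString("Spanish", WORDS[currentRow][1]);
        return true;
    }

    @Override
    public void close() {
        // nothing to release
    }
}
```

Because the provider holds no iterator state, the same lookup works no matter how many rows the query ends up asking for.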

org.postgresql.pljava.ResultSetHandle

A ResultSetHandle is typically used when the method uses an internal query.

    public static ResultSetHandle listUsers() {
        return new UsersHandle();
    }

    public static class UsersHandle implements ResultSetHandle {
        private Statement stmt;

        @Override
        public ResultSet getResultSet() throws SQLException {
            stmt = DriverManager.getConnection("jdbc:default:connection").createStatement();
            return stmt.executeQuery("SELECT * FROM pg_user");
        }

        @Override
        public void close() throws SQLException {
            stmt.close();
        }
    }

and

  CREATE FUNCTION javatest.list_users()
      RETURNS SETOF pg_user
      AS 'sandbox.PLJava.listUsers'
      LANGUAGE java;

The Interfaces

I have been unable to find a recent copy of the pljava jar in a standard Maven repository. My solution was to extract the interfaces from the PL/Java source tarball. They are provided here for your convenience.

ResultSetProvider

/*
 * Copyright (c) 2004, 2005, 2006 TADA AB - Taby Sweden
 * Distributed under the terms shown in the file COPYRIGHT
 * found in the root folder of this project or at
 * http://eng.tada.se/osprojects/COPYRIGHT.html
 */
package org.postgresql.pljava;

import java.sql.ResultSet;
import java.sql.SQLException;

/**
 * An implementation of this interface is returned from functions and procedures
 * that are declared to return <code>SET OF</code> a complex type. Functions that
 * return <code>SET OF</code> a simple type should simply return an
 * {@link java.util.Iterator Iterator}.
 * @author Thomas Hallgren
 */
public interface ResultSetProvider
{
	/**
	 * This method is called once for each row that should be returned from
	 * a procedure that returns a set of rows. The receiver
	 * is a {@link org.postgresql.pljava.jdbc.SingleRowWriter SingleRowWriter}
	 * writer instance that is used for capturing the data for the row.
	 * @param receiver Receiver of values for the given row.
	 * @param currentRow Row number. First call will have row number 0.
	 * @return <code>true</code> if a new row was provided, <code>false</code>
	 * if not (end of data).
	 * @throws SQLException
	 */
	boolean assignRowValues(ResultSet receiver, int currentRow)
	throws SQLException;

	/**
	 * Called after the last row has returned or when the query evaluator decides
	 * that it does not need any more rows.
	 */
	void close()
	throws SQLException;
}
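
The Javadoc above notes that a function returning a set of a simple type can return a plain java.util.Iterator instead of a ResultSetProvider. A minimal sketch of that case follows. The class name SimpleSets, the method listColors and the SQL function named in the comment are hypothetical, chosen only for illustration, and the SQL assumes the class is placed in the sandbox package.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical example class. It is package-private here so the snippet
// compiles on its own; in a real jar it would be a public class in its own file.
class SimpleSets {
    /**
     * Returns one row per element. Assumed to be declared in SQL roughly as:
     *   CREATE FUNCTION javatest.list_colors() RETURNS SETOF varchar
     *     AS 'sandbox.SimpleSets.listColors' LANGUAGE java;
     */
    public static Iterator<String> listColors() {
        // Any Iterator works; a fixed list keeps the sketch small.
        return Arrays.asList("red", "green", "blue").iterator();
    }
}
```

No close() callback is involved in this form, so it suits data that is already in memory rather than an open query.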

ResultSetHandle

/*
 * Copyright (c) 2004, 2005, 2006 TADA AB - Taby Sweden
 * Distributed under the terms shown in the file COPYRIGHT
 * found in the root directory of this distribution or at
 * http://eng.tada.se/osprojects/COPYRIGHT.html
 */
package org.postgresql.pljava;

import java.sql.ResultSet;
import java.sql.SQLException;

/**
 * An implementation of this interface is returned from functions and procedures
 * that are declared to return <code>SET OF</code> a complex type in the form
 * of a {@link java.sql.ResultSet}. The primary motivation for this interface is
 * that an implementation that returns a ResultSet must be able to close the
 * connection and statement when no more rows are requested.
 * @author Thomas Hallgren
 */
public interface ResultSetHandle
{
	/**
	 * An implementation of this method will probably execute a query
	 * and return the result of that query.
	 * @return The ResultSet that represents the rows to be returned.
	 * @throws SQLException
	 */
	ResultSet getResultSet()
	throws SQLException;

	/**
	 * Called after the last row has returned or when the query evaluator decides
	 * that it does not need any more rows.
	 */
	void close()
	throws SQLException;
}

More Information

CREATE TYPE (PostgreSQL)

Introduction to PostgreSQL PL/java, part 1

1 Comment

Modern databases allow stored procedures to be written in a variety of languages. One commonly implemented language is java.

N.B., this article discusses the PostgreSQL-specific java implementation. The details will vary with other databases but the concepts will be the same.

Installation of PL/Java

Installation of PL/Java on an Ubuntu system is straightforward. I will first create a new template, template_java, so I can still create databases without the pl/java extensions.

At the command line, assuming you are a database superuser, enter

# apt-get install postgresql-9.1
# apt-get install postgresql-9.1-pljava-gcj

$ createdb template_java
$ psql -d template_java -c "update pg_database set datistemplate='t' where datname='template_java'"
$ psql -d template_java -f /usr/share/postgresql-9.1-pljava/install.sql

Limitations

The prepackaged Ubuntu package uses the GNU GCJ java implementation, not a standard OpenJDK or Sun implementation. GCJ compiles java source files to native object code instead of byte code. The most recent versions of PL/Java are “trusted” – they can be relied upon to stay within their sandbox. Among other things this means that you can’t access the filesystem on the server.

If you must break the trust there is a second language, ‘javaU’, that can be used. Untrusted functions can only be created by the database superuser.

More importantly this implementation is single-threaded. This is critical to keep in mind if you need to communicate with other servers.

Something to consider is whether you want to compile your own commonly used libraries with GCJ and load them into the PostgreSQL server as shared libraries. Shared libraries go in /usr/lib/postgresql/9.1/lib and I may have more to say about this later.

Quick verification

We can easily check our installation by writing a quick test function. Create a scratch database using template_java and enter the following SQL:

CREATE FUNCTION getsysprop(VARCHAR) RETURNS VARCHAR
  AS 'java.lang.System.getProperty'
  LANGUAGE java;

SELECT getsysprop('user.home');

You should get “/var/lib/postgresql” as a result.

Installing Our Own Methods

This is a nice start but we don’t really gain much if we can’t call our own methods. Fortunately it isn’t hard to add our own.

A simple PL/Java procedure is

package sandbox;

public class PLJava {
    public static String hello(String name) {
        if (name == null) {
            return null;
        }

        return "Hello, " + name + "!";
    }
}

There are two simple rules for methods implementing PL/Java procedures:

  • they must be public static
  • they must return null if any parameter is null

That’s it.
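
The same two rules apply to methods with several parameters. As a sketch (the class Greetings and the method greet are hypothetical, not part of the original example), a two-argument function checks every argument before doing any work:

```java
// Hypothetical example class. It is package-private here so the snippet
// compiles on its own; in a real jar it would be a public class in its own file.
class Greetings {
    /** Follows both rules: public static, and null in means null out. */
    public static String greet(String greeting, String name) {
        // Return null as soon as any parameter is null.
        if (greeting == null || name == null) {
            return null;
        }

        return greeting + ", " + name + "!";
    }
}
```

Called with "Hello" and "World" this returns the same string as the hello method above. Called with a null in either position it returns null.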

Importing the java class into the PostgreSQL server is simple. Let’s assume that the package classes are in /tmp/sandbox.jar and our java-enabled database is mydb. Our commands are then

--
-- load java library
--
-- parameters:
--   url_path - where the library is located
--   url_name - how the library is referred to later
--   deploy   - should the deployment descriptor be used?
--
select sqlj.install_jar('file:///tmp/sandbox.jar', 'sandbox', true);

--
-- set classpath to include new library.
--
-- parameters
--   schema    - schema (or database) name
--   classpath - colon-separated list of url_names.
--
select sqlj.set_classpath('mydb', 'sandbox');

-- -------------------
-- other procedures --
-- -------------------

--
-- reload java library
--
select sqlj.replace_jar('file:///tmp/sandbox.jar', 'sandbox', true);

--
-- remove java library
--
-- parameters:
--   url_name - how the library is referred to later
--   undeploy - should the deployment descriptor be used?
--
select sqlj.remove_jar('sandbox', true);

--
-- list classpath
--
select sqlj.get_classpath('mydb');

--

It is important to remember to set the classpath. Libraries are automatically removed from the classpath when they’re unloaded but they are NOT automatically added to the classpath when they’re installed.

We aren’t quite finished – we still need to tell the system about our new function.

--
-- create function
--
CREATE FUNCTION mydb.hello(varchar) RETURNS varchar
  AS 'sandbox.PLJava.hello'
  LANGUAGE java;

--
-- drop this function
--
DROP FUNCTION mydb.hello(varchar);

--

We can now call our java method in the same manner as any other stored procedures.

Deployment Descriptor

There’s a headache here – it’s necessary to explicitly create the functions when installing a library and to drop them when removing it. This is time-consuming and error-prone in all but the simplest cases.

Fortunately there’s a solution to this problem – deployment descriptors. The precise format is defined by ISO/IEC 9075-13:2003 but a simple example should suffice.

SQLActions[] = {
  "BEGIN INSTALL
     CREATE FUNCTION javatest.hello(varchar)
       RETURNS varchar
       AS 'sandbox.PLJava.hello'
       LANGUAGE java;
   END INSTALL",
  "BEGIN REMOVE
     DROP FUNCTION javatest.hello(varchar);
   END REMOVE"
}

You must tell the deployer about the deployment descriptor in the jar’s MANIFEST.MF file. A sample maven plugin configuration is

<plugin>
   <groupId>org.apache.maven.plugins</groupId>
   <artifactId>maven-jar-plugin</artifactId>
   <version>2.3.1</version>
   <configuration>
      <archive>
         <manifestSections>
            <manifestSection>
               <name>postgresql.ddr</name> <!-- filename -->
               <manifestEntries>
                  <SQLJDeploymentDescriptor>TRUE</SQLJDeploymentDescriptor>
               </manifestEntries>
            </manifestSection>
         </manifestSections>
      </archive>
   </configuration>
</plugin>

The database will now know about our methods as they are installed and removed.

Internal Queries

One of the ‘big wins’ with stored procedures is that queries are executed on the server itself and are MUCH faster than running them through the programmatic interface. I’ve seen a process that required over 30 minutes via Java knocked down to a fraction of a second by simply moving the queried loop from the client to the server.

The JDBC URL for the internal connection is “jdbc:default:connection”. You cannot use transactions (since you’re within the caller’s transaction) but you can use savepoints as long as you stay within a single call. I don’t know yet whether you can use CallableStatements (to call other stored procedures) – you couldn’t in version 1.2 but the Ubuntu 11.10 package uses version 1.4.2.
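
As a rough sketch of the savepoint pattern, the method below wraps a single update in a savepoint on the internal connection and rolls back to it if the statement fails. It uses only standard java.sql calls. The class name InternalQueries, the method runGuarded and the choice to return -1 on failure are assumptions made for illustration, and I have not verified the behavior against every PL/Java release.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Savepoint;
import java.sql.Statement;

// Hypothetical example class. It is package-private here so the snippet
// compiles on its own; in a real jar it would be a public class in its own file.
class InternalQueries {
    /**
     * Runs one update on the caller's connection. Returns the row count,
     * or -1 if the statement failed and was rolled back to the savepoint.
     */
    public static int runGuarded(String sql) throws SQLException {
        // The internal connection; no commit or rollback of the whole
        // transaction is attempted because it belongs to the caller.
        Connection conn = DriverManager.getConnection("jdbc:default:connection");
        Savepoint sp = conn.setSavepoint();
        Statement stmt = conn.createStatement();
        int rows;
        try {
            rows = stmt.executeUpdate(sql);
        } catch (SQLException e) {
            // Undo only this statement's work; the caller's transaction continues.
            conn.rollback(sp);
            return -1;
        } finally {
            stmt.close();
        }
        conn.releaseSavepoint(sp);
        return rows;
    }
}
```

The savepoint is created and released (or rolled back) within the one call, which is the restriction described above. Outside the server there is no driver registered for this URL, so calling the method from an ordinary JVM fails with an SQLException; it is only meaningful when invoked as a PL/Java function.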

Upcoming

Next time I will discuss working with lists and triggers, followed by user-defined types.

More Information

TADA wiki – the author’s blog.

http://pgfoundry.org/projects/pljava/ – Home of PL/Java project.

PL/Java 1.2 User Guide (Note: the Ubuntu package uses release 1.4.2)

Using Google Authenticator (TOTP) On Your Site

1 Comment

Let’s say you want to use two-factor authentication on your site. (Blog entries to follow…). How do you do it?

Time-based One-Time Passwords (TOTP)

An increasingly popular approach is Time-based One-Time Passwords (TOTP) (RFC6238). This is a straightforward algorithm that only requires an accurate clock and a shared secret.

Accurate times have been a pain in the past – computers did not include particularly good real time clock chips – but any server should now be using NTP. I think the major distributions set it up by default but could be mistaken about that.

Modern cell phones also have accurate time since they include GPS receivers.

Finally dongles with LCD displays can include accurate clocks, esp. if you’re able to periodically synchronize them to a PC.

Put it together and we can have reasonable confidence that we’ll have matching clocks on the client and server so TOTP becomes a good option.
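
Concretely, the clock enters the algorithm only as a count of fixed-length time steps since a starting time. The sketch below computes that counter and formats it as the 16-digit hex string that the reference implementation's generateTOTP method takes as its time argument. The class name TimeSteps and its method names are hypothetical; the 30-second step and a starting time of 0 are the values used in the RFC's test program.

```java
// Hypothetical helper; not part of the RFC reference implementation.
class TimeSteps {
    /**
     * Returns the step counter T = (unixSeconds - t0) / stepSeconds
     * as 16 upper-case hex digits, left-padded with zeros.
     */
    public static String toHexSteps(long unixSeconds, long t0, long stepSeconds) {
        long t = (unixSeconds - t0) / stepSeconds;
        StringBuilder hex = new StringBuilder(Long.toHexString(t).toUpperCase());
        while (hex.length() < 16) {
            hex.insert(0, '0');
        }
        return hex.toString();
    }

    /** Step counter for the current time, assuming t0 = 0 and 30-second steps. */
    public static String currentHexSteps() {
        return toHexSteps(System.currentTimeMillis() / 1000L, 0L, 30L);
    }
}
```

For example, toHexSteps(59, 0, 30) gives "0000000000000001" and toHexSteps(1111111109L, 0, 30) gives "00000000023523EC". Because the division truncates, the client and server agree on the counter as long as their clocks fall within the same 30-second window.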

Jumping straight to the code – this is the reference implementation from the RFC. The RFC also includes test vectors to verify implementations.

/**
 Copyright (c) 2011 IETF Trust and the persons identified as
 authors of the code. All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, is permitted pursuant to, and subject to the license
 terms contained in, the Simplified BSD License set forth in Section
 4.c of the IETF Trust's Legal Provisions Relating to IETF Documents
 (http://trustee.ietf.org/license-info).
 */
import java.lang.reflect.UndeclaredThrowableException;

import java.math.BigInteger;

import java.security.GeneralSecurityException;

import java.text.DateFormat;
import java.text.SimpleDateFormat;

import java.util.Date;
import java.util.TimeZone;

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/**
 * This is an example implementation of the OATH
 * TOTP algorithm.
 * Visit www.openauthentication.org for more information.
 *
 * @author Johan Rydell, PortWise, Inc.
 */
public class TOTP {
    private static final int[] DIGITS_POWER
    // 0 1  2   3    4     5      6       7        8
         = { 1, 10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000 };

    private TOTP() {
    }

    /**
     * This method uses the JCE to provide the crypto algorithm.
     * HMAC computes a Hashed Message Authentication Code with the
     * crypto hash algorithm as a parameter.
     *
     * @param crypto: the crypto algorithm (HmacSHA1, HmacSHA256,
     *                             HmacSHA512)
     * @param keyBytes: the bytes to use for the HMAC key
     * @param text: the message or text to be authenticated
     */
    private static byte[] hmac_sha(String crypto, byte[] keyBytes, byte[] text) {
        try {
            Mac hmac;
            hmac = Mac.getInstance(crypto);

            SecretKeySpec macKey = new SecretKeySpec(keyBytes, "RAW");
            hmac.init(macKey);

            return hmac.doFinal(text);
        } catch (GeneralSecurityException gse) {
            throw new UndeclaredThrowableException(gse);
        }
    }

    /**
     * This method converts a HEX string to Byte[]
     *
     * @param hex: the HEX string
     *
     * @return: a byte array
     */
    private static byte[] hexStr2Bytes(String hex) {
        // Adding one byte to get the right conversion
        // Values starting with "0" can be converted
        byte[] bArray = new BigInteger("10" + hex, 16).toByteArray();

        // Copy all the REAL bytes, not the "first"
        byte[] ret = new byte[bArray.length - 1];

        for (int i = 0; i < ret.length; i++)
            ret[i] = bArray[i + 1];

        return ret;
    }

    /**
     * This method generates a TOTP value for the given
     * set of parameters.
     *
     * @param key: the shared secret, HEX encoded
     * @param time: a value that reflects a time
     * @param returnDigits: number of digits to return
     *
     * @return: a numeric String in base 10 that includes
     *              {@link truncationDigits} digits
     */
    public static String generateTOTP(String key, String time,
        String returnDigits) {
        return generateTOTP(key, time, returnDigits, "HmacSHA1");
    }

    /**
     * This method generates a TOTP value for the given
     * set of parameters.
     *
     * @param key: the shared secret, HEX encoded
     * @param time: a value that reflects a time
     * @param returnDigits: number of digits to return
     *
     * @return: a numeric String in base 10 that includes
     *              {@link truncationDigits} digits
     */
    public static String generateTOTP256(String key, String time,
        String returnDigits) {
        return generateTOTP(key, time, returnDigits, "HmacSHA256");
    }

    /**
     * This method generates a TOTP value for the given
     * set of parameters.
     *
     * @param key: the shared secret, HEX encoded
     * @param time: a value that reflects a time
     * @param returnDigits: number of digits to return
     *
     * @return: a numeric String in base 10 that includes
     *              {@link truncationDigits} digits
     */
    public static String generateTOTP512(String key, String time,
        String returnDigits) {
        return generateTOTP(key, time, returnDigits, "HmacSHA512");
    }

    /**
     * This method generates a TOTP value for the given
     * set of parameters.
     *
     * @param key: the shared secret, HEX encoded
     * @param time: a value that reflects a time
     * @param returnDigits: number of digits to return
     * @param crypto: the crypto function to use
     *
     * @return: a numeric String in base 10 that includes
     *              {@link truncationDigits} digits
     */
    public static String generateTOTP(String key, String time,
        String returnDigits, String crypto) {
        int codeDigits = Integer.decode(returnDigits).intValue();
        String result = null;

        // Using the counter
        // First 8 bytes are for the movingFactor
        // Compliant with base RFC 4226 (HOTP)
        while (time.length() < 16)
            time = "0" + time;

        // Get the HEX in a Byte[]
        byte[] msg = hexStr2Bytes(time);
        byte[] k = hexStr2Bytes(key);
        byte[] hash = hmac_sha(crypto, k, msg);

        // put selected bytes into result int
        int offset = hash[hash.length - 1] & 0xf;

        int binary = ((hash[offset] & 0x7f) << 24) |
            ((hash[offset + 1] & 0xff) << 16) |
            ((hash[offset + 2] & 0xff) << 8) | (hash[offset + 3] & 0xff);

        int otp = binary % DIGITS_POWER[codeDigits];

        result = Integer.toString(otp);

        while (result.length() < codeDigits) {
            result = "0" + result;
        }

        return result;
    }

    public static void main(String[] args) {
        // Seed for HMAC-SHA1 - 20 bytes
        String seed = "3132333435363738393031323334353637383930";

        // Seed for HMAC-SHA256 - 32 bytes
        String seed32 = "3132333435363738393031323334353637383930" +
            "313233343536373839303132";

        // Seed for HMAC-SHA512 - 64 bytes
        String seed64 = "3132333435363738393031323334353637383930" +
            "3132333435363738393031323334353637383930" +
            "3132333435363738393031323334353637383930" + "31323334";
        long T0 = 0;
        long X = 30;
        long[] testTime = {
                59L, 1111111109L, 1111111111L, 1234567890L, 2000000000L,
                20000000000L
            };

        String steps = "0";
        DateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        df.setTimeZone(TimeZone.getTimeZone("UTC"));

        try {
            System.out.println("+---------------+-----------------------+" +
                "------------------+--------+--------+");
            System.out.println("|  Time(sec)    |   Time (UTC format)   " +
                "| Value of T(Hex)  |  TOTP  | Mode   |");
            System.out.println("+---------------+-----------------------+" +
                "------------------+--------+--------+");

            for (int i = 0; i < testTime.length; i++) {
                long T = (testTime[i] - T0) / X;
                steps = Long.toHexString(T).toUpperCase();

                while (steps.length() < 16)
                    steps = "0" + steps;

                String fmtTime = String.format("%1$-11s", testTime[i]);
                String utcTime = df.format(new Date(testTime[i] * 1000));
                System.out.print("|  " + fmtTime + "  |  " + utcTime + "  | " +
                    steps + " |");
                System.out.println(generateTOTP(seed, steps, "8", "HmacSHA1") +
                    "| SHA1   |");
                System.out.print("|  " + fmtTime + "  |  " + utcTime + "  | " +
                    steps + " |");
                System.out.println(generateTOTP(seed32, steps, "8", "HmacSHA256") +
                    "| SHA256 |");
                System.out.print("|  " + fmtTime + "  |  " + utcTime + "  | " +
                    steps + " |");
                System.out.println(generateTOTP(seed64, steps, "8", "HmacSHA512") +
                    "| SHA512 |");

                System.out.println("+---------------+-----------------------+" +
                    "------------------+--------+--------+");
            }
        } catch (final Exception e) {
            System.out.println("Error : " + e);
        }
    }
}

Google Authenticator

Enter Google. Or more precisely, Google Accounts. This is a popular hosting platform for small businesses, non-profits and groups. Even individuals with vanity domains.

Many of these users require better security than you get with just a password. Some users REQUIRE better security due to regulatory or contractual obligations.

Google saw the problem and came up with a solution: Google Authenticator. It is an open source implementation of the TOTP algorithm that has been implemented on smart phones and as a Linux PAM module. Not everyone has a smart phone but enough do for this to be a good solution. Hardware dongles are also available if you prefer them.

(IMPORTANT: the security of the Linux PAM module is debatable since it includes the user’s secret key, unencrypted, in the user’s home directory.)

The code that generates the value produced by Google Authenticator implementations is:

// keyInHex: the user's shared secret, hex encoded, as generateTOTP expects
long counter = System.currentTimeMillis() / 30000L;
String steps = Long.toHexString(counter).toUpperCase();

String code = TOTP.generateTOTP(keyInHex, steps, "6", "HmacSHA1");

(Sidenote: it goes without saying that the googlecode project includes similar code. I’m using the RFC reference implementation since it’s much more flexible – I might want to use different parameters in other situations.)

Server-side Implementation

The server-side implementation is fairly straightforward.

Registration

  • Create a random 8-byte secret key. Be sure to use a cryptographically strong random number generator (SecureRandom), not the standard random number generator.
  • Save the key in the database. ENCRYPT IT.
  • Create a unique label, e.g., the username @ the site’s domain name. You don’t want to use anything that can change, e.g., the user’s email address, or something that might be used at other sites, e.g., nothing but his username. Remember that this is what is used to remind the user to use this key – you don’t want to use random strings.
  • Provide the user with a QR code that he can scan using his smart phone. You can also provide the secret key in a string for the user to enter manually.

There is a standard URI for providing secret keys.

The URI for a TOTP key is otpauth://totp/LABEL?secret=SECRET where LABEL is the unique identifier you created above and SECRET is the base32-encoded shared secret.

Base32 uses the case-insensitive letters A-Z and the digits 2-7 to encode a value and corrects for several common errors, e.g., using ‘0’ instead of ‘O’. An encoder is available in the Apache commons-codec project.
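
The registration steps can be sketched in a few lines of Java. This is only an illustration: the class and method names are mine, the Base32 encoder is hand-rolled so the example has no dependencies (commons-codec would do the same job), the trailing ‘=’ padding is left off, and the label is assumed to need no URL escaping.

```java
import java.security.SecureRandom;

// Hypothetical helper class, not part of any library mentioned in this post.
final class TotpRegistration {
    private static final String BASE32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    /** Creates a random secret with a cryptographically strong generator. */
    static byte[] newSecret(int lengthInBytes) {
        byte[] secret = new byte[lengthInBytes];
        new SecureRandom().nextBytes(secret);
        return secret;
    }

    /** Base32-encodes the input (A-Z and 2-7, no '=' padding). */
    static String base32(byte[] data) {
        StringBuilder out = new StringBuilder();
        int buffer = 0; // bits read but not yet emitted (low end of the int)
        int bits = 0;   // how many of those bits are still pending
        for (byte b : data) {
            buffer = (buffer << 8) | (b & 0xff);
            bits += 8;
            while (bits >= 5) {
                out.append(BASE32_ALPHABET.charAt((buffer >> (bits - 5)) & 0x1f));
                bits -= 5;
            }
        }
        if (bits > 0) {
            // Left-align the final partial group and zero-fill it.
            out.append(BASE32_ALPHABET.charAt((buffer << (5 - bits)) & 0x1f));
        }
        return out.toString();
    }

    /** Builds otpauth://totp/LABEL?secret=SECRET for the given label and key. */
    static String otpauthUri(String label, byte[] secret) {
        return "otpauth://totp/" + label + "?secret=" + base32(secret);
    }
}
```

Calling TotpRegistration.otpauthUri(label, TotpRegistration.newSecret(8)) gives the string to put into the QR code. The raw key, not the Base32 text, is what should be encrypted and stored.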

You can get a QR image for this URI at google.com, e.g., for otpauth://totp/alice@google.com?secret=JBSWY3DPEHPK3PXP it is

<img src="https://www.google.com/chart?chs=200x200&chld=M|0&cht=qr&chl=otpauth://totp/alice@google.com?secret=JBSWY3DPEHPK3PXP"/>

With the QR code the user can simply point his smartphone at his monitor to load his key.

Verification

  • The user is prompted for his username, password, and TOTP code.
  • When the login form is received three codes are generated – for the current time, 30 seconds ago, and 30 seconds from now. This gives you a bit of buffer to allow for unsynchronized clocks or the time required by the user to enter the data and submit the form.
  • Check the codes and respond accordingly.
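
The three-window check might look like the following. Treat it as a sketch under my own assumptions: the class name is made up, the key is passed as raw bytes rather than the hex string the RFC reference code takes, and HMAC-SHA1 with 6 digits and a 30 second step is hard-coded because that is what Google Authenticator uses.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper class; names are illustrative only.
final class TotpVerifier {
    private static final long STEP_SECONDS = 30L;
    private static final int DIGITS = 6;

    /** HOTP value (RFC 4226) for a raw key and counter, using HMAC-SHA1. */
    static String hotp(byte[] key, long counter, int digits) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            // The counter is hashed as an 8-byte big-endian value.
            byte[] hash = mac.doFinal(ByteBuffer.allocate(8).putLong(counter).array());
            int offset = hash[hash.length - 1] & 0x0f;
            int binary = ((hash[offset] & 0x7f) << 24)
                    | ((hash[offset + 1] & 0xff) << 16)
                    | ((hash[offset + 2] & 0xff) << 8)
                    | (hash[offset + 3] & 0xff);
            int modulus = 1;
            for (int i = 0; i < digits; i++) {
                modulus *= 10;
            }
            StringBuilder code = new StringBuilder(Integer.toString(binary % modulus));
            while (code.length() < digits) {
                code.insert(0, '0');
            }
            return code.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA1 unavailable or key rejected", e);
        }
    }

    /** Accepts the code for the previous, current, or next 30-second step. */
    static boolean verify(byte[] key, String submitted, long nowSeconds) {
        long current = nowSeconds / STEP_SECONDS;
        byte[] candidate = submitted.getBytes(StandardCharsets.US_ASCII);
        boolean matched = false;
        for (long counter = current - 1; counter <= current + 1; counter++) {
            byte[] expected = hotp(key, counter, DIGITS).getBytes(StandardCharsets.US_ASCII);
            // MessageDigest.isEqual is meant to compare without an early exit.
            matched |= MessageDigest.isEqual(expected, candidate);
        }
        return matched;
    }
}
```

A login handler would call TotpVerifier.verify(key, code, System.currentTimeMillis() / 1000L) after the password check. Widening the loop gives more tolerance for clock drift at the cost of a longer window in which a captured code is valid.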

Lost or Compromised Keys

People lose their phones. Local policy may require access credentials to be changed periodically. You have to be prepared.

  • The user authenticates himself using the current TOTP code (if it’s a periodic change) or via some other mechanism. (Do not ask the common easy to guess questions!)
  • A new secret key is created and provided to the user as described above.
  • The old secret key is either deleted (low security) or retained to capture attempted uses in the future (high security).

As always you never provide current credentials to the user.

Crash Codes

Sometimes people do not have access to their phone (e.g., they’re in a secured environment) or are otherwise unable to use the TOTP code. We must provide a fallback mechanism.

Fortunately this is very easy – provide the user with the eight digit code for the first few values, say counter = 0 to counter = 5. Strictly speaking these are now Hash-based One-Time Passwords (HOTP).

The user should treat these codes in the same way as passwords.

The authentication process now checks the current time for 6 digit codes, or the first few counters for 8 digit codes.
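
Generating and checking the crash codes could be sketched like this. The class and method names are hypothetical, the key is again raw bytes, and the sketch does not track which codes have already been used, which a real implementation would presumably need to do since these are one-time passwords.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.List;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper class; names are illustrative only.
final class CrashCodes {
    /** Eight-digit HOTP value for one counter, using HMAC-SHA1. */
    static String hotp8(byte[] key, long counter) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            byte[] hash = mac.doFinal(ByteBuffer.allocate(8).putLong(counter).array());
            int offset = hash[hash.length - 1] & 0x0f;
            int binary = ((hash[offset] & 0x7f) << 24)
                    | ((hash[offset + 1] & 0xff) << 16)
                    | ((hash[offset + 2] & 0xff) << 8)
                    | (hash[offset + 3] & 0xff);
            StringBuilder code = new StringBuilder(Integer.toString(binary % 100000000));
            while (code.length() < 8) {
                code.insert(0, '0');
            }
            return code.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA1 unavailable or key rejected", e);
        }
    }

    /** Codes for counter = 0 .. count - 1, to be handed to the user once. */
    static List<String> generate(byte[] key, int count) {
        List<String> codes = new ArrayList<>();
        for (long counter = 0; counter < count; counter++) {
            codes.add(hotp8(key, counter));
        }
        return codes;
    }

    /** True if the submitted 8-digit code is one of the first count codes. */
    static boolean accepts(byte[] key, String submitted, int count) {
        return submitted.length() == 8 && generate(key, count).contains(submitted);
    }
}
```

The login code can dispatch on length: 6 digit codes go to the time-based check, 8 digit codes go to CrashCodes.accepts.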

Oracles

I mentioned that the secret key should be encrypted but we can make this design much more robust by using an oracle. These are stored procedures that encapsulate all of the logic and the application is provided the absolute minimum amount of information.

The stored procedure signatures are

--
-- Generate a random key, associate it with the user, and return
-- the corresponding otpauth URI.
--
-- This stored procedure is also responsible for moving any
-- existing key to an audit table if future attempts to log
-- in with the code should be recorded.
--
CREATE FUNCTION generateTotpUri(username varchar, label varchar) RETURNS varchar AS $$
-- body...
$$

--
-- Authenticate the user with the specified password and TOTP code.
-- This stored procedure should accept either 6 digit time-based TOTP
-- codes or 8 digit HOTP codes.
--
-- This stored procedure should only return 1 (success) or 0 (failure).
--
-- This stored procedure is also responsible for recording
-- any attempts to log in with disabled secret keys.
--
CREATE FUNCTION authenticateUser(username varchar, password varchar, totp varchar) RETURNS integer AS $$
-- body...
$$

Many databases support SHA1 HMAC computations, e.g., in the pgcrypto package for PostgreSQL. The trick to remember is that the last nybble of the hash is used as the offset into the hash before converting the hash into an integer – that’s not a common approach yet.
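
The truncation step on its own, separated from the HMAC call, is small enough to show in full. The class and method names below are mine; the bit manipulation is the same as in generateTOTP above.

```java
// Hypothetical helper class; names are illustrative only.
final class DynamicTruncation {
    /** Picks 4 bytes at the offset named by the last nybble and drops the sign bit. */
    static int truncate(byte[] hash) {
        int offset = hash[hash.length - 1] & 0x0f;
        return ((hash[offset] & 0x7f) << 24)
                | ((hash[offset + 1] & 0xff) << 16)
                | ((hash[offset + 2] & 0xff) << 8)
                | (hash[offset + 3] & 0xff);
    }

    /** Reduces the truncated value to a zero-padded decimal code. */
    static String toCode(int binary, int digits) {
        int modulus = 1;
        for (int i = 0; i < digits; i++) {
            modulus *= 10;
        }
        StringBuilder code = new StringBuilder(Integer.toString(binary % modulus));
        while (code.length() < digits) {
            code.insert(0, '0');
        }
        return code.toString();
    }
}
```

For the example hash in RFC 4226 section 5.4 (1f 86 98 69 0e 02 ca 16 61 85 50 ef 7f 19 da 8e 94 5b 55 5a) the last byte is 0x5a, so the offset is 10, the selected bytes are 50 ef 7f 19, the integer is 1357872921 and the six digit code is 872921. A stored procedure has to reproduce exactly these steps on the HMAC output.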

Implementation is left as an exercise for the reader. (Hint: you can always be lazy and ask the database crypto provider to add this function!)

For More Information

http://code.google.com/p/google-authenticator/

Google Authenticator for multi-factor authentication

RFC 6238 (TOTP)

RFC 4226 (HOTP)

PostgreSQL pgcrypto package – note: this is for release 8.3. I don’t know if there have been more recent updates.

Late update: another article went up at Java Code Geeks while I was working on mine: Google Authenticator: Using It With Your Own Java Authentication Server

Database and Webapp Security, part 5: User Authentication

No Comments

User Authentication and Authorization Information

User authentication (authn) is how we know that a user is who he claims to be. At a minimum it’s a username and password but it could include much more if two-factor authentication is used.

User authorization (authz) is what we allow the user to do.

These are very different questions and should be treated as such. Some architectures enforce this, e.g., if a site uses SiteMinder or a similar tool then the webapp doesn’t have access to authn information at all – it can only add authz.

What is user authn/authz information? It is

  • username and/or email
  • password
  • single sign-on (SSO) identifications
  • security tokens (for two-factor authentication)
  • security images/phrases (used to prove your site is legitimate to the user)
  • groups and roles

What is not user authn/authz information?

  • contact information
  • content subscriptions
  • or anything else that’s not required to authenticate or authorize the user.

Wrong Approach

Put everything – user authn/authz, static content and dynamic content – into a single database schema.

It’s quick.

It’s easy.

It’s the default behavior for auto-generation tools.

And it’s very, very wrong since anyone who cracks your webapp has also cracked your user authn/authz data. At best you’ll have a denial-of-service. At worst they can pretend to be other users, can add their own highly-privileged account, etc.

Separate Schemas and Connection Pools

The quickest solution is to create a separate schema for the user authn/authz data and use a dedicated data source (or Hibernate session) when accessing this data. This schema should be unreadable from the standard data source (or Hibernate session). This gives you a good firewall from the world but isn’t perfect.

A seemingly more robust solution is to use a separate database, not just a separate schema, for the user authn/authz data. This would seem to protect you from misconfigured rights that would allow the dynamic content data source to access the user data source.

Sadly in some RDBMS there’s not a clear distinction between schemas and databases and a connection to one “database” can still access another “database” if the necessary rights are granted. You can’t be sure unless you have a separate database instance for user authn/authz and dynamic content. This may not be an undue burden if your architecture has a server dedicated for user authn/authz. This is not unreasonable with virtual servers or a cloud design.

Container-Based Authentication

A better solution is container-based authentication. Pull user authn/authz entirely out of the webapp – by the time your webapp gets the request the HttpServletRequest already has all necessary information populated. Your webapp has no access to the container’s authentication information. (Modulo the notes above – you don’t gain anything if the container looks into the same schema as your dynamic content.)

A variant of this is authentication filters put in front of the webapp, e.g., those from Spring Security. It’s a different mechanism but serves the same purpose of keeping a very sharp distinction between user data and dynamic content.

The Glitch – Adding and Updating Users

There’s one big glitch here – how do you add or update user information if your webapp can’t access the user authn/authz tables?

The first approach is to create a separate webapp that handles this. Your main webapp can transparently redirect to the second webapp as necessary. The upside is that you can have a consistent look and feel, the downside is that you’re exposing user authn/authz information to the web again.

The second approach is to create a separate REST service that handles this. Your webapp can provide the user interface but call the REST service instead of the standard business layer. The REST service can be within your firewall.

The third approach is to defer this entirely to the container. This ensures maximum separation but makes it difficult to have a consistent look and feel.

Revisiting Defending from XEE Attacks With Security Managers

No Comments

A little over a year ago I wrote about using security managers and mentioned XEE attacks (that is, using XML External Entities (XEE) in XML for denial of service or information disclosure attacks) in passing.

Using a SecurityManager and Identifying Requirements.

Since then I’ve had a ‘duh’ moment and realized a simple solution to this problem. I am currently working on a longer piece but XEE attacks are serious enough to warrant a quick update to the earlier piece.

In a nutshell there’s no need to explicitly list all of the required permissions – we can simply train our SecurityManager. The process is straightforward:

  • write a logging SecurityManager as described earlier[1]
  • process a few sample documents that exercise all of the requirements
  • flip a flag
  • the SecurityManager now checks all future requests against the permissions already seen. If there’s a match it’s permitted. If not it’s denied.

A human should still review the identified permissions before putting this into production – you want to ensure it’s the most restrictive permissions. E.g., you should only grant permission to read a specific file, not an arbitrary file.
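
The train-then-enforce logic can be sketched without any of the SecurityManager plumbing. The class below is hypothetical and is not the code from the earlier piece; the assumption is that a SecurityManager subclass would simply forward each checkPermission(Permission) call to check().

```java
import java.security.Permission;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Hypothetical class: only the record-then-deny logic, nothing else.
final class LearningPermissionFilter {
    private final Set<Permission> seen = new CopyOnWriteArraySet<>();
    private volatile boolean enforcing = false;

    /** "Flip the flag": stop recording and start denying anything unseen. */
    void enforce() {
        enforcing = true;
    }

    /** Records the request while training; afterwards throws unless it was seen. */
    void check(Permission requested) {
        if (!enforcing) {
            seen.add(requested);
            return;
        }
        for (Permission granted : seen) {
            if (granted.implies(requested)) {
                return;
            }
        }
        throw new SecurityException("not seen during training: " + requested);
    }

    /** Snapshot of the recorded permissions for the human review step. */
    Set<Permission> learned() {
        return new HashSet<>(seen);
    }
}
```

Because the match uses Permission.implies, a recorded FilePermission for one specific file only admits that file. Wildcard matching would mean recording broader permissions than the ones actually observed, which is where the extra complexity comes in.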

That’s the problem holding up the broader paper. XML processing is a very tightly constrained problem and we can identify precisely what’s required. We can’t say that about many other problems so we would want wildcard matching of pre-specified permissions and that’s a lot more complex.

[1] I would make one change today – I would call the existing SecurityManager, if it exists, instead of blindly accepting all requests.

Database and Webapp Security, part 4: Schema Ownership

No Comments

What are DDL, DML, DCL and TCL?

SQL contains four distinct types of statements.

Data Definition Language

Data Definition Language (DDL) statements define the database structure. Think of this as the landlord that builds the warehouse but turns over the keys to the renter.

Statements:

  • create – create tables, views, indexes, etc.
  • alter – alter tables, views, indexes, columns, etc.
  • drop – delete tables, views, indexes, etc.
  • truncate – remove all of the records from a table
  • comment – add comments to tables, columns, views, etc.
  • rename – rename a table, view, etc.

Data Manipulation Language

Data Manipulation Language (DML) statements manage the data within the structure created by the DDL. Think of this as the tenant of the warehouse – it can use the warehouse but can’t knock down walls.

Statements:

  • select – retrieve data
  • insert – insert new data into a table
  • update – update existing data within a table
  • delete – delete data from a table
  • call – call a PL/SQL or other stored procedure
  • explain plan – explain how a query will be executed
  • lock table – lock a table to limit concurrency

Data Control Language

Data Control Language (DCL) statements control access rights to the data and schema. Think of these as locks on the doors, permission to move walls within the warehouse, etc.

Statements:

  • grant – give the user additional privileges
  • revoke – remove user privileges

Transaction Control Language

Transaction Control Language (TCL) statements are used to control transactions.

Statements:

  • commit – save completed work
  • rollback – undo completed work
  • savepoint – mark a point that we can rollback to later without necessarily rolling back the entire transaction
  • set transaction – set transaction options

Use Different Database Users for Schema and Data Ownership

The schema should be owned by one database user, e.g., app_owner, and the data should be owned by a different database user, e.g., app_user.

The owner should:

  • have the ability to run DDL and DCL statements
  • arguably not have the ability to run DML statements
  • never be accessed via the webapp

The user should

  • have the ability to run DML and TCL statements
  • not have the ability to run DDL or DCL statements
  • be accessible via the webapp

Cost/Benefit Analysis

There is a very favorable cost/benefit ratio for separating the ownership of schema and data. There is a slightly higher cost when creating and maintaining the database but it essentially eliminates the ability of a web intruder to destroy the database schema itself. The data, on the other hand, can still be nuked.

Database and Webapp Security, part 3: SQL Injection in Stored Procedures

No Comments

What are stored procedures and CallableStatements?

Stored procedures are bits of code kept in the database. The most common form is a SQL-like scripting language but additional languages are supported – PERL, tcl, ruby, java, etc.

It is important to remember that stored procedures are used in database triggers – you should be aware of them even if you do all of your work with hibernate.

Wrong Approach

The wrong approach is to create a dynamic SQL query without sanitization.

DELIMITER $$
DROP PROCEDURE IF EXISTS SP_AUTHENTICATE$$
CREATE PROCEDURE SP_AUTHENTICATE(IN username VARCHAR(20),
                                 IN password VARCHAR(20),
                                 OUT success INT)
BEGIN
  SET @query = CONCAT('SELECT COUNT(credentials.username) INTO @succ
     FROM credentials
     WHERE credentials.username = \'', username,
        '\' AND credentials.password = \'', password, '\'');
   PREPARE stmt FROM @query;
   EXECUTE stmt;
   SELECT @succ;
   SET success = @succ;
END;
$$
DELIMITER ;

(Note: this code fragment comes from the reference below.)

This stored procedure has no benefits over the “wrong answer” in part 2 with the exception of very modest encapsulation.

Sidenote: This is an example of an oracle. It returns the minimum amount of information about user authentication – a “thumbs up” or “thumbs down”. There’s no information leak in this implementation since the caller already knows the username and password but a more robust implementation could also verify that the user account has not been disabled, etc.

Stored Procedures and Parameterization

The first safe approach is executing the SQL directly instead of creating dynamic SQL.

The second safe approach is parameterization within the stored procedure. This is directly equivalent to Java prepared statements and placeholders.

DELIMITER $$
DROP PROCEDURE IF EXISTS SP_AUTHENTICATE$$
CREATE PROCEDURE SP_AUTHENTICATE(IN username VARCHAR(20),
                                 IN password VARCHAR(20),
                                 OUT success INT)
BEGIN
  SET @query = 'SELECT COUNT(credentials.username) INTO @succ
     FROM credentials
     WHERE credentials.username = ? AND credentials.password = ?';
   PREPARE stmt FROM @query;
   SET @usernm = username;
   SET @pass = password;
   EXECUTE stmt USING @usernm, @pass;
   SELECT @succ;
   SET success = @succ;
END;
$$
DELIMITER ;
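
On the Java side a procedure like this is called through a CallableStatement. The following is a minimal sketch: the class name is made up, the JDBC escape syntax {call ...} is standard, and treating an OUT value of 1 as success is my reading of the COUNT query above.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical caller; only SP_AUTHENTICATE comes from the listing above.
final class StoredProcedureAuth {
    /** Returns true when the procedure reports exactly one matching credential. */
    static boolean authenticate(Connection conn, String username, String password)
            throws SQLException {
        try (CallableStatement call = conn.prepareCall("{call SP_AUTHENTICATE(?, ?, ?)}")) {
            call.setString(1, username);                 // IN username
            call.setString(2, password);                 // IN password
            call.registerOutParameter(3, Types.INTEGER); // OUT success
            call.execute();
            return call.getInt(3) == 1;
        }
    }
}
```

The placeholders here protect the call itself. They do nothing for the SQL built inside the procedure, which is why the body of the procedure matters.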

PL/pgSQL Sanitization

There is another alternative if you’re willing to be tied to a specific database vendor. In practice this usually isn’t an issue – hibernate gives you some database transparency but stored procedures will always be tied closely to the database.

In PL/pgSQL (PostgreSQL) there are two functions that can be used for sanitizing input: quote_ident and quote_literal. There are undoubtedly similar functions in other stored procedure languages.

Updating the wrong answer above we have:

DELIMITER $$
DROP PROCEDURE IF EXISTS SP_AUTHENTICATE$$
CREATE PROCEDURE SP_AUTHENTICATE(IN username VARCHAR(20),
                                 IN password VARCHAR(20),
                                 OUT success INT)
BEGIN
  SET @query = CONCAT('SELECT COUNT(credentials.username) INTO @succ
     FROM credentials
     WHERE credentials.username = ', quote_literal(username),
        ' AND credentials.password = ', quote_literal(password));
   PREPARE stmt FROM @query;
   EXECUTE stmt;
   SELECT @succ;
   SET success = @succ;
END;
$$
DELIMITER ;

Direct SQL

The final safe approach is to use direct SQL calls with minimum parameter size. This is mentioned on the CERT website but I would hesitate to use it since it would be so easy to introduce unsafe code by accident.

DELIMITER $$
DROP PROCEDURE IF EXISTS SP_AUTHENTICATE$$
CREATE PROCEDURE SP_AUTHENTICATE(IN username VARCHAR(8),
                                 IN password VARCHAR(20),
                                 OUT success INT)
BEGIN
  SELECT COUNT(credentials.username) INTO success
     FROM credentials
     WHERE credentials.username = username AND credentials.password = password;
END;
$$
DELIMITER ;

Cost/Benefit Analysis

Stored procedures are harder to exploit than naked SQL queries but this often gives people a false sense of security. Building the queries inside stored procedures safely (parameterization or direct SQL) should be considered mandatory for sensitive information (user authentication, audit logging) and highly recommended in all other cases.

References

https://www.securecoding.cert.org/confluence/pages/viewpage.action?pageId=70288108

Database and Webapp Security, part 2: SQL Injection

No Comments

What is SQL Injection?

SQL Injection is the ability of attackers to insert arbitrary SQL commands into your system.

Sample attack

Look at the following code:

ResultSet rs = stmt.executeQuery(
   "select * from users where username='" + username +
   "' and password='" + password + "'");

What could go wrong? Let’s say we use the following values:

String username = "bob' or 1=1; --";
String password = "dont care";

When we call the earlier code the generated code is

select * from users where username='bob' or 1=1; --'
   and password='dont care'

This will match every user. Some web frameworks will then list all users in the system. More carefully written applications will raise an alarm if more than one record is returned. That is easy for the attacker to work around:

String username = "bob' or 1=1 order by userid limit 1; --";
String password = "dont care";

to produce

select * from users where username='bob' or 1=1
  order by userid limit 1; --' and password='dont care'

The ‘order by’ stanza ensures we see the first user in the system. That’s normally the administrator – something attackers do not forget.

Wrong Approach

Many inexperienced programmers attempt to get around this problem by explicitly sanitizing the user-provided input.

ResultSet rs = stmt.executeQuery(
  "select * from users where username='" +
  username.replaceAll("'", "''") +
  "' and password='" + password.replaceAll("'", "''") + "'");

This might have worked in the 1980s but the world uses more than ASCII today. Properly identifying quote characters is a non-trivial problem and should be left to others. JDBC driver writers often provide database-specific methods for this but they can get out of sync with the database and are, of course, database-specific.

Prepared Statements and Placeholders

The standard solution to this problem is to use prepared statements and placeholders. This replaces the code

ResultSet rs = stmt.executeQuery(
  "select * from users where username='" +
  username + "' and password='" + password + "'");

with

PreparedStatement stmt = conn.prepareStatement(
  "select * from users where username=? and password=? limit 1");
stmt.setString(1, username);
stmt.setString(2, password);
ResultSet rs = stmt.executeQuery();

Limitations

There are times when prepared statements are inappropriate. One common example is multi-insert statements. These can be significantly faster than multiple prepared statement calls.

An example of a multi-insert statement is

insert into squares(x, y)
   values (1, 1),
          (2, 4),
          (3, 9),
          (4, 16),
          (5, 25);

As a general rule this should not be used with user-provided data. If it is absolutely required use the database-specific method provided by your JDBC provider, not a roll-your-own solution.
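
One further option, which is my suggestion rather than something from the original list, is to generate the multi-row statement with placeholders so the values still never touch the SQL text. The class name is made up and the squares table is the one from the example above; drivers and databases limit how many parameters one statement may carry, so large batches would need to be chunked.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper; only the squares(x, y) table comes from the example above.
final class MultiRowInsert {
    /** Builds "insert into squares(x, y) values (?, ?), (?, ?), ..." for the row count. */
    static String sql(int rows) {
        if (rows < 1) {
            throw new IllegalArgumentException("need at least one row");
        }
        StringBuilder sb = new StringBuilder("insert into squares(x, y) values ");
        for (int i = 0; i < rows; i++) {
            sb.append(i == 0 ? "(?, ?)" : ", (?, ?)");
        }
        return sb.toString();
    }

    /** Binds every value through a placeholder and runs one statement. */
    static int insert(Connection conn, int[][] pairs) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(sql(pairs.length))) {
            int index = 1;
            for (int[] pair : pairs) {
                stmt.setInt(index++, pair[0]);
                stmt.setInt(index++, pair[1]);
            }
            return stmt.executeUpdate();
        }
    }
}
```

Only the number of rows influences the generated text, so there is nothing for an attacker to inject into.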

Cost/Benefit Analysis

The cost/benefit analysis to using prepared statement placeholders is irrelevant – it’s one of those things that you simply have to do.

Database and Webapp Security, part 1: Threat Model

No Comments

This is the first of a series of discussions of database and webapp security loosely based on the quick reference page on my site. That page is becoming unwieldy and does not make it easy for readers to interact with me or others.

Threat Model

All security analysis must begin by examining the threat model. A threat model requires you to answer four questions:

  • what am I trying to protect?
  • from whom?
  • for how long?
  • and at what (net) cost?

What am I trying to protect?

This is the obvious place to start… and your first answer is probably wrong! What I mean by that is that you may answer “the database password” but that’s not quite right. What you actually want to protect is access to the database as that user – an attacker might be able to find a way into the database without the password, e.g., via SQL injection.

But wait, that’s not quite right either! Our real concern is preventing the attacker from using that access to cause damage, learn sensitive information, and so forth. At this point we should enumerate our actual concerns, e.g., our database may contain

  • user content
  • financial information
  • user authentication and authorization
  • logs
  • static content

The way we access and use this information is varied

  • user content – need ongoing read/write access
  • financial information – need an oracle (for approval) and can leave details to fulfillment process
  • user authentication and authorization – need an oracle (for approval and authorizations) when a user logs in but never afterwards (oracle)
  • logs - need ongoing append-only access (oracle)
  • static content – need read-only access on startup (oracle)

(All of the access is modulo the need for maintenance.)

An oracle is a standalone method that takes (optional) values and returns either true or false. A bit more generally it can return any self-contained, immutable object. A good implementation choice for an oracle is a stored procedure in the database; a better choice would be a REST call to another webapp using an independent database.

Two examples of oracles:

User authentication: Use an oracle that takes a username and password and returns a boolean value indicating whether the credentials are valid. (Alternative: return a full authn/authz structure upon success.) The non-oracle approach is for the application to do a query on the user and password tables and compare the passwords itself.
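The shape of such an authentication oracle can be sketched in Java. This is only an illustration under stated assumptions: the names AuthOracleSketch, AuthenticationOracle and InMemoryOracle are hypothetical, the in-memory maps stand in for the stored procedure or REST service discussed above, and the use of PBKDF2 with 100,000 iterations is my choice for the sketch rather than anything the oracle approach requires.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

public class AuthOracleSketch {

    /** The oracle: credentials go in, a yes/no answer comes out. */
    interface AuthenticationOracle {
        boolean authenticate(String username, char[] password);
    }

    /** Hypothetical in-memory stand-in for a stored procedure or REST service. */
    static final class InMemoryOracle implements AuthenticationOracle {
        private static final int ITERATIONS = 100_000; // assumed work factor
        private static final int KEY_BITS = 256;

        private final Map<String, byte[]> salts = new HashMap<>();
        private final Map<String, byte[]> hashes = new HashMap<>();
        private final SecureRandom random = new SecureRandom();

        /** Stores a per-user salt and the salted hash, never the password. */
        void register(String username, char[] password) {
            byte[] salt = new byte[16];
            random.nextBytes(salt);
            salts.put(username, salt);
            hashes.put(username, hash(password, salt));
        }

        @Override
        public boolean authenticate(String username, char[] password) {
            byte[] salt = salts.get(username);
            if (salt == null) {
                return false; // unknown user gets the same answer as a bad password
            }
            // MessageDigest.isEqual is intended to be a constant-time comparison.
            return MessageDigest.isEqual(hashes.get(username), hash(password, salt));
        }

        private static byte[] hash(char[] password, byte[] salt) {
            PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
            try {
                return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                        .generateSecret(spec).getEncoded();
            } catch (GeneralSecurityException e) {
                throw new IllegalStateException("PBKDF2 is not available", e);
            } finally {
                spec.clearPassword();
            }
        }
    }

    public static void main(String[] args) {
        InMemoryOracle oracle = new InMemoryOracle();
        oracle.register("alice", "correct horse".toCharArray());
        System.out.println(oracle.authenticate("alice", "correct horse".toCharArray()));
        System.out.println(oracle.authenticate("alice", "wrong".toCharArray()));
    }
}
```

The webapp only ever sees the AuthenticationOracle interface, so even a fully compromised webapp cannot read the stored hashes; it can only ask the same yes/no question a login form can.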

Credit card authentication: Use an oracle that takes the credit card information and amount of purchase and returns either a confirmation number or an error indication. The application can rely on the oracle keeping a copy of previously provided values (but not the CVV!) so the user doesn’t have to fill out the same information every time. The non-oracle approach is for the application to bundle the information itself.
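A similar sketch for the credit card case, again with hypothetical names (PaymentOracleSketch, Result, FakePaymentOracle). The validation rules and the random confirmation number are placeholders I made up for illustration; a real oracle would forward the request to a payment processor. What the sketch is meant to show is the return type (a self-contained, immutable confirmation-or-error object) and that the oracle remembers the card number and expiry but never the CVV.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class PaymentOracleSketch {

    /** Immutable answer: either a confirmation number or an error message. */
    static final class Result {
        final String confirmation; // null on failure
        final String error;        // null on success

        private Result(String confirmation, String error) {
            this.confirmation = confirmation;
            this.error = error;
        }

        static Result approved(String confirmation) {
            return new Result(confirmation, null);
        }

        static Result declined(String error) {
            return new Result(null, error);
        }

        boolean isApproved() {
            return confirmation != null;
        }
    }

    /** Hypothetical stand-in; a real oracle would call a payment processor. */
    static final class FakePaymentOracle {
        // Remembered per user: card number and expiry only, never the CVV.
        private final Map<String, String[]> savedCards = new HashMap<>();

        Result authorize(String user, String cardNumber, String expiry,
                         String cvv, BigDecimal amount) {
            if (amount == null || amount.signum() <= 0) {
                return Result.declined("invalid amount");
            }
            if (cardNumber == null || !cardNumber.matches("\\d{12,19}")) {
                return Result.declined("invalid card number");
            }
            if (cvv == null || !cvv.matches("\\d{3,4}")) {
                return Result.declined("invalid CVV");
            }
            savedCards.put(user, new String[] {cardNumber, expiry});
            return Result.approved(UUID.randomUUID().toString());
        }

        /** Repeat purchase: the user supplies only the CVV and the amount. */
        Result authorizeSaved(String user, String cvv, BigDecimal amount) {
            String[] card = savedCards.get(user);
            if (card == null) {
                return Result.declined("no saved card");
            }
            return authorize(user, card[0], card[1], cvv, amount);
        }
    }

    public static void main(String[] args) {
        FakePaymentOracle oracle = new FakePaymentOracle();
        Result first = oracle.authorize("alice", "4111111111111111", "12/30",
                "123", new BigDecimal("19.99"));
        Result repeat = oracle.authorizeSaved("alice", "123", new BigDecimal("5.00"));
        System.out.println(first.isApproved() + " " + repeat.isApproved());
    }
}
```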

The point I’m making here is that deciding what needs to be protected is an architectural question and a bit of foresight can have a dramatic impact on the threat model. You want as little exposure to untrusted users (e.g., the webapp) as possible and small changes can make big differences.

Last but not least there is one other thing that should be protected: your reputation. Not the company’s – the developer’s. What do you say when you get a phone call from the president of the company demanding to know why the company will be the lead story on the nightly news? You can’t protect against all attacks but you don’t want to be left speechless when someone demands to know why you didn’t take basic steps to protect the system.

From whom?

Everyone.

Okay, I kid. But there’s a far broader list than you first think.

  • fumble-fingered employees. We’ve all done this. They already have legitimate access.
  • disgruntled employees, especially the soon-to-be-former employees. They already have legitimate access and motivation.
  • script kiddies. We tend to think of them as unsophisticated but they may be running cracking tools written by experts. They’ll probably move on to easier targets if your site is reasonably secure.
  • advanced persistent threats (APT). These are the people who have a strong motivation and strong technical skills. Assume they will get in.

This list is far from exhaustive and listing additional ‘potential attackers’ is left as an exercise for the reader.

For how long?

At the risk of being obvious, there are three broad categories:

  • Information or access that must be protected until a specific date in the near future and is then open knowledge, e.g., corporate financial reports.
  • Information or access that has declining value over time.
  • Information or access that must be protected forever, e.g., confidential legal and medical documents.

The first category is straightforward since the best current algorithms and attacks are known and the attacker has limited time to work.

The last category is difficult since we can’t predict future attacks. Some things that were impossible 10 years ago are now run-of-the-mill. One good bit of advice: things we don’t keep are things we don’t need to protect. Keep as little as possible but no less.

A full analysis is beyond the scope of this blog entry but this is an important concern that should not be trivialized.

At what (net) cost?

“Cost” is a flexible concept since there are so many indirect and inferred costs. E.g., what’s the cost of making it harder for people to do their work… or is it cheaper since the system won’t be down for days after a breach? What’s the cost of people leaving the site in frustration vs. the benefit of people not leaving the site en masse after a breach at your site makes the national news?

The bottom line is that this is ultimately a non-technical question. All you can do is identify the direct and indirect concerns and let the powers-that-be make the final determination.

Putting it together

The bottom line is that the threat model is ultimately a business decision. We can provide analysis and recommendations but the ultimate decision has to come from above.

That said there are many things we can do on our own initiative. This series will address some of them.
