Invariant Properties

Embedded KDC Server using Apache MiniKDC

Bear Giles | November 19, 2017

One of the biggest headaches when working with Kerberos is that you need to set up external files in order to use it. That should be a simple one-time change but it can introduce subtle issues such as forcing developers to be on the corporate VPN when doing a build on their laptop.

The Hadoop developers already have a solution – the “MiniKDC” embedded KDC server. This class can be used to create a temporary KDC in the build environment that eliminates any need for external files or network resources. This approach comes at a cost – on my system it takes about 15 seconds to stand up the embedded server. You don’t want to run these tests on every build but a brief delay during integration tests is better than introducing a dependency on a VPN and a running server.
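
One way to keep these slow tests out of the default build is JUnit 4 categories. A minimal sketch, assuming a JUnit 4 project (the IntegrationTest marker interface is illustrative, and wiring the category into the build, e.g., via the Surefire/Failsafe groups settings, is left to your build tool):

import org.junit.Test;
import org.junit.experimental.categories.Category;

/** Marker interface used only to tag slow integration tests. */
interface IntegrationTest {
}

@Category(IntegrationTest.class)
public class EmbeddedKdcIT {

    @Test
    public void testThatNeedsTheEmbeddedKdc() {
        // tests that stand up the embedded KDC go here
    }
}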

Update regarding ticket caches on 11/22/2017

Important update on ticket caches (and TGT) on 11/22/2017. I need to emphasize that the standard implementation of the Krb5LoginModule does not create ticket cache files. That may not be clear below. I will post a followup article that discusses using the external kinit program to create Kerberos ticket caches.

Embedded Servers and JUnit 4 Rules

Modern test frameworks have a way to stand up test resources before tests run and tear them down afterwards. With JUnit 4 this is done with Rules. A rule is an annotation that the test runner recognizes and knows how to use. The details differ in other test frameworks (and in JUnit 5) but the underlying concepts are the same.

An embedded KDC is a class-level external resource.

public class EmbeddedKdcResource extends ExternalResource {
    private final File baseDir;
    private MiniKdc kdc;

    public EmbeddedKdcResource() {
        try {
            baseDir = Files.createTempDirectory("mini-kdc_").toFile();
        } catch (IOException e) {
            // throw AssertionError so we don't have to deal with handling declared
            // exceptions when creating a @ClassRule object.
            throw new AssertionError("unable to create temporary directory: " + e.getMessage());
        }
    }

    /**
     * Start KDC.
     */
    @Override
    public void before() throws Exception {

        final Properties kdcConf = MiniKdc.createConf();
        kdcConf.setProperty(MiniKdc.INSTANCE, "DefaultKrbServer");
        kdcConf.setProperty(MiniKdc.ORG_NAME, "EMBEDDED");
        kdcConf.setProperty(MiniKdc.ORG_DOMAIN, "INVARIANTPROPERTIES.COM");

        // several sources say to use extremely short lifetimes in test environment.
        // however setting these values results in errors.
        //kdcConf.setProperty(MiniKdc.MAX_TICKET_LIFETIME, "15_000");
        //kdcConf.setProperty(MiniKdc.MAX_RENEWABLE_LIFETIME, "30_000");

        kdc = new MiniKdc(kdcConf, baseDir);
        kdc.start();

        // this is the standard way to set the default location of the JAAS config file.
        // we don't need to do this since we handle it programmatically.
        //System.setProperty("java.security.krb5.conf", kdc.getKrb5conf().getAbsolutePath());
    }

    /**
     * Shut down KDC.
     */
    @Override
    public void after() {
        if (kdc != null) {
            kdc.stop();
        }
    }

    /**
     * Get realm.
     */
    public String getRealm() {
        return kdc.getRealm();
    }

    /**
     * Create a keytab file with entries for specified user(s).
     *
     * @param keytabFile
     * @param names
     * @throws Exception
     */
    public void createKeytabFile(File keytabFile, String... names) throws Exception {
        kdc.createPrincipal(keytabFile, names);
    }
}

Functional Tests

Once we have an embedded KDC we can quickly write tests that attempt to get a JAAS LoginContext using Kerberos authentication. We call it a success if LoginContext#login() succeeds.

public class BasicKdcTest {

    @ClassRule
    public static final TemporaryFolder tmpDir = new TemporaryFolder();

    @ClassRule
    public static final EmbeddedKdcResource kdc = new EmbeddedKdcResource();

    private static KerberosPrincipal alice;
    private static KerberosPrincipal bob;
    private static File keytabFile;
    private static File ticketCacheFile;

    private KerberosUtilities utils = new KerberosUtilities();

    @BeforeClass
    public static void createKeytabs() throws Exception {
        // create Kerberos principal and keytab filename.
        alice = new KerberosPrincipal("alice@" + kdc.getRealm());
        bob = new KerberosPrincipal("bob@" + kdc.getRealm());
        keytabFile = tmpDir.newFile("users.keytab");
        ticketCacheFile = tmpDir.newFile("krb5cc_alice");

        // create keytab file containing key for Alice but not Bob.
        kdc.createKeytabFile(keytabFile, "alice");

        assertThat("ticket cache does not exist", ticketCacheFile.exists(), equalTo(true));
    }

    /**
     * Test LoginContext login without TGT ticket (success).
     *
     * @throws LoginException
     */
    @Test
    public void testLoginWithoutTgtSuccess() throws LoginException {
        final LoginContext lc = utils.getKerberosLoginContext(alice, keytabFile);
        lc.login();
        assertThat("subject does not contain expected principal", lc.getSubject().getPrincipals(),
                contains(alice));
        lc.logout();
    }

    /**
     * Test LoginContext login without TGT ticket (unknown user). This only
     * tests for missing keytab entry, not a valid keytab file with an unknown user.
     *
     * @throws LoginException
     */
    @Test(expected = LoginException.class)
    public void testLoginWithoutTgtUnknownUser() throws LoginException {
        @SuppressWarnings("unused")
        final LoginContext lc = utils.getKerberosLoginContext(bob, keytabFile);
    }

    /**
     * Test getKeyTab() method (success)
     */
    @Test
    public void testGetKeyTabSuccess() throws LoginException {
        assertThat("failed to see key", utils.getKeyTab(alice, keytabFile), notNullValue());
    }

    /**
     * Test getKeyTab() method (unknown user)
     */
    @Test(expected = LoginException.class)
    public void testGetKeyTabUnknownUser() throws LoginException {
        assertThat("failed to see key", utils.getKeyTab(bob, keytabFile), notNullValue());
    }
}

Next Steps

The next article will discuss the Apache Hadoop UserGroupInformation class and how it connects to JAAS authentication.

Source

You can download the source for this article here: JAAS with Kerberos; Unit Test using Apache Hadoop Mini-KDC.


JAAS without configuration files; JAAS and Kerberos

Bear Giles | November 19, 2017

Java’s JAAS abstraction is a powerful tool to handle authentication but it has two major weaknesses in practice. First, nearly all of the discussions on how to use it assume that the developer can write the JAAS configuration file to a secure location. That may not be easy in a hosted environment. Second, JAAS has some unexpected behavior that made sense at the time but which can bite developers today. Neither is difficult to overcome once you know the solution.

This article will discuss the solution to these problems and give a concrete example using Kerberos authentication. Kerberos is widely used in the Hadoop ecosystem and a future article will discuss how to use the Hadoop-specific UserGroupInformation class.

Limitations

Unfortunately this code does not completely eliminate the need for external files. First, we must still provide an explicit Kerberos keytab file. I will update this article if I find an approach that eliminates this limitation.

Second, we must still provide an external krb5.conf configuration file. This is required by the Krb5LoginModule class and I think we’re limited to changing the location of this file.
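
We can at least point the JVM at a krb5.conf that we generate or bundle ourselves rather than relying on /etc/krb5.conf. A minimal sketch (the file location is illustrative); note the property must be set before the Kerberos classes are first loaded:

// point the JVM at our own copy of krb5.conf, e.g., one written
// by the build or extracted from the application bundle.
System.setProperty("java.security.krb5.conf",
        new java.io.File("conf/krb5.conf").getAbsolutePath());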

Update regarding ticket caches on 11/22/2017

Important update on ticket caches (and TGT) on 11/22/2017. I need to emphasize that the standard implementation of the Krb5LoginModule does not create ticket cache files. That may not be clear below. I will post a followup article that discusses using the external kinit program to create Kerberos ticket caches.

The JAAS Configuration Class

The first issue to discuss is eliminating the need for a JAAS configuration file. We want to be able to configure JAAS programmatically, perhaps using information provided via a traditional database or a cloud discovery service such as Spring Cloud Config. JAAS provides an oft-overlooked class that can replace the external configuration: javax.security.auth.login.Configuration. The most general solution is to create a class that accepts a Map in its constructor and uses it to create an array of AppConfigurationEntry values.

class CustomLoginConfiguration extends javax.security.auth.login.Configuration {
    private static final String SECURITY_AUTH_MODULE_KRB5_LOGIN_MODULE =
            "com.sun.security.auth.module.Krb5LoginModule";

    private final Map<String, AppConfigurationEntry> entries = new HashMap<>();

    /**
     * Constructor taking a Map of parameters
     */
    public CustomLoginConfiguration(Map<String, Map<String, String>> params) {
        for (Map.Entry<String, Map<String, String>> entry : params.entrySet()) {
            entries.put(entry.getKey(),
                    new AppConfigurationEntry(SECURITY_AUTH_MODULE_KRB5_LOGIN_MODULE,
                AppConfigurationEntry.LoginModuleControlFlag.REQUIRED, entry.getValue()));
        }
    }

    /**
     * Get entry.
     */
    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        if (entries.containsKey(name)) {
            return new AppConfigurationEntry[] { entries.get(name) };
        }
        return new AppConfigurationEntry[0];
    }
}

In practice we’ll only need one or two JAAS configurations in our application and it may be more maintainable to write a convenience class. It is very easy to use an external configuration file to identify the required properties and then convert the final file into a static method that populates the Map.

import static java.lang.Boolean.TRUE;

class Krb5WithKeytabLoginConfiguration extends CustomLoginConfiguration {

    /**
     * Constructor taking basic Kerberos properties.
     *
     * @param serviceName JAAS service name
     * @param principal Kerberos principal
     * @param keytabFile keytab file containing key for this principal
     */
    public Krb5WithKeytabLoginConfiguration(String serviceName, KerberosPrincipal principal, File keytabFile) {
        super(Collections.singletonMap(serviceName, makeMap(principal, keytabFile)));
    }

    /**
     * Static method that creates the Map required by the parent class.
     *
     * @param principal Kerberos principal
     * @param keytabFile keytab file containing key for this principal
     */
    private static Map<String, String> makeMap(KerberosPrincipal principal, File keytabFile) {
        final Map<String, String> map = new HashMap<>();

        // this is the basic Kerberos information
        map.put("principal", principal.getName());
        map.put("useKeyTab", TRUE.toString());
        map.put("keyTab", keytabFile.getAbsolutePath());

        // 'fail fast'
        map.put("refreshKrb5Config", TRUE.toString());

        // we're doing everything programmatically so we never want to prompt the user.
        map.put("doNotPrompt", TRUE.toString());
        return map;
    }
}

The JAAS CallbackHandler and LoginContext Classes

A nasty surprise for many developers is that the JAAS implementation will fall back to a non-trivial default implementation if a custom CallbackHandler is not provided. At best this will result in confusing error messages, at worst an attacker can override the default implementation with one that is much more open than the developer intended.

Fortunately this is easy to handle when creating the JAAS LoginContext. We could use an empty handler method but it doesn’t hurt to log any messages in case there’s a problem.

This example uses the LoginContext method that takes a suggested Subject. It may be possible to provide the keytab information via the Subject’s private credentials instead of passing in an explicit file location via the ‘keyTab’ property but I haven’t found it yet. I’ve left the code in place in case it will help others.

class KerberosUtilities {
    private static final Logger LOG = LoggerFactory.getLogger(KerberosUtilities.class);

    /**
     * Get JAAS LoginContext for specified Kerberos parameters
     *
     * @param principal Kerberos principal
     * @param keytabFile keytab file containing key for this principal
     */
    public LoginContext getKerberosLoginContext(KerberosPrincipal principal, File keytabFile)
            throws LoginException, ConfigurationException {

        final KeyTab keytab = getKeyTab(principal, keytabFile);

        // create Subject containing basic Kerberos parameters.
        final Set<Principal> principals = Collections.<Principal> singleton(principal);
        final Set<?> pubCredentials = Collections.emptySet();
        final Set<?> privCredentials = Collections.<Object> singleton(keytab);
        final Subject subject = new Subject(false, principals, pubCredentials, privCredentials);

        // create LoginContext using this subject.
        final String serviceName = "krb5";
        final LoginContext lc = new LoginContext(serviceName, subject,
                new CallbackHandler() {
                    public void handle(Callback[] callbacks) {
                        for (Callback callback : callbacks) {
                            if (callback instanceof TextOutputCallback) {
                                LOG.info(((TextOutputCallback) callback).getMessage());
                            }
                        }
                    }
                }, new Krb5WithKeytabLoginConfiguration(serviceName, principal, keytabFile));

        return lc;
    }

    /**
     * Convenience method that verifies keytab file exists, is readable, and contains appropriate entry.
     */
    public KeyTab getKeyTab(KerberosPrincipal principal, File keytabFile)
            throws LoginException {

        if (!keytabFile.exists() || !keytabFile.canRead()) {
            throw new LoginException("specified file does not exist or cannot be read");
        }

        // verify the file is actually a keytab file
        KeyTab keytab = KeyTab.getInstance(principal, keytabFile);
        if (!keytab.exists()) {
            throw new LoginException("specified file is not a keytab file");
        }

        // verify keytab file actually contains at least one key for this principal.
        KerberosKey[] keys = keytab.getKeys(principal);
        if (keys.length == 0) {
            throw new LoginException("keytab file does not contain required entry");
        }

        // destroy keys since we don't need them, we just need to make sure they exist.
        for (KerberosKey key : keys) {
            try {
                key.destroy();
            } catch (DestroyFailedException e) {
                LOG.debug("unable to destroy key");
            }
        }

        return keytab;
    }
}

The Final Bits for Kerberos

There is one final problem. The Krb5LoginModule expects the Kerberos configuration file to be located at a standard location, typically /etc/krb5.conf on Linux systems. This location can be overridden with the java.security.krb5.conf system property.

The default realm is usually set in the Kerberos configuration file. You can override it with the java.security.krb5.realm system property, but note that java.security.krb5.realm and java.security.krb5.kdc only take effect when both are set.
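
As a sketch, both properties can be set programmatically before any Kerberos class is loaded (the realm and host below are illustrative values):

// bypass the krb5.conf lookup entirely by naming the realm and KDC.
// java.security.krb5.realm and java.security.krb5.kdc must be set
// together or they are ignored.
System.setProperty("java.security.krb5.realm", "EXAMPLE.COM");
System.setProperty("java.security.krb5.kdc", "kdc.example.com");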

Cloudera (Hadoop Cluster)

You must set one additional system property, at least when using Cloudera clients with Hive:

  • javax.security.auth.useSubjectCredsOnly=false

For more information on this see Hive JDBC client error when connecting to Kerberos Cloudera cluster.

Debugging

Finally there are several useful system properties if you are stuck:

  • sun.security.krb5.debug=true
  • java.security.debug=gssloginconfig,configfile,configparser,logincontext

Next Steps

The next article will discuss writing unit tests using an embedded KDC server.

Source

You can download the source for this article here: JAAS with Kerberos; Unit Test using Apache Hadoop Mini-KDC.


Building Hadoop on Ubuntu 16.10

Bear Giles | January 2, 2017

Edge Nodes and Rolling your own Hadoop Packages

I must answer an important question before I start. Why would anyone want to build Hadoop themselves? Isn’t it much saner to use one of the commercial distributions like Cloudera or Hortonworks? Both have ‘express’ versions that are free to use and ideal for developers and small-scale testing. (They’re distributed as VMWare images but it’s straightforward to convert a VMWare image into an AWS image that can be run on an EC2 instance. In fact it’s an item on my to-blog list!) The Cloudera Express version also has an option that gives you a straight Hadoop cluster without the Cloudera enhancements.

Why bother building our own packages?

The answer is edge nodes. The classic Hadoop environment, e.g., what you’ll see in a Coursera specialization, involves a tidy Hadoop cluster that has map/reduce jobs uploaded to it and run. Everything goes through a handful of clean interfaces.

In practice any site that needs a Hadoop cluster will probably have its own software that solves its business needs and that software uses the Hadoop cluster as a resource no differently than an existing database, mail, or jms service. It needs access to the cluster but only via the well-defined wire protocol. In addition any CISO on the ball will want to keep the Hadoop cluster tucked away on its own locked-down VPC, one that has the Hadoop cluster on it but nothing else. She’ll also want anything that talks to the cluster on a relatively locked-down VPC, ideally one that’s not directly accessible from the corporate VPN much less the internet at large.

Hence ‘edge nodes’. The idea is that a Hadoop cluster consists of two types of nodes. Compute nodes run the actual Hadoop services; edge nodes do not, but they are able to communicate with those services. They’re often the only way to communicate with the services. Edge nodes can be located in the same VPC as the compute nodes or in a different VPC. There are benefits and drawbacks to both approaches.

Setting up an edge node is pretty straightforward. Besides access to the Hadoop compute nodes you need:

  • The configuration files from the cluster (typically in /etc/hadoop)
  • The appropriate client jars from the cluster
  • The Hadoop client programs (e.g., ‘hadoop’ or ‘hdfs’) if you’ll do anything from the command line or in scripts
  • The shared libraries for native code. (Optional but it improves performance)

In addition you’ll need the Kerberos client apps and the /etc/krb5.conf file if you use Kerberos authentication within the cluster. Kerberos is a good idea even if you’re on a dedicated VPC. The commercial distributions make it easy to set up if you don’t already have a corporate KDC. Again there are benefits and drawbacks to using the KDC bundled in the Hadoop distribution vs. using a corporate KDC.

Developer systems will also often be set up as edge nodes to development-internal clusters.

So Why Do We Need To Build Our Own Packages?

We have commercial distributions. They provide free ‘express’ versions and even a free prebundled standard Hadoop cluster. So why would we possibly want to build our own packages?

There are two reasons. The first is the most obvious – we want or need to use the version of a service that’s not supported by the commercial distribution. Cloudera takes a very conservative approach so even the newest release contains older versions of the services, albeit ones that have often had new features backported to them. Hortonworks takes a more aggressive approach and will have newer versions of the services but there’s no guarantee that it will have the version you need. In these cases you need to provide the tarball yourself and it’s often better to build it locally than to download a prebuilt image in order to reduce the risk of a nasty surprise because of unmet expectations in the system libraries. That’s rare but it can be a real pain to track down when it occurs.

The second reason is more subtle. There’s no requirement that the edge nodes run the same operating system as the compute nodes. This could be a small change, e.g., a developer laptop running Ubuntu vs. a CDH cluster running on Red Hat, or it could be more substantial such as a MacBook developer laptop. In the case of Ubuntu it should be possible to copy the Red Hat files but again rebuilding the packages on your own system ensures there won’t be any surprises.

The Easy (But Incomplete) Solutions

You can build the most recent versions of Hadoop by cloning https://github.com/apache/hadoop and running the start-build-env.sh script. That creates a Docker environment with all of the required libraries so builds will go quickly.

$ git clone https://github.com/apache/hadoop
$ cd hadoop
$ ./start-build-env.sh

Unfortunately that script was introduced in Hadoop 2.8. I need to support earlier versions and that means building Hadoop on the command line instead of a preconfigured Docker container. At first glance this is straightforward:

#
# get source...
#
$ git clone https://github.com/apache/hadoop
$ cd hadoop
$ git checkout release-2.5.0

#
# install development libraries
#
$ sudo apt-get install libssl-dev zlib1g-dev libbz2-dev libsnappy-dev

#
# install Kerberos client and development libraries (just in case)
#
$ sudo apt-get install krb5-user libkrb5-dev

#
# install required tools
#
$ sudo apt-get install cmake protobuf-compiler

#
# build Hadoop distribution tarball including native libraries
#
$ mvn package -Pdist,native -DskipTests -Dtar -Dmaven.javadoc.skip=true -Drequire.snappy -Drequire.openssl

At this point we should fly along… until we do a good impression of a bug hitting a windshield. Ubuntu 16.10 provides version 3.0.0 of the protocol buffer tool but Hadoop has a hard requirement on version 2.5.0. That version hasn’t been supported by Ubuntu since 14.04.

We could manually download and install the 14.04 .deb packages but that runs the risk of getting into library dependency hell. There is a better solution.

Building Protocol Buffers 2.5.0 on Ubuntu 16.10

The solution is a bit of Ubuntu-fu. We don’t want to install the Ubuntu 14.04 binary packages but there’s no problem downloading the Ubuntu 14.04 source package and rebuilding it in our Ubuntu 16.10 environment. We can then safely install these packages as either a direct replacement or in a separate location (using ‘dpkg --root=dir’) without worrying about introducing other outdated libraries.

#
# download source package. There's a way to do this with dpkg-source but with older source packages
# I prefer to do it manually.
#
$ wget https://launchpad.net/ubuntu/+archive/primary/+files/protobuf_2.5.0.orig.tar.gz
$ wget https://launchpad.net/ubuntu/+archive/primary/+files/protobuf_2.5.0-9ubuntu1.debian.tar.gz
$ wget https://launchpad.net/ubuntu/+archive/primary/+files/protobuf_2.5.0-9ubuntu1.dsc

#
# unpack source
#
$ dpkg-source --extract protobuf_2.5.0-9ubuntu1.dsc

#
# build binary packages. this will take awhile.
#
# note: you might need to install additional packages in order to build this package.
# Build dependencies are listed in the control file under "Build-Depends".
#
$ cd protobuf-2.5.0
$ dpkg-buildpackage -us -uc -nc

#
# the binary packages are now available in the original directory. You can install
# them using 'dpkg', or 'dpkg --root=dir' if you want them to exist in parallel with
# the current libraries. In the latter case you will need to specify the new location
# when you build hadoop.
#
$ cd ..
$ sudo dpkg -i *deb

Note: your system will revert to the 3.0.0 version with the next ‘apt-get upgrade’ unless you pin the version at 2.5.0.

Finishing and Deploying the Build

We can now finish the build. When it is done there will be a large .tar.gz file in the hadoop-dist/target directory. For instance hadoop-2.5.0.tar.gz is over 133 MB. This file is traditionally untarred in the /opt directory.

#
# untar package
#
$ sudo tar xzf hadoop-dist/target/hadoop-2.5.0.tar.gz -C /opt

#
# create symlink to make life easier
#
$ cd /opt
$ sudo ln -s hadoop-2.5.0 hadoop

#
# make the native libraries available
# (note: file must be created as root. showing 'echo' for convenience.)
#
$ echo /opt/hadoop/lib/native > /etc/ld.so.conf.d/hadoop.conf
$ sudo ldconfig

#
# verify shared libraries are now visible
#
$ ldconfig -p | grep hadoop

#
# add hadoop binaries to PATH. Note: in the long term you'll want to update
# /etc/profile or ~/.bashrc.
#
$ export PATH=$PATH:/opt/hadoop/bin

#
# set HADOOP_HOME. Or is it HADOOP_COMMON_HOME? HADOOP_PREFIX? This seems to change between
# Hadoop versions so check your documentation.
#

In total the contents of /opt/hadoop include seven directories: bin, etc, include, lib, libexec, sbin, and share. Edge nodes need to keep bin, lib, libexec, share/doc, and the client libraries from share/hadoop. The easiest way to find them is

$ find /opt/hadoop/share/hadoop/common/lib
$ find /opt/hadoop/share/hadoop -name "*-client*-2.5.0.jar"
$ find /opt/hadoop/share/hadoop -name "*-common*-2.5.0.jar"

In my case that’s

  • ./common/hadoop-common-2.5.0.jar
  • ./common/hadoop-nfs-2.5.0.jar (needed?)
  • ./common/lib/hadoop-annotations-2.5.0.jar
  • ./common/lib/hadoop-auth-2.5.0.jar
  • ./mapreduce/hadoop-mapreduce-client-app-2.5.0.jar
  • ./mapreduce/hadoop-mapreduce-client-common-2.5.0.jar
  • ./mapreduce/hadoop-mapreduce-client-core-2.5.0.jar
  • ./mapreduce/hadoop-mapreduce-client-hs-2.5.0.jar
  • ./mapreduce/hadoop-mapreduce-client-hs-plugins-2.5.0.jar
  • ./mapreduce/hadoop-mapreduce-client-jobclient-2.5.0.jar
  • ./mapreduce/hadoop-mapreduce-client-shuffle-2.5.0.jar
  • ./yarn/hadoop-yarn-client-2.5.0.jar
  • ./yarn/hadoop-yarn-common-2.5.0.jar
  • ./yarn/hadoop-yarn-server-common-2.5.0.jar (needed?)

plus a large number of third-party libraries. Some of the less common ones that are unlikely to already be in your app include

  • ./common/lib/apacheds-i18n-2.0.0-M15.jar
  • ./common/lib/apacheds-kerberos-codec-2.0.0-M15.jar
  • ./common/lib/avro-1.7.4.jar
  • ./common/lib/guava-11.0.2.jar
  • ./common/lib/jsch-0.1.42.jar
  • ./common/lib/paranamer-2.3.jar
  • ./common/lib/protobuf-java-2.5.0.jar
  • ./common/lib/snappy-java-1.0.4.1.jar
  • ./common/lib/zookeeper-3.4.6.jar

We don’t need the contents of the /opt/hadoop/etc directory – we should use a copy of the configuration files from one of the compute nodes. We don’t need the contents of the include, sbin, or rest of the shared directories since they are only required when we run the Hadoop services.

Distribution

We don’t need to go through this process on every edge node – in the case of Ubuntu it’s easy to create a binary package for redistribution via ‘dpkg -b’. We have to follow a few simple rules and we’ll have a package that can be safely installed, updated, and removed. One huge benefit of using a binary package is that we can safely put the files in their standard places instead of the /opt directory.

I’m not familiar with the RPM creation process but I’m sure it’s equally easy to do in that environment.

Finally I am debating creating a Debian/Ubuntu PPA with these packages for multiple Hadoop projects and versions. Watch this blog for announcements.

Other Hadoop Projects

There are other Hadoop projects that we will want to bundle for edge nodes. One good example is Hive – building the package from source gives us the ‘beeline’ and ‘hplsql’ command line tools. The process should go smoothly once you have an environment that can build the main Hadoop project. Just be careful to examine the pom file since available profiles and final distribution location will differ.


DataSource Classloader Headaches

Bear Giles | January 2, 2017

I haven’t been posting since I’ve been very busy learning Hadoop + Kerberos for multiple client environments and getting into shape before it’s too late. (I know it’s “never too late” in principle but I’m seeing family and friends my age who are now unable to do hard workouts due to medical issues. For them it is “too late” to get into better shape so this is no longer an abstract concern for me.)

Part of my broader work is supporting applications with user-provided JDBC drivers. We bundle the datasource (typically HikariCP) and allow the user to specify the JDBC driver jar. Support has been very ad hoc, so I’ve been working on parameterized tests that use Aether to query the Maven Central repository for all versions of the datasource and JDBC jars and then verify that I can make a connection to our test servers using all possible combinations. That doesn’t always work, e.g., older JDBC drivers might not support a method required by newer versions of the datasource class, especially for more obscure databases such as Hive.

(Note: I don’t mean to pick on Hikari here. I’m seeing this problem in several libraries and I’m just using it as an example.)

The test should be straightforward. With one test class per datasource version:

ClassLoader oldClassLoader = Thread.currentThread().getContextClassLoader();
for (Artifact artifact : /* Aether query */) {
    try {
        URL[] urls = new URL[] {
            artifact.getFile().toURI().toURL()
        };
        ClassLoader cl = new URLClassLoader(urls, oldClassLoader);
        Thread.currentThread().setContextClassLoader(cl);

        HikariConfig config = new HikariConfig();
        config.setJdbcUrl(TEST_URL);
        config.setDriverClassName(DRIVER_CLASSNAME);
        DataSource ds = new HikariDataSource(config);
        try (Connection conn = ds.getConnection();
                Statement stmt = conn.createStatement();
                ResultSet rs = stmt.executeQuery("select 1 as x")) {
            assertThat(rs.next(), equalTo(true));
            assertThat(rs.getInt("x"), equalTo(1));
        }
    } finally {
        Thread.currentThread().setContextClassLoader(oldClassLoader);
    }
}

(Note: I’m actually using a parameterized junit test that uses the loop to produce the list of parameters. Each parameterized test is then run individually. I’m using an explicit loop here to emphasize the need to restore the environment after each test.)

Only one problem – it can’t find the driver class. Looking at the source code in github reveals the problem:

public void setDriverClassName(String driverClassName) {
    Class c = HikariConfig.class.getClassLoader().loadClass(driverClassName);
    ...
}

The Hikari classes were loaded by a different classloader than the JDBC driver classes and the ‘parent’ relationship between the classloaders goes the wrong way.

The fix isn’t hard – I need to modify my classloader so it loads both the Hikari datasource library and the JDBC driver library. This requires the use of reflection to create and configure the HikariConfig and HikariDataSource classes but that’s not too hard if I use commons-lang3 helper classes. There’s even a benefit to this approach – I can specify both datasource and JDBC driver jars in the test parameters and no longer need a separate test class for each version of the datasource.
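
Here is a sketch of that reworked setup (the datasourceJar/driverJar names and the use of the commons-lang3 reflection helpers are my own illustration, not the code from the original tests): both jars go into a single URLClassLoader so HikariConfig and the JDBC driver share a classloader, and the datasource is created reflectively.

import java.net.URL;
import java.net.URLClassLoader;
import javax.sql.DataSource;
import org.apache.commons.lang3.reflect.ConstructorUtils;
import org.apache.commons.lang3.reflect.MethodUtils;

// datasourceJar and driverJar are the artifacts under test. Note the
// datasource jar must not also be on the test classpath or the parent
// classloader will load it first.
URL[] urls = new URL[] {
    datasourceJar.toURI().toURL(),
    driverJar.toURI().toURL()
};
ClassLoader cl = new URLClassLoader(urls, oldClassLoader);

// create and configure HikariConfig reflectively since our test code
// can no longer reference the class directly.
Object config = cl.loadClass("com.zaxxer.hikari.HikariConfig").newInstance();
MethodUtils.invokeMethod(config, "setJdbcUrl", TEST_URL);
MethodUtils.invokeMethod(config, "setDriverClassName", DRIVER_CLASSNAME);

// HikariDataSource implements javax.sql.DataSource, which is loaded by
// the bootstrap classloader, so the cast is safe across classloaders.
DataSource ds = (DataSource) ConstructorUtils.invokeConstructor(
        cl.loadClass("com.zaxxer.hikari.HikariDataSource"), config);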

Unfortunately it doesn’t work. I haven’t dug deeper into the class but I noticed the setter only verifies that the class is visible. It’s actually loaded and used elsewhere and it might use a different classloader at that point. Research continues….

But wait, it gets worse!

As an alternative I tried to explicitly register the JDBC driver in order to create the datasource without explicitly naming the JDBC driver classname (if possible):

ClassLoader oldClassLoader = Thread.currentThread().getContextClassLoader();
Driver driver = null;
for (Artifact artifact : /* Aether query */) {
    try {
        URL[] urls = new URL[] {
            artifact.getFile().toURI().toURL()
        };
        ClassLoader cl = new URLClassLoader(urls, oldClassLoader);
        Thread.currentThread().setContextClassLoader(cl);

        Class<? extends Driver> driverClass = cl.loadClass(DRIVER_NAME).asSubclass(Driver.class);
        driver = driverClass.newInstance();
        DriverManager.registerDriver(driver);

        ...

    } finally {
        if (driver != null) {
            DriverManager.deregisterDriver(driver);
        }
        Thread.currentThread().setContextClassLoader(oldClassLoader);
    }
}

Incredibly this fails – the deregisterDriver() call throws a SecurityException! This happens even when I explicitly set a permissive SecurityManager in the test setup. Digging into the code I discovered that the DriverManager checks whether the caller has the ability to load the class being deregistered. That sounds like a basic sanity check against malicious behavior but it introduces a classloader dependency. Again it’s not using the classloader I created in order to isolate my tests. The DriverManager is a core class so there’s no solution to this problem.

Edited to add…

I meant that there’s no clean solution to this problem. The DriverManager class uses reflection to learn the classloader of the calling method and verifies that the driver is accessible to it. In our case it’s not – we created a new classloader and it’s still our thread’s contextClassLoader but we’re calling the deregisterDriver() method from a class loaded by the original classloader.

One solution is to write and maintain another class that exists solely to deregister the driver class. That is non-obvious and will be a pain to maintain.

The other solution is to use reflection to make the internal registeredDrivers collection accessible and directly manipulate it in our ‘finally’ clause. That was my final solution.
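
Here is a sketch of that final approach (field and wrapper-class details checked against the OpenJDK 8 DriverManager source; this depends on JDK internals and may break in other versions):

import java.lang.reflect.Field;
import java.sql.Driver;
import java.sql.DriverManager;
import java.util.concurrent.CopyOnWriteArrayList;

static void forceDeregister(Driver driver) throws Exception {
    // DriverManager keeps a private CopyOnWriteArrayList of DriverInfo
    // objects, each wrapping one registered Driver.
    Field f = DriverManager.class.getDeclaredField("registeredDrivers");
    f.setAccessible(true);
    CopyOnWriteArrayList<?> drivers = (CopyOnWriteArrayList<?>) f.get(null);
    for (Object info : drivers) {
        Field df = info.getClass().getDeclaredField("driver");
        df.setAccessible(true);
        if (df.get(info) == driver) {
            // safe while iterating: COW list iterators use a snapshot.
            drivers.remove(info);
        }
    }
}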

Lessons learned

If we’re writing a library that allows the user to specify a classname at runtime we MUST test the scenario where the user loads the containing jar in a separate classloader. It’s not enough to test only the case where the containing jar is on the same classpath as our library – the jar might ultimately be provided by the end user and not the developer.


Should Mocked Method Arguments Make Assumptions?

Bear Giles | March 27, 2016

I’m cleaning up some tests at my new job and came across an interesting difference in opinion. Here are two ways to write a test. Which is better?

Context: the class being tested acts as a translation layer between our internal abstraction layer and a traditional JDBC driver. JPA and other ORM frameworks are not a viable option since we must work with arbitrary databases. We usually don’t want to call verify() since we want to test behavior, not implementation, but in this case the behavior we want to test is whether the appropriate calls are made to the mocked JDBC objects.

(Implementation note: we are using EasyMock but we can create similar code in jMock and Mockito.)

Approach 1

final Connection conn = createMock(Connection.class);
final PreparedStatement stmt = createMock(PreparedStatement.class);

// more initialization

expect(conn.prepareStatement("insert into foo(a, b, c) values (?, ?, ?)")).andReturn(stmt);
stmt.setString(1, "Larry");
expectLastCall();
stmt.setString(2, "Curly");
expectLastCall();
stmt.setString(3, "Moe");
expectLastCall();
expect(stmt.execute()).andReturn(true);

// more initialization

replay(conn, stmt);

// run code under test

verify(conn, stmt);

Approach 2

final Connection conn = createMock(Connection.class);
final PreparedStatement stmt = createMock(PreparedStatement.class);

final String expectedSql = "insert into foo(a, b, c) values (?, ?, ?)";
final List<String> expectedValues = ImmutableList.of("Larry", "Curly", "Moe");

final Capture<String> actualSql = newCapture();
final Capture<String> actualValues = newCapture(CaptureType.ALL);

// more initialization

expect(conn.prepareStatement(capture(actualSql))).andReturn(stmt);
for (int i = 0; i < expectedValues.size(); i++) {
    stmt.setString(eq(i + 1), capture(actualValues));
    expectLastCall();
}
expect(stmt.execute()).andReturn(true);

// more initialization

replay(conn, stmt);

// run code under test

assertEquals(expectedSql, actualSql.getValue());
assertEquals(expectedValues, actualValues.getValues());

verify(conn, stmt);

Both tests will fail if the actual SQL or values differ from what is expected. The former produces a framework exception (unexpected method call/missing method call), the latter a standard JUnit failure giving expected and actual values.

My coworker argues that the first approach is better since it’s more concise. That’s a good point – shorter methods are easier to understand and maintain.

I argue that the second approach is better since the messages are a lot clearer and the code is much more explicit that we’re concerned about the arguments to the mocked methods. The second approach will also be a lot easier to convert to a parameterized test in the future. Furthermore the actual code (not the snippet above) often uses the expectedValues value as an argument to the code under test. This reduces the risk of breaking tests because we updated one value but not the corresponding value.

Given-When-Then

Academic programs rarely spend much time on testing methodologies and many companies still treat testing as grunt work for the junior people instead of something critical to the long-term success of the project. We would never think it’s enough to just sit down and bang out code but we’ll do exactly that with test code.

No, we need to think about what we’re trying to accomplish and the best way to do it. Fortunately some very smart people have been thinking about this for a long time. One approach is Given-When-Then. Some test frameworks are explicitly built around this approach. JUnit is not but it’s easy to follow as a heuristic.

The second approach is easy to convert to Given-When-Then.

Given-When-Then Approach

Capture<String> setupForPreparedStatement(Connection conn, PreparedStatement stmt,
        List<Object> expectedValues, Capture<Object> actualValues) throws SQLException {
    final Capture<String> actualSql = newCapture();

    expect(conn.prepareStatement(capture(actualSql))).andReturn(stmt);
    for (int i = 0; i < expectedValues.size(); i++) {
        Object obj = expectedValues.get(i);
        if (obj == null) {
            stmt.setNull(eq(i + 1), anyInt());
        } else if (obj instanceof String) {
            stmt.setString(eq(i + 1), (String) capture(actualValues));
        } else if (obj instanceof BigDecimal) {
            stmt.setBigDecimal(eq(i + 1), (BigDecimal) capture(actualValues));
        } else {
            // ... remaining types are handled the same way
        }
        expectLastCall();
    }
    expect(stmt.execute()).andReturn(true);

    return actualSql;
}

public void test1() throws SQLException {
// given:
    final Connection conn = createMock(Connection.class);
    final PreparedStatement stmt = createMock(PreparedStatement.class);
    final List<Object> expectedValues = ImmutableList.of("Larry", "Curly", "Moe");
    final Capture<Object> actualValues = newCapture(CaptureType.ALL);

    setupForPreparedStatement(conn, stmt, expectedValues, actualValues);

// when:
    replay(conn, stmt);
    // run code under test

// then:
    verify(conn, stmt);

    assertEquals(expectedValues, actualValues.getValues());
}

(For clarity I’ve removed the test for the expected SQL statement.)

In this code the setup makes almost no assumptions about the values that will be passed to the mocked objects – only their number and types – and no assumptions at all about the SQL commands. It just says that we want to create a prepared statement, populate it, then execute it.

It’s not hard to take this a step further and decree that all ‘expect’ calls should be done in setup methods and maintained by the developer working on the tested class. The actual test methods will be maintained by other people and follow the “test behavior, not implementation” rule.

If we don’t need to verify() the mocked objects – and remember we usually won’t since we want to test behavior and not implementation – then we can make the test even more concise.

Given-When-Then Approach without verify()

Capture<String> setupForPreparedStatement(List<Object> expectedValues, Capture<Object> actualValues)
        throws SQLException {
    final Connection conn = createMock(Connection.class);
    final PreparedStatement stmt = createMock(PreparedStatement.class);
    final Capture<String> actualSql = newCapture();

    expect(conn.prepareStatement(capture(actualSql))).andReturn(stmt);
    for (int i = 0; i < expectedValues.size(); i++) {
        Object obj = expectedValues.get(i);
        if (obj == null) {
            stmt.setNull(eq(i + 1), anyInt());
        } else if (obj instanceof String) {
            stmt.setString(eq(i + 1), (String) capture(actualValues));
        } else if (obj instanceof BigDecimal) {
            stmt.setBigDecimal(eq(i + 1), (BigDecimal) capture(actualValues));
        } else {
            // ... remaining types are handled the same way
        }
        expectLastCall();
    }
    expect(stmt.execute()).andReturn(true);
    replay(conn, stmt);

    return actualSql;
}

public void test1() throws SQLException {
// given:
    final List<Object> expectedValues = ImmutableList.of("Larry", "Curly", "Moe");
    final Capture<Object> actualValues = newCapture(CaptureType.ALL);
    setupForPreparedStatement(expectedValues, actualValues);

// when:
    // run code under test

// then:
    assertEquals(expectedValues, actualValues.getValues());
}

Adding Database Logging to JUnit3

Bear Giles | August 16, 2015

We have written many thousands of JUnit3 tests over the last decade and are now trying to consolidate the results in a database instead of scattered log files. It turns out to be remarkably easy to extend the TestCase class to do this. Note: this approach does not directly apply to JUnit4 or other test frameworks but it’s usually possible to do something analogous.

The tested class and its test

For demonstration purposes we can define a class with a single method to test.

public class MyTestedClass {

    public String op(String a, String b) {
        return ((a == null) ? "" : a) + ":" + ((b == null) ? "" : b);
    }
}

A class with a single method to be tested is less of a restriction than you might think. We are only testing four methods in the thousands of tests I mentioned earlier.

Here are a handful of tests for the class above.

public class MySimpleTest extends SimpleTestCase {

    private MyTestedClass obj = new MyTestedClass();

    public void test1() {
        assertEquals("a:b", obj.op("a", "b"));
    }

    public void test2() {
        assertEquals(":b", obj.op(null, "b"));
    }

    public void test3() {
        assertEquals("a:", obj.op("a", null));
    }

    public void test4() {
        assertEquals(":", obj.op(null, null));
    }

    public void test5() {
        // this will fail
        assertEquals(" : ", obj.op(null, null));
    }
}

Capturing basic information with a TestListener

JUnit3 allows listeners to be attached to its test runs. A listener is called before and after each test, plus any time a test fails or has an error (throws an exception). This TestListener writes basic test information to System.out as a proof of concept. It would be easy to modify it to write the information to a database, a JMS topic, etc.

public class SimpleTestListener implements TestListener {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");
    private long start;
    private boolean successful = true;
    private String name;
    private String failure = null;

    SimpleTestListener() {
    }

    public void setName(String name) {
        this.name = name;
    }

    public void startTest(Test test) {
        start = System.currentTimeMillis();
    }

    public void addError(Test test, Throwable t) {
        // cache information about error.
        successful = false;
    }

    public void addFailure(Test test, AssertionFailedError e) {
        // cache information about failure.
        failure = e.getMessage();
        successful = false;
    }

    /**
     * After the test finishes we can update the database with statistics about
     * the test - name, elapsed time, whether it was successful, etc.
     */
    public void endTest(Test test) {
        long elapsed = System.currentTimeMillis() - start;

        SimpleDateFormat fmt = new SimpleDateFormat();
        fmt.setTimeZone(UTC);

        System.out.printf("[%s, %s, %s, %d, %s, %s]\n", test.getClass().getName(), name, fmt.format(new Date(start)),
                elapsed, failure, Boolean.toString(successful));

        // write any information about errors or failures to database.
    }
}

A production TestListener should do a lot more with errors and failures. I’ve skipped that in order to focus on the broader issues.

This listener is not thread-safe so we will want to use a Factory pattern to create a fresh instance for each test. We can create heavyweight objects in the factory, e.g., open a SQL DataSource there and pass a fresh Connection to each listener instance (see the sketch after the factory code below).

public class SimpleTestListenerFactory {
    public static final SimpleTestListenerFactory INSTANCE = new SimpleTestListenerFactory();

    public SimpleTestListenerFactory() {
        // establish connection data source here?
    }

    public SimpleTestListener newInstance() {
        // initialize listener.
        SimpleTestListener listener = new SimpleTestListener();
        return listener;
    }
}
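
Here is a minimal sketch of the DataSource idea, assuming the factory is handed a DataSource; the anonymous listener subclass and the names are illustrative, not part of the original code.

import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

public class DbTestListenerFactory {
    private final DataSource dataSource;  // heavyweight object, created once

    public DbTestListenerFactory(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Each listener gets its own Connection so instances never share state. */
    public SimpleTestListener newInstance() throws SQLException {
        final Connection conn = dataSource.getConnection();
        return new SimpleTestListener() {
            public void endTest(junit.framework.Test test) {
                super.endTest(test);
                // write the cached statistics through 'conn' instead of System.out.
            }
        };
    }
}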

If we know the test framework is purely serial we can capture all console output by creating a buffer and calling System.setOut() in startTest(), then restoring the original System.out in endTest(). This works well as long as tests never overlap, but be careful: IDEs may have their own test runners that allow concurrent execution.
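
For example, here is a sketch of that approach, assuming a strictly serial runner; the class name is illustrative.

import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import junit.framework.Test;

public class OutputCapturingListener extends SimpleTestListener {
    private ByteArrayOutputStream buffer;
    private PrintStream originalOut;

    public void startTest(Test test) {
        super.startTest(test);
        originalOut = System.out;
        buffer = new ByteArrayOutputStream();
        System.setOut(new PrintStream(buffer));  // redirect console output to our buffer
    }

    public void endTest(Test test) {
        System.setOut(originalOut);  // restore before we log our own output
        super.endTest(test);
        String consoleOutput = buffer.toString();
        // store consoleOutput with the other test statistics.
    }
}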

We override the standard run() method with our own that creates and registers a listener before calling the existing run() method.

public class SimpleTestCase extends TestCase {

    public void run(TestResult result) {
        SimpleTestListener l = SimpleTestListenerFactory.INSTANCE.newInstance();
        result.addListener(l);
        l.setName(getName());
        super.run(result);
        result.removeListener(l);
    }
}

We now get the expected results on System.out.

[MySimpleTest, test1, 8/2/15 11:58 PM, 0, null, true]
[MySimpleTest, test2, 8/2/15 11:58 PM, 10, null, true]
[MySimpleTest, test3, 8/2/15 11:58 PM, 0, null, true]
[MySimpleTest, test4, 8/2/15 11:58 PM, 0, null, true]
[MySimpleTest, test5, 8/2/15 11:58 PM, 4, expected same:<:> was not:< : >, false]

Capturing call information with a facade and TestListener

This is a good start but we can do better. Only four methods are called in the thousands of tests mentioned above – it would be extremely powerful if we could capture the input and output values of those calls.

It is easy to wrap these methods with AOP, or with a logging facade if AOP is not acceptable for some reason. In simple cases we can just capture the input and output values.

public class MyFacadeClass extends MyTestedClass {
    private MyTestedClass parent;
    private String a;
    private String b;
    private String result;

    public MyFacadeClass(MyTestedClass parent) {
        this.parent = parent;
    }

    public String getA() {
        return a;
    }

    public String getB() {
        return b;
    }

    public String getResult() {
        return result;
    }

    /**
     * Wrap tested method so we can capture input and output.
     */
    public String op(String a, String b) {
        this.a = a;
        this.b = b;
        String result = parent.op(a, b);
        this.result = result;
        return result;
    }
}

We log the basic information as before and add just a bit of new code to log the inputs and outputs.

public class AdvancedTestListener extends SimpleTestListener {

    AdvancedTestListener() {
    }

    /**
     * Log information as before but also log call details.
     */
    public void endTest(Test test) {
        super.endTest(test);

        // add captured inputs and outputs
        if (test instanceof MyAdvancedTest) {
            MyTestedClass obj = ((MyAdvancedTest) test).obj;
            if (obj instanceof MyFacadeClass) {
                MyFacadeClass facade = (MyFacadeClass) obj;
                System.out.printf("[, , %s, %s, %s]\n", facade.getA(), facade.getB(), facade.getResult());
            }
        }
    }
}

The logs now show both the basic information and the call details.

[MyAdvancedTest, test2, 8/3/15 12:13 AM, 33, null, true]
[, , null, b, :b]
[MyAdvancedTest, test3, 8/3/15 12:13 AM, 0, null, true]
[, , a, null, a:]
[MyAdvancedTest, test4, 8/3/15 12:13 AM, 0, null, true]
[, , null, null, :]
[MyAdvancedTest, test1, 8/3/15 12:13 AM, 0, null, true]
[, , a, b, a:b]

We want to associate the basic and call details but that’s easily handled by adding a unique test id.

This approach is not enough in the real world, where the tested methods may be called multiple times during a single test. In that case we need either a way to cache multiple sets of input and output values or a way to extend the listener so we can call it at the end of each covered method. A sketch of the first option follows.
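
Here is a minimal sketch of a facade that records every call instead of only the last one; the RecordingFacade and CallRecord names are illustrative, not part of the original code.

import java.util.ArrayList;
import java.util.List;

public class RecordingFacade extends MyTestedClass {
    /** One entry per call to op(). */
    public static class CallRecord {
        public final String a;
        public final String b;
        public final String result;

        CallRecord(String a, String b, String result) {
            this.a = a;
            this.b = b;
            this.result = result;
        }
    }

    private final MyTestedClass parent;
    private final List<CallRecord> calls = new ArrayList<CallRecord>();

    public RecordingFacade(MyTestedClass parent) {
        this.parent = parent;
    }

    public List<CallRecord> getCalls() {
        return calls;
    }

    @Override
    public String op(String a, String b) {
        final String result = parent.op(a, b);
        calls.add(new CallRecord(a, b, result));
        return result;
    }
}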

We can make our results more extensible by encoding them in XML or JSON instead of a simple list. This allows us to capture only the values of interest and to easily handle fields added in the future.

[MyAdvancedTest, test2, 8/3/15 12:13 AM, 33, null, true]
{"a":null, "b":"b", "results":":b" }
[MyAdvancedTest, test3, 8/3/15 12:13 AM, 0, null, true]
{"a":"a", "b":null, "results":"a:" }
[MyAdvancedTest, test4, 8/3/15 12:13 AM, 0, null, true]
{"a":null, "b":null, "results":":" }
[MyAdvancedTest, test1, 8/3/15 12:13 AM, 0, null, true]
{"a":" a", "b":"b", "results":" a:b" }

Capturing assertX information

We can now rerun the tests by replaying the captured inputs, but there are two problems with blindly comparing the results. First, it is a lot of unnecessary work if we only care about a single value. Second, many tests are non-deterministic (e.g., they use canned data that changes over time, or even live data) and things we don't care about may change.

This is not an easy problem. If we're lucky the tests will follow the standard pattern and we can make a good guess at what is being tested, but the results need to be manually verified.

First, we need to wrap the tested class with a facade that captures some or all method calls. The call history should become available in a form that we can replay later, e.g., a sequence of method names and serialized parameters.

Second, we need to wrap the TestCase assertX methods so that we capture the recent method calls and the values passed to the assert call (plus the results, of course).

Example

The process is easiest to show – and demolish – with an example. Let’s start with a simple POJO.

public class Person {
    private String firstName;
    private String lastName;

    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
}

In this case our facade only needs to record the method name.
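
The PersonFacade class used below is not shown in the original; here is a guess at a minimal version that records the names of the methods called.

import java.util.ArrayList;
import java.util.List;

public class PersonFacade extends Person {
    private final Person parent;
    private final List<String> methodsCalled = new ArrayList<String>();

    public PersonFacade(Person parent) {
        this.parent = parent;
    }

    @Override
    public String getFirstName() {
        methodsCalled.add("getFirstName()");
        return parent.getFirstName();
    }

    @Override
    public String getLastName() {
        methodsCalled.add("getLastName()");
        return parent.getLastName();
    }

    public String getMethodsCalled() {
        return String.join(", ", methodsCalled);
    }

    public void clearMethodsCalled() {
        methodsCalled.clear();
    }
}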

A typical test method is

public void test1() {
    Person p = getTestPerson();
    assertEquals("John", p.getFirstName());
    assertEquals("Smith", p.getLastName());
}

with a wrapped assertX method of

static PersonFacade person;

public static void assertEquals(String expected, String actual) {
    // ignoring null handling...
    boolean results = expected.equals(actual);
    LOG.log("assertEquals('" + expected + "',"+person.getMethodsCalled()+ ") = " + results);
    person.clearMethodsCalled();
    if (!results) {
        throw new AssertionFailedError("expected same:<" + expected + "> was not:<" + actual + ">");
    }
}

so we would get results like

assertEquals('John', getFirstName()) = true;
assertEquals('Smith', getLastName()) = false;

It’s not hard to see how this could be parsed by a test framework but it’s too early to celebrate. The second test method is

public void test1() {
    Person p = getTestPerson();
    assertEquals("john", p.getFirstName().toLowerCase());
}

and our simple code will not capture the toLowerCase() call. Our log will wrongly record

assertEquals('John', getFirstName()) = false;

A more pathological case is

public void test1() {
    Person p = getTestPerson();
    LOG.log("testing " + p.getFirstName());
    assertEquals("john", "joe");
}

where the assertion has nothing to do with the wrapped class.

There are obvious band-aids, e.g., we could capture the return values in our facade, but this is a very deep rabbit hole that we want to stay far away from. I think the answer is to make a reasonable first attempt, manually verify the results, and leave it at that. (Alternative: rewrite the tests into a form that can be captured.)


Installing PostgreSQL PL/Java as a PostgreSQL Extension

Bear Giles | August 8, 2015

In 2011 I wrote a series of articles on PostgreSQL PL/Java. The basic information is still solid but there is now a much easier way to install PL/Java from source. This also eliminates the need to depend on third parties to create packages. These notes will be fairly brief since I assume my readers are already familiar with git and maven.

(Note: I’ve passed this information to the PL/Java team so it may already be handled by the time you read this.)

Perform the basic build

  1. Clone the PL/Java repository at https://github.com/tada/pljava.
  2. Run maven, not make.
  3. …
  4. Profit!

Of course it’s not that simple. Maven can pull in its own dependencies but we still need several specialized libraries beyond the standard GNU toolchain. On my Ubuntu system I needed:

  • postgresql-server-dev-9.4
  • libpq-dev
  • libpgtypes3
  • libecpg-dev

(I don’t know the corresponding package names for RedHat/Fedora/CentOS.)

It may take a bit of experimentation but it shouldn’t be too hard to identify all of the packages you need. Just remember that you’ll usually want the packages with the “-dev” extension.

There are a large number of compiler warnings and errors but most if not all seem to be related to sign conversions. This warrants further investigation – sign conversion warnings indicate possible attack surfaces for malicious users – but for now we should be fine as long as maven succeeds. We need three files:

./src/sql/install.sql
./pljava/target/pljava-0.0.2-SNAPSHOT.jar
./pljava-so/target/nar/pljava-so-0.0.2-SNAPSHOT-i386-Linux-gpp-shared/lib/i386-Linux-gpp/shared/libpljava-so-0.0.2-SNAPSHOT.so

Copying the files

We can now copy the three files to their respective locations.

$ sudo cp ./pljava-so/target/nar/pljava-so-0.0.2-SNAPSHOT-i386-Linux-gpp-shared/lib/i386-Linux-gpp/shared/libpljava-so-0.0.2-SNAPSHOT.so \
  /usr/lib/postgresql/9.4/lib/pljava.so

$ sudo cp ./pljava/target/pljava-0.0.2-SNAPSHOT.jar /usr/share/postgresql/9.4/extension/pljava--1.4.4.jar

$ sudo cp ./src/sql/install.sql /usr/share/postgresql/9.4/extension/pljava--1.4.4.sql

We can learn the correct target directories from the ‘pg_config’ command.

$ pg_config
PKGLIBDIR = /usr/lib/postgresql/9.4/lib
SHAREDIR = /usr/share/postgresql/9.4
...

I have changed the version from 0.0.2-SNAPSHOT to 1.4.4 since we want to capture the PL/Java version, not the pom.xml version. I hope these will soon be kept in sync.

Editing pljava--1.4.4.sql

We need to add two lines to the installation SQL:

SET PLJAVA.CLASSPATH='/usr/share/postgresql/9.4/extension/pljava--1.4.4.jar';
SET PLJAVA.VMOPTIONS='-Xms64M -Xmx128M';

It’s important to remember that a separate JVM is instantiated for each database connection. Memory consumption can become a major concern when you have 20+ simultaneous connections.

Create the pljava.control file

We must tell PostgreSQL about the new extension. This is handled by a control file.

/usr/share/postgresql/9.4/extension/pljava.control

# pljava extension
comment = 'PL/Java bundled as an extension'
default_version = '1.4.4'
relocatable = false

Make libjvm.so visible

We normally specify the location of the java binaries and shared libraries via the JAVA_HOME environment variable. This isn’t an option with the database server.

There are two approaches depending on whether you want to make the java shared library (libjvm.so) visible to all applications or just the database server. I think the former is easiest.

We need to create a single file

/etc/ld.so.conf.d/i386-linux-java.conf

/usr/lib/jvm/java-8-openjdk-i386/jre/lib/i386/server

where most of the pathname comes from JAVA_HOME. The location may be different on your system. The directory must contain the shared library ‘libjvm.so’.

We must also tell the system to refresh its cache.

$ sudo ldconfig
$ sudo ldconfig -p | grep jvm
	libjvm.so (libc6) => /usr/lib/jvm/java-8-openjdk-i386/jre/lib/i386/server/libjvm.so
	libjsig.so (libc6) => /usr/lib/jvm/java-8-openjdk-i386/jre/lib/i386/server/libjsig.so

Loading the extension

We can now easily load and unload PL/Java.

=> CREATE EXTENSION pljava;
CREATE EXTENSION

=> DROP EXTENSION pljava;
DROP EXTENSION

In case of flakiness…

If the system seems flaky you can move the two ‘set’ commands into the postgresql.conf file.

/etc/postgresql/9.4/main/postgresql.conf

#------------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#------------------------------------------------------------------------------

# Add settings for extensions here

PLJAVA.CLASSPATH='/usr/share/postgresql/9.4/extension/pljava--1.4.4.jar'
PLJAVA.VMOPTIONS='-Xms64M -Xmx128M'

Auto-encrypting Serializable Classes

Bear Giles | June 25, 2015

A crazy idea came up during the post-mortem discussions in the Coursera security capstone project. Can a class encrypt itself during serialization?

This is mostly an academic “what if” exercise. It is hard to think of a situation where we would want to rely on an object self-encrypting instead of using an explicit encryption mechanism during persistence. I’ve only been able to identify one situation where we can’t simply make a class impossible to serialize:

HTTPSession passivation

Appservers may passivate inactive HTTPSessions to save space or to migrate a session from one server to another. This is why sessions should only contain Serializable objects. (This restriction is often ignored in small-scale applications that can fit on a single server but that can cause problems if the implementation needs to be scaled up or out.)

One approach (and the preferred approach?) is for the session to write itself to a database during passivation and reload itself during activation. The only information actually retained is what's required to reload the data, typically just the user id. This adds a bit of complexity to the HTTPSession implementation but it has many benefits. One major benefit is that it is trivial to ensure sensitive information is encrypted.
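
A minimal sketch of that pattern, using the standard servlet activation callbacks; the SessionUser name and the elided database calls are illustrative, not part of the original discussion.

import java.io.Serializable;
import javax.servlet.http.HttpSessionActivationListener;
import javax.servlet.http.HttpSessionEvent;

public class SessionUser implements Serializable, HttpSessionActivationListener {
    private static final long serialVersionUID = 1L;

    private final long userId;               // the only state that is serialized
    private transient Object sensitiveData;  // never serialized

    public SessionUser(long userId) {
        this.userId = userId;
    }

    public void sessionWillPassivate(HttpSessionEvent se) {
        // hypothetical: persist (encrypted) sensitiveData to the database, keyed by userId.
        sensitiveData = null;
    }

    public void sessionDidActivate(HttpSessionEvent se) {
        // hypothetical: sensitiveData = loadFromDatabase(userId);
    }
}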

It’s not the only approach, and some sites may prefer to use standard serialization. Some appservers may keep serialized copies of “live” sessions in an embedded database like H2. A cautious developer may want to ensure that sensitive information is encrypted during serialization even if serialization should never happen.

Note: a strong argument can be made that the sensitive information shouldn’t be in the session in the first place – only retrieve it when necessary and safely discard it once it is no longer needed.

The approach

The approach I’m taking is based on the serialization chapter in Effective Java. In broad terms we want to use a serialization proxy to handle the actual encryption. The behavior is

Action           Method           Protected (serialized) class   Serialization proxy
Serialization    writeReplace()   create proxy                   N/A
                 writeObject()    throw exception                write encrypted contents to ObjectOutputStream
Deserialization  readObject()     throw exception                read encrypted contents from ObjectInputStream
                 readResolve()    N/A                            construct protected class object

The reason the protected class throws an exception when the deserialization methods are called is that this prevents attacks via attacker-generated serialized objects. See the discussion of the bogus byte-stream attack and internal field theft attack in the book mentioned above.

This approach has a big limitation – the class cannot be extended without the subclass reimplementing the proxy. I don’t think this is an issue in practice since this technique will only be used to protect classes containing sensitive information and it would rarely be desirable to add methods beyond the ones anticipated by the designers.

The proxy class handles encryption. The implementation below shows the use of a random salt (IV) and cryptographically strong message digest (HMAC) to detect tampering.

The code

public class ProtectedSecret implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String secret;

    /**
     * Constructor.
     * 
     * @param secret
     */
    public ProtectedSecret(final String secret) {
        this.secret = secret;
    }

    /**
     * Accessor
     */
    public String getSecret() {
        return secret;
    }

    /**
     * Replace the object being serialized with a proxy.
     * 
     * @return
     */
    private Object writeReplace() {
        return new SimpleProtectedSecretProxy(this);
    }

    /**
     * Serialize object. We throw an exception since this method should never be
     * called - the standard serialization engine will serialize the proxy
     * returned by writeReplace(). Anyone calling this method directly is
     * probably up to no good.
     * 
     * @param stream
     * @return
     * @throws InvalidObjectException
     */
    private void writeObject(ObjectOutputStream stream) throws InvalidObjectException {
        throw new InvalidObjectException("Proxy required");
    }

    /**
     * Deserialize object. We throw an exception since this method should never
     * be called - the standard serialization engine will create serialized
     * proxies instead. Anyone calling this method directly is probably up to no
     * good and using a manually constructed serialized object.
     * 
     * @param stream
     * @return
     * @throws InvalidObjectException
     */
    private void readObject(ObjectInputStream stream) throws InvalidObjectException {
        throw new InvalidObjectException("Proxy required");
    }

    /**
     * Serializable proxy for our protected class. The encryption code is based
     * on https://gist.github.com/mping/3899247.
     */
    private static class SimpleProtectedSecretProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private String secret;

        private static final String CIPHER_ALGORITHM = "AES/CBC/PKCS5Padding";
        private static final String HMAC_ALGORITHM = "HmacSHA256";

        private static transient SecretKeySpec cipherKey;
        private static transient SecretKeySpec hmacKey;

        static {
            // these keys can be read from the environment, the filesystem, etc.
            final byte[] aes_key = "d2cb415e067c7b13".getBytes();
            final byte[] hmac_key = "d6cfaad283353507".getBytes();

            try {
                cipherKey = new SecretKeySpec(aes_key, "AES");
                hmacKey = new SecretKeySpec(hmac_key, HMAC_ALGORITHM);
            } catch (Exception e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        /**
         * Constructor.
         * 
         * @param protectedSecret
         */
        SimpleProtectedSecretProxy(ProtectedSecret protectedSecret) {
            this.secret = protectedSecret.secret;
        }

        /**
         * Write encrypted object to serialization stream.
         * 
         * @param s
         * @throws IOException
         */
        private void writeObject(ObjectOutputStream s) throws IOException {
            s.defaultWriteObject();
            try {
                Cipher encrypt = Cipher.getInstance(CIPHER_ALGORITHM);
                encrypt.init(Cipher.ENCRYPT_MODE, cipherKey);
                byte[] ciphertext = encrypt.doFinal(secret.getBytes("UTF-8"));
                byte[] iv = encrypt.getIV();

                Mac mac = Mac.getInstance(HMAC_ALGORITHM);
                mac.init(hmacKey);
                mac.update(iv);
                byte[] hmac = mac.doFinal(ciphertext);

                // TBD: write algorithm id...
                s.writeInt(iv.length);
                s.write(iv);
                s.writeInt(ciphertext.length);
                s.write(ciphertext);
                s.writeInt(hmac.length);
                s.write(hmac);
            } catch (Exception e) {
                throw new InvalidObjectException("unable to encrypt value");
            }
        }

        /**
         * Read encrypted object from serialization stream.
         * 
         * @param s
         * @throws InvalidObjectException
         */
        private void readObject(ObjectInputStream s) throws ClassNotFoundException, IOException, InvalidObjectException {
            s.defaultReadObject();
            try {
                // TBD: read algorithm id...
                byte[] iv = new byte[s.readInt()];
                s.readFully(iv);  // read() may return fewer bytes than requested
                byte[] ciphertext = new byte[s.readInt()];
                s.readFully(ciphertext);
                byte[] hmac = new byte[s.readInt()];
                s.readFully(hmac);

                // compute the expected HMAC
                Mac mac = Mac.getInstance(HMAC_ALGORITHM);
                mac.init(hmacKey);
                mac.update(iv);
                byte[] signature = mac.doFinal(ciphertext);

                // verify HMAC
                if (!Arrays.equals(hmac, signature)) {
                    throw new InvalidObjectException("unable to decrypt value");
                }

                // decrypt data
                Cipher decrypt = Cipher.getInstance(CIPHER_ALGORITHM);
                decrypt.init(Cipher.DECRYPT_MODE, cipherKey, new IvParameterSpec(iv));
                byte[] data = decrypt.doFinal(ciphertext);
                secret = new String(data, "UTF-8");
            } catch (Exception e) {
                throw new InvalidObjectException("unable to decrypt value");
            }
        }

        /**
         * Return protected object.
         * 
         * @return
         */
        private Object readResolve() {
            return new ProtectedSecret(secret);
        }
    }
}

It should go without saying that the encryption keys should not be hard-coded or possibly even cached as shown. This was a short-cut to allow us to focus on the details of the implementation.

Different keys should be used for the cipher and message digest. You will seriously compromise the security of your system if the same key is used.

Two other things should be handled in any production system: key rotation and changing the cipher and digest algorithms. The former can be handled by adding a ‘key id’ to the payload; the latter can be handled by tying the cipher algorithms to the serialization version number, e.g., version 1 uses standard AES, version 2 uses AES-256. The deserializer should be able to handle old encryption keys and ciphers (within reason). A sketch of the key id idea follows.
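
Here is a sketch of the key id idea as it would appear inside the serialization proxy above; CURRENT_KEY_ID, keyById, and the stream layout are illustrative, not part of the original code.

    // inside SimpleProtectedSecretProxy
    private static final int CURRENT_KEY_ID = 2;
    private static final Map<Integer, SecretKeySpec> keyById = new HashMap<Integer, SecretKeySpec>();

    private void writeObject(ObjectOutputStream s) throws IOException {
        s.defaultWriteObject();
        s.writeInt(CURRENT_KEY_ID);  // record which key generation encrypted this payload
        // ... encrypt with keyById.get(CURRENT_KEY_ID) and write iv/ciphertext/hmac as before.
    }

    private void readObject(ObjectInputStream s) throws ClassNotFoundException, IOException {
        s.defaultReadObject();
        final int keyId = s.readInt();
        final SecretKeySpec key = keyById.get(keyId);  // may be an older, retired key
        if (key == null) {
            throw new InvalidObjectException("unknown key id: " + keyId);
        }
        // ... read iv/ciphertext/hmac and verify/decrypt with the selected key as before.
    }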

Test code

The test code is straightforward. It creates an object, serializes it, deserializes it, and compares the results to the original value.

public class ProtectedSecretTest {

    /**
     * Test 'happy path'.
     */
    @Test
    public void testCipher() throws IOException, ClassNotFoundException {
        ProtectedSecret secret1 = new ProtectedSecret("password");
        ProtectedSecret secret2;
        byte[] ser;

        // serialize object
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
                ObjectOutput output = new ObjectOutputStream(baos)) {
            output.writeObject(secret1);
            output.flush();

            ser = baos.toByteArray();
        }

        // deserialize object.
        try (ByteArrayInputStream bais = new ByteArrayInputStream(ser); ObjectInput input = new ObjectInputStream(bais)) {
            secret2 = (ProtectedSecret) input.readObject();
        }

        // compare values.
        assertEquals(secret1.getSecret(), secret2.getSecret());
    }

    /**
     * Test deserialization after a single bit is flipped.
     */
    @Test(expected = InvalidObjectException.class)
    public void testCipherAltered() throws IOException, ClassNotFoundException {
        ProtectedSecret secret1 = new ProtectedSecret("password");
        ProtectedSecret secret2;
        byte[] ser;

        // serialize object
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
                ObjectOutput output = new ObjectOutputStream(baos)) {
            output.writeObject(secret1);
            output.flush();

            ser = baos.toByteArray();
        }
        
        // corrupt ciphertext
        ser[ser.length - 16 - 1 - 3] ^= 1;

        // deserialize object.
        try (ByteArrayInputStream bais = new ByteArrayInputStream(ser); ObjectInput input = new ObjectInputStream(bais)) {
            secret2 = (ProtectedSecret) input.readObject();
        }

        // compare values.
        assertEquals(secret1.getSecret(), secret2.getSecret());
    }
}

Final words

I cannot overemphasize this – this is primarily an intellectual exercise. As usual the biggest problem is key management, not cryptography, and with the level of effort required for the former you can probably implement a more traditional solution more quickly.

This may still be “good enough” in some situations. For instance you may only need to keep the data around for the duration of a long-running application. In that case you can create random keys at startup and simply discard all serialized data after the program ends.
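
A sketch of that idea using the standard KeyGenerator API; the class name is illustrative, and the keys live only as long as the JVM.

import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class EphemeralKeys {
    public static void main(String[] args) throws Exception {
        final KeyGenerator aes = KeyGenerator.getInstance("AES");
        aes.init(128);
        final SecretKey cipherKey = aes.generateKey();  // random key, discarded at shutdown

        final KeyGenerator hmac = KeyGenerator.getInstance("HmacSHA256");
        final SecretKey hmacKey = hmac.generateKey();

        // hand these to the serialization proxy instead of the hard-coded keys above.
    }
}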

Source code: https://gist.github.com/beargiles/90182af6f332830a2e0e


AWS certifications

Bear Giles | September 30, 2014

It’s amazing how I can have a huge pile of books to read and then another load is dumped on top of them. le sigh. I still have 3-4 technical blog entries in various stages of completion on top of a dozen books for “professional development” at work.

The current top of the stack is cloud computing, specifically AWS. It turns out that there are several certifications available for AWS: https://aws.amazon.com/certification/. The program is still undergoing growing pains, e.g., the nearest testing site to metro Denver appears to be in Winnemucca, Nevada, or maybe some hayseed college in Kansas. The Kryterion testing sites are probably added as people request them as opposed to running down a list of the top N technical markets. (Do you think the startups in Boulder aren’t heavy users of the cloud?)

Anyway I’ve mentioned before how I like to use certification objectives as study guides even if I don’t actually take the exam. I found a few resources on udemy.com that should go a long way. I was already familiar with most (not all) of the information on EC2 and S3 but was pleasantly surprised to learn about the other services.

Udemy Courses

  • Amazon Web Services for Entrepreneurs and Bloggers (free) – good introduction. Plan on 6 hours.
  • Amazon Web Services Certified Solution Architect – Associate Level (2014) ($59) – I haven’t taken this yet but it’s probably also a good overview. Plan on 10 hours.
  • Amazon Web Services – Web Hosting & Cloud Computing with AWS ($49) – highly recommended, much more detailed than first course. Plan on 12+ hours.
  • Amazon Web Services Certified Developer – Associate Level ($199) – in-depth discussion of using the AWS API. Plan on 10 hours plus many hours playing around with the SDK in your favorite language(s).

These courses were free under my corporate account, so check whether your employer offers something similar. If not, Udemy regularly offers specials that can dramatically cut the cost of its courses.


Getting A List of Available Cryptographic Algorithms

Bear Giles | August 3, 2014

How do you learn what cryptographic algorithms are available to you? The Java spec names several required ciphers, digests, etc., but a provider often offers more than that.

Fortunately it is easy to learn what's available on our system.

import java.security.Provider;
import java.security.Security;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ListAlgorithms {
    public static void main(String[] args) {
        // uncomment to include a third-party provider such as BouncyCastle:
        // Security.addProvider(new org.bouncycastle.jce.provider.BouncyCastleProvider());

        // get a list of service types and their respective providers.
        final Map<String, List<Provider>> services = new TreeMap<>();

        for (Provider provider : Security.getProviders()) {
            for (Provider.Service service : provider.getServices()) {
                if (services.containsKey(service.getType())) {
                    final List<Provider> providers = services.get(service.getType());
                    if (!providers.contains(provider)) {
                        providers.add(provider);
                    }
                } else {
                    final List<Provider> providers = new ArrayList<>();
                    providers.add(provider);
                    services.put(service.getType(), providers);
                }
            }
        }

        // now get a list of algorithms and their respective providers
        for (String type : services.keySet()) {
            final Map<String, List<Provider>> algs = new TreeMap<>();
            for (Provider provider : Security.getProviders()) {
                for (Provider.Service service : provider.getServices()) {
                    if (service.getType().equals(type)) {
                        final String algorithm = service.getAlgorithm();
                        if (algs.containsKey(algorithm)) {
                            final List<Provider> providers = algs.get(algorithm);
                            if (!providers.contains(provider)) {
                                providers.add(provider);
                            }
                        } else {
                            final List<Provider> providers = new ArrayList<>();
                            providers.add(provider);
                            algs.put(algorithm, providers);
                        }
                    }
                }
            }

            // write the results to standard out.
            System.out.printf("%20s : %s\n", "", type);
            for (String algorithm : algs.keySet()) {
                System.out.printf("%-20s : %s\n", algorithm,
                        Arrays.toString(algs.get(algorithm).toArray()));
            }
            System.out.println();
        }
    }
}

The system administrator can override the standard crypto libraries. In practice it’s safest to always load your own crypto library and either register it manually, as above, or better yet pass it as an optional parameter when creating new objects.
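
For example, here is a minimal sketch of both approaches, assuming the BouncyCastle jar is on the classpath (the class name is illustrative):

import java.security.Security;
import javax.crypto.Cipher;
import org.bouncycastle.jce.provider.BouncyCastleProvider;

public class ExplicitProvider {
    public static void main(String[] args) throws Exception {
        // option 1: register the provider globally, then request it by name.
        Security.addProvider(new BouncyCastleProvider());
        Cipher byName = Cipher.getInstance("AES/CBC/PKCS5Padding", "BC");

        // option 2: pass the provider object directly; no global registration required.
        Cipher byProvider = Cipher.getInstance("AES/CBC/PKCS5Padding",
                new BouncyCastleProvider());

        System.out.println(byName.getProvider() + " / " + byProvider.getProvider());
    }
}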

Algorithms

There are a few dozen standard algorithms. The ones we’re most likely to be interested in are:

Symmetric Cipher

  • KeyGenerator – creates symmetric keys (see the sketch after this list)
  • SecretKeyFactory – converts between symmetric keys and raw bytes
  • Cipher – encryption cipher
  • AlgorithmParameters – algorithm parameters
  • AlgorithmParameterGenerator – generates algorithm parameters
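
A minimal sketch of the round trip – generate a fresh AES key, then convert raw bytes back into a key. (For a simple algorithm like AES the SecretKeySpec class does the conversion directly; SecretKeyFactory is used for algorithms such as PBKDF2, shown later in this post.)

import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

public class SymmetricKeys {
    public static void main(String[] args) throws Exception {
        // KeyGenerator creates a brand-new random symmetric key.
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(128); // 128-bit keys work even without the unlimited-strength policy files
        SecretKey key = keyGen.generateKey();

        // and back again: raw bytes to a usable SecretKey.
        SecretKey restored = new SecretKeySpec(key.getEncoded(), "AES");
        System.out.println(key.getAlgorithm() + " key, equal after round trip: "
                + key.equals(restored));
    }
}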

Asymmetric Cipher

  • KeyPairGenerator – creates public/private key pairs (see the sketch after this list)
  • KeyFactory – converts between key pairs and raw bytes
  • Cipher – encryption cipher
  • Signature – digital signatures
  • AlgorithmParameters – algorithm parameters
  • AlgorithmParameterGenerator – generates algorithm parameters
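
And a matching sketch for the asymmetric side – generate a 2048-bit RSA key pair, sign a message with the private key, and verify it with the public key (the key size follows the advice later in this post):

import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignAndVerify {
    public static void main(String[] args) throws Exception {
        // KeyPairGenerator creates the public/private key pair.
        KeyPairGenerator keyGen = KeyPairGenerator.getInstance("RSA");
        keyGen.initialize(2048);
        KeyPair pair = keyGen.generateKeyPair();

        byte[] message = "hello, world".getBytes(StandardCharsets.UTF_8);

        // sign with the private key...
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(pair.getPrivate());
        signer.update(message);
        byte[] signature = signer.sign();

        // ...and verify with the public key.
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(pair.getPublic());
        verifier.update(message);
        System.out.println("signature valid: " + verifier.verify(signature));
    }
}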

Digests

  • MessageDigest – digest (MD5, SHA1, etc.)
  • Mac – HMAC. Like a message digest but also requires a secret key, so it can’t be forged by an attacker (see the sketch after this list)
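
A short sketch contrasting the two; note that only the Mac needs a key:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import javax.crypto.KeyGenerator;
import javax.crypto.Mac;
import javax.crypto.SecretKey;

public class DigestVsMac {
    public static void main(String[] args) throws Exception {
        byte[] data = "hello, world".getBytes(StandardCharsets.UTF_8);

        // anyone can compute a digest...
        MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
        byte[] digest = sha256.digest(data);

        // ...but an HMAC also requires the secret key.
        SecretKey key = KeyGenerator.getInstance("HmacSHA256").generateKey();
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(key);
        byte[] tag = mac.doFinal(data);

        System.out.println(digest.length + "-byte digest, " + tag.length + "-byte HMAC tag");
    }
}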

Certificates and KeyStores

  • KeyStore – JKS, PKCS, etc.
  • CertStore – like a keystore but only stores certificates.
  • CertificateFactory – converts between digital certificates and raw bytes (see the sketch after this list).
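
A minimal sketch, assuming a hypothetical certificate file server.crt and a keystore keystore.jks with the password changeit:

import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;

public class LoadCertificate {
    public static void main(String[] args) throws Exception {
        // CertificateFactory converts raw (DER or PEM) bytes into a certificate.
        CertificateFactory cf = CertificateFactory.getInstance("X.509");
        try (InputStream in = new FileInputStream("server.crt")) { // hypothetical file
            X509Certificate cert = (X509Certificate) cf.generateCertificate(in);
            System.out.println("subject: " + cert.getSubjectDN());
        }

        // a KeyStore holds keys and certificates behind a password.
        KeyStore ks = KeyStore.getInstance("JKS");
        try (InputStream in = new FileInputStream("keystore.jks")) { // hypothetical file
            ks.load(in, "changeit".toCharArray());
            System.out.println("entries: " + ks.size());
        }
    }
}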

It is critical to remember that most algorithms are provided for backward compatibility and should not be used in greenfield development. As I write this, the generally accepted advice is:

  • Use a variant of AES. Only use AES-ECB if you know with absolute certainty that you will never encrypt more than one blocksize (16 bytes) of data.
  • Always use a good random IV, even if you’re using AES-CBC. Never reuse an IV or use an easily predicted one.
  • Do not use fewer than 2048 bits in an asymmetric key.
  • Use SHA-256 or better. MD5 is considered broken, and SHA-1 will be considered broken in the near future.
  • Use PBKDF2WithHmacSHA1 to create AES keys from passwords/passphrases. (See also Creating Password-Based Encryption Keys, and the sketch below.)

Some people might want to use one of the other AES-finalist ciphers (e.g., Twofish). These ciphers are probably safe, but you might run into problems if you’re sharing files with other parties since they’re not on the list of required algorithms.
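
Putting that advice together, here is a minimal sketch of password-based encryption – PBKDF2 with a random salt to derive the key, then AES-CBC with a fresh random IV. The passphrase, iteration count, and sizes are illustrative only:

import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

public class PasswordBasedEncryption {
    public static void main(String[] args) throws Exception {
        SecureRandom random = new SecureRandom();

        // derive an AES key from a passphrase with PBKDF2 and a random salt.
        byte[] salt = new byte[16];
        random.nextBytes(salt);
        SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1");
        PBEKeySpec spec = new PBEKeySpec("correct horse".toCharArray(), salt, 10000, 128);
        SecretKey key = new SecretKeySpec(factory.generateSecret(spec).getEncoded(), "AES");

        // AES-CBC with a fresh random IV for every message.
        byte[] iv = new byte[16];
        random.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal("attack at dawn".getBytes(StandardCharsets.UTF_8));

        // store or send salt + iv + ciphertext; none of them need to be secret.
        System.out.printf("salt=%d iv=%d ciphertext=%d bytes%n",
                salt.length, iv.length, ciphertext.length);
    }
}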

Beware US Export Restrictions

Finally, it’s important to remember that the standard Java distribution is crippled due to US export restrictions. You can get full functionality by installing the JCE Unlimited Strength Jurisdiction Policy Files, but it’s hard to guarantee that this has been done on every system where your code will run. In practice many if not most people use a third-party cryptographic library like BouncyCastle. Many inexperienced developers forget about this and unintentionally use crippled functionality.
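
One partial check is Cipher.getMaxAllowedKeyLength(), although it only tells you about the JVM it runs on – a minimal sketch:

import javax.crypto.Cipher;

public class CheckCryptoStrength {
    public static void main(String[] args) throws Exception {
        // 128 indicates the default restricted policy; Integer.MAX_VALUE indicates
        // the unlimited-strength policy files are installed.
        System.out.println("max AES key length: " + Cipher.getMaxAllowedKeyLength("AES"));
    }
}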

