Modern databases allow stored procedures to be written in a variety of languages. One commonly implemented language is java.
N.B., this article discusses the PostgreSQL-specific java implementation. The details will vary with other databases but the concepts will be the same.
Installation of PL/Java
Installation of PL/Java on an Ubuntu system is straightforward. I will first create a new template, template_java, so I can still create databases without the pl/java extensions.
At the command line, assuming you are a database superuser, enter
- # apt-get install postgresql-9.1
- # apt-get install postgresql-9.1-pljava-gcj
- $ createdb template_java
- $ psql -d template_java -c "update db_database set datistemplate='t' where datnam='template_java'"
- $ psql -d template_java -f /usr/share/postgresql-9.1-pljava/install.sql
# apt-get install postgresql-9.1 # apt-get install postgresql-9.1-pljava-gcj $ createdb template_java $ psql -d template_java -c "update db_database set datistemplate='t' where datnam='template_java'" $ psql -d template_java -f /usr/share/postgresql-9.1-pljava/install.sql
Limitations
The prepackaged Ubuntu package uses the Gnu GCJ java implementation, not a standard OpenJDK or Sun implementation. GCJ compiles java source files to native object code instead of byte code. The most recent versions of PL/Java are “trusted” – they can be relied upon to stay within their sandbox. Among other things this means that you can’t access the filesystem on the server.
If you must break the trust there is a second language, ‘javaU’, that can be used. Untrusted functions can only be created a the database superuser.
More importantly this implementation is single-threaded. This is critical to keep in mind if you need to communicate to other servers.
Something to consider is whether you want to compile your own commonly used libraries with GCJ and load them into the PostgreSQL server as shared libraries. Shared libraries go in /usr/lib/postgresql/9.1/lib and I may have more to say about this later.
Quick verification
We can easily check our installation by writing a quick test function. Create a scratch database using template_java and enter the following SQL:
- CREATE FUNCTION getsysprop(VARCHAR) RETURNS VARCHAR
- AS 'java.lang.System.getProperty'
- LANGUAGE java;
- SELECT getsysprop('user.home');
CREATE FUNCTION getsysprop(VARCHAR) RETURNS VARCHAR
AS 'java.lang.System.getProperty'
LANGUAGE java;
SELECT getsysprop('user.home');You should get “/var/lib/postgresql” as a result.
Installing Our Own Methods
This is a nice start but we don’t really gain much if we can’t call our own methods. Fortunately it isn’t hard to add our own.
A simple PL/Java procedure is
- package sandbox;
- public class PLJava {
- public static String hello(String name) {
- if (name == null) {
- return null;
- }
- return "Hello, " + name + "!";
- }
- }
package sandbox;
public class PLJava {
public static String hello(String name) {
if (name == null) {
return null;
}
return "Hello, " + name + "!";
}
}There are two simple rules for methods implementing PL/Java procedures:
- they must be public static
- they must return null if any parameter is null
That’s it.
Importing the java class into PostgreSQL server is simple. Let’s assume that the package classes are in /tmp/sandbox.jar and our java-enabled database is mydb. Our commands are then
- --
- -- load java library
- --
- -- parameters:
- -- url_path - where the library is located
- -- url_name - how the library is referred to later
- -- deploy - should the deployment descriptor be used?
- --
- select sqlj.install_jar('file:///tmp/sandbox.jar', 'sandbox', true);
- --
- -- set classpath to include new library.
- --
- -- parameters
- -- schema - schema (or database) name
- -- classpath - colon-separated list of url_names.
- --
- select sqlj.set_classpath('mydb', 'sandbox');
- -- -------------------
- -- other procedures --
- -- -------------------
- --
- -- reload java library
- --
- select sqlj.replace_jar('file:///tmp/sandbox.jar', 'sandbox', true);
- --
- -- remove java library
- --
- -- parameters:
- -- url_name - how the library is referred to later
- -- undeploy - should the deployment descriptor be used?
- --
- select sqlj.remove_jar('sandbox', true);
- --
- -- list classpath
- --
- select sqlj.get_classpath('mydb');
- --
--
-- load java library
--
-- parameters:
-- url_path - where the library is located
-- url_name - how the library is referred to later
-- deploy - should the deployment descriptor be used?
--
select sqlj.install_jar('file:///tmp/sandbox.jar', 'sandbox', true);
--
-- set classpath to include new library.
--
-- parameters
-- schema - schema (or database) name
-- classpath - colon-separated list of url_names.
--
select sqlj.set_classpath('mydb', 'sandbox');
-- -------------------
-- other procedures --
-- -------------------
--
-- reload java library
--
select sqlj.replace_jar('file:///tmp/sandbox.jar', 'sandbox', true);
--
-- remove java library
--
-- parameters:
-- url_name - how the library is referred to later
-- undeploy - should the deployment descriptor be used?
--
select sqlj.remove_jar('sandbox', true);
--
-- list classpath
--
select sqlj.get_classpath('mydb');
--It is important to remember to set the classpath. Libraries are automatically removed from the classpath when they’re unloaded but they are NOT automatically added to the classpath when they’re installed.
We aren’t quite finished – we still need to tell the system about our new function.
- --
- -- create function
- --
- CREATE FUNCTION mydb.hello(varchar) RETURNS varchar
- AS 'sandbox.PLJava.hello'
- LANGUAGE java;
- --
- -- drop this function
- --
- DROP FUNCTION mydb.hello(varchar);
- --
-- -- create function -- CREATE FUNCTION mydb.hello(varchar) RETURNS varchar AS 'sandbox.PLJava.hello' LANGUAGE java; -- -- drop this function -- DROP FUNCTION mydb.hello(varchar); --
We can now call our java method in the same manner as any other stored procedures.
Deployment Descriptor
There’s a headache here – it’s necessary to explicitly create the functions when installing a library and dropping them when removing a library. This is time-consuming and error-prone in all but the simplest cases.
Fortunately there’s a solution to this problem – deployment descriptors. The precise format is defined by ISO/IEC 9075-13:2003 but a simple example should suffice.
- SQLActions[] = {
- "BEGIN INSTALL
- CREATE FUNCTION javatest.hello(varchar)
- RETURNS varchar
- AS 'sandbox.PLJava.hello'
- LANGUAGE java;
- END INSTALL",
- "BEGIN REMOVE
- DROP FUNCTION javatest.hello(varchar);
- END REMOVE"
- }
SQLActions[] = {
"BEGIN INSTALL
CREATE FUNCTION javatest.hello(varchar)
RETURNS varchar
AS 'sandbox.PLJava.hello'
LANGUAGE java;
END INSTALL",
"BEGIN REMOVE
DROP FUNCTION javatest.hello(varchar);
END REMOVE"
}You must tell the deployer about the deployment descriptor in the jar’s MANIFEST.MF file. A sample maven plugin is
- <plugin>
- <groupId>org.apache.maven.plugins</groupId>
- <artifactId>maven-jar-plugin</artifactId>
- <version>2.3.1</version>
- <configuration>
- <archive>
- <manifestSections>
- <manifestSection>
- <name>postgresql.ddr</name> <!-- filename -->
- <manifestEntries>
- <SQLJDeploymentDescriptor>TRUE</SQLJDeploymentDescriptor>
- </manifestEntries>
- </manifestSection>
- </manifestSections>
- </archive>
- </configuration>
- </plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>2.3.1</version>
<configuration>
<archive>
<manifestSections>
<manifestSection>
<name>postgresql.ddr</name> <!-- filename -->
<manifestEntries>
<SQLJDeploymentDescriptor>TRUE</SQLJDeploymentDescriptor>
</manifestEntries>
</manifestSection>
</manifestSections>
</archive>
</configuration>
</plugin>The database will now know about our methods as they areinstalled and removed.
Internal Queries
One of the ‘big wins’ with stored procedures is that queries are executed on the server itself and are MUCH faster than running them through the programmatic interface. I’ve seen a process that required over 30 minutes via Java knocked down to a fraction of a second by simply moving the queried loop from the client to the server.
The JDBC URL for the internal connection is “jdbc:default:connection”. You cannot use transactions (since you’re within the caller’s transaction) but you can use savepoints as long as you stay within a single call. I don’t know if you can use CallableStatements (other stored procedures yet) – you couldn’t in version 1.2 but the Ubuntu 11.10 package uses version 1.4.2.
Upcoming
Next time I will discuss working with lists and triggers, followed by user-defined types.
More Information
TADA wiki – the author’s blog.
http://pgfoundry.org/projects/pljava/ – Home of PL/Java project.
PL/Java 1.2 User Guide (Note: the Ubuntu package uses release 1.4.2)
[...] [...]