All posts by roger

Using Maven for production deployment

We use Maven to manage dependencies during development. This entails adding a pom.xml file to our Eclipse project which defines the jars which the application depends. Maven then takes care of fetching the right version of the jars from a number of repositories (central maven repository, vendor specific repositories, our own repository).

This works pretty well and its hard to imagine developing complex projects without this capability. However, when it comes to ensuring that an application is delivered to the production environment with all its dependencies, you’re pretty much on your own. You have to build either a war file or a jar-with-dependencies – both of which can be very tricky and lead to problems occurring in production which you never saw during development.

Additionally, our applications tend to have a lot of dependencies and the war files get huge.

So, we thought, why not just use Maven on production servers to fetch applications and their dependencies.

To do this we maintain a pom.xml file on the production server with the application listed as a dependency. We use the maven goals “versions:use-latest-releases” and “versions:commit” to update the pom file automatically to the latest release version. We then use the “dependency:build-classpath” goal to build a class path from the repository and finally run the application.

Abbreviated component paths in wicket tester

Wicket tester combined with Mockito is fantastic for unit testing your web application. However, developing the tests is harder than it should be because you have to refer to components by their path rather than their id.

For example, to clock the OK button on a page, I need something like:

tester.executeAjaxEvent(“border:form:pagetitlecontainer:ok”. “onclick”);

i.e. instead of “ok”, I must refer to the button by its full path “border:form:pagetitlecontainer:ok”. This seems tedious – if I know that I only have one component on the page with the id “ok”, why can’t I just use “ok”. Wicket tester should notice that the component is not available and traverse the component hierarchy for it. If its ambiguous then it can complain, but if not, then it should be OK.

To get around this limitation, we override the getComponentFromLastRenderedPage method of the WicketTester class as follows:

@Override
public Component getComponentFromLastRenderedPage(final String path) {
	// first check if we can find the component with the specified path - if not, check if its an abbreviated path
	String fullPath = lookupPath(getLastRenderedPage(), path);
	if (fullPath == null)
		return null;

	return super.getComponentFromLastRenderedPage(fullPath);
}

public String lookupPath(final MarkupContainer markupContainer, final String path) {
	// try to look it up directly
	if (markupContainer.get(path) != null)
		return path;

	// if that fails, traverse the component hierarchy looking for it
	final List<component> candidates = new ArrayList<component>();
	markupContainer.visitChildren(new IVisitor<component>() {
		private static final long serialVersionUID = 1L;

		Set<component> visited = new HashSet<component>();

		@Override
		public Object component(Component c) {
			if (!visited.contains(c)) {
				visited.add(c);

				if (c.getId().equals(path))
					candidates.add(c);
			}
			return IVisitor.CONTINUE_TRAVERSAL;
		}
	});
	// if its unambiguous, then return the full path
	if (candidates.isEmpty()) {
		fail("path: '" + path + "' not found for " +
				Classes.simpleName(markupContainer.getClass()));
		return null;
	} else if (candidates.size() == 1) {
		String pathToContainer = markupContainer.getPath();
		String pathToComponent = candidates.get(0).getPath();
		return pathToComponent.replaceFirst(pathToContainer + ":", "");
	} else {
		String message = "path: '" + path + "' is ambiguous for " + Classes.simpleName(markupContainer.getClass()) + ". Possible candidates are: ";
		for (Component c : candidates) {
			message += "[" + c.getPath() + "]";
		}
		fail(message);
		return null;
	}
}

The Palchinsky Principles

My holiday reading this year was “Adapt” by Tim Harford. One of the most interesting parts was about Peter Palchinsky, a great russian engineer who was an advisor to both the Tsar and to Stalin. Although his uncompromising honesty got him exiled to Siberia by the Tsar, pardoned again and finally murdered by Stalin’s secret police, he nevertheless had time to formulate 3 principles for innovation:

  1. Try lots of things, expecting many to fail.
  2. Make sure the failures are survivable.
  3. Learn from the failures.

As Tim Harford points out, this is roughly how evolution works in nature and it seems applicable to software development. Since customers don’t generally appreciate failures on their projects, this trial and error cycle needs to be carried out outside of production projects – in the “20%” projects we do on our own time.

Wicket needs a component catalog

I’m a big fan of Apache Wicket, but there’s an urgent need for a better-organized component-ecosystem around Wicket. To write a real web application you need a framework (that’s Wicket) and visual components (that’s JQuery, YUI, Recaptcha etc) and you need the two to work together smoothly.

A newcomer to Wicket should be impressed by the clean, component-oriented approach to producing the GUI for his application, but a production web application needs more than just HTML. So, inevitably, the next step is to try to find the stuff which will make your web-application look like all those other web 2.0 sites out there. However, very few of the Wicket components referenced from the Wicket home pages are “best-of-breed” and a certain amount of disappointment soon sets in.

Take the wicket example captcha implementation for example. There are two problems here: (a) the implementation produces a captcha image which is not only difficult for robots to read, but also pretty impossible for humans and (b) everybody else is already using recaptcha so why even bother distracting developers with yet-another home-spun implementation.

The Wicket community might react by saying “well, there are enough blogs describing how to use recaptcha with Wicket, so where’s the problem?”. Well the problem is that these blogs typically provide their description in a way which reflects their own expertise rather than that of the reader, which means that a novice Wicket programmer may or may not be able to follow the instructions to successfully integrate recaptcha into his application.

Or to put it another way, developers blogs are not a substitute for organized, tested and documented component libraries.

What could help, on the other hand, is a catalog of components, with ratings and a compatibility matrix indicating tested interoperability with other components/browsers etc.

By the way what prompted me to write this post was that yesterday I finally used a few of the Visural Wicket components (based on JQuery) and they are fantastic – well thought out APIs, styling etc. What I don’t understand is that mediocre-to-lousy versions of equivalent components (such as the Wicket extensions modal window or the wicketstuff lightbox) feature much more prominently on the Wicket sites, so the novice programmer is far more likely to use them (and be bitterly disappointed) than to use Visural Wicket (and be delighted).

Wicket adoption would likely be accelerated if the end-to-end experience of producing the first production web application involved a bit less trial-and-error and a bit more delight.

Shadowed variables in inner classes

Yesterday we had a bug caused by the following situation: a class X had a variable “user”. Class X had an inner, anonymous class extending class Y. Inside the inner class a reference was made to the “user” variable from the enclosing class X. Somebody then added a protected variable “user” to class Y. The java compiler silently updated the reference to “user” within the inner class from X.user to Y.user.

This is pretty nasty and I hoped that some of the code-checking tools would warn about about it, so I checked the code with PMD, Findbugs and Checkstyle. Unfortunately, none of them flagged the issue.

Deciphering anonymous class references like MyPage$1$1$2

If you work with anonymous classes a lot (as Wicket applications do), you sometimes get confronted with compiler/tool messages referring to anonymous (inner) classes with names like MyPage$1$1$2 (for instance in FindBugs reports). Its pretty tedious to figure out which class is meant. Maybe there’s a better way, but what I do is go to the target directory and call “javap MyPage$1$1$2.class”. This produces a decompiled version of the class which makes it pretty clear which class it is.

Wicket 1.4 and browser tabs

We had an amazingly annoying problem in a Wicket application. A specific user was continuously having problems with ajax controls on pages (search fields, auto-complete fields etc). The problems were caused by PageExpiredExceptions. We couldn’t understand why only this one user had these problems. This went on for ages, until today I found out that Wicket 1.4 sets a default limit of 5 page maps per session. This specific user typically worked with multiple browser tabs on the application and once he went over 5, some of the page maps got evicted and the ajax stuff started failing.

The solution was to call “getSessionSettings().setMaxPageMaps(100)” to allow up to 100 page maps per session.

Java logging in development and production

Logging is tricky. Even some major open-source projects don’t do it correctly, so if you use their libraries, you end up with log files you didn’t ask for cluttering your machine.

Current best practices are to use a logging facade like commons-logging or slf4j to avoid these kind of problems by allowing libraries to conform to whatever logging strategy the application which uses the library is using. This means that if your app logs to myapp.log, then the library using slf4j will also log to myapp.log.

Here’s how we use slf4j in our projects:

Our libraries use slf4j-api – here’s the maven dependency you’ll need:

<dependency>
	<groupId>org.slf4j</groupId>
	<artifactId>slf4j-api</artifactId>
	<version>1.4.2</version>
</dependency>

Our applications use slf4j-log4j12 – here’s the maven dependency you’ll need:

<dependency>
	<groupId>org.slf4j</groupId>
	<artifactId>slf4j-log4j12</artifactId>
	<version>1.4.2</version>
</dependency>

The code to log looks like this (its the same in libraries or applications):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
...
public class MyClass {
	static Logger log = LoggerFactory.getLogger(MyClass.class);

	public Application() {
		log.info("some logging info");
	}
}

We want to have simple console logging during development. To do this we use a simple log4j.properties file containing only a console appender as shown below.

log4j.debug=false
log4j.rootLogger=INFO, Stdout
log4j.appender.Stdout=org.apache.log4j.ConsoleAppender
log4j.appender.Stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.Stdout.layout.conversionPattern=%-5p - %-26.26c{1} - %mn

We put this file in a Settings folder in our home directory.

During development we need to get log4j to load this file so we get console logging. We do this by defining a system property “log4j.configuration” in the Eclipse Preferences/Installed JREs/Edit/Default VM Arguments (that way it applies for all development projects):

-Dlog4j.configuration=file:///Users/roger/Settings/log4j.properties

During production, we do this same thing, but this time we pass a log4j configuration with a rolling file appender as shown below:

log4j.debug=false
log4j.rootLogger=INFO, R
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.File=myapp.log
log4j.appender.R.MaxFileSize=5000KB
log4j.appender.R.MaxBackupIndex=10
log4j.appender.R.layout=org.apache.log4j.PatternLayout
log4j.appender.R.layout.ConversionPattern=%d %p %t %c - %m%n

So the startup in production looks something like this:

java -Dlog4j.configuration=file:///home/administrator/log4j.properties -jar myapp.jar

Clustering scheduled jobs with Quartz and Terracotta

We’re using Quartz to handle scheduled jobs in our java applications. Since we cluster our apps for failover using Terracotta, we need to address the issue of how to failover scheduled jobs (we only want each job to execute exactly once, but for fault tolerance it should be scheduled on multiple servers).

Quartz handles this situation with clustered job stores and Terracotta provides a job store for Quartz. Its easy to use – just add the quartz-terracotta jar (you’ll also need a quartz version > 1.7 – I’m using 1.8.4 below).

<dependency>
	<groupId>org.quartz-scheduler</groupId>
	<artifactId>quartz</artifactId>
	<version>1.8.4</version>
</dependency>

<dependency>
	<groupId>org.terracotta.quartz</groupId>
	<artifactId>quartz-terracotta</artifactId>
	<version>1.2.0</version>
</dependency>

then set two systems properties. I’m doing it programmatically below because I use a commandline option to enable clustering in applications using embedded jetty. The “terracotta” commandline option is a terracotta url which tells the terracotta client where its servers are (something like “terracotta:9510,terracotta2:9510” for a fault-tolerant pair of terracotta servers).

if (cmd.hasOption("terracotta")) {
	System.setProperty("org.quartz.jobStore.class", "org.terracotta.quartz.TerracottaJobStore");
	System.setProperty("org.quartz.jobStore.tcConfigUrl", cmd.getOptionValue("terracotta"));
}

That’s it basically. The only additional thing you have to take care of is that while setting up the scheduled job, you need to be aware that another member of the cluster may have already set it up, so you should expect an ObjectAlreadyExistsException while setting up the quartz job.

Weird racing clock problem on VMWare Linux Guests

A while back we installed a new Dell server with a low-power Xeon 3GHz L3110 CPU to run some other essential network infrastructure. We chose the specific server configuration because it dissipates less than 30W while running 5 VMware VMs. It runs for hours on a UPS and doesn’t require cooling, so even if our server room air-conditioning were to die, this server should keep our network, firewalls, VPNs, DNS, DHCP and primary Terracotta server and a few other vital services up long enough for us to figure out what’s going on.

The server is running CentOS 5.5 and VMWare 1.0.10 – normally a very stable combination. However, we found that Linux guests running on this server were not keeping time properly. their clocks were running too fast – 50% too fast in fact. We finally figured out that the problem was caused by the fact that the CPU power-saving causes Linux to think that has a 2GHz processor instead of a 3GHz processor and this causes the 50% clock speedup in the Linux guests under VMWare. We disabled the demand-based power saving feature of the CPU in the BIOS and now it works correctly.