Tuesday 21 January 2014

java.io.Serializable and serialVersionUID explained

When you mark a class as Serializable, Eclipse (and perhaps other IDEs) issues a warning and offers three options: (1) ignore the warning and carry on, (2) let the IDE generate a serialVersionUID for you (the value is actually computed by your JDK), or (3) let the IDE assign a default serialVersionUID to your class. Let's explore these options and see which one to pick and why.

Definition

Every Java class implementing the marker interface java.io.Serializable should declare a static final serialVersionUID of type long (any access modifier is allowed, although private is the usual choice). If the programmer does not provide one, the serialization runtime computes a default value from the details of the class.

The serialization runtime associates with each serializable class a version number, called a serialVersionUID, which is used during deserialization to verify that the sender and receiver of a serialized object have loaded classes for that object that are compatible with respect to serialization. If the receiver has loaded a class for the object that has a different serialVersionUID than that of the corresponding sender's class, then deserialization will result in an InvalidClassException.

No serialVersionUID

If a class does not define this constant and is serialized using the standard Java serialization process, the serialVersionUID computed at runtime will be used and stored in the serialized output. If we later decide to add this constant to the Serializable class and choose a default of 1L, any previously serialized objects of this same class, although in theory compatible, will fail to deserialize with an InvalidClassException, because the UID stored with them no longer matches the one declared in the class.
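The failure described above can be reproduced in a single file with a small trick. This is only a sketch: the class names are hypothetical, and instead of compiling two versions of a class it simulates a sender with a different version by flipping one bit of the 8-byte UID that the stream stores right after the class name. It also assumes the class name contains only ASCII characters, so that its encoded length equals its string length.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class UidMismatchDemo {

    // Hypothetical class; its explicit UID plays the role of the receiver's version.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        int balance = 42;
    }

    // Serializes an object to a byte array using standard Java serialization.
    static byte[] serialize(Object o) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Returns true if deserialization is rejected with an InvalidClassException.
    static boolean rejected(byte[] data) throws Exception {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return false;
        } catch (InvalidClassException e) {
            return true;
        }
    }

    // Simulates a sender with a different UID. The stream starts with a 4-byte
    // header, two 1-byte type markers and a 2-byte name length, followed by the
    // class name and then the 8-byte UID (assumes an ASCII class name).
    static byte[] withDifferentUid(byte[] data, Class<?> type) {
        byte[] copy = data.clone();
        int uidStart = 8 + type.getName().length();
        copy[uidStart + 7] ^= 0x01; // flip the lowest bit of the stored UID
        return copy;
    }

    public static void main(String[] args) throws Exception {
        byte[] original = serialize(new Account());
        System.out.println(rejected(original));                                  // false
        System.out.println(rejected(withDifferentUid(original, Account.class))); // true
    }
}
```

The fields and their values are untouched in the patched stream. Only the version number differs, and that alone is enough for the runtime to refuse the object.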

Generated serialVersionUID

Therefore, if there are already serialized copies of objects that use a runtime-computed serialVersionUID, it’s best to select the option to generate the serialVersionUID in your IDE, which will come up with a UID matching that of the previously serialized objects. This is especially important if we want to make compatible changes to the class without breaking deserialization of existing data. First, generate a serialVersionUID from the class marked Serializable while it does not yet explicitly define this constant (use your IDE, or the JDK utility serialver, to obtain the correct UID). Then add the constant to the class, add fields (or make other compatible changes), and recompile. The compiler will bake your explicitly defined serialVersionUID into the class definition, and your new code will be able to deserialize both the old and the new versions of the serialized objects, as both will have a matching serialVersionUID.
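If neither the IDE nor serialver is at hand, the same value can be read programmatically through java.io.ObjectStreamClass. The sketch below uses a hypothetical LegacyOrder class that never declared the constant. The number it prints depends on the details of the class, so no expected output is shown.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class ComputedUidDemo {

    // Hypothetical legacy class that never declared a serialVersionUID.
    static class LegacyOrder implements Serializable {
        String id;
        int quantity;
    }

    // Returns the UID the serialization runtime associates with the class.
    // For a class without an explicit constant this is the computed default.
    static long computedUid(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        long uid = computedUid(LegacyOrder.class);
        // Paste the printed declaration into the class BEFORE changing anything else.
        System.out.println("private static final long serialVersionUID = " + uid + "L;");
    }
}
```

Run this against the class exactly as it was when the existing objects were serialized. Computing the value after fields have been added would yield a different UID.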

Note, however, that the computed default is highly sensitive to details of the class that can vary between compiler implementations, so the UID computed on the sending side might not match the one computed on the receiving side; serialization between different systems could then fail.

Default serialVersionUID

On the other hand, we have the option to use a default serialVersionUID. This is a good choice for new Serializable classes: there are no previously serialized objects of the class yet, so there is no reason to generate a UID to match them. Define a serialVersionUID of 1L and remember to update it every time you make an incompatible change to the class. That is easier to do with a small, explicit version number such as 1L than with a long generated one.
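For a brand-new class, the declaration could look like the following sketch. The Invoice class and its fields are hypothetical; the lookup in main only confirms that the runtime uses the declared value rather than computing one.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class DefaultUidDemo {

    // Hypothetical new class with no previously serialized instances.
    static class Invoice implements Serializable {
        // Start at 1L. Bump to 2L, 3L, ... on each incompatible change
        // (for example, deleting a field). Adding a field is a compatible
        // change and does not require a new number.
        private static final long serialVersionUID = 1L;

        String number;
        long amountInCents;
    }

    // Returns the UID the serialization runtime will use for the class.
    static long declaredUid(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(declaredUid(Invoice.class)); // prints 1
    }
}
```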

Wednesday 15 January 2014

Log4j logger hierarchies and Java nested classes

According to the log4j documentation, "a logger is said to be an ancestor of another logger if its name followed by a dot is a prefix of the descendant logger name." So beware of nested classes!

  <logger name="com.my_group_id.MyClassDontLogDebug">
    <level value="error" />
  </logger>

  <logger name="com.my_group_id">
    <level value="debug" />
  </logger>

Any DEBUG-level log statements from MyClassDontLogDebug will be ignored, as expected. However, DEBUG-level log statements from loggers obtained by classes nested inside MyClassDontLogDebug will still appear in your logs, even though nested classes are members of MyClassDontLogDebug! The problem is that the name of a nested class uses a dollar sign "$", not a dot, to separate the nested class name from its enclosing class. For example, when com.my_group_id.MyClassDontLogDebug$MyNestedClass acquires a logger in the usual fashion (LogFactory.getLog(getClass());), the logger will not inherit its logging level from MyClassDontLogDebug, because that logger is not its ancestor. Instead it becomes a child of the com.my_group_id logger and inherits the DEBUG level, despite our efforts to stop DEBUG-level logging from the MyClassDontLogDebug class.

Re-read the definition carefully: "A logger is said to be an ancestor of another logger if its name followed by a dot is a prefix of the descendant logger name. A logger is said to be a parent of a child logger if there are no ancestors between itself and the descendant logger."

Take the name of our logger, "com.my_group_id.MyClassDontLogDebug", append a dot to get "com.my_group_id.MyClassDontLogDebug.", and see whether it is a prefix of "com.my_group_id.MyClassDontLogDebug$MyNestedClass". Behold, it's not: there's a dollar sign where the dot would need to be. It caught me out. Don't let it catch YOU out.
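The prefix check is easy to try out in plain Java, without log4j on the classpath. The sketch below is only an illustration of the rule as quoted; isAncestor is a hypothetical helper, not a log4j API, and the nested classes are stand-ins for the ones named in the configuration.

```java
public class LoggerAncestryDemo {

    // Hypothetical stand-ins for the classes named in the configuration.
    static class MyClassDontLogDebug {
        static class MyNestedClass { }
    }

    // The quoted rule: the ancestor's name followed by a dot must be a
    // prefix of the descendant's name.
    static boolean isAncestor(String ancestor, String descendant) {
        return descendant.startsWith(ancestor + ".");
    }

    public static void main(String[] args) {
        String outer = "com.my_group_id.MyClassDontLogDebug";
        String nested = "com.my_group_id.MyClassDontLogDebug$MyNestedClass";

        System.out.println(isAncestor(outer, nested));             // false
        System.out.println(isAncestor("com.my_group_id", nested)); // true

        // Class.getName() is where the dollar sign comes from.
        String runtimeName = MyClassDontLogDebug.MyNestedClass.class.getName();
        System.out.println(runtimeName.contains("$"));             // true
    }
}
```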