Thursday, 26 July 2012

An introduction to XML

What is XML?

XML, or eXtensible Markup Language, was created by the World Wide Web Consortium (W3C) to overcome the limitations of HTML. HTML tags tell a browser how to display information, but they don't tell the browser what the information is. With XML, you can define tags that describe the meaning of the data they enclose, so that a program can process the document.

A simple XML File
<?xml version="1.0" encoding="UTF-8"?>
<blog>
  <post id="1" value="post1">
    <title>Serialization in Java</title>
  </post>
  <post id="2" value="post2">
    <title>serialVersionUID in Java Object Serialization</title>
  </post>
  <post id="3" value="post3">
    <title>Serializable vs Externalizable</title>
  </post>
</blog>

A little bit of XML terminology

  • XML Declaration: The XML declaration is recommended but not mandatory.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
The version is the version of XML used. The encoding is the character set used in this document; if no encoding is specified, the XML parser assumes UTF-8 (or UTF-16 when the file starts with a byte-order mark). standalone, which can be either yes or no, defines whether this document can be processed without reading any other files.
  • Tag: is the text between the left angle bracket (<) and the right angle bracket (>). There are starting tags such as <post>, ending tags such as </post>, and self-closing tags such as <publishedOn date="20-07-2012"/>.
  • Element: is the starting tag, the ending tag, and everything in between, for example <post id="14"> <title>An Introduction to XML</title> ... </post>
  • Root Element: is the outermost element of the XML file, which encloses all the other elements. Each XML document has exactly one root element, also known as the document element. In the above example, <blog> is the root element.
  • Attribute: is a name-value pair inside the starting tag of an element. In the above example, id="1" and value="post1" are attributes of <post>.
  • Comments: can appear anywhere in the document outside other markup, but not before the XML declaration. A comment begins with <!-- and ends with -->
  • Processing Instructions (PI): give a command or information to an application that is processing the XML.
<?target instruction?>
where target is the name of the application that is expected to do the processing, and instruction is the command or information for that application.
NOTE: The XML Declaration at the beginning of an XML document looks like a processing instruction, but it is not one.
  • Entities: are aliases for a piece of information. The XML spec predefines five entities that you can use in place of various special characters. The entities are:
    • &lt; for the less-than sign
    • &gt; for the greater-than sign
    • &quot; for a double-quote
    • &apos; for a single quote (or apostrophe)
    • &amp; for an ampersand.
You can also define your own entities in the DTD:
<!ENTITY name "definition">

<!ENTITY blogurl "">
Anywhere the XML processor finds the string &blogurl;, it replaces the entity with the string defined for it.
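
The terms above can be seen in action with the DOM parser that ships with the JDK. The following is only a minimal sketch: the class name XmlTerminologyDemo, the entity name blogname and its value "example-blog" are made up for illustration and are not part of the original example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class XmlTerminologyDemo {
    // Hypothetical sample document; the entity "blogname" and its value are assumptions.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE blog [ <!ENTITY blogname \"example-blog\"> ]>\n"
        + "<blog>\n"
        + "  <!-- a comment is ignored by the lookups below -->\n"
        + "  <post id=\"1\" value=\"post1\">\n"
        + "    <title>Posts &amp; notes on &blogname;</title>\n"
        + "  </post>\n"
        + "</blog>";

    // Parses the text into a DOM tree; checked exceptions are wrapped to keep the sketch short.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    // Name of the root (document) element.
    static String rootName(String xml) {
        return parse(xml).getDocumentElement().getTagName();
    }

    // Value of the id attribute on the first <post> element.
    static String firstPostId(String xml) {
        Element post = (Element) parse(xml).getElementsByTagName("post").item(0);
        return post.getAttribute("id");
    }

    // Text of the first <title> element, with entity references already expanded.
    static String firstTitle(String xml) {
        Element title = (Element) parse(xml).getElementsByTagName("title").item(0);
        return title.getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("Root element: " + rootName(XML));
        System.out.println("Attribute id: " + firstPostId(XML));
        System.out.println("Title text:   " + firstTitle(XML));
    }
}
```

The title text comes back with both the predefined entity &amp; and the user-defined entity already replaced, because the parser expands entity references by default.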

XML document rules

  • Root Element is mandatory. Every XML document must contain exactly one root element.
  • Elements can't overlap - If you begin a <tag2> element inside <tag1>, then <tag2> must end before <tag1> ends.
  • End tags are required - Every starting tag needs a matching ending tag, unless the tag is self-closing.
  • Elements are case sensitive - In XML <blog> and <Blog> are not the same. If you try to end a <blog> element with a </Blog> tag, you'll get an error.
  • Attributes must have values, and the values must be enclosed in quotation marks (single or double).
  • Element names must follow these naming rules:
    • they can contain letters, digits, hyphens, underscores and periods
    • they cannot contain spaces
    • they must begin with a letter or an underscore, not with a digit, hyphen or period
    • they should not start with xml (in any combination of upper and lower case), because such names are reserved
  • The XML declaration, if present, must be the very first thing in the document.
  • Avoid empty lines or any other whitespace before the XML declaration, because a conforming parser reports an error for such files.
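
A quick way to try these rules out is to hand small fragments to a parser and see which ones it rejects. The sketch below uses the JDK's built-in DOM parser; the class name WellFormedCheck and the sample strings are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Returns true when the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document broke one of the well-formedness rules
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<blog><post/></blog>",                 // fine
            "<tag1><tag2></tag1></tag2>",           // overlapping elements
            "<blog></Blog>",                        // case mismatch
            "<post id=1/>",                         // unquoted attribute value
            "<a/><b/>",                             // two root elements
            "<1post/>",                             // name starts with a digit
            "\n<?xml version=\"1.0\"?><blog/>"      // declaration is not at the very start
        };
        for (String s : samples) {
            System.out.println(isWellFormed(s) + "  " + s.trim());
        }
    }
}
```

Only the first sample passes; each of the others violates one of the rules listed above.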

XML Advantages:

  • Easy Information Exchange - XML allows easy sharing of data between different applications  - even if these applications are written in different languages and reside on different platforms.
  • XML enables smart code - XML's rigid set of rules helps make documents more readable to both humans and machines. XML document syntax contains a fairly small set of rules, making it possible for developers to get started right away.  
  • Self-describing data - Every important piece of information (as well as the relationships between the pieces) can be identified easily.
  • Openness - It allows users to define their own tags and DTDs; these sets of tags can then be used by applications very easily.
  • Unicode support - Enables a wide variety of characters to be represented and communicated.

XML Disadvantages:

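  • XML syntax is verbose and redundant; this may affect application efficiency through higher storage, transmission and processing costs.
  • Every XML vocabulary defines its own tags, so an application generally needs code specific to each vocabulary in order to interpret the data; a single generic application cannot understand every XML document.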

Tuesday, 24 July 2012

Annotations in Java

Annotations are tags that can be inserted into a Java program so that they can be processed by tools. In the Java programming language, an annotation is used like a modifier: it is placed before the annotated item, without a semicolon. The name of each annotation is preceded by an @ symbol.
For example:
@Override
public boolean equals(Object obj) { ... }

Annotation Syntax

An annotation is defined by an annotation interface: 

public @interface AnnotationName {
    // element declarations
    type elementName();
    type elementName() default value;
    . . .
}
An annotation element can be declared with one of the following types:
  • All primitives (boolean, byte, char, short, int, long, float, double)
  • String
  • Class
  • Enum
  • Another annotation type
  • Array of any of the above types
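
As an illustration of the syntax and the allowed element types, here is a small sketch that declares an annotation and reads its values back. All names in it (ElementTypesDemo, Task, Priority, Sample) are hypothetical and exist only for this example.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class ElementTypesDemo {
    enum Priority { LOW, HIGH }

    // RUNTIME retention so that the values can be read back through reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Task {
        int id();                                  // primitive, no default: must be supplied
        String title() default "untitled";         // String
        Class<?> owner() default Object.class;     // Class
        Priority priority() default Priority.LOW;  // enum
        String[] tags() default {};                // array of an allowed type
    }

    // Only id is mandatory; every other element falls back to its default.
    @Task(id = 7, tags = {"xml", "java"})
    static class Sample { }

    static Task read() {
        return Sample.class.getAnnotation(Task.class);
    }

    public static void main(String[] args) {
        Task t = read();
        System.out.println(t.id() + " " + t.title() + " " + t.priority()
            + " " + t.owner().getSimpleName() + " " + String.join(",", t.tags()));
    }
}
```

An element without a default value, such as id here, must be given a value wherever the annotation is used; otherwise the code does not compile.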


Meta annotations are annotations that are used to annotate annotations. There are four meta-annotation types that come standard with Java 5: 
@Documented: is a marker annotation. When an annotation type is marked with @Documented, uses of that annotation are included in the documentation generated by Javadoc and similar tools.
@Inherited: indicates that an annotation type applied to a class is automatically inherited by its subclasses. It has no effect on annotations applied to anything other than a class.
@Inherited
public @interface InheritedAnno { }

@InheritedAnno
@SuppressWarnings("unchecked")
public class SuperClass { ... }

public class SubClass extends SuperClass { ... }

In the above example, SuperClass is explicitly annotated with @InheritedAnno and @SuppressWarnings. SubClass has not been explicitly marked with any annotation; however, it automatically inherits @InheritedAnno because of the @Inherited meta-annotation. @SuppressWarnings is not inherited, because that annotation type is not marked with @Inherited.
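
The inheritance behaviour can be checked at run time with reflection. This is a minimal sketch, assuming RUNTIME retention is added so that reflection can see the annotations; PlainAnno is a hypothetical second annotation used only to show the contrast.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class InheritedDemo {
    @Inherited
    @Retention(RetentionPolicy.RUNTIME)
    @interface InheritedAnno { }

    // Same retention, but without @Inherited.
    @Retention(RetentionPolicy.RUNTIME)
    @interface PlainAnno { }

    @InheritedAnno
    @PlainAnno
    static class SuperClass { }

    static class SubClass extends SuperClass { }

    // True when the annotation is visible on the class, directly or through inheritance.
    static boolean has(Class<?> type, Class<? extends Annotation> anno) {
        return type.isAnnotationPresent(anno);
    }

    public static void main(String[] args) {
        System.out.println("SubClass has @InheritedAnno: " + has(SubClass.class, InheritedAnno.class));
        System.out.println("SubClass has @PlainAnno:     " + has(SubClass.class, PlainAnno.class));
        // getDeclaredAnnotations() ignores inherited annotations.
        System.out.println("Declared on SubClass:        " + SubClass.class.getDeclaredAnnotations().length);
    }
}
```

SubClass reports @InheritedAnno as present but not @PlainAnno, and it declares no annotations of its own.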

@Retention: indicates how long the annotations are to be retained. If no retention policy is defined, it defaults to RetentionPolicy.CLASS 

Retention Policies for the @Retention Annotation
Retention Policy - Description
SOURCE - Annotations are not included in class files. Annotations like @SuppressWarnings and @Override are used by the compiler at compile time to validate source code and are then discarded.
CLASS - Annotations are included in class files, but the virtual machine need not load them. These are read by application servers and other software tools, for example at deployment time, to generate XML, boilerplate code etc.
RUNTIME - Annotations are included in class files and loaded by the virtual machine, so they can be read through the reflection API at run time. This lets your code behave differently depending on the annotations it finds.
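
The effect of the three policies on reflection can be sketched as follows. The annotation names (SourceOnly, ClassOnly, AtRuntime, DefaultRetention) are hypothetical and chosen for this example only.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class RetentionDemo {
    @Retention(RetentionPolicy.SOURCE)
    @interface SourceOnly { }

    @Retention(RetentionPolicy.CLASS)
    @interface ClassOnly { }

    @Retention(RetentionPolicy.RUNTIME)
    @interface AtRuntime { }

    // No @Retention at all: behaves like RetentionPolicy.CLASS.
    @interface DefaultRetention { }

    @SourceOnly
    @ClassOnly
    @AtRuntime
    @DefaultRetention
    static class Marked { }

    // True only when the annotation survives until run time.
    static boolean visible(Class<? extends Annotation> anno) {
        return Marked.class.isAnnotationPresent(anno);
    }

    public static void main(String[] args) {
        System.out.println("SOURCE:  " + visible(SourceOnly.class));
        System.out.println("CLASS:   " + visible(ClassOnly.class));
        System.out.println("RUNTIME: " + visible(AtRuntime.class));
        System.out.println("default: " + visible(DefaultRetention.class));
    }
}
```

Only the RUNTIME annotation is visible through reflection; the other three are either dropped by the compiler or kept in the class file without being loaded.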

@Target: describes the program element on which an annotation is applicable. If no target is defined the annotation can be used on any program element. If target is defined the compiler will enforce specific usage restriction. 

Element Types for the @Target Annotation
Element Type - Description
ANNOTATION_TYPE - Annotation can be applied to annotation type declarations
CONSTRUCTOR - Annotation can be applied to constructors
FIELD - Annotation can be applied to fields (instance and static variables, including enum constants)
LOCAL_VARIABLE - Annotation can be applied to local variables
METHOD - Annotation can be applied to any method declaration
PACKAGE - Annotation can be applied to package declarations
PARAMETER - Annotation can be applied to method parameters
TYPE - Annotation can be applied to class, interface (including annotation type) and enum declarations
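
Because @Target itself is retained at run time, the allowed element types of an annotation can be inspected with reflection. The sketch below is illustrative only; TargetDemo and OnMembers are made-up names.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

public class TargetDemo {
    // Hypothetical annotation that may only be placed on methods and fields.
    @Target({ElementType.METHOD, ElementType.FIELD})
    @Retention(RetentionPolicy.RUNTIME)
    @interface OnMembers { }

    static class Sample {
        @OnMembers int counter;          // allowed: FIELD
        @OnMembers void increment() { }  // allowed: METHOD
        // Putting @OnMembers on the class itself would be rejected by the compiler.
    }

    // Lists the element types named in the annotation's @Target, if it has one.
    static String targetsOf(Class<? extends Annotation> anno) {
        Target target = anno.getAnnotation(Target.class);
        return target == null ? "(no @Target)" : Arrays.toString(target.value());
    }

    public static void main(String[] args) {
        System.out.println("@OnMembers: " + targetsOf(OnMembers.class));
        System.out.println("@Override:  " + targetsOf(Override.class));
    }
}
```

The same check run against @Override shows that it is restricted to methods.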

User Defined Annotation (Custom Annotation)


import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

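@Target(value={ElementType.METHOD, ElementType.LOCAL_VARIABLE, ElementType.TYPE, ElementType.FIELD})
//RUNTIME retention is needed so that the annotation can be read through reflection
@Retention(RetentionPolicy.RUNTIME)
public @interface CustomAnnotation {
    int id() default 0;
    String value() default "";
}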

Annotated Class
The following code snippet shows the various forms in which an annotation can be applied.

@CustomAnnotation(id=111, value="Annotation on Class")
public class AnnotatedClass {
    //The order of the elements does not matter
    @CustomAnnotation(value="Annotation on globalVar1", id=222)
    private int globalVar1;
    //If an element value is not specified, the default value is used
    @CustomAnnotation(value="Annotation on globalVar2")
    private float globalVar2;
    //Single-value shorthand: the element name can be omitted when only value is set
    @CustomAnnotation("Annotation on globalVar3")
    private int globalVar3;
    //Marker-style usage: every element takes its default value
    @CustomAnnotation
    private int globalVar4;

    @CustomAnnotation
    public void annotatedMethod(){
        System.out.println("Annotated method");
    }
}

Making use of Reflection API


import java.lang.reflect.Field;
import java.lang.reflect.Method;

public class CustomAnnotationParser {
    public static void main(String[] args) {
        parseAnnotatedClass();
    }

    public static void parseAnnotatedClass() {
        Class<AnnotatedClass> c = AnnotatedClass.class;

        //Fetches class-level annotation information
        CustomAnnotation anno = c.getAnnotation(CustomAnnotation.class);
        System.out.println("Id: " + anno.id());
        System.out.println("Value: " + anno.value());

        //Retrieves all the methods declared in the class
        Method[] methods = c.getDeclaredMethods();
        for (Method m : methods) {
            System.out.println("Method Name = " + m.getName());
            //Checks if CustomAnnotation is present on the current method
            CustomAnnotation mAnno = m.getAnnotation(CustomAnnotation.class);
            if (mAnno != null) {
                System.out.println("Id: " + mAnno.id());
                System.out.println("Value: " + mAnno.value());
            } else {
                System.out.println(m.getName() + " is not annotated");
            }
        }

        //Retrieves all the global variables/fields declared in the class
        Field[] fields = c.getDeclaredFields();
        for (Field f : fields) {
            System.out.println("Field Name = " + f.getName());
            //Checks if CustomAnnotation is present on the current field
            CustomAnnotation fAnno = f.getAnnotation(CustomAnnotation.class);
            if (fAnno != null) {
                System.out.println("Id: " + fAnno.id());
                System.out.println("Value: " + fAnno.value());
            } else {
                System.out.println(f.getName() + " is not annotated");
            }
        }
    }
}
The above class makes use of the Java Reflection API to inspect AnnotatedClass and print the element values of every annotated class, method and field.

Monday, 9 July 2012

Serializable vs Externalizable

Serialization vs Externalization

Serializable: is a marker interface (an interface with no methods).
Externalizable: is a standard interface that extends Serializable and declares two methods, writeExternal and readExternal, which the implementing class must implement.

Serializable: Serialization is a recursive process; all non-transient, non-static fields, including those inherited from serializable superclasses, are serialized, which can cause unnecessary overhead.
Externalizable: The user defines what is serialized and what is not, so the output can be optimized. It should be preferred for "fat objects".

Serializable: Serialization uses reflection for marshalling and unmarshalling the objects.
Externalizable: The marshalling/unmarshalling process is user defined.

Serializable: During deserialization no constructor of the serializable class is called, so initialization done in its constructors is skipped. Only the no-arg constructor of the nearest non-serializable superclass runs.
Externalizable: During deserialization the public no-arg constructor is invoked first, and then readExternal is called.

Serializable: A no-arg constructor is not required when parameterized constructors are defined.
Externalizable: An explicit public no-arg constructor is mandatory when parameterized constructors are defined. If it is missing, deserialization throws an InvalidClassException.

Serializable: A flattened object requires more space on disk, because additional information (field names, types, superclass information and other metadata) is stored along with the field values.
Externalizable: A flattened object usually takes less disk space, because only the data written in writeExternal is stored along with the class identity.

Serializable: The default serialization mechanism adapts to class changes, because the metadata is automatically extracted from the class definitions.
Externalizable: Externalization isn't very flexible and requires you to rewrite your readExternal and writeExternal code whenever you change your class definitions.

Serializable: The state of serializable superclasses is serialized implicitly.
Externalizable: If you subclass an externalizable class, you have to invoke the superclass's writeExternal and readExternal implementations yourself, which is extra work for the developer.
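
The Externalizable side of the comparison can be sketched with a small round trip through an in-memory stream. The class below is a hypothetical example written for this post (ExternalizableDemo, Post and the createdBy field are assumptions), not code from a real project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

public class ExternalizableDemo {
    public static class Post implements Externalizable {
        private int id;
        private String title;
        private String createdBy; // deliberately never written to the stream

        // Public no-arg constructor: mandatory, it is called during deserialization.
        public Post() {
            this.createdBy = "no-arg";
        }

        public Post(int id, String title) {
            this.id = id;
            this.title = title;
            this.createdBy = "parameterized";
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            // Only the fields written here end up in the stream.
            out.writeInt(id);
            out.writeUTF(title);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            // Fields must be read back in exactly the order they were written.
            id = in.readInt();
            title = in.readUTF();
        }

        @Override
        public String toString() {
            return id + ":" + title + ":" + createdBy;
        }
    }

    // Serializes the object to a byte array and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Post original = new Post(3, "XML");
        System.out.println("before: " + original);
        System.out.println("after:  " + roundTrip(original));
    }
}
```

After the round trip, id and title are restored by readExternal, while createdBy shows that the copy was built through the no-arg constructor and that the field was never part of the stream.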