Java

From Elvanör's Technical Wiki
Revision as of 08:39, 22 September 2009 by Elvanor (talk | contribs) (→‎Arrays)
Jump to navigation Jump to search

This page will be a collection of resources on Java programming.

Documentation

  • Thinking in Java. This links to a mirror for the third edition.
  • Official Sun documentation.
  • Almost everything can be found in the API Javadocs. Java libraries are very well designed, and very well documented (Javadocs rule!).

Core Concepts

Launching Java

  • When launching Java from the command line, you can supply the -jar option and the name of the jar. In this case the main class must be provided in the jar's Manifest file. It will run the main() method of the supplied class.
  • The alternative is to just provide directly the name of the main class (and put all relevant jars directly on the classpath).
  • Be careful about the JAVA_OPTS environment variable. It is not an official variable that java will read. Instead it is used in various launchers (Tomcat, Heritrix) that will later append this variable to the java process when launching. The correct way to specify Java properties is via the -D flag; enclosing these options in JAVA_OPTS won't work if you directly launch your program on the command line via the java binary.
  • The HOME environment variable is not honored by Java on Linux to set the user.home Java system property. You can set this user.home property manually though.

Inheritance

  • When you define a property on a subclass having the same name as a property present on a superclass, the child property represents a different object or value - there is no overriding. If you want to modify the parent property, do this on the child's constructor method.
  • Like in C++, one constructor per layer of the class hierarchy is guaranteed to be called. If you don't call super() yourself, it will be done for you.
  • Note that another constructor call must be the *first* call on the constructor method. This implies that you can't for example call super(string); and then this(otherString). You have to choose between calling a constructor on the same class or a parent constructor (usually if you need this, just call a constructor on the same class, which will itself call a parent constructor).

Visibility

  • There are four different visibility for methods and properties:
    • private;
    • protected;
    • public;
    • default.
  • Default visibility means package visibility (private to other packages, public to the same package). You cannot specify explicitely this visibility: just omit the other keywords and it will take place.

Threads

  • If you need to execute a task in a separate thread, just create a class implementing Runnable. It will have a run() method that will get executed by another thread. You will usually want to use some Executor class from the JDK; these classes provide support for running tasks in a thread pool, with a queue containing waiting tasks.
  • Be aware that the JDK classes like ThreadPoolExecutor don't maintain references to your tasks (well they probably do, but you cannot access your tasks through this object later). You can get hold of the queue, meaning you can access your tasks that are waiting for execution - but you cannot access the tasks currently running. For monitoring purposes you should thus maintain your own references to these tasks.
  • Calling shutdownNow() on an Executor supposedly stops all threads (via interrupt()). However it does not seem to work all the time. This would apparently be due to the fact that the default ThreadFactory creates a thread without the daemon flag - this must be investigated.
  • Probably the best way to stop a thread is to maintain a variable that is periodically checked to see if the thread must exit.

Exceptions

  • There are two kinds of exceptions: checked and unchecked. The unchecked exceptions are those that extend RuntimeException. Errors also constitute unchecked exceptions. A method does not need to have the throws keyword if it throws unchecked exceptions.
  • In multiple catch blocks, begin with the most specific exception (eg, the deepest in the class hierarchy), as if you begin with a parent of the specific exception you wish to catch, it will still be caught by the first block and the second block (which looks for this specific exception) is not executed.

Strings

  • Strings are encoded in memory using UTF-16 encoding, but this is generally transparent: when you output to a stream (file, console, etc), it will be converted to another encoding that's dependent on your system locale.

Loading native external C/C++ libraries

  • The java.library.path property specifies the locations of the shared libraries you may want to load. However, those libraries may in turn need to load other shared libraries (.so files). This is not by Java but by the standard native loader mechanism. This means that in Linux you may have to specify the additional locations using the LD_LIBRARY_PATH environment variable. Specifying the locations using java.library.path won't work.

Differences from C++

  • Inner classes: Although nested classes exist in C++, they are rarely used, and anyway it seems that the only effect of nested classes is on namespaces. Inner classes in Java have the interesting features that they can access their enclosing class object, and all of its members, directly.
  • Inner local classes: Local classes defined inside a statement block (a function block for example) can even access the final local variables of the function. Nice!
  • Anonymous inner classes: When you define the body of a class directly after a new keyword, you are defining an anonymous class. This class can derive from an existing class or implement an interface. The syntax is:
MyInterface my_instance = new MyInterface()
{
    // Here you write the body of the class

    {
        // Here we could perform instance initialization
    }
};

You cannot have a constructor in an anonymous class since the class does not have a name. You can, however, pass arguments to the base constructor, and you can also perform instance initialization (see the previous code).

Note that with anonymous classes, you cannot have more than one object of the class. This limitation is not present with inner local classes, so the only point of an inner local class (vs. an anonymous class) is to create more than one instance of the class.

  • Note that there is no such thing as anonymous or inner interfaces. What you do generally, however, is create a class that will implement an interface.
  • Java imports are NOT like C++ includes. First they are not inherited by subclasses. Second, they actually only help with namespaces issues. Eg, import org.hibernate.Session just allows you to write Session instead of org.hibernate.Session. If you were to write the fully qualified name, the import statement would not be mandatory (this really differs from C++).
  • There are no default arguments in Java (although there are in Groovy).

Standard Library from the JDK

Numbers

  • To round a float to the upper integer, use Math.ceil().
  • To format or parse numbers to / from strings, use the NumberFormat and DecimalFormat classes. Note that the pattern used by DecimalFormat is locale-independent. This means that if you specify "##0.##", there will be two numbers after the decimal separator, which will be "," with a French locale.
  • To obtain a NumberFormat instance with a pattern corresponding to a given locale , call NumberFormat.getInstance(locale). Only instantiate a DecimalFormat object if you need to customize the pattern (eg, the default pattern does not suit you, which can happen).
  • With the french locale, note something very important: the thousands separator is a unsecable space, not a normal space (the Unicode code point is different). When displaying this has not much importance, however when parsing a string back into a number this can be problematic (especially with a web-app, where the user can use normal spaces for separation).

Enums

  • You can call the values() method on an enum to get the list of values for this enumerated type.
  • You can call ordinal() on an enum instance to get its ordinal value.

Objects

  • The clone() method performs a shallow copy of an object instanced. Sometimes it is best to hand write a copy constructor yourself, as when using clone() you don't have much control over what gets copied.
  • The equals() method allows you to define equality for your object. By default this method will compare the objects adresses in memory, but this is for example overriden for java.lang.String.

I/O

  • Remember that InputStream and OutputStream are for dealing with bytes (binary data). Readers and Writers deal with characters (eg, they will use an encoding).

File

  • The mkdir() returns a boolean, and does not throw an exception, if the directory cannot be created. Keep this in mind.

Streams

  • To copy from an input stream to an output stream, the standard library offers no other way than a loop with an array of bytes. Groovy offers some higher level API methods via the overloaded << operator. In pure Java, commons-io library offers also an API to do this.
  • To obtain an input stream from an output stream, you have to extract the data from the outputstream and give that to the input stream constructor.
  • Some InputStreams support rewinding via the mark() and reset() methods, eg going back in the stream. You can test if the stream supports these methods via the markSupported() call.

Collections

  • There is no first() nor last() methods on ordered collections! This is surprising. Note Groovy has the pop() method for removing the last element of a list.
  • Be careful that a Map, for example, does not implement the Collection interface, even if it is part of the collection framework.
  • Using a SortedSet is sometimes useful. For example it allows to have Hibernate objects sorted (via Java, not SQL) as you want automatically everytime you retrieve the collection. Be very cautious though when implementing the compareTo method. If you compare the ids, don't forget that the ids will only be associated once Hibernate has saved the collection a first time. A workaround is to use a transient id that you set when creating the collection.
  • If you try to insert in a sorted set an element that is deemed to be equal to another one (via compareTo()), the element won't be inserted (which is logical but may be hard to debug, so be cautious).

Warnings

  • You cannot add or remove an entry from a Map while iterating over it. It will throw a ConcurrentModificationException.

Arrays

  • There are two toArray() methods in the List class. One allows you to obtain an Array of the given class, the other only gives you an array of Objects. Be careful: you must pass a new Array object, not a class type! Thus use:
myList.toArray(new String[0]) // NOT myList.toArray(java.lang.String[])

Else it will result in a ArrayStoreException.

  • The size of an array is fixed and defined when you create the array. You can obtain this length via the Array class or via the length property of an array.
  • For an array of bytes, byte[], the bytes will never be undefined, they are initialized to 0.

Dates and time manipulation

  • For time manipulations, the main class to use is Calendar. As expected, it is very well designed and once you understand the concepts, time manipulation is quite easy. Note that you should not instantiate this class, but retrieve an instance using a static function.
  • To parse a string into a Date instance, you can in theory use the DateFormat class. However, it uses default patterns for parsing that are undocumented! The best is thus to instantiate directly the SimpleDateFormat subclass. On this class you can specify your own pattern for parsing the string dates, and it works great. The Javadocs for that class specify the pattern syntax.
  • To format a Date into a String, use the format(Date date) method of DateFormat, not the ones with more arguments. Note that there is no automatic way to get the ordinal suffixes (eg, 1st of July, 3rd of July, 4th of July) automatically. A helper method must be written.

Preferences

  • With the default JDK implementation, preferences are stored in XML files in the home directory of the user (under the .java/.userPrefs directory).
  • Sometimes it can be very useful to modify the directory of these preferences. For example in Gentoo Java development, when the sandbox prevents access to home directories. The easiest way to do that is to set ${user.home} variable to another directory; the ${java.util.prefs.userRoot} property can also be used.

Libraries

Log4J

  • Log4J is an excellent high performance logging library. Log4J has 3 core concepts: loggers (objects that you obtain for logging), appenders (loggers use different appenders to direct the log output to various places) and layouts (how the log will be formatted).
  • One thing that's worth noting is that the logging level is defined on the loggers, not on the appenders. Thus it is not possible to tell a logger to use different logging levels depending on the appender. In that case, a workaround is to define a threshold on the appender. In practice, this works fine.
  • You can provide the path to the Log4J configuration file by providing a Java property setting on the command line:
java -jar sample.jar -Dlog4j.configuration=file:/path/lib/log4j.properties

Notice you must supply a resource string. By default, if this Java property is not set, the resource string used will be "log4j.properties", meaning it will look for the file at the root of the package hierarchy (in the default package).

  • In the Log4J configuration file, you set up the properties of the beans via the standard Java beans ways, by using the setters convention (thus lookup the Javadoc to know which configuration options are available).
  • An useful formatting example:
"%d{DATE} [%-5p] %m%n"

%-5p will give you the logging level (WARN, DEBUG...), %m is the message, %n the newline character, and %d{DATE} the date. Automatic caller information (class which generated the call, line level of the call in the source code) can also be useful but is slow so should not be used in production.

  • WARNING: setting the additivity flag for the loggers, if the configuration is done via a .properties file, uses a special syntax (which is not documented on the short tutorial!). Instead of writing
log4j.logger.com.example.demo.additivity=false

you must write:

log4j.additivity.com.example.demo=false

Commons I/O

  • Excellent library if you need to perform simple file system tasks such as recursively copy directories, move files, etc. The standard Java library only provides low level means of dealing with files, which makes this library very useful.
  • FileUtils.listFiles() will list only files, not directories.
  • FileUtils.readFileToString allows for easy reading of a whole file into a String. Doing this with standard JDK methods takes quite a few lines of code.

Dozer

  • Don't forget than you need to have a default constructor in order for Dozer to correctly instantiate the destination objects.
  • If you are not using Java 1.5, and you are dealing with a collection mapping, you must provide hints regarding the collection contents. Else it will assume the target collection contains the same objects than the source collection (which often is not the case).

FileUpload

  • If you use a HTTP POST request with the multipart/formdata encoding, probably the HTTP headers won't include the encoding. At least that happens when using Firefox and GWT. So on the server side, Java won't know which encoding to use for this ServletRequest. Directly setting the correct encoding on the request object does not work later with fields retrieved via the FileUpload API. You have to explicitly provide the encoding when retrieving those fields:
file_item.getString(request.getCharacterEncoding());

This would work if you previously set the request encoding (else you can just provide "UTF-8" directly).

Velocity

  • Velocity is a view technology library that offers an alternative to standard JSP. This technology is fast, similar to Smarty in PHP and strictly enforces separation of concepts in the MVC model. This means that the code operations available in a Velocity Macro (.vm) file are very, very limited.
  • Restrictions:
    • Not even simple math operations like ($counter +1) are possible.
    • Accessing an array is not possible as $myArray[0] will produce an error. Thus Lists should be given to Velocity.
  • To access a variable use $ or ${}. Even within ${}, if you acccess another variable you need to reprefix it with $, as in:
${myList.get($myindex)}

JUnit

  • JUnit is a standard frameworks to write unit and integration tests on the Java platform. Junit 3 test classes use CoC to determine which methods are test methods. With JUnit 4, annotations are used.
  • There are many ways to run your JUnit tests. Usually you use your IDE (Eclipse has a great support for JUnit), or the Ant JUnit task. However, you can also write your own runners and managers (think listeners, etc...) for your tests.
  • When running your test with JUnitCore, the Result object has information about the results of the test run (you can get failures, numbers of tests that failed, etc...).

Rome

  • Rome is a straightforward library to deal with RSS feeds (many different formats are supported like Atom, RSS...). Another library (not tried yet) is Apache Abdera.
  • To read (and cache) a simple RSS feed, use the Rome fetcher submodule (JDOM dependency). It is very simple to use and allows you to integrate your WordPress posts very easily anywhere in your Java based website.
  • Note that you need a fetcher.properties file to avoid warnings (even an empty one will do).

JEE

  • The servlet API included in the early versions a method to parse a query string into a map of parameters. This method is now deprecated, probably because newer versions of the API include much better techniques (request.getParameterMap for example). However, it can still be useful to parse such a query string outside of a servelt request. You have to write the method yourself, the best is to use Matcher and Pattern classes of regular expressions.

Java Beans

  • In a Java properties file, you can set the properties of Java beans by using the setter convention. For example if a bean has a setLogging() method, you could activate logging by writing in your .properties file:
myBean.logging=true

Servlet API 2.5

  • Official documentation is available via the Java Servlet Specification (latest version is 2.5 as of August 2008). This documentation is mainly useful for implementing a Servlet container and understanding the basic concepts behind the servlet architecture. It is not intended to be used as a reference documentation, although there is an useful description of the syntax of the Web Deployment Descriptor (web.xml file).

General

  • Java servlets can easily access information from GET or POST HTTP requests via the HttpServletRequest objet and the method getParameter("key");. However if you are using a FileUpload servlet, in the POST request the information has to be encoded in a way that makes the standard official Java Servlet API fail. You *must* use the FileUpload API.
  • To obtain the current working dir of a servlet, use
getServletContext().getRealPath("/");

Don't use System.getProperty("user.dir") as user.dir is a Java property that has nothing to do with the servlet context.

  • The servlet API returns an array of Strings when calling getParameter() on a HTTP request containing multiple identical parameters. Thus Grails also returns an array of Strings. As arrays are obsolete and difficult to manipulate, the following code can be used:
Arrays.asList(params.name)

It will return a list of Strings. It would return a list with a single element if params.name was a String.

  • There is apparently no way to remove a header for the current HttpServletResponse (that's a shame!). Using setHeader() can overwrite a previously set header, but you cannot remove it. You can use the reset() method but this is extremely dangerous as it will clear all headers previously set, even extremely important ones. So it is best avoided.

Web Applications

  • A Web Application within a servlet container can contain several servlet classes. The web container takes the request URL, then first finds the web application it should be dispatched to. Within the web-application, the URL path (minus the context path used for the web-application) is thus matched to a servlet class. The URL patterns are defined in the web application deployment descriptor. This file is usually an XML file called web.xml, which is read by the servlet container when starting the web application.
  • Note that in many web frameworks (such as Spring), there will only be a few servlet classes. Generally, one main class is used as a dispatcher within the framework.
  • If an URL is not matched to a servlet class, the default servlet will be used to treat the URL. In Tomcat (and probably other containers), this default servlet is used to serve static ressources (images, CSS files etc...).

Web Request Flow

  • Inside a servlet method, you can use the Request Dispatcher to forward the flow to another servlet. There are two methods, include() and forward(). This is a kind of "internal" redirect (eg, not an HTTP redirect where a whole new request is done by the client). Here everything happens within the servlet container.
  • You can also define a Filter, which will perform an action before the servlet calls its doGet() method. I haven't yet used filters extensively, so I don't know yet about ChainFilter concepts, etc. Filters can act if there is a flow redirect via the Request Dispatcher, and they can also act if errors happen during the processing of the request.

Filters

  • Filters can be extremely useful, but dealing with a ServletRequestWrapper is hard. In particular it does not seem feasible to specify another request body (by overriding getReader() with a different underlying stream), since it would mean overriding almost all ServletRequest methods (also getInputStream(), getParameterMap() etc). Thus (input) filters are not meant to modify significantly the request.
  • Filters can however look at the request body to perform various useful operations.

HTTP Requests

  • Once you started reading the input stream (for example in a filter, before the request hits your Web framework), there is no hope of "undoing" this action. In particular you cannot call setCharacterEncoding() after you called getInputStream(), getReader() or any getParameter() method.
  • Reading the body of a POST request will also prevent you from obtaining the parameters via the normal way later. Once the stream is processed, the body will basically be considered empty.
  • The ServletInputStream has a readLine() method, but the documentation is broken: it says this method will return -1 if it reaches the end of the stream before obtaining the maximum nomber of bytes. In fact it will return -1 only if the end of the stream is reached. It actually works as the other I/O methods.

HTTP Responses

  • You can check if a header is set for an HTTP response. Thus, by checking for "Set-Cookie", you can know if a cookie has been set. However you cannot read the value of the header (and thus the name and value of a cookie).

Session mechanism

  • A Servlet container should support two session mechanisms: the first is via a session cookie (which is sent in the HTTP headers), and the second is via a session variable in the path.
  • To put the session in the path of the URL, it must have this form:
/path/to/action;jsessionid=598swt18er?variable=value

Thus, note that it is not in fact a query parameter. It is appended to the end of the path and is followed later optionally by the query string.

  • If you have the session in the path, the container will take it into account. This is separate and independant from the URL rewriting mechanism that can be used to transform documents served to the browser. Eg, you can send a session in the path even if you don't implement any rewriting mechanism.

Error Handling

  • Apparently according to the servlet specification, you can define a servlet to process errors (eg, if the HTTP status code set on the response corresponds to an error).

Java Resources & Packages

Resources

  • A Java resource is not a file on the filesystem; it is referenced from within the classpath. So if you have a file at the root of the classpath, and wish to reference it as a resource, it would be "/file.xml". Similarly, you could have a resource "/com/shoopz/shopengine/example.xml".

Packages

  • A package is a kind of module, it is used as a namespace to group some classes and resources together. Classes in the default package cannot be imported by classes in a named package, although the contrary is possible.
  • The convention for a package is to have a folder for every part of the name, splitted by dots. Eg, com.shoopz.shopengine -> com/shoopz/shopengine. I think this hierarchy is mandatory. IDEs handle this automatically.
  • You cannot have underscores or hyphens in a package name.

Tools and techniques

Standard JVM Logging

  • By default the JVM and standard Java libraries logs everything to stdout (console). The default logging options are read from the file jre/lib/logging.properties (so on Gentoo it would be /opt/sun-jdk-1.6.0.14/jre/lib/logging.properties for instance). On this file, you can set the general logging level (INFO, FINE, WARNING, SEVERE...), as well as define other appenders. The concepts are similar to Log4J.

Profiling / optimization

  • The easiest way to obtain timing data is to use the static method java.lang.System.currentTimeMillis().