Tuesday 18 December 2012

Basics - XML

Here's the next entry in my Java basics series. Earlier we looked at how to read a text file. Now we'll look at some simple XML processing. XML is pretty ubiquitous; at some point you're going to run into it. This post will show how to read XML and parse it into a Document. We'll extract data from the Document and then create a new one.

There are many XML libraries around that help with XML processing, such as JDOM or dom4j, but the standard JDK comes with everything you need. Let's get started!

Getting a Document

The first step in processing XML is to create a Document from the raw XML. The Document is the central class for XML processing. We use a DocumentBuilder to parse XML into a Document. The builder has several overloaded parse methods allowing you to create a Document from a File, InputStream, InputSource or a URI. Here we are using a file.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
File file = new File("employees.xml");
Document doc = builder.parse(file);
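The other parse overloads work the same way. As a minimal sketch, assuming the XML is already in memory as a String (the class name, method name and sample XML are made up for illustration), you can wrap it in an InputSource:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ParseFromString {

  // Parses XML held in a String rather than in a file.
  public static Document parse(String xml) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    // InputSource wraps a character stream; here a StringReader over the text
    return builder.parse(new InputSource(new StringReader(xml)));
  }

  public static void main(String[] args) throws Exception {
    Document doc = parse("<employees><employee id=\"1\"/></employees>");
    // Print the tag name of the root element of the parsed tree
    System.out.println(doc.getDocumentElement().getTagName());
  }
}
```

This is handy for quick experiments and unit tests, where creating a file on disk just to parse it would be a nuisance.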

In this example we're going to work with a basic XML file, employees;

<employees>
  <employee id="1">
    <name>Fred</name>
    <age>20</age>
    <department>Sales</department>
  </employee>
  <employee id="2">
    <name>Bob</name>
    <age>30</age>
    <department>Sales</department>
  </employee>
  <employee id="3">
    <name>Jim</name>
    <age>23</age>
    <department>Marketing</department>
  </employee>
</employees>

Working with the Document


Now that we have our Document, we can extract data from it using XPath. XPath is a powerful query language for selecting nodes from an XML Document. Let's find the name of the employee with an ID of 1.

XPathFactory xpFactory = XPathFactory.newInstance();
XPath xpath = xpFactory.newXPath();
String qry = "/employees/employee[@id = '1']/name";
String name = (String)xpath.evaluate(qry, doc, XPathConstants.STRING);
System.out.println(name);

When this code is run, it will output Fred, which is the name of the employee with an id attribute of '1'. Now let's get all the employee nodes in the Sales department.

xpath.reset();
qry = "/employees/employee[department = 'Sales']";
NodeList employees = (NodeList)xpath.evaluate(qry, doc, XPathConstants.NODESET);
System.out.println("Employees in Sales department;");
for (int i=0; i<employees.getLength(); i++) {
  Element employee = (Element)employees.item(i);
  name = employee.getElementsByTagName("name").item(0).getTextContent();
  String age = employee.getElementsByTagName("age").item(0).getTextContent();
  String id = employee.getAttribute("id");
  System.out.format("%2$s - %1$s age %3$s\n", name, id, age);
}

In the above example we get a NodeList and loop over it to create a report of the employees in the Sales department;

Employees in Sales department;
1 - Fred age 20
2 - Bob age 30
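XPath expressions don't have to return nodes or strings; they can return numbers too. Here is a rough sketch that counts the Sales employees using the standard XPath count() function. The class and method names are made up for this example, and the XML is inlined as a String so the snippet runs without employees.xml.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CountEmployees {

  // Helper for the example: parse XML held in a String.
  public static Document parse(String xml) throws Exception {
    return DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
  }

  // Returns the number of employee elements whose department is Sales.
  public static int countSales(Document doc) throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    String qry = "count(/employees/employee[department = 'Sales'])";
    // With XPathConstants.NUMBER the result comes back as a Double
    Double count = (Double) xpath.evaluate(qry, doc, XPathConstants.NUMBER);
    return count.intValue();
  }

  public static void main(String[] args) throws Exception {
    String xml = "<employees>"
        + "<employee id=\"1\"><name>Fred</name><department>Sales</department></employee>"
        + "<employee id=\"2\"><name>Bob</name><department>Sales</department></employee>"
        + "<employee id=\"3\"><name>Jim</name><department>Marketing</department></employee>"
        + "</employees>";
    System.out.println("Sales employees: " + countSales(parse(xml)));
  }
}
```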

Creating a new Document

Now let's look at creating a new Document from scratch. We use a DocumentBuilder to create a new empty Document, then we create some elements and build up the DOM tree.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.newDocument();

Element order = doc.createElement("order");
order.setAttribute("number", "1");
doc.appendChild(order);

Element total = doc.createElement("total");
total.setTextContent("10.00");
order.appendChild(total);

Element status = doc.createElement("status");
status.setTextContent("dispatched");
order.appendChild(status);

Now we have a small order Document! In order to see what a Document looks like we can output the Document to the console using a Transformer.

TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(order);
StreamResult result = new StreamResult(System.out);
transformer.transform(source, result);

which gives us the output;

<?xml version="1.0" encoding="UTF-8"?>
<order number="1">
<total>10.00</total>
<status>dispatched</status>
</order>
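The Transformer isn't limited to System.out. As a sketch, assuming you want the XML as a String (say, for logging or a test), you can point the StreamResult at a StringWriter instead. The class and method names are invented for this example, and I've chosen to omit the XML declaration here to keep the result short.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DocumentToString {

  // Serialises a DOM tree into a String instead of printing it.
  public static String toXml(Document doc) throws Exception {
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    // Leave out the <?xml ...?> declaration so only the elements are written
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    StringWriter writer = new StringWriter();
    transformer.transform(new DOMSource(doc), new StreamResult(writer));
    return writer.toString();
  }

  public static void main(String[] args) throws Exception {
    // Build a cut-down version of the order Document from above
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    Element order = doc.createElement("order");
    order.setAttribute("number", "1");
    doc.appendChild(order);
    Element total = doc.createElement("total");
    total.setTextContent("10.00");
    order.appendChild(total);
    System.out.println(toXml(doc));
  }
}
```

A StreamResult can also wrap a File or an OutputStream, so the same few lines will save the Document to disk.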

Tuesday 11 December 2012

Basics - reading a file

I quite often see new developers struggling with the basics in Java. I guess everyone has to begin somewhere, so here's the first in a series of posts demonstrating some of the basics that you might need when you're starting out.

This post deals with reading a text file line by line. First of all, let's create a text file called names.txt;

Bob
Fred
Jim

Now we're going to write a class to read the file. This revolves around the BufferedReader class.

package querky.blog.basics;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadAFile {

  public static void main(String[] args) {
 
    String fileName = "names.txt";
    BufferedReader reader = null;
 
    try {
      System.out.println("Reading " + fileName);
      reader = new BufferedReader(new FileReader(fileName));
      String line = null;
      while ((line = reader.readLine()) != null) {
        System.out.println(line);
      }
    } catch (IOException ioe) {
      ioe.printStackTrace();
    } finally {
      if (reader != null) {
        System.out.println("Closing " + fileName);
        try {
          reader.close();
        } catch (IOException ioe) {
          ioe.printStackTrace();
        }
      }
    }
  }
}

You can see that we create a BufferedReader and then call its readLine() method in a loop until it returns null. This gives us each line in the file in turn. Notice that we also close the reader in a finally block. This is important as we want to release the resources used.

When run, the program should output this;

Reading names.txt
Bob
Fred
Jim
Closing names.txt
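If you're on Java 7 or later, the try-with-resources statement can take care of closing the reader for you. This is a different technique from the finally block used above, not a change to that program; here's a minimal sketch (the class and method names are made up) that collects the lines into a list:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ReadAFileJava7 {

  // Reads every line of the named file into a list.
  public static List<String> readLines(String fileName) throws IOException {
    List<String> lines = new ArrayList<String>();
    // The reader declared in the parentheses is closed automatically,
    // whether the block finishes normally or throws an exception
    try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
      String line;
      while ((line = reader.readLine()) != null) {
        lines.add(line);
      }
    }
    return lines;
  }

  public static void main(String[] args) {
    try {
      for (String line : readLines("names.txt")) {
        System.out.println(line);
      }
    } catch (IOException ioe) {
      // Covers a missing file as well as read errors
      System.out.println("Could not read names.txt: " + ioe.getMessage());
    }
  }
}
```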

Common problems

The only problem you're likely to encounter is a FileNotFoundException - that is, your program can't find the input file it is supposed to be reading. The error will look something like this;

Reading names.txt
java.io.FileNotFoundException: names.txt (The system cannot find the file specified)
   at java.io.FileInputStream.open(Native Method)
   at java.io.FileInputStream.<init>(FileInputStream.java:106)
   at java.io.FileInputStream.<init>(FileInputStream.java:66)
   at java.io.FileReader.<init>(FileReader.java:41)
   at querky.blog.basics.ReadAFile.main(ReadAFile.java:16)

First, check you have a names.txt file in the right location. What's the right location? As the file name is specified as simply;

String fileName = "names.txt";

Then our program will look for the file in the current working directory, that is to say the directory where the program was launched from. If you don't know what that is, then you can easily find out with a debugging statement;

System.out.println(new File(fileName).getCanonicalPath());

Here we're using the getCanonicalPath() method to show exactly where on the file system Java is looking. If you're getting a FileNotFoundException odds are that the file isn't there or Java doesn't have permission to read it.
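To tell those two cases apart, you can ask the File object directly. A small sketch, using only java.io.File; the class and method names are mine, and it uses getAbsolutePath() rather than getCanonicalPath() simply because it doesn't throw a checked exception:

```java
import java.io.File;

public class WhereIsMyFile {

  // Describes what Java can see at the given path.
  public static String describe(String fileName) {
    File file = new File(fileName);
    // Relative names are resolved against the current working directory
    String path = file.getAbsolutePath();
    if (!file.exists()) {
      return path + " does not exist";
    }
    if (!file.canRead()) {
      return path + " exists but is not readable";
    }
    return path + " exists and is readable";
  }

  public static void main(String[] args) {
    // user.dir is the system property holding the current working directory
    System.out.println("Working directory: " + System.getProperty("user.dir"));
    System.out.println(describe("names.txt"));
  }
}
```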

Tuesday 30 October 2012

New blog

I seem to be thinking about cycling a lot at the moment, and I seem to be posting as much about cycling as I do about programming. With this in mind I've started a new blog called I am a human being (and a cyclist).

The idea for the name came about because as soon as I get on a bike some people seem to see me as less than human; I become a cyclist. I'd like to point out that I am a human being whatever my choice of mode of transport.

Wednesday 4 July 2012

Interviewing software developers


I've been doing some interviews recently for Java EE developers. It's for a new team and we had four positions with no particular experience level in mind. Part of the interview was a programming test in which the candidate was asked to write the legendary 'FizzBuzz' program. However, I decided that to spice it up a little the candidate would also have to write some unit tests. I also suggested that they may wish to use a TDD approach, although this wasn't mandatory.

For those unfamiliar with FizzBuzz, the problem is described as such;

Write a program that prints the numbers from 1 to 100.
 - for numbers that are divisible by 3 print "Fizz" instead
 - for numbers that are divisible by 5 print "Buzz" instead
 - for numbers that are divisible by 3 and 5 print "FizzBuzz" instead
You may use a TDD approach if you want.

Results? A lot, lot worse than I thought. Some candidates had over 5 years of Java development experience on their CVs and really struggled with this. Out of 7 candidates, only one failed to complete the task (7 years of Java experience) but the others took on average 45 minutes, and none of them finished with great solutions (their tests were flaky).

The most common mistake made was to start out right away with something like this;

public static void main(String[] args) {
  for (int i=1; i<=100; i++) {
    if (....)
      System.out.println("FizzBuzz");
    else if (....)
      // ...you get the picture
  }
}

This solves the problem, but has a major downside: it is very hard to test. The program has no inputs and only outputs to System.out. I would expect a competent programmer to come up with a solution in under 10 minutes. All but one took over 20 minutes. Once they started writing the unit tests they realised that they couldn't test their application, so they started refactoring it. Things generally went further downhill from here. There were some weird and wonderful results, with programs looping 1 to 100 and filling an array with strings, returning a map of numbers to strings, returning lists, all kinds of stuff!

The one guy who aced the interview was a recent graduate and the only one to write the unit tests before the solution. He finished the whole thing in about 15 minutes - something which supposed senior developers with 7 years' experience failed to do in an hour.

Here's what I'd consider a reference solution.

public class FizzBuzz {

  public static final String FIZZ = "Fizz";
  public static final String BUZZ = "Buzz";
  public static final String FIZZBUZZ = "FizzBuzz";


  public static void main(String[] args) {
    FizzBuzz fb = new FizzBuzz();
    fb.print();
  }
  
  public String getString(int i) {
    if (i % 15 == 0) return FIZZBUZZ;
    if (i % 5 == 0) return BUZZ;
    if (i % 3 == 0) return FIZZ;
    return String.valueOf(i);
  }


  public void print() {
    for (int i=1; i<=100; i++) {
      System.out.println(getString(i));
    }
  }
}


..and the tests;

import org.junit.Assert;
import org.junit.Test;

public class FizzBuzzTest {

  private FizzBuzz underTest = new FizzBuzz();

  @Test
  public void testFizzBuzz() {
    for (int i=15; i<=100; i = i+15) {
      Assert.assertEquals(i + " is FizzBuzz", FizzBuzz.FIZZBUZZ, underTest.getString(i));
    }
  }


  @Test
  public void testFizz() {
    for (int i=3; i<=100; i = i+3) {
      if (i % 5 == 0) continue;
      Assert.assertEquals(i + " is Fizz", FizzBuzz.FIZZ, underTest.getString(i));
    }
  }


  @Test
  public void testBuzz() {
    for (int i=5; i<=100; i = i+5) {
      if (i % 3 == 0) continue;
      Assert.assertEquals(i + " is Buzz", FizzBuzz.BUZZ, underTest.getString(i));
    }
  }


}

I didn't really expect a unit test around the printing, just the logic for determining if a number should be Fizz, Buzz or FizzBuzz.
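That said, if you did want to cover the printing, one option is to let the caller choose where the output goes. This is only a sketch of that idea, not part of the reference solution: the print(PrintStream) overload and the capture() helper are names I've made up here, and the checks are done in plain Java rather than JUnit so the snippet stands alone.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class PrintableFizzBuzz {

  public String getString(int i) {
    if (i % 15 == 0) return "FizzBuzz";
    if (i % 5 == 0) return "Buzz";
    if (i % 3 == 0) return "Fizz";
    return String.valueOf(i);
  }

  // Hypothetical variant of print(): the caller decides where the output goes
  public void print(PrintStream out) {
    for (int i = 1; i <= 100; i++) {
      out.println(getString(i));
    }
  }

  // Runs print() against an in-memory stream and returns the printed lines
  public static String[] capture() {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    new PrintableFizzBuzz().print(new PrintStream(buffer, true));
    // println uses the platform line separator, so accept \n and \r\n
    return buffer.toString().split("\\r?\\n");
  }

  public static void main(String[] args) {
    String[] lines = capture();
    System.out.println(lines.length + " lines, line 15 is " + lines[14]);
  }
}
```

In production code you would pass System.out; in a test you pass a stream you can inspect afterwards.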

What did this tell me about the candidates?

I wasn't so much interested in whether they could do it or not; all the candidates claimed a fair amount of dev experience or to have studied computer science at university so I expected they could. The key was that they worked on a laptop which was plugged into a flat panel TV so I could watch them work. Watching how someone arrives at a solution gives you a wealth of information that you wouldn't get just by looking at the finished product. What mistakes do they make, how do they correct them, how do they refactor, how do they use the tools available in their dev environment? Sure, it ups the pressure with the interviewer watching your every keystroke, but the task isn't a complicated one and doesn't require a huge amount of thought.

Where did people go wrong?


  • The most common mistake was to dive straight in and write something hard to test; a main method that looped and printed to System.out. Most realised their mistake when they wrote tests (some required a little prodding in the right direction) and refactored.
  • Quite a few candidates struggled with the default case of returning the number; they couldn't understand why the compiler didn't like them returning an int when the method signature specified a String return type. Only one candidate figured out String.valueOf(i) straight away; the others who got it had to think about it for several minutes.
  • Several candidates didn't follow the spec, even though it was clearly written; e.g. their method returned "3: Fizz" for the number 3.
  • Two candidates' solutions printed "Fizz" or "Buzz" for 15 because they checked divisibility by 3 or 5 first (rather than 15, or 3 and 5) and fell out of the if construct.
  • One candidate (12 years' experience in software dev, 8 in Java) didn't write unit tests because he didn't know how to! 12 years and never wrote a unit test!
  • Some candidates seriously overcomplicated the problem, using arrays, lists or maps to store all outcomes. This wouldn't have been too bad except they all utterly failed to test their overly complicated solutions.
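The ordering mistake from the list above is easy to reproduce. This is my own reconstruction of the kind of code involved, not any candidate's actual submission; the class and method names are made up.

```java
public class FizzBuzzOrdering {

  // Buggy: 15 is divisible by 3, so the first check returns "Fizz"
  // and the combined check further down is never reached
  public static String wrongOrder(int i) {
    if (i % 3 == 0) return "Fizz";
    if (i % 5 == 0) return "Buzz";
    if (i % 3 == 0 && i % 5 == 0) return "FizzBuzz";
    return String.valueOf(i);
  }

  // Fixed: test the most specific condition first
  public static String rightOrder(int i) {
    if (i % 3 == 0 && i % 5 == 0) return "FizzBuzz";
    if (i % 3 == 0) return "Fizz";
    if (i % 5 == 0) return "Buzz";
    // String.valueOf turns the int into the String the method must return
    return String.valueOf(i);
  }

  public static void main(String[] args) {
    System.out.println("wrong order, 15: " + wrongOrder(15));
    System.out.println("right order, 15: " + rightOrder(15));
  }
}
```

A single unit test for the number 15 is enough to catch this, which is another argument for writing the tests first.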

Monday 13 February 2012

Storing passwords

The list of accounts I have for various websites/subscriptions/applications is large, and constantly growing. Having a lot of accounts means having a lot of user names and passwords. A couple of problems arise out of this situation.

First, people are tempted to use a memorable password. This is generally a bad idea. If your password is memorable then it is likely to have low entropy, i.e. it is more guessable. If your password is really strong and consists of truly random characters, then it isn't memorable. Because it is so strong people may be tempted to use it for more than one account, which leads to the second problem.

Second, people memorise a really good password, then reuse it for several accounts. This is bad because if you reuse your password then you are at the mercy of the website that holds your account. You don't know what that website is doing with it behind the scenes. They may misuse it or leak it.

Choosing a password based on something that only you know often doesn't help because it isn't hard to find out information about someone from Facebook, LinkedIn etc. You're also going to struggle realistically to come up with many different passwords based on different things only you know. Those bits of information can change pretty regularly too. The answer is to use some sort of password manager; a tool that can securely store information about your accounts. Of course this raises the question of whether you trust your password manager. What if it misuses or leaks your details?

Being a programmer, and somewhat paranoid, I obviously wrote my own solution. It's a simple application that encrypts text to a file on disk. It can also read a file, decrypt it and display the plaintext. You can see the source on my GitHub page. The application is really simple. When reading or writing a file it uses AES-256 encryption, with the key being a SHA-256 hash of a password. Now all you need is one single strong password.
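To give a feel for the idea, here is a rough sketch of password-based AES-256 using only the standard JDK crypto classes. It is not the application's actual code: the cipher mode (GCM), the random IV stored in front of the ciphertext, and all the class and method names are assumptions made for illustration. It needs Java 8 or later, and older JDKs may also need the unlimited strength policy files for 256-bit keys.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class VaultSketch {

  private static final int IV_LENGTH = 12; // bytes; assumption: GCM with a 96-bit IV
  private static final int TAG_BITS = 128; // size of the GCM authentication tag

  // SHA-256 always produces 32 bytes, which is exactly the size of an AES-256 key
  static SecretKeySpec keyFrom(String password) throws Exception {
    byte[] hash = MessageDigest.getInstance("SHA-256")
        .digest(password.getBytes(StandardCharsets.UTF_8));
    return new SecretKeySpec(hash, "AES");
  }

  public static byte[] encrypt(String plaintext, String password) throws Exception {
    byte[] iv = new byte[IV_LENGTH];
    new SecureRandom().nextBytes(iv); // fresh random IV for every encryption
    Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
    cipher.init(Cipher.ENCRYPT_MODE, keyFrom(password), new GCMParameterSpec(TAG_BITS, iv));
    byte[] encrypted = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
    // Store the IV in front of the ciphertext so decrypt() can find it
    byte[] out = new byte[iv.length + encrypted.length];
    System.arraycopy(iv, 0, out, 0, iv.length);
    System.arraycopy(encrypted, 0, out, iv.length, encrypted.length);
    return out;
  }

  public static String decrypt(byte[] data, String password) throws Exception {
    Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
    // The first IV_LENGTH bytes of the data are the IV written by encrypt()
    cipher.init(Cipher.DECRYPT_MODE, keyFrom(password),
        new GCMParameterSpec(TAG_BITS, data, 0, IV_LENGTH));
    byte[] decrypted = cipher.doFinal(data, IV_LENGTH, data.length - IV_LENGTH);
    return new String(decrypted, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws Exception {
    byte[] secret = encrypt("my bank PIN is not 1234", "correct horse battery staple");
    System.out.println(decrypt(secret, "correct horse battery staple"));
  }
}
```

With GCM, decrypting with the wrong password fails with an exception rather than returning garbage, because the authentication tag no longer matches.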

How do I use it?
Create a new empty file. The file extension doesn't matter, but .vault makes it easy to remember what the file is.

Run the application. You'll get a file dialog. Open the file you just created.

You will be prompted to enter a password to decrypt the contents of the file. As the file is empty this doesn't matter; just press OK.

You'll see the contents of the file, which will be empty. Enter some secret information and choose Save from the file menu.

You'll be prompted for a password. Choose a good, strong, memorable password. The application will use this to encrypt the contents of the file to disk.

Tuesday 7 February 2012

Managing photos

One thing that's been on my mind recently is the problem of managing the ever growing number of digital photos I accumulate. This was on my mind again because I've just bought a new laptop. This means I'll probably use my old laptop less, but that laptop has hundreds of photos on it and I don't want to lose/forget about them. The problem is wider than that; I've also got photos on my phone, PC and some old hard disks sat in a box in my office.

Now I know there are solutions to this already, and the cloud is the big thing at the moment. Why not transfer all my photos to some cloud storage solution? If I have an iPhone (which I don't) it can even automatically store photos in the iCloud for me. Well I have a couple of problems with that. Firstly I want something really simple, and not tied to any particular vendor (bye bye Apple!). I also don't want to rely on anyone else. I don't want my photos to be dependent on my internet connection.

It seems the best solution for me is to keep a master copy of all my photos somewhere I can easily access it (e.g. my new laptop), and keep a backup copy too just in case. The backup copy can just sit on an external hard disk, which I keep safe. A decent fireproof box in the garage should do it. Can't think of many situations where I'd lose both copies.

Of course I need some way of synchronising files between my various sources and my laptop, and also a way of backing up files to my external disk. Being a software developer, I obviously wrote my own sync program.

Thursday 26 January 2012

Cycle at your own risk

There is a fairly high profile news story at the moment about a cyclist (who is also a lawyer) involved in an incident with a motorist on his daily commute. The motorist was driving aggressively and seemed to consider the cyclist to be basically in his way. A verbal altercation took place which ended with the motorist threatening to kill the cyclist. The cyclist has blogged about the whole thing. The interesting aspect of this is that the whole thing was recorded by the cyclist's helmet cam.

The lack of desire to prosecute and, to be honest, rather pathetic penalty highlights the sad lot that we cyclists have to put up with. I've cycle-commuted my whole life so it comes as no surprise that aggressive, dangerous morons like Scott Lomas are at large on our roads - I experience similar incidents myself quite frequently.

I have long since resigned myself to the fact that cyclists are considered to be the lowest form of life on the roads. We don't count. We're not even human beings; we're cyclists. We are in the way, we should be on the pavement (if you're a motorist), we should be on the road (if you're a pedestrian), we should yield right of way to all other commuters in every circumstance. Of course I exaggerate; 99 out of 100 other road users are perfectly reasonable, courteous and law abiding. It's the 1 out of 100 that you have to watch out for; the Scott Lomases of this world, and if you cycle just 10 miles in rush hour, you're going to encounter hundreds of motorists and statistically that means at least one moron.