Thursday, February 10, 2011

make java xpath work on large files

this post is mostly for my own reference. if it helps you, great, but read the following post first:
http://blog.astradele.com/2006/02/24/slow-xpath-evaluation-for-large-xml-documents-in-java-15/


family.xml
...

say we've got a few thousand dads in a 10 meg xml file and we try to use xpath on it like this:

import java.io.FileInputStream;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;

DocumentBuilder docbuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
org.w3c.dom.Document doc = docbuilder.parse(new FileInputStream("family.xml"));
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList dads = (NodeList) xpath.evaluate("//dad", doc, XPathConstants.NODESET);
for (int i = 0; i < dads.getLength(); i++) {
    // every evaluate here runs against a node that is still attached to the whole document
    System.out.println(xpath.evaluate("./son/@age", dads.item(i)));
}


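evaluating against dads.item(i) will completely screw you here. each dad is still attached to the whole 10 meg tree, and every evaluate call against it is painfully slow. you will wait forever.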

if you break dads.item(i) off of the tree first, you'll be fine. of course you won't be able to look back up the tree from the detached node (no ../ to the old parent).

method 1

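for (int i = 0; i < dads.getLength(); i++) {
    Node oneDad = dads.item(i);
    // removeChild returns the detached node, so the expression starts from a small standalone subtree
    System.out.println(xpath.evaluate("./son/@age", oneDad.getParentNode().removeChild(oneDad)));
}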
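
here's a small self-contained sketch of the trade-off with method 1. the xml layout (a family element with a name attribute wrapping the dads) is made up for illustration, since the real family.xml isn't shown here, and the class and method names are just placeholders. it looks up the parent's name before and after detaching, then reads the son's age from the detached node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DetachDemo {
    // assumed layout, for illustration only; not the real family.xml
    static final String XML =
        "<family name=\"smith\"><dad><son age=\"7\"/></dad></family>";

    // returns { parent name while attached, parent name after detaching, son age after detaching }
    static String[] lookUpAndDown() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node dad = (Node) xpath.evaluate("//dad", doc, XPathConstants.NODE);

            // while attached, ../ still reaches the family element
            String before = xpath.evaluate("../@name", dad);

            // break the dad off of the tree (method 1)
            dad.getParentNode().removeChild(dad);

            // after detaching there is no parent to walk up to, but looking down still works
            String after = xpath.evaluate("../@name", dad);
            String age = xpath.evaluate("./son/@age", dad);
            return new String[] { before, after, age };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] r = lookUpAndDown();
        System.out.println("attached: '" + r[0] + "' detached: '" + r[1] + "' age: " + r[2]);
    }
}
```

so anything you need from further up the tree has to be grabbed before the removeChild call.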


i like method 2: make a new doc that holds only the node you need. nothing gets reparsed. the node is just moved, so the expression has a different, much smaller reference point to start from.


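for (int i = 0; i < dads.getLength(); i++) {
    org.w3c.dom.Document oneDad = docbuilder.newDocument();
    // adoptNode only changes the owner document; appendChild is what actually puts the node in the new doc
    Node dad = oneDad.appendChild(oneDad.adoptNode(dads.item(i)));
    System.out.println(xpath.evaluate("./son/@age", dad));
}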
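
for reference, here's method 2 as a rough sketch you can actually compile and run. it parses from a string instead of family.xml, and the xml layout, class name and method name are assumptions for illustration only. each dad gets adopted into its own fresh document and the age is read from there.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NewDocPerDad {
    // collects the first son's age for every dad, one small document per dad
    static List<String> sonAges(String xml) {
        try {
            DocumentBuilder docbuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = docbuilder.parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList dads = (NodeList) xpath.evaluate("//dad", doc, XPathConstants.NODESET);

            List<String> ages = new ArrayList<>();
            for (int i = 0; i < dads.getLength(); i++) {
                Document oneDad = docbuilder.newDocument();
                // move the dad out of the big tree and make it the root of the new doc
                Node dad = oneDad.appendChild(oneDad.adoptNode(dads.item(i)));
                ages.add(xpath.evaluate("./son/@age", dad));
            }
            return ages;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // assumed layout, for illustration only; not the real family.xml
        String xml = "<family><dad><son age=\"7\"/></dad><dad><son age=\"9\"/></dad></family>";
        System.out.println(sonAges(xml));
    }
}
```

note that adoptNode moves the node rather than copying it, so the original doc loses its dads as the loop runs. if you still need the original tree afterwards, importNode(node, true) makes a deep copy instead, at the cost of the copying.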
