In my last blog, I outlined how I found the Scala XML library a pleasant solution for unpleasant XML format conversion jobs. In those jobs, I had to completely transform the document from one grammar to another.
When you need to make small tweaks to a document, the library is a bit more of a hassle. This page by Burak Emir, the author of the Scala XML library, states: “The Scala XML API takes a functional approach to representing data, eschewing imperative updates where possible. Since nodes as used by the library are immutable, updating an XML tree can be a bit verbose, as the XML tree has to be copied.” A verbose example follows.
Here is what I needed to do. Whenever I had a <div class="example"><p>Filename.java</p></div>, I had to replace it with the contents of that file, with each line preceded by a line number.
That part is simple:
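import java.io.File
import scala.io.Source
import scala.xml.Node

// Read the file named by the text of the <p> child; wrap each line in <li><pre>.
def getExample(node: Node) =
  <ol>{Source.fromFile(new File((node \ "p").text)).getLines().map(w => <li><pre>{w} </pre></li>)}</ol>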
But how can you say “Do this for all <div class="example">, and leave the rest alone”?
In a functional program, you need to copy the tree, so I figured I should write a universal transformer method.
import scala.xml.{Elem, Node}

/**
 * Transforms all descendants matching a predicate.
 * @param n a node
 * @param pred the predicate to match
 * @param trans the transformation to apply to matching descendants
 */
def transformIf(n: Node, pred: Node => Boolean, trans: Node => Node): Node =
  if (pred(n)) trans(n)
  else n match {
    case e: Elem =>
      // Copy only when some descendant matches; otherwise reuse the node.
      if (e.descendant.exists(pred))
        e.copy(child = e.child.map(transformIf(_, pred, trans)))
      else e
    case _ => n
  }
The if (e.descendant.exists(pred)) part isn't strictly necessary. I just wanted to reuse nodes when there was no need for rewriting.
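To make the node reuse concrete, here is a minimal sketch that runs the transformer on a tiny document and checks, by reference equality, that an untouched subtree is the very same object afterwards. It assumes the scala-xml module is on the classpath. The object name TransformIfDemo and the sample document are made up for illustration, and the method is repeated in condensed form only so that the block runs on its own.

```scala
import scala.xml.{Elem, Node}

object TransformIfDemo {
  // Condensed copy of the transformer, repeated so this block is standalone.
  def transformIf(n: Node, pred: Node => Boolean, trans: Node => Node): Node =
    if (pred(n)) trans(n)
    else n match {
      // Rebuild an element only if something below it needs rewriting.
      case e: Elem if e.descendant.exists(pred) =>
        e.copy(child = e.child.map(transformIf(_, pred, trans)))
      case _ => n
    }

  // Hypothetical sample: only the second <section> contains a <b>.
  val doc: Elem = <body><section><p>keep</p></section><section><b>old</b></section></body>

  // Replace every <b> element with <i>new</i>.
  val result: Node = transformIf(doc, _.label == "b", _ => <i>new</i>)

  // The first section had nothing to rewrite, so it is shared, not copied.
  val firstSectionReused: Boolean = result.child(0) eq doc.child(0)

  // The second section was rebuilt around the replacement node.
  val secondSectionCopied: Boolean = !(result.child(1) eq doc.child(1))
}
```

Without the exists check the result would be structurally equal, but every element on the way down would be a fresh copy.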
This solved my immediate problem.
It turned out that I needed to change some other nodes as well. I could have done two transforms, or rewritten my method to take a sequence of (predicate, transformer) pairs. But then I remembered something about partial functions in the actor library.
This blog brought me up to speed. A case expression { case ... => ...; case ... => ... } can be converted to a PartialFunction. There are methods for checking whether a value is covered by one of the cases, and for applying the function. In other words, I could trivially extend my previous method to partial functions:
def transform(n: Node, pf: PartialFunction[Node, Node]): Node =
  transformIf(n, pf.isDefinedAt(_), pf.apply(_))
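The two PartialFunction methods used here, isDefinedAt and apply, are easy to try in isolation. The following sketch uses only the standard library and plain integers instead of XML nodes. The names PartialFunctionDemo, describe, doubledEvens and applyOrKeep are hypothetical and exist only for this illustration.

```scala
object PartialFunctionDemo {
  // A block of cases becomes a PartialFunction when that type is expected.
  val describe: PartialFunction[Int, String] = {
    case 0          => "zero"
    case n if n < 0 => "negative"
  }

  // Defined only for even numbers.
  val doubledEvens: PartialFunction[Int, Int] = {
    case n if n % 2 == 0 => n * 2
  }

  // Hypothetical helper with the same shape as the XML transformer:
  // apply pf where one of its cases matches, otherwise keep the value.
  def applyOrKeep[A](a: A, pf: PartialFunction[A, A]): A =
    if (pf.isDefinedAt(a)) pf(a) else a
}
```

For example, PartialFunctionDemo.describe.isDefinedAt(5) is false because no case covers positive numbers, and applyOrKeep(3, doubledEvens) hands back 3 untouched while applyOrKeep(4, doubledEvens) gives 8.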
Burak Emir explains how one can write case statements that check conditions with attributes. This is what it looks like.
transform(doc.docElem, {
  case node @ <div>{_*}</div> if node.attribute("class").getOrElse("").toString == "example" =>
    getExample(node)
  // Other cases go here
  case ... => ...
})
It reads quite nicely. When you have a div whose class attribute is example, call the getExample method.
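To see which nodes such a guarded case actually accepts, here is a small sketch that builds a partial function over nodes and probes it with isDefinedAt. It assumes the scala-xml module is on the classpath. The helper hasClass, the object AttributeMatchDemo and the second case are invented for illustration, and the first case returns a placeholder list rather than reading a file.

```scala
import scala.xml.Node

object AttributeMatchDemo {
  // Hypothetical helper: compares the text of the class attribute
  // (an empty string when the attribute is absent) with cls.
  def hasClass(node: Node, cls: String): Boolean =
    (node \ "@class").text == cls

  val rewrite: PartialFunction[Node, Node] = {
    // The element pattern ignores attributes, so the guard does the filtering.
    case node @ <div>{_*}</div> if hasClass(node, "example") =>
      <ol><li>{(node \ "p").text}</li></ol>
    // A second, unrelated case: turn <b> into <strong>, keeping the children.
    case <b>{kids @ _*}</b> =>
      <strong>{kids}</strong>
  }
}
```

A div with class="example" is covered by the first case, a div with any other class is not covered at all, and so a transformer built on isDefinedAt leaves it alone.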
Eat your heart out, Java!
There is a larger message here. Consider again the task described in this blog: replacing <div class="example"><p>Filename.java</p></div> with <ol><li><pre>each line in that file</pre></li></ol>. Yes, I could program it in Java, but the thought makes my skin crawl.
A while ago, I resolved to use Scala for all my little processing tasks so that I would get to know it over time. It was painful at first: tasks that I knew I could have completed easily in Java took some research and definitely took me out of my comfort zone. But over time, this has paid off. I can now easily do tasks in Scala that I would never have attempted in Java.