Extensible Markup Language - XML

The XML parsing module offers Flows for parsing XML documents and for processing the resulting parse events.

Artifacts

sbt
libraryDependencies += "com.lightbend.akka" %% "akka-stream-alpakka-xml" % "0.15"
Maven
<dependency>
  <groupId>com.lightbend.akka</groupId>
  <artifactId>akka-stream-alpakka-xml_2.12</artifactId>
  <version>0.15</version>
</dependency>
Gradle
dependencies {
  compile group: 'com.lightbend.akka', name: 'akka-stream-alpakka-xml_2.12', version: '0.15'
}

XML parsing

An XML processing pipeline starts with an XmlParsing.parser flow, which parses a stream of ByteStrings into XML parser events.

Scala
val parse = Flow[String]
  .map(ByteString(_))
  .via(XmlParsing.parser)
  .toMat(Sink.seq)(Keep.right)
Java
final Sink<String, CompletionStage<List<ParseEvent>>> parse = Flow.<String>create()
  .map(ByteString::fromString)
  .via(XmlParsing.parser())
  .toMat(Sink.seq(), Keep.right());

To parse an XML document, run a source containing the document with this parser.

Scala
val doc = "<doc><elem>elem1</elem><elem>elem2</elem></doc>"
val resultFuture = Source.single(doc).runWith(parse)
Java
final String doc = "<doc><elem>elem1</elem><elem>elem2</elem></doc>";
final CompletionStage<List<ParseEvent>> resultStage = Source.single(doc).runWith(parse, materializer);
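
As a rough illustration of what a stream of parser events looks like, the sketch below walks the same document with the JDK's built-in StAX reader (javax.xml.stream) rather than Alpakka. The class StaxEventSketch and the text labels it produces are hypothetical; the labels are only chosen to resemble the ParseEvent names used on this page, and the events XmlParsing.parser actually emits may differ in detail.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Alpakka: lists StAX events as text labels.
class StaxEventSketch {

  static List<String> events(String xml) {
    List<String> out = new ArrayList<>();
    try {
      XMLInputFactory factory = XMLInputFactory.newInstance();
      // Report adjacent text as a single CHARACTERS event.
      factory.setProperty(XMLInputFactory.IS_COALESCING, true);
      XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
      // A freshly created reader is positioned on START_DOCUMENT.
      out.add("StartDocument");
      while (reader.hasNext()) {
        switch (reader.next()) {
          case XMLStreamConstants.START_ELEMENT:
            out.add("StartElement(" + reader.getLocalName() + ")");
            break;
          case XMLStreamConstants.CHARACTERS:
            out.add("Characters(" + reader.getText() + ")");
            break;
          case XMLStreamConstants.END_ELEMENT:
            out.add("EndElement(" + reader.getLocalName() + ")");
            break;
          case XMLStreamConstants.END_DOCUMENT:
            out.add("EndDocument");
            break;
          default:
            break; // comments, processing instructions etc. are ignored here
        }
      }
    } catch (XMLStreamException e) {
      throw new IllegalArgumentException("Malformed XML", e);
    }
    return out;
  }

  public static void main(String[] args) {
    String doc = "<doc><elem>elem1</elem><elem>elem2</elem></doc>";
    events(doc).forEach(System.out::println);
  }
}
```

Each start tag, text node and end tag becomes one event, so a consumer can react to the document structure without holding the whole document in memory.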

XML writing

An XML processing pipeline ends with an XmlWriting.writer flow, which writes a stream of XML parser events to ByteStrings.

Scala
val writer: Sink[ParseEvent, Future[String]] = Flow[ParseEvent]
  .via(XmlWriting.writer)
  .map[String](_.utf8String)
  .toMat(Sink.fold[String, String]("")((t, u) => t + u))(Keep.right)
Java
final Sink<ParseEvent, CompletionStage<String>> write = Flow.of(ParseEvent.class)
  .via(XmlWriting.writer())
  .map((ByteString bs) -> bs.utf8String())
  .toMat(Sink.fold("", (acc, el) -> acc + el), Keep.right());

To write an XML document, run a source of XML parser events with this writer.

Scala
val listEl: List[ParseEvent] = List(
  StartDocument,
  StartElement("doc", Map.empty),
  StartElement("elem", Map.empty),
  Characters("elem1"),
  EndElement("elem"),
  StartElement("elem", Map.empty),
  Characters("elem2"),
  EndElement("elem"),
  EndElement("doc"),
  EndDocument
)

// The string these events are expected to serialize to
val doc = "<?xml version='1.0' encoding='UTF-8'?><doc><elem>elem1</elem><elem>elem2</elem></doc>"
val resultFuture: Future[String] = Source.fromIterator[ParseEvent](() => listEl.iterator).runWith(writer)
Java
// The string these events are expected to serialize to
final String doc = "<?xml version='1.0' encoding='UTF-8'?><doc><elem>elem1</elem><elem>elem2</elem></doc>";
final List<ParseEvent> docList = new ArrayList<>();
docList.add(StartDocument.getInstance());
docList.add(StartElement.create("doc", Collections.emptyMap()));
docList.add(StartElement.create("elem", Collections.emptyMap()));
docList.add(Characters.create("elem1"));
docList.add(EndElement.create("elem"));
docList.add(StartElement.create("elem", Collections.emptyMap()));
docList.add(Characters.create("elem2"));
docList.add(EndElement.create("elem"));
docList.add(EndElement.create("doc"));
docList.add(EndDocument.getInstance());


final CompletionStage<String> resultStage = Source.from(docList).runWith(write, materializer);
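
To make the event-to-text direction concrete, here is a minimal sketch that writes the same sequence of events with the JDK's StAX writer (javax.xml.stream) instead of Alpakka. StaxWriterSketch is a hypothetical name, and the XML declaration produced by the JDK writer is not expected to match the one shown in the doc string above character for character.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper, not part of Alpakka: serializes a fixed event sequence.
class StaxWriterSketch {

  static String write() {
    StringWriter out = new StringWriter();
    try {
      XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
      w.writeStartDocument();            // StartDocument
      w.writeStartElement("doc");        // StartElement("doc")
      for (String text : new String[] {"elem1", "elem2"}) {
        w.writeStartElement("elem");     // StartElement("elem")
        w.writeCharacters(text);         // Characters(...)
        w.writeEndElement();             // EndElement("elem")
      }
      w.writeEndElement();               // EndElement("doc")
      w.writeEndDocument();              // EndDocument
      w.close();
    } catch (XMLStreamException e) {
      throw new IllegalStateException("Could not write XML", e);
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(write());
  }
}
```

The writer calls mirror the event list one for one, which is the same correspondence the XmlWriting.writer flow relies on when it turns ParseEvents back into bytes.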

XML Subslice

Use XmlParsing.subslice to filter out all events that do not belong to elements on a given path.

Scala
val parse = Flow[String]
  .map(ByteString(_))
  .via(XmlParsing.parser)
  .via(XmlParsing.subslice("doc" :: "elem" :: "item" :: Nil))
  .toMat(Sink.seq)(Keep.right)
Java
final Sink<String, CompletionStage<List<ParseEvent>>> parse = Flow.<String>create()
  .map(ByteString::fromString)
  .via(XmlParsing.parser())
  .via(XmlParsing.subslice(Arrays.asList("doc", "elem", "item")))
  .toMat(Sink.seq(), Keep.right());

To get a subslice of an XML document, run a source containing the document with this parser.

Scala
val doc =
  """
    |<doc>
    |  <elem>
    |    <item>i1</item>
    |    <item><sub>i2</sub></item>
    |    <item>i3</item>
    |  </elem>
    |</doc>
  """.stripMargin
val resultFuture = Source.single(doc).runWith(parse)
Java
final String doc =
  "<doc>" +
  "  <elem>" +
  "    <item>i1</item>" +
  "    <item><sub>i2</sub></item>" +
  "     <item>i3</item>" +
  "  </elem>" +
  "</doc>";
final CompletionStage<List<ParseEvent>> resultStage = Source.single(doc).runWith(parse, materializer);
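
The path filtering can be sketched without Alpakka by tracking the stack of open elements while reading StAX events from the JDK (javax.xml.stream). This is only an approximation under one assumption: that subslice passes through the events nested inside elements matching the path, but not the start and end events of the matching elements themselves. SubsliceSketch and its text labels are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Alpakka: keeps only events under a path.
class SubsliceSketch {

  static List<String> subslice(String xml, List<String> target) {
    List<String> out = new ArrayList<>();
    List<String> path = new ArrayList<>(); // open elements, outermost first
    try {
      XMLInputFactory factory = XMLInputFactory.newInstance();
      factory.setProperty(XMLInputFactory.IS_COALESCING, true);
      XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
      while (reader.hasNext()) {
        switch (reader.next()) {
          case XMLStreamConstants.START_ELEMENT:
            // Check before pushing, so the matching element itself is skipped.
            if (inside(path, target)) out.add("StartElement(" + reader.getLocalName() + ")");
            path.add(reader.getLocalName());
            break;
          case XMLStreamConstants.CHARACTERS:
            if (inside(path, target)) out.add("Characters(" + reader.getText() + ")");
            break;
          case XMLStreamConstants.END_ELEMENT:
            // Pop first, so the matching element's end tag is skipped as well.
            String name = path.remove(path.size() - 1);
            if (inside(path, target)) out.add("EndElement(" + name + ")");
            break;
          default:
            break;
        }
      }
    } catch (XMLStreamException e) {
      throw new IllegalArgumentException("Malformed XML", e);
    }
    return out;
  }

  // True when the currently open elements start with the target path.
  static boolean inside(List<String> path, List<String> target) {
    return path.size() >= target.size() && path.subList(0, target.size()).equals(target);
  }

  public static void main(String[] args) {
    String doc = "<doc>  <elem>    <item>i1</item>    <item><sub>i2</sub></item>"
        + "    <item>i3</item>  </elem></doc>";
    subslice(doc, Arrays.asList("doc", "elem", "item")).forEach(System.out::println);
  }
}
```

For the document above and the path doc, elem, item, this sketch yields Characters(i1), StartElement(sub), Characters(i2), EndElement(sub), Characters(i3). The whitespace between the item elements is dropped because it sits directly under elem, outside the path.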