System.Xml namespace. LINQ to Xml uses types that are part of the System.Xml.Linq namespace. In my experience, it shines for a large majority of the use cases for working with XML by offering a simpler method of accessing:
- Elements (XElement)
- Attributes (XAttribute)
- Nodes (XNode)
- Comments (XComment)
- Text (XText)
- Declarations (XDeclaration)
- Namespaces (XNamespace)
- Processing Instructions (XProcessingInstruction)
LINQ to Xml takes care of namespace prefixes for you when you serialize with an XmlWriter, as well as when calling XDocument.ToString() or XElement.ToString(). This is helpful in most cases as it abstracts all of the details and just does it for you. It's not helpful when you need to control the prefixes that are serialized. Some examples of when you might need to control prefix serialization are:
- comparing documents for semantic equivalence
- normalizing an XML document
- canonicalizing an XML document
Even LINQ to Xml's XNode.DeepEquals() method doesn't consider elements or attributes with different prefixes across XElement or XDocument instances to be equivalent. How disappointing.
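To make that concrete, here is a minimal sketch of the comparison problem. The two sample documents and the helpers EqualIgnoringPrefixes and StripDeclarations are hypothetical names made up for this illustration; they are not part of LINQ to Xml. The sketch assumes that dropping the xmlns declaration attributes from a copy is good enough for your notion of equivalence, which will not hold if prefixes also appear inside attribute values or text.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Two documents that differ only in the prefix bound to http://ns1.
XElement first = XElement.Parse("<x:root xmlns:x='http://ns1'><x:child x:id='1'/></x:root>");
XElement second = XElement.Parse("<y:root xmlns:y='http://ns1'><y:child y:id='1'/></y:root>");

// An XName is namespace + local name, so the expanded names match.
bool sameName = first.Name == second.Name;

// DeepEquals still reports a difference, because xmlns:x and xmlns:y
// are attributes with different names.
bool deepEqual = XNode.DeepEquals(first, second);

// Hypothetical helper: compare copies that have no namespace declarations.
static bool EqualIgnoringPrefixes(XElement left, XElement right)
{
    return XNode.DeepEquals(StripDeclarations(left), StripDeclarations(right));
}

// Hypothetical helper: deep-copy the element and remove every xmlns attribute.
static XElement StripDeclarations(XElement source)
{
    XElement copy = new XElement(source); // the source tree is left untouched
    copy.DescendantsAndSelf()
        .Attributes()
        .Where(a => a.IsNamespaceDeclaration)
        .Remove();
    return copy;
}

Console.WriteLine($"Same expanded name: {sameName}");
Console.WriteLine($"DeepEquals: {deepEqual}");
Console.WriteLine($"Ignoring prefixes: {EqualIgnoringPrefixes(first, second)}");
```

Note that DeepEquals also compares attributes in order, so this sketch only addresses the prefix part of the problem.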
const string xml = @"
<x:root xmlns:x='http://ns1' xmlns:y='http://ns2'>
<child a='1' x:b='2' y:c='3'/>
</x:root>";
XElement original = XElement.Parse(xml);
// Output the XML
Console.WriteLine(original.ToString());
The output is:
<x:root xmlns:x="http://ns1" xmlns:y="http://ns2">
<child a="1" x:b="2" y:c="3" />
</x:root>
Now create an XElement based on the original. This illustrates that the namespace prefixes specified in the original XElement are not maintained in the copy:
XElement copy =
new XElement(original.Name,
original.Elements()
.Select(e => new XElement(e.Name, e.Attributes(), e.Elements(), e.Value)));
Console.WriteLine("\nNew XElement:");
Console.WriteLine(copy.ToString());
The output is:
New XElement:
<root xmlns="http://ns1">
<child a="1" p2:b="2" p3:c="3" xmlns:p3="http://ns2" xmlns:p2="http://ns1" xmlns="" />
</root>
What happened to my prefixes of x and y? Where did the prefixes p2 and p3 come from? Also, note how there are some additional namespace declarations that aren't in the original XElement. We didn't touch anything related to namespaces when we created the copy; we just selected the elements and attributes from the original. You can't even use XNode.DeepEquals() to compare them for equivalence, since the namespace declaration attributes on the two elements are now different. I'll leave that as an exercise for the reader.
Controlling the prefixes yourself, although not very intuitive, is actually really easy. You might consider trying to change the Name property on the xmlns namespace declaration attribute to get the prefix rewritten. However, unlike the XElement.Name property, which is writable, the XAttribute.Name property is read-only.
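A small sketch of that difference, using a made-up one-element document. The variable names are only for illustration:

```csharp
using System;
using System.Xml.Linq;

XNamespace ns = "http://ns1";
XElement element = XElement.Parse("<x:root xmlns:x='http://ns1'/>");

// XElement.Name has a setter, so an element can be renamed in place.
element.Name = ns + "renamed";

// XAttribute.Name only has a getter. Uncommenting the assignment below
// results in a compile-time error.
var declaration = element.Attribute(XNamespace.Xmlns + "x");
// declaration.Name = XNamespace.Xmlns + "y";

Console.WriteLine(element.Name);      // expanded name of the renamed element
Console.WriteLine(declaration?.Name); // expanded name of the xmlns:x declaration
```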
One caveat here: if you happen to have superfluous (duplicate) namespace prefixes pointing to the same namespace, you are out of luck. I've written a lengthy description about this on Stack Overflow answering this question: c# linq xml how to use two prefixes for same namespace.
Would you believe me if I told you that LINQ to Xml is actually going to help you rewrite the prefixes? In fact, it doesn't just help you, it rewrites all of the prefixes for you! I bet you didn't see that coming! This is because LINQ to Xml keeps its tree in a consistent state whenever the XML is modified at runtime. Imagine if you made a programmatic change that left the XML tree out of whack and LINQ to Xml didn't make updates to keep things in sync. Well, welcome to debugging hell.
So, all we have to do is a couple of really simple things and LINQ to Xml handles the rewriting for us. There are two approaches: you can either create a new element or modify the existing one. I've done some performance profiling and found that for larger documents (above 1 MB) it is more efficient to create new elements and attributes rather than modifying existing ones. Some of this has to do with all of the allocations that are made when LINQ to Xml invokes the event handlers on the XObject class (see XObject Class). These are fired, thus allocating memory, regardless of whether you have event handlers wired to them or not. I'll show you both patterns so you can decide what works for you. The steps to rewrite vary depending on which approach you decide upon.
XNamespace origANamespace = "http://www.a.com";
XNamespace origBNamespace = "http://www.b.com";
const string originalAPrefix = "a";
const string originalBPrefix = "b";
const string xml = @"
<a:foo xmlns:a='http://www.a.com' xmlns:b='http://www.b.com'>
<b:bar/>
<b:bar/>
<b:bar/>
<a:bar b:att1='val'/>
</a:foo>";
// Parse the sample; both approaches below start from this element
XElement originalElement = XElement.Parse(xml);
Creating a new XElement
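There are three steps: define the namespaces, create the xmlns declaration attributes with the new prefixes, and create the new element with those attributes.
// Define new namespaces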
XNamespace newANamespace = origANamespace;
XNamespace newBNamespace = origBNamespace;
XAttribute newANamespaceXmlnsAttr =
new XAttribute(XNamespace.Xmlns + "aNew", newANamespace);
XAttribute newBNamespaceXmlnsAttr =
new XAttribute(XNamespace.Xmlns + "bNew", newBNamespace);
// Create a new XElement with the new namespace
XElement newElement =
new XElement(newANamespace + originalElement.Name.LocalName,
newANamespaceXmlnsAttr,
newBNamespaceXmlnsAttr,
originalElement
.Elements()
.Select(e =>
new XElement(
e.Name,
e.Attributes(),
e.Elements(),
e.Value)));
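One way to check the result without eyeballing the serialized text is XElement.GetPrefixOfNamespace(). The following is a self-contained sketch under a few assumptions: it uses a shortened version of the sample document, the names nsA and nsB are just local stand-ins for the two namespaces, and it passes the child elements straight to the constructor (LINQ to Xml copies elements that already have a parent), which is a simplification of the Select projection above.

```csharp
using System;
using System.Xml.Linq;

XNamespace nsA = "http://www.a.com";
XNamespace nsB = "http://www.b.com";

XElement originalElement = XElement.Parse(
    "<a:foo xmlns:a='http://www.a.com' xmlns:b='http://www.b.com'>" +
    "<b:bar/><a:bar b:att1='val'/></a:foo>");

// Same expanded names, but new prefix declarations on the root.
XElement newElement = new XElement(
    originalElement.Name,
    new XAttribute(XNamespace.Xmlns + "aNew", nsA.NamespaceName),
    new XAttribute(XNamespace.Xmlns + "bNew", nsB.NamespaceName),
    originalElement.Elements()); // elements that already have a parent are copied

// Ask the new tree which prefix is bound to each namespace.
var prefixA = newElement.GetPrefixOfNamespace(nsA);
var prefixB = newElement.GetPrefixOfNamespace(nsB);
bool sameExpandedName = newElement.Name == originalElement.Name;

Console.WriteLine($"{nsA} is bound to prefix '{prefixA}'");
Console.WriteLine($"{nsB} is bound to prefix '{prefixB}'");
Console.WriteLine($"Expanded name unchanged: {sameExpandedName}");
Console.WriteLine(newElement);
```

Only the prefix declarations differ between the two elements; the expanded names are the same, which is why nothing has to be renamed.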
Modifying an existing XElement
There are four steps, the last of which is optional depending on whether you want to remove the old namespace prefix declarations. The first two steps are identical to the "Creating a new XElement" approach.
// Define new namespaces
XNamespace newANamespace = origANamespace;
XNamespace newBNamespace = origBNamespace;
XAttribute newANamespaceXmlnsAttr =
new XAttribute(XNamespace.Xmlns + "aNew", newANamespace);
XAttribute newBNamespaceXmlnsAttr =
new XAttribute(XNamespace.Xmlns + "bNew", newBNamespace);
originalElement.Add(
newANamespaceXmlnsAttr,
newBNamespaceXmlnsAttr);
// now remove the original 'a' and 'b' namespace declarations
originalElement
.Attribute(XNamespace.Xmlns + originalAPrefix)
?.Remove();
originalElement
.Attribute(XNamespace.Xmlns + originalBPrefix)
?.Remove();
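If you do this in more than one place, the add-then-remove sequence can be wrapped in a small helper. RenamePrefix below is a hypothetical name, not something LINQ to Xml provides, and the sketch assumes the declaration lives on the element you pass in and that the new prefix is not already declared there (adding a duplicate attribute throws).

```csharp
using System;
using System.Xml.Linq;

XElement element = XElement.Parse(
    "<a:foo xmlns:a='http://www.a.com' xmlns:b='http://www.b.com'>" +
    "<b:bar/><a:bar b:att1='val'/></a:foo>");

bool renamedA = RenamePrefix(element, "a", "aNew");
bool renamedB = RenamePrefix(element, "b", "bNew");

string result = element.ToString();
Console.WriteLine(result);

// Hypothetical helper: swap the xmlns declaration for oldPrefix with one
// for newPrefix that points at the same namespace. Returns false when
// oldPrefix is not declared on this element.
static bool RenamePrefix(XElement element, string oldPrefix, string newPrefix)
{
    var oldDeclaration = element.Attribute(XNamespace.Xmlns + oldPrefix);
    if (oldDeclaration == null)
    {
        return false;
    }

    // Add the new declaration first, then drop the old one.
    element.Add(new XAttribute(XNamespace.Xmlns + newPrefix, oldDeclaration.Value));
    oldDeclaration.Remove();
    return true;
}
```

Because this version always removes the old declaration, each namespace ends up with exactly one prefix on the element, which sidesteps the duplicate-prefix caveat mentioned earlier.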
I hope you found this helpful. Drop a comment and let me know.