Wednesday, January 17, 2007

Writing your own XSD.exe

If you spend any time working with Web Services, or even just XML, you'll inevitably come into contact with XSD.exe and WSDL.exe. Both generate .NET code from XSD type definitions. With XSD.exe, you simply give it the path to an xsd document and it will spit out a .cs file. That file defines types that serialize to an XML document instance that validates against the xsd.

De-serializing your XML to a strongly typed object model is almost always better than fiddling with the XML DOM, but what if you don't like the code that XSD.exe generates? Well, you can easily spin your own XSD.exe, since it simply uses the public framework types System.Xml.Serialization.XmlSchemaImporter and System.Xml.Serialization.XmlCodeExporter, together with CodeDom. For some reason the MSDN documentation on these classes says, "This class supports the .NET Framework infrastructure and is not intended to be used directly from your code." Don't let that put you off: they're public types and work fine.

At a high level the process goes like this (you can follow it with the code sample below):
  1. Load your xsd file into an XmlSchema.
  2. Create an XmlSchemaImporter instance that references your schema. This class is used to generate mappings from XSD types to .NET types.
  3. Create a CodeDom CodeNamespace instance where you'll build the syntactic structure of your .NET types.
  4. Create an XmlCodeExporter instance with a reference to that CodeNamespace. This is the class that actually creates the syntactic structure of the .NET types in the CodeNamespace.
  5. Use the XmlSchemaImporter to create an XmlTypeMapping instance for each type that you wish to export from the XSD.
  6. Call the ExportTypeMapping method on XmlCodeExporter for each XmlTypeMapping object. This creates the type's syntax in the CodeNamespace object.
  7. Use a CSharpCodeProvider to output C# source code for the types that were created in the CodeNamespace object.
Once the CodeNamespace has been fully populated (after step 6 above) there's an opportunity to make any changes we wish to the code we output. Note that at this stage the CodeDom CodeNamespace object represents a language-independent syntactic structure rather than source code in a particular language. We could just as easily generate VB.NET at this point. We can use the CodeDom methods to alter that structure before outputting source code. In the example below I run the RemoveAttributes function to remove some attributes from the type definitions.
using System;
using System.IO;
using System.Collections.Generic;
using System.Reflection;
using System.Text;
using System.Xml;
using System.Xml.Serialization;
using System.Xml.Schema;
using System.CodeDom;
using System.CodeDom.Compiler;

using Microsoft.CSharp;

using NUnit.Framework;

namespace XmlSchemaImporterTest
{
  [TestFixture]
  public class XsdToClassTests
  {
      // Test for XmlSchemaImporter
      [Test]
      public void XsdToClassTest()
      {
          // identify the path to the xsd
          string xsdFileName = "Account.xsd";
          string path = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
          string xsdPath = Path.Combine(path, xsdFileName);

          // load the xsd
          XmlSchema xsd;
          using(FileStream stream = new FileStream(xsdPath, FileMode.Open, FileAccess.Read))
          {
              xsd = XmlSchema.Read(stream, null);
          }
          Console.WriteLine("xsd.IsCompiled {0}", xsd.IsCompiled);

          // add the schema to the collection before compiling it
          XmlSchemas xsds = new XmlSchemas();
          xsds.Add(xsd);
          xsds.Compile(null, true);
          XmlSchemaImporter schemaImporter = new XmlSchemaImporter(xsds);

          // create the codedom
          CodeNamespace codeNamespace = new CodeNamespace("Generated");
          XmlCodeExporter codeExporter = new XmlCodeExporter(codeNamespace);

          // create a mapping for each global type and element in the schema
          List<XmlTypeMapping> maps = new List<XmlTypeMapping>();
          foreach(XmlSchemaType schemaType in xsd.SchemaTypes.Values)
          {
              maps.Add(schemaImporter.ImportSchemaType(schemaType.QualifiedName));
          }
          foreach(XmlSchemaElement schemaElement in xsd.Elements.Values)
          {
              maps.Add(schemaImporter.ImportTypeMapping(schemaElement.QualifiedName));
          }

          // export each mapping into the CodeNamespace
          foreach(XmlTypeMapping map in maps)
          {
              codeExporter.ExportTypeMapping(map);
          }

          // alter the CodeDom structure before generating source code
          RemoveAttributes(codeNamespace);

          // Check for invalid characters in identifiers
          CodeGenerator.ValidateIdentifiers(codeNamespace);

          // output the C# code
          CSharpCodeProvider codeProvider = new CSharpCodeProvider();

          using(StringWriter writer = new StringWriter())
          {
              codeProvider.GenerateCodeFromNamespace(codeNamespace, writer, new CodeGeneratorOptions());
              Console.WriteLine(writer.ToString());
          }
      }

      // Remove all the attributes from each type in the CodeNamespace, except
      // System.Xml.Serialization.XmlTypeAttribute
      private void RemoveAttributes(CodeNamespace codeNamespace)
      {
          foreach(CodeTypeDeclaration codeType in codeNamespace.Types)
          {
              // remember the XmlTypeAttribute, if the type has one
              CodeAttributeDeclaration xmlTypeAttribute = null;
              foreach(CodeAttributeDeclaration codeAttribute in codeType.CustomAttributes)
              {
                  if(codeAttribute.Name == "System.Xml.Serialization.XmlTypeAttribute")
                  {
                      xmlTypeAttribute = codeAttribute;
                  }
              }

              // clear everything, then put the XmlTypeAttribute back
              codeType.CustomAttributes.Clear();
              if(xmlTypeAttribute != null)
              {
                  codeType.CustomAttributes.Add(xmlTypeAttribute);
              }
          }
      }
  }
}

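The point about the CodeNamespace not being tied to a language can be seen with a minimal sketch like the one below. It is only an illustration under some assumptions: it targets the .NET Framework (or the System.CodeDom package elsewhere), and the CodeDomLanguageDemo class, the hand-built Account type and its Name field are hypothetical stand-ins for whatever XmlCodeExporter would produce from a real schema. The same CodeNamespace is handed first to a CSharpCodeProvider and then to a VBCodeProvider.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;
using Microsoft.VisualBasic;

public static class CodeDomLanguageDemo
{
    // Build a tiny CodeNamespace by hand. "Account" and "Name" are
    // hypothetical; in the post these types come from XmlCodeExporter.
    public static CodeNamespace BuildNamespace()
    {
        CodeNamespace codeNamespace = new CodeNamespace("Generated");
        CodeTypeDeclaration account = new CodeTypeDeclaration("Account");
        account.Members.Add(new CodeMemberField(typeof(string), "Name"));
        codeNamespace.Types.Add(account);
        return codeNamespace;
    }

    // Render the same CodeNamespace as source code with whichever provider is passed in.
    public static string Generate(CodeDomProvider provider, CodeNamespace codeNamespace)
    {
        using (StringWriter writer = new StringWriter())
        {
            provider.GenerateCodeFromNamespace(codeNamespace, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }

    public static void Main()
    {
        CodeNamespace codeNamespace = BuildNamespace();

        // One syntactic structure, two output languages.
        Console.WriteLine(Generate(new CSharpCodeProvider(), codeNamespace));
        Console.WriteLine(Generate(new VBCodeProvider(), codeNamespace));
    }
}
```

Any changes made to the CodeNamespace before calling Generate, such as the attribute stripping in RemoveAttributes, show up in both outputs, because both providers read from the same structure.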

Karl Böhlmark said...

Thanks for a truly useful post. However I ran into trouble when attempting to read schemas with xsd:import. How can you handle schema dependencies?

This is also somewhat of a problem when using xsd.exe since it doesn't handle the schemaLocation on imports, but the workaround is to specify all schemas on the commandline. How would that translate to this solution?


Mike Hadlow said...

Hi Karl,

Thanks, I'm glad you found this useful. I do remember looking at the 'imports' issue when I was writing WsdlWorks (my attempted WS test tool). There was some part of the XML API that handled it, but I can't for the life of me remember where. Sorry :(

It might be worth pointing reflector at the Visual Studio WS proxy generation tool, since that is certainly able to load multi-file schemas. The WCF code might be similarly useful.


Anonymous said...

Interesting stuff. Will be trying this out to customize my generated classes.

A small note: "List maps = new List()" is missing the type specification on the generic List, which should be XmlTypeMapping. If you view the page source you can see that the left and right angle brackets haven't been escaped, so the browser (at least Firefox) interprets the type specification as an HTML tag.

Mike Hadlow said...

Thanks anonymous. Yes, I was a bit slack in those days about escaping my angle brackets. As you say, the best thing is to get the code from view->source.

Anonymous said...

While looking for a way to generate an XML document from an xsd schema definition (one that was generated by xsd.exe), I found your solution after getting a hint in the official Silverlight forum.

I have a question: is there a way to make this code work in the Silverlight framework?

At the moment it seems that most of the classes used are, again, not available.

Thanks for your help !



Mike Hadlow said...

Hi Roland, sorry I don't know the answer to that off the top of my head.

I had the same problem recently, wanting to do some encryption on the client using Silverlight. It seemed like the ideal choice until I realised the BCL classes I wanted weren't part of the Silverlight framework :(

tobsen said...

Great article! Thanks!

Anonymous said...

A really helpful article, thanks.

Felice Pollano said...

Thank you,
I used your info today. Really good job. I will have to deal with a *lot* of XSDs, and with your work I think I will automate the assembly creation at runtime.

Anonymous said...

I was wondering if you knew a way, once the code has been generated, to use reflection on the generated types to produce HTML from them.

Anonymous said...

Nice one! Thanks for sharing.
This solution will change the dll, but I don't want to change the dll. How can I do it?

Anonymous said...

Can anyone tell me how to implement a tool that takes an XML document as input and returns its attributes?