Thursday, June 05, 2008

LINQ to CSV

I thought it would be nice to be able to produce a CSV file by doing something like this:

string ordersCsv = orderRepository.GetAll().Select(o => new 
{ 
    OrderId = o.OrderId,
    Email = o.Email,
    OrderStatus = o.OrderStatus.Name,
    CreatedDate = o.CreatedDate,
    Total = o.Basket.Total
}).AsCsv();

So here's an extension method to do just that:

public static string AsCsv<T>(this IEnumerable<T> items)
    where T : class
{
    var csvBuilder = new StringBuilder();
    var properties = typeof (T).GetProperties();
    foreach (T item in items)
    {
        string line = properties.Select(p => p.GetValue(item, null).ToCsvValue()).ToArray().Join(",");
        csvBuilder.AppendLine(line);
    }
    return csvBuilder.ToString();
}

private static string ToCsvValue<T>(this T item)
{
    if (item is string)
    {
        return "\"{0}\"".With(item.ToString().Replace("\"", "\\\""));
    }
    double dummy;
    if (double.TryParse(item.ToString(), out dummy))
    {
        return "{0}".With(item);
    }
    return "\"{0}\"".With(item);
}

It's work with anything that implements IEnumerable<T>, that includes the results of LINQ-to-SQL queries , arrays, List<T> and pretty much any kind of collection. Here's it's unit test:

[TestFixture]
public class EnumerableExtensionsTests
{
    [Test]
    public void GetCsv_ShouldRenderCorrectCsv()
    {
        IEnumerable<Thing> things = new List<Thing>()
            {
                new Thing
                    {
                        Id = 12,
                        Name = "Thing one",
                        Date = new DateTime(2008, 4, 20),
                        Child = new Child
                                    {
                                        Name = "Max"
                                    }
                    },
                new Thing
                    {
                        Id = 13,
                        Name = "Thing two",
                        Date = new DateTime(2008, 5, 20),
                        Child = new Child
                                    {
                                        Name = "Robbie"
                                    }
                    }
            };

        string csv = things.Select(t => new { Id = t.Id, Name = t.Name, Date = t.Date, Child = t.Child.Name }).AsCsv();

        Assert.That(csv, Is.EqualTo(expectedCsv));
    }

    const string expectedCsv = 
@"12,""Thing one"",""20/04/2008 00:00:00"",""Max""
13,""Thing two"",""20/05/2008 00:00:00"",""Robbie""
";

    public class Thing
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public DateTime Date { get; set; }
        public Child Child { get; set; }
    }

    public class Child
    {
        public string Name { get; set; }
    }
}

11 comments:

  1. Anonymous8:45 am

    Cool stuff.

    One worry on the date format though. i think it would be better to use a portable format (i.e - insensitive to mmdd/ddmm and to timezones)
    so:
    if (item is DateTime)
    {
    return string.Format("{0:u}", item);
    }

    ReplyDelete
  2. Hi Ken,

    Thanks, that's a good suggestion.

    There are other things that I really should do like escape commas properly and new lines. It's not really the most robust CSV implementation at the moment.

    ReplyDelete
  3. Hi Mike, when I try to compile, I get an error "string does not contain a definition for 'With'..." at "return "\"{0}\"".With(item) ... any ideas?

    Luis.

    ReplyDelete
  4. Hi Luisfx,

    Sorry 'With' is an extension method on string in place of the string.format statement. Just change the line to:

    return string.Format("\"{0}\"", item.ToString().Replace("\"", "\\\""));

    And it should work fine.

    ReplyDelete
  5. Anonymous2:41 pm

    Mike,

    You haven't thought about the scenario where the thing you are converting has a string that has a comma in it!

    ReplyDelete
  6. Anonymous2:42 pm

    Woops,

    I just saw your comment!

    ReplyDelete
  7. Anonymous6:57 pm

    Hi Mike
    I'm also getting the error
    "No overload for method 'Join' takes '1' arguments",
    any idea why?

    Thanks

    ReplyDelete
  8. Anonymous1:59 pm

    Hi,

    I get the same error:

    "No overload for method 'Join' takes '1' arguments",

    ReplyDelete
  9. Hi Anonymous,

    I can't remember now, but I may have written a little extension method for join. It's a simple matter to use string.Join() instead.

    ReplyDelete
  10. if you want to had a line with the names of the linq properties do something like:

    if (UseHeader)
    {
    csvBuilder.Append(string.Join(",", (from a in properties select a.Name).ToArray()));
    }

    put this above the line that iterates over the ienumerable

    ReplyDelete
  11. Anonymous9:42 am

    Changed your ToCsvValue function to handle nulls, and got rid of the custom Extension methods:

    private static string ToCsvValue(this T item)
    {
    if (item is string)
    {
    return string.Format("\"{0}\"", item.ToString().Replace("\"", "\\\"")); ;
    }

    double dummy;
    if (item == null)
    return "";

    if (double.TryParse(item.ToString(), out dummy))
    return string.Format("{0}", item);

    return string.Format("\"{0}\"", item);
    }

    Also, regarding Scott's post on getting the headers: Change Append() to AppendLine() to avoid losing the first row of data!

    ReplyDelete

Note: only a member of this blog may post a comment.