An article was just posted on MSDN advocating the use of custom business entities over datasets. Although I too believe that DataSets should not be exposed by the business layer, this article is rather poor and should not be on MSDN.

 

Before you can understand my comments you have to read the article, or at least the sections my comments are referring to.

 

Problems with the article

 

Argument against datasets #1: Lack of Abstraction

Well, if you don't design a typed dataset and you simply call adapter.Fill(), what do you expect? The article compares building an entire set of custom business entities, plus collections for them, with using an existing class from System.Data without doing any work at all. The comparison should be against typed datasets, which of course do not have to bear any resemblance to the database's schema at all. When we talk about datasets we are not referring to just System.Data.DataSet, but also to typed datasets, table mappings, adapters and so on. In any case, the article is effectively arguing that ADO.NET does not offer abstraction, which is just crazy.

 

Argument against datasets #2: Weakly-Typed

This is like saying that C# is a weakly-typed language because I can use object for all my variables. Just use typed datasets, over which you have complete control with the help of annotations.

 

Argument against datasets #3: Not Object-Oriented

 

"The "hello world" of OO programming is typically a Person class that is sub-classed by an Employee class. DataSets, however, don't make this type of inheritance, or most other OO techniques, possible (or at least natural/intuitive)"

 

This is exactly the type of article that makes new developers go mad over inheritance and use it in all the wrong places. Inheritance is just one of the aspects of OO programming. All the other OO techniques such as encapsulation and interfaces are equally important. Also, why are datasets making all these techniques impossible, unnatural or unintuitive? DataSet inherits from Object. A new class can inherit from DataSet. DataSet implements a number of interfaces such as IComponent, IDisposable, ISerializable and so on. The claim makes no sense at all.
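To make the point concrete, here is a minimal sketch of a class that inherits from DataSet and hides its tables behind its own members. CustomerDataSet, AddCustomer and CustomerCount are names I made up for illustration; they are not from the article or from ADO.NET.

```csharp
using System;
using System.ComponentModel;
using System.Data;
using System.Runtime.Serialization;

// Hypothetical subclass, used only to show that DataSet can be inherited from.
public class CustomerDataSet : DataSet
{
    public CustomerDataSet() : base("Customers")
    {
        DataTable table = Tables.Add("Customer");
        table.Columns.Add("Name", typeof(string));
    }

    // Encapsulation: callers add customers without touching tables or columns.
    public void AddCustomer(string name)
    {
        Tables["Customer"].Rows.Add(name);
    }

    public int CustomerCount
    {
        get { return Tables["Customer"].Rows.Count; }
    }
}

public static class Program
{
    public static void Main()
    {
        CustomerDataSet customers = new CustomerDataSet();
        customers.AddCustomer("Alice");

        // The subclass is still usable through the interfaces DataSet implements.
        Console.WriteLine(customers is IComponent);
        Console.WriteLine(customers is IDisposable);
        Console.WriteLine(customers is ISerializable);
        Console.WriteLine(customers.CustomerCount);
    }
}
```

Whether you would want to design a business layer this way is a separate question; the sketch only shows that inheritance, encapsulation and interfaces are all available.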

 

Other points

 

"If your idea of a Data Access Layer is to return a DataSet, you are likely missing out on some significant benefits. One reason for this is that you may be using a thin or non-existent business layer that, amongst other things, limits your ability to abstract."

 

This has nothing to do with datasets. If I am not a capable programmer and cannot develop a proper business layer, not using datasets is not going to help. I can just as easily create a "business entity" with a hashtable of properties which clients have access to. There goes “abstraction” again.
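Here is roughly what I mean, as a deliberately bad sketch. LooseUser is a hypothetical class of my own, not something from the article. It is a "custom business entity", yet it is exactly as untyped as a raw DataSet.

```csharp
using System;
using System.Collections;

// Hypothetical "business entity" that is really just a property bag.
public class LooseUser
{
    // Exposing the table directly gives clients untyped, string-keyed access.
    public readonly Hashtable Properties = new Hashtable();
}

public static class Program
{
    public static void Main()
    {
        LooseUser user = new LooseUser();
        user.Properties["UserName"] = "alice";

        // A typo in the key compiles fine and silently yields null.
        object name = user.Properties["UserNmae"];
        Console.WriteLine(name == null);

        // Every read needs a cast, so type errors surface only at run time.
        string userName = (string)user.Properties["UserName"];
        Console.WriteLine(userName);
    }
}
```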

 

"The second thing to notice is that instead of using a SqlDataReader for our mapping function, we use an IDataRecord. This is an interface that all DataReaders implement. Using IDataRecord makes our mapping process vendor-independent. In other words, we can use the previous function to map a User from an Access database, even if it uses an OleDbDataReader. If you combine this specific approach with the Provider Model Design Pattern (link 1, link 2), you'll have code that can be easily used for different database vendors."

 

Why is this even part of the article? This is completely irrelevant to the dataset / custom entities debate.

 

"Dealing with NULLs in DataSets isn't the easiest thing—that's because every time you pull a value you need to check if it's NULL. With the above population method we've conveniently taken care of this in a single place, and spared our consumers from having to deal with it."

 

Datasets provide a very nice feature through annotations. When accessing a field that is null, you can choose to return a default value, throw an exception, or simply return null. How do custom business entities offer anything more than that?
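As a rough illustration of handling NULLs in one place on top of a DataRow, here is a hand-written wrapper. UserRow is my own hypothetical stand-in and is not the code a typed dataset generates; it only assumes that the null policy (here, returning an empty string) can live in a single accessor.

```csharp
using System;
using System.Data;

// Hand-written stand-in for a typed row; the names are hypothetical.
public class UserRow
{
    private readonly DataRow row;

    public UserRow(DataRow row)
    {
        this.row = row;
    }

    public string UserName
    {
        get { return (string)row["UserName"]; }
    }

    // The null policy lives here, once: return a default instead of DBNull.
    public string Email
    {
        get { return row.IsNull("Email") ? string.Empty : (string)row["Email"]; }
    }
}

public static class Program
{
    public static void Main()
    {
        DataTable users = new DataTable("Users");
        users.Columns.Add("UserName", typeof(string));
        users.Columns.Add("Email", typeof(string));
        users.Rows.Add("alice", DBNull.Value);

        // Consumers read Email without checking for NULL themselves.
        UserRow alice = new UserRow(users.Rows[0]);
        Console.WriteLine(alice.Email.Length);
    }
}
```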

 

“This tends to work well when the database schema closely resembles the custom entity (as in this example)”

 

I’m confused. Is having a domain design that is similar to the database schema a good thing or a bad thing?

 

“As your system grows in complexity and the differences between the two worlds start to appear, having a clear separation between your data layer and business layer can greatly help simplify maintenance (I like to call this the Data Access Layer).”

 

So you end up with a “layer” that sits between two other layers and knows about the existence of both? This doesn’t sound like much of a layer to me.

 

“While custom collections might seem like a lot of code, most of it is code generation or cut and paste friendly, oftentimes requiring only one search and replace.”

 

Did I just read that in an MSDN article?

 

“Using DataSets the same way can be achieved with DataTable.Select. It is important to note that while creating your own functionality puts you in absolute control of your code, the Select method provides a very convenient and code-free means of doing the same thing. On the flip side, Select requires developers to know the underlying database and isn't strongly-typed.”

 

When using custom business entities you need to know about them and their relationships. For example, you need to know that User has a property called UserName, which is a string. This is no different from using a dataset, where you need to know its data tables and their fields. Also, datasets do not need to have the same schema as the database, and if you use typed datasets then everything is strongly typed.
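A small sketch of the point about Select, assuming a hypothetical in-memory Users table that I built by hand. The filter expression refers to the DataTable's own columns, which may or may not match anything in the database. UserQueries and its methods are illustrative names only.

```csharp
using System;
using System.Data;

public static class UserQueries
{
    // Hypothetical table; its column names are chosen here, not by a database.
    public static DataTable BuildUsers()
    {
        DataTable users = new DataTable("Users");
        users.Columns.Add("UserName", typeof(string));
        users.Columns.Add("Age", typeof(int));
        users.Rows.Add("alice", 30);
        users.Rows.Add("bob", 17);
        return users;
    }

    // The caller needs to know the DataTable's columns, just as a caller
    // of a custom entity needs to know that entity's properties.
    public static DataRow[] FindAdults(DataTable users)
    {
        return users.Select("Age >= 18");
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (DataRow row in UserQueries.FindAdults(UserQueries.BuildUsers()))
        {
            Console.WriteLine((string)row["UserName"]);
        }
    }
}
```

The string filter is indeed not checked at compile time, which is the one fair part of the article's complaint.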

 

“design patterns aren't meant to be 100% cut and paste”

 

What is up with all the references to cutting and pasting code? I mean, we all do it sometimes, but it is not something that you should promote. It often leads to much more pain than gain.

 

“There's nothing to say that design patterns only apply to custom entities, and in fact many don't. However, if you give them a chance you'll likely be pleasantly surprised at how many well documented patterns do apply to custom entities and the mapping process.”

 

In other words, we should use custom business entities because then we have the opportunity of making use of design patterns. I don’t know about you, but I prefer not having a problem rather than creating one myself just so I can make use of an existing solution.

 

Concurrency

I don’t think that such an important and complicated matter is done justice with just the following:

 

“One way to totally avoid any conflicts is to use pessimistic concurrency; however, this method requires some type of locking mechanism, which can be difficult to implement in a scalable manner. The alternative is to use optimistic concurrency techniques. Letting the first commit dominate and notifying subsequent users is typically a gentler and more user-friendly approach to take. This is achieved by some type of row versioning, such as timestamps.”

 

Right, so what are the differences between using custom entities and datasets in this case? I thought the article was a debate between these two, not a general guide to database programming. I guess it is convenient to leave out the fact that all the concurrency work datasets already do for you has to be reproduced by hand with custom entities.
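As one example of that work, a DataRow keeps both the original and the current value of each field after an edit. The sketch below shows only that bookkeeping on an in-memory table, with no database involved. RowVersions is a hypothetical helper name of mine.

```csharp
using System;
using System.Data;

public static class RowVersions
{
    // Edits one row and returns { original value, current value }.
    public static string[] EditAndReadVersions()
    {
        DataTable users = new DataTable("Users");
        users.Columns.Add("UserName", typeof(string));
        users.Rows.Add("alice");

        // Mark the row as unchanged, as it would be right after a Fill.
        users.AcceptChanges();

        DataRow row = users.Rows[0];
        row["UserName"] = "alice2";

        // The row now tracks both versions of the value.
        return new string[]
        {
            (string)row["UserName", DataRowVersion.Original],
            (string)row["UserName", DataRowVersion.Current]
        };
    }
}

public static class Program
{
    public static void Main()
    {
        string[] versions = RowVersions.EditAndReadVersions();
        Console.WriteLine(versions[0]);
        Console.WriteLine(versions[1]);
    }
}
```

The original values are what an optimistic update can compare against the database, and a DataAdapter's Update raises DBConcurrencyException when an update affects no rows. With custom entities you would have to keep that original state, and detect the conflict, yourself.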

 

Alternatives

 

I don’t believe in criticizing something without a good reason and I don’t believe in claiming that an idea is bad without offering a better alternative. My view on business entities and datasets is that they both have good things to offer and they work best when they work together. I have started a series of posts on the subject; you can find the first one here.