An article was just posted on MSDN advocating the use of custom business
entities over datasets. Although I too believe that DataSets should not be
exposed by the business layer, this article is rather poor and should not be on
MSDN.
Before you can understand my
comments you have to read the article, or at least the sections my comments are
referring to.
Problems with the article
Argument against datasets #1: Lack of Abstraction
Well, if you don't design a typed dataset and you simply do adapter.Fill(),
what do you expect? The article compares building an entire set of custom
business entities, plus collections for them, with using an existing class from
System.Data without doing any work at all. The comparison should be against
typed datasets, which of course do not have to bear any resemblance to the
database's schema at all. When we talk about datasets we are not referring to
just System.Data.DataSet, but also to typed datasets, table mappings, adapters
and so on. In any case, the article is effectively arguing that ADO.NET does
not offer abstraction, which is just crazy.
Argument against datasets #2: Weakly-Typed
This is like saying that C# is a weakly-typed language because I can use
object for all my variables. Just use typed datasets, over which you have
complete control with the help of annotations.
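To make the analogy concrete, here is a minimal sketch of the difference between untyped access and the kind of typed accessor a typed dataset gives you. The UserRow class is a hypothetical, hand-written stand-in for generated code (a real typed dataset would be produced from an XSD), and the Users table, its columns and the sample row are all assumptions for illustration.

```csharp
using System;
using System.Data;

// Hypothetical hand-written stand-in for the row class that the typed
// dataset generator would normally emit. Names are illustrative only.
public sealed class UserRow
{
    private readonly DataRow row;

    public UserRow(DataRow row)
    {
        this.row = row;
    }

    // Callers see a string, not an object they have to cast themselves.
    public string UserName
    {
        get { return (string)row["UserName"]; }
        set { row["UserName"] = value; }
    }

    // Callers see an int; the cast lives in exactly one place.
    public int LoginCount
    {
        get { return (int)row["LoginCount"]; }
        set { row["LoginCount"] = value; }
    }
}

public static class TypedAccessDemo
{
    // Builds a small in-memory table so the sketch needs no database.
    public static DataTable CreateUsersTable()
    {
        DataTable table = new DataTable("Users");
        table.Columns.Add("UserName", typeof(string));
        table.Columns.Add("LoginCount", typeof(int));
        table.Rows.Add("alice", 3);
        return table;
    }
}
```

With the plain DataTable, table.Rows[0]["LoginCount"] hands back an object that every caller has to cast. Wrapped in UserRow, the same value is available as an int property, which is roughly what a typed dataset buys you without any hand-written entity classes.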
Argument against datasets #3: Not Object-Oriented
"The "hello world" of OO programming is typically a Person class that is
sub-classed by an Employee class. DataSets, however, don't make this type of
inheritance, or most other OO techniques, possible (or at least
natural/intuitive)"
This is exactly the type of article that makes new developers go mad over
inheritance and use it in all the wrong places. Inheritance is just one of the
aspects of OO programming. All the other OO techniques such as encapsulation
and interfaces are equally important. Also, why would datasets make all these
techniques impossible, unnatural or unintuitive? DataSet inherits from Object.
A new class can inherit from DataSet. DataSet implements a number of interfaces
such as IComponent, IDisposable, ISerializable and so on. The claim makes no
sense at all.
Other points
"If your idea of a Data Access Layer is to
return a DataSet, you are likely missing out on some significant benefits. One
reason for this is that you may be using a thin or non-existent business layer
that, amongst other things, limits your ability to abstract."
This has nothing to do with
datasets. If I am not a capable programmer and cannot develop a proper business
layer, not using datasets is not going to help. I can just as easily create a
"business entity" with a hashtable of properties which clients have access
to. There goes “abstraction” again.
"The second thing to notice is that
instead of using a SqlDataReader for our mapping function, we use an
IDataRecord. This is an interface that all DataReaders implement. Using
IDataRecord makes our mapping process vendor-independent. In other words, we
can use the previous function to map a User from an Access database, even if it
uses an OleDbDataReader. If you combine this specific approach with the
Provider Model Design Pattern (link 1, link 2), you'll have code that can be
easily used for different database vendors."
Why is this even part of the
article? This is completely irrelevant to the dataset / custom entities debate.
"Dealing with NULLs in DataSets isn't the
easiest thing—that's because every time you pull a value you need to check if
it's NULL. With the above population method we've conveniently taken care of
this in a single place, and spared our consumers from having to deal with
it."
Typed datasets provide a very nice feature for this through annotations. When
accessing a field that is null, you can choose to return a default value,
throw an exception, or simply return null. How do custom business entities
offer anything more than that?
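As a rough illustration, this is what those choices look like in a typed dataset schema, assuming the standard codegen annotation namespace. The dataset, table and column names (UsersDataSet, Users, UserName, Email, Nickname, LoginCount) are hypothetical.

```xml
<xs:schema id="UsersDataSet"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"
    xmlns:codegen="urn:schemas-microsoft-com:xml-msprop">
  <xs:element name="UsersDataSet" msdata:IsDataSet="true">
    <xs:complexType>
      <xs:choice maxOccurs="unbounded">
        <xs:element name="Users">
          <xs:complexType>
            <xs:sequence>
              <!-- No annotation: reading a null value throws (the default). -->
              <xs:element name="UserName" type="xs:string" />
              <!-- Null is returned as an empty string. -->
              <xs:element name="Email" type="xs:string" minOccurs="0"
                  codegen:nullValue="_empty" />
              <!-- Null is returned as a null reference. -->
              <xs:element name="Nickname" type="xs:string" minOccurs="0"
                  codegen:nullValue="_null" />
              <!-- Null is replaced by the default value 0. -->
              <xs:element name="LoginCount" type="xs:int" minOccurs="0"
                  codegen:nullValue="0" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The null handling is declared once, in the schema, so consumers of the generated properties never have to check for DBNull themselves.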
“This tends to work well when the database schema closely resembles the custom
entity (as in this example)”
I’m confused. Is having a domain design that is similar to the database schema
a good thing or a bad thing?
“As your system grows in complexity and the differences between the two worlds
start to appear, having a clear separation between your data layer and business
layer can greatly help simplify maintenance (I like to call this the Data
Access Layer).”
So you end up with a “layer” that sits between two other layers and knows about
the existence of both? This doesn’t sound like much of a layer to me.
“While custom collections might seem like a lot of code, most of it is code
generation or cut and paste friendly, oftentimes requiring only one search and
replace.”
Did I just read that in an MSDN article?
“Using DataSets the same way can be achieved with DataTable.Select. It is
important to note that while creating your own functionality puts you in
absolute control of your code, the Select method provides a very convenient and
code-free means of doing the same thing. On the flip side, Select requires
developers to know the underlying database and isn't strongly-typed.”
When using custom business entities you need to know about them and their
relationships. For example, you need to know that User has a property called
UserName, which is a string. That is no different from using a dataset, where
you need to know its data tables and their fields. Also, datasets do not need
to have the same schema as the database, and if you use typed datasets then
everything is strongly typed.
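For reference, here is a small sketch of DataTable.Select wrapped in a method, so that callers get a typed signature while the filter string stays in one place. UserQueries and FindByUserName are hypothetical names, and the UserName column is assumed from the example above.

```csharp
using System;
using System.Data;

public static class UserQueries
{
    // Hypothetical helper: the string-based filter lives here, so callers
    // only see a typed parameter and never build the expression themselves.
    public static DataRow[] FindByUserName(DataTable users, string userName)
    {
        // Single quotes inside a filter literal are escaped by doubling them.
        string escaped = userName.Replace("'", "''");
        return users.Select("UserName = '" + escaped + "'");
    }
}
```

The caller needs to know that users have a UserName that is a string, which is exactly what a consumer of a User entity needs to know as well.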
“design patterns aren't meant to be 100% cut and paste”
What is up with all the references to cutting and pasting code? I mean, we all
do it sometimes, but it is not something that you should promote. It often
leads to much more pain than gain.
“There's nothing to say that design patterns only apply to custom entities, and
in fact many don't. However, if you give them a chance you'll likely be
pleasantly surprised at how many well documented patterns do apply to custom
entities and the mapping process.”
In other words, we should use custom business entities because then we have the
opportunity to make use of design patterns. I don’t know about you, but I would
rather not have a problem than create one myself just so I can apply an
existing solution.
Concurrency
I don’t think that such an important and complicated matter is done justice
with just the following:
“One way to totally avoid any conflicts is to use pessimistic concurrency;
however, this method requires some type of locking mechanism, which can be
difficult to implement in a scalable manner. The alternative is to use
optimistic concurrency techniques. Letting the first commit dominate and
notifying subsequent users is typically a gentler and more user-friendly
approach to take. This is achieved by some type of row versioning, such as
timestamps.”
Right, so what are the differences between using custom entities and datasets
in this case? I thought the article was a debate between these two, not a
general guide to database programming. I guess it is convenient to leave out
the fact that all the work datasets already do for you in terms of concurrency
has to be reproduced by hand when you use custom entities.
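To give an idea of what "reproduced" means, here is a minimal sketch of the row-versioning check you would have to write yourself for custom entities. Everything in it is hypothetical: VersionedUser, StaleDataException and InMemoryUserStore are illustrative names, and a dictionary stands in for the database table. With datasets, the comparable signal is the DBConcurrencyException that a data adapter throws when an update affects no rows.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical custom entity that carries its row version around.
public sealed class VersionedUser
{
    public int Id;
    public string UserName = "";
    public int Version;
}

// Hypothetical exception raised when someone else committed first.
public sealed class StaleDataException : Exception
{
    public StaleDataException(string message) : base(message) { }
}

// In-memory stand-in for a table; a real mapper would issue
// "UPDATE ... WHERE Id = @id AND Version = @version" instead.
public sealed class InMemoryUserStore
{
    private readonly Dictionary<int, VersionedUser> rows =
        new Dictionary<int, VersionedUser>();

    public void Insert(int id, string userName)
    {
        VersionedUser row = new VersionedUser();
        row.Id = id;
        row.UserName = userName;
        row.Version = 1;
        rows.Add(id, row);
    }

    // Returns a detached copy, the way a mapper would build an entity.
    public VersionedUser Load(int id)
    {
        VersionedUser stored = rows[id];
        VersionedUser copy = new VersionedUser();
        copy.Id = stored.Id;
        copy.UserName = stored.UserName;
        copy.Version = stored.Version;
        return copy;
    }

    // The first commit wins; later commits based on an old version fail.
    public void Save(VersionedUser edited)
    {
        VersionedUser stored = rows[edited.Id];
        if (stored.Version != edited.Version)
        {
            throw new StaleDataException(
                "User " + edited.Id + " was changed by someone else.");
        }
        stored.UserName = edited.UserName;
        stored.Version = stored.Version + 1;
        edited.Version = stored.Version;
    }
}
```

If two callers load the same user and both call Save, the first succeeds and the second gets a StaleDataException. None of this is hard, but it is code that the article never mentions you will have to write, test and maintain.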
Alternatives
I don’t believe in criticizing something without a good reason and I don’t
believe in claiming that an idea is bad without offering a better alternative.
My view on business entities and datasets is that they both have good things to
offer and they work best when they work together. I have started a series of
posts on the subject; you can find the first one here.