Something To Code

All about programming and information security

Using the repository pattern

In most applications database access is necessary. In .NET there are two main ways of doing this: ADO.NET or Entity Framework. Of course there are great third-party libraries like NHibernate that do the trick as well, and if you are courageous you can use native libraries to access your database.

We usually need data from other places as well. We may need to access a directory service (like Active Directory) to obtain additional user info, or call web services to access remote data. Not all the data will come from a SQL Server database either: other database management systems such as Oracle or PostgreSQL are possible too. And then there are the good old XML files, CSV files and flat files.

I’m sure you can think of other data sources than those that I have just listed. But for your application it isn’t important where data is stored or where it comes from. We want to be able to think about the data in a more abstract way.

The repository

When we use ADO.NET, we’re close to the database. We may need to write SQL to obtain or manipulate our data. We’re close to the tables as well. Sometimes this is an advantage because we know exactly what we’re doing, but when things start to evolve this may become more complex.

With Entity Framework we are one step further. The database is largely abstracted away, and we can use inheritance and other OO mechanisms in our data models. But we’re still talking to a database, and we depend on it.

So let’s think what we really want to do in our application. We want to get a list of customers, we want to be able to insert, update, delete customers, and probably we want to filter some data left and right. But we don’t care where this data comes from.

So we need another abstraction layer, which we call a repository. Repositories can be implemented in many ways. Some people like to use generics in their repositories (which saves a lot of repetitive work), others like to create “pinpointed” repositories for the job at hand. And of course we can start from the generic one and then add some pinpointed code.

Contacts example

Let’s create a simple contacts application. We want to show a list of contacts, be able to insert a new contact, update contacts and delete contacts. So we can create a class like:

public class ContactsRepository
{
        public IEnumerable<Contact> GetContacts()
        {  … }

        public Contact GetContactByID(int contactID)
        {  … }

        public Contact Insert(Contact ins)
        {  … }

        public Contact Update(Contact upd)
        {  … }

        public Contact Delete(int contactID)
        {  … }
}

This class can be implemented using Entity Framework, or ADO.NET, or XML files, or … The user of the class doesn’t care, as long as the class behavior is right. So we effectively abstracted away the way of storing our data.
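To make this concrete, here is a minimal sketch of one possible implementation that simply keeps the contacts in memory. The shape of the Contact class (ID and Name) is assumed from how it is used later in this post; the class name InMemoryContactsRepository and the way IDs are handed out are hypothetical choices for illustration, not the only way to do it.

```csharp
using System.Collections.Generic;
using System.Linq;

// Entity shape assumed from its later use in this post (ID and Name).
public class Contact
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Hypothetical in-memory implementation, for illustration only.
public class InMemoryContactsRepository
{
    private readonly List<Contact> _contacts = new List<Contact>();
    private int _nextID = 1;

    public IEnumerable<Contact> GetContacts()
    {
        // Return a copy so callers cannot modify the internal list.
        return _contacts.ToList();
    }

    public Contact GetContactByID(int contactID)
    {
        // Returns null when no contact has this ID.
        return _contacts.FirstOrDefault(c => c.ID == contactID);
    }

    public Contact Insert(Contact ins)
    {
        ins.ID = _nextID++;   // the store assigns the key
        _contacts.Add(ins);
        return ins;
    }

    public Contact Update(Contact upd)
    {
        var existing = GetContactByID(upd.ID);
        if (existing == null) return null;
        existing.Name = upd.Name;
        return existing;
    }

    public Contact Delete(int contactID)
    {
        var existing = GetContactByID(contactID);
        if (existing != null) _contacts.Remove(existing);
        return existing;      // the removed contact, or null
    }
}
```

A caller that only sees these five methods cannot tell whether the data lives in a list, a file or a database, which is exactly the point.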

This screams for interfaces… Using VS2015 we right-click on the class name > Quick Actions > Extract interface > OK. The IContactsRepository interface is generated for us.

public interface IContactsRepository
{
        Contact Delete(int contactID);

        Contact GetContactByID(int contactID);

        IEnumerable<Contact> GetContacts();

        Contact Insert(Contact ins);

        Contact Update(Contact upd);
}

This interface can be made generic. We just need to specify the class name for the entity type. If you have standards that say that all primary keys will be integers then that will be enough. Otherwise you’ll need to make the data type for the primary key generic as well. In this example we’ll do the latter:

public interface IRepository<T, K>
{
        T Delete(K key);
        T GetByID(K key);
        IEnumerable<T> Get();
        T Insert(T ins);
        T Update(T upd);
}

So now the IContactsRepository interface becomes simple:

public interface IContactsRepository : IRepository<Contact, int>
{
        // additional functionality
}
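As a rough sketch of what a generic implementation could look like, the class below backs IRepository<T, K> with a dictionary. The class name InMemoryRepository, the key-selector delegate passed to the constructor, and the choice to return null for unknown keys are all assumptions made for this example; an Entity Framework version would look different internally but expose the same five methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IRepository<T, K>
{
    T Delete(K key);
    T GetByID(K key);
    IEnumerable<T> Get();
    T Insert(T ins);
    T Update(T upd);
}

// Hypothetical generic implementation backed by a dictionary.
public class InMemoryRepository<T, K> : IRepository<T, K> where T : class
{
    private readonly Dictionary<K, T> _items = new Dictionary<K, T>();
    private readonly Func<T, K> _keyOf;   // tells the repository how to read an entity's key

    public InMemoryRepository(Func<T, K> keyOf)
    {
        _keyOf = keyOf;
    }

    public IEnumerable<T> Get()
    {
        return _items.Values.ToList();    // copy, so callers cannot touch the dictionary
    }

    public T GetByID(K key)
    {
        T found;
        return _items.TryGetValue(key, out found) ? found : null;
    }

    public T Insert(T ins)
    {
        _items.Add(_keyOf(ins), ins);     // throws ArgumentException on a duplicate key
        return ins;
    }

    public T Update(T upd)
    {
        var key = _keyOf(upd);
        if (!_items.ContainsKey(key)) return null;
        _items[key] = upd;
        return upd;
    }

    public T Delete(K key)
    {
        var found = GetByID(key);
        if (found != null) _items.Remove(key);
        return found;                     // the removed entity, or null
    }
}
```

For contacts this could be created as new InMemoryRepository<Contact, int>(c => c.ID). Note that in this sketch the caller supplies the key before inserting; a database-backed repository would typically let the database generate it.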

 

Code Organization

Let’s say that we implement this repository using Entity Framework. In whichever way you use it (database first, code first, model first), you’ll end up with some classes that will reflect the database tables. In our example this is the Contact class (the entity class). It may be tempting to use these classes for anything else as well, such as sending retrieved data in a WCF web service, or displaying data in an MVC application, but this is generally a bad idea.

Using entities in WCF

When we create a WCF service GetCustomers() that returns a list of Customer objects, we’ll need to specify the [DataContract] attribute on the data class that we want to return, and the [DataMember] attribute on all the properties that we want to serialize with it. You could update your entity classes to add these attributes, but when you regenerate your classes from the database your modifications will be overwritten. And that is not even the biggest problem. The biggest problem is that you have violated the Separation of Concerns principle. You are using entity classes to return web service data. If this is the only thing you intend to do with your entity classes, this may be “acceptable” (but certainly not future-proof), but if you also want to show them in an MVC application, with its own data attributes, then things will become messy.

For this reason you should put the entities and the repositories in a separate assembly, which you can name Contacts.Data. In that way you have a project which will only handle data access and will only expose the entity classes and the IRepository interfaces. Internally the interfaces will be implemented by using the Entity Framework (or something else). This assembly will be a class library, so you only need to reference it in your other projects to use it.

In the WCF project we reference the Contacts.Data project so we have access to the data. We then define our own DataContract classes, which may be a copy of the entity classes (with all the necessary attributes); or not.

Show me the code

The WCF service will not return Contacts, but Customers. Here is the definition of the Customer:

[DataContract]
public class Customer
{
        [DataMember]
        public int CustomerID { get; set; }

        [DataMember]
        public string Name { get; set; }
}

As you can see the class name is different and the ID property is now called CustomerID.

The interface is a plain vanilla WCF interface:

[ServiceContract]
public interface ICustomersService
{ 
        [OperationContract]
        IEnumerable<Models.Customer> GetCustomers();
}

Notice the [ServiceContract] and [OperationContract] attributes, which make sure that our web service exposes the GetCustomers() method.

The implementation contains few surprises as well. The only thing is that we need to convert the Contact to a Customer, something that LINQ is very suitable for:

public class CustomersService : ICustomersService
{
        private readonly IContactsRepository _repo;

        public CustomersService()
        {
            _repo = new ContactsRepository();
        }

        public IEnumerable<Customer> GetCustomers()
        {
            var contacts = _repo.GetContacts();
            var qry = from c in contacts
                      select new Customer { CustomerID = c.ID, Name = c.Name };
            return qry.ToList();    // Don’t forget ToList()
        }

}

This is a trivial conversion. Sometimes this logic may be more complex, maybe also calculating some fields. And sometimes it may be just a member-wise copy, where libraries like AutoMapper can help you reduce code.

If the code for the conversion is used in many places, then you can make a function for it. I sometimes create a utility class only for the conversions.
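Such a utility class could look roughly like the sketch below, which uses extension methods. The names ContactConversions, ToCustomer and ToCustomers are hypothetical, and the [DataContract] and [DataMember] attributes are left off the Customer class here only to keep the example self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

// Entity class from the data library (shape assumed: ID and Name).
public class Contact
{
    public int ID { get; set; }
    public string Name { get; set; }
}

// Service-side class; the WCF attributes are omitted in this sketch.
public class Customer
{
    public int CustomerID { get; set; }
    public string Name { get; set; }
}

// Hypothetical utility class that holds all Contact-to-Customer conversions.
public static class ContactConversions
{
    public static Customer ToCustomer(this Contact c)
    {
        return new Customer { CustomerID = c.ID, Name = c.Name };
    }

    public static List<Customer> ToCustomers(this IEnumerable<Contact> contacts)
    {
        // Runs in memory on whatever sequence the repository returned.
        return contacts.Select(c => c.ToCustomer()).ToList();
    }
}
```

With this in place the body of GetCustomers() could shrink to a single line such as return _repo.GetContacts().ToCustomers();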

Some caveats using EF in your repository

As you know, tables are represented as DbSets in Entity Framework. So if you have a context with a property called Contacts, then you can use it like

context.Contacts.

But this will not obtain the data from the database until you call a method like ToList() or ToArray() on it.

So in your repository you can return context.Contacts itself, typed as IQueryable<Contact>. The advantage is that in your client code (the WCF code in our example) you can now chain other methods like Where, OrderBy, … to it, and only when you call a method that retrieves your data (First, Single, Any, ToList, …) is the query generated and sent to the database. This is a very efficient way of working with Entity Framework, but it will tie you to it. If that doesn’t bother you, then this is a good solution.

Another way to implement this is by returning

context.Contacts.ToList().

In this case you obtain the list of entities from the database and return them as a collection. The advantage is clear: You don’t depend on Entity Framework now, you just get a list of Contacts that you can work with. The problem however is that subtle errors can emerge:

int nrOfContacts = repo.GetContacts().Count();

If you have 100,000 records in your database, then all the records will be loaded in memory, and you calculate your count on the records in memory.

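If you use the previous method (returning the DbSet), then a SELECT COUNT(*) will be sent to the database, resolving your query by indexes in the database and returning only the integer containing your count. Note that this only holds when the method’s return type is IQueryable<Contact>. If the DbSet is returned through an IEnumerable<Contact> return type, Count() binds to the in-memory LINQ to Objects version and still enumerates every record.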
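The difference is easy to see without a database. The sketch below is not Entity Framework: it uses a plain iterator as a stand-in for a table and counts how many “rows” are actually produced. All names in it (DeferredExecutionDemo, ReadRows, RowsRead) are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // How many "rows" the fake source has produced so far.
    public static int RowsRead;

    // Stands in for a table: every yielded value simulates loading one row.
    public static IEnumerable<int> ReadRows(int total)
    {
        for (int i = 0; i < total; i++)
        {
            RowsRead++;
            yield return i;
        }
    }

    public static void Run()
    {
        RowsRead = 0;
        var query = ReadRows(100000).Where(id => id % 2 == 0);
        Console.WriteLine(RowsRead);   // building the query reads nothing yet

        int first = query.First();
        Console.WriteLine(RowsRead);   // First() stops as soon as it has a match

        RowsRead = 0;
        int count = ReadRows(100000).Count();
        Console.WriteLine(RowsRead);   // Count() had to walk through every row
    }
}
```

With an IQueryable<Contact> coming from Entity Framework the chaining works the same way, except that the final call is translated to SQL and executed by the database instead of walking the rows in memory.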

So choose wisely!

Implementing the MVC application

In the MVC application we can call the web service. To do that we’ll first create a proxy to make our life easier, and then obtain the customers. Again it would be possible to use the Customer class directly to display the records, but this poses the same “Separation of Concerns” problem. So create a ViewModel class to handle all the communication with the user. The idea is that everything that has to do with the data will be handled by the entity classes, and everything that has to do with the representation of the data and getting data from the user will be handled by the ViewModel classes. The only additional code to write is again the trivial conversion between the entities and the view models.
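A ViewModel for the customer list could be sketched as follows. The class name CustomerViewModel, the chosen validation rules and the FromCustomer helper are assumptions for illustration; the attributes themselves come from System.ComponentModel.DataAnnotations. The Customer class stands for whatever type the generated proxy exposes.

```csharp
using System.ComponentModel.DataAnnotations;

// Stand-in for the class exposed by the web service proxy.
public class Customer
{
    public int CustomerID { get; set; }
    public string Name { get; set; }
}

// Hypothetical ViewModel: presentation and input rules live here,
// not on the entity or on the service class.
public class CustomerViewModel
{
    public int CustomerID { get; set; }

    [Required]
    [StringLength(100)]                  // maximum length picked arbitrarily for the example
    [Display(Name = "Customer name")]    // label shown in the view
    public string Name { get; set; }

    // The trivial conversion from the service class to the ViewModel.
    public static CustomerViewModel FromCustomer(Customer c)
    {
        return new CustomerViewModel { CustomerID = c.CustomerID, Name = c.Name };
    }
}
```

The display and validation attributes now only affect the MVC project, so changing a label or a validation rule never touches the web service or the data library.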

Conclusion

It may seem like we are creating a lot of classes that make no sense. But separating the concerns like this makes our code easier to understand and to change. In the data library we only work with the entity classes. In the web services we only use the entity classes to talk with the data library, and then transform them into what we want to return to the client. And in the MVC application we do the same. This gives a lot of freedom, but it also makes things much more testable. I know that you have been waiting for the T-word. I will cover tests for this flow in another post.
