Barbarian Meets Coding

WebDev, UX & a Pinch of Fantasy

Entity Framework

Entity Framework is an open source ORM built by Microsoft and a de facto standard within the .NET ecosystem. Like any ORM, it defines mappings between your domain objects and SQL databases, and it makes talking to your database transparent by letting you operate directly on your domain objects.

Entity Framework 5

Code First

With Entity Framework Code First you can take advantage of your existing domain model classes to interact with your database. Entity Framework Code First builds an in-memory data model that maps your domain model classes to the tables within your database. If you don’t have a database, EF can create a new database from your model. Additionally, as your domain model changes over time, EF provides migration APIs that allow you to migrate your existing database to reflect the new state of your domain model. Generally EF handles all this mapping between domain model and database through conventions, but you can explicitly define your own configuration if you wish.

Here we have an example domain model that you could use in a blog app:


    public class Blog 
    { 
        public int BlogId { get; set; } 
        public string Name { get; set; } 
 
        public virtual List<Post> Posts { get; set; } 
    } 
 
    public class Post 
    { 
        public int PostId { get; set; } 
        public string Title { get; set; } 
        public string Content { get; set; } 
 
        public int BlogId { get; set; } 
        public virtual Blog Blog { get; set; } 
    } 

Note how the virtual keyword is applied to the navigation properties to enable lazy loading of the objects related to a particular domain object. You can also see how a Post is related to a Blog (navigation property) and also includes the BlogId (which represents a foreign key). This feels like an invasion of database concerns inside my domain model, but it makes it easier to work with relationships between domain objects when using Entity Framework.

Note also how we have a PostId and a BlogId. EF requires them: by default it expects every domain object to have a key, and it finds that key by looking for a property named Id or <TypeName>Id. You can define the key explicitly by using data annotations (in this case [Key]):


// [Key] lives in this namespace
using System.ComponentModel.DataAnnotations;

public class Category
{
    [Key]
    public int CatId {get; set;}
    public string Name {get; set;}
}

When you have a one-to-one relationship you will also need to specify which object is the principal and which is the dependent:


public class Author
{
    public int AuthorId {get; set;}
    public string Name {get;set;}

    public virtual ContactDetails Details {get;set;}
}

public class ContactDetails
{
    [Key, ForeignKey("Author")]
    public int AuthorId {get;set;}
    public string Email {get;set;}
    ...

    // the ForeignKey attribute points to this property
    public virtual Author Author{get; set;}
}

Once you have defined a domain model you need a way to let Entity Framework know about your domain model classes and a way to interact with the database. You can achieve both through EF's DbContext class. First you will need to add EF to your project or solution, which you can do via NuGet:

> Install-Package EntityFramework

This will add the EF assembly to your project along with an App.Config file containing basic EF settings, including a default connection factory that uses a LocalDB database. (If this configuration is not present, EF will default to SQL Express.) Note that EF reads its configuration from the startup project of your solution, so this App.Config, or whichever config file holds the EF settings, must live there.

Once we have added Entity Framework we can create our application data context by inheriting from DbContext:


public class BlogDbContext: DbContext
{

    // You only need to add here the entities
    // that you want to operate with directly
    public DbSet<Blog> Blogs {get;set;}
    public DbSet<BlogPost> BlogPosts {get;set;}

}

Note how we make EF aware of our classes by declaring DbSet<T> properties. DbSet<T> (and IDbSet<T>) constitutes the API that we will use to query and perform operations on the objects in our database. It is also worth noting that you are not limited to a single DbContext: within large applications it is even recommended to use multiple DbContexts.

When EF interacts with the database using Code First, it builds an in-memory model of all the metadata it needs to map your domain model to the database. It does so by inferring all this metadata from your domain model. Because the conceptual model that EF infers can differ from what you want, you can verify it by using the Entity Framework Power Tools (right-click on the class that inherits from DbContext, then click Entity Framework and View Entity Data Model (Read-only)).

Generating a Database From Your Code First Domain Model

Code First doesn’t care about the database at all until runtime. Once the application is running, the DbContext is instantiated and it needs to interact with the database, EF will try to find the database either via a specific database connection or via one created by convention:

  1. If no connection string is provided:
    1. Locate DB Server
      1. By default it will look into SQL Express
      2. If another configuration is provided (for instance a LocalDb connection provider) EF will use it to locate the DB server
    2. Locate DB by looking for a DB with the same name as the fully qualified type name of the DbContext (e.g. MyApp.Data.MyDbContext)
  2. If connection string is provided then EF will use it

Once the connection to a database server is established EF will either operate on the existing database or create a new database if it doesn’t exist (it will use the metadata inferred from the domain model to define the schema for the new database).
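To make the connection string branch of the lookup above concrete, a context can name the connection string it expects in its constructor. This is only a minimal sketch: the BlogDb connection string name is hypothetical and assumes a matching entry in the connectionStrings section of the startup project's config file, and the trimmed-down Blog class only mirrors the one shown earlier.

```csharp
using System.Data.Entity;

// Trimmed-down copy of the Blog class from the domain model above
public class Blog
{
    public int BlogId { get; set; }
    public string Name { get; set; }
}

public class BlogDbContext : DbContext
{
    // "name=BlogDb" asks EF to look up a connection string called BlogDb
    // in App.config/Web.config and to throw if it cannot be found, instead
    // of falling back to a database located by convention.
    // BlogDb is a hypothetical name used for illustration.
    public BlogDbContext() : base("name=BlogDb")
    {
    }

    public DbSet<Blog> Blogs { get; set; }
}
```

Passing just "BlogDb" (without the name= prefix) would also work, but in that case EF treats the value as a database name when no matching connection string exists.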

Code Migrations in Entity Framework Code First

Entity Framework allows you to evolve your database alongside the growth, expansion and change of your domain model through Code Migrations.

By default, Code First will only create the database if it does not exist. Every time a new in-memory model is generated it is assigned a hash, and Entity Framework uses this hash to make sure that the database it connects to is in the same state as the in-memory model. If it is not, EF will throw an exception warning us that the model backing the context has changed since the database was created.

You can change this behavior by running the enable-migrations command from the NuGet Package Manager Console, with your data layer project selected as the default project. It has two modes:

  • Automatic, in which EF handles migrations for you (suitable for simple scenarios) (enable-migrations -EnableAutomaticMigrations)
  • Code-based, in which you have complete control over how and when migrations occur (enable-migrations)

This will create a Migrations folder with a migration Configuration class that will inherit from DbMigrationsConfiguration<yourDbContext>:


internal sealed class Configuration : DbMigrationsConfiguration<MyDbContext>
{
    public Configuration()
    {
       // With automatic migrations
       AutomaticMigrationsEnabled = true;
       // Disable exceptions when removing properties or objects
       // that may lead to data loss
       AutomaticMigrationDataLossAllowed = true;
    }

    protected override void Seed(MyDbContext context)
    {
        // this method will be called on db initialization (even if there are no changes in the model)
        // you can use the DbSet AddOrUpdate method to avoid creating 
        // duplicated seed data
    }

}
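As a rough illustration of the AddOrUpdate comment above, here is a sketch of a Seed method that can run repeatedly without duplicating rows. The Blog and MyDbContext classes are trimmed-down stand-ins, and the seeded blog name is just sample data.

```csharp
using System.Data.Entity;
using System.Data.Entity.Migrations;

public class Blog
{
    public int BlogId { get; set; }
    public string Name { get; set; }
}

public class MyDbContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }
}

internal sealed class Configuration : DbMigrationsConfiguration<MyDbContext>
{
    public Configuration()
    {
        AutomaticMigrationsEnabled = true;
    }

    protected override void Seed(MyDbContext context)
    {
        // AddOrUpdate is an extension method from System.Data.Entity.Migrations.
        // Rows are matched on Name (not on the generated key), so running
        // Seed again updates the existing row instead of inserting a copy.
        context.Blogs.AddOrUpdate(
            b => b.Name,
            new Blog { Name = "Barbarian Meets Coding" });
    }
}
```

Matching on a natural identifier such as Name is a design choice for this sketch; if the value you match on is not unique in the table, AddOrUpdate will not behave predictably.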

After you have enabled migrations using the enable-migrations command you still need to explicitly tell Entity Framework to use the correct database initialization strategy. There are four initializers (implementations of IDatabaseInitializer):

  • The default is CreateDatabaseIfNotExists which creates a database if it doesn’t exist but doesn’t handle migrations
  • The DropCreateDatabaseIfModelChanges is useful during development but cannot be used in a production setting since it will remove all existing data whenever the model changes
  • The DropCreateDatabaseAlways is excellent for Continuous Integration and testing environments where you want to start fresh on every deployment
  • The MigrateDatabaseToLatestVersion applies any pending Code First migrations when the context is initialized

You can set up the database initialization strategy in the startup of your project or within your config file:


Database.SetInitializer(new MigrateDatabaseToLatestVersion<MyDbContext, Configuration>());

Using Data Annotations to Configure Mappings Declaratively

Some of the data annotations that you can use to further configure how your domain model classes are mapped into tables and columns are:

  • [Table("tableName")]
  • [Key]
  • [Column("columnName")]
  • [Required]
  • [MaxLength(10)]
  • [MinLength(12)]
  • [StringLength(12)]
  • [Timestamp]
  • [NotMapped]
  • [ForeignKey]
  • [ComplexType]
  • [ConcurrencyCheck]
  • [InverseProperty]
  • [DatabaseGenerated]
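To give a feel for how a few of these annotations combine, here is a small sketch of an annotated class. The table name, column name and the Slug and RowVersion properties are made up for the example and are not part of the blog model above.

```csharp
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;

// Map the class to a table called Posts (hypothetical name)
[Table("Posts")]
public class Post
{
    [Key]
    public int PostId { get; set; }

    // NOT NULL column limited to 200 characters
    [Required, MaxLength(200)]
    public string Title { get; set; }

    // Stored in a column called Body instead of Content
    [Column("Body")]
    public string Content { get; set; }

    // Lives only in memory, no column is created for it
    [NotMapped]
    public string Slug { get; set; }

    // rowversion column used for optimistic concurrency checks
    [Timestamp]
    public byte[] RowVersion { get; set; }
}
```

[Required] and [MaxLength] do double duty here: EF uses them for the schema and also validates them before saving.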

Configuring Mappings Imperatively With The Fluent API

Using data annotations can lead to putting too much database related information within your domain model classes. If you want to leave your domain classes a little cleaner and a little bit more POCO-ish then you can use EF fluent API to encapsulate all mapping related configurations. (The validation data annotations can still feel more like business rules and do have a place in the domain model, it’s the database ones that stink xD). Additionally the fluent API provides more comprehensive functionality to configure mappings (like defining how object hierarchies are mapped to database tables).

The entry point to Entity Framework fluent API is the DbModelBuilder class that is normally passed as an argument to the OnModelCreating virtual method of the DbContext:


public class MyDbContext : DbContext
{
    public DbSet<Blog> Blogs {get;set;}

    // here we hijack the in-memory entity model building process
    protected override void OnModelCreating(DbModelBuilder modelBuilder) 
    {
        modelBuilder.Entity<Blog>().HasKey(b => b.BlogId).ToTable("MyBlogs");
        modelBuilder.Entity<Blog>().Property(b => b.BlogId).HasColumnName("Id");
        // adding relationships works with a pattern:
        // Entity.HasXXX(NavigationProperty).WithXXX(InverseNavigationProperty)
        // example: 1 <-> 0..1
        modelBuilder.Entity<ContactDetails>().HasRequired(c => c.Author).WithOptional(a => a.Details);
        ...
        base.OnModelCreating(modelBuilder);
    }
}

You can also separate configurations by extending the EntityTypeConfiguration<TEntity> class. This will allow you to specify configurations per object type:


public class BlogMappings : EntityTypeConfiguration<Blog>
{
    public BlogMappings()
    {
        HasKey(b => b.BlogId).ToTable("MyBlogs");
        Property(b => b.BlogId).HasColumnName("Id");
        // etc
    }
}

You will need to update your DbContext so that it will be aware of these configurations:

modelBuilder.Configurations.Add(new BlogMappings());

How the Code First DbContext Finds Out About Entities to Model

  • DbSet properties publicly exposed in the DbContext class
  • Classes directly referenced by these DbSet
  • Classes configured via the fluent API

You can remove a class from the DbContext entity data model by using the DbModelBuilder Ignore<T> method.
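A minimal sketch of Ignore<T> follows. The BlogStatistics class and the Statistics property are hypothetical: they stand for something that Blog references in code but that should not end up in the database.

```csharp
using System.Data.Entity;

// Hypothetical helper type that should stay out of the database
public class BlogStatistics
{
    public int Visits { get; set; }
}

public class Blog
{
    public int BlogId { get; set; }
    public string Name { get; set; }

    // Reachable from Blog, so by convention EF would try to model it
    public BlogStatistics Statistics { get; set; }
}

public class MyDbContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Exclude the type from the entity data model altogether
        modelBuilder.Ignore<BlogStatistics>();
        base.OnModelCreating(modelBuilder);
    }
}
```

If you only want to leave out a single property rather than a whole type, the [NotMapped] annotation or the fluent Ignore method on the entity configuration is usually the better fit.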

Reverse Engineering Databases To Code First With Entity Framework Power Tools

The EF power tools provide a feature to generate a domain model plus fluent configuration from an existing database.

Using Enums and Spatial Information

With EF5 you can work seamlessly with enums and spatial information. In order to use spatial information in your domain model check out the DbGeography class in the System.Data.Spatial namespace within the System.Data.Entity assembly.
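As a small sketch of what this looks like in a domain class (assuming EF5 on .NET 4.5; the PostStatus enum and the WrittenAt property are invented for the example):

```csharp
// EF5 on .NET 4.5; EF6 moved DbGeography to System.Data.Entity.Spatial
using System.Data.Spatial;

// Hypothetical enum used for illustration
public enum PostStatus
{
    Draft = 0,
    Published = 1,
    Archived = 2
}

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }

    // Stored as an integer column, usable directly in LINQ queries
    public PostStatus Status { get; set; }

    // Stored as a geography column when targeting SQL Server
    public DbGeography WrittenAt { get; set; }
}
```

With this in place a query such as context.Posts.Where(p => p.Status == PostStatus.Published) compares against the underlying integer value in the generated SQL.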

Other Ways to Work with Entity Framework

In addition to the Code First approach, EF also lets you generate a domain model from your database or start working with an EDM (Entity Data Model, edited through a graphical designer that allows you to generate a domain model and define the mappings to your SQL database). For more info refer to MSDN.

Interacting With The DbContext

The DbContext takes care of querying your database, tracking changes and managing the state of the entities that are within your context and persisting these changes to the DB.

Querying and Performing Operations on Data via The DbContext

You can use LINQ over your DbContext to perform queries to your DB.


using (var context = new MyDbContext()){
    // Get all blog posts
    var posts = context.BlogPosts.ToList();
    // Get a blog post by id
    // EF will check in-memory first before making a call to the db
    var aPost = context.BlogPosts.Find(42);
    // Add a new entity to the db
    context.BlogPosts.Add(new BlogPost { Title = "Hello World"});
    context.SaveChanges();
    // Update
    var samePost = context.BlogPosts.Find(42);
    samePost.Title = "aaaaa";
    context.SaveChanges();
    // Delete
    context.BlogPosts.Remove(samePost);
    context.SaveChanges();
}

Note that by default all updates done within the same context before calling the context.SaveChanges method will be wrapped into a transaction when run against a database.

Additionally you can perform operations on whole graphs of objects. Just be careful about which related objects the context needs to be made aware of (since a db context doesn’t know about objects that come from other contexts):

var blogPost = new BlogPost{
    AuthorId = jaime.Id, // setting the foreign key avoids the need to attach the author to the newly created context
    Title = "Hello world",
    Content = "...",
};
using (var context = new MyDbContext()){
    // when I add the blogPost to the context
    // every object within the object graph is added to the context
    context.BlogPosts.Add(blogPost);
    // nothing is written to the database until SaveChanges is called
    context.SaveChanges();
}

When querying an object from the DB you can also load related data either eagerly or after the fact (as needed). To eagerly load related data for your entities you use the Include method on a query, or a query projection. You can get related data after the fact either explicitly, by using the Load method, or lazily, in which case accessing a relation that has not been loaded yet triggers an additional query to the database.


using (var context = new MyDbContext()){
    // Eager loading (the lambda overload of Include needs "using System.Data.Entity;")
    var blogPosts = context.BlogPosts.Include(b => b.Comments).ToList();
    // the more includes that you do the crazier the generated SQL queries get so be careful
    // Query projection into an anonymous type
    var blogPostsAndFirstComment = context.BlogPosts
        .Select(b => new {
            Post = b,
            FirstComment = b.Comments.OrderByDescending(c => c.Date).FirstOrDefault()
        }).ToList();

    var blogPost = context.BlogPosts.Find(11);
    // After the fact explicit loading
    context.Entry(blogPost).Collection(b => b.Comments).Load();
    
    // Lazy loading
    // you need to enable it by adding the virtual keyword within your domain properties that you want lazy loaded
    // and in the configuration of your DbContext Configuration.LazyLoadingEnabled = true;
    blogPost.Comments.First(); // triggers a query if the comments are not in the context
}

Working With Stored Procedures

You cannot map stored procedures with Code First (in EF5) but you can still call them via the DbContext. You can do so through the ExecuteSqlCommand and the SqlQuery methods.
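A sketch of both calls might look like the following. The dbo.GetRecentPosts and dbo.ArchiveOldPosts procedures are hypothetical, as is the StoredProcedureCalls helper class, and the BlogPost and MyDbContext classes are trimmed-down stand-ins. The result columns of the query are assumed to match the property names of BlogPost.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

public class BlogPost
{
    public int BlogPostId { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
}

public class MyDbContext : DbContext
{
    public DbSet<BlogPost> BlogPosts { get; set; }
}

// Hypothetical helper that wraps two hypothetical stored procedures
public static class StoredProcedureCalls
{
    public static List<BlogPost> GetRecentPosts(MyDbContext context, int count)
    {
        // @p0 is bound to the first argument as a DbParameter, so the
        // value is never concatenated into the SQL text.
        // Entities returned by Database.SqlQuery are not tracked by the context.
        return context.Database
            .SqlQuery<BlogPost>("EXEC dbo.GetRecentPosts @p0", count)
            .ToList();
    }

    public static int ArchiveOldPosts(MyDbContext context, DateTime olderThan)
    {
        // Returns whatever the database reports as the command result
        // (typically the number of rows affected)
        return context.Database
            .ExecuteSqlCommand("EXEC dbo.ArchiveOldPosts @p0", olderThan);
    }
}
```

If you want the returned entities to be tracked, the SqlQuery method on the DbSet itself (context.BlogPosts.SqlQuery(...)) is the variant to reach for.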

Architecting Your Application With Entity Framework

Automated Testing with Entity Framework

Unit Testing

When you are writing unit tests you want to fake away dependencies like the DB (and thus EF) to avoid long running tests and test coupling.

Faking, Mocking and Stubbing DbSets and DbContexts

The first thing that you need to do is make sure that you expose IDbSet from your DbContext instead of DbSet. This will allow you to completely fake the db sets you are exposing via your db context and ensure that no unit test hits the database.

The next thing that you’ll need to do is to extract an interface from your DbContext implementation (IBlogContext in this example) that exposes the same IDbSet properties.


public interface IBlogContext : IDisposable
{
    IDbSet<Blog> Blogs {get;}    
    IDbSet<BlogPost> BlogPosts {get;}    

    int SaveChanges();
    // etc
}

public class BlogContext : DbContext, IBlogContext
{
    // public setters are needed so that DbContext can initialize the sets
    public IDbSet<Blog> Blogs {get; set;}
    public IDbSet<BlogPost> BlogPosts {get; set;}
}

TODO: Is there any in-memory library for EF unit testing?? That would come handy :)
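Until a library turns up, one option is to hand-roll a fake. Below is a sketch of an in-memory IDbSet<T> backed by a collection and LINQ to Objects. FakeDbSet is a made-up name, the Blog class is a trimmed-down stand-in, and Find is deliberately left unsupported because a generic fake does not know which property is the key.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Data.Entity;
using System.Linq;
using System.Linq.Expressions;

public class Blog
{
    public int BlogId { get; set; }
    public string Name { get; set; }
}

// Hypothetical in-memory stand-in for a DbSet, for unit tests only
public class FakeDbSet<T> : IDbSet<T> where T : class
{
    private readonly ObservableCollection<T> _items = new ObservableCollection<T>();
    private readonly IQueryable<T> _query;

    public FakeDbSet()
    {
        // LINQ to Objects over the live collection: later additions are visible to queries
        _query = _items.AsQueryable();
    }

    public T Add(T entity) { _items.Add(entity); return entity; }
    public T Attach(T entity) { _items.Add(entity); return entity; }
    public T Remove(T entity) { _items.Remove(entity); return entity; }

    public T Create() { return Activator.CreateInstance<T>(); }
    public TDerived Create<TDerived>() where TDerived : class, T
    {
        return Activator.CreateInstance<TDerived>();
    }

    // A generic fake cannot know the key; override this in a derived fake
    public virtual T Find(params object[] keyValues)
    {
        throw new NotSupportedException("Override Find in a fake that knows the entity key.");
    }

    public ObservableCollection<T> Local { get { return _items; } }

    // IQueryable plumbing delegated to the in-memory query
    public Type ElementType { get { return _query.ElementType; } }
    public Expression Expression { get { return _query.Expression; } }
    public IQueryProvider Provider { get { return _query.Provider; } }

    public IEnumerator<T> GetEnumerator() { return _items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return _items.GetEnumerator(); }
}
```

A fake context implementing IBlogContext can then expose FakeDbSet<Blog> and FakeDbSet<BlogPost> instances and return 0 from SaveChanges. Keep in mind that LINQ to Objects is more permissive than LINQ to Entities, so a query that passes against the fake can still fail against a real database.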

Integration Testing

Entity Framework Code First is especially good for integration testing because you can easily create a new database and seed it just for testing:


// declare new db initializer
public class MyDBInitializer : DropCreateDatabaseAlways<MyDbContext>
{
    protected override void Seed(MyDbContext context)
    {
    //seed with test data
    }
}

// use it within your test setup *or* in the constructor
// if you want to avoid DB initialization in each test
// and have faster running tests
// Beware of coupling your tests too much if that is the case
// (They are coupled by using the same db but you should
// minimize impact from one test to the next to avoid
// WTF errors)

[TestFixture]
public class SomeTests{
    [SetUp]
    public void SetUp(){
        Database.SetInitializer(new MyDBInitializer());
        using (var context = new MyDbContext())
        {
            // force initialization now, even if it already ran in this AppDomain
            context.Database.Initialize(true);
        }
    }
    //.... tests
}

Code First Migrations

To enable migrations in your project open the package manager console and run (in your data access layer project):

PS>enable-migrations

This will create a Migrations folder in your project with a Configuration class that will contain your migrations configuration. Additionally, if you are working with a new database generated by EF, an initial migration that describes how to recreate your database to the current point will be created.

The configuration will look like this:

internal sealed class Configuration: DbMigrationsConfiguration<MyDbContext>
{
    public Configuration()
    {
        AutomaticMigrationsEnabled = false;
    }

    protected override void Seed(MyDbContext context)
    {
        // seed with data after migrating
    }
    
}

The initial migration that is created for you will have an Up method that contains information on how to move a database to the latest version, and a Down method to downgrade the database to the previous state.

In order to fully enable migrations in your project you will still need to set the right database initialization strategy in your EF configuration: MigrateDatabaseToLatestVersion<DbContext, MigrationConfiguration>.

Database.SetInitializer(new MigrateDatabaseToLatestVersion<MyDbContext, Configuration>());

Other interesting commands that you can use with the package manager console are:

  • update-database: update database based on the currently defined migrations
  • add-migration {name}: create a new migration
  • add-migration {name} -IgnoreChanges: create a new empty migration (useful for applying EF migrations to an existing database)

Automatic Migrations

To enable automatic migrations just set the AutomaticMigrationsEnabled property of your DbMigrationsConfiguration class to true. Once you’ve done this EF will take care of migrations for you.

By default, if a migration would cause a loss of data, EF will throw an AutomaticDataLossException. You can disable this behavior in the configuration by enabling the AutomaticMigrationDataLossAllowed property.

Code-based Migrations

By disabling automatic migrations you have full control over, and full responsibility for, the migration of your database. You will need to use add-migration and update-database explicitly to create and apply new migrations.

The code-based migrations workflow works as follows: whenever you make changes in your domain model you run the add-migration command. EF compares the existing model in the DB with your current domain model and generates a migration for you in the form of a class that inherits from DbMigration. After that you apply the migration to your database with the update-database command (you can use the -Verbose option to get more information about what is happening during the database migration).
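For orientation, a migration class of the kind add-migration produces could look roughly like this. It is a hand-written sketch, not actual scaffolder output: the Person table and Name column are taken as examples, and the real scaffolded class also comes with a designer file holding metadata that is not shown here.

```csharp
using System.Data.Entity.Migrations;

// Sketch of a migration that adds a Name column to a Person table
public partial class AddNamePropertyToPerson : DbMigration
{
    // Applied when moving the database forward
    public override void Up()
    {
        AddColumn("Person", "Name", c => c.String(maxLength: 100));
    }

    // Applied when rolling back to the previous migration
    public override void Down()
    {
        DropColumn("Person", "Name");
    }
}
```

Up and Down should mirror each other, so that update-database can move in both directions across this migration.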

You can move the database to a specific version by using the update-database command as follows:

PS> update-database -TargetMigration:"AddNamePropertyToPerson"

You can use update-database -? to get more information on how to use that command. Some common options are:

  • TargetMigration: target schema
  • Script and SourceMigration: create SQL script, starting schema
  • Force: allow data loss
  • ProjectName
  • ConfigurationTypeName
  • StartUpProjectName
  • ConnectionString and ConnectionStringProviderName

Code-based migrations and the DbMigration class give you plenty of options to fine-tune your migration scripts. For instance, adding a default value for a new column:

AddColumn("Person", "Name", /* column definition */c => c.String(defaultValue: "my default name"));

You can run arbitrary SQL from your migration classes by using the Sql method:

AddColumn("Person", "NumberOfEyes", c => c.Int());
Sql("UPDATE Person SET NumberOfEyes = 2");

Additionally you can generate the update as a SQL script, instead of applying it, by using the -script option of the update-database command. Remember to set the sourcemigration parameter to the migration you want the script to start from.

update-database -script -verbose -sourcemigration:"newproperty4"

When going to production it is wise to turn off migrations and database initialization (and only update the database via SQL scripts) to make sure that EF doesn’t do anything unexpected with your data. You can do this by passing null as the database initializer:

Database.SetInitializer<MyDbContext>(null);

You can also set it in the application configuration file (App.Config, Web.Config) with the attribute disableDatabaseInitialization=true of the context element.

All the PowerShell commands are shortcuts to access the DbMigrator class. For instance update-database is roughly a shortcut for:

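var migrationConfig = new Configuration();
// DbConnectionInfo lives in System.Data.Entity.Infrastructure;
// the SQL Server provider name is an assumption for this example
migrationConfig.TargetDatabase = new DbConnectionInfo(connectionString, "System.Data.SqlClient");
var migrator = new DbMigrator(migrationConfig);
migrator.Update();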

Entity Framework 6

Written by Jaime González García, dad, husband, software engineer, ux designer, amateur pixel artist, tinkerer and master of the arcane arts. You can also find him on Twitter jabbering about random stuff.