Rants Tagged with “LINQ”
UPDATE: As Roger Jennings correctly pointed out, I meant to say that Include is *not* a guarantee.
When I am using the Entity Framework for a project, I have gotten into the habit of using eager loading via the Include syntax. In case you're not familiar with it, the Entity Framework has a different philosophy than other data layers (e.g. NHibernate). In the Entity Framework, lazily loaded relationships have to be loaded explicitly (so developers never incur network round-trips without knowing about it). Whether you agree with this philosophy or not, understanding how it works is helpful when you're working with the Entity Framework.
The Entity Framework supports eager loading of the data as well using the Include syntax. For example:
var qry = from w in ctx.Workshops.Include("Topic")
orderby w.Name
select w;
By amending the source of the query with the Include method, you can eager load relationships. The problem is that these are really hints to the Entity Framework to load the relationships, not guarantees. Depending on the query, these Includes may be dropped. The two scenarios where I see this most often are grouping and subselects:
// Drops the Include
var qry = from e in ctx.Events
.Include("Workshop")
where e.EventDate >= DateTime.Today
group e by e.EventLocation.Region into r
select r;
I ran into a good post on the forums on the subject:
In that post, Diego Vega says that Include only makes sense in the following scenarios:
- Include only applies to items in the query results: objects that are projected at the outermost operation in the query, for instance, the last Select() operation (in your query, you tried to apply Include to a subquery)
- The type of the results has to be an entity type (not the case in your query)
- The query cannot contain operations that change the type of the result between Include and the outermost operation (i.e. a GroupBy() or a Select() operation that changes the result type)
- The parameter taken by Include is a dot-delimited path of navigation properties that have to be navigable from an instance of the type returned at the outermost operation
To alleviate the problem in some scenarios, you can use the EFExtensions Include Extension method to add includes on the complete query like so:
using Microsoft.Data.Extensions;
// ...
var qry = from e in ctx.Events
where e.EventDate >= DateTime.Today
select e;
var results = qry.Include("Workshop").ToList();
You can find the EFExtensions at the MSDN Code site here:
Has this bitten you before?
It seems that because of some internal NHibernate changes that are required to make NHibernate LINQ work really well, the current version of NHibernate LINQ will not be supported. Evidently there are a number of complex queries that do not work correctly under the current codebase. It's been announced that these changes will be made in NHibernate 2.1 (which is in development). Follow the link to read the full details!
I've spent much of the last couple of weeks trying to strengthen my LINQ knowledge. A friend of mine is one of the authors of a LINQ book so I figured it was a good match to dig deeper.
The book covers a lot of topics that encompass LINQ, including LINQ basics but also LINQ to SQL and LINQ to XML. I like that it starts out with a discussion of the problem and doesn't dive directly into the solution. In addition, I think it teaches the technology without resorting to starting with database applications as the example. Anyone who has heard me talk about LINQ knows that I think starting with LINQ to SQL is the wrong way to teach it to new people...they didn't fall into that trap.
In addition, I really like that there are lots of good examples and a great index. There was never an example I was looking for that the index didn't help me find. That's becoming rarer in books. I really liked their coverage of LINQ from both the consumer of LINQ and the provider of LINQ. Their discussion of the LINQ to Amazon provider provided quite a lot of good insight into how the inner workings of LINQ are put together.
My only hesitation at completely loving this book is that each example is in either C# or VB, not both. This makes the book feel a bit schizophrenic. I would have preferred a more bloated book where all the examples in print were in both languages. This is especially true of LINQ because the language integration of LINQ is very dissimilar between the languages.
Overall, I would recommend the book to anyone trying to learn LINQ as a technology.
UPDATE: I know the title is wrong and it should be IUpdateable but I didn't want to break any links for any RSS feeds that already had it.
If you haven't read Part 1 or Part 2 first, you should start there.
In this final part of the IUpdateable implementation, we will discuss the rest of the methods, which include ClearChanges, ResolveResource, ReplaceResource, SetReference, AddReferenceToCollection, RemoveReferenceFromCollection and SaveChanges.
ClearChanges is a simple method that undoes any changes that have accumulated. In my implementation, I clear the UpdateCache that we discussed in Part 2 and also clear the NHibernate Session to undo any changes that were previously saved (but not persisted to the database).
ResolveResource is a funny little method. In the implementation documentation for the IUpdateable interface, they specify that in most cases you can return a token or anything you want that uniquely identifies an item instead of an actual object reference. For example, when you implement CreateResource the return value can be the actual resource or can be some token that you know how to resolve to the resource. This method is the one that will turn that token into an object reference if you've used that functionality. In our case we're actually returning object references so this becomes a non-issue and we just return the object that is passed in.
Next up is ReplaceResource. Not surprisingly, this method is used to completely replace a resource with another copy (which may have the correct values already set). The first parameter is a query that must return a single object. For my implementation, I first retrieve the single instance from the query and then walk through the properties to set the new values. Simple really.
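To make the "walk through the properties" step concrete, here is a minimal sketch of copying property values from one object onto another with reflection. It is not the code from my implementation; the ResourceCopier and DemoCategory names are made up for illustration, and it assumes both objects are of compatible types with public getters and setters.

```csharp
using System;
using System.Reflection;

// Hypothetical entity used only for this illustration.
public class DemoCategory
{
    public virtual int CategoryID { get; set; }
    public virtual string CategoryName { get; set; }
}

// Hypothetical helper, not part of ADO.NET Data Services or NHibernate.
public static class ResourceCopier
{
    // Copies every public instance property that has both a public getter
    // and a public setter from 'source' onto 'target'.
    public static void CopyProperties(object target, object source)
    {
        if (target == null) throw new ArgumentNullException("target");
        if (source == null) throw new ArgumentNullException("source");

        Type type = target.GetType();
        if (!type.IsInstanceOfType(source))
        {
            throw new ArgumentException("source must be compatible with the target type", "source");
        }

        foreach (PropertyInfo property in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip indexers and properties that cannot be read or written publicly.
            if (property.GetIndexParameters().Length > 0) continue;
            if (property.GetGetMethod() == null || property.GetSetMethod() == null) continue;

            property.SetValue(target, property.GetValue(source, null), null);
        }
    }
}
```

With this in place, calling ResourceCopier.CopyProperties(existing, replacement) leaves the existing instance (the one the session already knows about) holding the replacement's values. Note that this sketch copies the identifier property too, so the replacement is assumed to carry the same key.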
The next couple of methods have to do with relationships between objects. The SetReference method takes three parameters: the target resource, the property name and the property value. The property value is an object that needs to be assigned to a property of the target; the second parameter names that property. The implementation is pretty simple: I used the existing SetValue implementation to set the correct property with the correct value.
AddReferenceToCollection and RemoveReferenceFromCollection are related methods. One adds an object to a collection and the other removes it from the collection. Both of these methods take the same parameters that SetReference uses (target, property name, property value). The difference here is that instead of setting the reference, you need to add it to or remove it from a property (which is the collection). For my AddReferenceToCollection implementation, I use reflection to find an "Add" method and invoke it with the new reference. For RemoveReferenceFromCollection, I do the same but look for a "Remove" method to invoke. Using reflection in this way does not require that the collections follow any specific pattern, but it does require an Add and a Remove method. I chose this implementation because I was copying the implementation that ADO.NET Data Services uses for the Entity Framework. I could have expected IList since that is the standard way that NHibernate usually implements collections, but I decided to use this implementation first and refactor it later if I had to.
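As a rough sketch of the reflection approach just described (not the actual code from my implementation), the following looks up a one-argument "Add" or "Remove" method on whatever object the collection property holds. The CollectionReflection, DemoShelf and DemoBook names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical entities used only for this illustration.
public class DemoBook
{
    public virtual string Title { get; set; }
}

public class DemoShelf
{
    public DemoShelf() { Books = new List<DemoBook>(); }
    public virtual IList<DemoBook> Books { get; set; }
}

// Hypothetical helper, not part of ADO.NET Data Services or NHibernate.
public static class CollectionReflection
{
    public static void AddReferenceToCollection(object target, string propertyName, object value)
    {
        InvokeOnCollection(target, propertyName, "Add", value);
    }

    public static void RemoveReferenceFromCollection(object target, string propertyName, object value)
    {
        InvokeOnCollection(target, propertyName, "Remove", value);
    }

    private static void InvokeOnCollection(object target, string propertyName, string methodName, object value)
    {
        if (target == null) throw new ArgumentNullException("target");
        if (value == null) throw new ArgumentNullException("value");

        PropertyInfo property = target.GetType().GetProperty(propertyName);
        if (property == null)
        {
            throw new InvalidOperationException("No public property named " + propertyName);
        }

        object collection = property.GetValue(target, null);
        if (collection == null)
        {
            throw new InvalidOperationException(propertyName + " has not been initialized");
        }

        // Look for a public method with the right name that takes exactly
        // one argument the value can be passed as.
        foreach (MethodInfo method in collection.GetType().GetMethods())
        {
            ParameterInfo[] parameters = method.GetParameters();
            if (method.Name == methodName &&
                parameters.Length == 1 &&
                parameters[0].ParameterType.IsInstanceOfType(value))
            {
                method.Invoke(collection, new object[] { value });
                return;
            }
        }

        throw new InvalidOperationException("No suitable " + methodName + " method found on " + propertyName);
    }
}
```

The only thing the collection type has to offer is a public Add and Remove method; nothing here depends on IList itself. A collection that implements those methods explicitly (so they are not public on the concrete type) would not be found by this sketch.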
The last method to implement is SaveChanges. This method is pretty straightforward: all pending changes that have been made through the IUpdateable interface should be applied during this call. IUpdateable expects that SaveChanges will be atomic (i.e. pass/fail, no partial updates). To enforce that requirement, I used the NHibernate Session object's BeginTransaction method to start a transaction so that I could roll back any changes that failed. Because I had kept a cache of updates so I could clear them if necessary, I walked through them once the transaction had begun and applied them to the Session object. Finally I flushed the session to force the persistence to happen and ended the transaction. I did this all in a try...catch block so I could roll back the changes if any exceptions were thrown. SaveChanges does not have a return value, so if it fails you are expected to throw a DataServiceException. So in my catch block, after rolling back the transaction, I throw a DataServiceException and pass along the caught exception as the inner exception.
Overall the implementation was pretty simple. I hope that these articles will help anyone who is trying to implement this for their own data that they want to use in ADO.NET Data Services.
Next Monday night (July 28th, 2008), I'll be giving the short Q&A session at the Atlanta .NET Users Group. The topic? NHibernate's LINQ and ADO.NET Data Services support. If you're interested in using NHibernate but don't want to give up your LINQ skills, stop by for a listen!
Now that my ADO.NET Data Services support has been merged into the trunk of NHibernate.LINQ, I do have some caveats about using NHibernate.LINQ with ADO.NET Data Services. ADO.NET Data Services is a beta 1 product so there are some bugs and issues that you will either need to avoid or work around.
The biggest issue is around entity identity. ADO.NET Data Services must know how identity is established for objects in order to support the Data Service. It does this in a two step process:
- First it looks for attributes that describe the 'primary key'.
- Failing that, it looks for properties on the entity called ID, or ending with "ID".
The second approach is where I expect most NHibernate projects to fall, since you really don't want to pollute your objects with technology-specific information (the attributes). This approach works well except that there is a bug in the Beta 1 version of ADO.NET Data Services. If the properties are specified in a base class and the key property names end in "ID" (instead of just being called "ID"), then the search for the identifiers fails and Data Services refuses to serve these objects. For example:
public class AbstractCategory
{
public virtual int CategoryID { get; set; }
public virtual string CategoryName { get; set; }
public virtual string Description { get; set; }
public virtual byte[] Picture { get; set; }
public virtual IList Products { get; set; }
}
public class Category : AbstractCategory
{
}
If this is your scenario, I would suggest waiting for a later build of ADO.NET Data Services, as this is definitely a bug, not expected behavior, and I have gotten word from Microsoft that it is fixed in the RTM (which isn't available yet).
The next issue is that for collections, ADO.NET Data Services must understand the types that belong in a collection. Here our example above will not work either, because exposing the Products in a Category as a simple IList can't tell ADO.NET Data Services what types of objects to deal with. If we change this to an IList<Product>, it works fine. If we have to change our entities to work with ADO.NET Data Services, this is what our new Category might look like instead:
public class Category
{
public virtual int CategoryID { get; set; }
public virtual string CategoryName { get; set; }
public virtual string Description { get; set; }
public virtual byte[] Picture { get; set; }
public virtual IList<Product> Products { get; set; }
}
With these changes, ADO.NET Data Services works fine.
If you are new to ADO.NET Data Services, this blog entry may help with some debugging issues in using it:
http://wildermuth.com/2008/06/07/Debugging_ADO_NET_Data_Services_with_Fiddler2
Lastly, I want to follow up on a note that Ayende mentioned on his announcement of my examples. In his blog post, he said:
From a technological perspective, I think this is awesome. However, there are architectural issues with exposing your model in such a fashion. Specifically, with regards to availability and scalability on the operations side, and schema versioning and adaptability on the development side.
I think he's right in that there is a schema version issue here that needs to be addressed but that the availability and scalability problems are ones that would be in the underlying data model itself. Since ADO.NET Data Services are just a convenience around WCF's REST Service Model, we can scale out or up depending on our needs (as well as caching).
What I think is important is to understand the reason behind ADO.NET Data Services. It is not a model to replace typical Web Service or Message Bus architectures. It's not all that fast or efficient. Its purpose is to allow the creation of a simple model for communication across the firewall. What I mean is that it is meant for AJAX and RIA developers. It's a way of communicating data to clients that run on the Internet.
It's important to understand that data you expose with ADO.NET Data Services is not magically more secure...in fact, since it's meant for client-side consumption of data, you should not expose sensitive data through ADO.NET Data Services. Remember that consuming data in the client is not secure in itself. If you wouldn't feel safe consuming data in client-side JavaScript, don't expose it via ADO.NET Data Services.
In response to some requests that I have received, I decided to write a multi-part blog series on some of the techniques I used in developing Wildermuth.com. In this first example, I am going to discuss the use of LINQ and data in my site.
In moving from www.adoguy.com to www.wildermuth.com, one of my goals was to use LINQ as much as possible to see the travails of using it on a real project. I have done a lot of small samples with LINQ, but would it hold up for real work? Suffice to say I am pretty impressed (though whether a blog is 'real work' is up for discussion, it's a better exercise of the technology than my samples had been).
When I say LINQ, I want to be specific. LINQ to me really is "Language Integrated Query". The data store behind it is a secondary discussion; whether it was the Entity Framework, LLBLGen Pro, or NHibernate, I really wanted to make sure that the way I queried data in the C# code was LINQ. I was hoping to avoid dropping down into SQL as much as possible.
I knew I wanted to be able to use my model from most pages so I added the data context to the Master page so it would handle most of the build-up, tear-down for me:
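// Backing field for the lazily created context
private WilderEntities _ctx;

public WilderEntities Ctx
{
    get
    {
        // Create the context the first time a page or control asks for it
        if (_ctx == null)
        {
            _ctx = new WilderEntities();
        }
        return _ctx;
    }
}

protected override void OnUnload(EventArgs e)
{
    // Only dispose the context if it was actually created
    if (_ctx != null)
    {
        _ctx.Dispose();
    }
    base.OnUnload(e);
}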
This allowed me to create the Ctx on the master page but use it on any page/control where I needed it. The disposal of the context happens during unloading of the page. The only thing that this required is that most pages needed a typed reference to the master page, which you can do with a MasterType page directive:
<%@ Page Language="C#" MasterPageFile="~/StwMaster.Master"
AutoEventWireup="true" CodeBehind="default.aspx.cs"
Inherits="stw._default" Title="Shawn Wildermuth's Blog" %>
<%@ MasterType VirtualPath="~/StwMaster.Master" %>
Much of the LINQ code is fairly pedestrian:
var qry = from b in MasterPage.Ctx.BlogEntry.Include("BlogEntryComments")
where b.Published == true
orderby b.DatePosted descending
select b;
List<BlogEntry> rants = qry.Take(10).ToList();
You can see that I am doing a simple LINQ query to get all the published BlogEntry objects and order them by the date they were posted. Of note, when I execute the query I am adding the Take() method to limit the results. This is akin to TOP in SQL and we'll be revisiting it in a minute.
Because I need to have access to comments about the blog entries, I use the Include method in the LINQ query to retrieve not only the BlogEntry objects, but also the related BlogEntryComment objects. This syntax is specific to the Entity Framework. If you are using a different LINQ provider, you may find that lazy loading is automatic (e.g. NHibernate) or not available. I know this is a chief complaint about the Entity Framework, but I like it as it makes the developer have to think about loading sub-types and the side effects...but that's a whole other discussion.
Once we have a result, it's as simple as assigning our collection and forcing data binding to happen:
theRants.Blogs = rants;
titles.Blogs = rants;
DataBind();
Of particular interest here is that I am using the same list to bind to two collections (the list of blogs and the "On This Page" titles). Because we are getting a simple collection of CLR objects back, there isn't any magic here in how we're doing the binding. This is in stark contrast to older data access (e.g. DataRows, DataReaders).
One thing I really like here is that I can do the paging directly during the execution of the LINQ query as seen in the LINQ query for the paged Rants page:
var qry = from b in Master.Ctx.BlogEntry.Include("BlogEntryComments")
where b.Published == true
orderby b.DatePosted descending
select b;
List<BlogEntry> rants = qry.Skip(PAGESIZE * (currentPage - 1)).Take(PAGESIZE).ToList();
Note that this LINQ query is identical to the earlier home page query, but I am doing the paging by calculating both the Skip() and Take() values. As I mentioned earlier, Take() is like TOP in that it specifies the number of returned elements, whereas Skip() specifies how many results to ignore before starting the Take() amount. This allows us to manage paging directly in the LINQ code.
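The Skip()/Take() arithmetic is easy to check in isolation. Here is a small sketch against an in-memory sequence (LINQ to Objects rather than the Entity Framework); the Paging class and GetPage method are names I made up for the illustration, and page numbers are assumed to start at 1 as they do in the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class Paging
{
    // Returns the items on the given 1-based page of an already ordered sequence.
    public static List<T> GetPage<T>(IEnumerable<T> orderedSource, int currentPage, int pageSize)
    {
        if (orderedSource == null) throw new ArgumentNullException("orderedSource");
        if (currentPage < 1) throw new ArgumentOutOfRangeException("currentPage");
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");

        return orderedSource
            .Skip(pageSize * (currentPage - 1)) // ignore everything on earlier pages
            .Take(pageSize)                     // then keep at most one page of items
            .ToList();
    }
}
```

For example, with 25 items and a page size of 10, page 3 skips the first 20 items and returns the remaining 5. The source needs a stable ordering for paging to be meaningful, which is why the blog query has an orderby clause.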
There was one place where I just couldn't get LINQ to bend to my wishes. I am still not convinced that there is *not* a way to do this, but I dove down into Entity SQL to make the request instead. The case was where I am translating the URI pieces of the Rant URI (e.g. http://wildermuth.com/2008/07/07/Wildermuth_com_By_Example_-_Part_1) where I am being handed the year, month, day and title by the routing framework (I'll talk about that in a future post). I needed a way of finding the right Rant based on that information (since I don't want to share the Rant ID with anyone). To do this, I used an ObjectQuery (Entity Framework's query syntax using Entity SQL):
string whereClause = @"SqlServer.DATEPART('yyyy', it.DatePosted) = @year AND
SqlServer.DATEPART('mm', it.DatePosted) = @month AND
SqlServer.DATEPART('dd', it.DatePosted) = @day AND
it.Title LIKE @pattern";
ObjectQuery<BlogEntry> qry = Master.Ctx.BlogEntry
.Include("BlogEntryComments")
.Where(whereClause);
qry.Parameters.Add(new ObjectParameter("year", rantDate.Year));
qry.Parameters.Add(new ObjectParameter("month", rantDate.Month));
qry.Parameters.Add(new ObjectParameter("day", rantDate.Day));
qry.Parameters.Add(new ObjectParameter("pattern", pattern));
_blogEntry = qry.FirstOrDefault();
Note I am using the DATEPART syntax to compare the parts of the date. The real issue with the query was the pattern parameter: I take the title from the URI and replace all underscores with % to do a LIKE query. This worked fine, and once the query executes I am dealing with a simple CLR object, so it doesn't matter in the big picture.
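The underscore replacement itself is tiny. A sketch of that step, with hypothetical names (RantUrl and ToLikePattern are not from the site's code), might look like this:

```csharp
using System;

// Hypothetical helper for illustration only.
public static class RantUrl
{
    // Turns the title segment of a Rant URI into a LIKE pattern by
    // replacing each underscore with the % wildcard.
    public static string ToLikePattern(string urlTitle)
    {
        if (urlTitle == null) throw new ArgumentNullException("urlTitle");
        return urlTitle.Replace('_', '%');
    }
}
```

For the example URI above, "Wildermuth_com_By_Example_-_Part_1" becomes "Wildermuth%com%By%Example%-%Part%1". Because % matches any run of characters, two titles posted on the same day could in principle match the same pattern; FirstOrDefault() simply takes the first one that comes back.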
Other than that, the LINQ syntax is pretty straightforward in my example. I ported old, nasty code from my original sites (in fact some routines were originally ported from ASP to ASP.NET back in pre v1.0). Because of that I don't have a good separation of responsibilities, but that's for the next conversion...when MVC comes of age.
One last note: I try to create pretty simple HTML. I don't use many actual controls except the repeater, and I create mostly CSS-based XHTML to keep things simple. With that in mind I eschew ASP.NET's DataSource stuff. Whether it's the LINQ, Object or Sql DataSource, they are all trying to do a lot of magic IMHO, and I end up writing code instead of depending on them.
Opinions and observations are welcome!
Jim Wooley has a live LINQ-based blog site that he's been working on through every version of LINQ, for what seems like forever. If you have a chance, I'd suggest stopping by to see what he's been Thinqing of.
Most of my exposure to LINQ has been in very short snippets and sessions at various conferences and blogs. My initial reaction was fairly negative. This negative reaction was based on several key factors:
- The evangelizing of LINQ as an ORM (making LINQ for SQL as the main focus of LINQ).
- Pointing out that LINQ is based on SQL so that developers should be comfortable with the syntax.
Let me discuss these points individually. I think that LINQ for SQL is not a compelling ORM (and is mostly useful only for RAD or prototyping of applications, much like Typed DataSets are now). The more problematic part of this factor is that it only confuses developers, as LINQ is not about database development but about integrating query into code. Integrating a query mechanism into the language is a great feature that should be explained, but using LINQ for SQL as a demonstration tends to invite comparison with NHibernate, LLBLGen Pro, Typed DataSets and such. The reality is that LINQ for SQL is an interesting implementation of LINQ but clouds the issue.
As for equating LINQ as having a SQL-like syntax, there are three specific problems with this in my mind:
- Many developers only know SQL basics therefore basing a language on it does not necessarily add great benefits to skill-reuse.
- LINQ is only vaguely SQL-like in my opinion. The problem is that LINQ is attempting to create a language that is useful for a different set of tasks than SQL was designed for. SQL is a set-based query language, whereas LINQ needs to be a query language for multiple types of data: sets, hierarchies, trees, inverted trees, etc. (The good news is that LINQ is actually fairly adept at handling these other data structures.)
- Lastly, I would have liked to see the syntax be something that was more intrinsically self-describing. SQL is a classic case of a functional language that is feature rich but very difficult to decipher (and even harder to determine the results as it often depends on execution paths). Bringing a purely SQL-like syntax into the language will not help make the code clearer to read.
This is still my opinion. I believe that all of these issues are fairly problematic with LINQ. So what has changed?
As I've been digging in, I've realized that there really isn't "LINQ for Objects", "LINQ for DataSets", "LINQ for SQL", etc. There is just Language INtegrated Query. The moniker "LINQ for Objects" is, I think, a misnomer. LINQ is always about managed objects. Anyone who enables LINQ in their own collections is simply saying that their data can tie into a centralized query facility. This became abundantly clear when I was chatting with Jim Wooley (co-author of a forthcoming book on LINQ) about LINQ for SQL.
LINQ for SQL depends on managed classes that use attributes to map them to the database (though you can do something with mapping files to execute arbitrary SQL). This means that the metadata about a query must exist in the CLR (or in memory if you prefer). If the metadata is in your memory space, then LINQ is really just querying objects. Underneath the covers LINQ for SQL may be creating new objects for you from the database, but that's an implementation detail as far as I am concerned. LINQ for SQL is not special, just an implementation (much like LINQ for Amazon and LINQ for Flickr are just inventive implementations).
What does this all really mean? It means that LINQ is a generalized query mechanism for managed code. That's great news as we needed one. The cost may be a bit high, as I think introducing "Extension Methods" and "Variable Inference" may cause more trouble than they are worth, but it's here and I had better get used to it. At the end of the day LINQ is just syntactical sugar to map to IEnumerable<> so I can live with that. This reminds me of a great conversation I had with a bunch of guys at the MVP summit about this. Several of us were very negative about LINQ and others were very passionate about how great it was. I remember hearing that there are other syntactical sugars in languages already. The most obvious one for me is the "foreach" statement. We do not need foreach. We can just as easily write code that uses the enumerator to walk a collection, but foreach certainly makes the code both more readable and easier to write. I hope I feel that way about LINQ in the coming years, but I am certainly coming around to it.
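To make the sugar comparison concrete, here is a small sketch (the SugarDemo class and its method names are mine, purely for illustration). The first pair of methods shows foreach next to the hand-written enumerator loop it roughly stands for; the second pair shows a query expression next to the equivalent extension method calls over IEnumerable<int>.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class for illustration only.
public static class SugarDemo
{
    // The sugared form: foreach hides the enumerator.
    public static int SumWithForeach(IEnumerable<int> numbers)
    {
        int total = 0;
        foreach (int n in numbers)
        {
            total += n;
        }
        return total;
    }

    // Roughly what foreach does for us: get an enumerator, walk it, dispose it.
    public static int SumWithEnumerator(IEnumerable<int> numbers)
    {
        int total = 0;
        using (IEnumerator<int> enumerator = numbers.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                total += enumerator.Current;
            }
        }
        return total;
    }

    // The sugared form: a query expression.
    public static List<int> EvensQuerySyntax(IEnumerable<int> numbers)
    {
        return (from n in numbers
                where n % 2 == 0
                orderby n
                select n).ToList();
    }

    // The same query written as plain extension method calls.
    public static List<int> EvensMethodSyntax(IEnumerable<int> numbers)
    {
        return numbers.Where(n => n % 2 == 0)
                      .OrderBy(n => n)
                      .ToList();
    }
}
```

Both members of each pair return the same result for the same input; the only difference is how much of the plumbing you have to read.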
How are you feeling about LINQ?