Posted on 2/25/2010 11:46:53 PM by Justin Etheredge
I want to show you an algorithm, and it is a pretty simple one. It is an implementation of the Damerau–Levenshtein edit distance algorithm, based on the pseudocode on Wikipedia:
// Requires: using System; using System.Linq;
public static int EditDistance(string string1, string string2)
{
    var s1Length = string1.Length;
    var s2Length = string2.Length;
    var matrix = new int[s1Length + 1, s2Length + 1];

    // Turning a prefix into the empty string costs one deletion per character.
    for (int i = 0; i <= s1Length; i++)
        matrix[i, 0] = i;
    // Building a prefix from the empty string costs one insertion per character.
    for (int j = 0; j <= s2Length; j++)
        matrix[0, j] = j;

    for (int i = 1; i <= s1Length; i++)
    {
        for (int j = 1; j <= s2Length; j++)
        {
            int cost = (string2[j - 1] == string1[i - 1]) ? 0 : 1;

            // Cheapest of deletion, insertion, and substitution (or match).
            matrix[i, j] = (new[] { matrix[i - 1, j] + 1,
                                    matrix[i, j - 1] + 1,
                                    matrix[i - 1, j - 1] + cost }).Min();

            // Transposition of two adjacent characters.
            if ((i > 1) && (j > 1) &&
                (string1[i - 1] == string2[j - 2]) &&
                (string1[i - 2] == string2[j - 1]))
            {
                matrix[i, j] = Math.Min(
                    matrix[i, j],
                    matrix[i - 2, j - 2] + cost);
            }
        }
    }

    return matrix[s1Length, s2Length];
}
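A few sample calls may make the transposition step easier to see. This is only a sketch: the class name EditDistanceDemo is made up for illustration, and the method is repeated in condensed form (nested Math.Min calls instead of an array) purely so that the snippet compiles on its own.

```csharp
using System;

public static class EditDistanceDemo
{
    // Condensed copy of the listing above; same recurrence, same results.
    public static int EditDistance(string string1, string string2)
    {
        var matrix = new int[string1.Length + 1, string2.Length + 1];
        for (int i = 0; i <= string1.Length; i++) matrix[i, 0] = i;
        for (int j = 0; j <= string2.Length; j++) matrix[0, j] = j;

        for (int i = 1; i <= string1.Length; i++)
        {
            for (int j = 1; j <= string2.Length; j++)
            {
                int cost = (string2[j - 1] == string1[i - 1]) ? 0 : 1;
                matrix[i, j] = Math.Min(
                    Math.Min(matrix[i - 1, j] + 1, matrix[i, j - 1] + 1),
                    matrix[i - 1, j - 1] + cost);

                // Two adjacent characters swapped: count it as a single edit.
                if (i > 1 && j > 1 &&
                    string1[i - 1] == string2[j - 2] &&
                    string1[i - 2] == string2[j - 1])
                {
                    matrix[i, j] = Math.Min(matrix[i, j], matrix[i - 2, j - 2] + cost);
                }
            }
        }
        return matrix[string1.Length, string2.Length];
    }

    public static void Main()
    {
        Console.WriteLine(EditDistance("kitten", "sitting")); // 3: two substitutions, one insertion
        Console.WriteLine(EditDistance("ca", "ac"));          // 1: one transposition
        Console.WriteLine(EditDistance("", "abc"));           // 3: three insertions
    }
}
```

Without the transposition check, "ca" and "ac" would come out at a distance of 2 (two substitutions), which is the difference between this algorithm and plain Levenshtein distance.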
Posted on 2/22/2010 9:08:01 PM by Justin Etheredge
I recently posted on my blog about some of the new concurrent collections in .NET 4.0, and I noticed that Google was sending a lot of people to those posts when they were only searching for System.Collections. I figured that people could use a similar overview of the collections available to them in the System.Collections.Generic namespace, since it seems to me that no one uses anything other than List<T> and Dictionary<TKey, TValue>. So, in this post, I am going to take a look at a few of those collections and explain exactly why you would want to use them.
Keep in mind as you read through this list that you shouldn't just start switching out collection types in code that you already have working. If something is working and performing properly for you, it is almost always better to take the easier route until you have proof that the easy approach does not work for you.
System.Collections.Generic.List<T>
The first collection that we are going to look at is List<T>. Like I said before, this collection seems to be the fallback, and with good reason. It is unsorted (but supports sorting), and items can be added, removed, or inserted at a specific index. It also has indexed access, meaning that items can be accessed directly using a numeric index. You can add ranges of items, perform binary and sequential searches, sort in place, get the count of items, check whether an item is in the list, pass a delegate to it to perform an action on each item, and so on. It really is the Swiss Army knife of collections.
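Here is a small sketch of those operations in one place. The class name ListDemo and the sample values are made up for illustration; the List<T> members themselves are the standard ones.

```csharp
using System;
using System.Collections.Generic;

public static class ListDemo
{
    public static List<int> Build()
    {
        var numbers = new List<int> { 42, 7 };
        numbers.Add(19);                     // append: 42, 7, 19
        numbers.AddRange(new[] { 3, 25 });   // add a range: 42, 7, 19, 3, 25
        numbers.Insert(1, 11);               // insert at index 1: 42, 11, 7, 19, 3, 25
        numbers.Remove(7);                   // remove by value: 42, 11, 19, 3, 25
        return numbers;
    }

    public static void Main()
    {
        var numbers = Build();

        Console.WriteLine(numbers[0]);               // indexed access: 42
        Console.WriteLine(numbers.Contains(19));     // True
        Console.WriteLine(numbers.IndexOf(19));      // sequential search: 2

        numbers.Sort();                              // in place: 3, 11, 19, 25, 42
        Console.WriteLine(numbers.BinarySearch(25)); // needs a sorted list: 3

        numbers.ForEach(n => Console.WriteLine(n));  // run a delegate on each item
        Console.WriteLine(numbers.Count);            // 5
    }
}
```

Note that BinarySearch only gives meaningful results on a list that is already sorted, which is why it is called after Sort here.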
Posted on 2/11/2010 12:32:10 AM by Justin Etheredge
After my last post about my JavaScript bundler utility, a few people commented that I needed to integrate it better into an ASP.NET or ASP.NET MVC application. I had approached the problem from the standpoint of a build: I wanted an executable that could be pointed at a series of files during a build, or some other automated process, and perform all of the work involved in minifying, combining, and compressing my JavaScript and CSS. Once I started thinking about it, however, I realized that I could probably build something to do this with a small amount of effort.
The approaches that were put forth were excellent. One of the comments was from a fellow blogger, Milan Negovan, who recently made a similar utility called Shinkansen, an integrated ASP.NET control for compressing JavaScript and CSS. It is very impressive, and you should go check it out! It appears to use a custom handler and an ASP.NET component to combine and minify (or crunch) your JavaScript files, then cache the result and spit out a reference to the handler. It seems to be a very efficient and clever solution!
Another comment was from Jeff Olson, who said that he wanted better integration into an application via an executable that could scan a project and do replacements. He was advocating an approach similar to the one that I had already taken, but instead of specifying files manually, the tool would scan a project and compress and combine the needed files. This is an interesting approach if you want a completely platform-agnostic solution, but I decided to implement it in a slightly different manner.
The first requirement that I thought of was that it had to work in both ASP.NET and ASP.NET MVC. I also didn't want any real setup or configuration, and I wanted it to output a physical file that I could simply pass a reference to. This way I could avoid having to do any manual caching and such; I just thought it would be easier to deal with. My only concern here revolves around the security of having a file written to disk from inside the website process. Some people could have a problem with this, and there could be issues around file locking, but nothing that couldn't be coded around.
Posted on 2/9/2010 1:09:54 AM by Justin Etheredge
It pretty much goes without saying that if you are building a public-facing website these days, you are probably using a ridiculous amount of JavaScript. It is also likely that most of that JavaScript is in the form of libraries that you didn't write and don't maintain. But even if you aren't maintaining those libraries, you are still responsible for pushing them all down to your users. And so you can end up with either hundreds of kilobytes of JavaScript or a ton of tiny script files. Both of these can really increase the time it takes for your site to load initially for your users.
Fortunately for us, there are several solutions to the problem of slow-loading JavaScript. One is to try to load most of your libraries from content delivery networks (CDNs) provided by companies like Google and Microsoft. A second is to employ a CDN of your own, like Amazon's CloudFront. But no matter what you are doing to speed up the delivery of your JavaScript, it is absolutely imperative that you do three things:
- Combine your JavaScript files: Concatenate all of your JavaScript files into a single file so that the browser only has to make one request to download your scripts.
- Minify your JavaScript files: Perform some optimizations on your JavaScript to remove whitespace, shorten variable names, and in some instances even perform some static analysis to optimize statements or remove unused code.
- Compress your JavaScript: Enable gzip compression so that users whose browsers support compression will receive a smaller file.
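To make the combine and compress steps a little more concrete, here is a rough sketch. The ScriptBundler class and its method names are made up for illustration and are not the utility discussed in this post. Minification is left out entirely because it needs a real JavaScript parser, and in practice gzip is usually switched on in the web server rather than done by hand like this.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ScriptBundler
{
    // Combine: join the sources into one script. The ";" between them guards
    // against a file that does not end its last statement with a semicolon.
    public static string Combine(IEnumerable<string> sources)
    {
        return string.Join(";\n", sources);
    }

    // Compress: gzip the UTF-8 bytes of the combined script.
    public static byte[] Gzip(string text)
    {
        var raw = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            // The GZipStream must be disposed before reading the result,
            // otherwise the final compressed block is never flushed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    // Decompress again, which is what the browser does on its end.
    public static string Gunzip(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }

    public static void Main()
    {
        var combined = Combine(new[] { "var a = 1", "function f() { return a; }" });
        var compressed = Gzip(combined);
        Console.WriteLine(Gunzip(compressed) == combined); // True
    }
}
```

In a real build you would read the sources with something like File.ReadAllText and write the combined result out with File.WriteAllText. Very small inputs can actually grow when gzipped, so the savings only show up on realistically sized scripts.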
Now this may sound like a lot of work, but thankfully people like my friend Dave Ward have already solved the problem of combining and minifying our JavaScript files in a pretty easy way. However, I was looking at one of my favorite JavaScript libraries, SyntaxHighlighter, and thinking that you have to import an absolutely huge number of files in order to use it. SyntaxHighlighter has hosted versions of its files, so wouldn't it be cool if I could just pull those hosted versions along with my other JavaScript, combine and minify everything, and then push all of that up to my CDN?
Posted on 2/8/2010 11:12:56 AM by Justin Etheredge
In .NET 4.0 there is a new namespace on the block, and it is called System.Collections.Concurrent. In this namespace you will find a pretty decent number of goodies that will help you more easily write applications that can leverage multiple threads without having to resort to manual locking. We looked previously at ConcurrentBag, ConcurrentStack, and ConcurrentQueue. Well, I have two more goodies to show off, and those are the BlockingCollection class and the IProducerConsumerCollection interface.
In order to get this party started, let me explain exactly what IProducerConsumerCollection is and why we would want to use it. The IProducerConsumerCollection interface is pretty simple and really exposes only two methods that we care about: TryAdd and TryTake. You see, IProducerConsumerCollection is designed for multi-threaded producer/consumer scenarios, which means situations where we have multiple threads producing pieces of data (producers) and multiple threads trying to consume those pieces of data (consumers). So, IProducerConsumerCollection just signifies that the implementing type supports thread-safe adding and removing of data. (In .NET 4.0, the types that support this interface are the ones that we already looked at: ConcurrentStack, ConcurrentQueue, and ConcurrentBag.)
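As a quick sketch of what that looks like in code, here are two producers and two consumers sharing a ConcurrentQueue through the interface. The class name ProducerConsumerDemo and the numbers are made up for illustration. To keep the result predictable, the consumers only start once the producers are done, which is simpler than a real pipeline where both sides run at the same time.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ProducerConsumerDemo
{
    public static int Run()
    {
        // Code against the interface; any implementing collection would do here.
        IProducerConsumerCollection<int> items = new ConcurrentQueue<int>();

        // Two producers each add the numbers 1..100, with no locks needed.
        Action produce = () =>
        {
            for (int i = 1; i <= 100; i++)
                items.TryAdd(i);
        };
        Task.WaitAll(Task.Factory.StartNew(produce), Task.Factory.StartNew(produce));

        // Two consumers drain the collection. TryTake returns false
        // once there is nothing left to take.
        int total = 0;
        Action consume = () =>
        {
            int item;
            while (items.TryTake(out item))
                Interlocked.Add(ref total, item);
        };
        Task.WaitAll(Task.Factory.StartNew(consume), Task.Factory.StartNew(consume));

        return total;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // 10100, i.e. 2 * (1 + 2 + ... + 100)
    }
}
```

Swapping ConcurrentQueue for ConcurrentStack or ConcurrentBag changes the order in which items come out, but not the total.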