Blog Splash

A Flat Month List in BlogEngine.NET

by Kerido Wednesday, December 1, 2010 6:00 AM

Yesterday I received an automated notification e-mail informing that my additions to the BlogEngine.NET source code have been merged with the main repository. This is really a new kind of experience to me. Although I have contributed to open-source development, it's the first time I'm so inspired by the project and interested in its evolution. And, probably, it is the first time I know my change has been accepted.

Speaking about the contribution itself, it's a small enhancement to the Month List widget. Since I started to use it, I wanted to display a flat list of months, sorted from recent to oldest:

  • January 2010
  • December 2009
  • November 2009, etc.

I added two properties – GroupByYear and RecentDatesAtTop – to implement that. Without changing these, the widget looks the default way to retain backwards compatibility.

Permanent Redirect Using ASP.NET

by Kerido Monday, February 22, 2010 7:49 AM

As Google recomments, when changing a website's domain name, you should add the HTTP 301 "Moved Permanently" status to the old address. To my shame, I always thought this is what HttpResponse.Redirect is about. However, using the latter produces the HTTP 302 status which is semantically somewhat different.

A trivial task that ASP.NET has no pretty code for. Here is a convenience class that does the job:

public static class HttpResponseUtility
{
  public static void PermanentRedirect(HttpResponse theResp, string theUrl)
  {
    PermanentRedirect(theResp, theUrl, true);
  }

  public static void PermanentRedirect(HttpResponse theResp,
    string theUrl, bool theEndRequest)
  {
    theResp.Redirect(theUrl, false);
    theResp.StatusCode = 301;  // as opposed to 302 set by Redirect

    if(theEndRequest)
      theResp.End();
  }
}

With this class in place, you simply call the following from a page:

HttpResponseUtility.PermanentRedirect(Response, newUrl);

Also, if you use .NET version 3.5, an extension method would do an even better job (notice the this keyword):

public static class HttpResponseUtility
{
  public static void PermanentRedirect(this HttpResponse theResp, string theUrl)
  {
    PermanentRedirect(theResp, theUrl, true);
  }

  public static void PermanentRedirect(this HttpResponse theResp,
    string theUrl, bool theEndRequest)
  {
    theResp.Redirect(theUrl, false);
    theResp.StatusCode = 301;  // as opposed to 302 set by Redirect

    if(theEndRequest)
      theResp.End();
  }
}

Your client code would then look even prettier:

Response.PermanentRedirect(newUrl);

An Ultra Fast CSS Minify Algorithm

by Kerido Saturday, January 30, 2010 6:10 AM

Introduction

You've probably thought a lot about ways you can optimize your Web site. Since the release of this new site, I've been thinking about it a lot. In this post I would like to introduce an advanced version of the CSS minify algorithm. There are several existing implementations available, but I took the one integrated into BlogEngine.NET, as a reference. Compared to it, my algorithm offers several benefits:

Concept

As opposed to the built-in CSS minify algorithm which uses regular expressions, the current algorithm is more like a state machine. This is why you might think it's harder to read and debug. This is why I am writing this post.

First, here are valid CSS states that the algorithm uses:

enum CssState
  { Punctuation, Token, StringD, StringS }

The CSS code is always considered Punctuation if a different state does not apply. Curly braces, square brackets, parentheses – this is all Punctuation. The Token state includes all tag names, element IDs and classes, properties and values – anything that consists of alpha-numeric characters plus a limited number of auxiliary characters – .#_-@*%(). And finally, there are two string states. StringD represents a double-quoted string, StringS – single-quoted. I'll cover more about the reason why the latter two are required, in the Handling Strings section.

The algorithm reads the input character by character and determines if a sequence of whitespace characters can be removed from it. It also handles comments pretty much the same way. Here is the code illustrating the idea:

public static string Minify(string theCss)
{
  // Assume that the length of the output string
  // will be at most 75% the length of the incoming one to avoid
  // additional StringBuilder reallocations.
  StringBuilder aRet = new StringBuilder(theCss.Length * 3 / 4);

  int aNumChars = theCss.Length;

  CssState aPrevState = CssState.Punctuation;
  int aPrevPos = 0;

  int i = 0;

  while(i < aNumChars)
  {
    CssState aCurState = GetState(theCss, ref i, aPrevState);

    if(i > aPrevPos + 1)
    {
      // If whitespace is found between two tokens, keep it compact
      if (aPrevState != CssState.Punctuation && aCurState != CssState.Punctuation)
        aRet.Append(' ');

      // Otherwise, no whitespace is needed, skip everything between aPrevPos and i
    }

    aPrevPos = i;
    aPrevState = aCurState;
    aRet.Append(theCss[i++]);
  }

  return aRet.ToString();
}

As you can see, whitespace must not be removed is when it delimits token or string characters. Examples of this case are:

border-bottom: solid 2px #9f4c1f;
background-position: center top;
quotes: '« ' ' »';
quotes: "»" "«" "\2039" "\203A";

In all other cases, the whitespace can be trimmed. Again, there is a special case when whitespace is found inside a string, but we'll cover it in the next section.

The key of the algorithm is the GetState method. It skips any comments and whitespace characters found in the incoming string, updates the position variable i to a value represented by a meaningful state:

static CssState GetState(string theCss, ref int thePos, CssState theCurState)
{
  int aLen = theCss.Length;
  int i = thePos;

  if (theCurState == CssState.StringD)
  {
    //REMOVED FOR COMPACTNESS
  }
  else if (theCurState == CssState.StringS)
  {
    //REMOVED FOR COMPACTNESS
  }


  bool aSkip = true;

  while(aSkip)
  {
    /////////////////////////////////////////
    // Skip whitespace
    while(aSkip = (i < aLen - 1 && IsWhitespaceChar(theCss[i]) ) )
      i++;

    /////////////////////////////////////////
    // Skip comments
    if(i < aLen - 1)
      if (theCss[i] == '/' && theCss[i+1] == '*') // comment opening
      {
        aSkip = true;

        while(i < aLen - 1)
        {
          i++;

          if(theCss[i-1] == '*' && theCss[i] == '/') // comment closing
          {
            i++;
            break;
          }
        }
      }
  }

  thePos = i;
  if ( IsTokenChar( theCss[i] ) )
    return CssState.Token;

  else if ( theCss[i] == '\"' )
    return CssState.StringD;

  else if(theCss[i] == '\'')
    return CssState.StringS;

  else
    return CssState.Punctuation;
}

For for the sake of saving space I have removed several lines, so let's move on to see what they are meant for.

Handling Strings

According to the CSS specification, two types of strings are supported: single-quoted and double-quoted. My algorithm leaves the string intact, even if whitespace characters are found inside it. Of course, for the HTML language it may not matter at all since all whitespace characters will be merged by the browser. But I just think that keeping the string unmodified is more intuitive and professional. In order achieve that, support needs to be added for escaped quote characters. An escaped single quote character inside a single-quoted string does not close the string. Similarly, an escaped double quote character does not close a double quoted string:

if (theCurState == CssState.StringD)
{
  if(theCss[i] == '\"')
  {
    // Make sure the double quote character is not escaped
    if(thePos > 0)
      if(theCss[i-1] == '\\')
        return CssState.StringD;

    // Enforce a whitespace afterwards
    return CssState.Token;
  }
  else
    return CssState.StringD;
}
else if (theCurState == CssState.StringS)
{
  if(theCss[i] == '\'')
  {
    // Make sure the single quote character is not escaped
    if(thePos > 0)
      if(theCss[i-1] == '\\')
        return CssState.StringS;

    // Enforce a whitespace afterwards
    return CssState.Token;
  }
  else
    return CssState.StringS;
}

There is a special example attribute in the reference CSS file that illustrates the differences in the way the two algorithms work:

angledouble: "Angle=             00deg00'00\"      ";
anglesingle: 'Angle=             00deg00\'00"      ';

Although these attributes make no sense to a browser, they represent valid CSS syntax. My version produces the same string while the BlogEngine.NET version trims several spaces from it. I consider this a flaw even though string length gets reduced.

Performance Testing

The suggested algorithm provides both speed and size wins. The code below lists two methods I used to measure output length and processing time of the two algorithms:

[TestMethod]
public void CssMinifyTest_Length()
{
  string aCss = Properties.Resources.style;

  string aMin1 = CssMinifier.Minify(aCss);
  string aMin2 = CssHandler.StripWhitespace(aCss);

  Console.WriteLine(
    string.Format("KO Software output ({0} bytes):", aMin1.Length) );
  Console.WriteLine(aMin1);

  Console.WriteLine("---------------------------------");

  Console.WriteLine(
    string.Format("BlogEngine.NET output ({0} bytes):", aMin2.Length) );
  Console.WriteLine(aMin2);

  Assert.IsTrue(aMin1.Length <= aMin2.Length);
}

[TestMethod]
public void CssMinifyTest_Speed()
{
  int aNumCycles = 50000;
  string aCss = Properties.Resources.style;

  DateTime aStart = DateTime.Now;
  for(int i = 0; i < aNumCycles; i++)
  {
    string aMin = CssMinifier.Minify(aCss);
  }
  DateTime aEnd = DateTime.Now;

  TimeSpan aTime_AspxGear = aEnd - aStart;
  Console.WriteLine(aTime_AspxGear.ToString());


  aStart = DateTime.Now;
  for(int i = 0; i < aNumCycles; i++)
  {
    string aMin = CssHandler.StripWhitespace(aCss);
  }
  aEnd = DateTime.Now;


  TimeSpan aTime_BE = aEnd - aStart;
  Console.WriteLine(aTime_BE.ToString());


  Assert.IsTrue(aTime_AspxGear <= aTime_BE);
}

The results are illustrated in the table below:

KO Software CSS Minifier BlogEngine.NET CSS Minifier
Output size, bytes 6,888 7,274
50,000 iteration processing time (Debug build) 33 seconds 1 minute and 25 seconds
50,000 iteration processing time (Release build) 15 seconds 1 minute and 20 seconds

Conclusion

The suggested algorithm produces output which is smaller by almost 400 bytes. And it runs more than 5 times faster! Interestingly, the Release build is more than two times faster than the Debug one while the BlogEngine.NET version yields only subtle performance boost. I suspect, this is because the latter uses regular expressions which are already built with most possible optimizations, regardless of our build type.

I am going to contact the BlogEngine.NET team and e-mail my code as a patch to their source code repository. I think, if this code is properly tested in various environments, it can surely improve web site stability and reduce traffic. As an old school programmer, I am still interested in ways of improving the code. So please, download the source code and leave comments with bugs and improvements.