
Securing the ASP.NET MVC Web.config


Security is a big subject in the web world, largely because it's super easy to leave your site insecure and open to attack. The default ASP.NET MVC project template is pretty weak when it comes to security; it trades security for simplicity. The ASP.NET Core Boilerplate project template shifts that balance more in favour of security, while still trying to be as simple as possible. Several insecure settings in the Web.config file have been changed and made secure by default.

This series of blog posts goes through the additions made to the default ASP.NET MVC template to build the ASP.NET Core Boilerplate project template. You can create a new project using this template by installing the Visual Studio template extension or visit the GitHub site to view the source code.

Securing Web.config

Turn On Custom Errors

In the early stages of development, you want to see the full stack trace of your exceptions when an error occurs on a page. When it comes to releasing your site, you need to hide this sensitive information. Unbelievably, the default ASP.NET MVC template leaves this sensitive information wide open. To hide this, you need to add the customErrors section to your web.config file and turn it on.
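As a rough sketch, the section looks something like this (the defaultRedirect and error paths below are placeholders; point them at your own error pages):

<system.web>
  <!-- customErrors - Show friendly error pages instead of exception details. -->
  <customErrors mode="On" defaultRedirect="~/error">
    <error statusCode="404" redirect="~/error/notfound" />
  </customErrors>
</system.web>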

The problem is that we still want this setting to be turned off when debugging. This is where configuration file transforms come in. This setting is off when the solution configuration is Debug and on when it is Release. The debug attribute in the compilation section is set in the same way.

Securing Cookies

By default, client-side script can access the cookies created by the default ASP.NET MVC template, so any script injected into your pages can read them. Cookies can also be sent unencrypted over the wire, because they don't require SSL. The httpCookies section can be added to secure your cookies (This can also be done in code but the point is that we are making it secure by default. You could easily forget to turn this on in code).

<!-- httpOnlyCookies - Ensure that client-side script cannot access the cookie. -->
<!-- requireSSL - Ensure that the cookie can only be transported over SSL. -->
<httpCookies httpOnlyCookies="true" requireSSL="false" />

By default we set requireSSL to false because we don't know if you are going to use SSL in your site or not. If you are using SSL, you need to turn this on.
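For comparison, here is a sketch of securing a single cookie in code. The cookie name and value are illustrative; the point is that you would have to remember these flags on every cookie you create, which is why the config approach is safer:

// Create a cookie that client-side script cannot read and that is only sent over SSL.
var cookie = new System.Web.HttpCookie("MyCookie", "value")
{
    HttpOnly = true, // Stop client-side script from accessing the cookie.
    Secure = true    // Only transport the cookie over SSL.
};
this.Response.Cookies.Add(cookie);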

Shut ASP.NET's Mouth

By default ASP.NET shouts about itself a lot. It sends HTTP headers with each response telling the world and his dog what version of ASP.NET your site is hosted on and even what version of MVC you are using. Below is an example of the extra headers needlessly being sent with every response:

[Screenshot: ASP.NET response headers]

To fix this problem you need to do a few things. The first is to set the enableVersionHeader setting on the httpRuntime section to false.

<!-- enableVersionHeader - Remove the ASP.NET version number from the response headers. Added security through obscurity. -->
<httpRuntime targetFramework="4.5" enableVersionHeader="false" />

Second, you need to clear the custom headers as shown below.

<httpProtocol>
  <customHeaders>
    <!-- X-Powered-By - Remove the HTTP header for added security and a slight performance increase. -->
    <clear />
  </customHeaders>
</httpProtocol>

Troy Hunt is a great MVC security guru and definitely worth a read on this subject.

We rename the ASP.NET session cookie from its default name of ASP.NET_SessionId to s. Now users of our site no longer have any idea what framework we are using (There are still ways to find out but we are making it harder) and we save a few more bytes being sent over the wire because we have a shorter name.

<!-- cookieName - Sets the name of the ASP.NET session cookie (Defaults to 'ASP.NET_SessionId'). -->
<sessionState cookieName="s" />

Maximum Request Length

By default, ASP.NET allows requests with a total size of up to 4096 KB. Limiting the maximum request size helps reduce the effects of denial of service attacks. You can reduce this limit further by setting the maxRequestLength attribute on the httpRuntime section. The template does not change this by default but does include a comment highlighting it.

<!-- maxRequestLength - The maximum size of a request in kilobytes (Defaults to 4096). -->
<httpRuntime maxRequestLength="4096"/>

Machine Keys

Machine keys are used by MVC to generate anti-forgery tokens, which you should be using with any form on your site. If your site is deployed to a server cluster, you need to generate a machine key and add it to the system.web section of your Web.config file. This is to ensure that all of the machines in your server cluster use the same machine key to generate anti-forgery tokens. This link tells you more about how to do this.

<machineKey decryptionKey="[YOUR DECRYPTION KEY GOES HERE]" validationKey="[YOUR VALIDATION KEY GOES HERE]"/>
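If you need to generate the key material yourself, something like the sketch below works. The byte lengths are assumptions suited to the default HMACSHA256 validation and AES decryption algorithms; adjust them to match the algorithms you configure:

using System;
using System.Security.Cryptography;

public static class MachineKeyGenerator
{
    // Returns a hex encoded string of cryptographically random bytes.
    public static string GenerateKey(int byteLength)
    {
        var bytes = new byte[byteLength];
        using (var random = new RNGCryptoServiceProvider())
        {
            random.GetBytes(bytes);
        }

        return BitConverter.ToString(bytes).Replace("-", string.Empty);
    }
}

// Usage:
// Console.WriteLine(MachineKeyGenerator.GenerateKey(64)); // validationKey
// Console.WriteLine(MachineKeyGenerator.GenerateKey(32)); // decryptionKey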

Securing Third Party Plugins

The popular Elmah NuGet package is included and configured for error logging out of the box. However, to properly secure it, you should change the URL pointing to it (An attacker can only probe your Elmah page for vulnerabilities if they can find it). You should also use some form of authentication to limit the Elmah page to certain roles or users. Both of these steps can only be taken by the person using the template. However, where we can't write the code for you, we add liberal comments and an entry to the check-list so you don't forget to do this. By default we also allow remote access to the Elmah pages; consider turning this off if you have local access to the machine the site is hosted on. Here are the relevant app settings:

<!-- If authentication is turned on, you can specify the exact roles that have access (e.g. "Administrator"). -->
<add key="elmah.mvc.allowedRoles" value="*" />
<!-- If authentication is turned on, you can specify the exact users that have access (e.g. "johndoe"). -->
<add key="elmah.mvc.allowedUsers" value="*" />
<!-- Configure the ELMAH.MVC access route. Note that you should probably change this to something else. 
     This adds a little security through obscurity. Hackers can't probe your Elmah page if they 
     don't know where it is. -->
<add key="elmah.mvc.route" value="elmah" />

Glimpse is another great tool to help with debugging and diagnostics for your site. Like Elmah, Glimpse has its own URL which you should rename. Glimpse is turned off in 'Release' mode for security reasons but you could keep it turned on and use authentication to limit who can access it. The relevant section for Glimpse is shown below.

<!-- glimpse - Navigate to {your site}/glimpse and turn on Glimpse to see detailed information about your site.
               (See http://getglimpse.com/ for a video about how this helps with debugging).
               You can also install addons for Glimpse to see even more information. E.g. Install the Glimpse.EF6
               NuGet package to see your SQL being executed (See http://getglimpse.com/Extensions for all Glimpse extensions).
               For more information on how to configure Glimpse, please visit http://getglimpse.com/Help/Configuration
               or access {your site}/glimpse for even more details and a Configuration Tool to support you. 
               Note: To change the glimpse URL, change the value in endpointBaseUri and also the glimpse URL under 
               httpHandlers and handlers sections above. -->
<glimpse defaultRuntimePolicy="On" endpointBaseUri="~/glimpse">
</glimpse>

Securing Anti-Forgery Tokens

I'm not quite sure why, but the ASP.NET MVC anti-forgery tokens cannot be configured in the Web.config file. Instead, the following code can be found in the Global.asax.cs file.

private static void ConfigureAntiForgeryTokens()
{
    // Rename the Anti-Forgery cookie from "__RequestVerificationToken" to "f". 
    // This adds a little security through obscurity and also saves sending a 
    // few characters over the wire.
    AntiForgeryConfig.CookieName = "f";

    // If you have enabled SSL, uncomment this line to ensure that the Anti-Forgery 
    // cookie requires SSL to be sent across the wire. 
    // AntiForgeryConfig.RequireSsl = true;
}

We are renaming the anti-forgery cookie from __RequestVerificationToken to f. This saves a few bytes and obscures the technology we are using a little. You can also require SSL for the anti-forgery cookie to be sent over the wire. This is commented out by default but if you are using SSL, set this to true for added security.

(UPDATE) Removing Tracing

Enabling tracing while debugging your site is a fairly common occurrence. It can be done with a single line of config:

<system.web>
  <trace enabled="true"/>
</system.web>

Your tracing information can be easily viewed by navigating to http://YourSite/trace.axd as shown here:

[Screenshot: the ASP.NET Trace.axd tracing page]

The security angle on tracing is two-fold. The first problem is the most obvious: you could leave the trace.axd page open to anyone who knows to try that URL on your site, leaking valuable inside information about your site as well as the versions of ASP.NET and .NET you are using. The fix for this is simple; you just need to remember to remove the trace node from your Web.config file.

Once again, you can use configuration file transforms to fix this problem. In your Web.Release.config file, you can add the following code to remove tracing but only when the site is built in release mode:

<system.web>
  <!-- customErrors - Turn on custom error pages instead of ASP.NET errors containing stack traces which are a security risk. -->
  <customErrors xdt:Transform="SetAttributes(mode)" mode="On"/>
  <!-- compilation - Turn off debug compilation. -->
  <compilation xdt:Transform="RemoveAttributes(debug)" />
  <!-- trace - Turn off tracing, just in case it is turned on for debugging. -->
  <trace xdt:Transform="Remove" />
</system.web>

The second problem is that even if you do this, accessing http://YourSite/trace.axd causes a 500 Internal Server Error on your site! This gives an attacker a clue that you are using ASP.NET. The correct thing to do is for the site to respond with a 404 Not Found error page instead. It turns out that in release mode you have to remove the tracing HTTP handlers altogether to stop your site responding to this URL. You can do that by adding the following snippet to the Web.Release.config file:

<system.webServer>
  <!-- remove TraceHandler-Integrated - Remove the tracing handlers so that navigating to /trace.axd gives us a 
       404 Not Found instead of 500 Internal Server Error. -->
  <handlers>
    <remove xdt:Transform="Insert" name="TraceHandler-Integrated" />
    <remove xdt:Transform="Insert" name="TraceHandler-Integrated-4.0" />
  </handlers>
</system.webServer>

(UPDATE) 403.14 Forbidden Responses to Directories

Navigating to a directory using IIS and ASP.NET MVC can cause a 403 Forbidden response to be returned. Actually, it's a 403.14 Forbidden response to be exact. IIS is basically telling us that directory browsing is disabled (As it should be; directory browsing is a severe security risk. It can allow attackers to see your Web.config file with all your connection strings in it!). You can see what happens when I navigate to the physical /Content folder below:

[Screenshot: a 403.14 Forbidden response]

So what is the problem? Well, a user would expect a 404 Not Found response if a resource is not found. A 403.14 Forbidden response tells a potential attacker that there is a folder there and that you are using IIS. Not the most useful information on its own, but combined with other information it could be. The way to fix this is to handle 403.14 errors and replace the response with a standard 404 Not Found. We just need to add the code below:

<system.webServer>
  <!-- Custom error pages -->
  <httpErrors errorMode="Custom" existingResponse="Replace">
    <!-- Redirect IIS 403.14 Forbidden responses to the error controllers not found action.
         A 403.14 happens when navigating to an empty folder like /Content and directory browsing is turned off
         See http://www.troyhunt.com/2014/09/solving-tyranny-of-http-403-responses.html -->
    <error statusCode="403" subStatusCode="14" responseMode="ExecuteURL" path="/error/notfound" />
    <!-- ...Omitted Code... -->
  </httpErrors>
</system.webServer>

Just adding this is not enough, however. If I fire up Fiddler, navigating to the /Content folder of the site now results in a 301 Document Moved response, followed by a 404 Not Found. We can do much better than that.

[Screenshot: Fiddler showing the 301 courtesy redirect]

To get around the above issue, you need to turn off default document handling in IIS. Please do note that this will stop IIS from returning the default document (using what's called a courtesy redirect) when navigating to a folder, e.g. navigating to /Folder, which contains an index.html file, will not return /Folder/index.html. This should not be a problem as we are using ASP.NET MVC controllers and actions rather than physical files.

<system.webServer>
  <!-- Stop IIS from doing courtesy redirects, which redirect a link to a directory without
       a trailing slash to one with a trailing slash e.g. /Content redirects to /Content/. This gives a clue
       to hackers as to the location of directories. -->
  <defaultDocument enabled="false"/>
</system.webServer>

Now, navigating to /Content returns a simple and correct 404 Not Found and we no longer get the courtesy redirect either. Take a look at the same request in Fiddler:

[Screenshot: Fiddler showing the 404 Not Found response]

Conclusions

IIS seems to have a lot of strange behaviours that have a detrimental effect on security. If you use the above settings however, you can cut out IIS's extra features that you don't need or want. Look out for the next post when I'll be discussing the very cool NWebSec NuGet package, which provides a whole host of comprehensive ASP.NET MVC security related filters which you can apply to your site.


Reactive Extensions (Rx) - Part 7 - Sample Events


It's been a while since I've done another Rx post. They've been pretty popular and thanks to the community for all the positive feedback. I was talking to a colleague yesterday who had been using standard C# events in WPF (The principles learned in this post can apply anywhere). He had subscribed to the TextChanged event in C# and was updating the user interface on the fly, whenever the user typed in a character of text. He was getting way too many events being fired and his user interface couldn't keep up with all the work it was being asked to do.

this.TextBox.TextChanged += this.OnTextBoxTextChanged;

private void OnTextBoxTextChanged(object sender, TextChangedEventArgs e)
{
    // Heavy User Interface updates that can cause the application to lock up.
}

This is a very common scenario which I myself have come across several times. The solution to this problem is to take a sample of the events being fired and only update the user interface every few seconds. This is possible without Reactive Extensions (Rx) but you have to write a fair amount of boilerplate code (I know, I've done it myself), as the sketch below shows.
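To show why Rx is worth it, here is a rough sketch of the manual, timer-based alternative (the member names are illustrative): record that the text changed and let a timer do the heavy work at most once every three seconds.

private readonly DispatcherTimer sampleTimer =
    new DispatcherTimer() { Interval = TimeSpan.FromSeconds(3) };
private bool textChangedSinceLastTick;

public void StartSampling()
{
    // Just record that something changed; the timer does the real work.
    this.TextBox.TextChanged += (sender, e) => this.textChangedSinceLastTick = true;
    this.sampleTimer.Tick += (sender, e) =>
    {
        if (this.textChangedSinceLastTick)
        {
            this.textChangedSinceLastTick = false;
            // Heavy user interface updates happen here, at most once per tick.
        }
    };
    this.sampleTimer.Start();
}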

Reactive Extensions (Rx) can do this with a few easy to understand (This is the real bonus) lines of code. The first step is to wrap the WPF TextChanged event (I've shown how to do this in a previous post here).

public IObservable<TextChangedEventArgs> WhenTextChanged
{
    get
    {
        return Observable
            .FromEventPattern<TextChangedEventHandler, TextChangedEventArgs>(
                h => this.TextBox.TextChanged += h,
                h => this.TextBox.TextChanged -= h)
            .Select(x => x.EventArgs);
    }
}

this.WhenTextChanged
    .Sample(TimeSpan.FromSeconds(3))
    .Subscribe(x => Debug.WriteLine(DateTime.Now + " Text Changed"));

The final and most succinct step is to use the Sample method to only pick out the latest text changed event every three seconds and pass that on to the Subscribe delegate. It really is that easy and this blog post really is this short because of that!

NWebSec ASP.NET MVC Security Through HTTP Headers


This series of blog posts goes through the additions made to the default ASP.NET MVC project template to build the ASP.NET Core Boilerplate project template. You can create a new project using this template by installing the Visual Studio extension or visit the GitHub site to view the source code.

Web Security is Hard

Security is hard at the best of times. Web security...well...it takes things to a whole new level of difficulty. It is ridiculously easy to slip up and leave holes in your site's defences.

This blog post, as well as the ASP.NET Core Boilerplate project, is not a replacement for your own knowledge, but it does help by setting up more secure defaults and giving you a few more tools out of the box to help secure your site.

If you have some time and want to learn more about web security I highly recommend Troy Hunt's Pluralsight course called Hack yourself first. Note that Pluralsight requires a paid subscription (I'm quite against posting links to paid content but this course is pretty good. You can also get a trial subscription if you're interested). Here is a free video by Troy which covers the same topic but in a little less depth.

I would also highly recommend reading Troy Hunt's blog, which has extensive examples of real-life websites in the wild, written by major companies, getting web security horribly wrong.

NWebSec

The NWebSec NuGet packages written by André N. Klingsheim are a great way to add additional security to your ASP.NET MVC site. The ASP.NET Core Boilerplate project template includes them by default.

Everything is preconfigured and commented as much as possible out of the box but remember this is a project template to get you started. You still need to put the effort in to customize the site security to your own requirements and put in some time learning about what each of the security features does and how best to use it.

HTTP Headers

HTTP has been around for a very long time and so a fairly large number of HTTP headers has accumulated over time. Some are more useful than others but many of them are aimed at making the web more secure.

André N. Klingsheim has a brilliant blog post called Security through HTTP response headers which is a must read and fairly comprehensive. Go on, I'll wait for you to finish reading. NWebSec provides a host of ActionFilterAttributes (The rest of this post expects you to know what these are) which can be applied in three different ways:

  1. Applied globally, so that they apply to all HTTP request/response messages.
  2. Applied to individual controllers.
  3. Applied to individual controller actions.

NWebSec's ActionFilterAttributes add and configure specific HTTP headers. Most of them are preconfigured in ASP.NET Core Boilerplate for you to apply globally but some require you to take action.
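As a quick sketch of those three scopes, using the X-Content-Type-Options filter (covered below) as the example; the controller and action names are illustrative:

// 1. Globally, for every request e.g. in App_Start\FilterConfig.cs.
GlobalFilters.Filters.Add(new XContentTypeOptionsAttribute());

// 2. On an individual controller.
[XContentTypeOptions]
public class AccountController : Controller
{
}

// 3. On an individual controller action.
public class HomeController : Controller
{
    [XContentTypeOptions]
    public ActionResult Index()
    {
        return this.View();
    }
}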

X-Frame-Options

The X-Frame-Options HTTP header stops click-jacking by stopping the page from opening in an iframe or only allowing it from the same origin (your domain). There are three options to choose from:

  • SameOrigin - Specifies that the X-Frame-Options header should be set in the HTTP response, instructing the browser to display the page when it is loaded in an iframe - but only if the iframe is from the same origin as the page.
  • Deny - Specifies that the X-Frame-Options header should be set in the HTTP response, instructing the browser to not display the page when it is loaded in an iframe.
  • Disabled - Specifies that the X-Frame-Options header should not be set in the HTTP response.

We can use NWebSec to block all iframes from loading the site, which is the most secure option and the default set in ASP.NET Core Boilerplate.

// Filters is the GlobalFilterCollection from GlobalFilters.Filters
filters.Add(
    new XFrameOptionsAttribute()
    {
        Policy = XFrameOptionsPolicy.Deny
    });

You should note that for newer browsers, this HTTP header has been superseded by the Content-Security-Policy HTTP header, which I will be covering in my next blog post. However, it should still be used for older browsers.

Strict-Transport-Security

This HTTP header is only relevant if you are using TLS. It ensures that content is loaded over HTTPS and refuses to connect in case of certificate errors and warnings. You can read a complete guide to setting up your site to run with a free TLS certificate here.

NWebSec currently does not provide an MVC filter for this header that can be applied globally. Instead, we can use the OWIN extension (from the NWebSec.Owin NuGet package) to apply it.

app.UseHsts(options => options.MaxAge(days:30).IncludeSubdomains());
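For context, that call lives in an OWIN Startup class. Here is a minimal sketch, assuming the NWebsec.Owin NuGet package is installed (MyApp is a placeholder namespace):

using Microsoft.Owin;
using NWebsec.Owin;
using Owin;

[assembly: OwinStartup(typeof(MyApp.Startup))]

namespace MyApp
{
    public class Startup
    {
        public void Configuration(IAppBuilder app)
        {
            // Send the Strict-Transport-Security HTTP header with a 30 day max age.
            app.UseHsts(options => options.MaxAge(days: 30).IncludeSubdomains());
        }
    }
}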

As well as this header, MVC ships with the RequireHttpsAttribute. This forces an unsecured HTTP request to be re-sent over HTTPS. It does so without requiring any extra HTTP headers. Instead, this is a function of the MVC framework itself, which checks requests and simply redirects users if they send a normal HTTP request to a HTTPS URL. This attribute can be set globally (Using HTTPS throughout your site is a good idea these days) as shown below:

filters.Add(new RequireHttpsAttribute());

Both of these lines of code have an overlapping purpose but work in different ways. The RequireHttpsAttribute uses the MVC framework, while the NWebSec option relies on browsers responding to the Strict-Transport-Security HTTP header. Security should be applied in thick layers, so it's worth using both features. ASP.NET Core Boilerplate assumes you are not using TLS by default but does include the above lines of code commented out with a liberal sprinkling of comments to make it easy to add back in.

X-Content-Type-Options

This HTTP header stops IE9 and below from sniffing files and overriding the Content-Type header (MIME type) of a HTTP response. This filter is added by default in ASP.NET Core Boilerplate.

filters.Add(new XContentTypeOptionsAttribute());

X-Download-Options

This HTTP header stops browsers from automatically opening downloaded HTML pages, which would otherwise run as if they were part of your site. It forces the user to save the page and manually open the HTML document. This filter is added by default in ASP.NET Core Boilerplate.

filters.Add(new XDownloadOptionsAttribute());

Other HTTP Headers

NWebSec provides a number of other useful HTTP headers. The SetNoCacheHttpHeadersAttribute helps turn off caching by applying the Cache-Control, Expires and Pragma HTTP headers (Expires and Pragma have been superseded by Cache-Control but still need to be applied for backward compatibility).

Another useful filter provided is XRobotsTagAttribute. This adds the X-Robots-Tag HTTP header, which tells robots (such as Google or Bing) not to index any action or controller this attribute is applied to. Note that ASP.NET Core Boilerplate includes a robots.txt file, which you should use instead of this filter, but I've added this here for completeness.

A good place to use these attributes would be on a page where you want to post back credit card information because caching credit card information could be a security risk and you probably don't want search engines indexing your checkout pages either.

public class CheckoutController : Controller
{
    [SetNoCacheHttpHeadersAttribute, XRobotsTagAttribute(NoIndex = true, NoFollow = true)]
    public ActionResult Checkout(CardDetails cardDetails)
    {
        // Check out the customer's purchases securely.
        return this.View();
    }
}

The CspAttribute filter adds valuable support for the new Content-Security-Policy (CSP) HTTP header. I will be covering this extensively in my next blog post so I've only mentioned it here. There are other HTTP headers but they turn off browser security features and I'm not really sure why you would use those.

Conclusions

In the image below, you can see the ASP.NET Core Boilerplate site in action. I've taken a screenshot of the HTTP response headers. You will see the ones listed in this post among them.

[Screenshot: ASP.NET Core Boilerplate HTTP response headers]

Using HTTP headers for security is just one extra tool in your arsenal to secure your site. As you will see in my next post about the new Content-Security-Policy (CSP) HTTP header, it can be a very powerful tool but not one to be used in isolation. You need to think about security across the whole spectrum of your site to catch all the glaring holes you may have missed.

Content Security Policy (CSP) for ASP.NET MVC


This series of blog posts goes through the additions made to the default ASP.NET MVC template to build the ASP.NET Core Boilerplate project template. You can create a new project using this template by installing the Visual Studio template extension or visit the GitHub site to view the source code.

What is CSP?

For a true in-depth look into CSP, I highly recommend reading Mozilla's documentation on the subject. It really is the best resource on the web. I will assume that you've read the documentation and will be going through a few examples below.

Content Security Policy or CSP is a great new HTTP header that controls where a web browser is allowed to load content from and the type of content it is allowed to load. It uses a white-list of allowed content and blocks anything not in the list. It gives us very fine grained control and allows us to run our site in a sandbox in the user's browser.

CSP is all about adding an extra layer of security to your site using a Defence in Depth strategy. It helps detect and mitigate Cross Site Scripting (XSS) and other content injection attacks.

Real World Example

So what does this look like in a web browser? Well, here is an example of a Content-Security-Policy HTTP header shown in Chrome. I used the ASP.NET Core Boilerplate Visual Studio project template to create an ASP.NET MVC project that has CSP applied, right out of the box.

[Screenshot: the Content-Security-Policy HTTP header shown in Chrome]

This is the HTTP header in the screenshot above. We'll discuss it in a lot more detail later in this post. Essentially it says: block everything, except scripts, images, fonts, Ajax requests and forms to or from my domain, and also allow scripts from the Google and Microsoft CDNs.

Content-Security-Policy: default-src 'none';
                         script-src 'self' ajax.googleapis.com ajax.aspnetcdn.com;
                         style-src 'self' 'unsafe-inline';
                         img-src 'self';
                         font-src 'self';
                         connect-src 'self';
                         form-action 'self';
                         report-uri /WebResource.axd?cspReport=true

So for example, you may only want to load CSS, JavaScript and images from your own trusted domain(s) and block everything else. You also might want to block any use of third party plug-ins (Flash or Silverlight) or frames. Using this type of policy, the only way an attacker could compromise your site using an XSS attack would be to somehow get a malicious script served up on your pages from your own domain, in a separate script file, because in-line styles and scripts are blocked by CSP by default (You can turn this protection off but I will go on to tell you why that is a bad idea later on).

<script src="http://evil.com/Script.js"></script>

With the above CSP HTTP header in place if an attacker did manage to inject the script above, browsers would throw CSP violation errors and the evil script would not be executed or even downloaded. You can see what that looks like in Chrome below.

[Screenshot: Content Security Policy violation errors in the Chrome console]

Even better, the browser never even downloads the evil script in the first place. You can compare the two screen-shots of Fiddler below. The first shows that the evil Script.js file was never even requested but a Content Security Policy violation was logged to the highlighted URL (I'll talk more about this later). The second shows the site with no CSP policy in effect. The browser tries to download the evil Script.js file and, as this is just a demo and I haven't gone to the trouble of setting up an evil website, it can't be found and returns a 404 Not Found.

[Screenshot: Fiddler with CSP applied - the evil Script.js is never requested and a violation is reported]

[Screenshot: Fiddler with no CSP applied - the browser requests the evil Script.js]

Content Security Policy Directives

There are a number of 'directives' that are used in the policy above. Mozilla has the full list of directives and how each is used here. Each directive controls access to a particular function in a web browser. I will not cover each one in detail as they all work in the same way, but I will cover the most important and unique directives below.

The default-src Directive

The default-src directive lets us apply some default restrictions. For example, if I specified the following CSP policy, it would allow all types of content from my site's domain, as well as TrustedSite.com.

Content-Security-Policy: default-src 'self' TrustedSite.com

Now, the above policy is pretty loose: it tells a browser it can load frames, Ajax requests, Web Sockets, fonts, images, audio, video, plug-ins, scripts and styles from both of those domains. It may well be that you don't use most of the things on that list. A much better policy is to block everything by default and then only allow the resources you actually use, as shown below.

Content-Security-Policy: default-src 'none'; 
                         script-src TrustedSite.com; 
                         style-src 'self'; 
                         img-src 'self'; 
                         font-src 'self'; 
                         connect-src 'self'; 
                         form-action 'self'

You can see that default-src has been set to none, which blocks everything by default. Then we add other directives that allow scripts from TrustedSite.com, and styles, images, fonts, Ajax requests and form submissions from my site's domain. This is a lot more secure and restrictive but it does require you to think more carefully about your policy.

The report-uri Directive

The report-uri directive is another special instruction. It gives the web browser a URL where it can post details of any violations of a CSP policy in JSON format. This is vitally important: it allows us to find out about anyone trying to hack our site and, probably much more likely, about any resources that we have accidentally blocked because our policy was too restrictive and we did not do enough testing. In the example below, we are telling the browser to post CSP violation errors in JSON format to WebResource.axd?cspReport=true.

Content-Security-Policy: default-src 'self'; report-uri /WebResource.axd?cspReport=true

If we take the evil script above and try to add it to our page with the above CSP policy, we get a CSP violation error and you can see the JSON sent to us by the Chrome browser below. Please do note that different browsers send reports which differ slightly. Some browsers, and indeed versions of browsers, give more information than others.

{
    "csp-report": {
        "document-uri": "http://localhost:8080/",
        "referrer": "",
        "violated-directive": "default-src 'self'",
        "effective-directive": "script-src",
        "original-policy": "default-src 'self';report-uri /WebResource.axd?cspReport=true",
        "blocked-uri": "http://evil.com",
        "status-code": 200
    }
}

The style-src Directive

As I've mentioned before, in-line styles are not allowed when using CSP because there is a risk that an attacker could inject in-line styles into a compromised page. All styles must be referenced from external CSS files. In the example below, the external stylesheet is allowed but the in-line style block is blocked.

<!-- Allowed - External stylesheets from the same domain are permitted. -->
<link href="/Site.css" rel="stylesheet"/>
<!-- Blocked - In-line styles are blocked by CSP by default. -->
<style>
    p {
        font-size:12pt;
    }
</style>

There is a source value for this directive which allows in-line styles, but you should avoid it as it is unsafe. Indeed, the value you have to pass to the style-src directive is called unsafe-inline.

style-src 'self' 'unsafe-inline'

The script-src Directive

Just like the style-src directive, the script-src directive causes in-line scripts to be blocked by default due to the risk of XSS attacks. Apart from in-line scripts, the JavaScript eval() function is also blocked by default.

And just like the style-src directive, there is a way to enable in-line scripts, also called unsafe-inline. There is another value called unsafe-eval, which allows access to the eval function. Once again, these should be avoided and I have covered them here only because you should be wary of those who tell you to use them.

script-src 'self' 'unsafe-inline' 'unsafe-eval'

The Content-Security-Policy-Report-Only HTTP Header

CSP can be a pretty dangerous HTTP header if you have misconfigured it. Imagine a user visiting your site and wanting to view a YouTube video, but your CSP policy has blocked the video. All they see is a blank space where the video should be and no indication that something is wrong, unless they are clever enough to use the browser developer tools. That's a pretty poor user experience.

To combat this problem the W3C created the Content-Security-Policy-Report-Only HTTP header. This works just the same as Content-Security-Policy but it only reports violations of your policy and does not cause the browser to actually block anything.
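For example, the policy from the start of this post can be trialled without enforcement simply by switching the header name:

Content-Security-Policy-Report-Only: default-src 'none';
                                     script-src 'self' ajax.googleapis.com ajax.aspnetcdn.com;
                                     style-src 'self' 'unsafe-inline';
                                     img-src 'self';
                                     font-src 'self';
                                     connect-src 'self';
                                     form-action 'self';
                                     report-uri /WebResource.axd?cspReport=true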

CSP for ASP.NET MVC

So you're sold on CSP and want to know how you can implement this great new HTTP header on your ASP.NET MVC website. Well, to get started all you need to do is install the NWebsec.Mvc NuGet package.

NWebsec is a great collection of MVC filters which can be applied globally to all requests or to individual controllers or actions. NWebSec contains a series of MVC filters to support CSP but includes several other filters which I've already blogged about here.

Here is the CSP policy I have applied to the ASP.NET Core Boilerplate site and the code which is used to create it. This policy is applied to all responses from the site.

Content-Security-Policy: default-src 'none';
                         script-src 'self' ajax.googleapis.com ajax.aspnetcdn.com;
                         style-src 'self' 'unsafe-inline';
                         img-src 'self';
                         font-src 'self';
                         connect-src 'self';
                         form-action 'self';
                         report-uri /WebResource.axd?cspReport=true
// Content-Security-Policy - Add the Content-Security-Policy HTTP header to enable Content-Security-Policy.
GlobalFilters.Filters.Add(new CspAttribute());
// OR
// Content-Security-Policy-Report-Only - Add the Content-Security-Policy-Report-Only HTTP header to enable logging of 
//      violations without blocking them. This is good for testing CSP without enabling it.
//      To make use of this attribute, rename all the attributes below to their ReportOnlyAttribute versions e.g. CspDefaultSrcAttribute 
//      becomes CspDefaultSrcReportOnlyAttribute.
// GlobalFilters.Filters.Add(new CspReportOnlyAttribute());

// default-src - Sets a default source list for a number of directives. If the other directives below are not used 
//               then this is the default setting.
filters.Add(
    new CspDefaultSrcAttribute()
    {
        // Disallow everything from the same domain by default.
        None = true,
        // Allow everything from the same domain by default.
        // Self = true
    });

// connect-src - This directive restricts which URIs the protected resource can load using script interfaces 
//               (Ajax Calls and Web Sockets).
filters.Add(
    new CspConnectSrcAttribute()
    {
        // Allow AJAX and Web Sockets to example.com.
        // CustomSources = "example.com",
        // Allow all AJAX and Web Sockets calls from the same domain.
        Self = true
    });
// font-src - This directive restricts from where the protected resource can load fonts.
filters.Add(
    new CspFontSrcAttribute()
    {
        // Allow fonts from example.com.
        // CustomSources = "example.com",
        // Allow all fonts from the same domain.
        Self = true
    });
// form-action - This directive restricts which URLs can be used as the action of HTML form elements.
filters.Add(
    new CspFormActionAttribute()
    {
        // Allow forms to post back to example.com.
        // CustomSources = "example.com",
        // Allow forms to post back to the same domain.
        Self = true
    });
// img-src - This directive restricts from where the protected resource can load images.
filters.Add(
    new CspImgSrcAttribute()
    {
        // Allow images from example.com.
        // CustomSources = "example.com",
        // Allow images from the same domain.
        Self = true,
    });
// script-src - This directive restricts which scripts the protected resource can execute. 
//              The directive also controls other resources, such as XSLT style sheets, which can cause the user agent to execute script.
filters.Add(
    new CspScriptSrcAttribute()
    {
        // Allow scripts from the CDN's.
        CustomSources = "ajax.googleapis.com ajax.aspnetcdn.com",
        // Allow scripts from the same domain.
        Self = true,
        // Allow the use of the eval() method to create code from strings. This is unsafe and can open your site up to XSS vulnerabilities.
        // UnsafeEval = true,
        // Allow inline JavaScript, this is unsafe and can open your site up to XSS vulnerabilities.
        // UnsafeInline = true
    });
// style-src - This directive restricts which styles the user applies to the protected resource.
filters.Add(
    new CspStyleSrcAttribute()
    {
        // Allow CSS from example.com
        // CustomSources = "example.com",
        // Allow CSS from the same domain.
        Self = true,
        // Allow inline CSS. This is unsafe and can open your site up to XSS vulnerabilities.
        // Note: This is currently enabled because Modernizr does not support CSP and includes inline styles
        // in its JavaScript files. This is a security hole. If you don't want to use Modernizr, 
        // be sure to disable unsafe inline styles. For more information see:
        // http://stackoverflow.com/questions/26532234/modernizr-causes-content-security-policy-csp-violation-errors
        // https://github.com/Modernizr/Modernizr/pull/1263
        UnsafeInline = true
    });

Notice how there is one MVC filter for each CSP directive. This is actually a very elegant solution. Consider the fact that you may want the actions in a particular controller to be able to display YouTube videos. Note that YouTube makes use of iframes to embed videos; its embed markup is shown below.

<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/PGM_uBy99GA" frameborder="0" allow="accelerometer; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

With the above CSP policy, the Chrome browser throws the following error.

[Screenshot: Content Security Policy violation caused by the YouTube iframe]

We need to add the frame-src and child-src directives, which can be applied to the specific controller. Note that child-src is a CSP 2.0 directive and frame-src is deprecated in CSP 2.0, but we still need to add it for older browsers.

[CspChildSrc(CustomSources = "*.youtube.com")]
[CspFrameSrcAttribute(CustomSources = "*.youtube.com")]
public class HomeController : Controller
{
    // Action methods omitted.
}

But what if we only want to allow a YouTube video to display for a single action, rather than all of the actions in a controller? Well, it's as simple as moving the attributes to the action rather than the controller.

public class HomeController : Controller
{
    [CspChildSrc(CustomSources = "*.youtube.com")]
    [CspFrameSrcAttribute(CustomSources = "*.youtube.com")]
    public ActionResult Index()
    {
        // This view displays a YouTube Video.
        return this.View();
    }
}

Setting up the reporting of CSP violations is a bit more complicated. You need to add the CspReportUriAttribute MVC filter and add a special function in your Global.asax.cs file to actually handle a violation as shown below.

filters.Add(new CspReportUriAttribute() { EnableBuiltinHandler = true });

// Added to Global.asax.cs
protected void NWebsecHttpHeaderSecurityModule_CspViolationReported(object sender, CspViolationReportEventArgs e)
{
    CspViolationReport violationReport = e.ViolationReport;
    // Log the CSP violation here.
}

The CspViolationReport is a representation of the JSON CSP violation that the browser sends you. It contains several properties, which can tell you about the blocked URL, the violated directive, the user agent and a lot more. This is your opportunity to log this data in your preferred logging framework.
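As a sketch of what that logging might look like (the property names below follow NWebSec's CspViolationReport type; check them against the version of the package you are using):

protected void NWebsecHttpHeaderSecurityModule_CspViolationReported(object sender, CspViolationReportEventArgs e)
{
    CspViolationReport violationReport = e.ViolationReport;
    // Write the most useful fields to the trace log; substitute your preferred logging framework.
    System.Diagnostics.Trace.TraceWarning(
        "CSP Violation. Blocked URI: {0}, Violated Directive: {1}, User Agent: {2}",
        violationReport.Details.BlockedUri,
        violationReport.Details.ViolatedDirective,
        violationReport.UserAgent);
}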

One final note, all of this code is available to view on GitHub here and is part of the ASP.NET Core Boilerplate project template.

Browser Support

CSP is a proper standard; you can read the W3C documentation here. According to the W3C at the time of writing, CSP is at Candidate Recommendation, which is "a version of the standard that is more firm than the Working Draft" and "The standard document may change further, but at this point, significant features are mostly locked".

If you take a look at CanIUse.com, you can see that FireFox 23+, Chrome 25+, Safari 7+ and Opera 15+ already support the official Content-Security-Policy HTTP header, while the next version of IE (Spartan or IE12, who knows what they'll name it?) will come with full support too.

A number of older browser versions supported CSP using the X-Content-Security-Policy or X-WebKit-CSP HTTP header (The X- is commonly used to add features to browsers which are not yet finalised) but these older implementations are buggy (Their use can mean content on your site gets blocked, even though you allowed it!) and should not be used.

Content Security Policy (CSP) 2.0

There is currently an 'Editor's Draft' of CSP 2.0, written by the W3C standards body. It was published on 13 November 2014.

The intention of this version is to fill a few gaps and add a few new directives which allow control over web workers, embedded frames, application manifests, the HTML document's base URL, where forms can be posted and the types of plug-ins the browser can load. NWebsec already supports most of these new directives (except, notably, the plug-in types) and you can start using them today.

As well as these changes, CSP 2.0 also tries to address the pain points in using CSP, and perhaps the reason for its slow take-up so far: the inability to safely use in-line CSS and JavaScript using the style and script tags in your HTML.

So why would you want to use in-line styles and scripts in the first place? Well, do you use Modernizr? If so, note that Modernizr does not work with CSP (I discuss this below). It makes use of in-line styles to test for various web browser capabilities and so requires the unsafe-inline directive to function. There are other libraries with a similar requirement. Another reason for using in-line styles and scripts is applying CSP to an existing web application, where you don't want to spend time moving everything to separate files.

CSP 1.0 had the unsafe-inline directive which allowed the use of in-line style and script tags but it is pretty dangerous and makes CSP partially pointless. It gives attackers the ability to inject code into your site (Using another vulnerability in your site if there is one) and to pull off a Cross Site Scripting (XSS) attack. Using CSP 1.0 meant loading styles and scripts from separate CSS and JavaScript files. CSP 2.0 introduces two new ways to use in-line styles and scripts.

Nonces

Nonces work a little like the Anti-Forgery Token in ASP.NET MVC. A cryptographically random string is generated and sent to the client in the CSP HTTP header, as well as in the HTML with the style or script tag like so:

Content-Security-Policy: default-src 'self'; 
                         script-src 'self' https://example.com 'nonce-Nc3n83cnSAd3wc3Sasdfn939hc3'
<script>
alert("Blocked because the policy doesn't have 'unsafe-inline'.")
</script>
<script nonce="EDNnf03nceIOfn39fn3e9h3sdfa">
alert("Still blocked because nonce is wrong.")
</script>
<script nonce="Nc3n83cnSAd3wc3Sasdfn939hc3">
alert("Allowed because nonce is valid.")
</script>

There is a problem, however: web browsers that only support CSP 1.0 will not understand the nonce and will block the in-line script above. To resolve this issue, we combine the nonce with the unsafe-inline directive. CSP 1.0 web browsers will execute the in-line script as before (insecure but backwards compatible), but CSP 2.0 browsers will disregard 'unsafe-inline' when they see the nonce and only execute in-line scripts with a valid nonce set. This gives an upgrade path for existing sites; they can benefit from CSP 2.0 without requiring a massive rewrite to get rid of in-line styles and scripts.
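For example, a combined policy looks like the header below (the nonce value is illustrative and must change with every response):

Content-Security-Policy: script-src 'self' 'unsafe-inline' 'nonce-Nc3n83cnSAd3wc3Sasdfn939hc3'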

Nonces can be easily implemented by using the HTML helper provided by NWebSec. You can find out more about how this feature is implemented in NWebSec here.

<script @Html.CspScriptNonce()>document.write("Hello world")</script>
<style @Html.CspStyleNonce()>
   h1 {
          font-size: 10em;
        }
</style>

The big disadvantage with this approach is that the nonce is different for each response sent to the client. This means that you cannot cache any page that uses nonces. If your page is specific to a user, then you probably don't want to cache it anyway and it doesn't matter; otherwise, using nonces is not possible.

Hashes

Using hashes solves the caching problem we have with nonces. The server computes the hash of a particular style or script tag's contents and includes the base64 encoding of that value in the Content-Security-Policy header like so:

Content-Security-Policy: script-src 'sha512-YWIzOWNiNzJjNDRlYzc4MTgwMDhmZDlkOWI0NTAyMjgyY2MyMWJlMWUyNjc1ODJlYWJhNjU5MGU4NmZmNGU3OAo='
<script>
alert('Hello, world.');
</script>

As you can see, the script itself remains unchanged and only the HTTP header changes. We can now, happily cache the page with the in-line script in it. Unfortunately, at the time of writing NWebSec does not support hashes at all. If you feel this feature is worthwhile as I do, then you can raise an issue on NWebSec's issue list.
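If you are curious how the hash source is computed, it is straightforward: hash the exact text between the script tags (whitespace included) and base64 encode the digest. Here is a minimal sketch using SHA-256; CSP 2.0 also allows SHA-384 and SHA-512:

using System;
using System.Security.Cryptography;
using System.Text;

public static class CspHash
{
    // Returns a source value suitable for a script-src directive e.g. 'sha256-...'.
    public static string ComputeScriptHash(string scriptContent)
    {
        using (SHA256 sha256 = SHA256.Create())
        {
            byte[] digest = sha256.ComputeHash(Encoding.UTF8.GetBytes(scriptContent));
            return "'sha256-" + Convert.ToBase64String(digest) + "'";
        }
    }
}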

Other CSP Support

So for the reasons you've learned above, using in-line styles and scripts is not the way to go. Apart from the fact that CSP will block them, you also cannot minify and obfuscate in-line scripts very easily using ASP.NET MVC (there are ways, which I have looked into, but they aren't very good). Moving styles and scripts to external CSS and JavaScript files means you can use CSP and you might get a small performance boost. So what's the problem? Well, CSP is not currently supported by a few major libraries.

Modernizr Support for CSP

As I've said above, Modernizr makes use of in-line styles to test for various web browser capabilities and so requires the insecure 'unsafe-inline' directive to function. There is a fix for the problem but it's very old and can no longer be merged into the current branch of the Modernizr code. I would fix it myself but I'm not enough of a JavaScript guru to do so. What I have done is raise this Stack Overflow question, which asks for a workaround or fix and generally raises awareness.

So far, I've received no responses on GitHub or Stack Overflow but there is hope. AngularJS (another popular JavaScript library) has a CSP compatible mode which makes use of an external CSS file and is very easy to set up. There is no reason why Modernizr could not have something similar. Another alternative: if NWebSec supported hashes, we could work out the hashes of the in-line styles and scripts that Modernizr uses and include them in our CSP HTTP header.

Browser Link Support for CSP

Browser Link is a very cool Visual Studio feature that allows you to update an MVC view while debugging and hit a refresh button to refresh any browsers using that page. Unfortunately, this handy feature is not compatible with CSP because Visual Studio adds in-line scripts to the bottom of the page you are debugging. This, of course, causes CSP violation errors. A simple workaround is to either introduce the unsafe-inline directive while debugging, as sketched below, or turn off Browser Link altogether.
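Here is a sketch of the first workaround: relax script-src in debug builds only, so that Browser Link's injected scripts are not blocked. This mirrors the CspScriptSrcAttribute configuration shown earlier:

#if DEBUG
// Browser Link injects in-line scripts while debugging, so allow them in debug builds only.
filters.Add(new CspScriptSrcAttribute() { Self = true, UnsafeInline = true });
#else
filters.Add(new CspScriptSrcAttribute() { Self = true });
#endif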

I have raised this suggestion on Visual Studio's User Voice page to get the problem fixed. I understand that this area has been changed significantly in ASP.NET Core, so it may not be needed by the time we all upgrade.

Mainstream CSP Adoption

So far, not many websites in the wild have implemented CSP. I think there are a few reasons:

  1. Lack of browser support (Until now).
  2. Lack of awareness by developers (Until this blog post I hope).
  3. Framework providers such as Microsoft and its ASP.NET MVC have not given developers a way to implement CSP (NWebSec has stepped in here to fill this gap).
  4. Prevalence of the use of in-line styles and scripts and unwillingness to switch to separate files (This is up to you).
  5. Lack of support for CSP from popular CSS/JavaScript libraries due to the reason above (This is the biggest problem).
  6. CSP gives us an extra layer of protection using a Defence in Depth strategy. Some developers don't take web security seriously enough until they get hacked.
  7. The older CSP HTTP headers (X-Content-Security-Policy or X-WebKit-CSP) were buggy or had unexpected behaviour (The Content-Security-Policy HTTP header no longer has this problem).
  8. Developers are not making good use of the ability to report violations to their CSP policy using the report-uri directive. If you find a violation, you can quickly discover if someone is attacking your site, your CSP policy is not valid or you have a bug in your site.
  9. Developers are scared of breaking their site because their CSP policy is too strict (This is often because CSP is being retrofitted to an existing spiders web of a site. If you start with CSP from the ground up, you will not have this problem).

CSP in the Real World

There is a really interesting white paper from 2014 titled 'Why is CSP Failing? Trends and Challenges in CSP Adoption', which goes over the issues listed above in a lot more depth.

Who is Using CSP

According to the white paper, CSP is deployed in enforcement mode on only 1% of the Alexa Top 100 sites. I believe things are about to change. All major browsers now support the CSP HTTP header, NWebSec makes it easy to add to an MVC project, this blog post tells you exactly how it works and the ASP.NET Core Boilerplate project template gives you a project template that enables CSP by default, right out of the box.

There are big names using CSP right now. Go ahead and check the HTTP response headers from sites like Facebook, CNN, BBC, Google, Huffington Post, YouTube, Twitter and GitHub. Things have moved on from when the white paper was written and CSP adoption is starting to gain traction. Read Twitter's case study on adopting CSP.

Browser Extensions and ISPs

Another interesting finding from the white paper was that browser extensions and even ISPs were injecting scripts into pages, causing CSP violation reports. CSP may break some browser extensions that inject code into the page. You may consider this a good or bad thing. From a security point of view, what you need to ask is: do you really trust any browser extension to modify your code? I don't know about you but I don't want any extension's, and especially any ISP's, dirty fingers in my source code.

You can use SSL/TLS, which will stop most ISPs from fiddling with your code, but some governments get around even this! So CSP gives us some extra protection from man in the middle attacks by browser extensions and ISPs.

CSP and Advertising

Advertising can be a problem for CSP. Some ad providers are better than others. Some providers use resources whose locations are constantly changing which can cause CSP violation errors if your policy is too strict. CNN has adopted a novel workaround for this problem. It embeds all of its adverts into frames which show pages with no CSP restrictions or at least very liberal ones.

CSP Policy Generation

There are special web crawlers that have been created to crawl all of the links on your domain, in an attempt to generate a valid CSP policy automatically. CSP Tools is one such project on GitHub, which, given a list of URLs, can crawl the web pages and generate a CSP policy. Another approach the tool uses is to look at your CSP violation error reports and come up with rules based on these.

Be careful using this approach, however; it may not catch everything. The best approach is to build up your CSP policy as you build your site from the ground up and then carry out some testing to make sure you have got it right. You can set the CSP policy to report-only mode, so that browsers don't actually block anything but do report CSP violation errors. Once you are happy that no violations are being reported, you can apply your policy. Finally, keep an eye out for CSP violation errors and get them fixed as soon as you see them.

Testing CSP Policies

The CSP Tester Chrome extension is an example of a tool you can use to apply CSP policies to your site and view the effects in the browser console window.

As I've mentioned before, the best way is to build CSP into your site as you build your site. You can use the report-uri directive to log any violations and get them fixed. You can also use the Content-Security-Policy-Report-Only HTTP header instead of Content-Security-Policy, to stop the browser from actually blocking anything if you are not confident in the level of testing you have done.

Conclusions

Wow, that was a long blog post. I wanted it to be as comprehensive as possible. I hope I've shown that now is the time to invest in implementing CSP and if you are developing a new site, then integrating CSP into it at an early stage will mean that you reap the benefits of a much greater level of security. The ASP.NET Core Boilerplate project template is a great place to start and will give you a working code example which tells a thousand words on its own.

C# 6.0 - Saving Developers From Themselves


What's New in C# 6.0

If you haven't already taken a look at what's new in C# 6.0, you should certainly read this article. This blog post is going to cover how C# 6.0 can help reduce the number of bugs in your code by giving you the tools to avoid common developer mistakes.

In my opinion, the changes introduced in C# 6.0 can be split into two separate groups. The first group of changes seems to be a declaration of war on curly braces ({}); you can now omit them in many cases. I personally am not too sure about this set of features: it reduces the lines of code you have to write a little and may save a few seconds, but at the cost of having to learn a new set of syntax. If you cast your mind back to being a newbie developer (or if you are one), lots of syntax to remember can be difficult to deal with.

This is a problem that C++ developers know well. C++ is a pretty old language but is still undergoing rapid development with C++11, 14 and beyond. It's got to the stage where there are so many ways to skin a cat in C++ that even experienced developers can be slowed down when looking at code using older patterns and paradigms. The C# caretakers need to be careful that each new feature is genuinely worth the effort and not just bloat.

The second set of features is what I am really interested in. These are features which will genuinely save you from yourself. They will stop developers making many common mistakes.

The nameof Operator

The nameof operator simply gives you the name of any variable, member or type you pass to it. You can take a look at the simple example below:

string obiwan;  
Console.WriteLine(nameof(obiwan));
// Prints obiwan

int kenobi = 2;
Console.WriteLine(nameof(kenobi));
// Prints kenobi

Argument Exceptions

So where can this help us? Well, I can think of a few examples, the first being throwing argument exceptions. Argument exceptions all take a parameter, which represents the name of the invalid parameter. In the past, we had to pass this as a string. The problem was that the parameter might get renamed and you might forget to update the string to reflect that.

// Before C# 6.0 - the parameter name is a magic string that can get out of sync when renaming.
public void FightCrime(string hero)
{
    if (hero == null)  
    {
        throw new ArgumentNullException("hero");
    }

    // Omitted crime fighting code...  
}

// With C# 6.0 - nameof keeps the exception in sync with the parameter name.
public void FightCrime(string hero)
{
    if (hero == null)  
    {
        throw new ArgumentNullException(nameof(hero));
    }

    // Omitted crime fighting code...  
}

With the second example, if you used Visual Studio to rename the hero parameter, then the hero in the nameof operator will also be updated.

INotifyPropertyChanged

This interface is notorious among developers doing WPF/Silverlight/WinRT/XAML development for requiring property names to be passed to it as strings. With the nameof operator, this becomes a thing of the past.

public class Ship : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;    

    private string name;

    public string Name
    {
        get { return this.name; }
        set
        {
            this.name = value;
            this.OnPropertyChanged(nameof(this.Name));
        }
    }

    protected virtual void OnPropertyChanged(string propertyName)
    {
        PropertyChangedEventHandler eventHandler = this.PropertyChanged;
        if (eventHandler != null)
        {
            eventHandler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
}

ASP.NET MVC

ASP.NET MVC makes massive use of strings everywhere, which is a real problem when you want to rename something. In fact, I've taken to using constants everywhere. It's more work to set up but in the long run it's much easier to maintain. Here is an example of how we can use nameof to create a link and do away with strings and constants:

@Html.ActionLink("Home", "Index", "Home")

@Html.ActionLink2("Home", nameof(HomeController.Index), nameof(HomeController))

public static MvcHtmlString ActionLink2(this HtmlHelper htmlHelper, string linkText, string actionName, string controllerName)
{
    // Trim the "Controller" suffix e.g. "HomeController" becomes "Home".
    return htmlHelper.ActionLink(linkText, actionName, controllerName.Substring(0, controllerName.Length - "Controller".Length));
}

The above example is a little contrived. In the real world, I would never use ActionLink; I use RouteLink instead. Naming your routes performs better and is just easier to understand when you have multiple routes with the same URL (for GET and POST requests).
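
A quick sketch of the named route approach (the route name "HomeIndex" is an assumption for illustration); first give the route a name when mapping it in RouteConfig:

routes.MapRoute(
    name: "HomeIndex",
    url: "",
    defaults: new { controller = "Home", action = "Index" });

Then generate links using the route name, rather than controller and action strings:

@Html.RouteLink("Home", "HomeIndex")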

String Interpolation

I think we've all used string.Format and got our arguments in the wrong positions or entered the index numbers incorrectly at some point in time. Well, that bug is now a thing of the past.

// Before C# 6.0 String Interpolation
string nameAndAge = string.Format("Name:{0}, Age:{1}", name, age);
// After C# 6.0 String Interpolation
string nameAndAge = $"Name:{name}, Age:{age}";

As you can see, you can now use your parameters directly in the strings with full syntax highlighting and renaming support too. In fact, the C# 6.0 code compiles down to a string.Format call behind the scenes.

The Null-Conditional Operator

Every C# developer has at some point stared at the text of a NullReferenceException and thought that it's a really rubbish message which leaves out vital information. In fact, there is a post on UserVoice asking Microsoft to improve their NullReferenceException messages. It turns out that Microsoft has thought of this. They haven't improved the message (they still should, please up-vote the UserVoice post) but they have introduced the null-conditional operator.

// Before C# 6.0
public string Truncate(string value, int length)
{
    string result = value;
    if (value != null)
    {
        result = value.Substring(0, Math.Min(value.Length, length));
    }
    return result;
}

// After C# 6.0
public string Truncate(string value, int length)
{
    // Returns null when value is null. Wow, look at all this code I didn't have to write!
    return value?.Substring(0, Math.Min(value.Length, length));
}
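
The null-conditional operator also tidies up raising events. As a sketch, the OnPropertyChanged method from the INotifyPropertyChanged example earlier can be shortened to:

protected virtual void OnPropertyChanged(string propertyName)
{
    // The temporary variable and explicit null check are no longer needed.
    this.PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
}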

Conclusions

As you can see, there is a common theme between two of the three C# 6.0 features I've picked above. They give us tools to better deal with strings, which have terrible IDE and language support; making a typo in a string gives us no compile time errors and Visual Studio doesn't help either. Each of these features lets us deal with strongly typed objects instead, which have full language and IDE support.

The null-conditional operator is another great tool to help mitigate really common but minor bugs that catch even the most experienced developers out. These are great features and should help stop the silly mistakes that all of us developers make. We are, after all, human.

Building RSS/Atom Feeds for ASP.NET MVC


What is an RSS/Atom Feed

An RSS or Atom feed is a great way to push site updates to users. Essentially, it's just an XML document which is constantly updated with fresh content and links.

There are numerous feed readers out there that all work in different ways, but most just aggregate feeds from several sites into a single reading list. When a user subscribes to your site's feed and adds it to their list of subscriptions, fresh content will appear in their reading list each time you update your feed.

Feed readers come in all shapes and sizes; even browsers have basic feed reading abilities. Here is a screen-shot of Firefox's bookmarks side-bar after adding the Visual Studio Magazine feed (go ahead and try it yourself in Firefox). The bookmarks under the Blogs folder update each time the feed updates.

Firefox Live Bookmarks Feed

Feed reading websites like Feedly and NewsBlur are fairly popular. Increasingly though, feed readers are actually just apps running on phones or tablets, and these can even raise notifications when the feed changes and there is fresh content to read. Services like Feedly and NewsBlur have their own apps too.

RSS vs Atom

The latest version of RSS is 2.0, while Atom is at 1.0. Atom 1.0 is a web standard and you can read the official IETF Atom 1.0 specification here. RSS is not a web standard; the specification is actually owned by Harvard University.

Atom was created specifically to address problems in RSS 2.0 and is the newer and better defined format. Both formats are now pretty ancient by web standards and enjoy widespread support. If you have a choice of format, go with Atom 1.0.

Atom 1.0 XML

So what does an Atom feed look like? Well, you can look at the official specification here, or there is a simple but fully featured example below.

<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-GB" xmlns:media="http://search.yahoo.com/mrss/" xmlns="http://www.w3.org/2005/Atom">
  <title type="text">ASP.NET Core Boilerplate</title>
  <subtitle type="text">This is the ASP.NET Core Boilerplate feed description.</subtitle>
  <id>3D797739-1DED-4DB8-B60B-1CA52D0AA1A4</id>
  <rights type="text">© 2015 - Rehan Saeed</rights>
  <updated>2015-06-24T15:54:21+01:00</updated>
  <category term="Blog" />
  <logo>http://example.com/icons/atom-logo-96x48.png</logo>
  <author>
    <name>Rehan Saeed</name>
    <uri>https://rehansaeed.com</uri>
    <email>example@email.com</email>
  </author>
  <contributor>
    <name>Rehan Saeed</name>
    <uri>https://rehansaeed.com</uri>
    <email>example@email.com</email>
  </contributor>
  <link rel="self" type="application/atom+xml" href="http://example.com/feed/" />
  <link rel="alternate" type="text/html" href="http://example.com/" />
  <link rel="hub" href="https://pubsubhubbub.appspot.com/" />
  <icon>http://example.com/icons/atom-icon-48x48.png</icon>
  <entry>
    <id>6139F098-2E59-4405-9BC7-0AAB4CF78E23</id>
    <title type="text">Item 1</title>
    <summary type="text">A summary of item 1</summary>
    <published>2015-06-24T15:54:21+01:00</published>
    <updated>2015-06-24T15:54:21+01:00</updated>
    <author>
      <name>Rehan Saeed</name>
      <uri>https://rehansaeed.com</uri>
      <email>example@email.com</email>
    </author>
    <contributor>
      <name>Rehan Saeed</name>
      <uri>https://rehansaeed.com</uri>
      <email>example@email.com</email>
    </contributor>
    <link rel="alternate" type="text/html" href="http://example.com/item1/" />
    <link rel="enclosure" type="image/png" href="http://example.com/item1/atom-icon-48x48.png" />
    <category term="Category 1" />
    <rights type="text">© 2015 - Rehan Saeed</rights>
    <media:thumbnail url="http://example.com/item1/atom-icon-48x48.png" width="48" height="48" />
  </entry>
  <entry>
    <id>927406DD-E8DC-41ED-8154-30DE91B0877A</id>
    <title type="text">Item 2</title>
    <summary type="text">A summary of item 2</summary>
    <published>2015-06-24T15:54:21+01:00</published>
    <updated>2015-06-24T15:54:21+01:00</updated>
    <author>
      <name>Rehan Saeed</name>
      <uri>https://rehansaeed.com</uri>
      <email>example@email.com</email>
    </author>
    <contributor>
      <name>Rehan Saeed</name>
      <uri>https://rehansaeed.com</uri>
      <email>example@email.com</email>
    </contributor>
    <link rel="alternate" type="text/html" href="http://example.com/item2/" />
    <link rel="enclosure" type="image/png" href="http://example.com/item2/atom-icon-48x48.png" />
    <category term="Category 2" />
    <rights type="text">© 2015 - Rehan Saeed</rights>
    <media:thumbnail url="http://example.com/item2/atom-icon-48x48.png" width="48" height="48" />
  </entry>
</feed>

At the root of the XML we have the feed element, which represents the Atom feed. Within that, there are various pieces of meta-data about the feed at the top, including:

  • title - The title of the feed.
  • subtitle - A short description or subtitle of the feed.
  • id - A unique ID for the feed. No other feed on the internet should have the same ID.
  • rights - Copyright information.
  • updated - When the feed was last updated.
  • category - Zero or more categories the feed belongs to.
  • logo - A wide 2:1 ratio image representing the feed.
  • author - Zero or more authors of the feed.
  • contributor - Zero or more contributors of the feed.
  • link rel="self" - A link to the feed itself.
  • link rel="alternate" - A link to an alternative representation of the feed.
  • link rel="hub" - A link to the PubSubHubbub hub. I'll talk more about this further on.
  • icon - A square 1:1 ratio image representing the feed.

The entry elements are where it gets interesting; these are the actual 'things' in your feed that you are describing. Each entry has meta-data which looks very similar to the meta-data used to describe the feed itself.

  • id - A unique identifier to the entry. This can be a database row ID, it doesn't have to be a GUID.
  • title - The title of the entry.
  • summary - A short summary for what the entry is about.
  • published - When the entry was published.
  • updated - When the entry was last changed.
  • author - Zero or more authors of the entry.
  • contributor - Zero or more contributors of the entry.
  • link rel="alternate" - A link to an alternative representation of the entry.
  • link rel="enclosure" - An image representing the entry.
  • category - The category of the entry.
  • rights - Some copyright information.
  • media:thumbnail - A thumbnail representing the entry. This is a non-standard extension to the Atom 1.0 specification created by Yahoo but is common enough to be used here.

One thing to note is that all of the links are full absolute URL's. Relative URL's are allowed but you have to specify a single base URI which is prepended to all URL's. Unfortunately, this feature is buggy in Firefox and so should not be used.

Implementing an Atom Feed

The Windows Communication Foundation (WCF) team at Microsoft has kindly implemented the SyndicationFeed class, giving us a nice API with which to generate the above Atom 1.0 XML (in actual fact, this class also represents an RSS 2.0 feed and can be used to generate RSS 2.0 XML too). Since it was the WCF team at Microsoft who built it, they put it in the System.ServiceModel namespace. It doesn't quite feel right there and will probably be split out into its own namespace (indeed, I've raised this very question for the new DNX Core version of the .NET Framework, which is currently missing SyndicationFeed). Creating a new feed is as simple as this:

SyndicationFeed feed = new SyndicationFeed()
{
    // id (Required) - The feed universally unique identifier.
    Id = "3D797739-1DED-4DB8-B60B-1CA52D0AA1A4",
    // title (Required) - Contains a human readable title for the feed. Often the same as the title of the 
    //                    associated website. This value should not be blank.
    Title = SyndicationContent.CreatePlaintextContent("ASP.NET Core Boilerplate"),
    // items (Required) - The entries to add to the feed. I'll cover how to do this further on.
    Items = this.GetItems(),
    // subtitle (Recommended) - Contains a human-readable description or subtitle for the feed.
    Description = SyndicationContent.CreatePlaintextContent(
        "This is the ASP.NET Core Boilerplate feed description."),
    // updated (Optional) - Indicates the last time the feed was modified in a significant way.
    LastUpdatedTime = DateTimeOffset.Now,
    // logo (Optional) - Identifies a larger image which provides visual identification for the feed. 
    //                   Images should be twice as wide as they are tall.
    ImageUrl = new Uri("http://example.com/icons/atom-logo-96x48.png"),
    // rights (Optional) - Conveys information about rights, e.g. copyrights, held in and over the feed.
    Copyright = SyndicationContent.CreatePlaintextContent(
        string.Format("© {0} - {1}", DateTime.Now.Year, "Rehan Saeed")),
    // lang (Optional) - The language of the feed.
    Language = "en-GB",
    // generator (Optional) - Identifies the software used to generate the feed, for debugging and other 
    //                        purposes. Do not put in anything that identifies the technology you are using.
    // Generator = "Sample Code",
    // base (Buggy) - Add the full base URL to the site so that all other links can be relative. This is 
    //                great, except some feed readers are buggy with it, INCLUDING FIREFOX!!! 
    //                (See https://bugzilla.mozilla.org/show_bug.cgi?id=480600).
    // BaseUri = new Uri("http://example.com")
};

// self link (Required) - The URL for the syndication feed.
feed.Links.Add(SyndicationLink.CreateSelfLink(
    new Uri("http://example.com/feed/"), 
    ContentType.Atom));

// alternate link (Recommended) - The URL for the web page showing the same data as the syndication feed.
feed.Links.Add(SyndicationLink.CreateAlternateLink(
    new Uri("http://example.com"), 
    ContentType.Html));

// hub link (Recommended) - The URL for the PubSubHubbub hub. Used to push new entries to subscribers 
//                          instead of making them poll the feed. See feed updated method below.
feed.Links.Add(new SyndicationLink(new Uri("https://pubsubhubbub.appspot.com/"), "hub", null, null, 0));

// author (Recommended) - Names one author of the feed. A feed may have multiple author elements. A feed 
//                        must contain at least one author element unless all of the entry elements contain 
//                        at least one author element.
feed.Authors.Add(
    new SyndicationPerson()
    {
        // name (Required) - conveys a human-readable name for the person.
        Name = "Rehan Saeed",
        // uri (Optional) - contains a home page for the person.
        Uri = "https://rehansaeed.com",
        // email (Optional) - contains an email address for the person.
        Email = "example@email.com"
    });

// category (Optional) - Specifies a category that the feed belongs to. A feed may have multiple category 
//                       elements.
feed.Categories.Add(new SyndicationCategory("CategoryName"));

// contributor (Optional) - Names one contributor to the feed. An feed may have multiple contributor 
//                          elements.
feed.Contributors.Add(
    new SyndicationPerson()
    {
        Name = "Rehan Saeed",
        Uri = "https://rehansaeed.com",
        Email = "example@email.com"
    });

// icon (Optional) - Identifies a small image which provides iconic visual identification for the feed. 
//                   Icons should be square.
feed.SetIcon("http://example.com/icons/atom-icon-48x48.png");

// Add the Yahoo Media namespace (xmlns:media="http://search.yahoo.com/mrss/") to the Atom feed. 
// This gives us extra abilities, like the ability to give thumbnail images to entries. 
// See http://www.rssboard.org/media-rss for more information.
feed.AddYahooMediaNamespace();

Unfortunately, the property to set the icon does not exist on the SyndicationFeed class, even though the icon element is part of the official specification. Luckily for you, I have created a quick extension method (usage shown above) which allows us to set the icon.

I have also created an extension method to add a Yahoo Media thumbnail to an Atom entry. This is a non-standard extension but worth the effort. Using non-standard extensions requires adding a namespace to the feed element in the XML, which is what the AddYahooMediaNamespace method does.

The extension methods are shown below. They use extensibility points on the SyndicationFeed which allow us to augment its functionality.

/// <summary>
/// <see cref="SyndicationFeed"/> extension methods.
/// </summary>
public static class SyndicationFeedExtensions
{
    private const string YahooMediaNamespacePrefix = "media";
    private const string YahooMediaNamespace = "http://search.yahoo.com/mrss/";

    /// <summary>
    /// Adds a namespace to the specified feed.
    /// </summary>
    /// <param name="feed">The syndication feed.</param>
    /// <param name="namespacePrefix">The namespace prefix.</param>
    /// <param name="xmlNamespace">The XML namespace.</param>
    public static void AddNamespace(this SyndicationFeed feed, string namespacePrefix, string xmlNamespace)
    {
        feed.AttributeExtensions.Add(
            new XmlQualifiedName(namespacePrefix, XNamespace.Xmlns.ToString()), 
            xmlNamespace);
    }

    /// <summary>
    /// Adds the yahoo media namespace to the specified feed.
    /// </summary>
    /// <param name="feed">The syndication feed.</param>
    public static void AddYahooMediaNamespace(this SyndicationFeed feed)
    {
        AddNamespace(feed, YahooMediaNamespacePrefix, YahooMediaNamespace);
    }

    /// <summary>
    /// Gets the icon URL for the feed.
    /// </summary>
    /// <param name="feed">The syndication feed.</param>
    /// <returns>The icon URL.</returns>
    public static string GetIcon(this SyndicationFeed feed)
    {
        SyndicationElementExtension iconExtension = feed.ElementExtensions.FirstOrDefault(
            x => string.Equals(x.OuterName, "icon", StringComparison.OrdinalIgnoreCase));
        // Guard against feeds that do not have an icon element.
        return iconExtension == null ? null : iconExtension.GetObject<string>();
    }

    /// <summary>
    /// Sets the icon URL for the feed.
    /// </summary>
    /// <param name="feed">The syndication feed.</param>
    /// <param name="iconUrl">The icon URL.</param>
    public static void SetIcon(this SyndicationFeed feed, string iconUrl)
    {
        feed.ElementExtensions.Add(new SyndicationElementExtension("icon", null, iconUrl));
    }

    /// <summary>
    /// Sets the Yahoo Media thumbnail for the feed entry.
    /// </summary>
    /// <param name="item">The feed entry.</param>
    /// <param name="url">The thumbnail URL.</param>
    /// <param name="width">The optional width of the thumbnail image.</param>
    /// <param name="height">The optional height of the thumbnail image.</param>
    public static void SetThumbnail(this SyndicationItem item, string url, int? width, int? height)
    {
        XNamespace ns = YahooMediaNamespace;
        item.ElementExtensions.Add(new SyndicationElementExtension(
            new XElement(
                ns + "thumbnail",
                new XAttribute("url", url),
                width.HasValue ? new XAttribute("width", width) : null,
                height.HasValue ? new XAttribute("height", height) : null)));
    }
}

Creating feed entries is just as simple and is done using the SyndicationItem class. An example of creating the first entry is shown below.

SyndicationItem item = new SyndicationItem()
{
    // id (Required) - Identifies the entry using a universally unique and permanent URI. Two entries 
    //                 in a feed can have the same value for id if they represent the same entry at 
    //                 different points in time.
    Id = "6139F098-2E59-4405-9BC7-0AAB4CF78E23",
    // title (Required) - Contains a human readable title for the entry. This value should not be blank.
    Title = SyndicationContent.CreatePlaintextContent("Item 1"),
    // description (Recommended) - A summary of the entry.
    Summary = SyndicationContent.CreatePlaintextContent("A summary of item 1"),
    // updated (Optional) - Indicates the last time the entry was modified in a significant way. This 
    //                      value need not change after a typo is fixed, only after a substantial 
    //                      modification. Generally, different entries in a feed will have different 
    //                      updated timestamps.
    LastUpdatedTime = DateTimeOffset.Now,
    // published (Optional) - Contains the time of the initial creation or first availability of the entry.
    PublishDate = DateTimeOffset.Now,
    // rights (Optional) - Conveys information about rights, e.g. copyrights, held in and over the entry.
    Copyright = new TextSyndicationContent(
        string.Format("© {0} - {1}", DateTime.Now.Year, "Rehan Saeed")),
};

// link (Recommended) - Identifies a related Web page. An entry must contain an alternate link if there 
//                      is no content element.
item.Links.Add(SyndicationLink.CreateAlternateLink(
    new Uri("http://example.com/item1"), 
    ContentType.Html));
// AND/OR
// Text content  (Optional) - Contains or links to the complete content of the entry. Content must be 
//                            provided if there is no alternate link.
// item.Content = SyndicationContent.CreatePlaintextContent("The actual plain text content of the entry");
// HTML content (Optional) - Content can be plain text or HTML. Here is a HTML example.
// item.Content = SyndicationContent.CreateHtmlContent("The actual HTML content of the entry");

// author (Optional) - Names one author of the entry. An entry may have multiple authors. An entry must 
//                     contain at least one author element unless there is an author element in the 
//                     enclosing feed, or there is an author element in the enclosed source element.
item.Authors.Add(this.GetPerson());

// contributor (Optional) - Names one contributor to the entry. An entry may have multiple contributor elements.
item.Contributors.Add(this.GetPerson());

// category (Optional) - Specifies a category that the entry belongs to. A entry may have multiple 
//                       category elements.
item.Categories.Add(new SyndicationCategory("Category 1"));

// link - Add additional links to related images, audio or video like so.
item.Links.Add(SyndicationLink.CreateMediaEnclosureLink(
    new Uri("http://example.com/item1/atom-icon-48x48.png"), 
    ContentType.Png, 
    0));

// media:thumbnail - Add a Yahoo Media thumbnail for the entry. See http://www.rssboard.org/media-rss 
//                   for more information.
item.SetThumbnail("http://example.com/item1/atom-icon-48x48.png", 48, 48);

items.Add(item);

Now, it's actually possible to include a full HTML page inside a feed entry. Alternatively, you can provide plain text content or, as I have done, provide a link to the full content. I have shown how to do all three in the comments above.
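
For completeness, here is a minimal sketch of the GetItems method referenced when the feed was created earlier (where the entries actually come from, such as a database query, is an assumption left out here):

private List<SyndicationItem> GetItems()
{
    List<SyndicationItem> items = new List<SyndicationItem>();

    // Build one SyndicationItem per entry exactly as shown above and add each
    // one to the list, typically from your database or content store.

    return items;
}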

The next step is to actually reply to the client with an HTTP response containing the Atom 1.0 XML. Although Atom is just XML, it has its own specific schema and its own MIME type, application/atom+xml. Furthermore, the XML must be returned using the UTF-8 character encoding as per the standard. So here is our controller's action returning the feed:

[OutputCache(Duration = 86400)]
[Route("feed", Name = "GetFeed")]
public ActionResult Feed()
{
    SyndicationFeed feed = this.feedService.GetFeed();
    return new AtomActionResult(feed);
}

The above controller action is super simple; we take our SyndicationFeed and return it in a new AtomActionResult, which is where all the magic happens. We also cache the response for a day, which is great for performance if your feed does not change very often. So what is AtomActionResult? Well, here is the code:

/// <summary>
/// Represents a class that is used to render an Atom 1.0 feed by using an <see cref="SyndicationFeed"/> instance 
/// representing the feed.
/// </summary>
public sealed class AtomActionResult : ActionResult
{
    private readonly SyndicationFeed syndicationFeed;

    /// <summary>
    /// Initializes a new instance of the <see cref="AtomActionResult"/> class.
    /// </summary>
    /// <param name="syndicationFeed">The Atom 1.0 <see cref="SyndicationFeed" />.</param>
    public AtomActionResult(SyndicationFeed syndicationFeed)
    {
        this.syndicationFeed = syndicationFeed;
    }

    /// <summary>
    /// Executes the call to the ActionResult method and returns the created feed to the output response.
    /// </summary>
    /// <param name="context">The context in which the result is executed. The context information includes the 
    /// controller, HTTP content, request context, and route data.</param>
    public override void ExecuteResult(ControllerContext context)
    {
        context.HttpContext.Response.ContentType = "application/atom+xml";
        Atom10FeedFormatter feedFormatter = new Atom10FeedFormatter(this.syndicationFeed);
        XmlWriterSettings xmlWriterSettings = new XmlWriterSettings();
        xmlWriterSettings.Encoding = Encoding.UTF8;

        if (context.HttpContext.IsDebuggingEnabled)
        {
            // Indent the XML for easier viewing but only in Debug mode. In Release mode, everything is output on 
            // one line for best performance.
            xmlWriterSettings.Indent = true;
        }

        using (XmlWriter xmlWriter = XmlWriter.Create(context.HttpContext.Response.Output, xmlWriterSettings))
        {
            feedFormatter.WriteTo(xmlWriter);
        }
    }
}

The above code is writing out the XML to the HTTP response in UTF-8 encoding and with the application/atom+xml MIME type. By default the XML is written out all in one line which is good for performance but not very good for legibility, so we also detect whether the application is being debugged and if so, indent the XML for better legibility.

After all our hard work, we can now navigate to the controller action and view our feed. Here is Internet Explorer's view of our Atom feed:

Atom Feed Example in Internet Explorer

Images

RSS and Atom have been around for over a decade now and there is precious little information out there on how to create a feed. One of the areas that lacked information was the logo and icon images. All the specification says is that the ratios of the images should be a 2:1 rectangle and a 1:1 square respectively.

My advice, and what I ended up doing, is to look at various example feeds on the internet and copy the image sizes they use. I ended up with images of 48x48 and 96x48, which seemed to be common sizes.

Adding a 'Subscribe to this page' Button

Firefox has a feature called 'Subscribe to this page' which is a button that users can add to their toolbar (The button is enabled by default on older versions of Firefox). The button detects whether the current page links to an RSS/Atom feed and if it does, the user can click on it to subscribe to the feed directly. Here is a quick screen-shot of the button:

FireFox Subscribe to this Page Button

To add this feature, we need to place a link tag in the head of our page pointing to the Atom feed, like so:

<link href="http://localhost/feed" rel="alternate" title="ASP.NET Core Boilerplate Feed" type="application/atom+xml">
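
Rather than hard-coding the href, the URL can be generated from the GetFeed route name given to the feed action earlier. A sketch for a Razor layout:

<link href="@Url.RouteUrl("GetFeed")" rel="alternate" title="ASP.NET Core Boilerplate Feed" type="application/atom+xml">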

This is a pretty minor feature I admit but it has potential. By doing this, we are linking our page to the Atom feed. This can be read by search engines too, so potentially there could be some benefit in terms of Search Engine Optimization (SEO). Of course this is impossible to prove as search engines jealously guard how they manage their search rankings.

PubSubHubbub

The problem with feeds is that you have to pull the information from them. You are never notified of new changes to the feed, so clients have to constantly poll the feed to check for any new feed entries.

This is the problem that PubSubHubbub (I know, it has a terrible name!) solves. It was developed by Google and is actually an open standard, with the latest version being 0.4 at the time of writing.

There are already major platforms supporting it. Mostly they are Google products, as you would expect, but WordPress, which powers a third of the world's websites, also supports it.

At the heart of it, you now have a hub that knows how to speak the PubSubHubbub standard language. When a feed is updated with a new entry, the website sends a message to the hub to tell it that the feed has been updated. Clients can then register for updates with the hub and get notified instantly when there is an update.

The coolest thing though is that all of this is super easy to implement, since Google provides us with a hub that we can use and we don't need to write our own. We just need to add a line of XML in our Atom feed telling clients that we support PubSubHubbub and the URL to the hub we want to use:

<link rel="hub" href="https://pubsubhubbub.appspot.com/" />

Now when there is an update to the feed, we need to publish that update to the hub linked to above. We do that by calling the simple method below:

/// <summary>
/// Publishes the fact that the feed has updated to subscribers using the PubSubHubbub v0.4 protocol.
/// </summary>
public Task PublishUpdate()
{
    HttpClient httpClient = new HttpClient();
    return httpClient.PostAsync(
        "https://pubsubhubbub.appspot.com/", 
        new FormUrlEncodedContent(
            new[]
            {
                new KeyValuePair<string, string>("hub.mode", "publish"),
                new KeyValuePair<string, string>(
                    "hub.url", 
                    "http://localhost/feed")
            }));
}

It's as simple as that from the publisher's side. On the client side, subscribing to changes in the feed is only a little more complicated. I won't cover that here, but you can find out more by reading the official specification.

Feed Paging

The Atom specification actually outlines how you can add paging to your feed. This is a great way to split up your feed if you are worried that it consumes too much bandwidth. Adding paging involves inserting the following links into the top of your feed. The links are the first, last, next and previous pages of your feed. Obviously, if you don't have a next or previous page, those links can be omitted.

<link rel="first" href="http://example.com/feed"/>
<link rel="next" href="http://example.com/feed?page=4"/>
<link rel="previous" href="http://example.com/feed?page=2"/>
<link rel="last" href="http://example.com/feed?page=10"/>

Here is the corresponding code to add the above links:

feed.Links.Add(new SyndicationLink(new Uri("http://example.com/feed"), "first", null, null, 0));
feed.Links.Add(new SyndicationLink(new Uri("http://example.com/feed?page=10"), "last", null, null, 0));

if (hasPreviousPage)
{
    feed.Links.Add(new SyndicationLink(new Uri("http://example.com/feed?page=2"), "previous", null, null, 0));
}

if (hasNextPage)
{
    feed.Links.Add(new SyndicationLink(new Uri("http://example.com/feed?page=4"), "next", null, null, 0));
}

Feed Validation

Once you are done building your feed and have published it online, don't forget to check FeedValidator.org to ensure that your feed conforms to the Atom 1.0 specification.

Conclusion

As always, you can look at a full working example of all of this code on the ASP.NET Core Boilerplate GitHub page.

Canonical URL's for ASP.NET MVC


The aim of this post is to give your site better search engine rankings using a few Search Engine Optimization (SEO) techniques. Take a look at the URL's below and see if you can spot the differences between them:

  1. http://example.com/one/two/
  2. https://example.com/one/two/
  3. http://example.com/one/two
  4. http://example.com/One/Two

The second one has an HTTPS scheme, the third omits the trailing slash and the fourth contains mixed-case characters. All of these URL's point to the same resource, but it turns out that search engines treat each one as unique and different. Search engines give each URL a page rank, which determines where the resource will show up in the search results. Another term you will hear quite often is 'link juice', which describes how page rank flows between pages and websites.

If your site exposes the above four different URL's to a single resource, your link juice is being spread across each one and, as a result, your page rank suffers.

The Canonical Link Tag

One way to solve this problem is to add a canonical link tag to the head of your HTML page. This tells search engines what the canonical (actual) URL to the page is. The link tag contains a URL to your preferred URL for the page.
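
For example, every variation of the URL's above could declare the same preferred URL, like so:

<link rel="canonical" href="https://example.com/one/two/">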

One thing you must decide early on is your preferred URL for every page. You must ask yourself the following questions and use the resulting URL in your canonical link tag.

  1. Do I prefer this page to be HTTP or HTTPS? This is yet another reason to go with HTTPS across your entire site.
  2. Should the URL end with a trailing slash? This is often preferred over omitting it but it's a matter of preference.
  3. Should I allow a mix of upper-case and lower-case characters? Most sites choose to go with all lower-case characters.

When search engines follow a link to your page, regardless of which URL they followed to get to your page, all of the link juice will be given to the URL specified in your canonical link tag. Google goes into a lot more depth about this tag here.

301 Permanent Redirects

Unfortunately, using the canonical link tag is not the recommended approach. The intention is that it should only be used to retrofit older websites so that they can become optimized for search engines.

According to both Google and Bing, the recommended approach if you visit a non-preferred format of your pages URL is to perform a 301 permanent redirect to the preferred canonical URL. According to them, you only lose a tiny amount of link juice by doing a 301 permanent redirect.
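
In MVC, the redirect itself is a one-liner; the filter shown later in this post simply wraps it with the logic to detect non-canonical URL's. A sketch:

// The second argument makes this a 301 permanent redirect rather than a temporary 302.
return new RedirectResult("http://example.com/one/two/", true);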

Canonical URL's in MVC

ASP.NET MVC 5 and ASP.NET Core have two settings you can use to automatically create canonical URL's every time you generate URL's.

// ASP.NET MVC 5
// Append a trailing slash to all URL's.
RouteTable.Routes.AppendTrailingSlash = true;
// Ensure that all URL's are lower-case.
RouteTable.Routes.LowercaseUrls = true;

// ASP.NET Core
services.ConfigureRouting(
    routeOptions => 
    { 
        // Append a trailing slash to all URL's.
        routeOptions.AppendTrailingSlash = true;
        // Ensure that all URL's are lower-case.
        routeOptions.LowercaseUrls = true;
    });

Once you apply these settings and use the UrlHelper to generate all your URL's, you will see that across your site all URL's are lower-case and all end with a trailing slash (this is just my personal preference; you may not like trailing slashes).
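
For example, with both settings applied and assuming the default route, a link generated in a Razor view comes out canonical without any extra effort (the controller and action names here are illustrative):

@* Generates /home/about/ instead of /Home/About *@
@Url.Action("About", "Home")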

This means that within your site, no 301 permanent redirects to canonical URL's are required because the URL's are already canonical. However, this just solves part of the problem. What about external links to your site? What happens when people copy and paste your site and delete or add a trailing slash? What happens when someone types in a link to your site and puts in an upper-case character? The fact is you have no control over external links and when search engine crawlers follow those non-canonical links you will be losing valuable link juice.

301 Permanent Redirects in MVC

Enter the RedirectToCanonicalUrlAttribute. This is an MVC filter you can apply, which will check that the URL from each request is canonical. If it is, it does nothing and MVC returns the view in its response as normal. If the URL is not canonical, it generates the canonical URL based on the above MVC settings and returns a 301 permanent redirect response to the client. The client can then make another request to the correct canonical URL.

You can take a look at the source code for the RedirectToCanonicalUrlAttribute, NoTrailingSlashAttribute and NoLowercaseQueryStringAttribute filters (I shall explain the latter two in a minute) for MVC 5 below, or the ASP.NET Core version here.

/// <summary>
/// To improve Search Engine Optimization SEO, there should only be a single URL for each resource. Case 
/// differences and/or URL's with/without trailing slashes are treated as different URL's by search engines. This 
/// filter redirects all non-canonical URL's based on the settings specified to their canonical equivalent. 
/// Note: Non-canonical URL's are not generated by this site template, it is usually external sites which are 
/// linking to your site but have changed the URL case or added/removed trailing slashes.
/// (See Google's comments at http://googlewebmastercentral.blogspot.co.uk/2010/04/to-slash-or-not-to-slash.html
/// and Bing's at http://blogs.bing.com/webmaster/2012/01/26/moving-content-think-301-not-relcanonical).
/// </summary>
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class, Inherited = true, AllowMultiple = false)]
public class RedirectToCanonicalUrlAttribute : FilterAttribute, IAuthorizationFilter
{
    private const char QueryCharacter = '?';
    private const char SlashCharacter = '/';

    private readonly bool appendTrailingSlash;
    private readonly bool lowercaseUrls;

    /// <summary>
    /// Initializes a new instance of the <see cref="RedirectToCanonicalUrlAttribute" /> class.
    /// </summary>
    /// <param name="appendTrailingSlash">If set to <c>true</c> append trailing slashes, otherwise strip trailing 
    /// slashes.</param>
    /// <param name="lowercaseUrls">If set to <c>true</c> lower-case all URL's.</param>
    public RedirectToCanonicalUrlAttribute(
        bool appendTrailingSlash, 
        bool lowercaseUrls)
    {
        this.appendTrailingSlash = appendTrailingSlash;
        this.lowercaseUrls = lowercaseUrls;
    } 

    /// <summary>
    /// Gets a value indicating whether to append trailing slashes.
    /// </summary>
    /// <value>
    /// <c>true</c> if appending trailing slashes; otherwise, strip trailing slashes.
    /// </value>
    public bool AppendTrailingSlash
    {
        get { return this.appendTrailingSlash; }
    }

    /// <summary>
    /// Gets a value indicating whether to lower-case all URL's.
    /// </summary>
    /// <value>
    /// <c>true</c> if lower-casing URL's; otherwise, <c>false</c>.
    /// </value>
    public bool LowercaseUrls
    {
        get { return this.lowercaseUrls; }
    }

    /// <summary>
    /// Determines whether the HTTP request contains a non-canonical URL using <see cref="TryGetCanonicalUrl"/>, 
    /// if it doesn't calls the <see cref="HandleNonCanonicalRequest"/> method.
    /// </summary>
    /// <param name="filterContext">An object that encapsulates information that is required in order to use the 
    /// <see cref="RedirectToCanonicalUrlAttribute"/> attribute.</param>
    /// <exception cref="ArgumentNullException">The <paramref name="filterContext"/> parameter is <c>null</c>.</exception>
    public virtual void OnAuthorization(AuthorizationContext filterContext)
    {
        if (filterContext == null)
        {
            throw new ArgumentNullException(nameof(filterContext));
        }

        if (string.Equals(filterContext.HttpContext.Request.HttpMethod, "GET", StringComparison.Ordinal))
        {
            string canonicalUrl;
            if (!this.TryGetCanonicalUrl(filterContext, out canonicalUrl))
            {
                this.HandleNonCanonicalRequest(filterContext, canonicalUrl);
            }
        }
    }

    /// <summary>
    /// Determines whether the specified URl is canonical and if it is not, outputs the canonical URL.
    /// </summary>
    /// <param name="filterContext">An object that encapsulates information that is required in order to use the 
    /// <see cref="RedirectToCanonicalUrlAttribute" /> attribute.</param>
    /// <param name="canonicalUrl">The canonical URL.</param>
    /// <returns><c>true</c> if the URL is canonical, otherwise <c>false</c>.</returns>
    protected virtual bool TryGetCanonicalUrl(AuthorizationContext filterContext, out string canonicalUrl)
    {
        bool isCanonical = true;

        Uri url = filterContext.HttpContext.Request.Url;
        canonicalUrl = url.ToString();
        int queryIndex = canonicalUrl.IndexOf(QueryCharacter);

        // If we are not dealing with the home page. Note, the home page is a special case and it doesn't matter
        // if there is a trailing slash or not. Both will be treated as the same by search engines.
        if (url.AbsolutePath.Length > 1)
        {
            if (queryIndex == -1)
            {
                bool hasTrailingSlash = canonicalUrl[canonicalUrl.Length - 1] == SlashCharacter;

                if (this.appendTrailingSlash)
                {
                    // Append a trailing slash to the end of the URL.
                    if (!hasTrailingSlash && !this.HasNoTrailingSlashAttribute(filterContext))
                    {
                        canonicalUrl += SlashCharacter;
                        isCanonical = false;
                    }
                }
                else
                {
                    // Trim a trailing slash from the end of the URL.
                    if (hasTrailingSlash)
                    {
                        canonicalUrl = canonicalUrl.TrimEnd(SlashCharacter);
                        isCanonical = false;
                    }
                }
            }
            else
            {
                bool hasTrailingSlash = canonicalUrl[queryIndex - 1] == SlashCharacter;

                if (this.appendTrailingSlash)
                {
                    // Append a trailing slash to the end of the URL but before the query string.
                    if (!hasTrailingSlash && !this.HasNoTrailingSlashAttribute(filterContext))
                    {
                        canonicalUrl = canonicalUrl.Insert(queryIndex, SlashCharacter.ToString());
                        isCanonical = false;
                    }
                }
                else
                {
                    // Trim a trailing slash to the end of the URL but before the query string.
                    if (hasTrailingSlash)
                    {
                        canonicalUrl = canonicalUrl.Remove(queryIndex - 1, 1);
                        isCanonical = false;
                    }
                }
            }
        }

        if (this.lowercaseUrls)
        {
            foreach (char character in canonicalUrl)
            {
                if (this.HasNoLowercaseQueryStringAttribute(filterContext) && queryIndex != -1)
                {
                    if (character == QueryCharacter)
                    {
                        break;
                    }

                    if (char.IsUpper(character))
                    {
                        canonicalUrl = canonicalUrl.Substring(0, queryIndex).ToLower() +
                            canonicalUrl.Substring(queryIndex, canonicalUrl.Length - queryIndex);
                        isCanonical = false;
                        break;
                    }
                }
                else
                {
                    if (char.IsUpper(character))
                    {
                        canonicalUrl = canonicalUrl.ToLower();
                        isCanonical = false;
                        break;
                    }
                }
            }
        }

        return isCanonical;
    }

    /// <summary>
    /// Handles HTTP requests for URL's that are not canonical. Performs a 301 Permanent Redirect to the canonical URL.
    /// </summary>
    /// <param name="filterContext">An object that encapsulates information that is required in order to use the 
    /// <see cref="RedirectToCanonicalUrlAttribute" /> attribute.</param>
    /// <param name="canonicalUrl">The canonical URL.</param>
    protected virtual void HandleNonCanonicalRequest(AuthorizationContext filterContext, string canonicalUrl)
    {
        filterContext.Result = new RedirectResult(canonicalUrl, true);
    }

    /// <summary>
    /// Determines whether the specified action or its controller has the <see cref="NoTrailingSlashAttribute"/> 
    /// attribute specified.
    /// </summary>
    /// <param name="filterContext">The filter context.</param>
    /// <returns><c>true</c> if a <see cref="NoTrailingSlashAttribute"/> attribute is specified, otherwise 
    /// <c>false</c>.</returns>
    protected virtual bool HasNoTrailingSlashAttribute(AuthorizationContext filterContext)
    {
        return filterContext.ActionDescriptor.IsDefined(typeof(NoTrailingSlashAttribute), false) ||
            filterContext.ActionDescriptor.ControllerDescriptor.IsDefined(typeof(NoTrailingSlashAttribute), false);
    }

    /// <summary>
    /// Determines whether the specified action or its controller has the <see cref="NoLowercaseQueryStringAttribute"/> 
    /// attribute specified.
    /// </summary>
    /// <param name="filterContext">The filter context.</param>
    /// <returns><c>true</c> if a <see cref="NoLowercaseQueryStringAttribute"/> attribute is specified, otherwise 
    /// <c>false</c>.</returns>
    protected virtual bool HasNoLowercaseQueryStringAttribute(AuthorizationContext filterContext)
    {
        return filterContext.ActionDescriptor.IsDefined(typeof(NoLowercaseQueryStringAttribute), false) ||
            filterContext.ActionDescriptor.ControllerDescriptor.IsDefined(typeof(NoLowercaseQueryStringAttribute), false);
    }
}

/// <summary>
/// Requires that a HTTP request does not contain a trailing slash. If it does, return a 404 Not Found. This is 
/// useful if you are dynamically generating something which acts like it's a file on the web server. 
/// E.g. /Robots.txt/ should not have a trailing slash and should be /Robots.txt. Note, that we also don't care if 
/// it is upper-case or lower-case in this instance.
/// </summary>
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class, Inherited = true, AllowMultiple = false)]
public class NoTrailingSlashAttribute : FilterAttribute, IAuthorizationFilter
{
    private const char QueryCharacter = '?';
    private const char SlashCharacter = '/';

    /// <summary>
    /// Determines whether a request contains a trailing slash and, if it does, calls the 
    /// <see cref="HandleTrailingSlashRequest"/> method.
    /// </summary>
    /// <param name="filterContext">An object that encapsulates information that is required in order to use the 
    /// <see cref="RequireHttpsAttribute"/> attribute.</param>
    /// <exception cref="ArgumentNullException">The <paramref name="filterContext"/> parameter is null.</exception>
    public virtual void OnAuthorization(AuthorizationContext filterContext)
    {
        if (filterContext == null)
        {
            throw new ArgumentNullException(nameof(filterContext));
        }

        string canonicalUrl = filterContext.HttpContext.Request.Url.ToString();
        int queryIndex = canonicalUrl.IndexOf(QueryCharacter);

        if (queryIndex == -1)
        {
            if (canonicalUrl[canonicalUrl.Length - 1] == SlashCharacter)
            {
                this.HandleTrailingSlashRequest(filterContext);
            }
        }
        else
        {
            if (canonicalUrl[queryIndex - 1] == SlashCharacter)
            {
                this.HandleTrailingSlashRequest(filterContext);
            }
        }
    }

    /// <summary>
    /// Handles HTTP requests that have a trailing slash but are not meant to.
    /// </summary>
    /// <param name="filterContext">An object that encapsulates information that is required in order to use the 
    /// <see cref="RequireHttpsAttribute"/> attribute.</param>
    protected virtual void HandleTrailingSlashRequest(AuthorizationContext filterContext)
    {
        filterContext.Result = new HttpNotFoundResult();
    }
}

/// <summary>
/// Ensures that a HTTP request URL can contain query string parameters with both upper-case and lower-case 
/// characters.
/// </summary>
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class, Inherited = true, AllowMultiple = false)]
public class NoLowercaseQueryStringAttribute : FilterAttribute
{
}

Adding the RedirectToCanonicalUrlAttribute filter is easy. You can add it to the global filters collection so all requests will be handled by it like so:

GlobalFilters.Filters.Add(new RedirectToCanonicalUrlAttribute(
    RouteTable.Routes.AppendTrailingSlash, 
    RouteTable.Routes.LowercaseUrls));

And that's it! It's as simple as that. Now there are two special cases, which is where the NoTrailingSlashAttribute and NoLowercaseQueryStringAttribute filters come in.

Special Case 1

Say you want to have the following action method, where visiting http://example.com/robots.txt returns a text result. We want the client to think it's just visiting a static robots.txt file, but in reality we are dynamically generating it (one reason for doing this is that a robots.txt file must contain an absolute URL to the site-map, and you want the UrlHelper to handle that, no matter what domain the site is running under).

[NoTrailingSlash]
[Route("robots.txt")]
public ContentResult RobotsText()
{
    string content = this.robotsService.GetRobotsText();
    return this.Content(content, ContentType.Text, Encoding.UTF8);
}

Adding a trailing slash to robots.txt would just be weird. Also, the last thing you want to do when search engines try to visit http://example.com/robots.txt is to 301 permanent redirect them to http://example.com/robots.txt/. So we add the NoTrailingSlashAttribute filter.

The RedirectToCanonicalUrlAttribute knows about the NoTrailingSlashAttribute filter. When it sees the attribute on a requested action, it ignores the AppendTrailingSlash setting, so the action works just like requesting a static robots.txt file from the file system.

Special Case 2

Sometimes you want your query string parameters to be a mix of upper-case and lower-case. When you want to do this, simply add the NoLowercaseQueryStringAttribute attribute to the action method like so:

[NoLowercaseQueryString]
[Route("action")]
public void Action(string mixedCaseParameter)
{
    // mixedCaseParameter can contain upper and lower case characters.
}

If you are using the ASP.NET Identity NuGet package for authentication, then take note: you need to apply the NoLowercaseQueryStringAttribute to the AccountController.
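
A sketch of what that looks like, assuming the stock ASP.NET Identity AccountController (the controller body is omitted):

[NoLowercaseQueryString]
[Authorize]
public class AccountController : Controller
{
    // Omitted. The attribute applies to every action method in the controller.
}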

Conclusions

Once again, you can find a working example of this and much more using the ASP.NET Core Boilerplate project template or view the source code directly on GitHub.

Dynamically Generating Robots.txt Using ASP.NET MVC


A robots.txt file is a simple text file you can place at the root of your site at http://example.com/robots.txt to tell search engine robots (also known as web crawlers) how to index your site. The robots know to look for this file at the root of every site before they start indexing the site. If you do not have this file in your site, you will be getting a lot of 404 Not Found errors in your logs.

The robots.txt uses the Robots Exclusion Standard which is a very simple format that can give robots instructions on what to index and what to skip. A very basic robots.txt file looks like this:

# Allow all robots to index this site.
user-agent: *

# Tell all robots not to index any of the pages under the /error path.
disallow: /error/

# Tell all robots to index pages under the /error/foo path.
allow: /error/foo/

# Add a link to the site-map. Unfortunately this must be an absolute URL.
sitemap: http://example.com/sitemap.xml

In the above code, all comments start with the hash character. The file tells all robots that they can index everything on the site except pages under the /error path, because we don't want our error pages showing up in people's search results. The only exception to that rule is to allow the resources under the /error/foo path to be indexed.

The last line is interesting and tells robots where to find an XML file called a site-map. A site-map contains a list of URL's to all the pages in the site and gives search engines a list of URL's they can go through to index the entire site. It's a great SEO (Search Engine Optimization) technique to give your site a boost in its search rankings.

I will discuss creating a dynamic sitemap.xml file for ASP.NET Core in a future post. For now, all you need to know is that the site-map URL has to be an absolute URL according to the specification. This is a pretty terrible decision by whoever created the Robots Exclusion Standard; it's really annoying that, when you're creating a site, you have to remember to manually update this URL. If the URL was allowed to be relative, we would not have this problem.

Dynamically Generating a robots.txt File

Fortunately, it's really easy to dynamically create a robots.txt file, which auto-generates the site-map URL using the MVC UrlHelper. Take a look at the code below:

public class HomeController : Controller
{
    [Route("robots.txt", Name = "GetRobotsText"), OutputCache(Duration = 86400)]
    public ContentResult RobotsText()
    {
        StringBuilder stringBuilder = new StringBuilder();

        stringBuilder.AppendLine("user-agent: *");
        stringBuilder.AppendLine("disallow: /error/");
        stringBuilder.AppendLine("allow: /error/foo");
        stringBuilder.Append("sitemap: ");
        stringBuilder.AppendLine(this.Url.RouteUrl("GetSitemapXml", null, this.Request.Url.Scheme).TrimEnd('/'));

        return this.Content(stringBuilder.ToString(), "text/plain", Encoding.UTF8);
    }

    [Route("sitemap.xml", Name = "GetSitemapXml"), OutputCache(Duration = 86400)]
    public ContentResult SitemapXml()
    {
        // I'll talk about this in a later blog post.
    }
}

I set up a route to the robots.txt path at the root of the site in my main HomeController and cached the response for a day for better performance (you can, and probably should, specify a much longer period of time if you know yours won't change).

I then go on to append my commands to the StringBuilder. The great thing is that I can easily use the UrlHelper to generate a complete absolute URL to the sitemap.xml path which is also dynamically generated in much the same way. Finally, I just return the string as plain text using the UTF-8 encoding.

Creating a route ending with a file extension is not allowed by default in ASP.NET. To get around this security restriction, you need to add the following to the Web.config file:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <!-- ...Omitted -->
  <system.webServer>
    <!-- ...Omitted -->
    <handlers>
      <!-- ...Omitted -->
      <add name="RobotsText" 
           path="robots.txt" 
           verb="GET" 
           type="System.Web.Handlers.TransferRequestHandler" 
           preCondition="integratedMode,runtimeVersionv4.0" />
    </handlers>
  </system.webServer>
</configuration>

Conclusion

Dynamically generating your robots.txt file is pretty easy and only takes as many lines of code as you need to write your robots.txt file anyway. It also means that you don't need to pollute your project structure with yet another file at the root of it (This problem is fixed in MVC Core, where all static files must be added to the wwwroot folder). You can also dynamically generate your site-map URL so you don't need to remember to update it every time you change the domain.

You could argue that performance is an issue when compared to a static robots.txt text file, but it's a matter of a few bytes and, if you cache the response with a sufficient time limit, I think even that problem goes away.

Once again, you can find a working example of this and much more using the ASP.NET Core Boilerplate project template.


Minifying HTML for ASP.NET MVC


Using Razor comments or blocks of code can cause extra carriage returns to appear in the generated HTML. This has been a problem in all versions of ASP.NET MVC for a while now.

<p>Paragraph 1</p>
@* Razor Comment *@
<p>Paragraph 2</p>

The above code generates the following HTML. You can imagine that with a lot of comments or code blocks you get a lot of ugly blank lines appearing in your mark-up.

<p>Paragraph 1</p>

<p>Paragraph 2</p>

Ideally it should generate the HTML below without any blank lines. If you really wanted a blank line to appear, you could add one yourself before the comment.

<p>Paragraph 1</p>
<p>Paragraph 2</p>

The main problem with the above is that it makes your HTML look ugly and hard to follow. You often end up with several blank lines, which breaks up the flow of the mark-up.

Also, given that every ASP.NET MVC site on the internet has this problem and probably contains at least two Razor comments and maybe a for-loop in the code somewhere, that is a lot of wasted extra bandwidth.

So I made this suggestion for the next version of ASP.NET Core to change the behaviour to the expected one above, and it got accepted!

How Much Bandwidth Was Saved

So how much bandwidth has this single change saved the internet? That's the question I asked myself. According to httparchive.org, the average request is made up of around 57KB of HTML. If we assume that each page contains two comments and maybe a for-loop, that's six extra line endings (two for each), and since each line ending is a carriage return and line feed pair, that makes twelve characters or twelve bytes of wasted bandwidth. If we assume that all sites are using GZip compression, then we can make a conservative estimate of around six bytes of wasted bandwidth per request.

Average HTML Transfer Size Over a Request Chart

Cisco forecasts that global IP traffic will pass the Zettabyte (1000 Exabytes) threshold by the end of 2016. If the average transfer size per request is 2162 KB and only 57 KB of that is HTML, we can work out that 257465 Terabytes of the world's internet traffic per year is HTML.

Average Total Transfer Size Over a Request Chart

According to w3techs.com, 16.7% of all sites on the internet use ASP.NET as of 1st August 2015. Let's assume half of those (8.35%) will use ASP.NET Core in a few years time. So, we can say that very roughly 21498 Terabytes of the world's bandwidth is consumed on ASP.NET HTML requests per year.

If the average wasted bandwidth is six bytes out of a total of 57 KB per HTML request, then we come to a grand total of around 2.3 Terabytes of bandwidth saved per year. I must admit, that's a lot of bandwidth but I still thought it would be a lot more than that.
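
As a quick sanity check of that arithmetic, here is the estimate written out in C# using the same rough figures quoted above:

double htmlPerRequestBytes = 57 * 1024;   // Average HTML per request (httparchive.org).
double wastedBytesPerRequest = 6;         // Conservative estimate after GZip compression.
double htmlTrafficPerYearTB = 257465;     // Estimated HTML share of global IP traffic.
double aspNetCoreShare = 0.0835;          // Half of ASP.NET's 16.7% market share.

double aspNetCoreHtmlTB = htmlTrafficPerYearTB * aspNetCoreShare;            // ~21498 TB
double savedTB = aspNetCoreHtmlTB * (wastedBytesPerRequest / htmlPerRequestBytes);
Console.WriteLine(savedTB);               // Just over 2.2, i.e. roughly the 2.3 TB above.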

HTML Minification

An even better solution would be to minify the HTML. There are solutions like Web Markup Min for ASP.NET Core but it works at runtime and is a little involved to set up, so only the most determined developers use this feature.

Then there is Dean Hume's compile time minifier which sounded perfect. In Dean's post he gets savings of around 20-30% by minifying his HTML. If we applied a conservative 20% saving to all ASP.NET Core HTML requests, that would work out to be a 4300 Terabyte saving in global bandwidth per year!

So far, I've only mentioned the bandwidth saving but downloading smaller HTML files will also mean quicker page load times. HTML is the first thing a browser downloads before it can go off and download all the other CSS, JavaScript, fonts and images a site needs to display a page. Making this download smaller is a small but effective way to get pages to display quicker.

These days, MVC has things like CSS and JavaScript minification built in as standard. To squeeze out even more performance, HTML minification is the next logical step.

So I made this suggestion for ASP.NET Core to implement Dean's compile time minification of Razor views by default. So far, it's not been taken up but I live in hope and write this blog post to show how cool a feature it is.

Please do go and post your support for this feature. We don't necessarily need to use Dean's technique, minifying HTML could just as easily be a Grunt or Gulp task.

Conclusions

This post contains huge leaps of guess work and estimation. I hope my maths is up to scratch but I would not be surprised if I was off by a decimal point or two. Still, we are talking huge numbers here and I hope I've convinced you that minifying HTML is worth the effort.

What's New in ASP.NET Core Boilerplate

I have just updated the ASP.NET Core Boilerplate Visual Studio extension with a new project template targeting ASP.NET Core. This post is just a quick one to talk about what's new and different about this version of the template compared to the ASP.NET 4.6 MVC 5 version.

What's New

Well, the obvious thing is that this template targets ASP.NET Core which is currently still in beta. In particular I am targeting Beta 6, which is the latest available version. I will be regularly updating the template with each new beta until ASP.NET Core is released, sometime in November according to Microsoft.

There are not too many new features over the ASP.NET 4.6 MVC 5 version but here is a quick rundown:

  1. Performance improvements derived from using ASP.NET Core. ASP.NET Core is much improved and no longer uses System.Web, so it uses a lot less memory.
  2. Using NPM and Bower to get CSS and JavaScript libraries rather than NuGet. This means you have a lot more choice and get the latest versions straight from the source.
  3. Switched from LESS to SASS for CSS. This decision was made because SASS seems to be more popular and the upcoming Bootstrap 4 has also made the same decision.
  4. Gulp is used instead of the standard ASP.NET 4.6 bundling and minification feature. Not only that but Gulp is also configured to optimize images, rebuild CSS and JavaScript on file change, lint the CSS and JavaScript for common errors and warnings and measure the speed of your site using Google Page Speed.
  5. The default ASP.NET Core project template uses Bootstrap-Touch-Carousel and Hammer.js for a nice touch friendly carousel control on the home screen.
  6. There is now a single controller action responsible for displaying errors. This is a lot simpler and a great improvement over MVC 5.
  7. The logging and caching services are now built into ASP.NET Core, so we use them instead.

What's Missing

ASP.NET Core is still in beta and there are a lot of third party libraries that don't yet support it. Support will be added as soon as it becomes available. I have contacted all three project owners and can confirm that support will be added soon.

The new .NET Core runtime does not currently support the System.ServiceModel.Syndication namespace which is used to build an Atom feed. The .NET Core runtime is still being targeted but the Atom feed will not work and is excluded using #if pre-processor directives. I have raised this issue on the .NET team's GitHub page here. Please do go ahead and show your support for the feature.

There are other issues around ASP.NET Core missing features from MVC 5 including no support for HttpException which I will be looking into adding soon. I am also looking into submitting any improvements I make to the ASP.NET Core GitHub project, so far, I've had one pull request accepted and a few suggestions acted on.

Conclusions

ASP.NET Core is still in beta but hopefully this project will give an understanding of what can be done with it. There are still missing features but it's surprisingly usable at the moment.

Dynamically Generating Sitemap.xml for ASP.NET MVC

What is a sitemap.xml File

What is a sitemap.xml file used for? The official sitemaps.org site really does say it best:

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.

Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.example.com/</loc>
      <lastmod>2005-01-01</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.8</priority>
   </url>
   <!-- ... -->
</urlset>

As you can see each URL in a sitemap contains four pieces of metadata:

  • loc - The URL itself.
  • lastmod (Optional) - A last modified timestamp. This tells search engines whether or not they should re-index the page to reflect any changes that have been made.
  • changefreq (Optional) - A change frequency indicator (This can take the values: always, hourly, daily, weekly, monthly, yearly, never). This gives search engines an indication of how often they should come back and re-index the page.
  • priority (Optional) - A number from zero to one indicating the importance of the page compared to other pages on the site.

The latter three values only give search engines an indication of when they can or should index or even re-index a page. It is not a guarantee that it will happen, although it makes it more likely.

Is it Worth the Effort?

Search engines are black boxes. We only know what goes into them (our sitemap) and what comes out the other end (the search results). I can make no promises that adding a sitemap will increase your site's search rankings but Google says:

Using a sitemap doesn't guarantee that all the items in your sitemap will be crawled and indexed, as Google processes rely on complex algorithms to schedule crawling. However, in most cases, your site will benefit from having a sitemap, and you'll never be penalized for having one.

Google - https://support.google.com/webmasters/answer/156184?hl=en

Generating a Static sitemap.xml File

There are tools online you can use to generate a static sitemap.xml file, which you can dump at the root of your site, but you have to manually update these every time your site changes. This may be fine if your site does not change much but adding a dynamically generated sitemap.xml file is a fairly simple process and worth the effort.

Dynamically Generating Sitemap.xml for ASP.NET MVC

Dynamically generating a simple sitemap.xml file for ASP.NET MVC is really simple but adding all the bells and whistles requires a bit more work. We start with a SitemapNode class and a SitemapFrequency enumeration, which together represent a single URL in our sitemap:

public class SitemapNode
{
    public SitemapFrequency? Frequency { get; set; }
    public DateTime? LastModified { get; set; }
    public double? Priority { get; set; }
    public string Url { get; set; }
}

public enum SitemapFrequency
{
    Never,
    Yearly,
    Monthly,
    Weekly,
    Daily,
    Hourly,
    Always
}

Now we need to create a collection of SitemapNode objects. In my example below, I add the three main pages of my site: Home, About and Contact. I then go on to add a collection of product pages. I am getting every product ID from my database and using that to generate a product URL. Note that I'm not using every property on the SitemapNode class, since in my case I don't have an easy way to figure out a last modified date, but I do specify a priority and frequency for my products.

Please note that the URLs must be absolute and I am using an extension method I wrote called AbsoluteRouteUrl to generate absolute URLs instead of relative ones. I have included that below too.

public IReadOnlyCollection<SitemapNode> GetSitemapNodes(UrlHelper urlHelper)
{
    List<SitemapNode> nodes = new List<SitemapNode>();

    nodes.Add(
        new SitemapNode()
        {
            Url = urlHelper.AbsoluteRouteUrl("HomeGetIndex"),
            Priority = 1
        });
    nodes.Add(
       new SitemapNode()
       {
           Url = urlHelper.AbsoluteRouteUrl("HomeGetAbout"),
           Priority = 0.9
       });
    nodes.Add(
       new SitemapNode()
       {
           Url = urlHelper.AbsoluteRouteUrl("HomeGetContact"),
           Priority = 0.9
       });

    foreach (int productId in productRepository.GetProductIds())
    {
        nodes.Add(
           new SitemapNode()
           {
               Url = urlHelper.AbsoluteRouteUrl("ProductGetProduct", new { id = productId }),
               Frequency = SitemapFrequency.Weekly,
               Priority = 0.8
           });
    }

    return nodes;
}

public static class UrlHelperExtensions
{
    public static string AbsoluteRouteUrl(
        this UrlHelper urlHelper,
        string routeName,
        object routeValues = null)
    {
        string scheme = urlHelper.RequestContext.HttpContext.Request.Url.Scheme;
        return urlHelper.RouteUrl(routeName, routeValues, scheme);
    }
}

Now all we have to do is turn our collection of SitemapNode objects into XML:

public string GetSitemapDocument(IEnumerable<SitemapNode> sitemapNodes)
{
    XNamespace xmlns = "http://www.sitemaps.org/schemas/sitemap/0.9";
    XElement root = new XElement(xmlns + "urlset");

    foreach (SitemapNode sitemapNode in sitemapNodes)
    {
        XElement urlElement = new XElement(
            xmlns + "url",
            new XElement(xmlns + "loc", Uri.EscapeUriString(sitemapNode.Url)),
            sitemapNode.LastModified == null ? null : new XElement(
                xmlns + "lastmod", 
                sitemapNode.LastModified.Value.ToLocalTime().ToString("yyyy-MM-ddTHH:mm:sszzz")),
            sitemapNode.Frequency == null ? null : new XElement(
                xmlns + "changefreq", 
                sitemapNode.Frequency.Value.ToString().ToLowerInvariant()),
            sitemapNode.Priority == null ? null : new XElement(
                xmlns + "priority", 
                sitemapNode.Priority.Value.ToString("F1", CultureInfo.InvariantCulture)));
        root.Add(urlElement);
    }

    XDocument document = new XDocument(root);
    return document.ToString();
}

Now we add an action method to our HomeController to serve our sitemap. Note the route; it is recommended to place your sitemap at the root of your site at sitemap.xml. Also note that creating a route with a file extension at the end (.xml) is not allowed in MVC 5 and below (ASP.NET Core is fine), so you need to add the handler shown below the controller code to your Web.config file.

[RoutePrefix("")]
public class HomeController : Controller
{
    [Route("sitemap.xml")]
    public ActionResult SitemapXml()
    {
        var sitemapNodes = GetSitemapNodes(this.Url);
        string xml = GetSitemapDocument(sitemapNodes);
        return this.Content(xml, ContentType.Xml, Encoding.UTF8);
    }
}
<configuration>
  <system.webServer>
    <handlers>
      <add name="SitemapXml" path="sitemap.xml" verb="GET" type="System.Web.Handlers.TransferRequestHandler" preCondition="integratedMode,runtimeVersionv4.0" />
    </handlers>
  </system.webServer>
</configuration>

Sitemap Index Files

For most people the above code will be enough. You can only have a maximum of 50,000 URLs in your sitemap and it must not exceed 10MB in size. I did some testing and if your URLs are fairly long and you supply all of the metadata for each URL, you can easily hit the 10MB mark with 25,000 URLs.

It's not clear what happens if search engines come across a file that breaches these limits. I would have thought that the likes of Google or Bing would have a margin of error but it's better to be well under the limits than over. Not many sites have that many pages but you'd be surprised at how easy it is to hit these limits.
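
If you do hit the limits, the first step is to split your nodes into pages. Here is a rough sketch of how that might look using a simple LINQ grouping (the names are illustrative):

// Split the nodes into pages of at most 25,000 URLs each, half the 50,000 URL
// limit, to stay comfortably under the 10MB size cap too.
const int PageSize = 25000;
List<List<SitemapNode>> pages = GetSitemapNodes(urlHelper)
    .Select((node, index) => new { node, index })
    .GroupBy(x => x.index / PageSize, x => x.node)
    .Select(group => group.ToList())
    .ToList();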

This is where sitemap index files come in. The idea is that you break up your sitemap into pages and list all of these in an index file. When a search engine visits your sitemap.xml file, they retrieve the index file and visit each page in turn. Here is an example of an index file:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.example.com/sitemap1.xml</loc>
      <lastmod>2004-10-01T18:23:17+00:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap2.xml.gz</loc>
      <lastmod>2005-01-01</lastmod>
   </sitemap>
</sitemapindex>

As you can see, you can optionally add a last modified date to each sitemap URL to tell search engines when a sitemap file has changed. This last modified date can be calculated from the file's contents; you just need to take the latest last modified date from that particular page.
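
To give a flavour of what generating the index looks like, here is a minimal sketch following the same XDocument pattern as GetSitemapDocument above; the method and parameter names are illustrative rather than the exact ones from my library:

public string GetSitemapIndexDocument(IEnumerable<string> sitemapPageUrls, DateTime? lastModified)
{
    XNamespace xmlns = "http://www.sitemaps.org/schemas/sitemap/0.9";
    XElement root = new XElement(xmlns + "sitemapindex");

    foreach (string url in sitemapPageUrls)
    {
        root.Add(new XElement(
            xmlns + "sitemap",
            new XElement(xmlns + "loc", Uri.EscapeUriString(url)),
            lastModified == null ? null : new XElement(
                xmlns + "lastmod",
                lastModified.Value.ToLocalTime().ToString("yyyy-MM-ddTHH:mm:sszzz"))));
    }

    return new XDocument(root).ToString();
}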

This blog post has started to get a little long and I haven't even covered sitemap pinging yet, so I will not go into too much detail, but I will refer you to where you can get the full source code and worked example. Luckily, all of the code above and the code to generate a sitemap index file is available here:

Conclusions

Adding a sitemap is a great Search Engine Optimization (SEO) technique to improve your site's search rankings. My NuGet package makes it a really simple feature to add to your site. In my next blog post, I'll talk about sitemap pinging, which can be used to pro-actively notify search engines of a change in your sitemap.

So I've Been Awarded Microsoft MVP Status!

Lucian Wischik contacted me out of the blue one day to update and test one of my NuGet packages using NuGet 3.0. We had a pleasant email exchange and he suggested I nominate myself for becoming a Microsoft MVP...so I did.

It actually takes a fair amount of time to apply, you have to give details of all your online accounts and open source projects. Not only that but for each one you have to specify how many downloads or page views you have got. Only after doing this, did I realize how much of an online presence I really have.

A few months later and lo and behold I get another email out of the blue telling me I'm one of 4000 people being awarded Microsoft MVP status. So what does this mean, I thought to myself? Well it turns out you get a few freebies:

  1. Free Software - I already had an MSDN license from work but it's nice to get a free MSDN license of my own. You can pretty much download whatever Microsoft software you want.
  2. Free Azure - As part of the MSDN license you get £100 of Azure credits per month. Last I checked, I only use about £30 to £40 a month to run this blog and a few other sites and services.
  3. The Gift Pack - This is a nice trophy and certificate to hang on your wall, as well as a badge, just to make it official.
  4. MVP Insider Mailing List Access - Getting added to this email distribution list is pretty cool. You can ask questions and find out about bugs and new releases before everyone else.
  5. Access to Pre-Release Software - You get access to Beta or RTM builds before they are publicly released. Microsoft uses this as a way to make sure there are no last minute bugs.

I hadn't realized but you have to reapply to become an MVP every year. I'm not sure how that works but I guess I'll find out in a year's time.

Thanks to Lucian for suggesting I apply, my wife who has had to put up with me messing around with code all the time and all the people who downloaded some of my code and found it useful. I get a warm fuzzy feeling every time I see a new website pop-up using my project template or the download numbers for my various projects grow.

.NET Big-O Algorithm Complexity Cheat Sheet

Credits

All credit goes to Eric Rowell, the creator of the Big-O Algorithm Complexity Cheat Sheet, and its many contributors. You can find the original here. I simply added .NET specific bits to it and posted it on GitHub here.

What is it?

It covers the space and time Big-O notation complexities of common algorithms used in Computer Science and specifically the .NET framework.

Why is it useful?

You can see which collection type or sorting algorithm to use at a glance to write the most efficient code.

This is also useful for those studying Computer Science at university or for technical interview tests, where Big-O notation questions can be fairly common depending on the type of company you are applying to.

Let me have it!

You can download the cheat sheet in three different formats:

Logging with Serilog.Exceptions

Picking a logging framework for your new .NET project? I've tried all the best known ones, including log4net, NLog and Microsoft's Logging Application Block. All of these logging frameworks basically output plain text but recently I tried Serilog and was blown away by what you can do with it.

Logging in JSON Format

Take a look at the code below which makes use of the Serilog logger to log a geo-coordinate and an integer:

var position = new { Latitude = 25, Longitude = 134 };
var elapsedMs = 34;

log.Information("Processed {@Position} in {Elapsed:000} ms.", position, elapsedMs);

If you configure Serilog correctly, you can get it to output its logs in JSON format, so the above line would log the following:

{
  "Timestamp": "2015-12-07T12:26:24.0557671+00:00",
  "Level": "Information",
  "MessageTemplate": "Processed {@Position} in {Elapsed:000} ms.",
  "RenderedMessage": "Processed { Latitude: 25, Longitude: 134 } in 034 ms.",
  "Properties": {
    "Position": 
    { 
        "Latitude": 25,
        "Longitude": 134
    }, 
    "Elapsed": 34,
    "ProcessId": 123,
    "ThreadId": 123,
    "User": "Domain\\Username",
    "Machine": "Machine-Name",
    "Source": "My Application Name"
  }
}

Why JSON?

What can you do with JSON formatted logs that you can't do with plain text? Well, if you store all your logs in something like Elasticsearch, you can query your logs and ask it questions. So if we take the above example further, we could find all log messages from a particular machine or user with an elapsed time of more than 10 milliseconds and a distance of 10 Km away from a specific location.

Not only that but if you set up something like Kibana, then you can create visualisations for your logs which could grow to be gigabytes in size over time. You can create dashboards with cool charts and maps that look something like this:

Kibana Dashboard Screenshot

Logging Exceptions

One major problem with all of these logging frameworks is that they do not log all the properties of an exception and throw away vital information. Take the DbEntityValidationException from Entity Framework as an example. This exception contains vital information buried not in the message but in a custom property called EntityValidationErrors. The problem is that when you call exception.ToString(), this vital information is not included in the resulting string. Even worse, it's not included in the debugger either. This is a pretty major failing in the .NET framework but alas we have to work around it.

There are literally dozens of questions on Stack Overflow asking how to deal with this problem and all the major logging frameworks fail in this regard. All of them call exception.ToString() and fail to log the EntityValidationErrors collection.

DbEntityValidationException is not the only culprit, half the exceptions in the .NET framework contain custom properties that are not logged. The Exception base class itself has a Data dictionary collection which is never logged either.
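
As a quick illustration (the exception and values here are made up), even the Data dictionary on the Exception base class is silently dropped by ToString():

var exception = new InvalidOperationException("Something went wrong");
exception.Data["OrderId"] = 12345;

// The message and stack trace are printed but there is no trace of OrderId.
Console.WriteLine(exception.ToString());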

Serilog.Exceptions

I wrote Serilog.Exceptions to solve this problem. So what happens when you log a DbEntityValidationException using this NuGet package on top of Serilog? Well, take a look for yourself:

{
  "Timestamp": "2015-12-07T12:26:24.0557671+00:00",
  "Level": "Error",
  "MessageTemplate": "Hello World",
  "RenderedMessage": "Hello World",
  "Exception": "System.Data.Entity.Validation.DbEntityValidationException: Message",
  "Properties": {
    "ExceptionDetail": {
      "EntityValidationErrors": [
        {
          "Entry": null,
          "ValidationErrors": [
            {
              "PropertyName": "PropertyName",
              "ErrorMessage": "PropertyName is Required.",
              "Type": "System.Data.Entity.Validation.DbValidationError"
            }
          ],
          "IsValid": false,
          "Type": "System.Data.Entity.Validation.DbEntityValidationResult"
        }
      ],
      "Message": "Validation failed for one or more entities. See 'EntityValidationErrors' property for more details.",
      "Data": {},
      "InnerException": null,
      "TargetSite": null,
      "StackTrace": null,
      "HelpLink": null,
      "Source": null,
      "HResult": -2146232032,
      "Type": "System.Data.Entity.Validation.DbEntityValidationException"
    },
    "ProcessId": 123,
    "ThreadId": 123,
    "User": "Domain\\Username",
    "Machine": "Machine-Name",
    "Source": "My Application Name"
  }
}

It logs every single property of the exception and not only that but it drills down even further into the object hierarchy and logs that information too.

You're probably thinking it uses reflection right? Well...sometimes. This library has custom code to deal with extra properties on most common exception types and only falls back to using reflection to get the extra information if the exception is not supported by Serilog.Exceptions internally.

Getting Started with Serilog.Exceptions

Add the Serilog.Exceptions NuGet package to your project using the NuGet Package Manager or run the following command in the Package Manager Console window:

Install-Package Serilog.Exceptions

When setting up your logger, add the WithExceptionDetails line like so:

using Serilog;
using Serilog.Exceptions;

ILogger logger = new LoggerConfiguration()
    .Enrich.WithExceptionDetails()
    .WriteTo.Sink(new RollingFileSink(
        @"C:\logs",
        new JsonFormatter(renderMessage: true)))
    .CreateLogger();

That's it, it's one line of code!

Make Certificate

Making your own certificate files is quite hard work. You have to use makecert.exe and pvk2pfx.exe, passing in some pretty cryptic arguments which you always have to go back and research.

Learning how to make a certificate and the different types of certificate is pretty important. I highly recommend reading this blog post from Jayway.com which has some very detailed instructions and is the basis of MakeCertificate.

To make things easier I made a PowerShell script called MakeCertificate.ps1 which you can get on the MakeCertificate GitHub page. It asks you to pick the type of certificate you want to create; there are a few different types of certificate that MakeCertificate helps to make: Certificate Authority (CA) certificates, SSL/TLS server certificates and client certificates. You are then asked a series of questions and, once answered, the script outputs three files:

  • .cer - A public key file that can be shared.
  • .pvk - A private key file that should be kept secret.
  • .pfx - A combined public and private key file that should be kept secret.

It also outputs the command you need to execute using makecert.exe and pvk2pfx.exe to recreate the certificate.
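
To give you an idea of what the script saves you from typing, here is roughly what the generated commands look like for a CA certificate. The subject name and file names below are illustrative and the exact arguments depend on your answers to the script's questions:

makecert.exe -n "CN=My Root CA" -r -pe -a sha256 -len 2048 -cy authority -sv MyCA.pvk MyCA.cer
pvk2pfx.exe -pvk MyCA.pvk -spc MyCA.cer -pfx MyCA.pfx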


Colorful.Console

I needed to write a console application a while back and was investigating the best way to do this using the available NuGet packages. I'd seen the DNVM command line tool that Microsoft built for ASP.NET Core and really liked it, so I wanted something similar.

DNVM

I really like the old school ASCII art title and the use of colour. The .NET Framework does contain an enum called ConsoleColor but it only offers a very limited set of hard coded colours and has some major omissions, like the colour orange for example.

In my hunt for a C# ASCII art generator, I discovered patorjk.com which is great for generating text using various Figlet fonts. Figlet fonts are basically .flf text files which contain instructions on how each letter in the ASCII character table can be printed out. It turns out these fonts are pretty ancient and there are libraries in every language writing out text using Figlet fonts.

Colorful.Console

I was just about to give up and write my own open source library when I discovered Colorful.Console, available on GitHub. Using this library you can very easily write console apps which look like this:

Colorful.Console Example 1

Or this:

Colorful.Console Example 2

The only thing missing was a method to write ASCII text using Figlet fonts, so I contributed some code to the project to get this done. The output, combined with the fade effect that Colorful.Console is capable of, creates a pretty cool effect. Unbelievably, this takes only a couple of lines of code to write!

Colorful.Console Example 3
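
For illustration, here is a minimal sketch of the kind of code involved, based on Colorful.Console's API at the time of writing (check the project's README for the exact current syntax):

using System.Drawing;
using Console = Colorful.Console;

class Program
{
    static void Main()
    {
        // Write the text as ASCII art using the default Figlet font, in orange -
        // a colour the standard ConsoleColor enum does not even include.
        Console.WriteAscii("Colorful", Color.Orange);
    }
}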

The title image of this post is also generated using Colorful.Console but was a bit more complicated as it transitions through several colours. By default Colorful.Console includes a single Figlet font but there are dozens of others available which you can download and use yourself. They aren't all included by default because they would bloat the library quite a bit.

Command Line Parsers

Now the only thing missing in my quest was a command line parser which could let me easily create commands, switches and flags so users could use my command line tool. The best tool I found was Command Line Parser, available on GitHub. It's a pretty powerful and fully featured library that makes writing a command line interface very easy. Unfortunately, its output is pretty ugly and it does not let you customize the 'look and feel' of what is output to the console.

At some point, I'd like to make another contribution to Colorful.Console, so that it offers command line parsing too but take inspiration from several command line parsing libraries to make something that's fully customizable and of course very colourful.

Command line tools have been around for decades; it's a wonder that a NuGet package that does all of these things does not exist yet.

Subresource Integrity TagHelper Using ASP.NET Core - Part 1

What is Subresource Integrity (SRI)

Can you trust your CDN provider? What if they get hacked and the copy of jQuery you are using hosted by them has some malicious script added to it? You would have no idea this was happening! This is where Subresource Integrity (SRI) comes in.

It works by taking a cryptographic hash of the file hosted on the CDN and adding that to your script or link tags. So in our case if we are using jQuery, we would add an integrity and crossorigin attribute to our script tag like so:

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js" 
        integrity="sha256-ivk71nXhz9nsyFDoYoGf2sbjrR9ddh+XDkCcfZxjvcM=" 
        crossorigin="anonymous"></script>

The cryptographic hashing algorithm used can be SHA256, SHA384 or SHA512 at the time of writing. In fact, you can use more than one at a time and browsers will pick the most secure one to check the file against.

The current official standard document states that only script or link tags are supported, for your JavaScript or CSS. However, it also states that this is likely to be expanded to pretty much any tag with a src or href attribute, such as images, objects etc.

Scott Helme has a great post on the subject which I highly recommend you read (It's where I learned about it).

The ASP.NET Core Tag Helper

I implemented a tag helper for ASP.NET Core which is as simple to use as this:

<script asp-subresource-integrity
        src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"></script>

Don't you love it when security is so easy! I'm a big believer in making security as easy as a big red button that is turned on by default, so people don't have to do anything. It's the only way these things will get used! So what is it doing behind the scenes?

  1. Downloads the file from the CDN.
  2. Calculates a SHA512 hash for the file.
  3. Adds the integrity and crossorigin attributes to the script tag.
  4. Adds the SHA512 hash value to the distributed cache (IDistributedCache) built in to ASP.NET Core with no expiry date. If you are using a distributed cache like Redis (Which you should for the pure speed of it) then the hash will be remembered.
  5. The next time the page loads, the hash is retrieved from the cache, so there is very little performance impact of this tag helper.
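
For illustration, here is a minimal sketch of steps one to three, assuming the hash is calculated the way the SRI specification describes; this is not the tag helper's exact internal code:

using System;
using System.Net.Http;
using System.Security.Cryptography;
using System.Threading.Tasks;

public static class SubresourceIntegrity
{
    public static async Task<string> GetIntegrityValueAsync(string url)
    {
        using (HttpClient httpClient = new HttpClient())
        using (SHA512 sha512 = SHA512.Create())
        {
            // Download the file from the CDN and hash its raw bytes.
            byte[] bytes = await httpClient.GetByteArrayAsync(url);
            byte[] hash = sha512.ComputeHash(bytes);

            // The integrity attribute value is the algorithm name, a dash,
            // then the base64 encoded hash.
            return "sha512-" + Convert.ToBase64String(hash);
        }
    }
}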

There are actually two tag helpers: one supports any tag with a src attribute and another supports any tag with a href attribute. This is in preparation for when subresource integrity is opened up to tags other than script and link.

Gotchas

In the past, I have often omitted the scheme from the CDN URL like so:

<script src="//ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js"></script>

However, I have noticed that Firefox does not like it when you use SRI and omit the scheme. It stops the file from loading completely. When you think about it, this makes sense. We are trying to confirm that the resource has not been changed and one of the ways to do this is to use HTTPS. It does not make sense to use SRI over HTTP.

The other gotcha I found is that the resource must be served with the Access-Control-Allow-Origin HTTP header. It can be set to * or your individual domain name. Now, I have been using CDN resources provided by Google (for jQuery), Microsoft (for Bootstrap, jQuery Validation etc.) and MaxCDN (for Font Awesome) because they are free, most browsers have probably already got a copy of the files from there and because they have very fast global edge nodes.

However, I have discovered that they all provide the Access-Control-Allow-Origin HTTP header except Microsoft on some of their resources. Strangely, they return the header for Bootstrap but not for the jQuery Validation scripts. I have reached out to them in my capacity as an MVP and hope to get the issue solved. In the meantime, if you are using Microsoft's CDN, you can switch to another CDN or wait for them to fix the issue.

Where Can I Get It?

This tag helper is available in a few ways:

  1. The .NET Boxed Boxed.AspNetCore.TagHelpers NuGet package.
  2. Check out source code in the .NET Boxed Framework GitHub repository.

Subresource Integrity TagHelper Using ASP.NET Core - Part 2

Last week I wrote part one of a blog post discussing a Subresource Integrity (SRI) tag helper I wrote for ASP.NET Core. It turns out the post was featured on the ASP.NET Community Standup and discussed at length by Scott Hanselman, Damian Edwards and Jon Galloway. Here is the discussion:

https://www.youtube.com/watch?v=Mu2jol8EmVo

The overall impression from the standup was that the SRI tag helper I wrote was a good first step but there was more work to be done. It was, however, still more secure than "the rest of the internet" according to Jon Galloway. The main issue raised during the standup was that the first call made to get the resource could retrieve a version of it that was already compromised.

My initial thinking was that you could check the files at deployment time when the tag helper first runs. The tag helper would then have calculated the hash and cached it without any expiration time, so you would be good from then on. In hindsight, checking the files on every deployment is not great for the developer.

The 2nd Iteration

So for the next iteration I have added a new alternative source attribute: basically a local copy of the file, from which the SRI hash is calculated. Now the tag helper looks like this when in use:

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js" 
        asp-subresource-integrity-src="~/js/jquery.min.js"></script>

You can also customize the hashing algorithm used in your SRI. You can choose between SHA256, SHA384 and SHA512; by default the tag helper uses the most secure option, SHA512, which seems to be supported by all browsers. Should you choose to use a different hashing algorithm, or even use more than one algorithm, you can set the asp-subresource-integrity-hash-algorithms attribute, which is just a flagged enumeration (Note that I am using ASP.NET Core RC2 syntax, where the name of the enumeration can be omitted):

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js" 
        asp-subresource-integrity-src="~/js/jquery.min.js"
        asp-subresource-integrity-hash-algorithms="SHA256 | SHA384 | SHA512"></script>

What is it doing behind the scenes?

  1. Reads the local file specified using the asp-subresource-integrity-src attribute.
  2. Calculates a SHA512 hash (or your custom selection) for the file.
  3. Adds the integrity and crossorigin attributes to the script tag.
  4. Adds the hash value to the distributed cache (IDistributedCache) built in to ASP.NET Core with no expiry date. If you are using a distributed cache like Redis (Which you should for the pure speed of it) then the hash will be remembered.
  5. The next time the page loads, the hash is retrieved from the cache, so there is very little performance impact of this tag helper.

Microsoft CDN Still Broken for SRI

In my last post I noted that SRI requires that the resource has a valid Access-Control-Allow-Origin HTTP header (usually with a * value). Microsoft's CDN does not supply this header for all its resources. I did reach out to Microsoft to see if this could be fixed but I've not heard back yet. I would imagine that with a CDN of that size, fixing this issue is a non-trivial thing, so it might take time, but I'll do some more chasing.

Browser Extensions and SRI

Last week, I noted that leaving out the scheme in the URL for your CDN resource e.g. //example.com/jquery.js caused Firefox to error and fail to load the resource completely, and I recommended that you always include the https:// scheme. It turns out that this was not Firefox causing the issue at all but a Firefox browser extension. I've yet to figure out which one, as I have quite a few installed (most of them security related because I'm paranoid), but it's probably an extension called HTTPS Everywhere which attempts to use HTTPS wherever it is available. To be on the safe side and avoid this problem, always specify the https:// scheme.

CDN Fallbacks

So what happens when a CDN script is maliciously edited or (much more likely) you messed up and your local copy of the CDN script is different from the one in the CDN? Well, this is where CDN script fallbacks come in. There is already a tag helper provided by ASP.NET Core that does this:

<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js"
        asp-subresource-integrity-src="~/js/jquery.min.js"
        asp-fallback-src="~/js/jquery.min.js"
        asp-fallback-test="window.jQuery">
</script>

I should also mention that although the fallback tag helper is cool and very simple to use, it adds inline script, which is not compatible with the Content Security Policy (CSP) HTTP header. If you care about security, and you probably do if you are reading this, that means using the fallback tag helper is not possible. I myself prefer to move all my fallback checks to a separate JavaScript file.

sritest.io

A big shout out to Gabor Szathmari and his website sritest.io. It is able to scan your page and check that all your external resources have SRI enabled and, most importantly, that it has been set up correctly. You could use the console window in a browser like Chrome or Firefox but this website will also tell you if you've forgotten to add SRI to any external resources and also highlights edge cases such as the ones I covered in these two blog posts.

sritest.io Screenshot

Where Can I Get It?

This tag helper is available in a few ways:

  1. The .NET Boxed Boxed.AspNetCore.TagHelpers NuGet package.
  2. Check out source code in the .NET Boxed Framework GitHub repository.

Social TagHelpers for ASP.NET Core

Social media websites like Facebook, Twitter, Google+, Pinterest etc. provide ways to enhance the experience of sharing a page from your site through the use of meta tags. These provide metadata about what is on your page in a standardized format that these sites can use to better display your content. Here are two quick examples of the enhanced content that Twitter and Facebook display when you add these meta tags to your page:

Facebook Open Graph Share

Twitter Player Card

It turns out that most of the social media sites use only two standard sets of meta tags, namely Open Graph (Facebook) and Twitter Cards. I have built ASP.NET Core TagHelpers and ASP.NET 4.6 HTML Helpers which make it easy to add these meta tags to your site.

Author Meta Tag

This has nothing to do with social media meta tags but is worth mentioning. The author meta tag has been around for many years and is a standard but very basic way of telling search engines and others who authored your page. It's unclear where, if anywhere, this tag is used but as it's a standard I like to put it in anyway; it doesn't hurt to do so.

<meta name="author" content="Muhammad Rehan Saeed">

Open Graph (Facebook)

Open Graph is an open standard (it's set by Facebook and doesn't seem so open to me, as I'll explain) containing several sets of meta tags which represent various things, such as:

  • Website
  • Music Album
  • Music Song
  • Music Playlist
  • Video Movie
  • Video Episode
  • Video TV Show
  • Video Other
  • Article
  • Book
  • Profile

Here is an example of what the meta tags for a page looks like for the Website set. Note the type tag which determines the name of the set used:

<meta property="og:type" content="website">
<meta property="og:title" content=".NET Boxed">
<meta property="og:url" content="http://example.com/">
<meta property="og:image" content="http://example.com/1200x630.png">
<meta property="og:image:type" content="image/png">
<meta property="og:image:width" content="1200">
<meta property="og:image:height" content="630">
<meta property="og:site_name" content=".NET Boxed">

What I find perplexing is that Facebook also have their own custom sets of meta tags over and above the ones in Open Graph. These are:

  • Article
  • Books Author
  • Books Book
  • Books Genre
  • Business
  • Fitness Course
  • Game Achievement
  • Music Album
  • Music Playlist
  • Music Radio Station
  • Music Song
  • Place
  • Product
  • Product Group
  • Product Item
  • Profile
  • Restaurant Menu
  • Restaurant Menu Item
  • Restaurant Menu Section
  • Restaurant
  • Video Episode
  • Video Movie
  • Video Other
  • Video TV Show

As you can see there is a lot more choice and detail here. What's confusing is that there is overlap between the Open Graph and Facebook meta tags. Both have sets covering music, video and books, with the Facebook sets requiring you to add far more detailed metadata. The Open Graph tags may play nicer with other social media sites that use these tags, while the Facebook ones will obviously give the best experience for the user on Facebook. The above meta tags can be set using my tag helpers or HTML helpers, depending on the version of ASP.NET you are using, like so:

<open-graph-website site-name="My Website"
                    title="Page Title"
                    main-image="@(new OpenGraphImage(
                        Url.AbsoluteContent("~/img/1200x630.png"),
                        ContentType.Png,
                        1200,
                        630))"
                    determiner="OpenGraphDeterminer.Blank">
@Html.OpenGraph(new OpenGraphWebsite(
    "Page Title",
    new OpenGraphImage(
        Url.AbsoluteContent("~/1200x630.png"))
        {
            Height = 630, 
            Type = ContentType.Png, 
            Width = 1200 
        })
    {
        Determiner = OpenGraphDeterminer.Blank,
        SiteName = "My Site"
    });

Of course there are tag helpers and HTML helpers for all of the above meta tag sets.

Twitter Cards

Twitter cards require meta tags representing one of several 'cards' which can represent different things:

  • App - A phone app.
  • Gallery - A photo gallery.
  • Photo - A single photo.
  • Player - A video.
  • Product - A product you want to sell.
  • Summary - A summary of the current page. This is usually the default choice for any page.
  • Summary Large Image - The same as summary but with a large image.

If you have already added Open Graph meta tags, then Twitter can make use of them and you can omit some of the meta tags that Twitter requires. This makes adding a Twitter Card very easy and in fact, most of the time all you need to do is include a Twitter username and the card type. Here is an example of what Twitter card meta tags look like given that you already have Open Graph meta tags:

<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:site" content="@RehanSaeedUK">

Below, is an example of how to generate the above code using my tag helpers or HTML helpers. I have used the Summary Large Image card (Notice the double @ sign in the tag helper, this is because @ is a special character in Razor and a double @@ escapes the character):

<twitter-card-summary-large-image username="@@RehanSaeedUK">
@Html.TwitterCard(new SummaryLargeImageTwitterCard("@RehanSaeedUK"));

There are also tag helpers and HTML helpers for all of the above Twitter cards. The other cards are a little more complicated than the summary card I have shown in my example above.

Google+, Pinterest & Others

Due to the proliferation of Facebook's Open Graph and Twitter's card meta tags, other social media sites, search engines and other sites also use them. By implementing the above meta tags, you can cover most of the ground with very little effort.

Validating Meta Tags

Due to the difficulty of getting these meta tags correct, there are several validator tools provided by the various social media companies which let you confirm that you have not made any mistakes. Now, if you've used my tag helpers or HTML helpers you should be ahead of the game and things should just work, but it's worth checking out:

Performance

When I was looking into implementing these tag helpers and HTML helpers, I looked at a few other efforts on GitHub. However, for some strange reason all of them used reflection behind the scenes. At this point I'd like to go on a short rant against using reflection. I've seen a lot of 'clever' code use reflection over the years and I've seen far too many developers hammer far too many nails with it. It's a very powerful tool but it gets abused far too often. Now, back to resuming normal service. Reflection made these libraries pretty slow at generating a few meta tags, not to mention that they don't support ASP.NET Core. My implementation uses a single StringBuilder and should be fairly fast. At some point I will even use object pooling to reuse copies of StringBuilder.

Where Can I Get It?

These tag helpers and HTML helpers are available in a few ways:

  1. The .NET Boxed Boxed.AspNetCore.TagHelpers NuGet package.
  2. Check out source code in the .NET Boxed Framework GitHub repository.

Azure Active Directory Versus Identity Server

Disclaimer: I looked into this subject for use by the company I work for, who had existing infrastructure I had to cater to, so the solution I chose is skewed in that direction. Even so, I've tried to give an impartial summary of my own thoughts during my research. I originally asked this question on an Identity Server GitHub issue.

Azure Active Directory

Azure Active Directory is a hosted identity solution, so there is far less setup (especially if, like me, you discover to your surprise that you are already using it for Office 365). Out of the box, it provides some very nice features that can get you started very quickly.

On-Premises Active Directory

Azure Active Directory can connect to an on-premises Active Directory server very easily using something called Azure AD Connect. Most companies are not running everything in the cloud and have an on-premises AD server, so this is a pretty big killer feature.

Syncing the two directories happens transparently but there are a bunch of things that can be configured like the way passwords are synced. I'm not a system administrator so I've not set this up personally but most IT admins can do this pretty painlessly.

Connect Health

The premium edition of Azure Active Directory has a monitoring and reporting capability called Connect Health so you can see who is logging into your system and when. You can also get alerts for any seemingly nefarious activity, like a report on the top 50 users with failed username and password attempts, as well as a report on whether Azure AD is syncing correctly with any on-premises AD server you might have. It's a pretty nice feature for IT Admins, while others might not care too much about it.

Azure Active Directory Connect Health

Two-Factor Authentication

The premium edition has two factor authentication built in right out of the box, so no having to setup a text message provider, plus the costs of sending those messages are included out of the box.

Cloudflare for Identity

Microsoft monitors all logins and actively blocks activity from known attackers (a bit like Cloudflare but for identity), so it should in theory provide some added security. There is not much detail about this though.

Managing Users

You can manage users from the Azure portal but the UI is just about passable. If you are using Office 365, you're in a better position as it provides a better UI to manage users.

UI Customization

Customization of the UI is very basic. You can provide a company logo and background image, which get displayed on the login screens but that's about it.

Developer Experience

The overall developer experience is pretty slick. Creating a new project in Visual Studio lets you enable integration with Azure AD by just logging in using your Azure credentials and selecting your Azure AD account. It doesn't get any easier than that for simple scenarios.

For more complex scenarios, you will inevitably have to log into the Azure Portal and configure things a bit more. You often end up having to download and edit an XML configuration file from the Azure Portal. This is not the best experience in the world.

Overall

You have to pay for the premium features and using the Azure Portal to do identity management is kind of a pain. Out of the box though this is ridiculously fast to setup and can get you up to speed very quickly, while giving you a secure platform.

The documentation is pretty good and there are samples on GitHub with Microsoft developers actively monitoring the issues which was helpful. Some links I found useful:

IdentityServer

IdentityServer is the Swiss Army knife of Identity management. It can do everything but does require a small amount of setup and a little more knowledge of the identity space. It can do most things that I listed above and a lot more beyond.

Multiple Identity Sources

IdentityServer can connect to one or more identity sources. It has to be noted that even if you are using Azure Active Directory, there may still be reasons for choosing IdentityServer which I had not initially considered. For example, if you have more than one source of user data, e.g. you are using AD and also a SQL database of users, then IdentityServer can be used to point to both of these sources of user information. In theory it should also make it easier to switch from AD to something else entirely as it decouples things.

Application Insights

It's possible to integrate Application Insights yourself and record things like logins and password resets. You could build a dashboard of graphs which looks like Connect Health. In fact, you can make it look exactly like Connect Health with very little effort.

Two-Factor Authentication

Two-Factor Authentication requires a third party provider to send text messages and of course this means that there will be a monetary cost. In addition there is a small amount of code you have to write to get things connected.

Use Cloudflare

Azure Active Directory provides some built in support for blocking malicious activity, a bit like Cloudflare but for identity. With IdentityServer, you could use the real Cloudflare and get some added protection for very little effort.

UI Customization

UI customization is where IdentityServer shines. You have full access to the HTML and CSS and can fully customize the look and feel to your heart's content.

Developer Experience

IdentityServer is built by Dominick Baier, Brock Allen and the open source community. I actually did a WCF course under the instruction of Dominick many years ago and I can tell you that IdentityServer is in capable hands.

Any questions or issues you have would be posted on the relevant IdentityServer GitHub project. Dominick, Brock and other community members often answer questions. Overall, it's run as a healthy open source project.

Microsoft Backing

Microsoft has attempted to build their own identity provider in the past but the solution wasn't the best. Having embraced open source, they now recommend IdentityServer themselves.

Overall

The project is actively developed on GitHub and it has well known developers at the helm. There are code samples for all the authentication flows and you can get answers from the community. Some links I found useful:

Authentication Flows

Fact: Security is really, really hard. There are lots of different ways of doing authentication, called 'flows'. I put this link here because I found it very useful for understanding them. Also, the following diagram is key to understanding this entire topic.

Authentication Flows

Summary

Which should you choose? It depends entirely on the problem you have and on the number of developers, the time, money and effort you can expend setting everything up. There is no one size fits all solution. Really, the difference between the two products above is the difference between a SaaS and a PaaS solution.

Which did I choose? While I was doing this research, I discovered to my surprise that the company I work for already had an Azure Active Directory linked to an on-premises Active Directory server because we were using Office 365. That made the choice much easier for us.
