Understanding IIS 8.0 Rewrite inbound rule for http to https redirect using regular expression

iis-8.5 regex rewrite

I have managed to configure a rewrite rule for my web site using this answer:

<rule name="Redirect from non www" stopProcessing="true">
  <match url=".*" />
  <conditions>
    <add input="{HTTP_HOST}" pattern="^example\.com$" />
  </conditions>
  <action type="Redirect" url="https://www.example.com/{R:0}" redirectType="Permanent" />
</rule>

<!-- this is the rule I am interested in -->
<rule name="Redirect from non https" stopProcessing="true">
  <match url=".*" />
  <conditions>
    <add input="{HTTPS}" pattern="^OFF$" />
    <add input="{HTTP_HOST}" pattern="^www\.example\.com$" />
  </conditions>
  <action type="Redirect" url="https://www.example.com/{R:0}" redirectType="Permanent" />
</rule>

However, I have trouble understanding how the url attribute of the action element actually works. If I go to IIS -> Rewrite rules -> Redirect from non https -> Test pattern, enter the URL http://www.example.com/subdir/?param=value and hit Test, I receive {R:0} = http://www.example.com/subdir/?param=value.

This makes sense, as the .* regular expression matches the whole string.

Question: How does the URL Rewrite engine produce https://www.example.com/subdir/?param=value instead of https://www.example.com/http://www.example.com/subdir/?param=value?

Best Answer

I know this is a bit old, but just to add something to it.

One solution would be to create additional Capture Groups in your Regular Expression for the url in the match element of the rule, so as to extract explicit parts of the URL.

The Back Reference {R:0} always holds the entire text matched by the pattern, so to get at only part of it you add Capture Groups, which extract the substrings you are interested in as {R:1}, {R:2}, and so on.

An example Regular Expression to achieve this is below.

This consists of 2 groups: one Non-Capturing Group and one standard Capture Group.

^(?:http:)(.*)
  1. (?:http:) - is a Non-Capturing Group, denoted by the "?:" prefix. It matches the literal string "http:" but is not included in the returned Back References.

  2. (.*) - is a standard Capture Group that captures all the remaining characters in the string after the Non-Capturing Group, that is, everything after "http:".

The resulting Capture Group Back References will be:

  1. {R:0} :: the whole original URL
  2. {R:1} :: everything after the "http:"

So the url attribute in your match node will be modified like so:

<match url="^(?:http:)(.*)" />

And the url attribute in your action node will be modified like so:

<action type="Redirect" url="https:{R:1}" redirectType="Permanent" />
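To see what the two back references contain for the string from the question, here is a minimal sketch that uses Python's re module as a stand-in for the regex engine in IIS. The grouping syntax used here behaves the same way in both, but the variable names are purely illustrative and nothing below is part of the URL Rewrite module itself.

```python
import re

# Pattern from the match element: one non-capturing group, one capture group.
pattern = re.compile(r"^(?:http:)(.*)")

# The same string that was typed into the Test Pattern dialog.
tested = "http://www.example.com/subdir/?param=value"

match = pattern.match(tested)
r0 = match.group(0)  # plays the role of {R:0}: the whole matched text
r1 = match.group(1)  # plays the role of {R:1}: everything after "http:"

# Equivalent of substituting {R:1} into the action url "https:{R:1}".
redirect_url = "https:" + r1

print(r0)
print(r1)
print(redirect_url)
```

The non-capturing group takes part in the match but does not get a back-reference number, which is why "http:" shows up in {R:0} and not in {R:1}.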

The Regular Expression syntax supported in the rules (according to the docs) is ECMAScript – Perl compatible (ECMAScript standard compliant) regular expression syntax.

More info can be found in the MS Docs for IIS: https://docs.microsoft.com/en-us/iis/extensions/url-rewrite-module/url-rewrite-module-configuration-reference

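Please be aware that this is a very general solution and may not apply in all cases - always test using the Test Pattern function in the Edit Rule screen of IIS to be certain of efficacy. Keep in mind that Test Pattern evaluates the regex against exactly the string you type in, whereas at run time the module matches the pattern in the match element against the URL path only (no scheme, host name, leading slash or query string), so test with the input the rule will really receive, e.g. subdir/ for the URL above.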
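Why the test input matters can be shown with a rough simulation. The sketch below assumes the behaviour described in the configuration reference linked above: the match pattern sees only the URL path without its leading slash, and the query string is appended to the redirect target by default (the appendQueryString attribute). The simulate_redirect helper is hypothetical, written only for illustration; it ignores the conditions block and is not how the module is implemented.

```python
import re
from urllib.parse import urlsplit


def simulate_redirect(request_url, match_pattern, action_url, append_query_string=True):
    """Hypothetical approximation of how a Redirect rule builds its target URL."""
    parts = urlsplit(request_url)

    # Assumption: the rule pattern is tested against the path only,
    # with the leading slash removed (no scheme, host or query string).
    rule_input = parts.path[1:] if parts.path.startswith("/") else parts.path

    match = re.search(match_pattern, rule_input)
    if match is None:
        return None  # the rule would not apply to this request

    # Expand {R:n} back references in the action url.
    target = re.sub(
        r"\{R:(\d+)\}",
        lambda ref: match.group(int(ref.group(1))) or "",
        action_url,
    )

    # Assumption: the original query string is appended unless disabled.
    if append_query_string and parts.query:
        target += ("&" if "?" in target else "?") + parts.query
    return target


request = "http://www.example.com/subdir/?param=value"

# The rule from the question: {R:0} is just "subdir/" here.
original_rule = simulate_redirect(request, r".*", "https://www.example.com/{R:0}")

# A pattern anchored on "http:" is tested against "subdir/" as well.
scheme_rule = simulate_redirect(request, r"^(?:http:)(.*)", "https:{R:1}")

print(original_rule)
print(scheme_rule)
```

Under these assumptions the original rule yields https://www.example.com/subdir/?param=value, because {R:0} never contains the scheme or host in the first place, while a pattern that expects a leading "http:" finds nothing to match in the path. That is worth verifying against a real request before relying on such a pattern.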