Asp.net hosting undocumented
I’ve wanted to write about the full ASP.NET stack for a long time, and thanks to Roy, I have now found the time and the will to go down the rabbit hole. If you thought the whole aspx page model was already complex, you’re going to discover that it is only the tip of the iceberg of what is really happening.
The life of an HTTP Request
      
       
   
   
We’re going to go down the rabbit hole and look at what happens whenever your browser sends an HTTP request to your beloved .NET server. You thought it was simple? Boy, were you wrong! Please note that I’m going to talk about two cases: IIS5 (Windows 2000 and Windows XP) and IIS6 (Windows Server 2003).
Whenever you type a URL, your browser establishes a connection to your web server and starts sending a piece of very simple text, called an HTTP request. On the listening side, you’ve got a web server (how obvious is that) reading this data. The server, in our case IIS, relies on an extensibility model: ISAPI extensions and filters, which are C++ DLLs that can be initialized by the server and process the request on their own. As a side note, and for general culture, an extension is attached to a file extension, while a filter is always in the IIS process and is there for every request. ASP.NET has both a filter and an extension, but we’ll only concentrate on the extension for today.
As some of you probably know, ISAPI is used by most of the popular Microsoft IIS-centric pieces of software, including plain ol’ ASP, FrontPage extensions, and obviously ASP.NET.
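To give a feel for how simple that text is, and for how a file extension leads to an ISAPI extension, here is a toy sketch in Python. It is not IIS code: the request, the `SCRIPT_MAP` table and both function names are made up for illustration, and the real lookup lives in the IIS configuration rather than in a dictionary.

```python
import posixpath
from urllib.parse import urlsplit

# A complete (if minimal) HTTP request is just text: a request line,
# headers, and an empty line.
RAW_REQUEST = (
    "GET /shop/default.aspx?id=42 HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)

# Illustrative mapping only: file extension -> ISAPI extension DLL.
SCRIPT_MAP = {
    ".aspx": "aspnet_isapi.dll",
    ".asmx": "aspnet_isapi.dll",
    ".asp": "asp.dll",
}


def parse_request_line(raw):
    """Split the first line of the request into method, target and version."""
    request_line = raw.split("\r\n", 1)[0]
    method, target, version = request_line.split(" ")
    return method, target, version


def find_isapi_extension(raw):
    """Return the DLL mapped to the requested file extension, or None."""
    _, target, _ = parse_request_line(raw)
    path = urlsplit(target).path  # drop the query string
    extension = posixpath.splitext(path)[1].lower()
    return SCRIPT_MAP.get(extension)
```

A request for a file type with no mapping returns `None`, which corresponds to the server falling back to serving the file itself.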
You called me?
      
       
   
   
But IIS itself is no longer one and only one. IIS5 was there way before .NET, and relies on the same mechanism as his parents. Junior, IIS6, is a child of the .NET era, and a completely new creation. While they look the same, inside they are completely different.
      
      
The IIS5 process is inetinfo.exe, and it listens by itself on the correct port to process the incoming calls. Whenever he receives an HTTP request, he’s going to look at all of his ISAPI extensions to find the one handling it. If none is found, the request is going to return the file resource requested, or one of the numerous error codes you all love. If one is found, he’s going to load it somewhere, depending on the application protection that you defined in your MMC snap-in.
- Low: the ISAPI DLL is going to be loaded inside inetinfo.exe. That means that in case of a problem, the whole web server goes down and no request can be processed anymore.
- Medium: in that case, all the DLLs are loaded in an external process, the beloved dllhost.exe. That’s where a lot of different things are put by the COM+ infrastructure for out-of-process execution, and the one that gives you a headache because you never know which application in it is provoking this 100% CPU usage. On the other hand, when the process crashes or when you kill it, your web server still processes incoming requests. Obviously, you still lose an undefined number of requests that were being processed.
- High: in this mode, one dllhost.exe is spawned for each ISAPI DLL. Good thing: in case it dies, everything else is preserved. Bad thing: it is very heavy, as a new process is created (and that’s heavy on Windows), plus the COM+ infrastructure handling. You pay the price of boundaries. Nothing is free.
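The three protection levels boil down to a question of who shares a process with whom. The following toy Python model makes the crash behaviour explicit. It is only a sketch: the process labels and the `survivors` helper are hypothetical, and it isolates per application name rather than modelling COM+ itself.

```python
from enum import Enum


class Protection(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def host_process(app_name, protection):
    """Return a label for the process that hosts the given application."""
    if protection is Protection.LOW:
        return "inetinfo.exe"                # inside the web server itself
    if protection is Protection.MEDIUM:
        return "dllhost.exe (pooled)"        # one shared external process
    return f"dllhost.exe ({app_name})"       # one dedicated process each


def survivors(apps, crashed_app):
    """List the applications still alive after crashed_app kills its host."""
    dead_process = host_process(crashed_app, apps[crashed_app])
    if dead_process == "inetinfo.exe":
        return []  # the web server itself is gone, nothing survives
    return sorted(
        name for name, level in apps.items()
        if host_process(name, level) != dead_process
    )
```

With one Low, two Medium and one High application, a crash in a Medium one takes its pooled neighbour with it, while a crash in the High one only kills itself.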
 
      
      
Instead of staying in the living room, on the couch, listening for the doorbell to ring, drinking a beer, yelling in front of the TV… bad… childhood… memories… must… forget… IIS6 relies on two components: http.sys and w3wp.exe.
http.sys, as its name strongly suggests, is a kernel mode driver. Its role is to listen on a TCP port and pass the data around as needed. That means that instead of executing in user space, it goes down to the basement. It’s dark, and as is popular to say, “in kernel mode no one can hear you scream”. If there were a problem down there, chances are that your operating system would die instantly. So why in the name of god did mommy Microsoft put the first of the now revealed twins in such a dangerous place?
      
      
- Network drivers run in kernel mode. That’s the first reason. Drivers run in the kernel, as do the TCP stack and disk access. All the pipes run under the floor, in the basement.
- In kernel mode, you have a very nifty function. You can ask for data to be copied straight from disk to the network card, without it going through user mode, memory and CPU. That’s what I call low overhead.
 
I know you’re looking at that, and I can certainly hear the sound of your “aaaaaaaaah aaaaaaaaaah” moment. The http.sys driver can process all the requests very efficiently. But I can also feel you being afraid of the dark. If I execute unknown code down in the kernel, that could be a serious issue, right? Absolutely right, and that’s exactly why the http.sys driver doesn’t do it at all.
      
      
      
      
Instead, it hands each request to a w3wp.exe worker process serving the application pool the request belongs to. Each one of these worker processes executes the ISAPI DLL in process, so in a pure IIS6 environment you wouldn’t use dllhost anymore. In case of a crash, your web server continues to accept incoming requests, and only the ISAPI DLLs of the affected application pool die. As http.sys relies on w3wp to process user code, as long as it doesn’t die on its own, the driver has no external risk associated with foreign code execution.
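The split of responsibilities can be pictured with a small Python model: a listener that only queues requests per application pool, and workers that pull from their own queue and run the code. This is a sketch under assumptions, not the real driver: the class names, the prefix matching rule and the pool names are all illustrative.

```python
from collections import deque


class KernelListener:
    """Toy stand-in for http.sys: it queues requests, it never runs app code."""

    def __init__(self):
        self.routes = {}   # URL prefix -> application pool name
        self.queues = {}   # application pool name -> pending requests

    def register(self, url_prefix, pool):
        self.routes[url_prefix] = pool
        self.queues.setdefault(pool, deque())

    def receive(self, path):
        """Queue the request for the pool with the longest matching prefix."""
        matches = [prefix for prefix in self.routes if path.startswith(prefix)]
        if not matches:
            return None
        pool = self.routes[max(matches, key=len)]
        self.queues[pool].append(path)
        return pool


class WorkerProcess:
    """Toy stand-in for one w3wp.exe serving a single application pool."""

    def __init__(self, pool, listener):
        self.pool = pool
        self.listener = listener

    def drain(self):
        """Process every request currently queued for this worker's pool."""
        queue = self.listener.queues[self.pool]
        handled = []
        while queue:
            handled.append(f"{self.pool} handled {queue.popleft()}")
        return handled
```

If a worker dies in this model, its queue is untouched and the listener keeps accepting requests, which is the point of keeping foreign code out of the listener.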
      
      
Meet ASPNET_ISAPI.dll
      
       
   
   
For .NET, our ISAPI extension is named ASPNET_ISAPI.dll. Whenever an incoming request matches the file extension to which the ISAPI extension is attached, IIS is going to initialize this DLL. What is it doing? Here again, IIS5 and IIS6 differ wildly.
      
      
To go into more detail (and you’re very much into details, or you wouldn’t be this far into the article), the ISAPI DLL is going to check whether the worker process is present or not. If not, it creates it. In both cases, it is going to create an asynchronous named pipe, onto which the request is going to be sent, after a handshake which will ensure proper communication and allow for transmitting authentication information.
It is interesting to notice that, because there can only be one ASP.NET worker process, all of your ASP.NET applications are running in one process. That completely invalidates the application protection feature of IIS5, even more so if you choose to run it under different credentials.
There’s also a special case. The ISAPI DLL actually reads the machine.config file whenever IIS5 is started. If you set the enable attribute of the processModel element to false, the worker process is not going to be used, and ASP.NET is instead going to be hosted internally in the inetinfo.exe process.
Finally, on the security side, from what I’ve said you could assume that setting the process impersonation in your machine.config file would not work. It does, but instead of executing the worker process under different credentials, IIS sets the token for the correct credentials on the thread executing the request, which is then set in the application domain of your web application.
      
      
      
      
       
      Let’s have fun in the HTTP Pipeline 
      
       
   
   
The next step is obviously to be able to link the unmanaged code containing the request (the worker process side) with the managed code turning it into an aspx page, an asmx web service, or anything else that can be generated from ASP.NET.
      
      
      
      
       
      Creating the AppDomain 
      
       
   
   
Let’s assume that no previous AppDomain was created for your web application when the first request goes through. The worker process is going to call the following method on the AppDomainFactory object.
      
      
      
      
The second thing you see is that it’s given a module (that is, more or less, an assembly file), a typeName, and a few other parameters that are quite self-descriptive. Also note that the return value is an object, as this is going to be very important. Let’s look at what this method is doing.
      
      
| | “bin” |
| PrivateBinPathProbe | * |
| ShadowCopyFiles | True |
| ApplicationBase | The appPath formatted (and validated) as a Uri. Interestingly enough, the cleaning method used, which I have used as well in many projects, is to create a new Uri and return the passed argument as Uri.ToString() |
| ApplicationName | appName |
| ConfigurationFile | web.config |
| DisallowCodeDownload | true |
      
      
      
      
      
Another nice thing to know is that whenever this Evidence-based list is constructed, the framework is going to look at the Host evidence. If any Zone evidence is defined (see System.Security.Policy.Zone), the Zone for “My Computer” is automatically added to the evidence.
Finally, a new Host evidence is added using the strUrlOfAppOrigin.
      
      
      
      
      
      
| | * |
| .appId | appId |
| .appPath | appPath |
| .appVPath | appVPath |
| .domainId | domainId |
| .appName | appName |
      
      
      
- A parent UnionCodeGroup containing an AllMembershipCondition and a PolicyStatement with a PermissionSet constructed with PermissionState.None;
- A first child UnionCodeGroup containing a StrongNameMembershipCondition based on the Microsoft strong name public key, and a PermissionSet constructed with PermissionState.Unrestricted;
- A second child UnionCodeGroup with a UrlMembershipCondition set to the application URL, associated with a PermissionSet constructed from both the Url and Zone, from the application strUrl and iZone parameters, but without the permissions of type UrlIdentityPermission and ZoneIdentityPermission.
 
      
      
By default, the AppDomain has no permission at all (PermissionState.None) for all of the code in it (the AllMembershipCondition).
We then open this up by providing an unrestricted permission set (PermissionState.Unrestricted) to the Microsoft signed DLLs (StrongNameMembershipCondition).
Finally, we add the permissions defined for the application, based on its URL and its zone.
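The way those three code groups combine can be sketched in a few lines of Python. This is a toy model of the union behaviour only, not the .NET security API: assemblies are plain dictionaries, permissions are plain strings, `"*"` stands in for an unrestricted set, and `build_policy` and its parameters are hypothetical names.

```python
from dataclasses import dataclass, field

UNRESTRICTED = frozenset({"*"})  # stand-in for PermissionState.Unrestricted


@dataclass
class CodeGroup:
    """A group grants its own permissions plus those of matching children."""
    matches: callable
    permissions: frozenset
    children: list = field(default_factory=list)

    def resolve(self, assembly):
        if not self.matches(assembly):
            return frozenset()
        granted = set(self.permissions)
        for child in self.children:
            granted |= child.resolve(assembly)  # union, never intersection
        return frozenset(granted)


def build_policy(app_url, app_permissions, trusted_key):
    # Parent: matches all code, grants nothing on its own.
    parent = CodeGroup(lambda asm: True, frozenset())
    # First child: code signed with the trusted key gets everything.
    parent.children.append(
        CodeGroup(lambda asm: asm.get("public_key") == trusted_key, UNRESTRICTED)
    )
    # Second child: code loaded from the application URL gets the app set.
    parent.children.append(
        CodeGroup(
            lambda asm: asm.get("url", "").startswith(app_url),
            frozenset(app_permissions),
        )
    )
    return parent
```

Code that matches neither child ends up with the parent’s empty set, which is the “no permission at all” default described above.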
      
      
      
      
      
      
      
      
      
       
Her royal majesty Runtime the first
      
       
   
   
      
      
- StartProcessing is called once, just after the AppDomain creation. I have no idea why it is there, to be honest, my best guess being that the ASP.NET team removed, in later phases, initialization code that used to run at that point.
- StopProcessing is called whenever the application stops processing incoming requests, automatically provoking the death of the AppDomain.
- DoGCCollect is a bit of a surprise, and I’m sure the .NET architects would have a very good explanation as to why it is there. The actual action is to call GC.Collect() exactly 10 times. My guess here would be a desperate attempt by the worker process to reclaim some of its memory under high load, but I can’t say for sure. Anyone with more information?
- ProcessRequest is the one that is interesting here, as it is really the beginning of the managed HTTP pipeline.
 
      
      
      
      
- The ISAPIWorkerRequest type is the mother of all worker request objects in the ISAPI runtime. The class factory for the types we’re interested in is implemented in the static method CreateWorkerRequest. One of three types of objects can then be created:
- If the process model is used (that’s our iWRType parameter), that is, if it’s different from zero, an ISAPIWorkerRequestOutOfProc object is created;
- If not, and if the IIS version is greater than or equal to six, an ISAPIWorkerRequestInProcForIIS6 object is created;
- And finally, if not, an ISAPIWorkerRequestInProc object is created.
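The decision made by the factory is short enough to write down. Here it is as a Python sketch that only returns the name of the type that would be created. The function and parameter names are mine, not the framework’s, and the real method obviously builds objects rather than strings.

```python
def choose_worker_request_type(i_wr_type, iis_major_version):
    """Mirror the three-way choice described for CreateWorkerRequest."""
    if i_wr_type != 0:
        # The process model is in use: the request crossed a process boundary.
        return "ISAPIWorkerRequestOutOfProc"
    if iis_major_version >= 6:
        return "ISAPIWorkerRequestInProcForIIS6"
    return "ISAPIWorkerRequestInProc"
```

Note that the order matters: the process model check comes first, so the IIS version is only looked at for in-process hosting.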
 
      
      
      
      
       
      Her majesty Runtime the second 
      
       
   
   
Well, yes, the naming convention is a bit awkward, but it does reflect the reality of two chained runtimes. The first one is responsible for the link between the unmanaged and the managed world, the second one is all good managed implementation.
The wonder of calling a static method like HttpRuntime.ProcessRequest is that on the first call, a lot of things are going to happen to construct this object. The static constructor for HttpRuntime is going to first call the Initialize method. It is the place where the registry is traversed to find the Path key that you can find yourself in HKLM\Software\Microsoft\ASP.NET\version\Path, and where the s_installDirectory private variable is initialized.
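Python has no static constructors, but the “everything happens on the first call” behaviour can be imitated to show the idea. In this sketch the `Runtime` class, its members and the placeholder returned by `_read_install_path` are all hypothetical. No registry is read here, the helper only marks where that lookup would sit.

```python
class Runtime:
    """Toy model of one-time initialization triggered by the first call."""

    _initialized = False
    install_directory = None
    init_count = 0  # only here to show that initialization runs once

    @staticmethod
    def _read_install_path():
        # Placeholder for the registry lookup described in the text.
        return "<install path read from the registry>"

    @classmethod
    def _initialize(cls):
        cls.install_directory = cls._read_install_path()
        cls.init_count += 1
        cls._initialized = True

    @classmethod
    def process_request(cls, request):
        if not cls._initialized:
            cls._initialize()  # paid by the very first request only
        return f"processed {request}"
```

The practical consequence is the same as in the managed runtime: the first request pays for the setup, every later one skips it.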
      
      
      
      
      
      
      
      
      
      
A final word
      
       
      
   
   
We’ve been through the whole stack, from the HTTP request up to the beginning of the official ASP.NET HTTP pipeline. But if you look at many other articles on the subject of ASP.NET hosting, you might notice that they in fact talk about two other classes: ApplicationHost and SimpleWorkerRequest. What exactly is the difference with this scheme?
      
      
As the name strongly suggests, ApplicationHost is a class that has been created to help application developers host ASP.NET outside of the IIS environment. It is a separate mechanism that IIS doesn’t use at all. However, by calling the CreateApplicationHost method, you go through the exact same process as the IISAPIRuntime interface, with one strong difference: the code explicitly checks that the underlying platform is NT.
      As for SimpleWorkerRequest,
      it is a very simple “data” class that lets you either execute your asp.net pages within
      the isolation mechanism of AppDomains, through the first constructor and the CreateApplicationHost method,
      or from your own AppDomain through the second constructor of that class.
       
      Conclusion
      
       
   
   
We’ve gone very deep into the unofficial ASP.NET relationship with IIS. What do you want to see in the future? How Mono works? How things are modified in ASP.NET 2.0? Or uncover some secrets from the “official” HTTP pipeline?
      
      
SerialSeb