Asp.net hosting undocumented
I’ve wanted to write about the full ASP.net stack for a long time, and thanks to Roy, I now find the time and the will to go down the rabbit hole. If you thought the whole aspx page model is already complex, you’re going to discover that this is only the tip of the iceberg of what is really happening.
The life of an HTTP Request
We’re going to go down the rabbit hole, and look at what happens whenever your browser sends an HTTP request to your beloved .net server. You thought it was simple? Boy, were you wrong! Please note that I’m going to talk about two cases: IIS5 (Windows 2000 and Windows XP) and IIS6 (Windows Server 2003).
Whenever you type a URL, your browser establishes a connection to your web server, and starts sending a piece of very simple text, called an HTTP request. On the listening side, you’ve got a web server (how obvious is that) that is reading this data. The server, in our case IIS, relies on an extensibility model: ISAPI extensions and filters, which are C++ dlls that can be initialized by the server and process the request on their own. As a side note and for general culture, an extension is attached to a file extension, while a filter is always loaded in the IIS process and sees every request. Asp.net has both a filter and an extension, but we’ll only concentrate on the extension for today.
As some of you probably know, ISAPI
filters are used for most of the popular Microsoft IIS centric pieces of software,
including plain ol’ ASP, FrontPage extensions, and obviously asp.net.
You called me?
But IIS itself is no longer one and only one. IIS5 was there way before .net, and relies on the same mechanism as his parents. Junior, IIS6, is a child of the .net era, and a completely new creation. While they look the same, inside they are completely different.
The IIS5 process is inetinfo.exe, and it listens by itself on the correct port to process the incoming calls. Whenever he receives an HTTP request, he’s going to look at all of his isapi extensions to find the one handling it. If none is found, the request is going to return the file resource requested, or one of the numerous error codes you all love. If one is found, he’s going to load it somewhere, depending on the application protection that you defined in your MMC snap-in.
- Low: the isapi dll is loaded inside inetinfo.exe. That means that in case of a problem, the whole web server goes down and no request can be processed anymore.
- Medium: in that case, all the dlls are loaded in an external process, the beloved dllhost.exe. That’s where a lot of different things are put by the COM+ infrastructure for out of process execution, and the one that gives you a headache because you never know which application in it is provoking this 100% CPU usage. On the other hand, when the process crashes or when you kill it, your web server still processes incoming requests. Obviously, you still lose an undefined number of requests that were being processed.
- High: in this mode, one dllhost.exe is spawned for each application. Good thing: in case it dies, everything else is preserved. Bad thing: it is very heavy, as a new process is created (and that’s heavy on Windows), plus the COM+ infrastructure handling. You pay the price of boundaries. Nothing is free.
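To make the trade-off between the three levels concrete, here is a minimal Python sketch of where the code ends up and who survives a crash. It is only a model of the description above, not anything IIS exposes: the names `Protection`, `host_process` and `survivors` are made up for the illustration, and it assumes that isolation in High mode is per application.

```python
from enum import Enum


class Protection(Enum):
    """The three IIS5 application protection levels described above."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def host_process(protection, app_name):
    """Return a label for the process hosting the application's isapi dlls.

    Hypothetical helper: the labels are plain strings, not real process handles.
    """
    if protection is Protection.LOW:
        return "inetinfo.exe"
    if protection is Protection.MEDIUM:
        return "dllhost.exe (pooled)"
    # Assumption: High isolates each application in its own dllhost.exe.
    return "dllhost.exe (dedicated to %s)" % app_name


def survivors(crashed_host, hosts):
    """List the applications still alive after the given host process dies."""
    if crashed_host == "inetinfo.exe":
        # The web server itself is gone, so nothing is served anymore.
        return []
    return sorted(app for app, host in hosts.items() if host != crashed_host)


if __name__ == "__main__":
    apps = [("shop", Protection.MEDIUM), ("blog", Protection.MEDIUM),
            ("admin", Protection.HIGH)]
    hosts = {name: host_process(level, name) for name, level in apps}
    print(survivors("dllhost.exe (pooled)", hosts))
```

Killing the pooled process takes down both pooled applications at once, which is exactly the price you pay for sharing one dllhost.exe.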
Instead of staying in the living room, on the couch, listening for the doorbell to ring, drinking a beer, yelling in front of tv… bad… childhood… memories… must… forget… IIS6 relies on two components: http.sys and w3wp.exe.
Http.sys, as its name strongly suggests, is a kernel mode driver. Its role is to listen on a tcp port, and pass the data around as needed. That means that instead of executing in user space, it goes down into the basement. It’s dark, and as is popular to say, “in kernel mode no one can hear you scream”. If there were a problem down there, chances are that your operating system would die instantly. So why in the name of god did mommy Microsoft put the first of the now revealed twins in such a dangerous place?
- Network drivers run in kernel mode. That’s the first reason. Drivers run in the kernel, as do the tcp stack and disk access. All the pipes run under the floor, in the basement.
- In kernel mode, you have a very nifty function. You can ask for data to be copied straight from disk to the network card, without it going through user mode, memory and cpu. That’s what I call low overhead.
I know you look at that, and I
can certainly hear the sound of your “aaaaaaaaah aaaaaaaaaah” moment. The http.sys
driver can process all the requests very efficiently. But I can also feel you being
afraid of the dark. If I execute unknown code down in the kernel, that could be a
serious issue right? Absolutely right, and that’s exactly why the http.sys driver
doesn’t do it at all.
Instead, it hands each request over to a user mode worker process, w3wp.exe, which serves an application pool. Each one of these worker processes executes the isapi dlls in process, so in a pure IIS6 environment you wouldn’t use dllhost anymore. In case of a crash, your web server continues to accept incoming requests, and only the isapi dlls hosted in that application pool die. As http.sys relies on w3wp to process user code, as long as it doesn’t die on its own, the driver has no external risk associated with foreign code execution.
Meet ASPNET_ISAPI.dll
For .net, our ISAPI extension is named ASPNET_ISAPI.dll. Whenever an incoming request matches a file extension to which this isapi dll is attached, IIS is going to initialize this dll. What is it doing? Here again, IIS5 and IIS6 differ wildly.
To go more into details (and we’re much into details, or you wouldn’t be that far in the article), the isapi dll is going to check whether the worker process is present or not. If not, it creates it. In both cases, it is going to create an asynchronous named pipe, onto which the request is going to be sent, after a handshake which ensures proper communication and allows for transmitting authentication information.
It is interesting to notice that,
because there can only be one asp.net worker process, all of your asp.net applications
are running in one process. That invalidates completely the application protection feature
of IIS5, even more if you choose to run it under different credentials.
There’s also a special case. The ISAPI dll actually reads the machine.config file whenever IIS5 is started. If you set the enable attribute of the processModel element to false, the worker process is not going to be used, and asp.net is instead going to be hosted internally in the inetinfo.exe process.
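As an illustration of that switch, here is a small Python sketch that reads a machine.config-like fragment and decides whether the external worker process would be used. It is an assumption-laden model, not what the ISAPI dll really does: the `SAMPLE` fragment is cut down to the one element we care about, `uses_worker_process` is a made-up helper, and treating a missing element or attribute as “enabled” is my assumption about the default.

```python
import xml.etree.ElementTree as ET

# Cut-down, hypothetical machine.config fragment: only the processModel
# element matters here, the real file contains much more.
SAMPLE = """
<configuration>
  <system.web>
    <processModel enable="false" />
  </system.web>
</configuration>
"""


def uses_worker_process(config_xml):
    """Return True when the external worker process would be used."""
    root = ET.fromstring(config_xml)
    element = next(root.iter("processModel"), None)
    if element is None:
        # Assumption: no processModel element means the default (enabled).
        return True
    # Assumption: anything other than an explicit "false" counts as enabled.
    return element.get("enable", "true").strip().lower() != "false"


if __name__ == "__main__":
    print(uses_worker_process(SAMPLE))
```

With the sample above the function answers False, which corresponds to the case where asp.net ends up hosted inside inetinfo.exe.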
Finally, on the security side,
from what I’ve said you could assume that setting the process impersonation in your
machine.config file would not work. It does, but instead of executing the worker process
under a different credential, IIS sets the token to the correct credential on the
thread executing the request, which is then set in the application domain of your
web application.
Let’s have fun in the HTTP Pipeline
The next step is obviously to be able to link the unmanaged code containing the request (the worker process side) with the managed code turning it into an aspx page, an asmx web service, or anything else that can be generated from ASP.net.
Creating the AppDomain
Let’s assume that no previous AppDomain was created for your web application when the first request goes through. The worker process is going to call the following method on the AppDomainFactory object.
The second thing you see is that
it’s given a module (that is, more or less, an assembly file), a typeName,
and a few other parameters that are quite self descriptive. Also note that the return
value is an object, as this is going to be very important. Let’s look at what this
method is doing.
| Name | Value |
| --- | --- |
| | “bin” |
| PrivateBinPathProbe | * |
| ShadowCopyFiles | True |
| ApplicationBase | The appPath formatted (and validated) as a Uri. Interestingly enough, the cleaning method used, which I have used as well in many projects, is to create a new Uri and return the passed argument as Uri.ToString() |
| ApplicationName | appName |
| ConfigurationFile | web.config |
| DisallowCodeDownload | true |
Another nice thing to know is that
whenever this Evidence based list is constructed, the framework is going to look at
the Host evidences. If any Zone evidence is defined (see System.Security.Policy.Zone),
the Zone for “My Computer” is automatically added to the evidences.
Finally, a new Host evidence is
added using the strUrlOfAppOrigin.
| Name | Value |
| --- | --- |
| | * |
| .appId | appId |
| .appPath | appPath |
| .appVPath | appVPath |
| .domainId | domainId |
| .appName | appName |
- A parent UnionCodeGroup containing an AllMembershipCondition and a PolicyStatement with a PermissionSet constructed with PermissionState.None;
- A first child UnionCodeGroup containing a StrongNameMembershipCondition based on the Microsoft strong name public key, and a PermissionSet constructed with PermissionState.Unrestricted;
- A second child UnionCodeGroup with a UrlMembershipCondition set to the application url, associated with a PermissionSet constructed on both the Url and Zone, from the application strUrl and iZone parameters, but without the permissions of type UrlIdentityPermission and ZoneIdentityPermission.
By default, the appdomain has no permission at all (PermissionState.None) for all of the code in it (the AllMembershipCondition).
We then open this up by providing an unrestricted permission set (PermissionState.Unrestricted) to the Microsoft signed dlls (StrongNameMembershipCondition).
Finally, we add the permissions defined for the application, based on its url and its zone.
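The way these three code groups combine can be modelled in a few lines of Python. This is a deliberately simplified sketch: real permission sets are not sets of strings, the url membership test is reduced to a prefix check, and every name in it (`Assembly`, `granted_permissions`, `UNRESTRICTED`, the sample permission labels) is made up for the illustration.

```python
from dataclasses import dataclass

# Stand-in for PermissionState.Unrestricted in this toy model.
UNRESTRICTED = frozenset({"*"})


@dataclass(frozen=True)
class Assembly:
    """Hypothetical description of a loaded assembly."""
    url: str
    signed_by_microsoft: bool = False


def granted_permissions(assembly, app_url, app_permissions):
    """Union the grants of every code group the assembly matches."""
    # Parent group: matches all code and grants nothing.
    granted = frozenset()
    # First child: code signed with the Microsoft key gets everything.
    if assembly.signed_by_microsoft:
        granted |= UNRESTRICTED
    # Second child: code coming from the application url gets the
    # permissions computed for the application (prefix match is an
    # assumption standing in for the url membership condition).
    if assembly.url.startswith(app_url):
        granted |= frozenset(app_permissions)
    return granted


if __name__ == "__main__":
    app_url = "file:///c:/inetpub/wwwroot/shop/"
    page = Assembly(app_url + "bin/shop.dll")
    print(sorted(granted_permissions(page, app_url, {"FileIO", "Web"})))
```

Because the groups are unioned, code that matches nothing but the parent ends up with an empty grant, which is the “no permission at all” default described above.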
Her royal majesty Runtime the first
- StartProcessing is called one time just after the AppDomain creation. I have no idea why it is there to be honest, my best guess being that the managed asp.net team decided in later phases to remove initialization code that was done at that point.
- StopProcessing is called whenever the application stops processing incoming requests, automatically provoking the death of the AppDomain.
- DoGCCollect is a bit of a surprise, and I’m sure the .net architects would have a very good explanation as to why it is there. The actual action is to call GC.Collect() exactly 10 times. My guess here would be a desperate attempt by the worker process to reclaim some of its memory under high load, but I can’t say for sure. Anyone with more information?
- ProcessRequest is the one that is interesting here, as it is really the beginning of the managed HTTP pipeline.
-
The ISAPIWorkerRequest type is
the mother of all worker request objects in the ISAPI runtime. The class factory for
the types we’re interested in is implemented in the static method CreateWorkerRequest.
One of three types of objects can then be created:
- If the process model is used (that’s our iWRType parameter), that is if it’s different from zero, an ISAPIWorkerRequestOutOfProc object is created;
- If not, and if the IIS version is greater than or equal to six, an ISAPIWorkerRequestInProcForIIS6 object is created;
- And finally, if not, an ISAPIWorkerRequestInProc object is created.
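The selection boils down to two tests in a fixed order. Here is a minimal Python sketch of that decision, returning the type names quoted above as plain strings. The function name `worker_request_type` and its parameter names are mine, and this is only a model of the description, not the framework’s code.

```python
def worker_request_type(iwr_type, iis_major_version):
    """Pick the worker request type name from the two inputs described above.

    iwr_type: non-zero when the process model is used.
    iis_major_version: major version of the hosting IIS (5, 6, ...).
    """
    if iwr_type != 0:
        # The process model is in use, so the request comes from outside.
        return "ISAPIWorkerRequestOutOfProc"
    if iis_major_version >= 6:
        return "ISAPIWorkerRequestInProcForIIS6"
    return "ISAPIWorkerRequestInProc"


if __name__ == "__main__":
    for args in [(1, 5), (0, 6), (0, 5)]:
        print(args, worker_request_type(*args))
```

Note that the process model test comes first, so the IIS version only matters for in-process hosting.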
Her majesty Runtime the second
Well, yes, the naming convention
is a bit awkward, but it does reflect the reality of two chained runtimes. The first
one is responsible for the link between the unmanaged and the managed world, the second
one is all good managed implementation.
The wonder of calling a static method like HttpRuntime.ProcessRequest is that on the first call, a lot of things are going to happen to construct this object. The static constructor for HttpRuntime is going to first call the Initialize method. It is the place where the registry is traversed to find the Path value that you can find yourself in HKLM\Software\Microsoft\ASP.NET\version\Path, and to initialize the s_installDirectory private variable.
A final word
We’ve been through the whole stack from the HTTP request up to the beginning of the official asp.net http pipeline. But if you look at many other articles on the subject of asp.net hosting, you might notice that they in fact talk about two other classes: ApplicationHost and SimpleWorkerRequest. What exactly is the difference with this scheme?
As the name strongly suggests, ApplicationHost is a class that has been created to help application developers host asp.net outside of the IIS environment. It is a separate mechanism that IIS doesn’t use at all. However, by calling the CreateApplicationHost method, you go through the exact same process as the IISAPIRuntime interface, with one strong difference: the code explicitly checks for the underlying platform being NT.
As for SimpleWorkerRequest,
it is a very simple “data” class that lets you either execute your asp.net pages within
the isolation mechanism of AppDomains, through the first constructor and the CreateApplicationHost method,
or from your own AppDomain through the second constructor of that class.
Conclusion
We’ve gone very deep into the unofficial asp.net relationship with IIS. What do you want to see in the future? How mono works? How things are modified in asp.net 2.0? Or some uncovered secrets from the “official” http pipeline?