Best practices for ASP.Net applications
By George Mihaescu
Summary: this article lists a set of common practices
that I've discovered to be useful when building any ASP.Net application. This
is a live document; as I find more common practices I will add them here, so
check back often. All the practices described here come at no (or very little)
cost during design and development but offer substantial benefits immediately
and for future functionality enhancements and maintenance – in other words,
they give you a lot of bang for very little buck.
These practices are specific to ASP.Net applications; I do
not attempt to cover general .Net "best practices" such as proper use
of the Disposable pattern, avoiding boxing and unboxing, efficient iteration or
proper string concatenation.
I use the list below as a checklist and heads-up for the
rest of the team before starting a new ASP.Net project.
The Development Environment
By the time you read this, you may already have your Visual
Studio 2005 set up with the additional options I describe below, but I list
them here nonetheless for completeness:
1. Apply the VS2005 Service Pack 1 (VS80sp1-KB926601-X86-ENU.exe) from Microsoft.
This not only fixes issues and
improves Visual Studio, but also adds the "ASP.Net Web Application"
project template, which allows you to create project-based ASP.Net applications
(in addition to the "Web Site" template that is available with VS2005
without this service pack). I strongly recommend that you use the "ASP.Net
Web Application" project template instead of the "Web Site"
template mainly because this new template allows you control over which files are
included in the project ("Web Site" assumes that all files under the
application root – be it a local file system directory or an IIS virtual
directory – are part of the site, which is very annoying especially when it
comes to source control, which assumes all files must go in the source control
system).
However, when creating a project based on this new "ASP.Net Web
Application" project type, the "Build > Publish web site"
function is different than the one available when using the "Web
Site" template – specifically you lose the ability to pre-compile the site
and therefore your published Web application will contain the .aspx files
un-compiled (i.e. as source code) and they will be compiled at runtime, on
first request. I don't understand why Microsoft chose to downgrade this
functionality with a package that otherwise significantly upgrades the
development model for ASP.Net web applications, but there is a fix (and a nice
one, too) which I describe at item 2 below.
Note that migrating existing ASP.Net applications created using the "Web
Site" template to the new "ASP.Net Web Application" project type
is quite simple. The process is extensively described on MSDN as well as on
many other sites (e.g. Scott Guthrie's article) so I won't get into those
details here. In addition
to the detailed steps described in Scott's article I just want to add that if
you get compilation errors when referring to classes that used to be under the
App_Code folder (and are now under the Old_App_code folder as a result of the
migration) you need to select those source files, select Properties and
change the Build Action property value from Content to Compile.
2. Download and install the "Web Deployment Project" template from Microsoft
(WebDeploymentSetup.msi). This gives you a new "Web Deployment
Project" template that you should use to create deployment images of ASP.Net
web applications created as described above. This project template not
only offers you the ability to pre-compile everything in assemblies (simplifying
the deployment and avoiding the .aspx source files from being deployed in
un-compiled form) but also:
o Allows you unlimited control over what gets deployed (such as to remove
unwanted files or to copy additional files in the deployment image). Just select
the deployment project and choose "Open project file", then edit the
opened .wdproj XML (which conforms to the MSBuild schema) – for instance add your
own custom actions under the <Target Name="AfterBuild"> element.
o Resolves the perennial problem of re-configuring the web.config
file for deployment. It can do this for you by replacing specific
sections in the web.config used in development with new sections supplied by
you that contain the correct content for the deployment target, or you can do it
yourself by editing the .wdproj XML file (see above) to replace the
development-only web.config file with a web.config having the content ready for
deployment.
o Resolves the annoying problem of inadvertently deploying the
contents of the App_Data folder (which is either useless on the deployment
target because that uses a different data source, or downright bad because it
overwrites the existing data source on the target with the development data
source).
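As a rough illustration of the custom actions mentioned above, the fragment below sketches an AfterBuild target for a .wdproj file. It assumes the standard MSBuild Delete, RemoveDir and Copy tasks and that the deployment project exposes its output folder as $(OutputPath); the file names debug.log and Web.Production.config are hypothetical placeholders, not something the template provides.

```xml
<!-- Sketch only: runs after the deployment image has been built. -->
<Target Name="AfterBuild">
  <!-- Remove a development-only file from the image (hypothetical file name). -->
  <Delete Files="$(OutputPath)\debug.log" />
  <!-- Keep the development data source out of the deployment image. -->
  <RemoveDir Directories="$(OutputPath)\App_Data" />
  <!-- Overwrite the development web.config with a deployment-ready copy
       (Web.Production.config is a hypothetical file kept next to the project). -->
  <Copy SourceFiles="$(MSBuildProjectDirectory)\Web.Production.config"
        DestinationFiles="$(OutputPath)\web.config" />
</Target>
```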
Design / Implementation Patterns
1. Design: use
your own base class for all pages. Instead of deriving all pages in the
application from System.Web.UI.Page, derive them from your own class (e.g.
BasePage) that in turn derives from System.Web.UI.Page. This
technique allows you to add functionality common to all pages by implementing
it in the BasePage class. For example, you could implement in the BasePage a
function to get a database connection and automatically close it on the
Page_Unload event handler. This class may not appear very useful initially, but
may prove to be a great help later when new functionality needs to be added
that must be available to all (or most of) the pages in the application.
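A minimal sketch of such a base class follows, assuming the connection example given above. The class name BasePage comes from the text; the connection string name "Main" and the GetConnection method are hypothetical, and the sketch overrides OnUnload rather than wiring a Page_Unload handler, which has the same effect.

```csharp
using System;
using System.Configuration;
using System.Data;
using System.Data.SqlClient;
using System.Web.UI;

// Hypothetical base class: every page derives from it instead of Page.
public class BasePage : Page
{
    private IDbConnection m_connection;

    // Lazily opens one connection per request.
    // The connection string name "Main" is an assumption.
    protected IDbConnection GetConnection()
    {
        if (m_connection == null)
        {
            string cs = ConfigurationManager.ConnectionStrings["Main"].ConnectionString;
            m_connection = new SqlConnection(cs);
            m_connection.Open();
        }
        return m_connection;
    }

    // Unload is raised at the end of the page life cycle, so the
    // connection is released even if a derived page forgets to do it.
    protected override void OnUnload(EventArgs e)
    {
        if (m_connection != null)
        {
            m_connection.Dispose();
            m_connection = null;
        }
        base.OnUnload(e);
    }
}
```

A concrete page would then be declared as "public partial class Default : BasePage" and simply call GetConnection() where needed.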
2. Design: have
business objects that model the business logic of the application. This may
seem obvious, but many web application developers implement business logic in
the actual page code-behind class, mixing the presentation (UI) with the
business logic. This is bad not only because that particular business logic
cannot be re-used, but also because it cannot be unit and performance-tested.
Start with a proper analysis and design an object model for your application, and
then create the classes that implement the object model. The code left in the
code-behind classes should be very high-level, simply assembling the required
logic through calls to the applicable business objects.
3. Design: once
you've done 2) above, implement those classes modeling the business objects to
be agnostic of the web environment in which they are being used (de-couple
those classes from the web-related .Net classes). For example, the business
logic classes should not use classes such as HttpSessionState, HttpRequest,
etc. If this is not always possible (although I don’t immediately see why),
isolate your classes that need to be "web-aware" in a separate
namespace, and ideally in a separate class library (see below). This will improve
the reusability and testability of those classes tremendously (see item 4
below).
4. Project
organization: a follow-up of items 2 and 3 above: don't create the required
object model classes in the App_Code folder of the web application, but in
separate class library project(s) (which can be part of the same solution as
the web application project). This way you can easily:
a. Implement unit
and performance tests for the object model classes.
b. Re-use this
object model library for other projects – and therefore if you followed item 3
above and made those classes unaware of the web context, you can re-use them even
in Windows Forms applications or other class libraries.
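To make items 2 to 4 concrete, here is a small sketch of a business object that lives in a class library and knows nothing about the web. The Invoice class, its members and the tax rule are hypothetical, chosen only to show that the logic can be exercised without a page or a web server.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical business object: no reference to System.Web anywhere.
public class Invoice
{
    private readonly List<decimal> m_lineAmounts = new List<decimal>();
    private readonly decimal m_taxRate;

    public Invoice(decimal taxRate)
    {
        if (taxRate < 0m) throw new ArgumentOutOfRangeException("taxRate");
        m_taxRate = taxRate;
    }

    public void AddLine(decimal amount)
    {
        m_lineAmounts.Add(amount);
    }

    // The business rule lives here, not in a code-behind class.
    public decimal Total()
    {
        decimal subtotal = 0m;
        foreach (decimal amount in m_lineAmounts)
        {
            subtotal += amount;
        }
        return subtotal + subtotal * m_taxRate;
    }
}
```

The code-behind class is then reduced to creating the object, feeding it the user input and displaying the result, and the same class can be called directly from a unit test project.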
5. Implementation:
implement a type-safe session object instead of directly using the un-typed HttpSessionState
offered by the .Net framework. I present the benefits and details of two such
implementations in this article.
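For readers who want the general shape of the idea before following the link, a typed facade over the session could look roughly like the sketch below. This is not either of the implementations from the referenced article; the class name TypedSession and the two sample properties are hypothetical.

```csharp
using System.Web;
using System.Web.SessionState;

// Hypothetical typed facade over the un-typed session store.
public sealed class TypedSession
{
    // Keys are defined once, so a typo cannot create a second entry.
    private const string UserNameKey = "UserName";
    private const string CartCountKey = "CartCount";

    private readonly HttpSessionState m_session;

    public TypedSession(HttpSessionState session)
    {
        m_session = session;
    }

    // Convenience accessor for the session of the current request.
    public static TypedSession Current
    {
        get { return new TypedSession(HttpContext.Current.Session); }
    }

    public string UserName
    {
        get { return (string)m_session[UserNameKey]; }
        set { m_session[UserNameKey] = value; }
    }

    public int CartCount
    {
        get
        {
            // A missing entry is reported as zero instead of throwing on the cast.
            object raw = m_session[CartCountKey];
            return raw == null ? 0 : (int)raw;
        }
        set { m_session[CartCountKey] = value; }
    }
}
```

Pages then write TypedSession.Current.CartCount instead of casting Session["CartCount"] by hand, and the compiler checks both the name and the type.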
6. Implementation:
implement a type-safe cache object instead of directly using the un-typed System.Web.Caching.Cache
offered by the .Net framework. I present the benefits and details of two such
implementations in this article.
7. Implementation:
implement the Application_Error event handler in your code-behind class for the
global.asax (the class derived from HttpApplication). In this handler log the
complete list of exceptions (call Server.GetLastError to get the exception,
then walk the exception list based on the InnerException property until you get
null), the stack trace, the time of the exception as well as the URL that was
requested when the exception occurred (Request.RawUrl). It may be useful to also
log the Request.UserAgent to see what type of client was being used to access
the site – I've experienced certain types of exceptions when the site was
accessed by Google trying to periodically check the cache of pages that Google
stores; this may allow you to decide how to handle certain exceptions, or even
to ignore them.
This log is very valuable when you need to track down problems in a production
environment.
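The walk over the exception list described above can be sketched as a small helper that has no dependency on the web classes, so it can be tested on its own. The class name ExceptionLogFormatter and the layout of the log entry are assumptions made for illustration.

```csharp
using System;
using System.Text;

public static class ExceptionLogFormatter
{
    // Builds one log entry: time, URL, user agent, then every exception
    // in the InnerException chain together with its stack trace.
    public static string Format(Exception error, string rawUrl, string userAgent, DateTime when)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendLine("Time: " + when.ToString("u"));
        sb.AppendLine("URL: " + rawUrl);
        sb.AppendLine("User agent: " + userAgent);

        int depth = 0;
        // Follow InnerException until it is null.
        for (Exception ex = error; ex != null; ex = ex.InnerException)
        {
            sb.AppendLine("[" + depth + "] " + ex.GetType().FullName + ": " + ex.Message);
            sb.AppendLine(ex.StackTrace ?? "(no stack trace)");
            depth++;
        }
        return sb.ToString();
    }
}
```

Inside Application_Error the helper would be called with Server.GetLastError(), Request.RawUrl, Request.UserAgent and DateTime.Now, and the returned string written to whatever log the application uses.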
I usually add a piece of code to notify me by email when such an exception
occurs – if you choose to do this, spend some time and implement a thread-safe
queue of email requests serviced by a worker thread that sends the email (using
the standard .Net classes in System.Net.Mail).
This implementation is easily reusable in other contexts and also ensures that
the ASP.Net threads are not blocked for potentially slow network-bound
operations (i.e. the ASP.Net thread on which the exception has occurred should
not block until the sending of the email completes, especially since the
remainder of the code flow does not depend in any way on the result of the
email notification).
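One possible shape for the queue described above is sketched below. To keep it self-contained the actual sending is passed in as a delegate (which in the application would wrap a System.Net.Mail call); the names NotificationQueue and NotificationSender are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Performs the actual send; in the application this would wrap SmtpClient.Send.
public delegate void NotificationSender(string subject, string body);

public sealed class NotificationQueue
{
    private struct Item { public string Subject; public string Body; }

    private readonly Queue<Item> m_items = new Queue<Item>();
    private readonly object m_sync = new object();
    private readonly NotificationSender m_sender;

    public NotificationQueue(NotificationSender sender)
    {
        m_sender = sender;
        Thread worker = new Thread(Run);
        worker.IsBackground = true; // do not keep the process alive
        worker.Start();
    }

    // Called on the ASP.Net request thread: returns immediately.
    public void Enqueue(string subject, string body)
    {
        Item item;
        item.Subject = subject;
        item.Body = body;
        lock (m_sync)
        {
            m_items.Enqueue(item);
            Monitor.Pulse(m_sync); // wake the worker if it is waiting
        }
    }

    private void Run()
    {
        while (true)
        {
            Item item;
            lock (m_sync)
            {
                while (m_items.Count == 0) Monitor.Wait(m_sync);
                item = m_items.Dequeue();
            }
            try
            {
                m_sender(item.Subject, item.Body); // slow call, made outside the lock
            }
            catch (Exception)
            {
                // A failed notification must not kill the worker thread.
            }
        }
    }
}
```

A single instance would typically be created at application start and shared by all requests; the lock is held only while touching the queue, never while the email is being sent.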
8. Maintainability
(and performance): avoid reflection or any late-binding mechanism. Reflection
and late binding in general are not only costly in terms of performance (which I
will describe below under the Performance and Scalability section) but also
escape the normal static checks performed by the compiler. This means that
once you have written code based on reflection, the compiler will never be able to
help you should the name of the type / field used in reflection change. It's
very common that names of types / fields / data set columns change in
subsequent development phases of a web application due to re-factoring, code
re-organization, etc., and you will never be able to notice at compile time the
impact the change has. Consider the data binding code below, used in a Repeater
control:
<ItemTemplate>
<tr>
<td> <%# Eval("Name") %> </td>
<td> <%# Eval("Description") %> </td>
</tr>
</ItemTemplate>
Should someone change the name of either Name or Description properties used
above for binding, you will not notice the run-time issue until you exercise
the particular Repeater in the particular page where this appears. This shows
how a very minor code change which initially may seem to have no impact on the
application may in fact require a full application re-test only because that
code was called using reflection. The same code can be implemented to use
static (early) binding that the compiler can validate (bonus: Intellisense will
help you, unlike in the reflection scenario):
<ItemTemplate>
<tr>
<td><%#((MyBusinessObject)Container.DataItem).Name %> </td>
<td><%#((MyBusinessObject)Container.DataItem).Description %> </td>
</tr>
</ItemTemplate>
The bottom line is that reflection or late binding escape compiler static type
checks and you may end up with run-time exceptions in parts of the application
you never expected to be affected – so you end up with a poor quality
application and a maintenance problem. Therefore, whenever possible I prefer
static (early / compile-time) binding so that the compiler can immediately
detect the mismatches. It continues to puzzle me why the MSDN documentation
that accompanies VS 2005 provides data binding examples using the
DataBinder.Eval method (i.e. reflection-based) instead of statically bound code
(did the example writers dislike the static casts?).
9. Maintainability
(and performance): avoid the data binding gimmicks offered by the visual
designer wizards (which generate code using the SqlDataSource, ObjectDataSource
or similar web server controls in the .aspx code). Using those controls results
in the .aspx code containing either connection strings and SQL statements (if
binding to SqlDataSource type control) or a type name and method name (if
binding to an ObjectDataSource control). This is first a maintenance issue (do
you want to start looking for SQL statements scattered in the .aspx script?
do you want to start looking for type / method names in the .aspx script every
time you change a type / method name?) and secondly a performance issue:
ObjectDataSource uses reflection to invoke the method that provides the data
set used in binding. The only reason I can find for Microsoft providing those
mechanisms is quick prototyping – at the cost of countless developers falling
in this maintenance (and performance) trap. Of course, there is also the Wow!
effect at conferences and developer days ("see how quickly I put together
this web site? Click here, click there, and we're done"). I always prefer
explicit binding in the code-behind class (in addition to the static casting
instead of Eval, as shown above), therefore achieving both more maintainable
code and better performance.
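As an illustration of explicit binding, the sketch below assigns the data source in the code-behind class instead of declaring a data source control in the markup. The Product and ProductList types and the ProductRepeater field are hypothetical; in a real project the field is generated from the .aspx markup and the list would come from a business object.

```csharp
using System;
using System.Collections.Generic;
using System.Web.UI;
using System.Web.UI.WebControls;

// Hypothetical business object used for binding.
public class Product
{
    private readonly string m_name;
    public Product(string name) { m_name = name; }
    public string Name { get { return m_name; } }
}

// Hypothetical code-behind class.
public class ProductList : Page
{
    // Declared here only to keep the sketch self-contained; normally
    // generated by the designer from the <asp:Repeater> in the markup.
    protected Repeater ProductRepeater;

    protected void Page_Load(object sender, EventArgs e)
    {
        if (!IsPostBack)
        {
            // The data source is visible to the compiler and to
            // "find all references", unlike a type name in the markup.
            List<Product> products = new List<Product>();
            products.Add(new Product("Sample"));
            ProductRepeater.DataSource = products;
            ProductRepeater.DataBind();
        }
    }
}
```

The item template in the markup then uses the static cast shown earlier, for example ((Product)Container.DataItem).Name.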
Performance and Scalability
I chose to discuss those non-functional but critical requirements
in a common section because in the majority of cases those are competing
requirements: solutions that improve the performance decrease the scalability,
and vice versa. For example, storing a lot of frequently needed per-user data
in the session improves performance (you don't need to constantly retrieve the
data from a data store) but kills scalability (you use more memory for each
user, limiting the application's ability to scale to a large number of
sessions).
The scalability / performance balance can be a difficult one to strike – and
usually it is not clear that you’re doing the wrong thing until you’ve done it.
This section is by no means exhaustive; I just cover the
basics and the common pitfalls. For a truly comprehensive discussion on performance
and scalability for ASP.Net, refer to the online book called "Improving .NET
Application Performance and Scalability" from Microsoft (highly recommended). I chose
to list some basic issues and common pitfalls here so that I can start a new
project by going through the list and remind myself (and the rest of the team)
that there are also non-functional requirements (such as performance and
scalability) that have to be considered up-front and in parallel with the
functional ones.
When you consider the scalability / performance balance
always keep in mind the costs associated with it. Not all applications need to
perform at light speed and / or scale to thousands of users. Carefully consider
the target user base and the expectations, then set some reasonable goals and
benchmarks. For example, if the application is an intranet that will be used by
at most 50 users in accounting then focus more on performance and less on
scalability; if, on the other hand, you are working on the next eBay,
scalability should be the first and foremost goal and performance will have to
take a back seat.
1. Be careful with
the cache: caching (programmatic, using the System.Web.Caching.Cache or the
ASP.Net declarative page output caching) is a clear performance booster but can
become a scalability killer. The use of cache can be obvious in some cases
(e.g. a limited set of very small data items that otherwise may require a much
more expensive database or web service lookup) but not so clear-cut in many
other cases. Be especially aware of the ASP.Net page output caching when used
with a VaryBy attribute: it may work well initially, but since it’s "out of
sight – out of mind", later on someone can make changes to the parameters by
which the cache content is varied, which in fact can cause ASP.Net to start caching
a version of the page for every user or request, defeating the whole purpose
and affecting scalability. You won't notice this in development, but you'll
notice it soon enough in load testing (you do that, right?).
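For reference, declarative output caching is controlled by a single directive at the top of the page, which is exactly why it is easy to forget. The fragment below is a sketch; the parameter name "category" is hypothetical.

```aspx
<%-- Sketch: cache the rendered page for 60 seconds, keeping one copy per
     distinct value of the "category" request parameter (hypothetical name). --%>
<%@ OutputCache Duration="60" VaryByParam="category" %>
```

With a parameter that takes only a handful of values this stays cheap. If the page later starts receiving a parameter that is different for every user or request, and the directive is changed to vary by it (or uses VaryByParam="*"), the server ends up holding one cached copy per distinct value.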
2. Database: when
it comes to data retrieval, filtering, aggregation, sorting, etc, push the
heavy lifting to the database server as much as possible through use of stored
procedures, views and other database mechanisms available. This is because
database servers are designed and tuned to scale well, most likely better than
what you can do on the application side. (In addition, the use of stored
procedures is beneficial for other reasons, such as security and re-use – I
deal with those other reasons in this article). However, on both application side and database
side (stored procedures), always explicitly specify the columns to be retrieved
in a SQL SELECT statement instead of going for the quick SELECT *. The danger
of using SELECT * is that should someone add more columns to the schema later
on, your application will start retrieving data that it actually does not care
about, killing both performance and scalability (consider for instance what
would happen when using SELECT * on a table to which later someone adds a
column storing an image – all of a sudden you'd start getting substantially larger
volumes of data which you don't use anyway) – and you won't even be immediately
aware of the problem ("adding a column is a minor change, nothing should
break, right?").
Obviously, the design of the database schema and database code (stored
procedures, triggers, views) affects scalability in a substantial way – so if
you’re in charge of the project, ensure that the database design and code is created
and properly reviewed by people with relevant expertise.
3. Consider using
asynchronous page processing on pages that perform lengthy external processing
(e.g. lengthy database queries, calls to web services or external systems:
credit card validations, etc). Asynchronous page processing basically means
that the page processing starts on one of the ASP.Net request processing
threads, but then does its business logic work on a worker thread that is not
part of the request processing thread pool; when done, the worker thread
signals its completion to ASP.Net, which then completes the page processing on
a thread in its request processing pool (probably another one than the one on
which the page processing started). This is essentially a thread-wise
de-coupling of the presentation work (ASP.Net request processing thread pool)
and business logic (worker threads).
This technique ensures that the ASP.Net request processing thread pool is not
depleted of threads that have to wait on lengthy queries, and therefore can
keep servicing new requests, substantially improving scalability.
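A minimal sketch of an asynchronous page follows, using Page.AddOnPreRenderCompleteAsync with a begin/end pair. It assumes the page declares Async="true" in its @ Page directive; the page name QuotePage and the URL are hypothetical. In this variant the wait is handled by the asynchronous I/O of WebRequest rather than by an explicitly created worker thread, but the effect on the request processing thread pool is the same: no request thread is blocked while the external call is in flight.

```csharp
using System;
using System.IO;
using System.Net;
using System.Web;
using System.Web.UI;

// Requires Async="true" in the @ Page directive of the .aspx file.
public class QuotePage : Page
{
    private WebRequest m_request;
    private string m_result;

    protected void Page_Load(object sender, EventArgs e)
    {
        // Register the begin/end pair; ASP.Net calls the begin method after
        // PreRender and frees the request thread until the operation completes.
        AddOnPreRenderCompleteAsync(
            new BeginEventHandler(BeginFetch),
            new EndEventHandler(EndFetch));
    }

    private IAsyncResult BeginFetch(object sender, EventArgs e,
                                    AsyncCallback cb, object state)
    {
        // Hypothetical external service.
        m_request = WebRequest.Create("http://example.com/quote");
        return m_request.BeginGetResponse(cb, state);
    }

    private void EndFetch(IAsyncResult ar)
    {
        // Runs on a request thread again, possibly a different one.
        using (WebResponse response = m_request.EndGetResponse(ar))
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            m_result = reader.ReadToEnd(); // rendered later in the page life cycle
        }
    }
}
```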
4. Carefully
consider the use of ViewState. ViewState is a great feature of ASP.Net which
allows pages and controls to keep their state (displayed data, etc) between
postbacks. It also allows you to keep your custom data between
postbacks: just use the Add / Remove methods (or the indexer) on the ViewState
property of the page to add / remove name-value pairs. Note that the value can
be any object that is marked as Serializable (so that the object can be
serialized / de-serialized to / from the hidden field of the page which stores
the view state of that page during the round-trip). Also note that objects can
be placed in the ViewState only until the PreRenderComplete event fires for
that page / control; anything placed in the ViewState after that time will not
make it in the hidden field of the output page.
This allows you to use the ViewState instead of hand-crafted hidden HTML input
fields, however:
a. Don't abuse it;
storing large volumes of data in the view state is a sure performance killer
(read below).
b. If you store
sensitive data in the ViewState, make sure that it's encrypted using the
viewStateEncryptionMode attribute at either the page / control level or entire
application level (in the pages element of the web.config). By default in
ASP.Net 2.0 it is not encrypted unless a page / control explicitly sets its
attribute to "Always". When using encryption, the ViewState killer
effect on the performance is amplified (the serialized form of an encrypted
ViewState does not grow, but there is additional encryption / decryption
processing required at each page trip).
c. Be careful
that you don't inadvertently keep adding to the ViewState of a page without
ever removing between postbacks – which will slow down the processing of the
page with each subsequent postback and also affect the entire application (a
subtle kind of "memory leak" done by round-tripping more and more
data).
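The points above can be sketched as a typed property over the ViewState bag, paired with an explicit removal so the entry does not keep travelling with every postback. The page name SearchPage and the key "LastQuery" are hypothetical.

```csharp
using System;
using System.Web.UI;

// Hypothetical page; "LastQuery" is an illustrative key name.
public class SearchPage : Page
{
    private const string LastQueryKey = "LastQuery";

    // Typed accessor: callers never touch the un-typed bag directly.
    protected string LastQuery
    {
        get { return (string)ViewState[LastQueryKey] ?? string.Empty; }
        set { ViewState[LastQueryKey] = value; }
    }

    // Remove the entry once it is no longer needed, so it stops
    // being round-tripped on every postback.
    protected void ClearLastQuery()
    {
        ViewState.Remove(LastQueryKey);
    }
}
```

Keeping the stored value small (a string or an id rather than a whole object graph) limits the serialization cost on each round trip.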
Because the data in the ViewState
has to be serialized / de-serialized (and possibly encrypted / decrypted), the
ViewState can easily become a performance killer. Therefore, make sure that you
turn off the ViewState on those controls that don't need it; this becomes
tricky with certain controls (e.g. 3rd party controls whose use of the ViewState
you don't know much about, or .Net controls such
as GridView and DataView which lose the paging functionality once the
ViewState is turned off). In the particular case of the GridView and DataView,
the ViewState is particularly damaging since their ViewState contains the entire
data set they are bound to, and not only the data that is being rendered;
therefore if you have 1000 records bound to the control and the control is set
to show 10 records per page, the entire set of 1000 records is stored in the
ViewState!
Despite the caveats above,
ViewState can be used as a scalability enabler; carefully balance its use vs.
the use of the session store. Placing data in the ViewState as opposed to the
session reduces the memory use on the application server at the cost of
bandwidth and CPU required to serialize / de-serialize the ViewState (and
possibly encrypt / decrypt it). So you gain in scalability but lose in
performance. But be aware: if you choose to round-trip data in the
ViewState instead of keeping it in the session in order to improve scalability,
remember that ViewState is persisted on a per-page instance rather than
per-session; therefore if a user opens multiple browser windows on a page
within the same session, you can potentially end up with different data in each
ViewState – you have to be ready to deal with this scenario and be able to
reconcile somehow the differences. But if you place immutable data in the
ViewState, then you are safe from this perspective and you achieve a sort of a
"distributed per-user caching" mechanism without using memory on the
server, but for which you pay in bandwidth and CPU cycles.
5. Carefully
consider the use of the session. The logical user session can be maintained in
memory or in a SQL Server database (through the mode attribute of the sessionState
element in web.config). Regardless of the session storage, the larger the
session, the less scalable the application, as each user session takes memory
off the server. This is less the case if the session storage is SQL Server,
because the session is read in memory from the database at the beginning of a
request, and saved back at the end of the request – therefore a session uses
server memory only during the processing of a request. Therefore storing
sessions in SQL Server improves scalability, at the cost of performance,
because of the loading and saving of the session from / to SQL Server on each
request by default (even if the specific page being requested does not even
use session state). (This is where I have to criticize the ASP.Net implementation:
I don't see why the session has to be automatically loaded from SQL Server at
the beginning of the request processing, rather than a lazy load approach,
where it's loaded on first access – if it was implemented like this, what I
recommend below would not be required). In order to improve performance I
therefore recommend that you customize this default behavior for each page: for
pages that don't use session state, set the EnableSessionState attribute
of the @ Page directive in the .aspx file to false; for pages that require only read
access to the session, set it to ReadOnly. This way, in the first
case you eliminate the SQL read / write of the session on each request of that
page; in the second you eliminate the SQL write.
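The two settings look like this in the page directive. The fragment shows the first line of two separate .aspx files, not one file; the Language attribute is included only to make each directive look complete.

```aspx
<%-- First page: never touches the session, so it is neither loaded nor saved. --%>
<%@ Page Language="C#" EnableSessionState="false" %>

<%-- Second page: only reads the session, so it is loaded but not written back. --%>
<%@ Page Language="C#" EnableSessionState="ReadOnly" %>
```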
Also, regardless of the storage chosen for the session, don’t use the session
as a cache (i.e. store data there for subsequent requests, in order to avoid an
expensive database access for example); while this does improve the
performance, it kills the scalability. The correct way to improve the
performance without affecting the scalability is the use of the cache, mainly
when the data in question is common to many / all users. When using caching a)
you can set parameters that make the cached data item expire (such as a time
interval, etc) and b) the runtime can re-claim the memory in case of excessive
load, allowing the application to scale.
6. Minimize the use
of reflection / late-binding. Besides the maintenance problems reflection causes
(which I've described above), reflection is also much more expensive than
statically bound code. Using the same example given above, when I discussed this topic
from the perspective of code maintenance:
<ItemTemplate>
<tr>
<td> <%# Eval("Name") %> </td>
<td> <%# Eval("Description") %> </td>
</tr>
</ItemTemplate>
If the data source happens to be a DataTable which has 100 rows, the template
code above (used in a repeater control) will cause 200 calls through
reflection. However, the same code done using a static cast would be
substantially faster:
<ItemTemplate>
<tr>
<td> <%#((DataRowView)Container.DataItem)["Name"]%> </td>
<td> <%#((DataRowView)Container.DataItem)["Description"]%> </td>
</tr>
</ItemTemplate>
The performance would be even better when using the specialized methods to
retrieve the data fields. For example, if you used a DataReader to retrieve the
data and bind it to the control, each DataItem will be a DbDataRecord, so
casting to it and using its specialized methods would look like this:
<ItemTemplate>
<tr>
<td> <%#((DbDataRecord)Container.DataItem).GetString (0) %> </td>
<td> <%#((DbDataRecord)Container.DataItem).GetString (1) %> </td>
</tr>
</ItemTemplate>
However, generally I avoid binding directly to ADO.Net objects; as mentioned
before, I prefer using business objects that expose the necessary collections
for binding to the various controls in the pages. The binding code in the pages
would cast the DataItem to the applicable type of business object the
collection is made of, then invoke the specialized methods of the object to
retrieve the necessary fields. This approach meets the following goals:
a. Is
statically-bound (the compiler is my friend and will check the code)
b. Is substantially
faster than the reflection-based equivalent
c. The
business object(s) can be unit and performance tested outside the web
application
7. Minimize
use of locking and synchronization; when locking is needed, make sure that the
lock applies to the minimal set of objects required to be synchronized
(fine-grained locking) and that the lock duration is minimized (acquire late,
release early). Unneeded locking, or locking that lasts longer than
actually required, will affect the scalability of the application. A typical
example is placing / updating data in the cache (a scenario in which using a
type-safe cache implementation, as described in this article, becomes a must).
There is no need to lock the
entire cache object only because some common atomic operation must be
performed; locking the whole cache would block all other threads (requests)
accessing the locked cache for operations that are unrelated – instead use an
object specifically used for locking the specific data set that needs to be locked.
For example, an implementation like this:
using System.Web.Caching;

public class CacheAdapter
{
    // The cache object we adapt.
    private readonly Cache m_cache;

    private CacheAdapter(Cache c)
    {
        m_cache = c;
    }

    /// <summary>
    /// Constructs a cache adapter and returns it
    /// </summary>
    public static CacheAdapter GetCacheAdapter(Cache c)
    {
        return new CacheAdapter(c);
    }

    public void UpdateDataItem1()
    {
        // Locks the whole cache object.
        lock (m_cache)
        {
            //READ FROM CACHE, UPDATE, WRITE BACK ITEM 1
        }
    }

    public void UpdateDataItem2()
    {
        // Same lock as above, although the data is unrelated.
        lock (m_cache)
        {
            //READ FROM CACHE, UPDATE, WRITE BACK ITEM 2
        }
    }
}
would cause threads to wait on updating data item 1 because another thread has
already obtained the lock on updating data item 2. If the two update operations
are unrelated, this is completely un-necessary and will reduce scalability.
Prefer instead to create two additional objects that are used exclusively for
locking on each specific atomic operation:
using System.Web.Caching;

public class CacheAdapter
{
    // The cache object we adapt.
    private readonly Cache m_cache;

    // Locking objects, one per unrelated atomic operation.
    private static readonly object m_locking_for_data_item1 = new object();
    private static readonly object m_locking_for_data_item2 = new object();

    private CacheAdapter(Cache c)
    {
        m_cache = c;
    }

    /// <summary>
    /// Constructs a cache adapter and returns it
    /// </summary>
    public static CacheAdapter GetCacheAdapter(Cache c)
    {
        return new CacheAdapter(c);
    }

    public void UpdateDataItem1()
    {
        lock (m_locking_for_data_item1)
        {
            //READ FROM CACHE, UPDATE, WRITE BACK ITEM 1
        }
    }

    public void UpdateDataItem2()
    {
        lock (m_locking_for_data_item2)
        {
            //READ FROM CACHE, UPDATE, WRITE BACK ITEM 2
        }
    }
}
This allows threads to independently access the two unrelated operations without
locking each other (at the small price of two additional very small objects).
For basic atomic increments and decrements use the System.Threading.Interlocked
class, which simplifies your implementation.
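For the simple counter case, a sketch using Interlocked could look like this. The RequestCounter class is hypothetical; the point is only that no lock statement is needed for an atomic increment or decrement.

```csharp
using System.Threading;

// Hypothetical application-wide counter.
public static class RequestCounter
{
    private static int s_count;

    // Atomic increment without a lock statement; returns the new value.
    public static int Increment()
    {
        return Interlocked.Increment(ref s_count);
    }

    // Atomic decrement; returns the new value.
    public static int Decrement()
    {
        return Interlocked.Decrement(ref s_count);
    }

    public static int Current
    {
        // CompareExchange with identical value and comparand changes nothing
        // and returns the current value with the same atomicity guarantees.
        get { return Interlocked.CompareExchange(ref s_count, 0, 0); }
    }
}
```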
8. Be careful about storing data in the application state (i.e.
Application["key"] = value) as this may prevent scalability if you are to move
the application to a web farm. Application state storage is server-specific,
therefore after setting the application state data a subsequent request may not
find it there because the request is now serviced on a different server.