Designing Solutions with COM+ Technologies


Overview

A thorough working knowledge of the Microsoft® Component Object Model (COM) and COM+ technologies is a prerequisite for effective design and implementation in today's networked Microsoft Windows® operating environment. But many development teams, during the design and implementation phases of their projects, encounter generic COM+ issues that have already been worked through by other developers. Designing Solutions with COM+ Technologies documents these issues and presents ways of solving them that have proven successful in past projects. This book provides practical solutions to practical problems on more than one level. Some chapters qualify as design pattern catalogues, others offer implementation best practices, and others analyze performance trade-offs between competing solutions. All offer the kind of focused COM+ technical advice and information that can help developers avoid design and implementation pitfalls by learning from those who have already been successful in the COM+ development environment.

Product Details

ISBN-13: 9780735611276
Publisher: Microsoft Press
Publication date: 01/02/2001
Series: Developer Reference Series
Edition description: BK&CD-ROM
Pages: 928
Product dimensions: 7.32(w) x 9.20(h) x 1.62(d)

Read an Excerpt

Chapter 4.
Concurrency

  • Elements of Interception
    • Concurrency vs. Reentrancy
    • Interception Implementation
    • The Apartment
    • Managing STA Concurrency
    • The Context
    • The Message Filter
    • Interception Services
  • Context Neutrality
    • Implementation
    • Internal Object References
    • But Is It Fast?
    • FTM vs. TNA
    • It's the Object's Choice
  • Concurrency Design Guidelines
    • The Best Concurrency Is No Concurrency
    • Exceptions: The Case of Client Notification
    • Standard Synchronization Settings
  • Concurrency in Local Servers
    • Apartments in Local Servers
    • Local Server Pitfalls
    • Partial Location Transparency
    • Implications
  • Locking
    • Coarse-Grained Locks
    • Fine-Grained Locks

If someone asked you what the most important feature of COM+ component technology is, what would you say? In my opinion, two features that you don't see are the most significant: location transparency and synchronization transparency. The true power of a model becomes apparent when things work without source code having to make sure that they do, when a system can be reconfigured without requiring code changes and recompilation. Location transparency is an example of that: a client can access an object without knowing where that object is located. The object might be located in the same process as the client, but it might instead be located in a different process or on a different host. Synchronization transparency offers a similar advantage: a client can call a COM+ object without knowledge of that object's synchronization needs. Therefore, a multithreaded client can call a single-threaded server without protecting the server from concurrency. And you can upgrade a server object to allow concurrent paths of execution, without its clients being aware of it. This kind of decoupling promises to deliver the holy grail of software engineering: true reusability.

Do you remember when single-process concurrency emerged in the eighties? Although the potential performance benefits of this feature—named threads—were huge, using it was such a mess! Many of us were using C and C++ in those days, and the question of library thread safety became central for anyone doing multithreaded programming in those languages. If a library was not thread safe, you had to protect access to it from all locations in your source that were using it. If you forgot to protect access to your library in even one instance, you usually guaranteed yourself crashes that were hard to track and occurred only once a week, usually in your customers' hands. Such problems could take weeks to debug, and sometimes it was easier to just rewrite the code—and hope that you would not make the same mistake twice. In an attempt to alleviate the problem, library vendors invented a standard scale to describe the level of thread safety of their products; however, the effort ultimately proved to be little help. And what about the C library itself? Was it safe to call strtok in more than one thread? The advent of multithreaded programming did as much to enhance performance as it did to degrade maintainability and robustness, and to erode project schedules. Programmers needed object middleware to fix these problems, and COM+ fit the bill.

In its infancy, the COM library itself was not thread safe. In fact, you were allowed to call CoInitialize and therefore use COM from only a single thread in your process. This is understandable, since the technology had to first establish itself as a binary compatibility standard before it could provide a more comprehensive object framework.1 But COM+ has come a long way since then. First, COM introduced the apartment model to manage the relationship between objects and threads. Later, Microsoft Transaction Server (MTS) added the concept of activities, which essentially are groupings of COM objects in a call chain that does not allow concurrency in the participating objects. Finally, COM+ added configurable synchronization domains, which decouple an object's thread affinity from its synchronization needs and give the developer the power to determine which objects can be accessed concurrently and under which circumstances. While our current options give us tremendous power and flexibility, they also amount to a bewildering array of choices. As a result, concurrency management has become not only what I consider the area of greatest strength for COM+, but its most poorly understood topic.

Still, programmers need to fully understand concurrency management techniques and synchronization options. While only a small percentage of source code in a typical software project must be designed and written to perform well, those critical areas really do have to execute fast; otherwise, your software will fail in the eyes of your users. To achieve this performance, you must understand the technology you have at your disposal. Without this critical understanding, you risk jeopardizing not only your object implementation, but also the fundamental object design of your project, where fixing mistakes can be costly. The good news is that well-managed concurrency will give your software the speed that it needs to succeed, and that COM+ presents middleware technology eminently suited to manage synchronization in your project. The bad news is that you must get a firm grasp on this rather complex technology before you can apply it, and before you can produce a design poised to take advantage of your options. Let's get to it, and explore the technology of COM+ concurrency from the inside out.

Elements of Interception

COM+ performs all its transparent magic through a process known as interception. Calls from clients to objects are intercepted by an entity that implements the same interfaces as the object but adds services before passing the call to the object. It is easy to see how such a technique might regulate concurrency through locking: the interceptor could acquire a critical section before passing on the call. In fact, you might have used such a technique years ago while synchronizing access to a library that was not thread safe. We will examine interception implementation in more detail in a moment. For now, suffice it to say that COM+ always implements concurrency management through some form of interception. Because interception is such a central concept in COM+, we will step outside the bounds of examining concurrency management in this section to take a broader look at how interception is used in COM+ middleware.
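
To make the idea concrete, here is a minimal sketch of lock-based interception, assuming a hypothetical (non-COM) ICalc interface. Unlike the real COM+ interceptor, this wrapper is written against one known interface rather than being generic:

#include <windows.h>

// A hypothetical, non-thread-safe interface to be wrapped.
struct ICalc
{
    virtual long Add(long a, long b) = 0;
    virtual ~ICalc() {}
};

// The interceptor implements the same interface as the object it wraps,
// adding a service (here, serialization) around the delegated call.
class LockingInterceptor : public ICalc
{
public:
    explicit LockingInterceptor(ICalc* pInner) : m_pInner(pInner)
    {
        InitializeCriticalSection(&m_cs);
    }
    ~LockingInterceptor() { DeleteCriticalSection(&m_cs); }

    long Add(long a, long b)
    {
        EnterCriticalSection(&m_cs);       // pre-call work: acquire the lock
        long nSum = m_pInner->Add(a, b);   // delegate to the wrapped object
        LeaveCriticalSection(&m_cs);       // post-call work: release the lock
        return nSum;
    }

private:
    ICalc*           m_pInner;   // the wrapped, non-thread-safe object
    CRITICAL_SECTION m_cs;
};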

Concurrency vs. Reentrancy

Do not confuse an object's tolerance for being accessed by multiple threads—perhaps simultaneously—with its ability to handle calls back to the object by a logical thread that was used to make a call from the object. If object A can handle a call by object B, which object A is currently in the process of calling (either directly or indirectly), object A is said to be reentrant.

It is not the place of object middleware to regulate reentrancy. Only your object design can ensure that your object will not be called back while waiting for an outgoing call to return, in the event it cannot handle such a call. Your design might need to provide this assurance because blocking a reentering call would guarantee deadlock. In fact, COM+ puts some effort into ensuring that callbacks are always serviced and therefore never result in deadlock. This implementation is more challenging than you might think, since a callback can occur on threads other than the one waiting for the return of the method invocation. We will look at this issue more closely when we examine locking (in its own section) later in this chapter.

Interception Implementation

The concept of interception is quite simple: an arbitrary object is wrapped so that some amount of work can be done before and after calling a method in an interface the object supports. The most familiar example of an interceptor is COM's venerable proxy. A proxy (or more precisely, proxy manager) acts like the object it represents, but its job is to marshal all call parameters for transmission through the channel to the stub, which will unmarshal the parameters, reassemble the stack frame, and then make the call to the actual object. Yet the traditional proxy (as opposed to the newer, stubless proxy) is not a perfect example of an interceptor, since it is not generic. The MIDL compiler generates functions that mirror each method of the interface the proxy wraps.

The generality of a true interceptor presents a challenge during implementation. A generic interceptor does not know the shape of the interface it will wrap once it is compiled. This lack of information at compile time creates a very interesting set of difficulties.

The Language Problem

Procedural high-level languages, including C and C++, generally are not capable of calling a function with an unknown parameter list. By using the ellipsis and va_ set of functions, you can implement a function that does not know what parameters it will be called with at compile time. However, you cannot tell the compiler to make a call to a function and simply pass the parameters that were passed to the function making the call.

This problem can be overcome only by using a piece of assembly language to make the call from the interceptor to the wrapped object. Essentially this assembly code must make the call to the target function in the wrapped object while leaving the stack frame unchanged. However, the C compiler will already have altered the stack frame with a standard function prologue segment, which lets you access local variables. Microsoft Visual C++ offers the __declspec(naked) storage class attribute, which will prevent function prologue and epilogue generation. Obviously, implementing naked functions is difficult and, along with the necessary assembly segment, requires a thorough understanding of the processor architecture for which you compile your code.
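
The following fragment sketches the shape of such a thunk for 32-bit x86 under the Microsoft compiler. It is illustrative only: g_pfnTarget is a hypothetical global holding the address of the wrapped method (a real interceptor would resolve the target per vtable slot), and any real pre-call work would have to preserve every register and leave the stack frame untouched.

void* g_pfnTarget;   // hypothetical: address of the wrapped object's method

__declspec(naked) void InterceptorThunk()
{
    __asm
    {
        // No prologue was generated, so ESP still points at the caller's
        // return address, followed by the untouched parameter list.
        jmp dword ptr [g_pfnTarget]   // tail-jump; the target returns
                                      // directly to the original caller
    }
}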

The Failure Problem

COM+ interface methods always use the __stdcall calling convention. This convention has the callee, not the caller, clean up the stack before returning from the function. This is no problem if you actually can make the call to the object for which you are intercepting, but what if your interception task fails? What if you would rather not make the call under a certain set of circumstances, or what if the object you want to dispatch the call to is unavailable? Now you are responsible for removing parameters from a stack frame whose shape you don't even know.

Of course, there are ways to get around this problem. For example, you might require the caller to tell you the combined size of all parameters. However, this approach is somewhat awkward and makes your interception hardly transparent. Or you might try to derive the combined parameter size by querying the ITypeInfo interface of the target object. Of course, the object might not support this interface, in which case you could attempt to create a stub for the interface you want to wrap, and interpret its CInterfaceStubVtbl structure, defined in RpcProxy.h. And your interceptor must create a stub and interpret the structure before your wrapper function is called, since determining stack frame size inside the wrapper cannot tolerate failure. By now you've probably guessed that doing this will require significant effort.

The Post-Processing Problem

Your interception task might require work before making the call to the wrapped object, as well as afterward. This means that after the wrapped function is complete, it must return to your wrapper rather than that wrapper's caller. Therefore, you need to change the return address on the stack so that it points within the wrapper function. But how do you remember the address of the caller to which you must return after you finish post processing? After all, there is no place on the stack to store this information.

You can save the final return address in thread local storage (TLS). But allocating a new TLS slot can be expensive if interceptor calls nest on the same thread, and you might find yourself running out of slots. Therefore, you should manage a stack of return addresses via a single slot, instead of allocating a new slot for each function invocation.
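
A sketch of that single-slot scheme follows; the helper names are hypothetical, and error handling is omitted:

#include <windows.h>

static DWORD g_dwTlsSlot = TlsAlloc();   // one slot for the whole process

struct RETURN_NODE
{
    void*        pvReturnAddress;
    RETURN_NODE* pNext;
};

void PushReturnAddress(void* pvReturnAddress)
{
    RETURN_NODE* pNode = new RETURN_NODE;
    pNode->pvReturnAddress = pvReturnAddress;
    pNode->pNext = static_cast<RETURN_NODE*>(TlsGetValue(g_dwTlsSlot));
    TlsSetValue(g_dwTlsSlot, pNode);     // push onto this thread's stack
}

void* PopReturnAddress()
{
    RETURN_NODE* pNode = static_cast<RETURN_NODE*>(TlsGetValue(g_dwTlsSlot));
    void* pvReturnAddress = pNode->pvReturnAddress;
    TlsSetValue(g_dwTlsSlot, pNode->pNext);   // pop the nested entry
    delete pNode;
    return pvReturnAddress;
}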

Make no mistake: implementing generic interception is very challenging and is nonportable. Even if you never need to implement an interceptor in your own software,2 understanding the issues of the task gives you a better grasp of what is happening inside the COM+ middleware, if not sympathy for the developers who created it.

The Apartment

The COM+ apartment model lets objects make a statement regarding their thread affinity. An in-process server makes this statement declaratively by setting the ThreadingModel named value under the InprocServer32 key under the class ID key in the registry, generally at registration time. Before the MTS COM era, the apartment defined an object's innermost execution context—that is, the COM run-time environment would never inject itself between objects that resided in the same apartment. COM+ allows each object to choose from one of the following apartment types:

  • The single-threaded apartment (STA). An object created in this apartment is entered only by the unique thread that comprises the apartment. A ThreadingModel value of Apartment indicates that an object requires instantiation within an STA. A user thread can create such an apartment by calling CoInitialize or CoInitializeEx with COINIT_APARTMENTTHREADED. Calls into the apartment are received by the channel via window messages; therefore, each user thread that creates this apartment type must service a message loop until no objects remain in the apartment (see the sketch following this list). Otherwise, calls to objects in the apartment cannot be serviced and will block. Since STA objects can be entered only by their creating thread, no concurrency can exist within them. Microsoft Windows NT and Windows 2000 will place a new STA object in the system STA—unless the caller resides in an STA itself, in which case the new object will be co-located in the caller's apartment. The system STA is an apartment owned by a thread created by the COM/COM+ library. The library arranges for this thread to service a message loop for the lifetime of the process. At most, one system STA will be created per process. The system STA is also the only STA that pre-MTS COM will create in a process on its own.
  • The main single-threaded apartment. By omitting the ThreadingModel named value or setting it to Single, an in-process server's object indicates that the object requires instantiation within the unique main STA of a process. This main STA is formed by the first user thread that creates an STA. If no STA exists within a process yet, the system STA will become the main STA.

    Legacy in-process servers sometimes use this setting because their objects were written under the assumption that they could share global in-process server data without requiring locking. An ActiveX DLL created by Microsoft Visual Basic also supports this setting.3 The setting is not recommended for new projects because it can lead to contention among all the COM objects that are forced to share the same thread.

  • The multithreaded apartment (MTA). Objects with ThreadingModel set to Free are created in this apartment. There is only one such apartment per process, and user threads can join it by calling CoInitializeEx with COINIT_MULTITHREADED. Such threads need not service a message loop and can terminate at any time. Objects in the apartment receive calls on arbitrary threads created by the Remote Procedure Call (RPC) run-time library. This apartment type does not imply synchronization, and objects running under COM prior to MTS as well as unconfigured COM+ objects must prepare for concurrent entry by callers. Visual Basic 6 objects cannot use this setting.
  • The thread-neutral apartment (TNA). This apartment type is new in COM+. Its objects are entered directly by the caller's thread, whether it is an STA thread or belongs to the MTA. Threads cannot belong to this apartment; they merely enter it for the duration of a call sequence. Like the MTA, this apartment type does not imply synchronization. Unconfigured COM+ objects must prepare for concurrency. Visual Basic 6 does not support this setting.
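
As promised, here is a minimal sketch of a user thread that creates its own single-threaded apartment and services the message loop the channel depends on. The apartment stops receiving calls the moment the loop stops running:

#include <windows.h>
#include <objbase.h>

DWORD WINAPI StaThreadProc(LPVOID /* pvParam */)
{
    // Enter (create) a single-threaded apartment on this thread.
    if (FAILED(CoInitializeEx(NULL, COINIT_APARTMENTTHREADED)))
        return 0;

    // ... create apartment-threaded objects here ...

    // Service the message loop: inbound calls into this apartment arrive
    // as window messages and are dispatched to the channel from here.
    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0) > 0)
    {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }

    CoUninitialize();
    return 0;
}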

An object can also declare ThreadingModel equal to Both, in which case it will be created in the apartment of its caller. The value Both is used for historical reasons: it originated at a time when COM supported only two apartment types. An unconfigured component using this setting might experience concurrency, as its creator might be an MTA thread or a TNA object. The primary motivation for using this setting is to eliminate an apartment boundary between an object and its instantiator.

Table 4-1 illustrates which apartment COM and COM+ will choose for instantiation of a new unconfigured object, given that object's ThreadingModel and the instantiating thread's apartment membership. (Of course, the TNA row and columns are relevant to COM+ only.)

Table 4-1 Instantiation Apartment Selection

Instantiator     Single     Apartment      Free   Neutral   Both
Main STA         Main STA   Main STA       MTA    TNA       Main STA
Secondary STA    Main STA   Caller's STA   MTA    TNA       Caller's STA
MTA              Main STA   System STA     MTA    TNA       MTA
On loan to TNA   Main STA   System STA     MTA    TNA       TNA

Whenever a thread invokes a method on an object across an apartment boundary, the invocation is intercepted by the object's proxy, routed via the channel, and then delegated to the object by the stub on the other side of the channel. The COM+ middleware performs three important functions when intercepting method calls across apartments:

  • Since apartment switches that do not involve the TNA imply thread switches, the proxy and stub are responsible for packaging the stack frame and reassembling it on the object's thread.
  • Notifying the target apartment about incoming COM+ traffic can involve sending window messages or some other interprocess communication (IPC) mechanism. This notification is the channel's job. Crossing an apartment boundary that necessitates a thread switch imposes significant overhead. A ThreadingModel value of Both will eliminate this overhead between instantiator and object, and the TNA will eliminate the overhead in all cases for subsequent callers from other apartments and for the instantiator.
  • Object references from the originating apartment are converted to proxies in the target apartment. This prevents a thread in the target apartment from crossing into the object's apartment without interception. The new proxy is always directly connected to its object's apartment and does not detour through the caller's apartment unless the object resides in the caller's apartment.

Making a call to an object in the TNA within the same process never requires switching to a different thread. Only the last item in the previous list needs to be performed by an interceptor guarding access to the in-process TNA. Such an interceptor is sometimes called a lightweight proxy. Compared to the overhead of a thread switch, a lightweight proxy is very fast. But the lightweight proxy still needs to perform object reference conversion, as shown in the graphic that follows. For this reason, a TNA interceptor needs access to the proxy/stub DLL of the interface it is encapsulating, or in the case of a type library-marshaled interface, to its type library. Such access is also necessary for the interceptor to handle the failure problem inherent in general interception.

One of the consequences of a user thread's inability to join the TNA is that the thread always incurs the overhead of at least a lightweight proxy when creating or calling an object within the thread-neutral apartment. This is not necessarily true for creating or accessing an object in one of the other apartment types. If a user thread performing a watchdog activity or other periodic task needs frequent access to COM+ objects, placing these objects in the TNA could impede performance. However, such situations are relatively rare and in most architectures are confined to "system" type objects rather than objects containing business logic.

Graphic (Image Unavailable)

Managing STA Concurrency

Since only a unique thread can enter an object created in a single-threaded apartment, an STA object naturally avoids concurrency. However, method invocations are serialized not only for an individual object in the apartment, but also for all objects created in the apartment, since they all share the same thread. Therefore, it can be important to actively control which STA will be chosen for an object instantiation; otherwise, you might end up with large groups of objects blocking one another from executing, even though such concurrency in separate objects would be perfectly all right. Often the only thing preventing two objects from executing concurrently is their need to access shared global data. Such data access frequently is better controlled by using explicit locking strategies (described later in this chapter), rather than using the somewhat heavy-handed approach COM+ has for invocation serialization. Given that a group of objects needs to reside in an STA, the question becomes how many objects should share one apartment for optimal concurrency. The pressures to be balanced include each individual object's responsiveness as well as the number of threads the system can handle before the thread scheduler's overhead becomes too significant.

Under normal circumstances, the instantiator of an object implicitly selects the object's apartment. But it is typical to see a client take control of in-process server concurrency by first creating a single-threaded apartment and then issuing the instantiation call from this new apartment. This approach can be effective in situations where the client process ultimately is aware of how server objects are being used, and how their concurrency can best be exploited.
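
For illustration, the following sketch shows that pattern: a worker thread creates the new STA, instantiates the server object there, and marshals the resulting reference back to its creator. CLSID_Worker is a hypothetical class ID, and error handling is abbreviated.

#include <windows.h>
#include <objbase.h>

extern const CLSID CLSID_Worker;   // hypothetical in-process server class

struct APARTMENT_REQUEST
{
    IStream* piMarshaled;   // carries the reference across apartments
    HANDLE   hReady;        // signaled once the object has been created
};

DWORD WINAPI DedicatedStaProc(LPVOID pvParam)
{
    APARTMENT_REQUEST* pRequest = static_cast<APARTMENT_REQUEST*>(pvParam);
    CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);   // the new STA

    IUnknown* piUnknown = NULL;
    if (SUCCEEDED(CoCreateInstance(CLSID_Worker, NULL, CLSCTX_INPROC_SERVER,
                  IID_IUnknown, reinterpret_cast<void**>(&piUnknown))))
    {
        // Marshal the reference so the creating thread receives a proxy.
        CoMarshalInterThreadInterfaceInStream(IID_IUnknown, piUnknown,
                                              &pRequest->piMarshaled);
        piUnknown->Release();
    }
    SetEvent(pRequest->hReady);

    // Keep the apartment alive and responsive to inbound calls.
    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0) > 0)
        DispatchMessage(&msg);

    CoUninitialize();
    return 0;
}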


MTA-Bound Call Interception
Since the introduction of the multithreaded apartment on Windows NT, COM developers frequently have asked why a thread executing within an STA object cannot call directly into the multithreaded apartment. With the addition of the thread-neutral apartment under COM+, the question becomes why a thread that originally created a single-threaded apartment cannot enter the multithreaded apartment, whether executing in its own apartment at the time of the call or on loan to the thread-neutral apartment when the call is dispatched. The question generally acknowledges that you would still need lightweight interception to prevent MTA threads from crossing over into the calling apartment when accessing object references passed as interface method arguments, but such interception should be feasible without incurring the expensive thread switch. After all, MTA objects are written to be entered by arbitrary threads and therefore it doesn't matter whether a calling thread actually belonged to a single-threaded apartment.

The justification for switching threads involves an STA thread's need to service a message loop somewhat frequently, and the expectations of the MTA object developer. This justification is not so much connected to the mechanism the channel uses when making an outbound call from the MTA: normally this mechanism involves blocking the calling thread until the method invocation returns, but the channel has the power to discover a thread's native apartment membership and can enter a message loop waiting for call return if the calling thread is an STA thread, even when making a call from an MTA object. The real problem is that the MTA programming model allows the object developer to unconditionally block threads and to do so for arbitrary periods of time—usually for synchronization purposes or when waiting to access a resource. Therefore, within MTA object implementations, you frequently will find calls to EnterCriticalSection, WaitForSingleObject, and WaitForMultipleObjects that have long time-outs. The STA thread must protect itself from running into code like this; it does so by waiting in a message loop at the apartment boundary and having an MTA thread block in the synchronization APIs instead.


In other situations, the server is best equipped for arranging object-to-apartment distribution. This includes cases where server objects do not run in the client's process space, making the client unable to affect target apartment selection. At first it might appear that the server is unable to influence this apartment selection, since COM+ makes all the choices. This impression might have been furthered by my choice of words: to simplify the discussion, I always speak of the apartment in which the object will be created. In fact, the object's class factory—not the object itself—will be instantiated in the apartment by the COM+ library. Therefore, it is up to the class factory to create the actual object. Normally the class factory creates the object inside its own apartment, but it does not have to. The indirection of the class factory in the object creation process gives servers the ability to take control over an STA object's target apartment.

Visual Basic allows developers to multiplex objects, created externally or internally by means other than New, across a thread pool of a fixed size on a per-project basis. Alternatively, developers can specify that each object created in this manner should be located in a new apartment. These options are available to local servers only. Of course, C++ developers implementing their factories manually can do whatever they like to control target apartment selection or creation. But when using the Active Template Library (ATL), you can achieve multiplexing across a pool by using the macro DECLARE_CLASSFACTORY_AUTO_THREAD. By default, the size of the thread pool used by this mechanism will be four times the number of processors of the system on which your code executes. This dynamic way of determining pool size makes more sense than the Visual Basic approach: a pool that is too large degrades performance just as surely as a pool that is too small starves object concurrency.
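
In ATL terms, the usage might look like the following sketch, assuming a hypothetical coclass CWorker and interface IWorker, and a server module derived from CComAutoThreadModule (which owns the STA pool):

class ATL_NO_VTABLE CWorker :
    public CComObjectRootEx<CComSingleThreadModel>,
    public CComCoClass<CWorker, &CLSID_Worker>,
    public IWorker
{
public:
    // Swap in a factory that round-robins instantiations across the
    // module's pool of STA threads instead of using a single apartment.
    DECLARE_CLASSFACTORY_AUTO_THREAD()

    BEGIN_COM_MAP(CWorker)
        COM_INTERFACE_ENTRY(IWorker)
    END_COM_MAP()

    // IWorker methods ...
};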

The Context

With MTS, COM started taking on additional services that needed to be handled at the level of the interceptor, including transaction support and role-based security. Such services have expanded under COM+ while becoming more configurable. We will take a survey of these interception services in a moment.

Before the days of MTS, COM interception was married to the concept of the apartment, and apartments are about threads. But uniting the new services with the thread infrastructure did not make sense, so a tighter execution scope for COM objects was needed. This new innermost execution scope is called the context. An object now resides within a context, and an apartment bounds a context. Extended COM+ run-time services are performed by interceptors, which must exist between any two objects that do not reside in the same context. As long as the interception occurs between contexts in the same apartment or on a TNA-bound call, these interceptors generally are as efficient as any lightweight proxy that does not require a thread switch. But recall that the interceptor still needs access to your proxy/stub DLL or type library, for the same reason a lightweight proxy requires this access. Two objects with similar configurations and the same interception needs may share the same context, but certain services require that an object be created in an entirely new context. Threading model setting permitting, an unconfigured object, which by its very nature neither asks for nor is aware of extended services, is always co-located in its instantiator's context.

Of particular interest is a COM+ service specifically dedicated to managing concurrency in an object. I have to admit that I was initially quite confused by this service. Having worked with the apartment model for years, the concepts of thread affinity and synchronization had become indistinguishable in my mind. But upon later reflection, I realized that while a relationship between the concepts exists, the concepts for the most part are independent. There is no reason, after all, to not serialize access to either a multithreaded or a thread-neutral object. In Essential COM (Addison-Wesley, 1998), Don Box speculated about a new apartment type he called the rental-threaded apartment (RTA). This apartment type would have behaved just like the thread-neutral apartment of COM+, but with synchronization built in. This is just the type of idea that was bound to emerge from a mindset that identifies apartments with concurrency management. Yet the decoupling of the COM thread management construct (the apartment) from the synchronization construct (contextual concurrency management) feels conceptually pure and gives us more flexibility: we now can build an object that any thread can enter (as the RTA would have allowed), but we can choose whether to synchronize method invocations (which the RTA would not have allowed).

Figure 4-1 shows all possible synchronization settings for an object. The values are identical to those available for configuring transaction support. However, instead of being enlisted in or beginning a new transaction, an object with the value Supported, Required, or Requires New participates in or begins what is known in COM+ as a synchronization domain. Under MTS, synchronization domains were known as activities and were configurable only through the method a client chose to instantiate another object. Unfortunately, it therefore was up to the client to determine whether a new object could join its activity. Under COM+, this has become transparent to the instantiator and is controlled solely by the setting in the property sheet shown in Figure 4-1.

Figure 4-1. Concurrency tab of the property sheet of a configured thread-neutral object. (Image Unavailable)

Like a transaction, a synchronization domain can include objects in different contexts, apartments, processes, and hosts.4 Also, a synchronization domain is formed through the creation of an object with the setting Required (made by a caller currently outside any synchronization domain) or Requires New (made by any caller); the synchronization domain then flows to any object with the setting Supported or Required at instantiation time. In a synchronization domain, only one physical thread can execute at a time, and each thread must execute as the result of either a direct or indirect synchronous method invocation from the thread that first entered the domain. Figure 4-2 shows a synchronization domain that spans contexts and hosts with several physical threads.

Figure 4-2. Thread and synchronization domain interaction. (Image Unavailable)


Threading Model and Synchronization Interaction
I have championed the fact that synchronization support and thread affinity are independent concepts and now are treated as such by COM+. And it is rather easy to see how synchronization can be applied (or not applied) to objects in either multithreaded or thread-neutral apartments. But understanding how synchronization is applied with the single-threaded apartment is a bit more challenging.5 After all, being single threaded already implies a certain natural synchronization across the entire apartment.

The fact is that objects in the single-threaded apartment, which do participate in a synchronization domain, act quite differently from objects that do not participate in a synchronization domain. These differences include the following:

  • An object in a synchronization domain will flow domain membership to any object it creates that supports or requires synchronization. As a result, a group of MTA or TNA objects that support synchronization will not experience concurrency when created by an STA object in a synchronization domain. But if the STA object did not participate in a synchronization domain, this group of objects will experience concurrency.
  • Synchronized STA objects cannot be entered by calls coming from a causality other than the one of the call chain currently executing within the synchronization domain. Unsynchronized STA objects can.
  • STA objects that do not require synchronization will be created in the same apartment as the single-threaded instantiator object. But if the instantiator is not inside a synchronization domain and the STA object requires synchronization, the object actually will be created in a different single-threaded apartment. The reverse does not hold, however. If the caller is inside a synchronization domain, the object will be created in the same apartment—even if it does not support synchronization.

As you can see, for the most part COM+ synchronization layers its benefits on top of single-threaded synchronization. Understanding this relationship certainly will be useful in your own projects.


An object with the synchronization setting Disabled is unaware of synchronization services and behaves like an unconfigured component. Notice that this setting is not the same as Not Supported: the latter ensures that an MTA or TNA object can receive calls concurrently, while the former can result in the object being located in the caller's synchronized context. Disabled also is not the same as the Supported setting, which will force the object into a context that participates in the same synchronization domain as it would were the caller a member of a synchronization domain. But if the object requires a different context than that of the caller (for example, because the object has a different threading model, or because of other COM+ service configurations), the contexts must communicate to prevent concurrent execution. COM+ might achieve this communication by having the contexts share some type of lock. But when the setting is Disabled, the target context will not participate in such a locking scheme. This is why the Disabled setting truly has a unique meaning.

The Required and Requires New settings mean that the object must run in a synchronization domain. Requires New ensures the object always will be the root of a new synchronization domain. Required creates a new domain only if the instantiator does not already participate in one.

Synchronization Implementation by Deadlock

The theory behind context-based synchronization is fantastic, and the sheer number of options now available to developers should tremendously simplify situations that previously required you to build your own plumbing to achieve just the right concurrency behavior. However, when I examine the current COM+ implementation of the synchronization services for single-threaded objects, my enthusiasm for the technology wanes.

A call into an executing synchronization domain from a caller not participating in that synchronization domain can cause deadlock. A deadlock occurs if the call is made through an object in the synchronization domain whose threading model is Apartment and if that object's thread is currently servicing a message loop (for example, because it is waiting for an Object RPC call return). The message loop does not need to be within the bounds of the object being accessed concurrently for the deadlock to occur. These are the deadlocked entities:

  • The caller making the call from outside the synchronization domain.
  • The entire apartment of the thread receiving the inbound call of a new causality via its message loop, as well as any upstream callers waiting for the return of method invocations that this thread might be executing on behalf of. Such callers will be deadlocked whether they were part of the synchronization domain or whether the caller representing the initial causality was taking ownership of that synchronization domain.
  • The entire synchronization domain of the apartment-threaded object, as well as any threads waiting to enter the synchronization domain.

All these callers now are stalled because the message loop thread attempts to gain access to a lock on behalf of the new inbound caller. This lock never will become available because releasing it would require returning this very thread from a method invocation that is now further up the stack, as shown in Figure 4-3. Hence, the thread never can return from the DispatchMessage call in the message loop, and the call chain becomes stalled from that level upward.

Figure 4-3. Example of a deadlock. (Image Unavailable)

This issue is mitigated by the fact that concurrency often is regulated by a layer of technology in front of the COM+ object layers in modern COM+ architectures. The sharing of object references is discouraged in such highly scalable environments. (For more on this topic, see Chapter 13.) Nevertheless, the issue raises the question of why synchronization support is even an option for STA objects when enforcing it is guaranteed to result in deadlock. It is important to understand the situation precisely, since by its very nature it likely will cause sporadic bugs under just the right timing conditions, and often will be extremely hard to debug. Until Microsoft solves this problem, the only safe thing to do is religiously avoid sharing object references to synchronized STA objects. If you must share object references, be absolutely certain that synchronous calls cannot occur in your architecture. One warning: making concessions at that level of your architecture is likely to introduce brittleness.

The Message Filter

Restricting access for a single-threaded object to one causality at a time is common. Such objects often contain an internal state associated with the operation in progress, and receiving a call unrelated to this current operation can cause failure in these objects. As we have seen, using COM+ synchronization services unfortunately is not yet a solution for this kind of problem. But an ancient mechanism is designed to deal with this type of concurrency: the message filter, which stems from the 16-bit world of OLE and is associated with concurrency in user interface applications. Such applications often share single-threaded objects representing graphical entities among clients. A message filter is intended to prevent access to such single-threaded objects when they perform some internal manipulation (perhaps as the result of an inbound COM+ call), and would become confused by new requests if their internal call sequence had not yet completed.

There is both a server and a client aspect of the message filter mechanism. Server and client applications can install their own message filter by calling CoRegisterMessageFilter, passing an object that implements the IMessageFilter interface:

IMessageFilter : public IUnknown
{
public:
    virtual DWORD STDMETHODCALLTYPE HandleInComingCall(
        /* [in] */ DWORD dwCallType,
        /* [in] */ HTASK htaskCaller,
        /* [in] */ DWORD dwTickCount,
        /* [in] */ LPINTERFACEINFO lpInterfaceInfo) = 0;

    virtual DWORD STDMETHODCALLTYPE RetryRejectedCall(
        /* [in] */ HTASK htaskCallee,
        /* [in] */ DWORD dwTickCount,
        /* [in] */ DWORD dwRejectType) = 0;

    virtual DWORD STDMETHODCALLTYPE MessagePending(
        /* [in] */ HTASK htaskCallee,
        /* [in] */ DWORD dwTickCount,
        /* [in] */ DWORD dwPendingType) = 0;
};

On the server side, COM+ will call HandleInComingCall before dispatching an inbound call to an object. On the client side, COM+ will call RetryRejectedCall when the server does not dispatch the call but flat out rejects it or advises retrying it later. COM+ calls MessagePending when window messages are received by the client while it waits for a COM+ call to return. The dwCallType parameter of HandleInComingCall lets the server know whether an incoming call is from a new causality while the object's thread waits for an outgoing call to return (CALLTYPE_TOPLEVEL_CALLPENDING). Such calls therefore are deferred easily (return SERVERCALL_RETRYLATER).
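
A minimal server-side sketch follows. It defers calls arriving from a new causality while the object's thread is waiting on an outgoing call; the static-lifetime reference counting and the registration call at the end are typical of how such a filter is installed:

#include <objbase.h>

class CServerFilter : public IMessageFilter
{
public:
    // IMessageFilter
    DWORD STDMETHODCALLTYPE HandleInComingCall(DWORD dwCallType,
        HTASK, DWORD, LPINTERFACEINFO)
    {
        // A new causality is trying to enter while we wait for an
        // outgoing call to return: ask the caller to retry later.
        if (dwCallType == CALLTYPE_TOPLEVEL_CALLPENDING)
            return SERVERCALL_RETRYLATER;
        return SERVERCALL_ISHANDLED;
    }
    DWORD STDMETHODCALLTYPE RetryRejectedCall(HTASK, DWORD, DWORD)
    { return (DWORD) -1; }                   // client side; unused here
    DWORD STDMETHODCALLTYPE MessagePending(HTASK, DWORD, DWORD)
    { return PENDINGMSG_WAITDEFPROCESS; }

    // IUnknown (the filter has static lifetime, so counting is a no-op)
    STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
    {
        if (riid == IID_IUnknown || riid == IID_IMessageFilter)
        {
            *ppv = this;
            return S_OK;
        }
        *ppv = NULL;
        return E_NOINTERFACE;
    }
    STDMETHODIMP_(ULONG) AddRef()  { return 1; }
    STDMETHODIMP_(ULONG) Release() { return 1; }
};

// Installed on the STA thread that owns the objects, typically at startup:
// CServerFilter   g_filter;
// IMessageFilter* piPrevious = NULL;
// CoRegisterMessageFilter(&g_filter, &piPrevious);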

This all sounds marvelous, but there are some limitations. First, the message filter shows its user interface orientation by allowing only local servers, not DLL components, to register a message filter. This means no configured object can use this technique, which excludes today's most popular and flexible kind of COM+ component. Second, unless all clients can be guaranteed to be in process, rejecting a call with the retry flag can put an unacceptable burden on the caller. Unless the client's message filter specifies that such a rejected call should indeed be retried, an error message immediately will be returned to the caller. This caller might not be able to set its own message filter—for example, it might have been loaded from a DLL in another process or host, or it might have been implemented in a development system that does not permit setting a message filter. It is not reasonable to expect that all clients either have a message filter that retries or perform the retry at the level of every single COM+ method invocation. A Visual Basic Standard EXE will install a message filter that retries for some time and then uses OleUIBusy to bring up the infamous Server Busy dialog box, giving the user a chance to manipulate the user interface of the server application and thus resolve the impasse. But the default COM+ message filter does not do this; instead, it propagates all retry rejections directly to the caller. Therefore, even COM+ objects implemented in Visual Basic are subject to this error when operating in a host process implemented with, say, Visual C++.

The bottom line is that message filters were designed in another era, for a problem more specific than regulating general STA object concurrency. If message filters happen to be applicable to your situation, great! By all means use them—they won't go away any time soon. But chances are that forcing message filters into a modern architecture will be like trying to fit square pegs into round holes.

Interception Services

Synchronization is only one service that COM+ performs at the level of the interceptor. COM+ configures lightweight proxies between contexts to perform the specific adjustments to the environment necessary for the crossover. For example, suppose context A has the same synchronization settings as context B but does not support transactions, while context B requires transactions. A lightweight proxy in context A representing an object in context B would create a new transaction but would not attempt to acquire the shared synchronization domain lock. This is part of the reason an object reference can be used only in the context in which it was created.

Now let's take a look at some of the other COM+ services performed by interceptors:

  • Figure 4-4 shows the transaction support configuration of COM+. All settings except Disabled are familiar from MTS. Disabled simulates the behavior of an unconfigured object, with respect to transactions. In addition, you now can set the transaction timeout on a per-object basis (rather than a per-machine basis). The following grid shows which combinations of instantiator context and transaction support settings will force a new object into a new context. Note that even if the transaction aspect does not force a new context, one still might be required as the result of other settings.
    Caller Context              Disabled   Not Supported   Supported   Required   Requires New
    Has transaction                        x               x           x          x
    Does not have transaction                              x           x          x
  • Object security configuration is also familiar from MTS. If security is enabled for the application, only users who are members of the roles checked in Figure 4-5 or those roles granted access at the interface and method levels will be able to call the object. An object with security checks enabled always will be created in its own context.

    Figure 4-4. Transactions tab of the property sheet of a configured object. (Image Unavailable)

    Figure 4-5. Security tab of the property sheet of a configured object. (Image Unavailable)

  • Just-in-time activation was present but unconfigurable under MTS. It is now controlled by the similarly named check box shown in Figure 4-6. When this option is set, the object will be created in its own context because you need a unique interceptor to create the object instance when its first method is invoked.

    Figure 4-6. Activation tab of the property sheet of a configured thread-neutral object. (Image Unavailable)

  • Selecting the pooling check box shown in Figure 4-6 enables object pooling. Object pooling can be described as the opposite of just-in-time activation: object instances are returned to a pool instead of being destroyed when an object is deactivated. Interfaces related to pooling were defined under MTS (an object could request to be pooled by implementing IObjectControl in a particular fashion), but the pooling mechanism itself had not yet been implemented. Be sure to review the documentation carefully before checking this box: the system makes very specific demands of the implementation of a pooled object. You also can violate the isolation property of transactions by carrying state in a pooled object. Pooling likely will be effective only in somewhat specialized circumstances.

An object's threading model can have an effect on what services will be available to it. For example, single-threaded objects cannot be pooled. Interdependency also exists among certain services. For instance, supporting or requiring transactions (including new ones) forces an object to use just-in-time activation. The specific settings of Supported and Required also force a setting of Required for synchronization support. The transaction setting Requires New implies the setting Required or Requires New for synchronization support. Apart from that, enabling just-in-time activation also demands that the object require an existing or new synchronization domain, regardless of the transaction setting.

Context Neutrality

Often the work of the COM+ application architect involves balancing the cost of system services against performance needs in critical areas of the project. And sometimes you find that the reasons you designed an object as a COM+ object involve the most basic properties of COM+: binary compatibility and language independence. You wanted your object to be callable by anyone, anywhere—and fast. You didn't care about apartments and contexts. So you created an unconfigured component with the threading model Both. Yet as references to your object got passed around in the process, callers sometimes found themselves encountering substantial overhead when calling your performance-critical code. You found that these callers were always in contexts or apartments other than the one in which your object was created.

The implementation you seek here is a context-neutral one—that is, you want an object whose methods can be called from anywhere in the process, without the aid of any type of interceptor. Implementing a context-neutral object goes somewhat against the grain of the COM+ context and apartment model; in fact, it might be impossible to achieve from a high-level development environment. You can achieve this implementation using C++, although doing so is not a trivial task. Let's see what the issues are and whether any alternatives exist.

Implementation

A context-neutral6 object has the following characteristics:

  • It has no context or apartment of its own. The object can force its creation in a certain context and apartment via its threading model and COM+ configuration settings, but this is unusual since the object cannot take advantage of these services when called. Context-neutral objects therefore are normally unconfigured and have a threading model setting of Both, allowing for creation in the instantiator's context.
  • It makes no assumptions about the apartment or context membership of a calling thread. The object shares this characteristic with objects whose threading model is Both, even if its own threading model is not Both. However, for context-neutral objects, apartment membership of the calling thread can change with every interface method call.
  • It is prepared to be accessed concurrently, even if its threading model is Apartment, or if it participates in a synchronization domain.
  • The only kind of interface pointer it will ever access directly is the one that does not exceed the scope of an individual method invocation.
  • It implements the IMarshal interface, passing a raw interface pointer across contexts and apartments as long as the marshaling context is in process.

Providing your own implementation of IMarshal neatly solves the problem of interceptor creation. After all, an interceptor is created only as the result of marshaling an interface pointer by the standard marshaling architecture of COM+. And all marshaling, whether invoked implicitly through transporting your interface pointer in a method call bound for another context or explicitly by invoking a marshaling API, eventually passes through CoMarshalInterface, which gives your object a chance to provide custom marshaling by implementing IMarshal. When you use this approach, in effect your object joins the context of its caller for the duration of the call. If you access interface pointers passed to your object, this access is made on the physical thread of the caller and through interceptors configured for your caller's context, if any exist. If you create an object during a call, this creation occurs precisely as though performed by the actual caller.

This maneuver became so popular that Microsoft made available a canned implementation of it in the COM library, in the form of an object called the free-threaded marshaler (FTM). (This happened before the release of COM+ in Windows 2000, back in the days of Windows NT 4.) Instead of having to implement IMarshal yourself, you now can call CoCreateFreeThreadedMarshaler and aggregate the object returned from this call. It's that simple. The ATL object wizard even includes a Free Threaded Marshaler check box that adds those few lines of code into your new context-neutral object.
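
In ATL, the generated pattern looks like the following sketch (the class and interface names are illustrative): FinalConstruct creates the FTM aggregate, and the COM map hands its IMarshal out to the marshaling machinery.

class ATL_NO_VTABLE CNeutral :
    public CComObjectRootEx<CComMultiThreadModel>,
    public CComCoClass<CNeutral, &CLSID_Neutral>,
    public INeutral
{
public:
    DECLARE_GET_CONTROLLING_UNKNOWN()

    BEGIN_COM_MAP(CNeutral)
        COM_INTERFACE_ENTRY(INeutral)
        // Delegate IMarshal to the aggregated free-threaded marshaler.
        COM_INTERFACE_ENTRY_AGGREGATE(IID_IMarshal, m_ciUnkMarshaler.p)
    END_COM_MAP()

    HRESULT FinalConstruct()
    {
        return CoCreateFreeThreadedMarshaler(
            GetControllingUnknown(), &m_ciUnkMarshaler.p);
    }
    void FinalRelease()
    {
        m_ciUnkMarshaler.Release();
    }

    CComPtr<IUnknown> m_ciUnkMarshaler;
};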

Internal Object References

The FTM check box I just mentioned is the single biggest source of confusion and frustration among users new to COM and the Visual C++ development system. The user interface makes it appear like a common choice, somewhat akin to providing error support. (The check box appears next to the option for which it is listed in the object creation dialog box.) But context-neutral objects are rare, and at the very least, the FTM check box should have been hidden behind an Advanced button or some such user interface device. Instead, developers check the box innocently, later forgetting that they did. But the effects of aggregating the free-threaded marshaler are dramatic and—unbeknownst to the budding, point-and-shoot Visual C++ developer—turn the programming model of COM+ objects on its head.

The biggest change for COM+ object implementation involves storing interface pointers. You expect to be able to create an object and store it in a data member, or use that data member to store an interface pointer passed as an [in] or [in, out] parameter after increasing its reference count, and then access this stored interface pointer at any time before finally releasing it. But since every method invocation can occur under a different context for a context-neutral object, no interface pointer that was created, passed, or otherwise obtained during one call can be accessed directly in subsequent calls. Instead, the context-neutral object must ensure that such interface pointers are properly marshaled into each caller's context before accessing them, even within the object's own implementation.

Since the standard COM+ marshaling APIs (CoMarshalXXX, CoUnmarshalInterface, CoGetInterfaceAndReleaseStream, and so on) require the participation of threads in both the exporting and importing contexts, context-neutral objects usually use the global interface table (GIT) to store interface pointers for access across method invocations. This mechanism works by table-marshaling the pointer into a global location, and then returning a registration cookie to the caller. Since the pointer was table marshaled, the cookie can be used later to import the pointer into the retrieving context as often as desired. Context-neutral objects therefore never store interface pointers; instead, they store registration cookies. (This is what I meant when I said earlier that there would be dramatic changes to the programming model.) Specifically, you could accomplish this by using the following code:

STDMETHODIMP CNeutral::Foo(/*[in]*/ IBar* piBar)
{
    if (! piBar)
        return E_POINTER;

    CComPtr<IGlobalInterfaceTable> ciGIT;
    HRESULT hResult;
    if (FAILED(hResult = ciGIT.CoCreateInstance(
               CLSID_StdGlobalInterfaceTable))
    ||  FAILED(hResult = ciGIT->RegisterInterfaceInGlobal(
               piBar, IID_IBar, &m_nBar)))
        return hResult;

    m_bBar = true;
    return S_OK;
}

STDMETHODIMP CNeutral::DoBar()
{
    if (! m_bBar)
        return E_UNEXPECTED;

    CComPtr<IGlobalInterfaceTable> ciGIT;
    CComPtr<IBar> ciBar;
    HRESULT hResult;
    if (FAILED(hResult = ciGIT.CoCreateInstance(
               CLSID_StdGlobalInterfaceTable))
    ||  FAILED(hResult = ciGIT->GetInterfaceFromGlobal(
               m_nBar, IID_IBar,
               reinterpret_cast<LPVOID*>(&ciBar))))
        return hResult;

    ciBar->…

    return S_OK;
}

void CNeutral::FinalRelease()
{
    if (m_bBar)
    {
        CComPtr<IGlobalInterfaceTable> ciGIT;
        if (FAILED(ciGIT.CoCreateInstance(
                   CLSID_StdGlobalInterfaceTable))
        ||  FAILED(ciGIT->RevokeInterfaceFromGlobal(m_nBar)))
            _ASSERT(FALSE);
    }
}

But Is It Fast?

I once had clients who told me that they always checked the FTM check box for all new COM objects they developed. When I asked them why they did this, they explained that they had read the free-threaded marshaler was supposed to make COM objects faster. The option therefore was much like the Turbo button on older PCs: you could go slow if you wanted to, but most people preferred to go fast, especially when there were no penalties.

Jokes aside, it is true that the performance gains derived from using the FTM can be significant—especially when you short-circuit cross-apartment interceptors that would otherwise cause thread switches, as opposed to lightweight proxies. It is typical for a direct call to execute up to 30 times faster than a thread-switched one. The duration of execution of your performance-critical, context-neutral code obviously plays a crucial role here, and the benefit of context neutrality will lessen as the ratio of actual code execution time to middleware overhead increases. In other words, don't expect much gain from making a large prime-number generator context neutral.

There are situations, however, in which aggregating the FTM actually will make an object slower. To understand how this can happen, consider the fact that a context-neutral object always accesses other objects from the context of its caller. Suppose that these other objects reside in the MTA, while most callers execute on STA threads. In this case, we encounter no overhead during the initial call from an STA into our context-neutral object, but in the context-neutral code we face one expensive thread switch for every MTA object that we access, whether the objects' interface pointers are arguments to our methods or we stored interface pointers to the MTA objects in the GIT. If we had not made our object context neutral and had instead assigned it the threading model Free, we would encounter only one thread switch total per call. Determining the actual gain of context neutrality therefore can be a complicated matter and requires careful analysis of an object's actions. For this reason, evaluating the performance impact of context neutrality is similar to assessing the impact of an object's threading model.

FTM vs. TNA

When considering context neutrality, the designer's goal is usually the elimination of the expensive kind of proxy—the thread switching kind. Before COM+, the FTM was the only way to accomplish this. However, boosting performance by eliminating thread switches is precisely why the thread-neutral apartment was invented. And where applicable, the thread-neutral apartment is a far more elegant solution than the free-threaded marshaler: it's like having COM+ solve your performance problem for you, instead of fighting the apartment model.

I would estimate that the TNA is now a better solution for at least 80 percent of the cases that used the FTM before COM+. The chief advantage the thread-neutral apartment has over the free-threaded marshaler is that it does not alter the familiar COM+ programming model with respect to stored object references. No more cookies, no more GIT, no more worrying about keeping alive all apartments that ever imported an interface pointer or in whose scope an object was created and saved7—just plain vanilla COM+ object implementation, easy and clean. And if you can manage to implement entire layers of your project in the thread-neutral apartment, the overhead of further method invocation once the calling thread has entered the first TNA object is likely very low to nonexistent. (The remaining method invocation overhead is only that of any extended, interceptor-based COM+ services you might be using.) Of course, the factors that can cause a context-neutral object to be slow also can affect an object in the thread-neutral apartment. If, for example, such an object needs to access multiple MTA objects per method invocation but is mostly called on STA threads, the thread-neutral apartment might be a poor choice from a performance standpoint. The multithreaded apartment remains the superior option in this scenario.

Of course, the FTM still eliminates the overhead of even lightweight proxies. For objects that stand to gain from this further optimization, the trade-off will be between actual performance gain on one hand and ease of implementation as well as maintainability on the other. And interception-based, extended COM+ services (such as transactions and synchronization) will remain unavailable to the FTM user. While the TNA gradually will emerge as an FTM replacement, expect context neutrality to remain a useful tool for a relatively small set of specialized system objects.
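For reference, aggregating the FTM in ATL takes only a few lines. The following is a minimal sketch of the canonical pattern (essentially what the ATL Object Wizard's FTM option generates), with placeholder names CTurbo, ITurbo, and CLSID_Turbo standing in for your own:

class ATL_NO_VTABLE CTurbo :
    public CComObjectRootEx<CComMultiThreadModel>,
    public CComCoClass<CTurbo, &CLSID_Turbo>,    // placeholder CLSID
    public ITurbo                                // placeholder interface
{
public:
    DECLARE_PROTECT_FINAL_CONSTRUCT()
    DECLARE_GET_CONTROLLING_UNKNOWN()

    BEGIN_COM_MAP(CTurbo)
        COM_INTERFACE_ENTRY(ITurbo)
        // Hand all IMarshal queries to the aggregated FTM
        COM_INTERFACE_ENTRY_AGGREGATE(IID_IMarshal, m_pUnkMarshaler.p)
    END_COM_MAP()

    HRESULT FinalConstruct()
    {
        // Aggregate the free-threaded marshaler
        return CoCreateFreeThreadedMarshaler(
            GetControllingUnknown(), &m_pUnkMarshaler.p);
    }
    void FinalRelease()
    {
        m_pUnkMarshaler.Release();
    }

    CComPtr<IUnknown> m_pUnkMarshaler;
};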

It's the Object's Choice

Context neutrality should be considered a detail of an object's implementation, completely transparent to its clients. Clients should marshal interface pointers between contexts and apartments in the standard ways rather than rely on an object's context neutrality and simply pass raw interface pointers around. Passing raw interface pointers is especially tempting in the implementation of the context-neutral object itself, which must use the GIT to transport interface pointers across method invocations. But coupling to this detail of object implementation violates one of the fundamental tenets of COM+ as well as object-oriented programming in general: the separation between interface and implementation.

Think of it this way: following the rules and using marshaling mechanisms instead of passing raw object references across contexts insulates you against a change in the object implementation—the details of how the object implements IMarshal. This change would require no new interfaces, no new class IDs, nothing that is externally visible. And the client's invocation of the marshaling mechanism costs nothing as long as the object does aggregate the FTM; the object itself then short-circuits the marshaling.

You can make exceptions to this rule, but not many. Sometimes an object needs to be context neutral to perform its service. Perhaps the best-known example of this is the IStream pointer returned from CoMarshalInterThreadInterfaceInStream. This pointer is certified to be context neutral. If the pointer were not context neutral, there would be no reason to call the API, since its purpose is to marshal a given interface pointer to another context in the same process. Note, however, that the claim of context neutrality holds only for the particular IStream implementation returned from this API, and not IStream in general.
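To illustrate, here is the canonical use of this API pair, reusing the IBar interface from the earlier examples; the surrounding thread management is elided:

// Exporting context: produce a context-neutral IStream pointer
IStream* piStream;
HRESULT hResult = CoMarshalInterThreadInterfaceInStream(
    IID_IBar, piBar, &piStream);
// ... hand piStream to code running in another context ...

// Importing context: unmarshal the pointer; the stream is released
IBar* piBarLocal;
hResult = CoGetInterfaceAndReleaseStream(
    piStream, IID_IBar, reinterpret_cast <LPVOID*> (&piBarLocal));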

Concurrency Design Guidelines

Concurrency is absolutely crucial to the scalability of any software system. Adding processors to a machine or adding hosts to a distributed architecture won't help if your system already stalls at 40 percent processor utilization, with threads blocking for access to shared resources. Concurrency is vital and concurrency is hard—so much so that it is best to let someone else handle it. See Chapter 13 for a history of concurrency management by operating system services and for details on how to keep your own architecture free from concurrency management.

The Best Concurrency Is No Concurrency

The need for managing concurrency in the layers of your own code begins when object references are shared, when multiple clients contend for access to the same server objects and are stalled either by COM+ synchronization services or by code in the server object itself that now needs to manage concurrent access to its internal data structures. Try to avoid such stalls by not sharing access to server object instances in the first place. Try to keep concurrency management out of all layers of your project.


Breaking the Rules: The Case of ASP
Purposeful ignorance of an object's threading model and marshaling implementation is an important design rule. Threading models and the paths of object interaction are important considerations throughout a project's lifecycle if a well-performing product is to result, yet nothing in the actual implementation should make assumptions about these internals of other objects. Violating this rule limits flexibility, makes maintenance difficult, and leads to brittle products. But at least one popular technology violates this rule on a number of occasions: Active Server Pages (ASP).

ASP places restrictions on objects that are stored in a session or application state. If an object whose threading model is Apartment is stored in a session state, ASP locks down the session to the single thread used to create that object. The intention behind this locking is clear: since access to the object will require switching to the creating thread, forcing that thread to be the one making those calls likely will improve system performance by eliminating expensive thread switches and stalled threads. On the other hand, users whose sessions are now bound to this thread must wait for the thread to become available to service their requests.

The rules for objects in application state are even more draconian: an object may be stored in application state only if its threading model is Both and if it aggregates the free-threaded marshaler (unless the AspTrackThreadingModel configuration property has been set to 1). This means that ASP actually queries your object for IMarshal, and then determines whether its implementation is provided by the free-threaded marshaler (perhaps by calling GetUnmarshalClass) before allowing it to be stored in application state. That's getting rather cozy with your implementation details, isn't it? But again we recognize the reasoning: proxies and thread switches likely would affect every Web application on the server—a situation best avoided.

ASP is ultimately justified in penetrating your object's encapsulation to guarantee the performance of all Web applications on the server. Even so, it does not blindly rely on your objects' context neutrality; it merely determines whether they are context neutral and chooses a course of action based on the result. Remember, ASP is a large and comprehensive framework for building Web applications, hardly reminiscent of a typical application project. Chances are that your own code will be best served by treating the marshaling details of server objects as a black box.


Exceptions: The Case of Client Notification

In typical software projects, you can apply the single-client-per-server-object design rule to about 95 to 100 percent of all objects. You cannot always apply it to 100 percent because certain types of problems cannot be handled efficiently without centralizing some transient data in memory. Client notification is a good example of such a problem. Databases offer pessimistic and optimistic locking models that inform clients, respectively, of concurrency in progress or of a conflict at the time of the data update. But databases generally do not offer a mechanism by which clients can continually track the current state of changing data.

If clients in your project need to track changing data, you might find sharing event distribution objects unavoidable. Polling the database might be an alternative, but it tends to further reduce scalability for anything but the smallest number of clients and the longest polling intervals.

Sharing an event distribution object does not mean your business logic must block against the concurrent notification mechanism. In fact, it is best to carry out notification asynchronously from the necessarily serial portions of business logic. This suggestion holds true whether notification is initiated directly by your objects or by artifacts you add to your database schema, such as stored procedures and triggers. Note that it also is not necessary to dispatch all callbacks from a single context, apartment, process, or host. Even problems such as notification that introduce scalability concerns into your architecture by forcing concurrency into it can benefit from some amount of distribution. For example, you might allow clients to register for callbacks with a number of hosts. This eases the load on any single node but forces you to keep a network of distribution nodes in sync.

The client notification pattern shows that there are situations in which you cannot avoid managing some amount of concurrency yourself. The previous example contains a data structure of callback registrations at each notification distribution node that must be protected from concurrent access. In a C++ implementation, you might find a Standard Template Library (STL) container protected by a Win32 synchronization object. A Visual Basic implementation might use the Shared Property Manager (SPM). But all implementations will need to regulate concurrent access through locking mechanisms controlled by your code. You will be able to make the best use of the information in the upcoming section on locking when faced with situations like these. Such cases should be few and far between, but pay particular attention to your design when you encounter them. The scalability of your product might depend on it.
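As a concrete sketch of the C++ arrangement just mentioned, the registration table at one distribution node might pair an STL map with a Win32 critical section; all names here are illustrative:

#include <map>

// Maps a registration cookie to the GIT cookie of a client's
// callback sink; guarded against concurrent access
class CCallbackRegistry
{
public:
    CCallbackRegistry()  { InitializeCriticalSection(&m_tCritSec); }
    ~CCallbackRegistry() { DeleteCriticalSection(&m_tCritSec); }

    void Add(DWORD nCookie, DWORD nGITCookie)
    {
        EnterCriticalSection(&m_tCritSec);
        m_cSinks[nCookie] = nGITCookie;
        LeaveCriticalSection(&m_tCritSec);
    }
    void Remove(DWORD nCookie)
    {
        EnterCriticalSection(&m_tCritSec);
        m_cSinks.erase(nCookie);
        LeaveCriticalSection(&m_tCritSec);
    }

private:
    std::map<DWORD, DWORD> m_cSinks;
    CRITICAL_SECTION m_tCritSec;
};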

Standard Synchronization Settings

Following are some general statements that describe which synchronization settings apply to which object, depending on its functional category in your project. These observations serve as a good rule of thumb—but keep in mind that all rules were made to be broken.

  • Business logic and data tiers: thread-neutral apartment with required synchronization. Generally this type of code does not care about thread affinity. Placing all the code in the TNA diminishes overhead of calls into these tiers and all but eliminates overhead for intra-tier calls. Requiring a synchronization domain will simplify the implementation significantly wherever object-level vs. method-level data is involved. Given the single-client model, not much can be gained by striking the synchronization domain requirement.
  • System service objects: MTA or TNA. Synchronization is required when the implementation cannot tolerate concurrency but is not supported when explicit locking in the object yields better results. The FTM might be aggregated by unconfigured objects with the threading model Both for the system type functionality with the greatest performance needs. Unconfigured objects running in the default context of the multithreaded apartment—a common situation—can share interface pointers without marshaling among threads or using the GIT,8 which makes their implementation fast and relatively straightforward, as long as you are prepared to accept the burden of implementing manual locking. The same holds true for unconfigured objects in the thread-neutral apartment. The thread-neutral apartment should be your default choice for this object type, but in some cases the multithreaded apartment makes for simpler implementation when blocking against synchronization objects is needed.
  • User interface objects: STA. Synchronization is required only when it is advantageous to group more than one downstream object into the same synchronization domain, or when transaction or other settings imply it. The window system architecture demands that the thread with ownership of a window—that is, the thread that created the window—must serve its window procedure. It might be difficult to control the lifetime of this creating thread if your object ran in an apartment other than a single-threaded one. In addition, manipulating your windows' handles will cause expensive thread switches if performed from any thread other than that of the window owner. Therefore, your choosing a single-threaded apartment for this kind of object is similar to ASP's choice to lock a session to an STA object's thread when that object is stored in session state.

Concurrency in Local Servers

In the days before MTS, the local server was a popular option for objects that had to be isolated from their clients for stability reasons, because clients in multiple processes could access objects that needed to share a process, or because an object might have to survive beyond the lifetime of the creating client process. The local server remains useful today for objects that are functionally tied to a particular client executable, especially regarding OLE and Automation of a client's user interface. It makes little sense to have an Automation object intended to alter the number of columns on the currently visible spreadsheet loaded in process or in the process space of a surrogate, when the spreadsheet application itself runs in its own executable. Local servers are also the best option for objects that must run within a Windows NT or Windows 2000 service process—because the objects require the local system's security context, a service's preloading behavior turns out to be beneficial, or perhaps an IT organization prefers to use the administrative mechanisms of Windows NT and Windows 2000 services for your server.

These special cases aside, the advent of MTS and COM+ has made the local server almost obsolete. Two factors have contributed to the local server's demise: lack of need and lack of new features. The lack of need stems from the MTS and COM+ "server application," which allows in-process servers to run in their own process space, eliminating what was previously the most common motivation for choosing a local server over an in-process server. The lack of new features is a result of the inability to install local server objects in the COM+ catalog. Therefore, these unconfigured local server objects cannot take advantage of COM+ transactions, synchronization domains, just-in-time activation, event subscription, and so on. In addition, local servers always have exhibited a number of unpleasant idiosyncrasies with regard to concurrency management. Let's examine them now.

Apartments in Local Servers

A ThreadingModel named value under the LocalServer32 key in the registry has no effect and normally is omitted. Instead, the apartment in which calls to each registered IClassFactory interface arrive is determined by the apartment membership of the thread that made the registration call via CoRegisterClassObject. Therefore, it is possible to force each class of object into a different apartment by registering each factory in a different apartment. Of course, each factory has the freedom to force each object instance into yet another apartment at instantiation time, as previously discussed.
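The following sketch shows one such apartment thread; CMyClassFactory and CLSID_MyClass are placeholders, and error handling is elided:

// One STA in a mixed-mode local server
DWORD WINAPI STAThreadProc(LPVOID /*pvParam*/)
{
    CoInitialize(NULL);                      // thread enters a new STA

    IClassFactory* piFactory = new CMyClassFactory;  // refcount of 1
    DWORD nRegister;
    // Instances of CLSID_MyClass will now be created on this thread
    CoRegisterClassObject(CLSID_MyClass, piFactory, CLSCTX_LOCAL_SERVER,
                          REGCLS_MULTIPLEUSE, &nRegister);

    MSG tMsg;                                // service the apartment
    while (GetMessage(&tMsg, NULL, 0, 0))    // until the quit message
        DispatchMessage(&tMsg);

    CoRevokeClassObject(nRegister);
    piFactory->Release();
    CoUninitialize();
    return 0;
}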

A local server that contains more than one apartment type is referred to as a mixed-mode server. Multi-apartment local servers can be a challenge to implement, since each STA thread must be kept alive as long as objects may remain in its apartment. Therefore, all these threads must be coordinated for startup and shutdown, introducing race conditions, which we will examine next. The factory registration mechanism also implies that local servers cannot instantiate objects in the thread-neutral apartment, since threads cannot register themselves for TNA membership. The thread-neutral apartment would hold less appeal here anyway, since clients encounter the heavy burden of a process switch on each call. Nevertheless, the TNA could be useful for callback interfaces handed to internally created in-process servers. If you want this benefit, you must go with a COM+ server application instead of a local server, or you must use the free-threaded marshaler.

Local Server Pitfalls

At least two well-known race conditions are associated with multithreaded local servers. The first has to do with server startup, the second with server shutdown. When initializing, each thread representing an apartment calls CoRegisterClassObject multiple times, once for each object class to be instantiated in that apartment. Then the thread enters a message loop until it receives a quit message, which signals impending process termination and instructs the thread to revoke all registered factories and then terminate. But object instantiation requests can arrive as soon as a factory has been registered and—in the case of an STA—the thread has begun servicing its message loop. This means that process termination can be initiated before all threads even have completed their initialization sequence, as a result of the entire "early" set of objects being released. In turn, this can lead to a situation in which threads fulfill new instantiation requests as they complete their initialization, after the process already has decided to shut down. When the shutdown does occur, clients are then left with disconnected proxies, resulting in errors when they attempt to call methods on those proxies.

The solution to this problem consists of registering all class factories in a suspended state (REGCLS_SUSPENDED). As each thread finishes registering its factories and enters its message loop, it decrements an interlocked counter; the thread that brings the counter to zero calls CoResumeClassObjects, which grants the system access to all factories simultaneously.
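A minimal sketch of this handshake follows; g_nThreadsInitializing is an illustrative global, assumed to be preset to the number of apartment threads, and CLSID_MyClass, piFactory, and nRegister are as in the previous sketch:

// Illustrative counter, preset to the number of apartment threads
extern LONG g_nThreadsInitializing;

// During each apartment thread's initialization:
CoRegisterClassObject(CLSID_MyClass, piFactory, CLSCTX_LOCAL_SERVER,
                      REGCLS_MULTIPLEUSE | REGCLS_SUSPENDED, &nRegister);

// ... after the thread has registered all its factories, just before
// it enters its message loop:
if (! InterlockedDecrement(&g_nThreadsInitializing))
    CoResumeClassObjects();  // last thread makes all factories visible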

The second race condition is similar, albeit associated with server shutdown. When the server's last object instance is released and the server decides to shut down, it posts quit messages to all threads. But instantiation requests can arrive at factories before each thread processes its quit message and revokes its factories. Again, clients will be left with disconnected proxies.

COM+ (and COM) provide the functions CoAddRefServerProcess and CoReleaseServerProcess, which assist multithreaded local servers in managing their lifetimes. The shutdown race condition is eliminated if all objects call CoAddRefServerProcess when they are initialized, and CoReleaseServerProcess when they are destroyed. These functions also should be called by the factories' IClassFactory::LockServer method implementations. The server should begin the shutdown process and post quit messages to its threads when CoReleaseServerProcess returns zero. The race condition is avoided because the COM+ library suspends all registered class factories before returning zero from CoReleaseServerProcess.
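In outline, the pattern looks like this; PostQuitMessagesToAllApartmentThreads is a hypothetical helper that posts WM_QUIT to each apartment thread:

// In every object's construction and destruction path:
CMyObject::CMyObject()
{
    CoAddRefServerProcess();
}

CMyObject::~CMyObject()
{
    if (! CoReleaseServerProcess())  // factories are now suspended
        PostQuitMessagesToAllApartmentThreads();  // hypothetical helper
}

// And in every factory:
STDMETHODIMP CMyClassFactory::LockServer(BOOL bLock)
{
    if (bLock)
        CoAddRefServerProcess();
    else if (! CoReleaseServerProcess())
        PostQuitMessagesToAllApartmentThreads();
    return S_OK;
}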

Partial Location Transparency

Location transparency, the principle of not caring where a COM+ server object is implemented, is somewhat compromised in local servers. A local server is free to distribute to its clients interface pointers for its own objects or objects in other processes. When handing out interface pointers for its own objects, the server's internal reference counting mechanism will keep these objects alive until all such pointers have been released. When distributing interface pointers for objects in another process, the client will be connected directly to that target process, eliminating the need for the intermediate local server process to stay alive while the interface pointer remains in use by the client.

But there is a problem with handing out interface pointers for in-process server objects created within the local server process: COM+ does not attempt to integrate the lifetime of an in-process server with that of a regular hosting local server. A local server will decide to shut down, despite having live connections to objects in its process space—objects that belong to in-process servers. The surrogate infrastructure of COM+ provides a solution to this issue. Otherwise, COM+'s own server application process, namely DLLHost.exe, would have the same problem. But this solution is realized in a way that makes it impossible for the local server implementer to take advantage of it, unless she wants to implement a custom surrogate.

Calling CoFreeUnusedLibraries within a local server still has the desired effect: in-process servers that respond affirmatively to DllCanUnloadNow are eventually unloaded. But there is no way to determine whether some in-process servers will remain loaded after the call because they continue to host live objects. Therefore, the only practical way for a local server to evade the issue is to avoid handing out interface pointers to such objects. But that means the code had to be aware of the object being implemented in an in-process server—and there goes the "transparency" in location transparency.

Implications

Implementing a well-performing local server is difficult. A local server must perform concurrency management all by itself. Deciding how to distribute objects across apartments is not easy. And a couple of race conditions await you. On top of all that, you can't use extended COM+ services or the thread-neutral apartment. But all these issues are neatly solved in COM+ and MTS server applications. If what you need is data and functionality isolation for a symmetric architecture, go with COM+ server applications. Leave local servers to user interface automation, OLE, and Windows NT or Windows 2000 services.

Locking

One of the great truths of the universe is that any amount of concurrency in a code segment always must be tempered with locks around nonlocal data that the code segment touches. In the world of COM+ and object orientation, this means having to protect an object's instance data from concurrent access by the object's methods. An introduction to data locking in the style of a freshman computer science course usually looks like this:

// Bus has space for 50 passengers
if (globalBusUsers < 50)
{
    // WARNING: thread could be preempted here
    ++globalBusUsers;
    RideBusToStation(sDestination);
    --globalBusUsers;
}
else
    WalkToStation(sDestination);

If a thread is preempted between the statements demarcated by the warning comment or two threads proceed through the if statement simultaneously on a multiprocessor machine, more than 50 requests for bus service could be made at one time. The testing and changing of the shared variable therefore has to occur in one single, indivisible step. Protecting the shared variable from concurrent access through locking accomplishes this goal.

A much shorter and simpler example also illustrates the need for locking. In the previous example, we actually can see the boundary between high-level statements where thread suspension would be dangerous. But allowing more than one thread at a time to execute the single, high-level statement ++globalBusUsers would be just as unsafe because such a high-level statement is not guaranteed to translate to one atomic machine instruction. To execute the increment, the CPU might be instructed to load a value into a register from memory, add 1 to this value, and then write the value back to memory. The executing thread could be interrupted after having read the value but before writing the incremented register. If a second thread can execute at least the read portion of this sequence before the first thread is scheduled to run again, we lose one increment.9
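On Win32, the interlocked APIs perform such read-modify-write sequences atomically. A plain InterlockedIncrement would cure the lost increment, but the bounded bus counter needs its test and increment fused as well, which calls for a compare-exchange loop. A sketch:

volatile LONG globalBusUsers = 0;   // shared across threads

bool TryBoardBus()
{
    for (;;)
    {
        const LONG nCurrent = globalBusUsers;
        if (nCurrent >= 50)
            return false;   // bus full: caller walks instead
        // Atomically: replace the counter with nCurrent + 1 only if
        // it still equals nCurrent; otherwise another thread raced
        // us, and we retry with the fresh value
        if (InterlockedCompareExchange(&globalBusUsers, nCurrent + 1,
                                       nCurrent) == nCurrent)
            return true;    // seat claimed safely
    }
}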

Coarse-Grained Locks

Under COM and COM+, a popular locking option consists of preventing access to an entire object while a thread is executing within that object, regardless of which of the object's functional areas the executing and waiting threads exercise. I call this kind of lock a coarse-grained lock. In many cases, such locks represent a reasonable compromise between the maximum possible amount of concurrency in a given object and the amount of bookkeeping required to attain this high degree of concurrency. Such locks neatly solve the problem of concurrent access to object-level state, since only object method implementations can access this data and the coarse-grained lock will allow only one such method to execute at a time.

Lock Types

Let's take a look at what options we have for implementing coarse-grained object locks with COM+ and how these options compare:

  • You can assign the object to a single-threaded apartment. This effectively locks the object whenever the apartment thread is active within it, but it also ties the object to a single thread. Therefore, this option is generally too strong for simple coarse-grained locking. In addition, all objects in the apartment compete for the same thread. In essence, you have only one lock for all objects in the apartment and all other activities that thread might perform, perhaps including user interface duty. STAs are very light on bookkeeping but often too weak on concurrency.
  • Objects in the multithreaded and thread-neutral apartments that do not require or support synchronization domains can be locked manually by entering a critical section at the beginning of every method and leaving it before returning. With ATL, the methods CComObjectRootEx::Lock and CComObjectRootEx::Unlock perform this function. If locks can be obtained late and released early, this option sometimes leads to greater concurrency than a synchronization domain would, and at relatively low cost. However, a COM+ synchronization domain often performs just as well and avoids certain issues with manual locking that we will examine in a moment.
  • Objects in the MTA and the TNA can require a synchronization domain, or they require a new synchronization domain if they do not want to share their access lock with their instantiators. Since the deadlock condition I described previously affects STA objects, synchronization domains are not useful for deferring calls from secondary causalities to such objects; they are useful to STA objects only to form synchronization domain roots under which more than one downstream MTA or TNA object is grouped.

If you determine that an object must be shared and the question of locking arises, a configured TNA object that requires a new synchronization domain generally will offer the highest degree of concurrency with no restrictions on how the sharing can occur. If you find that you need a greater degree of concurrency and are willing to invest additional effort in bookkeeping, a fine-grained lock will be your next step. (I'll talk about fine-grained locks in the next section.)

Locking on STA Threads

Objects in the multithreaded and thread-neutral apartments can use a synchronization object such as a critical section to regulate concurrent access to their object state, in lieu of requiring a synchronization domain. Single-threaded objects as well as objects with the threading model Both cannot do this, unless they also are context neutral and do not need to be shared across processes, lest these objects encounter a situation similar to the one depicted in Figure 4-3. The motivation for locking objects in an STA is to defer calls from other causalities, but you must choose a mechanism other than call blocking to accomplish this. However, an object in an STA still might want to use locking to access other resources—for example, resources the object might share with objects in other apartments. Or the object might need to coordinate the execution of certain code segments with the state of other threads in its process. In all these cases (save for the one involving the multithreaded apartment), we might face blocking a thread that belongs to an STA.

Suspending an STA thread for any significant amount of time can slow down your overall system, since all requests to any of the objects in the apartment will block until the thread resumes servicing its message loop. You might even cause deadlock if the synchronization object you are waiting on for release depends on activity in one of the objects in the blocking thread's apartment.

Before COM+, the way to avoid apartment stalls was to use MsgWaitForMultipleObjects, examine its return value, and, if indicated, retrieve and dispatch messages from the queue. This was rather inconvenient, and even unnecessary for Both-threaded objects instantiated in the multithreaded apartment and for method invocations on FTM aggregators that came from the multithreaded apartment. In that regard, the COM+ thread-neutral apartment has the same semantics as the FTM: each call can occur on either an STA or an MTA thread. Unfortunately, it is difficult to determine the apartment affinity of a calling thread10; therefore, the cumbersome MsgWaitForMultipleObjects approach had to be chosen on all calls.

COM+ brings us the much more convenient and efficient CoWaitForMultipleHandles. This function detects the apartment affinity of the calling thread, and either dispatches messages while waiting for the given synchronization handles or uses WaitForMultipleObjects if the calling thread is a member of the MTA. This function is now the API of choice whenever STA threads are involved and should be the only wait function you ever use in a thread-neutral or context-neutral object.
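Its use is straightforward; hEvent here is a hypothetical handle, and the five-second time-out is arbitrary:

DWORD nIndex;
HRESULT hResult = CoWaitForMultipleHandles(
    0,          // default behavior: wait for any handle
    5000,       // give up after five seconds
    1,          // one handle follows
    &hEvent,    // hypothetical synchronization handle
    &nIndex);   // index of the handle that became signaled
if (hResult == RPC_S_CALLPENDING)
{
    // The time-out expired before the handle was signaled
}
// On an STA thread, COM+ dispatched messages during the wait; on an
// MTA thread, the call simply blocked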

Causality-Based Locking

At first glance, manual locking appears to be a simple and effective alternative to requiring a synchronization domain in the MTA and TNA. There is, however, one big drawback: COM+ synchronization domains employ causality-based locks. But the Win32 critical section or mutex most often used to implement manual object locking—while offering recursion functionality in principle—records only the physical thread ID of its owner. Therefore, manual locking has the effect of ruining reentrancy whenever callbacks in the same causality are received on a thread other than the one owning the Win32 synchronization object. When such reentrancy does occur, deadlock will result.

The first step toward regaining this reentrancy is determining the causality of the caller. COM+ transmits a causality ID across the channel from a calling thread to the called thread. Unfortunately, no documented way of obtaining access to this ID exists. However, Ole32.dll exports the function CoGetCurrentLogicalThreadId, which retrieves the ID for you. Since the function is not in any import library, its address must be retrieved dynamically from the DLL. Using undocumented system calls is always risky, but in my opinion, the function is not likely to be removed because the result it delivers is necessary to implement published functionality (COM+ synchronization domains, for example). And it is impossible to implement a causality-based lock without this information.

Now I will show you a complete implementation of a causality-based lock, using CoGetCurrentLogicalThreadId. The lock stalls secondary causalities with CoWaitForMultipleHandles and is therefore appropriate for use in all apartments. To determine whether to grant a lock request, the implementation considers whether the lock is currently in an unclaimed state; if it is not, the implementation considers whether the causality of the caller matches the causality of the lock owner. If the causalities match, the caller can proceed after a usage counter has been incremented. If they do not match, the caller must wait for an event to be raised before proceeding through the lock, incrementing the same usage counter as it does. When the lock has been released as many times as the owning causality acquired it, this event is signaled if threads are waiting to be admitted. Because the event is an auto-reset event, only one waiting thread is admitted into the lock, and that thread will set the new owner causality ID. The lock is implemented this way:

// CausalityLock.h: interface for the CCausalityLock class
//
//////////////////////////////////////////////////////////////////////

#pragma once

class CCausalityLock
{
// Construction - destruction
public:
    CCausalityLock();
    ~CCausalityLock();

// Acquisition and release
    void Lock();
    void Unlock();

// Implementation
private:
    // API function
    typedef HRESULT (STDAPICALLTYPE *t_pfnCGCLTId)(LPGUID);
    static t_pfnCGCLTId s_pfnCoGetCurrentLogicalThreadId;
    static class CInitStatic
    {
    public:
        CInitStatic();
        ~CInitStatic();
    private:
        HMODULE m_hOle32;
    } s_cInitStatic;
    friend class CInitStatic;

    // Causality ID
    GUID m_tCausalityId;
    // Usage and waiting thread counters
    unsigned long m_nUsageCount, m_nWaitCount;
    // Event for blocking secondary causalities
    HANDLE m_hEvent;
    // Internal lock
    CRITICAL_SECTION m_tCritSec;
};

// CausalityLock.cpp: implementation of the CCausalityLock class
//
//////////////////////////////////////////////////////////////////////

#include "stdafx.h"
#include "CausalityLock.h"
#include "..\CSB\ComSTLBridge\CSBError.h"
#include "..\CSB\ComSTLBridge\AtlSupport.h"

//////////////////////////////////////////////////////////////////////
// Static storage initialization
//////////////////////////////////////////////////////////////////////

CCausalityLock::t_pfnCGCLTId
    CCausalityLock::s_pfnCoGetCurrentLogicalThreadId;
CCausalityLock::CInitStatic CCausalityLock::s_cInitStatic;

CCausalityLock::CInitStatic::CInitStatic()
{
    // Get CoGetCurrentLogicalThreadId proc address
    m_hOle32 = LoadLibrary(_T("Ole32.dll"));
    _ASSERTE (m_hOle32);
    CCausalityLock::s_pfnCoGetCurrentLogicalThreadId =
        reinterpret_cast <t_pfnCGCLTId>
        (GetProcAddress(m_hOle32, "CoGetCurrentLogicalThreadId"));
    _ASSERTE (CCausalityLock::s_pfnCoGetCurrentLogicalThreadId);
}

CCausalityLock::CInitStatic::~CInitStatic()
{
    _VERIFYE (FreeLibrary(m_hOle32));
}

//////////////////////////////////////////////////////////////////////
// Construction - destruction
//////////////////////////////////////////////////////////////////////

CCausalityLock::CCausalityLock() :
    m_nUsageCount (0),
    m_nWaitCount (0)
{
    // Initialize synchronization objects
    InitializeCriticalSection(&m_tCritSec);
    m_hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);
    if (! m_hEvent)
    {
        DeleteCriticalSection(&m_tCritSec);
        CCSBStdException::CSBThrowException(static_cast <HRESULT>
            (HRESULT_FROM_WIN32(GetLastError())));
    }
}

CCausalityLock::~CCausalityLock()
{
    _ASSERTE (! m_nUsageCount && ! m_nWaitCount);
    // Free synchronization objects
    _VERIFYE (CloseHandle(m_hEvent));
    DeleteCriticalSection(&m_tCritSec);
}

//////////////////////////////////////////////////////////////////////
// Acquisition and release
//////////////////////////////////////////////////////////////////////

void CCausalityLock::Lock()
{
    // Determine caller's causality
    HRESULT hResult;
    GUID tCallerCausalityId;
    if (FAILED(hResult = s_pfnCoGetCurrentLogicalThreadId(
               &tCallerCausalityId)))
        CCSBStdException::CSBThrowException(hResult);

    // Test for reentering (or initial) caller
    EnterCriticalSection(&m_tCritSec);
    const bool bWaitingCausality = (m_nWaitCount || m_nUsageCount) &&
        (! m_nUsageCount || tCallerCausalityId != m_tCausalityId);
    if (bWaitingCausality)
        ++m_nWaitCount;
    else
        // Acquisition must be completed without leaving the
        // critical section
        goto CompleteLockAcquisition;
    LeaveCriticalSection(&m_tCritSec);

    // New causalities must wait for auto-reset event
    DWORD nIndex;
    if (FAILED(hResult = CoWaitForMultipleHandles(0, INFINITE, 1,
               &m_hEvent, &nIndex)))
    {
        // Sync object failure - full recovery not guaranteed
        EnterCriticalSection(&m_tCritSec);
        --m_nWaitCount;
        LeaveCriticalSection(&m_tCritSec);
        _VERIFYE (ResetEvent(m_hEvent));
        CCSBStdException::CSBThrowException(hResult);
    }

    // Single causality is now in possession of this lock
    EnterCriticalSection(&m_tCritSec);

CompleteLockAcquisition:
    if (! m_nUsageCount++)
        m_tCausalityId = tCallerCausalityId;
    if (bWaitingCausality)
        --m_nWaitCount;
    LeaveCriticalSection(&m_tCritSec);
}

void CCausalityLock::Unlock()
{
    _ASSERTE (m_nUsageCount);
    EnterCriticalSection(&m_tCritSec);
    // Decrement usage count; new threads can be recognized as
    // initial causalities as soon as critical section is released
    if (! --m_nUsageCount && m_nWaitCount)
        // Allow one blocked secondary causality to proceed now
        _VERIFYE (SetEvent(m_hEvent));
    LeaveCriticalSection(&m_tCritSec);
}

Lock Management

In most situations, a call to acquire a lock must be balanced with a call to release it. In object locking, these calls usually occur within the bounds of each interface method. Making sure that lock release always happens can be a messy and error-prone undertaking, similar to ensuring the release of interface pointers and the deallocation of BSTRs. Therefore, the recommended approach to solving this problem systematically consists of treating locks as normal resources and applying the resource-deallocation-on-destruction pattern.

To implement this approach, place a class instance of a lock acquirer and releaser on the stack. You can perform lock release manually, but it happens automatically when the object is destroyed—that is, when the function returns. However, many different kinds of locks exist, and they must be acquired and released in syntactically different ways. Therefore, the first step in avoiding code replication of such an acquisition and release class involves turning it into a template, parameterized on the class type of the actual lock. If this class type does not support acquisition and release via methods the lock template expects to call, one or more adapter classes for the lock will be necessary (perhaps in the form of functionals if the lock itself is a Win32 synchronization object manipulated by API calls, instead of a wrapper class). The adapter class translates the template's requests for acquisition and release into lock class method calls. A simple lock template can take this form:

//////////////////////////////////////////////////////////////////////
// CCSBLock:  easy to use resource allocator that can lock and unlock
//            whatever synchronization object or CComObjectRootEx is
//            passed to it. Lock can be called multiple times, and
//            the resource will be unlocked only when this object is
//            destroyed or when the number of Unlock calls matches
//            the number of Lock calls.

template <class T> class CCSBLock
{
// Construction - destruction
public:
    CCSBLock(T& rTLockee, bool bAutoLock = true) :
        m_rTLockee (rTLockee),
        m_nLockCount (bAutoLock ? 1 : 0)
    {
        if (bAutoLock)
            rTLockee.Lock();
    }
private:
    CCSBLock(const CCSBLock<T>&);
    CCSBLock<T>& operator= (const CCSBLock<T>&);
public:
    ~CCSBLock()
    {
        if (m_nLockCount)
            m_rTLockee.Unlock();
    }

// Operations
    void Lock()
    {
        if (! m_nLockCount++)
            m_rTLockee.Lock();
    }
    void Unlock()
    {
        _ASSERTE (m_nLockCount);
        if (! --m_nLockCount)
            m_rTLockee.Unlock();
    }

// Implementation
private:
    T& m_rTLockee;
    unsigned short m_nLockCount;
};

Using this lock template to synchronize access to the state of an ATL object in the multithreaded or thread-neutral apartment might then look like this:

STDMETHODIMP CFoo::Bar()
{
    CCSBLock<CFoo> cLock(*this);
    …
    return S_OK;
}

If your ATL object's base class indicates the allocation and use of a critical section (for example, because the ThreadModel parameter of CComObjectRootEx was set to CComMultiThreadModel), this code segment would lead to acquisition of that critical section for the entire scope of the Bar method. To upgrade CFoo to a causality-based lock, it must first aggregate an instance of CCausalityLock—for instance, m_cCLock. To use this new lock instead of the critical section for object locking, you then have the option of either amending the class declaration or altering the way the object lock is acquired. The former approach might look like this:

class CFoo : public …
{
    …
// Implementation
private:
    …
    CCausalityLock m_cCLock;
    …
    void Lock()
    {
        m_cCLock.Lock();
    }
    …
    void Unlock()
    {
        m_cCLock.Unlock();
    }
};

And the latter approach might appear as follows:

STDMETHODIMP CFoo::Bar()
{
    CCSBLock<CCausalityLock> cLock(m_cCLock);
    …
    return S_OK;
}

Finally, let's suppose that CFoo aggregated an additional critical section for more specific locking, to be used only in certain methods. The CCSBLock template cannot be used to enter and leave the critical section directly, so an adapter is needed. We could implement an adapter class that aggregates the critical section structure and associate the two objects with one another whenever we create an instance of the adapter class on the stack. But now we have to create two objects to implement stack-based resource management. We could also have CFoo aggregate the adapter class instead of the critical section directly. This would work well in the case of CFoo, but not in general, where we might need to deal with synchronization objects already held by base classes or other objects. For the ultimate flexibility and convenience, we therefore opt to extend CCSBLock to accept an optional lock acquisition and release functional that becomes aggregated by the new lock template itself. This is our new lock template, complete with an adapter for the general case and a specialization for critical sections:

template <class T> class CLockAdapterFunctional
{
// Construction - destruction
private:
    CLockAdapterFunctional<T>& operator= (const
        CLockAdapterFunctional<T>&);

// Acquisition and release
public:
    void operator () (T& rTLockee, bool bAcquire)
    {
        if (bAcquire)
            rTLockee.Lock();
        else
            rTLockee.Unlock();
    }
};

template <> class CLockAdapterFunctional<CRITICAL_SECTION>
{
// Construction - destruction
private:
    CLockAdapterFunctional<CRITICAL_SECTION>& operator= (const
        CLockAdapterFunctional<CRITICAL_SECTION>&);

// Acquisition and release
public:
    void operator () (CRITICAL_SECTION& rtCritSec, bool bAcquire)
    {
        if (bAcquire)
            EnterCriticalSection(&rtCritSec);
        else
            LeaveCriticalSection(&rtCritSec);
    }
};

template <class T, class CAdapterFunc = CLockAdapterFunctional<T> >
class CAdapterLock
{
// Construction - destruction
public:
    CAdapterLock(T& rTLockee, bool bAutoLock = true,
                 const CAdapterFunc& rcAdapter = CAdapterFunc()) :
        // our adapter functionals don't carry state, but yours
        // might, and so we prepare to maintain functional state
        m_rTLockee (rTLockee),
        m_cAdapter (rcAdapter),
        m_nLockCount (bAutoLock ? 1 : 0)
    {
        if (bAutoLock)
            m_cAdapter(rTLockee, true);
    }
private:
    CAdapterLock(const CAdapterLock<T>&);
    CAdapterLock<T>& operator= (const CAdapterLock<T>&);
public:
    ~CAdapterLock()
    {
        if (m_nLockCount)
            m_cAdapter(m_rTLockee, false);
    }

// Operations
    void Lock()
    {
        if (! m_nLockCount++)
            m_cAdapter(m_rTLockee, true);
    }
    void Unlock()
    {
        _ASSERTE (m_nLockCount);
        if (! --m_nLockCount)
            m_cAdapter(m_rTLockee, false);
    }

// Implementation
private:
    T& m_rTLockee;
    CAdapterFunc m_cAdapter;
    unsigned short m_nLockCount;
};

You now could adjust the previous code segment and replace CCSBLock with CAdapterLock, and everything would continue to work the same way. But you also could declare a lock manager instance of type CAdapterLock<CRITICAL_SECTION> and obtain the desired behavior. Note that this improved template allows you to define further specializations of CLockAdapterFunctional. And because the template carries the adapter as state, you can also pass a regular function pointer to a synchronization object management function. Again, in the case of critical sections, passing a regular function pointer might look like this:

void ManageCriticalSection(CRITICAL_SECTION& rtCritSec, bool bAcquire)
{
    if (bAcquire)
        EnterCriticalSection(&rtCritSec);
    else
        LeaveCriticalSection(&rtCritSec);
}

typedef void (*t_pfnCritSecManage)(CRITICAL_SECTION&, bool);

STDMETHODIMP CFoo::Bar()
{
    CAdapterLock<CRITICAL_SECTION, t_pfnCritSecManage>
        cLock(m_tSecondarySection, true, &ManageCriticalSection);
    …
    return S_OK;
}

As you have seen, options for stack-based automatic management of lock resources abound. Use them and avoid getting lost in tracing all possible exit paths (which might include exceptions for functions that are not interface method implementations) from your methods.

Fine-Grained Locks

At times a coarse-grained, object-level lock can be too broad and simple to provide a sufficient amount of concurrency. The implementer of a concurrent object then might have to identify which code paths and object uses do not conflict with one another, and apply separate locking mechanisms to only the mutually exclusive areas. One way to accomplish this is to aggregate more than one lock and have the implementation acquire the lock instance that protects one particular data member or set of data members. Everything I showed you in the previous section is applicable to this approach.

An alternate and often complementary fine-grained locking approach consists of pushing into the lock information about the nature of the data access sought, and having the lock determine whether and how long to block a caller, depending on the types of access currently in progress. A lock that performs this task is sometimes called a group lock, because it services different groups of callers that are asking for and engaged in more than one type of activity.

The most common form of group lock is the multi-read, single-write (MRSW) lock, also called the read/write lock. This lock distinguishes between read and update access to a resource (usually data in memory), offering two distinct lock acquisition methods. This lock can have an arbitrarily large number of lock owners in its reader group but only one owner in its writer group. In addition, the two groups are mutually exclusive—in other words, no readers can own the lock if a writer currently owns it, and vice versa.

MRSW locks are so popular because in-memory data is amenable to simultaneous reading but naturally vulnerable to corruption when updates are in progress. We have already seen why simultaneous updates cause trouble. The reason even read access must be blocked while a single update operation is in progress concerns the internal consistency of multiple data elements when the update involves more than one memory location. Common data structures such as trees and lists have update isolation requirements similar to the ones that database transactions offer. For example, to remove a node from something as simple as a doubly linked list, more than one pointer must be adjusted. Traversing the list while one pointer has been adjusted and the other has not could easily lead to faults. Therefore, readers must be blocked until the entire update operation has been completed. This is also why it is often necessary to protect access to more than one data element with a single lock: the data elements might be related, and you might have to avoid an inconsistent state.
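To make the hazard concrete, consider unlinking a node from a doubly linked list; between the two pointer assignments below, a concurrent reader sees a half-updated structure (a minimal sketch):

struct Node
{
    Node* pPrev;
    Node* pNext;
    // ... payload ...
};

void Unlink(Node* pNode)
{
    pNode->pPrev->pNext = pNode->pNext;
    // A reader traversing backward at this instant still reaches
    // pNode through pNext->pPrev: the list is inconsistent until
    // the second assignment completes
    pNode->pNext->pPrev = pNode->pPrev;
}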

Library vendors commonly certify their products as safe for concurrent access as long as update operations are not involved. In fact, the STL portion of the Standard C++ Library does not contain locks to protect itself from concurrent read/write access to the same data structures, and if multiple threads can access artifacts such as container objects, such access must be guarded by at least an MRSW lock. If you find that coarse-grained locking provides insufficient concurrency in an object, an MRSW lock might help solve the problem. We will spend the rest of this section examining MRSW lock features and implementations.

MRSW Lock Concepts

Its description makes the MRSW lock appear deceptively simple. Upon some consideration, however, you will discover a number of subtleties in the MRSW specification whose resolution affects how an MRSW lock instance can be used. Let's examine these possible features:

  • Recursion. Win32 mutexes and critical sections are recursive. For an MRSW lock, recursiveness is defined by a reader's ability to reacquire read access and a writer's ability to reacquire write access while still holding the lock, all without blocking.
  • Promotion. This feature is defined by a reader's ability to acquire a write lock without releasing its read lock.
  • Inversion. This feature is defined by a writer's ability to acquire a read lock without releasing its write lock. You might ask why you ever would want to do this, given that a write lock is really a superset of a read lock. But the need for inversion frequently arises in implementations, when functions holding a write lock call functions requiring only a read lock. Releasing the write lock before calling the other function is often not an option, since it would break up the atomicity of the update operation.
  • Writer starvation. A steady stream of overlapping read lock requests might prevent blocked writers from ever proceeding into the lock. A sophisticated lock implementation will block readers when writers are queued, even if the active reader group is not empty.

MRSW Lock Implementations

Windows NT and Windows 2000 offer an MRSW lock implementation to kernel-mode device driver implementers. Unfortunately, the only such lock available to Win32 applications is not documented.11 In addition, this lock does not offer promotion (an attempt to promote will result in deadlock) and does not prevent writer starvation.

Because the lock functions remain undocumented in Windows 2000, and because it is quite possible to build your own MRSW lock on top of Win32 synchronization objects, I do not advocate using the undocumented system MRSW lock. Building an MRSW lock on top of the Win32 API also makes it portable to all Win32 platforms, not just the current versions of Windows NT and Windows 2000.

A Full-Featured Win32 MRSW Lock

Building your own MRSW lock is possible, but it is not easy. In fact, implementers face a number of subtle and sometimes rare race conditions that must be avoided. Implementing promotion is particularly challenging, since any reader can be promoted to a writer only when all readers are applying to be upgraded to writers. Leaving MRSW implementation as an exercise for you to do on your own would be a bit unfair, so I will dedicate the rest of this section to one complete MRSW lock implementation.

The lock whose source code follows has these operational characteristics:

  • An arbitrary number of threads can simultaneously hold a read lock.
  • Only one thread at a time can hold a write lock.
  • If a write lock has been acquired, attempts to acquire a read lock will block (or time out) until the write lock has been released, unless the request to obtain the read lock is being made by the single thread (which holds the write lock), in which case the request will succeed immediately.
  • If one or more read locks have been acquired, attempts to acquire a write lock will block (or time out) until all read locks have been released, unless all read locks are being held by threads that request to obtain a write lock, in which case one thread (but not necessarily one holding a read lock) will be granted the write lock. All other threads will remain blocked until this condition evaluates to true again—that is, all threads holding read locks are blocked on write lock requests and no outstanding write lock exists.
  • If a write lock has been acquired, only the single thread holding the lock can reacquire it; all other threads will block (or time out) on a write lock request until the write lock has been released.
  • The lock can be configured to accept time-out values in lock acquisition requests. If so configured, the lock has these additional characteristics:

    • The lock uses slightly more resources and is slightly slower.
    • A time-out value can be passed to the acquisition methods; this value is passed through to Win32 wait functions.
    • The acquisition methods return false on failures because of time-outs.
    • The ResetStoredTimeout method or the constructor can be used to store a default time-out in the lock object; this value will be used if the MRSW_STORED_TIMEOUT constant is passed to the acquisition methods.
    • When a thread that holds a read lock requests a write lock, the time-out parameter to the acquisition method is automatically overridden to carry the value INFINITE.
    • While a write lock is being requested (and the requester is blocked), and while a write lock is outstanding, all new threads will block (or time out) on any lock request. The order in which these requests will be granted is guaranteed first in, first out (FIFO) when the lock is configured to accept time-out values; otherwise, the system's policy for critical sections applies.

Being unable to honor requests for time-out during promotion is an interesting, if unexpected, byproduct of MRSW requirements: at the time a reader submits its request for promotion, a blocked writer can be admitted into the lock if that reader was the last active reader not stalled in a promotion request. If we permitted this promotion request (or that of any other reader stalled in its promotion request at the time a writer is released) to time out, the caller would be informed that the promotion time interval had indeed expired. However, that caller would certainly still expect to hold the read lock it was attempting to upgrade. Continuing with its operation after the promotion failure could lead to concurrent read and write operations, which cannot be allowed. All promotions therefore are executed without a time-out.
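Before we turn to the source code, here is a sketch of the lock in use; the cache type and CustomerRecord are illustrative, and the waitable configuration enables the time-out behavior described above:

CMRSWLock g_cLock(true, 2000);           // waitable, 2-second default
std::map<int, CustomerRecord> g_cCache;  // hypothetical shared data

bool LookupCustomer(int nId, CustomerRecord& rtRecord)
{
    if (! g_cLock.AcquireReadLock())     // stored time-out applies
        return false;                    // timed out
    std::map<int, CustomerRecord>::iterator cPos = g_cCache.find(nId);
    const bool bFound = cPos != g_cCache.end();
    if (bFound)
        rtRecord = (*cPos).second;
    g_cLock.ReleaseReadLock();
    return bFound;
}

void UpdateCustomer(int nId, const CustomerRecord& rcRecord)
{
    g_cLock.AcquireWriteLock(INFINITE);  // exclude all readers
    g_cCache[nId] = rcRecord;
    g_cLock.ReleaseWriteLock();
}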

Let's look at the source code for this MRSW lock:

// MRSWLock.h : header file
//

#ifndef MRSWLock_H
#define MRSWLock_H

#include "AtlSupport.h"
#include "ComSTLBridgeExpImp.h"

///////////////////////////////////////////////////////////////////////////
// CMRSWLockRoot: the multi-reader, single-writer lock; the lock can be
//                configured to accept a time-out value on acquisition

class DLLEXPORTIMPORT CMRSWLockRoot
{
// Types and constants
public:
    // Constant indicates that previously set time-out should be used
    static const DWORD MRSW_STORED_TIMEOUT;

// Constructors - destructors
    CMRSWLockRoot(bool bWaitableLock = false, DWORD nInitialTimeout =
                  INFINITE);
    ~CMRSWLockRoot();

    // Copy and assignment
private:
    CMRSWLockRoot(const CMRSWLockRoot&);                 // not impl
    CMRSWLockRoot& operator = (const CMRSWLockRoot&);    // not impl

// Operations
public:
    bool AcquireReadLock(DWORD nMillisecondTimeout = MRSW_STORED_TIMEOUT);
    bool AcquireWriteLock(DWORD nMillisecondTimeout = MRSW_STORED_TIMEOUT);
    void ReleaseReadLock();
    void ReleaseWriteLock();
    void ResetStoredTimeout(DWORD nMilliseconds);

// Implementation
private:
    class CPrivateLockRouter
    {
    public:
        inline CPrivateLockRouter(CMRSWLockRoot& rcTarget);
        inline void Lock();
        inline void Unlock();
    private:
        CMRSWLockRoot& m_rcTarget;
    };
    friend class CPrivateLockRouter;

    class CPrivateLock : public CCSBLock<CPrivateLockRouter>
    {
    public:
        inline CPrivateLock(CMRSWLockRoot& rcTarget);
    private:
        CPrivateLockRouter m_cLockRouter;
    };

    typedef std::map<DWORD, unsigned short> t_ARM;

    inline void            Lock();
    inline void            Unlock();
    bool                   AcquireBlade(DWORD nMillisecondTimeout);
    inline void            ReleaseBlade();
    void                   UpdateEvent(bool bTowardsSignalledState);

    const bool             m_bWaitableLock;
    DWORD                  m_nStoredTimeout;
    CRITICAL_SECTION       m_tInternalLock;
    union
    {
        CRITICAL_SECTION   m_tReadersWriterBlade;
        HANDLE             m_hReadersWriterBlade;
    };
    HANDLE                 m_hSingleWriterStop;
    t_ARM                  m_cActiveReaders;
};

///////////////////////////////////////////////////////////////////////////
// The read and write lock classes in the derivation diamond are provided
// for the benefit of the auto-balance lock template

class DLLEXPORTIMPORT CMRSWReadLock : public virtual CMRSWLockRoot
{
public:
    void Lock();
    void Unlock();
};

class DLLEXPORTIMPORT CMRSWWriteLock : public virtual CMRSWLockRoot
{
public:
    void Lock();
    void Unlock();
};

class DLLEXPORTIMPORT CMRSWLock : public CMRSWReadLock,
                                  public CMRSWWriteLock
{
// Types
public:
    typedef CMRSWReadLock t_ReadLock;
    typedef CMRSWWriteLock t_WriteLock;

// Construction
    CMRSWLock(bool bWaitableLock = false, DWORD nInitialTimeout =
              INFINITE);
};

///////////////////////////////////////////////////////////////////////////
// TMRSWLock: a threading-model-sensitive template that resolves to either
//            the CMRSWLock class or a fake lock

// The SectionSwitch helps TMRSWLock in branching to full or stubbed
// functionality by examining the underlying critical section class; it can
// be used directly if desired

template <class TCriticalSection>
class TMRSWSectionSwitch
{
// Types and constants - for compatibility with CMRSWLock
public:
    typedef TMRSWSectionSwitch t_ReadLock;
    typedef TMRSWSectionSwitch t_WriteLock;
    enum { MRSW_STORED_TIMEOUT };

// Constructors - destructors
    TMRSWSectionSwitch(bool /*bWaitableLock*/ = false,
                       DWORD /*nInitialTimeout*/ = INFINITE) {}

    // Copy and assignment
private:
    TMRSWSectionSwitch(const TMRSWSectionSwitch&);              // not impl
    TMRSWSectionSwitch& operator = (const TMRSWSectionSwitch&); // not impl

// Operations
public:
    bool AcquireReadLock(DWORD /*nMillisecondTimeout*/ =
                         MRSW_STORED_TIMEOUT)
        { return true; }
    bool AcquireWriteLock(DWORD /*nMillisecondTimeout*/ =
                          MRSW_STORED_TIMEOUT)
        { return true; }
    void ReleaseReadLock() {}
    void ReleaseWriteLock() {}
    void ResetStoredTimeout(DWORD /*nMilliseconds*/) {}
    void Lock() {}
    void Unlock() {}
};

template <>
class TMRSWSectionSwitch<CComCriticalSection> : public CMRSWLock
{
public:
    TMRSWSectionSwitch(bool bWaitableLock = false,
                       DWORD nInitialTimeout = INFINITE) :
        CMRSWLock (bWaitableLock, nInitialTimeout) {}
};

// The TMRSWLock will be the MRSW lock type most useful to the majority of
// clients, since it is sensitive to the ATL threading model type

template <class TThreadModel>
class TMRSWLock :
    public TMRSWSectionSwitch<typename TThreadModel::CriticalSection>
{
public:
    TMRSWLock(bool bWaitableLock = false, DWORD nInitialTimeout =
              INFINITE) :
        TMRSWSectionSwitch<typename TThreadModel::CriticalSection>
            (bWaitableLock, nInitialTimeout) {}
};

///////////////////////////////////////////////////////////////////////////
#endif

// MRSWLock.cpp : implementation file
//

#include "stdafx.h"
#include "MRSWLock.h"
#include "CSBError.h"
#include "ComSTLBridgeStatusCodes.h"

///////////////////////////////////////////////////////////////////////////
// CMRSWLockRoot

///////////////////////////////////////////////////////////////////////////
// Class variable initializations

const DWORD CMRSWLockRoot::MRSW_STORED_TIMEOUT = INFINITE - 1;

///////////////////////////////////////////////////////////////////////////
// Construction and destruction

CMRSWLockRoot::CMRSWLockRoot(bool bWaitableLock /*= false*/,
                             DWORD nInitialTimeout /*= INFINITE*/) :
    m_bWaitableLock (bWaitableLock),
    m_nStoredTimeout (nInitialTimeout)
{
    _ASSERTE (bWaitableLock || nInitialTimeout == INFINITE);

    // Initialize internally used lock; may throw exception
    InitializeCriticalSection(&m_tInternalLock);

    bool bInitStage2 = false;
    try
    {
        // Initialize blade; blade type determined by flag
        if (bWaitableLock)
        {
            if (! (m_hReadersWriterBlade = CreateMutex(NULL, FALSE, NULL)))
                CCSBStdException::CSBThrowException(static_cast <HRESULT>
                    (HRESULT_FROM_WIN32(GetLastError())));
        }
        else
            InitializeCriticalSection(&m_tReadersWriterBlade);
        bInitStage2 = true;

        // Initialize event
        if (! (m_hSingleWriterStop = CreateEvent(NULL, TRUE, TRUE, NULL)))
            CCSBStdException::CSBThrowException(static_cast <HRESULT>
                (HRESULT_FROM_WIN32(GetLastError())));
    }
    catch (...)
    {
        // Free resources allocated before the failure occurred
        DeleteCriticalSection(&m_tInternalLock);
        if (bInitStage2)
            if (bWaitableLock)
                _VERIFYE (CloseHandle(m_hReadersWriterBlade));
            else
                DeleteCriticalSection(&m_tReadersWriterBlade);
        throw;
    }
}

CMRSWLockRoot::~CMRSWLockRoot()
{
    // Lock must not be in use now
    _ASSERTE (m_cActiveReaders.empty());

    // Free all resources used by this lock
    DeleteCriticalSection(&m_tInternalLock);
    if (m_bWaitableLock)
        _VERIFYE (CloseHandle(m_hReadersWriterBlade));
    else
        DeleteCriticalSection(&m_tReadersWriterBlade);
    _VERIFYE (CloseHandle(m_hSingleWriterStop));
}

///////////////////////////////////////////////////////////////////////////
// Operations

bool CMRSWLockRoot::AcquireReadLock(DWORD nMillisecondTimeout
                                    /*= MRSW_STORED_TIMEOUT*/)
{
    // Determine this thread's current read lock count
    const DWORD nThreadId = GetCurrentThreadId();
    CPrivateLock cLock(*this);  // lock for map access
    t_ARM::iterator cCurThreadPos = m_cActiveReaders.find(nThreadId);
    if (cCurThreadPos != m_cActiveReaders.end())
    {
        // Reader lock is being reacquired by this thread;
        // bump acquisition count and return without further ado
        ++(*cCurThreadPos).second;
        return true;
    }

    // This is an initial read lock acquisition for this thread; we must
    // acquire the blade so that all writers remain excluded after blade
    // acquisition and until the read lock is released
    cLock.Unlock();
    if (! AcquireBlade(nMillisecondTimeout))
        return false;

    // The thread is getting ready to run with lock; enter its ID into the
    // active reader map (and do *not* release the blade before this entry
    // has been made; otherwise, a writer could block on the writer stop
    // event forever)
    cLock.Lock();  // lock for map access
    try
    {
        m_cActiveReaders[nThreadId] = 1;
    }
    catch (...)
    {
        ReleaseBlade();
        throw;
    }
    cLock.Unlock();  // map access complete

    // While holding the blade, prevent writers from achieving lock by
    // resetting the writer stop event
    _VERIFYE (ResetEvent(m_hSingleWriterStop));

    // Release blade now for acquisition by other new readers or writers
    ReleaseBlade();
    return true;
}

/*
   Note: the time-out value will be converted to INFINITE automatically for
         any calling thread that already holds a read lock
*/
bool CMRSWLockRoot::AcquireWriteLock(DWORD nMillisecondTimeout
                                     /*= MRSW_STORED_TIMEOUT*/)
{
    // Determine whether this is a read-to-write conversion
    const DWORD nThreadId = GetCurrentThreadId();
    CPrivateLock cLock(*this);  // lock for map/set access and container
                                // crunch
    t_ARM::iterator cCurThreadPos = m_cActiveReaders.find(nThreadId);
    const bool bLockConvert = cCurThreadPos != m_cActiveReaders.end();

    // We are about to become a waiting writer; set the recursion count in
    // the map to zero, thus indicating that this reader thread is waiting
    // to achieve write lock
    unsigned short nSavedRecursionCount;
    if (bLockConvert)
    {
        // Note: we do not simply remove the entry from the map in order to
        // avoid the (rather complicated) issue of not being able to
        // reacquire memory for it before returning
        nSavedRecursionCount = (*cCurThreadPos).second;
        (*cCurThreadPos).second = 0;

        // The preceding operation might have converted all active readers
        // to waiting writers, and thus the writer stop event might have to
        // be opened for this or another currently blocked writer thread
        UpdateEvent(true);
    }

    // Adjust/retrieve time-out value
    if (m_bWaitableLock)
        if (bLockConvert)
            nMillisecondTimeout = INFINITE;
        else if (nMillisecondTimeout == MRSW_STORED_TIMEOUT)
            nMillisecondTimeout = m_nStoredTimeout;
    cLock.Unlock();

    // Now attempt to acquire the blade
    bool bAcquisitionException;
    do
    {
        try
        {
            if (! AcquireBlade(nMillisecondTimeout))
            {
                _ASSERTE (! bLockConvert);
                return false;
            }
            bAcquisitionException = false;
        }
        catch (const CCSBStdException& rcException)
        {
            _UNUSED (rcException);
            if (! bLockConvert)
                throw;
            bAcquisitionException = true;

            // A converting thread cannot be allowed to escape at this
            // stage of the lock acquisition process, since the above
            // ContainerCrunch might have released another writer thread;
            // thus, returning now could lead to simultaneous read/write
            // activity
            ATLTRACE(_T("MRSW read to write lock conversion failed "
                        "unexpectedly at mutex with error %lx; "
                        "retrying...\n"), rcException.ErrorCode());
        }
    } while (bAcquisitionException);

    // Now wait for the stop event to become signaled; this will occur (if
    // it has not already) when all currently active readers have released
    // their locks or attempt to convert to write locks
    DWORD nWaitResult;
    while (true)
    {
        nWaitResult = WaitForSingleObject(m_hSingleWriterStop,
                                          nMillisecondTimeout);
        _ASSERTE (nWaitResult != WAIT_TIMEOUT || ! bLockConvert);
        if (bLockConvert && nWaitResult == WAIT_FAILED)
        {
            ATLTRACE(_T("MRSW read to write lock conversion failed "
                        "unexpectedly at event with error %lx; "
                        "retrying...\n"), GetLastError());
            continue;
        }
        if (nWaitResult == WAIT_OBJECT_0)
            break;

        // We have been unsuccessful at acquiring the event and give up
        ReleaseBlade();
        if (nWaitResult == WAIT_TIMEOUT)
            return false;
        _ASSERTE (nWaitResult == WAIT_FAILED);
        CCSBStdException::CSBThrowException(static_cast <HRESULT>
            (HRESULT_FROM_WIN32(GetLastError())));
    }

    // We have successfully passed the event and are now the only active
    // thread holding any lock on this object while holding the blade

    // Clean up before returning
    if (bLockConvert)
    {
        (*cCurThreadPos).second = nSavedRecursionCount;
        UpdateEvent(false);
    }
    return true;
}

void CMRSWLockRoot::ReleaseReadLock()
{
    // Locate thread entry in map
    CPrivateLock cLock(*this);  // lock for map access and container crunch
    t_ARM::iterator cCurThreadPos =
        m_cActiveReaders.find(GetCurrentThreadId());
    _ASSERTE (cCurThreadPos != m_cActiveReaders.end());

    // If this is a recursive lock, decrement count and get out
    if ((*cCurThreadPos).second > 1)
    {
        --(*cCurThreadPos).second;
        return;
    }

    // Otherwise, this thread's entry must be removed from the active
    // reader map and the writer release condition must be evaluated
    m_cActiveReaders.erase(cCurThreadPos);
    UpdateEvent(true);
}

void CMRSWLockRoot::ReleaseWriteLock()
{
    // Releasing the blade will allow queued writers or readers to achieve
    // lock
    ReleaseBlade();
}

void CMRSWLockRoot::ResetStoredTimeout(DWORD nMilliseconds)
{
    CPrivateLock cLock(*this);
    m_nStoredTimeout = nMilliseconds;
}

///////////////////////////////////////////////////////////////////////////
// Implementation

/*
   Acquire internal short-term lock. Not in header because of private
   protection status.
*/
void CMRSWLockRoot::Lock()
{
    EnterCriticalSection(&m_tInternalLock);
}

/*
   Release internal short-term lock. Not in header because of private
   protection status.
*/
void CMRSWLockRoot::Unlock()
{
    LeaveCriticalSection(&m_tInternalLock);
}

bool CMRSWLockRoot::AcquireBlade(DWORD nMillisecondTimeout)
{
    _ASSERTE (m_bWaitableLock || nMillisecondTimeout ==
              MRSW_STORED_TIMEOUT);
    if (! m_bWaitableLock)
    {
        EnterCriticalSection(&m_tReadersWriterBlade);
        return true;
    }

    // Find time-out value to use
    if (m_bWaitableLock && nMillisecondTimeout == MRSW_STORED_TIMEOUT)
    {
        CPrivateLock cLock(*this);  // lock for non-const obj state access
        nMillisecondTimeout = m_nStoredTimeout;
    }

    const DWORD nWaitResult = WaitForSingleObject(m_hReadersWriterBlade,
                                                  nMillisecondTimeout);
    if (nWaitResult == WAIT_FAILED)
        CCSBStdException::CSBThrowException(static_cast <HRESULT>
            (HRESULT_FROM_WIN32(GetLastError())));
    if (nWaitResult == WAIT_TIMEOUT)
        return false;
    if (nWaitResult == WAIT_ABANDONED)
        ATLTRACE(_T("MRSWLock blade acquired through abandoned "
                    "mutex.\n"));
    return true;
}

void CMRSWLockRoot::ReleaseBlade()
{
    if (m_bWaitableLock)
        _VERIFYE (ReleaseMutex(m_hReadersWriterBlade));
    else
        LeaveCriticalSection(&m_tReadersWriterBlade);
}

/*
   The method compares the currently active readers against the set of
   writers now waiting to run. If every active reader is also a writer
   waiting to run and the given flag instructs to look for this condition,
   we signal the writer stop event in order to allow that single writer
   (hence the event's name) that made it to the event to proceed now. If
   the condition is false and we were instructed to look for falsehood, we
   reset the event.

   Note: the internal lock must have been acquired before calling this
         method. This is natural, since it is usually called after an
         adjustment to the reader map, for which a lock must have been
         acquired already.
*/
void CMRSWLockRoot::UpdateEvent(bool bTowardsSignalledState)
{
    for (t_ARM::iterator cArmIter = m_cActiveReaders.begin();
        cArmIter != m_cActiveReaders.end(); ++cArmIter)
        if ((*cArmIter).second)
        {
            // A currently executing active reader exists
            if (! bTowardsSignalledState)
                _VERIFYE (ResetEvent(m_hSingleWriterStop));
            return;
        }

    // There are no active readers, or all active readers are waiting to
    // acquire a write lock
    if (bTowardsSignalledState)
        _VERIFYE (SetEvent(m_hSingleWriterStop));
}

///////////////////////////////////////////////////////////////////////////
// CMRSWLockRoot::CPrivateLock and CPrivateLockRouter
// These private classes prevent non-CMRSWLock code from gaining access to
// the private Lock and Unlock methods while providing auto-balanced access
// for CMRSWLock code to these methods

CMRSWLockRoot::CPrivateLockRouter::
CPrivateLockRouter(CMRSWLockRoot& rcTarget) :
    m_rcTarget (rcTarget)
{
}

void CMRSWLockRoot::CPrivateLockRouter::Lock()
{
    m_rcTarget.Lock();
}

void CMRSWLockRoot::CPrivateLockRouter::Unlock()
{
    m_rcTarget.Unlock();
}

CMRSWLockRoot::CPrivateLock::CPrivateLock(CMRSWLockRoot& rcTarget) :
    CCSBLock<CPrivateLockRouter> (m_cLockRouter, false),
    m_cLockRouter (rcTarget)
{
    // Now that lock router is properly initialized, have CCSBLock clamp
    // down on it
    Lock();
}

///////////////////////////////////////////////////////////////////////////
// CMRSWLock, CMRSWReadLock, CMRSWWriteLock: these classes form the bottom
// of the inheritance diamond and allow for the implicit conversion of a
// compound lock to a function-specific lock offering one simple Lock and
// Unlock method each. These intermediate types are therefore suitable for
// use with auto-balance lock templates.

void CMRSWReadLock::Lock()
{
    if (! AcquireReadLock())
        CCSBStdException::
            CSBThrowException(CSBSUPPORTLIB_E_MRSWLOCK_TIMEOUT);
}

void CMRSWReadLock::Unlock()
{
    ReleaseReadLock();
}

void CMRSWWriteLock::Lock()
{
    if (! AcquireWriteLock())
        CCSBStdException::
            CSBThrowException(CSBSUPPORTLIB_E_MRSWLOCK_TIMEOUT);
}

void CMRSWWriteLock::Unlock()
{
    ReleaseWriteLock();
}

CMRSWLock::CMRSWLock(bool bWaitableLock /*= false*/,
                     DWORD nInitialTimeout /*= INFINITE*/) :
    CMRSWLockRoot (bWaitableLock, nInitialTimeout)
{
}

The diamond inheritance hierarchy from CMRSWLock through CMRSWReadLock and CMRSWWriteLock down to the virtual base CMRSWLockRoot is intended to allow easy use of a CMRSWLock instance with the simple lock template, without the use of an adapter. You can therefore acquire read and write locks as follows:

CMRSWLock m_cMRSWLock;
CCSBLock<CMRSWReadLock> cReadLock(m_cMRSWLock);
CCSBLock<CMRSWWriteLock> cWriteLock(m_cMRSWLock);  // promotion
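
The lock also supports explicit, time-out-aware acquisition outside the auto-balance template. Here is a minimal sketch based on the interface above, assuming a waitable lock; the 5000-millisecond value is illustrative:

CMRSWLock cLock(true, 5000);   // waitable blade, 5000 ms stored time-out
if (cLock.AcquireWriteLock())  // default argument uses the stored time-out
{
    // ... modify the state guarded by the lock ...
    cLock.ReleaseWriteLock();
}
// A false return indicates that the acquisition timed out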

The template TMRSWLock is provided for use in ATL object templates. Instantiating it with CComMultiThreadModel, for example, results in a true MRSW lock; instantiating it with an ATL threading model type that supplies only a fake critical section, such as CComSingleThreadModel, yields a fake lock whose operations compile away.
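
For instance, a class parameterized on its threading model can embed the lock and let the model decide whether any synchronization actually takes place. The following is a hedged sketch of that usage; CMyCache and its members are illustrative and not part of the library:

template <class TThreadModel>
class CMyCache
{
public:
    CMyCache() : m_nValue (0) {}
    DWORD Read()
    {
        // Auto-balanced read lock; resolves to a no-op when TThreadModel
        // supplies only a fake critical section
        CCSBLock<typename t_Lock::t_ReadLock> cLock(m_cLock);
        return m_nValue;
    }
    void Write(DWORD nValue)
    {
        CCSBLock<typename t_Lock::t_WriteLock> cLock(m_cLock);
        m_nValue = nValue;
    }
private:
    typedef TMRSWLock<TThreadModel> t_Lock;
    t_Lock m_cLock;
    DWORD  m_nValue;
};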

Since this lock keeps track of physical thread IDs instead of causality IDs, it cannot handle causalities that reenter with different threads. That is one functionality upgrade that I will leave to you.
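
If you take on that upgrade, one possible starting point is to key the active-reader map on the causality ID that COM+ itself maintains. CoGetCurrentLogicalThreadId is the documented COM API that returns it; the comparison functor and revised typedef below are my own illustrative assumptions, not code from the library:

// std::map needs a strict weak ordering for GUID keys
struct SGuidLess
{
    bool operator()(const GUID& rcLeft, const GUID& rcRight) const
        { return memcmp(&rcLeft, &rcRight, sizeof(GUID)) < 0; }
};
typedef std::map<GUID, unsigned short, SGuidLess> t_ARM;

// In the acquisition and release methods, the physical thread ID lookup
// would then give way to the logical one:
GUID cCausalityId;
const HRESULT hResult = CoGetCurrentLogicalThreadId(&cCausalityId);
if (FAILED(hResult))
    CCSBStdException::CSBThrowException(hResult);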

Table of Contents

Foreword Page xv
Acknowledgments Page xvii
Introduction Page xix
PART I COM+ FUNDAMENTALS
1 Error Handling Page 3
 COM+ Errors and Structured Exception Handling Page 4
 The COM+ Error-Handling Model Page 5
  Result Codes Page 8
  Error Context Page 10
 Visual Basic Environment Considerations Page 12
 Visual C++ Environment Considerations Page 13
 An Error-Model Integration Approach for C++ Page 16
  Result Code Framework Page 16
  Reporting Functions Page 17
  Exception Class Page 20
  Exception-Processing Macros Page 27
  Usage Pattern Page 29
2 Smart Pointers Page 31
 Smart Pointer Advantages Page 32
 Usage Patterns Page 34
 Smart Pointer Comparison Page 39
 Extended Interfaces (#import) Page 41
  UUID Type Binding Page 41
  Exceptions Page 41
  Return Values Page 42
  Syntactic Properties Page 45
 Smart Pointers as Parameters Page 47
 Smart Pointer Pitfalls Page 50
  Release Page 50
  SetErrorInfo Page 50
  Interoperation Leaks Page 52
 Generic Programming Considerations Page 53
3 Strings Page 55
 Character Encoding Page 56
  American National Standards Institute (ANSI) Page 56
  Double-Byte Character Set (DBCS) Page 57
  Unicode Page 57
 Platform Considerations Page 59
 Your Project Setting Page 61
 TCHAR.H Page 63
 String Conversion Macros Page 64
 The OLECHAR Data Type Page 66
 The BSTR Data Type Page 67
 BSTR Alternatives Page 70
 String Templates and Classes Page 72
 BSTR Wrapper Classes Page 76
  Feature Comparison Page 76
  Usage Pattern Page 79
4 Concurrency Page 81
 Elements of Interception Page 83
  Concurrency vs. Reentrancy Page 83
  Interception Implementation Page 83
  The Apartment Page 85
  Managing STA Concurrency Page 89
  The Context Page 91
  The Message Filter Page 98
  Interception Services Page 100
 Context Neutrality Page 103
  Implementation Page 103
  Internal Object References Page 104
  But Is It Fast? Page 106
  FTM vs. TNA Page 107
  It’s the Object’s Choice Page 108
 Concurrency Design Guidelines Page 109
  The Best Concurrency Is No Concurrency Page 109
  Exceptions: The Case of Client Notification Page 111
  Standard Synchronization Settings Page 112
 Concurrency in Local Servers Page 113
  Apartments in Local Servers Page 113
  Local Server Pitfalls Page 114
  Partial Location Transparency Page 115
  Implications Page 116
 Locking Page 116
  Coarse-Grained Locks Page 117
  Fine-Grained Locks Page 130
5 Implementation Environments Page 149
 Object Glue: IDL and the Type Library Page 151
 Visual C++ Page 155
  COM+ Integration Approaches Page 156
  Calling COM+ Objects Page 158
  Implementing COM+ Objects Page 162
  Event Support Page 171
  Class Factories Page 177
  Multi-Dual Inheritance Page 179
  Special Considerations for Larger Projects Page 181
 Visual Basic Page 183
  Calling COM+ Objects Page 183
  Implementing COM+ Objects Page 185
  Event Support Page 193
  Multi-Dual Inheritance Page 195
 Visual J++ Page 196
  Calling COM+ Objects Page 197
  Implementing COM+ Objects Page 202
  Event Support Page 208
  Class Factories Page 216
  Multi-Dual Inheritance Page 217
 Script Page 217
  Calling COM+ Objects Page 219
  Implementing COM+ Objects Page 220
  Event Support Page 226
 Selecting an Implementation Environment Page 228
PART II ARCHITECTURAL PATTERNS AND SOLUTIONS
6 Architectural Patterns and Solutions Reuse Page 235
 Reuse Through Object Orientation Page 236
 Object Orientation in COM+ Page 237
 Hierarchical Reuse Page 239
 The Case for Isolating Interface Implementations Page 240
 COM+ Solutions Page 242
  Containment Page 243
  Aggregation Page 243
 Implementation Inheritance Page 247
 Multiple Implementation Inheritance Page 254
 Enhancing Source Code Reuse with C++ Templates Page 263
  Parameterizing on the Derived Class Type Page 264
  Parameterizing on the Base Class Type Page 269
  Implementing Interfaces that Have Not Yet Been Defined Page 272
7 Streaming and Persistence Page 275
 Lightweight Persistence Page 277
 Persistence Solutions Page 280
  Manual Data Transformation Page 281
  Frameworks or Other Proprietary Solutions Page 284
  Choosing a Portable Format Page 287
 Type Stream Architecture Page 299
  The ITypeStream Interface Page 300
  The CTypeStreamImpl Class Page 303
  Type Stream Shift Operators Page 315
  Encoders and Adapters Page 325
  Type Stream Persistence Interface Page 328
  CTypeStreamOnIStream Page 334
 The C++ IOStream Adapter and Encoder Page 341
 Network Data Representation Page 362
 The NDR Stream Page 370
 Usage Patterns Page 402
8 Marshal-by-Value Page 407
 When and Why to Marshal by Value Page 409
 IMarshal Examined Page 414
 Naïve MBV Implementations Page 417
 Reusable MBV Page 421
 IMarshal Reexamined Page 423
 A Solution Page 425
 Marshaling Visual Basic Objects by Value Page 453
 Fine-Tuning MBV Page 470
 Implications Page 472
9 Reference Cycle Management Page 475
 Resource Management Page 476
 Abandoned Rings Page 483
 Specific vs. Generic Solutions Page 488
 COM+ Objects in Garbage Collection Environments Page 491
  Visual Basic Page 492
  Visual J++ Page 497
 C++ Solution Framework Page 501
 Simplifying the Model with the Universal Delegator Page 512
 Reusing Split Identity from Other Languages Page 515
10 Generic Programming Page 517
 The Power of Generic Programming Page 518
 A Review of STL Page 526
  Containers Page 527
  Iterators Page 529
  Generic Algorithms Page 531
 Tension with Component Technology Page 534
 CSB Architecture Page 538
 CSB Guide Page 547
  Philosophy Page 547
  Collection Wrapper Page 549
  Predefined Traits Page 556
  Predefined Interface Method Implementations Page 566
  Persistence Support Page 568
  STL Adapters Page 572
  Support Structures Page 582
  Selecting Functionality Page 585
  Project Configuration Page 589
  Compiler and STL Support Page 590
 CSB Internal Type Safety Page 591
 Usage Patterns Page 593
PART III COM+ IN THE ENTERPRISE
11 Four-Tier Enterprise Application Architecture Page 619
 COM+ Design Pattern Concepts Page 619
  N-Tier Application Architecture Page 621
  Business Objects Page 623
 The Design Pattern Architecture Page 626
  Presentation Services Layer Page 628
  Object Services Layer Page 628
  Transaction Services Layer Page 629
  Data Services Layer Page 630
  Simplifying Object Persistence Page 630
 A COM-Based Hierarchical Object Model Page 631
 From Rows and Columns to Collections and Objects Page 635
  IPersistObjectStream Interface Page 638
  IPersistObjectStream::CreateChildInstance Page 638
  IPersistObjectStream::Load Page 639
  IPersistObjectStream::Save Page 639
  IPersistObjectStream::SaveCompleted Page 640
  IPersistObjectStream::Status Page 640
  IObjectStream Interface Page 641
  IObjectStream::Contents Page 641
  IObjectStream::Load Page 642
  IObjectStream::PropertyExists Page 642
  IObjectStream::PropertyIsNull Page 643
  IObjectStream::ReadProperty Page 643
  IObjectStream::Save Page 644
  IObjectStream::WriteCollection Page 644
  IObjectStream::WriteObject Page 645
  IObjectStream::WriteProperty Page 646
  IPersistTransStream Interface Page 646
  IPersistTransStream::CreateNestedTrans Page 646
  IPersistTransStream::ExecDelete Page 647
  IPersistTransStream::ExecInsert Page 648
  IPersistTransStream::ExecUpdate Page 648
  IPersistTransStream::Save Page 649
  ITransStream Interface Page 650
  ITransStream::Clear Page 650
  ITransStream::Contents Page 650
  ITransStream::Parent Page 651
  ITransStream::PropertyExists Page 651
  ITransStream::PropertyIsNull Page 652
  ITransStream::ReadProperty Page 652
  ITransStream::Save Page 653
  ITransStream::WriteCollection Page 653
  ITransStream::WriteObject Page 654
  ITransStream::WriteProperty Page 654
  Using the Design Pattern Interfaces Page 655
  Retrieving a Complex Object Page 656
  Fetching Child Objects on Demand Page 667
  Creating a Directory Object Page 668
 Object Persistence and COM+ Transactions Page 673
  Saving Changes to an Existing Object Page 675
  Saving a New Object Page 687
  Deleting an Object Page 691
 Using the Design Pattern to Implement a Web-Based Application Page 695
12 SOAP Page 697
 Why Use SOAP? Page 699
 The Details of SOAP Page 702
  SOAP Request Page 703
  SOAP Response Page 706
  SOAP Faults Page 707
 SOAP Code Sample Page 709
 SOAP Toolkits Page 716
  Making Your SOAP Server Known Page 716
  Describing Your SOAP Server Page 718
  Calling the SOAP Methods Page 719
 Designing SOAP Solutions Page 727
  Interoperability Page 729
  Performance Page 730
  Support for Existing Components Page 731
  Security Page 731
  Drawbacks Page 731
  Benefits Page 731
  Toolkits Page 732
13 The MTS Revolution Page 733
 A Brief History of Scalability Page 734
 The Single Concurrent Client Model Page 743
 Designing for Scalability Page 751
  Refining a Web Example Page 751
  In-Memory Alternatives Page 759
  Thread Control Page 767
 Transactions Page 776
 Statelessness Page 789
 Project Modeling for the Internet Page 793
14 Data Access Page 801
 The Resource Dispenser Page 803
 Data Access Technology Survey Page 808
 A Crash Course in OLE DB Page 813
  Transparent OLE DB Services Page 816
  OLE DB Provider Service Components Page 819
  OLE DB Objects Page 821
  Transaction Support Page 825
  Cursors Page 829
  Rowset Processing Page 833
 ActiveX Data Objects Briefing Page 834
 Resource Pooling Page 842
 C++ Data Access Page 847
 Provider Specifics: Fast Loading Page 859
INDEX Page 867
