Chapter 11. .NET Design Guidelines for Components Used by COM Clients

In This Chapter

• Naming Guidelines

• Usage Guidelines

• Reporting Errors

• Exposing Enumerators to COM

• Versioning

• Deployment

• Testing Your Component from COM

This chapter begins Part IV, “Designing Great .NET Components for COM Clients,” in which we look at the process of writing .NET components that might be exposed to COM clients. Two kinds of .NET components fall into this category:

• Components designed primarily with .NET clients in mind, for which ease of use from COM is an added bonus

• Components specifically written for COM clients, because writing a COM component in a .NET language is often easier than writing it in a non-.NET language (especially unmanaged C++)

This chapter focuses mostly on the first type of .NET component, covering general design decisions that affect both .NET and COM users of your class library. Chapters 12 and 13 focus on customizations that affect only COM users. Chapter 14, “Implementing COM Interfaces for Binary Compatibility,” focuses on the second type of .NET component, describing how to achieve binary compatibility with existing COM clients by implementing COM interfaces.

FAQ: How do I write a COM component in a .NET language such as C# or Visual Basic .NET?

Any .NET component is automatically a COM component, as demonstrated in Part III, “Using .NET Components in COM Applications.” The main difference between .NET components used as COM components and traditional COM components is that .NET DLLs do not export four entry points that traditional in-process COM servers export:

• DllCanUnloadNow

• DllGetClassObject

• DllRegisterServer

• DllUnregisterServer

Instead, the CLR execution engine (MSCOREE.DLL) exports these and is registered as the in-process server, acting on the assembly’s behalf. For this reason, some people don’t consider .NET components to be true COM components, just components that can act like COM components. Either way, the fact remains that you don’t need to do anything special to write a COM component in any .NET language; no additional code or custom attributes are necessary to expose .NET components to the world of COM.

Part III demonstrated that exposing a .NET component to COM is easy to do, but designing it to be “COM-friendly” is much harder. The .NET Framework gives a lot of flexibility to API designers, and that’s great, but it also makes it easier for developers to make bad choices. To help steer software developers and architects in the right direction, the .NET Framework SDK documentation contains a section entitled Design Guidelines for Class Library Developers, found under the Reference section in the .NET Framework topic. These guidelines promote (among other things) consistency, predictability, and ease-of-use from any .NET language.

This chapter focuses on these .NET design guidelines in the context of usability from COM, to promote ease-of-use from any .NET or COM language. Whereas some of the documented .NET guidelines work well for COM clients, and some don’t affect COM clients, others greatly reduce your COM usability. The goal of this chapter’s guidelines is to help you improve your class library’s usability from COM without sacrificing any ease of use for .NET clients.

Tip

Occasionally, authors who aren’t concerned about designing COM-friendly .NET components (or who want to discourage the use of their class libraries from COM clients) make their entire assembly COM-invisible. Doing this is discussed in Chapter 12, “Customizing COM’s View of .NET Components.” However, as you’ve seen in Part III, COM-invisible APIs can cause a lot of confusion for your users. It’s often an unnecessary limitation to impose on your users, making otherwise usable APIs unusable from COM without workarounds.

Naming Guidelines

The .NET Framework documentation gives extensive naming guidelines in order to make your APIs predictable, easy to search, and easy to understand with minimal documentation. These guidelines cover casing, word forms (nouns for types versus verbs for members), and word choice (for example, avoiding abbreviations). For the most part, COM clients aren’t affected by your name decisions any differently than .NET clients, but this section discusses a few special considerations to keep in mind.

Any .NET naming guidelines in the SDK documentation that are not given attention in this section either coincide with making a .NET type COM-friendly or don’t have any effect. For example, the naming guideline to not give enumeration members a common prefix or an Enum suffix works well for COM clients because they see enumeration members almost identically—FileAccess.Read (in C# and VB .NET) or FileAccess::Read (in C++) versus FileAccess_Read in the exported type library.

Names to Avoid

The .NET guidelines warn against using identifiers that conflict with common keywords in any .NET language, such as the VB .NET keywords AddHandler or For. (Of course, this is hard to do as the set of .NET languages continues to grow.) When considering COM clients, the set of words to avoid expands slightly. For example, it would be unwise to give a public member the name of a COM data type such as BSTR, VARIANT, or IUnknown since it would wreak havoc with unmanaged C++ COM clients.

Fortunately, it’s unlikely that you’d want to use any “COM keywords” as names of .NET APIs because most of them don’t make sense outside the context of COM. Remember that for any names that should be avoided, any variations differing in case should be avoided, too. This is for the benefit of case-insensitive languages like VB .NET or the case-insensitive type libraries that are exposed to COM clients.

Digging Deeper

Most languages have a mechanism for using types and members whose names conflict with keywords. Therefore, defining types and members with keyword names doesn’t necessarily block the use of your APIs from certain languages. For example, Visual Basic enables you to surround a keyword-named identifier with square brackets, C# enables you to precede the identifier with the @ symbol, and Visual C++ lets you rename type library identifiers with the #import statement’s rename directive. Still, this doesn’t avoid the fact that such names can end up being confusing and cumbersome for your users!
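As a minimal sketch of the C# escape mechanism mentioned above, the following hypothetical class defines members whose names are C# keywords. The class and member names (Scheduler, lock, event) are assumptions chosen purely for illustration.

```csharp
using System;

// Hypothetical class whose member names collide with C# keywords.
public class Scheduler
{
    // "lock" is a C# keyword, so the declaration needs the @ prefix.
    // The @ is not part of the name; other languages see plain "lock".
    public string @lock(string resource)
    {
        return "locked " + resource;
    }

    // "event" is also a keyword; the parameter "name" is not.
    public string @event(string name)
    {
        return "raised " + name;
    }
}

public static class SchedulerDemo
{
    public static string Run()
    {
        Scheduler s = new Scheduler();
        // C# callers must use the @ prefix as well.
        return s.@lock("printer") + ", " + s.@event("Started");
    }
}
```

The code compiles, but every C# caller pays the cost of the escape syntax, which is the usability burden the sidebar warns about.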

You should also avoid creating member names that conflict with the names of the IUnknown or IDispatch interfaces’ methods. These are:

• QueryInterface

• AddRef

• Release

• GetTypeInfoCount

• GetTypeInfo

• GetIDsOfNames

• Invoke

Because all .NET classes and interfaces have IUnknown’s methods when exposed to COM and most classes and interfaces have IDispatch’s methods, user-defined members with these names are exposed to COM with different names to avoid name collisions. Although it’s unlikely that you’d name a method GetIDsOfNames, it’s not far-fetched to want to create a method called Release or Invoke. It is okay, however, to define member names that contain these words, such as ReleaseHandler or InvokeMember.

As explained in Chapter 9, “An In-Depth Look at Exported Type Libraries,” the type library exporter appends an underscore and a unique number to disambiguate your methods with the same name. So the following Visual Basic .NET interface:

Public Interface ISlingshot
  Sub Load()
  Sub Release()
End Interface

is exposed to COM as follows (in IDL syntax):

[
  ...
]
interface ISlingshot : IDispatch {
  [id(0x60020000)]
  HRESULT Load();
  [id(0x60020001), custom(0F21F359-AB84-41E8-9A78-36D110E6D2F9, "Release")]
  HRESULT Release_2();
};

Requiring COM clients to call a Release_2 method, which likely would not be mentioned in ISlingshot’s .NET-focused documentation, does not provide a great user experience.

Something else to avoid, which may not be obvious, is a property with a given name (let’s say Prop) if a method already exists on the current type or any base types with the name GetProp or SetProp. Why? Because, by default, the Visual C++ type library importer (invoked by the #import statement) creates accessor methods for properties prefixed with “Get” and “Set.” Therefore, an interface with such a property/method pair causes a name collision (albeit for unmanaged Visual C++ clients only), forcing the client to use additional directives with the #import statement to avoid the duplicate names.

Caution

Any .NET class that exposes an auto-dual class interface should avoid readable properties called Type or HashCode. The accessor methods generated for COM clients by Visual C++ (GetType or GetHashCode) would conflict with the methods with the same name from System.Object that are already defined on every class interface.

Namespaces and Assembly Names

A .NET namespace doesn’t have significance to COM clients, because exported type names don’t include a namespace unless name conflicts occur, as explained in Chapter 9. What is significant is the library name, which is often used like a namespace in Visual Basic 6 code or unmanaged Visual C++ code. Because the type library exporter creates the library name from the assembly’s simple name, it’s important to choose a sensible assembly name.

So, how do you choose an assembly’s simple name? The C#, VB .NET, and C++ compilers set this name to the name of the DLL (minus the file extension). For multi-file assemblies, the compilers choose the name of the DLL containing the manifest. The convention for choosing a DLL name (a project name in Visual Studio .NET) is to match the namespace of the types contained within. If the assembly contains multiple namespaces, the name chosen is usually the longest common prefix of the namespaces. Therefore, although .NET namespaces don’t affect COM clients directly, they usually affect them indirectly by influencing the choice of assembly name.

The recommended assembly name/namespace has the form of:

CompanyName.TechnologyName

Such as, perhaps:

SamsPublishing.Books.DotNet.DotNetAndCom.Chapter11

Be aware, however, that dots (periods) are illegal in a library name, so the type library exporter converts them to underscores when constructing a library name from an assembly name. The previous assembly name would result in a library name that looks like:

SamsPublishing_Books_DotNet_DotNetAndCom_Chapter11

Therefore, you might consider choosing a short assembly name without periods for the benefit of COM clients but using longer period-delimited namespaces for the benefit of .NET clients.

Case Insensitivity

Chapter 10, “Advanced Topics for Using .NET Components,” demonstrated the problems that arise when differently cased identifiers in an assembly’s public APIs are exported to a case-insensitive type library. Consider this section a reminder to avoid using the same identifier with a different case as much as possible.

If you follow the .NET guidelines of always using Pascal casing (uppercase first letter, as in “PascalCasing”) for public types and members, then you never have to worry about case conflicts among the type and member names exposed to COM. Parameter names can cause problems, however, because the guidelines recommend using camel casing (lowercase first letter, as in “camelCasing”). Therefore, it’s somewhat likely that any large assembly contains the same identifier as a lowercase parameter name somewhere and a capital type or member name somewhere else. Listing 11.1 demonstrates this with two C# interfaces that follow the .NET naming guidelines, but are exported with undesirable behavior due to the conflicting IsVisible and isVisible identifiers.

Listing 11.1. .NET Interfaces with Conflicting Capitalization of the Same Identifier
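A minimal sketch of two interfaces matching this description might look like the following. Only the names IControl, ChangeState, isVisible, ITreeNode, and IsVisible come from the surrounding text; the return types and the property’s accessors are assumptions.

```csharp
// Follows the .NET guidelines: Pascal casing for members,
// camel casing for parameters.
public interface IControl
{
    // The parameter name "isVisible" is camel-cased.
    void ChangeState(bool isVisible);
}

public interface ITreeNode
{
    // The property name "IsVisible" is Pascal-cased and differs
    // from the parameter above only by the case of its first letter.
    bool IsVisible { get; set; }
}
```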

Notice that the name of the property changed from ITreeNode.IsVisible on the .NET side to ITreeNode.isVisible on the COM side since IControl.ChangeState’s isVisible parameter was emitted first by the C# compiler. Once isVisible is placed in the type library’s case-insensitive table, the same entry is used for both the isVisible parameter and the IsVisible property. This can be pretty confusing for COM users looking at documentation for the .NET types. Had ITreeNode been emitted before IControl, the property’s name would be preserved and ChangeState’s parameter would be renamed to IsVisible. This would be more desirable than the parameter name changing since COM clients likely wouldn’t even notice the parameter name change. For this trivial example, switching the order of interface definitions in the source code would do just this.

Caution

Besides parameters of public members, beware of private fields of value types! Since these are exposed to COM just like public fields yet are typically named with camel casing, the effects of these names are just as dangerous as the effects of parameter names!

Controlling the case of exported identifiers by re-arranging source code is not practical or even possible for large projects. When compiling multiple source files, the order in which they are processed can be unpredictable, even changing based on the fully qualified path names of the source files! The result is that the simple action of re-compiling your assembly can potentially change COM’s view of your APIs, causing compilation errors (for C++ clients; not case-insensitive VB6 clients) if a COM client is recompiled with your new exported type library! Adding new classes or members to your assembly increases the chances that the case of your exported APIs will change! To prevent making inadvertent API changes, you should create a “names file” and use it with TLBEXP.EXE’s /names option. This is described in Appendix B, “SDK Tools Reference.”

Caution

If you’re developing an application with managed and unmanaged pieces, you can encounter a situation in which the cases of exported types or members oscillate regularly and cause unmanaged C++ compilation errors! You should consider using a names file with the type library exporter, but using the rename directive with the unmanaged C++ #import statement can hide these undesirable effects, as discussed in Chapter 10.

Digging Deeper

It should be obvious that type and member names changing case can break COM clients on re-compilation. However, you can rest assured that normal COM clients can’t be broken by such name changes when they aren’t recompiled.

V-table binding doesn’t rely on names; just virtual function table slots. When COM clients late bind to .NET components using the IDispatch interface, the names passed into Invoke are case-insensitive, so a case change can’t matter. When COM clients late bind to .NET components using the IDispatchEx interface (which is possible for .NET classes implementing IReflect or IExpando), its InvokeEx method supports case-sensitive member invocation. However, the managed implementation should be based on .NET names, not the names that are exported in a type library. For more about the IDispatchEx interface, refer to Chapter 14.

Another course of action would be to avoid the name conflicts in the first place by renaming conflicting types, members, or parameters. Don’t switch parameter names to use Pascal casing or type/member names to use camel casing because doing so would look non-standard to .NET clients. On the other hand, naming an IsVisible property as something like Visible instead can prevent these problems without impacting .NET clients.

Tip

The problems caused by non-deterministic casing of exported identifiers are serious enough to warrant writing a little utility that scans your assemblies for case conflicts, and either generates a names file to use when exporting a type library or notifies you so you can change the conflicting names. Such a program could easily be written with the help of reflection. Or, using the type library exporter APIs described in Chapter 22, “Using APIs Instead of SDK Tools,” you could create a smart exporter that gives the casing of type and member names priority over the casing of parameter names.

If you end up using a names file, be sure to ship the exported type library with your application so your users don’t attempt to export one that does not have the desired casing!
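The scanning utility suggested in this tip could be sketched with reflection roughly as follows. This is an illustration under stated assumptions, not a complete tool. The class and method names are hypothetical, the code uses generic collections from later versions of the .NET Framework, it treats an ordinal case-insensitive comparison as a stand-in for the type library’s name table, and it ignores the private fields of value types mentioned in the earlier caution.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical utility: reports identifiers that differ only by case.
public static class CaseConflictScanner
{
    // Groups identifiers case-insensitively and returns every group
    // that contains more than one distinct spelling.
    public static List<List<string>> FindConflicts(IEnumerable<string> identifiers)
    {
        var groups = new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);
        foreach (string id in identifiers)
        {
            List<string> spellings;
            if (!groups.TryGetValue(id, out spellings))
            {
                spellings = new List<string>();
                groups.Add(id, spellings);
            }
            if (!spellings.Contains(id)) spellings.Add(id);
        }

        var conflicts = new List<List<string>>();
        foreach (List<string> spellings in groups.Values)
        {
            if (spellings.Count > 1) conflicts.Add(spellings);
        }
        return conflicts;
    }

    // Yields the names of public types, their public members,
    // and the parameters of their methods and constructors.
    public static IEnumerable<string> CollectIdentifiers(Assembly assembly)
    {
        foreach (Type type in assembly.GetExportedTypes())
        {
            yield return type.Name;
            foreach (MemberInfo member in type.GetMembers())
            {
                yield return member.Name;
                MethodBase method = member as MethodBase;
                if (method == null) continue;
                foreach (ParameterInfo parameter in method.GetParameters())
                {
                    if (parameter.Name != null) yield return parameter.Name;
                }
            }
        }
    }

    public static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("Usage: CaseConflictScanner <assembly path>");
            return;
        }
        Assembly assembly = Assembly.LoadFrom(args[0]);
        foreach (List<string> spellings in FindConflicts(CollectIdentifiers(assembly)))
        {
            Console.WriteLine(string.Join(" / ", spellings.ToArray()));
        }
    }
}
```

Each reported group is a candidate for renaming or for an entry in a names file.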

Keep in mind that these case-sensitivity problems are not specific to COM Interoperability. Standard C++ COM components can run into the same problems when creating a type library with the same identifiers in multiple cases.

Usage Guidelines

Whereas the previous section focused on the names used in your .NET class libraries, this section focuses on the usage of types and members, regardless of their names. For example, static members (Shared in VB .NET) should either be avoided or accompanied by instance members that accomplish the same tasks because static members aren’t naturally exposed to COM.
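As a minimal sketch of pairing a static member with an instance member, consider the following hypothetical class. The class and method names are assumptions made for illustration.

```csharp
// Hypothetical class: the static method serves .NET clients,
// and the instance method gives COM clients a way to reach the same logic.
public class TemperatureConverter
{
    // Convenient for .NET clients, but static members are not
    // naturally exposed to COM.
    public static double ToFahrenheit(double celsius)
    {
        return celsius * 9.0 / 5.0 + 32.0;
    }

    // Instance counterpart that forwards to the static method.
    public double ConvertToFahrenheit(double celsius)
    {
        return ToFahrenheit(celsius);
    }
}
```

Because the class declares no constructors, the compiler supplies a public default constructor, so a COM client can create an instance and call the instance method.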

First we’ll look at some tradeoffs involved in choosing when to use classes, when to use interfaces, when to use custom attributes, and so on. Then we’ll drill down into specific concerns that arise when using overloaded methods, parameterized constructors, and enumerations.

Interfaces Versus Classes

As you know, COM is based on interfaces. All APIs are exposed via interfaces, and all parameters are either primitive types, structures, or interfaces; never classes. (The classes you can define and use as parameters in Visual Basic 6 are really hiding an interface/coclass pair, so you end up with the same design considerations as defining an interface.) With .NET, however, you can choose how much you want to involve interfaces in your public APIs.

The .NET guidelines recommend using classes instead of interfaces. The motivation behind this recommendation is that APIs that use class types heavily are generally considered to be easier to use than APIs that use interface types heavily. Also, an abstract class (MustInherit in VB .NET) is more flexible than an interface from a versioning perspective. A non-abstract member could be added to an abstract class without breaking existing derived classes. Adding a member to an interface, however, would break any existing classes implementing the interface because they would no longer implement all the members.

Implementation inheritance is touted as an easy way to get a stock implementation that can be customized as you see fit, but this is often difficult without intimate knowledge about the class’s implementation. In COM, the same pattern can be achieved with interface implementation plus delegation. A COM class can implement a large interface and selectively implement methods. For methods whose implementation the class wants to “inherit,” it can simply call methods on another object that already implements the interface.
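The interface-plus-delegation pattern described above can be sketched in C# as follows. All type names here are hypothetical. The wrapper customizes one method and “inherits” the other by forwarding the call to an existing implementation.

```csharp
using System;

public interface ILogger
{
    string Info(string message);
    string Error(string message);
}

// Stock implementation of the whole interface.
public class PlainLogger : ILogger
{
    public string Info(string message) { return "INFO: " + message; }
    public string Error(string message) { return "ERROR: " + message; }
}

// Implements the same interface without deriving from PlainLogger.
public class TaggedLogger : ILogger
{
    private readonly ILogger inner;

    // A public default constructor keeps the class creatable from COM.
    public TaggedLogger() : this(new PlainLogger()) { }

    public TaggedLogger(ILogger inner) { this.inner = inner; }

    // "Inherited" behavior: delegate to the inner object unchanged.
    public string Info(string message) { return inner.Info(message); }

    // Customized behavior: adjust the message, then delegate.
    public string Error(string message) { return inner.Error("[tagged] " + message); }
}
```

Because TaggedLogger depends only on ILogger, the inner object could be any implementation of the interface, including one supplied by a COM client.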

The heavy use of classes can be seen throughout the .NET Framework APIs. For example:

• Remote objects that need to be marshaled by reference must derive from the System.MarshalByRefObject class.

• Web services must derive from the System.Web.Services.WebService class.

• Windows Forms must derive from the System.Windows.Forms.Form class.

• Web Forms must derive from the System.Web.UI.Page class.

• Serviced components must derive from the System.EnterpriseServices.ServicedComponent class.

In a world without multiple implementation inheritance, however, relying too much on class types can put unintended constraints on clients. For example, it’s impossible to have a Windows Form class that is also a serviced component. Had the requirements instead been “Windows Forms must implement the IForm interface, serviced components must implement the IServicedComponent interface,” and so on, this restriction would not exist because any class can implement an arbitrary number of interfaces. A more realistic limitation can be seen with the System.Windows.Forms.MainMenu class. This class is used as the type of the System.Windows.Forms.Form.Menu property, yet many of MainMenu’s methods necessary for customizing its drawing behavior are not virtual (Overridable in VB .NET). The result is that you can’t plug in your own custom-drawn menu object as a Windows Form’s menu. Had the Menu property been an IMainMenu interface, there would be no roadblock to doing this.

Therefore, even without COM in the picture, you should use interface types rather than class types for parameters, properties, and fields of public or protected APIs in case someone wants to plug in objects that can’t derive from the class you defined. For COM-friendly .NET components, communicating via interface types is always preferred over class types. When interfaces are available, COM clients don’t need to rely on invoking through CLR-generated class interfaces. These class interfaces are either slower to use due to forced late binding or they don’t version well.

Furthermore, a COM object implementing a .NET interface can be passed to a .NET method expecting an interface parameter. From the .NET perspective, every COM object is a Runtime-Callable Wrapper (RCW) that either is the System.__ComObject type or a type that derives from it, and that class derives from System.MarshalByRefObject which derives from System.Object. This means that a COM object can only be passed as a parameter to a .NET method if the signature expects either an interface or one of the classes in the RCW’s inheritance hierarchy (Object, MarshalByRefObject, __ComObject, or the type itself).

The point to remember is that although COM’s role in your applications may be fading away, the role of interfaces is just as important as always. If you define an interface as well as a class that implements the interface, COM clients can make use of the interface when such objects are passed to COM, yet .NET clients can mostly ignore the interface and use the class directly.

Tip

When defining classes, define a corresponding interface for the class to implement and always use the interface type rather than the class type in any method, property, field, or event definitions. This provides maximum flexibility for both .NET and COM users of your APIs.

A great example of such a class/interface pair exists in the System assembly: System.ComponentModel.Component and System.ComponentModel.IComponent. The .NET Framework has well over 100 methods with an IComponent parameter, such as System.Windows.Forms.Design.PictureBoxDesigner.Initialize:

public void Initialize(IComponent component);

At the same time, the .NET Framework has no methods with a Component parameter. Therefore, there’s no limit on the types of classes (even COM classes) that can be passed in as long as they implement the IComponent interface.
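The same pattern applied to your own types might look like the following sketch. The names IShape, Circle, and Canvas are hypothetical. The point is only that the public signature mentions the interface, never the class.

```csharp
using System;

public interface IShape
{
    double Area { get; }
}

// .NET clients can use Circle directly and mostly ignore IShape.
public class Circle : IShape
{
    private double radius;

    public Circle() { }
    public Circle(double radius) { this.radius = radius; }

    public double Radius
    {
        get { return radius; }
        set { radius = value; }
    }

    public double Area
    {
        get { return Math.PI * radius * radius; }
    }
}

public class Canvas
{
    private double totalArea;

    // The parameter type is the interface, so any object that
    // implements IShape can be passed in.
    public void Add(IShape shape)
    {
        totalArea += shape.Area;
    }

    public double TotalArea
    {
        get { return totalArea; }
    }
}
```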

Interfaces Versus Custom Attributes

The .NET guidelines advise against using interfaces as “empty markers.” This means that you shouldn’t define an empty interface to mark a class with a certain characteristic, for example (in C#):

public interface IRestricted {}

public class C : IRestricted
{
...
}

After all, this is what custom attributes are for! In .NET APIs, it’s more natural to define a RestrictedAttribute custom attribute and mark the class with this.

With a class that implements IRestricted, clients determine whether an object is classified as “restricted” by casting the object to the interface and seeing whether or not the cast fails. (Thanks to operators like C#’s as, a failed cast does not have to incur the expense of an InvalidCastException.) With a class marked with a RestrictedAttribute custom attribute, a .NET client can check for the RestrictedAttribute with code like the following (in C#):

public bool IsRestricted(Object o)
{
  object [] attributes = o.GetType().GetCustomAttributes(
    typeof(RestrictedAttribute), true);

  if (attributes.Length > 0)
    return true;
  else
    return false;
}

Custom attribute retrieval requires more code and is slower than a cast, but it’s standard practice in the .NET Framework. Note that true is passed for the GetCustomAttributes inherit parameter because this matches the inherited behavior of an interface, but it only has an effect if the custom attribute is defined to enable inheritance.
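To make the comparison concrete, here is a minimal sketch that defines both kinds of marker and checks for each. The attribute definition, the marked classes, and the helper names are assumptions. The attribute check uses Type.IsDefined, a shorter alternative to the GetCustomAttributes call shown above.

```csharp
using System;

// Marker interface approach.
public interface IRestricted { }

// Custom attribute approach. Inherited = true is what lets
// derived classes pick up the marker.
[AttributeUsage(AttributeTargets.Class, Inherited = true)]
public sealed class RestrictedAttribute : Attribute { }

public class MarkedByInterface : IRestricted { }

[Restricted]
public class MarkedByAttribute { }

// Carries no marker of its own.
public class DerivedFromMarked : MarkedByAttribute { }

public static class RestrictionChecks
{
    // A type test; no exception is thrown when the test fails.
    public static bool IsRestrictedByInterface(object o)
    {
        return o is IRestricted;
    }

    // Passing true for the inherit parameter searches base classes too.
    public static bool IsRestrictedByAttribute(object o)
    {
        return o.GetType().IsDefined(typeof(RestrictedAttribute), true);
    }
}
```

With these definitions, an instance of DerivedFromMarked is reported as restricted by the attribute check, mirroring the way a derived class automatically implements its base class’s interfaces.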

The choice between marking a class with a custom attribute or having it implement an interface is also present when the attribute or interface must contain members. For example, the KeywordAttribute custom attribute defined in Chapter 1, “Introduction to the .NET Framework,” could instead be defined as the following interface if you only wanted to use it on classes:

public interface IKeywordProvider
{
  string [] GetKeywords();
  int [] GetRelevanceFactors();
}

There are two disadvantages for COM clients when .NET components use custom attributes rather than interfaces:

• Custom attributes don’t appear in exported type libraries, so they can’t be retrieved in a fashion familiar to COM clients. The exporter makes no attempt to bridge .NET custom attributes to IDL custom attributes.

• Inspecting a custom attribute’s member from COM involves late binding unless the custom attribute class implements an interface or exposes an auto-dual class interface. Besides the significantly worse performance compared to a simple QueryInterface call and method calls through a v-table, late binding from unmanaged C++ code is cumbersome.

Therefore, you might want to consider defining and implementing interfaces before going overboard with marking your classes with custom attributes, especially if you need more than a simple marker. There are actually a few examples of empty marker interfaces in the .NET Framework where you would have expected a custom attribute (such as System.Web.IHttpHandler in the System.Web assembly) but ironically they are all COM-invisible. Of course, if you need to mark elements such as parameters or methods with extra information, custom attributes are the only choice.

Tip

To make custom attribute retrieval more COM-friendly for custom attributes with members, have your attribute classes implement interfaces exposing their members. That way, COM clients don’t have to use late binding to retrieve the attribute’s data. So, if you decide to mark a class with a custom attribute rather than having it implement an interface, define the interface with these members anyway so the attribute class can implement it!

For the KeywordAttribute example in Chapter 1, this technique would look like the following:

using System;

public interface IKeyword
{
  string Keyword { get; }
  int Relevance { get; set; }
}

[AttributeUsage(AttributeTargets.All, AllowMultiple=true)]
public class KeywordAttribute : Attribute, IKeyword
{
  // Implementation is identical to Listing 1.2.
  ...
}

It’s a good idea not to end the interface name with “Attribute” since it might confuse users to believe that the interface is itself a custom attribute.

Properties Versus Fields

The .NET guidelines recommend the use of properties rather than public fields. Although properties incur some overhead, they version more gracefully than fields because a property’s implementation can be changed in a compatible way. To change the behavior of a field, you’d need to convert it to a property first but that’s not a compatible change. Although the client source code may look the same for accessing an object’s field and accessing an object’s property, they are two entirely different actions at the MSIL level.

Although fields of .NET classes are exposed as COM properties, this does not mean that changing between properties and fields is a compatible change for COM clients. The change is compatible for a COM client that late binds by name (without saving the property’s DISPID), but the layout of the class interface changes, which affects clients that bind through the v-table. When exposing an auto-dual class interface for a .NET class, any public fields become COM properties at the end of the interface. For this reason, too, public fields should be avoided when exposing an auto-dual class interface, because the order of members in source code doesn’t faithfully represent the order of the interface that COM clients may rely on.

The guidelines also recommend using read-only static fields or constants rather than properties for global constant values. For example (in C#):

public struct Byte
{
  public const byte MinValue = 0;
  public const byte MaxValue = 255;
  ...
}

Keep in mind, however, that such static fields are not directly accessible from COM, whereas properties that expose these values would be.
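One way to serve both audiences is to keep the constants for .NET clients and add instance properties that return the same values. The following is a sketch with hypothetical names (Limits, MaxRetries, DefaultTimeout), not a pattern prescribed by the guidelines.

```csharp
using System;

public class Limits
{
    // Constants and read-only static fields: natural for .NET clients,
    // but not directly accessible from COM.
    public const int MaxRetries = 5;
    public static readonly TimeSpan DefaultTimeout = TimeSpan.FromSeconds(30);

    // Instance properties exposing the same values to COM clients.
    public int MaxRetriesValue
    {
        get { return MaxRetries; }
    }

    public double DefaultTimeoutSeconds
    {
        get { return DefaultTimeout.TotalSeconds; }
    }
}
```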

Using Overloaded Methods

The .NET guidelines recommend the use of overloaded methods to define different methods with the same semantics, or to provide clients with a variable number of arguments. In addition, the guidelines recommend using overloaded methods instead of optional parameters because optional parameters are not in the CLS (so languages like C# don’t have to support them). That is why, for example, System.Reflection.Assembly doesn’t define a single GetType method that looks like the following (in VB .NET syntax):

Public Function GetType( _
  name As String, _
  Optional throwOnError As Boolean = False, _
  Optional ignoreCase As Boolean = False _
) As System.Type

but rather three overloads to achieve the same combinations of invocations:

Overloads Public Function GetType( _
  name As String _
) As Type

Overloads Public Function GetType( _
  name As String, _
  throwOnError As Boolean _
) As Type

Overloads Public Function GetType( _
  name As String, _
  throwOnError As Boolean, _
  ignoreCase As Boolean _
) As Type

Indeed, the use of overloaded methods is pervasive in the .NET Framework. Unfortunately, as you’ve seen in Chapter 8, “The Essentials for Using .NET Components from COM,” overloaded methods are not permitted in COM, so the renamed methods exposed to COM are not COM-friendly. The previous GetType methods on the exported Assembly class interface are exported to a type library as follows (in Visual Basic 6 syntax):

Public Function GetType_2( _
  ByVal name As String _
) As Type

Public Function GetType_3( _
  ByVal name As String, _
  ByVal throwOnError As Boolean _
) As Type

Public Function GetType_4( _
  ByVal name As String, _
  ByVal throwOnError As Boolean, _
  ByVal ignoreCase As Boolean _
) As Type

Notice that even the first method has _2 appended because all the GetType methods are overloads of System.Object.GetType, which also appears on the _Assembly interface. Besides looking unfamiliar to COM clients following the Assembly class’s documentation, the situation is even worse when such methods are on classes that don’t expose them on an interface with type information. When late binding, COM clients may end up guessing the name of the overloaded method they’re invoking by trial and error, unless they use an assembly browser like ILDASM.EXE that can show the exact order in which the overloaded methods are defined by the class. In addition, overloaded methods present a problem for versioning that’s discussed in the upcoming “Versioning” section.

Versioning headaches alone should discourage you from using overloaded methods. But besides versioning, choosing whether or not to use overloaded methods usually involves a tradeoff between ease of use for .NET clients and ease of use for COM clients. Here are some guidelines:

• Consider changing overloaded methods to regular methods with similar names. Assembly.GetType is probably not a good candidate for this (or I’m not imaginative enough), but the combinations covered by its three overloads could be exposed as four methods called GetTypeCaseSensitive, GetTypeCaseInsensitive, FastGetTypeCaseSensitive, and FastGetTypeCaseInsensitive. (The “Fast” could refer to the faster act of returning null on failure rather than throwing an exception.) However, these names are long and strange-looking for all clients (.NET or COM)!

• If you’re primarily concerned with Visual Basic clients (VB .NET on the .NET side and VB6 on the COM side), favor optional parameters on a single method rather than overloaded methods. This is beneficial because all clients see the original method name. In the worst case, clients that don’t support optional parameters simply can’t omit any arguments. Be aware, however, of the versioning concerns with optional parameters outlined in Chapter 3, “The Essentials for Using COM in Managed Code.”

• Although the guidelines tell you to avoid reserved parameters because an overloaded method can always be added later, you might want to consider limited use of reserved parameters if it’s likely that more functionality will need to be added at a later date.

• If you must use overloaded methods, try to make the most commonly used overload the first one listed so most COM clients won’t have to use the overloads with mangled names. (This is also good for IntelliSense in Visual Studio .NET so it shows the most commonly used overload first.)

Tip

Using overloaded methods, but making the most common one defined first, is the choice made throughout the .NET Framework. Therefore, choosing this pattern for your APIs has the benefit of consistency with the framework.

Using Constructors

You must define a public default constructor (one with no parameters) if you plan to make your class creatable from COM. Therefore, if you plan to create parameterized constructors, be sure to also expose the same functionality through a public default constructor coupled with members that enable the user to set the same state that would have been set with the constructor parameters.

Notice that in C#, Visual Basic .NET, and C++, the compiler supplies an empty public default constructor if you don’t define any constructors. If you do add any constructors yourself, then the compiler no longer supplies one implicitly. Listing 11.2 demonstrates this situation.

Listing 11.2. Writing COM-Creatable Classes and Non-COM-Creatable Classes in C#
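The listing itself is not reproduced here, so the following is only a minimal sketch of the situation just described. The class names (ImplicitDefault, ExplicitDefault, NoDefault) are hypothetical and chosen purely for illustration.

```csharp
using System;

// COM-creatable: no constructors are defined, so the compiler
// supplies an empty public default constructor implicitly.
public class ImplicitDefault
{
    public string Name { get; set; } = "";
}

// COM-creatable: a parameterized constructor exists, but a public
// default constructor is defined explicitly alongside it. The Name
// property lets a COM client set the same state after creation.
public class ExplicitDefault
{
    public string Name { get; set; } = "";

    public ExplicitDefault() { }

    public ExplicitDefault(string name) { Name = name; }
}

// Not COM-creatable: defining the parameterized constructor means the
// compiler no longer supplies a default constructor.
public class NoDefault
{
    public string Name { get; set; }

    public NoDefault(string name) { Name = name; }
}
```

Only the first two classes have a public constructor with no parameters, so only they satisfy the requirement for being created from COM.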

Using Enumerations

The .NET guidelines recommend the use of enumerations (enums) for strongly-typed parameters, properties, fields, and so on. This remains a good practice when COM clients are involved, with one exception. As mentioned in the previous chapter, unmanaged script clients are unable to invoke .NET members with enum parameters via reflection in version 1.0 of the CLR. Still, in most cases, rather than defining static constants in your class, such as (in C#):

public const int DAYOFWEEK_SUNDAY = 0;
public const int DAYOFWEEK_MONDAY = 1;
public const int DAYOFWEEK_TUESDAY = 2;
public const int DAYOFWEEK_WEDNESDAY = 3;
public const int DAYOFWEEK_THURSDAY = 4;
public const int DAYOFWEEK_FRIDAY = 5;
public const int DAYOFWEEK_SATURDAY = 6;

you should define an enumeration:

public enum DayOfWeek
{
  Sunday = 0,
  Monday = 1,
  Tuesday = 2,
  Wednesday = 3,
  Thursday = 4,
  Friday = 5,
  Saturday = 6,
}

(And in this case, don’t even bother defining it because the canonical DayOfWeek enumeration already exists in the System namespace.) Enums are natural for .NET clients because they are widely used in .NET Framework APIs, and they are natural for most COM clients because the alternative of using constants would not be exposed to COM.

Another guideline for defining enums is to use a 32-bit integer as the underlying type. This is the default behavior for enums defined in C#, VB .NET, and C++, but it’s possible in all these languages to define enumerations with a different base type, such as a 64-bit integer:

C#:

public enum ManyFlags : long
{
...
}

Visual Basic .NET:

Public Enum ManyFlags As Long
...
End Enum

C++:

public __value enum ManyFlags : System::Int64
{
...
};

Increasing the size of the enum beyond 32 bits would be necessary when defining an enumeration with more than 32 bit flags. However, such enumerations pose a problem for COM since an enumeration in a type library always has a 32-bit underlying type. As discussed in Chapter 9, parameters, return types, and fields of non-32-bit enum types are exported as their underlying types. The enums themselves are still exported as long as their values fit inside 32 bits, but they can’t be used directly in exported signatures. Therefore, avoid defining enumerations with base types other than a 32-bit integer because doing so eliminates most of the benefit of using enums in the first place from COM’s perspective.

Tip

An enumeration consisting of more than 32 bit flags can most likely be refactored into multiple enumerations. Doing so avoids enumerations based on types that are too big for type libraries.
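As a rough illustration of this kind of split, the sketch below divides one large set of flags into two enumerations by category, so each keeps the default 32-bit underlying type. The enum names, their members, and the Resource class are all hypothetical.

```csharp
using System;

// Hypothetical file-related flags; the underlying type is the default int.
[Flags]
public enum FileFlags
{
    None = 0,
    ReadOnly = 1 << 0,
    Hidden = 1 << 1,
    Archived = 1 << 2
}

// Hypothetical network-related flags, kept in a separate int-based enum.
[Flags]
public enum NetworkFlags
{
    None = 0,
    Shared = 1 << 0,
    Offline = 1 << 1
}

public class Resource
{
    public FileFlags File { get; private set; }
    public NetworkFlags Network { get; private set; }

    // Two int-based enum parameters take the place of a single
    // parameter whose enum type would need a 64-bit underlying type.
    public void Configure(FileFlags file, NetworkFlags network)
    {
        File = file;
        Network = network;
    }
}
```

Whether a given set of flags divides this cleanly depends on the API; the point is only that each resulting enum can appear directly in an exported signature.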

Choosing the Right Data Types

The kind of data types you choose to use in your public methods, properties, and fields is critical to ensuring smooth interaction with COM. The next chapter focuses on customizing COM’s view of data types with MarshalAsAttribute, but here we’ll briefly look at some general data type guidelines.

Using OLE Automation Compatible Types

The most important guideline is to make your classes and interfaces OLE Automation compatible by restricting your public signatures to using only OLE Automation compatible data types. That’s right, you should still be concerned with OLE Automation compatibility of .NET types when interaction with COM is likely!

A COM interface is considered OLE Automation compatible if it is derived from IDispatch or IUnknown (as all exported .NET interfaces are), marked with the [oleautomation] attribute (as all exported .NET interfaces are), and all of its members are OLE Automation compatible.

A COM member is considered OLE Automation compatible if it has the STDCALL calling convention (on 32-bit platforms, which all exported .NET members do have), if its return type is HRESULT, SCODE, or void, and if its parameter types are all from a subset of data types. These data types, in their .NET form, are listed in Table 11.1. These types can be used as by-value or by-reference parameters and still be considered OLE Automation compatible.

Table 11.1. .NET Data Types That Are Marshaled as OLE Automation Compatible Data Types

For .NET data types that can be marshaled to more than one COM data type, Table 11.1 points out which of the COM types are OLE Automation compatible. Again, marshaling customizations are covered in the next chapter. Although SAFEARRAYs are OLE Automation compatible, Visual Basic 6 clients can’t consume methods with by-value SAFEARRAY parameters, so you might want to avoid those as much as possible.

Why should your .NET types be OLE Automation compatible, anyway? It’s important for .NET classes because by default they are only accessed from COM via IDispatch, so parameters and return types must be able to be stored inside a VARIANT. However, VARIANTs can contain several types that aren’t OLE Automation compatible, such as unsigned integers. It’s most important that .NET interfaces (including class interfaces) are OLE Automation compatible because the exported interfaces use the OLE Automation type library marshaler for COM marshaling across apartment, thread, or process boundaries.

Chapter 6, “Advanced Topics for Using COM Components,” explained that COM objects that aren’t OLE Automation-compatible work fine across apartments in managed code as long as an appropriate marshaler is registered for the appropriate interfaces. Similarly, using .NET objects that aren’t OLE Automation-compatible can work fine across apartments as long as an appropriate marshaler is registered to perform the customized COM marshaling. The “Disabling Type Library Marshaling of .NET Interfaces” section in the next chapter describes how an ambitious .NET developer can accomplish this. Fortunately, because .NET classes are registered with the Both threading model value, COM marshaling can often be avoided when .NET components are used in-process, making the need for OLE Automation compatibility less important.

Tip

Besides COM marshaling concerns, it’s a good idea to use OLE Automation compatible data types for the sake of Visual Basic COM clients. The CLS guidelines help by excluding unsigned integers, but you should also avoid publicly exposing types like System.Guid or System.Decimal (without using MarshalAsAttribute to turn it into the CURRENCY type) because they can’t be consumed in Visual Basic 6.

Avoiding Pointers

Rather than exposing parameters that are pointers to types, use by-reference parameters. In other words, do this in a C# interface:

void UpdateTime(ref DateTime t);

rather than:

unsafe void UpdateTime(DateTime* t);

Although the signature in an exported type library looks the same for both cases, shown below in IDL, the two signatures behave much differently:

HRESULT UpdateTime([in, out] DATE* t);

Exposing pointers to COM is not a good idea because the Interop Marshaler only does a shallow copy of the data being pointed to. This means that passing pointers across the Interop boundary only works as you might expect for blittable types like integers, floats, and doubles.

Avoiding public signatures with pointer types is a good idea in .NET anyway, since pointers are not in the CLS and languages like Visual Basic .NET would not be able to consume your APIs.

Tip

Always prefer defining by-reference parameters rather than using pointers. Besides being better for .NET clients, the Interop Marshaler is designed for them.

Avoiding Nested Arrays

Nested arrays, also known as jagged arrays, are simply arrays of arrays. Nested arrays are not supported in COM Interoperability. Note that nested arrays are not the same as multi-dimensional arrays, which are supported. Nested arrays provide the flexibility of non-rectangular storage, but multi-dimensional arrays should be favored in your public APIs so COM clients can be supported.

To summarize, define methods as follows:

C#: void ProcessData(int [,] data)

VB .NET: Sub ProcessData(data(,) As Integer)

rather than the COM-unusable alternative:

C#: void ProcessData(int [][] data)

VB .NET: Sub ProcessData(data()() As Integer)
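If nested arrays are already used internally, one option is to convert them at the public boundary. The helper below is a minimal sketch and not part of the .NET Framework; the name ArrayShapes.ToRectangular and the choice to pad short rows with zeros are assumptions made for illustration.

```csharp
using System;

public static class ArrayShapes
{
    // Copies a jagged array into a rectangular one. Rows shorter than
    // the longest row are padded with zeros (an assumption of this sketch).
    // Every row is assumed to be non-null.
    public static int[,] ToRectangular(int[][] jagged)
    {
        int rows = jagged.Length;
        int cols = 0;
        foreach (int[] row in jagged)
        {
            if (row.Length > cols) cols = row.Length;
        }

        int[,] result = new int[rows, cols];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < jagged[r].Length; c++)
            {
                result[r, c] = jagged[r][c];
            }
        }
        return result;
    }
}
```

A public ProcessData(int[,] data) method could then be fed from internal jagged storage, at the cost of a copy and of losing the non-rectangular shape.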

Avoiding User-Defined Value Types

Due to the COM Interoperability limitations with VARIANTs containing the VT_RECORD type, user-defined value types (UDTs) should be avoided as much as possible in APIs that might be exposed to COM. This means avoiding exposing generic System.Object types that could contain value types, since COM clients would be forced to access the value type through a VARIANT. Even exposing a value type as its exact type is bad for COM clients that late bind, because parameters are packed into VARIANTs. Chapter 14 discusses a workaround for the late binding scenario, however, by customizing the IDispatch implementation exposed to COM. Late binding aside, by-value UDT parameters can’t be consumed by Visual Basic 6 clients when v-table binding, so that’s another good reason to avoid them.

If you do decide to define and use your own value types, avoid defining fields that are reference types such as arrays or classes other than Object. Such fields can’t be marshaled unless marked with MarshalAsAttribute appropriately, as discussed in Chapter 19, “Deeper Into PInvoke and Useful Examples.” Also, keep in mind that because only a value type’s instance fields are exposed to COM (whether public or not), none of the properties and methods you might define are directly available to COM.

Digging Deeper

When defining value types, it’s important to remember that private fields are exposed to COM. This fact can affect the use of common obfuscation techniques. The process of obfuscation, when applied to software, means obscuring internal details such that the software’s usage is unaffected, but the software is much harder to reverse-engineer. If you decide to scramble the names of your non-public types and members, you could be scrambling part of the public API from COM’s perspective!

Reporting Errors

When it comes to reporting errors, the .NET guidelines provide good information that coincides with being COM-friendly. Two particularly important guidelines are to throw exceptions for error conditions, and to favor throwing pre-defined exception types rather than inventing your own kind of exception. As always, there are some additional considerations to think about to make your use of exceptions provide the best experience for COM clients.

Defining New Exception Types

Defining a new exception type can be useful when you provide additional information that clients can act on programmatically. For example, System.Runtime.InteropServices could define a ClassNotRegisteredException type with a Clsid property of type Guid that contains the CLSID corresponding to the class that isn’t registered. Catching a ClassNotRegisteredException and checking its Clsid property would be much handier than catching a COMException, checking if the ErrorCode property returns 0x80040154 (REGDB_E_CLASSNOTREG), then parsing the CLSID string from the Message property. However, new exception types can be overkill if most users have no need to take different programmatic action. This is why such a ClassNotRegisteredException doesn’t exist—most users would not want to attempt anything more than displaying a message about this type of error.

When defining a new exception type that might be thrown to a COM client, choosing a value for its protected HResult property is critical because this value becomes an HRESULT in the unmanaged world that identifies the exception type. Choosing a unique value is recommended so COM clients can easily identify the unique exception type. As demonstrated in Chapter 8, “The Essentials for Using .NET Components from COM,” a non-Visual Basic COM client can always discover the exact type of exception thrown by calling _Object.GetType on the object obtained from the IErrorInfo interface pointer, but this isn’t something many COM clients are likely to do.

Creating a hierarchy of exceptions can be valuable for your .NET users, but it doesn’t mean much to COM clients. If every exception type defines its own HRESULT value, then these error codes have no real connection that enables them to be grouped as a unit. Therefore, exception designers sometimes share HRESULT values with similar exceptions so a COM client can “catch” a set of exception types all at once. The most common form of this is simply inheriting the HRESULT value from your base exception class. For example, if you define a handful of exceptions that derive from System.ApplicationException that don’t override ApplicationException’s HResult property, a COM client could always handle any application-defined errors the same way (such as printing the exception’s message) simply by checking for the single HRESULT belonging to ApplicationException (0x80131600). Inheriting the base exception class’s HRESULT is often done unintentionally, because the author might not have even been aware of the HResult property that could have been overridden or didn’t fully understand the consequences.
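The two choices described above can be sketched as follows. Both exception types are hypothetical, and the value 0x80040201 is only a placeholder taken from the range of FACILITY_ITF codes conventionally left for user-defined errors; it does not come from any real component.

```csharp
using System;

// Hypothetical exception that chooses its own HRESULT so a COM client
// can identify this specific failure.
public class QuotaExceededException : ApplicationException
{
    // Placeholder value, chosen only for illustration.
    public const int QuotaHResult = unchecked((int)0x80040201);

    public QuotaExceededException(string message) : base(message)
    {
        // This value becomes the failure HRESULT seen by a COM caller.
        HResult = QuotaHResult;
    }
}

// Hypothetical exception that leaves HResult alone, so it shares
// ApplicationException's HRESULT (0x80131600) with every other
// exception that does the same.
public class RetryLaterException : ApplicationException
{
    public RetryLaterException(string message) : base(message) { }
}
```

A COM client checking for 0x80131600 would treat RetryLaterException like any other application-defined error, whereas QuotaExceededException could be singled out by its own value.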

Tip

When defining a new exception type, remember to perform the following actions so the exception can be transported from one machine to another:

• Mark the exception with the System.SerializableAttribute custom attribute

• Implement a deserialization constructor, which looks like the following in C#: public MyException(SerializationInfo info, StreamingContext context)

• Implement the System.Runtime.Serialization.ISerializable interface if your exception type contains its own fields

These steps have nothing to do with COM Interoperability, but they’re important for .NET exceptions and easy to forget.
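A minimal sketch of these three steps is shown below, written against the classic .NET Framework serialization model. The ImportFailedException type and its LineNumber field are hypothetical. The deserialization constructor is declared protected here, a common convention for unsealed classes, although the tip shows it as public.

```csharp
using System;
using System.Runtime.Serialization;

// Step 1: mark the exception as serializable.
[Serializable]
public class ImportFailedException : Exception
{
    // A field of its own, which is why step 3 is needed.
    private readonly int lineNumber;

    public ImportFailedException(string message, int lineNumber)
        : base(message)
    {
        this.lineNumber = lineNumber;
    }

    // Step 2: the deserialization constructor restores the extra field.
    protected ImportFailedException(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
        lineNumber = info.GetInt32("LineNumber");
    }

    public int LineNumber
    {
        get { return lineNumber; }
    }

    // Step 3: Exception already implements ISerializable, so overriding
    // GetObjectData is enough to save the extra field.
    public override void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        base.GetObjectData(info, context);
        info.AddValue("LineNumber", lineNumber);
    }
}
```

Newer .NET versions flag these serialization members as obsolete, so treat this as an illustration of the pattern for the framework versions the chapter targets.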

User-defined exceptions have an undesirable aspect for applications that make more than one transition between managed and unmanaged code in a given call stack. As discussed in Chapter 3, many user-defined errors from COM objects appear as a COMException in managed code. This happens because the CLR has a list of HRESULTs that are transformed into specific exceptions, and there’s no way to add to the list. Therefore, any unrecognized HRESULT, even if originating from a .NET exception, becomes a COMException. Figure 11.1 depicts a .NET exception being thrown from a .NET component and caught by another .NET component. If there’s a COM component in the middle that propagates the error from server to client, the exception transforms into a COMException.

Figure 11.1. When passing from .NET to COM to .NET, any exception not in the CLR’s list morphs into a COMException.

Fortunately, other exception information (such as the ever-important message) is preserved, but the type of the exception is not. Appendix C, “HRESULT to .NET Exception Transformations,” contains the complete list illustrated in the figure. In general, it’s best to avoid user-defined exceptions and use exceptions whose types are familiar to .NET clients and whose HRESULT values are familiar to COM clients.

Caution

If you examine the list of HRESULT to exception transformations in Appendix C, you’ll notice that the HRESULTs consist of well-known COM HRESULTs plus HRESULTs defined by the .NET exceptions in the mscorlib assembly. This means that no exceptions defined elsewhere in the .NET Framework are preserved when passing through a COM layer. Treat the list of exceptions in Appendix C as your “play book” of exceptions that should be thrown whenever possible rather than exceptions not in the list.

General Guidelines

Besides avoiding user-defined exceptions, here are some general do’s and don’ts for using exceptions in COM-friendly .NET components:

• Try to fit as much information as possible into an exception’s Message, HelpLink, and Source properties because this information is natural to obtain from COM using IErrorInfo (or the global Err object in Visual Basic 6). Although the other members (like InnerException or user-defined members) can be accessible from COM, existing COM clients that are unaware of .NET may be designed to perform intelligent actions based on the standard information.

• Don’t overuse inner exceptions. Any COM clients that aren’t .NET-aware (for instance, the Visual Basic 6 IDE) would only display the information for the outermost exception because that’s the only error they would be aware of.

• Throw exceptions for errors rather than returning error codes. Although returning error codes looks natural to COM (as long as the method is marked with the PreserveSigAttribute), such APIs are not natural to use in .NET.

• If you’re intent on returning error codes rather than throwing exceptions, be sure to mark such a method with the PreserveSigAttribute so your return value is treated as the return value from COM’s perspective. Otherwise, COM clients would have two different error codes to worry about when calling your method. Be sure that your returned error codes respect the COM standard for HRESULTs.

• Avoid returning null (Nothing in VB .NET) rather than throwing an exception to indicate failure. Returning null for common error cases is recommended by the .NET guidelines and done in some common .NET Framework APIs. Although it’s compelling to not throw an exception on failure in situations in which the failure may occur frequently (for performance reasons), this behavior can be very confusing to COM clients. Why? Because COM clients see S_OK returned whenever an exception isn’t thrown, so it may mistakenly lead COM clients to believe the call succeeded when it really did not.
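For the guideline about returning error codes, a minimal sketch might look like the following. The LogReader class and its Open method are hypothetical; the constants are the standard S_OK and E_INVALIDARG values plus the HRESULT form of the Win32 "file not found" error.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public class LogReader
{
    public const int S_OK = 0;
    public const int E_INVALIDARG = unchecked((int)0x80070057);
    // HRESULT_FROM_WIN32(ERROR_FILE_NOT_FOUND)
    public const int E_FILENOTFOUND = unchecked((int)0x80070002);

    // With PreserveSig, the int returned here is treated as the method's
    // return value from COM's perspective, instead of being turned into
    // an [out, retval] parameter behind a separate HRESULT.
    [PreserveSig]
    public int Open(string path)
    {
        if (string.IsNullOrEmpty(path)) return E_INVALIDARG;
        if (!File.Exists(path)) return E_FILENOTFOUND;
        return S_OK;
    }
}
```

A .NET caller of Open has to remember to check the returned value, which is exactly why the preceding guideline favors exceptions.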

Tip

Many .NET Framework methods that return null on failure have an overload with a boolean throwOnError parameter, giving clients the choice of fast failure versus easily noticeable failure. If you feel that it’s necessary to provide a method that returns null on failure, consider adding the throw-on-error overload, but list that definition first in source code. Assuming that your compiler respects the ordering when emitting metadata, you’ll encourage COM clients to call the version that fails more naturally for them since the throw-on-error overload would have the original method name, whereas the method that returns null on failure would be decorated with the _2 suffix.
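The ordering suggested in this tip can be sketched as follows. The PartCatalog class and its Find methods are hypothetical, and the exported names in the comments assume that the compiler preserves source order when emitting metadata.

```csharp
using System;
using System.Collections;

public class PartCatalog
{
    private readonly Hashtable parts = new Hashtable();

    public void Add(string name, object part)
    {
        parts[name] = part;
    }

    // Defined first, so COM clients would see this overload under the
    // plain name "Find".
    public object Find(string name, bool throwOnError)
    {
        object part = parts[name];
        if (part == null && throwOnError)
        {
            throw new ArgumentException("No part named '" + name + "'.", "name");
        }
        return part;
    }

    // Defined second, so COM clients would see this overload as "Find_2".
    // It returns null on failure, which a COM client observes as S_OK.
    public object Find(string name)
    {
        return Find(name, false);
    }
}
```

.NET clients still get both overloads under the same name, so the ordering costs them nothing.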

Exposing Enumerators to COM

One of the topics discussed by the .NET guidelines is choosing when to expose arrays and when to expose collections. Indeed, an array is a type of collection, but the difference here is whether all elements are handed out at once or if elements are obtained one-at-a-time through calls into your component. Exposing a collection can be useful, for example, to provide read-only access to the elements of a non-public array. The trade-off for .NET and COM interaction is that returning an array is a single method call that crosses the managed/unmanaged boundary, but the call can be expensive if the CLR must marshal a large array with non-blittable elements. On the other hand, returning a collection can result in many method calls across the Interop boundary (especially if the client inspects every element of the collection), but each method call could involve much less marshaling. Usually exposing arrays to COM clients yields better performance.

If you want to expose an enumerator on a .NET type, the official technique is to implement System.Collections.IEnumerable. IEnumerable has a single GetEnumerator method that returns an IEnumerator interface, which enumerates over generic Object types. C#’s foreach statement and VB .NET’s For Each statement actually work with any method called GetEnumerator that takes no parameters and returns an object with a MoveNext method and a readable Current property. This enables strongly typed enumerations, because the types being enumerated can be made more specific for better performance.
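The pattern-based behavior of foreach can be seen in a small sketch. The Int32Bag class below is hypothetical and deliberately does not implement IEnumerable, to show that the GetEnumerator, MoveNext, and Current members alone are enough for C#.

```csharp
using System;

// Hypothetical collection that works with foreach purely through
// the GetEnumerator pattern.
public class Int32Bag
{
    private readonly int[] items;

    public Int32Bag(params int[] items)
    {
        this.items = items;
    }

    // No parameters, and the returned object has MoveNext and Current.
    public Enumerator GetEnumerator()
    {
        return new Enumerator(items);
    }

    public class Enumerator
    {
        private readonly int[] items;
        private int index = -1;

        internal Enumerator(int[] items)
        {
            this.items = items;
        }

        public bool MoveNext()
        {
            return ++index < items.Length;
        }

        // Strongly typed, so elements are not boxed into Object.
        public int Current
        {
            get { return items[index]; }
        }
    }
}
```

With this in place, a loop such as foreach (int n in new Int32Bag(1, 2, 3)) compiles and visits each element in order. A real collection class would also implement IEnumerable, for the COM-related reasons covered in the rest of this section.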

COM clients expect an enumerator type to be returned by a member marked with DISPID –4. This enumerator type is usually the IEnumVARIANT interface (COM’s equivalent of IEnumerator) but can be more specific types as well.

COM Interoperability handles .NET collections in a way such that the design that’s natural for .NET also fits in well for COM. For starters, the IEnumerator type is always exposed to COM as IEnumVARIANT. In addition, the definition of IEnumerable marks its GetEnumerator method with a custom attribute so it’s exported with a DISPID equal to –4. This means that the IEnumerable interface is exported as follows (as seen by running OLEVIEW.EXE on mscorlib.tlb):

[
  odl,
  uuid(496B0ABE-CDEE-11D3-88E8-00902754C43A),
  version(1.0),
  dual,
  oleautomation,
  custom(0F21F359-AB84-41E8-9A78-36D110E6D2F9,
"System.Collections.IEnumerable")
]
interface IEnumerable : IDispatch
{
    [id(0xfffffffc)]
    HRESULT GetEnumerator([out, retval] IEnumVARIANT** pRetVal);
};

The 0xfffffffc value is hexadecimal for –4. Therefore, any class that implements IEnumerable or an interface derived from IEnumerable can be used as an enumerator naturally from COM.

The support doesn’t stop there, however. Any class or interface exposed to COM that has a GetEnumerator method with the proper format is assigned a DISPID of –4 unless a different member is already explicitly marked with that DISPID. The “proper format” is a method with no parameters, a return type of IEnumerator, and with the name of GetEnumerator (although any case will do). This means that the following .NET interface:

public interface INonStandardCollection
{
  IEnumerator GetEnumerator();
}

is exported to COM as:

[
  ...
]
interface INonStandardCollection : IDispatch
{
  [id(0xfffffffc)]
  HRESULT GetEnumerator([out, retval] IEnumVARIANT** pRetVal);
};

Since strongly-typed GetEnumerator methods aren’t automatically marked with DISPID –4 when exposed to COM, you should mark such methods with this DISPID explicitly using DispIdAttribute. This is demonstrated in Listing 11.3.

Listing 11.3. Marking a Strongly-Typed GetEnumerator Method with the DispIdAttribute Is Recommended to Properly Expose an Enumerator to COM

A typical pattern used by .NET classes is to explicitly implement IEnumerable to provide the standard enumerator when the client calls on the IEnumerable type, while defining a public GetEnumerator method with a strongly-typed enumerator returned. To be COM-friendly, the DispIdAttribute should be applied to the strongly-typed GetEnumerator, as shown in the following C# code:

1: using System.Collections;
2: using System.Runtime.InteropServices;
3:
4: public class EnumerableObject : IEnumerable
5: {
6:   private Hashtable table;
7:
8:   ...
9:
10:   // Explicitly-implemented GetEnumerator
11:   IEnumerator IEnumerable.GetEnumerator()
12:   {
13:     return table.Values.GetEnumerator();
14:   }
15:
16:   // Strongly-typed GetEnumerator
17:   [DispId(-4)]
18:   public IDictionaryEnumerator GetEnumerator()
19:   {
20:     return table.GetEnumerator();
21:   }
22: }

By using the IEnumerable interface, COM clients can get the standard IEnumVARIANT interface back by invoking the member with DISPID –4. When calling methods on the class via late binding, invoking the member with DISPID –4 gives back the IDictionaryEnumerator interface. Had the attribute not been applied in Line 17, no enumerator would be exposed on the class type from COM’s perspective.

Tip

When defining a collection class, always implement the IEnumerable interface. When adding a strongly-typed GetEnumerator method, mark it with the DispIdAttribute custom attribute to give it a DISPID equal to –4.

Versioning

A .NET class library author can make certain changes to her assembly so that .NET clients built with a previous version can switch to using the updated version without experiencing problems. This process of safely evolving a component in compatible ways is known as versioning.

Besides changing implementation, an assembly’s public APIs can sometimes be changed in ways that don’t break existing .NET clients. This includes:

• Adding types

• Adding members to enums (without affecting existing values)

• Adding members to classes

• Reordering types in source code

• Reordering enum members in source code (without affecting existing values)

• Reordering class, value type, or interface members in source code

It should be obvious that removing or changing public or protected members would break any .NET clients that currently use them. Although adding members to an interface doesn’t affect .NET clients that call its members, it would break any .NET clients that attempt to implement the interface because their classes would no longer implement the entire interface. So even in the pure .NET world, adding members to an interface should never be done.

The default behavior of COM Interoperability enables a class library author to make almost all of the same kinds of changes to an assembly without breaking COM clients. The allowable changes are slightly more restrictive when COM clients are involved, so it’s important to understand what kind of changes you can make and still be compatible. Here are changes you can make that are compatible for both .NET and COM clients:

• Adding types

• Adding members to enums (without affecting existing values)

• Adding members to classes marshaled as reference types that don’t expose an auto-dual class interface (with an exception for overloaded methods)

• Reordering types in source code

• Reordering enum members in source code (without affecting existing values)

• Reordering members in source code for classes marshaled as reference types that don’t expose an auto-dual class interface (with an exception for overloaded methods)

The one noticeable difference is that members of interfaces or types marshaled as structs should never be rearranged once you’ve shipped an assembly. Such a change won’t affect .NET clients but completely breaks COM clients that rely on interface or structure layout.

Adding or reordering methods of classes that don’t expose auto-dual class interfaces is mostly safe because the class’s members are either not directly accessible from COM or accessible only via late binding. As long as COM clients don’t cache DISPIDs (which can change when members are reordered or added), late binding by member names continues to work as classes evolve. The reason .NET classes are given auto-dispatch class interfaces by default is to prevent COM clients from depending on DISPIDs that may change. The next chapter describes how to enable the different kinds of class interfaces.

Caution

Due to the way overloaded methods are handled, there’s an important exception to two of the changes that are otherwise compatible for .NET and COM clients. Because the method names exposed to COM (with _2, _3, and so on) are determined by the ordering of the overloads, rearranging overloaded methods or adding a new overload before the last existing overload causes the method names to change from COM’s perspective! This is why using overloaded members can be dangerous. Although it might be tempting to rearrange them, perhaps sorted by how frequently they’re used so IntelliSense behaves nicer, never rearrange them once you’ve shipped your assembly!

Besides auto-dispatch class interfaces that insulate COM clients from evolving classes, the key to COM Interoperability’s default versioning behavior lies in the automatically generated GUIDs for .NET types. Authors of .NET types don’t need to think about generating unique GUIDs and decorating types with them since GUIDs are no longer used for identification in the strictly-.NET world. COM clients still require GUIDs in order to use your types, but fortunately the CLR generates them for .NET types on demand. Furthermore, the GUIDs are generated in such a way that provides COM clients with a versioning experience consistent with .NET clients. The rules of generating these automatic GUIDs are explained in the next few sections, which discuss LIBIDs, CLSIDs, and IIDs. The next chapter describes how to choose your own GUIDs if you want more control.

With all of the automatically-generated GUIDs, you can rest assured that they won’t conflict with GUIDs generated by the existing CoCreateGuid API used by all standard tools.

Library Identifiers (LIBIDs)

Recall that when a type library is registered, its location is placed under the following registry key:

HKEY_CLASSES_ROOT\TypeLib\{LIBID}\Major.Minor\LCID\win32

The algorithm that automatically generates a LIBID for a type library exported from an assembly ensures that each distinct assembly produces a type library that can be registered independently of any others.

To do this, a unique LIBID is generated based on a hash of three of the four parts of an assembly’s identity—name, public key, and version. An assembly’s culture does not affect the LIBID because this information is preserved in a type library’s locale identifier (LCID), and it’s possible to register multiple type libraries with the same LIBID but different locales. The same is true for a type library version number, so why does the assembly version number affect the LIBID? Because an assembly’s four-part version number can express more values than a type library’s two-part version number. Therefore, having a LIBID that changes along with the assembly’s version number is the only way to distinguish between type libraries for two assemblies that are identical except for build and revision numbers. Two otherwise-identical assemblies with version numbers 1.0.0.0 and 1.0.9.23 can be registered independently under:

HKEY_CLASSES_ROOT\TypeLib\{LIBID1}\1.0\LCID\win32

and

HKEY_CLASSES_ROOT\TypeLib\{LIBID2}\1.0\LCID\win32

Class Identifiers (CLSIDs)

The unique CLSIDs generated are based on a hash of a fully-qualified class name and the identity of the assembly containing the class. This ensures that classes with different identities in the .NET world always have different identities in COM by default. As with the generation of LIBIDs, only three of the four parts of an assembly’s identity (name, public key, and version) are used as input for calculating CLSIDs. Therefore, two classes differing only by culture would have the same CLSID, but assembly cultures are intended to be used for satellite assemblies containing resources, not code. The same rules for CLSIDs apply to GUIDs generated for delegates, structs, and enums.

The fact that a class’s members don’t affect its automatically generated CLSID makes member additions or reordering a compatible change for COM clients. But the CLSID only remains the same as long as the assembly’s version number doesn’t change. If you do change the version number, publisher policy can redirect requests for the older assembly to the newer assembly as long as the older assembly is still registered. Figure 11.2 illustrates this situation.

Figure 11.2. COM objects built with assembly version 1.0.0.0 can still work with assembly version 2.0.0.0.

Caution

By default, every Visual Studio .NET project uses the AssemblyVersionAttribute custom attribute to give the assembly a version of “1.0.*”. The wildcard has the effect of producing a different assembly version number each time the project is compiled. This behavior does not work well with the automatically generated LIBIDs and CLSIDs because it causes them to change each time you recompile the project! Any COM clients then fail unless they are recompiled against the newer version of the assembly or you bother with publisher policy after every build. Plus, if you register your assembly each time, you can pollute your registry with multiple entries pretty quickly. To solve this problem, you should change the value of AssemblyVersionAttribute (in AssemblyInfo.cs, AssemblyInfo.vb, or AssemblyInfo.cpp, depending on the project language) to a stable number that doesn’t use a wildcard.
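
As a minimal sketch of the change described in this caution, the assembly-level attribute in a C# project's AssemblyInfo.cs might look like the following. The version number 1.0.0.0 is only an illustrative choice; any fixed four-part number without a wildcard serves the same purpose.

```csharp
using System.Reflection;

// Project default: the wildcard produces a new build/revision number on
// every compile, so the auto-generated LIBID and CLSIDs change each time.
// [assembly: AssemblyVersion("1.0.*")]

// A fixed version keeps the auto-generated LIBID and CLSIDs stable
// from one rebuild to the next.
[assembly: AssemblyVersion("1.0.0.0")]
```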

Interface Identifiers (IIDs)

An IID that is generated for a .NET interface is based on a hash of its fully-qualified name plus the signatures of all the interface’s members that are exposed to COM. This means that changing or adding methods to an interface gives it a new IID because an interface should never be changed without changing its identity.

Unlike CLSIDs or LIBIDs, the interface’s assembly identity does not affect IID generation. It’s critical for versioning that IIDs don’t change based on an assembly’s version number. Consider the scenario pictured in Figure 11.2. The COM client believes it’s using version 1.0.0.0 of the assembly, and may query for several interfaces exposed in that assembly’s exported type library. To be compatible, version 2.0.0.0 of the assembly must contain the same classes and interfaces, but if the interfaces’ IIDs changed along with the version number, the COM client’s QueryInterface calls would fail.

One subtlety to the IID generation rule is that the names of an interface’s members don’t play a role in the IID generation. This means that you could change the order of an interface’s methods, and if the list of signatures still looked the same (ignoring member names), the IID would not change. For example, changing a C# interface definition from:

public interface ITrafficLight
{
  void Go();
  void Stop();
  void Yield();
}

to:

public interface ITrafficLight
{
  void Stop();
  void Yield();
  void Go();
}

would not affect .NET clients but would break COM clients in subtle ways since they could still get the “same” interface but calling its methods by v-table offsets now does something entirely different. The moral of the story, mentioned earlier, is that you should never, ever reorder members!

Because assembly identity does not factor into IID calculations, unrelated interfaces in separate assemblies released by separate publishers can end up with the same IID if they have the same fully-qualified name and the same “shape.” Although this is an unlikely occurrence (even more unlikely than conflicting ProgIDs in COM), it can make unregistration of an exported type library dangerous since there’s a possibility that an interface will be unregistered that is needed by a separate application for COM marshaling. One way to avoid this situation is to give your interfaces explicit IIDs, as explained in the next chapter, and take on the responsibility of ensuring that your interface definitions never change.
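
As a rough sketch of what an explicit IID looks like, GuidAttribute from System.Runtime.InteropServices can be applied to the interface definition. The GUID value shown here is a made-up placeholder for illustration only; you would generate your own unique value (for example, with a GUID-generation tool) and never reuse one from sample code.

```csharp
using System.Runtime.InteropServices;

// Placeholder GUID for illustration; substitute a freshly generated one.
// With an explicit IID, the CLR no longer derives the IID from the
// interface's name and member signatures, so you become responsible for
// never changing the definition once it has been published.
[Guid("D1F0B5C2-7A34-4E8B-9C1D-3A5E6F7B8C90")]
public interface ITrafficLight
{
    void Go();
    void Stop();
    void Yield();
}
```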

Deployment

When deploying an assembly that might be used by COM clients, it’s a good idea to register it with REGASM.EXE so clients don’t have to perform the registration. This would be like registering a COM component with REGSVR32.EXE when deploying it. You may also want to deploy and register a type library, generated using REGASM.EXE’s /tlb option.

Caution

The side-by-side and versioning capabilities of .NET components work well with COM clients, as long as all versions of the components remain registered. That’s because COM clients built with an older version of an assembly can only be redirected to a newer version if the older version’s registry entries are still in the Windows Registry (shown in Figure 11.2).

Imagine that you install a COM application that uses version 1.0 of assembly X, and then install version 2.0 of assembly X that includes publisher policy to redirect requests for version 1.0 to version 2.0. You might be tempted to uninstall version 1.0 if the COM application still works correctly with version 2.0. However, if X’s uninstallation program unregisters X, then doing so breaks the COM application because the CLSIDs it uses in CoCreateInstance calls are no longer registered. Similarly, if X has a type library that was originally registered but is now unregistered, registry entries needed for marshaling interfaces may have been removed, since IIDs remain fixed across assembly versions.

To summarize, unregistration poses the following dangers if you uninstall version 1.0 after installing version 2.0:

• Unregistered IIDs break COM marshaling of interfaces.

• Unregistered CLSIDs break COM instantiation of classes from 1.0. If other COM clients attempt to instantiate classes from 2.0, it still works unless the .NET component used GuidAttribute to keep CLSIDs fixed from one version to the next.

If you instead install the COM application with version 1.0 of X, then uninstall version 1.0 before installing version 2.0, unregistration has the following effects:

• IIDs are still registered, so COM marshaling is not broken.

• Unregistered CLSIDs break COM instantiation of classes from 1.0, unless the .NET component used GuidAttribute to keep CLSIDs fixed from one version to the next. If other COM clients attempt to instantiate classes from 2.0, it still works.

You can even run into trouble if you install the COM application with version 1.0 of X, then install version 2.0, then uninstall version 2.0. This has the following effects:

• Unregistered IIDs break COM marshaling of interfaces.

• CLSIDs for version 1.0 are still registered, unless the .NET component used GuidAttribute to keep CLSIDs fixed from one version to the next. Therefore, COM instantiation of classes from 1.0 still works if the .NET component used the CLSIDs auto-generated by the CLR.

Therefore, you should be conservative with unregistration of assemblies and their type libraries, because it can have disastrous effects for COM clients dependent on these registry entries.

.NET assemblies should usually be given a strong name, and the best place to install assemblies that may be used by COM clients is the Global Assembly Cache. Besides the fact that assemblies in the GAC are loaded faster than assemblies elsewhere, relying on a file path using the CodeBase mechanism is more fragile. If you do decide to register a CodeBase pointing to the location of your assembly, be sure that your assembly has a strong name in order to prevent it from interfering with other .NET applications.

Visual Studio .NET exposes the functionality of REGASM.EXE in Visual C# and Visual Basic .NET projects via the Register for COM Interop project setting. Figure 11.3 shows this option for a Visual C# project, and Figure 11.4 shows this option for a Visual Basic .NET project. You can reach these dialogs by right-clicking on your project in the Solution Explorer and choosing Properties, but the option is only enabled for class library projects.

Figure 11.3. The Register for COM Interop option in a Visual C# project.

Figure 11.4. The Register for COM Interop option in a Visual Basic .NET project.

Caution

Using Visual Studio .NET’s Register for COM Interop option is equivalent to doing the following from a command prompt:

regasm MyProject.dll /tlb /codebase

Although the property page shown in Figure 11.3 states that the registration can only be done for strong-named assemblies, this is not enforced in any Visual Studio .NET project type. Because the /codebase option is used, you may want to avoid the use of Register for COM Interop altogether, or at least ensure that your assembly has a strong name.

Testing Your Component from COM

Obviously, the more testing you can do to exercise your .NET component from COM, the better. For those who don’t want to invest a lot of time in doing this, there are a few simple tasks that should be done to give your class library a COM-focused sanity check:

• Run REGASM.EXE on your assemblies to ensure that no errors or warnings occur.

• Run REGASM.EXE on your assemblies with the /regfile option in order to understand exactly what gets registered for the assemblies.

• Run TLBEXP.EXE on your assemblies to ensure that no errors or warnings occur when exporting type libraries. (Or decide whether or not you’re willing to accept the conditions explained by the warnings.)

• View the type libraries produced by TLBEXP.EXE in a viewer like OLEVIEW.EXE to get a handle on what your APIs look like to COM clients. It’s this step in which you’re likely to find the most surprises, such as renamed members due to conflicts, case-insensitivity effects, and members that you may not have realized were exposed or hidden.

• After running REGASM.EXE on your assemblies, open OLEVIEW.EXE and locate your classes in the tree view under Object Classes, Grouped by Component Category, .NET Category. When you expand the node for a class, OLEVIEW.EXE attempts to instantiate it. If it succeeds, it calls IUnknown.QueryInterface on your object for every registered interface on the machine and lists the interfaces the instantiated object implements. This is a quick yet very useful test to make sure that your objects can be successfully instantiated and the expected interfaces can be obtained, as pictured in Figure 11.5. (However, it’s possible that more interfaces than just the ones listed can be obtained since not all interfaces are necessarily registered.) If you can’t find the class you’re looking for after running REGASM.EXE, make sure that it’s marked as public in the assembly and has a public default constructor. Although instantiation can fail because of an exception thrown within your object’s constructor, 99% of the failures are due to HRESULT 0x80131522 (COR_E_TYPELOAD), a type load exception caused by not being able to locate the assembly listed in the registry. Unless you drop your assemblies in the same directory as OLEVIEW.EXE (only temporarily for test purposes, of course), you should either run REGASM.EXE with the /codebase option or install the assemblies into the Global Assembly Cache to make them loadable from OLEVIEW.EXE.

Figure 11.5. Using OLEVIEW.EXE to test your .NET components from COM.

This is just about all you can do besides sitting down and writing some code to test your class library from COM. (Unless, of course, you already have some mechanism for automated COM component testing.) Visual Basic 6 is a good environment to use for writing some quick tests that use your .NET components.

Conclusion

COM Interoperability was designed to expose common .NET practices as common COM practices. Just because HRESULTs and pointers are used in COM APIs doesn’t mean that you should continue using them in .NET APIs exposed to COM. However, as this chapter shows, there’s a limit to how much COM Interoperability can or will do.

Although .NET components can be exposed to COM with no extra work needed by the component developer, and although the CLR and its COM Interoperability tools choose sensible default behavior, there are still many special considerations to keep in mind if there’s a possibility for your components to be used by COM clients.

The main lessons, in a nutshell, are:

• Don’t create APIs that rely on parameterized constructors.

• Don’t create APIs that rely on static members.

• Don’t create APIs that rely on non-field members in value types.

• Don’t create APIs that rely on nested arrays.

• Think twice before using overloaded methods.

• Don’t forget the benefits of interface-based programming.

• Throw exception types defined in the mscorlib assembly.

The key to all of the advice in this chapter is to make your class libraries more COM-friendly while sacrificing as little as possible of the benefits that the published .NET guidelines give to .NET clients. Rather than trying to completely avoid COM-unfriendly elements like static members, providing COM-friendly alternatives can appease both worlds as long as it doesn’t clutter your APIs with two ways to accomplish everything. By understanding the limitations and interactions when COM clients attempt to use your .NET APIs, you can make more informed decisions about your design.
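
To make the idea of a COM-friendly alternative concrete, here is one possible sketch. The class and member names are hypothetical and chosen only for illustration: a static method is kept for .NET clients, while a public default constructor and a thin instance method give COM clients a way to reach the same functionality.

```csharp
// Hypothetical example class; names are illustrative, not from any real library.
public class TemperatureConverter
{
    // Public default constructor so that COM clients can instantiate the class.
    public TemperatureConverter()
    {
    }

    // Convenient for .NET clients, but static members are not exposed to COM.
    public static double CelsiusToFahrenheit(double celsius)
    {
        return celsius * 9.0 / 5.0 + 32.0;
    }

    // Instance alternative that COM clients can call; it simply forwards
    // to the static method so there is only one implementation to maintain.
    public double ConvertCelsiusToFahrenheit(double celsius)
    {
        return CelsiusToFahrenheit(celsius);
    }
}
```

Whether such a pair of members is worthwhile depends on how central the static member is to your API; adding an instance twin for every static member would produce exactly the clutter this chapter warns against.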
