Canonical schemas
Once again, let's quote Wikipedia on canonical models:
In enterprise application integration, the "canonical data model" is a
design pattern used to communicate between different data formats.
Source: http://en.wikipedia.org/wiki/Canonical
Without designing a canonical, or intermediary, data format you can end up with
a mess of point-to-point translations that leads to a brittle
application.
By injecting a common schema, you can reduce coupling between the source
formats and destination formats. At any one time, a message only needs
to be translated to or from the canonical format.
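To make the "to or from the canonical format" idea concrete, here is a minimal sketch in Python rather than in BizTalk maps. The format names (csv, legacy, crm, warehouse) and field names are hypothetical and only illustrate the shape of the pattern: every source has one translator into the canonical form, and every destination has one translator out of it.

```python
# All format and field names below are hypothetical illustrations.

def from_csv_row(row: str) -> dict:
    """Source translator: an 'id,name' CSV row -> canonical dict."""
    subject_id, name = row.split(",")
    return {"subject_id": subject_id.strip(), "name": name.strip()}


def from_legacy(record: dict) -> dict:
    """Source translator: legacy key names -> canonical dict."""
    return {"subject_id": record["SUBJ_ID"], "name": record["SUBJ_NM"]}


def to_crm(canonical: dict) -> dict:
    """Destination translator: canonical dict -> CRM-shaped dict."""
    return {"ContactId": canonical["subject_id"], "FullName": canonical["name"]}


def to_warehouse(canonical: dict) -> tuple:
    """Destination translator: canonical dict -> warehouse row tuple."""
    return (canonical["subject_id"], canonical["name"].upper())


TO_CANONICAL = {"csv": from_csv_row, "legacy": from_legacy}
FROM_CANONICAL = {"crm": to_crm, "warehouse": to_warehouse}


def translate(message, source: str, destination: str):
    """Every message makes exactly two hops: source -> canonical -> destination."""
    canonical = TO_CANONICAL[source](message)
    return FROM_CANONICAL[destination](canonical)


def map_counts(sources: int, destinations: int) -> tuple:
    """Return (point-to-point maps, canonical maps) for the given format counts."""
    return sources * destinations, sources + destinations
```

With only two formats on each side the two approaches need the same number of maps, so the benefit shows up as formats are added: under these assumptions, map_counts(4, 5) gives 20 point-to-point maps against 9 canonical ones, and adding a new destination means writing one new translator instead of one per source.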
So where do we implement this in BizTalk solutions? I specifically
recommend this approach when interacting with auto-generated schemas.
The WCF LOB adapters available in BizTalk Server 2009 will
automatically generate schemas that comply with the selected system
interface. Naturally, this schema exhibits extremely tight coupling to
the destination platform. If we use these schemas in our
orchestrations, or worse, in exposed services, we've instantly made
either incremental or far-reaching change a more complicated endeavor.
We've also failed miserably at the SOA goal of abstraction.
That said, in some cases you clearly need to interact with LOB schemas from
within an orchestration. Consider the scenario where some of the LOB
schema fields are populated through enrichment via other messages.
In that case, I need the stateful orchestration environment to host the
more complex message creation.
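As a rough illustration of why enrichment calls for state, the following Python sketch accumulates fields from several incoming messages and only produces the LOB-shaped message once everything required has arrived. This is not BizTalk code; the class name and the three required field names are hypothetical stand-ins for whatever the generated LOB schema demands.

```python
from typing import Optional

# Hypothetical fields that the LOB-shaped message needs before it can be sent.
REQUIRED_FIELDS = ("SubjectId", "SiteCode", "EnrollmentDate")


class LobMessageBuilder:
    """Holds partial state between messages, as an orchestration instance would."""

    def __init__(self) -> None:
        self._fields = {}

    def absorb(self, message: dict) -> None:
        """Copy any required fields found in this message; ignore the rest."""
        for name in REQUIRED_FIELDS:
            if name in message:
                self._fields[name] = message[name]

    def build(self) -> Optional[dict]:
        """Return the completed message, or None while fields are still missing."""
        if all(name in self._fields for name in REQUIRED_FIELDS):
            return dict(self._fields)
        return None
```

A stateless map sees only one message at a time, so it cannot wait for the second or third message the way this builder does.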
Let's demonstrate an additional example. The LOB adapters expose fairly low-level CRUD (create-read-update-delete)
operations on databases such as SQL Server 2008. As we've
discussed in the book so far, a service is a more business-oriented
module that should extend higher than simply slapping a SOAP interface
on low-level APIs. The "Enrollment" document-style schema I created
earlier may actually be a canonical schema that sits in between a
variety of input formats and destination systems. Let's say my database
that stores enrollment information has the following structure:
If I solicit schemas from the SQL Server WCF adapter for these tables,
I'll end up with a series of message types. It would be a fairly poor
decision to expose each schema as a distinct service operation.
Instead, I have an intermediate enrollment schema that abstracts the
more complex set of individual services needed to insert a new subject
enrollment.
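The following Python sketch shows the shape of that abstraction: callers hand over one business-level enrollment, and the split into table-level inserts stays hidden behind it. The Enrollment fields, the table names (Subject, StudyEnrollment), and the column names are all hypothetical, since the real ones depend on the database and on what the adapter generates.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Enrollment:
    """Hypothetical canonical enrollment; the field names are assumptions."""
    subject_name: str
    study_code: str
    site_code: str
    enrolled_on: date


def to_table_inserts(enrollment: Enrollment, subject_id: int) -> List[Dict]:
    """Split one canonical message into table-shaped insert requests.

    The table and column names stand in for whatever the adapter-generated
    schemas would really contain.
    """
    return [
        {"table": "Subject",
         "values": {"SubjectId": subject_id,
                    "Name": enrollment.subject_name}},
        {"table": "StudyEnrollment",
         "values": {"SubjectId": subject_id,
                    "StudyCode": enrollment.study_code,
                    "SiteCode": enrollment.site_code,
                    "EnrolledOn": enrollment.enrolled_on.isoformat()}},
    ]
```

If a table is later renamed or split, only to_table_inserts changes; anything that produces or consumes Enrollment is untouched.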
A well thought-out canonical schema doesn't simply have copies of the
same nodes and structures from all the schemas that rely on it. While
clearly the canonical format needs to capture the right data to serve
its purpose, put consideration into how the logical composite entity
should look and take a more business-centric view of the data.
Specifically, remove redundant fields, reorganize elements into a
natural hierarchy, reconsider data types, and evaluate occurrence and
restriction boundaries.
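Those four adjustments can be sketched outside of XSD as well. In the Python below, a flat system-shaped record (hypothetical keys such as FIRST_NM and ENROLL_DT) is turned into a canonical structure that drops a redundant field, nests the subject, uses a real date type, restricts status to an enumeration, and bounds how many study codes may occur. The specific limit of five study codes and the two status values are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List


class Status(Enum):
    """Restriction: only these values are legal (an assumed list)."""
    PENDING = "Pending"
    ACTIVE = "Active"


@dataclass(frozen=True)
class Subject:
    """Natural hierarchy: name parts live under the subject, not at the root."""
    first_name: str
    last_name: str


@dataclass(frozen=True)
class CanonicalEnrollment:
    subject: Subject
    enrolled_on: date        # a date type instead of a free-form string
    status: Status
    study_codes: List[str]   # occurrence: between 1 and 5 (assumed bounds)

    def __post_init__(self) -> None:
        if not 1 <= len(self.study_codes) <= 5:
            raise ValueError("an enrollment needs between 1 and 5 study codes")


def from_flat_record(record: dict) -> CanonicalEnrollment:
    """Map a hypothetical flat record into the canonical shape."""
    return CanonicalEnrollment(
        # FULL_NM is redundant with FIRST_NM + LAST_NM, so it is not carried over.
        subject=Subject(record["FIRST_NM"], record["LAST_NM"]),
        enrolled_on=date.fromisoformat(record["ENROLL_DT"]),
        status=Status(record["STATUS_CD"]),
        study_codes=[c for c in record["STUDY_CODES"].split(";") if c],
    )
```

In a schema these same decisions would show up as a nested complex type, an xs:date element, an enumeration restriction, and minOccurs/maxOccurs settings.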
Pitfall
Be careful about going overboard on canonical schemas and introducing
accidental complexity. For messaging-only solutions, you may be fine
with the schema formats demanded by the source and destination systems.
When your scenario involves the injection of processing logic (for
example, rules or orchestration), then it becomes more attractive to
decouple components through the use of canonical schemas.