Friday, March 23, 2012

OSB - Design Time Considerations


Design Time Considerations for Proxy Applications

Consider the following design configurations for proxy applications based on your OSB usage and use case scenarios:
  • Avoid creating many OSB context variables that are used just once within another XQuery
    Context variables created using an Assign action are converted to XmlBeans and then reverted to the native XQuery format for the next XQuery. Multiple Assign actions can be collapsed into a single Assign action using a FLWOR expression, with intermediate values created using "let" statements. Avoiding redundant context variables eliminates the overhead of these internal data format conversions. Balance this benefit against the readability of the code and reuse of the variables.
  • Transforming contents of a context variable such as $body.
    Use a Replace action to complete the transformation in a single step. If the entire content of $body is to be replaced, leave the XPath field blank and select "Replace node contents". This is faster than pointing to the child node of $body (e.g. $body/Order) and selecting "Replace entire node". Leaving the XPath field blank eliminates an extra XQuery evaluation.
  • Use $body/*[1] to represent the contents of $body as an input to a Transformation (XQuery / XSLT) resource.
    OSB treats "$body/*[1]" as a special XPath that can be evaluated without invoking the XQuery engine. This is faster than specifying an absolute path pointing to the child of $body. A general XPath like "$body/Order" must be evaluated by the XQuery engine before the primary transformation resource is executed.
  • Enable Streaming for pure Content-Based Routing scenarios.
    Read-only scenarios such as Content-Based Routing can derive better performance from enabling streaming. OSB leverages the partial parsing capabilities of the XQuery engine when streaming is used in conjunction with indexed XPaths. Thus, the payload is parsed and processed only to the field referred to in the XPath. Other than partial parsing, an additional benefit for read-only scenarios is that streaming eliminates the overhead associated with parsing and serialization of XmlBeans.
    The gains from streaming can be negated if the payload is accessed a large number of times for reading multiple fields. If all fields read are located in a single subsection of the XML document, a hybrid approach provides the best performance. (See "Design Considerations for XQuery Tuning" for additional details.)
    The output of a transformation is stored in a compressed buffer format, either in memory or on disk. This buffering adds overhead, so avoid streaming when running out of memory is not a concern.
  • Set the appropriate QOS level and transaction settings.
    Do not set XA or Exactly-Once unless the required reliability level is once and only once and it is possible to use the setting (it is not possible if the client is an HTTP client). If OSB initiates a transaction, it is possible to replace XA with LLR to achieve the same level of reliability.
    OSB can invoke a back-end HTTP service asynchronously if the QOS is "Best-Effort". Asynchronous invocation allows OSB to scale better with long-running back-end services. It also allows Publish over HTTP to be truly fire-and-forget.
  • Disable or delete all log actions.
    Log actions add I/O overhead. Logging also involves an XQuery evaluation, which can be expensive. Writing to a single device (resource or directory) can also result in lock contention.
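
The first recommendation in the list above, collapsing several Assign actions into one, can be sketched as follows. This is only an illustration: it assumes the sample Order structure used in the XQuery tuning section, and the result element names (OrderInfo, customer, total) as well as the three-Assign version described in the comment are hypothetical, not taken from any real proxy.

```xquery
(: Hypothetical starting point: three Assign actions, each creating
   a context variable that is used only once by the next expression:
     Assign  $body/Order[1]/CtrlArea[1]/CustName[1]  to  custName
     Assign  $body/Order[1]/Summary[1]/Total[1]      to  total
     Assign  an OrderInfo element built from $custName and $total
             to  orderInfo
   The same result from a single Assign whose expression is one FLWOR.
   $body is the OSB context variable; nothing is declared here. :)
let $order    := $body/Order[1]
let $custName := $order/CtrlArea[1]/CustName[1]
let $total    := $order/Summary[1]/Total[1]
return
  <OrderInfo>
    <customer>{ data($custName) }</customer>
    <total>{ data($total) }</total>
  </OrderInfo>
```

The "let" variables live only inside the expression, so they are not converted to and from context variables between steps. The trade-off is the one noted above: the intermediate values are no longer visible to, or reusable by, other actions in the pipeline.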

Design Considerations for XQuery Tuning

OSB uses XQuery and XPath extensively for various actions like Assign, Replace, and Routing Table. The following XML structure ($body) is used to explain XQuery and XPath tuning concepts:
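<Order>
  <CtrlArea>
    <CustName>…</CustName>
  </CtrlArea>
  <ItemList>
    <Item name="ACE_Car">20000</Item>
    <Item name=" Ext_Warranty">1500</Item>
    …. a large number of items
  </ItemList>
  <Summary>
    <Total>…</Total>
    <Status>…</Status>
    <Shipping>My Shipping Firm</Shipping>
  </Summary>
</Order>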
  • Avoid the use of double forward slashes ("//") in XPaths.
    Although $body//CustName returns the same value as $body/Order/CtrlArea/CustName, it performs much worse. "//" matches all occurrences of a node regardless of their location in the XML tree, so the entire depth and breadth of the tree has to be searched for the pattern that follows it. Use "//" only if the exact location of a node is not known at design time.
  • Index XPaths where applicable.
    An XPath can be indexed by simply adding "[1]" after each node of the path. XQuery is a declarative language and an XPath can return more than one node; it can return an array of nodes. $body/Order/CtrlArea/CustName implies returning all instances of Order under $body, all instances of CtrlArea under Order, and all instances of CustName under CtrlArea. Therefore, the entire document has to be read in order to process the XPath correctly. If you know that there is a single instance of Order under $body, a single instance of CtrlArea under Order, and a single instance of CustName under CtrlArea, you can rewrite the XPath as $body/Order[1]/CtrlArea[1]/CustName[1].
    The second XPath implies returning the first instances of the child nodes. Thus, only the top part of the document needs to be processed by the XQuery engine resulting in better performance. Indexing is key to processing only what is needed.
    Indexing should not be used when the expected return value is an array of nodes. For example, $body/Order[1]/ItemList[1]/Item returns all "Item" nodes, but $body/Order[1]/ItemList[1]/Item[1] only returns the first item node. Another example is an XPath used to split a document in a "for" action.
  • Extract frequently used parts of a large XML document as intermediate variables within a FLWOR expression
    An intermediate variable can be used to store the common context for multiple values. Sample XPaths with common context:
    $body/Order[1]/Summary[1]/Total, $body/Order[1]/Summary[1]/Status, $body/Order[1]/Summary[1]/Shipping
    The above XPaths can be changed to use an intermediate variable:
    let $summary := $body/Order[1]/Summary[1]
    return ($summary/Total, $summary/Status, $summary/Shipping)
    Using intermediate variables consumes more memory but reduces redundant XPath processing.
  • Using a Hybrid Approach for read-only scenarios with Streaming
    The gains from streaming can be negated if the payload is accessed a large number of times for reading multiple fields. If all fields read are located in a single subsection of the XML document, a hybrid approach provides the best performance. The hybrid approach consists of enabling streaming at the proxy level and assigning the relevant subsection to a context variable. The individual fields can then be accessed from this context variable.
    For example, the fields "Total" and "Status" can be retrieved using three Assign actions:
    Assign "$body/Order[1]/Summary[1]" to "foo"
    Assign "$foo/Total" to "total"
    Assign "$foo/Status" to "status"
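
The indexing and intermediate-variable recommendations above can be tried outside OSB with a stand-alone query such as the sketch below. It makes several assumptions: $body is bound inline to a plain Body element standing in for the OSB context variable, the CustName, Total and Status values are invented sample data, and the Result wrapper and its child element names are hypothetical. It shows only what each XPath returns, not how fast the OSB engine evaluates it.

```xquery
(: Stand-alone sketch for any XQuery 1.0 processor.
   In OSB, $body is supplied by the runtime; here it is bound inline.
   CustName, Total and Status values are invented for illustration. :)
let $body :=
  <Body>
    <Order>
      <CtrlArea><CustName>Mary</CustName></CtrlArea>
      <ItemList>
        <Item name="ACE_Car">20000</Item>
        <Item name="Ext_Warranty">1500</Item>
      </ItemList>
      <Summary>
        <Total>21500</Total>
        <Status>Shipped</Status>
        <Shipping>My Shipping Firm</Shipping>
      </Summary>
    </Order>
  </Body>
(: Indexed path to the shared subsection, evaluated once and reused. :)
let $summary := $body/Order[1]/Summary[1]
return
  <Result>
    <custName>{ data($body/Order[1]/CtrlArea[1]/CustName[1]) }</custName>
    <firstItemOnly>{ count($body/Order[1]/ItemList[1]/Item[1]) }</firstItemOnly>
    <allItems>{ count($body/Order[1]/ItemList[1]/Item) }</allItems>
    <total>{ data($summary/Total[1]) }</total>
    <status>{ data($summary/Status[1]) }</status>
  </Result>
```

With this sample data the two counts come out as 1 and 2. This is why the last step of a path should stay unindexed when all matching nodes are wanted, as in the Item example above.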
