Friday 14 November 2008

Data Aggregation via JMX and the Grid

Following my post on JMX and the Grid, which got picked up by The Server Side and by Nati Shalom's blog, I thought I'd add some brief thoughts on another complementary JMX pattern we've used in conjunction with grid applications.

The original post talks about collating client-side access to a distributed population of JMX MBeans that comprise the application. In essence, the technique described in that post is to use a JavaSpace (or other rendezvous technology) to act as a point of registration and lookup. This gives the client side access to (say) a list of MBeans for each instance of a given type of component, wherever it's running in the grid, and the ability for the agent to communicate with any MBean to get/set attributes, invoke management operations or hear notification events.

The client-side (or "agent") of JMX is by nature pretty dumb. Generally the agent uses metadata info about the MBean to generate a UI on the fly. Although it's possible to write custom JMX agents for your application (and we do that), to make sure your management MBeans will work with any JMX agent you really have to design to the lowest common denominator agent.

So let's consider the use-case where our MBeans are collecting stats about (say) our application's performance: average task execution time, latency etc. Stats can be produced for each individual component and made available via the MBean, but we also want to be able to see an aggregated view of the statistics for the application as a whole.

Aggregation

To deal with the dumb JMX agent we really need to collate and aggregate the data server-side. I'm not going to dwell too much on the approach to this, other than to say aggregation might be done in one of three ways:
  1. Writing a server-side component that collects stats from individual MBeans and aggregates them. In this case, the approach outlined in my previous JMX piece might be handy.
  2. Tapping into the underlying components using some application-specific API and aggregating from there.
  3. Having the components publish their stats into a JavaSpace and having an aggregating component attached to the space perform the aggregation.
Focussing on the last of these approaches for a moment, using the space as a rendezvous point for collation and aggregation has some merits: publication of stats as POJOs to the space is easy and listening to those publications to trigger aggregation is also simple to implement.
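
As a rough illustration of that third approach, here's a minimal sketch of the aggregation step in plain Java. It makes some loud assumptions: an in-memory list stands in for the space, and the TaskStats and StatsAggregator names are made up for the example. A real version would read the published entries from the JavaSpace (for instance when notified of a write) rather than from a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stats POJO that each component would publish to the space.
class TaskStats {
    final String componentId;
    final long tasksExecuted;
    final double avgExecutionMillis;

    TaskStats(String componentId, long tasksExecuted, double avgExecutionMillis) {
        this.componentId = componentId;
        this.tasksExecuted = tasksExecuted;
        this.avgExecutionMillis = avgExecutionMillis;
    }
}

// Hypothetical aggregator; the list is an in-memory stand-in for the space.
class StatsAggregator {
    private final List<TaskStats> published = new ArrayList<>();

    // Stand-in for a component writing its latest stats to the space.
    void publish(TaskStats stats) {
        published.add(stats);
    }

    // Total number of tasks executed across all components.
    long totalTasks() {
        long total = 0;
        for (TaskStats s : published) {
            total += s.tasksExecuted;
        }
        return total;
    }

    // Application-wide average, weighted by each component's task count.
    double overallAvgExecutionMillis() {
        long total = totalTasks();
        if (total == 0) {
            return 0.0;
        }
        double weighted = 0.0;
        for (TaskStats s : published) {
            weighted += s.avgExecutionMillis * s.tasksExecuted;
        }
        return weighted / total;
    }
}
```

The weighting matters: a worker that ran 100 tasks at 20 ms and another that ran 300 tasks at 40 ms give an application-wide average of 35 ms, not the 30 ms you'd get by simply averaging the two averages.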

Publication to JMX

Regardless of the approach to aggregation, we also need a technique for making the aggregated stats available to the dumb JMX agent. The aggregating component needs to expose an MBean to provide access to the aggregated data values. In a simple application these can be held as in-memory values within the aggregating component. However, to deal with large data volumes and to provide fault-tolerance we prefer the following approach:
  1. Aggregating components write the results back to the JavaSpace
  2. A stateless component provides an MBean that acts as a facade to the aggregated data, which is actually fetched on demand from the space
Using the GigaSpaces product we can rely on the space itself to manage live reliable backup of our aggregated data and the Service Grid to host and maintain our stateless aggregated MBean facade.
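
To show the shape of the facade idea, here's a minimal sketch using only the standard javax.management API. The assumptions are mine: a Supplier stands in for "read the latest aggregate from the space", and the class names, attribute names and ObjectName are all hypothetical. The point is simply that the MBean holds no state of its own, so any instance of it can be restarted or relocated without losing data.

```java
import java.util.function.Supplier;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class AggregateFacadeSketch {

    // Management interface; a standard MBean interface has to be public.
    public interface AggregatedStatsMBean {
        long getTotalTasks();
        double getAverageExecutionMillis();
    }

    // Hypothetical entry that the aggregator writes back to the space.
    public static class Snapshot {
        final long totalTasks;
        final double averageExecutionMillis;

        public Snapshot(long totalTasks, double averageExecutionMillis) {
            this.totalTasks = totalTasks;
            this.averageExecutionMillis = averageExecutionMillis;
        }
    }

    // Stateless facade: every getter fetches the current snapshot on demand.
    public static class AggregatedStats implements AggregatedStatsMBean {
        private final Supplier<Snapshot> spaceRead; // stand-in for a space read

        public AggregatedStats(Supplier<Snapshot> spaceRead) {
            this.spaceRead = spaceRead;
        }

        @Override
        public long getTotalTasks() {
            return spaceRead.get().totalTasks;
        }

        @Override
        public double getAverageExecutionMillis() {
            return spaceRead.get().averageExecutionMillis;
        }
    }

    // Registers the facade with the given MBeanServer under the given name.
    static ObjectName register(MBeanServer server, Supplier<Snapshot> spaceRead, String name) {
        try {
            ObjectName objectName = new ObjectName(name);
            server.registerMBean(new AggregatedStats(spaceRead), objectName);
            return objectName;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads an attribute the way a generic JMX agent would.
    static Object read(MBeanServer server, ObjectName name, String attribute) {
        try {
            return server.getAttribute(name, attribute);
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the getters go back to the supplier on every call, an agent that refreshes the TotalTasks attribute always sees whatever the aggregator last wrote, at the cost of one read per attribute access.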

Summary

Although in our simple aggregating stats use-case we might not care about dropping data or fault-tolerance, there are many real-world examples where we would care far more about these issues. The bare-bones architecture of using the space as both a rendezvous point and a safe holding repository, with access via stateless service components, applies well to those cases.

One of the reasons I'm a fan of GigaSpaces and space-based architectures is that a number of architectural choices that are traditionally hard-wired (transactional or non-transactional operation, synchronous or asynchronous replication) can be changed through configuration alone. This enables common design patterns (and therefore components) to be applied to a wide range of application problems, because the trade-off between data integrity and performance can be tweaked at a late stage of application assembly.

I know this last paragraph is a bit of a leap from the initial topic, but I'll return to this theme in later postings which discuss other use-cases where data integrity and fault-tolerance are a significant issue, in an attempt to make it stand up.

Monday 3 November 2008

JMX For Grid-Based Applications

This post describes a (fairly) simple technique for getting a unified view of JMX management beans in a distributed application. It is proving useful in a number of application projects we are doing at PSJ where the application MBeans, for example worker beans, are distributed around the network when deployed in the GigaSpaces Service Grid. The technique adds a new protocol that enables JMX agents and JMX servers to find each other in a network deployment. The protocol uses the GigaSpace as the rendezvous point, but the approach can easily be adapted to work with other network registration and rendezvous technologies.

A Brief History of JMX

JMX has been around for a long time as one of the core APIs in J2EE, and recent versions of Java have seen it incorporated into the JVM to provide memory and other basic stats. The basics of JMX are pretty simple: you instrument your applications by providing one or more management beans (MBeans). MBeans expose read-only and read-write attributes, plus operations, which together provide information and enable the application's management characteristics to be controlled. MBeans are published to the world by registering them with an MBeanServer. A separate agent then finds the MBeans in the MBeanServer and provides a UI to interact with them. Often this is a generic user interface built from the properties the MBean exposes, but it can be as specific to the application as required.
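
For anyone who hasn't used the API, here's a minimal sketch of those basics using only the JDK. The Worker bean, its attributes and its ObjectName are invented for the example; the javax.management calls are the standard ones. It registers a bean, then drives it through the MBeanServer by attribute and operation name, which is essentially what a generic agent does.

```java
import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class WorkerMBeanSketch {

    // Standard MBean interface: must be public and named <Class>MBean.
    public interface WorkerMBean {
        long getTasksExecuted();        // read-only attribute "TasksExecuted"
        int getPoolSize();              // read-write attribute "PoolSize"
        void setPoolSize(int poolSize);
        void resetCounters();           // management operation
    }

    // Hypothetical application component instrumented as an MBean.
    public static class Worker implements WorkerMBean {
        private long tasksExecuted;
        private int poolSize = 4;

        void taskCompleted() { tasksExecuted++; }

        @Override public long getTasksExecuted() { return tasksExecuted; }
        @Override public int getPoolSize() { return poolSize; }
        @Override public void setPoolSize(int poolSize) { this.poolSize = poolSize; }
        @Override public void resetCounters() { tasksExecuted = 0; }
    }

    // Registers a worker with the JVM's own MBeanServer, manipulates it by
    // name only, and unregisters it again before returning.
    static String demo() {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        try {
            ObjectName name = new ObjectName("sketch:type=Worker,name=worker-1");
            Worker worker = new Worker();
            worker.taskCompleted();
            mbs.registerMBean(worker, name);
            try {
                mbs.setAttribute(name, new Attribute("PoolSize", 8));
                mbs.invoke(name, "resetCounters", new Object[0], new String[0]);
                return "PoolSize=" + mbs.getAttribute(name, "PoolSize")
                        + ", TasksExecuted=" + mbs.getAttribute(name, "TasksExecuted");
            } finally {
                mbs.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling WorkerMBeanSketch.demo() returns "PoolSize=8, TasksExecuted=0": the pool size was changed through the attribute and the counter was cleared by the operation.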

The original JMX specs were charmingly vague about where you might find an MBeanServer to register with or how you'd connect to one from the agent side. More recently, the JVM has provided a default MBeanServer of its own. As well as providing management capability for the JVM, this MBeanServer can act as a repository for application MBeans. The jconsole tool provided with Java 1.6 gives you a decent, if primitive, GUI to see the MBeans registered with the JVM's MBeanServer and interact with them.

JMX JSR-160 Connectors

JMX also now provides connectors that facilitate remote connection between agent and server, and come in two flavours: server-side and client-side. The server-side connector enables you to set up a connection channel to talk to the MBeanServer remotely, for example by exporting an RMI stub. The client-side connector provides the means of binding to the server-side connector and establishing a connection, for example via JNDI lookup of the server-side connector's RMI stub.

If you use Spring, you can simply declare connectors in Spring configuration, as follows:

<bean id="serverConnector"
class="org.springframework.jmx.support.ConnectorServerFactoryBean">
<property name="objectName" value="system:name=spaceconnector"/>
<property name="serviceUrl"
value="service:jmx:rmi://localhost/jndi/rmi://localhost:1099/test"/>
<property name="environment">
<map>
<entry key="jmx.remote.jndi.rebind" value="true" />
</map>
</property>
</bean>

The client-side connector is declared as:

<bean id="clientConnector"
class="org.springframework.jmx.support.MBeanServerConnectionFactoryBean">
<property name="serviceUrl"
value="service:jmx:rmi://localhost/jndi/rmi://localhost:1099/test"/>
</bean>

JMX Issues Specific to the Grid

For monolithic applications that run in a single JVM this architecture works fine, but when applied to distributed applications running in the grid we have two additional problems:

  1. How does the client-side find all the MBeans that comprise the application?
  2. How can it interact with the distributed parts of the application from a single client?

This has become a real-world problem for PSJ in implementing grid-deployed applications within the GigaSpaces Service Grid and other grid fabrics. Here's the nub of the solution we came up with to solve these problems. In essence the solution has two parts:

  • Provide a means of binding the MBeanServers from individual JVMs into a community that represents the application on the grid.
  • Collate the MBeanServers together to provide a single client-side federated connector.

Communities of MBeanServers

Fortunately JMX provides a very open means of extending the protocols that are supported in bringing JMX server and client connections together. By providing an additional protocol we can control the server-side connector's registration process and the client-side connector's binding mechanism. There are various options to achieve our ends here, including naming hierarchies in JNDI and Jini lookup groups, but the one we settled on uses a JavaSpace to act as a rendezvous point for servers and clients. The rationale here is partly simple expedience: the applications to which we've applied this pattern already use a GigaSpace to share state, and so in some senses the community is bound by the fact that all components comprising the application point at the same GigaSpace instance. The second rationale is that adding space-based registration and client-side lookup using the GigaSpace is very quick and easy to write, using simple POJOs to represent the registration.

Adding a Space-Based Server-Side JMX Connector

The standard RMI-based JMX URL can be changed by replacing the service:jmx:rmi part with service:jmx:space to indicate that we want to use a space protocol. The JMX spec lets us add handler classes to deal with the "space" protocol referenced in the URL on the server and client sides. On the server side we need to provide a specific ServerProvider class implementing JMXConnectorServerProvider. The ServerProvider actually just piggy-backs on the existing RMI one and, in addition to RMI stub registration, places an entry in the JavaSpace with the RMI connection URL needed by the client. To finish off, using the Spring approach we simply need to declare the server connector to use the new protocol:

<bean id="serverConnector"
class="org.springframework.jmx.support.ConnectorServerFactoryBean">
<property name="objectName" value="system:name=spaceconnector"/>
<property name="serviceUrl"
value="service:jmx:space://localhost/jndi/rmi://localhost:1099/test"/>
<property name="environment">
<map>
<entry key="jmx.remote.jndi.rebind" value="true" />
<entry key="jmx.remote.protocol.provider.pkgs"
value="com.psjsolutions.sflib.jmx.protocols"/>
<entry key="space"><ref bean="space"/></entry>
</map>
</property>
</bean>

Notice that we've had to reference our protocol support package in the environment map for the connector factory. We can also place protocol-specific properties in the environment - in this case the space we want to use to hold the entries.
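
For the curious, the server-side provider might look roughly like the sketch below. This is an illustration under assumptions rather than our library code: an in-memory list stands in for the space, and the class name is made up. To be discovered through jmx.remote.protocol.provider.pkgs a real provider has to be called ServerProvider and live in a sub-package named after the protocol (here, a package ending in .space), and it would publish the entry to the space handed over in the environment map.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import javax.management.MBeanServer;
import javax.management.remote.JMXConnectorServer;
import javax.management.remote.JMXConnectorServerFactory;
import javax.management.remote.JMXConnectorServerProvider;
import javax.management.remote.JMXServiceURL;

// Hypothetical provider for the "space" protocol.
class SpaceServerProviderSketch implements JMXConnectorServerProvider {

    // In-memory stand-in for the space entries that advertise each connector.
    static final List<String> REGISTRATIONS = new CopyOnWriteArrayList<>();

    // Same host, port and path as the space URL, but with the rmi protocol.
    static JMXServiceURL toRmiUrl(JMXServiceURL spaceUrl) throws MalformedURLException {
        return new JMXServiceURL("rmi", spaceUrl.getHost(), spaceUrl.getPort(),
                spaceUrl.getURLPath());
    }

    @Override
    public JMXConnectorServer newJMXConnectorServer(JMXServiceURL serviceURL,
            Map<String, ?> environment, MBeanServer mbeanServer) throws IOException {
        if (!"space".equals(serviceURL.getProtocol())) {
            throw new MalformedURLException("Not a space URL: " + serviceURL);
        }
        JMXServiceURL rmiUrl = toRmiUrl(serviceURL);
        // Piggy-back on the standard RMI connector server.
        JMXConnectorServer server =
                JMXConnectorServerFactory.newJMXConnectorServer(rmiUrl, environment, mbeanServer);
        // A real provider would write an entry to the space at this point.
        REGISTRATIONS.add(rmiUrl.toString());
        return server;
    }

    // Creates (but does not start) a connector server and returns the RMI
    // URL that would be advertised to clients.
    static String demo(String spaceServiceUrl, MBeanServer mbeanServer) {
        try {
            JMXServiceURL url = new JMXServiceURL(spaceServiceUrl);
            new SpaceServerProviderSketch().newJMXConnectorServer(
                    url, Collections.<String, Object>emptyMap(), mbeanServer);
            return toRmiUrl(url).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this simplified form the registration is recorded when the connector server is created; tying it to the connector's start and stop would be the more careful choice, so that the entry only exists while the RMI stub is actually bound.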

The Client-Side Connector

I said earlier that there are two problems to overcome in applying JMX to the grid: one being rendezvous/binding and the other being obtaining a collated view of all the MBeans out there. Both these issues are addressed by the client-side connector. First off we need a client-side connector that can look in the space to find all the MBeanServer connection details for the networked community. This is pretty easy. In symmetry with the server-side provider all we need to do is to write a ClientProvider that understands the space protocol and provides the agent with a client connector. In Spring this looks like:

<bean id="clientConnector"
class="org.springframework.jmx.support.MBeanServerConnectionFactoryBean">
<property name="serviceUrl"
value="service:jmx:space:///jini://*/*/${javaspace.name}"/>
</bean>

All we've done here is to replace the RMI-based URL with one that specifies the space protocol and provides the URL of the space. The ClientProvider parses the service URL, extracts the space URL, uses it to bind to the space and then retrieves all the server connector details. This brings us to the final part, which is to provide a collated "virtual" server connection that sits between the client-side user code and the set of MBeanServers.
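
The URL handling on the client side can be sketched as follows. Again this is only an illustration: the class is hypothetical, a Map stands in for the space lookup, and "mySpace" in the usage note is a made-up space name. The only real API used is javax.management.remote.JMXServiceURL, which treats everything after the (empty) host as the URL path.

```java
import java.net.MalformedURLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper showing what a space ClientProvider does with its URL.
class SpaceClientLookupSketch {

    // Extracts the space URL carried in the path of a service:jmx:space URL.
    static String spaceUrlOf(String serviceUrl) {
        try {
            JMXServiceURL url = new JMXServiceURL(serviceUrl);
            if (!"space".equals(url.getProtocol())) {
                throw new IllegalArgumentException("Not a space URL: " + serviceUrl);
            }
            String path = url.getURLPath();
            // Drop the leading slash that separates host from path.
            return path.startsWith("/") ? path.substring(1) : path;
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Stand-in for the space lookup: maps a space URL to the connector URLs
    // that server-side connectors have registered there.
    static List<String> lookupServerUrls(String serviceUrl,
            Map<String, List<String>> registrationsBySpace) {
        List<String> found = registrationsBySpace.get(spaceUrlOf(serviceUrl));
        return found == null ? new ArrayList<String>() : new ArrayList<String>(found);
    }
}
```

For example, spaceUrlOf("service:jmx:space:///jini://*/*/mySpace") yields "jini://*/*/mySpace", which is the URL the provider would then use to bind to the space.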

Federating the Client-Side Server Connection

As far as the client-side code is concerned, the connector it gets is an object that implements javax.management.MBeanServerConnection. This API is actually pretty straightforward: MBeans are found by name and query patterns, and the ObjectName handles that come back are then used to get/set attributes and invoke operations. In Java 7 there may well be a formally supported means of cascading or handing on these requests, but as of the time of writing there's no capability out of the box. We therefore implemented a FederatedMBeanServerConnection class that picks up a number of MBeanServer connections from the space, connects to them and then delegates operations to the set of servers, effectively acting as a multiplexer.
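
The delegation idea boils down to something like the sketch below. It is deliberately not a full MBeanServerConnection implementation, just two representative fan-out methods over a list of connections, and the class and method names are hypothetical rather than those of our FederatedMBeanServerConnection.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical multiplexer over several MBeanServer connections.
class FederatedConnectionSketch {
    private final List<MBeanServerConnection> servers;

    FederatedConnectionSketch(List<? extends MBeanServerConnection> servers) {
        this.servers = new ArrayList<MBeanServerConnection>(servers);
    }

    // Union of the names matching the pattern on any of the servers.
    Set<ObjectName> queryNames(ObjectName pattern) {
        Set<ObjectName> names = new LinkedHashSet<ObjectName>();
        try {
            for (MBeanServerConnection server : servers) {
                names.addAll(server.queryNames(pattern, null));
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    // Reads the attribute from every server on which the MBean is registered.
    List<Object> getAttributeFromAll(ObjectName name, String attribute) {
        List<Object> values = new ArrayList<Object>();
        try {
            for (MBeanServerConnection server : servers) {
                if (server.isRegistered(name)) {
                    values.add(server.getAttribute(name, attribute));
                }
            }
        } catch (IOException | JMException e) {
            throw new IllegalStateException(e);
        }
        return values;
    }
}
```

A handy way to try this without any remote connections is to pass in two in-process servers created with MBeanServerFactory.newMBeanServer(), since MBeanServer extends MBeanServerConnection. Every MBeanServer registers its delegate under the same name (MBeanServerDelegate.DELEGATE_NAME), so queryNames returns a single name while getAttributeFromAll returns one value per server.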

In Summary

As we like to carry these common solutions to common problems from job to job, we've added the capabilities described here to the foundation libraries that we often use on client engagements. By separating the federation/multiplexing capability from the space protocol we can use this approach in a number of different distributed architectures, and will probably add protocols as the need arises. The beauty of the approach is that neither the client-side nor the server-side code that uses the connectors knows what's being done under the covers. It's all abstracted into protocol URLs and therefore simple configuration changes.

Are there any gotchas? The one we've hit so far is possible ambiguity in naming MBeans. In many senses the set of MBeanServers in the grid can be thought of as a single virtual networked MBeanServer. However, whilst you can't register the same named MBean twice with a single MBeanServer, there's nothing to stop you registering the same named MBean with different MBeanServers. In fact, in a grid-style environment where a given application unit is deployed as many replicated instances, it's quite likely that you will hit this issue. Why is this important? Well, the inability to enforce unique names can lead to ambiguity when we use the virtual MBeanServer, as multiple MBeans with the same name can be found in the virtual server. We can work around this to a large extent by collating MBean queries and tagging the owning server in the ObjectName handed back through the client connector. This disambiguates things in most of the use cases, including attribute get/set and operation invocation. However, if the client code asks for a specific MBean by name then all we can do is return the first one we encounter. This is not ideal, but in practice the way to solve this is to use naming strategies when registering MBeans server-side. Spring has the idea of a pluggable ObjectNamingStrategy for MBeans that are exported by Spring. Using an instance-based strategy (such as Spring's IdentityNamingStrategy) resolves this problem for Spring MBeans and, if it is a real issue, this approach could also be adopted by application code that is using explicit creation/registration of MBeans.
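
The tagging workaround can be sketched with nothing more than ObjectName's key properties. The "server" key and the helper names below are assumptions made for the example; any key that doesn't clash with the application's own keys would do, and the server id should be a plain token without ObjectName special characters.

```java
import java.util.Hashtable;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helpers for tagging names with the server that owns them.
class ServerTagSketch {
    static final String SERVER_KEY = "server"; // assumed, not a JMX convention

    // Returns a copy of the name with an extra key identifying the server.
    static ObjectName tag(ObjectName name, String serverId) {
        Hashtable<String, String> keys = new Hashtable<String, String>(name.getKeyPropertyList());
        keys.put(SERVER_KEY, serverId);
        return build(name.getDomain(), keys);
    }

    // Strips the tag again, giving the name to use on the owning server.
    static ObjectName untag(ObjectName tagged) {
        Hashtable<String, String> keys = new Hashtable<String, String>(tagged.getKeyPropertyList());
        keys.remove(SERVER_KEY);
        return build(tagged.getDomain(), keys);
    }

    // Identifies which server a tagged name should be routed to.
    static String serverOf(ObjectName tagged) {
        return tagged.getKeyProperty(SERVER_KEY);
    }

    private static ObjectName build(String domain, Hashtable<String, String> keys) {
        try {
            return new ObjectName(domain, keys);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

The federated connection would tag names on the way out of a query and, when a tagged name comes back in for a get/set or invoke, use serverOf to pick the connection and untag to recover the real name. An untagged name coming in is exactly the ambiguous case described above.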