OpenIDM 2.0 relies on database availability and a multiversion concurrency control (MVCC) layer to operate concurrently and safely on the same data set. With its RESTful architecture, OpenIDM 2.0 takes advantage of the proven scalability and reliability of HTTP load balancers and associated infrastructure. Further enhancements to cluster awareness are on the roadmap, but it is already possible to deploy OpenIDM 2.0 in an active-active or active-passive setup.
You must have a good understanding of OpenIDM 2.0 to set up high availability (HA) so that all nodes cooperate correctly.
- Active-passive requires an external mechanism or script to detect the failure of one node and to bootstrap/activate another.
- Active-active setups require that you take care to prevent different nodes from unnecessarily duplicating each other's actions, such as triggering the same reconciliation at the same time, and to make sure that available OpenIDM nodes share tasks appropriately.
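The external failure detection that an active-passive setup needs can be sketched as follows. This is a minimal illustration, not part of OpenIDM: the `FailureDetector` class and its `threshold` parameter are hypothetical, and the caller is assumed to supply the result of each health probe (for example, an HTTP request to the active node) and to perform the actual activation of the standby node.

```python
class FailureDetector:
    """Hypothetical helper: decides when a standby node should be activated."""

    def __init__(self, threshold=3):
        # Number of consecutive failed probes required before failing over.
        self.threshold = threshold
        self.failures = 0
        self.failed_over = False

    def observe(self, healthy):
        """Record one probe result; return True exactly once, when fail-over should start."""
        if self.failed_over:
            return False
        # A successful probe resets the counter, so transient errors are tolerated.
        self.failures = 0 if healthy else self.failures + 1
        if self.failures >= self.threshold:
            self.failed_over = True
            return True
        return False
```

Requiring several consecutive failures is a design choice that avoids activating the standby node because of a single dropped request.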
High-level recommendations:
- Use external mechanisms such as HTTP load balancers to detect failed nodes and redirect traffic.
- Use specific nodes for specific tasks. For example, reserve some nodes as targets for HTTP load balancers, others for scheduled reconciliation.
- Keep in mind that scheduled tasks are not aware of other nodes out of the box. If a schedule should run only on a specific node, or only once per cluster of nodes, you can achieve this with the supported extension points: custom scripts, configuration, and state in the repository. Solutions range from simply associating different schedules with each node to sophisticated fail-over, in which configuration optionally associates a task with a specific node, and state in the repository tracks which node owns which task and whether the task finished successfully or failed.
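The idea of tracking task ownership through state in the repository can be sketched as follows. This is an illustration under stated assumptions, not OpenIDM code: `InMemoryRepo` is a hypothetical stand-in for a shared repository that rejects writes against a stale revision (the MVCC behavior described earlier), and `try_claim`, the `_rev`, `owner`, `expires`, and `status` fields, and the lease duration are all invented for the example.

```python
class RevisionConflict(Exception):
    """Raised when a write is attempted against a stale revision."""


class InMemoryRepo:
    """Hypothetical stand-in for a shared repository with MVCC-style revisions."""

    def __init__(self):
        self._docs = {}

    def read(self, key):
        doc = self._docs.get(key)
        return dict(doc) if doc else None

    def write(self, key, value, expected_rev):
        current = self._docs.get(key)
        current_rev = current["_rev"] if current else None
        # Reject the write if another writer changed the record in the meantime.
        if current_rev != expected_rev:
            raise RevisionConflict(key)
        new_rev = 0 if current_rev is None else current_rev + 1
        self._docs[key] = {**value, "_rev": new_rev}


def try_claim(repo, task, node, now, lease_seconds=300):
    """Return True if `node` now owns `task`, False if another node holds a live lease."""
    doc = repo.read(task)
    if doc and doc["owner"] != node and doc["expires"] > now:
        return False
    rev = doc["_rev"] if doc else None
    record = {"owner": node, "expires": now + lease_seconds, "status": "running"}
    try:
        repo.write(task, record, rev)
    except RevisionConflict:
        # Another node claimed the task between our read and our write.
        return False
    return True
```

In this sketch, each node would call `try_claim` when its schedule fires and run the task only if the call returns `True`. The lease expiry lets another node take over a task whose owner has stopped responding.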
Reconciliation can be set up to correct the state of your data over time, which gives you leeway when setting up the service without the danger of missing changes or losing data. To handle fail-over of reconciliation itself, a simple approach is to set up one node to reconcile first, and another node to check later whether reconciliation completed as expected and to reconcile if required. Running the same reconciliation task more than once at the same time is not technically a problem, but it can produce confusing results and is not recommended.
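The check performed by the second node might look like the following sketch. It assumes, purely for illustration, that the first node records a status entry with hypothetical `state` and `finished_at` fields somewhere both nodes can read; the function name, field names, and grace period are not part of OpenIDM.

```python
def backup_should_reconcile(status, now, scheduled_at, grace_seconds):
    """Decide whether the second node should run reconciliation itself.

    `status` is the record left by the first node (or None if there is none).
    All times are seconds on a clock shared by both nodes (an assumption).
    """
    # Too early: give the first node time to finish its own run.
    if now < scheduled_at + grace_seconds:
        return False
    # No record at all: the first node never started.
    if status is None:
        return True
    # A successful run that finished within this schedule window needs no repeat.
    if status.get("state") == "success" and status.get("finished_at", 0) >= scheduled_at:
        return False
    # Failed, stale, or still marked as running after the grace period:
    # treated here as a stalled run (an assumption), so the second node reconciles.
    return True
```

Choosing a grace period longer than the expected duration of a reconciliation run reduces the chance that both nodes reconcile at the same time.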