This page documents the steps required to use OpenDJ as a repository for OpenIDM.
The primary goal is to support generic CREST repositories. OpenDJ's rest2ldap module currently exposes a CREST CollectionResourceProvider that should be a suitable repository for IDM.
Initializing the Repo
Configuring the CollectionResourceProvider
Rest2LDAP provides a builder for easily configuring and creating an LDAPCollectionResourceProvider. We can use the Rest2LDAP.builder() method to instantiate the builder and then use the following methods to configure and build it:
- ldapConnectionFactory(ConnectionFactory factory)
  - Sets the connection factory
- configureMapping(JsonValue configuration)
  - Configures the JSON-to-LDAP mapping
- build()
  - Builds the resource provider
The ConnectionFactory can be created using the Rest2LDAP.configureConnectionFactory(JsonValue configuration) method. See the opendj.rest2ldap-servlet.json configuration file for an example of the connection factory config and two mapping configuration examples.
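Wiring these pieces together might look like the following sketch. It assumes only the builder methods listed above; the `connectionFactoryConfig` and `mappingConfig` variables are hypothetical JsonValues loaded from a configuration file such as the one referenced above.

```java
// Sketch only: uses the Rest2LDAP builder methods described above.
// connectionFactoryConfig and mappingConfig are hypothetical JsonValues
// parsed from the rest2ldap JSON configuration.
ConnectionFactory factory =
        Rest2LDAP.configureConnectionFactory(connectionFactoryConfig);

CollectionResourceProvider users = Rest2LDAP.builder()
        .ldapConnectionFactory(factory)
        .configureMapping(mappingConfig)
        .build();
```

The resulting provider could then be registered as the backing store for a repo route.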
OpenDJ can be embedded in the openidm-repo-opendj module and started at startup, similarly to how the embedded OrientDB is started. There is a good reference implementation in the AMSessionStore project in OpenAM. In OpenIDM we will probably need to initialize and start the embedded OpenDJ instance in the bundle Activator (see SetupOpenDJ.main() and EmbeddedOpenDJ.setup()), and register a RepoBootService implementation.
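A rough sketch of what that Activator could look like. EmbeddedOpenDJ and OpenDJRepoBootService are hypothetical names here (modeled on OpenAM's EmbeddedOpenDJ.setup() referenced above); only BundleActivator and registerService are standard OSGi API.

```java
// Sketch only: EmbeddedOpenDJ and OpenDJRepoBootService are hypothetical.
import org.osgi.framework.BundleActivator;
import org.osgi.framework.BundleContext;

public class Activator implements BundleActivator {
    @Override
    public void start(BundleContext context) throws Exception {
        // Start the embedded DJ instance before any repo consumers come up.
        EmbeddedOpenDJ.setup();
        // Register a RepoBootService so boot-time repo access works.
        context.registerService(RepoBootService.class.getName(),
                new OpenDJRepoBootService(), null);
    }

    @Override
    public void stop(BundleContext context) throws Exception {
        EmbeddedOpenDJ.shutdown();
    }
}
```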
For a generic CREST repo we would likely need a configurable revision property that would be injected into _rev.
LdapCollectionResourceProvider currently supports optimistic concurrency control (OCC) via ETags, using the revision property of the request. The ETag attribute is configured via the "etagAttribute" property of the Rest2LDAP config, and likely needs to be requested as part of the "additionalLDAPAttributes" configuration as well.
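A hypothetical mapping fragment illustrating this. Only the "etagAttribute" and "additionalLDAPAttributes" property names come from the discussion above; the surrounding structure and values are illustrative, not the exact shipped schema.

```json
{
    "baseDN": "ou=users,dc=example,dc=com",
    "etagAttribute": "etag",
    "additionalLDAPAttributes": [ "etag" ]
}
```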
Given the eventual-consistency model of replicated DJ, this will only work reliably on a single node. We may need to force MVCC operations to a primary node.
If we wish to achieve elastic scalability, forcing MVCC operations to a single master node will not be ideal. We could potentially implement a soft-shard approach in which we establish MVCC locality on a given node for a subset of the data. The basic example below demonstrates locality based on the first character of a hash of the userName.
Example ID set with Hashes
If we have 4 servers they would get hashes starting with 0-3, 4-7, 8-b, c-f respectively.
||Hash range||Server||Users||
|0-3|Server 1|user.3, user.4, user.5, user.8|
|4-7|Server 2|user.2, user.7, user.10|
|8-b|Server 3|user.1, user.6|
Naturally, as the data set grew, the distribution would become more even.
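The routing described above can be sketched as follows. This is an illustrative example only: the class and method names are hypothetical, and MD5 stands in for whichever hash the implementation would actually use.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

/**
 * Illustrative soft-shard routing: choose a "primary" node for MVCC
 * operations from the first hex character of a hash of the userName.
 * Names here are hypothetical, not part of OpenIDM.
 */
public class ShardRouter {

    /** Returns the server index (0..numServers-1) owning this userName. */
    public static int serverFor(String userName, int numServers) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(userName.getBytes(StandardCharsets.UTF_8));
        // First hex character of the hash, as a value 0..15.
        int firstNibble = (digest[0] & 0xF0) >>> 4;
        // With 4 servers this maps 0-3 -> server 0, 4-7 -> 1,
        // 8-b -> 2, and c-f -> 3, matching the table above.
        return firstNibble / (16 / numServers);
    }

    public static void main(String[] args) throws Exception {
        for (String user : new String[] { "user.1", "user.2", "user.3" }) {
            System.out.println(user + " -> server " + serverFor(user, 4));
        }
    }
}
```

Because the routing is a pure function of the userName, any node can compute which node owns a given entry without coordination.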
An initial implementation would require that all cluster nodes simply be DJ replicas, which would result in all cluster nodes holding a full dataset. The node designated for a segment would simply hold the most recent copy of that segment's data.
Possibility of data loss: data could be lost if a cluster node goes down with changes that have not yet been replicated to the rest of the cluster. It may be possible to designate multiple primary nodes per subset, similar to Elasticsearch (http://www.elastic.co/guide/en/elasticsearch/guide/master/_scale_horizontally.html). This could eventually lead to true sharding with data partitioning, so that not all servers are required to carry a full dataset (at the cost of fault tolerance).
LDAPCollectionResourceProvider currently only supports querying via queryFilter. We can hard-code basic queries such as query-all-ids, as we do in the scheduler, but we will likely want to look into supporting named queryFilters instead of queryExpressions to make these more portable.
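One way named queryFilters might look in repository configuration. This fragment is hypothetical: the "queries" structure and the second query name are illustrative, and only query-all-ids comes from this page.

```json
{
    "queries": {
        "query-all-ids": {
            "_queryFilter": "true",
            "_fields": "_id"
        },
        "get-by-userName": {
            "_queryFilter": "userName eq \"${value}\""
        }
    }
}
```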
For the initial phase of implementation we will be supplementing the existing OrientDB repository. We will create an additional repo.opendj.json file that will sit alongside repo.orientdb.json. Calls for managed users will be intercepted on the router and sent to OpenDJRepoService.
- Create new repository module - OPENIDM-3153
- Create OpenDJRepoService for handling persistence via Rest2LDAP - OPENIDM-3173
- Create configuration mapping for managed users - OPENIDM-3158
- Add support for embedded DJ server - OPENIDM-3161
Phase two of the implementation will consist of adding additional persistence capabilities beyond managed users (config, schedulers, audit). We may be able to drop OrientDB during this phase if all persistence requirements are met. This would likely mean using H2 as the Activiti store.
- Scheduler Persistence - OPENIDM-3171
- Cluster Config - OPENIDM-3172
- Audit Persistence
- Config Persistence - OPENIDM-3169
- Links Persistence - OPENIDM-3163
Activiti is currently very tightly coupled to a SQL relational database via JDBC. There has been some initial work by a core contributor to get Activiti working on top of Neo4j. In the interim, Activiti could be backed by H2, as it is currently with OrientDB.