Skip to end of metadata
Go to start of metadata

The purpose of this page is to discuss a possible solution to Link Expansion on relationships via JDBC repos.  This is based on research done in OPENIDM-8006, in response to OPENIDM-7404

The base of the problem is how to solve the case when the data of a relationship is desired along with the primary objects data.

For example a rest query like the following:

curl --request GET \
--url 'http://localhost:8080/openidm/managed/user?_queryFilter=true&_fields=*,/roles/*' \
--header 'cache-control: no-cache' \
--header 'content-type: application/json' \
--header 'x-openidm-password: openidm-admin' \
--header 'x-openidm-username: openidm-admin'

The above query is asking for all users and their associated roles to be returned.

This particular query is currently implemented to retrieve all users, and then looping over each user, lookup the relationship, and then the lookup role data if it hasn't already been retrieved.  As you can probably guess, and we know from OPENIDM-7404, this solution is not scalable in most sized datasets.

 

Things needed for a JDBC solution:

Hybrid explicit/eav tables

Relational databases are built to join entities that are related.  To do that though, you need explicit related columns to join on, as well as columns to add indexes on.

  1. OPENIDM-6690 - provide a way to configure columns as explicit alongside generic full object columns.
  2. OPENIDM-8097 - pull the firstId,firstPropName,secondId,secondPropName value out of the EAV relationshipproperties table and make them explicit columns. Create indexes on those relationship columns.

For demonstration purposes, this example table pivots the existing relatopnshipproperties table into one that can be efficiently joined. Note the added indexes as well.  The structure of this table is similar to what the final implemetation would need.

CREATE TABLE openidm.pivot_relationships AS
  (SELECT DISTINCT rp.relationships_id,
                   r.objectid   AS _id,
                   r.rev        AS _rev,
                   fi.propvalue AS firstId,
                   fn.propvalue AS firstPropertyName,
                   si.propvalue AS secondId,
                   sn.propvalue AS secondPropertyName,
                   p.propvalue  AS properties,
                   r.fullobject AS rel_fullobject
   FROM   openidm.relationshipproperties rp
          LEFT JOIN openidm.relationshipproperties fi
                 ON ( fi.relationships_id = rp.relationships_id
                      AND fi.propkey = '/firstId' )
          LEFT JOIN openidm.relationshipproperties fn
                 ON ( fn.relationships_id = rp.relationships_id
                      AND fn.propkey = '/firstPropertyName' )
          LEFT JOIN openidm.relationshipproperties si
                 ON ( si.relationships_id = rp.relationships_id
                      AND si.propkey = '/secondId' )
          LEFT JOIN openidm.relationshipproperties sn
                 ON ( sn.relationships_id = rp.relationships_id
                      AND sn.propkey = '/secondPropertyName' )
          LEFT JOIN openidm.relationshipproperties p
                 ON ( p.relationships_id = rp.relationships_id
                      AND sn.propkey = '/properties' )
          LEFT JOIN openidm.relationships r
                 ON ( r.id = rp.relationships_id ));
ALTER TABLE openidm.pivot_relationships
   MODIFY firstId VARCHAR(300);
ALTER TABLE openidm.pivot_relationships
   MODIFY firstPropertyName VARCHAR(300);
ALTER TABLE openidm.pivot_relationships
   MODIFY secondId VARCHAR(300);
ALTER TABLE openidm.pivot_relationships
   MODIFY secondPropertyName VARCHAR(300);
CREATE INDEX `index_rel_firstId_firstProp`
   ON openidm.pivot_relationships(`firstId`, `firstPropertyName`);
CREATE INDEX `index_rel_secondId_secondProp`
   ON openidm.pivot_relationships(`secondId`, `secondPropertyName`)
;   

Query object data with their relationship data in single queries.

OPENIDM-8103

Here is a query that retrieves all the users and their assigned roles.

SELECT mng.objectid,
       mng.fullobject      AS fullobject,
       rel.rel_fullobject  AS relationshipProps,
       mng_role.objectid   AS relationshipId,
       mng_role.fullobject AS relationshipObj,
       'roles' as propName,
       'managed/role/' as resourceCollection
FROM (select mng.* from  managedobjects mng
       JOIN openidm.objecttypes o
         ON ( mng.objecttypes_id = o.id
              AND o.objecttype = 'managed/user' ) order by mng.objectid limit 100) as mng
       LEFT JOIN (SELECT vf.rel_fullobject,
                         vf.firstid  AS relationshipId,
                         vf.secondid AS objId,
                         m.objectid
                  FROM   openidm.pivot_relationships vf
                         JOIN openidm.managedobjects m
                           ON ( vf.secondid = concat('managed/user/', m.objectid)
                                AND vf.secondpropertyname = 'roles' )
                  UNION ALL
                  SELECT vs.rel_fullobject,
                         vs.`secondid` AS relationshipId,
                         vs.`firstid`  AS objId,
                         m.objectid
                  FROM   openidm.pivot_relationships vs
                         JOIN openidm.managedobjects m
                           ON ( vs.`firstid` = concat('managed/user/', m.objectid)
                                AND vs.`firstpropertyname` = 'roles' )
                   ) rel
              ON ( mng.objectid = rel.objectid )
       LEFT JOIN openidm.managedobjects mng_role
                 JOIN openidm.objecttypes ro
                   ON ( mng_role.objecttypes_id = ro.id
                        AND ro.objecttype = 'managed/role' )
              ON ( mng_role.objectid = REPLACE(rel.relationshipid, 'managed/role/', '')
        ) 
;

 

Here is a sample query that retrieves all users with their roles, authzroles, and reports.  

Note that with 7,000 managed objects, which includes 4000 users and 3000 roles, accounting for 12,000 relationships - the following query returns in about a second on a macbook pro.

 

SELECT
   mng.objectid,
   mng.fullobject,
   rel.rel_fullobject AS relationshipprops,
   mng_role.objectid AS relationshipid,
   CASE
      WHEN
         mng_role.objectid IS NOT NULL 
      THEN
         concat ( '{"_id":', mng_role.objectid, ',"description":"', mng_role.description, '"}') 
      ELSE
         NULL 
   END
   AS relationshipobj, 'authzRoles' AS propname, 
  'repo/internal/role/' AS resourcecollection 
FROM
   (
      SELECT
         mng.* 
      FROM
         openidm.managedobjects mng 
         JOIN
            openidm.objecttypes o 
            ON ( mng.objecttypes_id = o.id 
            AND o.objecttype = 'managed/user' ) 
   )
   AS mng 
   LEFT JOIN
      (
         SELECT
            vf.rel_fullobject,
            vf.firstid AS relationshipid,
            vf.secondid AS objid,
            m.objectid 
         FROM
            openidm.pivot_relationships vf 
            JOIN
               openidm.managedobjects m 
               ON ( vf.secondid = concat('managed/user/', m.objectid) 
               AND vf.secondpropertyname = 'authzRoles' ) 
            UNION ALL
            SELECT
               vs.rel_fullobject,
               vs.`secondid` AS relationshipid,
               vs.`firstid` AS objid,
               m.objectid 
            FROM
               openidm.pivot_relationships vs 
               JOIN
                  openidm.managedobjects m 
                  ON ( vs.`firstid` = concat('managed/user/', m.objectid) 
                  AND vs.`firstpropertyname` = 'authzRoles' )
      )
      rel 
      ON ( mng.objectid = rel.objectid ) 
   LEFT JOIN
      openidm.internalrole mng_role 
      ON ( mng_role.objectid = replace(rel.relationshipid, 'repo/internal/role/', '') ) 
   
UNION ALL
   SELECT
      mng.objectid,
      mng.fullobject,
      rel.rel_fullobject AS relationshipprops,
      mng_role.objectid AS relationshipid,
      mng_role.fullobject AS relationshipobj,
      'authzRoles' AS propname,
      'managed/role' AS resourcecollection 
   FROM
      (
         SELECT
            mng.* 
         FROM
            openidm.managedobjects mng 
            JOIN
               openidm.objecttypes o 
               ON ( mng.objecttypes_id = o.id 
               AND o.objecttype = 'managed/user' )  
      )
      AS mng 
      LEFT JOIN
         (
            SELECT
               vf.rel_fullobject,
               vf.firstid AS relationshipid,
               vf.secondid AS objid,
               m.objectid 
            FROM
               openidm.pivot_relationships vf 
               JOIN
                  openidm.managedobjects m 
                  ON ( vf.secondid = concat('managed/user/', m.objectid) 
                  AND vf.secondpropertyname = 'authzRoles' ) 
               UNION ALL
               SELECT
                  vs.rel_fullobject,
                  vs.`secondid` AS relationshipid,
                  vs.`firstid` AS objid,
                  m.objectid 
               FROM
                  openidm.pivot_relationships vs 
                  JOIN
                     openidm.managedobjects m 
                     ON ( vs.`firstid` = concat('managed/user/', m.objectid) 
                     AND vs.`firstpropertyname` = 'authzRoles' )
         )
         rel 
         ON ( mng.objectid = rel.objectid ) 
      LEFT JOIN
         openidm.managedobjects mng_role 
      JOIN
         openidm.objecttypes ro 
         ON ( mng_role.objecttypes_id = ro.id 
         AND ro.objecttype = 'managed/role' ) 
         ON ( mng_role.objectid = replace(rel.relationshipid, 'managed/role/', '') ) 
      
UNION ALL
      SELECT
         mng.objectid,
         mng.fullobject,
         rel.rel_fullobject AS relationshipprops,
         mng_role.objectid AS relationshipid,
         mng_role.fullobject AS relationshipobj,
         'roles' AS propname,
         'managed/role' AS resourcecollection 
      FROM
         (
            SELECT
               mng.* 
            FROM
               openidm.managedobjects mng 
               JOIN
                  openidm.objecttypes o 
                  ON ( mng.objecttypes_id = o.id 
                  AND o.objecttype = 'managed/user' )  
         )
         AS mng 
         LEFT JOIN
            (
               SELECT
                  vf.rel_fullobject,
                  vf.firstid AS relationshipid,
                  vf.secondid AS objid,
                  m.objectid 
               FROM
                  openidm.pivot_relationships vf 
                  JOIN
                     openidm.managedobjects m 
                     ON ( vf.secondid = concat('managed/user/', m.objectid) 
                     AND vf.secondpropertyname = 'roles' ) 
                  UNION ALL
                  SELECT
                     vs.rel_fullobject,
                     vs.`secondid` AS relationshipid,
                     vs.`firstid` AS objid,
                     m.objectid 
                  FROM
                     openidm.pivot_relationships vs 
                     JOIN
                        openidm.managedobjects m 
                        ON ( vs.`firstid` = concat('managed/user/', m.objectid) 
                        AND vs.`firstpropertyname` = 'roles' )
            )
            rel 
            ON ( mng.objectid = rel.objectid ) 
         LEFT JOIN
            openidm.managedobjects mng_role 
         JOIN
            openidm.objecttypes ro 
            ON ( mng_role.objecttypes_id = ro.id 
            AND ro.objecttype = 'managed/role' ) 
            ON ( mng_role.objectid = replace(rel.relationshipid, 'managed/role/', '') ) 
         
UNION ALL
         SELECT
            mng.objectid,
            mng.fullobject,
            rel.rel_fullobject AS relationshipprops,
            mng_role.objectid AS relationshipid,
            mng_role.fullobject AS relationshipobj,
            'reports' AS propname,
            'managed/role' AS resourcecollection 
         FROM
            (
               SELECT
                  mng.* 
               FROM
                  openidm.managedobjects mng 
                  JOIN
                     openidm.objecttypes o 
                     ON ( mng.objecttypes_id = o.id 
                     AND o.objecttype = 'managed/user' )  
            )
            AS mng 
            LEFT JOIN
               (
                  SELECT
                     vf.rel_fullobject,
                     vf.firstid AS relationshipid,
                     vf.secondid AS objid,
                     m.objectid 
                  FROM
                     openidm.pivot_relationships vf 
                     JOIN
                        openidm.managedobjects m 
                        ON ( vf.secondid = concat('managed/user/', m.objectid) 
                        AND vf.secondpropertyname = 'reports' ) 
                     UNION ALL
                     SELECT
                        vs.rel_fullobject,
                        vs.`secondid` AS relationshipid,
                        vs.`firstid` AS objid,
                        m.objectid 
                     FROM
                        openidm.pivot_relationships vs 
                        JOIN
                           openidm.managedobjects m 
                           ON ( vs.`firstid` = concat('managed/user/', m.objectid) 
                           AND vs.`firstpropertyname` = 'reports' )
               )
               rel 
               ON ( mng.objectid = rel.objectid ) 
            LEFT JOIN
               openidm.managedobjects mng_role 
            JOIN
               openidm.objecttypes ro 
               ON ( mng_role.objecttypes_id = ro.id 
               AND ro.objecttype = 'managed/user' ) 
               ON ( mng_role.objectid = replace(rel.relationshipid, 'managed/user/', '') ) ;

 

Additional enhancements

  1.  OPENIDM-8100: There is some inconsistency between the storage of an object ID along with the resource path for those objects.  For example, the managedobject objectId column only has the id of the object, while the relationshipproperties firstId and seccondId fields include both the path and id concatenated.  Although possible to use concat and replace to normalize joins on those columns, it would be much simpler and faster to stay consistent and keep the path separate from the id.
  2. When joining tables, the data becomes denomalized, and fullobject values could be returned repeatedly.  This could create additional IO that may become a bottleneck. If IO does become an issue, then consider keeping full objects in separate result sets where distinct fullobjects can be retrieved by ids, in bulk. 

 

  • No labels