IdMatch is a method of determining if a person presented by a System of Record is already known to the Identity Management System.

A demonstration version of the PostgreSQL-based ID match engine is available to anyone who wishes to experiment with it. This engine implements the CIFER ID Match Strawman API.

Gaining Access

The API endpoint is https://idmatch.testbed.tier.internet2.edu/match-poc

Access is via Basic Auth over https. Credentials will be assigned to a "System of Record" for use in making requests.

Data

The match engine is preloaded with 150,000 fake records, which you can download here if you wish to craft queries against them. These records are all loaded via the "test" System of Record.

Attributes

The match engine (as configured for this demo) supports the following attributes (along with their API labels):

Matching Rules

The following rules are configured for this demo:

Canonical Rules

If any of these rules finds exactly one match, processing stops and that match is returned. If any rule finds more than one match, the response is automatically switched to a potential match.

Canonical rules must match each attribute exactly.

  1. SOR + SORID
  2. SSN + Last Name + DoB
  3. SSN + Last Name + First Name
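
The canonical rule behavior described above can be sketched as follows. This is a minimal illustration over an in-memory list, not the engine's actual PostgreSQL implementation. The record layout, the field names, and the choice to skip a rule when the query lacks one of its attributes are all assumptions made for this example.

```python
# Hypothetical in-memory records; field names are illustrative, not the API labels.
RECORDS = [
    {"ref_id": "R1", "sor": "test", "sorid": "1001", "ssn": "012-99-5678",
     "last": "Lee", "first": "Pat", "dob": "1983-03-18"},
    {"ref_id": "R2", "sor": "test", "sorid": "1002", "ssn": "012-99-5678",
     "last": "Lee", "first": "Sam", "dob": "1983-03-18"},
]

# The three canonical rules listed above, in order.
CANONICAL_RULES = [
    ("sor", "sorid"),
    ("ssn", "last", "dob"),
    ("ssn", "last", "first"),
]


def canonical_match(query, records=RECORDS):
    """Return ("exact", [id]), ("potential", [ids]), or ("none", [])."""
    for rule in CANONICAL_RULES:
        # Assumption: a rule is skipped if the query lacks one of its attributes.
        if any(not query.get(field) for field in rule):
            continue
        # Canonical rules compare every attribute exactly.
        hits = [r["ref_id"] for r in records
                if all(r[field] == query[field] for field in rule)]
        if len(hits) == 1:
            return ("exact", hits)       # exactly one match: stop processing
        if len(hits) > 1:
            return ("potential", hits)   # several matches: downgrade to potential
    return ("none", [])


print(canonical_match({"sor": "test", "sorid": "1001"}))
print(canonical_match({"ssn": "012-99-5678", "last": "Lee", "dob": "1983-03-18"}))
```

The first query hits rule 1 with a single record and is returned as an exact match; the second finds two records under rule 2 and is reported as a potential match.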

Potential Rules

  1. SSN + Last Name + DoB(distance)
  2. SSN(distance) + Last Name + DoB
  3. SSN + Last Name + First Name(distance)
  4. SSN(distance) + Last Name + First Name
  5. DoB + Last Name(distance) + First Name(distance)
  6. DoB + Last Name(distance) + Substring(distance)
  7. Email
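
To illustrate what a "(distance)" attribute might mean, here is a sketch of potential rule 1 using Levenshtein edit distance. This page does not state which distance function or threshold the demo engine uses, so both the edit-distance choice and the `MAX_DISTANCE` value are assumptions for illustration, as are the field names.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or keep) a character
        prev = curr
    return prev[-1]


MAX_DISTANCE = 2  # hypothetical threshold, not taken from the demo configuration


def potential_rule_1(query, record, max_distance=MAX_DISTANCE):
    """SSN and last name must match exactly; DoB may differ by a few edits."""
    return (query["ssn"] == record["ssn"]
            and query["last"] == record["last"]
            and edit_distance(query["dob"], record["dob"]) <= max_distance)


stored = {"ssn": "012-99-5678", "last": "Lee", "dob": "1983-03-18"}
typo = {"ssn": "012-99-5678", "last": "Lee", "dob": "1983-03-81"}  # digits swapped
print(potential_rule_1(typo, stored))
```

A transposed pair of digits counts as two substitutions under this metric, so the swapped date of birth still falls within the assumed threshold.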

Sending Queries

DO NOT SEND REAL DATA TO THIS SERVICE.

Send fake data only.

The full specification for sending queries is available in the CIFER ID Match Strawman API; however, you will most likely want to send one of the following two requests.

Search/Update

Corresponds to the Reference Identifier Request.

Returns:

PUT https://idmatch.testbed.tier.internet2.edu/match-poc/v1/people/SOR/SORID
{
  "sorAttributes":
  {
    "names":[
      {
        "given":"Pat",
        "family":"Lee",
        "type":"official"
      }
    ],
    "dateOfBirth":"1983-03-18",
    "identifiers":[
      {
        "type":"national",
        "identifier":"012-99-5678"
      }
    ]
  }
}
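
The request above could be assembled in Python roughly as follows, using only the standard library. This is a sketch: the `build_request` helper, the placeholder credentials, the example SOR and SORID values, and the `Content-Type` header are assumptions for illustration. Only the URL layout, the Basic Auth scheme, and the JSON body come from this page.

```python
import base64
import json
import urllib.parse
import urllib.request

BASE_URL = "https://idmatch.testbed.tier.internet2.edu/match-poc"


def build_request(sor, sorid, attributes, username, password, method="PUT"):
    """Build (but do not send) a request for /v1/people/SOR/SORID."""
    url = "{}/v1/people/{}/{}".format(
        BASE_URL,
        urllib.parse.quote(sor, safe=""),
        urllib.parse.quote(sorid, safe=""),
    )
    body = json.dumps({"sorAttributes": attributes}).encode("utf-8")
    # Basic Auth: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Content-Type": "application/json",  # assumed; the body is JSON
    }
    return urllib.request.Request(url, data=body, headers=headers, method=method)


# Fake data only, as in the example above.
attributes = {
    "names": [{"given": "Pat", "family": "Lee", "type": "official"}],
    "dateOfBirth": "1983-03-18",
    "identifiers": [{"type": "national", "identifier": "012-99-5678"}],
}

# "test" / "12345" and the credentials are placeholders.
req = build_request("test", "12345", attributes, "my-sor-user", "my-password")
print(req.get_method(), req.full_url)

# To actually send it (requires assigned credentials):
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```

Passing `method="POST"` to the same helper would produce the search-only form of the request.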

Search Only

Corresponds to the Reference Identifier Request (Search Only). Parameters are as for the Search/Update request, above.

Returns:

POST https://idmatch.testbed.tier.internet2.edu/match-poc/v1/people/SOR/SORID
{
  ... as above ...
}

Notes

  1. If you successfully submit a Reference Identifier (PUT) Request (that is, you receive a reference identifier in the response) and then issue the exact same request again, the subsequent responses will not include the reference identifier. This is because the subsequent requests are technically attribute update requests.

Performance Notes

The demonstration server is a stock testbed VM running CentOS 7 and a local version of PostgreSQL 9. No optimizations have been made to the hardware, virtual machine, or database configuration other than the creation of indexes as described in the installation instructions.

This table shows the processing time for the initial data load. Note that for a new record to be added, all defined match rules must execute (and return no matches), which is the worst case for query performance.

Load #   Records   Total Time (sec)   Average Time Per Record (ms)   Non-Exact Results
1        10000     561                56.1                           0
2        40000     2601               65.0                           3
3        50000     6079               121.6                          335
4        50000     11874              237.5 (max ~1187)              532

Performance numbers reflect queries performed against localhost. Remote queries will take longer due to network latency.

The increasing average request time is likely due to the increased expense of updating the database indexes as the data set gets larger.
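
As a quick check on the performance table, the average time per record is the total load time divided by the number of records, converted to milliseconds. The short sketch below recomputes that column from the reported totals; the variable and function names are illustrative.

```python
# (load number, records, total time in seconds) as reported in the table
LOADS = [
    (1, 10000, 561),
    (2, 40000, 2601),
    (3, 50000, 6079),
    (4, 50000, 11874),
]


def average_ms(records, total_seconds):
    """Average processing time per record, in milliseconds."""
    return total_seconds / records * 1000.0


for load, records, total_seconds in LOADS:
    print(f"Load {load}: {average_ms(records, total_seconds):.1f} ms per record")

# The four loads together account for the full preloaded data set.
print("Total records:", sum(records for _, records, _ in LOADS))
```

The recomputed averages agree with the table to one decimal place, and the record counts sum to the 150,000 preloaded records.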