Metadata Distribution WG 2013-07-18

Attending: IJ Kim, Ian Young, Scott Cantor, Tom Scavo, Harry Nicholos, Max Miller

In John’s absence, Scott Cantor led today’s discussion.

Minutes

  • Last time we discussed offline signing; this time the goal is to understand the impact of an online signing process. Specifically, what are the use cases that require online signing?
  • For reference, today InC Ops batch-produces a signed metadata aggregate every business day at approximately 3:00 pm Eastern. So in general there is a limited signing window (although emergency signings are possible).
  • We could move the publishing process to the endpoints (like OpenID), or more generally, to a metadata service (like DNS).
  • In some sense, DNS is a natural approach (“if it quacks like DNS, it must be DNS”) but perhaps this is not a viable option. It seems many DNS administrators would rather keep DNS “pristine” although there might be cooperation at some sites if we were to go down this path.
  • DNS SRV records are not sophisticated enough anyway. We would have to invent something new (using NAPTR records, perhaps; see the first sketch following these minutes).
  • In any case, we’re looking at queries and lookups instead of batch downloads.
  • In the near term, a feasible solution would be to move from a batch-oriented approach to a more dynamic approach involving individual entities. In that case, would we want to query a central service? (A per-entity lookup sketch follows these minutes.)
  • From some (many?) IdP administrators' PoV, there would be concern if endpoints were responsible for publishing metadata. It would be safer to obtain metadata from the federation operator, not the endpoint.
  • If we did move to per-entity metadata, the number of authoritative signatures would grow by orders of magnitude. (We have over 1500 entities in InC metadata today.) Alternatively, we could impose transport security.
  • Whatever we do, important issues are trustworthiness of metadata and operational availability. Clearly metadata needs to be highly available.
  • From an implementation PoV, Shibboleth has the best support for metadata. SimpleSAMLphp can consume large metadata aggregates but the process depends on cron. (We know from experience that external processes are less reliable and tend not to be deployed.) Ping has expressed interest in providing better metadata support. It seems questionable to expect much uptake of a more complex consumption model for metadata when the simple batch (/etc/hosts) model has so little uptake. So in large part this is about Shibboleth.
  • A central registration and distribution model, where individual entity descriptors are signed by a trusted authority (i.e., the federation operator), would permit metadata to be aggregated and hosted from anywhere. Very flexible.
  • (At this point in the discussion, still no use case for PKI.)
  • Note that per-entity metadata breaks discovery processes that rely on a comprehensive metadata aggregate. These processes will need to be redesigned, perhaps using JSON metadata-like information retrieved just-in-time (an illustrative record follows these minutes). This information tends to be less security-relevant than the full metadata.
  • As noted earlier, under normal circumstances there is a minimum 24-hour window for signing and distributing InCommon metadata. Are there use cases that require more frequent signing? Of course problems and corrections sometimes require more frequent signing but this is the exception rather than the rule.
  • As a thought experiment, what would happen if the window were one hour (instead of one day)? In particular, what are the required interventions (necessitated by the RA process)?
  • Today, all metadata is reviewed by the RA, regardless of the request. Can certain innocuous changes be automated? Can we identify what changes those are?
  • A huge risk with the current metadata distribution model stems from the brittleness of metadata aggregates. In fact, this discourages experimentation and evolutionary change. This is reason alone to consider a different model.
  • There is a lot of value in keeping the signing key offline. Just look at the SWITCH docs to see how complicated using a PKI can be for deployers.
  • Thought experiment: If we had to go online, what would the requirements be?
  • Use Case: Local metadata at OSU. Online signing key. Short validity window (1 or 2 days). Initially considered a PKI, but it didn’t take long to realize that it was going to be mostly a waste of time. The cost/benefit ratio of the PKI model is high. Today the signing is still online, but no PKI.
  • Offline CRLs (produced weekly) actually decrease overall security: they extend the window of vulnerability to a compromised key from potentially days (if the signing process were online and automated) to weeks.
  • Conclusion: An online signing key doesn’t necessitate a PKI. We’d have to examine the individual use case.
  • Use Case: InCommon/Comodo Certificate Service. Multiple online signing keys, each protected by an HSM. CRLs are refreshed daily, with a 4-day validity window. OCSP is also provided.
  • If we did move to a PKI model, think about how much work it would be for participants to make the transition. And they would have to make the transition, otherwise there would be a security vulnerability.
  • Tentative conclusion: Maybe we don’t have to rekey.
  • One approach is to consider types of risk. For example, an online signing key that is not protected by an HSM.
  • Suppose the key were protected by an HSM. The question is how does the HSM work? If an attacker needs physical access to the HSM, then why bother with a PKI?
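
The NAPTR idea above might look something like the following minimal sketch. It assumes the dnspython library and a hypothetical "x-saml-md" service tag; no such record usage is defined or standardized today.

    # Minimal sketch of a NAPTR-based lookup, assuming dnspython and a
    # hypothetical "x-saml-md" service tag (not standardized anywhere).
    import dns.resolver

    def lookup_metadata_url(domain):
        # Query NAPTR records and walk them in (order, preference) order.
        answers = dns.resolver.resolve(domain, "NAPTR")
        for rr in sorted(answers, key=lambda r: (r.order, r.preference)):
            if rr.service.decode().lower() == "x-saml-md":
                # A typical NAPTR regexp looks like "!^.*$!https://md.example.org/!";
                # take the replacement part as the metadata URL.
                return rr.regexp.decode().split("!")[-2]
        return None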
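
Similarly, the "query a central service" idea could reduce to a per-entity lookup keyed by entityID. The sketch below assumes a hypothetical service at md.example.org and an invented URL layout; a real consumer would also verify the XML signature on the returned entity descriptor (or rely on TLS to the federation operator).

    # Sketch of fetching one entity's metadata from a hypothetical
    # per-entity metadata service (hostname and URL layout are assumptions).
    import urllib.parse
    import urllib.request

    MD_SERVICE = "https://md.example.org/entities/"   # hypothetical base URL

    def fetch_entity_metadata(entity_id):
        # The entityID is URL-encoded into the request path.
        url = MD_SERVICE + urllib.parse.quote(entity_id, safe="")
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read()   # signed entity descriptor (verify before use)

    md = fetch_entity_metadata("https://idp.example.edu/idp/shibboleth")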
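
Finally, the just-in-time, JSON "metadata-like" discovery information might resemble the record below; the field names are invented for illustration and do not correspond to any defined format.

    # Invented example of a lightweight, discovery-oriented JSON record
    # retrieved just-in-time; field names are illustrative only.
    import json

    discovery_record = {
        "entityID": "https://idp.example.edu/idp/shibboleth",
        "displayName": "Example University",
        "logo": "https://idp.example.edu/logo.png",
        "domains": ["example.edu"],
    }
    print(json.dumps(discovery_record, indent=2))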

1 Comment

  1. Hi,

    Sorry I couldn't make the meeting due to a standing conflict -- sounds like it was a great call.

    Couple of thoughts/reactions to the items in the minutes...

    1. It was mentioned:

    • In some sense, DNS is a natural approach (“if it quacks like DNS, it must be DNS”) but perhaps this is not a viable option. It seems many DNS administrators would rather keep DNS “pristine” although there might be cooperation at some sites if we were to go down this path.
    • DNS SRV records are not sophisticated enough anyway. We would have to invent something new (using NAPTR records perhaps).

    Currently the DNS is widely used as a distributed database w/o specifically employing an application-specific resource record. For example, DNS serves block list data for sites such as Spamhaus, Routeviews uses it to distribute ASN and AS-path mapping information, and SPF and DKIM use DNS to share email-related parameters. I think that DNS *could* accommodate still more data (potentially including InCommon Federation metadata), even w/o necessarily getting a formal extension.
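
    As a concrete illustration of this "DNS as a database" pattern (a minimal sketch assuming the dnspython library; the TXT lookup shown is ordinary published data, and nothing federation-specific is implied):

        # Sketch: the DNS as a general-purpose distributed database,
        # queried here for ordinary TXT records (e.g., SPF), via dnspython.
        import dns.resolver

        for rdata in dns.resolver.resolve("example.com", "TXT"):
            # Each TXT rdata is a tuple of character-strings; join for display.
            print(b"".join(rdata.strings).decode())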

    2. I also noted the comment:

    • From some (many?) IdP administrators' PoV, there would be concern if endpoints were responsible for publishing metadata. It would be safer to obtain metadata from the federation operator, not the endpoint.

    Can someone elaborate on that point? If the metadata is digitally signed, then don't we have confidence that it is authentic and unaltered? Or is the "safety" issue something else, like a belief that individual sites may not offer sufficiently reliable access, relative to what InCommon itself offers?

    3. Regarding the comment:

    • If we did move to per-entity metadata, the number of authoritative signatures would grow by orders of magnitude. (We have over 1500 entities in InC metadata today.) Alternatively, we could impose transport security.

    If we're talking about distribution via DNS, wouldn't DNSSEC handle this? In the DNSSEC model, as long as you trust the root of the tree, that's the only key you need to maintain, since each lower layer of keys is signed by the layer above it, right?
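
    For what it's worth, here is a minimal sketch (assuming dnspython) of asking a validating resolver whether an answer was DNSSEC-authenticated; this trusts the path to that resolver, and full local validation would instead walk the chain of trust down from the root key:

        # Sketch: check the AD (authenticated data) flag set by a
        # validating resolver. This trusts the path to the resolver;
        # it is not a full local DNSSEC validation.
        import dns.flags
        import dns.resolver

        resolver = dns.resolver.Resolver()
        resolver.use_edns(0, dns.flags.DO, 4096)      # request DNSSEC records
        resolver.flags = dns.flags.RD | dns.flags.AD  # ask for authenticated data

        answer = resolver.resolve("ietf.org", "A")
        print("DNSSEC-authenticated:", bool(answer.response.flags & dns.flags.AD))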

    4. Another interesting point from the call:

    • Clearly metadata needs to be highly available.

    Is this strictly true? Presumably there'd be caching, a la the DNS, where metadata would conceivably have a TTL (Time To Live). Some sites might set the TTL to be one day, others an hour or a week, assuming you allowed them to do so. You could even imagine a scheme where the effective TTL is the lesser of the published site TTL and the relying party's *own* TTL, right? (e.g., you might say, as the site generating the metadata, "my TTL is a week" but a relying party might say, "Don't care what the generating site suggests, if a TTL is longer than a day, I'm still going to refresh after 24 hours.")
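
    That "lesser of the two TTLs" rule is easy to express; a toy sketch (the 24-hour cap is just the example above, not a recommendation):

        # Toy sketch of the "lesser of the two TTLs" idea: the relying party
        # caps whatever TTL the publishing site advertises.
        def effective_ttl(published_ttl_seconds, local_max_seconds=24 * 3600):
            return min(published_ttl_seconds, local_max_seconds)

        # Publisher says "one week"; the relying party caps it at 24 hours:
        print(effective_ttl(7 * 24 * 3600))   # -> 86400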

    Of course, note one subtle difference between the current model (and the caching implicit in it) and the possible use of a DNS-like model: currently you get the whole enchilada, the metadata for all the sites, every time you get a new copy from InCommon. In the DNS-like model, there may be some obscure metadata that you only rarely need, and which thus could not be counted on to be cached. (But would an outage involving just one obscure corner case be a big deal? I guess it really depends which corner case we're talking about, right?)

    5. Finally, I also note the comment that:

    • Today, all metadata is reviewed by the RA, regardless of the request. Can certain innocuous changes be automated? Can we identify what changes those are?

    What if we were simply to decide that the correctness of metadata (like the correctness of DNS information, or the correctness of BGP routes, for that matter) is the responsibility of the originator, with no review performed by the RA?

    Anyhow, just some thoughts, and sorry, again, that I couldn't make the call!