...

Introductions and Goals of the Workshop

    * R.L. Morgan, Senior Technology Architect, University of Washington

After attendees introduce themselves and share their interests, we'll review the issues and questions that bring us together, with a view toward outcomes: potential initiatives and focused research into requirements and solutions for collaborative academic services.

...

* Loretta Auvil, Senior Project Coordinator, University of Illinois at Urbana-Champaign

SEASR
How can we leverage a technology infrastructure framework to support multiple domain apps?

Q: Who would be building custom apps in this environment?
A: A combination of domain experts and developers. Java, Python, and GLisp are used. End users are the target audience for these apps, thus UI is very important. The ability to explore is also key, to enable users to see how others are using the data.

Q: Are mashups supported in the SEASR/Meandre framework?
A: Transformers can be written to translate data if required; otherwise XML is the common lingua franca. Components from multiple repositories can be combined as appropriate. Licensing issues can sometimes be sticky. Exposing what is happening, along with provenance capabilities, is key to making this useful.
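As an illustration of the "transformer" idea above, here is a minimal sketch of translating a flat record into XML and back. It is not SEASR/Meandre code: the function names, the `record` root tag, and the sample data are all hypothetical, and a real component would follow whatever component interface the framework defines.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record, root_tag="record"):
    """Translate a flat dict into an XML element, one child per field."""
    root = ET.Element(root_tag)
    for field, value in record.items():
        child = ET.SubElement(root, field)
        child.text = str(value)
    return root


def xml_to_record(element):
    """Inverse transformer: read the child elements back into a dict of strings."""
    return {child.tag: child.text for child in element}


# Hypothetical sample record, used only to exercise the two transformers.
sample = {"title": "Hamlet", "author": "Shakespeare", "year": 1603}
xml_text = ET.tostring(record_to_xml(sample), encoding="unicode")
round_trip = xml_to_record(ET.fromstring(xml_text))
```

Note that the round trip returns every value as a string, which is one reason provenance and type information would need to travel alongside the data.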

Q: Will SEASR supply actual services that can be used to build things?
A: This is flexible. The current focus is on building repositories where developers can register their component libraries. Eventually the goal is to provide services that are more baked for users.

Various navigation, visualization, and exploration schemes are being supported, e.g. tag clouds.

Q: Is this too overwhelming for arts and humanities folks?
A: The goal is to make this as user-friendly as possible, but part of the challenge is increasing general understanding among those communities about what is possible and thus spurring their thinking about ways in which this could be useful to them.

* Steve Masover, Data Architect, University of California, Berkeley

Project Bamboo

Fuzziness/serendipity are key in Arts and Humanities -- key to useful discovery.

Provenance, and how and by whom resources are used/cited, is also very important. "Chains of credit" can be very important in tenure evaluations, as an example... IPR concerns in the commercial sector are an interesting parallel to this. There will likely also be a privacy-related tension in this - scholars want to know who is using their data, but don't want their use to be monitored.

Q: Data privacy and regulatory concerns - how are these addressed in exposing datasets? How does the infrastructure support this?
A: This is on their radar, but it is too early for answers just yet...

Q: Teaming in Arts & Humanities seems to be rare, but digitizing content seems to be a big and growing issue. Is there a culture gap here?
A: Workshop participants are self-selecting, and they tend to be more team-oriented than their colleagues may be.

...

* Daniel Davis, Chief Software Architect, Fedora Commons

Fedora Commons is driving towards being a more community-driven software development effort. Integrating capabilities from many other sources. Meeting with DSpace to look at collaboration opportunities... Looking at long-term sustainability for the project, and organizations doing similar and complementary work.

...

One person's metadata is another person's data - it all needs to be linked together and made accessible.

Q: Is it possible to federate Fedora repositories? What capabilities are there for supporting distributed management?
A: Yes, you can use anything as an identifier, thus you can select the naming scheme that makes sense for you. Plus you can have alternate names for objects. You can deploy an enterprise search engine that will search on your criteria, and return objects wherever they may live.

IPR: The Jim Henson site has a repository-centric security model. The Muradora project moves policy-driven operation (based on XACML) up into middleware. Any system that depends on an app on a system you don't control is by its very nature a problem for the protection of IPR. Need a well-defined trust and security model.

"Vendor-driven architectures" tend to be problematic... Watch out for vendor lock-in. Is the term "standards-based" subject to interpretation by vendors?

Q: What is the overlap between users and institutions that want to federate content, and the identity federations we are building?
A: They are complementary capabilities and need to interoperate with each other. Duplication of information is permitted, e.g. a person object associated with an account, and also a pointer to an element of scholarship. How might these be merged?

Q: What is the anticipated timing if an institution wants to set up a federated repository and ensure it is reusable and extensible?
A: This is being done now if your initial expectations are modest. More work is happening on this, including in the commercial space.

...

* Jens Haeusser, Director, Strategy, The University of British Columbia

Kuali Student

Q: To what degree are functional requirements being driven by academics v. admin?
A: We are seeking a balance, trying to reflect the broad range of the community, reflecting a broad range of practices.

Learning unit management is an early priority, treating them similar to SKUs for flexibility.

One mark of a good ESB (Enterprise Service Bus) in this context is the ability to integrate easily...

Q: Are there use cases where services are outside the enterprise? How are these being addressed?
A: Sometimes from an outsourced vendor, sometimes from another campus. Sometimes there is interesting information that supports your business processes but lives outside of your organization.

Q: The Educational Community License (ECL) is close to, but not identical to, the Apache license. Is that a problem? How is this reconciled?
A: Contributions to other technologies are done using their licenses.

...

    * David Gimpl, Software Engineer, IBM Corporation

IBM Blue Cloud Initiative

Question to ask yourself when considering cloud computing - What outage time is tolerable for your key application(s)?

...

Genesis II
http://www.cs.virginia.edu/~vcgr/
http://vcgr.cs.virginia.edu/genesisII

Q: Open Grid Forum (OGF) is potentially in a period of transition. What happens to the standards if OGF goes away?
A: Some may fade into oblivion, but others are being picked up by vendors and thus will live on (e.g. BES used by IBM).

Q: OASIS is starting up a web services harmonization activity. Would that be a logical home for this work?
A: Traditionally, organizations like OASIS have focused more on web services than on grid-related services, which are more about "the whole package." OASIS would likely not be interested in taking these on.

Q: What about a user who wants to put up their own service?
A: Job Submission Description Language (JSDL) can be used to describe any job, but that doesn't necessarily mean you would give that job to a BES (Basic Execution Service) container.
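To make "describe any job" concrete, here is a rough sketch that builds a small JSDL-style job description as XML. The namespace URIs and element names are given as best recalled from the JSDL 1.0 specification and should be checked against it before use; the job name, executable path, and arguments are hypothetical. Nothing here submits the job to a BES container.

```python
import xml.etree.ElementTree as ET

# Assumed JSDL 1.0 namespaces; verify against the OGF specification.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_job(name, executable, arguments):
    """Build a minimal job description: a name plus a POSIX command line."""
    job = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL_NS}}}JobDescription")
    ident = ET.SubElement(desc, f"{{{JSDL_NS}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL_NS}}}JobName").text = name
    app = ET.SubElement(desc, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = arg
    return job


# Hypothetical job used only for illustration.
job = build_job("wordcount", "/usr/bin/wc", ["-l", "input.txt"])
job_xml = ET.tostring(job, encoding="unicode")
```

The description only says what to run; where it runs (a BES container or elsewhere) is a separate decision, which is the point made in the answer above.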

Q: How does a client know what services are available, and how to get to them?
A: We are still developing a solution for this. Users are typically associated with a particular grid, and thus can be informed about what is available, but for other grids this is not so straightforward.

Q: Is there an impact on performance from all of this layering? What is the overhead? Is it all up front?
A: (Morgan) There is ongoing overhead, much of it from I/O. They are trying to get past this problem, but they are passing XML over HTTP, which by its nature is not very efficient. Note the distinction between HPC and high throughput.
A: (Gimpl) Metrics and instrumentation are key. Why are there performance issues associated with a particular segment? BlueGene is both HPC and high throughput, but serves different needs.

Q: Note utility characteristics - how does this factor in?
A: You would be seeking cost/benefit answers for a particular job, depending upon your priorities.

Note that there are other specs coming out of OGF for the aggregation of metadata about services. Generally you know the characteristics of the service required to meet your needs for a particular task.

Q: Deployment models - organizations tend to want to outsource things they understand well, since they are then able to evaluate the performance. Are we moving to a model in which our users would not know or care where an app lives? Are the migration capabilities in place to support this?

A: (Gimpl) IBM supports a utility model, billing you for the services you actually use. The organization decides which services to use it for. Some organizations start apps and services internally, then migrate them to the cloud when it becomes in their interest to do so, e.g. when it is more efficient and cost-effective to run them in the cloud. There are economies of scale that come into play in the utility computing model that are very compelling.
A: (Morgan) The university has to maintain a certain amount of infrastructure, some of which is not continually in use, and some researchers don't have the budget for clusters but still have needs. Thus it makes sense for the university to use this excess capacity to support these users.

Q: Is UVa using its end-user PKI for this?
A: Yes, but not as the sole authentication factor.

Panel: ESBs and Widely Distributed Services

...

It is not unforeseeable that an organization could have services running in multiple clouds, and it is essential that they be interoperable.

Q: In terms of standards for pub/sub and service type messaging, we are not really there yet. Not many are widely viewed as industry standards. What protocols are you supporting?
A: Any that we use will have to be open source and open standards. We have heard our customers insist that their vendors must be interoperable.

It makes sense to use cloud computing to take the support load off of internal IT staff, if the economics make sense and the SLA is adequate for our needs. The goal is to make the fabric as simple as possible to connect to. Utility is the model we are going after.

Q: What do you mean by "firewall-friendly messaging"? What firewall are you referring to?
A: Callbacks to a client can be complex, and enabling them to traverse firewalls can be a challenge. WCF (Windows Communication Foundation) infrastructure is an approach we have developed for this. IT admins still have the ability to permit or deny access, but we have developed solutions for the hard coding problems.

Q: What about PII (HIPAA, FERPA, etc.) and issues around data moving outside the organization?
A: This needs to be worked out over time; there is no simple solution... Some apps and data just don't make sense to put in the cloud, and it is up to the organization to make that decision.

* Roland Hedberg, Internet Architect, Umea Universitet
Open Metadir (OM2)

Swedish universities are using a common software infrastructure, and are currently sharing information between them about students moving around to take courses.

...

Q: What changes when you switch to ESBs from a silo app?
A: The application can no longer control all that it used to, thus it really just needs to be able to talk services and rely on the ESB to do what it is supposed to do. A lot of it has to do with app owners getting comfortable with the concept and learning that it is reliable. It is important to start small, select apps that are aligned with your business processes and represent pain points.

Q: When querying PeopleSoft, how are you getting data out?
A: There are perceived performance issues related to direct access to PeopleSoft Student. They do a single extract of the tables they need into an operational data store, and work from that.
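The extract-then-query pattern described above can be sketched with two SQLite databases standing in for the source system and the operational data store. This is only an illustration of the pattern, not how the presenters implemented it: the `enrollment` table, its columns, and the sample rows are hypothetical, and a real PeopleSoft extract would use that system's own database and tooling.

```python
import sqlite3


def extract_tables(source, ods, tables):
    """Copy each listed table wholesale from the source into the data store.

    Table names are interpolated into SQL, so `tables` must be a fixed,
    trusted list and never user input.
    """
    for table in tables:
        cursor = source.execute(f"SELECT * FROM {table}")
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
        col_list = ", ".join(columns)
        placeholders = ", ".join("?" for _ in columns)
        # Replace any previous extract so the store reflects one snapshot.
        ods.execute(f"DROP TABLE IF EXISTS {table}")
        ods.execute(f"CREATE TABLE {table} ({col_list})")
        ods.executemany(
            f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})", rows
        )
    ods.commit()


# Hypothetical source data standing in for the student system.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE enrollment (student_id TEXT, course TEXT)")
source.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [("s1", "HIST101"), ("s2", "MATH200")],
)

ods = sqlite3.connect(":memory:")
extract_tables(source, ods, ["enrollment"])

# Downstream queries now hit the data store, not the source system.
courses = ods.execute(
    "SELECT course FROM enrollment ORDER BY student_id"
).fetchall()
```

The trade-off is freshness: consumers see the data as of the last extract, in exchange for keeping query load off the source system.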

...

Is formal data modeling important enough that taxonomies and controlled vocabularies should be put in place before beginning to code? What development methods lend themselves to adaptable data models? Who needs to play together in this space, both inside and beyond campuses?

Q: How do you deal with different names for a particular object or service? Different naming conventions...

...

Not all policies are formalized and documented, but this helps when there are disagreements or differences in interpretation.

"Reasonability" may be a challenge when you cross institutional boundaries.

...

Q: How does Google Analytics figure into data collection issues?
A: Indiana U. does not permit this since the data lives outside their control. When central IT doesn't provide an adequate solution, users go elsewhere, and this is a good example. We are pushing central IT to provide this service.

Q: Federation-level policy v. that of individual institutions... At what point does a federation look like a 3rd party? What is the trigger?
A: Whenever you don't have direct control over the data, i.e. it is held by a separate legal entity.
A: This is not the federation operator, which is not involved in the transactions.

...

EU: When you cease your engagement with a student, you must erase all logs associated with that student. But a student's affiliation normally doesn't end when they graduate, i.e. alumni relationships...

Q: Do these policies apply to EU citizens in the US taking classes?
A: Unknown...

EU is currently working to define and triage (and normalize) what is to be considered PII under their regulations.

Are IP addresses considered PII in the EU? Because some are static, and can be linked with an individual, unless you know you are dealing with a dynamically assigned address you must treat it as PII.

Is eduPersonTargetedID (EPTID) considered PII in the EU? It is persistent, non-reassignable, and can be different for every site you visit so as to avoid correlation between sites. Thus we are hopeful that it will not be considered PII.
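One common way to get an identifier with these properties is to derive it from a keyed hash of the user and the site. The sketch below illustrates only that derivation idea; it is not the eduPerson specification or any particular IdP's implementation, and the function name, secret, user ID, and site entity IDs are all hypothetical.

```python
import hashlib
import hmac


def targeted_id(secret, user_id, sp_entity_id):
    """Derive an opaque per-site identifier for a user.

    Same user and site always give the same value (persistent); different
    sites give unrelated values, so two sites cannot correlate a user by
    comparing identifiers. It is only non-reassignable if the underlying
    user_id is itself never reassigned.
    """
    message = f"{sp_entity_id}!{user_id}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


# Hypothetical secret held by the identity provider, and hypothetical sites.
idp_secret = b"example-secret-kept-by-the-idp"
id_at_library = targeted_id(idp_secret, "jdoe", "https://library.example.org/sp")
id_at_wiki = targeted_id(idp_secret, "jdoe", "https://wiki.example.org/sp")
```

Because the value cannot be reversed without the secret, a site holding it learns nothing about the user beyond "same person as last time", which is the basis for the hope expressed above.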

...

What do you do when there is not a trusted IdP in your country? OpenID assertions are dubious, by definition. Identity will likely start to mingle with more protocols, out of necessity.

Q: What is the timetable on the EU decision?
A: Likely 3Q2008. There will be broad coverage when it happens.

...

Digital signatures are one possibility, add to metadata for artifacts? Create an authoritative registry?

Q: How does this correlate with Creative Commons efforts to attach licenses to content? Users can search for CC-licensed data to use...

...

Scotty Logan, Stanford
IAM and well-behaved apps

For reference: http://oauth.net/

Wrap-Up and Findings

    * Session moderator: Kenneth J. Klingenstein, Director, Internet2 Middleware and Security, University of Colorado at Boulder

...