Background

This survey was originally created and distributed to EU National Research Networks by Otto Kreiter and others at DANTE. It has been adapted to serve US Regional Networks. A similar effort was undertaken by the Quilt. Questions from the Virtual NOC Tools Workshop in the Fall of 2008 were referenced and integrated where appropriate.

Approach

We plan to work through these questions on performance working group calls. After each call, we will send the results to anyone who would like to contribute but was unable to attend, and solicit their input. When we have completed the survey, we will post the results on this wiki.

Purpose

We hope the discussion generated by this survey will assist members of the working group by (1) increasing awareness and understanding of practices in peer organizations, and (2) increasing the awareness and understanding of Internet2 staff so that they may better assist the Internet2 community.

1. Description of services provided by the NOC at your REN.

2. Description of the structure of the NOC and network.

  • Number of engineers in Level 1, Level 2, and Level 3
  • Topology of the network and number of nodes managed

3. NOC Network Monitoring

  • Does the NOC carry out network monitoring or is it outsourced?
  • Who is developing and/or maintaining the monitoring tools?
  • What monitoring software is used? Is it commercial, open source, developed in-house?

4. Principles of troubleshooting network-related problems

  • Does the NOC perform network troubleshooting?
  • What are the procedures in case of a link failure?
  • What are the procedures in case of a network element (NE) failure?
  • What are the procedures in case of an undiagnosed fault?
  • Where are technical procedures stored?
  • How often are procedures reviewed and updated?
  • Who or what team is responsible for maintaining these documents?

5. Metrics collected

  • Which metrics are used and why?
  • How are they accessed and stored? (Public vs. login required)
  • For each metric: name, definition, and frequency of collection

6. Visualization tools employed and used for monitoring and troubleshooting

  • What type of visualization tools are implemented?
  • What is their role in troubleshooting and detecting network failures?

7. Alarm-based monitoring

  • Which alarm-based monitoring systems are deployed, and how are alarms triggered?
  • Thresholds for alarm triggering

8. Related Information Systems

Databases and other information systems (IS) used to store and retrieve information about the topology and the network

  • Monitoring tool configuration updates (manual, automated)
  • Automated procedures (PIP, BoD, circuits)?
  • Awareness of, and interest in, GN2-developed systems (cNIS, AMPS)

9. Administration of monitoring-related tools

  • Installation practices
  • Package Management
  • Security Rules

10. Availability of data

  • Who is accessing monitoring data (users, managers, engineers)?
  • How or where is the monitoring data stored?

11. Reporting

  • What types of monitoring reports are created, and to whom are they provided?

12. NOC relationship with other entities (NREN, Campus, Universities, GEANT2, etc.)

  • How does the NOC interface with other entities to solve problems that do not originate in its domain?
  • Who owns the problem?

13. Experiences and plans with respect to national and multi-domain monitoring
