Grouper diagnostics provides a URL on Grouper WS and UI (in Grouper 2.2+) that reports the health of Grouper. The checks can include memory in the WS server, the connection to the Grouper Registry DB, whether sources can perform queries, and whether Grouper loader jobs are executing successfully. If everything is OK, an HTTP 200 code is returned; otherwise a 500 is returned, along with a description of the issue. The point is that web monitoring software such as Nagios, Big Brother, BMC, etc. can be pointed at this URL.
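The 200/500 contract described above maps naturally onto a monitoring plugin. The following is a minimal Python sketch, not an official Grouper or Nagios plugin: the function names are hypothetical, the exit codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL), and anything other than a 200 (including a failed connection) is treated as critical.

```python
import urllib.error
import urllib.request

# Conventional Nagios plugin exit codes.
NAGIOS_OK = 0
NAGIOS_CRITICAL = 2


def classify(http_code: int) -> int:
    """Map the status servlet's HTTP code to a Nagios exit code."""
    return NAGIOS_OK if http_code == 200 else NAGIOS_CRITICAL


def fetch_status(url: str, timeout: float = 10.0):
    """Return (HTTP code, response body) for the status URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        # urllib raises on a 500; the body still carries the description.
        return error.code, error.read().decode("utf-8", "replace")
    except urllib.error.URLError as error:
        # Could not connect at all; 0 is a placeholder code (assumption).
        return 0, f"connection failed: {error.reason}"


def check(url: str) -> int:
    """Print a one-line summary and return the Nagios exit code."""
    code, body = fetch_status(url)
    lines = body.strip().splitlines()
    first_line = lines[0] if lines else ""
    label = "OK" if code == 200 else "CRITICAL"
    print(f"GROUPER {label} - HTTP {code} {first_line}")
    return classify(code)
```

A wrapper script could then call `sys.exit(check("https://url.to.grouper.edu/grouper-ws/status?diagnosticType=all"))`, using the placeholder host name from this page.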
General information is displayed on success as well: the server name, the number of WS requests (since the server started), the last error (if recent), etc.
There isn't any sensitive information in these calls, but if you want to lock them down, do that in your servlet container or web server (or don't map the servlet in the WS web.xml). For example, you could restrict access to the source IP addresses of your PC and your Nagios server.
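A source-IP restriction like the one suggested above comes down to matching the client address against a list of CIDR ranges, the same format the `ws.diagnostic.sourceIpAddresses` setting on this page uses (e.g. 1.2.3.4/32 or 2.3.4.5/24, blank meaning everywhere). The Python sketch below only illustrates that matching. It is not how Grouper or any web server implements it, and the function names are hypothetical.

```python
import ipaddress


def parse_ranges(setting: str):
    """Parse a comma-separated list of CIDR ranges; blank yields an empty list."""
    return [
        # strict=False tolerates host bits, e.g. 2.3.4.5/24 becomes 2.3.4.0/24
        ipaddress.ip_network(part.strip(), strict=False)
        for part in setting.split(",")
        if part.strip()
    ]


def is_allowed(client_ip: str, setting: str) -> bool:
    """True if the client is inside any range, or if no ranges are configured."""
    ranges = parse_ranges(setting)
    if not ranges:
        return True
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ranges)
```

For example, `is_allowed("2.3.4.200", "1.2.3.4/32, 2.3.4.5/24")` is true because the second range covers the whole 2.3.4.x block.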
Each test can be ignored (without causing an error), and the number of minutes within which a loader job must report a SUCCESS can be customized. These settings are in grouper-ws.properties (grouper.properties in 2.2+).
Note that there is a lot of intelligent caching here, so repeated hits do not run the queries each time.
In v4.10+, Grouper diagnostics reports success based on the schedule of the job. For jobs that run every minute, hour, day, week, month, or year, the thresholds are 30 min, 150 min, 52 hours, 8 days, 33 days, and 367 days respectively (unless there is an override in the config).
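The schedule-based thresholds above can be read as a lookup table with an optional override. The Python sketch below illustrates that rule only; the frequency labels, function name, and override parameter are hypothetical and do not reflect Grouper's internal code.

```python
from datetime import datetime, timedelta

# Thresholds by job frequency, as listed above for v4.10+.
DEFAULT_THRESHOLDS = {
    "minute": timedelta(minutes=30),
    "hour": timedelta(minutes=150),
    "day": timedelta(hours=52),
    "week": timedelta(days=8),
    "month": timedelta(days=33),
    "year": timedelta(days=367),
}


def is_healthy(frequency, last_success, now, override_minutes=None):
    """True if the last success falls within the threshold for this job."""
    if override_minutes is not None:
        # A config override (minutes since last success) wins over the default.
        threshold = timedelta(minutes=override_minutes)
    else:
        threshold = DEFAULT_THRESHOLDS[frequency]
    return now - last_success <= threshold
```

For a daily job, a success 50 hours ago still passes (within 52 hours), while one 53 hours ago fails.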
#if ignore tests. Note, in job names, invalid chars need to be replaced with underscore (e.g. colon)
#anything in this regex: [^a-zA-Z0-9._-]
ws.diagnostic.ignore.memoryTest = false
ws.diagnostic.ignore.dbTest_grouper = false
ws.diagnostic.ignore.source_jdbc = false
ws.diagnostic.ignore.loader_CHANGE_LOG_changeLogTempToChangeLog = false
ws.diagnostic.ignore.loader_MAINTENANCE__grouperReport = false

#number of minute that can go by without a success before an error is thrown
ws.diagnostic.minutesSinceLastSuccess.loader_SQL_GROUP_LIST__aStem_aGroup2 = 60

# list groups which should check the size, in this case, "employee" or "students" in the key name is a variable
# {valueType: "group", required: true, regex: "^ws\\.diagnostic\\.checkGroupSize\\.([a-zA-Z0-9._-]+)\\.groupName$"}
#ws.diagnostic.checkGroupSize.students.groupName = community:students

# min group size of known groups
# {valueType: "integer", required: true, regex: "^ws\\.diagnostic\\.checkGroupSize\\.([a-zA-Z0-9._-]+)\\.minSize$"}
#ws.diagnostic.checkGroupSize.students.minSize = 18000

#if a change log consumer hasn't had a success but it is running and progress is being made, treat as a success
# {valueType: "boolean", required: true}
ws.diagnostic.successIfChangeLogConsumerProgress = true

# usdu daemon minutes since success 10 days
# {valueType: "integer"}
ws.diagnostic.minutesSinceLastSuccess.loader_OTHER_JOB_usduDaemon = 14400

# allow diagnostics from these IP ranges, e.g. 1.2.3.4/32 or 2.3.4.5/24, comma separated, leave blank if available from everywhere
# {valueType: "string", multiple: true}
ws.diagnostic.sourceIpAddresses =

# if status details should be sent to the client or just logged
# {valueType: "boolean", required: true}
ws.diagnostic.sendDetailsInResponse = true
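The rule in the config comments (replace anything matching [^a-zA-Z0-9._-] with an underscore) can be applied mechanically when deriving property keys from job names. Here is a small Python sketch with hypothetical helper names. It assumes the key is the sanitized job name prefixed with `loader_`, as in the examples above. Whether a trailing UUID on the job name belongs in the key is not covered here, so the example passes the name without one.

```python
import re

# Characters not allowed in the property key, per the config comment.
INVALID_CHARS = re.compile(r"[^a-zA-Z0-9._-]")


def diagnostic_name(job_name: str) -> str:
    """Sanitize a loader job name and add the loader_ prefix."""
    return "loader_" + INVALID_CHARS.sub("_", job_name)


def ignore_key(job_name: str) -> str:
    """Property key that tells diagnostics to ignore this job."""
    return "ws.diagnostic.ignore." + diagnostic_name(job_name)


def minutes_key(job_name: str) -> str:
    """Property key that overrides minutes since last success for this job."""
    return "ws.diagnostic.minutesSinceLastSuccess." + diagnostic_name(job_name)
```

With this, `minutes_key("SQL_GROUP_LIST__aStem:aGroup2")` gives the same key as the example line in the properties above, with the colon replaced by an underscore.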
You can restrict the checks to specific jobs with the comma-separated includeOnly URL parameter (2.2.3+ and 2.2.2.api.patch.6).
SUCCESS loader_CHANGE_LOG_changeLogTempToChangeLog: Loader job CHANGE_LOG_changeLogTempToChangeLog ignored in config since URL param contains includeOnly which doesn't have 'loader_CHANGE_LOG_changeLogTempToChangeLog' (46ms elapsed)
SUCCESS loader_MAINTENANCE_cleanLogs: Not checking, there was a success from before: 2016/01/31 11:45:13.000, expecting one in the last 3120 minutes (46ms elapsed)
SUCCESS loader_CHANGE_LOG_consumer_syncGroups: Not checking, there was a success from before: 2016/01/31 15:14:00.000, expecting one in the last 30 minutes (46ms elapsed)
SUCCESS loader_CHANGE_LOG_consumer_grouperRules: Loader job CHANGE_LOG_consumer_grouperRules ignored in config since URL param contains includeOnly which doesn't have 'loader_CHANGE_LOG_consumer_grouperRules' (46ms elapsed)
SUCCESS loader_SQL_SIMPLE__loader:owner__9178d7d636de49d6b271d12ca351dc19: Not checking, there was a success from before: 2016/01/31 13:40:04.000, expecting one in the last 3120 minutes (46ms elapsed)
You can exclude jobs with the comma-separated exclude URL parameter (2.2.3+ and 2.2.2.api.patch.6).
SUCCESS loader_CHANGE_LOG_changeLogTempToChangeLog: Not checking, there was a success from before: 2016/01/31 15:14:50.000, expecting one in the last 30 minutes (31ms elapsed)
SUCCESS loader_MAINTENANCE_cleanLogs: Loader job MAINTENANCE_cleanLogs ignored in config since URL param contains exclude which has 'loader_MAINTENANCE_cleanLogs' (31ms elapsed)
SUCCESS loader_CHANGE_LOG_consumer_syncGroups: Loader job CHANGE_LOG_consumer_syncGroups ignored in config since URL param contains exclude which has 'loader_CHANGE_LOG_consumer_syncGroups' (31ms elapsed)
SUCCESS loader_CHANGE_LOG_consumer_grouperRules: Not checking, there was a success from before: 2016/01/31 15:14:02.000, expecting one in the last 30 minutes (31ms elapsed)
SUCCESS loader_SQL_SIMPLE__loader:owner__9178d7d636de49d6b271d12ca351dc19: Loader job SQL_SIMPLE__loader:owner__9178d7d636de49d6b271d12ca351dc19 ignored in config since URL param contains exclude which has 'loader_SQL_SIMPLE__loader:owner__9178d7d636de49d6b271d12ca351dc19' (31ms elapsed)
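The two sample outputs above mention URL params named includeOnly and exclude, each holding a comma-separated list of job names with the `loader_` prefix. Assuming those are the literal query parameter names (the request URLs themselves are not shown on this page), a status URL could be assembled as in the Python sketch below. The function name and the base URL are placeholders.

```python
from urllib.parse import urlencode


def status_url(base, diagnostic_type, include_only=(), exclude=()):
    """Build a status URL with optional includeOnly / exclude job lists."""
    params = {"diagnosticType": diagnostic_type}
    if include_only:
        params["includeOnly"] = ",".join(include_only)
    if exclude:
        params["exclude"] = ",".join(exclude)
    # safe="," keeps the list separators readable; other reserved
    # characters (e.g. colons in job names) are percent-encoded.
    return base + "?" + urlencode(params, safe=",")
```

For example, passing `exclude=["loader_MAINTENANCE_cleanLogs", "loader_CHANGE_LOG_consumer_syncGroups"]` with `diagnosticType` set to `all` appends `&exclude=loader_MAINTENANCE_cleanLogs,loader_CHANGE_LOG_consumer_syncGroups` to the query string.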
Use the trivial diagnostic to run checks often. In a cluster, you can use it on all nodes and run a deeper check on one node only.
https://url.to.grouper.edu/grouper-ws/status?diagnosticType=trivial
Note that this is a success, but since there was an error recently, that error is displayed.
Server: mchyzer-PC, grouperVersion: 1.6.0, up since: 2010/05/17 02:19, 0 requests
SUCCESS memoryTest: Allocating 100000 bytes to an array to make sure not out of memory (11ms elapsed)
Diagnostics errors since start: 3 (11ms elapsed)
Last diagnostics error date: 2010/05/17 02:23:27
Last diagnostics error message:
There was an error in the diagnostic task DiagnosticLoaderJobTest, Loader job CHANGE_LOG_changeLogTempToChangeLog
:Cant find a success since: 2010/05/17 01:38:50.000, expecting one in the last 30 minutes
java.lang.RuntimeException: Cant find a success since: 2010/05/17 01:38:50.000, expecting one in the last 30 minutes
    at edu.internet2.middleware.grouper.ws.status.DiagnosticLoaderJobTest.doTask(DiagnosticLoaderJobTest.java:103)
    at edu.internet2.middleware.grouper.ws.status.DiagnosticTask.executeTask(DiagnosticTask.java:44)
    at edu.internet2.middleware.grouper.ws.status.GrouperStatusServlet.doGet(GrouperStatusServlet.java:129)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:433)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
    at java.lang.Thread.run(Thread.java:619)
The db diagnostic runs a lightweight query against the registry, plus the memory test.
https://url.to.grouper.edu/grouper-ws/status?diagnosticType=db
Server: mchyzer-PC, grouperVersion: 1.6.0, up since: 2010/05/17 02:19, 0 requests
SUCCESS memoryTest: Allocating 100000 bytes to an array to make sure not out of memory (20ms elapsed)
SUCCESS dbTest_grouper: Retrieved object from database (28ms elapsed)
Diagnostics errors since start: 3 (28ms elapsed)
The sources diagnostic does a find by ID on all sources, plus the DB test and the memory test. Note that the sources.xml settings in each source that configure the Grouper startup checks apply here as well, i.e. you can skip a source or set the ID to search for.
<init-param>
  <param-name>findSubjectByIdOnCheckConfig</param-name>
  <param-value>true|false</param-value>
</init-param>
<init-param>
  <param-name>subjectIdToFindOnCheckConfig</param-name>
  <param-value>someSubjectIdWhichMightExistOrWhatever</param-value>
</init-param>
<init-param>
  <param-name>findSubjectByIdentifiedOnCheckConfig</param-name>
  <param-value>true|false</param-value>
</init-param>
<init-param>
  <param-name>subjectIdentifierToFindOnCheckConfig</param-name>
  <param-value>someSubjectIdentifierWhichMightExistOrWhatever</param-value>
</init-param>
<init-param>
  <param-name>findSubjectByStringOnCheckConfig</param-name>
  <param-value>true|false</param-value>
</init-param>
<init-param>
  <param-name>stringToFindOnCheckConfig</param-name>
  <param-value>someStringWhichMightExistOrWhatever</param-value>
</init-param>
https://url.to.grouper.edu/grouper-ws/status?diagnosticType=sources
Server: mchyzer-PC, grouperVersion: 1.6.0, up since: 2010/05/17 02:19, 0 requests
SUCCESS memoryTest: Allocating 100000 bytes to an array to make sure not out of memory (37ms elapsed)
SUCCESS dbTest_grouper: Retrieved object from database (40ms elapsed)
SUCCESS source_g:gsa: Searched for subject by id: grouperTestSubjectByIdOnStartupASDFGHJ (42ms elapsed)
SUCCESS source_jdbc: Searched for subject by id: grouperTestSubjectByIdOnStartupASDFGHJ (45ms elapsed)
SUCCESS source_g:isa: Searched for subject by id: grouperTestSubjectByIdOnStartupASDFGHJ (45ms elapsed)
Diagnostics errors since start: 3 (45ms elapsed)
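Monitoring scripts sometimes want the individual check results rather than just the HTTP code. The Python sketch below parses per-check lines of the shape shown in the sample output above (status, check name, message, elapsed milliseconds). It is based only on the SUCCESS lines on this page, so the format of failure lines is an assumption, and the names `CheckResult` and `parse_line` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class CheckResult(NamedTuple):
    status: str
    name: str
    message: str
    elapsed_ms: int


# STATUS name: message (NNms elapsed); the name has no spaces but may contain colons.
LINE = re.compile(
    r"^(?P<status>[A-Z]+) (?P<name>\S+): (?P<message>.*) "
    r"\((?P<elapsed>\d+)ms elapsed\)$"
)


def parse_line(line: str) -> Optional[CheckResult]:
    """Return a CheckResult for a per-check line, or None for other lines."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    return CheckResult(
        match["status"], match["name"], match["message"], int(match["elapsed"])
    )
```

Header and summary lines such as "Server: ..." and "Diagnostics errors since start: ..." do not match and come back as `None`.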
Note: Grouper 2.2.3+ and 2.2.2.api.patch.6 have a diagnostic type of daemonJobsOnly, where only daemon (and loader) jobs are checked.
https://url.to.grouper.edu/grouper-ws/status?diagnosticType=daemonJobsOnly
Server: mchyzer-pc, grouperVersion: 2.2.2, up since: 2016/01/31 15:14, 0 requests
SUCCESS loader_CHANGE_LOG_changeLogTempToChangeLog: Not checking, there was a success from before: 2016/01/31 15:14:50.000, expecting one in the last 30 minutes (65ms elapsed)
SUCCESS loader_MAINTENANCE_cleanLogs: Not checking, there was a success from before: 2016/01/31 11:45:13.000, expecting one in the last 3120 minutes (65ms elapsed)
SUCCESS loader_CHANGE_LOG_consumer_syncGroups: Not checking, there was a success from before: 2016/01/31 15:14:00.000, expecting one in the last 30 minutes (66ms elapsed)
SUCCESS loader_CHANGE_LOG_consumer_grouperRules: Not checking, there was a success from before: 2016/01/31 15:14:02.000, expecting one in the last 30 minutes (66ms elapsed)
SUCCESS loader_SQL_SIMPLE__loader:owner__9178d7d636de49d6b271d12ca351dc19: Not checking, there was a success from before: 2016/01/31 13:40:04.000, expecting one in the last 3120 minutes (66ms elapsed)
Diagnostics errors since start: 0 (66ms elapsed)

The "all" diagnostic tests all loader jobs (for a success within a certain threshold), does a find by ID on all sources, and runs the DB test and the memory test. By default, all loader jobs look for a success within the last 25 hours. The exception is change log jobs, which look for a success within the last 30 minutes. This is configurable in grouper-ws.properties (grouper.properties in 2.2+).
https://url.to.grouper.edu/grouper-ws/status?diagnosticType=all
Server: mchyzer-PC, grouperVersion: 1.6.0, up since: 2010/05/17 02:45, 0 requests
SUCCESS memoryTest: Allocating 100000 bytes to an array to make sure not out of memory (6055ms elapsed)
SUCCESS dbTest_grouper: Retrieved object from database (6076ms elapsed)
SUCCESS source_g:gsa: Searched for subject by id: grouperTestSubjectByIdOnStartupASDFGHJ (6077ms elapsed)
SUCCESS source_jdbc: Searched for subject by id: grouperTestSubjectByIdOnStartupASDFGHJ (6091ms elapsed)
SUCCESS source_g:isa: Searched for subject by id: grouperTestSubjectByIdOnStartupASDFGHJ (6091ms elapsed)
SUCCESS loader_CHANGE_LOG_changeLogTempToChangeLog: Loader job CHANGE_LOG_changeLogTempToChangeLog ignored in config (6091ms elapsed)
SUCCESS loader_MAINTENANCE__grouperReport: Loader job MAINTENANCE__grouperReport ignored in config (6091ms elapsed)
SUCCESS loader_MAINTENANCE_cleanLogs: Found the most recent success: 2010/05/17 02:39:00.000, expecting one in the last 1500 minutes (6122ms elapsed)
SUCCESS loader_CHANGE_LOG_consumer_chrisTest: Loader job CHANGE_LOG_consumer_chrisTest ignored in config (6122ms elapsed)
SUCCESS loader_CHANGE_LOG_consumer_chrisTest: Loader job CHANGE_LOG_consumer_chrisTest ignored in config (6122ms elapsed)
SUCCESS loader_CHANGE_LOG_consumer_xmpp: Loader job CHANGE_LOG_consumer_xmpp ignored in config (6122ms elapsed)
SUCCESS loader_CHANGE_LOG_consumer_xmpp: Loader job CHANGE_LOG_consumer_xmpp ignored in config (6122ms elapsed)
SUCCESS loader_SQL_GROUP_LIST__aStem:aGroup2__f74068fd47124b079ea0c750354f6935: Found the most recent success: 2010/05/17 02:39:00.000, expecting one in the last 1500 minutes (6125ms elapsed)
SUCCESS loader_SQL_SIMPLE__aStem:aGroup__a186d80e0fe946b78dba45d16a2a1be7: Found the most recent success: 2010/05/17 02:39:00.000, expecting one in the last 1500 minutes (6132ms elapsed)
SUCCESS loader_ATTR_SQL_SIMPLE__penn:community:employee:orgPermissions:orgs__a8c2933dd66945af9755372efa9141b5: Found the most recent success: 2010/05/17 02:39:00.000, expecting one in the last 1500 minutes (6135ms elapsed)
Diagnostics errors since start: 0 (6135ms elapsed)
HTTP Status 500 -
type Exception report
message
description The server encountered an internal error () that prevented it from fulfilling this request.
exception
java.lang.RuntimeException: There was an error in the diagnostic task DiagnosticLoaderJobTest, Loader job CHANGE_LOG_changeLogTempToChangeLog
:Cant find a success since: 2010/05/17 01:38:50.000, expecting one in the last 30 minutes
    edu.internet2.middleware.grouper.ws.status.GrouperStatusServlet.doGet(GrouperStatusServlet.java:191)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
root cause
java.lang.RuntimeException: Cant find a success since: 2010/05/17 01:38:50.000, expecting one in the last 30 minutes
    edu.internet2.middleware.grouper.ws.status.DiagnosticLoaderJobTest.doTask(DiagnosticLoaderJobTest.java:103)
    edu.internet2.middleware.grouper.ws.status.DiagnosticTask.executeTask(DiagnosticTask.java:44)
    edu.internet2.middleware.grouper.ws.status.GrouperStatusServlet.doGet(GrouperStatusServlet.java:129)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
note The full stack trace of the root cause is available in the Apache Tomcat/6.0.20 logs.
Note: the following works with Apache. If you are not using Apache (e.g. the default in v5), you cannot do this.
This URL is in the container and is not protected by Shibboleth by default:
/status_grouper/status?diagnosticType=db
You can also set this up yourself. In the demo server, this URL is protected:
https://grouperdemo.internet2.edu/grouper_v2_3/status
Because this URL is protected:
https://grouperdemo.internet2.edu/grouper_v2_3
This server uses Apache in front of Tomcat with an AJP reverse proxy, so to leave the status servlet unprotected, add another mapping in Apache that is not protected:
ProxyPass /status_grouper_v2_3/status ajp://localhost:8131/grouper_v2_3/status
That URL is not covered by authentication, so it is unprotected:
https://grouperdemo.internet2.edu/status_grouper_v2_3/status?diagnosticType=all
See Also