We scheduled an outage for Saturday from 9am to 2pm. We finished within that window, though the upgrade took most of it.
- LDAP was up the whole time
- SQL interface to Grouper was up the whole time (except while the DDL views were being replaced; we rearranged the script to minimize that window)
- The WS was in readonly mode the whole time (except when the service was bounced, and perhaps during part of the DDL upgrade)
- The UI was down
Steps
- Disconnect and reconnect to VPN (to avoid hitting the 10 hour timeout mid-upgrade)
- Copy tarballs in advance to prod servers
Analyze tables beforehand (takes 30 minutes)
    select 'ANALYZE TABLE ' || table_name || ' estimate STATISTICS sample 100000 rows;' as script from user_Tables where table_name like 'GROUPER%'

e.g.

    ANALYZE TABLE GROUPER_ATTRIBUTES estimate STATISTICS sample 100000 rows;
    select to_char(sysdate,'mm/dd/yyyy hh24:mi:ss') from dual;
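When the table list is already known, the same statements can be produced outside the database. This is a minimal sketch, assuming table names arrive one per line on stdin; `gen_analyze` is a hypothetical helper name, not part of Grouper.

```shell
#!/bin/sh
# Hypothetical helper: emit one ANALYZE statement per table name read from stdin.
# The sample size mirrors the 100000 rows used in the query above.
gen_analyze() {
  while IFS= read -r table_name; do
    # Skip blank lines so the output contains only complete statements
    [ -n "$table_name" ] || continue
    printf 'ANALYZE TABLE %s estimate STATISTICS sample 100000 rows;\n' "$table_name"
  done
}

# Example: two table names in, two statements out
printf 'GROUPER_ATTRIBUTES\nGROUPER_GROUPS\n' | gen_analyze
```

The output can be saved to a file and run from the SQL client in the same way as the generated script.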
- Turn off envs (loader, WS, UI)
Turn on WS as readonly
    [appadmin@flash2 classes]$ pwd
    /opt/appserv/tomcat_3b/webapps/grouperWs/WEB-INF/classes
    [appadmin@flash2 classes]$ emacs grouper.properties

    # set the API as readonly (e.g. during upgrades). Any updates will throw an exception
    grouper.api.readonly = true

Copy to nodes

    [appadmin@fastprod-mgmt-01 classes]$ clusterCopy.sh grouperWs grouper.properties

Bounce the WS

Test the WS

Read success

    [mchyzer@flash pennGroupsClient-test-2.0.0]$ java -jar grouperClient.jar --operation=getMembersWs --groupNames=test:testGroup --debug=true

Write failure

    [mchyzer@flash pennGroupsClient-test-2.0.0]$ java -jar grouperClient.jar --operation=addMemberWs --groupName=test:testGroup --subjectIdentifiers=mchyzer --debug=true
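The readonly flag can also be flipped without opening an editor, which makes it easier to set and later unset consistently. This is a minimal sketch, assuming the property sits on a line of its own in `grouper.properties`; `set_readonly` is a hypothetical helper, and the demo runs against a scratch file rather than the live one.

```shell
#!/bin/sh
# Hypothetical helper: set grouper.api.readonly to the given value in a properties file.
# Usage: set_readonly <properties-file> <true|false>
set_readonly() {
  props_file=$1
  value=$2
  tmp_file="${props_file}.tmp.$$"
  if grep -q '^[[:space:]]*grouper\.api\.readonly[[:space:]]*=' "$props_file"; then
    # Key already present: rewrite that line via a temp file, then swap it in
    sed "s/^[[:space:]]*grouper\.api\.readonly[[:space:]]*=.*/grouper.api.readonly = ${value}/" \
      "$props_file" > "$tmp_file" && mv "$tmp_file" "$props_file"
  else
    # Key absent: append it (assumes the file ends with a newline)
    printf 'grouper.api.readonly = %s\n' "$value" >> "$props_file"
  fi
}

# Example against a scratch copy, not the live file
demo_file=$(mktemp)
printf 'some.other.key = 1\ngrouper.api.readonly = false\n' > "$demo_file"
set_readonly "$demo_file" true
grep 'grouper.api.readonly' "$demo_file"
rm -f "$demo_file"
```

After changing the real file, the copy to the other nodes and the WS bounce described above are still required.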
Run the change log temp to change log job

    ./gsh
    grouperSession = GrouperSession.startRootSession();
    edu.internet2.middleware.grouper.GrouperSession: 5754482603ea4f3db46079e64430c165,'GrouperSystem','application'
    gsh 1% loaderRunOneJob("CHANGE_LOG_changeLogTempToChangeLog");
Have DBA perform a backup on that schema
Put the new UI in place (keep backup of old), or just upgrade in place with installer if not building from source
    sftp> cd /tmp
    sftp> put C:\dev\eclipse\projects\pennGrouper\dist\grouperUiDev.tar.gz

    [appadmin@flash2 webapps]$ pwd
    /opt/appserv/tomcat_2v/webapps
    [appadmin@flash2 webapps]$ tar czf /tmp/grouperUi.20160622.tgz grouper/
    [appadmin@flash2 webapps]$ rm -rf grouper*
    [appadmin@flash2 webapps]$ tar xf /tmp/grouperUiDev.tar
    [appadmin@flash2 webapps]$ mkdir grouper
    [appadmin@flash2 webapps]$ cd grouper
    [appadmin@flash2 grouper]$ unzip ../grouper.war
    [appadmin@flash2 grouper]$ cd ..
    [appadmin@flash2 webapps]$ ls
    grouper  grouper.war
    [appadmin@flash2 grouper]$ rm -rf /opt/appserv/tomcat_2v/logs/grouper
    [appadmin@flash2 grouper]$ mkdir /opt/appserv/tomcat_2v/logs/grouper

###### INSTALL PATCHES (if not upgrading from installer)

    [appadmin@flash2 grouper]$ cd /tmp
    [appadmin@flash2 tmp]$ mkdir grouperInstaller
    [appadmin@flash2 tmp]$ cd grouperInstaller/
    [appadmin@flash2 grouperInstaller]$ wget http://software.internet2.edu/grouper/release/2.3.0/grouperInstaller.jar
    java -jar grouperInstaller.jar

patch, ui, put in dir: /opt/appserv/tomcat/apps/grouper/webapps/grouper
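Because the old webapp is removed with `rm -rf` right after the backup, it can help to confirm the backup is readable before deleting anything. This is a minimal sketch, assuming a date-stamped `.tgz` naming like the one used above; `backup_dir` is a hypothetical helper and the demo uses scratch directories only.

```shell
#!/bin/sh
# Hypothetical helper: archive a directory to a date-stamped tarball before replacing it.
# Usage: backup_dir <parent-dir> <dir-name> <backup-dir>
# Prints the path of the tarball it created, or returns non-zero on failure.
backup_dir() {
  parent=$1
  name=$2
  dest=$3
  stamp=$(date +%Y%m%d)
  archive="${dest}/${name}.${stamp}.tgz"
  # -C keeps the paths inside the archive relative to the parent directory
  tar czf "$archive" -C "$parent" "$name" || return 1
  # Do not report success unless the archive can be listed
  tar tzf "$archive" > /dev/null || return 1
  printf '%s\n' "$archive"
}

# Example with scratch directories standing in for webapps and /tmp
demo_parent=$(mktemp -d)
demo_dest=$(mktemp -d)
mkdir "$demo_parent/grouper"
echo "placeholder" > "$demo_parent/grouper/index.html"
backup_dir "$demo_parent" grouper "$demo_dest"
rm -rf "$demo_parent" "$demo_dest"
```

Only once the helper prints an archive path would the old directory be removed.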
Keep a copy of old GSH scripts from WS (since they are in WEB-INF/bin/*.gsh and will be overwritten)
Generate the DDL script
    [appadmin@flash2 grouper]$ ./gsh -registry -check
    [appadmin@flash2 bin]$ pwd
    /opt/appserv/tomcat_2v/webapps/grouper/WEB-INF/bin

Rearrange the script so the id index stuff (DDL and SQL) is in another file. Run that first so the server stays up readonly.

Make 10 scripts with 50k rows each instead of one script with 500k rows. Add commits between each line to avoid deadlock.

Log in to sqlplus and run the scripts

    c:\temp>sqlplus authzadm/XXXXXXXXX@dcom
    SQL> @C:\temp\grouperDdl_20160707_14_37_09_963_start6.sql

Make sure grouper_groups, grouper_stems, grouper_attribute_def, and grouper_attribute_def_name have no rows with id_index=null

    select * from grouper_groups where id_index is null
    select * from grouper_stems where id_index is null
    select * from grouper_attribute_def where id_index is null
    select * from grouper_attribute_def_name where id_index is null
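Splitting the large script and adding a commit after every line can be scripted rather than done by hand. This is a minimal sketch, assuming the extracted file holds exactly one SQL statement per line; `chunk_sql` and the `_NN.sql` output naming are hypothetical, and the demo uses a tiny scratch file.

```shell
#!/bin/sh
# Hypothetical helper: split a one-statement-per-line SQL file into chunks,
# writing "commit;" after every statement.
# Usage: chunk_sql <input.sql> <statements-per-chunk> <output-prefix>
chunk_sql() {
  input=$1
  per_chunk=$2
  prefix=$3
  awk -v n="$per_chunk" -v prefix="$prefix" '
    NF == 0 { next }                  # ignore blank lines
    {
      if (count % n == 0) {           # time to start a new chunk file
        if (out != "") close(out)
        chunk++
        out = sprintf("%s_%02d.sql", prefix, chunk)
      }
      print $0 > out                  # the statement itself
      print "commit;" > out           # commit after each statement
      count++
    }
  ' "$input"
}

# Example: 5 statements, 2 per chunk, gives 3 files
demo_dir=$(mktemp -d)
printf 'update t set a=1;\nupdate t set a=2;\nupdate t set a=3;\nupdate t set a=4;\nupdate t set a=5;\n' > "$demo_dir/in.sql"
chunk_sql "$demo_dir/in.sql" 2 "$demo_dir/part"
ls "$demo_dir"
rm -rf "$demo_dir"
```

With 500k rows and 50000 statements per chunk this would yield the 10 scripts described above, each of which can then be run from sqlplus in turn.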
See if need to fix pit_gs_start_idx
    CREATE UNIQUE INDEX pit_gs_start_idx ON GROUPER_PIT_GROUP_SET (START_TIME, SOURCE_ID);

    ##change this
    CREATE INDEX pit_gs_start_idx ON GROUPER_PIT_GROUP_SET (START_TIME, SOURCE_ID);

    ##figure out the issue
    select * from grouper_pit_group_set gpgs1, grouper_pit_group_set gpgs2
    where GPGS1.ID != gpgs2.id
    and (gpgs1.start_time = gpgs2.start_time or (gpgs1.start_time is null and gpgs2.start_time is null))
    and (gpgs1.source_id = gpgs2.source_id or (gpgs1.source_id is null and gpgs2.source_id is null));
Copy the deployed dir, run the upgrader, and say yes when asked to run the upgrade GSH scripts for 2.2.0, 2.2.1, 2.3.0, etc. against the database (this can be time-consuming)
- Note: if there is an error about a group that does not exist (e.g. in a require group), look at the table grouper_attributes_legacy and edit the group names if they were renamed since the require group was put into place
Analyze tables again
    select 'ANALYZE TABLE ' || table_name || ' estimate STATISTICS sample 100000 rows;' as script from user_Tables where table_name like 'GROUPER%'

e.g.

    ANALYZE TABLE GROUPER_ATTRIBUTES estimate STATISTICS sample 100000 rows;
Turn off readonly WS, delete it, put new WS app in place (or just upgrade in place if not building from source)
    [appadmin@fastprod-mgmt-01 webapps]$ clusterRun grouperWs "rm -rf /opt/appserv/tomcat/apps/grouperWs/webapps/*"
    [appadmin@fastprod-mgmt-01 webapps]$ pwd
    /opt/appserv/tomcat/apps/grouperWs/webapps
    [appadmin@fastprod-mgmt-01 webapps]$ tar xzvf /tmp/grouperWsProd.tar.gz
    [appadmin@fastprod-mgmt-01 webapps]$ mkdir grouperWs
    [appadmin@fastprod-mgmt-01 webapps]$ cd grouperWs
    [appadmin@fastprod-mgmt-01 grouperWs]$ unzip ../grouperWs.war
    [appadmin@fastprod-mgmt-01 grouperWs]$ cd ..
    [appadmin@fastprod-mgmt-01 webapps]$ clusterCopy.sh grouperWs /opt/appserv/tomcat/apps/grouperWs/webapps/grouperWs.war
    [appadmin@fastprod-mgmt-01 webapps]$ clusterCopy.sh grouperWs /opt/appserv/tomcat/apps/grouperWs/webapps/grouperWs
- Recompile invalid views if any of your own views use the Grouper views
- Get new UI/WS/loader, make sure patched
- Try GSH
- Turn on UI/WS/loader
- Test
Upgrade, patch, and turn on offsite readonly WS