Thursday, July 9, 2015

Daily Oracle DBA Checklist - For Production




Daily Production Oracle DBA checklist
                                              

Here I’ve mentioned lot of Daily activities of a production Oracle DBA (based on my experience). Actually it totally depends on your Database’s characteristics, Size of Database, environment etc. you may have something more to check. So please keep in mind.
And here I am telling you to check that’s mean not to check all by manually. It will take lot of times.  DBAs should try to avoid manually checking anything daily or weekly. Maximum checks should be automated, and should alert the DBA only if there is an error (by email, SMS). But for these, DBA need’s good knowledge on shell scripting, PowerShell, DOS, PL/SQL, OEM or other monitoring tools.
 
Here are the (in general) Daily Production DBA job checklist-


1) Checking all instances status, Make sure all are up and running 

     How: Log in to each instance, run any query to make sure databases are available and running. Such as
            $ export ORACLE_SID=orcl
            $ sqlplus /nolog
            SQL> select host_name from v$instance ; 

2) Check alert log entries for any error or any new things (at least daily two times for all databases)

i) When you enter into office 
ii) When you are leaving office  
And when any complain comes about database slowness or anything

3) Verify success of databases RMAN/export/any other backups.

                                    i)  Check the RMAN/export logs for any error.

                                    ii) Check the backup size for confirmation and daily backup size growth.  

           

4. Verify success of database backup archiving to tape (or let system admin know to back it up.)

5) Verify Auto control file backup successful 

6) Verify success of all schedule job

7) Verify enough spaces are there for -

                                    i) Space of Tablespaces (if autoextend = no)
                                    ii) Space in OS Hard disk/ Storage LUN
                                    iii) Archive log space/ flash_recovery_area locations

8) Check to all objects status to make sure all are 'VALID'

9) If you have a Standby Database, check the status of standby database and gap between primary. 

10) Review contention for CPU, memory, network, and disk resources.

                        In linux can use -  top, netstat, vmstat  etc. commands
                        In Sun Solaris can use – prstat, vmstat ,netstat etc. 

10) Checks for blocking locks , killed sessions etc. 

11) Check connected sessions (average count) to verify there are no abnormalities (sometimes parallel thread exceeds your process parameter’s value) 

                        i) Check top session’s information.
                        ii) Top SQL statements

12) Troubleshooting if any issue reported 

13) Spend sometimes to increase and upgrade your knowledge 

Oh, Next time I will try to write with how to do all checks...(Automated way)

 Cheers.... Halim