OEM Views for Displaying -Metrics and Thresholds – Part 1

If you use Oracle’s Grid Control (Regardless of the version), you have been probably asked or wondered what thresholds are used for the targets you are monitoring.  We all have tried to standardize and make the thresholds and metrics the same for all customers, but as much we try,  each customer and database will have different characteristics that will require threshold modifications to the standard.  I sure at some point, documentation would be asked for and or required for the thresholds set and used for the targets you monitor.   You have a couple of choices one is to use the Grid Control OMS console and either perform a screen print of the metrics, or you can utilize the tables inside of grid control.

BTW, you can export the templates, but will have to convert them from XML to a readable format and cant guarantee all targets are using the values from the “Standard” templates.

If you have more than 1 or target to monitor, you need a better way than screen prints every target.    I have started to utilize the Grid Control tables more as the source of these reports.   I have found a few tables that will help generate these reports into CSV files or can be used by APEX.

Todays blog will show the table and query (Sqlplus) that will get information from your target’s templates used (Default or otherwise).   Part 2 will show the tables for the targets actually used metrics.

As a note all tables are owned by the user sysman and are views and can be found in the following URL:

https://docs.oracle.com/cd/E24628_01/doc.121/e57277/toc.htm – Grid Control Extensibility Guide.

To find out the metrics and thresholds used by the templates you can query the view  – mgmt$template_metric_settings – this view contains the template settings and information.  The columns in the table are defined and described  in the following document: mgmt$template_metric_settings

To compile the template metric values, you can execute the following query  Note – This doesn’t include corrective actions, so the query doesn’t include those columns.   If you use the correction actions, please add the corrective action columns to the query:

<select t.template_name, 
       t.target_type, 
       t.metric_name, 
       t.metric_column, 
       t.collection_name, 
       t.column_label,
       decode (t.warning_operator,0,'GT',
       1,'EQ',
       2,'LT',
       3,'LE',
       4,'GE',
       5,'CONTAINS',
       6,'NE',
       7,'MATCH : Regular expression',
       'Unknown') "Warning Operator",
       warning_threshold,
       decode(critical_operator,0,'GT',
       ,'EQ',
       2,'LT',
       3,'LE',
       4,'GE',
       5,'CONTAINS',
       6,'NE',
       7,'MATCH : Regular expression',
       'Unknown') "Critical Operator", 
       t.critical_threshold,
       t.occurrence_count,
       t.warning_action_type,
       t.critical_action_type
 from sysman.mgmt$template_metric_settings t
where (t.template_name, t.target_type, t.metric_name not in (
select template_name, target_type, metric_name
 from  sysman.mgmt$template_metric_settings i
where t.template_name = i.template_name
  and t.target_type = i.target_type
  and a.metric_name = b.metric_name
  and (ltrim(rtrim(i.critical_threshold)) is null) and (ltrim(rtrim(i.warning_threshold)) is null))
order by t.template_name, t.target_type, t.column_label >

Note – when copying the script – the single quotes may not copy correctly.

You can also spool this out to a CSV file using the “colsep ,” or use the information from https://chartio.com/resources/tutorials/how-to-write-to-acsv-file-using-oracle-sql-plus/

Part II will include all the target thresholds query

OEM12C Deployment and Mass Promotion

OEM12C Deployment and Mass Promotion

I was privileged to be a speaker at the Rocky Mountain training days #td17 on the topic of “Automagically” Deploying EM12c Agents and Mass promotion of targets once the hosts have been added.

The presentation is attached, but in a nut shell, using EM12C’s RPM generation and using Puppet we are able to push out agents, the moment the machines are built and validated.

The first step is to log into the OMS machine and generate the RPM using EMCLI (After logging into emcli )

> emcli get_supported_platforms  <== This will get the information for building the RPM’s

 

emcli get_agentimage_rpm -destination=/tmp -platform=”Linux x86-64″ -version=”12.1.0.5.0″

This last command will generate a RPM in the /tmp directory called: “oracle-agt-12.1.0.5.0-1.0.x86_64.rpm”   << notice the version # and platform in the name

Now that the RPM is ready, we use Puppet to copy and to install the RPM on new machines.

A few notes about puppet installs with RPMS

  • We use a standard location / directory
  • Only run if the standard location is NOT in place
  • Only run if “emwd.pl” is not in place
  • Verifies the correct ownership of the directories
  • Uses oraInventory for location

There are 2 steps for the puppet process

  1. apply RPM – >> rpm -ivh oracle-agt-12.1.0.5.0-1.0.x86_64.rpm
  2. create agent.properties file – With host name, default password, etc.
  3. run – /etc/init.d/oracle-agt RESPONSE_FILE=/usr/lib/oracle/agent/agent.properties
  4. When Step 3 finishes, there should be a new OMS entry for that host and connected to the OMS.

Last step – Just added recently – (Not part of presentation) is to perform a mass promotion of non-host targets like databases, listeners, etc.

mike-gangler-rmoug-v32016_1197_gangler_pdf

Mass Promote code is in GIT Hub – https://github.com/harry2040/OEM12cMassPromote

Thanks again to #td17 for a great conference and the opportunity.

–Mike

Collaborate 16 !

I will be presenting 3 presentations at this years Collaborate 16 in Las Vegas.

Monday – April 11 – 12 – 12:30 – Palm B – Using Puppet and RPM’s to deploy EM12C Agents

Wednesday – April 13 – – 12:00 – 12:30 – Jasmine B – Granting Oracle Schema Permissions When Objects not Created Yet !

Thursday – April 14 – 8:30 am – 9:30 am – Palm  D – Optimize and Simplify Oracle 12C RAC using dNFS, ZFS and OISP

2016_1197_gangler_pdf

 

2016_1198_gangler_ppt

 

2016_1201_gangler_ppt

MOUS15 is here !!!

If you are in the Indiana, Michigan, Illinois, Ohio or Canada region and want to see and experience great Oracle user and technical presentations, networking and hands-on workshops, please register to the Michigan Oracle User Summit (#MOUS15), on Wednesday, November 18th at Schoolcraft college in Livonia, Michigan (www.mous.us).    Its a full day of presentations (45) that include Big Data presentations, Hyperion presentations, JDEdwards, Oracle Applications, Middleware (SOA), Database (Oracle12c, RAC, others) and 2 hands-on workshops – #cloneattack and #racattack.   We also have 13 Oracle Ace and Ace directors, Rich Niemiec of Rolta will be our keynote speaker.   Please visit our website (www.mous.us) or guidebook for more details on the conference content.

So not only do you get some great user and technical presentations, but you get 2 great hands-on workshops – Clone Attack and Racattack (www.racattack.org).

Clone attack (www.delphix.com ) – did you ever want to create development databases from production with a small footprint and dynamic way ?   Well come to MOUS15 and create many development databases on your own PC.

RACATTACK – (www.racattack.org) – Racattack allows you to create a 12c RAC database on your own PC, yes, you too can have your own RAC database on your pc.

Wait there is more !! – WE ARE ON GUIDEBOOK – http://www.guidebook.com (Look for MOUS) its free !!

Lastly, this conference is a great way to network with other Oracle users, a full day of presentations, hands-on workshops, great food , education and its priced reasonably ($50 pre-register, $75 at door).

Hope to see you at the conference and please contact me with any questions,

--Mike Gangler
mjgangler@yahoo.com

ORA-10997 and ORA-09968 on ASM startup (With NFS)

During a recent hardware upgrade, i received the following error on a ASM (NO-RAC) startup.   It was very confusing, because i was custom to seeing ASM on Storage LUNS and in this case it was using ASM (NO-RAC) and NFS storage.   Unfortunately, as always with late night changes, you don’t notice things until you get fresh eyes on the issue.   A simple “df -kh” would’ve told me that it was NFS, but made too many presumptions.

When i first told about the NFS settings, i thought i needed to restart nfs, but that didn’t work.   Than after getting another fresh pair of eyes from the Linux team, we realized that the directory “/var/lib/nfs” was set to 750???   After some investigation, we saw that puppet was changing it to 750 every 5 minutes, so even when we changed it to 755, it was 750.  so the next question, why ?

Well it appears that FTI requirements are the culprit.  Apparently the auditors don’t understand nfs and O/S’s, probably don’t care either.   Apparently, FTI companies are required to set this to 750…. Now i understand security, but this directory needs to be readable by Oracle to start up the nfs drives and ASM.

Until we get a permanent solution, the linux team offered the following solution:

the setting of 755 is only needed for startup, so the temporary solution is (as root):


root> chmod 755 /var/lib/nfs
root> chatter +i /var/lib/nfs <== protects
root> crsctl start has
== after successful startup ====
root> chatter -i /var/lib/nfs <== takes off the protection

The following is the error description:

grid> srvctl start asm

PRCR-1079 : Failed to start resource ora.asm
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-10997: another startup/shutdown operation of this instance in progress
ORA-09968: unable to lock file
Linux-x86_64 Error: 37: No locks available
Additional information: 64
. For details refer to "(:CLSN00107:)" in "/u01/app/grid/product/11.2.0.4/grid_home/log/<machine_name>/agent/ohasd/oraagent_grid/oraagent_grid.log".
CRS-2674: Start of 'ora.asm' on '<machine name>' failed

[kslwt_validate_gwaitctx: magic], [21313] – “asm_diskstring” > 1024

Recently our staff tried to add disk strings to ASM diskgroups and the strings exceeded 1024 characters. We are on a oracle 11.2.0.4 on redhat linux. In the Docs, it says it should hold up to 4K, but when we added to the asm_diskstrings we got the following error and the ASM DB wouldn’t startup causing our production db to be down:

ORA-0600: [kslwt_validate_gwaitctx: magic], [21313]

We looked many places (Including Metalink and Communities) and couldn’t find a ora-0600 error but did find a ora-7445 with the same code/error.

We opened a ticket and was notified that there is a bug with this error and a patch to fix it.

We were hitting Bug 18273830 where when the asm_diskstring becomes larger than 1024 characters it causes this failure. There is a one-off Patch available on My Oracle Support and 11.2.0.4 which will resolve this problem

Patch 18273830
Release Oracle 11.2.0.4.0
Platform Linux x86-64

Hope this helps, as there isn’t much documentation on this bug/error.

RMAN-03009 Error with ORA-19511

I recently modified the RMAN servers to perform auto backup of the controlfiles, etc. After the change, the incremental’s worked without error.

When I Ran the Full RMAN backup, got the following errors :
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of Control File and SPFILE Autobackup command on ch00 channel at 08/03/2014 18:42:51
ORA-19506: failed to create sequential file, name=”c-2369888563-20140803-ff”, parms=””
ORA-27027: sbtremove2 returned error
ORA-19511: Error received from media manager layer, error text:
Failed to remove, c-2369888563-20140803-ff, from image catalog.

The reason why is the rman backups were NOT using the rman catalog for over a year (Commented out for some odd reason). The RMAN backups were running but without a catalog connection.

So when i added control file auto backup, I received this error.

To FIX:

Connect to RMAN using the rman catalog
run : rman> crosscheck archivelog all;
Run Full backup.

Once the crosscheck cleaned up all the archive logs since 2013, the full backup could be performed.

I also added the RMAN catalog to all the backups on this machine

Cloud 12c Upgrade issues and resolutions

In the process of upgrading to EM12c Cloud control Version 3, i received a few errors that delayed the upgrade by months.   With the help of Oracle support, i finally resolved it today and finished he upgrade.

Error  #1 – “em target policy error”  

Fix – under the sysman account – “exec emd_maintenance.ENABLE_EM_SECURITY_POLICIES”  <==== FIX FOR Error

This was due to missing information below (BEFORE ERROR).

Executing Prereq Action

—————————–
executing Action: select count(*) from (select ‘EM_TARGET_POLICY’ from dual where not exists (select policy_name from dba_policies where object_owner=upper(?) and pf_owner=upper(?) and object_name IN (‘MGMT_TARGETS’,’EM_MANAGEABLE_ENTITIES’)))

=========NEXT ERROR ================

After that was fixed, received the next error (at about 60/70 % upgraded)

INFO: oracle.sysman.top.oms:Securing OMS ……
INFO: oracle.sysman.top.oms:EM Key is secured and is backed up at /u01/app/oracle/middleware_3/oms/sysman/config/emkey.ora
INFO: oracle.sysman.top.oms:Adapter created successfully: emgc_USER
INFO: oracle.sysman.top.oms:Adapter created successfully: emgc_GROUP
INFO: oracle.sysman.top.oms:EM Key is secured and is backed up at /u01/app/oracle/middleware_3/oms/sysman/config/emkey.ora
INFO: oracle.sysman.top.oms:ORA-00001: unique constraint (SYSMAN.MGMT_OMS_PROPERTY_DEF_PK) violated
INFO: oracle.sysman.top.oms:ORA-06512: at “SYSMAN.EM_OMS_PROPERTIES”, line 420
INFO: oracle.sysman.top.oms:ORA-06512: at “SYSMAN.EM_OMS_PROPERTIES”, line 761
INFO: oracle.sysman.top.oms:ORA-06512: at “SYSMAN.EM_OMS_PROPERTIES”, line 589
INFO: oracle.sysman.top.oms:ORA-06512: at “SYSMAN.EM_OMS_PROPERTIES”, line 982
INFO: oracle.sysman.top.oms:ORA-06512: at line 1
INFO: oracle.sysman.top.oms:

==== BEFORE FIX =============== – No rows showing as SYSMAN

SYSMAN_apgrid10>select count(*) from mgmt_oms_property_def

COUNT(*)
———-
0

Elapsed: 00:00:00.01
SYSMAN>show user
USER is “SYSMAN”
SYSMAN>connect sys as sysdba
Connected.
SYS>select count(*) from sysman.mgmt_oms_property_def;

COUNT(*)
———-
540

Elapsed: 00:00:00.01
SYS_apgrid10>show user
USER is “SYS”

===========Fix =======

>grant exempt access policy to sysman;  <=== this is what you have to do to fix the unique constraint error

Grant succeeded.

Elapsed: 00:00:00.07
>select count(*) from sysman.mgmt_oms_property_def;

COUNT(*)
———-
494

Elapsed: 00:00:00.10
>connect sysman
Connected.
SYSMAN_apgrid10>select count(*) from mgmt_oms_property_def;

COUNT(*)
———-
494