Silect offers new MP Authoring tool – MP Author Professional

June 15, 2016, 6:13 am

http://www.silect.com/content/mp-author-professional

Building on the simple-to-use, free MP Author tool, Silect has released MP Author Professional. It adds and extends many monitoring scenarios beyond the basic version.

↧

Silect MPAuthor version 7.0 released

June 21, 2016, 6:16 pm

Silect has just released a new version of the free MP Author: 7.0

http://www.silect.com/mp-author

June 2016: MP Author Version 7.0

The enhancements in MP Studio and MP Author consist of a number of changes to improve reliability and performance including the following:

“Event Description” has been added as an expression option for both event monitors and rules
Added support for “CONTAINS” as a comparison for Registry discovery for new classes
Support for alert suppression for rules
Added full support for alert parameters for event rules (to bring it to parity with event monitors)
Updates to the User Guide and Installation Guide
Recover and continue from SDK calls for MPB format MPs that result in exceptions
Various other fixes and performance improvements

↧

SQL MP updates – version 6.7.2.0

July 11, 2016, 7:48 am

The SQL Management packs have been updated. This is the first major release since 6.6.4.0, back in late 2015.

I will list all the SQL MP’s below. Don’t import all of them unless you need each specialization. For instance, don’t import the Reporting Services and Analysis Services MP’s if you really only want alerting and monitoring for the SQL Database Engines.

The latest versions of ALL SQL MP’s and locations below, as of 7/11/2016:

SQL 2005-2012 DB Engine	6.7.2.0	http://www.microsoft.com/en-us/download/details.aspx?id=10631
SQL 2008 Replication	6.7.2.0	http://www.microsoft.com/en-us/download/details.aspx?id=47723
SQL 2008 Reporting Services	6.6.7.6	https://www.microsoft.com/en-us/download/details.aspx?id=43391
SQL 2008 Analysis Services	6.6.7.6	https://www.microsoft.com/en-us/download/details.aspx?id=41659
SQL 2012 Replication	6.7.2.0	http://www.microsoft.com/en-us/download/details.aspx?id=47721
SQL 2012 Reporting Services	6.6.7.6	https://www.microsoft.com/en-us/download/details.aspx?id=43392
SQL 2012 Analysis Services	6.6.7.6	https://www.microsoft.com/en-us/download/details.aspx?id=41658
SQL 2014 DB Engine	6.7.2.0	http://www.microsoft.com/en-us/download/details.aspx?id=42573
SQL 2014 Replication	6.7.2.0	http://www.microsoft.com/en-us/download/details.aspx?id=47720
SQL 2014 Reporting Services	6.6.7.6	https://www.microsoft.com/en-us/download/details.aspx?id=43390
SQL 2014 Analysis Services	6.6.7.6	https://www.microsoft.com/en-us/download/details.aspx?id=44586
SQL 2016 DB Engine	6.7.2.0	http://www.microsoft.com/en-us/download/details.aspx?id=53008
SQL 2016 Replication	6.7.2.0	http://www.microsoft.com/en-us/download/details.aspx?id=53009
SQL 2016 Reporting Services	6.7.2.0	http://www.microsoft.com/en-us/download/details.aspx?id=53010
SQL 2016 Analysis Services	6.7.2.0	http://www.microsoft.com/en-us/download/details.aspx?id=53011
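
To compare what is already imported against the versions in the list above, a quick check from the Operations Manager Shell can help. This is only a sketch: Select-OutdatedPack is a hypothetical helper name, and the Microsoft.SQLServer.* name filter in the usage comment is an assumption about how the packs are named in your management group.

```powershell
# Hypothetical helper: returns the packs whose version is lower than the target.
# Works on any objects that expose Name and Version properties.
function Select-OutdatedPack {
    param(
        [Parameter(Mandatory)] [object[]] $Pack,
        [Parameter(Mandatory)] [version]  $Target
    )
    # Cast to [version] so that 6.6.7.6 and 6.7.2.0 compare numerically, not as text
    $Pack | Where-Object { [version]$_.Version -lt $Target }
}

# Usage in the Operations Manager Shell (the name filter is an assumption):
# $sqlPacks = Get-SCOMManagementPack | Where-Object { $_.Name -like 'Microsoft.SQLServer.*' }
# Select-OutdatedPack -Pack $sqlPacks -Target '6.7.2.0' | Select-Object Name, Version
```

Keep in mind that the 2008, 2012, and 2014 Reporting Services and Analysis Services packs are listed at 6.6.7.6, so those should be compared against that target rather than 6.7.2.0.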

↧

System Center 2016 gets a launch date

July 12, 2016, 1:41 pm

System Center 2016 and Windows Server 2016 will be launched at the Microsoft Ignite conference in September 2016.

Read about it here: https://blogs.technet.microsoft.com/systemcenter/2016/07/12/system-center-2016-to-launch-in-september/

↧

Updated SQL RunAs addendum MP’s

July 13, 2016, 9:57 am

Just a quick notice to let everyone know I updated my SQL RunAs Addendum MP’s.

If you haven’t read about this, it is a new way to configure security for the SQL management packs which eases administrative burden with traditional RunAs accounts. Read more about the solution here: https://blogs.technet.microsoft.com/kevinholman/2016/04/26/sql-mp-run-as-accounts-no-longer-required/

In this update:

Version 6.7.2.0 – Update
- Added support for SQL 2016
- Added additional monitors to check for ability to connect to SQL and sysadmin role check
- Removed any alerting by default to reduce noise.
- Added new task to configure Healthservice login for LOW PRIV to SQL
- Added folders, and state views to ease configuration and running tasks.

You can download the updated MP’s on TechNet Gallery:

https://gallery.technet.microsoft.com/SQL-Server-RunAs-Addendum-0c183c32

↧

How to add accounts from another domain into a SCOM User Role

July 19, 2016, 12:29 pm

Normally – when you have a trust with a remote account domain, and you want to add users from the remote domain to SCOM, things go perfectly.

However, if the user account in the remote domain uses a different UPN name than the SAM account name – the SCOM UI blocks it.

For instance, I have a SCOM infrastructure in OPSMGR.NET (OPSMGR), but want to grant users in the DMZ.CORP (DMZ) domain access. This works fine if the UPN domain name for my user is the same as the SAM account name.

In the image – I am trying to add DMZ\sqlmondmz account to my SQL Ops Team role:

When I check names – I can see the UPN domain is different from the actual DNS domain name of DMZ.CORP:

This results in the following error:

Date: 7/19/2016 2:25:18 PM
Application: Operations Manager
Application Version: 7.1.10226.1177
Severity: Error
Message:

Microsoft.EnterpriseManagement.Common.UserRoleUserUnresolvedException: Unable to resolve the user sqlmondmz@zzz.com associated with the user role. Error code 1332. Check your active directory configuration.
   at Microsoft.EnterpriseManagement.Common.Internal.ServiceProxy.HandleFault(String methodName, Message message)
   at Microsoft.EnterpriseManagement.Common.Internal.SecurityConfigurationServiceProxy.UpsertUserRolesV2(ICollection`1 urUpdateResults, ICollection`1 urScopeUpdateResults, ICollection`1 urViewScopeUpdateResults, ICollection`1 urTaskScopeUpdateResults, ICollection`1 urConsoleTaskScopeUpdateResults, ICollection`1 urTemplateScopeUpdateResults, ICollection`1 urDashboardReferenceScopeUpdateResults, ICollection`1 urUserUpdateResults)
   at Microsoft.EnterpriseManagement.SecurityConfigurationManagement.UpdateUserRoles(ICollection`1 userRoles)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Console.ConsoleJobExceptionHandler.ExecuteJob(IComponent component, EventHandler`1 job, Object sender, ConsoleJobEventArgs args)

The workaround?

Use PowerShell to add these users to the role:

# Fetch the existing role, then append the cross-domain account to its user list
$Role = Get-SCOMUserRole -Name "SQL Ops Team"
$Role | Set-SCOMUserRole -User ($Role.Users + "DMZ\sqlmondmz")

This doesn’t have the same UI restriction:

↧

MP Update: Windows Server Operating System 6.0.7316.0 released

August 1, 2016, 8:40 am

The base operating system MP’s have been updated:

https://www.microsoft.com/en-us/download/details.aspx?id=9296

Previously, there were a couple interim releases that were pulled due to issues, mostly affecting older operating system versions. This was due to the MP focusing on changes for Windows Server 2016. This MP update addresses those issues caused by the interim MP changes. The previous stable MP version was 6.0.7297.0 so I will focus on changes since that MP:

Changes in version 6.0.7303.0

The MP used to discover physical CPUs, whose performance monitor instance name property did not correlate with the Windows PerfMon object (which expects the instance name in (socket, core) format). That affected related rules and monitors. With this release, the MP discovers logical processors rather than physical ones, and populates the performance monitor instance name in the proper format.
Scripts in Microsoft.Windows.Server.ClusterSharedVolumeMonitoring.mp and Microsoft.Windows.Server.Library.mp were migrated to PowerShell as part of Windows Server 2016 Nano support (introduced in Windows Server 2016 MP version 10.0.1.0).
Updated the Microsoft.Windows.Server.ClusterSharedVolumeMonitoring.ClusterSharedVolume.Monitoring.State monitor alert properties and description. The fix resolves the property replacement failure warning that was generated when the monitor alert fired.

Changes in version 6.0.7310.0

Several bugs in the Cluster Shared Volumes MP were fixed (see below), and error handling was migrated to the common recommended scenario. Quorum monitoring was enabled by changing the monitoring logic. The monitoring logic is now split between Nano Server (which uses PowerShell) and all other operating systems.

Fixed bug: disk free space monitoring issue on Quorum disks in failover clusters; the monitor was displayed as healthy, but actually it did not work and no performance data was collected.
Fixed bug: logical disk discovery did not discover logical disk on non-clustered server with Failover Cluster Feature enabled.
Fixed bug: Clustered Shared Volumes were being discovered twice – as a Clustered Shared Volume and as a logical disk; now they are discovered as Clustered Shared Volumes only.
Fixed bug (partially): mount points were being discovered twice for cluster disks mounted to a folder – as a cluster disk and as a logical disk. See Troubleshooting and Known Issues section for details.
Fixed bug: Cluster Shared Volume objects were being discovered incorrectly when they had more than one partition (applied to discovery and monitoring): only one partition was discovered, while the monitoring data was discovered for all partitions available. The key field is changed, and now partitions are discovered correctly; see Troubleshooting and Known Issues section for details.

Error handling was corrected. Logical disks are now discovered correctly on non-clustered servers that have the Failover Cluster feature installed.
Created new overrides for the Cluster Shared Volume MP, because the old ones did not work.
Cluster disk monitor alert messages: the alert title could be misleading and was corrected.

Changes in version 6.0.7316.0

Due to incompatibility issues in monitoring logic, several Cluster Shared Volumes MP bugs remained in version 6.0.7310.0. These are now fixed in the current version (see the complete list of bugs below). To provide compatibility with the previous MP versions, all monitoring logic (structure of classes’ discovery) was reverted to the one present in version 6.0.7297.0.

Fixed bug: disk free space monitoring issue on Quorum disks in failover clusters; the monitor was displayed as healthy, but actually it did not work and no performance data was collected.
Fixed bug: logical disk discovery did not discover logical disk on non-clustered server with Failover Cluster Feature enabled.
Fixed bug: Clustered Shared Volumes were being discovered twice – as a Clustered Shared Volume and as a logical disk; now they are discovered as Clustered Shared Volumes only.
Fixed bug (partially): mount points were being discovered twice for cluster disks mounted to a folder – as a cluster disk and as a logical disk. See Troubleshooting and Known Issues section for details.
Fixed bug: Cluster Shared Volume objects were being discovered incorrectly when they had more than one partition (applied to discovery and monitoring): only one partition was discovered, while the monitoring data was discovered for all partitions available. The key field is changed, and now partitions are discovered correctly; see Troubleshooting and Known Issues section for details.
Fixed bug: physical CPUs are now discovered on Windows Server 2008 R2 platforms; logical CPUs are no longer discovered, see Troubleshooting and Known Issues section for details.
Fixed bug: Windows Server 2008 Max Concurrent API Monitor did not work on Windows Server 2008 platform. Now, it is supported on Windows Server platforms starting from Windows Server 2008 R2.
Fixed bug: when network resource name contained more than 15 symbols, the last symbols of the name were cut off, which was resulting in cluster disks and Cluster Shared Volume discovery issues.

Cluster disk monitor alert messages: the alert title could be misleading and was corrected.

I have been running this version for a few weeks now, and I haven’t seen any major issues. However, like ALL MP’s, I recommend careful testing and evaluation in your lab and test environments before moving to production.

↧

SQL MP Run As Accounts – NO LONGER REQUIRED

August 25, 2016, 12:04 pm

Over the years I have written many articles dealing with RunAs accounts. Specifically, the most common need is for monitoring with the SQL MP. I have explained the issues and configurations in detail here: Configuring Run As Accounts and Profiles in OpsMgr – A SQL Management Pack Example

Later, I wrote an automation solution to script the biggest pain point of RunAs accounts: distributing them, here: Automating Run As Account Distribution – Finally! Then – took it a step further, and built this automation into a management pack here: Update- Automating Run As Account distribution dynamically

Now – I want to show a different approach to configuring monitoring for the SQL MP, which might make life a lot simpler for SCOM admins, and SQL teams.

What if I told you – there was a way to not have to mess with RunAs accounts and the SQL MP at all? No creating the accounts, no distributing them, no associating them with the profiles – none of that? Interested? Then read on.

The big challenge in SQL monitoring is that the SCOM agent runs as LocalSystem for the default agent action account. However, LocalSystem does not have full rights to SQL server, and should never be granted the SysAdmin role in SQL, because the LocalSystem account is quite easy to impersonate for anyone who already has admin rights to the OS.

We can solve this challenge, by introducing Service SID’s. SQL already uses Service Security Identifiers (SID’s) to grant access for the service running SQL server, to the SQL instance. You can read more about that here: https://support.microsoft.com/en-us/kb/2620201

Service SID’s were introduced in Windows Server 2008.

We can do the same thing for the SCOM Healthservice. This idea was brought to me by a fellow MS consultant – Ralph Kyttle. He pointed out, this is exactly how OMS works to gather data about SQL server. We have an article describing this recommended configuration here: https://support.microsoft.com/en-us/kb/2667175

Essentially – this can be accomplished in two steps:

Enable the HealthService to be able to use a service SID.
Create a login for the HealthService SID to be able to access SQL server.

That’s it!

This creates a login in SQL and allows the SCOM agent to monitor SQL server without having to maintain another credential or deal with password changes, and it removes the security concern of a compromised RunAs account being able to access every SQL server in the company! No more configuration, no more credential distribution.
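
For anyone who wants to see what the two steps amount to by hand, here is a rough sketch. It is not necessarily what the management pack tasks run. New-HealthServiceLoginSql is a hypothetical helper name, SQL01\INST1 is a placeholder instance, and the sketch only covers the SysAdmin option, not the Low Priv configuration.

```powershell
# Step 1 (run elevated on the agent that hosts SQL): enable the service SID,
# then restart the agent service so the change takes effect.
#   sc.exe sidtype HealthService unrestricted
#   Restart-Service -Name HealthService

# Step 2: build the T-SQL that creates a login for the HealthService SID.
# Hypothetical helper; returns the statement text so the SQL team can review it first.
function New-HealthServiceLoginSql {
    param([switch] $SysAdmin)

    $login = 'NT SERVICE\HealthService'
    $sql = @(
        "IF NOT EXISTS (SELECT 1 FROM sys.server_principals WHERE name = N'$login')"
        "    CREATE LOGIN [$login] FROM WINDOWS;"
    )
    if ($SysAdmin) {
        # sp_addsrvrolemember is deprecated in newer SQL versions but still available
        $sql += "EXEC sp_addsrvrolemember @loginame = N'$login', @rolename = N'sysadmin';"
    }
    $sql -join "`n"
}

# Usage (assumes a module that provides Invoke-Sqlcmd is installed; instance name is a placeholder):
# Invoke-Sqlcmd -ServerInstance 'SQL01\INST1' -Query (New-HealthServiceLoginSql -SysAdmin)
```

Calling the helper without -SysAdmin produces only the CREATE LOGIN statement, which leaves the permission grants to whoever owns the instance.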

I even wrote a Management Pack to make setting this initial configuration up much simpler.

*** Updated 7-13-2016 for MP version 6.7.2.0

Let me demonstrate:

First, we need to ensure that all SCOM agents, where SQL is discovered – have the service SID enabled. I wrote a monitor to detect when this is not configured, and targeted the SQL SEED classes. For each SQL version, there is an Addendum MP which shows the SEED classes:

This monitor will show a warning state when the Service SID is not configured for any agent where we discover an instance of a SQL SEED class.

The monitor has a script recovery action, which is disabled by default. You can enable this and it will automatically configure this as soon as SQL is detected, and will restart the agent.

Alternatively – I wrote two tasks you can run – the second one configures the service SID, but will wait for the next reboot (or service restart) before this actually becomes active. The first task configures the service AND then restarts the agent Healthservice. You can multi-select items in this view and run against multiple agents, making this one-time configuration easy.

Here is what it looks like in action:

So – once that is complete – we can create the login for SQL.

In the Addendum MP for each SQL version – there is a state view for the DB engine. If you switch to this view, or any Database Engine view – you will see two new tasks show up which will create a SQL login for the HealthService. One creates the login and assigns it the SysAdmin role to the instance. The other creates the login and configures the login for Low Priv mode. You just need to choose whichever you want to use for your organization.

If you run this task, and don’t have rights to the SQL server – you will get this:

Have your SQL team run the task and provide a credential to the task that will be able to create a login and assign the necessary SysAdmin role to the service:

Voila!

What this actually does – is create this login on the SQL server and set it to SysAdmin role:

All of these activities are logged for audit in the Task Status view:

To further assist with this configuration, I added two additional monitors:

The first monitor turns unhealthy if we cannot connect to SQL at all:

The second monitor turns unhealthy if we CAN connect to SQL, but we detect that the login for “NT Service\HealthService” does not have the “SysAdmin” role. You should use this monitor if you are granting the SysAdmin role to the Healthservice, and you should disable it if you are using Lowest Priv. It specifically checks to see if the Healthservice login is configured as a SysAdmin:

None of the monitors generate alerts by default, to limit adding noise to SCOM. If you want alerting you can enable that in the MP and configure it as you wish.

***NOTE: These monitors run every 4 hours by default, so after making the changes to add the NT Service\Healthservice to SQL, it will take that long before the monitors change state, unless you bounce the health service on the agent to speed that up. So be aware.

Now – as new SQL servers are added over time – the Service SID can automatically be configured using the recovery, and the SQL team will just need to add the HealthService login as part of their build configuration, or run this task one time for each new SQL server to enable it for monitoring.

I find this to be much simpler than dealing with RunAs accounts, and it appears to be a more secure solution as well. I welcome any feedback on this approach, or for my Management Pack Addendum.

My SQL RunAs Addendum MP’s are available below:

Version 6.6.4.0 – Original release of the addendum MP’s.
Version 6.7.2.0 – Update
- Added support for SQL 2016
- Added additional monitors to check for ability to connect to SQL and sysadmin role check
- Removed any alerting by default.
- Added new task to configure Healthservice login for LOW PRIV to SQL
- Added folders, and state views to ease configuration and running tasks

https://gallery.technet.microsoft.com/SQL-Server-RunAs-Addendum-0c183c32

↧

UR11 for SCOM 2012 R2 – Step by Step

September 6, 2016, 10:42 am

KB Article for OpsMgr: https://support.microsoft.com/en-us/kb/3183990

Download catalog site: http://catalog.update.microsoft.com/v7/site/Search.aspx?q=3183990

NOTE: I get this question every time we release an update rollup: ALL SCOM Update Rollups are CUMULATIVE. This means you do not need to apply them in order, you can always just apply the latest update. If you have deployed SCOM 2012R2 and never applied an update rollup – you can go straight to the latest one available. If you applied an older one (such as UR3) you can always go straight to the latest one!

Key Fixes:

Network discovery fails because of monitoring host crash if no paging file is set on the operating system
When no paging file is set on the operating system, the page file size is implicitly set to 0. This causes the monitoring host to crash. This update fixes such an exception.
Backport PuTTY 0.64 and 0.66 updates from 2016 release
Operations Manager’s SSH-based administration of UNIX/Linux computers (agent discovery and installation, upgrade, uninstallation) now supports UNIX and Linux computers that are configured to require SHA2 HMACs and those with Key Exchange Algorithm changes, as specified in RFC 4419 (Ubuntu 15.10, 16.04 LTS).
Update Register-SCAdvisor cmdlet for WEU workspaces
This update adds support to register the Operations Manager Management group to workspaces in regions other than Eastern US by using the Register-SCAdvisor cmdlet. The cmdlet takes an additional optional parameter (SettingServiceUrl), which is the URL for setting the service in the region of the workspace. If it is not specified, the workspace is assumed to be in the Eastern US.
ACS eventschema.xml has incorrect parameter mappings for multiple audit events
The report named Usage_-_Sensitive_Security_Groups_Changes used to say n/a\n/a for some events in the Changed By column. And in some events, the Member User column contained the account name of the user who made the change instead of the account that was added or removed. This fix resolves this issue, as the Changed By column now contains the account name of the user who made the change, and the Member User column contains the name of the added or removed account, where applicable.
Memory leak when monitoring network devices by using SNMPv3
This update fixes a memory leak in Network Monitoring area that occurs while monitoring network devices by using SNMPv3.
Web Console user can view datawarehouse performance or SLA widget data outside of their scoped dashboard views
This update implements verification of the logged-in user to confirm that the user has access to the opened dashboard before loading the same.
Downtime duration doesn’t take business hour into consideration
Business hours are being calculated even when the Business hours check box is cleared. This update resolves this issue.
The updated RDL files are located in the following location:
%SystemDrive%\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\Reporting
To update the RDL file, follow these steps:
1. Go to http://MachineName/Reports_INSTANCE1/Pages/Folder.aspx, where MachineName is the Reporting Server.
2. On this page, go to the folder to which you want to add the RDL file. In this case, click Microsoft.SystemCenter.DataWarehouse.Report.Library.
3. Upload the new RDL files by clicking the upload button at the top. For more information, see the Upload a File or Report (Report Manager) topic on the Microsoft Developer Network website.
Cisco 3172 PQ network device certification issues
This update fixes issues in monitoring the Cisco 3172 PQ network device and its components.
Adding SCOM assessment links in the Operations Management Suite view in the SCOM console
Links for Operations Manager Assessment and Pre-Configuration steps for Operations Manager Assessment are added in the Operations Management Suite Connection node under the administration pane. SCOM assessment solution in OMS is currently in private preview, please reach out to your “TAM or other Microsoft contact” to obtain access to the solution.
SQL Server Seed relationship with a server that is running Windows
The dynamic members of the group are not populated if the dynamic rule is based on a hosted relationship with Windows Server. This fix enables retrieval of hosting entity.
Alert subscriptions are not always fired for 3 state monitors
This update allows a 3 state monitor to be configured to raise alerts whose severity stays in sync with the monitor’s health state. You can create a subscription to be notified when the alert severity is modified. Even if the monitor’s state keeps toggling between warning and critical, the severity of the alert keeps being updated, and a notification is fired on each update to the severity.
When you connect SCOM to OMS, Availability monitors health state of some management servers changes to Warning state
If an OMS workspace is configured to collect certain event logs, and those event logs are not present on the management servers connected to that workspace, the “Availability” monitor’s health state on those management servers used to change to the Warning state. This update prevents that switch to the Warning state.
RunAs Account password expiration does not work with Active directory Password Settings Objects that breaks the validation of LOCAL User Accounts
Local Accounts could not be validated by using ADSystemInfo. Therefore, when any local account is added into RunAs account, an Error event is logged in Event Viewer for an exception in validating the local account. After this update, local accounts are validated.
MPB Entries in Catalog database for the VersionIndependentGuid column is updated
This update enables the SCOM console to show the correct mpb installation status in the management server when it tries to connect to an online catalog to update mpbs.
If the first try of importing MPB failed then re-importing the MPB was not possible until the SCOM console is closed and reopened
This update enables re-download and installation of an mpb, without closing and reopening the SCOM console, even if the first try installation of that mpb failed because of a dependency issue.
Change of the displayname field for a group in a sealed or unsealed management pack
Renaming a group through PowerShell cmdlets was not displaying the new group name in the SCOM console. This update resolves this issue and the renaming of a group correctly displays the renamed group name in the SCOM console.

New Linux operating system versions supported

Ubuntu Linux 16.04 LTS (x86 and x64) is now supported in System Center 2012 R2 Operations Manager.

Issues that are fixed in the UNIX and Linux management packs

During UNIX/Linux computer discovery, the GetOSVersion.sh script was run with sudo elevation if a sudo-enabled user was selected for Discovery. This update prevents the GetOSVersion.sh script from being run with sudo elevation, so it no longer has to be authorized in the /etc/sudoers file.
Scripts executed by the ExecuteScript method in Management Packs always run from the /tmp folder. With this update, the temporary folder for scripts is now configurable. To use another folder, update the symbolic link to link to a temporary folder of your choice:
/etc/opt/microsoft/scx/conf/tmpdir
UNIX or Linux computers together with sshd versions that implement the Key Exchange Algorithms described in RFC 4419, such as Ubuntu 15.10, cannot be discovered with the Discovery Wizard.
Network statistics collected on AIX servers are reset when another tool such as NetStat is also used.
Physical disks are shown incorrectly as offline if an LVM snapshot is taken.

Let’s get started.

From reading the KB article – the order of operations is:

Install the update rollup package on the following server infrastructure:
- Management servers
- Audit Collection servers
- Gateway servers
- Web console server role computers
- Operations console role computers
Apply SQL scripts.
Manually import the management packs.
Update Agents

Additionally, we will add the steps to update Linux management packs and agents.

1. Management Servers

Since there is no RMS anymore, it doesn’t matter which management server I start with. There is no need to begin with whichever server holds the “RMSe” role. I simply make sure I only patch one management server at a time to allow for agent failover without overloading any single management server.

I can apply this update manually via the MSP files, or I can use Windows Update. I have 3 management servers, so I will demonstrate both. I will do the first management server manually. This management server holds 3 roles, and each must be patched: Management Server, Web Console, and Console.

The first thing I do when I download the updates from the catalog, is copy the cab files for my language to a single location:

Then extract the contents:

Once I have the MSP files, I am ready to start applying the update to each server by role.

***Note: You MUST log on to each server role as a Local Administrator, SCOM Admin, AND your account must also have System Administrator role to the SQL database instances that host your OpsMgr databases.

My first server is a management server, and the web console, and has the OpsMgr console installed, so I copy those update files locally, and execute them per the KB, from an elevated command prompt:

This launches a quick UI which applies the update. It will bounce the SCOM services as well. The update usually does not provide any feedback that it had success or failure.

You can check the application log for the MsiInstaller events to show completion:

Log Name:      Application
Source:        MsiInstaller
Date:          8/31/2016 9:01:13 AM
Event ID:      1036
Description:
Windows Installer installed an update. Product Name: System Center Operations Manager 2012 Server. Product Version: 7.1.10226.0. Product Language: 1033. Manufacturer: Microsoft Corporation. Update Name: System Center 2012 R2 Operations Manager UR11 Update Patch. Installation success or error status: 0.
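
If you would rather not click through Event Viewer on every server, the same check can be scripted. This is a small sketch, assuming the event text follows the format shown above. ConvertFrom-MsiUpdateMessage is a hypothetical helper name.

```powershell
# Hypothetical helper: pulls the update name and status code out of an
# MsiInstaller event description like the one shown above.
function ConvertFrom-MsiUpdateMessage {
    param([Parameter(Mandatory)] [string] $Message)

    $pattern = 'Update Name: (?<name>.+?)\. Installation success or error status: (?<status>\d+)'
    if ($Message -match $pattern) {
        [pscustomobject]@{
            UpdateName = $Matches['name']
            Status     = [int]$Matches['status']   # 0 means success
        }
    }
}

# Usage on the patched server (event ID 1036 is taken from the sample above):
# Get-WinEvent -FilterHashtable @{ LogName = 'Application'; ProviderName = 'MsiInstaller'; Id = 1036 } -MaxEvents 20 |
#     ForEach-Object { ConvertFrom-MsiUpdateMessage -Message $_.Message }
```

Any row with a non-zero Status is worth a closer look before moving on to the next role.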

You can also spot check a couple DLL files for the file version attribute.
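
A file version spot check can be done from PowerShell as well. The sketch below simply lists the most recently written DLLs in a folder with their file versions. Get-BinaryVersion is a hypothetical helper name, and the folder in the usage comment assumes a default install path, so adjust it to your environment.

```powershell
# Hypothetical helper: lists files in a folder with their file version,
# newest first, so freshly patched binaries show up at the top.
function Get-BinaryVersion {
    param(
        [Parameter(Mandatory)] [string] $Folder,
        [string] $Filter = '*.dll'
    )
    Get-ChildItem -Path $Folder -Filter $Filter -File |
        Select-Object Name, LastWriteTime,
            @{ Name = 'FileVersion'; Expression = { $_.VersionInfo.FileVersion } } |
        Sort-Object LastWriteTime -Descending
}

# Usage (assumed default install path; yours may differ):
# Get-BinaryVersion -Folder "$env:ProgramFiles\Microsoft System Center 2012 R2\Operations Manager\Server" |
#     Select-Object -First 10
```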

Next up – run the Web Console update:

This runs much faster. A quick file spot check:

Lastly – install the console update (make sure your console is closed):

A quick file spot check:

Additional Management Servers:

I now move on to my additional management servers, applying the server update, then the console update and web console update where applicable.

On this next management server, I will use the example of Windows Update as opposed to manually installing the MSP files. I check online, and make sure that I have configured Windows Update to give me updates for additional products:

The applicable updates show up under optional – so I tick the boxes and apply these updates.

After a reboot – go back and verify the update was a success by spot checking some file versions like we did above.

Updating ACS (Audit Collection Services)

You would only need to update ACS if you had installed this optional role.

On any Audit Collection Collector servers, you should run the update included:

A spot check of the files:

Updating Gateways:

I can use Windows Update or manual installation.

The update launches a UI and quickly finishes.

I was prompted for a reboot.

Then I will spot check the DLL’s:

I can also spot-check the \AgentManagement folder, and make sure my agent update files are dropped here correctly:

***NOTE: You can delete any older UR update files from the \AgentManagement directories. The UR’s do not clean these up and they provide no purpose for being present any longer.

2. Apply the SQL Scripts

In the path on your management servers, where you installed/extracted the update, there are two SQL script files:

%SystemDrive%\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\SQL Script for Update Rollups
(note – your path may vary slightly depending on if you have an upgraded environment or clean install)

First – let’s run the script to update the OperationsManagerDW (Data Warehouse) database. Open a SQL management studio query window, connect it to your Operations Manager DataWarehouse database, and then open the script file (UR_Datawarehouse.sql). Make sure it is pointing to your OperationsManagerDW database, then execute the script.

You should run this script with each UR, even if you ran this on a previous UR. The script body can change so as a best practice always re-run this.

If you see a warning about line endings, choose Yes to continue.

Click the “Execute” button in SQL mgmt. studio. The execution could take a considerable amount of time and you might see a spike in processor utilization on your SQL database server during this operation.

You will see the following (or similar) output: “Command(s) completed successfully.”

Next – let’s run the script to update the OperationsManager (Operations) database. Open a SQL management studio query window, connect it to your Operations Manager database, and then open the script file (update_rollup_mom_db.sql). Make sure it is pointing to your OperationsManager database, then execute the script.

You should run this script with each UR, even if you ran this on a previous UR. The script body can change so as a best practice always re-run this.

I have had customers state this takes from a few minutes to as long as an hour. In MOST cases – you will need to shut down the SDK, Config, and Monitoring Agent (healthservice) on ALL your management servers in order for this to be able to run with success.

You will see the following (or similar) output:

IF YOU GET AN ERROR – STOP! Do not continue. Try re-running the script several times until it completes without errors. In a production environment with lots of activity, you will almost certainly have to shut down the services (sdk, config, and healthservice) on your management servers, to break their connection to the databases, to get a successful run.
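The rerun-until-clean advice above can be sketched as a small retry loop. This is only an illustration in Python, not part of the update tooling: `run_script` is a hypothetical callable standing in for however you execute update_rollup_mom_db.sql (it is assumed to return the list of error messages from one run), and the cap of five attempts is an arbitrary choice.

```python
from typing import Callable, List


def rerun_until_clean(run_script: Callable[[], List[str]], max_attempts: int = 5) -> int:
    """Call run_script until it reports no errors; return the attempt that succeeded.

    run_script is a hypothetical stand-in for executing the SQL script once.
    It is assumed to return a list of error messages (empty list = clean run).
    """
    for attempt in range(1, max_attempts + 1):
        errors = run_script()
        if not errors:
            return attempt
    # Still failing: stop here rather than continuing with the rest of the update.
    raise RuntimeError(
        f"script still failing after {max_attempts} attempts; "
        "stop the sdk, config, and healthservice services and try again"
    )


# Simulated runs: two failed attempts followed by a clean one.
_results = iter([["Msg 1205"], ["Msg 1205"], []])
attempts_needed = rerun_until_clean(lambda: next(_results))
print(attempts_needed)
```

The point of the cap is that a script which keeps failing is a signal to shut down the services first, not to loop forever.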

Technical tidbit: Even if you previously ran this script in any previous UR deployment, you should run this again in this update, as the script body can change with updated UR’s.

3. Manually import the management packs

There are 55 management packs in this update! Most of these we don’t need – so read carefully.

The path for these is on your management server, after you have installed the “Server” update:

\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\Management Packs for Update Rollups

However, the majority of them are Advisor/OMS, and language specific. Only import the ones you need, and that are correct for your language. I will remove all the MP’s for other languages (keeping only ENU), and I am left with the following:

What NOT to import:

The Advisor MP’s are only needed if you are using Microsoft Operations Management Suite cloud service, (Previously known as Advisor, and Operations Insights).
The APM MP’s are only needed if you are using the APM feature in SCOM.
Note the APM MP with a red X. This MP requires the IIS MP’s for Windows Server 2016 which are in Technical Preview at the time of this writing. Only import this if you are using APM *and* you need to monitor Windows Server 2016. If so, you will need to download and install the technical preview editions of that MP from https://www.microsoft.com/en-us/download/details.aspx?id=48256
The TFS MP bundle is only used for specific scenarios, such as DevOps scenarios where you have integrated APM with TFS, etc. If you are not currently using these MP’s, there is no need to import or update them. I’d skip this MP import unless you already have these MP’s present in your environment.
However, the Image and Visualization libraries deal with Dashboard updates, and these always need to be updated.
I import all of these shown without issue.
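If you would rather script the "keep only ENU" part than delete files by hand, the filtering could look roughly like the sketch below. It covers the language filtering only, not the Advisor/APM/TFS decisions. The naming rule (a three-letter language code as the last token before the extension) and every file name in the example are assumptions for illustration, so check them against the real folder contents first.

```python
# Assumption: localized packs end in a three-letter language code before the
# file extension (for example a hypothetical "Example.Pack.DEU.mpb").
# Check the real file names in your folder before relying on this.
LANGUAGE_CODES = {
    "CHS", "CHT", "CSY", "DEU", "ENU", "ESN", "FRA", "HUN", "ITA",
    "JPN", "KOR", "NLD", "PLK", "PTB", "PTG", "RUS", "SVE", "TRK",
}


def keep_for_language(filenames, language="ENU"):
    """Keep language-neutral packs plus the packs for the chosen language."""
    kept = []
    for name in filenames:
        stem = name.rsplit(".", 1)[0]              # drop the .mp / .mpb extension
        last_token = stem.rsplit(".", 1)[-1].upper()
        if last_token in LANGUAGE_CODES and last_token != language:
            continue                               # another language: skip it
        kept.append(name)
    return kept


# Hypothetical file names, for illustration only.
folder = [
    "Example.Library.mpb",
    "Example.Library.ENU.mpb",
    "Example.Library.DEU.mpb",
    "Example.Dashboards.mp",
]
to_import = keep_for_language(folder)
print(to_import)
```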

4. Update Agents

Agents should be placed into pending actions by this update for any agent that was not manually installed (remotely manageable = yes):

On the management servers where I used Windows Update to patch them, the agents did not show up in this list. Only agents reporting to a management server that I patched manually showed up. FYI: the experience is NOT the same when using Windows Update vs. manual installation. If yours don’t show up – you can try running the update for that management server again – manually.

If your agents are not placed into pending management – this is generally caused by not running the update from an elevated command prompt, or having manually installed agents which will not be placed into pending.

In this case – the management servers that I updated using Windows Update did NOT place their agents into pending. Only the agents reporting to the management server that I patched manually showed up.

I re-ran the server MSP file manually on these management servers, from an elevated command prompt, and the agents all showed up.

You can approve these – which will result in a success message once complete:

Soon you should start to see PatchList getting filled in from the Agents By Version view under the Operations Manager monitoring folder in the console:

5. Update Unix/Linux MPs and Agents

The current Linux MP’s can be downloaded from:

https://www.microsoft.com/en-us/download/details.aspx?id=29696

7.5.1060.0 is current at this time for SCOM 2012 R2 UR11.

****Note – take GREAT care when downloading – that you select the correct download for SCOM 2012 R2. You must scroll down in the list and select the MSI for 2012 R2:

Download the MSI and run it. It will extract the MP’s to C:\Program Files (x86)\System Center Management Packs\System Center 2012 R2 Management Packs for Unix and Linux\

Update any MP’s you are already using. These are mine for RHEL, SUSE, and the Universal Linux libraries.

NOTE: Upon first import – you might see that the “Linux Operating System Library” (Microsoft.Linux.Library.mp) file fails to import. If this happens, simply make sure you have imported version 7.5.1060.0 of the UNIX/Linux Core Library (Microsoft.Unix.Library.mp) FIRST, then you can import “Linux Operating System Library” (Microsoft.Linux.Library.mp) without issue.
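The import-order issue in the note above is a plain dependency-ordering problem. As a rough sketch (Python 3.9 or later, using the standard library graphlib module), you could list which pack depends on which and let a topological sort produce a safe order. Only the one dependency named in the note is taken from this article; a real management group has many more references.

```python
from graphlib import TopologicalSorter

# Each pack maps to the set of packs it depends on. Only this single
# dependency comes from the note above; extend the dict for your own packs.
dependencies = {
    "Microsoft.Linux.Library.mp": {"Microsoft.Unix.Library.mp"},
    "Microsoft.Unix.Library.mp": set(),
}

# static_order() yields every pack after the packs it depends on.
import_order = list(TopologicalSorter(dependencies).static_order())
print(import_order)
```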

You will likely observe VERY high CPU utilization of your management servers and database server during and immediately following these MP imports. Give it plenty of time to complete the process of the import and MPB deployments.

Next – you need to restart the “Microsoft Monitoring Agent” service on any management servers which manage Linux systems. I don’t know why – but my MP’s never drop/update in the \Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\AgentManagement\UnixAgents\DownloadedKits folder until this service is restarted.

Next up – you would upgrade the agents on your monitored Unix/Linux systems. You can now do this straight from the console:

You can input credentials or use existing RunAs accounts if those have enough rights to perform this action.

Finally:

6. Update the remaining deployed consoles

This is an important step. I have consoles deployed around my infrastructure – on my Orchestrator server, SCVMM server, on my personal workstation, on the workstations of all the other SCOM admins on my team, on a Terminal Server we use as a tools machine, etc. These should all get the matching update version.

Review:

Now at this point, we would check the OpsMgr event logs on our management servers, check for any new or strange alerts coming in, and ensure that there are no issues after the update.

Known issues:

See the existing list of known issues documented in the KB article.

1. Many people are reporting that the SQL script is failing to complete when executed. You should attempt to run this multiple times until it completes without error. You might need to stop the Exchange correlation engine, stop all the SCOM services on the management servers, and/or bounce the SQL server services in order to get a successful completion in a busy management group. The errors reported appear as below:

——————————————————
(1 row(s) affected)
(1 row(s) affected)
Msg 1205, Level 13, State 56, Line 1
Transaction (Process ID 152) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Msg 3727, Level 16, State 0, Line 1
Could not drop constraint. See previous errors.
——————————————————–
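If you capture the script output to text, a few lines of Python can tell a deadlock failure (worth rerunning) apart from a clean run. This is a minimal sketch that only parses the "Msg ..." header lines in the format shown above; the function names are made up for illustration, and how you capture the output is up to you.

```python
import re
from typing import List

# Matches the header line SQL Server prints for each error, e.g.
# "Msg 1205, Level 13, State 56, Line 1"
MSG_HEADER = re.compile(r"^Msg (\d+), Level (\d+), State (\d+), Line (\d+)", re.MULTILINE)

DEADLOCK_VICTIM = 1205  # the error number shown in the output above


def error_numbers(output: str) -> List[int]:
    """Return the Msg numbers found in the captured output, in order."""
    return [int(match.group(1)) for match in MSG_HEADER.finditer(output)]


def should_retry(output: str) -> bool:
    """True when a deadlock was among the reported errors."""
    return DEADLOCK_VICTIM in error_numbers(output)


sample = """(1 row(s) affected)
(1 row(s) affected)
Msg 1205, Level 13, State 56, Line 1
Transaction (Process ID 152) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Msg 3727, Level 16, State 0, Line 1
Could not drop constraint. See previous errors.
"""
print(error_numbers(sample))
print(should_retry(sample))
```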

↧

SCOM 2016 is available

October 21, 2016, 10:49 pm

Yes, yes, I know. Late to the party. Better late than never.

For those that don’t know – SCOM 2016 was released at Ignite this year, on Sept 29th.

It hit the GA milestone on October 12th, which is the date it becomes Generally Available on Volume License media and MSDN, along with the date it becomes officially supported.

You can download the 180 day EVAL – which is available at: https://www.microsoft.com/en-us/evalcenter/evaluate-system-center-2016

The EVAL version can be licensed later, after you deploy, with your key from MSDN or Volume Licensing.

Want to know more?

↧

VSAE (Visual Studio Authoring Extensions) updated for SCOM 2016

October 21, 2016, 10:50 pm

VSAE has been updated for SCOM 2016.

Get it here:

https://www.microsoft.com/en-us/download/details.aspx?id=30169

Works great with my fragment library – which you can download and try out here:

https://blogs.technet.microsoft.com/kevinholman/2016/06/04/authoring-management-packs-the-fast-and-easy-way-using-visual-studio/

Even if you ARE running SCOM 2016 – I still recommend choosing SCOM 2012R2 – so any MP’s you author will be compatible with either version.

↧

OpsMgr 2016 – QuickStart Deployment Guide

October 21, 2016, 10:54 pm

There is already a very good deployment guide posted on TechNet here: https://technet.microsoft.com/en-us/system-center-docs/om/deploy/deploying-system-center-2016-operations-manager

The TechNet deployment guide provides an excellent walkthrough of installing OpsMgr 2016 for the “all in one” scenario, where all roles are installed on a single server. That is a very good method for doing simple functionality testing and lab exercises.

The following article will cover a basic install of System Center Operations Manager 2016. The concept is to perform a limited deployment of OpsMgr, only utilizing as few servers as possible, but enough to demonstrate the roles and capabilities in OM2016. For this reason, this document will cover a deployment on 3 servers. A dedicated SQL server, and two management servers will be deployed. This will allow us to show the benefits of high availability for agent failover, and the highly available resource pool concepts. This is to be used as a template only, for a customer to implement as their own pilot or POC, or customized deployment guide. It is intended to be general in nature and will require the customer to modify it to suit their specific data and processes.

This also happens to be a very typical scenario for small environments for a production deployment. This is not an architecture guide or intended to be a design guide in any way. This is provided “AS IS” with no warranties, and confers no rights. Use is subject to the terms specified in the Terms of Use.

Server Names\Roles:

SQL1 SQL Database Services, Reporting Services
SCOM1 Management Server Role, Web Console Role, Console
SCOM2 Management Server Role, Web Console Role, Console

Windows Server 2016 will be installed as the base OS for all platforms. All servers will be a member of the AD domain.

SQL 2016 will be the base standard for all database and SQL reporting services.

High Level Deployment Process:

1. In AD, create the following accounts and groups, according to your naming convention:

DOMAIN\OMAA OM Server Action Account
DOMAIN\OMDAS OM Config and Data Access Account
DOMAIN\OMREAD OM Datawarehouse Reader Account
DOMAIN\OMWRITE OM Datawarehouse Write Account
DOMAIN\SQLSVC SQL Service Account
DOMAIN\OMAdmins OM Administrators security group

2. Add the OMAA, OMDAS, OMREAD, and OMWRITE accounts to the “OMAdmins” global group.

3. Add the domain user accounts for yourself and your team to the “OMAdmins” group.

4. Install Windows Server 2016 to all server role servers.

5. Install Prerequisites and SQL 2016.

6. Install the Management Server and Database Components

7. Install the Reporting components.

8. Deploy Agents

9. Import Management packs

10. Set up security (roles and run-as accounts)

Prerequisites:

1. Install Windows Server 2016 to all Servers

2. Join all servers to domain.

3. Install the Report Viewer controls to any server that will receive a SCOM console. The controls are available at https://www.microsoft.com/en-us/download/details.aspx?id=45496 and require the “Microsoft System CLR Types for SQL Server 2014” (ENU\x64\SQLSysClrTypes.msi) as a prerequisite, available here: https://www.microsoft.com/en-us/download/details.aspx?id=42295

4. Install all available Windows Updates.

5. Add the “OMAdmins” domain global group to the Local Administrators group on each server.

6. Install IIS on any management server that will also host a web console:

Open PowerShell (as an administrator) and run the following:

Add-WindowsFeature NET-WCF-HTTP-Activation45,Web-Static-Content,Web-Default-Doc,Web-Dir-Browsing,Web-Http-Errors,Web-Http-Logging,Web-Request-Monitor,Web-Filtering,Web-Stat-Compression,Web-Mgmt-Console,Web-Metabase,Web-Asp-Net,Web-Windows-Auth -Restart

Note: The server needs to be restarted at this point, even if you are not prompted to do so. If you do not reboot, you will get false failures about prerequisites missing for ISAPI/CGI/ASP.net registration.

7. Install SQL 2016 to the DB server role

Setup is fairly straightforward. This document will not go into details and best practices for SQL configuration. Consult your DBA team to ensure your SQL deployment is configured for best practices according to your corporate standards.
Run setup, choose Installation > New SQL Server stand-alone installation…

When prompted for feature selection, install ALL of the following:
- Database Engine Services
- Full-Text and Semantic Extractions for Search
- Reporting Services – Native

On the Instance configuration, choose a default instance, or a named instance. Default instances are fine for testing, labs, and production deployments. Production clustered instances of SQL will generally be a named instance. For the purposes of the POC, choose default instance to keep things simple.
On the Server configuration screen, set SQL Server Agent to AUTOMATIC. You can accept the defaults for the service accounts, but I recommend using a Domain account for the service account. Input the DOMAIN\sqlsvc account and password for Agent, Engine, and Reporting.
Check the box to grant the Perform Volume Maintenance Task privilege to the service account for the DB engine. This will help performance when autogrow is needed.

On the Collation Tab – you can use the default which is SQL_Latin1_General_CP1_CI_AS
On the Account provisioning tab – add your personal domain user account and/or a group you already have set up for SQL admins. Alternatively, you can use the OMAdmins global group here. This will grant more rights than is required to all OMAdmin accounts, but is fine for testing purposes of the POC.
On the Data Directories tab – set your drive letters correctly for your SQL databases, logs, TempDB, and backup.
On the Reporting Services Configuration – choose to Install and Configure. This will install and configure SRS to be active on this server, and use the default DBengine present to house the reporting server databases. This is the simplest configuration. If you install Reporting Services on a stand-alone (no DBEngine) server, you will need to configure this manually.
Choose Install, and setup will complete.
You will need to disable Windows Firewall on the SQL server, or make the necessary modifications to the firewall to allow all SQL traffic. See http://msdn.microsoft.com/en-us/library/ms175043.aspx
When you complete the installation – you might consider also downloading and installing SQL Server Management Studio Tools from the installation setup page, or https://msdn.microsoft.com/en-us/library/mt238290.aspx

SCOM Step by step deployment guide:

1. Install the Management Server role on SCOM1.

Log on using your personal domain user account that is a member of the OMAdmins group, and has System Administrator (SA) rights over the SQL instances.
Run Setup.exe
Click Install
Select the following, and then click Next:
- Management Server
- Operations Console
- Web Console
Accept or change the default install path and click Next.
You might see an error from the Prerequisites here. If so – read each error and try to resolve it.
On the Proceed with Setup screen – click Next.
On the installation option screen – choose to create the first management server in a new management group. Give your management group a name. Don’t use any special or Unicode characters, just simple text. KEEP YOUR MANAGEMENT GROUP NAME SIMPLE, and don’t put version info in there. Click Next.
Accept the license. Next.
On the Configure the Operational Database screen, enter the name of your SQL database server and instance. In my case this is “DB01”. Leave the port at default unless you are using a special custom fixed port. If necessary, change the database locations for the DB and log files. Leave the default size of 1000 MB for now. Click Next.
On the Configure the Data Warehouse Database screen, enter the name of your SQL database server and instance. In my case this is “DB01”. Leave the port at default unless you are using a special custom fixed port. If necessary, change the database locations for the DB and log files. Leave the default size of 1000 MB for now. Click Next.
On the Web Console screen, choose the Default Web Site, and leave SSL unchecked. If you have already set up SSL for your default website with a certificate, you can choose SSL. Click Next.
On the Web Console authentication screen, choose Mixed authentication and click Next.
On the accounts screen, change the accounts to Domain Account for ALL services, and enter in the unique DOMAIN\OMAA, DOMAIN\OMDAS, DOMAIN\OMREAD, DOMAIN\OMWRITE accounts we created previously. It is a best practice to use separate accounts for distinct roles in OpsMgr, although you can also just use the DOMAIN\OMDAS account for all SQL Database access roles to simplify your installation (Data Access, Reader, and Writer accounts). Click Next.
On the Microsoft Update screen – choose to use updates or not. Next.
Click Install.
Close when complete.
The Management Server will be very busy (CPU) for several minutes after the installation completes. Before continuing it is best to give the Management Server time to complete all post install processes, complete discoveries, database sync and configuration, etc. 10 minutes is typically sufficient.

2. (Optional) Install the second Management Server on SCOM2.

Log on using your domain user account that is a member of the OMAdmins group, and has System Administrator (SA) rights over the SQL instances.
Run Setup.exe
Click Install
Select the following, and then click Next:
- Management Server
- Operations Console
- Web Console
Accept or change the default install path and click Next.
Resolve any issues with prerequisites, and click Next.
Choose “Add a management server to an existing management group” and click Next.
Accept the license terms and click Next.
Input the servername\instance hosting the Ops DB. Select the correct database from the drop down and click Next.
Accept the Default Web Site on the Web Console page and click Next.
Use Mixed Authentication and click Next.
On the accounts screen, choose Domain Account for ALL services, and enter in the unique DOMAIN\OMAA, DOMAIN\OMDAS accounts we created previously. Click Next.
On the Diagnostic Data screen – click Next.
Turn Microsoft Updates on or off for SCOM, Next.
Click Install.
Close when complete.

3. Install SCOM Reporting Role on the SQL server.

Log on using your domain user account that is a member of the OMAdmins group, and has System Administrator (SA) rights over the SQL instances.
Locate the SCOM media. Run Setup.exe. Click Install.
Select the following, and then click Next:
- Reporting Server
Accept or change the default install path and click Next.
Resolve any issues with prerequisites, and click Next.
Accept the license and click Next.
Type in the name of a management server, and click Next.
Choose the correct local SQL reporting instance and click Next.
Enter in the DOMAIN\OMREAD account when prompted. It is a best practice to use separate accounts for distinct roles in OpsMgr, although you can also just use the DOMAIN\OMDAS account for all SQL Database access roles to simplify your installation. You MUST input the same account here that you used for the OM Reader account when you installed the first management server. Click Next.
On the Diagnostic Data screen – click Next.
Turn Microsoft Updates on or off for SCOM, Next.
Click Install.
Close when complete.

You have a fully deployed SCOM Management group at this point.

What’s next?

Once you have SCOM up and running, these are some good next steps to consider for getting some use out of it and keeping it running smoothly:

1. Apply the latest Update Rollup. At the time of this blog posting that is UR1. But you should always find and apply the most current CUMULATIVE update rollup.

2. Manually grow your Database sizes and configure SQL

When we installed each database, we used the default of 1GB (1000MB). This is not a good setting for steady state as our databases will need to grow larger than that very soon. We need to pre-grow these to allow for enough free space for maintenance operations, and to keep from having lots of auto-growth activities which impact performance during normal operations.
A good rule of thumb for most deployments of OpsMgr is to set the OpsDB to 50GB for the data file and 25GB for the transaction log file. This can be smaller for POC’s but generally you never want to have an OpsDB set less than 10GB/5GB. Setting the transaction log to 50% of the DB size for the OpsDB is a good rule of thumb.
For the Warehouse – you will need to plan for the space you expect to need using the sizing tools available and pre-size this from time to time so that lots of autogrowths do not occur. The sizing helper is available at: http://www.microsoft.com/en-us/download/details.aspx?id=29270
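The OpsDB rules of thumb above are simple enough to put into a quick sanity check. The sketch below just encodes the numbers from this section (log at 50% of the data file, and a 10GB/5GB floor); the function names and the sample sizes are made up for illustration, and this is no substitute for the sizing helper when planning the Warehouse.

```python
def opsdb_log_size_gb(data_file_gb: float) -> float:
    """Transaction log sized at 50% of the OpsDB data file, per the rule of thumb."""
    return data_file_gb * 0.5


def check_opsdb_sizing(data_file_gb: float, log_file_gb: float) -> list:
    """Return warnings for sizes below the 10GB/5GB floor mentioned above."""
    warnings = []
    if data_file_gb < 10:
        warnings.append("data file below 10GB")
    if log_file_gb < 5:
        warnings.append("log file below 5GB")
    return warnings


recommended_log = opsdb_log_size_gb(50)      # 50GB data file
print(recommended_log)

# Hypothetical undersized values, e.g. a database left close to the install default.
print(check_opsdb_sizing(data_file_gb=1, log_file_gb=0.5))
```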

3. Deploy an agent to the SQL DB server.

This process has not changed from OpsMgr 2012, so you would use the typical mechanism to push or manually install. You can also refer to: https://technet.microsoft.com/en-us/system-center-docs/om/manage/managing-discovery-and-agents
You could also deploy any additional agents at this point.

4. Import management packs. Also refer to: https://technet.microsoft.com/en-us/system-center-docs/om/manage/using-management-packs

Using the console – you can import MP’s using the catalog, or directly importing from disk. I recommend always downloading MP’s and importing from disk. You should keep an MP repository of all MP’s both current and previous, both for disaster recovery and in the case you need to revert to an older MP at any time.
Import the Base OS and SQL MP’s at a minimum.

5. Enable Agent Proxy

I prefer to simply enable agent proxy for all agents. The BEST way to do this is to enable Agent Proxy as a default setting. That way you will never have to mess with this again:
https://blogs.technet.microsoft.com/kevinholman/2014/02/11/opsmgr-2012-enable-agent-proxy-on-all-agents/

6. Configure your OpsMgr environment to accept manually installed agents.

The default is to block manually installed agents. I recommend setting this to “Review new manual agent installations”

7. Configure Notifications:

http://blogs.technet.com/b/kevinholman/archive/2012/04/28/opsmgr-2012-configure-notifications.aspx

8. Deploy Unix and Linux Agents

http://blogs.technet.com/b/kevinholman/archive/2012/03/18/deploying-unix-linux-agents-using-opsmgr-2012.aspx

9. Configure Network Monitoring

http://blogs.technet.com/b/kevinholman/archive/2011/07/21/opsmgr-2012-discovering-a-network-device.aspx

10. Configure SQL MP RunAs Security:

11. Create a dashboard view:

http://technet.microsoft.com/en-us/library/hh230752.aspx#bkmk_howtocreateadashboardview

12. Continue with optional activities from the Quick Start guide on TechNet:

http://technet.microsoft.com/en-us/library/hh230738.aspx

13. Configure your management group to support APM monitoring.

http://technet.microsoft.com/en-us/library/hh543994.aspx
Import supporting management packs for IIS 7 and 8, and APM Web for IIS 7 and 8.

14. Deploy Audit Collection Services

http://technet.microsoft.com/en-us/library/hh298613.aspx
Install the audit collector on a management server, and create a database on a SQL server.
Upload the reports for ACS, my command is: UploadAuditReports.cmd "DB1" "http://db1/ReportServer" "c:\acs"
Create and set a filter: http://technet.microsoft.com/en-us/library/hh230740.aspx
You will need to grant NETWORK SERVICE full control to the AdtServer registry key to set a filter at the command line: http://social.technet.microsoft.com/Forums/en-US/operationsmanagerreporting/thread/ab22685e-36a1-49a9-b90e-d39ead31901f
My initial filter for lab use is: adtadmin /setquery /query:"SELECT * FROM AdtsEvent WHERE NOT (EventId=4768 OR EventId=4769 OR EventId=4624 OR EventId=4634 OR EventId=4672 OR EventId=4776)"
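If you tweak the list of excluded event IDs often, generating the filter text is less error-prone than editing the long OR chain by hand. A minimal sketch, where `build_acs_filter` is a made-up helper name and the query shape is simply copied from the lab filter above:

```python
def build_acs_filter(excluded_event_ids):
    """Build a filter query that drops the given event IDs."""
    if not excluded_event_ids:
        raise ValueError("at least one event ID is required")
    conditions = " OR ".join(f"EventId={event_id}" for event_id in excluded_event_ids)
    return f"SELECT * FROM AdtsEvent WHERE NOT ({conditions})"


lab_filter = build_acs_filter([4768, 4769, 4624, 4634, 4672, 4776])

# The full command line to paste into an elevated prompt on the collector.
command = f'adtadmin /setquery /query:"{lab_filter}"'
print(command)
```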

15. Learn MP authoring.

Fragments the fast and easy way with Visual Studio: https://blogs.technet.microsoft.com/kevinholman/2016/06/04/authoring-management-packs-the-fast-and-easy-way-using-visual-studio/
Download MPAuthor: http://www.silect.com/mp-author/

↧

UR1 for SCOM 2016 – Step by Step

October 22, 2016, 12:40 am

KB Article for OpsMgr: https://support.microsoft.com/en-us/kb/3190029

Download catalog site: http://catalog.update.microsoft.com/v7/site/Search.aspx?q=3190029

NOTE: I get this question every time we release an update rollup: ALL SCOM Update Rollups are CUMULATIVE. This means you do not need to apply them in order, you can always just apply the latest update. If you have deployed SCOM 2016 and never applied an update rollup – you can go straight to the latest one available.

Key fixes: We aren’t listing them.

Wait. What did he just say?

That’s right. We aren’t listing them in the KB like we normally do. There is a huge list of fixes, and detailing them all would be fairly pointless. UR1 was shipped the same day that SCOM 2016 became Generally Available. This IS the GA release. (SCOM 2016 UR1). You don’t need to look at the list and evaluate the fixes – you NEED to apply this first update. We did the same thing in SCOM 2012, the UR1 was critical and shipped at the same time the product became GA and officially supported. So just apply it. ASAP. Mmmmmkay?

Let’s get started.

From reading the KB article – the order of operations is:

Install the update rollup package on the following server infrastructure:

Management servers
Web console server role computers
Operations console role computers

Apply SQL scripts.

Manually import the management packs.

Update Agents

Additionally, we will add the steps to update any Linux management packs and agents, if they are present.

1. Management Servers

I can apply this update manually via the MSP files, or I can use Windows Update. I have 2 management servers, so I will demonstrate both. I will do the first management server manually. This management server holds 3 roles, and each must be patched: Management Server, Web Console, and Console.

The first thing I do when I download the updates from the catalog, is copy the cab files for my language to a single location, and then extract the contents:

Once I have the MSP files, I am ready to start applying the update to each server by role.

***Note: You MUST log on to each server role as a Local Administrator, SCOM Admin, AND your account must also have System Administrator role to the SQL database instances that host your OpsMgr databases.

This launches a quick UI which applies the update. It will bounce the SCOM services as well. The update usually does not provide any feedback that it had success or failure.

You can check the application log for the MsiInstaller events to show completion:

Log Name:      Application
Source:        MsiInstaller
Date:          10/22/2016 1:11:18 AM
Event ID:      1036
Description:
Windows Installer installed an update. Product Name: System Center Operations Manager 2016 Server. Product Version: 7.2.11719.0. Product Language: 1033. Manufacturer: Microsoft Corporation. Update Name: System Center 2016 Operations Manager UR1 Update Patch. Installation success or error status: 0.
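Since the installer gives no visible feedback, the status code at the end of that event description is the thing to look at (0 means success). If you export these events as text, a rough Python sketch for pulling out the update name and status could look like this. The regular expression is written against the description format shown above, and `parse_update_event` is a made-up helper name.

```python
import re

# Written against the MsiInstaller description text shown above.
UPDATE_EVENT = re.compile(
    r"Update Name: (?P<update>.+?)\. "
    r"Installation success or error status: (?P<status>\d+)\."
)


def parse_update_event(description):
    """Return (update name, status code) from an MsiInstaller update event."""
    match = UPDATE_EVENT.search(description)
    if match is None:
        raise ValueError("not an MsiInstaller update event description")
    return match.group("update"), int(match.group("status"))


description = (
    "Windows Installer installed an update. "
    "Product Name: System Center Operations Manager 2016 Server. "
    "Product Version: 7.2.11719.0. Product Language: 1033. "
    "Manufacturer: Microsoft Corporation. "
    "Update Name: System Center 2016 Operations Manager UR1 Update Patch. "
    "Installation success or error status: 0."
)
update_name, status = parse_update_event(description)
print(update_name, "succeeded" if status == 0 else f"failed with status {status}")
```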

You can also spot check a couple of DLL files for the file version attribute.

Next up – run the Web Console update:

This runs much faster. A quick file spot check:

Lastly – install the console update (make sure your console is closed):

A quick file spot check:

Additional Management Servers:

Windows Update did not have the UR1 available from the web at the time of this posting – so I will continue to patch my additional management servers manually (which I prefer anyway!)

2. Apply the SQL Scripts

In the path on your management servers, where you installed/extracted the update, there is ONE SQL script file:

%SystemDrive%\Program Files\Microsoft System Center 2016\Operations Manager\Server\SQL Script for Update Rollups
(note – your path may vary slightly depending on if you have an upgraded environment or clean install)

***Warning: At the time of this posting – the KB article is wrong. It references the data warehouse DB and a script name of UR_Datawarehouse.sql. However – UR1 for SCOM 2016 contains a script to be run against the OperationsManager database, with a name of update_rollup_mom_db.sql

You should run this script with each UR, even if you ran this on a previous UR. The script body can change so as a best practice always re-run this.

You will see the following (or similar) output:

Technical tidbit: Even if you previously ran this script in any previous UR deployment, you should run this again in this update, as the script body can change with updated UR’s.

3. Manually import the management packs

There are 8 management packs in this update! Most of these we don’t need – so read carefully.

The path for these is on your management server, after you have installed the “Server” update:

\Program Files\Microsoft System Center 2016\Operations Manager\Server\Management Packs for Update Rollups

However, the majority of them are Advisor/OMS, and language specific. Only import the ones you need, and that are correct for your language.

This is the initial import list:

What NOT to import:

The Advisor MP’s are only needed if you are using Microsoft Operations Management Suite cloud service, (Previously known as Advisor, and Operations Insights).
The Alert Attachment MP update is only needed if you are already using that MP for very specific other MP’s that depend on it (rare)
The IntelliTrace Profiling MP requires IIS MP’s and is only used if you want this feature in conjunction with APM.

So I remove what I don’t want or need – and I have this:

These import without issue.

4. Update Agents

Agents should be placed into pending actions by this update for any agent that was not manually installed (remotely manageable = yes):

You can approve these – which will result in a success message once complete:

5. Update Unix/Linux MPs and Agents

The current Linux MP’s at the time of this posting are on the SCOM 2016 Media in the “Management Packs” folder.

7.6.1064.0 is current at this time for SCOM 2016 UR1.

Import any MP’s you wish to use with SCOM. These are mine for RHEL, SUSE, and the Universal Linux libraries. There are no updates specific to UR1.

6. Update the remaining deployed consoles

Review:

Now at this point, we would check the OpsMgr event logs on our management servers, check for any new or strange alerts coming in, and ensure that there are no issues after the update.

↧

Enabling Scheduled Maintenance in SCOM 2016 UR1

October 22, 2016, 1:24 pm

When you try to use the new Scheduled Maintenance feature in SCOM 2016 UR1, you will probably see the following error pop up as soon as you select “Maintenance Schedules” in the Operations Console:

Date: 10/22/2016 3:03:32 PM
Application: Operations Manager
Application Version: 7.2.11719.0
Severity: Error
Message:

The EXECUTE permission was denied on the object ‘sp_help_jobactivity’, database ‘msdb’, schema ‘dbo’.
The data access service account might not have the required permissions

If you move forward and try to create a maintenance schedule – you will see something like this:

Note:  The following information was gathered when the operation was attempted.  The information may appear cryptic but provides context for the error.  The application will continue to run.

Microsoft.EnterpriseManagement.Common.ServerDisconnectedException: The client has been disconnected from the server. Please call ManagementGroup.Reconnect() to reestablish the connection. ---> System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.

Server stack trace:
   at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted()
   at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
   at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
   at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]:
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at Microsoft.EnterpriseManagement.Common.Internal.IDispatcherService.DispatchUnknownMessage(Message message)
   at Microsoft.EnterpriseManagement.Common.Internal.AdministrationServiceProxy.CreateMaintenanceSchedule(String scheduleName, Boolean recursive, Boolean isEnabled, Boolean isRecurrence, Boolean isEndTimeSpecified, Int32 duration, Int32 reason, String comments, String managedEntityIdList, Int32 freqType, Int32 freqInterval, Int32 freqSubdayType, Int32 freqSubdayInterval, Int32 freqRelativeInterval, Int32 freqRecurrenceFactor, DateTime activeStartTime, DateTime activeEndDate)
   --- End of inner exception stack trace ---
   at Microsoft.EnterpriseManagement.Common.Internal.ExceptionHandlers.HandleChannelExceptions(Exception ex)
   at Microsoft.EnterpriseManagement.Common.Internal.AdministrationServiceProxy.CreateMaintenanceSchedule(String scheduleName, Boolean recursive, Boolean isEnabled, Boolean isRecurrence, Boolean isEndTimeSpecified, Int32 duration, Int32 reason, String comments, String managedEntityIdList, Int32 freqType, Int32 freqInterval, Int32 freqSubdayType, Int32 freqSubdayInterval, Int32 freqRelativeInterval, Int32 freqRecurrenceFactor, DateTime activeStartTime, DateTime activeEndDate)
   at Microsoft.EnterpriseManagement.Monitoring.MaintenanceSchedule.MaintenanceSchedule.CreateMaintenanceSchedule(MaintenanceSchedule maintenanceSchedule, ManagementGroup mg)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Administration.MaintenanceModeSchedule.Pages.MaintenanceModeScheduleDetailsPage.<OnSave>b__0(Object param0, ConsoleJobEventArgs param1)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Console.ConsoleJobExceptionHandler.ExecuteJob(IComponent component, EventHandler`1 job, Object sender, ConsoleJobEventArgs args)
System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.

Server stack trace:
   at System.ServiceModel.Channels.CommunicationObject.ThrowIfFaulted()
   at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
   at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
   at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)

Exception rethrown at [0]:
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at Microsoft.EnterpriseManagement.Common.Internal.IDispatcherService.DispatchUnknownMessage(Message message)
   at Microsoft.EnterpriseManagement.Common.Internal.AdministrationServiceProxy.CreateMaintenanceSchedule(String scheduleName, Boolean recursive, Boolean isEnabled, Boolean isRecurrence, Boolean isEndTimeSpecified, Int32 duration, Int32 reason, String comments, String managedEntityIdList, Int32 freqType, Int32 freqInterval, Int32 freqSubdayType, Int32 freqSubdayInterval, Int32 freqRelativeInterval, Int32 freqRecurrenceFactor, DateTime activeStartTime, DateTime activeEndDate)

This is caused by the SCOM Data Access account needing some additional permissions in SQL in order to control maintenance schedules via SQL Agent. You will need to configure this just once to get it going.

In SQL Management studio – expand Security, Logins, and find the account you used for the SDK/DAS account.

Right-click the account and choose Properties.

Select User Mapping, and check the box next to the MSDB database.

Grant the following rights for the SDK/DAS account to the MSDB database:

SQLAgentOperatorRole
SQLAgentReaderRole
SQLAgentUserRole

Once you apply this one-time change, the Maintenance Schedules feature works perfectly.
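The same one-time change can be scripted instead of clicked through. Below is a minimal Python sketch that only builds the T-SQL text for the three role memberships listed above. The login name `OPSMGR\omdas` is a hypothetical placeholder, and the generated statements assume a SQL Server version that supports `ALTER ROLE ... ADD MEMBER` (SQL Server 2012 or later). Review the output before running it against your own instance.

```python
# Roles named in the steps above; the login used below is a placeholder, not a real account.
MSDB_ROLES = ["SQLAgentOperatorRole", "SQLAgentReaderRole", "SQLAgentUserRole"]


def quote_name(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_msdb_grant_script(login: str) -> str:
    """Return T-SQL that maps the login into msdb and adds it to the SQL Agent roles."""
    user = quote_name(login)
    literal = login.replace("'", "''")
    lines = [
        "USE [msdb];",
        # Create the database user only if the login is not mapped into msdb yet.
        f"IF NOT EXISTS (SELECT 1 FROM sys.database_principals WHERE name = N'{literal}')",
        f"    CREATE USER {user} FOR LOGIN {user};",
    ]
    for role in MSDB_ROLES:
        lines.append(f"ALTER ROLE {quote_name(role)} ADD MEMBER {user};")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical SDK/DAS account name; replace with your own.
    print(build_msdb_grant_script(r"OPSMGR\omdas"))
```

The printed script can be pasted into a query window in SQL Management Studio. It has the same effect as checking the msdb box under User Mapping and ticking the three roles.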

↧

SCOM Console crashes after October Windows cumulative updates – Resolved

October 27, 2016, 7:55 pm

There is an issue where, after patching your Windows Server or workstation machine with the monthly cumulative updates, you might see your SCOM console crash with an exception.

This affects SCOM 2012 and SCOM 2016.

Log Name: Application

Event ID: 1000

Description:

Faulting application name: Microsoft.EnterpriseManagement.Monitoring.Console.exe, version: 7.2.11719.0, time stamp: 0x5798acae

Faulting module name: ntdll.dll, version: 10.0.14393.206, time stamp: 0x57dac931

The KB article explains the issue:

System Center Operations Manager Management Console crashes after you install MS16-118 and MS16-126 https://support.microsoft.com/en-us/kb/3200006

We have released updated patches for each OS now, including the latest branch of Windows 10 and Windows Server 2016.

Smaller individual hotfixes are available for:

Windows Vista
Windows 7
Windows 8.1
Windows Server 2008
Windows Server 2008R2
Windows Server 2012
Windows Server 2012 R2

At the following location: http://catalog.update.microsoft.com/v7/site/Search.aspx?q=3200006

(The Microsoft catalog requires Internet Explorer, FYI)

The fix was applied to the latest cumulative update for Windows 10 and Windows Server 2016:

For Windows 10 RTM: https://support.microsoft.com/en-us/kb/3199125

For Windows 10 version 1511: https://support.microsoft.com/en-us/kb/3200068

For the latest Windows 10 version 1607 and Windows Server 2016: https://support.microsoft.com/en-us/kb/3197954

The Windows 10 and Server 2016 updates are available right now via Windows Update.

↧

Deploying SCOM 2016 Agents to Domain controllers – some assembly required

November 4, 2016, 9:28 am

A fellow PFE (Brian Barrington) called something to my attention: when a SCOM 2016 agent is installed on a domain controller, the agent just sits there and does not communicate.

The reason? Local System is denied by HSLOCKDOWN.

HSLockdown is a tool that grants or denies a particular RunAs account access to the SCOM agent Healthservice. It is documented here.

When we deploy a SCOM 2016 agent to a domain controller, you might see it go into a heartbeat failed state immediately, and on the agent you might see the following events in the Operations Manager log:

Log Name:      Operations Manager
Source:        HealthService
Event ID:      7017
Task Category: Health Service
Level:         Error
Computer:      DC1.opsmgr.net
Description:
The health service blocked access to the windows credential NT AUTHORITY\SYSTEM because it is not authorized on management group SCOM. You can run the HSLockdown tool to change which credentials are authorized.

Followed eventually by a BUNCH of this:

Log Name:      Operations Manager
Source:        HealthService
Event ID:      1102
Task Category: Health Service
Level:         Error
Computer:      DC1.opsmgr.net
Description:
Rule/Monitor “Microsoft.SystemCenter.WMIService.ServiceMonitor” running for instance “DC1.opsmgr.net” with id:”{00A920EF-0147-3FCC-A5DC-CEC1CA93AFED}” cannot be initialized and will not be loaded. Management group “SCOM”

If you open an Elevated command prompt, and browse to the SCOM agent folder – you can run HSLOCKDOWN /L to list the configuration:

There it is. NT Authority\SYSTEM is denied.

I’ll be researching why this change was made – this did not happen by default in SCOM 2012R2.

In the meantime – the resolution is simple.

On domain controllers – simply run the following command in the agent path where HSLOCKDOWN.EXE exists:

HSLockdown.exe <YourManagementGroupName> /R "NT AUTHORITY\SYSTEM"

This will remove the explicit deny for Local System. Then restart the SCOM Microsoft Monitoring Agent service (HealthService).

Here is an example (my management group name is “SCOM”)
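If you need to repeat this on several domain controllers, the two steps can be scripted. Below is a minimal Python sketch that builds the commands described above and optionally runs them. The agent folder shown is an assumed default install path, and the helper names are made up for illustration. It must run in an elevated session on the domain controller itself.

```python
import subprocess

# Assumed default agent install folder; adjust if your agent lives elsewhere.
AGENT_DIR = r"C:\Program Files\Microsoft Monitoring Agent\Agent"


def hslockdown_fix_commands(management_group: str, agent_dir: str = AGENT_DIR):
    """Return the commands that remove the Local System deny entry and restart the agent."""
    hslockdown = agent_dir + r"\HSLockdown.exe"
    return [
        # Remove the explicit deny for Local System in this management group.
        [hslockdown, management_group, "/R", r"NT AUTHORITY\SYSTEM"],
        # Restart the agent service so the change takes effect.
        ["net", "stop", "HealthService"],
        ["net", "start", "HealthService"],
    ]


def apply_fix(management_group: str) -> None:
    """Run the commands in order, stopping at the first failure."""
    for command in hslockdown_fix_commands(management_group):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only print the commands here; call apply_fix("SCOM") to actually run them.
    for command in hslockdown_fix_commands("SCOM"):
        print(subprocess.list2cmdline(command))
```

Printing the commands first makes it easy to compare them with what you would type by hand before letting the script execute anything.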

↧

Does SCOM 2012 R2 support monitoring Windows Server 2016?

November 11, 2016, 5:51 am

This has been coming up quite a bit lately –

The answer is YES, and we have updated the SCOM 2012 R2 documentation:

https://technet.microsoft.com/en-us/library/dn281931(v=sc.12).aspx

There is no minimum UR level required to support this. However, we always recommend applying the most current cumulative update rollup to your SCOM agents.

Operations Manager Windows Agent

Windows Server 2003 SP2
Windows 2008 Server SP2
Windows 2008 Server R2
Windows 2008 Server R2 SP1
Windows Server® 2012
Windows Server® 2012 R2
Microsoft Hyper-V Server ® 2012 R2
Windows Server 2016
Windows XP Pro x64 SP2
Windows XP Pro SP3
Windows Vista SP2
Windows XP Embedded Standard
Windows XP Embedded Enterprise
Windows XP Embedded POSReady
Windows 7 Professional for Embedded Systems
Windows 7 Ultimate for Embedded Systems
Windows 7
Windows® 8
Windows® 8.1
Windows ® 10
Windows Server®2016 Technical Preview

↧

Monitoring UNIX/Linux with OpsMgr 2016

November 11, 2016, 12:38 pm

Microsoft started including Unix and Linux monitoring directly in OpsMgr with OpsMgr 2007 R2, which shipped in 2009. Some significant updates were made to this for OpsMgr 2012. Primarily these updates are around:

Highly available monitoring via Resource Pools
Sudo elevation support for using a low-privilege account with elevation rights for specific workflows
SSH key authentication
New wizards for discovery, agent upgrade, and agent uninstallation
Additional PowerShell cmdlets
Performance and scalability improvements
New monitoring templates for common monitoring tasks

Now – with SCOM 2016 – we have added:

Support for additional releases of operating systems: (Link)
Increased scalability (2x) with asynchronous monitoring workflows
Easier agent deployment using existing RunAs account credentials
New Management Packs and Providers for LAMP stack
New UNIX/Linux Script templates to ease authoring (Link)
Discovery filters for file systems (Link)

I am going to do a step by step guide for getting this deployed with SCOM 2016. As always – a big thanks to Tim Helton of Microsoft for assisting me with all things Unix and Linux.

High Level Overview:

Import Management Packs

Create a resource pool for monitoring Unix/Linux servers

Configure the Xplat certificates (export/import) for each management server in the pool.

Create and Configure Run As accounts for Unix/Linux.

Discover and deploy the agents

Import Management Packs:

The core Unix/Linux libraries are already imported when you install OpsMgr 2016, but not the detailed MP’s for each OS version. These are on the installation media, in the \ManagementPacks directory. Import the specific ones for the Unix or Linux Operating systems that you plan to monitor.

Create a resource pool for monitoring Unix/Linux servers

The FIRST step is to create a Unix/Linux Monitoring Resource pool. This pool will be used and associated with management servers that are dedicated for monitoring Unix/Linux systems in larger environments, or may include existing management servers that also manage Windows agents or Gateways in smaller environments. Regardless, it is a best practice to create a new resource pool for this purpose, as it eases administration and future scalability expansion.

Under Administration, find Resource Pools in the console:

OpsMgr ships 3 resource pools by default:

Let’s create a new one by selecting “Create Resource Pool” from the task pane on the right, and call it “UNIX/Linux Monitoring Resource Pool”

Click Add and then click Search to display all management servers. Select the Management servers that you want to perform Unix and Linux Monitoring. If you only have 1 MS, this will be easy. For high availability – you need at least two management servers in the pool.

Add your management servers and create the pool. In the actions pane – select “View Resource Pool Members” to verify membership.

Configure the Xplat certificates (export/import) for each management server in the pool

Operations Manager uses certificates to authenticate access to the computers it is managing. When the Discovery Wizard deploys an agent, it retrieves the certificate from the agent, signs the certificate, deploys the certificate back to the agent, and then restarts the agent.

To configure for high availability, each management server in the resource pool must have all the root certificates that are used to sign the certificates that are deployed to the agents on the UNIX and Linux computers. Otherwise, if a management server becomes unavailable, the other management servers would not be able to trust the certificates that were signed by the server that failed.

We provide a tool to handle the certificates, named scxcertconfig.exe. Essentially, you must log on to EACH management server that will be part of a Unix/Linux monitoring resource pool and export its SCX (cross-platform) certificate to a file share. Then import each other’s certificates so they are trusted.

If you only have a SINGLE management server, or a single management server in your pool, you can skip this step, then perform it later if you ever add Management Servers to the Unix/Linux Monitoring resource pool.

In this example – I have two management servers in my Unix/Linux resource pool, MS1 and MS2. Open a command prompt on each MS, and export the cert:

On MS1:
C:\Program Files\Microsoft System Center 2016\Operations Manager\Server>scxcertconfig.exe -export \\servername\sharename\MS1.cer
On MS2:
C:\Program Files\Microsoft System Center 2016\Operations Manager\Server>scxcertconfig.exe -export \\servername\sharename\MS2.cer

Once all certs are exported, you must IMPORT the other management server’s certificate:

On MS1:
C:\Program Files\Microsoft System Center 2016\Operations Manager\Server>scxcertconfig.exe -import \\servername\sharename\MS2.cer
On MS2:
C:\Program Files\Microsoft System Center 2016\Operations Manager\Server>scxcertconfig.exe -import \\servername\sharename\MS1.cer

If you fail to perform the above steps – you will get errors when running the Linux agent deployment wizard later.
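With more than two management servers in the pool, the export/import pairs are easy to get wrong by hand. Below is a minimal Python sketch that only prints the scxcertconfig.exe commands each server should run, using the same -export and -import switches shown above. The server names and share path are placeholders, and the helper name is made up for illustration.

```python
def scx_cert_commands(servers, share):
    """Build the scxcertconfig commands each management server should run.

    servers: management server names in the pool (placeholder names below).
    share:   UNC path of a file share that every server can reach.
    Returns a dict mapping server name -> list of command strings.
    """
    plan = {}
    for server in servers:
        # Each server first exports its own SCX certificate to the share ...
        commands = [f"scxcertconfig.exe -export {share}\\{server}.cer"]
        # ... and then imports the certificate of every other pool member.
        for other in servers:
            if other != server:
                commands.append(f"scxcertconfig.exe -import {share}\\{other}.cer")
        plan[server] = commands
    return plan


if __name__ == "__main__":
    pool = ["MS1", "MS2", "MS3"]
    for server, commands in scx_cert_commands(pool, r"\\servername\sharename").items():
        print(f"On {server}:")
        for command in commands:
            print("  " + command)
```

Run every export on all servers before starting any import, since each import depends on a .cer file that another server has to write to the share first.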

Create and Configure Run As accounts for Unix/Linux

Next up we need to create our run-as accounts for Linux monitoring. This is documented here: (Link)

We need to select “UNIX/Linux Accounts” under administration, then “Create Run As Account” from the task pane. This kicks off a special wizard for creating these accounts.

Let’s create the Monitoring account first. Give the monitoring account a display name, and click Next.

On the next screen, type in the credentials that you want to use for monitoring the UNIX/Linux system(s). These accounts must exist on each UNIX/Linux system and have the required permissions granted:

On the above screen – you have two choices. You can use a privileged account for handling monitoring, or you can use an account that is not privileged, but elevated via sudo. I will configure this with the most typical customer scenario – which is to leverage sudo elevation which is specifically granted in the sudoers file. (more on that later)

On the next screen, always choose “more secure” and click “Create”

Now – since we chose More Secure – we must choose the distribution of the Run As account. Find your “UNIX/Linux Monitoring Account” under the UNIX/Linux Accounts screen, and open the properties. On the Distribution Security screen, click Add, then select “Search by resource pool name” and click search. Find your Unix/Linux monitoring resource pool, highlight it, and click Add, then OK. This will distribute this account credential to all Management servers in our pool:

Next up – we will create the Agent Maintenance Account.

This account is used for SSH, to be able to deploy, install, uninstall, upgrade, sign certificates, all dealing with the agent on the UNIX/Linux system.

Give the account a name:

From here you can choose to use an SSH key, or a username and password credential only. You also can choose to leverage a privileged account, or a regular account that uses sudo. I will be choosing the most typical, which is an account that will leverage sudo:

Next – depending on your OS and elevation standards – choose to use SUDO or SU:

On the next screen, always choose “more secure” and click “Create”

Now – since we chose More Secure – we must choose the distribution of the Run As account. Find your “UNIX/Linux Agent Maintenance Account” under the UNIX/Linux Accounts screen, and open the properties. On the Distribution Security screen, click Add, then select “Search by resource pool name” and click search. Find your Unix/Linux monitoring resource pool, highlight it, and click Add, then OK. This will distribute this account credential to all Management servers in our pool:

Next up – we must configure the Run As profiles.

There are three profiles for Unix/Linux accounts:

The agent maintenance account is strictly for agent updates, uninstalls, anything that requires SSH. This will always be associated with a privileged (or sudo elevated) account that has access via SSH, and was created using the Run As account wizard above.

The other two Profiles are used for Monitoring workflows. These are:

Unix/Linux Privileged account
Unix/Linux Action Account

The Privileged Account Profile will always be associated with a Run As account like we created above, that is privileged OR an unprivileged account that has been configured with elevation via sudo. This is what any workflows that typically require elevated rights will execute as.

The Action account is what all your basic monitoring workflows will run as. This will generally be associated with a Run As account, like we created above, but would be used with a non-privileged user account on the Linux systems, and won’t request sudo elevation.

***A note on sudo elevated accounts:

sudo elevation must be passwordless.
requiretty must be disabled for the user.

For my example, I am keeping it very simple. I created two Run As accounts, one for monitoring and one for agent maintenance. I will associate these Run As accounts with the appropriate Run As profiles.

I will start with the Unix/Linux Action Account profile. Right click it – choose properties, and on the Run As Accounts screen, click Add, then select our “UNIX/Linux Monitoring Account”. Leave the default of “All Targeted Objects” and click OK, then save.

Repeat this same process for the Unix/Linux Privileged Account profile, and associate it with your “UNIX/Linux Monitoring Account”.

Repeat this same process for the Unix/Linux Agent Maintenance Account profile, but use the “Unix/Linux Agent Maintenance Account”.

Discover and deploy the agents

Run the discovery wizard.

Click “Add”:

Here you will type in the FQDN of the Linux/Unix agent and its SSH port, and then choose All Computers in the discovery type. (There is another option for discovery type, for when you manually install the Unix/Linux agent, which is really just a simple provider, and then use a signed certificate to authenticate.)

Check the box next to “Use Run As Credentials”. This will leverage our existing Agent Maintenance account for the discovery and deployment.

Click “Save”. On the next screen – select a resource pool. We will choose the resource pool that we already created.

Click Discover, and the results will be displayed:

Check the box next to your discovered system – and click “Manage” to deploy the agent.

DOH!

There are many reasons this could fail. The most common is rights on the UNIX/Linux systems you are trying to manage. In this case, I didn’t configure sudo on the Linux box. Let’s discuss that now.

I need to modify the /etc/sudoers file on each UNIX/Linux server, to grant the granular permissions.

NOTE: The sudoers configuration has changed from SCOM 2012 R2 to SCOM 2016. This is because we no longer install each package directly (such as .rpm packages). Now, each agent is included in a .sh file that has logic to determine which packages are applicable, and install only those. Because of this – even if you configured sudoers for SCOM 2012 R2 and previous support, you will need to make some modifications.

Here is a sample sudoers file for all operating systems, in SCOM 2016:

#-----------------------------------------------------------------------------------
#Example user configuration for Operations Manager 2016 agent
#Example assumes users named: scxmaint & scxmon
#Replace usernames & corresponding /tmp/scx-<username> specification for your environment

#General requirements
Defaults:scxmaint !requiretty

#Agent maintenance
##Certificate signing
scxmaint ALL=(root) NOPASSWD: /bin/sh -c cp /tmp/scx-scxmaint/scx.pem /etc/opt/microsoft/scx/ssl/scx.pem; rm -rf /tmp/scx-scxmaint; /opt/microsoft/scx/bin/tools/scxadmin -restart
scxmaint ALL=(root) NOPASSWD: /bin/sh -c cat /etc/opt/microsoft/scx/ssl/scx.pem

##Install or upgrade
#AIX
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].aix.[[\:digit\:]].ppc.sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].aix.[[\:digit\:]].ppc.sh --upgrade
#HPUX
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].hpux.11iv3.ia64.sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].hpux.11iv3.ia64.sh --upgrade
#RHEL
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].rhel.[[\:digit\:]].x[6-8][4-6].sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].rhel.[[\:digit\:]].x[6-8][4-6].sh --upgrade
#SLES
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].sles.1[[\:digit\:]].x[6-8][4-6].sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].sles.1[[\:digit\:]].x[6-8][4-6].sh --upgrade
#SOLARIS
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].solaris.1[[\:digit\:]].x86.sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].solaris.1[[\:digit\:]].x86.sh --upgrade
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].solaris.1[[\:digit\:]].sparc.sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].solaris.1[[\:digit\:]].sparc.sh --upgrade
#Linux
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].universal[[\:alpha\:]].[[\:digit\:]].x[6-8][4-6].sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].universal[[\:alpha\:]].[[\:digit\:]].x[6-8][4-6].sh --upgrade


##Uninstall
scxmaint ALL=(root) NOPASSWD: /bin/sh -c /opt/microsoft/scx/bin/uninstall

##Log file monitoring
scxmon ALL=(root) NOPASSWD: /opt/microsoft/scx/bin/scxlogfilereader -p

###Examples
#Custom shell command monitoring example – replace <shell command> with the correct command string
# scxmon ALL=(root) NOPASSWD: /bin/bash -c <shell command>

#Daemon diagnostic and restart recovery tasks example (using cron)
#scxmon ALL=(root) NOPASSWD: /bin/sh -c ps -ef | grep cron | grep -v grep
#scxmon ALL=(root) NOPASSWD: /usr/sbin/cron &  


#End user configuration for Operations Manager agent
#-----------------------------------------------------------------------------------

Since the above file contains ALL OS’s and examples, I am going to trim it down to just what I need for this Ubuntu Linux system:

#-----------------------------------------------------------------------------------
#Ubuntu Linux configuration for Operations Manager 2016 agent

#General requirements
Defaults:scxmaint !requiretty

#Agent maintenance
##Certificate signing
scxmaint ALL=(root) NOPASSWD: /bin/sh -c cp /tmp/scx-scxmaint/scx.pem /etc/opt/microsoft/scx/ssl/scx.pem; rm -rf /tmp/scx-scxmaint; /opt/microsoft/scx/bin/tools/scxadmin -restart
scxmaint ALL=(root) NOPASSWD: /bin/sh -c cat /etc/opt/microsoft/scx/ssl/scx.pem

##Install or upgrade
#Linux
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].universal[[\:alpha\:]].[[\:digit\:]].x[6-8][4-6].sh --install; EC=$?; cd /tmp; rm -rf /tmp/scx-scxmaint; exit $EC
scxmaint ALL=(root) NOPASSWD: /bin/sh -c sh /tmp/scx-scxmaint/scx-1.[5-9].[0-9]-[0-9][0-9][0-9].universal[[\:alpha\:]].[[\:digit\:]].x[6-8][4-6].sh --upgrade

##Uninstall
scxmaint ALL=(root) NOPASSWD: /bin/sh -c /opt/microsoft/scx/bin/uninstall

##Log file monitoring
scxmon ALL=(root) NOPASSWD: /opt/microsoft/scx/bin/scxlogfilereader -p
#-----------------------------------------------------------------------------------

I will edit my sudoers file and insert this configuration. You can use vi, visudo, or my personal favorite since I am a Windows guy: download and install WinSCP, which provides a GUI editor for the files and helps anytime you need to transfer files between Windows and UNIX/Linux using SSH. Generally we want to place this configuration in the appropriate section of the sudoers file, not at the end. There are items at the end of the file that need to stay there. I put this right after the existing “Defaults” section in the existing sudoers configuration, and save it.
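One thing worth checking is whether the installer file name that ends up in /tmp/scx-scxmaint actually fits the wildcard in the install rule. Below is a minimal Python sketch that translates the file name part of the Linux rule above into a regular expression and tests names against it. The file names in the example are illustrative, the POSIX classes are approximated with ASCII ranges, and sudo itself does the real matching against the full command line, so treat this only as a quick sanity check.

```python
import re

# File name part of the "#Linux" install rule, copied from the sudoers sample above.
SUDOERS_GLOB = (
    r"scx-1.[5-9].[0-9]-[0-9][0-9][0-9]."
    r"universal[[\:alpha\:]].[[\:digit\:]].x[6-8][4-6].sh"
)

# ASCII approximations of the two POSIX classes used in the sample.
POSIX_CLASSES = {r"[[\:alpha\:]]": "[A-Za-z]", r"[[\:digit\:]]": "[0-9]"}


def glob_to_regex(glob: str) -> re.Pattern:
    """Translate the limited wildcard syntax used in the sample into a regex."""
    parts = []
    i = 0
    while i < len(glob):
        for posix, regex_class in POSIX_CLASSES.items():
            if glob.startswith(posix, i):
                parts.append(regex_class)
                i += len(posix)
                break
        else:
            if glob[i] == "[":
                # Plain bracket expression such as [5-9]: copy it through unchanged.
                end = glob.index("]", i)
                parts.append(glob[i:end + 1])
                i = end + 1
            else:
                # Every other character, including ".", is literal in this wildcard.
                parts.append(re.escape(glob[i]))
                i += 1
    return re.compile("".join(parts))


def matches_install_rule(file_name: str) -> bool:
    """True when the whole file name fits the install rule wildcard."""
    return glob_to_regex(SUDOERS_GLOB).fullmatch(file_name) is not None


if __name__ == "__main__":
    # Illustrative file names only; check the name your management server copies down.
    for name in ("scx-1.6.2-336.universald.1.x64.sh",
                 "scx-1.6.2-1024.universald.1.x64.sh"):
        print(name, matches_install_rule(name))
```

The second name fails because the rule allows exactly three digits after the dash. If a later agent build used a longer build number, the sudoers rule would need to be widened to match.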

Now – back in SCOM – I retry the deployment of the agent:

This will take some time to complete, as the agent is checked for the correct FQDN and certificate, the management servers are inspected to ensure they all have trusted SCX certificates (that we exported/imported above) and the connection is made over SSH, the package is copied down, installed, and the final certificate signing occurs. If all of these checks pass, we get a success!

There are several things that can fail at this point. See the troubleshooting section at the end of this article.

Monitoring Linux servers:

Assuming we got all the way to this point with a successful discovery and agent installation, we need to verify that monitoring is working. After an agent is deployed, the Run As accounts will start being used to run discoveries, and start monitoring. Once enough time has passed for these, check in the Administration pane, under Unix/Linux Computers, and verify that the systems are not listed as “Unknown” but discovered as a specific version of the OS:

Here it is immediately, before the discoveries complete:

Here is what we expect after a few minutes:

Next, go to the Monitoring pane and select the “Unix/Linux Computers” view at the top. Verify that your systems are present and that there is a green healthy check mark next to them:

Next, expand the Unix/Linux Computers folder in the left tree (near the bottom) and make sure we have discovered the individual objects, like Linux Server State, Logical Disk State, and Network Adapter State:

Run Health explorer on one of the discovered Linux Server State objects. Remove the filter at the top to see all the monitors for the system:

Close health explorer.

Select the Operating System Performance view. Review the performance counters we collect out of the box for each monitored OS.

Out of the box – we discover and apply a default monitoring template to the following objects:

Operating System
Logical disk
Network Adapters

Optionally, you can enable discoveries for:

Individual Logical Processors
Physical Disks

I don’t recommend enabling additional discoveries unless you are sure that your monitoring requirements cannot be met without discovering these additional objects, as they will reduce the scalability of your environment.

Out of the box, for an OS like Red Hat Enterprise Linux 5, here is a list of the monitors in place and the objects they target:

There are also 50 or more rules enabled out of the box. 46 are performance collection rules for reporting, and 4 rules are event based, dealing with security. Two are informational letting you know whenever a direct login is made using root credentials via SSH, and when su elevation occurs by a user session. The other two deal with failed attempts for SSH or SU.

To get more out of your monitoring – you might have other services, processes, or log files that you need to monitor. For that, we provide Authoring Templates with wizards to help you add additional monitoring, in the Authoring pane of the console under Management Pack templates:

In the reporting pane – we also offer a large number of reports you can leverage, or you can always create your own using our generic report templates, or custom ones designed in Visual Studio for SQL reporting services.

As you can see, it is a fairly well rounded solution to include Unix and Linux monitoring into a single pane of glass for your other systems, from the Hardware, to the Operating System, to the network layer, to the applications.

Partners and 3rd party vendors also supply additional management packs which extend our Unix and Linux monitoring, to discover and provide detailed monitoring on non-Microsoft applications that run on these Unix and Linux systems.

Troubleshooting:

The majority of troubleshooting comes in the form of failed discovery/agent deployments.

Microsoft has written a wiki on this topic, which covers the majority of these and how to resolve them:

http://social.technet.microsoft.com/wiki/contents/articles/4966.aspx

For instance – if the DNS name that you provided does not match the DNS hostname on the Linux server, or does not match its SSL certificate, or if you failed to export/import the SCX certificates for multiple management servers in the pool, you might see:

Agent verification failed. Error detail: The server certificate on the destination computer (rh5501.opsmgr.net:1270) has the following errors:
The SSL certificate could not be checked for revocation. The server used to check for revocation might be unreachable.

The SSL certificate is signed by an unknown certificate authority.
It is possible that:
1. The destination certificate is signed by another certificate authority not trusted by the management server.
2. The destination has an invalid certificate, e.g., its common name (CN) does not match the fully qualified domain name (FQDN) used for the connection. The FQDN used for the connection is: rh5501.opsmgr.net.
3. The servers in the resource pool have not been configured to trust certificates signed by other servers in the pool.

The solution to these common issues is covered in the Wiki with links to the product documentation.

Perhaps – you failed to properly configure your Run As accounts and profiles. You might see the following show as “Unknown” under administration:

Or you might see alerts in the console:

Alert: UNIX/Linux Run As profile association error event detected

The account for the UNIX/Linux Action Run As profile associated with the workflow “Microsoft.Unix.AgentVersion.Discovery”, running for instance “rh5501.opsmgr.net” with ID {9ADCED3D-B44B-3A82-769D-B0653BFE54F9} is not defined. The workflow has been unloaded. Please associate an account with the profile.

This condition may have occurred because no UNIX/Linux Accounts have been configured for the Run As profile. The UNIX/Linux Run As profile used by this workflow must be configured to associate a Run As account with the target.

Either you failed to configure the Run As accounts, or failed to distribute them, or you chose a low-privilege account that is not properly configured for sudo on the Linux system. Go back and double-check your work there.

If you want to check if the agent was deployed to a RedHat system, you can provide the following command in a shell session:

↧

SCOM SQL queries

November 11, 2016, 7:37 pm

≫ Next: MP University – Fall 2016 – Wednesday Nov 16th

≪ Previous: Monitoring UNIX/Linux with OpsMgr 2016

These queries work for SCOM 2012 and SCOM 2016. Updated 11/11/2016

Large Table query. (I am putting this at the top, because I use it so much – to find out what is taking up so much space in the OpsDB or DW)

--Large Table query.  I am putting this at the top, because I use it so much to find out what is taking up so much space in the OpsDB or DW

SELECT TOP 1000
a2.name AS [tablename], (a1.reserved + ISNULL(a4.reserved,0))* 8 AS reserved,
a1.rows as row_count, a1.data * 8 AS data,
(CASE WHEN (a1.used + ISNULL(a4.used,0)) > a1.data THEN (a1.used + ISNULL(a4.used,0)) - a1.data ELSE 0 END) * 8 AS index_size,
(CASE WHEN (a1.reserved + ISNULL(a4.reserved,0)) > a1.used THEN (a1.reserved + ISNULL(a4.reserved,0)) - a1.used ELSE 0 END) * 8 AS unused,
(row_number() over(order by (a1.reserved + ISNULL(a4.reserved,0)) desc))%2 as l1,
a3.name AS [schemaname]
FROM (SELECT ps.object_id, SUM (CASE WHEN (ps.index_id < 2) THEN row_count ELSE 0 END) AS [rows],
SUM (ps.reserved_page_count) AS reserved,
SUM (CASE WHEN (ps.index_id < 2) THEN (ps.in_row_data_page_count + ps.lob_used_page_count + ps.row_overflow_used_page_count)
ELSE (ps.lob_used_page_count + ps.row_overflow_used_page_count) END ) AS data,
SUM (ps.used_page_count) AS used
FROM sys.dm_db_partition_stats ps
GROUP BY ps.object_id) AS a1
LEFT OUTER JOIN (SELECT it.parent_id,
SUM(ps.reserved_page_count) AS reserved,
SUM(ps.used_page_count) AS used
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.internal_tables it ON (it.object_id = ps.object_id)
WHERE it.internal_type IN (202,204)
GROUP BY it.parent_id) AS a4 ON (a4.parent_id = a1.object_id)
INNER JOIN sys.all_objects a2  ON ( a1.object_id = a2.object_id )
INNER JOIN sys.schemas a3 ON (a2.schema_id = a3.schema_id)
WHERE a2.type <> N'S' and a2.type <> N'IT'
ORDER BY (a1.reserved + ISNULL(a4.reserved,0)) DESC

Database Size and used space. (People have a lot of confusion here – this will show the DB and log file size, plus the used/free space in each)

--Database Size and used space.  
--this will show the DB and log file size plus the used/free space in each

select a.FILEID,
[FILE_SIZE_MB]=convert(decimal(12,2),round(a.size/128.000,2)),
[SPACE_USED_MB]=convert(decimal(12,2),round(fileproperty(a.name,'SpaceUsed')/128.000,2)),
[FREE_SPACE_MB]=convert(decimal(12,2),round((a.size-fileproperty(a.name,'SpaceUsed'))/128.000,2)) ,
[GROWTH_MB]=convert(decimal(12,2),round(a.growth/128.000,2)),
NAME=left(a.NAME,15),
FILENAME=left(a.FILENAME,60)
from dbo.sysfiles a
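
The "* 8" and "/128.000" factors in the two queries above come from SQL Server storing sizes as counts of 8 KB pages: pages times 8 gives KB, and pages divided by 128 gives MB. Here is a minimal Python sketch of that arithmetic. The function names are purely illustrative and are not part of SQL Server or SCOM.

```python
PAGE_SIZE_KB = 8  # SQL Server data pages are 8 KB each


def pages_to_kb(pages: int) -> int:
    """Mirror the '* 8' used in the large table query (result in KB)."""
    return pages * PAGE_SIZE_KB


def pages_to_mb(pages: int) -> float:
    """Mirror the '/128.000' used in the file size query (result in MB).

    1024 KB per MB divided by 8 KB per page = 128 pages per MB.
    """
    return round(pages / 128.0, 2)


if __name__ == "__main__":
    # Illustrative page counts only, not taken from a real database.
    print(pages_to_kb(1000))
    print(pages_to_mb(192))
```

So a file that reports a size of 128 pages in sysfiles is exactly 1 MB on disk.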

Operational Database Queries:

Alerts Section (OperationsManager DB):

Number of console Alerts per Day:

--Number of console Alerts per Day:

SELECT CONVERT(VARCHAR(20), TimeAdded, 102) AS DayAdded, COUNT(*) AS NumAlertsPerDay
FROM Alert WITH (NOLOCK)
WHERE TimeRaised is not NULL
GROUP BY CONVERT(VARCHAR(20), TimeAdded, 102)
ORDER BY DayAdded DESC

Top 20 Alerts in an Operational Database, by Alert Count

--Top 20 Alerts in an Operational Database, by Alert Count

SELECT TOP 20 SUM(1) AS AlertCount,
 AlertStringName AS 'AlertName',
 AlertStringDescription AS 'Description',
 Name,
 MonitoringRuleId
FROM Alertview WITH (NOLOCK)
WHERE TimeRaised is not NULL
GROUP BY AlertStringName, AlertStringDescription, Name, MonitoringRuleId
ORDER BY AlertCount DESC

Top 20 Alerts in an Operational Database, by Repeat Count

--Top 20 Alerts in an Operational Database, by Repeat Count

SELECT TOP 20 SUM(RepeatCount+1) AS RepeatCount,
 AlertStringName as 'AlertName',
 AlertStringDescription as 'Description',
 Name,
 MonitoringRuleId
FROM Alertview WITH (NOLOCK)
WHERE Timeraised is not NULL
GROUP BY AlertStringName, AlertStringDescription, Name, MonitoringRuleId
ORDER BY RepeatCount DESC
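
The two Top 20 alert queries above differ only in what they sum: SUM(1) counts alert rows, while SUM(RepeatCount+1) adds each alert's repeats plus its original occurrence. A small Python sketch, using made-up alert names and repeat counts, shows how the two rankings can diverge. The function name and sample data are assumptions for illustration only.

```python
from collections import defaultdict


def rank_alerts(alerts, top=20):
    """alerts: iterable of (alert_name, repeat_count) tuples.

    Returns two rankings, each a list of (name, total) pairs:
    one by number of alert rows, one by repeat count + 1.
    """
    alert_count = defaultdict(int)
    repeat_total = defaultdict(int)
    for name, repeat_count in alerts:
        alert_count[name] += 1                  # like SUM(1)
        repeat_total[name] += repeat_count + 1  # like SUM(RepeatCount+1)
    by_count = sorted(alert_count.items(), key=lambda kv: kv[1], reverse=True)[:top]
    by_repeat = sorted(repeat_total.items(), key=lambda kv: kv[1], reverse=True)[:top]
    return by_count, by_repeat


# Hypothetical data: three separate alerts versus one alert that repeated 250 times.
sample = [("Disk full", 0), ("Disk full", 0), ("Disk full", 0), ("Script failed", 250)]
by_count, by_repeat = rank_alerts(sample)
print(by_count[0][0], by_repeat[0][0])
```

An alert that rarely opens a new row but repeats constantly only shows up at the top of the repeat count ranking, which is why it is worth running both queries.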

Top 20 Objects generating the most Alerts in an Operational Database, by Repeat Count

--Top 20 Objects generating the most Alerts in an Operational Database, by Repeat Count

SELECT TOP 20 SUM(RepeatCount+1) AS RepeatCount,
 MonitoringObjectPath AS 'Path'
FROM Alertview WITH (NOLOCK)
WHERE Timeraised is not NULL
GROUP BY MonitoringObjectPath
ORDER BY RepeatCount DESC

Top 20 Objects generating the most Alerts in an Operational Database, by Alert Count

--Top 20 Objects generating the most Alerts in an Operational Database, by Alert Count

SELECT TOP 20 SUM(1) AS AlertCount,
 MonitoringObjectPath AS 'Path'
FROM Alertview WITH (NOLOCK)
WHERE TimeRaised is not NULL
GROUP BY MonitoringObjectPath
ORDER BY AlertCount DESC

Number of console Alerts per Day by Resolution State:

--Number of console Alerts per Day by Resolution State:

SELECT
CASE WHEN(GROUPING(CONVERT(VARCHAR(20), TimeAdded, 102)) = 1)
  THEN 'All Days' ELSE CONVERT(VARCHAR(20), TimeAdded, 102)
  END AS [Date],
CASE WHEN(GROUPING(ResolutionState) = 1)
  THEN 'All Resolution States' ELSE CAST(ResolutionState AS VARCHAR(5))
  END AS [ResolutionState],
COUNT(*) AS NumAlerts
FROM Alert WITH (NOLOCK)
WHERE TimeRaised is not NULL
GROUP BY CONVERT(VARCHAR(20), TimeAdded, 102), ResolutionState WITH ROLLUP
ORDER BY [Date] DESC

Events Section (OperationsManager DB):

All Events by count by day, with total for entire database: (this tells us how many events per day we are inserting – and helps us look for too many events, event storms, and the result after tuning rules that generate too many events)

--All Events by count by day, with total for entire database

SELECT CASE WHEN(GROUPING(CONVERT(VARCHAR(20), TimeAdded, 102)) = 1)
THEN 'All Days'
ELSE CONVERT(VARCHAR(20), TimeAdded, 102) END AS DayAdded,
COUNT(*) AS EventsPerDay
FROM EventAllView
GROUP BY CONVERT(VARCHAR(20), TimeAdded, 102) WITH ROLLUP
ORDER BY DayAdded DESC
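
If the WITH ROLLUP / GROUPING combination in the query above is unfamiliar: it returns one row per day plus one extra summary row, which the CASE expression labels 'All Days'. The Python sketch below reproduces that shape on a few hypothetical timestamps. It only illustrates the result format and is not how SQL Server evaluates the query.

```python
from collections import Counter
from datetime import datetime


def events_per_day(timestamps):
    """Return (day, count) rows plus an 'All Days' total, sorted descending."""
    # CONVERT(VARCHAR(20), TimeAdded, 102) formats a date as yyyy.mm.dd
    per_day = Counter(ts.strftime("%Y.%m.%d") for ts in timestamps)
    rows = list(per_day.items())
    # The ROLLUP summary row: the total across all days
    rows.append(("All Days", sum(per_day.values())))
    # A descending string sort puts 'All Days' ahead of the dates
    return sorted(rows, key=lambda row: row[0], reverse=True)


# Hypothetical sample data
sample = [
    datetime(2016, 11, 10, 8, 0),
    datetime(2016, 11, 10, 9, 30),
    datetime(2016, 11, 11, 1, 15),
]
for day, count in events_per_day(sample):
    print(day, count)
```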

Most common events by event number and event source: (This gives us the event source name to help see what is raising these events)

--Most common events by event number and event source

SELECT top 20 Number as EventID,
 COUNT(*) AS TotalEvents,
 Publishername as EventSource
FROM EventAllView eav with (nolock)
GROUP BY Number, Publishername
ORDER BY TotalEvents DESC

Computers generating the most events:

--Computers generating the most events

SELECT top 20 LoggingComputer as ComputerName,
 COUNT(*) AS TotalEvents
FROM EventallView with (NOLOCK)
GROUP BY LoggingComputer
ORDER BY TotalEvents DESC

Performance Section (OperationsManager DB):

Performance insertions per day:

--Performance insertions per day: 

SELECT CASE WHEN(GROUPING(CONVERT(VARCHAR(20), TimeSampled, 102)) = 1)
 THEN 'All Days'
 ELSE CONVERT(VARCHAR(20), TimeSampled, 102)
 END AS DaySampled, COUNT(*) AS PerfInsertPerDay
FROM PerformanceDataAllView with (NOLOCK)
GROUP BY CONVERT(VARCHAR(20), TimeSampled, 102) WITH ROLLUP
ORDER BY DaySampled DESC

Top 20 performance insertions by perf object and counter name: (This shows us which counters are likely overcollected or have duplicate collection rules, and filling the databases)

--Top 20 performance insertions by perf object and counter name: 

SELECT TOP 20 pcv.ObjectName,
 pcv.CounterName,
 COUNT (pcv.countername) AS Total
FROM performancedataallview AS pdv, performancecounterview AS pcv
WHERE (pdv.performancesourceinternalid = pcv.performancesourceinternalid)
GROUP BY pcv.objectname, pcv.countername
ORDER BY COUNT (pcv.countername) DESC

To view all performance data collected for a given computer:

--To view all performance insertions for a given computer:

select Distinct Path,
 ObjectName,
  CounterName,
  InstanceName
from PerformanceDataAllView pdv with (NOLOCK)
inner join PerformanceCounterView pcv on pdv.performancesourceinternalid = pcv.performancesourceinternalid
inner join BaseManagedEntity bme on pcv.ManagedEntityId = bme.BaseManagedEntityId
where path = 'sql2a.opsmgr.net'
order by objectname, countername, InstanceName

To pull all perf data for a given computer, object, counter, and instance:

--To pull all perf data for a given computer, object, counter, and instance:

select Path,
 ObjectName,
 CounterName,
 InstanceName,
 SampleValue,
 TimeSampled
from PerformanceDataAllView pdv with (NOLOCK)
inner join PerformanceCounterView pcv on pdv.performancesourceinternalid = pcv.performancesourceinternalid
inner join BaseManagedEntity bme on pcv.ManagedEntityId = bme.BaseManagedEntityId
where path = 'sql2a.opsmgr.net' AND
 objectname = 'LogicalDisk' AND
 countername = 'Free Megabytes'
order by timesampled DESC

State Section:

To find out how old your StateChange data is:

--To find out how old your StateChange data is:

declare @statedaystokeep INT
SELECT @statedaystokeep = DaysToKeep from PartitionAndGroomingSettings
WHERE ObjectName = 'StateChangeEvent'
SELECT COUNT(*) as 'Total StateChanges',
count(CASE WHEN sce.TimeGenerated > dateadd(dd,-@statedaystokeep,getutcdate()) THEN sce.TimeGenerated ELSE NULL END) as 'within grooming retention',
count(CASE WHEN sce.TimeGenerated < dateadd(dd,-@statedaystokeep,getutcdate()) THEN sce.TimeGenerated ELSE NULL END) as '> grooming retention',
count(CASE WHEN sce.TimeGenerated < dateadd(dd,-30,getutcdate()) THEN sce.TimeGenerated ELSE NULL END) as '> 30 days',
count(CASE WHEN sce.TimeGenerated < dateadd(dd,-90,getutcdate()) THEN sce.TimeGenerated ELSE NULL END) as '> 90 days',
count(CASE WHEN sce.TimeGenerated < dateadd(dd,-365,getutcdate()) THEN sce.TimeGenerated ELSE NULL END) as '> 365 days'
from StateChangeEvent sce
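
Note that the age columns in the query above are cumulative rather than exclusive: a state change that is 400 days old is counted under '> 30 days', '> 90 days', and '> 365 days'. Here is a minimal Python sketch of the same COUNT(CASE WHEN ...) logic, using hypothetical timestamps and an assumed retention of 7 days:

```python
from datetime import datetime, timedelta


def age_buckets(times, now, days_to_keep):
    """Count timestamps per age bucket, mirroring the COUNT(CASE ...) columns."""

    def older_than(days):
        cutoff = now - timedelta(days=days)
        return sum(1 for t in times if t < cutoff)

    retention_cutoff = now - timedelta(days=days_to_keep)
    return {
        "Total StateChanges": len(times),
        "within grooming retention": sum(1 for t in times if t > retention_cutoff),
        "> grooming retention": older_than(days_to_keep),
        "> 30 days": older_than(30),
        "> 90 days": older_than(90),
        "> 365 days": older_than(365),
    }


# Hypothetical data: state changes that are 1, 10, 40, 100 and 400 days old
now = datetime(2016, 11, 11)
times = [now - timedelta(days=d) for d in (1, 10, 40, 100, 400)]
for label, count in age_buckets(times, now, days_to_keep=7).items():
    print(label, count)
```

Anything counted in '> grooming retention' is data that normal grooming has not removed, which is what the cleanup script referenced next is meant to address.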

Cleanup old statechanges for disabled monitors: http://blogs.technet.com/kevinholman/archive/2009/12/21/tuning-tip-do-you-have-monitors-constantly-flip-flopping.aspx

USE [OperationsManager]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
BEGIN
    SET NOCOUNT ON
    DECLARE @Err int
    DECLARE @Ret int
    DECLARE @DaysToKeep tinyint
    DECLARE @GroomingThresholdLocal datetime
    DECLARE @GroomingThresholdUTC datetime
    DECLARE @TimeGroomingRan datetime
    DECLARE @MaxTimeGroomed datetime
    DECLARE @RowCount int
    SET @TimeGroomingRan = getutcdate()
    SELECT @GroomingThresholdLocal = dbo.fn_GroomingThreshold(DaysToKeep, getdate())
    FROM dbo.PartitionAndGroomingSettings
    WHERE ObjectName = 'StateChangeEvent'
    EXEC dbo.p_ConvertLocalTimeToUTC @GroomingThresholdLocal, @GroomingThresholdUTC OUT
    SET @Err = @@ERROR
    IF (@Err <> 0)
    BEGIN
        GOTO Error_Exit
    END
    SET @RowCount = 1
    -- This is to update the settings table 
    -- with the max groomed data 
    SELECT @MaxTimeGroomed = MAX(TimeGenerated)
    FROM dbo.StateChangeEvent
    WHERE TimeGenerated < @GroomingThresholdUTC
    IF @MaxTimeGroomed IS NULL
        GOTO Success_Exit
    -- Instead of the FK DELETE CASCADE handling the deletion of the rows from 
    -- the MJS table, do it explicitly. Performance is much better this way. 
    DELETE MJS
    FROM dbo.MonitoringJobStatus MJS
    JOIN dbo.StateChangeEvent SCE
        ON SCE.StateChangeEventId = MJS.StateChangeEventId
    JOIN dbo.State S WITH(NOLOCK)
        ON SCE.[StateId] = S.[StateId]
    WHERE SCE.TimeGenerated < @GroomingThresholdUTC
    AND S.[HealthState] in (0,1,2,3)
    SELECT @Err = @@ERROR
    IF (@Err <> 0)
    BEGIN
        GOTO Error_Exit
    END
    WHILE (@RowCount > 0)
    BEGIN
        -- Delete StateChangeEvents that are older than @GroomingThresholdUTC 
        -- We are doing this in chunks in separate transactions on 
        -- purpose: to avoid the transaction log to grow too large. 
        DELETE TOP (10000) SCE
        FROM dbo.StateChangeEvent SCE
        JOIN dbo.State S WITH(NOLOCK)
            ON SCE.[StateId] = S.[StateId]
        WHERE TimeGenerated < @GroomingThresholdUTC
        AND S.[HealthState] in (0,1,2,3)
        SELECT @Err = @@ERROR, @RowCount = @@ROWCOUNT
        IF (@Err <> 0)
        BEGIN
            GOTO Error_Exit
        END
    END
    UPDATE dbo.PartitionAndGroomingSettings
    SET GroomingRunTime = @TimeGroomingRan,
        DataGroomedMaxTime = @MaxTimeGroomed
    WHERE ObjectName = 'StateChangeEvent'
    SELECT @Err = @@ERROR, @RowCount = @@ROWCOUNT
    IF (@Err <> 0)
    BEGIN
        GOTO Error_Exit
    END
Success_Exit:
Error_Exit:
END
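
The key pattern in the script above is the WHILE loop around DELETE TOP (10000): old rows are removed in small chunks so that no single transaction makes the transaction log grow too large. The Python sketch below shows the same loop against an in-memory SQLite database as a stand-in. The table is a simplified assumption (only two columns), and SQLite uses a LIMIT subquery where T-SQL uses DELETE TOP, so treat this as an illustration of the pattern and not something to run against a SCOM database.

```python
import sqlite3


def delete_in_batches(conn, cutoff, batch_size=10000):
    """Delete rows older than cutoff in chunks, committing after each chunk."""
    total_deleted = 0
    while True:
        cur = conn.execute(
            "DELETE FROM StateChangeEvent WHERE rowid IN ("
            "  SELECT rowid FROM StateChangeEvent"
            "  WHERE TimeGenerated < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # each chunk is its own transaction
        if cur.rowcount == 0:
            break  # nothing left to delete, like @RowCount reaching 0
        total_deleted += cur.rowcount
    return total_deleted


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE StateChangeEvent ("
        "StateChangeEventId INTEGER PRIMARY KEY, TimeGenerated TEXT)"
    )
    # 30 hypothetical rows, one per day in January 2016
    rows = [(f"2016-01-{day:02d}",) for day in range(1, 31)]
    conn.executemany("INSERT INTO StateChangeEvent (TimeGenerated) VALUES (?)", rows)
    conn.commit()
    deleted = delete_in_batches(conn, "2016-01-21", batch_size=7)
    remaining = conn.execute("SELECT COUNT(*) FROM StateChangeEvent").fetchone()[0]
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

A smaller batch size means more round trips but shorter transactions. The 10000 used in the script above is the value from the original, and the 7 in the demo is only there to force several loop iterations.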

State changes per day:

--State changes per day: 

SELECT CASE WHEN(GROUPING(CONVERT(VARCHAR(20), TimeGenerated, 102)) = 1)
THEN 'All Days' ELSE CONVERT(VARCHAR(20), TimeGenerated, 102)
END AS DayGenerated, COUNT(*) AS StateChangesPerDay
FROM StateChangeEvent WITH (NOLOCK)
GROUP BY CONVERT(VARCHAR(20), TimeGenerated, 102) WITH ROLLUP
ORDER BY DayGenerated DESC

Noisiest monitors changing state in the database in the last 7 days:

--Noisiest monitors changing state in the database in the last 7 days:

SELECT DISTINCT TOP 50 count(sce.StateId) as StateChanges,
  m.DisplayName as MonitorName,
  m.Name as MonitorId,
  mt.typename AS TargetClass
FROM StateChangeEvent sce with (nolock)
join state s with (nolock) on sce.StateId = s.StateId
join monitorview m with (nolock) on s.MonitorId = m.Id
join managedtype mt with (nolock) on m.TargetMonitoringClassId = mt.ManagedTypeId
where m.IsUnitMonitor = 1
  -- Scoped to within last 7 days 
AND sce.TimeGenerated > dateadd(dd,-7,getutcdate())
group by m.DisplayName, m.Name,mt.typename
order by StateChanges desc

Noisiest Monitor in the database – PER Object/Computer in the last 7 days:

--Noisiest Monitor in the database – PER Object/Computer in the last 7 days:

select distinct top 50 count(sce.StateId) as NumStateChanges,
bme.DisplayName AS ObjectName,
bme.Path,
m.DisplayName as MonitorDisplayName,
m.Name as MonitorIdName,
mt.typename AS TargetClass
from StateChangeEvent sce with (nolock)
join state s with (nolock) on sce.StateId = s.StateId
join BaseManagedEntity bme with (nolock) on s.BasemanagedEntityId = bme.BasemanagedEntityId
join MonitorView m with (nolock) on s.MonitorId = m.Id
join managedtype mt with (nolock) on m.TargetMonitoringClassId = mt.ManagedTypeId
where m.IsUnitMonitor = 1
   -- Scoped to specific Monitor (remove the "--" below): 
   -- AND m.Name like ('%HealthService%') 
   -- Scoped to specific Computer (remove the "--" below): 
   -- AND bme.Path like ('%sql%') 
   -- Scoped to within last 7 days 
AND sce.TimeGenerated > dateadd(dd,-7,getutcdate())
group by s.BasemanagedEntityId,bme.DisplayName,bme.Path,m.DisplayName,m.Name,mt.typename
order by NumStateChanges desc

Management Pack info:

Rules section:

--To find a common rule name given a Rule ID name:
SELECT DisplayName from RuleView
where name = 'Microsoft.SystemCenter.GenericNTPerfMapperModule.FailedExecution.Alert'

--Rules per MP:
SELECT mp.MPName, COUNT(*) As RulesPerMP
FROM Rules r
INNER JOIN ManagementPack mp ON mp.ManagementPackID = r.ManagementPackID
GROUP BY mp.MPName
ORDER BY RulesPerMP DESC

--Rules per MP by category:
SELECT mp.MPName, r.RuleCategory, COUNT(*) As RulesPerMPPerCategory
FROM Rules r
INNER JOIN ManagementPack mp ON mp.ManagementPackID = r.ManagementPackID
GROUP BY mp.MPName, r.RuleCategory
ORDER BY RulesPerMPPerCategory DESC

--To find all rules per MP with a given alert severity:
declare @mpid as varchar(50)
select @mpid= managementpackid
  from managementpack
  where mpName='Microsoft.SystemCenter.2007'
select rl.rulename,rl.ruleid,md.modulename
  from rules rl, module md
  where md.managementpackid = @mpid
  and rl.ruleid=md.parentid
  and moduleconfiguration like '%<Severity>2%'

--Rules are stored in a table named Rules. This table has columns linking rules to classes and Management Packs. 
--To find all rules in a Management Pack use the following query and substitute in the required Management Pack name:
SELECT *
FROM Rules
WHERE ManagementPackID = (SELECT ManagementPackID from ManagementPack WHERE MPName = 'Microsoft.SystemCenter.2007')

--To find all rules targeted at a given class use the following query and substitute in the required class name:
SELECT * FROM Rules WHERE TargetManagedEntityType = (SELECT ManagedTypeId FROM ManagedType WHERE TypeName = 'Microsoft.Windows.Computer')

Monitors Section:

--Monitors Per MP:
SELECT mp.MPName, COUNT(*) As MonitorsPerMP
FROM Monitor m
INNER JOIN ManagementPack mp ON mp.ManagementPackID = m.ManagementPackID
GROUP BY mp.MPName
ORDER BY COUNT(*) Desc

--To find your Monitor by common name:
select * from Monitor m
Inner join LocalizedText LT on LT.ElementName = m.MonitorName
where LTValue = 'Monitor Common Name'

--To find your Monitor by ID name:
select * from Monitor m
Inner join LocalizedText LT on LT.ElementName = m.MonitorName
where m.monitorname = 'your Monitor ID name'

--To find all monitors targeted at a specific class:
SELECT * FROM monitor WHERE TargetManagedEntityType = (SELECT ManagedTypeId FROM ManagedType WHERE TypeName = 'Microsoft.Windows.Computer')

Groups Section:

--To find all members of a given group (change the group name below):
select TargetObjectDisplayName as 'Group Members'
from RelationshipGenericView
where isDeleted=0
AND SourceObjectDisplayName = 'All Windows Computers'
ORDER BY TargetObjectDisplayName

--To find the entity data on all members of a given group (change the group name below):
SELECT bme.*
FROM BaseManagedEntity bme
INNER JOIN RelationshipGenericView rgv WITH(NOLOCK) ON bme.basemanagedentityid = rgv.TargetObjectId
WHERE bme.IsDeleted = '0'
AND rgv.SourceObjectDisplayName = 'All Windows Computers'
ORDER BY bme.displayname

--To find all groups for a given computer/object (change the computer name in the query below):
SELECT SourceObjectDisplayName AS 'Group'
FROM RelationshipGenericView
WHERE TargetObjectDisplayName like ('%sql2a.opsmgr.net%')
AND (SourceObjectDisplayName IN
(SELECT ManagedEntityGenericView.DisplayName
FROM ManagedEntityGenericView INNER JOIN
(SELECT     BaseManagedEntityId
FROM          BaseManagedEntity WITH (NOLOCK)
WHERE      (BaseManagedEntityId = TopLevelHostEntityId) AND (BaseManagedEntityId NOT IN
(SELECT     R.TargetEntityId
FROM          Relationship AS R WITH (NOLOCK) INNER JOIN
dbo.fn_ContainmentRelationshipTypes() AS CRT ON R.RelationshipTypeId = CRT.RelationshipTypeId
WHERE      (R.IsDeleted = 0)))) AS GetTopLevelEntities ON
GetTopLevelEntities.BaseManagedEntityId = ManagedEntityGenericView.Id INNER JOIN
(SELECT DISTINCT BaseManagedEntityId
FROM          TypedManagedEntity WITH (NOLOCK)
WHERE      (ManagedTypeId IN
(SELECT     DerivedManagedTypeId
FROM dbo.fn_DerivedManagedTypes(dbo.fn_ManagedTypeId_Group()) AS fn_DerivedManagedTypes_1))) AS GetOnlyGroups ON
GetOnlyGroups.BaseManagedEntityId = ManagedEntityGenericView.Id))
ORDER BY 'Group'

Management Pack and Instance Space misc queries:

--To find all installed Management Packs and their version:
SELECT Name AS 'ManagementPackID',
 FriendlyName,
 DisplayName,
 Version,
 Sealed,
 LastModified,
 TimeCreated
FROM ManagementPackView
WHERE LanguageCode = 'ENU'
OR LanguageCode IS NULL
ORDER BY DisplayName

--Number of Views per Management Pack:
SELECT mp.MPName, v.ViewVisible, COUNT(*) As ViewsPerMP
FROM [Views] v
            INNER JOIN ManagementPack mp ON mp.ManagementPackID = v.ManagementPackID
GROUP BY  mp.MPName, v.ViewVisible
ORDER BY v.ViewVisible DESC, COUNT(*) Desc

--How to gather all the views in the database, their ID, MP location, and view type:
select vv.id as 'View Id',
vv.displayname as 'View DisplayName',
vv.name as 'View Name',
vtv.DisplayName as 'ViewType',
mpv.FriendlyName as 'MP Name'
from ViewsView vv
inner join managementpackview mpv on mpv.id = vv.managementpackid
inner join viewtypeview vtv on vtv.id = vv.monitoringviewtypeid
-- where mpv.FriendlyName like '%default%' 
-- where vv.displayname like '%operating%' 
order by mpv.FriendlyName, vv.displayname

--Classes available in the DB:
SELECT count(*) FROM ManagedType

--Total BaseManagedEntities
SELECT count(*) FROM BaseManagedEntity

--To get the state of every instance of a particular monitor the following query can be run, (replace <Health Service Heartbeat Failure> with the name of the monitor):
SELECT bme.FullName,
 bme.DisplayName,
 s.HealthState
FROM state AS s,
 BaseManagedEntity as bme
WHERE s.basemanagedentityid = bme.basemanagedentityid
AND s.monitorid IN (SELECT Id FROM MonitorView WHERE DisplayName = 'Health Service Heartbeat Failure')

--For example, this gets the state of the Microsoft.SQLServer.2012.DBEngine.ServiceMonitor for each instance of the SQL 2012 Database Engine class.
SELECT bme.FullName,
 bme.DisplayName,
 s.HealthState
FROM state AS s, BaseManagedEntity as bme
WHERE s.basemanagedentityid = bme.basemanagedentityid
AND s.monitorid IN (SELECT MonitorId FROM Monitor WHERE MonitorName = 'Microsoft.SQLServer.2012.DBEngine.ServiceMonitor')

--To find the overall state of any object in OpsMgr the following query should be used to return the state of the System.Health.EntityState monitor:
SELECT bme.FullName,
 bme.DisplayName,
 s.HealthState
FROM state AS s, BaseManagedEntity as bme
WHERE s.basemanagedentityid = bme.basemanagedentityid AND s.monitorid IN (SELECT MonitorId FROM Monitor WHERE MonitorName = 'System.Health.EntityState')

--The Alert table contains all alerts currently stored in OpsMgr, including resolved alerts until they are groomed out of the database. To get all alerts across all instances of a given monitor use the following query and substitute in the required monitor name:
SELECT * FROM Alert WHERE ProblemID IN (SELECT MonitorId FROM Monitor WHERE MonitorName = 'Microsoft.SQLServer.2012.DBEngine.ServiceMonitor')

--To retrieve all alerts for all instances of a specific class use the following query and substitute in the required table name, in this example MT_Microsoft$SQLServer$2012$DBEngine is used to look for SQL alerts:
SELECT * FROM Alert WHERE BaseManagedEntityID IN (SELECT BaseManagedEntityID from MT_Microsoft$SQLServer$2012$DBEngine)

--To determine which table is currently being written to for event and performance data use the following query:
SELECT * FROM PartitionTables WHERE IsCurrent = 1

--Number of instances of a type:  (Number of disks, computers, databases, etc that OpsMgr has discovered) 
SELECT mt.TypeName, COUNT(*) AS NumEntitiesByType
FROM BaseManagedEntity bme WITH(NOLOCK)
LEFT JOIN ManagedType mt WITH(NOLOCK) ON mt.ManagedTypeID = bme.BaseManagedTypeID
WHERE bme.IsDeleted = 0
GROUP BY mt.TypeName
ORDER BY COUNT(*) DESC

--To retrieve all performance data for a given rule in a readable format use the following query: (change the r.RuleName value – get list from Rules Table)
SELECT bme.Path, pc.ObjectName, pc.CounterName, ps.PerfmonInstanceName, pdav.SampleValue, pdav.TimeSampled
FROM PerformanceDataAllView AS pdav with (NOLOCK)
INNER JOIN PerformanceSource ps on pdav.PerformanceSourceInternalId = ps.PerformanceSourceInternalId
INNER JOIN PerformanceCounter pc on ps.PerformanceCounterId = pc.PerformanceCounterId
INNER JOIN Rules r on ps.RuleId = r.RuleId
INNER JOIN BaseManagedEntity bme on ps.BaseManagedEntityID = bme.BaseManagedEntityID
WHERE r.RuleName = 'Microsoft.Windows.Server.6.2.LogicalDisk.FreeSpace.Collection'
GROUP BY PerfmonInstanceName, ObjectName, CounterName, SampleValue, TimeSampled, bme.path
ORDER BY bme.path, PerfmonInstanceName, TimeSampled

--To determine what discoveries are still associated with a computer – helpful in finding old stale computer objects in the console that are no longer agent managed, or desired.
select BME.FullName, DS.DiscoveryRuleID, D.DiscoveryName from typedmanagedentity TME
Join BaseManagedEntity BME ON TME.BaseManagedEntityId = BME.BaseManagedEntityId
JOIN DiscoverySourceToTypedManagedEntity DSTME ON TME.TypedManagedEntityID = DSTME.TypedManagedEntityID
JOIN DiscoverySource DS ON DS.DiscoverySourceID = DSTME.DiscoverySourceID
JOIN Discovery D ON DS.DiscoveryRuleID=D.DiscoveryID
Where BME.Fullname like '%SQL2A%'

--To dump out all the rules and monitors that have overrides, and display the context and instance of the override:
select rv.DisplayName as WorkFlowName, OverrideName, mo.Value as OverrideValue,
mt.TypeName as OverrideScope, bme.DisplayName as InstanceName, bme.Path as InstancePath,
mpv.DisplayName as ORMPName, mo.LastModified as LastModified
from ModuleOverride mo
inner join managementpackview mpv on mpv.Id = mo.ManagementPackId
inner join ruleview rv on rv.Id = mo.ParentId
inner join ManagedType mt on mt.managedtypeid = mo.TypeContext
left join BaseManagedEntity bme on bme.BaseManagedEntityId = mo.InstanceContext
Where mpv.Sealed = 0
UNION ALL
select mv.DisplayName as WorkFlowName, OverrideName, mto.Value as OverrideValue,
mt.TypeName as OverrideScope, bme.DisplayName as InstanceName, bme.Path as InstancePath,
mpv.DisplayName as ORMPName, mto.LastModified as LastModified
from MonitorOverride mto
inner join managementpackview mpv on mpv.Id = mto.ManagementPackId
inner join monitorview mv on mv.Id = mto.MonitorId
inner join ManagedType mt on mt.managedtypeid = mto.TypeContext
left join BaseManagedEntity bme on bme.BaseManagedEntityId = mto.InstanceContext
Where mpv.Sealed = 0
Order By mpv.DisplayName

Agent Info:

--To find all managed computers that are currently down and not pingable:
SELECT bme.DisplayName,
  s.LastModified as LastModifiedUTC,
  dateadd(hh,-5,s.LastModified) as 'LastModifiedCST (GMT-5)'
FROM state AS s, BaseManagedEntity AS bme
WHERE s.basemanagedentityid = bme.basemanagedentityid
AND s.monitorid
 IN (SELECT MonitorId FROM Monitor WHERE MonitorName = 'Microsoft.SystemCenter.HealthService.ComputerDown')
 AND s.Healthstate = '3' AND bme.IsDeleted = '0'
ORDER BY s.Lastmodified DESC

--To find a computer name from a HealthServiceID (guid from the Agent proxy alerts)
select DisplayName, Path, basemanagedentityid from basemanagedentity where basemanagedentityid = '<guid>'

--To view the agent patch list (all hotfixes applied to all agents)
select bme.path AS 'Agent Name',
 hs.patchlist AS 'Patch List'
from MT_HealthService hs
inner join BaseManagedEntity bme on hs.BaseManagedEntityId = bme.BaseManagedEntityId
order by path

--Here is a query to see all Agents which are manually installed:
select bme.DisplayName from MT_HealthService mths
INNER JOIN BaseManagedEntity bme on bme.BaseManagedEntityId = mths.BaseManagedEntityId
where IsManuallyInstalled = 1

--Here is a query that will set all agents back to Remotely Manageable:
UPDATE MT_HealthService
SET IsManuallyInstalled=0
WHERE IsManuallyInstalled=1

--Now – the above query will set ALL agents back to “Remotely Manageable = Yes” in the console.  If you want to control it agent by agent – you need to specify it by name here:
UPDATE MT_HealthService
SET IsManuallyInstalled=0
WHERE IsManuallyInstalled=1
AND BaseManagedEntityId IN
(select BaseManagedEntityID from BaseManagedEntity
where BaseManagedTypeId = 'AB4C891F-3359-3FB6-0704-075FBFE36710'
AND DisplayName = 'servername.domain.com')

--Get the discovered instance count of the top 50 agents 
DECLARE @RelationshipTypeId_Manages UNIQUEIDENTIFIER
SELECT @RelationshipTypeId_Manages = dbo.fn_RelationshipTypeId_Manages()
SELECT TOP 50 bme.DisplayName, SUM(1) AS HostedInstances
FROM BaseManagedEntity bme
RIGHT JOIN (
SELECT
      HBME.BaseManagedEntityId AS HS_BMEID,
      TBME.FullName AS TopLevelEntityName,
      BME.FullName AS BaseEntityName,
      TYPE.TypeName AS TypedEntityName
FROM BaseManagedEntity BME WITH(NOLOCK)
      INNER JOIN TypedManagedEntity TME WITH(NOLOCK) ON BME.BaseManagedEntityId = TME.BaseManagedEntityId AND BME.IsDeleted = 0 AND TME.IsDeleted = 0
      INNER JOIN BaseManagedEntity TBME WITH(NOLOCK) ON BME.TopLevelHostEntityId = TBME.BaseManagedEntityId AND TBME.IsDeleted = 0
      INNER JOIN ManagedType TYPE WITH(NOLOCK) ON TME.ManagedTypeID = TYPE.ManagedTypeID
      LEFT JOIN Relationship R WITH(NOLOCK) ON R.TargetEntityId = TBME.BaseManagedEntityId AND R.RelationshipTypeId = @RelationshipTypeId_Manages AND R.IsDeleted = 0
      LEFT JOIN BaseManagedEntity HBME WITH(NOLOCK) ON R.SourceEntityId = HBME.BaseManagedEntityId
) AS dt ON dt.HS_BMEID = bme.BaseManagedEntityId
GROUP by BME.displayname
order by HostedInstances DESC

Misc OpsDB:

--To view grooming info:
SELECT * FROM PartitionAndGroomingSettings WITH (NOLOCK)

--GroomHistory
select * from InternalJobHistory
order by InternalJobHistoryId DESC

--Information on existing User Roles:
SELECT UserRoleName, IsSystem from userrole

--Operational DB version:
select DBVersion from __MOMManagementGroupInfo__

--To view all Run-As Profiles, their associated Run-As account, and associated agent name:
select srv.displayname as 'RunAs Profile Name',
srv.description as 'RunAs Profile Description',
cmss.name as 'RunAs Account Name',
cmss.description as 'RunAs Account Description',
cmss.username as 'RunAs Account Username',
cmss.domain as 'RunAs Account Domain',
mp.FriendlyName as 'RunAs Profile MP',
bme.displayname as 'HealthService'
from dbo.SecureStorageSecureReference sssr
inner join SecureReferenceView srv on srv.id = sssr.securereferenceID
inner join CredentialManagerSecureStorage cmss on cmss.securestorageelementID = sssr.securestorageelementID
inner join managementpackview mp on srv.ManagementPackId = mp.Id
inner join BaseManagedEntity bme on bme.basemanagedentityID = sssr.healthserviceid
order by srv.displayname

--Config Service logs
SELECT * FROM cs.workitem
ORDER BY WorkItemRowId DESC

--Config Service Snapshot history
SELECT * FROM cs.workitem
WHERE WorkItemName like '%snap%'
ORDER BY WorkItemRowId DESC

Data Warehouse Database Queries:

Alerts Section (Warehouse):

--To get all raw alert data from the data warehouse to build reports from:
select * from Alert.vAlertResolutionState ars
inner join Alert.vAlertDetail adt on ars.alertguid = adt.alertguid
inner join Alert.vAlert alt on ars.alertguid = alt.alertguid

--To view data on all alerts modified by a specific user:
select ars.alertguid, alertname, alertdescription, statesetbyuserid, resolutionstate, statesetdatetime, severity, priority, managedentityrowID, repeatcount
from Alert.vAlertResolutionState ars
inner join Alert.vAlert alt on ars.alertguid = alt.alertguid
where statesetbyuserid like '%username%'
order by statesetdatetime

--To view a count of all alerts closed by all users:
select statesetbyuserid, count(*) as 'Number of Alerts'
from Alert.vAlertResolutionState ars
where resolutionstate = '255'
group by statesetbyuserid
order by 'Number of Alerts' DESC

Events Section (Warehouse):

--To inspect total events in DW, and then break it down per day:  (this helps us know what we will be grooming out, and spot event storms on particular days)
SELECT CASE WHEN(GROUPING(CONVERT(VARCHAR(20), DateTime, 101)) = 1)
THEN 'All Days'
ELSE CONVERT(VARCHAR(20), DateTime, 101) END AS DayAdded,
COUNT(*) AS NumEventsPerDay
FROM Event.vEvent
GROUP BY CONVERT(VARCHAR(20), DateTime, 101) WITH ROLLUP
ORDER BY DayAdded DESC
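
One thing to watch in the query above: CONVERT style 101 turns the date into mm/dd/yyyy text, so ORDER BY DayAdded DESC sorts strings rather than dates, and days from different months or years will not come back in calendar order. A minimal sketch of a variant that groups on a real date value instead, assuming SQL Server 2008 or later for the date type; the rollup total shows up as a NULL day rather than 'All Days':

```sql
-- Variant sketch: group on a date value so the sort is chronological
SELECT CAST(DateTime AS date) AS DayAdded,   -- NULL on the ROLLUP (all days) row
       COUNT(*) AS NumEventsPerDay
FROM Event.vEvent
GROUP BY CAST(DateTime AS date) WITH ROLLUP
ORDER BY DayAdded DESC
```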

--Most common events by event number:  (This helps us know which event IDs are the most common in the database)
SELECT top 50 EventDisplayNumber, COUNT(*) AS 'TotalEvents'
FROM Event.vEvent
GROUP BY EventDisplayNumber
ORDER BY TotalEvents DESC

--Most common events by event number and raw event description (this will take a very long time to run, but it shows us not only the event ID but also a description of the event, to help understand which MP is generating the noise)
SELECT top 50 EventDisplayNumber, Rawdescription, COUNT(*) AS TotalEvents
FROM Event.vEvent evt
inner join Event.vEventDetail evtd on evt.eventoriginid = evtd.eventoriginid
GROUP BY EventDisplayNumber, Rawdescription
ORDER BY TotalEvents DESC

--To view all event data in the DW for a given Event ID:
select * from Event.vEvent ev
inner join Event.vEventDetail evd on ev.eventoriginid = evd.eventoriginid
inner join Event.vEventParameter evp on ev.eventoriginid = evp.eventoriginid
where eventdisplaynumber = '6022'

Performance Section (Warehouse):

--Raw data – core query:
select top 10 *
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

--Raw data – More selective of “interesting” output data:
select top 10 Path, FullName, ObjectName, CounterName, InstanceName, SampleValue, DateTime
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

--Raw data – Scoped to a ComputerName (FQDN)
select top 10 Path, FullName, ObjectName, CounterName, InstanceName, SampleValue, DateTime
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId
WHERE Path = 'sql2a.opsmgr.net'

--Raw data – Scoped to a Counter:
select top 10 Path, FullName, ObjectName, CounterName, InstanceName, SampleValue, DateTime
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId
WHERE CounterName = 'Private Bytes'

--Raw data – Scoped to a Computer and Counter:
select top 10 Path, FullName, ObjectName, CounterName, InstanceName, SampleValue, DateTime
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId
WHERE CounterName = 'Private Bytes'
AND Path like '%op%'
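
If you run the scoped queries a lot, the computer and counter filters can be lifted into variables so only the top two lines change. This is a minimal sketch against the same views as above; the variable names @PathFilter and @CounterName are illustrative, and the ORDER BY is an addition so that TOP returns the newest samples instead of an arbitrary ten (on a large raw table that sort may be slow):

```sql
-- Illustrative variables (not part of the original queries); set these for your environment
DECLARE @PathFilter nvarchar(256) = N'%op%';
DECLARE @CounterName nvarchar(256) = N'Private Bytes';

select top 10 Path, FullName, ObjectName, CounterName, InstanceName, SampleValue, DateTime
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId
WHERE CounterName = @CounterName
AND Path like @PathFilter
ORDER BY DateTime DESC  -- added: newest samples first
```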

--Raw data – How to get all the possible optional data to modify these queries above, in a list:
Select Distinct Path
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

Select Distinct Fullname
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

Select Distinct ObjectName
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

Select Distinct CounterName
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

Select Distinct InstanceName
from Perf.vPerfRaw pvpr
inner join vManagedEntity vme on pvpr.ManagedEntityRowId = vme.ManagedEntityRowId
inner join vPerformanceRuleInstance vpri on pvpr.PerformanceRuleInstanceRowId = vpri.PerformanceRuleInstanceRowId
inner join vPerformanceRule vpr on vpr.RuleRowId = vpri.RuleRowId

Grooming in the DataWarehouse:

--Here is a view of the current data retention in your data warehouse:
select ds.datasetDefaultName AS 'Dataset Name',
 sda.AggregationTypeId AS 'Agg Type 0=raw, 20=Hourly, 30=Daily',
 sda.MaxDataAgeDays AS 'Retention Time in Days'
from dataset ds, StandardDatasetAggregation sda
WHERE ds.datasetid = sda.datasetid
ORDER by ds.datasetDefaultName

--To view the number of days of total data of each type in the DW:
SELECT DATEDIFF(d, MIN(DWCreatedDateTime), GETDATE()) AS [Current] FROM Alert.vAlert
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM Event.vEvent
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM Perf.vPerfRaw
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM Perf.vPerfHourly
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM Perf.vPerfDaily
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM State.vStateRaw
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM State.vStateHourly
SELECT DATEDIFF(d, MIN(DateTime), GETDATE()) AS [Current] FROM State.vStateDaily
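
The eight statements above each return an unlabeled single-value result, which makes it easy to lose track of which number belongs to which dataset. A possible sketch that returns one labeled result set using the same views and columns; the DataType labels are illustrative:

```sql
-- Same calculations as above, combined into one labeled result set
SELECT 'Alert' AS DataType, DATEDIFF(d, MIN(DWCreatedDateTime), GETDATE()) AS [Current] FROM Alert.vAlert
UNION ALL
SELECT 'Event', DATEDIFF(d, MIN(DateTime), GETDATE()) FROM Event.vEvent
UNION ALL
SELECT 'Perf Raw', DATEDIFF(d, MIN(DateTime), GETDATE()) FROM Perf.vPerfRaw
UNION ALL
SELECT 'Perf Hourly', DATEDIFF(d, MIN(DateTime), GETDATE()) FROM Perf.vPerfHourly
UNION ALL
SELECT 'Perf Daily', DATEDIFF(d, MIN(DateTime), GETDATE()) FROM Perf.vPerfDaily
UNION ALL
SELECT 'State Raw', DATEDIFF(d, MIN(DateTime), GETDATE()) FROM State.vStateRaw
UNION ALL
SELECT 'State Hourly', DATEDIFF(d, MIN(DateTime), GETDATE()) FROM State.vStateHourly
UNION ALL
SELECT 'State Daily', DATEDIFF(d, MIN(DateTime), GETDATE()) FROM State.vStateDaily
```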

--To view the oldest and newest recorded timestamps of each data type in the DW:
select min(DateTime) from Event.vEvent
select max(DateTime) from Event.vEvent
select min(DateTime) from Perf.vPerfRaw
select max(DateTime) from Perf.vPerfRaw
select min(DWCreatedDateTime) from Alert.vAlert
select max(DWCreatedDateTime) from Alert.vAlert
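
The oldest and newest timestamps can also be read in one pass per view instead of two. A small sketch using the same views; the DataType, Oldest and Newest column names are illustrative:

```sql
-- Oldest and newest timestamp per data type, one scan per view
SELECT 'Event' AS DataType, MIN(DateTime) AS Oldest, MAX(DateTime) AS Newest FROM Event.vEvent
UNION ALL
SELECT 'Perf Raw', MIN(DateTime), MAX(DateTime) FROM Perf.vPerfRaw
UNION ALL
SELECT 'Alert', MIN(DWCreatedDateTime), MAX(DWCreatedDateTime) FROM Alert.vAlert
```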

AEM Queries (Data Warehouse):

--Default query to return all RAW AEM data: 
select * from [CM].[vCMAemRaw] Rw
inner join dbo.AemComputer Computer on Computer.AemComputerRowID = Rw.AemComputerRowID
inner join dbo.AemUser Usr on Usr.AemUserRowId = Rw.AemUserRowId
inner join dbo.AemErrorGroup EGrp on Egrp.ErrorGroupRowId = Rw.ErrorGroupRowId
Inner join dbo.AemApplication App on App.ApplicationRowId = Egrp.ApplicationRowId

--Count the raw crashes per day:
SELECT CONVERT(char(10), DateTime, 101) AS "Crash Date (by Day)", COUNT(*) AS "Number of Crashes"
FROM [CM].[vCMAemRaw]
GROUP BY CONVERT(char(10), DateTime, 101)
ORDER BY "Crash Date (by Day)" DESC

--Count the total number of raw crashes in the DW database:
select count(*) from CM.vCMAemRaw

--Default grooming for the DW for the AEM dataset:  (Aggregated data kept for 400 days, RAW 30 days by default)
SELECT AggregationTypeID, BuildAggregationStoredProcedureName, GroomStoredProcedureName, MaxDataAgeDays, GroomingIntervalMinutes
FROM StandardDatasetAggregation WHERE BuildAggregationStoredProcedureName = 'AemAggregate'

Aggregations and Config churn queries for the Warehouse:

--/* Top noisy discoveries (property changes per class and discovery) in the last 24 hours */
select ManagedEntityTypeSystemName, DiscoverySystemName, count(*) As 'Changes'
from
(select distinct
MP.ManagementPackSystemName,
MET.ManagedEntityTypeSystemName,
PropertySystemName,
D.DiscoverySystemName, D.DiscoveryDefaultName,
MET1.ManagedEntityTypeSystemName As 'TargetTypeSystemName', MET1.ManagedEntityTypeDefaultName 'TargetTypeDefaultName',
ME.Path, ME.Name,
C.OldValue, C.NewValue, C.ChangeDateTime
from dbo.vManagedEntityPropertyChange C
inner join dbo.vManagedEntity ME on ME.ManagedEntityRowId=C.ManagedEntityRowId
inner join dbo.vManagedEntityTypeProperty METP on METP.PropertyGuid=C.PropertyGuid
inner join dbo.vManagedEntityType MET on MET.ManagedEntityTypeRowId=ME.ManagedEntityTypeRowId
inner join dbo.vManagementPack MP on MP.ManagementPackRowId=MET.ManagementPackRowId
inner join dbo.vManagementPackVersion MPV on MPV.ManagementPackRowId=MP.ManagementPackRowId
left join dbo.vDiscoveryManagementPackVersion DMP on DMP.ManagementPackVersionRowId=MPV.ManagementPackVersionRowId
AND CAST(DefinitionXml.query('data(/Discovery/DiscoveryTypes/DiscoveryClass/@TypeID)') AS nvarchar(max)) like '%'+MET.ManagedEntityTypeSystemName+'%'
left join dbo.vManagedEntityType MET1 on MET1.ManagedEntityTypeRowId=DMP.TargetManagedEntityTypeRowId
left join dbo.vDiscovery D on D.DiscoveryRowId=DMP.DiscoveryRowId
where ChangeDateTime > dateadd(hh,-24,getutcdate())
) As #T
group by ManagedEntityTypeSystemName, DiscoverySystemName
order by count(*) DESC

--/* Modified properties in the last 24 hours */
select distinct
MP.ManagementPackSystemName,
MET.ManagedEntityTypeSystemName,
PropertySystemName,
D.DiscoverySystemName, D.DiscoveryDefaultName,
MET1.ManagedEntityTypeSystemName As 'TargetTypeSystemName', MET1.ManagedEntityTypeDefaultName 'TargetTypeDefaultName',
ME.Path, ME.Name,
C.OldValue, C.NewValue, C.ChangeDateTime
from dbo.vManagedEntityPropertyChange C
inner join dbo.vManagedEntity ME on ME.ManagedEntityRowId=C.ManagedEntityRowId
inner join dbo.vManagedEntityTypeProperty METP on METP.PropertyGuid=C.PropertyGuid
inner join dbo.vManagedEntityType MET on MET.ManagedEntityTypeRowId=ME.ManagedEntityTypeRowId
inner join dbo.vManagementPack MP on MP.ManagementPackRowId=MET.ManagementPackRowId
inner join dbo.vManagementPackVersion MPV on MPV.ManagementPackRowId=MP.ManagementPackRowId
left join dbo.vDiscoveryManagementPackVersion DMP on DMP.ManagementPackVersionRowId=MPV.ManagementPackVersionRowId
    AND CAST(DefinitionXml.query('data(/Discovery/DiscoveryTypes/DiscoveryClass/@TypeID)') AS nvarchar(max)) like '%'+MET.ManagedEntityTypeSystemName+'%'
left join dbo.vManagedEntityType MET1 on MET1.ManagedEntityTypeRowId=DMP.TargetManagedEntityTypeRowId
left join dbo.vDiscovery D on D.DiscoveryRowId=DMP.DiscoveryRowId
where ChangeDateTime > dateadd(hh,-24,getutcdate())
ORDER BY MP.ManagementPackSystemName, MET.ManagedEntityTypeSystemName

--Aggregation history
USE OperationsManagerDW;
WITH AggregationInfo AS (
    SELECT
    AggregationType = CASE
        WHEN AggregationTypeId = 0 THEN 'Raw'
        WHEN AggregationTypeId = 20 THEN 'Hourly'
        WHEN AggregationTypeId = 30 THEN 'Daily'
        ELSE NULL
    END
    ,AggregationTypeId
    ,MIN(AggregationDateTime) as 'TimeUTC_NextToAggregate'
    ,COUNT(AggregationDateTime) as 'Count_OutstandingAggregations'
    ,DatasetId
    FROM StandardDatasetAggregationHistory
    WHERE LastAggregationDurationSeconds IS NULL
    GROUP BY DatasetId, AggregationTypeId
)
SELECT
SDS.SchemaName
,AI.AggregationType
,AI.TimeUTC_NextToAggregate
,Count_OutstandingAggregations
,SDA.MaxDataAgeDays
,SDA.LastGroomingDateTime
,SDS.DebugLevel
,AI.DataSetId
FROM StandardDataSet AS SDS WITH(NOLOCK)
JOIN AggregationInfo AS AI WITH(NOLOCK) ON SDS.DatasetId = AI.DatasetId
JOIN dbo.StandardDatasetAggregation AS SDA WITH(NOLOCK) ON SDA.DatasetId = SDS.DatasetId AND SDA.AggregationTypeID = AI.AggregationTypeID
ORDER BY SchemaName DESC

Misc Section:

--To get better performance manually:
--Update Statistics (will help speed up reports and takes less time than a full reindex):
EXEC sp_updatestats

--Show index fragmentation (to determine how badly you need a reindex – logical scan frag > 10% = bad. Scan density below 80 = bad):
DBCC SHOWCONTIG
DBCC SHOWCONTIG WITH FAST --(less data than above – in case you don’t have time)
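
DBCC SHOWCONTIG is a deprecated command; on current SQL Server versions the sys.dm_db_index_physical_stats function reports similar information, with avg_fragmentation_in_percent playing the role of logical scan fragmentation. A minimal sketch, assuming you run it in the context of the database you want to check; the 10 percent filter simply mirrors the rule of thumb above:

```sql
-- Sketch: fragmentation per index in the current database, worst first
SELECT OBJECT_SCHEMA_NAME(ips.object_id) AS SchemaName,
       OBJECT_NAME(ips.object_id) AS TableName,
       i.name AS IndexName,
       ips.index_type_desc,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
INNER JOIN sys.indexes AS i ON i.object_id = ips.object_id AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 10   -- same threshold as the rule of thumb above
ORDER BY ips.avg_fragmentation_in_percent DESC
```

The 'LIMITED' mode is the fastest scan option. Very small indexes often show high fragmentation that does not matter, so the page_count column is worth a look before acting on a row.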

--Reindex the database:
USE OperationsManager
go
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET ARITHABORT ON
SET CONCAT_NULL_YIELDS_NULL ON
SET QUOTED_IDENTIFIER ON
SET NUMERIC_ROUNDABORT OFF
EXEC SP_MSForEachTable "Print 'Reindexing '+'?' DBCC DBREINDEX ('?')"

--Table by table:
DBCC DBREINDEX ('TableName')
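
DBCC DBREINDEX is also deprecated, and ALTER INDEX ... REBUILD is its documented replacement. A sketch of the equivalent statements, where TableName is the same placeholder as above, the dbo schema is an assumption, and sp_MSforeachtable is the same undocumented helper procedure used in the reindex script:

```sql
-- Table by table (TableName is a placeholder; dbo schema assumed):
ALTER INDEX ALL ON dbo.TableName REBUILD

-- Every table in the current database, via the undocumented sp_MSforeachtable helper:
EXEC sp_MSforeachtable 'PRINT ''Reindexing ?''; ALTER INDEX ALL ON ? REBUILD'
```

As with DBCC DBREINDEX, a full rebuild is heavy on I/O and locking, so it is best run in a quiet window.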

--Query to view the index job history on domain tables in the databases:
select *
from DomainTable dt
inner join DomainTableIndexOptimizationHistory dti
on dt.domaintablerowID = dti.domaintableindexrowID
ORDER BY optimizationdurationseconds DESC

--Query to view the update statistics job history on domain tables in the databases:
select *
from DomainTable dt
inner join DomainTableStatisticsUpdateHistory dti
on dt.domaintablerowID = dti.domaintablerowID
ORDER BY UpdateDurationSeconds DESC

↧

MP University – Fall 2016 – Wednesday Nov 16th

November 15, 2016, 7:14 am

I’ll be speaking at this free seminar, covering MP authoring using VSAE, and there will be many other topics covered:

Silect proudly presents MP University Fall 2016 Edition! Please join us Wednesday November 16, 2016 from 9AM to 4PM ET (UTC -5) for the premier event on developing, deploying and managing Operations Manager Management Packs and much more! And better yet it’s free!

Join industry experts including Brian Wren and Kevin Holman from Microsoft, Paul Chehowski CTO of Silect and others for this event.

Agenda

Here are some of the topics we’ll be covering:

Management Pack Development
SCOM 2016
Management Pack Authoring for SNMP devices
Management Pack Best Practices
Microsoft Operations Management Suite (OMS) and Power BI
Visual Studio Authoring Extensions
… and more!

https://attendee.gotowebinar.com/register/3022690883856906241

↧

New Linux operating system versions supported

Issues that are fixed in the UNIX and Linux management packs

Yes, yes, I know. Late to the party. Better late than never.

Want to know more?

UR1 is available:

New feature on tuning MP’s:

New feature on recommended MP’s and MP updates:

New feature on HTML based Web Console:

Improvements in console UI speed:

New tools for improving the network monitoring MP authoring experience:

New feature for Scheduled Maintenance mode:

New Unix/Linux features and improvements:

VSAE has been updated for SCOM 2016.

Get it here:

Server Names\Roles:

High Level Deployment Process:

Prerequisites:

SCOM Step by step deployment guide:

What’s next?

Key fixes: We aren’t listing them.

Let’s get started.

1. Management Servers

2. Apply the SQL Scripts

3. Manually import the management packs

4. Update Agents

5. Update Unix/Linux MPs and Agents

6. Update the remaining deployed consoles

There is an issue where after patching your Windows Server or Workstation machine with the monthly cumulative updates – you might see your SCOM console crash with an exception.

This affects SCOM 2012 and SCOM 2016

Log Name: Application

Event ID: 1000

Description:

Faulting application name: Microsoft.EnterpriseManagement.Monitoring.Console.exe, version: 7.2.11719.0, time stamp: 0x5798acae

Faulting module name: ntdll.dll, version: 10.0.14393.206, time stamp: 0x57dac931

System Center Operations Manager Management Console crashes after you install MS16-118 and MS16-126 https://support.microsoft.com/en-us/kb/3200006

Something that a fellow PFE (Brian Barrington) called to my attention, with SCOM 2016 agents, when installed on a Domain Controller: the agent just sits there and does not communicate.

When we deploy a SCOM 2016 agent to a domain controller – you might see it go into a heartbeat failed state immediately, and on the agent – you might see the following events in the OperationsManager log:

Followed eventually by a BUNCH of this:

High Level Overview:

Import Management Packs:

Create a resource pool for monitoring Unix/Linux servers

Configure the Xplat certificates (export/import) for each management server in the pool

Create and Configure Run As accounts for Unix/Linux

Discover and deploy the agents

Monitoring Linux servers:

Troubleshooting:
