Automating the vRealize Automation Manager Service Failover

During a couple of vRealize Automation (vRA) design engagements I had to explain that the vRealize Automation Manager Service has no automated failover: it runs active/passive and relies on manual intervention. Customers found this hard to understand and accept, because other vRA components such as the Web Service are redundant in an active/active setup.

So OK, what does the vRA Manager Service do?

The Manager Service is a Windows service that coordinates communication between IaaS DEMs, the SQL Server database, agents, and SMTP. IaaS requires that only one Windows machine actively run the Manager Service. For backup or high availability, you may deploy additional Windows machines where you manually start the Manager Service if the active service stops.

And that last part is something my customers didn’t like (at all), because it depends on a person starting the service manually. OK, then how can we solve this?

Automating the Manager Service Failover

I like to keep things simple and wanted to automate the Manager Service failover with vRealize Operations (vROps) monitoring the service and kicking off an action when the service is down. Eventually I got this to work, but it took way too much effort and I didn’t like the complex setup: vROps sends an SNMP trap to vRO, and vRO then kicks off a PowerShell script on the vRA IaaS Manager server. So back to the drawing board, and the solution turned out to be almost too simple… Run a scheduled task on the secondary vRA IaaS Manager server that checks the Manager Service on the primary and starts it locally when the primary service is down.
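The decision the scheduled task makes on each run is small enough to sketch. The script itself is PowerShell and is not reproduced here, so the Python below only illustrates the logic under stated assumptions. The three callables (primary_is_running, local_is_running, start_local) are hypothetical stand-ins for whatever service query and start commands are used on the IaaS servers.

```python
from typing import Callable


def check_and_failover(
    primary_is_running: Callable[[], bool],
    local_is_running: Callable[[], bool],
    start_local: Callable[[], None],
) -> str:
    """One scheduled run on the secondary server; returns the action taken.

    All three callables are hypothetical placeholders, not a real vRA API.
    """
    if primary_is_running():
        # The primary is healthy, so the secondary must stay passive.
        return "primary-ok"
    if local_is_running():
        # A previous run already failed over; nothing more to do.
        return "already-active"
    # Primary is down and the local service is stopped: take over.
    start_local()
    return "started-local"


if __name__ == "__main__":
    # Simulated outage: primary down, local service stopped.
    actions = []
    result = check_and_failover(
        primary_is_running=lambda: False,
        local_is_running=lambda: False,
        start_local=lambda: actions.append("start"),
    )
    print(result, actions)
```

Keeping the probes as injected callables means the decision can be exercised without touching a real Windows service. In a real task they would wrap the actual service query and start calls.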

Pre-requisites

  • PowerShell allows the execution of scripts
  • The scheduled task runs under the vRA service account
    The Script

    (more…)

    NSX for vSphere Configuration Maximums

    This post describes the NSX for vSphere Configuration Maximums for versions 6.0.x, 6.1.x and 6.2.x.

    Whenever I got into a discussion about sizing, scalability and maximums of NSX, I always turned to an excellent post written by Martijn Smit. But that post only covers versions up to 6.1.x and not the latest version, 6.2.x. Then, during one of my projects, some questions about the scalability of version 6.2.x came up and we had to do some research to find these numbers. You can find the results of that research below.
     

    (more…)
                                    NSX 6.0      NSX 6.1      NSX 6.2
    Nodes
    vCenter                         1            1            1
    Controllers                     3            3            3
    vCenter Clusters                12           12           16
    Hosts per Cluster               32           32           32
    Hosts per Transport Zone        256          256          256
    L2
    Logical Switch                  10,000       10,000       10,000
    Logical Switch Ports            50,000       50,000       50,000
    VXLAN/VLAN L2 Bridges per DLR   500          500          500
    Distributed Firewall (DFW)
    Rules per NSX Manager           100,000      100,000      100,000
    Rules per VM                    1,000        1,000        3,500
    Rules per Host                  10,000 ¹     10,000 ¹     10,000+ ¹
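    For a quick sizing sanity check, the numbers above can be put into a small lookup. This is only a sketch: the values are copied from the table, the names MAXIMUMS and over_limit are made up for illustration, and "Rules per Host" is left out because its value is footnoted rather than a plain limit.

```python
from typing import Dict, Tuple

VERSIONS = ("6.0", "6.1", "6.2")

# Maximums per NSX version, in the column order of VERSIONS.
MAXIMUMS: Dict[str, Tuple[int, int, int]] = {
    "vCenter": (1, 1, 1),
    "Controllers": (3, 3, 3),
    "vCenter Clusters": (12, 12, 16),
    "Hosts per Cluster": (32, 32, 32),
    "Hosts per Transport Zone": (256, 256, 256),
    "Logical Switch": (10_000, 10_000, 10_000),
    "Logical Switch Ports": (50_000, 50_000, 50_000),
    "VXLAN/VLAN L2 Bridges per DLR": (500, 500, 500),
    "Rules per NSX Manager": (100_000, 100_000, 100_000),
    "Rules per VM": (1_000, 1_000, 3_500),
}


def over_limit(version: str, planned: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Return {item: (planned, maximum)} for every planned value above the maximum."""
    column = VERSIONS.index(version)  # raises ValueError for an unknown version
    exceeded = {}
    for item, value in planned.items():
        maximum = MAXIMUMS[item][column]  # raises KeyError for an unknown item
        if value > maximum:
            exceeded[item] = (value, maximum)
    return exceeded


if __name__ == "__main__":
    design = {"vCenter Clusters": 14, "Rules per VM": 800}
    for version in VERSIONS:
        print(version, over_limit(version, design))
```

    With this hypothetical design of 14 clusters, the 6.0 and 6.1 columns flag "vCenter Clusters" while 6.2 does not, which is exactly the kind of difference that came up in the project.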

    Trend Micro Deep Security and NSX 6.2.3 issue

    Last week I had the pleasure of upgrading vCNS 5.5.4 to NSX 6.2.3 at a customer who was also running Trend Micro Deep Security 9.6 SP1. Before the upgrade I checked the compatibility matrices and everything appeared to check out. So I went ahead, and the upgrade went super smooth and ran without any issues. After the upgrade was completed I linked the Trend Micro Deep Security Manager to the NSX Manager, we protected the VMs, and again all looked good. But then… I ran into the most annoying error known to man (with Trend Micro Deep Security): “Anti-Malware Engine Offline” and “Web Reputation Engine Offline”.
     

    Oh Joy!
     

    Let the troubleshooting begin!

  • Filter Drivers ESXi hosts
    • Check, all ESXi hosts have the Filter Driver removed.
  • Guest Introspection Drivers VMware Tools
    • Check, all VMs have an updated version of the VMware Tools with the Guest Introspection option enabled.
  • Licensing NSX
    • Check, NSX 6.2.3 is licensed as “NSX for vSphere”.
  • Licensing Trend Micro Deep Security
    • Check, Anti-Malware and Web Reputation are licensed.
  • NSX Security Policy
    • Check, the correct NSX Security Policy is in place and applied on all VMs.
  • NSX Guest Introspection Service VMs
    • Check, the NSX Guest Introspection Service VMs are deployed and the service is up and running.
  • Trend Micro Deep Security Service VMs
    • Check, the Trend Micro Deep Security Service VMs are deployed and the service is up and running.
  • Trend Micro Deep Security Policy
    • Bingo! Disabling the Web Reputation also solved the “Anti-Malware Engine Offline” error. We have a lead!

     
    (more…)

    NSX Manager SFTP Backup

    During my last couple of NSX projects, backing up the NSX Manager proved to be something of a challenge. The NSX Manager can create backups over either FTP or SFTP, but because we wanted to adhere to the NSX hardening recommendations, SFTP is the preferred transfer protocol. No biggie, you would think, except that most of the customers did not possess proper SFTP (don’t confuse it with FTPS!!) software to support this.

    Why is it so important to create a proper backup of the NSX Manager? Well that’s because the backup contains the following components :

  • NSX configuration
  • NSX Controllers configuration
  • Logical switches configuration
  • Routing configuration
  • Security groups, policies and settings
  • All firewall rules
  • And simply everything else that you configure within the NSX Manager UI or API
    I think you now understand why you want to have these settings safely stored away.

    So what are our options? On SFTP.net the authors created a list of stand-alone SFTP servers that can be used for this task. Some customers find it difficult to procure this type of software online and would rather use “freeware”. Then the next problem arises: some companies won’t use encryption software if it’s not commercial… Yeah, I love those discussions with the security guys :) .

    OK so just for the sake of it (and I’m not bound by any security guys looking over my shoulders) I’m just going for the NSX Manager SFTP Backup based on FreeFTPd for Windows.
    (more…)

    VCDX #223

    The two-week waiting period after the defense was maddening. I had mood swings ranging from “what the F did I do” to “maybe juuuuuust maybe”.

    And then finally, after two weeks of nightmares and nail biting, “the” defense results email arrived. I think I’ll never forget the moment that I received “the” email…

    I was driving home from work and got a message from my fellow VCDX wannabe Matthew Bunce saying that someone had already received his defense results. While I was reading this message, an email notification titled “VCDX – VCDX-DCV Defense Results” appeared at the top of my phone, and my heart dropped to the bottom of my car. I managed to raise my finger over the notification, swipe it to the right and read the first line of the email.
     

    I remember thinking to myself, what do you mean “Congratulations! You passed!”? What the? I passed!?! After I almost crashed my car into the crash barrier (this proves again that you should not read emails on your phone while driving!) I immediately phoned Matthew, and he had received the same good news! The VCDX directory has two new additions to the VCDX family, VCDX #222 and #223! Happy days!

    You can read more about my VCDX journey and background on the blog of Gregg Robertson in his VCDX Spotlight section : The Saffageek VCDX Spotlight – Marco van Baggum

    It has now been more than two weeks since I got my VCDX results back and it is finally starting to sink in: I actually PASSED! I’m still over the moon and can’t wait to get started on some new projects and write some overdue blog posts.

    To be continued!

    Upgrade vRealize Automation 6.2 To 7.x

    This post describes how to upgrade vRealize Automation 6.2 to 7.x. Before performing this upgrade, please read my previous post “vRealize Automation 7 Upgrade Considerations”. It describes multiple pitfalls and could help you prevent potential issues.

    Done reading? OK then let’s start!
     

    Step 1 : Back up the current installation

    Before you do anything, back up your current installation! Believe me when I say this is a critical step: if something goes wrong you don’t want to rely only on a VM snapshot…
     

    Step 2 : Shut down the vRealize Automation services on your IaaS servers

    Shut down services in the following order on the IaaS servers. But be absolutely sure not to shut down the actual machine, otherwise the appliance upgrade will fail.
    Each virtual machine has a Management agent, which should be stopped with each set of services.

  • All VMware vCloud Automation Center agents
  • All VMware DEM workers
  • VMware DEM orchestrator
  • VMware vCloud Automation Center Service
    (more…)

    vRealize Automation 7 Upgrade Considerations

    For an engagement last week, I had to find out whether there are any considerations for performing an in-place upgrade to vRealize Automation 7. And funnily enough I found a few…
     

    vRealize Automation 7 Upgrade Considerations

  • Minimum upgrade version to vRA 7.0 is vRA 6.2.x
    • Note : vRA 6.2.4 will not be supported for upgrade to 7.0 until 7.x
  • vRA 7.0 will only work with vRO 7.0
  • Customers with vRA 6.0 / 6.1 need to upgrade to 6.2.x first
  • The upgrade process to vRA 7.0 will stop if :
    • Physical Endpoints are detected
    • vCloud Director Endpoints are detected
  • Application Services Blueprints will not be migrated
  • Add component for Multi Machine Blueprints will not be available in 7.0
  • vRA 7.0 vRO Plug-in is not backward compatible
  • Customizations that leverage Custom Components Catalog (CCC) and vCloud Automation Center Designer (CDK) will not be supported in 7.0

    Background Information :

     

    Physical Endpoints

    All previously supported physical endpoints, like HP iLO, Cisco UCS, Dell iDRAC, etc., are no longer supported. I could not find any specific reason for it, only that they did not make the vRA 7.0 release.
    (more…)

    vExpert 2016 and VCDX Phase 1

    vExpert 2016

    I was super excited last week that I have been awarded vExpert 2016 for the second(!!) time.
     

    So what’s a “vExpert”?
    As VMware states : “A vExpert, in the simplest of terms, is an active member of the VMware community who imparts his/her advanced knowledge on others. The vExpert program is a way of recognizing people who participate in the community and increase awareness of VMware products and uses.”
     
    The full list of awardees is in the vExpert 2016 announcement.
     
    A great thanks to Corey Romero & the vExpert Team for all their great work.
    And of course congratulations to all other vExperts of 2016!
     

    VCDX Phase 1

     
    It has been a crazy month! Not much time for tweeting, blogging or anything else for that matter! Why, you ask? Well, because I put my full focus on submitting a design before the VCDX submission deadline on the 14th of February, and we actually made it!
     
    Many thanks go out to:
     
    Matthew Bunce
    Niels Hagoort
    Paul Geerlings
    Dennis Hoegen Dijkhof
    Gregg Robertson
     
    And of course my wife Rosa and son Panos for putting up with me over the last month(s) :) .
     
    Now the waiting begins for the review and, if everything goes well, hopefully the invite for the defense! To be continued (I hope)…

    Host Profiles : Number of network stack instances don’t match

    Today was a nice and peaceful day onsite, until I had the “pleasure” of configuring vSphere Host Profiles and getting all the hosts compliant. After battling with some PSP path selection “Compliance Failures”, an annoying “Number of network stack instances don’t match” failure appeared.

     
    This is not the first time I got this failure and I knew how to solve it, but there is not much information online about how to solve it, so I thought: let’s dedicate a small post to it.

    The Host Profiles fix

    First open an SSH connection to the reference host and run the following command :

    Then open an SSH connection to the host that refuses to become compliant with the Host Profile and run the same command again. Compare the two results: if all is as expected, the non-compliant host shows an additional netstack. Write down the netstack name and run the following command :

    After this go back to the vSphere Host Profiles and click on “Check Profile Compliance”, the host should be “Compliant” when the check is completed!
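    The comparison step is easy to get wrong by eye when a host has several netstacks. As a rough sketch, assuming you have copied the netstack names from both hosts into two lists, the extra one can be found like this. The function name extra_netstacks and the example names are purely illustrative.

```python
from typing import Iterable, List


def extra_netstacks(reference: Iterable[str], host: Iterable[str]) -> List[str]:
    """Return netstack names present on the host but not on the reference host."""
    known = set(reference)
    # Keep the host's own ordering so the result matches what was listed.
    return [name for name in host if name not in known]


if __name__ == "__main__":
    # Illustrative values only; use the names reported by your own hosts.
    reference_host = ["defaultTcpipStack"]
    failing_host = ["defaultTcpipStack", "vxlan"]
    print(extra_netstacks(reference_host, failing_host))
```

    An empty result would mean the netstack lists match and the compliance failure has another cause.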

     
    Enjoy! :)

    How to change your forgotten vRealize Operations 6.x root password

    Today I wanted to access my vRealize Operations (vROps) appliance through the console, but… oh cr*p what did I use as the root password of the appliance again…

    After some research it appears to be quite easy to change your forgotten root password.
     

    Reset vRealize Operations root password

    Open the console of the vRealize Operations appliance through the vSphere Web Client or the old trusted vSphere Client. Then reboot the vRealize Operations appliance and when the bootloader appears just append init=/bin/bash to the boot options.

    Proceed with booting the appliance and when the appliance is done booting type passwd (more…)
