“Exception occurred in postInstallHook” Updating vCenter With a Custom EVC Mode

In my home lab, I run a six-host ESXi cluster: three seventh-generation Intel NUCs and three tenth-generation NUCs. Over time I hope to settle on all tenth-gens, but in the meantime I have to use Enhanced vMotion Compatibility (EVC) with a custom mode to allow vMotions between the NUC7s and NUC10s. It’s less than ideal, but overall it works just fine. It does, however, introduce at least one wrinkle into the vCenter upgrade process, which can already be a little delicate. If you run into “Exception occurred in postInstallHook” while updating vCenter with a custom EVC mode, here’s one thing to be sure to check, along with some tips to make the update process clearer in general.

Custom EVC Mode

Florian Grehl details in this article precisely why a custom EVC mode is required in my scenario, and provides a customized evcModes.xml file for use. It was a huge help to me and, I’m sure, to many other vSphere home labbers out there.

Basically it boils down to the fact that, understandably, VMware built EVC to support Intel’s server-grade CPUs rather than consumer-grade CPUs like those in NUCs. Fortunately the workaround is really easy to implement, thanks to Florian’s work and detailed articles. He also frequently posts articles about vSphere home labs, and I highly recommend you check out his entire blog here: https://www.virten.net/

vCenter updates can be dicey — especially in a home lab

Much as I prefer the “cattle not pets” way of thinking about infrastructure objects, vCenter appliances are one of those things that, in my opinion, just make more sense as long-lived. And that means they require updating. Overall I’d say it’s a relatively smooth process, but there are a number of articles and Reddit posts out there detailing the challenges home labbers face in applying patches and updates to vCenter. Many of those challenges, to my eye, seem likely to be rooted in the fact that home labs usually lack the compute and storage performance of a professional environment.

A common failure and one that seems to lead to a lot of headaches is the dreaded “Exception occurred in postInstallHook” error.

Ugh

There are a whole bunch of possible root causes for that error. But I’d guess the one home lab operators run into most frequently is addressed in VMware KB 85068. Basically the issue stems from the fact that some of the vCenter’s services are (or at least were) configured with startup timeouts that are too low; when the update process runs, it expects services to start within the specified timeout period and if they don’t, the update fails.

The workaround is to change the timeout values in a few service config files — pretty trivial overall. One thing I’ve found though is that while the KB indicates the issue was fixed in vCenter 7.0 U3f, it’s still possible that services will not start in time, even with the extended timeouts specified in the KB. As I noted in a previous post, my lab has generally poor storage performance overall, and I suspect that tends to cause timeouts. So to address that, I do a couple things:

  • Follow the guidance in KB 85068, but using higher timeout values. I’ve set them to 6000 seconds rather than the 1500 specified by the KB with good results.
  • Storage vMotion the vCenter appliance to faster storage. My hosts each have a local 2.5″ SSD and an NVMe drive, and running the update while the vCenter is on those devices helps a lot.
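The timeout change in the first bullet can be scripted if you end up repeating it. Here is a minimal sketch, assuming the setting is stored as a plain "key": number pair in a text config file. The key name StartTimeout and the path in the usage comment are placeholders, not details taken from the KB, so substitute the files and keys that KB 85068 actually lists.

```shell
# Sketch: raise a numeric timeout in a JSON-style service config file.
# The key name and file path are NOT from the article; take them from the KB.
set_timeout() {
    local file="$1" key="$2" seconds="$3"

    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }

    # Only accept a whole number of seconds.
    case "$seconds" in
        ''|*[!0-9]*) echo "timeout must be a whole number" >&2; return 1 ;;
    esac

    # Refuse to touch the file if the key is not there as "key": <number>.
    grep -Eq "\"${key}\"[[:space:]]*:[[:space:]]*[0-9]+" "$file" \
        || { echo "key not found in $file: $key" >&2; return 1; }

    # Keep the original as <file>.bak, then rewrite only the number after the key.
    sed -E -i.bak "s/(\"${key}\"[[:space:]]*:[[:space:]]*)[0-9]+/\1${seconds}/" "$file"
}

# Hypothetical usage (path and key are placeholders):
#   set_timeout /path/to/service-config.json StartTimeout 6000
```

The function leaves a .bak copy next to each file it edits, so the original values are easy to put back once the update has finished.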

Another issue is that the process can be a little opaque if you use the vCenter appliance’s VAMI, and downloading the patch over the Internet adds another potential point of failure. So what I recommend is attaching the ISO to the vCenter appliance and then using the CLI to apply the patch. It’s much easier to keep an eye on the process that way and immediately see and diagnose any issues.

Custom EVC mode XML file gets overwritten during update

As Florian notes in his article about NUC custom EVC modes, the vCenter update process overwrites the customized evcModes.xml file. This too will lead to the “Exception occurred in postInstallHook” failure:

Installation failed. Retry to resume from the current state. Or please collect the VC support bundle.
        Mismatch:
                summary: Internal error occurs during execution of update process Traceback (most recent call last):
  File "/storage/updatemgr/software-updatejvqc4ez8/stage/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 203, in patch
    _patchComponents(ctx, userData, statusAggregator.reportingQueue)
  File "/storage/updatemgr/software-updatejvqc4ez8/stage/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 84, in _patchComponents
    _startDependentServices(c)
  File "/storage/updatemgr/software-updatejvqc4ez8/stage/scripts/patches/py/vmware_b2b/patching/phases/patcher.py", line 53, in _startDependentServices
    serviceManager.start(depService)
  File "/storage/updatemgr/software-updatejvqc4ez8/stage/scripts/patches/libs/sdk/service_manager.py", line 901, in wrapper
    return getattr(controller, attr)(*args, **kwargs)
  File "/storage/updatemgr/software-updatejvqc4ez8/stage/scripts/patches/libs/sdk/service_manager.py", line 794, in start
    super(VMwareServiceController, self).start(serviceName)
  File "/storage/updatemgr/software-updatejvqc4ez8/stage/scripts/patches/libs/sdk/service_manager.py", line 665, in start
    raise IllegalServiceOperation(errorText)
service_manager.IllegalServiceOperation: Service cannot be started. Error: Error executing start on service vpxd. Details {
    "detail": [
        {
            "id": "install.ciscommon.service.failstart",
            "translatable": "An error occurred while starting service '%(0)s'",
            "args": [
                "vpxd"
            ],
            "localized": "An error occurred while starting service 'vpxd'"
        }
    ],
    "componentKey": null,
    "problemId": null,
    "resolution": null
}
Service-control failed. Error: {
    "detail": [
        {
            "id": "install.ciscommon.service.failstart",
            "translatable": "An error occurred while starting service '%(0)s'",
            "args": [
                "vpxd"
            ],
            "localized": "An error occurred while starting service 'vpxd'"
        }
    ],
    "componentKey": null,
    "problemId": null,
    "resolution": null
}


.
                resolution: Send upgrade log files to VMware technical support team for further assistance.

Especially if you’re using the VAMI to do the update rather than the CLI, you might immediately conclude you’re running into one of the other problems that can cause the postInstallHook exception and go chasing after the wrong root cause. But as you can see from the error message, the VPXD service cannot start, which causes the update process to fail. VPXD is most likely failing because the cluster is running in an EVC mode that is not present in the new version of the evcModes.xml file written by the update.

Fortunately the solution is quick and easy: you just overwrite the evcModes.xml file in /usr/lib/vmware-vpx again with the customized file, and restart the update process. It should succeed from there.
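As a rough sketch of that fix, run from a bash prompt on the appliance rather than the Appliance Shell: the function below checks whether evcModes.xml still matches your customized copy and puts the copy back if it doesn’t. The target directory is the one named above. The location of the saved custom file (/root/evcModes.custom.xml) and the .vmware backup suffix are assumptions for illustration only.

```shell
# Sketch: put the customized evcModes.xml back if the update replaced it.
# The target path comes from the article; the custom-copy path is an assumption.
restore_evc_modes() {
    local custom="${1:-/root/evcModes.custom.xml}"   # hypothetical location of your saved copy
    local target="${2:-/usr/lib/vmware-vpx/evcModes.xml}"

    [ -f "$custom" ] || { echo "custom file not found: $custom" >&2; return 1; }

    # Nothing to do if the update left the customized file in place.
    if cmp -s "$custom" "$target"; then
        echo "unchanged: $target already matches the custom file"
        return 0
    fi

    # Keep the VMware-provided version for reference before overwriting it.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.vmware"
    fi

    cp "$custom" "$target"
    echo "restored: $target"
}

# Usage with the default paths:
#   restore_evc_modes
```

Running it a second time is harmless: once the files match, it reports that nothing changed and leaves everything alone.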

Summary and step-by-step process

I wasted a little bit of time in troubleshooting the postInstallHook error because I executed the update using the VAMI and had to dig around in log files to figure out what was going wrong. Of course as soon as I found the specific error it was a bit of a forehead-slap moment in realizing I had forgotten about the evcModes.xml file. Had I run the update from the CLI, it would’ve been quickly clear where the issue was, and if you use the CLI too, it’s less likely you’ll confuse this specific cause with some of the other issues that can happen with the update process.

To hopefully help out if you’re already running a cluster in a custom EVC mode, here’s the list of steps I’ve executed for the last several vCenter updates which have all gone off without a hitch:

  • Download the update ISO and connect it to the VM.
  • SSH into your vCenter and make sure you’re using the Appliance Shell (more info in KB 2100508 here; basically if you’re in a typical bash shell, execute /bin/appliancesh and enter your password.)
  • Stage the updates from the ISO. Run software-packages stage --iso and accept the EULA.
  • Start the upgrade process by running software-packages install --staged
  • Once the process fails with the error above, overwrite the VMware-provided evcModes.xml with your custom version. You might be able to time it such that you get the new file copied before the failure occurs, but it’s probably easier to just wait.
  • Go back into the Appliance Shell and run software-packages install --staged again. The update process will pick back up at the “Converting data as part of post install” stage and should proceed successfully.

I hope that helps!
