Agent (re)install after Cloudify Manager Failover

Description

We're doing some in-house testing of healing prior to demoing to a customer and are experiencing problems if the heal is performed after a Cloudify manager failover. The URL that the healed instance uses to download the agent package still uses the IP address of the old Cloudify manager, so that fails. As far as I can see, the URL is being retrieved from the "package_url" key of the cloudify_agent runtime property and I can confirm that that is unchanged by a manager failover.

We raised a similar problem in 2018 (https://support.cloudify.co/hc/en-us/requests/9771), and I can confirm that the patch that was delivered for that request looks to have been installed on the Cloudify Managers here (certainly I've confirmed that the changed files from that patch have been updated on the managers).

Steps to Reproduce

Environment:
OS (CLI), HA cluster, cloud provider
------------------------------------

Steps to reproduce:
------------------
1.
2.
3.

Expected result:
---------------

Actual result:
-------------

Why Propose Close?

None

Activity

Show:
Łukasz Maksymczuk
February 17, 2020, 1:54 PM

It seems that this is a real issue, but I don't know why has it resurfaced...

This was originally fixed in where we made the package url be dynamically computed based on the current active manager ip. The patch was https://github.com/cloudify-cosmo/patchify/pull/26

Maybe that is set wrong (for some unknown reason), or the install script keeps using the previously-rendered package_url instead of the dynamically-computed url.
We need to repro and examine it closely, probably at least half a day to understand it. I don't know what then, either a workaround or another patch for... 4.5?

Ofer Yarom
February 17, 2020, 2:10 PM

Eve, do you have a reproduction? a cluster on which that can be tested?

Lukasz, what information do you need in order to understand this? will access to the file system suffice?

Jonathan Abramsohn
April 12, 2020, 10:11 AM

Customer reported to have the same problem on Cloudify 4.6.

Apparently the download script of the agent doesn’t switch to the new leader.

Ofer Yarom
April 13, 2020, 8:26 AM

Note that this is 4.x related only.

Michael Glokhman
April 13, 2020, 8:38 AM

of course

Assignee

Michael Glokhman

Reporter

Eve Land

Labels

Severity

High

Target Version

4.6

Premium Only

no

Found In Version

4.5

QA Owner

None

Bug Type

unknown

Customer Encountered

Yes

Customer Name

None

Release Notes

yes

Priority

None

Epic Link

Sprint

None

Priority

Unprioritized
Configure