Existence of /var/pgdata.tmp directory prevents in place upgrade

Description

If /var/pgdata.tmp already exists, probably because it was not cleaned up after a previous version, the Manager install fails with:

/var/log/cloudify/cloudify-cluster.log:
DEBUG:root:running standby_clone with args {'owner': 'postgres', 'data_dir': '/var/pgdata', 'connstring': 'host=10.126.186.156 user=cloudify_replicator sslcert=/etc/cloudify/cluster-ssl/postgresql_client.crt sslmode=verify-ca sslkey=/etc/cloudify/cluster-ssl/postgresql_client_pg.key password=QoUgr9xmBjAOEUZqjkkMdw== port=15432 connect_timeout=3 sslrootcert=/etc/cloudify/ssl/cloudify_internal_ca_cert.pem'}
ERROR:root:error in standby_clone
Traceback (most recent call last):
  File "/opt/cloudify/sudo_trampoline.py", line 249, in <module>
    handler(config=config, **args)
  File "/opt/cloudify/sudo_trampoline.py", line 126, in standby_clone
    os.mkdir(tmpdir)
OSError: [Errno 17] File exists: '/var/pgdata.tmp'
handler_runner 2018-04-20 11:08:50,344:cloudify_premium.ha.database:ERROR: error cloning (retry 1/20)
Traceback (most recent call last):
  File "/opt/manager/env/lib/python2.7/site-packages/cloudify_premium/ha/database.py", line 355, in standby_clone
    '--owner', self._db._owner
  File "/opt/manager/env/lib/python2.7/site-packages/cloudify_premium/ha/sudo.py", line 15, in run
    subprocess.check_output(['sudo', SUDO_SCRIPT_PATH] + command)
  File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['sudo', '/opt/cloudify/sudo_trampoline.py', 'standby_clone', '--connstring', 'host=10.126.16.16 user=cloudify_replicator sslcert=/etc/cloudify/cluster-ssl/postgresql_client.crt sslmode=verify-ca sslkey=/etc/cloudify/cluster-ssl/postgresql_client_pg.key password=qjkkMdw== port=15432 connect_timeout=3 sslrootcert=/etc/cloudify/ssl/cloudify_internal_ca_cert.pem', '--data-dir', '/var/pgdata', '--owner', 'postgres']' returned non-zero exit status 1
check_runner 2018-04-20 11:08:50,764:cloudify_premium.ha.checks:ERROR: Error running check check_local_db: could not connect to server: Connection refused
    Is the server running on host "127.0.0.1" and accepting TCP/IP connections on port 15432?

The install should either delete the existing /var/pgdata.tmp or reuse it, rather than fail on it.
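One possible way to get the "delete it" behavior is sketched below. This is not the actual fix and not the code in sudo_trampoline.py; the helper name prepare_tmpdir is hypothetical, and the sketch assumes that anything already at the temporary path is stale and safe to discard. It replaces a bare os.mkdir(tmpdir) with a step that clears a leftover directory or file first:

```python
import os
import shutil


def prepare_tmpdir(tmpdir):
    """Return an empty directory at tmpdir, discarding any stale leftover.

    Hypothetical helper for illustration only. Assumes whatever is
    found at tmpdir is left over from an earlier run and can be deleted.
    """
    if os.path.isdir(tmpdir) and not os.path.islink(tmpdir):
        # Leftover directory from an earlier, interrupted clone.
        shutil.rmtree(tmpdir)
    elif os.path.lexists(tmpdir):
        # Leftover regular file or symlink (e.g. created with `touch`).
        os.remove(tmpdir)
    # The path is now free, so mkdir no longer raises Errno 17.
    os.mkdir(tmpdir)
    return tmpdir
```

Deleting is the simpler of the two options in the request: reusing a leftover directory would require knowing that its contents are a complete and consistent clone, which this sketch does not try to verify.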

Steps to Reproduce

Environment:
OS (CLI), HA cluster, cloud provider
------------------------------------

Steps to reproduce:
------------------
1. touch /var/pgdata.tmp on the leader of the cluster (simulate a failed failover from an earlier installation)
2. cfy cluster set-active manager2
3. cfy cluster nodes list

Expected result:
---------------
Waiting for manager2 to become the active node...
manager2 set as the new active node

  • All nodes are either in leader or replica state

Actual result:
-------------

  • Manager with /var/pgdata.tmp is in a failed state

Why Propose Close?

None

Status

Assignee: Ohad Baruch
Reporter: Jonathan Abramsohn
Severity: Medium
Target Version: 4.5
Premium Only: no
Found In Version: 4.3
QA Owner: Uri Wygodny
Bug Type: legacy bug
Customer Encountered: Yes
Customer Name: c954
Release Notes: yes
Priority: None
Sprint: None
Priority: Unprioritized