Syncthing stops syncing the cluster_statuses folder after the DB goes down

Description

First, we have a healthy cluster running.
I stop the patroni service on 2 of the DB nodes to simulate a DB cluster failure.
Then I start the patroni service on those 2 DB nodes so that the DB cluster is healthy again.
The problem is that, from this point on, the Syncthing service no longer syncs the cluster_statuses folder.
The cluster_statuses folder contains the status report files of all the nodes and should be synced among the managers.
This causes weird behaviour of the GET cluster-status endpoint.
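
One possible way to confirm that the folder has stopped syncing is to compare a content digest of cluster_statuses on each manager. The following is a minimal sketch, not part of the product: the helper name folder_digest is hypothetical, and the folder path passed to it is whatever location cluster_statuses has on the manager being checked.

```python
import hashlib
from pathlib import Path


def folder_digest(folder):
    """Return a SHA-256 hex digest over the relative paths and contents
    of every file under `folder` (hypothetical helper for comparison)."""
    root = Path(folder)
    digest = hashlib.sha256()
    # Sort so that the digest does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Running this on each manager and comparing the results should give the same digest when the folder is in sync. Status files are rewritten periodically, so a brief mismatch is expected; a mismatch that persists suggests the sync has stopped.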

Why Propose Close?

None

Activity

Inbal Amrani
February 2, 2020, 10:42 AM

From our discussion, the problem was that the use case the test checked is not realistic.
So I’m closing this issue and working on fixing the test in:

Barak Azulay
January 29, 2020, 4:19 PM

After merging into master, please merge it into the 5.0.5-build branch as well, and only then move to FIXED.

geokala
January 28, 2020, 9:45 AM

OK, so from discussion, this is unlikely to be triggered in the wild. It is caused by:

  1. syncthing gets upset when it sees a file being changed that it is currently syncing.

  2. When patroni (but not etcd) is stopped on two nodes of the DB cluster, we will still have a leader DB up and running, because patroni can still get a leader lock on etcd.

  3. This means that each status reporter will attempt to put its data on the restservice, which will hang waiting to write the last-login-time.

  4. Then, when the DB is brought back up, the last attempted update will be sent, immediately followed by the next one.

  5. This will cause syncthing to see updates on a file it is currently syncing, breaking replication on that directory.
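
The timing in steps 3 to 5 can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not involve syncthing or the restservice, the function names are hypothetical, the timestamps are made-up example data, and sync_window is an assumed stand-in for how long syncthing takes to sync one file. It also assumes that a hung update delays the updates scheduled after it.

```python
def completion_times(attempts, db_down_from, db_up_at):
    """Return when each status update actually lands.

    Assumption: updates run one after another, and an update that starts
    while the DB is unavailable hangs until db_up_at.
    """
    done = []
    previous_done = float("-inf")
    for attempt in attempts:
        # An update cannot start before the previous one has finished.
        start = max(attempt, previous_done)
        if db_down_from <= start < db_up_at:
            finish = db_up_at  # hung until the DB is back
        else:
            finish = start
        done.append(finish)
        previous_done = finish
    return done


def back_to_back_writes(write_times, sync_window):
    """Return consecutive write pairs that are closer than sync_window."""
    return [
        (first, second)
        for first, second in zip(write_times, write_times[1:])
        if second - first < sync_window
    ]


# Hypothetical schedule: one update every 10 seconds, DB down from t=15 to t=29.
attempts = [0, 10, 20, 30, 40]
writes = completion_times(attempts, db_down_from=15, db_up_at=29)
print(writes)                                      # [0, 10, 29, 30, 40]
print(back_to_back_writes(writes, sync_window=5))  # [(29, 30)]
```

In this example the update attempted at t=20 only lands at t=29, one second before the regular update at t=30, which is the kind of back-to-back change described in step 5.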

 

Barak Azulay
January 27, 2020, 4:18 PM

Any ideas? Any thoughts on the connection between the DB cluster being down and syncthing?

Assignee

Inbal Amrani

Reporter

Inbal Amrani

Severity

Medium

Target Version

5.0.5

Premium Only

yes

Found In Version

5.0

Bug Type

new feature bug

Customer Encountered

No

Release Notes

no