Source failure recovery after losing connection
- Last Updated: March 30, 2020
- OpenEdge
- Version 12.2
When the OpenEdge Replication server loses its connection with one
or more OpenEdge Replication agents, it tries to contact those agents
and reestablish the connection for the amount of time set by the
connect-timeout value in the OpenEdge Replication server properties file.
The OpenEdge Replication server does the following:
- The OpenEdge Replication server recognizes that there has been an agent failure. The server places itself into a state that allows continuous database activity, as if OpenEdge Replication were not running.
- The OpenEdge Replication server tries to reconnect to the OpenEdge Replication
agents for the configured connect-timeout period.
Clients can continue source database activity during this time unless schema updates are being performed.
- If the OpenEdge Replication server reconnects to the OpenEdge
Replication agent, it resumes processing AI blocks from the database. When it gets
within ten AI blocks of the last AI block written, the OpenEdge Replication server
temporarily stalls normal database activity and completes the synchronization process.
Schema updates are not allowed while the OpenEdge Replication server is performing synchronization. If schema updates are being performed when failure recovery synchronization begins, source database updates will block until failure recovery completes.
- When synchronization is complete, the OpenEdge Replication server reinserts itself into the AI block write process. The database is unstalled, allowing normal database activity and replication activity to continue.
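The catch-up and synchronization steps above can be pictured as a small state decision. The following Python sketch is only an illustrative model, not OpenEdge code: the function name, the phase labels, and the choice to treat "within ten AI blocks" as an inclusive bound are all assumptions made for the example.

```python
# Illustrative model only; none of these names come from OpenEdge.
SYNC_THRESHOLD_BLOCKS = 10  # "within ten AI blocks"; inclusive bound is an assumption


def recovery_phase(last_written_block: int, last_processed_block: int) -> str:
    """Classify where the server is in failure recovery after reconnecting."""
    backlog = last_written_block - last_processed_block
    if backlog <= 0:
        # Nothing left to process: the server is back in the AI block write path.
        return "replicating"
    if backlog <= SYNC_THRESHOLD_BLOCKS:
        # Close enough to the last AI block written: database activity is
        # stalled briefly while synchronization completes.
        return "stall-and-synchronize"
    # Still far behind: keep processing AI blocks while the database stays active.
    return "catching-up"
```

For example, `recovery_phase(500, 100)` models a server that is still catching up, while `recovery_phase(500, 495)` models the point at which database activity would be stalled to finish synchronization.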
If the OpenEdge Replication server cannot reconnect to all agents, or to
the critical agent, within the configured connect-timeout period,
the OpenEdge Replication server terminates, and source database activity continues. In other
words, if no critical agent is specified, the server must reconnect to all agents or
it terminates. If one agent is specified as the critical agent, the server continues
as long as it can reconnect to that agent.
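The termination rule can be restated as a short predicate. This Python sketch is a hedged illustration of the rule as described here, not an OpenEdge API: the function name, parameters, and agent names are hypothetical.

```python
from typing import Optional, Set


def server_continues(
    all_agents: Set[str],
    reconnected: Set[str],
    critical_agent: Optional[str] = None,
) -> bool:
    """Return True if the server keeps running when connect-timeout expires.

    Hypothetical model of the rule in the text:
    - with a critical agent, only that agent must have reconnected;
    - without one, every agent must have reconnected.
    """
    if critical_agent is not None:
        return critical_agent in reconnected
    return all_agents <= reconnected  # subset test: all agents are back
```

With agents `{"agent1", "agent2"}` and only `agent1` reconnected, the predicate returns `True` when `agent1` is the critical agent and `False` when no critical agent is specified.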
If source database activity continues while the OpenEdge Replication server is not running, make sure that there is enough AI extent space to handle all database activity until the OpenEdge Replication server is restarted and replication continues.
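A rough way to reason about the AI extent space requirement is to multiply the rate at which the source database generates AI data by the expected length of the outage. The Python helper below is a back-of-the-envelope sketch, not an OpenEdge utility: the function name, its parameters, and the default safety factor are assumptions, and the AI write rate must be measured on your own system.

```python
def ai_space_needed(
    ai_bytes_per_second: float,
    outage_seconds: float,
    safety_factor: float = 1.5,  # assumed headroom; adjust for your workload
) -> float:
    """Estimate the AI extent space (in bytes) consumed while replication is down."""
    if ai_bytes_per_second < 0 or outage_seconds < 0 or safety_factor <= 0:
        raise ValueError("rates and durations must be non-negative, factor positive")
    return ai_bytes_per_second * outage_seconds * safety_factor
```

For instance, a measured rate of 1,000 bytes per second over a 60-second outage with no headroom (`safety_factor=1.0`) gives 60,000 bytes.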
While failure recovery is being performed and synchronization is taking place, the OpenEdge Replication server might not catch up to the database. During this time, the target databases are not up to date with the source.