Disaster Recovery Scenarios

Updated on 10 Jan 2022

Probench is currently hosted by OVH, in data centres which have fully redundant power and network infrastructure. In addition, the service is covered by a 100% hardware and network availability agreement (excluding scheduled maintenance). Should any failures occur, OVH is to rectify them as soon as possible at no additional charge. We mitigate the risk of hardware failure by maintaining multiple configured servers in different data centres.

Backup Plan

Click here to view the complete backup plan.

Scenario 1 – Database Corruption

If the database becomes corrupted or unreadable, the steps to recover are as follows:

• Take the site offline and put a holding page up
• Restore the latest Full backup copy with No Recovery
• Restore all log file backups up to the point of corruption
• Ensure the Database has been successfully restored before bringing the site back online

Potential data loss: any data collected after the logs were corrupted

Estimated downtime: 1-4 hours (depending on difficulty in restoring data)
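
The restore order in the steps above (full backup with No Recovery, then each log backup in sequence, then recovery) can be sketched as a small Python helper that only builds the T-SQL statements. This is a minimal illustration, not the team's actual script: the database name `Probench`, the backup paths, and the function names are assumptions made for the example. `RESTORE ... WITH NORECOVERY`, `STOPAT`, and `WITH RECOVERY` are standard SQL Server options.

```python
from datetime import datetime


def quote(value):
    """Escape single quotes for use inside a T-SQL string literal."""
    return value.replace("'", "''")


def build_restore_statements(database, full_backup, log_backups, stop_at=None):
    """Return the T-SQL statements for a full + log restore, in run order.

    database, full_backup and log_backups are hypothetical inputs; stop_at is
    an optional datetime just before the corruption occurred.
    """
    # Step 1: full backup, leaving the database in the restoring state.
    statements = [
        f"RESTORE DATABASE [{database}] "
        f"FROM DISK = N'{quote(full_backup)}' WITH NORECOVERY;"
    ]
    # Step 2: log backups in order; only the last one may carry STOPAT.
    for index, log_backup in enumerate(log_backups):
        options = ["NORECOVERY"]
        is_last = index == len(log_backups) - 1
        if is_last and stop_at is not None:
            options.append(f"STOPAT = N'{stop_at:%Y-%m-%dT%H:%M:%S}'")
        statements.append(
            f"RESTORE LOG [{database}] "
            f"FROM DISK = N'{quote(log_backup)}' WITH {', '.join(options)};"
        )
    # Step 3: bring the database online once everything has been applied.
    statements.append(f"RESTORE DATABASE [{database}] WITH RECOVERY;")
    return statements


if __name__ == "__main__":
    for statement in build_restore_statements(
        "Probench",
        r"D:\Backups\Probench_full.bak",
        [r"D:\Backups\Probench_log1.trn", r"D:\Backups\Probench_log2.trn"],
        stop_at=datetime(2022, 1, 10, 9, 30, 0),
    ):
        print(statement)
```

Keeping every step at No Recovery until the final statement means an extra log backup can still be applied if one turns out to have been missed.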

Scenario 2 – Primary Server Failure / Data centre failure

We minimise our exposure to a single point of failure by having a secondary server in another data centre:

Prepare an email to be sent to clients informing them of the downtime and that we are investigating.
Raise a ticket with OVH, the server provider, reporting the inaccessible server.
(After an hour or two) Prepare an email to be sent to clients informing them of the action we have taken and that we will keep them posted.
If we get a reply from OVH or through Twitter, prepare an email to be sent to clients explaining the cause of the outage.
Point the IP address to the secondary server
Host the site on the secondary server's IIS
Put a holding page on the secondary server
Create a restore plan in CloudBerry Backup to recover the full database and file backups from Amazon S3

Sync CloudBerry to get the latest data (see the image attached to this article)
!cb_sync.png!
First, restore the latest database backup
Second, restore the files from the folder for the latest year (so that clients can start working)
Restore from the folders of older years
Restore the log folders

Restore the latest Full backup copy with No Recovery
Ensure the Database has been successfully restored before bringing the site back online

Do not place the database files on the C drive.
Enable Service Broker

Point the hosted site to the live Probench code
Add the new database connection to the Connection config on the secondary server
Change the FileStorageRoot to point to the new folder location
Ensure SSL certificates are installed on the secondary server
If clients are using custom domain names, inform their IT teams so they can point the domains to the new server.
Once recovered, the secondary server becomes the primary server
Modify the database backup maintenance plan and the CloudBerry backup plan on the secondary server to include the newly added database.
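
The file restore order in the CloudBerry steps above (latest year first so clients can start working, then older years, then the log folders) can be expressed as a simple sort. The sketch below assumes the backup folders are named by four-digit year plus a separate `Logs` folder; those names and the function name are illustrative assumptions, not taken from the actual backup layout.

```python
def restore_order(folders):
    """Order folders for restore: newest year first, other folders last.

    Assumes (hypothetically) that year folders are named like "2021" and
    that anything else, such as a "Logs" folder, can wait until the end.
    """
    year_folders = sorted(
        (name for name in folders if len(name) == 4 and name.isdigit()),
        reverse=True,  # latest year first
    )
    other_folders = sorted(name for name in folders if name not in year_folders)
    return year_folders + other_folders


if __name__ == "__main__":
    print(restore_order(["2019", "Logs", "2021", "2020"]))
```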

Potential data loss: any data collected on the day of the event
Estimated downtime: 2-6 hours (depending on difficulty in restoring data)
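
Two of the database notes in this scenario (keep the database files off the C drive, and enable Service Broker) can be checked and scripted together. The following is a hedged sketch that builds the T-SQL as strings: the database name, logical file names, and paths are assumptions for illustration, and it assumes names and paths contain no single quotes. `MOVE ... TO`, `NORECOVERY`, and `ALTER DATABASE ... SET ENABLE_BROKER` are standard SQL Server syntax.

```python
from pathlib import PureWindowsPath


def build_secondary_restore(database, backup_file, file_moves):
    """Build restore statements for the secondary server.

    file_moves maps each logical file name to its target physical path.
    Raises ValueError if any target path is on the C drive.
    """
    options = []
    for logical_name, target in file_moves.items():
        # PureWindowsPath parses Windows paths on any operating system.
        if PureWindowsPath(target).drive.upper() == "C:":
            raise ValueError(
                f"{logical_name}: database files must not go on the C drive ({target})"
            )
        options.append(f"MOVE N'{logical_name}' TO N'{target}'")
    options.append("NORECOVERY")
    return [
        # Full backup restored with No Recovery, relocating the files.
        f"RESTORE DATABASE [{database}] FROM DISK = N'{backup_file}' "
        f"WITH {', '.join(options)};",
        # Bring the database online once the restore has been verified.
        f"RESTORE DATABASE [{database}] WITH RECOVERY;",
        # Service Broker is switched on after the database is online.
        f"ALTER DATABASE [{database}] SET ENABLE_BROKER WITH ROLLBACK IMMEDIATE;",
    ]


if __name__ == "__main__":
    for statement in build_secondary_restore(
        "Probench",
        r"D:\Restore\Probench_full.bak",
        {
            "Probench": r"D:\Data\Probench.mdf",
            "Probench_log": r"E:\Logs\Probench.ldf",
        },
    ):
        print(statement)
```

Rejecting C drive targets before any statement is produced keeps the mistake from reaching the server at all.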

Scenario 3 – Primary Server completely unavailable (“irrevocably destroyed”)

We cannot use our failed server anymore, so once the site is back up, provision a new server as soon as possible:

Follow Scenario 2
Provision a new server – in a different data centre to the current primary server – to become the new secondary server

Potential data loss: as for Scenario 2

Estimated downtime: as for Scenario 2

Scenario 4 – The hosting provider disappears

Should OVH become unavailable (e.g. “bankruptcy”, “legal intervention”, “terrorism”), we will restore our data on the Amazon Web Services platform:

Create an elastic IP at Amazon
Immediately update the DNS to point to the new IP (will take time to propagate)
Create a VM with Windows and SQL Server on Amazon Web Services
Place a holding page to inform users once the DNS change has propagated
Copy the Full database backup from the encrypted S3 store
Restore the database
Copy the application data from the encrypted S3 store
Ensure the Database has been successfully restored before bringing the site back online

Potential data loss: any data collected on the day of the event
Estimated downtime: 4-8 hours (due to the time required to copy the data and provision the new servers)
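
Because the DNS update in the steps above takes time to propagate, it can help to poll until the site name resolves to the new IP before announcing recovery. This is a minimal sketch using only the Python standard library; the host name and IP addresses are placeholders, and the function and parameter names are assumptions for the example. It reports only what the local resolver sees, so clients elsewhere may still be served a cached address.

```python
import socket
import time


def wait_for_dns(hostname, expected_ip, resolve=socket.gethostbyname,
                 attempts=10, delay_seconds=60, sleep=time.sleep):
    """Poll until hostname resolves to expected_ip; return True on success.

    resolve and sleep are injectable so the check can be exercised without
    real DNS lookups or real waiting.
    """
    for attempt in range(attempts):
        try:
            if resolve(hostname) == expected_ip:
                return True
        except OSError:
            # socket.gaierror is an OSError: the name did not resolve yet.
            pass
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


if __name__ == "__main__":
    # Placeholder values from the documentation address ranges.
    if wait_for_dns("probench.example", "203.0.113.25", attempts=3, delay_seconds=5):
        print("DNS now points to the new IP")
    else:
        print("DNS has not propagated yet")
```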
