💎一站式轻松地调用各大LLM模型接口,支持GPT4、智谱、星火、月之暗面及文生图 广告
# Disaster Recovery (Geo) > 原文:[https://docs.gitlab.com/ee/administration/geo/disaster_recovery/](https://docs.gitlab.com/ee/administration/geo/disaster_recovery/) * [Promoting a **secondary** Geo node in single-secondary configurations](#promoting-a-secondary-geo-node-in-single-secondary-configurations) * [Step 1\. Allow replication to finish if possible](#step-1-allow-replication-to-finish-if-possible) * [Step 2\. Permanently disable the **primary** node](#step-2-permanently-disable-the-primary-node) * [Step 3\. Promoting a **secondary** node](#step-3-promoting-a-secondary-node) * [Promoting a **secondary** node running on a single machine](#promoting-a-secondary-node-running-on-a-single-machine) * [Promoting a **secondary** node with multiple servers](#promoting-a-secondary-node-with-multiple-servers) * [Promoting a **secondary** node with an external PostgreSQL database](#promoting-a-secondary-node-with-an-external-postgresql-database) * [Step 4\. (Optional) Updating the primary domain DNS record](#step-4-optional-updating-the-primary-domain-dns-record) * [Step 5\. (Optional) Add **secondary** Geo node to a promoted **primary** node](#step-5-optional-add-secondary-geo-node-to-a-promoted-primary-node) * [Step 6\. (Optional) Removing the secondary’s tracking database](#step-6-optional-removing-the-secondarys-tracking-database) * [Promoting secondary Geo replica in multi-secondary configurations](#promoting-secondary-geo-replica-in-multi-secondary-configurations) * [Step 1\. Prepare the new **primary** node to serve one or more **secondary** nodes](#step-1-prepare-the-new-primary-node-to-serve-one-or-more-secondary-nodes) * [Step 2\. Initiate the replication process](#step-2-initiate-the-replication-process) * [Troubleshooting](#troubleshooting) * [I followed the disaster recovery instructions and now two-factor auth is broken](#i-followed-the-disaster-recovery-instructions-and-now-two-factor-auth-is-broken) # Disaster Recovery (Geo)[](#disaster-recovery-geo-premium-only "Permalink") Geo 复制您的数据库,Git 存储库和其他少量资产. 将来,我们将支持和复制更多数据,使您能够在灾难情况下以最少的精力进行故障转移. 有关更多信息,请参见地[电流限制](../replication/index.html#current-limitations) . **警告:**多辅助配置的灾难恢复在**Alpha 中** . 有关最新更新,请查看多级[灾难恢复史诗](https://gitlab.com/groups/gitlab-org/-/epics/65) . ## Promoting a **secondary** Geo node in single-secondary configurations[](#promoting-a-secondary-geo-node-in-single-secondary-configurations "Permalink") 目前,我们不提供自动方式来升级 Geo 副本并进行故障转移,但是如果您具有对该计算机的`root`访问权,则可以手动进行. 此过程将**辅助**地理节点升级为**主要**节点. 为了尽快恢复地理冗余,应在遵循这些说明后立即添加新的**辅助**节点. ### Step 1\. Allow replication to finish if possible[](#step-1-allow-replication-to-finish-if-possible "Permalink") 如果**辅助**节点仍在从**主**节点复制数据,请尽可能严格遵循[计划的故障转移文档](planned_failover.html) ,以避免不必要的数据丢失. ### Step 2\. Permanently disable the **primary** node[](#step-2-permanently-disable-the-primary-node "Permalink") **警告:**如果**主**节点脱机,则可能是**主**节点上保存的数据尚未复制到**辅助**节点. 如果继续,此数据应视为丢失. 如果**主**节点发生故障,则应尽一切可能避免发生裂脑情况,即在两个不同的 GitLab 实例中可能发生写操作,从而使恢复工作复杂化. 因此,为故障转移做准备,我们必须禁用**主**节点. 1. SSH 进入**主**节点以停止并禁用 GitLab,如果可能的话: ``` sudo gitlab-ctl stop ``` 如果服务器意外重启,请阻止 GitLab 重新启动: ``` sudo systemctl disable gitlab-runsvdir ``` **注意:(** **仅 CentOS** )在 CentOS 6 或更旧的版本中,如果没有可用的机器重启,没有简单的方法可以阻止启动[GitLab](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/3058) (请参阅[Omnibus GitLab 问题#3058](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/3058) ). 完全卸载 GitLab 软件包可能是最安全的: ``` yum remove gitlab-ee ``` **注意:** ( **Ubuntu 14.04 LTS** )如果您使用的是较旧版本的 Ubuntu 或基于 Upstart init 系统的任何其他发行版,则可以通过以下操作来阻止 GitLab 在计算机重启时启动: ``` initctl stop gitlab-runsvvdir echo 'manual' > /etc/init/gitlab-runsvdir.override initctl reload-configuration ``` 2. 如果您没有对**主**节点的 SSH 访问权限,请使计算机脱机并通过任何方式阻止其重启. 由于您可能有很多方法可以完成此操作,因此我们将避免使用单个建议. 您可能需要: * 重新配置负载均衡器. * 更改 DNS 记录(例如,将主要 DNS 记录指向**辅助**节点,以停止使用**主要**节点). * 停止虚拟服务器. * 阻止通过防火墙的流量. * 从**主**节点撤消对象存储权限. * 物理断开机器连接. 3. 如果您打算[更新主域 DNS 记录](#step-4-optional-updating-the-primary-domain-dns-record) ,则可能希望立即降低 TTL,以加快传播速度. ### Step 3\. Promoting a **secondary** node[](#step-3-promoting-a-secondary-node "Permalink") 升级辅助服务器时,请注意以下事项: * A new **secondary** should not be added at this time. If you want to add a new **secondary**, do this after you have completed the entire process of promoting the **secondary** to the **primary**. * 如果遇到`ActiveRecord::RecordInvalid: Validation failed: Name has already been taken`在此过程中, `ActiveRecord::RecordInvalid: Validation failed: Name has already been taken`错误,请阅读[故障排除建议](../replication/troubleshooting.html#fixing-errors-during-a-failover-or-when-promoting-a-secondary-to-a-primary-node) . #### Promoting a **secondary** node running on a single machine[](#promoting-a-secondary-node-running-on-a-single-machine "Permalink") 1. SSH 登录到**辅助**节点并以 root 用户身份登录: ``` sudo -i ``` 2. 编辑`/etc/gitlab/gitlab.rb`以通过删除启用`geo_secondary_role`所有行来反映其新的**主要**状态: ``` ## In pre-11.5 documentation, the role was enabled as follows. Remove this line. geo_secondary_role['enable'] = true ## In 11.5+ documentation, the role was enabled as follows. Remove this line. roles ['geo_secondary_role'] ``` 3. 将**辅助**节点升级为**主要**节点. 在将辅助节点升级为主节点之前,应运行飞行前检查. 它们可以单独运行,也可以与升级脚本一起运行. 要将辅助节点与预检检查一起提升为主节点: ``` gitlab-ctl promote-to-primary-node ``` **警告:**跳过飞行前检查将把辅助设备升级为主要设备,而无需进一步确认! 如果您已经运行了[预检检查,](planned_failover.html#preflight-checks)或者不想运行它们,则可以使用以下方法跳过预检检查: ``` gitlab-ctl promote-to-primary-node --skip-preflight-check ``` 您还可以单独运行飞行前检查: ``` gitlab-ctl promotion-preflight-checks ``` 4. 验证您可以使用先前用于**辅助**节点的 URL 连接到新提升的**主**节点. 5. 如果成功,则**辅助**节点现在已提升为**主要**节点. #### Promoting a **secondary** node with multiple servers[](#promoting-a-secondary-node-with-multiple-servers "Permalink") `gitlab-ctl promote-to-primary-node`命令尚不能与多台服务器一起使用,因为它只能在仅一台机器的**辅助** `gitlab-ctl promote-to-primary-node`上执行更改. 相反,您必须手动执行此操作. 1. SSH 进入**辅助**数据库中的数据库节点,并触发 PostgreSQL 升级为可读写: ``` sudo gitlab-pg-ctl promote ``` 在 GitLab 12.8 及更早版本中,请参阅[消息: `sudo: gitlab-pg-ctl: command not found`](../replication/troubleshooting.html#message-sudo-gitlab-pg-ctl-command-not-found) . 2. 在**辅助**计算机上的每台计算机上编辑`/etc/gitlab/gitlab.rb` ,以通过删除启用`geo_secondary_role`所有行来将其新状态反映为**主要** `geo_secondary_role` : ``` ## In pre-11.5 documentation, the role was enabled as follows. Remove this line. geo_secondary_role['enable'] = true ## In 11.5+ documentation, the role was enabled as follows. Remove this line. roles ['geo_secondary_role'] ``` 进行这些更改后,请在每台机器上[重新配置 GitLab,](../../restart_gitlab.html#omnibus-gitlab-reconfigure)以使更改生效. 3. 将**中学**提升到**小学** . SSH 进入单个应用程序服务器并执行: ``` sudo gitlab-rake geo:set_secondary_as_primary ``` 4. 验证您可以使用先前用于**辅助服务器**的 URL 连接到新升级的**主**服务器. 5. 成功! **中学**已升格为**小学** . #### Promoting a **secondary** node with an external PostgreSQL database[](#promoting-a-secondary-node-with-an-external-postgresql-database "Permalink") `gitlab-ctl promote-to-primary-node`命令不能与外部 PostgreSQL 数据库一起使用,因为它只能在使用 GitLab 的**辅助**节点和数据库在同一台机器上执行更改. 结果,需要手动处理: 1. 升级与**辅助**站点关联的副本数据库. 这会将数据库设置为可读写: * Amazon RDS- [将只读副本提升为独立数据库实例](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html#USER_ReadRepl.Promote) * PostgreSQL 的 Azure 数据库- [停止复制](https://docs.microsoft.com/en-us/azure/postgresql/howto-read-replicas-portal#stop-replication) * 其他外部 PostgreSQL 数据库-将以下脚本保存在辅助节点中,例如`/tmp/geo_promote.sh` ,然后修改连接参数以匹配您的环境. 然后,执行它以提升副本: ``` #!/bin/bash PG_SUPERUSER = postgres # The path to your pg_ctl binary. You may need to adjust this path to match # your PostgreSQL installation PG_CTL_BINARY = /usr/lib/postgresql/10/bin/pg_ctl # The path to your PostgreSQL data directory. You may need to adjust this # path to match your PostgreSQL installation. You can also run # `SHOW data_directory;` from PostgreSQL to find your data directory PG_DATA_DIRECTORY = /etc/postgresql/10/main # Promote the PostgreSQL database and allow read/write operations sudo -u $PG_SUPERUSER $PG_CTL_BINARY -D $PG_DATA_DIRECTORY promote ``` 2. 在**辅助**站点中的每个节点上编辑`/etc/gitlab/gitlab.rb` ,以通过删除启用`geo_secondary_role`所有行来将其新状态反映**为主** `geo_secondary_role` : ``` ## In GitLab 11.4 and earlier, remove this line. geo_secondary_role['enable'] = true ## In GitLab 11.5 and later, remove this line. roles ['geo_secondary_role'] ``` 进行这些更改后,请在每个节点上[重新配置 GitLab](../../restart_gitlab.html#omnibus-gitlab-reconfigure) ,以使更改生效. 3. 将**中学**提升到**小学** . SSH 进入单个辅助应用程序节点并执行: ``` sudo gitlab-rake geo:set_secondary_as_primary ``` 4. 验证您可以使用先前用于**辅助**站点的 URL 连接到新升级的**主**站点. 成功! **辅助**站点现在已提升为**主要**站点. ### Step 4\. (Optional) Updating the primary domain DNS record[](#step-4-optional-updating-the-primary-domain-dns-record "Permalink") 将主域的 DNS 记录更新为指向**辅助**节点将避免需要将对主域的所有引用更新为辅助域,例如更改 Git 远程服务器和 API URL. 1. SSH 进入**辅助**节点并以 root 用户身份登录: ``` sudo -i ``` 2. 更新主域的 DNS 记录. 更新主域名的 DNS 记录指向**辅助**节点后,编辑`/etc/gitlab/gitlab.rb` **辅助**节点上,以反映新的网址: ``` # Change the existing external_url configuration external_url 'https://<new_external_url>' ``` **Note:** Changing `external_url` won’t prevent access via the old secondary URL, as long as the secondary DNS records are still intact. 3. 重新配置**辅助**节点以使更改生效: ``` gitlab-ctl reconfigure ``` 4. 执行以下命令以更新新提升的**主**节点 URL: ``` gitlab-rake geo:update_primary_node_url ``` 此命令将使用`/etc/gitlab/gitlab.rb`定义的更改的`external_url`配置. 5. 仅对于 GitLab 11.11 到 12.7,您可能需要更新数据库中的**主**节点名称. 此错误已在 GitLab 12.8 中修复. 要确定是否需要执行此操作,请在`/etc/gitlab/gitlab.rb`文件中搜索`gitlab_rails["geo_node_name"]`设置. 如果用`#`注释掉或根本找不到它,则您将需要更新数据库中**主**节点的名称. 您可以像这样搜索它: ``` grep "geo_node_name" /etc/gitlab/gitlab.rb ``` 要更新数据库中**主**节点的名称: ``` gitlab-rails runner 'Gitlab::Geo.primary_node.update!(name: GeoNode.current_node_name)' ``` 6. 验证您可以使用其 URL 连接到新升级的**主数据库** . 如果您更新了主域的 DNS 记录,则这些更改可能尚未传播,具体取决于以前的 DNS 记录 TTL. ### Step 5\. (Optional) Add **secondary** Geo node to a promoted **primary** node[](#step-5-optional-add-secondary-geo-node-to-a-promoted-primary-node "Permalink") 使用上述过程将**辅助**节点提升为**主要**节点不会在新的**主要**节点上启用 Geo. 要使新的**辅助**节点在线,请按照[Geo 设置说明进行操作](../replication/index.html#setup-instructions) . ### Step 6\. (Optional) Removing the secondary’s tracking database[](#step-6-optional-removing-the-secondarys-tracking-database "Permalink") 每个**次级**有一个用于保存从**初级**的所有项目的同步状态的特殊的跟踪数据库. 由于**辅助服务器**已经升级,因此不再需要跟踪数据库中的数据. 可以使用以下命令删除数据: ``` sudo rm -rf /var/opt/gitlab/geo-postgresql ``` 如果您在`gitlab.rb`文件中启用了任何`geo_secondary[]`配置选项,则可以安全地注释掉这些选项或将其删除,然后[重新配置 GitLab](../../restart_gitlab.html#omnibus-gitlab-reconfigure)以使更改生效. ## Promoting secondary Geo replica in multi-secondary configurations[](#promoting-secondary-geo-replica-in-multi-secondary-configurations "Permalink") 如果您有多个**辅助**节点,并且需要升级其中一个,建议您按照[单辅助配置中的"](#promoting-a-secondary-geo-node-in-single-secondary-configurations)升级[**辅助** Geo"节点进行操作](#promoting-a-secondary-geo-node-in-single-secondary-configurations) ,之后还需要执行两个额外步骤. ### Step 1\. Prepare the new **primary** node to serve one or more **secondary** nodes[](#step-1-prepare-the-new-primary-node-to-serve-one-or-more-secondary-nodes "Permalink") 1. SSH 进入新的**主**节点并以 root 用户身份登录: ``` sudo -i ``` 2. Edit `/etc/gitlab/gitlab.rb` ``` ## Enable a Geo Primary role (if you haven't yet) roles ['geo_primary_role'] ## # Allow PostgreSQL client authentication from the primary and secondary IPs. These IPs may be # public or VPC addresses in CIDR format, for example ['198.51.100.1/32', '198.51.100.2/32'] ## postgresql['md5_auth_cidr_addresses'] = ['<primary_node_ip>/32', '<secondary_node_ip>/32'] # Every secondary server needs to have its own slot so specify the number of secondary nodes you're going to have postgresql['max_replication_slots'] = 1 ## ## Disable automatic database migrations temporarily ## (until PostgreSQL is restarted and listening on the private address). ## gitlab_rails['auto_migrate'] = false ``` (有关这些设置的更多详细信息,您可以阅读[配置主服务器](../replication/database.html#step-1-configure-the-primary-server) ) 3. 保存文件并重新配置 GitLab,以进行数据库侦听更改和要应用的复制插槽更改. ``` gitlab-ctl reconfigure ``` 重新启动 PostgreSQL 以使其更改生效: ``` gitlab-ctl restart postgresql ``` 4. 现在,重新启动 PostgreSQL 并重新侦听私有地址,即可重新启用迁移. 编辑`/etc/gitlab/gitlab.rb`并将配置**更改**为`true` : ``` gitlab_rails['auto_migrate'] = true ``` 保存文件并重新配置 GitLab: ``` gitlab-ctl reconfigure ``` ### Step 2\. Initiate the replication process[](#step-2-initiate-the-replication-process "Permalink") 现在,我们需要使每个**辅助**节点侦听新的**主要**节点上的更改. 为此,您需要再次[启动复制过程](../replication/database.html#step-3-initiate-the-replication-process) ,但这一次是针对另一个**主**节点. 所有旧的复制设置将被覆盖. ## Troubleshooting[](#troubleshooting "Permalink") ### I followed the disaster recovery instructions and now two-factor auth is broken[](#i-followed-the-disaster-recovery-instructions-and-now-two-factor-auth-is-broken "Permalink") 10.5 之前的 Geo 的安装说明无法复制`otp_key_base`机密,该机密用于加密存储在数据库中的两因素身份验证机密. 如果**主**节点和**辅助**节点之间的设置不同,启用了双重身份验证的用户将无法在故障转移后登录. 如果您仍然可以访问旧的**主**节点,则可以按照" [升级到 GitLab 10.5"](../replication/version_specific_updates.html#updating-to-gitlab-105)部分中的说明解决错误. 否则,密码将丢失,您需要[为所有用户重置两步验证](../../../security/two_factor_authentication.html#disabling-2fa-for-everyone) .