πŸ›‘ Emergency Procedures

This page describes how to react in a network-wide emergency (funds-at-risk).

MAYAChain is a distributed network. When the network is under attack or a funds-at-risk bug is discovered, Node Operators should react quickly to secure and defend it.

Even during emergencies, Node Operators should refrain from doxxing themselves. Staying pseudonymous is critical to ensuring the network is impartial, neutral and resistant to capture.

Reporting a Bug

There is a formal Bounty Program in place for reporting bugs. If you have discovered a bug, you should immediately DM the team or any other admins and/or report it via the bounty program. If the bug is deemed serious/critical, you will be paid a bounty commensurate with the severity of the issue. Reports need to include:

  1. Description of the bug

  2. Steps to reproduce

  3. Whether funds are at risk

Admin Procedures

Once the bug has been verified, admins should decide on the level of response, including any mimir overrides and announcements:

Critical - Funds At Risk

Major - Funds Not At Risk, but Network At Risk (disruption)

Minor - Funds Not At Risk, Network Not At Risk

Network Upgrades

The network cannot upgrade until 100% of active nodes are on the updated version. This can happen in one of three ways:

  1. Naturally, by allowing the network to churn out old nodes

  2. Assertively, by waiting until a super-majority has upgraded (demonstrating acceptance of the upgrade), then banning old nodes

  3. Forcibly, by hard-forking out old nodes

During a natural upgrade cycle, it may take several days to churn out old nodes. If the upgrade is time-critical, the network may elect to ban old nodes. Banning a node cycles it to be churned out, kicks it from TSS and ejects it from the consensus set. That node will never be able to churn in again; the operator will need to fully leave, destroy the node, and set up a new one. Hard-forking out old nodes is also possible but comes with a significant risk of consensus failures.
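As a rough way to watch upgrade progress, the active node versions can be tallied from the node list. A minimal sketch, assuming the public MAYANode API exposes a `/mayachain/nodes` endpoint with a `version` field per node (adjust the host and paths for your own setup):

```shell
# Illustrative only: tally node versions to see when a super-majority has upgraded.
# In practice the version list would come from something like:
#   curl -s https://mayanode.mayachain.info/mayachain/nodes
versions='1.107.0
1.107.0
1.106.0'
# Count active nodes per version, most common first
echo "$versions" | sort | uniq -c | sort -rn
```

Once the top line accounts for a super-majority of active nodes, the assertive (ban) path becomes viable.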

Network Recovery

The network will not be able to recover until the upgrade is complete, any mimir overrides are removed, and TSS is re-synced. Additionally, there may be a backlog of transactions that will need to be processed. This may take some time. If external chain daemons were stopped, re-syncing times may also be a factor.

All wallets and frontends should monitor for the various halts and automatically go into maintenance mode when one is invoked.
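A minimal sketch of how a frontend might detect an active halt. The `/mayachain/mimir` endpoint and the `HALT*` key naming are assumptions here; check them against the API of the node you query:

```shell
# Illustrative halt detection for a wallet/frontend. In practice the JSON
# would come from something like:
#   curl -s https://mayanode.mayachain.info/mayachain/mimir
halted() {
  # $1: mimir JSON document; succeeds if any HALT* key has a non-zero value
  echo "$1" | tr ',{}' '\n\n\n' | grep '^"HALT' | grep -qv ':0$'
}

if halted '{"HALTTRADING":1,"MAXBONDPROVIDERS":6}'; then
  echo "maintenance mode"
fi
```

Polling this on an interval and flipping the UI into maintenance mode keeps users from submitting transactions that the network will not process.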

Node Migration

Create node backup

kubectl get secret mayanode-mnemonic -n mayanode -o yaml > ~/mnemonic_secret_new.yaml

The following files also need to be backed up:

  - /root/.mayanode/MAYAChain-ED25519
  - /root/.mayanode/keyring-file/ (directory)
  - /root/.mayanode/config/node_key.json
  - /root/.mayanode/config/priv_validator_key.json

# Replace $pod with the mayanode pod name
kubectl cp mayanode/$pod:/root/.mayanode/MAYAChain-ED25519 ~/MAYAChain-ED25519
kubectl cp mayanode/$pod:/root/.mayanode/keyring-file/ ~/keyring-file/
kubectl cp mayanode/$pod:/root/.mayanode/config/node_key.json ~/config/node_key.json
kubectl cp mayanode/$pod:/root/.mayanode/config/priv_validator_key.json ~/config/priv_validator_key.json
# Replace $pod with the old bifrost pod name
kubectl exec deploy/mayanode -c bifrost -- sh -c "cd /root/.mayanode && tar cf \"bifrost.tar\" localstate-*.json"
kubectl cp mayanode/$pod:/root/.mayanode/bifrost.tar ~/bifrost.tar
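Before proceeding, it is worth sanity-checking that the backup actually landed on the local machine. A minimal sketch (paths match the copy commands above):

```shell
# Verify each backed-up file exists and is non-empty; report anything missing
for f in ~/MAYAChain-ED25519 ~/config/node_key.json \
         ~/config/priv_validator_key.json ~/bifrost.tar; do
  [ -s "$f" ] || echo "MISSING or empty: $f"
done
# The keyring was copied as a directory
[ -d ~/keyring-file ] || echo "MISSING or empty: ~/keyring-file"
# The tar should list the bifrost localstate-*.json files
tar tf ~/bifrost.tar
```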

Restore node backup

mayanode-recovery.yaml:

kind: Pod
apiVersion: v1
metadata:
  name: mayanode-recovery
spec:
  volumes:
    - name: volume-to-debug
      persistentVolumeClaim:
        claimName: mayanode
  containers:
    - name: debugger
      image: busybox
      command: ['sleep', '3600']
      volumeMounts:
        - mountPath: "/data"
          name: volume-to-debug

bifrost-recovery.yaml:

kind: Pod
apiVersion: v1
metadata:
  name: bifrost-recovery
spec:
  volumes:
    - name: volume-to-debug
      persistentVolumeClaim:
        claimName: bifrost
  containers:
    - name: debugger
      image: busybox
      command: ['sleep', '3600']
      volumeMounts:
        - mountPath: "/data"
          name: volume-to-debug

kubectl scale deployment mayanode bifrost --replicas=0
kubectl create -f mayanode-recovery.yaml
kubectl create -f bifrost-recovery.yaml
kubectl cp node_key.json mayanode-recovery:/data/.mayanode/config/node_key.json
kubectl cp priv_validator_key.json mayanode-recovery:/data/.mayanode/config/priv_validator_key.json
kubectl cp MAYAChain-ED25519 mayanode-recovery:/data/.mayanode/MAYAChain-ED25519
kubectl cp keyring-file mayanode-recovery:/data/.mayanode/
kubectl cp MAYAChain-ED25519 bifrost-recovery:/data/mayanode/MAYAChain-ED25519
kubectl cp keyring-file/mayachain.info bifrost-recovery:/data/mayanode/keyring-file/mayachain.info
kubectl cp bifrost.tar bifrost-recovery:/data/mayanode/bifrost.tar
kubectl exec bifrost-recovery -- sh -c "cd /data/mayanode && tar xf bifrost.tar"
kubectl delete -f mayanode-recovery.yaml
kubectl delete -f bifrost-recovery.yaml

If you're performing a live migration, then after stopping the temporary pods on the NEW node, stop the mayanode and bifrost daemons on the OLD node:

kubectl scale deployment mayanode bifrost --replicas=1
kubectl delete secret mayanode-mnemonic
kubectl apply -f mnemonic_secret_new.yaml
cd ./node-launcher/ && make set-ip-address
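Once the daemons are back up, a quick health check helps confirm the node rejoined cleanly. A sketch, assuming the container is named `mayanode`, that `wget` is available inside it, and that the CometBFT RPC listens on port 27147 (substitute the values from your own deployment):

```shell
# Illustrative health check: the node should report catching_up=false once synced.
# Container name, wget availability and RPC port 27147 are assumptions.
kubectl exec deploy/mayanode -c mayanode -- \
  wget -qO- http://localhost:27147/status | grep -o '"catching_up": *[a-z]*'
```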
