Why perform maintenance operations
This application is built to be very resilient and does not require much maintenance. However, in some situations you might want to restart the ACI (Azure Container Instance) to update your application version and benefit from the latest patch releases.
In some cases, for instance if an error occurred during data processing, some of the files created for the operation are not deleted. These files can then accumulate and fill the allocated volume. This is easily fixed by stopping the app and starting it again.
/!\ This guide does not cover maintenance of the Azure gateway and routing configuration you might have set up in your environment, as it can vary drastically from one user to another.
When to operate
Data processes failed
The first sign that your application needs fixing is when some of the data processes launched from Salesforce end up with the status “Failed”. Before rushing to your Azure platform, check whether the error description indicates that the issue lies somewhere else. For instance, if you built your own deduplication rules, errors in those rules may occur and stop the process.
So, for instance, if you open your Salesforce Lightning application “DQE Unify” and find a failed run in the Runs tab, such as the following, you might want to restart your Azure Container Instance.
Data processes pending
Sometimes the job queue gets stuck on the same message, which blocks the data processes. In this situation, clear the file share allocated to the rabbitmq container and reload the application. This fixes the issue in almost every case.
If the file share linked to the rabbitmq container is defined as public, you can simply access it through the Azure portal and delete its contents.
If the file share is defined as private, you will need to access the container to clear the rabbitmq directory manually. With the container running, proceed as follows:
- Log into the container
- Delete the following file
/bitnami/rabbitmq/mnesia/rabbit@localhost/msg_stores/vhosts/
- Stop the app and restart it from the Overview panel
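For a private file share, the steps above can also be prepared as Azure CLI commands. The following is a minimal sketch that only builds and prints the commands and does not run them. The resource group `my-rg`, the container group `my-aci`, and the container name `rabbitmq` are placeholders assumed for illustration. `az container exec` is used only to open an interactive shell, since ACI exec is documented as launching a single process without arguments; deleting the path listed above remains a manual step inside that shell.

```python
import shlex


def rabbitmq_cleanup_commands(resource_group, group_name, container="rabbitmq"):
    """Return the Azure CLI commands (as argument lists) for the manual cleanup.

    All three parameters are placeholders; replace them with your own names.
    """
    target = ["--resource-group", resource_group, "--name", group_name]
    return [
        # 1. Open an interactive shell in the container, then delete the
        #    rabbitmq message store path by hand.
        ["az", "container", "exec", *target,
         "--container-name", container, "--exec-command", "/bin/sh"],
        # 2. Stop the whole container group.
        ["az", "container", "stop", *target],
        # 3. Start it again.
        ["az", "container", "start", *target],
    ]


if __name__ == "__main__":
    for cmd in rabbitmq_cleanup_commands("my-rg", "my-aci"):
        print(shlex.join(cmd))
```

Returning argument lists rather than one shell string keeps quoting explicit if you later pass the commands to `subprocess.run`.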
How to proceed
Get the logs
The logs are stored for 30 days (by default) in the Log Analytics workspace previously created. To retrieve them, follow these steps:
- Go to the Log Analytics workspace, then open the Logs tab
- Export to CSV the results of this query:
ContainerInstanceLog_CL | project TimeGenerated,ContainerName_s,Message | where ContainerName_s == "one-server" or ContainerName_s == "queue-worker" | sort by ContainerName_s asc, TimeGenerated asc
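Once exported, the CSV can be scanned offline before contacting support. Below is a minimal sketch, assuming the export keeps the three projected column names (`TimeGenerated`, `ContainerName_s`, `Message`) as its header row. The keyword list and the sample rows are made up for illustration and are not defined by the application.

```python
import csv
import io
from collections import defaultdict

KEYWORDS = ("error", "exception", "failed")  # assumed keywords, adjust as needed


def suspicious_lines(csv_file, keywords=KEYWORDS):
    """Group log messages containing any keyword by container name."""
    hits = defaultdict(list)
    for row in csv.DictReader(csv_file):
        message = row.get("Message") or ""
        if any(k in message.lower() for k in keywords):
            hits[row["ContainerName_s"]].append((row["TimeGenerated"], message))
    return dict(hits)


if __name__ == "__main__":
    # Tiny made-up sample standing in for the exported file.
    sample = io.StringIO(
        "TimeGenerated,ContainerName_s,Message\n"
        "2024-01-01T10:00:00Z,one-server,Server started\n"
        "2024-01-01T10:05:00Z,queue-worker,Job failed: timeout\n"
    )
    for container, lines in suspicious_lines(sample).items():
        for timestamp, message in lines:
            print(container, timestamp, message)
```

To run it on a real export, open the file with `open(path, newline="")` and pass the file object to `suspicious_lines`.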
Rebuild the app
If you need to redeploy with a newer version, or simply clear the volumes used by your application, you might need to stop the containers.
Simply go to the Overview tab and click Stop. Once the server has stopped, clicking Start will pull the latest release (labelled with the tag you provided in the configuration file) and rebuild the whole application.
As a side effect, this clears any volume previously used by the application.
Restart
- Restart the Azure Container Instance
Go to the Overview tab and click Restart.
- Monitor the container reload
All the containers should end up in the state Running.
If any of them shows the status Terminated, something went wrong and you should contact support. Otherwise, your server has restarted successfully.
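The state check can also be scripted. The sketch below assumes you have the JSON printed by `az container show` for your container group, and that each container's state sits under `instanceView.currentState.state`. Treat that layout as an assumption to verify against your own output; the sample data is hand-written, not real output.

```python
import json


def containers_not_running(group):
    """Return {container name: state} for every container that is not Running."""
    problems = {}
    for container in group.get("containers", []):
        view = container.get("instanceView") or {}
        state = (view.get("currentState") or {}).get("state", "Unknown")
        if state != "Running":
            problems[container.get("name", "?")] = state
    return problems


if __name__ == "__main__":
    # Hand-written sample in the assumed shape.
    sample = json.loads("""
    {"containers": [
      {"name": "one-server",
       "instanceView": {"currentState": {"state": "Running"}}},
      {"name": "queue-worker",
       "instanceView": {"currentState": {"state": "Terminated"}}}
    ]}
    """)
    print(containers_not_running(sample) or "All containers are Running")
```

An empty result means every container reported Running; any entry in the result is a container to mention when contacting support.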
Related to