Note: All the elements described below are recommendations from DQE, based on deployment experience with different customers.
The customer is responsible for the integration into its own architecture.
An architect familiar with the context and internal infrastructure should align DQE's recommendations with the client infrastructure.
The solution is containerized: all components run as Docker containers.
- Redis for data storage and processing
- RabbitMQ – scheduler of actions performed by the DQE engine
- The Redis database is deleted after each process
1. Deployment Architecture
The exchanges between the front end and the back end are detailed below.
To deploy an instance of the Unify Server application on your Azure organization, DQE-Software provides dedicated access to the container registry. There are different ways to deploy this container, but we recommend using an Azure Container App.
The procedure is explained in the installation section.
This involves configuring an Azure Application Gateway, or another load balancer, to expose the application with a public IP address and a DNS name. Otherwise, the Salesforce organization where you installed the user interface package cannot reach the application.
To deploy this application, you need to create a YAML configuration file that is detailed in this document.
2. Flow Matrix
Here is a diagram describing all the flows coming in and out of the Azure Application Gateway in a standard Unify data process.
All IP addresses and ports described in this diagram must be opened in your gateway or firewall for the process to complete. All incoming connections are made through HTTPS.
3. Salesforce Connections
Authentication
When you install the Unify-UI package on your Salesforce organization, you will have to go through some configuration steps. During those steps, the Salesforce org will register to the Unify Server. At this step, the server creates a unique password and key credentials that are stored in your org and used to authenticate to your Unify Server instance.
The security protocol used by Salesforce to authenticate users and allow the various operations performed by the package, such as bulk import and export, is JSON Web Token (JWT).
This JWT can only be validated by Salesforce for the users enabled in the connected app installed with the Unify-UI package.
You can find more information in the official Salesforce documentation here.
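To make the JWT-based authentication more concrete, the sketch below assembles the claim set of an OAuth 2.0 JWT bearer assertion and its unsigned signing input, using only the Python standard library. The consumer key, username, and audience are placeholders, and the RS256 signature step (performed with the private key tied to the connected app) is deliberately left out. This is an illustration under those assumptions, not the Unify Server implementation.

```python
import base64
import json
import time
from typing import Optional


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(consumer_key: str, username: str, audience: str,
                        lifetime_s: int = 180,
                        now: Optional[int] = None) -> str:
    """Return '<header>.<claims>', the part of a JWT that gets signed."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iss": consumer_key,   # connected app consumer key (placeholder)
        "sub": username,       # user authorized in the connected app
        "aud": audience,       # e.g. https://login.salesforce.com
        "exp": issued + lifetime_s,
    }

    def encode(obj: dict) -> str:
        return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))

    return f"{encode(header)}.{encode(claims)}"


signing_input = build_signing_input(
    "HYPOTHETICAL_CONSUMER_KEY", "user@example.com",
    "https://login.salesforce.com", now=1_700_000_000)
print(signing_input)
```

A complete assertion appends a third segment, the base64url-encoded signature of this string. The short lifetime used here is an arbitrary choice for the example.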
Process
Below is the description of the flows between Salesforce and the ACA Unify Server, and between the ACA Unify Server and the DQE servers.
Step 1 - Define a New Process
A Salesforce user defines a new process.
Description of the process
- Object: Person Account
- Type of processing: Email validation
- Processed field: Email
- Filters: see business rules
Step 2 - Start the Process
A Salesforce user starts the process. This user needs to be part of the authorized logins in the connected app setup.
Step 3 - Send the Processing Request
The processing request is sent to the ACA Unify Server.
Step 4 - Authenticate with Salesforce
The ACA Unify Server authenticates with Salesforce through the connected app using a JWT.
Step 5 - Export Data
Salesforce allows the server to perform a data export.
Step 6 - Save Exported Data
The ACA Unify Server exports the relevant data, such as the Id and Email fields of the Person Account object, and saves it in Azure File Storage in CSV format.
Step 7 - Process the Data
If the process is Data Quality processing: the ACA Unify Server makes anonymized unit API calls on each email in the extracted file. This applies only to email qualification.
During the whole process, this is the only step where some data may be sent by REST API through a secure connection to the DQE-Software production server.
If the process is Deduplication processing: the ACA Unify Server calculates duplicate groups and reconciles records according to the matching rules defined during the workshops.
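As an illustration of what calculating duplicate groups involves, here is a minimal sketch that groups records sharing the same normalized email. The matching rule (trimmed, case-insensitive email equality) and the record layout are assumptions made for the example. The actual rules are the ones defined during the workshops.

```python
from collections import defaultdict


def duplicate_groups(records):
    """Group record Ids whose normalized email is identical.

    Returns {group_number: [Ids]} for groups of at least two records.
    Group numbers start at 1 and follow first appearance in the input.
    """
    by_key = defaultdict(list)
    for rec in records:
        key = rec["Email"].strip().lower()  # hypothetical matching rule
        if key:
            by_key[key].append(rec["Id"])
    groups = [ids for ids in by_key.values() if len(ids) > 1]
    return {number: ids for number, ids in enumerate(groups, start=1)}


sample = [
    {"Id": "001A", "Email": "Ann@Example.com"},
    {"Id": "001B", "Email": " ann@example.com "},
    {"Id": "001C", "Email": "bob@example.com"},
]
print(duplicate_groups(sample))  # {1: ['001A', '001B']}
```

The group number returned here plays the role of the duplicate group number later written back to the records in step 11.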
Step 8 - Generate the Result File
The ACA Unify Server aggregates all processing responses into a final CSV file.
If the process is deduplication, the ACA Unify Server calculates the data merge result within the identified duplicate groups. This result is stored in the DQE_Fusion_Json_c field created for this purpose in the Lead object, while waiting to be used if the merge process is triggered automatically or manually at the end of processing.
Step 9 - Authenticate to Salesforce
The ACA Unify Server authenticates to Salesforce.
Step 10 - Allow Import or Update
Salesforce allows the server to perform a data import or update.
Step 11 - Send Processing Results
The ACA Unify Server sends the processing results to the relevant records, here Person Account, through the Bulk APIs exposed by Salesforce. Person Accounts are enriched with fields created for this purpose, such as the duplicate group number and, where applicable, the field storing the merge result for that group.
Step 12 - Delete CSV Files
The ACA Unify Server deletes the CSV file initially received during export in step 6 and the generated CSV file containing the results from step 8.
Step 13 - Send the Processing Report
The ACA Unify Server sends a statistical processing report to Salesforce, including the number of processed records and processing time. This report is visible in the Unify application in the Runs object.
3.1. Composition and Services
The stack is composed of Docker images orchestrated via Azure Container Apps. For each service, we outline the attributes and their functions below.
Web Application (one-server)
A backend container that contains a microservices application. This application exposes all the APIs called to launch processes, manage processing queues, and instantiate processing workers. It is dependent on Redis and RabbitMQ to function properly. This service is exposed on the web through the Nginx reverse proxy.
Docker Compose settings
- image: the name of the image hosted on the DQE Azure Container Registry: <Name of the registry container>.azurecr.io/<name of the image>
- depends_on: redis, rabbitmq
- environment: SFAPIVERSION, PORT, REDIS_URL, CLOUDAMQP_URL, CUSTOMUI, UNIFYSERVERURL
- command: bash ./entrypoint.sh
Worker (queue-worker)
A backend container that processes queued jobs asynchronously. It consumes tasks from the RabbitMQ queue and executes them. Uses the same Docker image as the web application.
Parameters to set in Docker Compose
- image: same image as the web application service
- depends_on: redis, rabbitmq
- environment: SFAPIVERSION, WORKDIRPATH, REDIS_URL, CLOUDAMQP_URL
- command: python -u ./unify/queue_worker.pyc
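The REDIS_URL and CLOUDAMQP_URL variables each pack host, port, credentials, and (for RabbitMQ) the virtual host into a single URL. The sketch below shows how such values decompose, using the example values from section 4.3.4. How the Unify containers actually read these variables is not documented here, so the helper is purely illustrative.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a connection URL into the parts a client library would use."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "user": parts.username,
        "password": parts.password,
        # For AMQP URLs, the path carries the virtual host.
        "path": parts.path.lstrip("/"),
    }


amqp = describe_url("amqp://guest:guest@localhost:5672/admin")
redis = describe_url("redis://localhost:6379")
print(amqp["host"], amqp["port"], amqp["path"])  # localhost 5672 admin
print(redis["host"], redis["port"])              # localhost 6379
```

The virtual host in the URL path must match RABBITMQ_DEFAULT_VHOST, and the user and password must match RABBITMQ_DEFAULT_USER and RABBITMQ_DEFAULT_PASS.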
CustomUI
This service is a web application exposing a front end used to configure deduplication rules. It must therefore also be exposed on the web. This service also consumes some APIs from the backend, for example to retrieve metadata from the CRM.
- image: <imageURL>
- command: npm start
- environment: PORT=8001, SESSION_SECRET
Nginx
A reverse proxy that forwards incoming HTTP/HTTPS requests from the Application Gateway to the web application and CustomUI. Its configuration file is stored in the nginxconf Azure File Share and mounted into the container at runtime.
- image: <imageURL>
- ports: 80
- volume: nginxconf:/etc/nginx/conf.d
Redis
This service is a key/value database used to store internal operating keys such as Salesforce org configurations, sessions, etc.
It is possible to use the official redis:alpine image, but as a security measure some cloud providers block the retrieval of public images and only allow images from private registries. To overcome this restriction, DQE also publishes a compatible Redis image on its private registry.
Parameters
- image: <imageURL>
- volumes: redisvol:/data (Azure File Share)
RabbitMQ
This service is a message queue manager that receives and schedules processing requests. A process stays in the queue until it has been assigned to a worker and completed. This also provides resilience to failures: if the server is restarted, it resumes the last processing in the queue where it left off.
As with the Redis service image, you have the option to use a public version or the one in the private DQE registry.
Parameters
- image: <imageURL>
- volumes: rabbitvol:/var/lib/rabbitmq (Azure File Share)
- environment: RABBITMQ_DEFAULT_USER, RABBITMQ_DEFAULT_PASS, RABBITMQ_DEFAULT_VHOST
4. Installation
In this section, we describe the installation protocol to create an instance of DQE Unify Server as an Azure Container App.
We also provide an example of how to create an Azure Application Gateway.
However, this part depends heavily on your own organization. The customer technical team is responsible for applying the different recommendations.
For more information about configuring your Azure Application Gateway, refer to the Microsoft documentation here.
4.1. Create an Application Gateway
4.1.1. Basic Tab
The first tab should present a form as follows. Fill in the different parts as shown below.
4.1.1.1. WAF Policy
The WAF (Web Application Firewall) is used to define the IP ranges authorized to call the Application Gateway, such as Salesforce IP ranges.
Create a new WAF from the WAF policy field in 4.1.1. Basic Tab.
4.1.1.2. VNET
The VNET (Virtual Network) is used to connect the Application Gateway and the ACA (Azure Container App). All flows pass through that private network.
Create a new VNET from the Virtual network field in 4.1.1. Basic Tab.
4.1.2. Frontends
The Frontends tab should present a form as follows. Fill in the different parts as shown below.
4.1.2.1. Public IP Address
The public IP is exposed by the Application Gateway. It is used to route traffic to the ACA.
Create a new public IP from the Public IP address field in 4.1.2. Frontends.
4.1.3. Backends
The Backends tab should present a form as follows. Fill in the different parts as shown below.
4.1.3.1. Backend Pool
The backend pool allows the Application Gateway to route flows through the VNET to the ACA. It is automatically filled in during the Network step of ACA creation.
Create a new backend pool from Add a backend pool in 4.1.3. Backends.
4.1.4. Configuration
The Configuration tab should present a form as follows. Fill in the different parts as shown below.
4.1.4.1. Routing Rules - Listener
This configuration allows calls to the Application Gateway through the HTTPS protocol.
4.1.4.2. Routing Rules - Backend Target
The backend target is used by the Application Gateway to determine where to route incoming traffic. It uses the backend pool created previously.
4.1.4.3. Routing Rules - Backend Setting
The backend setting is used by the Application Gateway to determine which protocol to use when routing traffic to the ACA. This traffic is sent through the private VNET.
Warning: The backend setting protocol must be HTTP, not HTTPS. ACA internal ingress communicates over HTTP within the private VNET; if the protocol is set to HTTPS, the Application Gateway fails to connect to the ACA backend.
Then click Review + create.
4.2. Create a Container Apps Environment (ACAE)
4.2.1. Add a VNET Subnet for ACAE
Create a new subnet delegated to Microsoft.App/environments from the VNET created in 4.1.1.2. VNET.
4.2.2. Install Azure CLI
Azure CLI is required on your computer.
4.2.2.1. UNIX
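curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
This install script targets Debian and Ubuntu. For other distributions, refer to the Microsoft documentation.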
4.2.2.2. Windows
Refer to the Microsoft documentation here.
4.2.3. Create ACAE
Note: The DQE Unify stack requires a Dedicated plan because the queue-worker needs 5 Gi of memory, which exceeds the 2 Gi per-container limit of the Consumption plan. The flag --enable-workload-profiles activates Dedicated plan support on the environment.
4.2.3.1. UNIX
az containerapp env create \
--name <container apps environments name> \
--resource-group <resource group> \
--location <location> \
--internal-only true \
--enable-workload-profiles \
--infrastructure-subnet-resource-id $(az network vnet subnet show \
--resource-group <vnet resource group> \
--vnet-name <vnet name (created on 4.1.1.2)> \
--name <subnet name (created on 4.2.1)> \
--query id -o tsv)
4.2.3.2. Windows
az containerapp env create `
--name <container apps environments name> `
--resource-group <resource group> `
--location <location> `
--internal-only true `
--enable-workload-profiles `
--infrastructure-subnet-resource-id $(az network vnet subnet show `
--resource-group <vnet resource group> `
--vnet-name <vnet name (created on 4.1.1.2)> `
--name <subnet name (created on 4.2.1)> `
--query id -o tsv)
4.2.3.3. Add a Dedicated workload profile (D8)
Add a Dedicated D8 workload profile (8 vCPU / 32 Gi) to the environment. This node provides enough capacity for the full stack (5 vCPU / 10 Gi total).
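The capacity figures above can be checked with a few lines of arithmetic. The sketch below sums the per-container requests as they appear in the YAML of section 4.3.4 and compares the totals with the nominal size of one D8 node. It ignores any platform overhead on the node, which is an assumption of this example.

```python
# Per-container requests as (vCPU, GiB), copied from the YAML in 4.3.4.
REQUESTS = {
    "redis": (0.5, 1.0),
    "rabbitmq": (1.0, 2.0),
    "nginx": (0.25, 0.5),
    "unify-ui": (0.25, 0.5),
    "one-server": (0.5, 1.0),
    "queue-worker": (2.5, 5.0),
}
D8_CAPACITY = (8.0, 32.0)  # nominal vCPU and GiB of one Dedicated D8 node


def totals(requests):
    """Return the summed (vCPU, GiB) over all containers."""
    cpu = sum(c for c, _ in requests.values())
    mem = sum(m for _, m in requests.values())
    return cpu, mem


cpu, mem = totals(REQUESTS)
print(cpu, mem)                                          # 5.0 10.0
print(cpu <= D8_CAPACITY[0] and mem <= D8_CAPACITY[1])   # True
```

Every container keeps the 1:2 vCPU-to-memory ratio noted in the YAML comments, so the totals keep it as well.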
UNIX
az containerapp env workload-profile add \
--name <container apps environments name> \
--resource-group <resource group> \
--workload-profile-name "Dedicated-D8" \
--workload-profile-type D8 \
--min-nodes 1 \
--max-nodes 1
Windows
az containerapp env workload-profile add `
--name <container apps environments name> `
--resource-group <resource group> `
--workload-profile-name "Dedicated-D8" `
--workload-profile-type D8 `
--min-nodes 1 `
--max-nodes 1
4.3. Create a Container App from YAML
4.3.1. Create a Storage Account
4.3.1.1. Basics
Complete all necessary information.
4.3.1.2. Advanced
Nothing to change.
4.3.1.3. Networking tab
4.3.1.4. Data protection
Nothing to change.
4.3.1.5. Encryption
Nothing to change.
4.3.1.6. Tags
Nothing to change.
4.3.1.7. Review + Create
Create the storage account.
4.3.2. Create File Shares
The container requires three file shares: rabbitvol, nginxconf, and redisvol.
To use HTTPS on the ACA, Nginx is provisioned. To ensure that it has all required information, add the following files to the nginxconf file share:
- The SSL certificate (.pem or .crt file)
- The SSL key (.key file)
- If the .pem file has a password, add it in a text file and add that text file to the file share
- The default.conf file defined in 4.3.6. Configure Nginx
4.3.3. Create a Storage Environment
The procedure below must be executed once for each of the previously created file shares:
- redisvol
- rabbitvol
- nginxconf
4.3.3.1. UNIX
az containerapp env storage set \
--access-mode ReadWrite \
--azure-file-account-name <account storage name> \
--azure-file-account-key <account storage key> \
--azure-file-share-name <fileshare name (redisvol, rabbitvol, nginxconf)> \
--storage-name <storage environment name (redisvol, rabbitvol, nginxconf)> \
--name <container apps environments name> \
--resource-group <resource group> \
--output table
4.3.3.2. Windows
az containerapp env storage set `
--access-mode ReadWrite `
--azure-file-account-name <account storage name> `
--azure-file-account-key <account storage key> `
--azure-file-share-name <fileshare name (redisvol, rabbitvol, nginxconf)> `
--storage-name <storage environment name (redisvol, rabbitvol, nginxconf)> `
--name <container apps environments name> `
--resource-group <resource group> `
--output table
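Because the same command runs three times with only the share name changing, it can be convenient to generate the three invocations from one template. The sketch below builds the argument lists without executing them. The function name and its parameters are hypothetical, the flags are the ones used in the commands above, and --name is given the Container Apps environment name, which is what this az subcommand expects.

```python
import shlex

SHARES = ["redisvol", "rabbitvol", "nginxconf"]


def storage_set_commands(account, key, environment, resource_group):
    """Build one 'az containerapp env storage set' argv list per file share."""
    commands = []
    for share in SHARES:
        commands.append([
            "az", "containerapp", "env", "storage", "set",
            "--access-mode", "ReadWrite",
            "--azure-file-account-name", account,
            "--azure-file-account-key", key,
            "--azure-file-share-name", share,
            "--storage-name", share,   # same name, as referenced by the YAML
            "--name", environment,     # Container Apps environment name
            "--resource-group", resource_group,
            "--output", "table",
        ])
    return commands


for argv in storage_set_commands("<account>", "<key>", "<env>", "<rg>"):
    print(shlex.join(argv))
```

Each list can be passed to subprocess.run(argv, check=True) once the placeholders are replaced with real values. Printing first makes it easy to review the commands before running them.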
4.3.4. Container App YAML Configuration File
On your local machine, create a YAML file for the Azure Container App. The ACA format is different from the ACI format: volumes use storageType: AzureFile referencing the storage environment names created in 4.3.3, and registry credentials are declared in configuration.registries. All containers share the same network namespace within a single container app, so inter-container communication uses localhost.
You need to complete the following keys:
- environmentId: the full resource ID of the ACAE created in 4.2.3. Format: /subscriptions/{subscriptionsId}/resourceGroups/{resourceGroupsName}/providers/Microsoft.App/managedEnvironments/{container apps environments name}
- location: <Location>
- configuration.registries: server, username, and passwordSecretRef referencing the registry password provided by DQE
- volumes.storageName: must match the storage environment names created in 4.3.3: redisvol, rabbitvol, and nginxconf.
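The environmentId value is easy to mistype, so here is a small sketch that assembles it from its three variable parts. The function name and the sample values are hypothetical.

```python
def environment_id(subscription_id: str, resource_group: str, env_name: str) -> str:
    """Build the full resource ID of a Container Apps environment."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.App/managedEnvironments/{env_name}"
    )


print(environment_id(
    "00000000-0000-0000-0000-000000000000",  # placeholder subscription ID
    "rg-unify",                               # placeholder resource group
    "acae-unify",                             # placeholder environment name
))
```

The same value can also be read from Azure rather than built by hand, for example with az containerapp env show and the options --query id -o tsv.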
Warning: RabbitMQ volume mount path: the YAML mounts the RabbitMQ volume at /bitnami (Bitnami image path). If you use the official rabbitmq image instead, change the mountPath to /var/lib/rabbitmq. Using the wrong path will result in RabbitMQ starting without persistence and potentially failing to write data.
location: <Location>
name: dqe-unify
type: Microsoft.App/containerApps
properties:
  # Reference the Dedicated D8 workload profile created on 4.2.3.3
  workloadProfileName: "Dedicated-D8"
  environmentId: /subscriptions/{subscriptionsId}/resourceGroups/{resourceGroupsName}/providers/Microsoft.App/managedEnvironments/{container apps environments name (created on 4.2.3)}
  configuration:
    activeRevisionsMode: Single
    ingress:
      external: false # Traffic comes from the Application Gateway via private VNET
      targetPort: 80
    registries:
      - server: <registry URL>
        username: <Username provided by DQE>
        passwordSecretRef: registry-password
    secrets:
      - name: registry-password
        value: <Password provided by DQE>
  template:
    containers:
      # Redis: 0.5 vCPU / 1 Gi (ratio 1:2)
      - name: redis
        image: <ImageURL>
        resources:
          cpu: 0.5
          memory: 1Gi
        volumeMounts:
          - volumeName: redisvol
            mountPath: /data
      # RabbitMQ: 1 vCPU / 2 Gi (ratio 1:2)
      - name: rabbitmq
        image: <ImageURL>
        resources:
          cpu: 1
          memory: 2Gi
        env:
          - name: RABBITMQ_DEFAULT_PASS
            value: guest
          - name: RABBITMQ_DEFAULT_USER
            value: guest
          - name: RABBITMQ_DEFAULT_VHOST
            value: admin
        volumeMounts:
          - volumeName: rabbitvol
            mountPath: /bitnami
      # Nginx: 0.25 vCPU / 0.5 Gi (ratio 1:2)
      - name: nginx
        image: <ImageURL>
        resources:
          cpu: 0.25
          memory: 0.5Gi
        volumeMounts:
          - volumeName: nginxconf
            mountPath: /etc/nginx/conf.d
      # Unify UI: 0.25 vCPU / 0.5 Gi (ratio 1:2)
      - name: unify-ui
        image: <ImageURL>
        command: ["npm", "start"]
        resources:
          cpu: 0.25
          memory: 0.5Gi
        env:
          - name: SESSION_SECRET
            value: myveryimportantSecret
          - name: PORT
            value: "8001"
      # Unify web server: 0.5 vCPU / 1 Gi (ratio 1:2)
      - name: one-server
        image: <ImageURL>
        command: ["bash", "./entrypoint.sh"]
        resources:
          cpu: 0.5
          memory: 1Gi
        env:
          - name: SFAPIVERSION
            value: v59.0
          - name: REDIS_URL
            value: redis://localhost:6379
          - name: CLOUDAMQP_URL
            value: amqp://guest:guest@localhost:5672/admin
          - name: CUSTOMUI
            value: http://localhost:8001
          - name: UNIFYSERVERURL
            value: http://localhost:8000
          - name: PORT
            value: "8000"
      # Queue Worker: 2.5 vCPU / 5 Gi (ratio 1:2), requires Dedicated plan
      - name: queue-worker
        image: <ImageURL>
        command: ["python", "-u", "./unify/queue_worker.pyc"]
        resources:
          cpu: 2.5
          memory: 5Gi
        env:
          - name: SFAPIVERSION
            value: v59.0
          - name: WORKDIRPATH
            value: /app/unify
          - name: REDIS_URL
            value: redis://localhost:6379
          - name: CLOUDAMQP_URL
            value: amqp://guest:guest@localhost:5672/admin
    # Azure File Share volumes: storageName must match names created on 4.3.3
    volumes:
      - name: redisvol
        storageType: AzureFile
        storageName: redisvol
      - name: rabbitvol
        storageType: AzureFile
        storageName: rabbitvol
      - name: nginxconf
        storageType: AzureFile
        storageName: nginxconf
    scale:
      minReplicas: 1 # Always keep 1 replica running
      maxReplicas: 1 # Single replica: no horizontal scaling for stateful services
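The sample YAML uses placeholder credentials (guest/guest for RabbitMQ and a fixed SESSION_SECRET). One possible way to produce replacement values is sketched below with Python's secrets module. The helper name is hypothetical, and whether you store the results as ACA secrets or inline values is a choice for your own security policy.

```python
import secrets


def new_secret(n_bytes: int = 32) -> str:
    """Return a random URL-safe string built from n_bytes of entropy."""
    return secrets.token_urlsafe(n_bytes)


session_secret = new_secret()
rabbitmq_password = new_secret()
print(len(session_secret))  # 43 characters for 32 random bytes
```

The URL-safe alphabet (letters, digits, hyphen, and underscore) means a generated RabbitMQ password can be placed in CLOUDAMQP_URL without percent-encoding. The same password must be used in RABBITMQ_DEFAULT_PASS and in both CLOUDAMQP_URL values.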
4.3.5. Deploy the Container App
On your local machine, open a command prompt and log in to Azure with Azure CLI:
az login
Create the Azure Container App from the previously created YAML file:
az containerapp create \
--resource-group <your Resource Group Name> \
--name dqe-unify \
--environment <container apps environments name (created on 4.2.3)> \
--yaml <yaml file path>
After the operation is finished, you should get the following information:
Warning: Startup ordering: ACA starts all containers at the same time; there is no depends_on equivalent. RabbitMQ and Redis may not be ready when one-server and queue-worker start. If the application does not include retry logic, these containers crash and are restarted until the services are available. ACA restarts failed containers automatically, but the application may take 1 to 2 minutes to become fully operational.
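The retry logic mentioned in this warning can be as simple as polling the dependency's TCP port before starting work. The sketch below is a generic illustration of that pattern, not the code shipped in the Unify images. The function name and default values are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, attempts: int = 30,
                  delay_s: float = 2.0, timeout_s: float = 1.0) -> bool:
    """Return True once a TCP connection succeeds, False after all attempts."""
    for attempt in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=timeout_s):
                return True
        except OSError:
            # Not reachable yet: pause before the next attempt, if any remain.
            if attempt < attempts - 1:
                time.sleep(delay_s)
    return False


# Example (not executed here): wait for RabbitMQ, then Redis, on localhost.
# ready = wait_for_port("localhost", 5672) and wait_for_port("localhost", 6379)
```

A successful TCP connection only shows that the port is open. A service may still need a moment before it accepts commands, so client-level retries remain useful.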
4.3.6. Configure Nginx
Create a default.conf file with the following content. This configuration handles the HTTP connection between the Application Gateway and the ACA through the private VNET. All containers share the same network namespace within the container app, so inter-container routing uses localhost.
server {
    listen 80;

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://127.0.0.1:8001;
    }

    location /unify/ {
        rewrite /unify/(.*) /$1 break;
        proxy_set_header Host $host;
        proxy_pass http://127.0.0.1:8000;
    }
}
Upload this file in the nginxconf file share previously created.
Warning: Restart the ACA so that Nginx is updated with the latest change.
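To clarify what the two location blocks in default.conf do, the sketch below mimics the routing decision in Python: paths under /unify/ go to the backend on port 8000 with the prefix stripped, and everything else goes to CustomUI on port 8001. The sample paths are hypothetical, and query strings and headers are ignored for brevity.

```python
import re

UI_UPSTREAM = "http://127.0.0.1:8001"      # CustomUI
SERVER_UPSTREAM = "http://127.0.0.1:8000"  # one-server


def route(path: str) -> str:
    """Return the upstream URL that the configuration would proxy to."""
    if path.startswith("/unify/"):
        # Mirrors: rewrite /unify/(.*) /$1 break;
        rewritten = re.sub(r"^/unify/(.*)$", r"/\1", path)
        return SERVER_UPSTREAM + rewritten
    return UI_UPSTREAM + path


print(route("/unify/api/status"))  # http://127.0.0.1:8000/api/status
print(route("/login"))             # http://127.0.0.1:8001/login
print(route("/unify"))             # http://127.0.0.1:8001/unify
```

The last call shows a detail worth knowing: a request to /unify without the trailing slash does not match the /unify/ prefix and is therefore sent to CustomUI.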
4.4. Setting Up the Application Gateway Backend Pool
After the ACA is created, retrieve its internal FQDN. Unlike ACI, ACA with internal ingress exposes a hostname, not a direct IP:
az containerapp show \
--name dqe-unify \
--resource-group <resource group> \
--query "properties.configuration.ingress.fqdn" -o tsv
The FQDN has the following format: dqe-unify.internal.<environment-id>.<region>.azurecontainerapps.io.
Go to Application Gateway > Backend Pool > {Your Backend Pool} > Edit the backend pool. Add the ACA FQDN as a backend target and select FQDN type, not IP address.
4.5. Setting Up the Health Probe of the Application Gateway
The health probe is used by the Application Gateway to frequently check whether the API on the ACA is up. Set the health probe, then click Test. If the status is Healthy, the ACA is properly configured and started.
Warning: Health probe path: set the probe path to / (root) and the protocol to HTTP on port 80. The Unify Server returns a 200 response on the root path when it is up. Do not use HTTPS for the probe — ACA internal ingress does not terminate TLS on the private VNET.
Add a health probe from the Application Gateway:
Click Test > Add.
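When the probe reports Unhealthy, it can help to reproduce the same check by hand from a machine that can reach the ACA inside the VNET. The sketch below issues a plain HTTP GET on the root path and treats a 200 status as healthy. The function name is hypothetical, and the base URL must be replaced with your own ACA address.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout_s: float = 5.0) -> bool:
    """Return True when GET <base_url>/ answers with HTTP 200."""
    url = base_url.rstrip("/") + "/"
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, refused connections, and timeouts.
        return False


# Example (not executed here):
# print(is_healthy("http://<ACA internal FQDN>"))
```

Note that urlopen follows redirects, so a redirect that ends on a 200 page also counts as healthy in this sketch.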
4.5.1. Set Up the DNS
You must contact your administrators, or anyone with access to the DNS zone, and ask them to add a DNS name pointing to your Application Gateway public IP.
4.5.2. WAF Configuration
The WAF is used to authorize or block specific IP addresses trying to call the Application Gateway. For Salesforce processes to work, the WAF must authorize Salesforce IP ranges.
In the WAF, click Switch to prevention mode. This allows the WAF to use the custom rules below.
In the custom rules, add:
- Salesforce IP address ranges, including basic and Hyperforce ranges
- DQE check license server: ask support
- SF Automation 1: ask support
- SF Automation 2: ask support
You can find the exhaustive list of Salesforce IP addresses here.
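Before entering ranges in the custom rules, it can be useful to check a given caller address against your list offline. The sketch below does this with Python's ipaddress module. The ranges shown are reserved documentation ranges used as placeholders, not Salesforce or DQE addresses.

```python
import ipaddress


def is_allowed(client_ip: str, allowed_ranges) -> bool:
    """Return True if client_ip falls inside any of the CIDR ranges."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_ranges)


ALLOWED = ["192.0.2.0/24", "198.51.100.0/25"]  # placeholders only
print(is_allowed("192.0.2.10", ALLOWED))   # True
print(is_allowed("203.0.113.5", ALLOWED))  # False
```

Running such a check against addresses seen in the Application Gateway logs is one way to confirm why a request was blocked.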
After completing these steps, check your deployment as described in 4.5. Setting Up the Health Probe of the Application Gateway.