Note: All elements described below are recommendations from DQE, based on deployment experience with different customers.
The customer is responsible for integrating the solution into their own architecture.
An architect familiar with the customer context and internal infrastructure should align DQE's recommendations with the client infrastructure.
The solution is containerized, and all components are deployed in Docker, including:
- Redis: used for data storage and processing. The Redis database is deleted after each process.
- RabbitMQ: schedules actions performed by the DQE engine.
1. Deployment Architecture
The exchanges between the front end and the back end are detailed below.
To deploy an instance of the Unify Server application on your Azure organization, DQE-Software provides dedicated access to the container registry. There are several ways to deploy this container, but DQE recommends using an Azure Container Instance.
The procedure is explained in the installation section.
This requires configuring an Azure Application Gateway, or another load balancer, to expose the application with a public IP address and DNS name. Otherwise, the Salesforce organization where the user interface package is installed cannot reach the application.
To deploy this application, you need to create a YAML configuration file, which is detailed in this document.
2. Flow Matrix
The diagram below describes all flows coming in and out of the Azure Application Gateway in a standard Unify data process.
All IP addresses and ports described in this diagram must be opened in your gateway or firewall for the process to complete. All incoming connections use HTTPS.
3. Salesforce Connections
Authentication
When you install the Unify-UI package on your Salesforce organization, you must complete several configuration steps. During these steps, the Salesforce org registers with the Unify Server. At this stage, the server creates a unique password and key credentials, which are stored in your org and used to authenticate to your Unify Server instance.
The security protocol used by Salesforce to authenticate users and allow the operations performed by the package, such as bulk import and export, is JSON Web Token (JWT).
This JWT can only be validated by Salesforce for users enabled in the connected app installed with the Unify-UI package.
More information is available in the official Salesforce documentation here.
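As an illustration of the JWT mentioned above, the following is a minimal sketch of the claim set used by Salesforce's OAuth 2.0 JWT bearer flow, assuming that this is the flow the connected app relies on. The function names, the consumer key, the username, and the audience URL are placeholders introduced for this example. The signing step (RS256 with the private key) is left out because it requires a third-party cryptography library.

```python
import base64
import json
import time
from typing import Optional


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(consumer_key: str, username: str, audience: str,
                        now: Optional[int] = None, lifetime_s: int = 180) -> str:
    """Return '<header>.<claims>', the part of the JWT that gets signed."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "RS256"}
    claims = {
        "iss": consumer_key,         # consumer key of the connected app (placeholder)
        "sub": username,             # Salesforce username the server acts as
        "aud": audience,             # login URL of the org (placeholder)
        "exp": issued + lifetime_s,  # short-lived assertion
    }

    def encode(obj: dict) -> str:
        return b64url(json.dumps(obj, separators=(",", ":")).encode("utf-8"))

    return f"{encode(header)}.{encode(claims)}"
```

The returned string would then be signed with the private key matching the certificate registered on the connected app, and the signature appended as a third dot-separated segment.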
Process
The following section describes the flows between Salesforce and the ACI Unify Server, and between the ACI Unify Server and DQE servers.
Step 1 - Define a New Process
A Salesforce user defines a new process.
Example process description:
- Object: Person Account
- Type of processing: Email validation
- Processed field: Email
- Filters: see business rules.
Step 2 - Start the Process
A Salesforce user starts the process. This user must be part of the authorized logins in the connected app setup.
Step 3 - Send the Processing Request
The processing request is sent to the ACI Unify Server.
Step 4 - Authenticate with Salesforce
The ACI Unify Server authenticates with Salesforce through the connected app using a JWT.
Step 5 - Export Data
Salesforce allows the server to perform a data export.
Step 6 - Save Exported Data
The ACI Unify Server exports the relevant data, such as the Id and Email fields of the Person Account object, and saves it in Azure File Storage in CSV format.
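The shape of the exported file can be pictured with a short sketch. This is not the Unify Server's actual code: the function name `write_export` and the in-memory buffer standing in for the Azure File Storage target are assumptions made for illustration. Only the fields named above (Id and Email) are written.

```python
import csv
import io
from typing import Iterable, Mapping, TextIO


def write_export(records: Iterable[Mapping[str, str]], handle: TextIO,
                 fields: tuple = ("Id", "Email")) -> int:
    """Write only the selected fields of each record as CSV; return the row count."""
    writer = csv.DictWriter(handle, fieldnames=list(fields), extrasaction="ignore")
    writer.writeheader()
    count = 0
    for record in records:
        writer.writerow(record)  # fields not listed in `fields` are dropped
        count += 1
    return count


# Hypothetical record; the buffer stands in for a file on the mounted share.
buffer = io.StringIO()
written = write_export(
    [{"Id": "001", "Email": "a@example.com", "Name": "not exported"}], buffer)
```

Restricting the export to the identifier and the processed field keeps the temporary file as small as possible.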
Step 7 - Process the Data
If the process is Data Quality processing: the ACI Unify Server makes anonymized unit API calls for each email in the extracted file. This applies only to email qualification.
During the whole process, this is the only step where some data may be sent by REST API through a secure connection to the DQE-Software production server.
If the process is Deduplication processing: the ACI Unify Server calculates duplicate groups and reconciles records according to the matching rules defined during the workshops.
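To make the notion of a duplicate group concrete, here is a deliberately simplified sketch. It assumes a single matching rule (exact match on a trimmed, lower-cased email), which is only a stand-in: the real matching rules are the ones defined during the workshops. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def normalize_email(value: str) -> str:
    """Illustrative matching key: trimmed, lower-cased email."""
    return value.strip().lower()


def assign_duplicate_groups(records: Iterable[Mapping[str, str]]) -> Dict[str, int]:
    """Map record Id -> group number for records that share a matching key."""
    buckets = defaultdict(list)
    for record in records:
        buckets[normalize_email(record["Email"])].append(record["Id"])

    groups: Dict[str, int] = {}
    next_group = 1
    for ids in buckets.values():
        if len(ids) > 1:  # a group needs at least two records
            for record_id in ids:
                groups[record_id] = next_group
            next_group += 1
    return groups
```

Records that match no other record receive no group number in this sketch.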
Step 8 - Generate the Result File
The ACI Unify Server aggregates all processing responses into a final CSV file.
If the process is deduplication, the ACI Unify Server calculates the data merge result within the identified duplicate groups. This result is stored in the DQE_Fusion_Json_c field, created for this purpose in the Lead object, while waiting to be used if the merge process is triggered automatically or manually at the end of processing.
Step 9 - Authenticate to Salesforce
The ACI Unify Server authenticates to Salesforce.
Step 10 - Allow Import or Update
Salesforce allows the server to perform a data import or update.
Step 11 - Send Processing Results
The ACI Unify Server sends the processing results to the relevant records, for example Person Account, through the Bulk APIs exposed by Salesforce. Person Accounts are enriched with fields created for this purpose, such as the number of the duplicate group to which the record has been attached and, where applicable, the field storing the merge result for that group.
Step 12 - Delete CSV Files
The ACI Unify Server deletes the CSV file initially received during export in step 6 and the generated CSV file containing the results from step 8.
Step 13 - Send the Processing Report
The ACI Unify Server sends a statistical processing report to Salesforce, including the number of records processed and processing time. This report is visible in the Unify application in the Runs object.
4. Installation
This section describes the installation protocol used to create an instance of DQE Unify Server as an Azure Container Instance.
It also provides an example of how to create an Azure Application Gateway.
This part depends heavily on the customer's own organization. The customer's technical team is responsible for applying the different recommendations.
For more information about configuring an Azure Application Gateway, refer to the Microsoft documentation here.
4.1. Create an Application Gateway
4.1.1. Basic Tab
The first tab displays a form similar to the one below. Fill in the different parts as shown.
4.1.1.1. WAF Policy
The WAF (Web Application Firewall) is used to define the IP ranges authorized to call the Application Gateway, such as Salesforce IP ranges.
Create a new WAF from the WAF policy field in the 4.1.1. Basic Tab.
4.1.1.2. VNET
The VNET (Virtual Network) is used to connect the Application Gateway and the ACI (Azure Container Instance). All flows pass through this private network.
Create a new VNET from the Virtual network field in the 4.1.1. Basic Tab.
4.1.2. Frontends
The Frontends tab displays a form similar to the one below. Fill in the different parts as shown.
4.1.2.1. Public IP Address
The public IP address is exposed by the Application Gateway and is used to route traffic to the ACI.
Create a new public IP from the Public IP address field in 4.1.2. Frontends.
4.1.3. Backends
The Backends tab displays a form similar to the one below. Fill in the different parts as shown.
4.1.3.1. Backend Pool
The backend pool allows the Application Gateway to route flows through the VNET to the ACI. It is automatically filled in during the Network step of ACI creation.
Create a new backend pool from Add a backend pool in 4.1.3. Backends.
4.1.4. Configuration
The Configuration tab displays a form similar to the one below. Fill in the different parts as shown.
4.1.4.1. Routing Rules - Listener
This configuration allows calls to the Application Gateway through HTTPS.
4.1.4.2. Routing Rules - Backend Target
The backend target is used by the Application Gateway to determine where to route incoming traffic. It uses the backend pool created previously.
4.1.4.3. Routing Rules - Backend Setting
The backend setting is used by the Application Gateway to determine which protocol to use when routing traffic to the ACI. This traffic is sent through the private VNET.
Then click Review + create.
4.2. Create a Container Instance from a YAML File
4.2.1. Add a VNET Subnet for the ACI
Create a new subnet delegated to Microsoft.ContainerInstance/containerGroups from the VNET created in 4.1.1.2. VNET.
4.2.2. Azure CLI
Azure CLI is required on your computer.
4.2.2.1. UNIX
On Debian or Ubuntu:
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
4.2.2.2. Windows
Refer to the Microsoft documentation here.
4.2.3. Create a Storage Account
4.2.3.1. Basics Tab
4.2.3.2. Advanced
Nothing to change.
4.2.3.3. Networking
4.2.3.4. Data Protection
Nothing to change.
4.2.3.5. Encryption
Nothing to change.
4.2.3.6. Tags
Nothing to change.
4.2.3.7. Review + Create
Create the storage account.
4.2.4. Create File Shares
The container requires three file shares:
- rabbitvol
- nginxconf
- redisvol
To use HTTPS on the ACI, Nginx is provisioned. To ensure that Nginx has all required information, add the following files to the nginxconf file share:
- The SSL certificate (.pem or .crt file).
- The SSL key (.key file).
- If the .pem file has a password, add the password in a text file and upload it to the file share.
- The default.conf file defined in 4.2.8. Configure Nginx.
4.2.5. Create a Log Analytics Workspace
The Log Analytics Workspace is used to store all logs from the ACI. By default, logs are stored for 30 days.
4.2.5.1. Basics Tab
4.2.5.2. Review + Create
Create the Log Analytics Workspace.
4.2.5.3. Get the Workspace ID and Key
4.2.6. ACI YAML Configuration File
On your local machine, create a YAML file. The YAML configuration file is provided below.
You must complete the following keys:
- name: <Container Group Name>
- location: <Location>
- imageRegistryCredentials.username: <Username provided by DQE>
- imageRegistryCredentials.password: <Password provided by DQE>
- diagnostics.logAnalytics.workspaceId: <Log Analytics Workspace ID>, available from 4.2.5.3. Get the Workspace ID and Key.
- diagnostics.logAnalytics.workspaceKey: <Log Analytics Workspace Key>, available from 4.2.5.3. Get the Workspace ID and Key.
- All <file share name>, <storage account name>, and <storage account key> values.
Example with the previously created storage account and file share:
- <file share name> = redisvol
- <storage account name> = myaccountstoragename
- <storage account key> = available from Storage account > Access keys > Show keys.
- id: subscriptions/{subscriptionsId}/resourceGroups/{resourceGroupsName}/providers/Microsoft.Network/virtualNetworks/{VNETName}/subnets/{VNETSubnetsName}
Example:
name: <Container Group Name> # Name of the container group
apiVersion: '2021-10-01'
location: <Location>
tags: {"docker-compose-application": "docker-compose-application"}
properties: # Properties of container group
  containers: # Array of container instances in the group
    # Redis Image configuration
    - name: redis
      properties: # Properties of an instance
        image: <ImageURL> # Container image used to create the instance
        resources: # Resource requirements of the instance
          requests:
            memoryInGB: 1
            cpu: 0.5
        volumeMounts: # Array of volume mounts for the instance
          - name: redisvol
            mountPath: /data
    # RabbitMQ Image configuration
    - name: rabbitmq # Name of an instance
      properties: # Properties of an instance
        image: <ImageURL> # Container image used to create the instance
        ports: # External-facing ports exposed on the instance, must also be set in group ipAddress property
          - protocol: TCP
            port: 5672
        environmentVariables:
          - name: RABBITMQ_DEFAULT_PASS
            value: guest
          - name: RABBITMQ_DEFAULT_USER
            value: guest
          - name: RABBITMQ_DEFAULT_VHOST
            value: admin
        resources: # Resource requirements of the instance
          requests:
            memoryInGB: 1
            cpu: 1
        volumeMounts: # Array of volume mounts for the instance
          - name: "rabbitvol"
            mountPath: /bitnami
            readOnly: false
    # Nginx Image configuration
    - name: nginx
      properties: # Properties of an instance
        image: <ImageURL> # Container image used to create the instance
        ports: # External-facing ports exposed on the instance, must also be set in group ipAddress property
          - protocol: TCP
            port: 80
        resources: # Resource requirements of the instance
          requests:
            memoryInGB: 1
            cpu: 0.25
        volumeMounts: # Array of volume mounts for the instance
          - name: nginxconf
            mountPath: /etc/nginx/conf.d
    # Unify UI web server Image configuration
    - name: unify-ui
      properties: # Properties of an instance
        image: <ImageURL> # Container image used to create the instance
        command:
          - "npm"
          - "start"
        ports: # External-facing ports exposed on the instance, must also be set in group ipAddress property
          - protocol: TCP
            port: 8001
        environmentVariables:
          - name: SESSION_SECRET
            value: myveryimportantSecret
          - name: PORT
            value: '8001'
        resources: # Resource requirements of the instance
          requests:
            memoryInGB: 1
            cpu: 0.25
    # Unify web server Image configuration
    - name: one-server
      properties: # Properties of an instance
        image: <ImageURL> # Container image used to create the instance
        command:
          - "bash"
          - "./entrypoint.sh"
        ports: # External-facing ports exposed on the instance, must also be set in group ipAddress property
          - protocol: TCP
            port: 8000
        environmentVariables:
          - name: SFAPIVERSION
            value: v59.0
          - name: REDIS_URL
            value: redis://127.0.0.1:6379
          - name: CLOUDAMQP_URL
            value: amqp://guest:guest@127.0.0.1:5672/admin
          - name: CUSTOMUI
            value: http://127.0.0.1:8001
          - name: UNIFYSERVERURL
            value: http://127.0.0.1:8000
          - name: PORT
            value: '8000'
        resources: # Resource requirements of the instance
          requests:
            memoryInGB: 1
            cpu: 0.25
    # Unify Worker Image configuration
    - name: queue-worker
      properties: # Properties of an instance
        image: <ImageURL> # Container image used to create the instance
        command:
          - "python"
          - "-u"
          - "./unify/queue_worker.pyc"
        environmentVariables:
          - name: SFAPIVERSION
            value: v59.0
          - name: WORKDIRPATH
            value: /app/unify
          - name: REDIS_URL
            value: redis://127.0.0.1:6379
          - name: CLOUDAMQP_URL
            value: amqp://guest:guest@127.0.0.1:5672/admin
        resources: # Resource requirements of the instance
          requests:
            memoryInGB: 5
            cpu: 1
  # Credentials to pull the Unify image from the private container registry
  imageRegistryCredentials:
    - server: <ImageURL>
      username: <Username provided by DQE>
      password: <Password provided by DQE>
  diagnostics:
    logAnalytics:
      workspaceId: <Log Analytics Workspace Id>
      workspaceKey: <Log Analytics Workspace Key>
  restartPolicy: Always
  ipAddress: # IP address configuration of container group
    ports:
      - protocol: TCP
        port: 80
    type: Private
  osType: Linux
  # Volume parameters (Azure file shares)
  volumes: # Array of volumes available to the instances
    - name: rabbitvol
      azureFile:
        shareName: <file share name>
        readOnly: false
        storageAccountName: <storage account name>
        storageAccountKey: <storage account key>
    - name: nginxconf
      azureFile:
        shareName: <file share name>
        readOnly: false
        storageAccountName: <storage account name>
        storageAccountKey: <storage account key>
    - name: redisvol
      azureFile:
        shareName: <file share name>
        readOnly: false
        storageAccountName: <storage account name>
        storageAccountKey: <storage account key>
  subnetIds: # Subnet to deploy the container group into
    - id: subscriptions/{subscriptionsId}/resourceGroups/{resourceGroupsName}/providers/Microsoft.Network/virtualNetworks/{VNETName}/subnets/{VNETSubnetsName}
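Because the file contains many placeholders, it can help to check that none remain before deploying. The following is a small sketch of such a check, not part of the DQE tooling; the function name `find_placeholders` is hypothetical, and the pattern assumes placeholders are written as `<...>` or `{...}` exactly as in the example above.

```python
import re
from typing import List, Tuple

# <Angle bracket> placeholders, or {curlyBrace} placeholders made of letters and spaces.
PLACEHOLDER = re.compile(r"<[^<>\n]+>|\{[A-Za-z ]+\}")


def find_placeholders(yaml_text: str) -> List[Tuple[int, str]]:
    """Return (line number, placeholder) pairs still present in the YAML text."""
    findings = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        code = re.sub(r"(^|\s)#.*$", "", line)  # ignore trailing comments
        for match in PLACEHOLDER.finditer(code):
            findings.append((number, match.group(0)))
    return findings
```

An empty result means every placeholder of those two forms has been replaced. The JSON-style `tags` line is not reported because its braces contain quotes and colons.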
4.2.7. Create the ACI
On your local machine, open a command prompt and log in to Azure with Azure CLI:
az login
Create the ACI from the YAML file:
az container create -g <your Resource Group Name> -f <.yaml file path>
After the operation is complete, you should see the following information:
4.2.8. Configure Nginx
Create a default.conf file containing the following content. This configuration is used for the HTTP connection between the gateway and the ACI through the private subnet.
server {
    listen 80;
    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://127.0.0.1:8001;
    }
    location /unify/ {
        rewrite /unify/(.*) /$1 break;
        proxy_set_header Host $host;
        proxy_pass http://127.0.0.1:8000;
    }
}
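The routing performed by this configuration can be summarized with a short model. It is an illustration only, written as a hypothetical `route` function: requests under /unify/ go to the Unify server on port 8000 with the /unify prefix removed, and everything else goes to the UI on port 8001. Query strings and Nginx's special handling of the bare /unify path are not modeled.

```python
from typing import Tuple

UI_UPSTREAM = "http://127.0.0.1:8001"      # target of "location /"
SERVER_UPSTREAM = "http://127.0.0.1:8000"  # target of "location /unify/"
PREFIX = "/unify/"


def route(path: str) -> Tuple[str, str]:
    """Return (upstream, path sent upstream) for a request path."""
    if path.startswith(PREFIX):
        # Mirrors "rewrite /unify/(.*) /$1 break;": strip the prefix, keep the rest.
        return SERVER_UPSTREAM, "/" + path[len(PREFIX):]
    return UI_UPSTREAM, path
```

For example, `route("/unify/api/run")` yields the server upstream with the path `/api/run`, where `/api/run` is a made-up path used only for illustration.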
Upload this file to the nginxconf file share previously created.
Warning: restart the ACI after uploading so that Nginx picks up the latest change.
4.3. Setting Up the Application Gateway Backend Pool
After the ACI is created, go to Application Gateway > Backend Pool > your backend pool > Edit backend pool.
Add the private IP address of your created ACI.
4.4. Setting Up the Application Gateway Health Probe
The health probe is used by the Application Gateway to frequently check whether the API on the ACI is up. Set up the health probe, then click Test. If the status is Healthy, the ACI is properly configured and started.
Add a health probe from the Application Gateway:
Click Test > Add.
4.4.1. Set Up the DNS
Contact your administrators, or anyone with access to the DNS zone, and ask them to add a DNS name pointing to your Application Gateway public IP.
4.4.2. WAF Configuration
The WAF is used to authorize or block specific IP addresses trying to call the Application Gateway. For Salesforce processes to work, the WAF must authorize Salesforce IP ranges.
In the WAF, click Switch to prevention mode. This allows the WAF to use the custom rules below.
In the custom rules, add:
- Salesforce IP address ranges, including basic and Hyperforce ranges.
- DQE check license server: ask support.
- SF Automation 1: ask support.
- SF Automation 2: ask support.
You can find the exhaustive list of Salesforce IP addresses here.
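The effect of an IP allow list can be checked offline before entering the ranges in the WAF. The sketch below uses Python's standard `ipaddress` module; the function name `is_allowed` is hypothetical, and the CIDR ranges in the usage lines are reserved documentation ranges, not real Salesforce or DQE addresses.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, allowed_ranges: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_ranges)


# Placeholder ranges for illustration only.
example_ranges = ["192.0.2.0/24", "198.51.100.0/24"]
inside = is_allowed("192.0.2.10", example_ranges)
outside = is_allowed("203.0.113.5", example_ranges)
```

Running the published ranges through such a check helps confirm that a known caller address is covered before the WAF is switched to prevention mode.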
After completing these steps, check your deployment as described in 4.4. Setting Up the Application Gateway Health Probe.