Force.com provides two methods that help us with data migration. The cloud-based Data Import Wizard does not require any download; it is easy to use and quickly lets us upload a few standard objects and all custom objects. The Apex Data Loader (API based) is a downloadable utility that has several advantages over the cloud-based tool. Let us look at both tools in detail:
The Data Import Wizard is a cloud-based solution to import data into the organization quickly. It is an easy-to-use tool to load accounts, contacts, leads, solutions, and custom objects. We can load up to 50,000 records at a time. It is accessible only to the system administrator or to profiles with administrative access.
To use the Data Import Wizard, perform the following steps:
Similarly, Force.com also provides us with a separate Import Custom Objects wizard that helps us load data into custom objects. Select the Import Custom Objects wizard for this exercise.
Select the .CSV file to upload. The file should have the column headings as its first row; this helps in identifying the column types and makes mapping easy. On the next page, we can map the fields to the Salesforce columns as shown in the next screenshot. Map the columns from Excel to Salesforce fields and click on Next.

The Apex Data Loader is a Java-based desktop utility used to upload data to the organization. We can load any object that has API access. Unlike other tools, the data loader is not cloud based, but a desktop utility built for systems running on Windows. Some salient features of the data loader are listed as follows:
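As a brief illustration of the expected file layout, an import file with the column headings in the first row might look like the following (the object, column names, and values here are invented for this example):

```csv
Name,Phone,Website,Industry
Acme Corp,555-0100,www.acme.example,Manufacturing
Globex Inc,555-0101,www.globex.example,Technology
```

The first row is what the wizard reads when it suggests a mapping from file columns to Salesforce fields.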
Unlike other features of Force.com, the data loader is not completely on the cloud. We need to download and install it on the machine before use. To obtain and install the data loader, perform the following steps:
An unofficial Mac version of the data loader is also available at http://www.pocketsoap.com/osx/lexiloader/
Install the data loader on the machine before proceeding to the next section, where we will be loading data.
The data loader helps us export data for backup, insert data, and delete and hard delete data, as shown in the following screenshot:
Let us export some opportunities as an example. Save the extraction target as a .csv file as shown in the following screenshot and click on Next. The data loader writes the results of an operation to the success.csv and error.csv files.

Now that we have exported the data, let us try upserting it. For this example, we will upsert the data using the template obtained from the extract. It is always advisable to first extract the fields that we intend to update, insert, or upsert so that we get a template. Perform the following steps to upsert the data:
Thus we have seen the export and upsert operations using the data loader. The insert and update operations are very similar to upsert; however, they cannot map records using an external ID. The delete and update operations require a Salesforce ID.
By default, the data loader is configured to operate using some default parameters. We can further configure it as per our requirements to improve performance.
The data loader by default works only with the production organization. To make it work with a sandbox, we need to modify some parameters. Let us configure the data loader for sandbox operation.
Go to Settings | Settings in the Data Loader. A new window will open as shown in the following screenshot:
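For example, pointing the data loader at a sandbox is done by changing the server host URL in this dialog. The command-line equivalent, sketched below, is the sfdc.endpoint entry in config.properties; the URL is the standard sandbox login endpoint:

```properties
# Point the data loader at a sandbox instead of production
sfdc.endpoint=https://test.salesforce.com
```

Remember to log in with the sandbox username (which carries the sandbox name suffix), not the production username.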
There are five major configuration changes that we need to keep in mind while setting up the Data Loader. Let us look at each one:
The data loader is a very easy and intuitive tool for loading data. However, easy as it is to use, it also requires manual intervention and user input to operate. Many times we require an automated process to upload large chunks of data. For these cases, we use the data loader through the command line interface.
Command line data loader performs the same operations as the normal data loader without the GUI and the easy-to-use interface.
Before we fire up the data loader with the command line interface we need to configure it so that it can perform the desired operations.
The main configuration file is the config.properties file, which is located in the Data Loader folder. The default location of the data loader is C:\Program Files\Salesforce.com\Apex Data Loader 22.0.
For help using the command line data loader, give the -help command at the command line.
As the command line data loader runs on the command prompt we have very few options to customize and format the commands during the operation. To overcome this, we prepare some files that store the basic commands and configuration needed to run the data loader from command line. Let us look at some of the files used in operation.
The process-conf.xml file contains a description for every potential process that could be called from the command line. Each of these processes is referred to as a Process Bean. A sample process-conf.xml file is shown in the following screenshot:
The section marked in the outer rectangle is a single bean, which holds the data operation. The following are the main properties:

- name: Identifies the bean in the config.properties file and when we call it from the command line interface, for example, accountMasterProcess.
- sfdc.entity: The Salesforce object that is the target of the actions. The value of this property is case sensitive, for example, Account (note the capital A).
- process.operation: Identifies the type of operation, for example, insert, upsert, update, and so on.
- process.mappingFile: Identifies the mapping between the fields in the source file (dataAccess.name) and the Salesforce object. This mapping file is a .sdl file, which can easily be created from the GUI of the data loader.
- dataAccess.type: Identifies the operation to perform on the source file. For example, if we are using an ODBC data source the property reads databaseRead; if we are reading data from a CSV file, the property is CSVRead.
- sfdc.username: Stores the Force.com login name for the command line run. If no owner is specified on a record, this user will be the new owner.
- sfdc.password: The configuration file is stored as plain text on the machine, so storing the password in the file is not a secure practice. The sfdc.password and process.encryptionKeyFile entries serve the purpose of encrypting the password for added security. To generate the key and the encrypted password, the data loader provides the Encrypt.bat utility with its installation. We will look at the Encrypt.bat file in the next section.

These are the entities in the process-conf.xml file. This file has to be prepared every time we need to perform an operation. If we need to perform multiple operations at one time, the entire <bean> </bean> element should be repeated with all the parameters inside it.
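To make the structure concrete, a minimal sketch of a single process bean follows. The username, paths, and file names are placeholders for this example; the bean class name follows the standard data loader configuration, and the exact property casing (for example, csvRead versus CSVRead) may vary by data loader version:

```xml
<!-- A single process bean inside process-conf.xml; repeat one bean per operation. -->
<bean id="accountMasterProcess"
      class="com.salesforce.dataloader.process.ProcessRunner"
      singleton="false">
  <description>Upserts accounts from a CSV file (example values only).</description>
  <property name="name" value="accountMasterProcess"/>
  <property name="configOverrideMap">
    <map>
      <entry key="sfdc.username" value="admin@example.com"/>
      <!-- Encrypted password and key file generated with Encrypt.bat -->
      <entry key="sfdc.password" value="encrypted_password_here"/>
      <entry key="process.encryptionKeyFile" value="C:\dataloader\key.txt"/>
      <entry key="sfdc.entity" value="Account"/>
      <entry key="process.operation" value="upsert"/>
      <entry key="process.mappingFile" value="C:\dataloader\accountMap.sdl"/>
      <entry key="dataAccess.type" value="csvRead"/>
      <entry key="dataAccess.name" value="C:\dataloader\accounts.csv"/>
    </map>
  </property>
</bean>
```

Each additional operation gets its own bean with a distinct id, which is the name used to invoke it from the command line.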
Once we configure the desired files, we are ready to run the operation. To run the command line job, the data loader comes with a process.bat file kept in the bin folder of the data loader installation.
To run the batch process, we run the process command with the name of the bean as input: process ../accountMasterProcess
In this case, process is the command given to run the data loader and accountMasterProcess is the name we have given as the id of the bean.
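Depending on the data loader version, process.bat may also expect the directory containing process-conf.xml as its first argument. A hedged sketch of such an invocation, with example paths and the bean name from above, would be:

```batch
REM Run the accountMasterProcess bean defined in ..\conf\process-conf.xml
cd "C:\Program Files\Salesforce.com\Apex Data Loader 22.0\bin"
process.bat "..\conf" accountMasterProcess
```

Because this runs without any GUI, the same command can be placed in a scheduled task for automated nightly loads.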
To log in to the Salesforce.com organization, we need to specify the password in the config.properties file. For security purposes, the password should be stored encrypted so that no unauthorized person can read it.
Salesforce provides us with the encrypt.bat utility, which encrypts the password from plain text. The utility is available at its default location in the bin folder of the data loader installation.
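As a sketch, typical invocations of the utility look like the following; the -g, -e, and -v switches are assumed from common data loader releases, and the seed text, password, and key file path are placeholders:

```batch
REM Generate an encryption key file from a seed phrase
encrypt.bat -g seedphrase

REM Encrypt the password using the generated key file
encrypt.bat -e MyPassword123 "C:\dataloader\key.txt"

REM Verify that an encrypted value matches the plain text password
encrypt.bat -v encrypted_value_here MyPassword123 "C:\dataloader\key.txt"
```

The encrypted output is what goes into the sfdc.password entry, and the key file path goes into process.encryptionKeyFile.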
encrypt.bat runs in the command line and supports the following commands:

- Generate a key: This command generates a key from the text we provide. Usually we should provide the password with the security code in plain text format to generate an encrypted key.
- Encrypt text: This performs the same operation as generating a key, except that it can be used to encrypt any general text provided.
- Verify encrypted text: This matches a plain text value against the encrypted password and prints a success or failure message.

If there is some error in the operation of the data loader, we can access its log files. The two log files are:
- sdl.log: Contains a detailed chronological list of data loader log entries. Log entries marked INFO are procedural items, such as logging in to Salesforce. Log entries marked ERROR are problems, such as a submitted record missing a required field.
- sdl_out.log: A supplemental log that contains additional information not captured in sdl.log. For example, it includes log entries for the creation of proxy server network connections.

These files are present in the temp folder of a system running Microsoft Windows and can be accessed by entering %TEMP%\sdl.log or %TEMP%\sdl_out.log in the Run dialog.
The data loader and the web-based tool perform similar operations. The following is a quick comparison of which to use in which cases:

| Apex data loader | Cloud-based import wizard |
|---|---|
| Supports more than 50,000 records. | Good for importing fewer than 50,000 records. |
| Supports all the objects available for data import. | Supports only a few important objects for data import. |
| Use when we want to schedule data imports, for example, daily or nightly imports. | Use when we need an automatic de-dupe check based on account name and site, contact e-mail address, or lead e-mail address. |