Importing Data with Dataimport
In this technical reference document, we cover the standard approach to importing data from external sources. Everything you need for this can be found in the dataimport module in hot-deploy.
Opentaps Data Import Strategy
The goal of the Data Import module is not to build a set of data import tools against a particular "standard," but rather to recognize that each organization has legacy or external data in its own unique format. The Data Import module is therefore a set of flexible tools that you can use as a reference point for setting up your own custom import and export. The existing services and entities can be used as-is or with little modification if your data happens to be similar, or you can add to and extend them if you have additional data.
The Data Import module sets up "bridge entities" which are de-normalized and laid out in a way that is similar to most applications' data definitions. There are no foreign key relationships to any other opentaps entity, so any data could be imported into them. You would use your own database's import tools to import records into the bridge entities. Then, you would run one of the Data Import module's import services to transform the data in the bridge entities into the opentaps system. The Data Import services all follow a common standard:
- Each row of data in a bridge entity is wrapped in its own transaction when it is imported, so it succeeds or fails on its own.
- When a row of data in a bridge entity is imported successfully, its importStatusId field is set to DATAIMP_IMPORTED.
- If the import fails for a row, its status is set to DATAIMP_FAILED and the importError field records any error messages.
To support this pattern, we have created a simple, extensible import framework. The difficult details of setting up an import, starting transactions, and handling errors are encapsulated in the OpentapsImporter class. In addition, there is an interface called ImportDecoder, which is responsible for processing a single row from the bridge entity and mapping it onto a set of opentaps entities. Used properly, this lets you focus most of your development effort on the real problem: mapping the import data into the opentaps model.
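To make the division of labor concrete, here is a minimal, self-contained sketch of the pattern in plain Java. The interface and importer loop echo the roles of ImportDecoder and OpentapsImporter, but everything in this sketch (the row representation as a Map, the method signatures, and the sample decoder) is illustrative, not the actual opentaps API:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ImportSketch {

    // Illustrative stand-in for the ImportDecoder contract: map one bridge
    // row onto opentaps entities, throwing an exception on any problem.
    interface Decoder {
        void decode(Map<String, String> row) throws Exception;
    }

    // Each row succeeds or fails on its own; the status fields follow the
    // DATAIMP_IMPORTED / DATAIMP_FAILED convention described above.
    static void runImport(List<Map<String, String>> rows, Decoder decoder) {
        for (Map<String, String> row : rows) {
            try {
                decoder.decode(row);  // in opentaps this runs in its own transaction
                row.put("importStatusId", "DATAIMP_IMPORTED");
            } catch (Exception e) {   // a failure affects only this row
                row.put("importStatusId", "DATAIMP_FAILED");
                row.put("importError", e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> good = new HashMap<>();
        good.put("customerName", "Acme Corp");
        Map<String, String> bad = new HashMap<>();  // missing customerName

        // Sample decoder: requires a customerName on every row.
        Decoder decoder = row -> {
            if (row.get("customerName") == null) {
                throw new Exception("customerName is required");
            }
        };

        runImport(List.of(good, bad), decoder);
        System.out.println(good.get("importStatusId")); // DATAIMP_IMPORTED
        System.out.println(bad.get("importStatusId") + ": " + bad.get("importError"));
    }
}
```

Note how the loop never aborts the whole run: a bad row is simply marked DATAIMP_FAILED with its error message, which is exactly what lets you fix the offending rows in the bridge table and re-run the import.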
Overview of Import Process
A brief outline of the import process is as follows:
- Break your original data into a set of suitably normalized CSV files. For example, put all your customers in one CSV and all your products in another.
- Create an Opentaps Import Entity (i.e., the bridge table) that has the same fields as your CSV file.
- Add three more fields for use by the import system: importStatusId, importError, and processedTimestamp.
- Import your CSV data into this table using standard SQL procedures for your database.
- Define a transactionless opentaps service that will execute your import (use-transaction="false").
- Create an implementation of ImportDecoder, which requires a decode() method.
- In the decode() method, you are passed a row from the bridge entity.
- Use the row data to create the equivalent set of opentaps entities.
- If there are problems that should prevent the row from being imported, throw any kind of exception. The exception message will be stored in importError.
- If any exception is thrown in decode(), the import of that particular row will fail and all of its operations will be rolled back.
- In the service implementation, create an instance of OpentapsImporter.
- Specify the name of your Opentaps Import Entity in the constructor.
- Specify the ImportDecoder that you just created.
- Run the import by calling opentapsImporter.runImport().
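The bridge entity and service from the steps above might look like the following fragments. These follow the standard ofbiz entitymodel.xml and services.xml conventions, but the entity name, the customer fields, and the service class and method names are hypothetical placeholders; only the three import-framework fields and the use-transaction="false" attribute come from the process described above:

```xml
<!-- entitymodel.xml: a hypothetical bridge entity mirroring a customers CSV -->
<entity entity-name="DataImportMyCustomer" package-name="org.opentaps.dataimport">
    <field name="customerId" type="id"/>
    <field name="customerName" type="name"/>
    <!-- the three extra fields used by the import system -->
    <field name="importStatusId" type="id"/>
    <field name="importError" type="very-long"/>
    <field name="processedTimestamp" type="date-time"/>
    <prim-key field="customerId"/>
</entity>

<!-- services.xml: a transactionless service that runs the import, so that
     OpentapsImporter can manage a separate transaction per row -->
<service name="importMyCustomers" engine="java" use-transaction="false"
        location="org.opentaps.dataimport.MyCustomerImportServices"
        invoke="importMyCustomers">
    <description>Import customers from the DataImportMyCustomer bridge entity</description>
</service>
```

Note that there are no relations in the entity definition: keeping the bridge entity free of foreign keys is what allows arbitrary external data to be loaded into it before the import service validates it.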