ODI Architecture
Repositories:
1. Oracle Data Integrator Repository is composed of one Master Repository and several
Work Repositories.
2. ODI objects developed or configured through the user interfaces (ODI Studio) are stored
in one of these repository types.
3. The master repository stores the following information:
1. Security information including users, profiles and rights for the ODI platform
2. Topology information, including technologies, server definitions, schemas,
contexts, languages, and so forth.
3. Versioned and archived objects.
4. The work repository stores the following information:
1. The work repository contains the actual developed objects. Several work
repositories may coexist in the same ODI installation (for example, a development
work repository and a production work repository).
2. Models, including schema definition, datastores structures and metadata, fields
and columns definitions, data quality constraints, cross references, data lineage
and so forth.
3. Projects, including business rules, packages, procedures, folders, Knowledge
Modules, variables, and so forth.
4. Scenario execution, including scenarios, scheduling information and logs.
5. When the Work Repository contains only the execution information (typically for
production purposes), it is then called an Execution Repository.
4. Security Navigator is the tool for managing the security information in Oracle
Data Integrator. Through Security Navigator you can create users and profiles
and assign user rights for methods (edit, delete, and so on) on generic objects (data
server, datatypes, and so on), and fine-tune these rights on the object instances (Server
1, Server 2, and so on).
5. Creating Master and Work Repository
Agents:
1. Oracle Data Integrator run-time agents orchestrate the execution of jobs. An agent
executes jobs on demand and starts the execution of scenarios according to schedules
defined in Oracle Data Integrator.
Languages:
1. Languages define the languages and language elements available when editing
expressions at design time.
9. Creating Model
1. Models are the objects that store metadata in ODI.
2. A model contains the description of a relational data model: a group of datastores
(tables) stored in a given schema on a given technology.
3. A model typically contains metadata reverse-engineered from the “real” data model
(database, flat file, XML file, COBOL copybook, LDAP structure, and so on).
4. Database models can be designed in ODI. The appropriate DDLs can then be generated
by ODI for all necessary environments (development, QA, production)
5. Reverse engineering is an automated process to retrieve metadata to create or update
a model in ODI.
6. Customized reverse-engineering is implemented by an RKM (Reverse-engineering
Knowledge Module).
7. There are two types of reverse-engineering:
1. Standard reverse-engineering
i. Uses JDBC connectivity features to retrieve metadata, then writes it to the
ODI repository.
ii. Requires a suitable driver
2. Customized reverse-engineering
i. Reads metadata from the application or database system repository, then
writes it to the ODI repository
ii. Uses a technology-specific strategy, implemented in a Reverse-engineering
Knowledge Module (RKM)
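For illustration, a customized reverse-engineering strategy for an Oracle source could read table and column definitions directly from the database dictionary with a query like the one below and then write them into the ODI repository. This is only a sketch: the schema name SALES_DW is invented for the example, and a real RKM wraps such queries in its own steps.

    -- Hypothetical dictionary query an Oracle-specific RKM step might run
    SELECT table_name, column_name, data_type, data_length, nullable
    FROM   all_tab_columns
    WHERE  owner = 'SALES_DW'
    ORDER  BY table_name, column_id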
Projects
1. A project is a group of objects developed using Oracle Data Integrator.
2. A folder is a hierarchical grouping beneath a project and can contain other folders and
objects.
3. Every package, mapping, reusable mapping, and procedure must belong to a folder.
4. Objects cannot be shared between projects, except global objects (variables, sequences,
and user functions).
5. Objects within a project can be used in all folders.
6. A knowledge module is a code template containing the sequence of commands
necessary to carry out a data integration task.
7. There are different knowledge modules for loading, integration, checking, reverse
engineering, and journalizing.
8. All Knowledge Module code is executed at run time.
9. There are six types of Knowledge Modules: RKM, LKM, JKM, IKM, CKM, and SKM
(described in detail later in this document).
Mappings
Markers
1. A marker is a tag that you can attach to any ODI object, to help organize your project.
2. Markers can be used to indicate progress, review status, or the life cycle of an object.
3. Graphical markers attach an icon to the object, whereas non-graphical markers attach
numbers, strings, or dates.
4. Markers can be crucial for large teams, allowing communication among developers from
within the tool.
1. Review priorities.
2. Review completion progress.
3. Add memos to provide details on what has been done or has to be done.
4. This can be particularly helpful for geographically dispersed teams.
5. Project markers:
1. Are created in the Markers folder under a project
2. Can be attached only to objects in the same project
6. Global markers:
1. Are created in the Global Markers folder in the Others view
2. Can be attached to models and global objects
11. Components
1. In the logical view of the mapping editor, you design a mapping by combining datastores
with other components. You can use the mapping diagram to arrange and connect
components such as datasets, filters, sorts, and so on. You can form connections
between data stores and components by dragging lines between the connector ports
displayed on these objects.
2. Mapping components can be divided into two categories which describe how they are
used in a mapping: projector components and selector components.
i. Projector Components
ii. Selector Components
Projector Components
1. Projectors are components that influence the attributes present in the data that flows
through a mapping. Projector components define their own attributes: attributes from
preceding components are mapped through expressions to the projector's attributes. A
projector hides attributes originating from preceding components; all succeeding
components can only use the attributes from the projector.
2. Built-in projector components:
1. Dataset Component
2. Datastore Component
3. Set Component
4. Reusable Mapping Component
5. Aggregate Component
6. Distinct Component
Selector Components
1. Selector components reuse attributes from preceding components. Join and Lookup
selectors combine attributes from the preceding components. For example, a Filter
component following a datastore component reuses all attributes from the datastore
component. As a consequence, selector components don't display their own attributes in the
diagram and as part of the properties; they are displayed as a round shape. (The
Expression component is an exception to this rule.)
2. When mapping attributes from a selector component to another component in the mapping,
you can select and then drag an attribute from the source, across a chain of connected
selector components, to a target datastore or next projector component. ODI will
automatically create the necessary queries to bring that attribute across the intermediary
selector components.
3. Built-in selector components:
1. Expression Component
2. Filter Component
3. Join Component
4. Lookup Component
5. Sort Component
6. Split Component
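On a SQL-based technology, the code generated for a mapping built from these components corresponds closely to familiar SQL clauses. The sketch below (all table and column names are invented for the example) shows roughly where Join, Filter, Aggregate, and Sort components would surface in a generated statement; the actual code depends on the Knowledge Modules used.

    -- Illustrative shape of the SQL generated for a small mapping
    SELECT   c.region, SUM(o.amount) AS total_amount    -- Aggregate component (projector)
    FROM     customers c
    JOIN     orders o ON o.customer_id = c.customer_id  -- Join component (selector)
    WHERE    o.status = 'SHIPPED'                       -- Filter component (selector)
    GROUP BY c.region
    ORDER BY c.region                                   -- Sort component (selector)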
Variables
1. Variable – An ODI object which stores a typed value, such as a number, string or date.
2. Variables are used to customize transformations as well as to implement control
structures, such as if-then statements and loops, into packages.
3. There are two types of variables:
a. Global
b. Project-level
4. To refer to a variable, prefix its name according to its scope:
a. Global variables: GLOBAL.<variable_name>
b. Project variables: <project_code>.<variable_name>
5. Variables are used either by string substitution or by parameter binding (see the
example after this list).
a. Substitution: #<project_code>.<variable_name>
b. Binding: :<project_code>.<variable_name>
6. There are four types of variable steps:
a. Declare
b. Set
c. Evaluate
d. Refresh
7. Variables can be used in packages in several different ways, as follows:
a. Declaration: When a variable is used in a package (or in certain elements of the
topology which are used in the package), it is strongly recommended that you
insert a Declare Variable step in the package. This step explicitly declares the
variable in the package.
b. Refreshing: A Refresh Variable step allows you to re-execute the command or
query that computes the variable value.
c. Assigning: A Set Variable step of type Assign sets the current value of a
variable.
d. Incrementing: A Set Variable step of type Increment increases or decreases a
numeric value by the specified amount.
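To illustrate items 5 and 7 above, the sketch below uses a hypothetical project variable DWH_PROJECT.LAST_LOAD_DATE (all object names are invented for the example). A Refresh Variable step recomputes its value, and the variable is then used in a statement either by substitution or by binding.

    -- Refresh Variable step: the query recomputes the variable's value
    SELECT MAX(load_date) FROM sales_dw.load_audit

    -- Substitution: the current value is written into the statement text before execution
    DELETE FROM sales_dw.stg_orders WHERE load_date < '#DWH_PROJECT.LAST_LOAD_DATE'

    -- Binding: the value is passed to the statement as a bind parameter at run time
    DELETE FROM sales_dw.stg_orders WHERE load_date < :DWH_PROJECT.LAST_LOAD_DATE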
Sequences
Global sequences are common to all projects, whereas project sequences are only available
within their own project.
Procedures
Procedure Examples:
1. SQL Statements
2. OS Commands
3. ODI Tools
4. Jython Programs
5. Variables
6. Sequences
7. User Functions
8. …etc
OdiFileCopy Syntax
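A generic form of the call, inferred from the examples below (only the parameters that appear in those examples are shown):

    OdiFileCopy -FILE=<source_file(s)> | -DIR=<source_dir>
                -TOFILE=<target_file> | -TODIR=<target_dir>
                [-OVERWRITE=yes|no] [-RECURSE=yes|no]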
Examples
1. Copy the file "hosts" from the directory /etc to the directory /home:
a. OdiFileCopy -FILE=/etc/hosts -TOFILE=/home/hosts
2. Copy all *.csv files from the directory /etc to the directory /home and overwrite:
a. OdiFileCopy -FILE=/etc/*.csv -TODIR=/home -OVERWRITE=yes
3. Copy all *.csv files from the directory /etc to the directory /home while changing their
extension to .txt:
a. OdiFileCopy -FILE=/etc/*.csv -TOFILE=/home/*.txt -OVERWRITE=yes
4. Copy the directory C:\odi and its sub-directories into the directory C:\Program Files\odi:
a. OdiFileCopy -DIR=C:\odi "-TODIR=C:\Program Files\odi" -RECURSE=yes
Packages
1. A package is made up of a sequence of steps organized into an execution diagram; it is
the largest unit of execution in Oracle Data Integrator.
Scenarios
1. A scenario is generated from a package, mapping, procedure, or variable. Once generated,
its code is frozen: later modifications of the components which contributed to creating it
will not change it in any way.
2. Once generated, the scenario is stored inside the work repository. The scenario can be
exported and then imported into another repository (remote or not) and used in different
contexts. A scenario can only be created from a development work repository, but can be
imported into both development and execution work repositories.
Version Control
Note: Version management is supported for master repositories installed on database engines
such as Oracle, Hypersonic SQL, and Microsoft SQL Server.
The following objects can be placed under version control:
1. Projects, Folders
2. Packages, Scenarios
3. Mappings (including Reusable Mappings), Procedures, Knowledge Modules
4. Sequences, User Functions, Variables
5. Models, Model Folders
6. Solutions
7. Load Plans
Knowledge Modules
1. Knowledge Modules are templates of code that define integration patterns and their
implementation.
2. They are usually written to follow data integration best practices, but can be modified for
project-specific requirements.
3. There are six types of Knowledge Modules:
4. RKM (Reverse Knowledge Modules) are used to perform a customized Reverse
engineering of data models for a specific technology. These KMs are used in data
models.
5. LKM (Loading Knowledge Modules) are used to extract data from source systems (files,
middleware, database, etc.). These KMs are used in Mappings.
6. JKM (Journalizing Knowledge Modules) are used to create a journal of data
modifications (insert, update, and delete) in the source databases to keep track of the
changes. These KMs are used in data models and for Changed Data Capture.
7. IKM (Integration Knowledge Modules) are used to integrate (load) data to the target
tables. These KMs are used in Mappings.
8. CKM (Check Knowledge Modules) are used to check that constraints on the sources
and targets are not violated. These KMs are used in data model static checks and in
mapping flow checks.
9. SKM (Service Knowledge Modules) are used to generate the code required for creating
data services. These KMs are used in data models.
1. When processing happens between two data servers, a data-transfer KM (an LKM) is
required to move the data from the source to the staging area before integration.
2. When processing happens within a single data server, it is entirely performed by that
server; a single-technology IKM is sufficient.
3. LKMs and IKMs can be used in four possible ways.
4. Normally you never create new KMs from scratch, but sometimes you may need to
modify existing KMs.
5. When modifying KMs, duplicate existing steps and modify the copies; this prevents typos
in the syntax of the odiRef methods (see the illustrative step after this list).
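To make the idea of a KM as a code template concrete, the fragment below sketches what a single "insert new rows" step of a SQL-style IKM can look like. The odiRef substitution methods are resolved into actual table names, column lists, and join/filter clauses when the code is generated; this is a simplified illustration, not a complete Knowledge Module.

    -- Simplified "insert new rows" step from a SQL-style IKM (illustrative only)
    insert into <%=odiRef.getTable("L", "TARG_NAME", "A")%>
    (
      <%=odiRef.getColList("", "[COL_NAME]", ", ", "", "INS")%>
    )
    select
      <%=odiRef.getColList("", "[EXPRESSION]", ", ", "", "INS")%>
    from  <%=odiRef.getFrom()%>
    where (1=1)
    <%=odiRef.getJoin()%>
    <%=odiRef.getFilter()%>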
Changed Data Capture
1. The purpose of Changed Data Capture (CDC) is to allow applications to process changed
data only.
2. Loads will only process changes since the last load.
3. The volume of data to be processed is dramatically reduced.
4. CDC is extremely useful for near-real-time implementations, synchronization, and Master
Data Management.
5. In general, there are four types of CDC techniques:
1. Trigger-based – ODI will create and maintain triggers to keep track of the
changes.
2. Log-based – for some technologies (Oracle, AS/400), ODI can retrieve changes
from the database logs.
3. Timestamp-based – if the data is timestamped, processes written with ODI can
filter the data by comparing the timestamp value with the last load time (see the
example after this list). This approach is limited because it cannot process deletes;
the data model must have been designed properly.
4. Sequence number – if the records are numbered in sequence, ODI can filter the
data based on the last value loaded. This approach is limited because it cannot
process updates and deletes; the data model must have been designed properly.
6. CDC in ODI is implemented through a family of KMs: the Journalization KMs
7. These KMs are chosen and set in the model
8. Once the journals are in place, the developer can choose in the mapping whether to use
the full data set or only the changed data.
9. Changed Data Capture (CDC) is also referred to as journalizing.
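As a simple illustration of the timestamp-based technique from the list above (the table, column, and variable names are invented for this example), a filter can compare each row's last-update timestamp with the value captured at the previous load:

    -- Only pick up rows changed since the last successful load
    SELECT *
    FROM   sales_dw.orders
    WHERE  last_update_ts > '#DWH_PROJECT.LAST_LOAD_DATE'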
Journalizing Components
Using CDC
Simple journalizing tracks changes in individual datastores. This approach has a
limitation, illustrated in the following example: you want to process changes in the
ORDER and ORDER_LINE datastores (with a referential integrity constraint based on
the fact that an ORDER_LINE record should have an associated ORDER record). If you
have captured insertions into ORDER_LINE, you have no guarantee that the associated
new records in ORDER have also been captured. Processing ORDER_LINE records
with no associated ORDER records may cause referential constraint violations in the
integration process.
Consistent Set Journalizing provides the guarantee that when an ORDER_LINE change
is captured, the associated ORDER change has also been captured, and vice versa.
Consistent set journalizing guarantees the consistency of the captured changes; the set
of available changes for which consistency is guaranteed is called the Consistency
Window. Changes in this window should be processed in the correct sequence (ORDER
followed by ORDER_LINE) by designing and sequencing integration mappings into
packages.