Odi Architecture

The document provides an overview of the Oracle Data Integrator (ODI) architecture, including its repositories, user interface, agents, languages, models, projects, knowledge modules, markers, and mapping components. Specifically: - ODI uses master and work repositories to store security, topology, versioned objects and developed integration objects respectively. - The ODI Studio interface provides navigators for designing mappings, monitoring executions, managing topology, and security. - Agents orchestrate job executions on demand or by schedule. Languages define expression elements. - Models store metadata like tables and schemas. Projects group related objects by functional domain. - Knowledge modules define integration tasks through code templates. Markers tag


ODI ARCHITECTURE:

Repositories:

1. Oracle Data Integrator Repository is composed of one Master Repository and several
Work Repositories.
2. ODI objects developed or configured through the user interfaces (ODI Studio) are stored
in one of these repository types.
3. The master repository stores the following information:
1. Security information, including users, profiles and rights for the ODI platform
2. Topology information, including technologies, server definitions, schemas, contexts, languages and so forth.
3. Versioned and archived objects.
4. The work repository stores the following information:
1. The work repository contains the actual developed objects. Several work repositories may coexist in the same ODI installation (for example, a development work repository and a production work repository).
2. Models, including schema definition, datastores structures and metadata, fields
and columns definitions, data quality constraints, cross references, data lineage
and so forth.
3. Projects, including business rules, packages, procedures, folders, Knowledge
Modules,variables and so forth.
4. Scenario execution, including scenarios, scheduling information and logs.
5. When the Work Repository contains only the execution information (typically for
production purposes), it is then called an Execution Repository.

ODI Studio and User Interface:

1. Administrators, developers and operators use Oracle Data Integrator Studio to access the repositories.
2. ODI Studio provides four Navigators for managing the different aspects and steps of an ODI integration project:

1. Designer Navigator: is used to design data integrity checks and to build transformations.
2. Operator Navigator: is the production management and monitoring tool. It is
designed for IT production operators. Through Operator Navigator, you can
manage your mapping executions in the sessions, as well as the scenarios in
production.

3. Topology Navigator: is used to manage the data describing the information system's physical and logical architecture. Through Topology Navigator you can
manage the topology of your information system, the technologies and their
datatypes, the data servers linked to these technologies and the schemas they
contain, the contexts, the languages and the agents, as well as the repositories.
The site, machine, and data server descriptions will enable Oracle Data
Integrator to execute the same mappings in different physical environments.

4. Security Navigator: is the tool for managing the security information in Oracle
Data Integrator. Through Security Navigator you can create users and profiles
and assign user rights for methods (edit, delete, etc) on generic objects (data
server, datatypes, etc), and fine tune these rights on the object instances (Server
1, Server 2, etc).

Agents:

1. Oracle Data Integrator run-time agents orchestrate the execution of jobs. An agent executes jobs on demand and starts the execution of scenarios according to schedules defined in Oracle Data Integrator.

Languages:

1. Languages define the languages and language elements available when editing expressions at design time.

Models:

1. Models are the objects that store metadata in ODI.
2. They contain a description of a relational data model: a group of datastores (known as tables) stored in a given schema on a given technology.
3. A model typically contains metadata reverse-engineered from the "real" data model (database, flat file, XML file, COBOL copybook, LDAP structure, etc.).
4. Database models can be designed in ODI. The appropriate DDLs can then be generated
by ODI for all necessary environments (development, QA, production)
5. Reverse engineering is an automated process to retrieve metadata to create or update
a model in ODI.
6. Customized reverse-engineering is implemented through an RKM (Reverse-engineering Knowledge Module).
7. There are two types of reverse-engineering:
1. Standard reverse-engineering
i. Uses JDBC connectivity features to retrieve metadata, then writes it to the
ODI repository.
ii. Requires a suitable driver
2. Customized reverse-engineering
i. Reads metadata from the application/database system repository, then writes it to the ODI repository
ii. Uses a technology-specific strategy, implemented in a Reverse-
engineering Knowledge Module (RKM)

8. Some other methods of reverse engineering are


1. Delimited format reverse-engineering
i. File parsing built into ODI.
2. Fixed format reverse-engineering
i. Graphical wizard, or through COBOL copybook for Mainframe files.
3. XML file reverse-engineering (Standard)
i. Uses JDBC driver for XML
9. Reverse-engineering is incremental: new metadata is added and obsolete metadata is removed.
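As an illustration of what standard reverse-engineering does, the sketch below retrieves table and column metadata from a database, much as ODI does through a JDBC driver's metadata calls. This is not ODI code: it is plain Python against an in-memory SQLite database, and the CUSTOMER table is a hypothetical example.

```python
import sqlite3

# In-memory database standing in for a source system (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUST_ID INTEGER PRIMARY KEY, NAME TEXT NOT NULL)")

def reverse_engineer(conn):
    """Retrieve table and column metadata, in the spirit of standard
    reverse-engineering via JDBC's DatabaseMetaData interface."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info plays the role of JDBC's getColumns() here
        cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
        model[table] = [(c[1], c[2]) for c in cols]  # (column name, datatype)
    return model

print(reverse_engineer(conn))
# {'CUSTOMER': [('CUST_ID', 'INTEGER'), ('NAME', 'TEXT')]}
```

Running the same function again after the source schema changes would simply rebuild the dictionary, which mirrors the incremental add/remove behavior described above.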

Projects:

1. A project is a collection of ODI objects created by users for a particular functional domain. These objects include folders, variables, and so forth.

2. A folder is a hierarchical grouping beneath a project and can contain other folders and
objects.
3. Every package, mapping, reusable mapping and procedure must belong to a folder.
4. Objects cannot be shared between projects, except for global variables, sequences, and user functions.
5. Objects within a project can be used in all of its folders.
6. A knowledge module is a code template containing the sequence of commands
necessary to carry out a data integration task.
7. There are different knowledge modules for loading, integration, checking, reverse
engineering, and journalizing.
8. All knowledge module code is executed at run time.
9. There are six types of knowledge modules: RKM, LKM, JKM, IKM, CKM and SKM (described under "Modifying Knowledge Modules" below).

10. Some of the KMs are


1. LKM File to Oracle (SQLLDR)
i. Uses Jython to run SQL*LOADER via the OS
ii. Much faster than basic LKM File to SQL
2. LKM Oracle to Oracle (DBLINK)
i. Loads from Oracle data server to Oracle data server
ii. Uses Oracle DBLink
3. CKM Oracle
i. Enforces logical constraints during data load
ii. Automatically captures error records
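To make the "code template" idea concrete, here is a minimal sketch in Python. The placeholder names and the generated SQL are invented for illustration; ODI's real templates use the odiRef substitution API at generation time, not Python's string.Template.

```python
from string import Template

# A hypothetical KM step: a code template whose placeholders are filled in
# at run time from the mapping's metadata. In a real KM, odiRef.* calls
# play the role these $-placeholders play here.
km_step = Template(
    "INSERT INTO $target_table ($columns) SELECT $columns FROM $source_table"
)

# Metadata that would come from the mapping (invented example values).
metadata = {
    "target_table": "TRG_CUSTOMER",
    "source_table": "SRC_CUSTOMER",
    "columns": "CUST_ID, NAME",
}

generated_sql = km_step.substitute(metadata)
print(generated_sql)
# INSERT INTO TRG_CUSTOMER (CUST_ID, NAME) SELECT CUST_ID, NAME FROM SRC_CUSTOMER
```

The same template serves any source/target pair, which is why one KM can implement an integration pattern across many mappings.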

Markers

1. A marker is a tag that you can attach to any ODI object, to help organize your project.
2. Markers can be used to indicate progress, review status, or the life cycle of an object.
3. Graphical markers attach an icon to the object, whereas non graphical markers attach
numbers, strings, or dates.
4. Markers can be crucial for large teams, allowing communication among developers from within the tool:
1. Review priorities.
2. Review completion progress.
3. Add memos to provide details on what has been done or has to be done.
This can be particularly helpful for geographically dispersed teams.
5. Project markers:
1. Are created in the Markers folder under a project
2. Can be attached only to objects in the same project
6. Global markers:
1. Are created in the Global Markers folder in the Others view
2. Can be attached to models and global objects

Components

1. In the logical view of the mapping editor, you design a mapping by combining datastores
with other components. You can use the mapping diagram to arrange and connect
components such as datasets, filters, sorts, and so on. You can form connections
between data stores and components by dragging lines between the connector ports
displayed on these objects.
2. Mapping components can be divided into two categories which describe how they are
used in a mapping: projector components and selector components.

i. Projector Components
ii. Selector Components

Projector Components

1. Projectors are components that influence the attributes present in the data that flows
through a mapping. Projector components define their own attributes: attributes from
preceding components are mapped through expressions to the projector's attributes. A
projector hides attributes originating from preceding components; all succeeding
components can only use the attributes from the projector.
2. Built-in projector components:

1. Dataset Component
2. Datastore Component
3. Set Component
4. Reusable Mapping Component
5. Aggregate Component
6. Distinct Component

Selector Components
1. Selector components reuse attributes from preceding components. Join and Lookup
selectors combine attributes from the preceding components. For example, a Filter
component following a datastore component reuses all attributes from the datastore
component. As a consequence, selector components don't display their own attributes in the
diagram and as part of the properties; they are displayed as a round shape. (The
Expression component is an exception to this rule.)
2. When mapping attributes from a selector component to another component in the mapping,
you can select and then drag an attribute from the source, across a chain of connected
selector components, to a target datastore or next projector component. ODI will
automatically create the necessary queries to bring that attribute across the intermediary
selector components.
3. Built-in selector components:

1. Expression Component
2. Filter Component
3. Join Component
4. Lookup Component
5. Sort Component
6. Split Component

Variables

1. Variable – An ODI object which stores a typed value, such as a number, string or date.
2. Variables are used to customize transformations as well as to implement control
structures, such as if-then statements and loops, into packages.
3. Variables come in two scopes:
a. Global
b. Project-level
4. To refer to a variable, prefix its name according to its scope:
a. Global variables: GLOBAL.<variable_name>
b. Project variables: <project_code>.<variable_name>
5. Variables are used either by string substitution or by parameter binding.
a. Substitution: #<project_code>.<variable_name>
b. Binding: :<project_code>.<variable_name>
6. Variable steps in a package are of four types:
a. Declare
b. Set
c. Evaluate
d. Refresh
7. Variables can be used in packages in several different ways, as follows:
a. Declaration: When a variable is used in a package (or in certain elements of the
topology which are used in the package), it is strongly recommended that you
insert a Declare Variable step in the package. This step explicitly declares the
variable in the package.
b. Refreshing: A Refresh Variable step allows you to re-execute the command or
query that computes the variable value.
c. Assigning: A Set Variable step of type Assign sets the current value of a
variable.
d. Incrementing: A Set Variable step of type Increment increases or decreases a numeric value by the specified amount.
e. Conditional evaluation: An Evaluate Variable step tests the current value of a variable and branches depending on the result of the comparison.
8. Variables can also be used in expressions in mappings, procedures and so forth.
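The difference between substitution and binding can be sketched outside ODI. In this hypothetical Python example against SQLite, DEMO.V_STATUS is an invented variable name; substitution splices the value into the statement text before it is sent, while binding keeps a placeholder and passes the value separately as a typed parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ORDERS (ORDER_ID INTEGER, STATUS TEXT)")
conn.executemany("INSERT INTO ORDERS VALUES (?, ?)",
                 [(1, "OPEN"), (2, "CLOSED"), (3, "OPEN")])

variables = {"DEMO.V_STATUS": "OPEN"}  # project code DEMO is invented

# Substitution (#DEMO.V_STATUS): the value becomes part of the SQL text.
text = "SELECT COUNT(*) FROM ORDERS WHERE STATUS = '#DEMO.V_STATUS'"
for name, value in variables.items():
    text = text.replace("#" + name, value)
count_subst = conn.execute(text).fetchone()[0]

# Binding (:V_STATUS): the statement keeps a placeholder; the value is
# supplied at execution time as a bind parameter.
count_bind = conn.execute(
    "SELECT COUNT(*) FROM ORDERS WHERE STATUS = :V_STATUS",
    {"V_STATUS": variables["DEMO.V_STATUS"]},
).fetchone()[0]

print(count_subst, count_bind)  # 2 2
```

Both queries return the same count here; binding is generally safer for values, while substitution can inject arbitrary SQL fragments such as table names.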

Sequences

1. A sequence is a variable that increments itself automatically each time it is used.
2. It is mainly useful for generating surrogate keys.
3. It is equivalent to the Sequence Generator transformation in Informatica.
4. A sequence can be created as a global sequence or in a project. Global sequences are common to all projects, whereas project sequences are only available in the project where they are defined.

5. Oracle Data Integrator supports three types of sequences:
1. Standard sequences, whose current values are stored in the repository.
2. Specific sequences, whose current values are stored in an RDBMS table cell. Oracle Data Integrator reads the value, locks the row (against concurrent updates) and updates the row after the last increment.
3. Native sequences, which map to RDBMS-managed sequences.
6. A database sequence can also be used directly, without going through the ODI sequence types.
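The read-lock-update cycle of a specific sequence can be sketched as follows. The SEQ_TABLE name and layout are hypothetical, and this is Python against SQLite for illustration, not ODI's actual implementation.

```python
import sqlite3

# A table cell holds the current value, as for a "specific sequence".
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE SEQ_TABLE (SEQ_NAME TEXT PRIMARY KEY, CUR_VALUE INTEGER)")
conn.execute("INSERT INTO SEQ_TABLE VALUES ('CUSTOMER_SEQ', 0)")

def next_value(conn, seq_name):
    """Read the current value, hold a write lock against concurrent
    updates, increment the cell, and return the new value."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock
    (current,) = conn.execute(
        "SELECT CUR_VALUE FROM SEQ_TABLE WHERE SEQ_NAME = ?", (seq_name,)
    ).fetchone()
    conn.execute(
        "UPDATE SEQ_TABLE SET CUR_VALUE = ? WHERE SEQ_NAME = ?",
        (current + 1, seq_name),
    )
    conn.execute("COMMIT")
    return current + 1

keys = [next_value(conn, "CUSTOMER_SEQ") for _ in range(3)]
print(keys)  # [1, 2, 3]
```

Each call yields the next surrogate key, and the lock prevents two concurrent loads from drawing the same value.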

Procedures

1. A procedure is a sequence of commands/tasks executed by database engines or the operating system, or using ODI tools.
2. A procedure can have options that control its behavior.
3. Procedures are reusable components that can be inserted into packages.

Procedure Examples:

1. Email Administrator procedure:
a. Uses the "OdiSendMail" ODI tool to send an administrative email to a user. The email address is an option.
2. Clean Environment procedure:
a. Deletes the contents of the /temp directory using the “OdiFileDelete” tool
b. Runs DELETE statements on these tables in order: CUSTOMER, CITY,
REGION, and COUNTRY
3. Create and populate RDBMS table:
a. Run SQL statement to create an RDBMS table.
b. Run SQL statements to populate table with records.
4. Initialize Drive procedure:
a. Connect to a network drive using either a UNIX or Windows command
(depending on an option).
b. Create a /work directory on this drive.
5. Email Changes procedure:
a. Wait for 10 rows to be inserted into the INCOMING table.
b. Transfer all data from the INCOMING table to the OUTGOING table.
c. Dump the contents of the OUTGOING table to a text file.
d. Email this text file to a user.
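The first three steps of the Email Changes procedure can be sketched in Python using the INCOMING and OUTGOING tables from the example. This is a hypothetical illustration: in ODI each step would be a SQL or tool task, a real procedure would poll rather than check once, and the final email step would use OdiSendMail.

```python
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE INCOMING (ID INTEGER, PAYLOAD TEXT)")
conn.execute("CREATE TABLE OUTGOING (ID INTEGER, PAYLOAD TEXT)")
conn.executemany("INSERT INTO INCOMING VALUES (?, ?)",
                 [(i, f"row-{i}") for i in range(1, 11)])  # 10 rows arrive

# Step a: wait for 10 rows in INCOMING (checked once here, polled in reality).
(count,) = conn.execute("SELECT COUNT(*) FROM INCOMING").fetchone()
assert count >= 10, "not enough rows yet"

# Step b: transfer all data from INCOMING to OUTGOING.
conn.execute("INSERT INTO OUTGOING SELECT * FROM INCOMING")
conn.execute("DELETE FROM INCOMING")

# Step c: dump OUTGOING to a text buffer (a flat file in the real procedure).
dump = io.StringIO()
for row in conn.execute("SELECT ID, PAYLOAD FROM OUTGOING ORDER BY ID"):
    dump.write("%d;%s\n" % row)

# Step d would email the dumped file to a user.
print(dump.getvalue().splitlines()[0])  # 1;row-1
```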

Commands or ODI objects that can be used in ODI procedures:

1. SQL Statements
2. OS Commands
3. ODI Tools
4. JYTHON Programs
5. Variables
6. Sequences
7. User Functions
8. …etc

Using ODI Tools

1. Examples of ODI tools:
1. File-related:
i. OdiFileAppend, OdiFileCopy, OdiFileDelete, OdiFileMove, OdiFileWait, OdiMkDir, OdiOutFile, OdiSqlUnload, OdiUnZip, OdiZip
2. Internet-related:
i. OdiSendMail, OdiFtpGet, OdiFtpPut, OdiReadMail, OdiScpGet, OdiScpPut, OdiSftpGet, OdiSftpPut
3. For more ODI tools, see the package screen.

Let us see about OdiFileCopy

OdiFileCopy Syntax

OdiFileCopy -DIR=<dir> -TODIR=<dest_dir> [-OVERWRITE=<yes|no>] [-RECURSE=<yes|no>] [-CASESENS=<yes|no>]

OdiFileCopy -FILE=<file> -TOFILE=<dest_file>|-TODIR=<dest_dir> [-OVERWRITE=<yes|no>] [-RECURSE=<yes|no>] [-CASESENS=<yes|no>]

Examples

1. Copy the file "hosts" from the directory /etc to the directory /home:
a. OdiFileCopy -FILE=/etc/hosts -TOFILE=/home/hosts
2. Copy all *.csv files from the directory /etc to the directory /home and overwrite:
a. OdiFileCopy -FILE=/etc/*.csv -TODIR=/home -OVERWRITE=yes
3. Copy all *.csv files from the directory /etc to the directory /home while changing their
extension to .txt:
a. OdiFileCopy -FILE=/etc/*.csv -TOFILE=/home/*.txt -OVERWRITE=yes
4. Copy the directory C:\odi and its sub-directories into the directory C:\Program Files\odi
a. OdiFileCopy -DIR=C:\odi "-TODIR=C:\Program Files\odi" -RECURSE=yes
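For comparison, the first and last of these copies can be expressed in plain Python with shutil. This is a sketch of the equivalent file operations, not how ODI executes OdiFileCopy; temporary directories stand in for /etc and /home.

```python
import shutil
import tempfile
from pathlib import Path

src = Path(tempfile.mkdtemp())  # stands in for /etc
dst = Path(tempfile.mkdtemp())  # stands in for /home
(src / "hosts").write_text("127.0.0.1 localhost\n")

# Equivalent of: OdiFileCopy -FILE=<src>/hosts -TOFILE=<dst>/hosts
shutil.copy(src / "hosts", dst / "hosts")

# Equivalent of: OdiFileCopy -DIR=<src> -TODIR=<dst>/odi -RECURSE=yes
shutil.copytree(src, dst / "odi")

print((dst / "hosts").read_text().strip())              # 127.0.0.1 localhost
print(sorted(p.name for p in (dst / "odi").iterdir()))  # ['hosts']
```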

Packages

1. A package is an organized sequence of steps that makes up a workflow. Each step performs a small task, and the steps are combined to make up the package.
2. A simple package example:
a. The package executes three mappings, and then archives some files.
b. If one of the four steps fails, an email is sent to the administrator.

Scenarios

1. A scenario is designed to put a source component (mapping, package, procedure, or variable) into production.
2. When a component's development is finished and tested, you can generate the scenario corresponding to its actual state. This operation takes place in Designer Navigator.
3. The scenario code (the language generated) is frozen, and all subsequent modifications of the components that contributed to creating it will not change it in any way.
4. It is possible to generate scenarios for packages, procedures, mappings, or variables.
5. Scenarios generated for procedures, mappings, or variables are single-step scenarios that execute the procedure, execute the mapping, or refresh the variable.
6. Once generated, the scenario is stored inside the work repository. The scenario can be exported and then imported into another repository (remote or not) and used in different contexts. A scenario can only be created from a development work repository, but can be imported into both development and execution work repositories.

Version Control

1. Oracle Data Integrator provides a comprehensive system for managing and safeguarding changes.
2. It also allows these objects to be backed up as stable checkpoints, and later restored
from these checkpoints.
3. These checkpoints are created for individual objects in the form of versions, and for
consistent groups of objects in the form of solutions.

Note: Version management is supported for master repositories installed on database engines
such as Oracle, Hypersonic SQL, and Microsoft SQL Server.

4. A version is a backup copy of an object. It is checked in at a given time and may be restored later.
5. Versions are saved in the master repository. They are displayed in the Version tab of the
object window.
6. The following objects can be checked in as versions

1. Projects, Folders
2. Packages, Scenarios
3. Mappings (including Reusable Mappings), Procedures, Knowledge Modules
4. Sequences, User Functions, Variables
5. Models, Model Folders
6. Solutions
7. Load Plans

Modifying Knowledge Modules

1. Knowledge Modules are templates of code that define integration patterns and their
implementation
2. They are usually written to follow Data Integration best practices, but can be modified for
project specific requirements
3. There are six types of knowledge modules:
4. RKM (Reverse-engineering Knowledge Modules) are used to perform customized reverse-engineering of data models for a specific technology. These KMs are used in data models.
5. LKM (Loading Knowledge Modules) are used to extract data from source systems (files,
middleware, database, etc.). These KMs are used in Mappings.
6. JKM (Journalizing Knowledge Modules) are used to create a journal of data
Modifications (insert, update and delete) of the source databases to keep track of the
changes. These KMs are used in data models and used for Changed Data Capture.
7. IKM (Integration Knowledge Modules) are used to integrate (load) data to the target
tables. These KMs are used in Mappings.
8. CKM (Check Knowledge Modules) are used to check that constraints on the sources
and targets are not violated. These KMs are used in data model’s static check and
interfaces flow checks.
9. SKM (Service Knowledge Modules) are used to generate the code required for creating
data services. These KMs are used in data models.

Commonly Used KMs

1. When processing happens between two data servers, a data transfer KM is required.
a. Before integration (Source → Staging Area): requires an LKM, which is always multi-technology.
b. At integration (Staging Area → Target): requires a multi-technology IKM.
2. When processing happens within a single data server, it is entirely performed by the server; a single-technology IKM is required.
3. LKMs and IKMs can therefore be combined in four possible ways.
4. Normally you never create new KMs from scratch, but sometimes you may need to modify existing ones.
5. When modifying KMs, duplicate existing steps and modify the copies. This prevents typos in the syntax of the odiRef methods.

Change Data Capture (CDC)

1. The purpose of Changed Data Capture is to allow applications to process changed data
only
2. Loads will only process changes since the last load
3. The volume of data to be processed is dramatically reduced
4. CDC is extremely useful for near real-time implementations, synchronization and Master Data Management.
5. In general, there are four types of CDC techniques:
1. Trigger based – ODI will create and maintain triggers to keep track of the
changes
2. Logs based – for some technologies, ODI can retrieve changes from the
database logs. (Oracle, AS/400)
3. Timestamp based – If the data is time stamped, processes written with ODI can
filter the data comparing the time stamp value with the last load time. This
approach is limited as it cannot process deletes. The data model must have been
designed properly.
4. Sequence number – if the records are numbered in sequence, ODI can filter the
data based on the last value loaded. This approach is limited as it cannot
process updates and deletes. The data model must have been designed
properly.
6. CDC in ODI is implemented through a family of KMs: the Journalizing KMs (JKMs).
7. These KMs are chosen and set in the model.
8. Once the journals are in place, the developer can choose in the mapping whether to use the full data set or only the changed data.
9. Changed Data Capture (CDC) is also referred to as journalizing.
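The timestamp-based technique above amounts to a simple filter on the last load time. The sketch below is a hypothetical Python/SQLite illustration (the ORDERS table and dates are invented), not ODI-generated code; note that a deleted row leaves no timestamp behind, which is why this technique cannot capture deletes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ORDERS (ORDER_ID INTEGER, LAST_UPDATE TEXT)")
conn.executemany("INSERT INTO ORDERS VALUES (?, ?)", [
    (1, "2024-01-01 09:00:00"),
    (2, "2024-01-02 09:00:00"),
    (3, "2024-01-03 09:00:00"),
])

# Timestamp recorded by the previous load run.
last_load = "2024-01-01 12:00:00"

# Only rows stamped after the last load are extracted.
changed = conn.execute(
    "SELECT ORDER_ID FROM ORDERS WHERE LAST_UPDATE > ? ORDER BY ORDER_ID",
    (last_load,),
).fetchall()

print([r[0] for r in changed])  # [2, 3]
```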

Journalizing Components

1. Journals: contain references to the changed records.
2. Capture processes: capture the changes in the source datastores, either by creating triggers on the data tables or by using database-specific programs to retrieve log data from data server log files.
3. Subscribers (applications, integration processes, and so on): use the changes tracked on a datastore or on a consistent set.

CDC Infrastructure in ODI

1. CDC in ODI depends on a journal table.
2. This table is created by the JKM and loaded by specific steps of the KM.
3. The table has a very simple structure:
1. The primary key of the table being checked for changes
2. A timestamp to keep the change date
3. A flag to allow a logical "lock" of the records
4. A series of views is created to join this table with the actual data.
5. When other KMs need to select changed data, they use these views instead of the tables.
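The journal infrastructure can be sketched in miniature. The J$/JV$ prefixes follow ODI's naming convention, but the DDL below is a hypothetical SQLite illustration of the trigger-based technique, not what a JKM actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (CUST_ID INTEGER PRIMARY KEY, NAME TEXT);

-- Journal table: the journalized table's primary key, a change date,
-- and a flag used as a logical lock while changes are consumed.
CREATE TABLE "J$CUSTOMER" (
    CUST_ID     INTEGER,
    JRN_DATE    TEXT DEFAULT (datetime('now')),
    JRN_LOCKED  INTEGER DEFAULT 0
);

-- Trigger-based capture: record the key of every change.
CREATE TRIGGER "T$CUSTOMER" AFTER INSERT ON CUSTOMER
BEGIN
    INSERT INTO "J$CUSTOMER" (CUST_ID) VALUES (NEW.CUST_ID);
END;

-- View joining the journal with the actual data; consuming KMs read
-- this view instead of the base table.
CREATE VIEW "JV$CUSTOMER" AS
    SELECT c.* FROM CUSTOMER c JOIN "J$CUSTOMER" j ON c.CUST_ID = j.CUST_ID;
""")

conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Alice')")
changes = conn.execute('SELECT CUST_ID, NAME FROM "JV$CUSTOMER"').fetchall()
print(changes)  # [(1, 'Alice')]
```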

Using CDC

1. Set a JKM in your model.
2. For each of the following steps, right-click a table to process just that table, or right-click the model to process all tables of the model.
3. Add the table to the CDC infrastructure: right-click the table and select Changed Data Capture / Add to CDC.
4. Oracle Data Integrator supports two journalizing modes:
1. Simple Journalizing tracks changes in individual datastores in a model.
2. Consistent Set Journalizing tracks changes to a group of the model's data
stores, taking into account the referential integrity between these datastores. The
group of datastores journalized in this mode is called a Consistent Set.
5. Simple vs. Consistent Set Journalizing
Simple Journalizing enables you to journalize one or more datastores. Each journalized
datastore is treated separately when capturing the changes.

This approach has a limitation, illustrated in the following example: You want to process
changes in the ORDER and ORDER_LINE datastores (with a referential integrity
constraint based on the fact that an ORDER_LINE record should have an associated
ORDER record). If you have captured insertions into ORDER_LINE, you have no
guarantee that the associated new records in ORDERS have also been captured.
Processing ORDER_LINE records with no associated ORDER records may cause
referential constraint violations in the integration process.

Consistent Set Journalizing provides the guarantee that when you have an
ORDER_LINE change captured, the associated ORDER change has been also
captured, and vice versa. Note that consistent set journalizing guarantees the
consistency of the captured changes. The set of available changes for
which consistency is guaranteed is called the Consistency Window.
Changes in this window should be processed in the correct sequence
(ORDER followed by ORDER_LINE) by designing and sequencing
integration interfaces into packages.

Although consistent set journalizing is more powerful, it is also more difficult to set up. It should be used when referential integrity constraints need to be ensured when capturing the data changes. For performance reasons, consistent set journalizing is also recommended when a large number of subscribers are required.

It is not possible to journalize a model (or datastores within a model) using both consistent set and simple journalizing.

6. For Consistent Set CDC, arrange the datastores in the appropriate order (parent/child relationship): in the model definition, select the Journalized Tables tab and click the Reorganize button.
7. Add the subscriber (the default subscriber is SUNOPSIS): right-click a table and select Changed Data Capture / Add Subscribers.
8. Start the journals: right-click a table and select Changed Data Capture / Start Journal.
