Oracle Data Integrator (ODI) is a fully-featured data integration product.
It makes building and maintaining Extract-Transform-Load (ETL) and Extract-Load-Transform (ELT) processes easy. From building single-table data loads to integrating multiple sources including flat files, ODI has a modular workflow and numerous technology drivers which speed development, reduce points of failure, and enable performance logging and analysis. There are also many enterprise ODI features such as code versioning, SOA integration, Change Data Capture, parallel loads, and integration with Oracle Enterprise Manager.
In this article, we’ll examine some ways to monitor and troubleshoot ODI performance. We’ll be using ODI Studio as our point of reference. The most basic way to see load execution status is to view the Operator log, which will show manual and scheduled execution results.
These windows are helpful for troubleshooting during development or occasional debugging of load failures
Drilling into any load step will provide not only the Success/Warning/Error status, but also row counts, the actual code that was sent to the RDBMS, and errors with tracebacks. These windows are helpful for troubleshooting during development or occasional debugging of load failures, but are not well positioned for automated performance monitoring or aggregate time series analysis.
We can do better by querying the ODI metadata repository schemas to obtain this same information via a SQL query (or load the log data into our data warehouse for further reporting). A simplified diagram of some of the useful repository tables follows (ODI 11g). Read access can be granted to a database user other than the repository user to maintain access control.
Now particular session information can be queried. Time histories of scenario execution statistics can also be extracted. For example, here is an analysis of a staging table filled by an ODI procedure.
This demonstrates that most days the procedure completes in a matter of minutes, but occasionally (1-3 times a week) the duration exceeds 2 hours. This longer duration is also increasing over time (due to a growing source dataset). In this case we looked at server resources to see if the load step was grinding to a halt from resource issues (from this or other concurrent processes). We also rewrote the step to optimize the SQL being executed.
In conclusion, ODI execution logs are a rich source of debugging as well as load performance monitoring. Querying the repository log tables can proactively identify and analyze trouble spots in the load, or troubleshoot unexpected execution results in a production environment. These are helpful tools to ensure production environments remain well-tuned and to respond efficiently to real-time events before they impact the client.