We like Pentaho a lot. Both DataCleaner and Pentaho are
commercial open source packages for various aspects of Data
Management. DataCleaner emphasizes the Data Quality aspects,
whereas Pentaho emphasizes the Business Intelligence aspects.
By combining DataCleaner and Pentaho you get tools that cover
both sides of a data processing challenge: data quality and
business intelligence.
How does it work?
DataCleaner and Pentaho are integrated in three specific
ways:

- DataCleaner is available in Pentaho Data Integration as a
  profiling tool. Apply it to the data of any step to
  interactively analyze the impact of your ETL work.
- Pentaho Data Integration transformations can be scheduled
  and monitored in DataCleaner's monitoring server, allowing
  you to integrate ETL into your data quality or MDM solution.
- DataCleaner analysis jobs can be invoked from
  Pentaho Data Integration as part of larger batch data processing.

Watch the video above to see how it works.
What can I use it for?
Pentaho and DataCleaner can be combined in many ways, for
example:

- Avoid bugs and mistakes by making it a practice to always
  profile your data before building ETL jobs.
- Use DataCleaner's scheduling and monitoring mechanism to
  orchestrate and manage your portfolio of ETL jobs and monitor
  their health, performance metrics, history and more.
- Use Pentaho Data Integration to pre-process your data for
  analytical and profiling usage by DataCleaner.
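
To give a feel for what "profiling your data before building ETL jobs" means in practice, here is a minimal sketch in plain Python. It is not DataCleaner's or Pentaho's API; the function name `profile_column` and the sample rows are hypothetical, chosen only to show the kind of metrics (row count, missing values, distinct values) a profiling step typically surfaces.

```python
from collections import Counter


def profile_column(rows, column):
    """Return simple profiling metrics for one column of a list of dict rows.

    Illustrative only: a real profiling tool reports far more than this.
    """
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing values.
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


# Hypothetical sample data standing in for the output of an ETL step.
rows = [
    {"id": 1, "country": "DK"},
    {"id": 2, "country": ""},
    {"id": 3, "country": "DK"},
    {"id": 4, "country": "NL"},
]

profile = profile_column(rows, "country")
print(profile)
```

Spotting the missing `country` value at this stage is the kind of finding that is cheaper to handle before an ETL job is designed around the data than after.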