2012-09-20: DataCleaner 3 released

Dear friends, users, customers, developers, analysts, partners and more!

After an intense period of development and a long wait, it is our pleasure to finally announce that DataCleaner 3 is available. We at Human Inference invite you all to our celebration! Impatient to try it out? Go download it right now!

So what is all the fuss about? Well, in all modesty, we think that with DataCleaner 3 we are redefining 'the premier open source data quality solution'. With DataCleaner 3 we've embraced a whole new functional area of data quality, namely data monitoring.

Traditionally, DataCleaner has its roots in data profiling. Over the years, we've added several related functions: transformations, data cleansing, duplicate detection and more. With data monitoring we basically deliver all of the above, but in a continuous environment for analyzing, improving and reporting on your data. Furthermore, we deliver these functions in a centralized web-based system.

So how will the users benefit from this new data monitoring environment? We've tried to answer this question using a series of images:

Monitor the evolution of your data:

Share your data quality analysis with everyone:

Continuously monitor and improve your data's quality:

Connect DataCleaner to your infrastructure using web services:


The monitoring web application is a fully fledged environment for data quality, covering several functional and non-functional areas:
  • Display of timeline and trends of data quality metrics
  • Centralized repository for managing and containing jobs, results, timelines etc.
  • Scheduling and auditing of DataCleaner jobs
  • Providing web services for invoking DataCleaner transformations
  • Security and multi-tenancy
  • Alerts and notifications when data quality metrics are out of their expected comfort zones.

Naturally, the traditional desktop application of DataCleaner continues to be the tool of choice for expert users and one-time data quality efforts. We've even enhanced the desktop experience quite substantially:
  • There is a new Completeness analyzer which is very useful for simply identifying records that have incomplete fields.
  • You can now export DataCleaner results to nice-looking HTML reports that you can give to your manager, or send to your XML parser!
  • The new monitoring environment is also closely integrated with the desktop application. Thus, the desktop application now has the ability to publish jobs and results to the monitor repository, and to be used as an interactive editor for content already in the repository.
  • New date-oriented transformations are now available: Date range filter, which allows you to subset datasets based on date ranges, and Format date, which allows you to format a date using a date mask.
  • The Regex Parser (which was previously only available through the ExtensionSwap) has now been included in DataCleaner. This makes it very convenient to parse and standardize rich text fields using regular expressions.
  • There's a new Text case transformer available. With this transformation you can easily convert between upper/lower case and proper capitalization of sentences and words.
  • Two new search/replace transformations have been added: Plain search/replace and Regex search/replace.
  • The user experience of the desktop application has been improved. We've added several in-application help messages, made the colors look brighter and clearer and improved the font handling.
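
To give a feel for what these new components do, here is a minimal conceptual sketch in plain Python. It does not use DataCleaner's own API: the record layout, field names and helper functions are all hypothetical, and Python's strftime masks are only a stand-in for whatever date mask syntax the Format date transformation accepts.

```python
import re
from datetime import date

# Hypothetical sample records; the field names are invented for illustration.
records = [
    {"name": "ann SMITH", "phone": "555-0101", "joined": date(2012, 3, 14)},
    {"name": "bob jones", "phone": "", "joined": date(2011, 11, 2)},
    {"name": "Cy Young", "phone": None, "joined": date(2012, 9, 20)},
]


def is_incomplete(record, fields):
    """Completeness check: True if any listed field is None or blank."""
    return any(
        record.get(f) is None or str(record.get(f)).strip() == ""
        for f in fields
    )


def in_date_range(record, field, start, end):
    """Date range filter: True if the date lies in [start, end], inclusive."""
    return start <= record[field] <= end


# Completeness: which records have an empty name or phone?
incomplete = [r["name"] for r in records if is_incomplete(r, ["name", "phone"])]

# Date range filter, followed by formatting with a date mask.
in_2012 = [
    r for r in records
    if in_date_range(r, "joined", date(2012, 1, 1), date(2012, 12, 31))
]
formatted = [r["joined"].strftime("%d/%m/%Y") for r in in_2012]

# Text case: proper capitalization of each word.
proper = [r["name"].title() for r in records]

# Regex search/replace: strip every non-digit from the phone field.
digits_only = [re.sub(r"\D", "", r["phone"] or "") for r in records]
```

In the desktop application these steps are configured as components of a job rather than written as code; the sketch only mirrors the logic each component applies to a record.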

More than 50 features and enhancements were implemented in this release, in addition to several hundred upstream improvements from the projects that DataCleaner depends on.

We hope you will enjoy everything that is new about DataCleaner 3. And do watch out for follow-up material in the coming weeks and months. We will be posting more and more online material and examples to demonstrate the wonderful new features that we are very proud of.

Comments (2)

Comment by Shiva, 2012-09-21 09:19
Hi Kasper,
What you have just now put into the hands of data professionals world-wide who are serious about data quality initiatives in their organizations is a comprehensive yet low-footprint tool that can complement and potentially increase the value of their already-deployed multi-million dollar MDM deployments.

As a practicing CDI/MDM professional and with a previous experience of building a generic, multi-domain MDM platform, I can see the value of this product release and look forward to using this in my upcoming customer engagements! Apart from its utility in data quality operations to enhance the quality of core data elements, it is surely a great tool to produce information that can be used to build a strong business case for investing in larger data management initiatives such as MDM projects.

I think it is time that Enterprise Data Governors/Managers, practicing DQ and MDM professionals in Centers of Excellence in all Consulting Organizations should seriously evaluate and include this tool as part of their arsenal of weaponry to combat the challenges of data hygiene which are at the heart of many inefficient business processes!

Comment by beno, 2012-09-24 09:11
Congratulations Human Inference on really delivering an incredible open source app for information quality. Looking forward to trying out the new monitoring features.
