Who uses DataCleaner?

DataCleaner has been deployed on thousands of machines around the world, spanning almost every kind of industry and use-case for a data quality tool. On this page we've collected a few cases and testimonials of where DataCleaner is being used in practice. If you are actively using and enjoying DataCleaner, we would appreciate your feedback.

DataCleaner & Stratebi

At Stratebi we use DataCleaner and train our clients in the use of DataCleaner, because it’s a great toolkit for data quality work – especially in combination with the Pentaho BI suite. With a tool like DataCleaner you avoid a lot of the common mistakes in doing Data Integration and ETL. You are also sometimes able to unlock new business value or cost savings because you’re able to identify strange behaviour in your database and improve or utilize this in upstream processes.

- Emilio Arias, Stratebi Business Solutions, Stratebi.

DataCleaner & Tecnológico de Monterrey

We have started a new course this semester introducing Data Science to our Bachelor of Science students at Tecnológico de Monterrey in Monterrey, Mexico. We found DataCleaner an excellent tool for stressing the relevance of data quality as a key issue in the data mining and data science process.

- Dr. Juan C. Lavariega, Ph.D., Professor, Department of Computer Science, Tecnológico de Monterrey.

DataCleaner & Datalytics

Datalytics is a leading business consulting firm in the South American region. We specialize in Data Integration, Data Cleansing, Business Intelligence and Data Mining solutions. We've been following your products for some time now, since I believe it's excellent software and the perfect match for our offering.

- Andrés Eyherabide, Professional Services Manager, Datalytics.

DataCleaner & Platon

At Platon we use DataCleaner in our Information Management projects as a handy and powerful "swiss army knife" for activities related to Data Quality Analysis, Data Profiling and exploring customer data in general. We have found that the tool provides us with a lot of important analysis features in this process.

Among the important features for us is the unintrusive nature of DataCleaner and its ability to connect to a wide range of data sources.

Being a free and open source tool, DataCleaner is a choice which allows us to focus on our customers' problems without having the hassle of picking a tool. On various occasions we have also cooperated with the community around specific DataCleaner features and issues.

- Asbjørn Leeth, Senior Consultant, Platon.

DataCleaner & BestBrains

BestBrains is a Danish company, specializing in Agile and Lean Software Development. Human Inference helped in implementing a solution based on DataCleaner for our customer data quality.

Our contacts database had a long history as simple comma-separated files, but it was time to get an idea of the data quality and streamline the data before moving on to a full CRM system.

Using DataCleaner and its duplicate detection, we identified around 10% redundant contact entries. Furthermore, addresses were validated and standardized, so the data quality was lifted, making it possible for BestBrains to improve our communication and relations with our large network.

- Jesper Thaning, Partner, BestBrains.

DataCleaner & InfoCepts

We used DataCleaner for testing flat files; we appreciate the simplicity and speed of the tool. We wish the DataCleaner team success in the venture ahead.

- Sumit Kumar Agrawal, Associate Solution Architect, InfoCepts.

DataCleaner & Pentaho Solutions (book)

Having used DataCleaner for data profiling on a couple of occasions during data warehouse projects, it was one of the obvious choices we made when we started laying out the concepts and content for the Pentaho Solutions book. Jos was the first to discover DataCleaner when he was researching available open source data quality tools for an article in the Dutch Database Magazine and immediately favored it over other available solutions. Since then he's been using it at client engagements as well. Roland also immediately liked the product, its ease of use and powerful regular expression validation options. We hope that using it as our data profiling tool of choice in 'Pentaho Solutions' will trigger more people to try and use DataCleaner for their profiling and data quality tasks!

- Roland Bouman and Jos van Dongen, authors of 'Pentaho Solutions'.

DataCleaner & Human Inference

When building up our extensive knowledge base about worldwide company names, persons' names and other input for our text interpretation and matching products, we use DataCleaner for the majority of the filtering, cleansing and categorization work. Our setup is a mix of the built-in components shipped with DataCleaner and a set of plugins and extensions particular to our processing flow, which comprises more than 30 high-level verification steps to correctly categorize a particular name or name-part.

Before we had DataCleaner, our tool support was based on a variety of tools that all played their role, but none had the full power of execution. This made the process very tedious and time consuming. With DataCleaner we have a platform that we can build upon as well as take advantage of all the built-in analysis functionality.

- Jan-Arie Zwaan, Language & Culture Specialist, Human Inference.

DataCleaner & FAP Europe

Our industry relies on client specific data sets that present themselves in multiple formats. These formats often vary from year to year, with and without warning. DataCleaner allows us to accurately predict the impact of these changes and manage new data sets accordingly.

Without DataCleaner, our only way of determining data validity would be after a transformation script is written and data is uploaded. If the data then turns out to be erroneous, that work is wasted, which is a costly way to validate the data.

DataCleaner allows us to analyse the data and pass a validation mark against it before embarking on an invalid transformation process.

The collaborative participation aspect of DataCleaner brings together a vast array of ideas and offers us new ways of using the software.

As part of the DataCleaner development, we can work on improvements at our own pace. We can work on an idea and test it locally before submitting it as an improvement or fix.

- Michael Womersley-Carter, Data Analyst, FAP Europe.

DataCleaner & ITMATTER Inc.

ITMATTER Inc. uses DataCleaner as a part of our software development life cycle for data-driven enterprise applications. We rely on the data profiler and the data validator components in the Research & Development and Quality Assurance iteration phases of our agile development methodology.

Overall, DataCleaner fits into our philosophy of leveraging open source software wherever possible. The DataCleaner development team is highly motivated and responsive to their end-user and developer community, taking a collaborative approach to ongoing support and new feature development.

- Daniel Fisla, CEO/President, ITMATTER Inc.

DataCleaner & Ben Bor

As an Information Architect I often implement solutions that integrate data from several sources, internal and external. This could be a data warehouse, a reporting system or any other large information integration programme. Without exception, the biggest problem in each of these programmes is Data Quality. Without it, the business cannot rely on the information the system is supplying. I always insist on ensuring Data Quality in any project I am involved in. Open-source data profiling is a huge help, in that I can install and run a profiler in a day (compared to the weeks it would take to evaluate, agree, purchase and install a ‘commercial’ product). This gives me the ability to profile everything before it hits the integrated data store (or data warehouse).

I have evaluated several open-source data profiling solutions and find DataCleaner to be the best. It provides me with an easy way to run most of the profiling tasks that I need. The project team is very responsive: they have implemented several of my suggestions in a very short time, thus making the product even more suitable to my needs. I currently use DataCleaner for each and every project I am involved in.

- Ben Bor, Independent Information Architect.