Frequently Asked Questions (FAQ)
1. Interoperability and Connectivity
1.1. Can DataCleaner interact with more than one data
source (database, file, etc.)?
Yes. A DataCleaner job usually has a single source, but it
can incorporate other sources (e.g. via a 'Table Lookup' or by using
a 'Composite datastore') and write to other destinations (e.g. via
'Insert into Table' or similar components).
1.2. Can I use DataCleaner with [my relational database]?
Our database connectivity layer is based on Apache MetaModel. This means
that many data sources are supported. We specifically test these databases:
- Microsoft SQL Server
- IBM DB2
- Apache Derby, H2, HSQLDB
Furthermore, the layer is designed so that we generally support your
database if the following two criteria are met:
- The database has a JDBC-compliant database driver.
- The database supports SQL-92.
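The two criteria above can be probed with the standard JDBC API before you point DataCleaner at a database. The following is a minimal sketch, not DataCleaner's or Apache MetaModel's own check: the class name JdbcCriteriaProbe and its methods are hypothetical, and it assumes that the driver's self-reported entry-level SQL-92 support is an acceptable proxy for "supports SQL-92". It also assumes the JDBC driver for your database is already on the classpath.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper for illustration; not part of DataCleaner or MetaModel.
final class JdbcCriteriaProbe {

    /** Both criteria from the FAQ must hold for the "generally supported" case. */
    static boolean meetsBothCriteria(boolean jdbcCompliantDriver, boolean supportsSql92) {
        return jdbcCompliantDriver && supportsSql92;
    }

    /** Builds a one-line summary; kept separate so it can be used without a database. */
    static String summarize(String driverName, boolean jdbcCompliantDriver, boolean supportsSql92) {
        return driverName + ": jdbcCompliant=" + jdbcCompliantDriver
                + ", sql92EntryLevel=" + supportsSql92
                + ", meetsBothCriteria=" + meetsBothCriteria(jdbcCompliantDriver, supportsSql92);
    }

    /**
     * Connects to the given JDBC URL and reports what the driver claims about itself.
     * Assumption: entry-level SQL-92 is used as the proxy for the SQL-92 criterion.
     */
    static String probe(String jdbcUrl, String user, String password) throws SQLException {
        // Throws SQLException if no registered driver accepts the URL.
        Driver driver = DriverManager.getDriver(jdbcUrl);
        try (Connection connection = DriverManager.getConnection(jdbcUrl, user, password)) {
            DatabaseMetaData metaData = connection.getMetaData();
            return summarize(metaData.getDriverName(),
                    driver.jdbcCompliant(),
                    metaData.supportsANSI92EntryLevelSQL());
        }
    }
}
```

Calling JdbcCriteriaProbe.probe with your own JDBC URL and credentials returns a single summary line. Both values are self-reported by the driver, so a negative answer is a hint rather than proof that the database will not work.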
Even if these requirements are not fully met, we encourage you
to try it out. The connectivity layer is quite flexible and tolerant
of minor issues in, e.g., the driver's JDBC compliance or minor
1.3. Does DataCleaner run in virtualized environments
(such as VMware)?
Yes. Our only requirement is that Java 7 or later is installed.
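The Java 7 requirement can be verified on any machine, virtual or physical, by inspecting the standard java.version system property. The sketch below is only an illustration: the class name JavaVersionCheck and its methods are hypothetical and not part of DataCleaner. It handles both the legacy version scheme ("1.7.0_80") and the scheme used since Java 9 ("11.0.2", "17").

```java
// Hypothetical helper for illustration; not part of DataCleaner.
final class JavaVersionCheck {

    /** Minimum major version named in the FAQ answer. */
    static final int MINIMUM_MAJOR = 7;

    /** Extracts the major version from a java.version string. */
    static int majorVersion(String javaVersion) {
        // Split on the separators that occur in version strings: . _ - +
        String[] parts = javaVersion.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        // Legacy scheme: "1.7.0_80" means Java 7, so the second part is the major version.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        // Modern scheme (Java 9 and later): the first part is the major version.
        return first;
    }

    /** True if the given version string satisfies the Java 7 minimum. */
    static boolean isSupported(String javaVersion) {
        return majorVersion(javaVersion) >= MINIMUM_MAJOR;
    }
}
```

To check the running JVM, pass System.getProperty("java.version") to JavaVersionCheck.isSupported.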
1.4. Which Operating Systems does DataCleaner run on?
We ensure that DataCleaner works on Windows, Linux and
Mac OS X. Furthermore, any Operating System with full Java support
should work as well.
Note that we have a few tips for installing on Mac OS X.
2. Commercial Editions
2.1. What are the functional differences between
community and commercial editions?
There are multiple functional differences. Here are the
most important ones:
- Quick Start wizards. A very useful facility to get your
users up-and-running as quickly as possible. They function as both
a shortcut to get started and as a way to learn by example.
- Duplicate Detection. Identify and merge duplicate records
in your data sets. For more information, see our dedicated information page about Deduplication.
- Contact Data enrichment and suppression. This covers
functions that allow you to identify deceased people in your
database or people who have moved to new addresses, subscribed to
do-not-mail registries and more.
- Email and Phone correction. These two functions allow you
to verify the validity of email addresses and phone numbers, and to
ensure that they are properly formatted. They also provide additional
information about the line types of phone numbers, flag email
addresses with common spelling mistakes and more.
- National Identifier check. A set of transformations that
validate and extract information from social security numbers,
company registration numbers etc.
- Exportable Data Quality KPIs/metrics. Analysis results can
be exported to produce machine-readable files with Data Quality
- An easy to use installer application that lets you decide
the functions that you need without having to deal with external
- Pluggable security layer in DataCleaner monitor - use your
LDAP or other security backend to control credentials and user
- Multi-Tenancy in the DataCleaner monitor - make separate
workspaces for separate teams, customers or other groupings.
For more information, see the Compare the
2.2. Why should I buy a commercial edition of
DataCleaner?
If you're using DataCleaner for commercial purposes, it's
really the only right thing to do. The commercial editions are
quality-assured and tested more rigorously. With the commercial
editions you get more functionality and a support organization that
will help you if you run into trouble.
2.3. Who am I doing business with then?
DataCleaner is developed at Human Inference, a
subsidiary of Neopost.
Human Inference is based in the Netherlands, Germany and Denmark,
while Neopost is a global company.
As a DataCleaner customer you will probably come across both
company brands. This is because Human Inference is increasingly
offering its products through the global presence of Neopost, and
Neopost is increasingly being used as the company brand for Human
Inference in regions where we previously did not have much presence.
2.4. Can I get support for the Community Edition?
No. When using the Community Edition you have to rely on the
community for help. This often works, but you need more patience and
you cannot hold anyone accountable for fixing your issues.
2.5. Who should use the Community Edition then?
We recommend the Community Edition for developers, curious
users and for hobbyists. We also encourage people to get the latest
and greatest Community Edition if they want to test new functions as
With its limited set of features the Community Edition may
also work as an ad-hoc tool for certain professional tasks, but
remember that you're missing out on a lot of nice functionality and
that you're on your own if you run into trouble.