So far the reception of DataCleaner 3.6 has been great, and we are happy to see all the interest in our latest software releases. That interest quickly translates into improvement ideas and feature requests, which means we already have new and updated stuff ready.
So let's look at what we have in store for today, when we announce the release of DataCleaner 3.6.2!
- We've made several improvements to the Duplicate Detection feature. Several minor bugs were fixed and matching quality was improved - both for the initial "potential duplicates" training set generation, and for the final building of matching rules.
- The progress bar of a running job in the desktop UI has been beautified and made more interactive - it will set colors and update itself while the job is running.
- In clustered setups, jobs can now be cancelled across the cluster. No more waiting for all the slave instances to finish their jobs - they will cancel within seconds if the master node tells them to.
- We've added transformations for URL encoding and HTML encoding. These are a great utility when DataCleaner is used to prepare strings for insertion into URLs or web pages.
- For DataCleaner enterprise edition, our Hadoop integration is being improved a lot. We have fixed several minor issues here.
- Datastores configured in the desktop UI are now automatically persisted in the conf.xml file, making it easier to manage datastores outside of the UI as well.
- A bug pertaining to the "Merge Duplicates" feature from EasyDQ was fixed.
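To give a feel for what URL encoding and HTML encoding do to a string, here is a minimal sketch in plain Java. It is not DataCleaner's own implementation: the class name `EncodingSketch` and its two helper methods are made up for illustration, and the only library call used is the JDK's `java.net.URLEncoder`. The HTML part escapes just the five characters that most commonly cause trouble in markup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical illustration class; not part of DataCleaner.
public class EncodingSketch {

    // URL-encodes a value as UTF-8 form data (spaces become '+').
    static String urlEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Escapes the characters that have special meaning in HTML.
    static String htmlEncode(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(urlEncode("data quality & more"));
        System.out.println(htmlEncode("<b>Tom & Jerry</b>"));
    }
}
```

In DataCleaner itself you would of course just add the transformation to your job; the sketch only shows the kind of output to expect in the resulting column.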
So all in all, a lot of cool but minor improvements. Go get the latest DataCleaner