Part VII. Invoking DataCleaner jobs

Table of Contents

21. Command-line interface
Usage scenarios
Executing an analysis job
Listing datastore contents and available components
Parameterizable jobs
Dynamically overriding configuration elements
22. Apache Hadoop and Spark interface
Hadoop deployment overview
Setting up Spark and DataCleaner environment
Upload configuration file to HDFS
Upload job file to HDFS
Upload executables to HDFS
Launching DataCleaner jobs using Spark
Limitations of the Hadoop interface