
Topic: Profile Finder and Blanks

Topic by
boriskey

2012-07-10
17:21

Hi,

Is there a way to see blanks/nulls when using Profile Finder?

When I run Value Distribution, NULL counts are pushed to the top, so I can always see how many empty values a column has. With Profile Finder, though, the results are misleading because nulls are not shown.

I am very happy with Profile Finder, but because of this issue I need to do another pass with the Value Distribution analyzer just to see whether some columns have blank values.

Thanks!

Reply by
kasper

2012-07-11
09:22
Hi boriskey,

You're right. The Pattern Finder (note it's called 'Pattern', not 'Profile' :-)) ignores null values, basically because no pattern can be derived from a null.

But I guess we could add a feature in a future version to write a "<null>" pattern in case of null values. Do you think that would be better?

Alternatively, you could use a Null check filter to filter out nulls and, e.g., write all records with null values to a staging table.
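
To make the "<null>" pattern idea concrete, here is a minimal sketch in plain Python. It is not DataCleaner code and does not reflect how the Pattern Finder is implemented; the function names `to_pattern` and `pattern_counts`, the `a`/`9` token rules, and the extra "<blank>" bucket for empty strings are all assumptions made for illustration.

```python
from collections import Counter


def to_pattern(value):
    """Map one value to a simple pattern string (hypothetical rules)."""
    if value is None:
        # Report nulls as their own pattern instead of skipping them.
        return "<null>"
    if value.strip() == "":
        # Assumption: empty or whitespace-only strings get a separate bucket.
        return "<blank>"
    tokens = []
    for ch in value:
        if ch.isalpha():
            tokens.append("a")   # any letter
        elif ch.isdigit():
            tokens.append("9")   # any digit
        else:
            tokens.append(ch)    # keep punctuation and spaces as-is
    return "".join(tokens)


def pattern_counts(values):
    """Count how many values fall under each pattern, nulls included."""
    return Counter(to_pattern(v) for v in values)
```

With something like this, a column's nulls would show up as one more row in the pattern list, next to the patterns derived from real values.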

Reply by
boriskey

2012-07-11
15:26
Hi Kasper,

When working on ETL and data integration projects, it is very important to check for blanks/nulls. Most of the time the Pattern Finder is the best way to discover "surprises" in data, so seeing how many values are blank/null is a great help. As a matter of fact, I was working on my new project yesterday and found quite a few surprises just because I ran both the Pattern Finder and Value Distribution analyzers. I also noticed that I only looked at Value Distribution when I needed to make sure there were no nulls in there.

I'm not sure how a filter will help here, since I do not need to filter these values. I want to report on them so subject matter experts can take these numbers and explain why there are blanks/nulls in fields which are not supposed to have them.
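
The kind of report meant here could be sketched like this in plain Python. This is not DataCleaner code; the function name `null_blank_report` and the row format (one dict per record) are assumptions for illustration, and a missing key is treated the same as a null.

```python
def null_blank_report(rows, columns):
    """Count nulls and blank strings per column (hypothetical helper)."""
    report = {col: {"nulls": 0, "blanks": 0} for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)  # a missing key counts as null
            if value is None:
                report[col]["nulls"] += 1
            elif isinstance(value, str) and value.strip() == "":
                # Empty or whitespace-only strings count as blanks.
                report[col]["blanks"] += 1
    return report


rows = [
    {"name": "Ann", "city": None},
    {"name": "", "city": None},
    {"name": None, "city": "Oslo"},
]
report = null_blank_report(rows, ["name", "city"])
for col, counts in report.items():
    print(col, counts)
```

The per-column numbers are what would go to the subject matter experts, without removing any records from the data.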

Great job on the last update, BTW. Saving results to a file is a great step forward for DataCleaner!

Reply by
kasper

2012-07-16
12:06
Thank you for the feedback Boris. I've registered your request as a bugtracking ticket:

http://eobjects.org/trac/ticket/897.

Reply by
boriskey

2012-07-16
14:47
Kasper, I also noticed that when a column is all NULLs, DataCleaner raises an exception for that column and won't generate the profile for it. But that ticket should take care of this too, I guess. I was using a table in an MS SQL Server database, if it helps.

Thanks!
