Checking for Missing and Extra data on Questionnaires

E

Eric

Hi!

I am working on a database which did not have validation
rules set up initially; data-enterers simply entered data
directly into the tables with few validation checks.
Consequently there are many mistakes to be corrected: some
fields in the database now have 'missing' data and other
fields have 'extra' data were they should be empty.

The database actually contains questionnaire data - and
so 'missing' and 'extra' data typically happen as a result
of data-enterers or interviewers not complying with 'skip
patterns' e.g.


Questions may be:

Q1 Are you married?
Y/N
<if 'no' then skip to question 2>

Q1b When did you get married?
Y/N

Q2 What age are you?

That is, if someone reponds 'Yes' to the first question
they then they should be asked question 1b next. On the
other hand, if they answer 'No' to the first question they
should be asked Q2 next.

Where we have a 'yes' for Q1, and Q1b is not answered-
this would be considered a 'miss'

Where we have a 'no' for Q1, and Q1b IS answered - this
would be considered a 'more' or an 'extra' i.e. we have
information that we don't need.

So far I have been checking these 'skip' mistakes by
creating two queries for each question; a 'miss' query
which checks for missing data, and a 'more' query that
checks for 'more' data.

<I'll concentrate on 'missing' queries in the example
below>

Here's a typical miss query (in SQL):

SELECT [TABLE-G].ID, ([Q1]=1) And ([Q1A] Is Null) AS [Q1=1
AND Q1A IS NULL]
FROM [TABLE-G]
WHERE (((([Q1]=1) And ([Q1A] Is Null))=-1));

This gives me the The IDs of records where question Q1 has
been answered with 'yes' and Question Q1A has been
unanswered.

The problem is that I have to do many 'miss' checks like
this for each table (e.g. about 30).

Since looking up each query seperately is tedious, I have
attempted to bring all the queries into one large query.
(With all individual queries left joined to the TABLE-G
which contains the IDs of all records). However, ACCESS
crashes when the number of queries gets above about 16 or
so.

Also, I have about 40 tables to look at - so even if this
approach DID work - it would give me about 30*40 queries
in total just for the 'miss checks' - which would be very
awkward to work with.

Ideally I would like to have one query for each table -
which gives me all the missing IDs for relevant question
in which I am checking for misses (e.g. question Q1a above
(i.e. field).

I would be very grateful if someone could tell me how can
I:

a) Combine more then 16 'miss' queries together into a
single query, without ACCESS crashing?

(Is there something wrong with the way I have created the
query - with most of the 'work' being done in 'field' part
of the query, and not much in the 'criteria' section
Miss_query_1: ([Q1]=1) And ([Q1A] Is Null)..

b) Make a single SQL statement for each TABLE, that can
list of the 'missing' IDs for each question directly, with
needing to create a seperate query for each question?


Thanks in advance,

Eric
 
J

Jane Graves

The best I can think of is to have an IIF expression in a
separate field and sum the answers. For example, IIF(Q1=1
AND Q1A IS NULL,1,0). You can create a field for each of
your "miss" opportunities, then sum them all together and
filter for where that sum is greater than 0.

Jane
jane.graves @ infores.com
-----Original Message-----
Hi!

I am working on a database which did not have validation
rules set up initially; data-enterers simply entered data
directly into the tables with few validation checks.
Consequently there are many mistakes to be corrected: some
fields in the database now have 'missing' data and other
fields have 'extra' data were they should be empty.

The database actually contains questionnaire data - and
so 'missing' and 'extra' data typically happen as a result
of data-enterers or interviewers not complying with 'skip
patterns' e.g.


Questions may be:

Q1 Are you married?
Y/N
<if 'no' then skip to question 2>

Q1b When did you get married?
Y/N

Q2 What age are you?

That is, if someone reponds 'Yes' to the first question
they then they should be asked question 1b next. On the
other hand, if they answer 'No' to the first question they
should be asked Q2 next.

Where we have a 'yes' for Q1, and Q1b is not answered-
this would be considered a 'miss'

Where we have a 'no' for Q1, and Q1b IS answered - this
would be considered a 'more' or an 'extra' i.e. we have
information that we don't need.

So far I have been checking these 'skip' mistakes by
creating two queries for each question; a 'miss' query
which checks for missing data, and a 'more' query that
checks for 'more' data.

<I'll concentrate on 'missing' queries in the example
below>

Here's a typical miss query (in SQL):

SELECT [TABLE-G].ID, ([Q1]=1) And ([Q1A] Is Null) AS [Q1=1
AND Q1A IS NULL]
FROM [TABLE-G]
WHERE (((([Q1]=1) And ([Q1A] Is Null))=-1));

This gives me the The IDs of records where question Q1 has
been answered with 'yes' and Question Q1A has been
unanswered.

The problem is that I have to do many 'miss' checks like
this for each table (e.g. about 30).

Since looking up each query seperately is tedious, I have
attempted to bring all the queries into one large query.
(With all individual queries left joined to the TABLE-G
which contains the IDs of all records). However, ACCESS
crashes when the number of queries gets above about 16 or
so.

Also, I have about 40 tables to look at - so even if this
approach DID work - it would give me about 30*40 queries
in total just for the 'miss checks' - which would be very
awkward to work with.

Ideally I would like to have one query for each table -
which gives me all the missing IDs for relevant question
in which I am checking for misses (e.g. question Q1a above
(i.e. field).

I would be very grateful if someone could tell me how can
I:

a) Combine more then 16 'miss' queries together into a
single query, without ACCESS crashing?

(Is there something wrong with the way I have created the
query - with most of the 'work' being done in 'field' part
of the query, and not much in the 'criteria' section
Miss_query_1: ([Q1]=1) And ([Q1A] Is Null)..

b) Make a single SQL statement for each TABLE, that can
list of the 'missing' IDs for each question directly, with
needing to create a seperate query for each question?


Thanks in advance,

Eric
.
 
G

Guest

Hi Jane!

A belated thanks!

Your advice was exactly what I needed!

Eric

-----Original Message-----
The best I can think of is to have an IIF expression in a
separate field and sum the answers. For example, IIF (Q1=1
AND Q1A IS NULL,1,0). You can create a field for each of
your "miss" opportunities, then sum them all together and
filter for where that sum is greater than 0.

Jane
jane.graves @ infores.com
-----Original Message-----
Hi!

I am working on a database which did not have validation
rules set up initially; data-enterers simply entered data
directly into the tables with few validation checks.
Consequently there are many mistakes to be corrected: some
fields in the database now have 'missing' data and other
fields have 'extra' data were they should be empty.

The database actually contains questionnaire data - and
so 'missing' and 'extra' data typically happen as a result
of data-enterers or interviewers not complying with 'skip
patterns' e.g.


Questions may be:

Q1 Are you married?
Y/N
<if 'no' then skip to question 2>

Q1b When did you get married?
Y/N

Q2 What age are you?

That is, if someone reponds 'Yes' to the first question
they then they should be asked question 1b next. On the
other hand, if they answer 'No' to the first question they
should be asked Q2 next.

Where we have a 'yes' for Q1, and Q1b is not answered-
this would be considered a 'miss'

Where we have a 'no' for Q1, and Q1b IS answered - this
would be considered a 'more' or an 'extra' i.e. we have
information that we don't need.

So far I have been checking these 'skip' mistakes by
creating two queries for each question; a 'miss' query
which checks for missing data, and a 'more' query that
checks for 'more' data.

<I'll concentrate on 'missing' queries in the example
below>

Here's a typical miss query (in SQL):

SELECT [TABLE-G].ID, ([Q1]=1) And ([Q1A] Is Null) AS [Q1=1
AND Q1A IS NULL]
FROM [TABLE-G]
WHERE (((([Q1]=1) And ([Q1A] Is Null))=-1));

This gives me the The IDs of records where question Q1 has
been answered with 'yes' and Question Q1A has been
unanswered.

The problem is that I have to do many 'miss' checks like
this for each table (e.g. about 30).

Since looking up each query seperately is tedious, I have
attempted to bring all the queries into one large query.
(With all individual queries left joined to the TABLE-G
which contains the IDs of all records). However, ACCESS
crashes when the number of queries gets above about 16 or
so.

Also, I have about 40 tables to look at - so even if this
approach DID work - it would give me about 30*40 queries
in total just for the 'miss checks' - which would be very
awkward to work with.

Ideally I would like to have one query for each table -
which gives me all the missing IDs for relevant question
in which I am checking for misses (e.g. question Q1a above
(i.e. field).

I would be very grateful if someone could tell me how can
I:

a) Combine more then 16 'miss' queries together into a
single query, without ACCESS crashing?

(Is there something wrong with the way I have created the
query - with most of the 'work' being done in 'field' part
of the query, and not much in the 'criteria' section
Miss_query_1: ([Q1]=1) And ([Q1A] Is Null)..

b) Make a single SQL statement for each TABLE, that can
list of the 'missing' IDs for each question directly, with
needing to create a seperate query for each question?


Thanks in advance,

Eric
.
.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top