Excel data consolidation question

D

drodysill

I had a question -- I have a series of lists of users based on a userid

text string and their name, some of which are duplicates. What i've
done is import the lists into excel and then do a sort by userid.
Additionally, I made a note of which database they're each in, using a
different column for each of the four databases. I first copied all of
the entries from "users1" and then copied "users1" into every row as
column C. Next, I copied the "users2" list and for those, put "user2"
in column D...and so on up to users4.

Now, what I want to do is consolidate the entries, making sure there is

only ONE entry PER userid, but say if a user is in "user2" and "user3"
and thus has those entries in column D and E respectively, I want to
note both. Basically, I just want one row per userid and for each
userid, that row should include all "user(x)" column entries that they
exist in. Any ideas? I could also use access if that would be easier to

do this.


If it helps, I'm eventually going to compile all of these spreadsheets
(which are seperated by computer application they have access to...and
each of these "user(x)" subentries are different authentication
databases within that program)...and attempt to spit out a master list
that has one row per userid with all of the programs they have access
to (we wouldn't need to worry about the 'user(x)' subentries at that
point though, just if they exist in ANY of them).


Thanks much in advance or let me know if something isn't clear...


-Drew
 
A

aaron.kempf

Excel isn't a database.

Spit on anyone that uses Excel instead of a database

it isn't the right tool for anything.

-Aaron
 
A

atomiks

Ok true...very helpful in this case. It IS however good for easily
entering data that can be later compiled...which was my actual question
(note use of "I could also use MS Access"). Anyone have any real ideas?
 
H

Harlan Grove

(e-mail address removed) wrote...
I had a question -- I have a series of lists of users based on a userid
text string and their name, some of which are duplicates. What i've
done is import the lists into excel and then do a sort by userid.
Additionally, I made a note of which database they're each in, using a
different column for each of the four databases. I first copied all of
the entries from "users1" and then copied "users1" into every row as
column C. Next, I copied the "users2" list and for those, put "user2"
in column D...and so on up to users4.

Now, what I want to do is consolidate the entries, making sure there is
only ONE entry PER userid, but say if a user is in "user2" and "user3"
and thus has those entries in column D and E respectively, I want to
note both. Basically, I just want one row per userid and for each
userid, that row should include all "user(x)" column entries that they
exist in. Any ideas? I could also use access if that would be easier to
do this.
....

So each list contains just user ID (UID) and name (UNAME) fields? And
the consolidated table's col C through F would also contain the UID?

You need to start with a master list of all UIDs, which means it'd be
easier to add a 3rd column to each table for the list ID (LID), so
user1 would have LID 1, user2 LID 2, etc. Then append the augmented
lists into a single long list with 3 columns. I'll call it SLL. Sort it
in ascending order first by UID then LID.

At that point you could use Advanced Filters to extract only one
instance of each UID and UNAME in SLL. Then you could add formulas to
the columns to the right of the distinct UID and UNAME to indicate the
UIDs' presence in each of the original lists. If the topmost UID and
UNAME were in X2:Y2, try the formulas

AD2:
=MATCH(X2,INDEX(SLL,0,1),0)

AE2:
=COUNTIF(INDEX(SLL,AD2,1):INDEX(SLL,AD2+3,1),X2)-1

Z2:
=IF(COUNTIF(INDEX(SLL,AD2,3):INDEX(SLL,AD2+AE2),COLUMNS($Z2:Z2)),X2,"")

Fill Z2 right into AA2:AC2, then select Z2:AE2 and either double click
on the fill handle or fill down into the same rows as the UIDs and
UNAMEs in cols X and Y.

If you could live with 1s or 0s indicating whether or not a particular
UID appears in a given table, you could use SLL to generate a pivot
table with UID and UNAME as the row variables and LID as the column
variable with operation COUNT.
 
A

aaron.kempf

Excel isn't GOOD for data entry.

it isn't GOOD for consolidating multiple spreadsheets into a single
data store.

Databases are a much better long term solution.

Excel is a disease. Are you a leper?

-Aaron
 
A

atomiks

Yes, each list contains the UID and UNAME fields, in addition to a few
others but they aren't important for this process. Column C through F
on the consolidated table just note that UID/UNAME exists on the
respective source (Column C being user_ss4 and Column D being
EA_list_7, for example). It can really be anything. What I've been
doing is pasting each of the lists, leaving a space or two between each
list. The first list of UID/UNAMEs I'd put "user_ss4" in column C, then
for the next list i'd put "EA_list_7" in column D, etc...just so if I
eventually pulled them all together, there wouldn't be more than one
item in each column. I'll try out your suggestions now, just thought
I'd note this.

Otherwise, if Access would work better, I have no aversion to using it
at all.
 
H

Harlan Grove

(e-mail address removed) wrote...
Excel isn't GOOD for data entry.

Only because you don't really know how to use it.
it isn't GOOD for consolidating multiple spreadsheets into a single
data store.

Again, you don't know how to use it.
Databases are a much better long term solution.

Debatable. Depends on how easy it is to access the information. The OP
mentions different authentication *databases*. If they're truly
database tables, then best to use whatever database that is.
 
A

aaron.kempf

Harlan

you're just flat out full of shit.

Don't shoot the messenger-- just because I speak the truth.

A) manager has each employee fill out their hours in a spreadsheet
B) each employee does this and emails it into the manager.

how does the manager import 30 spreadsheets and consolidate numbers out
of all 30 workbooks?

please give me details... you are absolutely full of crap harlan.

C) what happens when someone enters 02042006 instead of 2/4/2006 in a
column?
Excel CHOKES on this simple type of data mismatch
D) what happens when Susie; over in marketing-- wants to take vacation
days.
She adds a column called 'vacation hours' and emails it to her
boss.
Seems like a perfectly natural thing to do.

SO HOW DOES EXCEL IMPORT FROM 30 DIFFERENT WORKBOOKS THEN HARLAN?

Don't talk shit about something WHEN YOU'RE WRONG.

Excel is a disease; and I spit on anyone that uses Excel for anything.

Spit in your face.

There are better ways-- email someone a form in Access; it gets
converted to a DAP (plain HTML); they enter all their data and
presto-chango-- I am ALREADY DONE.

So let's compare your 29 steps -- with rampant points of failure-- to
my much much much simpler and scalable solution.

HOW DO YOU LIKE ME NOW, PUNK?

Grow some balls and learn a real program; Harlan.

What a waste of time.

'yeah; it's more efficient to consolidate 30 spreadsheets than to keep
the data in a single database'

eat shit harlan you spew nothing but worthless garbage.

Excel is a waste of time.. A bigger time-waster than Internet Explorer
and Solitaire combined.

ANY COMPANY THAT WANTS TO THRIVE-- TO SUCCEED-- SHOULD UNINSTALL EXCEL
FROM EVERY MACHINE IN THEIR COMPANY.. AND WORK TO BUILD THE
INFRASTRUCTURE THAT THEY REALLY NEED.

uninstall excel from every machine at every company.

I have a free solution that is scalable and mulitple people can edit
their own data at the same time.

Harlan has a piece of shit technology that is inherently single user;
and not scalable enough to begin to meet the needs it's users.

Excel is crap.

I have a better spreadsheet than Excel; it doesn't require
installation; it doesn't require copying and pasting hundreds of work
books.

It has version control.
It has an audit trail.

I have pivotTables that are a thousand times better than your silly XLS
pivotTables.
 
H

Harlan Grove

(e-mail address removed) wrote...
....
how does the manager import 30 spreadsheets and consolidate numbers out
of all 30 workbooks?

http://www.google.com/[email protected]

Use the first approach.
C) what happens when someone enters 02042006 instead of 2/4/2006 in a
column?

They get a bunch of #'s rather than a date. If the spreadsheet is
well-written (so something you couldn't manage), the date could
validated using a formula like

=COUNT(1/(ABS(YEAR(DateEntry)-1960)<=60))

which evaluates to 1 if the year of DateEntry is within 60 years of
1960, so 1900..2020, which seems a reasonable range in 2006. Simple
enough to check narrower, more recent ranges. And using this validation
formula, ad hoc diagnostic messages could be implemented with formulas
like

=IF(ValidDateEntry,"","Invalid date entry in cell
"&CELL("Address",DateEntry))
Excel CHOKES on this simple type of data mismatch

Yes, it would if you were so incompetent not to check all entries. Any
cell can contain any value. That's flexibility, but it comes at a
price: it's up to the person writing the formulas that use user entries
to ensure those entries are valid. If that's different from database,
tough, this is just how spreadsheets work.
D) what happens when Susie; over in marketing-- wants to take vacation
days. She adds a column called 'vacation hours' and emails it to her boss.
Seems like a perfectly natural thing to do.

If the worksheet were protected, she wouldn't be able to insert
anything. At that point she'd need to call he boss to ask how to
include vacation hours. Of course this raises the question whether time
sheets should include anything other than work hours, and if they
should, why wouldn't there already be entries for vacation hours?

Feable effort creating this straw man, but this may be all you can
dream up.
There are better ways-- email someone a form in Access; it gets
converted to a DAP (plain HTML); they enter all their data and
presto-chango-- I am ALREADY DONE.

And if they forward the e-mail to, say, their home e-mail account so
they can fill it out in the evening, would they be able to make entries
to your database from any machine with an internet connection? If so,
what prevents anyone else from feeding garbage into your database?
I have a free solution that is scalable and mulitple people can edit
their own data at the same time.

Really? Where's the url to download it so anyone can see whether this
claim is BS or not?

To the OP: Don't mind Aaron. He's right to suggest that Access may be
better than Excel for this provided you can access the authentication
lists via ODBC. But when it comes to the antispreadsheet ranting, he's
just angry because he's never been able to figure out how to use them.
 
A

aaron.kempf

Harlan;

So you're going to do this

=COUNT(1/(ABS(YEAR(DateEntry)-1960)<=60))


For each of a trillion different cells.

And then you'll open up each spreadsheet by hand and if it has a 1 in
there; then you'll know that you've got to change this value by hand?

Are you kidding me??

Re:
Really? Where's the url to download it so anyone can see whether this
claim is BS or not?

it's just functionality that is built into Access.

File, SendTo

if worst comes to worse; you can save your form as a Data Access Page
and then email it around.

I've built quite a few complex 'Excel Replacement' solutions using this
type of technology.

but it looks just like a webpage; it is a simple webpage.
it uses these components called 'office web components'

When I say 'components' think of something similiar to Adobe Acrobat
Reader; or Flash.. all it consists of is a 'Object Tag' in HTML.

and these components are what make Excel completely and utterly
OBSOLETE.

you know when you save a spreadsheet as HTML and 'add interactivity'?
that is what I am talking about.

It isn't a piece of Excel-- these are primarily components that are
best created using Microsoft Access (or something like dreamweaver for
example).

those are the components that I'm referring to. I use those ALL DAY
LONG; EVERY DAY.
and they provide things like:

a) drilldown - the ability to have a drilldown effect in a pivotTable
b) displaying a field; but keep it collapsed-- so that drilldown is
easy
c) the ability to create custom fields INSIDE the pivotTable (you have
to create custom formulas OUTSIDE of a pivotTable)
d) the ability to have 250,000 rows in a 'spreadsheet'


Here is a basic page to help you to get your feet wet with Access
http://www.bcschools.net/staff/AccessHelp.htm

Here is a starter page that describes some funcitonality that is found
within Data Access Pages
http://office.microsoft.com/en-us/assistance/CH062526501033.aspx

Information about emailing Data Access Pages
http://office.microsoft.com/en-us/assistance/HP030890051033.aspx

you don't need 'Access' on your machine in order to fill out these
forms. All you need is a valid office license (2002 or 2003) and then
one of these products:

a) Access
b) Excel
c) Word
d) Outlook

I think that any of those 4 products counts as a license to use Office
Web Components.

-Aaron
 
A

aaron.kempf

Harlan

and for the record? the thing that makes this SECURE?

it's called 'windows authentication'

you can use SQL authenticaiton if you would prefer; you can even encrpt
the HTML so that it isn't readable in plain-text.

it does allow for offline-usage.. that would require something on the
clientside to push the XML buffer back to the server.

-Aaron
 
A

atomiks

Aaron -- can you tell me how I was trying to do in access? You appear
to actually know what you're doing, and if it's so easy and you're so
proficient, i'm actually interested in seeing how you'd do this.
 
N

Nick Hodge

Aaron

Three things about OWC

1) They are ActiveX and now blocked by default in WinXP SP2
2) They are deprecated in O12
3) Any web presentation is better hosted on the server (asp, asp.net, php,
etc) and presented in some of the new controls which implement XHTML, etc
which give rich user views, with no security warnings and server security,
without *any* need for client installed controls

--
HTH
Nick Hodge
Microsoft MVP - Excel
Southampton, England
www.nickhodge.co.uk
(e-mail address removed)
 
A

aaron.kempf

Nick;

you are such a ****ing liar.
they're not blocked in XP SP2.

they're not depecrated in Office 12. Office 12 hasn't shipped yet
****tard so go screw yourself.

'any web presentation is better hosted on the server'

oh i'm so glad that you are the SINGLE person that has EVERY POSSIBLE
understanding of how it's soooooo inefficient to run things on the
clientside.

I mean; if everything should run on the serverside; why is google and
yahoo and microsoft making powerful clientside AJAX applications?

OWC are the same thing as AJAX for all practical reasons.... a richer
client experience.
there isn't an interface in the world; anywhere-- that can compete with
OWC pivotTables for price, performance, portability and did i mention
PRICE?

your simple server-side pages are for losers.

Aren't you tired of posting back to the webserver a dozen times on
every page?

I personally hate how when you're typing something in windows live
mail; it's talking to the server and trying to auto-complete the name
of the contact that you're typing.

it's like the worst; slowest design EVER.

yeah.. 'everything should run on the server side'

kids.. shit

so uh.. where are flash add-ins going to run then; kid??

-Aaron
 
A

aaron.kempf

and for the record; it ISN"T a client installed control-- since it
comes with OFFICE ****TARD

and even if someone didn't have it installed; it's easy to prompt
people-- a single line of plain HTML says 'click here to install' or
the path to the CAB files on your webserver; it's pretty much
self-installing

-Aaron
 
A

aaron.kempf

1) open MS Access and create a new application / database with a mdb
extension if you're a newbie
2) Tools, Options, 'Pages' set up a UNC path (to a server share; or
mapped network drive or something) for the 'default connection file'
3) right-click import your spreadsheet data into Access.
4) fix some of the field names; etc to not have spaces; etc
5) select the table on which you want people to enter data
6) make a view that filters WHERE USER = SUSER_SNAME()
7) select the view (oir query) in the database window
8) go INSERT, PAGE and fill out the data access page wizard. Yes, Add,
Next, next, Add, Yes; etc
9) hit file, save in order to Save the DAP to the Access file that
you're working with.
10) it will also prompt you where to save the plain HTML file.
11) save it to a UNC path or to your C drive or something
12) right click in Windows and go 'Send-To (mail-recipient)'

rinse and repeat.

-Aaron
 
N

Nick Hodge

Aaron

For so long as your route to make your point is personal abuse to an
unsatisfactory level, I'll withdraw, not because your aggression intimidates
me, but because my intelligence and care for others stops me from going to
your level and I think everyone now see's the 'cut of your cloth'.

Suffice to say, you will notice that Office 12 will not ship with
OWC...fact, ActiveX controls will 'at best' fire the warning bar at the top
in IE. The comment on server side and client side controls ends up a little
like our ongoing 'discussions' on Excel/Access, each has there use and the
'cross-over' may be a little grey.

As I said....from my standpoint, whatever your reply on this....End!

--
HTH
Nick Hodge
Microsoft MVP - Excel
Southampton, England
www.nickhodge.co.uk
(e-mail address removed)
 
A

aaron.kempf

Nick;

I just disagree with you trying to spread lies

OWC isn't 'going away' it will be supported; according to what MS has
told us-- until 2011-2013.

That is 'long enough' for most projects.

I dont know what Office 2007 will hold. I don't really care.

All I know is that there ISN"T a better option today for displaying and
editing data.

PivotTables; Spreadsheet and Data Access Pages-- .NET can .NET compete
with this stack.

-Aaron
 
H

Harlan Grove

(e-mail address removed) wrote...
....
and even if someone didn't have it installed; it's easy to prompt
people-- a single line of plain HTML says 'click here to install' or
the path to the CAB files on your webserver; it's pretty much
self-installing
....

So you really don't care much about security, do you? Just have users
get into the habit of installing any old software from any old website?
FWIW, worms, trojans, spyware are also pretty much self-installing.
 
H

Harlan Grove

(e-mail address removed) wrote...
So you're going to do this

=COUNT(1/(ABS(YEAR(DateEntry)-1960)<=60))

For each of a trillion different cells.

Excel wouldn't be a good tool to use for data entry on the order of
trillions of cells. First off, I haven't heard of any multiterabyte
drives for anything that could be called a PC. Second, it'd take
several thousand man-years to enter a trillion cells manually. FWIW,
it'd take as long to enter this many date fields manually using Access.

As for using this against a few hundred cells, if the entry cells were
all in a single area range, you could use the following formula instead
to check all of them at once.

=--(COUNT(1/(ABS(YEAR(DateEntryRange)-1960)<=60))=COUNT(Range))

This would also give 1s and 0s as results.
And then you'll open up each spreadsheet by hand and if it has a 1 in
there; then you'll know that you've got to change this value by hand?

Are you kidding me??

Playing dumb again? Or is it playing?

The point of validation formulas is to catch entry errors IMMEDIATELY
after entry. If the user chooses to ignore the invalid entry, then,
yes, they'd have to change it later if they wanted it to be treated as
a date. Someone has to change the invalid entry, so it's left up to the
user to change it.

As an alternative, you could give the date entry cell the number format
Text, in which case it'd be treated as text, then use a defined name to
convert it into a date using a formula definition like

=IF(COUNT(1/(ABS(YEAR(A1)-1960)<=60)),A1,
IF(COUNT(--TEXT(--A1,"[<1000000]0\/0\/0000;0\/00\/0000")),
--TEXT(--A1,"[<1000000]0\/0\/0000;0\/00\/0000"),"?"))

It IS possible to adapt to inconsistent data entry in Excel, but it
usually isn't worth it.
Re:

it's just functionality that is built into Access.

File, SendTo

This was in response to your claim about free solutions to multiple
user data entry. The 'free' part obviously ignoring the need to
purchase Access for the machine designing the DAP.
but it looks just like a webpage; it is a simple webpage.
it uses these components called 'office web components'
....

It requires that if users would be using these forms within their
e-mail client, their e-mail client would need to be configured to run
ActiveX controls. That eliminates any company using Lotus Notes or any
other e-mail client that doesn't run ActiveX controls as well as
sensible, security-conscious companies that disable ActiveX controls in
e-mail clients.
When I say 'components' think of something similiar to Adobe Acrobat
Reader; or Flash.. all it consists of is a 'Object Tag' in HTML.
....

Or Nimda.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top