Query that transpose multiple field values into 1 cell and categorize.

world33

Registered User.
Local time
Today, 18:22
Joined
Oct 24, 2006
Messages
21
Hello,

I am not sure if I can express this properly but I need to transpose and aggregate multiple field values (University programs) into 1 single cell for each University (UNITID) so that it lists all the programs offered by the University in one cell (with a manual line break Char(10) separating each program) and categorize them according to the degree level offered.

So to make it simplier to understand I have a query named Query1 that looks like this (the number of programs is much longer and I cut it for simplicity and better view):

UNITIDDegreeLevelOfferedProgramsList
100654Bachelor's degreesSocial Sciences
100654Master's degreesAgricultural/Animal/Plant/Veterinary Science and Related Fields
100654Master's degreesArchitecture and Related Services
100654Doctorate degreesAgricultural/Animal/Plant/Veterinary Science and Related Fields
100663Pre-bachelor's degreesBusiness, Management, Marketing, and Related Support Services
100663Bachelor's degreesArea, Ethnic, Cultural, Gender, and Group Studies
100663Bachelor's degreesBiological and Biomedical Sciences
100663Master's degreesCommunication, Journalism, and Related Programs
100663Doctorate degreesBiological and Biomedical Sciences


and I would like the Query1 to be transformed into another query named Query2 that looks like this:


UNITID
OfferedProgramsListAggregatedbyUniversityID
100654
Bachelor's degrees
Agricultural/Animal/Plant/Veterinary Science and Related Fields
Architecture and Related Services
Biological and Biomedical Sciences
Business, Management, Marketing, and Related Support Services
Communications Technologies/Technicians and Support Services
Computer and Information Sciences and Support Services
Education
Engineering
Engineering/Engineering-related Technologies/Technicians
English Language and Literature/Letters
Family and Consumer Sciences/Human Sciences
Homeland Security, Law Enforcement, Firefighting and Related Protective Services
Liberal Arts and Sciences, General Studies and Humanities
Mathematics and Statistics
Natural Resources and Conservation
Parks, Recreation, Leisure, Fitness, and Kinesiology
Physical Sciences
Psychology
Public Administration and Social Service Professions
Social Sciences
Visual and Performing Arts

Master's degrees
Agricultural/Animal/Plant/Veterinary Science and Related Fields
Architecture and Related Services
Biological and Biomedical Sciences
Business, Management, Marketing, and Related Support Services
Communication, Journalism, and Related Programs
Computer and Information Sciences and Support Services
Education
Engineering
Family and Consumer Sciences/Human Sciences
Health Professions and Related Programs
Parks, Recreation, Leisure, Fitness, and Kinesiology
Physical Sciences
Psychology
Public Administration and Social Service Professions
Agricultural/Animal/Plant/Veterinary Science and Related Fields
Education
Physical Sciences
100663
Pre-bachelor's degrees
Business, Management, Marketing, and Related Support Services
Foreign Languages, Literatures, and Linguistics
Health Professions and Related Programs
Science Technologies/Technicians

Bachelor's degrees
Area, Ethnic, Cultural, Gender, and Group Studies
Biological and Biomedical Sciences
Business, Management, Marketing, and Related Support Services
Communication, Journalism, and Related Programs
Computer and Information Sciences and Support Services
Education
Engineering
English Language and Literature/Letters
Foreign Languages, Literatures, and Linguistics
Health Professions and Related Programs
History
Homeland Security, Law Enforcement, Firefighting and Related Protective Services
Liberal Arts and Sciences, General Studies and Humanities
Mathematics and Statistics
Philosophy and Religious Studies
Physical Sciences
Psychology
Public Administration and Social Service Professions
Social Sciences
Visual and Performing Arts

Master's degrees
Biological and Biomedical Sciences
Business, Management, Marketing, and Related Support Services
Communication, Journalism, and Related Programs
Computer and Information Sciences and Support Services
Education
Engineering
English Language and Literature/Letters
Health Professions and Related Programs
History
Homeland Security, Law Enforcement, Firefighting and Related Protective Services
Mathematics and Statistics
Multi/Interdisciplinary Studies
Physical Sciences
Psychology
Public Administration and Social Service Professions
Social Sciences
Visual and Performing Arts

Doctorate degrees
Biological and Biomedical Sciences
Computer and Information Sciences and Support Services
Education
Education
Engineering
Health Professions and Related Programs
Health Professions and Related Programs
Health Professions and Related Programs
Mathematics and Statistics
Multi/Interdisciplinary Studies
Parks, Recreation, Leisure, Fitness, and Kinesiology
Physical Sciences
Psychology
Social Sciences

As you can see all programs offered by the same University (UNITID) are aggregated into one cell and categorized (heading) according to the degree level.
In the actual query there are a few thousands of universities (UNITID) and around 52,000 data rows (programs) so the solution should be quick enough to process the query in a reasonable time.
I would appreciate any help in understanding the best solution for such a problem.
I attach a larger dataset in Excel with the first sheet showing the initial query format and a second sheet with the outcome I would like to achieve.

Thanks for any advice or help.

World33
 

Attachments

if you can share/upload a "real" tables/database.
 
A concatenated output within a query would, I suspect, be painfully slow.
However, a report could produce this easily. Have you considered building a report to give you this output?
 
I don't think you need a concatenate function. You need a split function to properly normalize the data. I would normalize it permanently, not do it on the fly every time you want to run the query you mentioned. OfferedProgramsList should be a separate table with one row per program. To get it to that structure, can be done with a code loop and a temp table. You can do it with a query but only if you know the actual maximum programs in any list.

Read the table in the first picture. For each row, use the Split() function to separate the programs into parts using "/" as the delimiter. Then loop through the array created by the Split() and write a row for each Program. This creates what should be the properly normalized schema. Then you can use a normal query with a join to produce the list you are looking for.

The single query method would be a union query with a fixed number of select queries. Keep in mind that the number of queries in a union is limited but seems to vary depending on complexity. Each query would use a function that passes in OfferedProgramsList and a number to identify which one of the values should be returned. Query 1 returns the the first, query 2 returns the second, etc. The problem is if the limit is 10 programs,, you will always return 10 rows for each university and 9 of them may be blank so you would need an outer query that selects the union and ignores the rows where the program is null.

I hate temp tables but the union query has issues also. The BEST solution is to fix the problem ONCE and normalize the table ONCE using the first method. Then the temp table becomes the permanent solution. If the data is imported from a file, you would normalize on the fly.
 
I have a query named Query1
You are free to be proud of your achievement, but it makes sense to start at the beginning, that is, deal with the database schema and the tables involved and their fields.
I don't have much confidence in an unknown query if it is to be processed further, even with a lot of data.
 

Users who are viewing this thread

Back
Top Bottom