# Query about How to group 2 tables together?

• ### 問題

• 你好
我e+ 要寫一個query 去summarise 2 個tables
2 個table 既schema 係一樣的

唔知我應該要點做呢?

點先可以 union o左2個table 之後再做group 之後再用sum 去calculate 果d data?

e.g.
CREATE TABLE [MissionResults]
(
[CustomerCode] Varchar(8) NULL,
[Mission_Code] Varchar(10) NULL,
[Duration] Int NULL, -- how long have they spend on it
[DateOfAction] Datetime NULL --Time stamp when they start to do the mission
[Spends] money NULL -- Money spend on the mission
)
go

CREATE TABLE [MissionResults_history]
(
[CustomerCode] Varchar(8) NULL,
[Mission_Code] Varchar(10) NULL,
[Duration] Int NULL, -- how long have they spend on it
[DateOfAction] Datetime NULL --Time stamp when they start to do the mission
[Spends] money NULL -- Money spend on the mission
)
go

我要group d data by
customercode, mission_code 去 summarise 佢個duration 同spend

如果我用

SELECT Customer, Mission_Code, SUM(Duration) as Duration, SUM(Spend) as Spend
FROM MissionResults
GROUP BY Customer,MissionCode

UNION
SELECT Customer, Mission_Code, SUM(Duration) as Duration, SUM(Spend) as Spend
FROM MissionResults_History
GROUP BY Customer,MissionCode

佢會就咁link 埋2 個table 既result 一齊 所以 同一個customercode,mission_code 會出現多過一次? 但係個total duration 同spend 係唔一樣的因為佢o地summarise 2 個唔同既table
我點先可以 令到佢無duplicate 既customer,mission_code 既result 呢? 令d duration spend 如果係同一個customercode,mission_code 佢會加埋一齊呢?

唔好意思有d亂
希望大家明白我想做什麼

謝謝大家
2008年5月13日 下午 03:58

### 解答

• -- Try this.

-- You are suggested to create an index on the columns

-- CustomerCode, Mission_Code.

CREATE TABLE [MissionResults]
(
[CustomerCode] Varchar(8) NULL,
[Mission_Code] Varchar(10) NULL,
[Duration] Int NULL, -- how long have they spend on it
[DateOfAction] Datetime NULL --Time stamp when they start to do the mission
, [Spends] money NULL -- Money spend on the mission
)

GO

CREATE TABLE [MissionResults_History]
(
[CustomerCode] Varchar(8) NULL,
[Mission_Code] Varchar(10) NULL,
[Duration] Int NULL, -- how long have they spend on it
[DateOfAction] Datetime NULL --Time stamp when they start to do the mission
, [Spends] money NULL -- Money spend on the mission
)

GO

SELECT A.CustomerCode, A.Mission_Code
, SUM(A.Duration) AS Duration
, SUM(A.Spends) AS Spends
FROM (
SELECT CustomerCode, Mission_Code
, SUM(Duration) AS Duration
, SUM(Spends) AS Spends
FROM MissionResults
GROUP BY CustomerCode, Mission_Code
UNION ALL
SELECT CustomerCode, Mission_Code
, SUM(Duration) AS Duration
, SUM(Spends) AS Spends
FROM MissionResults_History
GROUP BY CustomerCode, Mission_Code
) A
GROUP BY A.CustomerCode, A.Mission_Code

2008年5月14日 上午 04:10

### 所有回覆

• -- Try this.

-- You are suggested to create an index on the columns

-- CustomerCode, Mission_Code.

CREATE TABLE [MissionResults]
(
[CustomerCode] Varchar(8) NULL,
[Mission_Code] Varchar(10) NULL,
[Duration] Int NULL, -- how long have they spend on it
[DateOfAction] Datetime NULL --Time stamp when they start to do the mission
, [Spends] money NULL -- Money spend on the mission
)

GO

CREATE TABLE [MissionResults_History]
(
[CustomerCode] Varchar(8) NULL,
[Mission_Code] Varchar(10) NULL,
[Duration] Int NULL, -- how long have they spend on it
[DateOfAction] Datetime NULL --Time stamp when they start to do the mission
, [Spends] money NULL -- Money spend on the mission
)

GO

SELECT A.CustomerCode, A.Mission_Code
, SUM(A.Duration) AS Duration
, SUM(A.Spends) AS Spends
FROM (
SELECT CustomerCode, Mission_Code
, SUM(Duration) AS Duration
, SUM(Spends) AS Spends
FROM MissionResults
GROUP BY CustomerCode, Mission_Code
UNION ALL
SELECT CustomerCode, Mission_Code
, SUM(Duration) AS Duration
, SUM(Spends) AS Spends
FROM MissionResults_History
GROUP BY CustomerCode, Mission_Code
) A
GROUP BY A.CustomerCode, A.Mission_Code

2008年5月14日 上午 04:10
• Hi Martin

Thanks very much for your information, I have another problem, when I apply the same logic to another query, which I need to join 2 tables together and UNION it. Then I receive an error.
"Cannot resolve collation conflict for equal to operation."

Then I put "COLLATE DATABASE_DEFAULT" keywoard beside the JOIN condition, unfortunately it still did not work.

Could you give me some suggestion about what shall I do? The query is kind of complicated to me as I use CASE on the Select statement as well.

Here it is the example query I wrote
E.g.

SELECT c.CustomerCode, c.Mission_Code, 'Competition' as Description, SUM(Apend) As Amount,
'Total Duration ' +  SUM(Duration)+' Minutes' as Ref1, COUNT(c.CustomerCode) as NoOfAttempt, SUM(Duration) as Duration,
CASE WHEN (c.Spend>c.Record) THEN 'Pro'  ELSE 'Nor'  END  as Ref2, NULL as Ref3
FROM Mission_Detail c
ON c.CustomerCode = i.CustomerCode COLLATE DATABASE_DEFAULT
WHERE i.CustomerCode IS NOT NULL AND (HeadCustomerCode Is NULL OR HeadCustomerCode ='')
GROUP BY c.CustomerCode,  c.Mission_Code

UNION

SELECT c.HeadCustomerCode, c.Mission_Code, 'Competition' as Description, SUM(Apend) As Amount,
'Total Duration ' +  SUM(Duration)+' Minutes' as Ref1, COUNT(c.CustomerCode) as NoOfAttempt, SUM(Duration) as Duration,
CASE WHEN (c.Spend>c.Record) THEN 'Pro'  ELSE 'Nor'  END  as Ref2, NULL as Ref3
FROM Mission_Detail c
ON c.CustomerCode = i.CustomerCode COLLATE DATABASE_DEFAULT
WHERE i.CustomerCode IS NOT NULL AND (HeadCustomerCode Is NOT NULL OR HeadCustomerCode <>'')

User to Analysis is a table that store the customer code to be analysis

I am not sure where else shall I put "COLLATE DATABASE_DEFAULT"

Thanks very much for your help
2008年5月14日 上午 08:22
• Collation should be the same on both sides of the equality sign (=).

Try to change the ON clause of your query as follows.

ON c.CustomerCode COLLATE DATABASE_DEFAULT = i.CustomerCode COLLAGE DATABASE_DEFAULT

2008年5月17日 上午 06:58
• Hi ChiYau,

If you UNION 2 table with same schema, that's an expected result some values will get conflict.  Even if you establish index on whatever column, if you do not do something to make a value unique, even doing UNION on a field which is having "uniqueidentifier" data type, that's still possibility of duplicated data and cause problem on data consistency, though the possibility is low due to the high uniqueness of GUID.

Just like doing data partitioning, what will you do to split old record into an old table for archive, and what will you do to make sure when those table are re-union, rows are all unique?  One of the ways is making use of a flag.  For example, you can use time code in a multiple column primary key.  e.g. add a column in all tables to store time stamp value like "2008Q2" (to represent 2nd quarter of year 2008), and partition the table in terms of quarter.  Then, even you have 2 identical transaction in 2008Q1 and 2008Q2, when these 2 historical table are UNION'd a part of the primary key is gauranteed different, those maintaining row uniqueness.  Of course, for individual table, there should have other unique key(s) to maintain uniqueness within a table.  That technique is oftenly used during table partitioning and merge replication.

In fact, you should also need to decide whether the fields "customer code" and mission code" should be unique in your table or not.  Seems like you want to have a single customer record storing some expenditure history.  This may not be a good idea, since in long run you are not able to tell the breakdown history.  I would rather like to have a customer table just store only customer identification information, and other table(s) storing multiple transactions done by this customer.  Follow by creating VIEW through JOIN action to query the result.  Using this method will have the draw back that querying the summary of this customer will always needs query and calculation from different table, but data integrity and manageability will be greatly enhanced.  e.g. (pseudo codes below):

customer table (this table only keeps customer identification)

---------------------

customer_id <- PK (primary key)

customer_name (or some other customer identification informaiton)

....

customerTransactions table (this table keeps only unit transaction, like an invoice)

--------------------------------------

customerTransactions_id <- PK

customer_id <- FK (foreign key)

sumOfSpend (optional, you can cache this calculated value or do calculation live depends on performance)

sumOfDuration (optional, same as sunOfSpend)

......

transaction table (this table define transaction line items, like each break down item within an invoice)

------------------------

transaction_id <- PK

customerTransactions_id <- FK

dateOfAction

duration

mission_id <- FK

......

mission table (this table is the list of "missions" that can be consumed by customer, like "product list"

-------------------

mission_id <- PK

mission_code

mission_cost

.......

How do you think?

Cheers.

Lawrence

2008年5月23日 下午 05:12