问题描述:

Suppose i have two tables.

table one:

| col1 |

- - - - -

| do |

| big |

| gone |

table two

| col1 | col2 | col3 | col4 |

- - - - - - - - - - - - - - -

| do | blah | blah | big |

| big | do | blah | gone |

| blah | blah | blah | blah |

how do i search from table two such that rows which are displayed contain all values of col1 of table one

for eg. the result for the given situation should be

| col1 | col2 | col3 | col4 |

- - - - - - - - - - - - - - -

| big | do | blah | gone |

网友答案:

Nasty problem...

SELECT two.*
  FROM two
 WHERE (SELECT COUNT(*) FROM one) =
       (CASE WHEN col1 IN (SELECT * FROM one) THEN 1 ELSE 0 END +
        CASE WHEN col2 IN (SELECT * FROM one) THEN 1 ELSE 0 END +
        CASE WHEN col3 IN (SELECT * FROM one) THEN 1 ELSE 0 END +
        CASE WHEN col4 IN (SELECT * FROM one) THEN 1 ELSE 0 END
       )

The term 'efficiency' should not be mentioned in conjunction with this query.

网友答案:

Perhaps the trickiest part of this is guaranteeing that all the columns are covered in the second table. It is not enough just to count them, you also have be sure that all are the set:

select t.*
from two t left outer join
     one o1
     on o1.col1 = t.col1 left outer join
     one o2
     on o2.col1 = t.col2 and o2.col1 not in (coalesce(t.col1, '')) left outer join
     one o3
     on o3.col1 = t.col3 and o3.col1 not in (coalesce(t.col1, ''), coalesce(t.col2, '')) left outer join
     one o4
     on o4.col1 = t.col4 and o4.col1 not in (coalesce(t.col1, ''), coalesce(t.col2, ''), coalesce(t.col3, '')) cross join
     (select count(*) as cnt from one) const
where const.cnt = ((case when o1.col1 is not null then 1 else 0 end) +
                   (case when o2.col1 is not null then 1 else 0 end) +
                   (case when o3.col1 is not null then 1 else 0 end) +
                   (case when o4.col1 is not null then 1 else 0 end)
                  )

This looks up each value in the one table, with the proviso that the value has not been seen before. If there are duplicates in the one table, there is question on how to handle them. Would that mean that the value has to appear that many times?

网友答案:

This assumes CTE's, and bitoperations (leftshift and OR), as available in postgres (might be present in other DBMSses, too)

WITH rnk AS (
    SELECT col1, (rank() OVER (ORDER BY col1))::integer AS rnk
    FROM one
    )
, five AS (
    SELECT t.*
            , 0::integer
            | COALESCE( 1<< o1.rnk, 0)
            | COALESCE( 1<< o2.rnk, 0)
            | COALESCE( 1<< o3.rnk, 0)
            | COALESCE( 1<< o4.rnk, 0)
            AS mask
    FROM two t
    LEFT JOIN rnk o1 ON o1.col1 = t.col1
    LEFT JOIN rnk o2 ON o2.col1 = t.col2
    LEFT JOIN rnk o3 ON o3.col1 = t.col3
    LEFT JOIN rnk o4 ON o4.col1 = t.col4
    )
SELECT * FROM five f5
WHERE f5.mask IN (14)
    ;

Update: this one may be a bit cleaner, since is hides the bitshift inside the CTE.

WITH xrnk AS (
    SELECT col1, 1::integer << (rank() OVER (ORDER BY col1))::integer AS xrnk
    FROM one
    )
, five AS (
    SELECT t.*
        , ( COALESCE( o1.xrnk, 0)
          | COALESCE( o2.xrnk, 0)
          | COALESCE( o3.xrnk, 0)
          | COALESCE( o4.xrnk, 0)
          ) >> 1
        AS mask
    FROM two t
    LEFT JOIN xrnk o1 ON o1.col1 = t.col1
    LEFT JOIN xrnk o2 ON o2.col1 = t.col2
    LEFT JOIN xrnk o3 ON o3.col1 = t.col3
    LEFT JOIN xrnk o4 ON o4.col1 = t.col4
    )
SELECT * FROM five f5
WHERE f5.mask IN (7)
    ;

The simplest solution is always the best:

SELECT * FROM two t
WHERE NOT EXISTS (
        SELECT * FROM one o
        WHERE o.col1 <> t.col1 AND o.col1 <> t.col2
          AND o.col1 <> t.col3 AND o.col1 <> t.col4
        )
        ;

UPDATE: (thanks @dbenham) the simple query is rather sensitive to NULLs in the two table, which has to be handled by a bunch of COALESCE() wrappers. The 'XxxX' literal is intended to never match, obviously:

SELECT * FROM two t
WHERE NOT EXISTS (
        SELECT * FROM one o
        WHERE o.col1 <> COALESCE(t.col1, 'XxxX' )
          AND o.col1 <> COALESCE(t.col2, 'XxxX' )
          AND o.col1 <> COALESCE(t.col3, 'XxxX' )
          AND o.col1 <> COALESCE(t.col4, 'XxxX' )
        )
        ;
网友答案:

You don't state what SQL engine you are using - it can make a difference.

I've provided a solution that requires support for the row_number() function. I believe at least Oracle, DB2, and SQLServer all support row_number().

The problem is fairly straight forward once distinct values from table one are pivoted into a single row. There cannot be any matches if more than 4 distinct values exist in table one. It seems there should be a better way to do the pivoting, but I know this solution works.

I've taken pains to make sure the answer returns all rows from two if one is empty, and that duplicate rows in one are ignored.

with 
uniqueOne as ( 
  select distinct col1 from one 
),
ranked as (
  select col1, row_number() over (order by col1) seq from uniqueOne
),
vals as (
  select t1.col1 val1, 
         t2.col1 val2, 
         t3.col1 val3, 
         t4.col1 val4
    from (select 1 dummy) dummy
    left join ranked t1 on t1.seq=1
    left join ranked t2 on t2.seq=2
    left join ranked t3 on t3.seq=3
    left join ranked t4 on t4.seq=4
    left join ranked t5 on t5.seq=5
   where t5.seq is null
)
select two.*
  from two
 cross join vals
 where (vals.val1 is null or vals.val1 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val2 is null or vals.val2 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val3 is null or vals.val3 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val4 is null or vals.val4 in (two.col1, two.col2, two.col3, two.col4))
;


Here is a live demo of the solution


My goodness, I guess I should read my own answer and do some research more often. SQLServer has a Pivot operator that makes the solution very efficient. Oracle has Pivot as well, but it uses different syntax.

Here is a working demo of the SQLServer Pivot solution. Take a look at the sweet execution plan.

And here is the SQLServer query:

with 
uniqueOne as ( 
  select distinct col1 from one 
),
ranked as (
  select col1, row_number() over (order by col1) seq from uniqueOne
),
vals as (
  select [1] val1, [2] val2, [3] val3, [4] val4, [5] val5
    from ranked
    pivot ( min(col1) for seq in ([1],[2],[3],[4],[5]) ) PivotTable
)
select two.*
  from two
  join vals on val5 is null
 where (vals.val1 is null or vals.val1 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val2 is null or vals.val2 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val3 is null or vals.val3 in (two.col1, two.col2, two.col3, two.col4))
   and (vals.val4 is null or vals.val4 in (two.col1, two.col2, two.col3, two.col4))
;
相关阅读:
Top