Wednesday, April 13, 2011

Rollup Column for Normalized Sums (SQL) - Part 1

I have a table like so:

object_id | vote
1 | 2
1 | -1
1 | 5
2 | 3
2 | 1
3 | 4
3 | -2

I want this result (for this particular example, object_ids 1 and 2 are part of a group, defined elsewhere, and I'm looking for the normalized_score so that the sum always = 1. object_id 3 is part of an unused group.):

object_id | normalized_score
1 | 6/10
2 | 4/10

[added 3:05PM] 10 here is the sum of the votes for object_id in (1,2). There's a whole other set of logic to come up with the (1,2), I was just trying to give the cleanest question so people don't have to worry about that part.

[added 3:10PM] As pointed out in the comments, if the score for one of the objects is below 0, a problem arises. Here is the rule, "IF the score for any outcome_id is -x, AND that is the minimum score for the set, ADD x to all scores in order to zero-out the minimum score". I can do this on my own time though outside of SQL - so it's a bonus only if somebody has the cahones to try to tackle it in SQL.

If I do a self join, I can get the sum. I can't figure out how to get the normalized sum. Ideally this will work in both MySQL 5.x and Sqlite3. Otherwise, I can do this with two separate queries and just do the work in post-processing.

From stackoverflow
  • The comments are quite correct.. but I'll make the assumption that 10 is just some number you picked out of your... nose.

    SELECT object_id AS ObjectID, SUM(vote) + '/10' AS NormalizedVote FROM table GROUP BY object_id

    Enjoy.

    Adam Nelson : I'm sorry, I wasn't clear. '10' is the sum of all the other votes in that group - otherwise this would be a trivial question. I'll update the question.
  • -- SQL solution
    SELECT 
      object_id AS ObjectID, 
      (SUM(CASE SIGN(vote) WHEN 1 THEN vote ELSE 0 END) - 
        ((CASE SIGN(MIN(vote)) WHEN -1 THEN MIN(vote) ELSE 0) * 
         (COUNT(1)))) + '/10' AS NormalizedVote
    FROM table 
    GROUP BY object_id
    
    Adam Nelson : Similarly to the one I marked as the answer (although that one is fixable), the SUM() includes the votes for object_id=3, which won't work. I like that there's no subquery though - thanks for the input.
  • The solution without compensating for negative votes (I include this one because its much easier to read/understand):

    SELECT object_id
    ,      SUM(vote) + '/' + total AS normalized_score
    FROM   tabel
    ,      (
           SELECT sum(vote) AS total
           FROM   tabel
           ) 
    GROUP BY object_id, total
    

    Full solution:

    SELECT object_id
    ,      SUM(vote + minvote) + '/' + (total + minvote * count) AS normalized_score
    FROM   tabel
    ,      (
           SELECT sum(vote) AS total
           ,      CASE WHEN MIN(vote) < 0 THEN
                      -MIN(vote)
                  END AS minvote
           ,      COUNT(*) AS count
           FROM   tabel
           ) 
    GROUP BY object_id, total, minvote, count
    

    (I don't have access to MySQL, so I wrote a query in Oracle and replaced || for +. Hope it works in MySQL or at least helps :))

    Adam Nelson : Yeah - I was thinking a subquery was necessary. I usually try to avoid those, and I'm sure there's a non subquery solution, but that should be good enough for fewer than a few hundred thousand records - which is fine for now. Thanks for a great answer.

0 comments:

Post a Comment

Note: Only a member of this blog may post a comment.