sql select one row randomly but taking into account a weight

sql select one row randomly but taking into account a weight  using -'sql,mysql,select,random,weight'

I'm using MySQL.
I have a table which looks like that:

id: primary key
content: varchar
weight: int

What I want to do is randomly select one row from this table, but taking into account the weight. For example, if I have 3 rows:

id, content, weight
1, "some content", 60
2, "other content", 40
3, "something", 100

The first row has 30% chance of being selected, the second row has 20% chance of being selected, and the third row has 50% chance of being selected.

Is there a way to do that ? If I have to execute 2 or 3 queries it's not a problem.

Thanks in advance.

asked Oct 13, 2015 by balvant maurya
0 votes

4 Answers

0 votes

A simple approach (avoiding joins or subqueries) is to just multiply the weight by a random number between 0 and 1 to produce a temporary weight to sort by:

SELECT t.*, RAND() * t.weight AS w 
FROM table t 

To understand this, consider that RAND() * 2x will be a larger value than RAND() * x approximately two thirds of the time. Consequently, over time each row should be selected with a frequency that's proportional to its relative weight (eg. a row with weight 100 will be selected about 100 times more often than a row with weight 1, etc).

Update: this method doesn't in fact produce the correct distributions, so for now don't use it! (see the comments below). I think there should still be a simple method similar to the above that will work, but for now the more complex method below, involving joins, might be better. I'm leaving this answer up because: (a) there's relevant discussion in the comments below, and (b) if/when I get a chance, I'll try to fix it.

answered Oct 13, 2015 by tushar2k6
0 votes

This works in MSSQL and I am sure that it should be possible to change couple of keywords to make it work in MySQL as well (maybe even nicer):

SELECT      TOP 1 t.*
FROM        @Table t
INNER JOIN (SELECT      t.id, sum(tt.weight) AS cum_weight
            FROM        @Table t
            INNER JOIN  @Table tt ON  tt.id <= t.id
            GROUP BY    t.id) tc
        ON  tc.id = t.id,
           (SELECT  SUM(weight) AS total_weight FROM @Table) tt,
           (SELECT  RAND() AS rnd) r
WHERE       r.rnd * tt.total_weight <= tc.cum_weight
ORDER BY    t.id ASC

The idea is to have a cumulative weight for each row (subselect-1), then find the position of the spanned RAND() in this cumulative range.

answered Oct 13, 2015 by mca.agarwal
0 votes

Maybe this one:

) AS x ON T.ID >= x.ID LIMIT 1;

Or this one:

SELECT * FROM tablename
          WHERE somefield='something'
          ORDER BY RAND() LIMIT 1
answered Oct 13, 2015 by mannumits1
0 votes

I don't remember how to RND() in mysql, but here working example for MSSQL:

SELECT TOP(1) (weight +RAND ()) r, id, content, weight FROM Table

If TOP(1) is not applicable you just fetch first record from total result set.

answered Oct 13, 2015 by shegokar.anjeet