sql select one row randomly but taking into account a weight Â using -'sql,mysql,select,random,weight'

I'm using MySQL.

I have a table which looks like that:

id: primary key

content: varchar

weight: int

What I want to do is randomly select one row from this table, but taking into account the weight. For example, if I have 3 rows:

id, content, weight

1, "some content", 60

2, "other content", 40

3, "something", 100

The first row has 30% chance of being selected, the second row has 20% chance of being selected, and the third row has 50% chance of being selected.

Is there a way to do that ? If I have to execute 2 or 3 queries it's not a problem.

Thanks in advance.

Â Â Â Â

I'm using MySQL.

I have a table which looks like that:

id: primary key

content: varchar

weight: int

What I want to do is randomly select one row from this table, but taking into account the weight. For example, if I have 3 rows:

id, content, weight

1, "some content", 60

2, "other content", 40

3, "something", 100

The first row has 30% chance of being selected, the second row has 20% chance of being selected, and the third row has 50% chance of being selected.

Is there a way to do that ? If I have to execute 2 or 3 queries it's not a problem.

Thanks in advance.

Â Â Â Â

0 votes

0 votes

A simple approach (avoiding joins or subqueries) is to just multiply the weight by a random number between 0 and 1 to produce a temporary weight to sort by:

```
SELECT t.*, RAND() * t.weight AS w
FROM table t
ORDER BY w DESC
LIMIT 1
```

To understand this, consider that `RAND() * 2x`

will be a larger value than `RAND() * x`

approximately two thirds of the time. Consequently, over time each row should be selected with a frequency that's proportional to its relative weight (eg. a row with weight 100 will be selected about 100 times more often than a row with weight 1, etc).

**Update: this method doesn't in fact produce the correct distributions**, so for now **don't use it!** (see the comments below). I think there should still be a simple method similar to the above that will work, but for now the more complex method below, involving joins, might be better. I'm leaving this answer up because: (a) there's relevant discussion in the comments below, and (b) if/when I get a chance, I'll try to fix it.

0 votes

This works in MSSQL and I am sure that it should be possible to change couple of keywords to make it work in MySQL as well (maybe even nicer):

```
SELECT TOP 1 t.*
FROM @Table t
INNER JOIN (SELECT t.id, sum(tt.weight) AS cum_weight
FROM @Table t
INNER JOIN @Table tt ON tt.id <= t.id
GROUP BY t.id) tc
ON tc.id = t.id,
(SELECT SUM(weight) AS total_weight FROM @Table) tt,
(SELECT RAND() AS rnd) r
WHERE r.rnd * tt.total_weight <= tc.cum_weight
ORDER BY t.id ASC
```

The idea is to have a cumulative weight for each row (subselect-1), then find the position of the spanned RAND() in this cumulative range.

0 votes

Maybe this one:

`SELECT * FROM T JOIN (SELECT FLOOR(MAX(ID)*RAND()) AS ID FROM `

) AS x ON T.ID >= x.ID LIMIT 1;
Or this one:

```
SELECT * FROM tablename
WHERE somefield='something'
ORDER BY RAND() LIMIT 1
```

0 votes
I don't remember how to RND() in mysql, but here working example for MSSQL:

```
SELECT TOP(1) (weight +RAND ()) r, id, content, weight FROM Table
ORDER BY 1 DESC
```

If TOP(1) is not applicable you just fetch first record from total result set.

...