Select items from a list without repetition


There are more solutions how to select items from a list randomly

Initial steps:

  • Cycle from 1 to N
  • Select the item at random index,
    add it to the result array
  • Solutions:
    1. if the item is already in the list skip. Drawbacks: your number of iterations could be huge
    2. Remove the item from the original list: not works for large lists or if the item is not in memory; need a linked list otherwise which needs more memory

A better solution is to swap the item at index with an item at a random index – and iterate this from the beginning of the list x times. On finish take the first x items. The drawback is that the list should be in the memory and the items should be changeable. But then you lost the original list. You can do the same on an index array, but again it needs more memory.

So there is a much better solution than the previous ones:
Calculate probability for the items (at once only for one)
Probability= ‘items still needed’ / ‘all items left’.
If the probability is greater than the random number (between 0-1) add it to the result (‘items still needed decrase), else skip.
Decrease ‘all items left’.
For first look it seems a problem that all the items can be skipped but there isn’t any trouble: when ‘items still needed’=’all items left’ then the probability will be one, so all items will be added to the result, and the two counter will decrase together.
This algorithm can be used on items not in memory, no extra memory allocation needed and the items (and their orders) won’t change.