Random Number Generation

I want to know more about random number generation. It was used often when I was studying machine learning with Python, and I want to know more about its relevance.

Date Created:

References



Definitions


  • Statistical Randomness
    • The
  • Mersenne Twister Algorithm
    • The Mersenne Twister is a general-purpose pseudorandom number generator (PRNG) developed in 1997 by Makoto Matsumoto and Takuji Nishimura. Its name derives from the choice of a Mersenne prime as its period length.
      • A Mersenne Prime is a prime number that is one less than a power of two. A prime number of the form for some integer . Examples: 3, 7, 31, 8191, 131071, 524287, ...
    • The most commonly used version of the Mersenne Twister algorithm is based on the Mersenne prime .
    • For a w-bit word length, the Mersenne Twister generates integers in the range .
  • /dev/random
    • In Unix-like operating systems, /dev/random and /dev/urandom are special files that serve as cryptographically secure pseudorandom number generators. They allow access to a CSPRNG that is seeded with entropy (a value that provides randomness) from environmental noise, collected from device drivers and other sources.
    • This special file was originated in Linux in 1994 and quickly adopted by other Unix-like operating systems.


Notes


Random Number Generation is a process by which, often by means of a random number generator (RNG), a number r symbols that cannot be reasonably predicted better than random chance is generated. This means that the particular outcome sequence will contain some patterns detectable in hindsight but impossible to foresee. True random number generators can be hardware random-number generators (HNRGs), wherein each generation is a function of the current value of a physical environment's attitude that is constantly changing in a manner that is practically impossible to model. This would be in contrast to so-called random number generations done by pseudorandom number generators (PRNGs), which generate numbers that only look random but are in fact predetermined = these generations can be reproduced simply by knowing the state of the PRNG.
  • Methods for generating random data include coin flipping, rolling of dice, ...
  • Several computational methods for pseudorandom number generation exist. All fall short of the goal of true randomness, although they meet, with varying success, some of the statistical tests for randomness intended to measure how unpredictable their results are. This generally makes them unusable for applications such as cryptography.
  • Carefully designed crypto graphically secure pseudorandom number generators (CSPRNGs) exist with features specifically designed for use in cryptography.


Practical Applications and uses


  • used in applications where producing an unpredictable result is desirable:
    • Gambling
    • Statistical Sampling
    • Computer Simulation
    • Cryptography
  • Pseudorandom number generators are useful in developing Monte Carlo-method simulations, as debugging is facilitated by the ability to run the same sequence of random numbers again by starting from the same random seed
  • The generation of pseudorandom numbers is an important and common task in computer programming.
    • Cryptography requires a high degree of apparent randomness
    • Weaker forms of randomness are used in hash algorithms
  • A truly random system would have no restriction on the same item appearing two or three times in succession.


True vs Pseudo-random Numbers


  • There are two principle methods used to generate random numbers.
  • The first method measures some physical phenomenon that is expected to be random and then compensates for possible biases in the measurement process., e.g. measuring atmospheric noise, thermal noise, or other electromagnetic and quantum phenomena.
    • Entropy, in terms of random numbers, is a measure of unpredictability or surprise of the number generation process.
    • The speed at which entropy can be obtained from natural sources is dependent on the underling physical phenomena being measured.
    • This type of generation is blocking.
  • The second method uses computational algorithms that can produce long sequences of apparently random results, which are in fact completely determined by a shorter initial value, known as the seed value or key.
    • As a result, the entire seemingly random sequence can be reproduced if the seed is known. This type of random number generator is often called a pseudorandom number generator.
    • This type of generator typically does not rely on sources of naturally occurring entropy, though it may be periodically seeded by natural sources.
    • This type of generation is non-blocking.
  • Some systems take a hybrid approach, proving randomness harvested form natural sources when available, and falling back periodically to re-seeded software based cryptographically secure pseudo random number generators.
While a pseudorandom number generator based solely on deterministic logic can never be regarded as a true random number source in the purest sense of the word, in practice they are generally sufficient even for demanding security-critical applications.


Generation Methods


Physical Methods

  • The earliest form of generating random numbers tend to be too slow for most applications in statistics and cryptography.
  • A physical random number generator can be based on essentially random atomic or subatomic physical phenomenon whose unpredictability can be traced to the laws of quantum mechanics.


Computational Methods

  • Most computer-generated random numbers use PRNGs which are algorithms that can automatically create long runs of numbers with good random properties but eventually the sequence repeats.
  • The series of values generated by such algorithms is generally determined by a fixed number called a seed. One of the most common PRNG is the linear congruential generator, which uses the recurrence:
  • to generate random numbers, where a, b, and m are large integers, and is the next X as a series of pseudo random numbers. The maximum number of numbers the formula can produce is the modulus, m.
  • A simple pen and paper method for generating random umbers is the so-called middle-square method, suggested by von Neumann.
  • The default random number generator used in many languages is based on the Mersenne Twister algorithm and is not sufficient for cryptography purposes.
    • Such library functions often have poor statistical properties and will repeat patterns after only tens of thousands of trials.
    • They are often initialized using a computer's real time clock as the seed.
  • Much higher quality random number sources are available on most operating systems; for example /dev/random on various BSD flavors, Linux.


By Humans

  • Most studies find that human subjects have some degree of non-randomness when attempting to produce a random sequence.


Post Processing and Statistical Checks


  • Obtaining numbers which are completely unbiased takes care.
  • Generated random numbers are sometimes subjected to statistical tests before use to ensure that the underlying source is still working, and then post-processed to improve their statistical properties.
  • Statistical tests are also used to give confidence that the post-processed final output from a random number generator is truly unbiased, with various randomness test suites being developed.


Other Considerations


Reshaping the Distribution

Uniform Distributions
Most random number generators natively work with integers or individual bits, so an extra step is required to arrive at the canonical uniform distribution between 0 and 1. This implementation is not as trivial as dividing the integer by its maximum possible value:
  1. The integer used in the transformation must provide enough bits for the intended precision.
  2. The nature of floating-point math itself means there exists more precision the closer the number is to zero. This extra precision is usually not used due to the sheer number of bits required.
  3. Rounding error in division may bias the result. At worst, a supposedly excluded bound may be drawn contrary to expectations based on real-number math.

Whitening

  • The outputs of multiple independent RNGs can be combined (for example, using a bit-wise XOR operation) to provide a combined RNG at least as good as the best RNG used. This is referred to as software whitening.


Activities and Demonstrations


  • Sites that make available random number samples:
    • SOCR Resource Page
    • Quantum Optics Group at the ANU Generates Random Numbers Sourced from Quantum Vacuum
    • Random.org makes available random numbers that are sourced from the randomness of atmospheric noise


Backdoors


  • If a random number generator can be made predictable, it can be used as a backdoor by an attack to break the encryption.



Insert Math Markup

ESC
About Inserting Math Content
Display Style:

Embed News Content

ESC
About Embedding News Content

Embed Youtube Video

ESC
Embedding Youtube Videos

Embed TikTok Video

ESC
Embedding TikTok Videos

Embed X Post

ESC
Embedding X Posts

Embed Instagram Post

ESC
Embedding Instagram Posts

Insert Details Element

ESC

Example Output:

Summary Title
You will be able to insert content here after confirming the title of the <details> element.

Insert Table

ESC
Customization
Align:
Preview:

Insert Horizontal Rule

#000000

Preview:


Insert Chart

ESC

View Content At Different Sizes

ESC

Edit Style of Block Nodes

ESC

Edit the background color, default text color, margin, padding, and border of block nodes. Editable block nodes include paragraphs, headers, and lists.

#ffffff
#000000

Edit Selected Cells

Change the background color, vertical align, and borders of the cells in the current selection.

#ffffff
Vertical Align:
Border
#000000
Border Style:

Edit Table

ESC
Customization:
Align:

Upload Lexical State

ESC

Upload a .lexical file. If the file type matches the type of the current editor, then a preview will be shown below the file input.

Upload 3D Object

ESC

Upload Jupyter Notebook

ESC

Upload a Jupyter notebook and embed the resulting HTML in the text editor.

Insert Custom HTML

ESC

Edit Image Background Color

ESC
#ffffff

Insert Columns Layout

ESC
Column Type:

Select Code Language

ESC
Select Coding Language