# Extracting random data from static, for /dev/random

I have a radio plugged into my soundcard, tuned to static. This should
be a good source of randomness, right? I know of course it's not all
random data, as confirmed with rngtest. I was looking at the lavarnd
project, they use an algorithm they're calling DigitalBlender (tm) to
"extract" the random data and throw away the rest.

Are there any tools that can do the same thing on an arbitrary input
stream that I could use with my audio static? I thought rngd would do
it, but that just tests the randomness and throws it away if it fails,
instead of extracting whatever is random.

I've tried audio-entropyd (aed), but it keeps failing the randomness
tests as well, so isn't giving me any extra entropy.

Any advice would be great, my entropy pool is getting exhausted more
and more often these days.

Thanks,
-Brian

## Re: Extracting random data from static, for /dev/random

One thing you can do is to take the input and hash it. E.g., take in 2048 bits
and hash them down to a shorter digest (MD5, for example, produces 128 bits).
This means that if the input has at least as many bits of randomness as the
digest is long, the output of the hash should also have roughly that much
randomness.
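
A minimal sketch of the hashing idea, under some loudly stated assumptions: it uses SHA-256 (32-byte digest) rather than MD5, and `condition` and `block_size` are names made up for this illustration. The block size only makes sense if each block of raw samples really carries at least as much entropy as the digest is long, which has to be estimated separately.

```python
import hashlib

def condition(raw: bytes, block_size: int = 256) -> bytes:
    """Hash each block_size-byte block of raw samples down to a 32-byte digest.

    Assumption: block_size is large enough that every block holds at least
    256 bits of real entropy. The hash cannot create entropy that is not there.
    """
    out = bytearray()
    # Only whole blocks are used; a trailing partial block is discarded.
    for start in range(0, len(raw) - block_size + 1, block_size):
        block = raw[start:start + block_size]
        out += hashlib.sha256(block).digest()
    return bytes(out)
```

With the default block size this keeps 32 bytes out of every 256, i.e. it assumes the raw samples carry at least one bit of entropy per byte.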

Don't use /dev/random, use /dev/urandom, which does not exhaust. Yes, it uses a
PRNG, but one continuously seeded with whatever randomness it can find, giving
you a continuous stream of cryptographically good PRNG output even when the
physical randomness runs out.

## Re: Extracting random data from static, for /dev/random

Interesting, I figured hashing would factor into this somehow, but I
didn't realize it had this sort of "entropy-extraction" property. I
guess it makes sense, in fact good compression should probably have
similar effects. Thanks!

Using urandom, really? I've seen a number of suggestions for that from
people who don't seem to actually know anything other than "urandom
doesn't run out". I have a real hard time believing that the output of
urandom is cryptographically secure: no amount of processing can
increase entropy, so even if it's seeded with random data, the output
won't contain any more randomness than the input. That means if there's
only 10 bits (for example) in the entropy pool, and that's used to
seed the prng, the output of that prng won't have more than 10 bits of
entropy, right? So even if you generate 1024 bits of output from the
PRNG, it will only contain the same small amount of randomness that
was used as input. I have a little bit of background in information
theory, but am certainly no expert: can anyone either back me up or
explain to me why I'm wrong?
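
The 10-bit case can be made concrete with a toy generator. This is only a sketch: `expand` is a hypothetical hash-plus-counter construction invented for this illustration, not the algorithm the kernel uses. It shows that a 10-bit seed yields at most 1024 distinct output streams, so an observer can recover the seed by trying them all. The same search over a seed of a few hundred bits would be computationally infeasible, even though the information-theoretic entropy of the output is still bounded by the seed.

```python
import hashlib

def expand(seed: int, n_bytes: int = 128) -> bytes:
    """Toy deterministic generator: SHA-256 over (seed, counter) blocks."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        data = seed.to_bytes(2, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(data).digest()
        counter += 1
    return bytes(out[:n_bytes])

# With a 10-bit seed there are only 2**10 possible output streams,
# no matter how many bytes each stream contains.
streams = {expand(seed) for seed in range(2 ** 10)}

def recover_seed(observed: bytes) -> int:
    """Brute-force the 10-bit seed from an observed prefix of the output."""
    for seed in range(2 ** 10):
        if expand(seed, len(observed)) == observed:
            return seed
    raise ValueError("no 10-bit seed produces this output")
```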

Thanks for the feedback,
-Brian

## Re: Extracting random data from static, for /dev/random

Using /dev/random instead of /dev/urandom will cause the application
to lock up until enough random bits are available.  While /dev/urandom
will not have as much entropy as /dev/random, it should be enough
for a cryptographically secure pseudorandom number generator.

See http://en.wikipedia.org/wiki//dev/random

Regards, Dave Hodgins

--
Change nomail.afraid.org to ody.ca to reply by email.
(nomail.afraid.org has been set up specifically for
use in usenet. Feel free to use it yourself.)

## Re: Extracting random data from static, for /dev/random

No, compression will NOT. The correlations and biases that are there and that
have cryptographic impact simply will not be recognized by a compressor. On such
input, compressors in general would produce more output than input. If a stream
is compressible, then that stream already has very low entropy.

IF you know the input, then your comment is correct. So if someone knows all of
the input to the hash function, they can recreate the output. But urandom is
constantly reseeding itself. It has thousands of bits of input. (What you did 10
days ago affects the output today.) Now, the output does have more bits than the
input, so you could do an exhaustive search of the input and you would have to
go through fewer possibilities than the total output stream length would
suggest. But the input is so huge that such an exhaustive search is, shall we
say, infeasible. That is why it is cryptographically strong. A lot of thought
went into /dev/urandom to ensure that it is cryptographically strong.

## Re: Extracting random data from static, for /dev/random

Yes, it has low entropy and can therefore be compressed. That's my
whole problem, I'm trying to remove everything except the pure
information: isn't that the whole point of compression? I understand
that current compression techniques may be insufficient for this, but
a theoretically optimal compression algorithm would work, wouldn't it?
Or am I still missing something? To be clear, I'm talking specifically
about compressing the static from my radio, not the output of
/dev/urandom.

[clip]

Again, please correct me if I'm wrong, but exact re-creation of my
entropy pool is not the only way to attack a weak entropy source.
Obviously, if someone was able to recreate the exact output of my RNG
system, they could crush whatever kind of crypto I'm trying to do
with it, but exact recreation is the extreme case. The whole reason
good entropy is important is because non-randomness has a dirty little
habit of revealing itself at the most inopportune times, namely when
someone is trying to attack your crypto-system. I'm not arguing
specifically (for the moment) that /dev/urandom is insufficient, but
you seem to be dwelling on this recreation and exhaustive searching
angle, but I don't think that's the only concern (and, as you point
out, it's not a realistic concern). You said yourself that compression
is insufficient because it doesn't account for the "correlations and
biases...which have cryptographic impact", and it's exactly these
correlations and biases that I'm concerned about.

I don't doubt that a corps of extremely intelligent, knowledgeable,
and diligent people worked on making /dev/urandom strong. However,
straight from the manpage:

"When read, the /dev/urandom device will return as many bytes as are
requested. As a result, if there is not sufficient entropy in the
entropy pool, the returned values are theoretically vulnerable to a
cryptographic attack on the algorithms used by the driver. Knowledge
of how to do this is not available in the current non-classified
literature, but it is theoretically possible that such an attack may
exist. If this is a concern in your application, use /dev/random
instead."

I suppose perhaps I was taking you too literally. Maybe what
you're trying to tell me is that /dev/urandom should be strong enough
to withstand any reasonable attack that anyone other than classified
government agencies could launch. And I suppose that's a reasonable
answer. For curiosity's sake, can you confirm (or else continue to
rebuke) my assertion that in the theoretical limit, /dev/urandom is
not as secure as /dev/random?

Thanks again, I appreciate your feedback immensely.

-Brian

## Re: Extracting random data from static, for /dev/random

Well, not really. All compressors have very simple algorithms to
recognize repetitions. I.e., most correlations are not recognized by the
compressor and are not removed. Remember that there has to be a table at
the beginning to allow decompression. That table gets larger and larger
the more subtle the patterns it compresses. So you could get the
situation where a file of length 1K had a final length of 10 TB, with
almost all of it taken up by the compression table. In the limit as the
file gets infinitely large, compression techniques always win (the
table size becomes negligible), but most compressors work on finite-sized
files.

See above.

You could certainly run it through a compressor first, if that helped,
and then use a "random" hash function to extract the remaining entropy.
Unfortunately you would have no real idea of how much hashing you should
do. For example, the digits of pi almost certainly would not compress at
all, and yet you know that the entropy in the digits of pi is very,
very small (the minimal-length program to produce the digits of pi is
very small). If your attacker knew that you used the digits of pi, you
would have to do a hash that had a huge compression factor to extract
the real entropy from that stream.
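
The same effect can be seen without computing pi. The sketch below swaps in a different low-entropy stream: the output of Python's `random.Random` seeded with a fixed, publicly known value (the seed `12345` and the helper name `compression_ratio` are arbitrary choices for this illustration). The stream is completely determined by its seed, yet zlib cannot shrink it, while an obviously repetitive stream collapses to almost nothing.

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size, using zlib at level 9."""
    return len(zlib.compress(data, 9)) / len(data)

# 64 KiB produced deterministically from a small, publicly known seed.
# An attacker who knows the seed can reproduce every byte.
rng = random.Random(12345)
predictable = bytes(rng.getrandbits(8) for _ in range(65536))

# Roughly 64 KiB of a highly repetitive stream, for contrast.
repetitive = b"static " * (65536 // 7)

ratio_predictable = compression_ratio(predictable)  # close to 1: no savings
ratio_repetitive = compression_ratio(repetitive)    # close to 0: large savings
```

So a poor compression ratio says only that this particular compressor found no patterns, not that the stream is unpredictable.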

Now radio static is probably a lot better, but you still need an
estimate of the entropy density in that stream. For example, if at night
that frequency suddenly began picking up Radio Moscow, the entropy
content would plummet.

## Re: Extracting random data from static, for /dev/random

[snip]

With all due respect, I'm 95% sure that the most common compression algorithms
in widespread use are universal and therefore do not include code
tables in the message. The code tables are built up virtually in such
a way that the decoder can build the same table as he goes. As you
suggest, any useful code table would be immensely large and completely
counter to any efforts to compress.
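
The "decoder builds the same table as he goes" idea can be sketched with LZW, one member of the Lempel-Ziv family. It is chosen here only because it is short, not as a claim about what any particular compression tool uses; the function names are made up for this illustration. Only the list of codes is passed to the decoder, never the table.

```python
from typing import List

def lzw_encode(data: bytes) -> List[int]:
    """Encode bytes as LZW codes, growing the table while reading the input."""
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # new entry, never transmitted
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes

def lzw_decode(codes: List[int]) -> bytes:
    """Decode LZW codes, rebuilding the encoder's table from the codes alone."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code names the entry the encoder created on this very step.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[len(table)] = previous + entry[:1]
        previous = entry
    return bytes(out)
```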

[snip]

Yes, I've thought a little about the potential for "noise pollution",
if you will. I guess that just means it's very important to do an
accurate (and conservative) entropy estimate on the data, which is
another unknown for me: how exactly do I get a reasonable estimate of
the entropy? I'm familiar with the sample-entropy calculation (-sum
p_i log p_i), but I'm not sure what elements to calculate it on. I
know using just individual bits will not account for the vast majority
of potential correlations, but I'm not sure what the right answer is.
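
One way to see the "which elements" problem is to compute the sample entropy over symbols of different widths. The sketch below is an illustration under assumptions, not a vetted estimator: the function names and the `symbol_size` parameter are made up here. It also computes min-entropy (based only on the most frequent symbol), which is the more conservative of the two measures. Wider symbols catch more correlations but need far more data to estimate reliably.

```python
import math
from collections import Counter

def _symbol_counts(data: bytes, symbol_size: int) -> Counter:
    """Count non-overlapping symbols of symbol_size bytes."""
    symbols = [data[i:i + symbol_size]
               for i in range(0, len(data) - symbol_size + 1, symbol_size)]
    if not symbols:
        raise ValueError("not enough data for even one symbol")
    return Counter(symbols)

def shannon_entropy(data: bytes, symbol_size: int = 1) -> float:
    """Sample Shannon entropy, -sum p_i log2 p_i, in bits per symbol."""
    counts = _symbol_counts(data, symbol_size)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def min_entropy(data: bytes, symbol_size: int = 1) -> float:
    """Sample min-entropy, -log2 of the largest p_i, in bits per symbol."""
    counts = _symbol_counts(data, symbol_size)
    total = sum(counts.values())
    return -math.log2(max(counts.values()) / total)

# A perfectly predictable alternating stream: 00 01 00 01 ...
alternating = bytes([0, 1] * 512)
per_byte = shannon_entropy(alternating, 1)  # looks like 1 bit per byte
per_pair = shannon_entropy(alternating, 2)  # pairs reveal there is none
```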

Cheers,
-Brian

## Re: Extracting random data from static, for /dev/random

Ok...Ok...Ok... I follow this NG because I hardly understand a thing
about crypto, so this is my learning time. I love the idea of the
radio-get-me-some-randomness-configuration, but I think it can be done
easier. Why don't you take a T.V.card, jam it in a free pci-slot and use
the black & white noise as randomness, combine it with /dev/random AND
/dev/urandom... and voilà! Complete randomness!
Even as one runs out, the other two together are random enough to keep
randomness...

Why both /dev/random AND /dev/urandom? Because otherwise people get into
fighting here over what's best...
Why a TVcard? Now you use a radio, wire, plugs, soundcard etc. With a TV
card you don't need an apparatus outside your box. You don't want to
pick up "real" signals, so no other cables and stuff required.

Greetings!

P.S. As I said, I'm a noob, so please be gentle with the flaming :-)

## Re: Extracting random data from static, for /dev/random

tuuttuuttuut@home.nl wrote:

Getting a random bit stream like that is possible, but that goes only halfway.
The bigger difficulty is obtaining the right kind of randomness. Is the bit
distribution uniform? Is the stream of bits temporally uncorrelated? Is the
random bit generation rate enough for your needs?
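
The first two questions can be checked crudely in a few lines. This is a rough sketch, not a replacement for a proper test suite such as rngtest; the function names are made up for this illustration. It measures the fraction of one bits and how often adjacent bits agree. Both should be near 0.5 for unbiased, uncorrelated bits.

```python
from typing import List

def bits_of(data: bytes) -> List[int]:
    """Unpack bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]

def ones_fraction(bits: List[int]) -> float:
    """Fraction of bits that are 1; about 0.5 if the distribution is uniform."""
    return sum(bits) / len(bits)

def lag1_agreement(bits: List[int]) -> float:
    """Fraction of adjacent bit pairs that are equal; about 0.5 if uncorrelated."""
    same = sum(1 for a, b in zip(bits, bits[1:]) if a == b)
    return same / (len(bits) - 1)

# 0xF0 repeated has exactly half ones, yet adjacent bits usually agree,
# so a uniform bit distribution alone does not rule out correlation.
sample = bits_of(b"\xf0" * 64)
```

Passing both checks is necessary but nowhere near sufficient: they only look at single bits and adjacent pairs.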

There is much to be said about random!

Regards.

## Re: Extracting random data from static, for /dev/random

I demand that tuuttuuttuut@home.nl may or may not have written...

Depends. I find that either I get no data or I get MPEG TS.

http://www.entropykey.co.uk/ seems very likely to be better; at least it's
specifically designed for this kind of thing...

[snip]
--
| Darren Salt            | linux at youmustbejoking | nr. Ashington, | Doon
| using Debian GNU/Linux | or ds    ,demon,co,uk    | Northumberland | Army
| + http://www.xine-project.org /

Hope is a good breakfast, but it is a bad supper.

## Re: Extracting random data from static, for /dev/random

On Nov 17, 8:46 am, Darren Salt wrote:

Because I don't have a TV card and I do have a spare clock radio and
audio input jack. TV card is probably a fine idea, too, but it doesn't
solve any of my problems: static is static and will contain both
random and non-random data. My issue is extracting the random bits and
leaving behind anything predictable. Where exactly the static (or any
other data stream, for that matter) comes from is basically irrelevant
for the discussion.

[snip]

Thanks, Darren. I've seen entropykey before; it's really nice and
would no doubt solve my problem. There are plenty of hardware random
number generators out there, so I guess you could say this discussion
has become somewhat academic. I think there's a certain amount of hack
value in my project, though, and just for my own education, I'd like
to learn how to make it work properly.

-Brian