- Mok-Kong Shen
August 9, 2012, 11:08 am
Previously, in other groups, I mentioned a simple to use, though
admittedly rather low bit-rate, scheme for embedding stego bits in emails
or other texts for which nice formatting is commonly not done or
required: the number of words mod 2 in each individual line of the text
conveys one stego bit. (Alternatively one could use the number of word
gaps.) The scheme is apparently not very convenient, and even error
prone, to apply manually when one has to utilize emails as a busy channel
for frequent stego information transfers with a non-trivial flux of stego
bits. I have therefore implemented the scheme as the Python code below
for those who may like to try it out.
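The reading rule itself can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not an excerpt from the program below; the function names line_bit and line_bit_by_gaps are made up for this sketch.

```python
def line_bit(line):
    """Stego bit carried by one line: its number of words mod 2."""
    # str.split() with no argument splits on any run of whitespace, so
    # leading spaces or repeated spaces do not change the word count.
    return len(line.split()) % 2


def line_bit_by_gaps(line):
    """Alternative convention: the number of word gaps mod 2."""
    words = line.split()
    return max(len(words) - 1, 0) % 2


print(line_bit("it was the best of times"))          # 6 words -> 0
print(line_bit_by_gaps("it was the best of times"))  # 5 gaps  -> 1
```

For a non-empty line the two conventions always give complementary bits, so the partners only need to agree on one of them.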
To run the given example in the code, save an arbitrary short text file
of about 20 lines as rawfile.txt in the directory where the program is
located. With the first two paragraphs of "A Tale of Two Cities" as the
rawfile, one run of the example in the code printed out the following:
all 10 stegobits are embedded, there are additionally 2 random dummy bits
bits recovered from stegofile: 010111011100
The corresponding stegofile obtained was formatted by the code as follows:
It was the best of times, it was the worst of times, it was the age
of wisdom, it was the age of foolishness, it was the epoch of belief, it
was the epoch of incredulity, it was the season of Light, it was the
season of Darkness, it was the spring of hope, it was the winter
of despair, we had everything before us, we had nothing before us, we
were all going direct to Heaven, we were all going direct the other
way -- in short, the period was so far like the present period, that
some of its noisiest authorities insisted on its being received, for
good or for evil, in the superlative degree of comparison only.

There were a king with a large jaw and a queen with a plain face, on the
throne of England; there were a king with a large jaw and a queen with
a fair face, on the throne of France. In both countries it was clearer
than crystal to the lords of the State preserves of loaves and
fishes, that things in general were settled for ever.
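
The recovered bits quoted above can be re-derived from this stegofile by hand or with a short script. The following is a minimal sketch under the stated conventions (word count mod 2 per line, last line of each paragraph skipped, paragraphs separated by a blank line); recover_bits is a name chosen for this sketch and is not the function from the program below.

```python
import re

STEGOTEXT = """\
It was the best of times, it was the worst of times, it was the age
of wisdom, it was the age of foolishness, it was the epoch of belief, it
was the epoch of incredulity, it was the season of Light, it was the
season of Darkness, it was the spring of hope, it was the winter
of despair, we had everything before us, we had nothing before us, we
were all going direct to Heaven, we were all going direct the other
way -- in short, the period was so far like the present period, that
some of its noisiest authorities insisted on its being received, for
good or for evil, in the superlative degree of comparison only.

There were a king with a large jaw and a queen with a plain face, on the
throne of England; there were a king with a large jaw and a queen with
a fair face, on the throne of France. In both countries it was clearer
than crystal to the lords of the State preserves of loaves and
fishes, that things in general were settled for ever.
"""


def recover_bits(text):
    """Word count mod 2 of every line except the last of each paragraph."""
    bits = []
    # paragraphs are separated by one or more blank (or space-only) lines
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        lines = paragraph.splitlines()
        for line in lines[:-1]:          # the last line carries no bit
            bits.append(str(len(line.split()) % 2))
    return "".join(bits)


print(recover_bits(STEGOTEXT))   # 010111011100
```

The first paragraph contributes 8 bits and the second 4, which accounts for the 10 stego bits plus 2 dummy bits reported by the run.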
For comments, suggestions and critiques I should be very grateful.
M. K. Shen
# EMAILSTEGANO, a scheme for embedding stego bits in emails and similar texts.
# since emails are commonly exchanged frequently, one has a naturally
# busy channel for transmitting stego bits. the stego bits embedding rate
# of EMAILSTEGANO is only about 1 bit per line of the resulting stego text
# (cover text). the stego text merely differs slightly from the original
# text in formatting (i.e. both are identical word for word), which is its
# principal advantage in comparison to the other known text stego schemes,
# whether of syntactic or semantic nature. the low bit rate of the scheme is
# in the majority of cases in practice more or less compensated by the fact
# that the accumulated volume of the cover texts, and hence the number of
# stego bits transmitted in a certain time period, could nonetheless be
# substantial.
# webpages of communication partners can serve similar functions as emails in
# transmitting stego bits, for the texts in HTML source files (viewable via
# the right mouse key) are fairly free from formatting constraints, even
# though the webpages themselves are always nicely formatted, and thus could
# be processed by EMAILSTEGANO. in this case the recipient needs however to
# know the availability of new stego information either from the new
# [...] of the webpages or from notices via an independent channel.
# the user has to define maxlinelen which limits the width of the stegotext
# output. 72 seems to be a practically good value for maxlinelen. for e.g.
# the email software Thunderbird a text file having line length <= 72,
# and pasted into the input window, will remain unchanged in the 'visual'
# appearance for the sender.
# assumption: all words of the input file are shorter than maxlinelen/2.
# a word is understood as any sequence of characters bounded by spaces or
# eol, i.e. "\n" in the sense of C and Python. (on typing into the editor of
# [...] eol is generated when the return key is pressed.) the paragraphs of
# the raw input text file have to be separated from one another by two or
# more eols (and additionally any number of spaces). there is otherwise no
# requirement of EMAILSTEGANO on the format of that file; in particular its
# lines need not be left adjusted and their lengths can be completely
# arbitrary.
# the number of words in a line of the output file stegotext mod 2 gives the
# stego bit embedded in the line. note that, as a special convention imposed
# by programming logic, for each paragraph of the stegotext output by
# EMAILSTEGANO, the last line contains no embedded stego bit. in particular a
# single-line paragraph has no embedded stego bit. all lines of stegotext are
# left adjusted. if desired, the sender may re-format the file stegotext by
# adding a few spaces at the beginning of some of its lines. this does not
# affect the stegobits recovered by the recipient (nothing should be done,
# however, if on cutting and pasting into the email window this re-formatting
# would lead to additional eols).
# the stegobits to be embedded are to be specified by the user as a string
# of 0/1 in the variable named stegobits. if there is more input text than
# needed to process the bits in stegobits, random dummy bits will be used to
# process the rest of the input text. the random bits from Python's builtin
# PRNG depend on a variable seed, if the user doesn't set a seed. thus if the
# run is repeated (i.e. with the same raw input text and the same stegobits)
# the dummy bits used may differ, which means that the part of the stegofile
# that corresponds to the dummy bits may not be formatted exactly the same.
# the length of stegobits could be a constant (or any multiple of it) agreed
# upon by the communication partners, or else one could arrange to have an
# eof symbol (e.g. for 5 bit encoding of the alphabet one 5 bit code could be
# chosen to serve as eof; a bof symbol may be similarly employed). stegobits
# may have at the beginning an agreed upon number of dummy bits to be ignored
# by the recipient. another means of indicating the length of stegobits
# (including presence/absence of stego bits at all in a given email) is to
# utilize keywords, e.g. personal names, that are employed or absent in the
# text itself. the sender can employ the function recoverstegobits() to check
# that the stegotext generated is ok.
# it is self-evident that for security the stegobits should stem from a
# [...] encryption processing. (At the risk of being blamed for
# self-promotion, we mention here the author's SHUFFLE2.)
# we assume only passive wardens, i.e. the stegotext is not modified en
# route to the recipient. since email writing is commonly done very casually
# in matters of formatting, the eventual slight unnaturalness in the
# appearance of the resulting stegotext is deemed to be acceptable and anyway
# apparently cannot be used as an incriminating fact against the
# communication partners, even in non-democratic regimes that otherwise have
# severe regulations allowing agencies to arbitrarily demand handing out of
# encryption keys of encrypted materials or even simply outlaw encrypted
# communications in general. (maybe one day they would have to outlaw emails
# as such in view of the nice hiding capability provided by EMAILSTEGANO??)
# note that the difficulty for the wardens is that there is barely any
# practical means to more or less reliably discern/decide whether a given
# piece of email is the result of processing by EMAILSTEGANO or not.
# avoidance of long words tends to improve the appearance of stegotext. in
# extreme cases one could do a little bit of rewriting of the text input in
# order to obtain a better appearing stegotext.
# no attempt is made in coding to do optimization for efficiency etc.
# this software may be freely used:
# 1. for all personal purposes unconditionally and
# 2. for all other purposes under the condition that its name, version
# and authorship are explicitly mentioned and that the author is informed
# of all eventual code modifications done.
# version 1.0, 29.07.2012.
# some comment lines edited, lately 01.08.2012.
# author's email address: firstname.lastname@example.org
# an auxiliary function of getparagraphs().
if "\n" in g[h1:]:
for i in range(h1,hn):
if g[i]!=" ":
if break1==1: break
if break2==1: continue
for i in range(textlen):
print("word too long: ",text[i])
# convention: last line of a paragraph does not contain stego bit.
for i in range(len(lineblock)):
for j in range(len(lineblock[i])):
# index of the bit in stegobits that is yet to be embedded
for i in range(len(paragraphs)):
print("rawfile is too small for embedding all stego bits, only",bk,
"stego bits could be embedded")
print("all",len(stegobits),"stegobits are embedded")
print("all",len(stegobits),"stegobits are embedded, there are additionally",
len(dummybits),"random dummy bits")
# note that paragraphs here is defined differently than the global variable
# paragraphs used in getparagraphs() etc.
for i in range(len(paragraphs)):
# convention: last line of a paragraph does not contain embedded bit.
for j in range(len(lines)-1):
print("bits recovered from stegofile: ",recoveredbits)
# an example of use:
# maximal line length in the output stegofile.
# stego bits to be transmitted.
# name of the user-given input text file.
# name of the output file that carries the user-given stegobits.
# sender generates the stegofile.
# the recipient obtains the stegobits or the sender checks the correctness
# of processing.
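
For readers who want to experiment with the sender and recipient steps named in the example above, here is a minimal self-contained sketch of one possible greedy embedding. It is not the author's implementation: the names embed_bits and recover_bits, the list-of-words input format and the greedy strategy (fill a line, then give back one word if the parity is wrong) are all assumptions made for this sketch. It relies on the stated assumption that every word is shorter than maxlinelen/2, and it follows the convention that the last line of a paragraph carries no bit.

```python
import random


def embed_bits(paragraphs, stegobits, maxlinelen=72):
    """Re-wrap each paragraph (a list of words) so that every line except
    the last has a word count whose parity equals the next bit."""
    out, k = [], 0                      # k counts bits used, dummy bits included
    for words in paragraphs:
        if any(2 * len(w) >= maxlinelen for w in words):
            raise ValueError("word too long")
        lines, pos = [], 0
        while pos < len(words):
            # greedily take as many words as fit into maxlinelen
            n, length = 0, -1
            while (pos + n < len(words)
                   and length + 1 + len(words[pos + n]) <= maxlinelen):
                length += 1 + len(words[pos + n])
                n += 1
            if pos + n < len(words):    # not the last line, so it carries a bit
                bit = stegobits[k] if k < len(stegobits) else random.choice("01")
                k += 1
                if n % 2 != int(bit):
                    n -= 1              # push the last word to the next line
            lines.append(" ".join(words[pos:pos + n]))
            pos += n
        out.append(lines)
    return out, min(k, len(stegobits))


def recover_bits(paragraph_lines):
    """Word count mod 2 of every line except the last of each paragraph."""
    return "".join(str(len(line.split()) % 2)
                   for lines in paragraph_lines for line in lines[:-1])


raw = ("it was the best of times it was the worst of times " * 10).split()
stego, embedded = embed_bits([raw], "0101110111", maxlinelen=30)
print(embedded, "bits embedded; recovered:", recover_bits(stego))
```

Because each word is shorter than maxlinelen/2, a greedily filled line that is not the last one always holds at least two words, so giving one word back never leaves an empty line. Bits recovered beyond the length of stegobits are the random dummy bits and will differ from run to run.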