[RFC] HTML::FormatData

I have written a module called HTML::FormatData that I am planning to put on
CPAN. Please take a look at the documentation below and tell me what you
think.

Thank you....

    HTML::FormatData - formats strings and dates

    use HTML::FormatData;

    my $f = HTML::FormatData->new();

    my $string    = "<b>bolded</b>";
    my $formatted = $f->format_text( $string, strip_html => 1 );
    # $formatted eq 'bolded'

    my $dt_string = '20050115123000';    # example input
    my $dt        = $f->parse_date( $dt_string, '%Y%m%d%H%M%S' );
    my $yrmoday   = $f->format_date( $dt, '%Y%m%d' );

    # shortcut: parse and format in one call
    $yrmoday = $f->reformat_date( $dt_string, '%Y%m%d%H%M%S', '%Y%m%d' );

    HTML::FormatData contains utility functions to format strings and dates.

  new()
    This method creates a new HTML::FormatData object. Returns the blessed
    object.

  format_text( $string, %args )
    Wrapper function for the text formatting routines below. Formats a
    string according to the parameters passed in. The routines it calls can
    also be called directly, but it is usually best to go through this
    function.

    Returns the formatted string.
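
    As a rough illustration of the wrapper idea (not the module's actual
    code), the sketch below applies a few routines in a fixed order,
    depending on which options are set. The names %ROUTINE, @ORDER and
    format_text_sketch are hypothetical, and the routine bodies are
    simplified stand-ins for the functions documented below.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-ins for the module's routines.
my %ROUTINE = (
    strip_html       => sub { my ($s) = @_; $s =~ s/<[^>]*>//g; return $s },
    strip_whitespace => sub { my ($s) = @_; $s =~ s/\s+//g;     return $s },
    force_lc         => sub { return lc $_[0] },
    force_uc         => sub { return uc $_[0] },
);

# Fixed order, because the order of keys in %args is not preserved.
my @ORDER = qw( strip_html strip_whitespace force_lc force_uc );

sub format_text_sketch {
    my ( $string, %args ) = @_;
    for my $name (@ORDER) {
        next unless $args{$name};    # skip options that are off or absent
        $string = $ROUTINE{$name}->($string);
    }
    return $string;
}

# prints "BOLDED TEXT"
print format_text_sketch( "<b>Bolded Text</b>", strip_html => 1, force_uc => 1 ), "\n";
```

    A fixed order matters in a design like this: stripping tags before
    changing case, for example, gives the same result no matter how the
    caller lists the options.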

  decode_xml( $string )
    A copy of XML::Comma::Util::XML_basic_unescape. Returns an XML-unescaped
    string.

  decode_html( $string )
    Returns an HTML-unescaped string.

  decode_uri( $string )
    Returns a URI-unescaped string.

  strip_html( $string )
    Strips all HTML tags from string. Returns string.

  strip_whitespace( $string )
    Strips all whitespace ( \s ) characters from string. Returns string.

  clean_high_ascii( $string )
    Converts 8-bit (high-ASCII) characters to their 7-bit counterparts.
    Tested with MS-Word documents; might not work right with high-ASCII
    text from other sources. Returns string.

  clean_html_encoded_text( $string )
    Properly encodes some entities skipped by HTML::Entities::encode.
    Returns the modified string.

  decode_select_entities( $string )
    Takes HTML::Entities::encoded HTML and selectively unencodes certain
    entities for display on webpage. Returns modified string.

  clean_encoded_html( $string )
    Formats HTML-encoded HTML for display on webpage. Returns modified
    string.

  clean_encoded_text( $string )
    Formats HTML-encoded text for display on webpage. Returns modified
    string.

  clean_whitespace( $string [, keep_full_breaks => 1 | keep_all_breaks => 1] )
    Cleans up whitespace in HTML and plain text. If passed an argument for
    handling line breaks, it will either keep full breaks (\n\n) or all
    breaks (any \n). Otherwise, all line breaks will be converted to spaces.
    Returns the modified string.

  clean_whitespace_keep_full_breaks( $string )
    Cleans up whitespace in HTML and plain text while preserving all full
    breaks (\n\n). Returns the modified string.

  clean_whitespace_keep_all_breaks( $string )
    Cleans up whitespace in HTML and plain text while preserving all line
    breaks (\n). Returns the modified string.
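
    The three line-break modes can be sketched roughly as follows. This is
    an illustration only, not the module's code: clean_whitespace_sketch is
    a hypothetical name, and collapsing runs of whitespace to one space and
    trimming both ends are assumptions about what "cleans up" means.

```perl
use strict;
use warnings;

sub clean_whitespace_sketch {
    my ( $string, %args ) = @_;
    $string =~ s/\r\n?/\n/g;    # normalize line endings to \n

    if ( $args{keep_all_breaks} ) {
        $string =~ s/[^\S\n]+/ /g;    # collapse spaces and tabs, not newlines
        $string =~ s/ ?\n ?/\n/g;     # drop spaces around each break
    }
    elsif ( $args{keep_full_breaks} ) {
        # Split on blank lines, flatten each paragraph, then rejoin.
        my @paragraphs = split /\n[^\S\n]*\n\s*/, $string;
        s/\s+/ /g   for @paragraphs;    # single breaks become spaces
        s/^ | $//g  for @paragraphs;    # trim each paragraph
        $string = join "\n\n", grep { length } @paragraphs;
    }
    else {
        $string =~ s/\s+/ /g;    # every break becomes a space
    }

    $string =~ s/^\s+|\s+$//g;    # trim both ends
    return $string;
}

my $text = " one  two \n three\n\nfour ";
print clean_whitespace_sketch($text), "\n";    # prints "one two three four"
```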

  force_lc( $string )
    Returns lc( $string ).

  force_uc( $string )
    Returns uc( $string ).

  truncate( $string, $count )
    Returns the first $count characters of string.

  truncate_with_ellipses( $string, $count )
    Returns the first $count - 3 characters of string followed by '...'.
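
    A minimal sketch of the two truncation rules, using only substr. The
    function names are hypothetical. Returning strings that already fit
    unchanged is an assumption, and so is $count being at least 3.

```perl
use strict;
use warnings;

sub truncate_sketch {
    my ( $string, $count ) = @_;
    return substr( $string, 0, $count );
}

sub truncate_with_ellipses_sketch {
    my ( $string, $count ) = @_;

    # Assumption: a string that already fits is returned as-is.
    return $string if length($string) <= $count;

    # Reserve three characters for the dots so the result is $count long.
    return substr( $string, 0, $count - 3 ) . '...';
}

print truncate_sketch( 'Hello, world', 5 ), "\n";                  # prints "Hello"
print truncate_with_ellipses_sketch( 'Hello, world', 8 ), "\n";    # prints "Hello..."
```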

  encode_xml( $string )
    A copy of XML::Comma::Util::XML_basic_escape. Returns an XML-escaped
    string.

  encode_html( $string )
    Returns an HTML-escaped string.

  encode_uri( $string )
    Returns a URI-escaped string.
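
    To show how the encode and decode pairs mirror each other, here is a
    rough sketch in plain Perl. It is not the module's code. It assumes
    that "basic" XML escaping covers only &, < and >, and that URI escaping
    percent-encodes every byte outside the RFC 3986 unreserved set; input
    is assumed to be a byte string. All function names are hypothetical.

```perl
use strict;
use warnings;

sub encode_xml_sketch {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;    # ampersand first, so new entities are not re-escaped
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

sub decode_xml_sketch {
    my ($s) = @_;
    $s =~ s/&lt;/</g;
    $s =~ s/&gt;/>/g;
    $s =~ s/&amp;/&/g;    # ampersand last, mirroring the encoder
    return $s;
}

sub encode_uri_sketch {
    my ($s) = @_;
    # Percent-encode everything except letters, digits, "-", ".", "_", "~".
    $s =~ s/([^A-Za-z0-9\-._~])/sprintf( '%%%02X', ord($1) )/ge;
    return $s;
}

sub decode_uri_sketch {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr( hex($1) )/ge;
    return $s;
}

print encode_xml_sketch('a < b & c'), "\n";    # prints "a &lt; b &amp; c"
print encode_uri_sketch('a b&c'),     "\n";    # prints "a%20b%26c"
```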

  reformat_date( $string, $oldformat, $newformat )
    Takes a date string in $oldformat and returns a new string in
    $newformat.

  parse_date( $string [, $format] )
    Takes a $string representing a date and time, and tries to produce a
    valid DateTime object. Returns the object upon success, otherwise undef.

    Setting $string to 'now' creates a DateTime object for the current date
    and time. Setting $string to 'today' creates a DateTime object for
    today's date with the time set to midnight.

    Otherwise, you must pass a $format to parse the string correctly.
    $format can be set to one of the following "shortcuts": 'date8',
    'date14', or 'rfc822'.
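
    The parse-or-undef behaviour and the shortcut lookup could be sketched
    like this. To stay within core Perl, the sketch uses Time::Piece rather
    than DateTime, so it returns a Time::Piece object; it also leaves out
    the 'now' and 'today' cases. The shortcut expansions in %SHORTCUT are
    assumptions based on the names, and parse_date_sketch is a hypothetical
    name.

```perl
use strict;
use warnings;
use Time::Piece;

# Assumed expansions of the shortcut names; the module's table may differ.
my %SHORTCUT = (
    date8  => '%Y%m%d',
    date14 => '%Y%m%d%H%M%S',
    rfc822 => '%a, %d %b %Y %H:%M:%S %z',
);

sub parse_date_sketch {
    my ( $string, $format ) = @_;
    return unless defined $string && defined $format;

    # Expand a shortcut name into its strptime pattern, if it is one.
    $format = $SHORTCUT{$format} if exists $SHORTCUT{$format};

    # strptime dies on unparseable input; eval turns that into undef.
    my $t = eval { Time::Piece->strptime( $string, $format ) };
    return $t;
}

my $t = parse_date_sketch( '20050115123045', 'date14' );
printf "%04d-%02d-%02d\n", $t->year, $t->mon, $t->mday;    # prints "2005-01-15"
```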

  format_date( $dt, $format )
    Takes a DateTime object ($dt) and a $format, and returns the formatted
    string.

    $format is a DateTime 'strftime' format string. $format can be set to
    one of the following "shortcuts": 'date8', 'date14', and 'rfc822'.
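
    Reformatting is parsing followed by formatting, which the sketch below
    shows in a few lines. As an assumption to keep it core-only, it uses
    Time::Piece in place of DateTime, and it takes plain strftime patterns
    rather than the shortcut names. reformat_date_sketch is a hypothetical
    name, not the module's implementation.

```perl
use strict;
use warnings;
use Time::Piece;

sub reformat_date_sketch {
    my ( $string, $oldformat, $newformat ) = @_;

    # Step 1: parse with the old pattern; give up quietly on bad input.
    my $t = eval { Time::Piece->strptime( $string, $oldformat ) };
    return unless defined $t;

    # Step 2: format the parsed value with the new pattern.
    return $t->strftime($newformat);
}

# prints "20050115"
print reformat_date_sketch( '20050115123045', '%Y%m%d%H%M%S', '%Y%m%d' ), "\n";
```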


    Copyright 2004-2005 by Eric Folley

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

Eric Folley
http://www.folley.net/
