
FAQ 6.10 How do I use a regular expression to strip C style comments from a file?

Posted by PerlFAQ Server on June 6, 2008, 9:03 am
This is an excerpt from the latest version of perlfaq6.pod, which
comes with the standard Perl distribution. These postings aim to
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

6.10: How do I use a regular expression to strip C style comments from a file?

While this actually can be done, it's much harder than you'd think. For
example, this one-liner

perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c

will work in many but not all cases. You see, it's too simple-minded for
certain kinds of C programs, in particular, those with what appear to be
comments in quoted strings. For that, you'd need something like this,
created by Jeffrey Friedl and later modified by Fred Curtis.

$/ = undef;
$_ = <>;

s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
print;

This could, of course, be more legibly written with the "/x" modifier,
adding whitespace and comments. Here it is expanded, courtesy of Fred
Curtis.

s{
   /\*         ##  Start of /* ... */ comment
   [^*]*\*+    ##  Non-* followed by 1-or-more *'s
   (
     [^/*][^*]*\*+
   )*          ##  0-or-more things which don't start with /
               ##    but do end with '*'
   /           ##  End of /* ... */ comment

 |         ##     OR  various things which aren't comments:

   (
     "           ##  Start of " ... " string
     (
       \\.           ##  Escaped char
     |               ##    OR
       [^"\\]        ##  Non "\
     )*
     "           ##  End of " ... " string

   |         ##     OR

     '           ##  Start of ' ... ' string
     (
       \\.           ##  Escaped char
     |               ##    OR
       [^'\\]        ##  Non '\
     )*
     '           ##  End of ' ... ' string

   |         ##     OR

     .           ##  Any other char
     [^/"'\\]*   ##  Chars which don't start a comment, string or escape
   )
 }{defined $2 ? $2 : ""}gxse;
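To see why the extra alternatives matter, here is a small sketch that runs the simple one-liner pattern and the string-aware pattern over the same input. The sub names strip_naive and strip_c_comments are made up for this illustration and are not part of the FAQ.

```perl
use strict;
use warnings;

# Naive approach: delete everything between /* and */, wherever it occurs,
# including inside string literals.
sub strip_naive {
    my ($src) = @_;
    $src =~ s{/\*.*?\*/}{}gs;
    return $src;
}

# String-aware approach: strings and other non-comment text are captured
# in $2 and put back unchanged; a comment leaves $2 undefined and is
# replaced with the empty string.
sub strip_c_comments {
    my ($src) = @_;
    $src =~ s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
    return $src;
}

my $code = qq{char *s = "/* not a comment */"; /* real comment */\n};

print strip_naive($code);       # prints: char *s = "";
print strip_c_comments($code);  # prints: char *s = "/* not a comment */";
```

The naive version empties the string literal, while the string-aware version leaves it alone and removes only the real comment. Both leave behind the space that preceded the comment.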

A slight modification also removes C++ comments, as long as they are not
spread over multiple lines using a continuation character:


s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
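As a rough check of the C++ variant, the sketch below wraps it in a sub (the name strip_c_and_cpp_comments is hypothetical) and feeds it input where "//" appears both as a real comment and inside a string literal.

```perl
use strict;
use warnings;

# Same idea as the C-only pattern, with one added alternative, //[^\n]*,
# that matches a C++ comment up to (but not including) the newline.
# The added alternative has no capture group, so $2 still refers to the
# "things which aren't comments" group.
sub strip_c_and_cpp_comments {
    my ($src) = @_;
    $src =~ s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//[^\n]*|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse;
    return $src;
}

my $cpp = qq{int x = 1; // set x\n}
        . qq{char *url = "http://example.com"; /* keep the string */\n};

print strip_c_and_cpp_comments($cpp);
# prints:
# int x = 1;
# char *url = "http://example.com";
```

The "//" inside the URL survives because the string alternative consumes the whole literal before the comment alternatives get a chance to match inside it.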



--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.
