
Extracting Semantic Structure of HTML Document- Feature based


comp.infosystems.www.authoring.html - discuss HTML authoring here 

Posted by dayzman on January 30, 2005, 7:43 pm
Hi,

I've read somewhere that feature-based analysis can be used to extract
the semantic structure of HTML documents. By semantic structure, they
mean the model of the rendered view a reader sees. Now, my question is,
what should such feature-based analysis involve? What exactly is a
feature-based analysis?

Please help.

Cheers,
Michael



Posted by Steve Pugh on January 31, 2005, 10:51 am
dayzman@hotmail.com wrote:


You asked the same question five days ago and at least twice in
December. You didn't bother to respond to any of the replies you got
then. So why should anyone bother replying to you now? Please join the
discussion and explain in more detail what you are looking for.

        Steve

--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor



Posted by dayzman on February 5, 2005, 9:49 pm
Hi,

Sorry for the late response. I'm trying to extract the structure of
the page as a reader sees it, e.g. extract all headings and link them
to their corresponding paragraphs. In the end, the output graph should
be a hierarchy of sections, sub-sections, etc. I know this can be quite
complex, because the HTML used today can be very messy, especially with
tables. It was suggested that I use a "feature-based analysis" to extract
such information, but I'm not sure what exactly that means. What
is a feature-based analysis, even in other contexts? Is it
really feasible to extract "features" of HTML documents?
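
For illustration only, here is a minimal sketch of the simplest version of that goal: turning h1..h6 headings and <p> paragraphs into a nested section tree. It assumes reasonably well-formed markup where headings are real heading elements and every <p> is explicitly closed, which is exactly what messy table-based pages do not guarantee. The names OutlineParser and build_outline are made up for this example; only Python's standard html.parser module is used.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Builds a nested section tree from h1..h6 headings and <p> text."""

    HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}

    def __init__(self):
        super().__init__()
        # Level 0 is a synthetic root that is never popped off the stack.
        self.root = {"level": 0, "title": "(document)",
                     "paragraphs": [], "children": []}
        self.stack = [self.root]
        self.current_tag = None   # heading or "p" currently being read
        self.buffer = []          # text collected for that element

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS or tag == "p":
            self.current_tag = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current_tag is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self.current_tag:
            return  # inline tags such as </b> inside a paragraph
        text = " ".join("".join(self.buffer).split())
        self.current_tag = None
        if tag == "p":
            if text:
                # Attach the paragraph to the most recent open section.
                self.stack[-1]["paragraphs"].append(text)
            return
        level = self.HEADINGS[tag]
        # Close sections at the same or a deeper level than the new heading.
        while self.stack[-1]["level"] >= level:
            self.stack.pop()
        node = {"level": level, "title": text,
                "paragraphs": [], "children": []}
        self.stack[-1]["children"].append(node)
        self.stack.append(node)


def build_outline(html):
    """Parse an HTML string and return the root of the section tree."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.root


if __name__ == "__main__":
    sample = "<h1>Intro</h1><p>First.</p><h2>Detail</h2><p>Second.</p>"
    outline = build_outline(sample)
    for section in outline["children"]:
        print(section["title"], section["paragraphs"],
              [child["title"] for child in section["children"]])
```

This only works where authors used real heading elements. Pages that fake headings with bold text or table cells would need extra heuristics on top.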

Any help will be much appreciated.

Cheers,
Michael




Posted by legalois on February 8, 2005, 4:39 pm
dayzman@hotmail.com wrote:
If the author of the article you read, or the person who suggested
you use this method for analysis of mark-up text, knows what *he* means
by it, maybe he is the best source for a clearer explanation. Frankly,
it sounds idiosyncratic--maybe his own invention. Better to go to the
source, and find out.
Or...have you considered looking at any of the several document object
models (DOM)? A DOM is not HTML, but who knows. Maybe someone was a
little confused...
- Jake Lloyd
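
As a rough illustration of how the DOM suggestion and the word "feature" might fit together, the sketch below builds a simple DOM-like tree and records a few properties per element (tag name, nesting depth, whether it sits inside bold markup, text length). This is only one possible reading of "feature-based"; nobody in the thread defines the term, and the chosen features are assumptions made for this example. TreeBuilder, node_text and extract_features are hypothetical names, and the tree is a plain dict structure, not a W3C DOM implementation.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag (partial list, for illustration).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TreeBuilder(HTMLParser):
    """Builds a simple DOM-like tree of dicts from an HTML string."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#root", "attrs": {}, "children": [], "text": ""}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "children": [], "text": ""}
        self.stack[-1]["children"].append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest open element with this name, if any;
        # stray closing tags are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1]["text"] += data


def node_text(node):
    """All text under a node (own text first, then each child's text)."""
    return node["text"] + "".join(node_text(c) for c in node["children"])


def walk(node, depth=0, bold=False):
    """Yield one feature dict per node, in document order."""
    bold = bold or node["tag"] in {"b", "strong"}
    text = " ".join(node_text(node).split())
    yield {"tag": node["tag"], "depth": depth,
           "bold": bold, "text_len": len(text)}
    for child in node["children"]:
        yield from walk(child, depth + 1, bold)


def extract_features(html):
    """Parse HTML and return a list of per-element feature dicts."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return list(walk(builder.root))


if __name__ == "__main__":
    page = ("<table><tr><td><b>Title</b></td></tr>"
            "<tr><td>Some body text here.</td></tr></table>")
    for row in extract_features(page):
        print(row)
```

A short, bold table cell followed by a longer plain one could then be guessed to be a heading and its paragraph. Whether such guesses are reliable on real pages is an open question.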


