\"Write a C++ code to take a .html input file and remove the html tags, so the r
Question
Write a C++ program that takes a .html input file and removes the HTML tags, so that the resulting text is output as a .txt file. Use the table below (e.g. <title> My webpage title </title> should be output as === My webpage title === in the .txt file). A sample .html file is provided below.
Sample .html code (copy and paste from below or download here: http://www.mediafire.com/file/iqkwhf7vqcwyrh0/Sample2.html):
C++ Html input to .txt output
This is the first displayed text in the document
(Header 1).
This is the second line displayed (Header 2).
This is the third header, uncentered.
SECTION ONE
This is a paragraph of text. It does not matter how I type this in. The text will wrap automatically
in the browser window. If I want to start a
new line,
I have to use a line break tag.
This is an unordered list:
List item 1
List item 2
List item 3
This is an ordered list:
List item 1
List item 2
List item 3
This sentence is in boldface.
This sentence is in italics.
This sentence contains a link to the
NASA
page.
This sentence is right-aligned.
This sentence has an increased font size.
This sentence has a decreased font size.
This sentence has the font color
specified in hex.
This sentence is in
a sans serif font.
This is a big sentence.
This is a small sentence.
This sentence has a "mailto" link in it:
yourself@harvard.edu
This link describes color tables and their hex format:
Best Colors to Use and a Color Chart
SECTION TWO
Have You Heard of String Theory?
The New Theory of the Universe.
(image: String Theory)
Which one do you like?
(image: Open strings attached to a pair of D-branes)
This is a table with 3 columns.
to free pretzels again.
(image: Thumbnail)
Here are links to HTML guides:
An easy to follow and fun guide
A simple and clean guidel
Excellent websit. Click "Try it yourself".
An interactive learning example.
The main international standards organization for the World Wide Web is
W3C
spans 2 columns Heading 2 element 1 element 2 element 3 This is a link
to free pretzels again.
colored cell uncolored cell
(image: Thumbnail)
A thumbnail picture of the T-Shirt. row 6 col 1 row 6 col 2 this spans 3 rows row 7 col 1 row 7 col 2 row 8 col 1 row 8 col 2
HTML example   HTML Tags   Converted text
Explanation / Answer
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

// Replace every occurrence of `from` in `s` with `to`.
void replaceAll(string& s, const string& from, const string& to)
{
    size_t pos = 0;
    while ((pos = s.find(from, pos)) != string::npos) {
        s.replace(pos, from.length(), to);
        pos += to.length(); // continue after the inserted text
    }
}

int main()
{
    ifstream infile("html.txt");    // input file name used by the original answer
    ofstream outfile("output.txt"); // assumed name; the question only asks for a .txt file
    if (!infile || !outfile) {
        cerr << "Could not open the input or output file" << endl;
        return 1;
    }
    string str;
    while (getline(infile, str)) // read one line at a time until end of file
    {
        // Title and headings
        replaceAll(str, "<title>", "===");
        replaceAll(str, "</title>", "===");
        replaceAll(str, "<h1>", "#");
        replaceAll(str, "</h1>", "");
        replaceAll(str, "<h2>", "##");
        replaceAll(str, "</h2>", "");
        replaceAll(str, "<h3>", "###");
        replaceAll(str, "</h3>", "");
        // Paragraphs get a blank line before and after; <br> starts a new line
        replaceAll(str, "<p>", "\n");
        replaceAll(str, "</p>", "\n");
        replaceAll(str, "<br>", "\n");
        // Lists: drop the list containers and mark each item with '*'
        replaceAll(str, "<ul>", "");
        replaceAll(str, "</ul>", "");
        replaceAll(str, "<ol>", "");
        replaceAll(str, "</ol>", "");
        replaceAll(str, "<li>", "*");
        replaceAll(str, "</li>", "");
        // Links: drop "<a href=" and the tag's closing '>', keeping the URL and link text
        size_t pos;
        while ((pos = str.find("<a href=")) != string::npos) {
            str.erase(pos, 8);
            size_t close = str.find('>', pos);
            if (close == string::npos)
                break; // the tag continues on the next line
            str.erase(close, 1);
        }
        replaceAll(str, "</a>", "");
        outfile << str << '\n'; // write the converted line to the .txt file
    }
    return 0;
}