Question

"Write a C++ code to take a .html input file and remove the html tags, so the resulting text will be output as a .txt file. Use the table below (i.e. <title> My webpage title </title> should be output as === My webpage title === in the .txt file). A sample .html code/file is provided DOWN BELOW"

Sample .html code (copy and paste from below or download here: http://www.mediafire.com/file/iqkwhf7vqcwyrh0/Sample2.html):


C++ Html input to .txt output

This is the first displayed text in the document
(Header 1).

This is the second line displayed (Header 2).

This is the third header, uncentered.

SECTION ONE

This is a paragraph of text. It does not matter how I type this in. The text will wrap automatically

in the browser window. If I want to start a
new line,
I have to use a line break tag.

This is an unordered list:

List item 1

List item 2

List item 3

This is an ordered list:

List item 1

List item 2

List item 3

This sentence is in boldface.

This sentence is in italics.

This sentence contains a link to the

NASA

page.

This sentence is right-aligned.

This sentence has an increased font size.

This sentence has a decreased font size.

This sentence has the font color

specified in hex.

This sentence is in

a sans serif font.

This is a big sentence.

This is a small sentence.

This sentence has a "mailto" link in it:

yourself@harvard.edu


This link describes color tables and their hex format:

Best Colors to Use and a Color Chart

SECTION TWO

Have You Heard of String Theory?

The New Theory of the Universe.


Which one do you like?



This is a table with 3 columns.



Here are links to HTML guides:

An easy to follow and fun guide

A simple and clean guide

Excellent website. Click "Try it yourself".

An interactive learning example.



The main international standards organization for the World Wide Web is

W3C





Heading 1 spans 2 columns | Heading 2
element 1 | element 2 | element 3
This is a link to free pretzels again.
colored cell | uncolored cell
A thumbnail picture of the T-Shirt.
row 6 col 1 | row 6 col 2 | this spans 3 rows
row 7 col 1 | row 7 col 2
row 8 col 1 | row 8 col 2

HTML example | HTML Tags | Converted text

Explanation / Answer

#include <fstream>
#include <iostream>
#include <string>
using namespace std;

// Replace every occurrence of `from` in `s` with `to`.
void replaceAll(string& s, const string& from, const string& to)
{
    size_t pos = 0;
    while ((pos = s.find(from, pos)) != string::npos) {
        s.replace(pos, from.length(), to);
        pos += to.length(); // resume the search after the inserted text
    }
}

int main()
{
    // The file names are assumptions; change them to match your files.
    ifstream infile("Sample2.html");
    ofstream outfile("Sample2.txt");
    if (!infile || !outfile) {
        cerr << "Could not open the input or output file." << endl;
        return 1;
    }

    string str;
    while (getline(infile, str)) // read the file one line at a time
    {
        // Exact-match replacements: a tag with attributes, such as
        // <h1 align="center">, is not matched and passes through unchanged.
        replaceAll(str, "<title>", "===");
        replaceAll(str, "</title>", "===");
        replaceAll(str, "<h1>", "#");
        replaceAll(str, "</h1>", "");
        replaceAll(str, "<h2>", "##");
        replaceAll(str, "</h2>", "");
        replaceAll(str, "<h3>", "###");
        replaceAll(str, "</h3>", "");
        replaceAll(str, "<p>", "\n");  // blank line before the paragraph
        replaceAll(str, "</p>", "\n"); // blank line after the paragraph
        replaceAll(str, "<br>", "\n"); // line break
        replaceAll(str, "<ul>", "");
        replaceAll(str, "</ul>", "");
        replaceAll(str, "<ol>", "");
        replaceAll(str, "</ol>", "");
        replaceAll(str, "<li>", "*");
        replaceAll(str, "</li>", "");

        // Links: drop "<a href=", the '>' that closes the opening tag,
        // and "</a>". The URL and the link text are kept.
        size_t pos;
        while ((pos = str.find("<a href=")) != string::npos) {
            str.erase(pos, 8);
            size_t close = str.find('>', pos);
            if (close != string::npos)
                str.erase(close, 1);
        }
        replaceAll(str, "</a>", "");

        outfile << str << '\n'; // write the converted line to the .txt file
    }
    return 0;
}