Create a function scrape that uses a single parameter which contains the website
Question
Create a function scrape that uses a single parameter which contains the website that should be scraped and the string representation of the regular expression. The function should use the regular expression to return a list of all the matching subsections of the website's contents. Use this function on the website http://www.lipsum.com to find all words that start with an "h" but are not included inside an HTML
Explanation / Answer
PROGRAM CODE:
package sample;

import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class HtmlReader {
    public static void main(String[] args) {
        // Four-character prefixes that mark HTML or URL tokens rather than words
        List<String> htmlCode = Arrays.asList("html", "heig", "http", "href");

        // The URL is given as the first command-line argument
        if (args.length < 1) {
            System.err.println("Usage: java sample.HtmlReader <url>");
            return;
        }
        String urlName = args[0];

        // try-with-resources closes the scanner even if reading fails
        try (Scanner scan = new Scanner(new URL(urlName).openStream())) {
            while (scan.hasNext()) {
                // Scanner.next() returns one whitespace-delimited token, never empty
                String token = scan.next();
                if (token.charAt(0) == 'h' && !token.contains("</div>")) {
                    // Compare the first four characters against the HTML prefixes
                    String prefix = token.length() >= 4 ? token.substring(0, 4) : token;
                    if (!htmlCode.contains(prefix)) {
                        System.out.println(token);
                    }
                }
            }
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
OUTPUT:
have
has
has
has
here,
here',
have
humour
has
have
humour,
hidden
handful
humour,
help
help
help
hosting
how
human
happiness.
how
him
has
has
harum
hic
hand,
hour,
have
holds
he
he
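The program above filters whitespace-separated tokens by prefix, which is why punctuation such as "here," and "happiness." survives in the output; it does not take a regular expression as the question asks. A minimal sketch of a regex-driven scrape function follows. The class name RegexScraper, the helper findMatches, the constant H_WORDS_OUTSIDE_TAGS, and the choice to pass the "single parameter" as a two-element String array (URL first, pattern second) are all assumptions made for illustration, not part of the original answer. The pattern itself is a heuristic: it rejects a word when the next angle bracket after it is ">", which is taken to mean the word sits inside a tag.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexScraper {

    // Assumed pattern: a lowercase word starting with 'h' that is not followed
    // by '>' before the next '<' or '>', i.e. heuristically not inside an HTML tag.
    static final String H_WORDS_OUTSIDE_TAGS = "\\bh\\w*\\b(?![^<>]*>)";

    // Hypothetical helper: returns every substring of content that matches regex.
    static List<String> findMatches(String content, String regex) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(content);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    // Assumed reading of "a single parameter": request[0] is the URL,
    // request[1] is the regular expression as a string.
    static List<String> scrape(String[] request) throws IOException {
        try (InputStream in = URI.create(request[0]).toURL().openStream();
             Scanner scan = new Scanner(in, StandardCharsets.UTF_8.name())) {
            // The \A delimiter makes the scanner return the whole stream as one token
            scan.useDelimiter("\\A");
            String content = scan.hasNext() ? scan.next() : "";
            return findMatches(content, request[1]);
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the site named in the question; another URL can be passed instead
        String url = args.length > 0 ? args[0] : "http://www.lipsum.com";
        for (String word : scrape(new String[] {url, H_WORDS_OUTSIDE_TAGS})) {
            System.out.println(word);
        }
    }
}
```

Keeping findMatches separate from the download makes the pattern easy to check offline. For example, on the string `<p class="hint">hello there, he has <a href="http://x.org/home">hidden</a> humour</p>` it returns hello, he, has, hidden, humour and skips hint, href, http and home because they sit inside tags. Two caveats: the match is case-sensitive, so "Hello" is not reported, and if the site redirects from HTTP to HTTPS the request may fail, because Java's HttpURLConnection does not follow redirects that change protocol; passing the HTTPS URL directly would avoid that.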