
Write a program in Java that checks whether a sequence of HTML tags is properly nested


Question

Write a program in Java that checks whether a sequence of HTML tags is properly nested. For each opening tag, such as <p>, there must be a closing tag </p>. A tag such as <p> may have other tags inside, for example <p> <ul> <li> </li> </ul> <a> </a> </p>. The inner tags must be closed before the outer ones. The program should process a file containing tags. For simplicity, assume that the tags are separated by spaces, and that there is no text inside the tags.

Explanation / Answer

HtmlParser.java without Stack, Pattern, or Matcher (a LinkedList is used as the stack instead)

import java.util.LinkedList;

import java.util.List;

public class HtmlParser {
   // Returns true when every opening tag is closed in the reverse order it was opened.
   public static boolean isBalancedHtml(String html) {
       // The list is used as a stack: its last element is the innermost open tag.
       List<String> tagList = new LinkedList<>();
       int position = 0;
       // Scan the whole string instead of splitting on spaces, so that tags
       // with attributes (which contain spaces) are not cut apart.
       while (true) {
           int startIndex = html.indexOf('<', position);
           if (startIndex == -1) {
               break; // no more tags
           }
           int endIndex = html.indexOf('>', startIndex);
           if (endIndex == -1) {
               return false; // '<' with no matching '>'
           }

           // Text between '<' and '>', for example "/p" or "img title=..."
           String tag = html.substring(startIndex + 1, endIndex);
           boolean isClosingTag = tag.startsWith("/");
           String tagName;
           if (isClosingTag) {
               tagName = tag.substring(1).trim();
           } else {
               // The tag name is everything before the first attribute.
               tagName = tag.trim().split("\\s+")[0];
           }

           if (isClosingTag) {
               // A closing tag with nothing open cannot be balanced.
               if (tagList.isEmpty()) {
                   return false;
               }
               String lastTagInList = tagList.remove(tagList.size() - 1);
               if (!lastTagInList.equals(tagName)) {
                   return false;
               }
           } else {
               tagList.add(tagName);
           }
           position = endIndex + 1;
       }
       return tagList.isEmpty();
   }

   public static void main(String[] args) {
       String htmlContent = "<img title=\"displays\" src=\"big.gif\"></img><p> <ul> <li> </li> </ul> <a> </a> </p>";
       System.out.println("Html content: " + htmlContent);

       if (isBalancedHtml(htmlContent)) {
           System.out.println("HTML content is balanced");
       } else {
           System.out.println("Html content is not balanced");
       }
   }
}

-------------------------------------------------------------------------------------

HtmlParser.java using Stack and a regular expression (Pattern, Matcher)

import java.util.Stack;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlParser
{
   // Matches one complete opening or closing tag; group 1 captures the tag name.
   static final Pattern pattern = Pattern.compile(
       "</?(\\w+)((\\s+\\w+(\\s*=\\s*(?:\".*?\"|'.*?'|[^'\">\\s]+))?)+\\s*|\\s*)/?>");

   public static boolean isBalancedHtml(String html)
   {
       Stack<String> tagStack = new Stack<>();
       int position = 0;
       // Scan the whole string; splitting on spaces would cut tags with attributes apart.
       while (true)
       {
           int startIndex = html.indexOf('<', position);
           if (startIndex == -1)
           {
               break; // no more tags
           }
           int endIndex = html.indexOf('>', startIndex);
           if (endIndex == -1)
           {
               return false; // '<' with no matching '>'
           }

           String tag = html.substring(startIndex, endIndex + 1);
           Matcher matcher = pattern.matcher(tag);
           if (!matcher.matches())
           {
               return false; // not a well-formed tag
           }
           String tagName = matcher.group(1);
           boolean isClosingTag = tag.startsWith("</");

           if (isClosingTag)
           {
               // An empty stack means a closing tag without an opening tag.
               if (tagStack.isEmpty() || !tagStack.pop().equals(tagName))
               {
                   return false;
               }
           }
           else
           {
               tagStack.push(tagName);
           }

           position = endIndex + 1;
       }
       return tagStack.empty();
   }

   public static void main(String[] args)
   {
       String htmlContent = "<img title=\"displays\" src=\"big.gif\"></img><p> <ul> <li> </li> </ul> <a> </a> </p>";
       System.out.println("Html content: " + htmlContent);

       if (isBalancedHtml(htmlContent))
       {
           System.out.println("HTML content is balanced");
       }
       else
       {
           System.out.println("Html content is not balanced");
       }
   }
}

Sample output

Html content: <img title="displays" src="big.gif"></img><p> <ul> <li> </li> </ul> <a> </a> </p>
HTML content is balanced

Html content: <img title="displays" src="big.gif"></img><p> <ul> <li> </ul> <a> </a> </p>
Html content is not balanced
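
The question asks for the tags to be read from a file, whereas both programs above check a hard-coded string. Here is a minimal sketch of a file-based variant, assuming the simplified input described in the question (tags separated by whitespace, no attributes, no text). The class name TagFileChecker and the command-line argument are illustrative choices, not part of the original answer, and ArrayDeque is used in place of LinkedList or Stack.

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Scanner;

// Hypothetical helper class; the name is not from the original answer.
class TagFileChecker {

    // Assumes whitespace-separated tokens of the form <name> or </name>,
    // with no attributes and no text, as the question permits.
    static boolean isProperlyNested(Scanner in) {
        Deque<String> openTags = new ArrayDeque<>();
        while (in.hasNext()) {
            String token = in.next();
            // Reject anything that is not shaped like a tag.
            if (token.length() < 3 || !token.startsWith("<") || !token.endsWith(">")) {
                return false;
            }
            if (token.startsWith("</")) {
                String name = token.substring(2, token.length() - 1);
                // The closing tag must match the most recently opened tag.
                if (openTags.isEmpty() || !openTags.pop().equals(name)) {
                    return false;
                }
            } else {
                openTags.push(token.substring(1, token.length() - 1));
            }
        }
        // Anything still open at the end was never closed.
        return openTags.isEmpty();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TagFileChecker <file>");
            return;
        }
        // Scanner(Path) reads the file token by token.
        try (Scanner in = new Scanner(Paths.get(args[0]))) {
            System.out.println(isProperlyNested(in)
                    ? "Tags are properly nested"
                    : "Tags are not properly nested");
        }
    }
}
```

Because the method takes a Scanner, the same check can be run on an in-memory string with new Scanner("<p> </p>") when experimenting.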

Please note that this handles simple cases only. Self-closing tags such as <br /> are not handled, nor are attribute values that contain a '<' or '>' character.
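
One possible way to cover those two cases is sketched below. It is not the original answer's method: the class name and the regular expression are assumptions made for illustration. The pattern treats quoted attribute values as opaque, so a '<' or '>' inside quotes does not end the tag, and it skips any tag that ends in "/>". Void elements written without the slash, such as a bare <br>, are still counted as unclosed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class; the name and the pattern are illustrative assumptions.
class SelfClosingAwareChecker {

    // group 1: "/" if this is a closing tag
    // group 2: the tag name
    // group 3: "/" if the tag is self-closing, e.g. <br />
    // Quoted attribute values are consumed whole, so they may contain '<' or '>'.
    private static final Pattern TAG = Pattern.compile(
            "<(/?)(\\w+)(?:\"[^\"]*\"|'[^']*'|[^'\">])*?(/?)>");

    static boolean isBalanced(String html) {
        Deque<String> openTags = new ArrayDeque<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(3).isEmpty();
            String name = m.group(2);
            if (selfClosing) {
                continue; // opens and closes itself; nothing to track
            }
            if (closing) {
                // The closing tag must match the most recently opened tag.
                if (openTags.isEmpty() || !openTags.pop().equals(name)) {
                    return false;
                }
            } else {
                openTags.push(name);
            }
        }
        return openTags.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("<p> <br /> </p>"));
        System.out.println(isBalanced("<img title=\"a > b\" src=\"big.gif\" /> <p> </p>"));
        System.out.println(isBalanced("<p> <ul> </p> </ul>"));
    }
}
```

The first two calls in main are balanced inputs and the third is not. Text that does not look like a tag is simply ignored by this sketch, which is a looser rule than the programs above apply.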
