C/C++: As we introduced in class, the four functional characters (<, >, |, &) ar

ID: 3801245 • Letter: C

Question

C/C++:
As we introduced in class, the four functional characters (<, >, |, &) are for input redirection, output redirection, pipe and background jobs respectively. By default, they can be interpreted correctly by a UNIX terminal, but not by a C/C++ program. In a UNIX terminal if you type in “echo hello > file1”, you will find a file named “file1” created which has string “hello” stored inside, and that’s because the terminal interpreted “>” correctly and redirected the output of string “hello” from screen to “file1”.
In our Project 1, we want you to write a C/C++ program to interpret all these four functional characters as a UNIX terminal does. So when your program is running, we expect it to allow the user to type in a command line, and if the user types in “echo hello > file1”, the same thing as in a UNIX terminal should happen. This is only the test for output redirection, and we expect the other three functional characters to be interpreted correctly as well.
Note that function “system” cannot be used in your program because it will do everything for you. For example, “system(“echo hello > file1”);” can give the right result without needing you to do anything. Instead, we want you to use system calls for process management and file systems, such as fork, waitpid, execvp, exit, pipe, dup, open, close, etc. First, your program should read in the user’s command line and store it in an array. Then it checks if any of the four functional characters is in the array and will do corresponding work if so.
Using output redirection as an example: if I input “echo hello > file1”, my C/C++ program should read the tokens in and store them in an array named “args”, with args[0] holding “echo”, args[1] holding “hello”, args[2] holding “>” and args[3] holding “file1”. Then my program checks whether any of the four functional characters exists, and yes, “>” is found in args[2]. After this my program can do the corresponding work: since “>” is in args[2], the following argument, args[3], should be the output redirection destination, so my program creates a file called “file1” (using the system call “creat”) and redirects standard output to “file1” (using the system call “dup2”; there can be multiple solutions). Finally, my program executes the command by calling “execvp(args[0], args);”, after removing “>” and “file1” from args so that echo does not receive them as arguments.
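The first two steps described above (storing the command line in args and checking for a functional character) could look roughly like the following. This is only a sketch under assumptions: split_args and find_token are hypothetical helper names, and tokens are assumed to be separated by blanks, so “echo hello>file1” would not be split correctly.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Hypothetical helper: splits line in place on blanks, fills args[] and
 * NULL-terminates it. Returns the number of tokens stored. */
int split_args(char *line, char **args, int max_args)
{
    int n = 0;
    char *save = NULL;

    for (char *tok = strtok_r(line, " \t\n", &save);
         tok != NULL && n < max_args - 1;
         tok = strtok_r(NULL, " \t\n", &save))
        args[n++] = tok;
    args[n] = NULL;
    return n;
}

/* Hypothetical helper: returns the index of the first token equal to sym,
 * or -1 if there is none. */
int find_token(char **args, const char *sym)
{
    for (int i = 0; args[i] != NULL; i++)
        if (strcmp(args[i], sym) == 0)
            return i;
    return -1;
}
```

For the input “echo hello > file1”, split_args stores four tokens and find_token(args, ">") returns 2, which matches the args[2] position in the example.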

Explanation / Answer

The idea of a shell command interpreter is central to the Unix programming environment. In particular, the shell executes commands that you enter in response to its prompt in a terminal window, which is called the controlling terminal.

Each command is a character string naming a program or a built-in function to run, followed by zero or more arguments separated by spaces, and (optionally) a few special control directives. If the command names a program to run, then the shell spawns a new child process and executes that program, passing in the arguments. The shell can also direct a process to take its input from a file and/or write its output to a file. It also uses pipes to direct the output of one process to the input of another. Thus you can use a shell as an interactive scripting language that combines subprograms to perform more complex functions.

The shell can also read and execute a list of commands from a file, called a shell script. A shell script is just another kind of program, written in the shell command language, a programming language that is interpreted by the shell. Modern shells support programming language features including variables and control structures like looping and conditional execution. Thus shell scripting is akin to programming in interpreted languages such as Perl, Python, and Ruby.

There are different kinds of shells; after all, they are just programs, which can easily be extended to support different features. You can find out which shell you are running by inspecting the $SHELL environment variable, for example by running echo $SHELL.

The shell maintains a set of environment variables with various names and values. This set is essentially a property list of user settings. Each environment variable has a name and a value: both the name and the value are character strings. Shell commands may reference environment variables by name. The environment variables are also passed to all programs that execute as children of the shell.
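From inside a C program, an environment variable can be read by name with the standard getenv() call. The sketch below is illustrative only; current_shell is a hypothetical helper name and the “unknown” fallback is an assumption.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the value of $SHELL, or "unknown" if the
 * variable is not set in this process's environment. */
const char *current_shell(void)
{
    const char *shell = getenv("SHELL");
    return shell != NULL ? shell : "unknown";
}
```

Because a child created with fork() and execvp() receives a copy of its parent's environment, a program started by the shell would see the same value.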

The Devil Shell: dsh

For this lab you will use Unix system calls to implement a simple shell, the devil shell (dsh). dsh supports basic shell features: it spawns child processes, directs its children to execute external programs named in commands, passes arguments into programs, redirects standard input/output for child processes, chains processes using pipes, and (optionally) monitors the progress of its children.

You should understand the concepts of environment variables, system calls, standard input and output, I/O redirection, parent and child processes, current directory, pipes, jobs, foreground and background, signals, and end-of-file. These concepts are discussed in class and in the reading.

dsh prints a prompt of your choosing (get creative!) before reading each input line. As part of this lab, you will change the prompt to include the process identifier.
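One possible way to build a prompt that includes the process identifier is shown below. The prompt text “dsh-<pid>$ ” and the name format_prompt are assumptions made for illustration; the lab leaves the exact wording up to you.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: writes a prompt such as "dsh-1234$ " into buf.
 * Returns the snprintf() result (the length the prompt needs). */
int format_prompt(char *buf, size_t size)
{
    return snprintf(buf, size, "dsh-%ld$ ", (long)getpid());
}
```

The shell would print the buffer and call fflush(stdout) before reading each input line, since the prompt does not end in a newline.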

dsh reads command lines from its standard input and interprets one line at a time. We provide a simple command line parser (in parse.c) to save you the work of writing one yourself. You should not need to look at the parser code (really, don’t), but you will need to familiarize yourself with the data structures that the parser returns, and how to read important information from them. Your dsh must free these structures when it is done with them.

dsh exits when the user issues the built-in command quit, which you will implement. You can also exit dsh by pressing “ctrl-d” at the terminal. “ctrl-d” sends an end-of-file marker to signal the program that there is no more input to process (this condition tells dsh to quit). The parser already detects the EOF marker and indicates whether it was received as input; you merely have to handle this case and route control to your quit implementation.
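The quit and end-of-file handling can be illustrated without the supplied parser, whose interface is not shown here. The sketch below is a stand-in that reads lines with fgets(); next_line is a hypothetical name, and in the real lab the parser in parse.c would report EOF instead.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf and strips the newline.
 * Returns 1 if the shell should keep going, 0 on EOF (ctrl-d) or "quit". */
int next_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;                       /* EOF or read error: treat as quit */
    buf[strcspn(buf, "\n")] = '\0';     /* drop the trailing newline, if any */
    return strcmp(buf, "quit") != 0;
}
```

A main loop could then be written as while (next_line(stdin, buf, sizeof buf)) { ... }, so that both exit paths end up in the same place.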

The command-line input to dsh can contain one or more commands, separated by the special characters “;” and “|”. The supplied command-line parser supports four special characters “;”, “|”, “<” and “>”. The special characters “<” and “>” are I/O redirection directives: “<” redirects the standard input of a child process to read from a named file, and “>” redirects the standard output of a child process to write to a named file.

We define the syntax of input that dsh accepts using a Backus Normal Form (BNF) notation, as follows, to make the shell syntax easy to understand.

The semicolon (;) is used to separate the commands on a command line, if there is more than one command. The commands are executed from left to right, sequentially. Unlike regular shells that halt on an error, dsh is so cool that it simply ignores the error or failure of a command and moves on to execute the next. The fact that the command failed is recorded in a log file. As part of the lab, you will handle the case where a user input can contain a sequence of commands each terminated by a semicolon (or end-of-line), and execute the commands sequentially in the order specified.
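The sequential behavior described above can be sketched as follows. This is an illustration under assumptions, not the lab's parser: run_simple and run_sequence are hypothetical names, commands are split on “;” and blanks with strtok_r(), pipes and redirection are ignored, and a message on stderr stands in for the log file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 64

/* Hypothetical helper: runs one simple command and returns its exit
 * status, or -1 if it could not be run. */
static int run_simple(char *cmd)
{
    char *args[MAX_ARGS];
    char *save = NULL;
    int n = 0;

    for (char *tok = strtok_r(cmd, " \t\n", &save);
         tok != NULL && n < MAX_ARGS - 1;
         tok = strtok_r(NULL, " \t\n", &save))
        args[n++] = tok;
    args[n] = NULL;
    if (n == 0)
        return 0;                 /* empty command: nothing to do */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(args[0], args);
        _exit(127);               /* only reached if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Runs each ";"-separated command left to right and keeps going after a
 * failure. Returns how many commands failed. */
int run_sequence(char *line)
{
    int failures = 0;
    char *save = NULL;

    for (char *cmd = strtok_r(line, ";", &save); cmd != NULL;
         cmd = strtok_r(NULL, ";", &save)) {
        if (run_simple(cmd) != 0) {
            fprintf(stderr, "command failed, continuing\n");
            failures++;
        }
    }
    return failures;
}
```

Each command is waited for before the next one starts, which is what makes the execution sequential.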

The special character “|” indicates a pipeline: if a command is followed by a “|” then the shell arranges for its standard output (stdout) to pass through a pipe to the standard input (stdin) of its successor. The command and its successor are grouped in the same job. If there is no successor then the command is malformed. Your dsh should enable a user to use pipes to compose commands in this way to perform more complex tasks.

In addition, the last non-blank character of a job may be an “&”. The meaning of “&” is simply that any child processes for the job execute in the background. If a job has no “&” then it runs in the foreground.
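A rough sketch of the foreground/background distinction follows. The names strip_background_marker and launch are hypothetical, and the sketch leaves out job control and signals: the only difference it shows is whether the parent waits for the child.

```c
#include <ctype.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 and removes the marker if the last
 * non-blank character of line is '&'; returns 0 otherwise. */
int strip_background_marker(char *line)
{
    size_t len = strlen(line);

    while (len > 0 && isspace((unsigned char)line[len - 1]))
        len--;
    if (len == 0 || line[len - 1] != '&')
        return 0;
    line[len - 1] = '\0';
    return 1;
}

/* Hypothetical helper: starts args[0] in a child process. A foreground
 * job is waited for; a background job is not. Returns the child's pid,
 * or -1 if fork() failed. */
pid_t launch(char **args, int background)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(args[0], args);
        _exit(127);               /* only reached if exec failed */
    }
    if (!background) {
        int status;
        waitpid(pid, &status, 0); /* foreground: block until it finishes */
    }
    return pid;
}
```

A background child still has to be reaped eventually, for example with waitpid(pid, &status, WNOHANG) before each prompt, or it remains as a zombie process.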

Each command (of a job) is a sequence of words (tokens) separated by blank space. The first token names the command: it is either the name of a built-in command or the name of an external program (an executable file) to run. The built-ins (discussed later) are implemented directly within dsh.

The dsh supports input/output redirection using the special characters “<” and “>”. Note that these two special characters are treated as directives to the shell to modify the standard input (stdin) and/or standard output (stdout) of the command. The directives always follow a command whose standard input and/or output are to be redirected to a file. These concepts are discussed in class and in the reading.
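The redirection directives could be handled along the following lines. This is a minimal sketch, assuming the command has already been split into a NULL-terminated args array; run_command and redirect_fd are hypothetical names, and open() with O_CREAT is used here as one of several possible ways to create the output file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Opens path and makes it the target descriptor (0 = stdin, 1 = stdout). */
static int redirect_fd(const char *path, int flags, int target)
{
    int fd = open(path, flags, 0644);

    if (fd < 0)
        return -1;
    if (fd != target) {
        if (dup2(fd, target) < 0) {
            close(fd);
            return -1;
        }
        close(fd);                /* the duplicate on target stays open */
    }
    return 0;
}

/* Hypothetical helper: runs args[] in a child, honoring "<" and ">".
 * The directives and file names are removed from args before exec.
 * Returns the child's exit status, or -1 on error. */
int run_command(char **args)
{
    const char *infile = NULL, *outfile = NULL;
    int n = 0;

    for (int i = 0; args[i] != NULL; i++) {
        int is_in = strcmp(args[i], "<") == 0;
        int is_out = strcmp(args[i], ">") == 0;

        if (is_in || is_out) {
            if (args[i + 1] == NULL)
                return -1;        /* directive without a file name */
            if (is_in)
                infile = args[i + 1];
            else
                outfile = args[i + 1];
            i++;                  /* skip the file name */
        } else {
            args[n++] = args[i];  /* keep ordinary arguments */
        }
    }
    args[n] = NULL;
    if (n == 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Redirect in the child only, so the shell keeps its own streams. */
        if (infile && redirect_fd(infile, O_RDONLY, STDIN_FILENO) < 0) {
            perror(infile);
            _exit(1);
        }
        if (outfile && redirect_fd(outfile, O_WRONLY | O_CREAT | O_TRUNC,
                                   STDOUT_FILENO) < 0) {
            perror(outfile);
            _exit(1);
        }
        execvp(args[0], args);
        perror(args[0]);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Doing the redirection after fork() and before execvp() is the key design choice: the program being run simply inherits descriptors 0 and 1 and needs no knowledge of the files.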

Pipelines

A pipeline is a sequence of processes chained by their standard streams, so that the output of each process (stdout) feeds directly as input (stdin) to the next one. If a command line contains a symbol |, the processes are chained by a pipeline.

Pipes can be implemented using the pipe() and dup2() system calls. A more generic pipeline can be of the form:

where inFile and outFile are input and output files for redirection.

The descriptors in the child are often duplicated onto standard input or output. The child can then exec() another program, which inherits the standard streams. dup2() is useful to duplicate the child descriptors to stdin/stdout. For example, consider:

where dup2() closes stdin and duplicates the input end of the pipe to stdin. The call to exec() overlays the child’s text segment (code) with a new executable, which inherits the standard streams from its parent, including the input end of the pipe as its standard input. Now anything that the original parent process sends to the pipe goes into the newly exec’ed child process.
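The pipe(), dup2() and exec steps described in this section can be put together as in the sketch below. It is not the lab's reference solution: run_pipe is a hypothetical name, it handles exactly two commands, and both sides of the pipe run as child processes rather than the parent writing into the pipe itself.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs "left | right" and returns the exit status
 * of right, or -1 on error. */
int run_pipe(char **left, char **right)
{
    int fds[2];                   /* fds[0] = read end, fds[1] = write end */

    if (pipe(fds) < 0)
        return -1;

    pid_t writer = fork();
    if (writer < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (writer == 0) {
        close(fds[0]);                    /* writer does not read */
        dup2(fds[1], STDOUT_FILENO);      /* stdout now goes into the pipe */
        close(fds[1]);
        execvp(left[0], left);
        _exit(127);
    }

    pid_t reader = fork();
    if (reader < 0) {
        close(fds[0]);
        close(fds[1]);
        waitpid(writer, NULL, 0);
        return -1;
    }
    if (reader == 0) {
        close(fds[1]);                    /* reader does not write */
        dup2(fds[0], STDIN_FILENO);       /* stdin now comes from the pipe */
        close(fds[0]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The parent must close both ends, or the reader never sees EOF. */
    close(fds[0]);
    close(fds[1]);

    int status;
    waitpid(writer, NULL, 0);
    if (waitpid(reader, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A longer pipeline would repeat the same pattern, creating one pipe between each adjacent pair of commands.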
