FALL2022 CS537 – Operating Systems XXXXXXXXXX
Assignment 4: Implementing a Linux shell
Date assigned: October 5, 2022
Date due: November 2, 2022
Unix Shell
In this project, you will build a simple Unix shell to replace sh provided by the Linux kernel. The shell is the
command-line interface, and thus central to any Unix/C programming environment. Mastering use of the
shell is necessary to become proficient in this world; knowing how the shell itself is built is the focus of this
project.
There are three specific objectives for this assignment:
1. To reinforce/expand your Unix programming skills
2. To create, destroy and manage Unix processes, and
3. To understand the role and functioning of shell user interfaces
Overview
In this assignment, you will implement a command line interpreter (CLI) or, as it is more commonly known,
a shell. The shell should operate in this basic way: when you type in a command (in response to its
prompt), the shell creates a child process that executes the command you entered and then prompts for
more user input when it has finished.
The shell you implement will be like, but simpler than, the one used every day in Unix systems. If you don’t
know what shell you are running, it’s probably bash. One thing you should do on your own time is to learn
more about the Linux (Ubuntu) shell by reading the man pages or other online materials.
Program Specifications
Basic Shell: smash
Your basic shell, referred to as “smash” (short for the Super Milwaukee Shell, naturally), is basically an
interactive loop: it repeatedly prints a prompt smash> (including the space after the greater-than sign),
parses the input, executes the command specified on that line of input, and waits for the command to finish.
This loop is repeated until the user types exit or ctrl-z.
The name of your shell source code should be smash.c.
The shell can be invoked with either no arguments or a single argument; anything else is an error. Here is
the no-argument version:
At this point, smash is running and ready to accept commands. Type away!
The mode above is called interactive mode and allows the user to type commands directly. The shell also
supports a batch mode, which reads input from a batch file and executes the commands found there. Here is
how you run the shell with a batch file named batch.txt:
Notice that in interactive mode, a prompt is printed (smash> ). In batch mode, no prompt should be printed.
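The choice between the two modes can be sketched as follows. This is only an illustration: the helper name open_input and its interactive out-parameter are hypothetical, not part of the assignment.

```c
#include <stdio.h>

/* Hypothetical helper: pick the input stream from the command line.
 * Returns stdin for interactive mode, the opened batch file for batch
 * mode, or NULL when the invocation is an error. */
FILE *open_input(int argc, char *argv[], int *interactive)
{
    if (argc == 1) {                  /* no arguments: interactive mode */
        *interactive = 1;
        return stdin;
    }
    if (argc == 2) {                  /* one argument: batch file */
        *interactive = 0;
        return fopen(argv[1], "r");   /* NULL if it cannot be opened */
    }
    return NULL;                      /* anything else is an error */
}
```

The caller would print the prompt only when interactive is set, and report an error when NULL comes back.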
You should structure your shell such that it creates a process for each new command (the exception is
built-in commands, discussed below). Your basic shell should be able to parse a command and run the
program corresponding to the command. For example, if the user types ls -la /tmp , your shell should run
the program /bin/ls with the given arguments -la and /tmp. (How does the shell know to run /bin/ls? It
follows something called the shell path; more on this below).
Structure
Basic Shell
The shell is very simple (conceptually): it runs in a while loop, repeatedly asking for input to tell it what
command to execute. It then executes that command. The loop continues indefinitely, until the user types the
shell's built-in command “exit”, at which point it exits. That’s it!
For reading lines of input, use the getline() library function. This allows you to obtain arbitrarily long input
lines with ease. Generally, the shell will be run in interactive mode, where the user types a command-line
(one at a time) and the shell acts on it. However, your shell will also support batch mode, in which the shell
is given an input file of command-lines; in this case, the shell should not read user input (from stdin) but
rather from this batch file to acquire the commands to execute.
In either mode, if you hit the end-of-file marker (EOF) in the batch file you should call exit(0) and exit
gracefully.
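A minimal sketch of the read loop, assuming the input stream has already been chosen. The names shell_loop, handle_line, and lines_handled are placeholders invented for this example; handle_line stands in for whatever parses and runs one command line.

```c
#define _GNU_SOURCE        /* exposes getline() under strict -std= modes */
#include <stdio.h>
#include <stdlib.h>

static int lines_handled = 0;

/* Placeholder for the real "parse and execute one line" step. */
static void handle_line(char *line)
{
    (void)line;
    lines_handled++;
}

/* Read lines from `in` until EOF, printing the prompt only in
 * interactive mode. Returns the number of lines read. */
int shell_loop(FILE *in, int interactive)
{
    char *line = NULL;     /* getline() allocates and grows this buffer */
    size_t cap = 0;
    int count = 0;

    for (;;) {
        if (interactive) {
            printf("smash> ");
            fflush(stdout);            /* the prompt has no newline */
        }
        if (getline(&line, &cap, in) == -1)
            break;                     /* EOF or read error */
        count++;
        handle_line(line);
    }
    free(line);
    return count;                      /* the caller can now call exit(0) */
}
```

Because getline() reuses and resizes the same buffer, the loop needs only a single free() at the end.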
To parse each command-line into its constituent pieces, consider using strsep(). Read the man page
(carefully) for more details.
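One way to split a line with strsep() is sketched below. The function name split_line and the max_args limit are assumptions made for this example. Note that strsep() returns empty tokens for consecutive delimiters, so those are skipped.

```c
#define _GNU_SOURCE        /* exposes strsep() under strict -std= modes */
#include <string.h>

/* Split `line` in place on spaces, tabs, and newlines. Stores at most
 * max_args - 1 tokens in argv, terminates argv with NULL (as execv
 * expects), and returns the number of tokens stored. */
int split_line(char *line, char *argv[], int max_args)
{
    int argc = 0;
    char *tok;

    while (argc < max_args - 1 &&
           (tok = strsep(&line, " \t\n")) != NULL) {
        if (*tok == '\0')
            continue;      /* empty token between adjacent delimiters */
        argv[argc++] = tok;
    }
    argv[argc] = NULL;
    return argc;
}
```

The tokens point into the original buffer, so the line must stay alive (and unmodified) for as long as argv is in use.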
To execute commands, use fork(), exec(), and wait()/waitpid(). See those man pages for the definitions
of these functions and read the relevant book chapter (http://www.ostep.org/cpu-api.pdf) for a brief
overview.
You will note that there are a variety of commands in the exec family. For this project use execv. You should
not use the system() library function call to run a command. Remember that if execv() is successful, it will
not return; if it does return, there was an error (e.g., the command does not exist). The most challenging part
is getting the arguments correctly specified.
Note: Generally argv[0] for programs is the program’s name instead of a pathname.
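The fork/execv/waitpid pattern might look like the sketch below. The name run_program is hypothetical, and the exit code 127 for a failed execv() is a common shell convention assumed here, not something the assignment requires.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the executable at `path` with the NULL-terminated vector `argv`
 * (argv[0] is the program name). Returns the child's exit status, or
 * -1 if fork/waitpid failed or the child did not exit normally. */
int run_program(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0) {                /* child */
        execv(path, argv);         /* returns only on failure */
        _exit(127);                /* assumed convention for exec failure */
    }

    int status;                    /* parent: wait for this child */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, with char *args[] = {"ls", "-la", "/tmp", NULL}, the call run_program("/bin/ls", args) runs the listing and returns once it finishes. The child uses _exit() rather than exit() so that it does not flush stdio buffers inherited from the parent.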
Paths
In our example above, the user typed ls but the shell knew to execute the program /bin/ls. How does
your shell know this?
It turns out that the user must specify a path variable to describe the set of directories to search for
executables; the set of directories that comprise the path are sometimes called the search path of the shell.
The path variable contains the list of all directories to search, in order, when the user types a command.
Important: Note that the shell itself does not implement ls or other commands (except built-ins). All it does
is find those executables in one of the directories specified by path and create a new process to run them.
To check if a particular file exists in a directory and is executable, consider the access() system call. For
example, when the user types “ls”, and path is set to include both /usr/bin and /bin (assuming empty
path list at first, /bin is added, then /usr/bin is added), try access("/usr/bin/ls", X_OK). If that fails, try
/bin/ls. If that fails too, it is an error. Your initial shell path should contain one directory: /bin
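The search described above can be sketched like this. The function name find_executable and its parameters are assumptions for illustration; the directory array stands in for however your shell stores its path list.

```c
#include <stdio.h>
#include <unistd.h>

/* Try each directory in order and write the first "<dir>/<name>" that
 * access() reports as executable into `out`. Returns 0 on success and
 * -1 if no directory contains an executable with that name. */
int find_executable(const char *name, char *const dirs[], int ndirs,
                    char *out, size_t outsz)
{
    for (int i = 0; i < ndirs; i++) {
        int n = snprintf(out, outsz, "%s/%s", dirs[i], name);
        if (n < 0 || (size_t)n >= outsz)
            continue;              /* candidate does not fit; skip it */
        if (access(out, X_OK) == 0)
            return 0;              /* found: out holds the full path */
    }
    return -1;                     /* not found in any directory */
}
```

With the path list {"/usr/bin", "/bin"}, a lookup of ls checks /usr/bin/ls first and falls back to /bin/ls, matching the order in the example above.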
Note: Most shells allow you to specify a binary specifically without using a search path, using either
absolute paths or relative paths. For example, a user could type the absolute path /bin/ls and execute
the ls binary without a search path being needed. A user could also specify a relative path which starts
with the current working directory and specifies the executable directly, e.g., ./main . In this project, you do
not have to worry about these features.
Built-in Commands
Whenever your shell accepts a command, it should check whether the command is a built-in command or
not. If it is, it should not be executed like other programs. Instead, your shell will invoke your implementation
of the built-in command. For example, to implement the exit built-in command, you simply call exit(0); in
your smash source code, which then will exit the shell.
In this project, you should implement exit, cd, and path as built-in commands.
exit: When the user types exit, your shell should simply call the exit system call with 0 as a parameter.
It is an error to pass any arguments to exit.
cd: cd should always take one argument (0 or >1 args should produce an error). To change directories,
use the chdir() system call with the argument supplied by the user; if chdir fails, that is also an error.
path: The path command takes 1 or more arguments, with each argument separated by whitespace from
the others. Three options are supported: add, remove, and clear. Clarification: Invalid arguments should
generate an error.
add accepts 1 path. Your shell should insert it at the beginning of the path list. For example,
path add /usr/bin results in the path list containing /usr/bin and /bin (notice the order here). Your shell
should not report an error if an invalid path is added. It should kindly accept it.
remove accepts 1 path. It searches through the current path list and removes the corresponding one. If the
path cannot be found, this is an error.
clear takes no additional argument. It simply removes everything from the path list. If the user sets path to
be empty, then the shell should not be able to run any programs (except built-in commands).
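The three path options could be backed by a small array, as in the sketch below. The fixed-size array, the MAX_PATHS limit, and the function names are all assumptions made for this example; a linked list or a growable array would work equally well.

```c
#define _GNU_SOURCE        /* exposes strdup() under strict -std= modes */
#include <stdlib.h>
#include <string.h>

#define MAX_PATHS 64       /* arbitrary limit chosen for this sketch */

static char *path_list[MAX_PATHS];
static int path_count = 0;

/* add: insert a copy of `dir` at the front; no validity check is made. */
int path_add(const char *dir)
{
    if (path_count == MAX_PATHS)
        return -1;
    char *copy = strdup(dir);
    if (copy == NULL)
        return -1;
    memmove(&path_list[1], &path_list[0],
            path_count * sizeof path_list[0]);
    path_list[0] = copy;
    path_count++;
    return 0;
}

/* remove: delete the first matching entry; -1 means "not found". */
int path_remove(const char *dir)
{
    for (int i = 0; i < path_count; i++) {
        if (strcmp(path_list[i], dir) == 0) {
            free(path_list[i]);
            memmove(&path_list[i], &path_list[i + 1],
                    (path_count - i - 1) * sizeof path_list[0]);
            path_count--;
            return 0;
        }
    }
    return -1;
}

/* clear: drop every entry, leaving an empty search path. */
void path_clear(void)
{
    for (int i = 0; i < path_count; i++)
        free(path_list[i]);
    path_count = 0;
}
```

Starting from an empty list, path_add("/bin") followed by path_add("/usr/bin") leaves /usr/bin first and /bin second, which is the order the search should follow.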
Redirection
Many times, a shell user prefers to send the output of a program to a file rather than to the screen. Usually,
a shell provides this nice feature with the > character. Formally this is named as redirection of standard
output. To make your shell users happy, your shell should also include this feature, but with a slight twist
(explained below).
For example, if a user types “ls -la /tmp > output”, nothing should be printed on the screen. Instead,
the standard output of the ls program should be rerouted to the file output. In addition, the standard error
output of the program should be rerouted to the file output (the twist is that this is a little different than
standard redirection). However, if the program cannot be found (e.g., pwd mistyped as pdd), an error should
be reported, but it should not be redirected to output.
If the output file exists before you run your program, you should simply overwrite it (after truncating it).
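A sketch of the redirection step, under the assumption that the parent has already resolved the executable's full path (so a "not found" error is reported by the parent before anything is redirected). The name run_redirected, the file mode 0644, and the exit codes 126/127 are choices made for this example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `path` with `argv`, sending both stdout and stderr of the child
 * to `outfile`, which is created if missing and truncated if present. */
int run_redirected(const char *path, char *const argv[],
                   const char *outfile)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                           /* child */
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            _exit(126);                       /* could not open the file */
        dup2(fd, STDOUT_FILENO);              /* stdout -> file */
        dup2(fd, STDERR_FILENO);              /* stderr -> same file */
        if (fd > STDERR_FILENO)
            close(fd);                        /* original fd no longer needed */
        execv(path, argv);
        _exit(127);                           /* execv failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Doing the open() and dup2() calls in the child, after fork() and before execv(), leaves the shell's own stdout and stderr untouched.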
The exact format of redirection is a command (and possibly some arguments) followed by the redirection
symbol followed by a filename. Multiple redirection operators or multiple files to the right of the redirection
sign are errors. Redirection without a command is also not allowed - an error should be printed out, instead
of being redirected.
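The format rules above can be checked before any process is created. The sketch below is one possible approach: the name split_redirect and its return codes are hypothetical, and it assumes that whitespace around > is optional, which the text does not specify.

```c
#define _GNU_SOURCE        /* exposes strsep() under strict -std= modes */
#include <string.h>

/* Split `line` in place around a single '>'.
 * Returns 0 if there is no redirection, 1 if there is exactly one
 * operator followed by exactly one file name (stored in *file), and
 * -1 if the redirection is malformed. On return 1, `line` holds only
 * the command part. */
int split_redirect(char *line, char **file)
{
    *file = NULL;
    char *gt = strchr(line, '>');
    if (gt == NULL)
        return 0;                        /* plain command */
    if (strchr(gt + 1, '>') != NULL)
        return -1;                       /* more than one operator */

    *gt = '\0';                          /* cut the command off at '>' */
    if (line[strspn(line, " \t\n")] == '\0')
        return -1;                       /* no command before '>' */

    char *rest = gt + 1, *tok, *name = NULL;
    while ((tok = strsep(&rest, " \t\n")) != NULL) {
        if (*tok == '\0')
            continue;                    /* skip empty tokens */
        if (name != NULL)
            return -1;                   /* more than one file name */
        name = tok;
    }
    if (name == NULL)
        return -1;                       /* no file name after '>' */
    *file = name;
    return 1;
}
```

A return of -1 corresponds to the error cases listed above; the error message itself goes to the shell's own output, never to the target file.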
Note:
Don’t worry about redirection for built-in commands (e.g., we will not test what happens when you type
path /bin > file).
Don’t worry about the order of stdout and stderr. In other