This blog post describes, step by step, what happens when you type
ls -l and hit Enter in a shell.
Before guiding you through the steps, I will briefly explain some concepts.
What is a Shell?
A shell is a program that takes the commands a user types on the keyboard and passes them to the operating system to run through the kernel. It also checks whether those commands are valid.
On most Linux systems, a program called bash acts as the shell. The name stands for Bourne Again SHell: bash is an improved version of sh, the original Unix shell program written by Steve Bourne.
To better understand how a shell works, we built our own simple_shell, so we will use some material from that project (flowchart, examples …) in this description.
Let’s start with where the command is written.
Prompt, an infinite loop
The prompt is the character or set of characters displayed on a command line to indicate that the shell is waiting for commands. It varies from shell to shell and is usually configurable. It can also give us information such as the user’s privilege level, the machine name, the working directory, etc.
In our project this was the first step: creating an infinite loop that writes a prompt (Alej @ Super Shell $). For this we use while (1) (code, line 19), which keeps executing the body of the loop until some condition makes it exit with a break (code, line 30) or something similar.
Your shell works in an analogous way. Try typing something, even nonsense characters, and press Enter. If all went well, you should get an error message complaining that the shell can’t understand the command. If it had known the command, you would receive a response instead, as shown on the right of image 3.
After the error message or the execution of the command, the prompt appears again, waiting for a new command. That is where you can see the infinite loop at work.
Let’s look a bit at where we write and where we receive:
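The loop described above can be sketched in C roughly as follows. This is a minimal illustration, not the project’s actual code: the function name prompt_loop and its stream parameters are hypothetical, chosen so the loop can be run against any input. It exits when getline reports end of input (for example Ctrl+D).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: print a prompt, read one line, repeat until
 * end of input. Returns the number of lines that were read. */
int prompt_loop(FILE *in, FILE *out)
{
    char *line = NULL;   /* getline allocates and grows this buffer */
    size_t cap = 0;
    int count = 0;

    while (1)
    {
        fputs("Alej @ Super Shell $ ", out);
        fflush(out);     /* make sure the prompt shows before we block */

        if (getline(&line, &cap, in) == -1)
            break;       /* EOF or read error ends the infinite loop */

        count++;         /* a real shell would parse and run the line here */
    }

    free(line);
    return count;
}
```

Calling prompt_loop(stdin, stdout) from main gives the interactive behaviour: the prompt comes back after every line until the input ends.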
In Unix, a file descriptor is an abstract indicator (handle) used to access a file or other input/output resource, such as a pipe or network socket. File descriptors form part of the POSIX application programming interface. A file descriptor is a non-negative integer, generally represented in the C programming language as the type int (see image 3):
- Standard input (stdin): the data that is sent to the program. In most cases, this data is entered using the keyboard or is the result of the execution of a previous command. The file descriptor associated with stdin is 0. As you can see in the code (line 25), we read into a buffer using the getline function on stdin.
- Standard output (stdout): the channel through which the program returns data after its execution. Usually stdout is the terminal screen. The corresponding file descriptor is 1. As you can see in the code (line 24), we write our prompt to stdout.
- Standard error (stderr): the channel through which an error message is sent when execution fails. Although this message is generally also displayed on the screen, it is important to note that Linux lets you distinguish between stderr and stdout and handle the two streams separately. The file descriptor is 2. We also use this descriptor when we write error messages, but that happens in other functions.
Now that we know how and where to write our command, let’s move on to the command itself and how the shell recognizes it.
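As a small illustration of the three descriptors, the sketch below writes directly to a descriptor with the POSIX write call. The helper names (write_str, print_prompt, print_error) and the message texts are made up for this example; STDOUT_FILENO and STDERR_FILENO are the standard constants from unistd.h for descriptors 1 and 2.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a whole string to the given descriptor.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_str(int fd, const char *s)
{
    return write(fd, s, strlen(s));
}

/* The prompt is normal output, so it goes to descriptor 1. */
void print_prompt(void)
{
    write_str(STDOUT_FILENO, "$ ");
}

/* Error messages go to descriptor 2 so they can be redirected separately. */
void print_error(const char *cmd)
{
    write_str(STDERR_FILENO, cmd);
    write_str(STDERR_FILENO, ": not found\n");
}
```

Because the two channels are separate, running a program built on these helpers with 2>/dev/null would hide the error messages while still showing the prompt.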
Reading the command
The shell takes all the typed characters, saves them, and splits them on a delimiter. In our shell, the getline() and strtok() functions did this work.
getline stores all the typed characters in a single array. Then strtok, with the delimiter “ ” (space), saves each group of characters in its own position in a second array. The result is a two-dimensional array (an array of strings) whose first position holds the first “word”, which is a fundamental piece for the next steps. Let’s look at it more graphically.
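The splitting step can be sketched in C like this. The function name split_line is hypothetical, and the sketch assumes the delimiter set also includes tab and newline (getline keeps the trailing newline, so splitting on space alone would leave it attached to the last word).

```c
#include <string.h>

/* Hypothetical helper: split `line` in place into words.
 * Fills `words` with pointers into `line`, terminates the list with
 * NULL, and returns the number of words. `max` is the capacity of
 * `words`, including the slot for the final NULL (so max >= 1). */
int split_line(char *line, char **words, int max)
{
    int count = 0;
    char *tok = strtok(line, " \t\n");   /* first call takes the buffer */

    while (tok != NULL && count < max - 1)
    {
        words[count++] = tok;
        tok = strtok(NULL, " \t\n");     /* later calls continue from it */
    }
    words[count] = NULL;                 /* execve expects a NULL-terminated list */
    return count;
}
```

For the input "ls -l\n", the first position holds "ls", the second holds "-l", and the third is NULL.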
Working with the first argument
Next we will see how the shell examines this argument in order to execute it. Our first argument is ls:
1. Is the first argument a built-in?
Built-in: built-in commands are contained within the shell itself. When the name of a built-in command is used as the first word of a simple command, the shell executes the command directly, without invoking another program. Built-in commands are required to implement functionality that is impossible or inconvenient to obtain with separate utilities. Some examples are alias, bg, bind, break, builtin, case, cd, command, compgen, complete, continue, declare, dirs, disown, echo, enable, eval, etc.
2. ls is not a built-in, so let’s continue.
3. Environment search for PATH
An environment variable is a variable whose value is set outside the program, typically through functionality built into the operating system or microservice. An environment variable is made up of a name/value pair, and any number may be created and available for reference at a point in time.
PATH is an environment variable on POSIX operating systems and Microsoft systems. It specifies the directories in which the command interpreter must search for the programs to run.
As you can see, PATH stores a list of directories. We have to check each of those directories to see whether the command is in one of them.
4. The idea is to concatenate our first argument (ls) with each of those directories in turn, building a full path until we find the executable we are looking for.
5. Our first argument is in /usr/bin, so now we can execute it using that directory concatenated with ls, like this: /usr/bin/ls, and then pass it the remaining arguments.
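The numbered steps above can be sketched in C as three small functions. All names here (is_builtin, find_in_path, run_command) are hypothetical, and the built-in list is a tiny illustrative subset. The post does not show how the command is finally launched, so the fork, execve and waitpid sequence below is an assumption based on the usual Unix approach. The search is also simplified: it ignores empty PATH entries and commands that already contain a slash.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Steps 1-2: is the first word one of the shell's own commands? */
int is_builtin(const char *cmd)
{
    const char *builtins[] = { "cd", "exit", "env", NULL };
    int i;

    for (i = 0; builtins[i] != NULL; i++)
        if (strcmp(cmd, builtins[i]) == 0)
            return 1;
    return 0;
}

/* Steps 3-4: join each PATH directory with cmd and return the first
 * candidate that is executable. The caller frees the result.
 * Returns NULL when nothing is found. */
char *find_in_path(const char *cmd, const char *path)
{
    char *copy, *dir, *full = NULL;

    if (cmd == NULL || path == NULL)
        return NULL;
    copy = strdup(path);              /* strtok modifies its input */
    if (copy == NULL)
        return NULL;

    for (dir = strtok(copy, ":"); dir != NULL; dir = strtok(NULL, ":"))
    {
        size_t len = strlen(dir) + strlen(cmd) + 2;  /* '/' and '\0' */

        full = malloc(len);
        if (full == NULL)
            break;
        snprintf(full, len, "%s/%s", dir, cmd);
        if (access(full, X_OK) == 0)
            break;                    /* found, e.g. /usr/bin/ls */
        free(full);
        full = NULL;
    }
    free(copy);
    return full;
}

/* Step 5: run the executable in a child process and wait for it.
 * Returns the child's exit status, or -1 on failure. */
int run_command(const char *full, char **args)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
    {
        execve(full, args, environ);
        perror(full);                 /* only reached if execve failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell built on these pieces would first call is_builtin on the first word, then find_in_path with the value of getenv("PATH"), and finally run_command with the full path and the whole argument list.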
The source code for the Simple Shell can be found here.
Made with love from Colombia for the Holberton School community and the world by Alejandra Higuera and Mauricio Alejandro Torres.
Feel free to leave a comment. Thanks for reading, regards!