
MIT Missing Semester (2019)

2 - Virtual machines and containers

Exercises

2.1

Choose a container software (Docker, LXC, …) and install a simple Linux image. Try SSHing into it.

2.2

Search for and download a prebuilt container image for a popular web server (nginx, Apache, …).

3 - Shell scripting

Exercises

3.1 (bash and Linux)

If you are completely new to the shell you may want to read a more comprehensive guide about it such as BashGuide. If you want a more in-depth introduction The Linux Command Line is a good resource.

3.2 (PATH, which, type)

We briefly discussed that the PATH environment variable is used to locate the programs that you run through the command line. Let's explore that a little further.

  • Run echo $PATH (or echo $PATH | tr -s ':' '\n' for pretty printing) and examine its contents. What locations are listed?
  • The command which locates a program in the user PATH. Try running which for common commands like echo, ls, or mv. Note that which is a bit limited since it does not understand shell aliases. Try running type and command -v for those same commands. How is the output different?
  • Run PATH= and try running the previous commands again. Some work and some don't. Can you figure out why?
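The lookups in the list above can be tried without touching your real session. This is a minimal sketch: split_path is a hypothetical helper name, and the empty PATH is confined to a subshell so the change does not leak out.

```shell
# Print one PATH entry per line; uses $PATH unless a value is passed in.
split_path() {
    printf '%s\n' "${1:-$PATH}" | tr -s ':' '\n'
}

split_path

# With a normal PATH, an external program resolves to a file on disk.
command -v ls

# In a subshell with an empty PATH, a shell builtin such as echo still resolves...
( PATH=; command -v echo )

# ...while an external program such as ls usually does not.
( PATH=; command -v ls ) || echo "ls is not found when PATH is empty"
```

Running the experiment inside ( ... ) means the original PATH is back as soon as the subshell exits.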

3.3 (special variables)

  • What does the variable ~ expand to? What about .? And ..?
  • What does the variable $? do?
  • What does the variable $_ do?
  • What does the variable !! expand to? What about !!*? And !l?
  • Look for documentation for these options and familiarize yourself with them

3.4

Sometimes piping doesn't quite work because the command being piped into does not expect the newline-separated format. For example, the file command tells you the properties of a file.

Try running ls | file and ls | xargs file. What is xargs doing?
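To see the effect in isolation, here is a small sketch that uses echo instead of file (an assumption made so the example runs on systems where file is not installed). The prefix got: is only there to make the argument boundaries visible.

```shell
# Three lines on stdin become three arguments to a single echo call.
printf 'a\nb\nc\n' | xargs echo got:

# With -n 1, xargs starts one echo per input item instead.
printf 'a\nb\nc\n' | xargs -n 1 echo got:
```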

3.5

When you write a script you can tell your shell which interpreter should be used to run it by using a shebang line. Write a script called hello with the following contents and make it executable with chmod +x hello. Then execute it with ./hello. Then remove the first line and execute it again. How is the shell using that first line?

#! /usr/bin/python

print("Hello World!")

You will often see programs that have a shebang that looks like #!/usr/bin/env bash. This is a more portable solution with its own set of advantages and disadvantages. How is env different from which? What environment variable does env use to decide what program to run?
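The env form of the shebang can be tried with a throwaway script. This is a sketch under a few assumptions: /usr/bin/env exists (true on most Unix-like systems), sh is used instead of python so nothing extra has to be installed, and demo_dir is a scratch directory introduced only for illustration.

```shell
# Create a scratch directory and a script whose interpreter is located via env.
demo_dir=$(mktemp -d)
cat > "$demo_dir/hello" <<'EOF'
#!/usr/bin/env sh
echo "Hello World!"
EOF

# Without the executable bit the kernel refuses to run the file directly.
chmod +x "$demo_dir/hello"

# The kernel reads the first line and runs: /usr/bin/env sh <path to hello>
"$demo_dir/hello"
```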

3.6 (pipes, process substitution, subshell)

Create a script called slow_seq.sh with the following contents and do chmod +x slow_seq.sh to make it executable.

#! /usr/bin/env bash

for i in $(seq 1 10); do
    echo $i;
    sleep 1;
done

There is a way in which pipes (and process substitution) differ from using subshell execution, i.e. $(). Run the following commands and observe the differences:

  • ./slow_seq.sh | grep -P "[3-6]"
  • grep -P "[3-6]" <(./slow_seq.sh)
  • echo $(./slow_seq.sh) | grep -P "[3-6]"
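Besides the timing difference you will observe, the shape of the data changes too. The sketch below replaces slow_seq.sh with plain seq (an assumption, to avoid the ten-second wait) and uses grep -c with a basic bracket expression to count matching lines; the function names are hypothetical.

```shell
# Pipe: grep receives ten separate lines and counts the ones that match.
count_streamed() {
    seq 1 10 | grep -c '[3-6]'
}

# Command substitution: the unquoted $(...) is word-split and echo joins
# the words with spaces, so grep receives a single line.
count_substituted() {
    echo $(seq 1 10) | grep -c '[3-6]'
}

count_streamed
count_substituted
```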

3.7 (misc)

  • Try running touch {a,b}{a,b} and then ls. What files appeared?
  • Sometimes you want to see a command's output on stdout and still save it to a file. Try running echo HELLO | tee hello.txt
  • Try running cat hello.txt > hello.txt. What do you expect to happen? What does happen?
  • Run echo HELLO > hello.txt and then run echo WORLD >> hello.txt. What are the contents of hello.txt? How is > different from >>?
  • Run printf "\e[38;5;81mfoo\e[0m\n". How was the output different? If you want to know more, search for ANSI color escape sequences.
  • Run touch a.txt and then run ^txt^log. What did bash do for you? In the same vein, run fc. What does it do?
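The redirection items in the list above can be tried safely in a scratch directory. This is a minimal sketch; work, hello.txt and log.txt are names chosen for illustration.

```shell
work=$(mktemp -d)

# tee copies its stdin both to stdout and to the named file.
echo HELLO | tee "$work/hello.txt"

# > truncates the target before writing, >> appends to it.
echo HELLO > "$work/log.txt"
echo WORLD >> "$work/log.txt"
cat "$work/log.txt"

# A second > replaces whatever the file held before.
echo AGAIN > "$work/hello.txt"
cat "$work/hello.txt"
```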

3.8 (keyboard shortcuts)

As with any application you use frequently, it is worth familiarising yourself with its keyboard shortcuts. Type the following ones and try figuring out what they do and in what scenarios it might be convenient to know about them. For some of them it might be easier to search online for what they do. (Remember that ^X means pressing Ctrl+X.)

  • ^A, ^E
  • ^R
  • ^L
  • ^C, ^\ and ^D
  • ^U and ^Y

4 - Command line environment

Exercises

4.1

Run

cat .bash_history | sort | uniq -c | sort -rn | head -n 10

or

cat .zhistory | sort | uniq -c | sort -rn | head -n 10

for zsh to get your 10 most used commands, and consider writing shorter aliases for them.
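To see what each stage of that pipeline contributes, here is a sketch that runs it on a small made-up history file. top_commands and history_sample are hypothetical names, and the sample lines are invented for illustration.

```shell
# Count identical lines in a file and print the N most frequent (default 10).
top_commands() {
    sort "$1" | uniq -c | sort -rn | head -n "${2:-10}"
}

# A tiny stand-in for a real history file.
history_sample=$(mktemp)
printf 'ls\ngit status\nls\ncd\nls\ngit status\n' > "$history_sample"

top_commands "$history_sample" 2
```

uniq -c only merges adjacent duplicates, which is why the first sort is needed. Note that this counts whole lines, so ls and ls -l are treated as different commands.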

4.2

Choose a terminal emulator and figure out how to change the following properties:

  • Font choice
  • Color scheme. How many colors does a standard scheme have? Why?
  • Scrollback history size

4.3

Install fasd or some similar software and write a bash/zsh function called v that performs fuzzy matching on the passed arguments and opens up the top result in your editor of choice. Then, modify it so that if there are multiple matches you can select them with fzf.

4.4

Since fzf is quite convenient for performing fuzzy searches and the shell history is quite prone to that kind of search, investigate how to bind fzf to ^R. You can find some info here.

4.5

What does the --bar option do in ack?

5 - Data wrangling

Exercises

5.1

If you are not familiar with Regular Expressions here is a short interactive tutorial that covers most of the basics.

5.2

How is sed s/REGEX/SUBSTITUTION/g different from the regular sed? What about /I or /m?

5.3

To do in-place substitution it is quite tempting to do something like sed s/REGEX/SUBSTITUTION/ input.txt > input.txt. However this is a bad idea, why? Is this particular to sed?
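One common way around the problem is to write to a temporary file and rename it afterwards. The sketch below is one possible shape for that, not the only one: safe_sed is a hypothetical helper, and it assumes mktemp is available.

```shell
# Apply a sed expression to a file without redirecting onto the input.
# Usage: safe_sed 's/REGEX/SUBSTITUTION/' input.txt
safe_sed() {
    tmp=$(mktemp) || return 1
    # The original is only replaced if sed succeeded.
    sed "$1" "$2" > "$tmp" && mv "$tmp" "$2"
}
```

Because the result is a new file moved into place, the permissions of the original may not be preserved. For example, safe_sed 's/foo/bar/' notes.txt rewrites notes.txt, where notes.txt is a placeholder name.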

5.4

Implement a simple grep equivalent tool in a language you are familiar with using regex. If you want the output to be color highlighted like grep is, search for ANSI color escape sequences.

5.5

Sometimes some operations like renaming files can be tricky with raw commands like mv. rename is a nifty tool to achieve this and has a sed-like syntax. Try creating a bunch of files with spaces in their names and use rename to replace them with underscores.
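For comparison, this is roughly what the same job looks like without rename, using a plain loop with mv and tr. It is a sketch of the manual approach rather than a substitute for the exercise; underscore_names is a hypothetical function name.

```shell
# Replace spaces with underscores in the names of entries directly inside a directory.
underscore_names() {
    for f in "$1"/*' '*; do
        # If nothing matched, the pattern is left as-is; skip it.
        [ -e "$f" ] || continue
        base=$(basename "$f")
        new=$(printf '%s' "$base" | tr ' ' '_')
        mv -- "$f" "$1/$new"
    done
}
```

The quoting around "$f" matters here: without it the shell would split each name at the very spaces being replaced. The loop does not check whether the new name already exists.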

5.6

Look for boot messages that are not shared between your past three reboots (see journalctl's -b flag). You may want to just mash all the boot logs together in a single file, as that may make things easier.

5.7

Produce some statistics of your system boot time over the last ten boots using the log timestamp of the messages

Logs begin at ...

and

systemd[577]: Startup finished in ...

5.8

Find the number of words (in /usr/share/dict/words) that contain at least three occurrences of the letter a and don't end in 's. What are the three most common last two letters of those words? sed's y command, or the tr program, may help you with case insensitivity. How many of those two-letter combinations are there? And for a challenge: which combinations do not occur?
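One possible shape for the first part of this pipeline is sketched below on a five-word sample instead of the real dictionary. The sample words, the words variable and the candidates function are all invented for illustration.

```shell
# A small stand-in for /usr/share/dict/words.
words=$(mktemp)
cat > "$words" <<'EOF'
banana
Alabama
apple
banana's
abracadabra
EOF

# Lowercase everything, keep words with at least three a's, drop 's endings.
candidates() {
    tr '[:upper:]' '[:lower:]' < "$1" | grep 'a.*a.*a' | grep -v "'s\$"
}

# How many words qualify?
candidates "$words" | grep -c .

# Most common final two letters among them.
candidates "$words" | sed 's/.*\(..\)$/\1/' | sort | uniq -c | sort -rn | head -n 3
```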

5.9

Find an online data set like this one or this one. Maybe another one from here. Fetch it using curl and extract out just two columns of numerical data. If you're fetching HTML data, pup might be helpful. For JSON data, try jq. Find the min and max of one column in a single command, and the sum of the difference between the two columns in another.
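Once the two numeric columns are extracted, awk is one tool that can handle both summaries. The sketch below assumes whitespace-separated columns and uses three made-up rows; min_max, sum_diff and data are hypothetical names.

```shell
# Three invented rows of two numeric columns.
data=$(mktemp)
printf '3 1\n7 2\n5 5\n' > "$data"

# Minimum and maximum of the first column.
min_max() {
    awk 'NR == 1 { min = max = $1 }
         $1 < min { min = $1 }
         $1 > max { max = $1 }
         END { print min, max }' "$1"
}

# Sum over all rows of (column 1 - column 2).
sum_diff() {
    awk '{ total += $1 - $2 } END { print total }' "$1"
}

min_max "$data"
sum_diff "$data"
```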

6 - Editors (Vim)

Exercises

6.1

Experiment with some editors. Try at least one command-line editor (e.g. Vim) and at least one GUI editor (e.g. Atom). Learn through tutorials like vimtutor (or the equivalents for other editors). To get a real feel for a new editor, commit to using it exclusively for a couple days while going about your work.

6.2

Customize your editor. Look through tips and tricks online, and look through other people's configurations (often, they are well-documented).

6.3

Experiment with plugins for your editor.

6.4

Commit to using a powerful editor for at least a couple weeks: you should start seeing the benefits by then. At some point, you should be able to get your editor to work as fast as you think.

6.5

Install a linter (e.g. pyflakes for Python), link it to your editor, and test that it is working.

7 - Version control (Git)

Exercises

7.1

On a repo try modifying an existing file. What happens when you do git stash? What do you see when running git log --all --oneline? Run git stash pop to undo what you did with git stash. In what scenario might this be useful?

7.2

One common mistake when learning git is to commit large files that should not be managed by git, or to add sensitive information. Try adding a file to a repository, making some commits and then deleting that file from history (you may want to look at this). Also, if you do want git to manage large files for you, look into Git-LFS.

7.3

Git is really convenient for undoing changes, but one has to be familiar with even the most unlikely changes:

  1. If a file is mistakenly modified in some commit it can be reverted with git revert. However if a commit involves several changes revert might not be the best option. How can we use git checkout to recover a file version from a specific commit?
  2. Create a branch, make a commit in said branch and then delete it. Can you still recover said commit? Try looking into git reflog. (Note: Recover dangling things quickly, git will periodically automatically clean up commits that nothing points to.)
  3. If one is too trigger happy with git reset --hard instead of git reset, changes can easily be lost. However, since the changes were staged, we can recover them (look into git fsck --lost-found and .git/lost-found).

7.4

In any git repo, look under the folder .git/hooks. You will find a bunch of scripts that end with .sample. If you rename them without the .sample they will run based on their name. For instance, pre-commit will execute before doing a commit. Experiment with them.

7.5

Like many command line tools git provides a configuration file (or dotfile) called ~/.gitconfig . Create an alias using ~/.gitconfig so that when you run git graph you get the output of git log --oneline --decorate --all --graph (this is a good command to quickly visualize the commit graph).

7.6

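Git also lets you define global ignore patterns in a file such as ~/.gitignore_global, which is useful to prevent common errors like adding RSA keys. Git only reads that file if the core.excludesFile setting points to it (for example, git config --global core.excludesFile ~/.gitignore_global). Create a ~/.gitignore_global file and add the pattern *rsa, then test that it works in a repo.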

7.7

Once you start to get more familiar with git, you will find yourself running into common tasks, such as editing your .gitignore. git extras provides a bunch of little utilities that integrate with git. For example git ignore PATTERN will add the specified pattern to the .gitignore file in your repo and git ignore-io LANGUAGE will fetch the common ignore patterns for that language from gitignore.io. Install git extras and try using some tools like git alias or git ignore.

7.8

Git GUI programs can be a great resource sometimes. Try running gitk in a git repo and explore the different parts of the interface. Then run gitk --all. What are the differences?

7.9

Once you get used to command line applications, GUI tools can feel cumbersome or bloated. A nice compromise between the two is ncurses-based tools, which can be navigated from the command line and still provide an interactive interface. Git has tig. Try installing it and running it in a repo. You can find some usage examples here.

8 - Dotfiles

Exercises

8.1

Create a folder for your dotfiles and set up version control.

8.2

Add a configuration for at least one program, e.g. your shell, with some customization (to start off, it can be something as simple as customizing your shell prompt by setting $PS1).

8.3

Set up a method to install your dotfiles quickly (and without manual effort) on a new machine. This can be as simple as a shell script that calls ln -s for each file, or you could use a specialized utility.
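As a starting point, the ln -s approach might look like the sketch below. link_dotfiles is a hypothetical function name, and the layout it assumes (dotfiles stored under their real names at the top level of the repository) is only one of several common conventions.

```shell
# Symlink every dotfile in a source directory into a destination directory.
# Usage: link_dotfiles /absolute/path/to/dotfiles "$HOME"
link_dotfiles() {
    src=$1
    dest=$2
    # .[!.]* matches names that start with a dot, but not . or ..
    for f in "$src"/.[!.]*; do
        [ -e "$f" ] || continue
        name=$(basename "$f")
        # Skip the repository's own metadata directory.
        [ "$name" = ".git" ] && continue
        # -f replaces an existing file or link of the same name.
        ln -sf "$f" "$dest/$name"
    done
}
```

The source path should be absolute so that the links still resolve regardless of the current directory. Since -f overwrites existing files, try it on a scratch destination before pointing it at your home directory.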

8.4

Test your installation script on a fresh virtual machine.

8.5

Migrate all of your current tool configurations to your dotfiles repository.

8.6

Publish your dotfiles on GitHub.

9 - Backups

Exercises

9.1

Consider how you are (not) backing up your data and look into fixing/improving that.

9.2

Figure out how to backup your email accounts.

9.3

Choose a web service you use often (Spotify, Google Music, etc.) and figure out what your options for backing up your data are. Often people have already built tools (such as youtube-dl) based on available APIs.

9.4

Think of a website you have visited repeatedly over the years and look it up in archive.org, how many versions does it have?

9.5

One way to efficiently implement deduplication is to use hard links. Whereas a symbolic link (also called a soft link or a symlink) is a file that points to another file or folder, a hard link is an exact copy of the pointer (it uses the same inode and points to the same place on the disk). Thus if the original file is removed a symlink stops working whereas a hard link doesn't. However, hard links only work for files. Try using the command ln to create hard links and compare them to symlinks created with ln -s. (In macOS you will need to install the GNU coreutils or the hln package.)
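The difference described above can be observed directly in a scratch directory. This is a minimal sketch; links, original, hard and soft are names chosen for illustration.

```shell
links=$(mktemp -d)
echo data > "$links/original"

# A hard link is a second name for the same inode.
ln "$links/original" "$links/hard"

# A symlink stores the path of its target.
ln -s "$links/original" "$links/soft"

# Remove the original name.
rm "$links/original"

# The data is still reachable through the hard link...
cat "$links/hard"

# ...but the symlink now points at a path that no longer exists.
[ -e "$links/soft" ] || echo "soft is a dangling symlink"
```

Running ls -li on the directory before the rm shows original and hard sharing one inode number.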

10 - Automation

Exercises

10.1

Make a script that looks every minute in your downloads folder for any file that is a picture (you can look into MIME types or use a regular expression to match common extensions) and moves them into your Pictures folder.
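A possible skeleton for the extension-matching variant is sketched below. move_pictures is a hypothetical function name and the extension list is just an example; scheduling it every minute is left to a tool such as cron, where an entry of the form * * * * * /path/to/script would be one option (the path is a placeholder).

```shell
# Move files with common picture extensions from one directory to another.
# Usage: move_pictures "$HOME/Downloads" "$HOME/Pictures"
move_pictures() {
    src=$1
    dest=$2
    mkdir -p "$dest"
    for f in "$src"/*; do
        # Only regular files; skip directories and unmatched patterns.
        [ -f "$f" ] || continue
        # Lowercase a copy of the name so that .PNG and .png both match.
        case $(printf '%s' "$f" | tr '[:upper:]' '[:lower:]') in
            *.jpg|*.jpeg|*.png|*.gif) mv -- "$f" "$dest"/ ;;
        esac
    done
}
```

Matching on extensions is simple but can be fooled by misnamed files, which is where the MIME type approach mentioned in the exercise comes in.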

10.2

Write a cron script that checks weekly for outdated packages on your system and either prompts you to update them or updates them automatically.

11 - Machine introspection

Exercises

11.1

locate? dmidecode? tcpdump? /boot? iptables? /proc?

12 - Program introspection

tbd

13 - Package dependency management

tbd

14 - OS customization

Exercises

14.1

Figure out how to remap your Caps Lock key to something you use more often (such as Escape or Ctrl or Backspace).

14.2

Make a custom global keyboard shortcut to open a new terminal window or a new browser window.

15 - Remote machines

Exercises

15.1

For SSH to work the host needs to be running an SSH server. Install an SSH server (such as OpenSSH) in a virtual machine so you can complete the rest of the exercises. To find the IP address of the machine, run the command ip addr and look for the inet field (ignore the 127.0.0.1 entry, which corresponds to the loopback interface).

15.2

Go to ~/.ssh/ and check if you have a pair of SSH keys there. If not, generate them with ssh-keygen -t rsa -b 4096. It is recommended that you use a password and use ssh-agent, more info here.

15.3

Use ssh-copy-id to copy the key to your virtual machine. Test that you can ssh without a password. Then, edit your sshd_config in the server to disable password authentication by editing the value of PasswordAuthentication. Disable root login by editing the value of PermitRootLogin.

15.4

Edit the sshd_config in the server to change the ssh port and check that you can still ssh. If you ever have a public-facing server, a non-default port and key-only login will deter a significant number of malicious attacks.

15.5

Install mosh in your server/VM, establish a connection and then disconnect the network adapter of the server/VM. Can mosh properly recover from it?

15.6

Another use of local port forwarding is to tunnel traffic for a certain host through the server. If your network filters some website, for example reddit.com, you can tunnel it through the server as follows:

  • Run ssh remote_server -L 80:reddit.com:80
  • Set reddit.com and www.reddit.com to 127.0.0.1 in /etc/hosts
  • Check that you are accessing that website through the server
  • If it is not obvious, use a website such as ipinfo.io, whose output changes depending on your host's public IP.

15.7

Background port forwarding can easily be achieved with a couple of extra flags. Look into what the -N and -f flags do in ssh and figure out what a command such as ssh -N -f -L 9999:localhost:8888 foobar@remote_server does.

16 - Web and browsers

Exercises

16.1

Edit a keyword search engine that you use often in your web browser.

16.2

Install the mentioned extensions. Look into how uBlock Origin/Privacy Badger can be disabled for a website. What differences do you see? Try doing it in a website with plenty of ads like YouTube.

16.3

Install Stylus and write a custom style for the class website using the CSS provided. Here are some common programming characters = == === >= => ++ /= ~=. What happens to them when changing the font to Fira Code? If you want to know more search for programming font ligatures.

16.4

Find a web API to get the weather in your city/area.

16.5

Use a WebDriver software like Selenium to automate some repetitive manual task that you perform often with your browser.

17 - Security and privacy

Exercises

17.1

Encrypt a file using PGP.

17.2

Use veracrypt to create a simple encrypted volume.

17.3

Enable 2FA for your most data-sensitive accounts (e.g., Gmail, Dropbox, GitHub, etc.).