In this introduction, I will explain how to set up your computer for experiments in Python using the Anaconda distribution. I will show how to install it and how to use Jupyter Notebook, a tool for writing shareable code.
Anaconda calls itself the standard Python distribution for data science. Python, a language that Guido van Rossum started building in his spare time, has recently grown in popularity thanks to its simplicity and its appeal to scientists. It has a great community of open-source developers who create and maintain Python libraries. In data science, you'll need to crunch and process data, create and train models, and so on. The SciPy ecosystem (which includes NumPy, pandas and matplotlib) is the de facto toolset for these purposes.
After installing Anaconda, you will have not only a running Python installation with all these libraries, but also several tools for writing your code and sharing it. With Anaconda, you can also download and install Python libraries and create virtual environments using Conda.
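Once the installation has finished, a quick way to confirm that the libraries mentioned above are available is to ask Python whether it can find them. The snippet below is a minimal sketch that uses only the standard library; the helper name check_libraries and the list of package names are illustrative choices, not something Anaconda provides.

```python
import importlib.util

# Import names of the libraries mentioned above (an illustrative list).
LIBRARIES = ["numpy", "scipy", "pandas", "matplotlib"]


def check_libraries(names):
    """Return a dict mapping each import name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in check_libraries(LIBRARIES).items():
    print(f"{name}: {'available' if found else 'NOT found'}")
```

If any library is reported as not found, the Python you are running is probably not the one that Anaconda installed.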
Follow this link to download and install Anaconda on any OS. If you are running Windows or macOS, the installation should be very straightforward. If you're running Ubuntu, you'll download a .sh
file. In order to run it, navigate to where the file is in the console and give it execution permissions with chmod +x Anaconda3_..._.sh
. After that, install it by running ./Anaconda3_..._.sh
in your console. Do say yes to running conda init.
I would strongly recommend creating and working in a virtual environment with Python 3.6, since at the time of writing most libraries are not yet fully supported in Python 3.7. To do so, open a terminal (Command Prompt in Windows) and follow this recipe.
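After following the recipe and activating the environment, you may want to confirm that Python is really running from it. Here is a minimal sketch, assuming the environment was activated with conda (which sets the CONDA_DEFAULT_ENV variable); the helper name describe_environment is hypothetical.

```python
import os
import sys


def describe_environment():
    """Collect basic facts about the Python interpreter that is running."""
    return {
        # For example "3.6" inside the environment recommended above.
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
        # Folder the interpreter lives in; inside a conda environment this
        # points at that environment's directory.
        "prefix": sys.prefix,
        # Set by conda on activation; None if no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Run it from the same terminal in which you activated the environment, or from a notebook started there.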
If you are running Windows or macOS, you can access all the tools that Anaconda comes with by opening the Anaconda Navigator. Look for it using the Windows key or cmd+space
on a Mac. You can, for example, use Spyder as an IDE for Python or use the Jupyter Notebook, as we will do.
If you're running Ubuntu and said yes to conda init, you should see (base)
(or the name of your virtual environment) to the left of your username in the console. If that's the case, you can easily open the Jupyter Notebook by running jupyter notebook
in the console.
Besides Jupyter Notebook, you can always use a code editor to write code and run it with the now-available Python installation. I strongly recommend Visual Studio Code. The good thing about VSCode is that you can map its key bindings to those of editors you already know, such as Vim or Sublime, and it has a plethora of extensions that are easy to install. Moreover, it is open source, so you don't have to worry about licences. If you plan to use VSCode frequently, it's absolutely worth familiarizing yourself with its key bindings, abbreviations and editing tips. Follow the documentation on their site.
With Jupyter Notebook, which is one of the writing tools Anaconda comes with, you can write .ipynb
files (or notebooks) that can be uploaded as a gist with your GitHub account or sent to other users to see and run. These notebooks are built from blocks that can hold either code or Markdown text, so you can write text and explain your reasoning in the same space.
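Under the hood, an .ipynb file is a JSON document that lists these blocks (called cells in the file format) in order. The sketch below builds a minimal two-block notebook by hand with the standard json module, only to show the structure; in practice Jupyter writes this file for you. The helper name build_notebook and the block contents are illustrative, and the fields follow version 4 of the notebook format as I understand it.

```python
import json


def build_notebook():
    """Return a minimal notebook (format version 4) as a plain dict."""
    markdown_block = {
        "cell_type": "markdown",
        "metadata": {},
        "source": ["# My first notebook\n", "Text that explains the code below."],
    }
    code_block = {
        "cell_type": "code",
        "execution_count": None,  # the block has not been run yet
        "metadata": {},
        "outputs": [],
        "source": ['print("Hello World!")'],
    }
    return {
        "cells": [markdown_block, code_block],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 2,
    }


# Serialize to the text that would be stored in a .ipynb file.
notebook_text = json.dumps(build_notebook(), indent=1)
print(notebook_text)
```

Because the file is plain JSON, it is easy to share and can be rendered by services that understand the format.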
Open Jupyter Notebook (be it via the Anaconda Navigator or the console) and navigate to an empty folder in which you want to create your first notebook. On the right-hand side you'll see a button that says New; use it to create a new Python 3 notebook.
You will be greeted by an empty block. There are two modes in Jupyter Notebook: green (or editing) and blue (or navigation). Edit the empty highlighted block by hitting enter
, going into green mode. Write print("Hello World!")
in it and hit ctrl+enter
. As you can see, ctrl+enter
runs the block of code.
Go into navigation mode by hitting esc
. In order to create blocks, use a
to create an empty block above and b
to create a block below. You can navigate between your blocks with the usual up and down arrows, and you can edit a block by hitting enter
.
You can change the type of a block with the following shortcuts:
m
in navigation (blue) mode transforms it to a Markdown block, and y
transforms it back to code. Feel free to learn about more shortcuts in the Help menu, which can be found in the navigation bar. Other frequently used ones are d + d
to delete a block, i + i
to interrupt the kernel (if it's stuck in an infinite loop or if it's eating a lot of resources) and alt + enter
to run a block and create one below.
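To try the interrupt shortcut safely, you could run a block that deliberately takes a long time. The following is a minimal sketch; the function name slow_loop and its parameters are hypothetical. Interrupting the kernel raises KeyboardInterrupt inside the running block, which the function catches so it can report what happened.

```python
import time


def slow_loop(max_seconds=60.0, tick=0.5):
    """Sleep in short steps until max_seconds have passed or the kernel is interrupted."""
    deadline = time.monotonic() + max_seconds
    try:
        while time.monotonic() < deadline:
            time.sleep(tick)
    except KeyboardInterrupt:
        # Raised inside the running block when the kernel is interrupted.
        return "interrupted"
    return "finished"


# In a notebook block, run slow_loop() and press i + i in navigation mode.
# A short call such as the one below simply runs to completion.
print(slow_loop(max_seconds=0.5, tick=0.25))
```

The loop has an upper time limit on purpose, so the block still ends on its own if you forget to interrupt it.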