How To Use This Tutorial
If you prefer to read the tutorial in your web browser without executing any of the code, then head over to the contents page and follow the links for each of the chapters.
However, the content of the tutorial is written using interactive Jupyter Notebooks. This means that all of the code can be run from within the tutorial itself. You can edit the code, add new code of your own and run it all!
If you have never used a Jupyter notebook before, you may want to have a look at the documentation. Alternatively, Corey Schafer produced a very good video tutorial in which he installs a Jupyter server on his computer (option 2 below) and shows you how to use it. If you choose option 1 below, you can skip the parts where he installs and starts his server and concentrate on how to use the notebooks.
To run the notebooks, you need a server to host them. There are two options:
1. Use an Online Service
This is the simplest option. There is nothing to install on your computer and everything is done within your web browser.
If you choose this option, be aware that:
- The server will take a few moments to start, especially at times when the service is busy.
- Nothing is saved between sessions. If you close your web browser, any changes you have made to the code or the sample database will be lost.
- The code typically takes longer to execute than it would on a local computer. In particular, it is best not to run the first example in Chapter 12 - Bulk Loading.
Launch Notebook Server
2. Use Your Own Computer
Using your own computer allows you to work offline and to keep any changes you make to the code or the sample database. It does, however, require you to install some software before you can start.
2.1. Recommended Method - Anaconda
The Anaconda distribution of Python includes most of the necessary tools for this tutorial and so is the simplest option for running the code on your own computer.
Create a virtual environment:
conda create -n sql-python python=3.6
conda install -c anaconda sqlalchemy
2.2. Other Python Distributions
For other Python distributions you can install the necessary libraries using pip. It is strongly recommended that you create a virtual environment before doing so.
pip install jupyter pandas sqlalchemy
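Once the install finishes, a quick check from Python can confirm that the libraries are visible in the active environment. The following is only a minimal sketch: the helper name `missing_packages` is made up for illustration, and `jupyter_core` is used as the importable name on the assumption that it is pulled in by the `jupyter` install.

```python
import importlib.util

# Import names for the packages in the pip command above.
# Assumption: installing jupyter also installs the jupyter_core package.
REQUIRED = ["jupyter_core", "pandas", "sqlalchemy"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If anything is reported as missing, it is worth checking that the virtual environment is active before running the pip command again.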
2.3. Running the Notebooks
Each chapter listed on the Contents page has a link which you can use to download the notebook file. Place these files in a directory of your choice and then, in your terminal, change to that directory and start the Jupyter server:
jupyter notebook
After a few moments, your browser will open a page which lists all the notebooks you downloaded.
2.4. Optional Bulk Data File
Chapter 12 - Bulk Loading uses a 70 MB file with over 200 000 records. If you intend to work through that chapter, you will need to download the file to the same directory as the notebooks.
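Before starting that chapter, it may help to confirm that the data file really sits next to the notebooks. The sketch below is one way to do so; the function `data_file_ready` and the file name `bulk_data.csv` are hypothetical placeholders, so substitute the name of the file you actually downloaded.

```python
from pathlib import Path


def data_file_ready(directory, filename):
    """Return True if `filename` is in `directory` alongside at least one notebook."""
    folder = Path(directory)
    has_notebooks = any(folder.glob("*.ipynb"))
    return has_notebooks and (folder / filename).is_file()


if __name__ == "__main__":
    # "bulk_data.csv" is a placeholder, not the real file name.
    print(data_file_ready(".", "bulk_data.csv"))
```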
2.5. Optional Database Browser
Although the tutorial focuses on using Python to control a database, it can sometimes be useful to use a command line or graphical tool. Some of the chapters include examples of using the open source DB Browser for SQLite in addition to the Python examples.
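For a quick look at a database without leaving Python, the standard library's `sqlite3` module can list the tables it contains, which is roughly what the structure view of a graphical tool shows. This is an illustrative sketch only: it uses an in-memory database and a made-up table called `example` rather than the tutorial's sample database, and `list_tables` is a hypothetical helper.

```python
import sqlite3


def list_tables(connection):
    """Return the names of the tables recorded in the database schema."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Throwaway in-memory database with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY, label TEXT)")
print(list_tables(conn))
conn.close()
```

To inspect a database file instead, pass its path to `sqlite3.connect` in place of `":memory:"`.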