Anaconda

Basics

The open-source Anaconda Distribution is one of the easiest ways to get started with data science projects. It already includes Python and the most important data science modules.

Note

Anaconda is a data science toolkit which already includes most of the data science modules we need.

Anaconda’s package manager conda makes it easy to manage multiple environments that can be maintained and run separately without interfering with each other (so-called virtual environments).

conda analyses the current environment, including everything currently installed. Together with any version constraints you specify (e.g. you may want TensorFlow version 2.0 or higher), it works out how to install a compatible set of dependencies and shows a warning if this cannot be done.
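To make the idea of a version constraint concrete, here is a minimal sketch in Python. It does not reproduce how conda works internally (conda resolves a whole graph of dependencies); it only illustrates the check "is this version at least 2.0?". The helper names parse_version and satisfies_minimum are hypothetical, and the sketch assumes plain numeric versions such as 2.15.0.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a plain numeric version string such as '2.15.0' into (2, 15, 0)."""
    return tuple(int(part) for part in version.split("."))


def satisfies_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed version is at least the minimum version."""
    a = parse_version(installed)
    b = parse_version(minimum)
    # Pad the shorter version with zeros so that '2.0' and '2.0.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Example constraint from the text: "TensorFlow version 2.0 or higher".
print(satisfies_minimum("2.15.0", "2.0"))  # True
print(satisfies_minimum("1.15.5", "2.0"))  # False
```

A real package manager has to satisfy many such constraints at once, which is why it can fail and report a conflict.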

Instead of conda, you can also use pip (the standard package installer for Python) to install packages.

Note that within a single environment you should install packages with either conda or pip, not both (we usually use pip).

If you already have Anaconda

If you already have Anaconda on your machine, make sure that you use the latest version (in our course, we use Python 3.11).

Check which Python version your Anaconda base environment uses:

  • On Windows, open the Start menu and open an “Anaconda Prompt”, then type python --version.

  • On macOS or Linux, open a terminal window and type python --version.
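The same check can also be done from inside Python, for example in a script or notebook cell. The following is a small sketch; the function name describe_python is hypothetical, and treating Python 3.11 or newer as acceptable is an assumption based on the course version mentioned above.

```python
import sys

# Course version mentioned in the text; "at least this version" is an assumption.
REQUIRED = (3, 11)


def describe_python(required=REQUIRED):
    """Report the running Python version and compare it with the required one."""
    current = sys.version_info[:2]
    status = "OK" if current >= required else "older than the course version"
    return f"Python {current[0]}.{current[1]} ({status})"


print(describe_python())
```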

You may also uninstall your current Anaconda installation from your machine and install the latest version. Here is a guide on how to uninstall Anaconda.

Installation

Install the latest version of the Anaconda Distribution:

After you have installed Anaconda, you can update it. The following commands will update all packages in the default “base” environment to the latest version but will not update Python:

To do

Now follow the steps described in the next section.

Anaconda environment

After you have installed and updated Anaconda, you can install the modules you need for a specific lab in a new environment.

To do

Miniforge

As an alternative to Anaconda, you can also use the open-source project Miniforge.

Miniforge is a minimal installer maintained by the conda-forge community. It includes only Python, the open-source package manager conda and a small number of other useful packages, and it uses the conda-forge channel by default.

Like Anaconda, Miniforge uses the package manager conda, which makes it easy to manage multiple environments that can be maintained and run separately without interfering with each other (so-called virtual environments).

Note

Miniforge is a community-led alternative to Anaconda and Miniconda, both of which are provided by Anaconda, Inc.

Compared to Anaconda, Miniforge provides more up-to-date packages and is more user-friendly. I therefore recommend using Miniforge for data science projects.

To do

Install the latest version of Miniforge.

Visual Studio Code

Basics

Visual Studio Code (also called VS Code) is a powerful source code editor that runs on your desktop and is available for Windows, macOS and Linux. It comes with a rich ecosystem of extensions for Python.

Note

Visual Studio Code is a code editor that can be used with a variety of programming languages including Python.



Installation

Install VS Code:

To do

Install extensions

The features that Visual Studio Code includes out-of-the-box are just the start. VS Code extensions let you add languages, debuggers, and tools to your installation to support your development workflow.

Let’s install some important extensions:

Jupyter Notebooks

We usually work with Jupyter Notebook files in VS Code:

To use a specific Anaconda environment as the Jupyter kernel, select the kernel (e.g. lab) using the kernel picker in the top right of VS Code.
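To confirm that a notebook really runs in the environment you picked, you can run a cell like the following sketch. The function name kernel_info is hypothetical. The CONDA_DEFAULT_ENV variable is set when conda activates an environment, but it may be missing depending on how the kernel was started, so the interpreter path is usually the more reliable indicator.

```python
import os
import sys


def kernel_info():
    """Collect details that show which Python environment this kernel uses."""
    return {
        # Full path of the Python interpreter running this kernel.
        "executable": sys.executable,
        # Root folder of the environment, typically ending in the environment name.
        "prefix": sys.prefix,
        # Name of the active conda environment, if conda exported it.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV", "(not set)"),
    }


for key, value in kernel_info().items():
    print(f"{key}: {value}")
```

If you selected an environment called lab, the prefix would be expected to end in a folder named lab.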

Optional tutorials

Some resources to get familiar with VS Code: