I’m working with academic data scientists who write in Python. I use C# mainly.

March 2022 update

What is Python

https://en.wikipedia.org/wiki/Python_(programming_language)

  • Interpreted: there is no separate build step. CPython, the reference implementation, compiles source to bytecode behind the scenes and runs that on its virtual machine.
  • Dynamically typed but strongly typed, e.g. adding a number to a string is forbidden.
  • Garbage collected
  • Structured (particularly procedural), OO and Functional
  • Has a comprehensive standard library
  • Consistently ranks as one of the most popular programming languages
  • No { } for blocks: indentation is significant. The recommended indent is 4 spaces.
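The "dynamically typed but strongly typed" bullet above is easier to see in code. This is a minimal sketch; the variable names are just for illustration. Coming from C#, the surprise is that `1 + "1"` is an error here rather than the string "11".

```python
# Dynamic typing: the same name can be rebound to values of different types.
value = 10
first_type = type(value).__name__
value = "ten"
second_type = type(value).__name__
print(first_type, second_type)

# Strong typing: Python will not silently convert between int and str.
try:
    result = 1 + "1"
except TypeError as error:
    result = None
    print("TypeError:", error)

# An explicit conversion is required instead.
total = 1 + int("1")
print(total)
```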

What is Python2 vs Python3

Python3 is not backwards compatible with Python 2.

Python2 was discontinued with version 2.7.18 in 2020.

It seems there is still a lot of Python2 code around, so many systems still ship the Python2 interpreter.
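A few of the well-known incompatibilities between Python 2 and 3, sketched as Python 3 code. This is not an exhaustive list, just the differences most likely to bite when running old scripts.

```python
# In Python 3, print is a function. The Python 2 statement form
# (print "hello") is a SyntaxError in Python 3.
print("hello")

# `/` is true division in Python 3 (Python 2 returned 3 for two ints).
true_division = 7 / 2
# `//` is floor division in both versions.
floor_division = 7 // 2

# str is Unicode text in Python 3; bytes is a separate type.
text = "caf\u00e9"
encoded = text.encode("utf-8")
print(true_division, floor_division, len(text), len(encoded))
```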

Installing Python

Most versions of Ubuntu 20.04 come with python3 pre-installed, and my Ubuntu 18 WSL2 install had Python 2 and 3 on it too.

https://www.python.org/downloads/ latest is 3.9.5 on 9th Jun 2021.

# Ubuntu 18.04 on WSL2 is Python 2.7.17
python --version

# Ubuntu 18.04 on WSL2 is Python 3.6.9
python3 --version

# I'm favouring apt instead of apt-get as it can update more dependencies, like azure-tools
sudo apt update
sudo apt upgrade

## Ubuntu 20.04 on Azure giving 3.8.5 on 9th June 2021
python3 --version

## Windows 10 - can't find a default python
## 3.9.5 after install on 9th June 2021
py --version

https://stackoverflow.com/questions/41986507/unable-to-set-default-python-version-to-python3-in-ubuntu discusses default versions and how one takes priority. It seems fine to have multiple versions installed side by side (and best not to remove any).

To get the latest version on my Ubuntu 18 WSL I used https://phoenixnap.com/kb/how-to-install-python-3-ubuntu:

sudo add-apt-repository ppa:deadsnakes/ppa

sudo apt install python3.9

# it didn't work! still said 3.6.9
python3 --version

# ahha it says 3.9.5
python3.9 --version

# probably don't want to do this
sudo apt remove python3
sudo apt remove python3.9

# hmm python 3.9 was still there after this
sudo apt-get remove --purge python3.9

# ahha this got rid of it
sudo apt autoremove

# this gets the right version, but have to type python3.9 to run it
sudo apt-get install python3.9

# but some of the apps I install depend on python3.6, so it came back
# lesson here is to keep multiple versions
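With several versions installed side by side, it helps to ask a script which interpreter is actually running it. A small sketch using the standard `sys` module; the minimum version below is an assumption for illustration, not a requirement from any project mentioned here.

```python
import sys

# Full path of the interpreter running this script,
# e.g. /usr/bin/python3.9 rather than /usr/bin/python3.
print(sys.executable)

# Version as major.minor.micro, the same numbers `python3 --version` reports.
version = "{}.{}.{}".format(*sys.version_info[:3])
print(version)

# Assumed minimum version, purely for illustration.
MINIMUM = (3, 6)
if sys.version_info < MINIMUM:
    raise SystemExit("Python %d.%d or newer is required" % MINIMUM)
```

Running it as `python3 script.py` and then `python3.9 script.py` should show the two different interpreters.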

Pip

Pip is a tool for installing Python packages from PyPI, the Python Package Index. It is analogous to CPAN for Perl.

PyPI primarily hosts Python packages as source archives called sdists or as precompiled wheels.

When installing Python modules globally it is recommended to use apt.

Use pip to install a module globally only if there is no deb package for this module.

Prefer using pip within a virtual environment only. Python virtual environments allow you to install python modules in an isolated location for a specific project.

Seems tricky.. https://askubuntu.com/questions/431780/apt-get-install-vs-pip-install

sudo apt update
# install pip
sudo apt install python3-pip

# 21.0.1 from python 3.6
# ahh where did 3.6.9 come back from?
# don't use sudo on pip3
pip3 --version

pip3 install scrapy
# a specific version
pip3 install scrapy==1.5

# replace pip3 with pip2 if using python2

# shows installed packages
# and showed me I was using an outdated version of python
pip3 list

# put on latest version of pip 21.1.2
/usr/bin/python3 -m pip install --upgrade pip
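Something close to `pip3 list` can also be done from inside Python, which shows the packages visible to the interpreter that is actually running. A minimal sketch, assuming Python 3.8 or newer (where `importlib.metadata` was added to the standard library).

```python
from importlib import metadata

# Collect name -> version for every distribution this interpreter can see.
installed = {}
for dist in metadata.distributions():
    name = dist.metadata["Name"]
    if name:  # skip broken installs that have no name recorded
        installed[name] = dist.version

# Print in a case-insensitive alphabetical order, similar to pip list.
for name in sorted(installed, key=str.lower):
    print(name, installed[name])
```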

So this seems okay, except I’m having to install packages manually, and in C# I’m used to projects downloading packages automatically if they are missing.

Also I’m installing packages globally here and not on a per-project basis.

This is a pain point apparently, which is why virtual environments exist.
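A virtual environment is just a folder with its own interpreter and package directory. The usual way to make one is `python3 -m venv .venv` in a shell; the sketch below does the same thing through the standard `venv` module. The temporary project folder is hypothetical, and `with_pip=False` is my choice to keep the example quick and offline.

```python
import tempfile
import venv
from pathlib import Path

# Hypothetical project folder; a temporary directory keeps the sketch tidy.
project_dir = Path(tempfile.mkdtemp())
env_dir = project_dir / ".venv"

# with_pip=False skips bootstrapping pip so creation is quick and offline.
# Pass with_pip=True to get a pip that installs into this environment only.
venv.create(env_dir, with_pip=False)

# Every virtual environment has a pyvenv.cfg file at its root.
config_file = env_dir / "pyvenv.cfg"
print(config_file.exists())
```

On Linux the environment is then activated with `source .venv/bin/activate`, after which `python` and `pip` refer to the copies inside that folder.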

The next gotcha was using pip to install to the correct version of python:

Pip on a specific version of Python

# Installing Pandas on python3
# as my default installation had switched to python3.9
python3 -m pip install pandas

python3.9 -m pip install pandas
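The `python3.9 -m pip` form works because pip is run as a module of one specific interpreter, so the package lands where that interpreter will look for it. The same idea from inside a script is sketched below; `pip_install_command` is a hypothetical helper name, and the sketch only builds the command rather than running it.

```python
import sys


def pip_install_command(package):
    """Build a pip command tied to the interpreter running this script."""
    # sys.executable is the full path of the current interpreter, so the
    # package is installed for this version and not another one on the PATH.
    return [sys.executable, "-m", "pip", "install", package]


command = pip_install_command("pandas")
print(" ".join(command))

# To actually run it:
#   import subprocess
#   subprocess.run(command, check=True)
```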

Getting code working

Here is part of the script from the Face_detection https://github.com/spatial-intelligence/OSR4Rights project I’m working on.

#!/usr/bin/python3
import sys
import dlib  
import csv
import face_recognition
import os
import pickle
import time
import datetime
from PIL import Image, ImageDraw, ImageFont
import gc
import numpy
import psycopg2
import pdfkit
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.application import MIMEApplication
from email.mime.nonmultipart import MIMENonMultipart
import base64
import argparse
import time

The first line is a Linux shebang (also called sha-bang, hashbang, pound-bang or hash-pling). It essentially says: execute this script using the python3 interpreter.

Getting a script working - ffhq-dataset downloader

This was interesting from a Python point of view, but it is far easier to use the Google Drive UI to download the dataset. https://github.com/NVlabs/ffhq-dataset/

Right click on a folder and it will split the downloads into separate 2GB files.. neat!

ModuleNotFoundError: No module named 'requests' - googling around says I have to install the module.

As pip generally has a newer version of a package than apt, I’m not sure if I should be favouring apt or pip.
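The ModuleNotFoundError above means the interpreter running the script cannot find the package, which is often a sign it was installed for a different Python version. A small sketch for checking before importing; `is_installed` is a hypothetical helper and it only handles top-level module names.

```python
import importlib.util


def is_installed(module_name):
    """Return True if this interpreter can find a top-level module."""
    # find_spec returns None when the module cannot be located.
    return importlib.util.find_spec(module_name) is not None


# json ships with the standard library, so it should always be found.
print("json:", is_installed("json"))

# requests is a third-party package, so the answer depends on the machine.
if not is_installed("requests"):
    print("requests is missing; try: python3 -m pip install requests")
```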

Timings

Some useful timing code I wrote around a long-running function.

import time

ts = time.time()
# a long running function
res = process_faceimg(path, 'job1', 1)
te = time.time()
tt = te - ts
print('time taken: %2.2f secs' % tt)
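If the same timing is needed in several places, one option is to wrap it in a context manager. This is a sketch rather than what the project does: `timed` and `slow_function` are hypothetical names, and I have swapped `time.time()` for `time.perf_counter()`, which is intended for measuring short durations.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label):
    """Print how long the enclosed block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print('%s time taken: %2.2f secs' % (label, elapsed))


def slow_function():
    # Stand-in for a long running function.
    time.sleep(0.1)
    return 'done'


with timed('slow_function'):
    res = slow_function()
```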

VSCode

Install the Python extension (for Python >= 3.6). Then go to View, Command Palette, Python: Select Interpreter and pick the correct interpreter (I have 2.7.17 and 3.6.9).

Code

https://app.pluralsight.com/library/courses/practical-python-beginners/table-of-contents Practical Python for Beginners by Sarah Holderness.


# Python infers the type of the variable from the assigned value

# int
length = 10

# float
foo = 10.50

width = 20
area = length * width
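The inferred types can be checked at runtime with `type()`. A short sketch reusing the names from the snippet above; `scaled` is an extra name added for illustration.

```python
length = 10
width = 20
foo = 10.50

# int * int stays an int.
area = length * width
print(type(area).__name__, area)

# Mixing int and float gives a float.
scaled = length * foo
print(type(scaled).__name__, scaled)

# isinstance is the usual way to test a type in code.
print(isinstance(area, int))
```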

Conda

https://docs.conda.io/projects/conda/en/latest/