Difference between revisions of "Python"

From CSE330 Wiki
 
(41 intermediate revisions by 3 users not shown)
Line 1: Line 1:
Languages like Java and C++ have lots of rules regarding variable types, syntax, return values, and so on. Although these restrictions help make the compiled program run quickly, they are cumbersome when you are trying to write short, quick scripts to perform tasks. This is where a ''scripting language'' comes into play.
+
Languages like Java and C++ have lots of rules regarding variable types, syntax, return values, and so on. These can be cumbersome when you are trying to write short, quick scripts to perform tasks. This is where a ''scripting language'' comes into play.
  
'''Python''' is a language well-suited to rapid prototype development. It is an ''interpreted language'', which means that you do '''''not''''' need to compile the code when you run it. The syntax is clean, and it is usually clear at first glance what is going on when you write in Python.
+
'''Python''' is a language well-suited to rapid prototype development. It is an ''interpreted language'', which means that, generally speaking, a Python program is converted into instructions for the computer to execute as the program runs, instead of having a separate compilation step beforehand {{ref|interpreted}}. The syntax is clean, and it is usually clear at first glance what is going on when you write in Python.
  
{{XKCD
+
= Installation =
|name=python
 
|id=353
 
}}
 
  
== Installation ==
+
Python may already be installed on your system. To see whether or not it is, enter the command
  
Python may already be installed on your system. To see whether or not it is, enter the command
+
<source lang="bash">
 +
$ python3 --version
 +
</source>
 +
If it tells you a version of Python '''greater than or equal to 3.6''' then you're good to go. If not, you need to do a quick package install to get it up and running. Using yum, it's just <code>sudo yum install python36</code>.
  
<source lang="bash">$ python --version</source>
+
If you'd like to be able to just type <code>python</code> instead of <code>python3</code>, you can add the following line to the <code>~/.bashrc</code> file on your instance:
  
If it tells you a version of Python (like "2.7.1"), then you're good to go.  If not, you need to do a quick package install to get it up and running. Apt and Yum both call a functional Python package '''python'''.
+
<source lang="bash">
 +
alias python="/usr/bin/python3"
 +
</source>
 +
 
 +
==== Python 2 vs Python 3 ====
 +
 
 +
In 2008, a new version of Python – Python 3 – was released, with a number of changes that made it incompatible with the previous version, Python 2. For several years, the Python community was slow to make the change to Python 3, but today, Python 3 is ''the'' standard for new code. The ''only'' reason to write Python 2 code today is that you have to work in an environment or with libraries that do not support Python 3, and that situation is becoming increasingly rare.
 +
 
 +
'''For CSE 330 (and this guide), you ''must'' use Python 3.6 or greater.'''
  
 
=== Pip ===
 
=== Pip ===
  
Linux distributions have package managers like ''Apt'', ''Yum'', and ''YaST''. PHP has a package manager named ''PEAR''. It's now time to introduce Python's leading package manager: '''pip'''.
+
Linux distributions have package managers like ''Apt'', ''Yum'', and ''YaST''. PHP has a package manager named ''PEAR''. It's now time to introduce Python's leading package manager: '''pip'''.
  
You need to install Pip from Apt or Yum before you can use it. Both call the package '''python-pip'''.
+
Once you have pip installed, you can use it to install Python packages. Use the <code>pip-3.6</code> command:
  
Once you have pip installed, you can use it to install Python packages. Use the '''pip-python''' (RHEL) or '''pip''' (Debian) command:
+
<source lang="bash">
 +
$ pip-3.6 install <package_name> # RHEL
 +
</source>
 +
 
 +
 
 +
If you'd like to be able to just type <code>pip</code> instead of <code>pip-3.6</code>, you can add the following line to the <code>~/.bashrc</code> file on your instance:
  
 
<source lang="bash">
 
<source lang="bash">
$ pip-python install package_name # RHEL
+
alias pip="/usr/bin/pip-3.6"
$ pip install package_name # Debian
 
 
</source>
 
</source>
  
== Running Python ==
+
= Running Python =
  
 
There are two common ways to run Python code: via the console, and via a Python script file.
 
There are two common ways to run Python code: via the console, and via a Python script file.
Line 35: Line 47:
 
=== The Python Console ===
 
=== The Python Console ===
  
The Python console enables you to experiment with code without opening a text editor. To enter the Python console, simply type the ''python'' command at the terminal:
+
The Python console enables you to experiment with code without opening a text editor. To enter the Python console, simply type the <code>python3</code> command (or <code>python</code>, if you set up the alias described under Installation) at the terminal:
 +
 
 +
<source lang="bash">
 +
$ python3
 +
</source>
 +
To leave the interactive console, either type <code>quit()</code> or press <code>Ctrl-D</code> (on both Mac and Windows).
 +
 
 +
=== Python Script Files ===
 +
 
 +
You can also save Python script files for later use. The extension for Python scripts is <code>*.py</code>. To run a script file, simply feed its path as an argument to the <code>python3</code> command in the terminal:
 +
 
 +
<source lang="bash">
 +
$ python3 <my_script>.py
 +
</source>
 +
 
 +
= Python Syntax and Language Components =
 +
 
 +
This section contains a very brief overview of Python syntax. For a more comprehensive introduction, see [http://docs.python.org/tutorial/ the Python docs].
 +
 
 +
=== Python's Type System ===
 +
 
 +
Python is a dynamically-typed, strongly-typed, and duck-typed language. What does that mean? Well, let's take those categories one by one:
 +
 
 +
<ol>
 +
<li>
 +
'''Dynamic typing''', in contrast to ''static typing'', means that a variable is not tied to one type: the same name can refer to objects of different types at different times. In Java, for instance, if you tried to write the following code:
 +
 
 +
<source lang="java">
 +
int c = 2;
 +
c = "fish";
 +
</source>
 +
 
 +
Your code would fail to compile, because the Java compiler wouldn't allow you to change the type of the variable <code>c</code> from <code>int</code> to <code>String</code>.
 +
 
 +
Python, however, is just fine with that. You can write:
 +
 
 +
<source lang="python">
 +
c = 2
 +
c = "fish"
 +
</source>
 +
 
 +
And Python will happily do what you ask.
 +
</li>
 +
 
 +
<li>
 +
'''Strong typing''', in contrast to ''weak typing'', means, roughly, that a language won't implicitly convert types for you. So far, you've used a lot of PHP, which is very weakly typed. For example:
 +
 
 +
<source lang="php">
 +
echo "10" . 2;
 +
</source>
 +
 
 +
In PHP, this works just fine. The PHP interpreter sees that <code>2</code> is an int, and silently converts it to a string for you, so that <code>"10". 2</code> equals <code>"102"</code>. The problem with weak typing is that it can make it unclear to the programmer why things are happening the way they are. For instance, in PHP, the expression <code>"10" == 10</code> is true, which can be less than obvious. Python, however, tries to disallow that sort of thing:
 +
 
 +
<source lang="python">
 +
"10" + "10" # Allowed – concatenates the two strings.
 +
10  +  10  # Allowed – adds the two ints.
 +
"10" +  10  # Not allowed – raises a TypeError
 +
</source>
 +
 
 +
Perhaps the one exception to this rule is the relationship between numeric types; it's perfectly fine to add a float to an int, for example.
 +
</li>
 +
 
 +
<li>
 +
'''Duck-typing''' is a little bit different than the other categories, because it doesn't really have a direct opposite. Essentially, duck-typing is the philosophy that "If it walks like a duck, and quacks like a duck, it's a duck." In programming terms, that means that we often don't care what type an object is, as long as it can do the things we ask.
 +
 
 +
Let's look at an example:
 +
 
 +
<source lang="python">
 +
# This function doubles the value it's passed.
 +
def double(thing):
 +
    return thing + thing
 +
double(10) # Returns 20
 +
double("fish") # Returns "fishfish"
 +
</source>
 +
 
 +
If this were Java, we'd have to write two different methods here: one for things of type <code>int</code>, and one for things of type <code>String</code>. Since this is Python, though, we don't have to do that. Instead, Python lets us say, essentially: "This function uses the <code>+</code> operator, so you can pass whatever object you want to the function, as long as that object supports the <code>+</code> operator".
 +
 
 +
To quote [https://groups.google.com/forum/?hl=en#!msg/comp.lang.python/CCs2oJdyuzc/NYjla5HKMOIJ Alex Martelli], one of the key people in the Python project,
 +
 
 +
<blockquote>
 +
"[Don't] check whether it IS-a duck: check whether it QUACKS-like-a duck, WALKS-like-a duck, etc, etc, depending on exactly what subset of duck-like behaviour you need."
 +
</blockquote>
 +
 
 +
</li>
 +
</ol>
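The three properties above can be seen side by side in a short sketch. The <code>Money</code> class is hypothetical and exists only to show that <code>double</code> accepts any object that supports <code>+</code>; it is not part of Python or of this course's code.

```python
# Dynamic typing: the same name can refer to objects of different types.
c = 2
first_type = type(c).__name__    # "int"
c = "fish"
second_type = type(c).__name__   # "str"

# Strong typing: mixing str and int with + raises TypeError
# instead of silently converting one of them.
try:
    "10" + 10
    mixed_ok = True
except TypeError:
    mixed_ok = False


# Duck typing: double() works on anything that supports the + operator.
class Money:  # hypothetical class, for illustration only
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


def double(thing):
    return thing + thing


doubled = double(Money(250))
print(first_type, second_type, mixed_ok, doubled.cents)
```

Because <code>Money</code> defines <code>__add__</code>, it "quacks" well enough for <code>double</code>, even though the function was never written with that class in mind.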
  
<source lang="bash">$ python</source>
+
=== Basic Types ===
  
To leave the interactive console, either type "quit()" or press Ctrl-D (on both Mac and Windows).
+
In Python, the basic scalar types are:
  
=== Python Script Files ===
+
# <code>str</code> – a string.
 +
# <code>int</code> – an integer object with unlimited digits.
 +
# <code>bool</code> – a Boolean value.
 +
# <code>float</code> – a floating point number; generally equivalent to <code>double</code> in Java.
 +
# <code>bytes</code> – represents a collection of bytes; you're unlikely to use this often.
  
You can also save Python script files for later use.  The extension for Python scripts is '''*.py'''.  To run a script file, simply feed its path as an argument to the ''python'' command in Terminal:
+
Type conversion is done by calling the target type like a function. To parse an <code>int</code> from a string, for instance, you'd write <code>int("123")</code>.
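As a rough illustration of converting between the basic types, the sketch below uses only built-in type names; the variable names are made up for the example.

```python
count = int("123")       # str -> int
price = float("4.50")    # str -> float
label = str(42)          # int -> str
truncated = int(3.99)    # float -> int drops the fractional part

# A string that doesn't look like a number raises ValueError.
try:
    int("fish")
    parsed = True
except ValueError:
    parsed = False

print(count, price, label, truncated, parsed)
```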
  
<source lang="bash">$ python my_script.py</source>
+
=== Basic Control Flow ===
 +
In Python, blocks of code are delineated using colons and indentation instead of curly braces. '''''That means that proper indentation is mandatory.''''' Every time you use a colon, you must indent one step further, and every time a block ends, you must unindent a step.
  
== Python Syntax and Language Components ==
+
Conditional statements are written using the <code>if</code>, <code>elif</code>, and <code>else</code> keywords, like in the example below. Pay attention to the indentation:
  
This section contains a very brief overview of Python syntax.  For a more comprehensive introduction, see [http://docs.python.org/tutorial/ the Python docs].
+
<source lang="python">
 +
cookies_left = 10
 +
cake_slices_left = 3
 +
if cookies_left > 10:
 +
    print("Lots of cookies!")
 +
elif cookies_left > 8:
 +
    print("Plenty left!")
 +
elif cookies_left == 5:
 +
    print("Halfway done!")
 +
else:
 +
    if cake_slices_left == 3:
 +
        print("Lots of cake!")
 +
    elif cake_slices_left == 2:
 +
        print("One slice gone!")
 +
    else:
 +
        print("Little to no cake left!")
 +
</source>
  
 
=== An Example Python Script ===
 
=== An Example Python Script ===
Line 60: Line 177:
 
fruits = ["apple", "banana", "cherry", "date"]
 
fruits = ["apple", "banana", "cherry", "date"]
 
for fruit in fruits:
 
for fruit in fruits:
print("I always love to eat a fresh %s." % fruit)
+
    print(f"I always love to eat a fresh {fruit}")
  
 
# Map the fruits list over to a new list containing the length of the fruit strings:
 
# Map the fruits list over to a new list containing the length of the fruit strings:
 
fruit_size = [len(fruit) for fruit in fruits]
 
fruit_size = [len(fruit) for fruit in fruits]
  
avg_fruit_size = sum(fruit_size) / float(len(fruit_size))
+
avg_fruit_size = sum(fruit_size) / len(fruit_size)
print("The average fruit string length is %4.2f." % avg_fruit_size)
+
print(f"The average fruit string length is {avg_fruit_size:.2f}.")
 
</source>
 
</source>
  
 
Some things to notice:
 
Some things to notice:
* Printing is achieved using the '''print''' function.
 
* A colon starts a block, similar to a curly brace '''{''' in many other languages.  The corresponding code block '''''must''''' be indented.  The end of the code block is signified by when the indentation ends.
 
* Strings can be printf-style formatted using the '''%''' operator
 
* Comments start with a pound symbol '''#'''
 
* We can transform/map a list to a new list in just one line.  (Beat that, Java!)
 
* When we compute the average fruit size, we need to cast ''len(fruit_size)'', which returns an int, to a float in order to prevent integer truncation.
 
  
For some more examples, see [http://wiki.python.org/moin/SimplePrograms the Python wiki].
+
* Printing is achieved using the <code>print()</code> function.
 +
* Comments start with a pound symbol <code>#</code>.
 +
* We can transform/map a list to a new list in just one line!
 +
 
 +
For some more examples, see [https://wiki.python.org/moin/SimplePrograms the Python wiki].
 +
 
 +
=== A Few Gotchas ===
 +
 
 +
Here are a few things you may not expect when starting out with Python:
 +
 
 +
<ol>
 +
 
 +
<li>
 +
Strings can be delimited using either single quotes (<code>'</code>) or double quotes (<code>"</code>), so <code>'Python'</code> is the same as <code>"Python"</code>. You can also use three sets of quotes (single or double again) to delineate a multiple-line string, like so:
 +
<source lang="python">
 +
long_string = """This
 +
is my
 +
very long
 +
multiple-line
 +
string"""
 +
</source>
 +
</li>
 +
 
 +
<li>
 +
To compare the equality of two objects, use the <code>==</code> operator. To compare the identity of two objects (i.e. whether or not they are literally the same object, at the same place in memory), use the <code>is</code> operator. These correspond, respectively, to <code>.equals()</code> and <code>==</code> (on object references) in Java.
 +
</li>
 +
 
 +
<li>
 +
The <code>/</code> operator performs floating-point division, not integer division, even when both its arguments are integers. If you want to do integer division, use the <code>//</code> operator. Let's look at an example:
 +
 
 +
<source lang="python">
 +
9  / 10    == 0.9 # Floating-point division
 +
9.0 / 10.0  == 0.9 # Floating-point division
 +
9  // 10  == 0  # Integer division
 +
9.0 // 10.0 == 0.0 # Integer division, even though the resulting value is a float
 +
</source>
 +
</li>
 +
 
 +
<li>
 +
As discussed in the section [[#Python's_Type_System|Python's Type System]], you can't combine strings and numbers using the <code>+</code> operator. Instead, you'll have to use one of the following methods of string formatting:
 +
 
 +
<source lang="python">
 +
 
 +
# Technique 1 – printf-style Formatting
 +
# This is a classic style of string formatting, used in many programming languages, including in Java's String.format.
 +
"You scored %d out of %d!" % (score, questions)
 +
 
 +
# Technique 2 – str.format
 +
"You scored {} out of {}!".format(score, questions)
 +
 
 +
# Technique 3 – f-strings
 +
# These were only introduced in Python 3.6 but they're very convenient.
 +
f"You scored {score} out of {questions}!"
 +
 
 +
# Technique 4 - String Concatenation
 +
# This is fine for small things, but for more complicated stuff, use one of the other techniques.
 +
"You scored " + str(score) + " out of " + str(questions) + "!"
 +
</source>
 +
</li>
 +
</ol>
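Two of the gotchas above are easy to check for yourself. The following is a small sketch, with arbitrary variable names, showing the difference between <code>==</code> and <code>is</code>, and how <code>//</code> behaves with a negative operand (it rounds down, toward negative infinity, rather than toward zero).

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

equal_values = a == b   # True: the two lists have the same contents
same_object = a is b    # False: they are two separate list objects
alias_same = a is c     # True: c is just another name for a

true_div = 9 / 10       # 0.9
floor_pos = 9 // 10     # 0
floor_neg = -9 // 10    # -1, because // rounds down

print(equal_values, same_object, alias_same, true_div, floor_pos, floor_neg)
```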
  
 
=== Functions ===
 
=== Functions ===
  
Define functions using the '''def''' keyword:
+
Define functions using the <code>def</code> keyword:
  
 
<source lang="python">
 
<source lang="python">
 
def hello(name):
 
def hello(name):
print("Hello, %s!" % name)
+
    print(f"Hello, {name}!")
  
 +
 +
</source>
 +
 +
Invoke functions using parentheses:
 +
<source lang="python">
 
hello("Batman")
 
hello("Batman")
 
hello("Superman")
 
hello("Superman")
 
</source>
 
</source>
  
=== Lists ===
+
Functions can have optional arguments; however, these must come after any required arguments:
Unlike languages like Java and C, Python's array-like type doesn't have fixed length. Instead, a '''list''' is a dynamic array of objects of any type. Here's an example:
+
<source lang="python">
 +
def hello(name="User"):
 +
    print(f"Hello, {name}!")
 +
 
 +
hello() # Prints "Hello, User!"
 +
hello("Sarah") # Prints "Hello, Sarah!"
 +
</source>
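To make the ordering rule concrete, here is a sketch of a function with one required and one optional argument. The name <code>greet</code> and its parameters are invented for this example.

```python
def greet(name, greeting="Hello"):
    # name is required; greeting is optional and must come after it.
    return f"{greeting}, {name}!"


default_greeting = greet("Sarah")                 # uses the default
custom_greeting = greet("Sarah", "Welcome")       # passed by position
keyword_greeting = greet("Sarah", greeting="Hi")  # passed by name

print(default_greeting, custom_greeting, keyword_greeting)
```

Writing <code>def greet(greeting="Hello", name):</code> instead would be a syntax error, since a required argument cannot follow an optional one.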
 +
 
 +
Functions are just regular objects in Python, meaning you can do stuff like this:
 +
 
 +
<source lang="python">
 +
def hello(name="User"):
 +
    print(f"Hello, {name}!")
 +
 
 +
greeting = hello
 +
 
 +
greeting("Siwei") # Prints "Hello, Siwei!"
 +
</source>
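Since functions are objects, they can also be passed to other functions as arguments. The sketch below assumes three made-up helper names (<code>shout</code>, <code>whisper</code>, and <code>apply_to_all</code>) purely for illustration.

```python
def shout(text):
    return text.upper() + "!"


def whisper(text):
    return text.lower() + "..."


def apply_to_all(func, items):
    # func is an ordinary argument that happens to be a function.
    return [func(item) for item in items]


loud = apply_to_all(shout, ["hi", "bye"])
quiet = apply_to_all(whisper, ["HI", "BYE"])

print(loud, quiet)
```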
 +
 
 +
=== Collection Types ===
 +
Python has a few basic, built-in collection types. They each have their own special use cases, but they share a lot of similarities. All built-in sequence types (such as lists, tuples, and strings) are zero-indexed, for instance.
 +
 
 +
To obtain the number of elements in a collection, use the <code>len()</code> function, like this:
 +
 
 +
<source lang="python">
 +
pets = ['Dog', 'Cat', 'Fish'] # This is a list; you'll learn about them below.
 +
 
 +
len(pets) # Returns 3
 +
</source>
 +
 
 +
To check if a collection contains an item, use the <code>in</code> operator:
 +
<source lang="python">
 +
pets = ['Dog', 'Cat', 'Fish'] # This is a list; you'll learn about them below.
 +
 
 +
'Dog' in pets # Returns True
 +
</source>
 +
 
 +
Helpfully, the Python Software Foundation maintains a page containing the asymptotic complexities of important operations on major Python data structures. It's available [https://wiki.python.org/moin/TimeComplexity here].
 +
 
 +
==== Lists ====
 +
 
 +
Unlike languages like Java and C, Python's array-like type doesn't have fixed length. Instead, a <code>list</code> is a dynamic array (like Java's <code>ArrayList</code>) of objects of any type. Here's an example:
  
 
<source lang="python">
 
<source lang="python">
pets = ['Dog', 'Cat', 'Fish'] # Lists can be created by placing their items between square brackets
+
fruits = ["Apple", "Pear", "Orange"] # Lists can be created by placing their items between square brackets
  
 
random_items = ['Apple', 12312, 2.0, [1, 2, 3]] # Lists can contain objects of different types - even other lists!
 
random_items = ['Apple', 12312, 2.0, [1, 2, 3]] # Lists can contain objects of different types - even other lists!
Line 101: Line 317:
 
pets.append('Turtle') # Adds 'Turtle' to the end of pets
 
pets.append('Turtle') # Adds 'Turtle' to the end of pets
 
</source>
 
</source>
 +
 +
For the next few examples, we'll be using the following list:
 +
 +
<source lang="python">
 +
pets = ['Dog', 'Cat', 'Fish', 'Turtle', 'Lizard', 'Snake']
 +
</source>
 +
 +
To access or assign elements of a list, use the <code>[]</code> operator.
 +
 +
<source lang="python">
 +
pets[0] # Returns 'Dog'
 +
pets[1] = 'Cow' # Sets pets[1] to 'Cow'
 +
pets[1] = 'Cat' # Sets pets[1] back to 'Cat'
 +
</source>
 +
 +
You can also access or assign list elements using negative indices, which count backwards from the end of the list:
 +
 +
<source lang="python">
 +
pets[-1] # Returns 'Snake'
 +
pets[-2] # Returns 'Lizard'
 +
</source>
 +
 +
You can also use 'slice notation' to obtain or assign subsets of lists:
 +
<source lang="python">
 +
pets[start:end] # Returns items from index start to index end - 1.
 +
pets[start:]    # Returns items from index start to the end of the list, inclusive.
 +
pets[:end]      # Returns items from the beginning of the list to index end - 1.
 +
pets[:]        # Returns every item in the list
 +
</source>
 +
 +
There's also a third, optional argument: <code>step</code>, which lets you jump over values:
 +
<source lang="python">
 +
pets[start:end:step] # Returns items from index start to index end - 1, jumping step values at a time
 +
 +
# Examples:
 +
pets[1:5:2] # Returns items from index 1 to index 4, jumping 2 at a time, so ['Cat', 'Turtle']
 +
pets[::3] # Returns all items in the list, jumping 3 at a time, so ['Dog', 'Turtle']
 +
pets[::-1] # Returns all items in the list, jumping -1 at a time. This reverses the list, so ['Snake', 'Lizard', 'Turtle', 'Fish', 'Cat', 'Dog']
 +
</source>
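Slices can be assigned to as well as read. The following sketch, using the same <code>pets</code> list, shows that a slice is a new list (so changing the copy leaves the original alone) and that assigning to a slice can change the list's length.

```python
pets = ['Dog', 'Cat', 'Fish', 'Turtle', 'Lizard', 'Snake']

middle = pets[2:4]         # ['Fish', 'Turtle']
copy_of_pets = pets[:]     # a new list with the same items
copy_of_pets[0] = 'Wolf'   # does not affect pets

# Replace the two items at indices 1 and 2 with three new items.
pets[1:3] = ['Cow', 'Goat', 'Horse']

print(middle, copy_of_pets[0], pets)
```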
 +
 +
Lists have all the performance characteristics you'd expect of a dynamic array - constant-time indexing, amortized constant-time appends, and so on.
 +
 +
==== Tuples ====
 +
 +
Python has a special data type called <code>tuple</code>, which is exactly the same as a list except for the fact that it cannot be modified, so you can't append or replace items.
 +
 +
<source lang="python">
 +
cities = ('St. Louis', 'Los Angeles', 'Seattle') # Tuples are defined using parentheses.
 +
numbers = 1, 2, 3 # You can also omit the parentheses if it's unambiguous
 +
single_item_tuple = (1,) # To make a tuple with only one item, put a comma after it
 +
</source>
 +
 +
Tuples also enable you to have multiple return values from a function:
 +
 +
<source lang="python">
 +
def compute_length(string):
 +
    str_len = len(string)
 +
    if str_len < 5:
 +
        return str_len, "short"
 +
    elif str_len < 40:
 +
        return str_len, "medium"
 +
    return str_len, "long"
 +
 +
length, description = compute_length("Four score and seven years ago")
 +
print(f"The {description} string is {length} characters long.")
 +
</source>
 +
The above example also demonstrates Python's if...elif conditional structure.
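A brief sketch of what "cannot be modified" means in practice, along with tuple unpacking. The variable names are arbitrary.

```python
cities = ('St. Louis', 'Los Angeles', 'Seattle')

# Assigning to an element of a tuple raises TypeError.
try:
    cities[0] = 'Chicago'
    modified = True
except TypeError:
    modified = False

# Unpacking assigns each item of the tuple to its own name.
first, second, third = cities

# The same mechanism lets you swap two variables in one line.
a, b = 1, 2
a, b = b, a

print(modified, first, a, b)
```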
 +
 +
==== Dictionaries ====
 +
 +
Python has another datatype called a '''dictionary''' (or <code>dict</code>, for short), which is like a ''Map'' in Java, an ''associative array'' in PHP, or an ''object literal'' in JavaScript (coming up soon in Module 6). Essentially, dictionaries enable you to use any immutable object as the key in your data structure. <code>dict</code> objects can be created using either the <code>dict</code> function, or by placing items between curly braces, separating keys and values using colons:
 +
 +
<source lang="python">
 +
fruits_in_bowl = {
 +
    'apple': 4,
 +
    'banana': 2,
 +
    'cherry': 0,
 +
    'date': 12
 +
}
 +
 +
apple_count = fruits_in_bowl['apple'] # Equals 4
 +
 +
fruits_in_bowl['pear'] = 9 # Sets 'pear' in the dictionary to 9.
 +
 +
for fruit, num in fruits_in_bowl.items():
 +
    print(f"There are {num} {fruit}(s) in the bowl.")
 +
</source>
 +
If you only want the keys from a dictionary, you can just iterate over it directly, like this:
 +
 +
<source lang="python">
 +
for fruit in fruits_in_bowl:
 +
    print(fruit)
 +
</source>
 +
If you want only the values but not the keys from a dictionary, use <code>my_dictionary.values()</code>.
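A few other everyday dictionary operations, sketched with a smaller version of the fruit bowl. The key <code>'kiwi'</code> is deliberately absent to show what happens on a missing key.

```python
fruits_in_bowl = {'apple': 4, 'banana': 2, 'cherry': 0}

has_apple = 'apple' in fruits_in_bowl        # "in" checks the keys
kiwi_count = fruits_in_bowl.get('kiwi', 0)   # .get() returns a default if the key is missing
total = sum(fruits_in_bowl.values())         # add up all of the values

# Indexing with a missing key raises KeyError.
try:
    fruits_in_bowl['kiwi']
    missing = False
except KeyError:
    missing = True

# The dict function can build a dictionary from (key, value) pairs.
from_pairs = dict([('pear', 9), ('date', 12)])

print(has_apple, kiwi_count, total, missing, from_pairs)
```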
 +
 +
==== Sets ====
 +
A <code>set</code> is an unordered collection of unique items, with constant-time membership checking.
 +
<source lang="python">
 +
fruits = {'Apple', 'Apple', 'Apple', 'Pear'} # The set removes duplicates, so it will only contain {'Apple', 'Pear'}
 +
fruits = set(['Apple', 'Apple', 'Apple', 'Pear']) # This line transforms a list into a set
 +
print('Apple' in fruits) # Constant-time!
 +
</source>
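Sets also support the usual mathematical set operations. A minimal sketch, with made-up set names:

```python
bowl = {'Apple', 'Pear', 'Cherry'}
basket = {'Pear', 'Date'}

either = bowl | basket      # union: items in at least one set
both = bowl & basket        # intersection: items in both sets
only_bowl = bowl - basket   # difference: items in bowl but not in basket

# Careful: {} creates an empty dict, not an empty set. Use set() instead.
empty_set = set()
empty_dict = {}

print(either, both, only_bowl, type(empty_set), type(empty_dict))
```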
 +
  
 
=== Loops ===
 
=== Loops ===
Python while loops are written like this:
+
 
 +
In Python, while loops are written like this:
 +
 
 
<source lang="python">
 
<source lang="python">
 
i = 0
 
i = 0
Line 109: Line 430:
 
     i += 1
 
     i += 1
 
</source>
 
</source>
 +
For loops are written like this:
  
Python for loops are written like this:
 
 
<source lang="python">
 
<source lang="python">
 
pets = ['Dog', 'Cat', 'Fish']
 
pets = ['Dog', 'Cat', 'Fish']
Line 117: Line 438:
 
     print(pet)
 
     print(pet)
 
</source>
 
</source>
 
+
To iterate over a range of numbers, you can use the convenient <code>range</code> function, like so:
To iterate over a range of numbers, you can use the convenient ''range'' function, like so:
 
  
 
<source lang="python">
 
<source lang="python">
for i in range(20, 30): # Note, if you omit the first argument to range, it's assumed to be zero
+
for i in range(20, 30): # If you omit the first argument to range, it's assumed to be zero.
 
     print(i)
 
     print(i)
 
</source>
 
</source>
 +
Note that <code>range</code> provides a closed-open range. That is to say, it is inclusive of its lower bound and exclusive of its upper bound. To be very explicit about it:
  
 +
<source lang="python">
 +
list(range(3, 6)) == [3, 4, 5]
 +
</source>
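<code>range</code> also accepts an optional third argument, a step, which works much like the step in slice notation. A short sketch:

```python
zero_to_four = list(range(5))        # start defaults to 0
by_threes = list(range(0, 10, 3))    # every third number below 10
countdown = list(range(5, 0, -1))    # a negative step counts down

print(zero_to_four, by_threes, countdown)
```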
 
Unlike in other languages you may know, the following is not good practice in Python:
 
Unlike in other languages you may know, the following is not good practice in Python:
  
Line 134: Line 458:
 
     pet = pets[i]
 
     pet = pets[i]
 
</source>
 
</source>
 +
Instead, if you need to access both a variable and its index, use the <code>enumerate</code> function, which allows you to iterate through a group of items, along with their indexes, like so:
  
Instead, if you need to access both a variable and its index, use the ''enumerate'' function, which allows you to iterate through a group of items, along with their indexes, like so:
+
<source lang="python">
 
 
<source lang="Python">
 
 
pets = ['Dog', 'Cat', 'Fish']
 
pets = ['Dog', 'Cat', 'Fish']
  
 
for index, pet in enumerate(pets):
 
for index, pet in enumerate(pets):
     print("%s at index %d" % (pet, index))
+
     print(f"{pet} at index {index}")
 
</source>
 
</source>
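If you'd rather count from something other than zero (for a numbered list shown to a user, say), <code>enumerate</code> takes an optional <code>start</code> argument. A small sketch:

```python
pets = ['Dog', 'Cat', 'Fish']

# Number the pets starting from 1 instead of 0.
numbered = [f"{position}. {pet}" for position, pet in enumerate(pets, start=1)]

print(numbered)
```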
  
 +
=== Generator Expressions and Comprehensions ===
 +
 +
A generator is a special kind of object that, basically, represents a collection of items using a value and a function for generating the next value in the collection. In other words, it's a way of creating something that behaves like a list, but which only generates one value at a time, and discards a value when it's done with it. Say, for example, you wanted to do an operation on 10,000,000,000,000 numbers. You could put them all in a list, then iterate over the list, but that would take a lot of memory, plus you'd have to create and fill the entire list before beginning to actually work on your problem. Instead, you'd probably be better off using a generator; telling the computer, essentially: “The starting value is 0. When I ask for the next number in the collection, add 1 to current value and give me that.”
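One way to picture the "start at 0, add 1 each time" idea is the sketch below. It uses a generator function with the <code>yield</code> keyword, which this guide does not otherwise cover, and the name <code>count_up</code> is made up for the example. Treat it as an illustration of the concept rather than something you need for this module.

```python
def count_up(start=0):
    """Yield start, start + 1, start + 2, ... one value at a time."""
    current = start
    while True:
        yield current   # hand back one value, then pause here
        current += 1


counter = count_up()
first_three = [next(counter), next(counter), next(counter)]
fourth = next(counter)   # the generator remembers where it left off

print(first_three, fourth)
```

No list of numbers is ever built here; each value is computed only when <code>next</code> asks for it.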
  
=== Tuples ===
+
A generator expression is an expression that is evaluated to create a generator. They're super useful. Generator expressions can take two forms – they can be used with and without a condition:
  
Python has a special datatype called '''tuple''', which is an unmodifiable array.
+
<source lang="python">
 +
(EXPRESSION for VARIABLE in ITERABLE) # Unconditional
 +
(EXPRESSION for VARIABLE in ITERABLE if CONDITION) # Conditional
 +
</source>
 +
For our purposes, you can think of an iterable as anything you can do a for loop over. Let's take a look at a generator expression in use:
  
 
<source lang="python">
 
<source lang="python">
cities = ('St. Louis', 'Los Angeles', 'Seattle') # Tuples are defined using parentheses.
+
double_even_digits = (number * 2 for number in range(10) if number % 2 == 0)
single_item_tuple = (1,) # To make a tuple with only one item, put a comma after it
 
 
</source>
 
</source>
 +
That line says that <code>double_even_digits</code> is every even number from 0 through 9, multiplied by two. Simple enough, right? If we wanted to use all of the numbers, not just the even ones, we could simply remove the <code>if</code> clause at the end of the expression.
  
 +
Here's the thing, though: if you were to print <code>double_even_digits</code>, instead of a nice bunch of numbers, you'd get something that looks like this:
  
They can also serve as convenient ways to assign multiple variables at once:
+
<code><generator object <genexpr> at 0x107fc92b0></code>
 +
 
 +
That's because generators are lazy! They don't generate their values until those values are needed, and they discard values once they're no longer needed. As a result, '''you can only iterate over a generator once'''. In this case, you can't print out the whole generator, because it hasn't generated all the values yet. You can, however, iterate over the generator, and print its values one at a time, like so:
  
 
<source lang="python">
 
<source lang="python">
first_name, last_name = "John", "Smith"
+
for number in double_even_digits:
 +
    print(number)
 
</source>
 
</source>
 
+
You can also make a list out of the generator, which will go through all the items in the generator and put them in a list, like so:
Tuples also enable you to have multiple return values from a function:
 
  
 
<source lang="python">
 
<source lang="python">
def compute_length(string):
+
even_number_list = list(double_even_digits)
str_len = len(string)
 
if str_len < 5:
 
return (str_len, "short")
 
elif str_len < 40:
 
return (str_len, "medium")
 
else:
 
return (str_len, "long")
 
 
 
length, description = compute_length("Four score and seven years ago")
 
print("The %s string is %d characters long." % (description, length))
 
 
</source>
 
</source>
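Keep the "only once" rule in mind when combining these techniques: once a generator has been consumed, whether by a for loop or by <code>list</code>, there is nothing left in it. A minimal sketch:

```python
double_even_digits = (number * 2 for number in range(10) if number % 2 == 0)

first_pass = list(double_even_digits)    # consumes every value
second_pass = list(double_even_digits)   # the generator is now exhausted

print(first_pass, second_pass)
```

If you need the values more than once, store them in a list the first time you go through the generator.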
 +
That's actually a great segue to the second topic of this section: comprehensions. A comprehension is just a prettier way to make lists, sets, and dicts out of generator expressions.
  
The above example also demonstrates Python's if...elif...else conditional structure.
+
<source lang="python">
 
+
# If you want a list:
=== Dictionaries ===
+
list(i for i in range(10) if i % 2 == 0) # Don't do this
 +
[i for i in range(10) if i % 2 == 0] # Use a list comprehension instead!
  
Python has another datatype called a '''dictionary''' (or '''dict''', for short), which are like ''maps'' in Java, ''associative arrays'' in PHP, and ''object literals'' in JavaScript (coming up soon in Module 6).  Essentially, they enable you to use any immutable object as the key in your data structure.
+
# If you want a set:
 +
set(i for i in range(10) if i % 2 == 0) # Don't do this
 +
{i for i in range(10) if i % 2 == 0} # Use a set comprehension instead!
  
<source lang="python">
+
# If you want a dict:
fruits_in_bowl = {
+
dict((i, i + 1) for i in range(10) if i % 2 == 0) # Don't do this
'apple': 4,
+
{i : i + 1 for i in range(10) if i % 2 == 0} # Use a dict comprehension instead!
'banana': 2,
 
'cherry': 0,
 
'date': 12
 
}
 
  
for fruit, num in fruits_in_bowl.items():
+
# If you want a tuple:
print("There are %d %s(s) in the bowl." % (num, fruit))
+
tuple(i for i in range(10) if i % 2 == 0) # Unfortunately, there's no better way to do this
 
</source>
 
</source>
 +
In general, generators and comprehensions should be used whenever they fit: they're usually the most concise, and often the fastest, way to build collections. If you ever see yourself writing code like this:
  
If you only want the keys from a dictionary, you can just iterate over it directly, like this:
+
<source lang="python">
<source lang='python'>
+
# BAD PRACTICE! DO NOT USE!
for fruit in fruits_in_bowl:
+
digits = []
     print(fruit)
+
for digit in range(10):
 +
     digits.append(digit)
 
</source>
 
</source>
 
+
Don't do it! Instead, use a generator expression or comprehension, as appropriate. A good rule of thumb: if you're just going to be iterating over your collection once, use a generator expression. If not, use a comprehension.
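As a sketch of that rule of thumb: a comprehension when you want to keep the collection, and a generator expression when the values are used once and thrown away. The variable names are arbitrary.

```python
# We want to keep and reuse the collection, so build a list.
digits = [digit for digit in range(10)]

# The squares are only needed once, to be added up, so a generator
# expression avoids building a temporary list.
total_of_squares = sum(digit * digit for digit in range(10))

print(digits, total_of_squares)
```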
If you want only the values but not the keys from a dictionary, use '''my_dictionary.values()'''.
 
  
 
=== Sorting ===
 
=== Sorting ===
  
Sorting in Python is frequently performed using the '''sorted''' function, which takes two arguments: an iterable (anything you can do a for-loop over, including lists and tuples) and a function used to evaluate each item. If the function is omitted, Python will try to sort the lists by value.
+
Sorting in Python is frequently performed using the <code>sorted</code> function, which takes an iterable (anything you can do a for-loop over, including lists and tuples) and, optionally, a function used to evaluate each item, passed by keyword as <code>key=...</code>. If the function is omitted, Python will try to sort the items by value.
  
 
The following example demonstrates using an inline function, which Python calls a ''lambda''.
 
The following example demonstrates using an inline function, which Python calls a ''lambda''.
Line 217: Line 541:
 
print(new_fruits) # ['date', 'apple', 'banana', 'cherry']
 
print(new_fruits) # ['date', 'apple', 'banana', 'cherry']
 
</source>
 
</source>
 
+
Alternatively, you can sort a <code>list</code> using the <code>.sort</code> method. This is faster than calling <code>sorted</code>, but it modifies the list in-place and returns <code>None</code>.
Alternatively, you can sort a ''list'' using the '''.sort''' method. This is faster than calling '''sorted''', but it modifies the list in-place.
 
  
 
<source lang="python">
 
<source lang="python">
Line 224: Line 547:
 
fruits.sort()
 
fruits.sort()
 
</source>
 
</source>
 
 
=== Import ===
 
=== Import ===
  
If you want to use functions from other libraries (including ones that you install using pip), use '''import''':
+
If you want to use functions from other libraries (including ones that you install using pip), use <code>import</code>:
  
 
<source lang="python">
 
<source lang="python">
Line 233: Line 555:
  
 
current_time = time.localtime()
 
current_time = time.localtime()
print time.strftime('%a, %d %b %Y %H:%M:%S', current_time)
+
print(time.strftime('%a, %d %b %Y %H:%M:%S', current_time))
 
</source>
 
</source>
 
+
If you want to pull the functions out of their namespace, you can use <code>from ___ import ___</code> syntax:
If you want to pull the functions out of their namespace, you can use '''from ___ import ___''' syntax:
 
  
 
<source lang="python">
 
<source lang="python">
Line 242: Line 563:
  
 
current_time = localtime()
 
current_time = localtime()
print strftime('%a, %d %b %Y %H:%M:%S', current_time)
+
print(strftime('%a, %d %b %Y %H:%M:%S', current_time))
  
 
# Be aware that this technique, although convenient, may cause unexpected behavior if the function names that you're pulling out of the namespace are already used for other purposes in Python.
 
# Be aware that this technique, although convenient, may cause unexpected behavior if the function names that you're pulling out of the namespace are already used for other purposes in Python.
 
</source>
 
</source>
 
 
=== File I/O ===
 
=== File I/O ===
  
Line 257: Line 577:
 
f.close() # free up memory when we're finished with the file
 
f.close() # free up memory when we're finished with the file
 
</source>
 
</source>
 
 
A better option is to use a with-block, which handles opening and closing the file for you automatically:
 
A better option is to use a with-block, which handles opening and closing the file for you automatically:
  
Line 264: Line 583:
 
     file_contents = f.read()
 
     file_contents = f.read()
 
</source>
 
</source>
 
 
 
You can read a file line-by-line like this:
 
You can read a file line-by-line like this:
  
Line 271: Line 588:
 
with open("example.txt") as f:
 
with open("example.txt") as f:
 
     for line in f:
 
     for line in f:
print("Read line: %s" % line.rstrip())
+
        print(f"Read line: {line.strip()}")
 
 
 
</source>
 
</source>
 
 
You can write to a file like this:
 
You can write to a file like this:
  
 
<source lang="python">
 
<source lang="python">
 
with open("example.txt", "w") as f:
 
with open("example.txt", "w") as f:
f.write("Hello\nWorld\n")
+
    f.write("Hello\nWorld\n")
 
</source>
 
</source>
  
 
=== Command-Line Arguments ===
 
=== Command-Line Arguments ===
  
Command line arguments are accessible in the variable '''sys.argv'''.
+
Command line arguments are accessible in the variable <code>sys.argv</code>.
  
The following example shows a program that expects a filename as its argument, and it prints a usage message if the argument is not present ([http://stackoverflow.com/questions/983201/python-and-sys-argv source]).
+
The following example shows a program that expects a filename as its argument, and it prints a usage message if the argument is not present ([https://stackoverflow.com/questions/983201/python-and-sys-argv source]).
  
 
<source lang="python">
 
<source lang="python">
Line 292: Line 607:
  
 
if len(sys.argv) < 2:
 
if len(sys.argv) < 2:
sys.exit("Usage: %s filename" % sys.argv[0])
+
    sys.exit(f"Usage: {sys.argv[0]} filename")
  
 
filename = sys.argv[1]
 
filename = sys.argv[1]
  
 
if not os.path.exists(filename):
 
if not os.path.exists(filename):
sys.exit("Error: File '%s' not found" % sys.argv[1])
+
    sys.exit(f"Error: File '{sys.argv[1]}' not found")
 
</source>
 
</source>
  
Line 306: Line 621:
 
<source lang="python">
 
<source lang="python">
 
class Food:
 
class Food:
# constructor:
+
    # constructor:
def __init__(self, name):
+
    def __init__(self, name):
self.name = name
+
        self.name = name
+
 
@staticmethod
+
    def format_name(self):
def get_definition():
+
        return f"Gotta love to eat {self.name}"
return "Food is nourishment for carbon-based lifeforms."
 
 
def format_name(self):
 
return "Gotta love to eat " + self.name
 
  
 
class Fruit(Food):
 
class Fruit(Food):
def format_name(self):
+
    def get_definition(self):
return Food.format_name(self) + " (fruit)"
+
        return f"{self.format_name()} (fruit)"
  
 
fruit = Fruit("Cherry")
 
fruit = Fruit("Cherry")
print fruit.format_name()
+
print(fruit.format_name())
print Food.get_definition()
+
print(fruit.get_definition())
 
</source>
 
</source>
 +
This is the same example as in [[PHP#Object-Oriented_Programming|the PHP guide]].
  
This is the same example as in [[PHP#Object-Oriented Programming|the PHP guide]].
+
Some things to notice:
  
Some things to notice:
+
* Instance methods ''always'' take <code>self</code> as their first argument, followed by any number of additional parameters. This can be misleading for programmers familiar with other languages, because the number of arguments you feed to the method is actually ''one less than'' the number of declared parameters. Whenever you call a method on a class instance, that instance is implicitly fed into the explicitly-declared <code>self</code> parameter of the method.
* Static methods require the '''@staticmethod''' decorator
+
*: Note that the <code>self</code> variable has the same purpose as the <code>this</code> variable in languages like PHP, Java, JavaScript, and C++.
* Non-static methods ''always'' take ''self'' as their first argument, followed by any number of additional parameters. This can be misleading for programmers familiar with other languages, because the number of arguments you feed to the method is actually ''one less than'' the number of declared parameters. Whenever you call a method on a class instance, that instance is implicitly fed into the explicitly-declared ''self'' parameter of the method.
+
*There is no need for a <code>new</code> keyword in Python.
*: Note that the ''self'' variable has the same purpose as the ''this'' variable in languages like PHP, Java, JavaScript, and C++.
 
*: Take home message is that you need to add an additional parameter, ''self'', at the beginning of any instance method.
 
* There is no need for a ''new'' keyword in Python.
 
  
 +
= Footnotes =
 +
# {{note|interpreted}}This is an oversimplification. If you're interested, here's a quick explanation of how: In CPython (the reference implementation of Python) programs are compiled to bytecode, an intermediate representation, which is then interpreted.
 
[[Category:Module 4]]
 
[[Category:Module 4]]

Latest revision as of 06:10, 10 January 2023

Languages like Java and C++ have lots of rules regarding variable types, syntax, return values, and so on. These can be cumbersome when you are trying to write short, quick scripts to perform tasks. This is where a scripting language comes into play.

Python is a language well-suited to rapid prototype development. It is generally an interpreted language, which means that, generally speaking, a Python program is converted line by line into instructions for the computer to execute as the program runs, instead of having a separate compilation step beforehand [1]. The syntax is clean, and it is usually clear at first glance what is going on when you write in Python.

Installation

Python may already be installed on your system. To see whether or not it is, enter the command

$ python3 --version

If it tells you a version of Python greater than or equal to 3.6 then you're good to go. If not, you need to do a quick package install to get it up and running. Using yum, it's just sudo yum install python36.

If you'd like to be able to just type python instead of python3, you can add the following line to the ~/.bashrc file on your instance:

alias python="/usr/bin/python3"

Python 2 vs Python 3

In 2008, a new version of Python – Python 3 – was released, with a number of changes that made it incompatible with the previous version, Python 2. For several years, the Python community was slow to make the change to Python 3, but today, Python 3 is the standard for new code. The only reason to write Python 2 code today is because you have to work in an environment or with libraries that do not support Python 3, and that situation is becoming increasingly rare.

For CSE 330 (and this guide), you must use Python 3.6 or greater.

Pip

Linux distributions have package managers like Apt, Yum, and YaST. PHP has a package manager named PEAR. It's now time to introduce Python's leading package manager: pip.

Once you have pip installed, you can use it to install Python packages. Use the pip-3.6 command:

$ pip-3.6 install <package_name> # RHEL


If you'd like to be able to just type pip instead of pip-3.6, you can add the following line to the ~/.bashrc file on your instance:

alias pip="/usr/bin/pip-3.6"

Running Python

There are two common ways to run Python code: via the console, and via a Python script file.

The Python Console

The Python console enables you to experiment with code without opening a text editor. To enter the Python console, simply type the python command at the terminal:

$ python3

To leave the interactive console, either type quit() or press Ctrl-D (on macOS and Linux; on Windows, press Ctrl-Z and then Enter).

Python Script Files

You can also save Python script files for later use. The extension for Python scripts is *.py. To run a script file, simply feed its path as an argument to the python command in Terminal:

$ python3 <my_script>.py

Python Syntax and Language Components

This section contains a very brief overview of Python syntax. For a more comprehensive introduction, see the Python docs.

Python's Type System

Python is a dynamically-typed, strongly-typed, and duck-typed language. What does that mean? Well, let's take those categories one by one:

  1. Dynamic typing, in contrast to static typing, means that a variable is not tied to one type: the same name can be rebound to objects of different types. In Java, for instance, if you tried to write the following code:
    int c = 2;
    c = "fish";
    

    Your code would fail to compile, because the Java compiler wouldn't allow you to change the type of the variable c from int to String.

    Python, however, is just fine with that. You can write:

    c = 2
    c = "fish"
    

    And Python will happily do what you ask.

  2. Strong typing, in contrast to weak typing, means, roughly, that a language won't implicitly convert types for you. So far, you've used a lot of PHP, which is very weakly typed. For example:
    echo "10" . 2;
    

    In PHP, this works just fine. The PHP interpreter sees that 2 is an int, and silently converts it to a string for you, so that "10". 2 equals "102". The problem with weak typing is that it can make it unclear to the programmer why things are happening the way they are. For instance, in PHP, the expression "10" == 10 is true, which can be less than obvious. Python, however, tries to disallow that sort of thing:

    "10" + "10" # Allowed – concatenates the two strings.
    10   +  10  # Allowed – adds the two ints.
    "10" +  10  # Not allowed – raises a TypeError
    

    Perhaps the one exception to this rule is the relationship between numeric types; it's perfectly fine to add a float to an int, for example.

  3. Duck-typing is a little bit different than the other categories, because it doesn't really have a direct opposite. Essentially, duck-typing is the philosophy that "If it walks like a duck, and quacks like a duck, it's a duck." In programming terms, that means that we often don't care what type an object is, as long as it can do the things we ask. Let's look at an example:
    # This function doubles the value it's passed.
    def double(thing):
        return thing + thing
    double(10) # Returns 20
    double("fish") # Returns "fishfish"
    

    If this were Java, we'd have to write two different methods here: one for things of type int, and one for things of type String. Since this is Python, though, we don't have to do that. Instead, Python lets us say, essentially: "This function uses the + operator, so you can pass whatever object you want to the function, as long as that object supports the + operator".

    To quote Alex Martelli, one of the key people in the Python project,

    "[Don't] check whether it IS-a duck: check whether it QUACKS-like-a duck, WALKS-like-a duck, etc, etc, depending on exactly what subset of duck-like behaviour you need."

Basic Types

In Python, the basic scalar types are:

  1. str – a string.
  2. int – an integer object with unlimited digits.
  3. bool – a Boolean value.
  4. float – a floating point number; generally equivalent to double in Java.
  5. bytes – represents a collection of bytes; you're unlikely to use this often.

Casting is done by calling the target type as if it were a function - to parse an int from a string, for instance, you'd write int("123").
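Because Python is strongly typed, these conversions have to be written out explicitly. The following is a minimal sketch of a few common ones; the variable names are made up for illustration.

```python
# Explicit conversions between the basic types.
count = int("123")      # str -> int
price = float("4.5")    # str -> float
label = str(42)         # int -> str
whole = int(3.9)        # float -> int drops the fractional part

# Mixing str and int with + raises a TypeError, so convert one side first.
try:
    "10" + 10
    mixed_allowed = True
except TypeError:
    mixed_allowed = False

as_number = int("10") + 10    # 20
as_text = "10" + str(10)      # "1010"

# A string that does not look like a number raises a ValueError.
try:
    int("fish")
    parsed = True
except ValueError:
    parsed = False
```

Note that int() on a float truncates toward zero rather than rounding.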

Basic Control Flow

In Python, blocks of code are delineated using colons and indentation instead of curly braces. That means that proper indentation is mandatory. Every time you use a colon, you must indent one step further, and every time a block ends, you must unindent a step.

Conditional statements are written using the if, elif, and else keywords, like in the example below - pay attention to the indentation:

cookies_left = 10
cake_slices_left = 3
if cookies_left > 10:
    print("Lots of cookies!")
elif cookies_left > 8:
    print("Plenty left!")
elif cookies_left == 5:
    print("Halfway done!")
else:
    if cake_slices_left == 3:
        print("Lots of cake!")
    elif cake_slices_left == 2:
        print("One slice gone!")
    else:
        print("Little to no cake left!")

An Example Python Script

In my mind, there's no better way to learn Python than to be immersed in a simple example script.

print("Hello World")

fruits = ["apple", "banana", "cherry", "date"]
for fruit in fruits:
    print(f"I always love to eat a fresh {fruit}")

# Map the fruits list over to a new list containing the length of the fruit strings:
fruit_size = [len(fruit) for fruit in fruits]

avg_fruit_size = sum(fruit_size) / len(fruit_size)
print(f"The average fruit string length is {avg_fruit_size:.2f}.")

Some things to notice:

  • Printing is achieved using the print() function.
  • Comments start with a pound symbol #.
  • We can transform/map a list to a new list in just one line!

For some more examples, see the Python wiki.

A Few Gotchas

Here are a few things you may not expect when starting out with Python:

  1. Strings can be delimited using either single quotes (') or double quotes ("), so 'Python' is the same as "Python". You can also use three sets of quotes (single or double again) to delineate a multiple-line string, like so:
    long_string = """This
    is my
    very long
    multiple-line
    string"""
    
  2. To compare the equality of two objects, use the == operator. To compare the identity of two objects (i.e. whether or not they are literally the same object, at the same place in memory), use the is operator. These correspond, respectively, to .equals() and == in Java.
  3. The / operator performs floating-point division, not integer division, even when both its arguments are integers. If you want to do integer division, use the // operator, which rounds the result down (floor division). Let's look at an example:
    9   / 10    == 0.9 # Floating-point division
    9.0 / 10.0  == 0.9 # Floating-point division
    9   // 10   == 0   # Integer division
    9.0 // 10.0 == 0.0 # Integer division, even though the resulting value is a float
    
  4. As discussed in the section Python's Type System, you can't combine strings and numbers using the + operator. Instead, you'll have to use one of the following methods of string formatting:
      
    # Technique 1 – printf-style Formatting
    # This is a classic style of string formatting, used in many programming languages, including in Java's String.format.
    "You scored %d out of %d!" % (score, questions)
    
    # Technique 2 – str.format
    "You scored {} out of {}!".format(score, questions)
    
    # Technique 3 – f-strings
    # These were only introduced in Python 3.6 but they're very convenient.
    f"You scored {score} out of {questions}!"
    
    # Technique 4 - String Concatenation
    # This is fine for small things, but for more complicated stuff, use one of the other techniques.
    "You scored " + str(score) + " out of " + str(questions) + "!"
    

Functions

Define functions using the def keyword:

def hello(name):
    print(f"Hello, {name}!")

Invoke functions using parentheses:

hello("Batman")
hello("Superman")

Functions can have optional arguments; however, these must come after any required arguments:

def hello(name="User"):
    print(f"Hello, {name}!")

hello() # Prints "Hello, User!"
hello("Sarah") # Prints "Hello, Sarah!"

Functions are just regular objects in Python, meaning you can do stuff like this:

def hello(name="User"):
    print(f"Hello, {name}!")

greeting = hello

greeting("Siwei") # Prints "Hello, Siwei!"
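Since functions are objects, they can also be passed to other functions as arguments. Here is a small sketch of that idea; shout, whisper, greet, and the style parameter are hypothetical names invented for this example.

```python
def shout(text):
    return text.upper() + "!"

def whisper(text):
    return text.lower() + "..."

def greet(name, style=shout):
    # style is any function that takes a string and returns a string
    return style(f"Hello, {name}")

loud = greet("Batman")                   # uses the default style, shout
quiet = greet("Batman", style=whisper)   # optional argument passed by keyword
```

Passing the optional argument by keyword (style=whisper) keeps the call readable when a function has several optional arguments.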

Collection Types

Python has a few basic, built-in collection types. They each have their own special use cases, but they share a lot of similarities. All built-in data structures in Python are zero-indexed, for instance.

To obtain the number of elements in a collection, use the len() function, like this:

pets = ['Dog', 'Cat', 'Fish'] # This is a list; you'll learn about them below.

len(pets) # Returns 3

To check if a collection contains an item, use the in operator:

pets = ['Dog', 'Cat', 'Fish'] # This is a list; you'll learn about them below.

'Dog' in pets # Returns True

Helpfully, the Python Software Foundation maintains a page containing the asymptotic complexities of important operations on major Python data structures. It's available here.

Lists

Unlike languages like Java and C, Python's array-like type doesn't have fixed length. Instead, a list is a dynamic array (like Java's ArrayList) of objects of any type. Here's an example:

fruits = ["Apple", "Pear", "Orange"] # Lists can be created by placing their items between square brackets

random_items = ['Apple', 12312, 2.0, [1, 2, 3]] # Lists can contain objects of different types - even other lists!

pets = ['Dog', 'Cat', 'Fish']
pets.append('Turtle') # Adds 'Turtle' to the end of pets

For the next few examples, we'll be using the following list:

pets = ['Dog', 'Cat', 'Fish', 'Turtle', 'Lizard', 'Snake']

To access or assign elements of a list, use the [] operator.

pets[0] # Returns 'Dog'
pets[1] = 'Cow' # Sets pets[1] to 'Cow'
pets[1] = 'Cat' # Sets pets[1] back to 'Cat'

You can also access or assign lists using negative indices, which will return elements starting from the end of the list:

pets[-1] # Returns 'Snake'
pets[-2] # Returns 'Lizard'

You can also use 'slice notation' to obtain or assign subsets of lists:

pets[start:end] # Returns items from index start to index end - 1.
pets[start:]    # Returns items from index start to the end of the list, inclusive.
pets[:end]      # Returns items from the beginning of the list to index end - 1.
pets[:]         # Returns every item in the list

There's also a third, optional argument: step, which lets you jump over values:

pets[start:end:step] # Returns items from index start to index end - 1, jumping step values at a time

# Examples:
pets[1:5:2] # Returns items from index 1 to index 4, jumping 2 at a time, so ['Cat', 'Turtle']
pets[::3] # Returns all items in the list, jumping 3 at a time, so ['Dog', 'Turtle']
pets[::-1] # Returns all items in the list, jumping -1 at a time. This reverses the list, so ['Snake', 'Lizard', 'Turtle', 'Fish', 'Cat', 'Dog']

Lists have all the performance characteristics you'd expect of a dynamic array - constant-time indexing, amortized constant-time appends, and so on.
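To make the indexing and slicing rules concrete, here is a short sketch using the same pets list. It also shows that a slice is a new list, so changing the copy leaves the original alone; pets_copy is just an illustrative name.

```python
pets = ['Dog', 'Cat', 'Fish', 'Turtle', 'Lizard', 'Snake']

first_two = pets[:2]     # ['Dog', 'Cat']
last_two = pets[-2:]     # ['Lizard', 'Snake']

# pets[:] builds a new list with the same items.
pets_copy = pets[:]

# Slice assignment replaces the items at indices 1 and 2 with a single item.
pets_copy[1:3] = ['Cow']
```

After this runs, pets_copy has five items while pets still has all six.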

Tuples

Python has a special data type called tuple, which is exactly the same as a list except for the fact that it cannot be modified, so you can't append or replace items.

cities = ('St. Louis', 'Los Angeles', 'Seattle') # Tuples are defined using parentheses.
numbers = 1, 2, 3 # You can also omit the parentheses if it's unambiguous
single_item_tuple = (1,) # To make a tuple with only one item, put a comma after it

Tuples also enable you to have multiple return values from a function:

def compute_length(string):
    str_len = len(string)
    if str_len < 5:
        return str_len, "short"
    elif str_len < 40:
        return str_len, "medium"
    return str_len, "long"

length, description = compute_length("Four score and seven years ago")
print(f"The {description} string is {length} characters long.")

The above example also demonstrates Python's if...elif...else conditional structure.
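The multiple-return trick works because Python can unpack a tuple into several variables at once. A brief sketch, with made-up variable names, that also shows what happens if you try to modify a tuple:

```python
point = (3, 4)

# Unpacking assigns each item of the tuple to its own variable.
x, y = point

# The same mechanism swaps two variables without a temporary.
x, y = y, x

# Tuples cannot be modified: item assignment raises a TypeError.
try:
    point[0] = 10
    modified = True
except TypeError:
    modified = False
```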

Dictionaries

Python has another datatype called a dictionary (or dict, for short), which is like a Map in Java, an associative array in PHP, or an object literal in JavaScript (coming up soon in Module 6). Essentially, a dictionary lets you use any immutable (more precisely, hashable) object as a key in your data structure. dict objects can be created using either the dict function, or by placing items between curly braces, separating keys and values using colons:

fruits_in_bowl = {
    'apple': 4,
    'banana': 2,
    'cherry': 0,
    'date': 12
}

apple_count = fruits_in_bowl['apple'] # Equals 4

fruits_in_bowl['pear'] = 9 # Sets 'pear' in the dictionary to 9.

for fruit, num in fruits_in_bowl.items():
    print(f"There are {num} {fruit}(s) in the bowl.")

If you only want the keys from a dictionary, you can just iterate over it directly, like this:

for fruit in fruits_in_bowl:
    print(fruit)

If you want only the values but not the keys from a dictionary, use my_dictionary.values().
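A few more everyday dictionary operations, sketched with the same fruit bowl. Looking up a missing key with square brackets raises a KeyError, so the get method with a default value is often more convenient.

```python
fruits_in_bowl = {'apple': 4, 'banana': 2, 'cherry': 0, 'date': 12}

# values() gives just the counts.
total = sum(fruits_in_bowl.values())

# The in operator checks keys, not values.
has_pear = 'pear' in fruits_in_bowl

# get() returns the default instead of raising when the key is missing.
pear_count = fruits_in_bowl.get('pear', 0)

try:
    fruits_in_bowl['pear']
    missing = False
except KeyError:
    missing = True
```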

Sets

A set is an unordered collection of unique items, with constant-time membership checking.

fruits = {'Apple', 'Apple', 'Apple', 'Pear'} # The set removes duplicates, so it will only contain {'Apple', 'Pear'}
fruits = set(['Apple', 'Apple', 'Apple', 'Pear']) # This line transforms a list into a set
print('Apple' in fruits) # Constant-time!
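Sets also support the usual mathematical set operations. A minimal sketch, with illustrative names:

```python
mine = {'Apple', 'Pear', 'Cherry'}
yours = {'Pear', 'Date'}

both = mine & yours        # intersection: items in both sets
either = mine | yours      # union: items in at least one set
only_mine = mine - yours   # difference: items in mine but not in yours

# Converting a list to a set is a quick way to count distinct items.
unique_count = len(set(['Apple', 'Apple', 'Apple', 'Pear']))
```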


Loops

In Python, while loops are written like this:

i = 0
while i < 10:
    i += 1

For loops are written like this:

pets = ['Dog', 'Cat', 'Fish']

for pet in pets:
    print(pet)

To iterate over a range of numbers, you can use the convenient range function, like so:

for i in range(20, 30): # If you omit the first argument to range, it's assumed to be zero.
    print(i)

Note that range provides a closed-open range. That is to say, it is inclusive of its lower bound and exclusive of its upper bound. To be very explicit about it:

list(range(3, 6)) == [3, 4, 5]

Unlike in other languages you may know, the following is not good practice in Python:

pets = ['Dog', 'Cat', 'Fish']

# BAD PRACTICE. DO NOT USE.
for i in range(len(pets)):
    pet = pets[i]

Instead, if you need to access both a variable and its index, use the enumerate function, which allows you to iterate through a group of items, along with their indexes, like so:

pets = ['Dog', 'Cat', 'Fish']

for index, pet in enumerate(pets):
    print(f"{pet} at index {index}")
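Two related built-ins are worth knowing, sketched below: enumerate accepts a start argument if you want to count from something other than zero, and zip walks through two collections in step. The ages list is invented for this example.

```python
pets = ['Dog', 'Cat', 'Fish']
ages = [3, 5, 1]  # hypothetical data, one age per pet

# Count from 1 instead of 0.
numbered = [f"{position}. {pet}" for position, pet in enumerate(pets, start=1)]

# Pair each pet with the age at the same index.
pairs = [f"{pet} is {age}" for pet, age in zip(pets, ages)]
```

Both lines build their lists in one step, in the same way as the fruit_size line in the example script.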

Generator Expressions and Comprehensions

A generator is a special kind of object that, basically, represents a collection of items using a value and a function for generating the next value in the collection. In other words, it's a way of creating something that behaves like a list, but which only generates one value at a time, and discards a value when it's done with it. Say, for example, you wanted to do an operation on 10,000,000,000,000 numbers. You could put them all in a list, then iterate over the list, but that would take a lot of memory, plus you'd have to create and fill the entire list before beginning to actually work on your problem. Instead, you'd probably be better off using a generator; telling the computer, essentially: “The starting value is 0. When I ask for the next number in the collection, add 1 to current value and give me that.”
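One way to build the "start at 0, add 1 each time" generator described above is a function that uses the yield keyword, which this guide does not otherwise cover. Treat the following as a sketch; count_up is a made-up name, and islice from the standard itertools module is used only to take a few values from an endless generator.

```python
from itertools import islice

def count_up(start=0):
    """Yield start, start + 1, start + 2, ... without ever stopping."""
    current = start
    while True:
        yield current   # hand back one value, then pause until asked again
        current += 1

numbers = count_up()

# Take the first five values; nothing beyond them is computed.
first_five = list(islice(numbers, 5))

# The generator remembers where it stopped.
next_value = next(numbers)
```

No list of all the numbers ever exists in memory; each value is produced only when it is requested.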

A generator expression is an expression that is evaluated to create a generator. They're super useful. Generator expressions can take two forms – they can be used with and without a condition:

(EXPRESSION for VARIABLE in ITERABLE) # Unconditional
(EXPRESSION for VARIABLE in ITERABLE if CONDITION) # Conditional

For our purposes, you can think of an iterable as anything you can do a for loop over. Let's take a look at a generator expression in use:

double_even_digits = (number * 2 for number in range(10) if number % 2 == 0)

That line says that double_even_digits is every even number from 0 through 9 (range(10) stops before 10), multiplied by two. Simple enough, right? If we wanted to use all of the numbers, not just the even ones, we could simply remove the if clause at the end of the expression.

Here's the thing, though: if you were to print double_even_digits, instead of a nice bunch of numbers, you'd get something that looks like this:

<generator object <genexpr> at 0x107fc92b0>

That's because generators are lazy! They don't generate their values until those values are needed, and they discard values once they're no longer needed. As a result, you can only iterate over a generator once. In this case, you can't print out the whole generator, because it hasn't generated all the values yet. You can, however, iterate over the generator, and print its values one at a time, like so:

for number in double_even_digits:
    print(number)

You can also make a list out of the generator, which will go through all the items in the generator and put them in a list, like so:

even_number_list = list(double_even_digits)

That's actually a great segue to the second topic of this section: comprehensions. A comprehension is just a prettier way to make lists, sets, and dicts out of a generator expression.

# If you want a list:
list(i for i in range(10) if i % 2 == 0) # Don't do this
[i for i in range(10) if i % 2 == 0] # Use a list comprehension instead!

# If you want a set:
set(i for i in range(10) if i % 2 == 0) # Don't do this
{i for i in range(10) if i % 2 == 0} # Use a set comprehension instead!

# If you want a dict:
dict((i, i + 1) for i in range(10) if i % 2 == 0) # Don't do this
{i : i + 1 for i in range(10) if i % 2 == 0} # Use a dict comprehension instead!

# If you want a tuple:
tuple(i for i in range(10) if i % 2 == 0) # Unfortunately, there's no better way to do this

In general, generators and comprehensions should be used whenever possible - they're the fastest and easiest way to build collections. If you ever see yourself writing code like this:

# BAD PRACTICE! DO NOT USE!
digits = []
for digit in range(10):
    digits.append(digit)

Don't do it! Instead, use a generator expression or comprehension, as appropriate. A good rule of thumb: if you're just going to iterate over your collection once, use a generator expression. If not, use a comprehension.
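That rule of thumb can be illustrated with a short sketch. A generator expression passed straight to sum is consumed exactly once, while a list comprehension gives you a collection you can reuse. The last three lines show why the distinction matters: a generator is empty after its first pass.

```python
# Used once: a generator expression avoids building a temporary list.
squares_total = sum(n * n for n in range(10))

# Used several times: a list comprehension keeps the values around.
squares = [n * n for n in range(10)]
largest = max(squares)
how_many = len(squares)

# A generator can only be iterated over once.
lazy_squares = (n * n for n in range(3))
first_pass = list(lazy_squares)
second_pass = list(lazy_squares)
```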

Sorting

Sorting in Python is frequently performed using the sorted function, which takes an iterable (anything you can do a for-loop over, including lists and tuples) and, optionally, a key function that is applied to each item to produce the value to sort by. If the key function is omitted, Python compares the items themselves. Either way, sorted returns a new list and leaves the original untouched.

The following example demonstrates using an inline function, which Python calls a lambda.

fruits = ['apple', 'banana', 'cherry', 'date']

# sort the fruits by string length
new_fruits = sorted(fruits, key=lambda v: len(v))

print(new_fruits) # ['date', 'apple', 'banana', 'cherry']

Alternatively, you can sort a list using the .sort method. This is faster than calling sorted, but it modifies the list in-place and returns None.

fruits = ['apple', 'banana', 'cherry', 'date']
fruits.sort()
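The difference between the two approaches is easy to trip over, so here is a short sketch. Both sorted and .sort accept a reverse argument, and any function can serve as the key, including the built-in len.

```python
fruits = ['apple', 'banana', 'cherry', 'date']

# sorted returns a new list; items with equal keys keep their original order.
longest_first = sorted(fruits, key=len, reverse=True)

# .sort changes the list itself and returns None, so don't assign its result.
result = fruits.sort(reverse=True)
```

After this runs, longest_first is ['banana', 'cherry', 'apple', 'date'], fruits is in reverse alphabetical order, and result is None.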

Import

If you want to use functions from other libraries (including ones that you install using pip), use import:

import time

current_time = time.localtime()
print(time.strftime('%a, %d %b %Y %H:%M:%S', current_time))

If you want to pull the functions out of their namespace, you can use from ___ import ___ syntax:

from time import localtime, strftime

current_time = localtime()
print(strftime('%a, %d %b %Y %H:%M:%S', current_time))

# Be aware that this technique, although convenient, may cause unexpected behavior if the function names that you're pulling out of the namespace are already used for other purposes in Python.

File I/O

You can read an entire file into a variable like this:

f = open("example.txt")
file_contents = f.read()

f.close() # release the file handle when we're finished with the file

A better option is to use a with-block, which handles opening and closing the file for you automatically:

with open("example.txt") as f:
    file_contents = f.read()

You can read a file line-by-line like this:

with open("example.txt") as f:
    for line in f:
        print(f"Read line: {line.strip()}")

You can write to a file like this:

with open("example.txt", "w") as f:
    f.write("Hello\nWorld\n")
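The pieces above can be combined into a round trip. This sketch writes a file, appends to it using the "a" mode, and reads it back. It uses the standard tempfile module so that it does not touch any real file; the folder and path names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.txt")

    # "w" creates the file, or overwrites it if it already exists.
    with open(path, "w") as f:
        f.write("Hello\nWorld\n")

    # "a" appends to the end instead of overwriting.
    with open(path, "a") as f:
        f.write("Again\n")

    # Read the lines back, dropping the trailing newline from each.
    with open(path) as f:
        lines = [line.rstrip("\n") for line in f]
```

The temporary directory and everything in it are deleted when the outer with-block ends.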

Command-Line Arguments

Command line arguments are accessible in the variable sys.argv.

The following example shows a program that expects a filename as its argument, and it prints a usage message if the argument is not present (source).

import sys, os

if len(sys.argv) < 2:
    sys.exit(f"Usage: {sys.argv[0]} filename")

filename = sys.argv[1]

if not os.path.exists(filename):
    sys.exit(f"Error: File '{sys.argv[1]}' not found")
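For scripts with more than one or two arguments, the standard library's argparse module is a common alternative to reading sys.argv by hand. The following is only a sketch: the description text and the --verbose flag are invented for illustration, and the arguments are passed in as a list so the example runs without a real command line.

```python
import argparse

parser = argparse.ArgumentParser(description="Process a file.")
parser.add_argument("filename")                        # required positional argument
parser.add_argument("--verbose", action="store_true")  # optional on/off flag

# In a real script, call parser.parse_args() with no arguments to read sys.argv.
args = parser.parse_args(["example.txt", "--verbose"])

filename = args.filename
verbose = args.verbose
```

When a required argument is missing, argparse prints a usage message and exits on its own.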

Object-Oriented Programming

You can define and use a class like this:

class Food:
    # constructor:
    def __init__(self, name):
        self.name = name
   
    def format_name(self):
        return f"Gotta love to eat {self.name}"

class Fruit(Food):
    def get_definition(self):
        return f"{self.format_name()} (fruit)"

fruit = Fruit("Cherry")
print(fruit.format_name())
print(fruit.get_definition())

This is the same example as in the PHP guide.

Some things to notice:

  • Instance methods always take self as their first argument, followed by any number of additional parameters. This can be misleading for programmers familiar with other languages, because the number of arguments you feed to the method is actually one less than the number of declared parameters. Whenever you call a method on a class instance, that instance is implicitly fed into the explicitly-declared self parameter of the method.
    Note that the self variable has the same purpose as the this variable in languages like PHP, Java, JavaScript, and C++.
  • There is no need for a new keyword in Python.
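As a further sketch of how self and inheritance fit together, the variation below gives Fruit its own constructor and overrides format_name. The color attribute is made up for this example, and super() is used to reach the parent class's version of a method.

```python
class Food:
    def __init__(self, name):
        self.name = name

    def format_name(self):
        return f"Gotta love to eat {self.name}"

class Fruit(Food):
    def __init__(self, name, color):
        super().__init__(name)   # let Food set self.name
        self.color = color       # hypothetical extra attribute

    def format_name(self):
        # Reuse the parent's result and add to it.
        return f"{super().format_name()} ({self.color} fruit)"

fruit = Fruit("Cherry", "red")
message = fruit.format_name()

# Calling through the class shows the instance being passed in as self.
plain = Food.format_name(fruit)
```

Here fruit.format_name() takes no visible arguments, yet the method declares one parameter: the instance itself arrives as self.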

Footnotes

  1. ^ This is an oversimplification. If you're interested, here's a quick explanation of how: In CPython (the reference implementation of Python) programs are compiled to bytecode, an intermediate representation, which is then interpreted.