colidescope


Intro to Python

This guide will introduce the programming language Python and demonstrate how it can be used within Grasshopper to access Rhino's core geometric functions.

Working with data in Python

Now that we have some experience writing Python code and loading the Rhino libraries, let's take a step back and cover some basic concepts in Python. In this guide we will learn about the basic data types in Python and how we can work with them using variables and a variety of built-in operators. We will also see how we can store entire sets of data within a variable using two important structures: Lists and Dictionaries.

Elements of programming

If you’ve never written code, or done any computer programming, the whole concept might seem daunting at first. However, while advanced software development is without a doubt incredibly complex, programming in general is based on only a few key concepts:

  1. Variables that store data
  2. Conditionals that split instructions based on a condition in the data
  3. Loops that repeat instructions a certain number of times or until a condition is met
  4. Functions that wrap logic and expose its inputs and outputs, making the logic easier to reuse
  5. Classes that define objects which store data as local variables (called “properties”) and implement behaviors as local functions (called “methods”)
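
As a rough preview, the short sketch below touches each of the five concepts once. The names used here (berry_counts, describe, Basket) are made up for illustration, and all of the syntax will be explained properly in later guides, so don't worry if the details are not yet clear.

```python
# 1. a variable that stores data
berry_counts = [3, 12, 7]

# 4. a function that wraps reusable logic
def describe(count):
    # 2. a conditional that splits instructions based on the data
    if count > 5:
        return 'many'
    else:
        return 'few'

# 5. a class that defines objects with properties and methods
class Basket(object):
    def __init__(self, count):
        self.count = count  # a property

    def label(self):  # a method
        return describe(self.count)

# 3. a loop that repeats instructions for every item in the data
labels = []
for count in berry_counts:
    labels.append(Basket(count).label())

print(labels)  # ['few', 'many', 'many']
```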

Although mastering these components and all the techniques they enable can take an entire career, it should be some comfort as you start out that, no matter how much you have left to learn, nothing will come up that falls outside of these key concepts. We will cover all of them, at least in some form, over the next set of guides.

In the rest of this guide we will review variables and the basic data types supported by Python. Future guides will introduce conditionals, loops, functions, and classes. While all of these tutorials will use Python, most of the concepts they introduce apply to any programming language, though some of the terminology and exact implementations may differ. In any case, learning these concepts in Python gets you 80-90% of the way there to implementing similar things in any programming language you may encounter in the future.

Note

Reading through text describing how programming works can be dry, especially when you're just starting out and have not yet realized how fun and creative a process writing code can be. While the best way to learn programming is by working through examples (and we will do plenty of this), it is also helpful to start thinking through the main elements that compose a programming language like Python and study the various ways we can apply them to solve problems with code.

By understanding these concepts from the beginning, you will be less intimidated by all of the particular syntax that you don't yet know. As long as you can express what you want the program to do in general terms, you can always search the internet for examples of the proper syntax. In fact, this is how most people learn to code today. Furthermore, while each programming language has its own syntax, they almost all follow the same basic principles, so learning these principles will be useful no matter what language you end up using.

Don't be tempted to skim through the descriptions and go straight to the guided exercises. I encourage you to work through the content, type each example line into your Python script and execute it. Then make changes to the code, break it, then fix it again. Do this until you are sure you understand how the structure or technique being described works (as well as when it doesn't). Doing this will help you understand the exercises on a deeper level and help you to develop your own intuition for coding.

Comments and print()

The first bit of Python syntax we will cover is the all-important 'comment'. You specify a comment with the '#' symbol, which tells Python to ignore everything on that line after the '#'. A comment can take up a whole line or follow a piece of code on the same line. Try typing the following lines of code into your editor and executing the script:

# this is a comment
print('this is code') # this is also a comment

If you run this code you will see that it prints out 'this is code' because it executes the line print('this is code'). This uses the print() function, which takes in any text you put inside the parentheses and prints it to the code editor's console window. Meanwhile, it ignores both comments occurring after the '#' symbol. Although every language specifies them differently, comments are an important part of every programming language, as they allow the developer to add extra information and description to their code which is not strictly related to its execution.

print() is one of many functions built into Python. Throughout the tutorials we will see many examples of built-in functions, and in a later tutorial we will learn how to create custom functions.

Most Python editors also have a shortcut for commenting whole lines of code. Pressing CTRL + / should 'comment out' the current line of code by placing a '#' in front of it. You can also use the shortcut with multiple lines selected to comment out entire sections of code. This is useful during troubleshooting if you want to disable certain parts of your script without deleting the actual code.

Now that we know the basics, let's start to explore the fundamental principles of coding in Python, starting with variables.

Working with variables

You can think of variables as containers that store data. You can use variables in Python to store data of any type and then later recall it when needed for further manipulation. Variables can be declared and assigned freely in Python, as opposed to other languages where you have to explicitly state the type of data they will be storing. To assign a value to a variable, use the = operator:

a = 2

Here, a is the name of the variable, and the number '2' is the data I am assigning it. From here on out, a will be associated with the number '2', until it is assigned another value, or the program ends. Try this code:

a = 2
b = 3
print(a + b)

This should print out the number 5, since a is storing the number '2', and b is storing the number '3'. You can use many other common arithmetic operators in the same way. Some of the most common are:

  • + — addition
  • - — subtraction
  • * — multiplication
  • / — division
  • ** — raise to a power
  • % — modulo
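
Here is a short sketch that runs each of these operators on the same two numbers (a and b are just example names). Division is shown separately with a decimal number because the result of dividing two whole numbers depends on the Python version: 7 / 2 gives 3 in Python 2 but 3.5 in Python 3.

```python
a = 7
b = 2

print(a + b)    # addition: 9
print(a - b)    # subtraction: 5
print(a * b)    # multiplication: 14
print(a ** b)   # raise to a power: 49
print(a % b)    # modulo (the remainder after division): 1

# writing one of the numbers as a decimal gives the same answer in every Python version
print(7.0 / 2)  # division: 3.5
```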

Naming variables

In Python, you can name your variables anything as long as the name:

  1. does not contain any spaces
  2. only contains numbers and letters and the _ character
  3. does not start with a number
  4. is not a reserved keyword (such as 'if' or 'for').

This means variable naming is quite flexible and you can really call your variables whatever you like. In practice, however, to improve code readability for themselves and others, most programmers follow some conventions for naming variables.
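
If you are ever unsure whether a name is allowed, you can test it. The sketch below is one possible way to check a name against the rules listed above. The helper is_valid_name is hypothetical (it is not part of Python), and the check is a simplification, since Python 3 also accepts letters from other alphabets. It relies on two standard libraries: keyword, which knows Python's reserved words, and re, which matches text against a pattern.

```python
import keyword
import re

def is_valid_name(name):
    # hypothetical helper: letters, numbers and '_' only, with no leading number
    # (this pattern also rules out spaces)
    follows_pattern = re.match(r'^[A-Za-z_][A-Za-z0-9_]*$', name) is not None
    # the name must also not be one of Python's reserved keywords
    return follows_pattern and not keyword.iskeyword(name)

print(is_valid_name('num_blueberries'))  # True
print(is_valid_name('2nd_basket'))       # False, starts with a number
print(is_valid_name('my basket'))        # False, contains a space
print(is_valid_name('for'))              # False, reserved keyword
```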

In general, variable names should not be so long that they cause unnecessary typing, but they should err on the side of description rather than brevity to make the code easier to read. For instance, if you are keeping track of the number of blueberries, it is better to call the variable 'numBlueberries' than simply 'n' or 'b'.

Since variable names tend to be composed of multiple words, there are conventions for how to join the words together. One common approach is to use 'camel case':

withCamelCaseTheFirstWordIsLowerCaseWhileAllSubsequentWordsAreUpperCase

This approach is popular in many programming languages such as C++, Java, and JavaScript. Another approach (called 'snake case') is to substitute underscores ('_') for spaces in variable names, for example:

variable_with_snake_case

This happens to be the method preferred by the Python community and for this reason I will use the snake case in most examples and exercises in these guides. However, it's important to understand that this is an optional 'style' adopted for legibility and not a hard requirement coming from the Python syntax.

Data types

Variables can hold data of different types. Although Python does not make you explicitly declare the type of data you will be using, it is important to know the types because each behaves differently in your code. Python supports many different types of data, but the most common (and the ones supported by almost every programming language) are:

  • int — meaning integer, or a whole number
  • float — meaning floating point, or decimal number
  • bool — meaning boolean, or a True/False
  • str — meaning string, or ‘a piece of text’

In Python you can use the type() built-in function to get the type of data stored in any variable. Try to run the following code:

print(type(12))
print(type(12.1))
print(type(True))
print(type('blueberries'))

You can see that it prints the four types described above. Notice also the particular way in which the data must be written so that Python does not confuse it with the name of a variable. Numbers can be written directly because you cannot name variables with only a number. Booleans must be written capitalized (True or False) as these are reserved key words in Python (notice that the Python editor gives a special color to the 'True' keyword). Strings are always contained within quotes. You can use either single (') or double (") quotes, but they must match on either side of the string. If you try to write:

print(type(blueberries))

without the quotes, you will get the following error:

NameError: name 'blueberries' is not defined

This error tells you that the name ‘blueberries’ is not defined as a variable. However, if you write:

blueberries = 5
print(type(blueberries))

it will tell you that the variable blueberries is storing data of type int, because 'blueberries' is now a variable with the integer 5 stored inside of it.
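
Because the type belongs to the data and not to the variable's name, the same variable can hold different types over the life of a script. A minimal sketch, reusing the blueberries variable from above:

```python
blueberries = 5
print(type(blueberries))   # reports an int

blueberries = 5.0
print(type(blueberries))   # now reports a float

blueberries = 'five'
print(type(blueberries))   # now reports a str
```

This flexibility is convenient, but it also means Python will not warn you if you accidentally replace a number with a piece of text, so it is worth checking with type() when a variable behaves unexpectedly.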

In Python, many operators are 'over-loaded', which means that they function differently depending on the data type that they are used on. For instance, if we run:

print(2 + 2)

we get '4'. When given two numbers, the + operator performs arithmetic addition. However, if we run:

print('First ' + 'Last')

we get 'First Last'. When given two strings, the + operator 'concatenates' or merges them together into one string. Over-loading is useful because it produces clean and readable code without having a special function for each type of variable. You have to be careful, however, because mismatching different types of variables can lead to errors. For instance, running this line:

numBerries = 5
print('Number of Blueberries: ' + numBerries)

will produce an error because it is trying to perform a concatenation of a string and an integer. Instead, you can use the str() function to convert the 5 to a string before using it with the + operator:

numBerries = 5
print('Number of Blueberries: ' + str(numBerries))
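
Conversion also works in the other direction. The built-in int() and float() functions turn a string into a number, as long as the text actually looks like one. A short sketch, with variable names chosen only for illustration:

```python
text_count = '5'
count = int(text_count)         # string to integer
print(count + 2)                # 7

text_weight = '2.5'
weight = float(text_weight)     # string to float
print(weight * 2)               # 5.0

print(int(7.9))                 # 7, the decimal part is dropped rather than rounded
print(str(count) + ' berries')  # 5 berries
```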

Multi-part variables

In addition to storing single items of data, you can also use variables to store many items of data and then access them in a structured way. There are two basic types of multi-part variables:

  • Lists — sometimes called Arrays
  • Dictionaries — sometimes called key-value pairs

Working with Lists

A List is an ordered collection of elements, with the position of each item in the List specified by its index. A List can be created in Python by using square brackets and separating individual elements with commas, like so:

numbers = [1, 2, 3, 4, 5]
fruits = ['apples', 'oranges', 'bananas']

To retrieve an object from a List, you once again use square brackets, but this time appended to the end of the variable name storing the List. Inside the brackets you place the index or position of the item you want in the List. For example:

numbers = [1, 2, 3, 4, 5]
print(numbers[0])
fruits = ['apples', 'oranges', 'bananas']
print(fruits[1])

Notice that, as in most programming languages (and in Grasshopper), counting in Python begins with '0', so if you want the first item in a list you use [0], the second item [1], and so on.

Unlike many other languages, Python will allow you to mix different types of data within a single List, so something like this is allowed:

fruitsAndNumbers = ['apples', 2, 'bananas']
print(type(fruitsAndNumbers))
print(type(fruitsAndNumbers[0]))
print(type(fruitsAndNumbers[1]))

You can also use a : operator within the square brackets to obtain a range of values from a List, which will create a new list that you can assign to a new variable:

numbers = [1, 2, 3, 4, 5]
newNumbers = numbers[0:3] # [index of first item:index after last item]
print(newNumbers)

You can even index backwards using negative indices. Here is a typical application that will print out the last item in the List:

numbers = [1, 2, 3, 4, 5]
print(numbers[-1])
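
The : operator and negative indices can also be combined, and either side of the : can be left empty to mean 'from the start' or 'to the end'. A few variations to try:

```python
numbers = [1, 2, 3, 4, 5]

print(numbers[:2])    # from the start up to (not including) index 2: [1, 2]
print(numbers[2:])    # from index 2 to the end: [3, 4, 5]
print(numbers[-2:])   # the last two items: [4, 5]
print(numbers[1:-1])  # everything except the first and last items: [2, 3, 4]
print(numbers)        # the original List is unchanged: [1, 2, 3, 4, 5]
```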

Lists implement various methods to help you work with the data stored within. The most common is .append(), which adds a value to the end of a List:

numbers = [1, 2, 3, 4, 5]
numbers.append(6)
print(numbers)

A common technique is to start with an empty List, and then fill it gradually with appends:

numbers = []
numbers.append(1)
numbers.append(2)
print(numbers)

Though this seems pretty manual at the moment, we will see how this can be automated using loops in a later guide.

That should be enough about Lists to get you started. Future tutorials will introduce other useful List methods such as .pop(), and for a comprehensive review of all methods you can refer to the Python documentation.
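
As a quick taste, here is a small sketch of .pop() together with the built-in len() function, which reports how many items a List holds. By default .pop() removes and returns the last item, and you can pass it an index to remove a different one:

```python
fruits = ['apples', 'oranges', 'bananas']
print(len(fruits))           # 3

last_fruit = fruits.pop()    # removes and returns the last item
print(last_fruit)            # bananas
print(fruits)                # ['apples', 'oranges']

first_fruit = fruits.pop(0)  # removes and returns the item at index 0
print(first_fruit)           # apples
print(len(fruits))           # 1
```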

Working with Dictionaries

Lists are extremely useful for storing multiple items of data within a specific sequence. However, sometimes you want to be able to recall an item of data without knowing its exact position in a List. For this you can use Dictionaries, which store multiple items but in a different way. Instead of relying on the item order, Dictionaries store and recall data items by tying them to unique keys. Once the data is stored, you can use the keys to recall the data values tied to them. For this reason, Dictionary entries are often called 'key-value pairs'.

To create a Dictionary in Python you use curly braces, separating keys and values with a colon (:), and multiple entries with commas (,):

myDictionary = {'a': 1, 'b': 2, 'c': 3}

In this Dictionary, the integers 1, 2, and 3 are tied to their unique keys, 'a', 'b', and 'c'. Keys are most often strings, but they can be any immutable value such as a string or a number, while values can be any data type. To retrieve an item of data from this Dictionary, you can again use the square bracket notation, this time passing in a key instead of an index:

myDictionary = {'a': 1, 'b': 2, 'c': 3}
print(myDictionary['a'])
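
Asking for a key that is not in the Dictionary raises a KeyError. Two common ways to avoid this, sketched below, are to test for the key first with the in operator, or to use the .get() method, which returns a fallback value of your choosing when the key is missing:

```python
my_dictionary = {'a': 1, 'b': 2, 'c': 3}

# the 'in' operator checks whether a key exists
print('a' in my_dictionary)       # True
print('z' in my_dictionary)       # False

# .get() returns the fallback (here 0) instead of raising an error
print(my_dictionary.get('a', 0))  # 1
print(my_dictionary.get('z', 0))  # 0
```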

To add entries to a Dictionary, you just have to assign the data value that relates to a particular key using the = operator and the same square bracket syntax:

myDictionary = {'a': 1, 'b': 2, 'c': 3}
myDictionary['d'] = 4
print(myDictionary['d'])

As with Lists, you can start with an empty Dictionary and build it up over time:

myDictionary = {}
myDictionary['a'] = 1
myDictionary['b'] = 2
print(myDictionary)

Like Lists, Dictionaries implement many useful methods for working with the data contained inside. One very useful method is .keys(), which returns all of the Dictionary’s keys (as a List in Python 2, and as a list-like view object in Python 3). You can then use these keys to iterate over all the entries in the Dictionary:

myDictionary = {'a': 1, 'b': 2, 'c': 3}
print(myDictionary.keys())
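
To give a sense of what iterating over the keys looks like, here is a minimal sketch that uses a for loop (covered properly in a later guide). It sorts the keys first so the entries print in a predictable order, and keeps a running total of the values:

```python
my_dictionary = {'a': 1, 'b': 2, 'c': 3}

total = 0
for key in sorted(my_dictionary.keys()):
    # look up the value tied to each key in turn
    value = my_dictionary[key]
    print(key + ': ' + str(value))
    total = total + value

print(total)  # 6
```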

For other useful methods you can refer to the section on Dictionaries in the Python documentation.

Combining Lists and Dictionaries

Values within Lists and Dictionaries are not restricted to primitive data types such as numbers and strings. They can store data of any type (for example instances of geometry classes imported from the Rhino libraries) or can even store other Lists and Dictionaries. This allows you to use a combination of nested Lists and Dictionaries to build highly sophisticated data structures that can match the needs of any project. You can access items within such a hierarchical structure by chaining together requests with square brackets. Here is an example:

# start by initializing an empty Dictionary
myDictionary = {}

# add two Lists as entries in the Dictionary
myDictionary['numbers'] = [1, 2, 3, 4, 5]
myDictionary['fruits'] = ['apples', 'oranges', 'bananas']

# add new data to both Lists
myDictionary['numbers'].append(6)
myDictionary['fruits'].append({'berries':['strawberries', 'blueberries']})

# use a compound request to pull data from the Dictionary. This should print 'blueberries'
print(myDictionary['fruits'][-1]['berries'][1])

JSON, one of the most widely used and easiest to work with data formats, is based on this same concept of nested lists and key-value pairs, and is well supported in almost every programming language, including Python and JavaScript. We will work with the JSON format later in this sequence, but for now you can check out its documentation here: http://json.org/.
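
As a small preview, the sketch below uses Python's standard json library to turn a nested structure into a JSON string and back again. The data is just the fruit example from above, slightly shortened:

```python
import json

data = {
    'numbers': [1, 2, 3],
    'fruits': ['apples', {'berries': ['strawberries', 'blueberries']}]
}

# convert the nested Dictionary into a single JSON string
text = json.dumps(data)

# convert the JSON string back into nested Dictionaries and Lists
restored = json.loads(text)

print(restored['fruits'][-1]['berries'][1])  # blueberries
print(restored == data)                      # True
```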

Conclusion

In this tutorial we reviewed the basic data types supported by Python and how we can work with them using variables. You are now ready to go on to the next series which will cover implementing basic imperative programming techniques in Python using Conditionals and Loops.

Additional resources: