This IBM course does not teach everything about Python, but it gives me the tools to work as a data scientist and enough knowledge to keep expanding my Python skills. The following are the notes I took during the course. Since I have already taken Python for Everybody on Coursera, these notes only contain the necessary outlines and newly learned content.
Python Basics
Types
int, float, str, True, False
Expressions and Variables
Mathematical Operations
String Operations
Name[1:4], len(), \n, try:, except:
Python Data Structures
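Lists [1, 2, 3] and Tuples (1, 2, 3)
ABC[1], .extend(), .append(), .split(), .pop(), del, .index(), sorted()
- Tuples are immutable.
- .append() adds only one element to the list; .extend() adds each element of another iterable.
- .split() is a string method that returns a list.
- del is a statement (del ABC[0]), not a list method.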
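A minimal sketch of the list and tuple operations listed above. The variable names and values are made up for illustration; they are not from the course.

```python
abc = [1, 2, 3]

abc.append([4, 5])        # append adds ONE element, here a nested list
nested_length = len(abc)  # 4 elements: 1, 2, 3, [4, 5]
abc.pop()                 # remove the last element again

abc.extend([4, 5])        # extend adds each element separately
del abc[0]                # del is a statement; removes the element at index 0
position = abc.index(4)   # index of the first occurrence of 4

words = "a,b,c".split(",")    # split belongs to str and returns a list
ordered = sorted([3, 1, 2])   # sorted returns a new list

point = (1, 2, 3)
tuple_is_immutable = False
try:
    point[0] = 9              # tuples do not support item assignment
except TypeError:
    tuple_is_immutable = True
```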
Sets {q, w, e}
- unordered, unique elements
set(list), .remove(), set2 = set1 & set3, .union(), set2.issubset(set1), .issuperset(), .difference()
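A short sketch of the set operations above, using made-up genre names as data (an assumption for illustration only).

```python
albums = ["rock", "pop", "rock", "jazz"]
set1 = set(albums)                     # duplicates are dropped
set3 = {"pop", "jazz", "soul"}

set2 = set1 & set3                     # intersection: elements in both sets
all_genres = set1.union(set3)          # elements in either set
only_in_set1 = set1.difference(set3)   # elements in set1 but not in set3

is_subset = set2.issubset(set1)        # every element of set2 is in set1
is_superset = set1.issuperset(set2)    # the same relation, seen from set1

set1.remove("rock")                    # raises KeyError if the element is missing
```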
Dictionaries {"key1": value1, "key2": value2}
DICT['Graduation'] = '2022', .keys(), .values(), 'Graduation' in DICT
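A small sketch of the dictionary operations above. The initial keys and values are invented for illustration.

```python
DICT = {"Name": "Alex", "Major": "Data Science"}

DICT['Graduation'] = '2022'             # add (or overwrite) a key
keys = list(DICT.keys())                # keys in insertion order
values = list(DICT.values())            # values in the same order

has_graduation = 'Graduation' in DICT   # `in` checks keys
has_2022 = '2022' in DICT               # a value is not a key
```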
Python Programming Fundamentals
Conditions and Branching
- Comparison operators: ==
- Logic operators: or, and
- Boolean conditions: if ():, elif:
Loops: range(10, 15), for a in range():, while ():
Functions: a function can have multiple parameters
def function(input):
global [variable]
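A minimal sketch of a function with several parameters and a global variable. The names add_and_count and counter are hypothetical, chosen only for this example.

```python
counter = 0  # module-level (global) variable


def add_and_count(a, b=1):
    """Return a + b and record that the function was called."""
    global counter   # without this line, counter += 1 would raise UnboundLocalError
    counter += 1
    return a + b


total = add_and_count(2, 3)     # both parameters given
default_total = add_and_count(2)  # b falls back to its default value
```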
Objects and Classes
Every object has a type, an internal data representation (a blueprint), and a set of methods.
An object is an instance of a particular type.
A class includes data attributes and methods.
dir(NameOfObject): lists the attributes and methods of an object.
class Circle(object):
    def __init__(self, radius, color='red'):
        self.radius = radius
        self.color = color
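A sketch of how such a class might be used. The add_radius method is my own hypothetical addition to show a method that changes a data attribute; the class is repeated so the block runs on its own.

```python
class Circle(object):
    def __init__(self, radius, color='red'):
        self.radius = radius   # data attribute
        self.color = color     # data attribute with a default value

    def add_radius(self, r):
        """Hypothetical method: grow the circle by r."""
        self.radius = self.radius + r
        return self.radius


c = Circle(10)    # c is an instance of type Circle
c.add_radius(2)

# dir() also lists many built-in names; keep only the public ones
public_names = [name for name in dir(c) if not name.startswith('_')]
```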
Reading and Writing files with open()
with open("/example1.txt", "r") as File1:  # "r" for reading
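A self-contained sketch of reading and writing with open(). It writes a temporary file first so that there is something to read; the path and the file contents are assumptions for illustration, not the course's Example1.txt.

```python
import os
import tempfile

# Hypothetical file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "example1.txt")

with open(path, "w") as File1:          # "w" for writing
    File1.write("This is line 1\nThis is line 2\n")

with open(path, "r") as File1:          # "r" for reading
    content = File1.read()              # whole file as one string

with open(path, "r") as File1:
    lines = File1.readlines()           # list with one string per line

closed_after_with = File1.closed        # the with block closes the file
```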
Pandas
Loading data with Pandas
import pandas as pd
Select Data from a data frame in Pandas
df.loc['row', 'column'] returns 'value'.
loc is primarily label-based; when two arguments are used, you pass row index labels and column headers to select the data you want. loc accepts an integer only when that integer is a label in the index. loc raises a KeyError if the requested items are not found.
df.iloc[0, 0] returns 'value'.
iloc is integer-based. You use row numbers and column numbers to get rows or columns at particular positions in the data frame. iloc raises an IndexError if the requested indexer is out of bounds (slice indexers are the exception).
Use loc and iloc to slice data frames and assign the values to a new data frame.
z = df.iloc[0:2, 0:3]
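A small sketch of loc and iloc on a made-up data frame. The column names, index labels, and values are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Artist": ["A", "B", "C"],
     "Released": [1979, 1982, 1990],
     "Rating": [8.5, 9.0, 7.0]},
    index=["a", "b", "c"],
)

by_label = df.loc["b", "Released"]   # row label, column header
by_position = df.iloc[0, 0]          # row number, column number

z = df.iloc[0:2, 0:3]                # first two rows, first three columns

loc_error = None
try:
    df.loc["missing", "Released"]    # label not in the index
except KeyError:
    loc_error = "KeyError"

iloc_error = None
try:
    df.iloc[10, 0]                   # position past the last row
except IndexError:
    iloc_error = "IndexError"
```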
Working with and Saving data with Pandas
df['ColumnA'].unique(), df1 = df[df['ColumnA'] >= 1980]
Save as CSV: df1.to_csv('new.csv')
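A sketch of the filter-and-save steps above on invented data. It writes new.csv into a temporary directory (an assumption, to avoid touching the working directory) and reads it back to check the result.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"ColumnA": [1979, 1980, 1985, 1980]})

unique_values = df['ColumnA'].unique()    # distinct values, in order of appearance
df1 = df[df['ColumnA'] >= 1980]           # boolean mask keeps the matching rows

out_path = os.path.join(tempfile.mkdtemp(), 'new.csv')
df1.to_csv(out_path)                      # the index is written as the first column

round_trip = pd.read_csv(out_path, index_col=0)
```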
Numpy
Preparation
# Import the libraries
One Dimensional Numpy
A numpy array is similar to a list. It’s usually fixed in size and each element is of the same type.
import numpy as np
Numpy Array Operations
u = np.array([1, 0])
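A sketch of a few element-wise array operations. The second vector v is my own assumption; the rest of the original example is not shown in these notes.

```python
import numpy as np

u = np.array([1, 0])
v = np.array([0, 1])     # hypothetical second vector

vector_sum = u + v       # element-wise addition
scaled = 2 * u           # multiplication by a scalar
product = u * v          # element-wise (Hadamard) product
dot = np.dot(u, v)       # dot product, a single number
```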
A useful function for plotting mathematical functions is linspace.
Linspace returns evenly spaced numbers over a specified interval.
x = np.linspace(0, 2*np.pi, num=100)  # Make a numpy array of 100 elements within [0, 2π]
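A short sketch checking what linspace returns. Evaluating np.sin on the result is my own choice of function to illustrate the plotting use case.

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, num=100)   # 100 points, both endpoints included
y = np.sin(x)                            # apply a function to every element

step = x[1] - x[0]                       # spacing is (stop - start) / (num - 1)
evenly_spaced = np.allclose(np.diff(x), step)
```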
Two Dimensional Numpy
a = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
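A sketch of turning the nested list above into a two-dimensional array and inspecting it. The name A and the chosen indexes are assumptions for illustration.

```python
import numpy as np

a = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
A = np.array(a)

dimensions = A.ndim      # number of axes
shape = A.shape          # (rows, columns)
size = A.size            # total number of elements

element = A[1, 2]        # second row, third column
first_row_part = A[0, 0:2]   # first row, first two columns
```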
Simple APIs
Create and Use APIs in Python
An API lets two pieces of software talk to each other.
An essential type of API is a REST API (Representational State Transfer API), which allows you to access resources via the internet.
Preparation
!pip install nba_api
Pandas API
import pandas as pd
REST APIs
- Use the NBA API to determine how well the Golden State Warriors performed against the Toronto Raptors.
- Use the API to determine the number of points the Golden State Warriors won or lost by in each game.
from nba_api.stats.static import teams  # https://pypi.org/project/nba-api/
Create Speech to Text Translator
Convert an audio file of an English speaker to text using a Speech to Text API
!pip install PyJWT==1.7.1 ibm_watson wget
Translate the English version to a Spanish version using a Language Translator API
from ibm_watson import LanguageTranslatorV3
HTTP and Requests
When the client uses a web page, the browser sends an HTTP request to the server where the page is hosted. The server tries to find the desired resource, by default index.html.
If the request is successful, the server sends the object to the client in an HTTP response, which includes information like the type of the resource, the length of the resource, and other information.
The HTTP protocol allows you to send and receive information through the web, including webpages, images, and other web resources.
A uniform resource locator (URL) is the most popular way to find resources on the web.
Request methods: GET, POST, PUT, DELETE
import requests
Downloading a file with wget:
!wget -O /resources/data/Example1.txt https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%205/data/Example1.txt
is equivalent to:
url = 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%205/data/Example1.txt'
Get Request with URL Parameters
- We append /get to the route to indicate that we would like to perform a GET request: url_get = 'http://httpbin.org/get'
- A query string is a part of a uniform resource locator (URL); it sends other information to the web server.
payload = {"name": "Joseph", "ID": "123"}
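A sketch of how the payload above ends up in the URL as a query string. With the requests library this is done by passing params=payload to requests.get(); here I build the same kind of URL with the standard library only, so the block runs without a network connection. The variable names full_url and parsed are my own.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

url_get = 'http://httpbin.org/get'
payload = {"name": "Joseph", "ID": "123"}

# The query string follows "?" and joins key=value pairs with "&"
full_url = url_get + "?" + urlencode(payload)

# Parsing the URL again recovers the parameters (each value as a list)
parsed = parse_qs(urlsplit(full_url).query)
```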
Like GET, POST is used to send data to a server, but the POST request sends the data in a request body rather than in the URL.
url_post = 'http://httpbin.org/post'
This endpoint will expect data as a file or as a form; a form is a convenient way to configure an HTTP request to send data to a server.
# To make a POST request we use the post() function; the variable payload is passed to the parameter data
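A sketch of the difference from GET: in a POST request the form data travels in the request body and the URL stays unchanged. To keep the block runnable offline, it only builds the request object with the standard library and never sends it; the course itself uses requests.post(url_post, data=payload).

```python
from urllib.parse import urlencode
from urllib.request import Request

url_post = 'http://httpbin.org/post'
payload = {"name": "Joseph", "ID": "123"}

# Form-encode the payload; this becomes the request body
body = urlencode(payload).encode("ascii")

# Build (but do not send) the POST request
req = Request(url_post, data=body, method="POST")

method = req.get_method()   # HTTP method of the request
target = req.full_url       # the URL carries no query string
```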
Final Project
Analyzing US Economic Data and Building a Dashboard
A template notebook is provided in the lab
Examine how changes in GDP impact the unemployment rate.
import pandas as pd
Question 1: Create a dataframe that contains the GDP data and display the first five rows of the dataframe.
csv_path = links['GDP']  # links["GDP"] contains the path or name of the file.
Question 2: Create a dataframe that contains the unemployment data. Display the first five rows of the dataframe.
csv_path = links['unemployment']
Question 3: Display a dataframe where unemployment was greater than 8.5%.
csv_path = links['unemployment']
Question 4: Use the function make_dashboard to make a dashboard
# Create your dataframe with column date
My Work
Final Assignment Notebook Url (May not be accessible from Mainland China)
Assignments
- Visit my Github Repository