In this course provided by IBM, I assume the role of an Associate Data Analyst who has recently joined the organization and is presented with a business challenge that requires data analysis on real-world datasets. The capstone project culminates in a presentation of a data analysis report, with an executive summary for the various stakeholders in the organization. I believe this project is a great opportunity to showcase data analytics skills and demonstrate proficiency to potential employers. The following are the notes I took during this course.

Data Collection

Collecting Data Using APIs

The HTTP protocol allows you to send and receive information over the web, including web pages, images, and other web resources.

Uniform Resource Locator (URL): the most popular way to find resources on the web. It consists of three parts:

  • Scheme: http://
  • Internet address or Base URL: www.ibm.com
  • Route location on the web server: /images/IDSNlogo.png

Request

  • Request start line = method (e.g. GET) + location of the resource (e.g. /index.html) + HTTP version
  • Request header passes additional information with an HTTP request

Response

  • Response start line = HTTP version (HTTP/1.0) + a status code (e.g. 200, meaning success) + a descriptive phrase (e.g. OK)
  • Response header contains useful information
  • Response body contains the requested resource, e.g. an HTML document

Requests in Python

import requests
import os
from PIL import Image
from IPython.display import IFrame

# GET request (single quotation marks also work for defining strings)
url = 'https://www.ibm.com/'
r = requests.get(url)
# status of the request
r.status_code
# view the request headers (the body of a GET request is None: r.request.body)
print(r.request.headers)
# HTTP response headers
header = r.headers
# obtain the date
header['date']
# obtain the type of data
header['Content-Type']
r.encoding
# view the first 100 characters of the response text
r.text[0:100]
# write binary content to a file (for an actual image, url should point at an image file)
path = os.path.join(os.getcwd(), 'image.png')
with open(path, 'wb') as f:
    f.write(r.content)
Image.open(path)
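A defensive variant I like in practice (my own sketch; the timeout value is an arbitrary choice, not from the course):

import requests

try:
    r = requests.get('https://www.ibm.com/', timeout=10)
    r.raise_for_status()  # raises requests.HTTPError on 4xx/5xx status codes
except requests.RequestException as e:
    print("Request failed:", e)
else:
    print(r.status_code, r.headers.get('Content-Type'))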

Get Request with URL Parameters

  • You can use the GET method to modify the results of your query
url_get = 'http://httpbin.org/get'
# to create a query string, pass a dictionary to the params argument
payload = {"name": "Joseph", "ID": "123"}
r = requests.get(url_get, params=payload)
r.url  # 'http://httpbin.org/get?name=Joseph&ID=123'
# the query arguments, echoed back in JSON format
r.json()['args']

Post Requests

  • A POST request sends the data in the request body instead of the URL
url_post = 'http://httpbin.org/post'
r_post = requests.post(url_post, data=payload)
r_post.url  # no query string: the payload travels in the body
r_post.request.body
r_post.json()['form']
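requests can also serialize the payload as JSON via the json keyword argument (a minimal sketch; httpbin echoes the parsed body back under the 'json' key):

r_json = requests.post('http://httpbin.org/post', json=payload)
r_json.request.headers['Content-Type']  # 'application/json'
r_json.json()['json']                   # the payload echoed back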

Collecting Data Using Web Scraping

Review of Web Scraping

from bs4 import BeautifulSoup  # this module helps with web scraping
import requests  # this module helps us download a web page

url = "http://www.ibm.com"
# get the contents of the webpage in text format and store it in a variable called data
data = requests.get(url).text
# create a soup object using the variable 'data'
soup = BeautifulSoup(data, "html5lib")
# in HTML an anchor/link is represented by the tag <a>
for link in soup.find_all('a'):
    print(link.get('href'))
# in HTML an image is represented by the tag <img>
for link in soup.find_all('img'):
    print(link.get('src'))
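A small refinement of my own: find_all can filter on attributes, which skips anchors that carry no href at all:

# only anchors that actually have an href attribute
for link in soup.find_all('a', href=True):
    print(link['href'])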

Scrape Data from an HTML Table

url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DA0321EN-SkillsNetwork/labs/datasets/HTMLColorCodes.html"
# get the contents of the webpage in text format and store in a variable called data
data = requests.get(url).text
soup = BeautifulSoup(data,"html5lib")
# in html table is represented by the tag <table>
table = soup.find('table')
# in html table row is represented by the tag <tr>
for row in table.find_all('tr'):
# in html a column is represented by the tag <td>
cols = row.find_all('td')
# store the value in column 3 as color_name
color_name = cols[2].getText()
# store the value in column 4 as color_code
color_code = cols[3].getText()
print("{}--->{}".format(color_name,color_code))

Exploring Data

Load the dataset

import pandas as pd
dataset_url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DA0321EN-SkillsNetwork/LargeData/m1_survey_data.csv"
df = pd.read_csv(dataset_url)

Explore the dataset

# display the top and bottom 5 rows of the dataset
df.head()
df.tail()
# the number of rows in the dataset
df.shape[0]
# the number of columns in the dataset
df.shape[1]
# print the datatype of each column
df.dtypes
# print the mean age of the survey participants
df["Age"].mean()
# print how many unique countries there are in the Country column
df["Country"].nunique()

Data Wrangling

Load the dataset

import pandas as pd
df = pd.read_csv("https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DA0321EN-SkillsNetwork/LargeData/m1_survey_data.csv")

Finding Duplicates

# find how many duplicate rows exist in the dataframe
df.duplicated(keep='first').sum()
# show the duplicated rows
duplicateRows = df[df.duplicated()]
duplicateRows
# number of duplicate values in the column Respondent
df["Respondent"].duplicated(keep='first').sum()

Removing Duplicates

# remove the duplicate rows from the dataframe
df.drop_duplicates(ignore_index=True, inplace=True)
# verify that the duplicates were actually dropped
df.duplicated(keep='first').sum()
# number of rows and columns left
df.shape
# number of unique values left in the column Respondent (note the call parentheses)
df["Respondent"].nunique()

Finding Missing Values

# find the number of missing values in each column
df.isnull().sum()
# find how many rows are missing a value in the column EdLevel
df["EdLevel"].isnull().sum()

Imputing Missing Values

# find the value counts for the column WorkLoc
df["WorkLoc"].value_counts()
# impute (replace) all the empty rows in the column WorkLoc with the majority value identified above
df["WorkLoc"].fillna(value="Office", inplace=True)
# after imputation there should ideally be no empty rows left in the WorkLoc column
df["WorkLoc"].isnull().sum()

Normalizing Data

# list the categories in the column 'CompFreq'
df["CompFreq"].unique()
# if CompFreq is Yearly, keep the existing value in CompTotal (multiplier 1)
df["CompFreq"].replace(to_replace="Yearly", value=1, inplace=True)
# if CompFreq is Monthly, multiply the value in CompTotal by 12 (months in a year)
df["CompFreq"].replace(to_replace="Monthly", value=12, inplace=True)
# if CompFreq is Weekly, multiply the value in CompTotal by 52 (weeks in a year)
df["CompFreq"].replace(to_replace="Weekly", value=52, inplace=True)
df["CompFreq"].unique()
df["CompFreq"].value_counts()
# a single annual figure makes salary comparison easy
df['NormalizedAnnualCompensation'] = df["CompTotal"] * df["CompFreq"]
df["Respondent"].nunique()
df["ConvertedComp"].describe()
df["ConvertedComp"].hist(figsize=(15,4))
df['NormalizedAnnualCompensation'].median()
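The same normalization can be written without overwriting CompFreq (a sketch of my own; run it instead of the replace calls above, while CompFreq still holds the category strings):

# map each payment frequency to its annual multiplier
multipliers = {"Yearly": 1, "Monthly": 12, "Weekly": 52}
df['NormalizedAnnualCompensation'] = df["CompTotal"] * df["CompFreq"].map(multipliers)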

Exploratory Data Analysis

Import necessary modules

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
df = pd.read_csv("https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DA0321EN-SkillsNetwork/LargeData/m2_survey_data.csv")

Distribution: Determine how the data is distributed

# plot the distribution curve for the column ConvertedComp
plt.figure(figsize=(10,5))
# note: distplot is deprecated in recent seaborn; kdeplot/histplot are the modern equivalents
sns.distplot(a=df["ConvertedComp"], bins=20, hist=False)
plt.show()
# plot the histogram for the column ConvertedComp
plt.figure(figsize=(10,5))
sns.distplot(a=df["ConvertedComp"], bins=20, kde=False)
plt.show()
# the median of the column ConvertedComp
df["ConvertedComp"].median()
# how many respondents identified themselves only as a Man
df["Gender"].value_counts()
# the median ConvertedComp of respondents who identified themselves only as a Woman
woman = df[df["Gender"] == "Woman"]
woman["ConvertedComp"].median()
# five-number summary for the column Age
df["Age"].describe()

Outliers

# find out if outliers exist in the column ConvertedComp using a box plot
plt.figure(figsize=(10,5))
sns.boxplot(x=df.ConvertedComp, data=df)
plt.show()
# the quartiles of ConvertedComp (among other summary statistics)
df["ConvertedComp"].describe()
# find the interquartile range and, from it, the upper and lower bounds
Q1 = df["ConvertedComp"].quantile(0.25)
Q3 = df["ConvertedComp"].quantile(0.75)
IQR = Q3 - Q1
print(IQR)
# identify how many outliers there are in the ConvertedComp column
outliers = (df["ConvertedComp"] < (Q1 - 1.5 * IQR)) | (df["ConvertedComp"] > (Q3 + 1.5 * IQR))
outliers.value_counts()
less = (df["ConvertedComp"] < (Q1 - 1.5 * IQR))
less.value_counts()
more = (df["ConvertedComp"] > (Q3 + 1.5 * IQR))
more.value_counts()
# create a new dataframe by removing the (upper) outliers from the ConvertedComp column
RemoveConvertedcomp = df[~(df["ConvertedComp"] > (Q3 + 1.5 * IQR))]
RemoveConvertedcomp.head()
RemoveConvertedcomp["ConvertedComp"].median()
RemoveConvertedcomp["ConvertedComp"].mean()
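The cell above only trims the upper tail; here is a sketch of my own for dropping outliers on both sides with the same 1.5 × IQR fences:

lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR
# keep only rows whose ConvertedComp lies within the fences (rows with NaN are dropped too)
df_no_outliers = df[df["ConvertedComp"].between(lower_bound, upper_bound)]
df_no_outliers["ConvertedComp"].describe()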

Correlation: Find the correlation between all numerical columns

df.corr()
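Two notes of my own: newer pandas raises on non-numeric columns unless numeric_only is set, and the lab question focuses on how Age correlates with the other numeric columns:

# correlation matrix restricted to numeric columns (required in pandas >= 2.0)
corr = df.corr(numeric_only=True)
# how Age correlates with every other numeric column
corr["Age"].sort_values(ascending=False)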

Data Visualization

Work with Database

!wget https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBM-DA0321EN-SkillsNetwork/LargeData/m4_survey_data.sqlite
import sqlite3
import pandas as pd
# open a database connection
conn = sqlite3.connect("m4_survey_data.sqlite")
# print how many rows there are in the table named 'master'
QUERY = """
SELECT COUNT(*)
FROM master
"""
# read_sql_query runs the SQL query and returns the data as a dataframe
df = pd.read_sql_query(QUERY, conn)
df.head()
# print all the table names in the database
QUERY = """
SELECT name AS Table_Name
FROM sqlite_master
WHERE type = 'table'
"""
pd.read_sql_query(QUERY, conn)
# run a group-by query
QUERY = """
SELECT Age, COUNT(*) AS count
FROM master
GROUP BY Age
ORDER BY Age
"""
pd.read_sql_query(QUERY, conn)
# describe a table
table_name = 'master'  # the table you wish to describe
QUERY = """
SELECT sql FROM sqlite_master
WHERE name = '{}'
""".format(table_name)
df = pd.read_sql_query(QUERY, conn)
print(df.iat[0, 0])
# close the connection when done
conn.close()
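Since query results come back as a dataframe, they can be plotted directly; a minimal sketch of my own, reusing the group-by query from the block above:

import sqlite3
import pandas as pd
import matplotlib.pyplot as plt

conn = sqlite3.connect("m4_survey_data.sqlite")
# respondent counts per age, as in the group-by example above
age_counts = pd.read_sql_query("SELECT Age, COUNT(*) AS count FROM master GROUP BY Age ORDER BY Age", conn)
conn.close()
# bar chart of the counts
age_counts.plot(kind="bar", x="Age", y="count", figsize=(15, 4), legend=False)
plt.ylabel("count")
plt.show()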

Visualizing Data

The charts covered fall into four categories (the example visuals from the labs are omitted in these notes); a sketch with one chart per category follows the list:

  • Distribution
  • Relationship
  • Composition
  • Comparison
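A minimal sketch of my own (the column choices are mine, not from the labs) showing one chart per category, reusing the survey dataframe df from the EDA section:

import matplotlib.pyplot as plt
import seaborn as sns

fig, axes = plt.subplots(2, 2, figsize=(12, 8))
# distribution: histogram of compensation
sns.histplot(df["ConvertedComp"], bins=20, ax=axes[0, 0])
axes[0, 0].set_title("Distribution")
# relationship: scatter plot of age vs. compensation
sns.scatterplot(x="Age", y="ConvertedComp", data=df, ax=axes[0, 1])
axes[0, 1].set_title("Relationship")
# composition: pie chart of work locations
df["WorkLoc"].value_counts().plot(kind="pie", ax=axes[1, 0], ylabel="")
axes[1, 0].set_title("Composition")
# comparison: median compensation by work location
df.groupby("WorkLoc")["ConvertedComp"].median().plot(kind="bar", ax=axes[1, 1])
axes[1, 1].set_title("Comparison")
plt.tight_layout()
plt.show()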

Dashboard Creation

IBM Cognos Dashboard Embedded (CDE) is an AI-fueled business intelligence service that supports the entire data analytics cycle, from discovery to operationalization. It provides data discovery capabilities for visually exploring and interacting with data to identify the key insights that improve data-driven decisions. Users can perform data discovery and then quickly assemble the results into interactive, visually appealing dashboards, all without formal training.

Add a Cognos Dashboard Embedded (CDE) service and upload external data files to your project (CSV files only).

General navigation around the CDE user interface (UI): start a new dashboard from a template, populate it with data visualizations, and save the dashboard.

Presentation of Findings

Data collected, cleaned, and organized -> report (paper-style report or slideshow presentation)

Elements Of A Successful Data Findings Report

  • Outline
  • Cover Page
  • Executive Summary: briefly explains the details of the project and should be readable as a stand-alone document
  • Table of Contents
  • Introduction: explains the nature of the analysis, states the problem, and gives the questions that the analysis set out to answer
  • Methodology: explains the data sources that were used in the analysis and outlines the plan for the collected data
  • Results: goes into the detail of the data collection, how it was organized, and how it was analyzed; also contains the charts and graphs that substantiate the results and call attention to the more complex or crucial findings
  • Discussion: engages the audience with the implications drawn from the research
  • Conclusion: reiterates the problem given in the introduction and gives an overall summary of the findings; also states the outcome of the analysis and whether any further steps will be taken in the future
  • Appendix: contains information that didn't fit in the main body of the report but was still important enough to include

Factors to remember in accurately conveying your message

  • Make sure charts and graphs are not too small and are clearly labeled
  • Use the data only as supporting evidence
  • Share only one point from each chart
  • Eliminate data that does not support the key message

Final Presentation (PDF version): available for download from my GitHub repository.

Assignments

Visit my GitHub repository
