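NumPy, short for Numerical Python, is an open-source Python library for scientific computing and the foundation of many data science libraries, including SciPy, scikit-learn, TensorFlow, and PaddlePaddle. NumPy makes it easy to compute with arrays and matrices. Pandas is built on top of NumPy arrays, but the biggest difference between the two is that Pandas is designed specifically for tabular and heterogeneous data and suits the table structures used in statistical analysis, whereas NumPy is better suited to homogeneous numerical array data. Although I already studied this material and wrote up English notes in the Python for Data Science course, I think it is still worth adding detail on the fundamentals, workflow, and caveats of data processing. The following are the notes I took while studying and practicing, offered for reference.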
Despite the increase in computing power and access to data over the last couple of decades, our ability to use data within the decision-making process is all too often either lost or not maximized. Too often, we don’t have a solid understanding of the questions being asked or of how to apply the data correctly to the problem at hand. This course provided by IBM shares a methodology that can be used within data science to ensure that the data used in problem solving is relevant and properly manipulated to address the question at hand. The following are the notes I took during this course.
In this course provided by IBM, I learned about Jupyter Notebooks, JupyterLab, RStudio IDE, Git, GitHub, and Watson Studio, what each tool is used for, what programming languages they can execute, their features and limitations. With the tools hosted in the cloud on Skills Network Labs, I can now run simple code in Python, R or Scala. The following are the notes I took during this course.
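Jupyter Notebook is a browser-based application for interactive computing that supports many languages, including Python, R, Julia, and Scala, and it is very widely used in data science and related fields. JupyterLab is a web-based integrated development environment that includes all the features of Jupyter Notebook and also supports working in a terminal, editing Markdown text, opening interactive mode, and viewing CSV files and images. The IBM Data Science specialization I have been taking recently is also based on JupyterLab, and deploying a Jupyter environment on an Alibaba Cloud host makes research and study more convenient. The following are some steps and procedures I have summarized, offered for reference.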
The art of uncovering the insights and trends in data has been around since ancient times. The ancient Egyptians used census data to increase efficiency in tax collection, and they accurately predicted the flooding of the Nile River every year. Since then, people working with data have carved out a unique and distinct field for the work they do. This field is data science. This course provided by IBM gave me a chance to get an overview of what data science is today. The following are the notes I took during this course.
In this course provided by IBM, I assume the role of an Associate Data Analyst who has recently joined the organization and am presented with a business challenge that requires data analysis to be performed on real-world datasets. The capstone project culminates in a presentation of my data analysis report, with an executive summary for the various stakeholders in the organization. I believe this project is a great opportunity to showcase data analytics skills and demonstrate proficiency to potential employers. The following are the notes I took during this course.
This Data Analysis with Python course provided by IBM is designed to teach future data analysts how to prepare data for analysis, perform simple statistical analysis, create meaningful data visualizations, and predict future trends from data, through a number of lectures, labs, and assignments using Python libraries. The following are the notes I took during this course.
One of the key skills of a data scientist is the ability to tell a compelling story, visualizing data and findings in an approachable and stimulating way. Learning how to leverage a software tool to visualize data helps one understand the data better and make more effective decisions. The main goal of this Data Visualization with Python course provided by IBM is to use various techniques and several data visualization libraries in Python, namely Matplotlib, Seaborn, and Folium, for presenting data visually. The following are the notes I took during this course.
This Python Project mini-course provided by IBM is intended to demonstrate basic Python skills by performing specific tasks such as extracting data, web scraping, visualizing data, and creating a dashboard. The following are the notes I took during this course.
Much of the world’s data resides in databases. A working knowledge of databases and SQL is a must to become a data scientist. The emphasis in this course provided by IBM is on hands-on, practical learning. So, in the following notes I took during this course, I’ll try to record how I work with real databases, real data science tools, and real-world datasets, and eventually how I create a database instance in the cloud.