The art of uncovering insights and trends in data has been around since ancient times. The ancient Egyptians used census data to make tax collection more efficient, and they accurately predicted the flooding of the Nile river every year. Since then, people working with data have carved out a distinct field for the work they do: data science. This IBM course gave me an overview of what data science is today. The following are the notes I took during the course.

Defining Data Science

Data science is the field of exploring, manipulating, and analyzing data, and using data to answer questions or make recommendations.

Data Science can help organizations to understand their environments, analyze existing issues, and reveal previously hidden opportunities.

Data scientists can use powerful data visualization tools to help stakeholders understand the nature of the results, and the recommended action to take.

Data Science is changing the way we work; it’s changing the way we use data and it’s changing the way organizations understand the world.

There are many paths to a career in data science; most, but not all, involve a little math, a little science, and a lot of curiosity about data.

New data scientists need to be curious, judgmental and argumentative.

Data Science: The Sexiest Job in the 21st Century

What Do Data Scientists Do?

A Day in the Life of a Data Scientist: discovering optimum solutions to existing problems

Old problems, new problems, Data Science solutions

  • Identify the problem and establish a clear understanding of it.
  • Gather the data for analysis.
  • Identify the right tools to use.
  • Develop a data strategy.
  • Case studies are also helpful in customizing a potential solution.

Data Science Topics and Algorithms

  • Regression, Data visualization, Artificial neural networks, Structured data
  • Using complicated machine learning algorithms does not guarantee better performance.
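To make the regression topic concrete, here is a minimal sketch of simple linear regression by ordinary least squares. It is not taken from the course: the function name `fit_line` and the hours/scores data are made up for illustration. It also hints at the second point above, since a very simple model is enough when the relationship really is linear.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalized covariance of x and y, and unnormalized variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data (assumption): hours studied vs. exam score,
# constructed so that score = 2 * hours + 1 exactly.
hours = [1, 2, 3, 4, 5]
scores = [3, 5, 7, 9, 11]

slope, intercept = fit_line(hours, scores)
print(slope, intercept)  # 2.0 1.0
```

On real, noisy data the fitted line would only approximate the points, and checking the residuals would be the next step.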

Accessing algorithms, tools, and data through the Cloud enables Data Scientists to stay up-to-date and collaborate easily.

Big Data and Data Mining

Foundations of Big Data

  • Big Data refers to the dynamic, large and disparate volumes of data being created by people, tools, and machines.
  • It requires new, innovative, and scalable technology to collect, host, and analytically process the vast amount of data gathered in order to derive real-time business insights that relate to consumers, risk, profit, performance, productivity management, and enhanced shareholder value.
  • The V’s of Big Data: Velocity, Volume, Variety, Veracity, Value

Hadoop and other tools, combined with distributed computing power, are used to handle the demands of Big Data.
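Hadoop's classic processing model is MapReduce: the data is split into chunks, each chunk is mapped to key/value pairs, and the pairs are grouped by key and reduced. The sketch below imitates that pattern inside a single Python process. It does not use Hadoop's API, and the function names and sample text are assumptions for illustration only.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the counts collected for each word."""
    return {key: sum(values) for key, values in groups.items()}


# Each string stands in for a chunk that a real cluster would hand
# to a different worker (hypothetical sample data).
chunks = ["big data big insights", "data drives insights", "big data"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

Because each chunk is mapped independently, the map step can run on many machines at once, which is where the distributed computing power comes in.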

Big Data is driving digital transformation

Most of the components of data science, such as probability, statistics, linear algebra, and programming, have been around for many decades, but now we have the computational capabilities to apply and combine them and come up with new techniques and learning algorithms.

Data Science Skills & Big Data

Data Mining: the process of automatically searching and analyzing data, discovering previously unrevealed patterns. It involves preprocessing the data to prepare it and transforming it into an appropriate format.
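As a small illustration of the preprocessing step, the sketch below fills missing values with the mean and then rescales a field to the range [0, 1]. Mean imputation and min-max scaling are just two common choices I picked for the example, and the `preprocess` helper and the sample records are hypothetical.

```python
def preprocess(records, field):
    """Fill missing values with the mean, then min-max scale to [0, 1]."""
    present = [r[field] for r in records if r[field] is not None]
    mean = sum(present) / len(present)
    filled = [r[field] if r[field] is not None else mean for r in records]
    low, high = min(filled), max(filled)
    span = (high - low) or 1.0  # avoid dividing by zero when all values match
    return [(value - low) / span for value in filled]


# Hypothetical raw records with one missing value.
raw = [{"age": 20}, {"age": None}, {"age": 40}, {"age": 30}]

scaled = preprocess(raw, "age")
print(scaled)  # [0.0, 0.5, 1.0, 0.5]
```

After this step every record has a value in a comparable range, which is the kind of "appropriate format" a mining algorithm expects.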

Deep Learning and Machine Learning

Data Science is the process and method for extracting knowledge and insights from large volumes of disparate data. It is a broad term encompassing the entire data processing methodology.

AI includes everything that allows computers to learn how to solve problems and make intelligent decisions.

Machine learning is a subset of AI that uses computer algorithms to analyze data and make intelligent decisions based on what it has learned, without being explicitly programmed.

Machine learning algorithms are trained with large sets of data and they learn from examples. They do not follow rules-based algorithms.

Machine learning is what enables machines to solve problems on their own and make accurate predictions using the provided data.

Deep learning is a specialized subset of machine learning that uses layered neural networks to simulate human decision-making.

Deep learning algorithms can label and categorize information and identify patterns. Deep learning is what enables AI systems to continuously learn on the job and improve the quality and accuracy of results by determining whether decisions were correct.

A neural network in AI is a collection of small computing units called neurons that take incoming data and learn to make decisions over time.
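To show what "learning to make decisions over time" can look like for a single neuron, here is a minimal sketch of a perceptron trained on the logical AND function. The perceptron is my choice of the simplest possible unit, not something named in the course, and the function names and training settings are assumptions.

```python
def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Train one neuron with the classic perceptron update rule."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            prediction = 1 if activation > 0 else 0
            error = target - prediction  # 0 when the neuron was right
            # Nudge each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


# Training examples for logical AND: output is 1 only for (1, 1).
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_gate]
print(predictions)  # [0, 0, 0, 1]
```

The neuron starts with zero weights and gets several examples wrong, then corrects itself a little after every mistake until all four are right. A real neural network stacks many such units in layers and uses a different training procedure.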

Neural networks are often many layers deep, which is the reason deep learning algorithms become more efficient as the data sets increase in volume, whereas other machine learning algorithms may plateau as data increases.

Machine Learning has many applications, from recommender systems that provide relevant choices for customers on commercial websites, to detailed analysis of financial markets. (Predictive analytics, Decision trees, Bayesian Analysis, Naive Bayes, Recommendations)

Regression

Data Science in Business

Data Science helps physicians provide the best treatment for their patients, helps meteorologists predict the extent of local weather events, and can even help predict natural disasters like earthquakes and tornadoes.

Companies can start on their data science journey by capturing data. Once they have data, they can begin analyzing it.

Applications of Data Science: Recommendation engine, Siri, Google

The purpose of the final deliverable of a Data Science project is to communicate new information and insights from the data analysis to key decision-makers.

Careers and Recruiting in Data Science

Data Scientists need programming, mathematics, and database skills, many of which can be gained through self-learning.

Companies recruiting for a Data Science team need to understand the variety of different roles Data Scientists can play, and look for soft skills like storytelling and relationship building as well as technical skills.

Curiosity is one of the most important skills that a data scientist should possess.

High school students considering a career in Data Science should learn programming, math, and databases, and, most importantly, practice their skills.

The Report Structure

The length and content of the final report will vary depending on the needs of the project.

The structure of the final report for a Data Science project should include a cover page, table of contents, executive summary, detailed contents, acknowledgements, references and appendices.

The report should present a thorough analysis of the data and communicate the project findings.

Copyright © 2020 - 2022 Zhihao Zhuang. All rights reserved
