• ROI Training

Data Science with Python

Curriculum

Big Data and Machine Learning

Delivery methods

On-Site, Virtual

Duration

4 days

As organizations generate more and more data, they need to put it to use for competitive advantage. This calls for a consistent, easy-to-use set of tools that an analyst can use interactively to extract business value in a timely manner.

Python is quick to learn and offers libraries for manipulating and analyzing data: pandas, NumPy, SciPy, and scikit-learn. This course introduces attendees to these tools and, through hands-on exercises based on real-world scenarios, shows how they can be applied to a range of business problems.

Learning objectives

  • Learn to apply Python tools and Data Science libraries to provide business value
  • Learn basic and advanced NumPy (Numerical Python) features
  • Get started with data analysis tools in the pandas library
  • Use high-performance tools to load, clean, transform, merge, and reshape data
  • Create scatter plots and static or interactive visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Use scikit-learn for machine learning

Who should attend

Anybody involved in the processing and analysis of data using Python, including business analysts, data engineers, data scientists and software engineers.

Prerequisites

A basic knowledge of Python is required.

Course outline

  • IPython
  • Introduction to:
    • NumPy
    • SciPy
    • pandas
    • Matplotlib
    • Scikit-learn
  • Loading from CSV Files
  • Accessing SQL databases
  • Cleansing Data with Python
  • Stripping Out Extraneous Information
  • Normalizing Data
  • Formatting Data
  • Universal Functions: Fast Element-Wise Array Functions
  • Data Processing Using Arrays
  • File Input and Output with Arrays
  • Linear Algebra
  • Random Number Generation
  • Example: Random Walks
  • Introduction to pandas Data Structures
  • Essential Functionality
  • Summarizing and Computing Descriptive Statistics
  • Handling Missing Data
  • Hierarchical Indexing
  • Combining and Merging Data Sets
  • Reshaping and Pivoting
  • Data Transformation
  • String Manipulation
  • Introducing matplotlib
  • Plotting Functions in pandas
  • Plotting Maps
  • Python Visualization Tool Ecosystem
  • GroupBy Mechanics
  • Data Aggregation
  • Group-wise Operations and Transformations
  • Pivot Tables and Cross-Tabulation
  • Time Series Basics
  • Date Ranges, Frequencies, and Shifting
  • Time Zone Handling
  • Periods and Period Arithmetic
  • Resampling and Frequency Conversion
  • Time Series Plotting
  • Moving Window Functions
  • Optimization
  • Interpolation
  • Integration
  • Statistics
  • Spatial and Clustering Analysis
  • Signal and Image Processing
  • Sparse Matrices
  • Introduction to Machine Learning
  • Supervised Learning
  • Support Vector Machines
  • Naïve Bayes Classifiers
  • Unsupervised Learning
  • Principal Component Analysis
  • Clustering Algorithms
