Day 16: Data Analysis with Pandas

Let's dive into **Day 16**, where we focus on **Data Analysis with Pandas**! I'll guide you through the essentials in an interactive way. By the end of this lesson, you'll have a solid understanding of data analysis using the **Pandas** library in Python.

Topics Covered:

DataFrames and Series
Reading/Writing CSV and Excel Files
Data Cleaning and Manipulation

1. Introduction to Pandas

Pandas is a widely used Python library for data manipulation and analysis. It provides two main data structures:

Series: A one-dimensional labeled array, like a single column in a table.
DataFrame: A two-dimensional table, like a spreadsheet or SQL table, where data is organized in rows and columns.
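
Once Pandas is installed (see "Install Pandas"), the relationship between the two structures can be sketched roughly as follows: each column of a DataFrame is itself a Series. The variable names and sample values here are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
ages = pd.Series([24, 27, 22], name="Age")       # one-dimensional
table = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"],
                      "Age": [24, 27, 22]})      # two-dimensional

print(ages.ndim)            # number of dimensions of the Series
print(table.ndim)           # number of dimensions of the DataFrame
print(type(table["Age"]))   # selecting one column gives back a Series
```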

Install Pandas

If you haven't installed Pandas yet, run this command in your terminal:

pip install pandas
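
To confirm the installation worked, one quick check is to import the library and print its version. The version string you see will depend on your environment.

```python
import pandas as pd

# Prints the installed Pandas version; the exact value varies by environment
print(pd.__version__)
```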

2. Working with Series

A **Series** is a one-dimensional sequence of values, each paired with an index label. Here's how to create one from a Python list:

import pandas as pd

# Creating a Series
data = [10, 20, 30, 40, 50]
series = pd.Series(data)
print(series)

Output:

0    10
1    20
2    30
3    40
4    50
dtype: int64

Each item in the Series has an index (0, 1, 2...) and a value (10, 20, 30...).
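
The index/value pairing can be explored with a short sketch like the one below, which reuses the data from the example above. The custom labels 'a' through 'e' are made up for illustration; any unique labels would work.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])

print(series.index.tolist())  # the default integer labels
print(series.tolist())        # the values as a plain Python list
print(series.loc[2])          # look up a value by its index label

# Hypothetical custom labels instead of the default 0, 1, 2...
labeled = pd.Series([10, 20, 30, 40, 50], index=["a", "b", "c", "d", "e"])
print(labeled.loc["c"])       # look up by the custom label
```

Using `.loc` makes it explicit that the lookup is by label rather than by position, which matters once the index is no longer the default 0, 1, 2...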

Exercise:

Create a Series containing the names of five of your favorite movies.

3. Working with DataFrames

A **DataFrame** is like a table, with columns and rows. Let's create a simple DataFrame:

import pandas as pd

# Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [24, 27, 22, 32],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}

df = pd.DataFrame(data)
print(df)