    Day 16: Data Analysis with Pandas

    Let's dive into **Day 16**, where we focus on **Data Analysis with Pandas**! I'll guide you through the essentials in an interactive way. By the end of this lesson, you'll have a solid understanding of data analysis using the **Pandas** library in Python.

    Topics Covered:

    • DataFrames and Series
    • Reading/Writing CSV and Excel Files
    • Data Cleaning and Manipulation

    1. Introduction to Pandas

    Pandas is one of the most widely used Python libraries for data manipulation and analysis. It provides two main data structures:

    • Series: A one-dimensional labeled array, like a single column in a table.
    • DataFrame: A two-dimensional table, like a spreadsheet or SQL table, where data is organized in rows and columns.
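To make the relationship between the two structures concrete, here is a minimal sketch. The variable names (`ages`, `people`, `age_column`) and the sample values are illustrative assumptions, not part of the lesson's data. It shows that selecting one column from a DataFrame gives back a Series.

```python
import pandas as pd

# A Series: one-dimensional, each value paired with an index label
ages = pd.Series([24, 27, 22], name="Age")

# A DataFrame: two-dimensional, built here from a dict of columns
people = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"], "Age": [24, 27, 22]})

# Selecting a single column from a DataFrame returns a Series
age_column = people["Age"]

print(type(age_column).__name__)  # Series
print(people.shape)               # (3, 2): 3 rows, 2 columns
```

In other words, a DataFrame can be thought of as a collection of Series that share the same index.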

    Install Pandas

    If you haven't installed Pandas yet, run this command in your terminal:

    pip install pandas
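If you want to confirm the installation worked, one simple check is to import the library and print its version. This is only a quick sanity check; the variable name `installed_version` is an illustrative choice, and the version string you see depends on your environment.

```python
import pandas as pd

# If this import succeeds, Pandas is installed in the current environment
installed_version = pd.__version__
print(installed_version)
```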

    2. Working with Series

    A **Series** holds a sequence of values, each paired with an index label. Here's how to create a Series from a Python list:

    import pandas as pd

    # Creating a Series
    data = [10, 20, 30, 40, 50]
    series = pd.Series(data)
    print(series)

    Output:

    0    10
    1    20
    2    30
    3    40
    4    50
    dtype: int64

    Each item in the Series has an index (0, 1, 2...) and a value (10, 20, 30...).
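The index/value pairing described above can be explored with a short sketch. The second Series, `labeled`, and its letter labels are assumptions added for illustration; they show that index labels do not have to be the default integers.

```python
import pandas as pd

series = pd.Series([10, 20, 30, 40, 50])

# .loc looks a value up by its index label, .iloc by its position
print(series.loc[0])    # 10
print(series.iloc[-1])  # 50

# The index and the values can be inspected separately
print(series.index.tolist())  # [0, 1, 2, 3, 4]
print(series.tolist())        # [10, 20, 30, 40, 50]

# Hypothetical example: a Series with custom string labels
labeled = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(labeled.loc["b"])  # 20
```

With the default index, labels and positions happen to match, which is why `.loc[0]` and `.iloc[0]` return the same value here.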

    Exercise:

    Create a Series containing the names of five of your favorite movies.


    3. Working with DataFrames

    A **DataFrame** is like a table, with columns and rows. Let's create a simple DataFrame:

    import pandas as pd

    # Creating a DataFrame
    data = {
        'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [24, 27, 22, 32],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
    }
    df = pd.DataFrame(data)
    print(df)

    Output:

          Name  Age         City
    0    Alice   24     New York
    1      Bob   27  Los Angeles
    2  Charlie   22      Chicago
    3    David   32      Houston
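Once a DataFrame exists, the usual next step is to pull out columns and rows. The sketch below reuses the same sample data; the variable names (`names`, `first_row`, `over_25`, `average_age`) and the age threshold of 25 are illustrative assumptions.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [24, 27, 22, 32],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}
df = pd.DataFrame(data)

# One column, returned as a Series
names = df['Name']

# One row, selected by its index label
first_row = df.loc[0]

# Rows that satisfy a condition (boolean filtering)
over_25 = df[df['Age'] > 25]

# A simple summary statistic on a numeric column
average_age = df['Age'].mean()

print(df.shape)                  # (4, 3)
print(first_row['City'])         # New York
print(over_25['Name'].tolist())  # ['Bob', 'David']
print(average_age)               # 26.25
```

Filtering returns a new DataFrame and leaves `df` unchanged, so you can experiment freely without losing the original data.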

    Next Steps:

    On **Day 17**, we'll move on to **Data Visualization** with Matplotlib and Seaborn!