Lists and Filtering Algorithms - Gyutae Kim

Popcorn Hack 1

movies = ["Inception", "The Dark Knight", "Interstellar", "The Matrix"]

movies[1] = "Pulp Fiction"

movies.append("Forrest Gump")

print("Updated list of movies:", movies)
Updated list of movies: ['Inception', 'Pulp Fiction', 'Interstellar', 'The Matrix', 'Forrest Gump']

Popcorn Hack 2

ages = [15, 20, 34, 16, 18, 21, 14, 19]

eligible_ages = [age for age in ages if age >= 18]

print("Ages eligible for voting:", eligible_ages)
Ages eligible for voting: [20, 34, 18, 21, 19]

HW Hack 1

Video 1:

  • Ordered Data Structure: Lists in Python maintain the order of elements, unlike sets where order is not guaranteed.
  • Indexing: Lists use zero-based indexing, meaning the first element is at index 0. Negative indices allow access to elements from the end of the list.
  • Slicing: You can retrieve a range of elements using slicing (list[start:stop]), where the start index is included, but the stop index is excluded.
  • Dynamic and Heterogeneous: Lists can contain elements of different data types (e.g., integers, strings, floats) and can grow dynamically by appending new elements.
  • Duplicate Values: Lists can contain duplicate elements, unlike sets which only store unique values.
  • Concatenation: Lists can be combined using the + operator, but the order of concatenation matters (e.g., list1 + list2 is different from list2 + list1).
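
The points from Video 1 can be checked with a short sketch. The list contents and variable names here (fruits, mixed, ab, ba) are hypothetical, chosen only to illustrate each behavior:

```python
# Hypothetical sample data; duplicates are allowed in a list.
fruits = ["apple", "banana", "cherry", "banana"]

first = fruits[0]      # zero-based indexing -> "apple"
last = fruits[-1]      # negative index counts from the end -> "banana"
middle = fruits[1:3]   # start included, stop excluded -> ["banana", "cherry"]

# Lists are heterogeneous and grow dynamically.
mixed = [1, "two", 3.0]
mixed.append(None)     # -> [1, "two", 3.0, None]

# Concatenation order matters.
ab = [1, 2] + [3, 4]   # -> [1, 2, 3, 4]
ba = [3, 4] + [1, 2]   # -> [3, 4, 1, 2]

# A set keeps only unique values, so the duplicate "banana" collapses.
unique_count = len(set(fruits))  # -> 3

print(first, last, middle, mixed, ab, ba, unique_count)
```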

Video 2:

  • Simplified Syntax: List comprehensions provide a concise way to create lists, often replacing longer for loops with a single line of code.
  • Readable Structure: The syntax of a list comprehension closely resembles natural language, making it easier to understand (e.g., [n * n for n in nums]).
  • Conditional Logic: List comprehensions can include conditions to filter elements (e.g., [n for n in nums if n % 2 == 0] for even numbers).
  • Nested Loops: They support nested loops, allowing the creation of more complex lists (e.g., [(letter, num) for letter in 'ABCD' for num in range(4)]).
  • Alternative to Map/Filter: List comprehensions often replace map and filter functions, offering better readability and simplicity.
  • Other Comprehensions: Similar syntax applies to dictionary comprehensions ({key: value for key, value in zip(keys, values)}), set comprehensions ({n for n in nums}), and generator expressions ((n * n for n in nums)).
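
The comprehension forms listed above can be compared side by side. This is a minimal sketch; the input values (nums, keys, values) are assumptions chosen for illustration, not data from the videos:

```python
nums = [1, 2, 3, 4, 5, 6]  # hypothetical input

# Basic comprehension and a filtered one.
squares = [n * n for n in nums]            # -> [1, 4, 9, 16, 25, 36]
evens = [n for n in nums if n % 2 == 0]    # -> [2, 4, 6]

# Nested loops: the leftmost "for" is the outer loop.
pairs = [(letter, num) for letter in "AB" for num in range(2)]
# -> [('A', 0), ('A', 1), ('B', 0), ('B', 1)]

# The same results written with map and filter, for comparison.
squares_via_map = list(map(lambda n: n * n, nums))
evens_via_filter = list(filter(lambda n: n % 2 == 0, nums))

# Dictionary, set, and generator forms use the same syntax.
keys = ["a", "b", "c"]
values = [1, 2, 3]
lookup = {key: value for key, value in zip(keys, values)}  # -> {'a': 1, 'b': 2, 'c': 3}
remainders = {n % 3 for n in nums}                         # -> {0, 1, 2}
total = sum(n * n for n in nums)                           # generator expression -> 91

print(squares, evens, pairs, lookup, remainders, total)
```

The generator expression passed to sum() never builds an intermediate list, which can matter when the input is large.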

HW Hack 2

numbers = list(range(1, 31))

filtered_numbers = [num for num in numbers if num % 3 == 0 and num % 5 != 0]

print("Original list:", numbers)
print("Filtered list (divisible by 3 but not by 5):", filtered_numbers)
Original list: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
Filtered list (divisible by 3 but not by 5): [3, 6, 9, 12, 18, 21, 24, 27]

HW Hack 3

import pandas as pd

# Load the Spotify Global Streaming Data 2024 CSV
data = pd.read_csv("Spotify_2024_Global_Streaming_Data.csv")

# Filter songs with over 10 million streams
filtered_data = data[data["Total Streams (Millions)"] > 10]

# Display the filtered data
print("Songs with over 10 million streams:")
print(filtered_data)
Songs with over 10 million streams:
           Country        Artist                    Album      Genre  \
0          Germany  Taylor Swift  1989 (Taylor's Version)      K-pop   
1           Brazil    The Weeknd              After Hours        R&B   
2    United States   Post Malone                   Austin  Reggaeton   
3            Italy    Ed Sheeran        Autumn Variations      K-pop   
4            Italy    Ed Sheeran        Autumn Variations        R&B   
..             ...           ...                      ...        ...   
495         Brazil       Karol G       MAÑANA SERÁ BONITO       Jazz   
496         Canada      Dua Lipa         Future Nostalgia  Classical   
497        Germany       Karol G       MAÑANA SERÁ BONITO       Rock   
498         Canada           SZA                      SOS      Indie   
499         Sweden           BTS                    Proof  Reggaeton   

     Release Year  Monthly Listeners (Millions)  Total Streams (Millions)  \
0            2019                         23.10                   3695.53   
1            2022                         60.60                   2828.16   
2            2023                         42.84                   1425.46   
3            2018                         73.24                   2704.33   
4            2023                          7.89                   3323.25   
..            ...                           ...                       ...   
495          2018                         18.80                   2947.97   
496          2023                         89.68                   4418.61   
497          2023                         36.93                   2642.90   
498          2022                         87.26                   4320.23   
499          2018                         89.96                   4804.15   

     Total Hours Streamed (Millions)  Avg Stream Duration (Min) Platform Type  \
0                           14240.35                       4.28          Free   
1                           11120.44                       3.90       Premium   
2                            4177.49                       4.03          Free   
3                           12024.08                       3.26       Premium   
4                           13446.32                       4.47          Free   
..                               ...                        ...           ...   
495                         12642.83                       3.59       Premium   
496                         11843.46                       3.15          Free   
497                          8637.46                       4.08          Free   
498                         12201.40                       2.79          Free   
499                         12044.32                       4.03          Free   

     Streams Last 30 Days (Millions)  Skip Rate (%)  
0                             118.51           2.24  
1                              44.87          23.98  
2                              19.46           4.77  
3                             166.05          25.12  
4                             173.43          15.82  
..                               ...            ...  
495                            83.30          18.58  
496                           143.96           5.82  
497                            76.36          15.84  
498                            84.50          13.07  
499                            92.27          34.36  

[500 rows x 12 columns]

Review Question

What are lists in Python, and how can they be modified? Lists in Python are ordered, mutable collections that can store elements of various data types. They can be modified by assigning to an index to update a specific element, calling append() to add a new item, assigning to a slice to replace a range of elements, and concatenating to build a combined list. Methods like sort(), reverse(), and remove() change the list in place.
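
As a rough illustration of these operations, the sketch below applies each one in turn to a hypothetical tasks list (the name and contents are assumptions, not part of the assignment):

```python
tasks = ["write", "test", "deploy"]  # hypothetical starting list

tasks[1] = "review"          # index assignment -> ["write", "review", "deploy"]
tasks.append("monitor")      # -> ["write", "review", "deploy", "monitor"]
tasks[0:2] = ["plan"]        # slice assignment -> ["plan", "deploy", "monitor"]
tasks = tasks + ["archive"]  # concatenation builds a new list
                             # -> ["plan", "deploy", "monitor", "archive"]
tasks.remove("deploy")       # removes the first match -> ["plan", "monitor", "archive"]
tasks.sort()                 # in place -> ["archive", "monitor", "plan"]
tasks.reverse()              # in place -> ["plan", "monitor", "archive"]

print(tasks)
```

Note that reading a slice such as tasks[0:2] returns a new list and leaves the original unchanged; only assignment to a slice modifies it.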

Provide a real-world scenario for a filtering algorithm. A filtering algorithm can be used in a music streaming platform like Spotify to identify songs with over 10 million streams. This allows the platform to create a “Top Hits” playlist by processing the dataset and extracting songs that meet the criteria.
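
A minimal sketch of that scenario in plain Python follows. The four rows are copied from the first rows of the HW Hack 3 output above; the dictionary keys and the 2800-million cutoff are assumptions picked so that the filter visibly excludes some rows:

```python
# Rows taken from the first four records of the HW Hack 3 output.
songs = [
    {"artist": "Taylor Swift", "album": "1989 (Taylor's Version)", "streams_millions": 3695.53},
    {"artist": "The Weeknd", "album": "After Hours", "streams_millions": 2828.16},
    {"artist": "Post Malone", "album": "Austin", "streams_millions": 1425.46},
    {"artist": "Ed Sheeran", "album": "Autumn Variations", "streams_millions": 2704.33},
]

THRESHOLD_MILLIONS = 2800  # hypothetical cutoff for a "Top Hits" playlist

# Keep only records above the threshold, then rank them by streams.
top_hits = [song for song in songs if song["streams_millions"] > THRESHOLD_MILLIONS]
top_hits.sort(key=lambda song: song["streams_millions"], reverse=True)

playlist = [song["artist"] for song in top_hits]
print(playlist)  # ['Taylor Swift', 'The Weeknd']
```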

Why is analyzing filtering algorithm efficiency important? Efficient filtering algorithms are crucial for handling large datasets, ensuring fast performance and scalability. They minimize resource usage, reduce costs, and improve user experience by delivering results quickly, especially in real-time applications.
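
One way to see the effect of efficiency is to count how many elements a filter actually inspects. The sketch below is an illustration under assumptions: the helper names first_matches and is_multiple_of_three are hypothetical. It uses a lazy generator expression with itertools.islice so the scan stops as soon as enough matches are found, rather than checking all one million values:

```python
from itertools import islice


def first_matches(values, predicate, limit):
    """Return up to `limit` values that satisfy `predicate`, stopping early."""
    matches = (v for v in values if predicate(v))  # lazy: nothing is checked yet
    return list(islice(matches, limit))


checked = []  # records every value the predicate was asked about


def is_multiple_of_three(n):
    checked.append(n)
    return n % 3 == 0


result = first_matches(range(1, 1_000_001), is_multiple_of_three, 3)

print(result)        # [3, 6, 9]
print(len(checked))  # 9 values inspected, not 1,000,000
```

A list comprehension over the same range would inspect every value before returning, which is fine for small inputs but wasteful when only the first few matches are needed.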