I earned a Certificate of Completion verifying that I successfully completed the intermediate-to-advanced “Python 3: Deep Dive (Part 2 - Iteration, Generators)” course on 26/01/2022, taught by instructor Fred Baptiste at Udemy Academy. Fred Baptiste is a professional developer and mathematician. The certificate indicates that the entire course was completed, as validated by the student. The course duration represents the total video hours at the time of most recent completion: 34.5 hours.
In this course, I learned about:
• Sequence types and the sequence protocol
• Iterables and the iterable protocol
• Iterators and the iterator protocol (a minimal sketch of these protocols follows this list)
• Sequence slicing and how slicing relates to ranges
• List comprehensions and their relation to closures
• Why subtle bugs sometimes creep into list comprehensions (a minimal reproduction appears below)
• All the functions in the itertools module (a short sampler appears below)
• Generator functions
• Generator expressions
• Context managers and their relationship to generator functions (when working with files, calling pathlib.Path.open() is the more Pythonic choice)
• Creating context managers using generator functions (sketched below)
• Using generators as coroutines (a minimal coroutine example closes out the sketches below)
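To make the first three bullets concrete, here is a minimal sketch of my own (not course code) showing an iterable whose iterator computes squares lazily, next to the generator function that replaces all of that boilerplate:

```python
class Squares:
    """Iterable: each call to __iter__ returns a fresh iterator."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self.SquaresIterator(self.n)

    class SquaresIterator:
        """Iterator: implements both __iter__ and __next__."""
        def __init__(self, n):
            self.n = n
            self.i = 0

        def __iter__(self):
            return self  # an iterator is its own iterator

        def __next__(self):
            if self.i >= self.n:
                raise StopIteration
            result = self.i ** 2
            self.i += 1
            return result


def squares(n):
    """Generator function: the same semantics with far less boilerplate."""
    for i in range(n):
        yield i ** 2


assert list(Squares(4)) == list(squares(4)) == [0, 1, 4, 9]
```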
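The subtle comprehension bug mentioned above stems from the fact that a comprehension compiles to a function, so lambdas created inside one close over the comprehension's loop variable rather than its value at creation time. A minimal reproduction:

```python
# Each lambda closes over the comprehension's variable i, not its value at
# creation time, so all three see i's final value (2).
funcs = [lambda: i for i in range(3)]
print([f() for f in funcs])  # [2, 2, 2]

# Binding i as a default argument captures the value instead.
funcs = [lambda i=i: i for i in range(3)]
print([f() for f in funcs])  # [0, 1, 2]
```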
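A short sampler of itertools behavior (everything here is lazy; nothing is computed until iteration). The specific values are my own illustration:

```python
from itertools import chain, count, groupby, islice, takewhile

# count() is an infinite iterator; islice lazily takes a finite slice of it.
evens = (n for n in count() if n % 2 == 0)
print(list(islice(evens, 5)))                          # [0, 2, 4, 6, 8]

print(list(takewhile(lambda x: x < 4, [1, 2, 5, 1])))  # [1, 2]
print(list(chain('ab', 'cd')))                         # ['a', 'b', 'c', 'd']

# groupby groups *consecutive* items sharing a key, hence the sort first.
words = sorted(['banana', 'apple', 'avocado'], key=lambda w: w[0])
for key, group in groupby(words, key=lambda w: w[0]):
    print(key, list(group))  # a ['apple', 'avocado'] then b ['banana']
```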
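Creating a context manager from a generator function looks like this minimal sketch (the file name is hypothetical); everything before the yield plays the role of __enter__, and the finally clause plays the role of __exit__:

```python
from contextlib import contextmanager
from pathlib import Path

@contextmanager
def open_file(name, mode='r'):
    """Code before the yield acts as __enter__; the finally clause as __exit__."""
    f = open(name, mode)
    try:
        yield f  # the yielded value is what the 'as' clause binds
    finally:
        f.close()

# Both forms below are equivalent in effect; for files, pathlib is the
# more Pythonic route, as noted above:
# with open_file('data.txt') as f: ...
# with Path('data.txt').open() as f: ...
```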
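Finally, a coroutine in the sense this course uses the term: a generator driven by send(), which must first be primed with next(). A classic running-average example:

```python
def running_averager():
    """Coroutine: receives numbers via send(), yields the running average."""
    total = 0
    count = 0
    average = None
    while True:
        value = yield average  # suspends here; send() resumes with a value
        total += value
        count += 1
        average = total / count

avg = running_averager()
next(avg)            # prime the coroutine: advance it to the first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
```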
Five Capstone Projects:
1) First, create a ‘Polygon’ class. The initializer takes the number of edges/vertices of the largest polygon in the sequence and a common circumradius for all polygons. Properties: edges, vertices, interior angle, edge length, apothem, area, perimeter, and the polygon with the highest area-to-perimeter ratio.
Functions: a proper representation (__repr__), equality (==) based on the number of vertices and circumradius (__eq__), ordering with > based on the number of vertices only (__gt__), behaving as a sequence type (__getitem__), and support for len() (__len__).
Second, create a ‘Polygons’ class, a sequence of Polygon objects, by implementing the sequence protocol. Identify the polygon with the highest area-to-perimeter ratio by sorting the polygons.
Goal one: refactor the `Polygon` class so that all the calculated properties are lazy (cached), i.e. they should still be calculated properties, but they should not have to be recalculated more than once (since our `Polygon` class is "immutable"). A sketch of this caching pattern follows this project description.
Goal two: refactor the `Polygons` (sequence) type into an **iterable**. Make sure also that the elements in the iterator are computed lazily - i.e. you can no longer use a list as an underlying storage mechanism for your polygons. You'll need to implement both an iterable and an iterator.
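A minimal sketch of the caching pattern goal one asks for, shown here for a single area property (the formula is the standard one for a regular polygon with n vertices and circumradius R; the full class would cache every calculated property this way):

```python
import math

class Polygon:
    """Sketch of lazy, cached properties (not the full course solution)."""
    def __init__(self, n, circumradius):
        self._n = n
        self._R = circumradius
        self._area = None  # sentinel: not computed yet

    @property
    def area(self):
        if self._area is None:  # compute on first access only...
            self._area = (self._n / 2 * self._R ** 2
                          * math.sin(2 * math.pi / self._n))
        return self._area       # ...then serve the cached value

p = Polygon(6, 2)
print(p.area)  # computed on this call
print(p.area)  # served from cache; no recomputation
```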
2) Use data on New York City parking violations to calculate the number of violations by car make. Create a lazy iterator to extract the data from a CSV file, which keeps memory overhead to a minimum. Convert the data to the correct types. Split each row and strip leading/trailing spaces. Treat values by priority: if a critical value in a row is empty, toss the row away; if a non-critical value is empty, keep the row. All the parsing functions should be generators, unless we need to iterate through a row more than once. Transform rows into a structured form such as a named tuple. A sketch of this lazy-parsing approach follows.
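A sketch of that lazy-parsing approach; the file name nyc_parking_tickets.csv and the vehicle_make column are assumptions for illustration:

```python
import csv
from collections import namedtuple, Counter

def parse_data(fname):
    """Lazily yield one named tuple per usable CSV row."""
    with open(fname) as f:
        reader = csv.reader(f)
        headers = [h.strip().lower().replace(' ', '_') for h in next(reader)]
        Ticket = namedtuple('Ticket', headers)
        make_idx = headers.index('vehicle_make')  # assumed critical column
        for row in reader:
            fields = [field.strip() for field in row]
            if not fields[make_idx]:
                continue  # critical value missing: toss the row away
            yield Ticket(*fields)

# Only one row is ever held in memory at a time.
makes = Counter(t.vehicle_make for t in parse_data('nyc_parking_tickets.csv'))
print(makes.most_common(5))
```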
3) I am given four CSV data files: personal_info.csv, vehicles.csv, employment.csv, and update_status.csv. Each file contains different information about the same people, and all files use the person's Social Security Number (SSN) as a uniquely identifying key. Each SSN appears exactly once in every file.
Goal one: use the csv module to create four independent lazy iterators, one per file, each returning named tuples. Data types should be appropriate (string, date, int, etc.). The four iterators are independent of each other (for now).
Goal two: create a single iterable that combines all the data from all four files, re-using the iterators created in goal one. By combining I mean one row per SSN containing the data from all four files in a single named tuple. Make sure the SSN value is not repeated four times - once per combined row is enough.
Goal three: identify any stale records, where stale simply means the record has not been updated since 3/1/2017 (i.e. last update date < 3/1/2017). Create an iterator that contains only current (i.e. not stale) records, based on the `last_updated` field from the `update_status` file.
Goal four: for each gender, find the car make with the largest number of occurrences. A sketch of goals one and two follows.
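A sketch of goals one and two; the merge assumes the four files list SSNs in the same order, and for brevity it yields plain dicts where the project asks for named tuples:

```python
import csv
from collections import namedtuple

def csv_parser(fname):
    """Goal one: an independent lazy iterator of named tuples for one file."""
    with open(fname) as f:
        reader = csv.reader(f)
        Row = namedtuple('Row', [h.strip() for h in next(reader)])
        for row in reader:
            yield Row(*(field.strip() for field in row))

def combined(*fnames):
    """Goal two: one merged record per SSN, built lazily."""
    for rows in zip(*(csv_parser(fname) for fname in fnames)):
        merged = {}
        for row in rows:
            # dict keys are unique, so the repeated 'ssn' field collapses
            # into a single entry in the merged record
            merged.update(row._asdict())
        yield merged

for record in combined('personal_info.csv', 'vehicles.csv',
                       'employment.csv', 'update_status.csv'):
    print(record)
    break
```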
4) I am given two data files: cars.csv and personal_info.csv. The basic goal is to create a context manager that requires only the file name and provides us with an iterator we can use to iterate over the data in those files. The iterator should yield named tuples with field names based on the header row in the CSV file.
Goal one: create a single class that implements both the context manager protocol and the iterator protocol (sketched below).
Goal two: re-implement goal one using a generator function instead, with @contextmanager from the contextlib module.
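A sketch of goal one, a single class acting as both context manager and iterator (the structure is assumed from the goal description, not copied from the course solution):

```python
import csv
from collections import namedtuple

class FileParser:
    """Context manager AND iterator over a CSV file's rows."""
    def __init__(self, fname):
        self._fname = fname

    def __enter__(self):
        self._f = open(self._fname)
        self._reader = csv.reader(self._f)
        headers = map(str.strip, next(self._reader))
        self._Row = namedtuple('Row', headers)
        return self  # the context manager is itself the iterator

    def __exit__(self, exc_type, exc_value, traceback):
        self._f.close()
        return False  # do not suppress exceptions

    def __iter__(self):
        return self

    def __next__(self):
        if self._f.closed:
            raise StopIteration  # no iterating outside the with block
        return self._Row(*next(self._reader))

# Usage:
# with FileParser('cars.csv') as data:
#     for row in data:
#         print(row)
```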
5) The goal of this project is to rewrite the pull pipeline we created in the **Application - Pipelines - Pulling** video of the **Generators as Coroutines** section, applying the techniques used in the **Application - Pipelines - Broadcasting** video. The goal is to write a pipeline that pushes data from the source file, `cars.csv`, through some filters and a save coroutine to ultimately save the results as a CSV file. A skeletal push pipeline is sketched below.
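A skeletal push pipeline under stated assumptions: the filter predicate, the output file name, and the column position of the make are all hypothetical.

```python
import csv

def coroutine(fn):
    """Decorator that primes a coroutine by advancing it to the first yield."""
    def inner(*args, **kwargs):
        gen = fn(*args, **kwargs)
        next(gen)
        return gen
    return inner

@coroutine
def save_csv(fname):
    """Sink: writes every row it receives to a CSV file."""
    with open(fname, 'w', newline='') as f:
        writer = csv.writer(f)
        while True:
            writer.writerow((yield))

@coroutine
def filter_rows(predicate, target):
    """Filter: forwards rows matching the predicate to the next stage."""
    while True:
        row = yield
        if predicate(row):
            target.send(row)

def produce(fname, target):
    """Source: pushes every data row from the input file into the pipeline."""
    with open(fname) as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            target.send(row)

# Chain the stages: source -> filter -> sink (column 0 assumed to hold the make).
out = save_csv('filtered_cars.csv')
pipeline = filter_rows(lambda row: 'Toyota' in row[0], out)
produce('cars.csv', pipeline)
pipeline.close()
out.close()  # closing the sink flushes and closes the output file
```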
You can find this course at https://www.udemy.com/course/python-3-deep-dive-part-2/