How To Add A New DataFrame Column In Python
We’ll use the Top 50 Songs from last year
A short introduction
This post assumes that you have installed the necessary packages, and it will use the Jupyter Notebook application to run Python.
A pandas DataFrame is a two-dimensional data structure, similar to a table in Excel. It has rows and columns, and it is provided by the pandas library for Python. You can manipulate a DataFrame by adding new columns, and you can use lambda expressions to fill in those columns.
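To make that concrete, here is a minimal sketch of adding columns with lambda expressions. The rows and the column names (track, length_seconds, length_minutes, is_long) are made up for illustration and are not the real headers of the dataset used later.

```python
import pandas as pd

# Hypothetical sample rows; these column names are assumptions for illustration.
songs = pd.DataFrame({
    "track": ["Song A", "Song B", "Song C"],
    "length_seconds": [180, 215, 242],
})

# Add a new column by applying a lambda expression to one existing column.
songs["length_minutes"] = songs["length_seconds"].apply(lambda s: round(s / 60, 2))

# Add another column with a lambda that receives a whole row (axis=1).
songs["is_long"] = songs.apply(lambda row: row["length_seconds"] > 200, axis=1)

print(songs)
```

Applying the lambda to a single column is usually the simpler choice. The row-wise form with axis=1 is useful when the new value depends on more than one column.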
How to find and prepare a dataset
First, you need to find a dataset to work with. Kaggle is an excellent resource for procuring manageable datasets. I selected one that provides information on the top 50 songs from 2019, called “top50.csv”.
I downloaded the file from Kaggle, unzipped it, and opened it up in Excel.
There, I performed some data cleansing to make the dataset readable in Jupyter Notebook. This consisted of replacing special characters in artist and album names with simpler counterparts. For the purposes of my limited usage, the special characters weren’t needed.
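I did this step by hand in Excel, but a similar cleanup could also be done in pandas. The following is only a rough sketch of that alternative, not what I actually ran: the helper name to_plain_ascii, the artist column, and the sample values are all hypothetical.

```python
import unicodedata

import pandas as pd


def to_plain_ascii(text: str) -> str:
    """Replace accented characters with their closest plain ASCII letters."""
    # NFKD splits a letter like "é" into "e" plus a separate accent mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Dropping everything outside ASCII removes the accent marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Hypothetical sample values standing in for names with special characters.
songs = pd.DataFrame({"artist": ["José", "Zoë", "Plain"]})

# Overwrite the column with the simplified names.
songs["artist"] = songs["artist"].apply(to_plain_ascii)

print(songs)
```

Note that this approach silently drops any character that has no ASCII decomposition, so it only suits cases like mine where the special characters aren't needed.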
It’s important to keep in mind that replacing parts of a dataset should be done…