
How To Add A New DataFrame Column In Python

We’ll use the Top 50 Songs from last year

an amygdala
4 min read · Jan 5, 2020

A short introduction

This post assumes that you have installed the necessary packages (pandas in particular), and it uses the Jupyter Notebook application to run Python.

A DataFrame is a two-dimensional data structure, similar to a table in Excel. It has labeled rows and columns, and it is provided by the pandas library for Python. You can manipulate a DataFrame by adding new columns, and you can use lambda expressions to fill in those columns.
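To make that concrete, here is a minimal sketch of adding columns with lambda expressions. The column names and values are made up for illustration and are not taken from the Kaggle file.

```python
import pandas as pd

# Hypothetical sample data, not from the real dataset
songs = pd.DataFrame({
    "track": ["Song A", "Song B", "Song C"],
    "length_seconds": [180, 215, 242],
})

# Add a new column by applying a lambda to an existing column
songs["length_minutes"] = songs["length_seconds"].apply(lambda s: round(s / 60, 2))

# Add a second column that flags tracks longer than 200 seconds
songs["is_long"] = songs["length_seconds"].apply(lambda s: s > 200)

print(songs)
```

Assigning to a column name that doesn't exist yet is what creates the new column; the lambda only decides what goes into each row.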

How to find and prepare a dataset

First, you need to find a dataset to work with. Kaggle is an excellent resource for procuring manageable datasets. I selected one that provides information on the top 50 songs from 2019, called “top50.csv”.

I downloaded the file from Kaggle, unzipped it, and opened it up in Excel.
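Once the file is unzipped, loading it into a notebook could look something like the sketch below. The `load_songs` helper is a hypothetical name, and the small in-memory CSV stands in for the real file so the example runs on its own; with the actual download you would pass the path to "top50.csv" instead. The default encoding here is an assumption and may need changing if the file isn't UTF-8.

```python
import io
import pandas as pd

def load_songs(source, encoding="utf-8"):
    """Read a CSV file (or file-like object) into a DataFrame.

    The encoding default is an assumption; adjust it if pandas
    raises a UnicodeDecodeError on the real file.
    """
    return pd.read_csv(source, encoding=encoding)

# Stand-in for the real file, with made-up columns and rows
sample_csv = io.StringIO(
    "track,artist\n"
    "Song A,Artist One\n"
    "Song B,Artist Two\n"
)

songs = load_songs(sample_csv)

# Quick sanity checks: size of the table and the first few rows
print(songs.shape)
print(songs.head())
```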

There, I performed some data cleansing to make the dataset readable in Jupyter Notebook. This consisted of replacing special characters in artist and album names with simpler counterparts. For the purposes of my limited usage, the special characters weren’t needed.
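I did this step by hand in Excel, but a similar cleanup can be sketched in pandas. This is an alternative technique, not what I actually ran: the `to_ascii` helper is a hypothetical name, the sample values are made up, and it only handles accented letters that decompose to a plain ASCII base letter.

```python
import unicodedata
import pandas as pd

def to_ascii(text):
    """Replace accented characters with their closest ASCII letters."""
    # NFKD splits "é" into "e" plus a combining accent mark
    decomposed = unicodedata.normalize("NFKD", text)
    # Dropping non-ASCII bytes removes the accent marks
    return decomposed.encode("ascii", "ignore").decode("ascii")

# Hypothetical sample values with special characters
songs = pd.DataFrame({"artist": ["Rosalía", "Beyoncé", "Drake"]})

# Write the cleaned text to a new column so the original stays intact
songs["artist_clean"] = songs["artist"].apply(to_ascii)

print(songs)
```

Characters with no ASCII equivalent are dropped entirely by this approach, so it's worth comparing the two columns before relying on the cleaned one.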

It’s important to keep in mind that replacing parts of a dataset should be done…

