Pandas: Delete rows where column value is null
Data preprocessing is a crucial part of data analysis, and how well you handle missing data affects the accuracy of your model. There are several methods for handling it. I enjoy using the sklearn-pandas library when possible, as its DataFrameMapper bridges the gap between scikit-learn's capabilities and pandas data structures. Here, we will use it while avoiding imputation on a particular column, and instead remove the rows with missing values in that column from the dataset. Note that whether to remove data or substitute values should be decided on a case-by-case basis. The factors behind that decision are beyond the scope of this tutorial.

Deleting rows where column[xxx] value is null

I received the following question about working on a dataset in pandas: "Hello, I'm using the sklearn-pandas.DataFrameMapper to preprocess my data. I prefer not to impute values for a particular column. Instead, I want to exclude any rows where ...
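As a rough illustration of removing rows with a null in one column, here is a minimal sketch in plain pandas. The DataFrame and the column names `age` and `city` are hypothetical, chosen only for the example. It uses `DataFrame.dropna` with the `subset` argument, applied before any DataFrameMapper step, which is one common way to do this and not necessarily the approach the answer to the question takes.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 40.0],
    "city": ["Oslo", "Lima", None, "Pune"],
})

# Keep only the rows where "age" is present.
# Nulls in other columns (here, "city") are left untouched.
cleaned = df.dropna(subset=["age"])

# Equivalent form using a boolean mask.
cleaned_mask = df[df["age"].notna()]
```

Both forms return a new DataFrame and leave `df` unchanged. The original index labels are kept, so call `reset_index(drop=True)` on the result if later steps expect a contiguous index.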