Data Explorer

Streamlit | Python | ETL

Project

Overview

I developed a Streamlit app in Python that runs common data exploration checks on an uploaded dataset. The app offers the following features:

Data preview: It provides a preview of the first few rows (Head) and last few rows (Tail) of the dataset.
Header Count: The app displays the total number of headers (fields or columns) present in the dataset.
Row Count: It shows the total number of rows in the dataset.
Total Data Points: The app calculates and presents the total number of data points in the dataset (rows multiplied by columns).
Dataframe Dimension Count: It provides the dimensions (rows and columns) of the dataset.
Dataset Size: The app indicates the size of the dataset in terms of memory usage.
List of Headers: It displays a list of all headers (fields or columns) available in the dataset.
Data Types: For each header, the app presents the corresponding data type (e.g., integer, string, float, etc.).
Missing Values Count: It calculates and presents the total number of missing values in the dataset.
Duplicate Rows: The app identifies and provides a list of duplicate rows in the dataset.
Data Summary Table: The app generates a summary table for headers with numerical values, providing statistics such as mean, median, minimum, maximum, etc.

With these features, users can efficiently explore and gain insights into their datasets.
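
As a rough illustration of how the features listed above map onto pandas, here is a minimal sketch. It is not the app's actual source: the `explore` function, its `preview_rows` parameter, the dictionary keys, and the sample data are all hypothetical names chosen for this example.

```python
import pandas as pd


def explore(df: pd.DataFrame, preview_rows: int = 5) -> dict:
    """Collect the exploration metrics described above (hypothetical helper)."""
    n_rows, n_cols = df.shape
    return {
        # Data preview: first and last few rows
        "head": df.head(preview_rows),
        "tail": df.tail(preview_rows),
        # Header count, row count, and total data points (rows * columns)
        "header_count": n_cols,
        "row_count": n_rows,
        "total_data_points": df.size,
        # Dimensions as a (rows, columns) tuple
        "dimensions": df.shape,
        # Memory usage in bytes, including the index and string contents
        "memory_bytes": int(df.memory_usage(deep=True).sum()),
        # Header names and the data type of each header
        "headers": df.columns.tolist(),
        "dtypes": df.dtypes.astype(str).to_dict(),
        # Total missing values across every column
        "missing_values": int(df.isna().sum().sum()),
        # Every row that has at least one exact duplicate
        "duplicate_rows": df[df.duplicated(keep=False)],
        # Summary statistics for numeric headers only; the "50%" row is the
        # median. This assumes the dataset has at least one numeric column.
        "numeric_summary": df.select_dtypes(include="number").describe(),
    }


# Made-up sample data, used only to exercise the sketch.
sample = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Bob", None],
        "age": [31, 45, 45, 28],
        "score": [88.5, 70.0, 70.0, None],
    }
)

report = explore(sample)
print(report["dimensions"], report["missing_values"])
print(report["numeric_summary"])
```

In a Streamlit app, the DataFrame would typically come from `st.file_uploader` and `pd.read_csv`, and each entry could be rendered with `st.dataframe` or `st.write`. That wiring is left out here so the sketch runs as a plain script.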

Click the code button for a more technical overview of the project and click the demo button to interact with the project.

Technologies

Streamlit

Python

Pandas

NumPy
