Part 2: Explore the data at multiple levels by creating data visualizations
a) Create 2 or 3 visualizations that represent, summarize, or characterize the dataset
as a whole (for a dataset with many columns, select a few that are relevant to your topic).
b) Create 2 to 4 visualizations that compare, represent, summarize, or characterize one
or more subsets or subgroups of the dataset (e.g. a random sample, or subsets based on
particular attribute values).
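
As a rough illustration of the two levels described in (a) and (b), the following Python sketch draws one whole-dataset chart (a histogram) and one subgroup chart (a boxplot per category). It assumes pandas, NumPy, and matplotlib are installed. The DataFrame, the column names `score` and `group`, and the output filename are hypothetical placeholders, not part of the assignment.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: 200 rows with one numeric and one categorical column.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({
    "score": rng.normal(loc=70, scale=10, size=200),
    "group": rng.choice(["A", "B", "C"], size=200),
})

fig, (ax_all, ax_sub) = plt.subplots(1, 2, figsize=(10, 4))

# (a) Whole dataset: histogram of a single numeric column.
ax_all.hist(df["score"], bins=20, edgecolor="black")
ax_all.set_title("Distribution of score (all rows)")
ax_all.set_xlabel("score")
ax_all.set_ylabel("count")

# (b) Subgroups: one boxplot per value of the categorical column.
groups = sorted(df["group"].unique())
ax_sub.boxplot([df.loc[df["group"] == g, "score"].to_numpy() for g in groups])
ax_sub.set_xticks(range(1, len(groups) + 1))
ax_sub.set_xticklabels(groups)
ax_sub.set_title("score by group")
ax_sub.set_xlabel("group")
ax_sub.set_ylabel("score")

fig.tight_layout()
fig.savefig("part2_overview.png", dpi=150)  # hypothetical output filename
```

With a real dataset, the synthetic DataFrame would be replaced by something like `pd.read_csv(...)`, and the column names by the attributes chosen for the topic.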
Instructions for Part 2:
– Use Python or R. If you feel completely overwhelmed, you can fall back on Excel. (Ideally, as data scientists, you will want to be familiar with
creating visualizations in all three tools.)
– One visualization has to be a table with a title (make sure it is nicely formatted). This is typically easiest to do in Excel.
– The other visualizations should use common types of graphs (e.g. histograms, boxplots, bar charts, line graphs, pie
charts, scatter plots, stacked bar charts, correlation matrices). Do not use maps, networks, or 3D at this time. Try to use Python or R for these.
– For each non-table visualization, write a caption underneath (typically the title) and 1-2
sentences of alt text describing what the visualization shows (not redundant with the explanation).
– For each visualization (table and chart/graph), write a 1-2 sentence explanation, such as why
you chose this type of graphic, what it shows, or what new insights it reveals.
– Explain how you created the visualizations, e.g. whether you used Python or R, which packages you used, and how you used them.
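
For the required titled table, Excel is the suggested route, but a plain-text table can also be produced in Python. The sketch below uses only the standard library. The records, the helper name `summary_table`, and the table title are hypothetical and only illustrate one possible layout.

```python
from statistics import mean, median

# Hypothetical records: (group, value) pairs standing in for a real dataset.
records = [
    ("A", 12.0), ("A", 15.5), ("A", 11.0),
    ("B", 20.0), ("B", 18.5),
    ("C", 9.0), ("C", 14.0), ("C", 10.5), ("C", 13.0),
]

def summary_table(rows, title):
    """Return a titled, column-aligned text table of per-group statistics."""
    by_group = {}
    for group, value in rows:
        by_group.setdefault(group, []).append(value)

    header = f"{'Group':<8}{'N':>4}{'Mean':>10}{'Median':>10}"
    rule = "-" * len(header)
    lines = [title, rule, header, rule]
    for group in sorted(by_group):
        values = by_group[group]
        lines.append(
            f"{group:<8}{len(values):>4}{mean(values):>10.2f}{median(values):>10.2f}"
        )
    return "\n".join(lines)

table = summary_table(records, "Table 1. Value by group (hypothetical data)")
print(table)
```

The title is the first line of the output, followed by one row per subgroup, so the same table also serves as a subgroup comparison for part (b).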
