Python for Data Science: Pandas 101
[Online] Part of the DataFest workshop series. Python can be a great option for exploration, analysis, and visualization of tabular data, such as spreadsheets and CSV files, if you know which tools to use and how to get started. This workshop will take you through some practical examples of using Python, and specifically the Pandas module, to load data from files, access that data, and start visualizing it with the Pandas built-in plotting functions. You will also get some experience working in JupyterLab, the flexible programming environment which contains Jupyter Notebooks, a file browser, and more.
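To give a rough idea of what "load, access, and visualize" looks like in practice, here is a minimal sketch. It is not taken from the workshop materials: the column names and numbers are made up for illustration, and the CSV text is held in a string so the example runs without a file on disk. The plotting step assumes matplotlib is installed (it ships with Anaconda) and is skipped if it is not.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a real file (hypothetical data).
csv_text = """site,year,visitors
North,2020,1150
North,2021,1090
South,2020,1180
South,2021,1120
"""

# Load: read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Access: a single column, a filtered subset of rows, and a grouped summary.
visitors = df["visitors"]
north = df[df["site"] == "North"]
mean_by_site = df.groupby("site")["visitors"].mean()

print(df.head())
print(mean_by_site)

# Visualize: Pandas plotting is built on matplotlib, so guard against it
# being absent rather than assuming it is there.
try:
    ax = mean_by_site.plot(kind="bar", title="Mean visitors by site")
except ImportError:
    ax = None  # matplotlib is not installed; skip the plot
```

With a real file you would pass its path to `pd.read_csv` instead of the `io.StringIO` wrapper; everything after that line stays the same.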
Note: This is an introductory workshop on the Pandas module, but you’ll probably be more comfortable following along if you have at least a little bit of experience using the Python programming language, since I won’t be spending time on the language itself. (Feel free to sign up no matter what your experience level, but past students with no Python or programming experience have found it too confusing to be useful.)
Anaconda Python distribution (Individual Edition):
I strongly recommend that you install the Anaconda Python Distribution to use in class. In principle, if you have Python 3.7 or later, plus all the necessary modules, everything should work fine. That said, the Anaconda Distribution is packaged nicely, can be installed without admin privileges, and comes with everything you’ll need. If you have another version of Python already installed and you’re going to install Anaconda, it’s best to uninstall the other version first. Having multiple versions of Python installed on one machine can get messy.
Go to the link above, hit Download, and choose the version for your operating system. I recommend installing just for “yourself”, not for all users of the machine, since that way everything is installed in your Users/username folder and no admin privileges are required.
If you’re on a Mac and aren’t comfortable with shell scripts on the command line, choose the Graphical Installer.
On Windows, I would choose the 64-bit installer, unless you know you’re still running a 32-bit version of Windows on an older machine.
If you’re sticking with your non-Anaconda version of Python, make sure you have JupyterLab, Pandas, and all of their respective dependencies installed.
Please try to launch Python and JupyterLab before class to make sure they’re working! JupyterLab can be started from the Anaconda Navigator application, or from the Anaconda Prompt (Windows) or a Terminal (Mac) by typing (without quotes) “jupyter lab” and hitting return. From a Python notebook or an interactive Python prompt, you can test out the main modules you’ll need by typing this and executing the code cell:
import pandas as pd
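If you want a slightly more thorough check than the single import above, something like the following sketch reports your Python version and whether each module can be imported. The list of module names is my assumption about what is useful to check (matplotlib is included because the Pandas plotting functions rely on it), so adjust it as needed.

```python
import importlib
import sys

# Report the running Python version, e.g. "3.9.7".
print("Python", sys.version.split()[0])

# Module names to check; this list is an assumption, edit it freely.
modules_to_check = ("pandas", "jupyterlab", "matplotlib")

status = {}
for name in modules_to_check:
    try:
        module = importlib.import_module(name)
        # Most packages expose __version__, but fall back if one does not.
        status[name] = getattr(module, "__version__", "installed (version unknown)")
    except ImportError:
        status[name] = "NOT installed"

for name, result in status.items():
    print(f"{name}: {result}")
```

Any line reading "NOT installed" points to a module you should install before class.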
This event is offered virtually. A Zoom link to join the workshop will be sent via email to registered participants.
The content of the workshop may be recorded. If you are uncomfortable with a recording being published, please contact the instructor at any time prior to the conclusion of the workshop.
Data Visualization, Data Science
- Tuesday, March 15, 2022
- 10:00am - 12:00pm
- Data and Visualization