Data analysis

Data analysis covers a broad set of activities for cleaning, processing and transforming a data collection in order to learn from it. Python is commonly used for data analysis because many tools, such as Jupyter Notebook, pandas and Bokeh, are written in Python and can be applied quickly, which saves you from coding your own data analysis libraries from scratch.
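
To make the clean, process and transform steps a bit more concrete, here is a minimal sketch that uses only the Python standard library on a tiny made-up data set. The records, the field names and the `mean_temp_by_city` helper are all hypothetical and exist only for illustration. In practice a library such as pandas would likely handle the same steps with far less code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records: some values are missing or inconsistently formatted.
RAW_ROWS = [
    {"city": " Austin", "temp_f": "91"},
    {"city": "austin", "temp_f": ""},
    {"city": "Boston ", "temp_f": "72"},
    {"city": "boston", "temp_f": "68"},
]


def mean_temp_by_city(rows):
    """Clean the rows, convert types, and return the mean temperature per city."""
    grouped = defaultdict(list)
    for row in rows:
        # Clean: skip rows that have no temperature reading.
        if not row["temp_f"].strip():
            continue
        # Process: normalize the city name and convert the reading to a number.
        city = row["city"].strip().title()
        grouped[city].append(float(row["temp_f"]))
    # Transform: collapse each group of readings into one summary value.
    return {city: mean(temps) for city, temps in grouped.items()}


result = mean_temp_by_city(RAW_ROWS)
print(result)
```

Writing even this small amount of cleanup by hand gets tedious quickly, which is a large part of why existing analysis libraries are usually the better starting point.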

Data analysis resources

  • The following series on data exploration uses Python as the implementation language while walking through various stages of how to analyze a data set.

    • Part 1 covers how to think about your data and how to clarify what you want to learn from it.
    • Part 2 explains how to categorize a data set and transform it into one that is easier to analyze.
    • Part 3 shows how to visualize the results of your data exploration.
  • The Python Data Science Handbook is available to read for free online, although I also recommend buying the book as it is a great resource for learning the topic.

  • PyData TV contains all the videos from the PyData conference series. The conference talks are often given by professional data scientists and the developers who write these analysis libraries, so there is a wealth of information not necessarily captured anywhere else.




Matt Makai 2012-2017