- Hands-On Python Deep Learning for the Web
- Anubhav Singh Sayak Paul
- 2021-06-24 16:23:30
Pandas
Built on top of NumPy, pandas is one of the most widely used Python libraries for data science. It provides high-performance data structures and data-analysis methods. Its central object is the DataFrame, an in-memory two-dimensional table whose columns are each a one-dimensional, array-like structure called a Series.
Each DataFrame in pandas takes the form of a spreadsheet-like table with row labels and column headers. Operations can be carried out on rows, on columns, or on both together. Pandas also integrates closely with matplotlib, so data can be plotted directly from a DataFrame, which is useful when preparing presentations or during exploratory data analysis.
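As a minimal sketch of row-based and column-based operations, the following example uses a small, made-up table of scores (the names `scores`, `math`, and `physics` are illustrative assumptions, not part of the original text):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
scores = pd.DataFrame(
    {"math": [90, 70, 80], "physics": [85, 65, 95]},
    index=["Ram", "Sam", "Max"],
)

# Column-based operation: the mean of each column (axis=0).
column_means = scores.mean(axis=0)

# Row-based operation: the total across columns for each row (axis=1).
row_totals = scores.sum(axis=1)

# Both together: filter rows by a condition and pick columns by name.
high_math = scores.loc[scores["math"] > 75, ["physics"]]

print(column_means)
print(row_totals)
print(high_math)
```

Here `axis=0` aggregates down each column and `axis=1` aggregates across each row, while `.loc` combines a row filter with a column selection in one step.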
To import pandas into a Python project, use the following line of code:
import pandas as pd
Here, pd is the conventional alias under which pandas is imported.
Pandas provides the following data structures:
- Series: One-dimensional array or vector, similar to a column in a table
- DataFrames: Two-dimensional table, with table headers and labels for the rows
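- Panels: A dictionary of DataFrames, much like a MySQL database that contains several tables. Note that Panel was deprecated in pandas 0.20 and removed in pandas 0.25, so it is not available in current releases.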
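The relationship between the first two structures can be sketched as follows. This is an illustrative example only; the variable names are assumptions chosen for the sketch:

```python
import pandas as pd

# A Series is a labeled one-dimensional array.
weights = pd.Series([60, 80, 100], index=["Ram", "Sam", "Max"], name="weight")

# Values can be looked up by label.
sam_weight = weights["Sam"]

# A Series can be turned into a one-column DataFrame...
frame = weights.to_frame()

# ...and selecting a single column of a DataFrame gives back a Series.
column = frame["weight"]

print(sam_weight)
print(type(column).__name__)
```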
A pandas Series can be created with the pd.Series() constructor, and a DataFrame with the pd.DataFrame() constructor. For example, the following code builds a DataFrame from several Series objects:
import pandas as pd

employees = pd.DataFrame({
    "weight": pd.Series([60, 80, 100], index=["Ram", "Sam", "Max"]),
    "dob": pd.Series([1990, 1970, 1991], index=["Ram", "Max", "Sam"], name="year"),
    "hobby": pd.Series(["Reading", "Singing"], index=["Ram", "Max"]),
})
employees
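The output of the preceding code is a table with one row each for Ram, Sam, and Max, and the columns weight, dob, and hobby. Values are matched by index label rather than by position, so each person gets the correct dob even though the series list the names in different orders. Sam has no entry in the hobby series, so that cell is shown as NaN.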
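One way to confirm how the series were aligned is to look up individual cells by label. The following is a small sketch that rebuilds the same DataFrame and checks a few values; the variable names holding the results are illustrative:

```python
import pandas as pd

employees = pd.DataFrame({
    "weight": pd.Series([60, 80, 100], index=["Ram", "Sam", "Max"]),
    "dob": pd.Series([1990, 1970, 1991], index=["Ram", "Max", "Sam"], name="year"),
    "hobby": pd.Series(["Reading", "Singing"], index=["Ram", "Max"]),
})

# Look up cells by row label and column name.
max_weight = employees.loc["Max", "weight"]
sam_dob = employees.loc["Sam", "dob"]

# Sam has no hobby entry, so the cell is a missing value.
sam_hobby_missing = pd.isna(employees.loc["Sam", "hobby"])

# The column names come from the dictionary keys, so the
# name="year" given to the dob series does not appear.
column_names = list(employees.columns)

print(max_weight, sam_dob, sam_hobby_missing, column_names)
```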
Some of the most important methods available for a pandas DataFrame are as follows:
- head(n) or tail(n): To display the top or bottom n rows of the DataFrame.
- info(): To display a summary of the DataFrame, including its dimensions, the column names, the number of non-null values in each column, and each column's data type.
- describe(): To display summary statistics (such as count, mean, standard deviation, minimum, and maximum) for each column in the DataFrame. By default, non-numeric columns are omitted; passing include="all" includes them.
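The methods listed above can be tried out as in the following sketch. It uses a small DataFrame similar to the earlier employees example, built here from plain lists so that the row order is explicit; the result variable names are assumptions made for illustration:

```python
import io

import pandas as pd

employees = pd.DataFrame(
    {
        "weight": [60, 80, 100],
        "dob": [1990, 1991, 1970],
        "hobby": ["Reading", None, "Singing"],
    },
    index=["Ram", "Sam", "Max"],
)

# head(n) and tail(n) return the first and last n rows.
first_two = employees.head(2)
last_one = employees.tail(1)

# info() prints to standard output by default; a buffer is used
# here so that the text can be captured as a string.
buffer = io.StringIO()
employees.info(buf=buffer)
info_text = buffer.getvalue()

# describe() summarizes only the numeric columns by default.
summary = employees.describe()

# include="all" also summarizes non-numeric columns such as hobby.
full_summary = employees.describe(include="all")

print(first_two)
print(last_one)
print(info_text)
print(summary)
print(full_summary)
```

In an interactive session, calling employees.info() directly is usually enough; the buffer is only needed when the text has to be stored or logged.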