What Are Pandas Objects?

Author: Mansoor Ahmed
Posted: Jan 26, 2021

Description

Pandas objects can be thought of as enhanced versions of NumPy structured arrays in which the rows and columns are identified with labels rather than simple integer indices. They come equipped with a set of common mathematical and statistical methods. On top of the basic data structures, Pandas offers a host of useful tools, methods, and functionality. There are three fundamental Pandas data structures:

  1. Series

  2. DataFrame

  3. Index

We start our code sessions with the standard NumPy and Pandas imports:

import numpy as np

import pandas as pd

Most of the mathematical and statistical methods fall into the category of reductions or summary statistics. These methods extract a single value (such as the sum or mean) from a Series, or a Series of values from the rows or columns of a DataFrame. Compared with the equivalent methods of vanilla NumPy arrays, they are all built from the ground up to exclude missing data.
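As a minimal sketch of what excluding missing data means in practice, the following compares a Pandas reduction with its NumPy counterpart on the same values. The variable names and sample numbers are made up for illustration.

```python
import numpy as np
import pandas as pd

values = [1.0, 2.0, np.nan, 4.0]

# Pandas reductions skip NaN by default (skipna=True).
series_sum = pd.Series(values).sum()      # 7.0

# The plain NumPy sum propagates NaN instead.
array_sum = np.array(values).sum()        # nan

# On a DataFrame, a reduction returns a Series of values.
df = pd.DataFrame({'a': [1.0, np.nan, 3.0],
                   'b': [4.0, 5.0, 6.0]})
column_means = df.mean()                  # one value per column
row_sums = df.sum(axis=1)                 # one value per row
```

Passing skipna=False to a Pandas reduction restores the NumPy-style behavior, where a single missing value makes the result NaN.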

Pandas Series

A Pandas Series is a one-dimensional array of indexed data. It can be created from a list or array as follows:

data = pd.Series([0.25, 0.5, 0.75, 1.0])

data

Displaying the Series shows that it wraps both a sequence of values and a sequence of indices. We can access them with the values and index attributes. The values are just a familiar NumPy array:

data.values

The index, by contrast, is an array-like object of type pd.Index:

data.index
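A short sketch that checks the types of the two attributes may make the distinction more concrete. The helper variable names below are chosen for illustration only.

```python
import numpy as np
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0])

values = data.values   # the underlying NumPy array
index = data.index     # the default integer index

values_is_ndarray = isinstance(values, np.ndarray)   # True
index_is_pd_index = isinstance(index, pd.Index)      # True
index_labels = list(index)                           # [0, 1, 2, 3]
```

When no index is supplied, Pandas builds a default integer index starting at 0, which is why the labels here match the positions.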

As with a NumPy array, data can be accessed by the associated index via the familiar Python square-bracket notation:

data[1]

data[1:3]
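To show what these two expressions return, here is a small sketch. The use of iloc at the end is an addition for comparison and does not appear in the original text.

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0])

single = data[1]      # a scalar: 0.5
window = data[1:3]    # a Series holding 0.5 and 0.75, labels 1 and 2

# iloc makes the position-based intent of the slice explicit.
window_explicit = data.iloc[1:3]
```

Note that a slice returns a new Series that keeps the original labels, rather than renumbering from 0.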

Series as Generalized NumPy Array

From what we have seen so far, the Series object may look essentially interchangeable with a one-dimensional NumPy array. The essential difference is the presence of the index: while the NumPy array has an implicitly defined integer index used to access the values, the Pandas Series has an explicitly defined index associated with the values. This explicit index definition gives the Series object additional capabilities. For instance, the index need not be an integer; it can consist of values of any desired type. We can use strings as an index:

data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])

data
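With an explicit index, item access works through the labels. The sketch below also builds a second Series with non-contiguous integer labels; that second example and its name (sparse) are illustrative additions.

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0],
                 index=['a', 'b', 'c', 'd'])
b_value = data['b']        # 0.5, looked up by string label

# Labels may also be non-contiguous, non-sequential integers.
sparse = pd.Series([0.25, 0.5, 0.75, 1.0],
                   index=[2, 5, 3, 7])
five_value = sparse[5]     # 0.5, looked up by label 5, not position 5
```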

Series as Specialized Dictionary

We can think of a Pandas Series as a specialization of a Python dictionary. A dictionary is a structure that maps arbitrary keys to a set of arbitrary values, and a Series is a structure that maps typed keys to a set of typed values. This typing is important: just as the type-specific compiled code behind a NumPy array makes it more efficient than a Python list for certain operations, the type information of a Pandas Series makes it much more efficient than a Python dictionary for certain operations. The Series-as-dictionary analogy can be made even clearer by constructing a Series object directly from a Python dictionary:

population_dict = {'California': 38332521,
                   'Texas': 26448193,
                   'New York': 19651127,
                   'Florida': 19552860,
                   'Illinois': 12882135}

population = pd.Series(population_dict)

population
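The sketch below shows dictionary-style lookup on the resulting Series, along with label-based slicing, which a plain dictionary does not support. It assumes a recent Pandas version, where the Series keeps the insertion order of the dictionary keys.

```python
import pandas as pd

population_dict = {'California': 38332521,
                   'Texas': 26448193,
                   'New York': 19651127,
                   'Florida': 19552860,
                   'Illinois': 12882135}
population = pd.Series(population_dict)

# Dictionary-style lookup by key.
california = population['California']      # 38332521

# Unlike a dict, a Series supports slicing by label.
# Label slices include both endpoints.
middle = population['Texas':'Florida']
middle_states = list(middle.index)         # ['Texas', 'New York', 'Florida']
```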

The Pandas Series Object

To recap, the Pandas Series is a one-dimensional array of indexed data. It can be created from a list or array, and displaying it shows the index alongside the values:

data = pd.Series([0.25, 0.5, 0.75, 1.0])

data

0    0.25
1    0.50
2    0.75
3    1.00
dtype: float64

About the Author

Mansoor Ahmed, Chemical Engineer and Web Developer
