Salesforce Data Analysis in Python: Analyzing Opportunities


In the last article, we set up our project with the simple-salesforce Python package. Now we can combine it with pandas to do some data analysis on the Salesforce Opportunity object.

Let’s look at a few examples that will help you get more out of your Salesforce data. We will go over some SOQL queries and some pandas as well.

Let’s get started with an example problem: you want the percentage growth in closed won Opportunities compared with last month. What would the SOQL look like? We need two result sets: all closed won Opportunities in the current month, and all closed won Opportunities from the previous month. The only field we need is Amount, but let’s pull in a few more for this example.
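As a minimal sketch of what such a query could look like, the helper below builds the SOQL string for one calendar month. The function name, the extra fields (Id, Name, CloseDate), the use of the standard IsWon field to mean "closed won", and the example year and month are all assumptions made for illustration.

```python
def won_opportunities_query(year: int, month: int) -> str:
    """Build a SOQL query for won Opportunities closed in one calendar month."""
    # int() guards against interpolating anything that is not a number.
    return (
        "SELECT Id, Name, Amount, CloseDate "
        "FROM Opportunity "
        "WHERE IsWon = true "
        f"AND CALENDAR_YEAR(CloseDate) = {int(year)} "
        f"AND CALENDAR_MONTH(CloseDate) = {int(month)}"
    )


# Example values only: January 2021 stands in for "last month".
last_month_query = won_opportunities_query(2021, 1)
print(last_month_query)
```

If your org marks won deals differently, a filter on StageName may fit better than IsWon.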

A query for this can use the SOQL date functions CALENDAR_YEAR and CALENDAR_MONTH to select Opportunities within a given year and month. We can then put the results in a DataFrame using the code from Part 1.
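Here is a rough sketch of that step. With simple-salesforce, query_all returns a dict whose "records" list holds one dict per row, each with an extra "attributes" entry. The helper name and the sample data below are hypothetical; the sample only imitates the shape of a response so the snippet runs without a connection.

```python
import pandas as pd


def records_to_frame(result: dict) -> pd.DataFrame:
    """Turn a simple-salesforce query result into a DataFrame."""
    df = pd.DataFrame(result["records"])
    # Each record carries an 'attributes' dict (object type, URL); drop it if present.
    return df.drop(columns="attributes", errors="ignore")


# With a live connection this would be: result = sf.query_all(last_month_query)
# Hypothetical sample shaped like the API response:
sample_result = {
    "totalSize": 2,
    "done": True,
    "records": [
        {"attributes": {"type": "Opportunity"}, "Id": "006A",
         "Name": "Acme renewal", "Amount": 1000.0, "CloseDate": "2021-01-15"},
        {"attributes": {"type": "Opportunity"}, "Id": "006B",
         "Name": "Globex upgrade", "Amount": 2500.0, "CloseDate": "2021-01-28"},
    ],
}

last_month_df = records_to_frame(sample_result)
print(last_month_df)
```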

Now we can use pandas to calculate the sum of the Amount column.
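For example, assuming the DataFrame is called last_month_df (a name chosen here for illustration, with made-up amounts):

```python
import pandas as pd

# Made-up amounts; None stands in for an Opportunity with no Amount set.
last_month_df = pd.DataFrame({"Amount": [1000.0, 2500.0, None]})

# Series.sum skips missing values by default.
last_month_total = last_month_df["Amount"].sum()
print(last_month_total)  # 3500.0
```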

Repeat for this month, making sure to change the month in the SOQL query:

Now we can get the percentage change:
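A small sketch of the calculation, using two made-up monthly totals; the function and variable names are placeholders rather than anything from Part 1.

```python
def percent_change(previous: float, current: float) -> float:
    """Percentage change from previous to current."""
    if previous == 0:
        raise ValueError("previous total is zero, so growth is undefined")
    return (current - previous) / previous * 100


# Made-up totals for illustration.
last_month_total = 3500.0
this_month_total = 4200.0

growth = percent_change(last_month_total, this_month_total)
print(f"{growth:.1f}%")  # 20.0%
```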

This approach gets inefficient if you want detailed analysis across many months or years, so let’s look at an easier way for when all you care about is the total amount for each month.

SOQL has a SUM aggregate function that can total the Amount field for us, so the query returns only the data we need. We can also group by year and then by month. The rest of this article works through a more complex SOQL query and how we can use pandas to analyze growth.
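One way such a query might be written is sketched below. The alias names (closeYear, closeMonth, totalAmount) are our own choice, and "closed after 2020" is read here as CloseDate later than 2020-12-31; adjust the date if you meant something else.

```python
# Aggregate query: one row per (year, month) with the summed Amount.
MONTHLY_TOTALS_QUERY = """
SELECT CALENDAR_YEAR(CloseDate) closeYear, CALENDAR_MONTH(CloseDate) closeMonth, SUM(Amount) totalAmount
FROM Opportunity
WHERE CloseDate > 2020-12-31
AND IsWon = true
GROUP BY CALENDAR_YEAR(CloseDate), CALENDAR_MONTH(CloseDate)
""".strip()

print(MONTHLY_TOTALS_QUERY)
```

Note that SOQL date literals such as 2020-12-31 are written without quotes.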


That’s a lot. Let’s break it down line by line.

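The SELECT line uses SOQL field aliases, which unfortunately only work in aggregate queries, that is, queries that use GROUP BY or functions such as SUM. (CALENDAR_YEAR is a date function rather than an aggregate, but it can be aliased here because the query is grouped.) The alias simply follows the expression; there is no AS keyword as in vanilla SQL. We want to pull in the year, the month, and the sum of the Amount of Opportunities.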

The next two lines, the WHERE conditions, are straightforward: we want Opportunities that closed after 2020 and have been won.

The GROUP BY clause is what does the magic: we group by year, then by month. The CALENDAR_XXXX functions in the SELECT only work when the same expressions are grouped as well. Now we can put the data into a DataFrame using the approach from the last article (or just scroll up).
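As a sketch, the aggregate rows load the same way as ordinary records: each row is a dict keyed by the aliases used in the query. The alias names and the sample figures below are assumptions, and the sample only imitates a response so the snippet runs offline.

```python
import pandas as pd

# With a live connection: result = sf.query_all(<the aggregate query>)
# Hypothetical sample shaped like an aggregate response (rows in no particular order):
sample_result = {
    "records": [
        {"attributes": {"type": "AggregateResult"},
         "closeYear": 2021, "closeMonth": 2, "totalAmount": 4200.0},
        {"attributes": {"type": "AggregateResult"},
         "closeYear": 2021, "closeMonth": 1, "totalAmount": 3500.0},
        {"attributes": {"type": "AggregateResult"},
         "closeYear": 2021, "closeMonth": 3, "totalAmount": 3780.0},
    ]
}

# Drop the per-row 'attributes' metadata and keep only the aliased columns.
monthly_df = pd.DataFrame(sample_result["records"]).drop(columns="attributes")
print(monthly_df)
```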

We can also change the index of the resulting DataFrame to use the year and the month as the index.
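A short sketch with made-up data; the column names are the assumed query aliases, not something Salesforce dictates.

```python
import pandas as pd

# Made-up monthly totals using the assumed alias names as columns.
monthly_df = pd.DataFrame({
    "closeYear": [2021, 2021, 2021],
    "closeMonth": [2, 1, 3],
    "totalAmount": [4200.0, 3500.0, 3780.0],
})

# Passing a list of columns to set_index builds a two-level (year, month) index.
monthly_df = monthly_df.set_index(["closeYear", "closeMonth"])
print(monthly_df)
```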

Now we can sort it by the index to have the data ordered by time.
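For instance, starting from a deliberately unordered, made-up DataFrame with a (year, month) index:

```python
import pandas as pd

# Made-up totals, deliberately out of order.
monthly_df = pd.DataFrame({
    "closeYear": [2021, 2021, 2021],
    "closeMonth": [2, 1, 3],
    "totalAmount": [4200.0, 3500.0, 3780.0],
}).set_index(["closeYear", "closeMonth"])

# sort_index orders by the first index level (year), then the second (month).
monthly_df = monthly_df.sort_index()
print(monthly_df)
```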

Getting the percent change from the previous period in pandas is really simple: just use pct_change.
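A sketch on made-up, already sorted monthly totals; the column and index names are assumptions carried over from the query aliases.

```python
import pandas as pd

# Made-up monthly totals, already sorted by (year, month).
monthly_df = pd.DataFrame(
    {"totalAmount": [3500.0, 4200.0, 3780.0]},
    index=pd.MultiIndex.from_tuples(
        [(2021, 1), (2021, 2), (2021, 3)], names=["closeYear", "closeMonth"]
    ),
)

# pct_change compares each row with the one before it, as a fraction
# (0.2 means +20%). The first row has no predecessor, so it is NaN.
monthly_df["growth"] = monthly_df["totalAmount"].pct_change()
print(monthly_df)
```

One caveat: pct_change compares adjacent rows, not adjacent calendar months. A month with no won Opportunities will not appear in the grouped result at all, so the row after the gap is compared with the last month that had data.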

We have been able to pull in data from Salesforce and put it into a DataFrame for further analysis, which can be powerful if your firm uses Salesforce heavily.

Full code:
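The original listing is not reproduced here, so what follows is a consolidated sketch of the steps above rather than the exact code. The alias names, the date cutoff, and the function name are assumptions; sf is expected to be a connected simple-salesforce client as set up in the last article.

```python
import pandas as pd

# Assumed aliases and cutoff date; adjust to your own needs.
MONTHLY_TOTALS_QUERY = (
    "SELECT CALENDAR_YEAR(CloseDate) closeYear, "
    "CALENDAR_MONTH(CloseDate) closeMonth, SUM(Amount) totalAmount "
    "FROM Opportunity "
    "WHERE CloseDate > 2020-12-31 AND IsWon = true "
    "GROUP BY CALENDAR_YEAR(CloseDate), CALENDAR_MONTH(CloseDate)"
)


def monthly_growth(sf) -> pd.DataFrame:
    """Return monthly won totals and month-over-month growth, ordered by time."""
    result = sf.query_all(MONTHLY_TOTALS_QUERY)
    df = pd.DataFrame(result["records"]).drop(columns="attributes", errors="ignore")
    df = df.set_index(["closeYear", "closeMonth"]).sort_index()
    df["growth"] = df["totalAmount"].pct_change()
    return df


# Usage with a live connection (credentials are placeholders):
# from simple_salesforce import Salesforce
# sf = Salesforce(username="...", password="...", security_token="...")
# print(monthly_growth(sf))
```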
