Methodology

Ad Observatory is a project of NYU Cybersecurity for Democracy (C4D).

C4D collects data from a variety of sources and applies machine learning, topic modeling, and other types of tools to develop messaging insights on Ad Observatory. Data sources include:

The Meta Ad Library contains a comprehensive, searchable collection of ads running across Meta technologies. Ads about “social issues, elections or politics” (political ads) can be identified by their "Paid for by" disclaimer, which states who paid for the ad. See Meta’s documentation of its definition of these ads.

Meta Ad Library Reports contain weekly summary information about social issues, elections, and politics ads across Meta's sites.

C4D researchers provide analysis and modeling of these data, so Ad Observatory users can see patterns in digital advertising: who is spending money on ads, how much they are spending, what topics they are focusing on, and what types of ads they are running (donate, buy, show up, etc.).

Keyword searches

You can use connectors for complex searches: “+” for AND, “|” for OR, and “-” for NOT. Without connectors, the default is “AND.” So,

vote voting

will return all search results containing both the words “vote” and “voting.”

Connectors:

+ for AND
| for OR
- for NOT

Examples:  

Vote + ballot 

returns results containing the words “Vote” AND “ballot”

Vote | voting

returns results containing the words “Vote” OR “voting”

Vote - Biden

returns results containing the word “Vote” but NOT the word “Biden”

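Searches can be further refined by grouping terms with parentheses:

(Vote | Voting) (#2022Elections | #Democrat | #Republican | #Biden | #Trump)

Because the two groups are joined by the default “AND,” this returns results containing “Vote” OR “Voting” together with at least one of the listed hashtags.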
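The connector rules above can be illustrated with a small query evaluator. This is a minimal sketch, not Ad Observatory's search implementation: the function name `matches`, the whole-word and case-insensitive matching, and the operator precedence (AND binding tighter than OR, “-” negating the term or group that follows it) are all assumptions made for illustration.

```python
import re

# Connectors and parentheses are single-character tokens; anything else is a term.
TOKEN = re.compile(r"[()+|-]|[^\s()+|-]+")


def matches(query: str, text: str) -> bool:
    """Return True if `text` satisfies `query` (hypothetical helper)."""
    tokens = TOKEN.findall(query)
    # Assumption: terms match whole words (or hashtags), ignoring case.
    words = set(re.findall(r"[#\w]+", text.lower()))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():
        result = parse_and()
        while peek() == "|":
            take()
            right = parse_and()  # always parse, so tokens are consumed
            result = result or right
        return result

    def parse_and():
        result = parse_unary()
        # Adjacent terms with no connector default to AND.
        while peek() not in (None, "|", ")"):
            if peek() == "+":
                take()
            right = parse_unary()
            result = result and right
        return result

    def parse_unary():
        tok = take()
        if tok == "-":
            return not parse_unary()
        if tok == "(":
            inner = parse_or()
            take()  # consume the closing ")"
            return inner
        return tok.lower() in words

    return parse_or()


print(matches("vote voting", "Vote early, voting is open"))
print(matches("Vote - Biden", "Vote for Biden"))
print(matches("(Vote | Voting) (#Biden | #Trump)", "Voting matters #trump"))
```

The sketch does not validate input, so an empty or malformed query (for example, an unclosed parenthesis) raises an `IndexError` rather than returning a result.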

Spending estimates

Meta reports broad spend ranges for individual ads rather than exact numbers. To estimate the spend of particular ads, C4D researchers combine multiple sources of data.

  • Every day, C4D collects data from the Meta Ad Library for all ads, going as far back in history as possible. In addition to other details about the ad, the ad’s start date and its current status (active or inactive) are recorded. C4D infers each ad’s effective start and stop dates from these data.
  • C4D also collects the Meta Ad Library Report daily. This report lists, for each page and disclosure string combination, the amount spent on that combination over the history of the Ad Library. To get the most accurate estimate of how much an advertiser spent on a particular day, C4D subtracts the amount spent up to the prior day from the amount spent up to the day being estimated.
  • Also daily, C4D distributes the amount spent for the most recent day in the most recent Ad Library Report among the ads C4D knows were active on that day. C4D divides that amount among the active ads in proportion to those ads' reported minimum spends and adds each ad's share to its spend estimate. When C4D presents spend data over time, it estimates an ad's spend per day as the ad's total estimated spend divided by the number of days the ad was active.
  • Sometimes, the lifetime spend reported in the Ad Library Report for an advertiser appears to decrease day over day. Meta representatives have told the C4D team that this is because spend attributed to ads that ran on a given day can take up to five days to settle. This negative delta is sometimes small, but at times it can be quite large. In other cases, a negative delta appears to correct an incorrectly reported large increase in spend on the previous day. To manage these apparently erroneous spikes and dips, when C4D observes a large single-day swing in either direction followed by a correction in the other direction on the subsequent day, C4D smooths the spend reported on the day of the correction over the period of the spike or dip.
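The daily delta and proportional distribution steps described above can be sketched as follows. This is an illustration under stated assumptions, not C4D's code: the function names, the ad identifiers, and the equal-split fallback when every reported minimum spend is zero are all hypothetical, and the smoothing of spikes and dips is not covered.

```python
def daily_advertiser_spend(lifetime_today: float, lifetime_prior_day: float) -> float:
    """One day's spend: the difference between two cumulative report totals."""
    return lifetime_today - lifetime_prior_day


def distribute_spend(day_spend: float, min_spend_by_ad: dict) -> dict:
    """Split one day's spend across active ads, weighted by reported minimum spend."""
    if not min_spend_by_ad:
        return {}
    total_min = sum(min_spend_by_ad.values())
    if total_min == 0:
        # Assumption: fall back to an equal split when every minimum is zero.
        equal_share = day_spend / len(min_spend_by_ad)
        return {ad_id: equal_share for ad_id in min_spend_by_ad}
    return {
        ad_id: day_spend * minimum / total_min
        for ad_id, minimum in min_spend_by_ad.items()
    }


def spend_per_day(total_estimated_spend: float, days_active: int) -> float:
    """Average daily spend used when presenting an ad's spend over time."""
    return total_estimated_spend / days_active


# Hypothetical figures: lifetime spend rose from $1,200 to $1,500 in one day,
# and two ads with minimum spends of $100 and $200 were active that day.
day_spend = daily_advertiser_spend(1500.0, 1200.0)
shares = distribute_spend(day_spend, {"ad_a": 100.0, "ad_b": 200.0})
print(day_spend, shares)
```

With these made-up figures, the $300 delta is split one third to `ad_a` and two thirds to `ad_b`, mirroring the ratio of their minimum spends.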

Spending by region

C4D determines spend and impressions by region by taking the reported regional distribution for each ad and assigning values accordingly. For example, if an ad has a regional distribution of 20 percent for New York and 80 percent for California, a total of $1,000 was spent, and it earned 1,000 impressions, then $200 and 200 impressions would be reported for New York, and $800 and 800 impressions for California.
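The regional allocation in this example amounts to multiplying each total by the region's share. A minimal sketch, assuming a hypothetical function name and a distribution expressed in percent:

```python
def split_by_region(total: float, percent_by_region: dict) -> dict:
    """Assign a total (spend or impressions) to regions by percentage share."""
    return {
        region: total * percent / 100
        for region, percent in percent_by_region.items()
    }


distribution = {"New York": 20, "California": 80}
print(split_by_region(1000, distribution))  # spend in dollars
print(split_by_region(1000, distribution))  # impressions
```

The same function serves for both spend and impressions because both are allocated with the same reported distribution.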

For a small number of ad sponsors, Ad Observatory groups ads from multiple pages that are controlled by the same ad sponsor and generally disclose their ads as paid for by the same entity. Currently this is the case only for 2020 presidential candidates Joe Biden and Donald Trump, because these were the only major political spenders C4D observed engaging in this behavior. For example, the Joe Biden campaign was the declared payer on ads on both the Joe Biden and Kamala Harris Facebook pages, as well as many others. Therefore, on Ad Observatory, the Joe Biden sponsor view includes ads that ran on the Facebook Pages of Kamala Harris, Wisconsin for Biden, Biden Harris for WI, etc.

Ad type classification

C4D classifies ad types based on analysis of the text, if it exists, contained in outbound links and in button text (buttons that, when clicked, take the user to another site or action, such as providing contact information or payments).

“Show up” represents ads asking users to go somewhere in the physical world. 

“Donate” refers to ads asking for financial contributions.

“Connect” ads ask users for contact information.

“Buy” asks users to purchase merchandise or services. 

“Persuade” is the type given to ads that do not fall into the other categories.
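One way to picture this kind of classification is a keyword lookup over button and link text with “Persuade” as the fallback. The sketch below is illustrative only: the keyword lists, the order in which types are checked, and the function name are hypothetical and are not C4D's actual rules.

```python
# Hypothetical keyword lists, for illustration only; C4D's real rules are not published here.
HYPOTHETICAL_KEYWORDS = {
    "donate": ("donate", "contribute"),
    "buy": ("shop", "buy"),
    "show up": ("rsvp", "get directions"),
    "connect": ("sign up", "subscribe"),
}


def classify_ad_type(button_text: str, link_text: str) -> str:
    """Return an ad type based on button and outbound link text."""
    combined = f"{button_text} {link_text}".lower()
    # Assumption: the first matching type wins, in dictionary order.
    for ad_type, keywords in HYPOTHETICAL_KEYWORDS.items():
        if any(keyword in combined for keyword in keywords):
            return ad_type
    # Ads that match no other category are typed "persuade".
    return "persuade"


print(classify_ad_type("Donate Now", "https://example.org/donate"))
print(classify_ad_type("Learn More", ""))
```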

Candidate, party, and PAC information

C4D relies on open data to identify candidates, party committees and Political Action Committees (PACs).

Partisanship lean

C4D relies on Open Secrets coding of “partisan lean” (left, right, all) for ad sponsors that are candidates, parties, and PACs. For corporations, C4D looks at the political donations made by the company’s PAC. If 75 percent or more of those contributions go to members of left-leaning parties, the ad sponsor is coded as “left”; if 75 percent or more go to members of right-leaning parties, the ad sponsor is coded as “right.”
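The 75 percent rule for corporations can be written as a simple threshold check. This is a sketch under assumptions: the function name and parameters are hypothetical, and returning `None` when neither side reaches the threshold (or when there are no contributions) is a placeholder, since the text does not say how such sponsors are coded.

```python
def code_corporate_lean(left_total: float, right_total: float, other_total: float = 0.0):
    """Code a corporate sponsor's lean from its PAC's contribution totals."""
    total = left_total + right_total + other_total
    if total <= 0:
        return None  # Assumption: no contributions means no coding.
    if left_total >= 0.75 * total:
        return "left"
    if right_total >= 0.75 * total:
        return "right"
    return None  # Assumption: below the threshold on both sides.


print(code_corporate_lean(75_000, 25_000))
print(code_corporate_lean(10_000, 90_000))
print(code_corporate_lean(50_000, 50_000))
```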

Topics

Topics come from a topic model developed by C4D researchers, who classify ads into one or more topics based on their text content. Topic totals may not add up to overall totals because a single ad campaign may be classified with multiple topics.
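A small made-up example shows why per-topic counts can exceed the number of ads. The ad identifiers and topic names below are hypothetical placeholders, not real Ad Observatory data.

```python
from collections import Counter

# Hypothetical ads, each labeled with one or more topics.
ad_topics = {
    "ad_1": ["topic_a", "topic_b"],
    "ad_2": ["topic_a"],
    "ad_3": ["topic_b", "topic_c"],
}


def topic_counts(topics_by_ad: dict) -> Counter:
    """Count how many ads carry each topic; a multi-topic ad is counted once per topic."""
    return Counter(
        topic for topics in topics_by_ad.values() for topic in set(topics)
    )


counts = topic_counts(ad_topics)
print(len(ad_topics), sum(counts.values()))
```

Here three ads produce five topic assignments, so summing across topics overstates the number of ads.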