Tracking the 2020 Democratic Field
The 2020 field is large. As of January 4, 2019, we are tracking 25 potential Democratic candidates, using search-based sampling every two minutes.

We could add some other obvious candidates (e.g. Bloomberg and Steyer), but we’ve postponed including them until their participation is more certain.
We added the candidates on a rolling basis, so the numbers should be taken with a grain of salt, especially for “second tier” candidates.[1]
Understanding the Baseline
From the beginning, Kamala Harris has dominated the “baseline” state of the race. Looking at the period ending December 23, she has consistently been the most mentioned candidate in the race.[2]
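
The weekly bucketing behind these counts (weeks beginning on Sunday, per the notes) can be sketched roughly as follows. This is only an illustration, not the pipeline used here: the names `week_start` and `weekly_mentions` are hypothetical, and the input is assumed to be a list of `(candidate, date)` mention records.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(day: date) -> date:
    """Return the Sunday that begins the week containing `day`."""
    # date.weekday() gives Monday == 0 ... Sunday == 6, so shifting by one
    # yields the number of days elapsed since the most recent Sunday.
    days_since_sunday = (day.weekday() + 1) % 7
    return day - timedelta(days=days_since_sunday)


def weekly_mentions(mentions):
    """Count mentions per (week_start, candidate).

    `mentions` is an iterable of (candidate, date) pairs; this record
    shape is an assumption made for the sketch.
    """
    counts = Counter()
    for candidate, day in mentions:
        counts[(week_start(day), candidate)] += 1
    return counts
```

Under this scheme a mention dated January 4, 2019 (a Friday) falls in the week beginning Sunday, December 30, 2018.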

Looking more closely at the week ending on December 18, Kamala Harris held a sizable lead over Elizabeth Warren, and both had mention counts dwarfing those of the rest of the field:

We can see a consistent contribution from both the “mainstream” and “extreme” right and also a more pronounced effort by individual users to influence the debate.[3]
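
To make the quadrant labels concrete, here is a minimal sketch of how a source might be placed in a quadrant. It relies on the notes' definition of “mainstream” as the top half, MIXED or better, on MBFC's factual-reporting scale. Everything else is an assumption for illustration: the function name `quadrant`, the signed `bias_score` (negative for left, positive for right), and the handling of a score of exactly zero are hypothetical and not taken from the actual analysis.

```python
# MBFC factual-reporting labels, ordered from least to most factual.
FACTUAL_LEVELS = ["VERY LOW", "LOW", "MIXED", "MOSTLY FACTUAL", "HIGH", "VERY HIGH"]

# Per the notes, "mainstream" means the top half: MIXED or better.
MAINSTREAM_FLOOR = FACTUAL_LEVELS.index("MIXED")


def quadrant(bias_score: float, factual: str) -> str:
    """Label a source as e.g. 'mainstream right' or 'extreme left'.

    `bias_score` is a hypothetical signed number (negative = left,
    positive = right); `factual` is an MBFC factual-reporting label.
    """
    level = FACTUAL_LEVELS.index(factual.upper())
    tier = "mainstream" if level >= MAINSTREAM_FLOOR else "extreme"
    if bias_score < 0:
        side = "left"
    elif bias_score > 0:
        side = "right"
    else:
        side = "center"  # assumption: a zero score sits on the axis itself
    return f"{tier} {side}"
```

With a cutoff this generous, any right-leaning source rated MIXED lands in “mainstream right”, which is the oversized-bucket caveat raised in the notes.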

Enter Elizabeth Warren
Then, on December 31, Elizabeth Warren (to date a solid second place in the baseline data) announced the formation of a presidential exploratory committee. We can see right away how active conversation about a candidate can push that candidate's mention count substantially above that of the baseline leader.

Warren maintained a substantial lead in mention count every day for a week after her announcement. On January 4, 2019, her count was many multiples of that of her nearest opponent, Kamala Harris. Harris, in turn, held her own substantial baseline lead over the remainder of the pack.
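
A lead expressed as “multiples” of the runner-up is simply the ratio of the top two mention counts. A minimal sketch, using a hypothetical helper `lead_multiple` and assuming at least two candidates with a nonzero runner-up count:

```python
def lead_multiple(counts: dict) -> tuple:
    """Return (leader, runner_up, ratio of leader's count to runner-up's).

    `counts` maps candidate name -> mention count for one day or week.
    Assumes at least two candidates and a nonzero runner-up count.
    """
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    (leader, top), (runner_up, second) = ranked[0], ranked[1]
    return leader, runner_up, top / second


# Placeholder numbers for illustration only; these are NOT the observed counts.
example = {"Candidate A": 900, "Candidate B": 300, "Candidate C": 100}
print(lead_multiple(example))
```

The same ratio, computed between the second and third entries, would describe a lead like Harris's over the remainder of the pack.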

Furthermore, Warren’s announcement shifted the center of gravity of the conversation substantially up and to the left in the credibility plot:

As of January 4, the discussion of active candidates had a much lower proportion of antagonistic (or otherwise bad faith) voices than did the baseline from two weeks prior, owing mostly to the increased volume of voices within the Democratic coalition.
Areas for Future Study
- How does the conversation vary by quadrant?
- How does the conversation vary by candidate?
- Can we generate reasonable summaries by bucket (i.e. quadrant or candidate)?
- Which of these accounts are “suspicious” (bot, sockpuppet, etc)?
- How does message variation (especially framing) impact virality?
- Can we add a dimension for “attitude”, reflecting the overall candidate-relative sentiment of the user?

[1] Also, some of the queries are more “generous” than others (e.g. we appear to be under-collecting for Beto as of today). I’ll note where we added a candidate recently or where we have not perfected their Twitter query.
[2] Bucketed by week, beginning on Sunday.
[3] It should be noted that spot-checks on the upper-right reveal a number of accounts with “bottom right” iconography (MAGA, wwg1wga, etc.). These accounts may be promoting right-wing bias with “mainstream” support, an approach widely documented as fundamental to right-wing insularity and its concomitant susceptibility to so-called network propaganda.
Note also that MBFC rates not only Fox News and WSJ as “mainstream” (i.e. top-half, MIXED or better); it also rates Sputnik News this way. For now, the benefits of using a third-party rating applied with balanced and professional standards outweigh the impact of unfortunate idiosyncrasies owing to the use of oversized buckets like this. And right-wing involvement in the Democratic primary is noteworthy regardless of its extremity.