Voter information is the lifeblood of all political campaigns. How many names you have and how much information is attached to those names—whether they’re registered to vote, how best to contact them, what their political leanings are—can determine victory or defeat in any given election. And frustratingly, for years, Democrats lagged behind Republicans in maintaining a master database of names, a technological gap that hurt them in 2016 and persisted into the 2018 cycle.
The party’s failure to keep up in the data arms race is particularly embarrassing when you consider how many left-leaning people work in the tech industry (some of these people recently started the group Tech for Campaigns to provide technical know-how to Democratic office-seekers). And as recently as 2012, the Obama reelection campaign was perceived as having a cutting-edge data operation. So how did the Democrats lose ground to the Republicans, and what’s the plan to get it back? Here’s a brief timeline.
The initial online innovators were progressive Democrats
Any recent history of technological campaign innovation has to start with Howard Dean’s 2004 presidential campaign, which was short-lived but served as an incubator for a number of innovations, including SEO tactics, campaign rally videos, volunteer-tracking software, and social media (the campaign had its own social networking feature called DeanLink). Barack Obama’s 2008 presidential campaign used many of the same tools to organize people on a mass scale, fueling his surprising primary win.
As social media and smartphones became commonplace, Democrats continued to adapt. By the 2012 cycle, ActBlue was growing into a political force, and Obama’s reelection team launched Dashboard, a smartphone app for volunteers. The campaign also outspent Mitt Romney by a wide margin when it came to digital ads and invested heavily in data analytics. At the heart of this operation were 10 terabytes of data about voters stored using a system called Vertica. During this period, Democrats had a fairly consistent edge on Republicans when it came to technology. This didn’t always translate into wins (the party lost a huge number of down-ballot races in the Obama era, especially in 2010), but it clearly gave Obama an edge.
The party couldn’t keep its advantage
Over time, the tools that were so novel when Dean and Obama used them became commonplace, and the GOP came out with its own innovation called Data Trust. This is a private company that keeps one centralized master list of voter data that Republican campaigns can buy. Campaigns can also update that list as they acquire new information about the voters in the database. Data Trust was founded in 2011, but really started to attract attention in the 2014 midterms. At the time, the Democratic PAC American Bridge filed an FEC complaint alleging Data Trust violated campaign laws prohibiting coordination between campaigns and outside groups, but the FEC ruled that Data Trust wasn’t breaking any laws. The GOP went on to use that system to put Donald Trump (whose 2016 campaign infrastructure was notoriously shoddy) over the top.
Meanwhile, the Democrats’ own data operation was atrophying. Vertica had been a good piece of technology back in the 2012 cycle, but it wasn’t built to house the amount of data that the Democrats needed in 2016. It had been created before cloud computing took off, so all those terabytes were stored on the DNC’s own servers, which frequently became overloaded. Raffi Krikorian, who served as the Democratic National Committee’s chief technology officer from 2017 to 2019, described to Wired just how bug-ridden the system had become:
"Krikorian started hearing what he calls 'war stories' about Vertica almost immediately, as he interviewed former campaign staffers like Robby Mook, Clinton’s campaign manager, and Stephanie Hannon, a former Googler and Clinton’s chief technology officer. The system was famous for crashing for 16 hours at a time. One data director in North Carolina told him she used to nap in her car just waiting for Vertica to come back online. Mook, Krikorian recalls, likened Vertica to Beirut—when the system got overloaded, as it almost always did, it would just shut down until the shelling stopped."
Not only was Vertica constantly crashing, it also wasn’t user-friendly to anyone who didn’t already know their way around a database; one DNC staffer told Wired it was “just columns of tables,” some of them labeled things like “This is the right one 2014 Booker.”
Obviously, Democrats had to improve their systems, perhaps by adopting a Data Trust-type model. But the path forward wasn’t clear.
The struggle to centralize data
Tom Perez, who took over as DNC chair after the 2016 cycle, wanted to build one big database where the party could keep all its information. But this would require state Democratic parties to surrender their information to the national committee, a prospect that many state-level operatives viewed with suspicion. “I’m not willing to give up one of our most important tools to a group of people who have never even worked on a campaign before,” Trav Robertson, the chairman of the South Carolina Democratic Party, told Politico in 2018. Krikorian had worked for Twitter and Uber before joining the DNC, and his team members were viewed as relative outsiders.
Meanwhile, progressive mega-donor and LinkedIn co-founder Reid Hoffman was working on a parallel database project called Alloy with many of the same aims, investing millions to start a company that would build a master voter list, à la Data Trust. But state parties didn’t appear to like this venture any better than the DNC proposal, and the DNC itself opposed it.
State parties eventually agreed in 2020 to create a new venture called the Democratic Data Exchange (DDX), which combined data from all the parties plus major PACs. It’s chaired by Howard Dean, whose leadership of the DNC during the “50-state strategy” days may have helped convince reluctant state parties to sign onto the effort.
The DDX went online midway through the presidential campaign—arguably much later than it should have, but given the opposition from state parties, the creation of DDX can be seen as a major accomplishment for Perez. It was also a severe blow to Hoffman’s startup, which had basically no reason to exist; it unceremoniously closed its doors and fired all its employees shortly after Election Day.
How does the Democratic Data Exchange work?
State parties and other groups (including the DCCC and DSCC) that belong to the DDX pay a fee to join and also give up their data, receiving the data of other member organizations in return. By keeping one master list, Democratic-aligned groups will hopefully be able to avoid duplicating one another’s efforts, for example by repeatedly contacting the same person and asking them to register to vote. That level of coordination (which edges right up to the line of what the law allows) was unavailable to the Clinton campaign in 2016, and may have contributed to her defeat.
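For readers who want to picture the mechanics, here is a minimal sketch in Python of the pooling idea described above. It is an illustration only, not the DDX’s actual software: the class name `Exchange`, its methods, the voter IDs, and the "supporter" tag are all hypothetical, and a real system would also have to match records across lists that do not share an ID.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Exchange:
    """Toy model of a shared voter-data exchange (hypothetical, not the DDX's code)."""

    # member name -> {voter_id: set of tags such as "supporter"}
    contributions: dict[str, dict[str, set[str]]] = field(default_factory=dict)
    # voter_id -> asks already made by any member, e.g. "register"
    contacted: dict[str, set[str]] = field(default_factory=dict)

    def contribute(self, member: str, records: dict[str, set[str]]) -> None:
        """A member hands over its list; repeat contributions are merged."""
        own = self.contributions.setdefault(member, {})
        for voter_id, tags in records.items():
            own.setdefault(voter_id, set()).update(tags)

    def pooled(self) -> dict[str, set[str]]:
        """Combine every member's records into one master list."""
        pool: dict[str, set[str]] = {}
        for records in self.contributions.values():
            for voter_id, tags in records.items():
                pool.setdefault(voter_id, set()).update(tags)
        return pool

    def new_to(self, member: str, tag: str) -> set[str]:
        """Voters carrying `tag` in the pool who are absent from the member's own list."""
        own = self.contributions.get(member, {})
        return {v for v, tags in self.pooled().items() if tag in tags and v not in own}

    def should_contact(self, voter_id: str, ask: str) -> bool:
        """Return True only the first time any member makes a given ask of a voter."""
        asks = self.contacted.setdefault(voter_id, set())
        if ask in asks:
            return False
        asks.add(ask)
        return True


exchange = Exchange()
exchange.contribute("state_party", {"v1": {"supporter"}, "v2": set()})
exchange.contribute("pac", {"v2": {"supporter"}, "v3": {"supporter"}})

# Supporters the state party did not already have on its own list.
print(sorted(exchange.new_to("state_party", "supporter")))
```

In this toy version, `new_to` is what lets a member discover supporters it was missing, and `should_contact` stands in for the shared record that keeps two groups from making the same registration ask of the same person.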
The New York Times has already relayed one example of the DDX paying off. In a “test run” of the system in the 2019 Kentucky gubernatorial race, the DDX turned up more than 14,000 supporters of Democratic candidate Andy Beshear that the Kentucky Democratic Party didn’t have on its list; the party subsequently contacted all of them, and Beshear won by about 5,000 votes.
That is a sign that the system works, at least to a degree, though it will no doubt continue to be refined as time goes on. The biggest outstanding question may be what role the DDX plays in primaries. Democrats have told the Times that each member organization can specify that its data not be used in primary contests if it wishes. But if the system allows insurgent campaigns, of which there will surely be many in the coming cycle, to obtain better data and defeat incumbents, the DDX will have some unintended consequences.