Data Engineering Intern (Spring 2018)

New York, United States Part-time

Applecart deploys proprietary technology to run smarter advertising campaigns. We work with some of the nation’s most prominent corporations, non-profit organizations, and political candidates to activate and communicate with key target audiences at a scale and level of efficacy previously thought impossible. Part high-level strategic consultancy, part cutting-edge data R+D lab, Applecart offers proven solutions derived from objective, iterative experimentation. Our core offering is a proprietary social graph that leverages publicly-available data to map real-world relationships between individuals at national scale. Our roots are in politics, where we have tested and honed our methods at every level, in high-impact circumstances. We have branched out beyond political campaigns to tackle new advertising challenges in which determining “who knows whom” provides decisive advantages for our clients.

Applecart’s political work has been featured by The Colbert Report, CNN, The Washington Post, The Associated Press, USA Today, The Huffington Post, and Bloomberg, among other prominent news outlets.


Your work will directly affect our clients in the form of election outcomes, increased political and non-profit fundraising yields, and optimized advertising spends and risk assessments. As a Data Engineering Intern, you will be responsible for the following:


  • Create, maintain, and scale data pipelines for data ingesters, the social graph, machine learning predictors, client deliverables, and data warehousing.
  • Interact cross-functionally with a wide variety of people and teams. Work closely with client services and data engineers to identify and implement improvements to Applecart products and deliverables.
  • Implement systems for monitoring of streaming and batch data processing (e.g. DataDog, Nagios). Track data quality and consistency.
  • Evangelize solid coding practices (e.g. test-driven development, code reviews, continuous deployment, automated linting, staging environments).
  • Contribute to the architectural designs and decision making around data stores, schemas, data security and cloud storage.
  • Rapidly prototype proof-of-concept data pipelines for social graph ROI determination.
  • Keep abreast of industry trends, best practices, and emerging methodologies.

Basic Qualifications:

  • Currently enrolled in, or a recent graduate of, a BS or MS degree program in Computer Science, Math, Statistics, or another technical field.
  • Industry, academic and/or project software engineering experience (especially with Python).
  • Background in data modeling & schema design.
  • Desire to write well-abstracted, reusable code components.
  • Ability to work in a fast-paced and deadline-driven environment.
  • Some familiarity with Amazon Web Services (RDS, S3, EC2, EMR, Data Pipeline).
  • Background in wrangling various structured and unstructured data sets, consuming APIs (e.g. handling rate limiting and exponential back-off), and the like.

Preferred Qualifications:

  • Experience with, or desire to learn, any of the following: Hive, Pig, MapReduce, Spark, Elasticsearch, HBase, Cassandra, Presto.
  • Experience with agile development or similar methodologies for continuous development of product and technology.
  • Experience with Spark Streaming and Spark SQL in a production setting.
  • Interest in graph database and computation frameworks (e.g. GraphX, TitanDB).
  • Involvement in a variety of coding projects, including but not limited to browser extensions, full-stack development, web scraping, and Mechanical Turk automation.
  • Significant interest or background in politics, advertising technology and/or behavior modeling is a big plus.

Logistical Details:

  • Location: New York, NY
  • Application deadline: January 15, 2018
  • Internship term: Spring 2018 (with potential to extend to Summer 2018)
  • This is a paid internship
  • Applications will be reviewed on a rolling basis

Apply for this opening at