AI & NLP Fellowship: Data Engineering for Social Impact Opportunity

AI & NLP Fellowship: Data Engineering for Social Impact in the Washington DC-Baltimore Area

Summer 2025 | Volunteer Position | Remote

Institute for Development Impact (I4DI) | DECipher Project


About the Project

DECipher is an AI-powered platform developed by the Institute for Development Impact (I4DI) to help global development professionals access and interpret decades of USAID-funded learning. It draws from one of the largest public document archives in international development, transforming raw PDFs into structured insights using modern machine learning techniques.

At its core, DECipher is a public infrastructure project. It connects natural language processing with real-world policy and program decisions. The work is technical, but the impact is human. It supports smarter, more accountable development efforts worldwide.


The Opportunity

We are offering a volunteer summer fellowship for individuals who want to gain real experience working with applied AI systems. Fellows will help us prepare a large, high-value dataset for fine-tuning domain-specific language models.

This is not a theoretical exercise. You will be working directly with tens of thousands of documents, contributing to the quality and integrity of training data that powers an open-access AI tool for public benefit. While unpaid, this role offers serious technical learning and the chance to be part of something that is both ambitious and grounded.


What You Will Work On

  • Process and clean large volumes of unstructured PDF documents
  • Develop and manage text extraction workflows using Python and NLP tools
  • Review document structure and metadata for consistency and quality
  • Label and classify documents to support supervised and semi-supervised learning
  • Support QA and data validation steps critical for model fine-tuning
  • Work with experienced engineers and researchers on a functioning AI pipeline


What You Will Learn

  • How to build structured datasets for training large language models
  • Techniques in OCR, document parsing, tokenization, and quality assurance
  • How NLP systems are adapted to real-world, domain-specific use cases
  • What it takes to make AI systems both reliable and accountable


Who You Are

  • Current student, recent graduate, or early-career professional with experience in Python and interest in NLP, machine learning, or data engineering
  • Comfortable working with complex documents, legacy formats, and detailed guidelines
  • Motivated by mission-driven tech and open-access knowledge
  • Looking for more than just a credential: you want meaningful work and real learning


What You Will Gain

  • Applied experience with large-scale data preparation
  • A practical, portfolio-worthy contribution to an operational AI system
  • Mentorship from a team experienced in responsible AI and development practice
  • Flexible hours and remote collaboration
  • Possibility of extended work or future opportunities based on performance


Fellowship Details

  • Volunteer position (unpaid)
  • Fully remote
  • Summer 2025
  • 8 to 12 week commitment
  • 30 to 40 hours per week, flexible scheduling


How to Apply

Send a brief message describing your interest and experience, along with a resume and a link to relevant work, to recruitment@i4di.org.
