A complete suite of Open Source projects for Code as Data and Machine Learning on Code

Philosophy & Governance

At source{d} we are creating a suite of Open Source tools enabling “Code as Data” and “Machine Learning on Code”.

We are also great believers in Open Source and its philosophy. Not only is our source code developed out in the open and made available to all, but also our culture, guides, and even OKRs are openly accessible on GitHub.

Upon request from the community, we plan to hold an election for a Technical Steering Committee (TSC) which will act as an escalation point for potential conflicts within projects, encourage cross-project communication/coordination and help with project governance.


Under our current approach, our technology stack is fully open source under licenses such as Apache 2.0 and GPL 3.0. We have released the source{d} Community Edition for single-node deployments. Multi-node deployments, which allow distributed computing over a vast number of repositories with a large number of concurrent users, are provided by the source{d} Enterprise Edition, a proprietary product.

This allows us to charge enterprises that need a large number of nodes without putting individual developers or smaller organizations at a disadvantage when they use our technology.

How to contribute

You don't need to be an expert in Machine Learning to start contributing to the source{d} tech stack. As Open Source enthusiasts, we think everyone has a unique perspective and ideas that deserve to be heard, whether online on GitHub or in person at meetups or conferences. From simple documentation improvements to more advanced pull requests for new features and help organizing #MLonCode meetups, we welcome all kinds of contributions from the broader community.

Start contributing today by:

  • Creating issues and submitting pull requests on GitHub
  • Discussing design & change proposals with the source{d} team on Slack
  • Sending an email to devrel@sourced.tech about organizing events & meetups


Open-source components that make machine learning on source code a reality

The source{d} stack is built on top of open-source components that make machine learning on source code a reality. From datasets and models to data retrieval, language analysis, and machine learning tools, everything is freely available.

OSS Projects Highlights