Building the first AI that understands code

We apply neural networks to source code from over 17 million software repositories and 6.6 million developers worldwide.

Where we are

  • Implemented most git features in our open-source go-git library.
  • Fetched and analyzed every public git repository on GitHub, Bitbucket, and self-hosted cgit instances.
  • Built the world's fastest WeightedMinHash and k-means clustering implementations.
  • Trained neural networks to extract relevant features from code.
  • Showed that RNNs with memory can learn from projects, and identified their limitations.
  • Trained neural networks on natural language use.

What is next

  • Reaching full feature parity between go-git and libgit2.
  • Fetching and processing every change to public git repos in close to real time.
  • Extending TensorFlow with new loss functions and applying them to code style.
  • Finalizing our first models trained on all 6.6M developers.
  • Moving infrastructure to 1.4 PB of storage and GPU-equipped bare-metal servers.
  • Launching a platform where developers find the right teams and projects to join.