
The Future of Code

As an agile AI solution builder, we work with many LLMs and the latest AI frameworks and libraries. We built AppCoder to generate Python code for generative AI projects. Our goals are to speed up AI development, advance creativity, and improve the code that powers our AI projects.

We use AppCoder ourselves and when building alongside our enterprise clients. You can try it yourself.


THE RESULT

Interplay-AppCoder LLM, a revolutionary new high-performing code-generation model

Scoring high on the ICE benchmark

The ICE methodology provides metrics for Usefulness and Functional Correctness as a baseline for scoring code generation. Read more about the ICE methodology in this paper.

We used GPT-4 to measure the metrics, with each sample scored from 0 to 4. This is the test dataset and Jupyter Notebook we used to run the benchmark.
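As a rough illustration of how judge-based scoring like this can be aggregated, here is a minimal sketch. The `Score: <number>` reply format and the function names are assumptions for the example, not the actual prompt or code used in the benchmark notebook; a real run would query GPT-4 once per test sample.

```python
import re
import statistics

def parse_judge_score(judge_reply: str) -> float:
    """Extract a numeric 0-4 score from a judge model's reply.

    Assumes the judge was prompted to end with 'Score: <number>';
    this format is illustrative, not the benchmark's actual one.
    """
    match = re.search(r"Score:\s*([0-4](?:\.\d+)?)", judge_reply)
    if match is None:
        raise ValueError(f"no score found in: {judge_reply!r}")
    return float(match.group(1))

def aggregate(replies) -> float:
    """Average per-sample judge scores over the whole test set."""
    return statistics.mean(parse_judge_score(r) for r in replies)

# Mock judge replies; a real run would collect one per dataset sample.
replies = ["Score: 3", "Score: 2.5", "Score: 3.5"]
print(round(aggregate(replies), 3))  # → 3.0
```

Averaging per-sample judge scores like this is what yields dataset-level numbers such as the 2.968 and 2.476 reported below.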

2.9

USEFULNESS

Usefulness: whether the code output from the model is clear, logically ordered, and human-readable, and whether it covers all functionalities of the problem statement when compared with the reference code.

2.4

FUNCTIONALITY

Functional Correctness: an LLM with complex reasoning capabilities conducts unit tests while considering the given question and the reference code.

Model Name               Usefulness (0-4)   Functional Correctness (0-4)
Interplay AppCoder LLM   2.968              2.476
WizardCoder              1.825              0.603

What we are doing

We have been fine-tuning CodeLlama-7B and -34B as well as WizardCoder-15B and -34B. We combined that fine-tuning with training on our hand-coded dataset covering LangChain, YOLOv8, Vertex AI, and many other modern libraries we use on a daily basis. The released model is fine-tuned on top of WizardCoder-15B.
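To give a sense of what a hand-coded training record for such a dataset might look like, here is a sketch of one instruction-tuning example serialized to JSON Lines, the format commonly used for fine-tuning pipelines. The field names (`instruction`, `input`, `output`) follow a common instruction-tuning convention and are assumptions for illustration, not Interplay's actual schema.

```python
import json

# One hypothetical hand-coded training record pairing a task
# description with reference code for a modern library (YOLOv8 here).
record = {
    "instruction": "Detect objects in an image with the YOLOv8 API.",
    "input": "",
    "output": (
        "from ultralytics import YOLO\n"
        "model = YOLO('yolov8n.pt')\n"
        "results = model('image.jpg')\n"
    ),
}

def to_jsonl(records) -> str:
    """Serialize records to JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)

print(to_jsonl([record]))
```

Each line of such a file becomes one supervised example; fine-tuning a base model like WizardCoder-15B then teaches it the up-to-date library idioms that generic pretraining data may lack.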