How de_val works

Step 1

Integration

Developers integrate the de_val API into their LLM-based products.

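import requests

# Evaluation endpoint (the https:// scheme is assumed; requests needs one)
api_url = "https://api.de-val.ai/api/v0/evaluate"

token = ""  # your de_val API token

payload = {
    "tasks": [
        "hallucination", "misattribution",
        "relevance", "summary_completeness",
    ],
    "rag_context": "",   # context passed to the LLM
    "query": "",         # original user query
    "llm_response": "",  # LLM output to evaluate
}

headers = {"Authorization": f"Bearer {token}"}

# Pass the auth header with the request; a JSON body is assumed here
response = requests.post(api_url, json=payload, headers=headers)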

Step 2

Evaluation

The de_val API receives the original user query, the context provided to the LLM, and the LLM's response.

Q&A

Query to LLM

Can you tell me about the first moon landing?

LLM response

USA landed on the moon on Friday June 22, 1975.

Buzz Aldrin was first to step on the moon.

Dogs are a man’s friend.

Context

Knowledge passed to LLM

On Sunday July 20, 1969, astronaut Neil Armstrong became the first person to walk on the moon, arguably the greatest technological achievement in human history.
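As a minimal sketch of this step, the three inputs can be gathered into the payload shape used in the Step 1 snippet. The helper name `build_payload` and its empty-field check are illustrative assumptions, not part of the de_val API; only the field names and task names come from that snippet, and the context and response strings are abridged from the example.

```python
from typing import Dict, List, Optional

# Task names as they appear in the Step 1 integration snippet.
DEFAULT_TASKS = ["hallucination", "misattribution", "relevance", "summary_completeness"]


def build_payload(query: str, rag_context: str, llm_response: str,
                  tasks: Optional[List[str]] = None) -> Dict[str, object]:
    """Assemble an evaluation payload (hypothetical helper, not a de_val function)."""
    fields = {"query": query, "rag_context": rag_context, "llm_response": llm_response}
    # Reject blank inputs early so an incomplete request is never sent.
    empty = [name for name, value in fields.items() if not value.strip()]
    if empty:
        raise ValueError(f"empty field(s): {', '.join(empty)}")
    return {"tasks": list(tasks or DEFAULT_TASKS), **fields}


payload = build_payload(
    query="Can you tell me about the first moon landing?",
    rag_context="Neil Armstrong became the first person to walk on the moon.",
    llm_response="USA landed on the moon on Friday June 22, 1975.",
)
print(sorted(payload))
```

The resulting dictionary could then be sent with `requests.post`, as in Step 1.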

Step 3

Scoring

de_val returns a score for each requested evaluation task, such as hallucination, misattribution, relevancy, and summary completeness.

Eval                    Score
Hallucination           0.33
Misattribution          0.21
Relevancy               0.00
Summary completeness    0.77

Mistake: 'Friday June 22, 1975.'
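To make the example scores easier to scan, a caller might sort them before display. The sketch below assumes the scores arrive as a mapping from task name to a float, and that the example values are listed in the same order as the task names; the page confirms neither. The `rank_scores` helper is hypothetical. Whether a higher or lower score is better for a given task is not stated here, so the code only orders the values and draws no conclusion from them.

```python
from typing import Dict, List, Tuple


def rank_scores(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (task, score) pairs sorted from lowest to highest score."""
    return sorted(scores.items(), key=lambda item: item[1])


# Values from the example above; the dict shape is an assumption.
example_scores = {
    "hallucination": 0.33,
    "misattribution": 0.21,
    "relevance": 0.00,
    "summary_completeness": 0.77,
}

for task, score in rank_scores(example_scores):
    print(f"{task:<22}{score:.2f}")
```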

Open-Source

Objective Measures

A/B Testing

Scalability

Accelerate Time-to-Market

Continuous Monitoring

Actionable Insights

Ready to take your AI to the next level?

Join the Beta

Setting the Standard for AI Excellence

Company

Home

Product

Blog

Connect with us

GitHub

Discord

Medium
