DataBerry API Cluster

"For a Safe, Secure and Positivity-enriched Digital World"

DataBerry Cluster offers a wide range of APIs across multiple domains, built around a core vision of problem solving.

    1) DataBerry Text Quality Analysis API for real-time data quality validation
    2) DataBerry Personal Identifier API for real-time identification of personal information
    3) DataBerry Text Translator API for real-time translation of text into multiple languages, using the Google Translate library
    4) DataBerry Text Cleaner API for automated cleaning of input text
    5) DataBerry Detoxifier API for removal of toxic comments from input text
    6) Domain-specific sentiment classification

Choose the endpoints from the DataBerry cluster that match your requirements.

API Reference

DataBerry Cluster API


Base URL: https://databerrycluster.herokuapp.com
Docs: https://sukanthen.github.io/Databerry-API-Cluster/


Endpoint               Method   Description
/text_quality          POST     Text Quality Analysis API
/personal_identifier   POST     Personal Identifier API
/translate             POST     Text Translator API
/datacleaner           POST     Text Data Cleaner API
/detoxify              POST     Detoxifier API
/gibberish_check       POST     Gibberish Validator API
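
Every endpoint above takes a POST with a JSON body, so one small helper can serve all of them. The following is a minimal sketch using only the Python standard library. The names `build_request` and `call_api` are hypothetical, and sending a `Content-Type: application/json` header is an assumption, since the source only shows JSON bodies.

```python
import json
import urllib.request

BASE_URL = "https://databerrycluster.herokuapp.com"


def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a POST request carrying `payload` as a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=body,
        # Assumption: the service expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(endpoint: str, payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response (needs network access)."""
    request = build_request(endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would then look like `call_api("/text_quality", {"text": "Some input"})`, provided the service is reachable.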


1) Text Quality Analysis API

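Given a text input, the API will return the total word count, the unique word count, the detected language, the query length in characters and a toxicity value.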

Input URL

URL Link : https://databerrycluster.herokuapp.com/text_quality

JSON Input

{"text": "The patient ordered a pizza and soon after eating it went to ICU in the most dreadfully horrible way and much more story to go."}

JSON Output:

{
    "count_of_words": 26,
    "count_unique_words": 24,
    "language": "english",
    "query_length": 127,
    "toxicity": 0
}
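
The three numeric counts in the sample can be reproduced locally, which is handy for sanity-checking responses. This is only a sketch: it assumes punctuation marks count as separate tokens and that uniqueness is case-sensitive. That assumption matches the sample above but is not documented behaviour of the API. The function name `text_stats` is hypothetical.

```python
import re


def text_stats(text: str) -> dict:
    """Recompute the count fields under an assumed tokenization."""
    # Assumption: a token is a run of word characters or a single punctuation mark.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    return {
        "count_of_words": len(tokens),
        "count_unique_words": len(set(tokens)),  # case-sensitive: "The" != "the"
        "query_length": len(text),  # length in characters, spaces included
    }


sample = (
    "The patient ordered a pizza and soon after eating it went to ICU "
    "in the most dreadfully horrible way and much more story to go."
)
print(text_stats(sample))
```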


2) Personal Identifier API

Given a text input, the API will extract personal information such as dates, email addresses, gender and phone numbers.

Input URL

URL Link : https://databerrycluster.herokuapp.com/personal_identifier

JSON Input

{"text":"Hi Sukanthen,i was born on Oct 15 1999 in Neyveli, @ sukanthen1999@gmail.com. I am a male human with phone number 2A +1-541-754-3010."}

JSON Output:

{
    "dates": "1999-10-15",
    "email": "sukanthen1999@gmail.com.",
    "gender": "male",
    "phone_number": "+1-541-754-3010"
}
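
In the sample response the email value keeps the sentence's trailing period, so a caller may want to tidy the fields before using them. A minimal sketch of such post-processing follows. The helper name `normalize_identifiers` is hypothetical, and it assumes `dates` holds a single ISO-formatted date as in the sample; the format for several dates is not shown in the source.

```python
from datetime import date


def normalize_identifiers(response: dict) -> dict:
    """Return a copy of the response with tidied email and parsed date."""
    cleaned = dict(response)
    if "email" in cleaned:
        # Drop punctuation picked up from the surrounding sentence.
        cleaned["email"] = cleaned["email"].rstrip(".,;:")
    if "dates" in cleaned:
        # Assumption: a single YYYY-MM-DD string, as in the sample output.
        cleaned["dates"] = date.fromisoformat(cleaned["dates"])
    return cleaned


sample_response = {
    "dates": "1999-10-15",
    "email": "sukanthen1999@gmail.com.",
    "gender": "male",
    "phone_number": "+1-541-754-3010",
}
print(normalize_identifiers(sample_response))
```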


3) DataBerry Cluster Cleaner API

Given a text input, the Cleaner API will remove unwanted stopwords, HTML tags and user-defined words from the input text.

Input URL

URL Link : https://databerrycluster.herokuapp.com/datacleaner

JSON Input

{
	"text":"Ravi is a boy in Kerala.<p> I have a bad head ache. </p> <p align=center> But the problem is due to overeating and watching TV for long hours </p>",
	"remove_html":"True",
	"remove_stopwords":"True",
	"stopwords_list":"None"
}

JSON Output:

{
    "cleaned_text": "Ravi boy Kerala . I bad head ache . But problem due overeating watching TV long hours"
}
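
Note that the sample request passes its flags as the strings "True" and "None" rather than as JSON booleans or null. A small builder can hide that detail from calling code. This is a sketch: the name `build_cleaner_payload` is hypothetical, sending "False" to disable a step is an assumption, and the format of a custom stopword list is not shown in the source, so the value is passed through unchanged.

```python
def build_cleaner_payload(
    text: str,
    remove_html: bool = True,
    remove_stopwords: bool = True,
    stopwords_list: str = "None",
) -> dict:
    """Build a /datacleaner request body in the string-flag style of the sample."""
    return {
        "text": text,
        # The sample sends "True" as a string; "False" is assumed to work the same way.
        "remove_html": str(remove_html),
        "remove_stopwords": str(remove_stopwords),
        # Passed through as-is: the format for a custom list is not documented here.
        "stopwords_list": stopwords_list,
    }


print(build_cleaner_payload("Ravi is a boy in Kerala.<p> I have a bad head ache. </p>"))
```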


4) Detoxifier API

Given an input query with toxic comments, the API will spot the toxic words and replace each letter of those words with an asterisk (*). (e.g. moron -> *****)

Input URL

URL Link : https://databerrycluster.herokuapp.com/detoxify

JSON Input

# Dialogue from American Series Silicon Valley (Season 1)
{"text":"Hello, Richard Hendricks. I'm a total fucking retard."}

JSON Output:

{
    "cleaned_text": "Hello, Richard Hendricks. I'm a total ******* ******."
}
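
The masking step itself (one asterisk per letter) can be illustrated locally. The sketch below is not the API's implementation: how the service detects toxic words is not described in the source, so the word list here is a hypothetical input supplied by the caller, and `mask_words` is a hypothetical name.

```python
import re
from typing import Iterable


def mask_words(text: str, toxic_words: Iterable[str]) -> str:
    """Replace every letter of each listed word with an asterisk."""
    words = sorted(set(toxic_words), key=len, reverse=True)
    if not words:
        return text
    # Match whole words only, ignoring case.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: "*" * len(match.group(0)), text)


print(mask_words("You are a moron.", ["moron"]))
```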


5) Gibberish Validator API for text

Given a text input query, the API will check whether the text is gibberish or a normal, valid query.

Input URL

URL Link : https://databerrycluster.herokuapp.com/gibberish_check

JSON Input

{"text":"xvusd sdh vuh whjguwr ijhb rrbhe jij v in sgnjvwjiqwur  qiweurw wquwrgbrwv q njsavnq df v  ed jnrerj."}

JSON Output: A value of 1 denotes gibberish text and 0 denotes a normal, valid query.

{
    "gibberish_data": 1
}
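
A typical use of this flag is filtering a batch of queries before further processing. The sketch below shows that pattern with the checker passed in as a function, so it runs offline. `keep_valid` and `fake_check` are hypothetical names, and `fake_check` is only a stand-in for a real call to /gibberish_check.

```python
from typing import Callable, Iterable, List


def keep_valid(texts: Iterable[str], check: Callable[[str], dict]) -> List[str]:
    """Return only the texts whose response reports gibberish_data == 0."""
    return [t for t in texts if check(t).get("gibberish_data") == 0]


def fake_check(text: str) -> dict:
    # Stand-in for the real endpoint, not the service's logic.
    return {"gibberish_data": 1 if text.startswith("xvusd") else 0}


print(keep_valid(["xvusd sdh vuh", "I have a headache."], fake_check))
```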


