
Big Data Aggregation And Standardization For eHealth Startup

The Result

We delivered software that lets the client trace cases of positive effects and negative side effects of drugs, and derive healthcare insights from them using sentiment analysis and statistical methods. The client can now also use the software to predict financial market indices for the pharmaceutical sector.


The Challenge

An eHealth startup needed to analyze medical data to improve the quality of patient treatment. The main task was to deliver software that aggregates, standardizes and validates data in different formats covering diseases, the drugs used to treat them, side effects and patient feedback on those drugs, along with all the relations between these data inputs.


The Solution

Mellivora data engineers used AWS for the Big Data part of this project. Structured and unstructured data from different sources, such as electronic health records, social media posts and regulatory agency reports, were correlated and cross-referenced. This allowed the client to trace cases of positive effects and negative side effects of drugs, and so to provide healthcare insights based on sentiment analysis and statistical methods.
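To make the idea of standardizing and cross-referencing concrete, here is a minimal Python sketch. It is not the project's actual code: the `Mention` schema, the field names, the helper functions and the sample records are all hypothetical, and the polarity label is assumed to come from an upstream sentiment analysis step that is not shown.

```python
from collections import Counter
from dataclasses import dataclass


# Hypothetical common schema; field names are assumptions for illustration.
@dataclass(frozen=True)
class Mention:
    source: str    # e.g. "ehr", "social", "regulatory"
    drug: str      # normalized drug name
    effect: str    # normalized effect description
    polarity: str  # "positive" or "negative", assumed to be set upstream


def normalize(source, raw):
    """Map a source-specific record onto the common schema.

    Returns None when a required field is missing or invalid.
    """
    drug = str(raw.get("drug") or raw.get("medication") or "").strip().lower()
    effect = str(raw.get("effect") or raw.get("reaction") or "").strip().lower()
    polarity = str(raw.get("polarity") or "").strip().lower()
    if not drug or not effect or polarity not in {"positive", "negative"}:
        return None
    return Mention(source, drug, effect, polarity)


def cross_reference(mentions):
    """Count mentions per (drug, effect, polarity) and record their sources."""
    counts = Counter()
    sources = {}
    for m in mentions:
        key = (m.drug, m.effect, m.polarity)
        counts[key] += 1
        sources.setdefault(key, set()).add(m.source)
    return counts, sources


# Made-up sample records in two different source layouts.
raw_records = [
    ("ehr", {"medication": "DrugA ", "reaction": "Nausea", "polarity": "negative"}),
    ("social", {"drug": "druga", "effect": "nausea", "polarity": "negative"}),
    ("regulatory", {"drug": "DrugB", "effect": "", "polarity": "negative"}),  # fails validation
]

mentions = [m for m in (normalize(s, r) for s, r in raw_records) if m is not None]
counts, sources = cross_reference(mentions)
```

In this sketch the two valid records collapse onto the same drug and effect key even though they arrive with different field names and casing, which is the kind of correlation across sources described above.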


Technology Stack

  • IBM DB2 DashDB
  • Hadoop – full stack: Hive, HBase, Sqoop, HDFS, MapReduce, Flume, Tez, Spark
  • Pentaho Data Integration
  • Pentaho Business Analytics Platform
  • Hortonworks Data Platform (HDP) 2.4
  • Amazon Web Services
  • MapReduce, Data orchestration, Stream data processing
  • Ubuntu 16.04, Windows, CentOS
  • Karaf, Docker
  • Talend Integration Studio
  • GraphDB
  • Java, Python, Bash, shell, SQL

About Mellivora Software

Mellivora Software helps SMEs and enterprises build custom IT solutions for a range of industries. Our core expertise is focused on:

  • Big Data, DevOps and AWS technologies
  • NLP technologies
  • Data Science/Machine Learning