Big Data Aggregation And Standardization For eHealth Startup

April 15, 2020 by Administrator
The Result


The delivered software lets the client trace cases of positive effects and negative side effects of drugs, and so provide healthcare insights based on sentiment analysis and statistical methods. The client can also use the software to predict financial market indices for the pharmaceutical sector.


The Challenge


An eHealth startup needed to analyze medical data to improve the quality of patient treatment. The main task was to deliver software that would aggregate, standardize and validate data arriving in different formats: diseases, the drugs used to treat them, side effects, patients’ feedback on the drugs, and all the relations between these data inputs.


The Solution


Mellivora data engineers used AWS for the Big Data part of this project. Structured and unstructured data from different sources, such as electronic health records, social media posts and regulatory agency reports, were correlated and cross-referenced. This allowed the client to trace cases of positive effects and negative side effects of drugs, and so to provide healthcare insights based on sentiment analysis and statistical methods.
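
As a rough illustration of what standardizing and cross-referencing records from several sources can look like, here is a minimal Python sketch. It is not the project's actual code: the record fields, the source labels, the DRUG_SYNONYMS table and both function names are hypothetical and chosen only for this example.

```python
from collections import defaultdict

# Hypothetical synonym table: maps source-specific drug names to one canonical name.
DRUG_SYNONYMS = {"acetaminophen": "paracetamol", "tylenol": "paracetamol"}


def standardize(record, source):
    """Map a source-specific record onto one common shape."""
    drug = record["drug"].strip().lower()
    return {
        "drug": DRUG_SYNONYMS.get(drug, drug),
        "effect": record["effect"].strip().lower(),
        "polarity": record.get("polarity", "unknown"),
        "source": source,
    }


def cross_reference(records):
    """Group standardized records by (drug, effect) and collect the reporting sources."""
    index = defaultdict(set)
    for rec in records:
        index[(rec["drug"], rec["effect"])].add(rec["source"])
    return index


# Made-up sample input: the same drug appears under three different spellings.
raw = [
    ({"drug": "Tylenol ", "effect": "Nausea"}, "social_media"),
    ({"drug": "acetaminophen", "effect": "nausea"}, "ehr"),
    ({"drug": "Paracetamol", "effect": "pain relief", "polarity": "positive"}, "regulatory_report"),
]

index = cross_reference(standardize(rec, src) for rec, src in raw)

# Keep only the (drug, effect) pairs reported by more than one source.
corroborated = {pair: srcs for pair, srcs in index.items() if len(srcs) > 1}
```

In this sketch, an effect counts as corroborated when at least two independent sources report it for the same canonical drug name. A production pipeline would more likely rely on curated medical vocabularies than on a hand-written synonym table.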


Technology Stack


  • IBM DB2 DashDB
  • Hadoop – full stack: Hive, HBase, Sqoop, HDFS, MapReduce, Flume, Tez, Spark
  • Pentaho Data Integration
  • Pentaho Business Analytics Platform
  • Hortonworks Data Platform (HDP) 2.4
  • Amazon Web Services
  • MapReduce, Data orchestration, Stream data processing
  • Ubuntu 16.04, Windows, CentOS
  • Karaf, Docker
  • Talend Integration Studio
  • GraphDB
  • Java, Python, bash, shell, SQL