The Result
A custom workflow that supports MRR (Monthly Recurring Revenue) reports, followed by ongoing support, maintenance, and modification of the Oozie workflows and reports as industry and business requirements change.
The Challenge
A laboratory specializing in forecasting algorithms launched a startup project to produce analytic reports for the US legal services market and turned to Mellivora for its Big Data and ETL expertise.
The Solution
The solution was built on a Cloudera Hadoop cluster, with Hive as the main analytics database. Tableau reports were based on predefined Hive views, and all ETL processes were built as Oozie workflows. Some reports were generated with Ruby on Rails instead of Tableau, with PostgreSQL holding temporal and control data.
Hardcoded “switch case” conditions were removed from the code so that the resulting reports can be configured flexibly by changing the SKU (Stock Keeping Unit) product dictionary.
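The move from hardcoded branches to a dictionary-driven configuration can be illustrated with a minimal Python sketch. It is not the project's actual code: the SKU codes, the `report_group` and `recurring` fields, and the `monthly_recurring_revenue` function are all hypothetical names chosen for illustration. The idea is that report behavior is looked up in the SKU dictionary, so adding or regrouping a product means editing data rather than code.

```python
from collections import defaultdict

# Hypothetical SKU dictionary. In a real deployment this would more likely be
# loaded from a configuration table or file than defined inline.
SKU_DICTIONARY = {
    "SKU-001": {"report_group": "Subscriptions", "recurring": True},
    "SKU-002": {"report_group": "Subscriptions", "recurring": True},
    "SKU-900": {"report_group": "One-off services", "recurring": False},
}


def monthly_recurring_revenue(line_items, sku_dictionary):
    """Sum monthly amounts per report group for SKUs flagged as recurring.

    Replaces a chain of hardcoded per-SKU branches with a dictionary lookup.
    Unknown SKUs and non-recurring SKUs are skipped.
    """
    totals = defaultdict(float)
    for item in line_items:
        entry = sku_dictionary.get(item["sku"])
        if entry is None or not entry["recurring"]:
            continue
        totals[entry["report_group"]] += item["monthly_amount"]
    return dict(totals)


# Hypothetical input rows, shaped like billing line items.
line_items = [
    {"sku": "SKU-001", "monthly_amount": 100.0},
    {"sku": "SKU-002", "monthly_amount": 50.0},
    {"sku": "SKU-900", "monthly_amount": 500.0},  # not recurring, ignored
    {"sku": "SKU-404", "monthly_amount": 10.0},   # unknown SKU, ignored
]

mrr = monthly_recurring_revenue(line_items, SKU_DICTIONARY)
print(mrr)
```

With this layout, marking `SKU-900` as recurring in the dictionary would bring it into the report without touching the aggregation function.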
Technology Stack
- Stream data: Apache Kafka
- Indexing: Elasticsearch
- Computational module: Hadoop (Cloudera): Hive, Impala, HBase, Sqoop, Spark
- ETL and workflows: Oozie, Apache NiFi, Pentaho Kettle
- Front-End: Ruby on Rails, PostgreSQL, Apache Zeppelin, Jupyter
- Platforms: Amazon AWS, Google Cloud
- Back-End: GraphDB (Apache TinkerPop stack: Gremlin, JanusGraph/Titan, Elasticsearch 6.0)
- Languages: Java, Python, SQL
About Mellivora Software
Mellivora Software helps SMEs and enterprises build custom IT solutions for a range of specific industries. Our core expertise is focused on:
- Big Data, DevOps and AWS technologies
- NLP technologies
- Data Science/Machine Learning