Research

WorkfloWiz: Cloud-Native Big Data Workflow Management System. Download the source code here.

Big Data Modeling: The process that encompasses collecting, formalizing, and analyzing data requirements in a large-scale information system, as well as understanding, representing, and organizing big data to meet these requirements.

KDM: An automated big data modeling tool for NoSQL Apache Cassandra that dramatically simplifies and streamlines database design. Try KDM!

Scientific Workflow Management in the Cloud: Developing techniques to structure, formalize, and analyze heterogeneous scientific processes, as well as to orchestrate their execution in distributed environments, such as clouds and grids.

Workflow Composition and Shimming: Designing workflows often involves connecting components (such as Web services) with similar but incompatible interfaces. I propose a foundational approach to this so-called shimming problem: reducing it to the runtime coercion problem from type theory.
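To make the idea of shimming as coercion concrete, here is a minimal sketch. It is only an illustration under stated assumptions, not the formal approach itself: the coercion table `COERCIONS`, the helpers `coerce` and `connect`, and the two toy components are all hypothetical names invented for this example. The sketch connects a component that outputs an integer to one that expects a string, and inserts the conversion automatically at run time.

```python
from typing import Callable, Dict, Tuple

# Hypothetical coercion table: (source type, target type) -> conversion function.
COERCIONS: Dict[Tuple[type, type], Callable] = {
    (int, float): float,
    (int, str): str,
    (float, str): repr,
}


def coerce(value, target: type):
    """Convert value to the target type, applying a shim only when needed."""
    source = type(value)
    if source is target:
        return value  # interfaces already match, no shim required
    try:
        shim = COERCIONS[(source, target)]
    except KeyError:
        raise TypeError(
            f"no coercion from {source.__name__} to {target.__name__}"
        ) from None
    return shim(value)


def connect(producer: Callable, consumer: Callable, expected: type) -> Callable:
    """Compose two components, coercing the producer's output at run time."""
    def pipeline(*args):
        return consumer(coerce(producer(*args), expected))
    return pipeline


# Two toy components with similar but incompatible interfaces (assumed).
def count_records(rows) -> int:
    return len(rows)


def format_report(total: str) -> str:
    return "records: " + total


workflow = connect(count_records, format_report, str)
report = workflow([10, 20, 30])
```

The workflow designer never writes the `int`-to-`str` shim by hand; the lookup in `coerce` plays the role of the coercion rule, and a missing entry surfaces as a type error rather than a silent failure.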

UTPB: A Benchmark for Scientific Workflow Provenance Storage and Querying Systems. We have developed a collection of techniques and tools for workload generation, query selection, and performance measurement of large-scale provenance systems.

SPARQL-to-SQL Translation: I have investigated various SPARQL-to-SQL translation strategies and their effect on the resulting SQL query performance.
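As a rough illustration of what one such translation strategy can look like, the sketch below maps a SPARQL basic graph pattern to a SQL self-join over a single three-column table. This is a commonly used textbook layout and is not necessarily one of the strategies I investigated; the table name `triples`, the function `bgp_to_sql`, and the sample data are assumptions made for the example. Variable names are assumed to be simple identifiers.

```python
import sqlite3


def bgp_to_sql(patterns):
    """Translate (s, p, o) triple patterns into one SQL query and its parameters.

    Terms starting with '?' are variables; everything else is a constant.
    """
    select, where, params, seen = [], [], [], {}
    for i, triple in enumerate(patterns):
        for col, term in zip(("s", "p", "o"), triple):
            ref = f"t{i}.{col}"
            if term.startswith("?"):
                if term in seen:
                    # Repeated variable: becomes a join condition.
                    where.append(f"{ref} = {seen[term]}")
                else:
                    seen[term] = ref
                    select.append(f"{ref} AS {term[1:]}")
            else:
                # Constant: becomes a parameterized filter.
                where.append(f"{ref} = ?")
                params.append(term)
    tables = ", ".join(f"triples t{i}" for i in range(len(patterns)))
    sql = f"SELECT {', '.join(select or ['1'])} FROM {tables}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("alice", "type", "Person"), ("alice", "name", "Alice"),
        ("bob", "type", "Person"), ("bob", "name", "Bob"),
        ("acme", "type", "Company"), ("acme", "name", "Acme"),
    ],
)

# Corresponds to: SELECT ?x ?n WHERE { ?x type Person . ?x name ?n }
patterns = [("?x", "type", "Person"), ("?x", "name", "?n")]
sql, params = bgp_to_sql(patterns)
rows = sorted(conn.execute(sql, params).fetchall())
```

Each triple pattern adds one more copy of the table to the join, which is one reason the choice of translation strategy and relational schema has a large effect on the performance of the resulting SQL.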

S2ST: A next-generation RDF Database Management System featuring schema, data, and query mapping algorithms that bridge the gap between the relational and RDF graph data models.