Past Research Projects

UnityJDBC

UnityJDBC is a Java-based integration system that can integrate and query any number of relational databases. It provides dynamic, bottom-up federation/virtualization in which any number of sources can be queried and merged using SQL and JDBC. On top of this foundation is a semantic analysis system that performs schema matching by name correspondences.
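As a rough illustration of schema matching by name correspondences, the sketch below pairs columns from two sources whose names agree after normalization. The class `NameMatcher`, its methods, and the normalization rule are assumptions made for this example; they are not part of UnityJDBC or its API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not the UnityJDBC implementation or API.
class NameMatcher {

    // Normalize a column name: lower-case it and drop every character that is
    // not a letter or digit, so "Customer_ID" and "customerId" compare equal.
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
    }

    // Return pairs {columnFromA, columnFromB} whose normalized names are identical.
    static List<String[]> match(List<String> columnsA, List<String> columnsB) {
        Map<String, String> byNormalizedName = new LinkedHashMap<>();
        for (String b : columnsB) {
            // Keep the first column seen for each normalized name.
            byNormalizedName.putIfAbsent(normalize(b), b);
        }
        List<String[]> correspondences = new ArrayList<>();
        for (String a : columnsA) {
            String b = byNormalizedName.get(normalize(a));
            if (b != null) {
                correspondences.add(new String[] {a, b});
            }
        }
        return correspondences;
    }

    public static void main(String[] args) {
        List<String> orders = List.of("Customer_ID", "order_date", "Total");
        List<String> crm = List.of("customerId", "OrderDate", "region");
        for (String[] pair : match(orders, crm)) {
            System.out.println(pair[0] + " <-> " + pair[1]);
        }
    }
}
```

A real matcher would likely also need synonyms, abbreviations, and type checks; exact matching on normalized names is only the simplest case.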

The UnityJDBC system has been released as a commercial database integration product and is available at UnityJDBC.com. UnityJDBC uses the JDBC interface to allow databases such as Microsoft SQL Server, Oracle, Postgres, and MySQL to be queried by Java programs using one SQL query. This research also led to the formation of Heimdall Data, a company for intelligent database caching.

Publications:

Faster Querying for Database Integration and Virtualization with Distributed Semi-Joins, 2017 International Symposium on Big Data and Data Science, published by IEEE CPS.
Next Generation JDBC Database Drivers for Performance, Transparent Caching, Load Balancing, and Scale-out, 32nd Annual ACM Symposium on Applied Computing (SAC’17), pages 915-918. Best Poster Award out of 52 posters at conference.
Integration and Virtualization of Relational SQL and NoSQL Systems including MySQL and MongoDB, 2014 International Conference on Computational Science and Computational Intelligence (CSCI 2014), Las Vegas, NV, USA.
Reducing the Cost of Validating Mapping Compositions by Exploiting Semantic Relationships, ODBASE 2006 - On the Move to Meaningful Internet Systems 2006: CoopIS, DOA, GADA, and ODBASE (Lecture Notes in Computer Science 4275), Montpellier, France, pages 882-890. [25% acceptance rate] Presentation
Auto-completion of Underspecified SQL Queries, ER 2006 - 25th International Conference on Conceptual Modeling (ER 2006), Tucson, Arizona, November 2006, page 584.
Dynamic Database Integration in a JDBC Driver, ICEIS 2005 - 7th International Conference on Enterprise Information Systems - Databases and Information Systems Integration Track, Miami, FL, May 24-28, 2005. [43% acceptance rate] Presentation
AutoJoin: Providing Freedom from Specifying Joins, ICEIS 2005 - 7th International Conference on Enterprise Information Systems - Human-Computer Interaction Track, Miami, FL, May 24-28, 2005. [20% acceptance rate] Presentation
Composing Mappings Between Schemas Using a Reference Ontology, ODBASE 2004 - International Conference on Ontologies, Databases and Applications of SEmantics, Oct. 25-29, 2004, Larnaca, Cyprus [25% acceptance rate] Appears in On the Move to Meaningful Internet Systems 2004 (LNCS #3291) pages 783-800.
Flexible Semantic B2B Integration Using XML Specifications, SCI 2002 - The 6th World Multi-Conference on Systemics, Cybernetics, and Informatics, July 14-18, 2002, Orlando, Florida.
Using Unity to Semi-Automatically Integrate Relational Schema, Demonstration at ICDE 2002 - 18th International Conference on Data Engineering, February 26-March 1, 2002, San Jose, California, pages 329-330.
Querying Relational Databases without Explicit Joins, DASWIS 2001 - International Workshop on Data Semantics in Web Information Systems (in conjunction with ER'2001), November 29-30, 2001, Yokohama, Japan. [28% acceptance rate] Appears in Conceptual Modeling for New Information Systems Technologies (LNCS #2465) pages 278-291.
Integrating Relational Database Schemas using a Standardized Dictionary, SAC'2001 - 16th ACM Symposium on Applied Computing, March 11-14, 2001, Las Vegas, USA, pages 225-230.
Multidatabase Querying by Context, DataSem 2000 - 20th Annual Conference on the Current Trends in Databases and Information Systems, Brno, Czech Republic, October 21-24, 2000, pages 127-136.
Integrating Data Sources Using A Standardized Global Dictionary, 4th International Conference on Business Information Systems (BIS-2000), Poznan, Poland, April 12-13, 2000. Appears in Knowledge Discovery for Business Information Systems, Chapter 7, pages 153-172.
Unity - A Database Integration Tool, TRLabs Emerging Technology Bulletin, December 4, 2000.
Automatic Integration of Relational Database Schemas, U. of Manitoba Technical Report TR-00-15
Multidatabase Querying by Context, U. of Manitoba Technical Report TR-00-16
Unity - A Database Integration Tool, U. of Manitoba Technical Report TR-00-17

Join Algorithm Performance

A join is one of the most costly operators in relational databases. This research examined join algorithms for different applications including rapid user feedback (early hash join), skew-aware joins (histojoin), joins for integrated and distributed systems (slice join), and multi-way joins capable of combining more than two relations simultaneously. Special focus was on how database join algorithms can be applied to mobile and sensor devices and systems that use solid-state (flash) drives.

Histojoin has been included in Postgres 8.4.
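
For readers unfamiliar with the baseline these algorithms build on, here is a minimal sketch of a textbook in-memory hash join with a build phase and a probe phase. It is not early hash join, histojoin, or slice join; the class `HashJoinSketch`, the `Row` type, and the output format are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Textbook in-memory hash join (baseline sketch only, names are hypothetical).
class HashJoinSketch {

    // A tuple reduced to a join key and a payload for illustration.
    static final class Row {
        final int key;
        final String value;

        Row(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Join build and probe inputs on Row.key; each result is "buildValue|probeValue".
    static List<String> hashJoin(List<Row> build, List<Row> probe) {
        // Build phase: hash every tuple of the (ideally smaller) build input by key.
        Map<Integer, List<Row>> table = new HashMap<>();
        for (Row b : build) {
            table.computeIfAbsent(b.key, k -> new ArrayList<>()).add(b);
        }
        // Probe phase: look up each probe tuple and emit one result per match.
        List<String> results = new ArrayList<>();
        for (Row p : probe) {
            for (Row b : table.getOrDefault(p.key, List.of())) {
                results.add(b.value + "|" + p.value);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Row> build = List.of(new Row(1, "a"), new Row(2, "b"));
        List<Row> probe = List.of(new Row(2, "x"), new Row(3, "y"), new Row(2, "z"));
        hashJoin(build, probe).forEach(System.out::println);
    }
}
```

No result can be produced here until the whole build input has been read, and every build tuple is treated alike regardless of how often its key occurs in the probe input. Those are the kinds of limitations that early-result and skew-aware variants address.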

Publications:

Are Multi-way Joins Actually Useful?, 15th International Conference on Enterprise Information Systems (ICEIS 2013), Angers, France. [8% Full Paper Acceptance Rate]
Exploiting Join Cardinality for Faster Hash Joins, SAC 2009 - 24th ACM Symposium on Applied Computing (SAC'09), March 2009. Honolulu, HI. pages 1549-1554. [29% acceptance rate]
Using Intrinsic Data Skew to Improve Hash Join Performance, Information Systems Volume 34 No. 6 (September 2009) pages 493-510.
Improving Join Performance for Skewed Databases, CCECE 2008 - IEEE Canadian Conference on Electrical and Computer Engineering 2008, Niagara Falls, Ontario, Canada, May 2008, pages 387-391.
Using Slice Join for Efficient Evaluation of Multi-Way Joins, Data and Knowledge Engineering, Volume 67, Issue 1, October 2008, pages 118-139.
The Effect of Reading Policy on Early Join Result Production, Information Sciences, Volume 177, Issue 19, October 2007, pages 3939-3956.
Optimal Policies to Obtain the Most Join Results, Journal of Theoretical Probability, Volume 20, Issue 2, June 2007, pages 237-256.
Early Hash Join: A Configurable Algorithm for the Efficient and Early Production of Join Results, VLDB 2005 - 31st Very Large Data Bases Conference, August 30 to September 2, 2005, Trondheim, Norway, pages 841-852. [16% acceptance rate] Presentation
A Case for Merge Joins in Mediator Systems, IIWeb-2004 - Workshop on Information Integration on the Web (co-located with VLDB), Aug. 30, 2004, Toronto, Canada, pages 109-114. [50% acceptance rate]

NEXRAD Weather Archive Project

I was a co-PI on a multi-institution project based at the University of Iowa that constructed a system for the archival, analysis, and distribution of the weather radar data collected by the Next Generation Radar (NEXRAD) system in the United States. Besides severe weather forecasting, this data is useful for flood prediction, rainfall estimation, and even tracking bird and insect migration. I designed the original data archive architecture and worked on the problem of storing and analyzing this massive, multi-terabyte, real-time data set.

Publications:

Managing Data Quality in a Terabyte-Scale Data Archive, SAC 2008 - 23rd Annual ACM Symposium on Applied Computing, Fortaleza, Brazil, March 2008. Presentation
Towards Better Utilization of NEXRAD Data in Hydrology: an Overview of Hydro-NEXRAD, World Environmental and Water Resources Congress 2007: Restoring Our Natural Habitat, Volume 243, Number 40927, pages 288-296.
An Architecture for Real-Time Warehousing of Scientific Data, CSC 2005 - The 2005 International Conference on Scientific Computing, Monte Carlo Resort, Las Vegas, Nevada, USA (June 20-23, 2005). [37% acceptance rate] Presentation
Building a Terabyte NEXRAD Radar Database for Hydrometeorology Research, Computers & Geosciences, Volume 32, Issue 2, March 2006, pages 247-258.

Automated Testing Systems

This project built the AutoEdu testing and marking system, which uses question templates to generate randomized questions and mark them automatically. The AutoEdu system provides instant student feedback while eliminating costly manual marking. The system was deployed in first-year physics courses at UBC Okanagan and served over 800 students per year.
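
The idea of a question template can be sketched as follows: a template draws its parameters from a seeded random generator, renders the question text, and marks a numeric answer within a tolerance. The class `ForceQuestion`, the physics question, and the 1% tolerance are assumptions made for this example and do not describe AutoEdu's actual design.

```java
import java.util.Locale;
import java.util.Random;

// Hypothetical sketch of a randomized question template; not AutoEdu's actual design.
class ForceQuestion {
    final int massKg;
    final int accel;

    private ForceQuestion(int massKg, int accel) {
        this.massKg = massKg;
        this.accel = accel;
    }

    // Instantiate the template with parameters drawn from a seeded generator,
    // so each seed yields a different but reproducible question.
    static ForceQuestion generate(long seed) {
        Random rng = new Random(seed);
        int mass = 1 + rng.nextInt(20);  // 1..20 kg
        int acc = 1 + rng.nextInt(10);   // 1..10 m/s^2
        return new ForceQuestion(mass, acc);
    }

    String text() {
        return String.format(Locale.ROOT,
            "A %d kg cart accelerates at %d m/s^2. What net force (in N) acts on it?",
            massKg, accel);
    }

    // Expected answer from F = m * a.
    double expectedAnswer() {
        return (double) massKg * accel;
    }

    // Automatic marking: accept answers within 1% of the expected value.
    boolean mark(double studentAnswer) {
        double expected = expectedAnswer();
        return Math.abs(studentAnswer - expected) <= 0.01 * Math.abs(expected);
    }

    public static void main(String[] args) {
        ForceQuestion q = generate(42L);
        System.out.println(q.text());
        System.out.println("Correct answer accepted: " + q.mark(q.expectedAnswer()));
    }
}
```

Seeding by student or attempt keeps a question reproducible for later review while still varying it between students.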

Publications:

Experiences using an Automated Testing and Learning System, Computers and Advanced Technology in Education (CATE 2011).

Database-Assisted Real-Time Path Finding in Video Games

The goal of this research is to use database technology to support real-time path finding in video games. Using a pre-computed database allows for real-time path finding on even the largest maps and with thousands of simultaneous path-finding agents. This research is conducted in collaboration with Vadim Bulitko at the University of Alberta.
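
To make the trade of pre-computation for run-time speed concrete, the sketch below builds a next-move table for every goal cell of a small grid offline using breadth-first search, then answers path queries with table lookups only. This is a deliberately simplified all-pairs table, not the method of the publications below; the class `PathDatabase` and the map encoding ('#' for walls) are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a pre-computed next-move table for a 4-connected grid.
class PathDatabase {
    private final int width;
    private final int height;
    private final boolean[] blocked;
    // nextHop[goal][cell] = cell to step to from 'cell' when heading to 'goal';
    // -1 if 'cell' is the goal itself or cannot reach it.
    private final int[][] nextHop;

    PathDatabase(String[] map) {
        height = map.length;
        width = map[0].length();
        int cells = width * height;
        blocked = new boolean[cells];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                blocked[y * width + x] = map[y].charAt(x) == '#';
            }
        }
        // Offline phase: one breadth-first search per open goal cell.
        nextHop = new int[cells][];
        for (int goal = 0; goal < cells; goal++) {
            nextHop[goal] = blocked[goal] ? null : searchFrom(goal);
        }
    }

    private int[] searchFrom(int goal) {
        int[] next = new int[width * height];
        Arrays.fill(next, -1);
        boolean[] seen = new boolean[width * height];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        seen[goal] = true;
        queue.add(goal);
        int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (!queue.isEmpty()) {
            int cell = queue.poll();
            int x = cell % width;
            int y = cell / width;
            for (int[] m : moves) {
                int nx = x + m[0];
                int ny = y + m[1];
                if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                int neighbour = ny * width + nx;
                if (blocked[neighbour] || seen[neighbour]) continue;
                seen[neighbour] = true;
                next[neighbour] = cell; // stepping to 'cell' moves one step closer to the goal
                queue.add(neighbour);
            }
        }
        return next;
    }

    // Online phase: follow table entries only; returns an empty list if no path exists.
    List<Integer> path(int start, int goal) {
        List<Integer> cellsOnPath = new ArrayList<>();
        if (blocked[start] || blocked[goal]) return cellsOnPath;
        int cell = start;
        cellsOnPath.add(cell);
        while (cell != goal) {
            cell = nextHop[goal][cell];
            if (cell < 0) return new ArrayList<>();
            cellsOnPath.add(cell);
        }
        return cellsOnPath;
    }

    public static void main(String[] args) {
        PathDatabase db = new PathDatabase(new String[] {"..#.", "..#.", "...."});
        System.out.println(db.path(0, 3));
    }
}
```

The table in this sketch grows quadratically with the number of cells, so it only suits small maps; a practical database would need a far more compact representation.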

Publications:

Database-Driven Real-time Heuristic Search in Video-game Pathfinding, IEEE Transactions on Computational Intelligence and AI in Games.
Fast Grid-based Path Finding for Video Games, 26th Canadian Conference on Artificial Intelligence (AI'13), pages 100-111. [27% Acceptance Rate]
Trading Space for Time in Grid-based Path Finding, AAAI-13 Student Abstract and Poster Program 2013.
On Case Base Formation in Real-Time Heuristic Search, 8th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, AIIDE-12, Stanford, CA, October 2012, pages 106-111.

Student Projects

Simulating Multidatabase Transaction Management Protocols - Transaction management in a multidatabase is complex because the autonomy of each participating database must not be violated. I constructed a transaction manager (TM) simulator in C++ capable of implementing and comparing any TM protocol. The simulator was used by Aruna Adil for her Master's thesis to compare several protocols, and was specifically used to show that the more serial Barker's algorithm is actually more efficient than the higher-concurrency Ticket Method protocol due to database hotspots and frequent conflicts between transactions.

U. of Manitoba Technical Report TR-98-05
Computer Applications in Industry and Engineering (CAINE-98) - Las Vegas, Nevada, November 11-13, 1998, pages 93-97.

Dynamic Schema Evolution - A summer project funded by NSERC in 1996 was the construction of an XWindows graphical user interface for TIGUKAT. The GUI implemented TIGUKAT routines for dynamic schema evolution (changes) in an object-oriented database.

DSVM Objectbase - An undergraduate research project involved constructing a distributed shared virtual memory (DSVM) objectbase using the EXODUS storage manager, Treadmarks shared memory module, and a custom UNIX/C++ user interface. The project produced a working objectbase prototype in the summer of 1995 and was funded by NSERC.

Constructing a Distributed Objectbase using DSVM, NSERC Research Project 1995
Using DSVM to Implement Transparent Distributed File Access, Undergraduate Research Project 1995

Optimal Fragment Placement using Simulated Annealing - My first NSERC-funded summer project in 1994 involved building a C++ program that used simulated annealing to place fragments of a distributed objectbase. The fragments were pre-partitioned using vertical and horizontal fragmentation algorithms. The goal of the implementation was to determine an optimal placement of the fragments at the various sites of the distributed database. The simulated annealing process, so named because it mimics the way a material settles into a low-energy state when its temperature is lowered slowly, produced excellent results for the NP-hard search space.

Simulated annealing project overview, NSERC funded project 1994
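
The annealing loop for fragment placement can be sketched as below. The original program was written in C++; this Java sketch is not that program. The cost model (remote accesses plus a load-balance penalty), the cooling schedule, and every name in the class `FragmentPlacement` are assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical sketch of simulated annealing for fragment placement.
class FragmentPlacement {

    // access[f][s] = how often site s accesses fragment f (assumed input).
    // Cost = accesses served remotely + balanceWeight * sum of squared site loads.
    static double cost(int[] placement, int[][] access, int sites, double balanceWeight) {
        double remote = 0;
        int[] load = new int[sites];
        for (int f = 0; f < placement.length; f++) {
            load[placement[f]]++;
            for (int s = 0; s < sites; s++) {
                if (s != placement[f]) remote += access[f][s];
            }
        }
        double imbalance = 0;
        for (int l : load) imbalance += (double) l * l;
        return remote + balanceWeight * imbalance;
    }

    static int[] anneal(int[][] access, int sites, double balanceWeight, long seed) {
        Random rng = new Random(seed);
        int[] current = new int[access.length]; // start with every fragment at site 0
        double currentCost = cost(current, access, sites, balanceWeight);
        int[] best = current.clone();
        double bestCost = currentCost;

        // Geometric cooling: high temperatures accept many uphill moves, low ones few.
        for (double t = 100.0; t > 0.01; t *= 0.995) {
            int f = rng.nextInt(current.length);
            int oldSite = current[f];
            current[f] = rng.nextInt(sites); // propose moving one fragment
            double newCost = cost(current, access, sites, balanceWeight);
            double delta = newCost - currentCost;
            if (delta <= 0 || rng.nextDouble() < Math.exp(-delta / t)) {
                currentCost = newCost; // accept the move
                if (currentCost < bestCost) {
                    bestCost = currentCost;
                    best = current.clone();
                }
            } else {
                current[f] = oldSite; // reject and undo
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] access = {{9, 1, 0}, {0, 8, 2}, {1, 0, 7}, {5, 5, 0}};
        int[] placement = anneal(access, 3, 1.0, 7L);
        System.out.println(java.util.Arrays.toString(placement)
            + " cost=" + cost(placement, access, 3, 1.0));
    }
}
```

Occasionally accepting a worse placement while the temperature is high is what lets the search escape local minima; the method returns the best placement seen, which is not guaranteed to be optimal.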
