This would allow organisations to run a new database without having to wait for a custom interface. and the tool was presented at last week’s ACM (Association for Computing Machinery) International Conference on Management of Data. The tool’s main drop-down menu has 17 entries, such as “hide,” “sort,” “filter,” and “delete” that will look familiar to spreadsheet users. In the conference paper, (right) Bakke and Karger prove that those apparently simple functions are enough to construct any database query possible in SQL-92, which is the core of the version of SQL taught in most database classes.
While some database queries are simple as in a company wanting to printout names and phone numbers of all of its customers. But it might also want a printout of just those customers in a given zip code whose purchase totals exceeded some threshold amount over a particular time span. If each purchase has its own record in the database, the query will need to include code for summing up the purchase totals and comparing them to the threshold quantity.
To makes things even more complicated a database will generally store related data in different tables. For demonstration purposes, Bakke loaded several existing databases into his system. One of them, a database used at MIT to track research grants, has 35 separate tables; another, which records all the information in a university course catalogue, has 15.
Bakke and Karger’s tool lets the user pull in individual columns from any table — say, name and phone number from the first, purchase orders and dates from the second, and products from the third. (The tool will automatically group the products associated with each purchase order together in a single spreadsheet “cell.”) A filter function just like that found in most spreadsheet programs can restrict the date range and limit the results to those that include a particular product. The user can then hide any unnecessary columns, and the report is complete.
Where previous projects have explored techniques for database query construction using editable flow-chart diagrams or virtual buttons that can be snapped together the Bakke and Karger’s tool enables “direct manipulation” of data. “It really harkens back to our physical nature, that we’re very comfortable with the idea that if I pick something up and I twist it, then it will twist, and if I shake it, it will shake” Karger (left) says. “You want the same feeling when you’re manipulating information in a computer — that you’re picking up the information and pushing it this way or sliding it that way or cutting things out — instead of writing some instructions telling the computer to do something.
“Database querying is hard, but we can make it tolerable,” Bakke says. At present, Bakke’s tool enables query construction on an existing database, but does not enable direct data entry or modification. He expects to add that functionality over the next six months, and his hope is to release the tool in a year or so.
“It’s almost ironic,” Karger says. “Eirik’s software is far more robust than just about everything that graduate students have built. But he’s not satisfied with releasing it in its current form. He’s aiming for something of commercial quality.” Bakke concludes, “When you’re dealing with people’s data, you really need to get it right,”