Student conducts significant computer research
Cody Kinneer, ’16, a computer science major and political science minor, has recently published two papers on his research into evaluating software performance. Both papers describe a software tool known as ExpOse and how it evaluates the performance limits of database schema-testing software.
“We use software everywhere,” Kinneer said, when summarizing the importance of his work. “There is software telling your toaster how much heat it should use, software guiding rockets that take off into space… if you have a pacemaker, there’s probably a computer making sure it doesn’t shock you when you’re okay.”
Specifically, Kinneer’s research examines databases, or systems that manage large amounts of data. A company’s financial database, in an example outlined by Kinneer, might contain accounting information, inventory records and all of the transactions the company processes.
Databases need schemas, or sets of gatekeeper rules that determine what types of data are allowed into the database.
“In the example of a hospital database…the schema tells you that, in the patient records, a certain field has to be a valid blood type,” Kinneer explained. “So if you’re trying to type in a blood type and you type in negative three instead of O negative, the schema will tell you no, that isn’t a valid blood type, try again. That’s important because if you’re looking at a patient’s record you don’t want to see negative three as the blood type when you’re desperately trying to figure out what kind of blood type to give them.”
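Kinneer’s hospital example can be sketched in a few lines of Python using the SQLite database that ships with the language. This is only an illustration of a schema acting as a gatekeeper, not code from Kinneer’s project: the table name `patients`, its columns and the helper `try_insert` are all assumptions made for the example.

```python
import sqlite3

# Assumed list of valid values; the article only says the field
# "has to be a valid blood type".
VALID_BLOOD_TYPES = ("A+", "A-", "B+", "B-", "AB+", "AB-", "O+", "O-")


def make_patient_db():
    """Create an in-memory database whose schema guards the blood type field."""
    conn = sqlite3.connect(":memory:")
    allowed = ", ".join(f"'{b}'" for b in VALID_BLOOD_TYPES)
    conn.execute(
        "CREATE TABLE patients ("
        " name TEXT NOT NULL,"
        f" blood_type TEXT NOT NULL CHECK (blood_type IN ({allowed}))"
        ")"
    )
    return conn


def try_insert(conn, name, blood_type):
    """Return True if the schema accepts the record, False if it rejects it."""
    try:
        conn.execute(
            "INSERT INTO patients (name, blood_type) VALUES (?, ?)",
            (name, blood_type),
        )
        return True
    except sqlite3.IntegrityError:
        # The CHECK or NOT NULL rule in the schema refused the value.
        return False


conn = make_patient_db()
print(try_insert(conn, "Pat", "O-"))  # a valid blood type
print(try_insert(conn, "Pat", "-3"))  # "negative three" is turned away
```

The schema, not the program entering the data, is what refuses the bad record, which is why a mistake in the schema itself can let bad data through.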
Kinneer’s research branches off from the work of Dr. Phil McMinn, a Reader (the United Kingdom equivalent of Associate Professor), and Ph.D. student Chris Wright at the University of Sheffield in the U.K. McMinn and Wright have been developing an ongoing project called Schema Analyst.
This program generates test data entries and submits them to a schema to see whether the schema accepts each entry as good or rejects it as bad. It generates tests for every rule in the schema, so that one can determine what kinds of bad data the schema is failing to reject. The algorithm that the program uses is known as a search-based approach.
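The idea of generating test data for a schema can be sketched as follows. This is a simplified illustration and not Schema Analyst’s actual algorithm: plain random search stands in for the search-based approach, whose details are not given here, and the `patients` table, the `age` rule and every function name are assumptions invented for the example.

```python
import random
import sqlite3

# Intended rule for each column (hypothetical): what the schema SHOULD enforce.
INTENDED_RULES = {"age": lambda v: 0 <= v <= 120}


def search(rule, want_accept, rng, tries=10_000):
    """Randomly search for a value the rule accepts (or rejects)."""
    for _ in range(tries):
        candidate = rng.randint(-1000, 1000)
        if rule(candidate) == want_accept:
            return candidate
    return None  # the search budget ran out


def db_accepts(ddl, column, value):
    """Build the schema under test and report whether it accepts the value."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(ddl)
        conn.execute(f"INSERT INTO patients ({column}) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


def find_mismatches(ddl, rules, seed=0):
    """List test entries where the schema disagrees with the intended rule."""
    rng = random.Random(seed)
    mismatches = []
    for column, rule in rules.items():
        for want_accept in (True, False):
            value = search(rule, want_accept, rng)
            if value is not None and db_accepts(ddl, column, value) != want_accept:
                mismatches.append((column, value, want_accept))
    return mismatches


good = "CREATE TABLE patients (age INTEGER CHECK (age BETWEEN 0 AND 120))"
lax = "CREATE TABLE patients (age INTEGER)"  # the rule was forgotten
print(find_mismatches(good, INTENDED_RULES))  # no disagreements
print(find_mismatches(lax, INTENDED_RULES))   # one bad value slipped through
```

The second schema accepts an out-of-range age, which is exactly the kind of data that is not rejected but ought to be.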
“When I came onto the project no one really understood what the performance trade-offs were of this search-based system for generating test data,” Kinneer said. “They knew that it worked well and that it came up with high-quality test data, but they didn’t really know how long it was going to take to do that and how scalable it was.”
Schema Analyst worked reasonably well, Kinneer explained, when given a small input schema to generate test data for. The questions that Kinneer’s software is designed to answer are: what would happen if we were to test a very large input schema, and at what point would additional resources no longer help the program handle large schemas?
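One common way to probe a scalability question like this is to keep doubling the size of the input and watch how much the work grows each time. The sketch below illustrates that general technique only; it is an assumption for the sake of the example, not a description of how Kinneer’s tool works internally. The function `hypothetical_generator` is an invented stand-in that counts steps rather than measuring real time.

```python
import math


def hypothetical_generator(num_columns):
    """Stand-in for a test data generator; returns how many steps it took.

    It compares every column with every other column, so the work grows
    with the square of the schema size.
    """
    steps = 0
    for _ in range(num_columns):
        for _ in range(num_columns):
            steps += 1
    return steps


def doubling_ratios(generator, start=8, rounds=5):
    """Double the input size each round and record the growth in work."""
    ratios = []
    size = start
    previous = generator(size)
    for _ in range(rounds):
        size *= 2
        current = generator(size)
        ratios.append(current / previous)
        previous = current
    return ratios


def estimated_order(ratios):
    """If the work grows like n**k, doubling n multiplies it by about 2**k."""
    return math.log2(ratios[-1])


ratios = doubling_ratios(hypothetical_generator)
print(ratios)
print(estimated_order(ratios))
```

A ratio that settles near two suggests the work grows in step with the schema, while a ratio near four suggests it grows with the square of the schema size, which becomes costly for very large schemas.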
As a solution to the problem of search-based performance analysis, Kinneer developed his own software system, which eventually became known as ExpOse.
Luke Smith, ’16, a computer science major, helped Kinneer develop the software over the summer.
“It was just something we started in our free time,” Smith said.
Computer Science Chair and Associate Professor Gregory Kapfhammer has been working with Kinneer on the project since the 2014-15 academic year. During that time, Kinneer was selected as one of two Cupper Scholars in the computer science department.
“This isn’t just a research project,” Kapfhammer said. “The system that Cody has developed is something that I’m going to use when I teach the second level computer science class at Allegheny. He’s created something that is going to help people in industry, it’s going to help people who are doing research in this area, and it’s going to be useful to students in computer science.
“It runs the gamut of people all the way in their second class to researchers at the top echelons of their fields. And that’s neat, when you have one tool, one concept, that does all of those things.”
Kinneer has also released his software on GitHub, a popular code-sharing site used by computer scientists in academia and in industry.
“If you want to be able to release your code to the rest of the world, you create a GitHub site,” Kapfhammer explained. “Essentially, GitHub is like Google Documents for computer scientists. It’s specifically designed for sharing the artifacts and deliverables that are important to computer scientists.”
Kinneer was also invited to present his software at the 2015 International Conference on Software Engineering and Knowledge Engineering in Pittsburgh, Pa., this summer.
Kapfhammer concluded, “I do think that there is this sweet spot in the field of computing. You have to have a great idea, with one foot in the realm of theory, one foot in the realm of practice. Then you have to explain that idea to the rest of the world, through both writing and through speaking. And then you have to back it all up with releases of software and data.
“[Kinneer] did all of those five things. And I think that’s what makes this really exciting: that there’s the idea, it’s mathematically well-founded, it’s been implemented with one foot in the realm of practice, he’s done good writing and good speaking, and additionally, it’s now transitioning into practice, both at Allegheny, and he’s released it to the world, and that’s what scientists should do.”