Indiana University

 

Karma Provenance Collection Tool

Overview
Provenance (also called lineage or trace) of digital scientific data is critical to broader sharing and reuse of that data. Provenance captures the information needed to attribute ownership and to determine, among other things, the quality of a particular data set. Provenance collection is often tightly coupled to a cyberinfrastructure system, but it is better served by a standalone tool. Karma is such a tool: it can be added to existing cyberinfrastructure to collect and represent provenance data. Its modular architecture supports multiple instrumentation plugins, which makes it usable in different architectural settings.

The Karma Provenance Tool is licensed under the Apache License, Version 2.0 (the "License") (http://www.apache.org/licenses/LICENSE-2.0). The code is copyrighted by The Trustees of Indiana University. Karma is a product of the Data to Insight Center of the Pervasive Technology Institute (http://pti.iu.edu) at Indiana University. See Digital Data Provenance for more information.

Features of Latest Release (v3.1)

  • Event ingest using the RabbitMQ enterprise messaging system
  • OPM v1.1 compliant
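
To illustrate what OPM v1.1 compliance implies for the data being collected, here is a minimal sketch in Python. It is not Karma's API or message schema: the class OpmGraph, its methods, and the JSON layout are hypothetical names chosen for illustration. The sketch relies only on the node kinds (artifact, process, agent) and the five causal edge types that OPM v1.1 defines, each pointing from effect to cause.

```python
import json
from dataclasses import dataclass, field

# OPM v1.1 node kinds and its causal edge types, written as (effect kind, cause kind).
NODE_KINDS = {"artifact", "process", "agent"}
EDGE_RULES = {
    "used": ("process", "artifact"),
    "wasGeneratedBy": ("artifact", "process"),
    "wasControlledBy": ("process", "agent"),
    "wasTriggeredBy": ("process", "process"),
    "wasDerivedFrom": ("artifact", "artifact"),
}


@dataclass
class OpmGraph:
    """Hypothetical in-memory provenance graph; not part of Karma."""
    nodes: dict = field(default_factory=dict)  # node id -> node kind
    edges: list = field(default_factory=list)  # (edge type, effect id, cause id)

    def add_node(self, node_id, kind):
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown OPM node kind: {kind}")
        self.nodes[node_id] = kind

    def add_edge(self, edge_type, effect, cause):
        if edge_type not in EDGE_RULES:
            raise ValueError(f"unknown OPM edge type: {edge_type}")
        effect_kind, cause_kind = EDGE_RULES[edge_type]
        # Reject edges whose endpoints are missing or of the wrong kind.
        if self.nodes.get(effect) != effect_kind or self.nodes.get(cause) != cause_kind:
            raise ValueError(f"{edge_type} must link a {effect_kind} to a {cause_kind}")
        self.edges.append((edge_type, effect, cause))

    def to_message(self):
        # Assumed JSON layout, e.g. as the body of a message handed to a broker.
        return json.dumps({"nodes": self.nodes, "edges": self.edges}, sort_keys=True)


graph = OpmGraph()
graph.add_node("raw.csv", "artifact")
graph.add_node("clean.csv", "artifact")
graph.add_node("clean-step", "process")
graph.add_node("alice", "agent")
graph.add_edge("used", "clean-step", "raw.csv")
graph.add_edge("wasGeneratedBy", "clean.csv", "clean-step")
graph.add_edge("wasControlledBy", "clean-step", "alice")
graph.add_edge("wasDerivedFrom", "clean.csv", "raw.csv")
payload = graph.to_message()
```

The string in payload could serve as the body of a provenance event sent through a messaging system such as RabbitMQ; the queue names and event format that Karma actually expects are described in its own documentation and are not assumed here.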

Download v3.1

Download v3.0
Karma v3.0 service: The Karma distribution files required to run the provenance service and GUI [source code] [binary]. Minimum required versions: Apache Axis2 1.4, JDK 1.5, MySQL 5.1, XMLBeans 2.3.0.

Axis2 handlers v1.0: The Axis2 handlers are packaged as a client handler and a service handler; which one to use depends on the role played by the entity being instrumented. See the documentation for more information on models. [source code] [service handler binary] [client handler binary]

 

Publications

  • Bin Cao, Beth Plale, Girish Subramanian, Ed Robertson, Yogesh Simmhan, Provenance Information Model of Karma Version 3, IEEE 2009 Third International Workshop on Scientific Workflows (SWF'09), July 2009.
  • Bin Cao, Girish Subramanian, Beth Plale, Poster: Provenance Collection in an Industry Biochemical Discovery Cyberinfrastructure, IEEE e-Science, Indianapolis, IN, December 2008.
  • The Open Provenance Model (v1.01). Moreau, L. (Editor), B. Plale, S. Miles, C. Goble, P. Missier, R. Barga, Y. Simmhan, J. Futrelle, R. McGrath, J. Myers, P. Paulson, S. Bowers, B. Ludaescher, N. Kwasnikowska, J. Van den Bussche, T. Ellkvist, J. Freire, P. Groth, Technical Report, Electronics and Computer Science, University of Southampton, 2008. http://eprints.ecs.soton.ac.uk/16148
  • Yogesh L. Simmhan, Beth Plale, Dennis Gannon, Query Capabilities of the Karma Provenance Framework, Concurrency and Computation: Practice and Experience, Vol 20, Issue 5, pp. 441-451, John Wiley and Sons, 2008.
  • Yogesh Simmhan, Beth Plale, and Dennis Gannon, Karma2: Provenance Management for Data Driven Workflows, Extended and invited from ICWS 2006. International Journal of Web Services Research, IGI Publishing, Vol 5, No 2, 2008.
  • Yogesh Simmhan, Beth Plale, Dennis Gannon, Towards a Quality Model for Effective Data Selection in Collaboratories, IEEE Workshop on Workflow and Data Flow for Scientific Applications (SciFlow06), held in conjunction with ICDE, Atlanta, GA, April 2006. [Slides]
  • Yogesh Simmhan, Beth Plale, Dennis Gannon, A Performance Evaluation of the Karma Provenance Framework for Scientific Workflows, International Provenance and Annotation Workshop (IPAW'06), Lecture Notes in Computer Science 4145, L. Moreau and I. Foster (Eds), Springer-Verlag, Berlin Heidelberg, pp. 222-236, 2006. [Slides]
  • Yogesh Simmhan, Beth Plale, and Dennis Gannon, A Framework for Collecting Provenance in Data-Centric Scientific Workflows, Proceedings of the IEEE International Conference on Web Services pp. 427-436, 2006.
  • Yogesh L. Simmhan, Beth Plale, and Dennis Gannon, A Survey of Data Provenance in e-Science, ACM SIGMOD Record, Vol. 34, No. 3, September 2005.
  • Yogesh L. Simmhan, Beth Plale, and Dennis Gannon, A Survey of Data Provenance Techniques, Technical Report TR-618, Computer Science Department, Indiana University, Bloomington, 2005.

Contact

  • Beth Plale [plale at indiana dot edu]
  • Yiming Sun [yimsun at indiana dot edu]

Project Contributors

  • Beth Plale, Project Director 
  • Yiming Sun, Senior Software Developer
  • Bin Cao
  • Dennis Gannon
  • You-Wei Cheah
  • Devarshi Ghoshal
  • Ed Robertson
  • Yogesh Simmhan
  • Girish Subramanian
  • Yuan Luo
