Contributing to PhysioNet


PhysioNet is growing, thanks to the generosity of our contributors around the world. Your contributions are welcome.

PhysioNet is a forum for the exchange of data and software among researchers. We welcome contributions of data to PhysioBank, and of software to PhysioToolkit. If you have publications that are based on your contributed data or software (or on existing PhysioBank or PhysioToolkit materials), we encourage you to contribute copies (or citations) to the PhysioNet Publications pages. If you have developed tutorials, class notes or problem sets, or other materials that can help others to understand or use PhysioNet better, please share them with the community.

We created PhysioNetWorks as a collaborative virtual laboratory for development of data and software that will be contributed to PhysioNet (whether immediately or at some time in the future). All contributions go first to PhysioNetWorks for review before they appear in PhysioBank, PhysioToolkit, or the PhysioNet Library.

Submitting your contribution

Please read this entire page before submitting your contribution!

  1. If you have not already done so, create a PhysioNetWorks account for yourself. See PhysioNetWorks for details on this step and those that follow.
  2. Create a PhysioNetWorks project for your contribution and upload your files to it. (If you have many files, or if they require more than one gigabyte, just a sample of your files is sufficient.) During this period, the project is visible only to you. When ready, mark your project as ready to activate.
  3. We will review your contribution (usually within a day or two, but not on weekends or US holidays). If your project is suitable for eventual free dissemination on PhysioNet, it will be activated as a restricted-access project that you may share with selected colleagues if you wish. If your project requires additional work before it can be activated, you will receive an email describing what is needed.
  4. PhysioNetWorks provides an environment in which you and your collaborators can continue to add to and further develop your contribution. When you are ready, mark your project as ready to publish. At this time, the contents of your project should conform to the guidelines below.

We strongly urge contributors to use PhysioNetWorks to upload their files, but we will accept contributions on standard media (CDs, DVDs, flash memory, USB drives) if necessary. Please contact us if you cannot upload your contribution.

All contributions are reviewed

Rigorous review of the data and software available from PhysioNet is essential so that researchers can confidently make use of these materials. For this reason, we attempt to make explicit the extent to which all data and software have been reviewed, by assigning each item to one of three classes:

  • Class 1 databases and software are fully supported. Class 1 databases have been carefully scrutinized and have been thoroughly annotated. Class 1 software has been extensively and rigorously tested. We will correct and document any remaining errors, and encourage users to bring these to our attention.
  • Class 2 databases and software are archival copies of materials that support published research, contributed by authors or journals. We will maintain copies of the original data and software together with corrections submitted by the authors. We encourage users to report errors directly to the authors; if you do not receive a response from the authors after a reasonable time, please let us know.
  • Class 3 databases include collections of data that may have been less thoroughly studied than those in class 1, but that may be of interest to the research community. These databases include works in progress, to which users are invited to contribute. In some cases, these databases may be archived on their creators' web sites. Class 3 software includes code that may need further testing or development; again, you are invited to dig in and help their creators transform these works in progress into robust and useful tools for research.

The listing of PhysioBank databases and the PhysioToolkit software index indicate which fully-supported databases and software belong to class 1. We make class 2 and class 3 data and software available via PhysioNet as a service to the research community. Contributed data and software are placed in classes 2 and 3 on acceptance, and may be admitted to class 1 after review and a public comment period.

Guidelines

PhysioNetWorks hosts works in progress that have not yet been admitted to one of the classes above. In order to establish and to maintain the highest standard of quality in the data and software available from PhysioNet, we use the guidelines below to determine when a work in progress is complete. When both its author and PhysioNet agree, a completed work can be transferred to a suitable publicly accessible area of PhysioNet.

Data Contributions

We follow these guidelines in soliciting and accepting contributions of data:

  • Selection of subjects: Although data from longitudinal, population-based studies are ideal, we accept data from other types of studies provided that the method of selection has been carefully documented. At least a minimum of demographic information (age, gender, medications, diagnoses) should be available for each subject; additional demographic information (e.g., race, ethnicity) is desirable. Anonymity of all subjects must be assured. For data from experimental in vivo or in vitro studies, we favor studies where meaningful time series have been collected and detailed descriptions of the experimental conditions are provided.
  • Recording and data preparation: Almost all PhysioBank data were originally recorded digitally. We may accept analog recordings if no comparable data are available to us in digital form. Technical details of the recordings (signal types, transducer types and locations, recording bandwidth, instrumentation used) must be supplied. Please consult us for recommendations before digitizing analog recordings.
  • Verifiability: Databases of derived time series (such as heart rate or RR interval time series) are of limited interest and value, unless the original signals from which the time series were derived are included so that independent verification is possible. Moreover, information contained in the original signals is often useful for purposes other than those served by the derived time series. Although we will sometimes accept databases containing only derived time series (see, for example, the Gait Maturation Database), we strongly encourage contributions that include the original signals.
  • Annotation: Depending on the nature of the data, annotations at varying levels of detail may be required. In most cases, an initial set of annotations should be provided or funded by the contributor. We will provide training and facilities to visiting contributors in order to accomplish this goal.

Preparing a contribution of data:

  • Protected health information (PHI), that is, health information that would permit identification of individual human subjects, cannot be shared freely under US law, which governs what PhysioNet can do. PhysioNet provides software that can assist in removing PHI from your data so that they can be included in PhysioBank. If you have questions about this requirement, please ask us for help.
  • Upload a plain text file containing a brief description of your data set (no more than one page). Call this file “README”. This should include:
    • The name by which you would like the data set known
    • The name(s) of the creator(s) of the data set
    • A paragraph or two describing the data set: What are its contents? How were recordings made? How were subjects selected? How were annotations (if any) created and verified? If you defined your own annotation types, what are they? (If your annotation scheme is too complex to describe in a sentence or two, please supply a separate file describing the annotations, and include a reference to that file in your “README”.)
    • References to any published works (include URLs if available on-line) that describe or make use of the data set, in the form in which these works should be cited in any future publications
    • Contact information: your name and e-mail address, and, if possible, the name and e-mail address of an alternate contact who can answer or refer technical questions about the data set
    Much, if not all, of this information will also be in the abstract that you prepare at the time you create your PhysioNetWorks project. The README can be a copy of the abstract in this case.
  • Supply a plain text file containing a list of the records by record name, with one record name per line. Record names may contain digits, underscores, and lower-case letters only. Call this file “RECORDS”.
  • For any signals of types not listed in wfdbcal, please supply additional one-line entries to be added to that calibration file, in a plain text file named “CALIBRATION”. Details of the format are at the top of wfdbcal, and also in wfdbcal(5).
  • For each recording in your data set, supply one of the following:
  • Optional: please include or give a reference to any accompanying clinical information for a record in a file named RECORD-info.txt (where RECORD is replaced by the name of the record). Alternatively, this information can be summarized for all records in one file called info.txt.
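
The naming rule for the “RECORDS” file can be checked mechanically before upload. The following is a minimal sketch in Python, not an official PhysioNet tool: the function name `invalid_record_names` is hypothetical, and treating blank lines as invalid entries is an assumption that the guidelines above do not state.

```python
import re

# Rule from the guidelines: record names may contain digits, underscores,
# and lower-case letters only.
# Assumption (not stated in the guidelines): a name must be non-empty,
# so a blank line in RECORDS is reported as invalid.
RECORD_NAME = re.compile(r"[0-9a-z_]+")


def invalid_record_names(lines):
    """Return the entries that break the naming rule, in input order."""
    bad = []
    for line in lines:
        name = line.rstrip("\r\n")  # drop the line ending only
        if not RECORD_NAME.fullmatch(name):
            bad.append(name)
    return bad
```

To check a file, pass the open file object, for example `invalid_record_names(open("RECORDS"))`; an empty list means every line satisfies the rule as interpreted here.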

Software Contributions

We seek contributions of software to PhysioToolkit, with emphasis on applications that are of potential value to users of PhysioBank. Our guidelines for contributors of software, which we continue to develop in consultation with our Advisory Board, are similar in scope and intent to our guidelines for contributors of data.

  • Open-source licensing: We accept contributions of software only if sources are included and made available under a license compliant with the Open Source Definition. We have adopted this policy to protect all users of PhysioToolkit, including ourselves. By adhering to this policy, we ensure that users can verify that the software is performing its intended functions and no others, and that the software can be modified to suit the varying requirements of its users.
  • No access control: We will not undertake to maintain access control over contributed software. Such contributions, if accepted, will be made freely available to the research community.
  • Conformity to stated use: The software must be demonstrated to perform the task intended for it. If its performance is dependent on its input data (as, for example, a QRS detector), such dependencies must be characterized with reference to relevant data from PhysioBank.
  • Conformity to interface standards: We encourage the use of common interfaces. PhysioToolkit software tends to follow the “toolbox” approach as pioneered in the UNIX environment. Relatively small applications that perform well-defined tasks can be designed, implemented, and rigorously tested at reasonable cost. With careful attention to interfaces, they can be designed to inter-operate predictably and usefully with other tools. By contrast, the “Swiss Army chainsaw” approach, exemplified by the massive, monolithic applications common in the commercial software industry, is ill-suited to rapid development of software for leading-edge research.
  • Portability: Software accepted for inclusion in PhysioToolkit should be portable across supported operating environments. Currently, these environments include GNU/Linux, MacOS/X (Darwin) 10.2 and later, MS-Windows 9x and later (using the freely available Cygwin package), and Unix (including FreeBSD and Solaris). We favor software that conforms to open systems standards (POSIX and, for interactive software, X11). Most existing PhysioToolkit software is written in the C programming language, for performance and portability. Non-portable software may be accepted provisionally if comparable portable software is unavailable.
  • Source-level documentation: Sources for all PhysioToolkit software must be deposited in the PhysioToolkit software source archive. The sources for each package must include a working makefile (a formal description of the dependencies of the source files on each other and on any external sources, used by the make utility to manage the compilation process). Each source file must include comments at the beginning of the file to identify (at least) the name of the file, the author, and the date of last revision.
  • User documentation: Each application (or independently usable component, such as a subroutine library) must be documented by a UNIX-format man (manual) page (see the WFDB Applications Guide for examples). This requirement ensures that the contents of the PhysioToolkit library can be searched using a wide variety of standard search engines; man pages are mechanically reformatted as hypertext for use on PhysioNet. At a minimum, the man page for each application must include:
    • the name of the application and a one-line description of its intended use
    • a synopsis of its use and of any run-time options
    • any relevant cross-references to other software in PhysioToolkit or elsewhere
    • references to any other relevant documentation (e.g., tutorial materials)
    • the names and e-mail addresses of the original author(s), and of the current maintainer
    Where appropriate, additional material (examples, descriptions of algorithms used, etc.) may be included in the man page. A very useful resource for writing man pages is the Linux Man Page HOWTO.
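
As an illustration of the minimum man page contents listed above, the sketch below generates a skeleton using the standard `man` macros `.TH` and `.SH`. It is only a starting point under stated assumptions: the function `man_page_skeleton`, its parameters, and the choice to list the maintainer under AUTHOR are hypothetical and are not a PhysioNet requirement. References to other documentation would go under SEE ALSO alongside the cross-references.

```python
def man_page_skeleton(name, summary, synopsis, see_also, authors,
                      maintainer, section=1):
    """Build a minimal man page source covering the required items.

    All parameter names are illustrative; the section headings
    (NAME, SYNOPSIS, SEE ALSO, AUTHOR) are the conventional ones.
    """
    lines = [
        f".TH {name.upper()} {section}",   # title line: name and manual section
        ".SH NAME",
        f"{name} \\- {summary}",           # name and one-line description
        ".SH SYNOPSIS",
        synopsis,                          # usage and run-time options
        ".SH SEE ALSO",
        ", ".join(see_also),               # cross-references, other documentation
        ".SH AUTHOR",
        ", ".join(authors),                # names and e-mail addresses
        ".br",                             # line break before the maintainer
        f"Maintainer: {maintainer}",
    ]
    return "\n".join(lines) + "\n"
```

The returned text can be saved as, say, `exampletool.1` (a hypothetical file name) and previewed with `man ./exampletool.1` on systems whose `man` accepts a file path.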

PhysioNet Library contributions

We seek publications that are accompanied by contributions of original data or software that may be of interest to other users of PhysioNet, as well as papers that make use of existing PhysioBank data or PhysioToolkit software, and relevant tutorials and other reference materials.

Please do not submit materials that cannot be reproduced freely. PhysioNet does not assert a copyright on your contribution, but if you have transferred your copyright privileges to a journal in which your work has appeared, you must obtain permission to publish it on-line from the publisher of the journal.

Accessibility: We aim to make all text on PhysioNet readable using text-to-speech or text-to-Braille translators. Since PhysioNet is supported by US federal funding, this is not only a goal but also a legal requirement. All PhysioNet users benefit from text that can be indexed and searched. Please let us know if you find text pages that are not accessible so that we can correct them, and if you are contributing files to be posted, especially multi-column text or PDF files, please check that they are accessible. An easy test for PDF files is to try to read them using the text-to-speech converter included as a standard feature in Adobe Acrobat software.

Preferred formats

All publicly accessible files on PhysioNet must be written in open formats. This allows them to be checked for PHI and other inappropriate content. Use of open formats also ensures that files remain usable even if the software used to create them becomes unavailable, allows their contents to be indexed and searched, and provides access to the widest audience.

For text, we prefer the following formats (in decreasing order of preference):

  • LaTeX sources suitable for reformatting using LaTeX2HTML
  • HTML
  • Plain text
  • PDF, if readable by text-to-speech converters. (See the note about accessibility in the preceding section.)

If you are preparing material in HTML format for posting on PhysioNet, please consider starting with our template. (This is not required, but it will help you to lay out your pages in the style used here, and it will reduce the amount of editing we will need to do with it.)

For illustrations within text, as well as other types of images, we prefer:

  • vector formats for drawings, charts, graphs, and other types of line art; EPS, PS, PDF, and SVG are examples of open vector formats
  • PNG or JPEG for photos and other continuous-tone images

Avoid using screen captures unless you are trying to illustrate the appearance of a user interface; most plots in particular look better and load more quickly if rendered using a suitable vector format.

For binary data files, we prefer any of the many formats that are compatible with the WFDB library. More than 50 existing PhysioBank data collections use these formats; they are flexible, efficient, and allow use of standard software tools such as those provided by the PhysioBank ATM for visualization and analysis.

For text-encoded data files, we prefer TSV or CSV. XML is acceptable, especially for data with complex structure, but be sure that a freely redistributable schema is included or referenced in your contribution.

For software, sources are required.

PhysioNet sometimes provides file archives in zip, gzip, or bzip2 format for downloading convenience, but the contents of such archives are not indexed by the major search engines, so we almost always provide unpacked directories as well.

Other formats: Files in other formats will need to be converted to one of the formats above before public release. You can help us, and ensure that the conversion is correct, by performing the conversion yourself and comparing the results with your original file(s).
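
The convert-then-compare step can be sketched as follows. This is an illustrative example only, using a CSV-to-TSV conversion as a stand-in for whatever conversion applies to your files; the function names `csv_to_tsv` and `same_rows` are hypothetical, and the sketch assumes the data fit in memory and contain no tab characters inside fields.

```python
import csv
import io


def csv_to_tsv(csv_text):
    """Convert CSV text to TSV text (one possible conversion)."""
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()


def same_rows(csv_text, tsv_text):
    """Re-read both versions and compare them field by field."""
    original = list(csv.reader(io.StringIO(csv_text, newline="")))
    converted = list(
        csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t")
    )
    return original == converted
```

Comparing the parsed rows rather than the raw bytes means that differences in quoting or delimiters do not count as errors, while any lost or altered field does.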

Contributions are irrevocable

Just as readers of a printed journal expect to be able to read articles at any time after they have been published, users of PhysioNet must be able to depend on future availability of materials published here. For this reason, we must refuse requests to withdraw contributed materials or to restrict their distribution once they have been posted here (although updates are always welcome and encouraged).

Questions and Comments

If you would like help understanding, using, or downloading content, please see our Frequently Asked Questions.

If you have any comments, feedback, or particular questions regarding this page, please send them to the webmaster.

Comments and issues can also be raised on PhysioNet's GitHub page.

Updated Thursday, 20 October 2016 at 19:30 BRST

PhysioNet is supported by the National Institute of General Medical Sciences (NIGMS) and the National Institute of Biomedical Imaging and Bioengineering (NIBIB) under NIH grant number 2R01GM104987-09.