ParaKMeans is a high-performance, parallel-processing implementation of the K Means clustering algorithm.
This software is distributed under the Open Software License v3.0 agreement
the software please read and accept the
We designed the software so it can be deployed on most Windows operating systems. The applications are written for the .NET Framework v1.1 using the C# programming language. The parallelism comes from the use of a web service to perform the distance calculations and cluster assignments. Because we use a web service, at least one computer must have Internet Information Services (IIS, version 5 or later) installed and running.
Additional help and system requirements can be found
The application was designed in a modular fashion to provide flexibility in both deployment and the user interface. The application consists of three software components:
A Web Service
- this software component is the main computational workhorse and resides on the "compute nodes". Data and the cluster centroids are sent to the web service, where the distance calculations and cluster assignments are performed. Note that once the data is sent to the web service, it never leaves.
A Main API
- this software component is a .NET dynamic link library (DLL) used by the user interfaces to orchestrate the activities of the compute nodes. The API is responsible for managing the ThreadPool and working with the web services to perform the K Means clustering algorithm across the compute nodes. The API provides all the methods and properties necessary to do so. We will provide documentation for the API in case anyone wants to use it in another application.
A User Interface
- this software component provides the actual application that the user interacts with to run the programs. We provide two different user interfaces: a stand-alone Windows application and a web application.
Stand-alone Windows application. The Windows application can be installed on any Windows machine regardless of whether IIS is installed. This application provides easy file management, compute node management, program options and a results window for data viewing and saving.
Web application. This interface requires IIS to be installed on the computer. The web application provides the same functionality as the stand-alone application, but requires that each data set to be analyzed is first uploaded to the server.
Although the software was created using this modular design, the end user only needs to concern themselves with the web service and which user interface they want to install.
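The split between the web service and the API can be illustrated with a small sketch. This is not the ParaKMeans code: the real components are C#/.NET and communicate through web service proxies, whereas the sketch below uses Python with in-process objects and a thread pool as stand-ins. The class `ComputeNode`, its methods `load` and `assign`, and the function `assign_round` are hypothetical names chosen for illustration. It shows the property described above: data is loaded onto a node once, and afterwards only centroids go in and cluster indices come out.

```python
from concurrent.futures import ThreadPoolExecutor


class ComputeNode:
    """Stand-in for one web service instance (hypothetical interface)."""

    def __init__(self, name):
        self.name = name
        self._rows = []  # data stays on the node after load()

    def load(self, rows):
        """Receive this node's share of the data once; return the row count."""
        self._rows = [tuple(r) for r in rows]
        return len(self._rows)

    def assign(self, centroids):
        """Return the nearest-centroid index per stored row, never the rows."""
        def nearest(row):
            dists = [sum((x - c) ** 2 for x, c in zip(row, cen))
                     for cen in centroids]
            return dists.index(min(dists))
        return [nearest(r) for r in self._rows]


def assign_round(nodes, centroids):
    """Coordinator side: call every node in parallel and collect the labels."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        futures = [pool.submit(node.assign, centroids) for node in nodes]
        return [f.result() for f in futures]


nodes = [ComputeNode("node-a"), ComputeNode("node-b")]
nodes[0].load([(0.0, 0.0), (9.0, 9.0)])
nodes[1].load([(1.0, 0.0), (10.0, 10.0)])
round_labels = assign_round(nodes, [(0.0, 0.0), (10.0, 10.0)])
```

Keeping the rows on the node means each later round only has to transmit the centroids, which are typically far smaller than the data set.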
The basic steps involved in the ParaKMeans algorithm:
The user opens or uploads the data to be analyzed.
The user selects whether to cluster genes, arrays or both.
The user selects the number of clusters and compute nodes to use in the algorithm.
The user selects the method to initialize the centroids for the first round.
The algorithm partitions the data based on the number of nodes used.
The algorithm creates an array of web proxies used to connect to the compute nodes.
The algorithm initializes the centroids based on the method selected by the user.
The algorithm asynchronously sends the data and the initial centroids to the compute nodes.
Each compute node calculates the Euclidean distance matrix and assigns each data point on that node to its nearest cluster centroid.
Once all the compute nodes finish the cluster assignments, the performance function for each node is returned and summed across all nodes. The summed performance function is used to calculate the new centroids.
The algorithm sends the new centroids back to the compute nodes for another round of assignments.
The algorithm ends when the performance function does not change between rounds.
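The steps above can be sketched end to end as follows. This is a minimal illustration in Python, not the ParaKMeans implementation (which is C#/.NET and calls web services). Several details are assumptions made for the sketch: threads stand in for the remote compute nodes, the centroids are initialized from the first k rows (the software lets the user choose the method), the performance function is taken to be the sum of squared Euclidean distances, and each node is assumed to return per-cluster coordinate sums and counts alongside its partial performance value so that the coordinator can recompute the centroids. The names `partition`, `assign_on_node` and `para_kmeans` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_nodes):
    """Split rows into n_nodes nearly equal contiguous chunks."""
    size, extra = divmod(len(rows), n_nodes)
    chunks, start = [], 0
    for i in range(n_nodes):
        end = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def assign_on_node(chunk, centroids):
    """One round of work on one node: assign rows, return partial results."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    labels, sse = [], 0.0
    for row in chunk:
        dists = [sum((x - c) ** 2 for x, c in zip(row, cen))
                 for cen in centroids]
        best = dists.index(min(dists))
        labels.append(best)
        counts[best] += 1
        sse += dists[best]  # this node's share of the performance function
        for j, x in enumerate(row):
            sums[best][j] += x
    return labels, sums, counts, sse


def para_kmeans(rows, k, n_nodes, max_rounds=100, tol=1e-12):
    """Iterate until the summed performance function stops changing."""
    dim = len(rows[0])
    chunks = partition(rows, n_nodes)
    centroids = [list(r) for r in rows[:k]]  # assumed init: first k rows
    previous = None
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        for _ in range(max_rounds):
            results = list(pool.map(
                lambda chunk: assign_on_node(chunk, centroids), chunks))
            total = sum(r[3] for r in results)
            labels = [label for r in results for label in r[0]]
            if previous is not None and abs(previous - total) <= tol:
                break
            previous = total
            for c in range(k):
                n = sum(r[2][c] for r in results)
                if n:  # leave a centroid unchanged if its cluster is empty
                    centroids[c] = [sum(r[1][c][j] for r in results) / n
                                    for j in range(dim)]
    return labels, centroids, total


data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centroids, performance = para_kmeans(data, k=2, n_nodes=2)
```

Because the per-cluster sums and counts are additive, the coordinator obtains the same centroids as a single-machine run over the whole data set, however the rows are partitioned.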
Please acknowledge all posters, manuscripts or scientific materials that were generated in part or whole using funds from the MMPC using the following text:
Financial support for this work was provided by the NIDDK Mouse Metabolic Phenotyping Centers (National MMPC, RRID:SCR_008997,
) under the MICROMouse Program, grant DK076169.
Warranty disclaimer and copyright notice
THE NATIONAL MMPC MAKES NO REPRESENTATION ABOUT THE SUITABILITY OR ACCURACY OF THE SOFTWARE OR DATA FOR ANY PURPOSE, AND MAKES NO WARRANTIES, EITHER EXPRESS OR IMPLIED, INCLUDING MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE OR DATA WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADEMARKS, OR OTHER RIGHTS. THE SOFTWARE AND DATA ARE PROVIDED "AS IS".
The Mouse Metabolic Phenotyping Centers (MMPC) is an NIDDK funded consortium and adheres to the
NIH Data Sharing Policy
MMPC clients make their data freely available whereby MMPC users may freely build upon, enhance and reuse those data for any purpose without restriction. Scholarly citation norms must be followed for content reuse. Please acknowledge the MMPC using the following text: 'The MMPC data used in this manuscript was supported by the NIDDK National Mouse Metabolic Phenotyping Centers (National MMPC, RRID:SCR_008997,
)'. To cite specific MMPC centers, please use the appropriate RRID available from the MMPC website (
Please note that the acknowledgment text includes a Research Resource Identifier (RRID) for the MMPC CU and Centers. Reproducibility is one of the cornerstones of effective, open and transparent published biomedical research. Too often, however, resources (e.g. model organisms, antibodies, and tools) are not reported in enough detail for others to replicate or expand upon the published results. The Research Resource Identification Initiative (#RII) seeks to address these limitations in reporting through unique Research Resource Identifiers (RRIDs). The initiative encourages authors to identify the resources used in their research by adding a globally unique accession number to each resource described in their manuscripts. These identifiers allow authors to cite the resources they use, and they make it easy to track all papers that have used the same resource and to assess how that resource performs in other scenarios.
It is expected that MMPC users follow scholarly citation norms, giving credit to fellow scholars when accessing/using protocols and data, including data derived by MMPC (such as summary data) and any plots, tables or screenshots depicting those data.
It is possible for invalid or incomplete results to be presented on the MMPC web site due to software bugs, data problems, or artifacts of human error. Data sets are not necessarily static; we reserve the right to post corrections and updates as needed.
Data contributors and data users may not use MMPC in any unlawful manner, or in any manner that could impair MMPC services, security or functionality. Automated usage (webcrawlers and similar) must observe each page's "meta robots" html tags and space requests by ≥ 2 seconds. We reserve the right to block any IP associated with what we consider to be excessive or abusive usage patterns, and/or to take any action we deem necessary.
The MMPC is a National Institutes of Health-sponsored resource that provides experimental testing services to scientists studying diabetes, obesity, diabetic complications, and other metabolic diseases in mice.
2017 National MMPC. All Rights Reserved.