Core Ideas

This page introduces the guiding principles and core concepts of the MultiCellDS project.

Motivation for multicellular data standards

If computational modeling is to reach its fullest potential in (multicellular) biology and medicine, we must make advances in extracting biophysical parameters from experimental and clinical data, using these measurements to seed computational models, analyzing model outputs to make predictions, and quantitatively comparing model predictions to biomedical data. As experimental measurements come with higher throughput and become more quantitative, it will be necessary to accomplish these tasks consistently, efficiently, and automatically.

Moreover, repositories of shared experimental, clinical, and simulation data, together with a suite of standardized data processing tools, will be needed to drive the next generation of predictive computational modeling.

Likewise, as more and more image-based multicellular data are created by high-throughput experiments and clinical studies, it is essential that we find a common language to communicate those data and open them up to novel data analyses. Those data must be connected to contextual information (the experimental conditions, the cell lines used, who performed the experiments, and with what software) to allow research transparency and reproducibility.

These issues can be addressed by developing standards for multicellular data, shared preprocessing and postprocessing tools that support these standards, and repositories of shared data. This is our motivation for the MultiCellDS Project: an effort to create a multicellular data standard, along with tools and a repository.

To learn more about the origins of the MultiCellDS project, take a look at our publications and news releases.

Design principles

The MultiCellDS Project aspires to promote data sharing in computational and experimental biology and medicine, particularly cancer. We aim to foster comparison, refinement, and recombination of models so that they can help us better understand biology and predict disease progression.

To achieve these goals, MultiCellDS operates under these guiding principles:

MultiCellDS vs. MultiCellXML

MultiCellDS grew from the original MultiCellXML project. MultiCellDS is the data specification and includes:

MultiCellDS data can be stored in XML files (MultiCellXML) or in a repository (MultiCellDB). In the future, MultiCellDS data will also be saved in HDF format (MultiCellHDF) to allow better data compression.

Digital cell lines

A key problem facing computational biologists is a lack of standardized recording of cell phenotypic properties. Moreover, we do not have standardized model cell systems for computational experiments. MultiCellDS aims to solve this with digital cell lines: the digital analogue of an experimental cell line.

A digital cell line is an extensible, standardized representation of a cell line. It includes cell phenotypic parameters (e.g., cell cycle and volume data elements), along with information on the microenvironmental context (e.g., oxygenation). These data elements may be recorded in several microenvironmental conditions and grouped in a digital cell line.
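As a rough illustration of this grouping, a digital cell line could be sketched in Python as a named collection of phenotype datasets, each tied to the microenvironmental condition under which it was recorded. This is only a sketch under stated assumptions: the class names, field names, units, and numeric values below are hypothetical placeholders and are not taken from the MultiCellDS specification or from any measurement.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PhenotypeDataset:
    """Phenotype measurements recorded under one microenvironmental condition."""
    # Microenvironmental context, e.g. {"oxygen_mmHg": 38.0} (hypothetical key and units)
    microenvironment: Dict[str, float]
    # Phenotypic parameters, e.g. cell cycle and volume data elements (hypothetical keys)
    phenotype: Dict[str, float]


@dataclass
class DigitalCellLine:
    """Illustrative container grouping the datasets recorded for one cell line."""
    name: str
    datasets: List[PhenotypeDataset] = field(default_factory=list)

    def add_dataset(self, microenvironment: Dict[str, float],
                    phenotype: Dict[str, float]) -> None:
        """Record one more condition; the input dicts are copied."""
        self.datasets.append(PhenotypeDataset(dict(microenvironment), dict(phenotype)))

    def closest_dataset(self, variable: str, value: float) -> PhenotypeDataset:
        """Return the dataset whose recorded `variable` is nearest to `value`."""
        candidates = [d for d in self.datasets if variable in d.microenvironment]
        if not candidates:
            raise KeyError(f"no dataset records {variable!r}")
        return min(candidates, key=lambda d: abs(d.microenvironment[variable] - value))


# Placeholder values for illustration only; they are not measurements.
line = DigitalCellLine(name="example_line")
line.add_dataset({"oxygen_mmHg": 38.0},
                 {"cycle_duration_h": 18.0, "total_volume_um3": 2500.0})
line.add_dataset({"oxygen_mmHg": 8.0},
                 {"cycle_duration_h": 30.0, "total_volume_um3": 2400.0})

# A modeler might look up the phenotype recorded nearest to a simulated condition.
hypoxic = line.closest_dataset("oxygen_mmHg", 10.0)
```

The container holds only recorded data and a lookup helper; it makes no attempt to interpolate between conditions or to model cell behavior.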

This data model reflects the physical biology origins of MultiCellDS, and we intend digital cell lines to broadly sample the microenvironmental space to include different combinations of hypoxia/oxygenation, matrix stiffness, signaling factors (e.g., those secreted by co-cultured cells), and therapeutic compounds. In the future, these phenotype datasets will be combined with molecular descriptions of the cell's internal state, embedded as SBML, BioPAX, or other well-established standards for subcellular and systems biology. The net result: systematic, multiscale characterizations of cell behavior, external environment, and internal state. We envision broad multiscale modeling and data mining possibilities with such richly characterized digital cell lines.

A digital cell line is a data model and not a computational model. It gives an orderly, standardized recording of key cell phenotypic and biophysical characteristics, and leaves model building and interpretation to modelers. MultiCellDS aims to provide a curated library of digital cell lines that are constantly improved by community-contributed, peer-reviewed measurements. We hope to reduce unnecessary duplication of experimental work, increase data sharing, and free computational modelers to focus on building, calibrating, and improving their models.

Digital snapshots

A digital snapshot records the current state of an experiment or a simulation. It includes key metadata (e.g., user information, software information, experimental setup, citation information), a list of digital cell lines involved, a list of all cells and their current phenotypic state, and the current state of the microenvironment. Other data aggregations (e.g., bundling a time course of digital snapshots) are also being developed.
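The four parts of a snapshot listed above can be sketched as a small Python structure that is written out as XML with the standard library. This is a minimal sketch under stated assumptions: the class names, field names, XML element names, and values are all hypothetical and do not follow the actual MultiCellXML schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CellState:
    """One cell and its current phenotypic state (hypothetical fields)."""
    cell_id: int
    cell_line: str
    position: Tuple[float, float, float]
    phenotype: Dict[str, float]


@dataclass
class DigitalSnapshot:
    """Illustrative snapshot: metadata, cell lines, cells, microenvironment."""
    metadata: Dict[str, str]
    cell_lines: List[str]
    cells: List[CellState] = field(default_factory=list)
    microenvironment: Dict[str, float] = field(default_factory=dict)

    def to_xml(self) -> ET.Element:
        """Build an XML tree; element names are invented for this sketch."""
        root = ET.Element("snapshot")
        meta = ET.SubElement(root, "metadata")
        for key, value in self.metadata.items():
            ET.SubElement(meta, "entry", name=key).text = value
        lines = ET.SubElement(root, "cell_lines")
        for name in self.cell_lines:
            ET.SubElement(lines, "cell_line", name=name)
        cells = ET.SubElement(root, "cells")
        for c in self.cells:
            node = ET.SubElement(cells, "cell", id=str(c.cell_id), cell_line=c.cell_line)
            ET.SubElement(node, "position").text = " ".join(str(x) for x in c.position)
            for key, value in c.phenotype.items():
                ET.SubElement(node, "phenotype_value", name=key).text = str(value)
        env = ET.SubElement(root, "microenvironment")
        for key, value in self.microenvironment.items():
            ET.SubElement(env, "variable", name=key).text = str(value)
        return root


# Placeholder values for illustration only.
snap = DigitalSnapshot(
    metadata={"user": "example user", "software": "example simulator"},
    cell_lines=["example_line"],
    cells=[CellState(0, "example_line", (0.0, 0.0, 0.0), {"total_volume_um3": 2500.0})],
    microenvironment={"oxygen_mmHg": 38.0},
)
xml_text = ET.tostring(snap.to_xml(), encoding="unicode")
```

A time course, as mentioned above, could then be represented as an ordered list of such snapshots, though how MultiCellDS bundles them is not specified here.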

Comparison to related standards

There are several similar (but non-overlapping) standards to consider.

SBML (Systems Biology Markup Language) and BioPAX (Biological Pathways Exchange) describe biology at subcellular scales (with a focus on molecular biology and signaling pathways), whereas MultiCellDS focuses primarily on describing the (biophysical) phenotype of multicellular systems and the microenvironment. In short, MultiCellDS focuses on describing biology from the scale of a single cell upward, while SBML and BioPAX focus on detailed descriptions within a single cell. In future developments of MultiCellDS, we plan to incorporate subcellular descriptors that integrate with (and can embed) SBML, BioPAX, and related subcellular data standards.

The Cell Behavior Ontology (CBO) is a standard for specifying computational biological models. This is complementary to MultiCellDS, which records simulation and experimental data and parameters, but treats questions of modeling (and the description of models) as out of scope. Future versions of MultiCellDS may embed CBO in simulation snapshots to describe the models used to generate the results.

Similarly, CellML (Cell Markup Language) is focused primarily on specifying and exchanging mathematical models, rather than model data. It has an excellent repository of mathematical models at models.cellml.org.
