Selected Research Activities, Fall 2004

You may access specific faculty research summaries by
selecting the first letter of the faculty member's last name.

Suzanne Bakken

    Although variation in practice is ubiquitous, many studies have documented that underserved populations such as racial and ethnic minorities, women and children, those with lower socioeconomic status, and persons with HIV/AIDS are significantly less likely than others to receive care that is consistent with the best health care evidence. Information technology, informatics processes, and informatics competencies are essential components of an infrastructure that supports evidence-based practice. Research activities are focused on the intersection of informatics, evidence-based practice, and underserved populations.
  1. Center for Evidence-based Practice in the Underserved

    The aims of this research center, funded by the National Institute of Nursing Research, are to: 1) utilize information technology and informatics processes to support collaborative, interdisciplinary research related to evidence-based practice in underserved populations; 2) facilitate the building of evidence for practice in underserved populations through the funding of pilot studies, mentoring of investigators, and other activities; 3) develop and implement informatics-based approaches that enable data aggregation, secondary data use, and building of evidence across CEBP studies; these include translation of research protocols into standardized terminologies, creation of electronic data dictionaries and scoring algorithms for standardized research instruments, and development of a common data repository; and 4) develop the expertise of Center investigators in informatics-based approaches for targeting and tailoring interventions for underserved populations. The five pilot studies currently funded by the Center are: Icons for Displaying Basic Activities of Daily Living - Christine Curran, PhD, RN and Justin Starren, MD, PhD; Screening for Emotional and Behavioral Disorders in Children - Judy Honig, EdD, RN and Christopher Lucas, MD; Parkinson's and Self-care: Immigrant Latino Descriptions - Janice Smolowitz, EdD, RN and Cheryl Waters, MD; Traditional Chinese Medical Practitioners’ Diagnostic Reasoning - Joyce K. Anastasi, PhD, RN, FAAN, LAc and Vicki LeBlanc, PhD; and Actigraphy and State Monitor Methods to Measure Children’s Sleep/Wake Patterns - Mary Woods Byrne, PhD, MS, MPH and Debra Seltzer, MD.

  2. Informatics for Evidence-based Nurse Practitioner (NP) Practice

    We have designed a system architecture that supports documentation and analysis of student encounters in an electronic clinical log and have implemented it for Year 1 of our 3-year Master’s Entry to Practice (ETP) nursing program using Palm M500-515s as input devices, Satellite Forms, XTNDConnect Server synchronization software, and Access database software. Students, clinical preceptors, and faculty receive electronic reports every two weeks that support the assessment and benchmarking of student performance. We are evaluating the impact of the project on critical thinking ability and informatics competencies. In a second project, funded by the Health Resources and Services Administration (HRSA), we are piloting an NP Student Clinical Log using the same architecture with an expanded set of data elements designed to capture the diagnoses and interventions of advanced practice nurses. We are also evaluating aspects of wireless networking for the NP project.

  3. Concept Representation for Evidence-based Practice

    Representation of concepts in an unambiguous manner is an essential component of an informatics infrastructure that supports evidence-based practice. In addition to typical clinical concepts, informatics innovations for application in underserved populations also require the representation of sociocultural concepts that support the tailoring or targeting of interventions based upon these variables. Through a series of experiments in our laboratory, we continue to develop and test terminology models that support the design of concept-oriented terminologies and the representation of a broad variety of concepts for use in computer-based systems, including electronic health records and web-based informational and educational applications.

Andrea Califano

  1. Aracne

    We are investigating an information-theoretic approach for the deconvolution of gene regulatory pathways in human B cells. By performing three-variable mutual information analysis on large panels of gene expression data, we reconstruct high-fidelity probabilistic gene regulatory topologies as undirected graphs, where the probability of a direct interaction between two genes becomes inversely proportional to their distance in the graph. This method offers three key advantages over other approaches, such as Bayesian Networks: first, it has quadratic, rather than exponential, local complexity in the number of connections to a given gene; second, it does not require the underlying network to be modeled as a directed acyclic graph; third, it relies on the full dynamic range of the data rather than on a binned representation of the expression values.

    The approach has been applied to the deconvolution of key cancer-related pathways in B cells, using a large set of over 340 microarrays from normal, tumor-related, and experimentally manipulated samples. Using literature and experimental data, we have validated large sub-networks involving known proto-oncogenes, such as c-MYC and BCL6, encoding transcription factors, and we have identified key upstream regulators and downstream targets. Predictions have been validated using chromatin immunoprecipitation and other techniques. Results for the c-MYC proto-oncogene, for instance, show that over 75% of its predicted first neighbors are in fact direct targets. This percentage decreases exponentially with increasing distance in the graph. The false positive ratio for predicted c-MYC downstream targets validated by chromatin immunoprecipitation is about 10%.

    Key open investigation areas include the study of conditional mutual information as a key predictor of non-transcriptional interactions and complex topology-switching behavior, as well as simulation of perturbations using Markov Random Fields and continuous Hopfield-like networks.
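
    As a rough illustration of the reconstruction idea described for Aracne (pairwise mutual information followed by a three-gene analysis), the sketch below is not the Aracne implementation. It assumes a Gaussian mutual-information estimate computed from the Pearson correlation, and it assumes that in every fully connected gene triplet the weakest edge is treated as indirect and removed. The names gaussian_mi, reconstruct_network, and threshold are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi(x, y):
    """Mutual information in nats, ASSUMING joint normality: -0.5 * ln(1 - r^2)."""
    r = pearson(x, y)
    r2 = min(r * r, 1.0 - 1e-12)  # avoid log(0) for perfectly correlated data
    return -0.5 * math.log(1.0 - r2)


def reconstruct_network(expr, threshold=0.05):
    """expr maps gene name -> expression values; returns a set of undirected edges.

    Assumed pruning rule: in each triplet whose three edges all pass the
    threshold, the edge with the lowest mutual information is dropped.
    """
    genes = sorted(expr)
    mi = {frozenset(p): gaussian_mi(expr[p[0]], expr[p[1]])
          for p in combinations(genes, 2)}
    edges = {pair for pair, value in mi.items() if value > threshold}
    indirect = set()
    for a, b, c in combinations(genes, 3):
        triangle = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if all(e in edges for e in triangle):
            indirect.add(min(triangle, key=lambda e: mi[e]))
    return edges - indirect
```

    The pairwise step examines each gene pair once, so the work per gene grows with the number of candidate partners rather than with the number of possible parent sets, and no binning of the expression values is needed under the Gaussian assumption.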

  2. BioWorks

    BioWorks is a Plug&Play platform for integrated genomics that includes over 50 interoperable modules that allow the management, analysis, and visualization of a variety of biomedical data, including protein and DNA sequences, gene expression and genotypic data, pathways, genetic and clinical ontologies, etc. Several algorithmic modules are available including client and server based processing.

    BioWorks is an open source platform which extends the caWorkbench microarray analysis platform, also authored by Drs. Califano and Floratos, and distributed as part of the NCI caCORE platform.

    Open areas of investigation include the design of an ontology, BISON, to represent interfaces which extend the interoperability of basic biomedical data (e.g. genes, sequences, etc.) to complex data structures and algorithms (e.g. patterns, clusters, phylogenies, etc.). Additionally, we are extending BioWorks, using the Globus toolkit, to allow full data and computational grid enablement.

    Availability: caWorkbench is currently available at http://ncicb.nci.nih.gov/download/index.jsp. Version 1.0 of BioWorks is scheduled for public release on July 1st 2004.

  3. SPLASH

    The discovery of sparse amino acid or nucleic acid motifs is central to a number of relevant problems in biology. Statistically significant motifs have been shown to define regions that play key functional or structural roles.

    SPLASH is an ultra-efficient deterministic pattern discovery algorithm that can find sparse amino acid or nucleic acid patterns matching identically or similarly in a set of protein or DNA sequences.

    Large databases, such as a complete genome, the full set of PROSITE families, or the non-redundant SWISS-PROT database can be processed in a few hours on a typical workstation. Individual families or superfamilies can be analyzed exhaustively or hierarchically in seconds to minutes, leading to the discovery of all of their conserved motifs down to those supported by a handful of sequences.

    Key open investigation areas include the discovery of modules of DNA binding signatures in the fruit fly using a motif-motif discovery algorithm and the use of SPLASH as a seeding algorithm to detect more subtle signatures using EM or Gibbs sampler based algorithms.

    Algorithm Availability: We are currently working on a July 1st 2004 release of a new implementation of the SPLASH algorithm freely available to the research community.

  4. Genes@Work

    Description: The analysis of complex microarray data, including gene expression, proteomics, and genotypic data, is becoming central to the understanding of molecular cellular mechanisms. Genes@Work provides a novel, association discovery-based approach to microarray data analysis.

    This algorithm has been applied to a variety of biological problems, such as the analysis of B-cells in a variety of normal and cancer-related stages, of NCI-60 cell lines, and of pediatric brain tumors.

    Algorithm Availability: We are currently working to make a new implementation of the algorithm freely available to the research community. In the meantime, Genes@Work can be accessed at http://www.research.ibm.com/FunGen/NewFiles/FGDownloads.htm.

James Cimino

  1. Palm-based Clinical Information System (PalmCIS)

    This project seeks to extend the functionality of the clinicians' clinical information system (WebCIS) to a wireless, Palm-based platform. We seek to learn how this platform can best be used to provide timely responses to information needs and improve coordination among patient care team members. The initial deployment of PalmCIS uses a Web browser on a Kyocera Smartphone (a combination of a Palm Pilot and a cell phone) and is being provided to hospitalists in the Allen Pavilion.

  2. Infobuttons

    This project seeks to understand the information needs of clinicians (nurses and physicians) that arise in the context of interacting with a clinical information system. We are studying, through empiric observation, the specific information needs that arise among users of WebCIS. Based on these observations, we will create context specific links, in WebCIS, to on-line information resources that answer the questions we find to be associated with those contexts. These links are referred to as "Infobuttons" (because of their white-i-in-blue-circle logo). Once these are deployed for use in WebCIS, we will study their use through direct observation to determine if they are useful and if they can improve the rate at which clinicians' information needs are satisfied. Work on this project began in 2001 and is funded, as of July 1, 2002, by a grant from the National Library of Medicine.

Carol Friedman

  1. Medical Language Processing (MedLEE):

    This project seeks to apply advanced natural language processing (NLP) techniques to the creation of a system called MedLEE, which extracts, structures, and encodes information in clinical reports so that the information can be accessed by other automated processes. There is a wealth of clinical information in patient reports that is urgently needed for automated applications aimed at improving patient care and lowering costs, such as detection of medical errors, risk assessment of patients, automated encoding, data mining, and clinical research. However, information in the form of text cannot be reliably used for computerized applications. MedLEE has been evaluated numerous times and has been shown to have good performance for clinical applications. MedLEE covers the domains of radiology, discharge summaries, pathology, electrocardiography, and echocardiography. MedLEE is continually being refined for new clinical applications and new clinical domains. One application we are working on involves exploration of an interface that provides users with dynamic viewing capabilities using the XML output generated by MedLEE. The dynamic viewer will be able to summarize multiple reports, show significant findings, and present different views of the processed output.

  2. Biomolecular Language Processing (GENIES):

    This project involves creation of an NLP system called GENIES, which automatically captures biomolecular relations from the literature. This was accomplished through adaptation of MedLEE to this new domain by changing some of the knowledge sources (e.g. the lexicon and grammar) while leaving the programming engine the same. A wealth of knowledge concerning biomolecular interactions and relations is being published at a rapid rate in the literature, and an automated extraction system can be invaluable for capturing and organizing the information, which can then be used to provide researchers with improved access to the relevant information. GENIES is part of a larger system called GeneWays, headed by Andrey Rzhetsky, which maintains a knowledge base consisting of the extracted knowledge and tools to manipulate the knowledge.

  3. Biomedical Language Processing (PhenoGenes):

    PhenoGenes is a high-throughput natural language processing (NLP) tool based on an adaptation of MedLEE that automatically extracts clinical and genetic information from the electronic medical record (EMR). In addition, a derivative of MedLEE, called BioMedLEE, is being developed to extract and encode phenotypic and genetic information from journal articles. The overall goal is to 1) enable the mining of phenotypic and genotypic data in the EMR, 2) amass knowledge concerned with diseases and biomolecular relationships from journals, 3) use the EMR data to validate hypotheses in the literature, and 4) develop a visualization tool for researchers that will present diverse views of the knowledge.

  4. Automated Acquisition of Biomedical Knowledge for NLP Purposes:

    This project complements the NLP extraction projects, described above. It will use existing biomedical knowledge bases and machine learning techniques to acquire knowledge that is critical for NLP systems. One type of knowledge is lexical knowledge, which is needed to recognize, semantically categorize, accurately identify and disambiguate biological entities. This task is crucial to the performance of NLP extraction systems. Another type of knowledge consists of enumeration of semantic patterns that are found in the text, which is needed to accurately extract the relevant relations among the substances and processes.

Richard Friedman

    Dr. Richard Friedman's main research activities are his collaborations with biomedical experimentalists in his capacity as Bioinformatics Consultant at the Oncoinformatics Facility of the Columbia Cancer Center. These projects currently involve mostly the analysis of microarray gene expression data to determine the molecular etiology of various cancers and lupus. At present, he is interested in collaborating with students on a systematic comparison of the efficacy of various methods of discovering and discriminating between classes of genes and biological samples (including patients) based on microarray expression data, and on the further development of class discovery methods.

Jan Horsky

  1. Interaction with Provider Order Entry Systems: Methodology for the evaluation of cognitive complexity

    The process of computer-based clinical ordering is frequently made excessively cognitively demanding by poorly designed interfaces. We are developing a methodology to characterize the sources of unnecessary cognitive complexity in the interface. Specifically, we investigate the interaction of clinicians with a complex provider order entry system (CPOE) using theoretical foundations from cognitive science and an explanatory framework based on the theory of distributed cognition. The goal is to characterize the nature of the cognitive demands imposed by the interaction and their effects on usability, user performance, and medical error. Overly complex or inconvenient interfaces that are difficult to navigate claim a disproportionate share of limited human working memory resources. In effect, they divert focus away from the main clinical task and delay its completion. Well-designed interfaces allow users to focus primarily on higher-order cognitive activity, such as clinical reasoning and treatment planning. Also, making the ordering process fit better into established workflow routines may help overcome the frequently strong initial opposition of clinicians to CPOE and smooth the progress of large-scale implementation of this safety-enhancing technology in US hospitals.

  2. Cognition and Error Management in Critical Care

    The core objective of this research is to develop a cognitive framework of medical errors in critical care environments (medicine, surgery and psychiatry), where decisions are often made under high stress, time pressure, and with incomplete information. This theoretical framework provides two functions: (1) a cognitive taxonomy of errors where each category of medical error is associated with a specific cognitive mechanism, and (2) a theoretical explanation of why these errors occur and prediction of the circumstances in which such errors would occur. The studies are being conducted in the adult medical and psychiatric emergency departments, and in the cardio-thoracic intensive care unit. Using cognitive science techniques, we have collected and analyzed data in these units, and have developed decision making models of these complex environments. Additional data collection and analysis is ongoing in the light of these models.

    In our view human errors are products of cognitive activities in people's adaptation to their complex physical, social, and cultural environments. Our cognitive approach stresses actions in conceptual understanding and thought processes during clinical problem solving. The actions reflect the level of expertise and the demands of tasks in clinical performance. In order to manage errors during clinical decision-making, it is critical to understand how decisions are made and what underlying cognitive mechanisms are used to process information during interactions with patients, colleagues, and technology in the critical-care environment. Unlike the popular goal of achieving flawless performance (through development of error-free systems), the empirical results from this study will be used to enhance and modify the current, more static, error taxonomy and will guide development of adaptive systems that anticipate errors, respond to them, or substitute less serious errors that allow subsequent intervention before the errors result in an adverse event.

George Hripcsak

  1. Discovering and Applying Knowledge in Clinical Databases

    With the advent of improved clinical information system products (e.g., ambulatory systems, order entry systems), improved data entry technologies (e.g., speech recognition, text processing techniques), and further adoption of data interchange standards, more institutions are generating electronic medical records. The records are used mainly for individual patient care, but exploiting the records for clinical research and quality functions has lagged behind. We are developing and testing methods to mine a clinical data repository. We are exploiting the vast amount of information in the repository (latent associations and knowledge) and using computer intensive techniques and advances in data representation and manipulation to better interpret what is in the database and to overcome the challenges of complex, missing, and inaccurate data.

    1. One area of focus is the development of a similarity metric to be used in case-based reasoning and in the nearest neighbor technique. The goal is to create a metric that assesses overall similarity between cases but which can be made to be orthogonal to specific research questions.

    2. Representing temporal information and presenting it to machine learning algorithms are challenges. We are using temporal constraint satisfaction problems to encode the temporal information in narrative and coded clinical data.

    3. We are abstracting diagnosis and symptom features from narrative reports using statistical, knowledge-driven, and heuristic methods. The features are used for summarizing the electronic medical record and as input features for machine learning.

    4. Cross validation is a ubiquitous method to estimate machine learning performance in the setting of few test cases. We are studying how to best estimate the variance of the performance estimates obtained through cross validation.
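
    To make items 1 and 4 above concrete, the following is a minimal sketch and not the group's method. It assumes one-dimensional features, an absolute-difference distance standing in for a similarity metric, and contiguous folds. It reports the mean accuracy across folds together with the naive sample variance of the fold scores. That naive figure ignores the overlap between training sets, which is one reason variance estimation for cross validation is harder than it looks. All function names are hypothetical.

```python
import statistics


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def nearest_neighbor_predict(train, x):
    """train is a list of (feature, label); returns the label of the closest case."""
    return min(train, key=lambda case: abs(case[0] - x))[1]


def cross_validate(data, k):
    """Return (mean accuracy, naive sample variance of the per-fold accuracies)."""
    scores = []
    for held_out in k_fold_indices(len(data), k):
        held = set(held_out)
        train = [case for i, case in enumerate(data) if i not in held]
        correct = sum(
            nearest_neighbor_predict(train, data[i][0]) == data[i][1]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return statistics.mean(scores), statistics.variance(scores)
```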

  2. Mining Complex Clinical Data for Patient Safety Research (CLIPS)

    Medical errors hurt patients, cost money, and undermine the health care system, and the first step to reducing errors is detecting them. We are applying advanced informatics methods, such as natural language processing, data visualization, machine learning, and cognitive analysis, to detect medical errors automatically based on our electronic medical record. We can detect errors as "conflicts" in the record when diagnoses, procedures, medications, or test results do not match what is expected for a given patient.

    1. We are using queries on natural language processed discharge summaries to detect medical events specified in the New York State NYPORTS initiative. We are using existing reporting systems to estimate system sensitivity and manual review to estimate system positive predictive value.

    2. We are using the electronic medical record to study nosocomial pressure ulcers, improving estimates of incidence and designing prediction instruments that work from available electronic data.
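
    The two performance estimates mentioned in item 1 above can be written out as a short sketch under stated assumptions: cases are identified by hypothetical IDs, the existing reporting system is taken as the reference set for sensitivity, and manual review supplies a confirmed or not confirmed verdict for each automatically detected case. The function names are illustrative only.

```python
def estimate_sensitivity(detected, reported):
    """Fraction of cases known from the reporting system that the queries also found."""
    if not reported:
        raise ValueError("no reported cases to compare against")
    return len(detected & reported) / len(reported)


def estimate_ppv(reviewed):
    """Fraction of detected cases confirmed on manual review.

    reviewed maps case ID -> True if the reviewer confirmed the event.
    """
    if not reviewed:
        raise ValueError("no reviewed cases")
    return sum(1 for confirmed in reviewed.values() if confirmed) / len(reviewed)
```

    Because a reporting system is itself incomplete, a sensitivity computed this way is an estimate relative to reported events only.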

  3. Clinical Information Systems

    The clinical information systems at the medical center serve as a living laboratory for informatics research. We are applying the fruit of informatics research—methods and knowledge such as data mining, natural language processing, workflow management, terminology, and standards—to the medical center’s systems to improve the quality and efficiency of care and to demonstrate the utility of the methods.

Celina Imielinska

  1. Integrated System for Quantification of Adipose Tissue From Whole Body MRI Scan

    Dr. Imielinska has worked with Dr. Andrew Laine (Associate Professor in the Biomedical Engineering Department) and Dr. Steve Heymsfield (Professor of Medicine and expert in obesity research), among others, to build an integrated imaging system to improve the overall performance of acquisition, segmentation, quantification, and analysis of adipose tissues in humans obtained from whole body MRI scans. These results are quantitatively evaluated in terms of composition of fat and adipose tissue by using a physical phantom designed and built specially for this project, and a novel method that separates and classifies spectral signatures of known molecular MR properties of fat. Once considered a tissue of secondary importance, adipose tissue is now at the center of biological investigations in animals and humans. Body composition, particularly the adipose tissue compartment, is central to the study and clinical management of many conditions and diseases. Once considered an inert metabolic fuel storage depot, today adipose tissue is considered a dynamic organ complete with endocrine properties, neurological innervations, a highly developed vascular system, and, importantly, metabolic heterogeneity from site to site. Adipose tissue is currently one of the most studied body components in animals and humans.

  2. Modeling of the da Vinci Robotic Surgical System for Training, Evaluation and Quantification of Surgical Skills

    In this project, Dr. Imielinska collaborates with Drs. Robert Ashton and Joseph DeRose (cardiothoracic surgeons from St. Luke’s-Roosevelt Hospital and Columbia University) and Dr. Tony Jebara (Assistant Professor in the Department of Computer Science) on the challenges of quantitative benchmarking of surgical skills assessment. This work has long-term ramifications for certification of basic surgical skills and advanced robotic surgical skills. Surgical skills have traditionally been taught through an apprenticeship model. With the adoption of regulations that limit resident training time, a need for more efficient methods to teach and assess surgical skills has arisen. Following the introduction of state-of-the-art technology for performing laparoscopic and robotic surgery, including remote and augmented procedures, there now exists a glaring lack of standardization (consensus) on how to assess surgical competence with these new devices. We propose correcting this by using tele-robotic sensing to track surgical activity and machine learning methods to statistically infer levels of proficiency and to computationally model the space of human surgical skills. This novel approach, which combines methods in computer vision, robotics, machine learning, human perception, and cognitive psychology with an understanding of the clinical domain, proposes the design and implementation of a Quantitative Rating Box (QRB) to classify novice, intermediate, and expert levels of surgical performance. We have a concrete plan for benchmarking robotic and laparoscopic surgery, and we anticipate similar benchmarks for physical learning in all areas of surgery and medicine (both open and closed). The goal is to be able to minimize clinical error in surgical (medical) applications.

  3. Vesalius Project™

    The mission of the Vesalius Project™ [11-27], the Visible Human Project [28] at Columbia, which Dr. Imielinska founded together with Dr. Molholt and Ewa Soliz in 1996, is to systematically develop 3D anatomy models from the Visible Human datasets and incorporate the visualizations into network-based segments of the anatomy curriculum at Columbia University College of Physicians & Surgeons. This is a long-term and tedious task that requires strong interdisciplinary collaboration among experts in the areas of image processing, computer graphics, 3D visualization, anatomy, cognitive psychology, computational linguistics, and multimedia, all of whom have had a role in this project. It is our belief that without all, or the majority, of these contributions, creating appropriate, user-friendly teaching and learning tools from the Visible Human data sets will not be successful. To turn this resource, the Visible Human data, into useful applications for medical education is both a technological task (segmentation and generation of 3D anatomy models) and a pedagogic task (content determination, cognitive design, and multimedia design). Either task undertaken apart from the other will fail. Without strong interaction among scientists, content experts, and designers in the context of a health sciences education setting, there is little hope for success. Until the technical challenges related to image processing, visualization, representation, storage, and manipulation of complex color data sets like the Visible Human data are resolved, systematic building of an anatomy curriculum will never evolve beyond a “boutique” operation. The mission of the Vesalius Project™ at Columbia reaches far beyond such a boutique operation. We have introduced the electronic anatomy curriculum in a broad and systematic fashion. Anatomist Dr. Judith Venuti used 3D visualizations obtained from the Visible Human data in lectures on male pelvic anatomy [11,17,22], and starting in the spring 2002 semester the foot anatomy lesson has been taught by Dr. Ahmet Sinav from the Foot Anatomy Atlas [14]. Our long-term goal is to develop a complete set of 3D anatomy models for use in our anatomy curriculum. Additionally, such electronic applications will have to be thoroughly evaluated to show their effectiveness in teaching and learning, a process now underway for the foot material.

  4. Visible Human Project Segmentation and Registration Toolkit

    Dr. Imielinska’s involvement in the Vesalius Project™ resulted in substantial multi-year funding from the National Library of Medicine to form the Insight Consortium: the ITK Project [28], that is, the Visible Human Project Segmentation and Registration Toolkit. The ITK is an open source repository of tools for segmentation and registration of the Visible Human and radiological data, and it is already attracting an international community of users. Under the auspices of the ITK, the collaboration between Columbia (Drs. Imielinska, Laine, Schmidt, Molholt, LeBlanc) and the University of Pennsylvania (Dr. Metaxas and Dr. Udupa) produced the Hybrid Segmentation Engine [5-10][30-34], which consists of component modules for automated segmentation of radiological patient data and the Visible Human data. We integrate boundary-based and region-based segmentation methods so that the strengths of each method cover the weaknesses of the other. This powerful and promising approach combines fuzzy connectedness, Voronoi Diagram classification, Gibbs prior models, and deformable models, which constitute the respective components of the engine.

  5. Framework for Evaluating Image Segmentation Methods

    In a collaborative project with Dr. Jay Udupa from the Medical Image Processing Group at the Department of Radiology at the University of Pennsylvania, Drs. Imielinska, Laine, Schmidt, and Molholt of Columbia are working to create a framework for evaluating image segmentation methods [5][9-10]. Medical image segmentation is the bottleneck problem for any image-based system that relies on extraction of anatomy from radiological imaging. We propose to design, develop, implement, test, evaluate, and deploy a framework, complete with methods, image data, and software, for comprehensively evaluating algorithms that are developed for segmenting a wide range of medical images. This framework provides an innovative matrix and a methodology that incorporates qualitative and quantitative measures, and integrates these new metrics with existing metrics that have been used only in isolation on single applications in past evaluation efforts. The result will be a system that will enable assessment of algorithm efficacy on precision, accuracy, and efficiency. Our hypothesis is that such a framework, and the associated image data and software, will provide basic scientists as well as biomedical investigators with a common, readily deployable, and usable means for establishing standard references suitable for their applications. This will fill a critical void and, we suggest, will accelerate the development and deployment of segmentation methods that are actually useful in real applications. Again, this highly interdisciplinary effort has brought together computer scientists, biomedical engineers, cognitive psychologists, and clinical experts.
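
    As one example of the kind of existing quantitative measure such a framework could integrate, the sketch below computes two widely used overlap scores, the Dice coefficient and the Jaccard index, between a segmentation and a reference delineation. It is an assumption that these particular measures are relevant here; the text does not name the metrics used. Regions are represented as sets of pixel coordinates, and the function name is hypothetical.

```python
def overlap_measures(segmented, reference):
    """Return (Dice, Jaccard) for two regions given as sets of (row, col) pixels."""
    if not segmented and not reference:
        raise ValueError("both regions are empty")
    intersection = len(segmented & reference)
    union = len(segmented | reference)
    dice = 2 * intersection / (len(segmented) + len(reference))
    jaccard = intersection / union
    return dice, jaccard
```

    Both scores equal 1 for identical regions and 0 for disjoint ones. They address agreement with a reference only, not repeatability or running time.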

  6. Microscope-Based Augmented Reality for Skull Base Neurosurgery

    In this newly defined project, Dr. Imielinska is collaborating with Dr. Jeff Bruce and Dr. D’Ambrosio, neurosurgeons, Dr. Steven Feiner (Computer Science), Dr. Andrew Laine (Biomedical Engineering), Dr. Joy Hirsch (Radiology and Psychology), and Dr. Jannick Rolland (School of Optics at the University of Central Florida). The overarching goal of this project is to design, develop, implement, test, and evaluate a complete set of tools for a microscope-based, augmented reality, neuronavigation system for skull base neurosurgery. This project will provide innovative methodologies for segmentation and three-dimensional (3D) reconstruction of multi-modal, patient-specific data, injection of stereoscopic 3D virtual images into the operating microscope, calibration of microscope optics, registration of virtual anatomy with the patient’s intraoperative physical anatomy, annotation of interactive 3D environments, and the design of an effective, participatory, and task-specific interface. We plan to design and build a novel, accurate, high-resolution, 3D, frameless navigational system in which a stereoscopic overlay of patient-specific image-guidance information is injected directly into both eyepieces of the operating microscope, providing a 3D, augmented reality overlay of virtual structures beneath the surface of the surgical field, as if the tissue were transparent. In particular, we propose a method to objectively quantify the global error, an accuracy assessment that is impossible using currently available, state-of-the-art technology; this novel approach would help minimize surgical error in microscope-guided neurosurgery. Again, this multi-disciplinary and multi-institutional collaboration, which has included medical experts from the onset, has produced a number of publications [35-38].

  7. Objective Quantification of Perfusion-Weighted Computed Tomography in the Setting of Acute Aneurysmal Subarachnoid Hemorrhage

    In the clinical project that Dr. Imielinska set up together with Dr. D’Ambrosio, Mike Sughrue, Dr. Liu, and Dr. Connolly, we plan to design a new clinical protocol, aided by a quantification of CT-perfusion functions (Dr. Xin Liu is an M.D. with an MS in Computer Science who is one of the first Ph.D. students at DBMI pursuing her degree in biomedical imaging informatics). This new project has already produced good preliminary results [39], and it has the potential to be turned, after thorough evaluation, into a new clinical protocol that will contribute to improved clinical outcomes. Stroke is the third leading cause of death and the leading cause of disability in contemporary society. Subarachnoid hemorrhage (SAH) accounts for approximately 6 to 8% of all strokes and 22 to 25% of cerebrovascular deaths. Perfusion-Weighted Computed Tomography (CTP) is a relatively recent innovation that utilizes a series of axial head CT images to track the time course of a signal from an administered bolus of intravenous contrast. Most commonly in clinical practice, scans are interpreted using the qualitative detection of gross side-to-side asymmetry of cerebral blood flow (CBF). In our experience, this subjective approach lends itself to misdiagnosis and potential failure to treat patients. For these reasons, we have begun work on the development of a novel algorithm for analyzing post-processed CTP images. In our approach we provide quantification of symmetry of the mirrored brain hemispheres for CBF and other perfusion parameters. The challenge of this project is to prove the strong links between the results of quantification of the CTP parameters and clinical interpretation of these scans. Understanding how a computer-aided tool might play a role in supporting a clinician making the final diagnosis, and decisions about treatment, is crucial in designing a system that can help and not harm.
This project is yet another interdisciplinary challenge: how to design, build, test, validate and evaluate a technology in the context of a specific clinical application.
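
    To make the idea of quantifying left/right symmetry concrete, the following is a minimal sketch that mirrors one hemisphere of a 2D perfusion map onto the other and reports a relative asymmetry index. It is not the project's algorithm, which is not described here. It assumes, for illustration only, that the midline is the central column of the image and that values are non-negative; the name `hemisphere_asymmetry` is hypothetical.

```python
from typing import Sequence


def hemisphere_asymmetry(perfusion_map: Sequence[Sequence[float]]) -> float:
    """Relative left/right asymmetry of a 2D map of non-negative values.

    Assumption (illustrative): the midline is the central column, so each
    pixel in the left half is paired with its mirror image in the right
    half. Returns 0.0 for a perfectly symmetric map; larger values mean
    greater side-to-side difference (at most 2.0).
    """
    total_difference = 0.0
    total_value = 0.0
    for row in perfusion_map:
        width = len(row)
        half = width // 2
        left = row[:half]
        # Reverse the right half so column k pairs with its mirror column;
        # for an odd width the centre column is left out.
        right_mirrored = row[width - half:][::-1]
        for left_value, right_value in zip(left, right_mirrored):
            total_difference += abs(left_value - right_value)
            total_value += left_value + right_value
    if total_value == 0:
        return 0.0
    # Sum of absolute differences divided by the mean paired value.
    return 2.0 * total_difference / total_value


symmetric_slice = [[50, 70, 70, 50], [30, 30, 30, 30]]
reduced_right_flow = [[60, 60, 40, 40]]
```

    A single number like this replaces the "gross side-to-side asymmetry" judgment with something that can be thresholded and compared against clinical interpretation, which is the validation step the text identifies as the main challenge.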

  8. Structure-function relationships in the human visual system using diffusion tensor imaging, functional magnetic resonance imaging and visual field testing

    Pre- and post-operative assessments in patients with anterior visual pathway compression. The recently developed magnetic resonance technique of diffusion tensor imaging (DTI) is used clinically to trace the structure of white fiber tracts in the human brain. This novel imaging technique, which derives microstructural and physiological features of tissues, has many potential immediate practical applications. For example, DTI has been used to illustrate fiber tract degeneration after nerve injury, malformations in fiber tract development, and changes in fiber tract integrity caused by surgical interventions. Furthermore, preoperative DTI data has been successfully integrated into neurosurgical navigation systems to avoid damage to structurally intact fiber pathways. The focus of this project is to improve our understanding of the relationships between brain structure and function, using pre- and post-operative assessment in patients with anterior visual pathway compression. Dr. Imielinska has been collaborating on this project with neurosurgeons Dr. Jeffrey Bruce and Dr. D'Ambrosio, and with the Director of the fMRI Research Center at Columbia, Dr. Joy Hirsch.

  9. Center for Analysis, Visualization and Simulation

    All of the projects that Dr. Imielinska has been involved in over the last few years are tied together in CAnVaS, the Center for Analysis, Visualization and Simulation (which, in turn, is to become part of a larger Columbia effort to create the Simulation Center). The mission of CAnVaS is to develop image-based tools used in designing and building meaningful applications in medicine, teaching, training and clinical support. The overall goal is to foster a research and development environment that will help minimize clinical error. Such a center provides the lab environment for conducting interdisciplinary research. Students who choose the biomedical imaging informatics track as their sub-specialization will participate in CAnVaS. The center will also provide research projects for DBMI courses and individual research opportunities with selected faculty. The biomedical imaging informatics curriculum under the track will prepare students to cross disciplinary boundaries and participate in integrative team approaches to complex biomedical imaging problems. Columbia University administration has a significant interest in creating a comprehensive Simulation Center to teach, train, certify and re-certify medical professionals using simulation technology and other simulated scenarios. This effort is being headed by Dr. Harvey Colten, V.P. and Senior Associate Dean for Academic Affairs.

Stephen Johnson

  1. Machine Learning of Natural Language Structures

    Natural language processing systems are becoming increasingly important in biomedical applications such as extraction of information from medical records and retrieval of information from the scientific literature. Building successful systems has traditionally required enormous manual effort from linguists, knowledge engineers and domain experts. The AQUA project investigates methods of automating the acquisition of linguistic information from collections of text (corpora) using machine-learning techniques. AQUA employs transformation-based learning, which analyzes examples of paired source and target structures to produce rules that can transform source structures into target structures. AQUA is being applied to several different kinds of structures in biomedical language: morphological composition of words, syntactic parts of speech in sentences, syntactic dependency of sentences, and semantic categories.
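
    The general idea of transformation-based learning can be sketched in a few lines. The example below is a toy illustration for part-of-speech tags, not AQUA's implementation: it assumes a single hypothetical rule template ("change tag A to tag B when the previous tag is C") and greedily picks, at each step, the rule that removes the most errors between the current (source) tagging and the target tagging.

```python
from typing import List, Tuple

Rule = Tuple[str, str, str]  # (from_tag, to_tag, previous_tag)


def apply_rule(tags: List[str], rule: Rule) -> List[str]:
    """Retag from_tag as to_tag wherever the preceding tag is previous_tag."""
    from_tag, to_tag, previous_tag = rule
    result = list(tags)
    for i in range(1, len(tags)):
        # Context is read from the input tagging, so all sites change at once.
        if tags[i] == from_tag and tags[i - 1] == previous_tag:
            result[i] = to_tag
    return result


def count_errors(tags: List[str], target: List[str]) -> int:
    return sum(1 for tag, goal in zip(tags, target) if tag != goal)


def learn_rules(source: List[str], target: List[str], max_rules: int = 5):
    """Greedily learn an ordered list of rules that turn source into target."""
    current = list(source)
    learned: List[Rule] = []
    for _ in range(max_rules):
        # Propose one candidate rule for every position that is still wrong.
        candidates = {
            (current[i], target[i], current[i - 1])
            for i in range(1, len(target))
            if current[i] != target[i]
        }
        errors_now = count_errors(current, target)
        best_rule, best_gain = None, 0
        for rule in sorted(candidates):
            gain = errors_now - count_errors(apply_rule(current, rule), target)
            if gain > best_gain:
                best_rule, best_gain = rule, gain
        if best_rule is None:  # no rule reduces the error count any further
            break
        current = apply_rule(current, best_rule)
        learned.append(best_rule)
    return learned, current


# Baseline tagging (every ambiguous word tagged NOUN) versus the target.
source_tags = ["TO", "NOUN", "DET", "NOUN", "TO", "NOUN", "DET", "NOUN"]
target_tags = ["TO", "VERB", "DET", "NOUN", "TO", "VERB", "DET", "NOUN"]
rules, final_tags = learn_rules(source_tags, target_tags)
```

    The learned rules form an ordered list that can be replayed on new text, which is what makes the approach attractive as a substitute for hand-written linguistic rules.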

  2. Capturing Clinical Reasoning in Patient Records

    Narrative information is vital to health care, because it enables physicians to synthesize the raw facts and provide a context and interpretation for them. Electronic medical record systems contain a wealth of clinical data, but typically lack the clinical narrative found in paper records, e.g., the patient history and progress notes. Numerous barriers prevent the timely acquisition of narrative data, and most computer systems are unable to use such information productively. Current approaches offer a tradeoff: capture of rich clinical data that lacks structure (using transcription services or speech technology), versus entry of structured data that lacks flexibility and expressiveness (using template systems). The eNote system uses natural language processing techniques, allowing physicians full freedom of expression while producing structured documents that preserve the richness and enable further computer processing. eNote uses speech recognition and inference rules to speed the entry of notes, and maintain the continuity of information throughout a patient’s care.

  3. Digital Delivery of Knowledge at the Point of Care

    During the process of patient care, clinicians often require access to medical knowledge concerning diagnosis and treatment of their patients. The availability of appropriate knowledge can make an enormous difference in patient care while the lack of timely information is a major cause of medical errors. Paradoxically, clinicians suffer from knowledge overload and under usage: the amount of medical knowledge available online is overwhelming (journals, textbooks, guidelines, etc.), and it is extremely difficult to obtain knowledge relevant to specific patient cases. To address this problem, we are developing the HINT system, which guides clinicians through the search and retrieval process, and tailors knowledge to meet the needs of the particular patient. The system works by using information in the patient’s electronic medical record to customize the search. HINT employs several innovative techniques based on cognitive science and linguistics to capture the clinician’s knowledge needs at the point of care, and match these against what is available in online databases. This work is part of the Persival system, a collaboration between Biomedical Informatics and Computer Science, funded under the Digital Library Initiative by the National Library of Medicine and the National Science Foundation.

  4. Semi-structured Databases for Clinical Research

    Patient care and clinical research generate large amounts of data with complex structures that demand robust, sophisticated methods of management. New technologies such as natural language processing are making even more data available, but in structures that are increasingly complex, such as the Extensible Markup Language (XML). Such data are described as “semi-structured”, because they have an irregular form that is often not known in advance, and which can change frequently and without notice. Current biomedical databases do not have the ability to manage data of this complexity in a suitable way. However, new technology is emerging that can facilitate the storage and retrieval of semi-structured data. This research explores novel methods of modeling and organizing biomedical data extracted from clinical narrative, scientific literature, and genetic databases, as well as methods for querying and aggregating semi-structured data, using formalisms such as XQuery.
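
    As a small illustration of querying data whose shape varies from record to record, the sketch below aggregates findings from an irregular XML document. It uses Python's standard `xml.etree.ElementTree` module and its limited XPath subset as a stand-in for XQuery; the element names, attributes, and sample records are invented for the example and do not come from any project database.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical records: findings appear at different depths, some patients
# have lab results, and one patient has no content at all.
SAMPLE_RECORDS = """
<records>
  <patient id="p1">
    <note>
      <finding code="cough" certainty="high"/>
      <finding code="fever"/>
    </note>
  </patient>
  <patient id="p2">
    <note>
      <section name="history">
        <finding code="cough" certainty="low"/>
      </section>
    </note>
    <lab name="wbc" value="11.2"/>
  </patient>
  <patient id="p3"/>
</records>
"""


def patients_per_finding(xml_text: str) -> Counter:
    """Count how many patients have each finding code, at any nesting depth."""
    root = ET.fromstring(xml_text.strip())
    counts: Counter = Counter()
    for patient in root.findall("patient"):
        # ".//finding" matches descendants regardless of the path to them,
        # so the query does not need to know each record's structure.
        codes = {finding.get("code") for finding in patient.findall(".//finding")}
        counts.update(codes)
    return counts


def high_certainty_codes(xml_text: str) -> list:
    """Return codes of findings explicitly marked with certainty='high'."""
    root = ET.fromstring(xml_text.strip())
    return [f.get("code") for f in root.findall(".//finding[@certainty='high']")]


counts = patients_per_finding(SAMPLE_RECORDS)
```

    The descendant-path query is the key design choice: it tolerates records that nest the same information differently or omit it entirely, which a fixed relational schema would handle poorly.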

  5. New York Cancer Project

    The New York Cancer Project (NYCP) is a large-scale investigation into genetic and environmental influences on the development of cancer. The NYCP database currently contains information on approximately 18,000 subjects based on an hour-long interview, and links to blood samples stored in a biorepository at North Shore University Hospital. The data include demographics, ethnic background, personal and family medical history, reproductive history, medication use, health care utilization and screening behaviors, substance use, as well as occupational and environmental exposures. One of the central informatics challenges is designing a database schema to support an epidemiologic study of this scale, along with complex data transformations and cleaning procedures. The project employs a distributed data collection system that enables interviewers and subjects scattered throughout the metropolitan area to enter data, and a Web-based application that allows research staff to track enrollment and contact subjects for annual follow-up. The system also supports requests for blood samples using various matching criteria, as well as complex epidemiologic queries.

Desmond Jordan

    Dr. Jordan's research passion is implementing electronic patient charting and automatic real-time data facilities that evaluate a patient's status and determine the variance between times of actual clinical milestones in the OR, ICU and Critical Care Units. Current projects include:
  1. MAGIC (Multimedia Automatic Generation of Intensive Care data)

    MAGIC addresses the automated generation of summaries that combine natural language and graphics, tailored to the information needs of different caregivers, by integrating data currently available in the computerized operating room with other on-line databases at Columbia University Medical Center.

  2. PERSIVAL (PErsonalized Retrieval and Summarization of Image, Video And Language resources)

    PERSIVAL aims to provide personalized access to a distributed patient care digital library. PERSIVAL is a joint research initiative between the fields of NLP, human-computer interaction, medical informatics, video processing, library and cognitive science. Key features of PERSIVAL include personalized access to distributed, multimedia resources available both locally and over the Internet, fusion of repetitive information and identification of conflicting information from multiple relevant sources, and presentation of information in concise multimedia summaries that cross-link images, video, and text. When the latest medical information is provided at the point of patient care, it can help practicing clinicians to avoid missed diagnoses and minimize impending complications.

  3. Automated Selection and Evaluation of Patient Care Plans Using Medical Inferences: Patient care plans and critical paths are formulated by a careful analysis of the patient's current condition, such as that provided by the Multisystem Severity Illness Score (MSIS, APACHE) and other quantitative scoring metrics.

  4. Electronic patient charting and automatic real-time data facilities that evaluate a patient's status and determine the variance between times of actual clinical milestones in the ICU and Critical Care Units.

  5. Characterizing Medical Errors in Critical Care Medicine: The focus is on how health care professionals generate errors during patient care in critical care environments. In these environments, decisions are often made under high stress, time pressure, and with the requirement to process a massive amount of clinical information. Our goal is to develop a cognitive model of errors, whereby we can predict circumstances in which a specific error would occur, and to provide proper decision support to reduce serious errors.

  6. Safety and Use of Infusion Pumps in Critical Care: The goal of this study is to examine the interaction between health care providers and an infusion pump, regularly used in the intensive care environments, when they provide critical care to the patients in the intensive care units of the hospitals. The focus is on the different roles that the providers play in the overall flow of patient care, and on identifying possible areas where patient safety is compromised by poor technology design, poor communication, and by lack of relevant training.

David Kaufman

  1. Cognitive Evaluation of the IDEATel Diabetes Telemedicine and Education Program.

    The evaluation included the use of cognitive walkthrough methodology, user testing, and participant interviews. Thus far, 24 subjects from the greater New York City area and Upstate New York have participated in the study. The study revealed dimensions of the interface that subjects found difficult to use including scroll bars, misleading labels, navigational indicators and various problematic widgets. In addition, subjects’ level of literacy and numeracy affected their ability to negotiate the system. There was also a host of noncognitive factors such as patient health, depression, self-efficacy, and computer anxiety that impacted their use of the system. The results of the study have contributed to 1) the iterative design of the interface, 2) revision of the subject training protocol and 3) re-development of a CD-ROM tutorial on how to use the system effectively. The ultimate objective of this work is to empower older adults with the necessary range of computer and literacy skills to avail themselves of the excellent resources afforded by the system to better manage their diabetes.

  2. Cognitive Dimensions of the Query User Interface:

    The Health Information Needs Tailoring (HINT) component of the PERSIVAL system is designed to facilitate or semi-automate the query formulation process of information retrieval at the point of care (i.e., as the clinician views the patient record). The Query User Interface (QUI) is the visible component of HINT. This study focused on a cognitive evaluation of the QUI. As part of an on-going assessment, we evaluated the system’s ability to allow users to easily and intuitively express their information needs. The evaluation included a cognitive walkthrough of the query formulation process, quantitative estimation of cognitive load, and usability testing. The usability testing was designed to determine the ease with which users could complete a query drawing on question stems presented to the user. The results suggest that there are features in the QUI that contribute to a greater cognitive load and result in greater effort on the part of the subject. The results of usability testing are consistent with these findings. The study suggested several ways in which the QUI can be improved.

  3. Usability Evaluation of an Innovative Smoking Cessation Program:

    The Smoke Clinic is an innovative web-based comprehensive smoking cessation clinic. The objective of this evaluation is to characterize the dimensions of the Smoke Clinic web site that promote productive use and meaningful engagement, as well as to document features or aspects of the interface that may impede such use. Our approach employs two classes of usability evaluation: a) usability inspection and b) usability testing. Usability inspection is designed to characterize dimensions of the interface that adhere to usability principles and to note violations of these principles. We employ two classes of inspection methods: a cognitive walkthrough, which is scenario or task based, and a heuristic evaluation, which emphasizes dimensions of the screen. The system is evaluated on the basis of a small set of well-tested design principles such as visibility of system status, user control and freedom, consistency and standards, and flexibility and efficiency of use.

  4. Program in Mental Health Informatics:

    Evidence-based medicine coupled with emerging information technologies represents an unprecedented opportunity to greatly improve the delivery of mental health care. In particular, the proliferation of high quality guidelines and other on-line resources can have a significant impact on the quality of treatment. However, there remain formidable barriers to the dissemination of cutting-edge research and its application to clinical practice. The program in mental health informatics in the Department of Psychiatry is a recent initiative spearheaded by Vimla Patel. The aims of the overall program are to 1) promote evidence-based practices in psychiatry through the use of information technologies and related resources, 2) identify and reduce barriers to such practices and 3) promote scholarship in the area through a range of activities (e.g., courses and seminars). There are currently several related on-going research projects.

Rita Kukafka

    Dr. Kukafka's research interests focus broadly on use of informatics methods and applications to advance public health practice and research. Training activities include establishing a public health informatics track in the Department of Biomedical Informatics, and coordinating the health promotion disease prevention track in the School of Public Health with a new focus on interactive health communication. She is currently principal investigator/co-principal investigator on the following funded research projects:
  1. An Internet-Based Information Technology to Reduce Prescription Errors with HIV/AIDS Antiretroviral Medications

    The goal of this project is to implement and evaluate an information technology intervention to augment professional education and technical assistance efforts intended to reduce prescription errors and improve quality of care and resource utilization in ambulatory care settings. It employs a web-based, interactive decision-support system used by physicians at the point of care when they are formulating a patient’s medication regimen, and patient-tailored education designed to enhance patient adherence to antiretroviral therapy. This component of the intervention is designed to support the physicians’ efforts to implement the patient counseling components of the guidelines. In addition to the web-based, interactive software, an agency and provider IT needs assessment assesses institutional barriers and physicians’ readiness to use such IT. Results of the needs assessment are used to tailor, for the staff of each agency, push technology designed to motivate use of the software.
    The intervention is being implemented in 42 agencies in three states. Agencies are recruited from a larger pool that is receiving a program of continuing professional education and technical assistance offered by the NY/NJ AIDS Education and Training Center (AETC) and the Midwest AETC. Many of the agencies receive Ryan White funding. They all care for substantial numbers of patients from vulnerable populations, and they serve medically underserved communities. The evaluation design randomly assigns agencies into three arms: (1) an enhanced intervention that couples access to the web-site with the push technology intended to motivate use of the system, (2) access to the web-site system without the motivational component, and (3) a non-intervention control group. Outcome data on medical errors, resource utilization, and patient health status will be collected through chart abstraction. Use of the system will be documented through short provider interviews and usage data collected from the website. Major products include the web-site system, a training manual, and forms to conduct an Information Technology needs-assessment.
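
    For readers unfamiliar with this kind of design, random assignment of agencies to three arms can be sketched as below. The actual randomization procedure used in the study is not described here; this sketch assumes simple balanced allocation (shuffle, then deal agencies to arms in turn), and the arm labels, agency identifiers, and seed are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Iterable

# Hypothetical labels for the three arms described above.
ARMS = ("enhanced", "web_only", "control")


def assign_arms(agencies: Iterable[str], seed: int = 0) -> Dict[str, str]:
    """Shuffle agencies with a fixed seed, then deal them to arms in turn.

    Dealing round-robin after the shuffle keeps arm sizes within one of
    each other; with 42 agencies each arm receives exactly 14.
    """
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = list(agencies)
    rng.shuffle(shuffled)
    return {agency: ARMS[i % len(ARMS)] for i, agency in enumerate(shuffled)}


agency_ids = [f"agency_{n:02d}" for n in range(1, 43)]  # 42 placeholder IDs
allocation = assign_arms(agency_ids, seed=2004)
arm_sizes = Counter(allocation.values())
```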

  2. Automated ICF Coding Using Medical Language Processing

    The World Health Organization's newly revised International Classification of Functioning, Disability and Health (ICF) has been cited as a potential code set for reporting issues of measurement and interpretation of functional status. Functional status information is better suited than past classification systems to public health practice because it views health broadly, considering developmental, behavioral, emotional, social and environmental conditions. Public health has long viewed health within this ecological paradigm. In order for the ICF to be widely adopted, it is critical that we evaluate the ICF as a possible mechanism for that purpose and explore methods to facilitate the inclusion of ICF information into standardized records. The proposed research addresses this task by evaluating how well the ICF functions for capturing and encoding clinical data from patient records. The goal is to explore the feasibility of extracting and encoding functional status information from patient records. The study will evaluate 1) the completeness and comprehensiveness of the ICF classification in coding clinical concepts and 2) the MedLEE NLP system in parsing clinically relevant concepts and coding them in the ICF classification.

Yves Lussier

    Dr. Lussier’s research focuses on the use of ontologies, knowledge technologies and computational phenotypic networks to accurately individualize the understanding, the prediction, and the treatment of disease. Specific research projects include:
  1. Clinical Genomics Technology

    The Human Genome has set the pace for post-genomic discovery research. While post-genomic fields focused at the molecular level are intensively pursued, little effort is being deployed in the later stages of molecular medicine discovery research, such as Clinical (functional) Genomics. The following pioneering studies aim at demonstrating the relevance and significance of integrating mainstream clinical informatics science to current bioinformatics genomic discovery science:

    1. Phenotype Organizer System (POS):
      The long-term goal of this NIH-funded project is to build innovative informatics tools, capable of automatically querying, organizing and visualizing phenotypic data (traits, syndromes, etc.) across phenotype databases, to facilitate phenotypic research that aims to unlock gene-disease relationships. This program proposes to adopt a multidisciplinary approach (informatics, genomics and biomedical research) to explore the value of semantic, probabilistic and terminological technologies in phenotypic data and knowledge processing. The proposed research may provide a unique approach to accelerate biomedical research by improving access to phenotypic data and knowledge processing: the Semantic Phenome (phenotypic-genomic relations). We have conducted several proof-of-concept studies (e.g., QMR-OMIM, GenesTrace).

    2. Modeling of Emerging Infectious Diseases
      In collaboration with Ian Lipkin, Mark Gerstein, Jeffery Skolnick and Andrea Califano, we are conceptualizing the framework and developing the "PathoGene" software platform for the molecular and clinical modeling of EID.

    3. Molecular Medicine Matrix M3
      The Molecular Medicine Matrix is a project that leverages mediated schemas, language understanding and ontology to enable the creative interoperation of otherwise heterogeneous biological and clinical databases. We have created dynamic maps between a large set of terminologies such as GO, OMIM, SNOMED CT, UMLS, PhenoSlim, NCBI and MP (1, 2, 3). We are currently adapting M3 in a Phenotype Organizer System (POS) to accelerate comparative biology of phenotypes.

    4. PhenoGenes
      In collaboration with Carol Friedman, we are developing a representational model that depicts genotypic and phenotypic relations found in the literature and also in clinical reports. The knowledge bases of PhenoGenes will be integrated in the Clinigene discovery platform.

  2. Vigilens Health Monitor

    Lately, the development of clinical practice guidelines (CPGs) and decision support systems (DSS) has received increased emphasis. However, despite this focus on development, less attention has been paid to their integration and evaluation in genuine clinical practice. The overall goal of this proposal is to develop a modular, portable and multi-institutional DSS supporting CPGs that will improve healthcare. The unique architecture of the Vigilens DSS will provide for server-based/tele-event, guideline and outbreak monitoring. Several specific projects stem from the Vigilens Health Monitor endeavor and are aimed at evaluating the appropriateness and the potential misuse of practice guidelines:

    1. Personalized Notification Subsystem
      This funded program is aimed at increasing, evaluating and quantifying the clinical applicability, complexity and flexibility of guidelines, including institutional policies and user preferences. We are currently collaborating with IBM Research (Watson Lab) on pervasive notification and Biodefense / Homeland Security applications.

    2. Rx/Dx
      This project is directed at improving the quality of medication prescribing by personalizing a CPG for the clinical context of an individual patient, taking into account a thorough understanding of their narrative records with language understanding tools and exceptions to the guideline.

Eneida Mendonça

  1. Digital Library Initiative 2: "PERSIVAL -- Personalized Search and Summarization over Multimedia Information"

    This project intends to personalize access to a distributed care digital library through the development of a system that will support the search of relevant information. The system aims to tailor search, presentation, and summarization of online medical literature and consumer health information to the user, whether patient or health care provider.
  2. Mobile Information and Coordination for HealthCare

    The study will measure the direct impact of the interventions on these proximal causes of errors. Based on an analysis of the clinical information needs and communication patterns of nurses and physicians in the practice of in-patient care, we will extend existing Web-based clinical information through three non-institution-specific, informatics-based wireless applications. We will build, deploy, and evaluate in a controlled study applications that support task management among nurses and physicians, provide clinical information for display on a wireless Palm-based device, and integrate data from the data repository to provide Palm-based alerts related to drug-disease and drug-laboratory interaction.

  3. CAP: Community Access Process

    The goal of the SASA-CAP project is to develop an integrated delivery system for the uninsured in 5 communities in New York City, including Morris Heights, Washington Heights, Inwood, Harlem, and South Bronx. The main objectives are to: a) create an integrated and efficient health care delivery system through the use of case management, using a Health Priority Specialist to link frequent users of the emergency room to primary care practices, b) coordinate and track specialty referrals between hospitals, off-site practices, community practitioners and groups and c) conduct eligibility assessments to facilitate enrollment in public health insurance programs and re-enroll individuals into Child Health Plus, Medicaid, and Family Health Plus.

  4. Immunization Registry

    Through this collaboration, and with support from the National Immunization Program of CDC, we have developed EzVAC, a provider-based immunization registry and a state-of-the-art information system that was developed to serve clinical needs as well as health outreach and education purposes. EzVAC is an interactive, multi-institution information system that aggregates data from a variety of sources, including community hospitals, clinics and private practices. Built primarily to collect immunization data, EzVAC is a flexible platform that includes an immunization database, hospital registration systems, a Web-based registry server, Web user interface, and reminder, recall and forecast applications. The system has been in use since April 1999 at all the general pediatric practices affiliated with the Columbia Presbyterian Medical Center (CPMC) campus and the New York Weill Cornell campus of the New York-Presbyterian Hospital. EzVAC currently contains immunization records of over 100,000 children delivered by 600 providers, located at the hospital network of pediatric practices, school-based clinics, in-patient wards, the pediatric emergency department and the offices of private physicians in the community.

Vimla Patel

    Studies of Human Errors in Naturalistic Medical Environments:
  1. Characterizing Medical Errors in Critical Care Medicine

    The objective of this research is to characterize how health care professionals generate errors during patient care in the critical care (intensive care and emergency) environments in Psychiatry, Internal Medicine and Surgery. In these environments, decisions are often made under high stress, time pressure, and with incomplete information, leading to a high degree of uncertainty in diagnosis and patient management. Our goal is to develop a cognitive model of errors, whereby we can predict circumstances in which a specific error would occur, and to provide proper decision support to reduce serious errors.

  2. Safety and Use of Infusion Pumps in Critical Care

    The goal of this study is to examine the interaction between health care providers and an infusion pump, regularly used in the intensive care environments, when they provide critical care to the patients in the intensive care units of the hospitals. The focus is on the different roles that the providers play in the overall flow of patient care, and on identifying possible areas where patient safety is compromised by poor technology design, poor communication, and by lack of relevant training.

  3. Organizational Decision Making and Healthcare Technology

    The goal of this study is to determine how life-preserving but high-risk devices, such as infusion pumps and cardiac monitors, are purchased in hospitals. Who makes these decisions and what safety factors are considered in such decision-making? The purpose is to develop a formal evaluation guideline to assist health care providers in following a rigorous, balanced process for making decisions in selecting devices, such that safety factors are given a high priority along with economic and logistic factors.

    Evaluation of Health Care Information Systems
  1. Mediating Effects of Physician Order-Entry Systems

    This research is directed towards assessing the effects of computer-based medical ordering on the cognitive behavior of physicians and other professionals involved in patient care. One of the goals is to investigate the extent to which physicians-in-training (interns, residents) who work primarily or exclusively with a physician order-entry system develop different reasoning strategies from those who write mostly paper orders, and to determine what safety margins, if any, are compromised.

  2. Enabling Psychiatrists' Access to Knowledge Resources

    The core aims of this NLM systems grant are to give mental health professionals access to guidelines and other resources in real time at the point of care, and to adapt Cimino’s Infobuttons information resource delivery system to the field of mental health. The initial focus is on pharmacology. Our efforts have been directed towards a) identifying the best web resources for psychopharmacology queries as derived from clinical information needs and b) implementing Infobuttons to search knowledge resources such as Micromedex.

    Studies of Consumer Education, Decision Making and Technology Use
  1. Decision Making by Young Adults Regarding Sexual Risk-Taking and HIV

    Despite studies showing that adolescents have good knowledge of HIV and how it is transmitted, they still participate in risky sexual behaviors, putting themselves at risk for acquiring HIV or other sexually transmitted diseases. In this study, young adults are exposed to hypothetical and real-life situations involving risk and asked to explain the critical decisions that they must make. This decision making process will be modeled such that it will inform the design and implementation of technological support for information dissemination and decision support.

  2. Usability Evaluation of a Smoking Cessation Program

    The Smoke Clinic is an innovative, comprehensive web-based smoking cessation clinic. It is interactive and easily accessible, and is designed to be tailored to an individual’s needs and personal smoking history. The objective of this evaluation is to characterize the dimensions of the Smoke Clinic web site that promote productive use by consumers, and to document features or aspects of the technology interaction that may interfere with such use. How effective is this system in changing consumer behavior? The system is evaluated on the basis of a small set of well-tested design principles from cognitive and engineering sciences.

  3. Mental Models of Consumers in Response to (Bio)Terrorism Threats

    The goal of this study is to understand the cognitive processes underlying the decision-making of laypeople in response to terrorism threats, with a particular focus on how mental models of such (bio)terrorist threats develop as consumers are exposed to new and rapidly changing information from the media. Understanding these models will guide the design of guidelines for educating public health officials, journalists, and health professionals about presenting information in ways that will increase public preparedness and minimize panic reactions.

Paul Pavlidis

    Dr. Pavlidis' research focuses on computationally-assisted interpretation of genomics data, with a focus on gene expression microarray data.
  1. Tmm

    The largest project in my group, Tmm, involves large-scale analysis of gene expression. We are developing analytical methods, software systems and databases (“Tmm”) to support the meta-analysis of gene expression data sets from a large number of sources. The goals of this research are to improve access to published data sets, to facilitate comparison of new data sets to existing data, and to make systematic predictions of gene function and gene interactions. We have a particular focus on neuroscience applications. While our focus so far has been on studying coexpression, we are expanding the system to allow analysis of differential expression, expression levels, pathways, and the integration of other types of data.

  2. Analysis of gene coexpression networks for functional discovery

    We are applying our coexpression analysis methods to predict gene function using a variety of computational methods, including graph-based learning algorithms.
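
    As a rough illustration of what a coexpression network is, the sketch below links two genes when their expression profiles are strongly correlated. It is not the group's method: the use of Pearson correlation, the threshold of 0.9, the function names and the toy profiles are all assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    `profiles` maps a gene name to its expression values across samples.
    The threshold is an arbitrary illustrative choice.
    """
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy data (hypothetical): geneA and geneB rise together, geneC does not.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(profiles)
```

    With these toy profiles only the geneA/geneB pair passes the threshold. The resulting edge list is the kind of graph on which a graph-based learning algorithm could then propagate known functional labels to unannotated genes.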

  3. Functional class analysis of microarray data

    We have developed methods and a software tool (“ClassScore”) for the analysis of Gene Ontology functional annotations in microarray data. The research goals of this project are to develop new statistical methods for evaluating biologically-defined groups of genes in the context of genomics data.
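
    To make the idea of evaluating a biologically defined gene group concrete, here is a minimal sketch of a generic hypergeometric over-representation test, a common baseline for this kind of question. It is not a description of ClassScore's own statistics, which are not detailed here; the function name and the numbers in the example are hypothetical.

```python
from math import comb


def overrepresentation_p(total_genes, class_size, selected, selected_in_class):
    """Upper-tail hypergeometric p-value.

    Probability of seeing at least `selected_in_class` members of a
    functional class among `selected` genes drawn at random from
    `total_genes`, when the class contains `class_size` genes.
    """
    upper = min(class_size, selected)
    tail = sum(
        comb(class_size, k) * comb(total_genes - class_size, selected - k)
        for k in range(selected_in_class, upper + 1)
    )
    return tail / comb(total_genes, selected)


# Hypothetical example: 20 genes on the array, 5 annotated with one
# Gene Ontology term, 5 genes selected as "interesting", 4 of them
# carrying that term.
p = overrepresentation_p(total_genes=20, class_size=5,
                         selected=5, selected_in_class=4)
```

    A small p-value suggests the class is represented among the selected genes more often than chance would predict. This simple test depends on a hard cutoff for selecting genes, which is one motivation for developing alternative statistical methods.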

  4. Database for gene annotations for microarrays ("Ermine")

    Ermine is a database and set of related tools that assemble and organize gene annotations. Most of the annotations are obtained from disparate public sources, while others are created by our own informatics analysis. This project includes analysis of existing annotations to identify and resolve conflicts. A major use of this database is to feed information into other computational projects in our group, but our annotations are also used by other groups for their analysis efforts.

  5. Gist

    I maintain a popular support vector machine software suite, Gist, which was developed in collaboration with William Noble (U. Washington). This package is used by hundreds of researchers for machine learning analysis of many types of data.

  6. In addition to these projects, I am involved in a number of collaborative efforts to analyze microarray data in a wide variety of biological and computational contexts. Recent and ongoing projects have included:

    • Etienne Sibille – (Neuroscience) Analysis of age-related gene networks in human and mouse brain.
    • Harold Worman – (Medicine) Pathogenesis of Emery-Dreifuss muscular dystrophy.
    • Gordon Barr – (Hunter College/ NYSPI) Software development and analysis of pain-induced gene expression changes in rat spinal cord.
    • Panos Papapanou – (School of Dental and Oral Surgery) Analysis of human gene expression patterns in periodontitis.
    • Andreas Kottmann – (Psychiatry) Analysis of gene expression in mouse brain for regional specificity and relevance to models of psychosis.
    • Conrad Gilliam/Eric Kandel/Rene Hen – (Genetics/Psychiatry) Bioinformatic analysis of pathways implicated in anxiety disorders in mouse models.

Andrey Rzhetsky

  1. GeneWays

    The largest of my projects, GeneWays, has already occupied a team of 10 people for 6 years. The system was designed with the ambitious goal of automating the extraction of information on molecular interactions locked in the text of journal articles. GeneWays is an integrated system that uses multiple sources of information to infer a consensus view of molecular networks. One short-term product of this project will be a database of molecular interactions and a set of algorithms. Our long-term plan is to create a "forever-young review article": a self-updating database that will accumulate and summarize information on molecular interactions from newly published literature on a daily basis. In both the near and the long term, the resulting database will be accessible to researchers throughout the world via the World Wide Web. Our system will also serve as an extensive indexing and search database.

    This work is an ongoing collaboration with the groups of Drs. Carol Friedman (Department of Biomedical Informatics) and Vasilis Hatzivassiloglou (Department of Computer Science) at Columbia University.

  2. Computational approaches to predict protein-protein interactions from amino-acid sequences

    How can you decide whether two proteins will bind to each other if you know their protein sequences? This problem can be viewed as a machine-learning problem that we approach with Bayesian probabilistic modeling.

    The work developed as a collaboration with Drs. Shawn M. Gomes (Institut Pasteur, France), Bill Noble (University of Washington, Seattle), and Joel Bader (Johns Hopkins University).

  3. Modeling evolution of molecular pathways

    The goal of this new project is to create a probabilistic model of pathway evolution that can be used to infer parameters from real data.

  4. Probabilistic identification of true statements in large sets of data extracted automatically from research literature

    Consider a large database populated by statements produced by an information-extraction system. How do we separate reliable statements from unreliable ones? The major innovation in our approach lies in considering numerous statements extracted from research articles globally, taking into account the publication time and source of every statement, rather than looking at each extracted statement locally. To implement such an approach, we need a probabilistic model of a research community that produces knowledge and publishes that knowledge in scientific journals. In addition, we need a probabilistic model of our own information-extraction system.

    This project is a collaboration with a number of colleagues at Columbia University (Dr. Michael Krauthammer, Mr. Ivan Iossifov, and Professors George Hripcsak and Carol Friedman).

  5. Study of molecular-interaction networks in Drosophila: integration of experimental and computational techniques that will lead to a unified picture of signal-transduction pathways in the fruit fly. This project is a collaboration with Yale scientists (Drs. Kevin White and Lynn Cooley).

Edward Shortliffe

  1. InterMed: Standards for Encoding and Sharing Clinical Guidelines

    The InterMed Collaboratory has been a joint project of biomedical informatics laboratories at Harvard (the Decision Systems Group at Brigham and Women's Hospital), Stanford, and Columbia Universities. InterMed collaborators have created the GuideLine Interchange Format (GLIF), a specification for structured representation of guidelines (see www.glif.org). The goal of GLIF is to facilitate sharing of clinical guidelines by providing a specification for their representation in a computer-interpretable form that is intended to be precise, non-ambiguous, human-readable, and independent of computing platforms. In our laboratory at Columbia we have also emphasized the development of an execution engine that will allow the integration of GLIF-encoded guidelines with operational clinical systems, such as results-reporting systems and physician order-entry systems. See the descriptions of GLEE and GESDOR in the research summary of Dr. Dongwen Wang.

  2. Vigilens

    See the description of this collaborative project summarized by Dr. Yves Lussier.

Justin Starren

    IDEATel

    During the past year, the Informatics for Diabetes Education and Telemedicine (IDEATel) Project surpassed its goal of recruiting 1500 subjects. The project is a 4-year, $28 million randomized clinical trial, funded by the Centers for Medicare & Medicaid Services, involving diabetic Medicare patients in urban and rural New York State. The urban arm is centered at Columbia University and NewYork Presbyterian Hospital, and the rural arm is centered at the Joslin Diabetes Center, SUNY Syracuse. Dr. Steven Shea, Department of Medicine, is PI for the multi-center study. Dr. Justin Starren is co-PI in charge of technology. In this role, he designed the technology architecture and oversees its implementation. The goal was to develop HIPAA-compliant systems that could still be used by elderly patients. The study is in the third of four years. At present, 639 subjects are actively using the Home Telemedicine Units. The IDEATel project has demonstrated the feasibility of large-scale home telemedicine for management of chronic diseases.

Peter Stetson

    Dr. Stetson's research interests span two complementary areas: 1) the detection and prevention of adverse medical events, and 2) clinical quality assessment and improvement.
  1. Mining complex clinical data for patient safety research

    In work supported by the AHRQ and a grant from the New York State Department of Health's Empire Clinical Investigator Program, we are using a combination of informatics tools (such as data mining and natural language processing) to detect adverse medical events. This project is headed by George Hripcsak, MD, MS (see above), in collaboration with Carol Friedman, PhD and Stephen B. Johnson, PhD. Dr. Stetson's roles are:

    1. detection of "conflicts" in the electronic medical record as markers for adverse events
    2. leveraging local expertise in natural language processing to extract knowledge about errors from free text, particularly from notes written during "cross-coverage", a period known to be associated with a high risk of adverse events for patients

  2. Improving coordination of care, information management and reducing errors through mobile computing

    This project, called PalmCIS, is headed by James Cimino, MD and Eneida Mendonca, MD, PhD. We are extending the functionality of WebCIS to a PDA. In addition, we will be adding new tools to:

    1. search for online sources of patient-specific information
    2. facilitate asynchronous electronic messaging between providers using a virtual whiteboard
    3. push clinical alerts to providers regarding critical lab and procedure results

    Dr. Stetson is working on a web-based scheduling application to maintain the patient-provider index that will support the functions of PalmCIS.
