Bayesian Variable Selection in Clustering and Hierarchical Mixture Modeling

Title/Author : Bayesian Variable Selection in Clustering and Hierarchical Mixture Modeling / Lin Lin.

Author : Lin, Lin.

Publisher : Ann Arbor : ProQuest Dissertations & Theses, 2012.

Pages : 151 p.

Note : Source: Dissertation Abstracts International, Volume: 74-01(E), Section: B.

Contained By : Dissertation Abstracts International 74-01B(E).

Subject : Statistics.

Electronic resource : Read online (PQDT dissertation)

ISBN : 9781267537010

LEADER 04721nmm 2200325 450

001 196435

005 20130812110408.5

008 130812s2012 ||||||||||||||||| ||eng d

020 $a9781267537010

035 $a00235681

090 $aE-BOOK/378.242/Duk/2012///UM046497

100 1 $aLin, Lin.$3351543

245 10$aBayesian Variable Selection in Clustering and Hierarchical Mixture Modeling$h[electronic resource] /$cLin Lin.

260 $aAnn Arbor :$bProQuest Dissertations & Theses,$c2012.

300 $a151 p.

500 $aSource: Dissertation Abstracts International, Volume: 74-01(E), Section: B.

500 $aAdviser: Mike West.

502 $aThesis (Ph.D.)--Duke University, 2012.

520 $aClustering methods are designed to separate heterogeneous data into groups of similar objects such that objects within a group are similar, and objects in different groups are dissimilar. From the machine learning perspective, clustering can also be viewed as one of the most important topics within the unsupervised learning problem, which involves finding structures in a collection of unlabeled data. Various clustering methods have been developed under different problem contexts. Specifically, high dimensional data has stimulated a high level of interest in combining clustering algorithms and variable selection procedures; large data sets with expanding dimension have provoked an increasing need for relevant, customized clustering algorithms that offer the ability to detect low probability clusters.

520 $aThis dissertation focuses on the model-based Bayesian approach to clustering. I first develop a new Bayesian Expectation-Maximization algorithm in fitting Dirichlet process mixture models and an algorithm to identify clusters under mixture models by aggregating mixture components. These two algorithms are used extensively throughout the dissertation. I then develop the concept and theory of a new variable selection method that is based on an evaluation of subsets of variables for the discriminatory evidence they provide in multivariate mixture modeling. This new approach to discriminative information analysis uses a natural measure of concordance between mixture component densities. The approach is both effective and computationally attractive for routine use in assessing and prioritizing subsets of variables according to their roles in the discrimination of one or more clusters. I demonstrate that the approach is useful for providing an objective basis for including or excluding specific variables in flow cytometry data analysis. These studies demonstrate how ranked sets of such variables can be used to optimize clustering strategies and selectively visualize identified clusters of the data of interest.

520 $aNext, I create a new approach to Bayesian mixture modeling with large data sets for a specific, important class of problems in biological subtype identification. The context, that of combinatorial encoding in flow cytometry, naturally introduces the hierarchical structure that these new models are designed to incorporate. I describe these novel classes of Bayesian mixture models with hierarchical structures that reflect the underlying problem context. The Bayesian analysis involves structured priors and computations using customized Markov chain Monte Carlo methods for model fitting that exploit a distributed GPU (graphics processing unit) implementation. The hierarchical mixture model is applied in the novel use of automated flow cytometry technology to measure levels of protein markers on thousands to millions of cells.

520 $aFinally, I develop a new approach to cluster high dimensional data based on Kingman's coalescent tree modeling ideas. Under traditional clustering models, the number of parameters required to construct the model increases exponentially with the number of dimensions. This phenomenon can lead to model overfitting and an enormous computational search challenge. The approach addresses these issues by proposing to learn the data structure in each individual dimension and combining these dimensions in a flexible tree-based model class. The new tree-based mixture model is studied extensively under various simulation studies, under which the model's superiority is reflected compared with traditional mixture models.

590 $aSchool code: 0066.

650 4$aStatistics.$3140131

710 2 $aDuke University.$bStatistical Science.$3351544

773 0 $tDissertation Abstracts International$g74-01B(E).

790 10$aWest, Mike,$eadvisor

790 10$aReiter, Jerry$ecommittee member

790 10$aMa, Li$ecommittee member

790 10$aChan, Cliburn$ecommittee member

790 $a0066

791 $aPh.D.

792 $a2012

856 40$uhttps://erm.library.ntpu.edu.tw/login?url=http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3522017$zRead online (PQDT dissertation)

Lin, Lin.

Bayesian Variable Selection in Clustering and Hierarchical Mixture Modeling [electronic resource] / Lin Lin. - Ann Arbor : ProQuest Dissertations & Theses, 2012. - 151 p.

Source: Dissertation Abstracts International, Volume: 74-01(E), Section: B.

Thesis (Ph.D.)--Duke University, 2012.

ISBN: 9781267537010

Subjects--Topical Terms:
140131 Statistics.